# Import libraries
import polars as pl
import duckdb as db
import matplotlib.pyplot as plt
import seaborn as sns
import json
plt.style.use('ggplot')
A quick summary about Lego bricks
Jesus LM
Mar, 2024
The Lego brick was invented in 1949 by Ole Kirk Christiansen, and the company has since grown to become one of the world’s leading toy manufacturers. Lego products are sold in over 140 countries, and the company has over 40,000 employees worldwide.
LEGO is a Danish toy production company. The company is best known for its colorful interlocking plastic bricks, and the vast possibilities of what can be built with them. The Lego Group also produces a variety of other toys, including board games, video games, and clothing.
In addition to its traditional brick-based toys, Lego also produces a variety of other products, including:
Lego products are popular with children of all ages, and they are also enjoyed by adults. Lego bricks are a great way to encourage creativity and problem-solving skills, and they can also be used to build models of just about anything you can imagine.
The name Lego is derived from the Danish phrase leg godt, which means “play well”.
In this project, we present a summary of Lego bricks.
A comprehensive database of Lego bricks is provided by Rebrickable.
The data is available as CSV files, and the schema is shown below.
Database schema
Let us start by reading in the colors data to get a sense of the diversity of Lego sets!
Now that we have read the colors dataset, we can start exploring it! Let us start by understanding the number of colors available.
The colors data has a column named is_trans that indicates whether a color is transparent or not.
We will explore the distribution of transparent vs. non-transparent colors.
Another interesting dataset available in this database is the sets data.
It contains a comprehensive list of sets over the years and the number of parts that each of these sets contained.
Sets data
Let us use this data to explore how the average number of parts in Lego sets has varied over the years.
| set_num | name | year | theme_id | num_parts |
|---|---|---|---|---|
| str | str | i64 | i64 | i64 |
| "00-1" | "Weetabix Castle" | 1970 | 414 | 471 |
| "0011-2" | "Town Mini-Figures" | 1978 | 84 | 12 |
| "0011-3" | "Castle 2 for 1 Bonus Offer" | 1987 | 199 | 2 |
| "0012-1" | "Space Mini-Figures" | 1979 | 143 | 12 |
| "0013-1" | "Space Mini-Figures" | 1979 | 143 | 12 |
Lego blocks ship under multiple themes.
Let us try to get a sense of how the number of themes shipped has varied over the years.
| year | theme_id |
|---|---|
| i64 | u32 |
| 2013 | 593 |
| 2014 | 715 |
| 2015 | 670 |
| 2016 | 609 |
| 2017 | 470 |
# Plot the number of themes by year
# (themes_by_year holds one row per year, with the theme count in 'theme_id')
plt.figure(figsize=(11, 7))
plt.plot(themes_by_year['year'], themes_by_year['theme_id'], marker='o', linestyle='-')
plt.title('Lego - Evolution of number of themes (1950-2017)', fontsize=18)
plt.xlabel('')
plt.ylabel('Themes')
plt.show()
Jesus LM
Economist & Data Scientist