Data Mining
The original plan was to scrape Lego's "Pick-A-Brick" webpage and parse its HTML to get the image of each Lego brick along with its price and size. This would have worked if each page of results had its own URL. On Lego's site, however, the URL stays the same as the user navigates between pages, which makes the data very difficult to scrape without storing cookies and user information.
Instead, the next step was to save the webpage and its contents as a .html file and to run image processing on the saved images of the bricks, such as the one below.
Left: the brick image from Lego's Pick-A-Brick website.
Right: the displayed RGB value of the brick's middle pixel, (36, 91, 174).
To analyze the pixels of the Lego brick image, I used the NumPy and Pillow libraries.
- First, the program loads the image and converts it to an array using NumPy's np.array() function.
- From there, it picks a pixel in the middle of the image, reads the RGB value of that pixel, and returns it.
- These values are added to a dictionary for easy access once the user's input image has been pixelated.
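The steps above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the function names `middle_pixel_rgb` and `build_color_table`, the dictionary key `"blue_brick"`, and the solid-color stand-in image are all assumptions made for the example. In practice the image would come from `Image.open()` on one of the saved brick files.

```python
import numpy as np
from PIL import Image


def middle_pixel_rgb(image):
    """Return the (R, G, B) value of the center pixel of a Pillow image."""
    # Convert to RGB first so palette or RGBA images also give three channels.
    pixels = np.array(image.convert("RGB"))
    height, width, _ = pixels.shape
    r, g, b = pixels[height // 2, width // 2]
    # Cast from NumPy integers to plain Python ints for easier reuse.
    return int(r), int(g), int(b)


def build_color_table(images):
    """Map each brick name (hypothetical keys) to its middle-pixel RGB value."""
    return {name: middle_pixel_rgb(img) for name, img in images.items()}


# Stand-in for a saved brick image: a solid block in the color from the figure.
# A real run would use something like Image.open(path_to_saved_brick_image).
sample = Image.new("RGB", (64, 64), (36, 91, 174))
brick_colors = build_color_table({"blue_brick": sample})
print(brick_colors)  # {'blue_brick': (36, 91, 174)}
```

Sampling a single center pixel is simple, but it assumes the brick covers the middle of the image and that the pixel is not on a highlight or shadow.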