CS 194-26 Fall 2022
Sergei Mikhailovich Prokudin-Gorskii was revolutionary for his time when he presented a new approach to color photography: take three photographs of the same subject through red, green, and blue filters, then overlay the three images to create a color representation of the subject. In the early 20th century, the Tsar sponsored his idea, giving him the opportunity to travel across the Russian Empire taking color photographs of everything he saw. Although he never got to see his color images come to fruition, the RGB negatives were preserved by the US Library of Congress and have since been converted into amazing, unique images.
For this project, we were tasked with converting Prokudin-Gorskii's red-, green-, and blue-filtered glass plate negatives into a single color image, using various correlation measures to align the three images on top of one another and produce the best color image possible.
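As a rough illustration of the starting point, here is a minimal sketch of splitting one scanned plate into its three channels and stacking them into a color image. It assumes the plate is a 2D NumPy array with the blue, green, and red exposures stacked top to bottom; the function names `split_plate` and `stack_rgb` are hypothetical and not taken from my project code.

```python
import numpy as np

def split_plate(plate):
    """Split a vertically stacked plate scan into (blue, green, red) thirds."""
    height = plate.shape[0] // 3  # leftover rows at the bottom are dropped
    blue = plate[:height]
    green = plate[height:2 * height]
    red = plate[2 * height:3 * height]
    return blue, green, red

def stack_rgb(red, green, blue):
    """Stack three aligned 2D channels into an (H, W, 3) RGB image."""
    return np.dstack([red, green, blue])

# Tiny synthetic "plate" standing in for a real scan: 9 rows, 4 columns.
plate = np.arange(9 * 4, dtype=float).reshape(9, 4)
b, g, r = split_plate(plate)
rgb = stack_rgb(r, g, b)
```

Stacking without any alignment gives the blurry, color-fringed result that the rest of this writeup tries to fix.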
The first problem to solve was how to compute the best alignment for the photos based solely on pixel values. Since I had previous experience using the Sum of Squared Differences (SSD) to measure the similarity between two pieces of data, this was my first approach. I kept the blue-filtered image as the base layer, simply because it was the top image in the 3-photo strip. Then I searched for the translation that minimized the SSD between the green layer and the blue layer, and likewise between the red layer and the blue layer, so that everything would line up. Since these images are still quite small, I implemented the search with for loops that test every shift in the range [-15, 15] for both X and Y (essentially moving the image left, right, up, and down to find the best fit).
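The exhaustive search described above can be sketched as follows. This is a minimal version under stated assumptions, not the exact code behind the results: the names `ssd` and `align_ssd` are hypothetical, shifts are reported as (x, y), and `np.roll` is used to translate the image, which wraps pixels around the edges rather than discarding them.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two same-shaped arrays."""
    return np.sum((a - b) ** 2)

def align_ssd(channel, base, max_shift=15):
    """Return the (dx, dy) shift of `channel` that minimizes SSD against `base`."""
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # np.roll wraps around the border (an assumption of this sketch).
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(shifted, base)
            if score < best_score:
                best_score, best_shift = score, (dx, dy)
    return best_shift

# Synthetic check: displace a random image by a known amount, then recover it.
rng = np.random.default_rng(0)
base = rng.random((40, 40))
channel = np.roll(base, (-3, 2), axis=(0, 1))  # moved 3 rows up, 2 columns right
shift = align_ssd(channel, base)
```

The search costs (2 * 15 + 1)^2 = 961 SSD evaluations per channel, which is fine for small images but scales poorly with resolution.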
Once this process was done, the images still didn't come out as sharp as I would have liked, so I went back and tried a couple of things to improve the results. My first thought was to remove the borders around each image, since they add noise that could be affecting the SSD calculations. I did this manually by slicing the data arrays, but found that it made little to no difference in the quality of the output. My next step was to normalize the image arrays before doing any SSD analysis, to better account for general differences between the images such as the light intensity at the time each was taken. After a bit of debugging and tinkering with values, I saw a significant increase in the quality of my photos and was able to produce the images you see below.
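Both preprocessing steps can be sketched in a few lines. The details here are assumptions made for illustration: the 10% crop fraction is a placeholder rather than the value I sliced by hand, and min-max rescaling to [0, 1] is only one possible way to normalize. The names `crop_border` and `rescale_unit` are hypothetical.

```python
import numpy as np

def crop_border(image, fraction=0.1):
    """Trim `fraction` of the height and width from every side of a 2D image."""
    h, w = image.shape
    dy, dx = int(h * fraction), int(w * fraction)
    return image[dy:h - dy, dx:w - dx]

def rescale_unit(image):
    """Linearly rescale pixel values to the range [0, 1]."""
    image = image.astype(np.float64)
    lo, hi = image.min(), image.max()
    if hi == lo:
        # A constant image carries no contrast; map it to all zeros.
        return np.zeros_like(image)
    return (image - lo) / (hi - lo)

# Example on a small 8-bit ramp standing in for a scanned channel.
raw = np.arange(100, dtype=np.uint8).reshape(10, 10)
prepared = rescale_unit(crop_border(raw))
```

Cropping before scoring keeps the dark plate edges out of the similarity measure, while rescaling puts channels with different value ranges on a common footing before they are compared.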
I found it very interesting that normalizing the data before performing SSD made little to no difference for some images but had a major effect on others (compare the Cathedral image with the Monastery image). After digging into the data a little more, I found that the pixel values of the Cathedral images weren't all in the same 0 to 1 range as the others. This shows that when comparing two pieces of data, it is important to make sure they are on a comparable scale to get the best match. That is why normalizing before SSD drastically improved the quality of some images but didn't have much effect on others.
For the higher-resolution photos, running the program I had already written takes far too long because of the massive image sizes. To improve this, I first switched the alignment metric to Normalized Cross-Correlation (NCC) to measure how similar two images are. This change alone produced slightly different results on the low-resolution images, and I expected NCC to give better output once I moved on to using an image pyramid to reduce the run time. For the image pyramid, I recursively scaled my images down by a factor of two until the image size was roughly under 1000 pixels. On this scaled-down version, the NCC search runs much more efficiently and can find an appropriate shift to line up the images. As each recursive call returns, I double the shift found at the coarser level and re-check the NCC over a window of -1 to 1 pixels around it, so that the alignment keeps improving as the image is scaled back up.
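The recursive coarse-to-fine search can be sketched as below. This is a minimal version under stated assumptions rather than my exact implementation: all function names are hypothetical, `halve` uses 2x2 block averaging as a stand-in for whatever resize routine is available, `np.roll` wraps pixels around the border, and the 1000-pixel threshold is applied to the longer image side.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two same-shaped arrays, in [-1, 1]."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return 0.0 if denom == 0 else float(np.sum(a * b) / denom)

def search(channel, base, center, radius):
    """Best (dx, dy) within `radius` pixels of `center`, maximizing NCC."""
    cx, cy = center
    best_shift, best_score = center, -np.inf
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            score = ncc(np.roll(channel, (dy, dx), axis=(0, 1)), base)
            if score > best_score:
                best_score, best_shift = score, (dx, dy)
    return best_shift

def halve(image):
    """Downscale by 2 via 2x2 block averaging (odd trailing row/column dropped)."""
    h, w = image.shape[0] // 2 * 2, image.shape[1] // 2 * 2
    return image[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def align_pyramid(channel, base, min_size=1000, coarse_radius=15):
    """Coarse-to-fine alignment; returns the (dx, dy) shift for `channel`."""
    if max(base.shape) <= min_size:
        # Coarsest level: full window search.
        return search(channel, base, (0, 0), coarse_radius)
    dx, dy = align_pyramid(halve(channel), halve(base), min_size, coarse_radius)
    # One coarse pixel is two pixels here; refine within -1..1 of the estimate.
    return search(channel, base, (2 * dx, 2 * dy), 1)
```

Each finer level evaluates only 9 candidate shifts, so nearly all of the search cost is paid on the smallest image. Something like `align_pyramid(red, blue)` would return the shift to pass to `np.roll` before stacking the channels.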
Church
Red Shift of (-4, 25)
Green Shift of (3, 25)
Icon
Red Shift of (22, 40)
Green Shift of (17, 40)
Lady
Red Shift of (11, 53)
Green Shift of (8, 53)
Melons
Red Shift of (12, 81)
Green Shift of (9, 81)
Onion Church
Red Shift of (36, 51)
Green Shift of (26, 51)
Sculpture
Red Shift of (-26, 33)
Green Shift of (-11, 33)
Self Portrait
Red Shift of (36, 77)
Green Shift of (28, 77)
Train
Red Shift of (31, 41)
Green Shift of (4, 41)
Images that did not perform as well as others...
Emir
Red Shift of (-873, 49)
Green Shift of (24, 49)
Harvesters
Red Shift of (3, 31)
Green Shift of (15, 31)
Church
Red Shift of ()
Green Shift of ()
Red Shift of (0, 55)
Green Shift of (0, 55)
Red Shift of (16, 72)
Green Shift of (16, 72)
Red Shift of (11, 4)
Green Shift of (-2, 4)
With the higher-resolution files, some of the images did not come out quite as clear as the others, and I have found that this is usually because the red layer is misaligned. I experimented with several techniques, such as changing when I crop the images (at the very beginning versus only inside the aligning function). Although I was able to improve the results for these images, they still didn't come out quite as clear as the others and will require further investigation to solve!