CS180 Project 1
Sergei Mikhailovich Prokudin-Gorskii (1863-1944) thought of a way to capture pictures in color, even though cameras that could take color photos had not been invented yet! His idea was simple: take three identical photos of a subject, using a red filter for the first, a green filter for the second, and a blue filter for the third. That way, a special projector could project these images using red, green, and blue light, and then align them to produce a color image. For this project, I do this digitally by aligning the three images using numpy and then colorizing and stacking them.
Most of the work for colorizing these images comes from finding the best algorithm to align the green and red images with the blue image.
Here are some examples of the original images taken by Prokudin-Gorskii using the RGB filters and black and white cameras. Before applying the alignment algorithms, I first split these images into thirds (to get the red, green, and blue images) and then cropped 10% off each edge to get rid of the black border.
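The split-and-crop step can be sketched roughly as follows. This is a minimal illustration, not the exact code used for the project: the function name `split_and_crop` and the parameter `crop_frac` are made up for this example, and it assumes the three exposures are stacked top to bottom in a single 2D grayscale array.

```python
import numpy as np

def split_and_crop(plate, crop_frac=0.10):
    """Split a stacked scan into three equal-height channels and crop each one.

    Assumption: `plate` is a 2D array with the three exposures stacked vertically.
    Returns the channels in top-to-bottom order.
    """
    height = plate.shape[0] // 3  # rows per channel; leftover rows are dropped
    thirds = [plate[i * height:(i + 1) * height] for i in range(3)]

    cropped = []
    for channel in thirds:
        # Remove crop_frac of the height and width from every edge.
        dy = int(channel.shape[0] * crop_frac)
        dx = int(channel.shape[1] * crop_frac)
        cropped.append(channel[dy:channel.shape[0] - dy, dx:channel.shape[1] - dx])
    return cropped
```

Cropping before alignment keeps the dark borders from dominating the distance metric.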
The most basic way to align the three images is to use np.roll
to shift the green and red images over a range of offsets
and then return the shift that gives the smallest Euclidean distance to the blue image. However, this method is very
slow and only finishes in a timely manner on low res .jpg images.
Since these low res images were all around 390x340 pixels, I found that searching +-15 pixels in the y and x directions
was sufficient for aligning them fairly well.
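A minimal sketch of this exhaustive search is shown below. The function name `align_exhaustive` and its parameters are illustrative and not taken from the project code; positive `dy` means a shift down and positive `dx` a shift right, following np.roll's convention.

```python
import numpy as np

def align_exhaustive(channel, reference, search=15):
    """Return the (dy, dx) roll of `channel` closest to `reference`.

    Closeness is the Euclidean (L2) distance between the rolled channel
    and the reference, checked for every offset in [-search, search].
    """
    ref = reference.astype(np.float64)
    ch = channel.astype(np.float64)
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            shifted = np.roll(ch, (dy, dx), axis=(0, 1))
            score = np.sqrt(np.sum((shifted - ref) ** 2))
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

Each call evaluates (2 * search + 1)^2 full-image comparisons, which is why the cost grows quickly with both image size and search range.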
The alignments (in pixels) for these images are:
Cathedral
green image: 5 down 2 right
red image: 12 down 3 right
Monastery
green image: 3 up 2 right
red image: 3 up 2 right
Tobolsk
green image: 3 down 3 right
red image: 6 down 3 right
One way to speed up alignment is to use an image pyramid. By using recursion and scaling the image down by half at each recursion level,
we can search for the correct alignment over a much smaller range of values, even if the original image has thousands of pixels.
At level i, the image is scaled down to 1/(2^i) of its original size. We can then find the best alignment on the smaller image at the ith level,
add this alignment to the running total, shift the (i-1)st image by 2x this running total (since this image is 2x as big), search
for the best alignment on the (i-1)st image, and so on until we find the best alignment for the original image.
Since these high res images are all around 3700x3200 pixels, I found that having a recursion depth of 9, with a search range of +-5
pixels in the y and x directions at each level, worked well for aligning most of these images.
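The recursion described above might look roughly like this. It is a sketch under stated assumptions rather than the project's actual code: the names `align_exhaustive` and `align_pyramid` are illustrative, and downscaling is done by keeping every second pixel, which is a simplification (the write-up does not say which rescaling function was used).

```python
import numpy as np

def align_exhaustive(channel, reference, search):
    """Best (dy, dx) roll of `channel` onto `reference` by Euclidean distance."""
    ref = reference.astype(np.float64)
    ch = channel.astype(np.float64)
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            score = np.sqrt(np.sum((np.roll(ch, (dy, dx), axis=(0, 1)) - ref) ** 2))
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def align_pyramid(channel, reference, depth, search=5):
    """Coarse-to-fine alignment: solve at half size, then refine at this size."""
    if depth == 0:
        return align_exhaustive(channel, reference, search)

    # Assumption: halve the image by striding (no smoothing before subsampling).
    coarse_dy, coarse_dx = align_pyramid(
        channel[::2, ::2], reference[::2, ::2], depth - 1, search
    )

    # A shift found at half size is worth twice as many pixels at this size.
    dy, dx = 2 * coarse_dy, 2 * coarse_dx
    pre_shifted = np.roll(channel, (dy, dx), axis=(0, 1))

    # Only a small residual search is needed around the scaled-up estimate.
    res_dy, res_dx = align_exhaustive(pre_shifted, reference, search)
    return dy + res_dy, dx + res_dx
```

With this structure the work at each level is a fixed (2 * search + 1)^2 comparisons, while the effective full-resolution search range roughly doubles with every extra level.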
The alignments (in pixels) for these images are:
Self Portrait
green image: 2522 up 29 right
red image: 2424 up 37 right
Church
green image: 2536 up 4 right
red image: 2503 up 4 left
Icon
green image: 2554 up 17 right
red image: 2506 up 23 right
Lady
green image: 2518 up 9 right
red image: 2457 up 11 right
Emir
green image: 2519 up 24 right
red image: 2542 up 829 left
Note: Most of these images have a very large shift in the y direction. However, because np.roll
wraps around, this is the same as shifting these images down by fewer pixels.
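The wrap-around equivalence can be checked on a small array. The helper `to_signed_shift` below is a hypothetical convenience for reporting a wrapped shift as its smallest equivalent offset; it is not part of the project code, and the length-10 axis is a toy example rather than a real image size.

```python
import numpy as np

def to_signed_shift(shift, size):
    """Map a roll amount to the equivalent smallest-magnitude shift for an axis of length `size`."""
    return (shift + size // 2) % size - size // 2

column = np.arange(10)

# Rolling up by 7 on a length-10 axis gives the same result as rolling down by 3.
same = np.array_equal(np.roll(column, -7), np.roll(column, 3))
print(same)                     # True
print(to_signed_shift(-7, 10))  # 3
```

In other words, a shift of k pixels up on an image of height H is the same roll as H - k pixels down.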
All of the images seem well-aligned using the pyramid algorithm, except for emir.tif (the last image). This is likely because of the dark colors in the photo: shifting the red image far to the left brought it much closer to the blue image in terms of Euclidean distance, even though the result is visibly misaligned. To fix this by brute force, I limited the search range in the x direction to 1 pixel to the right at each level, which left the emir photo slightly better aligned (though not perfectly so).
Another way to fix the emir photo is to run the Sobel edge detection algorithm on the images and then run the pyramid alignment algorithm on the edge-detected images instead of on the raw color channels.
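As a rough illustration of the edge-detection idea, the sketch below computes a Sobel gradient magnitude using only numpy. The function name `sobel_magnitude` and the edge-replicating padding are choices made for this example; a library routine could be used instead. The resulting edge maps would then be passed to the alignment search in place of the raw channels.

```python
import numpy as np

def sobel_magnitude(image):
    """Gradient magnitude from the 3x3 Sobel kernels, with edge-replicated padding."""
    p = np.pad(image.astype(np.float64), 1, mode="edge")

    # Horizontal gradient: right column minus left column, weighted 1-2-1 vertically.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])

    # Vertical gradient: bottom row minus top row, weighted 1-2-1 horizontally.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])

    return np.hypot(gx, gy)
```

Because the Sobel response depends on local intensity changes rather than absolute brightness, adding a constant offset to a channel leaves its edge map unchanged, which is one plausible reason edge maps are less sensitive to the brightness differences between filtered exposures.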