Robert Pless

Project 1: Aligning and compositing the images of the Prokudin-Gorskii photo collection

You are allowed to work in groups of 1, 2, or 3 for this project.

Useful Links

Data: zip file (~60MB)
Browse through more data: Library of Congress
Due: 10:00am on Monday, Jan. 24. Submit a link to a blog or web page that is a write-up of your project.

Background

Sergei Mikhailovich Prokudin-Gorskii (1863-1944) was a photographer ahead of his time. He saw color photography as the wave of the future and came up with a simple idea to produce color photos: record three exposures of every scene onto a glass plate using a red, a green, and a blue filter, and then project the monochrome pictures with correctly colored light to reproduce the color image. Color printing of photos was very difficult at the time. Due to the fame he got from his color photos, including the only color portrait of Leo Tolstoy (a famous Russian author), he won the Tzar's permission and funding to travel across the Russian Empire and document it in 'color' photographs. His RGB glass plate negatives were purchased in 1948 by the Library of Congress. They are now digitized and available online.

Requirements

Take the digitized Prokudin-Gorskii glass plate images and automatically produce a color image with as few visual artifacts as possible. You will need to extract the three color channel images and align them so that they form a single RGB color image. The high-resolution images are quite large, so you will need a fast and efficient alignment algorithm. You are required to implement both a single-scale and a multi-scale alignment algorithm, each searching over a user-specified window of displacements. You are also required to try your algorithm on other images from the Prokudin-Gorskii collection.

Details

Important notes about the images:

The images are, from top to bottom, in BGR order.
Each plate is available online as both a high-res and a low-res image, so consider trying your alignment algorithm on both.
Assume the negatives are evenly divided into 3 plates (i.e., each plate is in exactly 1/3 of the negative).
Assume that a simple x,y translation model is sufficient for proper alignment.

MATLAB stencil code is available, as is python stencil code. You're free to do this project in whatever language you want.

Example digitized glass plate images, in both hi-res and low-res versions, are available in this zip file.

Your program will take a glass plate image as input and produce a single color image as output. The program should divide the image into three equal parts and align the second and the third parts (G and R) to the first (B). For each image, you will need to record the displacement vector that was used to align the parts. Don't get your coordinate order mixed up -- MATLAB matrices are accessed (y, x), that is, (row, column).
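As a rough illustration of the splitting step, here is a minimal Python/NumPy sketch. It assumes the plate has already been loaded as a 2-D array with the channels stacked top to bottom in BGR order; the function names split_plate and compose_rgb are made up for this example and are not part of the stencil code.

```python
import numpy as np

def split_plate(plate):
    """Split a stacked glass-plate scan into its B, G, R thirds (top to bottom)."""
    height = plate.shape[0] // 3  # any leftover rows at the bottom are dropped
    blue = plate[0:height]
    green = plate[height:2 * height]
    red = plate[2 * height:3 * height]
    return blue, green, red

def compose_rgb(blue, green, red):
    """Stack three (already aligned) channels into an H x W x 3 RGB array."""
    return np.dstack([red, green, blue])

# Tiny synthetic "plate": 6 rows x 2 columns, so each channel is 2 x 2.
plate = np.arange(12, dtype=np.float64).reshape(6, 2)
blue, green, red = split_plate(plate)
rgb = compose_rgb(blue, green, red)
print(rgb.shape)  # (2, 2, 3)
```

Note that NumPy arrays, like MATLAB matrices, are indexed (row, column), so the first index is y and the second is x.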

The easiest way to align the parts is to exhaustively search over a window of possible displacements (e.g. [-15,15] pixels), score each one using some image matching metric, and take the displacement with the best score. There are several possible metrics to measure how well images match:

Sum of squared differences (lower is better): sum( (image1(:)-image2(:)).^2 )
Normalized cross correlation (higher is better): dot( image1(:)./norm(image1(:)), image2(:)./norm(image2(:)) )

Note that in this particular case, the images to be matched do not actually have the same brightness values (they are different color channels), so other metrics might work better.
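The exhaustive search and the two metrics above might be sketched in Python/NumPy as follows. This is only one possible arrangement, not the stencil code: the function names are invented for the example, np.roll is used as a wrap-around shift, and the search scores with SSD over the whole image (borders included).

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences; lower means a better match."""
    return np.sum((a - b) ** 2)

def ncc(a, b):
    """Normalized cross correlation as written above; higher is better.
    Assumes neither image is all zeros."""
    a = a.ravel()
    b = b.ravel()
    return np.dot(a / np.linalg.norm(a), b / np.linalg.norm(b))

def align_single_scale(channel, reference, window=15):
    """Return the (dy, dx) shift of `channel` that best matches `reference`."""
    best_score = np.inf
    best_shift = (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            # Circular shift: pixels pushed off one edge wrap to the other.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(shifted, reference)
            if score < best_score:
                best_score = score
                best_shift = (dy, dx)
    return best_shift

# Synthetic check: displace a random image by a known amount and recover it.
rng = np.random.default_rng(0)
reference = rng.random((40, 40))
channel = np.roll(reference, (-3, 5), axis=(0, 1))
shift = align_single_scale(channel, reference, window=6)
print(shift)  # (3, -5)
```

To search with NCC instead, the comparison would need to be flipped (or the score negated), since higher NCC values indicate a better match.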

Exhaustive search will become prohibitively expensive if the displacement search range or image resolution are too large (which will be the case for high-resolution glass plate scans). To avoid this, you will need to implement a better strategy. Some options:

A coarse-to-fine search strategy using an image pyramid. This means shrinking your image (for example, you might shrink your original 3000 x 3000 pixel image to 300 x 300 pixels). At that size you can solve the problem quickly, both because the image is small and because you may only have to slide the R, G, and B channels a few pixels relative to each other. Once you have the answer for the 300 x 300 pixel version of the image, you can use that displacement estimate to search only the most likely displacements in the 600 x 600 pixel image, and so on, until you are optimizing on the original image but only have to search a little bit.
There are other possible approaches to making a search faster. Can you think of any and test them?
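One way the coarse-to-fine idea could look in Python/NumPy is sketched below. The details are assumptions made for the example, not requirements: each pyramid level halves the image by averaging 2 x 2 blocks, the coarsest level is searched within coarse_radius pixels, each finer level only refines by refine_radius pixels, and all names and default values are hypothetical.

```python
import numpy as np

def downsample2(image):
    """Halve each dimension by averaging 2x2 blocks (a crude pyramid level)."""
    h = image.shape[0] // 2 * 2
    w = image.shape[1] // 2 * 2
    image = image[:h, :w]
    return (image[0::2, 0::2] + image[1::2, 0::2]
            + image[0::2, 1::2] + image[1::2, 1::2]) / 4.0

def search(channel, reference, center, radius):
    """Exhaustive SSD search over shifts within `radius` of `center`."""
    best_score, best_shift = np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - reference) ** 2)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def align_pyramid(channel, reference, min_size=32,
                  coarse_radius=4, refine_radius=1):
    """Coarse-to-fine alignment; returns the (dy, dx) shift for `channel`."""
    if min(channel.shape) <= min_size:
        # Coarsest level: the image is small, so a full search is cheap.
        return search(channel, reference, (0, 0), coarse_radius)
    coarse = align_pyramid(downsample2(channel), downsample2(reference),
                           min_size, coarse_radius, refine_radius)
    # A shift of d pixels at half resolution is about 2*d pixels here.
    center = (2 * coarse[0], 2 * coarse[1])
    return search(channel, reference, center, refine_radius)

# Synthetic check with a known displacement.
rng = np.random.default_rng(1)
reference = rng.random((128, 128))
channel = np.roll(reference, (-8, 12), axis=(0, 1))
shift = align_pyramid(channel, reference)
print(shift)  # (8, -12)
```

With these settings the 128 x 128 example evaluates 81 shifts at the coarsest level and 9 at each of the two finer levels, instead of the 625 full-resolution evaluations a [-12, 12] window would need.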

Bells and Whistles

Although the color images resulting from this automatic procedure will often look strikingly real, they are still not nearly as good as the manually restored versions available on the LoC website and from other professional photographers. However, each photograph takes days of painstaking Photoshop work, adjusting the color levels, removing the blemishes, adding contrast, etc. Can you come up with ways to address these problems automatically? Feel free to come up with your own approaches or talk to the Professor or TAs about your ideas. There is no right answer here, just try out things and see what works.

Here are some ideas, but we will give credit for other clever ideas:

Automatic cropping: Remove white, black or other color borders. Don't just crop a predefined margin off of each side -- actually try to detect the borders or the edge between the border and the image.
Automatic contrasting: It is usually safe to rescale image intensities such that the darkest pixel is zero (on its darkest color channel) and the brightest pixel is 1 (on its brightest color channel). More drastic or non-linear mappings may improve perceived image quality.
Better colors: There is no reason to assume (as we have) that the red, green, and blue lenses used by Prokudin-Gorskii were especially good, or correspond directly to the R, G, and B channels in RGB color space. Try to find a mapping that produces more realistic colors.
Better features: Instead of aligning based on RGB similarity, try using gradients or edges.
Better transformations: Instead of searching for the best x and y translation, additionally search over small scale changes and rotations. Adding two more dimensions to your search will slow things down, but the same coarse-to-fine progression should help alleviate this.
Finding and trying this on related data: Can you make your algorithm work on the images from an even earlier time? Can you find or track down high resolution originals?
Aligning and processing data from other sources: In many domains, such as astronomy, image data is still captured one channel at a time. Often the channels don't correspond to visible light, but NASA artists stack these channels together to create false color images. For example, here is a tutorial on how to process Hubble Space Telescope imagery yourself. Also, consider images like this one of a coronal mass ejection built by combining ultraviolet images from the Solar Dynamics Observatory. To get full credit for this, you need to demonstrate that your algorithm found a non-trivial alignment and color correction.
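Two of the ideas above, automatic contrasting and gradient-based features, can be sketched in a few lines of Python/NumPy. This is only a starting point under simple assumptions: the contrast stretch is the plain linear rescale described above, the edge feature is a finite-difference gradient magnitude, and both function names are hypothetical.

```python
import numpy as np

def stretch_contrast(rgb):
    """Linearly rescale so the darkest value (over all channels) becomes 0
    and the brightest value (over all channels) becomes 1."""
    lo = rgb.min()
    hi = rgb.max()
    if hi == lo:  # flat image: nothing to stretch
        return np.zeros_like(rgb, dtype=np.float64)
    return (rgb - lo) / (hi - lo)

def gradient_magnitude(channel):
    """Edge-strength feature; unaffected by a constant brightness offset."""
    gy, gx = np.gradient(channel.astype(np.float64))
    return np.hypot(gx, gy)

# Contrast: values in [0.2, 0.8] are stretched to [0, 1].
rgb = np.array([[[0.2, 0.4, 0.6], [0.3, 0.5, 0.8]]])
stretched = stretch_contrast(rgb)

# Features: a horizontal ramp has the same gradient with or without an offset.
ramp = np.tile(np.arange(5.0), (4, 1))
edges = gradient_magnitude(ramp)
edges_offset = gradient_magnitude(ramp + 10.0)
```

The gradient images can be passed to the alignment search in place of the raw channels, which may help when the channels differ a lot in brightness.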

For all extra credit, be sure to demonstrate on your web page cases where your extra credit has improved image quality.

Web-Publishing Results

All the results for each project will be put on the course website so that the students can see each other's results. In class we will have presentations of the projects and the students will vote on who got the best results. If you do not want your results published to the web, you can choose to opt out. If you want to opt out, email the class TA saying so.

Write up

For this project, and all other projects, you must do a project writeup that will be shared as a webpage, either on the GWU blogging service or on any other blogging service that you like. In the report you will describe your algorithm and any decisions you made to write your algorithm a particular way. Show and discuss the results of your algorithm, including, if possible, examples where your algorithm works and where (and why!) it fails. Also discuss any bells and whistles that you did. Feel free to add any other information you feel is relevant. How much should you write about your project? The following are blog posts that are about the level of detail that I hope to see:

These are not class projects, so the format isn't exactly the same, but they show a little bit about how to use images to talk about image-based algorithms, and they show the size and scope of the write-up I'm hoping for.

Rubric

+20 pts: Correctly aligned images, even if done by hand (explain how!)
+30 pts: Single-scale implementation (if correct, no need to show by hand version)
+20 pts: Multi-scale or otherwise faster implementation
+10 pts: Implementation of Bells and Whistles. You must have at least as many bells and whistles as the number of people in your group.
+20 pts: Quality of the write-up.

Final Advice

A lot of the suggested MATLAB code will be in the Image Processing Toolbox. As a SEAS student you can download MATLAB for free.
For all projects, don't get bogged down tweaking input parameters. Most, but not all, images will line up using the same parameters (for example, how big a range of possible displacements there might be). Your final results should be the product of a fixed set of parameters (if you have free parameters). Don't worry if one or two of the handout images don't align properly using the simpler metrics suggested here.
The input images can be in jpg (uint8) or tiff (uint16) format. Remember to convert all the formats to the same scale (see im2double and im2uint8).
Shifting a matrix is easy to do in MATLAB by using circshift, in a way that doesn't change the size of the array you are shifting.
The borders of the images will probably hurt your results. Try computing your metric on the internal pixels only.
Output all of your images to jpg. It'll save you a lot of disk space.
When debugging any code that works with images, you should basically find a way to visualize almost everything that you do! In my (professional) code, I probably have one line of visualization/debugging code for every line of "real" code.
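For those working in Python rather than MATLAB, a few of the tips above translate roughly as follows. This is a sketch, not stencil code: to_float is a hand-written stand-in for im2double, np.roll plays the role of circshift, and the 10% margin in interior is an arbitrary assumption you would want to tune.

```python
import numpy as np

def to_float(image):
    """Scale integer images to floats in [0, 1], in the spirit of im2double."""
    if np.issubdtype(image.dtype, np.integer):
        return image.astype(np.float64) / np.iinfo(image.dtype).max
    return image.astype(np.float64)

def interior(image, fraction=0.1):
    """Drop `fraction` of the rows and columns from every side, so that a
    metric can be computed on the internal pixels only."""
    h, w = image.shape[:2]
    dy, dx = int(h * fraction), int(w * fraction)
    return image[dy:h - dy, dx:w - dx]

# uint8 (jpg) and uint16 (tiff) inputs end up on the same [0, 1] scale.
from_jpg = to_float(np.array([0, 255], dtype=np.uint8))
from_tiff = to_float(np.array([0, 65535], dtype=np.uint16))

# np.roll shifts with wrap-around and keeps the array size unchanged.
rolled = np.roll(np.array([0, 1, 2, 3, 4]), 2)
print(rolled)  # [3 4 0 1 2]

inner = interior(np.zeros((100, 50)))
print(inner.shape)  # (80, 40)
```

Because np.roll wraps pixels around to the opposite edge, scoring on the interior only also keeps those wrapped pixels out of the metric, as long as the margin is larger than the biggest shift you test.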

Credits

Project derived from Alexei A. Efros' Computational Photography course, with permission.

Collaboration Policy

All work that you turn in must be your work, your code. But it is also incredibly useful to talk with classmates and explore, so I try to share some explicit guidance here. You are explicitly ALLOWED to:

Search the internet for answers to how to read in images, write out images, etc.