Content-Based Image Retrieval Using Depth Information

Abstract

This project uses depth information for content-based image retrieval. In this kind of problem, given a query image depicting a particular scene, we search a database for other photos that show the same scene but differ from the query in viewpoint, covered area, image scale, or camera rotation angle. For instance, given a photo of the Empire State Building shot head-on, we would like to retrieve additional photos taken from its left or right. From each image it is possible to extract interest points that provide feature descriptors for objects and parts of the photo. These descriptors, generated by the SIFT algorithm, can later be used to recognize the same objects in other photos in a way that is invariant to scale, rotation, noise, and illumination. In this project, depth information is used to make the algorithm even more robust to wider viewpoint changes.
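As a rough illustration of the descriptor step described above, the sketch below extracts SIFT keypoints and descriptors with OpenCV. The image name is a placeholder and this is not the project's exact pipeline; cv2.SIFT_create requires an OpenCV build that includes SIFT (4.4+ or opencv-contrib).

```python
import cv2

# "query.jpg" is a placeholder file name for the query photo.
img = cv2.imread("query.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)

# Each keypoint gets a 128-dimensional descriptor that can be matched
# against descriptors extracted from other photos of the same scene.
print(len(keypoints), descriptors.shape)
```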

 

Adding Depth

  • In this project, depth information obtained from depth sensors is used to make the algorithm even more robust to wide viewpoint changes
  • For each pixel in the photo, in addition to the color data, a value describing its distance from the camera is stored (a minimal sketch of this representation follows below)
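As a minimal illustration of this per-pixel representation, the sketch below attaches the depth value as a fourth channel next to the color channels. The file names are placeholders; any aligned color/depth pair produced by a depth sensor would do.

```python
import cv2
import numpy as np

# Placeholder file names for an aligned color/depth pair from a depth sensor.
color = cv2.imread("scene_color.png")                        # H x W x 3, uint8 BGR
depth = cv2.imread("scene_depth.png", cv2.IMREAD_UNCHANGED)  # H x W, e.g. uint16 distances

# Stack the distance from the camera as a fourth channel,
# so every pixel carries (B, G, R, depth).
rgbd = np.dstack([color.astype(np.float32), depth.astype(np.float32)])
print(rgbd.shape)  # (H, W, 4)
```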


 

Flowchart Part I: Features


 

Flowchart Part II: Retrieval


 

SIFT Extraction

Construction of Scale-Space
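A minimal sketch of this step, assuming the standard SIFT construction: within each octave the image is blurred with Gaussians of increasing sigma, and between octaves it is downsampled by a factor of two. The parameter values below are illustrative defaults, not the project's settings.

```python
import cv2
import numpy as np

def build_scale_space(image, num_octaves=4, scales_per_octave=5, sigma0=1.6):
    """Build a Gaussian scale-space: progressively larger blur within each
    octave, downsampling by 2 between octaves."""
    octaves = []
    base = image.astype(np.float32)
    for _ in range(num_octaves):
        octave = [cv2.GaussianBlur(base, (0, 0),
                                   sigmaX=sigma0 * 2.0 ** (s / scales_per_octave))
                  for s in range(scales_per_octave)]
        octaves.append(octave)
        # Halve the resolution for the next octave.
        base = cv2.resize(base, (base.shape[1] // 2, base.shape[0] // 2),
                          interpolation=cv2.INTER_NEAREST)
    return octaves
```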


Laplacian of Gaussian Approximation
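In SIFT, the Laplacian of Gaussian is approximated by subtracting adjacent images in the Gaussian scale-space (the Difference of Gaussians). A minimal sketch, operating on one octave produced by the scale-space code above:

```python
def difference_of_gaussians(octave):
    """Approximate the scale-normalized Laplacian of Gaussian by subtracting
    adjacent Gaussian-blurred images of the same octave (DoG). Keypoints are
    later detected as local extrema across these DoG images."""
    return [octave[i + 1] - octave[i] for i in range(len(octave) - 1)]
```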


UBP Segmentation Algorithm


 

Obtaining the Results

  • We ran the algorithm on a database of 80 photos taken around the campus
  • For each photo, we rated its match against the other 79 photos and selected the best-matching photo
  • We created an automatic test to check whether the selected photo is a correct match (a sketch of this evaluation loop follows below)
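The evaluation described above can be summarized by the following minimal sketch. The helpers match_score and is_correct_match are hypothetical placeholders for the project's own photo-matching score and ground-truth check.

```python
def evaluate(photos, match_score, is_correct_match):
    """For every photo, score it against the other 79, pick the best-scoring
    candidate, and count how often that candidate is a correct match."""
    correct = 0
    for i, query in enumerate(photos):
        candidates = [(match_score(query, other), j)
                      for j, other in enumerate(photos) if j != i]
        best_score, best_j = max(candidates)
        if is_correct_match(i, best_j):
            correct += 1
    return correct, len(photos)
```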

 

Results

Distance ratio between the nearest neighbor and the second-nearest neighbor:
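This refers to the nearest-neighbor distance-ratio test used in SIFT matching: a descriptor match is kept only when its closest match is clearly closer than the second-closest one. The sketch below is a minimal illustration using OpenCV's brute-force matcher; the 0.8 threshold and the function name are assumptions, not the project's exact values.

```python
import cv2

def ratio_test_matches(desc_query, desc_db, ratio=0.8):
    """Keep a match only if the nearest neighbor is much closer than the
    second-nearest neighbor (distance-ratio test)."""
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    knn = matcher.knnMatch(desc_query, desc_db, k=2)
    return [m for m, n in knn if m.distance < ratio * n.distance]
```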


 

Analyzing the Results

The standard SIFT algorithm gives slightly better results (a difference of one or two correctly matched photos). Several factors may explain this:

  • Errors from the depth sensor
  • Errors in the segmentation step
  • The robustness that standard SIFT already provides to viewpoint changes

 

Future Directions

  • Correcting the errors described above
  • Adding features beyond SIFT
  • A more sophisticated surface-rating scheme