Space-Time Super Resolution in Video


Abstract
Space-Time Super Resolution is the increase of temporal and spatial resolution in video, using multiple video sequences describing the same event. Rapid movements and high spatial frequencies, which are otherwise lost, are thus recovered.
In this project we implement a super-resolution algorithm proposed by Shechtman, Caspi and Irani in their paper “Increasing Space-Time Resolution in Video”. The algorithm requires no a-priori knowledge of the motion within the frame. We wish to investigate the algorithm’s capabilities and limitations. The implementation is performed in Matlab.
Several examples of applying the algorithm are shown. The examples use synthetic gray-level video sequences. They demonstrate the capability of increasing both the spatial and the temporal resolution while reducing the blur introduced by the camera. A trade-off between resolution increases along the different axes is inherent in the algorithm. Several artifacts and limitations are examined through the examples, and directions for future improvement of the algorithm are proposed.

Key-words: super-resolution, space-time, multiple video sequences, PSF, exposure-time.

The problem
Whenever we use a video camera to document a dynamic scene, we encounter the camera’s limitations. Every video camera is characterized by finite spatial and temporal resolutions, which limit the maximal frequencies captured by the camera.
The temporal sampling rate is commonly called the “frame rate” (given in frames per second). Motion in the frame occurring at frequencies above half the frame rate (the Nyquist frequency) cannot be captured correctly in the video. In the example below, a ball bouncing along a sinusoidal trajectory appears to follow a straight line when the video is sampled at too low a rate (below the Nyquist rate).
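The folding of fast motion onto slower apparent motion can be sketched with a few lines of Python. This is an illustrative helper of our own (not part of the algorithm): it computes the frequency that survives after sampling, i.e. the true frequency folded into the Nyquist interval.

```python
def apparent_frequency(f_true, f_sample):
    """Frequency observed after sampling at rate f_sample:
    the true frequency folded into [0, f_sample/2] (the Nyquist interval)."""
    f = f_true % f_sample
    return min(f, f_sample - f)

# A ball oscillating vertically at 9 Hz, filmed at 10 frames per second,
# appears to oscillate at only 1 Hz:
print(apparent_frequency(9.0, 10.0))    # 1.0
# At exactly the frame rate the oscillation vanishes entirely --
# the ball seems to move in a straight line:
print(apparent_frequency(10.0, 10.0))   # 0.0
# A 4 Hz oscillation is below Nyquist and is captured correctly:
print(apparent_frequency(4.0, 10.0))    # 4.0
```

The same folding rule explains why under-sampled periodic motion can look slow, frozen, or even reversed.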

Another effect caused by the finite temporal sampling rate is the “wagon-wheel effect”, in which fast-rotating objects seem to rotate backwards because of the low sampling rate.
In the spatial domain, the limited resolution similarly causes spatial aliasing when the sampling rate falls below the Nyquist rate. In the example below, a picture containing several frequencies is down-sampled by a factor of 4 along the x-axis (and enlarged back to the size of the original with simple ZOH interpolation), demonstrating the aliasing effect: high frequencies are lost and aliased into lower ones.
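The down-sample-then-ZOH procedure used to create the example above can be sketched in one dimension. This toy function (our own illustration, operating on a single row of pixel values) keeps every fourth sample and repeats each kept sample to restore the original length:

```python
def downsample_zoh(row, factor):
    """Down-sample a row of pixels by keeping every `factor`-th sample,
    then enlarge back to the original length with zero-order hold
    (each kept sample is repeated `factor` times)."""
    kept = row[::factor]
    return [v for v in kept for _ in range(factor)]

# The fastest representable pattern is destroyed by factor-4 down-sampling:
fast = [0, 1, 0, 1, 0, 1, 0, 1]
print(downsample_zoh(fast, 4))  # [0, 0, 0, 0, 0, 0, 0, 0] -- detail lost
# A pattern below the new Nyquist rate survives:
slow = [0, 0, 0, 0, 1, 1, 1, 1]
print(downsample_zoh(slow, 4))  # [0, 0, 0, 0, 1, 1, 1, 1]
```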


In addition to the effects caused by the limited frequencies, two more effects are caused by the physical properties of the camera. The sampling of each frame lasts a finite time, called the exposure time, during which light is integrated by the detectors (of a digital camera). Objects in the frame moving fast relative to the exposure time are blurred, and their shape is distorted. An example can be seen in the following image, where the ball and racket are smeared along their trajectory.
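Exposure-time blur is simply integration of the scene over the exposure interval. The following toy simulation (our own illustration, in one spatial dimension) shows how a moving point object spreads its energy over every pixel it crosses during the exposure, while a static object keeps its full intensity in one pixel:

```python
def exposure_blur(positions, exposure_len):
    """Simulate exposure-time blur on a 1-D scene: a unit-intensity point
    object visiting the pixel indices in `positions` during one exposure
    deposits its energy, averaged over the exposure, at every pixel crossed."""
    width = max(positions) + 1
    frame = [0.0] * width
    for p in positions[:exposure_len]:
        frame[p] += 1.0 / exposure_len
    return frame

# An object crossing pixels 0..3 during a 4-sample exposure is smeared
# along its trajectory at quarter intensity:
print(exposure_blur([0, 1, 2, 3], 4))   # [0.25, 0.25, 0.25, 0.25]
# A static object at pixel 2 keeps its full intensity:
print(exposure_blur([2, 2, 2, 2], 4))   # [0.0, 0.0, 1.0]
```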


Moreover, non-ideal detectors and poor optics introduce spatial blur, which reduces the sharpness of the captured images. The following images show an example in which the sharp scene is blurred by the camera during the sampling process.
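Spatial blur is modeled as convolution of the scene with the camera's point-spread function (PSF). As a minimal sketch (our own illustration, one image row, a simple 3-tap box PSF as an assumption), the following shows how a sharp edge is smeared:

```python
def blur_1d(row, psf):
    """Convolve one image row with a point-spread function (PSF),
    using zero padding at the borders."""
    half = len(psf) // 2
    out = []
    for i in range(len(row)):
        acc = 0.0
        for k, w in enumerate(psf):
            j = i + k - half  # source pixel under PSF tap k
            if 0 <= j < len(row):
                acc += w * row[j]
        out.append(acc)
    return out

# A sharp edge smeared by a 3-tap box PSF (non-ideal optics):
edge = [0, 0, 0, 1, 1, 1]
print(blur_1d(edge, [1/3, 1/3, 1/3]))
# The step at pixel 3 becomes a gradual ramp over pixels 2..4.
```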

The Solution

Several methods introduced to solve this problem are based on motion estimation in the video. However, motion in a video sequence is usually quite complex, and increasing the resolution of such sequences is a very hard task.

While one video sequence does not provide enough information to overcome the effects mentioned above, several video sequences describing the same dynamic scene provide more information about it (more samples of it), and can be used to create a video of higher space-time resolution.

In their paper “Increasing Space-Time Resolution in Video”, Eli Shechtman, Yaron Caspi and Michal Irani propose a method for constructing a video sequence of higher space-time resolution from several low-resolution video sequences generated by different cameras filming the same event. The method exploits the blurring effects introduced by each camera, which connect the different sequences. It requires no a-priori information about the scene, and as such can be applied to movies containing very complex motion. It also allows combining video sequences of different resolutions. An interesting trade-off between resolution increases along the different axes arises, because the information contained in all the low-resolution sequences together is still limited.
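The core idea can be sketched in a drastically simplified 1-D analogue (our own toy construction, not the paper's actual space-time formulation): each low-resolution measurement is a known weighted average of the unknown high-resolution values, with the weights given by that camera's blur. Stacking the measurements from all cameras gives a linear system, solved here in least squares via the normal equations:

```python
def solve_least_squares(A, b):
    """Solve min ||Ax - b|| via the normal equations A^T A x = A^T b,
    using plain Gaussian elimination (adequate for tiny toy systems)."""
    m, n = len(A), len(A[0])
    M = [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)]
         for i in range(n)]
    v = [sum(A[k][i] * b[k] for k in range(m)) for i in range(n)]
    for i in range(n):                      # forward elimination with pivoting
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        v[i], v[p] = v[p], v[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n):
                M[r][c] -= f * M[i][c]
            v[r] -= f * v[i]
    x = [0.0] * n
    for i in reversed(range(n)):            # back substitution
        x[i] = (v[i] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

# Unknown high-resolution signal (what we want to recover):
x_true = [1.0, 4.0, 2.0, 3.0]
# Each row is one measurement: camera A has a 2-tap box blur,
# camera B a 3-tap box blur; together they determine x uniquely.
A = [[0.5, 0.5, 0.0, 0.0],    # camera A
     [0.0, 0.5, 0.5, 0.0],
     [0.0, 0.0, 0.5, 0.5],
     [1/3, 1/3, 1/3, 0.0],    # camera B
     [0.0, 1/3, 1/3, 1/3]]
b = [sum(w * x for w, x in zip(row, x_true)) for row in A]
x_hat = solve_least_squares(A, b)
print([round(v, 6) for v in x_hat])  # [1.0, 4.0, 2.0, 3.0] -- x_true recovered
```

No single camera's measurements suffice here, but the combined system is full rank: this is the sense in which several blurred low-resolution sequences together carry more information than any one of them.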

The uses of Super-Resolution

Super-resolution is widely used today as a method of combining information received from several sources in order to produce better results. Some examples of the use of super-resolution methods are:

1. Research conducted by NASA uses super-resolution methods to reconstruct the surface of the planet Mars from multiple images.
See: A Bayesian Approach to High Resolution 3D Surface Reconstruction from Multiple Images

2. Combining the two concepts of super-resolution and image mosaicing: a panoramic mosaic image is created, and super-resolution methods are applied to increase its resolution.
See: Applying Super-Resolution to Panoramic Mosaics

3. Combining information from multiple sensors, possibly of different kinds. This can be used to exploit the correlation between the three color layers to produce color images of higher resolution, and in some applications of computer vision.
See: Multi-sensor super-resolution

And many more…

The aim of the project
In this project we present an implementation of the method mentioned above. We wish to investigate the method’s capabilities in increasing the resolution and reducing the blur. We present several examples of applying the algorithm, and point out some limitations of the method.

Tools
Our algorithm is implemented in Matlab, making use of the Optimization Toolbox. Our tests and examples use synthetic, computer-generated images and video sequences. We simulate the video camera with image-processing tools.

Some Examples
Here are some examples of applying the algorithm, which show its capabilities in increasing the resolution in both the spatial and the temporal domain, while reducing the blur.

1. Reducing blur in the spatial domain: on the left, one of four images of size 256×256, produced from the 512×512 Lena image by down-sampling by a factor of 2 along each spatial axis and blurring. On the right, the result of applying the algorithm to the four images, yielding a sharper image (of size 256×256).


2. Increasing spatial resolution: from four images of size 240×240 (top), created as described in example 1, we produced a result of higher resolution, of size 288×288. The result (bottom) is larger, yet sharper.



Conclusions

The examples above show the capabilities of the algorithm. One major advantage of this specific algorithm, which makes it very easy to use, is its independence of the movie content. This is in contrast to other algorithms (and improved versions of this one), in which a-priori information about the movie is needed and regularization is applied. However, those methods produce better results, and allow advanced uses of super-resolution, such as increasing a video sequence’s resolution using high-resolution still images.

Acknowledgment

We are grateful to our project supervisor Nir Maor for his help and guidance throughout this work. We also want to thank Eli Appleboim for his help regarding the solution of the equation system involved, and all the staff at the VISL lab.
We are also grateful to the Ollendorff Minerva Center Fund for supporting this project.

References