Image Extrapolation Using a Texture Synthesis Algorithm

Abstract
In recent years, several algorithms in the field of Texture Synthesis have been suggested. However, only a few of them are applicable (or adaptable) to the field of Image Extrapolation. This work examined the relevance of a specific non-parametric method for texture synthesis to the field of Image Extrapolation. The algorithm "grows" one pixel at a time from a given Sample image, based on the assumption of a Markov random field model.

However, in light of the algorithm's poor running-time performance, most of the research focused on exploring different approaches to speed up the extrapolation. These approaches were based on a variety of well-known techniques, such as the Features Vector and the Gaussian Pyramid. Different heuristics were examined, as well as combinations of several techniques.

In conclusion, we have dramatically reduced the running time, while achieving result quality that usually ranges from reasonable to very good in comparison to the original algorithm. Thus, the algorithm may now be used by different applications in a reasonable time.

Introduction – The Original Algorithm
Most of the algorithms suggested in recent years in the field of Texture Synthesis focus on "growing" an artificial image from a small "seed" extracted from a wider Sample image. The results produced do not necessarily include the original Sample image. As a consequence, only a few of these algorithms are applicable to the field of Image Extrapolation, in which the entire Sample is included in the result. Moreover, it is reasonable to assume that the Sample image might not be "patterned" at all.

This work examined the relevance of a non-parametric method for texture synthesis, presented by Efros and Leung, to the field of Image Extrapolation. Basically, the algorithm "grows" textures – one pixel at a time – from a given image, while assuming a Markov random field model. This implies that the distribution of the brightness value of a pixel, given the values of its neighbors, is independent of the rest of the image. Thus, the conditional distribution of each pixel, given all its neighbors, is estimated by searching for "similar neighborhoods" in the entire Sample image. "Similarity" is measured using the Sum of Squared Differences (SSD) metric.
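
To make the matching step concrete, the following is a minimal sketch of the neighborhood search, written in Python/NumPy for illustration (the project itself was implemented in Matlab). The function name, the Gaussian weighting width and the error threshold are illustrative assumptions, not the project's code.

    import numpy as np

    def ssd_search(sample, neighborhood, mask, err_threshold=0.1):
        # sample       -- 2D grayscale Sample image (floats in [0, 1])
        # neighborhood -- (w x w) window around the pixel being synthesized
        # mask         -- (w x w) boolean array, True where the window already
        #                 holds valid (known) pixels
        # Returns the (row, col) in `sample` of a randomly chosen close match.
        w = neighborhood.shape[0]
        half = w // 2
        # Gaussian weighting emphasizes pixels near the center; weights are
        # zeroed where the neighborhood is still unknown (non-valid pixels).
        ax = np.arange(-half, half + 1)
        gauss = np.exp(-(ax[:, None] ** 2 + ax[None, :] ** 2) / (2.0 * (w / 6.4) ** 2))
        weights = gauss * mask
        weights /= weights.sum()

        rows, cols = sample.shape
        errors = np.full((rows - w + 1, cols - w + 1), np.inf)
        for r in range(rows - w + 1):
            for c in range(cols - w + 1):
                window = sample[r:r + w, c:c + w]
                # Weighted sum of squared differences over the valid pixels only.
                errors[r, c] = np.sum(weights * (window - neighborhood) ** 2)

        # Pick randomly among all candidates within a tolerance of the best match.
        min_err = errors.min()
        cand = np.argwhere(errors <= min_err * (1 + err_threshold))
        r, c = cand[np.random.randint(len(cand))]
        return r + half, c + half  # center of the chosen window in `sample`

This exhaustive loop over every window in the Sample image is exactly the cost that the heuristics described below try to avoid.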

The problem
In many cases, the algorithm mentioned above produces good results; however, its running time is a real obstacle to any practical use. Extrapolating even relatively small images may take several hours. The reason for this poor performance is that for each pixel being synthesized, full SSD calculations are applied between the pixel's vicinity and the corresponding windows surrounding each one of the pixels in the Sample Image. Since these operations are computationally heavy, heuristics must be employed in order to drastically shorten the running time.

The solution
Two major approaches, each comprising several heuristics, were examined in order to solve the performance issue:

The first approach – using the results achieved when synthesizing one pixel to synthesize other pixels as well ("complete more than one"). Based on the presumption of local structure, after a pixel p has been synthesized according to the value of a pixel p', some of its immediate unfilled neighbors are synthesized according to the corresponding neighbors of p'.
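
The following is a minimal sketch of this heuristic, again in Python for illustration; the function name and the choice of 4-connected neighbors are assumptions, not taken from the project code.

    def complete_more_than_one(out, filled, sample, p, p_src):
        # After pixel p = (r, c) of `out` was synthesized from sample pixel
        # p_src = (sr, sc), also fill its still-empty 4-neighbors from the
        # corresponding neighbors of p_src, reusing the match already found.
        (r, c), (sr, sc) = p, p_src
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            ssr, ssc = sr + dr, sc + dc
            inside_out = 0 <= rr < out.shape[0] and 0 <= cc < out.shape[1]
            inside_smp = 0 <= ssr < sample.shape[0] and 0 <= ssc < sample.shape[1]
            if inside_out and inside_smp and not filled[rr, cc]:
                out[rr, cc] = sample[ssr, ssc]
                filled[rr, cc] = True

Each pixel copied this way saves one full neighborhood search, which is why this heuristic alone roughly halves the running time (see Results).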

The second approach – reducing the number of SSD calculations per synthesized pixel. It is assumed that in most cases there is no need to perform an exhaustive search over all of the windows in the Sample Image. Instead, a subset of candidate windows can be extracted using "cheap" calculations, and the complete SSD calculation is then applied only to that subset. Under this approach, two heuristics were studied:

  • Features Vector – in which each coordinate represents a certain characteristic of the vicinity of the pixel. Such a vector is constructed for each pixel, whether it is being synthesized or belongs to the Sample Image. The candidate subset is then chosen according to the distance (in the Features Vector space) between the synthesized pixel's vector and each of the vectors in the Sample Image. This process must be efficient, while also handling the problems caused by non-valid (not yet synthesized) pixels. Three main features were examined – "DC" (average value in the vicinity), "Sum of absolute directional derivatives", and "Standard deviation approximation". In addition, several distance functions and candidate subset sizes were studied (a minimal sketch of this candidate selection appears after the list)
  • A Gaussian Pyramid – which consists of a series of images, each scaled down by two in each dimension relative to the previous one. Such a pyramid is formed from the Sample Image. An exhaustive SSD search is performed on the pyramid's deepest (coarsest) level, and the best matches are "bubbled up" to the next level, where only their surroundings are checked. Again, only the best matches are "bubbled up", and so on. This approach can be seen as an attempt to correlate the low frequencies first, and then refine according to higher frequencies. Unique solutions were presented for problems that arose when scaling down the edges or a region with non-valid pixels – for example, padding the image using the "DC Feature" and "Complete more than one" methods before scaling it down (a sketch of the coarse-to-fine search also appears below)
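
Below is a minimal sketch of the Features Vector heuristic, assuming the same Python/NumPy setting as above; the exact feature definitions, the Euclidean distance and the subset size k are illustrative choices standing in for the several variants that were studied.

    import numpy as np

    def feature_vector(window, mask):
        # Cheap features for a (w x w) window; `mask` marks the valid pixels.
        vals = window[mask]
        dc = vals.mean()                    # "DC": average value in the vicinity
        # Sum of absolute directional derivatives, taken only where both
        # pixels of a difference are valid.
        dx = np.abs(np.diff(window, axis=1))[mask[:, 1:] & mask[:, :-1]]
        dy = np.abs(np.diff(window, axis=0))[mask[1:, :] & mask[:-1, :]]
        deriv = dx.sum() + dy.sum()
        std = vals.std()                    # standard-deviation approximation
        return np.array([dc, deriv, std])

    def candidate_subset(query_fv, sample_fvs, k=50):
        # Indices of the k Sample windows whose feature vectors are closest
        # (Euclidean distance) to the query; the full SSD search is then
        # restricted to this subset instead of the whole Sample Image.
        d = np.linalg.norm(sample_fvs - query_fv, axis=1)
        return np.argsort(d)[:k]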

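The coarse-to-fine search over the pyramid can be sketched as follows. The box-filter downscaling, the number of matches kept per level, and the `ssd_at` callback are assumptions for illustration (a real implementation would smooth with a Gaussian kernel before down-sampling); `query_pyr` is assumed to hold the down-scaled neighborhood of the pixel being synthesized at each level.

    import numpy as np

    def build_pyramid(img, levels):
        # Each level is scaled down by two in each dimension; a 2x2 box filter
        # stands in here for the Gaussian smoothing done before down-sampling.
        pyr = [img]
        for _ in range(levels - 1):
            prev = pyr[-1]
            h, w = (prev.shape[0] // 2) * 2, (prev.shape[1] // 2) * 2
            down = prev[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
            pyr.append(down)
        return pyr                  # pyr[0] = full resolution, pyr[-1] = coarsest

    def coarse_to_fine_search(pyr, query_pyr, ssd_at, keep=20):
        # Exhaustive SSD on the coarsest level; the best matches are "bubbled
        # up", and on each finer level only their (scaled) surroundings are
        # re-checked.  `ssd_at(level, r, c, query)` is a hypothetical helper
        # returning the window error at (r, c) of pyramid level `level`.
        level = len(pyr) - 1
        h, w = pyr[level].shape
        cand = [(r, c) for r in range(h) for c in range(w)]
        while True:
            cand = sorted(cand,
                          key=lambda rc: ssd_at(level, rc[0], rc[1], query_pyr[level]))[:keep]
            if level == 0:
                return cand[0]      # best match at full resolution
            # Map surviving candidates (plus a small surrounding) one level up.
            level -= 1
            nh, nw = pyr[level].shape
            cand = list({(min(max(2 * r + dr, 0), nh - 1), min(max(2 * c + dc, 0), nw - 1))
                         for r, c in cand for dr in (-1, 0, 1, 2) for dc in (-1, 0, 1, 2)})
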
Tools
All implementations were done in a Matlab environment.

Results
After examining the different approaches described above, and the different variants within each approach, it can be stated that:

  • The "Complete more than one" approach proved to be effective. Running time was reduced to about half of its original value, while quality was hardly affected
  • Forming a candidate set using the "Features Vector" approach produced mixed results. It seems that the "DC Feature" by itself was the only feature worthwhile (in terms of both quality and runtime). The rest of the features (and distance functions) did not correlate with the results produced by the exhaustive SSD search. Using the "DC Feature" reduced runtime to about 1–4% of its original value
  • The "Gaussian Pyramid" method also produced mixed results, though usually significantly better in quality than the "Features Vector" and the "DC Feature". Its running time was about 3–12% of the original algorithm's

Conclusions
In light of these results, and depending on the purpose of the extrapolation, different combinations of these heuristics may be suggested. For example:
"Quick Extrapolation" – which includes only the "Complete more than one" heuristic and the "DC Feature" with a small candidate set, leading to a very fast operation. This mode fits most usage models that require the extrapolation only as an intermediate stage, such as padding an image before applying a filter.

"Not So Quick Extrapolation" – which combines the "Complete more than one" and the "Gaussian Pyramid" methods. This mode is relevant mainly to applications in which the extrapolation itself is the output.
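
As an illustration only, the two modes could be expressed as parameter presets along the following lines; every key and value here is a hypothetical stand-in, since the report does not specify exact settings.

    # Hypothetical presets combining the heuristics described above; the keys
    # and values are illustrative assumptions, not the project's actual settings.
    EXTRAPOLATION_MODES = {
        "quick": {                          # extrapolation as an intermediate stage
            "complete_more_than_one": True,
            "candidate_selection": "dc_feature",
            "candidate_set_size": 20,       # small subset -> fastest runtime
            "use_pyramid": False,
        },
        "not_so_quick": {                   # extrapolated image is the final output
            "complete_more_than_one": True,
            "candidate_selection": None,    # full SSD, but only over the pyramid
            "use_pyramid": True,
            "matches_kept_per_level": 20,
        },
    }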

As an example, extrapolating an image of 250×170 pixels by 15 pixels on each side, using a combination of the "Complete more than one" and the Pyramid approaches, took less than an hour. In comparison, the extrapolation of the same image using the original algorithm took more than 16 hours, and it can be seen that the quality is hardly affected.

In any case, by using each of these modes, the running time was dramatically reduced, to 1–6% of its original value. The quality of the results usually varies between reasonable and very good. Thus, the algorithm can now be used by different applications in a reasonable time. Nevertheless, in many cases the results were disappointing; hence, there is clearly more work to be done, and future research may either concentrate on improving the quality of the results or focus on possible applications.

Acknowledgment
We wish to extend our sincere gratitude to our instructor, Boaz Ophir, and to Dr. Yoav Schechner for their advice, support and guidance. We would also like to thank the VISL staff and Mr. Johanan Erez for giving us the opportunity to explore this topic. We are also grateful to the Ollendorff Minerva Center, which supported this project.
