Abstract
When compressing data, rearranging it in certain ways sometimes gives a better result from the compression algorithm. If the rearrangement can be reversed, we have a scan method. Matching the scan method to the characteristics of the compression algorithm can dramatically improve the compression ratio.
Introduction
This project explores the efficiency of data compression achieved by applying a
Peano-Hilbert scan followed by compression with the Lempel-Ziv-Welch (LZW) algorithm.
The initial Peano-Hilbert scan is intended to improve the compression ratio achieved by the LZW algorithm: it produces an output vector of the input image that preserves spatial locality.
Preserving spatial locality yields a better compression ratio than supplying a regular raster-scan vector of the image as the algorithm's input.
The LZW compression algorithm benefits most from data that contains many repeated strings. When an image is converted into the LZW algorithm's input vector by a raster scan, regions of the same color or of a recurring color sequence are fragmented into small pieces, and possible string repetitions are lost. An initial Peano-Hilbert scan preserves more of these strings.
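As an illustration, here is a minimal sketch of such a locality-preserving scan in Python. The names hilbert_xy and hilbert_scan are ours, and the index-to-coordinate conversion follows the standard textbook construction rather than the project's exact implementation:

```python
def hilbert_xy(order, d):
    """Map index d along a Hilbert curve of the given order
    (grid side = 2**order) to (x, y) coordinates."""
    x = y = 0
    t = d
    s = 1
    while s < (1 << order):
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate the quadrant so consecutive sub-curves join end to end.
        if ry == 0:
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def hilbert_scan(image):
    """Flatten a 2**n x 2**n image (a list of rows) into a 1-D vector
    in Hilbert order, so spatially close pixels stay close in the
    output vector."""
    n = len(image)                       # side length, assumed a power of 2
    order = n.bit_length() - 1
    return [image[y][x]
            for x, y in (hilbert_xy(order, d) for d in range(n * n))]
```

Because the curve is continuous, a solid-colored region of the image emerges from hilbert_scan as one long run in the output vector instead of being chopped up row by row.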
Discussion
As a consequence of the LZW algorithm's nature, a "good" input for this routine is an image that includes groups of pixels of similar color; the spatial locality of these pixel groups is then preserved, and the result is a better compression ratio.
Another good application of this routine is a sequence of images. Spatial locality then becomes spatio-temporal locality, since a time-limited sequence of images is scanned at once. In this work image quartets are scanned, but any sequence whose length is a power of 2 could be handled. This limitation, together with the requirement that each image be square with a side length that is a power of 2, stems from the nature of the Peano-Hilbert curve, an example of which is given in the following figure:
[Figure: example of the Peano-Hilbert scan order on a small power-of-2 grid]
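The quartet scan can be realized, for example, by tiling the four frames into one square mosaic of twice the side length, which is again a square with a power-of-2 side. A sketch reusing hilbert_scan from the Introduction (scan_quartet is our name, and the row-major frame layout is an assumption, not the project's stated arrangement):

```python
def scan_quartet(frames):
    """Tile four 2**n x 2**n frames (lists of rows) into one square
    mosaic of side 2**(n+1) and Hilbert-scan it, so spatial locality
    becomes spatio-temporal locality across the quartet."""
    assert len(frames) == 4
    n = len(frames[0])
    top = [frames[0][r] + frames[1][r] for r in range(n)]     # frames 0, 1 side by side
    bottom = [frames[2][r] + frames[3][r] for r in range(n)]  # frames 2, 3 side by side
    return hilbert_scan(top + bottom)
```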
To further improve the compression ratio, a lossy scan was also implemented.
The lossy method replaces currently scanned blocks with previously scanned blocks that satisfy a lossy error criterion.
The implementation uses a square-root error criterion for 8 bpp images and a Hamming-distance criterion (an XOR count) for 1 bpp images.
The overall result of the lossy scan is a less detailed image that is compressed better by the LZW algorithm; the price is a loss of data.
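A sketch of this block-replacement idea, with our own naming throughout: we interpret the square-root criterion as a root-mean-square error, and block_size and threshold are illustrative parameters, not the project's values:

```python
def rms_error(a, b):
    """Root-mean-square error between two equal-length blocks (8 bpp)."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5

def hamming_error(a, b):
    """Hamming distance between two equal-length blocks (1 bpp):
    the number of positions where the bits differ (XOR and count)."""
    return sum(x ^ y for x, y in zip(a, b))

def lossy_blocks(vector, block_size, threshold, error=rms_error):
    """Split the scanned vector into blocks and replace each block with
    an earlier, already-seen block whose error is below the threshold.
    Repeated blocks compress better under LZW; the price is lost detail."""
    seen, out = [], []
    for i in range(0, len(vector), block_size):
        block = vector[i:i + block_size]
        match = next((s for s in seen
                      if len(s) == len(block) and error(s, block) <= threshold),
                     None)
        if match is None:
            seen.append(block)
            out.extend(block)
        else:
            out.extend(match)        # reuse the earlier block verbatim
    return out
```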
The LZW algorithm compresses less detailed images better because it builds a lookup table of the strings found in the image. Fewer details mean better reuse of the table entries; more details fill the lookup table rapidly, and the compression results worsen.
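For reference, a minimal textbook LZW compressor (not the project's exact implementation); the dictionary below is the lookup table discussed here, and the table_bits cap corresponds to the word-size tuning mentioned in the conclusions:

```python
def lzw_compress(data, table_bits=12):
    """Minimal LZW over a byte sequence. The dictionary ('lookup table')
    grows with every new string; highly detailed input fills it quickly,
    and the longer code words stop paying for themselves."""
    table = {bytes([i]): i for i in range(256)}
    max_size = 1 << table_bits           # cap on the table size
    w = b""
    codes = []
    for byte in data:
        wc = w + bytes([byte])
        if wc in table:
            w = wc                       # extend the current match
        else:
            codes.append(table[w])       # emit the longest known string
            if len(table) < max_size:    # stop growing once the cap is hit
                table[wc] = len(table)
            w = bytes([byte])
    if w:
        codes.append(table[w])
    return codes
```

Feeding it a Hilbert-scanned vector rather than a raster scan, e.g. lzw_compress(bytes(hilbert_scan(image))) for an 8 bpp image, keeps same-color runs together, so the table fills with fewer, more reusable strings.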
Conclusions
Although the LZW + Peano-Hilbert sequence looks promising, it is rather simplistic compared to predictive schemes for still images and differential schemes for movies; its results do not measure up to these more complex approaches.
Because it relies on tables, the LZW algorithm is very sensitive to detailed images; such images do not compress well because of the large overhead of the tables.
Changing the LZW word size to match the length of the frequently repeated strings improves the compression ratio by keeping the tables smaller and more usable.
Applying a lossy algorithm can break frequently used data strings and thereby worsen the compression ratio; the lossy algorithm must therefore be matched to the LZW algorithm's nature.
Despite some of the reservations above, the LZW + Peano-Hilbert sequence can sometimes give very good results, and further optimizing the sequence according to these conclusions might produce much better results.
Bibliography
1. Dr’ Micael Elad “Processing and analyzing images” November 1999.
2. A. Lempel and J. Ziv Compression of two-dimensional data, IEEE trans. On information theory, vol. IT-32 no. 1, pp.2-8 January 1986.
3. Thomas M. Cover, Joy A.Thomas Elements of Information Theory pp.319-326 Wiley-Interscience 1991.
4. Marcelo J. Weinberger and Gadiel Seroussi, The LOCO-I Lossless Image Compression Algorithm: Principles and Standardization into JPEG-LS , Hewlett Packard Laboratories.
Acknowledgments
We are grateful to our project supervisor Shie Manor for his help and guidance throughout this work.
We are also grateful to the Ollendorf Research Center Fund for supporting this project.

