Preprocessing Before JPEG Compression


Abstract
JPEG is the most common format for saving pictures. JPEG is a LOSSY format, which means some of the information is lost after compression.
When an image is compressed with low-quality JPEG, distortions appear. The reason is that the high frequencies in the picture are lost, so mainly the edges are affected.
This project aims to reduce these side effects of JPEG. The suggested approach is to preprocess the image before compression and to save some of the information in a separate file, so that the picture handed to JPEG has fewer edges than the original picture. During reconstruction, both files are used to rebuild the original image.

Initial Algorithm
The whole process deals only with gray-scale images.
The suggested algorithm works as follows:

Compressing

Divide the picture into 8×8 blocks. The picture is divided this way because JPEG itself works on 8×8 blocks. In blocks of this size, there are most likely only one or two main levels of intensity: two main levels if the block contains an edge, and only one level if it does not.
For each block, determine whether it contains one or two main levels of intensity in the following way:

1. Set a threshold parameter.
2. Run the Max-Lloyd algorithm with two intensity levels on the block.
3. Check whether the absolute difference between the two levels is bigger than the threshold. If it is, the block has two main levels of intensity. If not, it has one main level of intensity, which is the mean value of the block.
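
The three steps above can be sketched as follows. This is a minimal illustration, not the project's own code: the function names, the initialization from the block's minimum and maximum, and the iteration limit are assumptions made here. The block is passed as a flat list of its 64 gray values.

```python
def lloyd_max_two_levels(pixels, max_iter=50):
    """Two-level Max-Lloyd quantizer for a flat list of gray values.

    Returns (low, high), the two representative intensity levels.
    Initializing from the minimum and maximum is an assumption.
    """
    low, high = float(min(pixels)), float(max(pixels))
    for _ in range(max_iter):
        boundary = (low + high) / 2.0  # decision boundary: midpoint of the levels
        lower = [p for p in pixels if p <= boundary]
        upper = [p for p in pixels if p > boundary]
        if not upper:  # flat block: all pixels share one value
            break
        new_low = sum(lower) / len(lower)    # centroid of the darker group
        new_high = sum(upper) / len(upper)   # centroid of the brighter group
        if new_low == low and new_high == high:
            break  # converged
        low, high = new_low, new_high
    return low, high


def main_levels(pixels, threshold):
    """Return [low, high] for a block with an edge, else [mean of the block]."""
    low, high = lloyd_max_two_levels(pixels)
    if abs(high - low) > threshold:
        return [low, high]
    return [sum(pixels) / len(pixels)]
```

For example, a block that is half dark (20) and half bright (200) is classified as having two levels with a threshold of 60, while a block alternating between 100 and 110 collapses to its mean.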

Create a binary block from the one or two levels of intensity. A binary block corresponding to a block without an edge has one level of intensity, while a binary block corresponding to a block with an edge has two levels of intensity, as determined by the Max-Lloyd algorithm. Repeating this process for every block yields a “multi-binary” picture.
Subtract the multi-binary picture from the original to get a “difference picture”. This picture has fewer edges than the original picture.
Compress the multi-binary picture with PNG (lossless compression). Compress the difference picture with JPEG.
Note: the threshold parameter has a crucial effect. With a small threshold, more blocks have two levels of intensity in the binary picture and more edges are detected. With a big threshold, fewer blocks have two levels of intensity and fewer edges are detected.
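
The construction of a binary block and its difference block can be sketched as below. The function names are illustrative, and two details are assumptions made here rather than statements from the report: each pixel is mapped to the nearest main level, and the levels are rounded to integers so the binary block can be stored losslessly. Note that the difference is signed, so it would need some offset or rescaling before being stored as an 8-bit JPEG; the report does not describe that mapping, and it is left out of the sketch.

```python
def binary_block(block, levels):
    """Replace every pixel with the nearest of the block's main levels.

    block  -- list of rows of gray values
    levels -- one or two main levels of intensity for this block
    """
    rounded = [int(round(v)) for v in levels]  # assumption: integer levels
    return [[min(rounded, key=lambda v: abs(p - v)) for p in row]
            for row in block]


def difference_block(block, binary):
    """Signed pixel-wise difference: original minus binary."""
    return [[p - b for p, b in zip(row, brow)]
            for row, brow in zip(block, binary)]


# Small example (2x4 instead of 8x8 to keep it readable).
block = [[18, 22, 198, 202],
         [20, 20, 200, 200]]
binary = binary_block(block, [20.0, 200.0])
diff = difference_block(block, binary)
```

In this example the edge between the dark and bright halves ends up entirely in the binary block, and the difference block holds only small values around zero.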

Reconstruction
Decode the compressed difference picture and add it to the multi-binary picture.
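
A minimal sketch of this reconstruction step for one block is given below. It assumes the difference values have already been decoded back to signed integers, and it clips the result to the 8-bit range, since JPEG errors can push the sum slightly outside 0..255; the clipping and the function name are assumptions, not part of the report.

```python
def reconstruct_block(binary, decoded_difference):
    """Add the decoded difference block to the binary block, clipped to 0..255."""
    return [[max(0, min(255, b + d)) for b, d in zip(brow, drow)]
            for brow, drow in zip(binary, decoded_difference)]
```

If the difference picture were stored without loss, this would return the original block exactly; with JPEG, the error of the reconstruction equals the error JPEG introduces in the difference picture.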

Demonstration

Figure: difference block, multi-binary block, and original block.

Results
The threshold is set to 60.

Figure: (a) size of files vs. quality; (b) MSE vs. quality.
The combined size of the files (multi-binary + differences) is bigger than that of the single JPEG picture.
The MSE is bigger than with JPEG.
According to these results, our algorithm is worse than JPEG in both file size and MSE.
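
For reference, the MSE used in this comparison can be computed as in the sketch below. It assumes the usual definition, the mean of the squared pixel differences between the original and the reconstructed picture; the report does not state the exact normalization, and the function name is illustrative.

```python
def mse(original, reconstructed):
    """Mean squared error between two equally sized gray-scale pictures."""
    count = len(original) * len(original[0])
    total = sum((a - b) ** 2
                for row_a, row_b in zip(original, reconstructed)
                for a, b in zip(row_a, row_b))
    return total / count
```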
One of the assumptions was that edges are sharp: one pixel clearly belongs to one object and its neighbor to the other. However, this is usually not the case. In photographs, an edge usually consists of one or two pixels with intermediate values. In the differences picture, those pixels have values that differ from their surroundings, so JPEG compression does not work well on the differences picture.
The goal is therefore to change the multi-binary picture so that it is closer to the original picture.

Improved algorithm

After acquiring the multi-binary picture, apply a low-pass (LP) filtering step to it so that the edges are less sharp, then subtract the result from the original image. The filtered multi-binary picture is closer to the original picture, so the differences picture is smoother.
The LP algorithm: filter each block of the multi-binary picture with several LP filters and compute the mean absolute error between the filtered binary block and the original block. The filter that minimizes the error is chosen.
Save the multi-binary picture (before filtering), the differences picture, and a weights file that describes the filter chosen for each block.
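
The per-block filter selection can be sketched as follows. The report does not specify which LP filters were tried, so the candidate set here is hypothetical: 3×3 kernels with a given center weight and the remaining weight spread evenly over the eight neighbors, where a center weight of 1.0 means no filtering. Border pixels are handled by replicating the nearest pixel inside the block, which is also an assumption.

```python
CENTER_WEIGHTS = (1.0, 0.6, 0.3, 1.0 / 9.0)  # hypothetical candidate filters


def low_pass(block, center):
    """Filter a block with a 3x3 kernel; borders replicate the nearest pixel."""
    h, w = len(block), len(block[0])
    side = (1.0 - center) / 8.0  # weight of each of the 8 neighbors
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii = min(max(i + di, 0), h - 1)
                    jj = min(max(j + dj, 0), w - 1)
                    weight = center if (di == 0 and dj == 0) else side
                    acc += weight * block[ii][jj]
            row.append(acc)
        out.append(row)
    return out


def mean_abs_error(a, b):
    """Mean absolute difference between two equally sized blocks."""
    count = len(a) * len(a[0])
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / count


def choose_filter(binary, original, centers=CENTER_WEIGHTS):
    """Return (index of the best filter, its mean absolute error)."""
    errors = [mean_abs_error(low_pass(binary, c), original) for c in centers]
    best = min(range(len(centers)), key=errors.__getitem__)
    return best, errors[best]
```

In this sketch, the index returned by `choose_filter` is the kind of per-block value a weights file could hold, so that the same filter can be applied again during reconstruction.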

Reconstruction
Filter the multi-binary picture according to the weights file. Then add the differences picture.

Demonstration

Figure: original picture, multi-binary picture, differences with the initial algorithm, and differences with the improved algorithm.

Results

Figure: (a) size of files vs. quality; (b) MSE vs. quality.

a. The combined size of the files (multi-binary + differences) is bigger than that of the single JPEG picture.
b. The MSE is smaller than with JPEG at low quality settings.

Conclusion
Even after improving the algorithm, the results are still not satisfactory.
When the quality of the JPEG compression is high, there is little room left for our algorithm to improve on it.
When the quality of the JPEG compression is low, the file produced by regular JPEG compression is very small, so the files produced by our algorithm are much bigger in comparison.
It is hard to find cases in which our algorithm beats JPEG in the trade-off between file size and MSE.

Acknowledgement
We would like to thank Ari Shenhar for supervising this project. Our acknowledgement goes also to the laboratory staff and to the Ollendorff Minerva Center for their support.