Handwriting Recognition


Abstract
Our project implements a handwriting recognition system. The system consists of a hierarchical multi-class SVM-based classifier, which classifies each input letter on its own, and a dictionary comparison system, which checks the classified words against a dictionary file to see whether such words exist. The presence of a word in the dictionary implies a high probability of correct classification. If the word is not in the dictionary, the system tries to correct the classification so that it matches a word from the dictionary.
We use a new kernel function in our binary SVM classifiers which takes into account the exact way and direction in which the letters were written. In this way we train a classifier per writer, since each writer has a unique way of writing the different letters.

 

The problem
The main problem is to read raw handwritten text from input devices such as a digital pen, touch pad or mouse and convert it to a digital form that can be further processed, analyzed and stored by computer programs such as text editors.
In modern computer applications there is a need for handwriting recognition tools, as it is not convenient to use a traditional keyboard with devices such as pocket PCs, tablet PCs and laptops. A robust handwriting recognition system that is well fitted to a specific user is a convenient solution to these problems.
Therefore there is a need for an online, fast, robust and user-friendly handwriting recognition system.

 

The solution
The data input source we used is a touch pad with a digital pen. We used a special input program that we developed to store the input data. The program stores the order in which the pixels that make up each letter were drawn. The order of the pixels in each letter is very important for the core of our classifiers.
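As a rough illustration only, the sketch below shows one possible way to store such an ordered letter. The names LetterSample and record_pen_event are hypothetical and not part of our input program, and the actual file format used in the project is not shown here.

from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[int, int]

@dataclass
class LetterSample:
    label: str                                          # letter 'a'..'z' (known for training examples)
    points: List[Point] = field(default_factory=list)   # pen samples kept in drawing order

def record_pen_event(sample: LetterSample, x: int, y: int) -> None:
    """Append the next pen position; the drawing order is what the kernel relies on."""
    sample.points.append((x, y))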
We created a multi-class classifier which classifies each input letter to its corresponding class from the letters a-z. The multi-class classifier is built from a set of 26 binary classifiers. Each binary classifier is a Support Vector Machine (SVM) and is capable of separating its input examples into two groups according to its training set data. The core of the SVM is the kernel function, which is essentially a distance measure between two examples. The kernel function we used is Dynamic Time Warping (DTW), which finds a minimal distance between two input letters. The distance is computed by scanning both letters from their first pixel, at each step advancing one pixel in each of the letters, or one pixel in only one of them, until we reach the end of both letters. At each step we accumulate the distance between the two current pixels.
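A minimal sketch of such a DTW-style distance is given below, assuming each letter is the ordered list of (x, y) pixel samples described above. The local point distance (Euclidean here) and any normalization used by the actual kernel in the project may differ.

from math import hypot, inf
from typing import List, Tuple

def dtw_distance(a: List[Tuple[float, float]], b: List[Tuple[float, float]]) -> float:
    """Accumulated DTW distance between two letters given as ordered pixel lists."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated distance aligning the first i pixels of a
    # with the first j pixels of b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = hypot(a[i - 1][0] - b[j - 1][0], a[i - 1][1] - b[j - 1][1])
            # advance one pixel in both letters, or one pixel in only one of them
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    return cost[n][m]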
After creating the multi-class classifier we use another set of input examples to find common misclassifications between letters, for example between ‘a’ and ‘u’, or between ‘i’ and ‘l’. A matrix with this information is formed and is called the confusion matrix (CM). From this matrix we create a hierarchical classification tree, which is our final letter-domain classifier. The motivation behind building such a tree is to use classifiers that are as small as possible to separate classes which are similar (have a high confusion measure). This is done by joining similar classes from the bottom up: we repeatedly find the two most similar classes and join them, until we are left with only one class. Each join means creating a classifier that separates the two joined classes. When the tree is complete, we classify from the root downwards, letting each binary classifier decide the path through the tree, and the leaf we reach determines the class of the input letter.
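The sketch below illustrates one way such a bottom-up construction could look. The node structure, the symmetric confusion score used as the merge criterion, and the function name are assumptions; the actual training of the binary SVM at each join is omitted.

import string

def build_classification_tree(cm):
    """cm is a 26x26 confusion matrix; cm[i][j] counts how often letter i was
    classified as letter j on a held-out set."""
    idx = {c: k for k, c in enumerate(string.ascii_lowercase)}
    # start with one leaf node per letter
    nodes = [{'classes': {c}, 'left': None, 'right': None}
             for c in string.ascii_lowercase]

    def confusion(n1, n2):
        # symmetric confusion between two groups of letters
        return sum(cm[idx[a]][idx[b]] + cm[idx[b]][idx[a]]
                   for a in n1['classes'] for b in n2['classes'])

    while len(nodes) > 1:
        # join the two most similar (most confused) groups; in the real system a
        # binary SVM is trained here to separate them (training omitted in this sketch)
        i, j = max(((i, j) for i in range(len(nodes)) for j in range(i + 1, len(nodes))),
                   key=lambda p: confusion(nodes[p[0]], nodes[p[1]]))
        merged = {'classes': nodes[i]['classes'] | nodes[j]['classes'],
                  'left': nodes[i], 'right': nodes[j]}
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0]   # root; classification walks from the root down to a leaf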
While creating the tree we also calculate, for each classified letter, the probability of it being classified correctly. This is done by multiplying the probabilities from all the binary classifiers passed on the classification path. Each binary classifier is matched with a sigmoid function that maps its output to a probability of being correct. The sigmoid is fitted during the construction phase using a subset of the learning set that had not yet been used.
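A possible sketch of this probability estimate is shown below, assuming a Platt-style sigmoid with fitted parameters A and B per binary classifier; the exact form of the fitted sigmoid and how the chosen branch's probability is taken are assumptions.

from math import exp

def node_probability(svm_output: float, A: float, B: float) -> float:
    """Probability that the decision made by one binary SVM on the path is correct."""
    # sigmoid mapping of the raw SVM decision value to a probability (A, B fitted beforehand)
    p_positive = 1.0 / (1.0 + exp(A * svm_output + B))
    # probability of the side that was actually chosen (the sign of the output)
    return p_positive if svm_output >= 0 else 1.0 - p_positive

def path_probability(decision_values, sigmoid_params):
    """Multiply the per-node probabilities along the root-to-leaf classification path."""
    p = 1.0
    for out, (A, B) in zip(decision_values, sigmoid_params):
        p *= node_probability(out, A, B)
    return p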
To check that the classification went without errors, we check each classified word against a word list. If the word is in the list, we assume it was classified correctly. If it is not, we assume it was misclassified and try to fix it. To fix the word, we choose the letter with the lowest probability of correct classification and replace it with a wildcard character (e.g. ‘.’ in egrep syntax). We then search for the resulting pattern in the word list, and if matching words are found we choose one of them according to a predetermined rule, or let the user choose from the list.
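The following is a minimal sketch of this correction step, assuming the per-letter probabilities computed above and an in-memory word list (the project performed the pattern search with egrep); the function name and the tie-breaking rule are assumptions.

import re

def correct_word(letters, probabilities, wordlist):
    """letters: classified letters of one word; probabilities: per-letter probability
    of correct classification; wordlist: collection of valid words."""
    word = ''.join(letters)
    if word in wordlist:
        return word                        # word found in the list: accept it as is
    # replace the least reliable letter with a single-character wildcard
    weakest = min(range(len(letters)), key=lambda i: probabilities[i])
    pattern = re.compile('^' + word[:weakest] + '.' + word[weakest + 1:] + '$')
    candidates = [w for w in wordlist if pattern.match(w)]
    # choose by a predetermined rule (here simply the first match) or ask the user
    return candidates[0] if candidates else word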

 

Tools
In our project we used the following input device:
AIPTEK HyperPen 8000U cordless PC tablet

Software development tools:
Matlab 6.5
Visual Studio .NET
Visual C++ 6.0

Additional tools:
SVMlight by Thorsten Joachims
egrep by GNU

 

Conclusions
We achieved very high accuracy using our classifier system. Adding the dictionary and the probability measure raised the accuracy to up to 100% in some cases. In other cases, where the text contained words that do not appear in the word list, the effect of the dictionary was the opposite: the error rate increased. There are many ways in which the system we built could be improved to produce even better results. The whole system could be scaled up, meaning more features per letter and more letters in the training set. In addition, the fitting algorithm for the probability measure function should be improved to produce more accurate probability estimates. Since our system is fitted to a specific group of users, it requires a smaller learning set than a system that tries to match handwriting from an arbitrary writer.

 

Acknowledgment
We are grateful to our project supervisor Dori Peleg for his help and guidance throughout this work.
We are also grateful to the VISL lab engineer Johanan Erez, who provided us with all the equipment we needed, including the HyperPen 8000 PC tablet.
Many thanks to the Ollendorff Minerva Center for supporting the project.