Thyroid Diagnosis with SVM

The project aims to build a learning tool that diagnoses thyroid patients and separates them into three groups: healthy, hyperthyroid, and hypothyroid.

Abstract
The project aims to build a learning tool that diagnoses thyroid patients and separates them into three groups (healthy, hyperthyroid, and hypothyroid), based on 21 parameters and the known results of 3772 samples. The focus of the project is for the tool to have as small an error and as short a training time as possible. To create the tool we use the SVM training and analysis process.

The problem
The problem the project addresses is that the diagnosis of thyroid patients currently relies only on the doctor's opinion. This diagnosis depends on experience, and experienced doctors in the field are rare. The project will supply the less experienced doctor with a mathematical tool that indicates the most probable result based on the available data.

The solution
To solve the problem we train a learning engine called an SVM on a known group of examined people and their test results, and then let the SVM decide for any new patient whether he or she is healthy, hyperthyroid, or hypothyroid. A support vector machine (SVM) classifier creates a hyperplane that separates two groups: positive and negative. All the points x that lie on the hyperplane satisfy:

$w \cdot x + b = 0$

where w is normal to the hyperplane and b is its offset term. For the linearly separable case, the support vector algorithm simply looks for the separating hyperplane with the largest margin.
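The hyperplane description above can be made concrete with a few lines of Python. This is a minimal sketch that assumes the standard linear form w·x + b for the decision value and the usual canonical scaling in which the closest points satisfy |w·x + b| = 1, so that the margin width is 2/||w||. The function names and the numeric values of w and b are illustrative only and are not taken from the project.

```python
import math

def decision_value(w, b, x):
    """Signed score w.x + b; it is zero exactly on the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, b, x):
    """Return +1 (positive group) or -1 (negative group)."""
    return 1 if decision_value(w, b, x) >= 0 else -1

def margin_width(w):
    """Margin width 2 / ||w||, assuming the canonical scaling
    where the closest points satisfy |w.x + b| = 1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

# Illustrative hyperplane (assumed values, not project results).
w, b = [3.0, 4.0], -5.0

print(classify(w, b, [3.0, 4.0]))   # point on the positive side
print(classify(w, b, [0.0, 0.0]))   # point on the negative side
print(margin_width(w))              # 2 / 5
```

A larger ||w|| gives a narrower margin, which is why looking for the largest margin amounts to looking for the smallest ||w|| that still separates the two groups.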

2

We have to find which kernel and which parameters give the SVM the best separation, how to separate the three groups when the SVM separates only two groups at a time, and which of the measurements really help (or at least do not get in the way of) separating them.
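One common way to get a three-group decision out of two-group classifiers is one-against-one voting: train one binary classifier per pair of groups and let each cast a vote. The sketch below illustrates that idea only; the text does not say which scheme the project actually used. The three threshold functions are hypothetical stand-ins for trained binary SVMs, and the single feature they inspect is invented for the example.

```python
from collections import Counter
from itertools import combinations

CLASSES = ("healthy", "hyper", "hypo")

def one_vs_one_predict(x, binary_classifiers, classes=CLASSES):
    """Combine pairwise classifiers by majority vote.

    binary_classifiers maps a pair (a, b) to a function that takes a
    sample x and returns either a or b.
    """
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[binary_classifiers[(a, b)](x)] += 1
    # max() returns the first maximal item, so ties go to the class
    # listed earliest in `classes`.
    return max(classes, key=lambda c: votes[c])

# Hypothetical stand-ins for three trained binary SVMs.
toy_classifiers = {
    ("healthy", "hyper"): lambda x: "hyper" if x[0] > 2 else "healthy",
    ("healthy", "hypo"): lambda x: "hypo" if x[0] < -2 else "healthy",
    ("hyper", "hypo"): lambda x: "hyper" if x[0] >= 0 else "hypo",
}

for sample in ([3.0], [0.0], [-3.0]):
    print(sample, one_vs_one_predict(sample, toy_classifiers))
```

With three groups this needs three binary classifiers. The alternative, one-against-rest, also needs three here, but each of those is trained on all samples rather than on two groups only.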

Tools
To do that we used a PC, Matlab 6.5, and the DOS-based SVM engine from LIBSVM (A Library for Support Vector Machines).
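The LIBSVM command-line tools read samples from a text file with one sample per line, written as a label followed by sparse index:value pairs with 1-based indices. As a rough sketch of how the measurements might be prepared for such an engine, the helper below formats one sample. The function name, the numeric label encoding, and the sample values are assumptions made for this example, and the example is in Python rather than the Matlab used in the project.

```python
def to_libsvm_line(label, features):
    """Format one sample as '<label> <index>:<value> ...'.

    Indices are 1-based and ascending; zero-valued features are left
    out, which the sparse format allows. Values are written with up
    to six significant digits.
    """
    parts = [str(label)]
    for index, value in enumerate(features, start=1):
        if value != 0:
            parts.append(f"{index}:{value:g}")
    return " ".join(parts)

# Hypothetical sample: label 1 and four of the measurements.
print(to_libsvm_line(1, [0.73, 0, 1, 0.25]))
```

A full training file would hold one such line per sample, each listing up to 21 index:value pairs.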

Results
Figure 1 – the minimum error achieved by the SVM

Figure 2 – the decrease in error when choosing the measurements with a statistical selection system

Figure 3 – the decrease in error when choosing the measurements with a genetic selection system
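The genetic selection system is not described in detail here. As a rough illustration of what such a search can look like, the sketch below evolves a 0/1 mask over the 21 measurements using one-point crossover, bit-flip mutation, and survival of the better half. Every parameter value, the function names, and the toy error function are assumptions for the example; in a real run the fitness would be the SVM error measured with the masked measurements, which is what makes the search slow.

```python
import random

def genetic_select(n_features, fitness, generations=30, pop_size=20,
                   mutation_rate=0.05, seed=0):
    """Search for a feature mask (list of 0/1) that minimises `fitness`."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)
        parents = population[:pop_size // 2]      # better half survives
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)    # one-point crossover
            child = mother[:cut] + father[cut:]
            # Bit-flip mutation: toggle each bit with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = parents + children
    return min(population, key=fitness)

# Toy stand-in for the SVM error: invented for this example only.
USEFUL = {0, 2, 16, 17, 20}   # hypothetical "helpful" measurements

def toy_error(mask):
    missing = sum(1 for i in USEFUL if not mask[i])
    extra = sum(mask) - (len(USEFUL) - missing)
    return missing + 0.1 * extra   # penalise missing and unneeded features

best = genetic_select(21, toy_error)
print(best, toy_error(best))
```

Because the better half of each generation is carried over unchanged, the best mask found never gets worse from one generation to the next.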

Conclusions
From the project one can conclude that the SVM tool can produce reliable results, with very low errors, for thyroid diagnosis and probably for many other medical diagnosis problems. Specifically, we reached an error of 0.87% for the thyroid problem when using a polynomial kernel, with the measurements chosen according to the faster exponential kernel (1.06% error). The measurement selection took three days with the probability system and 32 hours with the genetic system.

Acknowledgment
We are grateful to our project supervisor Dori Peleg for his help and guidance throughout this work, and to the VISL lab team for their help and resources. Many thanks to the Ollendorff Minerva Center Fund for supporting this project.
We are also grateful to the authors of LIBSVM (A Library for Support Vector Machines) for the SVM engine that we used throughout the project.