Abstract
This project implements a new way to reconstruct a PET (Positron Emission Tomography) image. The implementation is based on paper [1], written by Ben-Tal & Nemirovski, and is integrated into an existing simulation code written by Kisilev.
The project goals were to implement the mirror descent algorithm presented by Ben-Tal & Nemirovski, to integrate the implementation with an existing PET reconstruction code written by Kisilev, and finally to compare the two algorithms.
Introduction
This paper presents the adaptation and implementation of the iterative mirror descent algorithm for the PET image reconstruction problem.
PET (Positron Emission Tomography) is a medical imaging technique based on injecting radioactive tracers into the body. These tracers emit photons, which are detected by detector pairs; each detector pair is referred to as a bin. By measuring the number of photons detected by each bin, we obtain a projection of the tracer distribution.
In recent years, the use of iterative algorithms to solve the reconstruction problem has gained recognition as a better approach than the inverse-transform method. This is due to the rise in computational power and the introduction of new, powerful algorithms, such as the one presented here.
The Mirror Descent (MD) algorithm, recently proposed by Ben-Tal and Nemirovski [1], belongs to the family of gradient descent methods. It is a variation of the Conjugate Barrier (CB) algorithm, introduced by Ben-Tal and Nemirovski [3] and implemented by Kisilev [2]. The idea of the CB algorithm is to make the iterative step in a conjugate space, using the Legendre transformation. First, we calculate the values of the minimized function and its gradient in the original space. Next, we transform the image to the conjugate space and make the step there. Due to the nature of the conjugate space, the convergence of the minimization process is faster. After making the step, the image is transformed back to the original space to calculate the next values of the function and its gradient.
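For concreteness, the Legendre (convex conjugate) transform underlying this construction can be written as follows. This is the standard definition; the closed form below assumes the entropy barrier on the count simplex (our reading of the setup, not a formula quoted from [1] or [2]):

```latex
% Legendre (convex conjugate) transform of a barrier h over a domain X:
h^{*}(\xi) = \sup_{x \in X} \bigl\{ \langle \xi, x \rangle - h(x) \bigr\}.
% For the entropy barrier h(x) = \sum_v x(v)\ln x(v) on the simplex
% X = \{\, x \ge 0,\ \sum_v x(v) = T \,\}, the supremum has the closed form
h^{*}(\xi) = T \ln\Bigl(\sum_v e^{\xi(v)}\Bigr) - T\ln T,
% attained at x(v) \propto e^{\xi(v)}, which is why steps taken in the
% conjugate space become multiplicative updates in the original space.
```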
The new idea in the Mirror Descent algorithm is the use of memory, i.e. the use of previous iterations to calculate the current step's gradient and step size. Using the previous iterations to decide the current step size and gradient form should produce a faster rate of convergence than the classical methods.
In this paper we begin by describing the general PET reconstruction problem. Next, we describe the CB algorithm for solving this problem. Afterwards, we present the MD algorithm, noting the differences between the two.
The general PET reconstruction problem
The PET imaging technique, as mentioned above, works by detecting photons emitted from radioactive tracers. The detection is performed in bins (detector pairs).
Let the total number of photons detected in bin b be y(b). As in all medical imaging, the body is divided into voxels. Let the number of photons generated independently in voxel v be n(v).
The generation of photons in each voxel is described by a Poisson process with expected value λ(v):

P{ n(v) = k } = e^(−λ(v)) λ(v)^k / k! ,  k = 0, 1, 2, …
The values λ(v) depend on the tracer distribution and are a function of the tissue structure of the tested body part. The variables y(b) are independent and Poissonian, with expected values ȳ(b):

ȳ(b) = Σ_v λ(v) p(v,b)
Let p(v,b) denote the probability of the event that a photon emitted from voxel v is detected in bin b, forming a matrix with one entry per (voxel, bin) pair. This probability matrix represents physical factors such as the scanner geometry, the detector efficiency, and the scanned body composition. In the next chapter we define the objective function built from the measurements y(b).
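As an illustration of this measurement model, the following sketch simulates bin counts from a hypothetical system matrix. All names and sizes here are illustrative and are not part of the original simulation code:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

n_vox, n_bins = 16, 32                     # illustrative problem sizes
# Hypothetical probability matrix p(v, b); each row sums to less than 1
# (a photon from voxel v may also escape undetected).
P = rng.random((n_vox, n_bins))
P *= 0.9 / P.sum(axis=1, keepdims=True)

lam = rng.random(n_vox) * 100.0            # expected emissions per voxel

# Expected counts per bin: ybar(b) = sum_v lam(v) * p(v, b)
ybar = P.T @ lam

# The measured counts y(b) are independent Poisson draws with these means.
y = rng.poisson(ybar)
```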
The minimization function
The power of all iterative algorithms comes from minimizing a well-defined function using its gradient. It is important to choose a representative function, so that by minimizing it we clear the image of unwanted elements (noise, artifacts, etc.).
In the area of medical imaging, the log-likelihood function is the traditional choice for such a function. In the case of PET imaging, the minimized (negative) log-likelihood takes the form, up to an additive constant:

f(λ) = Σ_v s(v) λ(v) − Σ_b y(b) ln( Σ_v λ(v) p(v,b) )

where y(b) are the measurements taken by the scanner, λ = {λ(v)} is the set of unknown parameters, and

s(v) = Σ_b p(v,b)

is the sensitivity image (i.e. the probability that an emission from voxel v is detected in any of the bins).

The gradient of this function takes the form

∂f/∂λ(v) = s(v) − Σ_b p(v,b) y(b) / ȳ(b),   where ȳ(b) = Σ_v λ(v) p(v,b).
Note: Kisilev's paper uses a penalized log-likelihood function and gradient. Since our implementation refers to the un-penalized case, we use the gradient above.
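The objective and its gradient can be sketched in NumPy as follows. This is a minimal illustration under the un-penalized model above; the function names are ours, and the negative log-likelihood is written up to an additive constant:

```python
import numpy as np

def neg_loglik(lam, P, y):
    """Un-penalized negative Poisson log-likelihood (constants dropped):
    f(lam) = sum_v s(v)*lam(v) - sum_b y(b)*ln(ybar(b))."""
    ybar = P.T @ lam                  # expected counts per bin
    s = P.sum(axis=1)                 # sensitivity image s(v) = sum_b p(v,b)
    return s @ lam - y @ np.log(ybar)

def neg_loglik_grad(lam, P, y):
    """Gradient: s(v) - sum_b p(v,b) * y(b) / ybar(b)."""
    ybar = P.T @ lam
    return P.sum(axis=1) - P @ (y / ybar)
```

A finite-difference check is a quick way to confirm the gradient matches the objective.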
Conjugate Barrier (CB) algorithm
This chapter is based on the implementation and article by Kisilev [2]. The name of the algorithm describes the way it works: "Conjugate" refers to the fact that, as mentioned before, the algorithm uses a conjugate space (via the Legendre transformation) to calculate the next step; the "Barrier" is a function that defines a valid domain for the values of the optimized vector.
Let us define the (entropy) function h as

h(x) = Σ_v x(v) ln x(v),   x ∈ X,

where X is the domain of valid values of x – all possible normalized images:

X = { x : x(v) ≥ 0, Σ_v x(v) = T },

where T is the total number of detector counts. The function h is strongly convex on X with respect to the ℓ1 norm.
Now, let us define the conjugate function of h, denoted h*:

h*(ξ) = max_{x ∈ X} { ⟨ξ, x⟩ − h(x) }.

Step 1: make the step in the conjugate space,

ξ_{n+1} = ξ_n − γ_n f'(x_n),

where f'(x_n) is the gradient of the log-likelihood function at the n-th iteration, and γ_n is a positive step size at the n-th iteration.

The value of the conjugate function h* at a given point ξ is found by solving the maximization problem above, so that h*(ξ) is given as the optimal value of this problem, and the next iterate x_{n+1} as the optimal solution.
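With the entropy barrier on the count simplex, the conjugate-space step and the back-transformation combine into a closed-form multiplicative update. The sketch below reflects our reading of this construction (not Kisilev's actual code), and assumes the current image x is strictly positive:

```python
import numpy as np

def cb_step(x, grad, gamma, T):
    """One conjugate-space step on X = {x >= 0, sum(x) = T} with the
    entropy barrier: step in the conjugate (dual) space, then map back
    via the Legendre transformation (a softmax-like renormalization)."""
    z = np.log(x) - gamma * grad       # step in the conjugate space
    z -= z.max()                       # stabilize the exponential
    x_new = np.exp(z)
    return T * x_new / x_new.sum()     # optimal solution of the max problem
```

Note that the update keeps every voxel strictly positive and preserves the total count T, so no explicit projection onto X is needed.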
Mirror Descent algorithm
This algorithm was recently proposed by Ben-Tal and Nemirovski [1]. It is an improvement on the CB algorithm that maintains all the qualities of its predecessor.
The improvement is the addition of memory to the iterative process. In general, an iterative step is composed of two parts: the step direction and the step size. Standard iterative algorithms calculate both based on the current iteration only: the step direction is usually the gradient of the minimized function at the current iterate, and the step size is a decaying function chosen with respect to the minimized function value. This algorithm keeps a memory of the previous function values and gradients, and calculates the direction and the size based on this accumulated knowledge.
The algorithm requires, in addition to the minimized function f, a constraining function defined on the feasible set. We use the fact that the image is a non-negative vector, and that the total number of photon emissions is known, to come up with the constraining function: the entropy h(x) = Σ_v x(v) ln x(v) over the set { x : x(v) ≥ 0, Σ_v x(v) = T }, where T is the total number of photons detected in all of the bins.
In their paper, Ben-Tal and Nemirovski describe three policies for calculating the step size based on the memory of previous iterations. We chose to implement the "divergent-series policy", mainly because the other two require solving a min-max problem at each iteration.
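A minimal sketch of the resulting loop under the divergent-series policy is given below. We take gamma_t = gamma0 / sqrt(t) as a representative divergent series; the constants and the exact schedule in our implementation may differ, and the entropy setup is the one assumed above:

```python
import numpy as np

def mirror_descent(grad_f, x0, T, gamma0=1.0, n_iter=200):
    """Entropy mirror descent on {x >= 0, sum(x) = T} with a
    divergent-series step-size policy gamma_t = gamma0 / sqrt(t)."""
    x = x0.copy()
    for t in range(1, n_iter + 1):
        g = grad_f(x)
        gamma = gamma0 / np.sqrt(t)    # divergent-series step size
        z = np.log(x) - gamma * g      # step in the conjugate space
        z -= z.max()                   # stabilize the exponential
        x = np.exp(z)
        x *= T / x.sum()               # back onto the simplex
    return x
```

For example, minimizing ||x − c||² over the unit simplex (T = 1) from a uniform start drives x toward c.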
We use the following variables to store values from previous iterations:
obj(i) – the objective values f(x) from previous iterations.
cns(i) – the constraining-function values from previous iterations.
snum(i) – part of the numerator of the lower bound on f(x) in each iteration.
sdnm(i) – part of the denominator of the lower bound on f(x) in each iteration.
In iteration t, we thus have at our disposal the values of f(x), its gradient f'(x), and the constraining-function values from all previous iterations. In the previous iteration we have already calculated fmod – the lower-bound model of the optimal value – for the current iteration.
From these stored values we form an auxiliary function and define two bounds on the optimal value. After calculating the two bounds, we check whether fmod is properly "centered" between them in the current iteration: if it is, fmod is left unchanged; otherwise we update it.
Using the updated fmod we calculate the gradient, and with the help of a further auxiliary function we calculate the level and the step size.
The next step is to calculate the acceleration factor. In the case of plain step sizes the factor equals one; otherwise it is set using the largest power K that still maintains a prescribed inequality.
The new point is then mapped through the Legendre transformation to the conjugate space; this transformation is the same as in the CB algorithm (see section 4). We also calculate fmod for the next iteration, and finally perform the current step.
Conclusion
In our implementation we encountered a problem in estimating the lower bound (fmod). When we used large-scale methods, the estimate was too far from the true function value, causing the iterative step size to be too large. The reason for this appears to be the constant values we used and the initialization of fmod. A further study, based on our implementation, should be made to achieve a proper initialization.
References
[1] Ben-Tal, A. and Nemirovski, A. (???). “Large-scale non-smooth convex minimization via mirror descent”, ???
[2] Kisilev P., Zibulevsky M. and Zeevi Y. (???) “Total variation and wavelet regularization methods in emission Tomography”, ???
[3] Ben-Tal, A. and Nemirovski, A. (1999). “The conjugate barrier method for non-smooth convex optimization”, TR #5/99, Oct. 1999, Minerva Optimization Center, Technion.
Acknowledgements
We would like to thank our supervisor Pavel Kisilev for his support and guidance throughout this project. We would also like to thank the Ollendorff Minerva Center Fund, which supported this project.

