Chaowei Tan, Bo Wang, Paul Liu, Zijuan Xu, Zhengjuan Fan, Dong C. Liu
The WFOV (wide field of view) imaging mode, also called extended-field-of-view or panorama imaging, provides an ultrasound image over an area much larger than the real-time window normally available.
The basic process of WFOV is to compound successive images in a sequence. For this application, we use image registration techniques to recover the probe motion from comparable information in the successive images. Compared with previous work, we present a modified search algorithm for image registration and introduce an adaptive mesh method, applied over images pre-processed by a fast smoothing operation, to optimize motion estimation. An angle-based correction method is then designed to reduce errors in the search results. Furthermore, we calculate the overall frame motion estimation between two weighted point sets. Finally, we propose a persistence method with a polygon filling algorithm to compound the WFOV image.
Block Based Motion Estimation
The basic idea of block-based motion estimation is to divide the image into equal-sized M × N blocks, and a block is selected as the target block in the current frame. Candidate blocks within the search area of the previous frame are then compared against the target block. The matching criterion we chose is the sum of absolute differences (SAD). The best-matched candidate block is the one with the minimum SAD, called the MSAD, from which the motion vector of the target block is determined.
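As an illustration of this criterion, the following is a minimal Python/NumPy sketch of SAD matching with a brute-force candidate search; the function and variable names are ours, not code from the system:

```python
import numpy as np

def sad(target_block, candidate_block):
    """Sum of absolute differences between two equal-sized blocks."""
    return np.abs(target_block.astype(np.int32)
                  - candidate_block.astype(np.int32)).sum()

def best_match_full_search(cur, prev, top, left, block=16, radius=16):
    """Brute-force evaluation of every candidate within `radius` of the
    target block at (top, left); returns the motion vector (dy, dx) of
    the minimum-SAD (MSAD) candidate."""
    target = cur[top:top + block, left:left + block]
    best_err, best_vec = np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if 0 <= y <= prev.shape[0] - block and 0 <= x <= prev.shape[1] - block:
                err = sad(target, prev[y:y + block, x:x + block])
                if err < best_err:
                    best_err, best_vec = err, (dy, dx)
    return best_vec
```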
In order to balance search speed and quality of motion estimation, we propose an improved method based on the fast adaptive MSAD search strategy. This method adapts well to WFOV processing, and its results are more precise than those of the plain MSAD search.
Search Error Surface and Initial Conditions Analysis
In the equation below, the error is calculated between the target block at (x₀, y₀) in frame H(k) and the block at location (i, j) in frame H(k−1):

SAD(i, j) = Σ_{m=0..M−1} Σ_{n=0..N−1} |H_k(x₀+m, y₀+n) − H_{k−1}(i+m, j+n)|

Using the full search algorithm (FSA) within a search area of 33 × 33, we obtain an error surface of 33 × 33. One example of such error surfaces is shown in Figure 1, viewed from different positions. The starting point of the search is at the center (16, 16).
In order to get correct information of the ROI, the probe is moved forward along the body surface, and all the images are kept in the same scan plane. Because a large moving distance and tissue motion can deform the acoustic speckle, we control the image frame rate so that the moving distance between two successive frames is within one millimeter. We therefore obtain uni-modal error surfaces, as shown in Fig. 1, whose local minimum region contains the best-matched position.
The MSAD method is well suited to a uni-modal surface and can be seen as a simplified conjugate direction search. It quickly reaches the region near the best-matched position, but it is not precise enough for the search required in WFOV. Hence we improve the MSAD method by adding a refinement stage based on the gradient descent search algorithm, so that the best-matched position is hit in most cases. The gradient stage also compensates for the weakness of the MSAD search in the diagonal direction.
Implementation of Improved MSAD Method
1. Preprocessing: apply a 7 × 7 box filter to the current and the previous frames for smoothing.
2. Compute the initial SAD of each 16 × 16 block at its initial position (i_s, j_s) in both frames. Assuming there must be some relative movement between two successive frames, mark the block as invalid if this SAD value is smaller than a predefined threshold, and move on to the next block. After this step, relatively static blocks and blocks without enough information (e.g., low gray value regions) are discarded.
3. Compute the left and right SAD values along the x-direction. If the left is smaller, continue the SAD search leftward until a larger SAD value is obtained or the left search boundary is hit; similarly, if the right is smaller, continue rightward. The final value is a local minimum along the x-direction. Then search along the y-direction in the same way: compute the up and down SAD values and follow the decreasing direction to its minimum. The resulting candidate SAD is a minimum along the y-direction.
4. From the position of the candidate block, use the gradient descent search algorithm to get the final SAD. The traditional gradient descent algorithm with a 3 × 3 search mode evaluates the 9 points around every check point. When the minimum occurs at the center of the search window, the search stops and the final position (i_f, j_f) is found.
5. The motion vector is v_x = i_f − i_s and v_y = j_f − j_s. If the length of the vector (v_x, v_y) is close to 0, discard this block and save its corresponding SAD value as the new threshold used in the initial test of Item 2.
6. Repeat Items 2 to 5 for all blocks in the current frame.
7. If the number of discarded blocks reaches a high proportion of the total number of blocks in the current frame, this frame is treated as invalid and is discarded from the sequence of compounding frames.
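The following Python sketch outlines steps 3 and 4, assuming the SAD helper below and illustrative block and search-radius sizes; it is an outline of the search logic, not the system's implementation:

```python
import numpy as np

def sad_at(cur, prev, top, left, dy, dx, block=16):
    """SAD between the target block in `cur` and the block displaced by
    (dy, dx) in `prev`; infinite when the candidate leaves the image."""
    y, x = top + dy, left + dx
    if not (0 <= y <= prev.shape[0] - block and 0 <= x <= prev.shape[1] - block):
        return np.inf
    t = cur[top:top + block, left:left + block].astype(np.int32)
    c = prev[y:y + block, x:x + block].astype(np.int32)
    return np.abs(t - c).sum()

def improved_msad(cur, prev, top, left, block=16, radius=16):
    """1D descents along x then y (step 3), followed by a 3x3
    gradient-descent refinement (step 4); returns the motion vector."""
    dy, dx = 0, 0
    for axis in (1, 0):                      # x-direction first, then y
        while True:
            here = sad_at(cur, prev, top, left, dy, dx, block)
            steps = ((0, -1), (0, 1)) if axis == 1 else ((-1, 0), (1, 0))
            moved = False
            for sy, sx in steps:
                ny, nx = dy + sy, dx + sx
                if max(abs(ny), abs(nx)) <= radius and \
                        sad_at(cur, prev, top, left, ny, nx, block) < here:
                    dy, dx, moved = ny, nx, True
                    break
            if not moved:                    # local 1D minimum reached
                break
    while True:                              # 3x3 refinement
        here = sad_at(cur, prev, top, left, dy, dx, block)
        nbrs = [(dy + sy, dx + sx) for sy in (-1, 0, 1) for sx in (-1, 0, 1)]
        best = min(nbrs, key=lambda p: sad_at(cur, prev, top, left, p[0], p[1], block))
        if sad_at(cur, prev, top, left, best[0], best[1], block) >= here:
            return dy, dx                    # minimum at the window center
        dy, dx = best
```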
Angle Based Correction Method
Because the local-minimum problem, tissue motion, speckle deformation, and scanning mistakes can make the estimated motion vectors inaccurate, a correction method is needed to regulate the bad motion vectors.
The current mean vector (derived from the block motion vectors of the current frame) and the historical mean vector (the mean vector of the previous corrected frame) are the criteria during the correction processing. There are two reasons for choosing mean vectors over other measures. First, the use of thresholds in our methods makes the distribution of valid blocks irregular, so median filtering of block vectors is not suitable here. Second, the motion of the corrected previous frame should dominate the motion of the current frame. The angle-based correction method has the following steps:
A. The normalized current mean vector CurNorVec and the normalized history mean vector HisNorVec are calculated first, and then the angle θ between the two vectors is obtained. If the absolute value of θ is greater than 10°, the current image is treated as invalid because of the excessive deviation between the two successive images, and the current frame is discarded. Otherwise, the following angle correction equation is applied to get the new normalized vector NewNorVec (note that, being normalized, it can be seen as an angle):
NewNorVec = CurNorVec × e_k + HisNorVec × (1 − e_k)
e_k = e^(−0.25 S² sin²(θ))
The new mean vector NewMVec is obtained as the product of NewNorVec and the length of the original current mean vector.
B. Correct the motion vector of each block in the current frame with NewMVec. The correction equation is:
NewBlockVec = CurBlockVec × α + NewMVec × (1 − α)
where CurBlockVec is the motion vector of a block and NewBlockVec is its regulated value. The coefficient α is defined as,
α = e^(−c1 × min(1, c2/(c3+1)))
where the coefficients c1 and c2 control the shape of the curve and c3 is the angle between CurBlockVec and NewMVec in degrees.
C. Update the history mean vector with the new mean vector derived from the corrected motion vectors.
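The Python sketch below ties steps A to C together; the structure follows the equations above, but the numeric values of S, c1, and c2 are placeholders rather than the tuned system constants:

```python
import numpy as np

def correct_motion_vectors(block_vecs, his_mean_vec, S=3.0, c1=1.0, c2=10.0):
    """Angle-based correction of the (N, 2) float array `block_vecs`.
    Returns (corrected vectors, updated history mean vector), or None when
    the frame should be discarded. S, c1, c2 are placeholder constants."""
    cur_mean = block_vecs.mean(axis=0)
    cur_nor = cur_mean / np.linalg.norm(cur_mean)
    his_nor = his_mean_vec / np.linalg.norm(his_mean_vec)
    theta = np.arccos(np.clip(np.dot(cur_nor, his_nor), -1.0, 1.0))
    if np.degrees(theta) > 10.0:
        return None                                  # step A: frame deviates too much
    ek = np.exp(-0.25 * S ** 2 * np.sin(theta) ** 2)
    new_nor = cur_nor * ek + his_nor * (1.0 - ek)    # blended direction
    new_mvec = new_nor * np.linalg.norm(cur_mean)
    corrected = np.empty_like(block_vecs)            # step B: per-block pull
    for i, v in enumerate(block_vecs):
        cos = np.dot(v, new_mvec) / (np.linalg.norm(v) * np.linalg.norm(new_mvec))
        c3 = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
        alpha = np.exp(-c1 * min(1.0, c2 / (c3 + 1.0)))
        corrected[i] = v * alpha + new_mvec * (1.0 - alpha)
    return corrected, corrected.mean(axis=0)         # step C: new history mean
```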
Frame Image Overall Motion Estimation
In order to obtain the overall motion parameters, we calculate the the minimum mean squared error (MSE) between images,
where R is rotation, t is translation, and c is scaling, {xi} and {yi} are two point patterns, i = 1, 2, …, n in an n-dimensional space. And in this application, n = 2. The optimum transformation parameters as follows,
Therefore, the least-squares optimization method is used to recover the motion between successive images in a sequence, i.e. from images 1 to 2, 2 to 3 and so on. Briefly, if the motion from images 1 to 2 is given by the parameters R, t and c, and that between images 2 to 3 is given by RĄ, tĄ and cĄ, the transformation matrix representing the overall motion from image 1 to 3 is simply given by,
where μx and μy are mean vectors of {xi} and {yi}, σx2 and σx2 are variances around the mean vectors of {xi} and {yi}, UDVT is a singular value decomposition of ∑xy the covariance matrix of these two point patterns, and S is expressed as,
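A compact Python/NumPy sketch of this closed-form solution (Umeyama's method) and of the composition of successive motions; the function and variable names are ours:

```python
import numpy as np

def similarity_transform(x, y):
    """Least-squares R (rotation), t (translation), c (scale) mapping the
    point set x onto y; x and y are (n, 2) arrays with a priori
    one-to-one correspondence."""
    mu_x, mu_y = x.mean(axis=0), y.mean(axis=0)
    var_x = ((x - mu_x) ** 2).sum(axis=1).mean()          # sigma_x^2
    cov_xy = (y - mu_y).T @ (x - mu_x) / x.shape[0]       # Sigma_xy
    U, D, Vt = np.linalg.svd(cov_xy)
    S = np.eye(2)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:          # avoid reflections
        S[-1, -1] = -1.0
    R = U @ S @ Vt
    c = np.trace(np.diag(D) @ S) / var_x
    t = mu_y - c * (R @ mu_x)
    return R, t, c

def homogeneous(R, t, c):
    """3x3 matrix M = [cR t; 0 1], so that M13 = M23 @ M12."""
    M = np.eye(3)
    M[:2, :2] = c * R
    M[:2, 2] = t
    return M
```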
Persistence Based Spatial Compounding Method
The persistence-based spatial compounding method is applied to image combining. A motion-dependent persistence function is used to control the display of the real-time B-mode image while the target keeps changing: it tries to keep up with the motion of the target while displaying the changes smoothly in the live image. The basic equation of the persistence function is:

I^(p)(k+1) = β × I(k+1) + (1 − β) × I^(p)(k)

where I represents the value of an individual pixel in a frame, k and k+1 denote frame numbers in the discrete time domain, and the superscript (p) indicates the persistence frame: I^(p)(k) is a pixel value on the k-th persistence frame, while I(k) is a pixel value on the k-th acquired frame. The persistence parameter β is a function of the absolute difference between the current and previous acquired pixels and determines how fast the frames change. The parameter β is determined by a family of curves.
In these curves, d represents the absolute difference between the current and previous pixel values, level is the index of the β curve, selected interactively by users for different compounding qualities, and A and B are constants defining the shape of the curve.
During persistence compounding, a compounded pixel in an overlapped area is constructed from a set of pixels in successive images. But the tracking error increases as the accumulated displacement becomes large, so only the first n contributions in the set are treated as correct for the compounded image. The threshold n is a constant determined from experiments; it limits the number of pixels used to construct a compounded pixel in an overlapped area. In our experiments, n is set to 30.
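A per-pixel sketch of the persistence recursion follows; the β curve below is only an assumed illustrative shape, since the system's actual level-dependent curves are defined by the constants A and B above:

```python
import numpy as np

def beta_curve(d, level=0, A=0.9, B=32.0):
    """Illustrative beta curve: beta grows with the pixel difference d so
    fast-changing content is tracked quickly. A, B and the level handling
    here are assumptions, not the system's tuned curves."""
    return np.clip(A * (1.0 - np.exp(-d / (B * (level + 1)))), 0.05, 1.0)

def persist(prev_persist, new_frame, level=0):
    """One persistence step per pixel:
    I_p(k+1) = beta * I(k+1) + (1 - beta) * I_p(k)."""
    d = np.abs(new_frame.astype(np.float64) - prev_persist.astype(np.float64))
    beta = beta_curve(d, level)
    return beta * new_frame + (1.0 - beta) * prev_persist
```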
The basic methods for WFOV are limited to a rigid transformation of the image blocks and hence unable to account for more general tissue motion such as shear or non-linear deformations. In the improved methods, we introduce an adaptive mesh method over images pre-processed by a fast smoothing operation to optimize motion estimation. Furthermore, we incorporate a more flexible motion model by solving for the least-squares optimal linear transformation between two weighted point sets, where inaccurate block displacements are given less weight. Finally, we use the persistence method with the polygon filling algorithm to improve accuracy.
Pre-smoothing
To improve the accuracy of registration, we should reduce the influence of speckle. Here, we use a fast and flexible preprocessing method: semi-implicit Additive Operator Splitting (AOS) based Gaussian smoothing, since Gaussian smoothing is a kind of isotropic diffusion,

∂u/∂t = div(∇u).

Discretized along one direction, the diffusion operator is the tridiagonal matrix A(u^k) = [a_ij(u^k)] with

a_ij = 1 for j ∈ N(i), a_ii = −|N(i)|, and a_ij = 0 otherwise,

where N(i) is the set of the two neighbors of pixel i (boundary pixels have only one neighbor). The 2D AOS scheme Gaussian smoothing equation is

u^(k+1) = (1/2) [ (I − 2τ A_x(u^k))⁻¹ + (I − 2τ A_y(u^k))⁻¹ ] u^k

where u^k and u^(k+1) are the image gray value vectors of the current and next time steps respectively, obtained by stacking the columns of the K by L image into one long column vector of length KL, and τ is the time step size (TSS), which controls the extent of blurring.
The above linear system is tridiagonal and diagonally dominant, so we can use the O(KL) Thomas algorithm to solve it, where KL is the length of u^k. Compared with spatial-domain filters such as Gaussian and box filters, whose time consumption grows with the extent of blurring, the first advantage of AOS-based Gaussian filtering is that its time consumption is constant over a large blurring range. Although the AOS scheme is unconditionally stable numerically, a TSS that is too large introduces artifacts in the blurred image; that is why we say the cost is constant over a large range, and this range is large enough for our preprocessing. Another advantage of AOS-based Gaussian filtering is that the extent of blurring is very flexible to control, and we do not need to consider the boundary condition problems of spatial-domain filters.
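A minimal, runnable sketch of AOS-based smoothing, assuming the linear-diffusion matrix entries given above and using the Thomas algorithm for the tridiagonal solves:

```python
import numpy as np

def thomas(a, b, c, d):
    """Solve a tridiagonal system: a = sub-, b = main-, c = super-diagonal,
    d = right-hand sides stacked as columns; O(n) forward/backward sweep."""
    n = b.size
    cp, dp = np.empty(n), np.empty((n,) + d.shape[1:])
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = np.empty_like(dp)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def diffuse_1d(u, tau):
    """Semi-implicit linear diffusion along axis 0: solve (I - 2*tau*A) v = u
    with the Neumann boundaries encoded in the diagonal degrees."""
    n = u.shape[0]
    deg = np.full(n, 2.0); deg[0] = deg[-1] = 1.0   # boundary pixels: 1 neighbor
    b = 1.0 + 2.0 * tau * deg                       # main diagonal
    a = np.full(n, -2.0 * tau); a[0] = 0.0          # sub-diagonal
    c = np.full(n, -2.0 * tau); c[-1] = 0.0         # super-diagonal
    return thomas(a, b, c, u)

def aos_gaussian(img, tau, steps=1):
    """AOS Gaussian-like smoothing: average the 1D solves along rows and
    columns; the cost per step is O(KL) for any tau."""
    u = img.astype(np.float64)
    for _ in range(steps):
        u = 0.5 * (diffuse_1d(u, tau) + diffuse_1d(u.T, tau).T)
    return u
```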
Adaptive Mesh and Block Motion Estimation
In order to have more accurate matching information, we develop a faster structure-matrix-based method to reduce time consumption. The structure matrix can be expressed as follows,

J = [ Σ_W I_x², Σ_W I_x I_y ; Σ_W I_x I_y, Σ_W I_y² ]

where I(i, j) is the intensity of pixel (i, j), I_x and I_y are the x and y partial derivatives respectively, and the sums run over a local window W. Its entries are calculated from the local image information around the sampling point, and it can be diagonalized as

J = (v₁ v₂) diag(λ₁, λ₂) (v₁ v₂)ᵀ.

However, directly discretizing the above equation is too susceptible to noise; here, we use a modified structure tensor, in which each tensor entry is smoothed with a Gaussian kernel K_ρ,

J_ρ = K_ρ ∗ (∇I ∇Iᵀ),

to detect local coherence.
Here, the eigenvectors correspond to the directions of maximum and minimum variation, and the eigenvalues to the strengths of these variations.
Using the above directions, we can use an elastic spring model to determine the new nodal positions for registration. The criterion of the model is the potential energy ε_m at node m, which is a combination of the internal spring restoration energy ε_s and the feature energy ε_f,

ε_m = ε_s + ε_f.

ε_s at each node is the total sum of the squared length changes of the springs connected to the node, scaled by the spring constant k, together with ||D||², the squared magnitude of the nodal displacement vector; d_i = l_i − L is the length change between the rest length L and the stretched length l_i. Nodal positions are determined where ε_pre ≥ ε_cur and ε_cur ≤ ε_post, where ε_cur is the potential energy at the current position and ε_pre and ε_post are the energies at the preceding and following candidate positions along the search direction.
This adaptive mesh method adapts to the actual ultrasound images, and its advantage is low computational cost: the search is 1D, compared with the 2D gradient descent method, and, more importantly, it does not need to compute an additional feature energy map for the whole image but only the 2 × 2 structure matrix at certain points. As shown in Fig. 1, we obtain the adapted mesh, compared with the initial mesh, on images pre-smoothed by the AOS method. After the adaptive mesh is positioned on feature areas, motion estimation using blocks centered at the individual mesh nodes generates a field of more accurate motion vectors.
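A small Python sketch of the 2 × 2 structure matrix and the coherence measure at a single node; the window size and central-difference derivatives are illustrative choices, not the exact discretization of our system:

```python
import numpy as np

def structure_matrix(img, i, j, half=2):
    """2x2 structure matrix accumulated over a (2*half+1)^2 window around
    pixel (i, j), with central-difference derivatives; (i, j) must lie at
    least half+1 pixels from the image border."""
    u = img.astype(np.float64)
    win = u[i - half - 1:i + half + 2, j - half - 1:j + half + 2]
    Ix = 0.5 * (win[1:-1, 2:] - win[1:-1, :-2])       # d/dx inside the window
    Iy = 0.5 * (win[2:, 1:-1] - win[:-2, 1:-1])       # d/dy inside the window
    J = np.array([[(Ix * Ix).sum(), (Ix * Iy).sum()],
                  [(Ix * Iy).sum(), (Iy * Iy).sum()]])
    lam, vec = np.linalg.eigh(J)                      # lam[0] <= lam[1]
    coherence = lam[1] - lam[0]                       # large for oriented structure
    return J, lam, vec, coherence
```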
Linear Transformation for Recovery of Overall Motion
In the previous method for overall motion estimation, the rigid transformation cannot describe the actual situation well, because some deformation of tissue and speckle happens between adjacent images. In contrast to finding optimal rigid transformations, many methods in computer graphics simulate deformable objects, and here we follow that approach by computing optimal linear transformations to give a better description of the deformation. In this approach, the point pattern matching problem with an a priori one-to-one relationship can be stated as follows: given two sets of points x_i and y_i, find the rotation matrix R and the translation vectors t and t₀ which minimize
Σ_i w_i (R(x_i − t₀) + t − y_i)²
where the wi are weights of individual points.
The optimal translation vectors turn out to be the centers of mass of the initial shape and of the actual shape, i.e.

t₀ = x_cm = (Σ_i w_i x_i) / (Σ_i w_i),  t = y_cm = (Σ_i w_i y_i) / (Σ_i w_i).
In order to find the optimal rotation R, this method defines the relative locations q_i = x_i − x_cm and p_i = y_i − y_cm of the points with respect to their centers of mass and relaxes the problem of finding the optimal rotation matrix R to finding the optimal linear transformation A. The term to be minimized is Σ_i w_i (A q_i − p_i)². Setting the derivatives with respect to all coefficients of A to zero yields the optimal transformation

A = (Σ_i w_i p_i q_iᵀ)(Σ_i w_i q_i q_iᵀ)⁻¹ = A_pq A_qq⁻¹
which contains the rotational part R in the term A_pq, since A_qq is symmetric and contains no rotation. In order to obtain an accurate description of the actual deformable motion, we use the linear transformation matrix computed by the above equation to extend the range of motion from rigid to linear. Finally, the goal positions can be computed as

g_i = A (x_i − x_cm) + y_cm.
Defining T = y_cm − A x_cm, which represents the translation vector (t_x, t_y), we get g_i = A x_i + T. Therefore, the linear transformation method is used to recover the motion between successive images in a sequence, i.e. from images 1 to 2, 2 to 3 and so on. Briefly, if the motion from image 1 to 2 is given by the parameters A and T, and that from image 2 to 3 by A′ and T′, the transformation matrix representing the overall motion from image 1 to 3 is given by

M₁₃ = M₂₃ M₁₂, where M = [A, T; 0, 1] in homogeneous coordinates.
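A sketch of the weighted linear estimation and its homogeneous composition, following the equations above; the function and variable names are ours:

```python
import numpy as np

def weighted_linear_transform(x, y, w):
    """Optimal A and T minimizing sum_i w_i * ||A q_i - p_i||^2 for the
    centered points q_i, p_i; x and y are (n, 2) point sets, w the
    per-point weights (inaccurate block displacements get small weights)."""
    w = w / w.sum()
    x_cm = (w[:, None] * x).sum(axis=0)            # optimal t0
    y_cm = (w[:, None] * y).sum(axis=0)            # optimal t
    q, p = x - x_cm, y - y_cm
    Apq = (w[:, None] * p).T @ q                   # sum_i w_i p_i q_i^T
    Aqq = (w[:, None] * q).T @ q                   # sum_i w_i q_i q_i^T
    A = Apq @ np.linalg.inv(Aqq)
    T = y_cm - A @ x_cm
    return A, T

def to_homogeneous(A, T):
    """3x3 matrix M = [A T; 0 1], so that M13 = M23 @ M12."""
    M = np.eye(3)
    M[:2, :2], M[:2, 2] = A, T
    return M
```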
Polygon Filling for Sequence of Images Compounding
The point-to-point compounding method causes float-to-integer position mapping errors, which degrade the quality of the compounded image. In order to compound adjacent images accurately, we use the polygon filling algorithm to eliminate the float-to-integer mapping errors. The main process of this method has the following steps:
1. Find the filling range of the sampled image in the WFOV buffer. This step first maps the four corners of the image to their corresponding positions in the WFOV buffer, which is a float-to-integer mapping; the edges of the polygon are then calculated. Because we use a linear deformation model, the filling range is an arbitrary quadrilateral.
2. Get the integer coordinate P_i of each pixel in the filling range using the polygon filling algorithm. Pixels on the polygon edges are discarded, since their positions in the filling range are not integers.
3. Calculate the inverse matrix M′ of the current linear transformation matrix M. The source coordinate P_i′ of P_i is then P_i′ = M′ P_i, and the pixel value v_i is calculated by bilinear interpolation at the original coordinate P_i′ in the sampled image.
When the polygon filling is finished, we have accurate pixel values at every integer coordinate in the compounding range.
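The following is a direct, if unoptimized, Python sketch of this inverse-mapping fill; it scans the bounding box of the mapped corners instead of a true scanline polygon fill, and it assumes the WFOV buffer is large enough to hold the quadrilateral:

```python
import numpy as np

def fill_quad(wfov, sampled, M):
    """For every integer pixel inside the quadrilateral footprint of
    `sampled` in the WFOV buffer, map back with M^-1 and bilinearly
    interpolate the source image."""
    h, w = sampled.shape
    Minv = np.linalg.inv(M)
    corners = np.array([[0, 0, 1], [w - 1, 0, 1],
                        [w - 1, h - 1, 1], [0, h - 1, 1]], float) @ M.T
    x0, x1 = int(np.ceil(corners[:, 0].min())), int(np.floor(corners[:, 0].max()))
    y0, y1 = int(np.ceil(corners[:, 1].min())), int(np.floor(corners[:, 1].max()))
    for yi in range(y0, y1 + 1):
        for xi in range(x0, x1 + 1):
            xs, ys, _ = Minv @ np.array([xi, yi, 1.0])
            if 0.0 <= xs <= w - 1 and 0.0 <= ys <= h - 1:   # inside the quad
                fx, fy = int(xs), int(ys)
                ax, ay = xs - fx, ys - fy
                cx, cy = min(fx + 1, w - 1), min(fy + 1, h - 1)
                v = ((1 - ax) * (1 - ay) * sampled[fy, fx]
                     + ax * (1 - ay) * sampled[fy, cx]
                     + (1 - ax) * ay * sampled[cy, fx]
                     + ax * ay * sampled[cy, cx])
                # In the full system, the persistence function blends v
                # with the values already compounded at (yi, xi).
                wfov[yi, xi] = v
```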
To verify the proposed improved WFOV algorithms, we tested many sequences of successive images using in vivo data. For data acquisition we used a practical color ultrasound scanner designed in our lab. The standard test sets contain 200-400 images each. The test objects include the femur and the neck region of the human body.
The result for the basic WFOV method is as follows,
The result for the improved WFOV method is as follows. From Fig. 2 and Fig. 3, we can clearly see the muscle texture and both sides of the thyroid, besides the carotids and jugulars, in the denoised neck image.
Utilization of the WFOV function in a practical color ultrasound system is as follows,
Color WFOV (Blood WFOV) applications for great vessels are as follows,
Mutual Information Based Method for the Registration Improvement of WFOV
Mutual information (MI) based registration has been widely used in medical image registration in recent years because of its high accuracy and good characteristics. Different from the conventional use, Zijuan Xu and I use it as the matching criterion on blocks instead of the whole image to generate motion vectors. According to information theory, when two images are well matched, one can perfectly predict the gray value at a position in one image from the gray value at the corresponding position in the other. The MI method treats the gray values of the two images as two random variables: the better the two images match, the larger the MI; conversely, a poor match gives a small MI.
The Shannon entropy is defined as

H = −Σ_i p_i log p_i

where p_i is the probability of an event and the sum of the p_i is equal to one.
The Shannon entropy for a joint distribution is defined as

H(A, B) = −Σ_{i,j} p(i, j) log p(i, j)

where p(i, j) is the joint probability distribution and the sum of the p(i, j) is equal to one.
Studholme et al. proposed a normalized measure of MI (NMI), which is less sensitive to changes in overlap,

NMI(A, B) = (H(A) + H(B)) / H(A, B)

where H(A) and H(B) are the Shannon entropies of the variables A and B respectively, and H(A, B) is the Shannon entropy of their joint probability distribution.
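A small sketch of NMI between two blocks, using a joint histogram as the probability estimate; the bin count is an illustrative choice:

```python
import numpy as np

def nmi(a, b, bins=64):
    """Normalized mutual information of two equal-sized blocks:
    NMI = (H(A) + H(B)) / H(A, B); larger means better matched."""
    hist, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p_ab = hist / hist.sum()                        # joint distribution
    p_a, p_b = p_ab.sum(axis=1), p_ab.sum(axis=0)   # marginals
    def H(p):
        p = p[p > 0]
        return -(p * np.log2(p)).sum()
    return (H(p_a) + H(p_b)) / H(p_ab)
```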
MI has some advantages: it needs no prior assumptions or pre-processing, and it has satisfactory accuracy and robustness. However, because of the real-time processing requirement of WFOV and the high computational cost of MI in image processing, we have not yet integrated it into the practical ultrasound system.
Interest Point Detection in Ultrasound Image for WFOV Pre-determination of Sampling Points
Interest point detection refers to a family of computer vision techniques that detect distinctive points for subsequent processing. Zhengjuan Fan and I are trying to utilize these techniques to pre-determine sampling points and thereby further improve the quality of WFOV. The following describes the Harris corner detector for ultrasound images.
The Harris corner detector has several advantages: invariance to rotation, illumination variation, and image noise. More importantly, it requires less computation than other feature detection methods such as SIFT, so it comes closer to the real-time processing requirement of WFOV.
The Harris corner detector is based on the local auto-correlation function of a signal,

E(x, y) = Σ_W [I(x_i + Δx, y_i + Δy) − I(x_i, y_i)]²

where I denotes the image function and the (x_i, y_i) are the points in the window W centered on (x, y). W is a smooth circular window, for example a Gaussian. The local auto-correlation function measures the local changes of the signal when patches are shifted by a small amount in different directions.
For a small shift (Δx, Δy), the shifted image is approximated by a Taylor expansion truncated to the first-order terms, which gives

E(x, y) ≈ [Δx, Δy] C(x, y) [Δx, Δy]ᵀ, with C(x, y) = Σ_W [ I_x², I_x I_y ; I_x I_y, I_y² ]

where I_x and I_y denote the partial derivatives in x and y, respectively, and the matrix C(x, y) captures the intensity structure of the local neighborhood. Let λ₁, λ₂ be the eigenvalues of C(x, y); they form a rotationally invariant description. There are three cases to be considered:
If both λ₁ and λ₂ are small, so that the local auto-correlation function is flat (i.e., E changes little in any direction), the windowed image region is of approximately constant intensity.
If one eigenvalue is high and the other low, the local auto-correlation function is ridge shaped: local shifts along the ridge cause little change in E, while shifts in the orthogonal direction cause a significant change; this indicates an edge.
If both eigenvalues are high, the local auto-correlation function is sharply peaked, and shifts in any direction result in a significant increase; this indicates a corner.
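A sketch of the detector using the standard Harris response det(C) − k·trace(C)², which avoids computing the eigenvalues explicitly; the box window and k = 0.04 are common illustrative choices rather than values from our system:

```python
import numpy as np

def harris_response(img, window=7, k=0.04):
    """Harris response R = det(C) - k * trace(C)^2 over the whole image;
    large positive R marks corners, negative R edges, small |R| flat areas.
    Box averaging stands in for the smooth window; edges wrap (np.roll)."""
    u = img.astype(np.float64)
    Ix = 0.5 * (np.roll(u, -1, axis=1) - np.roll(u, 1, axis=1))
    Iy = 0.5 * (np.roll(u, -1, axis=0) - np.roll(u, 1, axis=0))
    def box(a):
        kern = np.ones(window) / window
        a = np.apply_along_axis(np.convolve, 0, a, kern, mode='same')
        return np.apply_along_axis(np.convolve, 1, a, kern, mode='same')
    Ixx, Iyy, Ixy = box(Ix * Ix), box(Iy * Iy), box(Ix * Iy)
    det = Ixx * Iyy - Ixy ** 2          # lambda1 * lambda2
    tr = Ixx + Iyy                      # lambda1 + lambda2
    return det - k * tr ** 2
```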
The following is the test on the great vessels ultrasound B-mode images used in the first Color WFOV application above,
From the results, we could further improve the detected points by selecting those that lie closer to the tissue structures in the ultrasound images.
Optical Flow Methods for WFOV Motion Estimation
For sequences of ordered images, optical flow (OF) methods can estimate the motion as either instantaneous image velocities or discrete image displacements. Zhengjuan Fan and I are trying to utilize optical flow methods to get the motion vectors between two consecutive images in a WFOV data stream. We have tested the two traditional OF methods, the Lucas and Horn methods.
The Lucas method presents an image registration technique that makes use of the spatial intensity gradient of the images to find a good match using a type of Newton-Raphson iteration. In the two-dimensional registration problem, with G(x) ≈ F(x + h), the disparity vector h can be written as

h ≈ [ Σ_x (∂F/∂x)(∂F/∂x)ᵀ ]⁻¹ Σ_x (∂F/∂x) [G(x) − F(x)]

where F(x) and G(x) are the images and x is the position vector in two dimensions.
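A one-step sketch of this estimation for a whole block, solving the 2 × 2 normal equations built from the gradients of F; this is a single Newton-Raphson-style step, not the full iteration:

```python
import numpy as np

def lucas_block_disparity(F, G):
    """Solve the 2x2 normal equations for the disparity h such that
    G(x) ~ F(x + h). F and G are same-sized float arrays; gradients
    wrap at the edges (np.roll), a simplification for the sketch."""
    Fx = 0.5 * (np.roll(F, -1, axis=1) - np.roll(F, 1, axis=1))
    Fy = 0.5 * (np.roll(F, -1, axis=0) - np.roll(F, 1, axis=0))
    Ft = G - F
    A = np.array([[(Fx * Fx).sum(), (Fx * Fy).sum()],
                  [(Fx * Fy).sum(), (Fy * Fy).sum()]])
    b = np.array([(Fx * Ft).sum(), (Fy * Ft).sum()])
    return np.linalg.solve(A, b)        # h = (h_x, h_y)
```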
Two adjacent B-mode images used in the above experiment have been tested; the original images are shown as follows,
We estimate the optical flow field using the Lucas method and post-process it with an arrow visualization of the flow field; the result is as follows,
The Horn method is based on an optical flow constraint equation and a smoothness constraint to estimate the optical flow field.
Optical Constraint Equation
Let the image brightness at point (x, y) in the image plane at time t be denoted by E(x, y, t). Assuming the brightness of a particular point in the pattern is constant, we get

dE/dt = E_x (dx/dt) + E_y (dy/dt) + E_t = 0.

Letting u = dx/dt and v = dy/dt, we get the optical flow constraint equation

E_x u + E_y v + E_t = 0.
Smoothness Constraint
Assume that opaque objects of finite size undergo rigid motion or deformation; then neighboring points on the objects have similar velocities, and the velocity field of the brightness patterns in the image varies smoothly almost everywhere. This smoothness constraint can be written as the minimization of

ε_c² = (∂u/∂x)² + (∂u/∂y)² + (∂v/∂x)² + (∂v/∂y)².
Estimate Optical Field
The optical flow problem is then to minimize the total error

ε² = ∫∫ (α² ε_c² + ε_b²) dx dy, where ε_b = E_x u + E_y v + E_t.

A suitable weighting factor α² is chosen to suppress the impact of noise on the measurement, and the minimization is accomplished by finding suitable values of the optical flow velocity (u, v). Using the calculus of variations, an approximation to the Laplacian, and the Gauss-Seidel method, we obtain the iterative update

u^(n+1) = ū^n − E_x (E_x ū^n + E_y v̄^n + E_t) / (α² + E_x² + E_y²)
v^(n+1) = v̄^n − E_y (E_x ū^n + E_y v̄^n + E_t) / (α² + E_x² + E_y²)

where ū and v̄ denote local averages of u and v.
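A runnable sketch of this iteration; the wrap-around boundary handling via np.roll and the parameter values are simplifications:

```python
import numpy as np

def horn_schunck(E1, E2, alpha=10.0, iters=100):
    """Horn-style flow estimation: iterate the Gauss-Seidel update of
    (u, v) from the local averages (u_bar, v_bar) and the constraint
    residual E_x*u_bar + E_y*v_bar + E_t."""
    E1, E2 = E1.astype(np.float64), E2.astype(np.float64)
    Ex = 0.5 * (np.roll(E1, -1, axis=1) - np.roll(E1, 1, axis=1))
    Ey = 0.5 * (np.roll(E1, -1, axis=0) - np.roll(E1, 1, axis=0))
    Et = E2 - E1
    u, v = np.zeros_like(E1), np.zeros_like(E1)
    def avg(w):          # 4-neighbor mean approximating the Laplacian term
        return 0.25 * (np.roll(w, 1, 0) + np.roll(w, -1, 0)
                       + np.roll(w, 1, 1) + np.roll(w, -1, 1))
    for _ in range(iters):
        ub, vb = avg(u), avg(v)
        r = (Ex * ub + Ey * vb + Et) / (alpha ** 2 + Ex ** 2 + Ey ** 2)
        u, v = ub - Ex * r, vb - Ey * r
    return u, v
```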
The same two adjacent B-mode images used with the Lucas method have been tested; the result is as follows,
From these results, optical flow methods can estimate the motion vectors between two images. However, they are noise-sensitive and computationally expensive. Therefore, we are still studying other methods, such as the structure-tensor-based OF method, to get accurate motion vectors for WFOV.
My co-workers and I are still using our spare time outside work and study to find and develop new ways to perfect the WFOV function. I think some ultrasound features and signal parameters could be incorporated into this research, and we are trying our best to publish a journal paper on this topic in IEEE Transactions on UFFC or IEEE Transactions on Medical Imaging.
We have presented a new method for extended-field-of-view imaging. Image registration techniques are used for motion estimation, and the persistence-based compounding method and post-processing strategies are applied to construct the final image and increase image quality.
The Additive Operator Splitting based Gaussian smoothing method is used for pre-processing, and the adaptive mesh method improves the improved-MSAD (IMSAD) search, so block-based motion estimation is improved.
The weighted linear transformation lets the recovery of overall motion adapt to both rigid and deformable conditions, and applying the polygon filling algorithm improves the quality of the compounded WFOV image.
These methods for WFOV have already been utilized in a practical color ultrasound system, and we are still working hard to further improve this function.