Introduction

Magnetic resonance imaging (MRI) is widely used in radiology as a non-invasive method to obtain cross-sectional images of the anatomy and physiological processes of internal organs. MRI can be used to identify lesions caused by multiple sclerosis (MS), a chronic disease of the central nervous system (brain, spinal cord, and optic nerves). MS lesions in the brain are areas of active inflammation, demyelination, or permanent tissue damage and occur primarily in the white matter. MRI is well suited for assessing lesion burden (volume and patterns) because lesions appear as hyperintensities on the images.

According to Despotović, Goossens, and Philips (2015), image segmentation is the process of measuring and visualizing the anatomical structure of the brain, analyzing structural changes, and delineating pathological regions for surgical planning and image-guided interventions.

Analyzing MRI images by visual inspection is time-consuming and prone to both inter- and intra-operator error. To alleviate these concerns, computerized methods are employed to improve the segmentation, registration, and visualization of MRI images (Despotović, Goossens, & Philips, 2015).

Our goal is to provide functions that aid in the storage of MRI image data, the visualization of 2D and 3D images, the identification of MS lesions and their distinction from healthy tissue, and the classification of brain tissues, including gray matter, white matter, cerebro-spinal fluid, and MS lesions.

Data set description

The data set we used is the "3D MR image database of Multiple Sclerosis patients with white matter lesion segmentations", available on the website of the Laboratory of Imaging Technologies (http://lit.fe.uni-lj.si/tools.php?lang=eng).

This data set includes images of five patients, all in NIfTI format.

Every patient has six MR images: brainmask, consensus, FLAIR, T1W, T1WKS, and T2W. We chose the brainmask, consensus, and T2W images: the brainmask is used to filter the T1 or T2 images; the consensus image is the one in which several doctors manually identified the lesion areas (an extremely time-consuming process); and the T2W image is useful for detecting edema and inflammation, making it well suited to revealing white matter lesions. We use the consensus image to compare our algorithm's lesion identification against what the doctors identified manually.

Each MR image of a given patient has the same dimensions; for example, patient 01's MR images have dimensions 192 \(\times\) 512 \(\times\) 512.

Methodology

In this project, we focus on visualization and optimization methods to investigate brain MRI images.

The important functions and user packages we used are described below.

Singular Value Decomposition

Singular value decomposition (SVD) factors any real or complex matrix \(A\) as \(A = U \Sigma V^{T}\), where \(U\) and \(V\) are orthonormal matrices whose columns are eigenvectors of \(A A^{T}\) and \(A^{T} A\), respectively, and \(\Sigma\) is a diagonal matrix whose elements, the singular values, are the square roots of the non-zero eigenvalues of \(A^{T} A\).

If a matrix is ill-conditioned, a rounded-off version of the matrix can be constructed. Ill-conditioning occurs when the ratio of the largest to the smallest diagonal element is substantially larger than one.

Similar to principal component analysis, the majority of the variation in the data lies along the components associated with the large diagonal elements. MRI scans can produce large data files, and SVD provides an opportunity to compress the data by discarding the information related to the smaller diagonal elements.
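As a minimal toy illustration of this idea (the matrix below is random, not project data), a rank-k approximation can be built from the k largest singular values:

set.seed(1)
A <- matrix(rnorm(5 * 4), nrow = 5)    # small 5 x 4 toy matrix
s <- svd(A)
k <- 2                                 # keep the 2 largest singular values
A.k <- s$u[, 1:k] %*% diag(s$d[1:k]) %*% t(s$v[, 1:k])
max(abs(A - A.k))                      # error from the discarded components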


SVDcompression Function

Using R's svd function, the T2W MRI image data of one patient is decomposed. The data are imported using the oro.nifti package (Whitcher et al., 2017) as a nifti-class object with dimensions 192 x 512 x 512. Each of the 192 slices is decomposed by SVD. Which diagonal elements to retain is determined by the absolute differences between consecutive diagonal elements: the default setting, obtained through trial and error, retains as many leading diagonal elements (and their corresponding eigenvectors) as there are absolute differences above 100.

The SVDcompression function takes two inputs: a 3D matrix and a cutoff.diff value (default = 100). The output is a list of four objects: the left-hand eigenvectors, the right-hand eigenvectors, the diagonal elements, and a vector of indices recording which eigenvectors and diagonal elements belong to which slice.

The goal of this procedure is to discard superfluous data. If successful, the list will be considerably smaller in size than the original data. The next section will introduce a function that can reconstitute the data from the list into a rounded version of the original 3D MRI matrix. Then, 2D images of the original data and SVD data will be compared.

SVDcompression <- function(d, cutoff.diff = 100) {

    if(cutoff.diff <= 0) {stop("cutoff.diff must be positive")}
    if(!is.numeric(d)) {
        stop("matrix must be numeric")
    } else if(!length(dim(d)) == 3) {
        stop("matrix must be 3D")
    }

    a <- dim(d)[1]                # number of slices
    b <- dim(d)[2]
    c <- dim(d)[3]
    keep <- rep(NA, a)            # components retained per slice
    h1 <- matrix(NA, nrow = b)    # left eigenvectors (placeholder column dropped below)
    s1 <- NULL                    # diagonal elements
    a1 <- matrix(NA, nrow = c)    # right eigenvectors

    for(i in 1:a) {

        slice <- d[i, , ]
        A <- svd(slice)
        # retain as many leading components as there are consecutive
        # differences above cutoff.diff; retain all if none qualify
        k <- abs(diff(A$d))
        kn <- length(k[k > cutoff.diff])
        if(kn != 0) keep[i] <- kn else keep[i] <- length(A$d)

        for(j in 1:keep[i]) {
            h1 <- cbind(h1, A$u[, j])
            s1 <- c(s1, A$d[j])
            a1 <- cbind(a1, A$v[, j])
        }
    }
    # drop the NA placeholder columns
    h1 <- h1[, 2:ncol(h1)]
    a1 <- a1[, 2:ncol(a1)]

    list(h1, s1, a1, ind = cumsum(keep))
}
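As noted in footnote 1, the compression is run offline and its output cached; a sketch of that step (assuming d has been read in with readNIfTI as shown in the next section):

# compress once and cache the result for later sessions
mylist <- SVDcompression(d, cutoff.diff = 100)
save(mylist, file = "patient01_T2W.RData")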


SVD Comparison

Using data from Lesjak et al. (2017), the original data is compared to the SVD data.¹ A compression of 86.54% is achieved; this represents the decrease in the number of elements stored in the matrices and vectors. Comparing object sizes: the original 3D matrix is 192 Mb while the list of SVD objects is 51.7 Mb.

library(oro.nifti)
d <- readNIfTI("patient01_T2W.nii.gz", reorient = FALSE)
load("patient01_T2W.RData")

# data comparison: number of stored elements
n.d <- length(d)
n.mylist <- length(mylist[[1]]) + length(mylist[[2]]) + 
    length(mylist[[3]]) + length(mylist[[4]])
comp <- (n.d - n.mylist) * 100 / n.d
cat("\nCompression Percentage = ", round(comp, 2), "%\n", sep = "")
## 
## Compression Percentage = 86.54%
# object size comparison
format(object.size(mylist), units = "MB")
## [1] "51.7 Mb"
format(object.size(d), units = "MB")
## [1] "192 Mb"


SVDunwrap Function

The list produced by the SVDcompression function has no useful application as separate objects other than its reduced size for storage and transfer. To use the SVD data for MRI image production, the objects must be reconstituted into a 3D matrix. The reconstituted SVD matrix is a rounded version of the original data with the same dimensions. The SVDunwrap function takes the list as input and outputs a 3D matrix.

SVDunwrap <- function(mylist){
    
    keep2 <- mylist[[4]]          # cumulative component counts per slice
    dim.a <- length(mylist[[4]])
    dim.b <- nrow(mylist[[1]])
    dim.c <- nrow(mylist[[3]])
    d <- array(NA, dim = c(dim.a, dim.b, dim.c))
    keep2 <- c(0, keep2)
    
    for(i in 1:dim.a){
        rb <- matrix(0, nrow = dim.b, ncol = dim.c)
        beg <- keep2[i] + 1       # columns belonging to slice i
        end <- keep2[i + 1]
        for(j in beg:end) {
            h2 <- mylist[[1]][, j]
            s2 <- mylist[[2]][j]
            a2 <- mylist[[3]][, j]
            
            # add the rank-1 contribution u * d * t(v)
            r2 <- as.matrix(h2) %*% as.matrix(s2) %*% 
                as.matrix(t(a2))
            rb <- rb + r2    
        }
        d[i, , ] <- rb
    }
    return(d)
}
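A quick sanity check (a sketch, using the cached mylist from above) confirms that the reconstruction recovers the original dimensions:

d.svd <- SVDunwrap(mylist)
dim(d.svd)   # should be 192 512 512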


2D Image Comparison

To check the quality of the compression, images of all three mid-slices for both the original and the SVD data are displayed.

All three pairs of plots are very similar, indicating that the SVD compression works well with a cutoff of 100. More or less detail can be obtained by decreasing or increasing the cutoff.diff input, with the trade-off being the amount of compression.²
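A sketch of how one such pair of mid-slice images can be displayed (using d and the reconstructed d.svd from above; 96 is the mid-slice index along the first dimension):

# original vs. SVD-reconstructed mid-slice along the x-axis
par(mfrow = c(1, 2))
image(d[96, , ], col = gray(0:64/64), axes = FALSE, main = "Original")
image(d.svd[96, , ], col = gray(0:64/64), axes = FALSE, main = "SVD")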

3D Plots


condense3d Function

The misc3d package (Tierney & Feng, 2013) includes the contour3d function, which renders 3D contours or isosurfaces using the marching cubes algorithm. The function accepts a 3D matrix of image data.

The original 3D matrix of dimension 192 x 512 x 512 is too large to work well with the contour3d function, so its dimensions must be condensed.³ The condense3d function takes a 3D matrix and a divisor as inputs and outputs a smaller 3D matrix whose dimensions have been scaled down by the divisor. All three sides are scaled by the same divisor, and a value must be chosen that divides all three dimensions evenly.

The default divisor of 8 creates a 24 x 64 x 64 matrix. The condense3d function iterates through the original 3D matrix, selecting 8 x 8 x 8 blocks and averaging the elements of each block. This average becomes the corresponding element of the smaller matrix.

library(misc3d)
condense3d <- function(d, div = 8){

    a <- dim(d)[1] / div
    b <- dim(d)[2] / div
    c <- dim(d)[3] / div
    
    if(!is.numeric(d)) {
        stop("matrix must be numeric")
    } else if(!length(dim(d)) == 3) {
        stop("matrix must be 3D")
    } else if((a %% 1 != 0) | (b %% 1 != 0) | (c %% 1 != 0)) {
        stop("divisor results in non-integer dimensional length")
    } 
    else {
        newdf <- array(NA, dim = c(a, b, c))
        
        # average each div x div x div block of the original matrix
        for(k in 1:a){
            begk <- 1 + div * (k - 1)
            endk <- div + div * (k - 1)
            
            for(j in 1:c){
                begj <- 1 + div * (j - 1)
                endj <- div + div * (j - 1)
                
                for(i in 1:b){
                    begi <- 1 + div * (i - 1)
                    endi <- div + div * (i - 1)
                    newdf[k, i, j] <- mean(d[begk:endk, begi:endi, 
                                             begj:endj], na.rm = TRUE)
                }
            }
        }    
    }
    newdf
}


contour3d Function

The smaller 3D matrix is more manageable for the current level of computing resources. Two divisors were tried, 4 and 8, producing 48 x 128 x 128 and 24 x 64 x 64 matrices. As the divisor increases, additional detail is lost. Below, the 24 x 64 x 64 version is displayed.

# 24 x 64 x 64
newd <- condense3d(d)
contour3d(newd, 1.5 * (1:24), 1:64, 1:64, lev = c(50, 600, 850),
          alpha = c(0.2, 0.4, 1), color = c("white", "red", "green"))
rgl::rglwidget()
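For reference, the more detailed 48 x 128 x 128 version can be produced the same way (a sketch; only the divisor and the coordinate ranges change):

# 48 x 128 x 128 (more detail, heavier to render)
newd4 <- condense3d(d, div = 4)
contour3d(newd4, 1.5 * (1:48), 1:128, 1:128, lev = c(50, 600, 850),
          alpha = c(0.2, 0.4, 1), color = c("white", "red", "green"))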

Identifying MS lesions

Since different tissues have different intensities on the MR image, we fit a normal mixture model to the voxel intensities to estimate the parameters of the component normal distributions.
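In symbols, each voxel intensity \(x\) is modeled as a draw from a mixture of \(M\) normal components (here \(M = 4\)):

\[
f(x) = \sum_{j=1}^{M} p_j \, \phi(x;\, \mu_j, \sigma_j^2),
\qquad \sum_{j=1}^{M} p_j = 1,
\]

where \(\phi(\cdot;\, \mu_j, \sigma_j^2)\) is the normal density of class \(j\) and \(p_j\) its proportion.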

First, we read in the patient 01 MR images and display the brainmask, consensus, and T2W slice images, shown below. From the consensus image, we can see that this patient has many MS lesion spots, including two large spots on the right side. Later, we use this image to compare against our results.

# read in MRI data using the oro.nifti user package
library(oro.nifti)
img1 <- readNIfTI("patient01_brainmask.nii.gz", reorient = FALSE)
img2 <- readNIfTI("patient01_consensus_gt.nii.gz", reorient = FALSE)
img6 <- readNIfTI("patient01_T2W.nii.gz", reorient = FALSE)

image(img1, z = 225, plot.type = "single", main = "Brainmask slice image along x-axis")

image(img2, z = 225, plot.type = "single", main = "Consensus slice image along x-axis")

image(img6, z = 225, plot.type = "single", main = "T2W slice image along x-axis")

Second, we use the brainmask image to filter the T2W image. Then we apply the density function to get a general idea of the distribution of intensities. The density plot of the voxel intensities shows three peaks: one around 0, another around 400, and a third around 1150. There also seems to be a fourth peak around 1250, very close to the third. Since we know that MS lesions occur primarily in the white matter and have higher intensity, the fourth peak may indicate the intensity of the MS lesions. This result encouraged us to investigate further.

img6 <- img6[img1 == 1]   # keep only voxels inside the brain mask
dens6 <- density(img6)
plot(dens6, main = "Density plot of intensities for voxels")
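To make the peaks easier to see, their approximate locations (read off the plot above, so the values are approximate) can be marked on the density plot:

abline(v = c(0, 400, 1150, 1250), lty = 2, col = "gray")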

Third, we used a normal mixture model to obtain the specific density parameters for the MS lesions, gray matter, white matter, and cerebro-spinal fluid, and classified the MS lesions and the healthy brain tissues. In the classified image below, black is the background and the remaining colors mark the four classes in order of increasing mean intensity: blue, grey, yellow, and red, with red marking the highest-intensity class, the MS lesions.

Compared with the consensus image, our result shows a similar spot on the left side and fewer spots on the right side. Therefore, our result can serve as a useful reference for radiologists in their further work.

EMmix <- function(x, theta) {
    mu <- theta$mu
    sigma <- theta$sigma
    p <- theta$prop
    M <- length(mu)
    ## E step
    Ez <- outer(x, 1:M, function(x, i)
        p[i] * dnorm(x, mu[i], sigma[i]))
    Ez <- sweep(Ez, 1, rowSums(Ez), "/")
    colSums.Ez <- colSums(Ez)
    ## M step
    xp <- sweep(Ez, 1, x, "*")
    mu.new <- colSums(xp) / colSums.Ez
    sqRes <- outer(x, mu.new, function(x, m) (x - m) ^ 2)
    sigma.new <- sqrt(colSums(Ez * sqRes) / colSums.Ez)
    p.new <- colSums.Ez / sum(colSums.Ez)
    ## pack up result
    list(mu = mu.new, sigma = sigma.new, prop = p.new)
}
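The function implements the standard EM updates for a normal mixture. Writing \(\hat{z}_{ij}\) for the responsibility of class \(j\) for voxel \(i\) (the E step), the updates computed above are

\[
\hat{z}_{ij} = \frac{p_j\, \phi(x_i;\, \mu_j, \sigma_j^2)}{\sum_{k=1}^{M} p_k\, \phi(x_i;\, \mu_k, \sigma_k^2)},
\qquad
\mu_j^{\text{new}} = \frac{\sum_i \hat{z}_{ij}\, x_i}{\sum_i \hat{z}_{ij}},
\qquad
\sigma_j^{\text{new}} = \sqrt{\frac{\sum_i \hat{z}_{ij}\,(x_i - \mu_j^{\text{new}})^2}{\sum_i \hat{z}_{ij}}},
\qquad
p_j^{\text{new}} = \frac{\sum_i \hat{z}_{ij}}{n},
\]

where \(n\) is the number of voxels.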
library(mritc)
## Loading required package: lattice
# initOtsu: get initial values through multilevel thresholding,
# using a fast algorithm for Otsu's method
# initOtsu(y, m); m = number of classes (tissue types for MRI) minus 1
#theta <- initOtsu(img6, 3)
#for (i in 1:100) {
#    theta <- EMmix(img6, theta)
#}
# converged parameter values, cached from the offline EM run
theta <- list(mu = c(319.9295, 417.0160, 801.5254, 1156.0664),
              sigma = c(53.25315, 151.05442, 192.84515, 68.98212),
              prop = c(0.36259888, 0.42636645, 0.16973653, 0.04129814))
data.frame(mu = theta$mu, sigma = theta$sigma, p = theta$prop)
##          mu     sigma          p
## 1  319.9295  53.25315 0.36259888
## 2  417.0160 151.05442 0.42636645
## 3  801.5254 192.84515 0.16973653
## 4 1156.0664  68.98212 0.04129814
theta1 <- EMmix(img6, theta)
# relative change after one more EM step (near zero at convergence)
#(theta[[1]] - theta1[[1]])/theta[[1]]
#(theta[[2]] - theta1[[2]])/theta[[2]]
#(theta[[3]] - theta1[[3]])/theta[[3]]

# assign each voxel to the class with the largest posterior weight
class <- function(x, theta) {
    mu <- theta$mu
    sigma <- theta$sigma
    p <- theta$prop
    M <- length(mu)
    Ez <- outer(x, 1:M, function(x, i)
        p[i] * dnorm(x, mu[i], sigma[i]))
    max.col(Ez, "first")
}
t12 <- array(0, dim = c(192, 512, 512))
t12[img1 == 1] <- class(img6, theta)
image(t12[, , 225], axes = FALSE, main = "Middle slice image along z-axis",
      col = c("black", "blue", "grey", "yellow", "red"))

image(img2, z = 225, plot.type = "single")

Improving the efficiency of the identification procedure

Profiling with Rprof shows that the EMmix function accounts for essentially all of the running time, with the outer and dnorm calls dominating.
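A sketch of how this profile can be collected (the profile file tmp1 matches the cached output shown below):

tmp1 <- tempfile()
Rprof(tmp1)                                   # start profiling
for (i in 1:100) theta <- EMmix(img6, theta)  # the expensive loop
Rprof(NULL)                                   # stop profiling
summaryRprof(tmp1)$by.total[1:5, ]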

## [1] "summaryRprof(tmp1)$by.total[1:5, ]"
##       total.time total.pct self.time self.pct
## EMmix     510.26    100.00      0.00     0.00
## outer     394.72     77.36     56.92    11.16
## FUN       337.82     66.21     35.68     6.99
## dnorm     301.94     59.17    301.94    59.17
## sweep      95.16     18.65     17.70     3.47

Since the intensities take on only a limited number of distinct values, and the E and M steps are sums over voxels, the same updates can be computed over the unique intensity values with each weighted by its frequency. We modified the EMmix function accordingly using a frequency table of the intensities. The new function, EMmix_f, is shown below together with the timings for both functions: EMmix_f finishes the same job in well under 1% of the time of the original (0.084 s versus 15.50 s).
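Concretely, if \(f_v\) denotes the frequency of the unique intensity value \(x_v\), weighting the responsibilities by \(f_v\) reproduces the full-data updates exactly, e.g.

\[
\mu_j^{\text{new}} = \frac{\sum_{v} f_v\, \hat{z}_{vj}\, x_v}{\sum_{v} f_v\, \hat{z}_{vj}},
\]

and similarly for \(\sigma_j\) and \(p_j\), with the sums now running over unique values rather than all voxels.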

#theta <- initOtsu(img6, 3)
# starting values, cached from initOtsu
theta <- list(mu = c(288.6646, 456.0476, 723.1701, 1077.8544),
              sigma = c(65.53487, 60.41032, 88.95950, 107.60318),
              prop = c(0.45472031, 0.30603261, 0.14407572, 0.09517136))
freq <- table(img6)          # frequency of each unique intensity value
t1_f <- sort(unique(img6))   # the unique intensity values themselves
EMmix_f <- function(x, theta) {
    mu <- theta$mu
    sigma <- theta$sigma
    p <- theta$prop
    M <- length(mu)
    ## E step (x holds the unique intensities; freq, found in the
    ## enclosing environment, holds their counts)
    Ez <- outer(x, 1:M, function(x, i)
        p[i] * dnorm(x, mu[i], sigma[i]))
    Ez <- sweep(Ez, 1, rowSums(Ez), "/")
    Ez <- sweep(Ez, 1, freq, "*")   # weight posteriors by frequency
    colSums.Ez <- colSums(Ez)
    ## M step
    xp <- sweep(Ez, 1, x, "*")
    mu.new <- colSums(xp) / colSums.Ez
    sqRes <- outer(x, mu.new, function(x, m) (x - m) ^ 2)
    sigma.new <- sqrt(colSums(Ez * sqRes) / colSums.Ez)
    p.new <- colSums.Ez / sum(colSums.Ez)
   
    list(mu = mu.new, sigma = sigma.new, prop = p.new)
}
theta <- list(mu = c(288.6646, 456.0476, 723.1701, 1077.8544),
              sigma = c(65.53487, 60.41032, 88.95950, 107.60318),
              prop = c(0.45472031, 0.30603261, 0.14407572, 0.09517136))
print("The elapsed time of running EMmix_f ")
## [1] "The elapsed time of running EMmix_f "
system.time(for (i in 1:100) {
    theta <- EMmix_f(t1_f, theta)
})[3]
## elapsed 
##   0.084
## [1] "The elapsed time of running EMmix: 15.50"

References

Despotović, I., Goossens, B., & Philips, W. (2015). "MRI Segmentation of the Human Brain: Challenges, Methods, and Applications." Computational and Mathematical Methods in Medicine, vol. 2015, Article ID 450341, 23 pages. https://doi.org/10.1155/2015/450341

Lesjak, Ž., Pernuš, F., Likar, B., & Špiclin, Ž. (2017). "A Novel Public MR Image Dataset of Multiple Sclerosis Patients With Lesion Segmentations Based on Multi-rater Consensus." Neuroinformatics. https://doi.org/10.1007/s12021-017-9348-7

Tierney, L., & Feng, D. (2013). misc3d: Miscellaneous 3D Plots. R package version 0.8-4.

Whitcher, B., Schmid, V., Thornton, A., Tabelow, K., Clayden, J., & Muschelli, J. (2017). oro.nifti: Rigorous - NIfTI + ANALYZE + AFNI : Input / Output. R package version 0.9.1.


  1. For brevity, the SVDcompression and SVDunwrap functions are not run in this HTML document; their outputs are loaded from .RData files. The full code to produce all output is available in separate R scripts.

  2. Note: the images display a truncated surface at the top of the skull. This is inherent in the original data and is not a bug in the code.

  3. SVD compression does not help here. When the 3D matrix is reconstituted, it has the same dimensions as the original data.