Collaborative Priors with SVD for Denoising of cDNA Microarray Images

Background/Objectives: This study presents a patch based filter, guided by external and internal priors, for minimizing Gaussian noise in complementary deoxyribonucleic acid (cDNA) microarray images. Methods/Statistical analysis: The proposed denoising filter takes both external and internal priors into consideration. It employs combined prior guided patch based denoising with several distance based patch-matching methods. Findings: Experimental results demonstrate that the combined prior guided patch based filter outperforms existing well-known filters in minimizing noise in cDNA microarray images. The proposed scheme is found to offer better peak signal-to-noise ratio and structural similarity index than existing filtering techniques. The effectiveness of the proposed denoising method is also assessed by estimating the spot intensities of the cDNA microarray image, which reflect the effect of noise reduction in the image. Application/Improvements: Minimization of noise is a crucial step of cDNA microarray image processing, and it aids microarray analysis by enabling the extraction of valid, good quality gene expression measurements.


Introduction
Microarray imaging technology has driven major progress in genomic research by allowing molecular biologists to monitor the expression levels of thousands of genes at a time. The technology is now a powerful tool employed in research on toxicological problems, drug discovery, new types of diseases and disease diagnosis, where it provides promising results. In addition, cDNA microarray technology has found application in various fields of bioinformatics. Microarrays consist of thousands of discrete DNA sequences printed onto a glass slide by a robotic arrayer, forming a two dimensional array of spots, or probes. Each spot on the slide represents the hybridization level of cDNA from a fluorescent-labelled target sample and cDNA from a fluorescent-labelled reference sample. The slides are scanned to obtain two images, one for the red channel and one for the green channel, that reflect differentially expressed genes. With the aid of various computational techniques, the gene spots are analysed to understand the fundamental biological phenomena. The log-intensity ratio is a widely used statistic [1] for downstream processing of microarray data. The microarray experimentation and image digitization processes have error prone steps that introduce contaminants into the images. Extracting log-intensity ratios from microarray images accurately is a major challenge because of the noise present in the image. Robust algorithms to minimize noise are therefore essential so that the downstream processing of microarray image analysis can be performed effectively.
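To make the log-intensity ratio concrete, the following minimal sketch computes it for a few spots. The base-2 logarithm, the function name `log_intensity_ratio` and the small guard `eps` are assumptions made for illustration; the text above does not specify them.

```python
import math

def log_intensity_ratio(red, green, eps=1e-12):
    """Log ratio of red to green spot intensity.

    Assumption: base-2 logarithm; eps only guards against zero intensities.
    """
    return math.log2((red + eps) / (green + eps))

# Hypothetical (red, green) mean spot intensities.
spots = [(800.0, 200.0), (150.0, 600.0), (500.0, 500.0)]
ratios = [log_intensity_ratio(r, g) for r, g in spots]
```

A positive value indicates a stronger red-channel signal for that spot and a negative value a stronger green-channel signal, which is why noise in either channel directly perturbs this statistic.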
Noise minimization is a crucial step in biomedical image processing, and it aids the downstream processing of biomedical images. A noise minimization technique for biomedical images based on an adaptive multi-resolution methodology [2] was introduced, which substantiates the need for biomedical image denoising among researchers. Minimizing noise in cDNA microarray images has now become a significant research issue. This section extends the discussion of a comprehensive study [3] by briefly reviewing several newer techniques for minimizing noise in cDNA microarray images. The method of Chang [4] proposes a decision based filter for the elimination of impulse noise in cDNA microarray images. It uses a fuzzy logic-based noise detector, tuned with a particle swarm optimization algorithm, to assess whether a pixel vector is corrupted, and then applies the center weighted vector median filter to enhance the quality of the microarray image. An interesting attempt at removing outliers in Illumina BeadChip microarray images [5] was investigated using least weighted squares estimation, and claims that estimated statistical measure is sensitive against outliers in the data.
In recent years, some researchers have used diverse noise minimization techniques for cDNA microarray images in their methodology. The studies [6][7][8][9][10] attest to the importance of a denoising procedure as a pre-processing step. Noise removal techniques such as histogram equalization, logarithmic transformation, arctangent hyperbolic transformation, wavelet denoising, and median filtering integrated with anisotropic diffusion filters were applied as the initial step of microarray image processing.
In recent research [11,12], histogram equalization for contrast enhancement and a noise removal procedure based on the Wiener filter were implemented as a pre-processing step. The study [13] introduces the application of the Wiener filter to the denoising of natural images, with patch based filtering performed in two stages, in the frequency domain and in the spatial domain. Patch level-based filtering offers a good methodology for minimizing noise. It shows an increase in performance when the noise level is low and results in a smoother denoised image at higher noise levels. Popular patch based denoising methods rely exclusively on the input noisy image to mine features. Non Local Means (NLM) [14] reconstructs the noisy patch by taking the weighted average of similar reference patches from the same input image. Block-Matching 3D filtering (BM3D) [15] is considered the state of the art technique. It builds on the concept of transform domain techniques by extending NLM. BM3D arranges similar reference patches into a 3D structure and then processes it through a 3D linear transform, thresholding and an inverse 3D transform to obtain a clean estimate of the noisy patch. Block-Matching 3D filtering with Principal Component Analysis (BM3DPCA) [16] is a modification of the original BM3D intended to increase its efficiency; PCA on similar reference patches followed by a Haar transform is employed to produce the denoised image. Local Pixel Grouping (LPG-PCA) [17] builds groups of similar patches based on similarity in spatial structure and employs PCA on the matrix of similar patches. In Targeted Image Denoising (TID) [18], the similar reference patches are mined from external databases. The method exploits PCA to learn the prior, and the denoising quality is improved by estimating an optimal spectral matrix. Subsequently, patch based approaches have evolved to mine features both from the input noisy image and from an external database. Procedures developed most recently in this category include affine regression [19], internal and external correlations [13], anchored regressors [20], integration of local and non-local priors [21], and external prior guided internal prior learning [22], all considering images corrupted with additive Gaussian noise. Although there is considerable research interest in patch level filtering of natural images, these frameworks have never been investigated on microarray images, taking into account the complexities of the microarray image.
Previous studies indicate that several types of noise may corrupt microarray images. Poisson [23], Gaussian [24,25] and exponential distributions [23] have been used as additive or multiplicative noise models to characterize the noise features of cDNA microarray images. Non-additive and non-Gaussian noise can be mathematically expressed as additive Gaussian noise [26], so minimizing Gaussian noise is important for cDNA microarray images.
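As a reference point for the patch based filters surveyed above, the weighted averaging idea behind NLM can be sketched on a one-dimensional signal. This is a simplified illustration and not the implementation of [14]; the function name `nlm_1d`, the border clamping and the parameters `patch_radius`, `search_radius` and `h` are all assumptions.

```python
import math

def nlm_1d(signal, patch_radius=1, search_radius=5, h=10.0):
    """Simplified 1D non-local means: each sample becomes a weighted
    average of nearby samples whose surrounding patches look similar."""
    n = len(signal)

    def patch(i):
        # Clamp indices at the borders so every sample has a full patch.
        return [signal[min(max(i + k, 0), n - 1)]
                for k in range(-patch_radius, patch_radius + 1)]

    out = []
    for i in range(n):
        p_i = patch(i)
        num, den = 0.0, 0.0
        for j in range(max(0, i - search_radius), min(n, i + search_radius + 1)):
            p_j = patch(j)
            # Mean squared difference between the two patches.
            d2 = sum((a - b) ** 2 for a, b in zip(p_i, p_j)) / len(p_i)
            w = math.exp(-d2 / (h * h))  # similar patches get weights near 1
            num += w * signal[j]
            den += w
        out.append(num / den)  # den >= 1 because j == i always contributes weight 1
    return out
```

With a small `h` only near-identical patches contribute, so edges are preserved; a larger `h` averages more aggressively and smooths more.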
Matti Nykter [27] proposed a simulation model for microarray images with biological and statistical characteristics. The model simulates a realistic microarray image by modelling measurement and biological errors. Several error models based on the Gaussian, Laplacian and lognormal distributions were introduced for simulating microarray images. By adding various types and amounts of noise that characterize biological and measurement errors to the simulated images, the performance of the proposed algorithm can be effectively tested. This gives a significant understanding of the efficiency of the algorithms.

Proposed Methodology
cDNA microarray image denoising is defined as the problem of restoring a clean image from its noisy observation. Given a noise free image $I_G$, its noisy observation is given as $I_N = I_G + N$, where $N \in \mathbb{R}^2$ is additive Gaussian noise with zero mean and variance $\sigma^2$. The aim is to obtain a noise free estimate of $I_G \in \mathbb{R}^2$ from $I_N \in \mathbb{R}^2$. This article proposes an image denoising framework that integrates an internal prior learned from the noisy image with a prior learned from a dataset. A microarray dataset of clean images has to be created to learn the prior. However, high quality microarray images are not achievable in practice. Therefore, a high quality Microarray Image Dataset (MAID) is formed by selecting visually good quality images from the Stanford Microarray database. Another dataset of simulated images is also constructed, and the priors of these images are used to validate the proposed framework. These synthetic images are formed considering the characteristics of realistic microarray images using the mamodel software [27]. Clinical investigation of two-colour microarray data is validated using the true signals, Red (R) and Green (G), or blends of them at various intensity levels.
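The additive Gaussian observation model can be simulated with a few lines of code. The sketch below is illustrative only: the function name `add_gaussian_noise`, the list-of-lists image representation and the fixed seed are assumptions, not part of the described method.

```python
import random

def add_gaussian_noise(image, sigma, seed=0):
    """Return a noisy copy of a 2D image (list of rows): I_N = I_G + N,
    with N drawn from a zero-mean Gaussian of standard deviation sigma."""
    rng = random.Random(seed)  # seeded so that experiments are repeatable
    return [[pixel + rng.gauss(0.0, sigma) for pixel in row] for row in image]

# Hypothetical 100 x 100 flat image corrupted at noise level sigma = 30.
clean = [[0.0] * 100 for _ in range(100)]
noisy = add_gaussian_noise(clean, sigma=30.0, seed=1)
```

No clipping to a valid intensity range is applied here; whether to clip depends on the bit depth of the images being simulated.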
The article proposes an integrated method that adaptively selects between the internal prior and the external prior when denoising a noisy image. In the first stage, the noisy image $I_N$ is split into overlapping patches of size 8 × 8. For every noisy patch, the aim is to restore its estimate using statistics obtained from internal or external reference patches. For each noisy patch, the signal to noise ratio [28] is calculated. A patch with a low signal to noise ratio is denoised with information from internal reference patches. A patch with a high signal to noise ratio is denoised with information from external reference patches. The reference patches for a given noisy patch are obtained using a search algorithm, K-Nearest Neighbor (KNN), with various distance measures. The technique considers the M nearest neighbors as prior data learned from the noisy image or from the microarray image dataset. Singular Value Decomposition (SVD) is then applied to obtain the denoised patch.
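The first stage, splitting the image into overlapping 8 × 8 patches, can be sketched as follows. The function name `extract_patches`, the `stride` parameter and the row-major flattening of each patch are assumptions made for illustration.

```python
def extract_patches(image, size=8, stride=1):
    """Split a 2D image (list of rows) into overlapping size x size patches.

    Returns a list of ((row, col), flat_patch) pairs, where (row, col) is the
    top-left corner and flat_patch is the patch flattened in row-major order.
    """
    rows, cols = len(image), len(image[0])
    patches = []
    for r in range(0, rows - size + 1, stride):
        for c in range(0, cols - size + 1, stride):
            flat = [image[r + i][c + j] for i in range(size) for j in range(size)]
            patches.append(((r, c), flat))
    return patches

# A hypothetical 10 x 10 image yields 3 x 3 = 9 overlapping 8 x 8 patches.
image = [[float(10 * r + c) for c in range(10)] for r in range(10)]
patches = extract_patches(image)
```

Keeping the top-left coordinate alongside each patch makes it straightforward to put the denoised patches back in place later.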
The initial step in the algorithm is to partition the noisy image into overlapping patches. For each noisy patch $P_N$, the aim is to recover its details with the support of similar external and internal patches. Searching for reference patches for a heavily corrupted noisy patch, whether in the noisy image or in the external image dataset, is a hard problem. Therefore, KNN based patch matching is proposed with the Chebyshev, Cityblock, Euclidean and Minkowski distances to improve patch matching accuracy.
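The four distance measures and a brute-force nearest-neighbor search over candidate patches can be sketched as below. This is a minimal illustration rather than the implementation used in the study; the function names and the Minkowski exponent `p=3` are assumptions, since the text does not state the exponent.

```python
def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def cityblock(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def minkowski(a, b, p=3):
    # Assumption: p = 3 by default; p = 1 and p = 2 give cityblock and Euclidean.
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def euclidean(a, b):
    return minkowski(a, b, p=2)

def knn_patches(query, candidates, m, distance=euclidean):
    """Return the m candidate patches closest to the query patch."""
    return sorted(candidates, key=lambda cand: distance(query, cand))[:m]

# Hypothetical flattened 2-pixel "patches".
query = [0.0, 0.0]
candidates = [[5.0, 5.0], [1.0, 0.0], [0.0, 2.0], [9.0, 9.0]]
nearest = knn_patches(query, candidates, m=2, distance=chebyshev)
```

The same `knn_patches` call serves both priors: the candidates come either from the noisy image itself or from the external dataset.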
The results show that incorporating the two categories of image priors yields better restoration, taking the benefits of both external and internal priors. The idea behind the algorithm is to group similar patches together and to build a model for those patches. This model is used to reduce significant noise in microarray images. The basic steps of the proposed denoising algorithm are as follows.
Algo1 for Denoising Microarray image:
1. For the size of the noisy image $I_N$
2. Extract overlapped noisy patch $P_N$ of size N1 × N1
3. Find signal to noise ratio (SNR) of noisy patch
4. If SNR of noisy patch > C
5. Extract similar reference patches from MAID
6. Res_pat ← Denoise(sim_patchExt, $P_N$)
7. Else if SNR of noisy patch < C
8. Extract similar reference patches from noisy image $I_N$
9. Res_pat ← Denoise(sim_patchInt, $P_N$)
10. END
The algorithm considers the clean microarray image as the ground truth image. Gaussian noise of a given level is added to the ground truth image to obtain the noisy image $I_N$. The noisy image is split into overlapping patches, and a noisy patch $P_N$ of size N1 × N1 is extracted from it. Equation (1) is used to estimate the signal to noise ratio of the noisy patch $P_N$.

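Equation (1) expresses the SNR of the noisy patch in terms of $\mathrm{var}(P_N)$ and $\mathrm{var}(n)$, where $\mathrm{var}(P_N)$ is the empirical variance of the noisy patch and $\mathrm{var}(n)$ is the noise variance, which is unknown and is approximated using one of the known denoisers. If $\hat{P}$ is the estimate of $P_N$, then $\mathrm{var}(n) \approx \mathrm{var}(P_N - \hat{P})$.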
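The noise-variance approximation and the patch-level switch of Algo1 can be sketched as follows. The residual-based estimate of var(n) follows the text, but the ratio form used for the SNR in `patch_snr` is an assumption made purely for illustration, because the exact expression of equation (1) is not reproduced here. The function names, the guard `eps` and the handling of the case SNR = C (treated as internal) are likewise assumptions.

```python
def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def noise_variance(noisy_patch, pilot_estimate):
    """var(n) approximated by var(P_N - P_hat), where P_hat is the output
    of any known denoiser applied to the noisy patch."""
    residual = [a - b for a, b in zip(noisy_patch, pilot_estimate)]
    return variance(residual)

def patch_snr(noisy_patch, pilot_estimate, eps=1e-12):
    # ASSUMPTION: SNR taken as var(P_N) / var(n); the source's equation (1)
    # may use a different combination of these two quantities.
    return variance(noisy_patch) / (noise_variance(noisy_patch, pilot_estimate) + eps)

def choose_prior(snr, threshold_c):
    """Lines 4 and 7 of Algo1: high-SNR patches use the external prior,
    low-SNR patches the internal prior (ties go to internal here)."""
    return "external" if snr > threshold_c else "internal"

# Hypothetical flattened patch and its pilot (pre-denoised) estimate.
noisy_patch = [1.0, 3.0, 1.0, 3.0]
pilot = [2.0, 2.0, 2.0, 2.0]
prior = choose_prior(patch_snr(noisy_patch, pilot), threshold_c=1.5)
```

In this toy case the pilot estimate is flat, so all of the patch variance is attributed to noise and the patch is routed to the internal prior.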
Lines 4 and 7 of the algorithm then choose whether to denoise the noisy patch with internal prior data or external prior data, based on the threshold C. A patch with low signal content and high noise has a lower signal to noise ratio, and a patch with more signal content and low noise has a higher signal to noise ratio. The threshold C is calculated as the average of the signal to noise ratios of smooth patches of the noisy image corrupted with various noise levels. This step (Lines 4 and 7) ensures that patches with smooth or low content prefer minimizing noise using internal prior information, whereas patches with details prefer denoising using external prior information.
The Denoise procedure invoked in Lines 6 and 9 concludes as follows:
$\hat{P} = A \hat{S} A^{T} P_N$
End
Output: Res_Pat
Given a noisy patch $P_N$, the aim is to recover the latent clean patch $P_G$. The observation model for the noisy patch is $P_N = P_G + n$, where n is additive Gaussian noise with variance $\sigma^2$.
The objective is to recover an estimate of $P_G$ from $P_N$ through a linear filter that minimizes $E\left[\|R P_N - P_G\|^2\right]$, where R is the linear operator to be estimated such that there is minimum mean squared error between the ground truth and the estimated denoised image. R is assumed to be symmetric, to have linearly independent basis vectors, and to be factorizable. These independent basis vectors help reduce the noise if they are sparse with high energy concentration. Sparsity in the basis matrix is achievable by finding similar reference patches for the given noisy patch using K-nearest neighbor with various distance measures. Singular value decomposition is used for the matrix decomposition to obtain sparse basis vectors. This decomposition helps eliminate trivial noise present in the image while preserving the principal image features. The left singular vectors of R are the eigenvectors of $RR^T$ and represent the principal image features. Denoising with SVD first computes the projection matrix A and the diagonal matrix S: given the reference patches $\{rp_j\}_{j=1}^{M}$, the SVD computes A and S using equation (3), where $\{rp_j\}_{j=1}^{M}$ is represented with the notation $R_p$, A is the basis matrix and S is the diagonal matrix of spectral coefficients. The reference patches obtained lie at different distances from the noisy patch under consideration. The information from the reference patches is therefore weighted, with higher weight given to the most similar patches. The weight matrix is formulated using a Gaussian kernel function: the more similar a patch is, the higher the weight it contributes to the calculation of A and S. The distance between $rp_i$ and the noisy patch $P_N$ is used to measure the similarity. The weight matrix is defined using a common Gaussian kernel function of this distance, in which Z is a normalization constant and h is the tuning parameter.
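For orientation, one plausible form of the Gaussian-kernel weight matrix and of the weighted decomposition is written out below. This is an assumed form consistent with the description above (a diagonal weight per reference patch, normalization constant Z, tuning parameter h, basis A and spectral matrix S); the exact expressions of the original equations, including the scaling of the exponent and the choice of distance $d(\cdot,\cdot)$, may differ.

```latex
% Assumed form: one Gaussian weight per reference patch rp_i, i = 1..M
W = \frac{1}{Z}\,\mathrm{diag}\!\left(
      e^{-d(rp_1,\,P_N)^2/h^2},\;\dots,\;e^{-d(rp_M,\,P_N)^2/h^2}
    \right)

% Assumed weighted decomposition yielding the basis A and spectral matrix S
R_p\, W\, R_p^{T} = A\, S\, A^{T}
```

Under this form, reference patches far from $P_N$ receive weights close to zero and so contribute little to A and S, which matches the stated intent of emphasizing the most similar patches.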
Then the best estimate of the orthogonal basis A and the spectral value matrix S is obtained by SVD using equation (3) with the weight matrix applied. An estimate $\hat{P}_G$ of the original patch $P_G$ is then formed such that it has minimum mean squared error with respect to the ground truth $P_G$; this estimate is denoted Res_Pat. The best spectral matrix $\hat{S}$ that minimizes the error between the ground truth patch and the estimated patch is obtained [18] from $\hat{S} = (S + \sigma^2 I)^{-1} S$. Thus the solution to the optimization problem is $\hat{P}_G = A \hat{S} A^{T} P_N$. The focus is therefore to combine internal and external prior data in order to minimize noise in microarray image patches using SVD, which is referred to as collaborative prior with SVD for denoising.
In patch based denoising methodologies, noisy images are decomposed into overlapping patches. A pixel is therefore denoised several times, once for each patch that contains it, so the pixel estimates are averaged to give the final denoised pixel. The overall procedure minimizes noise with a simple block-based approach. The proposed methodology yields good image quality in both quantitative and subjective assessments; the next section discusses these results for the proposed scheme.
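Two of the steps above lend themselves to a short sketch: the spectral shrinkage $\hat{S} = (S + \sigma^2 I)^{-1} S$, which for a diagonal S reduces to an elementwise ratio, and the averaging of overlapping patch estimates. The function names and the data layout (flattened row-major patches tagged with their top-left corner) are assumptions for illustration, and a positive noise level is assumed so the ratio is well defined.

```python
def shrinkage_coefficients(spectral_values, sigma):
    """Diagonal of S_hat = (S + sigma^2 I)^(-1) S for a diagonal S.
    Assumes sigma > 0 so the denominator never vanishes."""
    return [s / (s + sigma ** 2) for s in spectral_values]

def aggregate_patches(patches, shape, size):
    """Average overlapping patch estimates back into an image.

    patches: list of ((row, col), flat_patch) with row-major flat patches.
    shape:   (rows, cols) of the output image.
    """
    rows, cols = shape
    acc = [[0.0] * cols for _ in range(rows)]
    cnt = [[0] * cols for _ in range(rows)]
    for (r, c), flat in patches:
        for i in range(size):
            for j in range(size):
                acc[r + i][c + j] += flat[i * size + j]
                cnt[r + i][c + j] += 1
    # Pixels not covered by any patch are left at 0.0.
    return [[acc[r][c] / cnt[r][c] if cnt[r][c] else 0.0 for c in range(cols)]
            for r in range(rows)]

# Two hypothetical 2 x 2 patch estimates overlapping in the middle column.
estimates = [((0, 0), [1.0] * 4), ((0, 1), [3.0] * 4)]
image = aggregate_patches(estimates, shape=(2, 3), size=2)
```

The shrinkage ratio makes the behavior easy to read: components whose spectral value is large relative to $\sigma^2$ are kept almost unchanged, while components comparable to or below the noise level are strongly attenuated.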

Results and Discussion
The proposed method is implemented in MATLAB. The implementation depends on two parameters, the signal to noise ratio of the noisy patch and the median of the noisy image, which are used to make the search for reference patches in the noisy image more effective. Noise estimation is done with the high frequency component of the discrete wavelet transform. Three experiments are carried out to validate the performance of the proposed methodology. In the first experiment, the performance of the filters is examined on simulated microarray images degraded with different levels of Gaussian noise. The proposed denoising filter is validated by assessing image quality using the Structural Similarity Index (SSIM) and Peak Signal to Noise Ratio (PSNR) as performance criteria. The performance of the proposed technique is evaluated by comparing it with competing patch based filtering techniques and common filtering methods. This collaborative approach has been implemented for the patch based denoising methods. Targeted Image Denoising (TID), Non Local Means (NLM), Block-Matching 3D Filtering with Principal Component Analysis (BM3DPCA), Local Pixel Grouping (LPGPCA), Block-Matching 3D Filtering (BM3D), the median filter and the Discrete Wavelet Transform (DWT) are the methods considered for comparison. The proposed approach with the Chebyshev, Cityblock, Euclidean and Minkowski distances achieves the best denoising results among the compared methods for various noise levels. The result for a noise level of σ = 30 is shown in Figure 1. Figure 1(e, j, k, l, n, and o) shows the results for the competing patch based filtering techniques; the red circles mark regions where more noise remains in the image. Figure 1(f, g, h, and i) shows the results for the proposed method with the various distance metrics; there, the red circles mark regions with less residual noise. The integrated approach of using both internal and external priors has thus reduced the noise.
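Of the two quality criteria, PSNR is simple enough to sketch directly. The function name `psnr` and the default peak value of 255 are assumptions; microarray scans are often stored at a higher bit depth, in which case `peak` should be set to the corresponding maximum intensity.

```python
import math

def psnr(reference, estimate, peak=255.0):
    """Peak signal to noise ratio in dB between two equally sized 2D images."""
    n = sum(len(row) for row in reference)
    mse = sum((a - b) ** 2
              for ref_row, est_row in zip(reference, estimate)
              for a, b in zip(ref_row, est_row)) / n
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)

# Hypothetical 1 x 2 images differing by 25.5 grey levels at every pixel.
reference = [[0.0, 0.0]]
estimate = [[25.5, 25.5]]
value_db = psnr(reference, estimate)
```

Because PSNR depends only on the mean squared error, it is usually reported together with SSIM, which is sensitive to structural distortions that PSNR does not capture.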
Tables 1 and 2 present the objective image quality of the eleven denoising schemes in terms of PSNR and SSIM values for various noise levels. In the tables and figures, the notation 'Prop+' denotes the proposed approach. Combining the priors delivered better results than the well-known common methods in cDNA microarray image denoising. It can be seen from Figure 2(a) and 2(b) that the proposed scheme clearly outperforms all other techniques on the simulated microarray image. The major reason for this trend is that patches in smooth regions prefer the internal prior, while patches with edges and texture prefer the external prior. Hence the combination of these priors provides better results than the existing denoising mechanisms.
The microarray simulation model [27] is used to generate simulated microarray images that include biological and hybridization errors. The errors include background noise, spot bleeding, scratches and air bubbles. In the mamodel software, hybridization and biological errors are modelled with various error models such as simple additive Gaussian noise, Dror, Hartemink, Hierarchical, Hein and Rocke. A simulated microarray image test dataset is created with various degrees of biological and hybridization errors (see Tables 3 and 4). The noise level in these simulated images is unknown, and the simulated images affected by relatively high noise levels are used for the experiment. The proposed algorithm depends on the noise variance (sigma) value for finding the optimal spectral coefficients ($\hat{S}$). The noise level is estimated using the high frequency component of the discrete wavelet transform.
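Estimating the noise level from the high frequency wavelet component can be sketched with a single-level Haar transform and the median absolute deviation rule. This is an illustrative assumption: the text does not state which wavelet or which estimator is used, and the function name `estimate_sigma_haar` and the constant 0.6745 (the Gaussian MAD scaling) are choices made for this sketch.

```python
import random
import statistics

def estimate_sigma_haar(image):
    """Estimate the Gaussian noise standard deviation from the diagonal (HH)
    detail coefficients of a single-level orthonormal Haar transform,
    using sigma ~ median(|HH|) / 0.6745."""
    rows = len(image) - len(image) % 2        # drop a trailing odd row/column
    cols = len(image[0]) - len(image[0]) % 2
    details = []
    for r in range(0, rows, 2):
        for c in range(0, cols, 2):
            hh = (image[r][c] - image[r][c + 1]
                  - image[r + 1][c] + image[r + 1][c + 1]) / 2.0
            details.append(abs(hh))
    return statistics.median(details) / 0.6745

# Hypothetical check on pure Gaussian noise with a known level of 30.
rng = random.Random(0)
noise_image = [[rng.gauss(0.0, 30.0) for _ in range(128)] for _ in range(128)]
sigma_hat = estimate_sigma_haar(noise_image)
```

The median makes the estimate robust to the comparatively few large coefficients produced by spot edges, which is why this kind of rule is commonly preferred over a plain standard deviation of the detail band.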
The visual denoising results for a simulated microarray test image corrupted with biological and hybridization noise are presented in Figure 3. It can be seen clearly from the visual outcome that the proposed approach with the various distance metrics consistently outperforms the collaborative approach applied to the well-known patch based denoising methods and the other two common schemes. The PSNR and SSIM values for the specified algorithms on the eight images in the test set are reported in Tables 3 and 4. Specifically, Figure 4(a-f) illustrates the mean spot intensity values in the red channel and green channel images. The proposed method with the Chebyshev, Cityblock, Euclidean and Minkowski distances results in a denoised image with spot intensity values very close to those of the original simulated microarray image.
Figure 4. Visual representation of simulated microarray images and their spot intensity values for the red channel and green channel images: the simulated noisy image, and the images denoised using the proposed collaborative approach with (c) Chebyshev distance, (d) Cityblock distance, (e) Minkowski distance and (f) Euclidean distance.
Finally, after validating the proposed method on simulated microarray images, the experiment is performed on a realistic microarray image taken from the Stanford microarray database. The visual outcomes of the proposed method with the various distance metrics on this realistic microarray image are shown in Figure 5(a-e). PSNR and SSIM cannot be evaluated on the realistic microarray image because no noise free image is available. The denoising results for the realistic image are therefore shown as spot intensity values in Figure 6(a-b) for the various algorithms. The graph shows that the trend lines for the proposed method lie well above the noisy trend line, leading to the conclusion that the proposed approach has minimized noise. The experimental outcomes on realistic and simulated microarray images show that the proposed method with the Chebyshev, Cityblock, Euclidean and Minkowski distance metrics retains signal quality by reducing the noise and limiting the inclusion of false patterns within the image.
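The spot intensity comparison used in these experiments relies on a mean intensity per spot. A minimal sketch is given below, assuming a circular spot with a known center and radius; the function name `mean_spot_intensity` and the circular mask are assumptions, since the text does not describe how spots are segmented.

```python
def mean_spot_intensity(image, center, radius):
    """Mean intensity of the pixels inside a circular spot.

    Assumes the circle overlaps the image, so at least one pixel is selected.
    """
    cr, cc = center
    values = [image[r][c]
              for r in range(len(image))
              for c in range(len(image[0]))
              if (r - cr) ** 2 + (c - cc) ** 2 <= radius ** 2]
    return sum(values) / len(values)

# Hypothetical 5 x 5 channel image with one bright pixel at the spot center.
channel = [[0.0] * 5 for _ in range(5)]
channel[2][2] = 100.0
spot_mean = mean_spot_intensity(channel, center=(2, 2), radius=1)
```

Applying the same function to the red and green channel images before and after denoising gives the per-spot values that can be plotted against each other.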

Conclusion
This article proposes an image denoising approach that combines a prior learned from the noisy image with a prior learned from a dataset. First, the algorithm considers 8 × 8 patches of the input noisy image and, based on the signal to noise ratio of each patch, searches for matches either in the search window of every image in the database or in the noisy image itself. The matching is done using various distance metrics. The algorithm then applies singular value decomposition to perform the denoising. The proposed model is validated on simulated and realistic microarray images. The experimental results show that the proposed method offers improved PSNR and SSIM in comparison with existing methods. The scheme thus demonstrates the effectiveness of combining the prior learned from the noisy image with the prior learned from the dataset. Future work will aim to improve the performance of the denoising framework and to reduce its computational complexity.