Color and cross diagonal symmetric pair co-occurrence matrix

Objectives: The main goal of this study is to derive an efficient feature vector that captures both color and texture information, making the proposed descriptor a multipurpose one. Methodology: This study derives color information by combining the individual histograms of the H, S and V planes. A texture descriptor is derived by considering the relationship i) between pairs of symmetric cross pixels and ii) between pairs of symmetric diagonal pixels of the 3 x 3 neighborhood. The relationship is derived using the XOR and AND logical functions. Findings: This study derived a unique six-bit code and constructed a co-occurrence matrix, from which the GLCM features are derived. To reduce complexity, this study also derived another descriptor by indexing the six-bit code using the rotation invariant property. Novelty: The proposed color and texture descriptors give better results, and a significant improvement is achieved in this study by concatenating these two descriptors for CBIR.


Introduction
The rapid growth of image databases due to advances in various digital technologies has given tremendous importance to computer vision. Computer vision plays an important role in applications such as academics, forensics, medical diagnostics, entertainment and agriculture. Today many researchers are interested in image classification, image retrieval, image recognition, image analysis and related problems. Content based image retrieval (CBIR) methods retrieve the images similar to a query image from a large database of different image types using image contents such as color, gray level intensities, shape and texture. In the early days, images were retrieved using annotated text or indexes attached to the images (1), and the image contents were simply ignored. These methods fail completely when retrieving similar images from a large database containing several image types. Later, retrieval methods based on ontologies, map vocabularies etc. (2,3) were proposed in the literature to carry out automatic recognition. However, these methods are time consuming and require detailed annotations. To achieve results closer to human visual perception, some researchers proposed semantic based image retrieval (SBIR) methods (4,5). However, these methods require high-end hardware and have high complexity. The introduction of content based image retrieval (CBIR) (6) turned the attention of several researchers towards CBIR methods because of their simplicity and accuracy.
A preprocessing mechanism is applied to the images in any image processing application, and CBIR similarly preprocesses the given set of images. After preprocessing (if needed), the image features are derived; the set of image features represents the feature vector of the image. The feature vector is derived for all database images, and the same feature vector is computed on the query image. The feature vectors of the query and database images are then compared using a distance measure, and the most relevant images are retrieved. Based on this retrieval, performance and quality measures are computed; these quality measures determine the performance of the given framework or feature vector. Texture has been identified as a prominent feature of an image, and texture attributes play a key role in various computer vision applications. Several texture analysis methods have been proposed in the literature over the past twenty years. Ojala et al. (7) proposed the most significant texture descriptor, known as local binary patterns (LBP). LBP extracts image features by analyzing the local circular neighborhood. The simplicity and accuracy of LBP inspired several researchers to derive numerous LBP variants. However, LBP is prone to noise: a small fraction of noise may change the LBP code significantly. The semi-structured LBP (SLBP) proposed by Xingyuan et al. (8) preprocesses the image to overcome noise distortions and access more significant spatial information. The local ternary pattern (LTP) proposed by Tan et al. (9) stabilized the micropatterns by quantizing the sign differences between the center pixel and its circular neighboring pixels into three values using a threshold. The multi-region prominent LBP (MR-LBP) (10) gives facial images a uniform representation and exhibited good results in face recognition applications.
In recent years, researchers have also derived octal patterns (11) instead of binary patterns. These octal patterns extract more local neighborhood spatial relationships and achieved better results in CBIR.
The other disadvantage of LBP and its variants is that they derive local features on circular neighborhoods (isotropic structures). In the recent past, researchers proposed texture descriptors that capture both isotropic and anisotropic structural information from circular and elliptical neighborhoods. The circular and elliptical LBP (CE-LBP) (12) falls in this category and exhibited effective performance in texture classification. Local directional patterns (LDP) (13) were developed for texture classification and automatic face recognition. LDP derives more stable edges; however, its main disadvantage is that human interaction is needed to select the top edge responses. Recently we developed the descriptor "advanced local direction cross diagonal matrix (ALDCDM)" (14) to overcome this disadvantage of LDP. This framework (14) derived a ternary pattern to extract more significant texture information and achieved a high retrieval rate. Color is a prominent visual cue in several databases where the color of the object plays a major role, and a lot of work has been reported on color based CBIR systems. Chen et al. (15) represented image contents using the color distributions of the image. Color descriptors including color layout and dominant color are proposed in MPEG-7 (16). The local color vector binary pattern (LCVBP) (17) represents a color image using color norm patterns and color angular vectors. Color features have several advantages: they are robust to noise and invariant to size, orientation and resolution. However, the retrieval performance of color descriptors is generally limited because of their inadequate discrimination capability. In the recent past, researchers achieved better results in image retrieval by integrating the color and texture features of the image (18)(19)(20). Z. Zeng (21) proposed the local structure descriptor (LSD) for color image retrieval.
The LSD integrates the color, texture and shape features of the image. Hashing and binary search methods are also used in CBIR. Multi-view alignment hashing (MAH) (22) and evolutionary compact embedding (ECE) (23) are popular in this category and achieved good retrieval results on large image databases. The distribution of texture information in more than two directions on the first- and second-order derivatives was derived recently (24). This descriptor (24) computed rich local texture information and achieved better results than its counterparts on very large scale CBIR databases. The bag-of-words (BoW) approach was initially derived for textual document classification; later, BoW methods were tested and adopted for CBIR (25). These days BoW methods are used in almost all fields of computer vision with good results, but they are computationally expensive. Deep learning based approaches have become popular in recent years for image retrieval problems. Lin et al. (26) derived binary hash codes for image retrieval using a deep learning framework. Researchers also aggregated the high-level features of convolutional neural networks (CNNs) with deep learning features to achieve good discrimination and fast retrieval capabilities (27). Yu et al. (28) proposed a novel multimodal deep learning approach using click features for the ranking and retrieval of images. Deep learning and CNN frameworks are also used in medical image retrieval and computer-aided diagnosis (29)(30)(31). Computational time complexity and the requirement for high-end hardware are the major challenges of these approaches.
The representation of an image by different kinds of local structural neighborhood patterns has influenced researchers to propose various structure based methods for CBIR. The well-known structural pattern called the texton, proposed by Julesz (32), has been significantly extended by researchers in the past few years (33)(34)(35). Our recently proposed descriptor "odd and even tetra texton matrix (OE-TTM)" (36) is an extension of the texton framework. The OE-TTM integrated the structural, texture and statistical features of an image and showed noise resistance and improved discrimination abilities. Numerous extensions have been proposed in the literature for another prominent structural pattern known as the motif (37)(38)(39), which have shown improvement over preceding descriptors. Texture features are also derived based on the symmetric relationship between neighboring pixels of a 3 x 3 grid. The center symmetric LBP (CS-LBP) (40) and weighted symmetric LBP (WS-LBP) (41) are popular frameworks in this category. In CS-LBP, local patterns on a 3 x 3 neighborhood are derived based on the relationship between symmetric sampling points. CS-LBP completely ignores the center pixel and derives only 16 binary patterns on a 3 x 3 window. WS-LBP overcomes this by estimating the relationship between the weighted center pixel and the symmetric pixels of the neighborhood, deriving 64 binary patterns on a 3 x 3 window by computing the other edge information. Most LBP variants measure a relationship between two pixels, usually between the center pixel and one of the sampling points. The descriptors proposed in this study derive a relationship between one symmetric pair of pixels and another symmetric pair, capturing more vital local information among four pixels.
The novelty of the current approach is that (i) the two proposed descriptors derive the binary pattern using two logical operations, XOR and AND, instead of a sign function as in LBP and its variants, and (ii) the extraction of the rotation invariant "indexed cross and diagonal symmetric pair (ICDSP)" pattern significantly reduces the dimension of the derived co-occurrence matrix. Thus the proposed descriptors capture the local information more precisely, and the integration of these local features with the HSV color histogram achieves a high retrieval rate compared to other existing descriptors.
The remaining sections of this study are organized as follows: a detailed illustration of the proposed frameworks is given in Section 2. Section 3 describes the image databases, the evaluation metrics chosen and the experimental results. The study is concluded in Section 4.

Proposed method
Several methods have been proposed for CBIR in the literature. Color histograms, color correlograms, color coherence vectors etc. are used to derive low-level color features. These color based methods derive global features and extract significant color information; however, local descriptors have attained the most promising results in CBIR. The features derived on the local neighborhood and on the local intensity of the image define the texture, which is why local neighborhood features and statistical features are derived in the literature to obtain texture features. The most significant contribution to statistical feature extraction is the gray level co-occurrence matrix (GLCM), initially proposed by Haralick et al. (42) and also known as the gray level spatial dependency matrix (GLSDM). The GLCM derives texture features by computing the spatial relationship among pixels of an image separated by a distance 'd'. Texture features derived from local pattern or shape information are also popular in CBIR; among these, the texton (32)(33)(34)(35)(36) and motif based (37)(38)(39) methods attained good results. They basically derive local pattern information on a 2 x 2 grid. Statistical attributes are also computed on this local pattern information using the GLCM, yielding promising and precise texture information. This study focuses on exploring the mutual relationship between pairs of cross symmetric neighboring pixels as well as pairs of diagonal symmetric neighboring pixels over a 3 x 3 neighborhood. The framework of the proposed CCDSP-CM descriptor is shown in Figure 1. This research initially derives first-order local derivatives by exploring the relationship between the neighboring pixels and the center pixel.
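As background, the co-occurrence counting underlying the GLCM can be sketched as follows: a minimal one-directional count on a toy 4-level image (library implementations such as scikit-image's `graycomatrix` provide the same functionality with symmetric and normalized options).

```python
import numpy as np

def glcm(img, d=1, angle=45, levels=4):
    """Gray level co-occurrence matrix for distance d and a given angle.

    Counts ordered pixel pairs (i, j) where j lies at displacement
    (dr, dc) from i. Offsets follow the usual convention:
    0 deg -> right, 45 deg -> upper-right, 90 deg -> up, 135 deg -> upper-left.
    """
    offsets = {0: (0, d), 45: (-d, d), 90: (-d, 0), 135: (-d, -d)}
    dr, dc = offsets[angle]
    M = np.zeros((levels, levels), dtype=np.int64)
    rows, cols = img.shape
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                M[img[r, c], img[r2, c2]] += 1
    return M

# toy 4-level image
img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 2, 2, 2],
                [2, 2, 3, 3]])
M45 = glcm(img, d=1, angle=45)  # diagonal pixel pairs, as in Figure 5(b)
```

In the proposed method the same counting is applied not to raw gray levels but to the CDSP Pattern coded image, with `levels` set to the number of pattern codes.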
The first-order derivative between each sampling point and the center pixel is derived using a sign function:

b_i = s(n_i - n_c), i = 0, 1, ..., 7, where s(x) = 1 if x >= 0 and s(x) = 0 otherwise.

The 3 x 3 neighborhood is represented in Figure 2, where n_0, n_1, ..., n_7 represent the intensity values of the neighboring pixels and n_c represents the center pixel's intensity value. The even numbered derivatives {b_0, b_2, b_4, b_6} represent the first-order cross pixels and are known as the cross unit. The odd numbered derivatives {b_1, b_3, b_5, b_7} represent the first-order diagonal pixels, or diagonal unit. This study computes new patterns by establishing a relationship between the symmetric pairs of cross pixels, and in a similar way between the symmetric pairs of first-order diagonal pixels. The present work calculates three cross symmetric pairs of patterns (CSPP) {c_2, c_1, c_0} and three diagonal symmetric pairs of patterns (DSPP) {d_2, d_1, d_0}. The CSPP and DSPP are computed using the XOR and AND logical functions; the logical AND function is applied between symmetric pairs of cross and diagonal pixels, as shown in Figure 3.
The derivation of the CDSP Pattern framework for the 3 x 3 neighborhood of Figure 4 is given below. The present work combines the CSPP and DSPP and derives the CDSP Pattern as {d_2, d_1, d_0, c_2, c_1, c_0}. From this, the CDSP Pattern code is computed by multiplying with binary weights:

CDSP Pattern code = d_2 * 2^5 + d_1 * 2^4 + d_0 * 2^3 + c_2 * 2^2 + c_1 * 2^1 + c_0 * 2^0

The CDSP Pattern code for the above pattern is 57. The center pixel is replaced with the CDSP Pattern code, i.e., 57 in this case. This process is repeated over the entire image with a step length of one, in an overlapped manner, transforming the given gray level image into a CDSP Pattern coded image. The range of the CDSP Pattern code is 0 to 63. This work computes the GLCM on this transformed image, i.e., the CDSP Pattern image, which results in the "cross diagonal symmetric pair co-occurrence matrix (CDSP-CM)". The dimension of the CDSP-CM is 64 x 64. To reduce this dimensionality, to preserve the features more precisely and to derive rotation invariant features, this research computes rotation invariant "indexed CDSP (ICDSP)" pattern codes on the input texture image. This study assigns index values from 0 to 13 to each of the 14 different rotation invariant codes possible for a 6-bit binary pattern (7). An example of the GLCM calculation is shown in Figure 5: Figure 5(a) shows the original matrix, and Figure 5(b) shows the computed GLCM for a distance d = 1 and an angle of rotation θ = 45°, i.e., diagonal pixel pairs.

6. Compute the first-order derivatives of the 3 x 3 neighborhood.
7. Divide the first-order derivatives derived in step 6 into cross and diagonal units.
8. Calculate the three cross symmetric pairs of patterns (CSPP) on the cross unit of the first-order derivatives.
9. Compute the three diagonal symmetric pairs of patterns (DSPP) on the diagonal unit of the first-order derivatives.
10. Combine the DSPP and CSPP and derive the cross and diagonal symmetric pair (CDSP) pattern code.
11. Replace the center pixel of the 3 x 3 window with the CDSP Pattern code.
12. Repeat steps 6 to 11 on each 3 x 3 grid in an overlapped manner with a step length of one; this transforms the V-plane image into a CDSP Pattern coded image.
13. Compute the GLCM features on the CDSP Pattern image to derive the feature descriptor "cross diagonal symmetric pair co-occurrence matrix (CDSP-CM)".
14. Integrate the CDSP-CM with the color histogram features computed at step 4 to derive the final feature vector of the image, known as the color and cross diagonal symmetric pair co-occurrence matrix (CCDSP-CM).
15. Repeat steps 2 to 14 on all database images to build the feature vector of each image.
16. Compare the query and database images for their similarity using the distance metric.
17. Display the best matched images by arranging them in ascending order of their distance values.
End of the algorithm.
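Steps 6–11 of the algorithm above can be sketched in Python. The exact XOR/AND pairing is defined in the paper's Figures 3 and 4, which are not reproduced here, so the bit definitions below are an assumption for illustration; only the bit layout {d2, d1, d0, c2, c1, c0} and the 0 to 63 code range follow the text.

```python
import numpy as np

def sign(x):
    # first-order derivative: 1 if neighbor >= center, else 0
    return 1 if x >= 0 else 0

def cdsp_code(n, nc):
    """6-bit CDSP Pattern code for one 3x3 neighborhood.

    n  : neighbors n0..n7, ordered so that n0,n2,n4,n6 form the cross
         unit and n1,n3,n5,n7 the diagonal unit; nc : center pixel.
    NOTE: the XOR/AND pairing below is an assumed reading of the
    paper's figures, not a confirmed definition.
    """
    b = [sign(int(ni) - int(nc)) for ni in n]   # derivatives b0..b7
    c2, c1 = b[0] ^ b[4], b[2] ^ b[6]           # cross symmetric pairs (XOR)
    c0 = (b[0] & b[4]) ^ (b[2] & b[6])          # cross pairs via AND (assumed)
    d2, d1 = b[1] ^ b[5], b[3] ^ b[7]           # diagonal symmetric pairs (XOR)
    d0 = (b[1] & b[5]) ^ (b[3] & b[7])          # diagonal pairs via AND (assumed)
    bits = [d2, d1, d0, c2, c1, c0]             # pattern {d2,d1,d0,c2,c1,c0}
    return sum(bit << (5 - i) for i, bit in enumerate(bits))

def cdsp_image(v):
    """Slide the 3x3 window over a gray (V-plane) image with a step of
    one and replace each interior center pixel with its CDSP code."""
    out = np.zeros(v.shape, dtype=np.uint8)
    offs = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
            (0, -1), (1, -1), (1, 0), (1, 1)]   # n0..n7
    for r in range(1, v.shape[0] - 1):
        for c in range(1, v.shape[1] - 1):
            n = [v[r + dr, c + dc] for dr, dc in offs]
            out[r, c] = cdsp_code(n, v[r, c])
    return out
```

Whatever the exact pairing, the resulting coded image takes values in 0 to 63, so the co-occurrence matrix computed on it is 64 x 64, as stated above.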
In a similar way, this research derives an algorithm for CICDSP-CM. It is identical to the CCDSP-CM algorithm except that step 11 is modified as follows: find the rotation invariant code to derive the ICDSP code, and replace the center pixel with the ICDSP code.
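One plausible sketch of the rotation invariant indexing follows, assuming the index is assigned by sorting the minimal rotations of the 6-bit codes; the paper does not give the exact index assignment, but the class count of 14 matches the text.

```python
def ri_min(code, bits=6):
    """Smallest value among all circular bit rotations of a 6-bit code."""
    best = code
    for _ in range(bits - 1):
        code = (code >> 1) | ((code & 1) << (bits - 1))
        best = min(best, code)
    return best

# the 14 distinct rotation invariant classes of 6-bit patterns, indexed 0..13
classes = sorted({ri_min(c) for c in range(64)})
INDEX = {m: i for i, m in enumerate(classes)}

def icdsp(code):
    """Map a 6-bit CDSP code to its rotation invariant index (0..13)."""
    return INDEX[ri_min(code)]
```

With this indexing, the co-occurrence matrix shrinks from 64 x 64 to 14 x 14, which is the source of the reduced dimensionality claimed for CICDSP-CM.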
The major contribution of this study:

Results and discussion
To test the efficiency of the proposed descriptors, the color and cross diagonal symmetric pair co-occurrence matrix (CCDSP-CM) and the color ICDSP-CM (CICDSP-CM), this study considered five popular image databases: Corel-1000 (43), Corel-10000 (44), MIT-VisTex (45), color Brodatz texture (CBT) (46) and CMU-PIE (47). The main reason for choosing these five databases is to study the performance of the proposed framework on color images that differ from each other in image type (natural or texture), image size, number of categories, number of images per category, total number of images, lighting conditions, backgrounds and viewing angles. To make the retrieval process faster, a new set of images was derived from each of these databases. A summary of these databases is given in Table 1, and sample images from the considered databases are displayed in Figures 6, 7, 8, 9 and 10. To measure the similarity of the query and database images, this research uses the Euclidean distance metric:

D(q, db_i) = sqrt( Σ_{j=1}^{|v|} ( f_{db_i}(j) - f_q(j) )^2 )

where '|v|' indicates the feature vector size, and f_{db_i}(j) and f_q(j) represent feature element 'j' of the i-th database image db_i and the query image 'q' respectively. The best matched image is the one with the least distance value. This research considers the average precision rate (APR) and average recall rate (ARR) as performance evaluation measures for the proposed CCDSP-CM and CICDSP-CM frameworks. In this experiment, each database image is treated as a query image, and the performance of the proposed descriptors CCDSP-CM and CICDSP-CM is tested using the metrics defined in Equations 11 to 16 below.
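The Euclidean ranking described above can be sketched as:

```python
import numpy as np

def euclidean(fq, fdb):
    """Euclidean distance between the query feature vector and one
    database feature vector."""
    fq, fdb = np.asarray(fq, dtype=float), np.asarray(fdb, dtype=float)
    return float(np.sqrt(np.sum((fdb - fq) ** 2)))

def retrieve(fq, db_vectors, top_n=10):
    """Return database indices ranked by ascending distance to the query;
    the best matched image has the least distance value."""
    dists = [euclidean(fq, f) for f in db_vectors]
    return sorted(range(len(db_vectors)), key=lambda i: dists[i])[:top_n]
```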

precision = (number of correct matching images) / n   (11)

recall = (number of correct matching images) / m   (12)

where 'n' is the total number of images retrieved and 'm' is the number of matching images in the database.
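Equations (11) and (12) can be computed as, for example:

```python
def precision_recall(retrieved, relevant, n):
    """Precision (Eq. 11) over the top-n retrieved images and recall
    (Eq. 12) over the m relevant images in the database."""
    correct = len(set(retrieved[:n]) & set(relevant))
    return correct / n, correct / len(relevant)
```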

Average precision and average recall are obtained by averaging precision and recall over all queries of a category and over the whole database:

average precision of category i = (1 / N_ic) Σ precision   (13)

average recall of category i = (1 / N_ic) Σ recall   (14)

APR = (1 / |DB|) Σ precision   (15)

ARR = (1 / |DB|) Σ recall   (16)

where N_ic is the number of images of the i-th category in the database, 'n' is the number of retrieved images and |DB| indicates the size of the image database. The performance efficacy of the proposed CCDSP-CM and CICDSP-CM descriptors is judged against the existing descriptors local binary pattern (LBP) (7), semi-structured local binary pattern (SLBP) (8), local directional pattern (LDP) (13), local ternary pattern (LTP) (9), center symmetric LBP (CS-LBP) (40) and block based LBP (BLK-LBP) (48) using APR and ARR on the considered databases; the results are shown in the graphs in Figures 11 to 20. The top 20 images retrieved by the CCDSP-CM descriptor from each image database for a given input image are displayed in Figures 21, 22, 23, 24 and 25.

1. The proposed CCDSP-CM framework exhibited a high retrieval rate and strong discrimination ability on the Corel-1000, Corel-10000, CBT and MIT-VisTex databases, with average precisions of 84.69%, 86.32%, 87.24% and 85.12% respectively for the retrieval of the top 10 images.
2. The proposed CCDSP-CM framework showed only average performance on the CMU-PIE database because of its complexity (see the graphs in Figures 11 to 20).
3. The precision and recall graphs in Figures 11 to 20 indicate the high performance of the CCDSP-CM descriptor over the state-of-the-art descriptors in terms of APR% and ARR%.
4. The proposed CCDSP-CM descriptor showed an 8 to 9% increase in APR over the LBP variants.
5. The proposed CCDSP-CM framework exhibited a 6 to 8% increase in ARR over LDP and LTP, since it derives more spatial information from the local neighborhood.
6. The proposed descriptors CCDSP-CM and CICDSP-CM achieved a faster retrieval rate than other local descriptors due to their low dimensionality (see the graphs in Figures 11 to 20).
7. The novelty of the proposed descriptors CCDSP-CM and CICDSP-CM is the derivation of texture features using a new concept: establishing a relationship between pairs of diagonal, parallel and perpendicular symmetric pixels of 3 x 3 windows. The proposed descriptors compute a binary pattern using the XOR and AND logical functions on symmetric pairs of pixels. To promote rich textural features, the proposed descriptor derives GLCM features on this coded image instead of a histogram. Thus the proposed descriptor derives rich information from the symmetric relations between pairs of elements, the GLCM features and the color contribution, which is one of the significant reasons for achieving a high retrieval rate on a variety of databases and for outperforming the other existing methods. Thus this research achieved a high retrieval rate over existing descriptors by integrating color and statistical texture features.
8. The proposed CICDSP-CM framework showed better performance than the proposed CCDSP-CM descriptor, with average precisions of 86.53%, 87.19%, 88.17% and 86.41% on the Corel-1000, Corel-10000, CBT and MIT-VisTex databases respectively for the retrieval of the top 10 images.
9. The proposed CICDSP-CM descriptor showed good retrieval performance on the CMU-PIE database compared with the proposed CCDSP-CM descriptor in terms of precision rate.
10. The proposed CICDSP-CM framework is robust to rotational changes of the images and also exhibited a faster retrieval rate than the proposed CCDSP-CM descriptor due to its decreased dimensionality. The graphs in Figures 11 to 20 clearly indicate that the proposed CICDSP-CM descriptor gives a 2 to 3% improvement over the proposed CCDSP-CM framework in terms of APR% and ARR%.

Conclusion
This study contributed two novel descriptors, CCDSP-CM and CICDSP-CM, for CBIR by combining color and texture features. The first texture descriptor is named the cross and diagonal symmetric pair co-occurrence matrix (CDSP-CM). The CDSP-CM descriptor effectively captures the relationship between two symmetric pairs of pixels in a local window of size 3 x 3. The proposed CDSP Pattern first derives the first-order derivatives of the neighborhood and divides the window into cross and diagonal units. It then extracts a binary pattern by computing the relationship between two symmetric pairs of pixel elements using the XOR and AND logical functions on the cross and diagonal units, yielding a six-bit binary code. Thus the CDSP Pattern code captures strong texture information. The co-occurrence matrix (CM) derived on the CDSP Pattern extracts the spatial co-occurrence relationship. The dimension of the proposed CDSP-CM is 64 x 64. To reduce this dimension and to derive a rotation invariant CDSP Pattern code, this study derived the indexed CDSP (ICDSP) Pattern code, which is the basis of the second descriptor: the CM derived on the ICDSP produces the second texture descriptor, named the indexed CDSP-CM (ICDSP-CM). The color features are derived using the individual color histograms of the H, S and V planes. The combination of these color features with the texture features, derived by combining symmetric patterns with the statistical patterns of the GLCM, contributed significantly to achieving a high retrieval rate on a variety of databases. The experimental results indicate similar performance for the proposed CCDSP-CM and CICDSP-CM descriptors; however, CICDSP-CM exhibited slightly higher performance than CCDSP-CM, with lower dimensionality.