Object Recognition by Feature Weighted Matrix – A Novel Approach

Objective: This research aims to formulate a method for categorising a given class of objects by means of a weighted matrix, computed as explained below.
Methods/Analysis: The method has two phases, training and testing. In the training phase, a set of images of the objects concerned is taken; the set may contain images of different objects or of different positions of the same object. Features are then extracted from these input images and stored in the database as vectors, and all subsequent computation is performed on these vectors. In the testing phase, the algorithm uses this knowledge to assign the input image to one of the trained classes.
Findings: The method is computationally inexpensive, since all calculations are basic matrix operations. It is not limited to the domain of object recognition: any real-time entity that can be represented statistically as a vector can be handled. The only requirement is that the range of the vectors is defined, so that the minimum and maximum of each component can be obtained. Once these are obtained, the algorithm can determine the category to which any input belongs. The one challenge identified is that the ranges of the vectors obtained for different categories must not overlap; if they do, multiple hits occur and the result is incorrect. Further work could remove this dependency. The algorithm also improves the results under various illumination and scaling conditions, as discussed in the results and analysis section. It can also be combined with existing object recognition methods to categorise or classify objects.
Conclusion/Application: The proposed algorithm successfully classifies the input image into one of the trained categories by identifying its features and then processing them as prescribed by the algorithm.


Introduction
Identifying a meaningful region of interest in an image or video is a significant research problem in computer vision. Object recognition [1][2][3][4] is the ability to perceive an object's physical attributes, such as regions of distinct spatial description, shape, color, and salient texture pattern. Several methods and approaches have been proposed to formulate object recognition. Some of the available approaches to modelling objects are geometric shape modelling, appearance-based modelling, and local or global feature modelling. One appearance-based technique is to compute the histogram [5][6] of intensities of a given image or region.
Geometric shape modeling [7][8][9][10][11] refers to the technique of capturing the shape of the object, or its outline, and representing it in a generalized form, more specifically as a mathematical function. Many methods have been established to capture the shape of an object: using dominant point detection and polygonal approximation, or treating the outline of an object as a collection of vertices, representing the object as a graph, and finding the most important vertices that determine the shape of the outline. In our approach, the boundary of the target is divided along its height by 32 equally spaced horizontal lines, and the widths of the object along these lines characterise its shape. For example, an upright triangle has a width that increases proportionally along its height, whereas a circle has a width that increases until it reaches the center and then decreases at the same rate. Any object, given its outline, can be identified in this manner. This method is detailed in the proposed work.
Vol 8 (S7) | April 2015 | www.indjst.org
In this paper, a novel approach called object recognition by feature weighted matrix is proposed. Our algorithm recognizes the object in two phases. In the first phase, the feature weighted matrix model is constructed from feature vectors estimated by geometrical modelling and histogram bins. In the second phase, the target objects are identified using the Diagonal Rank Matrix.

Proposed Work
The proposed method recognizes the target object in two phases. In the first phase, the front and top views of the training object are each acquired at distances of 50 cm and 100 cm, a weighted matrix for the training dataset is modelled from the corresponding feature vectors (geometric modelling and histogram bin), and the feature matrix for the test image is estimated. In the second phase, the Diagonal Rank Matrix (DRM) is formed to discern the category of the target object. An overview of the proposed method is illustrated in Figure 1.

Training Data Set
In this work, commonly used rigid household objects are considered as input. The training data set is collected as follows: the front and top views of the object are each acquired at distances of 50 cm and 100 cm, and for each view and distance five different shots of the same object are taken. The same procedure can be extended to n objects to obtain a sizeable input data set.

Feature Extraction
Feature extraction is the first phase of the proposed work. The acquired training dataset is fed into two feature extraction algorithms, geometrical modelling and histogram bin.
Geometric modelling uses classical edge detection algorithms in pre-processing to detect the boundary of the target. The detected boundary is divided along its height by 32 equally spaced horizontal lines, with spacing (ymax - ymin)/33, as illustrated in Figures 2a-2c. The length of each line, g(u, 1), g(u, 2), ..., g(u, 32), is used as a vector component. Eg_u is the corresponding feature vector of geometric modelling for the u-th image:
$$Eg_u = \langle g(u,1),\ g(u,2),\ \dots,\ g(u,32) \rangle$$
Histogram bin is a type of histogram that represents the tonal distribution of the object by relating each intensity range to the number of pixels that fall in it, grouped into bins (32 bins, as proposed), from which a feature vector is estimated. The feature vector for the histogram bin (Eh_u) is:
$$Eh_u = \langle h(u,1),\ h(u,2),\ \dots,\ h(u,32) \rangle$$
Figure 3a shows a sample input image and Figure 3b the corresponding histogram bin.
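The two feature vectors can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the object is assumed to be available as a binary mask (the edge detection step is not reproduced), the image is assumed to be 8-bit grayscale, and the names `geometric_features` and `histogram_features` are hypothetical.

```python
import numpy as np

N_FEATURES = 32  # number of horizontal lines / histogram bins used in the paper


def geometric_features(mask):
    """Widths of the object along 32 equally spaced horizontal lines.

    `mask` is a 2-D boolean array that is True on object pixels (assumed input).
    """
    rows = np.flatnonzero(mask.any(axis=1))
    y_min, y_max = rows[0], rows[-1]
    step = (y_max - y_min) / (N_FEATURES + 1.0)  # (ymax - ymin) / 33
    widths = []
    for k in range(1, N_FEATURES + 1):
        y = int(round(y_min + k * step))
        cols = np.flatnonzero(mask[y])
        # Width = span between the left-most and right-most object pixel.
        widths.append(cols[-1] - cols[0] + 1 if cols.size else 0)
    return np.array(widths, dtype=float)


def histogram_features(gray):
    """32-bin intensity histogram of an 8-bit grayscale image."""
    counts, _ = np.histogram(gray, bins=N_FEATURES, range=(0, 256))
    return counts.astype(float)


# Toy input: a filled upright triangle whose width grows from the apex downwards.
h, w = 100, 101
yy, xx = np.mgrid[0:h, 0:w]
triangle = np.abs(xx - w // 2) <= yy // 2
g = geometric_features(triangle)
hist = histogram_features((triangle * 200).astype(np.uint8))
```

For the triangle, the 32 widths never decrease from top to bottom, which matches the upright-triangle example given in the introduction.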

Weighted Matrix
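The weighted matrix is computed from the minimum matrix (m) and the maximum matrix (M) of the training feature vectors, which are constructed as follows.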

The Minimum Matrix
Let m(1,j), m(2,j), m(3,j) and m(4,j) be the minimum vectors of an object, where j varies from 1 to 32. The minimum matrix is obtained by finding the minimum of each component, along the columns, over the input dataset, which consists of the feature vectors obtained for five images of an object in the front view at a distance of 50 cm, five images in the front view at a distance of 100 cm, five images taken from the top view at a distance of 50 cm, and another five from the top view at a distance of 100 cm. The steps are illustrated below using intermediate matrices.
Minimum matrix for geometrical modelling: each minimum vector holds, for every component j, the smallest value of g(u, j) over the five images u of one view and distance.
Minimum vector m(1, j) is for the front view of an object placed at a distance of 50 cm.
Minimum vector m(4, j) is for the top view of the object placed at a distance of 100 cm.
Here g(1, j), ..., g(20, j) are the feature vectors of the input data set.
Minimum matrix for histogram bin: the minimum vectors are formed in the same way from the histogram feature vectors.
Minimum vector m(1, j) is for the front view of an object placed at a distance of 50 cm.
Minimum vector m(3, j) is for the front view of the object placed at a distance of 100 cm.
Minimum vector m(4, j) is for the top view of the object placed at a distance of 100 cm.
Here h(1, j), ..., h(20, j) are the feature vectors of the input data set.
The minimum vectors described above are combined along the column. The minimum matrices for geometrical modelling (m_g) and histogram bin (m_h), built from the g and h vectors respectively, both have the form
$$m_g,\ m_h = \begin{pmatrix} m_{(1,1)} & \cdots & m_{(1,32)} \\ \vdots & \ddots & \vdots \\ m_{(4n,1)} & \cdots & m_{(4n,32)} \end{pmatrix}$$
Where the size of m_g and m_h is (4n × 32) and n is the number of objects considered.
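A minimal NumPy sketch of this construction follows. It assumes the training feature vectors are stored in an array of shape (n, 4, 5, 32), that is, n objects, four view/distance groups, five shots per group and 32 components, and that rows are ordered object by object. Both the storage layout and the name `minimum_matrix` are assumptions made for illustration.

```python
import numpy as np


def minimum_matrix(features):
    """Component-wise minimum over the five shots of every view/distance group.

    features: array of shape (n_objects, 4, 5, 32) (assumed layout).
    Returns an array of shape (4 * n_objects, 32), one row per group.
    """
    features = np.asarray(features, dtype=float)
    n_objects, n_groups, _, n_components = features.shape
    mins = features.min(axis=2)  # shape (n_objects, 4, 32)
    return mins.reshape(n_objects * n_groups, n_components)


# Toy data: 2 objects, 4 groups, 5 shots, 32 components.
rng = np.random.RandomState(0)
train = rng.uniform(0.0, 100.0, size=(2, 4, 5, 32))
m = minimum_matrix(train)
```

With two objects the result has 4n = 8 rows, and no training shot of a group falls below that group's row.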

The Maximum Matrix (M)
The procedure for estimating the maximum matrix is similar to that for the minimum matrix; in this case, the maximum of each component along the columns of the input data set is taken. The steps are illustrated below using an intermediate matrix M(i,j).
Maximum matrix for geometrical modelling: each maximum vector holds, for every component j, the largest value over the five images of one view and distance.
Maximum vector M(1, j) is for the front view of an object placed at a distance of 50 cm.
Maximum vector M(2, j) is for the top view of the object placed at a distance of 50 cm.
Here I(1, j), ..., I(20, j) are the feature vectors of the input data set.
As was done for the minimum matrix, the maximum matrix is formed by combining the maximum vectors along the column. The matrices M_g and M_h are the maximum matrices for geometrical modelling and histogram bin, respectively.

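$$M_g,\ M_h = \begin{pmatrix} M_{(1,1)} & \cdots & M_{(1,32)} \\ \vdots & \ddots & \vdots \\ M_{(4n,1)} & \cdots & M_{(4n,32)} \end{pmatrix}$$
Where the size of M_g and M_h is (4n × 32) and n is the number of objects considered.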
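The abstract notes that the ranges obtained for different categories must not overlap. The sketch below builds both range matrices and flags pairs of rows whose [minimum, maximum] intervals intersect in every component. That criterion is only one possible reading of "overlap"; it, the (n, 4, 5, 32) storage layout, and the names `range_matrices` and `overlapping_rows` are assumptions made for illustration.

```python
import numpy as np


def range_matrices(features):
    """Minimum and maximum matrices, each of shape (4 * n_objects, 32).

    features: array of shape (n_objects, 4, 5, 32) (assumed layout).
    """
    features = np.asarray(features, dtype=float)
    n_objects, n_groups, _, n_components = features.shape
    shape = (n_objects * n_groups, n_components)
    return features.min(axis=2).reshape(shape), features.max(axis=2).reshape(shape)


def overlapping_rows(m, M):
    """Pairs (p, q), p < q, whose [min, max] ranges intersect in every component."""
    pairs = []
    for p in range(len(m)):
        for q in range(p + 1, len(m)):
            if np.all((m[p] <= M[q]) & (m[q] <= M[p])):
                pairs.append((p, q))
    return pairs


# Toy data: value = 100 * object + 10 * group + shot, identical in all 32 components.
obj = np.arange(2).reshape(2, 1, 1, 1)
grp = np.arange(4).reshape(1, 4, 1, 1)
shot = np.arange(5).reshape(1, 1, 5, 1)
train = (100.0 * obj + 10.0 * grp + shot) * np.ones((1, 1, 1, 32))
m, M = range_matrices(train)
separated = overlapping_rows(m, M)

clash = train.copy()
clash[1, 0] -= 98.0  # move object 1, group 0 onto the range of object 0, group 0
m2, M2 = range_matrices(clash)
overlapping = overlapping_rows(m2, M2)
```

Running such a check on the training data before testing would indicate which categories are at risk of producing multiple hits.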

Calculating Matrix T from Vector t
Given an input test image, its feature vectors (t_g and t_h for the geometric model and histogram bin, respectively) are obtained from the feature extraction module. The matrices T_g and T_h are obtained from the feature vectors t_g and t_h separately, by replicating each vector 4n times along the rows (*). Initially, the feature vector in T_g / T_h is assumed to belong to a true positive object; in the subsequent steps, the object is recognized using the matrices R, r and DRM. Each matrix therefore consists of 4n identical rows:
$$T_i = \begin{pmatrix} t_i \\ \vdots \\ t_i \end{pmatrix} \quad (4n \text{ rows}), \qquad i \in \{g, h\}$$
(*) For the reader's understanding, T_g is obtained by replicating t_g 4n times so that the matrix subtraction is well defined. Since the minimum and maximum matrices are of order 4n × 32, the input matrix is replicated to the same order.
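In NumPy the replication can be written with `np.tile`, as in the short sketch below; the value of n and the contents of the test vector are placeholders chosen for the example.

```python
import numpy as np

n = 2                                # number of trained objects (example value)
t_g = np.arange(32, dtype=float)     # stand-in for a 32-component test feature vector
T_g = np.tile(t_g, (4 * n, 1))       # 4n identical rows, shape (4n, 32)
```

NumPy broadcasting would also let a (32,) vector be subtracted from a (4n, 32) matrix directly, so the explicit replication mainly serves to mirror the notation of the paper.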

Deviation from Maximum (R)
The input image feature matrix T_i is subtracted from the maximum matrix M_i. The corresponding deviation matrix R_i is obtained as:
$$R_i = M_i - T_i$$
Where T_i is the input matrix replicated k times (k = 4n, the number of rows of M_i) and i ∈ {g, h}.

Deviation from Minimum (r)
The minimum matrix m_i is now subtracted from the input test matrix T_i. The corresponding deviation matrix r_i is obtained as:
$$r_i = T_i - m_i$$
Where T_i is the input matrix replicated k times (k = 4n, the number of rows of m_i) and i ∈ {g, h}.
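Both deviation matrices can be computed together, as in the following sketch. The function name `deviations` and the toy values (k = 2 rows and 3 components instead of 4n rows and 32 components) are assumptions made to keep the example small.

```python
import numpy as np


def deviations(t, m, M):
    """Return (R, r) = (M - T, T - m) for a test vector t replicated row-wise."""
    m = np.asarray(m, dtype=float)
    M = np.asarray(M, dtype=float)
    T = np.tile(np.asarray(t, dtype=float), (m.shape[0], 1))
    return M - T, T - m


# Toy example with k = 2 rows and 3 components.
m = np.array([[1.0, 1.0, 1.0], [5.0, 5.0, 5.0]])
M = np.array([[3.0, 3.0, 3.0], [9.0, 9.0, 9.0]])
t = np.array([2.0, 2.5, 1.5])
R, r = deviations(t, m, M)
```

Here the test vector lies inside the range of the first row, so both deviations are positive there, while it lies below the minimum of the second row, so the corresponding row of r is negative.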

Formulating the Weighted Matrix (S, of size k × k)
The weighted matrix (S) estimates the likelihood of recognizing the target object. It is obtained by multiplying the deviation-from-maximum matrix R_i by the transpose of the deviation-from-minimum matrix r_i:

$$S = R_i \, (r_i)^{T}$$
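The product can be sketched in NumPy as follows, using toy deviation matrices with k = 2 rows and 3 components (the values are illustrative only). The sketch also shows that a diagonal entry of S pairs row p of R_i with the same row p of r_i, whereas off-diagonal entries mix rows belonging to different categories.

```python
import numpy as np

# Toy deviation matrices with k = 2 rows and 3 components (the paper uses 4n x 32).
R = np.array([[1.0, 0.5, 1.5], [7.0, 6.5, 7.5]])      # M - T
r = np.array([[1.0, 1.5, 0.5], [-3.0, -2.5, -3.5]])   # T - m

S = R.dot(r.T)                      # weighted matrix, shape (k, k)
row_wise = np.sum(R * r, axis=1)    # sum of products within each row
```

The diagonal of `S` equals `row_wise`, so each diagonal entry is the sum of the 32 per-component products for one trained category.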
Consider a general scenario in which a and b are two numbers such that a < b. For an input x, the sign of the product (b - x)(x - a) depends on whether x lies between a and b.
The function given below summarizes this:
$$(b - x)(x - a) \ \begin{cases} > 0, & a < x < b \\ = 0, & x = a \ \text{or} \ x = b \\ < 0, & x < a \ \text{or} \ x > b \end{cases}$$
Each element in S is a sum of products of the corresponding values in R_i and r_i. Each product remains positive when x lies between a and b, and becomes negative when x does not satisfy this condition. Hence, positive components of S signify that the components of the test vector fall between the corresponding components of the minimum and maximum matrices. If all the components of a row in S are positive, the test input belongs to the category of the vector specified in that row.
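As a small worked example, with values chosen only for illustration, take a = 2 and b = 5 and evaluate the product for one input inside the range and two outside it:

```latex
\begin{align*}
x = 3:&\quad (b - x)(x - a) = (5 - 3)(3 - 2) = 2 > 0 \\
x = 6:&\quad (b - x)(x - a) = (5 - 6)(6 - 2) = -4 < 0 \\
x = 1:&\quad (b - x)(x - a) = (5 - 1)(1 - 2) = -4 < 0
\end{align*}
```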

Diagonal Rank Matrix (DRM)
The diagonal rank matrix is the final step of the proposed model and is used to recognize the object by ranking. It is obtained from the weighted matrix S by element-wise multiplication (**) with the identity matrix I_4n:
$$DRM = S \circ I_{4n}$$
(**) The multiplication is performed on corresponding elements, unlike normal matrix multiplication, and without any subsequent addition. The diagonal elements are multiplied by 1 and are thus retained, while the other elements (which are of no consequence) are multiplied by 0 and nulled.
The diagonal elements in DRM are the sums of products of R_i and the transpose of r_i. The largest positive value among the diagonal elements indicates the most likely category of the target object.
As mentioned earlier, whenever either factor of a product is negative, the product is negative. This negative value reduces the sum and thus the weight. If both factors are positive, the positive product adds to the weight and thereby increases the likelihood. Hence, the row with the largest value identifies the category, and the row with the smallest value is the category to which the test input is least likely to belong.
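The complete decision step can be sketched as below. This is a minimal illustration rather than the authors' code: the function name `classify`, the toy ranges (k = 2 rows, 3 components) and the pairing of row labels with rows are assumptions, and an input is rejected when no diagonal entry is positive, following the behaviour reported for the untrained object in Case 3.

```python
import numpy as np


def classify(t, m, M, labels):
    """Return (label, DRM); label is None when no diagonal entry is positive."""
    m = np.asarray(m, dtype=float)
    M = np.asarray(M, dtype=float)
    T = np.tile(np.asarray(t, dtype=float), (m.shape[0], 1))
    S = (M - T).dot((T - m).T)          # weighted matrix, k x k
    drm = S * np.eye(m.shape[0])        # element-wise product keeps only the diagonal
    scores = np.diag(drm)
    best = int(np.argmax(scores))
    return (labels[best] if scores[best] > 0 else None), drm


labels = ["CF50", "KF100"]  # row labels, assigned to the toy rows for illustration
m = np.array([[1.0, 1.0, 1.0], [5.0, 5.0, 5.0]])
M = np.array([[3.0, 3.0, 3.0], [9.0, 9.0, 9.0]])
hit, drm_hit = classify([2.0, 2.5, 1.5], m, M, labels)
miss, drm_miss = classify([20.0, 20.0, 20.0], m, M, labels)
```

The first test vector falls inside the range of the first row and is assigned its label; the second lies outside every range, so all diagonal entries are negative and no category is returned.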

Implementation Details
We have identified commonly used household objects, namely a coffee bottle, a ketchup bottle and a coffee cup. The first two objects are used as the training data set in this work. The different views and distances are summarized in Table 1. The algorithm was implemented using OpenCV, programmed in Python 2.7, and tested on a Pentium i5 processor with 4 GB RAM; the training and testing images were taken with an ordinary 2-megapixel web camera. Figure 4 shows the different views and distances of the training data set.

Geometric Modelling
Figures 5 and 6 illustrate the minimum and maximum feature vectors for the geometrical model of the coffee bottle and the ketchup bottle. The red line in these figures describes the minimum widths, representing the minimum matrix values of the corresponding category, and the blue line describes the maximum widths. Note that this graphical representation is only for ease of visualisation and is not used in any calculation.

Histogram Bins
The minimum and maximum feature vectors for the histogram bin of the coffee bottle and the ketchup bottle are shown in Figures 7 and 8. The red line represents the minimum value of the corresponding bin for each category, and the blue line represents the maximum value. For a given test image to qualify for one of the categories, the input's histogram bin feature vector has to lie primarily between the blue and the red lines; otherwise, the test image does not belong to the category.
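The visual criterion can be expressed as a simple range test. The sketch below uses the strictest reading, in which every component must lie between the two lines with the bounds included; that reading, the function name `within_range` and the three-bin toy values are assumptions made for illustration.

```python
import numpy as np


def within_range(t, lower, upper):
    """True when every component of t lies between lower and upper (inclusive)."""
    t = np.asarray(t, dtype=float)
    return bool(np.all((t >= lower) & (t <= upper)))


lower = np.array([10.0, 20.0, 5.0])   # red line: per-bin minimum (toy values)
upper = np.array([30.0, 45.0, 9.0])   # blue line: per-bin maximum (toy values)
inside = within_range([12.0, 40.0, 9.0], lower, upper)
outside = within_range([12.0, 50.0, 9.0], lower, upper)
```

The second vector exceeds the maximum in its second bin and is therefore rejected under this strict test.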

Results and Analysis
As can be observed for case 1, the output generated is CF50, that is, the coffee bottle identified from the front view at a distance of 50 cm, thereby correctly categorising the input.

Case 2: Identification of ketchup bottle
The front view of the ketchup bottle at a distance of 100 cm is taken from the training data set; Figure 12 illustrates it. The minimum and maximum feature vectors of the geometric model are plotted against the training data set in Figure 13 (a-d). The largest positive value in the diagonal rank matrix described below corresponds to KF100. Evidently, the target object is identified as the ketchup bottle in front view at a distance of 100 cm. This is visualized in Figure 13c, which compares the ketchup bottle front view at 100 cm with the training data set front view at 100 cm.

Diagonal Rank Matrix
A similar representation for the histogram bin is illustrated in Figure 14.
The greatest value is identified with the category KF100, as shown in the following DRM.

Case 3: Object not contained in the training data set
The front view of the coffee cup at a distance of 100 cm, which is not present in the training data set, is given as the test input image; Figure 15 illustrates it. The diagonal rank matrices of the geometrical model and histogram bin, presented below, reveal that the object is not identified with any of the trained categories, since all the diagonal entries are non-positive.

Diagonal Rank Matrix (Histogram bin)
Diagonal Rank Matrix (Geometric modeling)

The following table summarizes the results of our experiments with the proposed algorithm.

Conclusion
In this research we have implemented a novel approach called object recognition using a feature Weighted Matrix (WM). The proposed method identifies the object in different views, namely the front and top views. It improves the results under varying illumination for the geometry-based feature vector, and under scale variation for both the geometry-based and histogram bin feature vectors. The feature weighted matrix can be extended to handle multiple objects present in an image and other invariance conditions.
For geometrical modelling, minimum vector m(2, j) corresponds to the top view of the object placed at a distance of 50 cm, and minimum vector m(3, j) corresponds to the front view of the object placed at a distance of 100 cm.

Figure 2a. Sample input image; Figure 2b. Boundary of the object; Figure 2c. 32 equally spaced horizontal lines.
For geometrical modelling, maximum vector M(3, j) corresponds to the front view of the object placed at a distance of 100 cm. Maximum matrix for histogram bin: maximum vector M(1, j) corresponds to the front view of an object placed at a distance of 50 cm, and maximum vector M(3, j) corresponds to the front view of the object placed at a distance of 100 cm.

Figure 5. Minimum and maximum feature vector visualization for the geometrical model of the coffee bottle.

Figure 6. Minimum and maximum feature vector visualization for the geometrical model of the ketchup bottle.

Figure 7. Minimum and maximum feature vector visualization for the histogram bin of the coffee bottle.

Figure 8. Minimum and maximum feature vector visualization for the histogram bin of the ketchup bottle.

Figure 9. Input image for coffee bottle.

Figure 11. Histogram bin for coffee bottle vs. training data set.

Figure 14. Histogram bin for ketchup bottle vs. training data set.

Figure 13 (a-d clockwise). Minimum and maximum feature vector visualization for the geometrical model of the ketchup bottle vs. training data set.

Figure 15. Input image for coffee cup.

Table 2. Testing results summary