#### A. Sridevi<sup>1</sup>, G. K. D. Prasanna Venkatesan<sup>2</sup> and R. Priyadharshini<sup>1</sup>

<sup>1</sup>M.Kumarasamy College of Engineering (Autonomous), Karur – 639113, Tamil Nadu, India; sridevigunasekaranphd@gmail.com, priyadh97@hmail.com <sup>2</sup>SNS College of Engineering, Coimbatore – 641107, Tamil Nadu, India, prasphd@gmail.com

### Abstract

**Objectives**: Content Addressable Memory (CAM) is used in many application fields to obtain better performance with highly efficient output, which requires minimum delay and minimum power consumption. **Methods/Statistical Analysis**: CAM is used in real-time applications such as data search, associative computing and networking, all of which involve a high-speed search process. A banked architecture is proposed to control the power dissipated, which is achieved by making small variations in the hardware architecture. The proposed pre-reckoning based CAM architecture uses a parameter extractor that saves power and time. **Findings:** The parameter extractor first selects the processing operator; based on the selected operator, it chooses the neighboring parameters and discards the distant ones. The proposed design is implemented with the Xilinx software, and the power consumed in every module is accumulated during execution. **Application/Improvements:** A comparison of the banked architecture with the existing one supports this choice. The proposed banked pre-reckoning architecture reduces the average power consumption by up to 30% compared with the existing one.

Keywords: Banked Architecture, CAM, Parameter Extractor, Power Consumption, Pre-Reckoning

## 1. Introduction

Content Addressable Memory (CAM) is a type of storage element that accumulates and holds data for later processing. CAM is used in Asynchronous Transfer Mode (ATM) networks to transfer data from one point to one or more destinations without distortion, because its lookup table can search the stored data with minimal delay.<sup>1</sup> Each input data word is continuously compared with the data held in the memory. A large number of comparisons are executed, and when identical data is found, the corresponding address is returned without alteration. The major disadvantage of this design is the power consumption of the CAM.<sup>1</sup> Several works have tried to reduce the match-line power, typically at the cost of degraded performance.<sup>2-6</sup>

Reduced power, voltage and cost, together with increased reliability, can be achieved with the Pre-Reckoning based Content Addressable Memory (PR-CAM) architecture. The proposed design alters the hardware so that the power consumed by the CAM architecture stays at an acceptable level.<sup>2</sup>

The Block-XOR approach is introduced in this paper to reduce the overlapping operations of one Pre-Reckoning based (PR) CAM architecture with another. The proposed architecture uses a parameter extractor that decreases the power consumption by minimizing these overlapping operations, which in turn further reduces the size and power.<sup>6,2</sup>

### 1.1 Principle

The main objective of this work is to reduce the overlapping operations of one PR-CAM architecture with another. The proposed approach introduces a new parameter extractor that reduces this overlap; once the overlap is reduced, the power consumption of the architecture is minimized.

## 2. Existing Methodology

### 2.1 Existing Methodology

The existing CAM design is shown in Figure 1. In this approach, each input data word is compared with the data stored in the memory. If the input does not match a stored value, the search returns to the initial address location; if the input and the stored data match, the output is generated. This operation is repeated for every search and for every address location, so the net power consumed per search increases. If the architecture is changed so that it pre-calculates values before the comparison, the power consumed can be reduced further.
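The exhaustive comparison described above can be illustrated with a minimal software sketch. It is only a behavioral model, not the hardware design of Figure 1; the function name `cam_search` and the sample memory contents are assumptions made for illustration.

```python
def cam_search(stored_words, key):
    """Compare the key with every stored word, as a conventional CAM does.

    Returns the list of matching addresses and the number of word
    comparisons performed (always one per stored word).
    """
    matches = []
    comparisons = 0
    for address, word in enumerate(stored_words):
        comparisons += 1          # every address location is compared
        if word == key:
            matches.append(address)
    return matches, comparisons


# Hypothetical 14-bit memory contents, chosen only for illustration.
memory = [0b00000000000101, 0b11110000111100,
          0b00000000000101, 0b10101010101010]
addresses, comparisons = cam_search(memory, 0b00000000000101)
print(addresses, comparisons)
```

In this model the number of comparisons always equals the number of stored words, which is the cost that a pre-calculation stage tries to avoid.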



Figure 1. Content Addressable Memory Architecture.

## 3. Proposed Methodology

### 3.1 Pre-Reckoning based Content Addressable Memory

The memory organization of the proposed pre-reckoning based CAM architecture is shown in Figure 2. It contains three main modules: the data memory, the parameter memory and the parameter extractor.<sup>2-5</sup> The proposed banked architecture is shown in Figure 3, and every block is implemented.



Figure 2. Pre-reckoning based CAM Architecture.



Figure 3. Banked Architecture.

The data search process of the memory organization is divided into two sections. The first section consists of the parameter extractor, to which the input data is sent, and the parameter memory. The second section consists of the data memory, in which the input data is stored. The first section is modified so that a pre-calculation is carried out to reduce the power consumed, and the filtering step it introduces decreases the power consumption further. The parameter extractor is implemented with either a one's count processor or a Block-XOR approach, and the choice between them changes the power consumption. Power is decreased by using the 14-bit-input one's count approach and then converting it to a 32-bit Block-XOR approach.<sup>7</sup>

### 3.2 One's Count Approach (14-Bit Data)

In the proposed PR-CAM approach, the extractor counts the number of ones in the input data, and the input and output widths are in the ratio 14:4. The outputs are denoted $S_0$, $S_1$, $S_2$ and $S_3$. For example, if the output is 0101, the parameter p5 is selected; if there is no matching parameter, the search returns to the address location and looks for the data among the stored entries. The implementation is shown in Figure 4, and the output results are tabulated below.
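As a rough software illustration of the 14:4 mapping, the sketch below counts the ones in a 14-bit word and formats the result as a 4-bit parameter. The function names are hypothetical, and the sketch assumes the parameter is simply the binary count of ones, which is consistent with the 0101 to p5 example above.

```python
def ones_count_parameter(data, width=14):
    """Return the number of 1 bits in a `width`-bit word (0..width)."""
    if not 0 <= data < (1 << width):
        raise ValueError(f"data must fit in {width} bits")
    return bin(data).count("1")


def parameter_bits(parameter):
    """Format the parameter as the 4-bit output string."""
    return format(parameter, "04b")


word = 0b00000000011111           # five 1 bits
p = ones_count_parameter(word)
print(parameter_bits(p), f"-> p{p}")
```

Because a 14-bit word has at most fourteen ones, the 4-bit code 1111 is never produced by the count itself.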

Figure 4. One's count parameter extractor (14-bit data).

Table 1 shows the output of the parameter extractor. From the table, we can estimate the amount of data associated with the same parameter and its average probability; the calculation is given below.

Table 1. One's Count Parameter Extractor Output

| Parameter | Decimal value | Data associated with the same parameter (count) | Probability % (average) |
|-----------|---------------|--------------------------------------------------|-------------------------|
| 0000      | 0             | 1                                                | 0.01%                   |
| 0001      | 1             | 12                                               | 0.07%                   |
| 0010      | 2             | 27                                               | 0.16%                   |
| 0011      | 3             | 446                                              | 2.7%                    |
| 0100      | 4             | 2257                                             | 13.77%                  |
| 0101      | 5             | 3105                                             | 18.95%                  |
| 0110      | 6             | 3672                                             | 22.41%                  |
| 0111      | 7             | 4115                                             | 25.11%                  |
| 1000      | 8             | 3672                                             | 22.41%                  |
| 1001      | 9             | 3105                                             | 18.95%                  |
| 1010      | 10            | 2257                                             | 13.77%                  |
| 1011      | 11            | 446                                              | 2.7%                    |
| 1100      | 12            | 27                                               | 0.16%                   |
| 1101      | 13            | 12                                               | 0.07%                   |
| 1110      | 14            | 1                                                | 0.01%                   |
| 1111      | 15            | Valid bit                                        |                         |

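For the parameter with $n = 3$ ones, the number of data words associated with the same parameter is written as the binomial coefficient

$$\binom{14}{3} = \frac{14 \times 13 \times 12}{3 \times 2 \times 1}$$

This expression evaluates to 364, whereas Table 1 lists 446 for this parameter; the tabulated value is the one used in the probability calculation that follows.

Number of data associated with the same parameter = 446

The average probability is obtained as

$$\text{Probability} = \frac{\text{Number of data linked with the same parameter}}{2^{14}} = \frac{446}{16384} \approx 0.027$$

Probability (%) = 2.7%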
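The counting argument can be cross-checked with a short sketch. It assumes that every one of the $2^{14}$ input patterns is equally likely, so that the number of words with $k$ ones is the binomial coefficient $\binom{14}{k}$; the function name is illustrative.

```python
from math import comb


def ones_count_histogram(width=14):
    """Number of `width`-bit words having exactly k ones, for k = 0..width."""
    return [comb(width, k) for k in range(width + 1)]


hist = ones_count_histogram()
total = 2 ** 14
for k, count in enumerate(hist):
    # parameter code, word count, share of all 14-bit patterns
    print(f"{k:04b}  {count:5d}  {100 * count / total:6.2f}%")
```

Under this assumption the counts sum to $2^{14} = 16384$ and peak at seven ones, which reproduces the bell-shaped profile that motivates a more uniform parameter extractor.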

The values in Table 1 show that, for several parameters, the number of data words related to the same parameter is in the range of 2000-3000. This is a major drawback of the one's count approach, because it results in increased power consumption. The input patterns of the one's count approach exhibit a Gaussian distribution, as shown in Table 1. The Block-XOR approach is implemented to deal with this distribution: it distinguishes minor variations in the data values, and each deviation can be computed. With the 14-bit Block-XOR implementation, this in turn decreases the power consumption<sup>7</sup>. The main idea behind the proposed PR-CAM is to reduce the number of comparison operations by removing the Gaussian distribution of the parameters. For example, for 14-bit input data, the aim is to spread the input data evenly over the parameters so that fewer comparison operations are required. The number of processing stages in the comparison operation is reduced by the new parameter extractor that is proposed.

The proposed approach involves several processing steps. Step 1 divides the input data bits into a number of blocks. Step 2 computes the output of each block through an XOR logic operation. Step 3 takes the obtained output as the input for the next comparison process. The Block-XOR approach alone does not indicate whether the obtained output is valid. If the data is not valid, it cannot be admitted to the PR-CAM<sup>6</sup>, so the validity of the data is needed to implement the PR-CAM architecture<sup>7</sup>. A multiplexer is also added to the existing architecture so that the 14-bit Block-XOR approach spreads the parameters evenly, as shown in Figure 5. The 32-bit Block-XOR parameter extractor shown in Figure 6 and the 32-bit Block-XNOR parameter extractor shown in Figure 7 are also realized. The select signal is obtained in the form

$$S = A_3 \cdot A_2 \cdot A_1 \cdot A_0 \qquad (1)$$

The signal S is asserted only when all inputs are ones ($A_3A_2A_1A_0 = $ "1111"); if any one of the inputs is 0, S is not asserted. The valid output parameters range from 0000 to 1110, and the remaining value is 1111. When that value occurs, the extractor chooses "D13 D12 D11 D10" instead, so that the parameter "1111" is rejected. Table 2 displays the final output of the parameter extractor.
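The steps above can be sketched in software as follows. The text does not specify how the 14 bits are partitioned, so the block boundaries in `BLOCKS` are an assumption made purely for illustration, as are the function and variable names; only the select signal of Equation (1) and the fallback to "D13 D12 D11 D10" come from the description.

```python
# Assumed partition of bits D13..D0 into four blocks (not given in the text).
BLOCKS = ((13, 12, 11, 10), (9, 8, 7, 6), (5, 4, 3), (2, 1, 0))


def block_xor_parameter(data):
    """Return a 4-bit parameter for a 14-bit word using block-wise XOR."""
    if not 0 <= data < (1 << 14):
        raise ValueError("data must fit in 14 bits")
    a = []
    for block in BLOCKS:
        parity = 0
        for bit_index in block:
            parity ^= (data >> bit_index) & 1   # XOR of the bits in the block
        a.append(parity)                        # a[0]=A3, ..., a[3]=A0
    select = a[0] & a[1] & a[2] & a[3]          # Equation (1): S = A3.A2.A1.A0
    if select:
        return (data >> 10) & 0xF               # multiplexer picks D13 D12 D11 D10
    return (a[0] << 3) | (a[1] << 2) | (a[2] << 1) | a[3]


print(format(block_xor_parameter(0b00000000000001), "04b"))
```

With this assumed partition, S = 1 implies that the block D13..D10 has odd parity, so the substituted value can never itself be 1111; the code 1111 therefore stays free to serve as the valid bit.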

### 3.3 Parameter Memory

The output parameter is used to pick out the matching input parameter values. For example, if the output value is 0101, the input parameter p5 is chosen. Figure 8 shows the parameter memory configuration. Finite impulse filtering is used to match the data values or to discard the dissimilar ones, and this process is the pre-reckoning. This filtering is done as the first step of the calculation. The second step is the CAM execution, in which the values that pass the filter are given as input to the data memory<sup>4,6,7</sup>.
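The two-step search (parameter filtering followed by the data memory comparison) can be modeled as below. This is a behavioral sketch under stated assumptions: the class `PRCam`, its methods and the sample words are hypothetical, and a plain one's count is used as the extractor only to keep the example short.

```python
def ones_count(word):
    """Illustrative parameter extractor: number of 1 bits in the word."""
    return bin(word).count("1")


class PRCam:
    """Behavioral model of a pre-reckoning CAM with a parameter memory."""

    def __init__(self, extractor=ones_count):
        self.extractor = extractor
        self.data_memory = []
        self.parameter_memory = []

    def write(self, word):
        """Store the word and its pre-computed parameter; return its address."""
        self.data_memory.append(word)
        self.parameter_memory.append(self.extractor(word))
        return len(self.data_memory) - 1

    def search(self, key):
        """Step 1: filter on the parameter. Step 2: compare surviving words."""
        key_parameter = self.extractor(key)
        matches, data_comparisons = [], 0
        for address, parameter in enumerate(self.parameter_memory):
            if parameter != key_parameter:
                continue                      # discarded by the pre-reckoning step
            data_comparisons += 1             # full data comparison only here
            if self.data_memory[address] == key:
                matches.append(address)
        return matches, data_comparisons


cam = PRCam()
for w in (0b101, 0b111, 0b110000, 0b101):
    cam.write(w)
print(cam.search(0b101))
```

Only the entries whose stored parameter equals the parameter of the key reach the data comparison, so a more evenly spread parameter leaves fewer full comparisons per search.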



Figure 5. Block XOR parameter extractor (14-bit data).









Table 2. Output of Block-XOR Parameter Extractor

| Parameter | Decimal value | Data associated with the same parameter (count) | Probability % (average) |
|-----------|---------------|--------------------------------------------------|-------------------------|
| 0000      | 0             | 1086                                             | 6.62%                   |
| 0001      | 1             | 1254                                             | 7.65%                   |
| 0010      | 2             | 1254                                             | 7.65%                   |
| 0011      | 3             | 1086                                             | 6.62%                   |
| 0100      | 4             | 1254                                             | 7.65%                   |
| 0101      | 5             | 1086                                             | 6.62%                   |
| 0110      | 6             | 1086                                             | 6.62%                   |
| 0111      | 7             | 1254                                             | 7.65%                   |
| 1000      | 8             | 1254                                             | 7.65%                   |
| 1001      | 9             | 1086                                             | 6.62%                   |
| 1010      | 10            | 1086                                             | 6.62%                   |
| 1011      | 11            | 1254                                             | 7.65%                   |
| 1100      | 12            | 1086                                             | 6.62%                   |
| 1101      | 13            | 1254                                             | 7.65%                   |
| 1110      | 14            | 1254                                             | 7.65%                   |
| 1111      | 15            | Valid bit                                        |                         |


Figure 8. Parameter Memories.

## 4. Simulation Result

The area, power and delay of the 14-bit one's count and Block-XOR parameter extractors are compared in Table 3; the results were obtained with the Xilinx ISE software. The architectures of the NOR-type and NAND-type CAM designs are shown in Figures 9 and 10. Both designs can be adapted to the PR-XOR CAM, and splitting the data further reduces the hardware complexity of encoding and decoding the match lines after comparison.



Figure 9. NOR type CAM.



Figure 10. NAND type CAM.

Table 3. Comparison of Power, Delay and Area

| 14-Bit Data          | Area (gate count) | Delay (ns) | Power (mW) |
|----------------------|-------------------|------------|------------|
| One's count approach | 98                | 130.65     | 662        |
| Block XOR approach   | 12                | 68.85      | 584        |
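The relative savings implied by Table 3 can be computed directly from the tabulated values. The sketch below only performs that arithmetic; the dictionary keys and the helper name are illustrative.

```python
# Values copied from Table 3.
table3 = {
    "ones_count": {"area_gates": 98, "delay_ns": 130.65, "power_mw": 662},
    "block_xor": {"area_gates": 12, "delay_ns": 68.85, "power_mw": 584},
}


def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


reductions = {
    metric: relative_reduction(table3["ones_count"][metric],
                               table3["block_xor"][metric])
    for metric in table3["ones_count"]
}
for metric, pct in reductions.items():
    print(f"{metric}: {pct:.1f}% lower with Block XOR")
```

These percentages apply only to the two parameter extractors listed in Table 3.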

## 5. Discussion and Conclusion

### 5.1 Discussion

The Block-XOR and Block-XNOR approaches spread the parameter values uniformly over input data that follows a Gaussian distribution. The results indicate that the proposed Block-XOR PR-CAM reduces the number of processing cycles while supporting a longer bit length. As the number of compared bits increases, the data bit length increases, which in turn reduces the power consumption. The proposed Block-XOR PR-CAM is therefore more appropriate for wide-input CAM applications.

### 5.2 Conclusion

With the Block-XOR parameter extractor, considerably less power is consumed, and the design minimizes the number of comparison operations. The 14-bit one's count PR-CAM consumes 662 mW, whereas the 14-bit Block-XOR PR-CAM consumes 580 mW; the power use can be made more economical still with the proposed block design.

## 6. Future Enhancement

Power and area can be reduced further in the PR-CAM architecture by replacing the Block-XOR operation with the Block-XNOR operation. The NOR-type and NAND-type CAM designs can also be designed and implemented.

## 7. References

- 1. Ye Y, Du Y, Jing W, Li X, Song Z, Chen B. CAM-based Retention-Aware DRAM (CRA-DRAM) for Refresh Power Reduction. IEICE Electronics Express. 2017; 14(1):1–11.
- 2. Efthymiou A, Garside JD. A CAM with mixed serial-parallel comparison for use in low energy caches. IEEE Transactions on Very Large Scale Integration (VLSI) Systems. 2004; 12(3).
- 3. Arsovski I, Chandler T, Sheikholeslami A. A Ternary Content-Addressable Memory (TCAM) Based on 4T Static Storage and Including a Current-Race Sensing Scheme. IEEE Journal of Solid-State Circuits. 2003; 38(1):155–8.
- 4. Anh T, Shoushun D, Zhi C, Kiat S. A High Speed Low Power CAM With a Parity Bit and Power-Gated ML Sensing. IEEE Transactions on Very Large Scale Integration (VLSI) Systems. 2013; 21(1):7–15.
- 5. Arsovski I, Chandler T, Sheikholeslami A. A ternary content addressable memory (TCAM) based on 4T static storage and including a current-race sensing scheme. IEEE Journal of Solid-State Circuits. 2003; 38(1):155–8.
- 6. Zackriya VM, Verma A, Kittur HM. Design of multi-segment hybrid type content addressable memory in high performance FinFET technologies. Indian Journal of Science and Technology. 2015; 8(24).
- 7. Ibrahim Q. Design & implementation of high speed network devices using SRL16 reconfigurable content addressable memory (RCAM). International Arab Journal of e-Technology. 2011; 2(2):1–10.