# High-Speed and Energy-Efficient Carry Look-Ahead Adder

## Abstract

## 1. Introduction

## 2. Conventional and Proposed CLA Architectures

$C_Q$ represents the carry input to the Qth bit position, $C_{Q+1}$ represents the carry output from the Qth bit position, $G_Q$ refers to the generate signal, and $P_Q$ refers to the propagate signal corresponding to the Qth bit position. In Equation (1), $P_Q$ is obtained by performing a logical XOR of input bits $X_Q$ and $Y_Q$ (i.e., $P_Q = X_Q \oplus Y_Q$), and $G_Q$ is obtained by performing a logical AND of input bits $X_Q$ and $Y_Q$ (i.e., $G_Q = X_Q Y_Q$). The sum bit corresponding to the Qth bit position is produced based on Equation (2).
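Equations (1) and (2) are not reproduced in this text. As a hedged reconstruction from the definitions above (the standard carry recurrence and sum expression, not a verbatim copy of the authors' typesetting; the symbol $\mathrm{Sum}_Q$ is an assumed name), they would read:

$$
\begin{aligned}
C_{Q+1} &= G_Q + P_Q C_Q, \qquad &(1)\\
\mathrm{Sum}_Q &= P_Q \oplus C_Q, \qquad &(2)
\end{aligned}
$$

where $P_Q = X_Q \oplus Y_Q$ and $G_Q = X_Q Y_Q$. For example, with $X_Q = 1$, $Y_Q = 0$, and $C_Q = 1$, we get $P_Q = 1$ and $G_Q = 0$, so $C_{Q+1} = 1$ and $\mathrm{Sum}_Q = 0$.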

$C_K$ represents the carry input to a 4-bit CLA, $P_{K+3}$ to $P_K$ represent the propagate signals, $G_{K+3}$ to $G_K$ represent the generate signals, and $C_{K+4}$ to $C_{K+1}$ represent the look-ahead carry outputs derived. Equation (4) is deduced by substituting the expression of $C_{K+1}$ given in Equation (3), Equation (5) is deduced by substituting the expression of $C_{K+2}$ given in Equation (4), and Equation (6) is deduced by substituting the expression of $C_{K+3}$ given in Equation (5). In Equations (3)–(6), the look-ahead carry outputs $C_{K+1}$ to $C_{K+4}$ depend only on the carry input $C_K$ and not on any intermediate carry. Hence, the look-ahead carry outputs can be generated in parallel; they are used to generate the sum bits of the CLA in parallel and to provide the carry input to the subsequent stage. The conventional CLA implementing Equations (3)–(6) is shown in Figure 1.
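The substitution described above can be checked with a short behavioural model. The following is a minimal Python sketch, assuming the usual expanded sum-of-products form of the four look-ahead carries (Equations (3)–(6) themselves are not reproduced in this text); the function names `cla4`, `to_bits`, `from_bits`, and `check_cla4` are illustrative and do not come from the paper.

```python
from itertools import product


def cla4(x, y, c_in):
    """Add two 4-bit operands given as bit lists (index 0 = position K, the LSB)."""
    p = [a ^ b for a, b in zip(x, y)]  # propagate signals P_K .. P_{K+3}
    g = [a & b for a, b in zip(x, y)]  # generate signals G_K .. G_{K+3}

    # Every carry is written directly in terms of c_in, so none of them
    # waits for another carry (assumed expanded forms of Eqs. (3)-(6)).
    c1 = g[0] | (p[0] & c_in)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c_in)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c_in)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0])
          | (p[3] & p[2] & p[1] & p[0] & c_in))

    carries = [c_in, c1, c2, c3]                  # carry into each bit position
    sums = [p[i] ^ carries[i] for i in range(4)]  # sum bit = P xor carry-in
    return sums, c4


def to_bits(value, width=4):
    """Integer -> list of bits, LSB first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """List of bits (LSB first) -> integer."""
    return sum(b << i for i, b in enumerate(bits))


def check_cla4():
    """Exhaustively compare the model against integer addition."""
    for a, b, c in product(range(16), range(16), (0, 1)):
        sums, c_out = cla4(to_bits(a), to_bits(b), c)
        if from_bits(sums) + (c_out << 4) != a + b + c:
            return False
    return True
```

Calling `check_cla4()` covers all 512 input combinations. Note that the expression for `c4` contains a 5-input OR of terms, one of which is a 5-input AND, which is where the wide gates of the conventional design come from.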

With $C_K = 0$ in Equations (3)–(6), the gate-level realization of a 4-bit CLA without any carry input is as shown in Figure 3, and its critical path is highlighted by the pink dashed line.

_{N–1}, the second term given within brackets represents the propagation delay encountered in traversing (M–2) 4-bit CLAs, and the last term represents the propagation delay encountered in traversing the least significant 4-bit CLA, which does not have a carry input. The second term on the right-hand side of Equation (7) reflects the optimum decomposition of the 5-input OR and 5-input AND gates seen in Figure 1 into two 3-input OR gates and two 3-input AND gates, respectively.

… ($C_K$) are grouped into an intermediate product term and represented using a Boolean variable, and the remaining sum of product terms that do not involve the carry input are grouped and represented using another Boolean variable. With reference to Equations (4)–(6), to perform the grouping, we introduced some intermediate variables in the Boolean network, namely $A_1$, $A_2$, $A_3$, $A_4$, $A_5$, and $A_6$, where $A_1$ and $A_2$ are used for Equation (4), $A_3$ and $A_4$ are used for Equation (5), and $A_5$ and $A_6$ are used for Equation (6); they are expressed by Equations (8)–(13). In fact, this grouping procedure is generic and can be applied to a CLA of any size by incorporating only two intermediate variables in each look-ahead carry output equation. Supposing only two intermediate variables are present in a look-ahead carry output equation, as is the case with Equation (3), it can be retained as such, and no transformation needs to be done. In general, a CLA featuring L look-ahead carry outputs may require (2L–2) intermediate variables according to our proposition.
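Equations (8)–(13) are not reproduced in this text. One assignment consistent with the grouping described above is sketched here; which member of each pair denotes the carry-dependent product term is an assumption, as are the expanded carry expressions it starts from.

$$
\begin{aligned}
A_1 &= P_{K+1}P_K, & A_2 &= G_{K+1} + P_{K+1}G_K,\\
A_3 &= P_{K+2}P_{K+1}P_K, & A_4 &= G_{K+2} + P_{K+2}G_{K+1} + P_{K+2}P_{K+1}G_K,\\
A_5 &= P_{K+3}P_{K+2}P_{K+1}P_K, & A_6 &= G_{K+3} + P_{K+3}G_{K+2} + P_{K+3}P_{K+2}G_{K+1} + P_{K+3}P_{K+2}P_{K+1}G_K.
\end{aligned}
$$

With these, the look-ahead carries take a common two-term form:

$$
C_{K+2} = A_1 C_K + A_2, \qquad C_{K+3} = A_3 C_K + A_4, \qquad C_{K+4} = A_5 C_K + A_6,
$$

while $C_{K+1} = P_K C_K + G_K$ already has this form. For L = 4 look-ahead carry outputs, this uses 2L – 2 = 6 intermediate variables, matching the count stated above.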

$C_{K+1}$, $C_{K+2}$, $C_{K+3}$, and $C_{K+4}$ can be uniformly realized using a single complex gate, viz., the AO21 gate. Assuming that A, B, and C are the inputs to an AO21 gate and Y is its output, an AO21 gate implements the logic function Y = AB + C, which requires eight transistors for a static CMOS logic design.
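To illustrate the uniform AO21 realization, here is a minimal Python sketch that produces each look-ahead carry with one AO21 evaluation and compares the result with the bit-by-bit carry recurrence. The assignment of `a1` to `a6` (odd index for the carry-dependent product term, even index for the remaining terms) is an assumption, and all function names are illustrative rather than taken from the paper.

```python
from itertools import product


def ao21(a, b, c):
    """AO21 complex gate: Y = AB + C."""
    return (a & b) | c


def carries_ao21(p, g, c_in):
    """Look-ahead carries C_{K+1}..C_{K+4}, each from a single AO21 gate.

    p and g hold the propagate/generate bits, index 0 = position K.
    The a1..a6 naming is an assumed assignment, not the paper's numbering.
    """
    a1 = p[1] & p[0]
    a2 = g[1] | (p[1] & g[0])
    a3 = p[2] & p[1] & p[0]
    a4 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
    a5 = p[3] & p[2] & p[1] & p[0]
    a6 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]))
    return (
        ao21(p[0], c_in, g[0]),  # C_{K+1}: already in AB + C form
        ao21(a1, c_in, a2),      # C_{K+2}
        ao21(a3, c_in, a4),      # C_{K+3}
        ao21(a5, c_in, a6),      # C_{K+4}
    )


def carries_ripple(p, g, c_in):
    """Reference model: apply C_{Q+1} = G_Q + P_Q C_Q one position at a time."""
    out, c = [], c_in
    for pq, gq in zip(p, g):
        c = gq | (pq & c)
        out.append(c)
    return tuple(out)


def grouping_is_equivalent():
    """Check all 2**9 combinations of four P bits, four G bits, and c_in."""
    for bits in product((0, 1), repeat=9):
        p, g, c_in = bits[0:4], bits[4:8], bits[8]
        if carries_ao21(p, g, c_in) != carries_ripple(p, g, c_in):
            return False
    return True
```

In this sketch the intermediate variables depend only on the propagate and generate signals, so they can be computed before the carry input arrives; the carry input then passes through just one AO21 gate to reach any of the four carry outputs.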

## 3. Implementation and Estimation of Design Metrics of Adders

_{th} 32-28 nm CMOS standard digital cell library. The recommended supply voltage of 1.05 V and an operating junction temperature of 25 °C were used. A default wire load model was included during synthesis, and a fanout-of-4 drive strength was associated with all the output ports (i.e., sum bits) of the adders. The high-speed adder components present in the DesignWare library, i.e., the Ling adder, CSA, BKA, and Sklansky adder, were invoked during synthesis, and these were synthesized along with the rest of the high-speed adders mentioned earlier using the ‘compile’ command with speed defined as the optimization goal. To synthesize the RCA, the ‘compile_ultra’ command was used.

After synthesis, the gate-level netlists generated by Design Compiler were used to perform functional simulation using Synopsys VCS. A test bench comprising approximately 1000 randomly generated input vectors was supplied to the adders at an input frequency of 250 MHz; their functionality was verified and the corresponding switching activity was recorded. The switching activity information was then used to estimate the total average power using Synopsys PrimePower, and Synopsys PrimeTime was used to estimate the critical path delay of the adders. The total area of the adders after synthesis, comprising the cell area and the interconnect area, was estimated using Design Compiler. The design metrics of the adders are given in Table 1.

## 4. Conclusions

**Figure 6.** Normalized PDP of different 32-bit adders (a lower value is preferable, and the lowest is highlighted by the red bar). The PDP of the Ling adder is taken as the baseline because it is the highest, and the PDP of every adder is divided by this baseline value to obtain the normalized PDP.

| Adder Name | Cells Area (µm²) | Interconnect Area (µm²) | Total Area (µm²) | Critical Path Delay (ns) | Total Power (µW) |
|---|---|---|---|---|---|
| RCA | 155.03 | 10.98 | 166.01 | 3.40 | 42.13 |
| Conventional CLA | 350.97 | 37.44 | 388.41 | 2.53 | 41.69 |
| Ling adder (CLA variant) | 392.40 | 75.21 | 467.61 | 2.39 | 67.48 |
| Proposed CLA | 475.50 | 51.94 | 527.44 | 1.13 | 51.33 |
| Conditional sum adder (CSA) | 412.48 | 77.65 | 490.13 | 1.71 | 69.43 |
| CSLA without BEC (8-7-6-4-3-2-2) | 834.35 | 117.82 | 952.17 | 1.20 | 87.54 |
| CSLA with BEC (8-7-6-4-3-2-2) | 683.65 | 96.22 | 779.87 | 1.33 | 73.43 |
| CSLA without BEC (8-8-8-8) | 745.15 | 95.86 | 841.01 | 1.16 | 79.45 |
| CSLA with BEC (8-8-8-8) | 621.89 | 81.90 | 703.79 | 1.28 | 65.46 |
| Brent–Kung adder (BKA) | 419.85 | 64.40 | 484.25 | 2.42 | 56.65 |
| Sklansky adder | 387.06 | 62.75 | 449.81 | 2.74 | 57.10 |
| Kogge–Stone adder (KSA) | 1014.29 | 174.43 | 1188.72 | 0.73 | 84.99 |
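The normalized PDP values plotted in Figure 6 can be recomputed from the delay and power columns of the table above. The following minimal Python sketch does this, assuming PDP is the plain product of critical path delay and total power; the names `METRICS`, `pdp`, and `normalized_pdp` are illustrative.

```python
# (critical path delay in ns, total power in µW), copied from the table above
METRICS = {
    "RCA": (3.40, 42.13),
    "Conventional CLA": (2.53, 41.69),
    "Ling adder (CLA variant)": (2.39, 67.48),
    "Proposed CLA": (1.13, 51.33),
    "Conditional sum adder (CSA)": (1.71, 69.43),
    "CSLA without BEC (8-7-6-4-3-2-2)": (1.20, 87.54),
    "CSLA with BEC (8-7-6-4-3-2-2)": (1.33, 73.43),
    "CSLA without BEC (8-8-8-8)": (1.16, 79.45),
    "CSLA with BEC (8-8-8-8)": (1.28, 65.46),
    "Brent-Kung adder (BKA)": (2.42, 56.65),
    "Sklansky adder": (2.74, 57.10),
    "Kogge-Stone adder (KSA)": (0.73, 84.99),
}


def pdp(delay_ns, power_uw):
    """Power-delay product; ns multiplied by µW gives femtojoules."""
    return delay_ns * power_uw


def normalized_pdp(metrics, baseline="Ling adder (CLA variant)"):
    """Divide every adder's PDP by the PDP of the baseline adder."""
    base = pdp(*metrics[baseline])
    return {name: pdp(*values) / base for name, values in metrics.items()}


if __name__ == "__main__":
    # Print the adders from lowest (best) to highest normalized PDP.
    for name, value in sorted(normalized_pdp(METRICS).items(), key=lambda kv: kv[1]):
        print(f"{name:35s} {value:.3f}")
```

With these table values, the Ling adder has the highest PDP (so its normalized value is 1) and the proposed CLA has the lowest.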

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Balasubramanian, P.; Mastorakis, N.E.
High-Speed and Energy-Efficient Carry Look-Ahead Adder. *J. Low Power Electron. Appl.* **2022**, *12*, 46.
https://doi.org/10.3390/jlpea12030046
