# A RISC-V Processor with Area-Efficient Memristor-Based In-Memory Computing for Hash Algorithm in Blockchain Applications

## Abstract

## 1. Introduction

## 2. IMC-Adapted ISA Design for Hash Algorithm

#### 2.1. Hash Algorithm in Blockchain Technology

#### 2.2. IMC-Adapted ISA Design

## 3. RISC-V Processor with IMC

#### 3.1. Processor Architecture

#### 3.2. IMC Implementation

#### 3.2.1. IMC Core Architecture and Assistant Logic

#### 3.2.2. IMC Memristor Array and Read-Out Circuit

[…] I_{BL0} and I_{BLn}, are compared with two reference currents, I_{OR} and I_{AND}. The HRS resistance is usually about 10 times the LRS resistance [10], meaning that I_{LRS} >> I_{HRS}, where I_{HRS} and I_{LRS} stand for the typical read currents for HRS and LRS, respectively. Therefore, the typical values of I_{OR} and I_{AND} can be set as 0.5 × I_{LRS} and 1.5 × I_{LRS}. When I_{SUM} is larger than I_{OR}, the signal OR becomes logic “1”, implying that at least one of the two activated memristors along the same bitline is in the LRS. When I_{SUM} is larger than I_{AND}, the signal AND becomes logic “1”, implying that both activated memristors along the same bitline are in the LRS. By sending the results of OR and AND to an XOR gate, the XOR result is obtained at the output O_{0}–O_{n}. According to the control signal sel[1:0] from the assistant logic circuit, the corresponding result is written back to the 1D1R array in the next clock cycle. To perform 320-bit operations, this work adopts a 20-kb memristor array with 64 rows and 320 columns.
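
The threshold comparison described above can be checked with a small behavioral model. This is only a sketch under stated assumptions: a cell in the LRS is taken to store logic 1, the read currents are normalized to I_{LRS} = 1, the HRS current is taken as one tenth of that, and the function name `read_out` is hypothetical. It models one bitline with two activated cells, not the actual circuit.

```python
def read_out(cell_a_lrs, cell_b_lrs, i_lrs=1.0, hrs_lrs_ratio=10.0):
    """Behavioral model of one bitline read-out with two activated cells.

    Assumptions (not from the circuit itself): LRS encodes logic 1 and
    I_HRS = I_LRS / hrs_lrs_ratio.
    """
    i_hrs = i_lrs / hrs_lrs_ratio
    # The bitline current is the sum of the two activated cell currents.
    i_sum = (i_lrs if cell_a_lrs else i_hrs) + (i_lrs if cell_b_lrs else i_hrs)
    # Reference currents as given in the text.
    i_or = 0.5 * i_lrs
    i_and = 1.5 * i_lrs
    or_out = i_sum > i_or
    and_out = i_sum > i_and
    # XOR is derived from the OR and AND comparator outputs.
    xor_out = or_out != and_out
    return or_out, and_out, xor_out


for a in (False, True):
    for b in (False, True):
        print(a, b, read_out(a, b))
```

With a 10:1 current ratio the three possible sums are 0.2, 1.1, and 2.0 times I_{LRS}, so each lies clearly on one side of both references.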

## 4. IMC Compiling Policy and Data Allocation Method

#### 4.1. IMC Compiling Policy

#### 4.2. Data Allocation Method for SHA-3

## 5. Evaluation

#### 5.1. Evaluation Methods

#### 5.2. Area Overhead

The two working SRAM memories each had a capacity of 64 kb. The SRAM cell size was 0.12 μm^{2} and the total area of the two working SRAMs was 0.028 mm^{2} [22]. For the IMC module, 320 IMC read-out circuits were required to support 320-bit bitwise logic operations. Assuming that each IMC read-out circuit had a size of 2 μm × 4 μm, the total area of the IMC read-out circuits was 0.0026 mm^{2}. The area of the advanced row decoder was estimated to be 0.001 mm^{2}, i.e., 50 μm × 20 μm. The other circuits in the IMC module were relatively small, with an estimated area of 0.0005 mm^{2}. With 3D stacking, the 20-kb memristor array of 1D1R cells would not add area. In total, the area of the IMC module was about 0.004 mm^{2}. Figure 7 shows the area comparison of the baseline and the RISC-V processor with IMC. The IMC module brings an area overhead of about 12%. However, the memristor array in the IMC module also serves as a data cache; thus, the SRAM capacity for data can be reduced, which offsets part of the area overhead. When the SRAM capacity for data is reduced by 20 kb, the total area is reduced by about 0.003 mm^{2}, and the area overhead decreases to only 3%.
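
The area figures above can be re-derived with a few lines of arithmetic. The sketch below uses only the dimensions quoted in this section. The baseline area is not stated in the surviving text, so it is inferred here from the 12% overhead figure, which is an assumption. All variable names are illustrative.

```python
UM2_PER_MM2 = 1e6  # 1 mm^2 = 10^6 um^2

# Figures quoted in the text.
readout_area_mm2 = 320 * (2 * 4) / UM2_PER_MM2   # 320 circuits of 2 um x 4 um
decoder_area_mm2 = (50 * 20) / UM2_PER_MM2       # advanced row decoder
other_area_mm2 = 0.0005                          # remaining IMC circuits
sram_saving_mm2 = 0.003                          # from removing 20 kb of data SRAM

imc_area_mm2 = readout_area_mm2 + decoder_area_mm2 + other_area_mm2

# Assumption: the baseline area is back-computed from the stated 12% overhead.
baseline_area_mm2 = imc_area_mm2 / 0.12
net_overhead = (imc_area_mm2 - sram_saving_mm2) / baseline_area_mm2

print(f"IMC module area: {imc_area_mm2:.5f} mm^2")
print(f"inferred baseline area: {baseline_area_mm2:.4f} mm^2")
print(f"net overhead after SRAM reduction: {net_overhead:.1%}")
```

Under that inferred baseline, the net overhead comes out close to the 3% quoted in the text.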

#### 5.3. Performance Improvement

#### 5.4. Energy Reduction

#### 5.5. Comparison with Mainstream Mining Platforms and SRAM-Based IMC

## 6. Conclusions

## Author Contributions

## Funding

## Conflicts of Interest

## References

1. Ashton, K. That ‘internet of things’ thing. RFID J. **2009**, 22, 97–115.
2. Pujolle, G. An autonomic-oriented architecture for the internet of things. In Proceedings of the IEEE John Vincent Atanasoff 2006 International Symposium on Modern Computing, Sofia, Bulgaria, 3–6 October 2006.
3. Madakam, S.; Ramaswamy, R.; Tripathi, S. Internet of Things (IoT): A literature review. J. Comput. Commun. **2015**, 3, 164.
4. Dorri, A.; Kanhere, S.S.; Jurdak, R.; Gauravaram, P. Blockchain for IoT security and privacy: The case study of a smart home. In Proceedings of the IEEE International Conference on Pervasive Computing and Communications Workshops, Kona, HI, USA, 13–17 March 2017.
5. Nakamoto, S. Bitcoin: A Peer-to-Peer Electronic Cash System. Available online: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.221.9986 (accessed on 20 May 2019).
6. Zhang, Y.; Wen, J. The IoT electric business model: Using blockchain technology for the internet of things. Peer Peer Netw. Appl. **2017**, 10, 983–994.
7. Banerjee, M.; Lee, J.; Choo, K.K.R. A blockchain future for internet of things security: A position paper. Digit. Commun. Netw. **2018**, 4, 159–160.
8. Gubbi, J.; Buyya, R.; Marusic, S.; Palaniswami, M. Internet of Things (IoT): A vision, architectural elements, and future directions. Future Gener. Comput. Syst. **2013**, 29, 1645–1660.
9. Zhang, Y.; Yang, K.; Saligane, M.; Blaauw, D.; Sylvester, D. A compact 446 Gbps/W AES accelerator for mobile SoC and IoT in 40 nm. In Proceedings of the IEEE Symposium on VLSI Circuits (VLSI-Circuits), Honolulu, HI, USA, 15–17 June 2016.
10. Zhang, Y.; Xu, L.; Yang, K.; Dong, Q.; Jeloka, S.; Blaauw, D.; Sylvester, D. Recryptor: A reconfigurable in-memory cryptographic Cortex-M0 processor for IoT. In Proceedings of the IEEE Symposium on VLSI Circuits (VLSI-Circuits), Kyoto, Japan, 5–8 June 2017.
11. Xue, X.; Jian, W.; Yang, J.; Xiao, F. A 0.13 µm 8 Mb Logic-Based CuxSiyO ReRAM with Self-Adaptive Operation for Yield Enhancement and Power Reduction. IEEE J. Solid State Circuits **2013**, 48, 1315–1322.
12. Chen, W.; Lin, W.; Lai, L.; Li, S.; Hsu, C.-H.; Lin, H.-T.; Lee, H.-Y.; Su, J.-W.; Xie, Y.; Sheu, S.-S.; et al. A 16Mb dual-mode ReRAM macro with sub-14ns computing-in-memory and memory functions enabled by self-write termination scheme. In Proceedings of the IEEE International Electron Devices Meeting (IEDM), San Francisco, CA, USA, 2–6 December 2017.
13. Lee, Y.; Waterman, A.; Avizienis, R.; Cook, H.; Sun, C.; Stojanović, V.; Asanović, K. A 45 nm 1.3 GHz 16.7 double-precision GFLOPS/W RISC-V processor with vector accelerators. In Proceedings of the European Solid-State Circuits Conference (ESSCIRC), Venice Lido, Italy, 22–26 September 2015.
14. The RISC-V Instruction Set Manual, Volume I: User-Level ISA, Document Version 2.2. University of California: Berkeley, CA, USA, 2017. Available online: https://riscv.org/specifications/ (accessed on 3 January 2019).
15. Huang, S.; Wang, X.; Xu, G.; Wang, M.; Zhao, J. Conditional cube attack on reduced-round Keccak sponge function. In Annual International Conference on the Theory and Applications of Cryptographic Techniques; Springer: Cham, Switzerland, 2017.
16. Dinur, I.; Morawiecki, P.; Pieprzyk, J.; Srebrny, M.; Straus, M. Cube attacks and cube-attack-like cryptanalysis on the round-reduced Keccak sponge function. In Annual International Conference on the Theory and Applications of Cryptographic Techniques; Springer: Berlin, Germany, 2015.
17. Keccak Specifications, Version 2, Team Keccak. 2009. Available online: https://keccak.team/obsolete/Keccak-specifications-2.pdf (accessed on 3 January 2019).
18. The Ultra-Low Power RISC Core. Available online: https://github.com/SI-RISCV/e200_opensource (accessed on 18 May 2019).
19. Zhou, K.; Xue, X.; Yang, J.; Xu, X.; Lv, H.; Wang, M.; Jing, M.; Liu, W.; Zeng, X.; Chung, S.S.; et al. Nonvolatile Crossbar 2D2R TCAM with Cell Size of 16.3 F^{2} and K-means Clustering for Power Reduction. In Proceedings of the IEEE Asian Solid-State Circuits Conference (A-SSCC), Tainan, Taiwan, 5–7 November 2018.
20. Udayakumaran, S.; Barua, R. Compiler-decided dynamic memory allocation for scratch-pad based embedded systems. In Proceedings of the 2003 International Conference on Compilers, Architecture and Synthesis for Embedded Systems, San Jose, CA, USA, 30 October–1 November 2003.
21. Synopsys VCS Verilog Simulator. Available online: http://www.synopsys.com/products/simulation/simulation.html (accessed on 3 January 2019).
22. Sinangil, M.E.; Mair, H.; Chandrakasan, A.P. A 28 nm high-density 6T SRAM with optimized peripheral-assist circuits for operation down to 0.6 V. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 20–24 February 2011.
23. Difference between ASIC, GPU, and CPU Mining. Available online: https://cointopper.com (accessed on 15 July 2019).

| Step | Equations | Main Process |
|---|---|---|
| θ | (1) | Massive 64-bit and 320-bit bitwise XOR operations, a few 64-bit shift operations |
| ρ | (2), (3) | Massive 64-bit shift operations and data copying |
| π | (4) | Massive 64-bit data copying |
| χ | (5) | Massive bitwise 320-bit logic operations (XOR, OR and AND) |
| ι | (6) | Massive operations on one 64-bit binary string |

**Table 2.** In-memory computing (IMC) applications in Keccak-f permutation. XOR: exclusive or; SHIFT: 64-bit shift operation; CPA: copy to all columns; CP: 64-bit data copying operation.

| Step | IMC Involved |
|---|---|
| θ | XOR, SHIFT, CPA |
| ρ | SHIFT, CP |
| π | CP |
| χ | XOR, OR, AND, CPA |
| ι | None |
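
To make the θ row of the table concrete, the sketch below runs the standard Keccak θ step (as defined in the Keccak specification [17]) on a state stored as five 320-bit rows, each holding five 64-bit lanes. The row layout, the helper names (`pack`, `lane`, `theta_rows`, `theta_lanes`), and the use of Python integers as rows are assumptions made for illustration. The sketch does not model the IMC instructions themselves, only the wide XOR and the 64-bit rotation that they would carry out.

```python
LANE_BITS = 64
LANE_MASK = (1 << LANE_BITS) - 1


def rotl64(v, n):
    """Rotate a 64-bit lane left by n bits."""
    n %= LANE_BITS
    return ((v << n) | (v >> (LANE_BITS - n))) & LANE_MASK


def lane(row, x):
    """Extract lane x (0..4) from a 320-bit row."""
    return (row >> (LANE_BITS * x)) & LANE_MASK


def pack(lanes):
    """Pack five 64-bit lanes into one 320-bit row (assumed layout)."""
    row = 0
    for x, v in enumerate(lanes):
        row |= (v & LANE_MASK) << (LANE_BITS * x)
    return row


def theta_rows(rows):
    """Keccak theta on five 320-bit rows; rows[y] holds lanes A[0..4][y]."""
    c = 0
    for r in rows:            # column parity: 320-bit XOR across the 5 rows
        c ^= r
    d = pack([lane(c, (x - 1) % 5) ^ rotl64(lane(c, (x + 1) % 5), 1)
              for x in range(5)])
    return [r ^ d for r in rows]   # one 320-bit XOR per row


def theta_lanes(a):
    """Lane-by-lane reference: a[x][y] is a 64-bit lane."""
    c = [a[x][0] ^ a[x][1] ^ a[x][2] ^ a[x][3] ^ a[x][4] for x in range(5)]
    d = [c[(x - 1) % 5] ^ rotl64(c[(x + 1) % 5], 1) for x in range(5)]
    return [[a[x][y] ^ d[x] for y in range(5)] for x in range(5)]
```

In the row form, θ needs four 320-bit XORs for the parity, a handful of 64-bit rotations and XORs to build D, and five more 320-bit XORs, which is consistent with the operation mix the tables attribute to θ.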

| Bit | 31–30 | 29–25 | 24–20 | 19–15 | 14–13 | 12 | 11–7 | 6–0 |
|---|---|---|---|---|---|---|---|---|
| XOR | 00 | A1 | A2 | BA | I/R | | A0 | Custom0 |
| OR | 01 | A1 | A2 | BA | I/R | | A0 | Custom0 |
| AND | 10 | A1 | A2 | BA | I/R | | A0 | Custom0 |
| SHIFT | 11 | A1 | SA[5:0] | BA | I/R | SA[6:0] | A0 | Custom0 |
| Normal read | Imm[11:0] | | | rs | Reserved | | rd | Custom1 |
| Normal write | Imm[11:5] | | rs2 | rs1 | Reserved | | Imm[4:0] | Custom2 |
| CP and CPA | 0, Flag | A1 | A2 | BA | I/R | Col[5:0] | | Custom3 |

**Table 4.** Operation table of one-diode-one-memristor (1D1R) memristor array for IMC. HRS: high-resistance state; LRS: low-resistance state; WL: wordline; BL: bitline.

| Operation Mode | Selected WL | Unselected WL | Selected BL | Unselected BL |
|---|---|---|---|---|
| Set (HRS→LRS) | 0 | Vset | Vset + Vt | 0 |
| Reset (LRS→HRS) | 0 | Vreset | Vreset + Vt | 0 |
| Logic (Read) | 0 | Vread | Vread + Vt | 0 |

| Operation | Energy (pJ) |
|---|---|
| ALU | 70 |
| SRAM read/write | 0.1/bit |
| memristor read | 0.3/bit |
| memristor write | 0.6/bit |
| memristor logic | 0.45/bit |

| Instruction | Main Actions | Energy (pJ) |
|---|---|---|
| ALU | Fetch, decode and execute the instruction | 70 |
| SRAM read/write | ALU, 32-bit SRAM read/write | 73.2 |
| IMC read | ALU, 32-bit memristor read | 82.8 |
| IMC write | ALU, 32-bit memristor write | 89.2 |
| IMC CP | ALU, 64-bit memristor read and write | 134 |
| IMC CPA | ALU, 64-bit memristor read and 320-bit memristor write | 287.6 |
| IMC Logic (AND, OR, and XOR) | ALU, 320-bit memristor logic and write | 406 |
| IMC SHIFT | ALU, 320-bit memristor read and write | 390 |
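
The per-instruction energies appear to be the ALU energy plus the per-bit memory energy of each listed action. The sketch below reproduces the table on that reading. One point is an assumption rather than a statement from the source: the four rows that include a memristor read (IMC read, CP, CPA, SHIFT) only match with a read energy of 0.4 pJ/bit, not the 0.3 pJ/bit listed for memristor read, so the sketch uses 0.4 pJ/bit. The function and constant names are illustrative.

```python
ALU_PJ = 70.0
SRAM_PJ_PER_BIT = 0.1
MEMRISTOR_WRITE_PJ_PER_BIT = 0.6
MEMRISTOR_LOGIC_PJ_PER_BIT = 0.45
# Assumption: 0.4 pJ/bit is the value implied by the per-instruction rows;
# the per-operation table lists 0.3 pJ/bit for memristor read.
MEMRISTOR_READ_PJ_PER_BIT = 0.4


def instruction_energy_pj(sram_bits=0, read_bits=0, write_bits=0, logic_bits=0):
    """ALU energy plus per-bit energy of each memory action (additive model)."""
    return (ALU_PJ
            + sram_bits * SRAM_PJ_PER_BIT
            + read_bits * MEMRISTOR_READ_PJ_PER_BIT
            + write_bits * MEMRISTOR_WRITE_PJ_PER_BIT
            + logic_bits * MEMRISTOR_LOGIC_PJ_PER_BIT)


ENERGY_PJ = {
    "SRAM read/write": instruction_energy_pj(sram_bits=32),
    "IMC read": instruction_energy_pj(read_bits=32),
    "IMC write": instruction_energy_pj(write_bits=32),
    "IMC CP": instruction_energy_pj(read_bits=64, write_bits=64),
    "IMC CPA": instruction_energy_pj(read_bits=64, write_bits=320),
    "IMC Logic": instruction_energy_pj(logic_bits=320, write_bits=320),
    "IMC SHIFT": instruction_energy_pj(read_bits=320, write_bits=320),
}

for name, pj in ENERGY_PJ.items():
    print(f"{name}: {pj:.1f} pJ")
```

With 0.3 pJ/bit instead, the same model would give 79.6 pJ for an IMC read, which does not match the tabulated 82.8 pJ.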

**Table 7.** Comparison of memristor-based IMC with central processing unit (CPU), graphics processing unit (GPU), application-specific integrated circuit (ASIC), and SRAM-based IMC.

| Mining Platform | Performance (H/s) | Active Power (W) | Energy Efficiency (J/H) | Area (mm^{2}) |
|---|---|---|---|---|
| CPU (i5 2500K) [23] | 4.80 × 10^{4} | 90 | 1.88 × 10^{−3} | large scale |
| GPU (Tesla S1070) [23] | 1.55 × 10^{8} | 8.00 × 10^{2} | 5.16 × 10^{−6} | large scale |
| ASIC (Antminer S4) [23] | 2.00 × 10^{9} | 1.40 × 10^{3} | 7.00 × 10^{−7} | large scale |
| SRAM-based IMC | 1.03 × 10^{3} | 8.80 × 10^{−4} | 8.50 × 10^{−7} | 3.50 × 10^{−2} |
| Memristor-based IMC | 1.03 × 10^{3} | 1.17 × 10^{−3} | 1.14 × 10^{−6} | 5.50 × 10^{−2} |
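
The energy-efficiency column follows from dividing active power by hash rate (W ÷ H/s = J/H). A short check using the tabulated performance and power values is sketched below. The dictionary and function names are illustrative, and small differences from the table come from rounding in the published figures.

```python
# (hash rate in H/s, active power in W), copied from the table above
PLATFORMS = {
    "CPU (i5 2500K)": (4.80e4, 90.0),
    "GPU (Tesla S1070)": (1.55e8, 8.00e2),
    "ASIC (Antminer S4)": (2.00e9, 1.40e3),
    "SRAM-based IMC": (1.03e3, 8.80e-4),
    "Memristor-based IMC": (1.03e3, 1.17e-3),
}


def joules_per_hash(hash_rate_hps, power_w):
    """Energy per hash in J/H: power (J/s) divided by hash rate (H/s)."""
    return power_w / hash_rate_hps


EFFICIENCY_J_PER_H = {name: joules_per_hash(rate, power)
                      for name, (rate, power) in PLATFORMS.items()}

for name, jph in EFFICIENCY_J_PER_H.items():
    print(f"{name}: {jph:.3e} J/H")
```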

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Xue, X.; Wang, C.; Liu, W.; Lv, H.; Wang, M.; Zeng, X.
A RISC-V Processor with Area-Efficient Memristor-Based In-Memory Computing for Hash Algorithm in Blockchain Applications. *Micromachines* **2019**, *10*, 541.
https://doi.org/10.3390/mi10080541
