# Area-Efficient Mapping of Convolutional Neural Networks to Memristor Crossbars Using Sub-Image Partitioning


## Abstract


## 1. Introduction

The MAC operation performed on one crossbar column can be written as

$${I}_{col,j}=\sum_{i=1}^{m}{G}_{i}{v}_{row,i}$$

where $I_{col,j}$ is the current of column $j$, $m$ is the number of rows in the crossbar, and $G_i$ and $v_{row,i}$ are the conductance and input voltage of memristor $i$, respectively. Here, $j$ and $i$ are the indices of the crossbar's columns and rows, respectively.

$G_{0+}$ and $G_{0-}$ represent the memristor conductances on the plus and minus columns, respectively, for row #0, and $X_0$ is the input voltage applied to row #0. Similarly, $G_{m+}$ and $G_{m-}$ are the memristor conductances for row #m, and $X_m$ is the input voltage applied to row #m in Figure 1b. Here, $I_{0+}$ can be calculated as ${G}_{0+}{X}_{0}+\cdots+{G}_{m+}{X}_{m}$, and $I_{0-}$ as ${G}_{0-}{X}_{0}+\cdots+{G}_{m-}{X}_{m}$. The difference ${I}_{0+}-{I}_{0-}$ is computed by circuit (A) and then enters the voltage amplifier (B), where $Y_0$ is obtained and delivered to the next crossbar. Here, ${G}_{0+}-{G}_{0-}$ can be regarded as a synaptic weight: if $G_{0+}$ is larger than $G_{0-}$, the weight is positive; if $G_{0+}$ is smaller than $G_{0-}$, the weight is negative. Similarly, ${G}_{m+}-{G}_{m-}$ can be regarded as another synaptic weight. In this way, both positive and negative weights can be represented using the (+) and (−) columns, as shown in Figure 1b [14].
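The differential-column scheme described above can be checked numerically. In the sketch below, all conductance and voltage values are hypothetical, chosen only to illustrate that subtracting the (+) and (−) column currents is equivalent to a MAC operation with signed weights:

```python
import numpy as np

# Hypothetical values for illustration: conductances in siemens, voltages in volts.
# Each signed weight is stored as a pair of conductances on a (+) and a (-) column.
G_plus  = np.array([1.0e-4, 2.0e-5])   # G0+, Gm+ (assumed values)
G_minus = np.array([4.0e-5, 8.0e-5])   # G0-, Gm- (assumed values)
X       = np.array([0.5, 0.3])         # input voltages X0, Xm

# Column currents by Ohm's and Kirchhoff's laws: I0+ = G0+ X0 + ... + Gm+ Xm
I_plus  = np.dot(G_plus,  X)
I_minus = np.dot(G_minus, X)

# Circuit (A) subtracts the two column currents; the difference realizes
# the signed weights W = G+ - G-.
I_diff = I_plus - I_minus
W = G_plus - G_minus                   # signed synaptic weights
```

With the values above, `W[0]` is positive (since $G_{0+}>G_{0-}$) and `W[1]` is negative, and `I_diff` equals the dot product of `W` with the inputs.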

## 2. Method

$R_S$, $R_W$, and $R_N$ are the parasitic source, line, and neuron resistances, respectively [15]. $V_{IN,0}$ is the input voltage applied to row #0, and $I_0$ is the column current from column #0. In Figure 2c, input voltages such as $V_{IN,0}$ are applied to the crossbar's rows, and the currents generated on the crossbar's columns can be regarded as the MAC results calculated physically by the memristor crossbar.

Figure 2d shows the simulated column current for different values of $R_W$. Here, the crossbar is assumed to have 784 cells per column, as shown in Figure 2c, and $R_W$ means the line resistance per cell. If the column has 784 cells and $R_W$ = 1.1 Ω, the total line resistance becomes as large as 862 Ω. In this figure, the normalized column current, i.e., the MAC calculation result, is plotted against the percentage of active rows among the 784 rows, where an 'active row' is a row whose input voltage is high. If the percentage of active rows is 50%, 392 of the 784 inputs are driven by the high voltage and the other 392 by 0 V. Here, 1T-1R means a crossbar cell composed of one transistor and one memristor, while 1S-1R is an array made of self-rectifying memristors. For 1T-1R, the effective LRS resistance, considering both the LRS and the transistor's ON resistance, is assumed to be 26.3 kΩ in the circuit simulation of Figure 2d. The effective HRS resistance, considering both the HRS and the transistor's ON resistance, can be taken to be the same as HRS = 1 MΩ, because the HRS is much larger than the transistor's ON resistance, as explained later in Section 3. In 1S-1R, the selector is united with the memristor, so no external transistor is needed as the selector.

When $R_W$ = 0 Ω, the normalized column current increases linearly with the percentage of active rows among the 784 rows for both 1S-1R and 1T-1R cells, which indicates clearly that the MAC calculation accuracy is not degraded for either cell type. However, when $R_W$ = 0.5 Ω or $R_W$ = 1.1 Ω, the normalized column current saturates rapidly once the percentage of active rows exceeds 25%, meaning that the MAC calculation accuracy is degraded severely when $R_W$ is not zero. The larger $R_W$ becomes, the more the MAC calculation accuracy is degraded, as shown in Figure 2d. This circuit simulation of the MAC calculation through the crossbar's column current shows that line resistance can degrade MAC calculation accuracy significantly. Based on the analysis of Figure 2d, we discuss how to mitigate the line-resistance problem in memristor crossbars in the following paragraphs.
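The IR-drop mechanism behind Figure 2d can be sketched with a nodal analysis of one column as a resistive ladder. This is a minimal linear model under stated assumptions (all cells in the 26.3 kΩ effective LRS, $R_W$ = 1.1 Ω per cell, virtual-ground sensing); it reproduces the attenuation caused by line resistance, while the saturation seen in Figure 2d additionally involves device nonlinearity that is not modeled here. The function name and its interface are illustrative:

```python
import numpy as np

def column_current(v_in, r_w=1.1, r_cell=26.3e3):
    """Current read from one crossbar column, modeled as a resistive ladder.

    Each cell couples its row input voltage v_in[i] to column node i through
    the cell resistance r_cell; consecutive column nodes are linked by the
    per-cell line resistance r_w, and the last node drives the virtual ground
    of the sense amplifier through one more r_w segment (nodal analysis).
    """
    m = len(v_in)
    g_c = 1.0 / r_cell
    if r_w <= 0:                       # ideal wire: plain MAC result
        return g_c * float(np.sum(v_in))
    g_w = 1.0 / r_w
    A = np.zeros((m, m))               # nodal conductance matrix
    b = g_c * np.asarray(v_in, dtype=float)
    for i in range(m):
        A[i, i] = g_c
        if i > 0:
            A[i, i] += g_w
            A[i, i - 1] = -g_w
        if i < m - 1:
            A[i, i] += g_w
            A[i, i + 1] = -g_w
    A[m - 1, m - 1] += g_w             # last segment into the virtual ground
    v = np.linalg.solve(A, b)
    return g_w * v[m - 1]              # current delivered to the sense node

# Example: all 784 rows active, ideal wire vs. r_w = 1.1 ohm per cell
m = 784
ideal = column_current(np.ones(m), r_w=0.0)
real = column_current(np.ones(m), r_w=1.1)
```

In this model the column current with $R_W$ = 1.1 Ω falls below the ideal MAC value, and rows far from the sense node contribute less than nearby rows, which is the row-position dependence introduced by the 862 Ω of accumulated line resistance.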

## 3. Results

The memristive film is LaAlO$_3$, and the bottom electrode is SrTiO$_3$. A butterfly curve from the device in Figure 8a is shown in Figure 8b [7,29]. The black box and the red line in Figure 8b indicate the experimentally measured data and the Verilog-A model, respectively. The High-Resistance State (HRS) and Low-Resistance State (LRS) measured in Figure 8b are around 1 MΩ and 10 kΩ, respectively [28], when the read voltage is as large as 1 V. Considering a transistor as the selector, when the transistor is on, the effective resistance combining the LRS and the transistor's ON resistance can be as small as 26.3 kΩ. The effective resistance combining the HRS and the transistor's ON resistance is very close to the HRS, because the ON resistance is much smaller than the HRS. Thus, as long as the transistor's ON resistance is comparable to the LRS but much smaller than the HRS, the MAC calculation accuracy of 1T-1R crossbars is not degraded. When the transistor is turned off, its OFF resistance is much larger than the HRS, so the sneak leakage through unselected cells becomes negligibly small in memristor crossbars.
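The series combination of the memristor and the access transistor can be checked with simple arithmetic. The transistor ON resistance below is an assumption, inferred so that LRS plus ON resistance matches the 26.3 kΩ effective LRS quoted above; the exact value is not stated in the text:

```python
# 1T-1R cell: memristor in series with the access transistor.
LRS, HRS = 10e3, 1e6      # memristor states from Figure 8b (ohms)
R_ON = 16.3e3             # transistor ON resistance (inferred assumption)

eff_lrs = LRS + R_ON      # 26.3 kOhm: ON resistance is comparable to LRS
eff_hrs = HRS + R_ON      # ~1 MOhm: ON resistance is negligible next to HRS
```

The effective HRS deviates from the bare HRS by under 2%, which is why the text treats the two as the same.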

## 4. Conclusions

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## References

- Hu, M.; Graves, C.E.; Li, C.; Li, Y.; Ge, N.; Montgomery, E.; Davila, N.; Jiang, H.; Williams, R.S.; Yang, J.J.; et al. Memristor-Based Analog Computation and Neural Network Classification with a Dot Product Engine. Adv. Mater. **2018**, 30, 1705914.
- Li, B.; Gu, P.; Shan, Y.; Wang, Y.; Chen, Y.; Yang, H. RRAM-Based Analog Approximate Computing. IEEE Trans. Comput. Des. Integr. Circuits Syst. **2015**, 34, 1905–1917.
- Xia, L.; Gu, P.; Li, B.; Tang, T.; Yin, X.; Huangfu, W.; Yu, S.; Cao, Y.; Wang, Y.; Yang, H. Technological Exploration of RRAM Crossbar Array for Matrix-Vector Multiplication. J. Comput. Sci. Technol. **2016**, 31, 3–19.
- Chen, J.; Li, J.; Li, Y.; Miao, X. Multiply accumulate operations in memristor crossbar arrays for analog computing. J. Semicond. **2021**, 42, 013104.
- Suh, K.D.; Suh, B.H.; Lim, Y.H.; Kim, J.K.; Choi, Y.J.; Koh, Y.N.; Lee, S.S.; Kwon, S.C.; Choi, B.S.; Yum, J.S.; et al. A 3.3 V 32 Mb NAND flash memory with incremental step pulse programming scheme. IEEE J. Solid-State Circuits **1995**, 30, 1149–1156.
- Van Pham, K.; Tran, S.B.; Van Nguyen, T.; Min, K.-S. Asymmetrical Training Scheme of Binary-Memristor-Crossbar-Based Neural Networks for Energy-Efficient Edge-Computing Nanoscale Systems. Micromachines **2019**, 10, 141.
- Truong, S.N.; Van Pham, K.; Yang, W.; Shin, S.; Pedrotti, K.; Min, K.-S. New pulse amplitude modulation for fine tuning of memristor synapses. Microelectron. J. **2016**, 55, 162–168.
- Hu, M.; Strachan, J.; Li, Z.; Grafals, E.; Graves, C. Dot-product engine for neuromorphic computing: Programming 1T1M crossbar to accelerate matrix-vector multiplication. In Proceedings of the 2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC), Austin, TX, USA, 5–9 June 2016; pp. 1–6.
- Li, Y.; Wang, Z.; Midya, R.; Xia, Q.; Yang, J.J. Review of memristor devices in neuromorphic computing: Materials sciences and device challenges. J. Phys. D Appl. Phys. **2018**, 51, 503002.
- Krestinskaya, O.; James, A.P.; Chua, L.O. Neuromemristive Circuits for Edge Computing: A Review. IEEE Trans. Neural Netw. Learn. Syst. **2020**, 31, 4–23.
- Mao, J.; Zhou, L.; Zhu, X.; Zhou, Y.; Han, S. Photonic Memristor for Future Computing: A Perspective. Adv. Opt. Mater. **2019**, 7, 1900766.
- Akopyan, F.; Sawada, J.; Cassidy, A.; Alvarez-Icaza, R.; Arthur, J.; Merolla, P.; Imam, N.; Nakamura, Y.; Datta, P.; Nam, G.-J.; et al. TrueNorth: Design and Tool Flow of a 65 mW 1 Million Neuron Programmable Neurosynaptic Chip. IEEE Trans. Comput. Des. Integr. Circuits Syst. **2015**, 34, 1537–1557.
- Davies, M.; Srinivasa, N.; Lin, T.-H.; Chinya, G.; Cao, Y.; Choday, S.H.; Dimou, G.; Joshi, P.; Imam, N.; Jain, S.; et al. Loihi: A Neuromorphic Manycore Processor with On-Chip Learning. IEEE Micro **2018**, 38, 82–99.
- Van Pham, K.; Van Nguyen, T.; Tran, S.B.; Nam, H.; Lee, M.J.; Choi, B.J.; Truong, S.N.; Min, K.-S. Memristor Binarized Neural Networks. J. Semicond. Technol. Sci. **2018**, 18, 568–577.
- Nguyen, T.; An, J.; Min, K.-S. Memristor-CMOS Hybrid Neuron Circuit with Nonideal-Effect Correction Related to Parasitic Resistance for Binary-Memristor-Crossbar Neural Networks. Micromachines **2021**, 12, 791.
- Chakraborty, I.; Roy, D.; Roy, K. Technology Aware Training in Memristive Neuromorphic Systems for Nonideal Synaptic Crossbars. IEEE Trans. Emerg. Top. Comput. Intell. **2018**, 2, 335–344.
- Xu, W.; Wang, J.; Yan, X. Advances in Memristor-Based Neural Networks. Front. Nanotechnol. **2021**, 3, 645995.
- Van Nguyen, T.; An, J.; Oh, S. Training, Programming, and Correction Techniques of Memristor-Crossbar Neural Networks with Non-Ideal Effects such as Defects, Variation, and Parasitic Resistance. In Proceedings of the 2021 IEEE 14th International Conference on ASIC (ASICON), Kunming, China, 26–29 October 2021; pp. 1–4.
- Murali, G.; Sun, X.; Yu, S.; Lim, S.K. Heterogeneous Mixed-Signal Monolithic 3-D In-Memory Computing Using Resistive RAM. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. **2020**, 29, 386–396.
- Sah, M.P.; Yang, C.; Kim, H.; Muthuswamy, B.; Jevtic, J.; Chua, L. A Generic Model of Memristors With Parasitic Components. IEEE Trans. Circuits Syst. I Regul. Pap. **2015**, 62, 891–898.
- Chou, C.-C.; Lin, Z.-J.; Tseng, P.-L.; Li, C.-F.; Chang, C.-Y.; Chen, W.-C.; Chih, Y.-D.; Chang, T.-Y.J. An N40 256K×44 embedded RRAM macro with SL-precharge SA and low-voltage current limiter to improve read and write performance. In Proceedings of the 2018 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 11–15 February 2018; pp. 478–480.
- Krizhevsky, A.; Nair, V.; Hinton, G. CIFAR-10 and CIFAR-100 Datasets. 2018. Available online: https://www.cs.toronto.edu/~kriz/cifar.html (accessed on 20 October 2018).
- Deng, J.; Dong, W.; Socher, R.; Li, L.-J.; Li, K.; Fei-Fei, L. ImageNet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 248–255.
- Gopalakrishnan, R.; Chua, Y.; Sun, P.; Kumar, A.J.S.; Basu, A. HFNet: A CNN Architecture Co-designed for Neuromorphic Hardware With a Crossbar Array of Synapses. Front. Neurosci. **2020**, 14, 907.
- Chollet, F. Xception: Deep learning with depthwise separable convolutions. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 1800–1807.
- Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv **2017**, arXiv:1704.04861.
- Virtuoso Spectre Circuit Simulator User Guide; Cadence Design Systems, Inc.: San Jose, CA, USA, 2005. Available online: www.cadence.com (accessed on 12 April 2016).
- An, J.; Oh, S.; Van Nguyen, T.; Min, K.-S. Synapse-Neuron-Aware Training Scheme of Defect-Tolerant Neural Networks with Defective Memristor Crossbars. Micromachines **2022**, 13, 273.
- Jang, J.T.; Ko, D.; Ahn, G.; Yu, H.R.; Jung, H.; Kim, Y.S.; Yoon, C.; Lee, S.; Park, B.H.; Choi, S.J.; et al. Effect of oxygen content of the LaAlO3 layer on the synaptic behavior of Pt/LaAlO3/Nb-doped SrTiO3 memristors for neuromorphic applications. Solid State Electron. **2018**, 140, 139–143.
- Merced-Grafals, E.J.; Dávila, N.; Ge, N.; Williams, R.S.; Strachan, J.P. Repeatable, accurate, and high speed multi-level programming of memristor 1T1R arrays for power efficient analog computing applications. Nanotechnology **2016**, 27, 365202.

**Figure 1.** (**a**) The block diagram of artificial neural networks with input, hidden, and output neurons. (**b**) The memristor crossbars for implementing the neural networks.

**Figure 2.** (**a**) The convolution of a 28 × 28 MNIST image with a 3 × 3 kernel without using sub-image partitioning. (**b**) The memristor crossbar for the full-image convolution without sub-image partitioning. (**c**) The crossbar circuit with parasitic source, line, and neuron resistances. (**d**) The normalized column current with increasing percentage of active rows (%) for 1S-1R and 1T-1R cells. Here, $R_W$ means the line resistance per cell, and the crossbar has 784 cells per column. When $R_W$ = 0.5 Ω and $R_W$ = 1.1 Ω, the normalized column currents saturate rapidly as the percentage of active rows exceeds 25%, meaning that the MAC calculation accuracy is degraded severely when $R_W$ is not zero.

**Figure 3.** (**a**) The convolution of a 28 × 28 MNIST image with a 3 × 3 kernel using sub-image partitioning. (**b**) The memristor crossbar for the sub-image convolution using sub-image partitioning. Here, the borderline rows and columns shared between two neighboring sub-images are included in the crossbar's row count of 81 (= 9 × 9). When the crossbar's column count is calculated, only the output pixels of the convolution need to be considered, so the column count equals the sub-image size of 49 (= 7 × 7).
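The row and column counts quoted for Figure 3b follow directly from the sub-image and kernel sizes; a minimal sketch for stride-1 convolution (the function name is illustrative):

```python
def crossbar_dims(sub_out, kernel):
    """Crossbar size for convolving one sub-image (stride 1).

    Rows cover the whole input patch, including the borderline pixels
    shared with neighboring sub-images; columns cover only the output
    pixels of the sub-image.
    """
    patch = sub_out + kernel - 1                # e.g. 7 + 3 - 1 = 9
    return patch * patch, sub_out * sub_out     # (rows, cols)
```

For a 7 × 7 sub-image and a 3 × 3 kernel, this yields 81 rows and 49 columns, matching the counts in Figure 3b.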

**Figure 4.** (**a**) The sub-image convolution with 3D kernels. (**b**) The sub-image convolution with 2D kernels. (**c**) The sub-image convolution with 1D kernels.

**Figure 5.** The area-efficient mapping method of convolutional neural networks to memristor crossbars using sub-image partitioning.

**Figure 6.** (**a**) The comparison of recognition rate between the 3D and 2D + 1D kernels. Here, FW-FN means floating-point weights and floating-point neurons used in the simulation, and TW-FN means ternary weights and floating-point neurons. (**b**) The comparison of the number of unit crossbars used in the sub-image convolution between the 3D and 2D + 1D kernels. Here, the unit crossbar's size is assumed to be 128 × 128 or 256 × 256.
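The unit-crossbar counts compared in Figure 6b can be sketched with a simple tiling rule: a weight matrix larger than one unit crossbar is partitioned into a grid of unit-sized tiles. This ceiling-based counting is a common assumption and may differ in detail from the paper's exact mapping:

```python
import math

def unit_crossbars(rows, cols, unit=128):
    """Unit crossbars needed to tile a rows x cols weight matrix
    with unit x unit arrays (128 x 128 is one size used in Figure 6b)."""
    return math.ceil(rows / unit) * math.ceil(cols / unit)
```

For example, the 81 × 49 sub-image convolution of Figure 3b fits in a single 128 × 128 unit crossbar, while a 300 × 300 matrix would need a 3 × 3 grid of nine units.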

**Figure 7.** (**a**) The recognition rate with the ratio of 2D + 1D convolution layers varying from 0 to 1. The ratio of convolution layers with 2D + 1D kernels is calculated as the number of 2D + 1D layers divided by the total number of convolution layers. Here, the unit crossbar's size is assumed to be 128 × 128, and floating-point neurons and ternary synaptic weights are used in the neural network's simulation. (**b**) The normalized number of unit crossbars used in the neural networks with varying ratio of 2D + 1D convolution layers. (**c**) One example of the convolutional neural network's architecture, where 7 of the total 14 convolution layers use 2D + 1D kernels and the other 7 use 3D kernels, giving a normalized 2D + 1D ratio of 0.5.

**Figure 8.** (**a**) The memristor circuit with a 1T (transistor)-1R (memristor) cell. The memristor is composed of a top electrode, memristive film, and bottom electrode. (**b**) The memristor's butterfly curves from the experimental data (black box) and the Verilog-A model (red line). (**c**) The simulated waveforms of the memristor circuit [28].

**Table 1.** (a) The neural network architecture of the sub-image convolution using 3D kernels. (b) The neural network architecture of the sub-image convolution using 2D and 1D kernels.

(a)

| Layer # | Type/Stride | Kernel Shape | Input Size |
|---|---|---|---|
| 1 | CONV/S1 | (3 × 3 × 3) × 32 | 32 × 32 × 3 |
| 2 | CONV/S1 | (3 × 3 × 32) × 64 | 32 × 32 × 32 |
| 3 | CONV/S2 | (3 × 3 × 64) × 128 | 32 × 32 × 64 |
| 4 | CONV/S1 | (3 × 3 × 128) × 128 | 16 × 16 × 128 |
| 5 | CONV/S2 | (3 × 3 × 128) × 256 | 16 × 16 × 128 |
| 6 | CONV/S1 | (3 × 3 × 256) × 256 | 8 × 8 × 256 |
| 7 | CONV/S2 | (3 × 3 × 256) × 512 | 8 × 8 × 256 |
| 8–12 | CONV/S1 | (3 × 3 × 512) × 512 | 4 × 4 × 512 |
| 13 | CONV/S2 | (3 × 3 × 512) × 1024 | 4 × 4 × 512 |
| 14 | CONV/S1 | (3 × 3 × 1024) × 1024 | 2 × 2 × 1024 |
| 15 | AVG POOL/S2 | (2 × 2) | 2 × 2 × 1024 |
| 16 | FC | (1024 × 10) | 1024 |

(b)

| Layer # | Type/Stride | Kernel Shape | Input Size |
|---|---|---|---|
| 1 | CONV/S1 | (3 × 3 × 3) × 32 | 32 × 32 × 3 |
| 2 | DW CONV/S1 | (3 × 3 × 1) × 32 | (32 × 32 × 1) × 32 |
|  | PW CONV/S1 | (1 × 1 × 32) × 64 | 32 × 32 × 32 |
| 3 | DW CONV/S2 | (3 × 3 × 1) × 64 | (32 × 32 × 1) × 64 |
|  | PW CONV/S1 | (1 × 1 × 64) × 128 | 16 × 16 × 64 |
| 4 | DW CONV/S1 | (3 × 3 × 1) × 128 | (16 × 16 × 1) × 128 |
|  | PW CONV/S1 | (1 × 1 × 128) × 128 | 16 × 16 × 128 |
| 5 | DW CONV/S2 | (3 × 3 × 1) × 128 | (16 × 16 × 1) × 128 |
|  | PW CONV/S1 | (1 × 1 × 128) × 256 | 8 × 8 × 128 |
| 6 | DW CONV/S1 | (3 × 3 × 1) × 256 | (8 × 8 × 1) × 256 |
|  | PW CONV/S1 | (1 × 1 × 256) × 256 | 8 × 8 × 256 |
| 7 | DW CONV/S2 | (3 × 3 × 1) × 256 | (8 × 8 × 1) × 256 |
|  | PW CONV/S1 | (1 × 1 × 256) × 512 | 4 × 4 × 256 |
| 8–12 | DW CONV/S1 | (3 × 3 × 1) × 512 | (4 × 4 × 1) × 512 |
|  | PW CONV/S1 | (1 × 1 × 512) × 512 | 4 × 4 × 512 |
| 13 | DW CONV/S2 | (3 × 3 × 1) × 512 | (4 × 4 × 1) × 512 |
|  | PW CONV/S1 | (1 × 1 × 512) × 1024 | 2 × 2 × 512 |
| 14 | DW CONV/S1 | (3 × 3 × 1) × 1024 | (2 × 2 × 1) × 1024 |
|  | PW CONV/S1 | (1 × 1 × 1024) × 1024 | 2 × 2 × 1024 |
| 15 | AVG POOL/S2 | (2 × 2) | 2 × 2 × 1024 |
| 16 | FC | (1024 × 10) | 1024 |
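The area saving behind the 2D + 1D (depthwise + pointwise) factorization in Table 1b can be checked by counting weights per layer, which is the standard depthwise-separable argument:

```python
def params_3d(k, c_in, c_out):
    """Weights of a standard 3D convolution layer."""
    return k * k * c_in * c_out

def params_dw_pw(k, c_in, c_out):
    """Weights of a depthwise (k x k x 1 per channel) plus pointwise (1 x 1)
    factorization of the same layer."""
    return k * k * c_in + c_in * c_out

# Layer 14 of Table 1: 3 x 3 kernel, 1024 -> 1024 channels
p3d = params_3d(3, 1024, 1024)       # 9,437,184 weights
pdw = params_dw_pw(3, 1024, 1024)    # 1,057,792 weights
```

For this layer the factorization needs roughly 8.9 times fewer weights, which is why the 2D + 1D kernels map to far fewer unit crossbars in Figure 6b.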

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Oh, S.; An, J.; Min, K.-S.
Area-Efficient Mapping of Convolutional Neural Networks to Memristor Crossbars Using Sub-Image Partitioning. *Micromachines* **2023**, *14*, 309.
https://doi.org/10.3390/mi14020309
