Article

A Convolutional Neural Network with a Wave-Based Convolver

Faculty of Information Technology and Bionics, Peter Pazmany Catholic University, Práter u. 50/A, 1083 Budapest, Hungary
* Author to whom correspondence should be addressed.
Electronics 2023, 12(5), 1126; https://doi.org/10.3390/electronics12051126
Submission received: 20 December 2022 / Revised: 18 February 2023 / Accepted: 19 February 2023 / Published: 25 February 2023

Abstract
In this paper, we demonstrate that physical waves can be used to perform convolutions as part of a state-of-the-art neural network architecture. In particular, we show that the damping of waves, which is unavoidable in a physical implementation, does not diminish their usefulness in performing the convolution operations required in a convolutional neural network (CNN), and the damping only slightly decreases the classification accuracy of the network. These results open the door for wave-based hardware accelerators for CNNs.

1. Introduction

Artificial neural networks (ANNs) have become a staple of machine learning, and they are increasingly applied in embedded systems as well. To enable their application in mobile devices, among other platforms, their computational needs and power consumption have to be significantly reduced. This is crucial because mobile devices have limited computational resources and battery life, yet they are often used for tasks that require real-time processing. Thus, the goal is to create ANNs that are lightweight and efficient, making them well suited for deployment in mobile devices and other resource-constrained environments.
One possible way of enabling low-energy computation is the employment of special-purpose hardware accelerators, which use the physics of waves to naturally compute convolution integrals. One class of such devices is based on surface acoustic waves (SAWs): an SAW-based convolver uses the interference of two counter-propagating waves to perform a convolution through the physics of the system [1]. Another implementation of the device uses spin waves [2], which are, in fact, very amenable to low-power, on-chip implementation for such hardware accelerators [3].
However, in a real physical system (such as a wave-based convolver), dissipation is unavoidable, and waves (especially spin waves) will decay over a certain distance. This will influence the performance and usefulness of the convolver. The purpose of this paper is to evaluate the performance of the convolver in the presence of such decay.
In this paper, we demonstrate and investigate a special kernel convolution which can be the cornerstone of a wave-based convolver device, yielding a fast and energy-efficient building block of a convolutional neural network without a significant decrease in testing accuracy. With our simulations, we also identify the most important factors, such as the attenuation (decay), and their effect on classification accuracy for commonly investigated datasets. These results can help in the construction of a device by identifying the constraints on attenuation under which the device can function with high accuracy.
Nowadays, convolutional neural networks (CNNs) have become a crucial tool for solving various artificial intelligence problems and have been proven to deliver state-of-the-art results across a wide range of applications. In particular, CNNs have shown great success in image processing, video analysis, and natural language processing tasks. These tasks typically involve large amounts of high-dimensional data, such as images and videos, or complex relationships between words and sentences in text data. CNNs have been successful in these tasks because of their ability to automatically learn hierarchical representations of the data and to identify and use the most discriminative features for a given task. As a result, they have become the go-to solution for many researchers and practitioners in these fields and continue to be an active area of research and development.
CNNs have been used very successfully in the development of self-driving cars, for example in traffic sign recognition and identification [4,5,6], navigation [7], and 3D object recognition [8], as well as in agrarian object identification [9] and in the medical field, for example in ECG signal classification and prediction [10], diabetic retinopathy recognition [11], thyroid nodule diagnosis [12], and lung pattern classification for interstitial lung diseases [13]. Their applications have appeared even in mobile phones and embedded devices [14].
The main and most energy-consuming operation of these architectures is convolution, so the optimal implementation of this operation is extremely important.
In the case of mobile and embedded vision applications, energy-saving implementation is also an important consideration. Lightweight deep neural networks (such as MobileNets [15], Xnor-Nets [16], and spiking neural networks [17]) can be used to achieve a reduction in energy consumption. Another possibility is to use a special device (such as FPGA [18] or ASIC devices [19]) which is able to perform the given operation extremely efficiently, and thus the architecture will be faster, and energy consumption will decrease.
However, low-power devices, especially emerging and non-Boolean devices that exploit analogue and nonlinear device characteristics for computation, will not be completely ideal, since the nonlinear dynamics of the device implementing the convolution may also affect the operation itself. That being said, this nonlinear phenomenon is not necessarily a disadvantage, as a neural network requires some nonlinear operation (typically, nonlinear activations such as the rectified linear unit (ReLU) [20] or the scaled exponential linear unit (SeLU) [21] provide these characteristics in the architecture), and Wang et al. [22] showed that nonlinearity can be included in the convolution itself.
Wang et al. introduced kernel convolution (kervolution), which was used to approximate the complex behavior of human perception systems. Kervolution generalizes convolution via kernel functions, and the authors demonstrated that kervolutional neural networks (KNNs) can achieve higher accuracy and faster convergence than baseline convolutional neural networks. Their work represents an important contribution to the field of CNNs, and the use of KNNs in real-world applications holds significant promise [22].
Neural network models containing kervolution can be effectively used in cases of anomaly detection, time series classification [23], and in authorship attribution [24], among other applications. Furthermore, kervolution can be combined with left and right projection layers, thanks to which this model (ProKNN [25]) can be even more effective in certain situations.
In spread-spectrum communications, real-time surface acoustic wave (SAW) convolver devices have long been known. These convolvers have also been applied in programmable matched filtering to improve the signal-to-noise ratio, which was one of the first applications of surface acoustic wave devices and has important potential in many cases [1].
For example, this process has been widely used in radar systems, since it enables the range of the system to be enlarged for a given peak power limitation [26].
Similar to SAW devices, it is conceivable to implement the convolution using spin-wave magnetic devices, which may allow for much lower energy consumption and computation at higher frequencies [27].
Spin-wave computing uses magnetic excitations for computations. Spin-wave majority gates are one of the most prominent device concepts in this field. Linear passive logic gates, which are based on spin-wave interference, are a type of technology that takes the most advantage of the wave computing paradigm and therefore holds the highest promise for future ultra low-power electronics [3].
The spin-wave circuits can also be embedded in complementary metal–oxide–semiconductor (CMOS) circuits, and these complete functional hybrid systems may outperform conventional CMOS circuits since, among other things, they promise ultra low-power operation. Nowadays, the challenges of these spin-wave circuit systems are low-power signal restoration and efficient spin-wave transducers [3].
Furthermore, several methods have been proposed and studied for the development of spin-wave multiplexers and demultiplexers to greatly increase the data transmission capacity and efficiency of spin-wave systems [3].
Therefore, based on the factors described above, in this paper, we introduce a kervolutional neural network, where the kervolution is implemented by surface acoustic waves and the nonlinearities of the kervolutions are based on the characteristic function of magnetic devices.

2. Methods

In this section, we propose a special convolutional neural network architecture that is inspired by physical ideas and does not contain additional classical nonlinear activation functions (such as ReLU or sigmoid functions). Instead, the system maintains nonlinearity through the physical properties of the simulated device, and these characteristics determine the attenuation and saturation of the convolutional or kervolutional layer.
During the implementation of our neural network, the primary consideration was to examine the physical effects that a device, specifically one developed to perform the operation of convolution, may have on an ideal, theoretical artificial neural network.

2.1. One-Dimensional Network

The real-time SAW convolver, which was the starting point in the implementation of our neural network architecture, can perform convolution only on one-dimensional inputs.
Thus, for hardware considerations, we built a one-dimensional convolutional neural network. During our simulations, both one- and two-dimensional datasets were investigated. We converted the 2D input data and the convolutional kernels into one-dimensional vectors and mapped them onto our simulated devices.
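As an illustration of this mapping, the short sketch below flattens a 2D image and a 2D kernel into the one-dimensional vectors that a 1D convolver operates on; the row-major ordering and the helper name are our own assumptions, and the exact mapping used in our experiments can be found in the published source code.

```python
import numpy as np

def flatten_for_device(image_2d, kernel_2d):
    """Flatten a 2D image and a 2D kernel into 1D vectors (row-major order)
    so that they can be fed to a one-dimensional (SAW-like) convolver.
    The helper name and ordering are illustrative assumptions."""
    x = np.asarray(image_2d, dtype=np.float32).ravel()   # e.g., 28x28 -> 784
    w = np.asarray(kernel_2d, dtype=np.float32).ravel()  # e.g., 1x9   -> 9
    return x, w

# Example: an MNIST-sized image and a 1x9 kernel, as used in our network
x, w = flatten_for_device(np.random.rand(28, 28), np.random.rand(1, 9))
print(x.shape, w.shape)  # (784,) (9,)
```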

2.2. Convolution

One of the main parts of a CNN is the convolutional layer. The convolution of functions f and g in one dimension can be described as follows:
$f * g = \int_{-\infty}^{+\infty} f(\tau)\, g(t - \tau)\, d\tau$    (1)
Since our input signal is finite, the value of the function $f$ is zero outside a certain interval (for example, $[0, t]$). The integrand of the convolution integral is therefore also zero outside this interval, and thus the formula can be rewritten as
$[f * g](t) = \int_{0}^{t} f(\tau)\, g(t - \tau)\, d\tau$    (2)
This operation can be implemented by real-time SAW convolvers, such as a three-port elastic SAW convolver (Figure 1) under nonlinear operation [1].
The first port of such a device is the input signal port, the second port is the kernel port, and between these is the third one, which is the output or result port. The input and kernel signals can be invoked at the edges of the device by external excitation, and the magnetic or electrical changes can be read out from the result port.
Using the Euler formula, port 1 can be expressed at time t along the z reference axis as follows:
$s(t, z) = S(t - z/\nu)\, e^{j(\omega_0 t - \beta z)}$    (3)
where $S(t - z/\nu)$ is the signal modulation envelope as a function of the SAW velocity, with $\nu = f \lambda$ and $\beta = 2\pi/\lambda$.
The output of port 2 can be similarly expressed as
$r(t, z) = R(t + z/\nu)\, e^{j(\omega_0 t + \beta z)}$    (4)
where the signs of the $z$ terms are reversed, since the signal propagates in the opposite direction.
In this case, the following waveform can be read out from the output port over the length L of the thin-film metal plate:
$C(t) = P \int_{-L/2}^{+L/2} S(t - z/\nu)\, R(t + z/\nu)\, dz \; e^{j 2 \omega_0 t}$    (5)
where $P$ is a constant that depends on the nonlinear interaction strength. We can use the change of variable $\tau = t - z/\nu$ and reformulate this equation as the following:
$C(t) = M \nu\, e^{j 2 \omega_0 t} \int_{-\infty}^{+\infty} S(\tau)\, R(2t - \tau)\, d\tau$    (6)
where $S$ is the input signal, $R$ is the kernel signal, $M$ is a constant dependent on the strength of the nonlinear interaction, $\nu$ is the velocity of the waves (signals), $j$ is the imaginary unit, and $\omega_0$ is the angular frequency of the signal [1].
Equations (1) and (6) differ in only two factors: the prefactor ($M \nu\, e^{j 2 \omega_0 t}$), which carries the nonlinear interaction strength, and the fact that the argument of the kernel ($R$) contains $2t$ instead of $t$. The reason for this difference (time compression) is that the signals are traveling toward one another (their relative velocity is $2\nu$), and thus the interaction is over in half the time [1].
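For completeness, the change of variable behind Equation (6) can be written out explicitly (a standard substitution step, added here for clarity):

$\tau = t - \dfrac{z}{\nu} \;\Longrightarrow\; z = \nu (t - \tau), \qquad dz = -\nu\, d\tau, \qquad t + \dfrac{z}{\nu} = 2t - \tau,$

$\displaystyle \int_{-L/2}^{+L/2} S\!\left(t - \frac{z}{\nu}\right) R\!\left(t + \frac{z}{\nu}\right) dz = \nu \int_{t - L/(2\nu)}^{\,t + L/(2\nu)} S(\tau)\, R(2t - \tau)\, d\tau,$

and for envelopes confined to the interaction region, the integration limits can be extended to $\pm\infty$, which gives Equation (6) up to the constant prefactor.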
In the calculations, we studied a device that works similar to real-time SAW convolvers, but the wave exhibited strong damping. Therefore, the model is highly applicable to spin-wave-like convolvers, where damping is more significant [3].
In our simulation, which can be considered a baseline, a square signal ($s(t) = A_1 \cos(\omega t)$) and a triangular signal ($r(t) = (1 - t)\, A_2 \cos(\omega t)$) travel opposite each other, and the waves propagate in a nonlinear manner. Square and triangular signals were selected as case studies, since they can be easily described mathematically and depict the effect of convolution fairly well. Reading the signal at the intersection of the waves yielded the convolution of the two input signals. (In fact, one of the input signals had to be inverted in time to obtain the convolution; otherwise, we would obtain the cross-correlation of the signals.) The simulation is illustrated in Figure 2. The output signal was oscillatory, but since its frequency is twice the original frequency of the signals, we could filter the output signal and obtain the convolution result.
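A minimal numerical sketch of this baseline simulation is shown below. Two carrier-modulated envelopes are propagated toward each other on a common spatial grid, multiplied point by point (the nonlinear interaction), and integrated over the device length at every time step; the specific envelope shapes, carrier frequency, and grid sizes are illustrative assumptions rather than the exact values of our simulations.

```python
import numpy as np

# Spatial grid (the "device") and time axis; all values are illustrative.
nz, nt = 512, 512
z = np.linspace(-1.0, 1.0, nz)
t = np.linspace(0.0, 2.0, nt)
dz = z[1] - z[0]
v = 1.0                      # wave velocity
w0 = 2 * np.pi * 40.0        # carrier angular frequency

def square_env(u):           # rectangular envelope on [0, 0.5]
    return ((u >= 0.0) & (u <= 0.5)).astype(float)

def triangle_env(u):         # descending ramp ("triangular") envelope on [0, 0.5]
    return np.where((u >= 0.0) & (u <= 0.5), 1.0 - 2.0 * u, 0.0)

C = np.zeros(nt)
for k, tk in enumerate(t):
    s = square_env(tk - z / v) * np.cos(w0 * (tk - z / v))     # travels in +z
    r = triangle_env(tk + z / v) * np.cos(w0 * (tk + z / v))   # travels in -z
    C[k] = np.sum(s * r) * dz  # multiplicative interaction integrated over z

# C(t) oscillates at twice the carrier frequency; low-pass filtering it
# recovers the (time-compressed) convolution of the two envelopes.
```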

SAW-Based Kervolution

In the physical system, the input signals attenuate over time as they travel further and further in space. Taking this phenomenon into account, we applied exponential attenuation to both the input signal and the kernel.
According to the properties of our physical system, we had to apply saturation after the element-wise multiplication. In fact, we used the following kernel convolution (the $i$th element of the convolution) in our CNN architecture:
$g_i(x) = \langle \phi_i(x_i), \phi_i(w) \rangle$    (7)
where $\langle \cdot, \cdot \rangle$ is the inner product of two vectors with a hyperbolic tangent (that is, $\langle a, b \rangle = \sum_{k=1}^{n} \tanh(a_k b_k)$, where $\tanh$ represents the saturation of the system) and $\phi: \mathbb{R}^n \to \mathbb{R}^n$ is the following nonlinear mapping function:
$\phi_i(x) = e^{-i/a}\, x_i$    (8)
where $i$ is the discrete time and $a$ is the attenuation parameter. Figure 3 depicts the $e^{-i/a}$ function for different values of the $a$ parameter. (This attenuation formula can also be written as $0.999^i\, x_i$, which corresponds to $\phi_i(x)$ with $a = 999$.)
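The sketch below implements this SAW-inspired kervolution for a single 1D channel, directly following Equations (7) and (8); the sliding-window indexing and variable names are ours, and the trainable version used in the network can be found in the published source code.

```python
import numpy as np

def saw_kervolution_1d(x, w, a=9999.0):
    """SAW-inspired kervolution of a 1D signal x with a kernel w.

    Both the input patch and the kernel are attenuated exponentially,
    phi_i(.) = exp(-i / a) * (.)_i (Eq. (8)), and then combined with the
    saturated inner product sum_k tanh(a_k * b_k) (Eq. (7))."""
    x = np.asarray(x, dtype=np.float64)
    w = np.asarray(w, dtype=np.float64)
    n = w.size
    decay = np.exp(-np.arange(n) / a)            # e^{-i/a}, i = 0, ..., n-1
    out = np.empty(x.size - n + 1)
    for i in range(out.size):
        patch = x[i:i + n] * decay               # attenuated input patch
        kern = w * decay                         # attenuated kernel
        out[i] = np.tanh(patch * kern).sum()     # saturated inner product
    return out

# Example: a random signal and a 1x9 kernel with the attenuation used in the paper
y = saw_kervolution_1d(np.random.randn(64), np.random.randn(9), a=9999.0)
```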

3. Results

We started from a simple convolutional neural network, and our goal was to implement an architecture that includes a special convolutional operation which can be accomplished effectively by a physical device. Thus, we introduced physical characteristics into the system to demonstrate the effects of these features. Then, we examined how our architecture (depicted in Figure 4) worked on several one- and two-dimensional datasets. For comparison, we also implemented a reference CNN that was similar to our neural network but used ordinary one-dimensional convolution. We used 1 × 9 kernels and 3 layers (2 layers with 8 kernels and 1 layer with 16 kernels), and after every convolutional layer, we applied ReLU as a nonlinear activation function in the reference CNN model, as shown in Figure 5. For the detailed parameter settings of both network architectures and the training algorithms, please see the source code of our neural network model, which can be found in the following GitHub repository: https://github.com/andfulop/SpinWaveConvolver (accessed on 21 February 2023). The classification accuracy results of these CNNs on various datasets can be found in Table 1.
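For orientation, the sketch below outlines the reference architecture in PyTorch (three 1D convolutional layers with 1 × 9 kernels and 8, 8, and 16 channels, each followed by a ReLU, and a fully connected classifier); padding, the classifier input size, and other details not stated above are our own assumptions, and the exact settings are in the linked repository.

```python
import torch
import torch.nn as nn

class ReferenceCNN1D(nn.Module):
    """Reference 1D CNN: three convolutional layers with 1x9 kernels
    (8, 8, and 16 channels), each followed by ReLU, then a fully
    connected classification layer. Padding and sizes are assumptions."""
    def __init__(self, input_length=784, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 8, kernel_size=9, padding=4), nn.ReLU(),
            nn.Conv1d(8, 8, kernel_size=9, padding=4), nn.ReLU(),
            nn.Conv1d(8, 16, kernel_size=9, padding=4), nn.ReLU(),
        )
        self.classifier = nn.Linear(16 * input_length, num_classes)

    def forward(self, x):                 # x: (batch, 1, input_length)
        return self.classifier(self.features(x).flatten(1))

# Example: a batch of two flattened 28x28 MNIST images treated as 1D signals
logits = ReferenceCNN1D()(torch.randn(2, 1, 784))   # shape: (2, 10)
```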
As a more complex two-dimensional case study, we investigated the well-known MNIST dataset, which contains handwritten digits and has a training set of 60,000 examples and a test set of 10,000 examples. The size of the images was 28 × 28 pixels. Another two-dimensional dataset is Fashion-MNIST, which is an MNIST-like fashion product database with 10 classes that consists of 28 × 28-sized grayscale images, where the number of elements of the training set is 60,000 and the test set has 10,000 examples. The evolution of the classification accuracies on the test set of the MNIST dataset with a traditional convolutional network and a kervolutional network implemented by an SAW convolver can be seen in Figure 6, and the confusion matrices of the trained architectures can be found in Figure 7. The same results for Fashion-MNIST can be observed in Figure 8 and Figure 9, respectively.
We examined one-dimensional datasets as well. One of those is the Smartphone-Based Recognition of Human Activities and Postural Transitions Data Set Version 2.1 (HADB [28]). This consists of a smartphone’s accelerometer and gyroscope signals during 12 different activities (standing, walking, walking downstairs and upstairs, laying, etc.) for 30 subjects. The training set contains more than 7700 samples, while the test set contains roughly 3100 samples. The test accuracies on this dataset during training are depicted in Figure 10 and the confusion matrices of the trained architectures can be found in Figure 11.
Another examined one-dimensional database is the Ozone Level Detection Data Set [29]. We used the one-hour peak set from this dataset. The samples contain wind speed values at various moments and temperature values measured at different times as well. These samples can be categorized into two classes: the first one is the normal day class, and the second one is the ozone day class. The dataset has 2536 instances, and we selected the last 500 as an independent test set. The classification accuracy results for this dataset can be found in Table 1, along with the accuracy results for the previously mentioned datasets. As can be seen from the results in this table, the same network provided different mean accuracies on different problems, ranging from 77 to 92% depending on the complexity of the exact task. One can observe an approximately 6% performance drop in almost all cases (except the OZONE dataset), and this drop was independent of the original accuracy of the reference network. This demonstrates that an energy-efficient SAW convolver could provide a viable implementation in certain problems where this 6% accuracy drop is acceptable.
The earlier results demonstrate that one can substitute convolution with kervolution for a 6% drop in accuracy, which could enable the energy-efficient implementation of simple neural networks with SAW convolvers. Unfortunately, signals propagate with infinite speed and without attenuation or noise only in an ideal neural network. To demonstrate the practical usability of an SAW, we investigated how an SAW with different attenuation parameters would perform on the MNIST and HADB datasets. The test accuracies for both datasets can be seen in Figure 12. As these plots demonstrate, if the attenuation parameter ($a$) was 9999 or larger, then the network reached an accuracy similar to that reported in Table 1, and a decrease from 99,999 to 9999 did not have a significant effect on the classification accuracy of the network. With a further decrease, as in the case of $a = 999$, the accuracy of our implementation dropped significantly. This can help in the physical design of the SAW convolver, as one can select materials and frequencies that ensure this small level of attenuation.
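To make the sensitivity to the attenuation parameter more tangible, the short calculation below (our own illustration, assuming the discrete time index runs over a flattened 28 × 28 input) evaluates the attenuation factor $e^{-i/a}$ at the last sample for the three values of $a$ considered above.

```python
import numpy as np

# Attenuation factor e^{-i/a} after i time steps for the three parameter
# values studied above. The choice i = 783 (last sample of a flattened
# 28x28 image) is our own illustrative assumption.
i = 783
for a in (999.0, 9999.0, 99999.0):
    print(f"a = {a:>8.0f}: exp(-i/a) = {np.exp(-i / a):.3f}")
# a =      999: exp(-i/a) = 0.457
# a =     9999: exp(-i/a) = 0.925
# a =    99999: exp(-i/a) = 0.992
```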

4. Conclusions

In this paper, we introduced a special convolutional neural network with a novel kernel convolution, which can be implemented with a wave-based device based on the principles of surface acoustic wave convolvers. We tested our neural network architecture on one- and two-dimensional datasets and compared it with similar network implementations containing normal convolution. The network accuracy decreased by an average of 5% in the case of kervolutions, but these operations can enable low-energy implementation on embedded devices. The proposed framework achieves similar or slightly worse accuracy, but it has the potential to be implemented in a much faster and more energy-efficient device. Our results also reveal some of the required properties of future magnetic devices: to ensure high accuracy, the attenuation factor cannot be lower than $e^{-i/999}$.
In this work, we used a very simple convolutional neural network architecture to examine the capability of our kernel convolution. Thus, in the future, we want to investigate a more complex and sophisticated network architecture that can be applied to a wide range of real-world problems in various fields. Additionally, a more accurate understanding of the physical parameters and the practical implementation of the physical device are also important parts of our future plans. To achieve these objectives, we plan to conduct further research and experimentation in these areas.

Author Contributions

Conceptualization, A.F., G.C. and A.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work has received financial support from the Horizon 2020 Framework Program of the European Commission under FET-Open grant agreement no. 899646 (k-NET).

Institutional Review Board Statement

Not relevant.

Informed Consent Statement

Not relevant.

Data Availability Statement

The datasets supporting the findings of this study are openly available; please see the references for more information.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Campbell, C. Surface Acoustic Wave Devices and Their Signal Processing Applications; Elsevier: Amsterdam, The Netherlands, 2012.
2. Vasyuchka, V.I.; Melkov, G.A.; Moiseienko, V.A.; Prokopenko, A.V.; Slavin, A.N. Correlation receiver of below-noise pulsed signals based on parametric interactions of spin waves in magnetic films. J. Magn. Magn. Mater. 2009, 321, 3498–3501.
3. Mahmoud, A.; Ciubotaru, F.; Vanderveken, F.; Chumak, A.V.; Hamdioui, S.; Adelmann, C.; Cotofana, S. Introduction to spin wave computing. J. Appl. Phys. 2020, 128, 161101.
4. Farag, W. Recognition of traffic signs by convolutional neural nets for self-driving vehicles. Int. J. Knowl.-Based Intell. Eng. Syst. 2018, 22, 205–214.
5. Li, W.; Li, D.; Zeng, S. Traffic sign recognition with a small convolutional neural network. In Proceedings of the IOP Conference Series: Materials Science and Engineering, Kazimierz Dolny, Poland, 21–23 November 2019; Volume 688, p. 044034.
6. Aghdam, H.H.; Heravi, E.J.; Puig, D. A practical and highly optimized convolutional neural network for classifying traffic signs in real-time. Int. J. Comput. Vis. 2017, 122, 246–269.
7. Do, T.D.; Duong, M.T.; Dang, Q.V.; Le, M.H. Real-time self-driving car navigation using deep neural network. In Proceedings of the 2018 4th International Conference on Green Technology and Sustainable Development (GTSD), Ho Chi Minh City, Vietnam, 23–24 November 2018; pp. 7–12.
8. Singh, R.D.; Mittal, A.; Bhatia, R.K. 3D convolutional neural network for object recognition: A review. Multimed. Tools Appl. 2019, 78, 15951–15995.
9. Chechliński, Ł.; Siemiątkowska, B.; Majewski, M. A system for weeds and crops identification—Reaching over 10 fps on Raspberry Pi with the usage of MobileNets, DenseNet and custom modifications. Sensors 2019, 19, 3787.
10. Caesarendra, W.; Hishamuddin, T.A.; Lai, D.T.C.; Husaini, A.; Nurhasanah, L.; Glowacz, A.; Alfarisy, G.A.F. An embedded system using convolutional neural network model for online and real-time ECG signal classification and prediction. Diagnostics 2022, 12, 795.
11. Pratt, H.; Coenen, F.; Broadbent, D.M.; Harding, S.P.; Zheng, Y. Convolutional neural networks for diabetic retinopathy. Procedia Comput. Sci. 2016, 90, 200–205.
12. Ma, J.; Wu, F.; Zhu, J.; Xu, D.; Kong, D. A pre-trained convolutional neural network based method for thyroid nodule diagnosis. Ultrasonics 2017, 73, 221–230.
13. Anthimopoulos, M.; Christodoulidis, S.; Ebner, L.; Christe, A.; Mougiakakou, S. Lung pattern classification for interstitial lung diseases using a deep convolutional neural network. IEEE Trans. Med. Imaging 2016, 35, 1207–1216.
14. Li, H.; Wang, H.; Liu, L.; Gruteser, M. Automatic unusual driving event identification for dependable self-driving. In Proceedings of the 16th ACM Conference on Embedded Networked Sensor Systems, Shenzhen, China, 4–7 November 2018; pp. 15–27.
15. Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv 2017, arXiv:1704.04861.
16. Bulat, A.; Tzimiropoulos, G. XNOR-Net++: Improved binary neural networks. arXiv 2019, arXiv:1909.13863.
17. Yamazaki, K.; Vo-Ho, V.K.; Bulsara, D.; Le, N. Spiking neural networks and their applications: A review. Brain Sci. 2022, 12, 863.
18. Zhao, R.; Ng, H.C.; Luk, W.; Niu, X. Towards efficient convolutional neural network for domain-specific applications on FPGA. In Proceedings of the 2018 28th International Conference on Field Programmable Logic and Applications (FPL), Dublin, Ireland, 27–31 August 2018; pp. 147–1477.
19. Boutros, A.; Yazdanshenas, S.; Betz, V. You cannot improve what you do not measure: FPGA vs. ASIC efficiency gaps for convolutional neural network inference. ACM Trans. Reconfig. Technol. Syst. (TRETS) 2018, 11, 1–23.
20. Khalife, S.; Basu, A. Neural networks with linear threshold activations: Structure and algorithms. In Proceedings of the International Conference on Integer Programming and Combinatorial Optimization, Eindhoven, The Netherlands, 27–29 June 2022; pp. 347–360.
21. Jahan, I.; Ahmed, M.F.; Ali, M.O.; Jang, Y.M. Self-gated rectified linear unit for performance improvement of deep neural networks. ICT Express 2022.
22. Wang, C.; Yang, J.; Xie, L.; Yuan, J. Kervolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 31–40.
23. Ammann, O.; Michau, G.; Fink, O. Anomaly detection and classification in time series with kervolutional neural networks. arXiv 2020, arXiv:2005.07078.
24. Suman, C.; Raj, A.; Saha, S.; Bhattacharyya, P. Authorship attribution of microtext using capsule networks. IEEE Trans. Comput. Soc. Syst. 2021, 9, 1038–1047.
25. Mulimani, M.; Nandi, R.; Koolagudi, S.G. Acoustic scene classification using projection kervolutional neural network. Multimed. Tools Appl. 2022, 1–11.
26. Morgan, D. Surface Acoustic Wave Filters: With Applications to Electronic Communications and Signal Processing; Academic Press: Cambridge, MA, USA, 2010.
27. Sasaki, R.; Nii, Y.; Onose, Y. Magnetization control by angular momentum transfer from surface acoustic wave to ferromagnetic spin moments. Nat. Commun. 2021, 12, 2599.
28. Reyes-Ortiz, J.L.; Oneto, L.; Samà, A.; Parra, X.; Anguita, D. Transition-aware human activity recognition using smartphones. Neurocomputing 2016, 171, 754–767.
29. Dua, D.; Graff, C. UCI Machine Learning Repository. 2017. Available online: https://archive.ics.uci.edu/ml/index.php (accessed on 15 January 2023).
Figure 1. This figure depicts a primitive three-port elastic SAW convolver. Two signals travel in opposite directions from the two input ports of the device (at the left and right edges of the device), and the convolved version of the two signals can be extracted at the integration area (in the middle of the device). This example shows how a physical system can be used to implement a complex operation in an energy-efficient manner.
Figure 2. In the first row, $s(t)$ is the square signal, and next to this function is the triangular signal $r(t)$ (inverted in time). These are traveling opposite each other. The first plot in the second row is the raw result, which is read from the collision of the above signals. The last plot is the frequency-filtered result, which had a doubled frequency compared with the original signals.
Figure 3. One of the parameters of our simulation was attenuation. The propagating waves were attenuated exponentially with the function $e^{-i/a}$, where $i$ is the time and $a$ is the attenuation parameter. Here, we plotted the function for attenuation parameters of 999, 9999, and 99,999. Since the dependence of attenuation on this parameter is exponential, it can heavily affect the accuracy of a neural network.
Figure 4. The architecture of our neural network model with SAW kervolution. The convolutions and ReLUs were substituted with kervolutions in this variant. Please note that the number of layers, channels, and parameters were the same in both network variants.
Figure 5. The architecture of our reference neural network model. Our network contained three convolutional layers with ReLUs followed by a fully connected layer. This simple four-layered architecture is capable of solving simple classification tasks.
Figure 6. This figure plots the mean, minimal, and maximal classification accuracies during training, averaged from five independent training sessions on the MNIST dataset. The top image (a) depicts the classification accuracies with the reference convolutional neural network (baseline), while the lower image (b) plots the same results using an SAW convolution-based kervolutional neural network with an attenuation parameter of 99,999. The blue color is the maximum accuracy, the yellow color is the mean accuracy, and the red line shows the minimum accuracy. As can be seen, there were no significant differences between the two results.
Figure 7. The confusion matrix of a trained architecture on the test set of the MNIST dataset in the case of the reference CNN model (a) and in the case of our kervolutional neural network model (b). As one can see, the two confusion matrices were qualitatively similar, and although there were more misclassifications in the case of the SAW convolver (which was natural since it had a slightly lower accuracy), the distribution of the errors was similar.
Figure 8. This figure plots the mean, minimal, and maximal classification accuracies during training, averaged from five independent training sessions on the Fashion-MNIST dataset. The top image (a) depicts the classification accuracies with the reference convolutional neural network (baseline), while the lower image (b) plots the same results using an SAW convolution-based kervolutional neural network with an attenuation parameter of 99,999. The blue color is the maximum accuracy, the yellow color is the mean accuracy, and the red line shows the minimum accuracy. As can be seen, there were no significant differences between the two results.
Figure 9. The confusion matrix of a trained architecture on the test set of the Fashion-MNIST dataset in the case of the reference CNN model (a) and in the case of our kervolutional neural network model (b). As one can see, the two confusion matrices were qualitatively similar, and although there were more misclassifications in the case of the SAW convolver (which was natural since it had a slightly lower accuracy), the distribution of the errors was similar. (The number labels mean the following: 0 = T-shirt, 1 = trouser, 2 = pullover, 3 = dress, 4 = coat, 5 = sandal, 6 = shirt, 7 = sneaker, 8 = bag, and 9 = ankle boot.)
Figure 10. This figure plots the mean, minimal, and maximal classification accuracies during training, averaged from five independent training sessions on the HADB dataset. The top image (a) depicts the classification accuracies with the reference convolutional neural network (baseline), while the lower image (b) plots the same results using an SAW convolution-based kervolutional neural network with an attenuation parameter of 99,999. The blue color is the maximum accuracy, the yellow color is the mean accuracy, and the red line shows the minimum accuracy. As can be seen, there were no significant differences between the two results.
Figure 11. The confusion matrix of a trained architecture on the test set of the HADB dataset in the case of the reference CNN model (a) and in the case of our kervolutional neural network model (b). As one can see, the two confusion matrices were qualitatively similar, and although there were more misclassifications in the case of the SAW convolver (which was natural since it had a slightly lower accuracy), the distribution of the errors was similar. (The number labels mean the following: 0 = walking, 1 = walking upstairs, 2 = walking downstairs, 3 = sitting, 4 = standing, 5 = laying, 6 = stand to sit, 7 = sit to stand, 8 = sit to lie, 9 = lie to sit, 10 = stand to lie, and 11 = lie to stand.)
Figure 12. The training results of MNIST (a) and HADB (b) classification with different attenuation parameters ($a$). For the blue curves, $a$ is 9999, and the red curves show the results for $a = 999$.
Table 1. This table displays the test accuracies of a traditional convolutional network (as the reference) and our method using an SAW convolver in the different columns. The rows contain the accuracies on four different datasets. As can be seen from the results, the same network provided different mean accuracies for different problems, ranging from 77 to 92% depending on the complexity of the exact task. One can observe an approximately 6% performance drop in almost all cases (except the OZONE dataset), and this drop was independent from the original accuracy of the reference network.
Dataset          Reference Network          Network with SAW Kervolution
                 Mean        Max            Mean        Max
MNIST            92.61%      96.52%         86.51%      93.58%
Fashion-MNIST    77.84%      83.01%         72.87%      79.32%
HADB             88.43%      91.71%         82.11%      88.89%
OZONE            99.15%      99.2%          99.07%      99.4%