Handwritten Digits Recognition Based on a Parallel Optoelectronic Time-Delay Reservoir Computing System

Yue, Dianzuo; Hou, Yushuang; Hu, Chunxia; Zang, Cunru; Kou, Yingzhe

doi:10.3390/photonics10030236

Open Access Communication

by Dianzuo Yue ¹, Yushuang Hou ¹,²,*, Chunxia Hu ³, Cunru Zang ¹ and Yingzhe Kou ¹

¹ School of Mathematics and Information Science & Technology, Hebei Normal University of Science and Technology, Qinhuangdao 066004, China
² School of Science, Inner Mongolia University of Science and Technology, Baotou 014010, China
³ Chongqing College of Electronic Engineering, Chongqing 401331, China
* Author to whom correspondence should be addressed.

Photonics 2023, 10(3), 236; https://doi.org/10.3390/photonics10030236

Submission received: 26 January 2023 / Revised: 14 February 2023 / Accepted: 17 February 2023 / Published: 22 February 2023

(This article belongs to the Special Issue Artificial Intelligence and Machine Learning in Photonics)

Abstract

In this work, the performance of an optoelectronic time-delay reservoir computing system on a handwritten digit recognition task is numerically investigated, and a scheme to improve the recognition speed using multiple parallel reservoirs is proposed. By comparing four image injection methods based on a single time-delay reservoir, we find that when the histogram of oriented gradients (HOG) features of the digit image are injected, the accuracy rate (AR) is relatively high and is less affected by the offset phase. To improve the recognition speed, we construct a parallel time-delay reservoir system comprising multiple reservoirs, where each reservoir processes part of the HOG features of one image. Based on 6 parallel reservoirs, each possessing 100 virtual nodes, the AR can reach about 97.8%, and the reservoir processing speed can reach about 1 × 10⁶ digits per second. Meanwhile, the parallel reservoir system shows strong robustness to parameter mismatch between the reservoirs.

Keywords:

handwritten digits recognition; optoelectronic; time-delay reservoir

1. Introduction

Nowadays, artificial intelligence (AI) technology plays an irreplaceable role in numerous fields. One of the important applications is static image recognition, such as handwritten digits recognition (HWDR) [1], vehicle perception, medical image analysis [2], and facial recognition [3]. The widespread application of AI benefits from the improvement of neuro-inspired algorithms and the rapid progress of digital computers in computing ability. However, the current efficiency of conventional digital computers when running neuro-inspired algorithms such as recurrent neural networks (RNNs) is still not satisfactory because these algorithms consume huge computing resources in training procedures [4]. In order to improve the efficiency of AI equipment, researchers have been working to find a neuro-inspired algorithm that possesses a simple training process, low energy consumption, and high performance. Reservoir computing (RC), which emerged in recent years, can meet these conditions and can be implemented in nonlinear systems. RC has been considered one of the promising neural algorithms to further promote the development of AI [5].

RC is a general term for the echo state network [6] and the liquid state machine [7]. A typical RC system can be divided into an input layer, a reservoir (a dynamic network composed of a large number of interconnected nonlinear nodes), and an output layer. RC uses randomly generated and fixed input weights and reservoir weights to reduce the training difficulty. Since there is no need to change the internal connections of the system, RC is much easier to implement in physical hardware than traditional neuromorphic computing schemes [8,9]. In 2011, the implementation of RC was greatly simplified by using a time-delay system as the reservoir, where the reservoir states (termed virtual node states) are constituted by the states of a single nonlinear node sampled multiple times within a delay cycle [10]. Since then, RC based on the time-delay system (TDRC) has been implemented in electrical circuits [11], optoelectronic systems [12,13,14,15], and all-optical systems [16,17,18]. Based on the serial input and output mechanism, TDRC shows excellent performance for processing time-related tasks such as time series prediction, waveform recognition, and digital speech recognition [19,20]. For instance, in 2017, Larger et al. implemented a photonic reservoir computing system based on optoelectronic telecom devices, which achieved a classification speed of a million words per second for speech-recognition tasks [21]. In 2021, Zhong et al. obtained a chaotic time series synchronized with three dynamical systems based on an all-optical RC formed by semiconductor lasers (SLs) under optical feedback [22]. In the same year, Dai et al. numerically investigated the performance of optoelectronic RC for the classification of IQ-modulated signals [23]. In 2022, Han et al. proposed an approach for simultaneous modulation format identification and optical signal-to-noise monitoring in optical communication based on an optoelectronic RC [24].
However, when processing tasks such as image recognition, the performance of TDRC needs to be improved. For instance, the accuracy rate (AR) of recognizing handwritten digits in [25] using an optoelectronic TDRC is about 83%, which is significantly lower than the AR when using a convolutional neural network (CNN) [26] or an echo state network [27,28] based on parallel data processing mechanisms. Therefore, an in-depth investigation on improving the AR and recognition speed of static images based on TDRC is of some significance for promoting the application of TDRC in the AI field.

In this work, we numerically investigate the performance of an optoelectronic TDRC system for performing HWDR tasks. Based on a single time-delay reservoir, we first compare four different image input methods including the direct input of a full image, input of a trimmed image, input of 2×-downscaled features, and input histograms of oriented gradient (HOG) features. We found that under optimized parameters, the TDRC with 600 virtual nodes can achieve an AR of 97% with input HOG features. Then, we increase the number of parallel time-delay reservoirs used in the TDRC system to increase the data processing rate. Each reservoir is used for processing a part of the HOG features, and all the reservoir states are combined for training and testing. The results show that based on 6 reservoirs (each reservoir contains 100 nodes), the TDRC system can achieve an AR of 97.8% with a reservoir processing speed of 1 × 10⁶ digits per second.

2. System Model

Figure 1a,b shows the schematic diagram of the system setup and the operation procedure of the optoelectronic RC system for processing an HWDR task. The system uses a semiconductor laser as the light source to output continuous wave P₀. A Mach–Zehnder intensity modulator (MZM) is used as the nonlinear node to inject the task information into the reservoir. Under the continuous wave injection of P₀, the output optical intensity P(t) of MZM can be described as:

P(t) = P_{0} \cos^{2}\left(\frac{\pi}{2} \frac{m(t)}{V_{\pi}} + \phi\right) \qquad (1)

where m(t) is the modulation voltage applied to the MZM and V_π is the half-wave voltage of the MZM. φ denotes the offset phase adjusted by the bias voltage of the MZM. After passing through a variable attenuator (VA), the optical signal P(t) is converted into electrical signal P_out(t) using a photo detector (PD), which can be approximated as a linear transformation of P(t), and the amplitude of P_out(t) can be adjusted using the VA. The electrical signal P_out(t) passes through an amplifier, a high-pass filter (HPF), and a low-pass filter (LPF). Then, the output signal x(t) of the HPF is added with the input data γI(t) to modulate the MZM, where γ is an input scaling factor. As shown in Figure 1b, in the optoelectronic feedback loop, the optoelectronic conversion efficiency and electronic gain can be described as a feedback coefficient β, and the delay time of the feedback loop is τ. To simplify the analysis, the summed signal (γI(t) + x(t − τ)) applied to the MZM is expressed in a normalized unit as γI(t) + x(t − τ) ≡ πm(t)/2V_π. Therefore, the signal input to the LPF is described as βcos²[x(t − τ) + φ + γI(t)]. The LPF and HPF constitute a band-pass filter, where τ_L and τ_H describe the characteristic time of the LPF and HPF, respectively. Correspondingly, the cut-off frequency and cut-on frequency of LPF and HPF are 1/(2πτ_L) and 1/(2πτ_H), respectively. The transfer function in the frequency domain of the band-pass filter can be described as:

H(s) = \frac{s \tau_{H}}{(1 + s \tau_{L})(1 + s \tau_{H})} \qquad (2)

where s is the complex Laplace variable. For the detailed derivation of the filter from the frequency domain to the time domain, see [29].
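As a rough numerical illustration of Equations (1) and (2), the sketch below evaluates the MZM nonlinearity and the band-pass response. The function names are hypothetical, the normalization P₀ = V_π = 1 is an assumption made only for illustration, and the default time constants τ_L = 10 ns and τ_H = 0.5 μs are the values quoted later in Section 3.

```python
import numpy as np

def mzm_output(m, p0=1.0, v_pi=1.0, phi=0.0):
    """Eq. (1): MZM output power for modulation voltage m (p0, v_pi normalized here)."""
    return p0 * np.cos(0.5 * np.pi * m / v_pi + phi) ** 2

def bandpass_response(f, tau_l=10e-9, tau_h=0.5e-6):
    """Eq. (2) evaluated on the imaginary axis, s = j*2*pi*f (f in Hz)."""
    s = 1j * 2.0 * np.pi * np.asarray(f, dtype=float)
    return s * tau_h / ((1.0 + s * tau_l) * (1.0 + s * tau_h))

def corner_frequencies(tau_l=10e-9, tau_h=0.5e-6):
    """Cut-on frequency of the HPF and cut-off frequency of the LPF, in Hz."""
    return 1.0 / (2.0 * np.pi * tau_h), 1.0 / (2.0 * np.pi * tau_l)

f_on, f_off = corner_frequencies()
# Geometric-mean frequency, where the pass band response is largest.
f_mid = 1.0 / (2.0 * np.pi * np.sqrt(10e-9 * 0.5e-6))
print(f_on, f_off, abs(bandpass_response(f_mid)))
```

With these time constants the two corner frequencies are roughly 0.32 MHz and 15.9 MHz, and the mid-band gain equals τ_H/(τ_L + τ_H), slightly below 1.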

Without I(t), this system constitutes a common optoelectronic feedback loop, which has been widely investigated for generating chaotic signals [30], forecasting high-dimensional chaotic dynamics [31], and performing chaos-based communication [32]. In recent years, this system has been successfully applied in the TDRC field for processing time series [33,34]. Under external information injection, the dynamics of this optoelectronic TDRC system can be expressed as [33]:

\tau_{L} \frac{d x(t)}{d t} = -\left(1 + \frac{\tau_{L}}{\tau_{H}}\right) x(t) - v(t) + \beta \cos^{2}\left[x(t - \tau) + \phi + \gamma I(t)\right] \qquad (3)

\tau_{H} \frac{d v(t)}{d t} = x(t) \qquad (4)

where x(t) is the reservoir output signal. v(t) is a supplementary variable derived from the necessary filtering feature of the optoelectronic amplified process [21].
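The link between the transfer function of Equation (2) and the time-domain model of Equations (3) and (4) can be sketched as follows. This is a short outline under the assumption that the filter input is f(t) = β cos²[x(t − τ) + φ + γI(t)] and that all initial conditions are zero; the full derivation is the one referred to in [29].

```latex
% Start from X(s) = H(s) F(s) and divide both sides by s\tau_H:
X(s)\,\frac{(1 + s\tau_L)(1 + s\tau_H)}{s\tau_H} = F(s)
\;\Longrightarrow\;
\left(1 + \frac{\tau_L}{\tau_H}\right) X(s) + s\tau_L X(s) + \frac{1}{s\tau_H} X(s) = F(s)

% Back in the time domain (s -> d/dt, 1/s -> integration):
\left(1 + \frac{\tau_L}{\tau_H}\right) x(t) + \tau_L \frac{dx(t)}{dt}
  + \frac{1}{\tau_H}\int_{0}^{t} x(\xi)\,d\xi = f(t)

% Introduce the auxiliary variable v(t):
v(t) = \frac{1}{\tau_H}\int_{0}^{t} x(\xi)\,d\xi
\;\Longrightarrow\;
\tau_H \frac{dv(t)}{dt} = x(t), \qquad
\tau_L \frac{dx(t)}{dt} = -\left(1 + \frac{\tau_L}{\tau_H}\right) x(t) - v(t) + f(t)
```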

In this work, we use the Modified National Institute of Standards and Technology (MNIST) database to evaluate the performance of this TDRC on an HWDR task. The MNIST database contains 70,000 images of handwritten digits, where 60,000 images form a training set, and another 10,000 images form a testing set [35]. All images in the MNIST database have been tailored to a standard size of 28 × 28 pixels, and the pixel values have been normalized to [0,1]. For performing this task, the image needs to be preprocessed and converted into a time series to meet the serial input mechanism of TDRC. In this work, we compare four relatively simple preprocessing methods, i.e., inputting the full image, inputting the trimmed image, extracting 2×-downscaled features, and extracting HOG features. For the case of inputting a full image, as shown in Figure 2a, the pixel matrix with dimensions of 28 × 28 is directly converted into a row vector u with 784 elements. For the case of inputting a trimmed image, we remove the unused border area of the image to form a smaller image with 20 × 16 pixels [36]. As shown in Figure 2b, the vector u of the trimmed image contains 320 elements. For the case of extracting 2×-downscaled features, the trimmed image (20 × 16 pixels) in Figure 2b is first divided into 80 small zones, each consisting of 2 × 2 pixels [37]. Then, as shown in Figure 2c, the mean value of each zone is extracted, and these values form a vector u with 80 elements. For the case of extracting HOG features, we calculate the gradient magnitude g = (g_x² + g_y²)^{1/2} of the images using two spatial filters, F_x = [–1, 0, 1] and F_y = [–1, 0, 1]^T, where g_x and g_y are the horizontal and vertical gradients, respectively. The orientation within [0°, 180°] is obtained as α = arctan(g_y/g_x). As shown in Figure 2d, the image is divided into 16 cells (each with 7 × 7 pixels), and the gradient values of each cell are accumulated according to their orientations with a bin interval of 20°, giving 9 orientation bins per cell. Finally, block normalization over regions of 2 × 2 cells yields a feature vector u consisting of 324 elements.
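The four preprocessing methods can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names are hypothetical, the trimmed window is assumed to be a centered crop (the text does not give the exact offsets), and the HOG variant uses hard orientation binning with L2 block normalization over 2 × 2 cells shifted by one cell, which is one way to arrive at 3 × 3 blocks × 4 cells × 9 bins = 324 elements.

```python
import numpy as np

def full_image_vector(img):
    """28 x 28 image -> row vector with 784 elements."""
    return np.asarray(img, dtype=float).ravel()

def trimmed_image(img, rows=20, cols=16):
    """Crop to 20 x 16 pixels; a centered window is assumed here."""
    img = np.asarray(img, dtype=float)
    r0 = (img.shape[0] - rows) // 2
    c0 = (img.shape[1] - cols) // 2
    return img[r0:r0 + rows, c0:c0 + cols]

def downscaled_2x_vector(img):
    """Mean of each 2 x 2 zone of the trimmed image -> 80 elements."""
    t = trimmed_image(img)
    return t.reshape(10, 2, 8, 2).mean(axis=(1, 3)).ravel()

def hog_vector(img, cell=7, bins=9, eps=1e-6):
    """Simplified HOG: [-1, 0, 1] gradients, 9 unsigned bins, 2 x 2-cell blocks."""
    img = np.asarray(img, dtype=float)
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, 1:-1] = img[:, 2:] - img[:, :-2]      # horizontal filter F_x
    gy[1:-1, :] = img[2:, :] - img[:-2, :]      # vertical filter F_y
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 180.0
    idx = np.minimum((ang // (180.0 / bins)).astype(int), bins - 1)
    n = img.shape[0] // cell                    # 4 cells per side
    hist = np.zeros((n, n, bins))
    for r in range(n):
        for c in range(n):
            sl = (slice(r * cell, (r + 1) * cell), slice(c * cell, (c + 1) * cell))
            hist[r, c] = np.bincount(idx[sl].ravel(), weights=mag[sl].ravel(),
                                     minlength=bins)
    blocks = []
    for r in range(n - 1):                      # blocks shifted by one cell
        for c in range(n - 1):
            b = hist[r:r + 2, c:c + 2].ravel()
            blocks.append(b / np.sqrt(np.sum(b ** 2) + eps ** 2))
    return np.concatenate(blocks)

image = np.random.default_rng(0).random((28, 28))   # stand-in for an MNIST digit
print(full_image_vector(image).size, trimmed_image(image).size,
      downscaled_2x_vector(image).size, hog_vector(image).size)
```

Library implementations of HOG usually add interpolated voting between neighboring bins; the hard binning above is kept only for brevity.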

After preprocessing, the vector u containing m elements is multiplied by an input weight matrix W_in with dimensions of m × N to form a sequence I(t), where N is the number of the reservoir’s virtual nodes. In this work, the values of W_in are randomly drawn from {−1, 1}. In the readout layer, the reservoir states x(t) are collected by sampling the output of the LPF at an interval of θ. Then, we train the output weight W_out using a ridge regression method, i.e., W_out = Y^target X^T (XX^T + λI)^−1, where W_out is the readout weight and X is the state matrix. Y^target is the target matrix that is built according to the labels of the training set. We set Y^target with 10 rows corresponding to the 10 digits 0~9, where the elements of the row corresponding to the target digit are set to 1, and the other elements are all set to 0. λ is the regularization coefficient, and I (in this expression only) is the identity matrix. After the training procedure, for new input images, the RC system outputs the recognition results Y^output = W_out X. If the training is successful, Y^output should be a matrix similar to Y^target. Then, the elements of each row in Y^output are summed, and the index of the row with the largest sum is taken as the recognition result, a procedure called winner-takes-all [38]. The performance of the system is evaluated using the AR, i.e., the ratio of the number of correct recognitions to the size of the test set.
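The masking, ridge-regression training, and winner-takes-all readout described above can be sketched as follows. All function names are hypothetical, and the sketch assumes that each image contributes a single column of the state matrix X, in which case the row-sum decision reduces to picking the largest element of the corresponding output column. The linear system is solved directly instead of forming the matrix inverse, which is numerically equivalent here because XX^T + λI is symmetric.

```python
import numpy as np

def make_mask(m, n_nodes, rng):
    """Input weight matrix W_in (m x N) with entries drawn from {-1, 1}."""
    return rng.choice([-1.0, 1.0], size=(m, n_nodes))

def one_hot(labels, n_classes=10):
    """Target matrix: one row per digit, one column per image."""
    y = np.zeros((n_classes, labels.size))
    y[labels, np.arange(labels.size)] = 1.0
    return y

def train_readout(x, y_target, lam):
    """W_out = Y X^T (X X^T + lam I)^-1, computed with a linear solve."""
    a = x @ x.T + lam * np.eye(x.shape[0])
    return np.linalg.solve(a, x @ y_target.T).T

def winner_takes_all(w_out, x):
    """Index of the largest output for every column of X."""
    return np.argmax(w_out @ x, axis=0)

def accuracy_rate(pred, labels):
    return float(np.mean(pred == labels))

# Toy usage with synthetic states (NOT reservoir outputs, NOT MNIST data).
rng = np.random.default_rng(0)
w_in = make_mask(324, 100, rng)
i_seq = rng.random(324) @ w_in                 # masked input for one image, N values
prototypes = rng.normal(size=(20, 10))
labels = rng.integers(0, 10, size=300)
states = prototypes[:, labels] + 0.05 * rng.normal(size=(20, 300))
w_out = train_readout(states, one_hot(labels), lam=1e-3)
print(i_seq.shape, accuracy_rate(winner_takes_all(w_out, states), labels))
```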

3. Results

To simulate the dynamics of the optoelectronic RC, we solve Equations (3) and (4) using the fourth-order Runge–Kutta method, where the integration step is set to 1 ns. The parameter values are τ_L = 10 ns, τ_H = 0.5 μs, virtual node interval θ = 0.01 μs, and τ = Nθ (N is the number of virtual nodes).
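A minimal sketch of such a simulation is given below. It is not the authors' code: the function name and the array layout are hypothetical, the history before t = 0 is assumed to be zero, the masked input is held constant over each node interval θ, and the delayed term at the half-step stages is approximated by linear interpolation between stored grid values. The default β, γ, and φ are the values used later in this section.

```python
import numpy as np

def run_reservoir(i_nodes, beta=0.2, gamma=0.92, phi=0.4 * np.pi,
                  tau_l=10e-9, tau_h=0.5e-6, theta=10e-9, h=1e-9):
    """Integrate Eqs. (3)-(4) with RK4; i_nodes has shape (n_periods, N).

    Returns the virtual node states, shape (n_periods, N), sampled every theta.
    """
    i_nodes = np.atleast_2d(np.asarray(i_nodes, dtype=float))
    n_periods, n_nodes = i_nodes.shape
    spn = int(round(theta / h))            # integration steps per virtual node
    delay = n_nodes * spn                  # tau = N * theta, in steps
    n_steps = n_periods * delay
    drive = np.repeat(i_nodes.ravel(), spn)
    x = np.zeros(delay + n_steps + 1)      # x[delay + i] holds x(t_i); zero history
    v = 0.0
    a = 1.0 + tau_l / tau_h

    def rhs(x_now, v_now, x_del, u):
        dx = (-a * x_now - v_now
              + beta * np.cos(x_del + phi + gamma * u) ** 2) / tau_l
        return dx, x_now / tau_h

    for i in range(n_steps):
        u = drive[i]
        x0 = x[delay + i]
        xd0, xd1 = x[i], x[i + 1]          # x(t_i - tau) and x(t_i + h - tau)
        xdm = 0.5 * (xd0 + xd1)            # interpolated delayed value at mid-step
        k1x, k1v = rhs(x0, v, xd0, u)
        k2x, k2v = rhs(x0 + 0.5 * h * k1x, v + 0.5 * h * k1v, xdm, u)
        k3x, k3v = rhs(x0 + 0.5 * h * k2x, v + 0.5 * h * k2v, xdm, u)
        k4x, k4v = rhs(x0 + h * k3x, v + h * k3v, xd1, u)
        x[delay + i + 1] = x0 + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        v = v + h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0

    # Sample the state at the end of every node interval.
    return x[delay + spn::spn].reshape(n_periods, n_nodes)

states = run_reservoir(np.random.default_rng(0).random((3, 20)))
print(states.shape)
```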

We first compare the recognition performance of the optoelectronic RC under four different input methods. For the methods of inputting a full image (u^1×784), a trimmed image (u^1×320), 2×-downscaled features (u^1×80), and HOG features (u^1×324), the number of virtual nodes N is set to 1500, 700, 200, and 700, respectively, which satisfies an input-to-neuron ratio about 1:2. Generally speaking, a relatively large N is helpful to improve the dimension of the reservoir state space (but a too large N will lead to saturation of the state space), i.e., a relatively small input-to-neuron ratio may achieve relatively good performance. In the test of comparing different input methods, for the sake of fairness, we set the same input-to-neuron ratio for the four input methods. Figure 3 shows the evolution maps of the AR in parameter space of feedback strength β and input scaling factor γ under offset phase ϕ = 0. The different colors represent different values of the AR, and the blank area represents AR ≤ 85%. We can find that for the four input methods, the values of the AR are symmetrically distributed on both sides of γ = 0.5, and the ARs close to γ = 0.5 are relatively low. We speculate that when γ = 0.5 and ϕ = 0, the non-linear transformation effect of the reservoir is relatively weak for the input data I(t) (normalized to [0, π]). Meanwhile, we can find that the relatively high ARs are located in the range of β ∈ [0.1, 0.6] where the reservoir can generate appropriate non-linear dynamics for this task. For the methods of inputting a full image (Figure 3a), trimmed image (Figure 3b), and 2×-downscaled features (Figure 3c), the maximum values of the AR are about 90% when γ is close to 0.9. For the method of inputting HOG features (Figure 3d), the maximum AR reaches 97%, which can be obtained for a wide range of γ. Obviously, under the method of HOG, the recognition performance is better, and the robustness to β and γ is stronger. 
We speculate that this is because the gradients of an image are more informative than its raw pixel values: the gradient magnitude is large around edges and corners (regions of abrupt intensity change), which is especially true for handwritten digits. Moreover, HOG operates on local grid cells of the image, so it maintains good invariance to geometric and optical deformations of the image.

According to the results of Figure 3, we set β = 0.2 and γ = 0.92 to further investigate the influence of the offset phase ϕ of MZM on the AR. Figure 4 shows the AR as a function of ϕ for the cases using four different image preprocessing methods. For the methods of inputting a full image (Figure 4a), trimmed image (Figure 4b), and 2×-downscaled features (Figure 4c), when ϕ changes from 0 to 2π, the values of the AR fluctuate between 87% and 91% with a period of 0.5π. We can find that the performance of RC for these three preprocessing methods is similar under the same input-to-neuron ratio. However, RC possesses a relatively high data processing rate with the input of 2×-downscaled features because only 200 virtual nodes are required for achieving an AR of 91%. As shown in Figure 4d, RC presents the best performance with the input of HOG features, where the AR fluctuates slightly between 97% and 97.6%, and there is no obvious dependence on ϕ. From the above tests of Figure 3 and Figure 4, we can find that the preprocessing method of HOG can effectively extract the features of handwritten digits, and the reservoir with 700 virtual nodes can achieve a good recognition performance and strong robustness to the fluctuations in β, γ, and ϕ.

In the following, we take the method of inputting HOG features as an example to investigate how to improve the recognition speed based on multiple parallel optoelectronic reservoirs. Since the virtual node states of a TDRC are obtained using time multiplexing, the input period cannot be too small if a sufficiently high-dimensional state space is to be obtained, which limits the speed of a TDRC with a single reservoir [39]. In order to increase the recognition speed, we can increase the number of real nonlinear nodes in the system to form a mutually coupled or parallel RC system and simultaneously extract the virtual node states from the feedback loop of each reservoir [40]. In this work, as shown in Figure 5, we transform the RC system of Figure 1 into a parallel processing RC system with k decoupled time-delay reservoirs, where the semiconductor laser (SL) provides a continuous wave for the k parallel reservoirs. The feature matrix preprocessed using the HOG method is divided column-wise into k parts, which are multiplied by different masks. The masked signals I₁(t) to I_k(t) are injected into the k reservoirs and transformed into x₁(t) to x_k(t), respectively. Within an input period, each x(t) is sampled as a vector x with a sampling interval of θ, where each vector x contains N elements. Therefore, x₁ = [x₁¹, x₁², …, x₁^N]^T, x₂ = [x₂¹, x₂², …, x₂^N]^T, …, and x_k = [x_k¹, x_k², …, x_k^N]^T constitute the columns of the state matrix X. X is used for training and testing in the same way as for the system of Figure 1. In this way, for the same total number of virtual nodes, the data injection period of the parallel RC system is reduced by a factor of k.
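The data flow of the parallel scheme (split the HOG features into k parts, mask each part with its own W_in, and collect the k state vectors) can be sketched as below. The names are hypothetical, and two simplifications are assumed: the reservoir is replaced by a memoryless placeholder nonlinearity so that the block stays self-contained, and the k state vectors are concatenated into one long vector per image, which is only one possible way of combining them for training.

```python
import numpy as np

def placeholder_reservoir(i_nodes, beta=0.2, gamma=0.92, phi=0.4 * np.pi):
    """Stand-in for one time-delay reservoir (no delay dynamics, illustration only)."""
    return beta * np.cos(gamma * i_nodes + phi) ** 2

def parallel_states(u, masks, reservoir):
    """Split u into len(masks) parts, mask each part, and collect the states."""
    parts = np.array_split(np.asarray(u, dtype=float), len(masks))
    states = [reservoir(part @ w_in) for part, w_in in zip(parts, masks)]
    return np.concatenate(states)

rng = np.random.default_rng(0)
k, n_nodes, m = 6, 100, 324                   # 6 reservoirs, 100 nodes each
masks = [rng.choice([-1.0, 1.0], size=(m // k, n_nodes)) for _ in range(k)]
u = rng.random(m)                             # stand-in for one HOG feature vector
x = parallel_states(u, masks, placeholder_reservoir)
print(x.shape)                                # k * n_nodes combined states
```

Because each reservoir only has to step through its own 100 virtual nodes, the input period per image is 100θ regardless of k, while the combined state dimension grows as k × 100.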

Figure 6a shows the AR as a function of the total number of virtual nodes N. To account for the performance deviations caused by the randomly generated mask W_in, we run the RC 10 times for each value of N. Circles and stars in Figure 6 represent the mean values of the 10 test results, and the vertical bars indicate the standard deviation of the 10 tests. The blue line is for the case of using a single reservoir. As N increases from 100 to 800 with a step of 20, the AR of the single reservoir increases monotonically from 93.8% to 97.8%. The red line in Figure 6 is for the case of using k reservoirs (k × 100 = N), where each reservoir possesses 100 virtual nodes and the k reservoirs use the same parameters of β = 0.2, γ = 0.92, and ϕ = 0.4π. The AR increases with increasing k, and when k ≥ 7, the AR reaches about 98%. Meanwhile, when k > 1, for the same N, the AR of the multi-reservoir system is higher than that of the single reservoir. This means that for the same total number of virtual nodes, an RC containing multiple reservoirs can further improve the AR. We speculate that this is due to the different input weights used for the multiple reservoirs, which make the reservoir states richer than those of a single reservoir. Figure 6b shows the recognition speed as a function of N. For the system using a single reservoir (blue line), the recognition speed (= 1/(Nθ)) decreases from 1 × 10⁶ digits per second to 1.25 × 10⁵ digits per second as N increases from 100 to 800. Clearly, a single reservoir with more virtual nodes achieves better recognition performance, but at the price of a lower recognition speed. Meanwhile, the speed of the RC system using multiple reservoirs (red line) is always 1 × 10⁶ digits per second.
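The speed figures quoted above follow directly from the relation speed = 1/(Nθ) with θ = 0.01 μs. The short check below reproduces them; the function name is hypothetical, and N here means the number of virtual nodes in one reservoir, since the parallel reservoirs run simultaneously.

```python
def recognition_speed(nodes_per_reservoir, theta=10e-9):
    """Digits per second when one image occupies one input period N * theta."""
    return 1.0 / (nodes_per_reservoir * theta)

single_100 = recognition_speed(100)     # single reservoir, N = 100
single_800 = recognition_speed(800)     # single reservoir, N = 800
parallel_6 = recognition_speed(100)     # 6 reservoirs, 100 nodes each
print(single_100, single_800, parallel_6)
```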

In the test of Figure 6, we assumed that the multiple parallel reservoirs possess the same parameters. Next, we take the parallel RC system with six reservoirs (k = 6) as an example to analyze the impact of parameter mismatch between the reservoirs on the AR. In Figure 7, we select (β_r, γ_r, ϕ_r) = (0.2, 0.92, 0.4π) as the reference values and show the variation in the AR as a function of the parameter mismatch ratio. In this test, we run the RC 20 times for each value of the parameter mismatch ratio, where the masks W_in are the same in every run, and the values of β, γ, and ϕ are randomly generated within a certain parameter mismatch range. For example, when the parameter mismatch is ±5%, the value of β is taken randomly within (β_r(1 − 5%), β_r(1 + 5%)]. When the parameter mismatch is ±15%, β is taken randomly within (β_r(1 − 15%), β_r(1 − 5%)] ∪ (β_r(1 + 5%), β_r(1 + 15%)]. The red curve represents the mean AR of the 20 tests, and the blue curves represent the upper and lower limits of the ARs. When the mismatch ratio is lower than 55%, the AR shows a slight fluctuation around 97.8%. When the mismatch ratio is higher than 55%, the AR gradually decreases, and the fluctuation range gradually expands. However, even if the mismatch ratio reaches ±95%, the ARs are still larger than 97%. The results indicate that this RC system with multiple parallel reservoirs shows strong robustness to the parameter mismatch between different reservoirs. In other words, when the system is implemented in hardware, it has the advantages of easy parameter adjustment and stable performance.
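One way to draw mismatched parameters consistent with the two examples above is sketched below. The function name is hypothetical, and the band width of 10 percentage points between successive mismatch levels is an assumption inferred from the ±5% and ±15% examples; open versus closed interval ends are ignored, since they have no practical effect on continuous sampling.

```python
import numpy as np

def sample_mismatched(ref, ratio, rng, size=1, band=0.10):
    """Draw values whose relative deviation from ref lies in (ratio - band, ratio]."""
    inner = max(ratio - band, 0.0)                 # +-5% level: deviation in [0, 5%]
    magnitude = rng.uniform(inner, ratio, size=size)
    sign = rng.choice([-1.0, 1.0], size=size)      # deviation above or below ref
    return ref * (1.0 + sign * magnitude)

rng = np.random.default_rng(0)
beta_r, gamma_r, phi_r = 0.2, 0.92, 0.4 * np.pi    # reference values from the text
k = 6
betas = sample_mismatched(beta_r, 0.15, rng, size=k)
gammas = sample_mismatched(gamma_r, 0.15, rng, size=k)
phis = sample_mismatched(phi_r, 0.15, rng, size=k)
print(betas, gammas, phis)
```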

4. Discussion

In this work, we numerically investigate methods for improving the performance of HWDR based on an optoelectronic TDRC. The performance improvements include improvements in the AR and recognition speed. For increasing the AR, four image input methods including the input of a full image, a trimmed image, 2×-downscaled features, and HOG features are compared. In fact, there are many other image feature-extracting methods that can be used, such as projection histograms, distance profiles, and Gabor filters. In this work, we find that extracting HOG features requires fewer parameters to be adjusted and achieves a relatively high AR, with a maximum AR of about 98%. This AR is significantly higher than the value of 83% achieved in [25], which directly inputs a raw digit image into the reservoir. Moreover, the system in this work, based on a relatively simple structure, achieves an AR comparable to the AR of about 99% in [27], which is based on a conventional network-based RC.

For increasing the recognition speed, we adopt the scheme of using multiple parallel reservoirs. For a single reservoir, there are two ways to increase the recognition speed (= 1/(Nθ)): reducing the number N of virtual nodes or reducing the interval θ. However, to ensure good performance, neither N nor θ can be set too small. If N is too small, the state space is low-dimensional, which decreases the AR. If θ is too small, input information is lost. For an optoelectronic RC, θ is often related to the inverse of the characteristic bandwidth of the optoelectronic oscillator. In this work, the characteristic bandwidth ([1/τ_H, 1/τ_L]) is about 100 MHz, which is far lower than the available 10 GHz bandwidth of the unfiltered optoelectronic feedback loop. This means that the recognition speed of a single reservoir can be further increased by increasing the characteristic bandwidth. The pursuit of the upper speed limit of a single reservoir will be carried out in our subsequent studies. The motivation for setting a relatively small characteristic bandwidth in this work is that our main purpose is to investigate the feasibility of improving the recognition speed based on parallel reservoirs. Moreover, setting a relatively small characteristic bandwidth makes it easier to implement this system in hardware based on our existing devices, whereas an RC system with a large characteristic bandwidth (adopting a small θ) usually requires the support of expensive input and readout equipment. In other words, our scheme of using parallel reservoirs provides an effective way to achieve a high recognition speed for a TDRC system with relatively low hardware requirements.

5. Conclusions

Based on an optoelectronic time-delay reservoir computing system, we numerically perform the HWDR task, where the reservoir is composed of an MZM, a PD, two filters, and a delay line. First, we compare four image input methods including the input of a full image, a trimmed image, 2×-downscaled features, and HOG features. It is found that, under the first three input methods, the AR can reach about 90% within a relatively narrow parameter region of β-γ and presents periodic fluctuations with the offset phase ϕ. Under the injection of HOG features, the system can achieve ARs of about 97% within a broad parameter region of β-γ and shows strong robustness to the variation in ϕ. Second, taking the injection of HOG features as an example, we investigate the recognition speed improvement based on multiple reservoirs. Using 6 parallel reservoirs, each possessing 100 virtual nodes, the system can achieve an AR of about 97.8% at a speed of 1 × 10⁶ digits per second. Meanwhile, we find that the system performance is insensitive to the parameter mismatch of β, γ, and ϕ between the reservoirs. In the future, much work remains worthy of in-depth study, such as exploring the upper limit of the recognition speed of a single reservoir, hardware implementation of the system, system integration, and applications in other fields.

Author Contributions

Conceptualization, D.Y. and Y.H.; methodology, D.Y. and C.H.; software, Y.H.; validation, D.Y. and C.Z.; formal analysis, C.H.; investigation, D.Y., Y.H., C.Z. and Y.K.; resources, D.Y. and Y.K.; data curation, C.Z. and Y.K.; writing—original draft preparation, D.Y. and C.H.; writing—review and editing, Y.H., C.Z. and Y.K.; visualization, Y.H.; supervision, D.Y. and C.H.; project administration, D.Y. and Y.H.; funding acquisition, D.Y., Y.H. and C.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Natural Science Foundation of Hebei Province, grant number F2022407007; the National Natural Science Foundation of China, grant number 62065015; the Natural Science Foundation of Inner Mongolia Autonomous Region of China, grant number 2021LHMS01006; the Natural Science Foundation of Chongqing, grant number CSTB2022NSCQ-MSX0774; and the Science Research Foundation of Hebei Normal University of Science and Technology.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

Wang, D.; Tan, X. Unsupervised feature learning with c-SVDDNet. Pattern Recognit. 2016, 60, 473–485. [Google Scholar] [CrossRef] [Green Version]
Elyan, E.; Vuttipittayamongkol, P.; Johnston, P.; Martin, K.; McPherson, K.; Moreno-Garcia, C.F.; Jayne, C.; Sarker, M.d.M.K. Computer vision and machine learning for medical image analysis: Recent advances, challenges, and way forward. Art. Int. Surg. 2022, 2, 24–45. [Google Scholar] [CrossRef]
Ahonen, T.; Hadid, A.; Pietikainen, M. Face description with local binary patterns: Application to face recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2006, 28, 2037–2041. [Google Scholar] [CrossRef] [PubMed]
Kitayama, K.; Notomi, M.; Naruse, M.; Inoue, K.; Kawakami, S.; Uchida, A. Novel frontier of photonics for data processing—Photonic accelerator. APL Photonics 2019, 4, 090901. [Google Scholar] [CrossRef] [Green Version]
Tanaka, G.; Yamane, T.; Heroux, J.B.; Nakane, R.; Kanazawa, N.; Takeda, S.; Numata, H.; Nakano, D.; Hirose, A. Recent advances in physical reservoir computing: A review. Neural Netw. 2019, 15, 100–123. [Google Scholar] [CrossRef] [PubMed]
Jaeger, H. The ‘Echo State’ Approach to Analysing and Training Recurrent Neural Networks; GMD Report 148; German National Research Institute for Computer Science: Darmstadt, Germany, 2001. [Google Scholar]
Maass, W.; Natschläger, T.; Markram, H. Real-time computing without stable States: A new framework for neural computation based on perturbations. Neural Comput. 2002, 14, 2531–2560. [Google Scholar] [CrossRef] [PubMed]
Vandoorne, K.; Mechet, P.; Vaerenbergh, T.; Fiers, M.; Morthier, G.; Verstraeten, D.; Schrauwen, B.; Dambre, J.; Bienstman, P. Experimental demonstration of reservoir computing on a silicon photonics chip. Nat. Commun. 2014, 5, 3541. [Google Scholar] [CrossRef] [Green Version]
Hon, K.; Kuwabiraki, Y.; Goto, M.; Nakatani, R.; Suzuki, Y.; Nomura, H. Numerical simulation of artificial spin ice for reservoir computing. Appl. Phys. Express 2021, 14, 033001. [Google Scholar] [CrossRef]
Appeltant, L.; Soriano, M.; Sande, G.; Danckaert, J.; Massar, S.; Dambre, J.; Schrauwen, B.; Mirasso, C.R.; Fischer, I. Information processing using a single dynamical node as complex system. Nat. Commun. 2011, 2, 468. [Google Scholar] [CrossRef] [Green Version]
Soriano, M.; Ortin, S.; Keuninckx, L.; Appeltant, L.; Danckaert, J.; Pesquera, L.; Sande, G. Delay-based reservoir computing: Noise effects in a combined analog and digital implementation. IEEE Trans. Neural Netw. Learn. Syst. 2015, 26, 388–393. [Google Scholar] [CrossRef]
Larger, L.; Soriano, M.; Brunner, D.; Appeltant, L.; Gutierrez, J.; Pesquera, L.; Mirasso, R.; Fischer, I. Photonic information processing beyond Turing: An optoelectronic implementation of reservoir computing. Opt. Express 2012, 20, 3241–3249. [Google Scholar] [CrossRef] [PubMed]
Paquot, Y.; Duport, F.; Smerieri, A.; Dambre, J.; Schrauwen, B.; Haelterman, M.; Massar, S. Optoelectronic reservoir computing. Sci. Rep. 2012, 2, 287.
Chen, Y.; Yi, L.; Ke, J.; Yang, Z.; Yang, Y.; Huang, L.; Zhuge, Q.; Hu, W. Reservoir computing system with double optoelectronic feedback loops. Opt. Express 2019, 27, 27431–27440.
Bao, X.; Zhao, Q.; Yin, H. Efficient optoelectronic reservoir computing with three-route input based on optical delay lines. Appl. Opt. 2019, 58, 4111–4117.
Duport, F.; Schneider, B.; Smerieri, A.; Haelterman, M.; Massar, S. All-optical reservoir computing. Opt. Express 2012, 20, 22783–22795.
Hou, Y.; Xia, G.; Yang, W.; Wang, D.; Jayaprasath, E.; Jiang, Z.; Hu, C.; Wu, M. Prediction performance of reservoir computing system based on a semiconductor laser subject to double optical feedback and optical injection. Opt. Express 2018, 26, 10211–10219.
Yue, D.; Wu, Z.; Hou, Y.; Cui, B.; Jin, Y.; Dai, M.; Xia, G. Performance optimization research of reservoir computing system based on an optical feedback semiconductor laser under electrical information injection. Opt. Express 2019, 27, 19931–19939.
Guo, X.; Xiang, S.; Qu, Y.; Han, Y.; Wen, A. Enhanced prediction performance of a neuromorphic reservoir computing system using a semiconductor nanolaser with double phase conjugate feedbacks. J. Light. Technol. 2021, 39, 129–135.
Brunner, D.; Soriano, M.C.; Mirasso, C.R.; Fischer, I. Parallel photonic information processing at gigabyte per second data rates using transient states. Nat. Commun. 2013, 4, 1364.
Larger, L.; Baylón-Fuentes, A.; Martinenghi, R.; Udaltsov, V.S.; Chembo, Y.K.; Jacquot, M. High-speed photonic reservoir computing using a time-delay-based architecture: Million words per second classification. Phys. Rev. X 2017, 7, 011015.
Zhong, D.Z.; Yang, H.; Xi, J.T.; Zeng, N.; Xu, Z.; Deng, F.Q. Predictive learning of multi-channel isochronal chaotic synchronization by utilizing parallel optical reservoir computers based on three laterally coupled semiconductor lasers with delay-time feedback. Opt. Express 2021, 29, 5279–5294.
Dai, H.Y.; Chembo, Y.K. Classification of IQ-modulated signals based on reservoir computing with narrowband optoelectronic oscillators. IEEE J. Quantum Electron. 2021, 57, 5000408.
Han, M.Y.; Wang, M.G.; Fan, Y.C.; Cai, S.Y.; Guo, Y.X.; Zhang, N.H.; Schatz, R.; Popov, S.; Ozolins, O.; Pang, X.D. Simultaneous modulation format identification and OSNR monitoring based on optoelectronic reservoir computing. Opt. Express 2022, 30, 47515–47527.
Jin, Y.; Zhao, Q.; Yin, H.; Yue, H. Handwritten numeral recognition utilizing reservoir computing subject to optoelectronic feedback. In Proceedings of the International Conference on Natural Computation (ICNC), Zhangjiajie, China, 15–17 August 2015.
Wan, L.; Zeiler, M.; Zhang, S.; Cun, Y.; Fergus, R. Regularization of neural networks using dropconnect. In Proceedings of the 30th International Conference on Machine Learning, Atlanta, GA, USA, 16–21 June 2013.
Antonik, P.; Marsal, N.; Rontani, D. Large-scale spatiotemporal photonic reservoir computer for image classification. IEEE J. Sel. Top. Quantum Electron. 2020, 26, 7700812.
Gardner, S.; Haider, M.; Moradi, L.; Vantsevich, V. A modified echo state network for time independent image classification. In Proceedings of the IEEE International Midwest Symposium on Circuits and Systems, East Lansing, MI, USA, 9–11 August 2021.
Murphy, T.E.; Cohen, A.B.; Ravoori, B.; Schmitt, K.R.B.; Setty, A.V.; Sorrentino, F.; Williams, C.R.S.; Ott, E.; Roy, R. Complex dynamics and synchronization of delayed-feedback nonlinear oscillators. Phil. Trans. R. Soc. 2010, 368, 343–366.
Larger, L.; Dudley, J. Nonlinear dynamics: Optoelectronic chaos. Nature 2010, 465, 41–42.
Cohen, A.; Ravoori, B.; Murphy, T.; Roy, R. Using synchronization for prediction of high-dimensional chaotic dynamics. Phys. Rev. Lett. 2008, 101, 154102.
Argyris, A.; Syvridis, D.; Larger, L.; Annovazzi-Lodi, V.; Colet, P.; Fischer, I.; Garcia-Ojalvo, J.; Mirasso, C.R.; Pesquera, L.; Shore, K.A. Chaos-based communications at high bit rates using commercial fiber-optic links. Nature 2005, 438, 343–346.
Tezuka, M.; Kanno, K.; Bunsen, M. Reservoir computing with a slowly modulated mask signal for preprocessing using a mutually coupled optoelectronic system. Jpn. J. Appl. Phys. 2016, 55, 08RE06.
Antonik, P.; Duport, F.; Hermans, M.; Smerieri, A.; Haelterman, M.; Massar, S. Online training of an opto-electronic reservoir computer applied to real-time channel equalization. IEEE Trans. Neural Netw. Learn. Syst. 2017, 28, 2686–2698.
Kussul, E.; Baidyk, T. Improved method of handwritten digit recognition tested on MNIST database. Image Vis. Comput. 2004, 22, 971–981.
Chao, D.; Cai, F.; Mohammed, A.Z.; Wen, M.; Seung, H.L.; Wei, D.L. Reservoir computing using dynamic memristors for temporal information processing. Nat. Commun. 2017, 8, 2204.
Jain, A.K.; Farrokhnia, F. Unsupervised texture segmentation using Gabor filters. Pattern Recognit. 1991, 24, 1167–1186.
Vinckier, Q.; Duport, F.; Smerieri, A.; Vandoorne, K.; Bienstman, P.; Haelterman, M.; Massar, S. High performance photonic reservoir computer based on a coherently driven passive cavity. Optica 2015, 2, 438–446.
Yue, D.Z.; Hou, Y.S.; Wu, Z.M.; Hu, C.X.; Xiao, Z.Z.; Xia, G.Q. Experimental investigation of an optical reservoir computing system based on two parallel time-delay reservoirs. IEEE Photonics J. 2021, 13, 8500111.
Sugano, C.; Kanno, K.; Uchida, A. Reservoir computing using multiple lasers with feedback on a photonic integrated circuit. IEEE J. Sel. Top. Quantum Electron. 2020, 26, 1500409.

Figure 1. (a) System schematic diagram and (b) operation procedure of the optoelectronic TDRC. SL: semiconductor laser; VA: variable attenuator; PD: photodetector; AMP: electronic amplifier; LPF: low-pass filter; HPF: high-pass filter.

Figure 2. Schematic diagram of the preprocessing procedure. (a) Input full image; (b) input trimmed image; (c) input 2×-downscaled features; (d) input HOG features.

Figure 3. Maps of the AR in the β–γ parameter space for ϕ = 0. (a) Full image input with 1500 nodes; (b) trimmed image input with 700 nodes; (c) 2×-downscaled features input with 200 nodes; (d) HOG features input with 700 nodes.

Figure 4. Variation in the AR with the offset phase ϕ under β = 0.2 and γ = 0.92. (a) Input full image; (b) input trimmed image; (c) input 2×-downscaled features; (d) input HOG features.

Figure 5. Schematic diagram of parallel processing based on multi-reservoirs.

Figure 6. (a) AR as a function of the total number of virtual nodes N. (b) Recognition speed as a function of N. The red line is for k parallel reservoirs, each possessing 100 nodes (k × 100 = N), and the blue line is for a single reservoir with N virtual nodes.

Figure 7. Impact of parameter mismatch between multiple reservoirs on the AR.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
