1. Introduction
Litopenaeus vannamei, also known as whiteleg shrimp, is one of the most popular aquaculture species worldwide. Due to its high nutrient density,
Litopenaeus vannamei has become the most productive strain used in Chinese shrimp farming [
1]. However, changes in farming environments and other factors can easily lead to infections by various pathogenic bacteria, resulting in massive deaths and causing significant ecological and economic losses [
2]. In recent years, research on the isolation and identification of shrimp pathogens has continued to develop, but most studies still rely on traditional methods such as gene sequencing. These methods are costly, time-consuming, and require trained personnel [
3]. Therefore, a rapid and accurate method for classifying related pathogenic bacteria is needed.
Raman spectroscopy is a technique that uses scattered light to determine the vibrational modes of molecules, which can provide a structural fingerprint by which molecules can be identified [
4]. Due to its highly efficient, non-destructive, and easy-to-operate characteristics, it has been favored by researchers and is widely used in the chemistry [
5,
6,
7], materials science [
8,
9,
10], and biomedical fields [
11,
12,
13], among others. In the field of microbial classification, Raman spectroscopy has also made significant progress. For example, various related research has arisen in the detection of foodborne pathogens [
14], environmental microorganisms [
15], and human pathogens [
16], and a full system has gradually been established. However, despite Raman spectroscopy’s success in rapidly identifying microorganisms, it is an unavoidable fact that obtaining Raman spectroscopy data takes up the majority of the timespan of the entire microbial identification process, especially when surface enhancement techniques are required to obtain low-noise data, which significantly increases the duration of the experiment [
17].
With the rapid development of machine learning, microbiology research has increasingly been combined with machine learning, and even deep learning, with satisfactory results. For example, regarding microbial genes, Chen et al. used methods such as L0 + L1 regularization to computationally reconstruct haplotypes from mixed sequencing data, achieving the highest accuracy levels on multiple datasets [
18,
19]. Wang et al. developed a new deep learning prediction method called MDeep, which is based on CNN and phylogenetic trees and is more competitive than traditional methods [
20]. Regarding microbial classification, Maruthamutu et al. used CNN and attention mapping to classify 12 microorganisms, with an overall recognition rate of over 97%, a technique which helps to efficiently distinguish microbial pollutants [
21]. Wang et al. combined Raman spectroscopy and deep learning to classify 30 pathogenic bacteria collected clinically. In comparison to several traditional machine learning methods, the CNN used in the paper had higher accuracy and efficiency levels [
22]. The combination of microbial spectroscopy and deep learning is gradually becoming a trend. Due to the difficulty involved in obtaining a large amount of Raman spectroscopy data from microorganisms, the use of deep learning methods for Raman spectroscopy amplification is also one of the research directions.
In 2015, Goodfellow et al. first proposed the concept and implementation framework of generative adversarial networks (GANs) [
23], which involve designing a pair of networks: one generates data and the other discriminates it, with the two networks competing and learning from each other. After sufficient iterations, the two networks converge toward a Nash equilibrium, and the generated data becomes very close to the real data. Over the years, GANs have produced many variants and demonstrated strong capabilities in various fields. Although the diffusion model has recently gained popularity and outperformed GANs in image generation [
24], GANs remain a mature generative model whose standing will not be challenged in the short term.
The success of generative adversarial networks in image generation has increased researchers’ confidence in spectral data amplification. Researchers in the area have adapted GANs and properly set the network structure and parameters to make them suitable for Raman spectroscopic data. Du used GANs to considerably extend the dataset in an experiment employing Raman spectroscopy to identify three foodborne pathogenic bacteria, eventually obtaining a classification accuracy of 90% [
25]. Yu et al. used GANs to gather a large amount of eligible spectral data in a marine pathogen classification experiment, establishing the groundwork for accurate classification [
26]. Ma et al. proposed a spectrum recovery conditional generative adversarial network (SRGAN). An SRGAN can accelerate spectral collection and improve the throughput of Raman spectroscopy. The researchers used the SRGAN to process the spectral data of five foodborne bacteria and achieved a classification accuracy of 94.9% in the final classification task. In the comparative experiment, without using the SRGAN, the classification accuracy was only 60.5% [
27]. Liu et al. used a PGGAN to amplify spectral data in an experiment classifying five types of marine microorganisms. The results showed that only one-third of the original data needed to be substituted into the model for training to obtain ideal classification results [
28].
The above studies demonstrated the efficiency of generative adversarial networks in microbial Raman spectroscopy; however, they have significant limitations. For example, most of them do not compare classification accuracy before and after data amplification, making it impossible to determine whether the generative adversarial network actually contributes. Moreover, close examination of the plots of the generated data reveals residual noise. Furthermore, most studies on microbial Raman spectrum classification still use classic machine learning approaches, indicating that there is still room for improvement. To address the aforementioned issues, we propose a distributed deep learning network based on data enhancement for few-shot Raman spectral classification of shrimp pathogens. The research contributions include the following:
Propose a distributed deep learning network based on data enhancement for few-shot Raman spectral classification of shrimp pathogens. The network is made up of three modules: a Raman spectrum enhancement module (RSEM), a Raman spectrum denoising module (RSDM), and a distributed learning classification module (DLCM). The RSEM controls the enhancement of the spectral data. During the training process, the network in the module employs transfer learning to improve the efficiency and quality of the generated data.
Establish the first application of the UNET network in the denoising of microbial Raman spectroscopy. The RSDM module consists of an improved UNET network, which utilizes the unique structure of the UNET network to effectively eliminate irrelevant noise in the data generated by the RSEM.
As opposed to traditional machine learning methods, we design a distributed deep learning classification module (DLCM). The module consists of a server and multiple clients. The clients and server perform parallel training and interact according to the designed algorithm. This module achieves the accurate classification of high-dimensional Raman spectra and solves network degradation problems common in deep learning.
2. Materials and Methods
2.1. The Framework of the Proposed Network
The framework of our proposed network is shown in
Figure 1. The network consists of a Raman spectrum enhancement module (RSEM), a Raman spectrum denoising module (RSDM), and a distributed learning classification module (DLCM). First, we use a Raman spectrometer to obtain spectral data from the sample and use relevant algorithms to preprocess the data to eliminate the impact of the spectrometer itself or environmental factors on the data. Subsequently, the preprocessed data is input into the RSEM for training, thereby amplifying the sample dataset. The blue box represents the RSEM module, and the core of the module is generative–adversarial. As shown in the upper part of the blue box, the generative adversarial network achieves data generation by training the generator network and discriminator network. Specifically, the type of generative adversarial network is the WGAN; due to its excellent performance in generating images, it has been improved and utilized as the network for Raman spectroscopy data amplification in this experiment. At the same time, in order to enable the WGAN to train normally on datasets with small amounts of data, we used a large microbiological Raman dataset for pre-training and then froze and transferred the trained parameters. In addition, experiments have proven that using transfer learning in an RSEM can also speed up network training and improve the quality of generated data. After waiting for the RSEM to complete training, the data generated in the module will be input into an RSDM for further processing. In this module, we use the improved UNET network to denoise the generated data. As shown in the orange box, the network can be viewed as an encoder and decoder. The encoder compresses the dimensions of the data, while the decoder is used to restore the dimensions of the original data. The network uses a loss function to adjust network parameters by minimizing the error between input and decoded data, thereby achieving the goal of noise reduction. 
The network can reduce the noise generated by WGAN training, thereby improving the classification accuracy of the downstream classification model. The green box displays the general framework of the DLCM module, which includes a server and multiple clients. The server receives the parameters trained by the client, generates global parameters through algorithms, and sends them to the client. In the client, we use residual networks to avoid gradient vanishing during the iteration process, effectively improving the accuracy of classification.
2.2. Data Acquisition and Preprocessing
2.2.1. Acquisition of Raman Spectrum Data
Due to changes in the external environment and other circumstances, whiteleg shrimp are frequently susceptible to infections produced by pathogenic bacteria during the shrimp farming process, leading to disease and death. After reviewing the relevant literature, this study selected four common whiteleg shrimp pathogens:
Vibrio parahaemolyticus,
Escherichia coli,
Aeromonas hydrophila, and
Aeromonas veronii.
Vibrio parahaemolyticus is the most common pathogen in whiteleg shrimp farming, with infected shrimp turning white-grayish red and having a high mortality rate [
29].
Escherichia coli [
30] and
Aeromonas hydrophila are among the causes of bacterial shrimp enteritis, which can lead to slow growth, significant body weight loss, and even death in severe cases.
Aeromonas veronii has also been identified as a pathogen that is commonly isolated from infected and dead shrimp [
31].
This study selected the above four pathogens as the research objects for the classification task. We obtained pathogen samples from the National Pathogen Collection Center for Aquatic Animals (NPCCAA) at Shanghai Ocean University, all of which were isolated from diseased whiteleg shrimp. They were then inoculated into LB (Luria-Bertani) liquid medium and cultivated for 24 h at 37 °C before being rinsed with PBS and centrifuged. The resulting bacterial suspension was kept at 4 °C. To prepare in situ-coated silver nanoparticles, the bacterial suspension was centrifuged to remove the supernatant, and the resulting pellet was resuspended in nitric acid solution. After thorough shaking and mixing, sodium hydroxide solution was added and the mixture was shaken again, forming the in situ-coated silver nanoparticles. This process, known as surface enhancement, adheres the silver colloid to the bacterial surface, thereby boosting the bacteria's Raman signal and preventing low bacterial density from hindering the acquisition of Raman spectra.
Additionally, we used a LabRAM HR Evolution Confocal Raman Microscope (HORIBA, Kyoto, Japan; 532 nm excitation wavelength) to obtain the Raman spectra of the samples. Drops of an appropriate amount of the surface-enhanced bacterial suspension were placed on the center of a microscope slide, and spectral data were collected after the sample dried. Each sampling point was measured 3 times, and the average of the 3 measurements was used as the Raman spectrum for that point. This process was repeated to obtain a total of 160 Raman spectra for the four bacteria. Specifically, each microorganism is assigned 40 spectra, each covering the wavenumber range of 1289 cm−1 to 4000 cm−1, totaling 1600 spectral features. After obtaining the Raman intensity at each feature (wavenumber), these 1600 points can be connected to obtain the final Raman spectral curve.
2.2.2. Preprocessing of Raman Spectrum Data
When using Raman spectroscopy to analyze data, the initial data often need to be preprocessed due to interference from factors such as cosmic rays, instrument noise, and autofluorescence of the sample itself, in order to prevent noise from affecting the experimental results. Common preprocessing methods include smoothing, scattering correction, baseline correction, and normalization [
32]. In this experiment, S-G (Savitzky-Golay) smoothing and normalization were used to preprocess the raw data. S-G smoothing can maximize the preservation of data information while reducing noise and increasing the signal-to-noise ratio. Normalization can reduce the negative effects of large variations in spectral data, allowing the spectra to fall within a specific range. We preprocessed the obtained 40 Raman spectra, and the preprocessed spectral images are shown in
Figure 2. It can be seen that the spectral curve is very smooth, and the Raman intensity (y-axis) is concentrated between 0 and 1. This proves convenient for subsequent analysis and model training. Meanwhile, we can also observe that the spectral images of the four microorganisms are very similar, with their peak positions mostly overlapping. Due to the overlapping Raman spectral characteristics of the four microorganisms, special methods are needed to distinguish them.
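The preprocessing pipeline described above can be sketched in a few lines of Python using SciPy's Savitzky-Golay filter followed by min-max normalization. The window length and polynomial order below are illustrative choices, not the values used in the paper:

```python
import numpy as np
from scipy.signal import savgol_filter

def preprocess_spectrum(intensities, window_length=11, polyorder=3):
    """Smooth a raw Raman spectrum with a Savitzky-Golay filter,
    then min-max normalize the intensities into the [0, 1] range."""
    smoothed = savgol_filter(intensities, window_length, polyorder)
    lo, hi = smoothed.min(), smoothed.max()
    return (smoothed - lo) / (hi - lo)

# Example: a noisy synthetic spectrum with 1600 features
rng = np.random.default_rng(0)
raw = np.sin(np.linspace(0, 6, 1600)) + 0.05 * rng.standard_normal(1600)
spec = preprocess_spectrum(raw)
print(spec.min(), spec.max())  # 0.0 and 1.0 after normalization
```

After this step every spectrum lies in [0, 1], which matches the y-axis range seen in Figure 2 and simplifies subsequent model training.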
2.3. Raman Spectrum Enhancement Module (RSEM)
2.3.1. WGAN Network Structure
Generative adversarial networks (GANs) consist of a generator network and a discriminator network. The generator network is responsible for generating data similar to real samples, while the discriminator network is responsible for distinguishing generated data from real data. Since the introduction of GANs, hundreds of different types of GANs have emerged, and they have been proven to demonstrate excellent performance in specific domains.
In order to select the most suitable generative adversarial network, we conducted a series of comparative experiments on several types of generative adversarial networks and found that WGAN had the most outstanding performance in terms of the quality of generated data and the duration of model training. Therefore, we chose WGAN as the network for the data augmentation module. The comparative experimental results are presented in
Section 3.2.
Unlike the original GAN discriminator, which solves for the classification of 0 or 1, the WGAN discriminator employs the Wasserstein distance to distinguish between generated and real data [33]. The network provides a trustworthy training process indicator, avoiding the gradient vanishing problem that frequently happens in the original GAN. The Wasserstein distance formula is shown below:

$$ W(P_r, P_g) = \inf_{\gamma \in \Pi(P_r, P_g)} \mathbb{E}_{(x, y) \sim \gamma}\left[ \lVert x - y \rVert \right] $$

where $P_r$ represents the distribution of real data, $P_g$ represents the distribution of generated data, $\Pi(P_r, P_g)$ represents the set of all possible joint distributions combining $P_r$ and $P_g$, and $\mathbb{E}_{(x, y) \sim \gamma}[\lVert x - y \rVert]$ represents the expectation value of the distance between the real data $x$ and the generated data $y$ under the joint distribution $\gamma$.
After a series of derivations, the Wasserstein distance can be approximated as

$$ W(P_r, P_g) \approx \max_{\omega : \lVert f_\omega \rVert_L \le K} \; \mathbb{E}_{x \sim P_r}[f_\omega(x)] - \mathbb{E}_{x \sim P_g}[f_\omega(x)] $$

where $f_\omega$ represents a function containing parameter $\omega$. Specifically, in the WGAN network, the loss function expression can be written as

$$ L = \mathbb{E}_{x \sim P_r}[D(x)] - \mathbb{E}_{z \sim p_z}[D(G(z))] $$

where $D(x)$ represents the result judged by the discriminator on real data, $G(z)$ represents the generated data, and $D(G(z))$ represents the result judged by the discriminator on the generated data.
Therefore, the generator's loss function can be expressed as

$$ L_G = -\mathbb{E}_{z \sim p_z}[D(G(z))] $$

The loss function of the discriminator can be expressed as

$$ L_D = \mathbb{E}_{z \sim p_z}[D(G(z))] - \mathbb{E}_{x \sim P_r}[D(x)] $$
The generator and discriminator are trained towards the goal of minimizing the loss function, and the two constantly compete until reaching a Nash equilibrium.
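The two WGAN losses described above reduce to simple means of the critic's outputs. A minimal PyTorch sketch (the critic outputs below are toy values, not experimental data):

```python
import torch

def discriminator_loss(d_real, d_fake):
    # L_D = E[D(G(z))] - E[D(x)]: minimizing this widens the score gap
    # between real and generated samples.
    return d_fake.mean() - d_real.mean()

def generator_loss(d_fake):
    # L_G = -E[D(G(z))]: the generator pushes its samples' critic
    # scores upward.
    return -d_fake.mean()

# Toy critic outputs for a batch of 8 spectra
d_real = torch.tensor([0.9, 0.8, 1.1, 0.7, 1.0, 0.95, 0.85, 1.05])
d_fake = torch.tensor([0.1, 0.2, -0.1, 0.3, 0.0, 0.15, 0.05, 0.25])
print(discriminator_loss(d_real, d_fake).item(),
      generator_loss(d_fake).item())
```

Note there is no sigmoid or log term, unlike the original GAN: the critic's raw scores are used directly, which is what yields the smoother training signal.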
Figure 3 shows the improved WGAN structure used in our paper. The WGAN network is made up of a generator and a discriminator. The generator features a five-layer network structure; each layer has a deconvolution layer and a normalization layer and generates output using the LeakyReLU activation function. Similarly, the discriminator has a five-layer network structure, with each layer consisting of a convolution layer and a normalization layer and using the ReLU activation function to generate output. First, low-dimensional noise data Z is generated at random and used as the generator network's input, after which sample data is generated by the generator network's mapping. The discriminator then receives both the generated sample data and the real data as inputs. The generator network optimizes its gradients through the generator's loss function, making the generated data distribution closer to that of the real data. The discriminator network optimizes its gradients through the discriminator loss function, improving the discriminator's ability to identify fake data. After a sufficient number of confrontations between the generator and discriminator networks, the two approach a Nash equilibrium, at which point the data produced by the generator network is very close to the real data. In addition, the training parameters of the network are displayed in the lower right corner of the figure.
Compared with the original WGAN, we have made some improvements in order to better generate Raman spectrum data. The one dimensional convolution layer nn.Conv1d is used instead of the linear layer nn.Linear in the discriminator network, and the deconvolution layer nn.ConvTranspose1d is used instead of the linear layer nn.Linear in the generator network. The one-dimensional convolution layer can adjust the information interaction between channels, not only so that the model can have strong abstraction capabilities, but also so that computational efficiency is ensured. In addition, Spectral Normalization is used instead of Batch Normalization in the network. Spectral Normalization can make the parameter matrix of the network satisfy the Lipschitz continuity condition, thus making WGAN training more stable [
34].
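A minimal sketch of these two changes in PyTorch: a transposed 1-D convolution in the generator path and a spectrally normalized 1-D convolution in the discriminator path. The channel sizes, kernel sizes, and strides below are illustrative, not the paper's configuration:

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

# Generator side: nn.ConvTranspose1d upsamples the sequence (length doubles
# here because stride=2). Discriminator side: nn.Conv1d wrapped in
# spectral_norm, which constrains the layer's Lipschitz constant.
gen_layer = nn.Sequential(
    nn.ConvTranspose1d(64, 32, kernel_size=4, stride=2, padding=1),
    nn.LeakyReLU(0.2),
)
disc_layer = nn.Sequential(
    spectral_norm(nn.Conv1d(1, 16, kernel_size=4, stride=2, padding=1)),
    nn.ReLU(),
)

z = torch.randn(8, 64, 100)                    # batch of latent sequences
upsampled = gen_layer(z)                       # length 100 -> 200
features = disc_layer(torch.randn(8, 1, 1600)) # length 1600 -> 800
print(upsampled.shape, features.shape)
```

Stacking several such layers, with the generator expanding the latent noise up to the 1600-point spectrum length and the discriminator compressing it back down, gives the five-layer structure of Figure 3.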
2.3.2. Transfer Learning for WGAN
The main idea of transfer learning is to use the rich knowledge information of the source domain dataset (usually with a large amount of data) to supplement the information missing in the target domain due to the small amount of data in the target domain, by finding the relationship between the source domain and the target domain [
35]. The more similar the data between the original domain and the target domain, the better the transfer learning effect. In this study, we choose a large-scale pre-trained microbial spectral dataset as the source domain data, which includes 60,000 Raman spectra of 30 types of microorganisms collected by Stanford Hospital from 2016 to 2017 [
16]. After sufficient iterations to stabilize the model, the training is stopped, and the model’s parameters are frozen. The frozen parameters are then transferred, and the target domain data (preprocessed spectral data) is directly trained using the transferred parameters. A schematic diagram is shown in
Figure 4.
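Freezing the pre-trained parameters and leaving only a small part of the network trainable on the target domain can be sketched as follows. The module names ("backbone", "head") and layer sizes are hypothetical, chosen only to illustrate the freeze-and-transfer step:

```python
import torch.nn as nn

def freeze_early_layers(model: nn.Module, trainable: tuple = ("head",)):
    """Freeze all parameters pre-trained on the source-domain dataset,
    leaving only the named submodules trainable on the target domain."""
    for name, param in model.named_parameters():
        param.requires_grad = any(name.startswith(t) for t in trainable)
    return model

# Hypothetical network: a pre-trained backbone plus a new output head
model = nn.Sequential()
model.add_module("backbone", nn.Linear(1600, 128))
model.add_module("head", nn.Linear(128, 1))
freeze_early_layers(model)
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(trainable)  # only the head's weight and bias remain trainable
```

During target-domain training, the optimizer is then given only the parameters with `requires_grad=True`, so the knowledge learned from the large source dataset is preserved.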
2.4. Raman Spectrum Denoising Module (RSDM)
Although GANs can generate a large amount of data similar to the original data, the training process is inherently unstable because the generator and the discriminator compete with each other. In order to reduce the exaggerated noise produced during training and make the generated data more similar to the real data, the UNet model is introduced to denoise the generated data in this experiment.
Unet is a classic network model that has been widely used in various image segmentation tasks due to its efficiency, simplicity, and ability to adapt to small datasets [
36]. The original intention of Unet was to solve the problem of medical image segmentation, as its encoder–decoder structure can extract complex features and restore the original resolution. Skip connections in the model reduce the feature loss caused by convolutions, helping the decoder extract important shallow information.
With the RSDM, we improve the original U-NET structure to improve the network’s ability to extract spectral-feature information. The model is shown in
Figure 5. It can be seen that the denoising model is perfectly symmetric: the left side of the network can be seen as the encoder, the right side as the decoder, and a BLOCK is used for the transition in the middle of the network. Specifically, the model consists of an encoding structure containing four convolutional layers and a decoding structure containing four deconvolutional layers. In the convolution operation of each layer, a 3 × 3 convolution kernel reduces the number of parameters of the convolution layer while maintaining a sufficient receptive field. The downsampling operation in the encoding structure and the upsampling operation in the decoding structure are implemented by 2 × 2 max pooling and 2 × 2 up-convolution operations, respectively.
In addition, compared to the original UNET, we have made some improvements to reduce the loss of data features in the network during upsampling and downsampling processes. Firstly, we improved the skip connection method of the U-Net network. In traditional UNet, in order to avoid losing a large number of precise spatial details in the decoder, the skip technique is used, which directly concatenates the map extracted from the encoder to the corresponding layer of the decoder. However, we believe that shallow feature information enters the decoder too early, which is not conducive to the process of the decoder extracting global feature information. Therefore, we connect the output of the encoder and the output of the decoder. The new connection method can integrate shallow feature information into deep spectral data details, which is more conducive to restoring clean spectral data. Secondly, we introduce an attention mechanism. The upsampling and downsampling in the UNET network are both based on convolution, and the receptive field kernel size of the convolution operation is small, so each convolution operation can only cover local features of the data. This results in the decoder losing data features during the data recovery process. The attention mechanism can better learn the dependency relationships between global features.
We use a module called Attention Gates to train attention mechanisms, in which coarse-grained features capture contextual information and highlight the categories and positions of foreground objects. This module can suppress the task-independent parts of the model learning, while emphasizing the learning of task-related features. Subsequently, feature maps extracted at multiple scales are merged through skip connections to combine dense predictions at both coarse-grained and fine-grained levels. Our improvement can avoid data loss while allowing each layer of the network to perform noise reduction.
Figure 6 shows the Attention Gates module we introduced, which can be trained in 5 steps.
Step 1: After a 1 × 1 convolution, add together the outputs of the encoding layer Oen and the output of the decoding layer Ode.
Step 2: Pass the added results through the Relu function.
Step 3: Convolve the result of step 2 and reduce the channel to 1.
Step 4: Apply the sigmoid function to the result of Step 3 so that the values fall within the (0, 1) interval, yielding the attention weights.
Step 5: Multiply the attention weights obtained in step 4 with the output Oen of the encoding layer, and assign the attention weights to the low level feature.
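The five steps above can be sketched as a small PyTorch module. This is an illustrative 1-D implementation under assumed channel sizes, not the exact configuration used in the paper:

```python
import torch
import torch.nn as nn

class AttentionGate1d(nn.Module):
    """1-D attention gate following the five steps above."""
    def __init__(self, enc_ch, dec_ch, inter_ch):
        super().__init__()
        self.w_enc = nn.Conv1d(enc_ch, inter_ch, kernel_size=1)  # step 1
        self.w_dec = nn.Conv1d(dec_ch, inter_ch, kernel_size=1)  # step 1
        self.psi = nn.Conv1d(inter_ch, 1, kernel_size=1)         # step 3

    def forward(self, o_en, o_de):
        x = torch.relu(self.w_enc(o_en) + self.w_dec(o_de))  # steps 1-2
        alpha = torch.sigmoid(self.psi(x))                   # steps 3-4
        return o_en * alpha                                  # step 5

gate = AttentionGate1d(enc_ch=32, dec_ch=32, inter_ch=16)
o_en = torch.randn(4, 32, 200)   # encoder features
o_de = torch.randn(4, 32, 200)   # decoder features at the same scale
gated = gate(o_en, o_de)
print(gated.shape)  # attention-weighted encoder features: (4, 32, 200)
```

The gated encoder output, rather than the raw encoder output, is what gets merged into the decoder, so task-irrelevant regions of the spectrum are suppressed before the skip connection.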
Additionally, during the downsampling process, we added residual blocks to the encoder [
37]. The residual network adds the inputs and outputs of a large number of convolutional layers to extract features from the data. For networks with deeper network layers, adding a residual network can avoid the model degradation problem caused by vanishing gradients during training. Combining the relevant improvements mentioned above, the improved U-NET network proposed in this article can denoise spectral data very well. When the spectral data with noise is input into the network, the encoder compresses the features of the data, and then the decoder amplifies the features to the original size. During the encoding and decoding process, redundant noise information in the spectral data is eliminated, and at the same time, important features of the data are preserved.
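The residual blocks added to the encoder follow the standard identity-shortcut pattern: the block's input is added to its convolutional output, so gradients can flow around the convolutions. A 1-D sketch with illustrative kernel and channel sizes:

```python
import torch
import torch.nn as nn

class ResidualBlock1d(nn.Module):
    """A 1-D residual block: the input is added to the convolutional
    output so that gradients can bypass the convolutions, mitigating
    degradation in deeper networks."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv1d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, x):
        return torch.relu(x + self.body(x))  # identity shortcut

block = ResidualBlock1d(16)
x = torch.randn(2, 16, 400)
out = block(x)
print(out.shape)  # shape is preserved: (2, 16, 400)
```

Because the shortcut requires matching shapes, `padding=1` with a kernel size of 3 keeps the sequence length unchanged through the block.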
2.5. Distributed Learning Classification Module (DLCM)
Distributed learning is an emerging learning method that can allocate data and computing tasks to multiple computing nodes (clients) for parallel computing. During the training process, the computing node (client) can exchange relevant parameters with the server according to a specific algorithm to achieve the purpose of jointly training the model. Through parallel computing, distributed learning can significantly speed up model training and improve the efficiency of the algorithm. In addition, distributed learning can make full use of the information in the dataset and improve the accuracy of the model.
After training with the RSEM and RSDM, we obtained microbial Raman spectrum data with a large amount of data and high dimensions. If we want to accurately classify the data, we need a complex deep learning model. However, in order to achieve the desired effect, the deep learning model needs to be trained a sufficient number of times, and time is also a very important factor in the classification and identification of microorganisms. This is also the original intention of using spectra to classify microorganisms. In addition, deep learning inevitably suffers from training instability during the iterative process. In order to solve this problem, we introduced the idea of distributed learning and built a distributed learning module for the classification of microbial Raman spectrum data. The framework of this module is shown in
Figure 7.
Specifically, the module sets up “i” local clients, each using the same classification network and obtaining consistent initialization parameters from the server. The dataset is evenly divided according to the number of clients, so different clients obtain different sub-datasets. Each client's classification network is a RESNET, which helps to prevent network degradation caused by vanishing or exploding gradients. Subsequently, all clients begin training, with each client iterating “m” times in its own classification network, continuously updating gradients during the iteration process. Each client submits its updated model parameters to the server after finishing its respective training session. The server receives the model parameters from the clients and aggregates them to create the global model for this round. In this manner, a round of global model iteration is completed using distributed learning, and the preceding steps are repeated until the model iteration achieves the intended result and halts.
The overall process of the algorithm is described using Algorithm 1.
Algorithm 1. Model Parallel Training
Input: Clients Ci, each client's respective data, client initial parameters, training rounds T, fixed parameters
Output: Global model W
Initialize global model W0
For each client in clients:
    Train using the initialized parameters and generate local parameters
For round t in T:
    Server executes:
        Receive and aggregate the clients' parameters to generate the global model Wt
        Send Wt to the clients
    Client executes:
        Receive the global model Wt sent by the server
        Use the global model Wt for training and generate updated parameters
        Send the updated parameters to the server
Return WT
END
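The server's aggregation step can be sketched as a FedAvg-style parameter average. This is a minimal illustration of the server/client parameter exchange, not the paper's exact aggregation rule, and the tiny classifier used here is hypothetical:

```python
import copy
import torch
import torch.nn as nn

def aggregate(client_states):
    """Server step: average the parameter tensors uploaded by the
    clients to form the new global model state."""
    global_state = copy.deepcopy(client_states[0])
    for key in global_state:
        stacked = torch.stack([s[key].float() for s in client_states])
        global_state[key] = stacked.mean(dim=0)
    return global_state

# Two hypothetical clients holding the same tiny classifier architecture
clients = [nn.Linear(4, 2) for _ in range(2)]
global_state = aggregate([c.state_dict() for c in clients])
for c in clients:                  # broadcast the global model back
    c.load_state_dict(global_state)
w0 = clients[0].weight.detach()
w1 = clients[1].weight.detach()
print(torch.allclose(w0, w1))  # True: clients now share parameters
```

In a full round, each client would first run its "m" local training iterations before uploading its state dict; the aggregate-and-broadcast step shown here then closes the round.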
2.6. Evaluation Indicators
In order to evaluate the classification effect of the model, we use
Precision,
Recall,
F1
Score and
Accuracy as evaluation indicators. The calculation formulas of the four indicators are as follows:

$$ Precision = \frac{TP}{TP + FP} $$

$$ Recall = \frac{TP}{TP + FN} $$

$$ F1\ Score = \frac{2 \times Precision \times Recall}{Precision + Recall} $$

$$ Accuracy = \frac{TP + TN}{TP + TN + FP + FN} $$
where
TP indicates that the predicted result is microorganism A, and the actual result is microorganism A.
TN means that the predicted result is not microorganism A, and the actual result is not microorganism A.
FP means that the predicted result is microorganism A, but the actual result is not microorganism A.
FN indicates that the predicted result is not microorganism A, but the result is actually microorganism A.
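The four indicators follow directly from the confusion counts defined above. A small helper (the counts below are hypothetical, for illustration only):

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute Precision, Recall, F1 Score, and Accuracy from the
    confusion counts for one microorganism class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return precision, recall, f1, accuracy

# Hypothetical counts for microorganism A
p, r, f1, acc = classification_metrics(tp=36, tn=114, fp=4, fn=6)
print(round(p, 3), round(r, 3), round(f1, 3), round(acc, 3))
```

Precision and recall are computed per class; for the four-class task, the per-class values can then be macro-averaged to summarize overall performance.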