Article

Enhancing Traceability of Italian Almonds through IR Spectroscopy and Chemometric Classifiers

by Claudia Scappaticci, Martina Foschi, Alessio Plaku, Alessandra Biancolillo * and Angelo Antonio D’Archivio *
Department of Physical and Chemical Sciences, University of L’Aquila, Via Vetoio, 67100 L’Aquila, Italy
* Authors to whom correspondence should be addressed.
Appl. Sci. 2023, 13(23), 12765; https://doi.org/10.3390/app132312765
Submission received: 28 October 2023 / Revised: 23 November 2023 / Accepted: 27 November 2023 / Published: 28 November 2023
(This article belongs to the Section Chemical and Molecular Sciences)

Featured Application

ATR-FT-IR coupled with chemometrics for the assessment of geographical origin of Italian almonds.

Abstract

Almonds, the seeds of the almond tree (Prunus amygdalus), are consumed worldwide. The present study used ATR FT-IR spectroscopy followed by chemometric analysis to develop predictive models for determining the geographical origin of almonds from three regions in Southern Italy (Apulia, Calabria, and Sicily). IR spectra were collected on both the almond shell and the edible kernel to characterize the three geographical origins. The spectroscopic data were processed using Soft Independent Modeling of Class Analogies (SIMCA) and Partial Least Squares Discriminant Analysis (PLS-DA). Both SIMCA and PLS-DA revealed that the shell spectra are more useful for assessing the geographical origin of the samples. In particular, the PLS-DA model built on these data achieved a 100% correct classification rate on the external test set for all the investigated classes.

1. Introduction

Almonds are the edible seeds of the Prunus amygdalus tree, a plant native to Asia Minor which, over the years, has spread throughout the Mediterranean basin, the United States of America, and Asian countries. In Italy, almond cultivation covers an area of 37,000 hectares, with plantations mostly located in three southern regions (Apulia, Calabria, and Sicily), even though almonds can be successfully grown in any Italian region [1]. This fruit is highly nutritious and has found applications in traditional medicine for the treatment of numerous diseases. Furthermore, Prunus amygdalus is widely used in the food industry. The high global consumption of almonds is attributed to their versatility and nutritional value. Almonds play a significant role in the confectionery and baking industry, being used in a wide range of products as sources of essential nutrients such as lipids, proteins, vitamins, fiber, and minerals like calcium, iron, magnesium, phosphorus, and potassium. In addition, they have anti-inflammatory properties and promote cardiovascular health, helping to reduce cholesterol [2]. Thanks to these health benefits, almonds are becoming increasingly popular.
As for many other agro-food products, the chemical and organoleptic properties of almonds are strictly linked to the harvesting area; it is therefore important to ensure the authenticity and geographical origin of these nuts [3,4,5]. This research is motivated by the mounting economic pressures linked to the export of Italian almonds, as shown by the price of unshelled almonds reported by the Chamber of Commerce of Bari in November 2023, which rose to EUR 3600.00 per ton [6]. At the same time, it seeks to protect this renowned product from food adulteration [7]. Consequently, the primary objective of this study is to develop an analytical methodology for determining the geographical origin of Italian almonds.
In light of this, attention was focused on three cultivation regions in Southern Italy, namely, Apulia, Calabria, and Sicily, which have a long tradition of almond cultivation. To this end, the potential of infrared spectroscopy combined with chemometric classifiers was explored [8]. This approach was chosen owing to its demonstrated efficacy in analogous investigations [9,10,11,12,13,14,15].

2. Materials and Methods

2.1. Almond Samples and IR Spectra Collection

In the present work, 40 almonds from Apulia, 80 from Calabria, and 40 from Sicily were analyzed, making a total of 160 samples. The analyzed almonds showed clear similarities in terms of form, size, and color, and these physical characteristics were not considered influential factors [16].
The almonds were supplied without the husk (removed by the manufacturer). Prior to the analysis, the shells were rubbed with absorbent paper to remove any impurities on their surface. The samples were neither washed nor dried before IR analysis; no extensive pretreatment of the sample was therefore necessary.
Similarly, once the shells were broken, the almonds were cleaned with paper to remove any solid shell residues; then, they were analyzed without any further pretreatment.
For each almond, two replicate spectra were first collected on the shell (as depicted in Figure 1A). Afterward, the shell was broken, and two replicate spectra were collected on the kernel (as depicted in Figure 1B). The replicates were averaged before chemometric analysis, separately for the shell and for the kernel, so that one shell spectrum and one kernel spectrum were available for each almond. Therefore, a total of 320 spectra (160 for shells and 160 for kernels) were used to build the chemometric models.
The IR analysis was conducted using a PerkinElmer Spectrum Two™ attenuated total reflectance Fourier-transform infrared (ATR FT-IR) spectrometer (PerkinElmer, Waltham, MA, USA). The spectrometer is equipped with a deuterated triglycine sulfate (DTGS) detector and a PerkinElmer Universal Attenuated Total Reflectance (uATR) accessory fitted with a single-bounce diamond crystal. Each acquisition is the result of ten averaged scans acquired in the spectral range of 4000–400 cm⁻¹ with an instrumental resolution of 4 cm⁻¹. The background was collected by exposing the diamond to air after cleaning it with absorbent paper soaked in methanol. After each scan performed on the shell, the ATR crystal was cleaned with absorbent paper soaked in methanol; subsequently, the spectrum of the kernel was acquired. Since the beam penetrates the sample to a depth of a few microns and the average shell is about 2 mm thick, the kernels do not contribute to the signals collected on the shells.

2.2. SIMCA

Soft Independent Modeling of Class Analogy (SIMCA) is a class modeling method that allows for the individual modeling of categories by defining the region of space where it is likely to find samples belonging to the same class [17].
To achieve this goal, a model is constructed for each class of interest, so that the possible membership of an unknown object in each of the considered categories can be assessed. Using SIMCA, a PCA [18] model is built for each known class. From this, it is possible to estimate, for each i-th observation, a distance $d_i$ that combines the sample’s reduced distance in the scores space ($T^2_{i,\mathrm{red}}$) and its reduced distance in the residual space ($Q_{i,\mathrm{red}}$):
$$d_i = \sqrt{\left(T^2_{i,\mathrm{red}}\right)^2 + \left(Q_{i,\mathrm{red}}\right)^2} \qquad (1)$$
Subsequently, an unknown sample is assigned to a category according to $d_i$ [19]. In particular, it is attributed to the modeled category if $d_i < \sqrt{2}$; otherwise, it is rejected.
This approach, unlike discriminant methods, does not allow for a unique prediction of class-belonging. As a consequence, samples may be confused (attributed to more than one class) or predicted as not belonging to any of the investigated categories.
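The acceptance rule of a single SIMCA class model can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the study: the class name `SimcaClassModel`, the use of empirical 95th-percentile limits to reduce T² and Q (chosen here for simplicity), the √2 acceptance threshold commonly paired with this reduced-distance formulation, and the synthetic data are all introduced here for illustration.

```python
import numpy as np


class SimcaClassModel:
    """One-class PCA model with a combined reduced T^2 / Q distance (sketch)."""

    def __init__(self, n_pcs, quantile=0.95):
        self.n_pcs = n_pcs
        self.quantile = quantile  # assumption: empirical critical limits

    def _t2_q(self, X):
        Xc = X - self.mean_
        scores = Xc @ self.loadings_
        residuals = Xc - scores @ self.loadings_.T
        t2 = np.sum(scores**2 / self.score_var_, axis=1)  # distance in score space
        q = np.sum(residuals**2, axis=1)                  # distance in residual space
        return t2, q

    def fit(self, X):
        self.mean_ = X.mean(axis=0)
        _, s, vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        self.loadings_ = vt[: self.n_pcs].T
        self.score_var_ = s[: self.n_pcs] ** 2 / (X.shape[0] - 1)
        t2, q = self._t2_q(X)
        # Critical values used to "reduce" (normalize) T^2 and Q.
        self.t2_crit_ = np.quantile(t2, self.quantile)
        self.q_crit_ = np.quantile(q, self.quantile)
        return self

    def distance(self, X):
        t2, q = self._t2_q(X)
        return np.sqrt((t2 / self.t2_crit_) ** 2 + (q / self.q_crit_) ** 2)

    def accepts(self, X):
        return self.distance(X) < np.sqrt(2)


# Synthetic stand-in for the training spectra of one class (hypothetical data).
rng = np.random.default_rng(0)
X_class = rng.normal(size=(40, 20))
X_far = X_class[:5] + 50.0  # objects far away from the class

model = SimcaClassModel(n_pcs=2).fit(X_class)
accepted_train = model.accepts(X_class)
accepted_far = model.accepts(X_far)
```

Because each class is modeled independently, running `accepts` for every class model on the same sample can return several acceptances or none, which is the behavior described in the paragraph above.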

2.3. PLS-DA

Partial Least Squares Discriminant Analysis (PLS-DA) is a chemometric method used for discriminant classification [20,21]. This approach is based on identifying and characterizing differences between samples belonging to different classes, focusing on creating distinct boundaries known as “decision surfaces”. These decision surfaces separate the multivariate space defined by the measured variables in a way that clearly distinguishes one class space from the spaces of other classes. In this way, PLS-DA is able to uniquely assign each sample to a specific class or category.
PLS-DA involves the transformation of a classification problem into a regression one, which is then solved using PLS. To make this possible, the class membership of the samples is encoded in a dummy response matrix (Y), which represents, in binary coding, the class of each observation. For instance, in a three-class problem, such as the one discussed in the present work, a sample belonging to class 1 will be associated with the y-vector y = [1 0 0], a sample belonging to class 2 will be coded as y = [0 1 0], and a class 3 object as y = [0 0 1]. Once the dummy Y has been generated, a calibration model can be built by solving Equation (2):
$$Y = XB + E \qquad (2)$$
where Y is the dummy matrix, X is the training data matrix, B represents the regression coefficients, and E represents the residuals.
The estimated B can be used to predict the class membership of unknown samples. Nevertheless, the predicted Y will be made of continuous values (i.e., it will not be binary), making the class assessment less straightforward. Consequently, different strategies can be followed to make predictions with this calibration model [22]. In the present work, the procedure suggested by Pérez et al. was used [23].
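The dummy coding and the regression step can be illustrated with a compact NumPy sketch. It is only a sketch under stated assumptions: the function names, the NIPALS-style PLS2 fit, and the synthetic "spectra" are hypothetical, and the final class assignment uses a plain arg-max on the predicted Y as a simple stand-in, not the procedure of Pérez et al. adopted in the study.

```python
import numpy as np


def dummy_matrix(labels, classes):
    """Binary class-membership matrix Y (one column per class)."""
    return np.array([[1.0 if lab == c else 0.0 for c in classes] for lab in labels])


def pls_fit(X, Y, n_lv):
    """Minimal PLS2 fit; returns B of Y = XB + E for mean-centered data."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    E, F = X - x_mean, Y - y_mean
    W, P, Q = [], [], []
    for _ in range(n_lv):
        # Weight vector: dominant left singular vector of the X-Y covariance.
        w = np.linalg.svd(E.T @ F, full_matrices=False)[0][:, 0]
        t = E @ w
        p = E.T @ t / (t @ t)
        q = F.T @ t / (t @ t)
        E = E - np.outer(t, p)  # deflate X
        F = F - np.outer(t, q)  # deflate Y
        W.append(w)
        P.append(p)
        Q.append(q)
    W, P, Q = np.column_stack(W), np.column_stack(P), np.column_stack(Q)
    B = W @ np.linalg.solve(P.T @ W, Q.T)
    return B, x_mean, y_mean


def pls_da_predict(X, B, x_mean, y_mean, classes):
    Y_hat = (X - x_mean) @ B + y_mean  # continuous, not binary
    # Assumption: arg-max assignment instead of the Pérez et al. procedure.
    return [classes[i] for i in np.argmax(Y_hat, axis=1)], Y_hat


# Hypothetical, well-separated synthetic data standing in for IR spectra.
classes = ["Apulia", "Calabria", "Sicily"]
rng = np.random.default_rng(0)
labels = [c for c in classes for _ in range(20)]
centroids = np.zeros((3, 30))
for k in range(3):
    centroids[k, 10 * k : 10 * (k + 1)] = 1.0
X = np.array([centroids[classes.index(lab)] for lab in labels])
X = X + 0.05 * rng.normal(size=X.shape)

Y = dummy_matrix(labels, classes)
B, x_mean, y_mean = pls_fit(X, Y, n_lv=2)
predicted, Y_hat = pls_da_predict(X, B, x_mean, y_mean, classes)
accuracy = float(np.mean([p == t for p, t in zip(predicted, labels)]))
```

The rows of `Y_hat` contain values close to, but not exactly, 0 and 1, which is why a decision rule is needed on top of the regression.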

3. Results

In this section, the outcome of the SIMCA and PLS-DA analyses is discussed. Regardless of the classifier employed, distinct pretreatments were tested on the data: first derivative [24], second derivative [25], and Standard Normal Variate (SNV) [26]. For both classifiers, a distinct classification model was built for each tested pretreatment. More details can be found in Sections 2.2, 2.3, 3.1 and 3.2. All data were mean-centered before the creation of any model.
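For readers unfamiliar with these pretreatments, the sketch below shows SNV and Savitzky-Golay derivatives written with NumPy only. The window width (11 points) and polynomial order (2) are assumptions, since the text does not report them, and the function names are hypothetical. SciPy's `scipy.signal.savgol_filter` provides the same filter with edge handling; this version simply trims the edges.

```python
import math

import numpy as np


def snv(X):
    """Standard Normal Variate: center and scale each spectrum (row)."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, ddof=1, keepdims=True)


def savgol_derivative(X, window=11, polyorder=2, deriv=1):
    """Savitzky-Golay derivative along each row (unit point spacing).

    Assumptions: odd window, edges trimmed by window // 2 points per side.
    """
    X = np.asarray(X, dtype=float)
    half = window // 2
    z = np.arange(-half, half + 1, dtype=float)
    A = np.vander(z, polyorder + 1, increasing=True)
    # Row `deriv` of the pseudo-inverse gives the local polynomial coefficient;
    # multiplying by deriv! turns it into the derivative at the window center.
    kernel = math.factorial(deriv) * np.linalg.pinv(A)[deriv]
    return np.array([np.convolve(row, kernel[::-1], mode="valid") for row in X])


# Two toy "spectra": a straight line and a parabola.
x = np.arange(50.0)
X_toy = np.vstack([2.0 * x + 1.0, x**2])

X_snv = snv(X_toy)
d1 = savgol_derivative(X_toy, deriv=1)
d2 = savgol_derivative(X_toy, deriv=2)
snv_d1 = savgol_derivative(snv(X_toy), deriv=1)  # combined SNV + D1
```

Combined pretreatments such as SNV + D1 are obtained by chaining the two functions, as in the last line.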
To ensure robust model evaluation, the spectra were divided into two distinct sets: one for training and one for validation. Specifically, 80 samples were allocated for training, and 80 were reserved for validation. The dataset was divided into two mutually exclusive subsets using the Duplex algorithm [27]. Regardless of the classifier used, all models were created in order to predict the geographic origin of the samples. Consequently, three classes were taken into consideration: Calabria, Apulia, and Sicily.
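The Duplex idea can be sketched as follows: the two most distant samples seed the training set, the next two most distant seed the test set, and the sets are then grown alternately by adding the remaining sample farthest from those already selected. This is a minimal sketch assuming Euclidean distances on the (pretreated) spectra; the function names are hypothetical, and whether the split was performed per class or on the whole data set is not stated in the text.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance matrix computed from the Gram matrix."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.sqrt(np.maximum(d2, 0.0))


def duplex_split(X, n_train):
    """Duplex-style split into training and test indices (sketch)."""
    n = X.shape[0]
    n_test = n - n_train
    D = pairwise_distances(X)
    remaining = list(range(n))

    def pop_farthest_pair():
        sub = D[np.ix_(remaining, remaining)]
        i, j = np.unravel_index(np.argmax(sub), sub.shape)
        pair = [remaining[i], remaining[j]]
        for k in pair:
            remaining.remove(k)
        return pair

    def pop_farthest_from(selected):
        # Candidate whose nearest already-selected neighbor is farthest away.
        min_d = D[np.ix_(remaining, selected)].min(axis=1)
        k = remaining[int(np.argmax(min_d))]
        remaining.remove(k)
        return k

    train = pop_farthest_pair()
    test = pop_farthest_pair()
    while remaining:
        if len(train) < n_train:
            train.append(pop_farthest_from(train))
        if remaining and len(test) < n_test:
            test.append(pop_farthest_from(test))
    return np.array(train), np.array(test)


# Hypothetical data: 20 samples, 5 variables, split 10 / 10.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 5))
train_idx, test_idx = duplex_split(X_demo, n_train=10)
```

Alternating between the two sets is what gives both subsets a similar coverage of the data space, in contrast to a purely random split.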
Prior to the classification analysis, Principal Component Analysis (PCA) was used to explore the data to evaluate the presence of suspicious samples or outliers; this did not highlight any anomaly.

3.1. Results of SIMCA Modeling

In this section, we delve into the results obtained from the SIMCA models, which were developed to classify samples into three geographical origins.
The use of SIMCA allowed us to create models tailored to each class, ensuring that the characteristics and patterns of each group were adequately represented.
Two different data sets were available for SIMCA: spectra collected on the shell and on the kernel. Regardless of the sampling position, the SIMCA models were created on raw (mean-centered) data, and after different data pretreatments, namely, SNV, first derivative (D1), second derivative (D2), SNV + D1, and SNV + D2. Consequently, six different cross-validated calibration models were calculated for each class. Model parameters (i.e., the most suitable pretreatment and the optimal number of PCs) for each category were identified by looking at the cross-validated efficiency (geometric mean of sensitivity and specificity).
Although six different SIMCA models were calculated for each investigated class, only details about the chosen calibration models are reported in Table 1.
The application of these models on the external test set yielded the sensitivities and the specificities reported in Table 2. In this table, the diagonal shows the sensitivities for each category, whereas in all the other entries the specificity with respect to all the different classes is reported. The total specificities for class Calabria, Apulia, and Sicily are 92.5%, 95.0%, and 80.0%, respectively.
The outcome of the models can also be appreciated in Figure 2, where a representation of the SIMCA models’ prediction in the T²red and Qred spaces is given for classes Calabria (A), Apulia (B), and Sicily (C). From this figure, it is apparent that all models present a satisfying sensitivity in prediction (>85%) and high specificity (>92%), except for class Sicily, whose model erroneously accepts a non-negligible number of objects from Apulia (~30%, six test samples) and Calabria (15%, six test samples).
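The figures quoted for the Sicily model can be checked with a short calculation. The sketch below assumes the test set contains 20 Apulia and 40 Calabria samples, as implied by the percentages above; the helper names and the sensitivity value passed to `efficiency` are illustrative only.

```python
import math


def specificity(n_wrongly_accepted, n_foreign):
    """Fraction of samples from other classes correctly rejected by a class model."""
    return 1.0 - n_wrongly_accepted / n_foreign


def efficiency(sensitivity, specificity_value):
    """Geometric mean of sensitivity and specificity."""
    return math.sqrt(sensitivity * specificity_value)


# Sicily class model on the external test set (counts taken from the text).
spec_vs_apulia = specificity(6, 20)              # six of 20 Apulia samples accepted
spec_vs_calabria = specificity(6, 40)            # six of 40 Calabria samples accepted
total_spec_sicily = specificity(6 + 6, 20 + 40)  # twelve of 60 foreign samples accepted

# Illustrative only: 0.90 is a hypothetical sensitivity, not a value from the paper.
example_efficiency = efficiency(0.90, total_spec_sicily)

print(f"{spec_vs_apulia:.1%} {spec_vs_calabria:.1%} {total_spec_sicily:.1%}")
```

The pooled value reproduces the 80.0% total specificity reported for class Sicily.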
Similarly, in Table 3, only the optimal calibration models calculated on the spectra collected on the almonds’ kernels are shown.
The application of these models on the external test set yielded the sensitivities and the specificities reported in Table 4, whose organization resembles Table 2.
The total specificities (on the external test set) for class Calabria, Apulia, and Sicily are 72.5%, 60.0%, and 61.7%, respectively.
Comparing the results of SIMCA, it is evident that there is a notable disparity in model accuracy, with the models trained on shell spectra outperforming those trained on kernel spectra.
This outcome suggests that the spectra of almond shells contain more discriminative features for classification purposes than those of almond kernels. Consequently, the kernel-based SIMCA models are not inspected further.

3.2. Results of PLS-DA Modeling

As before, two different data sets were available for PLS-DA; models were created on raw (mean-centered) data, and after different data pretreatments, namely, SNV, first derivative (D1), second derivative (D2), SNV + D1, and SNV + D2. Consequently, six different cross-validated calibration models were calculated on shell and kernel spectra.
The optimal model parameters, i.e., the appropriate data preprocessing and the most suitable number of LVs to be used to build the calibration model, were chosen by inspecting the total cross-validated correct classification rate (% CCRcv).
In Table 5, only the details associated with the optimal model, together with its prediction on the external test set (% CCRpred), are shown.
From Table 5, it is clear that the model allows for the correct classification of all test samples. Attaining a 100% correct classification rate is noteworthy and indicates that the spectral data contain highly discriminating information about the geographical origin. This outcome can also be appreciated in Figure 3, where samples are projected onto the space spanned by the two canonical variates (CVs). From the plot, it is apparent that the three classes form three clear groups in the CV space. In fact, samples belonging to class Calabria (red diamonds) are the only ones presenting negative CV1 scores, whereas objects belonging to Sicily (blue stars) and Apulia (green squares) are separated along CV2. Indeed, most samples from Sicily fall at positive CV2 scores, whereas almonds from Apulia present negative CV2 scores.
Similarly, PLS-DA models were built on kernel spectra; the outcome of the analysis is displayed in Table 6.
The application of PLS-DA on the kernel spectra provides less accurate predictions. In fact, the optimal calibration model (built on data preprocessed by the first derivative) misclassified 11 out of 80 test samples; in particular, of the 40 almonds from Calabria, 2 were predicted as belonging to class Apulia and 1 to class Sicily; 18 (out of 20) objects belonging to class Apulia were properly classified, whereas 2 samples were attributed to the other two categories; and 3 almonds harvested in Sicily were predicted as belonging to class Apulia (all the others were properly classified). This outcome confirms what was already observed while discussing SIMCA, i.e., the IR profiles collected on kernels are less suitable than shell spectra for the geographical classification of the investigated samples.

4. Discussion

As outlined in Section 3.1, the outcome of SIMCA revealed a notable disparity in model accuracy, with the models trained on shell spectra outperforming those trained on kernel spectra. This indicates that shells present a more informative IR profile from the geographical origin point of view. This can be attributed to the composition, surface properties, and nature of the spectra collected on the shells. In fact, the spectral data collected on almond shells may exhibit greater consistency across samples due to the uniform nature of the shell’s surface [28]. In contrast, kernel spectra may vary more due to factors like size, ripeness, and internal variations [29]. Furthermore, by collecting spectra on the kernel, the pressure exerted on the core can cause the oils to leak out, altering the collected spectrum.
The outcome of PLS-DA confirms what was observed in SIMCA. Indeed, in both cases, the classification achieved on the shell spectra is better than that obtained by analyzing the spectra collected on the kernel. Furthermore, the results of PLS-DA are more accurate than those achieved using SIMCA. For this reason, the focus was placed on the PLS-DA model calculated on the shell spectra, and a Variable Importance in Projection (VIP) analysis was applied to understand the spectral variables that mostly contribute to the discrimination among the three different origins.
Briefly, the VIP analysis allows for the estimation of VIP indices, which represent the contribution of each variable to the solution of the classification problem. By construction, variables with VIP indices higher than 1 are considered significant. In Figure 4, a graphical representation of the spectral variables contributing the most to the model is shown. In particular, the black lines represent the mean spectra (offset to make them visible), whereas red, green, and blue dots highlight significant variables for Class Calabria, Apulia, and Sicily, respectively.
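A common way to compute VIP indices from a fitted PLS model is sketched below. The function takes the X-scores, X-weights, and Y-loadings as plain arrays; the function name and the random arrays are hypothetical stand-ins for a fitted model (with scikit-learn's `PLSRegression`, for instance, they would correspond to the `x_scores_`, `x_weights_`, and `y_loadings_` attributes). The sketch returns one overall VIP per variable, which is an assumption about the variant; a class-specific VIP, as plotted in Figure 4, could be approximated by passing only the row of the Y-loadings for that class.

```python
import numpy as np


def vip_scores(T, W, Q):
    """Variable Importance in Projection.

    T: (n_samples, n_lv) X-scores
    W: (n_variables, n_lv) X-weights
    Q: (n_responses, n_lv) Y-loadings
    """
    n_variables = W.shape[0]
    # Y-variance explained by each latent variable.
    ss = np.sum(T**2, axis=0) * np.sum(Q**2, axis=0)
    # Normalize each weight vector to unit length.
    W_norm = W / np.linalg.norm(W, axis=0, keepdims=True)
    return np.sqrt(n_variables * (W_norm**2 @ ss) / ss.sum())


# Random arrays standing in for a fitted three-class PLS-DA model (hypothetical).
rng = np.random.default_rng(0)
T_demo = rng.normal(size=(30, 3))
W_demo = rng.normal(size=(100, 3))
Q_demo = rng.normal(size=(3, 3))

vip = vip_scores(T_demo, W_demo, Q_demo)
significant = vip > 1.0                                      # "greater than one" rule
vip_class_0 = vip_scores(T_demo, W_demo, Q_demo[[0], :])     # single-response variant
```

With this definition the mean of the squared VIP indices equals 1, which is why 1 serves as the natural cut-off for significance.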
From this plot, it is possible to obtain indications of the most significant features. The selected peaks found in the region between 3000 and 2700 cm⁻¹ are representative of asymmetric and symmetric stretching vibrations (peaks at ~2950 and ~2850 cm⁻¹) of aliphatic C-H groups. The peak at ~2950 cm⁻¹ can be attributed to the stretching vibration of saturated aliphatic alkyl (C-H) groups present in celluloses and hemicelluloses, whereas the vibration at 2800 cm⁻¹ originates from the stretching of lignin [30]. The signal between 1740 and 1710 cm⁻¹ can be associated with the vibrations of acetyl groups and uronic ester groups in hemicelluloses or the linkage of carboxyl groups via esters of ferulic and p-coumaric acids of lignin, and with the stretching of -COOH groups of the ferulic and p-coumaric acids present in lignin. Furthermore, peaks in the range of 1850–1680 cm⁻¹ correspond to the ester carbonyl functional group (C=O) of triacylglyceride esters [31,32].
All the highlighted variables between 1600 and 1100 cm⁻¹ could be associated with vibrations of the aromatic rings in lignin or with the CH, CH2, OH, C-C, C-O, and CCO skeleton vibrations. Moreover, the absorptions around 1530 cm⁻¹ are ascribable to N−H in amide I and amide II of proteins [33]. The band at ~1050 cm⁻¹ is attributable to the stretching of the C-O-C functional group due to the presence of polysaccharides associated with the hemicelluloses [34].
The variables highlighted in the fingerprint area are difficult to interpret; however, the signals may presumably belong to poly- and oligosaccharides.
It is not surprising that the identified spectral features play a crucial role in distinguishing almonds cultivated in areas with diverse pedoclimatic conditions. This is because perturbations in precipitation patterns and water availability have a significant impact on the synthesis of secondary metabolites in plants, both in terms of quantity and quality, thereby influencing their flavor, aroma, and medicinal characteristics [35,36,37,38].

5. Conclusions

The use of PLS-DA and SIMCA classifiers highlights a notable disparity in accuracy between shell and kernel spectra. All models built on the spectra collected on the almonds’ shells provide extremely satisfying classification rates for all the considered classes. Notably, models based on shell spectra consistently outperform those utilizing kernel spectra. This superior accuracy indicates that the chemical composition and spectral characteristics of almond shells offer more distinct and informative features for traceability purposes, likely attributed to the shell’s greater stability and reduced susceptibility to variations compared to the kernel.
The heightened accuracy of shell-based models holds significant promise for enhancing quality control and authentication in the almond supply chain. These models can precisely identify the origin of almonds, contributing to improved product quality control.
Furthermore, it is important to underscore that the utilization of shell spectra enables the non-destructive analysis of almonds. This non-invasive approach is particularly advantageous, as it permits the evaluation of almond samples without compromising their physical integrity. As a result, it preserves the quality and market value of the almonds while delivering reliable traceability information.

Author Contributions

Conceptualization, A.A.D.; methodology, A.A.D., M.F. and A.B.; software, A.B. and M.F.; validation, A.A.D. and A.B.; formal analysis, A.P. and C.S.; investigation, A.P. and C.S.; resources, A.A.D.; data curation, A.P. and C.S.; writing—original draft preparation, C.S. and A.B.; writing—review and editing, A.B. and A.A.D.; visualization, A.B.; supervision, A.A.D.; project administration, A.A.D.; funding acquisition, A.A.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Dammak, M.I.; Chakroun, I.; Mzoughi, Z.; Amamou, S.; Mansour, H.B.; Le Cerf, D.; Majdoub, H. Characterization of Polysaccharides from Prunus Amygdalus Peels: Antioxidant and Antiproliferative Activities. Int. J. Biol. Macromol. 2018, 119, 198–206.
  2. Maguire, S.M.L.S.; O’Sullivan, K.G.T.P.O.; O’Brien, N.M. Fatty Acid Profile, Tocopherol, Squalene and Phytosterol Content of Walnuts, Almonds, Peanuts, Hazelnuts and the Macadamia Nut. Int. J. Food Sci. Nutr. 2004, 55, 171–178.
  3. Arndt, M.; Rurik, M.; Drees, A.; Bigdowski, K.; Kohlbacher, O.; Fischer, M. Comparison of Different Sample Preparation Techniques for NIR Screening and Their Influence on the Geographical Origin Determination of Almonds (Prunus Dulcis MILL.). Food Control 2020, 115, 107302.
  4. Arndt, M.; Rurik, M.; Drees, A.; Ahlers, C.; Feldmann, S.; Kohlbacher, O.; Fischer, M. Food Authentication: Determination of the Geographical Origin of Almonds (Prunus Dulcis MILL.) via near-Infrared Spectroscopy. Microchem. J. 2021, 160, 105702.
  5. Netto, J.M.; Honorato, F.A.; Celso, P.G.; Pimentel, M.F. Authenticity of Almond Flour Using Handheld near Infrared Instruments and One Class Classifiers. J. Food Compos. Anal. 2023, 115, 104981.
  6. Available online: https://www.ba.camcom.it/info/listino-ortofrutta-e-mandorle-2023 (accessed on 20 November 2023).
  7. Von Wuthenau, K.; Segelke, T.; Müller, M.S.; Behlok, H.; Fischer, M. Food Authentication of Almonds (Prunus Dulcis Mill.). Origin Analysis with Inductively Coupled Plasma Mass Spectrometry (ICP-MS) and Chemometrics. Food Control 2022, 134, 108689.
  8. Firmani, P.; Bucci, R.; Marini, F.; Biancolillo, A. Authentication of “Avola Almonds” by near Infrared (NIR) Spectroscopy and Chemometrics. J. Food Compos. Anal. 2019, 82, 103235.
  9. Biancolillo, A.; Foschi, M.; Di Micco, M.; Di Donato, F.; D’Archivio, A.A. ATR-FTIR-Based Rapid Solution for the Discrimination of Lentils from Different Origins, with a Special Focus on PGI and Slow Food Typical Varieties. Microchem. J. 2022, 178, 107327.
  10. Reale, S.; Biancolillo, A.; Foschi, M.; Di Donato, F.; Di Censo, E.; D’Archivio, A.A. Geographical Discrimination of Italian Carrot (Daucus Carota L.) Varieties: A Comparison between ATR FT-IR Fingerprinting and HS-SPME/GC-MS Volatile Profiling. Food Control 2023, 146, 109508.
  11. Mellado-Carretero, J.; García-Gutiérrez, N.; Ferrando, M.; Güell, C.; García-Gonzalo, D.; De Lamo-Castellví, S. Rapid Discrimination and Classification of Edible Insect Powders Using ATR-FTIR Spectroscopy Combined with Multivariate Analysis. J. Insects Food Feed 2020, 6, 141–148.
  12. Schwolow, S.; Gerhardt, N.; Rohn, S.; Weller, P. Data Fusion of GC-IMS Data and FT-MIR Spectra for the Authentication of Olive Oils and Honeys—Is It Worth to Go the Extra Mile? Anal. Bioanal. Chem. 2019, 411, 6005–6019.
  13. David, M.; Hategan, A.R.; Berghian-Grosan, C.; Magdas, D.A. The Development of Honey Recognition Models Based on the Association between ATR-IR Spectroscopy and Advanced Statistical Tools. Int. J. Mol. Sci. 2022, 23, 9977.
  14. Foschi, M.; D’Addario, A.; D’Archivio, A.A.; Biancolillo, A. Future Foods Protection: Supervised Chemometric Approaches for the Determination of Adulterated Insects’ Flours for Human Consumption by Means of ATR-FTIR Spectroscopy. Microchem. J. 2022, 183, 108021.
  15. Cortés, V.; Barat, J.M.; Talens, P.; Blasco, J.; Lerma-García, M.J. A Comparison between NIR and ATR-FTIR Spectroscopy for Varietal Differentiation of Spanish Intact Almonds. Food Control 2018, 94, 241–248.
  16. Oliveira, I.; Meyer, A.; Afonso, S.; Ribeiro, C.; Gonçalves, B. Morphological, Mechanical and Antioxidant Properties of Portuguese Almond Cultivars. J. Food Sci. Technol. 2018, 55, 467–478.
  17. Wold, S.; Sjöström, M. SIMCA: A Method for Analyzing Chemical Data in Terms of Similarity and Analogy. In Chemometrics: Theory and Application; ACS Publication: Washington, DC, USA, 1977; pp. 243–282.
  18. Jolliffe, I.T. A Note on the Use of Principal Components in Regression. J. R. Stat. Soc. 1982, 31, 300–303.
  19. Forina, M.; Oliveri, P.; Lanteri, S.; Casale, M. Class-Modeling Techniques, Classic and New, for Old and New Problems. Chemom. Intell. Lab. Syst. 2008, 93, 132–148.
  20. Wold, S.; Martens, H.; Wold, H. The Multivariate Calibration Problem in Chemistry Solved by the PLS Method. In Matrix Pencils; Kågström, B., Ruhe, A., Eds.; Springer Berlin Heidelberg: Berlin/Heidelberg, Germany, 1983; pp. 286–293.
  21. Geladi, P.; Kowalski, B.R. Partial Least-Squares Regression: A Tutorial. Anal. Chim. Acta 1986, 185, 1–17.
  22. Barker, M.; Rayens, W. Partial Least Squares for Discrimination. J. Chemom. 2003, 17, 166–173.
  23. Pérez, N.F.; Ferré, J.; Boqué, R. Calculation of the Reliability of Classification in Discriminant Partial Least-Squares Binary Classification. Chemom. Intell. Lab. Syst. 2009, 95, 122–128.
  24. Savitzky, A.; Golay, M.J.E. Smoothing and Differentiation of Data by Simplified Least Squares Procedures. Anal. Chem. 1964, 36, 1627–1639.
  25. Rinnan, Å.; van den Berg, F.; Engelsen, S.B. Review of the most common pre-processing techniques for near-infrared spectra. TrAC-Trends Anal. Chem. 2009, 28, 1201–1222.
  26. Barnes, R.J.; Dhanoa, M.S.; Lister, S.J. Standard Normal Variate Transformation and De-Trending of Near-Infrared Diffuse Reflectance Spectra. Appl. Spectrosc. 1989, 43, 772–777.
  27. Snee, R.D. Validation of Regression Models: Methods and Examples. Technometrics 1977, 19, 415–428.
  28. Li, X.; Liu, Y.; Hao, J.; Wang, W. Study of Almond Shell Characteristics. Materials 2018, 11, 1782.
  29. Summo, C.; Palasciano, M.; De Angelis, D.; Paradiso, V.M.; Caponio, F.; Pasqualone, A. Evaluation of the Chemical and Nutritional Characteristics of Almonds (Prunus Dulcis (Mill). D.A. Webb) as Influenced by Harvest Time and Cultivar. J. Sci. Food Agric. 2018, 98, 5647–5655.
  30. Nazir, S.; Habib, U.; Ul Islam, T. Extraction and Characterization of Microcrystalline Cellulose from Walnut, Almond and Apricot Stone Shells. J. Chem. Soc. Pak. 2023, 45, 85–92.
  31. Lohumi, S.; Lee, S.; Lee, H.; Cho, B.-K. A Review of Vibrational Spectroscopic Techniques for the Detection of Food Authenticity and Adulteration. Trends Food Sci. Technol. 2015, 46, 85–98.
  32. Vlachos, N.; Skopelitis, Y.; Psaroudaki, M.; Konstantinidou, V.; Chatzilazarou, A.; Tegou, E. Applications of Fourier Transform-Infrared Spectroscopy to Edible Oils. Anal. Chim. Acta 2006, 573–574, 459–465.
  33. Subramanian, A.; Harper, W.J.; Rodriguez-Saona, L.E. Rapid Prediction of Composition and Flavor Quality of Cheddar Cheese Using ATR–FTIR Spectroscopy. J. Food Sci. 2009, 74, C292–C297.
  34. Faqeerzada, M.A.; Lohumi, S.; Joshi, R.; Kim, M.S.; Baek, I.; Cho, B.-K. Non-Targeted Detection of Adulterants in Almond Powder Using Spectroscopic Techniques Combined with Chemometrics. Foods 2020, 9, 876.
  35. Liu, H.; Wei, Y.; Zhang, Y.; Wei, S.; Zhang, S.; Guo, B. The Effectiveness of Multi-Element Fingerprints for Identifying the Geographical Origin of Wheat. Int. J. Food Sci. Technol. 2017, 52, 1018–1025.
  36. Zhao, H.; Guo, B.; Wei, Y.; Zhang, B. Multi-Element Composition of Wheat Grain and Provenance Soil and Their Potentialities as Fingerprints of Geographical Origin. J. Cereal Sci. 2013, 57, 391–397.
  37. Giupponi, L.; Leoni, V.; Pavlovic, R.; Giorgi, A. Influence of Altitude on Phytochemical Composition of Hemp Inflorescence: A Metabolomic Approach. Molecules 2020, 25, 1381.
  38. Suyal, R.; Rawat, S.; Rawal, R.S.; Bhatt, I.D. Variability in Morphology, Phytochemicals, and Antioxidants in Polygonatum Verticillatum (L.) All. Populations under Different Altitudes and Habitat Conditions in Western Himalaya, India. Environ. Monit. Assess. 2019, 191, 783.
Figure 1. Acquisition of spectra: (A) on almond shell; (B) on the kernel.
Figure 2. Graphical representation of the SIMCA models’ predictions in the T²red and Qred space. (A) Model of class Calabria; (B) model of class Apulia; (C) model of class Sicily. Legend: empty symbols represent training objects and filled symbols represent test objects; red diamonds denote Calabria samples, green squares Apulia samples, and blue stars Sicily samples. The dashed lines mark the limits of the class space.
Figure 3. PLS-DA: Projection of samples onto the two canonical variates (CVs).
Figure 4. VIP analysis. Variables presenting a VIP index > 1 are highlighted as follows: red dots for class Calabria, green dots for class Apulia, and blue dots for class Sicily.
Table 1. SIMCA on the spectra collected on the almonds’ shells.
Class | Pretreatment | PCs | Efficiency (% CV)
Calabria | raw | 2 | 91.7
Apulia | SNV + D1 | 3 | 86.4
Sicily | SNV + D2 | 4 | 83.7
Table 2. Class-by-class sensitivities and specificities for the SIMCA models calculated on the spectra collected on the almonds’ shells.
Class | Calabria | Apulia | Sicily
Calabria | 90.0 | 90.0 | 95.0
Apulia | 100.0 | 85.0 | 85.0
Sicily | 85.0 | 70.0 | 95.0
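As a rough illustration of how figures of this kind are usually obtained for a class model, the sketch below applies the conventional definitions: sensitivity is the percentage of samples of the modelled class that the model accepts, and specificity is the percentage of samples of each other class that it rejects. The function name, its arguments, and the toy acceptance flags are hypothetical and are not taken from the study; the authors' exact computation is not shown in this section.

```python
def sensitivity_specificity(accepted, true_classes, modelled_class):
    """Return (sensitivity %, {other_class: specificity %}) for one class model.

    accepted[i] is True when the class model accepts sample i.
    """
    # Sensitivity: share of the modelled class's own samples that are accepted.
    own = [a for a, c in zip(accepted, true_classes) if c == modelled_class]
    sensitivity = 100.0 * sum(own) / len(own)

    # Specificity: share of each other class's samples that are rejected.
    specificity = {}
    for other in sorted(set(true_classes) - {modelled_class}):
        flags = [a for a, c in zip(accepted, true_classes) if c == other]
        rejected = sum(1 for a in flags if not a)
        specificity[other] = 100.0 * rejected / len(flags)
    return sensitivity, specificity


# Hypothetical toy data (not from the paper): acceptance by a "Calabria" model.
true_classes = ["Calabria"] * 4 + ["Apulia"] * 4 + ["Sicily"] * 2
accepted = [True, True, True, False,      # Calabria: 3 of 4 accepted
            False, False, True, False,    # Apulia: 3 of 4 rejected
            False, False]                 # Sicily: 2 of 2 rejected
sens, spec = sensitivity_specificity(accepted, true_classes, "Calabria")
print(sens, spec)
```

Running one such calculation per class model gives one sensitivity and one specificity per remaining class, which is the kind of class-by-class summary the table reports.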
Table 3. SIMCA on the spectra collected on the almonds’ kernels.
Class | Pretreatment | PCs | Efficiency (% CV)
Calabria | D2 | 10 | 67.0
Apulia | SNV | 6 | 70.8
Sicily | raw | 6 | 68.1
Table 4. Class-by-class sensitivities and specificities for the SIMCA models calculated on the spectra collected on the almonds’ kernels.
Class | Calabria | Apulia | Sicily
Calabria | 77.5 | 65.0 | 80.0
Apulia | 70.0 | 75.0 | 40.0
Sicily | 72.5 | 40.0 | 75.0
Table 5. PLS-DA on the spectra collected on the almonds’ shells.
Pretreatment | LVs | % CCRcv | Calabria % CCRpred | Apulia % CCRpred | Sicily % CCRpred
SNV + D2 | 9 | 98.7 | 100.0 | 100.0 | 100.0
Table 6. PLS-DA on the spectra collected on the almonds’ kernels.
Pretreatment | LVs | % CCRcv | Calabria % CCRpred | Apulia % CCRpred | Sicily % CCRpred
D1 | 12 | 96.2 | 92.5 | 90.0 | 85.0
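For readers unfamiliar with the % CCR columns, the following is a minimal sketch of a per-class correct classification rate, assuming the usual definition: the percentage of samples of a given class that the discriminant model assigns to that class. The function name and the toy labels are hypothetical and do not reproduce the study's data.

```python
from collections import Counter


def ccr_per_class(y_true, y_pred):
    """Return {class: % of that class's samples predicted correctly}."""
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: 100.0 * correct[c] / totals[c] for c in totals}


# Hypothetical toy labels (not from the paper).
y_true = ["Calabria"] * 4 + ["Apulia"] * 2 + ["Sicily"] * 2
y_pred = ["Calabria", "Calabria", "Calabria", "Apulia",
          "Apulia", "Apulia",
          "Sicily", "Calabria"]
rates = ccr_per_class(y_true, y_pred)
print(rates)
```

Applied to cross-validation predictions, the same calculation would give a cross-validated rate; applied to an external test set, it would give a prediction rate per class.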
