A Coding Basis and Three-in-One Integrated Data Visualization Method ‘Ana’ for the Rapid Analysis of Multidimensional Omics Dataset

Zhao, Hefei; Wang, Selina C.

doi:10.3390/life12111864

Open Access Technical Note

Hefei Zhao and Selina C. Wang *

Department of Food Science and Technology, University of California, Davis, One Shields Ave, Davis, CA 95616, USA

* Author to whom correspondence should be addressed.

Life 2022, 12(11), 1864; https://doi.org/10.3390/life12111864

Submission received: 12 October 2022 / Revised: 2 November 2022 / Accepted: 10 November 2022 / Published: 12 November 2022

(This article belongs to the Special Issue Insights in Nutrition and Agri-Food Science Technology by Use of Smart Algorithm, Machine Learning, Multivariable and Omics Data Analysis)


Abstract:

With innovations and advancements in analytical instruments and computer technology, omics studies based on statistical analysis, such as phytochemical omics, oilomics/lipidomics, proteomics, metabolomics, and glycomics, are increasingly popular in food chemistry and nutrition science. However, data processing remains a labor-intensive hurdle, because learning coding skills and software operation is usually time-consuming for researchers without a coding background. In this work, a MATLAB® code-based, three-in-one integrated method, ‘Ana’, was created for data visualization and statistical analysis. As an example, the program loaded and analyzed an omics dataset of 7 samples × 22 compounds from an Excel® file and output six figures covering three types of data visualization, namely a 3D heatmap, heatmap hierarchical clustering analysis, and principal component analysis (PCA), in 18 s on a personal computer (PC) running Windows 10 and in 20 s on a Mac running macOS Monterey. The code rapidly and efficiently prints high-quality figures at resolutions of 150 or 300 dpi. The output figures provide enough contrast to differentiate the omics dataset through both the color code and the bar size, which are adjusted according to higher or lower values, making the figures suitable for publication and presentation. The method provides a rapid analysis that can free researchers from labor-intensive and time-consuming manual or code-based data analysis. A coding example with code annotations and complete user guidance is provided so that undergraduate and postgraduate students can learn code-based statistical data analysis and apply such techniques in their future research.

Keywords:

multidimensional dataset; omics; 3D heatmap; hierarchical clustering analysis; principal component analysis; olive; phenolics; phytochemical; MATLAB®

1. Introduction

With innovations and advancements in analytical instruments and computer technology, omics studies based on statistical analysis, such as phytochemical omics [1], oilomics/lipidomics [2], proteomics [3], metabolomics [4], and glycomics [5], are increasingly popular in food chemistry and nutrition science [6,7]. Clear graphical representation and visual communication are effective ways to present large datasets and dense information to learners. Heatmaps with hierarchical clustering analysis and principal component analysis (PCA) are commonly used cluster analysis methods in omics studies. Wang et al. [8] investigated the interaction of fruity aromas with polyphenols using heatmap cluster analysis in Origin Pro 9.0; Varunjikar et al. [9] analyzed proteomics data from tandem mass spectrometry using heatmap cluster analysis in Omics Explorer V 3.6 software for food-grade insect protein analysis; Lin et al. [10] analyzed the glycome profile of blueberry using a heatmap in R software. Yang et al. [11] combined headspace-gas chromatography-ion mobility spectrometry (HS-GC-IMS) with PCA to detect the flavor compounds of fermented soybean products using a software package with a dynamic PCA plug-in. Green & Wang [12] employed both PCA and hierarchical cluster analysis without a heatmap to classify fatty acid and sterol profiles for analyzing avocado oil quality using OriginPro2016 software. Zhao et al. [13] implemented PCA in R software and machine learning algorithms in Python to classify up to ten types of major edible oils based on fatty acid profiles and Raman spectra datasets; Zhao et al. [14] also applied PCA in R software to analyze the phenolic compound profiles of different cultivars of US midwestern grapes with selenium and lithium fertilizer treatments. Richter et al. [15] used PCA and heatmap cluster analysis in R software to analyze inductively coupled plasma mass spectrometry (ICP-MS) data for the food authentication of German asparagus. Zou et al. [16] analyzed a multidimensional HS-SPME-GC×GC-TOFMS dataset of coffee using ChromaTOF® (ver. 5.51, LECO Corp., St. Joseph, MI, USA), ChromaTOF Tile (ver. 1.01, LECO Corp.), R version 4.0.2, and MATLAB® (ver. R2019b, MathWorks, Natick, MA, USA).

However, processing high-dimensional data from raw food omics datasets is time-consuming [17,18,19] and remains a challenging task for data mining and untargeted foodomics studies [20,21]. Carrying out several data analysis methods may require different software or code packages. For instance, in R software, the packages ‘ggbiplot’ [22] and ‘ggplot’ [23] are usually used for PCA, while another package, ‘heatmap2’ [24], is usually applied for heatmap cluster analysis. It takes time for researchers to learn and operate different software and code packages with confidence.

The objective of the study is to develop an integrated code-based program in MATLAB® software that produces a 3D heatmap, heatmap hierarchical clustering analysis, and PCA all at once by directly reading datasets from Excel® files. The code has been optimized for figure qualities such as resolution, color code, and label font size. The code also adjusts the size of the 3D bars of the heatmap in accordance with the values, which gives readers better data visualization and differentiation. In addition, we have provided code annotations and complete user guidance in the Supplementary Materials for future learning and educational purposes.

2. Methods

2.1. Data Preparation

The original dataset from our previous publication on California (US) olive pomace phenolics [25] was used as the example dataset in this study. As can be seen from Table S1, the data matrix contained 7 extracts × 22 olive pomace phenolic compounds. Data were saved in ‘.xlsx’ format using Microsoft® Excel; in this case, the full file name was ‘olivephenolics.xlsx’, and the file was saved in a MATLAB work folder. Because the input data must be tidy for the best results, Hadley Wickham’s ‘Tidy Data’ concept [26] was followed: each variable (22 phenolics) was a column and each sample observation (7 extracts) was a row. As shown in Figure S1a, the names of the 7 olive pomace extracts were listed in the first column from A2 to A8, and the names of the 22 olive pomace phenolic compounds were listed in the first row from B1 to W1. The text ‘NAME’ was placed in cell A1.

The data area in the Excel file can be expanded in both rows and columns; however, there must be no blank cells anywhere in the data area. The sample observation names must still be listed in the first column and the compound variable names in the first row.

The omics data of each sample must be listed in a row, and the variables/compounds must be listed in columns; otherwise, the program will still run but will output meaningless results.
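The layout rules above can be checked before any analysis is run. The following is a minimal sketch, not the authors' S. Code 1: it writes a small made-up table to a temporary ‘.xlsx’ file (the file name ‘demo_omics.xlsx’, the sample names, and the values are all hypothetical), reads it back with ‘readtable’, and verifies that the data area is numeric and free of blank cells.

```matlab
% Illustrative sketch only; file name, sample names, and values are made up.
demoFile = fullfile(tempdir, 'demo_omics.xlsx');
if isfile(demoFile)
    delete(demoFile);                       % start from a clean file
end

% Tidy layout: one row per sample, one column per compound, 'NAME' in cell A1.
demo = table({'S1'; 'S2'; 'S3'}, [1.2; 0.4; 2.5], [10; 12; 9], ...
    'VariableNames', {'NAME', 'CompoundA', 'CompoundB'});
writetable(demo, demoFile);

% Read the sheet back; the first column holds sample names, the rest hold data.
T = readtable(demoFile, 'VariableNamingRule', 'preserve');
sampleNames   = string(T{:, 1});
compoundNames = string(T.Properties.VariableNames(2:end));
X = T{:, 2:end};                            % samples in rows, compounds in columns

% Blank cells are read as NaN, so they can be detected before analysis.
assert(isnumeric(X), 'The data area must contain numbers only.');
assert(~any(isnan(X), 'all'), 'The data area must not contain blank cells.');
fprintf('%d samples x %d compounds loaded.\n', size(X, 1), size(X, 2));
```

Replacing ‘demoFile’ with ‘olivephenolics.xlsx’ would apply the same checks to the example dataset, assuming it follows the layout described above.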

The Excel data file and the MATLAB ‘.m’ code file have also been uploaded to the MATLAB File Exchange website as a secondary way to obtain the dataset and code. Readers can download them from within the MATLAB software, as shown in the screenshot in Figure S1b, or via the MATLAB File Exchange website [27].

2.2. Software and Coding

MATLAB® 2022a (MathWorks, Natick, MA, USA) with an academic license from the University of California, Davis (UC Davis) was used for all coding and data analysis. The ‘core’ MATLAB functions used for statistical analysis were ‘bar3’ [28], ‘clustergram’ [29] with ‘average’ linkage as the clustering method, and ‘biplot’ [30], for the 3D bar heatmap, the heatmap hierarchical clustering analysis, and the biplot of the principal component analysis (PCA), respectively. S. Code 1 was originally designed by the authors based on those ‘core’ MATLAB functions. The bottom size adjustment of the 3D bar chart heatmap was adapted, with modifications, from the question ‘How do I obtain bars with function bar3 and different widths for each bar?’ [31] on ‘stackoverflow.com’.

The MATLAB ‘.m’ file was prepared by copying and pasting S. Code 1 ‘Ana’ version 1.0 into a new ‘.m’ file window, following the description and guidance in Figure S2. In this case, the full file name was ‘Ana.m’. Both the Excel ‘olivephenolics.xlsx’ file and the MATLAB ‘Ana.m’ file were saved, and must be saved, in the same folder; otherwise, the program will not run properly, because it cannot find the Excel data file in any other folder.
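A short pre-flight check can make the same-folder requirement explicit. This sketch is not part of S. Code 1; it assumes that ‘Ana.m’ is run from the current MATLAB folder and only reports whether the data file named in the text is present there.

```matlab
% Illustrative pre-flight check; not part of the authors' S. Code 1.
dataFile   = 'olivephenolics.xlsx';   % file name given in the text
codeFolder = pwd;                     % assumption: 'Ana.m' is run from the current folder

found = isfile(fullfile(codeFolder, dataFile));
if found
    fprintf('Found %s in %s\n', dataFile, codeFolder);
else
    warning('Ana:missingData', ...
        '%s is not in %s; move it next to Ana.m before running.', ...
        dataFile, codeFolder);
end
```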

2.3. Hardware

Both Apple® macOS Monterey and Microsoft® Windows 10 environments were used to test code compatibility. The hardware for macOS was a 2.3 GHz Quad-Core Intel Core i5 Processor and 8 GB 2133 MHz LPDDR3 RAM. The hardware for Windows was a 4.1 GHz 8-Core 16-Thread AMD Ryzen™ 7 2700X Processor and 16 GB 3200 MHz DDR4 RAM.

3. Results and Discussion

3.1. Heatmap 3D Bar Chart

As can be seen from Figure 1a, the 3D heatmap bar chart generated by the original ‘bar3’ code would not meet the general figure quality requirements of peer-reviewed publications. The default label font size on the three axes was too small to read. The default color code distinguished the compounds from one another, from blue to yellow, whereas the most common convention is to color by compound concentration from high to low. In addition, the bars were not transparent, so lower bars were hidden behind higher bars. In general, the figure from the original code is not readable enough for scientific readers.

Figure 1b has been presented in our previous publication [25]. As compared with Figure 1a, the font size of labels on the three axes was enlarged for better readability. In addition, a color code bar was added to represent higher values in red and lower values in blue. The figure was printed in high resolution at 300 dpi. However, the major problem is that the lower values in blue almost dominated the entire chart and could not be easily differentiated. An interesting conversation [31] on ‘stackoverflow.com’ described a method to resize the bottom length and width based on the values in each data cell. The idea is to increase the size of the bottom when the value is higher while decreasing the size of the bottom when the value is lower. By incorporating the idea and code modifications, the minor compounds did not dominate the screen and the readability increased in Figure 1c.
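The footprint-scaling idea can be sketched with a toy matrix. The code below is an illustration under stated assumptions, not the authors' implementation and not the Stack Overflow answer: instead of editing ‘bar3’ output, it draws each bar as a box with ‘patch’, and the width limits ‘wMin’ and ‘wMax’ as well as the values in ‘Z’ are hypothetical.

```matlab
% Illustrative sketch: bar footprint and color both scale with the cell value.
Z = [0.2 1.5 6.0; 0.1 3.0 9.5];                 % made-up values (rows = samples)
t = (Z - min(Z(:))) ./ (max(Z(:)) - min(Z(:))); % rescale values to the range 0..1
wMin = 0.2;  wMax = 0.9;                        % hypothetical footprint limits
W = wMin + (wMax - wMin) .* t;                  % larger value -> wider bar

cmap  = jet(256);                               % blue (low) to red (high)
faces = [1 2 3 4; 5 6 7 8; 1 2 6 5; 2 3 7 6; 3 4 8 7; 4 1 5 8];

fig = figure;
ax  = axes(fig);
hold(ax, 'on');
for r = 1:size(Z, 1)
    for c = 1:size(Z, 2)
        h = W(r, c) / 2;                        % half-width of this bar
        x = c + [-h; h; h; -h];
        y = r + [-h; -h; h; h];
        V = [x, y, zeros(4, 1); x, y, repmat(Z(r, c), 4, 1)];
        patch(ax, 'Vertices', V, 'Faces', faces, ...
            'FaceColor', cmap(1 + round(255 * t(r, c)), :), 'FaceAlpha', 0.8);
    end
end
view(ax, 3);
colormap(ax, cmap);
caxis(ax, [min(Z(:)) max(Z(:))]);
colorbar(ax);
```

With this scaling the smallest value keeps a thin footprint and the largest value fills most of its cell, so low-value bars no longer dominate the chart.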

In addition, S. Code 1 provides different color scheme options, as can be seen in Figure 1c and Figure 2. In S. Code 1, ‘jet(256)’ outputs the rainbow scheme in Figure 1c; ‘cool’ runs from blue to pink in Figure 2a; ‘parula’ runs from blue to yellow in Figure 2b; and ‘[]’ is transparent in Figure 2c. The code also provides different resolution options from 100 to 300 dpi. The output figures were rich in color and provided enough contrast in both color and bar size to differentiate the omics dataset.
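Switching the colormap and the export resolution takes only a couple of lines. The sketch below is not taken from S. Code 1: it uses a made-up matrix (‘magic(4)’) and a hypothetical output file name, and exports with the same ‘print’ options (‘-dpng’, ‘-r…’) that appear in the Figure S3 description.

```matlab
% Illustrative sketch; the data and the output file name are made up.
fig = figure('Visible', 'off');
bar3(magic(4));                       % toy 3D bar chart
colormap(fig, parula);                % alternatives: jet(256), cool

outFile = fullfile(tempdir, 'demo_3dbar.png');
print(fig, outFile, '-dpng', '-r300');   % '-r150' would export at 150 dpi
close(fig);
```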

3.2. Heatmap Cluster Analysis

With high data density and revealing clusters, heatmap hierarchical clustering analysis provides better visualization than unordered heatmaps [32]. Because the program standardizes the data along each sample row, the row cluster on the left side in Figure 3 grouped the samples based on olive phenolic compound profiles instead of absolute values. The samples WE dry past and WE in DOP formed one cluster, indicating that they had more similarity than the other samples, such as 70M and 70E. The program also provides options for different color codes as can be seen in Figure 3a–d.
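Row standardization is what makes the clustering compare profiles rather than magnitudes. In the toy example below (made-up values and labels), the second sample is ten times the first, yet both rows become identical after standardization. The ‘clustergram’ call is a sketch that assumes the Bioinformatics Toolbox is installed; the option values mirror the description in the text (row standardization, ‘average’ linkage) but are not copied from S. Code 1.

```matlab
% Illustrative sketch; values and labels are made up.
X = [1 2 3 4; 10 20 30 40; 4 3 2 1];            % rows = samples, columns = compounds
rowLabels = {'S1', 'S2', 'S3'};
colLabels = {'C1', 'C2', 'C3', 'C4'};

% Standardize each row to zero mean and unit standard deviation.
Zs = (X - mean(X, 2)) ./ std(X, 0, 2);
disp(Zs);                                       % rows 1 and 2 are now identical

% Draw the clustered heatmap only if the Bioinformatics Toolbox is available.
if ~isempty(which('clustergram'))
    cg = clustergram(X, 'Standardize', 'row', 'Linkage', 'average', ...
        'RowLabels', rowLabels, 'ColumnLabels', colLabels, ...
        'Colormap', redbluecmap);
end
```

Because S1 and S2 share one profile, they would be expected to join first in the row dendrogram, while S3, whose profile is reversed, would join last.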

3.3. PCA Analysis

PCA is a dimensionality reduction method that reduces the original variables to a few top principal components (PCs) that explain most of the variance [33]. As shown in Figure 4a–c, the PCA biplot printed PC1 vs. PC2, PC2 vs. PC3, and PC1 vs. PC2 vs. PC3, respectively. The PCA biplot overlays the loading plot (blue vectors) and the score plot (red stars) on one graph [34,35]. The vectors of the loading plot represent the multivariate variables (in this case, the olive phenolic compounds in Table S1) that drive the differences among samples [36]. The score plot shows points (red stars) that represent the original samples [33]. However, the PCA biplot did not differentiate samples by different colors or dot styles. Therefore, the program ‘Ana’ was designed to output separate score plots, shown in Figure 5, that use different colors for the samples. Figure 4d shows the variance of the individual PCs until the accumulated variance reaches 98%.
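The quantities described above (scores, loadings, explained variance, and the 98% cut-off) can be computed with core MATLAB alone. The sketch below uses a made-up 5 × 3 matrix and a singular value decomposition of the centered data; whether S. Code 1 centers, scales, or calls ‘pca’ directly is not stated here, so the preprocessing is an assumption.

```matlab
% Illustrative sketch; values are made up and only mean-centering is applied.
X = [2.0 0.5 1.0; 1.5 0.7 3.0; 0.3 2.2 2.5; 0.1 2.9 0.8; 1.0 1.0 1.0];
sampleNames = {'S1', 'S2', 'S3', 'S4', 'S5'};

Xc = X - mean(X, 1);                        % center each compound (column)
[U, S, V] = svd(Xc, 'econ');                % V holds the loadings
scores    = U * S;                          % sample coordinates on the PCs
latent    = diag(S).^2 ./ (size(X, 1) - 1); % variance carried by each PC
explained = 100 * latent ./ sum(latent);    % percent of total variance

% Number of PCs needed to reach 98% accumulated variance.
nKeep = find(cumsum(explained) >= 98, 1);
fprintf('%d PCs reach 98%% of the variance.\n', nKeep);

% Score plot with one color per sample.
colors = lines(size(X, 1));
figure;
scatter(scores(:, 1), scores(:, 2), 60, colors, 'filled');
text(scores(:, 1), scores(:, 2), sampleNames, 'VerticalAlignment', 'bottom');
xlabel(sprintf('PC1 (%.1f%%)', explained(1)));
ylabel(sprintf('PC2 (%.1f%%)', explained(2)));
```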

The PCA integrated into ‘Ana’ version 1.0 does not yet differentiate sample clusters (i.e., duplicate or triplicate data for each sample) by both dot styles and color codes, as in the PCA score plots of Zhao et al. [37] for Raman spectra of egg white protein analyzed in R software, nor does it draw 95% confidence ellipses, as in the work of Uchimiya [38] on the resistant genotype and underlying chemistry of sweet sorghum juice. Nevertheless, updated versions of the program are expected to include those functions in the future.

3.4. Time Taking and Code Compatibility

As shown in Figure S5, the program analyzed the dataset of 7 samples × 22 compounds and output six figures for three types of data visualization, namely a 3D heatmap, heatmap hierarchical clustering analysis, and principal component analysis (PCA), in 18 s on a personal computer (PC) running Windows 10 and in 20 s on a Mac running macOS Monterey. The code-based analysis is rapid and compatible with the two operating systems.

4. Conclusions

The improved MATLAB® code-based data analysis and visualization method, ‘Ana’ version 1.0, outputs three types of data analysis, namely a 3D heatmap, heatmap hierarchical clustering analysis, and PCA, from one program that runs in seconds. The code rapidly and efficiently prints high-quality figures at resolutions of 150 or 300 dpi. The colored output figures provide enough contrast to differentiate the omics dataset through both color code and bar size, making the figures suitable for publication and presentation. The program is compatible with both Windows and macOS operating systems.

With complete guidance in the Supplementary Materials, the analysis program can free researchers from labor-intensive and time-consuming manual or code-based data analysis and let them focus fully on the results in their specific area of research with a single click of the ‘Run’ button in the software. This study also provides a coding example with appropriate code annotations so that undergraduate and postgraduate students can learn code-based statistical data analysis and apply such techniques in their future research.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/life12111864/s1. Figure S1: Data and code preparation; Figure S2: MATLAB .m file preparation; Figure S3: By clicking the rotation icon (red arrow) in (a), figures will pop up; then (b) copy print(fig3Dbar,'olive-adjudtsize.png','-dpng','-r150'), paste it into the command window (blue arrow), and press ‘Enter’ on the keyboard; a ‘.png’ file at 150 dpi will appear in the ‘Current Folder’ (red arrow); Figure S4: Export the heatmap cluster chart to a ‘.pdf’ file: (a) in the ‘Clustergram 1’ window, click ‘Insert Colorbar’; (b) in the ‘Clustergram 1’ window, click ‘File’, then click ‘Export Setup’; (c) click ‘Rendering’, set ‘Custom rendering’ to ‘Painters (vector format)’ (very important for high-resolution output), click ‘Apply to Figure’, then click ‘Export’; (d) enter ‘Cluster’ as the ‘File name’ and select ‘Portable Document Format (*.pdf)’ under ‘Save as type’; (e) click ‘Save’; (f) a ‘Cluster.pdf’ file will appear in the ‘Current Folder’. Then open the ‘.pdf’ file and use ‘print screen’ to capture a high-resolution figure; Figure S5: Final outcomes of the program run; Table S1: Phenolic compound data of olive pomace extract, from our previous publication.

Author Contributions

H.Z.: Conceptualization, Investigation, Methodology, Data Curation, Formal analysis, Software, Coding, Writing—Original Draft; S.C.W.: Conceptualization, Methodology, Supervision, Writing—Review and Editing, Funding Acquisition. All authors have read and agreed to the published version of the manuscript.

Funding

The authors would like to thank the California Department of Food and Agriculture, 2020 Specialty Crop Block Grant Program (20-0001-033-SF) for funding support.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All the data and MATLAB code have been provided in the Supplementary Materials.

Acknowledgments

The authors would also like to thank Qiao Hu at the School of Natural Resources, University of Nebraska-Lincoln (UNL) for consultation on the program, drawing on his excellent background in MATLAB® coding [39].

Conflicts of Interest

The authors declare that the information presented in this paper is for educational and academic research purposes, and there is no competing financial interest.

References

1. Chen, J.-T. Phytochemical Omics in Medicinal Plants. Biomolecules 2020, 10, 936.
2. Cabrita, M.J. Portuguese Olive Oil Omics for Traceability and Authenticity. Impact 2017, 2017, 76–78.
3. Carrera, M. Proteomics and Food Analysis: Principles, Techniques, and Applications. Foods 2021, 10, 2538.
4. Utpott, M.; Rodrigues, E.; de OliveiraRios, A.; Mercali, G.D.; Flôres, S.H. Metabolomics: An Analytical Technique for Food Processing Evaluation. Food Chem. 2022, 366, 130685.
5. Tang, W.; Liu, D.; Nie, S.-P. Food Glycomics in Food Science: Recent Advances and Future Perspectives. Curr. Opin. Food Sci. 2022, 46, 100850.
6. Jia, W.; Dong, X.; Shi, L.; Chu, X. Discrimination of Milk from Different Animal Species by a Foodomics Approach Based on High-Resolution Mass Spectrometry. J. Agric. Food Chem. 2020, 68, 6638–6645.
7. Wu, B.; Wei, F.; Xu, S.; Xie, Y.; Lv, X.; Chen, H.; Huang, F. Mass Spectrometry-Based Lipidomics as a Powerful Platform in Foodomics Research. Trends Food Sci. Technol. 2021, 107, 358–376.
8. Wang, S.; Zhang, Q.; Zhao, P.; Ma, Z.; Zhang, J.; Ma, W.; Wang, X. Investigating the Effect of Three Phenolic Fractions on the Volatility of Floral, Fruity, and Aged Aromas by HS-SPME-GC-MS and NMR in Model Wine. Food Chem. X 2022, 13, 100281.
9. Varunjikar, M.S.; Belghit, I.; Gjerde, J.; Palmblad, M.; Oveland, E.; Rasinger, J.D. Shotgun Proteomics Approaches for Authentication, Biological Analyses, and Allergen Detection in Feed and Food-Grade Insect Species. Food Control 2022, 137, 108888.
10. Lin, Z.; Pattathil, S.; Hahn, M.G.; Wicker, L. Blueberry Cell Wall Fractionation, Characterization and Glycome Profiling. Food Hydrocoll. 2019, 90, 385–393.
11. Yang, Y.; Wang, B.; Fu, Y.; Shi, Y.; Chen, F.; Guan, H.; Liu, L.; Zhang, C.; Zhu, P.; Liu, Y.; et al. HS-GC-IMS with PCA to Analyze Volatile Flavor Compounds across Different Production Stages of Fermented Soybean Whey Tofu. Food Chem. 2021, 346, 128880.
12. Green, H.S.; Wang, S.C. Evaluation of Proposed CODEX Purity Standards for Avocado Oil. Food Control 2023, 143, 109277.
13. Zhao, H.; Zhan, Y.; Xu, Z.; John Nduwamungu, J.; Zhou, Y.; Powers, R.; Xu, C. The Application of Machine-Learning and Raman Spectroscopy for the Rapid Detection of Edible Oils Type and Adulteration. Food Chem. 2022, 373, 131471.
14. Zhao, H.; Xie, X.; Read, P.; Loseke, B.; Gamet, S.; Li, W.; Xu, C. Biofortification with Selenium and Lithium Improves Nutraceutical Properties of Major Winery Grapes in the Midwestern United States. Int. J. Food Sci. Technol. 2021, 56, 825–837.
15. Richter, B.; Gurk, S.; Wagner, D.; Bockmayr, M.; Fischer, M. Food Authentication: Multi-Elemental Analysis of White Asparagus for Provenance Discrimination. Food Chem. 2019, 286, 475–482.
16. Zou, Y.; Gaida, M.; Franchina, F.A.; Stefanuto, P.-H.; Focant, J.-F. Distinguishing between Decaffeinated and Regular Coffee by HS-SPME-GC×GC-TOFMS, Chemometrics, and Machine Learning. Molecules 2022, 27, 1806.
17. Muguruma, Y.; Nunome, M.; Inoue, K. A Review on the Foodomics Based on Liquid Chromatography Mass Spectrometry. Chem. Pharm. Bull. 2022, 70, 12–18.
18. Freire, R.; Fernandez, L.; Mallafré-Muro, C.; Martín-Gómez, A.; Madrid-Gambin, F.; Oliveira, L.; Pardo, A.; Arce, L.; Marco, S. Full Workflows for the Analysis of Gas Chromatography-Ion Mobility Spectrometry in Foodomics: Application to the Analysis of Iberian Ham Aroma. Sensors 2021, 21, 6156.
19. Valdés, A.; Álvarez-Rivera, G.; Socas-Rodríguez, B.; Herrero, M.; Ibáñez, E.; Cifuentes, A. Foodomics: Analytical Opportunities and Challenges. Anal. Chem. 2022, 94, 366–381.
20. Jimenez-Carvelo, A.M.; Cuadros-Rodríguez, L. Data Mining/Machine Learning Methods in Foodomics. Curr. Opin. Food Sci. 2021, 37, 76–82.
21. García-Cañas, V.; Simó, C.; Herrero, M.; Ibáñez, E.; Cifuentes, A. Present and Future Challenges in Food Analysis: Foodomics. Anal. Chem. 2012, 84, 10150–10159.
22. Song, H.S.; Lee, S.H.; Ahn, S.W.; Kim, J.Y.; Rhee, J.-K.; Roh, S.W. Effects of the Main Ingredients of the Fermented Food, Kimchi, on Bacterial Composition and Metabolite Profile. Food Res. Int. 2021, 149, 110668.
23. Barea-Sepúlveda, M.; Espada-Bellido, E.; Ferreiro-González, M.; Bouziane, H.; López-Castillo, J.G.; Palma, M.; Barbero, G.F. Toxic Elements and Trace Elements in Macrolepiota Procera Mushrooms from Southern Spain and Northern Morocco. J. Food Compos. Anal. 2022, 108, 104419.
24. Cerqueira, F.; Matamoros, V.; Bayona, J.M.; Berendonk, T.U.; Elsinga, G.; Hornstra, L.M.; Piña, B. Antibiotic Resistance Gene Distribution in Agricultural Fields and Crops. A Soil-to-Food Analysis. Environ. Res. 2019, 177, 108608.
25. Zhao, H.; Avena-Bustillos, R.J.; Wang, S.C. Extraction, Purification and In Vitro Antioxidant Activity Evaluation of Phenolic Compounds in California Olive Pomace. Foods 2022, 11, 174.
26. Wickham, H.; Averick, M.; Bryan, J.; Chang, W.; McGowan, L.D.; François, R.; Grolemund, G.; Hayes, A.; Henry, L.; Hester, J. Welcome to the Tidyverse. J. Open Source Softw. 2019, 4, 1686.
27. Zhao, H.; Wang, C.S. A 3in1 Omics Data Visualization and Analytical Method - File Exchange - MATLAB Central. Available online: https://www.mathworks.com/matlabcentral/fileexchange/117370-a-3in1-omics-data-visualization-and-analytical-method (accessed on 8 September 2022).
28. 3-D Bar Graph - MATLAB Bar3. Available online: https://www.mathworks.com/help/matlab/ref/bar3.html (accessed on 3 September 2022).
29. Object Containing Hierarchical Clustering Analysis Data - MATLAB. Available online: https://www.mathworks.com/help/bioinfo/ref/clustergram.html?searchHighlight=clustergram&s_tid=srchtitle_clustergram_1 (accessed on 3 September 2022).
30. Biplot - MATLAB Biplot. Available online: https://www.mathworks.com/help/stats/biplot.html?s_tid=doc_ta (accessed on 3 September 2022).
31. Matlab - How I Obtain Bars with Function Bar3 and Different Widths for Each Bar? - Stack Overflow. Available online: https://stackoverflow.com/questions/24269516/how-i-obtain-bars-with-function-bar3-and-different-widths-for-each-bar (accessed on 3 September 2022).
32. Engle, S.; Whalen, S.; Joshi, A.; Pollard, K.S. Unboxing Cluster Heatmaps. BMC Bioinform. 2017, 18, 63.
33. Cozzolino, D.; Power, A.; Chapman, J. Interpreting and Reporting Principal Component Analysis in Food Science Analysis and Beyond. Food Anal. Methods 2019, 12, 2469–2473.
34. Buvé, C.; Saeys, W.; Rasmussen, M.A.; Neckebroeck, B.; Hendrickx, M.; Grauwet, T.; Van Loey, A. Application of Multivariate Data Analysis for Food Quality Investigations: An Example-Based Review. Food Res. Int. 2022, 151, 110878.
35. Xue, C.; He, Z.; Qin, F.; Chen, J.; Zeng, M. Effects of Amides from Pungent Spices on the Free and Protein-Bound Heterocyclic Amine Profiles of Roast Beef Patties by UPLC–MS/MS and Multivariate Statistical Analysis. Food Res. Int. 2020, 135, 109299.
36. Peng, C.; Zhu, Y.; Yan, F.; Su, Y.; Zhu, Y.; Zhang, Z.; Zuo, C.; Wu, H.; Zhang, Y.; Kan, J.; et al. The Difference of Origin and Extraction Method Significantly Affects the Intrinsic Quality of Licorice: A New Method for Quality Evaluation of Homologous Materials of Medicine and Food. Food Chem. 2021, 340, 127907.
37. Zhao, H.; Han, A.; Nduwamungu, J.J.; Nishijima, N.; Oda, Y.; Handa, A.; Zhang, Y.; Majumder, K.; Xu, C. Improving Textural Properties of Gluten-Free Veggie Sausage with Egg White Proteins. Food Bioeng. 2022.
38. Uchimiya, M. Aromaticity of Secondary Products as the Marker for Sweet Sorghum [Sorghum Bicolor (L.) Moench] Genotype and Environment Effects. J. Agric. Food Res. 2022, 9, 100338.
39. Hu, Q.; Woldt, W.; Neale, C.; Zhou, Y.; Drahota, J.; Varner, D.; Bishop, A.; LaGrange, T.; Zhang, L.; Tang, Z. Utilizing Unsupervised Learning, Multi-View Imaging, and CNN-Based Attention Facilitates Cost-Effective Wetland Mapping. Remote Sens. Environ. 2021, 267, 112757.

Figure 1. 3D heatmap bar chart of the olive pomace phenolic compound profile. (a) Figure plotted by the original MATLAB ‘bar3’ code with the default colormap; (b) figure quality improved by adding x and y axis labels and increasing the font size of the labels; (c) bottom length and width of each 3D bar adjusted according to its value so that the high-value data stand out; the height and color code of each column represent the true concentration of each phenolic, while the volume and size represent the relative and underestimated concentrations, respectively. Note: the colormap of (b,c) was rainbow by ‘jet(256)’. Please refer to Figure S3 for rotating the chart to a suitable angle for a clear visualization. WE, water extract; 70M, 70% methanol extract; 70E, 70% ethanol extract; XAD7HP resin, XAD7HP resin purified extract.

Figure 2. Different colormap comparisons of 3D heatmap bar charts. (a) Colormap ‘cool’ from blue to pink; (b) colormap ‘parula’ from blue to yellow; (c) ‘[]’ transparent.

Figure 3. Heatmap hierarchical clustering analysis; data were standardized on each row for comparison of the profile, (a) colormap ‘jet(256)’ for rainbow; (b) colormap ‘cool’ from blue to pink; (c) colormap ‘parula’ from blue to yellow; (d) colormap ‘redbluecmap’ from blue to red.

Figure 4. Principal component analysis (PCA) biplot, (a) biplot of PC1 vs. PC2, (b) biplot of PC2 vs. PC3, (c) biplot of PC1 vs. PC2 vs. PC3, and (d) explained variance; bars represent the variance of individual PCs and the line represents accumulated variances. Note: Blue stars 1–22 in (a–c) represent variable/phenolic compound names (the first row) in Table S1. Red stars are the seven samples.

Figure 5. Principal component analysis (PCA) score plot: (a) score plot of PC1 vs. PC2; (b) score plot of PC2 vs. PC3.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
