Article

Machine Learning System for Lung Neoplasms Distinguished Based on Scleral Data

1 Department of Biomedical Engineering, School of Medicine, Tsinghua University, Beijing 100084, China
2 BNRist and School of Software, Tsinghua University, Beijing 100084, China
3 Graduate School, Adamson University, Manila 1000, Philippines
4 Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China
5 Dongzhimen Hospital, Beijing University of Chinese Medicine, Beijing 100700, China
6 Emergency General Hospital, Beijing 100000, China
7 National Engineering Research Center for Beijing Biochip Technology, Beijing 102206, China
* Authors to whom correspondence should be addressed.
† These authors contributed equally to this work.
Diagnostics 2023, 13(4), 648; https://doi.org/10.3390/diagnostics13040648
Submission received: 2 January 2023 / Revised: 30 January 2023 / Accepted: 7 February 2023 / Published: 9 February 2023
(This article belongs to the Topic Diagnostic Imaging and Pathology in Cancer Research)

Abstract

Lung cancer remains the most commonly diagnosed cancer and the leading cause of death from cancer. Recent research shows that the human eye can provide useful information about one's health status, but few studies have examined whether ocular features are associated with cancer risk. The aims of this paper are to explore the association between scleral features and lung neoplasms and to develop a non-invasive artificial intelligence (AI) method for detecting lung neoplasms based on scleral images. A novel instrument was specially developed to capture reflection-free scleral images. Various algorithms and input strategies were then compared to find the most effective deep learning approach. Ultimately, a detection method based on scleral images and a multi-instance learning (MIL) model was developed to predict whether lung neoplasms are benign or malignant. From March 2017 to January 2019, 3923 subjects were recruited for the experiment. Using the pathological diagnosis of bronchoscopy as the gold standard, 95 participants underwent scleral imaging, and 950 scleral images were fed to the AI analysis. Our non-invasive AI method had an AUC of 0.897 ± 0.041 (95% CI), a sensitivity of 0.836 ± 0.048 (95% CI), and a specificity of 0.828 ± 0.095 (95% CI) for distinguishing between benign and malignant lung nodules. This study suggests that scleral features such as blood vessels may be associated with lung cancer, and that a non-invasive AI method based on scleral images can assist in lung neoplasm detection. This technique may hold promise for evaluating the risk of lung cancer in an asymptomatic population in areas with a shortage of medical resources, and as a cost-effective adjunctive tool for low-dose computed tomography (LDCT) screening at hospitals.

1. Introduction

Lung cancer is the leading cause of death from cancer worldwide (18.0% of all cancer deaths) [1]. Early detection and treatment can considerably improve 3-year survival rates [2], but over half of patients are diagnosed at an advanced stage [3]. Lung cancer is expected to remain a major public health issue for decades [4].
Low-dose computed tomography (LDCT) has been adopted as a tool for early lung cancer screening for decades. Large-scale LDCT screening has been recommended by major medical and governmental organizations for people at high risk of lung cancer [5,6,7]. Since LDCT lung cancer screening was first recommended in 2013, the United States has seen a steep decline in advanced lung cancer incidence and an increase in 5-year relative survival, as well as a more promising outlook for lung cancer at all stages [2].
For LDCT screening to detect lung cancer effectively at an early stage, it is of the utmost importance to select the population amenable to screening [8]. The US Preventive Services Task Force (USPSTF) identifies age (50 to 80 years), total cumulative exposure to tobacco smoke (at least 20 pack-years), and years since quitting smoking (no more than 15 years) as the critical criteria [6]. However, not all lung cancers are caused by active smoking. Background risks resulting from other exposures, such as cooking fumes and environmental pollution, as well as interactions between epigenetic and genetic processes, also strongly influence lung cancer occurrence [9]. For example, despite a low smoking prevalence, females in East Asia suffer from a high incidence of lung cancer [10]. Several lung cancer risk-prediction models have also been recommended for identifying high-risk individuals [11,12,13] but are considered unsuitable because they were validated in limited populations or had marginal predictive power [7].
Beyond the criteria for selecting the high-risk population, widespread adoption of LDCT screening remains challenging. LDCT relies on large, complex medical equipment and a relatively complicated procedure, which puts it out of reach at primary medical institutions. Although LDCT can find small nodules in the lung, it cannot distinguish benign from malignant nodules, which causes patient anxiety, unnecessary follow-up, and invasive diagnostic procedures in previously screened patients [14]. Therefore, there is a critical need for a more cost-effective tool for population (pre)screening, such as identifying individuals who are likely to harbor a tumor that can be detected during follow-up LDCT examinations. An adjunctive test to help evaluate malignancy potential is also urgently needed to improve the specificity of noninvasive lung cancer detection and diagnostic triage.
Moreover, the interpretation of images captured by LDCT or other medical imaging machines remains challenging [15]. Detection accuracy attained by clinical experts varies widely and leaves room for improvement [16]. Artificial intelligence (AI) is proving to be a capable aid to this challenge. Several studies have shown that AI can match or outperform human experts on a variety of medical-image analysis tasks [17,18,19,20]. Most recently, deep learning methods for lung cancer prediction have achieved high detection accuracy [21,22,23]. Some deep learning systems for lung cancer detection can even analyze an entire 3D CT scan, which provides more diagnostic information about features such as blood vessels [24].
As an internal organ that can be observed externally, the human eye can reveal possible diseases or dysfunctions of specific organs in a painless and non-invasive way [25]. The sclera, which is visible on the ocular surface, may provide useful information about one's health status [26]. Angiogenesis, the formation of new blood vessels from pre-existing ones, plays a key role in the growth of neoplasms [27]. Scleral vessels are the only human blood vessels that can be observed directly, neither covered by skin nor obscured by pigment, and dilated scleral vessels have been used to evaluate the health risk of patients suspected of having internal carotid artery occlusion [28]. Redness of the sclera may reflect abnormal vasodilation of the conjunctival blood vessels [29]. However, little work has been reported on using scleral images to assist in screening populations at high risk of lung disease [30].
Thus, we hypothesized that lung cancer is related to the blood vessel pattern on the sclera. This study aimed to develop an artificial intelligence system based on scleral images to distinguish benign from malignant lung nodules. First, we developed a novel scleral optical screening instrument to obtain scleral images in a convenient, cost-effective, and non-invasive way. Second, we preprocessed the scleral images, extracted features, and classified the data sets using deep learning methods to find correlations between lung neoplasms and scleral features. We then evaluated the performance of our AI system for lung cancer detection using scleral images from patients who had been diagnosed with lung nodules through LDCT screening. Finally, we show how this system might be integrated into screening workflows to speed the advent of population-scale lung cancer screening.

2. Materials and Methods

2.1. Participants

From March 2017 to January 2019, 3923 participants were recruited through general practitioners at the Emergency General Hospital (Beijing, China). We included adult residents showing pulmonary symptoms, such as coughing, chest pain, and dyspnea, or with at least one high-risk factor: (1) current smokers with a smoking history of at least 20 pack-years, or former smokers who had quit no more than 15 years ago; (2) a family history of lung cancer in close relatives; (3) long-term exposure to cooking oil fumes (>50 dish-years); (4) a long history of passive smoking (>2 h/day indoors for more than 10 years). Individuals who had previously been diagnosed with any kind of cancer were excluded from the study. The research protocol was approved by the Ethics Committee of the Emergency General Hospital and Tsinghua University. All enrolled subjects signed a written informed consent form before entering the study.

2.2. Screening Strategy

All participants were invited to undergo LDCT screening on a 64-detector-row CT scanner (Brilliance, Philips, Cambridge, MA, USA). Spiral CT images of the lung were captured with a slice thickness of 1 mm and reconstructed with 30% overlap between layers. Any nonsolid nodule ≥5 mm or solid nodule ≥8 mm was considered positive; this rule is sketched in code below. A total of 3821 subjects were negative on the LDCT test. Participants with positive LDCT findings were asked to undergo a scleral screening follow-up. The scleral images were taken with a specially developed optical instrument for scleral feature analysis and were pre-analyzed to determine whether they qualified for deep learning in the next stage; seven participants were excluded at this stage because of invalid scleral images. For each positive pulmonary nodule, a pathological examination was conducted by experienced clinicians at the hospital, who were blinded to this study, to confirm whether the nodule was benign or malignant. If lung cancer was diagnosed, the participant was assigned to the malignant group; otherwise, the participant was assigned to the benign group. Finally, 95 valid subjects with 950 scleral images were enrolled in the subsequent AI analysis, including 20 benign and 75 malignant subjects. The AI analysis in this study was performed at Tsinghua University (Beijing, China). Figure 1 shows an overview of the screening strategy.
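For reference, the nodule positivity rule used in this screening step can be written as a short predicate. This is a minimal sketch: the Nodule type and function name are illustrative, not part of the study's software.

```python
# Minimal sketch of the LDCT positivity rule described above:
# nonsolid nodules >= 5 mm or solid nodules >= 8 mm are considered positive.
from dataclasses import dataclass

@dataclass
class Nodule:
    diameter_mm: float
    solid: bool  # True for solid, False for nonsolid (e.g., ground-glass)

def is_positive(nodule: Nodule) -> bool:
    """Return True if the nodule meets the screening positivity criteria."""
    threshold = 8.0 if nodule.solid else 5.0
    return nodule.diameter_mm >= threshold

# Example: a 6 mm nonsolid nodule is positive; a 6 mm solid nodule is not.
assert is_positive(Nodule(6.0, solid=False))
assert not is_positive(Nodule(6.0, solid=True))
```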

2.3. Scleral Imaging Method and Instrument

In the clinic, ophthalmic examinations are usually performed with slit lamps, which are complicated to operate and time-consuming to post-process. Because the eyeball has a multilayer quasi-spherical structure, common illumination sources cast many reflection shadows onto scleral images from the interfaces between the layers of the eyeball, as shown in the left picture of Figure 2B.
To eliminate the interference caused by reflection shadows and obtain reflection-free scleral images, an adaptive reflection- and shadow-free scleral imaging method was developed, as shown in Figure 2C. S is the illumination source (a 1 W white LED), G is the cross guiding light (a 1 W green LED), the lens has a 100 mm focal length, the camera is a Canon 5D, Ψ is the angle between the optical axis and S, and Ø is the angle between the optical axis and the pupil. For the neural network-based iris auto-tracking and sclera auto-focusing method, the optimal values of Ψ and Ø were 40–50° and 65–80°, respectively, at which all reflection shadows of the illumination source converge to a small point superimposed on the pupil. The optimal values of Ψ and Ø differ from person to person. Guided by the cross light G, the user rotates the eyeball and adjusts the pupil position, providing negative feedback to the iris auto-tracking system until the reflection shadows are focused on the pupil. The auto-focused sclera can then be imaged clearly without reflection shadows, as shown in the right picture of Figure 2B.
An adaptive reflection- and shadow-free scleral imaging instrument was built according to this design. Guided by the indicator light G, subjects were instructed to rotate the eyeball to the center, up, down, left, and right positions while the sclera was photographed synchronously, so that images of the entire sclera, free of reflection shadows of the illumination source, could be obtained within 3 min, as shown in Figure 2A. The scleral images of each subject were treated together as a package of images and marked as negative or positive according to the results of the pathological examination, i.e., packages from lung cancer patients were marked as positive, and packages from subjects without lung cancer were marked as negative.

2.4. Development of AI Models

Because research and open-source datasets on machine learning and deep learning for scleral images are scarce, we built various AI models to explore the association between scleral features and lung neoplasms. Every model consists of target-area segmentation, feature extraction, and classification. In our experiment, we first preprocessed the images to segment the scleral area, then extracted image features such as blood vessels, performed feature classification, and finally used the models to decide whether a subject was benign or malignant according to the classification results of the scleral images, as sketched below.
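The shared three-stage structure can be summarized as the following skeleton, where the three stage functions are placeholders for the concrete components described in Section 3.2, not the study's actual code:

```python
# Skeleton of the three-stage pipeline shared by all models:
# segmentation -> feature extraction -> classification -> subject-level decision.
from typing import Callable, List, Sequence

def predict_subject(
    images: Sequence,                  # the scleral image package of one subject
    segment: Callable,                 # e.g., bounding box, U-net, or crop
    extract: Callable,                 # e.g., ResNet-18, autoencoder, or VGG16
    classify: Callable[[List], float], # e.g., MIL + MLP, or SVM + vote
) -> float:
    """Return a malignancy score for one subject from their image package."""
    features = [extract(segment(img)) for img in images]
    return classify(features)
```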

2.5. Statistical Analysis

Diagnostic accuracy, sensitivity, and specificity were used to quantify agreement between the clinical diagnosis and the results of our scleral-screening AI detection system. Accuracy is the proportion of cases diagnosed correctly among all participants. Sensitivity is the proportion of malignant cases correctly identified as positive, and specificity is the proportion of benign cases correctly identified as negative. Detection accuracy, sensitivity, specificity, and AUC, along with 95% confidence intervals, were calculated using SPSS software (version 18.0). Normally distributed data are presented as means ± SD. These indices are defined as follows, where TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives, respectively:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN},$$
$$\mathrm{Sensitivity} = \frac{TP}{TP + FN},$$
$$\mathrm{Specificity} = \frac{TN}{TN + FP}.$$
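For illustration, the same indices can be computed directly from confusion-matrix counts; this sketch is independent of the SPSS workflow, and the counts in the example are invented:

```python
def diagnostic_indices(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),   # true-positive rate
        "specificity": tn / (tn + fp),   # true-negative rate
    }

# Example with illustrative counts (not study data):
print(diagnostic_indices(tp=60, tn=15, fp=5, fn=15))
```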

3. Results

3.1. Characteristics of Subjects Enrolled in AI Analysis

Table 1 presents the main characteristics of the 95 subjects prospectively eligible for AI analysis, according to study group. At the time of scleral screening, the 75 subjects diagnosed with malignant lung nodules were considerably older (mean 61.9 years) than those in the benign group (mean 50.6 years). The 33 female subjects represented 34.7% of the AI analysis sample; 69.7% of them were in the malignant group, while 83.9% of male subjects were diagnosed with lung cancer. Among the malignant subjects, the most common type of lung cancer was lung squamous cell carcinoma (LUSC), at 37.3%, and the least common was small cell lung cancer (8.0%). The proportions of lung metastasis, lung adenocarcinoma (LUAD), and mixed/unspecified NSCLC were 22.7%, 20.0%, and 12.0%, respectively.

3.2. Modeling of AI Models

Figure 3 shows the structure of the three best-performing models, which we combined as the backbone algorithm of our non-invasive AI method.
In the first model, the collected scleral images were preprocessed before being fed to the network. The scleral region in each image was annotated with a bounding box, then cropped and resized into a separate image of 512 × 384 px. Ten scleral images were captured for each subject, five of the left eye and five of the right. Images of the right eye were flipped horizontally so that both eyes share the same orientation, easing the deep learning task.
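A minimal sketch of this preprocessing step, assuming a per-image bounding-box annotation is available, might look as follows (PIL is used for illustration; the helper name is hypothetical):

```python
from PIL import Image, ImageOps

TARGET_SIZE = (512, 384)  # width x height in pixels, as described above

def preprocess_scleral_image(path: str, bbox: tuple, right_eye: bool) -> Image.Image:
    """Crop the annotated scleral region, resize it, and mirror right-eye images.

    bbox is (left, upper, right, lower) in pixel coordinates.
    Right-eye images are flipped so both eyes share the same orientation.
    """
    img = Image.open(path).convert("RGB")
    img = img.crop(bbox).resize(TARGET_SIZE)
    if right_eye:
        img = ImageOps.mirror(img)  # horizontal flip
    return img
```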
To improve the model's performance, we used transfer learning, which first pre-trains a model on data from a source domain and then transfers the weights for further training on data from a target domain. The insight behind transfer learning is that the source and target domains share basic feature patterns. In our experiment, we first pre-trained ResNet-18 on ImageNet, a large image database, then fine-tuned the model on scleral images. We performed data augmentation by jittering the brightness, saturation, and contrast of the input images. To overcome the problem of class imbalance, we optimized the model with focal loss as the target function, using the Adam optimizer with an initial learning rate of 0.0001.
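A PyTorch sketch of this fine-tuning setup is shown below. The focal-loss gamma, jitter strengths, and two-class head are illustrative assumptions; only the ImageNet pre-training, the augmented attributes, and the initial Adam learning rate of 0.0001 come from the text.

```python
import torch
import torch.nn as nn
import torchvision.models as models
import torchvision.transforms as T

# ImageNet-pretrained ResNet-18 with the final layer replaced for two classes.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 2)

# Augmentation by jittering brightness, contrast, and saturation (strengths assumed).
augment = T.Compose([
    T.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
    T.ToTensor(),
])

class FocalLoss(nn.Module):
    """Focal loss: down-weights easy examples to mitigate class imbalance."""
    def __init__(self, gamma: float = 2.0):
        super().__init__()
        self.gamma = gamma

    def forward(self, logits, targets):
        ce = nn.functional.cross_entropy(logits, targets, reduction="none")
        pt = torch.exp(-ce)  # probability assigned to the true class
        return ((1 - pt) ** self.gamma * ce).mean()

criterion = FocalLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # initial LR from text
```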
We then fed all the images of each subject into the ResNet-18 network to extract a sequence of feature vectors; for each patient, different images reveal different regions of the two eyes. Rather than predicting from single images, the multi-instance learning (MIL) model makes a decision from all images of the patient. Using the MIL model, a fusion feature vector $h_{\mathrm{fusion}} = \frac{1}{N} \sum_{i=1}^{N} h_i$ is aggregated from $\{h_i\}_{i=1}^{N}$ (with $N = 10$) by average pooling. Finally, $h_{\mathrm{fusion}}$ is fed into a multi-layer perceptron (MLP) to obtain a prediction score, as shown in Figure 3A.
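The fusion and prediction step might be sketched as follows; the 512-dimensional feature size matches ResNet-18's penultimate layer, while the hidden width of the MLP head is an assumption:

```python
import torch
import torch.nn as nn

class MILAveragePooling(nn.Module):
    """Fuse per-image feature vectors h_i into h_fusion = (1/N) * sum_i h_i,
    then score the whole image package with a small MLP head."""
    def __init__(self, feat_dim: int = 512, hidden_dim: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 2),  # benign vs. malignant logits
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (N, feat_dim), one row per scleral image of the subject
        h_fusion = features.mean(dim=0)  # average pooling over the bag
        return self.mlp(h_fusion)

# Example: a bag of N = 10 ResNet-18 feature vectors for one subject.
logits = MILAveragePooling()(torch.randn(10, 512))
```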
In the second model, we used U-net to segment the scleral areas, an autoencoder, which forms the backbone of the whole model, to extract image features, and finally a support vector machine (SVM) for classification. The contracting and expanding paths comprise 29 layers in total, as shown in Figure 3B. This model was trained from scratch, with data augmentation such as flipping and rotation to improve training. The overall structure of the third model is similar to the second, except that the target-area segmentation is replaced by a crop-based segmentation algorithm and the feature extractor is replaced by VGG16, as shown in Figure 3C.
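For the second and third models, the classification stage can be sketched with scikit-learn as below; the feature dimension and the majority vote over image-level predictions are illustrative assumptions about how image-level results become a subject-level decision:

```python
import numpy as np
from sklearn.svm import SVC

# X: one feature vector per scleral image (from the autoencoder or VGG16);
# y: image-level labels inherited from the subject's pathological diagnosis.
X_train = np.random.rand(200, 256)      # placeholder features, 256-dim assumed
y_train = np.random.randint(0, 2, 200)  # placeholder labels

svm = SVC(kernel="rbf", probability=True).fit(X_train, y_train)

def subject_vote(image_features: np.ndarray) -> int:
    """Majority vote over the image-level SVM predictions for one subject."""
    preds = svm.predict(image_features)
    return int(preds.mean() >= 0.5)
```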

3.3. Performance of the Top Three AI Models

After fine-tuning, the optimal-point performance of the algorithms described above is shown in Table 2. The model based on ResNet-18 and MIL performs best overall: its accuracy reaches 0.811, about 4% higher than the second-best model and nearly 6% higher than the third in relative terms. We therefore used the first model for further testing of different input strategies.

3.4. Comparison of Different Scleral Image Input Strategies

We further compared different scleral image input strategies based on the No. 1 model, keeping all other experimental settings the same as in the previous experiments. Four input strategies were considered: all ten images; only the left eye (four images); only the right eye (four images), treating the two eyes of each subject as separate samples; and the eight scleral images other than the two center ones. The binary classification results of the four input strategies were averaged across all test sets of the thrice-repeated three-fold cross-validation procedure. We used accuracy, sensitivity, specificity, and AUC, as shown in Table 3, together with ROC curves (Figure 4), to assess performance.
After fine-tuning, the eight-image input strategy achieved a mean AUC of 0.897 (95% CI = 0.856–0.938). At the optimal point, the average accuracy, sensitivity, and specificity were 0.835 ± 0.031, 0.836 ± 0.048, and 0.828 ± 0.095, respectively. The MIL models that treated the two eyes of each subject as separate samples achieved mean AUCs of 0.864 (left eye only) and 0.850 (right eye only), nearly 4% lower than our method, while the MIL model using all ten images achieved a mean AUC of 0.867, 3% lower than our method. This is probably because the two central images show little scleral information and differ from the other images. These experiments show that using the eight scleral images other than the center ones yielded the best performance, with a high AUC of 0.897.
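The thrice-repeated three-fold cross-validation can be expressed with scikit-learn's RepeatedStratifiedKFold, splitting at the subject level so that all images of a subject stay in one fold. In this sketch, train_and_score is a placeholder for the full MIL training loop:

```python
import numpy as np
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.metrics import roc_auc_score

def evaluate(subject_labels: np.ndarray, train_and_score) -> float:
    """Average test AUC over 3 repeats of 3-fold CV, split by subject."""
    cv = RepeatedStratifiedKFold(n_splits=3, n_repeats=3, random_state=0)
    subjects = np.arange(len(subject_labels))
    aucs = []
    for train_idx, test_idx in cv.split(subjects, subject_labels):
        # train_and_score trains on the training subjects and returns
        # predicted malignancy scores for the test subjects (placeholder).
        scores = train_and_score(train_idx, test_idx)
        aucs.append(roc_auc_score(subject_labels[test_idx], scores))
    return float(np.mean(aucs))
```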

4. Discussion

Clinically, chest X-rays remain the most common and least expensive imaging modality for detecting pulmonary abnormalities related to the early stages of the disease, and computed tomography (CT) remains the standard of care for further evaluation of suspected lung cancer. However, X-ray and CT can only detect tumors of roughly 1 mm or larger, and they cannot distinguish tumors from lung infections, leading to low sensitivity and high false-positive rates. Magnetic resonance imaging (MRI) and positron emission tomography (PET) have also been deployed for the diagnosis of lung cancer, but they are expensive and out of reach at primary medical sites. Moreover, CT scans require large-scale imaging equipment and carry a risk of radiation exposure, resulting in high costs. Biomarkers can also be used for the early detection of lung cancer, but only trace levels of biomarkers exist in cancerous cells at an early stage, so the accuracy of cytology is poor.
Radiologists already employ computer-aided diagnostic tools that, in essence, direct the system to look for features related to malignancy. Researchers are also attempting to develop more efficient and rapid methods for lung cancer diagnosis, including methods to make lung-cancer screening more precise and accessible to all, especially for people in areas that lack large-scale equipment such as CT imaging systems. However, there is a long way to go before new systems become clinical mainstays, because the relationship between the features of the data and the models that depend on them must be carefully nurtured and validated.
From a point-of-care perspective, our AI method based on a non-invasive scleral screening instrument has several advantages: simplicity, remote availability, low cost, no need for reagents or a specialized laboratory, and near-real-time operation (<3 min), making it suitable for large-scale electronic health monitoring. More importantly, owing to its painless and non-invasive nature, we believe our assessment system for early-stage lung cancer may simplify physical examination processes and improve patients' medical experiences.
Many aspects of this research merit further exploration. First, all participants in our study had lung symptoms or high-risk factors for lung cancer, so the rate of malignant subjects after LDCT screening was higher than that of benign subjects, causing a class imbalance in building the AI models; we will include more healthy subjects in later studies. Second, we recruited participants at a single hospital, which may limit the transferability of the algorithm, so further external multi-center validation is warranted. Third, the non-invasive AI method can detect malignant lesions in the lungs, but determining whether it can distinguish subtypes requires more samples from patients with different subtypes of lung cancer. Furthermore, although AI applied to biomedical research is a powerful tool for analyzing deep features and connections between lung cancer and scleral images, privacy protection and improper use, with their social implications, remain concerns.

5. Conclusions

In conclusion, we developed an AI method based on a non-invasive scleral screening instrument for predicting the risk of lung cancer. The adaptive reflection- and shadow-free scleral imaging instrument captures scleral images non-invasively, conveniently and quickly collecting complete scleral images in four directions and enabling AI analysis within 3 min, without any reagent consumption or laboratory requirements. We also developed a multi-instance learning model to distinguish benign from malignant lung nodules using scleral images. The binary classification results of the MIL model achieved an average AUC of 0.897, which indicates great potential for practical early screening of lung cancer during periodic physical checkups or daily family health monitoring.
Our results propose a new concept: the analysis of scleral images using deep learning can help detect lung cancer. This effort supports a potential step towards a deep learning-based tool for pre-test lung cancer probability assessment in outpatient clinics or through telemedicine in the community, which may help evaluate the risk of lung cancer in asymptomatic populations in areas with a shortage of medical resources, or serve as a cost-effective adjunctive tool for LDCT screening at hospitals.

Author Contributions

Conceptualization, Q.H. and W.L.; methodology, Q.H., W.L. and Z.Z.; software, Q.H., W.L., Z.Z., X.L. and Z.B.; validation, Q.H., W.L. and Z.Z.; formal analysis, Q.H., W.L. and S.T.; investigation, Q.H., W.L. and Z.Z.; resources, H.W., F.X. and G.H.; data curation, S.T., R.F., X.J., X.L. and Z.B.; writing—original draft preparation, Q.H., W.L. and Z.Z.; writing—review and editing, Q.H., W.L. and Y.G.; visualization, Q.H. and W.L.; supervision, H.W., F.X. and G.H.; project administration, F.X. and G.H.; funding acquisition, F.X. and G.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key Research and Development Program of China, grant number 2018YFA0704000; the Sichuan Science and Technology Program, grant number 2021YFQ0060; the National Natural Science Foundation of China, grant numbers 61927819, 81827808, and 62105177; the Vanke Special Fund for Public Health and Health Discipline Development, Tsinghua University, grant number 2022Z82WKJ002; the Tsinghua University Spring Breeze Fund, grant number 2020Z99CFG011; the Beijing Lab Foundation; the Tsinghua Autonomous Research Foundation, grant numbers 20194180031, 20201080058, and 20201080510; and the Tsinghua Laboratory Innovation Fund, grant number 100020019.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee of the Emergency General Hospital and Tsinghua University (Beijing, China) (protocol code 20170012, date of approval: 11 December 2017).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

Download links of the original scleral images: https://cloud.tsinghua.edu.cn/f/daf5509e53fc48c6b21f/?dl=1 (accessed on 2 January 2023) or https://drive.google.com/file/d/1hQOvS8pclx4GsjVuATkTHWGEQjWcYSpD/view?usp=sharing (accessed on 29 January 2023).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Sung, H.; Ferlay, J.; Siegel, R.L.; Laversanne, M.; Soerjomataram, I.; Jemal, A.; Bray, F. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J. Clin. 2021, 71, 209–249.
2. Siegel, R.L.; Miller, K.D.; Fuchs, H.E.; Jemal, A. Cancer Statistics, 2022. CA Cancer J. Clin. 2022, 72, 7–33.
3. National Cancer Registration and Analysis Service, Public Health England (PHE). Cancer Survival in England for Patients Diagnosed between 2014 and 2018, and Followed up to 2019. Available online: https://www.gov.uk/government/statistics (accessed on 2 December 2022).
4. Leon, M.E.; Peruga, A.; Neill, A.M.; Kralikova, E.; Guha, N.; Minozzi, S.; Espina, C.; Schuz, J. European Code against Cancer, 4th Edition: Tobacco and Cancer. Cancer Epidemiol. 2015, 39 (Suppl. S1), S20–S33.
5. Oncology Committee of Chinese Medical Association. Guidelines for the Clinical Diagnosis and Treatment of Lung Cancer from the Chinese Medical Association (2022). Natl. Med. J. China 2022, 102, 1706–1740.
6. US Preventive Services Task Force; Krist, A.H.; Davidson, K.W.; Mangione, C.M.; Barry, M.J.; Cabana, M.; Caughey, A.B.; Davis, E.M.; Donahue, K.E.; Doubeni, C.A.; et al. Screening for Lung Cancer: US Preventive Services Task Force Recommendation Statement. JAMA 2021, 325, 962–970.
7. Veronesi, G.; Baldwin, D.R.; Henschke, C.I.; Ghislandi, S.; Iavicoli, S.; Oudkerk, M.; De Koning, H.J.; Shemesh, J.; Field, J.K.; Zulueta, J.J.; et al. Recommendations for Implementing Lung Cancer Screening with Low-Dose Computed Tomography in Europe. Cancers 2020, 12, 1672.
8. Tammemägi, M.C.; Church, T.R.; Hocking, W.G.; Silvestri, G.A.; Kvale, P.A.; Riley, T.L.; Commins, J.; Berg, C.D. Evaluation of the Lung Cancer Risks at Which to Screen Ever- and Never-Smokers: Screening Rules Applied to the PLCO and NLST Cohorts. PLoS Med. 2014, 11, e1001764.
9. Thun, M.J.; Hannan, L.M.; Adams-Campbell, L.L.; Boffetta, P.; Buring, J.E.; Feskanich, D.; Flanders, W.D.; Jee, S.H.; Katanoda, K.; Kolonel, L.N.; et al. Lung Cancer Occurrence in Never-Smokers: An Analysis of 13 Cohorts and 22 Cancer Registry Studies. PLoS Med. 2008, 5, e185.
10. Barta, J.A.; Powell, C.A.; Wisnivesky, J.P. Global Epidemiology of Lung Cancer. Ann. Glob. Health 2019, 85, 8.
11. Tammemägi, M.C.; Katki, H.A.; Hocking, W.G.; Church, T.R.; Caporaso, N.; Kvale, P.A.; Chaturvedi, A.K.; Silvestri, G.A.; Riley, T.L.; Commins, J.; et al. Selection Criteria for Lung-Cancer Screening. N. Engl. J. Med. 2013, 368, 728–736.
12. Wilson, D.O.; Weissfeld, J. A Simple Model for Predicting Lung Cancer Occurrence in a Lung Cancer Screening Program: The Pittsburgh Predictor. Lung Cancer 2015, 89, 31–37.
13. Muller, D.C.; Johansson, M.; Brennan, P. Lung Cancer Risk Prediction Model Incorporating Lung Function: Development and Validation in the UK Biobank Prospective Cohort Study. J. Clin. Oncol. 2017, 35, 861–869.
14. Oudkerk, M.; Liu, S.Y.; Heuvelmans, M.A.; Walter, J.E.; Field, J.K. Lung Cancer LDCT Screening and Mortality Reduction: Evidence, Pitfalls and Future Perspectives. Nat. Rev. Clin. Oncol. 2021, 18, 135–151.
15. McKinney, S.M.; Sieniek, M.; Godbole, V.; Godwin, J.; Antropova, N.; Ashrafian, H.; Back, T.; Chesus, M.; Corrado, G.S.; Darzi, A.; et al. International Evaluation of an AI System for Breast Cancer Screening. Nature 2020, 586, E19.
16. Lehman, C.D.; Wellman, R.D.; Buist, D.S.; Kerlikowske, K.; Tosteson, A.N.; Miglioretti, D.L.; Breast Cancer Surveillance Consortium. Diagnostic Accuracy of Digital Screening Mammography with and without Computer-Aided Detection. JAMA Intern. Med. 2015, 175, 1828–1837.
17. Wan, Y.-L.; Wu, P.W.; Huang, P.-C.; Tsay, P.-K.; Pan, K.-T.; Trang, N.N.; Chuang, W.-Y.; Wu, C.-Y.; Lo, S.B. The Use of Artificial Intelligence in the Differentiation of Malignant and Benign Lung Nodules on Computed Tomograms Proven by Surgical Pathology. Cancers 2020, 12, 2211.
18. Espinoza, J.L.; Dong, L.T. Artificial Intelligence Tools for Refining Lung Cancer Screening. J. Clin. Med. 2020, 9, 3860.
19. Chang, Y.-J.; Hung, K.-C.; Wang, L.-K.; Yu, C.-H.; Chen, C.-K.; Tay, H.-T.; Wang, J.-J.; Liu, C.-F. A Real-Time Artificial Intelligence-Assisted System to Predict Weaning from Ventilator Immediately after Lung Resection Surgery. Int. J. Environ. Res. Public Health 2021, 18, 2713.
20. Ardila, D.; Kiraly, A.P.; Bharadwaj, S.; Choi, B.; Reicher, J.J.; Peng, L.; Tse, D.; Etemadi, M.; Ye, W.; Corrado, G.; et al. End-to-End Lung Cancer Screening with Three-Dimensional Deep Learning on Low-Dose Chest Computed Tomography. Nat. Med. 2019, 25, 954–961.
21. Szabó, I.V.; Simon, J.; Nardocci, C.; Kardos, A.S.; Nagy, N.; Abdelrahman, R.-H.; Zsarnóczay, E.; Fejér, B.; Futácsi, B.; Müller, V.; et al. The Predictive Role of Artificial Intelligence-Based Chest CT Quantification in Patients with COVID-19 Pneumonia. Tomography 2021, 7, 697–710.
22. Lu, M.T.; Raghu, V.K.; Mayrhofer, T.; Aerts, H.; Hoffmann, U. Deep Learning Using Chest Radiographs to Identify High-Risk Smokers for Lung Cancer Screening Computed Tomography: Development and Validation of a Prediction Model. Ann. Intern. Med. 2020, 173, 704–713.
23. Gould, M.K.; Huang, B.Z.; Tammemagi, M.C.; Kinar, Y.; Shiff, R. Machine Learning for Early Lung Cancer Identification Using Routine Clinical and Laboratory Data. Am. J. Respir. Crit. Care Med. 2021, 204, 445–453.
24. Eijnatten, M.; Rundo, L.; Batenburg, K.J.; Lucka, F.; Beddowes, E.; Caldas, C.; Gallagher, F.A.; Sala, E.; Schönlieb, C.B.; Woitek, R. 3D Deformable Registration of Longitudinal Abdominopelvic CT Images Using Unsupervised Deep Learning. Comput. Methods Programs Biomed. 2021, 208, 106261.
25. Ma, L.; Zhang, D.; Li, N.M.; Cai, Y.; Zuo, W.M.; Wang, K.G. Iris-Based Medical Analysis by Geometric Deformation Features. IEEE J. Biomed. Health Inform. 2013, 17, 223–231.
26. Boote, C.; Sigal, I.A.; Grytz, R.; Hua, Y.; Nguyen, T.D.; Girard, M. Scleral Structure and Biomechanics. Prog. Retin. Eye Res. 2020, 74, 100773.
27. Folkman, J. Angiogenesis: An Organizing Principle for Drug Discovery? Nat. Rev. Drug Discov. 2007, 6, 273–286.
28. Countee, R.W.; Gnanadev, A.; Chavis, P. Dilated Episcleral Arteries: A Significant Physical Finding in Assessment of Patients with Cerebrovascular Insufficiency. Stroke 1978, 9, 42–45.
29. Murphy, P.J.; Lau, J.; Sim, M.; Woods, R.L. How Red Is a White Eye? Clinical Grading of Normal Conjunctival Hyperaemia. Eye 2007, 21, 633–638.
30. Hussain, T.; Haider, A.; Muhammad, A.M.; Agha, A.; Khan, B.; Rashid, F.; Raza, M.S.; Din, M.; Khan, M.; Ullah, S.; et al. An Iris Based Lungs Pre-Diagnostic System. In Proceedings of the 2019 2nd International Conference on Computing, Mathematics and Engineering Technologies (iCoMET), Sukkur, Pakistan, 30–31 January 2019; pp. 1–5.
Figure 1. Flow chart illustrating the inclusion and exclusion criteria applied to screening and AI analysis participants.
Figure 2. Reflection- and shadow-free scleral screen instrument and workflow of image analysis. (A) Workflow of the non-invasive AI method for detecting lung neoplasms based on scleral images. (B) Imaging the sclera directly (left) and using the novel scleral imaging instrument (right). (C) Schematic of the novel scleral imaging instrument.
Figure 3. Structure of the three best-performing AI models. (A) The first model uses a bounding box to annotate the region of the sclera, ResNet-18 to extract features, and MIL and MLP to classify. (B) The second model uses U-net to segment the scleral area, an autoencoder to extract features, and SVM with voting to classify. (C) The third model uses a traditional threshold algorithm to segment the scleral area, VGG16 to extract features, and SVM with voting to classify.
Figure 4. ROC curves of different scleral image input strategies. (A) The mean ROC curve with 95% CI and the optimal point for ten images; (B) the mean ROC curve with 95% CI and the optimal point for only the left eye (four images); (C) the mean ROC curve with 95% CI and the optimal point for only the right eye (four images); (D) the mean ROC curve with 95% CI and the optimal point for images other than the center (eight images). The dashed boxes in different colors represent different input strategies.
Table 1. Main characteristics of subjects enrolled in AI analysis.

| Characteristics | Benign Group | Malignant Group |
| --- | --- | --- |
| Age (mean, years) | 50.6 | 61.9 |
| Gender | | |
| - Female | 10 (30.3%) | 23 (69.7%) |
| - Male | 10 (16.1%) | 52 (83.9%) |
| Tumor type | | |
| - Lung squamous cell carcinoma (LUSC) | | 28 (37.3%) |
| - Lung metastasis | | 17 (22.7%) |
| - Lung adenocarcinoma (LUAD) | | 15 (20.0%) |
| - Mixed/unspecified NSCLC | | 9 (12.0%) |
| - Small cell lung cancer (SCLC) | | 6 (8.0%) |
Table 2. Performance of the top three AI models.

| Models ¹ | Accuracy | Sensitivity | Specificity |
| --- | --- | --- | --- |
| No. 1 | 0.811 | 0.813 | 0.800 |
| No. 2 | 0.779 | 0.827 | 0.600 |
| No. 3 | 0.768 | 0.827 | 0.550 |

¹ Model No. 1 uses a bounding box to annotate the region of the sclera, ResNet-18 to extract features, and MIL and MLP to classify; Model No. 2 uses U-net to segment the scleral area, an autoencoder to extract features, and SVM with voting to classify; Model No. 3 uses a crop algorithm to segment the scleral area, VGG16 to extract features, and SVM with voting to classify.
Table 3. Results of comparisons between different scleral image input strategies.

| Input Images ² | Accuracy | Sensitivity | Specificity | Average AUC |
| --- | --- | --- | --- | --- |
| Full (10) | 0.818 ± 0.043 | 0.818 ± 0.044 | 0.817 ± 0.090 | 0.867 ± 0.058 |
| Only Left Eye (4) | 0.835 ± 0.044 | 0.849 ± 0.054 | 0.786 ± 0.084 | 0.864 ± 0.063 |
| Only Right Eye (4) | 0.779 ± 0.055 | 0.778 ± 0.061 | 0.783 ± 0.051 | 0.850 ± 0.055 |
| Other Than Center (8) | 0.835 ± 0.031 | 0.836 ± 0.048 | 0.828 ± 0.095 | 0.897 ± 0.041 |

² Full eyes (ten images), only left eye (four images), only right eye (four images), and other than center (eight images); ± indicates the 95% CI.
Share and Cite

MDPI and ACS Style

Huang, Q.; Lv, W.; Zhou, Z.; Tan, S.; Lin, X.; Bo, Z.; Fu, R.; Jin, X.; Guo, Y.; Wang, H.; et al. Machine Learning System for Lung Neoplasms Distinguished Based on Scleral Data. Diagnostics 2023, 13, 648. https://doi.org/10.3390/diagnostics13040648
