Article

A Novel Machine-Learning-Based Hybrid CNN Model for Tumor Identification in Medical Image Processing

1 Department of Computer Science, Government Bikram College of Commerce, Patiala 201206, India
2 KIET Group of Institutions, Delhi NCR, Ghaziabad 110093, India
3 Department of Statistics, Chulalongkorn University, Bangkok 10100, Thailand
4 Department of Biomedical Engineering, Faculty of Engineering, Universiti Malaya, Kuala Lumpur 50603, Malaysia
5 Centre for Sport and Exercise Sciences, Universiti Malaya, Kuala Lumpur 50603, Malaysia
6 Department of Electrical Engineering, Faculty of Engineering, Universiti Malaya, Jalan Universiti, Kuala Lumpur 50603, Malaysia
7 Department of Public Health and Community Medicine, Faculty of Medicine, Tanta City 31527, Egypt
8 Department of Information Technology, Amity University, Noida 110096, India
* Authors to whom correspondence should be addressed.
Sustainability 2022, 14(3), 1447; https://doi.org/10.3390/su14031447
Submission received: 18 December 2021 / Revised: 15 January 2022 / Accepted: 19 January 2022 / Published: 27 January 2022
(This article belongs to the Special Issue Sustainable Smart Cities and Societies Using Emerging Technologies)

Abstract: The popularization of electronic clinical medical records makes it possible to use automated methods to quickly extract high-value information from medical records. As essential medical information, oncology medical events are composed of attributes that describe malignant tumors. In recent years, oncology medical event extraction has become a research hotspot in academia: many academic conferences publish it as an evaluation task and provide a series of high-quality annotated data. Targeting the discrete nature of the attributes of tumor-related medical events, this article proposes a joint-extraction method for medical events that realizes the combined extraction of the primary tumor site and the primary tumor size, as well as the extraction of tumor metastasis sites. In addition, given the small number and limited variety of annotated texts for tumor-related medical events, a pseudo-data-generation algorithm based on the global random replacement of key information is proposed, which improves the transfer learning ability of the joint-extraction method across different types of tumor-related medical event extraction. The proposed method won third place in the clinical medical event extraction and evaluation task of the CCKS2020 electronic medical record benchmark, and a large number of experiments on the CCKS2020 dataset verify its effectiveness.

1. Introduction

With the rapid popularization of electronic medical records and the advent of big medical data, natural language processing (NLP) technology in the medical field has become a current research hotspot. NLP-related technologies, such as event extraction, relationship extraction, etc., can be used as automated methods to quickly extract scientifically valuable information from clinical medical records, thereby improving the work efficiency of scientific researchers and accelerating the progress of drug research [1].
Event extraction is a primary task of NLP. Its purpose is to extract events that users are interested in from unstructured information and present them to users in a structured form. In recent years, tumor-related medical event extraction has become a research hotspot in academia; the 4th Health Information Processing Conference (CHIP2018 [2]) and the 13th and 14th National Conference on Knowledge Graph and Semantic Computing (CCKS2019 [3], CCKS2020 [4]) all use it as a heavyweight evaluation task, attracting the participation of a large number of industry personnel and providing a series of high-quality annotation data, which significantly promotes the research of medical event extraction (Juneja et al., 2021).
Tumor-related medical event extraction takes the tumor as the central entity of the medical record text, defines several attributes of the tumor-related medical event, such as tumor size and primary tumor site, and identifies and extracts these events and their details. The data released by CHIP2018, CCKS2019, and CCKS2020 define three attributes: primary tumor site, primary tumor size, and tumor metastasis site. However, these three attributes are relatively discrete; that is, they can exist relatively independently without being affected by the other attributes. For example, in medicine, any body part may become the site of tumor metastasis, regardless of the tumor’s original location, and the size of the primary tumor and the site of tumor metastasis are neither medically nor textually related. The only partial connection is that, as a measurement of the tumor at its primary location, the primary tumor size usually co-occurs at the sentence level with the primary tumor site, although this is not absolute. For the extraction of tumor-related medical events, the author of [5] proposed CCMNN in previous work, a method that realizes the extraction of the three attributes through multi-neural-network collaboration. Based on the observation that the primary tumor site and the primary tumor size co-occur at the sentence level, CCMNN uses a rule-based method to extract the primary tumor size. However, due to the arbitrariness of natural language in medical records and the irregularity of medical record writing, the actual performance of this method is not good [6,7].
In response to the problems in CCMNN, this article improves CCMNN and proposes a joint-extraction method for medical events. This method realizes the joint extraction of the primary tumor site and the primary tumor size, together with the extraction of the tumor metastasis sites. In the CCKS2020 electronic medical record-based clinical medical event extraction and evaluation task, the method obtained an F1 value of 73.52, winning third place. To verify the method’s effectiveness, the CCKS2019 and CCKS2020 medical event extraction datasets were targeted.
Furthermore, many comparative experiments between the method in this paper and CCMNN have been carried out. The experimental results show that the method in this paper achieves a significant improvement in absolute F1 value over CCMNN [8,9]. Further exploratory analysis shows that the method dramatically improves the extraction of the primary tumor size, achieving the research purpose of this article.
In addition, given the small number and limited variety of annotated medical record texts for oncology medical events, this paper proposes a pseudo-data-generation algorithm based on the global random replacement of key information [10,11,12]. The experimental results on the CCKS2020 medical event extraction dataset show that the algorithm can effectively expand the number and variety of annotated medical record texts and improve the transfer learning ability of this method across different kinds of tumor-related medical events.

2. Related Research

Similar to information extraction in general domains [13,14], medical information extraction refers to determining the boundaries of professional terms in medical texts and then classifying them based on domain information [15]. Current methods of medical information extraction mainly fall into two types: shallow machine-learning methods and deep-neural-network methods. Shallow machine-learning methods mainly include the Hidden Markov Model (HMM), Conditional Random Field (CRF), Support Vector Machine (SVM), etc. [16]. The author of [17] verified that the CRF-based Gimli method reaches an F1 value of 72.23 on the JNLPBA 2004 dataset. The author of [18] proposed a multi-feature-fusion CRF method, which can accurately identify the disease and symptom entities in medical record texts, including unregistered entity words. Shallow machine-learning methods rely, to a large extent, on the design of artificial features. To alleviate this problem, the author of [19] used a CRF model for biomedical entity recognition and added different word vectors on top of basic artificial features, reaching an F1 value of 71.39 on the JNLPBA 2004 dataset. The author of [20] used a small number of artificial features and word vectors to construct a CRF model [21,22] and added post-processing; as a result, the F1 value on the JNLPBA 2004 corpus was 71.77.
In the study of deep neural networks for medical information extraction, the author of [23] first used neural networks to generate word vectors from unlabeled biomedical texts and then built a multi-layer neural network, obtaining an F1 value of 71.01 on the JNLPBA 2004 dataset. The author of [24] used the BiLSTM model to obtain an F1 value of 88.6 on the BioCreative GM dataset and, at the same time, an F1 value of 72.76 on the JNLPBA 2004 corpus. Finally, the author of [25] proposed a neural-network model based on CNN-BiLSTM-CRF that reached the optimal F1 value on both the BioCreative II GM and JNLPBA 2004 datasets [26,27,28,29,30].
In the study of tumor-related medical event extraction, the author of [31] proposed an extraction method based on pattern matching, which achieved an F1 value of 69.7 on the CHIP2018 dataset. The author of [32] proposed an extraction method based on multi-neural-network collaboration that obtained an F1 value of 76.35 on the CCKS2019 dataset. Zhao et al. [33] proposed a scheme based on a multi-sequence labeling model and obtained 76.17 on the CCKS2019 dataset. The author of [34] proposed an ELMo-based sequence-labeling method, which, combined with rules, obtained an F1 value of 70.69 on the CCKS2019 dataset. In the latest related research, Dai et al. [35] proposed an extraction method based on RoBERTa that uses a large number of external resources to fine-tune RoBERTa and uses rules to preprocess the data; it obtained an F1 value of 76.23 on the CCKS2020 dataset. The author of [36] used the BiLSTM-CRF model to extract tumor-related medical events; the model is based on RoBERTa and uses data augmentation and rules to process the data, achieving an F1 value of 74.58 on the CCKS2020 dataset [37,38].
The research on medical information extraction closely follows the pace of general information-extraction research, but its progress lags, mainly due to the lack of large-scale, high-quality medical annotation data. In addition, current electronic medical record-based tumor-related medical event extraction methods either use a large number of rules, which significantly reduces their generalization ability [39,40], or rely heavily on pre-trained language models [41] and external resources, which increases the demand for computing resources and domain knowledge and hinders their practical application.

3. Joint-Extraction Method of Oncology Medical Events

3.1. Task Analysis

The primary tumor site, primary tumor size, and tumor metastasis site are defined as follows. The primary tumor site is the tissue or organ where a specific malignant tumor first appears; usually, there are obvious characteristic words in its context, such as “cancer”, “malignant tumor”, “MT”, “CA”, etc. The primary tumor size is a measure of the size of the primary tumor, generally given as a length, area, or volume. The tumor metastasis site is where the malignant tumor transfers from the original site to other tissues or organs.
The CCKS2020 electronic medical record-based clinical medical event extraction and evaluation task takes the transfer learning ability of a method across different types of tumor-related medical event extraction as an essential indicator. Therefore, the differences in data distribution between the provided training set and test set are mainly reflected in the type of tumor. Accordingly, this article counts the tumor type information in the training set (train) and test set (test) of the CCKS2020 medical event extraction dataset, as listed in Table 1.
It can be seen from Table 1 that train mainly contains two kinds of tumor-related medical events, lung and breast, accounting for 83.48%, of which lung-related medical events account for 62.67%. Many tumor-related medical events included in test, on the other hand, do not appear in train, such as stomach, pancreas, and uterus tumor-related medical events. In addition, there are also significant differences in the specific descriptions of tumor-related medical events that co-occur in train and test.

3.2. Design Methods

Figure 1 shows the architecture diagram of the medical event joint-extraction method proposed in this article. The method is divided into two parts: (1) the joint extraction of the primary tumor site and the primary tumor size; (2) the extraction of the tumor metastasis site.
The method in this paper first extracts the candidate words of the primary tumor site, formalizing the extraction process as named entity recognition and using the BiLSTM-CRF model for extraction.
The first layer of the BiLSTM-CRF model is the embedding layer, which maps each token contained in the medical record text to a token-embedding representation (a token refers to a character in the medical record text, a punctuation mark, an English letter, or another symbol) and finally produces the embedding representation sequence of the text. If a medical record text X contains n tokens, the embedding representation sequence of X can be expressed as X′ = (x1, x2, x3, …, xn), where xi ∈ R^d and d is the dimension of the token-embedding representation. Before entering the next layer, dropout is applied to alleviate overfitting.
The training data of the BiLSTM-CRF model adopts the BIO labeling scheme, and the data is processed into a format suitable for model training according to the manual annotation information.
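As an illustration, converting annotated character spans into BIO tags can be sketched as follows; the character-level tokenization and the `PrimarySite` label name are assumptions for this sketch, not the authors' exact preprocessing code.

```python
def to_bio(text, spans, label="PrimarySite"):
    """Convert a text plus annotated (start, end) character spans
    into per-character BIO tags: B- marks the first character of an
    entity, I- the rest, and O everything outside any entity."""
    tags = ["O"] * len(text)
    for start, end in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return list(zip(text, tags))

# Example: characters 6..20 ("left upper lobe") annotated as the site
pairs = to_bio("ca of left upper lobe", [(6, 21)])
```

The resulting (token, tag) pairs are what a sequence-labeling model such as BiLSTM-CRF consumes during training.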
The candidate words of the primary tumor site may contain multiple candidates that refer to the same body part but with different granularities, so they need to be screened. The screening follows the principle of preferring the finer-grained description: the most precise description of the body part is retained. For example, if both “lung” and “left upper lobe” are candidates, then “left upper lobe” is selected as the primary site of the tumor (Figure 2).
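The screening principle above can be sketched as follows; the `PART_OF` map is a hypothetical stand-in for whatever body-part knowledge the real system uses to decide that one candidate is a sub-part of another.

```python
# Hypothetical part-of relations; a real system would consult a
# body-part ontology or anatomical dictionary instead.
PART_OF = {"left upper lobe": "lung", "right lobe of liver": "liver"}

def screen_candidates(cands):
    """Keep only the most fine-grained description of each body part:
    drop a candidate if another candidate names a finer sub-part of it."""
    coarse = {PART_OF[c] for c in cands if c in PART_OF}
    return [c for c in cands if c not in coarse]

sites = screen_candidates(["lung", "left upper lobe"])
# retains only the finer-grained "left upper lobe"
```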
By convention, the primary tumor size is composed of numbers, length units (mm or cm), and symbols representing multiplication (“×”, “X”, etc.). This article first collects all words in the medical record text that match this form and extracts them as candidates for the primary tumor size.
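A regular expression along these lines could collect such candidates; the exact symbol set and unit list below are assumptions based on the description above, not the authors' pattern.

```python
import re

# Numbers optionally joined by multiplication signs and ending in a
# length unit, e.g. "3.5×2.0cm" (area) or "12 mm" (length).
SIZE_RE = re.compile(
    r"\d+(?:\.\d+)?"                        # first number
    r"(?:\s*[×xX*]\s*\d+(?:\.\d+)?){0,2}"   # optional ×n(×n) for area/volume
    r"\s*(?:mm|cm)"                         # length unit
)

sizes = SIZE_RE.findall("mass of 3.5×2.0cm in the left upper lobe, nodule 12 mm")
```

Because the pattern contains only non-capturing groups, `findall` returns the full matched size strings.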
Next, the primary tumor site and primary tumor size candidate words are combined to obtain candidate tumor-size relationship tuples. The principle of combination is that the primary tumor site should appear before the primary tumor size candidate in the medical record text. Due to the randomness of natural language, the case text contains many abbreviations and shorthand forms written during recording; to accommodate this, the combination is carried out for each primary tumor site candidate.
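The combination principle (a site candidate pairs only with size candidates that appear after it in the text) might be sketched as below; the (string, character-offset) tuple format is an assumption for this sketch.

```python
def combine(sites, sizes):
    """Pair each primary-site candidate (with its text offset) with every
    primary-size candidate whose mention appears after it in the text."""
    return [(site, size)
            for site, site_off in sites
            for size, size_off in sizes
            if site_off < size_off]

tuples = combine([("left upper lobe", 10)],
                 [("3.5×2.0cm", 30), ("1.2cm", 5)])
# keeps only the size mention that follows the site in the text
```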

4. Datasets and Result Evaluation

CCKS2019 released the tumor-related medical event extraction and evaluation task, with 1000 annotated tumor-related medical record texts as the training set and 400 as the test set; CCKS2020 released 1000 annotated tumor-related medical record texts as the training set and 300 as the test set. These two datasets are used to verify the effectiveness of the method in this paper. This paper uses the standard precision (P), recall (R), and micro-averaged F1 as model evaluation indicators, with the following formulas:
P = TP / (TP + FP)
R = TP / (TP + FN)
F1 = 2PR / (P + R)
Here, TP (true positives) is the number of attributes predicted positive that are actually positive; FP (false positives) is the number of attributes predicted positive that are actually negative; FN (false negatives) is the number of attributes predicted negative that are actually positive.
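In code, the three metrics follow directly from the pooled counts defined above:

```python
def micro_prf(tp, fp, fn):
    """Micro-averaged precision, recall and F1 from pooled
    true-positive, false-positive and false-negative counts."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

p, r, f1 = micro_prf(tp=80, fp=20, fn=20)  # p = r = 0.8, f1 ≈ 0.8
```

Micro-averaging pools the counts over all attribute types before computing the ratios, so frequent attributes weigh more than rare ones.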

Experimental Results

The method in this paper won third place in the CCKS2020 electronic medical record-based clinical medical event extraction and evaluation task. This section first gives the top five results of this evaluation task, as listed in Table 2.
The teams DST and TMAIL both use the RoBERTa pre-trained language model. The DST team crawled 960,000 medical texts from the Internet to fine-tune the RoBERTa model and used rules to clean the data and replace characters. The TMAIL team also used rules to perform preprocessing operations, such as data cleaning and character replacement. The methods proposed by DST and TMAIL and the method in this article all use the BiLSTM-CRF model as the main component. However, informed by the analysis of the properties of the primary tumor site and primary tumor size attributes, the method in this paper uses two BiLSTM-CRF models to extract these two attributes. Therefore, from an empirical point of view, for these two attributes and under the same conditions, our method is more targeted and yields a better extraction effect. In addition, compared with the methods proposed by DST and TMAIL, as presented in Figure 3 below, the advantages of this method are:
  • Not using the RoBERTa pre-trained language model, but using randomly initialized token-embedding representations;
  • Not using any external resources;
  • No rules are used for dataset cleaning, character replacement, or other preprocessing operations.
Because no rules are used to preprocess the data, the method in this article has better generalization capability; because no external resources or pre-trained language models are used, it requires fewer computing resources. To verify the superiority of the joint-extraction method proposed in this article, both this method and CCMNN were run on the CCKS2019 and CCKS2020 datasets: the training set was used to train each model, and each model was tested on the corresponding test set; the experimental results are listed in Table 3. For fairness, the pseudo-data-generation algorithm proposed in this article was not used in either method.
It can be seen from Table 3 that, on both datasets, the method in this paper outperforms CCMNN, which proves the effectiveness of the joint-extraction method proposed in this paper. Specifically, compared to CCMNN, the absolute F1 value of the method in this paper increases by 3.13 on the CCKS2019 dataset and by 4.14 on the CCKS2020 dataset.
In addition, it can also be seen from Figure 4 that, for both the method in this paper and CCMNN, the performance difference between the two datasets is significant, mainly for the following two reasons:
  • To evaluate the transfer learning ability of methods, the data distributions of the training set and the test set of the CCKS2020 dataset are deliberately quite different;
  • The data distributions of the CCKS2019 and CCKS2020 datasets are themselves quite different, as seen in Figure 5 and Figure 6.
To further explore the advantages of this method over CCMNN, we break the statistics down by attribute. Since this method and CCMNN use the same approach to extract tumor metastasis sites, the extraction results for that attribute are identical, so they are not shown in Table 4.
It can be seen from Table 4 that, on both datasets, the method in this paper and CCMNN achieve nearly the same performance in the extraction of the primary tumor site. However, in the extraction of the primary tumor size, the method in this paper yields a significant improvement over CCMNN in absolute F1 value: +8.93 (CCKS2019) and +7.51 (CCKS2020), as shown in Figure 7. Compared with CCMNN, which uses a rule-based method to extract the primary tumor size, the joint-extraction method proposed in this paper effectively improves the extraction performance for the primary tumor size, achieving the research purpose of this article.
To improve the transfer learning ability of the method in this paper, a pseudo-data-generation algorithm based on the global random replacement of key information is proposed. DST and TMAIL use similar pseudo-labeling algorithms; for example, TMAIL globally reorders the sentences of the medical record texts, obtaining 2800 pseudo-annotated data items.
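A minimal sketch of such a global-random-replacement generator is given below, assuming character-offset annotations and a pool of primary-site strings gathered from the whole training set; the data format and field names are assumptions, not the authors' implementation.

```python
import random

def make_pseudo(sample, site_pool, rng):
    """Build one pseudo-labeled sample by replacing each annotated
    primary-site span with a site drawn at random from the global pool,
    recomputing the character offsets of the new annotations."""
    text, spans = sample["text"], sample["spans"]
    parts, new_spans, cursor = [], [], 0
    for start, end in sorted(spans):
        parts.append(text[cursor:start])       # text before the entity
        repl = rng.choice(site_pool)           # globally sampled replacement
        pos = sum(len(p) for p in parts)       # offset in the new text
        parts.append(repl)
        new_spans.append((pos, pos + len(repl)))
        cursor = end
    parts.append(text[cursor:])                # trailing text
    return {"text": "".join(parts), "spans": new_spans}

rng = random.Random(0)  # seeded for reproducibility
sample = {"text": "ca of lung, 3.5cm", "spans": [(6, 10)]}
pseudo = make_pseudo(sample, ["stomach", "pancreas", "uterus"], rng)
```

Repeating this over the training set yields pseudo-labeled texts whose tumor types differ from the originals, which is what the transfer-learning experiments below exploit.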
To verify the effectiveness of the pseudo-data-generation algorithm proposed in this article, we conducted a series of experiments on the CCKS2020 dataset. First, the algorithm was used to generate 2000 pseudo-labeled data items; then, models were trained with different combinations of training data and evaluated on the CCKS2020 test set, as presented in Figure 8 and Figure 9, respectively.
A total of five sets of experiments were conducted. Train refers to the 1000 training data items of CCKS2020; test refers to the 300 test data items of CCKS2020; train+500, train+1000, train+1500, and train+2000 refer to train with the corresponding amount of pseudo-labeled data added. In addition, due to the randomness of the pseudo-labeled data, this paper repeats the above experimental process ten times, taking the average F1 value of these ten runs as the final F1 value. It can be seen that, when 1000 pieces of pseudo-labeled data are added to train, the method in this paper obtains an F1 value of 74.68, which surpasses the F1 value (73.52) it obtained in the CCKS2020 medical event extraction and evaluation task. We can also conclude that, as the amount of pseudo-labeled data added to the training set increases, the performance of the method increases up to a peak and then decreases; still, the pseudo-labeled data is always beneficial to the method in this paper. The reasons are:
  • The algorithm can significantly expand the number and types of medical record texts labelled, which are critical to improving model performance;
  • The pseudo-labeled samples generated by this algorithm are random and may not necessarily match real scenarios; therefore, adding too much pseudo-labeled data interferes with the model's correctness to a certain extent, which limits further performance improvement.

5. Conclusions

This article proposes a joint-extraction method for medical events, which realizes the joint extraction of two tumor event attributes, and presents a pseudo-data-generation algorithm based on the global random replacement of key information, which improves the model's transfer learning ability. The method in this paper won third place in the clinical medical event extraction and evaluation task of CCKS2020 electronic medical records. A large number of experiments on the CCKS2019 and CCKS2020 datasets show that the method's performance is greatly improved compared with the CCMNN method; in particular, the extraction of the primary tumor size is dramatically improved, and the research purpose of this article has been achieved. However, the pseudo-data-generation algorithm proposed in this article has large randomness, so the generated pseudo-data does not necessarily conform to natural semantics, damaging the model to a certain extent. Therefore, we will next study pseudo-data-generation algorithms based on semantic-similarity replacement to improve the quality of the generated pseudo-data and further improve the model's performance.

Author Contributions

Conceptualization, S.J.; methodology, G.D. and H.M.; formal analysis, W.V. and I.E.B.; investigation, K.G.; resources, M.H.; writing—original draft preparation, G.D. and H.M.; writing—review and editing, M.H. and W.V.; project administration, H.M. and M.A.I.; funding acquisition, M.A.I. All authors have read and agreed to the published version of the manuscript.

Funding

This work was financially supported by the Malaysian Ministry of Higher Education through FRGS grant FRGS/1/2020/TK0/UM/02/33. This research is also funded by the Universiti Malaya Research Grant (RU013AC-2021).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

We thank the Malaysian Ministry of Higher Education and the Universiti Malaya Research Grant for their financial support.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Hasegawa, R. Automatic Detection and Segmentation of Liver Tumors in Multi-phase CT Images by Phase Attention Mask R-CNN. In Proceedings of the 2021 IEEE International Conference on Consumer Electronics (ICCE), Las Vegas, NV, USA, 10–12 January 2021; pp. 1–5. [Google Scholar] [CrossRef]
  2. Archa, S.P.; Kumar, C.S. Segmentation of Brain Tumor in MRI Images Using CNN with Edge Detection. In Proceedings of the 2018 International Conference on Emerging Trends and Innovations in Engineering and Technological Research (ICETIETR), Ernakulam, India, 11–13 July 2018; pp. 1–4. [Google Scholar] [CrossRef]
  3. Prakash, R.M.; Kumari, R.S.S. Classification of MR Brain Images for Detection of Tumor with Transfer Learning from Pre-trained CNN Models. In Proceedings of the 2019 International Conference on Wireless Communications Signal Processing and Networking (WiSPNET), Chennai, India, 21–23 March 2019; pp. 508–511. [Google Scholar] [CrossRef]
  4. Hasegawa, R.; Iwamoto, Y.; Lin, L.; Hu, H.; Chen, Y. Automatic Segmentation of Liver Tumor in Multiphase CT Images by Mask R-CNN. In Proceedings of the 2nd Global Conference on Life Sciences and Technologies (LifeTech), Kyoto, Japan, 10–12 March 2020; pp. 231–234. [Google Scholar] [CrossRef]
  5. Someswararao, C.; Shankar, R.S.; Appaji, S.V.; Gupta, V. Brain Tumor Detection Model from MR Images using Convolutional Neural Network. In Proceedings of the International Conference on System, Computation, Automation and Networking (ICSCAN), Pondicherry, India, 3–4 July 2020; pp. 1–4. [Google Scholar] [CrossRef]
  6. Yasir, M.; Hayat, U.; Rahman, A.U.; Riaz, R. Classification and Detection of Glioblastoma Tumor from MRI Images. In Proceedings of the International Bhurban Conference on Applied Sciences and Technologies (IBCAST), Islamabad, Pakistan, 12–16 January 2021; pp. 322–327. [Google Scholar] [CrossRef]
  7. Somasundaram, S.; Gobinath, R. Current Trends on Deep Learning Models for Brain Tumor Segmentation and Detection—A Review. In Proceedings of the International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COMITCon), Faridabad, India, 14–16 February 2019; pp. 217–221. [Google Scholar] [CrossRef]
  8. Raut, G.; Raut, A.; Bhagade, J.; Gavhane, S. Deep Learning Approach for Brain Tumor Detection and Segmentation. In Proceedings of the International Conference on Convergence to Digital World—Quo Vadis (ICCDW), Mumbai, India, 18–20 February 2020; pp. 1–5. [Google Scholar] [CrossRef]
  9. Mao, Y.; Yin, Z.; Schober, J.M. Iteratively training classifiers for circulating tumor cell detection. In Proceedings of the IEEE 12th International Symposium on Biomedical Imaging (ISBI), Brooklyn, NY, USA, 16–19 April 2015; pp. 190–194. [Google Scholar] [CrossRef]
  10. Kashif, M.N.; Ahmed Raza, S.E.; Sirinukunwattana, K.; Arif, M.; Rajpoot, N. Handcrafted features with convolutional neural networks for detection of tumor cells in histology images. In Proceedings of the IEEE 13th International Symposium on Biomedical Imaging (ISBI), Prague, Czech Republic, 13–16 April 2016; pp. 1029–1032. [Google Scholar] [CrossRef]
  11. Moutushi, N.-E.-J.; Tara, K. Comparison among Supervised Classifiers for Classification of Brain Tumor. In Proceedings of the 2020 IEEE Region 10 Symposium (TENSYMP), Dhaka, Bangladesh, 5–7 June 2020; pp. 304–307. [Google Scholar] [CrossRef]
  12. Shelke, S.M.; Mohod, S.W. Automated Segmentation and Detection of Brain Tumor from MRI. In Proceedings of the International Conference on Advances in Computing, Communications and Informatics (ICACCI), Bangalore, India, 19–22 September 2018; pp. 2120–2126. [Google Scholar] [CrossRef]
  13. Abdullah, A.A.; Chize, B.S.; Nishio, Y. Implementation of an improved cellular neural network algorithm for brain tumor detection. In Proceedings of the International Conference on Biomedical Engineering (ICoBE), Penang, Malaysia, 27–28 February 2012; pp. 611–615. [Google Scholar]
  14. Hemanth, G.; Janardhan, M.; Sujihelen, L. Design and Implementing Brain Tumor Detection Using Machine Learning Approach. In Proceedings of the 3rd International Conference on Trends in Electronics and Informatics (ICOEI), Tirunelveli, India, 23–25 April 2019; pp. 1289–1294. [Google Scholar] [CrossRef]
  15. Gayanga, T.D.L.; Pathirana, G.P.S.N.; Sandanayake, T.C. Detecting and Capturing the Intensity of a Brain Tumor using MRI Images. In Proceedings of the 4th International Conference on Information Technology Research (ICITR), Moratuwa, Sri Lanka, 10–13 December 2019; pp. 1–6. [Google Scholar] [CrossRef]
  16. Choudhury, C.L.; Mahanty, C.; Kumar, R.; Mishra, B.K. Brain Tumor Detection and Classification Using Convolutional Neural Network and Deep Neural Network. In Proceedings of the International Conference on Computer Science, Engineering and Applications (ICCSEA), Gunupur, India, 13–14 March 2020; pp. 1–4. [Google Scholar] [CrossRef]
  17. Jemimma, T.A.; Vetharaj, Y.J. Watershed Algorithm based DAPP features for Brain Tumor Segmentation and Classification. In Proceedings of the International Conference on Smart Systems and Inventive Technology (ICSSIT), Tirunelveli, India, 13–14 December 2018; pp. 155–158. [Google Scholar] [CrossRef]
  18. Siar, M.; Teshnehlab, M. Brain Tumor Detection Using Deep Neural Network and Machine Learning Algorithm. In Proceedings of the 9th International Conference on Computer and Knowledge Engineering (ICCKE), Mashhad, Iran, 24–25 October 2019; pp. 363–368. [Google Scholar] [CrossRef]
  19. Ezhilarasi, R.; Varalakshmi, P. Tumor Detection in the Brain using Faster R-CNN. In Proceedings of the 2nd International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC) I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC), Palladam, India, 30–31 August 2018; pp. 388–392. [Google Scholar] [CrossRef]
  20. Vinoth, R.; Venkatesh, C. Segmentation and Detection of Tumor in MRI images Using CNN and SVM Classification. In Proceedings of the Conference on Emerging Devices and Smart Systems (ICEDSS), Tiruchengode, India, 2–3 March 2018; pp. 21–25. [Google Scholar] [CrossRef]
  21. Nair, R.; Soni, M.; Bajpai, B.; Dhiman, G.; Sagayam, K.M. Predicting the Death Rate Around the World Due to COVID-19 Using Regression Analysis. Int. J. Swarm Intell. Res. 2022, 13, 1–13. [Google Scholar] [CrossRef]
  22. Sharma, S.; Gupta, S.; Gupta, D.; Juneja, S.; Singal, G.; Dhiman, G.; Kautish, S. Recognition of Gurmukhi Handwritten City Names Using Deep Learning and Cloud Computing. Sci. Program. 2022, 2022, 5945117. [Google Scholar] [CrossRef]
  23. Zeidabadi, F.A.; Doumari, S.A.; Dehghani, M.; Montazeri, Z.; Trojovsky, P.; Dhiman, G. MLA: A New Mutated Leader Algorithm for Solving Optimization Problems. Comput. Mater. Contin. 2022, 70, 5631–5649. [Google Scholar] [CrossRef]
  24. Zeidabadi, F.A.; Doumari, S.A.; Dehghani, M.; Montazeri, Z.; Trojovsky, P.; Dhiman, G. AMBO: All Members-Based Optimizer for Solving Optimization Problems. Comput. Mater. Contin. 2022, 70, 2905–2921. [Google Scholar] [CrossRef]
  25. Alharbi, Y.; Alferaidi, A.; Yadav, K.; Dhiman, G.; Kautish, S. Denial-of-Service Attack Detection over IPv6 Network Based on KNN Algorithm. Wirel. Commun. Mob. Comput. 2021, 2021, 8000869. [Google Scholar] [CrossRef]
  26. Balakrishnan, A.; Kadiyala, R.; Dhiman, G.; Ashok, G.; Kautish, S.; Yadav, K.; Maruthi Nagendra Prasad, J. A Personalized Eccentric Cyber-Physical System Architecture for Smart Healthcare. Secur. Commun. Netw. 2021, 2021, 1747077. [Google Scholar] [CrossRef]
  27. Juneja, S.; Juneja, A.; Dhiman, G.; Jain, S.; Dhankhar, A.; Kautish, S. Computer Vision-Enabled Character Recognition of Hand Gestures for Patients with Hearing and Speaking Disability. Mob. Inform. Syst. 2021, 2021, 4912486. [Google Scholar] [CrossRef]
  28. Balakrishnan, A.; Ramana, K.; Dhiman, G.; Ashok, G.; Bhaskar, V.; Sharma, A.; Gaba, G.S.; Masud, M.; Al-Amri, J.F. Multimedia Concepts on Object Detection and Recognition with F1 Car Simulation Using Convolutional Layers. Wirel. Commun. Mob. Comput. 2021, 2021, 5543720. [Google Scholar] [CrossRef]
  29. Das, S.R.; Sahoo, A.K.; Dhiman, G.; Singh, K.K.; Singh, A. Photovoltaic integrated multilevel inverter based hybrid filter using spotted hyena optimizer. Comput. Electr. Eng. 2021, 96, 107510. [Google Scholar] [CrossRef]
  30. Dhiman, G.; Kaur, G.; Haq, M.A.; Shabaz, M. Requirements for the Optimal Design for the Metasystematic Sustainability of Digital Double-Form Systems. Math. Probl. Eng. 2021, 2021, 2423750. [Google Scholar] [CrossRef]
  31. Juneja, S.; Dhiman, G.; Kautish, S.; Viriyasitavat, W.; Yadav, K. A Perspective Roadmap for IoMT-Based Early Detection and Care of the Neural Disorder, Dementia. J. Healthc. Eng. 2021, 2021, 6712424. [Google Scholar] [CrossRef] [PubMed]
  32. Hu, Y.; Sharma, A.; Dhiman, G.; Shabaz, M. The Identification Nanoparticle Sensor Using Back Propagation Neural Network Optimized by Genetic Algorithm. J. Sens. 2021, 2021, 6712424. [Google Scholar] [CrossRef]
  33. Uppal, M.; Gupta, D.; Juneja, S.; Dhiman, G.; Kautish, S. Cloud-Based Fault Prediction Using IoT in Office Automation for Improvisation of Health of Employees. J. Healthc. Eng. 2021, 2021, 8106467. [Google Scholar] [CrossRef]
  34. Kansal, L.; Gaba, G.S.; Sharma, A.; Dhiman, G.; Baz, M.; Masud, M. Performance Analysis of WOFDM-WiMAX Integrating Diverse Wavelets for 5G Applications. Wirel. Commun. Mob. Comput. 2021, 2021, 5835806. [Google Scholar] [CrossRef]
  35. Vaishnav, P.K.; Sharma, S.; Sharma, P. Analytical review analysis for screening COVID-19 disease. Int. J. Modern Res. 2021, 1, 22–29. [Google Scholar]
  36. Chatterjee, I. Artificial intelligence and patentability: Review and discussions. Int. J. Modern Res. 2021, 1, 15–21. [Google Scholar]
  37. Kumar, R.; Dhiman, G. A comparative study of fuzzy optimization through fuzzy number. Int. J. Modern Res. 2021, 1, 1–14. [Google Scholar]
  38. Iqbal, M.S.; Ahmad, I.; Bin, L.; Khan, S.; Rodrigues, J.J.P.C. Deep learning recognition of diseased and normal cell representation. Trans. Emerg. Telecommun. Technol. 2020, 32, e4017. [Google Scholar] [CrossRef]
  39. Iqbal, M.S.; Luo, B.; Mehmood, R.; Alrige, M.A.; Alharbey, R. Mitochondrial Organelle Movement Classification (Fission and Fusion) via Convolutional Neural Network Approach. IEEE Access 2019, 7, 86570–86577. [Google Scholar] [CrossRef]
  40. Iqbal, M.S.; El-Ashram, S.; Hussain, S.; Khan, T.; Huang, S.; Mehmood, R.; Luo, B. Efficient cell classification of mitochondrial images by using deep learning. J. Opt. 2019, 48, 113–122. [Google Scholar] [CrossRef]
  41. Iqbal, M.S.; Ahmad, I.; Khan, T.; Khan, S.; Ahmad, M.; Wang, L. Recent Advances of Deep Learning in Biology. In Deep Learning for Unmanned Systems; Koubaa, A., Azar, A.T., Eds.; Studies in Computational Intelligence; Springer: Cham, Switzerland, 2021; Volume 984. [Google Scholar] [CrossRef]
Figure 1. CNN Model for Brain Tumor Detection.
Figure 2. Bidirectional LSTM for Brain Tumor Detection.
Figure 3. Medical Event Extraction on the CCKS2020 Dataset.
Figure 4. Precision over the CCKS dataset.
Figure 5. Recall over the CCKS Dataset.
Figure 6. F1 Score over the CCKS Dataset.
Figure 7. Accuracy over the CCKS Dataset.
Figure 8. F1 Value of the Proposed Model for Medical Event Attributes Extraction on the CCKS2019 Dataset.
Figure 9. F1 Value of the Proposed Model for Medical Event Attributes Extraction on the CCKS2020 Dataset.
Table 1. Statistics on the types of tumor-related medical events included in the train and test of CCKS2020.
Site (Training)    %        Site (Testing)    %
Lung               62.67    Liver             28.72
Breast             20.81    Intestine         13.18
Intestine           4.00    Stomach           12.16
Kidney              2.38    Lung               8.11
Liver               1.92    Pancreas           7.43
Esophagus           1.13    Uterus             5.41
Other               7.09    Other             24.99
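The training and testing distributions in Table 1 differ sharply (lung dominates training while liver dominates testing), which is the distribution shift that the paper's pseudo-data-generation step, randomly replacing key information across the whole domain, is meant to bridge. A minimal sketch of such a replacement step follows; the site vocabulary and the sample record are illustrative placeholders, not taken from the CCKS2020 data:

```python
import random

# Illustrative whole-domain tumor-site vocabulary (hypothetical, not the paper's list)
SITES = ["lung", "liver", "intestine", "stomach", "kidney", "pancreas"]

def make_pseudo_record(text: str, site: str, rng: random.Random) -> str:
    """Generate one pseudo-record by swapping the annotated tumor-site mention
    for another site drawn uniformly from the whole-domain vocabulary."""
    new_site = rng.choice([s for s in SITES if s != site])
    return text.replace(site, new_site)

rng = random.Random(0)
record = "Primary tumor located in the lung, size 3.2 cm."
pseudo = make_pseudo_record(record, "lung", rng)
```

Repeating this over the annotated corpus yields additional training examples for tumor types that are rare in the original annotations, which is what improves transfer to the differently distributed test set.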
Table 2. Medical event extraction on CCKS2020 dataset.
Team           F1 Score
DST [1]        76.23
TMAIL [2]      74.58
LHJB [3]       73.25
ARALOAK [4]    72.73
Table 3. Performance of the proposed method.
                     CCKS2019                             CCKS2020
Model                P       R       F1 Score  Accuracy   P       R       F1 Score  Accuracy
CCMNN                73.26   75.21   76.24     85.65      62.54   68.20   71.21     74.56
Proposed Approach    75.25   78.24   79.52     89.52      68.25   71.21   74.21     78.56
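The P, R, and F1 columns in Tables 2 and 3 follow the standard precision/recall/F1 definitions. The sketch below shows how these metrics are computed from confusion counts; the counts used are illustrative only, not values from the CCKS2020 evaluation:

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple:
    """Compute precision, recall, and F1 from true-positive,
    false-positive, and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1

# Illustrative counts only (hypothetical, not from the paper's evaluation)
p, r, f1 = precision_recall_f1(tp=75, fp=25, fn=22)
```

Note that the F1 reported per attribute (as in Table 4) is computed per extraction target and need not equal the harmonic mean of the aggregate P and R columns.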
Table 4. F1 Value for medical event attributes extraction.
                     CCKS2019             CCKS2020
Model                Primary   Tumor      Primary   Tumor
CCMNN                78.56     83.56      79.54     86.52
Proposed Approach    82.56     86.54      84.52     90.23
Dhiman, G.; Juneja, S.; Viriyasitavat, W.; Mohafez, H.; Hadizadeh, M.; Islam, M.A.; El Bayoumy, I.; Gulati, K. A Novel Machine-Learning-Based Hybrid CNN Model for Tumor Identification in Medical Image Processing. Sustainability 2022, 14, 1447. https://doi.org/10.3390/su14031447


