Natural Language Processing in Healthcare

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (15 July 2023) | Viewed by 3968

Special Issue Editors

Department of Medicine, Stanford University, Stanford, CA 94304, USA
Interests: health informatics research using electronic health records; machine learning and natural language processing

Guest Editor
Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA 94304, USA
Interests: population health; NLP; AI; epidemiology; machine learning; causal inference

Special Issue Information

Dear Colleagues,

Innovations in natural language processing (NLP), a subfield of artificial intelligence (AI), offer new ways to leverage the vast amounts of health data captured as text in electronic health records (EHRs), patient-generated emails and other sources such as social media and the biomedical literature. As a result, NLP is increasingly being implemented in healthcare, where it can help explore, evaluate and interpret large volumes of unstructured health data. While holding tremendous promise, NLP solutions also introduce distinct challenges, including access to high-quality data for model development, reproducibility, generalizability and limited model interpretability. This Special Issue aims to address these challenges in real-world NLP applications in healthcare by inviting scholarly contributions covering novel methods, best practices and evaluation frameworks.

We invite high-quality original submissions that responsibly develop and use NLP methods in healthcare. We are interested in studies that specifically focus on addressing the above challenges, rather than merely applying existing NLP methods to downstream clinical problems (such as outcome prediction or clinical cohort selection).

Topics of interest include:

  • Real-world applications of NLP in healthcare
  • NLP for capturing patient-reported outcomes
  • Healthcare decision support based on text analytics
  • Health information retrieval and extraction
  • Health-related social media analytics
  • Question-answering technologies for health applications
  • Medical terminologies and ontologies
  • Trends and challenges in health and medical NLP
  • Evaluation techniques for NLP methods in the clinical domain

Dr. Selen Bozkurt
Dr. Suzanne Tamang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • information extraction
  • information retrieval
  • natural language processing
  • AI methods in NLP
  • bias and fairness in NLP
  • biomedical informatics
  • electronic health records

Published Papers (2 papers)

Research

17 pages, 494 KiB  
Article
Classification of Severe Maternal Morbidity from Electronic Health Records Written in Spanish Using Natural Language Processing
by Ever A. Torres-Silva, Santiago Rúa, Andrés F. Giraldo-Forero, Maria C. Durango, José F. Flórez-Arango and Andrés Orozco-Duque
Appl. Sci. 2023, 13(19), 10725; https://doi.org/10.3390/app131910725 - 27 Sep 2023
Cited by 2 | Viewed by 1181
Abstract
One stepping stone toward reducing maternal mortality is to identify severe maternal morbidity (SMM) using Electronic Health Records (EHRs). We aim to develop a pipeline to represent and classify the unstructured text of maternal progress notes into eight classes according to the silver labels defined by the ICD-10 codes associated with SMM. We preprocessed the text, removing protected health information (PHI) and reducing stop words. We built different pipelines to classify SMM by combining six word-embedding schemes, three different approaches for the representation of the documents (average, clustering, and principal component analysis), and five well-known machine learning classifiers. Additionally, we implemented an algorithm for typo and misspelling adjustment based on the Levenshtein distance to the Spanish Billion Word Corpus dictionary. We analyzed 43,529 documents, constructed from an average of 4.15 progress notes each, from 22,937 patients. The pipeline with the best performance was the one that included Word2Vec, typo and spelling adjustment, document representation by PCA, and an SVM classifier. We found that it is possible to identify conditions such as miscarriage complication or hypertensive disorders from clinical notes written in Spanish, with a true positive rate higher than 0.85. This is the first approach to classify SMM from the unstructured text contained in maternal EHRs, which can contribute to the solution of one of the most important public health problems in the world. Future work should test other representation and classification approaches to detect the risk of SMM.
(This article belongs to the Special Issue Natural Language Processing in Healthcare)
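
The typo-adjustment step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a plain dynamic-programming Levenshtein distance and a nearest-word lookup. The function names, the toy vocabulary and the `max_distance` threshold are hypothetical and are not taken from the article, which matched against the Spanish Billion Word Corpus dictionary.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (two-row dynamic programme)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def correct_token(token: str, vocabulary: set[str], max_distance: int = 2) -> str:
    """Return the closest vocabulary word, or the token itself if none is close.

    max_distance is an illustrative threshold, not a value from the article.
    """
    if token in vocabulary or not vocabulary:
        return token
    # Ties on distance are broken alphabetically so the result is deterministic.
    best = min(vocabulary, key=lambda word: (levenshtein(token, word), word))
    return best if levenshtein(token, best) <= max_distance else token


# Hypothetical toy vocabulary standing in for a real dictionary.
vocabulary = {"hipertension", "aborto", "preeclampsia"}
print(correct_token("hipertenson", vocabulary))
```

A linear scan over the vocabulary is only practical for small word lists; a dictionary with millions of entries would likely need an index or a length-based pre-filter.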

12 pages, 861 KiB  
Article
A Natural Language Processing Algorithm to Improve Completeness of ECOG Performance Status in Real-World Data
by Aaron B. Cohen, Andrej Rosic, Katherine Harrison, Madeline Richey, Sheila Nemeth, Geetu Ambwani, Rebecca Miksad, Benjamin Haaland and Chengsheng Jiang
Appl. Sci. 2023, 13(10), 6209; https://doi.org/10.3390/app13106209 - 18 May 2023
Cited by 3 | Viewed by 2173
Abstract
Our goal was to develop and characterize a Natural Language Processing (NLP) algorithm to extract Eastern Cooperative Oncology Group Performance Status (ECOG PS) from unstructured electronic health record (EHR) sources to enhance observational datasets. By scanning unstructured EHR-derived documents from a real-world database, the NLP algorithm assigned ECOG PS scores to patients diagnosed with one of 21 cancer types who lacked structured ECOG PS numerical scores, anchored to the initiation of treatment lines. Manually abstracted ECOG PS scores were used as a source of truth to both develop the algorithm and evaluate accuracy, sensitivity, and positive predictive value (PPV). Algorithm performance was further characterized by investigating the prognostic value of composite ECOG PS scores in patients with advanced non-small cell lung cancer (aNSCLC) receiving first-line (1L) treatment. Of N = 480,825 patient-lines, structured ECOG PS scores were available for 290,343 (60.4%). After applying NLP-extraction, the availability increased to 73.2%. The algorithm’s overall accuracy, sensitivity, and PPV were 93% (95% CI: 92–94%), 88% (95% CI: 87–89%), and 88% (95% CI: 87–89%), respectively, across all cancer types. In a cohort of N = 51,948 aNSCLC patients receiving 1L therapy, the algorithm improved ECOG PS completeness from 61.5% to 75.6%. Stratification by ECOG PS showed worse real-world overall survival (rwOS) for patients with worse ECOG PS scores. We developed an NLP algorithm to extract ECOG PS scores from unstructured EHR documents with high accuracy, improving data completeness for EHR-derived oncology cohorts.
(This article belongs to the Special Issue Natural Language Processing in Healthcare)
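
For readers less familiar with the metrics reported in the abstract above, the following minimal sketch shows the standard binary definitions of accuracy, sensitivity and PPV, and reproduces the structured-completeness percentage from the two counts given in the abstract. The confusion-matrix counts are hypothetical, and the binary framing is an assumption: the article evaluates a multi-valued ECOG PS score and the abstract does not state how the per-class results were aggregated.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,  # share of all predictions that are correct
        "sensitivity": tp / (tp + fn),  # share of true positives that are recovered
        "ppv": tp / (tp + fp),          # share of positive predictions that are correct
    }


def completeness(available: int, total: int) -> float:
    """Percentage of records with a value present, rounded to one decimal."""
    return round(100 * available / total, 1)


# Hypothetical counts, chosen only to illustrate the formulas.
metrics = binary_metrics(tp=88, fp=12, tn=100, fn=12)
print(metrics)

# Counts taken from the abstract: 290,343 of 480,825 patient-lines
# had a structured ECOG PS score.
print(completeness(290_343, 480_825))
```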