Article

Machine Learning-Based Automatic Classification of Video Recorded Neonatal Manipulations and Associated Physiological Parameters: A Feasibility Study

1 Child Health Imprints (CHIL) Pte. Ltd., Singapore 048545, Singapore
2 Department of Pediatrics, Kyorin University, Tokyo 181-8612, Japan
3 Department of Pediatrics, University of Wisconsin School of Medicine and Public Health, Madison, WI 53726, USA
4 Department of Computer Science and Engineering, Indraprastha Institute of Information Technology, New Delhi 110020, India
5 Department of Mathematics, Indraprastha Institute of Information Technology, New Delhi 110020, India
6 College of Medicine, Ewha Womans University, Seoul 03760, Korea
7 Department of Neonatology, Sir Ganga Ram Hospital, New Delhi 110060, India
8 Department of Industrial and Systems Engineering, College of Engineering, University of Wisconsin, Madison, WI 53706, USA
9 Machine Learning and Healthcare Lab, Johns Hopkins University, 3400 N. Charles St, Baltimore, MD 21218, USA
10 Department of Pediatrics, Stanford University, Stanford, CA 94305, USA
11 Department of Neonatology, Apollo Cradle Hospitals, New Delhi 110015, India
12 Department of Pediatrics, Kalawati Hospital, Rewari 123401, India
13 Division of Neonatology, University of California, San Francisco, CA 92521, USA
* Author to whom correspondence should be addressed.
Children 2021, 8(1), 1; https://doi.org/10.3390/children8010001
Submission received: 18 November 2020 / Revised: 15 December 2020 / Accepted: 18 December 2020 / Published: 22 December 2020

Abstract: Our objective in this study was to determine whether machine learning (ML) can automatically recognize neonatal manipulations, along with associated changes in physiological parameters. A retrospective observational study was carried out in two Neonatal Intensive Care Units (NICUs) between December 2019 and April 2020. Both video and physiological data (heart rate (HR) and oxygen saturation (SpO2)) were captured during NICU hospitalization. The proposed classification of neonatal manipulations was achieved by a deep learning system consisting of an Inception-v3 convolutional neural network (CNN), adapted via transfer learning, followed by Long Short-Term Memory (LSTM) layers. Physiological signals prior to manipulations (baseline) were compared with those during and after manipulations. The system was validated using a leave-one-out strategy with 8 s of video exhibiting manipulation activity as input. Ten neonates were video recorded over an average length of stay of 24.5 days. Each neonate underwent an average of 528 manipulations during their NICU hospitalization, with average manipulation durations of 28.9 s for patting, 45.5 s for a diaper change, and 108.9 s for tube feeding. The accuracy of the system was 95% for the training dataset and 85% for the validation dataset. In neonates <32 weeks’ gestation, diaper changes were associated with significant changes in HR and SpO2, and, for neonates ≥32 weeks’ gestation, patting and tube feeding were associated with significant changes in HR. The presented system can classify and document the manipulations with high accuracy. Moreover, the study suggests that manipulations impact physiological parameters.

Graphical Abstract

1. Introduction

Worldwide, of the 150 million annual births, 15 million are preterm neonates [1]. Advances in neonatal care have improved clinical outcomes [2,3,4]. Digitization of Neonatal Intensive Care Unit (NICU) workflow using the electronic medical record (EMR) and medical device physiological data has enhanced integration and made data available in an analyzable format [5]. In theory, automated data entry and monitoring reduce the clinical staff’s manual workload and allow more time to focus on patient care [6]. Improved NICU digital infrastructure has resulted in the generation of multi-modal temporal databases, such as the Medical Information Mart for Intensive Care (MIMIC), which encapsulate the integrated “big data” of the ICU environment [7,8].
Retrospective studies have demonstrated the relationship of physiological signal variations with the onset of diseases [9,10,11]. A loss of heart rate variability (HRV) in the early hours after birth is associated with high morbidity in newborns [12]; Heart Rate Observation (HeRO) monitoring has demonstrated subtle irregularities in HRV as an early indicator of disease [13]. Furthermore, studies have compared physiological signals immediately before, during, and after procedures/manipulations performed on neonates [14,15,16]. These manipulations include invasive procedures, such as intubation and peripheral intravenous line insertion, and common non-invasive handling of neonates for care, such as patting, diaper changes, and feeding. Changes in physiological parameters, such as HR and SpO2, are established indicators of how well these manipulations are performed [15].
During the NICU stay, a neonate undergoes an average of 768 handling manipulations and 1341 invasive procedures, with manipulations accounting for 2 h and 26 min of every 24 h [17]. Variation in physiological parameters during manipulations and procedures may be associated with negative health consequences. The fast-paced, stressful NICU environment may adversely affect how manipulations are performed, which may not be captured in procedure documentation in an EMR [18]. To overcome manual documentation limitations, recent studies have captured neonatal video streams by positioning a camera above the neonate’s crib [19,20,21]. Along with the video streams, physiological data related to these manipulations can be captured simultaneously. Common non-invasive manipulations performed on neonates during NICU hospitalization include patting, diaper change, and feeding. Decreases in SpO2 and bradycardia (less than 80 beats per minute) have been demonstrated before, during, and after diaper changes [21]. Comforting behaviors by nurses, such as patting, rubbing, holding, and stroking, have also been studied using videotape analysis and were found to be irregular and associated with prolonged periods of neonatal distress [22]. Similar videotaped studies have found a lack of cue-based or infant-driven feeding methods in the NICU [23,24].
Currently available public integrated ICU databases, such as MIMIC, do not store video data of neonatal manipulations with synchronized physiological data. Appendix A, Table A1 outlines a detailed literature review of studies using video data along with physiological signal variations. There is a need to acquire continuous, collated video data of manipulations and associated physiological parameters over the entire NICU stay to better assess how manipulations affect neonatal care outcomes. This data acquisition approach needs to address two essential design requirements. The first is millisecond-resolution synchronization of captured video frames with physiological data from medical devices, which enables analysis of manipulations in relation to medical events such as apnea, desaturation, and bradycardia. The second is automatic documentation of the salient features of each manipulation in the patient’s EMR. Decreasing documentation demands through automatic monitoring and data tagging may improve nursing workflow and well-being.
This study describes the acquisition and synchronization of video and physiological data in the NICU environment. We then present a machine learning (ML)-based automated tagging framework for three common neonatal manipulations: patting, diaper change, and tube feeding. Lastly, we demonstrate the value of synchronized video and physiological data by describing variations in physiological parameters associated with the identified manipulations.

2. Materials and Methods

This section describes the methodology of acquiring, synchronizing, and analyzing neonatal NICU data captured with respect to manipulations.

2.1. Setting and Study Sample

Digital data were collected from a sample of neonates admitted to two NICUs over a three-month (April 2020–June 2020) duration. The study sites included 22 urban beds and 17 rural beds; both were level III NICUs in India. The urban NICU is staffed by three neonatologists with a doctorate in neonatal sciences, three residents, and 20 nurses. The rural NICU is staffed by three neonatologists with a doctorate in neonatal sciences, four residents, and 18 nurses. The Institutional Review Boards of both NICUs approved the study with a waiver of informed consent. All electronic health records were de-identified (in accordance with the Health Insurance Portability and Accountability Act (HIPAA)), and all research was performed according to relevant guidelines. Prior to the study, written consent for video monitoring and physiological data acquisition was obtained from the parents of eligible neonates at both study sites. All data were stored in de-identified form in a protected health information environment in accordance with HIPAA. Hemodynamically stable neonates who stayed in the NICU for more than 24 h and did not have assisted ventilation were eligible. Neonates with congenital anomalies or on palliative care were excluded.

2.2. Data Collection

A sample of 10 neonates was recruited for this study. De-identified individual patient admission-to-discharge data were electronically recorded using the iNICU platform [25]. This study was purely observational, and at no point were clinical decisions or interventions affected by study data. Data were entered at the bedside on an iPad Pro (12.9 inch, 2nd generation) using the Chrome browser and stored in a PostgreSQL database. The clinical diagnoses of each neonate were determined by consulting neonatologists using the International Classification of Diseases (ICD), ninth revision, during daily rounds (morning, afternoon, and evening) performed at the patient bedside.

2.3. Video Acquisition of Manipulation

During the study, the physiological data of neonates were collected using the NEO device [26]. The NEO system was enhanced with an additional camera module, and its size was further reduced (Appendix B, Section B: NEO TINY system). Figure 1 shows the setup in a typical NICU setting. The wall mount was installed at the same height as the top of the baby warmer to minimize interference with the routine NICU workflow (Appendix B, Figure A1). The installed wall mount could be adjusted at the discretion of onsite clinicians. A Logitech C920 Universal Serial Bus (USB) camera was installed facing the neonate. All the units’ beds were handled in the same way, and all beds were equipped with cameras. The camera videos had a resolution of 1280 × 720 pixels and were recorded at 30 frames per second.
Video recording was continuous for most of each neonate’s NICU stay, but parents or clinical staff could switch off the recording when the neonate was removed from the bed, such as during weight measurement, kangaroo care, or breastfeeding, for privacy reasons. Thus, intermittent video data segments of each neonate were available for further analysis.

2.4. Physiological Parameters of Manipulation

Along with live video recording, real-time physiological data were simultaneously captured from the patient monitors (Appendix B, Section D). Not all of the monitors were able to record respiratory rate (RR); hence, this parameter was not used in the analysis. Heart rate (HR) and oxygen saturation (SpO2) were continuously recorded before, during, and after the manipulations.

2.5. Selection of Manipulations to Be Studied

Video data were annotated manually with clinicians’ help, and a spreadsheet was maintained for the ground truth labels of the manipulations. The overall system architecture is presented in the flow diagram shown in Figure 2. Appendix B describes (A) the hardware, data acquisition, and synchronization of video and physiological data and (B) the software specifications. Appendix C describes the clinical staff interface showing an annotated video frame with physiological signals, missing data in the NICU environment, and data security. For the current feasibility study, we chose three common non-invasive manipulations: (i) patting, (ii) diaper change, and (iii) tube feeding (definitions in Table 1). These manipulations were selected post hoc.

2.6. Input Data, Training, and Validation Data Set

Examples of video-captured patting, diaper change, and tube feeding manipulations are shown in Figure 3. Acquired video sequences were down-sampled to 15 frames per second (fps) to reduce redundant computation, and frames were resized from the original 1280 × 720 pixels to 720 × 480 pixel color images. Manipulations were initially divided based on category and neonatal identifier. Based on discussions with the clinical team, it was hypothesized that 8 s of video data were sufficient to distinguish between the different types of neonatal manipulation. Therefore, for each manipulation, data were processed in 8-s windows of 120 frames each. The video clip corresponding to a manipulation was extracted manually and treated as a training sequence; the next video sequence was then extracted by sliding the window programmatically by 1 s to build the next 8-s subset. Although only the first 8 s were used for classifying the type of manipulation, all frames were used for activity recognition. This process was repeated for the entire duration of the captured video of each manipulation. Appendix C explains how the clinical team visualized the video and physiological data.
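As an illustration of this preprocessing step, the following sketch (assuming OpenCV and NumPy are available; the function name and constants are ours, not taken from the study code) extracts 8-s, 120-frame windows from a 30-fps recording with a 1-s sliding stride:

import cv2
import numpy as np

WINDOW_SEC, STRIDE_SEC, TARGET_FPS = 8, 1, 15   # values reported in the study
TARGET_SIZE = (720, 480)                        # (width, height) after resizing

def extract_windows(video_path):
    """Yield 8-s clips (120 frames at 15 fps) using a 1-s sliding stride."""
    cap = cv2.VideoCapture(video_path)
    src_fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
    step = max(int(round(src_fps / TARGET_FPS)), 1)   # keep every 2nd frame for 30 -> 15 fps
    frames, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            frames.append(cv2.resize(frame, TARGET_SIZE))
        idx += 1
    cap.release()
    win, stride = WINDOW_SEC * TARGET_FPS, STRIDE_SEC * TARGET_FPS
    for start in range(0, len(frames) - win + 1, stride):
        yield np.stack(frames[start:start + win])      # array of shape (120, 480, 720, 3)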

2.7. Classification of Manipulation Using Convolutional Neural Network (CNN)

Image classification has matured to the point that facial recognition is a standard feature of consumer phones, and models pre-trained on large existing datasets are readily available and can be adapted to detect domain-specific images. In the current study (Figure 4), an existing pre-trained Inception-v3 CNN model [30] was used with ImageNet weights for colored Red Green Blue (RGB) images. The CNN-based model was then further improved through transfer learning [31], wherein the output of a pre-trained model (such as Inception-v3) is trained for the specific task at hand. In our study, the task was to recognize neonatal manipulations, for which no established image databases currently exist. We conducted the transfer learning process by training on our annotated images marked as (i) patting, (ii) diaper change, and (iii) tube feeding. This step improves the accuracy of the manipulation-tagging model.
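A minimal Keras sketch of this transfer learning step is shown below. The ImageNet-pretrained Inception-v3 backbone and the three-class softmax output follow the description in the text, while the dense layer width, the frozen backbone, and the compile settings are illustrative assumptions rather than the study’s exact configuration:

from tensorflow.keras.applications import InceptionV3
from tensorflow.keras import layers, models

# Pre-trained Inception-v3 backbone with ImageNet weights; the original ImageNet
# classifier head is dropped and replaced by a small head for the three manipulations.
base = InceptionV3(weights="imagenet", include_top=False, pooling="avg",
                   input_shape=(226, 226, 3))   # frame size used in the paper
base.trainable = False                          # keep the ImageNet features frozen

frame_classifier = models.Sequential([
    base,
    layers.Dense(256, activation="relu"),       # head width is an assumption
    layers.Dense(3, activation="softmax"),      # patting / diaper change / tube feeding
])
frame_classifier.compile(optimizer="adam",
                         loss="categorical_crossentropy",
                         metrics=["accuracy"])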
The performance of the Inception-v3 CNN model with the transfer learning layer was also visualized using t-Distributed Stochastic Neighbor Embedding (t-SNE) plots [31,32], which take perplexity as a user-specified input parameter. Perplexity corresponds to the effective number of neighbors considered for obtaining the embeddings and has been shown to be robust over the range of 5–50 [33]. We chose a perplexity of 35, which gave the clearest segregation of the neonatal manipulations. The individual video frames were resized to 226 × 226 pixels as per the Inception-v3 specifications.
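The visualization can be reproduced along the following lines with scikit-learn and matplotlib; the feature and label files are hypothetical placeholders for the 2048-dimensional Inception-v3 outputs and their manipulation labels:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# Placeholder inputs: an (N, 2048) array of Inception-v3 features and an (N,) array
# of class labels (0 = patting, 1 = diaper change, 2 = tube feeding).
features = np.load("frame_features.npy")   # hypothetical file name
labels = np.load("frame_labels.npy")       # hypothetical file name

embedding = TSNE(n_components=2, perplexity=35, random_state=0).fit_transform(features)

for cls, name in enumerate(["patting", "diaper change", "tube feeding"]):
    pts = embedding[labels == cls]
    plt.scatter(pts[:, 0], pts[:, 1], s=4, label=name)
plt.legend()
plt.title("t-SNE of Inception-v3 features (perplexity = 35)")
plt.show()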

2.8. Activity Recognition Combining CNN Output with LSTM

From a computer vision perspective, a neonatal manipulation, such as a diaper change, is a collection of image frames over time that encapsulates the activity. Therefore, we wrapped the pre-trained CNN model in a time-distributed layer so that a manipulation is represented as a sequence of images. The time-distributed CNN model generates a 2048-dimensional feature vector for each frame. These vectors convey information about constituent objects, such as the neonate, the clinical staff, diapers, syringes, and plungers, their spatial attributes, and how they relate during the manipulations; in the current deep learning landscape, it is not feasible to visualize these vectors in a human-readable format.
CNN models are very accurate at classifying individual images, whereas recurrent deep learning architectures, such as Long Short-Term Memory (LSTM) networks, are suited to recognizing activities that unfold over time. After training the combined CNN and LSTM, the system can automatically classify the neonatal manipulations.
We used the weights of the pre-trained CNN (Inception-v3) model to extract image features and combined them with LSTM layers to perform activity recognition. The sequence of 2048-dimensional feature vectors output by the Inception-v3 model, representing the activity in a manipulation, was input to the LSTM model. The LSTM layers were followed by additional dense layers and a final three-class softmax layer. An early stopping criterion with a patience of 8 was employed; it monitors the validation loss and stops training when the loss fails to improve for eight successive epochs. The model was implemented in Keras [34] and TensorFlow [35] and used the categorical cross-entropy loss function and the Adam optimizer. The EarlyStopping callback was used to stop training at the epoch when the monitored metric stopped improving [36].
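A compact Keras sketch of this sequence model is given below. The 120 × 2048 input shape, three-class softmax output, loss, optimizer, and early stopping patience follow the description above, while the LSTM and dense layer widths and the commented training call are illustrative assumptions:

from tensorflow.keras import layers, models, callbacks

SEQ_LEN, FEAT_DIM = 120, 2048   # 8 s at 15 fps; Inception-v3 feature size

model = models.Sequential([
    layers.Input(shape=(SEQ_LEN, FEAT_DIM)),
    layers.LSTM(256),                        # LSTM width is an assumption
    layers.Dense(128, activation="relu"),    # additional dense layer
    layers.Dense(3, activation="softmax"),   # patting / diaper change / tube feeding
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

# Stop training when the validation loss has not improved for 8 successive epochs.
early_stop = callbacks.EarlyStopping(monitor="val_loss", patience=8,
                                     restore_best_weights=True)

# X_train/y_train and X_val/y_val are hypothetical arrays of feature sequences and
# one-hot labels prepared as described in Section 2.6:
# model.fit(X_train, y_train, validation_data=(X_val, y_val),
#           epochs=100, batch_size=8, callbacks=[early_stop])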

2.9. Variation in Physiological Signals Associated with Manipulation

The variations in physiological parameters during manipulations were compared with baseline values (defined as the 5 min before the manipulation) and post-manipulation values (defined as the 5 min after the manipulation).
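Assuming the synchronized vitals are available as a timestamp-indexed pandas DataFrame, this windowing can be sketched as follows (the function and column names are illustrative):

import pandas as pd

def manipulation_windows(vitals, start, end, margin="5min"):
    """Split a vitals time series (DataFrame indexed by timestamp, with columns
    such as 'HR' and 'SpO2') into baseline, during, and post segments around a
    manipulation that runs from `start` to `end`."""
    margin = pd.Timedelta(margin)
    return {
        "baseline": vitals.loc[start - margin:start],   # 5 min before the manipulation
        "during": vitals.loc[start:end],                # the manipulation itself
        "post": vitals.loc[end:end + margin],           # 5 min after the manipulation
    }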

2.10. Performance Metrics

We measured the performance of the CNN/LSTM model in the classification of neonatal manipulations using Positive Predictive Value (PPV) (Equation (1)), Sensitivity (Equation (2)), and F-measure (Equation (3)), which are defined as:
PPV = TP / (TP + FP)  (1)
Sensitivity = TP / (TP + FN)  (2)
F-measure = 2 × (PPV × Sensitivity) / (PPV + Sensitivity)  (3)
where TP is a true positive (a patting, diaper change, or tube feeding manipulation detected correctly), FP is a false positive (the system detects a manipulation when there is none), and FN is a false negative (a manipulation occurs that the system does not detect). For normally distributed data, a two-sided paired t-test with a significance level of 0.05 was used to compare physiological parameters during and after manipulations with baseline, based on our assumption that physiological values may increase or decrease during and after manipulations relative to the baseline data.
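These metrics and the paired comparison can be computed directly, for example with NumPy and SciPy; the short HR arrays below are illustrative values, not study data:

import numpy as np
from scipy import stats

def ppv(tp, fp):
    return tp / (tp + fp)

def sensitivity(tp, fn):
    return tp / (tp + fn)

def f_measure(tp, fp, fn):
    p, s = ppv(tp, fp), sensitivity(tp, fn)
    return 2 * p * s / (p + s)

# Paired, two-sided comparison of a physiological parameter (e.g., mean HR per
# manipulation) between the baseline and during-manipulation periods.
baseline_hr = np.array([142.0, 150.0, 138.0, 145.0])   # illustrative values
during_hr = np.array([155.0, 158.0, 141.0, 152.0])     # illustrative values
t_stat, p_value = stats.ttest_rel(during_hr, baseline_hr)
significant = p_value < 0.05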

2.11. Overall Activity Detection Model Evaluation

The model was evaluated using leave-one-out cross-validation (LOOCV) with PPV and sensitivity as metrics. In the NICUs involved in the current study, nurses did not document routine care activities, such as diaper changes and patting, in the EMR system. The comparison between documented nursing records of tube feeding and the automated tube feeding notes highlights the additional temporal data captured by the machine learning-based automated classification system. The tube feeding duration and the time since the last tube feeding were not captured in the existing EMR records.
Based on a visual investigation of the data with the clinical team, the spatial and temporal features of each manipulation were documented (Table 1) to inform the classification task.
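For the LOOCV evaluation described above, folds were defined per neonate rather than per clip; this grouping can be illustrated with scikit-learn’s LeaveOneGroupOut, where the arrays below are small random stand-ins for the real feature sequences, labels, and neonate identifiers:

import numpy as np
from sklearn.model_selection import LeaveOneGroupOut

rng = np.random.default_rng(0)
X = rng.random((30, 120, 2048), dtype=np.float32)   # 30 stand-in 8-s feature sequences
y = rng.integers(0, 3, size=30)                     # 0 = patting, 1 = diaper change, 2 = tube feeding
neonate_ids = rng.integers(1, 11, size=30)          # which of the 10 neonates each window came from

for fold, (train_idx, test_idx) in enumerate(LeaveOneGroupOut().split(X, y, groups=neonate_ids)):
    held_out = np.unique(neonate_ids[test_idx])
    print(f"Fold {fold}: train on {len(train_idx)} windows, validate on neonate {held_out[0]}")
    # Train the CNN/LSTM model on X[train_idx], then compute PPV and sensitivity on X[test_idx].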

3. Results

The results of the feasibility study conducted to verify the design of the automated manipulation-tagging system are presented below.

3.1. Baseline Data

Ten neonates admitted to the NICUs were enrolled from December 2019 to April 2020. The baseline characteristics of the study subjects are displayed in Table 2. The mean gestational age was 34.7 weeks (range, 26 to 40 weeks), and the mean birth weight was 1893.8 g (range, 800 to 3231 g).

3.2. Distribution of Manipulations

Table 3 shows the average durations of patting, diaper changes, and tube feedings. A total of 64 diaper changes (average duration, 45.5 s), 108 tube feedings (average duration, 108.9 s), and 167 patting episodes (average duration, 28.9 s) were recorded and utilized for analysis.

3.3. CNN Based Classification of Manipulations

The 2048 features generated from manipulation images were plotted using t-SNE visualization (a) without transfer learning, that is, without knowledge of the current domain, and (b) with transfer learning. Without transfer learning (Figure 5a), the ImageNet-based Inception-v3 pre-trained model cannot separate the neonatal manipulations. However, after transfer learning, except for a few outliers, the transfer-learning-based Inception-v3 model separates the images of patting, diaper change, and tube feeding successfully (Figure 5b).
The accuracy of the CNN-based model in classifying individual manipulation frames/images is displayed in Figure 6. The validation accuracy (red) was reached after eight epochs.

3.4. LSTM Based Classification of Manipulation Videos

The validation of automatic video classification was done in clinical settings, and the accuracy was 85% on the validation dataset. The comparison of NEO TINY System (NTS) data with nurse-documented procedures is shown in Table 4. The 2048 features from the Inception-v3 model were generated for all frames present over the duration of each manipulation video.
The performance of the deep learning model is presented in Table 5. The model automatically annotates the manipulations of a given neonate. Figure 7 demonstrates the different manipulations classified by the CNN/LSTM model during the validation phase.

3.5. Physiological Signal Variations during Manipulations

Figure 8a–c show variations in physiological parameters during the patting, diaper change, and tube feeding manipulations, respectively. Normalized heart rate increased from the baseline to the manipulation period for neonates <32 weeks’ gestation (shown in blue) for all three manipulations.
Table 6 shows the HR and SpO2 physiological variables for each of the three manipulations. The significant changes (p < 0.05) are:
(I) For <32 weeks: (a) HR increased during diaper changes and decreased afterward; (b) SpO2 increased during the diaper change.
(II) For ≥32 weeks: (a) HR increased during patting and decreased afterward; (b) HR decreased after tube feeding.

4. Discussion

The NICU environment is highly complex, with critically ill neonates who require multiple medical devices, such as patient monitors, ventilators, syringe pumps, and infusion pumps. These devices leave minimal working space for movement around the bedside. Therefore, a pocket-sized data aggregator, the NTS, was developed to capture valuable data with a small footprint; its pocket-sized design (5.8 cm × 4.1 cm × 7.7 cm) is ideal for cluttered workspaces and roaming device workflows. For video monitoring, the camera was wall-mounted above the neonate’s bed to avoid interfering with routine workflow in the NICU. The NTS client device synchronizes the acquired medical device and video data and sends them to the EMR platform. The platform displays the live video feed of a neonate, along with all the acquired vital parameter data, for clinical interpretation.
The framework presented in this study can enable automatic identification of manipulations, generate corresponding EMR documentation of those manipulations, and measure the associated changes in physiological parameters. The study demonstrates a machine learning model to classify three common neonatal care manipulations: (a) patting, (b) diaper change, and (c) tube feeding. It is important to highlight that transfer learning for classifying manipulations such as tube feeding will depend strongly on local practices, such as syringe use, the position of the tube’s end, and even the use of gloves (and their colors). The authors anticipate that NICUs in a given geographical region, or those associated with similar neonatal research networks, can develop a unique dataset of images reflecting their practices. This dataset can then be readily used as a ‘training’ module for the system for that group of NICU units.
In this study, the model was able to classify the manipulations with 95% accuracy on the training dataset (accompanying loss of 0.0026) and 85% on the validation dataset (accompanying loss of 0.0409). Physiological parameters during the manipulations were compared with those captured before and after the manipulation. In neonates <32 weeks’ gestation, diaper changes were associated with significant changes in HR and SpO2 (perhaps due to crying and the subsequent increase in minute ventilation), whereas, for neonates ≥32 weeks’ gestation, patting and tube feeding were associated with significant changes in HR. The health impact of these vital sign changes associated with routine care practices is unclear. The ability to detect continuous changes in physiological parameters associated with machine learning-driven monitoring of common neonatal manipulations in the NICU illustrates the capability of the NTS model, which could be used for further analysis of how neonatal manipulations and procedures affect short- and long-term outcomes.
Most NICUs have strict light and sound control protocols, both in the larger NICU environment and in the local environment of each neonate. In the current study, open incubators were used for most of the neonates. The lights in the NICU were recommended to be dim most of the time. We did not find any difference in the automatic classification of manipulations under different light conditions; however, this finding needs to be confirmed with a larger sample size. In future studies, the feasibility of a night vision mode in these cameras needs to be explored for poor light conditions. Moreover, the advent of 3-D cameras allows manipulation-specific data to be captured, which will also be explored in future efforts. With the emergence of artificial intelligence, it is anticipated that continuous monitoring and analysis will help avoid unnecessary manipulations that may cause negative neuro-sensorial stimuli to premature and sick neonates. If specific neonatal manipulations and procedures are associated with worse outcomes, future research using the NTS model could assess how modifying routine care practices to target vital sign ranges could improve outcomes.

5. Limitations

While the presented study shows promise for future NICU neonatal monitoring applications, certain limitations need to be considered. As a pilot study to assess the feasibility of the system, only a small number of patients were recruited. Future studies will need to assess potential differences regarding gender, gestational age groups, and other demographic parameters. A larger cohort of neonates will need to be recruited to build a physiological database that provides more balanced data for machine learning models to simulate the NICU environment. The presented approach utilized labeled data for only three manipulations across ten neonates; the recognition capabilities of the deep learning model can be explored further by including data on more manipulations and more neonates (e.g., some neonatal manipulations or procedures, such as heel prick, last only a few seconds). In the current study, the monitors did not capture per-second data; hence, the study lacks the temporal resolution of physiological data required for a detailed analysis of brief manipulations or procedures. The study did not consider the medications that neonates were receiving during their stay in the NICU; since sedatives and analgesics can potentially affect the stress experienced by neonates [37], future studies should consider individual patient drug dosages and half-lives.

6. Conclusions and Future Directions

The present study demonstrates a framework to help clinical staff evaluate changes in physiological parameters associated with common care manipulations in the NICU. Due to limited human resources, close and constant observation of neonates on a 24-h basis is a challenge. The current study model, which utilizes state-of-the-art computer vision and analyzes physiological parameter variations, may be a useful adjunct for assessing neonates. Moreover, this framework will be extended to build video databases for other neonatal manipulations and procedures, which can be used for (a) skill evaluation of clinical staff and (b) improving care documentation. Although the current results show the feasibility of the system, its efficiency still needs to be studied in larger NICU populations across different sites. Another future direction is to include surrounding contextual data, such as lighting conditions, ambient noise in the NICU, and the number of clinical staff around the neonate, to study their overall effect on neonates during manipulations.
Future studies will capture real-time physiological data from bedside monitors at millisecond resolution, synchronized with the video data. The millisecond data will help study the impact of non-invasive and invasive manipulations (such as heel prick, intubation, and extubation) in a more granular manner, together with associated clinical events such as apnea, bradycardia, and desaturation. Recent advances in the computer vision and deep learning communities have shown successful use of semi-supervised and unsupervised domain adaptation techniques. These methods could be leveraged to further reduce the data labeling requirements while adapting the proposed system to new NICU units. In addition, given reported racial disparities in neonatal care in the United States [38,39], the NTS system could be used to study racial inequities in the NICU regarding the average time dedicated to care manipulations for neonates from different racial backgrounds, providing quantifiable, informative data to healthcare providers.

7. Code Availability

The code that underpins the video analytics documentation is openly available. A Jupyter Notebook containing the code used to generate the descriptive statistics and tables included in this paper is available at https://github.com/CHIResearch/IEEEVideo. The README.md file contains the script-related and other details.

Author Contributions

Conceptualization, H.S., S.J.C., and S.K.; methodology, H.S., J.K., and R.K.; software, S.G., S.A., and H.S.; validation, S.G., S.A., A.K.P., A.K., and S.S. (Satish Saluja); formal analysis, J.J.B., S.A., and S.S. (Suchi Saria); resources, A.K. and G.Y.; data curation, J.K. and S.G.; writing—original draft preparation, J.K., R.D., and R.M.M.; writing—review and editing, R.K., R.D., H.S., R.M.M., Y.S., and J.P.; visualization, J.J.B. and H.S.; supervision, H.S., S.K., and S.S. (Satish Saluja); project administration, H.S., S.K., A.K., and G.Y.; funding acquisition, H.S. and R.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research project is funded privately by Child Health Imprints (CHIL) Pte. Ltd., Singapore.

Acknowledgments

We want to thank Microsoft, KStartup, Oracle, T-Hub, and IIM-A for recognizing the iNICU as an innovative sustainable solution in child healthcare. We want to acknowledge Harmeet Singh for help in drafting figures. We thank Tim Dougherty, Director of Sales Engineering at Wowza Media Systems, and Andrew Ramberg, Solutions Engineer at Wowza Media Systems, for helping us establish the server for video data acquisition. We acknowledge Lathika Pai, Country Head, Microsoft for Start-ups: MENA and SAARC, for providing the Azure cloud infrastructure credits, and Muni Pulipalyam, CTO-in-Residence at Microsoft Accelerator, Bangalore, for advice on the analytics pipeline on the Azure platform. We acknowledge Satyender, Deogyong Kim, Shubham Bahl, Rahul Paul for designing the hardware and analytics pipeline of the NEO TINY device. We acknowledge the help of (a) Anuj Pingal in the installation of NTS in the NICU sites, (b) Sangdo Kim, iNICU Korea, for technical design feedback on the NEO TINY device, and (c) Preeti Vishwakarma in tagging of video data. We would also like to acknowledge all the Child Health Imprints team members and other people that have made this study possible.

Conflicts of Interest

Child Health Imprints (CHI), as an organization, is focused on using technology to improve outcomes in the NICU. All the associated members are employees of CHI. The team has created iNICU, NEO, and analytics modules focused on the early prediction of disease and optimizing outcomes. Harpreet Singh and Ravneet Kaur are co-founders and own stock in CHI. The informatics and clinical advisory team are responsible for providing academic input.

Appendix A

Table A1 provides a comparative overview showing the key differences between previously published methods and the proposed NEO TINY System. It compares previous works with respect to their focus on the NICU population, live video data streams, and synchronization of video and physiological data, thereby describing the novelty of the current investigation.
Table A1. Comparison table of different studies.
Title | Study Done in NICU Population | Video Data | Whether Physiological Data Was Used in the Analysis | Synchronized Video and Physiological Data | Ref
Monitoring infants by automatic video processing: A unified approach to motion analysis | Yes | Yes | No | No | Cattani et al. [20]
Non-contact physiological monitoring of preterm infants in the Neonatal Intensive Care Unit | Yes | Yes | No (vital signs were monitored using video motion analysis of neonates) | No | Villarroel et al. [40]
Automatic and continuous discomfort detection for premature infants in a NICU using video-based motion analysis | Yes | Yes | No | No | Sun et al. [41]
Multi-Channel Neural Network for Assessing Neonatal Pain from Videos | Yes | Yes | No | No | Salekin et al. [42]
Automated pain assessment in neonates | Yes | Yes | Yes (captured from devices using character recognition) | Yes | Zamzmi et al. [43]
Intelligent ICU for Autonomous Patient Monitoring Using Pervasive Sensing and Deep Learning | No | Yes | Yes | Yes | Davoudi et al. [44]
Machine learning based automatic classification of video recorded neonatal manipulations and associated physiological parameters: A Feasibility Study | Yes | Yes | Yes | Yes | Presented study

Appendix B

A. Wall Mount
The camera was placed on a wall mount installed at the same height as the top of the baby warmer to minimize interference with the routine NICU workflow (Figure A1).
Figure A1. (a) Camera installed on custom-made wall mount to monitor the neonate in a Neonatal Intensive Care Unit (NICU). (b) Wall mount for the camera showing different sections. The wall mount weighs 1 kg.
The wall mount has three sections to provide 360-degree rotation, along with horizontal and vertical adjustment, so that the camera’s field of view can be positioned over the neonate’s body.
B. NEO TINY System: Hardware Design
The NEO TINY system is a small form factor NEO device that can easily be set up along with existing patient monitors, ventilators, and other biomedical devices connected to a neonate in the NICU setting (Figure A2).
Figure A2. Size comparison of a 3 × 3 Rubik’s cube and NEO TINY system client.
The NTS client module captures the video and physiological data of neonates. It collects physiological data from medical devices, like bedside monitors and ventilators, along with video data from USB-based cameras. Figure A3 shows the hardware image of the NTS client with respect to its dimensions and various networking ports for integration.
Figure A3. Physical image of NTS client.
C. Hardware Specifications
There is one RS232 interface that connects to the serial port of the internal NanoPi NEO Core2 single-board computer (SBC) and enables it to communicate with medical devices. The RS232 connector (Figure A4) is mounted on the main Printed Circuit Board (PCB) and is accessible through the external top face of the NEO TINY device.
Figure A4. Schematic diagram of the NEO TINY system client depicting: (a) Main Board, (b) Power Supply, (c) Communication Ports.
An RJ45 connector is mounted on the main PCB and connected to the NanoPi NEO Core2 SBC’s RJ45 interface. A USB hub is also provided on the main PCB to connect up to three USB-compatible devices. The front panel of the device has a Thin Film Transistor (TFT) Liquid Crystal Display (LCD) screen that shows notifications and alert messages. The NTS client can be powered using a 5 V, 2 A USB adaptor or battery backup and can be switched on/off using a slider switch. The NTS client’s power supply includes step-up converters and a battery-charging integrated circuit (IC) to ensure a 5 V supply throughout the device. The mainboard of the NTS client is populated with one micro-USB port for charging and one micro-USB port for programming. Table A2 provides the hardware specifications of the NEO TINY.
Table A2. Specifications of NEO TINY.
Characteristics | Details
Electrical
Input | 5.0 V, 2 A DC Adaptor (AC 100–240 V, 50/60 Hz)
Embedded Battery | LiPo 1 (DC 3.7 V, 1800 mAh)
Connectivity
Wired | RS232 × 1; RJ45 × 1; USB 2 2.0 × 3
Operating Conditions
Temperature | −20 °C to 70 °C
Humidity | 5% to 90% R.H. 3
Memory | 1 GB DDR3
Storage | eMMC: 8 GB
CPU | Quad-core 64 bit based on Cortex A53 (4 × 1.5 GHz)
Display | 1.8 inch color TFT LCD 4 display (128 × 160 pixel resolution)
Dimensions | 77 mm × 58 mm × 41 mm
Weight | 150 g
1: Lithium-ion polymer; 2: Universal serial bus; 3. Relative Humidity; 4: Thin-film transistor liquid crystal display.
D. NTS: Software Specifications
The NTS client software layer uses a Java 1.8-based program to capture medical device data using the Health Level Seven (HL7) protocol on a Debian operating system. The acquisition of medical device data and the associated biomedical protocols have been described in detail previously [26]. The NTS client captures video stream data from a Logitech USB camera with a built-in H.264 encoder using the Video4Linux version 2 (V4L2) application programming interface (API) [45]. The on-camera H.264 encoding minimizes the compute load on the NTS client device and ensures higher compression than other encoders. The video stream is sent via Wi-Fi to the streaming engine server within the NICU premises [46]. The transmission of video data occurs in two stages: (I) first, the avconv command (a Unix command) grabs data from the USB camera and transmits the video stream locally on the NTS client over the low-latency User Datagram Protocol (UDP) [47]; (II) in the second stage, the Secure Reliable Transport (SRT) protocol is used to transmit the video stream from the NTS client to the server [48,49]. The NTS system has a latency of 1–2 s and consumes up to 5 Mbps of internet bandwidth to display live video feeds at a resolution of 1280 × 720 pixels.
E. Synchronization of Video and Physiological Data
The server layer, referred to as the NTS Server, receives both the video stream and the medical device data of the neonates. The live video is based on the Logitech camera clock, whereas the physiological data from the cardio-respiratory monitors are based on the equipment clock (Figure A5). The clocks of the camera acquiring video data and the bedside monitor capturing the physiological data were manually configured in the same time zone (UTC: Coordinated Universal Time), as described in Table A3.
Video Data Capturing by NTS Client
Two system services run on the NTS system: streamPublish and srtWrapped. They are explained as follows:
The Stream Publish system service code snippet is shown below:
[Unit]
Description = Stream Capturing
ConditionPathExists = |/usr/bin
After = network.target

[Service]
ExecStart = /usr/local/streampublish/streamPublish.sh

Restart = always
RestartSec = 5
StartLimitInterval = 0

[Install]
WantedBy = multi-user.target
This service runs a script called streamPublish.sh, present in the “/usr/local/streampublish” directory. A snippet of the streamPublish.sh script file is shown below:
./capture -F -o -c0 | avconv -re -i - -vcodec libx264 -x264-params keyint=30:scenecut=0 -vcodec copy -f mpegts udp://127.0.0.1:1000?pkt_size=1316
In the streamPublish.sh script, “capture” is an output build file of the V4L2 API written in C, which captures the H.264-encoded stream from the camera at a resolution of 1280 × 720 at 30 fps. The avconv command then takes the captured stream from the capture binary via a pipe and transmits it to localhost at port 1000 (127.0.0.1:1000) using the UDP protocol.
The srtWrapped service runs a script called srtwrapped.sh. Here too, if the srtwrapped.sh script crashes for any reason, the system service will try to restart the script automatically after 5 s. The srtWrapped system service snippet is shown below:
[Unit]
Description = Stream publishing to wowza streaming engine
ConditionPathExists = |/usr/bin
After = network.target

[Service]
ExecStart = /usr/local/srt/srtwrapped.sh

Restart = always
RestartSec = 5
StartLimitInterval = 0

[Install]
WantedBy = multi-user.target
The script snippet of srtwrapped.sh is shown below.
/usr/local/srt/srt-live-transmit udp://127.0.0.1:1000 srt://[wowza server ip]:[port]
The srtwrapped.sh script sends the stream from localhost to the Wowza server IP at the port assigned to the NEO TINY, using the SRT protocol.
In the current study, GE B40® patient monitors (GE Healthcare, Milwaukee, WI, USA), SureSigns® VM6 patient monitors (Philips Medical Systems, Inc., Cleveland, OH, USA), and Philips IntelliVue MP70 monitors (Philips, Andover, MA, USA) were used. Both the video and physiological data collected by the NTS client are stamped with the NTS client clock time. The NTS client clock is synchronized with NTP (Network Time Protocol) at regular intervals.
Figure A5. Synchronization of physiological data and video data with the server clock.
The time difference between the video and physiological signals is adjusted at regular intervals. In the current study, intervals of 10, 30, 60, and 120 min were tried for offset calculations; 60-min video recordings proved most suitable. The server records the incoming video stream by splitting the recording every hour to manage the offsets [50]. The current time of the NTS client is injected as metadata into both the video and physiological data streams. Moreover, a scheduled cron job runs every 30 min to synchronize the NTS client’s clock with the NTP server (configurable using XML).
As both video and physiological signals are captured, the offset (difference) between the two clocks increases depending on hardware and processing capabilities. This offset in milliseconds is shown in Table A3. After 60 min, the time offset between the clocks was around 549 milliseconds. To ensure synchronization of the video and monitor data, the video clock was reset by the offset every 60 min.
Table A3. Meta-data injection to ensure the video is treated as per clock of physiological signal, as a sample.
Time Elapsed | Time | Time of Camera | Time of Monitor | Offset (ms)
10 min | Start time | 17:16:58.250 | 17:16:58.205 | 51
10 min | End time | 17:26:58.135 | 17:26:58.269 |
30 min | Start time | 17:39:36.253 | 17:39:36.290 | 27
30 min | End time | 18:09:36.211 | 18:09:36.221 |
60 min | Start time | 18:10:46.308 | 18:10:46.877 | 549
60 min | End time | 19:10:46.217 | 19:10:46.237 |
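A minimal pandas sketch of this per-segment correction is shown below; the timestamps are invented examples (not the values in Table A3), and the frame-time series is a hypothetical stand-in for a segment of video metadata:

import pandas as pd

# Offset measured at the start of a 60-min segment: difference between the
# monitor clock and the camera clock at the same instant.
camera_start = pd.Timestamp("2020-01-01 18:00:00.250")
monitor_start = pd.Timestamp("2020-01-01 18:00:00.205")
offset = monitor_start - camera_start            # Timedelta between the two clocks

# Frame timestamps recorded on the camera clock for this segment (~30 fps).
frame_times = pd.Series(pd.date_range(camera_start, periods=5, freq="33ms"))
aligned_frame_times = frame_times + offset       # frames re-expressed on the monitor clock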

Appendix C

A. Interface for Clinical Staff to Show Annotated Video Frame with Physiological Signals
The outputs of the combined data are displayed to users via an HTML5-based web application (Figure A6). The live video stream is displayed using a Web Real-Time Communication (WebRTC) video player [51], and physiological trends are shown using Highcharts (a JavaScript charting library) [52].
Figure A6. Bedside interface to display manipulation video and physiological parameters.
B. Missing Data in the NICU Environment
The NTS client-server architecture can be affected by various factors, such as bandwidth and network issues, which can cause data reception delays on the NTS server. The payload size of each client request in JSON (JavaScript Object Notation) format is one kilobyte, consisting of (a) medical device information, (b) patient data, and (c) NTS client information. The small payload size allows the NTS client to perform in low-bandwidth settings with a minimum internet requirement of 5 Mbps. The acquired patient data are also stored locally on the NTS client and are sent to a cloud-based NTS server at regular intervals; the NTS client has an on-device storage capacity of 8 GB. The server also evaluates the transmission performance of all NTS clients and notifies the user of any data loss during a given timeframe. Since the device data-capture resolution in the present study was set to 1 min, a total of 1440 data points are received per patient in 24 h.
During the patient’s NICU stay, physiological signal acquisition is affected by data disconnections caused by sensors falling off or poor contact. The vital tracker displays the total number of data points received for a given patient (Figure A7). If the NTS server does not receive physiological data for 5 min, the patient’s placard flashes red and audio-visual alarms are generated to notify the onsite clinical staff (Figure A8). To account for the signal quality issues caused by these external factors, extreme values that were not associated with clinical events were excluded from the analysis.
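A small sketch of this bookkeeping, assuming the received vitals for one patient are in a timestamp-indexed pandas DataFrame (the function and threshold names are illustrative):

import pandas as pd

def data_completeness(vitals, start, end, freq="1min", gap_alarm="5min"):
    """Compare received data points against the number expected at 1-min
    resolution and count gaps longer than the 5-min alarm threshold."""
    expected = pd.date_range(start, end, freq=freq)
    received = vitals.index.floor(freq).unique()
    gaps = vitals.index.to_series().diff() > pd.Timedelta(gap_alarm)
    return {
        "expected_points": len(expected),
        "received_points": len(received),
        "missing_points": len(expected.difference(received)),
        "alarm_gaps": int(gaps.sum()),
    }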
Figure A7. Vital tracking data monitoring screen (‘Total Data Points’ refers to the data points expected by the time of the last update, and ‘Data Points Received’ refers to the data points actually received).
Figure A8. Baby placard notifying the disconnection of the sensor capturing physiological data (the red icon flashes continuously until the physiological data resume).
C. Data Security
Data transmission frequencies vary among medical devices. Some devices, such as cardio-respiratory monitors, continuously send data at a regular 60-s interval, whereas others, such as blood gas analyzers, are used intermittently and transmit data only when used. Depending on a patient’s respiratory needs, continuous positive airway pressure devices or ventilators are utilized and provide data at defined intervals (usually multiple values per minute). NTS enables data acquisition from various devices based on specific protocols, such as HL7 (Health Level Seven) [53], ASCII (American Standard Code for Information Interchange), ASTM (American Society for Testing and Materials) [54], binary, or proprietary protocols. Moreover, in the current study, the video camera sends its streaming feed at 30 fps. The data acquisition module transmits the acquired video and medical device data at a per-minute resolution.
The medical environment is highly regulated, and patient data need to adhere to HIPAA (Health Insurance Portability and Accountability Act). The data transmitted by NTS clients are protected by HTTPS (Hypertext Transfer Protocol Secure) with 256-bit encryption. Each NTS client is configured with an IP address and a server port to transmit the data. The ports on NTS clients are enabled only for connected medical devices. Private keys, protected by a public key infrastructure (PKI), are required to enable remote access protocols such as Secure Shell (SSH). Data stored on the different servers are protected by roles and rights assigned to the users. The servers are equipped with disaster recovery mechanisms and protected by firewalls. Each data node is replicated across three different data centers in case one server crashes.

References

  1. Walani, S.R. Global burden of preterm birth. Int. J. Gynecol. Obstet. 2020, 150, 31–33. [Google Scholar] [CrossRef]
  2. Kamath, B.D.; Macguire, E.R.; McClure, E.M.; Goldenberg, R.L.; Jobe, A.H. Neonatal Mortality From Respiratory Distress Syndrome: Lessons for Low-Resource Countries. Pediatrics 2011, 127, 1139–1146. [Google Scholar] [CrossRef]
  3. Koyamaibole, L.; Kado, J.; Qovu, J.D.; Colquhoun, S.; Duke, T. An Evaluation of Bubble-CPAP in a Neonatal Unit in a Developing Country: Effective Respiratory Support That Can Be Applied By Nurses. J. Trop. Pediatr. 2006, 52, 249–253. [Google Scholar] [CrossRef] [Green Version]
  4. Thukral, A.; Sankar, M.J.; Chandrasekaran, A.; Agarwal, R.; Paul, V.K. Efficacy and safety of CPAP in low- and middle-income countries. J. Perinatol. 2016, 36, S21–S28. [Google Scholar] [CrossRef] [Green Version]
  5. De Georgia, M.A.; Kaffashi, F.; Jacono, F.J.; Loparo, K.A. Information Technology in Critical Care: Review of Monitoring and Data Acquisition Systems for Patient Care and Research. Sci. World J. 2015, 2015, 1–9. [Google Scholar] [CrossRef] [Green Version]
  6. Carayon, P.; Wetterneck, T.B.; Alyousef, B.; Brown, R.L.; Cartmill, R.S.; McGuire, K.; Hoonakker, P.L.T.; Slagle, J.; Van Roy, K.S.; Walker, J.M.; et al. Impact of electronic health record technology on the work and workflow of physicians in the intensive care unit. Int. J. Med. Inform. 2015, 84, 578–594. [Google Scholar] [CrossRef] [Green Version]
  7. Mark, R. The Story of MIMIC. In Secondary Analysis of Electronic Health Records; Secondary Analysis of Electronic Health Records; Data, M.C., Ed.; Springer International Publishing: Cham, Switzerland, 2016; pp. 43–49. [Google Scholar]
  8. Johnson, A.E.W.; Pollard, T.J.; Shen, L.; Lehman, L.-W.H.; Feng, M.; Ghassemi, M.; Moody, B.; Szolovits, P.; Anthony Celi, L.; Mark, R.G. MIMIC-III, a freely accessible critical care database. Sci. Data 2016, 3, 160035. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  9. Harutyunyan, H.; Khachatrian, H.; Kale, D.C.; Ver Steeg, G.; Galstyan, A. Multitask learning and benchmarking with clinical time series data. Sci. Data 2019, 6, 1–18. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  10. Fairchild, K.D.; Lake, D.E.; Kattwinkel, J.; Moorman, J.R.; Bateman, D.A.; Grieve, P.G.; Isler, J.R.; Sahni, R. Vital signs and their cross-correlation in sepsis and NEC: A study of 1,065 very-low-birth-weight infants in two NICUs. Pediatr. Res. 2017, 81, 315. [Google Scholar] [CrossRef] [PubMed]
  11. Fairchild, K.D.; Sinkin, R.A.; Davalian, F.; Blackman, A.E.; Swanson, J.R.; Matsumoto, J.A.; Lake, D.E.; Moorman, J.R.; Blackman, J.A. Abnormal heart rate characteristics are associated with abnormal neuroimaging and outcomes in extremely low birth weight infants. J. Perinatol. 2014, 34, 375–379. [Google Scholar] [CrossRef] [PubMed]
  12. Saria, S.; Rajani, A.K.; Gould, J.; Koller, D.; Penn, A.A. Integration of Early Physiological Responses Predicts Later Illness Severity in Preterm Infants. Sci. Transl. Med. 2010, 2, 48ra65. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  13. Fairchild, K.; Aschner, J.L. HeRO monitoring to reduce mortality in NICU patients. RRN 2012, 2, 65–76. [Google Scholar] [CrossRef] [Green Version]
  14. Jeba, A.; Kumar, S. Shivaprakash sosale Effect of positioning on physiological parameters on low birth weight preterm babies in neonatal intensive care unit. Int. J. Res. Pharm. Sci. 2019, 10, 2800–2804. [Google Scholar] [CrossRef]
  15. Barbosa, A.L.; Cardoso, M.V.L.M.L. Alterations in the physiological parameters of newborns using oxygen therapy in the collection of blood gases. Acta Paul. Enferm. 2014, 4, 367–372. [Google Scholar] [CrossRef] [Green Version]
  16. Catelin, C.; Tordjman, S.; Morin, V.; Oger, E.; Sizun, J. Clinical, Physiologic, and Biologic Impact of Environmental and Behavioral Interventions in Neonates during a Routine Nursing Procedure. J. Pain 2005, 6, 791–797. [Google Scholar] [CrossRef]
  17. Pereira, F.L.; Góes, F.; Fonseca, L.M.M.; Scochi, C.G.S.; Castral, T.C.; Leite, A.M. Handling of preterm infants in a neonatal intensive care unit. Rev. Esc. Enferm. USP 2013, 47, 1272–1278. [Google Scholar] [CrossRef] [Green Version]
  18. Ellsworth, M.A.; Lang, T.R.; Pickering, B.W.; Herasevich, V. Clinical data needs in the neonatal intensive care unit electronic medical record. BMC Med. Inform. Decis. Mak. 2014, 14, 92. [Google Scholar] [CrossRef] [Green Version]
  19. Moccia, S.; Migliorelli, L.; Carnielli, V.; Frontoni, E. Preterm infants’ pose estimation with spatio-temporal features. IEEE Trans. Biomed. Eng. 2020, 67, 2370–2380. [Google Scholar] [CrossRef]
  20. Cattani, L.; Alinovi, D.; Ferrari, G.; Raheli, R.; Pavlidis, E.; Spagnoli, C.; Pisani, F. Monitoring infants by automatic video processing: A unified approach to motion analysis. Comput. Biol. Med. 2017, 80, 158–165. [Google Scholar] [CrossRef]
  21. Sizun, J.; Ansquer, H.; Browne, J.; Tordjman, S.; Morin, J.-F. Developmental care decreases physiologic and behavioral pain expression in preterm neonates. J. Pain 2002, 3, 446–450. [Google Scholar] [CrossRef]
  22. Solberg, S.; Morse, J.M. The Comforting Behaviors of Caregivers toward Distressed Postoperative Neonates. Issues Compr. Pediatr. Nurs. 1991, 14, 77–92. [Google Scholar] [CrossRef] [PubMed]
  23. Chrupcala, K.A.; Edwards, T.M.; Spatz, D.L. A Continuous Quality Improvement Project to Implement Infant-Driven Feeding as a Standard of Practice in the Newborn/Infant Intensive Care Unit. J. Obstet. Gynecol. Neonatal Nurs. 2015, 44, 654–664. [Google Scholar] [CrossRef] [PubMed]
  24. Kirk, A.T.; Alder, S.C.; King, J.D. Cue-based oral feeding clinical pathway results in earlier attainment of full oral feeding in premature infants. J. Perinatol. 2007, 27, 572–578. [Google Scholar] [CrossRef] [PubMed]
  25. Singh, H.; Yadav, G.; Mallaiah, R.; Joshi, P.; Joshi, V.; Kaur, R.; Bansal, S.; Brahmachari, S.K. iNICU—Integrated Neonatal Care Unit: Capturing Neonatal Journey in an Intelligent Data Way. J. Med. Syst. 2017, 41, 132. [Google Scholar] [CrossRef] [PubMed]
  26. Singh, H.; Kaur, R.; Gangadharan, A.; Pandey, A.K.; Manur, A.; Sun, Y.; Saluja, S.; Gupta, S.; Palma, J.P.; Kumar, P. Neo-Bedside Monitoring Device for Integrated Neonatal Intensive Care Unit (iNICU). IEEE Access 2019, 7, 7803–7813. [Google Scholar] [CrossRef]
  27. Comaru, T.; Miura, E. Postural support improves distress and pain during diaper change in preterm infants. J. Perinatol. 2009, 29, 504–507. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  28. Wang, Y.-W.; Chang, Y.-J. A Preliminary Study of Bottom Care Effects on Premature Infants’ Heart Rate and Oxygen Saturation. J. Nurs. Res. 2004, 12, 161–168. [Google Scholar] [CrossRef]
  29. Jadcherla, S.R.; Chan, C.Y.; Moore, R.; Malkar, M.; Timan, C.J.; Valentine, C.J. Impact of feeding strategies on the frequency and clearance of acid and nonacid gastroesophageal reflux events in dysphagic neonates. J. Parenter. Enter. Nutr. 2012, 36, 449–455. [Google Scholar] [CrossRef] [Green Version]
  30. Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE conference on computer vision and pattern recognition, Las Vegas, NV, USA, 27–30 June 2016; 2016; pp. 2818–2826. [Google Scholar]
  31. Shao, L.; Zhu, F.; Li, X. Transfer Learning for Visual Categorization: A Survey. IEEE Trans. Neural Netw. Learn. Syst. 2015, 26, 1019–1034. [Google Scholar] [CrossRef]
  32. Wharton, Z.; Thomas, E.; Debnath, B.; Behera, A. A vision-based transfer learning approach for recognizing behavioral symptoms in people with dementia. In Proceedings of the 2018 15th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), Auckland, New Zealand, 27–30 November 2018; pp. 1–6. [Google Scholar]
  33. Maaten, L.V.D.; Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605. [Google Scholar]
  34. Gulli, A.; Pal, S. Deep Learning with Keras; Packt Publishing Ltd.: Birmingham, UK, 2017. [Google Scholar]
  35. Abadi, M.; Barham, P.; Chen, J.; Chen, Z.; Davis, A.; Dean, J.; Devin, M.; Ghemawat, S.; Irving, G.; Isard, M.; et al. TensorFlow: A System for Large-Scale Machine Learning. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’16), Savannah, GA, USA, 2–4 November 2016; pp. 265–283. [Google Scholar]
  36. Prechelt, L. Early Stopping—But When? In Neural Networks: Tricks of the Trade; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 1998; Volume 1524, pp. 55–69. [Google Scholar]
  37. Hall, R.W.; Anand, K.J. Pain management in newborns. Clin. Perinatol. 2014, 41, 895–924. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  38. Profit, J.; Gould, J.B.; Bennett, M.; Goldstein, B.A.; Draper, D.; Phibbs, C.S.; Lee, H.C. Racial/Ethnic Disparity in NICU Quality of Care Delivery. Pediatrics 2017, 140, e20170918. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  39. Horbar, J.D.; Edwards, E.M.; Greenberg, L.T.; Profit, J.; Draper, D.; Helkey, D.; Lorch, S.A.; Lee, H.C.; Phibbs, C.S.; Rogowski, J.; et al. Racial Segregation and Inequality in the Neonatal Intensive Care Unit for Very Low-Birth-Weight and Very Preterm Infants. JAMA Pediatr. 2019, 173, 455–461. [Google Scholar] [CrossRef] [PubMed]
  40. Villarroel, M.; Chaichulee, S.; Jorge, J.; Davis, S.; Green, G.; Arteta, C.; Zisserman, A.; McCormick, K.; Watkinson, P.; Tarassenko, L. Non-contact physiological monitoring of preterm infants in the Neonatal Intensive Care Unit. NPJ Digit. Med. 2019, 2, 128. [Google Scholar] [CrossRef] [Green Version]
  41. Sun, Y.; Kommers, D.; Wang, W.; Joshi, R.; Shan, C.; Tan, T.; Aarts, R.M.; van Pul, C.; Andriessen, P.; de With, P.H.N. Automatic and Continuous Discomfort Detection for Premature Infants in a NICU Using Video-Based Motion Analysis. In Proceedings of the 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Berlin, Germany, 23–27 July 2019; pp. 5995–5999. [Google Scholar]
  42. Salekin, M.S.; Zamzmi, G.; Goldgof, D.; Kasturi, R.; Ho, T.; Sun, Y. Multi-Channel Neural Network for Assessing Neonatal Pain from Videos. In Proceedings of the 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC), Bari, Italy, 6–9 October 2019; pp. 1551–1556. [Google Scholar]
  43. Zamzmi, G.; Pai, C.-Y.; Goldgof, D.; Kasturi, R.; Sun, Y.; Ashmeade, T. Automated pain assessment in neonates. In Scandinavian Conference on Image Analysis; Springer: Cham, Switzerland, 2017; pp. 350–361. [Google Scholar]
  44. Davoudi, A.; Malhotra, K.R.; Shickel, B.; Siegel, S.; Williams, S.; Ruppert, M.; Bihorac, E.; Ozrazgat-Baslanti, T.; Tighe, P.J.; Bihorac, A.; et al. Intelligent ICU for Autonomous Patient Monitoring Using Pervasive Sensing and Deep Learning. Sci. Rep. 2019, 9, 1–13. [Google Scholar] [CrossRef] [Green Version]
  45. LinuxTV. Available online: https://linuxtv.org/downloads/legacy/video4linux/v4l2dwgNew.html (accessed on 21 May 2020).
  46. Wowza Streaming Engine. Available online: https://www.wowza.com/docs/wowza-streaming-engine-product-articles (accessed on 14 January 2020).
  47. Zaidi, S.; Bitam, S.; Mellouk, A. Enhanced user datagram protocol for video streaming in VANET. In Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 21–25 May 2017; pp. 1–6. [Google Scholar]
  48. SRT Alliance. Available online: https://www.srtalliance.org/ (accessed on 14 January 2020).
  49. Ruether, T. Wowza Product Resources Center. Available online: https://www.wowza.com/blog/streaming-protocols (accessed on 29 January 2020).
  50. Record Live Video to VOD. Available online: https://www.wowza.com/docs/how-to-record-live-streams-wowza-streaming-engine#record-all-incoming-streams (accessed on 13 April 2020).
  51. WebRTC. Available online: https://webrtc.org (accessed on 19 February 2020).
  52. Highcharts. Available online: https://www.highcharts.com (accessed on 10 February 2020).
  53. Bender, D.; Sartipi, K. HL7 FHIR: An Agile and RESTful approach to healthcare information exchange. In Proceedings of the 26th IEEE International Symposium on Computer-Based Medical Systems, Porto, Portugal, 20–22 June 2013; pp. 326–331. [Google Scholar]
  54. ASTM International. Available online: https://www.astm.org/ (accessed on 22 May 2020).
Figure 1. NEO TINY system (NTS) client module in a typical Neonatal Intensive Care Unit (NICU) setting (the box with the yellow border highlights the NTS client, and the red boxes highlight other devices).
Figure 2. The overall architecture of the machine learning (ML)-based video classification system in the NICU.
Figure 3. Images of the manipulations: (i) patting, (ii) diaper change, and (iii) tube feeding; the region of interest is marked with a yellow border.
Figure 4. Deep learning architecture for neonatal video classification utilizing a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network.
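To make the architecture in Figure 4 concrete, the following is a minimal sketch of an Inception-v3 + LSTM video classifier built with Keras/TensorFlow [34,35]. It is not the authors’ exact configuration; the number of frames per 8 s clip, layer widths, dropout rate, and optimizer are assumptions used only for illustration.

```python
# Minimal sketch of a CNN + LSTM video classifier (not the authors' exact model).
# Assumptions: 24 frames per 8 s clip, 299x299 RGB input, 3 manipulation classes.
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_FRAMES, IMG_SIZE, NUM_CLASSES = 24, 299, 3

# Frozen Inception-v3 backbone acts as a per-frame feature extractor (transfer learning).
backbone = tf.keras.applications.InceptionV3(
    include_top=False, weights="imagenet", pooling="avg",
    input_shape=(IMG_SIZE, IMG_SIZE, 3))
backbone.trainable = False

clip_input = layers.Input(shape=(NUM_FRAMES, IMG_SIZE, IMG_SIZE, 3))
# Apply the CNN to every frame, producing a sequence of 2048-d feature vectors.
frame_features = layers.TimeDistributed(backbone)(clip_input)
# An LSTM layer models the temporal ordering of the frame features.
x = layers.LSTM(256)(frame_features)
x = layers.Dropout(0.5)(x)
output = layers.Dense(NUM_CLASSES, activation="softmax")(x)

model = models.Model(clip_input, output)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()
```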
Figure 5. t-Distributed Stochastic Neighbor Embedding (t-SNE) visualization for the manipulations (patting, diaper change, and tube feeding) (a) without transfer learning and (b) with transfer learning. Perplexity is 35, and the number of iterations is 20,000.
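The projection in Figure 5 can be reproduced in outline with scikit-learn’s t-SNE implementation. This is a sketch only: `features` and `labels` are placeholders for the CNN embeddings and manipulation classes, and the iteration-count keyword is `n_iter` in older scikit-learn releases (`max_iter` in newer ones).

```python
# Sketch of the t-SNE projection used to visualize manipulation embeddings (Figure 5).
# Assumption: `features` is an (N, 2048) array of CNN feature vectors and
# `labels` holds the manipulation class of each clip.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
features = rng.normal(size=(600, 2048))   # placeholder for real embeddings
labels = rng.integers(0, 3, size=600)     # 0: patting, 1: diaper change, 2: tube feeding

# Perplexity and iteration count follow the figure caption (35 and 20,000);
# `n_iter` is named `max_iter` in scikit-learn >= 1.5.
embedding = TSNE(n_components=2, perplexity=35, n_iter=20000,
                 init="pca", random_state=0).fit_transform(features)

for cls, name in enumerate(["Patting", "Diaper change", "Tube feeding"]):
    mask = labels == cls
    plt.scatter(embedding[mask, 0], embedding[mask, 1], s=8, label=name)
plt.legend()
plt.title("t-SNE of manipulation embeddings")
plt.show()
```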
Figure 6. CNN-based model accuracy for classifying manipulation images.
Figure 7. Automatic tagging of manipulation videos: the first frame identified as a manipulation is marked at the top left, and dotted lines mark the manipulation interval.
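One plausible way to obtain tags like those in Figure 7 is to slide an 8 s window over the recording, classify each window with the trained model, and merge consecutive windows of the same class into a single event. The sketch below assumes hypothetical helpers (`classify_clip` and an iterator of windows) and is not the published implementation.

```python
# Sketch of automatic tagging of manipulation segments: classify sliding 8 s
# windows and merge adjacent windows with the same predicted class into events.
# `classify_clip` (returning a class name or None) is a hypothetical helper.
CLASSES = ["patting", "diaper_change", "tube_feeding"]

def tag_manipulations(windows, classify_clip, window_seconds=8):
    """windows: iterable of (start_offset_seconds, clip_frames) pairs."""
    events, current = [], None
    for start, clip in windows:
        label = classify_clip(clip)  # e.g., argmax of the CNN+LSTM softmax output
        if label in CLASSES:
            if current and current["label"] == label and start <= current["end"]:
                # Overlapping/contiguous window of the same class: extend the event.
                current["end"] = start + window_seconds
            else:
                if current:
                    events.append(current)
                current = {"label": label, "start": start,
                           "end": start + window_seconds}
        elif current:
            # Window with no manipulation detected: close the ongoing event.
            events.append(current)
            current = None
    if current:
        events.append(current)
    return events
```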
Figure 8. Variability in physiological signals captured every minute (average values) during the manipulations in the clinical setting. (a) patting, (b) diaper change, and (c) tube feeding. * number of manipulations/number of patients.
Table 1. Visual features of manipulations.

Patting (Ref. [22])
Definition: A comforting manipulation in which the flat palmar surface of the caregiver’s hand is brought into contact with a surface of the neonate’s body, singly or repetitively; the intensity and rate varied across patting episodes.
Spatial features: Nurse’s hand, neonate’s body boundaries.
Temporal features: Frequency: on demand; Duration: 33 s.

Diaper Change (Ref. [27,28])
Definition: Changing the diaper and cleaning the diaper area for skin hygiene.
Spatial features: Two nurse’s hands, diaper, and skin contrast.
Temporal features: Frequency: every 4 h; Duration: 3 min.

Tube Feeding (Ref. [29])
Definition: A soft tube is placed through the nose (nasogastric) or mouth (orogastric) into the stomach; feeding is provided through the tube until the baby can take food by mouth.
Spatial features: Nurse’s hand, milk, syringe attached to the feeding tube (with or without plunger).
Temporal features: Frequency: every 2 h; Duration: 10–30 min.
Table 2. Baseline characteristics of the sample (enrolled subjects, n = 10).

Id | Sex | Gestational Age (Weeks+Days) | Birth Weight (g) | Age Interval for Recording (Days) | Clinical Diagnoses
1 | Male | 26+0 | 1005 | 24–25 | RDS, Apnea, Prematurity
2 | Male | 27+1 | 800 | 76–90 | Prematurity
3 | Male | 29+4 | 1372 | 37–44 | Prematurity, RDS, Apnea, Sepsis
4 | Male | 35+2 | 1400 | 8–10 | NNH
5 | Male | 36+0 | 2400 | 3–5 | RDS, NNH
6 | Male | 36+6 | 1430 | 4–8 | Prematurity, NNH
7 | Male | 36+6 | 3231 | 5–6 | RDS
8 | Male | 39+2 | 2600 | 7–8 | RDS, Seizure
9 | Male | 39+4 | 2000 | 5–6 | Sepsis, RDS, Apnea
10 | Male | 40+0 | 2700 | 3–7 | RDS, NNH
RDS: Respiratory Distress Syndrome; NNH: Neonatal Hyperbilirubinemia.
Table 3. Frequency and duration of manipulations recorded.

Manipulation | Frequency # | Average Duration (Seconds) * | Minimum Duration (Seconds) | Maximum Duration (Seconds)
Patting | 167 | 28.9 (12.4) | 12 | 56
Tube Feeding | 108 | 108.9 (55.3) | 25 | 300
Diaper Change | 64 | 45.5 (18.8) | 17 | 92
# Frequency across the length of stay in the NICU; * Mean (Standard Deviation).
Table 4. NTS-generated note of neonatal manipulations.

Manipulation | Source | Documentation
Patting | Nurse | Not captured in EMR
Patting | NTS | The patting was started at 14:05:08 on 17-08-2020 and completed at 14:06:19 (duration: 71 s). This is manipulation number 3 since 8 a.m.
Diaper Change | Nurse | Not captured in EMR
Diaper Change | NTS | The diaper change was started at 19:35:25 on 17-08-2020 and completed at 19:37:01 (duration: 96 s). This is manipulation number 4 since 8 a.m.
Tube Feed Entry | Nurse | Start Time: 17-08-2020 09:30 a.m.; Type: Tube Feed; Type of Milk: Preterm Formula; Quantity: 11 mL
Tube Feed Entry | NTS | The feeding was started at 09:30:09 on 17-08-2020 and completed at 09:32:57 (duration: 168 s). This is manipulation number 1 since 8 a.m.
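A note of the kind shown in Table 4 can be assembled directly from a tagged event’s label, start time, and end time. The formatting function below mirrors the wording of the table entries; the event fields and the running count since 8 a.m. are assumptions about how such a note could be composed, not a description of the NTS internals.

```python
# Sketch of composing an NTS-style note (Table 4) from a tagged event.
from datetime import datetime

def format_nts_note(label, start, end, count_since_8am):
    duration = int((end - start).total_seconds())
    name = {"patting": "patting", "diaper_change": "diaper change",
            "tube_feeding": "feeding"}[label]
    return (f"The {name} was started at {start:%H:%M:%S} on {start:%d-%m-%Y} "
            f"and completed at {end:%H:%M:%S} (duration: {duration} s). "
            f"This is manipulation number {count_since_8am} since 8 a.m.")

note = format_nts_note("patting",
                       datetime(2020, 8, 17, 14, 5, 8),
                       datetime(2020, 8, 17, 14, 6, 19), 3)
print(note)
# The patting was started at 14:05:08 on 17-08-2020 and completed at 14:06:19
# (duration: 71 s). This is manipulation number 3 since 8 a.m.
```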
Table 5. Performance of the deep learning model.

Manipulation | PPV | Sensitivity | F-Measure | Total Manipulations
Patting | 0.86 | 1.00 | 0.92 | 167
Diaper Change | 0.98 | 0.68 | 0.80 | 64
Tube Feeding | 1.00 | 0.87 | 0.93 | 108
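For reference, the per-class metrics in Table 5 follow from one-vs-rest confusion counts: PPV = TP/(TP + FP), sensitivity = TP/(TP + FN), and the F-measure is their harmonic mean. The snippet below illustrates the arithmetic with hypothetical counts chosen so that the rounded values match the “Patting” row; the actual false-positive count is not reported in the table.

```python
# Per-class metrics from one-vs-rest confusion counts (illustrative only).
def per_class_metrics(tp, fp, fn):
    ppv = tp / (tp + fp) if tp + fp else 0.0            # positive predictive value
    sensitivity = tp / (tp + fn) if tp + fn else 0.0    # recall
    f_measure = (2 * ppv * sensitivity / (ppv + sensitivity)
                 if ppv + sensitivity else 0.0)          # harmonic mean
    return round(ppv, 2), round(sensitivity, 2), round(f_measure, 2)

# Hypothetical counts: 167 true patting clips all recovered, 28 false positives.
print(per_class_metrics(tp=167, fp=28, fn=0))   # (0.86, 1.0, 0.92)
```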
Table 6. Physiological parameters (HR and SpO2) before, during, and after manipulation.

<32 Weeks
Manipulation | Parameter | Baseline * | During * | Post * | p-Value $ | p-Value #
Patting | HR (BPM) | 161.9 (10.19) | 164.7 (13.7) | 157.6 (24.9) | 0.168 | 0.069
Patting | SpO2 (%) | 92.7 (7.4) | 93.0 (7.9) | 89.7 (12.8) | 0.43 | 0.087
Diaper Change | HR (BPM) | 152.738 (31.4) | 166.9 (14.4) | 157.4 (23.2) | 0.000 | 0.036
Diaper Change | SpO2 (%) | 88.9 (18.2) | 94.02 (5.7) | 89.4 (13.6) | 0.000 | 0.07
Tube Feeding | HR (BPM) | 163.1 (10.55) | 164.28 (13.29) | 162.2 (20.0) | 0.26 | 0.22
Tube Feeding | SpO2 (%) | 93.9 (6.4) | 93.9 (4.9) | 91.7 (9.9) | 0.49 | 0.052

≥32 Weeks
Manipulation | Parameter | Baseline * | During * | Post * | p-Value $ | p-Value #
Patting | HR (BPM) | 148.7 (13.9) | 165.7 (30.7) | 150.9 (8.2) | 0.019 | 0.00
Patting | SpO2 (%) | 94.7 (6.1) | 92.5 (10.9) | 93.5 (11.41) | 0.21 | 0.34
Diaper Change | HR (BPM) | 147.8 (12.02) | 152.7 (15.8) | 150.7 (9.5) | 0.10 | 0.17
Diaper Change | SpO2 (%) | 94.7 (5.8) | 94.9 (5.4) | 93.9 (12.7) | 0.44 | 0.36
Tube Feeding | HR (BPM) | 150.5 (16.7) | 147.6 (16.6) | 153.3 (11.6) | 0.17 | 0.003
Tube Feeding | SpO2 (%) | 95.1 (4.9) | 94.0 (8.0) | 93.5 (7.6) | 0.23 | 0.37

* Mean (Standard Deviation); HR: Heart rate, SpO2: Oxygen saturation, BPM: Beats per minute; $ Comparing baseline and during-manipulation parameters, # Comparing during-manipulation and post-manipulation parameters.
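The comparisons in Table 6 pair each manipulation’s baseline values with the during- and post-manipulation values. The sketch below illustrates such a paired comparison on simulated heart-rate data using a paired t-test and a Wilcoxon signed-rank alternative; it is an illustration of the idea, not a restatement of the study’s actual statistical procedure.

```python
# Illustrative paired comparison of physiological values around a manipulation.
# The arrays are simulated placeholders for per-event mean HR values.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
baseline_hr = rng.normal(loc=152, scale=15, size=64)             # before diaper change
during_hr = baseline_hr + rng.normal(loc=10, scale=8, size=64)   # during diaper change

# Parametric paired comparison.
t_stat, p_value = stats.ttest_rel(baseline_hr, during_hr)
print(f"paired t = {t_stat:.2f}, p = {p_value:.3f}")

# Non-parametric alternative when normality is uncertain.
w_stat, p_wilcoxon = stats.wilcoxon(baseline_hr, during_hr)
print(f"Wilcoxon W = {w_stat:.1f}, p = {p_wilcoxon:.3f}")
```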