Peer-Review Record

Mobile Device-Based Video Screening for Infant Head Lag: An Exploratory Study

Children 2023, 10(7), 1239; https://doi.org/10.3390/children10071239
by Hao-Wei Chung 1,2,3, Che-Kuei Chang 4, Tzu-Hsiu Huang 5, Li-Chiou Chen 6, Hsiu-Lin Chen 1,7, Shu-Ting Yang 1, Chien-Chih Chen 8,* and Kuochen Wang 4,8,*
Reviewer 1:
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 3 May 2023 / Revised: 12 July 2023 / Accepted: 13 July 2023 / Published: 18 July 2023
(This article belongs to the Special Issue Cognitive and Motor Development: Children and Adolescents)

Round 1

Reviewer 1 Report

I would, first of all, like to congratulate the authors on their excellent work and the wonderful idea behind this article. Prior to publication, there are some issues that should be resolved.

1. In line 279 of the Discussion section, the authors note differences in the sensitivity and specificity of the 13-key-point approach between the level 0 and level 1 infants; it would be great to have a possible explanation for this.

2. Line 282, 'parts of level 0 infants': should this be 'some of the level 0 infants'?

3. There is a typo in line 289; it should be 'improve'.

4. Lines 310 to 315 would fit better in the Introduction.

Author Response

Please see the attachment

Author Response File: Author Response.docx

Reviewer 2 Report

The present study used phone-recorded videos of infant movement to improve upon the use of photos for AI training to detect motor delay and difficulties in infants. The videos were of infants at high risk of motor delay, mainly associated with prematurity. The research provides a valuable starting point for the use of AI in diagnostics. I have two main points for consideration.

1. There is a focus on diagnostics, but I wonder whether the potential use is more likely to be in screening.

2. The present study was conducted with a high-risk clinical sample. The emphasis in the introduction (e.g., lines 33-44) and parts of the discussion seems to be on all children who may have difficult-to-detect signs of disability. While it can be helpful to aspire to developing tools that will support early detection of disability in all children, at the moment the focus should be on children already identified as high risk.

Thank you for the opportunity to review this interesting study.

Copy-editing is needed. There are a few sentences that do not make sense, and overall the article is difficult to read due to errors in English expression.

Author Response

Please see the attachment

Author Response File: Author Response.docx

Reviewer 3 Report

Automation of developmental motor assessments is an important, active field, and this work extends a growing toolkit and body of evidence. Clarification of methodology is needed to permit replication and to assess the generalizability of results.

Title: This study examines only one item of the HINE, namely head lag in the pull-to-sit maneuver. The title of the article should reflect this narrow scope.

Introduction:
-The authors should revisit the discussion of using machine learning-based tools in infant motor assessments, and this study's place within the development of such a toolkit, in Paragraph 4. I agree that machine learning-based tools, in principle, have the potential to automate/facilitate currently available clinical assessments and even to extend/augment them (e.g., by adding precision or by capturing nuances that are difficult for humans to see). I also agree with the authors' approach of attempting to automate an existing measure (here, within a classification scheme) before attempting to extend/augment existing tools. However, Introduction paragraphs 4-6 should explicitly clarify this study's aims within this framework.
-Prior art should be discussed in greater detail. Which elements of this processing pipeline have been used in similar classification tasks, and which are novel?
-Minor point: I don't necessarily agree that privacy and security concerns are the main limitation preventing the development of machine learning-based assessment tools. As is the case for much of the machine learning field, a major limitation is the need for large, well-labeled datasets, which are expensive and time-consuming to produce.

Methods:
-2.2.: Please explicitly describe inclusion and exclusion criteria. Please also expand upon logistics: were parents responsible both for performing the pull-to-sit maneuver and for recording? Was clinical assessment also done at the visit for validation? Were any videos obtained and submitted that were not of suitable quality for rating?
-Sections 2.3.x contain substantial text regarding conceptual frameworks but do not contain enough detail regarding actual processing steps. Please include enough detail to permit replication (including and up to code availability if applicable).
-Please clarify, in particular, the novelty vs. prior art of "data fuzzifying learning." If I understand correctly, it seems to resemble a perceptron as used in most neural networks: the authors appear to non-linearly transform observations (using a "fuzzy triangular membership function", essentially normalizing the observations) and then integrate them via a weighted sum (a toy sketch of this reading follows these Methods comments).
-I do not understand how the time dimension is used in the analyses.
-Figure 2 does not currently aid understanding
-I do not understand how to interpret the authors' apparent application of multiclass ROC statistics. Classical ROC statistics (e.g., TP, TN, FP, FN, AUC) apply only to binary classifications; as there are 3 classification outcomes in this case (0, 1, and 3), the authors appear to use a multiclass generalization. The methods referenced seem to permit "one-vs-one" or "one-vs-rest" binarizations of multiclass data. If I understand correctly, the authors used "one-vs-rest" methodology. Regardless, this seems to treat classification levels as nominal rather than ordinal (3 > 1 > 0). I recommend using ordinary ROC statistics, but binarizing using cutoffs. That is, compute sensitivity, specificity, and AUC for each of two analyses: 1) classification accuracy of level < 0.5 vs. level > 0.5 and 2) classification accuracy of level > 2 vs. level < 2 (a sketch of this follows these Methods comments).
-The authors should report kappa in addition to raw accuracy to describe AI vs. PT classification agreement (see the second sketch below).
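
To make my reading of "data fuzzifying learning" concrete, here is a minimal sketch of the transform-then-integrate step as I understand it. This is an illustration only, not the authors' implementation: the membership-function breakpoints, the observation values, and the weights are all invented.

```python
import numpy as np

def triangular_membership(x, a, b, c):
    # Triangular membership function: 0 outside [a, c], rising to 1 at x == b.
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

# Hypothetical keypoint-derived observations, pre-scaled to [0, 1]
observations = np.array([0.2, 0.5, 0.8])

# Fuzzify: non-linearly map each observation to a membership degree in [0, 1]
memberships = triangular_membership(observations, a=0.0, b=0.5, c=1.0)

# Integrate: a weighted sum, as a single perceptron unit would compute
weights = np.array([0.3, 0.5, 0.2])
print(float(np.dot(weights, memberships)))  # 0.7 with these toy numbers
```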
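
As a concrete companion to the two preceding points, the sketch below shows the kind of cutoff-based binary ROC analysis and chance-corrected agreement (Cohen's kappa) I have in mind. All labels and scores are fabricated for illustration; only the 0.5 and 2 cutoffs come from my suggestion above.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score, confusion_matrix, roc_auc_score

# Hypothetical ordinal PT labels (levels 0, 1, 3) and continuous AI scores
y_true = np.array([0, 0, 1, 1, 3, 3, 0, 1, 3, 1])
ai_score = np.array([0.1, 0.4, 0.9, 1.2, 2.8, 3.1, 0.3, 0.8, 2.5, 1.4])

# Ordinary binary ROC statistics at the two ordinal cutoffs suggested above
for cutoff in (0.5, 2.0):
    y_bin = (y_true > cutoff).astype(int)   # binarize the ordinal labels
    pred = (ai_score > cutoff).astype(int)  # binarize the AI output
    auc = roc_auc_score(y_bin, ai_score)
    tn, fp, fn, tp = confusion_matrix(y_bin, pred).ravel()
    print(f"cutoff {cutoff}: AUC={auc:.2f}, "
          f"sens={tp / (tp + fn):.2f}, spec={tn / (tn + fp):.2f}")

# Chance-corrected AI vs. PT agreement on the full 3-level scale
ai_level = np.select([ai_score <= 0.5, ai_score <= 2.0], [0, 1], default=3)
print("kappa:", cohen_kappa_score(y_true, ai_level))
```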

Results:
-Participant characteristics table: In addition to perinatal characteristics, current characteristics are also important (in particular age, whether chronological, adjusted, or both). Current medical and developmental status would also be helpful to the degree available.
-Listing rows for each iteration is not helpful. Summary statistics of the distributions of results (e.g., accuracy, kappa, sensitivity for score < 0.5, specificity for score < 0.5, AUC at the 0.5 cutoff, etc.) would suffice (e.g., mean and standard deviation).
-The authors claim performance differences between analyses performed using 13 vs. 5 key points; statistics should explicitly evaluate these differences (one possible paired comparison is sketched below).
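
One way to make this explicit, assuming each iteration produces paired results for the two configurations, is a non-parametric paired test such as the Wilcoxon signed-rank test, reported alongside the summary statistics requested above. The accuracy values below are invented placeholders.

```python
import numpy as np
from scipy.stats import wilcoxon

# Invented per-iteration accuracies for the two keypoint configurations
acc_13kp = np.array([0.88, 0.85, 0.90, 0.87, 0.86, 0.89, 0.84, 0.91, 0.88, 0.86])
acc_5kp = np.array([0.80, 0.82, 0.83, 0.79, 0.81, 0.84, 0.78, 0.85, 0.80, 0.82])

# Summary statistics in place of per-iteration rows
print(f"13 kp: {acc_13kp.mean():.3f} +/- {acc_13kp.std(ddof=1):.3f}")
print(f" 5 kp: {acc_5kp.mean():.3f} +/- {acc_5kp.std(ddof=1):.3f}")

# Paired non-parametric test across the same iterations
stat, p = wilcoxon(acc_13kp, acc_5kp)
print(f"Wilcoxon signed-rank: statistic={stat}, p={p:.4f}")
```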

Discussion/Limitations/Conclusions:
-My comments regarding the Introduction apply here also

Please proofread; there are many minor errors and inaccuracies.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Round 2

Reviewer 3 Report

I appreciate the authors' substantial revisions and find the manuscript much improved. The authors' clarifications (particularly regarding the Methods and Results) give me a much better understanding of the study and its findings. However, they also raise serious follow-up questions.

Point/response 3: Please further clarify: were all infants meeting criteria enrolled, i.e., was enrollment consecutive?

Point/response 4: Much clearer; thank you. Minor comment: the frame-number normalization process (Step 2) strikes me as atypical and likely to introduce artifact (e.g., altering apparent velocities in non-linear ways; a toy illustration follows). This is not necessary to address in this manuscript but may be an opportunity for improvement in the future.
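
To illustrate the artifact I have in mind, here is a toy example, assuming the normalization amounts to nearest-neighbor resampling to a fixed frame count (which may not match the authors' exact procedure):

```python
import numpy as np

# A constant-velocity trajectory sampled over 30 frames
x = 0.1 * np.arange(30)  # uniform displacement of 0.1 per frame

# Normalize to a fixed count of 10 frames by nearest-neighbor index mapping
idx = np.round(np.linspace(0, len(x) - 1, 10)).astype(int)
x_norm = x[idx]

print(np.diff(x))       # constant 0.1: true, uniform velocity
print(np.diff(x_norm))  # mix of 0.3 and 0.4 steps: velocity now non-uniform
```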

But critically (and leading into point/response 6): I still do not understand how the handling of the time dimension makes sense. Intrinsically, the head lag procedure is dynamic: head lag is present when, over the course of being pulled to sit, the neck assumes an extensor position. However, Equation 7 instead seems to average over all timepoints, in effect deleting all dynamic spatial/temporal information. It would seem that the classifier is essentially making decisions based on a fuzzy, weighted mean position for each keypoint. This would seem to lack physical and physiologic interpretation, and would suggest that high classifier performance is instead epiphenomenal, relying, for instance, on differences in video acquisition positioning in children with head lag. Please address this concern before further review can be considered.
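
To state the concern precisely: my reading is that Equation 7 has the general form of a time average of fuzzified keypoint coordinates, something like

```latex
\bar{m}_k = \frac{1}{T} \sum_{t=1}^{T} \mu\big(x_k(t)\big)
```

where x_k(t) is the position of keypoint k in frame t, mu is the fuzzy membership function, and T is the (normalized) frame count. If so, the feature is invariant under any permutation of the frames: the same video played in reverse, in which the head rises rather than lags, would produce identical features. That is the sense in which the dynamic information seems to be deleted.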

Author Response

Please see the attachment

Author Response File: Author Response.docx
