Peer-Review Record

Radiomics Based on Thyroid Ultrasound Can Predict Distant Metastasis of Follicular Thyroid Carcinoma

J. Clin. Med. 2020, 9(7), 2156; https://doi.org/10.3390/jcm9072156

by Mi-ri Kwon¹, Jung Hee Shin²,*, Hyunjin Park³,⁴,*, Hwanho Cho⁵, Eunjin Kim⁵ and Soo Yeon Hahn²

Reviewer 1: Anonymous

Reviewer 2: Pierpaolo Trimboli

Submission received: 16 June 2020 / Revised: 30 June 2020 / Accepted: 6 July 2020 / Published: 8 July 2020

(This article belongs to the Special Issue Radiation Oncology - Head and Neck Cancers)

Round 1

Reviewer 1 Report

In this study, the authors evaluated whether radiomics analysis based on gray-scale ultrasound can predict distant metastasis of follicular thyroid cancer (FTC). They found that the radiomics signature and widely invasive histology were significantly associated with distant metastasis and concluded that the radiomics signature is an independent biomarker useful for predicting metastasis of FTC. The manuscript is well written, and the conclusion sounds reasonable.

Minor points:

Methods. How was distant metastasis confirmed?
Results. Some numerics appear both in sentences in the Results and in the tables. For example, “(with metastasis: 59.35 ± 12.0 years, without metastasis: 46.51 ± 14.09 years, p < 0.0001)” also appears in Table 1.
Discussion, line 238. There are some studies on 3D-US.
Discussion, line 243-. I understand that the radiomics signature was a significant factor that predicted distant metastasis. What do the authors think to be the potential mechanism that caused these results? Is it a black box?

Author Response

Review 1

Comments and Suggestions for Authors

Minor points:

How was distant metastasis confirmed?

>> Thank you for your question. Distant metastasis was confirmed by biopsy and/or diagnosed by imaging modalities including computed tomography, positron emission tomography, or magnetic resonance imaging. We added this to the Materials and Methods section.

Some numerics appear both in sentences in the Results and in the tables. For example, “(with metastasis: 59.35 ± 12.0 years, without metastasis: 46.51 ± 14.09 years, p < 0.0001)” also appears in Table 1.

>> Thank you for your comment. To avoid duplication, we removed duplicated numerics appearing both in the manuscript and tables.

Discussion, line 238. There are some studies on 3D-US.

>> Thank you for your comment. I want to emphasize that ultrasound usually obtains 2D data. Although 3D-US has been introduced, it has not yet been widely used in clinical practice, especially in the evaluation of thyroid nodules.

Discussion, line 243-. I understand that the radiomics signature was a significant factor that predicted distant metastasis. What do the authors think to be the potential mechanism that caused these results? Is it a black box?

>> Six features were common to all five folds, and many of them have interpretable implications that might help explain the potential mechanism. Two shape features, elongation and sphericity, were selected; they imply that less elongated and non-spherical nodules tend to be more malignant and thus could be metastatic, based on widely accepted malignant thyroid US features [17]. The texture features from GLSZM (i.e., gray level non-uniformity normalized, size zone nonuniformity, and small area low gray-level emphasis) all quantify intra-nodular heterogeneity in intensity, which plays an important role in metastasis in many cancers [28,29].

Reviewer 2 Report

This study investigates the role of ultrasound radiomics in predicting distant metastasis of follicular thyroid carcinoma (FTC). To that aim, the authors enrolled 169 patients (35 of whom had distant metastasis) and extracted 60 radiomics features (RFs) from each 2D US image. Since no validation set was available, five-fold cross-validation was performed, each time leaving out a different fold as the validation set. The LASSO algorithm was used to select the non-redundant features: the RFs selected in all five training folds were used to build the radiomics signature. The predictive power of the radiomics score, clinical data, and US information was evaluated with univariate correlation and multivariate logistic regression analyses. Finally, two support vector machine classifiers were trained using only the radiomics score and all the statistically significant variables, respectively. The authors found that a six-feature radiomics score and widely invasive histology were independently associated with distant metastasis, providing good predictive performance.

General Comment:

The study is of interest; however, it is affected by some limitations, as the authors also point out:

The enrollment time is not the same for patients with and without metastases. This also leads to the inclusion of images acquired with different ultrasound system technologies, which may yield unreliable results.
US imaging is one of the most operator-dependent techniques, and the RFs are highly affected by acquisition settings; therefore, assessing the agreement between RFs extracted from images acquired by different operators is mandatory.
It has been demonstrated that RFs can be correlated with other clinical parameters; since the authors did not perform any correlation analysis, it cannot be excluded that the radiomics score was a surrogate of other clinical variables.
The use of an SVM classifier for this sample size is not justified, and the risk of overfitting is very high.
The authors did not provide any information about the calculated radiomics score or the full details of the predictive model, which does not allow the study to be replicated.

Specific Comment:

Line 74: Why did the authors decide to reduce the enrollment time for patients without distant metastases?

Line 88: Authors should compare the radiomics features extracted from images acquired by the different operators.

Line 148: Usually univariate logistic regression is performed before the multivariate one: why did the authors do a correlation analysis?

Line 186: Is the radiomics score a linear combination of the six LASSO selected features? Is it statistically different between the two groups of patients?

Line 203: Authors should report the ICC results transparently.

Line 205: Usually, the model parameters have to be reported in order to make the analysis repeatable.

Line 254: The authors stated that only 6/169 patients were acquired with an old ultrasound system: why did they not exclude these patients from the analysis?

Final suggestion

Although the study is of interest, some important limitations affect this research; therefore, I do not recommend the manuscript for publication.

Author Response

Comments and Suggestions for Authors

General Comment:

The study is of interest; however, it is affected by some limitations, as the authors also point out:

The enrollment time is not the same for patients with and without metastases. This also leads to the inclusion of images acquired with different ultrasound system technologies, which may yield unreliable results.

>> Thank you for your comment. Generally, the number of FTCs with distant metastasis was much smaller than the number of FTCs without distant metastasis. For statistical analysis, FTCs with distant metastasis included all data from our institution, and FTCs without metastasis were limited to a period of four consecutive years.

US imaging is one of the most operator-dependent techniques, and the RFs are highly affected by acquisition settings; therefore, assessing the agreement between RFs extracted from images acquired by different operators is mandatory.

>> We agree with you! The agreement in radiomics features is an important issue, which was evaluated with the ICC. However, the previous description of the ICC was insufficient, so we expanded the section on the ICC. A full list of ICC values for all 60 radiomics features is reported in the Supplement as Table S1 (repeated below for your review). The ICC values of the six selected features are mentioned in the main text. We expanded the Methods and Results sections.

Supplementary Table S1. Intraclass correlation coefficient (ICC) of each radiomics feature

Radiomics features (n=60)	ICC	Radiomics features (n=60)	ICC
Firstorder_90Percentile	0.9850	GLCM_IDM	0.9379
Firstorder_Energy	0.9822	GLCM_IDMN	0.9173
Firstorder_Entropy	0.9036	GLCM_IDN	0.9295
Firstorder_InterquartileRange	0.9704	GLCM_IMC1	0.9691
Firstorder_Kurtosis	0.9504	GLCM_IMC2	0.9741
Firstorder_Maximum	0.8838	GLCM_InverseVariance	0.9755
Firstorder_MeanAbsoluteDeviation	0.9776	GLCM_JointAverage	0.9259
Firstorder_Mean	0.9912	GLCM_JointEnergy	0.9737
Firstorder_Median	0.9933	GLCM_JointEntropy	0.9152
Firstorder_Minimum	0.9939	GLCM_MCC	0.9755
Firstorder_Range	0.8792	GLCM_MaximumProbability	0.9736
Firstorder_RobustMeanAbsoluteDeviation	0.9771	GLCM_SumAverage	0.9259
Firstorder_RootMeanSquared	0.9888	GLCM_SumEntropy	0.9116
Firstorder_Skewness	0.9532	GLCM_SumSquares	0.8957
Firstorder_TotalEnergy	0.9822	GLSZM_GrayLevelNonUniformity	0.9732
Firstorder_Uniformity	0.9407	GLSZM_GrayLevelNonUniformityNormalized	0.8215
Firstorder_Variance	0.9705	GLSZM_GrayLevelVariance	0.8788
Shape_Elongation	0.9572	GLSZM_HighGrayLevelZoneEmphasis	0.8554
Shape_PerimeterSurfaceRatio	0.9746	GLSZM_LargeAreaEmphasis	0.9603
Shape_Sphericity	0.8062	GLSZM_LargeAreaHighGrayLevelEmphasis	0.8366
GLCM_Autocorrelation	0.8912	GLSZM_LargeAreaLowGrayLevelEmphasis	0.9889
GLCM_ClusterProminence	0.8465	GLSZM_LowGrayLevelZoneEmphasis	0.9625
GLCM_ClusterShade	0.9272	GLSZM_SizeZoneNonUniformity	0.9523
GLCM_ClusterTendency	0.8966	GLSZM_SizeZoneNonUniformityNormalized	0.9593
GLCM_Contrast	0.9174	GLSZM_SmallAreaEmphasis	0.9403
GLCM_Correlation	0.9759	GLSZM_SmallAreaHighGrayLevelEmphasis	0.8480
GLCM_DifferenceAverage	0.9285	GLSZM_SmallAreaLowGrayLevelEmphasis	0.9614
GLCM_DifferenceEntropy	0.9255	GLSZM_ZoneEntropy	0.9336
GLCM_DifferenceVariance	0.9148	GLSZM_ZonePercentage	0.9210
GLCM_ID	0.9411	GLSZM_ZoneVariance	0.9603

Note: GLCM=Gray Level Co-occurrence Matrix; ID=Inverse Difference; IDM=Inverse Difference Moment; IDMN=Inverse Difference Moment Normalized; IDN=Inverse Difference Normalized; IMC=Informational Measure of Correlation; MCC=Maximal Correlation Coefficient; GLSZM=Gray Level Size Zone Matrix; Features in bold denote selected features.
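The response does not say which form of the ICC was used. As a rough illustration of how one such value could be obtained for a single feature, the following sketch assumes a two-way random-effects, absolute-agreement, single-measurement ICC, often written ICC(2,1). The rating matrix is illustrative only and is not data from the study.

```matlab
% Illustrative ratings only (not data from the study):
% rows = nodules, columns = operators measuring the same radiomics feature.
M = [ 9 2 5 8
      6 1 3 2
      8 4 6 8
      7 1 2 6
     10 5 6 9
      6 2 4 7];

[n, k] = size(M);              % n nodules, k operators
grandMean = mean(M(:));
rowMeans  = mean(M, 2);        % per-nodule means
colMeans  = mean(M, 1);        % per-operator means

% Two-way ANOVA sums of squares
SSR = k * sum((rowMeans - grandMean).^2);   % between nodules
SSC = n * sum((colMeans - grandMean).^2);   % between operators
SST = sum((M(:) - grandMean).^2);           % total
SSE = SST - SSR - SSC;                      % residual

% Mean squares
MSR = SSR / (n - 1);
MSC = SSC / (k - 1);
MSE = SSE / ((n - 1) * (k - 1));

% ICC(2,1): assumed form, the response does not state which ICC was used
icc = (MSR - MSE) / (MSR + (k - 1) * MSE + k * (MSC - MSE) / n);
fprintf('ICC(2,1) = %.2f\n', icc);
```

For this toy matrix the value is about 0.29, far below the values in Table S1, because the four illustrative operators disagree systematically. Identical columns would give an ICC of 1.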

It has been demonstrated that RFs can be correlated with other clinical parameters; since the authors did not perform any correlation analysis, it cannot be excluded that the radiomics score was a surrogate of other clinical variables.

>> Thank you for your comment. Yes, the radiomics features could be correlated with other important clinical variables. Following your comment, we correlated the six important radiomics features common to all five folds with the important clinical variables that were available before surgery (i.e., tumor size, nodule-in-nodule appearance, rim calcification, and echogenicity). The correlation analyses showed that the selected radiomics features had either a weak correlation (r < 0.2) or a high p-value (p > 0.1) (Supplementary Table S2). This suggests that the selected radiomics features were not surrogates of other important clinical variables. The Results section was revised.

Supplementary Table S2. Correlation between selected radiomics features and important clinical variables. Each cell gives the r-value followed by the p-value, in the format r-value (p-value).

Variables	Tumor Size	Echogenicity	Rim Calcification	Nodule-in-nodule appearance
Minimum	-0.034(0.661)	0.661(0.602)	-0.008(0.923)	0.072(0.351)
Elongation	0.055(0.474)	0.474(0.682)	0.010(0.899)	-0.016(0.833)
Sphericity	0.178(0.021)	0.021(0.514)	0.124(0.109)	-0.012(0.880)
Gray level non-uniformity normalized	-0.022(0.780)	0.780(0.987)	-0.036(0.641)	0.010(0.901)
Size zone nonuniformity	0.070(0.367)	0.367(0.376)	-0.025(0.747)	-0.047(0.541)
Small area low gray-level emphasis	-0.071(0.359)	0.359(0.962)	-0.003(0.969)	-0.069(0.372)
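The response does not specify the correlation coefficient used. A minimal sketch of the screening rule described above, assuming a Pearson correlation and using hypothetical values rather than study data, could look like this:

```matlab
% Hypothetical values for six nodules (not data from the study).
feature  = [1 2 3 4 5 6]';   % one radiomics feature
clinical = [2 1 4 3 6 5]';   % one clinical variable, e.g. tumor size

% corrcoef returns 2-by-2 matrices of coefficients and p-values
[R, P] = corrcoef(feature, clinical);
r = R(1, 2);
p = P(1, 2);

% Criterion quoted in the response: weak correlation or high p-value.
% Using abs(r) is an assumption; the response writes "r < 0.2".
unlikelySurrogate = (abs(r) < 0.2) || (p > 0.1);
fprintf('r = %.3f, p = %.3f, unlikely surrogate: %d\n', r, p, unlikelySurrogate);
```

In this toy case the two variables are strongly correlated, so the feature would not pass the screen. For binary variables such as rim calcification, a Pearson coefficient computed this way is the point-biserial correlation.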

The use of an SVM classifier for this sample size is not justified, and the risk of overfitting is very high.

>> Thank you for your comment. Since our study is a single-center one, we adopted five-fold cross-validation, strictly separating training and test data, to reduce the risk of overfitting. We used a different set of features in each fold (on average 10 features across folds) from 169 samples (134 FTCs without metastasis and 35 FTCs with metastasis) for the classifier. Many machine learning studies have applied the SVM classifier where the minority class had fewer than 30 samples for 10 or more features [3-5]. Thus, we believe the SVM classifier could be applied in our study. A theoretical study pointed out that if the features follow a multivariate Gaussian distribution, the number of samples per class needed to apply SVM effectively should be greater than three times the number of features [6]. In our case, the cutoff was 30 (= 10 × 3) on average, because the number of selected features differed across cross-validation folds. Thus, we have a theoretical rationale for applying SVM to our samples. The Methods and Discussion sections were revised.
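The arithmetic behind the quoted rule of thumb can be written out as a short check. The variable names are illustrative, and the counts are the whole-cohort figures quoted in the response.

```matlab
% Figures quoted in the response
nFeatures = 10;          % average number of selected features per fold
nPerClass = [134 35];    % FTC without / with distant metastasis
factor    = 3;           % samples per class should exceed 3 x features [6]

cutoff    = factor * nFeatures;        % 3 * 10 = 30
meetsRule = all(nPerClass > cutoff);   % both classes must exceed the cutoff
fprintf('cutoff = %d, rule satisfied: %d\n', cutoff, meetsRule);
```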

The authors did not provide any information about the calculated radiomics score or the full details of the predictive model, which does not allow the study to be replicated.

>> Sorry for the omission. In this revision, we provide the full details of how our radiomics score was computed. The SVM classifier with a linear kernel was constructed using the selected features. The parameters of the SVM were the weights that determine the decision hyperplane separating the two groups (i.e., FTC with and without metastasis). A signed distance from the features of a given sample to the hyperplane was computed and then transformed using the sigmoid function to yield a probability value. The output probability value was assigned as the radiomics signature. The whole procedure was performed with the MATLAB command “fitcsvm” with the uniform prior option. The text in the Methods section was revised. The computer code used in this study, including the classifier, is provided in the Supplementary Material for future replication studies.
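The score computation described above can be sketched in plain arithmetic. The weights, bias, and feature values below are hypothetical, and the standard logistic function stands in for the sigmoid, whose parameters the response does not give. In MATLAB's Statistics and Machine Learning Toolbox, a score-to-posterior transform of this kind is fitted with `fitPosterior`; the sketch avoids the toolbox so that it is self-contained.

```matlab
% Hypothetical linear-SVM parameters and one z-scored sample
% (illustrative values, not taken from the study).
w = [0.8; -0.5; 0.3];    % weights defining the decision hyperplane
b = -0.2;                % bias term
x = [1.2; 0.4; -0.7];    % selected features of one nodule

f = w' * x + b;                  % decision value
signedDistance = f / norm(w);    % signed distance to the hyperplane

% Assumption: a standard logistic sigmoid maps the decision value to (0, 1)
radiomicsScore = 1 / (1 + exp(-f));
fprintf('f = %.2f, distance = %.3f, score = %.3f\n', f, signedDistance, radiomicsScore);
```

A positive decision value places the sample on the metastasis side of the hyperplane and yields a score above 0.5; a negative value yields a score below 0.5.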

Specific Comment:

Line 74: Why did the authors decide to reduce the enrollment time for patients without distant metastases?

>> Thank you for your comment. The number of FTCs with distant metastasis was much smaller than the number of FTCs without distant metastasis in our institution. As you know, case-control studies are usually advised to include no more than four or five controls per case, because little statistical power is gained by further increasing this ratio. Therefore, we limited the enrollment period of FTCs without distant metastasis to five years to obtain statistical power.

The Limitations paragraph was updated: “The enrollment period was not the same for patients with and without metastases. In fact, the number of FTCs with distant metastasis was much smaller than the number of FTCs without distant metastasis in our institutional database. Because no more than four or five controls per case are recommended for case-control studies to maintain statistical power, we limited the enrollment period of FTCs without distant metastasis to five years, while FTCs with distant metastasis had a longer enrollment period.”

Line 88: Authors should compare the radiomics features extracted from images acquired by the different operators.

>> It is shown in the response above (Response #2).

Line 148: Usually univariate logistic regression is performed before the multivariate one: why did the authors do a correlation analysis?

>> Univariate analysis gives a simplified marginal description of the relationship between each predictor and the dependent variable, and it is often used right before multivariate analysis. Multivariate analysis gives the complete picture of the variables and is thus more important than the results of univariate analysis. There are many choices for univariate analysis, and we chose variants of correlation depending on the type of predictor. As you pointed out, since we adopted logistic regression as the multivariate analysis, the natural univariate analysis would be univariate logistic regression. Still, all univariate analyses evaluate a simple relationship between a predictor and the dependent variable and are thus similar. Following your comments, we performed univariate analysis using univariate logistic regression and confirmed that the significant results were the same as in Table 4.

Line 186: Is the radiomics score a linear combination of the six LASSO selected features? Is it statistically different between the two groups of patients?

>> As shown in the response above (Response #5), the radiomics signature is a probability output of the SVM classifier using the selected features. The radiomics score was designed to discriminate between the two groups (i.e., FTC with and without metastasis), and thus there should be a statistically significant difference in radiomics signature between the two groups. This was confirmed with a two-sample t-test with a p-value less than 0.0001. The Results section was updated.

Line 203: Authors should report the ICC results transparently.

>> As shown in the response above (Response #2), we provided full details of the ICC.

Line 205: Usually, the model parameters have to be reported in order to make the analysis repeatable.

>> As shown in the response above (Response #3), we provided full details of the model parameters.

Line 254: The authors stated that only 6/169 patients were acquired with an old ultrasound system: why did they not exclude these patients from the analysis?

>> Thank you for your question. Images from six patients were obtained with the old US system, but there was no problem in image analysis and the image quality was not poor. Moreover, excluding these patients could introduce selection bias.

Supplement X

SVM code

The code is available at https://github.com/skkuej/thyroid_SVM/blob/master/lasso_svm.mat.

%% data load, cross validation, z-normalization
% Assumes the matrix `data` is already loaded in the workspace.
X = data(:,3:62);   % radiomics features (n=60)
y = data(:,2);      % binary metastasis label (n=169)
normalized_features = zscore(X(:,1:60));
foldMax = 5;
cvNum = 1;
c = cvpartition(y,'kfold',foldMax);
while cvNum <= foldMax
    trainingFeature = normalized_features(c.training(cvNum),:);
    testFeature = normalized_features(c.test(cvNum),:);
    trainingLabel = y(c.training(cvNum));
    testLabel = y(c.test(cvNum));

    %% Feature selection - Lasso
    % cvglmnet comes from the glmnet package for MATLAB, not from a MathWorks toolbox.
    lasso = cvglmnet(trainingFeature, trainingLabel, 'binomial');
    % Keep the features with nonzero coefficients at lambda_min.
    s(cvNum).selected_features = find(lasso.glmnet_fit.beta(:,(lasso.lambda == lasso.lambda_min)));
    trainingFeature = trainingFeature(:,s(cvNum).selected_features);
    testFeature = testFeature(:,s(cvNum).selected_features);

    %% SVM
    svmMdl = fitcsvm(trainingFeature,trainingLabel,'Prior','uniform');
    [labelHatTr, scoreTr] = svmMdl.predict(trainingFeature);
    [labelHatTs, scoreTs] = svmMdl.predict(testFeature);
    radiomics_score(cvNum).score = scoreTs;

    %% model evaluation
    % Column 2 of the score matrix corresponds to the positive class (label 1).
    [Xtr,Ytr,Ttr,AUCtr(cvNum)] = perfcurve(trainingLabel,scoreTr(:,2),1);
    [Xts,Yts,Tts,AUCts(cvNum)] = perfcurve(testLabel,scoreTs(:,2),1);

    % Confusion matrix rows are true classes, columns are predicted classes.
    conMat_train = confusionmat(trainingLabel, labelHatTr);
    ACC_train(cvNum) = (conMat_train(1,1)+conMat_train(2,2))/sum(conMat_train(:));
    SENS_train(cvNum) = conMat_train(2,2)/(conMat_train(2,1)+conMat_train(2,2));
    SPEC_train(cvNum) = conMat_train(1,1)/(conMat_train(1,1)+conMat_train(1,2));

    conMat_test = confusionmat(testLabel, labelHatTs);
    ACC_test(cvNum) = (conMat_test(1,1)+conMat_test(2,2))/sum(conMat_test(:));
    SENS_test(cvNum) = conMat_test(2,2)/(conMat_test(2,1)+conMat_test(2,2));
    SPEC_test(cvNum) = conMat_test(1,1)/(conMat_test(1,1)+conMat_test(1,2));

    cvNum = cvNum + 1;
end

Round 2

Reviewer 2 Report

The authors have answered all my questions and made the necessary changes to the manuscript.
