Article
Peer-Review Record

Incorporating Covariates into Measures of Surrogate Paradox Risk

Stats 2023, 6(1), 322-344; https://doi.org/10.3390/stats6010020
by Fatema Shafie Khorassani, Jeremy M. G. Taylor, Niko Kaciroti and Michael R. Elliott *
Reviewer 1:
Reviewer 2: Anonymous
Submission received: 16 January 2023 / Revised: 13 February 2023 / Accepted: 14 February 2023 / Published: 17 February 2023

Round 1

Reviewer 1 Report (Previous Reviewer 1)

The authors have addressed my comments; I think it is a very useful paper.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report (New Reviewer)

Please find the PDF file "MDPI_Review" attached in the web portal.

Comments for author File: Comments.pdf

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report (New Reviewer)

The authors have been responsive to the earlier reviews, and I quite like the revised paper.

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

This paper proposes a useful approach to examine the possibility of a surrogate paradox as a function of a patient covariate. My comments are below.
1. I found Section 2.2 a little confusing because the header reads \Psi_SP123: Estimating the probability that the treatment effects for the outcome will be harmful given that the treatment effect on the marker is beneficial, but this probability is actually 1-\Psi_SP123, right? As defined in the equation right below?
2. It would be helpful to number the quadrants from the very beginning in Figure 1, as in label them INSIDE the figure, since you seem to refer to them by number.
3. I'm a bit confused about \Psi_SP13N, how is this different from \Psi_SP13 other than being N instead of N+1? It is somewhat helpful that you state this could be useful when you've collected the surrogate data but not yet the outcome data, but it's a little deflated. Can you motivate this measure better?
4. How is \Delta_S different from 0_Si? If all of your modeling is correct, are these equivalent?
5. Section 3, and more generally in the paper - I'd encourage you to motivate this more as not JUST an extension of the Elliott approach, but a significantly important tool on its own. If there are truly some subgroups of patients for which the surrogate paradox is more likely to occur, it is possible that the original Elliott approach would completely miss that. Your approach would pick that up, and that's a big deal and is especially compelling if you consider x to be gender or race/ethnicity. We know that mechanisms of treatment can be very heterogeneous, and your method is able to identify the possibility of something very bad happening for subgroups of patients. Calling it an extension that incorporates baseline covariates undersells it, in my opinion.
6. You state a few times that your method can be "easily" extended to higher dimensions of covariates, but I don't think that is true. Perhaps it would be easy to literally write out the model, but to actually get an estimate for a higher dimension seems difficult and potentially not feasible depending on your data.
7. You also seem to state that this is for a continuous x as well, but it seems all your examples have a dichotomous or categorical x. And indeed it's hard for me to picture what Figure 2, for example, would look like with a continuous x. If you can do continuous x, can you illustrate it? If you can't feasibly do it, I think it's ok to say this is for a single categorical x for now, that is still useful. And then perhaps mention the difficulties with continuous x and/or higher dimensions in your discussion.
8. Throughout, a lot of your derivations can be safely moved to the appendix. As just one example, the derivations on page 11 don't need every single middle step of ='s; just the final line is sufficient.
9. I'm a little confused about why you present 4 measures but in your simulations you only seem to assess 3 of them in Table 1.
10. In Table 1 please show bias explicitly.
1.1 It seems to be stated repeatedly that you assume positive is good and negative is bad; I think you can just say this once at the beginning.
11. Simulations - it's important that you include some simulations examining model misspecification. This entire approach relies completely on correct specification of some complex models which are, quite frankly, unlikely to ever hold in practice. We need to know how robust this is to slight and severe model misspecification.
12. Your simulations are done with 100 studies, I believe - I'm not sure I know of any meta-analysis of a surrogate marker that has 100 studies. Your first example is 1 study where you use the 14 centers as different studies. 100 and 14 are very different! Your second example has 1 study with 62 centers that you use, which is a little better, but the range of patients is 2-46. I strongly encourage you to show some simulation results for simulations that have sample sizes that actually mirror your examples. What do your simulations look like when you only have 14 trials? What do they look like when you have 62 trials that range from 2-42 patients?
13. The outcome for the second example is binary - how does that work if everything you have specified is bivariate normal?
14. The 4th measure that involves a little s does not seem to be examined in either example - why not? Can you include it?
15. Section 5 - testing, can you specify exactly what the test is? Also this doesn't seem to be examined in the simulations or examples, why not? I would encourage you to either include it in your numerical studies, or make it a discussion point instead of its own section.
16. In the simulations and examples you note some problems with the REML estimation. Perhaps you should just propose the Bayesian approach for estimation and put the REML estimation in the appendix, and say outright that you generally wouldn't expect it to work well for XX reasons, and so you focus only on the Bayesian approach throughout.
17. It would be helpful to comment on the availability (or not) of the datasets from the 2 examples.
18. Please include a reference for code/software/R package to implement your method. I think your method could be extremely valuable in practice but truth be told, if you don't have user-friendly code, no one will use it.
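
For readers of this record, the quantity at issue in comment 1 (the probability that the treatment effect on the outcome is harmful given that the effect on the marker is beneficial) can be illustrated with a minimal Monte Carlo sketch. This is not the authors' estimator and does not reproduce any \Psi measure from the paper. It assumes, purely for illustration, that the trial-level effects (delta_s, delta_t) are bivariate normal with known parameters and that positive means beneficial; the function names and all parameter values are hypothetical.

```python
import math
import random


def paradox_probability(mu_s, mu_t, sd_s, sd_t, rho, n_draws=200_000, seed=0):
    """Monte Carlo estimate of P(delta_t < 0 | delta_s > 0).

    Illustrative assumption: (delta_s, delta_t) is bivariate normal with
    means (mu_s, mu_t), standard deviations (sd_s, sd_t), correlation rho.
    Positive effects are taken to be beneficial.
    """
    rng = random.Random(seed)
    beneficial = 0  # draws with a beneficial effect on the surrogate
    paradox = 0     # of those, draws with a harmful effect on the outcome
    for _ in range(n_draws):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        delta_s = mu_s + sd_s * z1
        # Mix z1 and z2 so that corr(delta_s, delta_t) equals rho.
        delta_t = mu_t + sd_t * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        if delta_s > 0:
            beneficial += 1
            if delta_t < 0:
                paradox += 1
    return paradox / beneficial if beneficial else float("nan")


def zero_mean_closed_form(rho):
    """Exact P(delta_t < 0 | delta_s > 0) when both means are zero.

    Follows from the bivariate normal orthant probability
    P(X > 0, Y > 0) = 1/4 + arcsin(rho) / (2 * pi).
    """
    return 0.5 - math.asin(rho) / math.pi


if __name__ == "__main__":
    # Hypothetical values: zero means, unit variances, correlation 0.5.
    print(paradox_probability(0.0, 0.0, 1.0, 1.0, 0.5))
    print(zero_mean_closed_form(0.5))
```

With zero means the simulated value can be checked against the closed form, which gives a quick sanity test before trying non-zero means or other correlations.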

Reviewer 2 Report

This paper is very well written and addresses an important topic. The introduction outlines the problem in a digestible way and provides the relevant theory to understand the surrogate paradox.

I have only two minor comments. It would be helpful if each of the terms in Figure 1 were defined upon its introduction. There is a very good and detailed explanation of each of the psi's but I did not see them until after spending some time trying to figure them out for myself. An option would be to state the overall definition and refer the reader to the below details.

Please be consistent in referencing the multivariate normal distribution. Rather than use MVN, I recommend staying with N in Section 2.3.

 
