Article
Peer-Review Record

Simple but Effective Knowledge-Based Query Reformulations for Precision Medicine Retrieval

Information 2021, 12(10), 402; https://doi.org/10.3390/info12100402
by Stefano Marchesin 1, Giorgio Maria Di Nunzio 1,2,* and Maristella Agosti 1
Reviewer 1: Anonymous
Reviewer 2:
Submission received: 29 August 2021 / Revised: 24 September 2021 / Accepted: 24 September 2021 / Published: 29 September 2021
(This article belongs to the Section Information Systems)

Round 1

Reviewer 1 Report

In this paper, the authors present a review of their previous work around the TREC PM competition. The work is valuable because it brings together several dispersed works describing different improvements for the TREC PM tasks.

However, the paper needs some improvements, mainly in its organization. At present, the paper looks too long. Some parts are repeated throughout the paper. For example, the explanation of the three performed studies is given in the introduction, in the methods and in the conclusions. Perhaps the authors should describe the overall approach in a separate section before the section "Resources". In this section, the authors must also introduce the main concepts behind the paper, namely: query expansion and reduction (these concepts are now scattered across the whole paper). Conclusions must be rewritten to focus on the main achievements, limitations and future work (not as a summary of the previous sections). Finally, the introduction must emphasize the contribution and goals of the paper.

As a minor comment, Table 1 needs to be somehow simplified in order to fit it in the page width.

Author Response

We thank the reviewer and reply below to the points raised in the review.

(R) In this paper, the authors present a review of their previous work around the TREC PM competition. The work is valuable because it brings together several dispersed works describing different improvements for the TREC PM tasks.

(A) Thank you.

 

(R) However, the paper needs some improvements, mainly in its organization. At present, the paper looks too long. Some parts are repeated throughout the paper. For example, the explanation of the three performed studies is given in the introduction, in the methods and in the conclusions.

(A) Thank you for the suggestion. We revised the paper to make these changes. In particular, we kept the explanation of the three performed studies only in the sections dedicated to such studies. On the other hand, in the introduction we briefly mention them, whereas in the conclusions we only highlight the main outcomes and findings related to them.

 

(R) Perhaps the authors should describe the overall approach in a separate section before the section "Resources". In this section, the authors must also introduce the main concepts behind the paper, namely: query expansion and reduction (these concepts are now scattered across the whole paper).

(A) We inserted a new section “Methodology” to describe the overall approach. In this section, we described all the different steps used to perform the experiments throughout the paper – that is, indexing, pre-retrieval query reformulation, retrieval, post-retrieval query reformulation, filtering, and rank fusion. In particular, we also introduced the main concepts behind the paper, such as query expansion and reduction. Thus, we removed from all the experimental sections (i.e., Sections 4, 5, and 6) the ‘Methodology’ subsection and we reported in the ‘Experimental Procedure’ subsection the instantiation of (part of) the methodology considered for that particular study or analysis.

 

(R) Conclusions must be rewritten to focus on the main achievements, limitations and future work (not as a summary of the previous sections).

(A) We rewrote the conclusions to emphasize the results and findings obtained through the different studies and analyses. We also added a possible line of research for future work.

 

(R) Finally, the introduction must emphasize the contribution and goals of the paper.

(A) We revised the introduction to better emphasize the contributions and the objectives of the paper. Now, also following Reviewer 2’s suggestions, it presents the problem, the objectives, and the contributions in a more linear and understandable way.

 

(R) As a minor comment, Table 1 needs to be somehow simplified in order to fit it in the page width.

(A) We fixed it. Thank you.

Reviewer 2 Report

I have learned a lot about query reformulations and about their relevance to information retrieval in medical research. It was particularly interesting to see that the results correspond to the experiences I have from my information searches relating to my research in medicine and related fields. My comments to the authors are as follows:

  1. The manuscript is a good introduction into knowledge-based information retrieval, but I think we need to see more about how it can be used in medicine and health research. What does it add to a PhD student’s research work?
  2. The manuscript looks more like a textbook chapter than a research article. Even the title suggests that. The manuscript would benefit from revisions that would improve the flow of the text and follow more closely the structure of a traditional research article presenting new findings.
  3. Page 1, lines 17-21: Consider expanding on the list of users with medical researchers. I am convinced that they are the largest group of users searching medical information. They search references for their future articles.
  4. Page 2, lines 40-48: Medical researchers and students put it more bluntly: "The search produced too many articles to read. Which of these are the most relevant online? Do I have to read all 100 articles?" Your article tries to solve this issue. Help your medical readers and consider phrasing your research question like this.
  5. Introduction: The manuscript would benefit from a revision to improve the clarity of the introduction section. Do not describe the TREC PM track in the introduction section (page 2, lines 53-66). It is enough to describe it in the methodology. Do not list your findings in the introduction (page 3, lines 96-112). The purpose of the introduction is to introduce the research question and lead readers to it. Keep the introduction at a general level; do not go into details.
  6. Page 6, chapter 3. Related works: Consider removing this chapter. It interrupts the story of your article. This is not a review article. You could briefly cite previous studies of knowledge-enhanced expansions in the introduction section and describe the TREC track used in your study later in the next chapter.
  7. Starting from page 10: When you are referring to your methodology, results or what you did during the research, it is common to use the past verb tense. Please write “we employed”, “we indexed”, “we performed”, and so on.
  8. Page 11, formula (1): Consider stating that the values of parameters k1 and b will be defined later in the experimental procedure chapter. In addition, help your readers and define the scale for the measure score(q, d): minimum, maximum, what does a low value mean, what does a high value mean.
  9. Page 11, line 480: Define abbreviation PRF here.
  10. Please check the numbering of tables. You now have tables with the same table number (Table 1, Table 2).
  11. Table 1 (page 13 and page 20) and Table 2 (page 22 and page 27): Tables should be able to stand alone. That is, all information necessary for interpretation should be included within the table, legend or footnotes. This means that measures, abbreviations and modeling methods used are named. Many readers will skim an article before reading it closely and identifying data analysis methods in the tables or figure will allow the readers to understand the procedures immediately. Consider providing all details in the table title or footnote, and not referring to the text.
  12. Chapter 5. In-Depth analysis of query reformulations: Please consider further clarification or provide definitions of weighting schemes.  
  13. Page 28, lines 937-939: I am not sure if this is correct interpretation of clinical trials. In my opinion, clinical trials have strict excluding and including criteria.
  14. Figures 1-3 and page 13: Please define “topic”. Do topics refer to patients or their treatment methods?

 

Author Response

We thank the reviewer for the rich review and detailed suggestions.

We reply to the comments below.

(R) I have learned a lot about query reformulations and about their relevance to information retrieval in medical research. It was particularly interesting to see that the results correspond to the experiences I have from my information searches relating to my research in medicine and related fields.

(A) We are glad to see that our results align with the experience of an expert in medicine. Thank you for pointing this out.

 

(R) My comments to the authors are as follows:

The manuscript is a good introduction into knowledge-based information retrieval, but I think we need to see more about how it can be used in medicine and health research. What does it add to a PhD student’s research work?

(A) Following your and Reviewer 1’s thoughtful suggestions, we revised the paper to better emphasize objectives, contributions, methodology, and findings. In this way, the contributions, the methodology, and the findings can be easily identified within the paper – making it more valuable for readers who need to use the proposed techniques, or want to investigate/validate their assumptions/experiences in the task. In particular, we introduced medical researchers as target users and we clarified the utility of the work in medicine and health research.

 

(R) The manuscript looks more like a textbook chapter than a research article. Even the title suggests that. The manuscript would benefit from revisions that would improve the flow of the text and follow more closely the structure of a traditional research article presenting new findings.

(A) Thank you for your suggestion. We changed the title to make it more research-oriented. Moreover, combining your and Reviewer 1’s observations, we revised the entire paper to improve the flow of text and to make the structure of the article more like a research paper than a textbook. In particular, we introduced a ‘Methodology’ section where we described the different steps used to perform the experiments throughout the paper. Then, for each experimental section (i.e., Sections 4, 5, and 6), we provided in the ‘Experimental Procedure’ subsection information about the parts of the methodology we instantiated in the given experiment. Besides, the ‘Introduction’ and ‘Conclusions’ are now more research-like. To this end, we introduced the problem, the motivations, and the contributions in the introduction, whereas we presented the main findings and outcomes, as well as some future work, in the conclusion.

 

(R) Page 1, lines 17-21: Consider expanding on the list of users with medical researchers. I am convinced that they are the largest group of users searching medical information. They search references for their future articles.

(A) We expanded the list of users with medical researchers. Now the paragraph is as follows:

“Searching for medical information is an activity of interest for many users with different levels of medical expertise. For example, a patient with a recently diagnosed condition would generally benefit from introductory information about the treatment of the disease, a trained physician would instead expect more detailed information when deciding the course of treatment, and a medical researcher would require references and literature relevant to their future research articles [1].”

 

(R) Page 2, lines 40-48: Medical researchers and students put it more bluntly: "The search produced too many articles to read. Which of these are the most relevant online? Do I have to read all 100 articles?" Your article tries to solve this issue. Help your medical readers and consider phrasing your research question like this.

(A) We revised and expanded that paragraph to make it clearer for medical readers. The expanded part at the end of the revised paragraph is as follows:

“From a user’s perspective, the semantic gap prevents users from easily finding the most relevant documents for their information need. In a medical scenario, the semantic gap leads physicians and medical researchers to read through a large amount of literature before finding the most relevant articles for their needs. Thus, given the limited time of physicians and medical researchers, it becomes even more significant to reduce this gap in the medical domain.”

 

(R) Introduction: The manuscript would benefit from a revision to improve the clarity of the introduction section. Do not describe the TREC PM track in the introduction section (page 2, lines 53-66). It is enough to describe it in the methodology. Do not list your findings in the introduction (page 3, lines 96-112). The purpose of the introduction is to introduce the research question and lead readers to it. Keep the introduction at a general level; do not go into details.

(A) Thank you for your suggestion. Following your and Reviewer 1’s comments, we revised the introduction to make it clearer. In particular, we moved the description of the TREC PM track to the “Resources” section and we moved the listing of our findings to the conclusions. In this way, readers are guided to them throughout the paper and can find a summary of the main outcomes and findings at the end of the paper. Overall, we made the introduction more general.

 

(R) Page 6, chapter 3. Related works: Consider removing this chapter. It interrupts the story of your article. This is not a review article. You could briefly cite previous studies of knowledge-enhanced expansions in the introduction section and describe the TREC track used in your study later in the next chapter.

(A) As you suggested, we removed the related works chapter and we briefly reported some of the most relevant works in the introduction. Regarding the description of the TREC Track used in our work, we put it in the “Resources” section and we referred to it in the subsequent sections.

 

(R) Starting from page 10: When you are referring to your methodology, results or what you did during the research, it is common to use the past verb tense. Please write “we employed”, “we indexed”, “we performed”, and so on.

(A) We fixed it, thank you.

 

(R) Page 11, formula (1): Consider stating that the values of parameters k1 and b will be defined later in the experimental procedure chapter.

(A) We added a footnote to avoid interrupting the flow of the description.

 

(R) In addition, help your readers and define the scale for the measure score(q, d): minimum, maximum, what does a low value mean, what does a high value mean.

(A) Thank you for this suggestion. We added a sentence to explain how to interpret the score and what its minimum is.
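
Formula (1) is not reproduced in this record, so the following is only a minimal sketch of how such a score can be read, assuming the widely used BM25 form with a non-negative IDF. The function name bm25_score, the toy documents, and the values k1 = 1.2 and b = 0.75 are illustrative assumptions, not the formula or settings used in the paper. Under these assumptions the minimum is 0 (no query term occurs in the document), there is no fixed maximum, and a higher score means a stronger match.

```python
import math
from collections import Counter


def bm25_score(query_terms, doc_terms, doc_freqs, n_docs, avg_doc_len,
               k1=1.2, b=0.75):
    """BM25-style score(q, d); k1 and b are assumed default values."""
    tf = Counter(doc_terms)
    doc_len = len(doc_terms)
    score = 0.0
    for term in query_terms:
        if tf[term] == 0:
            continue  # a term absent from the document contributes nothing
        df = doc_freqs.get(term, 0)
        # Non-negative IDF variant: rarer terms receive a larger weight.
        idf = math.log(1.0 + (n_docs - df + 0.5) / (df + 0.5))
        # k1 controls term-frequency saturation, b the length normalization.
        norm = tf[term] + k1 * (1.0 - b + b * doc_len / avg_doc_len)
        score += idf * tf[term] * (k1 + 1.0) / norm
    return score


# Toy collection (hypothetical documents, for illustration only).
docs = [
    ["braf", "melanoma", "treatment"],
    ["lung", "cancer", "trial"],
]
doc_freqs = Counter(term for doc in docs for term in set(doc))
avg_doc_len = sum(len(doc) for doc in docs) / len(docs)
query = ["braf", "melanoma"]

scores = [bm25_score(query, doc, doc_freqs, len(docs), avg_doc_len)
          for doc in docs]
print(scores)  # the first document matches the query, the second does not
```

Because the score is an unbounded sum over matching query terms, it is meaningful only for ranking documents against the same query, not as an absolute measure of relevance.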

 

(R) Page 11, line 480: Define abbreviation PRF here.

(A) Fixed, thank you.

 

(R) Please check the numbering of tables. You now have tables with the same table number (Table 1, Table 2).

(A) Fixed, thank you.

 

(R) Table 1 (page 13 and page 20) and Table 2 (page 22 and page 27): Tables should be able to stand alone. That is, all information necessary for interpretation should be included within the table, legend or footnotes. This means that measures, abbreviations and modeling methods used are named. Many readers will skim an article before reading it closely and identifying data analysis methods in the tables or figure will allow the readers to understand the procedures immediately. Consider providing all details in the table title or footnote, and not referring to the text.

(A) Thank you for this suggestion. We added an explanation of all the features and entries of the table and an expansion of the acronyms for all the tables.

 

(R) Chapter 5. In-Depth analysis of query reformulations: Please consider further clarification or provide definitions of weighting schemes.  

(A) We further clarified this concept in the newly introduced ‘Methodology’ section. The paragraph explaining the weighting of expansion terms is now as follows:

“Furthermore, we weighted the terms added in the expansion step with either a value of 1.0 or – to limit noise injection in the retrieval process [42] – with a value of 0.1. In other words, we assigned a weight to expansion terms so that the impact of such terms when computing the score between the query q and the document d is either equal to the impact of terms in the original query or scaled by a factor of 0.1.”
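
The effect of the two weights can be illustrated with a minimal sketch. It assumes that the query-document score is a weighted sum of per-term contributions and uses raw term frequency as a stand-in for the retrieval model's per-term score; the function names and the example terms are hypothetical and do not come from the paper.

```python
from collections import Counter


def weighted_query(original_terms, expansion_terms, expansion_weight):
    """Map each query term to its weight; original terms keep weight 1.0."""
    weights = {term: 1.0 for term in original_terms}
    for term in expansion_terms:
        # An expansion term never overrides the weight of an original term.
        weights.setdefault(term, expansion_weight)
    return weights


def score(weights, doc_terms):
    """Weighted sum of per-term contributions (term frequency as a stand-in)."""
    tf = Counter(doc_terms)
    return sum(weight * tf[term] for term, weight in weights.items())


# Hypothetical query, expansion terms, and document.
original = ["melanoma"]
expansion = ["skin", "cancer"]
doc = ["melanoma", "cancer", "cancer"]

full = score(weighted_query(original, expansion, 1.0), doc)
damped = score(weighted_query(original, expansion, 0.1), doc)
print(full, damped)
```

With a weight of 1.0 the two occurrences of the expansion term "cancer" outweigh the single match on the original term, whereas with 0.1 the original term dominates, which is the noise-limiting behaviour the paragraph describes.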

 

(R) Page 28, lines 937-939: I am not sure if this is correct interpretation of clinical trials. In my opinion, clinical trials have strict excluding and including criteria.

(A) You are completely right: we expressed the sentence in an unclear way. Our intuition is that there is a difference between the level of specificity of the information contained within clinical trials and queries. For instance, several clinical trials have (strict) inclusion/exclusion criteria that are related to “solid tumors” in general – and not to a specific type of solid tumor. On the other hand, queries always present a higher level of specificity about the disease – which leads to a retrieval mismatch unless we reformulate the query appropriately. Thus, we revised the sentence as follows:

“A possible reason is related to the different level of information contained within clinical trials and queries, since clinical trials often contain general requirements to allow patients to enroll – such as, the umbrella concept “solid tumor” – rather than specific concepts – e.g., tumor types – as in queries [40].”

 

(R) Figures 1-3 and page 13: Please define “topic”. Do topics refer to patients or their treatment methods? 

(A) We added to the captions of the figures (including the last figures) a short explanation of what a “topic” is in relation to these figures.
