Data Descriptor

Dataset of Search Results Organized as Learning Paths Recommended by Experts to Support Search as Learning

Verónica Proaño-Ríos 1,2,* and Roberto González-Ibáñez
1 Departamento de Ingeniería Informática, Universidad de Santiago de Chile, Avenida Ecuador #3659, Santiago 9170124, Chile
2 Departamento de Ciencias Exactas, Universidad de las Fuerzas Armadas-ESPE, Av. General Rumiñahui s/n, Sangolquí 171103, Ecuador
* Author to whom correspondence should be addressed.
Submission received: 5 August 2020 / Revised: 16 September 2020 / Accepted: 18 September 2020 / Published: 27 September 2020
(This article belongs to the Special Issue Big Data and E-learning)


In this article, we introduce a dataset of curated learning paths (LPs) to support search as learning. LPs were obtained through an online survey delivered to experts in different domains. Data were then analyzed and described in terms of a set of variables. The resulting dataset comprised 83 LPs, each containing three web pages, for an overall collection consisting of 249 documents. The dataset is intended to provide information scientists, education researchers, and industry professionals, who provide information services in educational contexts, a valuable resource to (i) investigate patterns in the order of LPs, (ii) improve ranking models and/or re-ranking methods, (iii) explain the structure of the recommended LPs, and (iv) investigate alternative approaches to display search results based on the features of LPs.
Dataset: The dataset is available on the Mendeley Data repository. Doi:
Dataset License: The license under which the dataset is made available is CC-BY 4.0.

1. Summary

This article describes a novel dataset comprising 83 learning paths (LPs) curated by experts. Each LP in the dataset includes a list of three sequential web pages, experts’ demographic information, experts’ judgment and reasoning for including the selected sources and their order, LP extension, and content type. The uses of the dataset include, but are not limited to, research and development in areas such as information science, education, information retrieval, linguistics, and industry. The dataset is available through the Mendeley Data repository.
The remaining sections of this article are structured as follows. First, Section 2 provides background information and rationale. Section 3 offers a detailed description of the files included in the dataset, definitions of variables, and data distribution. Then, Section 4 describes the methodology and instruments used to collect the data. Finally, Section 5 comprises conclusions and applications for future research. Additionally, the Spanish version of the questionnaire is provided as Supplementary Materials, and the Python script used to extract general features in the dataset is presented in Appendix A.

2. Background and Rationale

The concept of LPs, defined as a finite and organized sequence of learning objects (LOs), can be linked to Vannevar Bush’s notion of trails [1]. When trails are situated in learning contexts, a fundamental problem arises: curriculum sequencing, which consists of finding the sequence that maximizes learning outcomes and is considered NP-hard [2,3] (i.e., at least as hard as every problem in NP, so no polynomial-time algorithm is known for it).
More than 50 years after Bush’s influential piece, technology has had a profound impact on education. Different tools and resources have been developed, including learning management systems (LMS) (e.g., Moodle) and massive open online courses (MOOCs), among others. Although different in nature, a common characteristic of both is that the content they provide is the same for all users, regardless of their prior knowledge or learning style. To address this problem, some have focused on personalization within these types of platforms [4,5]; however, in spite of the efforts to pursue this goal, statistics indicate that 83% of students use search engines to meet their immediate information needs [6,7].
Current search technology is based on years of research and development on information retrieval and information science. Popular alternatives such as Google and Bing provide rapid access to a vast amount of content on the Web. Although there have been numerous advances in search systems, these still face major challenges when it comes to understanding searchers’ complex information needs [8]. Search technologies exploit a wide range of features in the retrieval and ranking process, the latter being critical in the phase of organizing results in terms of their relevancy to users’ needs. Unfortunately, most approaches mainly rely on topical relevance [9,10] with limited coverage of other manifestations of relevance such as cognitive, affective, and situational [11,12], which may be critical in learning scenarios.
Despite the active role of search technologies in educational settings, these were not designed to support complex learning processes [13,14]. This is evidenced by: (i) a mismatch between search engine results’ presentation style and how people learn under high levels of uncertainty [15,16]; (ii) students’ attitudes and behaviors toward information search [17,18]; and (iii) low levels of information literacy of students [19,20]. Thus, it is fundamental to investigate alternative solutions to enhance learning outcomes as part of online search (i.e., search as learning—SAL). Furthermore, this need becomes even more urgent as the current pandemic (COVID-19) evolves, since a large number of people worldwide are turning to online resources found through search engines to support their learning processes. In this case, it is fundamental to find suitable approaches (such as LPs) to better support learning in the context of searching for information on the Web.
Despite numerous efforts to build LPs, to the best of our knowledge, no research has been conducted to study the effects on learning outcomes of search results presented as LPs based on expert knowledge. Therefore, we saw the need to build a dataset of LPs that could be used by information scientists, education researchers, and industry professionals who offer services related to information seeking and retrieval for educational purposes. The dataset will allow them to conduct studies with numerous applications in textual data extraction. For example, in research contexts, the dataset could be used to extract features to: (i) identify patterns in the sorted LOs recommended by experts, (ii) improve ranking models and/or re-ranking algorithms to fulfill immediate learning needs, (iii) explain the order of documents within LPs, and (iv) investigate alternative approaches to display search results based on the features of LPs. In educational settings, the dataset could be used as a learning resource or to further investigate teaching and learning strategies for complex topics.
To build the dataset, we asked experts on specific topics and domains to provide a sequence of three web pages (mostly based on text) that can be used to guide the learning process of incoming college students who know little to nothing about a selected topic. We considered two criteria to classify an individual as an expert in a specific area: knowledge and length of experience. For this particular case, we considered any person with a bachelor’s degree or higher and at least one year of experience in a specific topic to be an expert. The survey was directed especially, but not exclusively, at professors and researchers. Because of the geographical location of the research group and the native language of its environment, the dataset mainly contains LOs in Spanish.
It is also worth noting that Spanish is the third most used language on the Internet [21].

3. Data Description

We invited several experts from Hispanic countries to participate in an online survey. We obtained answers from seven different countries in six domains (i.e., computer science, physics, finances, laws, biology, and industrial engineering), as shown in Figure 1.

3.1. Files

We provided two files in the repository:
  • A comma-separated value (CSV) file with all the data in Spanish. The file name was LP_dataset_spanish_version.csv.
  • A copy of the previous CSV file (LP_dataset_english_version.csv) with categorical data and variable names translated into English in order to facilitate analyses for English-speaking researchers.
In this article, we addressed the English version.
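As a minimal sketch of how the English version might be read, the following Python snippet parses CSV content with the standard library and returns one dictionary per LP. The file name and the column names (URL_1, URL_2, URL_3) come from this article; the comma delimiter, the UTF-8 encoding, the helper names load_lps and urls_in_order, and the sample row are assumptions for illustration and should be checked against the file in the repository.

```python
import csv
import io


def load_lps(stream):
    """Read LP records from an open text stream; returns a list of dicts."""
    return list(csv.DictReader(stream))


def urls_in_order(record):
    """Return the three recommended URLs of one LP in reading order."""
    return [record[f"URL_{position}"] for position in (1, 2, 3)]


# Tiny in-memory stand-in for the real file (hypothetical values).
sample = io.StringIO(
    "ID_LP,Domain,URL_1,URL_2,URL_3\n"
    "1,physics,https://example.org/a,https://example.org/b,https://example.org/c\n"
)
records = load_lps(sample)

# With the downloaded file, the same helper could be used as follows
# (encoding is an assumption):
# with open("LP_dataset_english_version.csv", newline="", encoding="utf-8") as f:
#     records = load_lps(f)
```

Keeping the parser independent of the file path makes it easy to try the same code on either language version of the dataset.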

3.2. Features

Table 1 describes the features available in the dataset. The features were organized as follows:
  • The first fourteen features corresponded to demographic information provided by survey respondents.
  • The following twelve variables described the LP, considering three sorted LOs and the selection criteria provided by the experts; Table 1 also lists a free-text Comments column after them.
  • The last three characteristics were general features that we extracted from the recommended LPs—to facilitate the classification process—which are described in the following section.

3.3. Data Distribution

The dataset consisted of 249 LOs organized in 83 LPs recommended by experts from Argentina, Chile, Colombia, Ecuador, Mexico, Spain, and Venezuela. Table 2 summarizes demographic data by domain, level of education, and sex. As shown in Figure 2, 81.93% of the experts belonged to higher education institutions and 18.07% belonged to research centers. In addition, 81.93% of the respondents were men and the remaining 18.07% were women, with their age distribution shown in Figure 3. Respondents were experts belonging to six different domains: biology, computer science, finances, industrial engineering, laws, and physics. Of these respondents, 63.85% had a doctoral degree, 19.28% had a master’s degree, and the remaining 16.87% had a bachelor’s degree. Figure 4 shows this distribution, identifying two groups: students and faculty. Finally, Figure 5 shows the distribution of the collected data in relation to the last three extracted characteristics: 91.57% of the documents were in Spanish (the remainder were in English), 89.16% were in text, and 67.47% were short.
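Shares like those reported above can be recomputed from any categorical column. The sketch below is one possible way to do so with the standard library; the function name percentage_distribution is hypothetical, and the counts of 68 men and 15 women are not stated in the article but are back-computed here from the reported percentages and the 83 LPs, so they should be treated as an assumption.

```python
from collections import Counter


def percentage_distribution(values):
    """Map each distinct value to its share of the total, in percent (2 decimals)."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: round(100 * count / total, 2) for value, count in counts.items()}


# Hypothetical column rebuilt from the reported shares (assumed counts: 68 and 15).
sex_column = ["Man"] * 68 + ["Woman"] * 15
shares = percentage_distribution(sex_column)
```

The same function can be applied to the Education, Document_language, Document_type, or LP_docs_extension columns once the CSV file has been loaded.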

4. Methods

To study the various aspects introduced in Section 1, we had to create a dataset that followed the guidelines detailed in Table 3. These guidelines were based on the literature on searching as learning and information seeking introduced in Section 2. To create the dataset, we followed a method based on expert judgment, which is widely used in fields such as education (e.g., [22,23]), research (e.g., [24,25]), and industry (e.g., [26,27,28,29]), among others.
Based on the guidelines shown in Table 3, we designed a semi-structured questionnaire, which was implemented using Google Forms. The application of the questionnaire was targeted to experts in six different domains. To select domains, we first identified top searched domains on the Internet [31]. After that, we selected six domains: computer science, finances, laws, biology, industrial engineering, and physics.
In order to define specific subjects in each domain, we first identified two experts per domain. More specifically, we located 12 faculty members from different universities and countries. Once the experts were identified, an appointment was made with each one. Every one of them was asked to suggest a topic of interest for society and formulate a general question related to it. Several interviews were scheduled with each expert until the structure of the questions and the language used were fine-tuned in order to be appropriate for students who have no prior knowledge of the subject. Once the questions were defined, we requested the assistance of an expert in formulating questions in educational contexts, with the purpose of validating if they were properly posed.
Once the validation process was completed, an online survey was designed and the study was presented to the Institutional Ethics Committee of the Universidad de Santiago de Chile. The research protocol for this project was approved on 16 April 2019 (Ethical Report No. 160.2019).
The overall questionnaire consisted of 23 items, including 2 agreement questions, 11 closed-ended demographic questions, and 10 open-ended questions. The online survey was tested in a pilot study by 32 members of the InTeracTion research group. Data collection was carried out in three stages:
  • In the first stage, prestigious universities, research centers, and industries of Spanish-speaking countries in each of the six domains of interest were identified.
  • In the second stage, we created a list including faculty members, researchers, and professionals whose institutional email was available.
  • In the third stage, invitations to participate in the online survey were sent to the experts via email. In addition, the experts were asked to share the survey with senior students (with at least a bachelor’s degree) who are proficient in the subject.
In the online survey, each expert was first required to fill in a demographic questionnaire. Second, each expert selected a topic according to his or her field of expertise (Table 4). Third, we asked experts to provide a sorted sequence of three web pages (mostly based on text) that can be used to guide the learning process of students who know little or nothing about the selected topic. A restriction for this task was that all three selected documents should be readable within a maximum of 20 min. Finally, the experts were asked to describe their selection criteria.
We invited faculty members, researchers, and senior students (with at least a bachelor’s degree) from prestigious universities and research groups of Spanish-speaking countries to complete the online survey. The survey was available from 25 May 2019 to 31 January 2020.
In the 10 Hispanic countries invited to participate in the online survey, 3717 experts affiliated with a university, research group, and/or industry were identified; 109 of them completed the survey, for a response rate of 2.93%.
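The reported response rate can be checked with a few lines of Python. This is only a worked calculation using the two figures given above (109 completed surveys out of 3717 experts); the function name response_rate is hypothetical.

```python
def response_rate(completed, invited):
    """Return the response rate as a percentage rounded to two decimals."""
    if invited <= 0:
        raise ValueError("invited must be a positive number")
    return round(100 * completed / invited, 2)


rate = response_rate(completed=109, invited=3717)
print(f"Response rate: {rate}%")  # prints "Response rate: 2.93%"
```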
Collected raw data were filtered to eliminate observations containing broken URLs, duplicated URLs within a single record, or inconsistent data. Twenty-six observations were discarded during this process, leaving the 83 LPs in the final dataset. To support the selection criteria of the dataset, three variables were created using the Python script shown in Appendix A. The variables were the following:
  • LP documents’ extension: This identifies whether an LP is short or long. For this purpose, we counted the number of words in each document of the LP. If the overall number of words was 4000 or fewer, the LP was classified as short; otherwise, it was considered long. This threshold follows from the 20 min reading limit and the average reading rate of 200 words per minute for comprehensive reading tasks in the reader’s native language [32] (200 × 20 = 4000 words).
  • Document language: This identifies whether the LP documents are in Spanish or English.
  • Document type: This identifies whether the content of the documents is mostly text or multimedia (i.e., audio and/or video).
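The duplicate-URL filter and the short/long rule described above can be sketched as follows. This is not the script from Appendix A, which is not reproduced here; the function names, the whitespace-only URL normalization, and the assumption that word counts are already available for each document are all illustrative choices.

```python
WORDS_PER_MINUTE = 200     # average comprehensive reading rate cited in the text [32]
MAX_READING_MINUTES = 20   # reading time limit given to the experts
SHORT_LP_MAX_WORDS = WORDS_PER_MINUTE * MAX_READING_MINUTES  # 4000 words


def has_duplicate_urls(urls):
    """Return True if the same URL appears more than once within one LP."""
    # Assumption: only surrounding whitespace is ignored when comparing URLs.
    cleaned = [url.strip() for url in urls]
    return len(set(cleaned)) < len(cleaned)


def classify_extension(word_counts):
    """Label an LP 'short' if its documents total 4000 words or fewer, else 'long'."""
    return "short" if sum(word_counts) <= SHORT_LP_MAX_WORDS else "long"


# Hypothetical word counts for the three documents of one LP.
label = classify_extension([1000, 1500, 1500])
```

Deriving the threshold from the two named constants keeps the link between the reading limit and the 4000-word cut-off explicit, so either value can be changed when adapting the rule to another setting.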
To make it easier to identify the LOs, these were linked to unique identifiers (ID_LO) according to the following template DNNO, composed of four digits:
  • D: The first digit indicates the domain: (1) computer science, (2) finances, (3) industry, (4) physics, (5) laws, and (6) biology.
  • NN: The two digits in the middle correspond to a sequential number for each domain. Note that this number does not indicate ranking or any other ordering criteria.
  • O: The last digit indicates whether the LO is at (1) the beginning, (2) the middle, or (3) the end of the LP.
For example: the document with ID 4032 belongs to the physics domain (4) and it is in the middle of the LP (2).
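The DNNO template can be decoded programmatically. The sketch below follows the digit meanings listed above; the function name decode_lo_id and the shape of the returned dictionary are hypothetical.

```python
DOMAINS = {
    1: "computer science",
    2: "finances",
    3: "industry",
    4: "physics",
    5: "laws",
    6: "biology",
}
POSITIONS = {1: "beginning", 2: "middle", 3: "end"}


def decode_lo_id(lo_id):
    """Split a four-digit DNNO identifier into domain, sequence number, and position."""
    text = str(lo_id)
    if len(text) != 4 or not text.isdigit():
        raise ValueError(f"expected a four-digit identifier, got {lo_id!r}")
    domain_digit = int(text[0])
    position_digit = int(text[3])
    if domain_digit not in DOMAINS or position_digit not in POSITIONS:
        raise ValueError(f"unknown domain or position digit in {lo_id!r}")
    return {
        "domain": DOMAINS[domain_digit],
        "sequence": int(text[1:3]),  # sequential number, not a ranking
        "position": POSITIONS[position_digit],
    }


decoded = decode_lo_id(4032)
# decoded == {"domain": "physics", "sequence": 3, "position": "middle"}
```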
The actual web documents were not included because of potential copyright infringement. Access to the documents will be provided upon request if they are no longer available through the URLs in the dataset.

5. Conclusions

In this article, we introduced a dataset of curated LPs. We provided detailed descriptions of the dataset structure, definitions of variables, data distribution, and methodological approach. Our dataset constitutes a valuable resource for researchers and educators dealing with problems related to information search and learning.
The dataset responds to current issues identified in the literature. First, the lack of curated search results linked to learning goals. Second, the current presentation style of search results implemented by popular search engines. Third, common students’ attitudes toward learning complex topics using Internet resources. Fourth, the fact that most content on the Web is text-based. Finally, the lack of datasets in Spanish.
The proposed dataset has different types of applications. First, researchers on information science and education could investigate the effects of LPs on students’ learning on a given topic. Second, researchers could use the dataset to find patterns that could be applied in the improvement of ranking algorithms, explain the order of documents, and investigate novel approaches to display search results in learning contexts. The dataset could also be used to investigate teaching–learning strategies of complex topics.
Finally, to the best of our knowledge, this is the first open dataset containing curated learning paths in Spanish. While relatively small compared to datasets in other domains, the methodological approach provided in this article can be followed by other researchers to further extend the current dataset with other topics and languages.

Supplementary Materials

The following supplemental data are available online at, Questionnaire S1: Spanish-version questions asked in the online survey.

Author Contributions

Conceptualization, V.P.-R. and R.G.-I.; methodology, V.P.-R. and R.G.-I.; software, V.P.-R.; validation, V.P.-R.; formal analysis, V.P.-R.; investigation, V.P.-R.; data curation, V.P.-R.; writing—original draft preparation, V.P.-R.; writing—review and editing, V.P.-R. and R.G.-I.; supervision, R.G.-I. All authors have read and agreed to the published version of the manuscript.


Funding

This research was funded by Secretaría Nacional de Educación Superior, Ciencia, Tecnología e Innovación (SENESCYT) as a part of the Programa de Becas Convocatoria Abierta 2014, Primera Fase; and research grant FONDECYT Regular #1201610 funded by the National Agency for Research and Development (ANID).


Acknowledgments

The authors would like to thank all the experts for their time and valuable contribution to data collection.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

The following Python script was used to extract general features from the dataset.
Figure A1. Python script used to extract general features from the dataset.


References

  1. Bush, V. As we may think. Atl. Mon. 1945, 176, 101–108.
  2. Acampora, G.; Gaeta, M.; Loia, V. Hierarchical optimization of personalized experiences for e-Learning systems through evolutionary models. Neural Comput. Applic. 2011, 20, 641–657.
  3. Al-Muhaideb, S.; Menai, M.E.B. Evolutionary computation approaches to the Curriculum Sequencing problem. Nat. Comput. 2011, 10, 891–920.
  4. Caputi, V.; Garrido, A. Student-oriented planning of e-learning contents for Moodle. J. Netw. Comput. Appl. 2015, 53, 115–127.
  5. Dwivedi, P.; Kant, V.; Bharadwaj, K.K. Learning path recommendation based on modified variable length genetic algorithm. Educ. Inf. Technol. 2018, 23, 819–836.
  6. Byrne, J.; Kardefelt-Winther, D.; Livingstone, S.; Stoilova, M. Global Kids Online Research Synthesis, 2015–2016; UNICEF Office of Research-Innocenti and London School of Economics and Political Science: London, UK, 2016; pp. 1–75.
  7. Livingstone, S.; Kardefelt-Winther, D.; Saeed, M. Global Kids Online Comparative Report 2019; UNICEF Office of Research-Innocenti and London School of Economics and Political Science: London, UK, 2019; pp. 1–135.
  8. Rieh, S.Y.; Collins-Thompson, K.; Hansen, P.; Lee, H.-J. Towards searching as a learning process: A review of current perspectives and future directions. J. Inf. Sci. 2016, 42, 19–34.
  9. Saracevic, T. Relevance reconsidered. In Proceedings of the Second Conference on Conceptions of Library and Information Science (CoLIS 2), Copenhagen, Denmark, 13–16 October 1996; ACM: New York, NY, USA, 1996; pp. 201–218.
  10. Saracevic, T. Why is relevance still the basic notion in information science. In Proceedings of the Re: Inventing Information Science in the Networked Society and Proceedings of the 14th International Symposium on Information Science (ISI 2015), Zadar, Croatia, 19–21 May 2015; pp. 26–35.
  11. Nahl, D.; Tenopir, C. Affective and cognitive searching behavior of novice end-users of a full-text database. J. Am. Soc. Inf. Sci. 1996, 47, 276–286.
  12. Nahl, D.; Bilal, D. Information and Emotion: The Emergent Affective Paradigm in Information Behavior Research and Theory; American Society for Information Science and Technology: Silver Spring, MD, USA; Information Today, Inc.: Medford, NJ, USA, 2007; ISBN 978-1-57387-310-9.
  13. Farrell, R.G.; Liburd, S.D.; Thomas, J.C. Dynamic assembly of learning objects. In Proceedings of the WWW-ACM, New York, NY, USA, 17–22 May 2004; Association for Computing Machinery: New York, NY, USA, 2004; pp. 162–169.
  14. Syed, R.; Collins-Thompson, K. Optimizing Search Results for Educational Goals: Incorporating Keyword Density as a Retrieval Objective. In Proceedings of the SAL@SIGIR, Pisa, Italy, 17–21 July 2016.
  15. Hearst, M. Search User Interfaces; Cambridge University Press: New York, NY, USA, 2009; ISBN 978-0-521-11379-3.
  16. Marchionini, G. Exploratory search: From finding to understanding. Commun. ACM 2006, 49, 41–46.
  17. Large, A.; Nesset, V.; Beheshti, J. Children as information seekers: What researchers tell us. New Rev. Child. Lit. Librariansh. 2008, 14, 121–140.
  18. Rieh, S.Y.; Kim, Y.-M.; Markey, K. Amount of invested mental effort (AIME) in online searching. Inf. Process. Manag. 2012, 48, 1136–1150.
  19. Graham, L.; Metaxas, P.T. “Of course it’s true; I saw it on the Internet!”: Critical thinking in the Internet era. Commun. ACM 2003, 46, 70–75.
  20. Johnston, B.; Webber, S. Information Literacy in Higher Education: A review and case study. Stud. High. Educ. 2003, 28, 335–352.
  21. Fernández Vítores, D. El Español: Una Lengua Viva. Informe 2019; Instituto Cervantes: Madrid, Spain, 2019; pp. 1–96.
  22. Escobar-Pérez, J.; Martínez, A. Validez de contenido y juicio de expertos: Una aproximación a su utilización. Av. En Med. 2008, 6, 27–36.
  23. Fotheringham, D. The role of expert judgement and feedback in sustainable assessment: A discussion paper. Nurse Educ. Today 2011, 31, e47–e50.
  24. Hyrkäs, K.; Appelqvist-Schmidlechner, K.; Oksa, L. Validating an instrument for clinical supervision using an expert panel. Int. J. Nurs. Stud. 2003, 40, 619–625.
  25. Drew, A.; Perera, A. Expert Knowledge as a Basis for Landscape Ecological Predictive Models. In Predictive Species and Habitat Modeling in Landscape Ecology: Concepts and Applications; Springer: New York, NY, USA, 2011; pp. 229–248.
  26. Tsyganok, V.V.; Kadenko, S.V.; Andriichuk, O.V. Significance of expert competence consideration in group decision making using AHP. Int. J. Prod. Res. 2012, 50, 4785–4792.
  27. Hughes, R.T. Expert judgement as an estimating method. Inf. Softw. Technol. 1996, 38, 67–75.
  28. Jørgensen, M. Forecasting of software development work effort: Evidence on expert judgement and formal models. Int. J. Forecast. 2007, 23, 449–462.
  29. Burgman, M.; Fidler, F.; Mcbride, M.; Walshe, T.; Wintle, B. Eliciting Expert Judgments: Literature Review; Australian Centre for Excellence in Risk Analysis (ACERA): Melbourne, VIC, Australia, 2006.
  30. Ritter, F.E.; Nerb, J.; Lehtinen, E.; O’Shea, T.M. In Order to Learn: How the Sequence of Topics Influences Learning; Oxford University Press: Oxford, UK, 2007; Volume 2, ISBN 978-0-19-803977-8.
  31. White, R.W.; Dumais, S.T.; Teevan, J. Characterizing the influence of domain expertise on web search behavior. In Proceedings of the Second ACM International Conference on Web Search and Data Mining, Barcelona, Spain, 9–12 February 2009; Association for Computing Machinery: New York, NY, USA, 2009; pp. 132–141.
  32. Grabe, W.; Stoller, F.L. Teaching and Researching Reading, 3rd ed.; Routledge: Abingdon, UK, 2019; ISBN 978-1-317-53642-0.
Figure 1. Hispanic countries where the online survey was applied.
Figure 2. Nationality and type of institution the experts belonged to.
Figure 3. Number of experts surveyed by age and sex.
Figure 4. Respondents’ education level and expertise domain.
Figure 5. Features of documents.
Table 1. Dataset content, including names, variable types, and descriptions.
Column Name | Type | Description | Source *
ID_LP | Identifier | Row unique identifier (ID) or key | C
Age | Categorical | Expert’s age range | S
Sex | Categorical | Woman or Man | S
Nationality | Categorical | Expert’s nationality | S
Native_language | Categorical | Native language | S
Education | Categorical | Highest degree obtained or in course: bachelor’s, master’s, or doctorate | S
Professional_degree | Categorical | Expert’s career or profession | S
Main_activity | Categorical | Main activity: student, lecturer (those that deal only with teaching duties), and faculty member (or researcher alone) | S
Current_year_study | Ordinal | If the expert is a student (e.g., doctoral program), current progress in terms of years within the program | S
Institution_type | Categorical | Higher level institution or research group | S
Time_spent | Categorical | Time spent on the Web according to the following scale: 0. Never; 1. Once a week; 2. Two or three days a week; 3. At least five days a week, less than an hour a day; 4. At least five days a week, between one hour and three hours a day; 5. At least five days a week, more than three hours a day | S
Domain | Categorical | Expertise area: biology, computer science, finances, laws, physics, industrial engineering | S
Topic | Categorical | It can be one of the following six topics: Bioethics of animal tissue cloning for human intake; Artificial neural networks; Investment projects; Inheritance laws in Chile; Quantum computing; Industrial revolutions | S
Experience_time | Categorical | Years of experience in the selected topic according to the following ranges: <1 year; 2–3 years; 4–5 years; 6–9 years; >10 years | S
Id_LO_1 | Ordinal | Id of the first LO in the LP | C
URL_1 | Qualitative | URL of the first LO in the LP | S
Query_1 | Qualitative | Query used by the expert to obtain LO_1 | S
Reason_1 | Qualitative | Reasons for recommending reading LO_1 in first place | S
Id_LO_2 | Ordinal | Id of the second LO in the LP | C
URL_2 | Qualitative | URL of the second LO in the LP | S
Query_2 | Qualitative | Query used by the expert to obtain LO_2 | S
Reason_2 | Qualitative | Reasons for recommending reading LO_2 in second place | S
Id_LO_3 | Ordinal | Id of the third LO in the LP | C
URL_3 | Qualitative | URL of the third LO recommended in the LP | S
Query_3 | Qualitative | Query used by the expert to obtain LO_3 | S
Reason_3 | Qualitative | Reasons for recommending reading LO_3 last | S
Comments | Qualitative | Comments and observations made by each expert | S
LP_docs_extension | Categorical | LP documents’ extension: short or long | C
Document_language | Categorical | Documents’ language: Spanish or English | C
Document_type | Categorical | Documents’ content: text or multimedia | C
* The Source column indicates whether the value was obtained directly from the survey (S) or computed (C).
Table 2. Summary of data by domain, level of education, and sex.
n = 23; n = 51; n = 9; n = 83; n = 4; n = 19; n = 9; n = 42; n = 2; n = 7
Table 3. Guidelines considered for the creation of the dataset.
Current Scenario | Criteria
Lack of validation for search results. | Consider experts’ knowledge and criteria to select and organize web documents as LOs.
Endless search results and random reading order. | Organize search results as LPs, defined as a finite and organized sequence of documents (LOs), considering that the order in which study material is presented can lead to different learning outcomes [30].
Observed common attitudes and behaviors among students toward web search contexts as little time and effort were invested in finding information [18]. | Short LPs intended to satisfy an immediate learning need, since students spend 14:21 min on average in a search session to read text documents [18].
Most web content is in text format. | LPs mostly based on text.
Most IR (Information Retrieval) research is based on information presented in English language. | Spanish is the third most used language on the Internet [21], so it is necessary to pay attention to these users.
Table 4. Topics and subjects for each domain.
Domain | Topic | Question
Biology | Bioethics of animal tissue cloning for human intake | What are the basic ethical principles to consider when cloning animal tissues for human intake?
Computer science | Artificial neural networks | What are the main differences between a simple artificial neural network and a deep artificial neural network?
Finances | Investment projects | What are the factors that must/should be considered when deciding whether to undertake a new business or to invest in properties?
Industry | Industrial revolutions | What are the main milestones for each industrial revolution?
Laws | Inheritance laws in Chile | Is it legal to disinherit a daughter or son? If so, in which cases?
Physics | Quantum computing | What are the main differences between quantum computers and classic computers?

Share and Cite

MDPI and ACS Style

Proaño-Ríos, V.; González-Ibáñez, R. Dataset of Search Results Organized as Learning Paths Recommended by Experts to Support Search as Learning. Data 2020, 5, 92.
