Developing a Learning Pathway System through Web-Based Mining Technology to Explore Students’ Learning Motivation and Performance

Cheng, Shu-Chen; Cheng, Yu-Ping; Huang, Yueh-Min

doi:10.3390/su15086950

by Shu-Chen Cheng ¹, Yu-Ping Cheng ² and Yueh-Min Huang ²,*

¹ Department of Computer Science and Information Engineering, Southern Taiwan University of Science and Technology, Tainan City 710301, Taiwan
² Department of Engineering Science, National Cheng Kung University, Tainan City 701, Taiwan
* Author to whom correspondence should be addressed.

Sustainability 2023, 15(8), 6950; https://doi.org/10.3390/su15086950

Submission received: 15 March 2023 / Revised: 15 April 2023 / Accepted: 19 April 2023 / Published: 20 April 2023

(This article belongs to the Special Issue Sustainable Intelligent Education Programs)

Abstract:

The Internet offers many resources, but search results for articles or multimedia videos are usually interspersed with irrelevant information or advertisements, so students may spend a lot of time judging whether the results are suitable as learning materials. Therefore, this study developed a learning pathway system that automatically analyzes the representative keywords and difficulty of Internet articles, and then explored the learning performance and motivation of students using this system. A total of 67 students were recruited for 18 weeks of experimental activities. During the activities, students could use the learning pathway system to search for and read algorithm-related materials, and they could also continue to use the system for self-learning after class. The results show that the students’ post-test scores were significantly higher than their pre-test scores, indicating that students can use the learning pathway system to improve their academic performance in algorithm courses. In addition, the intrinsic motivation of high-achieving students was improved, while the intrinsic and extrinsic motivation of low-achieving students were both improved. This means that the learning pathway system can provide suitable learning materials for students, allowing them to achieve autonomous learning.

Keywords:

learning pathway; web-based mining; association rule; learning motivation; deep learning

1. Introduction

With the vigorous development of the technology industry, the shortage of human resources has increased significantly, and attractive salaries have increased the willingness of many people to switch careers. However, the technology industry is diverse, and each field includes specialized knowledge that needs to be learned. Therefore, whether it is an undergraduate student who wants to change fields or a non-undergraduate student who wants to change jobs across industries, it will take a lot of time to screen and read relevant documents to learn the relevant knowledge. Although articles, multimedia videos, and other related resources on the Internet are easy for ordinary people to obtain, when learners use search platforms, the search results are often mixed with irrelevant information and advertisements. Beginners and students need a lot of time to accumulate the experience required to judge whether the search results provided by these websites match their own levels and needs. This may reduce learners’ willingness and motivation to learn, thereby affecting their performance and satisfaction in the educational environment [1]. Ref. [2] suggests that learning through reading and grading is ineffective. Therefore, many methods exist to improve learning, such as flipped classroom teaching. The authors of [3] believe that the flipped classroom has excellent potential. By carefully preparing classroom activities and improving students’ learning skills and attitudes, it can become an effective teaching tool for students at any level. Student participation in learning can improve their attention and concentration, train them to have better critical thinking abilities, and deepen their learning experience [4].

Although many studies have shown that flipped teaching can help improve grades and learning motivation, it still faces many challenges [5]. Learning is a gradual process, and one must consolidate foundational knowledge, like building blocks, before tackling more complex problems. Schools typically design a suitable “curriculum learning map” so that students can learn and gradually develop their future employability. Learning pathways involve the unique developmental trajectories or routes that individuals may take toward achieving their learning goals. While different learners may end up with similar learning achievements or performances, their experiences along the way can vary. This emphasizes the importance of conducting a person-centered analysis to understand the diverse pathways individuals may follow in their learning journeys [6]. Creating a clear and appropriate map can reduce confusion in learning and allow students to focus on “career-oriented” and “skill-oriented” learning to build their research plans. It also fosters the ability for “self-directed learning”. However, search engines do not provide information on article difficulty or discussion topics, leading users to spend a lot of time filtering and categorizing search results to find articles that match their needs and levels. Therefore, this study aims to develop an automatically constructed learning pathway system. It provides a variety of educational articles, automatically identifies the representative keywords of each article, and analyzes their difficulty. Learners can thus use this additional information to quickly find articles and materials that meet their needs. In addition, the system analyzes each user’s search and viewing logs, adjusts the automatically generated learning path, improves the accuracy of recommendations and search results, and gradually allows learners to learn the correct knowledge and concepts along the correct learning pathway.

The system developed in this study includes automatic analysis of article difficulty, which is a key technology for constructing learning paths. The purpose is to allow users to choose learning materials according to their own level to achieve personalized learning effects. The system allows learners to quickly and conveniently obtain the required information at any time, as shown in Figure 1. The system classifies keywords as elementary, intermediate, or advanced and classifies articles as simple, medium, or challenging. Every article is labeled with its difficulty so that users can choose suitable materials.

2. Literature Review

2.1. Content-Based Recommendation System

To provide users with practical and easily interpretable articles for information retrieval, a content-based recommendation system is considered essential as it enables users to search for diverse sources of information [7,8]. This system, which is derived from the field of information retrieval, primarily uses text data as an information source and predicts recommendations based on various factors that influence users, such as keywords and similarity. To transform documents into vectors in a multi-dimensional space for analysis, it employs techniques such as term frequency-inverse document frequency (TF-IDF), vector space model, and latent semantic indexing [9]. Algorithms, such as support vector machine (SVM), k-means, k-nearest neighbor rule, and decision tree, are then used to establish a recognition model for media features, among others. The system assigns different scores to items based on their attributes, features, and weights in the training samples and generates recommendation results. Finally, it further analyzes these items.

2.2. Learning Pathway

A learning pathway, also known as a developmental trajectory, is the process leading to a learning goal [10]. Different learners may have the same learning goal, but their experiences in the learning process may vary. Thus, understanding the student’s learning pathway requires analyzing it with the student as the center. Studies have shown that adopting students’ learning pathways to provide adaptive support can improve their learning situation [11]. In recent years, there have been many studies on the recommendation, classification, and construction of learning pathways. Some studies compare learning pathways to examine their impact on emotions and grades [6]. Others use variables such as learning behavior and motivation to predict learners’ grades [12], while some argue that designing adaptive, customized learning pathways is crucial in the design of learning environments [13].

Learning pathways help learners stay organized and oriented during the learning process, providing a clear idea of their progress. Additionally, they enable educators to design and manage courses better, offering a more personalized and targeted learning experience. Combining learning pathway maps with other educational technology tools, such as online learning platforms, adaptive learning systems, and learning analytics and data mining technologies, can provide more support and feedback to both learners and educators.

2.3. Text Mining

Text mining is an extension of data mining. It involves the difficult task of extracting valuable information from diverse language texts. The purpose of text mining is to extract useful information from unstructured text. The process of text mining involves natural language processing tasks. Any processing of text that leads to obtaining valuable information is considered part of text mining. For example, the TF-IDF method extracts keywords from the text for classification or other applications [14]. TF-IDF is a numerical representation used in information retrieval and text mining to determine the importance of a term within a document or article.

In research related to the information field, there is generally a certain degree of correlation between words, which can usually be obtained through association analysis algorithms. The most famous association analysis algorithm, the Apriori algorithm, was first proposed by Agrawal and Srikant in 1994. The Apriori algorithm finds regularities and correlations among combinations of two or more items in large data sets. It calculates how frequently these combinations occur in the data set and establishes association rules based on support, confidence, and lift. The following example uses algorithm-related articles A and keywords K.

(1): Support:

Support is the proportion of the combination of particular keywords K ($K_{1}, K_{2}, \dots, K_{n}$) appearing in all articles A. For example, if there are a total of 600 articles in A and the keyword pair (machine learning, deep learning) appears in 150 of them, then the support of the keyword pair (machine learning, deep learning) is 150/600 = 0.25 (25%). Equation (1) shows the formula for support.

$$S(K_{1} \to K_{2} \to \dots \to K_{n}) = \frac{P(K_{1} \cap K_{2} \cap \dots \cap K_{n})}{A} \tag{1}$$

(2): Confidence:

The confidence level represents the probability of the occurrence of keywords K ($K_{1}, K_{2}, \dots, K_{n}$) when the keyword K ($K_{1}$) appears. For example, if the keyword K (machine learning) appears in a total of 200 articles, and among those 200 articles, the keyword K (deep learning) appears in 100 of them, the confidence level can be calculated as 100/200 = 0.5 (50%). Equation (2) shows the formula for confidence.

$$C(K_{1} \to K_{2} \to \dots \to K_{n}) = \frac{P(K_{1} \cap K_{2} \cap \dots \cap K_{n})}{P(K_{1})} \tag{2}$$

(3): Lift:

Lift compares the probability of the occurrence of K ($K_{2}, \dots, K_{n}$) when the keyword K ($K_{1}$) appears with the overall probability of the occurrence of K ($K_{2}, \dots, K_{n}$), and thus represents the correlation between the keywords. For example, the lift of K (machine learning, deep learning) is calculated by dividing the confidence level of K (machine learning, deep learning) by the support level of K (deep learning). If the lift is greater than 1, it indicates that when K (machine learning) appears, K (deep learning) is also likely to appear. If the lift is less than 1, it indicates that when K (machine learning) appears, K (deep learning) is less likely to appear. If the lift is equal to 1, it indicates that there is no correlation between K (machine learning) and K (deep learning). Equation (3) shows the formula for lift.

$$L(K_{1} \to K_{2} \to \dots \to K_{n}) = \frac{C(K_{1} \to K_{2} \to \dots \to K_{n})}{S(K_{2} \to \dots \to K_{n})} \tag{3}$$
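As an illustration of Equations (1) to (3), the following minimal Python sketch computes support, confidence, and lift on a small keyword corpus. The corpus and the function names are hypothetical and chosen only so that the support and confidence values match the 0.25 and 0.5 of the examples above; they are not taken from the system described in this study.

```python
def support(articles, keywords):
    """Fraction of articles that contain every keyword in `keywords`."""
    keywords = set(keywords)
    hits = sum(1 for kws in articles if keywords <= kws)
    return hits / len(articles)


def confidence(articles, antecedent, consequent):
    """Support of the combined keywords divided by support of the antecedent."""
    combined = set(antecedent) | set(consequent)
    return support(articles, combined) / support(articles, antecedent)


def lift(articles, antecedent, consequent):
    """Confidence of the rule divided by support of the consequent."""
    return confidence(articles, antecedent, consequent) / support(articles, consequent)


# Hypothetical toy corpus: each article is represented by its keyword set.
articles = [
    {"machine learning", "deep learning"},
    {"machine learning", "deep learning"},
    {"machine learning"},
    {"machine learning"},
    {"deep learning"},
    {"sorting"},
    {"sorting"},
    {"graph"},
]
ml, dl = {"machine learning"}, {"deep learning"}

print(support(articles, ml | dl))    # 2 of 8 articles -> 0.25
print(confidence(articles, ml, dl))  # 2 of the 4 "machine learning" articles -> 0.5
print(lift(articles, ml, dl))        # 0.5 / (3/8), i.e. greater than 1
```

Because the lift in this toy corpus is greater than 1, the two keywords would be read as positively associated.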

The main concept of the Apriori algorithm is that when a set of items appears frequently in the data set, every subset of that item set must also appear frequently. Conversely, if a subset appears only rarely, any item set that contains it must also be rare. On the basis of this concept, candidate item sets are compared against the given minimum support and minimum confidence to screen the data and generate association rules between items.

The execution flow of the Apriori algorithm is as follows:

1. Scan the dataset to find all candidate subsets;
2. Mark the frequency and count of all candidate subsets appearing in the dataset;
3. Set minimum support and minimum confidence levels to filter the data;
4. Generate candidate subsets of length K + 1 from subsets of length K;
5. Repeat steps 2–4 until the generated candidate subsets reach the maximum length.
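The steps above can be sketched in a few lines of Python. This is a simplified illustration of the general Apriori procedure, not the implementation used in this study; the transactions, thresholds, and function names are assumptions made for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset."""
    n = len(transactions)
    # Step 1: the initial candidates are all single items.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Step 2: count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        # Step 3: keep only candidates that reach the minimum support.
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        # Step 4: join frequent k-itemsets into (k + 1)-candidates and prune
        # those that have an infrequent k-subset (the Apriori property).
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def association_rules(frequent, min_confidence):
    """Derive (antecedent, consequent, confidence) rules from frequent itemsets."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                conf = sup / frequent[antecedent]
                if conf >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, conf))
    return rules


# Hypothetical keyword sets of five articles.
transactions = [
    {"sorting", "quicksort"},
    {"sorting", "quicksort", "recursion"},
    {"sorting", "mergesort", "recursion"},
    {"graph", "bfs"},
    {"sorting", "recursion"},
]
frequent = apriori(transactions, min_support=0.4)
rules = association_rules(frequent, min_confidence=0.7)
for antecedent, consequent, conf in rules:
    print(set(antecedent), "->", set(consequent), round(conf, 2))
```

The loop stops on its own once no candidate of the next length survives pruning, which corresponds to step 5.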

2.4. Text Classification

Text classification is the task of categorizing text into multiple classes according to given rules. The basis of text classification can be valuable information extracted from the text content through text mining methods. However, helpful information is not only present in the text content. The authors of [15] extracted non-text features in addition to text features and used both to evaluate app reviews on the Apple App Store and Google Play Store. They found that incorporating non-text features can improve the accuracy of classification tasks.

There are many algorithms used for text classification. They can be divided into traditional machine learning methods and deep-learning-based methods. Traditional machine learning methods such as Naive Bayes, Support Vector Machine (SVM), and Random Forest are commonly used [16]. Ref. [17] used Random Forest on short medical diagnosis notes and achieved an accuracy rate of 91% in a binary classification task. Deep-learning-based methods such as Recurrent Neural Networks (RNN), Convolutional Neural Networks (CNN), and attention mechanisms are also popular [18].

Currently, deep-learning-based methods have achieved outstanding results in text classification tasks. The authors of [19] fully utilized the context of the text content. They combined a BiLSTM model with a self-attention mechanism and achieved good results in fine-grained sentiment polarity classification tasks for short text. They also demonstrated that using repeated data to fill in imbalanced data can improve model training efficiency. Some studies have tried to combine several different models to achieve a compact architecture or better results, such as [20] combining CNN and RNN in the famous IMDB dataset for sentiment classification and achieving an accuracy rate of 93.2%. The authors of [21] not only used the architecture of CNN and RNN but also fused the time domain and space domain to classify Chinese financial news, achieving a classification accuracy of 96.45%.

Although many research results have shown that deep-learning-based methods generally have higher accuracy than traditional machine learning methods, the prerequisite is sufficient training data. If the text classification task data is difficult to collect or annotate, traditional machine learning methods are more advantageous.

Due to the high cost of obtaining well-labeled training data for text classification tasks but the ease of obtaining large amounts of unlabeled text, [22] used a large part of the unlabeled text to augment a small quantity of labeled text, thereby improving the accuracy of the text classifier. At the same time, [23] also pointed out that the number of samples between training data categories also affects the accuracy of the text classifier.

The following is a discussion and analysis of relevant literature on the “association analysis method”.

2.5. Word Segmentation

Segmenting Chinese sentences is more difficult than segmenting English sentences, as English sentences can be easily segmented using spaces. Common methods for Chinese word segmentation include lexicon segmentation, N-Gram, and hybrid segmentation.

Lexicon segmentation: Lexicon-based segmentation involves building a keyword database from commonly used keywords in a specific field, such as “artificial intelligence” and “deep learning”, and extracting keywords through string matching. The most well-known Chinese word segmentation package is jieba, which has a comprehensive lexicon for both simplified and traditional Chinese. Therefore, this study adopts jieba as the main word segmentation method for analyzing difficulty, and experts manually establish the keyword database. Lexicon-based segmentation is also used for expanding the keyword database for information domains.

N-Gram: N-Gram segmentation is commonly used for preprocessing Chinese, Japanese, and Korean text, as shown in Equation (4):

$$P(w_{1} w_{2} \dots w_{\tau}) = \prod_{i = 1}^{\tau} P(w_{i} \mid w_{1} \dots w_{i - 1}) \tag{4}$$

N in N-Gram represents the length of the segmented characters. For example, when N = 1, each character is regarded as a single entity. The results show that when N = 3, it performs best for precision, recall, and F-measure.
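To make the role of N concrete, the following short Python sketch splits a string into character N-grams. It assumes an overlapping sliding window, which is one common way to form N-grams; the text does not state whether the system uses overlapping or non-overlapping units, and the function name is hypothetical.

```python
def char_ngrams(text, n):
    """Split `text` into overlapping character n-grams (sliding window of size n)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


sample = "深度學習"  # "deep learning" in traditional Chinese
print(char_ngrams(sample, 1))  # ['深', '度', '學', '習']
print(char_ngrams(sample, 2))  # ['深度', '度學', '學習']
print(char_ngrams(sample, 3))  # ['深度學', '度學習']
```

With N = 1 every character is its own unit, while larger N values produce longer candidate terms that can later be checked against a lexicon.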

2.6. Sustainable Development Goals

As a specialized UN agency, UNESCO promotes the development of experience-based higher education policies [24]. It provides technical assistance to member states in strategic and policy reviews to ensure access to quality education, academic mobility, and accountability for all [25,26]. The United Nations has identified 17 Sustainable Development Goals (SDGs) for the survival and development of humans and other species, with SDG 4 addressing the social dimension. By 2030, it aims to “ensure inclusive and equitable quality education and promote lifelong learning opportunities for all, including university education, for both men and women”. Education is the foundation for improving quality of life and achieving global sustainable development [27,28]. SDG 4 aims for inclusive, equitable, quality, and lifelong education. While it is only 1 of the 17 SDGs, SDG 4 is the foundation for the other goals [29].

3. Methods

3.1. System Architecture

In this study, web articles and pictures were obtained from the Internet using web crawler technology. A total of 133,962 articles were collected, and 56,676 articles in the computer science category were retained automatically by text mining techniques. The collected data were then preprocessed. The preprocessing steps include removing punctuation marks, stop words, and English words, N-gram segmentation, and long and short word processing. Then, the Term Frequency-Inverse Document Frequency (TF-IDF) algorithm is used to calculate the keywords that can represent each article. Whether an article is computer-science-related depends on whether these keywords are computer-related. At the same time, if a word produced by the N-gram segmentation method has a high TF-IDF value, it can also be used as a basis for automatic lexicon expansion. After non-computer-science-related articles are filtered out, the remaining articles are segmented with jieba. Then, the difficulty level of each keyword is calculated, and the difficulty of each article is estimated from the difficulty of its keywords. Finally, the calculated difficulty level of each article is integrated into the learning pathway system, which combines the difficulty level and the article keywords. Users can enter keywords as queries, and related articles are then recommended. The difficulty level of the articles is also displayed in the search result list for users’ reference. In this study, the system was provided to the students in the algorithm course, and their continuously accumulating feedback was used as the verification standard for classifying article difficulty. The system architecture of this study is shown in Figure 2.

To build its article collection, the system first collects articles through web crawlers and then performs the first preprocessing step, word segmentation. The system uses the N-Gram method for word segmentation. When the value of N is 1, the segmentation length is 1, and the text is divided into single characters. When the value of N is 3, the text is split into units of 3 characters. N-Gram can adjust the segmentation length according to the needs of the system, such as 2-Gram or 3-Gram.

The weight calculation method used in this study is TF-IDF. TF represents how frequently a word occurs in a single article, while IDF reflects how rarely the word occurs across all articles. Therefore, the method can extract terms that represent each article.

In TF-IDF, TF represents the frequency of a word in a specific article. The TF formula is shown in Formula (5).

$$TF_{ij} = \frac{n_{j}}{n_{all}} \tag{5}$$

Here, $TF_{ij}$ represents the frequency of word $j$ in document $i$, $n_{j}$ indicates the number of times word $j$ appears in document $i$, and $n_{all}$ represents the total number of words in document $i$.

IDF is the inverse document frequency. When a word appears in only a few articles, it has higher importance; conversely, if the word appears in many articles, it is less important. The IDF formula is shown in Formula (6).

$$IDF_{j} = \log_{2} \frac{N}{d_{j}} \tag{6}$$

$N$ represents the total number of documents, and $d_{j}$ represents the number of documents in which word $j$ occurs. Multiplying TF by IDF gives the weight of word $j$ in document $i$. The TF-IDF formula is shown in Formula (7).

$$TFIDF_{ij} = \frac{n_{j}}{n_{all}} \times \log_{2} \frac{N}{d_{j}} \tag{7}$$
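Formulas (5) to (7) can be illustrated with a minimal Python sketch that uses the same base-2 logarithm. The token lists and the function name are hypothetical; real input would come from the segmentation step described above.

```python
import math
from collections import Counter


def tf_idf(documents):
    """documents: list of token lists. Returns one {word: weight} dict per document."""
    n_docs = len(documents)  # N
    # d_j: number of documents in which word j occurs at least once.
    doc_freq = Counter(word for doc in documents for word in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)  # n_j for every word j in this document
        total = len(doc)       # n_all
        weights.append({
            word: (count / total) * math.log2(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return weights


# Hypothetical, already segmented documents.
docs = [
    ["sort", "sort", "array", "data"],
    ["graph", "node", "data"],
    ["sort", "graph", "data", "data"],
    ["tree", "node", "data", "data"],
]
weights = tf_idf(docs)
print(weights[0])  # "data" occurs in every document, so its weight is 0
```

In this toy example, "data" receives a weight of 0 because it appears in all four documents, while "sort" and "array" both receive 0.5 in the first document.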

When N-gram word segmentation is used to process an article, new keywords may be found that expand the lexicon. Handling long and short words involves comparing keywords that meet a given threshold against a dictionary; words that appear in the dictionary are preserved. Next, the web crawler collects new articles for the newly added keywords, and text preprocessing and long and short word processing are applied to the collected articles. By repeating these processes, the keyword thesaurus becomes more extensive, and the selected long and short words become more accurate. After the web crawler obtains web page data, the system judges whether each article is computer-science-related. This study calculated the TF-IDF value of keywords and the proportion of keywords in the articles. A higher proportion of topic-related keywords in an article indicates that the article is computer-science-related. A total of 133,962 articles were retrieved using web crawlers for this study, and 56,676 articles remained after the filtering. Figure 3 illustrates the process of obtaining algorithm-related articles using TF-IDF.
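The filtering idea described above can be sketched as follows. This is only one plausible reading: the number of top keywords considered (`top_k`), the ratio threshold (`min_ratio`), the lexicon, and the function name are all hypothetical, since the text does not report the actual values or rule used.

```python
def is_topic_related(keyword_weights, lexicon, top_k=5, min_ratio=0.5):
    """Decide whether an article is topic-related.

    keyword_weights: {word: TF-IDF weight} for one article.
    lexicon: set of domain terms.
    top_k and min_ratio are illustrative assumptions, not values from the study.
    """
    # Take the top_k words with the highest TF-IDF weight as representative keywords.
    top = sorted(keyword_weights, key=keyword_weights.get, reverse=True)[:top_k]
    if not top:
        return False
    # Proportion of representative keywords that belong to the domain lexicon.
    ratio = sum(1 for word in top if word in lexicon) / len(top)
    return ratio >= min_ratio


lexicon = {"algorithm", "sorting", "recursion"}
cs_article = {"algorithm": 0.4, "sorting": 0.3, "recursion": 0.2, "recipe": 0.1}
other_article = {"recipe": 0.5, "cooking": 0.4, "algorithm": 0.1}

print(is_topic_related(cs_article, lexicon))     # True: 3 of 4 keywords are in the lexicon
print(is_topic_related(other_article, lexicon))  # False: 1 of 3 keywords is in the lexicon
```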

The system collects technical terms during word selection and builds an initial thesaurus. As the number and types of articles extracted by web mining increase, the size and correctness of the initial lexicon are put to the test. Therefore, this study revises and expands the initial lexicon. If an article is classified as computer-science-related but its keywords are not in the lexicon, the lexicon is expanded with them. Modifying the thesaurus inevitably removes some words from the system. Therefore, the system also calculates word weights with the TF-IDF method in correctly classified documents. If a word meets the threshold criteria, it is captured and added to the dictionary with an initial weight. Continuous revision significantly reduces the misjudgment of documents. In previous work on keyword relevance, the data set was mostly fed directly into the Apriori algorithm, which found the most frequent words and listed their combinations. The method used in this study is different. Before the Apriori algorithm is applied, TF and IDF from the TF-IDF formula are computed on the data set to calculate the weight of each keyword. These weights are then used in the Apriori algorithm to find the relevant sets of keywords.
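The lexicon expansion step can be pictured with the following minimal Python sketch. The threshold, the initial weight, the data, and the function name are hypothetical placeholders; the text does not specify the values used by the system.

```python
def expand_lexicon(lexicon, article_weights, threshold, initial_weight=1.0):
    """Add high-weight words from correctly classified articles to the lexicon.

    lexicon: {term: weight} of existing domain terms.
    article_weights: list of {word: TF-IDF weight}, one dict per article that
        was classified as topic-related.
    threshold and initial_weight are illustrative assumptions.
    """
    expanded = dict(lexicon)  # leave the original lexicon unchanged
    for weights in article_weights:
        for word, score in weights.items():
            if score >= threshold and word not in expanded:
                expanded[word] = initial_weight
    return expanded


lexicon = {"algorithm": 2.0, "sorting": 1.5}
related_articles = [
    {"algorithm": 0.4, "heapsort": 0.35, "example": 0.05},
    {"sorting": 0.3, "quicksort": 0.32, "the": 0.01},
]
new_lexicon = expand_lexicon(lexicon, related_articles, threshold=0.3)
print(sorted(new_lexicon))  # ['algorithm', 'heapsort', 'quicksort', 'sorting']
```

Existing terms keep their weights, and only words that reach the threshold are added.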

After using TF-IDF to obtain the keywords of each article, the next step is to calculate the difficulty level of keywords and articles. First, we calculate the difficulty level of the keyword, and then we use the calculated difficulty level of the keyword to determine the difficulty level of the article.

The system contains a total of 515 feedback entries, and statistical analysis was carried out on the difficulty distribution of keywords. Of the 515 entries, 169 are rated as Level 1, 226 as Level 2, and 120 as Level 3. To calculate the difficulty level of an article, the difficulty level of each keyword is multiplied by the corresponding TF-IDF value, the products are summed, and the sum is divided by the total of the TF-IDF values. The calculation formula of article difficulty is shown in Formula (8).

$$\mathrm{article\_diff}_{k} = \frac{\sum_{i = 1}^{n} \mathrm{keyword\_diff}_{i} \times \mathrm{TFIDF}_{i}}{\sum_{i = 1}^{n} \mathrm{TFIDF}_{i}} \tag{8}$$
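Formula (8) is a TF-IDF-weighted average of keyword difficulty levels, which the following minimal Python sketch illustrates. The keyword levels and weights are hypothetical, and the function name is chosen only for this example.

```python
def article_difficulty(keywords):
    """Weighted average difficulty of one article.

    keywords: list of (difficulty_level, tfidf_weight) pairs, one per keyword.
    """
    total_weight = sum(weight for _, weight in keywords)
    if total_weight == 0:
        raise ValueError("the article has no keyword with a positive TF-IDF weight")
    return sum(level * weight for level, weight in keywords) / total_weight


# Hypothetical article with one keyword at each difficulty level.
keywords = [(1, 0.5), (2, 0.25), (3, 0.25)]
print(article_difficulty(keywords))  # (0.5 + 0.5 + 0.75) / 1.0 = 1.75
```

Because the result is normalized by the total weight, it always lies between the lowest and highest keyword level of the article.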

3.2. Participants and Experiment Process

In this study, 67 junior students in the Department of Computer Science and Information Engineering at a university in Taiwan were recruited to participate in a 9-week experimental activity. This study did not apply any inclusion or exclusion criteria.

Figure 4 shows the experimental process of this study. Before the start of the experiment (from the first week to the eighth week), the teacher implemented the algorithm course through traditional classroom teaching. In week 9, a pre-test and a learning motivation questionnaire were administered to all students. From weeks 10 to 17, all students could use the learning pathway system to search for algorithmic articles after class. At week 18, all students completed a post-test and a learning motivation questionnaire to examine the impact on student performance and motivation. The midterm exam is the pre-test, while the final exam is the post-test.

3.3. Data Collection and Analysis

This study conducted an experiment in an algorithm course to explore whether introducing a learning pathway system can help improve students’ learning performance and learning motivation. Therefore, 67 students were recruited in this study, and the learning motivation questionnaires were distributed and collected in the 9th and 18th weeks, with a total of 67 valid questionnaires.

This study referred to the Motivated Strategies for Learning Questionnaire (MSLQ) scale proposed by Pintrich and De Groot (1991) and edited its items to meet the requirements of this experiment. For example, “in the algorithms course, I prefer materials that are challenging so that I can learn new content”; “in the algorithms course, I prefer materials that can arouse my curiosity, even if the content is difficult”; “for me, the greatest satisfaction in the algorithm course came from trying to understand what I learned”. The questionnaire comprises eight items in two dimensions, namely, four items on intrinsic motivation and four items on extrinsic motivation. In addition, this study used reliability analysis to test the internal consistency of the learning motivation questionnaire. The Cronbach’s Alpha value is 0.817, indicating good reliability. This study also used the dependent sample t-test to analyze the learning performance and motivation of students using the learning pathway system.
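For readers unfamiliar with the reliability measure mentioned above, the following Python sketch computes Cronbach’s Alpha with the standard formula, k/(k − 1) × (1 − sum of item variances / variance of total scores). The response data are hypothetical and are not the questionnaire data of this study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """responses: list of respondents, each a list of k item scores."""
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_variances = [variance(item) for item in zip(*responses)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(r) for r in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers of five respondents to four Likert-scale items.
responses = [
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 5, 4, 5],
    [2, 3, 2, 2],
    [4, 4, 5, 4],
]
print(round(cronbach_alpha(responses), 3))
```

Values closer to 1 indicate that the items vary together, which is what is meant by internal consistency.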

4. Results

Table 1 shows the results of the dependent t-test analysis of learning performance for all students. The mean and standard deviation of the pre-test were 56.4 and 22.42. The mean and standard deviation of the post-test were 68.13 and 19.36, and the t-value was 4.612 (p < 0.001), which means that there is a significant difference in learning performance between the pre-test and post-test. On the other hand, this study divides all students into high-achieving and low-achieving students according to the pre-test results in order to analyze the differences in the learning performance of students with different achievements. There were 34 high-achieving students and 33 low-achieving students.
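The dependent (paired) t-test used in these analyses divides the mean of the pre/post differences by its standard error. The Python sketch below shows the calculation on hypothetical scores; it does not reproduce the values in Table 1, which cannot be recomputed from the reported means and standard deviations alone.

```python
import math
from statistics import mean, stdev


def paired_t(pre, post):
    """Return (t statistic, degrees of freedom) for paired samples."""
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    # Standard error of the mean difference, using the sample standard deviation.
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Hypothetical pre-test and post-test scores of four students.
pre = [50, 60, 70, 80]
post = [55, 70, 72, 85]
t_value, df = paired_t(pre, post)
print(round(t_value, 3), df)
```

The p-value is then obtained from the t distribution with n − 1 degrees of freedom, for example with a statistics package.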

Table 2 shows the results of the dependent t-test analysis of learning performance for high- and low-achieving students. High-achieving students are those with a percentile rank (PR) of 50 or above, and low-achieving students are those below PR50. The mean and standard deviation of the pre-test for high-achieving students were 74.68 and 10.99. The mean and standard deviation of the post-test for high-achieving students were 82.75 and 8.47, and the t-value was 3.394 (p < 0.01), which means that there is a significant difference in learning performance between the pre-test and post-test. The mean and standard deviation of the pre-test for low-achieving students were 37.58 and 13.84. The mean and standard deviation of the post-test for low-achieving students were 53.06 and 15.4, and the t-value was 4.466 (p < 0.001), which means that there is a significant difference in learning performance between the pre-test and post-test. According to Table 1 and Table 2, students using the learning pathway system in the algorithm course can significantly improve their learning performance.

This study also analyzed learning motivation and its two dimensions, namely intrinsic motivation and extrinsic motivation.

Table 3 shows the results of the dependent t-test analysis of learning motivation, intrinsic motivation, and extrinsic motivation for all students. In learning motivation, the mean and standard deviation of the pre-test were 3.74 and 0.54. The mean and standard deviation of the post-test were 3.76 and 0.59, and the t-value was 0.333 (p > 0.05). When analyzing learning motivation, there is no significant difference between the pre-test and post-test. In intrinsic motivation, the mean and standard deviation of the pre-test were 3.78 and 0.61. The mean and standard deviation of the post-test were 3.83 and 0.63, and the t-value was 0.815 (p > 0.05). When analyzing intrinsic motivation, there is no significant difference between the pre-test and post-test. In extrinsic motivation, the mean and standard deviation of the pre-test were 3.69 and 0.62. The mean and standard deviation of the post-test were 3.67 and 0.66, and the t-value was −0.133 (p > 0.05). When analyzing extrinsic motivation, there is no significant difference between the pre-test and post-test.

Although students did not significantly improve their learning motivation after using the system, Table 3 and Table 4 show that students’ overall learning motivation and intrinsic motivation in the post-test are higher than in the pre-test. This suggests that students can still search for auxiliary materials for algorithms in their spare time through the learning pathway provided by this system. In addition, it can also be inferred from Table 4 that introducing this system in the algorithm course may enhance students’ intrinsic motivation and enable students to learn more autonomously and spontaneously, thereby achieving the benefits of autonomous learning. This result also echoes previous research, some of which has indicated that improving intrinsic motivation in students can promote more spontaneous and autonomous learning [30,31].

In addition, this study analyzed the differences in learning motivation and its two dimensions between high- and low-achieving students. Table 4 shows the dependent t-test results for learning motivation, intrinsic motivation, and extrinsic motivation in both groups. For learning motivation, high-achieving students had a pre-test mean of 3.91 (SD = 0.48) and a post-test mean of 3.86 (SD = 0.54), with t = −0.4 (p > 0.05). Low-achieving students had a pre-test mean of 3.55 (SD = 0.55) and a post-test mean of 3.65 (SD = 0.63), with t = 0.691 (p > 0.05). Neither group shows a significant difference in learning motivation between the pre-test and post-test.

For intrinsic motivation, high-achieving students had a pre-test mean of 3.89 (SD = 0.58) and a post-test mean of 3.95 (SD = 0.62), with t = 0.469 (p > 0.05). Low-achieving students had a pre-test mean of 3.67 (SD = 0.64) and a post-test mean of 3.72 (SD = 0.64), with t = 0.366 (p > 0.05). Neither group shows a significant difference in intrinsic motivation between the pre-test and post-test.

For extrinsic motivation, high-achieving students had a pre-test mean of 3.93 (SD = 0.55) and a post-test mean of 3.78 (SD = 0.62), with t = −1.096 (p > 0.05). Low-achieving students had a pre-test mean of 3.43 (SD = 0.58) and a post-test mean of 3.57 (SD = 0.69), with t = 0.9 (p > 0.05). Neither group shows a significant difference in extrinsic motivation between the pre-test and post-test.
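The exact p-values behind the "p > 0.05" statements appear in Table 4 and can be approximately recovered from the reported t-values and degrees of freedom. The sketch below is one way to do this with the standard library only: it integrates the Student t density numerically with Simpson's rule. The function names and the step count are assumptions made for illustration, and this is not how SPSS computes the value.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t, df, steps=2000):
    """Approximate two-tailed p-value; steps must be even (Simpson's rule)."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # probability mass between 0 and |t|
    # The two tails hold whatever mass lies outside [-|t|, |t|].
    return max(0.0, 1.0 - 2.0 * central)


# t-values and degrees of freedom as reported in Table 4 (extrinsic motivation).
print(round(two_tailed_p(-1.096, 33), 3))  # high-achieving students
print(round(two_tailed_p(0.9, 32), 3))     # low-achieving students
```

Because the reported t-values are rounded, the recomputed p-values should only be expected to agree with Table 4 (0.281 and 0.375) to about two decimal places.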

In summary, this method significantly improves students' learning performance, and the learning pathway system raises students' overall learning motivation slightly, although not significantly. For low-achieving students, the post-test means for learning motivation, intrinsic motivation, and extrinsic motivation are all higher than the pre-test means. This suggests that the learning pathway provided by this study is more helpful for low-achieving students, and that using this system in the algorithm course can help them find more appropriate supplementary materials on algorithms. It can reduce information overload for low-achieving students, give them more suitable resources for autonomous learning, and strengthen their motivation to learn algorithms autonomously. Although the post-test learning motivation and extrinsic motivation of high-achieving students are lower than in the pre-test, their intrinsic motivation is still higher in the post-test. This study infers that high-achieving students can actively study more auxiliary materials on algorithms through the learning pathway of this system, thereby increasing their familiarity with and understanding of the algorithm course.

5. Discussion

SDG 4, Quality Education, aims to promote lifelong learning opportunities for all, including in higher education. The results show that the provided system can improve learning performance and, to a lesser and non-significant extent, motivation. In addition, students can search for more suitable learning materials through our system. Several studies have pointed out that the implementation of e-learning in higher education can have a positive impact on student learning [32,33] and that students can maintain better motivation in their studies [34]. Quality education (SDG 4) ensures learning opportunities for students, and the findings of this study echo the purpose of this goal [27,28].

This study aims to establish a learning pathway system to explore the learning performance and motivation of students in the algorithm course. To reduce the subjectivity of students' evaluations of article difficulty, we verified the accuracy of the difficulty classification using only articles that had received feedback from at least three students and, separately, from at least five students. A total of 153 student responses were collected for this study. The proposed method achieves 70% accuracy in classifying the difficulty of articles with feedback from at least 3 students and 76% accuracy for articles with feedback from at least 5 students, which suggests that the measured accuracy increases with the number of feedback respondents.
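The verification described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the source does not say how multiple student ratings were combined, so a majority vote is assumed here, tied votes are skipped, and the difficulty labels ("easy", "medium", "hard"), function names, and sample data are all hypothetical.

```python
from collections import Counter


def consensus_label(ratings):
    """Majority vote over student ratings; None when the top vote is tied.

    Majority voting is an assumption made for this sketch.
    """
    counts = Counter(ratings).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


def accuracy_at_min_feedback(articles, min_feedback):
    """Share of articles whose predicted difficulty matches the consensus,
    counting only articles with at least min_feedback student ratings."""
    correct = 0
    total = 0
    for predicted, ratings in articles:
        if len(ratings) < min_feedback:
            continue  # too few respondents to trust the consensus
        label = consensus_label(ratings)
        if label is None:
            continue  # no clear consensus
        total += 1
        if label == predicted:
            correct += 1
    return correct / total if total else None


# Hypothetical (predicted difficulty, student ratings) pairs, not study data.
articles = [
    ("easy", ["easy", "easy", "medium"]),
    ("hard", ["medium", "medium", "hard"]),
    ("medium", ["medium", "medium", "medium", "medium", "hard"]),
    ("hard", ["hard", "hard", "hard", "medium", "hard"]),
    ("easy", ["easy", "hard"]),
]
print(accuracy_at_min_feedback(articles, 3))
print(accuracy_at_min_feedback(articles, 5))
```

Raising the threshold shrinks the set of evaluated articles, so a higher accuracy at the stricter threshold rests on fewer articles.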

6. Conclusions

This study introduced a learning pathway system and conducted experimental activities in an algorithm course. A total of 67 valid sets of pre-tests, post-tests, and learning motivation questionnaires were collected, and dependent (paired) samples t-tests were carried out using SPSS Statistics software. In addition, students were divided into high-achieving and low-achieving groups according to the pre-test. The results showed that post-test scores were significantly higher than pre-test scores for all students. In addition, the mean intrinsic motivation of high-achieving students increased, while both the intrinsic and extrinsic motivation of low-achieving students increased, although these changes were not statistically significant.

According to the results, this method significantly improves students’ learning performance. It helps students avoid reading materials that are too difficult due to limited prior knowledge or inadequate preliminary learning, which could affect learning quality and decrease students’ motivation. In addition, this study provides keywords for learners’ reference, allowing them to quickly browse the contents of articles and select materials before reading. On the other hand, the results show that the learning pathway system can improve students’ learning motivation, although not significantly. This echoes the findings of other studies [35,36]. Through a novel approach of analyzing difficulty levels and providing keywords, this project effectively reduces the challenges in the learning process and promotes students’ motivation, ultimately achieving the goal of tailored instruction.

Author Contributions

Conceptualization, S.-C.C. and Y.-M.H.; Methodology, S.-C.C.; Software, Y.-P.C.; Validation, Y.-M.H.; Formal analysis, S.-C.C. and Y.-P.C.; Data curation, Y.-P.C.; Writing—original draft, S.-C.C. and Y.-P.C.; Writing—review & editing, Y.-M.H.; Supervision, Y.-M.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research is sponsored in part by the National Science and Technology Council, Taiwan, under Grant Nos. NSTC 109-2511-H-218-003-MY2 and 110-2511-H-006-008-MY3.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Gunawardena, L.; Pitigala Liyanage, M.P. Flipped Classrooms Using Social Networks: An Investigation on Learning Styles. In Proceedings of the 2018 7th International Congress on Advanced Applied Informatics (IIAI-AAI), Yonago, Japan, 8–13 July 2018; pp. 956–957.
2. Wang, T.-H. What Strategies Are Effective for Formative Assessment in an E-learning Environment? J. Comput. Assist. Learn. 2007, 23, 171–186.
3. Gullayanon, R. Flipping an Engineering Mathematics Classroom for a Large Undergraduate Class. In Proceedings of the 2014 IEEE International Conference on Teaching, Assessment and Learning for Engineering (TALE), Wellington, New Zealand, 8–10 December 2014; pp. 409–412.
4. Trpkovska, M.A.; Bexheti, L.A.; Cico, B. Enhancing Flipped Classroom Model Implementation. In Proceedings of the 2017 6th Mediterranean Conference on Embedded Computing (MECO), Bar, Montenegro, 11–15 June 2017; pp. 1–4.
5. Krishnan, G.; Dastakeer, W. Mobile Classroom-Blended Learning Through Use of Technology. In Proceedings of the 2019 Advances in Science and Engineering Technology International Conferences (ASET), Dubai, United Arab Emirates, 26 March–10 April 2019; pp. 1–6.
6. Ramirez-Arellano, A. Students Learning Pathways in Higher Blended Education: An Analysis of Complex Networks Perspective. Comput. Educ. 2019, 141, 103634.
7. Sanchez Bocanegra, C.L.; Sevillano Ramos, J.L.; Rizo, C.; Civit, A.; Fernandez-Luque, L. HealthRecSys: A Semantic Content-Based Recommender System to Complement Health Videos. BMC Med. Inform. Decis. Mak. 2017, 17, 63.
8. Longo, D.R.; Schubert, S.L.; Wright, B.A.; LeMaster, J.; Williams, C.D.; Clore, J.N. Health Information Seeking, Receipt, and Use in Diabetes Self-Management. Ann. Fam. Med. 2010, 8, 334–340.
9. Wang, D.; Liang, Y.; Xu, D.; Feng, X.; Guan, R. A Content-Based Recommender System for Computer Science Publications. Knowl. Based Syst. 2018, 157, 1–9.
10. Ortiz-Vilchis, P.; Ramirez-Arellano, A. Learning Pathways and Students Performance: A Dynamic Complex System. Entropy 2023, 25, 291.
11. Rahayu, N.W.; Ferdiana, R.; Kusumawardani, S.S. A Systematic Review of Learning Path Recommender Systems. Educ. Inf. Technol. 2022, 1–24.
12. Ramirez-Arellano, A.; Bory-Reyes, J.; Hernández-Simón, L.M. Emotions, Motivation, Cognitive–Metacognitive Strategies, and Behavior as Predictors of Learning Performance in Blended Learning. J. Educ. Comput. Res. 2019, 57, 491–512.
13. Raj, N.S.; Renumol, V.G. An Improved Adaptive Learning Path Recommendation Model Driven by Real-Time Learning Analytics. J. Comput. Educ. 2022, 1–28.
14. Qaiser, S.; Ali, R. Text Mining: Use of TF-IDF to Examine the Relevance of Words to Documents. Int. J. Comput. Appl. 2018, 181, 25–29.
15. Aslam, N.; Ramay, W.Y.; Xia, K.; Sarwar, N. Convolutional Neural Network Based Classification of App Reviews. IEEE Access 2020, 8, 185619–185628.
16. Xiao, L.; Yao, N. Research on Chinese Classification Based on TF-IDF. In Proceedings of the 2021 International Conference on Neural Networks, Information and Communication Engineering, Qingdao, China, 27–29 August 2021; SPIE: Bellingham, WA, USA, 2021; Volume 11933, pp. 59–64.
17. Yang, B.; Dai, G.; Yang, Y.; Tang, D.; Li, Q.; Lin, D.; Zheng, J.; Cai, Y. Automatic Text Classification for Label Imputation of Medical Diagnosis Notes Based on Random Forest. In Proceedings of the Health Information Science: 7th International Conference, HIS 2018, Cairns, QLD, Australia, 5–7 October 2018; Siuly, S., Lee, I., Huang, Z., Zhou, R., Wang, H., Xiang, W., Eds.; Springer International Publishing: Cham, Switzerland, 2018; pp. 87–97.
18. Minaee, S.; Kalchbrenner, N.; Cambria, E.; Nikzad, N.; Chenaghlu, M.; Gao, J. Deep Learning-Based Text Classification: A Comprehensive Review. ACM Comput. Surv. 2022, 54, 40.
19. Xie, J.; Chen, B.; Gu, X.; Liang, F.; Xu, X. Self-Attention-Based BiLSTM Model for Short Text Fine-Grained Sentiment Classification. IEEE Access 2019, 7, 180558–180570.
20. Hassan, A.; Mahmood, A. Convolutional Recurrent Deep Learning Model for Sentence Classification. IEEE Access 2018, 6, 13949–13957.
21. Zhao, W.; Zhang, G.; Yuan, G.; Liu, J.; Shan, H.; Zhang, S. The Study on the Text Classification for Financial News Based on Partial Information. IEEE Access 2020, 8, 100426–100437.
22. Nigam, K.; McCallum, A.K.; Thrun, S.; Mitchell, T. Text Classification from Labeled and Unlabeled Documents Using EM. Mach. Learn. 2000, 39, 103–134.
23. Ali, H.; Salleh, M.; Saedudin, R.; Hussain, K.; Mushtaq, M. Imbalance Class Problems in Data Mining: A Review. Indones. J. Electr. Eng. Comput. Sci. 2019, 14, 1560–1571.
24. Elfert, M. Lifelong Learning in Sustainable Development Goal 4: What Does It Mean for UNESCO’s Rights-Based Approach to Adult Learning and Education? Int. Rev. Educ. 2019, 65, 537–556.
25. Boeren, E. Understanding Sustainable Development Goal (SDG) 4 on “Quality Education” from Micro, Meso and Macro Perspectives. Int. Rev. Educ. 2019, 65, 277–294.
26. Fuller, S. Education Diplomacy at the Intersection of Gender Equality and Quality Education. Child. Educ. 2019, 95, 70–73.
27. Ghosn-Chelala, M. Exploring Sustainable Learning and Practice of Digital Citizenship: Education and Place-Based Challenges. Educ. Citizsh. Soc. 2019, 14, 40–56.
28. Dybach, I. Institutional Aspects of Educational Quality Management in Higher Educational Establishments. Econ. Dev. 2019, 18, 33–43.
29. Ferguson, T.; Roofe, C.G. SDG 4 in Higher Education: Challenges and Opportunities. Int. J. Sustain. High. Educ. 2020, 21, 959–975.
30. Pan, Y.; Gauvain, M. The Continuity of College Students’ Autonomous Learning Motivation and Its Predictors: A Three-Year Longitudinal Study. Learn. Individ. Differ. 2012, 22, 92–99.
31. Cheng, Y.-P.; Lai, C.-F.; Chen, Y.-T.; Wang, W.-S.; Huang, Y.-M.; Wu, T.-T. Enhancing Student’s Computational Thinking Skills with Student-Generated Questions Strategy in a Game-Based Learning Platform. Comput. Educ. 2023, 104794.
32. Airaj, M. Cloud Computing Technology and PBL Teaching Approach for a Qualitative Education in Line with SDG4. Sustainability 2022, 14, 15766.
33. Méndez, D.; Méndez, M.; Anguita, J.M. Digital Teaching Competence in Teacher Training as an Element to Attain SDG 4 of the 2030 Agenda. Sustainability 2022, 14, 11387.
34. Park, S.; Kim, S. Is Sustainable Online Learning Possible with Gamification?—The Effect of Gamified Online Learning on Student Learning. Sustainability 2021, 13, 4267.
35. Mansfield, K.J.; Peoples, G.E.; Parker-Newlyn, L.; Skropeta, D. Approaches to Learning: Does Medical School Attract Students with the Motivation to Go Deeper? Educ. Sci. 2020, 10, 302.
36. States, N.; Stone, E.; Cole, R. Creating Meaningful Learning Opportunities through Incorporating Local Research into Chemistry Classroom Activities. Educ. Sci. 2023, 13, 192.

Figure 1. Illustration of Learning Pathway to Assist Learners.

Figure 2. System Architecture to Develop the Intelligent System.

Figure 3. Flowchart for Determining Topic-Related Articles.

Figure 4. Experimental Process.

Table 1. Dependent sample t-test of learning performance for all students.

Test	N	Mean	SD	df	t	p
Pre-test	67	56.4	22.42	66	4.612 ***	<0.001
Post-test	67	68.13	19.36		4.612 ***	<0.001

Note: *** p < 0.001.

Table 2. Dependent sample t-test of learning performance for high- and low-achieving students.

Group	Test	N	Mean	SD	df	t	p
High-achieving students	Pre-test	34	74.68	10.99	33	3.394 **	0.002
High-achieving students	Post-test	34	82.75	8.47		3.394 **	0.002
Low-achieving students	Pre-test	33	37.58	13.84	32	4.466 ***	<0.001
Low-achieving students	Post-test	33	53.06	15.4		4.466 ***	<0.001

Note: ** p < 0.01, *** p < 0.001.

Table 3. Dependent sample t-test of learning motivation, intrinsic motivation, and extrinsic motivation for all students.

Measure	Test	N	Mean	SD	df	t	p
Learning motivation	Pre-test	67	3.74	0.54	66	0.333	0.74
Learning motivation	Post-test	67	3.76	0.59		0.333	0.74
Intrinsic motivation	Pre-test	67	3.78	0.61	66	0.815	0.418
Intrinsic motivation	Post-test	67	3.83	0.63		0.815	0.418
Extrinsic motivation	Pre-test	67	3.69	0.62	66	−0.133	0.895
Extrinsic motivation	Post-test	67	3.67	0.66		−0.133	0.895

Table 4. Dependent sample t-test of learning motivation, intrinsic motivation, and extrinsic motivation for high- and low-achieving students.

Measure	Group	Test	N	Mean	SD	df	t	p
Learning motivation	High-achieving students	Pre-test	34	3.91	0.48	33	−0.4	0.692
Learning motivation	High-achieving students	Post-test	34	3.86	0.54		−0.4	0.692
Learning motivation	Low-achieving students	Pre-test	33	3.55	0.55	32	0.691	0.494
Learning motivation	Low-achieving students	Post-test	33	3.65	0.63		0.691	0.494
Intrinsic motivation	High-achieving students	Pre-test	34	3.89	0.58	33	0.469	0.642
Intrinsic motivation	High-achieving students	Post-test	34	3.95	0.62		0.469	0.642
Intrinsic motivation	Low-achieving students	Pre-test	33	3.67	0.64	32	0.366	0.717
Intrinsic motivation	Low-achieving students	Post-test	33	3.72	0.64		0.366	0.717
Extrinsic motivation	High-achieving students	Pre-test	34	3.93	0.55	33	−1.096	0.281
Extrinsic motivation	High-achieving students	Post-test	34	3.78	0.62		−1.096	0.281
Extrinsic motivation	Low-achieving students	Pre-test	33	3.43	0.58	32	0.9	0.375
Extrinsic motivation	Low-achieving students	Post-test	33	3.57	0.69		0.9	0.375

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

