Communication

Analyzing the Effect of COVID-19 on Education by Processing Users’ Sentiments

1 PerLab, Faculty of Electrical and Computer Engineering, University of Birjand, Birjand 9717434765, Iran
2 Department of Electrical Engineering, University of Dubai, Dubai 14143, United Arab Emirates
3 Institute for Program Structures and Data Organization, Karlsruhe Institute of Technology, 76131 Karlsruhe, Germany
* Author to whom correspondence should be addressed.
Big Data Cogn. Comput. 2023, 7(1), 28; https://doi.org/10.3390/bdcc7010028
Submission received: 28 October 2022 / Revised: 24 December 2022 / Accepted: 29 December 2022 / Published: 30 January 2023
(This article belongs to the Topic Social Computing and Social Network Analysis)

Abstract

COVID-19 has been a major topic of discussion on social media platforms since the outbreak became a pandemic in 2020. From daily activities to direct health consequences, COVID-19 has undeniably affected lives significantly. In this paper, we analyze the effect of COVID-19 on education specifically by examining statements made on Twitter. We first propose a lexicon of education-related terms. Based on this lexicon, we automatically extract education-related tweets, as well as tweets concerning the educational parameters of learning and assessment. We then determine the location of each tweet by analyzing its content. Finally, the sentiments of the tweets are analyzed to extract the frequency trends of positive and negative tweets for the whole world, and especially for countries with a significant share of COVID-19 cases. According to these trends, individuals worldwide became concerned about education after the COVID-19 outbreak. Comparing 2020 and 2021, we find that, owing to the sudden shift from traditional to electronic education, people were significantly more concerned about education during the first year of the pandemic, and that these concerns decreased in 2021. The proposed methodology was evaluated using quantitative performance metrics, such as the F1-score, precision, and recall.

1. Introduction

COVID-19 was first identified in late 2019 and soon spread worldwide. Due to its rapid spread, the virus became one of the most widely discussed issues in the following years. It has had a significant impact on several facets of people's lives, including health [1], tourism [2], education [3], and the economy [4].
In an attempt to limit the spread of COVID-19, people worldwide were quarantined at home. These quarantines and safety protocols followed the declaration of COVID-19 as a global pandemic by the World Health Organization (WHO) [5]. Establishments such as restaurants, schools, and other learning facilities were closed. As a result, teaching and learning at various levels of education transitioned from the traditional face-to-face method to online-based education [3]. Devices connected to the Internet, such as laptops, mobile phones, and tablets, are a major part of online learning, or e-learning [3]. The closure of schools and universities, along with the shift of the education system from traditional to online education, heightened people's concerns about this issue.
Twitter, as one of the most popular social media platforms, is a good resource for exploring public opinion and extracting information [6]. People actively shared their concerns about every topic, and especially education, on Twitter. Analyzing these tweets can contribute to understanding people's feedback and responses to the shift COVID-19 caused in education. One could read and manually analyze tweets to investigate people's experiences of changes in the form of education, assessment, and virtual exams. However, artificial intelligence and, more specifically, natural language processing techniques can now be used to analyze text automatically. The first step in this approach is text mining, in order to isolate the tweets related to education, particularly to the learning and assessment parameters. Traditionally, keyword matching has been used to extract tweets; however, this approach produces many errors in identifying related tweets.
This study reports an examination and assessment of the worldwide response to the effect of COVID-19 on education by processing user tweets. Using the Oxford Dictionary of Education, we propose a lexicon of education-related words in order to identify educational tweets within a huge dataset of COVID-19-related tweets. Using this lexicon-based approach, we extract educational tweets, as well as tweets related to the educational parameters of learning and assessment, from the dataset of COVID-19-related tweets. To identify the location of tweets, we use a content analysis method [4,7], in which a geographic database of place names is used to determine the location of each tweet. There are other methods, such as geotagging, for determining the location of tweets; however, in some cases these methods may identify the location incorrectly.
To capture the opinions and thoughts of Twitter users about COVID-19 and education, we perform sentiment analysis on the related tweets using the RoBERTa language model [8] and classify them into two categories: positive and negative. We then analyze the frequency trends of all tweets, positive tweets, and negative tweets for the whole world, as well as for ten chosen countries. We also separately analyze the frequency trends of the tweets related to each of the two educational parameters of learning and assessment for the whole world. In the last step, we analyze the frequency trends of educational tweets for 2020 and 2021 (the period from 15 August to 15 September). In summary, the contributions of this research include the following:
- The identification of educational tweets by proposing a lexicon-based method.
- The identification of tweets related to the educational parameters of learning and assessment by proposing lexicon-based methods.
- The extraction and analysis of education-related sentiment trends.
The next sections of the article are organized as follows: Section 2 reviews relevant work; Section 3 introduces the proposed method; Section 4 details the analyses and findings; and Section 5 presents the concluding remarks.

2. Related Work

In the era of COVID-19, several studies [9] with different objectives have analyzed related tweets. Sentiment analysis is one of the most important processes that can be performed on tweets related to COVID-19 to extract people's thoughts and opinions regarding the issue. Several studies have focused on analyzing the sentiments of tweets related to COVID-19, including the classification of tweet sentiments into ten categories (positive, negative, anger, anticipation, disgust, fear, happiness, sadness, surprise, and trust) [10] or three categories (positive, negative, and neutral) [11,12,13]. Similarly, another study analyzed tweets related to the SARS-CoV-2 Omicron variant and categorized their sentiment into five categories ("Great", "Good", "Neutral", "Bad", and "Horrible") [14]. The top 10 topics in the English and Portuguese tweets of the United States and Brazil have also been identified and analyzed [15]. In previous studies, we analyzed users' sentiments from different countries in the first three months of the outbreak [6] and also investigated the impact of the pandemic on the economy [4].
To investigate the effects of COVID-19 on education through the processing of users' tweets, several studies have collected and analyzed related tweets. To analyze Australian people's views on home study during the COVID-19 pandemic, 10,421 tweets were collected over three weeks, and their sentiments were classified into six categories (positive, negative, sense of humor, teacher appreciation, government/politician feedback, and compliments) [16]. Similarly, Indonesians' tweets about online learning were collected in October 2020, and their sentiments were analyzed in three categories (positive, negative, and neutral) [17]. Another study [18] collected and examined 17,155 related tweets to analyze sentiments and topics in tweets about online education during COVID-19. Table 1 summarizes previous works.
Previous studies have processed a small number of tweets over a short period of time. Estimating users' responses to the impact of COVID-19 on education requires longer time periods and an analysis of a much larger number of tweets. For this reason, this article analyzes 15 million tweets posted in 2020 and 2021; the more tweets processed, the more accurate the results will be. Moreover, sentiment analysis is performed separately for ten countries to enable a comparison of users' attitudes across different countries. Finally, unlike previous studies, which were based on geotagging, we use a dictionary-based method to tag the location of tweets so that the location can be inferred more accurately.

3. Data Extraction and Analysis

A three-step method is used to analyze the effect of COVID-19 on education through the processing of users’ tweets (Figure 1). In the first step, we pre-process a large dataset of tweets related to COVID-19 [19] and then determine their location. In the second step, we thematically extract the tweets using a lexicon-based method. In the third step, we analyze the sentiments of the tweets and classify them into two categories of positive and negative tweets.

3.1. Data Collection

Several studies, such as Refs. [19,20,21], have collected datasets of tweets related to COVID-19 at different times. The dataset provided in Ref. [20] is older and does not include tweets from 2021. In this research, a dataset was prepared by weekly and daily sampling from a comprehensive dataset of tweets about COVID-19 [19]. The sampled dataset contains more than 15 million tweets related to COVID-19 for the periods of March to June 2020 (the beginning of the COVID-19 pandemic), 15 August to 15 September 2020 (the beginning of the 2020 school year), and 15 August to 15 September 2021 (the beginning of the 2021 academic year). Because the dataset is sampled from different time periods, it captures how people's attitudes changed over the course of the pandemic.
The lexicon-based method of our previous work [4,7] is used to determine the location of tweets. The method uses the GeoNames geographic database (containing over 25,000,000 place names and associated geographic information) to create a list of place names for the countries with the highest numbers of COVID-19 cases. This list contains complete and detailed information about the names of the states, provinces, and cities of each of the countries in question. The compiled list contains more than 7000 place names that can pinpoint the location of each tweet precisely. This list is used as a Gazetteer list in a GATE pipeline [23]. Each tweet that mentions the name of a country's state, city, etc., is given that country's tag as its location. For example, the USA tag is assigned as the location of the following tweet:
“Over 85% of corona cases in Michigan are located in Detroit and surrounding areas”.
Similarly, if any of the place names listed in Figure 2 appears in the text of a tweet, Pakistan is considered the location of that tweet.
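The original pipeline is implemented in GATE; purely as an illustration, a minimal Python sketch of the same gazetteer-lookup idea could look as follows. The place names and the `gazetteer` dictionary here are tiny stand-ins, not the actual GeoNames-derived list of 7000+ entries.

```python
# Minimal sketch of gazetteer-based location tagging (illustrative place names only;
# the study itself uses a GeoNames-derived list of 7000+ names inside a GATE pipeline).
import re

gazetteer = {
    "USA": ["michigan", "detroit", "new york", "california"],
    "Pakistan": ["karachi", "lahore", "islamabad", "punjab"],
}

# Pre-compile one word-boundary pattern per country for fast lookup.
patterns = {
    country: re.compile(r"\b(" + "|".join(map(re.escape, names)) + r")\b")
    for country, names in gazetteer.items()
}

def tag_location(tweet_text: str) -> str | None:
    """Return the first country whose place names appear in the tweet, else None."""
    text = tweet_text.lower()
    for country, pattern in patterns.items():
        if pattern.search(text):
            return country
    return None

print(tag_location("Over 85% of corona cases in Michigan are located in Detroit"))  # USA
```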

3.2. Thematic Extraction of Tweets

To extract tweets related to education, we propose a lexicon-based method by creating a glossary of education-related words. For this purpose, we start from the full vocabulary of the Oxford Dictionary of Education [24], which contains 1100 words. We apply the proposed lexicon to the existing database and calculate the precision and recall of the retrieved data. Then, by inspecting the output data, we remove the misleading words from the lexicon. After refining the lexicon, we run the method and calculate the precision and recall again. We repeat this process until acceptable precision and recall are achieved. The vocabulary of the lexicon is eventually reduced to 134 words, which are shown in Figure 3. After compiling the dictionary, tweets related to education are extracted using the proposed lexicon-based method.
Learning and assessment have always been two important factors in education [22,25]. The importance of these two subjects, as well as the ever-growing concern about them, has become even more relevant following the COVID-19 pandemic and the change in teaching and learning methodologies. We therefore use a lexicon-based method to extract tweets related to each of these educational factors as well. From the words of the education lexicon, we select the words related to each of the two parameters and create a dictionary for each of them. The dictionaries for learning and assessment consist of 12 and 13 words, respectively; their vocabulary is shown in Figure 4. Using the compiled dictionaries, the educational tweets related to each of the two parameters can be identified, as illustrated by the sketch below.
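The following is a compact sketch of lexicon-based filtering under stated assumptions: the word lists below are small stand-ins for the 134-word education lexicon and the 12- and 13-word learning and assessment lexicons, and the matching rule (whole-token overlap) is one plausible reading of the method.

```python
# Sketch of lexicon-based thematic extraction; the word sets are illustrative stand-ins
# for the education, learning, and assessment lexicons described in the text.
import re

education_lexicon = {"school", "university", "exam", "homework", "teacher", "semester"}
learning_lexicon = {"learning", "lesson", "homework", "study"}
assessment_lexicon = {"exam", "grade", "assessment", "test"}

token_pattern = re.compile(r"[a-z']+")

def matches(text: str, lexicon: set[str]) -> bool:
    """True if any lexicon word occurs as a whole token in the tweet."""
    return not lexicon.isdisjoint(token_pattern.findall(text.lower()))

tweets = [
    "My university moved the final exam online because of covid",
    "Stocking up on groceries before the lockdown",
]
education_tweets = [t for t in tweets if matches(t, education_lexicon)]
assessment_tweets = [t for t in education_tweets if matches(t, assessment_lexicon)]
print(len(education_tweets), len(assessment_tweets))  # 1 1
```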

3.3. Sentiment Analysis

Sentiment Analysis (SA), or Opinion Mining (OM), is defined as the "computational study of people's opinions, attitudes, and emotions toward an entity. The entity can represent individuals, events, or topics" [26]. We perform the sentiment analysis of the tweets using the RoBERTa sentiment classification model [8]. This model is one of the recent models for sentiment analysis and improves upon BERT; we chose it because of its excellent performance, high speed, and accuracy [27].
Three labeled datasets, the Stanford Sentiment Treebank (67,300 samples) [28], SemEval 2015 Task 10 (6800 samples) [28], and SemEval 2015 Task 11 (3500 samples) [29], have been used to train the sentiment classification model. Since three datasets are used to train the model, a separate classifier is considered for each. The input of the model is the text of a tweet (string), and each token t (t = 1, …, T) is mapped to a representation h_t. Each of the three classifiers receives a pooled tweet representation H as input, which is calculated through the following equation:

$$H = \sum_{t} e_t\, h_t$$

in which

$$e_t = \frac{\exp(\alpha_t)}{\sum_{j} \exp(\alpha_j)}, \qquad \alpha_t = w_{\mathrm{att}}^{\top} h_t$$

In these relations, w_att is the attention weight vector and h_t is the representation of token t.
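To make the notation concrete, the following is a minimal NumPy sketch of this attention pooling. The shapes and the randomly initialized w_att are illustrative assumptions; in the actual model, w_att is a learned parameter.

```python
# Attention pooling over token representations h_t, as in the equations above.
# Shapes and the random weights are illustrative; in the real model w_att is learned.
import numpy as np

rng = np.random.default_rng(0)
T, d = 12, 768                 # number of tokens, hidden size
h = rng.normal(size=(T, d))    # h_t: one row per token representation
w_att = rng.normal(size=d)     # attention weight vector

alpha = h @ w_att                        # alpha_t = w_att . h_t
e = np.exp(alpha - alpha.max())          # softmax (shifted for numerical stability)
e = e / e.sum()                          # e_t = exp(alpha_t) / sum_j exp(alpha_j)
H = e @ h                                # H = sum_t e_t * h_t -> pooled tweet vector

print(H.shape)  # (768,)
```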
The final tag of the tweet in the testing phase is determined by the majority vote of the three classifiers. In short, the RoBERTa model tags the tweets with positive content as one and tags the tweets with negative content as zero. As a result, the sentiments of the tweets are categorized into positive and negative groups.
After identifying the sentiments of the tweets using the language-based sentiment classification model of RoBERTa, we obtain the frequency of positive tweets and negative tweets for the whole world, and for each country separately. To this end, we calculate the frequency of positive and negative tweets for the whole world and for the countries with a significant number of related tweets.
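The trained classifier itself is not published with the paper. As a hedged illustration only, the sketch below uses a publicly available Twitter-RoBERTa sentiment checkpoint (an assumption, not the authors' model) to show how per-tweet labels can be aggregated into positive and negative frequencies per country.

```python
# Sketch: labelling tweets with a public Twitter-RoBERTa checkpoint (a stand-in for the
# paper's own classifier) and counting positive/negative frequencies per country.
from collections import Counter
from transformers import pipeline

# Model name is an assumption: any RoBERTa-based sentiment checkpoint would serve here.
clf = pipeline("sentiment-analysis",
               model="cardiffnlp/twitter-roberta-base-sentiment-latest")

tweets = [
    {"text": "Online exams are so stressful, I miss my classroom", "country": "USA"},
    {"text": "E-learning finally lets me study at my own pace", "country": "India"},
]

counts = Counter()
for tweet in tweets:
    label = clf(tweet["text"])[0]["label"].lower()   # e.g. "positive" / "negative" / "neutral"
    if label in ("positive", "negative"):
        counts[(tweet["country"], label)] += 1

print(counts)
```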

3.4. Evaluation Scheme

In this step, we assess the performance of the proposed method based on widely used criteria: precision, recall, and F1-score. To describe these criteria, we consider the confusion matrix presented in Table 2:
The evaluation criteria (precision, recall, and F1 score) can be calculated according to the confusion matrix as follows:
$$\text{Precision} = \frac{TP}{TP + FP}$$

$$\text{Recall} = \frac{TP}{TP + FN}$$

$$\text{F1-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
To calculate the evaluation criteria, we first randomly sample 1000 related tweets from the entire database. We execute the proposed method on this sampled dataset and independently annotate it manually as a benchmark. Then, using the above equations, the evaluation criteria for the different components of the proposed method are calculated; the results are shown in Table 3.
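As a minimal sketch of this evaluation step, with toy labels standing in for the 1000 manually annotated tweets, the standard scikit-learn metrics implement the equations above:

```python
# Precision, recall, and F1 over a manually annotated sample
# (toy labels shown here; 1 = education-related, 0 = not related).
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]   # manual annotation (benchmark)
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]   # output of the lexicon-based method

print(f"Precision: {precision_score(y_true, y_pred):.3f}")
print(f"Recall:    {recall_score(y_true, y_pred):.3f}")
print(f"F1-score:  {f1_score(y_true, y_pred):.3f}")
```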

4. Discussion and Findings

To handle the variety of languages in the tweets extracted from different locations around the world, we used the GATE software to implement the lexicon-based methods. GATE is a framework for developing natural language processing components and is capable of processing a variety of languages [30]. Using a GATE pipeline, in which the lexicon of places serves as the Gazetteer list, a place name is determined for each tweet. We also utilize GATE to extract tweets related to education and tweets related to each of the two parameters of learning and assessment.
In what follows, the results and findings of this study are presented.

4.1. Sentiment Analysis Trends

This section analyzes the trends in the number of education-related tweets in the first three months of the global outbreak. Figure 5 and Figure 6 demonstrate the frequency of all education tweets, positive tweets, and negative tweets, as well as the official statistics of the COVID-19 cases for the whole world and for the ten chosen countries, respectively. The vertical axis on the left represents the official statistics of the patients, and the vertical axis on the right denotes the frequency of tweets.
What is evident in the figures is that, over the chosen time period, as the number of patients increases, so does the number of tweets related to COVID-19 and education. Moreover, most of the related tweets have negative content. To clarify the reason for this negative sentiment, we examined the text of the tweets in the weeks with the most negative tweets. We found that people's main concerns are often about how to graduate, exam cancellations, and school and university closures. It can therefore be concluded that, following the outbreak of COVID-19, people around the world were concerned about their own education or that of their children.
The studied time period is the onset of the global COVID-19 pandemic, and the higher number of tweets with negative content is indicative of people’s concern and anxiety.

4.2. Analyzing Tweets from the Perspective of Educational Parameters

This subsection analyzes the frequency trend of tweets related to each of the two parameters of learning and assessment. Figure 7 displays the frequency of learning and assessment tweets for the whole world.
During the onset of the COVID-19 pandemic, when education shifted from traditional to electronic methods, assessment-related tweets gradually outnumbered learning-related ones. In June in particular, the number of tweets posted about assessment increased significantly. Examining the text of the related tweets, we find that people's discussions and worries centered on how exams would be held.

4.3. Comparing the Trend of Tweets in 2020 and 2021

This section compares the frequency of education-related tweets for a one-month period at the beginning of the school year (15 August to 15 September) in 2020 and 2021. Figure 8 shows the frequency of education-related tweets in 2020 and 2021.
Comparing the years 2020 and 2021, it is clear that the frequency of education tweets at the beginning of the 2020 school year was much higher than that of 2021. This reveals that people were much more concerned about education in the first year of the global COVID-19 pandemic, following the sudden shift from traditional to electronic education, whereas by 2021 those concerns had largely subsided.

5. Conclusions

In this paper, we investigated the response of individuals to the effect of COVID-19 on education by extracting people's sentiments about education on social media, particularly Twitter. We used a three-step approach consisting of data collection, thematic extraction of tweets, and sentiment analysis, and investigated the two parameters of learning and assessment. We then calculated the precision, recall, and F1-score to evaluate the extraction of education-related, learning-related, and assessment-related tweets, obtaining F1-scores of 0.721, 0.827, and 0.77, respectively, which are reasonable considering the numerous features and parameters involved in text processing.
The results indicate that, in the world as a whole and in most of the studied countries, the majority of tweets had negative content, suggesting people's intense concern for their own education or that of their children during the pandemic. Moreover, the frequency of education tweets in 2020 (between 15 August and 15 September) was consistently greater than in 2021. Indeed, in the second year of the COVID-19 pandemic (the beginning of the 2021 school year), public concern and debate had largely subsided, which could be due to the maturation of e-learning or satisfaction with it.
One limitation of this work is that only English-language tweets were processed and analyzed. In addition, location extraction may fail for some tweets, especially when no place name appears in the text or the name is not known to our lexicon. Other limitations include the lack of sufficient tweets for many countries; consequently, the analysis was performed for only ten countries.
To more accurately examine the reasons for negative and positive sentiments at different times, we can perform topic modeling on negative and positive tweets. As a result, the negative and positive topics discussed can be identified more comprehensively. As another future direction, the countries in question can be clustered with the help of their sentiment analysis trends by applying the clustering methods for time series data.

Author Contributions

Conceptualization, H.V.-N.; methodology, M.J. and H.V.-N.; software, M.J.; validation, H.V.-N., W.M. and A.C.; investigation, H.H.; resources, H.H.; data curation, M.J. and H.V.-N.; writing—original draft preparation, M.J.; writing—review and editing, H.V.-N., W.M., A.C. and H.H.; visualization, M.J.; supervision, H.V.-N. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset generated during the current study is available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Escandón, K.; Rasmussen, A.L.; Bogoch, I.I.; Murray, E.J.; Escandón, K.; Popescu, S.V.; Kindrachuk, J. COVID-19 false dichotomies and a comprehensive review of the evidence regarding public health, COVID-19 symptomatology, SARS-CoV-2 transmission, mask wearing, and reinfection. BMC Infect. Dis. 2021, 21, 1–47.
2. Hajiabadi, M.; Vahdat-Nejad, H.; Hajiabadi, H. COVID-19 and Tourism: Extracting Public Attitudes. Current Issues in Tourism; Taylor & Francis: Abingdon, UK, 2022; pp. 1–7.
3. Zhu, X.; Liu, J. Education in and after COVID-19: Immediate responses and long-term visions. Postdigital Sci. Educ. 2020, 2, 695–699.
4. Salmani, F.; Vahdat-Nejad, H.; Hajiabadi, H. Analyzing the Impact of COVID-19 on Economy from the Perspective of User's Reviews. In Proceedings of the 2021 11th International Conference on Computer Engineering and Knowledge (ICCKE), Mashhad, Iran, 28–29 October 2021; pp. 30–33.
5. Duc-Long, L.; Thien-Vu, G.; Dieu-Khuon, H. The Impact of the COVID-19 Pandemic on Online Learning in Higher Education: A Vietnamese Case. Eur. J. Educ. Res. 2021, 10, 1683–1695.
6. Vahdat-Nejad, H.; Salmani, F.; Hajiabadi, M.; Azizi, F.; Abbasi, S.; Jamalian, M.; Mosafer, R.; Bagherzadeh, P.; Hajiabadi, H. Extracting Feelings of People Regarding COVID-19 by Social Network Mining. J. Inf. Knowl. Manag. 2022, 21, 2240008.
7. Azizi, F.; Vahdat-Nejad, H.; Hajiabadi, H.; Khosravi, M.H. Extracting Major Topics of COVID-19 Related Tweets. In Proceedings of the 2021 11th International Conference on Computer Engineering and Knowledge (ICCKE), Mashhad, Iran, 28–29 October 2021; pp. 25–29.
8. Liu, Y.; Ott, M.; Goyal, N.; Du, J.; Joshi, M.; Chen, D.; Levy, O.; Lewis, M.; Zettlemoyer, L.; Stoyanov, V. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv 2019, arXiv:1907.11692.
9. Ellouze, M.; Mechti, S.; Krichen, M.; Ravi, V.; Belguith, L.H. A deep learning approach for detecting the behaviour of people having personality disorders towards COVID-19 from Twitter. Int. J. Comput. Sci. Eng. 2022, 25, 353–366.
10. Drias, H.; Drias, Y. Mining Twitter Data on COVID-19 for Sentiment Analysis and Frequent Patterns Discovery. medRxiv 2020.
11. Manguri, K.H.; Ramadhan, R.N.; Amin, P.R.M. Twitter sentiment analysis on worldwide COVID-19 outbreaks. Kurd. J. Appl. Res. 2020, 5, 54–65.
12. Vernikou, S.; Lyras, A.; Kanavos, A. Multiclass sentiment analysis on COVID-19-related tweets using deep learning models. Neural Comput. Appl. 2022, 34, 19615–19627.
13. Yin, H.; Song, X.; Yang, S.; Li, J. Sentiment analysis and topic modeling for COVID-19 vaccine discussions. World Wide Web 2022, 25, 1067–1083.
14. Thakur, N.; Han, C.Y. An Exploratory Study of Tweets about the SARS-CoV-2 Omicron Variant: Insights from Sentiment Analysis, Language Interpretation, Source Tracking, Type Classification, and Embedded URL Detection. COVID 2022, 2, 1026–1049.
15. Garcia, K.; Berton, L. Topic detection and sentiment analysis in Twitter content related to COVID-19 from Brazil and the USA. Appl. Soft Comput. 2021, 101, 107057.
16. Ewing, L.-A.; Vu, H.Q. Navigating 'home schooling' during COVID-19: Australian public response on Twitter. Media Int. Aust. 2021, 178, 77–86.
17. Sahir, S.H.; Ramadhana, R.S.A.; Marpaung, M.F.R.; Munthe, S.R.; Watrianthos, R. Online learning sentiment analysis during the COVID-19 Indonesia pandemic using Twitter data. IOP Conf. Ser. Mater. Sci. Eng. 2021, 1156, 012011.
18. Mujahid, M.; Lee, E.; Rustam, F.; Washington, P.B.; Ullah, S.; Reshi, A.A.; Ashraf, I. Sentiment Analysis and Topic Modeling on Tweets about Online Education during COVID-19. Appl. Sci. 2021, 11, 8438.
19. Lamsal, R. Coronavirus (COVID-19) Tweets Dataset. IEEE DataPort 2022, 10.
20. Lamsal, R. Design and analysis of a large-scale COVID-19 tweets dataset. Appl. Intell. 2021, 51, 2790–2804.
21. Thakur, N. A Large-Scale Dataset of Twitter Chatter about Online Learning during the Current COVID-19 Omicron Wave. Data 2022, 7, 109.
22. Bjork, R.A.; Yan, V.X. The increasing importance of learning how to learn. In Integrating Cognitive Science with Innovative Teaching in STEM Disciplines; McDaniel, M.A., Frey, R.F., Fitzpatrick, S.M., Roediger, H.L., III, Eds.; Washington University: St. Louis, MO, USA, 2014; pp. 15–36.
23. GATE. Available online: https://gate.ac.uk/ (accessed on 1 June 2022).
24. Atkins, L. Social Class; Oxford University Press: Oxford, UK, 2009.
25. Brown, G.A.; Bull, J.; Pendlebury, M. Assessing Student Learning in Higher Education; Routledge: Abingdon, UK, 2013.
26. Medhat, W.; Hassan, A.; Korashy, H. Sentiment analysis algorithms and applications: A survey. Ain Shams Eng. J. 2014, 5, 1093–1113.
27. Ghasiya, P.; Okamura, K. Investigating COVID-19 news across four nations: A topic modeling and sentiment analysis approach. IEEE Access 2021, 9, 36645–36656.
28. Socher, R.; Perelygin, A.; Wu, J.; Chuang, J.; Manning, C.D.; Ng, A.Y.; Potts, C. Recursive deep models for semantic compositionality over a sentiment treebank. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, Seattle, WA, USA, 18–21 October 2013; pp. 1631–1642.
29. Ghosh, A.; Li, G.; Veale, T.; Rosso, P.; Shutova, E.; Barnden, J.; Reyes, A. SemEval-2015 Task 11: Sentiment Analysis of Figurative Language in Twitter. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), Denver, CO, USA, 4–5 June 2015; pp. 470–478.
30. Cunningham, H.; Maynard, D.; Bontcheva, K.; Tablan, V.; Ursu, C.; Dimitrov, M.; Funk, A. Developing language processing components with GATE. In GATE v2.0 User Guide; University of Sheffield: Sheffield, UK, 2001.
Figure 1. Proposed method.
Figure 2. Names of places in Pakistan.
Figure 3. The vocabulary of the proposed education dictionary.
Figure 4. Vocabulary for learning (right) and assessment (left).
Figure 5. Frequency of all, positive, and negative tweets, and the official COVID-19 cases for the whole world.
Figure 6. Frequency of all, positive, and negative tweets, and official COVID-19 cases for several countries.
Figure 7. Frequency of tweets related to learning and assessment parameters.
Figure 8. Frequency of education tweets between August 15 and September 15 for 2020 and 2021.
Table 1. A summary of the discussed research works.

Ref. | Author | Year | Dataset | Domain | Contribution
[9] | Ellouze M et al. | 2022 | 93,000 tweets | COVID-19 | Classification
[10] | Drias H et al. | 2020 | More than 600,000 tweets | COVID-19 | Sentiment analysis
[11] | Manguri KH et al. | 2020 | 500,000 tweets | COVID-19 | Sentiment analysis
[12] | Vernikou S et al. | 2022 | 44,955 tweets | COVID-19 | Sentiment analysis
[13] | Yin H et al. | 2022 | 78,827 tweets | COVID-19 vaccine | Sentiment analysis and topic modeling
[14] | Thakur N et al. | 2022 | 12,028 tweets | SARS-CoV-2 Omicron | Sentiment analysis
[15] | Garcia K et al. | 2021 | 14,200,000 tweets | COVID-19 | Topic detection and sentiment analysis
[6] | Vahdat-Nejad H et al. | 2022 | 2,100,000 tweets | COVID-19 | Sentiment analysis
[4] | Salmani F et al. | 2021 | 2,100,000 tweets | COVID-19 and economy | Sentiment analysis
[16] | Ewing L-A et al. | 2021 | 10,421 tweets | COVID-19 and home education | Sentiment analysis
[17] | Sahir SH et al. | 2021 | 159,045 tweets | COVID-19 and online learning | Sentiment analysis
[18] | Mujahid M et al. | 2021 | 17,155 tweets | COVID-19 and online education | Sentiment analysis and topic modeling
Table 2. Confusion matrix.

 | Positive (Predicted) | Negative (Predicted)
Positive (real) | True Positive (TP) | False Negative (FN)
Negative (real) | False Positive (FP) | True Negative (TN)
Table 3. Evaluation parameters.

Model | Precision | Recall | F1-Score
Extracting the education-related tweets | 0.619 | 0.866 | 0.721
Extracting learning parameter tweets | 0.716 | 0.983 | 0.827
Extracting assessment parameter tweets | 0.64 | 0.97 | 0.77
