New Horizons in Web Search, Web Data Mining, and Web-Based Applications

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (30 August 2023) | Viewed by 15635

A printed edition of this Special Issue is available.

Special Issue Editors

School of Cyber Science and Engineering, Southeast University, Nanjing 211189, China
Interests: data mining; machine learning; crowdsourcing and human–computer interaction; intelligent systems

Guest Editor
School of Information and Engineering, Yangzhou University, Yangzhou 225127, China
Interests: text simplification; topic modeling; text mining

Guest Editor
School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
Interests: information and social networks; web data mining; complex web community discovery

Special Issue Information

Dear Colleagues,

Today’s web applications tend to be autonomous, proactive, content-exploring, self-learning, socially collaborative, location-aware, and constantly evolving. Built on cutting-edge AI technologies, they range from traditional information retrieval systems and multimedia service systems to mobile crowd sensing systems, intelligent healthcare systems, and even collaborative scientific innovation platforms. The emergence of novel applications generates massive amounts of heterogeneous data, calling for more complex and comprehensive analysis and modeling technologies to explore and exploit these data. This Special Issue invites article submissions presenting recent trends and advances in web search, data mining, and social recommendations, with special interest in novel modeling methodologies and application modes.

The topics include, but are not limited to, web search, web mining and content analysis, the Web of Things, ubiquitous and mobile computing, social networks and recommendations, human-in-the-loop and collaborative human–AI systems, NLP and interactive dialogue, and some special aspects of web applications such as privacy, fairness, ethics, and interoperability, as well as emerging novel applications such as the monitoring and prevention of epidemics, mental health and well-being, and intelligent assistants.

Dr. Jing Zhang
Dr. Jipeng Qiang
Dr. Cangqi Zhou
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • web search
  • data mining
  • machine learning
  • information retrieval
  • web applications

Published Papers (11 papers)

Editorial

5 pages, 158 KiB  
Editorial
New Horizons in Web Search, Web Data Mining, and Web-Based Applications
by Jing Zhang, Jipeng Qiang and Cangqi Zhou
Appl. Sci. 2024, 14(2), 530; https://doi.org/10.3390/app14020530 - 08 Jan 2024
Viewed by 659
Abstract
In today’s era of rapid digitization and information technology advancement, web search and web data mining stand at the core of the technological progress of numerous web-based applications [...]

Research

13 pages, 1230 KiB  
Article
Predicting Task Planning Ability for Learners Engaged in Searching as Learning Based on Tree-Structured Long Short-Term Memory Networks
by Pengfei Li, Shaoyu Dong, Yin Zhang and Bin Zhang
Appl. Sci. 2023, 13(23), 12840; https://doi.org/10.3390/app132312840 - 30 Nov 2023
Viewed by 584
Abstract
The growing utilization of web-based search engines for learning purposes has led to increased studies on searching as learning (SAL). In order to achieve the desired learning outcomes, web learners have to carefully plan their learning objectives. Previous SAL research has pointed to the significant influence of task planning quality on learning outcomes. Therefore, accurately predicting web-based learners’ task planning abilities, particularly in the context of SAL, is of paramount importance for both web-based search engines and recommendation systems. To solve this problem, this paper proposes a method for predicting the task planning ability of web learners. Specifically, we first introduced a tree-based representation method to capture how learners plan their learning tasks. Subsequently, we proposed a method based on deep learning to accurately predict the SAL task planning ability of web learners. Experimental results indicate that, compared to baseline approaches, our proposed method can provide a more effective representation of learners’ task planning and deliver more accurate predictions of learners’ task planning abilities in SAL.

20 pages, 2136 KiB  
Article
WSREB Mechanism: Web Search Results Exploration Mechanism for Blind Users
by Snober Naseer, Umer Rashid, Maha Saddal, Abdur Rehman Khan, Qaisar Abbas and Yassine Daadaa
Appl. Sci. 2023, 13(19), 11007; https://doi.org/10.3390/app131911007 - 06 Oct 2023
Viewed by 834
Abstract
In the contemporary digital landscape, web search functions as a pivotal conduit for information dissemination. Nevertheless, blind users (BUs) encounter substantial barriers in leveraging online services, attributable to intrinsic deficiencies in the information structure presented by online platforms. A critical analysis reveals that a considerable segment of BUs perceive online service access as either challenging or unfeasible, with only a fraction of search endeavors culminating successfully. This predicament stems largely from the linear nature of information interaction necessitated for BUs, a process that mandates sequential content relevancy assessment, consequently imposing cognitive strain and fostering information disorientation. Moreover, the prevailing evaluative metrics for web service efficacy (precision and recall) exhibit a glaring oversight of the nuanced behavioral and usability facets pertinent to BUs during search engine design. Addressing this, our study introduces an innovative framework to facilitate information exploration, grounded in the cognitive principles governing BUs. This framework, piloted using the Wikipedia dataset, seeks to revolutionize the search result space through categorical organization, thereby enhancing accessibility for BUs. Empirical and usability assessments, conducted on a cohort of legally blind individuals (N = 25), underscore the framework’s potential, demonstrating notable improvements in web content accessibility and system usability, with categorical accuracy standing at 84% and a usability quotient of 72.5%. This research thus holds significant promise for redefining web search paradigms to foster inclusivity and optimized user experiences for BUs.

19 pages, 10635 KiB  
Article
A Neural-Network-Based Landscape Search Engine: LSE Wisconsin
by Matthew Haffner, Matthew DeWitte, Papia F. Rozario and Gustavo A. Ovando-Montejo
Appl. Sci. 2023, 13(16), 9264; https://doi.org/10.3390/app13169264 - 15 Aug 2023
Viewed by 750
Abstract
The task of image retrieval is common in the world of data science and deep learning, but it has received less attention in the field of remote sensing. The authors seek to fill this gap in research through the presentation of a web-based landscape search engine for the US state of Wisconsin. The application allows users to select a location on the map and to find similar locations based on terrain and vegetation characteristics. It utilizes three neural network models (VGG16, ResNet-50, and NasNet) on digital elevation model data, and uses the NDVI mean and standard deviation for comparing vegetation data. The results indicate that VGG16 and ResNet-50 generally return more favorable results, and the tool appears to be an important first step toward building a more robust, multi-input, high-resolution landscape search engine in the future. The tool, called LSE Wisconsin, is hosted publicly on ShinyApps.io.
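The vegetation comparison mentioned in the abstract can be illustrated with a minimal sketch. NDVI is the standard index (NIR − Red) / (NIR + Red); everything else here is an assumption made for illustration, not the authors' implementation: the function names, the flat lists of reflectance values, and the use of a Euclidean distance between (mean, standard deviation) pairs are all hypothetical.

```python
from math import hypot
from statistics import mean, pstdev
from typing import List, Tuple


def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    # Guard against division by zero for pixels with no signal.
    return (nir - red) / total if total else 0.0


def ndvi_summary(nir_band: List[float], red_band: List[float]) -> Tuple[float, float]:
    """Return (mean, population standard deviation) of NDVI over a tile."""
    values = [ndvi(n, r) for n, r in zip(nir_band, red_band)]
    return mean(values), pstdev(values)


def vegetation_distance(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Hypothetical similarity measure: Euclidean distance between summaries."""
    return hypot(a[0] - b[0], a[1] - b[1])


# Two toy tiles: one uniformly vegetated, one with no vegetation signal.
site_a = ndvi_summary([0.6, 0.6], [0.2, 0.2])
site_b = ndvi_summary([0.5, 0.5], [0.5, 0.5])
print(vegetation_distance(site_a, site_b))
```

A smaller distance would indicate tiles with more similar vegetation statistics; a real system would likely combine this with the terrain features described above.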

16 pages, 1641 KiB  
Article
Web Page Content Block Identification with Extended Block Properties
by Kiril Griazev and Simona Ramanauskaitė
Appl. Sci. 2023, 13(9), 5680; https://doi.org/10.3390/app13095680 - 05 May 2023
Viewed by 920
Abstract
Web page segmentation is one of the most influential factors for the automated integration of web page content with other systems. Existing solutions are focused on segmentation but do not provide a more detailed description of the segment, including its range (minimum and maximum HTML code bounds, covering the segment content) and variants (the same segments with different content). Therefore, the paper proposes a novel solution designed to find all web page content blocks and detail them for further usage. It applies text similarity and document object model (DOM) tree analysis methods to indicate the maximum and minimum ranges of each identified HTML block. In addition, it indicates its relation to other blocks, including hierarchical as well as sibling blocks. The evaluation of the method reveals its ability to identify more content blocks in comparison to human labeling (in manual labeling only 24% of blocks were labeled). By using the proposed method, manual labeling effort could be reduced by at least 70%. Better performance was observed in comparison to other analyzed web page segmentation methods, and better recall was achieved due to the focus on processing every block present on a page and on providing a more detailed web page division into content block data by presenting block boundary range and block variation data.

15 pages, 8476 KiB  
Article
EFCMF: A Multimodal Robustness Enhancement Framework for Fine-Grained Recognition
by Rongping Zou, Bin Zhu, Yi Chen, Bo Xie and Bin Shao
Appl. Sci. 2023, 13(3), 1640; https://doi.org/10.3390/app13031640 - 27 Jan 2023
Viewed by 1044
Abstract
Fine-grained recognition has applications in many fields and aims to identify targets from subcategories. This is a highly challenging task due to the minor differences between subcategories. Both modal missing and adversarial sample attacks are easily encountered in fine-grained recognition tasks based on multimodal data. These situations can easily lead to the model needing to be fixed. An Enhanced Framework for the Complementarity of Multimodal Features (EFCMF) is proposed in this study to solve this problem. The model’s learning of multimodal data complementarity is enhanced by randomly deactivating modal features in the constructed multimodal fine-grained recognition model. The results show that the model gains the ability to handle modal missing without additional training and can achieve 91.14% and 99.31% accuracy on the Birds and Flowers datasets, respectively. The average accuracy of EFCMF on the two datasets is 52.85%, which is 27.13% higher than that of Bi-modal PMA when facing four adversarial example attacks, namely FGSM, BIM, PGD and C&W. In the face of missing modal cases, the average accuracy of EFCMF on the two datasets is 76.33%, which is 32.63% higher than that of Bi-modal PMA. Compared with existing methods, EFCMF is robust in the face of modal missing and adversarial example attacks in multimodal fine-grained recognition tasks. The source code is available at https://github.com/RPZ97/EFCMF (accessed on 8 January 2023).

9 pages, 396 KiB  
Article
Link Prediction with Hypergraphs via Network Embedding
by Zijuan Zhao, Kai Yang and Jinli Guo
Appl. Sci. 2023, 13(1), 523; https://doi.org/10.3390/app13010523 - 30 Dec 2022
Cited by 1 | Viewed by 1906
Abstract
Network embedding is a promising field and is important for various network analysis tasks, such as link prediction, node classification, community detection and others. Most research studies on link prediction focus on simple networks and pay little attention to hypergraphs, which provide a natural way to represent complex higher-order relationships. In this paper, we propose a link prediction method with hypergraphs using network embedding (HNE). HNE adapts a traditional network embedding method, DeepWalk, to link prediction in hypergraphs. Firstly, the hypergraph model is constructed based on heterogeneous library loan records of seven universities. With a network embedding method, the low-dimensional vectors are obtained to extract network structure features for the hypergraphs. Then, the link prediction is implemented on the hypergraphs as a classification task with machine learning. The experimental results on seven real networks show that our approach performs well for link prediction in hypergraphs. Our method will be helpful for the study of human behavior dynamics.

17 pages, 2148 KiB  
Article
Unsupervised Domain Adaptation via Stacked Convolutional Autoencoder
by Yi Zhu, Xinke Zhou and Xindong Wu
Appl. Sci. 2023, 13(1), 481; https://doi.org/10.3390/app13010481 - 29 Dec 2022
Viewed by 1768
Abstract
Unsupervised domain adaptation involves knowledge transfer from a labeled source to unlabeled target domains to assist target learning tasks. A critical aspect of unsupervised domain adaptation is the learning of more transferable and distinct feature representations from different domains. Although previous investigations, using, for example, CNN-based and auto-encoder-based methods, have produced remarkable results in domain adaptation, there are still two main problems that occur with these methods. The first is a training problem for deep neural networks; some optimization methods are ineffective when applied to unsupervised deep networks for domain adaptation tasks. The second problem that arises is that redundancy of image data results in performance degradation in feature learning for domain adaptation. To address these problems, in this paper, we propose an unsupervised domain adaptation method with a stacked convolutional sparse autoencoder, which is based on performing layer projection from the original data to obtain higher-level representations for unsupervised domain adaptation. More specifically, in a convolutional neural network, lower layers generate more discriminative features whose kernels are learned via a sparse autoencoder. A reconstruction independent component analysis optimization algorithm was introduced to perform individual component analysis on the input data. Experiments undertaken demonstrated superior classification performance of up to 89.3% in terms of accuracy compared to several state-of-the-art domain adaptation methods, such as SSRLDA and TLMRA.

13 pages, 5830 KiB  
Article
Development of a Web Application for the Detection of Coronary Artery Calcium from Computed Tomography
by Juan Aguilera-Alvarez, Juan Martínez-Nolasco, Sergio Olmos-Temois, José Padilla-Medina, Víctor Sámano-Ortega and Micael Bravo-Sanchez
Appl. Sci. 2022, 12(23), 12281; https://doi.org/10.3390/app122312281 - 30 Nov 2022
Viewed by 1691
Abstract
Coronary atherosclerosis is the most common form of cardiovascular disease, which represents the leading global cause of mortality in the adult population. The amount of coronary artery calcium (CAC) is a robust predictor of this disease that can be measured using the medical workstations of computed tomography (CT) equipment or specialized tools included in commercial software for DICOM viewers, which is not available for all operating systems. This manuscript presents a web application that semiautomatically quantifies the amount of CAC on the basis of the coronary calcium score (CS) using the Agatston technique through digital image processing. To verify the correct functioning of this web application, 30 CTCSs were analyzed by a cardiologist and compared to those of commercial software (OsiriX DICOM Viewer). All the scans were correctly classified according to the cardiovascular event risk group, with an average error in the calculation of CS of 1.9% and a Pearson correlation coefficient r = 0.9997, with potential clinical application.
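The Agatston technique named in the abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the application's code: each lesion is a 4-connected region of pixels at or above 130 HU, its area in mm² is multiplied by a density weight (1 for 130–199 HU, 2 for 200–299, 3 for 300–399, 4 for 400 and above, based on the lesion's peak attenuation), and the per-slice scores are summed. The function names, the 1 mm² minimum lesion area, and the assumption that the input already contains only pixels inside the coronary region of interest are hypothetical choices for this example.

```python
from collections import deque
from typing import List

CALCIUM_THRESHOLD_HU = 130  # standard Agatston attenuation threshold


def density_weight(peak_hu: float) -> int:
    """Map a lesion's peak attenuation (HU) to its Agatston weight."""
    if peak_hu >= 400:
        return 4
    if peak_hu >= 300:
        return 3
    if peak_hu >= 200:
        return 2
    if peak_hu >= CALCIUM_THRESHOLD_HU:
        return 1
    return 0


def agatston_slice_score(hu: List[List[float]], pixel_area_mm2: float,
                         min_area_mm2: float = 1.0) -> float:
    """Score one slice: sum of (lesion area x density weight) over lesions."""
    rows, cols = len(hu), len(hu[0])
    seen = [[False] * cols for _ in range(rows)]
    score = 0.0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or hu[r][c] < CALCIUM_THRESHOLD_HU:
                continue
            # Breadth-first search collects one 4-connected lesion.
            queue = deque([(r, c)])
            seen[r][c] = True
            n_pixels, peak = 0, hu[r][c]
            while queue:
                y, x = queue.popleft()
                n_pixels += 1
                peak = max(peak, hu[y][x])
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols and not seen[ny][nx]
                            and hu[ny][nx] >= CALCIUM_THRESHOLD_HU):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            area = n_pixels * pixel_area_mm2
            if area >= min_area_mm2:  # assumed noise filter for tiny regions
                score += area * density_weight(peak)
    return score


def agatston_total(slices: List[List[List[float]]], pixel_area_mm2: float) -> float:
    """Total score for a scan: the sum of all slice scores."""
    return sum(agatston_slice_score(s, pixel_area_mm2) for s in slices)
```

In a semiautomatic workflow such as the one described, a user would typically mark the coronary regions first, and only those pixels would be passed to a scoring routine like this one.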

21 pages, 1266 KiB  
Article
Fuzzy MLKNN in Credit User Portrait
by Zhuangyi Zhang, Lu Han and Muzi Chen
Appl. Sci. 2022, 12(22), 11342; https://doi.org/10.3390/app122211342 - 08 Nov 2022
Cited by 1 | Viewed by 1072
Abstract
Aiming at the problems of subjective enhancement caused by the discretization of credit data and the lack of a multi-dimensional portrait of credit users in the current credit data research, this paper proposes an improved Fuzzy MLKNN multi-label learning algorithm based on MLKNN. On the one hand, the subjectivity of credit data after discretization is weakened by introducing intuitionistic fuzzy numbers. On the other hand, the algorithm is improved by using the corresponding fuzzy Euclidean distance to realize the multi-label portrait of credit users. The experimental results show that Fuzzy MLKNN performs significantly better than MLKNN on credit data and has the most significant improvement on One Error.

13 pages, 426 KiB  
Article
Prompt Tuning for Multi-Label Text Classification: How to Link Exercises to Knowledge Concepts?
by Liting Wei, Yun Li, Yi Zhu, Bin Li and Lejun Zhang
Appl. Sci. 2022, 12(20), 10363; https://doi.org/10.3390/app122010363 - 14 Oct 2022
Cited by 4 | Viewed by 3269
Abstract
Exercises refer to the evaluation metric of whether students have mastered specific knowledge concepts. Linking exercises to knowledge concepts is an important foundation in multiple disciplines such as intelligent education, which represents the multi-label text classification problem in essence. However, most existing methods do not take the automatic linking of exercises to knowledge concepts into consideration. In addition, most of the widely used approaches in multi-label text classification require large amounts of training data for model optimization, which is usually time-consuming and labour-intensive in real-world scenarios. To address these problems, we propose a prompt tuning method for multi-label text classification, which can address the problem of the number of labelled exercises being small due to the lack of specialized expertise. Specifically, the relevance scores of exercise content and knowledge concepts are learned by a prompt tuning model with a unified template, and then the multiple associated knowledge concepts are selected with a threshold. An Exercises–Concepts dataset of the Data Structure course is constructed to verify the effectiveness of our proposed method. Extensive experimental results confirm our proposed method outperforms other state-of-the-art baselines by up to 35.53% and 41.78% in Micro and Macro F1, respectively.
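The threshold-based selection step and the Micro and Macro F1 metrics mentioned in the abstract can be illustrated with a short sketch. The relevance scores are assumed to come from some upstream model; the function names, the 0.5 default threshold, and the toy concept labels are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set, Tuple


def select_concepts(scores: Dict[str, float], threshold: float = 0.5) -> Set[str]:
    """Keep every concept whose relevance score reaches the threshold."""
    return {concept for concept, score in scores.items() if score >= threshold}


def _f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(gold: List[Set[str]], pred: List[Set[str]],
                   labels: List[str]) -> Tuple[float, float]:
    """Return (micro F1, macro F1) over a fixed list of labels."""
    total_tp = total_fp = total_fn = 0
    per_label = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if label in g and label in p)
        fp = sum(1 for g, p in zip(gold, pred) if label not in g and label in p)
        fn = sum(1 for g, p in zip(gold, pred) if label in g and label not in p)
        per_label.append(_f1(tp, fp, fn))
        total_tp, total_fp, total_fn = total_tp + tp, total_fp + fp, total_fn + fn
    micro = _f1(total_tp, total_fp, total_fn)   # pools counts across labels
    macro = sum(per_label) / len(per_label)     # averages per-label F1
    return micro, macro


# Toy usage: two exercises, three hypothetical data-structure concepts.
labels = ["stack", "queue", "tree"]
gold = [{"stack"}, {"queue", "tree"}]
pred = [select_concepts({"stack": 0.9, "queue": 0.1, "tree": 0.6}),
        select_concepts({"stack": 0.2, "queue": 0.4, "tree": 0.8})]
print(micro_macro_f1(gold, pred, labels))
```

Micro F1 pools true and false positives across all labels, so frequent concepts dominate it, whereas Macro F1 weights every concept equally; reporting both, as the abstract does, shows performance on rare concepts as well.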
