Big Data Cogn. Comput., Volume 8, Issue 2 (February 2024) – 8 articles

Cover Story: Empirical social research has always been data-driven. In the age of Big Data, increasing availability and quantities of data open the door for a more in-depth study and simultaneously raise challenges for researchers. This paper systematically assesses the application domain of empirical social research and its sub-domain of qualitative content analysis to propose AI-based support to its user stereotypes. In addition to defining use cases, the paper designs an information system that supports researchers with ML-based category recommendations and expert system-based dashboard recommendations. The information system was designed in compliance with the AI2VIS4BigData reference model to ensure the implemented system components' completeness, comparability, and reusability.
19 pages, 342 KiB  
Article
Inverse Firefly-Based Search Algorithms for Multi-Target Search Problem
by Ouarda Zedadra, Antonio Guerrieri, Hamid Seridi, Aymen Benzaid and Giancarlo Fortino
Big Data Cogn. Comput. 2024, 8(2), 18; https://doi.org/10.3390/bdcc8020018 - 19 Feb 2024
Viewed by 1284
Abstract
Efficiently searching for multiple targets in complex environments with limited perception and computational capabilities is challenging for multiple robots, which can coordinate their actions indirectly through their environment. In this context, swarm intelligence has been a source of inspiration for addressing multi-target search problems in the literature. Several algorithms have been proposed for this problem; in this study, we propose two novel multi-target search algorithms inspired by the Firefly algorithm. Unlike the conventional Firefly algorithm, where light is an attractor, light represents a negative effect in our proposed algorithms. Upon discovering targets, robots emit light to repel other robots from that region. This repulsive behavior is intended to achieve several objectives: (1) partitioning the search space among different robots, (2) expanding the search region by avoiding areas already explored, and (3) preventing congestion among robots. The proposed algorithms, named Global Lawnmower Firefly Algorithm (GLFA) and Random Bounce Firefly Algorithm (RBFA), integrate inverse light-based behavior with two random walks: random bounce and global lawnmower. These algorithms were implemented and evaluated using the ARGoS simulator, demonstrating promising performance compared to existing approaches.
(This article belongs to the Special Issue Big Data and Cognitive Computing in 2023)

26 pages, 4290 KiB  
Article
A Model for Enhancing Unstructured Big Data Warehouse Execution Time
by Marwa Salah Farhan, Amira Youssef and Laila Abdelhamid
Big Data Cogn. Comput. 2024, 8(2), 17; https://doi.org/10.3390/bdcc8020017 - 6 Feb 2024
Viewed by 1844
Abstract
Traditional data warehouses (DWs) have played a key role in business intelligence and decision support systems. However, the rapid growth of the data generated by current applications requires new data warehousing systems. In big data, it is important to adapt the existing warehouse systems to overcome new issues and limitations. The main drawbacks of traditional Extract–Transform–Load (ETL) are that a huge amount of data cannot be processed by ETL and that the execution time is very high when the data are unstructured. This paper focuses on a new model consisting of four layers: Extract–Clean–Load–Transform (ECLT), designed for processing unstructured big data, with specific emphasis on text. The model aims to reduce execution time through experimental procedures. ECLT is applied and tested using Spark, a framework used here from Python. Finally, this paper compares the execution time of ECLT with different models using two datasets. Experimental results showed that for a data size of 1 TB, the execution time of ECLT is 41.8 s. When the data size increases to 1 million articles, the execution time is 119.6 s. These findings demonstrate that ECLT outperforms ETL, ELT, DELT, ELTL, and ELTA in terms of execution time.
(This article belongs to the Special Issue Big Data and Information Science Technology)

20 pages, 740 KiB  
Article
Fair-CMNB: Advancing Fairness-Aware Stream Learning with Naïve Bayes and Multi-Objective Optimization
by Maryam Badar and Marco Fisichella
Big Data Cogn. Comput. 2024, 8(2), 16; https://doi.org/10.3390/bdcc8020016 - 31 Jan 2024
Viewed by 1462
Abstract
Fairness-aware mining of data streams is a challenging concern in the contemporary domain of machine learning. Many stream learning algorithms are used to replace humans in critical decision-making processes, e.g., hiring staff, assessing credit risk, etc. This calls for handling massive amounts of incoming information with minimal response delay while ensuring fair and high-quality decisions. Although deep learning has achieved success in various domains, its computational complexity may hinder real-time processing, making traditional algorithms more suitable. In this context, we propose a novel adaptation of Naïve Bayes to mitigate discrimination embedded in the streams while maintaining high predictive performance through multi-objective optimization (MOO). Class imbalance is an inherent problem in discrimination-aware learning paradigms. To deal with class imbalance, we propose a dynamic instance weighting module that gives more importance to new instances and less importance to obsolete instances based on their membership in a minority or majority class. We have conducted experiments on a range of streaming and static datasets and concluded that our proposed methodology outperforms existing state-of-the-art (SoTA) fairness-aware methods in terms of both discrimination score and balanced accuracy.
(This article belongs to the Special Issue Big Data and Cognitive Computing in 2023)
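The abstract above reports results in terms of discrimination score and balanced accuracy without defining them. As a rough illustration only, the sketch below computes balanced accuracy for binary labels and uses statistical parity difference as one common definition of a discrimination score. That choice of definition, the function names, and the 0/1 encoding of labels and group membership are assumptions made here, not details taken from the paper.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of the true-positive rate and true-negative rate (binary labels 0/1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    positives = sum(1 for t in y_true if t == 1)
    negatives = len(y_true) - positives
    return 0.5 * (tp / positives + tn / negatives)


def statistical_parity_difference(y_pred, protected):
    """P(pred = 1 | unprotected) - P(pred = 1 | protected).

    `protected` holds 1 for members of the protected group and 0 otherwise
    (a hypothetical encoding chosen for this sketch). A value of 0 means both
    groups receive positive predictions at the same rate.
    """
    prot = [p for p, g in zip(y_pred, protected) if g == 1]
    unprot = [p for p, g in zip(y_pred, protected) if g == 0]
    return sum(unprot) / len(unprot) - sum(prot) / len(prot)
```

Both functions assume that each class and each group appears at least once; a streaming implementation would instead maintain these counts incrementally.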

23 pages, 3957 KiB  
Article
A Simultaneous Wireless Information and Power Transfer-Based Multi-Hop Uneven Clustering Routing Protocol for EH-Cognitive Radio Sensor Networks
by Jihong Wang, Zhuo Wang and Lidong Zhang
Big Data Cogn. Comput. 2024, 8(2), 15; https://doi.org/10.3390/bdcc8020015 - 31 Jan 2024
Viewed by 1526
Abstract
Clustering protocols and simultaneous wireless information and power transfer (SWIPT) technology can solve the issue of imbalanced energy consumption among nodes in energy harvesting-cognitive radio sensor networks (EH-CRSNs). However, dynamic energy changes caused by EH/SWIPT and dynamic spectrum availability prevent existing clustering routing protocols from fully leveraging the advantages of EH and SWIPT. Therefore, a multi-hop uneven clustering routing protocol is proposed for EH-CRSNs utilizing SWIPT technology in this paper. Specifically, an EH-based energy state function is proposed to accurately track the dynamic energy variations in nodes. Utilizing this function, dynamic spectrum availability, neighbor count, and other information are integrated to design the criteria for selecting high-quality cluster heads (CHs) and relays, thereby facilitating effective data transfer to the sink. Intra-cluster and inter-cluster SWIPT mechanisms are incorporated to allow for the immediate energy replenishment for CHs or relays with insufficient energy while transmitting data, thereby preventing data transmission failures due to energy depletion. An energy status control mechanism is introduced to avoid the energy waste caused by excessive activation of the SWIPT mechanism. Simulation results indicate that the proposed protocol markedly improves the balance of energy consumption among nodes and enhances network surveillance capabilities when compared to existing clustering routing protocols.

19 pages, 2148 KiB  
Article
Mixture of Attention Variants for Modal Fusion in Multi-Modal Sentiment Analysis
by Chao He, Xinghua Zhang, Dongqing Song, Yingshan Shen, Chengjie Mao, Huosheng Wen, Dingju Zhu and Lihua Cai
Big Data Cogn. Comput. 2024, 8(2), 14; https://doi.org/10.3390/bdcc8020014 - 29 Jan 2024
Viewed by 1627
Abstract
With the popularization of better network access and the penetration of personal smartphones in today’s world, the explosion of multi-modal data, particularly opinionated video messages, has created urgent demands and immense opportunities for Multi-Modal Sentiment Analysis (MSA). Deep learning with the attention mechanism has served as the foundation technique for most state-of-the-art MSA models due to its ability to learn complex inter- and intra-relationships among different modalities embedded in video messages, both temporally and spatially. However, modal fusion is still a major challenge due to the vast feature space created by the interactions among different data modalities. To address the modal fusion challenge, we propose an MSA algorithm based on deep learning and the attention mechanism, namely the Mixture of Attention Variants for Modal Fusion (MAVMF). The MAVMF algorithm includes a two-stage process: in stage one, self-attention is applied to effectively extract image and text features, and the dependency relationships in the context of video discourse are captured by a bidirectional gated recurrent neural module; in stage two, four multi-modal attention variants are leveraged to learn the emotional contributions of important features from different modalities. Our proposed approach is end-to-end and has been shown to achieve superior performance to state-of-the-art algorithms when tested on the two largest public datasets, CMU-MOSI and CMU-MOSEI.

24 pages, 1853 KiB  
Article
Optimal Image Characterization for In-Bed Posture Classification by Using SVM Algorithm
by Claudia Angelica Rivera-Romero, Jorge Ulises Munoz-Minjares, Carlos Lastre-Dominguez and Misael Lopez-Ramirez
Big Data Cogn. Comput. 2024, 8(2), 13; https://doi.org/10.3390/bdcc8020013 - 26 Jan 2024
Cited by 1 | Viewed by 1549
Abstract
Identifying patient posture while they are lying in bed is an important task in medical applications such as monitoring a patient after a surgical intervention, sleep supervision to identify behavioral and physiological markers, or bedsore prevention. An acceptable strategy to identify the patient’s position is the classification of images created from a grid of pressure sensors located in the bed. These samples can be arranged based on supervised learning methods. Usually, image conditioning is required before images are loaded into a learning method to increase classification accuracy. However, continuous monitoring of a person requires large amounts of time and computational resources if complex pre-processing algorithms are used. So, the problem is to classify the image posture of patients with different weights, heights, and positions by using minimal sample conditioning for a specific supervised learning method. This work proposes identifying the patient posture from pressure sensor images by using well-known and simple conditioning techniques and selecting the optimal texture descriptors for the Support Vector Machine (SVM) method. The aim is to obtain the best classification and to avoid image over-processing in the conditioning stage for the SVM. The experimental stages are performed with the color models Red, Green, and Blue (RGB) and Hue, Saturation, and Value (HSV). The results show an increase in accuracy from 86.9% to 92.9% and in kappa value from 0.825 to 0.904 using image conditioning with histogram equalization and a median filter, respectively.
(This article belongs to the Special Issue Perception and Detection of Intelligent Vision)
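The abstract above reports both accuracy and a kappa value. Assuming the kappa in question is Cohen's kappa, which is the usual choice for classifier evaluation but is not stated in the abstract, the following minimal sketch shows how both figures can be derived from a confusion matrix. The function names and the example matrix are illustrative and are not taken from the paper.

```python
def accuracy(confusion):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def cohen_kappa(confusion):
    """Cohen's kappa: agreement corrected for chance agreement.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    observed = accuracy(confusion)
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Chance agreement: probability that truth and prediction coincide
    # if they were independent with the same marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    return (observed - expected) / (1 - expected)


# Hypothetical two-class example (not data from the paper).
example = [[45, 5],
           [5, 45]]
print(accuracy(example), cohen_kappa(example))
```

Kappa is lower than raw accuracy whenever some agreement is expected by chance, which is why the two figures in the abstract differ.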

21 pages, 4484 KiB  
Article
Data-Driven Short-Term Load Forecasting for Multiple Locations: An Integrated Approach
by Anik Baul, Gobinda Chandra Sarker, Prokash Sikder, Utpal Mozumder and Ahmed Abdelgawad
Big Data Cogn. Comput. 2024, 8(2), 12; https://doi.org/10.3390/bdcc8020012 - 26 Jan 2024
Cited by 1 | Viewed by 1625
Abstract
Short-term load forecasting (STLF) plays a crucial role in the planning, management, and stability of a country’s power system operation. In this study, we have developed a novel approach that can simultaneously predict the load demand of different regions in Bangladesh. When making predictions for loads from multiple locations simultaneously, the overall accuracy of the forecast can be improved by incorporating features from the various areas while reducing the complexity of using multiple models. Accurate and timely load predictions for specific regions with distinct demographics and economic characteristics can assist transmission and distribution companies in properly allocating their resources. Bangladesh, being a relatively small country, is divided into nine distinct power zones for electricity transmission across the nation. In this study, we have proposed a hybrid model, combining the Convolutional Neural Network (CNN) and Gated Recurrent Unit (GRU), designed to forecast load demand seven days ahead for each of the nine power zones simultaneously. For our study, nine years of data from a historical electricity demand dataset (from January 2014 to April 2023) are collected from the Power Grid Company of Bangladesh (PGCB) website. Considering the nonstationary characteristics of the dataset, the Interquartile Range (IQR) method and load averaging are employed to deal effectively with the outliers. Then, for more granularity, the dataset has been augmented with interpolation at every 1 h interval. The proposed CNN-GRU model, trained on this augmented and refined dataset, is evaluated against established algorithms in the literature, including Long Short-Term Memory Networks (LSTM), GRU, CNN-LSTM, CNN-GRU, and Transformer-based algorithms. Compared to other approaches, the proposed technique demonstrated superior forecasting accuracy in terms of mean absolute percentage error (MAPE) and root mean squared error (RMSE). The dataset and the source code are openly accessible to motivate further research.
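To make the preprocessing and evaluation terms in the abstract above concrete, here is a minimal sketch of IQR-based outlier handling and of the MAPE and RMSE metrics. It is not the authors' code: the 1.5 × IQR fence multiplier, the use of the inlier mean as the "load averaging" replacement, and all function names are assumptions made for illustration.

```python
import math
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return the (low, high) fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def replace_outliers_with_mean(values, k=1.5):
    """Replace values outside the IQR fences with the mean of the inliers.

    Assumption: the abstract's "load averaging" is approximated here by a
    single global mean; the paper may average over a local window instead.
    """
    low, high = iqr_bounds(values, k)
    inliers = [v for v in values if low <= v <= high]
    fill = sum(inliers) / len(inliers)
    return [v if low <= v <= high else fill for v in values]


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the data."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))
```

MAPE is scale-free, which makes it convenient for comparing zones with very different demand levels, while RMSE penalizes large individual errors more heavily.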

21 pages, 1133 KiB  
Article
AI-Based User Empowerment for Empirical Social Research
by Thoralf Reis, Lukas Dumberger, Sebastian Bruchhaus, Thomas Krause, Verena Schreyer, Marco X. Bornschlegl and Matthias L. Hemmje
Big Data Cogn. Comput. 2024, 8(2), 11; https://doi.org/10.3390/bdcc8020011 - 23 Jan 2024
Viewed by 1659
Abstract
Manual labeling and categorization are extremely time-consuming and, thus, costly. AI and ML-supported information systems can bridge this gap and support labor-intensive digital activities. Since it requires categorization, coding-based analysis, such as qualitative content analysis, reaches its limits with large amounts of data and could benefit from AI and ML-based support. Empirical social research, its application domain, benefits from Big Data’s ability to create more extensive human behavior and development models. A range of applications are available for statistical analysis to serve this purpose. This paper aims to implement an information system that supports researchers in empirical social research in performing AI-supported qualitative content analysis. AI2VIS4BigData is a reference model that standardizes use cases and artifacts for Big Data information systems that integrate AI and ML for user empowerment. Thus, this work’s concepts and implementations try to achieve an AI2VIS4BigData-compliant information system that supports social researchers in categorizing text data and creating insightful dashboards. Thereby, the text categorization is based on an existing ML component. Furthermore, the paper presents two evaluations that were conducted for these concepts and implementations: a qualitative cognitive walkthrough assessing the system’s usability and a quantitative user study with 18 participants. The user study revealed that although users perceive AI support as more efficient, they need more time to reflect on the recommendations. The research revealed that AI support increased the correctness of the users’ categorizations but also slowed down their decision-making. The assumption that this is due to the UI design and additional information for processing requires follow-up research.
(This article belongs to the Special Issue Big Data and Cognitive Computing in 2023)
