Methods and Applications of Data Mining in Business Domains

A project collection of Applied Sciences (ISSN 2076-3417). This project collection belongs to the section "Computing and Artificial Intelligence".

Papers displayed on this page all arise from the same project. Editorial decisions were made independently of project staff and handled by the Editor-in-Chief or qualified Editorial Board members.

Viewed by 56931

Editors


Guest Editor
Amsterdam Business School, University of Amsterdam, Postbus 15953, 1001 NL Amsterdam, The Netherlands
Interests: data analytics; social networks; open source development; NLP
Faculty of Behavioural, Management and Social Sciences (BMS), University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands
Interests: text mining, natural language processing; artificial intelligence, machine learning; data mining

Project Overview

Dear Colleagues,

Researchers are invited to contribute research that presents original scientific results, concepts, and methods in the field of data mining applied to various domains, such as healthcare, software development, logistics and human resources. We are especially interested in how the data mining method was modified to cater to the specific domain in question. The challenge is that the more complex a domain is, the harder it is to make good predictions, because more implicit domain knowledge is required and that knowledge is not always available. This is especially true in complex domains where soft factors, such as the interaction of the conflicting and cooperating objectives of the stakeholders, and system dynamics play a significant role. In a business context, one would like to see (i) how the algorithms can be made repeatable in the real world, (ii) how the mined patterns can be utilized by the business and (iii) how the resulting model can be understood and utilized in the business environment. Furthermore, the idea is to identify the variables that impact the goal variable while respecting the data, interestingness, deployment and general business constraints of the domain.

One method for analysing a complex domain is intelligence meta-synthesis. Intelligence synthesis is the collection and creation of perceived or understood (i.e., not necessarily objective) information. Meta-synthesis is the collection and creation of knowledge and information from collected intelligences. The goal of this approach is to design and develop predictive models that could eventually be incorporated into a business intelligence dashboard. As a result, one would (i) understand the nature and origin of the data, which allows the system user to determine the quality of the data and to perform data cleaning; (ii) understand the factors in the domain that influence the predicted variable, so that the developer can determine which variables need to be included in the predictive model; (iii) develop predictive models that are usable and interesting within the domain in terms of predictive power, integration with existing infrastructure, and integration with business rules and processes; and finally (iv) use the predicted data to optimize business processes in the particular domain.

The main goal of this collection is to bring together researchers, participants, academic scientists and contributors to share their experiences and to present and discuss ongoing and recent research results. These may cover theoretical and methodological contributions to existing work as well as the development of new methods and approaches for data mining in business domains.

Dr. Chintan Amrit
Dr. Asad Abdi
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the collection website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Domain Analysis;
  • Domain driven data mining;
  • Domain Knowledge Discovery and Extraction;
  • Domain Information Extraction and Retrievals;
  • Data-driven large scale optimizations for data mining in big data;
  • Feature selection and extraction methodologies to attribute reductions in high-dimensional and large-scale data;
  • Low quality and/or noisy big data mining problems;
  • Real-world big data applications using data mining approaches;
  • Domain Driven Sentiment analysis, emotion detection, and opinion mining;
  • Model usability and understandability;
  • Explainable machine learning models;
  • Applications in science, engineering, medicine, healthcare, finance, business, law, education, transportation and retailing.

Published Papers (21 papers)

2023

4 pages, 204 KiB  
Editorial
Methods and Applications of Data Mining in Business Domains
by Chintan Amrit and Asad Abdi
Appl. Sci. 2023, 13(19), 10774; https://doi.org/10.3390/app131910774 - 28 Sep 2023
Viewed by 799
Abstract
This Special Issue invited researchers to contribute original research in the field of data mining, particularly in its application to diverse domains, like healthcare, software development, logistics, and human resources [...] Full article
23 pages, 4818 KiB  
Article
Enhancing Retail Transactions: A Data-Driven Recommendation Using Modified RFM Analysis and Association Rules Mining
by Angela Hsiang-Ling Chen and Sebastian Gunawan
Appl. Sci. 2023, 13(18), 10057; https://doi.org/10.3390/app131810057 - 06 Sep 2023
Cited by 1 | Viewed by 1481
Abstract
Retail transactions have become an integral part of the economic cycle of every country and even on a global scale. Retail transactions are a trade sector that has the potential to be developed continuously in the future. This research focused on building a specified and data-driven recommendation system based on customer-purchasing and product-selling behavior. The approach combined a modified RFM analysis, which adds two variables, namely periodicity and customer engagement index; clustering algorithms such as K-means clustering and Ward’s method; association rules to determine the pattern of the cause–effect relationship in each transaction; and four types of classifiers to apply and validate the recommendation system. The results showed that, based on customer behavior, customers should be split into two groups: loyal and potential customers. Product behavior, in contrast, comprised three groups: bestseller, profitable, and VIP product groups. Based on the results, K-nearest neighbor is the most suitable classifier, with a low chance of overfitting and a higher performance index. Full article
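For readers unfamiliar with RFM analysis, the sketch below shows only the classic recency, frequency and monetary computation. It is not the authors' modified method: the two added variables (periodicity and customer engagement index) are not defined in the abstract and are left out, and the function name `rfm_table` and the sample transactions are hypothetical.

```python
from datetime import date


def rfm_table(transactions, today):
    """Return {customer: (recency_days, frequency, monetary)}.

    transactions: iterable of (customer, purchase_date, amount) tuples.
    """
    table = {}
    for customer, day, amount in transactions:
        # Track the latest purchase date, the purchase count and the total spend.
        last, freq, total = table.get(customer, (day, 0, 0.0))
        table[customer] = (max(last, day), freq + 1, total + amount)
    return {
        customer: ((today - last).days, freq, total)
        for customer, (last, freq, total) in table.items()
    }


# Hypothetical sample data, for illustration only.
transactions = [
    ("alice", date(2023, 1, 5), 40.0),
    ("alice", date(2023, 3, 1), 60.0),
    ("bob", date(2023, 2, 10), 15.0),
]
rfm = rfm_table(transactions, today=date(2023, 3, 31))
```

In a typical workflow, the three values per customer would then be scaled and passed to a clustering step such as K-means.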
20 pages, 2302 KiB  
Article
Hybrid Sampling and Dynamic Weighting-Based Classification Method for Multi-Class Imbalanced Data Stream
by Meng Han, Ang Li, Zhihui Gao, Dongliang Mu and Shujuan Liu
Appl. Sci. 2023, 13(10), 5924; https://doi.org/10.3390/app13105924 - 11 May 2023
Cited by 4 | Viewed by 1348
Abstract
The imbalance and concept drift problems in data streams become more complex in multi-class environments, and extreme imbalance and variation in class ratio may also exist. To tackle the above problems, the Hybrid Sampling and Dynamic Weighted-based classification method for Multi-class Imbalanced data stream (HSDW-MI) is proposed. The HSDW-MI algorithm deals with imbalance and concept drift problems through the hybrid sampling and dynamic weighting phases, respectively. In the hybrid sampling phase, adaptive spectral clustering is proposed to sample the data after clustering, which can maintain the original data distribution; then the sample safety factor is used to determine the samples to be sampled for each class; the safe samples are oversampled and the unsafe samples are under-sampled in each cluster. If the data stream is extremely imbalanced, the sample storage pool is used to extract samples with a high safety factor to add to the data stream. In the dynamic weighting phase, a dynamic weighting method based on the G-mean value is proposed. The G-mean values are used as the weights of each base classifier in the ensemble, and the ensemble is dynamically updated during the processing of the data stream to accommodate the occurrence of concept drift. Experiments were conducted with LB, OAUE, ARF, BOLE, MUOB, MOOD, CALMID, and the proposed HSDW-MI on 10 multi-class synthetic data streams with different class ratios and concept drifts and 3 real multi-class imbalanced streams with unknown drifts, and the results show that the proposed HSDW-MI has better classification capabilities and performs more consistently compared to all other algorithms. Full article
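The idea of using G-mean values as ensemble weights can be sketched as follows. This is a minimal illustration and not the HSDW-MI algorithm: it assumes the multi-class G-mean is the geometric mean of per-class recalls, and the helper names `g_mean` and `weighted_vote` and the toy labels are hypothetical.

```python
from collections import Counter, defaultdict
from math import prod


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall (assumed definition of G-mean)."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return prod(hits[c] / totals[c] for c in totals) ** (1 / len(totals))


def weighted_vote(predictions, weights):
    """Return the label with the largest summed classifier weight."""
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    return max(scores, key=scores.get)


# Toy evaluation window: true labels and the outputs of three base classifiers.
y_true = ["a", "a", "b", "b", "c", "c"]
window_predictions = [
    ["a", "a", "b", "b", "c", "a"],  # misses one "c"
    ["a", "b", "b", "a", "c", "c"],  # misses one "a" and one "b"
    ["a", "a", "a", "a", "a", "a"],  # ignores the minority classes
]
weights = [g_mean(y_true, preds) for preds in window_predictions]

# Combine the three classifiers' votes for one new instance.
label = weighted_vote(["c", "b", "b"], weights)
```

A classifier that ignores any class gets weight zero, which is why G-mean is a natural choice when the class distribution is imbalanced. In a streaming setting the weights would be recomputed as new labelled data arrives.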
19 pages, 2864 KiB  
Article
Equilibrium Optimizer-Based Joint Time-Frequency Entropy Feature Selection Method for Electric Loads in Industrial Scenario
by Mengran Zhou, Xiaokang Yao, Ziwei Zhu and Feng Hu
Appl. Sci. 2023, 13(9), 5732; https://doi.org/10.3390/app13095732 - 06 May 2023
Cited by 1 | Viewed by 984
Abstract
A prerequisite for refined load management, crucial for intelligent energy management, is the precise classification of electric loads. However, the high dimensionality of electric load samples and the poor identification accuracy in industrial scenarios make such classification difficult to use in actual production. To address these issues, this research presents an equilibrium optimizer-based joint time-frequency entropy feature selection method for electric loads in industrial scenarios. The method first extracts time-frequency domain features, adds entropy value features to them, and then uses an equilibrium optimizer (EO) to screen the joint feature set. A Chinese cement plant was chosen as the acquisition site for the experiments, and the low-frequency data from power equipment were gathered to form an original dataset for power analysis. The features screened by the EO were used as model inputs to verify the effectiveness of the EO on the joint feature set under K-nearest neighbor (KNN), support vector machine (SVM), decision tree (DT), random forest (RF), and discriminant analysis (DA) models. Experimental results show that introducing entropy value features for the joint feature set can significantly improve the classification performance. The average accuracy of the features screened by the EO was as high as 95.58% on SVM, while the computation time was 0.75 s. Therefore, for industrial electricity scenarios, the approach suggested in this research can enhance the identification accuracy of electric loads and significantly reduce the computation time of the model. This has essential research significance for intelligent energy management in real industrial scenarios. Full article
14 pages, 3792 KiB  
Article
Marketing Decision Support System Based on Data Mining Technology
by Rong Hou, Xu Ye, Hafizah Binti Omar Zaki and Nor Asiah Binti Omar
Appl. Sci. 2023, 13(7), 4315; https://doi.org/10.3390/app13074315 - 29 Mar 2023
Cited by 12 | Viewed by 2637
Abstract
With the continuous development of business intelligence technology, the application research of decision support systems (DSSs) is deepening. In China, the work in this area started relatively late, and there are few DSS research cases to assist in marketing decision-making. Currently, marketing decision support systems have shortcomings in data integration, historical data, query functions, and data analysis. This article analyzes the characteristics of marketing decision-making, discusses the application of data warehouse, OLAP, and data mining technology in marketing decision support systems, and designs a marketing decision support system based on data mining technology. The system uses a BP neural network to conduct data mining marketing forecasting. A three-layer network model for marketing prediction is established, with sales time, product price, and customer purchasing power as network inputs and the sales volume of a certain type of product in different locations as the output. The test results show that the average absolute percentage error of this method is 15.13%, and the prediction accuracy is high. Research shows that, with the continuous development of data mining technology, the system can not only help users conduct scientific and reasonable marketing decision-making analyses, making the marketing decision-making process more scientific and reasonable, but can also bring new ideas to enterprise decision-makers, promoting the continuous improvement and progress of the system. Full article
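The average absolute percentage error quoted above is usually called the mean absolute percentage error (MAPE). A minimal sketch of the standard calculation follows; the function name `mape` and the sample figures are hypothetical and are not taken from the paper.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Assumes every actual value is non-zero, since each error is
    divided by the corresponding actual value.
    """
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical sales volumes and forecasts, for illustration only.
actual_sales = [100.0, 200.0, 400.0]
forecast_sales = [90.0, 220.0, 400.0]
error_pct = mape(actual_sales, forecast_sales)  # about 6.67
```

Because each error is scaled by the actual value, MAPE is comparable across products with very different sales volumes, but it is undefined when an actual value is zero.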
12 pages, 810 KiB  
Review
Business Intelligence Adoption for Small and Medium Enterprises: Conceptual Framework
by Ibrahim Abdusalam Abubaker Alsibhawi, Jamaiah Binti Yahaya and Hazura Binti Mohamed
Appl. Sci. 2023, 13(7), 4121; https://doi.org/10.3390/app13074121 - 24 Mar 2023
Cited by 3 | Viewed by 4096
Abstract
All businesses have many issues, especially small and medium enterprises trying to survive with traditional technology. Therefore, enterprises need to adopt business intelligence by using the management of information technology systems to overcome the issues. This study proposes a conceptual framework that identifies the potential factors that influence the adoption of business intelligence systems in the SME industry in Libya. This study was established based on two main theories: the technology acceptance model (TAM) and the unified theory of acceptance and use of technology (UTAUT). In line with the previous studies that investigated this type of influence, this study recommended a conceptual framework containing several factors: change management, knowledge sharing, information quality, IT project management, the perceived usefulness of a BIS, and the perceived ease of adoption of a BIS. This study did not consider the environmental factors’ effect on adopting a BIS (business intelligence system); this is due to the different characteristics of each small and medium enterprise in terms of the sector or industry type. Full article
16 pages, 1159 KiB  
Article
A Powerful Predicting Model for Financial Statement Fraud Based on Optimized XGBoost Ensemble Learning Technique
by Amal Al Ali, Ahmed M. Khedr, Magdi El-Bannany and Sakeena Kanakkayil
Appl. Sci. 2023, 13(4), 2272; https://doi.org/10.3390/app13042272 - 10 Feb 2023
Cited by 8 | Viewed by 3897
Abstract
This study aims to develop a better Financial Statement Fraud (FSF) detection model by utilizing data from publicly available financial statements of firms in the MENA region. We develop an FSF model using a powerful ensemble technique, the XGBoost (eXtreme Gradient Boosting) algorithm, that helps to identify fraud in a set of sample companies drawn from the Middle East and North Africa (MENA) region. The issue of class imbalance in the dataset is addressed by applying the Synthetic Minority Oversampling Technique (SMOTE) algorithm. We use different Machine Learning techniques in Python to predict FSF, and our empirical findings show that the XGBoost algorithm outperformed the other algorithms in this study, namely, Logistic Regression (LR), Decision Tree (DT), Support Vector Machine (SVM), AdaBoost, and Random Forest (RF). We then optimize the XGBoost algorithm to obtain the best result, with a final accuracy of 96.05% in the detection of FSF. Full article

2022

18 pages, 1615 KiB  
Article
Stock Price Prediction Using a Frequency Decomposition Based GRU Transformer Neural Network
by Chengyu Li and Guoqi Qian
Appl. Sci. 2023, 13(1), 222; https://doi.org/10.3390/app13010222 - 24 Dec 2022
Cited by 15 | Viewed by 4064
Abstract
Stock price prediction is crucial but also challenging in any trading system in stock markets. Currently, the family of recurrent neural networks (RNNs) has been widely used for stock prediction with many successes. However, difficulties still remain in making RNNs more successful in a cluttered stock market. Specifically, RNNs lack the power to retrieve discerning features from a clutter of signals in the stock information flow. Making it worse, with an RNN a single long time cell from the market is often fused into a single feature, losing all the information about time, which is essential for temporal stock prediction. To tackle these two issues, we develop in this paper a novel hybrid neural network for price prediction, named the frequency decomposition induced gate recurrent unit (GRU) transformer (abbreviated to FDGRU-transformer or FDG-trans). Inspired by the success of frequency decomposition, in FDG-transformer we apply empirical mode decomposition to decompose the complete ensemble of cluttered data into a trend component plus several informative and independent mode components. Equipped with the decomposition, FDG-transformer has the capacity to extract the discriminative insights from the cluttered signals. To retain the temporal information in the observed cluttered data, FDG-transformer utilizes a hybrid neural network of GRU, long short term memory (LSTM) and multi-head attention (MHA) transformers. The integrated transformer network is capable of encoding the impact of different weights from each past time step to the current one, resulting in the establishment of a time series model from a deeper fine-grained level. We apply the developed FDG-transformer model to analyze Limit Order Book data and compare the results with those obtained from other state-of-the-art methods. The comparison shows that our model delivers effective price forecasting. Moreover, an ablation study is conducted to validate the importance and necessity of each component in the proposed model. Full article
31 pages, 1510 KiB  
Article
Analysis of Factors Influencing the Prices of Tourist Offers
by Agata Kołakowska and Magdalena Godlewska
Appl. Sci. 2022, 12(24), 12938; https://doi.org/10.3390/app122412938 - 16 Dec 2022
Cited by 2 | Viewed by 1198
Abstract
Tourism is a significant branch of many world economies. Many factors influence the volume of tourist traffic and the prices of trips. There are factors that clearly affect tourism, such as COVID-19. The paper describes the methods of machine learning and process mining that allow for assessing the impact of various factors (micro, mezzo and macro) on the prices of tourist offers. The methods were used on large sets of real data from two tour operators, and the results of these studies are discussed in this paper. The research presented is part of a larger project aiming at predicting trip prices. It answers the question of which factors have the greatest impact on the price and which can be omitted in further work. Nevertheless, the dynamic world situation suggests that the ranking of factors may change and the presented universal methods may provide different results in the coming years. Full article
20 pages, 3033 KiB  
Article
Use of Data Mining to Predict the Influx of Patients to Primary Healthcare Centres and Construction of an Expert System
by Juan J. Cubillas, María I. Ramos and Francisco R. Feito
Appl. Sci. 2022, 12(22), 11453; https://doi.org/10.3390/app122211453 - 11 Nov 2022
Cited by 4 | Viewed by 1324
Abstract
In any productive sector, predictive tools are crucial for optimal management and decision-making. In the health sector, it is especially important to have information available in advance, as this not only means optimizing resources, but also improving patient care. This work focuses on the management of healthcare resources in primary care centres. The main objective of this work is to develop a model capable of predicting the number of patients who will demand health care in a primary care centre on a daily basis. This model is integrated into a decision support system that is accessible and easy to use by the manager through a web application. In this case, data from a primary care centre in the city of Jaén, Spain, were used. The model was estimated using spatial-temporal training data, the daily health demand data in that centre for five years, and a series of meteorological data. Different regression algorithms have been employed. The workflow requires selecting the parameters that influence the health demand prediction and discarding those that distort the model. The main contribution of this research is the daily prediction of the number of patients attending the health centre with absolute errors better than 3%, which is crucial for decision-making on the sizing of health resources in a primary care health centre. Full article
25 pages, 2369 KiB  
Article
Intelligent Decision Forest Models for Customer Churn Prediction
by Fatima Enehezei Usman-Hamza, Abdullateef Oluwagbemiga Balogun, Luiz Fernando Capretz, Hammed Adeleye Mojeed, Saipunidzam Mahamad, Shakirat Aderonke Salihu, Abimbola Ganiyat Akintola, Shuib Basri, Ramoni Tirimisiyu Amosa and Nasiru Kehinde Salahdeen
Appl. Sci. 2022, 12(16), 8270; https://doi.org/10.3390/app12168270 - 18 Aug 2022
Cited by 10 | Viewed by 2083
Abstract
Customer churn is a critical issue impacting enterprises and organizations, particularly in the emerging and highly competitive telecommunications industry. It is important to researchers and industry analysts interested in projecting customer behavior to separate churn from non-churn consumers. The fundamental incentive is a firm’s desire to keep current consumers, along with the exorbitant expense of gaining new ones. Many solutions have been developed to address customer churn prediction (CCP), such as rule-based and machine learning (ML) solutions. However, the issue of scalability and robustness of rule-based customer churn solutions is a critical drawback, while the imbalanced nature of churn datasets has a detrimental impact on the prediction efficacy of conventional ML techniques in CCP. As a result, in this study, we developed intelligent decision forest (DF) models for CCP in telecommunication. Specifically, we investigated the prediction performances of the logistic model tree (LMT), random forest (RF), and Functional Trees (FT) as DF models and enhanced DF (LMT, RF, and FT) models based on weighted soft voting and weighted stacking methods. Extensive experimentation was performed to ascertain the efficacy of the suggested DF models utilizing publicly accessible benchmark telecom CCP datasets. The suggested DF models efficiently distinguish churn from non-churn consumers in the presence of the class imbalance problem. In addition, when compared to baseline and existing ML-based CCP methods, comparative findings showed that the proposed DF models provided superior prediction performances and optimal solutions for CCP in the telecom industry. Hence, the development and deployment of DF-based models for CCP and applicable ML tasks are recommended. Full article
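Weighted soft voting, mentioned above as one way of combining the base models, can be sketched in a few lines. This is a generic illustration, not the authors' implementation: the function name `weighted_soft_vote`, the three probability outputs (standing in for LMT, RF and FT) and the weights are all hypothetical.

```python
def weighted_soft_vote(probabilities, weights):
    """Combine per-model class probabilities using normalized weights.

    probabilities: one {label: probability} dict per model.
    weights: one non-negative weight per model.
    Returns (winning_label, combined_probabilities).
    """
    total = sum(weights)
    combined = {}
    for probs, weight in zip(probabilities, weights):
        for label, p in probs.items():
            combined[label] = combined.get(label, 0.0) + weight * p / total
    return max(combined, key=combined.get), combined


# Hypothetical outputs of three base models for one customer.
model_probs = [
    {"churn": 0.6, "stay": 0.4},
    {"churn": 0.3, "stay": 0.7},
    {"churn": 0.8, "stay": 0.2},
]
model_weights = [0.2, 0.5, 0.3]  # illustrative weights, not from the paper
label, combined = weighted_soft_vote(model_probs, model_weights)
```

Unlike hard voting, which counts only each model's top label, soft voting lets a confident model outweigh two lukewarm ones. In the example two of the three models favour "churn", and the weighted average also lands narrowly on "churn".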
18 pages, 759 KiB  
Article
Customer Churn Prediction in B2B Non-Contractual Business Settings Using Invoice Data
by Milan Mirkovic, Teodora Lolic, Darko Stefanovic, Andras Anderla and Danijela Gracanin
Appl. Sci. 2022, 12(10), 5001; https://doi.org/10.3390/app12105001 - 15 May 2022
Cited by 6 | Viewed by 3359
Abstract
Customer churn is a problem virtually all companies face, and the ability to predict it reliably can be a cornerstone for successful retention campaigns. In this study, we propose an approach to customer churn prediction in non-contractual B2B settings that relies exclusively on invoice-level data for feature engineering and uses multi-slicing to maximally utilize available data. We cast churn as a binary classification problem and assess the ability of three established classifiers to predict it when using different churn definitions. We also compare classifier performance when different amounts of historical data are used for feature engineering. The results indicate that robust models for different churn definitions can be derived by using invoice-level data alone and that using more historical data for creating some of the features tends to lead to better performing models for some classifiers. We also confirm that the multi-slicing approach to dataset creation yields better performing models compared to the traditionally used single-slicing approach. Full article
15 pages, 924 KiB  
Article
Legal Judgment Prediction via Heterogeneous Graphs and Knowledge of Law Articles
by Qihui Zhao, Tianhan Gao, Song Zhou, Dapeng Li and Yingyou Wen
Appl. Sci. 2022, 12(5), 2531; https://doi.org/10.3390/app12052531 - 28 Feb 2022
Cited by 3 | Viewed by 2525
Abstract
Legal judgment prediction (LJP) is a crucial task in legal intelligence to predict charges, law articles and terms of penalties based on case fact description texts. Although existing methods perform well, they still have many shortcomings. First, the existing methods have significant limitations in understanding long documents, especially those based on RNNs and BERT. Secondly, the existing methods are not good at solving the problem of similar charges and do not fully and effectively integrate the information of law articles. To address the above problems, we propose a novel LJP method. Firstly, we improve the model’s comprehension of the whole document based on a graph neural network approach. Then, we design a graph attention network-based law article distinction extractor to distinguish similar law articles. Finally, we design a graph fusion method to fuse heterogeneous graphs of text and external knowledge (law article group distinction information). The experiments show that the method could effectively improve LJP performance. The experimental metrics are superior to the existing state of the art. Full article
21 pages, 1087 KiB  
Review
Artificial Intelligence-Based Methods for Business Processes: A Systematic Literature Review
by Poliana Gomes, Luiz Verçosa, Fagner Melo, Vinícius Silva, Carmelo Bastos Filho and Byron Bezerra
Appl. Sci. 2022, 12(5), 2314; https://doi.org/10.3390/app12052314 - 23 Feb 2022
Cited by 8 | Viewed by 4276
Abstract
Companies are usually overloaded with data that they may not know how to take advantage of. On the other hand, artificial intelligence (AI) techniques are known to “keep learning” as the data increase. In this context, our research question emerges: what AI-based methods, in the literature, could be used to automatize business processes and support the decision-making processes of companies? To fill this gap, in this paper, we performed a review of the literature to identify these techniques. We ensured the usage of methods since they allowed reproducibility and extensions. We applied our search string in the Scopus and Web of Science databases and discovered 21 relevant papers pertaining to our question. In these papers, we identified methods that automated tasks and helped analysts make assertive decisions when designing, extending, or reengineering business processes. The authors applied diverse AI techniques, such as K-means, Bayesian networks, and swarm intelligence. Our analysis provides statistics about the techniques and problems being tackled and points to possible future directions. Full article

25 pages, 69685 KiB  
Article
Improving the Forecasting Performance of Taiwan Car Sales Movement Direction Using Online Sentiment Data and CNN-LSTM Model
by Chao Ou-Yang, Shih-Chung Chou and Yeh-Chun Juan
Appl. Sci. 2022, 12(3), 1550; https://doi.org/10.3390/app12031550 - 31 Jan 2022
Cited by 7 | Viewed by 3401
Abstract
The automotive industry is the leading producer of machines in Taiwan and worldwide. Developing effective methods for forecasting car sales can allow car companies to arrange their production and sales plans. Capitalizing on the growth of social media and deep learning algorithms, this research aimed to improve the forecasting of Taiwan car sales movement direction by using online sentiment data and a CNN-LSTM model. First, the historical sales volumes and multi-channel online sentiment data for six car brands in Taiwan were collected and preprocessed for labeling of car sales movement direction. Then, three models, namely, the classical, sentimental, and CNN-LSTM models, were constructed and trained/fitted for forecasting car sales movement directions in Taiwan. Finally, the performance of the three prediction models was compared to verify the effects of online sentiment data and the CNN-LSTM model on forecasting performance. The results showed that four forecasting performance indices, i.e., accuracy, precision, recall and F1-score, improved by 27.78 percentage points (from 41.67% to 69.45%), 0.39 (from 0.38 to 0.77), 0.27 (from 0.42 to 0.69) and 0.33 (from 0.35 to 0.68), respectively. Therefore, online sentiment data and the CNN-LSTM method can indeed improve the forecasting of car sales movement direction in Taiwan.
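As an aid to reading the four indices reported above, the following minimal sketch shows how accuracy, precision, recall and F1-score are commonly computed, and re-derives the reported gains from the before/after values in the abstract. The binary up/down labelling, the function name `classification_metrics` and the sample labels are illustrative assumptions, not taken from the paper.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall, F1) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, precision, recall, f1


# Hypothetical labels (assumption): 1 = sales move up, 0 = sales move down.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)

# Before/after values as reported in the abstract.
reported = {
    "accuracy (%)": (41.67, 69.45),
    "precision": (0.38, 0.77),
    "recall": (0.42, 0.69),
    "F1-score": (0.35, 0.68),
}
# Absolute gain per index; for accuracy this is in percentage points.
gains = {name: round(after - before, 2)
         for name, (before, after) in reported.items()}

print(metrics)
print(gains)
```

The accuracy gain is an absolute difference between two percentages, which is why it reads most naturally as percentage points rather than as a relative increase.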

2021

17 pages, 1433 KiB  
Article
Reinforcement Learning for Options Trading
by Wen Wen, Yuyu Yuan and Jincui Yang
Appl. Sci. 2021, 11(23), 11208; https://doi.org/10.3390/app112311208 - 25 Nov 2021
Cited by 4 | Viewed by 6198
Abstract
Reinforcement learning has been applied to the trading of various types of financial assets, such as stocks, futures, and cryptocurrencies. Options, as a novel kind of derivative, have their own characteristics: there are many option contracts for one underlying asset, their price behavior differs, and the validity period of an option contract is relatively short. To apply reinforcement learning to options trading, we propose the options trading reinforcement learning (OTRL) framework. We use the data of the options’ underlying asset to train the reinforcement learning model, utilizing candle data at different time intervals. A protective closing strategy is added to the model to prevent unbearable losses. Our experiments demonstrate that the most stable algorithm for obtaining high returns is proximal policy optimization (PPO) with the protective closing strategy. The deep Q network (DQN) can exceed the buy and hold strategy in options trading, as can soft actor critic (SAC). These results verify the effectiveness of the OTRL framework.

20 pages, 13860 KiB  
Article
Few-Shot Charge Prediction with Data Augmentation and Feature Augmentation
by Peipeng Wang, Xiuguo Zhang and Zhiying Cao
Appl. Sci. 2021, 11(22), 10811; https://doi.org/10.3390/app112210811 - 16 Nov 2021
Cited by 2 | Viewed by 1538
Abstract
The task of charge prediction is to predict the charge based on the fact description. Existing methods perform well on high-frequency charges, but the prediction of low-frequency charges is still a challenge. Moreover, some confusing charges have relatively similar fact descriptions and are easily misjudged. Therefore, we propose a model with data augmentation and feature augmentation for few-shot charge prediction. Specifically, the model takes the text description as the input and uses the Mixup method to generate virtual samples for data augmentation. Then, a heterogeneous graph of charge information is introduced, and a novel graph convolutional network is designed to extract distinguishing features for feature augmentation. A feature fusion network is used to effectively integrate the charge graph knowledge into the fact to learn a semantic-enhanced fact representation. Finally, the semantic-enhanced fact representation is used to predict the charge. In addition, based on the distribution of each charge, a category prior loss function is designed to increase the contribution of low-frequency charges to the model optimization. The experimental results on real-world datasets demonstrate the effectiveness and robustness of the proposed model.
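The Mixup step mentioned above can be illustrated with a minimal sketch of the standard Mixup formulation, in which a virtual sample is a convex combination of two training samples and their labels, with the mixing weight drawn from a Beta(alpha, alpha) distribution. Applying it to fixed-length feature vectors, the function name `mixup`, the value alpha = 0.4 and the toy vectors are assumptions for illustration; the abstract does not state how the paper applies Mixup to text.

```python
import random


def mixup(x_i, y_i, x_j, y_j, alpha=0.4, rng=random):
    """Blend two (features, one-hot label) pairs into one virtual sample."""
    lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
    x = [lam * a + (1 - lam) * b for a, b in zip(x_i, x_j)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y_i, y_j)]
    return x, y, lam


# Hypothetical fact-description embeddings and one-hot charge labels.
rng = random.Random(0)
x_a, y_a = [0.2, 0.8, 0.1], [1.0, 0.0]   # a high-frequency charge
x_b, y_b = [0.9, 0.3, 0.5], [0.0, 1.0]   # a low-frequency charge
x_mix, y_mix, lam = mixup(x_a, y_a, x_b, y_b, rng=rng)
print(lam, x_mix, y_mix)
```

Because the labels are mixed with the same weight as the features, the virtual label stays a valid probability distribution over charges.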

19 pages, 2055 KiB  
Article
Multi-Objective Design of Profit Volumes and Closeness Ratings Using MBHS Optimizing Based on the PrefixSpan Mining Approach (PSMA) for Product Layout in Supermarkets
by Jakkrit Kaewyotha and Wararat Songpan
Appl. Sci. 2021, 11(22), 10683; https://doi.org/10.3390/app112210683 - 12 Nov 2021
Cited by 2 | Viewed by 1712
Abstract
Product layout significantly impacts consumer demand for purchases in supermarkets. Product shelf renovation is a crucial process that can increase supermarket efficiency. This research had two goals: to develop a sequential pattern mining algorithm for investigating the correlation patterns of product layouts and solving the numerous problems of shelf design, and to develop an algorithm that considers in-store purchase and shelf profit data in order to improve supermarket efficiency and, consequently, profitability. The authors developed two types of algorithms to reach these goals. The first was a PrefixSpan algorithm, which was used to optimize sequential pattern mining, known as the PrefixSpan mining approach. The second was a new multi-objective design that considered the objective functions of profit volumes and closeness rating using the mutation-based harmony search (MBHS) optimization algorithm, which was used to evaluate the performance of the first algorithm based on the PrefixSpan algorithm. The experimental results demonstrated that the PrefixSpan algorithm can determine correlation rules more efficiently and accurately than the other algorithms used in the study. Additionally, the authors found that MBHS with the new multi-objective design can effectively find product layout solutions for supermarkets. Finally, the proposed product layout algorithm was found to lead to higher profit volumes and closeness ratings than traditional shelf layouts, as well as to be more efficient than other algorithms.

11 pages, 559 KiB  
Article
Dynamic Nearest Neighbor: An Improved Machine Learning Classifier and Its Application in Finances
by Oscar Camacho-Urriolagoitia, Itzamá López-Yáñez, Yenny Villuendas-Rey, Oscar Camacho-Nieto and Cornelio Yáñez-Márquez
Appl. Sci. 2021, 11(19), 8884; https://doi.org/10.3390/app11198884 - 24 Sep 2021
Cited by 5 | Viewed by 1602
Abstract
The presence of machine learning, data mining and related disciplines is increasingly evident in everyday environments. The support for the applications of learning techniques in topics related to economic risk assessment, among other financial topics of interest, is relevant for us as human beings. This paper proposes a new supervised learning algorithm, called D1-NN (Dynamic 1-Nearest Neighbor), and applies it to real-world datasets related to finance. The D1-NN performance is competitive against the main state-of-the-art algorithms in solving finance-related problems. The effectiveness of the new D1-NN classifier was compared against five supervised classifiers of the most important approaches (Bayes, nearest neighbors, support vector machines, classifier ensembles, and neural networks), with superior results overall.
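For orientation, the classical 1-nearest-neighbor rule that the name D1-NN builds on can be sketched as follows. This is the plain baseline classifier with Euclidean distance, not the dynamic variant proposed in the paper, whose details the abstract does not give; the function name `predict_1nn` and the toy credit-risk data are hypothetical.

```python
import math


def predict_1nn(train_X, train_y, query):
    """Return the label of the training point closest to `query`."""
    nearest = min(range(len(train_X)),
                  key=lambda i: math.dist(train_X[i], query))
    return train_y[nearest]


# Hypothetical applicants described by two scaled features (assumption).
train_X = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.2), (4.8, 5.1)]
train_y = ["low risk", "low risk", "high risk", "high risk"]

label_a = predict_1nn(train_X, train_y, (1.1, 0.9))
label_b = predict_1nn(train_X, train_y, (5.1, 5.0))
print(label_a, label_b)
```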

23 pages, 1861 KiB  
Article
Knowledge Development Trajectories of the Radio Frequency Identification Domain: An Academic Study Based on Citation and Main Paths Analysis
by Wei-Hao Su, Kai-Ying Chen, Louis Y. Y. Lu and Jen-Jen Wang
Appl. Sci. 2021, 11(18), 8254; https://doi.org/10.3390/app11188254 - 07 Sep 2021
Cited by 3 | Viewed by 2317
Abstract
The study collected papers on radio frequency identification (RFID) applications from an academic database to explore the topic’s development trajectory and predict future development trends. Overall, 3820 papers were collected, and citation networks were established on the basis of the literature references. Main path analysis was performed on the networks to determine the development trajectory of RFID applications. Clustering produced twenty clusters, six of which had citation counts of more than 200. Cluster and word cloud analyses were conducted, and the main research themes were identified: RFID applications in supply chain management, antenna design, collision prevention protocols, privacy and safety, tag sensors, and localization systems. Text mining was performed on the titles and abstracts of the papers to identify frequent keywords and topics of interest to researchers. Finally, statistical analysis of papers published in the previous 4 years revealed that RFID applications in construction, aquaculture, and experimentation are less frequently discussed themes. This study provides planning directions for industry, and the findings serve as a reference for the business domain. The integrated analysis successfully determined the trajectory of RFID-based technological development and applications and forecast the direction of future research.

22 pages, 1824 KiB  
Article
Important Trading Point Prediction Using a Hybrid Convolutional Recurrent Neural Network
by Xinpeng Yu and Dagang Li
Appl. Sci. 2021, 11(9), 3984; https://doi.org/10.3390/app11093984 - 28 Apr 2021
Cited by 12 | Viewed by 2787
Abstract
Stock performance prediction plays an important role in determining the appropriate timing of buying or selling a stock in the development of a trading system. However, precise stock price prediction is challenging because of the complexity of the internal structure of the stock price system and the diversity of external factors. Although research on forecasting stock prices has been conducted continuously, there are few examples of the successful use of stock price forecasting models to develop effective trading systems. Inspired by the way human stock traders look for trading opportunities, we propose a deep learning framework based on a hybrid convolutional recurrent neural network (HCRNN) to predict the important trading points (IPs) that are more likely to be followed by a significant stock price rise, in order to capture potential high-margin opportunities. In the HCRNN model, the convolutional neural network (CNN) performs convolution on the most recent region to capture local fluctuation features, and the long short-term memory (LSTM) approach learns the long-term temporal dependencies to improve stock performance prediction. Comprehensive experiments on real stock market data demonstrate the effectiveness of our proposed framework. Our proposed method, ITPP-HCRNN, achieves an annualized return that is 278.46% more than that of the market.