2023

4 pages, 204 KiB

Open Access Editorial

Methods and Applications of Data Mining in Business Domains

by Chintan Amrit and Asad Abdi

Appl. Sci. 2023, 13(19), 10774; https://doi.org/10.3390/app131910774 - 28 Sep 2023

Viewed by 799

This Special Issue invited researchers to contribute original research in the field of data mining, particularly in its application to diverse domains such as healthcare, software development, logistics, and human resources [...]

23 pages, 4818 KiB

Open Access Article

Enhancing Retail Transactions: A Data-Driven Recommendation Using Modified RFM Analysis and Association Rules Mining

by Angela Hsiang-Ling Chen and Sebastian Gunawan

Appl. Sci. 2023, 13(18), 10057; https://doi.org/10.3390/app131810057 - 06 Sep 2023

Cited by 1 | Viewed by 1481

Abstract

Retail transactions have become an integral part of the economic cycle of every country and of the global economy. They form a trade sector with the potential for continued development. This research focused on building a specific, data-driven recommendation system based on customer-purchasing and product-selling behavior. The approach combined a modified RFM analysis, which adds two variables (periodicity and a customer engagement index); clustering algorithms such as K-means clustering and Ward’s method; association rules to determine the pattern of cause–effect relationships in each transaction; and four types of classifiers to apply and validate the recommendation system. The results showed that customers should be split into two groups based on their behavior: loyal and potential customers. Products, in turn, fell into three groups: bestseller, profitable, and VIP products. Based on the results, K-nearest neighbor was the most suitable classifier, with a low chance of overfitting and a higher performance index.
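
As a rough illustration of the RFM step mentioned in the abstract, the sketch below computes the three classical RFM values (recency, frequency, monetary) per customer. The transaction layout and the function name `rfm` are assumptions made for this example; the paper's two additional variables (periodicity and customer engagement index) are not defined in the abstract and are therefore left out.

```python
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount).
transactions = [
    ("c1", date(2023, 1, 5), 40.0),
    ("c1", date(2023, 3, 1), 60.0),
    ("c2", date(2022, 11, 20), 15.0),
]


def rfm(transactions, today):
    """Return {customer: (recency_days, frequency, monetary)}.

    recency   = days since the customer's most recent purchase
    frequency = number of purchases
    monetary  = total amount spent
    """
    table = {}
    for customer, day, amount in transactions:
        last, freq, total = table.get(customer, (day, 0, 0.0))
        table[customer] = (max(last, day), freq + 1, total + amount)
    return {
        customer: ((today - last).days, freq, total)
        for customer, (last, freq, total) in table.items()
    }


scores = rfm(transactions, today=date(2023, 3, 31))
```

The resulting per-customer tuples could then be standardized and passed to a clustering step such as K-means, which is where a segmentation into customer groups would come from.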

20 pages, 2302 KiB

Open Access Article

Hybrid Sampling and Dynamic Weighting-Based Classification Method for Multi-Class Imbalanced Data Stream

by Meng Han, Ang Li, Zhihui Gao, Dongliang Mu and Shujuan Liu

Appl. Sci. 2023, 13(10), 5924; https://doi.org/10.3390/app13105924 - 11 May 2023

Cited by 4 | Viewed by 1348

Abstract

The imbalance and concept drift problems in data streams become more complex in a multi-class environment, where extreme imbalance and variation in class ratio may also exist. To tackle these problems, the Hybrid Sampling and Dynamic Weighting-based classification method for Multi-class Imbalanced data streams (HSDW-MI) is proposed. The HSDW-MI algorithm deals with imbalance and concept drift through its hybrid sampling and dynamic weighting phases, respectively. In the hybrid sampling phase, adaptive spectral clustering is proposed so that the data are sampled after clustering, which maintains the original data distribution; the sample safety factor is then used to determine the samples to be sampled for each class; and within each cluster the safe samples are oversampled and the unsafe samples are under-sampled. If the data stream is extremely imbalanced, the sample storage pool is used to extract samples with a high safety factor to add to the data stream. In the dynamic weighting phase, a dynamic weighting method based on the G-mean value is proposed. The G-mean values are used as the weights of the base classifiers in the ensemble, and the ensemble is dynamically updated while the data stream is processed to accommodate concept drift. Experiments were conducted with LB, OAUE, ARF, BOLE, MUOB, MOOD, CALMID, and the proposed HSDW-MI on 10 multi-class synthetic data streams with different class ratios and concept drifts and 3 real multi-class imbalanced streams with unknown drifts. The results show that the proposed HSDW-MI has better classification capabilities and performs more consistently than all the other algorithms.
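
The dynamic weighting idea in the abstract can be sketched as follows, assuming the usual multi-class definition of G-mean (the geometric mean of per-class recall). The function names `g_mean` and `weighted_vote` are hypothetical, and the sketch does not reproduce how HSDW-MI updates its ensemble over the stream.

```python
import math
from collections import defaultdict


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall (assumed G-mean definition)."""
    hits, totals = defaultdict(int), defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        hits[truth] += int(truth == pred)
    recalls = [hits[c] / totals[c] for c in totals]
    return math.prod(recalls) ** (1 / len(recalls))


def weighted_vote(predictions, weights):
    """Return the label whose supporting classifiers carry the most weight."""
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    return max(scores, key=scores.get)


# Weight each base classifier by its G-mean on a recent labelled window.
window_truth = [0, 0, 1, 1, 2, 2]
window_preds = [
    [0, 0, 1, 1, 2, 0],  # classifier A
    [0, 1, 1, 0, 2, 2],  # classifier B
]
weights = [g_mean(window_truth, preds) for preds in window_preds]
```

Because G-mean drops to zero when any class is never recognized, a classifier that ignores a minority class receives no weight under this scheme, which is the property that makes the measure attractive for imbalanced streams.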

19 pages, 2864 KiB

Open Access Article

Equilibrium Optimizer-Based Joint Time-Frequency Entropy Feature Selection Method for Electric Loads in Industrial Scenario

by Mengran Zhou, Xiaokang Yao, Ziwei Zhu and Feng Hu

Appl. Sci. 2023, 13(9), 5732; https://doi.org/10.3390/app13095732 - 06 May 2023

Cited by 1 | Viewed by 984

Abstract

The precise classification of electric loads is a prerequisite for refined load management, which is crucial for intelligent energy management. However, the high dimensionality of electric load samples and the poor identification accuracy in industrial scenarios make such classification difficult to use in actual production. To address these issues, this research presents an equilibrium optimizer-based joint time-frequency entropy feature selection method for electric loads in industrial scenarios. The method first introduces entropy value features on the basis of extracted time-frequency domain features and then uses an equilibrium optimizer (EO) to screen the joint feature set. A Chinese cement plant was chosen as the acquisition site for the experiments, and low-frequency data from power equipment were gathered to form an original dataset for power analysis. The features screened by the EO were used as model inputs to verify the effectiveness of the EO on the joint feature set under K-nearest neighbor (KNN), support vector machine (SVM), decision tree (DT), random forest (RF), and discriminant analysis (DA) models. Experimental results show that introducing entropy value features into the joint feature set can significantly improve classification performance. The average accuracy of the features screened by the EO was as high as 95.58% on SVM, while the computation time was 0.75 s. Therefore, for industrial electricity scenarios, the approach suggested in this research can enhance the identification accuracy of electric loads and significantly reduce the computation time of the model. This has essential research significance for intelligent energy management in real industrial scenarios.

14 pages, 3792 KiB

Open Access Article

Marketing Decision Support System Based on Data Mining Technology

by Rong Hou, Xu Ye, Hafizah Binti Omar Zaki and Nor Asiah Binti Omar

Appl. Sci. 2023, 13(7), 4315; https://doi.org/10.3390/app13074315 - 29 Mar 2023

Cited by 12 | Viewed by 2637

Abstract

With the continuous development of business intelligence technology, the application research of decision support systems (DSSs) is deepening. In China, the work in this area started relatively late, and there are few DSS research cases to assist in marketing decision-making. Currently, marketing decision support systems have shortcomings in data integration, historical data, query functions, and data analysis. This article analyzes the characteristics of marketing decision-making, discusses the application of data warehouse, OLAP, and data mining technology in marketing decision support systems, and designs a marketing decision support system based on data mining technology. The system uses a back-propagation (BP) neural network for data mining-based marketing forecasting. A three-layer network model for marketing prediction is established, with sales time, product price, and customer purchasing power as the network inputs and the sales volume of a certain type of product in different locations as the output. The test results show that the average absolute percentage error of this method is 15.13%, and the prediction accuracy is high. The research shows that, as data mining technology continues to develop, the system can not only help users conduct scientific and reasonable marketing decision-making analyses but can also bring new ideas to enterprise decision-makers, promoting the continuous improvement of the system.
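
For readers unfamiliar with the error measure quoted above, the sketch below assumes that the "average absolute percentage error" corresponds to the standard mean absolute percentage error (MAPE). The function name `mape` and the sample numbers are illustrative only and are not taken from the paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when an actual value is zero, so that case is rejected.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined for zero actual values")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Illustrative sales volumes, not data from the paper.
error = mape([100.0, 200.0], [90.0, 220.0])
```

Under this definition, a value of 15.13% would mean that forecasts deviate from actual sales volumes by about 15% on average.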

12 pages, 810 KiB

Open Access Review

Business Intelligence Adoption for Small and Medium Enterprises: Conceptual Framework

by Ibrahim Abdusalam Abubaker Alsibhawi, Jamaiah Binti Yahaya and Hazura Binti Mohamed

Appl. Sci. 2023, 13(7), 4121; https://doi.org/10.3390/app13074121 - 24 Mar 2023

Cited by 3 | Viewed by 4096

Abstract

All businesses face many issues, especially small and medium enterprises (SMEs) trying to survive with traditional technology. Enterprises therefore need to adopt business intelligence, using the management of information technology systems to overcome these issues. This study proposes a conceptual framework that identifies the potential factors influencing the adoption of business intelligence systems (BISs) in the SME industry in Libya. The study is based on two main theories: the technology acceptance model (TAM) and the unified theory of acceptance and use of technology (UTAUT). In line with previous studies that investigated this type of influence, this study recommends a conceptual framework containing several factors: change management, knowledge sharing, information quality, IT project management, the perceived usefulness of a BIS, and the perceived ease of adoption of a BIS. The study did not consider the effect of environmental factors on BIS adoption because the characteristics of each small and medium enterprise differ by sector or industry type.

16 pages, 1159 KiB

Open Access Article

A Powerful Predicting Model for Financial Statement Fraud Based on Optimized XGBoost Ensemble Learning Technique

by Amal Al Ali, Ahmed M. Khedr, Magdi El-Bannany and Sakeena Kanakkayil

Appl. Sci. 2023, 13(4), 2272; https://doi.org/10.3390/app13042272 - 10 Feb 2023

Cited by 8 | Viewed by 3897

Abstract

This study aims to develop a better Financial Statement Fraud (FSF) detection model by utilizing data from publicly available financial statements of firms in the Middle East and North Africa (MENA) region. We develop an FSF model using a powerful ensemble technique, the XGBoost (eXtreme Gradient Boosting) algorithm, that helps to identify fraud in a set of sample companies drawn from the MENA region. The issue of class imbalance in the dataset is addressed by applying the Synthetic Minority Oversampling Technique (SMOTE) algorithm. We use different machine learning techniques in Python to predict FSF, and our empirical findings show that the XGBoost algorithm outperformed the other algorithms in this study, namely, Logistic Regression (LR), Decision Tree (DT), Support Vector Machine (SVM), AdaBoost, and Random Forest (RF). We then optimize the XGBoost algorithm to obtain the best result, with a final accuracy of 96.05% in the detection of FSF.

2022

18 pages, 1615 KiB

Open Access Article

Stock Price Prediction Using a Frequency Decomposition Based GRU Transformer Neural Network

by Chengyu Li and Guoqi Qian

Appl. Sci. 2023, 13(1), 222; https://doi.org/10.3390/app13010222 - 24 Dec 2022

Cited by 15 | Viewed by 4064

Abstract

Stock price prediction is crucial but also challenging in any trading system in stock markets. Currently, the family of recurrent neural networks (RNNs) has been widely used for stock prediction with many successes. However, difficulties remain in making RNNs more successful in a cluttered stock market. Specifically, RNNs lack the power to retrieve discerning features from a clutter of signals in the stock information flow. Worse, an RNN often fuses a single long time cell from the market into a single feature, losing all the time information that is essential for temporal stock prediction. To tackle these two issues, we develop in this paper a novel hybrid neural network for price prediction, named the frequency decomposition induced gate recurrent unit (GRU) transformer (abbreviated to FDGRU-transformer or FDG-trans). Inspired by the success of frequency decomposition, in FDG-transformer we apply empirical mode decomposition to decompose the complete ensemble of cluttered data into a trend component plus several informative and independent mode components. Equipped with the decomposition, FDG-transformer has the capacity to extract discriminative insights from the cluttered signals. To retain the temporal information in the observed cluttered data, FDG-transformer utilizes a hybrid neural network of GRU, long short-term memory (LSTM), and multi-head attention (MHA) transformers. The integrated transformer network is capable of encoding the impact of different weights from each past time step on the current one, resulting in a time series model at a deeper, fine-grained level. We apply the developed FDG-transformer model to analyze Limit Order Book data and compare the results with those obtained from other state-of-the-art methods. The comparison shows that our model delivers effective price forecasting. Moreover, an ablation study is conducted to validate the importance and necessity of each component in the proposed model.

31 pages, 1510 KiB

Open Access Article

Analysis of Factors Influencing the Prices of Tourist Offers

by Agata Kołakowska and Magdalena Godlewska

Appl. Sci. 2022, 12(24), 12938; https://doi.org/10.3390/app122412938 - 16 Dec 2022

Cited by 2 | Viewed by 1198

Abstract

Tourism is a significant branch of many world economies. Many factors influence the volume of tourist traffic and the prices of trips. Some factors, such as COVID-19, clearly affect tourism. The paper describes machine learning and process mining methods that allow the impact of various factors (micro, mezzo and macro) on the prices of tourist offers to be assessed. The methods were applied to large sets of real data from two tour operators, and the results of these studies are discussed in this paper. The research presented is part of a larger project aimed at predicting trip prices. It answers the question of which factors have the greatest impact on the price and which can be omitted in further work. Nevertheless, the dynamic world situation suggests that the ranking of factors may change and that the presented universal methods may provide different results in the coming years.

20 pages, 3033 KiB

Open Access Article

Use of Data Mining to Predict the Influx of Patients to Primary Healthcare Centres and Construction of an Expert System

by Juan J. Cubillas, María I. Ramos and Francisco R. Feito

Appl. Sci. 2022, 12(22), 11453; https://doi.org/10.3390/app122211453 - 11 Nov 2022

Cited by 4 | Viewed by 1324

Abstract

In any productive sector, predictive tools are crucial for optimal management and decision-making. In the health sector, it is especially important to have information available in advance, as this not only means optimizing resources, but also improving patient care. This work focuses on the management of healthcare resources in primary care centres. The main objective of this work is to develop a model capable of predicting the number of patients who will demand health care in a primary care centre on a daily basis. This model is integrated into a decision support system that is accessible and easy to use by the manager through a web application. In this case, data from a primary care centre in the city of Jaén, Spain, were used. The model was estimated using spatial-temporal training data, the daily health demand data in that centre for five years, and a series of meteorological data. Different regression algorithms were employed. The workflow requires selecting the parameters that influence the health demand prediction and discarding those that distort the model. The main contribution of this research is the daily prediction of the number of patients attending the health centre with absolute errors below 3%, which is crucial for decision-making on the sizing of health resources in a primary care health centre.

25 pages, 2369 KiB

Open Access Article

Intelligent Decision Forest Models for Customer Churn Prediction

by Fatima Enehezei Usman-Hamza, Abdullateef Oluwagbemiga Balogun, Luiz Fernando Capretz, Hammed Adeleye Mojeed, Saipunidzam Mahamad, Shakirat Aderonke Salihu, Abimbola Ganiyat Akintola, Shuib Basri, Ramoni Tirimisiyu Amosa and Nasiru Kehinde Salahdeen

Appl. Sci. 2022, 12(16), 8270; https://doi.org/10.3390/app12168270 - 18 Aug 2022

Cited by 10 | Viewed by 2083

Abstract

Customer churn is a critical issue impacting enterprises and organizations, particularly in the emerging and highly competitive telecommunications industry. Separating churn from non-churn consumers is important to researchers and industry analysts interested in projecting customer behavior. The fundamental incentive is a firm’s desire to keep current consumers, along with the exorbitant expense of gaining new ones. Many solutions have been developed to address customer churn prediction (CCP), such as rule-based and machine learning (ML) solutions. However, the limited scalability and robustness of rule-based customer churn solutions is a critical drawback, while the imbalanced nature of churn datasets has a detrimental impact on the prediction efficacy of conventional ML techniques in CCP. In this study, we therefore developed intelligent decision forest (DF) models for CCP in telecommunication. Specifically, we investigated the prediction performances of the logistic model tree (LMT), random forest (RF), and functional trees (FT) as DF models, as well as enhanced DF (LMT, RF, and FT) models based on weighted soft voting and weighted stacking methods. Extensive experimentation was performed to ascertain the efficacy of the suggested DF models utilizing publicly accessible benchmark telecom CCP datasets. The suggested DF models efficiently distinguish churn from non-churn consumers in the presence of the class imbalance problem. In addition, when compared to baseline and existing ML-based CCP methods, comparative findings showed that the proposed DF models provided superior prediction performances and optimal solutions for CCP in the telecom industry. Hence, the development and deployment of DF-based models for CCP and applicable ML tasks are recommended.

18 pages, 759 KiB

Open Access Article

Customer Churn Prediction in B2B Non-Contractual Business Settings Using Invoice Data

by Milan Mirkovic, Teodora Lolic, Darko Stefanovic, Andras Anderla and Danijela Gracanin

Appl. Sci. 2022, 12(10), 5001; https://doi.org/10.3390/app12105001 - 15 May 2022

Cited by 6 | Viewed by 3359

Abstract

Customer churn is a problem virtually all companies face, and the ability to predict it reliably can be a cornerstone for successful retention campaigns. In this study, we propose an approach to customer churn prediction in non-contractual B2B settings that relies exclusively on invoice-level data for feature engineering and uses multi-slicing to maximally utilize available data. We cast churn as a binary classification problem and assess the ability of three established classifiers to predict it when using different churn definitions. We also compare classifier performance when different amounts of historical data are used for feature engineering. The results indicate that robust models for different churn definitions can be derived by using invoice-level data alone and that using more historical data for creating some of the features tends to lead to better-performing models for some classifiers. We also confirm that the multi-slicing approach to dataset creation yields better-performing models compared to the traditionally used single-slicing approach.

15 pages, 924 KiB

Open Access Article

Legal Judgment Prediction via Heterogeneous Graphs and Knowledge of Law Articles

by Qihui Zhao, Tianhan Gao, Song Zhou, Dapeng Li and Yingyou Wen

Appl. Sci. 2022, 12(5), 2531; https://doi.org/10.3390/app12052531 - 28 Feb 2022

Cited by 3 | Viewed by 2525

Abstract

Legal judgment prediction (LJP) is a crucial task in legal intelligence to predict charges, law articles and terms of penalties based on case fact description texts. Although existing methods perform well, they still have many shortcomings. First, the existing methods have significant limitations in understanding long documents, especially those based on RNNs and BERT. Second, the existing methods are not good at solving the problem of similar charges and do not fully and effectively integrate the information of law articles. To address the above problems, we propose a novel LJP method. First, we improve the model’s comprehension of the whole document based on a graph neural network approach. Then, we design a graph attention network-based law article distinction extractor to distinguish similar law articles. Finally, we design a graph fusion method to fuse heterogeneous graphs of text and external knowledge (law article group distinction information). The experiments show that the method could effectively improve LJP performance. The experimental metrics are superior to the existing state of the art.

21 pages, 1087 KiB

Open Access Review

Artificial Intelligence-Based Methods for Business Processes: A Systematic Literature Review

by Poliana Gomes, Luiz Verçosa, Fagner Melo, Vinícius Silva, Carmelo Bastos Filho and Byron Bezerra

Appl. Sci. 2022, 12(5), 2314; https://doi.org/10.3390/app12052314 - 23 Feb 2022

Cited by 8 | Viewed by 4276

Abstract

Companies are usually overloaded with data that they may not know how to take advantage of. On the other hand, artificial intelligence (AI) techniques are known to “keep learning” as the data increase. In this context, our research question emerges: what AI-based methods in the literature could be used to automate business processes and support the decision-making processes of companies? To fill this gap, in this paper, we performed a review of the literature to identify these techniques. We restricted the review to methods, since they allow reproducibility and extension. We applied our search string in the Scopus and Web of Science databases and discovered 21 relevant papers pertaining to our question. In these papers, we identified methods that automated tasks and helped analysts make assertive decisions when designing, extending, or reengineering business processes. The authors applied diverse AI techniques, such as K-means, Bayesian networks, and swarm intelligence. Our analysis provides statistics about the techniques and problems being tackled and points to possible future directions.

25 pages, 69685 KiB

Open Access Article

Improving the Forecasting Performance of Taiwan Car Sales Movement Direction Using Online Sentiment Data and CNN-LSTM Model

by Chao Ou-Yang, Shih-Chung Chou and Yeh-Chun Juan

Appl. Sci. 2022, 12(3), 1550; https://doi.org/10.3390/app12031550 - 31 Jan 2022

Cited by 7 | Viewed by 3401

Abstract

The automotive industry is the leading producer of machines in Taiwan and worldwide. Developing effective methods for forecasting car sales can allow car companies to arrange their production and sales plans. Capitalizing on the growth of social media and deep learning algorithms, this research aimed to improve the overall performance of forecasting the movement direction of Taiwan car sales by using online sentiment data and the CNN-LSTM method. First, the historical sales volumes and multi-channel online sentiment data for six car brands in Taiwan were collected and preprocessed for labeling of car sales movement direction. Then, three models, namely, the classical, sentimental, and CNN-LSTM models, were constructed and trained/fitted for forecasting car sales movement directions in Taiwan. Finally, the performance of the three prediction models was compared to verify the effects of online sentiment data and the CNN-LSTM model on forecasting performance. The results showed that four forecasting performance indices, i.e., accuracy, precision, recall, and F1-score, improved by 27.78 percentage points (from 41.67% to 69.45%), 0.39 (from 0.38 to 0.77), 0.27 (from 0.42 to 0.69), and 0.33 (from 0.35 to 0.68), respectively. Therefore, online sentiment data and the CNN-LSTM method can indeed improve the overall performance of forecasting car sales movement direction in Taiwan.
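
The four performance indices named in the abstract can be computed from confusion-matrix counts as sketched below. This is the textbook binary formulation with hypothetical counts; the abstract does not say how many direction classes the paper uses or how the indices were averaged, so the sketch is illustrative only.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1-score for a binary classifier.

    tp/fp/fn/tn are true-positive, false-positive, false-negative and
    true-negative counts. Ratios with a zero denominator are reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for an "up" versus "not up" sales movement label.
metrics = classification_metrics(tp=6, fp=2, fn=3, tn=9)
```

F1 is the harmonic mean of precision and recall, which is why it sits between the two values rather than at their arithmetic average.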

2021

17 pages, 1433 KiB

Open Access Article

Reinforcement Learning for Options Trading

by Wen Wen, Yuyu Yuan and Jincui Yang

Appl. Sci. 2021, 11(23), 11208; https://doi.org/10.3390/app112311208 - 25 Nov 2021

Cited by 4 | Viewed by 6198

Abstract

Reinforcement learning has been applied to the trading of various types of financial assets, such as stocks, futures, and cryptocurrencies. Options, as a novel kind of derivative, have their own characteristics: there are many option contracts for one underlying asset, their price behavior differs, and the validity period of an option contract is relatively short. To apply reinforcement learning to options trading, we propose the options trading reinforcement learning (OTRL) framework. We use options’ underlying asset data to train the reinforcement learning model. Candle data in different time intervals are utilized, respectively. The protective closing strategy is added to the model to prevent unbearable losses. Our experiments demonstrate that the most stable algorithm for obtaining high returns is proximal policy optimization (PPO) with the protective closing strategy. The deep Q network (DQN) can exceed the buy and hold strategy in options trading, as can soft actor critic (SAC). The effectiveness of the OTRL framework is verified.

20 pages, 13860 KiB

Open Access Article

Few-Shot Charge Prediction with Data Augmentation and Feature Augmentation

by Peipeng Wang, Xiuguo Zhang and Zhiying Cao

Appl. Sci. 2021, 11(22), 10811; https://doi.org/10.3390/app112210811 - 16 Nov 2021

Cited by 2 | Viewed by 1538

Abstract

The task of charge prediction is to predict the charge based on the fact description. Existing methods have a good effect on the prediction of high-frequency charges, but the prediction of low-frequency charges is still a challenge. Moreover, there exist some confusing charges that have relatively similar fact descriptions, which can be easily misjudged. Therefore, we propose a model with data augmentation and feature augmentation for few-shot charge prediction. Specifically, the model takes the text description as the input and uses the Mixup method to generate virtual samples for data augmentation. Then, the charge information heterogeneous graph is introduced, and a novel graph convolutional network is designed to extract distinguishability features for feature augmentation. A feature fusion network is used to effectively integrate the charge graph knowledge into the fact to learn semantic-enhanced fact representation. Finally, the semantic-enhanced fact representation is used to predict the charge. In addition, based on the distribution of each charge, a category prior loss function is designed to increase the contribution of low-frequency charges to the model optimization. The experimental results on real-world datasets prove the effectiveness and robustness of the proposed model.
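
The Mixup step mentioned in the abstract can be sketched in its standard form: a virtual sample is a convex combination of two training samples and of their one-hot labels, with the mixing coefficient drawn from a Beta(alpha, alpha) distribution. The function name `mixup`, the default `alpha`, and the choice of plain feature vectors are assumptions; the abstract does not state at which representation level the paper mixes text inputs.

```python
import random


def mixup(x_i, y_i, x_j, y_j, alpha=0.2, rng=random):
    """Blend two samples and their one-hot labels with a Beta-distributed weight.

    Returns the virtual sample, its soft label, and the mixing coefficient.
    """
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x_i, x_j)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y_i, y_j)]
    return x, y, lam


# Two hypothetical fact-description feature vectors with one-hot charge labels.
rng = random.Random(0)
x_mix, y_mix, lam = mixup([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], rng=rng)
```

With a small `alpha`, most draws of the coefficient fall close to 0 or 1, so the virtual samples stay near one of the two originals while still smoothing the decision boundary between their classes.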

19 pages, 2055 KiB

Open AccessArticle

Multi-Objective Design of Profit Volumes and Closeness Ratings Using MBHS Optimizing Based on the PrefixSpan Mining Approach (PSMA) for Product Layout in Supermarkets

by Jakkrit Kaewyotha and Wararat Songpan

Appl. Sci. 2021, 11(22), 10683; https://doi.org/10.3390/app112210683 - 12 Nov 2021

Cited by 2 | Viewed by 1712

Abstract

Product layout significantly impacts consumer demand for purchases in supermarkets. Product shelf renovation is a crucial process that can increase supermarket efficiency. This research had three goals: to develop a sequential pattern mining algorithm for investigating the correlation patterns of product layouts, to solve the numerous problems of shelf design, and to develop an algorithm that considers in-store purchase and shelf profit data in order to improve supermarket efficiency and, consequently, profitability. The authors developed two types of algorithms to reach these goals. The first was a PrefixSpan algorithm, which was used to optimize sequential pattern mining, known as the PrefixSpan mining approach (PSMA). The second was a new multi-objective design that considered the objective functions of profit volumes and closeness rating using the mutation-based harmony search (MBHS) optimization algorithm, which was used to evaluate the performance of the first algorithm. The experimental results demonstrated that the PrefixSpan algorithm determined correlation rules more efficiently and accurately than the other algorithms used in the study. Additionally, the authors found that MBHS with the new multi-objective design can effectively find product layout solutions for supermarkets. Finally, the proposed product layout algorithm was found to lead to higher profit volumes and closeness ratings than traditional shelf layouts, as well as to be more efficient than other algorithms. Full article

11 pages, 559 KiB

Open AccessArticle

Dynamic Nearest Neighbor: An Improved Machine Learning Classifier and Its Application in Finances

by Oscar Camacho-Urriolagoitia, Itzamá López-Yáñez, Yenny Villuendas-Rey, Oscar Camacho-Nieto and Cornelio Yáñez-Márquez

Appl. Sci. 2021, 11(19), 8884; https://doi.org/10.3390/app11198884 - 24 Sep 2021

Cited by 5 | Viewed by 1602

Abstract

The presence of machine learning, data mining and related disciplines is increasingly evident in everyday environments. Support for applying learning techniques to economic risk assessment, among other financial topics of interest, is relevant for us as human beings. This paper proposes a new supervised learning algorithm, called D1-NN (Dynamic 1-Nearest Neighbor), and applies it to real-world datasets related to finance. The performance of D1-NN is competitive against the main state-of-the-art algorithms in solving finance-related problems. The effectiveness of the new D1-NN classifier was compared against five supervised classifiers from the most important approaches (Bayes, nearest neighbors, support vector machines, classifier ensembles, and neural networks), with superior results overall. Full article

23 pages, 1861 KiB

Open AccessArticle

Knowledge Development Trajectories of the Radio Frequency Identification Domain: An Academic Study Based on Citation and Main Paths Analysis

by Wei-Hao Su, Kai-Ying Chen, Louis Y. Y. Lu and Jen-Jen Wang

Appl. Sci. 2021, 11(18), 8254; https://doi.org/10.3390/app11188254 - 07 Sep 2021

Cited by 3 | Viewed by 2317

Abstract

The study collected papers on radio frequency identification (RFID) applications from an academic database to explore the topic’s development trajectory and predict future development trends. Overall, 3820 papers were collected, and citation networks were established on the basis of the literature references. Main path analysis was performed on the networks to determine the development trajectory of RFID applications. Clustering yielded twenty clusters, six of which had citation counts of more than 200. Cluster and word cloud analyses were conducted, and the main research themes were identified: RFID applications in supply chain management, antenna design, collision prevention protocols, privacy and safety, tag sensors, and localization systems. Text mining was performed on the titles and abstracts of the papers to identify frequent keywords and topics of interest to researchers. Finally, statistical analysis of papers published in the previous 4 years revealed that RFID applications in construction, aquaculture, and experimentation are less frequently discussed themes. This study provides planning directions for industry, and the findings serve as a reference for the business domain. The integrated analysis successfully determined the trajectory of RFID-based technological development and applications and forecast the direction of future research. Full article

22 pages, 1824 KiB

Open AccessArticle

Important Trading Point Prediction Using a Hybrid Convolutional Recurrent Neural Network

by Xinpeng Yu and Dagang Li

Appl. Sci. 2021, 11(9), 3984; https://doi.org/10.3390/app11093984 - 28 Apr 2021

Cited by 12 | Viewed by 2787

Abstract

Stock performance prediction plays an important role in determining the appropriate timing of buying or selling a stock in the development of a trading system. However, precise stock price prediction is challenging because of the complexity of the internal structure of the stock price system and the diversity of external factors. Although research on forecasting stock prices has been conducted continuously, there are few examples of the successful use of stock price forecasting models to develop effective trading systems. Inspired by the process of human stock traders looking for trading opportunities, we propose a deep learning framework based on a hybrid convolutional recurrent neural network (HCRNN) to predict the important trading points (IPs) that are more likely to be followed by a significant stock price rise to capture potential high-margin opportunities. In the HCRNN model, the convolutional neural network (CNN) performs convolution on the most recent region to capture local fluctuation features, and the long short-term memory (LSTM) approach learns the long-term temporal dependencies to improve stock performance prediction. Comprehensive experiments on real stock market data prove the effectiveness of our proposed framework. Our proposed method ITPP-HCRNN achieves an annualized return that is 278.46% more than that of the market. Full article
