Journal Description
Analytics is an international, peer-reviewed, open access journal on methodologies, technologies, and applications of analytics, published quarterly online by MDPI.
- Open Access: free for readers, with article processing charges (APC) paid by authors or their institutions.
- Rapid Publication: first decisions in 16 days; acceptance to publication in 5.8 days (median values for MDPI journals in the second half of 2022).
- Recognition of Reviewers: APC discount vouchers, optional signed peer review, and reviewer names published annually in the journal.
- Analytics is a companion journal of Mathematics.
Latest Articles
A Novel Zero-Truncated Katz Distribution by the Lagrange Expansion of the Second Kind with Associated Inferences
Analytics 2023, 2(2), 463-484; https://doi.org/10.3390/analytics2020026 - 01 Jun 2023
Abstract
In this article, the Lagrange expansion of the second kind is used to generate a novel zero-truncated Katz distribution; we refer to it as the Lagrangian zero-truncated Katz distribution (LZTKD). Notably, the zero-truncated Katz distribution is a special case of this distribution. Along with closed-form expressions for all of its statistical characteristics, the LZTKD is proven to provide an adequate model for both underdispersed and overdispersed zero-truncated count datasets. Specifically, we show that the associated hazard rate function has increasing, decreasing, bathtub, or upside-down bathtub shapes. Moreover, we demonstrate that the LZTKD belongs to the Lagrangian distribution of the first kind. Then, applications of the LZTKD in statistical scenarios are explored. The unknown parameters are estimated using the well-established method of maximum likelihood. In addition, the generalized likelihood ratio test procedure is applied to test the significance of the additional parameter. In order to evaluate the performance of the maximum likelihood estimates, simulation studies are also conducted. The use of real-life datasets further highlights the relevance and applicability of the proposed model.
Full article
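For orientation, zero truncation conditions a count distribution on a strictly positive outcome; any zero-truncated model, including the zero-truncated Katz special case above, rests on the standard identity (shown here as background; the LZTKD's exact probability mass function is given in the paper):

$$P(X = k \mid X > 0) = \frac{P(X = k)}{1 - P(X = 0)}, \qquad k = 1, 2, \ldots$$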
Open Access Article
Generalized Unit Half-Logistic Geometric Distribution: Properties and Regression with Applications to Insurance
Analytics 2023, 2(2), 438-462; https://doi.org/10.3390/analytics2020025 - 16 May 2023
Abstract
The use of distributions to model and quantify risk is essential in risk assessment and management. In this study, the generalized unit half-logistic geometric (GUHLG) distribution is developed to model bounded insurance data on the unit interval. The corresponding probability density function plots indicate that the related distribution can handle data that exhibit left-skewed, right-skewed, symmetric, reversed-J, and bathtub shapes. The hazard rate function also suggests that the distribution can be applied to analyze data with bathtub-shaped, N-shaped, and increasing failure rates. Subsequently, the inferential aspects of the proposed model are investigated. In particular, Monte Carlo simulation exercises are carried out to examine the performance of the estimation method by using an algorithm to generate random observations from the quantile function. The results of the simulation suggest that the considered estimation method is efficient. The univariate application of the distribution and the multivariate application of the associated regression using risk survey data reveal that the model provides a better fit than the other existing distributions and regression models. Under the multivariate application, we estimate the parameters of the regression model using both maximum likelihood and Bayesian estimation. The estimates of the parameters for the two methods are very close. Diagnostic plots of the Bayesian method using the trace, ergodic, and autocorrelation plots reveal that the chains converge to a stationary distribution.
Full article
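The simulation step described above, generating random observations from the quantile function, is standard inverse-transform sampling. A minimal Python sketch of the general technique; `quantile` is a placeholder for the GUHLG quantile function (an assumption for illustration, not the authors' code):

```python
import numpy as np

def sample_from_quantile(quantile, size, seed=None):
    """Inverse-transform sampling: push Uniform(0, 1) draws
    through a distribution's quantile function Q = F^{-1}."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(0.0, 1.0, size=size)
    return quantile(u)

# Demonstration with a known quantile function (exponential, rate 1),
# standing in for the GUHLG quantile:
x = sample_from_quantile(lambda u: -np.log1p(-u), size=1000, seed=42)
```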

Open Access Feature Paper Article
Clustering Matrix Variate Longitudinal Count Data
Analytics 2023, 2(2), 426-437; https://doi.org/10.3390/analytics2020024 - 05 May 2023
Abstract
Matrix variate longitudinal discrete data can arise in transcriptomics studies when the data are collected for N genes at r conditions over t time points, and thus, each observation can be written as an r × t matrix. When dealing with such data, the number of parameters in the model can be greatly reduced by considering the matrix variate structure. The components of the covariance matrix then also provide a meaningful interpretation. In this work, a mixture of matrix variate Poisson-log normal distributions is introduced for clustering longitudinal read counts from RNA-seq studies. To account for the longitudinal nature of the data, a modified Cholesky decomposition is utilized for a component of the covariance structure. Furthermore, a parsimonious family of models is developed by imposing constraints on elements of these decompositions. The models are applied to both real and simulated data, and it is demonstrated that the proposed approach can recover the underlying cluster structure.
Full article
(This article belongs to the Special Issue Feature Papers in Analytics)
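For background, the modified Cholesky decomposition used for the longitudinal component of the covariance structure is commonly stated as follows (a standard parameterization, not necessarily the paper's exact notation):

$$\Sigma^{-1} = T^{\top} D^{-1} T,$$

where $T$ is a unit lower-triangular matrix whose sub-diagonal entries act as generalized autoregressive coefficients across time points, and $D$ is a diagonal matrix of innovation variances; constraining elements of $T$ and $D$ is what yields a parsimonious family of models.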
Open Access Article
Wavelet Support Vector Censored Regression
Analytics 2023, 2(2), 410-425; https://doi.org/10.3390/analytics2020023 - 04 May 2023
Abstract
Learning methods in survival analysis have the ability to handle censored observations. The Cox model is a prevalent predictive statistical technique for survival analysis, but its use rests on the strong assumption of hazard proportionality, which can be challenging to verify, particularly when working with non-linearity and high-dimensional data. Therefore, it may be necessary to consider a more flexible and generalizable approach, such as support vector machines. This paper proposes a new method, namely wavelet support vector censored regression, and compares it with the Cox model, traditional support vector regression, and support vector regression for censored data, i.e., survival models based on support vector machines. In addition, to evaluate the effectiveness of different kernel functions in the support vector censored regression approach to survival data, we conducted a series of simulations with varying numbers of observations and ratios of censored data. Based on the simulation results, we found that the wavelet support vector censored regression outperformed the other methods in terms of the C-index. The evaluation was performed on simulations, on survival benchmark datasets, and in a real biomedical application.
Full article
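A common construction for a wavelet kernel in support vector models is the translation-invariant product form with a Morlet-type mother wavelet; a minimal sketch of that general construction (an illustration of the technique, not necessarily the paper's exact kernel):

```python
import numpy as np

def morlet_wavelet_kernel(x, y, a=1.0):
    """Translation-invariant wavelet kernel
    K(x, y) = prod_i h((x_i - y_i) / a),
    with Morlet-type mother wavelet h(u) = cos(1.75 u) * exp(-u^2 / 2)
    and dilation parameter a."""
    u = (np.asarray(x, dtype=float) - np.asarray(y, dtype=float)) / a
    return float(np.prod(np.cos(1.75 * u) * np.exp(-u ** 2 / 2.0)))

print(morlet_wavelet_kernel([1.0, 2.0], [1.5, 1.0], a=2.0))
```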

Open Access Feature Paper Article
Building Neural Machine Translation Systems for Multilingual Participatory Spaces
Analytics 2023, 2(2), 393-409; https://doi.org/10.3390/analytics2020022 - 01 May 2023
Abstract
This work presents the development of the translation component in a multistage, multilevel, multimode, multilingual and dynamic deliberative (M4D2) system, built to facilitate automated moderation and translation in the languages of five European countries: Italy, Ireland, Germany, France and Poland. Two main topics were to be addressed in the deliberation process: (i) the environment and climate change; and (ii) the economy and inequality. In this work, we describe the development of neural machine translation (NMT) models for these domains for six European languages: Italian, English (included as the second official language of Ireland), Irish, German, French and Polish. As a result, we generate 30 NMT models, initially baseline systems built using freely available online data, which are then adapted to the domains of interest in the project by (i) filtering the corpora, (ii) tuning the systems with automatically extracted in-domain development datasets and (iii) using corpus concatenation techniques to expand the amount of data available. We compare the results produced by our domain-adapted systems with those produced by Google Translate, and demonstrate that fast, high-quality systems can be produced that facilitate multilingual deliberation in a secure environment.
Full article
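The abstract mentions filtering the corpora without detailing the criteria; a typical first-pass heuristic in parallel-corpus cleaning is a sentence-length and length-ratio check. A minimal sketch of that generic step (assumed here for illustration, not the project's actual filter):

```python
def length_ratio_filter(pairs, max_ratio=1.5, min_len=1, max_len=100):
    """Keep (source, target) sentence pairs whose token counts are within
    bounds and whose length ratio is plausible for a translation pair."""
    kept = []
    for src, tgt in pairs:
        ns, nt = len(src.split()), len(tgt.split())
        if (min_len <= ns <= max_len and min_len <= nt <= max_len
                and max(ns, nt) / min(ns, nt) <= max_ratio):
            kept.append((src, tgt))
    return kept
```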

Open Access Article
Investigating Online Art Search through Quantitative Behavioral Data and Machine Learning Techniques
Analytics 2023, 2(2), 359-392; https://doi.org/10.3390/analytics2020021 - 26 Apr 2023
Abstract
Studying searcher behavior has been a cornerstone of search engine research for decades, since it can lead to a better understanding of user needs and allow for an improved user experience. Going beyond descriptive data analysis and statistics, studies have been utilizing the capabilities of Machine Learning to further investigate how users behave during general purpose searching. But the thematic content of a search greatly affects many aspects of user behavior, which often deviates from general purpose search behavior. Thus, in this study, emphasis is placed specifically on the fields of Art and Cultural Heritage. Insights derived from behavioral data can help Culture and Art institutions streamline their online presence and allow them to better understand their user base. Existing research in this field often focuses on lab studies and explicit user feedback, but this study takes advantage of quantitative data from real-world usage and its analysis through machine learning. Using data collected from real-world usage of the Art Boulevard proprietary search engine for content related to Art and Culture, and by means of Machine Learning-powered tools and methodologies, this article investigates the peculiarities of Art-related online searches. Through clustering, various archetypes of Art search sessions were identified, thus providing insight into the variety of ways in which users interacted with the search engine. Additionally, using extreme gradient boosting, the metrics that were most likely to predict the success of a search session were documented, underlining the importance of various aspects of user activity for search success. Finally, by applying topic modeling on the textual information of user-clicked results, the thematic elements that dominated user interest were investigated, providing an overview of prevalent themes in the fields of Art and Culture. It was established that preferred results revolved mostly around traditional visual Art themes, while academic and historical topics also had a strong presence.
Full article
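As a concrete illustration of the gradient-boosting step, one can train a classifier on per-session features and inspect which of them drive predicted search success. The feature names and data below are hypothetical; only the general pattern reflects the study:

```python
import numpy as np
from xgboost import XGBClassifier

# Hypothetical per-session behavioral features (illustrative names only):
feature_names = ["n_queries", "n_clicks", "session_seconds", "scroll_depth"]
rng = np.random.default_rng(0)
X = rng.random((500, len(feature_names)))
y = (X[:, 1] + 0.2 * rng.standard_normal(500) > 0.5).astype(int)  # "successful" flag

model = XGBClassifier(n_estimators=100, max_depth=3)
model.fit(X, y)
for name, imp in zip(feature_names, model.feature_importances_):
    print(f"{name}: {imp:.3f}")
```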

Open Access Article
The AI Learns to Lie to Please You: Preventing Biased Feedback Loops in Machine-Assisted Intelligence Analysis
Analytics 2023, 2(2), 350-358; https://doi.org/10.3390/analytics2020020 - 18 Apr 2023
Abstract
Researchers are starting to design AI-powered systems to automatically select and summarize the reports most relevant to each analyst, which raises the issue of bias in the information presented. This article focuses on the selection of relevant reports without an explicit query, a task known as recommendation. Drawing on previous work documenting the existence of human-machine feedback loops in recommender systems, this article reviews potential biases and mitigations in the context of intelligence analysis. Such loops can arise when behavioral “engagement” signals such as clicks or user ratings are used to infer the value of displayed information. Even worse, there can be feedback loops in the collection of intelligence information because users may also be responsible for tasking collection. Avoiding misalignment feedback loops requires an alternate, ongoing, non-engagement signal of information quality. Existing evaluation scales for intelligence product quality and rigor, such as the IC Rating Scale, could provide ground-truth feedback. This sparse data can be used in two ways: for human supervision of average performance and to build models that predict human survey ratings for use at recommendation time. Both techniques are widely used today by social media platforms. Open problems include the design of an ideal human evaluation method, the cost of skilled human labor, and the sparsity of the resulting data.
Full article
(This article belongs to the Special Issue Selected Papers from the 2022 Summer Conference on Applied Data Science (SCADS))
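A minimal sketch of the second use mentioned above, folding a model's predicted human quality rating into ranking alongside engagement; the field names and the linear blend are assumptions for illustration, not a prescribed design:

```python
def rank_reports(reports, w_quality=0.5):
    """Order candidate reports by a mix of predicted engagement and
    predicted human-rated quality (e.g., a model trained on IC Rating
    Scale labels); w_quality controls how much the quality signal
    counteracts engagement-driven feedback loops."""
    def score(r):
        return (1 - w_quality) * r["p_engagement"] + w_quality * r["pred_quality"]
    return sorted(reports, key=score, reverse=True)

reports = [
    {"id": "a", "p_engagement": 0.9, "pred_quality": 0.2},
    {"id": "b", "p_engagement": 0.5, "pred_quality": 0.8},
]
print([r["id"] for r in rank_reports(reports)])  # quality pulls "b" ahead of "a"
```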
Open Access Editorial
Data Stream Analytics
Analytics 2023, 2(2), 346-349; https://doi.org/10.3390/analytics2020019 - 14 Apr 2023
Abstract
The human brain works in such a complex way that we have not yet managed to decipher its functional mysteries [...]
Full article
(This article belongs to the Special Issue Feature Papers in Analytics)
Open Access Article
Development of a Dynamically Adaptable Routing System for Data Analytics Insights in Logistic Services
Analytics 2023, 2(2), 328-345; https://doi.org/10.3390/analytics2020018 - 13 Apr 2023
Abstract
This work proposes an effective solution to the Vehicle Routing Problem, taking into account all phases of the delivery process. When compared to real-world data, the findings are encouraging and demonstrate the value of Machine Learning algorithms incorporated into the process. Several algorithms were combined along with a modified Hopfield network to deliver the optimal solution to a multiobjective problem on a platform capable of monitoring the various phases of the process. Additionally, a system providing viable insights and analytics in regard to the orders was developed. The results reveal a maximum distance saving of 25% and a maximum overall delivery time saving of 14%.
Full article

Open Access Article
Metric Ensembles Aid in Explainability: A Case Study with Wikipedia Data
Analytics 2023, 2(2), 315-327; https://doi.org/10.3390/analytics2020017 - 07 Apr 2023
Abstract
In recent years, as machine learning models have become larger and more complex, it has become both more difficult and more important to be able to explain and interpret the results of those models, both to prevent model errors and to inspire confidence for end users of the model. As such, there has been a significant and growing interest in explainability in recent years as a highly desirable trait for a model to have. Similarly, there has been much recent attention on ensemble methods, which aim to aggregate results from multiple (often simple) models or metrics in order to outperform models that optimize for only a single metric. We argue that this latter issue can actually assist with the former: a model that optimizes for several metrics has some base level of explainability baked into the model, and this explainability can be leveraged not only for user confidence but to fine-tune the weights between the metrics themselves in an intuitive way. We demonstrate a case study of such a benefit, in which we obtain clear, explainable results based on an aggregate of five simple metrics of relevance, using Wikipedia data as a proxy for some large text-based recommendation problem. We demonstrate that not only can these metrics’ simplicity and multiplicity be leveraged for explainability, but in fact, that very explainability can lead to an intuitive fine-tuning process that improves the model itself.
Full article
(This article belongs to the Special Issue Selected Papers from the 2022 Summer Conference on Applied Data Science (SCADS))
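A minimal sketch of the aggregation idea, several individually inspectable metric scores combined through tunable weights; the metric values and weights below are illustrative, not the paper's:

```python
import numpy as np

def aggregate_relevance(metric_scores, weights):
    """Weighted aggregate of several simple relevance metrics.
    Each metric stays individually inspectable (the source of the
    explainability), and the weights can then be fine-tuned based
    on that inspection."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return float(np.asarray(metric_scores, dtype=float) @ w)

# Five hypothetical metric scores for one candidate document:
print(aggregate_relevance([0.9, 0.4, 0.7, 0.2, 0.6], weights=[1, 1, 2, 1, 1]))
```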
Open Access Article
Readability Indices Do Not Say It All on a Text Readability
Analytics 2023, 2(2), 296-314; https://doi.org/10.3390/analytics2020016 - 30 Mar 2023
Cited by 1
Abstract
We propose a universal readability index applicable to any alphabetical language and related to cognitive psychology, the theory of communication, phonics and linguistics. This index also considers readers’ short-term-memory processing capacity, here modeled by the word interval, namely, the number of words between two interpunctions. No current readability formula considers the word interval, but scatterplots of the word interval versus a readability index show that texts with the same readability index can have very different word intervals, ranging from 4 to 9, practically Miller’s range, which refers to 95% of readers. It is unlikely that the word interval has no impact on reading difficulty. The examples shown are taken from Italian and English Literatures, and from the translations of The New Testament in Latin and in contemporary languages. We also propose an extremely compact formula, relating the capacity of human short-term memory to the difficulty of reading a text. It should synthetically model human reading difficulty, a kind of “footprint” of humans. However, further experimental and multidisciplinary work is necessary to confirm our conjecture about the dependence of a readability index on a reader’s short-term-memory capacity.
Full article
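A minimal sketch of computing the word interval as defined above, the number of words between two interpunctions; the exact set of punctuation marks treated as interpunctions is an assumption:

```python
import re

def mean_word_interval(text):
    """Average number of words between consecutive interpunctions
    (here assumed to be . , ; : ? !)."""
    segments = re.split(r"[.,;:?!]", text)
    counts = [len(s.split()) for s in segments if s.strip()]
    return sum(counts) / len(counts)

print(mean_word_interval(
    "Call me Ishmael. Some years ago, never mind how long, I went to sea."
))  # -> 3.5
```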

Open Access Article
Upgraded Thoth: Software for Data Visualization and Statistics
Analytics 2023, 2(1), 284-295; https://doi.org/10.3390/analytics2010015 - 16 Mar 2023
Abstract
Thoth is a free desktop/laptop software application with a friendly graphical user interface that facilitates routine data-visualization and statistical-calculation tasks for astronomy and astrophysical research (and other fields where numbers are visualized). This software has been upgraded with many significant improvements and new capabilities. The major upgrades consist of: (1) six new graph types, including 3D stacked-bar charts and 3D surface plots, made by the Orson 3D Charts library; (2) new saving and loading of graph settings; (3) a new batch-mode or command-line operation; (4) new graph-data annotation functions; (5) new options for data-file importation; and (6) a new built-in FITS-image viewer. Thoth now requires Java 1.8 or higher. Many other miscellaneous minor upgrades and bug fixes have also been made to Thoth. The newly implemented plotting options generally make possible graph construction and reuse with relative ease, without resorting to writing computer code. The illustrative astronomy case study of this paper demonstrates one of the many ways the software can be utilized. These new software features and refinements help make astronomers more efficient in their work of elucidating data.
Full article
(This article belongs to the Special Issue Feature Papers in Analytics)
Open Access Article
Application of Mixture Models for Doubly Inflated Count Data
Analytics 2023, 2(1), 265-283; https://doi.org/10.3390/analytics2010014 - 11 Mar 2023
Abstract
In health and social science and other fields where count data analysis is important, zero-inflated models have been employed when the frequency of zero count is high (inflated). Due to multiple reasons, there are scenarios in which an additional count value of k > 0 occurs with high frequency. The zero- and k-inflated Poisson distribution model (ZkIP) is more appropriate for such situations. The ZkIP model is a mixture distribution with three components: degenerate distributions at counts 0 and k, and a Poisson distribution. In this article, we propose an alternative and computationally fast expectation–maximization (EM) algorithm to obtain the parameter estimates for grouped zero- and k-inflated count data. The asymptotic standard errors are derived using the complete data approach. We compare the zero- and k-inflated Poisson model with its zero-inflated and non-inflated counterparts. The best model is selected based on commonly used criteria. The theoretical results are supplemented with the analysis of two real-life datasets from health sciences.
Full article
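Following the three-component description above, the ZkIP probability mass function takes the standard inflated-mixture form (notation ours, not necessarily the authors'):

$$P(X = x) = \pi_0 \,\mathbb{1}\{x = 0\} + \pi_k \,\mathbb{1}\{x = k\} + (1 - \pi_0 - \pi_k)\,\frac{e^{-\lambda}\lambda^{x}}{x!}, \qquad x = 0, 1, 2, \ldots,$$

where $\pi_0$ and $\pi_k$ are the inflation proportions at counts 0 and $k$, and $\lambda$ is the Poisson mean.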
Open Access Article
A Voronoi-Based Semantically Balanced Dummy Generation Framework for Location Privacy
Analytics 2023, 2(1), 246-264; https://doi.org/10.3390/analytics2010013 - 03 Mar 2023
Abstract
Location-based services (LBS) require users to provide their current location for service delivery and customization. Location privacy protection addresses concerns associated with the potential mishandling of location information submitted to the LBS provider. Location accuracy has a direct impact on the quality of service (QoS), where higher location accuracy results in better QoS. In general, the main goal of any location privacy technique is to achieve maximum QoS while providing minimum or no location information if possible, and using dummy locations is one such location privacy technique. In this paper, we introduced a temporal constraint attack whereby an adversary can exploit the temporal constraints associated with the semantic category of locations to eliminate dummy locations and identify the true location. We demonstrated how an adversary can devise a temporal constraint attack to breach the location privacy of a residential location. We addressed this major limitation of the current dummy approaches with a novel Voronoi-based semantically balanced framework (VSBDG) capable of generating dummy locations that can withstand a temporal constraint attack. Built based on real-world geospatial datasets, the VSBDG framework leverages spatial relationships and operations. Our results show a high physical dispersion cosine similarity of 0.988 between the semantic categories even with larger location set sizes. This indicates a strong and scalable semantic balance for each semantic category within the VSBDG’s output location set. The VSBDG algorithm is capable of producing location sets with high average minimum dispersion distance values of 5861.894 m for residential locations and 6258.046 m for POI locations. The findings demonstrate that the locations within each semantic category are scattered farther apart, which strengthens location privacy.
Full article
(This article belongs to the Special Issue Feature Papers in Analytics)
Open Access Article
Survey of Distances between the Most Popular Distributions
Analytics 2023, 2(1), 225-245; https://doi.org/10.3390/analytics2010012 - 01 Mar 2023
Abstract
We present a number of upper and lower bounds for the total variation distances between the most popular probability distributions. In particular, some estimates of the total variation distances in the cases of multivariate Gaussian distributions, Poisson distributions, binomial distributions, between a binomial and a Poisson distribution, and also in the case of negative binomial distributions are given. Next, estimates of the Lévy–Prohorov distance in terms of Wasserstein metrics are discussed, and the Fréchet, Wasserstein and Hellinger distances for multivariate Gaussian distributions are evaluated. Some novel context-sensitive distances are introduced, and a number of bounds mimicking the classical results from information theory are proved.
Full article
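For orientation, the total variation distance at the center of these bounds is defined, for probability measures $P$ and $Q$, as

$$d_{TV}(P, Q) = \sup_{A} \left| P(A) - Q(A) \right| = \frac{1}{2} \sum_{x} \left| p(x) - q(x) \right|,$$

with the second equality holding in the discrete case for probability mass functions $p$ and $q$.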

Open Access Article
Smart Multimedia Information Retrieval
Analytics 2023, 2(1), 198-224; https://doi.org/10.3390/analytics2010011 - 20 Feb 2023
Abstract
The area of multimedia information retrieval (MMIR) faces two major challenges: the enormously growing number of multimedia objects (i.e., images, videos, audio, and text files), and the fast increasing level of detail of these objects (e.g., the number of pixels in images). Both challenges lead to a high demand for scalability, semantic representations, and explainability of MMIR processes. Smart MMIR solves these challenges by employing graph codes as an indexing structure, attaching semantic annotations for explainability, and employing application profiling for scaling, which results in human-understandable, expressive, and interoperable MMIR. The mathematical foundation, the modeling, implementation details, and experimental results are shown in this paper, which confirm that Smart MMIR improves MMIR in the area of efficiency, effectiveness, and human understandability.
Full article

Open Access Article
The SP Theory of Intelligence, and Its Realisation in the SP Computer Model, as a Foundation for the Development of Artificial General Intelligence
Analytics 2023, 2(1), 163-197; https://doi.org/10.3390/analytics2010010 - 17 Feb 2023
Abstract
The theme of this paper is that the SP Theory of Intelligence (SPTI), and its realisation in the SP Computer Model, is a promising foundation for the development of artificial intelligence at the level of people or higher, also known as ‘artificial general intelligence’ (AGI). The SPTI, and alternatives to the SPTI chosen to be representative of potential foundations for the development of AGI, are considered and compared. A key principle in the SPTI and its development is the importance of information compression (IC) in human learning, perception, and cognition. More specifically, IC in the SPTI is achieved via the powerful concept of SP-multiple-alignment, the key to the versatility of the SPTI in diverse aspects of intelligence, and thus a favourable combination of Simplicity with descriptive and explanatory Power. Since there are many uncertainties between where we are now and anything, far into the future, that might qualify as an AGI, a multi-pronged attack on the problem is needed. The SPTI qualifies as the basis for one of those prongs. Although it will take time to achieve AGI, there is potential along the road for many useful benefits and applications of the research.
Full article

Open Access Article
Dynamic Skyline Computation with LSD Trees
Analytics 2023, 2(1), 146-162; https://doi.org/10.3390/analytics2010009 - 09 Feb 2023
Abstract
Given a set S of high-dimensional feature vectors, the skyline or Pareto problem is to report the subset of vectors in S that are not dominated by any vector of S. Vectors closer to the origin are preferred: we say a vector x is dominated by another distinct vector y if x is equally or further away from the origin than y with respect to all its dimensions. The dynamic skyline problem allows us to shift the origin, which changes the answer set. This problem is crucial for dynamic recommender systems where users can shift the parameters and thus shift the origin. For each origin shift, a recomputation of the answer set from scratch is time intensive. To tackle this problem, we propose a parallel algorithm for dynamic skyline computation that uses multiple local split decision (LSD) trees concurrently. The geometric nature of the LSD trees allows us to reuse previous results. Experiments show that our proposed algorithm works well if the dimension is small in relation to the number of tuples to process.
Full article
(This article belongs to the Special Issue Feature Papers in Analytics)
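A minimal sketch of the dominance test and a naive quadratic dynamic skyline, for illustration only; the paper's contribution is the parallel LSD-tree algorithm that avoids this recomputation from scratch, and ties are resolved here with the usual strict-in-at-least-one-dimension convention:

```python
import numpy as np

def dominates(y, x):
    """y dominates x if y is at least as close to the origin in every
    dimension and strictly closer in at least one."""
    y, x = np.abs(y), np.abs(x)
    return bool(np.all(y <= x) and np.any(y < x))

def dynamic_skyline(points, origin):
    """Naive O(n^2) baseline: shift all points by the query origin,
    then keep those not dominated by any other point."""
    pts = np.asarray(points, dtype=float) - np.asarray(origin, dtype=float)
    return [p for i, p in enumerate(pts)
            if not any(dominates(q, p) for j, q in enumerate(pts) if j != i)]

print(dynamic_skyline([[1, 3], [2, 2], [3, 3]], origin=[0, 0]))
```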
Open Access Technical Note
Untangling Energy Consumption Dynamics with Renewable Energy Using Recurrent Neural Network
Analytics 2023, 2(1), 132-145; https://doi.org/10.3390/analytics2010008 - 01 Feb 2023
Abstract
The environmental issues we are currently facing require long-term prospective efforts for sustainable growth. Renewable energy sources seem to be one of the most practical and efficient alternatives in this regard. Understanding a nation’s pattern of energy use and renewable energy production is crucial for developing strategic plans. No previous study has been performed to explore the dynamics of power consumption with the change in renewable energy production on a country-wide scale. In contrast, a number of deep learning algorithms have demonstrated acceptable performance while handling sequential data in the era of data-driven predictions. In this study, we developed a scheme to investigate and predict total power consumption and renewable energy production time series for eleven years of data using a recurrent neural network (RNN). The dynamics of the interaction between the total annual power consumption and renewable energy production were investigated through extensive exploratory data analysis (EDA) and a feature engineering framework. The performance of the model was found to be satisfactory through the comparison of the predicted data with the observed data, the visualization of the distribution of the errors, the root mean squared error (RMSE) value of 0.084, and the R2 value of 0.82. Higher performance was achieved by increasing the number of epochs and hyperparameter tuning. The proposed framework has the potential to be used and transferred to investigate the trend of renewable energy production and power consumption and predict future scenarios for different communities. The incorporation of a cloud-based platform into the proposed pipeline to perform predictive studies from data acquisition to outcome generation may lead to real-time forecasting.
Full article
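A minimal sketch of the kind of recurrent forecaster described, predicting the next value of a series from a sliding window; the architecture, window length, and synthetic stand-in data are assumptions, not the paper's exact model:

```python
import numpy as np
from tensorflow import keras

def make_windows(series, window):
    """Turn a 1-D series into (samples, window, 1) inputs and next-step targets."""
    X = np.stack([series[i:i + window] for i in range(len(series) - window)])
    return X[..., None], series[window:]

window = 12  # look-back length (an assumption)
series = np.sin(np.linspace(0.0, 20.0, 200))  # stand-in for consumption data
X, y = make_windows(series, window)

model = keras.Sequential([
    keras.layers.Input(shape=(window, 1)),
    keras.layers.SimpleRNN(32),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, verbose=0)
```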

Open Access Article
Theory-Guided Analytics Process: Using Theories to Underpin an Analytics Process for New Banking Product Development Using Segmentation-Based Marketing Analytics Leveraging on Marketing Intelligence
Analytics 2023, 2(1), 105-131; https://doi.org/10.3390/analytics2010007 - 01 Feb 2023
Abstract
Retail banking is undergoing considerable product competition and disruption. New product development is necessary to tackle such challenges and reinvigorate product lines. This study presents an instrumental real-life banking case study, where marketing analytics was utilized to drive a product differentiation strategy. In particular, the study applied the unsupervised machine learning techniques of link analysis, latent class analysis, and association analysis to undertake behavioral-based market segmentation, in view of attaining a profitable competitive advantage. To underpin the product development process with well-grounded theoretical framing, this study asked the research question: “How may we establish a theory-driven approach for an analytics-driven process?” Findings of this study include a theoretical conceptual framework that underpinned the end-to-end segmentation-driven new product development process, backed by the empirical literature. The study hopes to provide: (i) for managerial practitioners, the use of case-based reasoning for practice-oriented new product development design, planning, and diagnosis efforts, and (ii) for researchers, the potential to test the validity and robustness of an analytics-driven NPD process. The study also hopes to drive a wider research interest in the theory-driven approach for analytics-driven processes.
Full article

Special Issues
Special Issue in Analytics
Visual Analytics: Techniques and Applications
Guest Editors: Kostiantyn Kucher, Katerina Vrotsou. Deadline: 30 November 2023
Special Issue in Analytics
New Insights in Learning Analytics
Guest Editors: Kinshuk, Vivekanandan Suresh Kumar. Deadline: 31 December 2023