Feature Papers in Analytics

A special issue of Analytics (ISSN 2813-2203).

Deadline for manuscript submissions: closed (31 December 2023) | Viewed by 13006

Special Issue Editors

School of Engineering, Pablo de Olavide University, Seville, Spain
Interests: data analytics; deep learning; feature selection; classification
School of System Engineering, Universidad Nacional de San Agustin de Arequipa, Arequipa, Peru
Interests: geospatial analytics; mobile
* Associate Professor

Special Issue Information

Dear Colleagues,

As Editor-in-Chief of Analytics, I am pleased to announce the Special Issue “Feature Papers in Analytics”, which will host a collection of high-quality papers (original research articles or comprehensive reviews) from top academics addressing the interdisciplinary nature of Data Analytics. I welcome the submission of manuscripts from Editorial Board Members, in addition to outstanding scholars invited by the Editorial Board and the Editorial Office, related to any of the topics covered in the scope of the journal: https://www.mdpi.com/journal/analytics/about.

You are invited to send short proposals for submissions to our Editorial Office (analytics@mdpi.com) for evaluation. Please note that selected full papers will still be subject to thorough and rigorous peer review.

Prof. Dr. Jesus S. Aguilar-Ruiz
Ernesto Mauro Suarez Lopez
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Analytics is an international peer-reviewed open access quarterly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1000 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (9 papers)

Editorial

4 pages, 194 KiB  
Editorial
Data Stream Analytics
by Jesus S. Aguilar-Ruiz, Albert Bifet and Joao Gama
Analytics 2023, 2(2), 346-349; https://doi.org/10.3390/analytics2020019 - 14 Apr 2023
Viewed by 1241
Abstract
The human brain works in such a complex way that we have not yet managed to decipher its functional mysteries [...]
(This article belongs to the Special Issue Feature Papers in Analytics)

Research

16 pages, 3053 KiB  
Article
Exploring Infant Physical Activity Using a Population-Based Network Analysis Approach
by Rama Krishna Thelagathoti, Priyanka Chaudhary, Brian Knarr, Michaela Schenkelberg, Hesham H. Ali and Danae Dinkel
Analytics 2024, 3(1), 14-29; https://doi.org/10.3390/analytics3010002 - 31 Dec 2023
Viewed by 711
Abstract
Background: Physical activity (PA) is an important aspect of infant development and has been shown to have long-term effects on health and well-being. Accurate analysis of infant PA is crucial for understanding their physical development, monitoring health and wellness, as well as identifying areas for improvement. However, individual analysis of infant PA can be challenging and often leads to biased results due to an infant’s inability to self-report and constantly changing posture and movement. This manuscript explores a population-based network analysis approach to study infants’ PA. The network analysis approach allows us to draw conclusions that are generalizable to the entire population and to identify trends and patterns in PA levels. Methods: This study aims to analyze the PA of infants aged 6–15 months using accelerometer data. A total of 20 infants from different types of childcare settings were recruited, including home-based and center-based care. Each infant wore an accelerometer for four days (2 weekdays, 2 weekend days). Data were analyzed using a network analysis approach, exploring the relationship between PA and various demographic and social factors. Results: The results showed that infants in center-based care have significantly higher levels of PA than those in home-based care. Moreover, the ankle acceleration was much higher than the waist acceleration, and activity patterns differed on weekdays and weekends. Conclusions: This study highlights the need for further research to explore the factors contributing to disparities in PA levels among infants in different childcare settings. Additionally, there is a need to develop effective strategies to promote PA among infants, considering the findings from the network analysis approach. Such efforts can contribute to enhancing infant health and well-being through targeted interventions aimed at increasing PA levels.

28 pages, 1699 KiB  
Article
A Novel Curve Clustering Method for Functional Data: Applications to COVID-19 and Financial Data
by Ting Wei and Bo Wang
Analytics 2023, 2(4), 781-808; https://doi.org/10.3390/analytics2040041 - 08 Oct 2023
Viewed by 1080
Abstract
Functional data analysis has significantly enriched the landscape of existing data analysis methodologies, providing a new framework for comprehending data structures and extracting valuable insights. This paper is dedicated to addressing functional data clustering, a pivotal challenge within functional data analysis. Our contribution to this field manifests through the introduction of innovative clustering methodologies tailored specifically to functional curves. Initially, we present a proximity measure algorithm designed for functional curve clustering. This innovative clustering approach offers the flexibility to redefine measurement points on continuous functions, adapting to either equidistant or nonuniform arrangements, as dictated by the demands of the proximity measure. Central to this method is the “proximity threshold”, a critical parameter that governs the cluster count, and its selection is thoroughly explored. Subsequently, we propose a time-shift clustering algorithm designed for time-series data. This approach identifies historical data segments that share patterns similar to those observed in the present. To evaluate the effectiveness of our methodologies, we conduct comparisons with the classic K-means clustering method and apply them to simulated data, yielding encouraging simulation results. Moving beyond simulation, we apply the proposed proximity measure algorithm to COVID-19 data, yielding notable clustering accuracy. Additionally, the time-shift clustering algorithm is employed to analyse NASDAQ Composite data, successfully revealing underlying economic cycles.

36 pages, 34844 KiB  
Article
Image Segmentation of the Sudd Wetlands in South Sudan for Environmental Analytics by GRASS GIS Scripts
by Polina Lemenkova
Analytics 2023, 2(3), 745-780; https://doi.org/10.3390/analytics2030040 - 21 Sep 2023
Cited by 1 | Viewed by 1410
Abstract
This paper presents the object detection algorithms of GRASS GIS applied to Landsat 8-9 OLI/TIRS data. The study area includes the Sudd wetlands located in South Sudan. This study describes a programming method for the automated processing of satellite images for environmental analytics, applying the scripting algorithms of GRASS GIS. This study documents how the land cover changed and developed over time in South Sudan with varying climate and environmental settings, indicating the variations in landscape patterns. A set of modules was used to process satellite images through a scripting language, which streamlines the geospatial processing tasks. The image processing functionality of the GRASS GIS modules is called within scripts as subprocesses that automate operations. The cutting-edge tools of GRASS GIS present a cost-effective solution to remote sensing data modelling and analysis. This is based on the discrimination of the spectral reflectance of pixels on the raster scenes. Scripting algorithms of remote sensing data processing based on the GRASS GIS syntax are run from the terminal, which makes it possible to pass commands to the module. This ensures the automation and high speed of image processing. The algorithmic challenge is that landscape patterns differ substantially, and there are nonlinear dynamics in land cover types due to environmental factors and climate effects. Time series analysis of several multispectral images demonstrated changes in land cover types over the study area of the Sudd, South Sudan affected by environmental degradation of landscapes. The map is generated for each Landsat image from 2015 to 2023 using 481 maximum-likelihood discriminant analysis approaches of classification. The methodology includes image segmentation by the ‘i.segment’ module, image clustering and classification by the ‘i.cluster’ and ‘i.maxlike’ modules, accuracy assessment by the ‘r.kappa’ module, and computing NDVI and cartographic mapping implemented using GRASS GIS. The benefits of object detection techniques for image analysis are demonstrated with the reported effects of various threshold levels of segmentation. The segmentation was performed 371 times with 90% of the threshold and minsize = 5; the process converged in 37 to 41 iterations. The following segments are defined for images: 4515 for 2015, 4813 for 2016, 4114 for 2017, 5090 for 2018, 6021 for 2019, 3187 for 2020, 2445 for 2022, and 5181 for 2023. The percent convergence is 98% for the processed images. Detecting variations in land cover patterns is possible using spaceborne datasets and advanced applications of scripting algorithms. The implications of the cartographic approach for environmental landscape analysis are discussed. The algorithm for image processing is based on a set of GRASS GIS wrapper functions for automated image classification.

12 pages, 421 KiB  
Article
Clustering Matrix Variate Longitudinal Count Data
by Sanjeena Subedi
Analytics 2023, 2(2), 426-437; https://doi.org/10.3390/analytics2020024 - 05 May 2023
Viewed by 1221
Abstract
Matrix variate longitudinal discrete data can arise in transcriptomics studies when the data are collected for N genes at r conditions over t time points, and thus, each observation $Y_n$ for $n = 1, \ldots, N$ can be written as an $r \times t$ matrix. When dealing with such data, the number of parameters in the model can be greatly reduced by considering the matrix variate structure. The components of the covariance matrix then also provide a meaningful interpretation. In this work, a mixture of matrix variate Poisson-log normal distributions is introduced for clustering longitudinal read counts from RNA-seq studies. To account for the longitudinal nature of the data, a modified Cholesky-decomposition is utilized for a component of the covariance structure. Furthermore, a parsimonious family of models is developed by imposing constraints on elements of these decompositions. The models are applied to both real and simulated data, and it is demonstrated that the proposed approach can recover the underlying cluster structure.

12 pages, 1510 KiB  
Article
Upgraded Thoth: Software for Data Visualization and Statistics
by Russ R. Laher, Frank J. Masci, Luisa M. Rebull, Steven D. Schurr, Wendy Burt, Anastasia Laity, Melanie Swain, David L. Shupe, Steve Groom, Benjamin Rusholme, Mih-Seh Kong, John C. Good, Varoujan Gorjian, Rachel Akeson, Benjamin J. Fulton, David R. Ciardi and Sean Carey
Analytics 2023, 2(1), 284-295; https://doi.org/10.3390/analytics2010015 - 16 Mar 2023
Viewed by 1429
Abstract
Thoth is a free desktop/laptop software application with a friendly graphical user interface that facilitates routine data-visualization and statistical-calculation tasks for astronomy and astrophysical research (and other fields where numbers are visualized). This software has been upgraded with many significant improvements and new capabilities. The major upgrades consist of: (1) six new graph types, including 3D stacked-bar charts and 3D surface plots, made by the Orson 3D Charts library; (2) new saving and loading of graph settings; (3) a new batch-mode or command-line operation; (4) new graph-data annotation functions; (5) new options for data-file importation; and (6) a new built-in FITS-image viewer. There is now the requirement that Thoth be run under Java 1.8 or higher. Many other miscellaneous minor upgrades and bug fixes have also been made to Thoth. The newly implemented plotting options generally make possible graph construction and reuse with relative ease, without resorting to writing computer code. The illustrative astronomy case study of this paper demonstrates one of the many ways the software can be utilized. These new software features and refinements help make astronomers more efficient in their work of elucidating data.

19 pages, 3332 KiB  
Article
A Voronoi-Based Semantically Balanced Dummy Generation Framework for Location Privacy
by Aditya Tadakaluru and Xiao Qin
Analytics 2023, 2(1), 246-264; https://doi.org/10.3390/analytics2010013 - 03 Mar 2023
Viewed by 1389
Abstract
Location-based services (LBS) require users to provide their current location for service delivery and customization. Location privacy protection addresses concerns associated with the potential mishandling of location information submitted to the LBS provider. Location accuracy has a direct impact on the quality of service (QoS), where higher location accuracy results in better QoS. In general, the main goal of any location privacy technique is to achieve maximum QoS while providing minimum or no location information if possible, and using dummy locations is one such location privacy technique. In this paper, we introduced a temporal constraint attack whereby an adversary can exploit the temporal constraints associated with the semantic category of locations to eliminate dummy locations and identify the true location. We demonstrated how an adversary can devise a temporal constraint attack to breach the location privacy of a residential location. We addressed this major limitation of the current dummy approaches with a novel Voronoi-based semantically balanced framework (VSBDG) capable of generating dummy locations that can withstand a temporal constraint attack. Built based on real-world geospatial datasets, the VSBDG framework leverages spatial relationships and operations. Our results show a high physical dispersion cosine similarity of 0.988 between the semantic categories even with larger location set sizes. This indicates a strong and scalable semantic balance for each semantic category within the VSBDG’s output location set. The VSBDG algorithm is capable of producing location sets with high average minimum dispersion distance values of 5861.894 m for residential locations and 6258.046 m for POI locations. The findings demonstrate that the locations within each semantic category are scattered farther apart, entailing optimized location privacy.

17 pages, 542 KiB  
Article
Dynamic Skyline Computation with LSD Trees
by Dominik Köppl
Analytics 2023, 2(1), 146-162; https://doi.org/10.3390/analytics2010009 - 09 Feb 2023
Viewed by 1427
Abstract
Given a set of high-dimensional feature vectors $S \subset \mathbb{R}^n$, the skyline or Pareto problem is to report the subset of vectors in S that are not dominated by any vector of S. Vectors closer to the origin are preferred: we say a vector x is dominated by another distinct vector y if x is equally or further away from the origin than y with respect to all its dimensions. The dynamic skyline problem allows us to shift the origin, which changes the answer set. This problem is crucial for dynamic recommender systems where users can shift the parameters and thus shift the origin. For each origin shift, a recomputation of the answer set from scratch is time intensive. To tackle this problem, we propose a parallel algorithm for dynamic skyline computation that uses multiple local split decision (LSD) trees concurrently. The geometric nature of the LSD trees allows us to reuse previous results. Experiments show that our proposed algorithm works well if the dimension is small in relation to the number of tuples to process.
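
The dominance rule stated in this abstract can be illustrated with a minimal Python sketch. It is a brute-force quadratic scan that recomputes the answer set from scratch for each origin, which is exactly the costly baseline the abstract mentions; it is not the LSD-tree algorithm of the paper. The function names `dominated` and `dynamic_skyline` and the sample points are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominated(x: Sequence[float], y: Sequence[float], origin: Sequence[float]) -> bool:
    """Return True if x is dominated by a distinct vector y relative to origin.

    Follows the wording of the abstract: x is dominated when it is equally far
    or further away from the origin than y in every dimension.
    Note (assumption): two distinct points with identical per-dimension
    distances dominate each other under this literal reading.
    """
    if tuple(x) == tuple(y):
        return False  # a vector does not dominate itself
    return all(abs(xi - oi) >= abs(yi - oi) for xi, yi, oi in zip(x, y, origin))


def dynamic_skyline(points: List[Point], origin: Sequence[float]) -> List[Point]:
    """Brute-force skyline: keep every point that no other point dominates."""
    return [x for x in points if not any(dominated(x, y, origin) for y in points)]


if __name__ == "__main__":
    pts = [(1, 1), (2, 2), (1, 3), (3, 1), (0, 4)]
    print(dynamic_skyline(pts, (0, 0)))  # [(1, 1), (0, 4)]
    print(dynamic_skyline(pts, (3, 3)))  # [(2, 2), (1, 3), (3, 1)]
```

Shifting the origin from (0, 0) to (3, 3) changes the answer set for the same points, which is the behaviour the dynamic skyline problem is concerned with.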

Review

20 pages, 2518 KiB  
Review
A Brief Survey of Methods for Analytics over RDF Knowledge Graphs
by Maria-Evangelia Papadaki, Yannis Tzitzikas and Michalis Mountantonakis
Analytics 2023, 2(1), 55-74; https://doi.org/10.3390/analytics2010004 - 17 Jan 2023
Cited by 1 | Viewed by 2229
Abstract
There are several Knowledge Graphs expressed in RDF (Resource Description Framework) that aggregate/integrate data from various sources for providing unified access services and enabling insightful analytics. We observe this trend in almost every domain of our life. However, the provision of effective, efficient, and user-friendly analytic services and systems is quite challenging. In this paper, we survey the approaches, systems and tools that enable the formulation of analytic queries over KGs expressed in RDF. We identify the main challenges and distinguish two main categories of analytic queries (domain specific and quality-related) and five kinds of approaches for analytics over RDF. Then, we describe in brief the works of each category and related aspects, like efficiency and visualization. We hope this collection will be useful to researchers and engineers in advancing the capabilities and user-friendliness of methods for analytics over knowledge graphs.