Scientific Data Processing and Analysis

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (30 December 2023) | Viewed by 6657

Special Issue Editor


Dr. Igor V. Bychkov
Guest Editor
Matrosov Institute for System Dynamics and Control Theory, Siberian Branch, Russian Academy of Sciences, 664033 Irkutsk, Russia
Interests: data science; deep learning; artificial intelligence; geoinformation systems; web technologies

Special Issue Information

Dear Colleagues,

This Special Issue is devoted to methods and algorithms for processing and analyzing scientific data. Scientific data cover observational, experimental, and simulation data, including but not limited to remote sensing data, sensor (IoT) data, texts, images, video, and audio obtained in the course of scientific research. Data collection and cleaning methods, data engineering, machine learning, deep learning, big data management and processing technologies, and cloud, parallel, and distributed computing can all be used for processing. Works on the sharing and reuse of scientific data are particularly welcome. This Special Issue aims to provide a forum for the academic community to present and discuss theories, architectures, techniques, and applications of scientific data.

Dr. Igor V. Bychkov
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • data analysis
  • machine learning
  • deep learning
  • data engineering
  • data management and processing
  • data science

Published Papers (4 papers)


Research

15 pages, 4820 KiB  
Article
Analysis of Using Machine Learning Techniques for Estimating Solar Panel Performance in Edge Sensor Devices
by Dalibor Dobrilovic, Jasmina Pekez, Visnja Ognjenovic and Eleonora Desnica
Appl. Sci. 2024, 14(3), 1296; https://doi.org/10.3390/app14031296 - 04 Feb 2024
Viewed by 817
Abstract
The importance of renewable energy sources for powering wireless sensor nodes in IoT and sensor networks grows with the increasing number of deployed sensor nodes. Among renewable energy sources, solar power stands out as the most suitable and is emerging as the major source for powering sensor nodes. Using sensor nodes and the sensor data they collect to estimate solar panel performance, and therefore solar power potential, can thus support efforts in this direction. This paper presents a methodology for implementing edge intelligence on wireless sensor nodes for solar panel output voltage estimation and forecasting. The methodology uses the Python Scikit-learn package and the micromlgen library to implement edge intelligence on Arduino clone-based sensor nodes, particularly development boards based on ESP8266 chips. Scikit-learn is used to analyze the efficiency of various regressors on collected solar data. The micromlgen library is then used to implement those regressors on Arduino and clone nodes. The prediction of solar panel voltage is based on a single sensor reading, from either a UV sensor or a BH1750 light sensor. The Random Forest (RF) and Decision Tree regressors are implemented on an ESP8266-based development board, the Wemos D1 R2. The RF model achieves an MSE of approximately 0.10, an MAE of 0.07 for UV and 0.04 for BH1750, and an R² of approximately 0.93 for both the UV and BH1750 light sensors. The Decision Tree model is less accurate, with an MSE between 0.13 and 0.14, an MAE of 0.07 for UV and 0.04 for BH1750, and an R² of 0.90 and 0.89 for the UV and BH1750 sensors, respectively. The methodology and its efficiency are presented and discussed in this paper.
(This article belongs to the Special Issue Scientific Data Processing and Analysis)
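The error measures quoted in the abstract above (MSE, MAE, and R²) can be illustrated with a minimal sketch. This is not the authors' code: the function name `regression_metrics` and the sample voltages are hypothetical, and only the Python standard library is used so that the example stays self-contained.

```python
def regression_metrics(y_true, y_pred):
    """Return (MSE, MAE, R^2) for two equal-length sequences of numbers."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n      # mean squared error
    mae = sum(abs(e) for e in errors) / n     # mean absolute error

    # R^2 = 1 - (residual sum of squares / total sum of squares)
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return mse, mae, r2


# Hypothetical measured vs. predicted panel voltages (illustrative values only).
measured = [4.8, 5.1, 5.6, 6.0]
predicted = [4.9, 5.0, 5.5, 6.2]
print(regression_metrics(measured, predicted))
```

A lower MSE and MAE and an R² closer to 1 indicate a better fit, which is how the two regressors in the abstract are compared.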

21 pages, 6941 KiB  
Article
Comparison of Artificial Neural Network and Regression Models for Filling Temporal Gaps of Meteorological Variables Time Series
by Egor Dyukarev
Appl. Sci. 2023, 13(4), 2646; https://doi.org/10.3390/app13042646 - 18 Feb 2023
Cited by 1 | Viewed by 1305
Abstract
Continuous time series of meteorological variables are in high demand for various climate-related studies. Five statistical models were tested for filling temporal gaps in time series of surface air pressure, air temperature, relative air humidity, incoming solar radiation, net radiation, and soil temperature. A bilayer artificial neural network, linear regression, linear regression with interactions, and Gaussian process regression models with exponential and rational quadratic kernels were used to fill the gaps. The models were driven by continuous time series of meteorological variables from the ECMWF (European Centre for Medium-Range Weather Forecasts) ERA5-Land reanalysis. Raw ECMWF ERA5-Land reanalysis data are not suitable for characterizing specific local weather conditions. The linear correlation coefficients (CC) between ERA5-Land data and in situ observations vary from 0.61 (for wind direction) to 0.99 (for atmospheric pressure). The mean difference is high, estimated at 3.2 °C for air temperature and 3.5 hPa for atmospheric pressure. The normalized root-mean-square error (NRMSE) is 5–13%, except for wind direction (NRMSE = 49%). Linear bias correction of ERA5-Land data improves the match between the local and reanalysis data for all meteorological variables. The Gaussian process regression model with an exponential kernel or the bilayer artificial neural network trained on ERA5-Land data significantly shifts raw ERA5-Land data toward the observed values. The NRMSE values decrease to 2–11% for all variables, except wind direction (NRMSE = 22%). The CC for the model is above 0.87, except for wind characteristics. The suggested model, calibrated against in situ observations, can be applied to gap-filling of time series of meteorological variables.
(This article belongs to the Special Issue Scientific Data Processing and Analysis)
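The simplest of the ideas mentioned in the abstract above, linear bias correction of reanalysis data followed by gap filling, can be sketched as follows. This is an illustrative assumption, not the paper's implementation: the helper names (`fit_linear`, `fill_gaps`, `nrmse`) and the sample values are hypothetical, gaps are marked with `None`, and NRMSE is assumed here to be the RMSE divided by the range of the observations (the abstract does not state which normalization was used).

```python
import math


def fit_linear(x, y):
    """Ordinary least-squares fit y ~ slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fill_gaps(observed, reanalysis):
    """Fill None entries in `observed` with bias-corrected reanalysis values."""
    # Calibrate only on time steps where an in situ observation exists.
    pairs = [(r, o) for r, o in zip(reanalysis, observed) if o is not None]
    slope, intercept = fit_linear([r for r, _ in pairs], [o for _, o in pairs])
    return [o if o is not None else slope * r + intercept
            for o, r in zip(observed, reanalysis)]


def nrmse(observed, predicted):
    """RMSE normalized by the range of the observations (an assumption)."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return rmse / (max(observed) - min(observed))


# Hypothetical air temperatures: the station record has two gaps.
reanalysis = [10.0, 12.0, 14.0, 16.0, 18.0]
station = [21.0, None, 29.0, 33.0, None]
print(fill_gaps(station, reanalysis))
```

The neural network and Gaussian process models named in the abstract replace the straight-line fit with a more flexible mapping, but the calibrate-then-fill workflow has the same shape.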

22 pages, 4317 KiB  
Article
TKIFRPM: A Novel Approach for Topmost-K Identical Frequent Regular Patterns Mining from Incremental Datasets
by Saif Ur Rehman, Muhammad Altaf Khan, Habib Un Nabi, Shaukat Ali, Noha Alnazzawi and Shafiullah Khan
Appl. Sci. 2023, 13(1), 654; https://doi.org/10.3390/app13010654 - 03 Jan 2023
Viewed by 1261
Abstract
Regular frequent pattern mining (RFPM) approaches aim to discover itemsets with significant frequency and regular occurrence behavior in a dataset. However, these approaches suffer from two main issues: (1) setting the frequency threshold parameter for the discovery of regular frequent patterns is not easy because it depends on the characteristics of the dataset, and (2) RFPM approaches are designed to mine patterns from static datasets and cannot mine dynamic datasets. This paper aims to solve these two issues by proposing a novel top-K identical frequent regular patterns mining (TKIFRPM) approach that operates on online datasets. TKIFRPM maintains a novel synopsis data structure with item support index tables (ISI-tables) to keep summarized information about online committed transactions and dataset updates. The mining operation can discover top-K regular frequent patterns from the online data stored in the ISI-tables. TKIFRPM explores the search space in recursive depth-first order and applies a novel progressive node sub-tree pruning strategy to rapidly eliminate complete infrequent sub-trees from the search space. TKIFRPM is compared with the MTKPP approach and is found to outperform it in runtime and memory usage when mining the designated topmost-K frequent regular patterns from datasets following incremental updates.
(This article belongs to the Special Issue Scientific Data Processing and Analysis)
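To make the notion of a top-K frequent regular pattern concrete, here is a brute-force sketch. It is explicitly not the TKIFRPM algorithm: it uses no ISI-tables, no incremental updates, and no sub-tree pruning. It assumes a definition common in the regular pattern mining literature, where a pattern's regularity is the largest gap between consecutive transactions containing it, counting the gaps from the start and to the end of the dataset. The function names, the `max_len` cap, and the toy transactions are hypothetical.

```python
from itertools import combinations


def regularity(tids, n_transactions):
    """Largest gap between consecutive occurrences, including both boundaries."""
    points = [0] + tids + [n_transactions]
    return max(b - a for a, b in zip(points, points[1:]))


def top_k_regular_patterns(transactions, k, max_regularity, max_len=3):
    """Return the k regular patterns with the highest support (brute force)."""
    n = len(transactions)
    occurrences = {}  # pattern (tuple of items) -> sorted list of transaction ids
    for tid, items in enumerate(transactions, start=1):
        unique = sorted(set(items))
        for size in range(1, max_len + 1):
            for pattern in combinations(unique, size):
                occurrences.setdefault(pattern, []).append(tid)

    # Keep only patterns that recur often enough, then rank them by support.
    regular = [(pattern, len(tids)) for pattern, tids in occurrences.items()
               if regularity(tids, n) <= max_regularity]
    regular.sort(key=lambda entry: (-entry[1], entry[0]))
    return regular[:k]


# Toy dataset of five transactions (illustrative only).
toy = [["a", "b"], ["a", "c"], ["a", "b"], ["b", "c"], ["a", "b"]]
print(top_k_regular_patterns(toy, k=3, max_regularity=2))
```

Asking for the best K patterns removes the need to choose a frequency threshold in advance, which is the first of the two issues the abstract raises. Enumerating every itemset, as done here, grows exponentially with transaction size, which is why a practical miner needs pruning.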

24 pages, 5749 KiB  
Article
Forest Fire Risk Forecasting with the Aid of Case-Based Reasoning
by Nikita Dorodnykh, Olga Nikolaychuk, Julia Pestova and Aleksandr Yurin
Appl. Sci. 2022, 12(17), 8761; https://doi.org/10.3390/app12178761 - 31 Aug 2022
Cited by 4 | Viewed by 1724
Abstract
Forest fires are a serious threat to the population and infrastructure of Irkutsk Oblast because its territory is heavily forested. This paper discusses the main stages of forecasting the risk of forest fires via a case-based approach, including data preprocessing, formation of a case model, and creation of a prototype case-based expert system. The main contributions of the paper are the following: a case model that provides a compact representation of information about weather conditions, vegetation type, and infrastructure of the region in relation to the possible risk of a wildfire; a case base containing information about wildfires in Irkutsk Oblast for the period from 2017 to 2020; and a methodology for creating prototypes of case bases by transforming decision tables of a special type. The approach was tested on individual forest districts, namely Bodaibinsk and Kazachinsk-Lena. The accuracy score was used to evaluate the wildfire risk forecasts. The average score reached 0.51. The evaluation showed that the case-based approach can be considered an initial stage for deeper investigations using other methods (data mining, neural networks) for more accurate forecasting.
(This article belongs to the Special Issue Scientific Data Processing and Analysis)
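The retrieve-and-reuse step at the heart of case-based reasoning, together with the accuracy score mentioned in the abstract above, can be sketched roughly as follows. Everything here is an assumption made for illustration: the paper's case model is far richer, and the two features (temperature and humidity scaled to the range 0 to 1), the similarity measure, the function names, and the sample cases are all hypothetical.

```python
def similarity(a, b):
    """1 minus the mean absolute difference of features scaled to [0, 1]."""
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def predict_risk(case_base, query):
    """Retrieve the most similar stored case and reuse its risk label."""
    _, label = max(case_base, key=lambda case: similarity(case[0], query))
    return label


def accuracy(case_base, test_cases):
    """Fraction of test cases whose predicted label matches the known label."""
    hits = sum(predict_risk(case_base, features) == label
               for features, label in test_cases)
    return hits / len(test_cases)


# Hypothetical cases: (scaled temperature, scaled humidity) -> risk level.
case_base = [((0.9, 0.1), "high"), ((0.2, 0.8), "low")]
print(predict_risk(case_base, (0.8, 0.2)))
```

An accuracy of 0.51, as reported in the abstract, would mean that roughly half of the forecasts matched the recorded outcome.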
