
Recent Advances in Big Data and Cloud Computing

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Intelligent Sensors".

Deadline for manuscript submissions: closed (20 June 2022) | Viewed by 23,237

Special Issue Editor


Dr. Robert Hsu
Guest Editor
Department of Computer Science and Information Engineering, National Chung Cheng University, Chiayi, Taiwan
Interests: cloud computing; IoT; RFID; big data; edge & fog computing; distributed systems

Special Issue Information

Dear Colleagues,

Big data is a rapidly expanding research area spanning computer science and information management, and it has become a ubiquitous term in the understanding and solving of complex problems across disciplines such as engineering, applied mathematics, medicine, computational biology, healthcare, social networks, finance, business, government, education, transportation, and telecommunications.

The cloud represents the natural evolution of distributed computing and the widespread adoption of virtualization. Cloud computing has become a platform for the consumption and delivery of scalable IT services, and its goal is to share resources among cloud service consumers, cloud partners, and cloud vendors along the cloud value chain. Edge/fog computing extends the cloud to the edge, that is, to the physical world, to meet the data-volume and decision-velocity requirements of many emerging applications, such as augmented and virtual reality (AR/VR), cyber-physical systems (CPS), intelligent and autonomous systems, and mission-critical systems. In this new computing paradigm, the boundary between the powerful centralized cloud and massively distributed, Internet-connected sensors, actuators, and "things" is blurred. Many problems remain open and insufficiently specified, since edge computing itself has not yet been fully defined.

This Special Issue encourages authors from academia and industry to submit new research results relating to advanced technological innovations in big data and cloud computing.

The Special Issue topics include, but are not limited to, the following:

  • System architecture for cloud/edge/fog computing;
  • Coordination between cloud, fog and sensing/actuation endpoints;
  • Connectivity, storage and computation in the edge;
  • Security, privacy and ethics issues related to the cloud and big data;
  • Power, energy and resource management;
  • Big data analytics and machine learning;
  • Big data platforms and technologies;
  • Predictive and business intelligence;
  • Information solution architecture;
  • Sensing as a service;
  • Large-scale sensor networks.

Dr. Robert Hsu
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • cloud computing
  • fog computing
  • edge computing
  • big data analytics
  • Internet of Things

Published Papers (8 papers)


Research

29 pages, 1166 KiB  
Article
Pangea: An MLOps Tool for Automatically Generating Infrastructure and Deploying Analytic Pipelines in Edge, Fog and Cloud Layers
by Raúl Miñón, Josu Diaz-de-Arcaya, Ana I. Torre-Bastida and Philipp Hartlieb
Sensors 2022, 22(12), 4425; https://doi.org/10.3390/s22124425 - 11 Jun 2022
Cited by 9 | Viewed by 2692
Abstract
Development and operations (DevOps), artificial intelligence (AI), big data and edge–fog–cloud are disruptive technologies that may produce a radical transformation of industry. Nevertheless, major challenges remain in applying them efficiently to optimise productivity. Some of these are addressed in this article, concretely with respect to the adequate management of information technology (IT) infrastructures for automated analysis processes in critical fields such as the mining industry. In this area, the paper presents a tool called Pangea, aimed at automatically generating suitable execution environments for deploying analytic pipelines. These pipelines are decomposed into steps so that each one executes in the most suitable environment (edge, fog, cloud or on-premise), minimising latency and optimising the use of both hardware and software resources. Pangea focuses on three distinct objectives: (1) generating the required infrastructure if it does not already exist; (2) provisioning it with the requirements needed to run the pipelines (i.e., configuring each host's operating system and software, installing dependencies and downloading the code to execute); and (3) deploying the pipelines. To facilitate the use of the architecture, a representational state transfer application programming interface (REST API) is defined to interact with it, and a web client is provided in turn. Finally, in addition to the production mode, a local development environment can be generated for testing and benchmarking purposes. Full article
(This article belongs to the Special Issue Recent Advances in Big Data and Cloud Computing)
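
The abstract above describes interacting with Pangea through a REST API to request infrastructure generation and pipeline deployment across edge, fog and cloud layers. As a rough sketch of that interaction style only, the following Python snippet posts a hypothetical pipeline specification to a deployment endpoint; the route, payload fields, layer labels and image names are illustrative assumptions, not Pangea's documented interface.

```python
# Hypothetical REST client sketch for an MLOps deployment service.
# Endpoint names and payload fields are illustrative only.
import json
import urllib.request

PIPELINE_SPEC = {
    "name": "mine-ventilation-analytics",           # hypothetical pipeline name
    "steps": [
        {"id": "ingest",   "layer": "edge",  "image": "acme/ingest:1.0"},
        {"id": "features", "layer": "fog",   "image": "acme/features:1.0"},
        {"id": "train",    "layer": "cloud", "image": "acme/train:1.0"},
    ],
}

def deploy(base_url: str, spec: dict) -> dict:
    """POST the pipeline spec and return the service's JSON response."""
    req = urllib.request.Request(
        url=f"{base_url}/pipelines",                # illustrative route
        data=json.dumps(spec).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

if __name__ == "__main__":
    print(deploy("http://localhost:8080", PIPELINE_SPEC))
```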

21 pages, 1713 KiB  
Article
An Accelerating Reduction Approach for Incomplete Decision Table Using Positive Approximation Set
by Tao Yan, Chongzhao Han, Kaitong Zhang and Chengnan Wang
Sensors 2022, 22(6), 2211; https://doi.org/10.3390/s22062211 - 12 Mar 2022
Viewed by 1365
Abstract
Due to the explosive growth of data collected by various sensors, determining how to conduct feature selection more efficiently has become a difficult problem. To address it, we offer fresh insight into rough set theory from the perspective of a positive approximation set. It is found that a granularity domain can be used to characterize the target knowledge, because it takes the form of a covering with respect to a tolerance relation. On the basis of this fact, a novel heuristic approach, ARIPA, is proposed to accelerate representative reduction algorithms for incomplete decision tables. As a result, ARIPA in the classical rough set model and ARIPA-IVPR in the variable precision rough set model are realized, respectively. Moreover, ARIPA is adopted to improve the computational efficiency of two existing state-of-the-art reduction algorithms. To demonstrate the effectiveness of the improved algorithms, a variety of experiments utilizing four UCI incomplete data sets are conducted, and the performance of the improved algorithms is compared with that of the original ones. The numerical experiments show that our accelerating approach enables the existing algorithms to accomplish the reduction task more quickly, and in some cases to fulfill attribute reduction even more stably than the original algorithms do. Full article
(This article belongs to the Special Issue Recent Advances in Big Data and Cloud Computing)
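
The reduction approach above builds on rough-set notions for incomplete decision tables, where missing values induce a tolerance relation rather than an equivalence relation. The sketch below, assuming the usual "*" convention for missing values, computes tolerance classes and the positive region for a toy table; it illustrates the underlying definitions only, not the ARIPA acceleration itself.

```python
# Tolerance relation for incomplete decision tables: '*' marks a missing value,
# and two objects tolerate each other if they agree on every known value.

MISSING = "*"

def tolerates(x, y, attrs):
    return all(x[a] == y[a] or MISSING in (x[a], y[a]) for a in attrs)

def tolerance_class(i, table, attrs):
    return {j for j, y in enumerate(table) if tolerates(table[i], y, attrs)}

def positive_region(table, attrs, decisions):
    """Objects whose whole tolerance class carries a single decision."""
    return {
        i for i in range(len(table))
        if len({decisions[j] for j in tolerance_class(i, table, attrs)}) == 1
    }

# Toy incomplete decision table: condition attributes a1, a2; decision d.
table = [
    {"a1": "high", "a2": "yes"},
    {"a1": "high", "a2": MISSING},
    {"a1": "low",  "a2": "no"},
]
decisions = ["buy", "buy", "skip"]
print(positive_region(table, ["a1", "a2"], decisions))   # {0, 1, 2}
```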

14 pages, 2059 KiB  
Article
Balanced Energy-Aware and Fault-Tolerant Data Center Scheduling
by Muhammad Shaukat, Waleed Alasmary, Eisa Alanazi, Junaid Shuja, Sajjad A. Madani and Ching-Hsien Hsu
Sensors 2022, 22(4), 1482; https://doi.org/10.3390/s22041482 - 14 Feb 2022
Cited by 11 | Viewed by 2231
Abstract
Fault tolerance, performance, and throughput have been major areas of research and development since the evolution of large-scale networks. Internet-based applications are growing rapidly, including large-scale computations, search engines, high-definition video streaming, e-commerce, and video on demand. In recent years, energy efficiency and fault tolerance have gained significant importance in data center networks, and various studies have directed attention towards green computing. Data centers consume a huge amount of energy, and various architectures and techniques have been proposed to improve their energy efficiency. However, there is a tradeoff between energy efficiency and fault tolerance. The objective of this study is to highlight a better tradeoff between the two extremes: (a) high energy efficiency and (b) high availability through fault tolerance and redundancy. The main objective of the proposed Energy-Aware Fault-Tolerant (EAFT) approach is to keep one level of redundancy for fault tolerance while scheduling resources for energy efficiency. The resulting energy-efficient data center network provides availability as well as fault tolerance at reduced operating cost. The main contributions of this article are: (a) we propose an Energy-Aware Fault-Tolerant (EAFT) data center network scheduler; (b) we compare EAFT with energy-efficient resource scheduling techniques to analyse parameters such as workload distribution, average tasks per server, and energy consumption; and (c) we highlight the effects of energy efficiency techniques on the network performance of the data center. Full article
(This article belongs to the Special Issue Recent Advances in Big Data and Cloud Computing)
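
The trade-off described above, consolidating load to save energy while retaining redundancy for fail-over, can be made concrete with a toy bin-packing scheduler that always keeps one server free as a standby. The Python sketch below is a generic first-fit-decreasing illustration under that assumption, not the EAFT scheduler evaluated in the paper.

```python
# Consolidate tasks onto as few servers as possible (first-fit decreasing) to
# save energy, while one server is always held in reserve for fail-over.

def schedule(tasks, n_servers, capacity):
    """Return a server->tasks mapping, reserving the last server as standby."""
    usable = n_servers - 1                      # one server held in reserve
    load = [0.0] * usable
    placement = {s: [] for s in range(usable)}
    for size in sorted(tasks, reverse=True):
        for s in range(usable):
            if load[s] + size <= capacity:
                load[s] += size
                placement[s].append(size)
                break
        else:
            raise RuntimeError("capacity exceeded; standby must stay empty")
    return placement

tasks = [0.5, 0.2, 0.7, 0.1, 0.4]               # normalized CPU demands
print(schedule(tasks, n_servers=4, capacity=1.0))
```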

16 pages, 1161 KiB  
Article
A Joint Resource Allocation, Security with Efficient Task Scheduling in Cloud Computing Using Hybrid Machine Learning Techniques
by Prasanta Kumar Bal, Sudhir Kumar Mohapatra, Tapan Kumar Das, Kathiravan Srinivasan and Yuh-Chung Hu
Sensors 2022, 22(3), 1242; https://doi.org/10.3390/s22031242 - 6 Feb 2022
Cited by 58 | Viewed by 6855
Abstract
The rapid growth of the cloud computing environment, with clients ranging from personal users to large corporations, has made it challenging for cloud organizations to handle the massive volume of data and the various resources in the cloud. Inefficient management of resources can degrade the performance of cloud computing; therefore, resources must be allocated fairly among different stakeholders without compromising the organization's profit or the users' satisfaction. A customer's request cannot be withheld indefinitely just because the underlying resources are not free. In this paper, a combined resource allocation and security scheme with efficient task scheduling in cloud computing using hybrid machine learning (RATS-HM) techniques is proposed to overcome these problems. The proposed RATS-HM techniques are as follows: first, an improved cat swarm optimization algorithm-based scheduler for task scheduling (ICS-TS) minimizes the make-span time and maximizes throughput; second, a group optimization-based deep neural network (GO-DNN) performs efficient resource allocation under design constraints including bandwidth and resource load; third, a lightweight authentication scheme, NSUPREME, is proposed for data encryption to secure data storage. Finally, the proposed RATS-HM technique is simulated under different simulation setups, and the results are compared with state-of-the-art techniques to demonstrate its effectiveness. The results regarding resource utilization, energy consumption, response time, etc., show that the proposed technique is superior to the existing ones. Full article
(This article belongs to the Special Issue Recent Advances in Big Data and Cloud Computing)
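
The ICS-TS scheduler above optimizes make-span and throughput over task-to-VM assignments. To make that objective concrete, the sketch below evaluates the make-span of an assignment and improves it with a plain population-style random search; the task lengths and VM speeds are invented, and the search is a generic stand-in, not the improved cat swarm optimizer proposed in the paper.

```python
# Generic population-based search over task-to-VM assignments minimizing
# make-span; shown only to illustrate the scheduling objective.
import random

def makespan(assignment, task_len, vm_speed):
    """Completion time of the busiest VM under a task->VM assignment."""
    finish = [0.0] * len(vm_speed)
    for task, vm in enumerate(assignment):
        finish[vm] += task_len[task] / vm_speed[vm]
    return max(finish)

def search(task_len, vm_speed, population=30, iters=200, seed=1):
    random.seed(seed)
    n, m = len(task_len), len(vm_speed)
    best = [random.randrange(m) for _ in range(n)]
    for _ in range(iters):
        for _ in range(population):
            cand = best[:]
            cand[random.randrange(n)] = random.randrange(m)   # local move
            if makespan(cand, task_len, vm_speed) < makespan(best, task_len, vm_speed):
                best = cand
    return best, makespan(best, task_len, vm_speed)

task_len = [8, 3, 12, 5, 9, 2]        # abstract task lengths (MI)
vm_speed = [1.0, 2.0, 1.5]            # relative VM speeds (MIPS)
print(search(task_len, vm_speed))
```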

17 pages, 439 KiB  
Article
Edge Computing Driven Data Sensing Strategy in the Entire Crop Lifecycle for Smart Agriculture
by Rihong Zhang and Xiaomin Li
Sensors 2021, 21(22), 7502; https://doi.org/10.3390/s21227502 - 11 Nov 2021
Cited by 9 | Viewed by 2107
Abstract
In the context of smart agriculture, high-value data sensing over the entire crop lifecycle is fundamental for realizing crop cultivation control. However, existing data sensing methods suffer from low sensing-data value, poor data correlation, and high data collection cost. The main problem for data sensing over the entire crop lifecycle is how to sense high-value data according to the crop growth stage at low cost. To solve this problem, a data sensing framework was developed by combining edge computing with the Internet of Things, and a novel data sensing strategy for the entire crop lifecycle is proposed in this paper. The proposed strategy includes four phases. In the first phase, the crop growth stages are divided by Gath-Geva (GG) fuzzy clustering, and the key growth parameters corresponding to each stage are extracted. In the second phase, based on the current crop growth information, a prediction method for the current growth stage is constructed using a Takagi-Sugeno (T-S) fuzzy neural network. In the third phase, based on Deng's grey relational analysis method, the environmental sensing parameters of the corresponding crop growth stage are optimized. In the fourth phase, an adaptive sensing method for sensing nodes with effective sensing area constraints is established. Finally, based on actual crop growth history data, an entire crop lifecycle dataset is established to test the performance and prediction accuracy of the proposed method for crop growth stage division. Based on the historical data, a simulated data sensing environment is established, and the proposed algorithm is tested and compared with traditional algorithms. The comparison results show that the proposed strategy can divide and predict a crop growth cycle with high accuracy, significantly reduce the sensing and data collection times and energy consumption, and significantly improve the value of the sensed data. Full article
(This article belongs to the Special Issue Recent Advances in Big Data and Cloud Computing)
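
The third phase above relies on Deng's grey relational analysis to rank environmental sensing parameters against the crop growth stage. The sketch below implements the standard textbook formulation with distinguishing coefficient rho = 0.5 on made-up, pre-normalized series; it is not the paper's full sensing strategy.

```python
# Deng's grey relational analysis: rank candidate series by how closely they
# track a reference series (all series assumed normalized to [0, 1]).

def gra(reference, candidates, rho=0.5):
    """Return the grey relational grade of each candidate w.r.t. the reference."""
    deltas = {name: [abs(r - v) for r, v in zip(reference, series)]
              for name, series in candidates.items()}
    all_d = [d for ds in deltas.values() for d in ds]
    d_min, d_max = min(all_d), max(all_d)
    grades = {}
    for name, ds in deltas.items():
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in ds]
        grades[name] = sum(coeffs) / len(coeffs)
    return grades

growth = [0.1, 0.3, 0.6, 0.9]                      # reference (normalized)
sensors = {
    "soil_moisture": [0.2, 0.35, 0.55, 0.85],      # invented example series
    "air_pressure":  [0.9, 0.4, 0.2, 0.1],
}
print(gra(growth, sensors))   # higher grade = stronger relation to growth
```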

19 pages, 2797 KiB  
Article
Fault-Tolerant and Data-Intensive Resource Scheduling and Management for Scientific Applications in Cloud Computing
by Zulfiqar Ahmad, Ali Imran Jehangiri, Mohammed Alaa Ala’anzy, Mohamed Othman and Arif Iqbal Umar
Sensors 2021, 21(21), 7238; https://doi.org/10.3390/s21217238 - 30 Oct 2021
Cited by 8 | Viewed by 1831
Abstract
Cloud computing is a fully fledged, mature and flexible computing paradigm that provides services to scientific and business applications in a subscription-based environment. Scientific applications such as Montage and CyberShake are organized as scientific workflows with data- and compute-intensive tasks, and they have some special characteristics: their tasks are executed in terms of integration, disintegration, pipelining, and parallelism, and thus require special attention to task management and data-oriented resource scheduling and management. Tasks executed in a pipeline are considered bottleneck executions, the failure of which results in a wholly futile execution, and thus require fault-tolerance-aware execution. Tasks executed in parallel require similar instances of cloud resources, so cluster-based execution may improve system performance in terms of make-span and execution cost. Therefore, this research work presents a cluster-based, fault-tolerant and data-intensive (CFD) scheduling strategy for scientific applications in cloud environments. The CFD strategy addresses the data intensiveness of scientific workflow tasks with cluster-based, fault-tolerant mechanisms. The Montage scientific workflow is considered in simulation, and the results of the CFD strategy are compared with three well-known heuristic scheduling policies: (a) MCT, (b) Max-min, and (c) Min-min. The simulation results show that the CFD strategy reduces the make-span by 14.28%, 20.37%, and 11.77%, respectively, compared with the three existing policies. Similarly, the CFD strategy reduces the execution cost by 1.27%, 5.3%, and 2.21%, respectively. With the CFD strategy, the SLA is not violated with regard to time and cost constraints, whereas it is violated numerous times by the existing policies. Full article
(This article belongs to the Special Issue Recent Advances in Big Data and Cloud Computing)
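
The CFD strategy above is benchmarked against the classic MCT, Max-min and Min-min heuristics. For reference, the sketch below gives the textbook Min-min list scheduler (swapping min for max at the marked line yields Max-min); the execution-time matrix is invented, and this is the baseline heuristic, not the CFD scheduler itself.

```python
# Textbook Min-min heuristic over an execution-time matrix.

def min_min(exec_time):
    """exec_time[t][m] = run time of task t on machine m.
    Returns (assignment dict task->machine, makespan)."""
    ready = [0.0] * len(exec_time[0])
    unscheduled = set(range(len(exec_time)))
    assignment = {}
    while unscheduled:
        # best machine (and completion time) for every unscheduled task
        best = {t: min((ready[m] + exec_time[t][m], m) for m in range(len(ready)))
                for t in unscheduled}
        # Min-min: schedule the task whose best completion time is smallest
        task = min(unscheduled, key=lambda t: best[t][0])   # use max() for Max-min
        finish, machine = best[task]
        assignment[task] = machine
        ready[machine] = finish
        unscheduled.remove(task)
    return assignment, max(ready)

exec_time = [[4, 6], [3, 5], [8, 2], [5, 4]]   # 4 tasks x 2 machines (invented)
print(min_min(exec_time))
```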

20 pages, 891 KiB  
Article
Secure Outsourcing of Matrix Determinant Computation under the Malicious Cloud
by Mingyang Song and Yingpeng Sang
Sensors 2021, 21(20), 6821; https://doi.org/10.3390/s21206821 - 14 Oct 2021
Cited by 1 | Viewed by 1519
Abstract
Computing the determinant of a large matrix is a time-consuming task that appears more and more widely in science and engineering problems in the era of big data. Fortunately, cloud computing can provide large storage and computation resources and thus act as an ideal platform for computation outsourced from resource-constrained devices. However, cloud computing also raises security issues. For example, a curious cloud may spy on user privacy through outsourced data, and a malicious cloud that deviates from the computing scripts, as well as cloud hardware failures, will lead to incorrect results. Therefore, in this paper we propose a secure outsourcing algorithm to compute the determinant of a large matrix under the malicious cloud model. The algorithm protects the privacy of the original matrix by applying row/column permutations and other transformations to the matrix. To resist malicious cheating on the computation tasks, a new verification method is utilized in our algorithm. Unlike previous algorithms that require multiple rounds of verification, our verification requires only one round without trading off cheating detectability, which greatly reduces the local computation burden. Both theoretical and experimental analyses demonstrate that our algorithm achieves better efficiency for local users than previous ones on various matrix dimensions, without sacrificing the security requirements in terms of privacy protection and cheating detectability. Full article
(This article belongs to the Special Issue Recent Advances in Big Data and Cloud Computing)
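
The privacy mechanism above rests on determinant-preserving transformations: for a permutation matrix P and a non-singular diagonal matrix D, det(PAD) = det(P) det(A) det(D), so a client can blind A, outsource the determinant of the blinded matrix and undo the known factor locally. The sketch below illustrates that algebraic principle with NumPy; it is not the paper's full protocol and omits its cheating-verification step.

```python
# Blind a matrix with a secret permutation and diagonal scaling, let the
# (untrusted) cloud compute the determinant, and recover det(A) locally.
import numpy as np

def permutation_sign(perm):
    """Sign of a permutation given as an index array (via cycle decomposition)."""
    seen, sign = [False] * len(perm), 1
    for start in range(len(perm)):
        if not seen[start]:
            length, j = 0, start
            while not seen[j]:
                seen[j] = True
                j = perm[j]
                length += 1
            sign *= -1 if length % 2 == 0 else 1
    return sign

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5))                 # client's private matrix

perm = rng.permutation(5)                   # secret permutation
d = rng.uniform(1.0, 2.0, size=5)           # secret non-zero scaling factors
A_blind = A[perm, :] * d                    # equals P @ A @ D without forming P, D

det_blind = np.linalg.det(A_blind)          # computed by the untrusted cloud
det_A = det_blind / (permutation_sign(perm) * np.prod(d))

print(np.isclose(det_A, np.linalg.det(A)))  # True
```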

16 pages, 562 KiB  
Article
Evaluation of Task Scheduling Algorithms in Heterogeneous Computing Environments
by Roxana-Gabriela Stan, Lidia Băjenaru, Cătălin Negru and Florin Pop
Sensors 2021, 21(17), 5906; https://doi.org/10.3390/s21175906 - 2 Sep 2021
Cited by 3 | Viewed by 3279
Abstract
This work establishes a set of methodologies to evaluate the performance of any task scheduling policy in heterogeneous computing contexts. We formally state a scheduling model for hybrid edge–cloud computing ecosystems and conduct simulation-based experiments on large workloads. In addition to conventional cloud datacenters, we consider edge datacenters comprising battery-powered smartphone and Raspberry Pi edge devices. We define realistic capacities for the computational resources; once a schedule is found, the various task demands may or may not be fulfilled by the resource capacities. We build a scheduling and evaluation framework and measure typical scheduling metrics such as mean waiting time, mean turnaround time, makespan and throughput for the Round-Robin, Shortest Job First, Min-Min and Max-Min scheduling schemes. Our analysis and results show that state-of-the-art independent task scheduling algorithms suffer from performance degradation, in terms of significant task failures and non-optimal resource utilization of datacenters, in heterogeneous edge–cloud environments compared with cloud-only environments. In particular, for large sets of tasks, more than 25% of tasks fail to execute under each scheduling scheme due to low battery or limited memory. Full article
(This article belongs to the Special Issue Recent Advances in Big Data and Cloud Computing)
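
The evaluation above reports mean waiting time, mean turnaround time, makespan and throughput across several scheduling schemes. As a minimal reference for how these metrics are derived from a schedule, the sketch below computes them for Shortest Job First on a single resource with all tasks arriving at time zero; it is not the paper's edge–cloud simulator.

```python
# Classic scheduling metrics for Shortest Job First on one resource,
# assuming all tasks arrive at time zero.

def sjf_metrics(burst_times):
    order = sorted(burst_times)               # SJF: run shortest tasks first
    waiting, turnaround, clock = [], [], 0.0
    for burst in order:
        waiting.append(clock)                 # time spent queued before start
        clock += burst
        turnaround.append(clock)              # submission-to-completion time
    makespan = clock
    return {
        "mean_waiting": sum(waiting) / len(waiting),
        "mean_turnaround": sum(turnaround) / len(turnaround),
        "makespan": makespan,
        "throughput": len(order) / makespan,  # tasks completed per time unit
    }

print(sjf_metrics([6.0, 2.0, 8.0, 3.0]))
```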
