Cloud Computing for Big Data Analysis

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (30 August 2021) | Viewed by 17286

Special Issue Editors

Dr. Fabrizio Marozzo

Guest Editor

Department of Informatics, Modeling, Electronics, and Systems Engineering (DIMES), University of Calabria, 87036 Rende, Italy
Interests: big data analysis; social media analysis; cloud computing; data mining; machine learning; Internet of Things

Dr. Loris Belcastro

Guest Editor

Department of Electronics, Computer Science and System Sciences (DIMES), University of Calabria, Via Pietro Bucci – Cubo 41C (5th floor), 87036 Rende (CS), Italy
Interests: cloud computing; social media and Big Data analysis; distributed knowledge discovery; data mining

Special Issue Information

Dear Colleagues,

It is our pleasure to announce the opening of a new Special Issue in Applied Sciences. The main topic of the Issue is cloud computing for big data analysis.

In the era of the Internet of Things, huge amounts of digital data are generated and collected by many sources, such as sensors, mobile devices, and social media. This data, commonly referred to as big data, represents a challenge for current storage, processing, and analysis capabilities.

Novel technologies, architectures, and algorithms have been and are being developed to capture and analyze big data. For example, in the scientific and business fields, researchers and data scientists are analyzing big data to extract information and knowledge useful for making new discoveries and supporting decision processes.

In this context, cloud computing is a valid and cost-effective solution for supporting big data storage and executing data analytics applications. Thanks to elastic resource allocation and high computing power, cloud computing is a compelling platform for big data analytics: it enables faster data analysis, which yields more timely results and therefore greater data value.

From this perspective, this Special Issue aims to contribute to the field, presenting the most relevant advances in this research area.

Topics proposed for this Special Issue include, but are not limited to:

Programming models and algorithms for distributed computing environments;
Systems for data processing on cloud platforms;
Data analysis workflows for distributed environments;
Scalable data mining algorithms;
Programming models and scalable algorithms for big data;
Big data analytics and applications;
Applications of machine learning in big data;
Cloud-based data mining applications; and
Libraries, algorithms, and applications for big social data analysis.

We hope you will contribute your high-quality research, and we look forward to reading your results.

Dr. Fabrizio Marozzo
Dr. Loris Belcastro
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

Cloud computing
Big data
Scalable data mining
Data analysis workflows
Social media analysis
Parallel and distributed algorithms
High performance computing
Machine learning applications

Published Papers (6 papers)

Editorial

4 pages, 183 KiB

Open Access Editorial

Cloud Computing for Big Data Analysis

by Fabrizio Marozzo and Loris Belcastro

Appl. Sci. 2022, 12(20), 10567; https://doi.org/10.3390/app122010567 - 19 Oct 2022

Cited by 3 | Viewed by 1514

Abstract

With the spread of the Internet of Things, large amounts of digital data are generated and collected from different sources, such as sensors, cameras, in-vehicle infotainment, smart meters, mobile devices, applications, and web services [...]

(This article belongs to the Special Issue Cloud Computing for Big Data Analysis)

Research

14 pages, 396 KiB

Open Access Article

Knowledge Discovery from Large Amounts of Social Media Data

by Loris Belcastro, Riccardo Cantini and Fabrizio Marozzo

Appl. Sci. 2022, 12(3), 1209; https://doi.org/10.3390/app12031209 - 24 Jan 2022

Cited by 7 | Viewed by 3254

Abstract

In recent years, social media analysis has aroused great interest in various scientific fields, such as sociology, political science, linguistics, and computer science. Large amounts of data gathered from social media are widely analyzed to extract useful information concerning people’s behaviors and interactions. In particular, they can be exploited to analyze the collective sentiment of people, understand the behavior of user groups during global events, monitor public opinion close to important events, identify the main topics in a public discussion, or detect the most frequent routes followed by social media users. As an example of the countless state-of-the-art works on social media analysis, this paper presents three significant applications in the field of opinion and pattern mining from social media data: (i) an automatic application for discovering user mobility patterns, (ii) a novel application for estimating the political polarization of public opinion, and (iii) an application for discovering interesting social media discussion topics through a hashtag recommendation system. Such applications clearly highlight the abundance and wealth of useful information that can be extracted from social media posts in many application contexts of human life.


12 pages, 933 KiB

Open Access Article

A Computational Intelligence Approach to Predict Energy Demand Using Random Forest in a Cloudera Cluster

by Laura Cáceres, Jose Ignacio Merino and Norberto Díaz-Díaz

Appl. Sci. 2021, 11(18), 8635; https://doi.org/10.3390/app11188635 - 17 Sep 2021

Cited by 7 | Viewed by 2252

Abstract

Society’s energy consumption has shot up in recent years, making the prediction of demand a current challenge for ensuring efficient and responsible use. Artificial intelligence techniques have shown potential as tools for handling tedious tasks and making sense of large-scale data to make better business decisions in different areas of knowledge. In this article, the use of random forest algorithms in a Big Data environment is proposed for household energy demand forecasting. The predictions are based on information from different sources, confirming the fundamental role of socioeconomic data in consumers’ behaviours. In addition, the use of Big Data architectures is proposed to scale the solution horizontally and vertically for use in real environments. Finally, a tool for efficient high-resolution predictions is introduced, which enables very accurate energy management.


13 pages, 526 KiB

Open Access Article

Employing Vertical Elasticity for Efficient Big Data Processing in Container-Based Cloud Environments

by Jin-young Choi, Minkyoung Cho and Jik-Soo Kim

Appl. Sci. 2021, 11(13), 6200; https://doi.org/10.3390/app11136200 - 04 Jul 2021

Cited by 5 | Viewed by 2122

Abstract

Recently, “Big Data” platform technologies have become crucial for distributed processing of diverse unstructured or semi-structured data as the amount of data generated increases rapidly. To manage this Big Data effectively, Cloud Computing has been playing an important role by providing scalable data storage and computing resources for competitive and economical Big Data processing. Accordingly, server virtualization technologies, which are the cornerstone of Cloud Computing, have attracted considerable research interest. However, conventional hypervisor-based virtualization can cause performance degradation due to its heavily loaded guest operating systems and rigid resource allocations. On the other hand, container-based virtualization technology can provide the same level of service faster and with a lighter footprint by effectively eliminating the guest OS layers. In addition, container-based virtualization enables efficient cloud resource management by dynamically adjusting the allocated computing resources (e.g., CPU and memory) at runtime through “Vertical Elasticity”. In this paper, we present our practice and experience of employing an adaptive resource utilization scheme for Big Data workloads in container-based cloud environments by leveraging the vertical elasticity of Docker, a representative container-based virtualization technique. We perform extensive experiments running several Big Data workloads on representative Big Data platforms: Apache Hadoop and Spark. During workload execution, our adaptive resource utilization scheme periodically monitors the resource usage patterns of running containers and dynamically adjusts the allocated computing resources, which can result in substantial improvements in overall system throughput.


13 pages, 1562 KiB

Open Access Article

Benchmarking and Performance Evaluations on Various Configurations of Virtual Machine and Containers for Cloud-Based Scientific Workloads

by Syed Asif Raza Shah, Ahmad Waqas, Moon-Hyun Kim, Tae-Hyung Kim, Heejun Yoon and Seo-Young Noh

Appl. Sci. 2021, 11(3), 993; https://doi.org/10.3390/app11030993 - 22 Jan 2021

Cited by 9 | Viewed by 2813

Abstract

Cloud computing manages system resources such as processing, storage, and networking by providing users with multiple virtual machines (VMs) as needed. It is a rapidly growing field that offers huge computational power for scientific workloads. Currently, the scientific community is ready to work over the cloud, as it is considered a resource-rich paradigm. The traditional way of executing scientific workloads in the cloud is by using virtual machines. However, the emerging concept of containerization is growing more rapidly and has gained popularity because of its unique features. Containers are considered lightweight compared to virtual machines in cloud computing. In this regard, a few VM- and container-associated problems of performance and throughput are encountered because of middleware technologies such as virtualization or containerization. In this paper, we introduce the configurations of VMs and containers for cloud-based scientific workloads in order to utilize the technologies to solve scientific problems and handle their workloads. This paper also tackles throughput and efficiency problems related to VMs and containers in the cloud environment and explores efficient resource provisioning by combining four methods: hyperthreading (HT), vCPU cores selection, vCPU affinity, and isolation of vCPUs. The HEPSPEC06 benchmark suite is used to evaluate the throughput and efficiency of VMs and containers. The proposed solution is to implement these four basic techniques to reduce the overhead of virtualization and containerization and to make virtual machines and containers more effective and powerful for scientific workloads. The results show that allowing hyperthreading, isolation of CPU cores, proper numbering, and allocation of vCPU cores can improve the throughput and performance of virtual machines and containers.


16 pages, 2520 KiB

Open Access Article

Spatiotemporal Analysis of Web News Archives for Crime Prediction

by Areeba Umair, Muhammad Shahzad Sarfraz, Muhammad Ahmad, Usman Habib, Muhammad Habib Ullah and Manuel Mazzara

Appl. Sci. 2020, 10(22), 8220; https://doi.org/10.3390/app10228220 - 20 Nov 2020

Cited by 23 | Viewed by 3427

Abstract

In today’s world, security is the most prominent aspect which has been given higher priority. Despite the rapid growth and usage of digital devices, lucrative measurement of crimes in underdeveloped countries is still challenging. In this work, unstructured crime data (900 records) from the news archives of the previous eight years were extracted to predict the behavior of criminals’ networks and transformed into useful information using natural language processing (NLP). To estimate the next move of criminals in Pakistan, we performed hotspot-based spatial analysis. This information was then fed to two different classifiers for possible identification and prediction. We achieved a maximum accuracy of 92% using K-Nearest Neighbor (KNN) and 62% using the Random Forest algorithm. In terms of crimes, the results showed that the most prevalent crime events are robberies. Thus, the usage of digital information archives, spatial analysis, and machine learning techniques can open new ways of maintaining a peaceful and sustainable society and eradicating crime in countries with a paucity of financial resources.
