Applications of Text Mining in Software Repository Analysis

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 30 June 2024 | Viewed by 3398

Special Issue Editors


Dr. Miltiadis G. Siavvas
Guest Editor
Centre for Research and Technology-Hellas, Information Technology Institute (CERTH/ITI), 57001 Thessaloniki, Greece
Interests: software engineering; software security; software reliability; software quality; mathematical modelling of computer systems; artificial intelligence for software engineering; expert systems

Dr. Erol Gelenbe
Guest Editor
IITiS PAN, ul. Baltycka 5, 44-100 Gliwice, Poland
Interests: energy optimization; energy packet networks; neural computation; random neural networks; G-networks; networked systems; physical and biological networks; probability models; natural computation

Special Issue Information

Dear Colleagues,

Modern software repositories comprise a large volume of textual information, ranging from natural language descriptions of functional and non-functional requirements to the textual representation of the source code itself. This information can be leveraged to extract useful insights about the associated software repositories and their overall development process, as well as to build intelligent models that aid their design and development and improve their quality. To this end, text mining techniques have been used extensively in the literature to extract interesting patterns and gain relevant knowledge from software repositories (e.g., bug patterns, vulnerability patterns, and developer activity trends). In addition, text mining-related information has also been used to enhance software engineering tasks, including requirements elicitation, verification, and validation; bug/vulnerability prediction, detection, and localization; code clone detection; and automatic code generation, among others. Although a lot of work has been conducted on these topics in recent years, many challenges remain unresolved and require further research.

The aim of this Special Issue is to bring together academics and practitioners to exchange and discuss the latest innovations and applications of text mining techniques in analyzing software repositories. In particular, we are seeking review papers, empirical studies, and technical papers describing novel solutions and methodologies on the topic of text mining techniques for software repository analysis. Papers that use text mining to address topics including, but not limited to, the following will be considered for publication:

  • Requirements elicitation;
  • Requirements verification and validation;
  • Requirements classification;
  • Natural language processing of source code;
  • Bug and vulnerability pattern identification;
  • Bug and vulnerability prediction;
  • Bug and vulnerability detection;
  • Bug and vulnerability localization;
  • Bug and vulnerability severity evaluation;
  • Automatic test case generation;
  • Software repository tagging;
  • Software repository knowledge discovery;
  • Source code summarization;
  • Source code recommendation;
  • Automatic source code generation;
  • Code clone detection;
  • Automatic feature identification;
  • Automatic technology identification.

Dr. Miltiadis G. Siavvas
Dr. Erol Gelenbe
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • software engineering
  • mining software repositories
  • text mining
  • natural language processing
  • artificial intelligence
  • deep learning
  • requirements verification and validation
  • bug and vulnerability prediction
  • bug and vulnerability detection
  • code clone detection
  • automatic code generation

Published Papers (2 papers)

Research

34 pages, 1610 KiB  
Article
Security Monitoring during Software Development: An Industrial Case Study
by Miltiadis Siavvas, Dimitrios Tsoukalas, Ilias Kalouptsoglou, Evdoxia Manganopoulou, Georgios Manolis, Dionysios Kehagias and Dimitrios Tzovaras
Appl. Sci. 2023, 13(12), 6872; https://doi.org/10.3390/app13126872 - 06 Jun 2023
Viewed by 1920
Abstract
The devastating consequences of successful security breaches that have been observed recently have forced more and more software development enterprises to shift their focus towards building software products that are highly secure (i.e., vulnerability-free) from the ground up. In order to produce secure software applications, appropriate mechanisms are required for enabling project managers and developers to monitor the security level of their products during their development and identify and eliminate vulnerabilities prior to their release. A large number of such mechanisms have been proposed in the literature over the years, but limited attempts with respect to their industrial applicability, relevance, and practicality can be found. To this end, in the present paper, we demonstrate an integrated security platform, the VM4SEC platform, which exhibits cutting-edge solutions for software security monitoring and optimization, based on static and textual source code analysis. The platform was built in a way to satisfy the actual security needs of a real software development company. For this purpose, an industrial case study was conducted in order to identify the current security state of the company and its security needs in order for the employed security mechanisms to be adapted to the specific needs of the company. Based on this analysis, the overall architecture of the platform and the parameters of the selected models and mechanisms were properly defined and demonstrated in the present paper. The purpose of this paper is to showcase how cutting-edge security monitoring and optimization mechanisms can be adapted to the needs of a dedicated company and to be used as a blueprint for constructing similar security monitoring platforms and pipelines.
(This article belongs to the Special Issue Applications of Text Mining in Software Repository Analysis)

17 pages, 403 KiB  
Article
An Improved Confounding Effect Model for Software Defect Prediction
by Yuyu Yuan, Chenlong Li and Jincui Yang
Appl. Sci. 2023, 13(6), 3459; https://doi.org/10.3390/app13063459 - 08 Mar 2023
Viewed by 921
Abstract
Software defect prediction technology can effectively improve software quality. Depending on the code metrics, machine learning models are built to predict potential defects. Some researchers have indicated that the size metric could cause confounding effects and bias the prediction results. However, evidence shows that the real confounder should be the development cycle and number of developers, which could bring confounding effects when using code metrics for prediction. This paper proposes an improved confounding effect model, introducing a new confounding variable into the traditional model. On multiple projects, we experimentally analyzed the effect extent of the confounding variable. Furthermore, we verified that controlling confounding variables helps improve the predictive model’s performance.
(This article belongs to the Special Issue Applications of Text Mining in Software Repository Analysis)