Integrating Interactive Detection of Code Smells into Scrum: Feasibility, Benefits, and Challenges

Albuquerque, Danyllo; Guimarães, Everton; Perkusich, Mirko; Almeida, Hyggo; Perkusich, Angelo

doi:10.3390/app13158770

by Danyllo Albuquerque ^1,2,*,†, Everton Guimarães ^3,†, Mirko Perkusich ^2,†, Hyggo Almeida ^2,† and Angelo Perkusich ^2,†

¹ Federal Institute of Paraiba (IFPB), Campina Grande 58432-300, CEP, Brazil
² Research, Development, and Innovation Centre (VIRTUS/UFCG), Federal University of Campina Grande, Campina Grande 58429-140, CEP, Brazil
³ Engineering Department, Penn State University, Great Valley, PA 19355, USA
^* Author to whom correspondence should be addressed.
^† These authors contributed equally to this work.

Appl. Sci. 2023, 13(15), 8770; https://doi.org/10.3390/app13158770

Submission received: 11 July 2023 / Revised: 25 July 2023 / Accepted: 27 July 2023 / Published: 29 July 2023

(This article belongs to the Section Computing and Artificial Intelligence)

Featured Application

The potential application of this study is to guide developers and organizations in effectively detecting and refactoring code smells, enhancing the quality and sustainability of software projects within an agile development context.

Abstract

(Context) Code smells indicate poor coding practices or design flaws, suggesting deeper software quality issues. While addressing code smells promptly improves software quality, traditional detection techniques often fail to support continuous detection during software development. (Problem Statement) More recently, the Interactive Detection (ID) technique has been proposed, enabling the continuous detection of code smells. Although the use of this technique by developers and organizations is promising, there are no practical recommendations for its use in the context of software development. (Goal) The objective of this study was to propose and evaluate the integration of ID into the widely adopted Scrum framework for agile software development. (Method) To achieve this objective, we utilized a mixed-method approach that combined a comprehensive literature review and expert knowledge to propose the integration. Furthermore, we conducted a focus group and a controlled experiment involving software development activities to evaluate this integration. (Results) The findings revealed that this integration brought significant benefits to software development, such as early detection of code smells, increased effectiveness in code smell detection, and improved code quality. These findings shed light on the potential benefits of adopting this integration, offering valuable insights for developers and researchers. (Conclusions) This research emphasized the importance of continuous code smell detection as an integral part of agile development and opened avenues for further research in code quality management within agile methodologies.

Keywords:

code smell; detection techniques; agile software development; empirical evaluation

1. Introduction

Code smells indicate deeper software quality issues, providing valuable insights into potential areas of improvement [1]. Several techniques of smell detection have been investigated [2,3,4,5]. Most of these techniques rely on Non-interactive Detection (NID), which only provides a list of code smells upon an explicit request from the developer after the source code is compiled and finalized [6]. These techniques may not effectively identify and address these issues continuously during software development [7]. In many cases, code smells are detected during code reviews or after the software has been deployed, resulting in additional time and effort required for remediation [8]. This delay can lead to a greater smell accumulation, making it challenging to maintain and evolve the software efficiently [9].

Interactive Detection (ID) techniques have been proposed to help reveal smell instances in code fragments without an explicit request from the developer [7]. In contrast to NID, the ID technique allows developers to interact with smelly code elements as they edit or browse their program statements [6]. This feature can help the developer become aware of new occurrences of code smells as soon as they are introduced [10]. Empirical studies show that the sooner code smells are removed, the less effort and time is spent on this activity [2,11].

The Scrum framework has gained widespread adoption in agile software development due to its flexible and iterative approach [12]. Scrum promotes collaboration, adaptability, and continuous improvement, aligning well with the dynamic nature of software development projects [13]. By integrating the ID technique, which enables continuous detection of code smells, into the Scrum framework, developers can address code smells promptly, resulting in enhanced software quality.

Despite the popularity of the Scrum framework, integrating the ID technique into Scrum remains relatively unexplored [6]. The lack of comprehensive studies and practical guidelines on integrating continuous code smell detection into Scrum creates a knowledge gap within the software engineering community. Integrating ID into Scrum addresses continuous code smell detection challenges during agile development [10]. It provides just-in-time feedback, enabling early identification and resolution of code quality issues [14]. ID fosters collaborative development and supports a culture of continuous inspection and improvement, promoting high code quality and software maintainability in an agile environment [3,15,16]. In summary, the integration of ID into Scrum applies to a wide range of scenarios within agile software development (e.g., Code Review, Collaborative Development, Large and Complex Software Systems, and Legacy Code Refactoring), helping development teams to proactively address code quality issues and maintain high software standards throughout the development lifecycle.

The primary goal of this study is to propose an approach that integrates the ID technique into Scrum to enhance software quality. By examining the feasibility of this integration and exploring the perceptions of software professionals, we aim to provide valuable insights into the potential advantages, benefits, and limitations of adopting this approach. Additionally, this research seeks to identify the organizational and technical factors that can facilitate or hinder the successful implementation of the ID technique within Scrum teams.

To achieve these objectives, we employ a mixed-method approach. Firstly, we conduct a comprehensive literature review to gather the existing knowledge on code smell detection, Scrum, and related integration approaches. Based on this review, we propose integrating the ID into Scrum by adapting roles, artifacts, and events within the framework. Subsequently, we conduct a focus group consisting of software professionals. The participants discuss the proposed integration, sharing their perceptions and providing insights into adopting this approach’s advantages, benefits, and limitations. Finally, we conduct a controlled experiment involving 24 software developers to validate and evaluate the proposed approach. This experiment aims to apply the integrated ID technique within Scrum and assess its effectiveness in real-world software development scenarios.

This study contributes to the software engineering community by proposing an approach to integrating the ID technique into the Scrum framework, enabling continuous code smell detection and fostering proactive quality management. The findings of this study shed light on the potential benefits and challenges associated with adopting this integration, offering valuable insights for organizations seeking to enhance software quality within their Scrum teams. Furthermore, this research highlights the importance of considering continuous code smell detection as an integral part of the agile development process and paves the way for future research in code quality management within agile methodologies.

The remainder of this paper is organized as follows. Section 2 provides background information on basic concepts necessary to understand the study. Section 3 explains the main related work. Section 4 describes the proposed approach to integrating smell detection techniques within the context of agile software development, with a particular focus on Scrum. Section 5 presents the offline validation of the approach, including the results obtained from the focus group conducted with eight software development professionals. Section 6 presents the online validation of the approach, including the results obtained from the controlled experiment conducted with 24 software developers. Section 7 discusses the main implications of the research from both researchers’ and practitioners’ perspectives, while Section 8 concludes the paper and provides directions for future research in this area.

2. Background

This section provides an overview of the fundamental concepts necessary for comprehending the current study. Firstly, Section 2.1 elucidates the definition of code smells and their impact on software quality. Subsequently, Section 2.2 explores various techniques employed for detecting code smells, while Section 2.3 highlights the key features of agile software development and the Scrum framework.

2.1. Code Smells

During software development, strict deadlines and evolving requirements can often result in the accumulation of technical debt [1]. Technical debt refers to a collection of sub-optimal decisions made during development that may impede the maintainability of a system in the future. One of the primary indicators of technical debt found within the source code is code smells. Code smells serve as warning signs that indicate areas where the source code may require refactoring or improvement [1]. Researchers such as Fowler [1], Marinescu [17], and Sharma [3] have proposed and cataloged various code smells. Two common examples of code smell, along with their consequences, include:

God Class: this code smell refers to classes that are excessively large, possess poor cohesion, and have multiple dependencies on other data classes within the system [1]. Previous studies have demonstrated that God Classes can hinder program comprehension and adversely impact software maintainability [18,19].
Long Method: this code smell arises when methods encompass multiple functionalities or address more than one concern [1]. Research has shown that Long Methods can diminish program understanding and make the source code more prone to changes and faults [18,19].

Early detection of code smells holds substantial potential for promoting the longevity and sustainability of software systems [10,20]. It involves promptly identifying opportunities for refactoring [1,21,22] as soon as code smells emerge in the source code during development. The longer the code smells persist in the codebase, the more challenging they become to eliminate. Refactoring, a behavior-preserving change in the program’s structure, is pivotal in enhancing software maintainability [1,23]. This paper delves into the significance of early code smell detection in software development, highlighting the risks associated with delaying the detection and removal of code smells. Drawing insights from studies by Murphy [10], Albuquerque [6], Fowler [1], Murphy et al. [21], and Morales et al. [22], we underscore the benefits of early detection and emphasize the importance of refactoring to ensure software quality and foster long-term success.

2.2. Techniques for Code Smell Detection

To assist themselves in detecting and refactoring code smells, developers commonly rely on (semi-)automated techniques [23]. These techniques consist of two essential components [10,20]:

Detection Mechanism: this component lets developers choose or define algorithms for detecting code smells. By selecting appropriate metrics and adjusting thresholds, developers can establish their detection strategy [24]. Code smell detection methods can be categorized into five groups: (i) metrics-based, (ii) machine-learning-based, (iii) history-based, (iv) rules-/heuristics-based, and (v) optimization-based [3].
User Interface: once the detection mechanism completes its analysis, the user interface presents the detected instances of code smells [5]. There are two main approaches for displaying the results: (i) list-based, which provides a comprehensive list of code smell occurrences throughout the project, and (ii) interactive-based, which employs the source code to highlight potential instances of code smells.
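As an illustration of a metrics-based detection mechanism with adjustable thresholds, the following minimal Python sketch flags Long Method and God Class candidates. It is not the detection strategy of any tool discussed in this paper: the names `Smell` and `detect_smells`, the default thresholds, and the use of method count as a crude stand-in for the God Class metrics are all assumptions made for the example.

```python
import ast
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Smell:
    kind: str   # "Long Method" or "God Class"
    name: str   # function or class name
    line: int   # line where the element starts
    value: int  # measured metric (lines or number of methods)


FUNCTION_NODES = (ast.FunctionDef, ast.AsyncFunctionDef)


def detect_smells(source: str,
                  max_method_lines: int = 20,
                  max_class_methods: int = 10) -> List[Smell]:
    """Return smell candidates found in `source` (thresholds are assumptions)."""
    smells = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, FUNCTION_NODES):
            # Metric: number of source lines spanned by the function.
            length = node.end_lineno - node.lineno + 1
            if length > max_method_lines:
                smells.append(Smell("Long Method", node.name, node.lineno, length))
        elif isinstance(node, ast.ClassDef):
            # Metric: number of methods defined directly in the class body.
            n_methods = sum(isinstance(child, FUNCTION_NODES) for child in node.body)
            if n_methods > max_class_methods:
                smells.append(Smell("God Class", node.name, node.lineno, n_methods))
    return sorted(smells, key=lambda s: s.line)
```

Changing `max_method_lines` or `max_class_methods` corresponds to adjusting the thresholds of the detection strategy; how the returned list is shown to the developer (as a report or highlighted in the editor) is the concern of the user interface component.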

Figure 1 illustrates the different components involved in code smell detection. The detection mechanism can be activated either automatically after completing a programming activity or upon explicit request by the developer. In terms of how the User Interface presents code smell occurrences, they can be displayed within the source code itself, allowing developers to analyze them while actively engaging in programming activities. Alternatively, the results can be presented in a separate structured file, preventing simultaneous analysis with programming tasks. Based on the developer’s interaction with these components, code smell detection techniques can be classified into the following categories.

Interactive Detection (ID) is a technique that facilitates developer interaction with code smells (Figure 1) by identifying instances within code fragments without requiring an explicit request. ID techniques continuously operate in the background, detecting code smell instances as developers work on coding tasks. This enables early identification and allows developers to analyze, modify, and implement the source code accordingly [10,20].

Non-Interactive Detection (NID) is a technique that does not support direct interaction with code smell elements (Figure 1) unless specifically requested by the developer. In NID techniques, potential code smells are detected by analyzing the entire project once the developer initiates the detection process. Developers using NID techniques may only identify code smell occurrences after merging code changes with other software components. Empirical studies have shown that the longer code smells persist, the more time and effort it takes to address them [25,26]. Furthermore, when developers directly interact with the detection mechanism, they cannot concurrently perform other programming activities [10,20].
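The difference between the two categories lies mainly in when the detection mechanism runs and on what scope. The sketch below is a simplified illustration under stated assumptions, not a description of any existing tool: `nid_scan`, `InteractiveDetector`, and the toy `long_file_detector` (which flags any file longer than three lines) are hypothetical names introduced for the example.

```python
from typing import Callable, Dict, List, Set, Tuple

Finding = Tuple[str, str]  # (file name, smell kind)
Detector = Callable[[str, str], List[Finding]]


def long_file_detector(name: str, text: str, max_lines: int = 3) -> List[Finding]:
    """Toy detector (assumption): flag a file with more than `max_lines` lines."""
    return [(name, "Long File")] if len(text.splitlines()) > max_lines else []


def nid_scan(project: Dict[str, str], detect: Detector) -> List[Finding]:
    """NID: analyse the entire project, and only when explicitly requested."""
    findings: List[Finding] = []
    for name in sorted(project):
        findings.extend(detect(name, project[name]))
    return findings


class InteractiveDetector:
    """ID: re-check a fragment on every edit and report newly introduced smells."""

    def __init__(self, detect: Detector) -> None:
        self._detect = detect
        self._known: Set[Finding] = set()

    def on_edit(self, name: str, text: str) -> List[Finding]:
        current = set(self._detect(name, text))
        previous = {f for f in self._known if f[0] == name}
        new = sorted(current - previous)
        # Replace what is known about this file with the current findings.
        self._known = (self._known - previous) | current
        return new
```

In an editor integration, `on_edit` would be called by the environment after each change, so the developer sees a smell at the moment it appears; `nid_scan` corresponds to a report generated on demand over the whole codebase.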

2.3. Agile and Scrum Framework

Scrum is a widely adopted agile framework in software development [1], consisting of roles, events, and artifacts to support effective project management. The roles in Scrum include Product Owner, Scrum Master, and Developers. The Product Owner manages the product backlog, representing stakeholder interests. The Scrum Master facilitates the team and ensures adherence to Scrum principles. The Developers are responsible for delivering potentially production-ready software [12].

Scrum also defines several events that provide opportunities for collaboration and inspection. The Sprint Planning event sets the direction for the upcoming sprint, with the team determining the work to be performed and creating a sprint backlog. The Daily Scrum is a short daily meeting where the team synchronizes activities, discusses progress, and identifies any obstacles. The Sprint Review allows stakeholders to provide feedback on the increment developed during the sprint, while the Sprint Retrospective enables the team to reflect on their processes and identify ways to improve [12].

Scrum utilizes specific artifacts to provide transparency and track progress. The Product Backlog is a prioritized list of requirements, capturing the work for future sprints. The Sprint Backlog contains the selected items from the product backlog for the current sprint. The Increment represents the sum of all the product backlog items completed and is the primary measure of progress [12].

3. Related Work

This section discusses the existing research on code smell detection and empirical studies related to Interactive Detection (ID) and Non-Interactive Detection (NID) techniques. Additionally, we highlight the distinctive aspects of our study in comparison to the literature.

Code Smell Detection Techniques: Code smells have been extensively studied from various perspectives [4,5], including their introduction [27], evolution [22], and impact on reliability [19] and maintainability [18]. Numerous techniques for code smell detection have been investigated [2,3,5,28], often relying on heuristics to identify code artifacts affected by specific smell types. While these approaches have shown promising results in empirical assessments, they also exhibit common limitations. For instance, they may detect code smell candidates that developers do not consider actual maintainability problems [26,29]. Moreover, the agreement between different detectors is often low [3,25], necessitating the use of diverse techniques for detecting different smells. Additionally, the performance of many detection mechanisms heavily depends on the thresholds used to identify code smell instances [2,5].

Comparison of ID and NID Techniques: Prior works have evaluated various techniques for code smell detection, but limited attention has been paid to evaluating ID techniques. Studies like those of Paiva et al. [29] and Sharma et al. [3] have compared different techniques, but they did not include ID in their evaluations. Some research has explored the use of visualization techniques in code smell detection, such as the work of Mumtaz et al. [30] and Pereira et al. [5]. However, none of these studies specifically evaluated the application of ID-sensitive visualizations for code smells. Furthermore, while false positives in code smell detection have been studied [25,26,31], these evaluations did not consider the ID characteristics of the techniques. Regarding ID evaluations, only one study by Murphy et al. [10] assessed the use of an ID technique, focusing on usability guidelines rather than analyzing the effectiveness of code smell detection.

Integrating Smell Detection into Software Development Activities: Prause and Apelt (2008) [15] introduced a code review tool that empowers developers to track code evolution and anonymously provide positive or negative feedback on specific code segments. Their presented approach addresses various challenges in software review and emphasizes overall code quality enhancement rather than solely concentrating on individual defects. Similarly, Prause and Apelt [16] proposed a technical prototype and its integration into a Scrum environment. By means of two experiments with agile software teams and subsequent surveys, they showed that gamification could effectively improve adherence to coding conventions, leading to improvement in code quality. Moreover, Lakehal et al. [14] developed a novel framework to generate dynamically distributed applications as service chains of components and optimize the life cycles of connected objects. The framework combines a generic context-aware ontology situation model, middleware, and the IoT for managing the user’s composite situations at the design and run-time levels.

The main differences between this study and the related work are the specific focus, objectives, and methodologies. While this study centers on integrating ID into the Scrum framework for continuous code smell detection in agile software development, related works have explored various aspects of code smell detection from different perspectives. Unlike prior evaluations that mainly compared NID techniques or focused on usability guidelines, this study represents an independent and comprehensive assessment of ID for code smells. Moreover, the related work investigated gamification techniques in software development and the challenges of mobile context-aware applications, addressing different aspects of software development, code quality improvement, collaborative development, and responsibility assignment. In contrast, this study’s primary objective is to propose, evaluate, and highlight the benefits of integrating ID into the widely adopted Scrum framework for effective code smell detection.

4. Approach: Integrating ID Technique into Scrum

To incorporate the discussed smell detection techniques into Scrum activities, adjustments can be made to include a set of associated disciplined activities. The decision to adopt Scrum for integrating these techniques stems from its ability to effectively handle frequent changes in project requirements, often leading to the introduction of code smells. Implementing the ID and NID techniques in a disciplined manner within Scrum facilitates early detection and refactoring of code smells and enhances the software’s quality attributes. The subsequent sections of this study delve into the proposed approach (Section 4.1) and the modifications made to the Scrum components (Section 4.2), providing further insights into these aspects.

4.1. Proposal Definition

The ID and NID techniques each offer advantages and disadvantages depending on the project’s purpose and context. ID enables the identification and resolution of specific problems in local code snippets, improving code quality and reducing fix time. This approach proves faster and more cost-effective than global detection as it targets specific code areas requiring attention. On the other hand, NID identifies broader issues throughout the software system. It proves most valuable in larger, complex projects with interdependent code components and widespread problems. However, NID requires more detailed code analysis across the entire system, making it more time-consuming and costly.

According to a prior study, implementing appropriate processes can prevent the occurrence or persistence of software smells in a system. Conversely, ineffective or absent processes can lead to the emergence of software smells [3]. Despite various detection approaches, understanding “how” and “when” to apply them in the development process remains crucial. To bridge this gap, Figure 2 illustrates an approach integrating code smell detection techniques into the Scrum framework. Each numbered element in the figure represents a crucial component of the approach, and in the following, we offer a comprehensive description of each component.

(1) Besides the Product Backlog, another crucial artifact is dedicated to managing code smell occurrences within the Scrum framework. This shared resource necessitates collective maintenance and engagement from developers [3]. It serves as a repository for documenting instances of code smells detected using the Non-Interactive Detection (NID) technique, providing a comprehensive overview of code smell prevalence throughout the project. Incorporating these instances into the artifact facilitates a holistic understanding of code quality, enabling informed decisions regarding code smell refactoring and overall software quality improvement.

(2) During sprint planning, the team thoroughly examines and prioritizes code smell instances from the global list, employing predefined criteria and consulting with the product owner. To streamline this process, a combination of ID and NID techniques can be utilized. By leveraging both techniques, the team effectively addresses and validates the planning of code smell refactoring. This step is vital for determining which specific code smell instances will be transformed into actionable tasks for the upcoming sprint.

(3) The prioritization and order of items in the global list are based on the severity and criticality of code smells, ensuring that the most significant ones are addressed within the sprint. Each identified code smell occurrence from the prioritized list is converted into specific tasks, considering the team’s capacity and available resources. Factors such as granularity, spread level, and impacted code elements play a crucial role in determining the size and duration of these tasks. Subsequently, the tasks are assigned to developers responsible for executing them within the designated sprint timeframe.

(4) Developers continuously evaluate the software artifacts generated during the sprint through their programming activities, employing the ID technique. Thus, whenever a new instance of a code smell is detected in the code, the developer conducts an initial assessment based on factors such as the spread level of the smell, the time invested in refactoring, and the effort required. This assessment helps determine whether immediate or delayed removal of the code smell is appropriate. In the latter case, the smell is added to the global list, and its resolution is deliberated in future sprints.

(5) During the daily Scrum meetings, developers provide updates on critical code smells they have encountered and explain the rationale behind delaying their removal through refactoring actions. These decisions undergo team discussions and can be modified or maintained collaboratively. This approach enables the team to collectively evaluate the impact and urgency of addressing identified code smells, considering project constraints, resource availability, and technical dependencies. By fostering open communication and collaborative decision-making, the team effectively prioritizes and handles critical code smells, ensuring necessary actions are coordinated and informed.

(6) During the sprint review, it is crucial for the team to conduct a comprehensive analysis of the remaining code smells to assess the overall code quality. The NID technique can facilitate this analysis. The objective is to identify and document the results of this analysis, providing valuable insights and information for future actions and decision-making. By systematically evaluating the code smells that persist at the end of the sprint, the team gains a deeper understanding of the areas that require attention and improvement.

(7) Before considering sprint items as “done”, it is essential to check if they meet the criteria outlined in the “definition of done”. In this context, the ID technique can support the Scrum team in making decisions regarding task completion. The ID may provide data and indicators, such as software quality metrics and code smell occurrences, that can serve as acceptance criteria for completing sprint tasks. A broader analysis of the entire project becomes necessary if the sprint delivers increased granularity in a software release or version. In such cases, the NID technique can provide data and indicators that serve as acceptance criteria for a release or software version.

(8) At the end of the sprint iteration, the corresponding item in the “list of smells” is updated to include the code smells addressed during the sprint and any newly identified smells. Both the ID and NID techniques can be used to accomplish this task. For example, the ID technique can verify the removal of the prioritized local smells during the sprint, while the NID technique can validate more global instances newly introduced in the project.
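The handling of the global list across steps (2)-(4), (7), and (8) can be sketched in a few lines of Python. This is a simplified illustration only: the names `SmellItem`, `SmellList`, and `meets_definition_of_done`, the 1-to-5 severity scale, the effort estimate in hours, and the single-criterion triage rule are assumptions made for the example, whereas the approach itself leaves these criteria to the team.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SmellItem:
    kind: str
    location: str
    severity: int  # assumed scale: 1 (low) to 5 (critical)
    effort: int    # assumed refactoring estimate, in hours


@dataclass
class SmellList:
    """The shared 'list of smells' kept alongside the Product Backlog."""
    items: List[SmellItem] = field(default_factory=list)

    def plan_sprint(self, capacity: int) -> List[SmellItem]:
        """Steps (2)-(3): take the most severe smells that fit the capacity."""
        selected, remaining = [], capacity
        for item in sorted(self.items, key=lambda s: (-s.severity, s.effort)):
            if item.effort <= remaining:
                selected.append(item)
                remaining -= item.effort
        return selected

    def triage(self, item: SmellItem, effort_limit: int) -> str:
        """Step (4): refactor cheap smells now, defer the rest to the list."""
        if item.effort <= effort_limit:
            return "refactor now"
        self.items.append(item)
        return "deferred"

    def close_sprint(self, resolved: List[SmellItem]) -> None:
        """Step (8): remove the smells refactored during the sprint."""
        self.items = [s for s in self.items if s not in resolved]


def meets_definition_of_done(open_smells: List[SmellItem], max_severity: int) -> bool:
    """Step (7): accept work only if no open smell exceeds the agreed severity."""
    return all(s.severity <= max_severity for s in open_smells)
```

In practice the triage decision would also weigh the spread level of the smell and the time available, as described in step (4), and the deferral itself would be reported in the daily Scrum as in step (5).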

4.2. Scrum Adaptation

The proposed integration of Interactive Detection (ID) into the Scrum framework necessitates modifying its roles, artifacts, and events. The following points outline the key changes in each of these components:

Roles: a new role called the “Quality Evaluator” must be established to manage code quality maintenance and improvement actions within the development team. The Quality Evaluator should possess expertise in code smell detection and refactoring. This role is responsible for guiding team members involved in programming activities.

The “Quality Evaluator” role can be both a genuinely new role and an additional function for the Scrum Master. In smaller teams, the Scrum Master may handle quality assurance, but larger organizations may benefit from a dedicated Quality Evaluator. The significance depends on the organization’s size and needs for specialized quality assessment in agile software development.

Artifacts: the Product Backlog and, consequently, the Sprint Backlog, need to be expanded to include a list of code smell occurrences. This addition ensures that the ID and NID technique data and indicators can serve as acceptance criteria for the “definition of done” for various team-generated artifacts.

Code smells should be treated differently from other bugs/defects in the software development process. While bugs and defects may directly impact the software’s functionality and result in incorrect or unexpected behavior, code smells represent poor coding practices or design flaws that might not necessarily manifest as functional issues immediately but can lead to deeper software quality problems over time. Treating code smells separately acknowledges their importance in ensuring long-term code maintainability and software quality.

Events: the proposed approach primarily focuses on events directly associated with software development within the Scrum framework. In what follows, we describe the suggested changes in each Scrum event.

Sprint Planning: a dedicated period is included in Sprint Planning to review and prioritize code smells in the project. The Quality Evaluator collaborates with the product owner and developers to determine which code smells should be addressed in the current sprint. This modification aligns with the “adaptation” pillar of Scrum, ensuring that code smell management is integrated into the planning process.

Sprint: the Sprint is adjusted to promote continuous inspection of code artifacts throughout the working day, facilitated by the Interactive Detection (ID) technique. This allows code smells to be identified early, which makes refactoring easier. This adjustment aligns with the “inspection” pillar of Scrum, emphasizing the importance of proactive code quality assessment. Since the major cause of smells in software systems is developers’ poor technical skills and lack of awareness of how to write high-quality code [3], using ID in a disciplined way can help improve skills and awareness in smell detection.

Daily Scrum: the Daily Scrum remains unchanged, but now developers highlight critical and severe code smells as part of their challenges in achieving the sprint goal. The Scrum Team collectively decides whether to address these code smells in the current sprint or postpone them based on the required effort. This modification ensures that code smell management becomes a regular part of the team’s discussions and decision-making processes.

Sprint Review: in the Sprint Review, a comprehensive analysis of the global list of code smells is conducted, comparing previous and current versions to understand the changes. The team discusses the most critical code smells, particularly those briefly mentioned in Daily Scrums but postponed due to their complexity. This promotes transparency within the Scrum by facilitating effective communication and awareness about various artifacts generated throughout development.

5. Offline Validation: Insights from a Focus Group

Utilizing a focus group serves as a valuable method to evaluate the applicability of Interactive Detection (ID) within the agile development process. It enables the identification of potential strengths and weaknesses of the technique in alignment with the objectives and values of Scrum. Figure 3 provides an overview of the steps involved in conducting the focus group. The subsequent sections detail the planning (Section 5.1), execution (Section 5.2), and data analysis from the focus group (Section 5.3). Additionally, potential threats to the validity of the findings are presented and discussed (Section 5.4).

5.1. Planning

Conducting an effective focus group entails following several crucial steps. In our study, we adhered to specific recommendations outlined by Kontio et al. [32]. The first step involved defining the focus group’s objective and expected outcomes. Our primary aim was to validate an approach integrating the ID technique into Scrum. To achieve this, we sought feedback from potential users, including software developers and managers, to assess the practical adoption and usefulness of the proposed approach (Section 4). The second step entailed carefully selecting participants with relevant characteristics to ensure diversity and representativeness. Given that our research aimed to understand user perceptions of code smell detection techniques in agile development, participants with varying experience levels were chosen to obtain diverse perspectives and enrich the discussions.

The third step involved preparing a well-designed script of questions and topics to guide the group discussion. This script was crafted to strike a balance between providing structure and allowing for spontaneous conversation, facilitating the emergence of new and relevant issues. Lastly, the fourth step focused on selecting an appropriate location that was comfortable, conducive to discussion, and easily accessible for participants. This choice ensured a favorable active participation environment and minimized potential inconveniences. By meticulously following these steps, we gathered valuable insights through the focus group, which significantly contributed to the overall success of our research.

5.2. Conduction

A structured script was developed to guide the activities of the focus group. The first part of the focus group involved an introductory presentation on the purpose and methodology of the focus group, providing participants with the necessary background knowledge. The second part focused on presenting the proposed adaptations of roles, artifacts, and events within the Scrum framework, followed by a discussion among participants to evaluate the relevance of these propositions. In the third part, the software development approach integrating the Interactive Detection (ID) technique into Scrum was presented, and participants engaged in a guided discussion to assess the appropriateness of the proposed activities. Finally, the fourth part consisted of a practical demonstration of using the ID technique integrated into the agile process, followed by a discussion to identify potential benefits, drawbacks, and challenges of implementing the approach. Supplementary Materials [33] provide further details about the participants, script, and activities conducted during the focus group.

The focus group took place in May 2023 and lasted approximately 110 min. The activity was conducted using the Google Meet platform, enabling remote participation and offering a convenient and cost-effective solution as the participants were in different geographic regions. Eight participants were recruited through email invitations based on their explicit and voluntary interest in participating in the focus group. All participants were affiliated with the Research, Development, and Innovation Center of the Federal University of Campina Grande (VIRTUS/UFCG) and held roles as developers and project managers. While participants were expected to have at least moderate familiarity with the Java language and the agile software development process, extensive knowledge of code smells, refactoring, and detection techniques was not required. Overall, the participants met the criteria for the focus group, as they self-assessed their proficiency level as “very proficient” or “proficient” in all concepts associated with the focus group.

5.3. Results

The collected data from the focus group session were analyzed using a qualitative approach which involved consolidating, condensing, and interpreting the gathered information to gain deeper insights into the results. The audio recordings were transcribed and organized in a spreadsheet using the qualitative data analysis features of the Microsoft Excel tool. In the following sections, we present the main findings derived from the analysis of the focus group activities.

Adaptation of Scrum Components: The second part of the focus group aimed to evaluate the participant consensus regarding the proposed adaptations to the Scrum components. Through data analysis, we examined the participants’ perspectives on the relevance and feasibility of incorporating these adaptations into the Scrum framework.

Adaptation of Roles. Initially, the results associated with the proposal to create a new role (i.e., quality evaluator) in Scrum are presented. When asked about this proposal, most participants (5) agreed, while the remaining three partially agreed with creating this role. This suggests that the focus group participants received the proposal well and that they see value in including this new role to improve the quality of the development process and, consequently, the quality of the code. However, when asked if other role adaptations were necessary, most participants (6) disagreed, and only two agreed. This may indicate that the participants see the proposed creation of the “quality evaluator” as sufficient to improve the development process and that other changes may be unnecessary or harmful.

Adaptation of artifacts. In this activity, we present the results regarding the proposal to add a new artifact to the Scrum (i.e., Global List of smells in the Product Backlog). The results obtained during the focus group indicate that the participants received the creation of the artifact well. Most participants (6) fully agreed with the inclusion of this artifact, which suggests that they recognize the value that this artifact can add to the development process. This can be explained by the fact that the “Global List” can help manage code smells. On the other hand, when asked if there was a need for other adaptations of artifacts, most participants were neutral (5) or disagreed (3). These results can be explained by the fact that the inclusion of the “Global List” is already seen as a significant enough change to improve the development process.

Adaptation of events. This activity demonstrates the results associated with the proposed adaptation of Scrum events. When asked about their agreement with this adaptation, half of the participants strongly agreed (4), and the other half partially agreed (4). These results may indicate that the participants saw the adaptation of the events as positive, although not all completely agreed. However, when asked if there was a need for further adaptations in the events, most participants disagreed (6) or were neutral (2). This may indicate that the proposed adaptations are sufficient to improve the development process and that other changes could be unnecessary or harmful.

Presentation of the Approach: The third part of the focus group analysis focused on gathering data related to integrating the ID technique into Scrum. The findings suggest that the proposed approach can potentially enhance the effectiveness of the code smell detection process. Most participants (seven out of eight) strongly agreed with this assertion. This positive response can be attributed to adopting the ID technique, which enables continuous detection of code smells during development. By promptly identifying and addressing these issues, developers can improve software quality.

On the other hand, when considering whether integrating the ID technique into Scrum helps improve software quality, most participants (six out of eight) strongly agreed with the statement, while one disagreed. The dissenting opinion may stem from varying perspectives among participants regarding what constitutes an “improvement” in software quality. However, the participants who agreed with the statement likely perceived that the ID technique could enhance software quality by enabling the early detection and resolution of code issues before they can adversely impact the software’s functionality.

Another critical aspect of integrating the ID technique into Scrum is its adoption by software developers. In the survey, six participants agreed that ID integrated into Scrum could be quickly adopted by software development teams, while one was neutral, and one disagreed with this statement. This suggests that the ID can be successfully implemented in software development teams without significant challenges. A possible explanation is that Scrum is already widely adopted in many software development organizations, and the ID can be easily integrated into the existing processes.

Using the Approach: The fourth part of the focus group analysis focused on gathering data regarding the participant perceptions of the demonstrated example of using the ID technique integrated into Scrum, including its benefits, difficulties, and disadvantages.

Benefits: by examining the responses, we identified several benefits of using the ID technique integrated into Scrum, detailed as follows:

Improving code quality: All participants (eight out of eight) highlighted that integrating ID into Scrum directly enhances code quality. Their responses emphasized that continuous detection and correction of code smells result in greater productivity and a more manageable code. One expert (E1) stated that “…by using the ID technique integrated into the agile process, we will be able to maintain the code more easily because we will be detecting and correcting the smell, thus having greater productivity when carrying out the activities”. Another expert (E2) mentioned that “…the adoption of ID will improve quality due to continuous refactoring in software development activities”;
Improved product delivery: Most participants (six out of eight) acknowledged that integrating ID into Scrum helps improve product delivery. They noted that it enhances security, speed, and overall quality, enabling faster and more reliable software implementations. One expert (E2) stated that “…product delivery will be positively affected, as it will provide more security and speed of new implementations”. Another expert (E3) mentioned that “…the improvement in the delivery of the product at the end of the Sprint is noticeable”;
Easier maintenance: Participants (four out of eight) emphasized that integrating ID into Scrum leads to easier code maintenance, facilitating faster and more efficient changes. They expressed that continuous feedback accelerates the maintenance process and ensures cleaner and more organized code. One expert (E4) stated that it “…also ensures greater ease of maintenance in those codes that do not contain smells”. Another expert (E7) mentioned that “…because the feedback to the developer is in real time, it speeds up the maintenance process even more”.

In summary, the experts highlighted that integrating ID into Scrum directly improves software quality, detects problems in the code earlier, and facilitates their correction before they escalate. It promotes the development of a cleaner and more organized code, enables frequent and error-free deliveries, simplifies code maintenance, and allows faster and more efficient changes. These benefits were observed in eight responses for improved code quality, six for improved product delivery, and four for easier maintenance.

Difficulties: The analysis of the participant responses revealed several challenges in integrating the ID technique into Scrum, which are enumerated below:

Resistance to change: most participants (seven out of eight) expressed concerns about resistance to change when adopting the ID technique. They highlighted that this resistance could stem from a lack of support from leadership or developers. One expert (E8) mentioned that “…the only difficulty I observe at first would be adapting the teams to the ID technique”. Another expert (E5) stated that “…as the use of this type of integration is not widely used today, there would be an initial implementation barrier there”;
Lack of knowledge and skills: Participants (four out of eight) emphasized a lack of knowledge and skills as a significant challenge in applying the ID technique. They pointed out that developers may detect code smells but may not possess the technical expertise to address them effectively. One expert (E1) mentioned that “…the smells can be detected, but the developer may not have enough technical knowledge to correct them”. Another expert (E4) stated that “…I believe that the greatest difficulty that will be encountered will be the resistance of the people who will use this approach”;
Organizational difficulties: Participants (four out of eight) highlighted organizational constraints as potential barriers to implementing the ID technique. These difficulties may include limited budgets, resource constraints, and company culture. One expert (E6) stated the following: “…I believe that as this is ongoing research, there may be resistance to change on the part of some organizations.” Another expert (E7) mentioned that “…maybe the Product Owner can prioritize demands above the resolution proposed by ID”.

In summary, the challenges associated with implementing the ID technique in Scrum include resistance to change, a lack of knowledge and skills, and organizational difficulties. Focus group participants noted that resistance may arise due to a lack of support from leadership and developers, as well as a preference for routine and comfort in following established development processes. Insufficient knowledge and skills can pose an initial barrier to implementing the ID technique, and the need for adequate training and capacity building is crucial. Additionally, organizational difficulties may involve limited budgets, resource availability, company culture, and the potential burden of tasks and projects on developers.

Disadvantages: Upon analyzing the participants’ responses, several categories of disadvantages associated with implementing the ID technique integrated into Scrum were identified and are detailed as follows:

No disadvantages: a significant number of participants (five out of eight) stated they did not perceive any disadvantages. They did not identify specific drawbacks or negative aspects of integrating the ID technique into Scrum. Experts (E5 and E8) mentioned “…none” and “…I couldn’t see any disadvantages”, respectively;
Difficulty in adoption: some participants (three out of eight) acknowledged the potential difficulty in adopting the ID technique, particularly when the team is accustomed to using alternative approaches or development methods. They highlighted that resistance to change within the team could pose challenges. One expert (E1) stated the following: “…I believe that due to resistance to changes in the team, we may have difficulty with adoption” and another expert (E2) mentioned that “…due to cultural issues of organizations, the difficulty of adoption can be faced”;
Tool dependency: a couple of participants (two out of eight) expressed concerns about the ID technique integrated into Scrum requiring specific tools and technologies, potentially limiting developers’ options. They questioned why this approach should be used instead of the already established code quality tools. One expert (E4) asked: “…why use this approach and not an already established code quality tool?”, and another expert (E1) mentioned that “…we may have difficulty adopting the tool”.

Most participants did not perceive any disadvantages in integrating the ID technique into Scrum. However, some experts mentioned potential challenges in adopting the approach, particularly in teams accustomed to different techniques or development methods. Additionally, concerns were raised about tool dependency and the potential limitations it may impose on developers’ choices.

5.4. Validity Threats

In what follows, we discuss threats associated with this study and the actions taken to mitigate them by following the classification scheme proposed by Wohlin et al. [34].

Construct validity: Threats in a focus group include unclear instructions, biased or confusing questions, and moderator bias influencing data interpretation. The research protocol underwent continuous reviews and refinements to address these threats, including clear step-by-step descriptions. Participant feedback confirmed the clarity of the information presented, indicating no issues. Internal validity: Threats to internal validity include selection and interaction effects. Participants with diverse profiles, representative of experience in software development and agile approaches, were chosen to mitigate the selection effect. To reduce the impact of the interaction effect, individual responses were recorded without participant access to the answers of the others, minimizing the influence of previous responses on subsequent participants.

External validity: A threat to external validity concerns how clearly and in how much detail the study describes the methodological procedures adopted, the process of recruiting participants, the conduct of the focus group sessions, and the techniques used for data analysis. To mitigate this threat, a study protocol was defined and is available through Supplementary Materials, facilitating replication by independent researchers. Conclusion validity: A possible threat is associated with the subjective interpretation of the data. To minimize this threat, different data sources (such as audio and notes) were used to ensure the data’s integrity, and triangulation analyses were conducted to confirm the obtained results. Additionally, multiple researchers participated in the data analysis to reduce the influence of individual biases.

6. Online Validation: Insights from a Controlled Experiment

After validating the proposed software development approach, which integrates Interactive Detection (ID) into the Scrum framework through offline validation, this section focuses on conducting an online validation of the approach using a controlled experiment. Figure 4 provides an overview of the steps involved in the controlled experiment. The subsequent sections elaborate on the planning (Section 6.1) and execution (Section 6.2) of the experiment. Additionally, the obtained results are presented (Section 6.3), followed by a discussion of the threats to validity inherent in this study (Section 6.4).

6.1. Planning

To design the controlled experiment, a set of recommendations described in the works of Wohlin et al. [34] and Jedlitschka et al. [35] were followed. Participants performed tasks related to code smell detection with the support of the software development approach that integrates ID into the Scrum framework. All tasks were performed with the support of the ID and NID techniques embedded in the Eclipse ConCAD tool [36].

Regarding the participants of this study, 24 undergraduate students were recruited to carry out the experimental activities. All participants were enrolled in the Computer Engineering program and taking a Web Programming course. They were recruited through email based on their explicit and voluntary interest in participating in the experiment. It was expected that the participants were at least moderately familiar with the Java language. However, extensive knowledge of code smells or detection techniques used in the experiment was not expected from the participants. All study participants attended a training session to level their knowledge about code smells, detection approaches, refactoring, and using the integrated development environment in the context of the experimental activities.

The data collected during the experiment, including participants’ demographic aspects, code evaluations, and obtained feedback, were analyzed using statistical and qualitative methods to verify the effectiveness of the software development approach that integrates the ID into the Scrum framework (Section 4). For the analysis, resources from the R tool [37] were used, providing appropriate means to calculate the statistical tests considered in this study. Regarding statistical tests, we used the Wilcoxon signed-rank test [38] for the values of detected smell instances. This test was chosen because the data did not follow a normal distribution. Additionally, we used a paired t-test [38] to compare recall and precision measures since these values follow a normal distribution.
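
To make the Wilcoxon signed-rank test concrete, the following is a minimal Python sketch of its mechanics. It is not the analysis script of the study, which used R; the paired values are hypothetical, the function name is an assumption, and the exact p-value is obtained by enumerating all sign assignments, which is only practical for small samples.

```python
from itertools import product


def wilcoxon_signed_rank(x, y):
    """Return (W, exact two-sided p-value) for paired samples x and y."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    n = len(diffs)

    # Rank the absolute differences, averaging ranks over ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)

    # Under the null hypothesis every sign assignment is equally likely.
    rank_total = sum(ranks)
    extreme = 0
    for signs in product((0, 1), repeat=n):
        s_plus = sum(r for r, s in zip(ranks, signs) if s)
        if min(s_plus, rank_total - s_plus) <= w:
            extreme += 1
    return w, extreme / 2 ** n


# Hypothetical paired measurements (NOT the data of the study).
without_approach = [5, 7, 6, 9, 8, 10, 4, 6]
with_approach = [3, 4, 2, 6, 3, 4, 1, 2]
w, p = wilcoxon_signed_rank(without_approach, with_approach)
print(w, p)
```

In this sketch, W equals 0 whenever all non-zero differences share the same sign, which is the most extreme outcome the statistic can take.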

6.2. Conduction

Teams of participants performed experimental tasks associated with developing a software project using the Java language. In the context of this study, each team was tasked with developing a web system to simulate a bookstore domain. This system was chosen for its simplicity and ability to be developed within the experiment’s timeframe. The system allowed book registration, editing, deletion, book filtering, database integration, user registration and profile editing, authentication, and user-specific profile editing.

The development process utilized the Java language and the Spring framework for guidance. The teams were divided into experimental and control groups, with experimental teams using the proposed approach integrating ID into Scrum, and control teams using Scrum without ID support. The teams were balanced regarding experience and background to avoid biases in the experimental results. Each of the eight teams consisted of three students randomly selected to perform the activities.

To evaluate the results associated with the experimental activities, an “oracle” representing the list of code smells that truly represent maintainability issues in the system was obtained. Three activities were performed to generate the oracle: (i) the codes were analyzed using the JSpirit tool [39] to obtain software quality metrics, (ii) these metrics were used to automatically identify code smells based on predefined limits, and (iii) the identified instances were manually inspected and validated by two expert researchers in code smell detection. It is worth mentioning that eight distinct oracles were generated at the end of each experiment iteration.
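
Steps (ii) and (iii) of the oracle generation can be sketched as follows. The metric names, the threshold values, and the policy of keeping only candidates confirmed by both reviewers are assumptions made for illustration; they are not the limits or the validation protocol used in the study, and the code does not call the JSpirit tool.

```python
# Hypothetical thresholds for illustration only; not the limits of the study.
RULES = {
    "Long Method": lambda m: m.get("method_loc", 0) > 50,
    "Long Parameter List": lambda m: m.get("num_params", 0) > 5,
}


def detect_candidates(metrics_by_element):
    """Step (ii): flag (smell, element) pairs whose metrics exceed a limit."""
    return {
        (smell, element)
        for element, metrics in metrics_by_element.items()
        for smell, rule in RULES.items()
        if rule(metrics)
    }


def build_oracle(candidates, confirmed_a, confirmed_b):
    """Step (iii): keep candidates confirmed by both reviewers (assumed policy)."""
    return candidates & confirmed_a & confirmed_b


# Hypothetical metrics, as might be exported by a metrics tool in step (i).
metrics = {
    "BookService.search": {"method_loc": 82, "num_params": 3},
    "UserController.update": {"method_loc": 20, "num_params": 7},
    "BookRepository.save": {"method_loc": 12, "num_params": 1},
}
candidates = detect_candidates(metrics)

reviewer_a = {
    ("Long Method", "BookService.search"),
    ("Long Parameter List", "UserController.update"),
}
reviewer_b = {("Long Method", "BookService.search")}
oracle = build_oracle(candidates, reviewer_a, reviewer_b)
print(sorted(oracle))
```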

The execution environment for the experimental tasks, including files and tool support, was made available to the participant teams. Additionally, the experimental tasks were supervised by two researchers. The total duration of the experiment was 45 days, with two deliveries made during this period (i.e., one at the end of the second Sprint and one at the end of the third). For each delivery, a set of requirements and planned functionalities were defined. Moreover, the experiment was organized into three phases as described below.

Phase 1: Pre-Experiment. The purpose of this phase was to prepare the participants for the subsequent experimental activities, ensuring they had the necessary knowledge about the types of smells to be identified and familiarity with the ID and NID techniques provided by the Eclipse ConCAD tool. Initially, all participant teams received material containing the definitions of 10 types of smells supported by the Eclipse ConCAD tool along with illustrative examples of each type. Participants were given a maximum of 20 min to understand these definitions. They then underwent detailed training on the code smell detection techniques provided by the Eclipse ConCAD tool and on the proposed approach that integrates ID into Scrum (Section 4). Finally, all participants were invited to complete a questionnaire to collect relevant information, such as their software development experience, prior knowledge of code smell detection, and familiarity with the Eclipse IDE.

Phase 2: Software Development and Smell Detection. In this phase, the 24 participants were divided into eight teams to develop a web system using the Java language. In addition to implementing the system’s functionalities, the participants were also required to identify and catalog 10 different types of code smells throughout the experiment. Half of the participant teams used the proposed approach integrating ID into Scrum, while the other half used only the Scrum framework without disciplined ID support. The teams had to develop the web system over three 15-day Sprints. At the end of the second and third Sprints, each team provided a list containing information on the total number of true positives (TP), false positives (FP), and false negatives (FN). The detection results were analyzed only after the experiment’s conclusion to prevent the learning effect from introducing biases. This ensured that the participating teams were not influenced by the results during the detection process, maintaining the integrity and impartiality of the collected data.

Phase 3: Post-Experiment. In the final stage of the experiment, participants were invited to complete a feedback questionnaire to share their perceptions of using the adapted Scrum with the ID technique. Specifically, participants were asked to highlight the benefits and challenges encountered during the experimental activities. They were encouraged to report positive aspects observed, such as productivity gains, code quality improvement, facilitation of collaboration among teams, and enhancement of agile practices. Additionally, they were invited to mention any challenges faced during the application of the approach, such as difficulties in detecting smells, resistance to change, or integration issues between Scrum activities and the ID technique.

6.3. Results

This section presents the main results associated with the online validation of the proposed approach for software development that integrates ID into the Scrum framework. In what follows, the results associated with the experiment participants are presented (Section 6.3.1). Next, the main results associated with the experimental activities are described (Section 6.3.2). Finally, the participants’ perceptions about using the proposed approach are discussed (Section 6.3.3).

6.3.1. Pre-Experiment Results (Phase 1)

Data were obtained in Phase 1 of the experiment to define the participants’ profiles. The participants were males (80%) and females (20%). Regarding the age range, 60% of the participants were between 19 and 21 years old. Regarding prior experience in software development, it was observed that 25% of the participants had no previous knowledge, 33% had little experience (1–2 semesters), and 29% had some experience (3–4 semesters). It is important to mention that only 13% of the participants had significant experience (5 or more semesters).

Regarding prior experience in smell detection, 42% of the participants had no previous knowledge, while 29% had little experience (1–2 semesters). Only 29% had some experience (3–4 semesters). Regarding using the Eclipse IDE, most participants (75%) were familiar with it, while 25% had no experience with that specific tool. Regarding Java knowledge, 8% had no previous knowledge, and 42% had basic knowledge. Approximately 50% of the participants had intermediate or advanced knowledge of the language. Overall, the participants’ profile meets the study’s expectations, as they have intermediate knowledge of Java, are familiar with the integrated development environment used in the experiment, and have familiarity with code smell detection and refactoring as part of their vocabulary or work routine.

6.3.2. Code Smell Detection (Phase 2)

The second phase of the experiment involved performing software development tasks with the support of the proposed approach that integrates ID into Scrum. The objective of this activity was to evaluate the effectiveness and practical feasibility of the approach in terms of productivity, code quality, collaboration among teams, and suitability for agile practices. Additionally, it aimed to verify whether developers using the approach would detect code smells more accurately and comprehensively than developers using the non-adapted Scrum framework.

During the experimental activities, the participants were instructed to make two deliveries at the end of the second and third Sprint. Each delivery involved two artifacts: (i) the Java project implementing the specified requirements and functionalities, and (ii) a list containing 10 different types of code smells that the participating teams identified as maintainability issues. For each delivered project, an oracle containing a list of code smells representing maintainability problems was generated. By comparing these artifacts (i.e., the oracle versus the list of smells), it was possible to obtain the results related to True Positives (TP), False Positives (FP), and False Negatives (FN). Table 1 describes the results corresponding to each component.
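
The comparison between a team's list and the oracle amounts to three set operations. The sketch below assumes, for illustration only, that each smell is identified by a (smell type, code element) pair; the entries and the function name are hypothetical and do not come from the experiment.

```python
def compare_with_oracle(reported, oracle):
    """Count TP, FP, and FN for a team's smell list against the oracle."""
    return {
        "TP": len(reported & oracle),  # reported and confirmed by the oracle
        "FP": len(reported - oracle),  # reported but absent from the oracle
        "FN": len(oracle - reported),  # in the oracle but missed by the team
    }


# Hypothetical data: (smell type, code element)
oracle = {
    ("God Class", "BookService"),
    ("Long Method", "UserController.update"),
    ("Feature Envy", "OrderMapper.toDto"),
}
reported = {
    ("God Class", "BookService"),
    ("Long Method", "UserController.update"),
    ("Data Class", "BookDto"),
}

result = compare_with_oracle(reported, oracle)
print(result)
```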

The approach decreases the number of True Positives (TP). Participating teams using the approach detected 51 (first delivery) and 34 (second delivery) True Positives, whereas without the use of the approach, they detected 60 (first delivery) and 51 (second delivery) True Positives. Therefore, using the approach contributes to a reduction of approximately 25% in the total number of TP (i.e., remaining code smells) in the code smell detection activity. The results related to the TP (alpha = 0.05, p = 0.0061, W = 0, MD = 4.77, Z = −2.3664) were statistically significant using the Wilcoxon test [38]. The proposed approach contributes to the reduction in TP due to several factors. One possible reason is that the approach encourages a proactive strategy for dealing with smells, emphasizing prevention and early detection. This means that the team is better prepared to avoid introducing smells and promptly correct them if they occur. This preventive mindset contributes to the decrease in detected smells and true positives. Another possible reason is that Scrum promotes an iterative and incremental development process, focusing on the continuous delivery of functional software. This allows teams to identify and address smells in a more agile way as development progresses. Integrating ID into Scrum reinforces the attention to code quality and the use of practices that help prevent the introduction of smells.

The approach decreases the number of False Positives (FP). It was found that participants identified 12 (in the first delivery) and 9 (in the second delivery) False Positives using the approach. On the other hand, without the approach, participants identified 24 (in the first delivery) and 21 (in the second delivery) False Positives. It can be concluded, therefore, that the use of the approach can reduce the number of FP by up to 50%. It is important to mention that the statistical results related to FP (alpha = 0.05, p = 0.00032, W = 0, MD = −3.38, Z = −3.407) obtained in experimental tasks were significant using the Wilcoxon test [38]. There are several reasons why the proposed approach contributes to reducing FP. Firstly, the approach emphasizes code inspection from the early stages of development. This means developers are encouraged to analyze and correct code smells as soon as they are identified, even before they propagate and become more complex. This practice results in the early detection of issues and prevents false alarms from accumulating. Additionally, the proposed approach involves close collaboration among team members, including developers, testers, and project managers. During code inspections, all members can collaboratively review and discuss the smells found, allowing a more comprehensive view and a better understanding of the code context. These results are closely aligned with those of Murphy-Hill et al. [10].

The approach decreases the number of False Negatives (FN). The experimental results revealed that the teams using the approach identified 26 (first delivery) and 17 (second delivery) False Negatives, while the teams without the approach detected 38 (first delivery) and 33 (second delivery) False Negatives. This indicates that using the ID technique can reduce the number of FN by over 30%. It is important to mention that the statistical results related to FN (alpha = 0.05, p = 0.00389, W = 0, MD = −3.62, Z = −2.520) obtained by the experimental tasks were significant using the Wilcoxon test [38]. Several reasons explain this reduction in FN with the use of the proposed approach. Firstly, the approach emphasizes code inspection and early smell detection. By adopting continuous inspection practices and collaborative reviews, teams can detect more smells representing maintenance problems due to improved awareness and skills in smell detection [3,10]. A previous study argues that a major cause of smells in software systems is developers’ poor technical skills and lack of awareness of how to write high-quality code [3].

This allows these smells to be addressed before they become FN, meaning issues that would go unnoticed without the intervention of the ID technique. Additionally, the approach promotes a shared responsibility mindset for code quality. In Scrum, teams are encouraged to take ownership of software quality and strive for excellence in development. Using the approach, developers become more aware of the importance of identifying and resolving code smells that represent maintenance problems. This reduces FN as developers are more attentive and committed to ensuring code quality.
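To make the reported Wilcoxon statistics easier to read, the sketch below shows how the signed-rank statistic W and its Z score relate under the usual large-sample normal approximation. It is not the authors' analysis script: the paired differences are hypothetical, and the sample sizes (8 and 15) are assumptions, since the number of pairs is not stated in this section. In practice a library routine such as `scipy.stats.wilcoxon` would normally be used instead.

```python
import math


def signed_rank_w(differences):
    """Return (W, n): the smaller signed-rank sum and the count of nonzero differences.

    Zero differences are dropped; tied absolute values share the average rank.
    """
    ordered = sorted((d for d in differences if d != 0), key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return min(w_plus, w_minus), len(ordered)


def normal_approx_z(w, n):
    """Z score of W under the normal approximation (no tie or continuity correction)."""
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w - mean) / sd


# Hypothetical paired differences (with approach minus without), all negative.
HYPOTHETICAL_DIFFS = [-3, -2, -3, -4, -3, -4, -4, -5]

w, n = signed_rank_w(HYPOTHETICAL_DIFFS)
print(f"W = {w}, n = {n}, Z = {normal_approx_z(w, n):.3f}")
print(f"W = 0, n = 15 (assumed), Z = {normal_approx_z(0, 15):.3f}")
```

When every difference has the same sign, W is 0 and Z depends only on the number of pairs. With W = 0, the assumed n = 8 gives Z ≈ −2.52 and the assumed n = 15 gives Z ≈ −3.41, which are close to the values reported for FN and FP. This is only a consistency observation, not a reconstruction of the study's analysis.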

6.3.3. Post-Experiment Results (Phase 3)

In the third phase of the experiment, participants were asked to complete a questionnaire to share their perceptions about the proposed approach that integrates ID into Scrum for software development. The following are the quantitative results and qualitative excerpts from the responses of the 24 participants in the controlled experiment.

Overall Perception: The first part of the questionnaire aimed to capture participants’ general perceptions regarding the use of the proposed approach. Most participants evaluated the approach positively, with 83% reporting significant benefits in terms of code quality and development agility. Participants mentioned benefits such as faster identification and resolution of code smells, resulting in cleaner and higher-quality code. They also highlighted the efficiency of constantly reviewing and refactoring the code, leading to more organized development and a system with fewer bugs. The agile approach of Scrum helped them maintain a good work pace, meet deadlines, and prioritize important tasks, ultimately delivering a stable and reliable system.

However, 17% of participants mentioned facing challenges related to the need for learning and adapting to the new practices. Among the qualitative responses, some participants shared their experiences of facing challenges when initially adopting the proposed approach. They mentioned a learning period and the need to become familiar with the new practices. Regardless, as time went on, they recognized the benefits and positive outcomes that these practices brought to their work. Despite the initial difficulties, with their team’s support, they overcame challenges and became more proficient in using the approach.

Scrum: In the second part of the questionnaire, participants were asked to evaluate the adaptations made in the Scrum framework during the experiment. Regarding role adaptations, participants were asked if they agreed with creating the new role. Approximately 60% of participants fully agreed, while about 30% partially agreed with this proposition. The proposal was generally well received, and participants saw value in including this new role to improve the development process and code quality. The participants were also asked if they believed other role adaptations besides the new “quality evaluator” role were necessary. More than 50% of participants entirely disagreed, while another 30% partially disagreed with this statement.

Regarding artifact adaptations, the participants were questioned about including the new artifact. Over 80% of respondents fully or partially agreed with this proposition. Including this new artifact can provide a clearer and more detailed view of smell detection and refactoring activities, facilitating communication and collaboration among those involved in these activities. Participants were also asked if they believed that other artifact adaptations were necessary. The results revealed that most participants disagreed (62%) or remained neutral (29%) regarding the need for other artifact adaptations beyond the inclusion of the “Global List”.

Regarding event adaptations, participants were questioned about their agreement with the adaptations made to Scrum events. The results indicate that 45% of participants fully agreed with the proposed event adaptations, while 32% partially agreed. This suggests that most participants view the event adaptations positively, although not everyone agrees. Participants were also asked if they believed additional event adaptations would be necessary beyond the proposed ones. The results revealed that most participants disagreed (42%) or remained neutral (29%) regarding the need for further event adaptations beyond the proposed ones.

ID Technique: The third part of the questionnaire evaluated the ID technique used during the experimental activities. Participants were asked how far they agreed that the ID technique facilitated the identification and understanding of code smells. The results can be summarized as follows: (i) 55% fully agreed that the ID technique enabled the identification and understanding of code smell instances. They reported that the technique provided a more effective approach for detecting and understanding smells, allowing an in-depth analysis and a clearer view of problematic code; (ii) 33% partially agreed, indicating that the ID technique brought some benefit in identifying and understanding smells but that they also faced specific difficulties during the process. These participants emphasized that, despite some challenges, the technique still proved helpful in improving smell detection in the code; and (iii) 12% disagreed, stating that the ID technique did not facilitate the identification and understanding of code smell instances. These participants mentioned that the approach was unsuitable for their specific needs or that they did not notice a significant improvement compared to traditional smell detection methods.

The participants expressed their opinions regarding the influence of applying the ID technique on the quality of the produced code. The results can be summarized as follows: (i) 67% fully agreed that using the ID technique positively influenced the quality of the produced code. They mentioned that the technique helped identify and fix code issues more efficiently, resulting in higher-quality code at the end of the development process; (ii) 25% partially agreed, acknowledging that the ID technique had some positive impact on the quality of the produced code but also noting that other factors, such as the team’s experience and the project’s context, played an essential role in overall code quality; (iii) 8% disagreed, stating that applying the ID technique did not have a positive influence on the quality of the produced code. They pointed out that other existing methods or practices were already in place to ensure code quality, and the ID technique did not bring additional improvements in this aspect.

Next, the participants were asked to share the main benefits of using the proposed approach integrating ID into Scrum. The participants’ responses can be grouped as Better identification of smells (62%), In-depth code understanding (54%), Early smell detection (45%), and Improved collaboration and communication (38%). Finally, the participants shared the challenges and difficulties encountered when applying the ID technique during development. Their responses can be summarized as follows: Learning curve (58%), Change resistance (50%), Cognitive overload (38%), and Integration with existing tools (30%).

Final Considerations: The participants were asked if applying the proposed approach could contribute to communication and collaboration among team members. Analysis of their answers showed that participants emphasized that the proposed approach prioritizes communication, facilitates information exchange, and promotes collaborative work. They also highlighted that frequent meetings and interactions among team members in the Scrum approach foster alignment, engagement, and a conducive environment for teamwork, idea sharing, and solution finding. Finally, they pointed out that clear and open communication in Scrum with interactive detection enhances task understanding and minimizes communication gaps.

Finally, participants were asked to share their recommendations regarding the use of the proposed approach for other development teams. Their responses can be summarized as follows: (1) Positive recommendation: 79% of the participants highly recommended using the approach for other development teams. They emphasized the observed benefits, such as the ease of identifying and understanding smell instances, the positive influence on the quality of the produced code, and the improved collaboration and communication among team members; (2) Neutral recommendation: 13% of the participants expressed impartial advice regarding using the approach. These participants suggest that the teams interested in adopting the approach should carefully assess its suitability to their needs and consider factors such as the team’s familiarity with the ID technique and the availability of required resources; and (3) Negative recommendation: Less than 8% of the participants provided negative advice regarding the use of the approach. These participants believe the ID technique may not be suitable for all development teams, especially those with resource constraints or an organizational culture less inclined to adopt interactive approaches.

6.4. Validity Threats

Below, we discuss the threats to the validity of the study and the measures taken to mitigate them, following the classification scheme proposed by Wohlin et al. [34].

Construct Validity: Participants’ prior experience and technical knowledge may vary, potentially influencing the results and data interpretation. To mitigate this, teams were balanced in terms of experience and background to avoid biases. Clear guidelines were provided to ensure consistency in development practices, even within the Scrum framework. Internal Validity: Participants’ prior knowledge of code smell detection could influence their decisions during the experiment. To minimize this, detailed training was provided on the detection techniques and the software development approach that integrates the ID into Scrum, aiming to level participants’ knowledge. Additionally, independent experts validated code smell instances using transparent and objective criteria.

External Validity: The simulated bookstore domain system may not fully represent a real-world development environment, potentially limiting generalizability. However, its selection was based on simplicity and suitability for the experiment’s timeframe. The presence of researchers during tasks could have influenced participants’ behavior, but it was emphasized that decisions and actions should reflect real software development practices. Conclusion Validity: The study’s sample size of 24 participants may limit generalization to a larger population. However, appropriate methodological procedures were implemented to ensure internal validity within the experiment’s context. The use of student participants facilitated recruitment and provided valuable insights. Limitations also exist concerning the specific language (Java) and technology (Spring framework) used, potentially affecting generalizability to different software development contexts. The simulated system’s characteristics might not fully align with real-world scenarios, but efforts were made to minimize this effect by selecting a simple and representative simulation.

7. Implications and Benefits

In what follows, we examine the study’s implications for the research community and industrial practitioners. Firstly, we outline some implications for researchers:

Investigate long-term effects: the study provides insights into the short-term effects of integrating code smell detection techniques into Scrum. Researchers should investigate the long-term impact and sustainability of using the software development approach that integrates the ID into Scrum. Longitudinal studies can assess whether the observed positive effects are maintained over time and identify any challenges or diminishing returns.
Explore scalability and applicability: we focused on a single project and a specific context. Researchers should explore the scalability and applicability of the findings across different projects, teams, and domains. Investigating how the integration of code smell detection techniques performs in various settings can provide a more comprehensive understanding of its benefits and limitations.
Assess the effectiveness of integrating code smell detection techniques: the results indicate that integrating the ID technique into Scrum can improve code quality and software development processes. Researchers should conduct studies to validate the effectiveness of different code smell detection techniques in agile environments and explore their impact on software quality and overall project outcomes.
Address challenges related to change resistance: resistance to change emerged as a significant difficulty in integrating the ID technique into Scrum. Researchers should investigate strategies to overcome resistance and facilitate the adoption of new practices and methodologies in software development teams. This may involve addressing cultural barriers, providing adequate training and support, and involving key stakeholders in the change process.

By addressing these implications, researchers can expand the knowledge base and provide valuable insights into the integration of code smell detection techniques, their broader impact, and the considerations necessary for successful adoption in agile software development. Additionally, we highlight the main implications for practitioners:

Consider integrating code smell detection: the study demonstrates the potential benefits of integrating code smell detection techniques into Scrum. Practitioners should consider incorporating these techniques into their software development practices to improve code quality and maintainability.
Train and educate development teams: to effectively leverage code smell detection techniques, practitioners should invest in training and educating their development teams. Providing knowledge and guidance on identifying and addressing code smells can empower developers to improve code quality and prevent technical debt proactively.
Balance manual and automated code analysis: the study used manual code smell detection by quality evaluators. Practitioners should balance manual code analysis and the leveraging of automated code analysis tools. Automation can enhance the detection process, increasing efficiency and scalability while maintaining the contextual expertise of human evaluators.
Monitor code quality metrics: integrating code smell detection techniques enables the tracking of code quality metrics over time. Practitioners should establish mechanisms to monitor and analyze these metrics regularly. Continuous monitoring can provide insights into code quality trends, identify potential areas of improvement, and help allocate resources effectively.
Evaluate the impact on development speed: while code smell detection can enhance code quality, practitioners should assess the impact on development speed and project timelines. Finding the right balance between addressing code smells and delivering software within schedule constraints is crucial. Regularly evaluate the trade-offs between code quality improvements and time-to-market goals.
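As one possible way to act on the recommendation to monitor code quality metrics, the sketch below tracks how many detected smells remain open at the end of each sprint and flags sprints where that number has grown several times in a row. Everything here is an assumption made for illustration: the `SprintReport` record, the function names, the two-sprint window, and the sample numbers are hypothetical and do not come from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SprintReport:
    sprint: int
    smells_detected: int    # confirmed smells found during the sprint
    smells_refactored: int  # smells removed during the sprint


def open_smells_per_sprint(reports):
    """Running count of smells still open at the end of each sprint."""
    open_count = 0
    history = []
    for report in sorted(reports, key=lambda r: r.sprint):
        open_count += report.smells_detected - report.smells_refactored
        history.append((report.sprint, open_count))
    return history


def growing_sprints(history, window=2):
    """Sprints that close a run of at least `window` consecutive increases."""
    flagged = []
    streak = 0
    for (_, previous), (sprint, current) in zip(history, history[1:]):
        streak = streak + 1 if current > previous else 0
        if streak >= window:
            flagged.append(sprint)
    return flagged


# Hypothetical data, for illustration only.
reports = [
    SprintReport(1, 10, 8),
    SprintReport(2, 9, 9),
    SprintReport(3, 12, 7),
    SprintReport(4, 11, 6),
    SprintReport(5, 6, 10),
]
history = open_smells_per_sprint(reports)
print("open smells per sprint:", history)
print("sprints to review:", growing_sprints(history))
```

A team could review the flagged sprints during its retrospective to decide whether more refactoring time is needed, which also supports the trade-off between code quality and delivery speed mentioned above.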

Considering these implications, practitioners can leverage the benefits of integrating code smell detection techniques into their development processes, improving code quality, reducing technical debt, and enhancing overall project success. The research community must guide practitioners in continuously using the ID and NID techniques to maintain the code quality of their software systems, thereby providing a valuable contribution to practice.

8. Conclusions and Future Directions

This study presents a software development approach integrating the ID technique into the Scrum framework. A focus group of eight professionals, including developers and project managers, evaluated the proposed approach. The findings revealed that this integration brings significant benefits to software development, such as improved code quality, easier maintenance, and enhanced product delivery. However, the participants also highlighted challenges in implementing the approach, including a lack of knowledge and skills, organizational difficulties, and resistance to change. The evaluation results indicate that using the approach can be a promising strategy for improving software development in organizations that follow the Scrum model.

Then, a controlled experiment with 24 participants concluded that using the proposed approach promoted increased attention to code quality, early anomaly detection, and collaboration among team members. One of the main findings was an improvement of over 15% in effectiveness measures. This indicates that the proposed approach enabled more accurate and comprehensive code smell detection, ensuring that more smells were identified and refactored during software development. Additionally, the approach facilitated collaboration among the involved teams, promoting better communication and knowledge sharing. The disciplined application of the ID technique encouraged cautious code analysis and the detailed review of detected smells, resulting in increased awareness of the importance of code quality throughout the software development process.

In conclusion, our proposed approach signifies a noteworthy advancement in integrating code smell detection within the Scrum framework. However, to achieve optimality, future research endeavors may be necessary. While we diligently designed a robust and effective solution, we welcome continuous exploration, feedback, and contributions from the research community to further refine and enhance our approach. As a future research opportunity, we suggest conducting experimental studies involving a larger sample size and a diverse set of software development teams from industrial environments. This will provide more robust and reliable results, increasing the validity of the proposed approach. In addition, to enhance the relevance of the evaluation, we suggest conducting assessments in a real-world industrial setting. This will help validate the practical implications and effectiveness of the proposed approach in a context closer to that of its intended use. Finally, we also suggest conducting longitudinal studies that span longer periods of time to assess the long-term impact and sustainability of the proposed approach. This will help uncover potential challenges or benefits arising over time and provide valuable insights for practitioners and researchers.

Supplementary Materials

The Supplementary Materials [33] offer additional comprehensive information concerning the participants, script, and activities conducted during the focus group.

Author Contributions

D.A.: defined the study design, validation protocols, empirical studies execution, formal analysis, and wrote the original draft; E.G.: participated in validation, conceptualization, and text revision; M.P.: participated in validation, conceptualization, and text revision; H.A.: assisted in methodology, investigation, and formal analysis; A.P.: contributed to conceptualization and resource acquisition. All authors have read and agreed to the published version of the manuscript.

Funding

This research received support from the IFPB employee qualification incentive program (PIQIFPB)—Public Notice Nr 21/2021/PRPIPG.

Institutional Review Board Statement

The empirical studies described in this manuscript were conducted in adherence to fundamental ethical principles, including obtaining informed consent, upholding privacy, and ensuring the respectful, secure, and confidential handling of collected data by the authors.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

All data and Supplementary Materials can be obtained from the corresponding author.

Acknowledgments

The authors thank all the subjects participating in the focus group and controlled experiment.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Fowler, M. Refactoring: Improving the Design of Existing Code; Addison-Wesley Professional: Boston, MA, USA, 2018.
2. Fernandes, E.; Oliveira, J.; Vale, G.; Paiva, T.; Figueiredo, E. A review-based comparative study of bad smell detection tools. In Proceedings of the 20th International Conference on Evaluation and Assessment in Software Engineering, Limerick, Ireland, 1–3 June 2016; pp. 1–12.
3. Sharma, T.; Spinellis, D. A survey on software smells. J. Syst. Softw. 2018, 138, 158–173.
4. De Paulo Sobrinho, E.V.; De Lucia, A.; de Almeida Maia, M. A systematic literature review on bad smells—5 W’s: Which, when, what, who, where. IEEE Trans. Softw. Eng. 2018, 47, 17–66.
5. Pereira dos Reis, J.; Brito e Abreu, F.; de Figueiredo Carneiro, G.; Anslow, C. Code smells detection and visualization: A systematic literature review. Arch. Comput. Methods Eng. 2022, 29, 47–94.
6. Albuquerque, D.; Guimarães, E.; Braga, A.; Perkusich, M.; Almeida, H.; Perkusich, A. Empirical Assessment on Interactive Detection of Code Smells. In Proceedings of the 2022 International Conference on Software, Telecommunications and Computer Networks (SoftCOM), Split, Croatia, 22–24 September 2022; pp. 1–6.
7. Do, L.N.Q.; Ali, K.; Livshits, B.; Bodden, E.; Smith, J.; Murphy-Hill, E. Just-in-time static analysis. In Proceedings of the 26th ACM SIGSOFT International Symposium on Software Testing and Analysis, Santa Barbara, CA, USA, 10–14 July 2017; pp. 307–317.
8. Silva, D.; Tsantalis, N.; Valente, M.T. Why we refactor? Confessions of github contributors. In Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, Seattle, WA, USA, 13–18 November 2016; pp. 858–870.
9. Schnappinger, M.; Osman, M.H.; Pretschner, A.; Pizka, M.; Fietzke, A. Software quality assessment in practice: A hypothesis-driven framework. In Proceedings of the 12th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, Oulu, Finland, 11–12 October 2018; pp. 1–6.
10. Murphy-Hill, E.; Barik, T.; Black, A.P. Interactive ambient visualizations for soft advice. Inf. Vis. 2013, 12, 107–132.
11. Lacerda, G.; Petrillo, F.; Pimenta, M.; Guéhéneuc, Y.G. Code smells and refactoring: A tertiary systematic review of challenges and observations. J. Syst. Softw. 2020, 167, 110610.
12. Beck, K.; Beedle, M.; van Bennekum, A.; Cockburn, A.; Cunningham, W.; Fowler, M.; Grenning, J.; Highsmith, J.; Hunt, A.; Jeffries, R.; et al. Manifesto for Agile Software Development. In Proceedings of the Agile Manifesto, Snowbird, UT, USA, 11–13 February 2001.
13. Srivastava, A.; Bhardwaj, S.; Saraswat, S. SCRUM model for agile methodology. In Proceedings of the 2017 International Conference on Computing, Communication and Automation (ICCCA), Greater Noida, India, 5–6 May 2017; pp. 864–869.
14. Lakehal, A.; Alti, A.; Laborie, S.; Roose, P. A semantic agile approach for reconfigurable distributed applications in pervasive environments. Int. J. Ambient. Comput. Intell. 2020, 11, 48–67.
15. Prause, C.R.; Apelt, S. An approach for continuous inspection of source code. In Proceedings of the 6th International Workshop on Software Quality, Leipzig, Germany, 10 May 2008; pp. 17–22.
16. Prause, C.R.; Jarke, M. Gamification for enforcing coding conventions. In Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering, Bergamo, Italy, 30 August–4 September 2015; pp. 649–660.
17. Marinescu, R. Measurement and quality in object-oriented design. In Proceedings of the 21st IEEE International Conference on Software Maintenance (ICSM’05), Washington, DC, USA, 25–30 September 2005; pp. 701–704.
18. Khomh, F.; Di Penta, M.; Guéhéneuc, Y.G.; Antoniol, G. An exploratory study of the impact of antipatterns on class change-and fault-proneness. Empir. Softw. Eng. 2012, 17, 243–275.
19. Palomba, F.; Bavota, G.; Di Penta, M.; Fasano, F.; Oliveto, R.; De Lucia, A. On the diffuseness and the impact on maintainability of code smells: A large scale empirical investigation. Empir. Softw. Eng. 2018, 23, 1188–1221.
20. Albuquerque, D.; Garcia, A.; Oliveira, R.; Oizumi, W. Deteccao interativa de anomalias de codigo: Um estudo experimental. In Proceedings of the Workshop on Software Modularity, WMOD2014, Maceio, Brazil, 28 September–3 October 2014.
21. Murphy-Hill, E.; Black, A.P. Refactoring tools: Fitness for purpose. IEEE Softw. 2008, 25, 38–44.
22. Morales, R.; Soh, Z.; Khomh, F.; Antoniol, G.; Chicano, F. On the use of developers’ context for automatic refactoring of software anti-patterns. J. Syst. Softw. 2017, 128, 236–251.
23. Opdyke, W.F. Refactoring Object-Oriented Frameworks; University of Illinois at Urbana-Champaign: Champaign, IL, USA, 1992.
24. Mantyla, M.V. An experiment on subjective evolvability evaluation of object-oriented software: Explaining factors and interrater agreement. In Proceedings of the 2005 International Symposium on Empirical Software Engineering, Noosa Heads, Australia, 17–18 November 2005; p. 10.
25. Palomba, F.; Bavota, G.; Di Penta, M.; Oliveto, R.; De Lucia, A.; Poshyvanyk, D. Detecting bad smells in source code using change history information. In Proceedings of the 2013 28th IEEE/ACM International Conference on Automated Software Engineering (ASE), Palo Alto, CA, USA, 11–15 November 2013; pp. 268–278.
26. Fontana, F.A.; Mäntylä, M.V.; Zanoni, M.; Marino, A. Comparing and experimenting machine learning techniques for code smell detection. Empir. Softw. Eng. 2016, 21, 1143–1191.
27. Tufano, M.; Palomba, F.; Bavota, G.; Oliveto, R.; Di Penta, M.; De Lucia, A.; Poshyvanyk, D. When and why your code starts to smell bad (and whether the smells go away). IEEE Trans. Softw. Eng. 2017, 43, 1063–1088.
28. Dewangan, S.; Rao, R.S.; Mishra, A.; Gupta, M. Code Smell Detection Using Ensemble Machine Learning Algorithms. Appl. Sci. 2022, 12, 10321.
29. Paiva, T.; Damasceno, A.; Figueiredo, E.; Sant’Anna, C. On the evaluation of code smells and detection tools. J. Softw. Eng. Res. Dev. 2017, 5, 7.
30. Mumtaz, H.; Beck, F.; Weiskopf, D. Detecting bad smells in software systems with linked multivariate visualizations. In Proceedings of the 2018 IEEE Working Conference on Software Visualization (VISSOFT), Madrid, Spain, 24–25 September 2018; pp. 12–20.
31. Tsantalis, N.; Chaikalis, T.; Chatzigeorgiou, A. Ten years of JDeodorant: Lessons learned from the hunt for smells. In Proceedings of the 2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER), Campobasso, Italy, 20–23 March 2018; pp. 4–14.
32. Kontio, J.; Bragge, J.; Lehtola, L. The focus group method as an empirical tool in software engineering. In Guide to Advanced Empirical Software Engineering; Springer: London, UK, 2008; pp. 93–116.
33. Albuquerque, D. Supplementary Material—Validating an Approach of Interactive Detection into Scrum Framework—Figshare. 2023. Available online: https://figshare.com/articles/dataset/_Focus_Group_Valida_o_da_Proposta_de_Abordagem_DI_integrada_ao_Scrum/22777067/1 (accessed on 22 June 2023).
34. Wohlin, C.; Runeson, P.; Höst, M.; Ohlsson, M.C.; Regnell, B.; Wesslén, A. Experimentation in Software Engineering; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2012.
35. Jedlitschka, A.; Ciolkowski, M.; Pfahl, D. Reporting experiments in software engineering. In Guide to Advanced Empirical Software Engineering; Springer: London, UK, 2008; pp. 201–228.
36. Albuquerque, D.; Guimaraes, E.; Perkusich, M.; Almeida, H.; Perkusich, A. ConCAD: A Tool for Interactive Detection of Code Anomalies. In Proceedings of the Anais do X Workshop de Visualização, Evolução e Manutenção de Software, Online, 9–13 November 2022; pp. 31–35.
37. Kolaczyk, E.D.; Csárdi, G. Statistical Analysis of Network Data with R; Springer: New York, NY, USA, 2014; Volume 65.
38. Kraska-Miller, M. Nonparametric Statistics for Social and Behavioral Sciences; CRC Press: New York, NY, USA, 2013.
39. Vidal, S.; Vazquez, H.; Diaz-Pace, J.A.; Marcos, C.; Garcia, A.; Oizumi, W. JSpIRIT: A flexible tool for the analysis of code smells. In Proceedings of the 2015 34th International Conference of the Chilean Computer Science Society (SCCC), Santiago, Chile, 9–13 November 2015; pp. 1–6.

Figure 1. Techniques for code smell detection.

Figure 2. Overview of the Approach for Integrating ID Technique into Scrum.

Figure 3. Focus Group Overview.

Figure 4. Controlled Experiment Overview.

Table 1. Smell Detection Results.

Group	Team	1st Release TP	1st Release FP	1st Release FN	2nd Release TP	2nd Release FP	2nd Release FN
Scrum + ID	Team 1	11	3	6	9	3	4
Scrum + ID	Team 2	12	2	7	8	2	4
Scrum + ID	Team 3	13	3	6	9	2	5
Scrum + ID	Team 4	15	4	7	8	2	4
Scrum + ID	Total	51	12	26	34	9	17
Scrum + ID	Average	12.75	3	6.5	8.5	2.25	4.25
Scrum	Team 5	14	5	9	11	5	7
Scrum	Team 6	15	6	9	12	5	8
Scrum	Team 7	16	6	9	14	5	9
Scrum	Team 8	15	7	11	14	6	9
Scrum	Total	60	24	38	51	21	33
Scrum	Average	15	6	9.5	12.75	5.25	8.25
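The totals in Table 1 can be turned into standard detection-effectiveness figures. The sketch below computes precision, recall, and F1 from the TP, FP, and FN totals using their textbook definitions. Whether these are exactly the effectiveness measures used elsewhere in the study is an assumption, and the class and variable names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionCounts:
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives

    @property
    def precision(self) -> float:
        """Share of reported smells that were real: TP / (TP + FP)."""
        return self.tp / (self.tp + self.fp)

    @property
    def recall(self) -> float:
        """Share of real smells that were reported: TP / (TP + FN)."""
        return self.tp / (self.tp + self.fn)

    @property
    def f1(self) -> float:
        """Harmonic mean of precision and recall."""
        p, r = self.precision, self.recall
        return 2 * p * r / (p + r)


# Totals per group and release, copied from Table 1.
TOTALS = {
    ("Scrum + ID", 1): DetectionCounts(tp=51, fp=12, fn=26),
    ("Scrum + ID", 2): DetectionCounts(tp=34, fp=9, fn=17),
    ("Scrum", 1): DetectionCounts(tp=60, fp=24, fn=38),
    ("Scrum", 2): DetectionCounts(tp=51, fp=21, fn=33),
}

for (group, release), counts in TOTALS.items():
    print(
        f"{group:<10} release {release}: "
        f"precision={counts.precision:.3f} "
        f"recall={counts.recall:.3f} F1={counts.f1:.3f}"
    )
```

With these totals, precision in the first release is about 0.81 for Scrum + ID against about 0.71 for Scrum alone, and recall is about 0.66 against about 0.61.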

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
