Article

An Aspects Framework for Component-Based Requirements Prediction and Regression Testing

1 University Institute of Information Technology, Pir Mehr Ali Shah Arid Agriculture University, Rawalpindi 46000, Pakistan
2 Department of Information Systems, College of Computer and Information Sciences, Jouf University, Sakakah 72311, Saudi Arabia
3 School of Computer Science, SCS Taylor’s University, Subang Jaya 47500, Malaysia
4 Department of Information Technology, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
* Authors to whom correspondence should be addressed.
Sustainability 2022, 14(21), 14563; https://doi.org/10.3390/su142114563
Submission received: 23 September 2022 / Revised: 30 October 2022 / Accepted: 2 November 2022 / Published: 5 November 2022

Abstract
Component-based software development has become increasingly popular in recent decades. Currently, component delivery includes only interface specifications, which complicates the selection and integration of suitable components to build a new system. The majority of components are reused after appropriate modification for the new system or for a new version of the system. After component integration, errors may occur during the interaction of component features due to incomplete, ambiguous, or mismatched terms used in requirement analysis and specification, affecting component validation. Therefore, there is a need for a study that identifies these challenges and converts concepts into practice by providing solutions to them. The objective of this study is to identify attributes and information sources that are essential during component-based development; the proposed framework is based on these attributes and information sources. In this study, we provide a taxonomy of attributes and information sources across the different activities of component development and propose a framework to improve the component development process. To investigate the proposed framework, we performed an experimental study with industrial practitioners to obtain real-world results. The results show that the proposed framework improves the process of component specification and validation without ambiguity or component failure. Additionally, the proposed framework outperforms the comparison methods (random priority, clustering-based, and execution rate): its accuracy, F-measure, and fault identification rate were higher (i.e., greater than 80%) than those of the other methods (i.e., less than 80%). The proposed framework provides a significant guideline for practitioners and researchers.

1. Introduction

Component-Based Software Engineering (CBSE) techniques are in growing demand and are commonly used for software development. CBSE reduces complexity and enables rapid adaptation to change by utilizing reusability concepts [1]. Reusability requires modification (instead of making products from scratch) and the integration of different components to meet customers’ needs [2]. Reusability is essential in product development to deal with continuous changes in customers’ needs. The presence of Component-Based Software (CBS) in our daily life is increasing, as seen in the examples in Figure 1, providing different services automatically [3,4]. CBS systems are developed to make project development easy, fast, and cheap.
CBSE divides project requirements into various components and reuses these components in applications of a similar domain, instead of re-making similar components [2]. The reusability of components enhances the reliability of the overall system. Apart from its other characteristics, reusability is the backbone of CBSE: already-built components are used, making it unnecessary to rebuild them [3,4]. A component is characterized by composition, context dependencies, contract-based interfaces, independent deployment, and composition from a third-party perspective. Software, hardware, services, and networks are components in CBSE used to reduce development complexity. Various large-scale complex products, such as software product lines, service-oriented architectures, embedded systems, cyber-physical systems, and automotive systems, have adopted it [1,5]. It is a useful approach for developing complex systems with a multi-stakeholder perspective. However, it has negative impacts, such as increased collaboration among components, increased criticality, and complexity issues [6]. Consequently, it causes other problems such as software anomalies, cooperative collaboration behavior, testing difficulties during integration, and resource management issues [7,8,9,10,11].
The integration of components for system development makes CBSE error-prone and increases testing cost and effort. The continuous integration testing level is used to expose bugs during component interaction [12]. The widely used integration testing techniques [13,14] can be improved through Model-Based Techniques (MBT). MBT are used to trace requirements semantically, from requirement specifications to their relevant test cases (TC), in continuous integration. This is useful in highly configurable and component-based systems with a wide range of configurations and customizations, where each unit or component has its own set of requirements and TC, such as web services, embedded systems, or modules that encapsulate internal functionality. During customization, different components with different functionalities form a single system to facilitate different user requirements in a single application. The complexity in handling customization and configuration in CBS is due to inconsistency and ambiguity in requirements and the involvement of multiple stakeholders [15,16]. Another reason is the missing and incomplete semantic information of specifications written in natural language [15]. Furthermore, the regression testing of CBS after a modification becomes difficult due to inconsistent requirements, improper component selection, and poor test planning [3,17,18].
Different practices presented in the existing literature highlight important information. The authors in [19] presented a dynamic component model with an appropriate adaptation and validation mechanism to configure and validate components and to automatically mitigate interface mismatches semantically, making it feasible to develop software components that support plug and play. Component selection from the available component sources is important during development; thus, in [3], the authors conducted a survey among industrial practitioners and identified attributes that are important in decision-making during component selection. Another important aspect identified from the literature is the need for run-time change management in CBSD; in [20], an explanatory study identified the trade-off between quality attributes during dynamic design changes in CBSD. In [14], components were validated by formal analysis against requirements and their relevant test cases (TC) in distance-based and sensor-based components, to circulate correct and on-time information among all the components. Other studies found that formal and semantic-based component specifications significantly impact all phases of CBSD [7,13,14,21] by reducing inconsistency, redundancy, irrelevancy, incompleteness, and unbalanced component structure, which improve after analysis and visualization of heterogeneous components [21,22]. Additionally, risk-based requirements arising from inconsistency and incompleteness require extra effort, time, and cost for reliability analysis after modification [17]. Further, there is a need to increase the fault detection rate [17,23] after validation at every phase of system testing.
The aims of the literature review are to identify recent trends, attributes, and problems during product development using CBSE. Accordingly, we identified the different activities, types, and sources of CBS in use. The activities performed during CBSE are [24,25,26,27,28,29]:
  • Component analysis and specification (this includes correct component specification as well as component reusability searches from various available resources, as shown in Figure 2);
  • Component prioritization (to identify the correct sequence of components for integration);
  • Test planning and management (defines and controls the software test schedule, process, requirements, supporting tools, and maintains multiple versions of software test suites);
  • The preparation of test data and TC (defining the scope, objectives, and pre- and post-conditions of TC; modification in accordance with requirements, or re-searching for reuse);
  • Development and integration (design and integration) of components.
  • Validation (testing of components at different levels of system development).
The above discussions show that factors such as inconsistent requirements, reusability, component selection, and the validation of components create challenges during the development of component-based applications. The development of a CBS depends on third-party involvement, with or without code availability. Based on the different reuse functions of components, four sources of components are usually available during the development of a new product, viz., open-source, internally built, outsourced, and off-the-shelf components [1,3,22,30], as shown in Figure 2.
The main types of CBS are: Software Product Line (it satisfies the specific needs of customers by sharing and managing standard features [31]); Embedded (it has a combination of complicated and time-consuming decision-making components [20]); Cyber-Physical (it monitors and controls information after integration of different components in a real-time environment [32,33]); and Automotive (implemented for various complex real-time seasonal tasks according to their priority [14]).
The trend in CBS techniques indicates that most techniques are based on software requirements specification and on validation at the integration and regression levels. Furthermore, most studies adopted a case study method for evaluation. The challenges highlighted during CBSE are the run-time management of components, dealing with continuous dynamic changes, automatically mitigating interface mismatches semantically, and configuring the validation of components [3,14,20]. Other studies showed that formal and semantic-based component specifications significantly impact all phases of CBSE [7,13,14,21] by reducing inconsistency, redundancy, irrelevancy, incompleteness, and imbalance in the component structure, which improve after analysis and visualization of different components [21,22,33]. In addition, extra effort, time, and cost are needed for reliability analysis after modification [17,25] for risk-based requirements arising from inconsistency and incompleteness. Furthermore, there is a need to increase the fault detection rate [11,17,23,28] after validation at every phase of system testing. The common challenges of CBSE in each activity identified from the existing literature are: dynamic changes [20], formal and semantic analysis [13], risk-based requirements and TC, regression testing analysis [14,17], improved fault detection [23], configuration and variability management [21,33], and distributed environments [17,23]. Figure 3 shows the main activities of CBSE, i.e., requirement analysis and specification, changes and selection, development and integration, and validation.
For each activity, two factors provide further detail: information sources, which describe the type of information used to deal with challenges during that activity, and attributes, which provide information about the type and nature of the challenges [24,25,28,29,34]. Most studies show that two types of information sources are involved in CBSE and are necessary to mitigate the different attributes/factors: historical information, such as frequent reuse and change across all activities; and human-based information, extracted from the viewpoints of different software roles such as developers and testers. The attributes or factors that arise during CBSE activities include ambiguities, inconsistencies, irrelevancy, redundancy, and unbalanced structure. Resolving these factors in the first activity, i.e., requirement analysis and specification, reduces the chances of error in the other activities, as all activities are interrelated. The main reasons for these errors identified in the literature are [11,24,25,29,34]:
  • Long-Term: The risk that component systems become troublesome to maintain.
  • Inconsistency: Components are challenging to use due to ambiguity and incompleteness.
  • Composition Predictability: Component integration does not behave as expected.
  • Requirements Management: Difficult due to contradictory, incomplete, and incorrect requirements.
  • Component Selection and Testing: Reliable and reusable components are needed for better performance.
  • Performance Analysis: Maintaining consistent performance requires the correctness of components.
  • Adequacy: Without source code, it is challenging to identify unified adequacy criteria.
According to the findings of the literature review, requirement semantic analysis and specification reduce the occurrence of the identified factors during the various CBS development activities. For testing modified components, test case prioritization (TCP) has been the most widely used technique. For requirements management, formal specification strategies are used, and requirement inaccuracy has a high probability of causing errors in CBS. As a result, comprehensive techniques for reliability analysis, based on accurate semantic requirement specification and the identification of error-prone requirements, are required to improve CBS TCP techniques for requirement change and modification. The semantic analysis and prediction of error-prone requirements alleviate inconsistency, ambiguity, lack of code, frequent modification, and problems in the integration of various components.
The development of these component types thus requires complex and accurate specification of requirements so that they can be modified and reused in a new system or version. In this study, we review and critically assess prior research to identify current trends and development issues in CBS. The procedures of this research work are:
  • Review and identify modern practices in the CBSE domain based on specific research questions. This aids in categorizing critical factors that mitigate existing challenges in CBSE requirements analysis, specification, and validation.
  • To improve requirements analysis, specification, and validation of the CBSE, a framework was proposed based on factor analysis.
  • An experimental study was conducted based on three industrial CBS projects to evaluate proposed framework performance. When compared to other techniques, the proposed framework significantly improves CBS development and quality.
  • Finally, the research work provides a detailed roadmap and procedure for SMS researchers and practitioners, as well as a real-world implementation of the proposed framework.
The remaining paper comprises the proposed framework shown in Section 2, a description of the evaluation process, results, and discussion in Section 3, and the conclusion in Section 4.

2. Proposed Framework

In this section, we describe our proposed framework for the selection and prioritization of TC, the validation of changes, and the identification of faults that occur after making changes in CBS products. Sentiment analysis and historical information were used to maximize fault identification. The existing literature suggests that the semantic analysis of requirements improves requirement correctness and that historical product information (requirements or TC) is essential for the validation of a new version or series of a product after component integration. The existing literature also reveals that the requirements of CBS are continuously increasing, and a technique to implement these changes without interrupting the main application is needed to reduce failures. Each requirement combines different components, and each component consists of various similarities and differences, which introduce different errors after integration. Therefore, multiple TC are required to validate these requirements, and different researchers have worked on reducing the gaps that arise after the integration of similarities and differences. Furthermore, existing studies reveal that historical information is vital to reduce error-prone requirements, which results in fault reduction and the removal of redundancy and irrelevancy. The comprehensive SRPPTC framework provides for the validation of changes after integration with the sentiment-based implementation of CBS, as shown in Figure 4, and focuses on the following essential points:
  • The requirements are specified using aspect-based sentimental analysis to reduce ambiguities and misspecification. The historical information identifies parameters for error-prone requirements prediction. These parameters, such as frequent reuse requirements, number of reuses, frequent causes of fault, high faults rate, and frequently changed requirements, are extracted from existing literature.
  • The frequently reused and non-frequently reused requirements were considered for classifying requirements both in updated and new products. They were further divided into commonalities and variabilities to reduce complexity. Additionally, highly error-prone requirements tend to detect maximum faults as early as possible.
Therefore, SRPPTC identifies the maximum number of errors after the execution of the highest-priority TC. For maximum fault identification, SRPPTC performs the following steps after requirements are gathered through an online web form and the user stories of software development.

2.1. Requirements Analysis

Online webforms, user stories, and stakeholder reviews were used to extract requirements. Reused requirements were extracted and specified using different requirement-gathering techniques, while only online webforms and user stories were used for new requirements. These requirements were pre-processed to specify and extract requirements based on sentiment-based aspects.

2.1.1. Pre-Processing Requirements

The requirements of a product are written in textual form and taken as input to pre-processing. The requirements are collected during the extraction process from stakeholders using the online web application form and interviews that extract the viewpoints of different stakeholders. These requirements are documented in natural language, without categorization by viewpoint, for further analysis. The requirements document may comprise ambiguous and incomplete requirements. To remove ambiguity and incompleteness in the software requirements, we analyzed the requirements using RStudio-based aspect-based sentiment analysis (ABSA) [35]. The SRPPTC framework involves pre-processing of the online web form, user stories, and feedback to enhance the accuracy of the opinion mining process and avoid the overhead of redundant processing, as shown in Figure 5, based on Algorithm 1.
Algorithm 1: Words pre-processing
  • Input: A total number of requirements (Ri)
  • Output: All pre-processed words (Wall)
  • Set Wall ← φ
  •       for all R ϵ Ri
  •             T ← tokenization (R)
  •             T′ ← lowercasing (T)
  •       end for
  •       for all Ti ϵ T′
  •             T″ ← Normalization (Ti)
  •             T″′ ← Stemming (T″)
  •             T″″ ← Transformation (T″′)
  •       end for
  • return Wall
  • end
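Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not the paper's implementation: the stop-word set and suffix-stripping rules are assumptions standing in for the normalization, stemming, and transformation steps.

```python
import re
import string

# Illustrative stop-word list; the paper does not fix a specific one.
STOP_WORDS = {"and", "thus", "therefore", "the", "a", "an", "is", "are"}

def tokenize(requirement: str) -> list[str]:
    """Split a requirement sentence into tokens, dropping punctuation."""
    cleaned = requirement.translate(str.maketrans("", "", string.punctuation))
    return cleaned.split()

def normalize(token: str) -> str:
    """Lowercase and strip any non-alphanumeric residue."""
    return re.sub(r"[^a-z0-9]", "", token.lower())

def stem(token: str) -> str:
    """Very rough suffix stripping ('ing', plural 's'), as in the transformation step."""
    for suffix in ("ing", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(requirements: list[str]) -> list[str]:
    """Algorithm 1: tokenize, lowercase, normalize, stem, drop stop words and duplicates."""
    words, seen = [], set()
    for requirement in requirements:
        for token in tokenize(requirement):
            word = stem(normalize(token))
            if word and word not in STOP_WORDS and word not in seen:
                seen.add(word)
                words.append(word)
    return words
```

For example, `preprocess(["The doctor views reports; and the nurse is viewing reports!"])` collapses "views", "viewing", and "reports" into the stems "view" and "report", mirroring the duplicate and plural removal described above.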
We use the example of a patient record system to elaborate the requirements pre-processing process. New requirements are requested by stakeholders of the patient record system, as shown in Table 1. In pre-processing, data are collected for the new requirements by treating the text documents as a bag of words. From the bag of words, semantic terms are screened across all sentences: parts of speech (POS), such as verbs, nouns, and adverbs, are identified and filtered from the sentences; lemmatization groups phrases with similar lexical meaning (e.g., "view reports" and "report checking"); stop words (such as "and", "thus", and "therefore") are removed; punctuation marks such as ";", "!", and "?" are removed in the tokenization phase; "ing" suffixes, duplicates, and plurals are removed in transformation; and the results are sorted semantically by frequency (similar meanings repeated with different words), as shown in Table 1.

2.1.2. Aspect Based Sentiment Analysis

Sentiment analysis is used to discover people’s feelings, reactions, and opinions about a service or product. It is the computational study of components such as sentiments, opinions, views, attitudes, and emotions stated in user feedback text, which may come in various formats such as blogs, reviews, comments, or news. Aspect-Based Sentiment Analysis (ABSA) ties the sentiment in a text to specific aspects. An aspect can be a phrase or a single word; e.g., a restaurant’s food quality and service are different restaurant aspects. ABSA consists of the following steps, as shown in Figure 5: extract pre-processed terms from the input data; extract opinions and aspects from the terms; compose aspects and opinions; refine the list of aspects; and produce the outcome by arranging sentiments by aspect.
Different sets of requirements documents consist of different words that are either unique or semantically similar. In the SRPPTC framework, ABSA is used to model and analyze the semantic concepts of the documents as unique topics/terms. The flow chart of ABSA, shown in Figure 5, describes all the steps taken in requirements mining. We used the RapidMiner 7.3 tool for ABSA to automatically extract semantic aspects or components. In ABSA, different requirements topics or terms are mined and classified according to their frequency on a semantic basis. Aspects with a higher frequency in the documents are highlighted in bolder type than aspects with a lower frequency. Next, we used these aspects for opinion extraction into three opinions, i.e., positive, negative, and neutral, as shown in Figure 5. The ABSA process starts after pre-processing, when the extracted data are categorized into different opinions and aspects. The list of aspects (i.e., from aspect A1 to An) is derived from the pre-processed terms (T). These aspects are classified according to opinions to improve the integration and execution process of the system. The example of patient record management (PRM) is used to illustrate stakeholder sentiments relevant to previous experience and to new requirements wanted in updated versions. The resulting sentiments, organized by aspect, are shown in Table 2.
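The aspect–opinion pairing described above can be sketched as a simple lexicon-based classifier. This is only an illustration of the idea: the paper uses RapidMiner for ABSA, and the tiny opinion lexicon and feedback sentences here are assumptions, not the tool's method.

```python
# Tiny illustrative opinion lexicon; real ABSA tooling uses far richer resources.
POSITIVE = {"fast", "easy", "reliable", "good"}
NEGATIVE = {"slow", "crash", "missing", "bad"}

def classify_opinion(words: set[str]) -> str:
    """Label a sentence positive/negative/neutral by counting lexicon hits."""
    pos = len(words & POSITIVE)
    neg = len(words & NEGATIVE)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"

def absa(sentences: list[str], aspects: list[str]) -> dict[str, list[tuple[str, str]]]:
    """Attach each sentence's opinion to every aspect term it mentions."""
    results: dict[str, list[tuple[str, str]]] = {a: [] for a in aspects}
    for sentence in sentences:
        words = set(sentence.lower().replace(".", "").split())
        opinion = classify_opinion(words)
        for aspect in aspects:
            if aspect in words:
                results[aspect].append((sentence, opinion))
    return results

# Hypothetical PRM stakeholder feedback, in the spirit of Table 2:
feedback = [
    "Report generation is slow.",
    "Appointment booking is fast and easy.",
]
summary = absa(feedback, aspects=["report", "appointment"])
```

The output groups each sentence under the aspects it mentions, together with its polarity, mirroring the aspect-wise sentiment layout of Table 2.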

2.2. Requirement Prediction

The attributes relevant to error identification are: frequently reused requirements (FRR), number of times reused (NTR), frequently causing faults (FCF), high fault rate (HFR), frequently used in groups (FUG), and frequently changed requirements (FCR). Of all the parameters in component-based systems, FRR is the most important due to its reusability characteristic. Reuse across different versions and series of a product reduces complexity and failures during continuous change. FRR was used as the factor for classifying requirements into FRR and non-FRR (NFRR). Additionally, these requirements were divided into commonalities and variabilities using the Weka tool, which classifies requirements with a data mining technique using the FRR of similar and variable features. The FRR is the reuse frequency of various features across different versions and series of the same product. Thereafter, the frequencies of the other factors, i.e., NTR, FCF, HFR, FUG, and FCR, are calculated and used to predict the requirements that cause failures and have higher chances of errors. These error-prone requirements help in the selection of TC for execution, reducing irrelevancy. As in the PRM case, after ABSA analysis, historical information is used to predict error-prone requirements and validate changes in the software product. Next, the TC set is reduced by selecting the TC of error-prone requirements and prioritizing them for initial execution and maximum fault identification. These TC help improve the initial fault identification process while reducing irrelevancy and redundancy among TC. Consequently, all aspects of the PRM example are categorized into FRR and NFRR, with a sample of aspects and all other parameter values shown in Table 3. These values help in the prediction of accurate and correct error-prone requirements from all the organization’s projects, as well as from the requirements of all PRM versions, to validate changes in PRM.
FRR and NFRR are used to categorize all requirements; their weights are calculated using the other parameters/factors, and error-prone requirements are identified as described in the following subsection.

Total Requirement Frequency (TRF)

The error-prone requirements are predicted using the selected parameters defined above, with Equations (1) and (2) giving the $TRF_{FRR}$ and $TRF_{NFRR}$ values.
$TRF_{FRR} = FCF_{FRR} + HFR_{FRR} + FUG_{FRR} + FCR_{FRR}$ (1)
$TRF_{NFRR} = FCF_{NFRR} + HFR_{NFRR} + FUG_{NFRR} + FCR_{NFRR}$ (2)
Each parameter is the sum of its commonality ($C_s$) and variability ($V_s$) requirement frequencies. For all parameters, $C_s$ and $V_s$ are calculated using Equations (3) and (4):
$C_{s\,(FRR\,\&\,NFRR)} = \sum_{i=1,\,l=1}^{i=m,\,l=n} \frac{F_{cv_i} \times F_{cp_l}}{R_{cz}}$ (3)
$V_{s\,(FRR\,\&\,NFRR)} = \sum_{i=1,\,l=1}^{i=m,\,l=n} \frac{F_{vv_i} \times F_{vp_l}}{R_{vz}}$ (4)
$F_{cv_i}$ and $F_{vv_i}$ (commonality and variability frequency, version-wise) are used to calculate the FCF frequency based on how often a requirement is reused in different versions of the product. Similarly, requirement reuse across different product developments is captured by $F_{cp_l}$ and $F_{vp_l}$ (frequency, project-wise). To calculate the $FCF_{NFRR}$ value, the same process as in Equation (1) is adopted. Requirements are classified by different criteria to reduce effort, redundancy, and irrelevancy in TC and fault identification. The indices $i$ and $l$ denote each requirement's fault frequency in versions and products, respectively. $R_{cz}$ and $R_{vz}$ denote the total number of requirements; each requirement frequency is divided by the total number of requirements to reduce bias in the value.
To calculate the $FCF_{FRR}$ and $FCF_{NFRR}$ factors, Equations (3) and (4) are used; they describe the frequency of FCF for FRR and NFRR, where $C_s$ and $V_s$ stand for the commonality and variability frequencies used to predict suspected requirements. Next, the $C_s$ and $V_s$ values extracted from Equations (3) and (4) are used to find the FCF value for FRR. Based on the information provided in Table 3, the value of $FCF_{FRR}$ is 1.56: in the PRM case, $i$ and $l$ are 6 and 3, respectively; for FRR, $\sum F_{cv_i} F_{cp_l}$ is 49 and $R_{cz}$ is 64, so $C_s$ is 0.76, while for $V_s$, using Equation (4), $\sum F_{vv_i} F_{vp_l}$ is 80 and $R_{vz}$ is 100, so $V_s$ is 0.80. Similarly, $FCF_{NFRR}$ is 0.90: for the NFRR factor, $\sum F_{cv_i} F_{cp_l}$ is 15 and $R_{cz}$ is 50, so $C_s$ is 0.30, while for $V_s$, using Equation (4), $\sum F_{vv_i} F_{vp_l}$ is 30 and $R_{vz}$ is 50, so $V_s$ is 0.60. The suspected requirements for the other parameters, i.e., $HFR_{FRR\,\&\,NFRR}$, $FUG_{FRR\,\&\,NFRR}$, and $FCR_{FRR\,\&\,NFRR}$, are predicted in the same way using Equations (3) and (4). Finally, the frequencies of all parameters are added; requirements with a higher TRF value have a higher chance of errors during software verification and validation.
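The worked PRM example can be reproduced in a few lines. The numbers are taken from the text above; the function and variable names are illustrative, and note that the text truncates the FRR values to two decimal places (49/64 is 0.7656…).

```python
def frequency(sum_of_products: float, total_requirements: float) -> float:
    """Equations (3)/(4): summed version-wise x project-wise frequency,
    divided by the total requirement count to remove size bias."""
    return sum_of_products / total_requirements

# FCF for frequently reused requirements (FRR), PRM example:
cs_frr = frequency(49, 64)     # commonalities, ~0.76
vs_frr = frequency(80, 100)    # variabilities, 0.80
fcf_frr = cs_frr + vs_frr      # ~1.56

# FCF for non-frequently reused requirements (NFRR):
cs_nfrr = frequency(15, 50)    # 0.30
vs_nfrr = frequency(30, 50)    # 0.60
fcf_nfrr = cs_nfrr + vs_nfrr   # 0.90
```

The same pattern would be repeated for HFR, FUG, and FCR, and the per-parameter sums then added into TRF via Equations (1) and (2).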

2.3. Test Case Prioritization

The TC of higher-TRF requirements were used in the software testing process to verify product changes. Many TC were extracted, as each requirement may have more than one TC. The problem was which set of TC should be executed first for quick identification of the maximum number of faults. Therefore, test case prioritization (TCP) criteria were more appropriate for TC execution without a permanent or temporary reduction in TC size. Consequently, TC were prioritized based on different criteria such as code coverage (CC) and historical information (such as previous execution time, previous fault rate, and TC change frequency (TCCF)), as shown in Table 4.
To execute TC using the SRPPTC framework, the criteria used for TCP are TCCF and CC. The existing literature reveals that historical information plays an essential role in identifying the maximum number of errors, and that TC with a higher change frequency identify more errors. The CC criterion is used to break ties when more than one TC has a similar change frequency. The details of the TC and their relevant requirements, maintained for the verification and validation process, are shown in Table 4.
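The prioritization rule above (TCCF first, CC as tie-breaker) can be sketched as a sort over the test suite. The field names and sample suite are illustrative, not from Table 4.

```python
from dataclasses import dataclass

@dataclass
class TestCase:
    tc_id: str
    change_frequency: int   # TCCF: how often the TC changed across versions
    code_coverage: float    # CC: fraction of code exercised, used only for ties

def prioritize(test_cases: list[TestCase]) -> list[str]:
    """Order TC by change frequency (descending); break ties with coverage."""
    ranked = sorted(
        test_cases,
        key=lambda tc: (tc.change_frequency, tc.code_coverage),
        reverse=True,
    )
    return [tc.tc_id for tc in ranked]

suite = [
    TestCase("TC1", change_frequency=2, code_coverage=0.40),
    TestCase("TC2", change_frequency=5, code_coverage=0.55),
    TestCase("TC3", change_frequency=5, code_coverage=0.70),
]
order = prioritize(suite)  # TC3 before TC2 (coverage breaks the tie), TC1 last
```

Using a composite sort key keeps the prioritization stable and makes the tie-breaking criterion explicit.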

2.4. Performance Evaluation Metrics

Metrics were used for the performance evaluation of the SRPPTC framework to confirm that error-prone requirement prediction and historical information are essential for change verification and validation. Three metrics were used, i.e., accuracy, F-measure, and the average percentage of faults detected (APFD). A brief description of these metrics follows.

2.4.1. Accuracy

Accuracy was used to evaluate the SRPPTC framework's performance using Equation (5); it measures correctness over all TC, i.e., how well the TC executed first for error detection identify the maximum number of faults relative to the total faults. $TF$ denotes the fault-revealing TC, $TR$ the TC selected for execution, $ir$ the total number of revealed faults, $r$ the total number of TC, $sr$ the TC that were not selected yet identify faults, and $nr$ the total number of non-revealed faults.
$Accuracy_{FRR\,\&\,NFRR} = \frac{TF_{ir} + TR_{r}}{TF_{ir} + TF_{nr} + TR_{r} + TF_{sr}}$ (5)
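Equation (5) translates directly into code. The counts below are illustrative assumptions, chosen only to show the ratio.

```python
def accuracy(tf_ir: int, tr_r: int, tf_nr: int, tf_sr: int) -> float:
    """Equation (5): fault-revealing plus selected TC over all four counts
    (revealed, not revealed, selected, not selected but fault-identifying)."""
    return (tf_ir + tr_r) / (tf_ir + tf_nr + tr_r + tf_sr)

# Illustrative counts: 40 revealed faults, 5 not revealed,
# 45 TC selected, 10 fault-identifying TC left unselected.
score = accuracy(tf_ir=40, tr_r=45, tf_nr=5, tf_sr=10)  # 85 / 100 = 0.85
```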

2.4.2. F-Measure

The F-measure combines precision ($P$) and recall ($R$) to assess the SRPPTC framework's accuracy and is calculated using Equation (6). The $P$ measure indicates the correctness of TC execution for maximum fault detection, using Equation (7). The $R$ measure shows the fraction of total TC executed relative to the total faults revealed, using Equation (8).
$F = \frac{2 \times P \times R}{P + R}$ (6)
$P_{FRR\,\&\,NFRR} = \frac{TF_{ir}}{TF_{ir} + TF_{nr}}$ (7)
$R_{FRR\,\&\,NFRR} = \frac{TF_{ir}}{TF_{ir} + TF_{sr}}$ (8)
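Equations (6)-(8) can be sketched as follows; the example counts are assumptions for illustration only.

```python
def precision(tf_ir: int, tf_nr: int) -> float:
    """Equation (7): revealed faults over revealed plus not-revealed faults."""
    return tf_ir / (tf_ir + tf_nr)

def recall(tf_ir: int, tf_sr: int) -> float:
    """Equation (8): revealed faults over revealed plus those only
    identifiable by unselected TC."""
    return tf_ir / (tf_ir + tf_sr)

def f_measure(p: float, r: float) -> float:
    """Equation (6): harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

p = precision(tf_ir=40, tf_nr=10)   # 0.8
r = recall(tf_ir=40, tf_sr=40)      # 0.5
f = f_measure(p, r)                 # 0.8 / 1.3, roughly 0.615
```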

2.4.3. APFD

The APFD metric was used to calculate the fault detection rate (FDR) of the prioritized test suite using Equation (9). The higher the FDR value, the higher the chance of detecting the maximum number of faults during regression testing. In Equation (9), $m$ is the total number of faults and $n$ is the total number of test cases.
$APFD = 1 - \frac{TF_1 + TF_2 + \cdots + TF_m}{m \times n} + \frac{1}{2n}$ (9)
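In the standard APFD formulation, $TF_i$ is the position (1-based) in the prioritized suite of the first test case that reveals fault $i$; under that reading, Equation (9) can be computed as below. The sample positions are illustrative.

```python
def apfd(first_reveal_positions: list[int], n_tests: int) -> float:
    """Equation (9): 1 - (sum of first-reveal positions) / (m * n) + 1 / (2n),
    where m is the number of faults and n the number of test cases."""
    m = len(first_reveal_positions)
    return 1 - sum(first_reveal_positions) / (m * n_tests) + 1 / (2 * n_tests)

# 5 prioritized TC; three faults first revealed at positions 1, 2, and 3:
rate = apfd([1, 2, 3], n_tests=5)  # 1 - 6/15 + 1/10 = 0.7
```

Earlier first-reveal positions push the rate toward 1, which is why the prioritized ordering from Section 2.3 raises APFD.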

3. Results and Discussion

In this section, we describe the procedure for evaluating the SRPPTC framework. For evaluation purposes, we performed an experimental study to estimate the performance of the SRPPTC framework. The organization selected for the experiment became operational in 1998, with freelancers serving international clients for project development. It is now a well-established, distributed product development organization providing services in software engineering, departmental management, and customer care, with a competent marketing team. We selected their multi-vendor online web-based application and online tax collection application, which are product-line based and comprise over 1300 components and configurations. We considered the functional requirements; historical information and TC were used to check the functionalities associated with changed requirements. We conducted an experiment in which participants were divided into two groups, viz., the SRPPTC and control groups. These participants included project managers, team leaders, system analysts, software developers, designers, and quality engineering experts. The following experimental research questions (ERQs) were asked:
ERQ1: Were the identified challenges mitigated, and was TCP improved, after requirement prediction during CBSE? This question investigates whether addressing the identified challenges can improve CBSE activities, i.e., requirements analysis and specification, component selection, and TCP for reliability analysis after changes.
ERQ2: What trade-offs exist between requirements and TCP to increase FDR and CBSE quality? This question seeks to identify the nature of the requirements and TCP trade-off for increasing the fault detection rate and CBS quality.
ERQ3: Do requirements play a significant role in TCP after statistical analysis? This question investigates whether the results collected from the experiment, analyzed using hypothesis testing, are reliable.

3.1. Experiment Structure

The experiment was designed to evaluate SRPPTC framework performance and comprised several interrelated activities, such as project selection, participant selection, data collection, and analysis. These activities involved making decisions and drawing conclusions from the experimental results. For the experiment, we selected an organization with experience in developing CBS. We randomly selected 39 participants from the organization's employees, each with at least one year of experience in CBS development. The 39 participants were divided into two teams: 23 in the treatment team (TT) and 16 in the control team (CT). The CT participants followed their traditional CBS reliability analysis using traditional methods, i.e., random priority (RP), clustering (Cl), and execution rate (ER), while the TT participants learned the SRPPTC framework during the training period. The data collected after the experiment were based on a questionnaire administered to participants of both teams for feedback collection and quality improvement.

3.2. Experimental Procedure

The Team Foundation Server (TFS) repository maintained the information about the CBS projects during execution of the SRPPTC framework steps. TFS was used to assign roles, save project information, and allocate responsibilities among the participants of the experiment. TFS is an online repository used to build and validate products during their development, maintain their information, and provide a single platform to all team members worldwide. The TFS repository managed all task allocation, responses, and decision making, from requirements to reliability analysis. The participants consisted of team leaders, project managers, developers, quality engineers, and requirements analysts. The experimental results are explained and discussed in line with the three ERQs described above.
In response to ERQ1, we assessed whether the identified challenges were mitigated in CBSE activities by analyzing the feedback collected after the implementation of the SRPPTC framework. The questionnaire was based on the following parameters: requirement analysis and specification improved (RASI); historical information supportive (HIS); requirement changes and selection improved (RCSI); development and integration improved (DII); validation of changes improved (VCI); requirement prioritization improved; mining heterogeneous perspective supportive (MHPS); customization of preferences improved (CPI); configuration management process improved (CMPI); test planning improvement (TPI); technique selection issues resolved (TSIS); and faults detection rate increases (FDRI). These parameters were used to extract the feedback of the TT and CT participants and analyze their performance.
We observed improvements in these parameters following the implementation of the SRPPTC framework, with all TT participants scoring higher than 50% on each parameter. The CT participants emphasized that the stated parameters are essential factors and that traditional methods are less productive during CBSE reliability analysis using requirements and historical information; they recorded satisfaction levels below fifty percent. The results of both teams after reliability analysis of the same project are shown in Figure 6, where the x-axis represents the parameters and the y-axis the participants' satisfaction level for each parameter.
Similarly, to answer ERQ2, the overall feedback of all participants of both teams, i.e., TT and CT, was compared after the experiment. The feedback analysis revealed a positive trade-off between component requirements and TCP after modification. ABSA analysis revealed improvements in requirement management activities, which also improved test planning, strategy selection, and fault detection during CBSE reliability analysis after changes. These results are shown in Figure 7, with team members on the x-axis and satisfaction levels on the y-axis.
To further analyze the trade-off between requirements and TCP, we applied the evaluation metrics described above. The APFD was computed from the faults identified after executing the highest-prioritized TCs using the TT and CT procedures. The result, shown in Figure 8, indicates that the TT team achieved a higher APFD value than the CT team; teams are shown on the x-axis and APFD values on the y-axis.
Similarly, the accuracy and F-measure metrics had higher values for the TT participants' feedback than for the CT participants' feedback. As shown in Figure 9, with teams on the x-axis and metric values on the y-axis, the TT bars reach the highest values.
The questionnaire-based data were validated through hypothesis testing, using parametric and non-parametric tests in SPSS. The statistical results used to answer ERQ3 show that the questions are unbiased and that the TCP process significantly improves requirements during software reliability analysis. The null hypothesis (H0) and the alternative hypothesis (H1) of ERQ3 are:
Hypothesis 0 (H0): 
The TCP process does not significantly improve the requirements during software reliability analysis.
Hypothesis 1 (H1): 
The TCP process significantly improves the requirements during software reliability analysis.
To show the accuracy of the SRPPTC framework, we performed a statistical investigation using the SPSS tool. The t-test was used to analyze reliability, and the results revealed that the SRPPTC framework is appropriate for testing CBS, with significance at the 95% confidence level. The complete test produced different t and mean values, showing that our experiment is reliable, unbiased, and unambiguous. The mean values for the SRPPTC framework were higher than those for the non-SRPPTC approach in both projects, i.e., (0.728 and 0.744) versus (0.356 and 0.370), respectively. The SRPPTC framework improves the process of specification and integration in CBS.
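The study's test was run in SPSS; purely as an illustration of the kind of two-sample comparison used here, a Welch t statistic can be computed in plain Python (the scores below are hypothetical, not the study's data):

```python
import statistics as st

def welch_t(a, b):
    """Welch's two-sample t statistic for independent samples."""
    na, nb = len(a), len(b)
    va, vb = st.variance(a), st.variance(b)  # sample variances
    return (st.mean(a) - st.mean(b)) / ((va / na + vb / nb) ** 0.5)

tt_scores = [0.72, 0.75, 0.70, 0.74, 0.73]  # hypothetical TT satisfaction
ct_scores = [0.35, 0.38, 0.36, 0.34, 0.37]  # hypothetical CT satisfaction
print(round(welch_t(tt_scores, ct_scores), 2))
```

A large positive t value here would indicate that the TT mean exceeds the CT mean well beyond sampling noise, consistent with the reported gap between the two teams' mean scores.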

3.3. Threats to Validity

There are different types of threats to validity: internal, external, construct, and reliability [16]. These threats are discussed here as limitations of the study. To reduce bias and internal/theoretical validity threats in the selection of existing studies, we extracted relevant studies from authenticated data sources. This provides guidelines and a roadmap for researchers and practitioners in the form of modern practices to improve CBSE using requirements and testing approaches, and the selection of factors for requirement prediction helps to improve the requirements and TCP processes. To avoid external/generalization validity threats, we selected real industrial projects and participants to work with the proposed framework, yielding real and accurate results. For construct/objective validity, we used authenticated data sources covering all relevant CBSE requirements and testing techniques, with historical information based on the identified factors, to reduce human interaction or opinion during the selection and prioritization of TCs. For reliability/interpretation validity, to reduce the chances of wrong conclusions and misinterpretation, the data collected during the experiment were subjected to statistical analysis, limiting biases and human influence in the description of the results.

4. Conclusions and Future Work

This study provides researchers and practitioners with guidelines and a roadmap of the latest trends, challenges, and practical solutions for handling these challenges. The major hurdle during integration testing arises when reusing components and test cases, due to inconsistency and ambiguity. Therefore, we first extracted challenges and factors that help validate changes against the relevant challenges based on requirements information. Secondly, we proposed a framework based on an aspect-based sentiment analysis technique for requirement specification and on historical information for TCP, predicting error-prone requirements. The proposed framework enhanced the process of CBS requirement specification and the validation of components after integration and changes, with a high satisfaction rate among both end-users and teams. Thirdly, an experiment was conducted to assess the proposed framework's effectiveness, and the results outperformed other techniques, i.e., random priority, execution rate, and clustering, both practically and statistically. The findings of this research provide a better understanding of CBSE and support the improvement of CBSE activities. They also provide industry practitioners and researchers with guidelines to manage the challenges of CBS after modification, from requirements to testing practices, by predicting suspected requirements.
Future work will investigate whether the proposed framework yields similar results and trends on other applications. We will also explore the sustainability of the roles of requirements and historical information in component validation after integration and after modifications to the interfaces of different components and their configurations. This will further improve the process of regression testing and traceability in CBS for the sustainable development of software. Furthermore, the allocation of tasks, resources, and efforts for the sustainable development of software to fulfil a nation's future technical needs is one of the field's global aims. Hence, indicators of sustainable software development are needed in future work to assess how closely development practices comply with its requirements.

Author Contributions

Conceptualization, S.A., Y.H., M.H., N.Z.J. and R.M.G.; methodology, S.A., Y.H., M.H., N.Z.J. and R.M.G.; software, S.A., Y.H., M.H., N.Z.J. and R.M.G.; validation, S.A., Y.H., M.H., N.Z.J. and R.M.G.; formal analysis, S.A., Y.H., M.H., N.Z.J. and R.M.G.; investigation, S.A., Y.H., M.H., N.Z.J. and R.M.G.; resources, S.A., Y.H., M.H., N.Z.J. and R.M.G.; data curation, S.A., Y.H., M.H., N.Z.J. and R.M.G.; writing—original draft preparation, S.A., Y.H., M.H. and N.Z.J.; writing—review and editing, S.A., Y.H., M.H. and N.Z.J.; visualization, S.A., Y.H., M.H., N.Z.J. and R.M.G.; supervision, S.A., Y.H., M.H. and N.Z.J.; funding acquisition, S.A., Y.H., M.H., N.Z.J. and R.M.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2022R138), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

We sincerely acknowledge the support from Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2022R138), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

Conflicts of Interest

There are no conflicts of interest to report regarding the present study.

References

  1. Borg, M.; Chatzipetrou, P.; Wnuk, K.; Alégroth, E.; Gorschek, T.; Papatheocharous, E.; Shah, S.M.A.; Axelsson, J. Selecting Component Sourcing Options: A Survey of Software Engineering's Broader Make-or-Buy Decisions. Inf. Softw. Technol. 2019, 112, 18–34.
  2. Umran Alrubaee, A.; Cetinkaya, D.; Liebchen, G.; Dogan, H. A Process Model for Component-Based Model-Driven Software Development. Information 2020, 11, 302.
  3. Chatzipetrou, P.; Papatheocharous, E.; Wnuk, K.; Borg, M.; Alégroth, E.; Gorschek, T. Component Attributes and Their Importance in Decisions and Component Selection. Softw. Qual. J. 2019, 28, 567–593.
  4. Khan, F.; Tahir, M.; Babar, M.; Arif, F.; Khan, S. Framework for Better Reusability in Component Based Software Engineering. J. Appl. Environ. Biol. Sci. 2016, 6, 77–81.
  5. Banerjee, P.; Sarkar, A. Quality Evaluation of Component-Based Software: An Empirical Approach. IJISA 2018, 10, 80–91.
  6. Mougouei, D.; Powers, D.M.W.; Mougouei, E. A Fuzzy Framework for Prioritization and Partial Selection of Security Requirements in Software Projects. IFS 2019, 37, 2671–2686.
  7. Bukhari, S.S.A.; Mamoona, H.; Shah, S.A.A.; Jhanjhi, N.Z. Improving requirement engineering process for web application development. In Proceedings of the 2018 12th International Conference on Mathematics, Actuarial Science, Computer Science and Statistics (MACS), Karachi, Pakistan, 24–25 November 2018; pp. 1–5.
  8. Sadia, A.; Hafeez, Y.; Humayun, M.; Jhanjhi, N.Z.; Le, D. Towards aspect based requirements mining for trace retrieval of component-based software management process in globally distributed environment. Inf. Technol. Manag. 2022, 23, 151–165.
  9. Humayun, M.; Jhanjhi, N.; Hamid, B.; Ahmed, G. Emerging Smart Logistics and Transportation Using IoT and Blockchain. IEEE Internet Things Mag. 2020, 3, 58–62.
  10. Dreossi, T.; Donzé, A.; Seshia, S.A. Compositional Falsification of Cyber-Physical Systems with Machine Learning Components. J. Autom. Reason. 2019, 63, 1031–1053.
  11. Qasim, M.; Bibi, A.; Hussain, S.J.; Jhanjhi, N.Z.; Humayun, M.; Sama, N.U. Test case prioritization techniques in software regression testing: An overview. Int. J. Adv. Appl. Sci. 2021, 8, 107–121.
  12. Ali, S.; Imran, M.; Hafeez, Y.; Abbasi, T.R.; Haider, W.; Salam, A. Improving Component Based Software Integration Testing Using Data Mining Technique. In Proceedings of the 2018 12th International Conference on Mathematics, Actuarial Science, Computer Science and Statistics (MACS), Karachi, Pakistan, 24–25 November 2018; pp. 1–6.
  13. Hull, B.; Kuza, L.; Moore, J. A Model-Based Systems Approach to Radar Design Utilizing Multi-Attribute Decision Analysis Techniques. In 2018 Systems and Information Engineering Design Symposium (SIEDS); IEEE: Charlottesville, VA, USA, 2018; pp. 197–202.
  14. Mohamed, N.; Moussa, S.; Badr, N.; Tolba, M. Enhancing Test Cases Prioritization for Internet of Things Based Systems Using Search-Based Technique. IJICIS 2021, 21, 84–94.
  15. Hadar, I.; Zamansky, A.; Berry, D.M. The Inconsistency between Theory and Practice in Managing Inconsistency in Requirements Engineering. Empir. Softw. Eng. 2019, 24, 3972–4005.
  16. Hettiarachchi, C.; Do, H. A Systematic Requirements and Risks-Based Test Case Prioritization Using a Fuzzy Expert System. In Proceedings of the 2019 IEEE 19th International Conference on Software Quality, Reliability and Security (QRS), Sofia, Bulgaria, 22–26 July 2019; pp. 374–385.
  17. Azizi, M.; Do, H. A Collaborative Filtering Recommender System for Test Case Prioritization in Web Applications. In Proceedings of the 33rd Annual ACM Symposium on Applied Computing, Pau, France, 9 April 2018; pp. 1560–1567.
  18. Ouriques, J.F.S.; Cartaxo, E.G.; Machado, P.D.L. Test Case Prioritization Techniques for Model-Based Testing: A Replicated Study. Softw. Qual. J. 2018, 26, 1451–1482.
  19. Meyerer, F.; Hummel, O. Towards Plug-and-Play for Component-Based Software Systems. In Proceedings of the 19th International Doctoral Symposium on Components and Architecture—WCOP ’14, Marcq-en-Bareul, France, 30 June–4 July 2014; pp. 25–30.
  20. Sas, D.; Avgeriou, P. Quality Attribute Trade-Offs in the Embedded Systems Industry: An Exploratory Case Study. Softw. Qual. J. 2020, 28, 505–534.
  21. Moreno, L.; Bavota, G.; Haiduc, S.; Di Penta, M.; Oliveto, R.; Russo, B.; Marcus, A. Query-Based Configuration of Text Retrieval Solutions for Software Engineering Tasks. In Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering, Bergamo, Italy, 30 August 2015; pp. 567–578.
  22. Moonen, L.; Yazdanshenas, A.R. Analyzing and Visualizing Information Flow in Heterogeneous Component-Based Software Systems. Inf. Softw. Technol. 2016, 77, 34–55.
  23. Smara, M.; Aliouat, M.; Pathan, A.-S.K.; Aliouat, Z. Acceptance Test for Fault Detection in Component-Based Cloud Computing and Systems. Future Gener. Comput. Syst. 2017, 70, 74–93.
  24. Ibias, A.; Hierons, R.M.; Núñez, M. Using Squeeziness to Test Component-Based Systems Defined as Finite State Machines. Inf. Softw. Technol. 2019, 112, 132–147.
  25. Fernández-García, A.J.; Iribarne, L.; Corral, A.; Criado, J.; Wang, J.Z. A Recommender System for Component-Based Applications Using Machine Learning Techniques. Knowl.-Based Syst. 2019, 164, 68–84.
  26. Limna, T.; Tandayya, P. A Flexible and Scalable Component-Based System Architecture for Video Surveillance as a Service, Running on Infrastructure as a Service. Multimed. Tools Appl. 2016, 75, 1765–1791.
  27. Gonzalez-Herrera, I.; Bourcier, J.; Daubert, E.; Rudametkin, W.; Barais, O.; Fouquet, F.; Jézéquel, J.M.; Baudry, B. ScapeGoat: Spotting Abnormal Resource Usage in Component-Based Reconfigurable Software Systems. J. Syst. Softw. 2016, 122, 398–415.
  28. Yang, C.; Liu, J.; Zeng, Y.; Xie, G. Real-Time Condition Monitoring and Fault Detection of Components Based on Machine-Learning Reconstruction Model. Renew. Energy 2019, 133, 433–441.
  29. Graics, B.; Molnár, V.; Vörös, A.; Majzik, I.; Varró, D. Mixed-Semantics Composition of Statecharts for the Component-Based Design of Reactive Systems. Softw. Syst. Model. 2020, 19, 1483–1517.
  30. Badampudi, D.; Wnuk, K.; Wohlin, C.; Franke, U.; Smite, D.; Cicchetti, A. A Decision-Making Process-Line for Selection of Software Asset Origins and Components. J. Syst. Softw. 2018, 135, 88–104.
  31. Li, X.; Wong, W.E.; Gao, R.; Hu, L.; Hosono, S. Genetic Algorithm-Based Test Generation for Software Product Line with the Integration of Fault Localization Techniques. Empir. Softw. Eng. 2018, 23, 1–51.
  32. Souto, S.; d’Amorim, M. Time-Space Efficient Regression Testing for Configurable Systems. J. Syst. Softw. 2018, 137, 733–746.
  33. Arrieta, A.; Wang, S.; Sagardui, G.; Etxeberria, L. Search-Based Test Case Prioritization for Simulation-Based Testing of Cyber-Physical System Product Lines. J. Syst. Softw. 2019, 149, 1–34.
  34. Wohlin, C.; Wnuk, K.; Smite, D.; Franke, U.; Badampudi, D.; Cicchetti, A. Supporting Strategic Decision-Making for Selection of Software Assets. In Software Business; Maglyas, A., Lamprecht, A.-L., Eds.; Lecture Notes in Business Information Processing; Springer International Publishing: Cham, Switzerland, 2016; Volume 240, pp. 1–15. ISBN 978-3-319-40514-8.
  35. Asghar, M.Z.; Khan, A.; Zahra, S.R.; Ahmad, S.; Kundi, F.M. Aspect-Based Opinion Mining Framework Using Heuristic Patterns. Clust. Comput. 2019, 22, 7181–7199.
Figure 1. CBS examples.
Figure 2. Resources detail.
Figure 3. Taxonomy of CBS activities factors.
Figure 4. Suspected requirements prediction and prioritization of test case framework.
Figure 5. Aspect-based sentiment analysis process.
Figure 6. SRPPTC framework teams feedback results.
Figure 7. Without SRPPTC framework participants feedback.
Figure 8. APFD results.
Figure 9. Accuracy and F-measure results.
Table 1. Example of words pre-processing.

| S# | Steps | Results/Outcome |
|----|-------|-----------------|
| 1 | Data Extraction | “The system provides facilities of online appointments, view reports, and check for medicine for high-quality and time-saving services. The patients faced difficulties such as getting appointments and checking reports. The far distance patients may suffer due to the non-availability of online checking and recording of blood pressure. …” |
| 2 | Remove POS, Stop Words and Lemmatization | “System provides online appointments facilities; view reports check medicine, high-quality and time-saving services. Patients faced difficulties getting appointments, and checking report. …” |
| 3 | Tokenization | “System provides facilities online appointments view reports check medicine, high-quality and time-saving services. Patients faced difficulties getting appointments to…” |
| 4 | Transformation | “System” “Online” “Appointment” “Patient” “Doctor” “View Report” “Blood Pressure” “Check” “Sugar” “Heartbeat” “Record” “Lab Test” … |
| 5 | Words/Terms | “System” “Online” “Appointment” “Patient” “Doctor” “View Report” “Blood Pressure” “Check” “Sugar” “Heartbeat” “Record” “Time” “Date” “Visit” “Lab Test” “Service” “Medicine” |
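The tokenization and stop-word removal steps of Table 1 can be sketched with plain Python (the stop-word list below is a toy example; the lemmatization step is omitted, and a real pipeline would use an NLP library such as NLTK or spaCy):

```python
import re

# Toy stop-word list for illustration only.
STOP_WORDS = {"the", "of", "and", "for", "such", "as", "to", "may", "due"}

def preprocess(text):
    """Lowercase, strip punctuation, and drop stop words (Table 1, steps 2-3)."""
    tokens = re.findall(r"[a-z-]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

sample = "The system provides facilities of online appointments."
print(preprocess(sample))
# ['system', 'provides', 'facilities', 'online', 'appointments']
```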
Table 2. Example of ABSA process.

| Aspects | Opinion |
|---------|---------|
| System | Positive |
| Access with Card | Negative |
| Access with face detection | Positive |
| Medicine History | Positive |
| Lost Connection | Negative |
| Lab Report | Negative |
| Alarming System | Positive |
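As a toy illustration of the aspect-opinion pairing shown in Table 2 (a tiny hand-made lexicon, not the ABSA model the paper uses):

```python
# Hypothetical opinion lexicon: +1 positive, -1 negative.
LEXICON = {"quick": 1, "reliable": 1, "slow": -1, "lost": -1, "failed": -1}

def aspect_opinion(aspect, context_words):
    """Label an aspect Positive/Negative from opinion words near it."""
    score = sum(LEXICON.get(w.lower(), 0) for w in context_words)
    return aspect, ("Positive" if score >= 0 else "Negative")

print(aspect_opinion("Lab Report", ["results", "lost", "failed"]))
# ('Lab Report', 'Negative')
```

Real ABSA systems replace this fixed lexicon with learned sentiment models, but the output shape (aspect, polarity) matches the rows of Table 2.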
Table 3. Requirements classification.

| Req_Id | Req_Name | Req_Classification | Cs/Vs | FRRFC | FHFR | FU | GF | CR |
|--------|----------|--------------------|-------|-------|------|----|----|----|
| Req01 | Login and Logout System | FRR | Cs | 9 | 3 | 6 | 7 | 6 |
| Req02 | Office Schedule | FRR | Vs | 6 | 2 | 3 | 1 | 2 |
| Req03 | Payment Method | FRR | Vs | 9 | 4 | 7 | 5 | 5 |
| Req04 | Lab Results | NFRR | Vs | 2 | 4 | 3 | 4 | 3 |
| Req05 | Appointments | NFRR | Cs | 4 | 3 | 2 | 1 | 2 |
| Req06 | Alarming System | FRR | Vs | 7 | 5 | 5 | 6 | 7 |
Table 4. Requirements TC details.

| Req_Id | TC_Id | TC_Description | TE * | CCT | CCF | Faults Rate |
|--------|-------|----------------|------|-----|-----|-------------|
| Req01 | TC01 | Login with a valid user name | P | 0.7 | 0.6 | 0.7 |
| | TC02 | Login with a valid password | NR | 0.6 | 0.7 | 0.5 |
| | TC03 | Login with an invalid user name | NR | 0.4 | 0.2 | 0.1 |

* P = Passed; NR = No Run.