Article

Using Statistical Test Method to Establish a Decision Model of Performance Evaluation Matrix

1 Department of Industrial Education and Technology, National Changhua University of Education, Changhua 500208, Taiwan
2 Language Center, National Chin-Yi University of Technology, Taichung 411030, Taiwan
3 Department of Industrial Engineering and Management, National Chin-Yi University of Technology, Taichung 411030, Taiwan
4 Department of Business Administration, Chaoyang University of Technology, Taichung 413310, Taiwan
5 Department of Business Administration, Asia University, Taichung 413305, Taiwan
* Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(8), 5139; https://doi.org/10.3390/app13085139
Submission received: 4 March 2023 / Revised: 14 April 2023 / Accepted: 17 April 2023 / Published: 20 April 2023
(This article belongs to the Special Issue Smart Service Technology for Industrial Applications II)

Abstract

Many studies have pointed out that the Performance Evaluation Matrix (PEM) is a convenient and useful tool for evaluating, analyzing, and improving service operating systems. For every service item of an operating system, customer satisfaction and importance ratings can be collected through questionnaires and converted into satisfaction indices and importance indices, from which the PEM and its evaluation rules are established. Since the indices contain unknown parameters, evaluating them directly by their point estimates carries a risk of misjudgment due to sampling error. In addition, most studies only determine the critical-to-quality (CTQ) items that need to be improved; they neither discuss how to proceed when resources are limited nor confirm the effect after improvement. To address these research gaps, this paper proposes unbiased estimators of the two indices and builds a PEM method in which a one-tailed statistical hypothesis test on the satisfaction index determines the CTQ service items that need to be improved. A one-tailed statistical hypothesis test on the importance index then determines the improvement priority of these service items under limited resources. Confirming the effect of improvement is an important step in management, so this paper adopts a two-tailed statistical hypothesis test to verify whether the satisfaction of all CTQ service items that need to be improved has been enhanced. Because the proposed method is built on statistical hypothesis tests, the risk of misjudgment due to sampling error can be reduced, which is its main advantage.
The model in this study may therefore help industries determine CTQ items rapidly, carry out the most efficient improvement under limited resources, and confirm the improvement effect. Finally, a case study of a computer-assisted language learning system (CALL system) illustrates how to apply the proposed model.

1. Introduction

In the face of global warming and the impact of the COVID-19 pandemic in recent years, human coexistence with the natural environment has become an important issue. Sustainable development, in which enterprises and environmental protection share common ground, has also become a major concern for governments and enterprises in many countries. According to many studies, as the Internet of Things (IoT) has become popular and mature, various smart APP platforms have sprung up [1,2,3,4]. These platforms can raise efficiency and also reduce the impact on the environment. For example, with the digital learning model of a teaching APP, students can learn at home and avoid the inconvenience of going out. This reduces traffic congestion and saves travel time, so that energy saving and carbon reduction can be achieved. Given this scenario, constructing a complete evaluation and improvement model for the digital learning effectiveness of the teaching APP will help attract more users to learn through the teaching APP, which in turn contributes to the above-mentioned effects of alleviating congestion, saving energy, and reducing carbon emissions.
Some methods such as Quality Function Deployment (QFD) and Balanced Scorecard (BSC) are employed to weigh performance evaluations or product designs, and some studies apply a fuzzy inference system (FIS) to increase competitive capability and customer satisfaction [5,6,7]. According to numerous studies, the Performance Evaluation Matrix (PEM) is a convenient and useful evaluation tool for service operating systems, and many studies have investigated the PEM [8,9,10,11]. Without loss of generality, this paper assumes that the teaching APP provides q key digital learning service items, so q questions are designed to investigate learners’ importance and satisfaction ratings for the q service items. In addition, many studies note that the Beta distribution takes values between 0 and 1, so it can represent satisfaction ranging from 0 (completely dissatisfied) to 1 (100% satisfied). Similarly, importance ranges from 0 (not important at all) to 1 (of great importance). Consequently, satisfaction and importance are regarded as random variables following the Beta distribution [12,13]. Hung, Huang, and Chen [14] proposed a standardized and significant importance index and satisfaction index based on the two parameters of the Beta distribution. If the random variable $X_i$ represents the satisfaction with the ith service item, then $X_i$ follows the Beta distribution, denoted by $X_i \sim \mathrm{Beta}(a_i^S, b_i^S)$, where $a_i^S$ and $b_i^S$ are the two parameters. Similarly, if the random variable $Y_i$ represents the importance attached to the ith service item, then $Y_i$ follows the Beta distribution, denoted by $Y_i \sim \mathrm{Beta}(a_i^I, b_i^I)$, where $a_i^I$ and $b_i^I$ are the two parameters. Thus, the satisfaction index and the importance index can be expressed as follows:
$$\theta_i^S = \frac{a_i^S}{a_i^S + b_i^S} \quad \text{(satisfaction index)}, \tag{1}$$
$$\theta_i^I = \frac{a_i^I}{a_i^I + b_i^I} \quad \text{(importance index)}. \tag{2}$$
As the above definitions show, the values of the two indices lie between 0 and 1. In this paper, the satisfaction index is set as the x-axis and the importance index as the y-axis to form the performance evaluation matrix. Since the indices contain unknown parameters, evaluating them directly by their point estimates carries a risk of misjudgment due to sampling error [15,16,17]. In addition, most studies only determine the critical-to-quality (CTQ) items that need to be improved, without discussing the processing rules in resource-limited settings and without confirming the improvements. To fill these research gaps, this paper adopts a one-tailed statistical hypothesis test on the satisfaction index to determine the CTQ service items that need to be improved, and then defines their improvement order. It then uses a two-tailed statistical hypothesis test to verify whether the satisfaction of all CTQ service items that need to be improved has actually improved. First, this paper derives the unbiased estimators of the satisfaction index and the importance index for all service items. According to numerous studies, if the PEM locations of the point estimates are used directly to decide whether service items need to be improved, sampling error may cause misjudgment [18,19]. Therefore, in the spirit of continuous improvement in total quality management, this paper uses the average value $\theta_0^S$ of all satisfaction indices as the testing standard and employs a statistical one-tailed test to determine the critical value $C_0^S$. Likewise, the average value $\theta_0^I$ of all importance indices is adopted as the testing standard, and a statistical one-tailed test is employed to determine the corresponding critical value.
Next, this paper establishes the evaluation blocks and evaluation rules of the PEM based on the critical values derived from the statistical tests and judges whether each service item needs to be improved according to the location of its evaluation coordinate point $(x_i, y_i) = (\hat{\theta}_i^S, \hat{\theta}_i^I)$, where $\hat{\theta}_i^S$ is the estimator of $\theta_i^S$ and $\hat{\theta}_i^I$ is the estimator of $\theta_i^I$. Meanwhile, according to the locations of the evaluation coordinate points of the service items that need to be improved in the evaluation quadrants of the PEM, the priority of improvement is determined under limited resources. Because this method is built on statistical tests, the risk of misjudgment caused by sampling error can be lowered. After the improvement is completed, its effectiveness must be confirmed. Therefore, this paper then adopts a statistical two-tailed test to verify whether the satisfaction of all service items that need to be improved has risen. Finally, a case study of a computer-assisted language learning system (CALL system) illustrates how the abovementioned statistical test rules are applied to identify the key service items that need to be improved and to determine the priority of improvement when resources are limited.
The remainder of this paper is arranged as follows. In Section 2, we propose the unbiased estimators and the statistical hypothesis test method for the satisfaction index and the importance index, respectively. According to the statistical test results, we identify the service items that need to be improved and determine the order of improvement for all service items. In Section 3, we define the evaluation blocks of the Performance Evaluation Matrix based on the two critical values derived from the statistical tests in Section 2, and we also define the evaluation rules. In Section 4, we adopt a statistical two-tailed test to verify the improvement effects on all key service items that need to be improved. In Section 5, we present a case study of a computer-assisted language learning system (CALL system) to illustrate the application of the proposed method. In Section 6, we offer conclusions.

2. Unbiased Estimators and Statistical Hypothesis Test

As mentioned earlier, since the indices contain unknown parameters, they must be estimated from sample data of the interviewed customers. Because this paper applies the Central Limit Theorem to carry out the statistical tests, the sample size must be sufficiently large. The sample sizes of the cases to which this method is applied are usually larger than 100, well above the minimum of 30 commonly required for the Central Limit Theorem. In this section, we propose the unbiased estimators and the statistical hypothesis test method for the satisfaction index and the importance index, respectively.

2.1. Satisfaction Index

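First, we assume that the customers’ satisfaction sample data for the ith service item are $X_{i,1}, \ldots, X_{i,j}, \ldots, X_{i,n}$, $i = 1, 2, \ldots, q$. Then, the estimator of the satisfaction index for service item i is expressed as follows:
$$\hat{\theta}_i^S = \frac{1}{n} \times \sum_{j=1}^{n} X_{i,j}. \tag{3}$$
The expected value of $\hat{\theta}_i^S$ is
$$E\left[\hat{\theta}_i^S\right] = \frac{1}{n} \times \sum_{j=1}^{n} E\left[X_{i,j}\right] = \frac{1}{n} \times \sum_{j=1}^{n} \int_0^1 x \, \frac{\Gamma(a_i^S + b_i^S)}{\Gamma(a_i^S)\,\Gamma(b_i^S)} \, x^{a_i^S - 1} (1 - x)^{b_i^S - 1} \, dx = \frac{1}{n} \times \sum_{j=1}^{n} \frac{a_i^S}{a_i^S + b_i^S} = \theta_i^S. \tag{4}$$
Thus, $\hat{\theta}_i^S$ is an unbiased estimator of $\theta_i^S$. The average value of the q estimators is
$$\theta^S = \frac{1}{q} \sum_{i=1}^{q} \hat{\theta}_i^S. \tag{5}$$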
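The estimator and the average in Equations (3) and (5) can be illustrated with a short simulation. The following is only a sketch under stated assumptions: the Beta parameters, the sample size, the seed, and the function names are hypothetical and do not come from the paper's data.

```python
import random
from statistics import mean


def satisfaction_index_estimate(scores):
    """Equation (3): the sample mean of one item's scores, each in [0, 1]."""
    return mean(scores)


def overall_average(estimates):
    """Equation (5): the average of the q per-item estimates."""
    return mean(estimates)


def simulate_item(a, b, n, rng):
    """Draw n hypothetical responses from Beta(a, b)."""
    return [rng.betavariate(a, b) for _ in range(n)]


rng = random.Random(0)        # fixed seed so the sketch is repeatable
a, b = 2.0, 3.0               # hypothetical Beta parameters
true_index = a / (a + b)      # Equation (1): a / (a + b) = 0.4
scores = simulate_item(a, b, 5000, rng)
estimate = satisfaction_index_estimate(scores)
print(f"true index = {true_index:.3f}, estimate = {estimate:.3f}")
```

With a large simulated sample the estimate should land close to the true index, which is what unbiasedness together with the law of large numbers suggests.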
Then, based on the concept of total quality management, when the satisfaction index value of service item i is greater than or equal to the average value $\theta^S$, the current situation can be maintained. When the satisfaction index value of service item i is smaller than the average value $\theta^S$, service item i needs to be improved [20,21]. Accordingly, the hypotheses of the statistical test for satisfaction index i are as follows:
$$\text{null hypothesis } H_0: \theta_i^S \ge \theta^S; \tag{6}$$
$$\text{alternative hypothesis } H_1: \theta_i^S < \theta^S. \tag{7}$$
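We take the unbiased estimator $\hat{\theta}_i^S$ as the test statistic with significance level $\alpha$; then, the critical region for satisfaction index i is
$$CR_{X_i} = \left\{ \hat{\theta}_i^S \mid \hat{\theta}_i^S < C_{X_{0i}} \right\}. \tag{8}$$
Thus, we have
$$p\left\{ \hat{\theta}_i^S < C_{X_{0i}} \mid \theta_i^S = \theta^S \right\} = \alpha. \tag{9}$$
Equivalently,
$$p\left\{ Z_{X_i} < \frac{\sqrt{n}\left(C_{X_{0i}} - \theta^S\right)}{S_{X_i}} \right\} = \alpha, \tag{10}$$
where
$$S_{X_i} = \sqrt{\frac{1}{n-1} \times \sum_{j=1}^{n} \left(X_{i,j} - \bar{X}_i\right)^2} \tag{11}$$
is the sample standard deviation and
$$Z_{X_i} = \frac{\sqrt{n}\left(\hat{\theta}_i^S - \theta^S\right)}{S_{X_i}} \tag{12}$$
is approximately distributed as a standard normal distribution when the sample size n is large. Then, we have
$$\frac{\sqrt{n}\left(C_{X_{0i}} - \theta^S\right)}{S_{X_i}} = -z_\alpha,$$
where $z_\alpha$ is the upper $\alpha$ quantile of the standard normal distribution, so the critical value $C_{X_{0i}}$ is
$$C_{X_{0i}} = \theta^S - z_\alpha \frac{S_{X_i}}{\sqrt{n}}. \tag{13}$$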
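The critical value of Equation (13) and the resulting one-tailed decision can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the default significance level of 0.05 is an assumption, and the numbers in the example are the rounded summary values quoted in the case study (average 0.52, standard deviation 0.28, n = 344).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def critical_value(theta_bar, s, n, alpha=0.05):
    """Equation (13): theta_bar - z_alpha * s / sqrt(n)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # upper alpha quantile
    return theta_bar - z_alpha * s / sqrt(n)


def needs_improvement(scores, theta_bar, alpha=0.05):
    """Reject H0 (item is at least average) when the estimate is below the critical value."""
    theta_hat = mean(scores)                    # Equation (3)
    s = stdev(scores)                           # Equation (11), n - 1 denominator
    return theta_hat < critical_value(theta_bar, s, len(scores), alpha)


# Rounded inputs taken from the case study; alpha = 0.05 is assumed here.
print(round(critical_value(0.52, 0.28, 344), 3))  # 0.495
```

Because the critical value sits below the average by a margin that shrinks with $\sqrt{n}$, an item is flagged only when its estimate is lower than the average by more than sampling error can plausibly explain.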

2.2. Importance Index

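We assume that the customers’ importance sample data for the ith service item are $Y_{i,1}, \ldots, Y_{i,j}, \ldots, Y_{i,n}$, $i = 1, 2, \ldots, q$; then, the estimator of the importance index for service item i is expressed as follows:
$$\hat{\theta}_i^I = \frac{1}{n} \times \sum_{j=1}^{n} Y_{i,j}. \tag{14}$$
The expected value of $\hat{\theta}_i^I$ is
$$E\left[\hat{\theta}_i^I\right] = \frac{1}{n} \times \sum_{j=1}^{n} E\left[Y_{i,j}\right] = \frac{1}{n} \times \sum_{j=1}^{n} \int_0^1 y \, \frac{\Gamma(a_i^I + b_i^I)}{\Gamma(a_i^I)\,\Gamma(b_i^I)} \, y^{a_i^I - 1} (1 - y)^{b_i^I - 1} \, dy = \frac{1}{n} \times \sum_{j=1}^{n} \frac{\Gamma(a_i^I + b_i^I)\,\Gamma(a_i^I + 1)}{\Gamma(a_i^I)\,\Gamma(a_i^I + b_i^I + 1)} \times \int_0^1 \mathrm{Beta}(a_i^I + 1, b_i^I) \, dy = \frac{1}{n} \times \sum_{j=1}^{n} \frac{a_i^I}{a_i^I + b_i^I} = \theta_i^I. \tag{15}$$
Thus, $\hat{\theta}_i^I$ is an unbiased estimator of $\theta_i^I$. The average value of the q estimators is
$$\theta^I = \frac{1}{q} \sum_{i=1}^{q} \hat{\theta}_i^I. \tag{16}$$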
Then, when the importance index value of service item i is greater than or equal to the average value $\theta^I$, the priority of improvement is high. When the importance index value of service item i is less than the average value $\theta^I$, the priority of improvement is low. Accordingly, the hypotheses of the statistical test for importance index i are defined as follows:
$$\text{null hypothesis } H_0: \theta_i^I \ge \theta^I; \tag{17}$$
$$\text{alternative hypothesis } H_1: \theta_i^I < \theta^I. \tag{18}$$
We take the unbiased estimator $\hat{\theta}_i^I$ as the test statistic with significance level $\alpha$; then, the critical region for importance index i is
$$CR_{Y_i} = \left\{ \hat{\theta}_i^I \mid \hat{\theta}_i^I < C_{Y_{0i}} \right\}. \tag{19}$$
Thus, we have
$$p\left\{ \hat{\theta}_i^I < C_{Y_{0i}} \mid \theta_i^I = \theta^I \right\} = \alpha. \tag{20}$$
Equivalently,
$$p\left\{ Z_{Y_i} < \frac{\sqrt{n}\left(C_{Y_{0i}} - \theta^I\right)}{S_{Y_i}} \right\} = \alpha, \tag{21}$$
where
$$S_{Y_i} = \sqrt{\frac{1}{n-1} \times \sum_{j=1}^{n} \left(Y_{i,j} - \bar{Y}_i\right)^2} \tag{22}$$
is the sample standard deviation and
$$Z_{Y_i} = \frac{\sqrt{n}\left(\hat{\theta}_i^I - \theta^I\right)}{S_{Y_i}} \tag{23}$$
is approximately distributed as a standard normal distribution when the sample size n is large. Then,
$$C_{Y_{0i}} = \theta^I - z_\alpha \frac{S_{Y_i}}{\sqrt{n}}. \tag{24}$$
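The importance-index derivation moves from the probability statement straight to the critical value. The intermediate step, which mirrors the one given for the satisfaction index in Section 2.1, can be written out as follows, taking $z_\alpha$ to be the upper $\alpha$ quantile of the standard normal distribution:

```latex
\frac{\sqrt{n}\left(C_{Y_{0i}} - \theta^I\right)}{S_{Y_i}} = -z_\alpha
\quad\Longrightarrow\quad
C_{Y_{0i}} = \theta^I - z_\alpha \frac{S_{Y_i}}{\sqrt{n}}.
```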

3. Performance Evaluation Matrix

As mentioned earlier, the performance evaluation matrix is widely used to evaluate and improve the performance levels of q service items for various service systems. In this paper, the satisfaction index is set as the x-axis and the importance index is set as the y-axis to form the performance evaluation matrix. Then, the horizontal line $y = C_{Y_0}$ and the vertical line $x = C_{X_0}$ are used to divide the performance evaluation matrix into four quadrants, where $C_{X_0} = \mathrm{Min}\{C_{X_{01}}, C_{X_{02}}, \ldots, C_{X_{0q}}\}$ and $C_{Y_0} = \mathrm{Min}\{C_{Y_{01}}, C_{Y_{02}}, \ldots, C_{Y_{0q}}\}$. These four quadrants can be defined as follows:
$$\text{Quadrant 1: } Q_1 = \left\{ (x, y) \mid C_{X_0} \le x \le 1,\ C_{Y_0} \le y \le 1 \right\}, \tag{25}$$
$$\text{Quadrant 2: } Q_2 = \left\{ (x, y) \mid 0 \le x < C_{X_0},\ C_{Y_0} \le y \le 1 \right\}, \tag{26}$$
$$\text{Quadrant 3: } Q_3 = \left\{ (x, y) \mid 0 \le x < C_{X_0},\ 0 \le y < C_{Y_0} \right\}, \tag{27}$$
$$\text{Quadrant 4: } Q_4 = \left\{ (x, y) \mid C_{X_0} \le x \le 1,\ 0 \le y < C_{Y_0} \right\}. \tag{28}$$
The performance evaluation matrix and its four quadrants are displayed in Figure 1.
Obviously, when the evaluation coordinate points fall into Quadrant 2 or Quadrant 3, the satisfaction of the service items is lower than the average satisfaction, so these service items need to be improved. Since the service items whose points fall into Quadrant 2 are more important than those whose points fall into Quadrant 3, their improvement priority is higher. When the evaluation coordinate points fall into Quadrant 1 or Quadrant 4, the satisfaction of the service items is not lower than the average satisfaction, so no improvement is needed and the current state can be maintained. Based on the above concept, we can define the evaluation coordinates for the ith service item as follows:
$$(x_i, y_i) = \left(\hat{\theta}_i^S, \hat{\theta}_i^I\right). \tag{29}$$
Then, the evaluation rules and improvement ranking rules are established as follows:
(1) If $(x_i, y_i) \in Q_1 \cup Q_4$, then service item i does not need any improvement.
(2) If $(x_i, y_i) \in Q_2 \cup Q_3$, then service item i needs to be improved.
(3) If $(x_i, y_i) \in Q_2$, then service item i is ranked as the first priority of improvement.
(4) If $(x_i, y_i) \in Q_3$, then service item i is ranked as the second priority of improvement.
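The four rules above can be sketched as a small classifier. This is an illustration under stated assumptions: the function names and the four evaluation points are hypothetical, the two thresholds are the values 0.49 and 0.68 reported in the case study, and points lying exactly on a dividing line are assigned to the higher side.

```python
def quadrant(x, y, c_x0, c_y0):
    """Return the PEM quadrant (1 to 4) of an evaluation point (x, y)."""
    low_satisfaction = x < c_x0      # left of the vertical line x = C_X0
    high_importance = y >= c_y0      # on or above the horizontal line y = C_Y0
    if high_importance:
        return 2 if low_satisfaction else 1
    return 3 if low_satisfaction else 4


def improvement_plan(points, c_x0, c_y0):
    """Map item id -> priority (1 or 2) for the items that need improvement."""
    priority_by_quadrant = {2: 1, 3: 2}   # rules (3) and (4)
    plan = {}
    for item, (x, y) in points.items():
        q = quadrant(x, y, c_x0, c_y0)
        if q in priority_by_quadrant:     # rule (2): only Q2 and Q3 need improvement
            plan[item] = priority_by_quadrant[q]
    return plan


# Hypothetical evaluation points (satisfaction estimate, importance estimate).
points = {"A": (0.60, 0.75), "B": (0.41, 0.74), "C": (0.37, 0.60), "D": (0.55, 0.62)}
print(improvement_plan(points, c_x0=0.49, c_y0=0.68))  # {'B': 1, 'C': 2}
```

Items A and D fall into Quadrants 1 and 4, so they are left out of the plan, which corresponds to rule (1).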

4. Statistical Two-Tailed Test

As noted above, confirming the effect of the improvement is an important step in management. Therefore, in this section we use a statistical two-tailed test to judge whether the satisfaction of all the CTQ service items that need to be improved has been enhanced. Let A be the set of service items that need to be improved; then,
$$A = \left\{ i \mid (x_i, y_i) \in Q_2 \cup Q_3 \right\}. \tag{30}$$
Let $\theta_{bi}^S$ be the value of the satisfaction index before improvement and $\theta_{ai}^S$ the value of the satisfaction index after improvement, for $i \in A$. If the satisfaction sample data collected after improvement are $X_{i,1}, \ldots, X_{i,j}, \ldots, X_{i,n}$ for $i \in A$, then the estimator of $\theta_{ai}^S$ is as follows:
$$\hat{\theta}_{ai}^S = \frac{1}{n} \times \sum_{j=1}^{n} X_{i,j}. \tag{31}$$
Similarly, the estimator of $\theta_{bi}^S$ for $i \in A$, computed from the sample data collected before improvement, is as follows:
$$\hat{\theta}_{bi}^S = \frac{1}{n} \times \sum_{j=1}^{n} X_{i,j}. \tag{32}$$
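Then,
$$Z_{hi} = \frac{\sqrt{n}\left(\hat{\theta}_{hi}^S - \theta_{hi}^S\right)}{S_{X_{hi}}}, \quad h = a, b \text{ and } i \in A, \tag{33}$$
where
$$S_{X_{ai}} = \sqrt{\frac{1}{n-1} \times \sum_{j=1}^{n} \left(X_{i,j} - \bar{X}_i\right)^2} \quad \text{and} \quad S_{X_{bi}} = \sqrt{\frac{1}{n-1} \times \sum_{j=1}^{n} \left(X_{i,j} - \bar{X}_i\right)^2}, \tag{34}$$
each computed from its own sample. $Z_{hi}$ is approximately distributed as a standard normal distribution when the sample size n is large. Thus,
$$1 - \alpha = p\left\{ -Z_{\alpha/2} \le Z_{hi} \le Z_{\alpha/2} \right\} = p\left\{ -Z_{\alpha/2} \le \frac{\sqrt{n}\left(\hat{\theta}_{hi}^S - \theta_{hi}^S\right)}{S_{X_{hi}}} \le Z_{\alpha/2} \right\} = p\left\{ \hat{\theta}_{hi}^S - Z_{\alpha/2} \frac{S_{X_{hi}}}{\sqrt{n}} \le \theta_{hi}^S \le \hat{\theta}_{hi}^S + Z_{\alpha/2} \frac{S_{X_{hi}}}{\sqrt{n}} \right\}. \tag{35}$$
Then, the $100(1 - \alpha)\%$ confidence intervals of $\theta_{bi}^S$ and $\theta_{ai}^S$ are as follows:
$$\text{before improvement: } \left[L_{\theta_{bi}}, U_{\theta_{bi}}\right] = \left[\hat{\theta}_{bi}^S - Z_{\alpha/2} \frac{S_{X_{bi}}}{\sqrt{n}},\ \hat{\theta}_{bi}^S + Z_{\alpha/2} \frac{S_{X_{bi}}}{\sqrt{n}}\right]; \tag{36}$$
$$\text{after improvement: } \left[L_{\theta_{ai}}, U_{\theta_{ai}}\right] = \left[\hat{\theta}_{ai}^S - Z_{\alpha/2} \frac{S_{X_{ai}}}{\sqrt{n}},\ \hat{\theta}_{ai}^S + Z_{\alpha/2} \frac{S_{X_{ai}}}{\sqrt{n}}\right]. \tag{37}$$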
Obviously, when the value of $\theta_{ai}^S$ is larger than that of $\theta_{bi}^S$, the improvement is effective. When $\theta_{ai}^S$ is equal to $\theta_{bi}^S$, the improvement has no effect. When the value of $\theta_{ai}^S$ is smaller than that of $\theta_{bi}^S$, no improvement has occurred; in fact, satisfaction has become worse. Accordingly, the statistical hypothesis test for validating the effectiveness of improvement is as follows:
$$\text{null hypothesis } H_0: \theta_{bi}^S = \theta_{ai}^S; \tag{38}$$
$$\text{alternative hypothesis } H_1: \theta_{bi}^S \ne \theta_{ai}^S. \tag{39}$$
Based on Equations (36) and (37), we perform the two-tailed statistical test using two confidence intervals. The test rules are defined as follows:
(1) If $U_{\theta_{bi}} < L_{\theta_{ai}}$, then the improvement of service item i is effective.
(2) If $\left[L_{\theta_{bi}}, U_{\theta_{bi}}\right] \cap \left[L_{\theta_{ai}}, U_{\theta_{ai}}\right] \ne \emptyset$, then service item i has not improved significantly.
(3) If $U_{\theta_{ai}} < L_{\theta_{bi}}$, then no improvement has occurred; in fact, satisfaction has become worse.
Based on the rules of the two-tailed statistical test, we can assist the system administrator of the teaching APP to verify the improvement effect upon all the service items that need to be improved and determine whether the improvements should continue so as to ensure the system quality of the teaching APP.
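The confidence intervals of Equations (36) and (37) and the three test rules can be sketched as follows. This is an illustration, not the authors' code: the function names and verdict labels are hypothetical, and the example reuses the rounded summary statistics that the case study in Section 5 reports for service item 8.

```python
from math import sqrt
from statistics import NormalDist


def confidence_interval(theta_hat, s, n, alpha=0.05):
    """Equations (36)/(37): theta_hat -/+ Z_{alpha/2} * s / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * s / sqrt(n)
    return theta_hat - half_width, theta_hat + half_width


def improvement_verdict(before, after):
    """Apply the three test rules to two (lower, upper) intervals."""
    lower_b, upper_b = before
    lower_a, upper_a = after
    if upper_b < lower_a:
        return "effective"         # rule (1): after lies entirely above before
    if upper_a < lower_b:
        return "worse"             # rule (3): after lies entirely below before
    return "not significant"       # rule (2): the intervals overlap


# Rounded values reported for service item 8 in the case study.
before = confidence_interval(0.34, 0.28, 344)
after = confidence_interval(0.27, 0.24, 258)
verdict = improvement_verdict(before, after)
print(verdict)  # worse
```

Comparing two separate intervals is a conservative way to run the two-tailed test: non-overlap implies a significant difference, while overlap is simply reported as no significant improvement.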

5. Case Study

As mentioned above, with the performance evaluation matrix, we can help the system administrator of the teaching APP quickly identify the key service items that need to be improved and implement improvements. In the field of English digital learning, many scholars have studied the computer-assisted language learning system (CALL system) to help students improve their foreign language skills [22,23,24,25,26]. The web-based e-learning system (WELS) satisfaction questionnaire designed by Shee and Wang [27] comprises four dimensions (1. Learner Interface, 2. Learning Community, 3. System Content, and 4. Personalization and Contains) and 13 questions, and can be regarded as simple and complete. Therefore, it has been adopted by many studies [28,29,30,31], for example, a study on students’ perceptions of their e-learning experiences in English for specific purposes (ESP) classes [28], a study examining and categorizing proper conditions for mathematics teaching as a real-life utilization with a multi-criteria decision analysis and fuzzy-decision-making trial and evaluation laboratory (MCDA and F-DEMATEL) method [29], and a review of 98 articles confirming satisfaction as a critical aspect of student success in online education [30]. This paper also used this questionnaire to collect users’ satisfaction and importance ratings and adopted the CALL system as a case study to illustrate the proposed model.
This questionnaire survey recruited students from a university in Taiwan as a sample. A total of 398 questionnaires were sent out, and 355 were returned (a response rate of 89.2%), of which 11 were invalid and 344 were valid (an effective response rate of 86.4%). The following data analysis is based on the 344 valid questionnaires (N = 344).
First, according to Equations (3) and (14), we calculated the estimator ($\hat{\theta}_i^S$) of the satisfaction index and the estimator ($\hat{\theta}_i^I$) of the importance index for each question. At the same time, according to Equation (5), we calculated the average of the estimators of all satisfaction indices, $\theta^S = 0.52$. Then, the hypotheses of the statistical test for satisfaction index i are as follows:
$$\text{null hypothesis } H_0: \theta_i^S \ge 0.52;$$
$$\text{alternative hypothesis } H_1: \theta_i^S < 0.52.$$
Subsequently, based on Equation (13) and $\theta^S = 0.52$, we calculated the critical value ($C_{X_{0i}}$) of the critical region of the satisfaction index for each question and listed it in Table 1.
Similarly, according to Equation (16), the average of the estimators of all importance indices, $\theta^I = 0.70$, was calculated. The hypotheses of the statistical test for importance index i were defined as follows:
$$\text{null hypothesis } H_0: \theta_i^I \ge 0.70;$$
$$\text{alternative hypothesis } H_1: \theta_i^I < 0.70.$$
Next, according to Equation (24) and $\theta^I = 0.70$, we determined the critical value ($C_{Y_{0i}}$) of the critical region of the importance index for each question and listed it in Table 1.
According to Table 1, we used the horizontal line $y = C_{Y_0}$ and the vertical line $x = C_{X_0}$ to divide the performance evaluation matrix into four quadrants, where $C_{X_0} = \mathrm{Min}\{C_{X_{01}}, C_{X_{02}}, \ldots, C_{X_{0q}}\} = 0.49$ and $C_{Y_0} = \mathrm{Min}\{C_{Y_{01}}, C_{Y_{02}}, \ldots, C_{Y_{0q}}\} = 0.68$. Next, the evaluation coordinate points $(x_i, y_i) = (\hat{\theta}_i^S, \hat{\theta}_i^I)$ of all service items in Table 1 were plotted in Figure 2.
According to Figure 2, the evaluation coordinate points of these 13 service items fall into the four evaluation quadrants as follows:
Quadrant 1: items 1–4;
Quadrant 2: item 5, item 6;
Quadrant 3: item 7, item 8;
Quadrant 4: items 9–13.
According to the evaluation rules, service items 5–8 need to be improved. Based on the improvement ranking rules, since the evaluation coordinates satisfy $(x_5, y_5) \in Q_2$ and $(x_6, y_6) \in Q_2$, service items 5 and 6 were ranked as the first priority of improvement. Similarly, since $(x_7, y_7) \in Q_3$ and $(x_8, y_8) \in Q_3$, service items 7 and 8 were ranked as the second priority of improvement.
Next, managers can refer to the recommendations of these rules to carry out improvement procedures. After the improvement is completed, a test of the improvement can be carried out. Based on Equations (38) and (39), the hypotheses of the statistical test are as follows:
$$\text{null hypothesis } H_0: \theta_{bi}^S = \theta_{ai}^S;$$
$$\text{alternative hypothesis } H_1: \theta_{bi}^S \ne \theta_{ai}^S.$$
Based on Equations (36) and (37), the 95% confidence intervals of the satisfaction indices of the four service items that need improvement are calculated, with sample size n = 344 before improvement and n = 258 after improvement.
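  • Service Item 5
    $$\text{before improvement: } \left[L_{\theta_{b5}}, U_{\theta_{b5}}\right] = \left[0.41 - 1.96 \frac{0.28}{\sqrt{344}},\ 0.41 + 1.96 \frac{0.28}{\sqrt{344}}\right] = [0.38, 0.44];$$
    $$\text{after improvement: } \left[L_{\theta_{a5}}, U_{\theta_{a5}}\right] = \left[0.48 - 1.96 \frac{0.27}{\sqrt{258}},\ 0.48 + 1.96 \frac{0.27}{\sqrt{258}}\right] = [0.45, 0.51].$$
  • Service Item 6
    $$\text{before improvement: } \left[L_{\theta_{b6}}, U_{\theta_{b6}}\right] = \left[0.39 - 1.96 \frac{0.29}{\sqrt{344}},\ 0.39 + 1.96 \frac{0.29}{\sqrt{344}}\right] = [0.36, 0.42];$$
    $$\text{after improvement: } \left[L_{\theta_{a6}}, U_{\theta_{a6}}\right] = \left[0.47 - 1.96 \frac{0.28}{\sqrt{258}},\ 0.47 + 1.96 \frac{0.28}{\sqrt{258}}\right] = [0.43, 0.50].$$
  • Service Item 7
    $$\text{before improvement: } \left[L_{\theta_{b7}}, U_{\theta_{b7}}\right] = \left[0.37 - 1.96 \frac{0.29}{\sqrt{344}},\ 0.37 + 1.96 \frac{0.29}{\sqrt{344}}\right] = [0.34, 0.40];$$
    $$\text{after improvement: } \left[L_{\theta_{a7}}, U_{\theta_{a7}}\right] = \left[0.42 - 1.96 \frac{0.27}{\sqrt{258}},\ 0.42 + 1.96 \frac{0.27}{\sqrt{258}}\right] = [0.39, 0.46].$$
  • Service Item 8
    $$\text{before improvement: } \left[L_{\theta_{b8}}, U_{\theta_{b8}}\right] = \left[0.34 - 1.96 \frac{0.28}{\sqrt{344}},\ 0.34 + 1.96 \frac{0.28}{\sqrt{344}}\right] = [0.31, 0.37];$$
    $$\text{after improvement: } \left[L_{\theta_{a8}}, U_{\theta_{a8}}\right] = \left[0.27 - 1.96 \frac{0.24}{\sqrt{258}},\ 0.27 + 1.96 \frac{0.24}{\sqrt{258}}\right] = [0.24, 0.30].$$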
In summary, applying the aforementioned test rules gives the following results:
(1) For service items 5 and 6, the upper limits before improvement are smaller than the lower limits after improvement ($U_{\theta_{b5}} = 0.44 < L_{\theta_{a5}} = 0.45$; $U_{\theta_{b6}} = 0.42 < L_{\theta_{a6}} = 0.43$), indicating that the improvements of service items 5 and 6 are effective.
(2) For service item 7, $\left[L_{\theta_{b7}}, U_{\theta_{b7}}\right] = [0.34, 0.40]$ and $\left[L_{\theta_{a7}}, U_{\theta_{a7}}\right] = [0.39, 0.46]$, so $\left[L_{\theta_{b7}}, U_{\theta_{b7}}\right] \cap \left[L_{\theta_{a7}}, U_{\theta_{a7}}\right] \ne \emptyset$. This shows that the improvement effect of service item 7 is not significant, so continuous improvement is necessary.
(3) For service item 8, $U_{\theta_{a8}} = 0.30 < L_{\theta_{b8}} = 0.31$, which means that no improvement has occurred; in fact, satisfaction has become worse. The improvement strategy therefore needs to be re-examined so that effective improvements can be made.

6. Conclusions

Following Hung, Huang, and Chen [14], we proposed standardized and significant indices of importance and satisfaction on the premise that satisfaction and importance follow the Beta distribution, so that the values of both indices lie between 0 and 1. We set the satisfaction index as the x-axis and the importance index as the y-axis to form a performance evaluation matrix. Without loss of generality, we assumed that the teaching APP provided a total of q digital learning service items, so there were q satisfaction indices and q importance indices in total. First, since the indices contained unknown parameters, we derived the unbiased estimators of the satisfaction index and the importance index for all service items. Second, in the spirit of continuous improvement in total quality management, we used the average value $\theta_0^S$ of all satisfaction indices as the testing standard and adopted the statistical one-tailed test to determine the critical value $C_0^S$. Third, we established testing rules to help management identify the service items that need to be improved. At the same time, the average value $\theta_0^I$ of all importance indices was adopted as the testing standard, and the ranking of all service items that need to be improved was determined by the statistical one-tailed test. Next, these two statistical testing rules served as the basis for defining the PEM evaluation blocks and establishing the PEM evaluation rules. Since the method in this paper was established via statistical tests, the risk of misjudgment caused by sampling error can be reduced. After the improvement was completed, we employed the statistical two-tailed test to verify whether the satisfaction of all key service items that need to be improved had risen.
Finally, a case study of the computer-assisted language learning system (CALL system) was used to explain the application of the abovementioned statistical testing rules.
As noted above, most studies only identify the critical-to-quality (CTQ) items that need improvement, without discussing processing rules in resource-limited settings and without confirming the improvement. The method in this paper fills this gap. It adopts a one-tailed statistical hypothesis test on the satisfaction index, which can help industries detect the CTQ service items that need to be improved and determine the order of improvement with a low risk of misjudgment. The two-tailed statistical hypothesis test then helps industries confirm the improvement effect. In addition, the evaluation and improvement model for the digital learning effectiveness of the teaching APP constructed in this paper will help attract more users to learn through the teaching APP. The APP enables more students to avoid the inconvenience of going out and to obtain good learning outcomes at home. By reducing the need to travel, it can alleviate traffic congestion and contribute to energy saving and carbon reduction.
This paper draws on the literature, closely examines the needs of customers, and assumes that customer satisfaction and importance follow the Beta distribution. Since the Beta distribution is a continuous probability distribution, data have to be collected on a scale from a lowest score of 0 to a highest score of 100 and then divided by 100 to produce values between 0 and 1. In addition, the statistical tests rely on the Central Limit Theorem. These are the limitations of this study. Future research can therefore focus on data collection methods suited to the Beta distribution and explore the appropriate sample size for applying the Central Limit Theorem.

Author Contributions

Conceptualization, C.-C.L. and K.-S.C.; methodology, C.-C.L. and K.-S.C.; software, C.-H.Y.; validation, C.-H.Y.; formal analysis, C.-C.L. and K.-S.C.; resources, C.-H.Y.; data curation, C.-H.Y.; writing (original draft preparation), C.-C.L., C.-H.Y. and K.-S.C.; writing (review and editing), C.-C.L. and K.-S.C.; visualization, C.-H.Y.; supervision, K.-S.C.; project administration, C.-H.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data underlying the results are available as part of the article and no additional source data are required.

Conflicts of Interest

The authors declare no conflict of interest.

Nomenclature

$q$: the number of service items
$X_i$: the satisfaction of the $i$th service item
$Y_i$: the importance attached to the $i$th service item
$a_i^S$; $b_i^S$: parameters of the Beta distribution for satisfaction
$a_i^I$; $b_i^I$: parameters of the Beta distribution for importance
$\theta_i^S$: satisfaction index
$\hat{\theta}_i^S$: the estimator of $\theta_i^S$
$\theta_i^I$: importance index
$\hat{\theta}_i^I$: the estimator of $\theta_i^I$
$\theta_0^S$: the average value of all satisfaction indices
$C_0^S$: critical value
$\theta^I$: the average value of $\hat{\theta}_i^I$
$(x_i, y_i) = (\hat{\theta}_i^S, \hat{\theta}_i^I)$: evaluation coordinate point
$n$: sample size
$X_{i,1}, \ldots, X_{i,j}, \ldots, X_{i,n}$: the satisfaction sample data of the customer's $i$th service item
$E[\hat{\theta}_i^S] = \theta_i^S$: the expected value of $\hat{\theta}_i^S$
$\hat{\theta}_i^S$: an unbiased estimator of $\theta_i^S$
$\theta^S$: the average value of $\hat{\theta}_i^S$
$\alpha$: significance level
$CR_{Xi}$: the critical region for satisfaction index $i$
$S_{Xi}$: the standard deviation of satisfaction index $i$
$Z_{Xi}$: the standard score of satisfaction index $i$
$C_{X0i}$: the critical value of satisfaction index $i$
$Y_{i,1}, \ldots, Y_{i,j}, \ldots, Y_{i,n}$: the importance sample data of the customer's $i$th service item
$\hat{\theta}_i^I$: an unbiased estimator of importance index $i$
$E[\hat{\theta}_i^I] = \theta_i^I$: the expected value of $\hat{\theta}_i^I$
$\theta^I$: the average value of $\hat{\theta}_i^I$
$CR_{Yi}$: the critical region for importance index $i$
$S_{Yi}$: the standard deviation of importance index $i$
$Z_{Yi}$: the standard score of importance index $i$
$C_{Y0i}$: the critical value of importance index $i$
$H_0$: null hypothesis
$H_1$: alternative hypothesis
$y = C_{Y0}$: the value at which the horizontal line intersects the y-axis
$x = C_{X0}$: the value at which the vertical line intersects the x-axis
$C_{X0}$: $\mathrm{Min}\{C_{X01}, C_{X02}, \ldots, C_{X0q}\}$
$C_{Y0}$: $\mathrm{Min}\{C_{Y01}, C_{Y02}, \ldots, C_{Y0q}\}$
$Q_1$: Quadrant 1
$Q_2$: Quadrant 2
$Q_3$: Quadrant 3
$Q_4$: Quadrant 4
$A$: the set formed by the service items that need to be enhanced
$\theta_{bi}^S$: the value of the satisfaction index before improvement
$\theta_{ai}^S$: the value of the satisfaction index after improvement
$X_{i,1}, \ldots, X_{i,j}, \ldots, X_{i,n}$: the satisfaction sample data for $i \in A$
$\hat{\theta}_{ai}^S$: the average value of the satisfaction index after improvement
$\hat{\theta}_{bi}^S$: the average value of the satisfaction index before improvement
$Z_{hi}$: the standard score of satisfaction index $i$ after improvement ($h = a$) or before improvement ($h = b$)
$S_{Xai}$: the standard deviation of satisfaction index $i$ after improvement
$S_{Xbi}$: the standard deviation of satisfaction index $i$ before improvement
$[L_{\theta bi}, U_{\theta bi}]$: the $100(1-\alpha)\%$ confidence interval of $\theta_{bi}^S$
$[L_{\theta ai}, U_{\theta ai}]$: the $100(1-\alpha)\%$ confidence interval of $\theta_{ai}^S$

References

  1. Ahmad, S.; Wong, K.Y.; Tseng, M.-L.; Wong, W.P. Sustainable product design and development: A review of tools, applications and research prospects. Resour. Conserv. Recycl. 2018, 132, 49–61. [Google Scholar] [CrossRef]
  2. Amindoust, A.; Ahmed, S.; Saghafinia, A.; Bahreininejad, A. Sustainable supplier selection: A ranking model based on fuzzy inference system. Appl. Soft Comput. 2012, 12, 1668–1677. [Google Scholar] [CrossRef]
  3. Awasthi, A.; Govindan, K.; Gold, S. Multi-tier sustainable global supplier selection using a fuzzy AHP-VIKOR based approach. Int. J. Prod. Econ. 2018, 195, 106–117. [Google Scholar] [CrossRef]
  4. Meshram, C.; Ibrahim, R.W.; Deng, L.; Shende, S.W.; Meshram, S.G.; Barve, S.K. A robust smart card and remote user password-based authentication protocol using extended chaotic maps. Soft Comput. 2021, 25, 10037–10051. [Google Scholar] [CrossRef]
  5. Yazdi, A.K.; Hanne, T.; Gómez, J.C.O. Evaluating the performance of colombian banks by hybrid multicriteria decision making methods. J. Bus. Econ. Manag. 2020, 21, 1707–1730. [Google Scholar] [CrossRef]
  6. Angtuaco, D.S.; Barria, N.M.A.; Lee, J.M.C.; Tangsoc, J.C.; Chiu, A.S.F.; Mutuc, J.E. A redesign of the toothpaste tube using green QFD II for improved usability and sustainability. J. Clean. Prod. 2023, 393, 136279. [Google Scholar] [CrossRef]
  7. Sarfaraz, A.H.; Yazdi, A.K.; Wanke, P.; Nezhad, E.A.; Hosseini, R.S. A novel hierarchical fuzzy inference system for supplier selection and performance improvement in the oil & gas industry. J. Decis. Syst. 2022. [Google Scholar] [CrossRef]
  8. Chen, K.-S.; Yu, C.-M. Fuzzy test model for performance evaluation matrix of service operating systems. Comput. Ind. Eng. 2020, 140, 106240. [Google Scholar] [CrossRef]
  9. Martínez-Caro, E.; Cegarra-Navarro, J.G.; Cepeda-Carrión, G. An application of the performance-evaluation model for e-learning quality in higher education. Total. Qual. Manag. Bus. 2015, 26, 632–647. [Google Scholar] [CrossRef]
  10. Wong, R.C.P.; Szeto, W.Y. An alternative methodology for evaluating the service quality of urban taxis. Transp. Policy 2018, 69, 132–140. [Google Scholar] [CrossRef]
  11. Wu, J.; Wang, Y.; Zhang, R.; Cai, J. An Approach to Discovering Product/Service Improvement Priorities: Using Dynamic Importance-Performance Analysis. Sustainability 2018, 10, 3564. [Google Scholar] [CrossRef]
  12. Cezar, A.; Ögüt, H. Analyzing conversion rates in online hotel booking: The role of customer reviews, recommendations and rank order in search listings. Int. J. Contemp. Hosp. Manag. 2016, 28, 286–304. [Google Scholar] [CrossRef]
  13. Gómez-Déniz, E.; Pérez-Rodríguez, J.V.; Boza-Chirino, J. Modelling tourist expenditure at origin and destination. Tour. Econ. 2020, 26, 437–460. [Google Scholar] [CrossRef]
  14. Hung, Y.H.; Huang, M.L.; Chen, K.-S. Service quality evaluation by service quality performance matrix. Total. Qual. Manag. Bus. 2003, 14, 79–89. [Google Scholar] [CrossRef]
  15. Wang, C.-H.; Chen, K.-S. New process yield index of asymmetric tolerances for bootstrap method and six sigma approach. Int. J. Prod. Econ. 2020, 219, 216–223. [Google Scholar] [CrossRef]
  16. Yu, K.T.; Chen, K.S. Testing and analyzing capability performance for products with multiple characteristics. Int. J. Prod. Res. 2016, 54, 6633–6643. [Google Scholar] [CrossRef]
  17. Chen, K.-S. Fuzzy testing of operating performance index based on confidence intervals. Ann. Oper. Res. 2022, 311, 19–33. [Google Scholar] [CrossRef]
  18. Cheng, S.W. Practical implementation of testing process capability indices. Qual. Eng. 1994, 7, 239–259. [Google Scholar] [CrossRef]
  19. Xu, Y.; Zhang, X.; Meng, P. A Novel Intelligent Deep Learning-Based Uncertainty-Guided Network Training in Market Price. IEEE Trans. Ind. Inform. 2022, 18, 5705–5711. [Google Scholar] [CrossRef]
  20. Yang, C. Establishment and applications of the integrated model of service quality measurement. Manag. Serv. Qual. 2003, 13, 310–324. [Google Scholar] [CrossRef]
  21. Yu, C.-M.; Chang, H.-T.; Chen, K.-S. Developing a performance evaluation matrix to enhance the learner satisfaction of an e-learning system. Total. Qual. Manag. Bus. Excel. 2018, 29, 727–745. [Google Scholar] [CrossRef]
  22. Li, R. Research trends of blended language learning: A bibliometric synthesis of SSCI-indexed journal articles during 2000–2019. ReCALL 2022, 34, 309–326. [Google Scholar] [CrossRef]
  23. Bobunova, A.; Sergeeva, M.; Notina, E. Integrating Computer-Assisted Language Learning into ESL Classroom: Formation of Moral and Aesthetic Values. Int. J. Inf. Educ. Technol. 2021, 11, 24–29. [Google Scholar] [CrossRef]
  24. Warschauer, M.; Healey, D. Computers and language learning: An overview. Lang. Teach. 1998, 31, 57–71. [Google Scholar] [CrossRef]
  25. Leszczyński, P.; Charuta, A.; Łaziuk, B.; Gałązkowski, R.; Wejnarski, A.; Roszak, M.; Kołodziejczak, B. Multimedia and interactivity in distance learning of resuscitation guidelines: A randomised controlled trial. Interact. Learn. Environ. 2018, 26, 151–162. [Google Scholar] [CrossRef]
  26. Zhang, Y.; MacWhinney, B. The role of novelty stimuli in second language acquisition: Evidence from the optimized training by the Pinyin Tutor at TalkBank. Smart Learn. Environ. 2023, 10, 3. [Google Scholar] [CrossRef]
  27. Shee, D.Y.; Wang, Y.-S. Multi-criteria evaluation of the web-based e-learning system: A methodology based on learner satisfaction and its applications. Comput. Educ. 2008, 50, 894–905. [Google Scholar] [CrossRef]
  28. Gaffas, Z.M. Students’ perceptions of e-learning ESP course in virtual and blended learning modes. Educ. Inf. Technol. 2023. [Google Scholar] [CrossRef]
  29. Martin, F.; Bolliger, D.U. Developing an online learner satisfaction framework in higher education through a systematic review of research. Int. J. Educ. Technol. High. Educ. 2022, 19, 1–21. [Google Scholar] [CrossRef]
  30. Qing, J.; Cheng, G.; Ni, X.-Q.; Yang, Y.; Zhang, W.; Li, Z. Implementation of an interactive virtual microscope laboratory system in teaching oral histopathology. Sci. Rep. 2022, 12, 5492. [Google Scholar] [CrossRef]
  31. Souza, F.V.; Motoki, F.Y.S.; Mainardes, E.W.; Azzari, V. Public Corporate e-Learning: Antecedents and Results. Public Organ. Rev. 2021, 22, 1139–1156. [Google Scholar] [CrossRef]
Figure 1. Performance evaluation matrix and four quadrants.
Figure 2. PEM of CALL System.
Table 1. Satisfaction and Importance survey for CALL System.

| Dimension | Item | $\hat{\theta}_i^S$ | $\hat{\theta}_i^I$ | $C_{X0i}$ | $C_{Y0i}$ | Quadrant |
|---|---|---|---|---|---|---|
| Learner Interface | 1. Ease of use | 0.65 | 0.77 | 0.49 | 0.69 | Q1 |
| Learner Interface | 2. User-friendliness | 0.70 | 0.77 | 0.50 | 0.69 | Q1 |
| Learner Interface | 3. Ease of understanding | 0.68 | 0.76 | 0.50 | 0.69 | Q1 |
| Learner Interface | 4. Operational stability | 0.59 | 0.75 | 0.49 | 0.69 | Q1 |
| Learning Community | 5. Ease of discussion with other learners | 0.41 | 0.77 | 0.49 | 0.69 | Q2 ** |
| Learning Community | 6. Ease of discussion with teachers | 0.39 | 0.77 | 0.49 | 0.69 | Q2 ** |
| Learning Community | 7. Ease of accessing shared data | 0.37 | 0.63 | 0.49 | 0.68 | Q3 * |
| Learning Community | 8. Ease of exchanging learning with the others | 0.34 | 0.57 | 0.49 | 0.68 | Q3 * |
| System Content | 9. Up-to-date content | 0.52 | 0.67 | 0.49 | 0.68 | Q4 |
| System Content | 10. Sufficient content | 0.52 | 0.67 | 0.49 | 0.68 | Q4 |
| System Content | 11. Useful content | 0.53 | 0.66 | 0.49 | 0.68 | Q4 |
| Personalization | 12. Capability of controlling learning progress | 0.51 | 0.65 | 0.49 | 0.68 | Q4 |
| Personalization | 13. Capability of recording learning performance | 0.51 | 0.67 | 0.49 | 0.68 | Q1 |

Note: * indicates the need for improvement; ** indicates the need for the top priority of improvement.
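The quadrant assignment in Table 1 can be approximated with a small sketch. The rule used here is an assumption inferred from the table rows and the Nomenclature: the vertical line is $x = C_{X0}$ and the horizontal line is $y = C_{Y0}$, each taken as the minimum of the item-level critical values, with Q1 high satisfaction and high importance, Q2 low satisfaction and high importance, Q3 low on both, and Q4 high satisfaction and low importance. The function name `classify` is hypothetical, and only a subset of the items is shown.

```python
def classify(x, y, c_x0, c_y0):
    """Assign an evaluation point (x, y) to a quadrant of the PEM.

    Assumed layout: x is the satisfaction estimate, y the importance
    estimate; the lines x = c_x0 and y = c_y0 split the plane.
    """
    if x >= c_x0:
        return "Q1" if y >= c_y0 else "Q4"
    return "Q2" if y >= c_y0 else "Q3"


# (satisfaction estimate, importance estimate) for selected items of Table 1.
items = {
    1: (0.65, 0.77),
    5: (0.41, 0.77),
    7: (0.37, 0.63),
    9: (0.52, 0.67),
    13: (0.51, 0.67),
}

# Minimum of the item-level critical values that appear in Table 1.
c_x0 = min(0.49, 0.50)
c_y0 = min(0.69, 0.68)

quadrants = {i: classify(x, y, c_x0, c_y0) for i, (x, y) in items.items()}
print(quadrants)
```

Under this assumed rule and the rounded values in the table, items 1, 5, 7, and 9 reproduce the quadrants listed in Table 1, while item 13 falls in Q4 rather than the Q1 printed in the table. The sketch does not resolve that difference, which may come from rounding or from a detail of the paper's rule not captured here.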
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Liu, C.-C.; Yu, C.-H.; Chen, K.-S. Using Statistical Test Method to Establish a Decision Model of Performance Evaluation Matrix. Appl. Sci. 2023, 13, 5139. https://doi.org/10.3390/app13085139