1. Introduction
In daily life, we frequently come across intricate challenges that are full of uncertainties. Such uncertainties may be impossible to model using traditional mathematical approaches, so state-of-the-art mathematical techniques are needed to model them. To avoid ambiguities, Zadeh created the idea of fuzzy sets (f-sets) [1]. f-sets are common mathematical tools used in numerous domains, ranging from computer science [2,3] to pure mathematics [4,5,6,7,8,9]. Figure 1 shows some hybrid extensions of f-sets.
An f-set has entries indicated by $\mu(x)$, i.e., a membership degree for $x$. Because $\mu(x) + \nu(x) = 1$ in an f-set, the non-membership degree $\nu(x)$ is calculated by subtracting $\mu(x)$ from 1. However, if $\mu(x) + \nu(x) < 1$, it is not as simple, and there is additional uncertainty. As an extension of f-sets, intuitionistic fuzzy sets (if-sets) [10] have been proposed to model this form of uncertainty. An if-set has entries indicated by $\mu(x)$ and $\nu(x)$, namely membership and non-membership degrees, respectively, such that $\mu(x) + \nu(x) \le 1$ (Figure 2). In contrast to fuzzy sets, the idea of intuitionistic fuzzy sets can depict problems where $\mu(x) + \nu(x) < 1$. In addition, the indeterminacy degree is determined as $\pi(x) = 1 - \mu(x) - \nu(x)$.
Although f-sets and if-sets may overcome many difficulties and uncertainties [23], far more are encountered in practice. Consider the voting process for a presidential election. During this procedure, the electorate's decisions can be divided into three categories: yes, no, and abstention. To represent such a process, Cuong proposed the notion of picture fuzzy sets (pf-sets) [16]. A pf-set has elements with the degrees of membership, non-membership, and neutral membership denoted by $\mu(x)$, $\nu(x)$, and $\eta(x)$, respectively. The refusal to vote or non-participation in voting leads to the indeterminacy described above. Furthermore, $\pi(x) = 1 - \mu(x) - \eta(x) - \nu(x)$ reflects the degree of indeterminacy in pf-sets because $\mu(x) + \eta(x) + \nu(x) \le 1$ in Cuong's definition. Even though pf-sets model the aforementioned difficulties, the definitions and operations put forward by Cuong have conceptual errors. Memiş [21] revised the idea of pf-sets and their associated operations to maintain consistency under a revised membership condition.
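For reference, the standard degree conditions separating these notions can be summarized in one display (using the common symbols $\mu$, $\eta$, $\nu$, and $\pi$ for the membership, neutral membership, non-membership, and indeterminacy degrees; this is the standard presentation in the literature rather than a formula recovered from this paper):

```latex
\begin{align*}
\text{$f$-set:}  \quad & \nu(x) = 1 - \mu(x),\\
\text{$if$-set:} \quad & \mu(x) + \nu(x) \le 1, && \pi(x) = 1 - \mu(x) - \nu(x),\\
\text{$pf$-set:} \quad & \mu(x) + \eta(x) + \nu(x) \le 1, && \pi(x) = 1 - \mu(x) - \eta(x) - \nu(x).
\end{align*}
```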
Conversely, pf-sets are unable to model problems comprising parameters and alternatives (objects) with a picture fuzzy membership (pf-membership) degree. In other words, pfs-sets [16,18,24] can represent problems with alternatives (objects) using pf-membership (Figure 3), with the expert voting on whether to accept, reject, or abstain from the alternatives.
Recently, various studies have been conducted on pf-sets and pfs-sets. The idea of a rough picture set has been introduced, and several of its topological features, including the lower and upper rough picture fuzzy approximation operators, have been investigated [25]. The creation of clustering algorithms that can extract latent knowledge from large datasets is an emerging research field in pf-sets. The distance and similarity measure is one of the most crucial tools in clustering, establishing the level of association between two objects. Therefore, a generalized picture distance measure has been defined and applied to picture fuzzy clustering [26]. In addition to distance measures, picture fuzzy similarity has also been studied [27,28]. A technique for solving decision-making problems utilizing generalized pfs-sets and an adjustable weighted soft discernibility matrix has been presented, and threshold functions have been defined [29]. A weighted soft discernibility matrix in the generalized pfs-sets has been employed to offer an illustrative example demonstrating the superiority of the suggested approach therein. Matrix representations of mathematical concepts, such as pfs-sets, are crucial in the context of computerization [30,31]. Thus, Arikrishnan and Sriram [20] defined picture fuzzy soft matrices and investigated their algebraic structures. Because the related study is based on Cuong's [16] study, it contains some theoretical inconsistencies. Moreover, Arikrishnan and Sriram have only focused on the algebraic structures. The study of Sahu et al. [32] aims to analyze students' characteristics, such as career, memory, interest, knowledge, environment, and attitude, in order to predict the most suitable career path, enabling students to explore and excel in their chosen field comfortably. A hybridized distance measure has been proposed, using picture fuzzy numbers to evaluate students, subjects, and students' characteristics for career selection. However, the related studies only rely on fictitious problem data. A research study that integrates pfs-sets with Quality Function Deployment (QFD) to propose a Multiple Criteria Group Decision-Making (MCGDM) method has been discussed [33]. In this approach, the preferences of the decision-makers are collected in linguistic terms and transformed into Picture Fuzzy Numbers (PFNs). The study applies the proposed MCGDM method to rank social networking sites, specifically evaluating Facebook, Whatsapp, Instagram, and Twitter, providing valuable insights into their comparative performance. The study of Lu et al. [34] has introduced the concept of generalized pfs-sets by combining a picture fuzzy soft set with a fuzzy parameter set. They discuss five main operations for generalized pfs-sets: subset, equality, union, intersection, and complement.
Suppose a problem involves picture fuzzy uncertainty and a large amount of data. In that case, pfs-sets cannot operate efficiently; processing the data by computer is compulsory, and matrix versions of pfs-sets are needed. The concept of picture fuzzy soft matrices (pfs-matrices) was propounded in 2020 [20]; however, in the aforementioned study, only the algebraic structures of the concept were investigated. To this end, this paper redefines the concept of pfs-matrices, defines distance measures of pfs-matrices, and applies them to supervised learning to manifest their modeling ability. The major contributions of this paper are as follows:
pfs-matrices are redefined, and some of their basic properties are investigated.
Distance measures of pfs-matrices are introduced.
Picture fuzzy soft k-nearest neighbor (PFS-kNN) based on distance measure of pfs-matrices is proposed.
An application of PFS-kNN to medical diagnosis is provided.
In Section 2 of the paper, definitions of pf-sets and pfs-sets are provided. In Section 3, the motivations for redefining pfs-matrices are detailed. In Section 4, the idea of pfs-matrices is redefined, and their properties are further examined. In Section 5, distance measures of pfs-matrices are introduced, and their basic properties are researched. In Section 6, a PFS-kNN classifier is proposed. In Section 7, the proposed classifier is applied to medical diagnosis and compared with well-known kNN-based classifiers. Finally, we discuss pfs-matrices and PFS-kNN and provide conclusive remarks for further research.
2. Preliminaries
In this section, we present the concepts of pf-sets and pfs-sets using the notations employed throughout this study. Throughout this paper, let E and U denote the parameter and alternative sets, respectively.
Definition 1 ([16,21]).
Let f be a function such that . Then, the graphic is called a picture fuzzy set (pf-set) over E. Here, a pf-set is denoted by instead of . Moreover, for all , and . Furthermore, , , and are the membership, neutral membership, and non-membership functions, respectively, and the indeterminacy degree of the element is defined by .
In the present paper, the set of all the pf-sets over E is symbolized by and .
Remark 1. In , the notations graph and f are interchangeable since they have generated each other uniquely. Thus, we prefer the notation f to graph for brevity, provided that it results in no confusion.
Definition 2 ([16,22]).
Let α be a function such that . Then, the graphic is called a picture fuzzy soft set (pfs-set) parameterized via E over U (or briefly over U). Throughout this paper, the set of all the pfs-sets over U is symbolized by .
Remark 2. In , the notations graph and α are interchangeable since they have generated each other uniquely. Thus, we prefer the notation α to graph for brevity, provided that it results in no confusion.
Example 1. Let and . Then, is a pfs-set over U.

3. Motivations for Redefining pfs-Matrices
This section discusses Arikrishnan and Sriram's definition [20], which is based on Cuong's definition [16], its fundamental operations, and counter-examples to it, considering the notations employed throughout the rest of the study.
Definition 3 ([16]). Let . Then, the graphic is called a picture fuzzy set (pf-set) over E such that . In this section, the set of all the pf-sets over E according to Cuong's definition is denoted by and .

Definition 4 ([16]). Let . For all , if , , and , then is called a subset of and is denoted by .

Definition 5 ([16]). Let . If and , then and are called equal pf-sets and are denoted by .

Definition 6 ([16]). Let . For all , if , , and , then is called the union of and and is denoted by .

Definition 7 ([16]). Let . For all , if , , and , then is called the intersection of and and is denoted by .

Definition 8 ([16]). Let . For all , if , , and , then is called the complement of and is denoted by .

To hold the conditions "The empty pf-set over E is a subset of all the pf-sets over E" and "All pf-sets over E are subsets of the universal pf-set over E", the definition and operations of pf-sets in [16] must be as follows [21]:
Definition 9 ([21]). Let . For all , if , , and , then κ is called the empty pf-set and is denoted by or .

Definition 10 ([21]). Let . For all , if , , and , then κ is called the universal pf-set and is denoted by or .

Cuong's definitions have led to the inconsistencies in Examples 2 and 3 [21]:
Example 2 ([21]). There is a contradiction in Definition 10 since , i.e., . Moreover, even if , .

Example 3 ([21]). Let such that . Then, and .

Therefore, Memiş [21] has revised the definition and operations of pf-sets in [16] to overcome the aforementioned inconsistencies.
Definition 11 ([16,18]).
. The set is called a pfs-set over U, where is a mapping given by . In this section, the set of all the pfs-sets over U according to Cuong’s definition is denoted by and .
Cuong [16] defined pfs-sets based on his own definition and operations of pf-sets. As a result, the inconsistencies mentioned earlier also apply to his concept of pfs-sets. Additionally, Yang et al. [18] claimed to have introduced the concept of pfs-sets, even though Cuong had already defined it in [16]. Thus, the concept of pfs-sets therein has similar inconsistencies. Hence, pfs-sets were redefined to deal with the inconsistencies mentioned above [22].
Furthermore, the concept of pfs-matrices has similar inconsistencies therein, since Arikrishnan and Sriram [20] introduced pfs-matrices according to Cuong's definition [16] and defined their union, intersection, and complement.
Definition 12 ([20]).
Let . Then, is called the pfs-matrix of and defined by such that for and , . Here, if and , then has order . In the present study, the membership, neutral membership, and non-membership degrees of , i.e., , , and , will be denoted by , , and , respectively, as long as they do not cause any confusion. Moreover, the set of all the pfs-matrices over U according to Arikrishnan and Sriram's definition is denoted by and .
It must be noted that the following definitions from [20] are expressed using the notations employed throughout the present paper. Definitions of inclusion and equality in the pfs-matrices space are provided according to Arikrishnan and Sriram's definitions.
Definition 13. Let . For all i and j, if , , and , then is called a submatrix of and is denoted by .
Definition 14. Let . For all i and j, if , , and , then and are called equal pfs-matrices and denoted by .
Definition 15 ([20]). Let . For all i and j, if , , and , then is called the union of and and denoted by .

Definition 16 ([20]). Let . For all i and j, if , , and , then is called the intersection of and and denoted by .

Definition 17 ([20]). Let . For all i and j, if , , and , then is the complement of and denoted by .

According to Arikrishnan and Sriram's definitions, the empty and universal pfs-matrices must be defined as in Definitions 18 and 19, respectively, to hold the conditions "The empty pfs-matrix over U is a submatrix of all the pfs-matrices over U" and "All pfs-matrices over U are submatrices of the universal pfs-matrix over U".
Definition 18. Let . For all i and j, if , , and , then is empty pfs-matrix and is denoted by .
Definition 19. Let . For all i and j, if , , and , then is universal pfs-matrix and is denoted by .
Arikrishnan and Sriram’s definitions have resulted in the inconsistencies in Examples 4 and 5:
Example 4. There is a contradiction in Definition 19 since , namely, . Moreover, even if , .
Example 5. Let such that . Then, and .

Consequently, since the aforesaid definitions and operations of pfs-matrices are inconsistent in how they operate, this concept and its operations must be redefined.
4. Picture Fuzzy Soft Matrices (pfs-Matrices)
Cuong [16] and Yang et al. [18] introduced the concept of pfs-sets to address the need for more general mathematical modeling of specific issues involving additional uncertainties. In addition, Yang et al. [18] proposed an adjustable soft discernibility approach based on pfs-sets and applied it to a decision-making problem. Memiş [22] redefined the concept of pfs-sets and applied it to a project selection problem. The applications described in the aforementioned studies demonstrate the successful use of pfs-sets in addressing various issues with uncertainties modeled by membership, non-membership, and neutral degrees, namely picture fuzzy uncertainties. These results suggest that researching the idea of pfs-sets is worthwhile. However, it is important to note that these ideas have drawbacks, such as complexity and lengthy computation times. Therefore, it is crucial to understand their matrix representations, i.e., pfs-matrices, and ensure their theoretical consistency in the context of computerizing the aforementioned problems. For instance, utilizing pfs-sets in machine learning requires pfs-matrices, which are matrix representations of pfs-sets, together with consistent theoretical definitions and operations.
Thus, in the present section, we provide a consistent redefinition of pfs-matrices and present some of their fundamental properties. Since some of the propositions in this section have elementary proofs, only those with complex proofs are demonstrated.
Definition 20. Let (see Definition 2). Then, is called the pfs-matrix of α and defined by such that for and , . Here, if and , then has order . In the present study, the membership, neutral membership, and non-membership degrees of , i.e., , , and , will be denoted by , , and , respectively, as long as they do not cause any confusion. Moreover, the set of all the pfs-matrices parameterized via E over U (briefly over U) is denoted by and .

Example 6. The pfs-matrix of α given in Example 1 is as follows:

Definition 21. Let . For all i and j, if , , and , then is the -pfs-matrix and denoted by . Moreover, is the empty pfs-matrix, and is the universal pfs-matrix.

Definition 22. Let , , and . For all i and j, if then is called the -restriction of and is denoted by . Briefly, if , then can be used instead of . It is clear that .

Definition 23. Let . For all i and j, if , , and , then is called a submatrix of and denoted by .

Definition 24. Let . For all i and j, if , , and , then and are called equal pfs-matrices and denoted by .
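Because the degree triples in Definitions 20–24 do not render above, a small computational sketch may help fix ideas. It stores a pfs-matrix as an m × n array of (membership, neutral, non-membership) triples and checks the entrywise condition μ + η + ν ≤ 1; the function names and the submatrix ordering convention (μ and η componentwise ≤, ν componentwise ≥) are illustrative assumptions, not the paper's notation.

```python
import numpy as np

def is_pfs_matrix(a):
    """Check that `a` is a valid pfs-matrix: an m x n x 3 array of
    (membership, neutral, non-membership) triples, each in [0, 1],
    with mu + eta + nu <= 1 entrywise (the standard picture fuzzy condition)."""
    a = np.asarray(a, dtype=float)
    if a.ndim != 3 or a.shape[2] != 3:
        return False
    in_unit = np.all((a >= 0.0) & (a <= 1.0))
    return bool(in_unit and np.all(a.sum(axis=2) <= 1.0 + 1e-12))

def is_submatrix(a, b):
    """Sketch of Definition 23: `a` is a submatrix of `b` when, entrywise,
    mu_a <= mu_b, eta_a <= eta_b, and nu_a >= nu_b (a common ordering
    convention; the paper's exact inequalities are elided above)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return bool(np.all(a[..., 0] <= b[..., 0])
                and np.all(a[..., 1] <= b[..., 1])
                and np.all(a[..., 2] >= b[..., 2]))
```

Under this convention, equality (Definition 24) is simply mutual submatrix inclusion.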
Proposition 1. Let . Then,
- i.
- ii.
- iii.
- iv.
- v.
- vi.
Proof. The proofs of i–vi are straightforward. □
Remark 3. From Proposition 1, it is straightforward that the inclusion relation herein is a partial ordering relation in .
Definition 25. Let . If and , then is called a proper submatrix of and denoted by .
Definition 26. Let . For all i and j, if , , and , then is called union of and and denoted by .
Definition 27. Let . For all i and j, if , , and , then is called intersection of and and denoted by .
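Since the componentwise formulas of Definitions 26, 27, and 29 are elided above, the sketch below uses one convention common in the picture fuzzy literature: the union takes the maximum of memberships and the minimum of neutral and non-memberships, the intersection takes the minimum of memberships and neutrals and the maximum of non-memberships, and the complement swaps membership with non-membership. If the paper's exact operations differ, only the max/min choices below change.

```python
import numpy as np

def pfs_union(a, b):
    """Entrywise union of two same-order pfs-matrices (assumed convention:
    max of membership, min of neutral, min of non-membership)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return np.stack([np.maximum(a[..., 0], b[..., 0]),
                     np.minimum(a[..., 1], b[..., 1]),
                     np.minimum(a[..., 2], b[..., 2])], axis=-1)

def pfs_intersection(a, b):
    """Entrywise intersection (assumed convention: min of membership,
    min of neutral, max of non-membership)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return np.stack([np.minimum(a[..., 0], b[..., 0]),
                     np.minimum(a[..., 1], b[..., 1]),
                     np.maximum(a[..., 2], b[..., 2])], axis=-1)

def pfs_complement(a):
    """Sketch of Definition 29 below: swap membership and non-membership,
    keep the neutral degree; reversing each (mu, eta, nu) triple does this."""
    return np.asarray(a, float)[..., ::-1].copy()
```

Note that the complement so defined is an involution, matching the document's remark that the complement of the complement returns the original matrix.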
Example 7. Assume that two pfs-matrices and are as follows: Then,

Proposition 2. Let . Then,
- i.
and
- ii.
and
- iii.
and
- iv.
and
- v.
and
- vi.
and
Proof. vi. Let . Then, . The proof of the second part of vi is similar to the aforementioned proof. In addition, the proofs of i–v are straightforward. □
Definition 28. Let . For all i and j, if , , and , then is called difference between and and denoted by .
Proposition 3. Let . Then,
- i.
- ii.
Proof. The proofs of i and ii are straightforward. □
Remark 4. It must be emphasized that the difference operation herein is non-commutative and non-associative.
Definition 29. Let . For all i and j, if , , and , then is complement of and denoted by or . It is clear that .
Proposition 4. Let . Then,
- i.
- ii.
- iii.
- iv.
Proof. The proofs of i–iv are straightforward. □
Proposition 5. Let . Then, the following De Morgan’s laws are valid.
- i.
- ii.
Proof. i. Let . Then, . The proof of ii is similar to the aforementioned proof. □
Definition 30. Let . For all i and j, if and then is called the symmetric difference between and and denoted by .

Proposition 6. Let . Then,
- i.
- ii.
- iii.
- iv.
Proof. iv. Let . Then, . The proofs of i–iii are similar to the proof mentioned above. □
Remark 5. It must be emphasized that the symmetric difference operation herein is non-associative.
5. Distance Measures of pfs-Matrices
This section firstly defines the concept of metrics over . One of the significant goals herein is to contribute to pf-sets and soft sets theoretically. The other is to improve the modeling capability of pfs-matrices for classification problems in machine learning owing to the aforementioned theoretical contribution. Throughout this study, let .
Definition 31. Let be a function. Then, d is a metric over if d satisfies the following properties for all :
- i. $d(A, B) = 0$ if and only if $A = B$,
- ii. $d(A, B) = d(B, A)$ (symmetry),
- iii. $d(A, C) \le d(A, B) + d(B, C)$ (triangle inequality).
Secondly, Minkowski, Euclidean, and Hamming metrics over are propounded. Thereafter, their basic properties are investigated.
Proposition 7. The function defined by such that is the Minkowski metric over . Its normalized version, namely the normalized Minkowski metric, is defined as follows: such that . Specifically, and are the Hamming and Euclidean metrics, represented by and , respectively. Moreover, and are the normalized Hamming and Euclidean metrics, represented by and , respectively.
Proof. Let and . Satisfying conditions i and ii is straightforward from Definition 31. Then, . Moreover, , , , and because , for all and . Hence, . Then, . □
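Since the displayed formulas of Proposition 7 are elided above, the following is a plausible sketch of a normalized Minkowski metric over pfs-matrices: sum the p-th powers of the absolute differences of all three degrees over all entries, take the p-th root, and divide by a normalizing constant so the value lies in [0, 1]. The normalizer (3mn)^(1/p) is an assumption (the maximum attainable raw distance), not necessarily the paper's exact constant.

```python
import numpy as np

def pfs_minkowski(a, b, p=2, normalized=True):
    """Minkowski-type distance between two same-order pfs-matrices stored
    as m x n x 3 arrays of (mu, eta, nu) triples. p=1 gives a Hamming-type
    metric and p=2 a Euclidean-type metric, as in Proposition 7."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    d = np.sum(np.abs(a - b) ** p) ** (1.0 / p)
    if normalized:
        m, n, _ = a.shape
        d /= (3 * m * n) ** (1.0 / p)  # assumed normalizer: max possible raw distance
    return d
```

The non-negativity, symmetry, and identity-of-indiscernibles properties of Definition 31 are immediate from the absolute differences; the triangle inequality follows from the Minkowski inequality.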
Proposition 8. Let and . Then, .

Proof. The proof is straightforward. □
Proposition 9. Let and . Then, .
Proof. The proof is straightforward. □
Proposition 10. Let and . Then,
- i.
- ii.
Proof. The proofs of i and ii are straightforward. □
6. Picture Fuzzy Soft k-Nearest Neighbor Classifier: PFS-kNN
In this section, firstly, the basic expressions and notations required for the suggested PFS-kNN based on pfs-matrices are provided. Throughout the paper, let represent a data matrix. The last column of D consists of the class labels of the data. Here, m and n are the numbers of samples and attributes in D, respectively. Moreover, let , , and , attained from D, denote a training matrix, the class matrix of the training matrix, and the testing matrix, respectively, such that . Moreover, let be a matrix comprising the unique class labels of . Further, let and represent the ith rows of and , respectively. In a similar manner, and represent the jth rows of and , respectively. Furthermore, let stand for the predicted classes of the testing queries.
Definition 32. Let . Then, the vector such that defined by is called normalized u, i.e., the normalizing vector of u.

Definition 33. Consider the training matrix attained from , , and . Then, the matrix defined by is called the feature-fuzzification matrix of , namely the column-normalized matrix of , and it is denoted by .

Definition 34. Consider the testing matrix attained from , , and . Then, the matrix defined by is called the feature-fuzzification matrix of , namely the column-normalized matrix of , and it is denoted by .

Definition 35. Let be a feature-fuzzification matrix of . Then, the matrix is called the feature picture fuzzification of and is defined by such that , , and .

Definition 36. Let be a feature-fuzzification matrix of . Then, the matrix is called the feature picture fuzzification of and is defined by such that , , and .

Definition 37. Let be a feature-fuzzification matrix of and be the picture fuzzification of . Then, the pfs-matrix is the training pfs-matrix attained by row of and is defined by such that and .

Definition 38. Let be a feature-fuzzification matrix of and be the picture fuzzification of . Then, the pfs-matrix is called the testing pfs-matrix attained by row of and is defined by such that and .
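Definitions 33 and 34 describe column normalization; since the formulas are elided above, a common sketch is min–max normalization of each column, computed from the training matrix and reapplied to the testing matrix so that test features use the training scale. (Whether the paper normalizes by min–max or by column maximum is not recoverable here; the function name is illustrative.)

```python
import numpy as np

def feature_fuzzification(train, test):
    """Min-max normalize each column of `train` into [0, 1] and apply the
    same column-wise scale to `test` (a sketch of Definitions 33-34)."""
    train, test = np.asarray(train, float), np.asarray(test, float)
    lo, hi = train.min(axis=0), train.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)        # guard against constant columns
    scale = lambda X: np.clip((X - lo) / span, 0.0, 1.0)
    return scale(train), scale(test)
```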
Secondly, a new classifier named PFS-kNN, employing the Minkowski metric of pfs-matrices, is suggested; its pseudocode is presented in Algorithm 1. In Line 1, it obtains the feature fuzzification of the testing and training matrices required for feature picture fuzzification. In Line 2, the feature picture fuzzification of the testing and training matrices is computed utilizing their feature-fuzzification versions. The aim herein is to prepare the data in a form that can be used in the distance calculation of pfs-matrices. In Lines 3–4, the ith testing pfs-matrix is constructed by extracting the ith sample from the feature picture fuzzification of the testing matrix. Similarly, in Lines 5–6, the jth training pfs-matrix is constructed by extracting the jth sample from the feature picture fuzzification of the training matrix. In Line 7, the distance between the ith test sample and the jth training sample is calculated utilizing the Minkowski metric over the pfs-matrices in accordance with Proposition 7, and is attained. In Line 9, the k-nearest neighbors according to the matrix of picture fuzzy soft distances, namely , are determined. In Line 10, the most repetitive class label (predicted class label) of the determined k-nearest neighbors is obtained. In Line 11, the predicted class label, particularly the diagnosis label in medical diagnosis, is assigned to the test sample. In Lines 12–13, finally, the predicted label (class) matrix is created for the test queries.
Algorithm 1 PFS-kNN's pseudocode
Input: , , , k, , and p
Output: PFS-kNN(, C, , k, , p)
1: Calculate the feature fuzzification of and , i.e., and ▹ See Definitions 33 and 34
2: Calculate the feature picture fuzzification of and , i.e., and ▹ See Definitions 35 and 36
3: for i from 1 to do
4:   Calculate the testing pfs-matrix employing
5:   for j from 1 to do
6:     Calculate the training pfs-matrix employing
7:     ▹ See Proposition 7
8:   end for
9:   Find the k-nearest neighbors using
10:   Find the most repetitive class label in the considered k-nearest neighbors
11:   most repetitive class label (predicted class label)
12: end for
13: return
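Algorithm 1 can be rendered in Python roughly as follows. This is a sketch under stated assumptions, not the authors' exact construction: feature fuzzification is min–max normalization, the picture fuzzification μ = x, η = x(1−x)/2, ν = (1−x)/2 is an illustrative placeholder for the elided Definitions 35–36 (chosen only so that μ + η + ν ≤ 1), and the distance is a plain Minkowski-type metric over the degree triples.

```python
import numpy as np
from collections import Counter

def picture_fuzzify(X):
    """Assumed picture fuzzification of a [0,1]-valued matrix:
    mu = x, eta = x*(1-x)/2, nu = (1-x)/2, so that mu + eta + nu <= 1."""
    return np.stack([X, X * (1 - X) / 2, (1 - X) / 2], axis=-1)

def pfs_knn(train, labels, test, k=3, p=2):
    """PFS-kNN sketch (Algorithm 1): classify each test row by the majority
    label among its k nearest training rows under a Minkowski-type metric
    on the picture-fuzzified features."""
    train, test = np.asarray(train, float), np.asarray(test, float)
    lo, hi = train.min(axis=0), train.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    ftr = picture_fuzzify(np.clip((train - lo) / span, 0, 1))      # Lines 1-2
    fte = picture_fuzzify(np.clip((test - lo) / span, 0, 1))
    preds = []
    for t in fte:                                                  # Lines 3-12
        d = np.sum(np.abs(ftr - t) ** p, axis=(1, 2)) ** (1 / p)   # Line 7
        nearest = np.argsort(d)[:k]                                # Line 9
        preds.append(Counter(labels[i] for i in nearest).most_common(1)[0][0])
    return preds                                                   # Line 13
```

A usage example: with one-dimensional training samples `[[0.0], [0.1], [0.9], [1.0]]` labeled `['a', 'a', 'b', 'b']`, the query `[[0.05]]` is assigned class `'a'` for `k=3`.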
7. Application of PFS-kNN to Medical Diagnosis
In this section, firstly, the details of the datasets used in the simulation and the settings of the compared classifiers are provided according to the methodology presented in Figure 4. Afterward, the performance metrics for classification problems are introduced. Finally, simulation results for several medical datasets in the UC Irvine Machine Learning Repository (UCI-MLR) [35] are presented, and a discussion of the results is provided.
7.1. Medical Datasets
One of the major motivations of this paper is the applicability of PFS-kNN in medical diagnosis. Therefore, four well-known and commonly used medical diagnosis datasets in UCI-MLR [35] were chosen. This subsection offers descriptions of the following medical datasets employed in the simulation, provided in Table 1: "Breast Tissue", "Parkinsons[sic]", "Breast Cancer Wisconsin", and "Indian Liver".
Breast Tissue [35]: This dataset measured impedance at the frequencies , , , 125, 250, 500, and 1000 kHz. The aforesaid frequencies were used to test the impedance of freshly excised breast tissue. The impedance spectrum is formed by plotting these data in the (real, imaginary) plane, from which the features of the breast tissue are calculated. The dataset can be used to predict the categorization of either the original six classes or four classes by combining the mastopathy, fibro-adenoma, and glandular types, whose distinction is unnecessary (they cannot be differentiated accurately).
Parkinsons[sic] [35]: The dataset consists of a range of biomedical voice measurements from 31 people, 23 of whom have Parkinson's disease. Each column in the dataset stands for a separate vocal measure, and each row corresponds to one of these people's 195 voice recordings ("name" column). The major purpose of the data is to differentiate between healthy people and Parkinson's disease patients by utilizing the "status" column, which is set to 0 for healthy and 1 for Parkinson's disease.
Breast Cancer Wisconsin (Diagnostic) [35]: This dataset's characteristics are constructed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe the characteristics of the cell nuclei shown in the image. The separation plane mentioned above was created using the Multisurface Method-Tree (MSM-T), a classification approach that constructs a decision tree using linear programming [44]. To locate relevant features, an exhaustive search in the space of 1–4 features and 1–3 separation planes was utilized. The exact linear program used to obtain the separating plane in 3-dimensional space is described in [45].
Indian Liver Patient (ILPD) [35]: This data collection contains 416 records for liver patients and 167 for non-liver patients. The dataset was gathered in the northeast of Andhra Pradesh, India. The selector is a class label categorizing people as liver patients or not. This data collection includes the records of 441 male and 142 female patients. Any patient over the age of 89 is labeled as "90".
7.2. Quality Metrics for Classification Performance
In this subsection, the mathematical expressions of the quality metrics for binary and multiclass classification [46], i.e., Accuracy, Precision, Sensitivity (or Recall), and F1-Score, are presented to compare the considered classifiers. Assume that is the n queries to be classified, is their ground-truth class sets, is their prediction class sets, and l is their number of classes. The quality metrics for binary classification are as follows:
where true positive ( ), true negative ( ), false positive ( ), and false negative ( ) are defined as follows: such that stands for the cardinality of a set.
The performance metrics for multiclass classification are as follows: where the ith true positive ( ), ith true negative ( ), ith false positive ( ), and ith false negative ( ) for the class i are defined as follows: such that stands for the cardinality of a set.
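As a concrete sketch of the metrics above (whose displayed formulas are elided), Accuracy and macro-averaged Precision, Recall, and F1-Score can be computed from per-class TP, FP, and FN counts over ground-truth and predicted label sequences:

```python
from collections import defaultdict

def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged Precision, Recall, and F1-Score,
    computed from per-class TP/FP/FN counts (a sketch of the elided formulas)."""
    classes = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1   # p predicted but the truth was another class
            fn[t] += 1   # t missed for its true class
    prec = rec = f1 = 0.0
    for c in classes:
        p_c = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        r_c = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        prec += p_c
        rec += r_c
        f1 += 2 * p_c * r_c / (p_c + r_c) if p_c + r_c else 0.0
    n = len(classes)
    acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return acc, prec / n, rec / n, f1 / n
```

For the binary case, restricting attention to the positive class recovers the usual Precision, Recall, and F1-Score.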
7.3. Diagnosis Results for Medical Diagnosis
In this subsection, the comparison of PFS-kNN with the well-known and state-of-the-art kNN-based classifiers (Table 2), i.e., kNN [36], Fuzzy kNN [37], WkNN [38], IFROWANN [39], LCkNN [40], GMkNN [41], LMRkNN [42], and BM-Fuzzy kNN [43], is performed by employing a computer with an Intel(R) Core(TM) i5-4200H CPU @ 2.80 GHz and 8 GB RAM and MATLAB R2021b software. The classifiers' performance results were generated by 10 random runs of five-fold cross-validation (CV) [47,48]; in each CV, the considered dataset was randomly split into five parts, four of which were selected for training and the other for testing (for more details about CV, see [47]).
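The evaluation protocol above (10 random runs of five-fold CV) can be sketched with stdlib tools alone; `run_once` is a placeholder for training and scoring any of the compared classifiers on one train/test split:

```python
import random

def five_fold_indices(n, rng):
    """Shuffle 0..n-1 and split the shuffled indices into five near-equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::5] for i in range(5)]

def cross_validate(n, run_once, runs=10, seed=0):
    """10 x 5-fold CV: in each run, each fold serves once as the test part
    while the remaining four folds form the training part; returns all
    runs x 5 scores produced by `run_once(train_idx, test_idx)`."""
    rng = random.Random(seed)
    scores = []
    for _ in range(runs):
        folds = five_fold_indices(n, rng)
        for i, test_idx in enumerate(folds):
            train_idx = [j for k, f in enumerate(folds) if k != i for j in f]
            scores.append(run_once(train_idx, test_idx))
    return scores
```

With `runs=10`, this yields the 50 performance values per dataset summarized in the box plots below.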
Table 3 presents the average Accuracy, Precision, Recall, and F1-Score results of PFS-kNN, kNN, Fuzzy kNN, WkNN, IFROWANN, LCkNN, GMkNN, LMRkNN, and BM-Fuzzy kNN for the datasets.
Based on the results obtained from Accuracy, it is evident that PFS-kNN surpasses all other kNN-based classifiers that were compared. This is similarly observed when it comes to F1-Score results. However, it should be noted that the proposed approach has lower Precision and Recall results when compared to the other classifiers. Nevertheless, the results are still close to the highest score in general.
These simulation results manifest that pfs-matrices and PFS-kNN can model uncertainty and real-world problems, such as medical diagnosis and machine learning. It is important to note that applying these models can significantly impact the accuracy of such issues, leading to more reliable and effective solutions. Therefore, using PFS-kNN and pfs-matrices is recommended when dealing with similar problems.
In this study, we evaluated the Accuracy performance values of the various algorithms on four medical datasets. To obtain a comprehensive understanding of the algorithms' performance, we ran each algorithm 50 times (10 times five-fold cross-validation) and plotted the results as box plots in Figure 5.
From the visual results in Figure 5a–d, we can observe that PFS-kNN outperforms the other algorithms, with the highest performance value and a performance-value distribution that is close to a normal distribution. This indicates that PFS-kNN is a reliable algorithm for these medical datasets.
Similarly, in Figure 5b, we see that PFS-kNN produces the highest performance results, with the 50 performance values almost following a normal distribution. Moreover, the distance between quartiles is relatively low, suggesting that PFS-kNN is consistent in performance.
Overall, the box plots in Figure 5 demonstrate that PFS-kNN is a superior algorithm compared to the others evaluated in this study and a promising option for medical data analysis.
8. Discussion on PFS-kNN in Medical Diagnosis and Supervised Learning
This section discusses the significance of the proposed PFS-kNN classifier’s performance on medical diagnosis datasets herein.
Accuracy and F1-Score Dominance: The achievement of PFS-kNN outperforming all other kNN-based classifiers in terms of Accuracy and F1-Score is remarkable. Accuracy measures the overall correctness of the classifier's predictions, while the F1-Score considers both precision and recall. These metrics are crucial in medical diagnosis, where accurately identifying and classifying medical conditions can be a life-or-death matter. The superior performance of PFS-kNN in these areas indicates its potential as a valuable tool for enhancing the accuracy and effectiveness of medical diagnoses.
Precision and Recall Trade-Off: While PFS-kNN performs well in terms of Accuracy and F1-Score, it is observed to have slightly lower Precision and Recall compared to other classifiers. Precision measures the ratio of correctly predicted positive cases to all predicted positive cases, while Recall measures the ratio of correctly predicted positive cases to all actual positive cases. In medical diagnosis, Precision is vital for minimizing false positive errors, and Recall is crucial for reducing false negatives. The slightly lower Precision and Recall values suggest that PFS-kNN might be more cautious when making positive predictions, possibly to reduce false positive errors. However, the results are still close to the highest scores overall, indicating a reasonable balance between these metrics.
Modeling Uncertainty and Real-World Problems: Addressing the concept of pfs-matrices and their role in modeling uncertainty in practical scenarios, such as medical diagnosis, is significant. Medical diagnosis frequently deals with intricate and uncertain data, and the capability of PFS-kNN to model uncertainty is a valuable advantage. This indicates that the classifier is flexible and resilient in handling various demanding datasets, making it suitable for real-world applications where data are inherently uncertain and noisy.
Impact on Accuracy and Reliability: The practical importance of using PFS-kNN and pfs-matrices in areas such as medical diagnosis, mentioned in the previous section, indicates that they can notably affect accuracy. By enhancing accuracy in medical diagnosis, they can provide more dependable and efficient solutions, decrease misdiagnosis rates, and improve patient outcomes. This emphasizes the potential of PFS-kNN to make a valuable contribution to the healthcare industry, where precision and accuracy are crucial.
Recommendation for Similar Problems: The suggestion to utilize PFS-kNN and pfs-matrices as a conclusion highlights the belief in the effectiveness of this approach. This indicates that the advantages demonstrated in the research are not restricted to the dataset employed for assessment but can also apply to other medical diagnosis scenarios or related fields.
In brief, the performance of the proposed PFS-kNN classifier on medical diagnosis datasets, assessed using Minkowski metrics over pfs-matrices, demonstrates its potential to enhance the accuracy and dependability of medical diagnoses. While there are some trade-offs in Precision and Recall, the overall superiority in Accuracy and F1-Score, coupled with its capability to model uncertainty, positions PFS-kNN as a promising tool for improving healthcare and addressing real-world challenges in supervised learning.
9. Conclusions
This paper redefined the idea of pfs-matrices, and their fundamental properties were examined extensively. Afterward, distance measures of pfs-matrices were introduced. Then, PFS-kNN, via the aforementioned distance measures, was suggested and applied to medical diagnosis. The results manifested that the concept of pfs-matrices and the proposed PFS-kNN approach can model uncertainty and real-world problems such as medical diagnosis.
The current study, which focuses on soft sets, has significantly contributed to the literature in both theoretical and practical aspects. This study has introduced three crucial additions that redefine the mathematics underlying pfs-matrices and proposed new distance measures between pfs-matrices and PFS-kNN. By doing so, this paper has expanded the understanding of this field and enhanced its applicability in real-world problems. In addition, this research has gained prominence in the literature due to its innovative contributions, which have opened up new avenues for further exploration and research in the field.
In future works, there is potential for further investigation into the algebraic and topological structures of pfs-matrices and the exploration of new distance and similarity measures. While pfs-matrices have proven effective in addressing specific problems, it is essential to acknowledge their limitations when dealing with picture fuzzy parameters. To overcome this issue, research can be conducted on several related concepts, such as intuitionistic fuzzy parameterized intuitionistic fuzzy soft matrices (ifpifs-matrices) [49,50], aggregation operators of pfs-matrices [51,52], picture fuzzy parameterized picture fuzzy soft sets (pfppfs-sets) [53], and picture fuzzy parameterized picture fuzzy soft matrices (pfppfs-matrices). Additionally, interval-valued intuitionistic fuzzy parameterized interval-valued intuitionistic fuzzy soft sets (d-sets) [4] and interval-valued intuitionistic fuzzy parameterized interval-valued intuitionistic fuzzy soft matrices (d-matrices) [5] are other related concepts that may be worth exploring. We can better understand their potential applications and limitations by studying and applying these concepts to different real-world problems. For instance, different real-world problems, such as trend prediction of component stocks [54], remote sensing image fusion [55], and Landsat image fusion [56], can be investigated, and applications of pfs-matrices to them can be a focus.