# Metrics Related to Confusion Matrix as Tools for Conformity Assessment Decisions

## Abstract

## Featured Application

**Application in determining the optimal length of the guard band when assessing global producer’s and consumer’s risk.**

## 1. Introduction

## 2. Materials and Methods

#### 2.1. Production Process and Improvement Procedure

#### 2.2. Risk Calculation

#### 2.3. Confusion Matrix Construction

#### 2.4. Metrics Associated to Confusion Matrix Written in Metrology Manner

## 3. Results and Discussion

#### 3.1. Risk Analysis

#### 3.2. Metrics Analysis

#### 3.3. Guard Banding by Confusion Matrix

## 4. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Data Availability Statement

## Conflicts of Interest

## References

- BIPM; IEC; IFCC; ILAC; ISO; IUPAC; IUPAP; OIML. Evaluation of Measurement Data—The Role of Measurement Uncertainty in Conformity Assessment, JCGM 106:2012. BIPM. 2012. Available online: https://www.bipm.org/documents/20126/2071204/JCGM_106_2012_E.pdf/fe9537d2-e7d7-e146-5abb-2649c3450b25?version=1.7&t=1659083025736&download=true (accessed on 26 March 2023).
- Allard, A.; Fischer, N.; Smith, I.; Harris, P.; Pendrill, L. Risk calculations for conformity assessment in practice. In Proceedings of the 19th International Congress of Metrology, Paris, France, 24–26 September 2019.
- Runje, B.; Horvatić Novak, A.; Razumić, A.; Piljek, P.; Štrbac, B.; Orošnjak, M. Evaluation of Consumer and Producer Risk in Conformity Assessment Decision. In Proceedings of the 30th DAAAM International Symposium “Intelligent Manufacturing & Automation”, Zadar, Croatia, 23–26 October 2019.
- ILAC-G8:09/2019. Guidelines on Decision Rules and Statements of Conformity. 2019. Available online: https://ilac.org/publications-and-resources/ilac-guidance-series/ (accessed on 18 May 2023).
- Puydarrieux, S.; Pou, J.M.; Leblond, L.; Fischer, N.; Allard, A.; Feinberg, M.; El Guennouni, D. Role of measurement uncertainty in conformity assessment. In Proceedings of the 19th International Congress of Metrology, Paris, France, 24–26 September 2019.
- Dias, F.R.S.; Lourenço, F.R. Measurement uncertainty evaluation and risk of false conformity assessment for microbial enumeration tests. J. Microbiol. Methods **2021**, 189, 106312.
- Bettencourt da Silva, R. Eurachem/CITAC Guide: Setting and Using Target Uncertainty in Chemical Measurement. 2015. Available online: https://www.eurachem.org/index.php/publications/guides/gd-stmu (accessed on 18 May 2023).
- Pendrill, L.R. Using measurement uncertainty in decision-making and conformity assessment. Metrologia **2014**, 51, 3206.
- Williams, A.; Magnusson, B. Eurachem/CITAC Guide: Use of Uncertainty Information in Compliance Assessment. 2021. Available online: https://www.eurachem.org/index.php/publications/guides/uncertcompliance (accessed on 18 May 2023).
- EUROLAB Technical Report, No. 1/2017-Decision Rules Applied to Conformity Assessment. Available online: https://www.eurolab.org/pubs-techreports (accessed on 18 May 2023).
- Lira, I. A Bayesian approach to the consumer’s and producer’s risks in measurement. Metrologia **1999**, 36, 397.
- Toczek, W.; Smulko, J. Risk Analysis by a Probabilistic Model of the Measurement Process. Sensors **2021**, 21, 2053.
- Separovic, L.; de Godoy Bertanha, M.L.; Saviano, A.M.; Lourenço, F.R. Conformity Decisions Based on Measurement Uncertainty—A Case Study Applied to Agar Diffusion Microbiological Assay. J. Pharm. Innov. **2020**, 15, 110–115.
- BIPM; IEC; IFCC; ILAC; ISO; IUPAC; IUPAP; OIML. Evaluation of Measurement Data—Supplement 1 to the “Guide to the Expression of Uncertainty in Measurement”—Propagation of Distributions Using a Monte Carlo Method, JCGM 101:2008. BIPM. 2008. Available online: https://www.bipm.org/documents/20126/2071204/JCGM_101_2008_E.pdf/325dcaad-c15a-407c-1105-8b7f322d651c?version=1.12&t=1659082897489&download=true (accessed on 26 March 2023).
- Božić, D.; Runje, B. Selection of an Appropriate Prior Distribution in Risk Assessment. In Proceedings of the 33rd International DAAAM Virtual Symposium “Intelligent Manufacturing & Automation”, Vienna, Austria, 26–27 October 2022.
- Pennecchi, F.R.; Kuselman, I.; Di Rocco, A.; Brynn Hibbert, D.; Sobina, A.; Sobina, E. Specific risks of false decisions in conformity assessment of a substance or material with a mass balance constraint—A case study of potassium iodate. Measurement **2021**, 173, 108662.
- Separovic, L.; Bettencourt da Silva, R.J.N.; Lourenço, F.R. Determination of intrinsic and metrological components of the correlation of multiparameter products for minimising the risks of false conformity decisions. Measurement **2021**, 180, 109531.
- Separovic, L.; Lourenço, F.R. Measurement uncertainty and risk of false conformity decision in the performance evaluation of liquid chromatography analytical procedures. J. Pharm. Biomed. Anal. **2019**, 171, 73–80.
- Lombardo, M.; Margueiro da Silva, S.; Lourenço, F.R. Conformity assessment of medicines containing antibiotics—A multivariate assessment. Regul. Toxicol. Pharmacol. **2022**, 136, 105279.
- Kuselman, I.; Pennecchi, F.; Bettencourt da Silva, R.J.N.; Brynn Hibbert, D. Conformity assessment of multicomponent materials or objects: Risk of false decisions due to measurement uncertainty—A case study of denatured alcohols. Talanta **2017**, 164, 189–195.
- de Oliveira, E.C.; Lourenço, F.R. Risk of false conformity assessment applied to automotive fuel analysis: A multiparameter approach. Chemosphere **2021**, 263, 128265.
- Pennecchi, F.R.; Kuselman, I.; Di Rocco, A.; Brynn Hibbert, D.; Semenova, A.A. Risks in a sausage conformity assessment due to measurement uncertainty, correlation and mass balance constraint. Food Control **2021**, 125, 107949.
- Božić, D.; Samardžija, M.; Kurtela, M.; Keran, Z.; Runje, B. Risk Evaluation for Coating Thickness Conformity Assessment. Materials **2023**, 16, 758.
- Brandão, L.P.; Silva, V.F.; Bassi, M.; de Oliveira, E.C. Risk Assessment in Monitoring of Water Analysis of a Brazilian River. Molecules **2022**, 27, 3628.
- Caelen, O. A Bayesian interpretation of the confusion matrix. Ann. Math Artif. Intell. **2017**, 81, 429–450.
- Grandini, M.; Bagli, E.; Visani, G. Metrics for Multi-Class Classification: An Overview. arXiv **2020**.
- Jiao, Y.; Du, P. Performance measures in evaluating machine learning based bioinformatics predictors for classifications. Quant. Biol. **2016**, 4, 320–330.
- Luque, A.; Carrasco, A.; Martín, A.; de las Heras, A. The impact of class imbalance in classification performance metrics based on the binary confusion matrix. Pattern Recognit. **2019**, 91, 216–231.
- Jeni, L.A.; Cohn, J.F.; De La Torre, F. Facing Imbalanced Data-Recommendations for the Use of Performance Metrics. In Proceedings of the International Conference on Affective Computing and Intelligent Interaction (ACII), Geneva, Switzerland, 2–5 September 2013.
- Flach, P.A.; Lachiche, N. Naive Bayesian Classification of Structured Data. Mach. Learn. **2004**, 57, 233–269.
- Tharwat, A. Classification assessment methods. Appl. Comput. Inform. **2021**, 17, 168–192.
- Alvarez, S.A. An Exact Analytical Relation among Recall, Precision, and Classification Accuracy in Information Retrieval. Technical Report BC-CS-2002-01. Available online: http://www.cs.bc.edu/~alvarez/APR/aprformula.pdf (accessed on 19 May 2023).
- Flach, P.; Kull, M. Precision-recall-gain curves: PR analysis done right. Adv. Neural. Inf. Process. Syst. **2015**, 28, 838–846. Available online: https://proceedings.neurips.cc/paper/2015/file/33e8075e9970de0cfea955afd4644bb2-Paper.pdf (accessed on 19 May 2023).
- McHugh, M.L. Interrater reliability: The kappa statistic. Biochem. Med. **2012**, 22, 276–282.
- Chicco, D.; Tötsch, N.; Jurman, G. The Matthews correlation coefficient (MCC) is more reliable than balanced accuracy, bookmaker informedness, and markedness in two-class confusion matrix evaluation. BioData Min. **2021**, 14, 13.
- Chicco, D.; Jurman, G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genom. **2020**, 21, 6.
- Juba, B.; Le, H.S. Precision-Recall versus Accuracy and the Role of Large Data Sets. In Proceedings of the 33rd AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019.
- Sun, Y.; Wong, A.K.C.; Kamel, M.S. Classification of imbalanced data: A review. Intern. J. Pattern Recognit Artif. Intell. **2009**, 23, 687–719.
- Tanha, J.; Abdi, Y.; Samadi, N.; Razzaghi, N.; Asadpour, M. Boosting methods for multi-class imbalanced data classification: An experimental review. J. Big Data **2020**, 7, 70.
- Delgado, R.; Tibau, X.-A. Why Cohen’s Kappa should be avoided as performance measure in classification. PLoS ONE **2019**, 14, e0222916.
- Volodarsky, E.T.; Kosheva, L.O.; Klevtsova, M.O. The Role Uncertainty of Measurements in the Formation of Acceptance Criteria. In Proceedings of the XXIX International Scientific Symposium “Metrology and Metrology Assurance” (MMA), Sozopol, Bulgaria, 6–9 September 2019.
- Haloulos, I.; Theodorou, D.; Zannikou, Y.; Zanninkos, F. Monitoring fuel quality: A case study for quinizarin marker content of unleaded petrol marketed in Greece. Accred. Qual. Assur. **2016**, 21, 203–210.
- Dobbert, M.A. Guard-Band Strategy for Managing False-Accept Risk. NCSLI Meas. **2008**, 4, 44–48.
- Deaver, D. Guardbanding with Confidence. In Proceedings of the NCSL Workshop and Symposium, Chicago, IL, USA, 1 September 1994; Available online: https://download.flukecal.co/pub/literature/ddncsl94.pdf (accessed on 24 May 2023).
- Purata-Sifuentes, O.-J.; Hernández-Balandrán, L.-E.; Tornero-Navarro, M.-G. Role of the measurement uncertainty in cone penetration test results of lubricating grease. In Proceedings of the Joint IMEKO TC11 & TC24 hybrid conference, Dubrovnik, Croatia, 16–20 October 2022.
- Shirono, K.; Tanaka, H.; Koike, M. Economic optimization of acceptance interval in conformity assessment: 1. Process with no systematic effect. Metrologia **2022**, 59, 045005.
- Shirono, K.; Tanaka, H.; Koike, M. Economic optimization of acceptance interval in conformity assessment: 2. Process with unknown systematic effect. Metrologia **2022**, 59, 045006.
- Margueiro da Silva, C.; Lourenço, F.R. Definition of multivariate acceptance limits (guard-bands) applied to pharmaceutical equivalence assessment. J. Pharm. Biomed. Anal. **2023**, 222, 115080.
- Bettencourt da Silva, R.J.N.; Lourenço, F.; Brynn Hibbert, D. Setting Multivariate and Correlated Acceptance Limits for Assessing the Conformity of Items. Anal. Lett. **2022**, 55, 2011–2032.
- Chicco, D.; Warrens, M.J.; Jurman, G. The Matthews Correlation Coefficient (MCC) is More Informative Than Cohen’s Kappa and Brier Score in Binary Classification Assessment. IEEE Access **2021**, 9, 78368–78381.

**Figure 1.** An example of a situation in which the measured value has a large measurement uncertainty, such that its uncertainty interval extends below the lower limit of the tolerance interval [4]. In the figures, ${T}_{L}$ and ${T}_{U}$ denote the lower and upper limits of the tolerance interval, respectively.

**Figure 2.** Different relationships between the acceptance interval and the tolerance interval: (**a**) model of producer’s risk minimization; (**b**) shared risk; (**c**) model of consumer’s risk minimization. In the figures, ${A}_{L}$ and ${A}_{U}$ denote the lower and upper limits of the acceptance interval, respectively.
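
The three cases in Figure 2 can be illustrated with a short sketch. It assumes the common symmetric guard-band convention $A_L = T_L + w$ and $A_U = T_U - w$, so that $w < 0$ widens the acceptance interval beyond the tolerance interval, $w = 0$ makes the two coincide, and $w > 0$ narrows it. The function name and the tolerance limits used below (99.978 mm and 100.022 mm) are illustrative assumptions, not values stated in the text.

```python
def acceptance_interval(t_lower, t_upper, w):
    """Return (A_L, A_U) for a symmetric guard band w (same unit as the limits).

    Assumed convention: A_L = T_L + w and A_U = T_U - w.
    """
    a_lower = t_lower + w
    a_upper = t_upper - w
    if a_lower >= a_upper:
        raise ValueError("guard band too large: acceptance interval is empty")
    return a_lower, a_upper


# Assumed tolerance limits in mm (illustrative only).
T_L, T_U = 99.978, 100.022

cases = [
    ("(a) producer's risk minimization", -0.0025),
    ("(b) shared risk", 0.0),
    ("(c) consumer's risk minimization", 0.0025),
]
for label, w in cases:
    a_l, a_u = acceptance_interval(T_L, T_U, w)
    print(f"{label}: w = {w:+.4f} mm -> [{a_l:.4f}, {a_u:.4f}] mm")
```

Raising an error for an empty interval is a design choice of this sketch; a guard band larger than half the tolerance width would otherwise silently reject every item.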

**Table 1.** Reduction in the number of rings rejected as non-conforming although they meet specifications (global producer’s risk, ${R}_{P}$), and of those accepted as conforming although they do not meet specifications (global consumer’s risk, ${R}_{C}$).

Global Risk | $w=-0.0025$ mm | | $w=0$ | | $w=0.0025$ mm | |
---|---|---|---|---|---|---|
Model | Initial | Improved | Initial | Improved | Initial | Improved |
${R}_{P}$ | 288 | 0 | 484 | 14 | 895 | 81 |
${R}_{C}$ | 380 | 21 | 233 | 6 | 122 | 0 |
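
Counts of this kind can be estimated by simulation. The following is a minimal sketch, not the procedure used in the study: it assumes normally distributed true values and normally distributed measurement errors, and every numerical value (nominal value, standard deviations, tolerance limits, sample size) is hypothetical. The function name `count_false_decisions` is likewise introduced only for illustration.

```python
import random


def count_false_decisions(n, y0, u0, um, t_lower, t_upper, w, seed=0):
    """Simulate n items and count false rejections and false acceptances.

    True values are drawn from N(y0, u0) and measured values from
    N(true value, um); both distributions are assumptions of this sketch.
    """
    rng = random.Random(seed)
    a_lower, a_upper = t_lower + w, t_upper - w  # acceptance limits
    false_reject = 0  # conforming but rejected (producer's risk)
    false_accept = 0  # non-conforming but accepted (consumer's risk)
    for _ in range(n):
        true_value = rng.gauss(y0, u0)
        measured = rng.gauss(true_value, um)
        conforming = t_lower <= true_value <= t_upper
        accepted = a_lower <= measured <= a_upper
        if conforming and not accepted:
            false_reject += 1
        elif accepted and not conforming:
            false_accept += 1
    return false_reject, false_accept


# Hypothetical process and measurement parameters, all in mm.
for w in (-0.0025, 0.0, 0.0025):
    r_p, r_c = count_false_decisions(
        n=10_000, y0=100.0, u0=0.01, um=0.0025,
        t_lower=99.978, t_upper=100.022, w=w,
    )
    print(f"w = {w:+.4f} mm: false rejections = {r_p}, false acceptances = {r_c}")
```

Because the same seed reproduces the same simulated items for every $w$, enlarging the guard band can only move items from accepted to rejected, so false rejections cannot decrease and false acceptances cannot increase. This is the same trade-off visible across the columns of Table 1.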

w/mm | Model | Metrics | r | Risks | $[{A}_{L},{A}_{U}]$/mm |
---|---|---|---|---|---|
$-0.001475$ | In * Rp_min | $\mathrm{precision}=\mathrm{recall}=\mathrm{F1}=0.9647$; $\mathrm{kappa}=\mathrm{MCC}=0.662825$ | $-0.59$ | ${R}_{C}=3.16\%$; ${R}_{P}=3.16\%$ | $[99.976525,\ 100.023475]$ |
$-0.0004649$ | Im Rp_min | $\mathrm{precision}=\mathrm{recall}=\mathrm{F1}=0.9991$; $\mathrm{kappa}=\mathrm{MCC}=0.731748$ | $-0.18596$ | ${R}_{C}=0.086\%$; ${R}_{P}=0.086\%$ | $[99.97754,\ 100.0225]$ |
$-0.001548$ | In, Im Rp_min | $\mathrm{kappa}_{In}=\mathrm{kappa}_{Im}=0.6625$ | $-0.6192$ | ${R}_{C}=3.21\%$; ${R}_{P}=3.09\%$ | $[99.976452,\ 100.023548]$ |
$0.001003$ | In, Im Rc_min | $\mathrm{kappa}_{In}=\mathrm{kappa}_{Im}=0.6355$ | $0.4012$ | ${R}_{C}=1.84\%$; ${R}_{P}=6.28\%$ | $[99.979,\ 100.021]$ |
$-0.001808$ | In, Im Rp_min | $\mathrm{MCC}_{In}=\mathrm{MCC}_{Im}=0.6614$ | $-0.7232$ | ${R}_{C}=3.36\%$; ${R}_{P}=2.85\%$ | $[99.97619,\ 100.0238]$ |
$0.001282$ | In, Im Rc_min | $\mathrm{MCC}_{In}=\mathrm{MCC}_{Im}=0.6444$ | $0.5128$ | ${R}_{C}=0.019\%$; ${R}_{P}=0.38\%$ | $[99.97928,\ 100.0207]$ |
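
The metrics named in the table can be computed from the four cells of a binary confusion matrix using their standard definitions. The sketch below assumes that the positive class is "conforming", so that a false negative is a conforming item that was rejected and a false positive is a non-conforming item that was accepted; this mapping and the counts used are illustrative assumptions, not data from the study.

```python
import math


def confusion_metrics(tp, fp, fn, tn):
    """Return precision, recall, F1, Cohen's kappa and MCC for a 2x2 matrix."""
    n = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_observed = (tp + tn) / n
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (p_observed - p_expected) / (1 - p_expected)
    # Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
        "mcc": mcc,
    }


# Hypothetical counts with equal false acceptances and false rejections.
balanced = confusion_metrics(tp=900, fp=30, fn=30, tn=40)
for name, value in balanced.items():
    print(f"{name}: {value:.6f}")
```

When the false-positive and false-negative counts are equal, precision, recall and F1 coincide, and kappa equals MCC. This matches the first two rows of the table, where ${R}_{C}={R}_{P}$ and the corresponding metrics are reported as equal.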

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Božić, D.; Runje, B.; Lisjak, D.; Kolar, D.
Metrics Related to Confusion Matrix as Tools for Conformity Assessment Decisions. *Appl. Sci.* **2023**, *13*, 8187.
https://doi.org/10.3390/app13148187
