# Alternative Analysis Approaches for the Assessment of Pilot Bioavailability/Bioequivalence Studies


## Abstract


arithmetic (A_{mean}) and geometric (G_{mean}) mean ƒ_{2} factor approaches were investigated. Method performance was measured with a confusion matrix. The G_{mean} ƒ_{2} factor using a cut-off of 35 was the most appropriate method under the simulated conditions, enabling more accurate conclusions on the potential of test formulations with a reduced sample size. For simplification, a decision tree is also proposed for appropriate planning of the sample size and subsequent analysis approach to be followed in pilot BA/BE trials.

## 1. Introduction

## 2. Materials and Methods

_{a}) (Figure 1).

#### 2.1. Study Design

#### 2.2. Population Pharmacokinetic Simulation

A_{GI} represents the amount in the gastrointestinal tract, A_{1} the amount in the organism, k_{a} the absorption rate constant, k_{e} the elimination rate constant, V the apparent volume of distribution and C the plasma concentration.

k_{a}, V, and k_{e} are presented in Table 1. All parameters followed a log-normal distribution (Equation (3)). Absolute bioavailability (F) was considered to have a mean value of 0.9 (Table 1), and no variability was tested for this parameter.
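As a minimal sketch of how individual parameters might be drawn under a log-normal model, the snippet below samples k_{a}, V and k_{e} around the Table 1 values, treated here as typical (median) values. The parameterization (θ_i = θ_pop·exp(η), with η ~ N(0, ω²) and ω² = ln(CV² + 1)) is a common convention assumed for illustration and is not taken from Equation (3); the 30% CV, the function names and the seed are likewise hypothetical.

```python
import math
import random

# Typical values from Table 1; F is fixed at 0.9 with no variability.
POP_VALUES = {"ka": 1.22, "V": 58.8, "ke": 0.150}
F = 0.9


def omega_from_cv(cv):
    """SD of the normal random effect that yields a log-normal CV of `cv` (assumed convention)."""
    return math.sqrt(math.log(cv ** 2 + 1.0))


def sample_subject(rng, cv=0.30):
    """Draw one subject's parameters as theta_i = theta_pop * exp(eta), eta ~ N(0, omega^2)."""
    omega = omega_from_cv(cv)
    return {name: value * math.exp(rng.gauss(0.0, omega)) for name, value in POP_VALUES.items()}


rng = random.Random(2023)  # arbitrary seed, used only for reproducibility
subjects = [sample_subject(rng) for _ in range(12)]
```

Because the random effect enters through an exponential, every sampled parameter stays positive, which is the usual motivation for this choice.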

k_{a} value as 30% of the reference product mean k_{a} (i.e., truly bioinequivalent) (Figure 1, Covariate Model).
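The effect of this change in k_{a} on the peak concentration can be illustrated with the closed-form solution of a one-compartment model with first-order absorption and elimination. The sketch below uses the Table 1 values; the dose is a hypothetical placeholder (it is not stated in the text), and reading "30% of the reference" as k_{a,test} = 0.30·k_{a,ref} is an assumption.

```python
import math

# Table 1 values for the reference product
KA_REF, V, KE, F = 1.22, 58.8, 0.150, 0.9
DOSE = 50_000.0  # µg; hypothetical dose, not given in the text


def concentration(t, ka, ke=KE, v=V, f=F, dose=DOSE):
    """Plasma concentration (µg/L) at time t (h), first-order absorption and elimination."""
    return f * dose * ka / (v * (ka - ke)) * (math.exp(-ke * t) - math.exp(-ka * t))


def tmax(ka, ke=KE):
    """Time of peak concentration (h) for the one-compartment oral model."""
    return math.log(ka / ke) / (ka - ke)


ka_test = 0.30 * KA_REF  # assumed reading of "30% of the reference product mean ka"
cmax_ref = concentration(tmax(KA_REF), KA_REF)
cmax_test = concentration(tmax(ka_test), ka_test)
cmax_ratio = cmax_test / cmax_ref  # does not depend on the assumed dose
```

Under these assumptions the test-to-reference C_{max} ratio comes out at about 0.72 and the test t_{max} is delayed, which is the behaviour expected of a truly bioinequivalent formulation with unchanged extent of absorption.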

maximum plasma concentration (C_{max}) and area under the plasma concentration-time curve (AUC) were derived for each pharmacokinetic profile. The AUC typically reflects the extent of drug absorption, whereas C_{max} is considered to reflect the absorption rate [1,2]. C_{max} usually shows larger variation than AUC, as the parameter depends strongly on the selection of sampling times. Thus, as the risk of failing to demonstrate bioequivalence is higher for the rate of drug absorption, the simulations performed only covered the effect of variability on the bioequivalence of C_{max}.

#### 2.3. Simulations Analysis

arithmetic (A_{mean}) and geometric (G_{mean}) mean ƒ_{2} factor approaches were also investigated.

#### 2.3.1. Average Bioequivalence Analysis

C_{max}. A linear model was applied, using sequence, subject nested within sequence, period and treatment as fixed effects [1,8,9].

μ_{T} and μ_{R} are the population average responses (i.e., the LSM) of the ln-transformed measure for test and reference formulations, respectively. Hence, for the back-transformed data, the hypotheses for average bioequivalence can be expressed as

H_{1}) is shown by rejecting the null hypothesis (H_{0}) of average bioinequivalence, i.e., the decision of bioequivalence is based on whether the 90% CI ($100\cdot\left(1-2\alpha\right)\%$) of the test-to-reference GMR is within the regulatory acceptance interval of [80.00–125.00]% [3,7,10,11].
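For orientation, the interval hypotheses and confidence interval described here are commonly written as below. This is the standard textbook form on the ln scale, assuming a balanced 2×2 crossover with n subjects in total; the symbols $\hat{\mu}_{T}$, $\hat{\mu}_{R}$ (estimated LSMs) and $s^{2}$ (mean square error of the ANOVA) are notation chosen for this sketch and may differ from the authors' own equations.

$$\mathrm{H}_{0}:\ \mu_{T}-\mu_{R}\le \ln(0.80)\ \ \mathrm{or}\ \ \mu_{T}-\mu_{R}\ge \ln(1.25) \qquad \mathrm{vs.} \qquad \mathrm{H}_{1}:\ \ln(0.80)<\mu_{T}-\mu_{R}<\ln(1.25)$$

$$90\%\ \mathrm{CI}=100\cdot\exp\left[\left(\hat{\mu}_{T}-\hat{\mu}_{R}\right)\pm t_{0.95,\,n-2}\cdot\sqrt{\frac{2\cdot s^{2}}{n}}\right]\%$$

With α = 0.05, bioequivalence is concluded when both limits of this interval fall within 80.00% and 125.00%.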

^{2}is the mean square error obtained from the ANOVA model of the ln-transformed parameters [1,8,9].

Phoenix^{®} WinNonlin^{®} version 8.3 (Certara USA Inc., Princeton, NJ, USA).

#### 2.3.2. Centrality of the Test-to-Reference GMR

#### 2.3.3. Bootstrap Bioequivalence Analysis

#### 2.3.4. Similarity ƒ_{2} Factor

ƒ_{2} factor is a mathematical index widely used to compare dissolution profiles, evaluating their similarity, using the percentage of drug dissolved per unit of time. The similarity ƒ_{2} factor, proposed by Moore and Flanner in 1996 [16], is derived from the mean squared difference, and can be calculated as a logarithmic transformation of the reciprocal square root of the mean of the squared differences at all points:

$$f_{2}=50\cdot\log_{10}\left\{100\cdot\left[1+\frac{1}{n}\sum_{t=1}^{n}\left(\overline{\mathrm{R}}_{t}-\overline{\mathrm{T}}_{t}\right)^{2}\right]^{-0.5}\right\}$$

where ƒ_{2} is the similarity factor, n is the number of time points, and $\overline{\mathrm{R}}_{t}$ and $\overline{\mathrm{T}}_{t}$ are the mean percentage of drug dissolved at time t after initiation of the study, for reference and test products, respectively [1,16,17].

The ƒ_{2} similarity factor ranges from 0 (when $\overline{\mathrm{R}}_{t}-\overline{\mathrm{T}}_{t}=100\%$, at all t) to 100 (when $\overline{\mathrm{R}}_{t}-\overline{\mathrm{T}}_{t}=0\%$, at all t). Therefore, applying Equation (8), an average difference of 10%, 15%, and 20% across all measured time points results in an ƒ_{2} value of 50, 41, and 35, respectively (Figure 2). EMA [1] and FDA [18,19] have set a public standard of an ƒ_{2} value between 50 and 100, i.e., a maximum mean difference of 10%, to indicate similarity between two dissolution profiles.
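The correspondence between mean difference and ƒ_{2} value quoted above can be checked numerically. The sketch below implements the usual Moore and Flanner form, ƒ_{2} = 50·log10(100/√(1 + mean squared difference)); the function name and the example reference profile are hypothetical.

```python
import math


def f2_factor(reference, test):
    """Similarity factor: f2 = 50 * log10(100 / sqrt(1 + mean squared difference))."""
    if len(reference) != len(test) or not reference:
        raise ValueError("profiles must be non-empty and of equal length")
    msd = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    return 50.0 * math.log10(100.0 / math.sqrt(1.0 + msd))


# Hypothetical reference profile (% dissolved); the test is shifted by a constant difference.
reference = [20.0, 40.0, 60.0, 80.0]
f2_by_difference = {
    d: f2_factor(reference, [r - d for r in reference]) for d in (10.0, 15.0, 20.0)
}
```

For constant differences of 10%, 15% and 20%, the values round to 50, 41 and 35, which matches the cut-offs used in this work; identical profiles give exactly 100.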

ƒ_{2} was applied as an alternative to the average bioequivalence analysis. The similarity between test and reference products by means of ƒ_{2} was evaluated through the comparison of arithmetic (A_{mean}) and geometric (G_{mean}) means of plasma concentration-time profiles derived from the simulated individual pharmacokinetic profiles. ƒ_{2} was used to assess the similarity in the rate of drug absorption by normalizing test and reference mean concentration-time profiles to the maximum plasma concentration (C_{max}) derived from the mean reference profile, until reference C_{max} is observed (reference t_{max}) (Equation (9)).

C_{max} of the reference mean concentration-time profile, and $\mathrm{t}_{\mathrm{max},R}$ the time of observation of $\mathrm{C}_{\mathrm{max},R}$. The similarity ƒ_{2} factor is calculated as

_{max}, and $\mathrm{R}_{t}^{N}$ and $\mathrm{T}_{t}^{N}$ are the normalized concentrations at time t for reference and test products, respectively.
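A rough sketch of this normalization step is given below. It scales both mean profiles to a percentage of the reference C_{max}, keeps the time points up to and including the reference t_{max}, and then applies the ƒ_{2} formula. Expressing the normalized values as percentages, counting the pre-dose point, and the example concentrations are all assumptions made for illustration.

```python
import math


def normalized_f2(times, ref_mean, test_mean):
    """f2 on profiles scaled to % of reference Cmax, using time points up to reference tmax."""
    cmax_ref = max(ref_mean)
    i_tmax = ref_mean.index(cmax_ref)  # index of the reference tmax
    ref_n = [100.0 * c / cmax_ref for c in ref_mean[: i_tmax + 1]]
    test_n = [100.0 * c / cmax_ref for c in test_mean[: i_tmax + 1]]
    msd = sum((r - t) ** 2 for r, t in zip(ref_n, test_n)) / len(ref_n)
    return times[i_tmax], 50.0 * math.log10(100.0 / math.sqrt(1.0 + msd))


# Hypothetical mean concentration-time profiles (µg/L), for illustration only.
times = [0.0, 0.5, 1.0, 2.0, 4.0, 8.0]
ref_mean = [0.0, 300.0, 500.0, 600.0, 450.0, 250.0]
test_mean = [0.0, 240.0, 440.0, 570.0, 460.0, 260.0]
tmax_ref, f2 = normalized_f2(times, ref_mean, test_mean)
```

Restricting the comparison to the ascending part of the reference profile is what makes the index sensitive to the rate, rather than the extent, of absorption.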

ƒ_{2} factor was tested for differences between test and reference formulations’ mean concentration-time profiles of 10%, 15% and 20%. Consequently, the interval hypotheses for the ƒ_{2} factor can be formulated as

#### 2.4. Performance Measurement

H_{0} (Equation (12)); and (ii) the type II error, which concerns failing to reject a false H_{0} (Equation (13)) [3]. The probabilities of making type I and type II errors are given as

A_{mean} and G_{mean} ƒ_{2} factor evaluated with a cut-off of 35, 41, and 50), a confusion matrix, i.e., a cross-tabulation of the observed and predicted classes with associated statistics, was created (Table 2).

- Sensitivity, also referred to as power, recall or true positive rate, measures the capacity of the model to correctly identify bioequivalent test and reference formulations. In other words, it is the probability of correctly rejecting H_{0} when H_{0} is false (Table 2).
  $$\mathrm{Sensitivity}\ \mathrm{or}\ \mathrm{Power}=1-\beta=P(\mathrm{reject}\ \mathrm{H}_{0}\mid \mathrm{H}_{0}\ \mathrm{is}\ \mathrm{false})$$
  When the test recognizes all the bioequivalent formulations (i.e., no false negatives), Sensitivity = 1; when the test does not recognize any of the bioequivalent formulations, Sensitivity = 0.
- Specificity, also referred to as true negative rate, measures the capacity of the model to correctly identify bioinequivalent test and reference formulations. In other words, it is the probability of correctly failing to reject H_{0} when H_{0} is true (Table 2).
  $$\mathrm{Specificity}=1-\alpha=P(\mathrm{fail}\ \mathrm{to}\ \mathrm{reject}\ \mathrm{H}_{0}\mid \mathrm{H}_{0}\ \mathrm{is}\ \mathrm{true})$$
  When the test recognizes all the bioinequivalent formulations (i.e., no false positives), Specificity = 1; when the test does not recognize any of the bioinequivalent formulations, Specificity = 0.
- Precision, also referred to as positive predictive value (PPV), measures the correctness achieved in bioequivalent predictions (Table 2). When PPV = 1, all identified bioequivalent formulations are truly bioequivalent.
- Negative Predictive Value (NPV), which measures the correctness achieved in bioinequivalent predictions (Table 2). When NPV = 1, all identified bioinequivalent formulations are truly bioinequivalent.
- Accuracy, which represents the ratio between the correctly predicted instances (bioequivalent and bioinequivalent) and the total number of instances (Table 2). When Accuracy = 1, the test correctly predicted all the bioequivalent and bioinequivalent formulations.
- F_{1} score, which is the harmonic mean of Sensitivity and Precision.
  $$\mathrm{F}_{1}=\frac{2\cdot\mathrm{Sensitivity}\cdot\mathrm{Precision}}{\mathrm{Sensitivity}+\mathrm{Precision}}$$
  The F_{1} score is independent of the number of samples correctly classified as negative. An F_{1} = 1 indicates perfect precision and sensitivity; for an F_{1} = 0, either precision or sensitivity is 0.
- Matthews Correlation Coefficient (MCC), which measures the correlation coefficient between the true classes and the method-predicted classes.
  $$\mathrm{MCC}=\frac{Cov\left(t,p\right)}{\sigma_{t}\cdot\sigma_{p}}=\frac{\mathrm{TP}\cdot\mathrm{TN}-\mathrm{FP}\cdot\mathrm{FN}}{\sqrt{\left(\mathrm{TP}+\mathrm{FP}\right)\cdot\left(\mathrm{TP}+\mathrm{FN}\right)\cdot\left(\mathrm{TN}+\mathrm{FP}\right)\cdot\left(\mathrm{TN}+\mathrm{FN}\right)}}$$
  where Cov(t, p) is the covariance of the true classes t and the predicted classes p, and σ_{t} and σ_{p} are their standard deviations, respectively [21]. An MCC = 1 indicates a perfect prediction; MCC = 0 indicates that the prediction is no better than random; and MCC = −1 indicates total disagreement between prediction and observation.
- Cohen’s Kappa (κ) statistic, which is a measure of concordance for categorical data that measures agreement relative to what would be expected by chance.
  $$\kappa=\frac{2\cdot\left(\mathrm{TP}\cdot\mathrm{TN}-\mathrm{FP}\cdot\mathrm{FN}\right)}{\left(\mathrm{TP}+\mathrm{FP}\right)\cdot\left(\mathrm{FP}+\mathrm{TN}\right)+\left(\mathrm{TP}+\mathrm{FN}\right)\cdot\left(\mathrm{FN}+\mathrm{TN}\right)}$$
  When there is complete agreement, κ = 1; when agreement is no better than expected by chance, κ = 0; and when there is complete disagreement, κ = −1.
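The statistics listed above can all be derived from the four cells of the confusion matrix. The following sketch computes them for a binary classification in which "bioequivalent" is the positive class; the function name and the example counts are hypothetical, Cohen's κ is written in its usual two-class closed form, and no guard against empty rows or columns (zero denominators) is included.

```python
import math


def confusion_metrics(tp, fn, fp, tn):
    """Summary statistics of a 2x2 confusion matrix (positive class = bioequivalent)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    npv = tn / (tn + fn)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    f1 = 2 * sensitivity * precision / (sensitivity + precision)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    # Two-class closed form of Cohen's kappa (sum, not product, in the denominator)
    kappa = 2 * (tp * tn - fp * fn) / (
        (tp + fp) * (fp + tn) + (tp + fn) * (fn + tn)
    )
    return {
        "sensitivity": sensitivity, "type_II_error": 1 - sensitivity,
        "specificity": specificity, "type_I_error": 1 - specificity,
        "precision": precision, "npv": npv, "accuracy": accuracy,
        "f1": f1, "mcc": mcc, "kappa": kappa,
    }


# Hypothetical counts: 1000 truly bioequivalent and 1000 truly bioinequivalent trials.
metrics = confusion_metrics(tp=940, fn=60, fp=10, tn=990)
```

With these illustrative counts, sensitivity is 0.94 and specificity 0.99, while MCC and κ both land near 0.93, showing how the two chance-corrected statistics tend to agree when the classes are balanced.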

## 3. Results

#### 3.1. Simulated Pharmacokinetic Data

C_{max} for both test and reference products was approximately 642.5 µg/L (geometric coefficient of variation [GCV%] ≈ 6%), being reached between 0.75 and 4 h (median t_{max} = 2.25 h). AUC from pre-dose until the last sampling time was approximately 4950 µg·h/L (GCV% ≈ 3%). As expected, no differences between test and reference products were observed for these NCA parameters (Appendix SA.1.2).

t_{max} for the test product (t_{max} = 3.5 h [1.75–8 h]), as well as a 30% reduction of C_{max} (G_{mean} = 460.69 µg/L [GCV% ≈ 6%]), as a consequence of the differences in k_{a} between test and reference products. No differences were observed for AUC, as test and reference products only presented differences in k_{a}, and not in F (Appendix SA.1.2).

k_{a} (IIV and IOV) did not greatly affect the distribution of C_{max} or AUC values (GCV% ≈ 7–20%); however, it increased the time range for the observation of C_{max}. t_{max} values ranged from 0.5 to 6 h for the reference product, from 0.25 to 8 h for a truly bioequivalent test, and from 1 to 12 h for a truly bioinequivalent test (Appendix SA.2.3).

k_{e} was not associated with higher dispersion of C_{max} values, but it was associated with a wider time range for t_{max} (from 0.5 to 8 h for reference and truly bioequivalent test; and from 1 to 12 h for truly bioinequivalent test). Increased variability in k_{e} induced increased variability in AUC (GCV% ≈ 27–40%) (Appendix SA.4.3).

C_{max} and AUC values (GCV% ≈ 30–48%). However, no differences were observed for the t_{max} range (Appendix SA.3.3).

#### 3.2. Bioequivalence Evaluation

F_{1} in Figure 9, for MCC in Figure 10 and for Cohen’s κ in Figure 11.

C_{max} bioequivalence metric. The variability tested for the other model parameters had no relevant impact.

#### 3.2.1. Average Bioequivalence Method

#### 3.2.2. Centrality of the Test-to-Reference GMR Method

#### 3.2.3. Bootstrap Bioequivalence Method

#### 3.2.4. Similarity ƒ_{2} Factor Method

ƒ_{2} factor derived from the A_{mean} and G_{mean} pharmacokinetic profiles behaved similarly. For an IOV of 20% in V, and using a cut-off of 35 (i.e., to detect a mean difference of 20%), the ƒ_{2} method could correctly identify more than 99% of truly bioequivalent test formulations in studies with 12 and 30 subjects. When increasing IOV to 30%, the sensitivity slightly decreased to 94% in studies with 12 subjects but remained higher than 98% in studies with 30 subjects. For the highest tested variability (IOV of 45%), the ƒ_{2} factor derived from both A_{mean} and G_{mean} profiles was found to be a much more sensitive approach than the standard average bioequivalence approach, with >76% and 96% of truly bioequivalent test formulations identified in studies with 12 and 30 subjects, respectively (Table 6 and Table 7, and Figure 3).

ƒ_{2} factor method still performed better than the average bioequivalence method. As expected, the sensitivity slightly decreased when using a higher cut-off; however, differences in sensitivity between the 35 and 41 cut-off values were only noticeable from an IOV of 30% onwards. For an IOV of 30% in V, the ƒ_{2} method could correctly identify more than 84% of truly bioequivalent test formulations with 12 subjects (nearly a 10% decrease in comparison to a cut-off of 35) and 98% with 30 subjects (no difference between the two cut-off values). Moreover, for the highest tested variability (IOV of 45%), the sensitivity of the ƒ_{2} factor method for both A_{mean} and G_{mean} profiles using a cut-off of 41 decreased to nearly 66% and 88% (nearly a 10% decrease in comparison to a cut-off of 35) with 12 and 30 subjects, respectively (Table 8 and Table 9, and Figure 3).

ƒ_{2} factor method performed slightly worse than the average bioequivalence method in studies simulated for the lowest sample size (12 subjects) with the highest variability (IOV of 45%) on k_{a}. In this case, a sensitivity of 88% was attained. However, regarding the different variability scenarios in V, the ƒ_{2} factor method using a cut-off of 50 was always more sensitive than the average bioequivalence method for an IOV ≥ 20%. Nevertheless, as expected, the sensitivity decreased compared to the other tested cut-offs. For an IOV of 20% in V and using a cut-off value of 50, the ƒ_{2} factor method correctly predicted nearly 80% of the truly bioequivalent test formulations with only 12 subjects and 99% with 30 subjects. For an IOV of 30%, the ƒ_{2} factor method showed a sensitivity of more than 60% in studies with 12 subjects and more than 90% with 30 subjects. For the highest tested variability (IOV of 45%), the ƒ_{2} factor correctly predicted almost 50% of the truly bioequivalent test formulations with 12 subjects and more than 64% with 30 subjects (Table 10 and Table 11, and Figure 3).

ƒ_{2} factor method using different cut-offs in comparison to the average bioequivalence method, no inflation of type I error (>5%) was induced with the ƒ_{2} factor method for A_{mean} and G_{mean} pharmacokinetic profiles, using all cut-off values (Table 5, Table 6, Table 7, Table 8, Table 9, Table 10 and Table 11, Figure 5).

#### 3.2.5. Comparison of Average Bioequivalence, Centrality of the Point Estimate, Bootstrap Bioequivalence, and Similarity ƒ_{2} Factor Methods

F_{1} and κ were calculated in order to select the best methodology to assess the potential of a test formulation to be bioequivalent to a reference formulation on the rate of drug absorption, based on pilot BA/BE trials.

For the ƒ_{2} factor method derived from A_{mean} and G_{mean} pharmacokinetic profiles, the accuracy was above 80% and 94% in studies with 12 and 30 subjects, respectively, using a cut-off of 35 (Table 6 and Table 7) and a cut-off of 41 (Table 8 and Table 9); while using a cut-off of 50, accuracy ranged from 74% to 82% in studies with 12 and 30 subjects, respectively (Table 10 and Table 11).

F_{1}) (Figure 9). For an IOV of 20% in V, and for studies with 12 and 30 subjects, F_{1} estimates for the average bioequivalence ranged between 71.8% and 99.5%, respectively (Table 3); for the centrality of the GMR, they ranged between 88.3% and 99.5% (Table 4); and for bootstrap bioequivalence, between 94.3% and 99.5% (Table 5). For the ƒ_{2} factor derived from A_{mean} and G_{mean} pharmacokinetic profiles, F_{1} was above 99% using a cut-off of 35 (Table 6 and Table 7) and above 98% using a cut-off of 41 (Table 8 and Table 9), while using a cut-off of 50, F_{1} ranged from 88% to 99% in studies with 12 and 30 subjects, respectively (Table 10 and Table 11).

F_{1} decreased markedly, ranging between 25.9% and 86.4%, respectively (Table 3). For the same sample sizes, the centrality of the GMR method showed an F_{1} between 70.8% and 91.9% (Table 4), respectively, and the bootstrap bioequivalence method presented an F_{1} between 83.5% and 96.4% (Table 5). For the ƒ_{2} factor derived from A_{mean} and G_{mean} pharmacokinetic profiles, F_{1} was above 96% using a cut-off of 35 (Table 6 and Table 7), and above 91% using a cut-off of 41 (Table 8 and Table 9). Using a cut-off of 50, F_{1} ranged from 75% to 95% in studies with 12 and 30 subjects, respectively (Table 10 and Table 11).

F_{1} decreased to 2% and 34.7%, respectively (Table 3). For the same sample sizes, the centrality of the GMR showed an F_{1} between 49.3% and 69.2% (Table 4), respectively, and the bootstrap bioequivalence method presented an F_{1} of 68.9% and 77.2% (Table 5). For the ƒ_{2} factor derived from A_{mean} and G_{mean} pharmacokinetic profiles, F_{1} ranged from 85% to 98% using a cut-off of 35 (Table 6 and Table 7), and from 80% to 94% using a cut-off of 41 (Table 8 and Table 9); while using a cut-off of 50, F_{1} ranged from 65% to 78% in studies with 12 and 30 subjects, respectively (Table 10 and Table 11).

For the ƒ_{2} factor derived from A_{mean} and G_{mean} pharmacokinetic profiles, MCC ranged from 99% to 100% using a cut-off of 35 (Table 6 and Table 7), and from 97% to 100% using a cut-off of 41 (Table 8 and Table 9); while using a cut-off of 50, MCC ranged from 80% to 98% in studies with 12 and 30 subjects, respectively (Table 10 and Table 11).

For the ƒ_{2} factor derived from A_{mean} and G_{mean} pharmacokinetic profiles, MCC ranged from 94% to 100% using a cut-off of 35 (Table 6 and Table 7), and from 85% to 98% using a cut-off of 41 (Table 8 and Table 9); while using a cut-off of 50, MCC ranged from 66% to 91% in studies with 12 and 30 subjects, respectively (Table 10 and Table 11).

For the ƒ_{2} factor derived from A_{mean} and G_{mean} pharmacokinetic profiles, MCC ranged from 77% to 96% using a cut-off of 35 (Table 6 and Table 7), and from 70% to 89% using a cut-off of 41 (Table 8 and Table 9); while using a cut-off of 50, MCC ranged from 56% to 70% in studies with 12 and 30 subjects, respectively (Table 10 and Table 11).

For the ƒ_{2} factor derived from A_{mean} and G_{mean} pharmacokinetic profiles, κ was above 99% using a cut-off of 35 (Table 6 and Table 7), and above 97% using a cut-off of 41 (Table 8 and Table 9); while using a cut-off of 50, κ ranged from 79% to 99% in studies with 12 and 30 subjects, respectively (Table 10 and Table 11).

For the ƒ_{2} factor derived from A_{mean} and G_{mean} pharmacokinetic profiles, κ was above 94% using a cut-off of 35 (Table 6 and Table 7), and above 84% using a cut-off of 41 (Table 8 and Table 9); while using a cut-off of 50, κ ranged from 61% to 92% in studies with 12 and 30 subjects, respectively (Table 10 and Table 11).

For the ƒ_{2} factor derived from A_{mean} and G_{mean} pharmacokinetic profiles, κ ranged from 75% to 96% using a cut-off of 35 (Table 6 and Table 7), and from 66% to 89% using a cut-off of 41 (Table 8 and Table 9); while using a cut-off of 50, κ ranged from 48% to 66% in studies with 12 and 30 subjects, respectively (Table 10 and Table 11).

## 4. Discussion

C_{max} values (GCV% ≈ 30–48%). Hence, the within-individual variability (IOV) in V was the identified variability with the highest impact on the bioequivalence evaluation of C_{max}.

A_{mean} and G_{mean} ƒ_{2} factor evaluated with a cut-off of 35, 41, and 50) the relationship between type I and type II errors was studied. Moreover, accuracy, MCC, F_{1}, and κ were calculated in order to select the best methodology for evaluating the potential of a test formulation to be bioequivalent to a reference formulation on the rate of drug absorption, based on pilot BA/BE trials. For each bioequivalence evaluation method, results were consistent for all the calculated cross-tabulation matrix statistics.

A_{mean} and G_{mean} ƒ_{2} factor) showed a higher sensitivity/power than the established average bioequivalence method commonly used (Figure 3).

ƒ_{2} factor methodology was tested using cut-offs of 35, 41, and 50 for testing a mean difference of 20%, 15%, and 10%, respectively, between the concentration-time profiles of test and reference, until the reference C_{max}.

ƒ_{2} factor methodology, for an IOV of 20% (in V), 12 subjects are needed to target a power of at least 80%, either using a cut-off of 35 or 41, corresponding to 75% of the required sample size estimated with the same IOV and power assumptions, using the average bioequivalence analysis approach (Table 12). Using a cut-off of 50, 14 subjects would be needed (corresponding to 88% of the estimated sample size using the average bioequivalence analysis approach). For an IOV of 30%, and to target a power of at least 80%, 12 subjects are necessary using a cut-off of 35 or 41 (corresponding to 38% of the estimated sample size using the average bioequivalence analysis approach). For the highest tested variability (45%) and to target the same power level of at least 80%, pilot studies may be performed with 14 subjects (using a cut-off of 35) or 20 subjects (using a cut-off of 41), which correspond to 21% and 30%, respectively, of the estimated sample size using the average bioequivalence analysis approach.

ƒ_{2} factor cut-offs inflated type I error rate (a maximum type I error of only 1% was observed for A_{mean} ƒ_{2} factor with a cut-off of 35 for simulations performed using an IOV of 45% in V [Table 6]), the authors suggest the use of a cut-off of 35 instead of 41 and 50 for the ƒ_{2} factor methodology, under the simulated conditions.

ƒ_{2} factor derived from the A_{mean} and G_{mean} pharmacokinetic profiles, the G_{mean} ƒ_{2} factor using a cut-off of 35 was the method with the best relationship between avoiding type I and type II errors. It was also the method with higher accuracy and a better relationship between outcomes and predictions. Nevertheless, simulations are needed with more extreme scenarios (e.g., a true GMR of 90% and 80%) to better define a cut-off for this method.

G_{mean} ƒ_{2} factor and GMR and the absolute true mean difference of ln-transformed test and reference C_{max} (i.e., LSM) is shown in Figure 12. The higher the absolute true mean difference of ln-transformed test and reference C_{max}, the lower the ƒ_{2} factor. Moreover, this figure also shows that more accurate GMR and ƒ_{2} factor estimates are obtained with the increase of the number of simulated subjects in the trial (for true bioequivalent simulations, GMR is 100% and ƒ_{2} factor is 70; for true bioinequivalent simulations, GMR is 70% and ƒ_{2} factor is 20).

t_{max} of approximately 2 to 4 h.

## 5. Conclusions

G_{mean} ƒ_{2} factor using a cut-off of 35 was found to be the most appropriate method under the simulated conditions, enabling more accurate conclusions on the potential of test formulations with a reduced sample size. For simplification, a decision tree is also proposed for appropriate planning of the sample size and subsequent analysis approach to be followed in pilot BA/BE trials.

## Supplementary Materials

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## References

1. European Medicines Agency (EMA). Guideline on the Investigation of Bioequivalence (CPMP/EWP/QWP/1401/98 Rev. 1/Corr **). London. 20 January 2010. Available online: https://www.ema.europa.eu/en/documents/scientific-guideline/guideline-investigation-bioequivalence-rev1_en.pdf (accessed on 15 September 2022).
2. U.S. Food and Drug Administration (FDA). Guidance for Industry: Bioequivalence Studies with Pharmacokinetic Endpoints for Drugs Submitted under an ANDA. Draft Guidance. August 2021. Available online: https://www.fda.gov/media/87219/download (accessed on 15 September 2022).
3. Chow, S.C.; Liu, J. Design and Analysis of Bioavailability and Bioequivalence Studies; Chapman and Hall/CRC: New York, NY, USA, 2008; ISBN 9780429140365.
4. Pan, G.; Wang, Y. Average Bioequivalence Evaluation: General Methods for Pilot Trials. J. Biopharm. Stat. **2006**, 16, 207–225.
5. Fuglsang, A. Pilot and Repeat Trials as Development Tools Associated with Demonstration of Bioequivalence. AAPS J. **2015**, 17, 678–683.
6. Moreno, I.; Ochoa, D.; Román, M.; Cabaleiro, T.; Abad-Santos, F. Utility of Pilot Studies for Predicting Ratios and Intrasubject Variability in High-Variability Drugs. Basic Clin. Pharmacol. Toxicol. **2016**, 119, 215–221.
7. U.S. Food and Drug Administration (FDA). Guidance for Industry: Statistical Approaches to Establishing Bioequivalence. Draft Guidance. December 2022. Available online: https://www.fda.gov/media/163638/download (accessed on 6 March 2023).
8. European Medicines Agency (EMA). Questions & Answers: Positions on Specific Questions Addressed to the Pharmacokinetics Working Party (PKWP) (EMA/618604/2008 Rev. 13). 19 November 2015. Available online: https://www.ema.europa.eu/en/documents/scientific-guideline/questions-answers-positions-specific-questions-addressed-pharmacokinetics-working-party_en.pdf (accessed on 17 September 2022).
9. European Medicines Agency (EMA). Guideline on the Investigation of Bioequivalence—Annex I (EMA/582648/2016). 21 September 2016. Available online: https://www.ema.europa.eu/en/documents/other/31-annex-i-statistical-analysis-methods-compatible-ema-bioequivalence-guideline_en.pdf (accessed on 15 September 2022).
10. Schuirmann, D.J. A Comparison of the Two One-Sided Tests Procedure and the Power. J. Pharmacokinet. Biopharm. **1987**, 15, 657–680.
11. Chow, S.C. Bioavailability and Bioequivalence in Drug Development. Wiley Interdiscip. Rev. Comput. Stat. **2014**, 6, 304–312.
12. Bonate, P.L. Pharmacokinetic-Pharmacodynamic Modeling and Simulation, 2nd ed.; Springer: New York, NY, USA, 2011; ISBN 978-1-4419-9485-1.
13. Pigeot, I.; Hauschke, D.; Shao, J. The Bootstrap in Bioequivalence Studies. J. Biopharm. Stat. **2011**, 21, 1126–1139.
14. Labes, D.; Schütz, H.; Lang, B. PowerTOST: Power and Sample Size for (Bio)Equivalence Studies. 2010. Available online: https://github.com/Detlew/PowerTOST (accessed on 20 May 2022).
15. Pigeot, I. The Bootstrap Percentile in Food and Drug Administration Regulations for Bioequivalence Assessment. Drug Inf. J. **2001**, 35, 1445–1453.
16. Moore, J.W.; Flanner, H.H. Mathematical Comparison of Curves with an Emphasis on in Vitro Dissolution Profiles. Pharm. Technol. **1996**, 20, 64–74.
17. Shah, V.P.; Tsong, Y.; Sathe, P.; Liu, J.-P. In Vitro Dissolution Profile Comparison—Statistics and Analysis of the Similarity Factor, F2. Pharm. Res. **1998**, 15, 889–896.
18. US Food and Drug Administration (FDA). Guidance for Industry: Waiver of In Vivo Bioavailability and Bioequivalence Studies for Immediate-Release Solid Oral Dosage Forms Based on a Biopharmaceutics Classification System. December 2017. Available online: https://www.gmp-compliance.org/files/guidemgr/UCM070246.pdf (accessed on 3 October 2022).
19. US Food and Drug Administration (FDA). Guidance for Industry: Dissolution Testing of Immediate Release Solid Oral Dosage Forms. August 1997. Available online: https://www.fda.gov/media/70936/download (accessed on 3 October 2022).
20. Kuhn, M. Building Predictive Models in R Using the Caret Package. J. Stat. Softw. **2008**, 28, 1–26.
21. Chicco, D.; Tötsch, N.; Jurman, G. The Matthews Correlation Coefficient (MCC) Is More Reliable than Balanced Accuracy, Bookmaker Informedness, and Markedness in Two-Class Confusion Matrix Evaluation. BioData Min. **2021**, 14, 13.

**Figure 2.**Distribution of ƒ

_{2}similarity factor as a function of mean difference. ƒ

_{2}similarity factor is derived from the mean squared difference and can be calculated as a function of the reciprocal of the mean squared-root transformation of the sum of square differences at all points. An average difference of 10%, 15%, and 20% from all measured time points results in a ƒ

_{2}value of 50 (red dotted lines), 41 (green dotted lines) and 35 (blue dotted lines), respectively.

**Figure 3.**Variation of sensitivity for the bioequivalence evaluation methods (average bioequivalence, centrality of the test-to-reference GMR, bootstrap bioequivalence analysis, and A

_{mean}and G

_{mean}ƒ

_{2}factor evaluated with a cut-off of 35) as function of the number of subjects, per tested variability for the different pharmacokinetic model parameters.

**Figure 4.**Variation of sensitivity for A

_{mean}and G

_{mean}ƒ

_{2}factor evaluated with a cut-off of 35, 41, and 50, as function of the number of subjects, per tested variability for the different pharmacokinetic model parameters.

**Figure 5.**Variation of specificity for the bioequivalence evaluation methods (average bioequivalence, centrality of the test-to-reference GMR, bootstrap bioequivalence analysis, and A

_{mean}and G

_{mean}ƒ

_{2}factor evaluated with a cut-off of 35) as function of the number of subjects, per tested variability for the different pharmacokinetic model parameters.

**Figure 6.**Variation of precision for the bioequivalence evaluation methods (average bioequivalence, centrality of the test-to-reference GMR, bootstrap bioequivalence analysis, and A

_{mean}and G

_{mean}ƒ

_{2}factor evaluated with a cut-off of 35) as function of the number of subjects, per tested variability for the different pharmacokinetic model parameters.

**Figure 7.**Variation of negative predictive value (NPV) for the bioequivalence evaluation methods (average bioequivalence, centrality of the test-to-reference GMR, bootstrap bioequivalence analysis, and A

_{mean}and G

_{mean}ƒ

_{2}factor evaluated with a cut-off of 35) as function of the number of subjects, per tested variability for the different pharmacokinetic model parameters.

**Figure 8.**Variation of accuracy for the bioequivalence evaluation methods (average bioequivalence, centrality of the test-to-reference GMR, bootstrap bioequivalence analysis, and A

_{mean}and G

_{mean}ƒ

_{2}factor evaluated with a cut-off of 35) as a function of the number of subjects, per tested variability for the different pharmacokinetic model parameters.

**Figure 9.**Variation of F

_{1}for the bioequivalence evaluation methods (average bioequivalence, centrality of the test-to-reference GMR, bootstrap bioequivalence analysis, and A

_{mean}and G

_{mean}ƒ

_{2}factor evaluated with a cut-off of 35) as a function of the number of subjects, per tested variability for the different pharmacokinetic model parameters.

**Figure 10.**Variation of Matthews correlation coefficient (MCC) for the bioequivalence evaluation methods (average bioequivalence, centrality of the test-to-reference GMR, bootstrap bioequivalence analysis, and A

_{mean}and G

_{mean}ƒ

_{2}factor evaluated with a cut-off of 35) as function of the number of subjects, per tested variability for the different pharmacokinetic model parameters.

**Figure 11.** Variation of Cohen’s κ for the bioequivalence evaluation methods (average bioequivalence, centrality of the test-to-reference GMR, bootstrap bioequivalence analysis, and A_{mean} and G_{mean} ƒ_{2} factor evaluated with a cut-off of 35) as a function of the number of subjects, per tested variability for the different pharmacokinetic model parameters.

**Figure 12.** Relationship between the G_{mean} ƒ_{2} factor and the test-to-reference GMR (above) or the absolute LSM difference (below), and the number of subjects (colour gradient), for all simulated truly bioequivalent (blue) and truly bioinequivalent (red) studies. Vertical dotted lines correspond to the maximum 20% difference between test and reference formulations tested by the average bioequivalence approach. Horizontal dotted lines correspond to the tested ƒ_{2} cut-off values of 50, 41, and 35.
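The three cut-off values can be related to an average profile difference with a short sketch of the standard similarity factor, ƒ_{2} = 50·log10(100/√(1 + mean squared difference)). The sketch assumes that both inputs are mean concentration-time profiles sampled at the same times and already normalised to percent (for example, of the reference maximum); that normalisation, the `>=` decision rule, and the function names `f2_factor` and `is_similar` are illustrative assumptions, not details taken from the study.

```python
import math


def f2_factor(reference, test):
    """Similarity factor f2 between two mean profiles sampled at the same times.

    Assumption: both profiles are expressed in percent, so that squared
    differences are on the 0-100 scale the f2 formula expects.
    """
    if len(reference) != len(test) or len(reference) == 0:
        raise ValueError("profiles must be non-empty and of equal length")
    mean_sq_diff = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    return 50.0 * math.log10(100.0 / math.sqrt(1.0 + mean_sq_diff))


def is_similar(f2, cut_off=35.0):
    """Hypothetical decision rule: profiles are called similar when f2 >= cut_off."""
    return f2 >= cut_off


if __name__ == "__main__":
    # Illustrative reference profile (percent of maximum), not study data.
    reference = [30.0, 70.0, 100.0, 80.0, 45.0]
    for shift in (10.0, 15.0, 20.0):
        test = [r - shift for r in reference]  # constant difference at every time point
        value = f2_factor(reference, test)
        print(f"constant difference of {shift:.0f}: f2 = {value:.2f}, "
              f"similar at 35: {is_similar(value)}")
```

Under these assumptions, a constant difference of 10, 15, or 20 percentage points at every time point gives ƒ_{2} of roughly 50, 41, and 35, respectively, which is one way to read the three horizontal lines in the figure.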

k_{a} (h^{−1}) | V (L) | k_{e} (h^{−1}) | F |
---|---|---|---|
1.22 | 58.8 | 0.150 | 0.900 |

k_{a}: absorption rate constant; k_{e}: elimination rate constant; V: volume of distribution; F: absolute bioavailability.

| | | Method Prediction: Bioequivalent | Method Prediction: Bioinequivalent | |
|---|---|---|---|---|
| Truly | Bioequivalent | TP | FN (Type II Error) | Sensitivity $\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}}$ |
| Truly | Bioinequivalent | FP (Type I Error) | TN | Specificity $\frac{\mathrm{TN}}{\mathrm{TN}+\mathrm{FP}}$ |
| | | Precision $\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FP}}$ | Negative Predictive Value $\frac{\mathrm{TN}}{\mathrm{TN}+\mathrm{FN}}$ | Accuracy $\frac{\mathrm{TP}+\mathrm{TN}}{\mathrm{TP}+\mathrm{TN}+\mathrm{FP}+\mathrm{FN}}$ |
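The matrix gives the five ratios that can be read directly off its margins. The three remaining statistics reported in the results tables (F_{1}, MCC, and κ) can be written from the same four counts. The expressions below are the standard textbook definitions for a 2 × 2 table; that these exact forms were used in the study is an assumption.

```latex
\begin{align}
F_{1} &= \frac{2\,\mathrm{TP}}{2\,\mathrm{TP}+\mathrm{FP}+\mathrm{FN}} \\[4pt]
\mathrm{MCC} &= \frac{\mathrm{TP}\cdot\mathrm{TN}-\mathrm{FP}\cdot\mathrm{FN}}
{\sqrt{(\mathrm{TP}+\mathrm{FP})(\mathrm{TP}+\mathrm{FN})(\mathrm{TN}+\mathrm{FP})(\mathrm{TN}+\mathrm{FN})}} \\[4pt]
\kappa &= \frac{2\,(\mathrm{TP}\cdot\mathrm{TN}-\mathrm{FP}\cdot\mathrm{FN})}
{(\mathrm{TP}+\mathrm{FP})(\mathrm{FP}+\mathrm{TN})+(\mathrm{TP}+\mathrm{FN})(\mathrm{FN}+\mathrm{TN})}
\end{align}
```

The first expression is algebraically the harmonic mean of sensitivity and precision, and the κ expression is the 2 × 2 special case of (p_o − p_e)/(1 − p_e), with p_o the observed agreement (accuracy) and p_e the agreement expected by chance.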

Average Bioequivalence (90% CI) | Sensitivity (%) | Type II Error (%) | Specificity (%) | Type I Error (%) | Precision (%) | NPV (%) | Accuracy (%) | F_{1} (%) | MCC (%) | κ (%) |
---|---|---|---|---|---|---|---|---|---|---|

Baseline | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

k_{a} | ||||||||||

30% IIV & 0% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 10% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 20% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 30% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

0% IIV & 45% IOV | 99.0–100 | 1.00–0.00 | 100 | 0.00 | 100 | 99.0–100 | 99.5–100 | 99.5–100 | 99.0–100 | 99.0–100 |

V | ||||||||||

30% IIV & 0% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 10% IOV | 99.0–100 | 1.00–0.00 | 100 | 0.00 | 100 | 99.0–100 | 99.5–100 | 99.5–100 | 99.0–100 | 99.0–100 |

30% IIV & 20% IOV | 56.0–99.0 | 44.0–1.00 | 100 | 0.00 | 100 | 69.4–99.0 | 78.0–99.5 | 71.8–99.5 | 62.4–99.0 | 56.0–99.0 |

30% IIV & 30% IOV | 15.0–76.0 | 85.0–24.0 | 99.0–100 | 1.00–0.00 | 93.75–100 | 53.8–80.6 | 57.0–88.0 | 25.9–86.4 | 25.8–78.3 | 14.0–76.0 |

0% IIV & 45% IOV | 1.00–21.0 | 99.0–79.0 | 100 | 0.00 | 100 | 50.3–55.9 | 50.5–60.5 | 1.98–34.7 | 7.09–34.3 | 1.00–21.0 |

k_{e} | ||||||||||

30% IIV & 0% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 10% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 20% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 30% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

0% IIV & 45% IOV | 99.0–100 | 1.00–0.00 | 100 | 0.00 | 100 | 99.0–100 | 99.5–100 | 99.5–100 | 99.0–100 | 99.0–100 |

F_{1}: harmonic mean of sensitivity and precision; κ: Cohen’s kappa; MCC: Matthews correlation coefficient; NPV: negative predictive value.
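As a check on how the table entries relate to one another, the following minimal sketch derives all ten statistics from the four confusion-matrix counts. It assumes 100 truly bioequivalent and 100 truly bioinequivalent simulated trials per scenario, which is inferred from the whole-number percentages in the table rather than stated here, and the function name `confusion_statistics` is hypothetical.

```python
import math


def confusion_statistics(tp, fn, fp, tn):
    """Return the cross-tabulated statistics (in percent) for a 2 x 2 confusion matrix."""
    n = tp + fn + fp + tn
    nan = float("nan")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp) if (tp + fp) else nan
    npv = tn / (tn + fn) if (tn + fn) else nan
    accuracy = (tp + tn) / n
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else nan
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else nan
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_exp = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n ** 2
    kappa = (accuracy - p_exp) / (1 - p_exp) if p_exp != 1 else nan
    stats = {
        "Sensitivity": sensitivity,
        "Type II Error": 1 - sensitivity,
        "Specificity": specificity,
        "Type I Error": 1 - specificity,
        "Precision": precision,
        "NPV": npv,
        "Accuracy": accuracy,
        "F1": f1,
        "MCC": mcc,
        "Kappa": kappa,
    }
    return {name: 100 * value for name, value in stats.items()}


if __name__ == "__main__":
    # Assumed counts: 56 of 100 truly bioequivalent trials detected,
    # and no truly bioinequivalent trial wrongly accepted.
    for name, value in confusion_statistics(tp=56, fn=44, fp=0, tn=100).items():
        print(f"{name}: {value:.1f}")
```

With these assumed counts the sketch returns a sensitivity of 56.0, NPV of 69.4, accuracy of 78.0, F_{1} of 71.8, MCC of 62.4, and κ of 56.0, matching the lower bounds of the V, 30% IIV & 20% IOV row for average bioequivalence.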

Test-to-Reference GMR Centrality | Sensitivity (%) | Type II Error (%) | Specificity (%) | Type I Error (%) | Precision (%) | NPV (%) | Accuracy (%) | F_{1} (%) | MCC (%) | κ (%) |
---|---|---|---|---|---|---|---|---|---|---|

Baseline | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

k_{a} | ||||||||||

30% IIV & 0% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 10% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 20% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 30% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

0% IIV & 45% IOV | 98.0–100 | 2.00–0.00 | 100 | 0.00 | 100 | 98.0–100 | 99.0–100 | 99.0–100 | 98.02–100 | 98.0–100 |

V | ||||||||||

30% IIV & 0% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 10% IOV | 95.0–100 | 5.00–0.00 | 100 | 0.00 | 100 | 95.2–100 | 97.5–100 | 97.4–100 | 95.1–100 | 95.0–100 |

30% IIV & 20% IOV | 79.0–99.0 | 21.0–1.00 | 100 | 0.00 | 100 | 82.6–99.0 | 89.5–99.5 | 88.3–99.5 | 80.8–99.0 | 79.0–99.0 |

30% IIV & 30% IOV | 57.0–85.0 | 43.0–15.0 | 96.0–100 | 4.00–0.00 | 93.4–100 | 69.1–87.0 | 76.5–92.5 | 70.8–91.9 | 57.6–86.0 | 53.0–85.0 |

0% IIV & 45% IOV | 36.0–54.0 | 64.0–46.0 | 90.0–98.0 | 10.0–2.00 | 78.3–96.4 | 58.4–68.1 | 63.0–76.0 | 49.3–69.2 | 30.9–57.9 | 26.0–52.0 |

k_{e} | ||||||||||

30% IIV & 0% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 10% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 20% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 30% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

0% IIV & 45% IOV | 99.0–100 | 1.00–0.00 | 100 | 0.00 | 100 | 99.0–100 | 99.5–100 | 99.5–100 | 99.0–100 | 99.0–100 |

F_{1}: harmonic mean of sensitivity and precision; κ: Cohen’s kappa; MCC: Matthews correlation coefficient; NPV: negative predictive value.

Bootstrap Bioequivalence (95% CI) | Sensitivity (%) | Type II Error (%) | Specificity (%) | Type I Error (%) | Precision (%) | NPV (%) | Accuracy (%) | F_{1} (%) | MCC (%) | κ (%) |
---|---|---|---|---|---|---|---|---|---|---|

Baseline | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

k_{a} | ||||||||||

30% IIV & 0% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 10% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 20% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 30% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

0% IIV & 45% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

V | ||||||||||

30% IIV & 0% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 10% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 20% IOV | 91.0–99.0 | 9.00–1.00 | 98.0–100 | 2.00–0.00 | 97.85–100 | 91.6–99.0 | 94.5–99.5 | 94.3–99.5 | 89.2–99.0 | 89.0–99.0 |

30% IIV & 30% IOV | 76.0–94.0 | 24.0–6.00 | 94.0–99.0 | 6.00–1.00 | 92.7–98.9 | 79.7–94.3 | 85.0–96.5 | 83.5–96.4 | 71.2–93.1 | 70.0–93.0 |

0% IIV & 45% IOV | 62.0–66.0 | 38.0–34.0 | 82.0–95.0 | 18.0–5.00 | 77.5–93.0 | 68.3–73.6 | 72.0–80.5 | 68.9–77.2 | 44.9–63.7 | 44.0–61.0 |

k_{e} | ||||||||||

30% IIV & 0% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 10% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 20% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 30% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

0% IIV & 45% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

F_{1}: harmonic mean of sensitivity and precision; κ: Cohen’s kappa; MCC: Matthews correlation coefficient; NPV: negative predictive value.

**Table 6.** Cross-Tabulated Matrix Statistics Calculated for Arithmetic Mean (A_{mean}) ƒ_{2} Factor, Using a Cut-Off of 35.

A_{mean} ƒ_{2} Factor (Cut-Off of 35) | Sensitivity (%) | Type II Error (%) | Specificity (%) | Type I Error (%) | Precision (%) | NPV (%) | Accuracy (%) | F_{1} (%) | MCC (%) | κ (%) |
---|---|---|---|---|---|---|---|---|---|---|

Baseline | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

k_{a} | ||||||||||

30% IIV & 0% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 10% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 20% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 30% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

0% IIV & 45% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

V | ||||||||||

30% IIV & 0% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 10% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 20% IOV | 99.0–100 | 1.00–0.00 | 100 | 0.00 | 100 | 99.0–100 | 99.5–100 | 99.5–100 | 99.0–100 | 99.0–100 |

30% IIV & 30% IOV | 94.0–98.0 | 6.00–2.00 | 100 | 0.00 | 100 | 94.3–98.0 | 97.0–99.0 | 96.9–99.0 | 94.2–98.0 | 94.0–98.0 |

0% IIV & 45% IOV | 76.0–96.0 | 24.0–4.00 | 99.0–100 | 1.00–0.00 | 98.7–100 | 80.5–96.2 | 87.5–98.0 | 85.9–98.0 | 77.1–96.1 | 75.0–96.0 |

k_{e} | ||||||||||

30% IIV & 0% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 10% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 20% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 30% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

0% IIV & 45% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

F_{1}: harmonic mean of sensitivity and precision; κ: Cohen’s kappa; MCC: Matthews correlation coefficient; NPV: negative predictive value.

**Table 7.** Cross-Tabulated Matrix Statistics Calculated for Geometric Mean (G_{mean}) ƒ_{2} Factor, Using a Cut-Off of 35.

G_{mean} ƒ_{2} Factor (Cut-Off of 35) | Sensitivity (%) | Type II Error (%) | Specificity (%) | Type I Error (%) | Precision (%) | NPV (%) | Accuracy (%) | F_{1} (%) | MCC (%) | κ (%) |
---|---|---|---|---|---|---|---|---|---|---|

Baseline | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

k_{a} | ||||||||||

30% IIV & 0% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 10% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 20% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 30% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

0% IIV & 45% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

V | ||||||||||

30% IIV & 0% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 10% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 20% IOV | 99.0–100 | 1.00–0.00 | 100 | 0.00 | 100 | 99.0–100 | 99.5–100 | 99.5–100 | 99.0–100 | 99.0–100 |

30% IIV & 30% IOV | 96.0–100 | 4.00–0.00 | 100 | 0.00 | 100 | 96.2–100 | 98.0–100 | 98.0–100 | 96.1–100 | 96.0–100 |

0% IIV & 45% IOV | 79.0–96.0 | 21.0–4.00 | 100 | 0.00 | 100 | 82.6–96.2 | 89.5–98.0 | 88.3–98.0 | 80.8–96.1 | 79.0–96.0 |

k_{e} | ||||||||||

30% IIV & 0% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 10% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 20% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 30% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

0% IIV & 45% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

F_{1}: harmonic mean of sensitivity and precision; κ: Cohen’s kappa; MCC: Matthews correlation coefficient; NPV: negative predictive value.

**Table 8.** Cross-Tabulated Matrix Statistics Calculated for Arithmetic Mean (A_{mean}) ƒ_{2} Factor, Using a Cut-Off of 41.

A_{mean} ƒ_{2} Factor (Cut-Off of 41) | Sensitivity (%) | Type II Error (%) | Specificity (%) | Type I Error (%) | Precision (%) | NPV (%) | Accuracy (%) | F_{1} (%) | MCC (%) | κ (%) |
---|---|---|---|---|---|---|---|---|---|---|

Baseline | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

k_{a} | ||||||||||

30% IIV & 0% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 10% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 20% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 30% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

0% IIV & 45% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

V | ||||||||||

30% IIV & 0% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 10% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 20% IOV | 98.0–100 | 2.00–0.00 | 100 | 0.00 | 100 | 98.0–100 | 99.0–100 | 99.0–100 | 98.0–100 | 98.0–100 |

30% IIV & 30% IOV | 84.0–98.0 | 16.0–2.00 | 100 | 0.00 | 100 | 86.2–98.0 | 92.0–99.0 | 91.3–99.0 | 85.1–98.0 | 84.0–98.0 |

0% IIV & 45% IOV | 66.0–88.0 | 34.0–12.0 | 100 | 0.00 | 100 | 74.6–89.3 | 83.0–94.0 | 79.5–93.6 | 70.2–88.6 | 66.0–88.0 |

k_{e} | ||||||||||

30% IIV & 0% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 10% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 20% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 30% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

0% IIV & 45% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

F_{1}: harmonic mean of sensitivity and precision; κ: Cohen’s kappa; MCC: Matthews correlation coefficient; NPV: negative predictive value.

**Table 9.** Cross-Tabulated Matrix Statistics Calculated for Geometric Mean (G_{mean}) ƒ_{2} Factor, Using a Cut-Off of 41.

G_{mean} ƒ_{2} Factor (Cut-Off of 41) | Sensitivity (%) | Type II Error (%) | Specificity (%) | Type I Error (%) | Precision (%) | NPV (%) | Accuracy (%) | F_{1} (%) | MCC (%) | κ (%) |
---|---|---|---|---|---|---|---|---|---|---|

Baseline | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

k_{a} | ||||||||||

30% IIV & 0% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 10% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 20% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 30% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

0% IIV & 45% IOV | 99.0 | 1.00 | 100 | 0.00 | 100 | 99.0 | 99.5 | 99.5 | 99.0 | 99.0 |

V | ||||||||||

30% IIV & 0% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 10% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 20% IOV | 97.0–100 | 3.00–0.00 | 100 | 0.00 | 100 | 97.1–100 | 98.5–100 | 98.5–100 | 97.0–100 | 97.0–100 |

30% IIV & 30% IOV | 89.0–98.0 | 11.0–2.00 | 100 | 0.00 | 100 | 90.1–98.0 | 94.5–99.0 | 94.2–99.0 | 89.5–98.0 | 89.0–98.0 |

0% IIV & 45% IOV | 69.0–89.0 | 31.0–11.0 | 100 | 0.00 | 100 | 76.3–90.1 | 84.5–94.5 | 81.7–94.2 | 72.6–89.5 | 69.0–89.0 |

k_{e} | ||||||||||

30% IIV & 0% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 10% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 20% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 30% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

0% IIV & 45% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

F_{1}: harmonic mean of sensitivity and precision; κ: Cohen’s kappa; MCC: Matthews correlation coefficient; NPV: negative predictive value.

**Table 10.** Cross-Tabulated Matrix Statistics Calculated for Arithmetic Mean (A_{mean}) ƒ_{2} Factor, Using a Cut-Off of 50.

A_{mean} ƒ_{2} Factor (Cut-Off of 50) | Sensitivity (%) | Type II Error (%) | Specificity (%) | Type I Error (%) | Precision (%) | NPV (%) | Accuracy (%) | F_{1} (%) | MCC (%) | κ (%) |
---|---|---|---|---|---|---|---|---|---|---|

Baseline | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

k_{a} | ||||||||||

30% IIV & 0% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 10% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 20% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 30% IOV | 99.0–100 | 1.00–0.00 | 100 | 0.00 | 100 | 99.0–100 | 99.5–100 | 99.5–100 | 99.0–100 | 99.0–100 |

0% IIV & 45% IOV | 90.0–99.0 | 10.0–1.00 | 100 | 0.00 | 100 | 90.9–99.0 | 95.0–99.5 | 94.7–99.5 | 90.5–99.0 | 90.0–99.0 |

V | ||||||||||

30% IIV & 0% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 10% IOV | 99.0–100 | 1.00–0.00 | 100 | 0.00 | 100 | 99.0–100 | 99.5–100 | 99.5–100 | 99.0–100 | 99.0–100 |

30% IIV & 20% IOV | 79.0–98.0 | 21.0–2.00 | 100 | 0.00 | 100 | 82.6–98.0 | 89.5–99.0 | 88.3–99.0 | 80.8–98.0 | 79.0–98.0 |

30% IIV & 30% IOV | 61.0–92.0 | 39.0–8.00 | 100 | 0.00 | 100 | 71.9–92.6 | 80.5–96.0 | 75.8–95.8 | 66.3–92.3 | 61.0–92.0 |

0% IIV & 45% IOV | 49.0–64.0 | 51.0–36.0 | 100 | 0.00 | 100 | 66.2–73.5 | 74.5–82.0 | 65.8–78.1 | 57.0–68.6 | 49.0–64.0 |

k_{e} | ||||||||||

30% IIV & 0% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 10% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 20% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 30% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

0% IIV & 45% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

F_{1}: harmonic mean of sensitivity and precision; κ: Cohen’s kappa; MCC: Matthews correlation coefficient; NPV: negative predictive value.

**Table 11.** Cross-Tabulated Matrix Statistics Calculated for Geometric Mean (G_{mean}) ƒ_{2} Factor, Using a Cut-Off of 50.

G_{mean} ƒ_{2} Factor (Cut-Off of 50) | Sensitivity (%) | Type II Error (%) | Specificity (%) | Type I Error (%) | Precision (%) | NPV (%) | Accuracy (%) | F_{1} (%) | MCC (%) | κ (%) |
---|---|---|---|---|---|---|---|---|---|---|

Baseline | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

k_{a} | ||||||||||

30% IIV & 0% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 10% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 20% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 30% IOV | 99.0–100 | 1.00–0.00 | 100 | 0.00 | 100 | 99.0–100 | 99.5–100 | 99.5–100 | 99.0–100 | 99.0–100 |

0% IIV & 45% IOV | 88.0–97.0 | 12.0–3.00 | 100 | 0.00 | 100 | 89.3–97.1 | 94.0–98.5 | 93.6–98.5 | 88.6–97.0 | 88.0–97.0 |

V | ||||||||||

30% IIV & 0% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 10% IOV | 99.0–100 | 1.00–0.00 | 100 | 0.00 | 100 | 99.0–100 | 99.5–100 | 99.5–100 | 99.0–100 | 99.0–100 |

30% IIV & 20% IOV | 82.0–99.0 | 18.0–1.00 | 100 | 0.00 | 100 | 84.8–99.0 | 91.0–99.5 | 90.1–99.5 | 83.4–99.0 | 82.0–99.0 |

30% IIV & 30% IOV | 63.0–91.0 | 37.0–9.00 | 100 | 0.00 | 100 | 73.0–91.7 | 81.5–95.5 | 77.3–95.3 | 67.8–91.4 | 63.0–91.0 |

0% IIV & 45% IOV | 48.0–66.0 | 52.0–34.0 | 100 | 0.00 | 100 | 65.8–74.6 | 74.0–83.0 | 64.9–79.5 | 56.2–70.2 | 48.0–66.0 |

k_{e} | ||||||||||

30% IIV & 0% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 10% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 20% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

30% IIV & 30% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

0% IIV & 45% IOV | 100 | 0.00 | 100 | 0.00 | 100 | 100 | 100 | 100 | 100 | 100 |

F_{1}: harmonic mean of sensitivity and precision; κ: Cohen’s kappa; MCC: Matthews correlation coefficient; NPV: negative predictive value.

**Table 12.** Sample Size for a 2 × 2 × 2 Crossover Study for Different Bioequivalence Evaluation Methods, Targeting a Power of at Least 80%, an α of 0.05, and Assuming a GMR of 100%.

IOV (%) | Average Bioequivalence ^{1} | Bootstrap Bioequivalence | G_{mean} ƒ_{2} Factor (Cut-Off of 35) | G_{mean} ƒ_{2} Factor (Cut-Off of 41) | G_{mean} ƒ_{2} Factor (Cut-Off of 50) |
---|---|---|---|---|---|
10% | 12 | 12 | 12 | 12 | 12 |
20% | 16 | 12 | 12 | 12 | 12 |
30% | 32 | 14 | 12 | 12 | 18 |
45% | 66 | >30 | 14 | 20 | >30 |

^{1} Calculated using R package ‘PowerTOST’ version 1.5-3 [14]. G_{mean}: geometric mean; IOV: inter-occasion variability.


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Henriques, S.C.; Albuquerque, J.; Paixão, P.; Almeida, L.; Silva, N.E.
Alternative Analysis Approaches for the Assessment of Pilot Bioavailability/Bioequivalence Studies. *Pharmaceutics* **2023**, *15*, 1430.
https://doi.org/10.3390/pharmaceutics15051430
