1. Introduction
The incidence of hip fractures, including femoral neck fractures, has increased [1]. Failed internal fixation for femoral neck fractures has a strong negative impact on patients, leading to increased postoperative mortality and high medical costs [2]. Recent systematic reviews and meta-analyses have revealed that a preoperative posterior tilt ≥ 20° on lateral radiographs is associated with failed internal fixation [3,4]. On the other hand, preoperative anteroposterior (AP) alignment (valgus tilt > 15°) was a risk factor for treatment failure [5]. In addition, the influence of postoperative AP alignment on reoperations or functional scores was evaluated recently [6,7].
Posterior tilt measurement using lateral radiography was first presented as an assessment of lateral alignment in 2009 [8]. The reliability of the measurement ranges from substantial to excellent [9,10,11]. On the other hand, AP alignment is generally measured using the Garden alignment index (GAI) [12], but its reliability remains unknown. It is challenging to determine alignment by the trabecular line in the femoral head [12], especially in elderly patients with thin trabecular lines because of osteoporosis. Therefore, surgeons need an AP alignment measurement with high reliability to develop treatment strategies and estimate the prognosis. We hypothesized that valgus tilt measurement (VTM) using an AP radiograph would be as reliable as posterior tilt measurement using a lateral radiograph.
The purpose of this study was to evaluate and compare the reliability of GAI and a new AP alignment measurement (VTM) using preoperative AP radiographs of nondisplaced femoral neck fractures. In addition, we compared the reliability in terms of raters’ status (junior vs. senior surgeons) and patient age (over vs. under 80 years of age). We believe that determining a more reliable measurement for AP alignment will contribute to better clinical decision making and could serve as a basis for extensive clinical research on femoral neck fractures.
2. Materials and Methods
2.1. Study Design and Setting
The study was designed as an intra- and inter-rater reliability analysis in a general hospital. We followed the standards of earlier reliability studies [13,14,15]. This study was conducted in accordance with the Declaration of Helsinki and was approved by the institutional review board (approval number: 221043).
2.2. Patient Selection
We calculated the sample size for reliability measurements using a web calculator [16]. The required sample size was 16 images, based on parameters (expected intraclass correlation coefficient [ICC] of inter-rater reliability = 0.8, precision [±expected] = 0.1, number of raters = 4, α = 0.05) reported in reliability studies [9,17]. Ultimately, we set the sample size at 50 images because earlier reliability studies evaluated 50 images [9,11].
We selected consecutive patients with nondisplaced femoral neck fractures (Garden stage I and II) according to the Garden classification on preoperative AP radiographs between March 2019 and June 2022 [12]. We only excluded radiographs in which we could not measure VTM or GAI for any reason in order to avoid selection bias and in consideration of the clinical practice setting (Figure 1).
2.3. AP Hip Radiographs
Radiology technicians obtained preoperative AP hip radiographs using standard methods. The patients were placed in the supine position. If the fractured leg was externally rotated, the leg was manually positioned in its natural position to the extent possible.
2.4. Radiographic Measurements
The raters were four trauma surgeons (two junior and two senior surgeons) in the same general hospital. We defined the raters as senior surgeons (more experienced surgeons with ≥10 years of experience) or junior surgeons (less experienced surgeons with <10 years of experience). All raters were lectured on the measurements of GAI and VTM. The raters practiced the measurements using 10 radiographs as pilot measurements before the start of the study. The raters used a digital Picture Archiving and Communication System (PACS) with standard resolution monitors, and variables such as patient age and sex were noted.
First, all raters independently measured the angles on both the fractured and the unfractured sides in all 50 images. After a washout period of 6 weeks [9,10], the order of the images was randomly changed, and the raters measured the angles in the same image set a second time.
2.5. Garden Alignment Index
GAI is the angle between the trabecular line in the femoral head and a line drawn through the long axis of the medial cortex of the femoral shaft [12] (Figure 2). A valgus displacement of 15° based on GAI is the cutoff value for valgus-impacted femoral neck fracture [5,18]. The valgus displacement is the difference in angles measured with GAI between the fractured side and the unfractured side in valgus-impacted femoral neck fractures.
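The valgus displacement defined above is a simple difference of two GAI readings, which the following minimal Python sketch illustrates. The function names are hypothetical, the readings are made-up values rather than study data, and treating a displacement of exactly 15° as reaching the cutoff is an assumption, since the text does not state whether the comparison is inclusive.

```python
def valgus_displacement(gai_fractured_deg: float, gai_unfractured_deg: float) -> float:
    """Difference in GAI (degrees) between the fractured and the unfractured side."""
    return gai_fractured_deg - gai_unfractured_deg


def reaches_valgus_cutoff(displacement_deg: float, cutoff_deg: float = 15.0) -> bool:
    # Assumption: a displacement equal to the cutoff counts as reaching it (>=).
    return displacement_deg >= cutoff_deg


# Hypothetical readings, not data from the study.
displacement = valgus_displacement(178.0, 160.0)
print(displacement, reaches_valgus_cutoff(displacement))
```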
2.6. Valgus Tilt Measurement
The valgus tilt was measured using AP radiographs with modified methods of posterior tilt measurements using lateral radiographs [8]. First, the mid-neck line (MNL) was drawn through the center of two lines across the residual mid-femoral neck; the first line was drawn at the narrowest part of the residual mid-femoral neck, and a second parallel line was drawn 5 mm distal to the first line. Second, the femoral head line (FHL) was drawn from the center of the femoral head circle to the point where the MNL crosses the femoral head circle. Lastly, valgus tilt was the angle formed by the MNL and FHL (Figure 3). Negative values denoted a varus tilt of the femoral head corresponding to the MNL, whereas positive values denoted a valgus tilt.
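The construction can be expressed with coordinates, as in the following Python sketch. Because the FHL is a radius of the femoral head circle, the angle it forms with the MNL at the crossing point equals the arcsine of the perpendicular offset of the circle center from the MNL divided by the radius. The function names and the coordinate input format are hypothetical, and which sign of the offset corresponds to valgus depends on the side of the hip and the image orientation, so the sign convention here is an assumption to be checked against the viewer in use.

```python
import math
from typing import Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def midpoint(segment: Segment) -> Point:
    (x1, y1), (x2, y2) = segment
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def valgus_tilt_deg(narrowest: Segment, distal: Segment,
                    head_center: Point, head_radius: float) -> float:
    """Signed angle (degrees) between the mid-neck line and the femoral head line.

    narrowest: line across the narrowest part of the residual mid-femoral neck.
    distal: parallel line drawn 5 mm distal to the first line.
    """
    m_near = midpoint(narrowest)
    m_far = midpoint(distal)
    # MNL direction, pointing from the distal midpoint toward the femoral head.
    dx, dy = m_near[0] - m_far[0], m_near[1] - m_far[1]
    length = math.hypot(dx, dy)
    if length == 0.0:
        raise ValueError("the two neck lines must have distinct midpoints")
    dx, dy = dx / length, dy / length
    # Signed perpendicular offset of the head center from the MNL (2D cross product).
    offset = dx * (head_center[1] - m_near[1]) - dy * (head_center[0] - m_near[0])
    if abs(offset) > head_radius:
        raise ValueError("the MNL does not cross the femoral head circle")
    # Assumption: a positive offset is reported as valgus, a negative one as varus.
    return math.degrees(math.asin(offset / head_radius))


# Hypothetical coordinates in millimetres, not data from the study.
tilt = valgus_tilt_deg(((5.0, -10.0), (5.0, 10.0)),
                       ((0.0, -10.0), (0.0, 10.0)),
                       head_center=(30.0, 5.0), head_radius=20.0)
print(round(tilt, 1))
```

When the head center lies on the MNL, the offset is zero and the function returns a tilt of 0°.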
2.7. Statistical Analysis
Intra-rater reliability reflects the variation in measurements by a single rater across multiple observations, while inter-rater reliability reflects the variation in measurements between multiple raters [15]. The ICC was used to evaluate the intra- and inter-rater reliabilities of GAI and VTM. To calculate intra-rater reliability, we used a mixed-effects model, considering rater as a fixed effect, and patient and time as random effects. The ICC for intra-rater variability was calculated on the basis of the mean rating of four raters and the absolute agreement. To evaluate inter-rater reliability, we used a mixed-effects model considering time as a fixed effect, and rater and patient as random effects. The ICC for inter-rater variability was calculated on the basis of the mean rating of two timepoints and absolute agreement. The 95% confidence intervals (CI) were calculated using bootstrap resampling methods (1000 replications). We interpreted the ICC as follows according to a previous study [19]: excellent (>0.75), fair to good (0.40–0.75), and poor (<0.40).
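To make the idea of an absolute-agreement ICC concrete, the sketch below computes it from two-way ANOVA mean squares in plain Python and applies the interpretation bands quoted above. This is a swapped-in, simplified technique: the analysis described here fitted mixed-effects models in R, whereas the sketch uses the classical two-way random-effects formulas for one subjects-by-raters table. The function names and the small ratings table are illustrative assumptions, not study data.

```python
from typing import List, Tuple


def icc_absolute_agreement(ratings: List[List[float]]) -> Tuple[float, float]:
    """Two-way random-effects ICC with absolute agreement.

    ratings: one row per subject, one column per rater.
    Returns (single-measure ICC, average-measure ICC).
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)               # between-subject mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_error / ((n - 1) * (k - 1))  # residual mean square
    single = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    average = (msr - mse) / (msr + (msc - mse) / n)
    return single, average


def interpret_icc(icc: float) -> str:
    """Interpretation bands as quoted in the text above."""
    if icc > 0.75:
        return "excellent"
    if icc >= 0.40:
        return "fair to good"
    return "poor"


# Illustrative table: 6 subjects rated by 4 raters (not study data).
table = [[9, 2, 5, 8], [6, 1, 3, 2], [8, 4, 6, 8],
         [7, 1, 2, 6], [10, 5, 6, 9], [6, 2, 4, 7]]
single, average = icc_absolute_agreement(table)
print(round(single, 2), interpret_icc(single))
print(round(average, 2), interpret_icc(average))
```

The average-measure value is always at least as high as the single-measure value when raters agree at all, which is worth keeping in mind when comparing ICCs based on mean ratings with ICCs from studies that report single-rater values.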
For inter-rater reliability, the standard error of measurement (SEM agreement) was calculated from the sum of the between-rater and residual variances: SEM agreement = √(σ² between raters + σ² residual) [20]. The minimal detectable change (MDC) was calculated as 1.96 × √2 × SEM. For intra-rater reliability, the within-subject standard deviation (SD) and repeatability coefficient (RC) were calculated. The within-subject SD was calculated using one-way analysis of variance. RC was calculated as √2 × 1.96 × within-subject SD [21].
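The agreement statistics in this paragraph reduce to a few lines of arithmetic, sketched below in Python. The function names are hypothetical, and the within-subject SD is computed as the square root of the pooled within-subject mean square, which is what a one-way analysis of variance with subject as the factor yields.

```python
import math
from typing import List


def sem_agreement(var_between_raters: float, var_residual: float) -> float:
    """SEM agreement from the between-rater and residual variance components."""
    return math.sqrt(var_between_raters + var_residual)


def minimal_detectable_change(sem: float) -> float:
    return 1.96 * math.sqrt(2.0) * sem


def within_subject_sd(repeated: List[List[float]]) -> float:
    """Square root of the within-subject mean square (one-way ANOVA by subject)."""
    ss_within = 0.0
    df_within = 0
    for values in repeated:
        mean = sum(values) / len(values)
        ss_within += sum((v - mean) ** 2 for v in values)
        df_within += len(values) - 1
    return math.sqrt(ss_within / df_within)


def repeatability_coefficient(sd_within: float) -> float:
    return math.sqrt(2.0) * 1.96 * sd_within


# Hypothetical repeated readings (two sessions per subject), not study data.
sessions = [[10.0, 12.0], [20.0, 20.0], [5.0, 9.0]]
sd_w = within_subject_sd(sessions)
print(round(sd_w, 2), round(repeatability_coefficient(sd_w), 2))
print(round(minimal_detectable_change(2.35), 2))
```

As a consistency check, an SEM of 2.35 gives an MDC of about 6.51 under this formula.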
ICCs for intra- and inter-rater reliabilities were compared between GAI and VTM using bootstrap resampling methods. Intra- and inter-rater reliabilities were compared between GAI and VTM using descriptive statistics. We also compared the reliability of GAI and VTM between senior and junior doctors.
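A bootstrap comparison of two reliability estimates can be sketched as follows in Python. Several details are assumptions, because the text does not specify them: patients are taken as the resampling unit, the same resampled patients are used for both methods, and the interval is a simple percentile interval. The one-way ICC is only a compact stand-in statistic so the example runs on its own, and the readings are invented.

```python
import random
from typing import Callable, List, Tuple

Rows = List[List[float]]


def one_way_icc(rows: Rows) -> float:
    """Stand-in statistic: one-way random-effects, single-measure ICC."""
    n, k = len(rows), len(rows[0])
    grand = sum(sum(r) for r in rows) / (n * k)
    msb = k * sum((sum(r) / k - grand) ** 2 for r in rows) / (n - 1)
    msw = sum((x - sum(r) / k) ** 2 for r in rows for x in r) / (n * (k - 1))
    denom = msb + (k - 1) * msw
    # Assumption: a degenerate resample with no variance at all is scored as 0.
    return (msb - msw) / denom if denom else 0.0


def bootstrap_diff_ci(rows_a: Rows, rows_b: Rows,
                      stat: Callable[[Rows], float],
                      n_boot: int = 1000, alpha: float = 0.05,
                      seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap interval for stat(rows_a) - stat(rows_b)."""
    rng = random.Random(seed)
    n = len(rows_a)
    diffs = []
    for _ in range(n_boot):
        # Resample patients with replacement; keep the pairing between methods.
        idx = [rng.randrange(n) for _ in range(n)]
        diffs.append(stat([rows_a[i] for i in idx]) - stat([rows_b[i] for i in idx]))
    diffs.sort()
    lower = diffs[int((alpha / 2.0) * n_boot)]
    upper = diffs[int((1.0 - alpha / 2.0) * n_boot) - 1]
    return lower, upper


# Invented readings for 6 patients by 2 raters with two hypothetical methods.
method_a = [[10, 12], [20, 19], [5, 9], [15, 15], [30, 27], [8, 11]]
method_b = [[11, 16], [18, 25], [4, 12], [13, 19], [28, 22], [9, 15]]
print(bootstrap_diff_ci(method_a, method_b, one_way_icc))
```

If the resulting interval excludes zero, the two methods differ in reliability at the chosen level under these assumptions.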
In the subgroup analysis, we compared the inter- and intra-rater reliability of GAI and VTM in four raters between two patient groups (patients aged ≥80 years vs. <80 years).
As a reference, using the four measurements for each case at the first test session, we calculated the VTM angle on the unfractured side and the VTM angle corresponding to a valgus displacement of 15°, the cutoff value based on GAI.
Statistical analysis was performed using R version 4.2.2 (R Foundation for Statistical Computing, Vienna, Austria).
3. Results
We excluded one radiograph in which we could not measure VTM because of excessive external rotation (Figure 4). Finally, this study included 50 patients (39 [78%] women and 11 [22%] men), with a median age of 78 (IQR 38–99) years. Twenty-four and 26 patients were aged ≥80 years and <80 years, respectively. Four raters measured GAI and VTM in 50 hip radiographs in two tests, providing a total of 4 × 50 × 2 = 400 assessments.
3.1. Intra-Rater Reliability
The overall ICC for the four raters was “excellent” for both GAI (ICC 0.92, 95% CI 0.89–0.94) and VTM (ICC 0.86, 95% CI 0.82–0.89) (Table 1). The difference in ICC between GAI and VTM was 0.08 (95% CI, 0.03–0.14). The within-subject SD and RC of VTM were lower (4.92, 13.65) than those of GAI (6.33, 17.54). The intra-rater reliability of junior surgeons was similar to that of senior surgeons in GAI, but was higher than that of senior surgeons in VTM.
3.2. Inter-Rater Reliability
The overall ICC for the four raters was “excellent” for both GAI (ICC 0.92, 95% CI 0.89–0.95) and VTM (ICC 0.85, 95% CI 0.81–0.88) (Table 2). The difference in ICC between GAI and VTM was 0.08 (95% CI 0.03–0.13). The SEM and MDC values of GAI were lower (2.35 and 6.51) than those of VTM (2.56 and 7.08). The inter-rater reliability of junior surgeons was similar to that of senior surgeons in GAI but was higher than that of senior surgeons in VTM.
3.3. Subgroup Analysis
In the subgroup analysis, the intra- and inter-rater reliabilities of GAI and VTM were higher in patients aged <80 years than in patients aged ≥80 years (Table 3).
The mean VTM on the unfractured side was 1.4° (SD 4.3), and the median was 2° (interquartile range [IQR] 1–4). The mean VTM corresponding to a valgus displacement of 15° based on GAI was 9.3° (SD 6.4), with a median of 7° (IQR 5–15).
4. Discussion
The results demonstrated that GAI was a more reliable measurement method than VTM in assessing AP alignment using preoperative AP radiographs for nondisplaced femoral neck fractures, although both measurements were reliable. The reliability of junior surgeons was similar to that of senior surgeons for GAI, but was higher than that of senior surgeons for VTM. The reliability of GAI in patients aged <80 years was higher than that in patients aged ≥80 years.
GAI is a more reliable measurement method than VTM. The inter- and intra-rater reliability of the GAI was higher than that of the posterior tilt assessment using lateral radiographs (both ICC 0.77) [9], although the analysis methods were different from our methods. The results of the assessment using AP radiographs were not consistent with those using lateral radiographs (inter- and intra-rater reliability of posterior tilt [ICC 0.75] versus inter- and intra-rater reliability of lateral GAI [ICC 0.60, 0.75]) [11], suggesting that the reliability of posterior tilt was higher than that of lateral GAI. In the AP alignment assessment, our results objectively demonstrated the reasons for the historically frequent use of GAI in terms of reliability.
The reliability of junior surgeons was similar to that of senior surgeons for GAI but higher than that of senior surgeons for VTM. The results differed from general expectations because the reliability of experienced raters is usually higher than that of inexperienced raters [14]. We consider GAI to be a reliable measurement method regardless of experience because GAI was similar between junior and senior surgeons. In contrast, in VTM, junior surgeons seemed to consistently identify the narrowest part of the residual mid-femoral neck. Thus, junior surgeons may be superior in terms of assessments based on morphological indicators.
The reliabilities of GAI and VTM were higher for patients aged <80 years than for those aged ≥80 years. The view of the trabeculae affects GAI measurement because it is measured using the trabecular line in the femoral head. It is challenging to determine alignment using the trabecular line for radiographs of elderly patients with osteoporosis. On the other hand, MNL in VTM may not be a constant parameter due to age-related degeneration. It is important to recognize that measuring AP alignment is not as reliable in older patients as it is in younger persons.
4.1. Strengths
This is the first study to evaluate the reliability of GAI assessment. Considering the evidence from previous studies [8,9,10,11], we compared the reliability of GAI with that of VTM as a new measurement scope. We followed the standards of earlier reliability studies [13,14,15] and chose not to exclude AP radiographs of poor quality to minimize the risk of selection bias.
4.2. Limitations
This study had some limitations. First, the rotation and flexion of the injured hip due to pain might have affected AP alignment measurements, although the influence on posterior tilt assessment with lateral radiographs was negligible for positions of the injured hip [22]. Second, there was a lack of external validity because all the data were obtained from only one general hospital in Japan. It is unclear whether the results of this study can be generalized to other countries with different patient populations and image viewing systems. Ultimately, well-designed studies with more images and raters are necessary to clarify the reliability of AP alignment assessment.
5. Conclusions
This reliability analysis showed that although both GAI and VTM were reliable measurement methods, GAI was more reliable than VTM for assessing AP alignment using preoperative AP radiographs for nondisplaced femoral neck fractures. Additionally, the reliability of GAI was higher in patients aged <80 years than in patients aged ≥80 years. Therefore, age-related variations in GAI measurement should be considered. Well-designed reliability studies with more images and raters are necessary to clarify the reliability of AP alignment assessment.
Author Contributions
All authors contributed to the conceptualization and design of this study; Y.Y. performed the data collection; Y.Y., R.O., Y.M. and T.M. performed the measurements; Y.T. and A.S. conducted the statistical analyses; the first draft of this manuscript was written by N.Y., Y.T. and A.S.; all authors commented on the previous versions of the manuscript. All authors read and agreed to the published version of the manuscript.
Funding
This study received no external funding.
Institutional Review Board Statement
The study was conducted in accordance with the Declaration of Helsinki and was approved by the Institutional Review Board.
Informed Consent Statement
Informed consent was obtained in the form of an opt-out on a website.
Data Availability Statement
The data supporting the findings of this study are available from the first author, Y.Y., upon reasonable request.
Acknowledgments
The authors thank Toshiharu Mitsuhashi for providing information on the statistical analyses. The authors also thank Yosuke Mochizuki for his advice.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Takusari, E.; Sakata, K.; Hashimoto, T.; Fukushima, Y.; Nakamura, T.; Orimo, H. Trends in hip fracture incidence in Japan: Estimates based on nationwide hip fracture surveys from 1992 to 2017. JBMR Plus 2021, 5, e10428.
- Do, L.N.D.; Kruke, T.M.; Foss, O.A.; Basso, T. Reoperations and mortality in 383 patients operated with parallel screws for Garden I-II femoral neck fractures with up to ten years follow-up. Injury 2016, 47, 2739–2742.
- van der List, J.P.; El Saddy, S.; Vos, S.J.; Temmerman, O.P.P. Role of preoperative posterior tilt on the outcomes of internal fixation of non-displaced femoral neck fractures: A systematic review and meta-analysis. Injury 2021, 52, 316–323.
- Papadelis, E.; Chaudhry, Y.P.; Hayes, H.; Talone, C.; Shah, M.P. Evaluation of the posterior tilt angle in predicting failure of nondisplaced femoral neck fractures after internal fixation: A systematic review. J. Orthop. Trauma. 2022. ahead of print.
- Song, H.K.; Choi, H.J.; Yang, K.H. Risk factors of avascular necrosis of the femoral head and fixation failure in patients with valgus angulated femoral neck fractures over the age of 50 years. Injury 2016, 47, 2743–2748.
- Qiu, L.; Huang, Y.; Li, G.; Wu, H.; Zhang, Y.; Zhang, Z. Essential role of reliable reduction quality in internal fixation of femoral neck fractures in the non-elderly patients-a propensity score matching analysis. BMC Musculoskelet. Disord. 2022, 23, 346.
- Park, Y.C.; Um, K.S.; Kim, D.J.; Byun, J.; Yang, K.H. Comparison of femoral neck shortening and outcomes between in situ fixation and fixation after reduction for severe valgus-impacted femoral neck fractures. Injury 2021, 52, 569–574.
- Palm, H.; Gosvig, K.; Krasheninnikoff, M.; Jacobsen, S.; Gebuhr, P. A new measurement for posterior tilt predicts reoperation in undisplaced femoral neck fractures: 113 Consecutive patients treated by internal fixation and followed for 1 year. Acta Orthop. 2009, 80, 303–307.
- Dolatowski, F.C.; Hoelsbrekken, S.E. Eight orthopedic surgeons achieved moderate to excellent reliability measuring the preoperative posterior tilt angle in 50 Garden-I and Garden-II femoral neck fractures. J. Orthop. Surg. Res. 2017, 12, 133.
- Dolatowski, F.C.; Adampour, M.; Frihagen, F.; Stavem, K.; Erik Utvåg, S.; Hoelsbrekken, S.E. Preoperative posterior tilt of at least 20° increased the risk of fixation failure in Garden-I and -II femoral neck fractures. Acta Orthop. 2016, 87, 252–256.
- Kalsbeek, J.H.; van Walsum, A.D.P.; Roerdink, W.H.; van Vugt, A.B.; van de Krol, H.; Schipper, I.B. Validation of two methods to measure posterior tilt in femoral neck fractures. Injury 2020, 51, 380–383.
- Garden, R.S. Malreduction and avascular necrosis in subcapital fractures of the femur. J. Bone Jt. Surg. Br. 1971, 53, 183–197.
- Carreira, D.S.; Emmons, B.R. The reliability of commonly used radiographic parameters in the evaluation of the pre-arthritic hip: A systematic review. JBJS Rev. 2019, 7, e3.
- Audigé, L.; Bhandari, M.; Kellam, J. How reliable are reliability studies of fracture classifications? A systematic review of their methodologies. Acta Orthop. Scand. 2004, 75, 184–194.
- Koo, T.K.; Li, M.Y. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J. Chiropr. Med. 2016, 15, 155–163.
- Arifin, W.N. Sample Size Calculator. 2022. Available online: wnarifin.github.io (accessed on 11 December 2022).
- Bonett, D.G. Sample size requirements for estimating intraclass correlations with desired precision. Stat. Med. 2002, 21, 1331–1335.
- Song, H.K.; Lee, J.J.; Oh, H.C.; Yang, K.H. Clinical implication of subgrouping in valgus femoral neck fractures: Comparison of 31-B1.1 with 31-B1.2 fractures using the OTA/AO classification. J. Orthop. Trauma. 2013, 27, 677–682.
- Cicchetti, D.V. Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychol. Assess. 1994, 6, 284–290.
- de Vet, H.C.W.; Terwee, C.B.; Knol, D.L.; Bouter, L.M. When to use agreement versus reliability measures. J. Clin. Epidemiol. 2006, 59, 1033–1039.
- Bland, J.M.; Altman, D.G. Measurement error. BMJ 1996, 312, 1654.
- Hoelsbrekken, S.E.; Dolatowski, F.C. The influence of the hips position on measurements of posterior tilt in a valgus-impacted femoral neck fracture. Injury 2017, 48, 2184–2188.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).