1. Introduction
In the field of orthodontics, cephalometric analysis using X-ray images is commonly employed to examine the characteristics of the craniofacial skeleton [1]. For this reason, there may be a need to conduct comprehensive morphological studies on a large number of samples. For example, Yamaguchi et al. have continued to conduct research based on cephalogram measurements since 2001 [2,3]. Accurate and extensive image measurement is in demand in this field.
Several techniques have been proposed to obtain accurate results in image measurement. Frazao and Manzi have demonstrated that the accuracy and reliability of 3D measurements using CBCT-based 3D images are sufficient [4]. In addition, Bajaj and Panwar have shown that the anatomical landmarks obtained in patients with facial asymmetry are consistently accurate and reliable [5].
Scholars proposing this approach using 3D images argue that depth information cannot be obtained from 2D images and that evaluation can be difficult due to the overlap of structures.
Despite this, cephalometric measurements are still widely used because this technique has now become commonly recognized among orthodontic practitioners. Moreover, it is easier to measure and manage 2D targets than 3D targets. These two tasks, measurement and management, mean that when a large number of measurements is desired, 3D images can make the division of labor difficult. The former requires the formation of shared recognition and compliance, while the latter requires the alignment of software for handling 3D images and incurs significant time and effort costs for data sharing.
The pursuit of accurate measurement can lead to a reduction in the scale of measurement. By contrast, when a large number of measurements is emphasized, the accuracy of the measurement is sometimes sacrificed. To solve the problem of the enormous amount of work required when all operations in large-scale image measurement are performed by hand, the use of image recognition artificial intelligence (AI) for automatic measurement has been proposed. Kim et al. proposed automated cephalometric analysis by deep learning and achieved a classification rate of 88.43% [6]. However, Hung et al. have stated that it is still necessary to further examine the reliability and applicability of AI models [7]. Currently, image recognition by AI cannot fully automate measurement and deliver 100% accurate results.
The evaluation of the validity of the above-mentioned automatic measurement results is commonly performed by well-trained experts using manual measurements as the benchmark. We suggest that when accurate and large-scale results are desired for orthodontic image measurements, the most practical option is to use an automated system that assists experts in 2D image measurements. Therefore, we developed a collaborative semi-automatic measurement system based on the following concept.
This system utilizes automatically generated measurement templates instead of completely relying on manual measurements. Measurement operations are performed by well-trained individuals or those with specialized knowledge. They are only responsible for placing markers on the landmarks on the cephalogram affixed to the template. The measurement is automatically processed by the program. This enables a system that can efficiently obtain measurement results through the division of labor, supported by expert knowledge from the beginning, rather than an automated system that requires the verification and correction of the final results.
In this paper, we first describe the configuration of the system that we developed. In the following sections, we use a large-scale measurement project in the field of orthodontics as an example of the application of the present system. Specifically, we describe the details of the operation in the actual project and present the cost at which the system was operated. Finally, we outline the potential for the further expansion and application of the system in other fields.
This paper has been written from the perspective of the system’s developers, with the aim of evaluating its efficiency. The accuracy of the measurement results is left for evaluation by the orthodontist users.
2. Materials and Methods
2.1. System Overview
Figure 1 presents an overview of the project in which this system is used. The authors are referred to as Developers, the individuals responsible for placing markers are referred to as Orthodontists, and the providers of imaging data are referred to as Doctors.
2.2. Policy for Using Imaging Data
We received the computed tomographic (CT) images of 500 patients as digital imaging and communications in medicine (DICOM) files. Because the DICOM files included patients’ personal data, we removed the identifying information and anonymized the images by using RAW format. As a result, each patient is identified using only a four-digit serial number. The present study was approved by the Ethics Committee of Kanagawa Dental University (approval number: 841; date: 4 December 2021).
To simplify the automatic processing of these data, we created a batch script that converted DICOM files into RAW format when the files were received.
2.3. Creating Measurement Images
Measurement images are created using the process described in the next section. We adopted ExFact VR (Nihon Visual Science, Inc., Tokyo, Japan) for volume rendering.
First, we generated the anterior and lateral rendering images as orthogonal projections. These images are referred to in this article as “cephalograms”, although they are not cephalograms per se. We also generated axial slices of images of the nasal cavity.
Second, we set appropriate gradations of color and transparency on the screen to depict the intensity of each three-dimensional image by adjusting the look-up table (LUT) so that the viewer could easily recognize the landmarks in the rendered image.
This configuration affects the ease of recognition of the colors in each image and the clarity and inner contour of the image itself.
2.4. Automated Image Generation
Despite the need to create cephalograms for 500 people, the software used to render the volume data did not incorporate the ability to automatically generate cephalograms. Therefore, the aforementioned activity was handled by a script with process automation technology called “robotic process automation” (RPA); this enables automation that meets special needs arising at the field level without modifying the existing software. To accomplish this, we created a Python script to automatically control the software for volume rendering using PyAutoGUI, a module for graphical user interface (GUI) automation.
Additionally, we developed software that automatically calculates, generates, and applies the optimum LUT from the histogram of the voxel intensities of the input images, and we incorporated it into the script. The automatic generation of the LUT may fail, depending on the distribution of the histogram. Therefore, we also made it possible to apply a LUT created in advance to the samples for which the cephalograms were not generated correctly.
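The text does not state how the LUT is derived from the histogram, so the following is only a minimal sketch under stated assumptions. It assumes a simple percentile window: voxels below a lower percentile are fully transparent, voxels above an upper percentile are fully opaque, and opacity ramps linearly in between. The function names, the percentile values, and the `fallback` argument (standing in for a LUT prepared in advance) are hypothetical.

```python
def percentile_bin(hist, fraction):
    """Index of the first bin where the cumulative count reaches `fraction` of the total."""
    target = fraction * sum(hist)
    running = 0
    for index, count in enumerate(hist):
        running += count
        if running >= target:
            return index
    return len(hist) - 1


def build_opacity_lut(hist, low=0.60, high=0.99, fallback=None):
    """Build a linear opacity ramp between two histogram percentiles.

    Returns `fallback` when the histogram gives no usable window
    (assumed failure case: empty histogram or both percentiles in one bin).
    """
    if sum(hist) == 0:
        return fallback
    lo_bin = percentile_bin(hist, low)
    hi_bin = percentile_bin(hist, high)
    if hi_bin <= lo_bin:
        return fallback
    lut = []
    for index in range(len(hist)):
        if index <= lo_bin:
            lut.append(0.0)  # transparent below the window
        elif index >= hi_bin:
            lut.append(1.0)  # opaque above the window
        else:
            lut.append((index - lo_bin) / (hi_bin - lo_bin))
    return lut
```

Returning the fallback instead of raising an error mirrors the behavior described above, in which a LUT created in advance is applied when automatic generation does not succeed.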
2.5. Measurement Template
The measurement template (Figure 2) is an editable data image (in .xcf, a file format of GIMP) with some movable markers placed on the cephalogram to indicate the position of its landmarks. The markers’ positions are initially tentative. Developers devised template files in which orthodontists could indicate the landmarks.
The developers insert the cephalogram into the template where the marker is temporarily placed. Then, the developers send the template to orthodontists. The orthodontists move the markers to specified landmarks in the cephalograms and return the file to the developer.
The central part of the template (the area indicated by the orange frame as the measurement area in Figure 2) is the target of the measurement program’s processing. The markers placed in the storage areas (green frames in Figure 2) are ignored during measurement. In some of the imaging data we received, the scan range excluded some parts of the cranium; in some cases, the cranium was out of focus. In these cases, unnecessary markers could be moved outside the measurement area to skip the measurement process.
Conventionally, the measurement process with cephalograms must be repeated several times, with intervals between measurements, to minimize errors. With measurement templates, the entire process does not need to be repeated for this purpose, because the template with its correctable markers remains editable data. Therefore, if any measurement result is questionable, the measurement can be performed again after adjusting only the positions of the markers that need to be corrected on the measurement template.
2.6. Marker Design
During the measurement process, the template is loaded into a program that automatically detects the position of the markers placed by the orthodontists. Each marker has a unique color, and the measurement program searches for pixels of that color to detect their position. The measurement marker is composed of the following elements (Figure 2):
The marker is designed in anticipation of the manual work by the orthodontists and the processing by the position-detection program. The balloon not only indicates the marker ID but also helps the orthodontists grab and move it with the cursor.
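The color-based detection described in this section can be illustrated with a short sketch. It is not the detection program used in the project: it assumes that the image is available as rows of RGB tuples, that each marker is matched by an exact color, and that the marker position is taken as the centroid of the matching pixels. The function name `detect_markers` and the `area` argument, which stands in for the measurement area of the template, are hypothetical.

```python
def detect_markers(pixels, marker_colors, area=None):
    """Locate markers by their unique color.

    pixels:        list of rows, each row a list of (r, g, b) tuples
    marker_colors: dict mapping a marker ID to its (r, g, b) color
    area:          optional (left, top, right, bottom) inclusive bounds;
                   matching pixels outside it are ignored
    Returns a dict mapping each detected marker ID to its (x, y) centroid.
    """
    color_to_id = {tuple(color): marker_id for marker_id, color in marker_colors.items()}
    sums = {}  # marker ID -> (sum of x, sum of y, pixel count)
    for y, row in enumerate(pixels):
        for x, rgb in enumerate(row):
            marker_id = color_to_id.get(tuple(rgb))
            if marker_id is None:
                continue
            if area is not None:
                left, top, right, bottom = area
                if not (left <= x <= right and top <= y <= bottom):
                    continue  # e.g. a marker left in a storage area
            sx, sy, n = sums.get(marker_id, (0, 0, 0))
            sums[marker_id] = (sx + x, sy + y, n + 1)
    return {marker_id: (sx / n, sy / n) for marker_id, (sx, sy, n) in sums.items()}
```

Markers with no matching pixel inside the area are simply absent from the result, so a later measurement step can skip any distance that depends on them.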
2.7. Work Process via Measurement Template
The measurement template is passed between the orthodontists and the developers, as shown in the top-to-bottom flow graphic in Figure 3. As indicated in the flow graphic, some aspects of the work might proceed simultaneously.
2.8. Tilt Correction
In the craniofacial imaging data provided by doctors, not all patients were facing straight ahead. If the posture was inclined, the distances between the landmarks on the images would be compressed in the depth direction, which might result in inaccurate measurements. To prevent such compression, it was necessary to define the four landmarks for fixing the horizontal standard, to measure the two tilts of the axis, and to correct the inclinations of the imaging data before the actual measurements (Figure 4).
For example, let the facial orientation of a patient be 0° when facing the front and 90° when facing exactly to the side. Assume also that the line connecting the left and right outermost edges of the orbits is horizontal, so that its projected length is at its maximum when the patient is facing the front and at its minimum when the patient is facing exactly to the left side (Figure 5). If, instead, the patient is facing 8° to the left or right from the front, the measurement would be approximately 99.0% of the actual length (Figure 5 and Figure 6). In this case, if the actual length is 100 mm, the measurement would be about 1 mm shorter. The error in the measurement increases rapidly as the inclination increases.
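The figures in this example follow from the cosine of the rotation angle, under the assumption of an orthogonal projection of a horizontal line segment. A short sketch (the function names are illustrative only) reproduces the arithmetic:

```python
import math


def projected_fraction(yaw_degrees):
    """Fraction of the true length that remains after rotating by `yaw_degrees`."""
    return math.cos(math.radians(yaw_degrees))


def shortening_mm(actual_mm, yaw_degrees):
    """Amount by which the projected length falls short of the true length."""
    return actual_mm * (1.0 - projected_fraction(yaw_degrees))


for angle in (0, 8, 16, 24):
    print(angle, round(projected_fraction(angle), 4), round(shortening_mm(100.0, angle), 2))
```

For an 8° rotation the fraction is about 0.990, that is, a shortening of slightly less than 1 mm on a 100 mm segment. Because the shortening grows roughly with the square of the angle for small angles, doubling the rotation roughly quadruples the error.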
The inclination of the horizontal tilt of the cranium was calculated from the positions of these four landmarks. By calculating back from that inclination, the inclination of roll and pitch rotation was corrected, and the horizontal orientations of the screen space and the cranium were matched. Regarding the yaw rotation, only the points that were inconsistent with the appearance of the rendered image were corrected. In this way, it was possible to easily identify the landmarks and obtain measurement images in which all the patients were shown in the ideal posture for measurement.
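As a rough illustration of how an inclination can be obtained from landmark positions, the sketch below estimates the in-plane tilt of a line through two landmarks and rotates a point to level that line. This is a simplification and an assumption on our part: it works on 2D image coordinates with a single pair of landmarks, whereas the correction described above uses four landmarks and two tilt angles of the imaging data. The function names are hypothetical.

```python
import math


def tilt_degrees(p_left, p_right):
    """Angle of the line from p_left to p_right relative to the image x-axis."""
    dx = p_right[0] - p_left[0]
    dy = p_right[1] - p_left[1]
    return math.degrees(math.atan2(dy, dx))


def rotate_point(point, center, degrees):
    """Rotate `point` about `center` by `degrees` in image coordinates."""
    angle = math.radians(degrees)
    x = point[0] - center[0]
    y = point[1] - center[1]
    return (
        center[0] + x * math.cos(angle) - y * math.sin(angle),
        center[1] + x * math.sin(angle) + y * math.cos(angle),
    )


# Level the line by rotating with the opposite sign of the measured tilt.
left, right = (0.0, 0.0), (10.0, 10.0)
tilt = tilt_degrees(left, right)
leveled_right = rotate_point(right, left, -tilt)
```

After the rotation, both landmarks share the same vertical coordinate, which corresponds to matching the horizontal orientation of the screen space with that of the cranium for this one axis.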
2.9. Operational Requirements
The following two points can be considered the error factors in the measurements:
The first point refers to the limitation of resolution that arises because the cephalogram in this system is not analog data, as a radiograph is, but sampled digital data. A digital image is composed of pixels, and the position where a marker is placed can be determined only in units of pixels; a finer position adjustment is not possible. That is, the pixel size directly sets the limit of the measurement resolution.
In practice, the dispersion in marker placement becomes an error factor more readily than the resolution limit of the data does. This is because, in some situations, different orthodontists may place the marker for the same landmark at positions that differ on the order of centimeters. Because marker placement criteria differ among orthodontists, it was necessary to agree on the placement criteria with reference to the images before the real measurement.
We distributed several examples of the same data to three orthodontists and asked them to arrange the markers; then, we overlapped their arrangements and checked the variations in position (Figure 8).
The orthodontists conferred with each other, referring to the images of the superimposed markers, confirming the difference in recognition of the correct marker position and discussing unified placement criteria. On the basis of the discussion, they made arrangements to practice marker placement together before proceeding to the actual work.
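The variation check described above was performed visually on superimposed markers. As a complementary sketch, the spread of placements could also be summarized numerically. The example below is an assumption and not part of the reported workflow: for each landmark, it computes the mean position across observers and the largest distance of any placement from that mean, in pixels. The function name and data layout are hypothetical.

```python
import math


def placement_spread(placements):
    """Summarize how far observers' markers lie from their mean position.

    placements: dict mapping a landmark ID to a list of (x, y) positions,
                one per observer
    Returns a dict mapping each landmark ID to (centroid, max_deviation).
    """
    summary = {}
    for landmark, points in placements.items():
        cx = sum(p[0] for p in points) / len(points)
        cy = sum(p[1] for p in points) / len(points)
        max_dev = max(math.hypot(p[0] - cx, p[1] - cy) for p in points)
        summary[landmark] = ((cx, cy), max_dev)
    return summary


# Illustrative coordinates only, not project data.
example = {"A": [(0, 0), (2, 0), (1, 3)]}
print(placement_spread(example))
```

Landmarks with a large maximum deviation would be natural candidates for the discussion of unified placement criteria.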
2.10. Definition of Landmarks to Be Measured
The following describes the actual required cephalometric specifications, as well as the measurement points and sections. These were based on the orthodontists’ research interests. Five types of measurement images were prepared: the front view, the left and right side views, and two types of cross sections. Depth distance was not considered, because distance measurement is performed on 2D images. Additionally, no angular measurements were taken, although it is technically possible to do so. The orthodontists and developers exchanged templates and decided on the landmarks that would be used in practice.
We present here an example of a measurement. Note that the measurement specifications follow the actual project, but the sample itself is used for illustrative purposes only.
In the anterior template, 11 types of color-coded markers (21 markers in total) were used for the measurement (Figure 9).
In the lateral template, 12 types of markers (12 markers in total) were used for the measurement (Figure 10). For each lateral measurement, two patterns of templates were created: one viewed from the left side of a cranium and the other viewed from the right side.
In the axial templates, a pair of parallel cross-sectional slices was used (Figure 11).
2.11. Measurement Process
In the measurement process, the following three programs were used in order:
template_to_image.py
An image conversion script that converts the marker placement template file from the original GIMP format into a portable network graphics (PNG) image file. This script runs in the GIMP built-in Python shell, the same one used when the templates were created.
point_detect.cpp
A landmark detection program that identifies the markers placed on the measurement image and outputs the coordinate values of each marker to a comma-separated values (CSV) file. It is a simple C++ program that uses image recognition.
measure_distances.py
A measurement script that calculates the required distances between feature points from the coordinate values of the markers and the marker combinations identified in the specifications. It is a short Python script that reads the CSV file of coordinate values and outputs several CSV files of the distances between the markers.
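The last step of this pipeline can be sketched as follows. The actual measure_distances.py is not reproduced in this paper, so the CSV layout (columns `marker`, `x`, `y`), the list of marker pairs, and the `mm_per_pixel` scale factor are assumptions made for illustration.

```python
import csv
import io
import math


def read_coordinates(csv_text):
    """Parse marker coordinates from CSV text with columns marker, x, y."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["marker"]: (float(row["x"]), float(row["y"])) for row in reader}


def measure(coords, pairs, mm_per_pixel=1.0):
    """Compute the distance for each (marker_a, marker_b) pair.

    Pairs that involve a marker missing from `coords` are skipped, which
    corresponds to markers that were moved outside the measurement area.
    """
    results = {}
    for a, b in pairs:
        if a not in coords or b not in coords:
            continue
        (ax, ay), (bx, by) = coords[a], coords[b]
        results[(a, b)] = math.hypot(bx - ax, by - ay) * mm_per_pixel
    return results


# Illustrative values only, not project data.
sample = "marker,x,y\nA,0,0\nB,3,4\n"
distances = measure(read_coordinates(sample), [("A", "B"), ("A", "C")], mm_per_pixel=0.5)
```

In this example, the pair ("A", "C") is skipped because marker C has no coordinates, and the remaining distance is converted from pixels to millimeters by the assumed scale factor.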
4. Discussion
4.1. Verification of Results
The evaluation of the results presented in this study should be discussed in terms of the validity of the measurement results and the efficiency of the measurement process. As stated in the Results section, the validity of the measurement results obtained using this system was acknowledged by the measurement requesters themselves, who possess expertise in the relevant field. By contrast, with respect to the efficiency of this method, because we could find only a small number of papers describing the time required for these measurements [10], we discussed the relationship between the estimated time and the time actually spent in the measurement project. In the preparation stage of the measurement project addressed in this study, the efficiency in comparison with existing methods had already been discussed. Specifically, from the perspective of practitioners and scholars in the field of orthodontics, the introduction of a new system was required to complete a large number of measurements in a short period, and the deadline was set accordingly. As a result, the measurement project using this system successfully delivered all the measurement results within the deadline. Because these goals were achieved, the efficiency of this system can be considered to have been demonstrated.
4.2. Significance of the System
The proposed system, as shown in the flowchart in Figure 1, simplifies and automates individual processes, which enables a collaborative and streamlined workflow among workers with the appropriate skills. This leads to improved efficiency and accurate results. Moreover, the strength of the system lies in its ability to maintain a seamless workflow even when the workers are temporally or spatially separated, as long as data transfer is successful. In this case, it is also possible to implement an asynchronous operation in which the results are accumulated for each task and the measurement process is performed after a certain amount of data has been collected. This allows for the appropriate allocation of personnel and equipment according to the workload of each process.
Thus, the system and method of operation adopted in this study enable overall efficiency and flexible operation in the processing.
4.3. Limits
This system is designed to be able to measure angles, areas, etc.; however, in the actual measurement implemented in our project, only simple length measurements have been performed.
This system has limitations in measurement resolution due to the use of digital images. Additionally, in the division of labor, there is a bias that arises from the fact that people perform measurements visually. This was eliminated by forming a consensus among the workers. In the consensus-building discussions, the measurement template was effective in clearly demonstrating bias.
4.4. Future Outlook
Looking ahead, during the building and operation of inspection systems for industries, food production, and the medical field, there may be cases where it is not possible to eliminate human judgment. Even if the automation of processing using AI is envisaged, the system cannot reach a state of completion all at once.
By adopting measurement templates and operating methods like this, it is possible to accumulate not only the results of measurements but also the data from the operating inspection and the diagnostic system’s processes that rely on visual judgment. Owing to this, it is possible to expect educational benefits as beginners learn to apply the method of image judgment using the marker positions set by experienced workers. Furthermore, these data can also be used as training data for AI-based recognition, and by improving its accuracy, it will be possible to transition to complete automation gradually and without difficulty.