Article

ECG Electrode Localization: 3D DS Camera System for Use in Diverse Clinical Environments

by Jennifer Bayer 1,†, Christoph Hintermüller 1,*,†, Hermann Blessberger 2,3 and Clemens Steinwender 2,3

1 Institute for Biomedical Mechatronics, Johannes Kepler University, 4040 Linz, Austria
2 Department of Cardiology, Kepler University Hospital, 4020 Linz, Austria
3 Medical Faculty, Johannes Kepler University, 4020 Linz, Austria
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Sensors 2023, 23(12), 5552; https://doi.org/10.3390/s23125552
Submission received: 24 March 2023 / Revised: 15 May 2023 / Accepted: 8 June 2023 / Published: 13 June 2023

Abstract: Models of the human body representing digital twins of patients have attracted increasing interest in clinical research for the delivery of personalized diagnoses and treatments. For example, noninvasive cardiac imaging models are used to localize the origin of cardiac arrhythmias and myocardial infarctions. The diagnostic value of such models depends on the precise knowledge of a few hundred electrocardiogram (ECG) electrode positions. Smaller positional errors are obtained when the sensor positions are extracted, along with the anatomical information, for example, from X-ray Computed Tomography (CT) slices. Alternatively, the amount of ionizing radiation the patient is exposed to can be reduced by manually pointing a magnetic digitizer probe to each sensor in turn; however, an experienced user requires at least 15 min to perform a precise measurement. Therefore, a 3D depth-sensing camera system was developed that can be operated under the adverse lighting conditions and limited space encountered in clinical settings. The camera was used to record the positions of 67 electrodes attached to a patient's chest. These deviate, on average, by 2.0 mm ± 1.5 mm from manually placed markers on the individual 3D views. This demonstrates that the system provides reasonable positional precision even when operated within clinical environments.

1. Introduction

Models of the human body have gained increasing interest in clinical research and are essential for delivering personalized diagnoses and treatments to patients. They can be used to build a digital twin of a patient's body for planning curative interventions and for predicting the outcomes of intended treatments or the likelihood of relapses and complications. Apart from knowledge of the patient's exact anatomy, most of these models require information about physiological processes, such as the impedance of the fibrous tissue forming an infarction scar, which differs largely from the impedance of intact myocardium.
Electrical impedance tomography (EIT) and electrical capacitance tomography (ECT) are used to measure tissue parameters such as impedance or capacitance [1,2,3]. The origin of cardiac arrhythmias or myocardial infarctions can be identified by integrating ECG recordings [4,5,6], and functions within the human brain [7,8] can be visualized using models that incorporate EEG recordings. All these methods require the positions of between 12 and a few hundred sensors to be known exactly. The larger the positional error, the lower the diagnostic value of the results generated by the model, and the less suitable they are for treatment planning, guidance, outcome stratification, or the prevention of complications and relapses.
A commonly used approach is to extract the sensor positions, along with the anatomical details, from Magnetic Resonance Imaging (MRI) stacks or X-ray Computed Tomography (CT) slices. Both approaches require special markers to be attached to the sensors that are visible in MRI [9,10] or CT scans [11]. Identifying the sensor positions from MRI or CT scans yields the smallest positional errors with respect to the true sensor positions. However, this approach significantly hinders the clinical uptake and widespread use of electrical impedance tomography (EIT), electrical capacitance tomography (ECT), noninvasive imaging of cardiac electrophysiology (NICE), and other model-based approaches. With CT scans, patients have to be exposed to large amounts of ionizing radiation, which limits the use of the aforementioned methods to three applications per year. Although MRI is not bound by this limitation, it is only covered by insurance companies if it is required for obtaining a proper diagnosis and evaluating outcomes.
Given these limitations, alternative approaches that decouple the generation of the underlying anatomical models from the localization of the sensors have been tested [12,13,14]. Alternatives such as magnetic digitizer systems, e.g., the Polhemus Fastrak [12], tracked gaming controllers [13], or motion capture systems have been used to identify the positions of electrodes relative to the patient's body. The use of photogrammetry, visual odometry, and stereoscopic approaches was already considered more than 15 years ago [15,16]. The Microsoft Kinect 3D depth-sensing (3D DS) camera was one of the first compact and affordable devices. Nowadays, modern coded-light and stereo-vision-based models are portable and lightweight enough to be easily attached to or even integrated within a standard tablet computer.
To date, 3D DS cameras have mainly been used in EEG-based studies to locate EEG sensors on the patient's skull [12,14,17]. All of these studies use the recorded EEG signals to localize brain activity or to identify the focus of a seizure within the cortex. In contrast, very few studies report the use of 3D DS cameras to locate ECG sensors on the chest or even the whole torso [18,19,20]. One reason for this may be that the skull is a rigid structure that does not change its shape when the subject moves during the recording. When recording the sensor positions on the torso, however, the patient needs to maintain a specific posture, and the instructions provided to the patient on how to achieve and maintain this posture are integral to the entire recording procedure.
In the present work, the positions of 64 ECG electrodes mounted on the torso are recorded using 3D DS camera readings only. Section 2 describes the overall structure of the developed 3D DS camera-based system; the method and algorithm used for the real-time recording of the individual 3D views of the torso (Section 2.2); the postprocessing steps necessary for extracting the electrode positions (Section 2.3); and the recording protocol used, including the instructions provided to each subject participating in the clinical testing (Section 2.4). In Section 3, the results obtained from five subjects are presented, and in Section 4, these results are discussed.

2. Materials and Methods

The 3D depth-sensing (3D DS) camera-based measurement of electrode positions can be divided into four main steps: (i) selecting the appropriate 3D DS camera, (ii) defining an appropriate measurement protocol, (iii) recording the 3D surfaces in real time, and (iv) extracting the electrode center points.
The most important component for recording the electrode positions is the 3D camera. It can be characterized by various parameters such as the closest distance $d_{near}$ and the vertical $\psi_V$ and horizontal $\psi_H$ fields of view (FOV). These parameters define the volume in front of the camera in which objects must be placed to be accurately captured by the depth sensor. Based on these considerations, the Intel Realsense SR300 3D DS camera [21] was selected. Descriptions of the exact selection criteria that led to this decision can be found in Section 2.1.
The human torso represents a flexible object that offers several degrees of freedom for movement and deformation in contrast to the rather rigid skull. The position of each ECG electrode perceived by the 3D DS camera and its relative position to the other electrodes is directly affected by the movements of the patient’s body. Therefore, it is essential to define an appropriate recording protocol before the first 3D data set is recorded. As large displacements may prevent the successful extraction of the electrode positions, the patient is required to actively maintain the same posture throughout the recording procedure. Details on how this active engagement of the patient can be achieved are described in Section 2.4. For the remaining steps, (iii) real-time recording and (iv) offline processing, Figure 1 provides an overview of the necessary sub-steps and their interdependence:
A 3D DS camera combines a depth sensor and an RGB color sensor in a single device. These two sensors simultaneously record an RGB color image $\Xi_{rgb}$ and a 16-bit depth image $D_{16}$. The latter encodes the distance $d$ between the camera and the objects located in front of it.
The developed real-time recording system is intended for use in diverse clinical settings such as examination rooms in outpatient clinics or local cardiology practitioner clinics. The lighting conditions encountered depend on the pointing direction of the camera and the number of light sources, as well as their brightness and color hue. In order to properly handle these conditions, the white-balancing settings, exposure time $\tau$, and overall gain $\Gamma_{ex}$ of the color sensor are continuously adjusted in real time. Automatic white balancing (AWB), which is described in detail in Section 2.2.1, uses $\Xi_{rgb}$ to estimate the color temperature $K_W$ of the dominant light source.
At the same time, a binary mask $M_D$ is generated from the depth image $D_{16}$. This mask splits $D_{16}$ into foreground pixels representing the torso surface and objects in the background (Section 2.2.3). $M_D$ is used to generate a 3D mesh $S$ of the imaged torso surface (Section 2.2.4) and to tune the exposure time $\tau$ and global gain setting $\Gamma_{ex}$ of the color sensor. This is achieved by combining $M_D$ with the brightness information $I$ of the color image $\Xi_{rgb}$ obtained during the AWB step (Section 2.2.2). The mask $M_D$ is also used to outline the patient's contours on the real-time preview screen, along with various system parameters.
When the trigger is pressed, the triangulation component (Figure 1) generates a 3D surface mesh $S$, which is stored along with the corresponding texture information $\Xi_{uvs}$ of the torso created from the RGB color image $\Xi$.
In the offline processing step, a pairwise iterative closest-point (ICP) algorithm is used to align the recorded surfaces $S$ with each other. The resulting transformation matrices are used to extract the 3D positions from the color-corrected texture images $\Xi_{uvs}$, which have been stored alongside each $S$ (Section 2.3.2). In order to facilitate the steps necessary to identify the color markers attached to the electrodes, an additional color-correction step, which is described in Section 2.3.1, is conducted. The aim of this step is to ensure that the patient's skin color and the marker colors are accurately represented across all the recorded texture images $\Xi_{uvs}$. To achieve this, the $\Xi_{uvs}$ are split into a chromacity image $\chi_{rg}$ and the corresponding intensity image $I$. Both are used to identify the red and blue pixels and the related 3D points corresponding to each electrode marker. Details on how this is achieved can be found in Section 2.3.3.
The centers of these markers are coaligned with the centers of the electrode clips and patches. Their positions on the surface are computed by fitting a planar model (Section 2.3.4) to the extracted red and blue points. In the final labeling step (Section 2.3.5), the electrode positions are assigned to the corresponding ECG signals recorded from the patient’s torso.
The colors of the markers vary depending on the position and orientation of the electrode clip relative to the torso and 3D DS camera. Therefore, a dedicated calibration procedure is utilized, which is outlined in Section 2.3.6, to determine the ranges of the red and blue color values that represent the electrode markers.

2.1. Selecting the Camera

The selected Intel Realsense SR300 3D DS camera [21] is used in narrow or crowded places such as examination rooms in outpatient clinics and cardiology practitioner clinics. In these places, the patient is typically seated on an examination bed or chair placed close to the wall. Consequently, the closest distance $d_{near}$ relative to the depth sensor at which objects may be placed has to be shorter than the shortest horizontal distance $d_{H,min}$ of the patient's torso to any surrounding obstacles such as walls or furniture. The horizontal $\psi_H$ and vertical $\psi_V$ FOVs determine how tall or wide the closest object can be to be fully captured in its height and width. The minimum required values for $\psi_H$ and $\psi_V$ can be approximated from the approximate torso height $h_{torso}$ of the patient and the camera's $d_{near}$ using the following relationships:
$$d_{near} < d_{H,min}, \qquad \psi_{min} = 2\arctan\!\left(\frac{h_{torso}}{d_{near}+d_{H,min}}\right), \qquad \max(\psi_H,\,\psi_V) \ge \psi_{min} \qquad (1)$$
According to the datasheet [21], the depth sensor can capture objects located at distances between 20 cm and 150 cm from the camera. This range is more than sufficient to record the surface of the torso. The depth information of each object is captured using an infrared sensor in combination with a near-infrared projector [21,22].
The depth images $D_{16}$ are recorded in 4:3 image format, covering a horizontal FOV of 69 degrees and a vertical FOV of 54 degrees at a depth resolution of less than 1 mm. The color sensor of the camera generates the RGB images $\Xi_{rgb}$ in 16:9 format. Its horizontal FOV of 68 degrees is sufficiently well-paired with the horizontal FOV of the depth sensor. With a $\psi_V$ of 41 degrees, however, it covers only two-thirds of the depth sensor's height. This results in a lack of color information for the pixels close to the top and bottom edges of the depth image $D_{16}$, which was considered when outlining the measurement protocol in Section 2.4.

2.2. Real-Time Recording

2.2.1. Automatic White Balancing

The color sensor used by the Intel Realsense 3D DS camera offers the possibility to indirectly tune the color gains $\Gamma_{\hat{R}}$, $\Gamma_{\hat{G}}$, and $\Gamma_{\hat{B}}$ manually by adjusting its color temperature parameter $K_W$. This was used to implement a custom AWB component (Figure 1), along with the algorithm proposed in [23], which can handle these varying conditions. A lookup table $\hat{v} = \mathcal{L}(v)$ is applied to linearize (gamma-decompress) and normalize to the interval $[0, 1]$ the red $\hat{R} = \mathcal{L}(R)$, green $\hat{G} = \mathcal{L}(G)$, and blue $\hat{B} = \mathcal{L}(B)$ color channels. The resulting linear RGB image $\Xi$ is then converted into an RGB chromacity image $\chi_{rg}$ and a linear grayscale image $I$ that encodes the brightness of each pixel.
$$I_i = \hat{R}_i + \hat{G}_i + \hat{B}_i, \qquad r_i = \hat{R}_i / I_i, \quad g_i = \hat{G}_i / I_i, \quad b_i = \hat{B}_i / I_i \qquad (2)$$
From $\chi_{rg}$, all pixels $(r_i, g_i, b_i)$ that encode shades of gray are selected. The red $r$, green $g$, and blue $b$ chromacity values of these pixels are located within a small area around the neutral color gamut point, which has a color temperature of 5500 K, as shown in Figure 2. The basic assumption is that these pixels most likely correspond to object surfaces of a neutral gray color. Consequently, a reddish tint in these pixels must be caused by a low color temperature $K$ of the predominant illumination, and a bluish cast most likely results from a light source with a large $K$. Overexposed pixels are excluded, as their color most likely results from the saturation of at least one of the three color channels and thus does not properly represent the skin color of the patient or the color of the illuminant. Likewise, underexposed pixels are not considered, as their color is most likely caused by camera noise rather than the light reflected by the imaged object.
For adjusting the color temperature setting $K_W$ of the 3D DS camera, only pixels $(r_{i,W}, g_{i,W}, b_{i,W}, I_{i,W})$ that are located within a small area surrounding the neutral color gamut point are selected, which is, according to Cohen [23], defined by the chromacity values $\bar{r} = 0.363$, $\bar{g} = 0.338$, and $\bar{b} = 0.299$. This area encloses all pixels that are located within the following two ellipses centered at the color gamut point:
$$\frac{(r_{i,W} - \bar{r})^2}{\sigma_r^2} + \frac{(g_{i,W} - \bar{g})^2}{\sigma_g^2} \le 1 \quad \text{and} \quad \frac{(b_{i,W} - \bar{b})^2}{\sigma_b^2} + \frac{(g_{i,W} - \bar{g})^2}{\sigma_g^2} \le 1 \qquad (3)$$
$$0.00155\, I_{max} < I_i < 0.955\, I_{max} \qquad (4)$$
Their primary and secondary axes are defined by the standard deviations of the red $\sigma_r = 0.0723$, green $\sigma_g = 0.0097$, and blue $\sigma_b = 0.0749$ chromacity values with respect to the neutral color gamut point, as determined in [23]. The maximum intensity encountered is $I_{max} = 3$.
The lower $I_{min} = 0.02$ and upper $I_{max} = 0.98$ exposure limits, as defined for each channel in [23], are linearized to $I_{min} = 0.02/12.92$ and $I_{max} = ((0.98 + 0.055)/1.055)^{2.4}$ before being applied to the overall linear intensity values $I$.
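The pixel selection of Equations (3) and (4) reduces to a few vectorized operations. The following minimal sketch illustrates this in Python with NumPy, assuming a linearized RGB image with channel values in $[0, 1]$; the input name `xi_lin` and the helper `select_gray_pixels` are illustrative assumptions, not part of the published implementation.

```python
import numpy as np

# Neutral color gamut point and ellipse axes according to Cohen [23]
R_BAR, G_BAR, B_BAR = 0.363, 0.338, 0.299
SIG_R, SIG_G, SIG_B = 0.0723, 0.0097, 0.0749

def select_gray_pixels(xi_lin):
    """Select near-neutral pixels of a linearized HxWx3 RGB image in [0, 1]."""
    I = xi_lin.sum(axis=-1)                     # linear intensity, I_max = 3
    safe = np.where(I > 0, I, 1.0)              # avoid division by zero
    r, g, b = (xi_lin[..., k] / safe for k in range(3))

    # exposure limits of Equation (4) with the linearized per-channel limits
    i_min = 3 * (0.02 / 12.92)
    i_max = 3 * ((0.98 + 0.055) / 1.055) ** 2.4
    exposed = (I > i_min) & (I < i_max)

    # the two chromacity ellipses of Equation (3)
    in_rg = ((r - R_BAR) / SIG_R) ** 2 + ((g - G_BAR) / SIG_G) ** 2 <= 1
    in_bg = ((b - B_BAR) / SIG_B) ** 2 + ((g - G_BAR) / SIG_G) ** 2 <= 1

    mask = exposed & in_rg & in_bg
    return r[mask], g[mask], b[mask], I[mask]
```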
To match $K_W$ with the color temperature $K$ of the light source, the overall color gain $\Gamma_K$ of the camera is estimated. The following model is used to simulate how the camera adjusts the gains $\Gamma_{\hat{R}}$ and $\Gamma_{\hat{B}}$ of its red and blue channels when $K_W$ is updated:
$$\Gamma_{\hat{R}} = \Gamma_K, \qquad \Gamma_{\hat{B}} = \frac{1}{\Gamma_K}, \qquad \Gamma_K = \frac{2\,(K_W - K_{W,min})}{K_{W,max} - K_{W,min}}, \qquad K_W = K_{W,n+1} = \gamma\, K_{W,n} \qquad (5)$$
Neither the lower and upper limits for $\Gamma_{\hat{R}}$ and $\Gamma_{\hat{B}}$ nor the color temperature corresponding to equal gain values $\Gamma_{\hat{R}} = \Gamma_{\hat{B}} = 1$ are documented for up-to-date 3D DS cameras. It is assumed that $\Gamma_{\hat{R}} = \Gamma_{\hat{B}} = 1$ corresponds to the center color temperature $\bar{K}_W = (K_{W,min} + K_{W,max})/2$ between the minimum $K_{W,min}$ and maximum $K_{W,max}$ values of the color sensor. On startup, $K_W$ is initialized to $K_{W,0} = \bar{K}_W$. For each recorded color image $\Xi_{n+1}$, the corresponding $K_{W,n+1} = \gamma\,K_{W,n}$ is estimated from the previous value $K_{W,n}$ and a scaler $\gamma$ reflecting the relative change of $K_W$ between two consecutive images $\Xi_n$. The color sensor of the used camera has a rolling shutter. Therefore, color images are only considered for estimating the scaler $\gamma$ and $K_W$ after the next exposure time interval has elapsed.
The goal is to minimize the distance between the average red $\bar{r}_W$, green $\bar{g}_W$, and blue $\bar{b}_W$ chromacities of the selected pixels and the $\bar{r}$, $\bar{g}$, $\bar{b}$ of the color gamut point that corresponds to a color temperature of $K = 5500\,\mathrm{K}$. To achieve this, $\bar{r}_W$, $\bar{g}_W$, and $\bar{b}_W$ are multiplied by the unknown intensity $I_\cdot = \bar{R}_W + \bar{G}_W + \bar{B}_W$ to obtain the corresponding mean red $\bar{R}_W$, green $\bar{G}_W$, and blue $\bar{B}_W$ color values. These values are scaled by $\gamma$ using (5). After scaling, the updated $\bar{r}_W$, $\bar{g}_W$, and $\bar{b}_W$ are computed using (2):
$$\bar{r} = \frac{\bar{r}_W\,\Gamma_K\,I_\cdot}{\bar{r}_W\,\Gamma_K\,I_\cdot + \bar{g}_W\,I_\cdot + \bar{b}_W\,I_\cdot/\Gamma_K}, \qquad \bar{g} = \frac{\bar{g}_W\,I_\cdot}{\bar{r}_W\,\Gamma_K\,I_\cdot + \bar{g}_W\,I_\cdot + \bar{b}_W\,I_\cdot/\Gamma_K}, \qquad \bar{b} = \frac{\bar{b}_W\,I_\cdot/\Gamma_K}{\bar{r}_W\,\Gamma_K\,I_\cdot + \bar{g}_W\,I_\cdot + \bar{b}_W\,I_\cdot/\Gamma_K} \qquad (6)$$
The unknown intensity $I_\cdot$ evidently has no impact on the result and can be omitted from (6) and from the computation of $\gamma$. Consequently, $\Gamma_K$ can be computed directly from $\bar{r}$, $\bar{g}$, $\bar{b}$, $\bar{r}_W$, $\bar{g}_W$, and $\bar{b}_W$.
In Figure 2, it can be observed that the curve along which the color gamut point moves can be approximated for color temperatures $K \le 5000\,\mathrm{K}$ by the line connecting the red corner $(r=1, g=0, b=0)$ and the midpoint $(r=0, g=0.5, b=0.5)$ between the blue $(r=0, g=0, b=1)$ and green $(r=0, g=1, b=0)$ corners of the chromacity space. For color temperatures $K > 5000\,\mathrm{K}$, the curve can be approximated by the line connecting the blue corner $(r=0, g=0, b=1)$ with the midpoint $(r=0.5, g=0.5, b=0)$ between the red $(r=1, g=0, b=0)$ and green $(r=0, g=1, b=0)$ corners. The two midpoints $(r=0.5, g=0.5, b=0)$ and $(r=0, g=0.5, b=0.5)$ correspond to the yellow $\bar{y} = (\bar{r} + \bar{g})/2$ and cyan $\bar{c} = (\bar{g} + \bar{b})/2$ chromacities, respectively. Based on the ratio $\bar{y}/\bar{b}$, the average chromacity value $\bar{g}_W\,\gamma$ of the green channel scaled by $\gamma$ can be expressed. The resulting expression is inserted into the quadratic Equation (8) obtained from the ratio $\bar{r}/\bar{c}$:
$$\frac{\bar{r}}{\bar{c}} = \frac{2\,\bar{r}}{\bar{g}+\bar{b}} = \frac{2\,\bar{r}_W\,\gamma}{\bar{g}_W + \bar{b}_W/\gamma}, \qquad \frac{\bar{y}}{\bar{b}} = \frac{\bar{r}+\bar{g}}{2\,\bar{b}} = \frac{\bar{r}_W\,\gamma + \bar{g}_W}{2\,\bar{b}_W/\gamma} \qquad (7)$$
$$\gamma^2\,\frac{\bar{r}_W}{\bar{r}} - \frac{\bar{b}_W}{\bar{b}} = 0 \qquad (8)$$
Solving (8) with respect to γ yields
$$\gamma = \sqrt{\frac{\bar{r}\,\bar{b}_W}{\bar{r}_W\,\bar{b}}} \qquad (9)$$
Along with $\gamma$, the actual error $E$ between the neutral illumination color gamut point $(\bar{r}, \bar{g}, \bar{b})$ and $(\bar{r}_W, \bar{g}_W, \bar{b}_W)$, the expected error $E^*$ after scaling $\bar{r}_W$ and $\bar{b}_W$ by $\gamma$, and the updated value $K_W^+ = K_W\,\Gamma_K$ are computed using (5):
$$E = (\bar{r} - \bar{r}_W)^2 + (\bar{g} - \bar{g}_W)^2 + (\bar{b} - \bar{b}_W)^2 \qquad (10)$$
$$E^* = (\bar{r} - \bar{r}_W\,\gamma)^2 + (\bar{g} - \bar{g}_W)^2 + (\bar{b} - \bar{b}_W/\gamma)^2 \qquad (11)$$
Based on these equations, the $K_W$ setting of the 3D DS camera is updated to $K_W^+$ if $E^* < E$. During testing, it was found that numerical inaccuracies can prevent the computation of appropriate estimates of the color temperature $K$ of the predominant illuminant. Therefore, the following numerically stable test is used instead to determine whether $K_W$ has to be updated or whether its current value can be kept:
$$\left(E > 1.758 \times 10^{-8}\right) \wedge \left(E - E^* > 10^{-15}\right) \qquad (12)$$
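Taken together, Equations (9)-(12) amount to a compact update rule. The following sketch illustrates one such step; the function name `awb_update` and the clipping to the sensor's $K_W$ range are illustrative assumptions rather than the authors' exact implementation.

```python
import numpy as np

R_BAR, G_BAR, B_BAR = 0.363, 0.338, 0.299   # neutral gamut point [23]

def awb_update(r_w, g_w, b_w, k_w, k_min=2500.0, k_max=6500.0):
    """Propose a new color temperature setting from mean gray chromacities."""
    # scaler gamma obtained by solving Equation (8), i.e., Equation (9)
    gamma = np.sqrt((R_BAR * b_w) / (r_w * B_BAR))

    # actual error E and expected error E* (Equations (10) and (11))
    e = (R_BAR - r_w) ** 2 + (G_BAR - g_w) ** 2 + (B_BAR - b_w) ** 2
    e_star = ((R_BAR - r_w * gamma) ** 2 + (G_BAR - g_w) ** 2
              + (B_BAR - b_w / gamma) ** 2)

    # numerically stable update test of Equation (12)
    if e > 1.758e-8 and e - e_star > 1e-15:
        k_w = float(np.clip(gamma * k_w, k_min, k_max))
    return k_w
```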

2.2.2. Patient-Locked Auto-Exposure

In addition to the overall color appearance, the light sources that are present also affect the overall light intensity $I$, which, among other factors, can vary depending on the viewing direction of the 3D DS camera. For example, in the case shown in Figure 3a, the camera is pointing toward a window, whereas in Figure 3b, it is pointing in the opposite direction toward the door.
In order to maintain a constant illumination intensity $I$ of the patient's torso, independent of the viewing direction and the overall brightness of all present light sources, the histogram-based auto-exposure (AE) algorithm proposed in [24] was adopted.
This algorithm is implemented in the exposure component (Figure 1). It considers only the pixels in $\Xi$ that correspond to the patient's torso. These pixels are selected by segmenting the depth image $D$ recorded by the 3D DS camera into a foreground object (the patient) and the remaining background using the approach outlined in Section 2.2.3. The binary mask $M_D$ obtained in this segmentation step is mapped to the color image $\Xi$ using the texture coordinates $v_{uvs}$ computed from the depth image $D_{16}$ by the camera control library. All brightness values $I_i$ of the pixels covered by the mapped mask $M_D$ are considered for adjusting $\tau$. Any other pixels, as well as pixels that are over- or underexposed according to Equation (4), are discarded.
The algorithm proposed by Chen and Li [24] uses the histogram of the gamma-compressed grayscale image $\bar{\Xi}$ computed from $\Xi$. In order to avoid the computational burden of an explicit conversion between the linear illumination image $I$ and $\bar{\Xi}$, the histogram $H(V)$ is computed directly from the linearized illumination values $I_i$ of the selected pixels. This is accomplished by maintaining a lookup table that lists the linearized bin boundary values $\hat{h}_v$ corresponding to the uniform boundaries $h_v$ of the grayscale histogram $H(V)$. The histogram $H(V)$ can then be generated for all considered $I_i$ using a left bisection search to scan this lookup table, which is far less computationally demanding. A further reduction can be achieved by precomputing the differences $\Delta H_2 = (I_i - 128)^2$ and $\Delta H_3 = (I_i - 128)^3$ for each bin, which are used to calculate the skewness $S(V)$ of $H(V)$.
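The lookup-table trick maps to a single binary search per pixel, for which NumPy's `searchsorted` performs the left bisection. A brief sketch, assuming sRGB gamma, 256 histogram bins, and intensities normalized to $[0, 1]$:

```python
import numpy as np

def srgb_decompress(v):
    """Linearize sRGB-encoded values in [0, 1] (standard sRGB formula)."""
    v = np.asarray(v, dtype=np.float64)
    return np.where(v <= 0.04045, v / 12.92, ((v + 0.055) / 1.055) ** 2.4)

# 256 uniform bin boundaries h_v in gamma space, linearized once at startup
edges_gamma = np.linspace(0.0, 1.0, 257)
edges_linear = srgb_decompress(edges_gamma)

def histogram_linear(i_values):
    """Histogram of linear intensities I_i over gamma-uniform bins."""
    bins = np.searchsorted(edges_linear, i_values, side="left")
    return np.bincount(np.clip(bins - 1, 0, 255), minlength=256)
```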
To compute the values of the exposure time $\tau$ and overall gain $\Gamma_{ex}$ to be set on the camera, the overall exposure parameter $\tau^*$ is used:
$$\tau^*_{n+1} = \tau^*_n - S(V)\,\tau_\Delta\,N_\tau\,\Gamma_{ex,n} \qquad (13)$$
$$\Gamma_{ex,n+1} = \max\!\left(\min\!\left(\frac{\tau^*_{n+1}}{\tau^{**}},\,\Gamma_{ex,max}\right),\,\Gamma_{ex,min}\right), \qquad \tau = \max\!\left(\min\!\left(\frac{\tau^*}{\Gamma_{ex,n+1}},\,\tau_{max}\right),\,\tau_{min}\right) \qquad (14)$$
$$\tau^{**} = \max(\min(\tau_{frame}\,\tau_\Delta,\,\tau_{max}),\,\tau_{min}) \qquad (15)$$
The parameter $\tau_\Delta$ represents the size of one $\tau$ step in milliseconds, $N_\tau = 5$ represents the number of steps to take if $S(V) = 1$, and $\tau_{frame} = 100\,\mathrm{ms}$ represents the optimal exposure time for each frame. The value of $\tau_\Delta$ depends on the actual step size in ms offered by the 3D DS camera.
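As a sketch, one AE update per frame might look as follows; the clip limits are hypothetical placeholders for the camera-specific values, and the function name is illustrative.

```python
import numpy as np

TAU_DELTA = 1.0      # size of one tau step in ms (camera-specific)
N_TAU = 5            # steps to take when the skewness S(V) equals 1
TAU_FRAME = 100.0    # optimal per-frame exposure time in ms

def update_exposure(tau_star, skewness, gain,
                    tau_min=0.1, tau_max=640.0, gain_min=1.0, gain_max=16.0):
    """One update of the exposure parameter, gain, and exposure time."""
    tau_star = tau_star - skewness * TAU_DELTA * N_TAU * gain   # Equation (13)
    tau_ref = np.clip(TAU_FRAME * TAU_DELTA, tau_min, tau_max)  # Equation (15)
    gain = float(np.clip(tau_star / tau_ref, gain_min, gain_max))
    tau = float(np.clip(tau_star / gain, tau_min, tau_max))     # Equation (14)
    return tau_star, tau, gain
```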

2.2.3. Depth Segmentation

The binary mask $M_D$ is created from the 16-bit depth images $D_{16}$ recorded by the 3D DS camera. It splits the image into the patient and any surrounding objects, obstacles, and relevant edges. The implementation was inspired by the Canny edge detection algorithm proposed in [25]. That algorithm uses two thresholds to find the edges in an image $\Xi$ based on the gradient $\Delta\bar{\Xi}$ of its corresponding grayscale image $\bar{\Xi}$. Pixels with a gradient value $\Delta x_i$ that exceeds the upper limit are considered to be part of an edge. Pixels with a value of $\Delta x_i$ between the two limits are only included in an edge if they are adjacent to an already identified edge pixel. To improve the obtained set of edges and reduce the number of edges caused by noise, the grayscale image $\bar{\Xi}$ is smoothed using a Gaussian filter.
This approach was adapted for processing depth images $D_{16}$ that contain pixels for which no valid depth value is available ($D_i = 0$). Directly computing the depth value gradient $\Delta D_i$ and the corresponding Gaussian filter weights $w_i$ is too demanding for real-time operation. Therefore, the depth gradient values $\Delta D_i = \sqrt{\Delta D_{x,i}^2 + \Delta D_{y,i}^2}$ of $D$ are rounded to the closest 16-bit integer value $\Delta_{16}D$. The resulting reduced set of $\Delta_{16}D$ values and the corresponding distinct weights $w_{16}$ are stored in a precomputed weights table instead of directly computing $w_i$ on every iteration for each pixel. This avoids the computationally demanding evaluation of $e^x$ and $\sqrt{x}$ in real time. A companion table with the squared boundary values $\Delta^2_{16}D = (\Delta_{j,16}D + \Delta_{j+1,16}D)^2/4$ between the individual $\Delta_{16}D$ ensures that the $w_i$ for a pixel $D_i$ of $D$ can be retrieved through a fast left bisection search. Pixels with $D_i = 0$ represent objects without a defined depth, and their values are copied to the smoothed depth image $\tilde{D}$ without any changes.
The smoothed $\tilde{D}$ is filtered using an octagonal Laplace kernel to find the initial set of edge pixels $d_e$:
$$\begin{pmatrix} \frac{1}{\sqrt{2}} & 1 & \frac{1}{\sqrt{2}} \\[2pt] 1 & -(4 + 2\sqrt{2}) & 1 \\[2pt] \frac{1}{\sqrt{2}} & 1 & \frac{1}{\sqrt{2}} \end{pmatrix}$$
An octagonal kernel has the advantage that all distances between the eight-connected neighbor pixels $d_8$ and the central pixel $d_c$ are treated as being of equal length.
All pixels $d$ that exhibit a sign change between opposing neighbor pixels $d_8$ in the Laplacian-filtered image $\tilde{D}$ are included in the initial set of edge points $d_e$. Pixels $d_e$ that have at least one neighbor $d_{k,8} = 0$ with an undefined depth are considered primary edge pixels $e_P$. Their actual $\Delta\tilde{D}(d_e)$ values are computed using the following approach:
$$\Delta\tilde{D}(d_e) = \begin{cases} \max\left|\tilde{D}_{+k,8} - \tilde{D}_{-k,8}\right| & \text{if } \tilde{D}_{-k,8} > 0 \text{ and } \tilde{D}_{+k,8} > 0 \\ 2\left|\tilde{D}_{-k,8} - \tilde{D}_i\right| & \text{if } \tilde{D}_{-k,8} > 0 \\ 2\left|\tilde{D}_{+k,8} - \tilde{D}_i\right| & \text{if } \tilde{D}_{+k,8} > 0 \end{cases} \qquad (16)$$
All $d_e$ where $\Delta\tilde{D}(d_e) > \Delta\tilde{D}_P$ are marked as $d_P$, whereas any other $d_e$ are only considered if the Canny rule for minor edge pixels $d_M$ holds. This rule has been modified for use on depth images $D$ as follows:
$$\left(\Delta\tilde{D}(d_e) > \Delta\tilde{D}_M\right) \wedge \left( \left(\Delta\tilde{D}(d_e) > \Delta\tilde{D}_P\right) \vee \left( \frac{\Delta\tilde{D}(d_e) - \Delta\tilde{D}_M}{\Delta\tilde{D}_P - \Delta\tilde{D}(d_e)} > 1 \right) \right) \qquad (17)$$
The upper Canny limit $\Delta\tilde{D}_P$ is set to 1.2 cm, and the minor limit $\Delta\tilde{D}_M$ is set to 0.35 cm.
A binary depth mask $M_D$ is created from all pixels $d_i > 0$ in $D$ with a known depth. Pixels $d_e$ located at any of the edges are excluded from $M_D$. The resulting $M_D$ is split into 9 segments $M_{D,9}$. The pixels $M_i$ within the central segment are labeled with respect to the different objects and components they represent. The labeled four-connected components $L_4$ are sorted by size. The largest $L_4$ that touches the segment boundary is extended to all other $M_{D,9}$ segments using the flood-fill method, starting from the center of mass of $L_4$. At the end of this step, all adjacent edge pixels $d_e$ are appended to the extended representation $L_4^+$ of $L_4$.
As the depth values at the boundaries of $L_4^+$ can vary largely, the following approach is used to remove any unrelated outliers. It is based on the observation that the boundaries of the patient's torso are well-separated from the background along the vertical direction and above the head:
$$D_{close} < D_i(M_i) < D_{far} \qquad (18)$$
$$D_{far} = \min\left( \max(D_{r,far}) + \frac{\max(D_{r,far}) - \min(D_{r,close})}{3},\;\; \overline{D_{r,far}} + 3\,\sigma(D_{r,far}) \right) \qquad (19)$$
$$D_{close} = \max\left( \min(D_{r,close}) - \frac{\max(D_{r,close}) - \min(D_{r,close})}{3},\;\; \overline{D_{r,close}} - 3\,\sigma(D_{r,far}),\;\; D_{near} \right) \qquad (20)$$
The values $D_{r,close}$ and $D_{r,far}$ correspond to the smallest and largest depth values encountered for the mask pixels $M_i$ within each row of $L_4^+$, and $\overline{D_{r,close}}$, $\overline{D_{r,far}}$, $\sigma(D_{r,close})$, and $\sigma(D_{r,far})$ represent their means and standard deviations. Any pixel $M_i$ for which the condition in (18) does not hold is removed from $L_4^+$. If either the number of pixels of $L_4^+$ is less than 200 or no appropriate values for $D_{far}$ or $D_{close}$ could be found, the current $L_4$ is discarded, and the search for a suitable $L_4^+$ representing the patient is attempted with the next larger $L_4$. If no suitable $L_4$ is left, segmentation is aborted, and real-time processing continues with the next set of depth and color image frames recorded by the 3D DS camera.
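A minimal sketch of the row-based bounds of Equations (18)-(20), assuming `depth` holds the 16-bit depth image and `mask` the boolean $L_4^+$ component; the helper name and the millimeter-valued `d_near` default are illustrative assumptions.

```python
import numpy as np

def depth_bounds(depth, mask, d_near=200.0):
    """Estimate D_close and D_far from per-row depth extrema of the mask."""
    rows = [depth[r][mask[r]] for r in range(depth.shape[0]) if mask[r].any()]
    r_close = np.array([row.min() for row in rows], dtype=np.float64)
    r_far = np.array([row.max() for row in rows], dtype=np.float64)

    spread_far = (r_far.max() - r_close.min()) / 3.0        # Equation (19)
    d_far = min(r_far.max() + spread_far,
                r_far.mean() + 3.0 * r_far.std())

    spread_close = (r_close.max() - r_close.min()) / 3.0    # Equation (20)
    d_close = max(r_close.min() - spread_close,
                  r_close.mean() - 3.0 * r_far.std(), d_near)
    return d_close, d_far

# pixels outside (d_close, d_far) are then removed from the mask, as in (18):
# mask &= (depth > d_close) & (depth < d_far)
```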

2.2.4. Surface Mesh Generation

The final surface mesh is generated by converting the depth image $D$ into a corresponding point cloud $P$. Therein, each point $v_i$ corresponds to a specific pixel $d_i$ in $D$. Pixels $d_i = 0$ without a defined depth value are assigned the origin point $v_i = O = (0, 0, 0)$. The unique correspondence between any $d_i$ and its $v_i$ allows creating $S$ by mapping a pre-triangulated grid $G$ to $P$. Any triangle $T$ that includes at least one $v_i$ for which $d_i = 0$ is dropped from $G$.
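The grid mapping can be sketched as follows; the pinhole intrinsics `fx`, `fy`, `cx`, and `cy` stand in for the calibration values provided by the camera control library, and the function name is illustrative.

```python
import numpy as np

def mesh_from_depth(depth, fx, fy, cx, cy):
    """Back-project a depth image and keep grid triangles with valid depth."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.astype(np.float64)
    # pinhole back-projection of every pixel d_i to its point v_i
    pts = np.stack(((u - cx) * z / fx, (v - cy) * z / fy, z), axis=-1)
    pts[depth == 0] = 0.0                       # undefined depth -> origin O

    idx = v * w + u
    # two triangles per cell of the pre-triangulated grid G
    quads = np.stack([idx[:-1, :-1], idx[1:, :-1], idx[:-1, 1:],
                      idx[:-1, 1:], idx[1:, :-1], idx[1:, 1:]], axis=-1)
    tris = quads.reshape(-1, 3)
    # drop any triangle T that references a pixel with d_i == 0
    valid = (depth.ravel() != 0)[tris].all(axis=1)
    return pts.reshape(-1, 3), tris[valid]
```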
Before $S$ is stored on disk in the .obj format, along with $\Xi_{uvs}$ and the color temperature setting $K_W$ it was recorded with, degenerated triangles $T_{A=0}$ and occluded triangles $T_1$ that do not correspond to a valid surface patch are removed. The filtering of $T_1$ is facilitated by the fact that 3D DS cameras, especially those that can capture objects located a short distance from the camera, use a dedicated RGB color sensor to record $\Xi_{uvs}$. This sensor is typically attached to the left or right side of the depth sensor system and thus views the imaged object from a slightly different angle. This difference in viewing angle and FOV between the depth and color sensors is sufficiently large to identify triangles that do not represent a part of the object's real surface: it causes the surface normal $n_1$ of such a triangle to flip its direction between the representation of $T_1$ in the depth image $D$ and in $\Xi$. This flip is not plausible, as it would mean that the color sensor captures the back side of $T_1$ while the depth sensor captures its front side, which is prevented by the fact that both sensors are mounted on the same support. The following approach exploits this fact by identifying triangles whose surface normal sign, and thus direction, appears flipped in $\Xi$ compared to $D$.
The pre-triangulated grid $G$ is initialized such that the normal vector $n_T$ of each triangle $T$ on $S$ points toward the camera, i.e., against the viewing direction $Z$ of the camera ($\langle n, Z \rangle < 0$). For every valid $T$ of the initial surface mesh $S$, the normal vector $n_{uvs}$ of its representation $T_{uvs}$ in $\Xi_{uvs}$ must also point in the $-Z_{uvs}$ direction. For triangles $T_1$ where the signs of $n$ and $n_{uvs}$ are opposite, indicated by $\langle n_{uvs}, Z_{uvs} \rangle \ge 0$, it is assumed that $T_1$ does not represent a valid part of $S$, and it is removed.
In addition, triangles $T_{A=0}$ with a degenerated representation $T_{uvs}$ in $\Xi_{uvs}$ are removed. This includes triangles with an area $A_{uvs} < 0.25$ pixels, cases where $T_{uvs}$ has a shortest edge of less than half a pixel, and triangles that extend beyond the top and bottom edges of $\Xi_{uvs}$.
Further, skinny triangles $T_{\varphi<13}$ are discarded if they enclose at least one angle $\varphi$ between any two of their edges $e_a$, $e_b$, and $e_c$ that is smaller than 13 degrees and if the lengths $|e_c|$ and $|e_b|$ of their two longest edges $e_c$ and $e_b$ conform to the following conditions:
$$\left(|e_c| > \overline{|e_{KNN}|} + 4\,\sigma_{|e_{KNN}|}\right) \wedge \left(|e_b| > \overline{|e_{KNN}|} + 4\,\sigma_{|e_{KNN}|}\right) \qquad (21)$$
$$\left(|e_c| > \overline{|e_{KNN}|} + 4\,\sigma_{|e_{KNN}|}\right) \wedge \left(\mathrm{count}(\varphi < 13^\circ) = 2\right) \qquad (22)$$
To compute the average length $\overline{|e_{KNN}|}$ and standard deviation $\sigma_{|e_{KNN}|}$, only triangles $T_{KNN}$ are considered that are formed by any three K-nearest neighbors $v_{KNN}$ located within a radius of $\max(0.9\,|e_b|, |e_a|)$ around the tip vertex $v_{\varphi<13}$ of $T_{\varphi<13}$ and the midpoint of its shortest edge $e_a$. Additionally, any $T_{\varphi<13}$ that has to be discarded according to (21) results in the deletion of all adjacent $T_{\varphi<13}$ connected to its $e_b$ or $e_c$, whereas for any $T_{\varphi<13}$ satisfying (22), only the $T_{\varphi<13}$ adjacent to $e_c$ is removed. Finally, duplicate vertices $v_i \equiv v_j$ encoding the same point and vertices $v$ not referenced by any triangle are removed from the surface $S$, along with all small disconnected surface patches $S_{dis}$.
The surface $S$ is stored on disk in the .obj format, along with the corresponding texture information $\Xi_{uvs}$. Its triangle normals $n_T$ and vertex normals $n_v$ are recomputed, and a transformation is applied to all vertices and normals. The latter ensures that the $z$-axis points in the direction of the patient's head and that the positive $x$-axis extends from the left to the right side of the torso. The origin point is selected such that it is located on the central viewing axis of the camera. To compute its y-component, the point cloud is divided into 3 sections along the vertical direction, roughly representing the chest, belly, and hips of the patient from top to bottom. The points within the top third are further split into 5 subgroups from right to left along the x-axis. For the rightmost and leftmost groups, the median coordinates $\hat{y}_r$ and $\hat{y}_l$ are computed. Based on these values, the final y-coordinate of the origin point is computed as $y = (\hat{y}_r + \hat{y}_l)/2$.
This ensures that all surfaces are located close to each other and that they partially overlap. At the same time, the actual relative shift between the surfaces and the angle at which the camera views the surface is retained as much as possible. This is crucial for the registration process described in Section 2.3.2.

2.3. Offline Processing

The electrode positions are computed using a set of at least 14 recordings of the torso surface, covering a minimum angle of approximately 270 degrees in the horizontal plane. The necessary steps, depicted in Figure 1, are presented in the following subsections. These steps include the pairwise alignment and registration of the recorded surfaces S, as described in Section 2.3.2; the extraction of the points v representing the colored electrode markers, as described in Section 2.3.3; and the fitting of a model of the marker to identify its central point, as described in Section 2.3.4. In the final step, a unique label is attached to each position, which uniquely links the individual ECG signals and the 3D position of the corresponding electrode.

2.3.1. Color Correction

The color sensor of the Intel Realsense SR300 camera (Intel Corporation, Santa Clara, CA, USA) offers only a limited range (between $K_{W,min} = 2500$ and $K_{W,max} = 6500$) within which the color temperature parameter $K_W$ can be tuned using the algorithm discussed in Section 2.2.1. This range is optimized for indoor use [21,22], where typical light sources include incandescent tungsten lamps ($K = 2500$), fluorescent lights ($K = 3800$), and standardized CIE sources such as CIE55 ($K = 5000$) or CIE65 ($K = 6500$).
The space limitations encountered in clinical settings, for example, outpatient and cardiology practitioner clinics, result in more challenging illumination conditions that can vary significantly depending on factors such as the patient's seating position or the camera's direction. Specifically, individual objects and parts of the room may be shaded by other objects, for example, the electrodes on the patient's back. Shaded areas are characterized by color temperature values $K > 7000$, which are significantly larger than the $K_{W,max} = 6500$ upper limit assumed by the color sensor. Examples of this situation are shown in Figure 4a,c.
Therefore, an additional color-correction process is applied to the recorded texture images $\Xi$ and the 3D surfaces. A virtual camera is used to simulate the recording of $\Xi$ with a different $K_W$ setting than the actual one. This virtual camera offers an AWB range between $K_{W,min} = 2000$ and $K_{W,max} = 9000$. It uses the model introduced in Section 2.2.1 to adjust the gains of its red ($\Gamma_{\hat{R}} = \Gamma_K$) and blue ($\Gamma_{\hat{B}} = 1/\Gamma_K$) color channels.
The virtual camera internally stores a linearized and normalized representation $\hat{\Xi}_=$ of $\Xi_{uvs}$. This representation corresponds to an image recorded with an equal gain $\Gamma_K = 1$ at $K_W = 5500$:
$$\Gamma_{K,=} = \frac{2\,(K_{W,uvs} - K_{W,min})}{K_{W,max} - K_{W,min}}, \qquad \hat{R}_{i,=} = \frac{\hat{R}_i}{\Gamma_{K,=}}, \qquad \hat{G}_{i,=} = \hat{G}_i, \qquad \hat{B}_{i,=} = \hat{B}_i\,\Gamma_{K,=}, \qquad \Gamma_K = \Gamma_{K,=} \qquad (23)$$
Its white-balancing parameter $K_W$ is initialized to the color temperature $K_{W,uvs}$ at which $\Xi_{uvs}$ was recorded by the color sensor of the 3D DS camera.
After initialization, the color-correction approach described in Section 2.2.1 is used to adjust the $K_W$ of the virtual camera until a suitable value $K_W^+$ is found. If $K_W^+$ jitters around its ideal value for at least 20 repetitions, the color correction stops when the following condition is met:
$$-1 \le K_{n+1,W}^+ - K_{n,W}^+ \le 1 \qquad (24)$$
In this case, $K_W^+$ is set to the mean $\overline{K_W^+}$ of the last 3 smallest updates for which the difference between consecutive $K_W^+$ values is less than 10. With each update of $K_W$, a new version of $\Xi_{uvs}$ is created by multiplying the red color values $\hat{R}_=$ of $\hat{\Xi}_=$ by the updated $\Gamma_K^+$, multiplying the blue values by $1/\Gamma_K^+$, and performing a left bisection search on the lookup table $\hat{V} = \mathcal{L}(V)$ established in Section 2.2.1. Pixels that are overexposed according to (4) are not modified. Pixels that appear overexposed after scaling, exceeding a maximum value of 1 in at least one channel, are assumed to be fully saturated, and all three of their channels are set to the maximum value of 1. Pixels that appear underexposed, with at least one channel having a value of less than $10^{-8}$, are assumed to be unexposed, and all three of their channels are set to 0. Additionally, all channels are clipped to the maximum possible value of 1 where necessary. The color-optimized version of $\Xi_{uvs}$ (Figure 4b,d) is then used to extract the 3D points of the electrode markers, as described in Section 2.3.3.
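A small sketch of this virtual re-white-balancing step, assuming `xi_eq` is the stored equal-gain image $\hat{\Xi}_=$ as a float array in $[0, 1]$; the gain model follows Equation (5), and the function name is illustrative.

```python
import numpy as np

K_MIN, K_MAX = 2000.0, 9000.0   # AWB range of the virtual camera

def apply_virtual_gain(xi_eq, k_w):
    """Simulate recording the equal-gain image with color temperature k_w."""
    gamma_k = 2.0 * (k_w - K_MIN) / (K_MAX - K_MIN)   # gain model of Eq. (5)
    out = xi_eq.copy()
    out[..., 0] *= gamma_k           # red channel gain Gamma_K
    out[..., 2] /= gamma_k           # blue channel gain 1 / Gamma_K
    return np.clip(out, 0.0, 1.0)    # clip channels to the valid range
```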

2.3.2. Surface Registration

To align the surfaces, a point-to-plane algorithm was chosen. This kind of ICP algorithm minimizes the distances $l = |v_S - v_T|$ between corresponding points $v_S$ and $v_T$ along the direction of the surface normals $n_T$ of the target surface $S_T$, given a rigid transformation $\mathcal{T}$ of the source surface $S_S$:
$$E(\mathcal{T}) = \sum_{(v_S,\,v_T)\,\in\,C} \left\langle \mathcal{T}\,v_S - v_T,\; n_T \right\rangle^2 \rightarrow \min \qquad (25)$$
A precise alignment between $S_T$ and $S_S$ across all surface pairs is achieved when Equation (25) is also minimal in the reverse case, with $S_T$ and $S_S$ swapped. The following simple symmetric point-to-plane approach is used by the registration component (Figure 1) to align the surfaces. It was chosen in favor of other symmetric point-to-plane algorithms such as [26] because it can be directly implemented using the unidirectional ICP functions of the open3D library [27]. In the first step, the forward transformation matrix $\mathcal{T}_f$ is computed for the set of corresponding points $(v_T, v_S) \in C_f$ by applying (25). In the second step, the reverse transformation $\mathcal{T}_r$ is computed for the points $(v_S, v_T) \in C_r$ corresponding to the reversed setup, with its initial value $\mathcal{T}_{0,r}$ set to $\mathcal{T}_f^{-1}$. The set $C_f$ is selected from the subset of points $v_S$ that are located within the maximum correspondence distance $l_c$ of $v_T$; the same selection criterion is used for the reverse set $C_r$ with respect to any $v_S$. In the final step, the optimal transformation $\mathcal{T}$ and the new correspondence distance $l_{c+1}$ are selected from $\mathcal{T}_f$, $\mathcal{T}_r$, $l_{c+1,f}$, and $l_{c+1,r}$ using the following criteria:
$$\mathcal{T},\,l_{c+1} = \begin{cases} \mathcal{T}_r,\;l_{c+1,r} & \text{if } E(\mathcal{T}_r) < \{E(\mathcal{T}),\,E(\mathcal{T}_f)\} \,\wedge\, l_{c+1,r} < \{l_c,\,l_{c+1,f}\} \\ \mathcal{T}_f,\;l_{c+1,f} & \text{if } E(\mathcal{T}_f) < \{E(\mathcal{T}),\,E(\mathcal{T}_r)\} \,\wedge\, l_{c+1,f} < \{l_c,\,l_{c+1,r}\} \end{cases}$$
$$l_{c+1,f} = \bar{l}_f + 2\,\sigma_{l_f}, \qquad l_{c+1,r} = \bar{l}_r + 2\,\sigma_{l_r}$$
$$\bar{l}_f = \operatorname*{mean}_{(v_T,v_S)\in C_f}(|v_S - v_T|), \qquad \sigma_{l_f} = \operatorname*{std}_{(v_T,v_S)\in C_f}(|v_S - v_T|)$$
$$\bar{l}_r = \operatorname*{mean}_{(v_S,v_T)\in C_r}(|v_T - v_S|), \qquad \sigma_{l_r} = \operatorname*{std}_{(v_S,v_T)\in C_r}(|v_T - v_S|) \qquad (26)$$
The surfaces $S$ recorded using the approach described in Section 2.2 are aligned such that they more or less share the same space, apart from the small rotation $\Delta\varphi$ along the horizontal direction and the relative vertical movement $\Delta z$ between the camera positions. No information about their orientation in space or about how much each pair overlaps is recorded. For obtaining sufficiently precise positions of the electrodes, the optimal correspondence distance $l_o$ between any $(v_T, v_S)$ should be $l_o \le 1\,\mathrm{mm}$. Therefore, the symmetric ICP registration is repeated for each pair in multiple runs. The results obtained for $\mathcal{T}$ and $l_{c+1}$ in the previous run are used to initialize $\mathcal{T}_0$ and $l_c$ in the next run. If the condition in (26) for updating $\mathcal{T}$ and $l_c$ fails, one last run is attempted with $l_c = l_{min} \approx 1\,\mathrm{mm}$ if $l_c < l_{min}$ and $l_{c-1} - l_c > \sigma_{l_0}$ holds. For the first optimization run, $\mathcal{T}_0$ is initialized to roughly reflect the relative rotation about the z-axis between the two recorded surfaces $S_T$ and $S_S$ and their relative shift $\Delta z$ along the z-axis. The following approach is used to estimate the relative rotation angle $\Delta\varphi$ between $S_T$ and $S_S$:
$$\Delta\varphi = \arccos\frac{\left\langle \hat{v}_{l,S} - \hat{v}_{r,S},\; \hat{v}_{l,T} - \hat{v}_{r,T} \right\rangle}{|h_S|\,|h_T|}, \qquad h_S = \hat{v}_{l,S} - \hat{v}_{r,S}, \quad h_T = \hat{v}_{l,T} - \hat{v}_{r,T} \qquad (27)$$
The right median points $\hat{v}_{r,S}$, $\hat{v}_{r,T}$ and left median points $\hat{v}_{l,S}$, $\hat{v}_{l,T}$ define the horizontal directions $h_S$ and $h_T$ of the sagittal planes with respect to $S_S$ and $S_T$. They are computed using the same approach described in Section 2.2.4 for defining the final position of the origin along the y-coordinate.
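One forward/reverse run of this scheme can be sketched with the unidirectional ICP of the open3D library as follows. The surfaces are assumed to be available as point clouds with normals, and the residual-based selection shown here is a simplified stand-in for the full criteria of Equation (26).

```python
import numpy as np
import open3d as o3d

def symmetric_icp_run(source, target, l_c, init=np.eye(4)):
    """One forward and one reverse point-to-plane ICP pass."""
    p2l = o3d.pipelines.registration.TransformationEstimationPointToPlane()
    fwd = o3d.pipelines.registration.registration_icp(
        source, target, l_c, init, p2l)
    # the reverse run starts from the inverse of the forward result
    rev = o3d.pipelines.registration.registration_icp(
        target, source, l_c, np.linalg.inv(fwd.transformation), p2l)
    # keep whichever direction yields the lower residual error
    if rev.inlier_rmse < fwd.inlier_rmse:
        return np.linalg.inv(rev.transformation), rev.inlier_rmse
    return fwd.transformation, fwd.inlier_rmse
```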
Suitable estimates for $l_{c,max}$, $l_{c,min}$, and $\sigma_{l_0}$ are essential for achieving a sufficiently precise alignment of $S_T$ and $S_S$. When testing the implementation of the symmetric ICP, it was empirically found that the values for $l_{c,max}$, in particular, varied significantly depending on the relative distance and angle between two consecutive surfaces. Initially, constant values were assigned to $l_{c,max}$ and $l_{c,min}$. However, these values resulted, on average, in an insufficient alignment between the surfaces. Specifically, the alignment of the surfaces on the left side, where the front and back sides of the torso meet, was rather challenging and, in some cases, not possible at all.
In order to improve the results and ensure a proper alignment between the surfaces, the following approach is used to determine suitable estimates of $l_{c,max}$, $l_{c,min}$, and $\sigma_{l_0}$ for each pair $S_T$ and $S_S$. These estimates are computed based on the distances between the vertices $v_T$ and $v_S$ within the volume $V_{TS} = V_T \cap V_S$, which represents the common region of the axis-aligned bounding boxes $V_T$ and $V_S$ encompassing the target surface $S_T$ and the shifted source surface. The latter is obtained by applying an initial transformation $\mathcal{T}_0$ to the source surface $S_S$. The transformation $\mathcal{T}_0$ shifts all $v_S \in V_{TS}$ such that their center of mass $\hat{v}_S$ aligns with the center of mass $\hat{v}_T$ of all $v_T \in V_{TS}$. The value for $l_{max}$ is obtained by applying (26) to the distances between the points in the forward correspondence set $(v_T, v_S) \in C_{f,0}$ and the backward correspondence set $(v_T, v_S) \in C_{r,0}$. Both sets are found through a KNN search [28,29] that also considers the surface normals $n_T$ and $n_S$ in each $v_T$ and $v_S$. This has the advantage that $v_T$ and $v_S$ are only considered corresponding when their surface normals $n_T$ and $n_S$ are closely aligned. From the resulting $C_{f,0}$ and $C_{r,0}$, any $v_T$ and $v_S$ are removed if the deviation between their surface normals $n_S$ and $n_T$ exceeds 30 degrees ($\langle n_S \mid n_T \rangle < 0.98$).
The estimate for $l_{min}$ is based on the overall mean of the shortest neighbor distances within all $v_T \in V_{TS}$ and all $v_S \in V_{TS}$.
From the final $\mathcal{T}$ of all consecutive pairs $S_T$ and $S_S$, the global alignment of each $S_i$ is determined by the cumulative transformation $\mathcal{T}_i = \prod_{j \le i} \mathcal{T}_j$, starting with the identity $\mathcal{T}_1 = I$ for the first surface $S_1$. Alternatively, the transformation of the first surface can be initialized with the horizontal camera inclination angle $\varphi$ about the z-axis using (27). From this, the relative angle between $S_1$ and the x-axis of the patient's frontal plane is computed, which already provides a rough alignment of the resulting point cloud of the torso with its frontal plane.

2.3.3. Electrode Marker Extraction

In the current setup, the electrodes are attached to g.LADYbird™ active electrode clips from g.tec medical engineering GmbH, Schiedlberg, Austria. These clips have a circular head whose center is aligned with the center of the electrode. The clip itself is covered with red-colored epoxy to protect the integrated electronics from water and other liquids. The circumference of the head is painted blue to model a circular electrode marker with a blue boundary and a red central disk. Figure 5 shows an example of this basic setup.
The blue boundary color (see Figure 5b) is selected such that the electrode marker can easily be detected within the RGB chromacity space representations $\hat{\chi}_{rg}$ of the surface texture images $\Xi_{uvs}$. The $\hat{\chi}_{rg}$ values are obtained as a byproduct of the white-balancing and light color-temperature correction approaches described in Section 2.3.1.
Each $\hat{\chi}_{rg}$ is scanned for red $x_r = (r_r, g_r, b_r)$ and blue $x_b = (r_b, g_b, b_b)$ pixels that are fully described by one of the following two ellipses within the RGB chromacity space:
$$\frac{\left(\delta r_r \cos\phi_r - \delta g_r \sin\phi_r\right)^2}{\sigma(r_r)} + \frac{\left(\delta r_r \sin\phi_r + \delta g_r \cos\phi_r\right)^2}{\sigma(g_r)} \le 1 \quad \text{with} \quad \delta r_r = r_r - \overline{r_r},\; \delta g_r = g_r - \overline{g_r} \qquad (28)$$
$$\frac{\left(\delta r_b \cos\phi_b - \delta g_b \sin\phi_b\right)^2}{\sigma(r_b)} + \frac{\left(\delta r_b \sin\phi_b + \delta g_b \cos\phi_b\right)^2}{\sigma(g_b)} \le 1 \quad \text{with} \quad \delta r_b = r_b - \overline{r_b},\; \delta g_b = g_b - \overline{g_b} \qquad (29)$$
The values $\overline{r_r}$, $\overline{g_r}$, $\overline{r_b}$, and $\overline{g_b}$ define the red and green coordinates of the center points of the two ellipses, and $\phi_r$ and $\phi_b$ are the rotation angles by which each of them is rotated with respect to the red axis of the RGB chromacity space. Their values are determined through the calibration procedure described in Section 2.3.6. All matching $x_r$ and $x_b$ pixels are mapped to their corresponding 3D vertices $v_r$ and $v_b$ on the torso surface $S$. This mapping is accomplished by computing the barycentric coordinates of each $x_r$ and $x_b$ within the representation of the enclosing surface triangle $T$ in $\Xi_{uvs}$.
The resulting marker point cloud $P_M$ formed by all $v_r$ and $v_b$ is filtered to retain the $v_r$ and $v_b$ that likely correspond to a valid electrode marker, as defined by the colors of the clip head. This is achieved by a radius-based KNN search for at least one neighbor of the opposite color. The radius $\rho$ is set to the radius of the clip head for all $v_r$ and to the width of the blue boundary ring for all $v_b$. If the neighborhood of radius $\rho$ does not contain any points of the opposite color, $v$ is removed from $P_M$.
The filtered $P_M$ is split into individual clusters of $v_{el} \in v_r \cup v_b$, representing the individual electrode clips. This is accomplished by applying the HDBSCAN algorithm [30]. Its results are more robust than those of the basic DBSCAN algorithm [31], especially in the presence of groups of outliers, for example, those generated by a bluish shadow cast on the cables and electrode clips. In addition, a minimum distance $\epsilon_{split}$ can be defined below which clusters are not split any further. In contrast to the basic DBSCAN [31] algorithm, $\epsilon_{split}$ defines a lower boundary limit rather than a strict cutting distance. In other words, less dense clusters with an average density exceeding $\epsilon_{split}$ are not necessarily forced to split into distinct leaf clusters. The parameters of the minimum cluster size $N_{C,min}$ and the minimum number of samples $X_{C,min} = 20$ are used to fine-tune and control the extraction of the clusters that represent the individual electrode markers, considering the actual number of electrodes $N_{el}$:
$$\epsilon_{split} = r_{nh} \qquad (30)$$
$$N_{C,min} = \max\left( \frac{\mathrm{count}(v_r)}{4\,N_{clip}},\; X_{C,min} \right) \qquad (31)$$
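The clustering step maps directly onto the hdbscan Python package, as in the following sketch; the neighborhood radius `r_nh` and the clip count `n_clips` are stand-ins for the values defined above, and the helper name is illustrative.

```python
import numpy as np
import hdbscan

def cluster_markers(v_red, v_blue, r_nh, n_clips, x_c_min=20):
    """Cluster red and blue marker vertices into electrode candidates."""
    pts = np.vstack([v_red, v_blue])
    # minimum cluster size according to Equation (31)
    n_c_min = max(len(v_red) // (4 * n_clips), x_c_min)
    labels = hdbscan.HDBSCAN(
        min_cluster_size=int(n_c_min),
        min_samples=x_c_min,
        cluster_selection_epsilon=float(r_nh),  # epsilon_split, Equation (30)
    ).fit_predict(pts)
    return pts, labels   # label -1 marks outliers
```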
In order to simplify the subsequent processing steps, the overall point cloud $P_S$, as well as $P_M$, is realigned such that the frontal plane of the torso is in line with the x-z plane of the coordinate system. This is achieved by once again splitting $P_S$ into chest, belly, and hip sections. The points of the chest section are further split along the x-axis into three parts, representing the right shoulder, the neck, and the left shoulder. The final transformation is computed by aligning the vector between the median points of the left and right shoulders with the x-axis of the frontal plane.

2.3.4. Fitting Marker Model

The red points $v_r$ and blue points $v_b$ within each cluster are fitted to a planar marker model consisting of a red disk enclosed within a blue ring. Before fitting, all $v_r$ and $v_b$ are projected onto the plane $Q_{cl}$ of the cluster:
$$v_{el}' = v_{el} - \left\langle v_{el} - \bar{v}_{el},\; n_{mj} \right\rangle n_{mj} \qquad (32)$$
This ensures that all $v_{el}$ are located on $Q_{cl}$, which is defined by the predominant surface normal direction $n_{mj}$ among all surface normal vectors $n_{el}$ of the $v_{el}$ and by their center of mass $\bar{v}_{el}$:
$$n_{mj} = \sum_{i \le r} U_i\,\sigma_i\,V_i \quad \text{with} \quad U\,\Sigma\,V = \mathrm{svd}\!\left(n_G(v_{el})\right), \qquad r = \max\left\{\, 1 \le i \le K \;\middle|\; 10\log_{10}\frac{\sum_{j\le i}\sigma_j}{\sum_{k\le K}\sigma_k} > -3 \,\right\} \qquad (33)$$
The shifted $v_{el}$ are then fitted to the following model, which is based on the distances $\rho(v_{el})$ between the individual $v_{el}$ and the electrode center $X$ on $Q_{cl}$:
$$\rho(v_{el}) = |v_{el} - X|, \qquad \delta\rho(v_b) = \rho(v_b) - \rho_{disc}$$
$$\epsilon = \sum \delta\rho^2(v_b) + \sum \rho^2\!\left(\{v_r \mid \rho(v_r) < \rho_{disc}\}\right) + \sum \rho^2(v_b) + \left\langle X - \bar{v}_{el},\, n_{mj} \right\rangle^2 \qquad (34)$$
$\delta\rho$ represents the relative distances of the blue points $v_b$ from the boundary of the enclosed red disk with radius $\rho_{disc}$. From all red points $v_r$ within a cluster, the model selects those that are within a radius $\rho_r < \rho_{disc}$ from the current $X$. The model in (34) is optimized with respect to $X$ using the L-BFGS-B algorithm provided by the SciPy minimize function. This numerically robust algorithm was selected because it can achieve satisfactory optimization results for least-squares problems. Its implementation details can be found in the SciPy manual and in [32,33]. For all clusters for which an $\epsilon_{min}(X)$ could be found, $X_{min}$ is stored along with $n_{mj}$. Any remaining clusters for which no appropriate $X_{min}$ could be found are not considered further.
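A minimal sketch of this fit using scipy.optimize.minimize is given below; the residual follows a simplified reading of Equation (34), and all function and variable names are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

def fit_marker_center(v_red, v_blue, v_mean, n_mj, rho_disc):
    """Fit the marker center X on the cluster plane via L-BFGS-B."""
    def residual(x):
        rho_b = np.linalg.norm(v_blue - x, axis=1)
        rho_r = np.linalg.norm(v_red - x, axis=1)
        eps = np.sum((rho_b - rho_disc) ** 2)        # blue ring at rho_disc
        eps += np.sum(rho_r[rho_r < rho_disc] ** 2)  # red points inside disk
        eps += np.dot(x - v_mean, n_mj) ** 2         # keep X on the plane
        return eps

    res = minimize(residual, x0=v_mean, method="L-BFGS-B")
    return res.x, res.fun   # X_min and epsilon_min(X)
```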
In some cases, a clip may be split into two smaller clusters, for example, if an electrode array is carelessly attached to the torso and electrode leads shadow relevant parts of the clip head. This situation is assumed when the following condition holds for the counts of $v_r$ and $v_b$:
$$\frac{1}{3} < \frac{\mathrm{count}(v_r)}{\mathrm{count}(v_b)} < 3 \qquad (35)$$
Two neighboring clusters are considered pieces of the same marker only if at least 10 of the closest neighbors of any $v_{el}$ in the first cluster are closest to at least 85 distinct $v_{el}$ in the other cluster. The model of Equation (34) is fitted to the largest piece of the marker only. This prevents nearby image artifacts in $\Xi_{uvs}$ from causing a misalignment of the affected electrode marker and from pulling the center point away from its true location.
The identified cluster centers $X$ are triangulated using the ball-pivoting method [34,35] implemented in the open3D library. The radii $\rho_1 = \bar{x}/2$ and $\rho_2 = \bar{x}$ of the two distinct balls are derived from the average distance $\bar{x} = \mathrm{mean}(|X_9 - X|)$ between each $X$ and its 9 closest neighbors. Outliers with $|X_9 - X| > \bar{x} + 2\,\sigma(|X_9 - X|)$ are removed before computing $\bar{x}$. As a final check to determine whether the $X$ of neighboring clusters resemble two pieces of the same marker, the surface connectivity between the individual $X$ is computed. Of two connected centers for which $|X_1 - X_2| < \frac{2}{3}\rho_0$ holds, the marker attached to the larger group is retained, whereas the other is removed. Ball-pivoting triangulation and the removal of small clip pieces are repeated until no more nearby groups represented by distinct $X$ are found. The remaining $X$ that are included in the resulting triangular surfaces represent the frontal and dorsal patches of the electrode grid layout proposed in [36]. Clusters that are too far away to be included in the mesh by the ball-pivoting process are considered single electrodes, similar to those used, for example, in the Einthoven leads I, II, and III.
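This triangulation step can be sketched with open3D's ball-pivoting implementation; the outlier trimming before computing $\bar{x}$ is omitted, the stored majority normals $n_{mj}$ are assumed as point normals, and all names are illustrative.

```python
import numpy as np
import open3d as o3d

def triangulate_centers(centers, normals):
    """Ball-pivoting over electrode centers with radii x_bar/2 and x_bar."""
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(centers)
    pcd.normals = o3d.utility.Vector3dVector(normals)

    # average distance of each X to its 9 nearest neighbors
    tree = o3d.geometry.KDTreeFlann(pcd)
    dists = []
    for x in centers:
        _, _, d2 = tree.search_knn_vector_3d(x, 10)  # self plus 9 neighbors
        dists.extend(np.sqrt(np.asarray(d2)[1:]))
    x_bar = float(np.mean(dists))

    radii = o3d.utility.DoubleVector([x_bar / 2.0, x_bar])
    return o3d.geometry.TriangleMesh.create_from_point_cloud_ball_pivoting(
        pcd, radii)
```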
In the final step, the triangular meshes of the frontal and dorsal electrode patches are normalized. In this process, any vertical edge that intersects the horizontal line between two common neighbors of its endpoints is swapped with the edge that connects the common neighbors.

2.3.5. Label Assignment

Starting from the point with the smallest y-coordinate, the triangulation of the frontal patch is scanned line by line. All electrodes that can be connected along consecutive horizontal edges are joined into one row of the frontal patch [36] and stored in right-to-left order. The rows are ordered from bottom to top. After all rows of the frontal patch have been collected, the same approach is applied to collect the electrodes of the dorsal patch. Again, the electrodes are stored in right-to-left and bottom-to-top order.
On the frontal patch, the number labels for each channel are assigned in ascending order from the bottom right to the top left. The dorsal assignment starts at the top right and ends at the bottom left. The remaining electrode points $X$ that have not been included in the triangulation of the frontal and dorsal patches correspond either to the three Einthoven leads I, II, and III, which are located on the arms close to the front of the left and right shoulders and on the left hip, or to the two additional electrodes of the array that are placed frontally and dorsally close to the right side of the torso.

2.3.6. Calibration

The proposed method to identify the colored electrode markers requires proper calibration of the mean values $\overline{r_r}$, $\overline{g_r}$, $\overline{r_b}$, and $\overline{g_b}$; the standard deviations $\sigma(r_r)$, $\sigma(g_r)$, $\sigma(r_b)$, and $\sigma(g_b)$; and the rotation angles $\phi_r$ and $\phi_b$ of the ellipses in Equations (28) and (29). In the first step, the color-corrected chromacity representation $\hat{\chi}_{rg}$ of the texture images $\Xi_{uvs}$, obtained as a byproduct in Section 2.3.1, is roughly segmented. The segmentation of the pixels representing blue or red parts of the clips is initialized with the following values: $\overline{r_r} = 0.75$, $\sigma(r_r) = 0.1$, $\overline{g_r} = 0.08$, $\sigma(g_r) = 0.06$, $\overline{r_b} = 0.05$, $\sigma(r_b) = 0.02$, $\overline{g_b} = 0.13$, $\sigma(g_b) = 0.06$, and $\phi_r = \phi_b = 0$.
These values were empirically identified from the chromacity space triangle of the 3D DS camera's color sensor, generated from the pixels of all $\hat{\chi}_{rg}$. The resulting raw pixel masks $M_{\hat{\chi},raw}$ are stored along with the corresponding $\hat{\chi}_{rg}$ obtained from the data sets of at least three patients. In addition, a binary mask $M_I$ selects the pixels of $\Xi_{uvs}$ that are properly exposed according to (4). For storing the $\hat{\chi}_{rg}$ on disk, the 16-bit PNG format is used. They are loaded, along with the corresponding $M_{\hat{\chi},raw}$, into an image processing program such as GIMP™ or Adobe Photoshop™ for manual segmentation of the clips.
The resulting mask $M_{\hat{\chi}}$, created by manually removing from $M_{\hat{\chi},raw}$ any pixel that does not represent a clip or electrode marker, is used in combination with $M_I$ to extract the pixels that are part of the electrode clips and markers visible on each 16-bit $\hat{\chi}_{rg}$ image. Any pixel that does not correspond to a clip, is over- or underexposed, or meets the condition in (3) is not considered further in the following calibration steps. From all other pixel values, a 2D heat map $N_H$ with 256 bins each for the red $r$ and green $g$ chromacity values is generated and median-filtered using a 7-by-7 neighborhood.
The red and blue color shades of the electrode markers appear as distinct, Gaussian-shaped peaks $P_H$ on $N_H$. These peaks (1, 2) are clearly visible as bright spots on the heat map, as shown in Figure 6. A Gaussian mixture model [37,38] is used to extract the individual clusters $C_H$ that represent each peak. Each peak is described as a 2D Gaussian distribution, characterized by its centroid and the standard deviations along each principal direction with respect to this center. By fitting the individual Gaussian models to the heat map $N_H$, the actual position, orientation, and area covered by each peak can be found. To compute the initial positions of the cluster centroids, the heat map is binarized and labeled. In this process, any 4-connected set of at least 5 bins is considered a peak if all bin counts $n_H$ conform to the following condition:
$$ n_H > \overline{n_H} + 1.9\,\sigma(n_H) \quad \text{with} \quad n_H \in \{\, n_H \mid n_H > 0 \,\} $$
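The sketch below illustrates the centroid initialization and mixture fit under this condition. SciPy's default cross-shaped structuring element provides the 4-connectivity; the expansion of binned counts into weighted samples is an assumption about how the heat map is fed to scikit-learn's `GaussianMixture`, which stands in here for the GMM of [37,38].

```python
# Sketch: threshold and label the heat map, seed a GMM with the component
# centroids, and fit one 2D Gaussian per detected peak.
import numpy as np
from scipy import ndimage
from sklearn.mixture import GaussianMixture

def fit_peaks(n_h: np.ndarray):
    # Threshold: mean + 1.9 std of the nonzero bins (condition above).
    pos = n_h[n_h > 0]
    mask = n_h > pos.mean() + 1.9 * pos.std()

    # 4-connected labeling; keep components of at least 5 bins.
    labels, n_labels = ndimage.label(mask)
    ids = [i for i in range(1, n_labels + 1) if (labels == i).sum() >= 5]
    init = np.array(ndimage.center_of_mass(n_h, labels, ids))

    # Expand bins into count-weighted samples and fit the mixture model.
    r_idx, g_idx = np.nonzero(n_h)
    samples = np.repeat(np.column_stack([r_idx, g_idx]),
                        n_h[r_idx, g_idx].astype(int), axis=0)
    gmm = GaussianMixture(n_components=len(ids), means_init=init,
                          random_state=0)
    gmm.fit(samples)
    return gmm.means_, gmm.covariances_
```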
The cluster $C_{H,r}$ with the highest mean red component $\bar{r}_r$ is used to compute $\sigma(r_r)$, $\sigma(g_r)$, and $\phi_r$. The values for $\sigma(r_b)$, $\sigma(g_b)$, and $\phi_b$ are derived from the cluster $C_{H,b}$ for which $1 - \bar{r}_b - \bar{g}_b$ is maximal.
$$ \Sigma_r, U_r = \operatorname{eig}(\operatorname{cov}(C_{H,r})), \qquad \Sigma_b, U_b = \operatorname{eig}(\operatorname{cov}(C_{H,b})) $$
$$ \sigma(r_r) = \sigma_{1,r}, \quad \sigma(g_r) = \sigma_{2,r}, \quad \phi_r = \arccos\!\left(\frac{u_{1,r} \cdot e_r}{|u_{1,r}|}\right) $$
$$ \sigma(r_b) = \sigma_{1,b}, \quad \sigma(g_b) = \sigma_{2,b}, \quad \phi_b = \arccos\!\left(\frac{u_{1,b} \cdot e_r}{|u_{1,b}|}\right) $$
Here, $\sigma_{1,r}$, $\sigma_{2,r}$, $\sigma_{1,b}$, and $\sigma_{2,b}$ represent the first and second eigenvalues $\Sigma$ of the covariance matrices $\operatorname{cov}(C_{H,r})$ and $\operatorname{cov}(C_{H,b})$ of $C_{H,r}$ and $C_{H,b}$; $u_{1,r}$ and $u_{1,b}$ are the corresponding first eigenvectors; and $e_r$ denotes the unit vector along the red chromacity axis. These values are stored on disk along with the centroids of $C_{H,r}$ and $C_{H,b}$, which define the mean values $\bar{r}_r$, $\bar{g}_r$, $\bar{r}_b$, and $\bar{g}_b$ used in the extraction step described in Section 2.3.3.
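The ellipse parameters can be read off the fitted covariance of each cluster, for example, as follows. Whether the eigenvalues or their square roots are stored as $\sigma$, and whether the angle is computed via `arctan2` (shown) or the arccos form above, are implementation choices not fixed by the text.

```python
# Sketch: per-cluster ellipse parameters from a fitted 2D Gaussian
# covariance matrix (e.g., gmm.covariances_[k] from the previous step).
import numpy as np

def ellipse_params(cov: np.ndarray):
    eigvals, eigvecs = np.linalg.eigh(cov)   # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # sigma_1 = largest component
    sigma1, sigma2 = eigvals[order]          # eigenvalues, as in the text
    u1 = eigvecs[:, order[0]]                # first eigenvector
    phi = np.arctan2(u1[1], u1[0])           # rotation w.r.t. the r axis
    return sigma1, sigma2, phi
```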
The remaining clusters 3 and 4 are not considered further, as they correspond to the color highlights on the clips (3) or are caused by inappropriately chosen parameters for the conversion of the raw sensor signals to the RGB color space (4).

2.4. Recording Protocol

The technical approach outlined in Sections 2.2 and 2.3 requires the patient to maintain the same posture throughout the recording. This is only possible if the patient is directly engaged and actively participates in the measurement.
Therefore, prior to the application of the electrodes, the patient is instructed to sit down on a chair. The height of the chair is then adjusted so that the patient can comfortably sit upright throughout the recording process. The feet of the patient should rest flat on the floor, and the knees should be bent by no more than 90 degrees. If the chair cannot be adjusted in height, an alternative solution is to stack multiple chairs to increase the patient's comfort and encourage them to straighten their back. To ensure unobstructed recordings, the chair should not have armrests or a backrest and should be placed at least 1 m from any furniture or other objects that can cast shadows. This ensures that the FOV of the 3D DS camera can be used optimally and that the operator is able to capture a surface at least every 20 degrees.
After the electrodes have been attached to the torso, the patient is instructed to place the hands on the thighs. The fingers should point inward and the thumbs should point straight toward the hips, with the hands resting about a thumb's length in front of the hips. While the electrode positions are recorded, the patient is instructed to maintain a straight and upright back. Most patients can easily maintain this position by slightly straightening their elbows (to about 120 degrees between the upper and lower arm), which helps them move their chest and shoulders into a position that is as upright as possible. The patient is thus guided into an isometric posture that can easily be maintained while the electrode positions are recorded. In addition, this position facilitates the recording of electrodes placed under the left axilla, for example, the Wilson electrodes $V_5$ and $V_6$.

3. Results

The narrow vertical field of view of the color sensor is one of the main reasons why the 3D images of the torso are recorded in portrait mode. In a typical clinical setting, where space is limited, the patient is likely to be seated close to furniture or walls. For proper recording of the 3D images, a space of at least 2.5 m by 2.5 m is required. This accommodates a standard chair without armrests or a backrest, with a diameter of 50 cm, and provides at least 1 m of space on all four sides of the patient for the operator to move around while recording the images. The remaining space between the patient, the operator, and any surrounding furniture or walls may be 50 cm or less. Both sensors of the camera must be able to properly capture the dorsal part of the patient's torso at distances between 20 cm and 50 cm. This can only be achieved by cameras with FOV angles conforming to (1), such as the Intel RealSense cameras, which provide wide viewing angles of ≈70 degrees for both the depth and color sensors when used in portrait recording mode. This is especially important for capturing the dorsal views of the torso.
The color sensor has a 16:9 ratio between the horizontal and vertical FOVs. This results in a vertical viewing angle of about 40 degrees, which is considerably smaller than the ≈60 degrees of the depth sensor. As a result, ≈60 columns at the top and bottom of the depth image may lack texture information. However, this is acceptable given that consecutive 3D images are recorded in portrait mode with an overlap of about two-thirds, ensuring that the texture images sufficiently overlap.
Because the patient's torso is oriented vertically, it is easy in portrait mode to keep the patient centered in the image while moving the camera to the next recording position. As the torso covers most of the image space, only very few objects and obstacles located behind the patient are captured by the cameras, and these can easily be removed before storing the 3D surface images.
Scanning always starts at the right frontal view of the torso and ends at the right dorsal side. The right lateral side of the torso is not essential for extracting the electrode positions and can be omitted in standard recording procedures; it is recommended to record it explicitly only when there is sufficient space to the right of the patient.
The preview image of the torso, displayed in the main area (1) of the user interface (Figure 7), is split into a 3-by-3 grid. The center segment of this grid serves as the focus area, representing the central part of the patient's torso. The contours of the largest object containing the focus segment are highlighted in orange; as the camera points at the patient, these contours trace the boundary of the patient's torso. The recording of a torso surface segment is initiated by pressing the trigger of the camera. The color of the contour line then switches to green and the live preview freezes to indicate that the captured depth and color images are being processed and the 3D surface is being generated and stored. Once the underlying point cloud has been triangulated, occluded and degenerate triangles, as well as detached surface patches, are removed. The contour is then updated to mark the parts that will be stored on disk. After the 3D surface information, the corresponding texture image, and the meta information have been stored, the live preview resumes and the color of the contour reverts to orange. The live preview is updated at a maximum rate of 10 FPS; with the Python-based prototype, update rates between ≈4 FPS and ≈7 FPS are realistically achieved.
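A hedged OpenCV sketch of the contour highlighting is shown below. The `valid_depth` mask, the use of the image center as the focus test point, and the function name are simplifying assumptions rather than the prototype's actual implementation.

```python
# Sketch: find the largest foreground object overlapping the focus segment
# and draw its outline in orange (idle) or green (while storing a capture).
import cv2
import numpy as np

def highlight_torso(preview_bgr: np.ndarray, valid_depth: np.ndarray,
                    recording: bool = False) -> np.ndarray:
    h, w = valid_depth.shape
    cy, cx = h // 2, w // 2  # center of the 3-by-3 focus grid

    contours, _ = cv2.findContours(valid_depth.astype(np.uint8),
                                   cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    # Keep contours whose interior contains the focus point.
    hits = [c for c in contours
            if cv2.pointPolygonTest(c, (float(cx), float(cy)), False) >= 0]
    if hits:
        torso = max(hits, key=cv2.contourArea)  # largest candidate object
        color = (0, 255, 0) if recording else (0, 165, 255)  # BGR
        cv2.drawContours(preview_bgr, [torso], -1, color, 2)
    return preview_bgr
```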
The main preview area (panel 1 in Figure 7) has the same shape as the depth image. For the parts on the left and right sides that are not captured by the RGB image, the edges identified on the depth image are displayed instead. The outline of the patient’s torso does not extend beyond the edges of the RGB image. In panel 2 of the preview screen (Figure 7), several recording and camera parameters, such as the frame rate in FPS, exposure time τ in ms, etc., are shown, along with the intermediate parameters computed for automatic-exposure control and color correction. In panel 3, the full set of edges identified on the current depth image is displayed. The two vertical lines delineate the area of the depth image that is covered by the color image.
The prototype for real-time recording of the 3D torso surface patches, as well as for postprocessing and calibration, was implemented in Python 3 using recent versions of NumPy and SciPy [39]. The librealsense version 2 library [40] was used to control the acquisition, convert the depth values into a point cloud, and compute the corresponding texture uv map for the RGB image. The OpenCV library [41] was used to generate the preview display, and the generation and cleaning of the 3D meshes were accomplished using the open3D library [27]. The most computationally demanding components, namely the depth-edge detection (Section 2.2.3), automatic white balancing (Section 2.2.1), and patient-locked auto-exposure control (Section 2.2.2), were converted into Python-C modules using Cython [42].
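For orientation, a minimal acquisition sketch using librealsense's Python bindings (pyrealsense2) is shown below. The chosen stream resolutions and frame rate are assumptions for an SR300-class camera, not the prototype's actual configuration.

```python
# Sketch: grab one depth/color frame pair, compute the point cloud, and
# obtain the per-vertex texture coordinates for the RGB image.
import numpy as np
import pyrealsense2 as rs

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 1920, 1080, rs.format.rgb8, 30)
pipeline.start(config)

try:
    frames = pipeline.wait_for_frames()
    depth, color = frames.get_depth_frame(), frames.get_color_frame()

    pc = rs.pointcloud()
    pc.map_to(color)              # texture-map the points onto the RGB image
    points = pc.calculate(depth)  # depth image -> 3D point cloud

    # Vertices (x, y, z) in meters and texture coordinates (u, v) per vertex.
    verts = np.asanyarray(points.get_vertices()
                          ).view(np.float32).reshape(-1, 3)
    uvs = np.asanyarray(points.get_texture_coordinates()
                        ).view(np.float32).reshape(-1, 2)
finally:
    pipeline.stop()
```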
In total, five male subjects between 38 and 70 years of age participated in the present study. Each subject was seated on a chair or examination bed, depending on the available space. After applying the ECG electrodes to the chest and back, the subjects were instructed to maintain the posture described in Section 2.4. The measurement of the torso surface and the recording of a 30-min-long ECG with 67 channels took about 30 min. to 45 min. After each measurement, the data were analyzed and the prototype was improved accordingly.
The data set recorded from the first subject is not included in the presented results, as it was of limited quality: the automatic white balancing and exposure control of the color sensor could not cope with the diverse and complex lighting conditions. Further, the 3D points recorded by the depth sensor were directly transformed to match the color image captured by the color sensor. This posed several challenges, as occluded surface parts caused undesirable distortions and introduced noncausal surfaces. Starting with the second subject, the direct mapping was replaced with the texture mapping approach, which yielded better results and allowed for the implementation of the algorithms for occlusion management and the removal of noncausal triangles described in Section 2.2.4.
For each patient, 12 to 15 views were recorded, each containing a 3D surface of ≈170,000 vertices and ≈300,000 triangles. As shown in Table 1, between 7 and 21 iterations of the symmetric ICP algorithm were necessary to align the surfaces. The maximum correspondence distance between the points of the surface pairs was reduced in every iteration step, starting from 7 cm–12 cm and reaching 0.7 mm–1.2 mm. More iterations were necessary to align the surfaces joining the frontal and dorsal views on the left side of the torso, and the number increased further when the available space around the subject was insufficient. In the most challenging scenario, proper alignment of the affected surfaces was not possible at all. This situation was encountered in the data set recorded from subject 5, where part of the torso surface on the left side was obscured by the backrest of the chair. Among other challenges, this raised the number of iterations required to align the leftmost frontal and dorsal views to 21.
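The coarse-to-fine schedule can be sketched with Open3D's built-in registration as follows. Note that Open3D natively provides the asymmetric point-to-plane estimator, which here stands in for the symmetric objective [26] used in the prototype; the schedule parameters are illustrative only.

```python
# Sketch: pairwise ICP with a max. correspondence distance that shrinks
# geometrically from centimeters down to below one millimeter per step.
import numpy as np
import open3d as o3d

def align_pair(source, target, d_init=0.10, d_final=0.001, steps=15):
    # Both clouds need normals for point-to-plane ICP, e.g.,
    # target.estimate_normals() must have been called beforehand.
    trans = np.eye(4)
    for d in np.geomspace(d_init, d_final, steps):
        result = o3d.pipelines.registration.registration_icp(
            source, target, max_correspondence_distance=d, init=trans,
            estimation_method=o3d.pipelines.registration
                .TransformationEstimationPointToPlane(),
            criteria=o3d.pipelines.registration
                .ICPConvergenceCriteria(max_iteration=1),
        )
        trans = result.transformation
    return trans, result.inlier_rmse
```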
Across all subjects, a final root mean square error between consecutive surfaces of 0.7 mm was achieved. Using the proposed approach, 12 to 15 surfaces per patient were registered within 13 min. As shown in Table 2, the extraction of the electrode marker points and the computation and labeling of the electrode positions were completed after another ≈8 min.
The recording sessions were part of a larger clinical pilot study investigating the prognostic value of index arrhythmias with respect to the outcome of pulmonary vein ablation, for which the participants provided informed consent. Apart from the 3D camera and ECG recordings, this study was based solely on clinical data recorded during the patients' treatment. Therefore, no CT recordings or other independent means of recording the electrode positions relative to the torso were available. To assess the accuracy of electrode localization, the computed electrode positions were backprojected onto the individual views of the torso and marked on the corresponding color images. Examples are shown in Figure 3, Figure 4b,c and Figure 8.
The annotated RGB images were presented to an expert who used the cross-hair tool shown in Figure 8b to manually adjust the position of each marker. To facilitate this task, two markers were displayed: a green marker indicating the backprojected position computed by the system and a red marker corresponding to the manually adjusted position. All positions were checked during this process and, where necessary, moved to better reflect the perceived center position on each view. When finished, all positions were reprojected into 3D space.
For the set of corrected positions of each electrode, the mean point, as well as the mean distance and standard deviation with respect to this mean point, were computed. The resulting values are shown in Table 3, along with the mean and standard deviation of the computed electrode positions with respect to the manually determined mean. Both sets of results were influenced by the accuracy of the registration process and by the fact that no unique solution exists for the backprojection of the electrode positions onto the individual views. In addition, the mean and standard deviation of the registration errors, as well as of the distances between the individual backprojections and their mean point, are listed.
The corrected electrode positions deviated, on average, by 2.3 mm ± 1.4 mm from the mean point, and the computed electrode positions deviated from the mean point by 2.0 mm ± 1.5 mm. This is in accordance with the limitations posed by the backprojection, where the reprojected points deviated from the computed position by 0.9 mm ± 1.4 mm, and with the ICP registration, which resulted in an average deviation between corresponding points of 0.6 mm ± 0.2 mm. Given the amount of data to be processed per subject, the overall time of 22 min. required to extract and align the electrode positions is quite short, considering that only the computations of the asymmetric ICP and the HDBSCAN algorithms are implemented as part of the native open3D library and as Cython scripts, respectively; the rest of the implementation was carried out in Python using NumPy arrays only. In contrast, the expert required between 30 min. and 45 min. to point to and place the electrode markers on the 14 views of a single data set.

4. Discussion

The results are promising given that the torso is a far less rigid structure than the skull. Furthermore, the limited space and adverse environmental conditions typically found in clinical settings, e.g., outpatient and local practitioner clinics, are quite challenging. This is evident in the results shown in Table 3 for subjects 4 and 5. In both cases, nearby obstacles such as backrests or furniture limited access to the patient's left side, resulting in increased positional variations of 2.2 mm ± 1.5 mm and 2.4 mm ± 1.8 mm relative to the mean of the manually defined electrode positions, compared with 1.7 mm ± 1.4 mm and 1.6 mm ± 1.5 mm for subjects 2 and 3, respectively.
These values are still in the range reported for recently proposed approaches for localizing electrodes mounted on the human body. As shown in Table 4, few studies exist that evaluate the use of 3D DS cameras [19,20] and photogrammetry methods [18] for localizing ECG electrodes on the torso. The achieved results varied between 1.16 mm and 11.8 mm, depending on the metrics and positional references used. The authors of [20] used the Hausdorff metric to compare the positions obtained from a Microsoft Kinect 3D DS camera to positions found on MRI or CT scans. On average, they achieved a positional error of 11.8 mm, which is an order of magnitude larger than the error between 1.16 mm and 2.5 mm achieved by Schulze et al. [18], Alioui et al. [19] and the present study, all of which used the Euclidean metric instead.
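The following small example illustrates why the two metrics are not directly comparable: the Hausdorff distance reports the worst-case mismatch between two point sets, whereas the mean Euclidean distance averages over all paired electrodes. The point sets and jitter magnitude are synthetic.

```python
# Sketch: mean Euclidean vs. Hausdorff distance for two corresponding
# electrode sets A and B (hypothetical (N, 3) arrays, units in meters).
import numpy as np
from scipy.spatial.distance import directed_hausdorff

rng = np.random.default_rng(0)
A = rng.normal(size=(67, 3))
B = A + rng.normal(scale=0.002, size=A.shape)  # ~2 mm per-axis jitter

mean_euclidean = np.linalg.norm(A - B, axis=1).mean()
hausdorff = max(directed_hausdorff(A, B)[0], directed_hausdorff(B, A)[0])
# The Hausdorff value reflects the single worst point, so it is typically
# considerably larger than the mean Euclidean error.
print(f"mean Euclidean: {mean_euclidean * 1e3:.2f} mm, "
      f"Hausdorff: {hausdorff * 1e3:.2f} mm")
```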
Most other studies proposed methods for the localization of EEG sensors mounted on the scalp. Apart from Homölle and Oostenveld [8], the achieved average positional errors ranged from 1.5 mm [12] to 3.26 mm [14], using various reference measurements, including the mean of manually placed marks [12,14] and positional references generated using a magnetic digitizer [8,13,16] such as the Polhemus Fastrak. Comparing the positional error of 9.4 mm achieved by Homölle and Oostenveld [8] to all other results suggests that it was mainly caused by unavoidable inaccuracies in taking the magnetic digitizer measurements.
Considering that the positions of ECG electrodes mounted on the torso are directly affected by any movement, the positional error of 2.0 mm achieved in the present study clearly indicates that the active engagement and participation of the patient in the measurement are essential. The instructions on how the patient can easily maintain a posture that facilitates the recording of the electrode positions have a major impact on the outcome of the measurements. If the instructions are not clearly defined by the measurement protocol, or not properly understood or followed by the patient, the positional error increases. For example, subject 4 (see Table 3) twice changed the position of his arms during the measurement, which immediately resulted in an increased positional error of 2.2 mm ± 1.5 mm.
In addition to limited space, the lighting conditions encountered in clinical environments, as well as tight schedules, have a direct impact on the average positional error. Varying lighting conditions, including multiple light sources with differing color temperatures, can negatively affect photogrammetric approaches and 3D DS camera-based measurements of the torso surface and the electrode positions thereon. Algorithms for automatic white balancing and exposure control were therefore adopted to improve color constancy across multiple 3D views of the torso and to maintain a constant exposure of the torso independent of the viewing direction and angle. In combination with the developed calibration method, this increased the accuracy of identifying the pixels representing the color markers.
Time, in particular, is a scarce resource, which largely restricts the routine use of magnetic digitizers within clinical environments. Precise positional measurement requires the exact placement of the magnetic probe on each electrode and the manual triggering of each measurement. An experienced user needs about 15 min. to accomplish this task. Any attempt to reduce this time comes at the cost of less accurate placement of the probe on each electrode, which can result in positional errors of 7.8 mm and higher, as encountered by Clausner et al. [43].
In general, keeping the required human interactions and the number of related errors as low as possible is a key goal for establishing NICE-based tools and procedures in clinical environments. The time required to localize the electrode positions on the human torso, as well as the amount of ionizing radiation the patient is exposed to, are key factors that can either prevent or facilitate successful uptake. Alternative approaches currently used to obtain the electrode positions include manually placing markers on, or automatically segmenting, CT and MRI scans [9,12,19] and pointing a magnetic digitizer probe to each individual electrode [8,13,16]. Manually marking each electrode requires a significant amount of time (about 45 min.), even more than the 15 min. required for magnetic probe-based measurements. Both approaches suffer from an additional bias related to the individual human perception of the electrode and marker shapes, as well as inaccuracies in the way the pointing probes are placed onto the electrodes.
In contrast, the proposed 3D DS camera-based approach is not affected by these kinds of errors. When implemented on a tablet computer, the presented approach will enable clinicians to acquire the electrode positions and torso surfaces within 10 min., with average positional errors of less than 2.5 mm, even under limited spatial conditions and tight schedules.
Some aspects essential for the successful clinical uptake of the presented approach still have to be addressed. On all color sensors, the raw signals recorded for the red, green, and blue channels have to be converted into the RGB color space before they can be used. If the required parameters are not properly calibrated, the resulting images may show a bluish hue that cannot be corrected by any white-balancing algorithm. This was the case for subject 5, shown in Figure 3, and caused the additional peak (4) in the calibration heat map shown in Figure 6. In preparation for future studies, an appropriate procedure must be established for verifying and optimizing these parameters before the first measurement and at regular intervals thereafter.
Each 3D DS camera data set also provides a point cloud representation of the torso surface. In ongoing studies, this representation is used to build electroanatomical models for noninvasive electrocardiographic imaging, which until now have been derived from clinical cardiac CT slices only. Further applications of the proposed approach, such as enhanced electrical impedance tomography, are currently being investigated.

5. Conclusions

In the presented work, a complete 3D DS camera-based system was developed for localizing 67 ECG electrodes identified by color markers. Issues such as varying lighting conditions, including multiple light sources with different color temperatures, and the alignment of the individual 3D views were addressed. The implemented recording protocol provides precise rules on how to seat the patient and includes well-defined instructions enabling the patient to easily maintain a specific isometric posture while all views are recorded. The resulting active engagement and participation of the patient in the measurement helped to minimize positional errors caused by patient movement. In combination with the implemented symmetric ICP algorithm, average positional errors of 2.3 mm or less were achieved for each measurement.
The implemented prototype system localizes the electrodes on the torso with minimal human interaction. It can handle diverse lighting conditions and operate in the narrow spaces encountered in clinical settings such as outpatient or local practitioner clinics.

Author Contributions

Conceptualization, J.B. and C.H.; methodology, J.B.; software, J.B. and C.H.; validation, J.B., C.H. and H.B.; formal analysis, H.B. and C.S.; data curation, C.H. and H.B.; writing—original draft preparation, C.H.; writing—review and editing, J.B., C.H., H.B. and C.S.; visualization, C.H.; supervision, C.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Linz Center of Mechatronics.

Institutional Review Board Statement

This study was conducted in accordance with the Declaration of Helsinki, and approved by the Ethics Committee of the Medical Faculty of Johannes Kepler University Linz (protocol code 1190/2020 and date of 16 September 2020).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data and software are made available in the data and software repositories of Johannes Kepler University. Until full public access to these repositories is established, the data and software are provided on request by the Institute for Biomedical Mechatronics at Johannes Kepler University, mmt@jku.at.

Acknowledgments

The authors would like to thank the patients who participated in the PAIP study, as well as the team in the Department of Cardiology at the Kepler University Hospital for providing the required resources and for their support. Further, the authors would like to thank Werner Baumgartner and colleagues from the Biomedical Mechatronics Institute at Johannes Kepler University for their support during the development of the prototype and the preparation of this manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
EEG	Electroencephalogram
ECG	Electrocardiogram
EIT	Electrical Impedance Tomography
ECT	Electrical Capacitance Tomography
MRI	Magnetic Resonance Imaging
CT	Computed Tomography
3D DS camera	3D Depth-Sensing camera
RGB	Red, Green, Blue
AWB	Automatic White Balancing
FOV	Field of View
AE	Auto-Exposure
ICP	Iterative Closest Point

References

  1. Ogawa, R.; Baidillah, M.R.; Darma, P.N.; Kawashima, D.; Akita, S.; Takei, M. Multifrequency Electrical Impedance Tomography With Ratiometric Preprocessing for Imaging Human Body Compartments. IEEE Trans. Instrum. Meas. 2022, 71, 1–14. [Google Scholar] [CrossRef]
  2. Lymperopoulos, G.; Lymperopoulos, P.; Alikari, V.; Dafogianni, C.; Zyga, S.; Margari, N. Applications for Electrical Impedance Tomography (EIT) and Electrical Properties of the Human Body. Adv. Exp. Med. Biol. 2017, 989, 109–117. [Google Scholar] [PubMed]
  3. Gupta, S.; Lee, H.J.; Loh, K.J.; Todd, M.D.; Reed, J.; Barnett, A.D. Noncontact Strain Monitoring of Osseointegrated Prostheses. Sensors 2018, 18, 3015. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  4. Rudy, Y.; Lindsay, B.D. Electrocardiographic Imaging of Heart Rhythm Disorders: From Bench to Bedside. Card. Electrophysiol. Clin. 2015, 7, 17–35. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  5. Zhang, J.; Sacher, F.; Hoffmayer, K.; O’Hara, T.; Strom, M.; Cuculich, P.; Silva, J.; Cooper, D.; Faddis, M.; Hocini, M.; et al. Cardiac Electrophysiological Substrate Underlying the ECG Phenotype and Electrogram Abnormalities in Brugada Syndrome Patients. Circulation 2015, 131, 1950–1959. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  6. Seger, M.; Hanser, F.; Dichtl, W.; Stuehlinger, M.; Hintringer, F.; Trieb, T.; Pfeifer, B.; Berger, T. Non-invasive imaging of cardiac electrophysiology in a cardiac resynchronization therapy defibrillator patient with a quadripolar left ventricular lead. Europace 2014, 16, 743–749. [Google Scholar] [CrossRef]
  7. Ettl, S.; Rampp, S.; Fouladi-Movahed, S.; Dalal, S.S.; Willomitzer, F.; Arold, O.; Stefan, H.; Häusler, G. Improved EEG source localization employing 3D sensing by “Flying Triangulation”. In Proceedings of the Optical Metrology, Munich, Germany, 13–15 May 2013. [Google Scholar]
  8. Homölle, S.; Oostenveld, R. Using a structured-light 3D scanner to improve EEG source modeling with more accurate electrode positions. J. Neurosci. Methods 2019, 326, 108378. [Google Scholar] [CrossRef]
  9. Butler, R.; Gilbert, G.; Descoteaux, M.; Bernier, P.M.; Whittingstall, K. Application of polymer sensitive MRI sequence to localization of EEG electrodes. J. Neurosci. Methods 2017, 278, 36–45. [Google Scholar] [CrossRef]
  10. Marino, M.; Liu, Q.; Brem, S.; Wenderoth, N.; Mantini, D. Automated detection and labeling of high-density EEG electrodes from structural MR images. J. Neural Eng. 2016, 13, 056003. [Google Scholar] [CrossRef]
  11. Ma, Y.; Mistry, U.; Thorpe, A.; Housden, R.J.; Chen, Z.; Schulze, W.H.W.; Rinaldi, C.A.; Razavi, R.; Rhode, K.S. Automatic Electrode and CT/MR Image Co-localisation for Electrocardiographic Imaging. In Proceedings of the Functional Imaging and Modeling of the Heart; Ourselin, S., Rueckert, D., Smith, N., Eds.; Springer: Berlin/Heidelberg, Germany, 2013; pp. 268–275. [Google Scholar]
  12. Dalal, S.S.; Rampp, S.; Willomitzer, F.; Ettl, S. Consequences of EEG electrode position error on ultimate beamformer source reconstruction performance. Front. Neurosci. 2014, 8, 42. [Google Scholar] [CrossRef] [Green Version]
  13. Cline, C.C.; Coogan, C.; He, B. EEG electrode digitization with commercial virtual reality hardware. PLoS ONE 2018, 13, e0207516. [Google Scholar] [CrossRef]
  14. Chen, S.; He, Y.; Qiu, H.; Yan, X.; Zhao, M. Spatial Localization of EEG Electrodes in a TOF+CCD Camera System. Front. Neuroinform. 2019, 13, 21. [Google Scholar] [CrossRef]
  15. Ghanem, R.N.; Ramanathan, C.; Jia, P.; Rudy, Y. Heart-surface reconstruction and ECG electrodes localization using fluoroscopy, epipolar geometry and stereovision: Application to noninvasive imaging of cardiac electrical activity. IEEE Trans. Med. Imaging 2003, 22, 1307–1318. [Google Scholar] [CrossRef] [Green Version]
  16. Kössler, L.; Maillard, L.; Benhadid, A.; Vignal, J.P.; Braun, M.; Vespignani, H. Spatial localization of EEG electrodes. Neurophysiol. Clin. 2007, 37, 97–102. [Google Scholar] [CrossRef]
  17. He, Y.; Qiu, H.; Gu, Y.; Chen, S. EEG Electrode Localization based on Joint ToF and CCD Camera Group. In Proceedings of the 2019 IEEE 5th International Conference on Computer and Communications (ICCC), Chengdu, China, 6–9 December 2019; pp. 1771–1776. [Google Scholar] [CrossRef]
  18. Schulze, W.H.W.; Mackens, P.; Potyagaylo, D.; Rhode, K.; Tülümen, E.; Schimpf, R.; Papavassiliu, T.; Borggrefe, M.; Dössel, O. Automatic camera-based identification and 3-D reconstruction of electrode positions in electrocardiographic imaging. Biomed. Tech. Biomed. Eng. 2014, 59, 515–528. [Google Scholar] [CrossRef]
  19. Alioui, S.; Kastelein, M.; van Dam, E.M.; van Dam, P.M. Automatic registration of 3D camera recording to model for leads localization. In Proceedings of the 2017 Computing in Cardiology (CinC), Rennes, France, 24–27 September 2017; pp. 1–4. [Google Scholar] [CrossRef]
  20. Perez-Alday, E.A.; Thomas, J.A.; Kabir, M.; Sedaghat, G.; Rogovoy, N.; van Dam, E.; van Dam, P.; Woodward, W.; Fuss, C.; Ferencik, M.; et al. Torso geometry reconstruction and body surface electrode localization using three-dimensional photography. J. Electrocardiol. 2018, 51, 60–67. [Google Scholar] [CrossRef]
  21. Intel RealSenseTM Depth Camera SR300 Series Product Family. 2021. Available online: https://www.intelrealsense.com/wp-content/uploads/2019/07/RealSense_SR30x_Product_Datasheet_Rev_002.pdf (accessed on 24 April 2023).
  22. Zabatani, A.; Surazhsky, V.; Sperling, E.; Moshe, S.B.; Menashe, O.; Silver, D.H.; Karni, Z.; Bronstein, A.M.; Bronstein, M.M.; Kimmel, R. Intel RealSenseTM SR300 Coded Light Depth Camera. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 2333–2345. [Google Scholar] [CrossRef]
  23. Cohen, N. A Color Balancing Algorithm for Cameras. EE368 Digital Image Processing. 2011. Available online: https://stacks.stanford.edu/file/druid:my512gb2187/Cohen_A_New_Color_Balancing_Method.pdf (accessed on 24 April 2023).
  24. Chen, W.; Li, X. Exposure Evaluation Method Based on Histogram Statistics. In Proceedings of the 2017 2nd International Conference on Electrical, Automation and Mechanical Engineering (EAME 2017), Shanghai, China, 23–24 April 2017; Atlantis Press: Amsterdam, The Netherlands, 2017; pp. 290–293. [Google Scholar] [CrossRef] [Green Version]
  25. Wang, B.; Fan, S. An Improved CANNY Edge Detection Algorithm. In Proceedings of the 2009 Second International Workshop on Computer Science and Engineering, Tianjin, China, 3–6 August 2014; Volume 1, pp. 497–500. [Google Scholar] [CrossRef]
  26. Rusinkiewicz, S. A Symmetric Objective Function for ICP. ACM Trans. Graph. 2019, 38. [Google Scholar] [CrossRef]
  27. Zhou, Q.Y.; Park, J.; Koltun, V. Open3D: A Modern Library for 3D Data Processing. arXiv 2018, arXiv:1801.09847. [Google Scholar]
  28. Blanco, J.L.; Rai, P.K. Nanoflann: A C++ Header-Only Fork of FLANN, a Library for Nearest Neighbor (NN) with KD-Trees. 2014. Available online: https://github.com/jlblancoc/nanoflann (accessed on 24 April 2023).
  29. Muja, M.; Lowe, D.G. Fast Approximate Nearest Neighbors with Automatic Algorithm Configuration. In Proceedings of the International Conference on Computer Vision Theory and Applications, Lisboa, Portugal, 5–8 February 2009. [Google Scholar]
  30. McInnes, L.; Healy, J.; Astels, S. hdbscan: Hierarchical density based clustering. J. Open Source Softw. 2017, 2, 205. [Google Scholar] [CrossRef]
  31. Ester, M.; Kriegel, H.P.; Sander, J.; Xu, X. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD’96), Portland, OR, USA, 2–4 August 1996. [Google Scholar]
  32. Byrd, R.H.; Lu, P.; Nocedal, J.; Zhu, C. A Limited Memory Algorithm for Bound Constrained Optimization. SIAM J. Sci. Comput. 1995, 16, 1190–1208. [Google Scholar] [CrossRef]
  33. Zhu, C.; Byrd, R.H.; Lu, P.; Nocedal, J. Algorithm 778: L-BFGS-B: Fortran Subroutines for Large-Scale Bound-Constrained Optimization. ACM Trans. Math. Softw. 1997, 23, 550–560. [Google Scholar] [CrossRef]
  34. Bernardini, F.; Mittleman, J.; Rushmeier, H.; Silva, C.; Taubin, G. The ball-pivoting algorithm for surface reconstruction. IEEE Trans. Vis. Comput. Graph. 1999, 5, 349–359. [Google Scholar] [CrossRef]
  35. Digne, J. An Analysis and Implementation of a Parallel Ball Pivoting Algorithm. Image Process. Line 2014, 4, 149–168. [Google Scholar] [CrossRef] [Green Version]
  36. Hintermüller, C.; Seger, M.; Pfeifer, B.; Fischer, G.; Modre, R.; Tilg, B. Sensitivity and Effort-Gain Analysis: Multi-Lead ECG Electrode Array Selection for Activation Time Imaging. IEEE Trans. Biomed. Eng. 2006, 53, 2055–2066. [Google Scholar] [CrossRef]
  37. Kambhatla, N.; Leen, T. Classifying with Gaussian Mixtures and Clusters. In Proceedings of the Advances in Neural Information Processing Systems; Tesauro, G., Touretzky, D., Leen, T., Eds.; MIT Press: Cambridge, MA, USA, 1994; Volume 7. [Google Scholar]
  38. Reynolds, D. Gaussian Mixture Models. In Encyclopedia of Biometrics; Li, S.Z., Jain, A.K., Eds.; Springer: Boston, MA, USA, 2015; pp. 827–832. [Google Scholar] [CrossRef]
  39. Virtanen, P.; Gommers, R.; Oliphant, T.E.; Haberland, M.; Reddy, T.; Cournapeau, D.; Burovski, E.; Peterson, P.; Weckesser, W.; Bright, J.; et al. SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python. Nat. Methods 2020, 17, 261–272. [Google Scholar] [CrossRef] [Green Version]
  40. Intel RealSenseTM Cross Platform API. 2022. Available online: https://intelrealsense.github.io/librealsense/doxygen/ (accessed on 1 April 2022).
  41. Itseez. Open Source Computer Vision Library. 2015. Available online: https://github.com/itseez/opencv (accessed on 24 April 2023).
  42. Behnel, S.; Bradshaw, R.; Citro, C.; Dalcin, L.; Seljebotn, D.S.; Smith, K. Cython: The Best of Both Worlds. Comput. Sci. Eng. 2011, 13, 31–39. [Google Scholar] [CrossRef]
  43. Clausner, T.; Dalal, S.S.; Crespo-García, M. Photogrammetry-Based Head Digitization for Rapid and Accurate Localization of EEG Electrodes and MEG Fiducial Markers Using a Single Digital SLR Camera. Front. Neurosci. 2017, 11, 264. [Google Scholar] [CrossRef] [Green Version]
Figure 1. Schematic presentation of the overall approach used to extract ECG electrode positions from 3D depth-sensing camera data. The first step involves methods to control the camera’s white balance and exposure settings and generate textured 3D surface meshes from the recorded depth data. During the offline processing step, these surfaces are aligned to extract the electrode positions within clusters of marker vertices found using texture images in the RGB chromacity color space and 3D surfaces.
Figure 2. The elliptic area in the RGB chromacity space corresponds to the pixels encoding shades of gray [23]. The red r and green g chromacity values of the natural illumination color gamut of 5500 K define a point that is shifted slightly off the mean RGB chromacity toward yellowish colors. Standard 3D DS cameras designed for indoor use such as the Intel RealsenseTM typically allow adjusting the gains for the red and blue channels to illumination color gamuts between, for example, 2800 K and 6500 K, as indicated on the color gamut curve. However, in real clinical settings, gamuts from 2000 K up to 10,000 K can be expected, depending on the number of light sources and shades cast by objects and people.
Figure 3. The histogram-based auto-exposure algorithm considers only the pixels that most likely correspond to the patient and ignores any others. This ensures that the brightness of the patient’s skin remains as constant as possible, regardless of whether the camera points toward a window (a) or the darkest corner of the room (b). Each visible electrode is labeled with the corresponding channel number and the ‘+’ markers indicate the projected location of the computed electrode position.
Figure 4. Examples for texture images recorded from the front (a) and back (c) of the torso, and the corresponding color-corrected versions (b,d). In (b,d), each visible electrode is labeled with the corresponding channel number. The ‘+’ markers indicate the projected location of the computed electrode position.
Figure 5. The electrodes mounted on the patient’s torso (a) are attached to the red electrode clips. The blue boundary of each clip head (b) forms a circular marker with the red electrode clip. Based on the red and blue colors, each marker can be recognized from the recorded texture images along with 3D surface information.
Figure 6. Chromacity heat map of the pixels representing the electrode markers created from the texture images of three patients. The brighter the color, the higher the pixel count for the corresponding point in the RGB chromacity space, represented by its red r and green g chromacity values. For better readability, the RGB chromacity values are displayed in RGB gamma-compressed form. The Gaussian peaks (dash-dotted ellipses) representing the red (1) and blue (2) pixels of the electrode markers are clearly visible. They can easily be distinguished from the peak (3) representing the color highlights and reflections. Peak (4) is caused by inappropriately chosen values for the parameters required to convert raw color sensor data to the RGB color space.
Figure 7. The preview screen is divided into three panels. The main panel (1) displays the image recorded by the color sensor of the 3D DS camera. The parts of the 3D image for which no color information could be captured are replaced with the edges extracted from the depth image shown in panel 3. The current values of the color temperature, exposure time, frames per second, and other process parameters are displayed in panel 2. The two vertical lines indicate the area where the views of both cameras overlap.
Figure 8. Manual evaluation of the proposed approach for locating the electrode positions on a patient’s torso. The electrode positions are backprojected onto each recorded torso surface segment (a). An electrode position can be moved by clicking on the corresponding green cross-shaped graphical marker displayed on the texture image. Its new position is selected by pointing and clicking on it (b). In case the position pointed to is not backed by a valid surface triangle, the new point (red cross) is moved to the closest possible position. By right-clicking on an electrode marker, it can be disabled and/or enabled on the presented view. Any disabled markers are not considered suitable for further evaluation.
Table 1. The symmetric ICP alignment metrics obtained from the data sets of four out of five subjects participating in the study. For an average of 14 angular views, the computation of the pairwise transformation matrix was repeated between 7 and 21 times, with an average repetition rate of 9.7 repeats per surface pair. The average initial distance between corresponding points was 7 cm, the average root mean square error was 0.7 mm, and the average final correspondence distance was 1 mm.
| Subj. | # Views | Repeats per Pair (Min/Mean/Max) | Initial Corresp. Distance (mm) (Min/Mean/Max) | Final Corresp. Distance (mm) (Min/Mean/Max) | Final RMSE (mm) (Min/Mean/Max) |
|---|---|---|---|---|---|
| 2 | 12 | 9 / 10.1 / 12 | 37.4 / 71.6 / 124.9 | 0.7 / 0.9 / 1.2 | 0.5 / 0.6 / 0.8 |
| 3 | 14 | 7 / 9.4 / 17 | 51.4 / 76.1 / 105.0 | 0.9 / 1.0 / 1.1 | 0.6 / 0.7 / 0.7 |
| 4 | 14 | 8 / 9.4 / 17 | 42.9 / 61.2 / 105.2 | 0.9 / 1.0 / 1.1 | 0.6 / 0.7 / 0.8 |
| 5 | 15 | 8 / 9.9 / 21 | 50.0 / 71.3 / 117.2 | 0.9 / 1.0 / 1.1 | 0.6 / 0.7 / 0.7 |
| Mean | 14 | 7 / 9.7 / 21 | 37.4 / 70.0 / 124.9 | 0.7 / 1.0 / 1.2 | 0.5 / 0.7 / 0.8 |
Table 2. Performance measures obtained from four out of five subjects participating in the study. The positions of the electrodes mounted on a patient's torso were extracted from 14 recorded angular views, on average, within ≈21.8 min. The symmetric ICP-based pairwise alignment of the corresponding surfaces of about 139,527 vertices and 254,351 triangles took approximately 2/3 of this time.
| Subj. | # Views | # Vertices (Min/Mean/Max) | # Triangles (Min/Mean/Max) | Time ICP (min.) | Time Pos. (min.) | Time Total (min.) |
|---|---|---|---|---|---|---|
| 2 | 12 | 63,204 / 101,446 / 132,813 | 112,547 / 178,654 / 235,112 | 6.58 | 5.75 | 12.51 |
| 3 | 14 | 131,090 / 198,732 / 232,417 | 252,514 / 380,923 / 448,766 | 17.75 | 9.94 | 28.28 |
| 4 | 14 | 155,025 / 189,075 / 223,063 | 292,232 / 357,555 / 426,261 | 14.62 | 7.78 | 22.94 |
| 5 | 15 | 172,548 / 203,327 / 241,228 | 322,431 / 384,722 / 458,542 | 15.24 | 7.79 | 23.48 |
| Mean | 14 | 63,204 / 176,301 / 241,228 | 112,547 / 331,879 / 458,542 | 13.55 | 7.82 | 21.80 |
Table 3. The electrode positions computed using the proposed approach and defined by manually marking the clip center on each view deviated from each other, on average, by ≈1.9 mm ± 1.5 mm. In addition, the distance between the computed electrode position and the mean point obtained by the projection of each electrode onto each individual view, as well as the average position found by manually marking the center of the clip on each view, are provided. Along with the variation resulting from the ICP-based surface alignment, both values allow for the assessment of how well the proposed approach can approximate the true positions of the electrodes.
| Subj. | Manual (mm) Mean / Std. | Manual Mean–Marker (mm) Mean / Std. | Mapping (mm) Mean / Std. | Registration (mm) Mean / Std. |
|---|---|---|---|---|
| 2 | 1.4 / 0.9 | 1.7 / 1.4 | 0.6 / 0.9 | 0.6 / 0.2 |
| 3 | 2.1 / 1.4 | 1.6 / 1.5 | 0.9 / 1.4 | 0.6 / 0.2 |
| 4 | 2.9 / 1.8 | 2.2 / 1.5 | 1.1 / 1.8 | 0.7 / 0.2 |
| 5 | 2.7 / 1.6 | 2.4 / 1.8 | 0.9 / 1.6 | 0.6 / 0.2 |
| Mean | 2.3 / 1.4 | 2.0 / 1.5 | 0.9 / 1.4 | 0.6 / 0.2 |
Table 4. Comparison of the proposed approach for localizing ECG electrodes on the torso using a 3D DS camera with recent developments. In contrast to the large number of publications addressing the localization of EEG electrodes, only a few could be found using 3D DS cameras. The results obtained in the present study are within the ranges found by other studies.
| Source | Sensor/Method | Multi-View Registration | Ref. | # El. | μ (mm) | σ (mm) |
|---|---|---|---|---|---|---|
| **ECG** |  |  |  |  |  |  |
| Presented | 3D DS | Symmetric ICP | manual mean | 67 | 2.0 | 1.5 |
|  |  |  | reprojection |  | 0.9 | 1.4 |
|  | manual marking |  | manual mean |  | 2.3 | 1.4 |
| Perez E. (2018) [20] | 3D DS | Kinect Software | MRI/CT | 128 | 11.86 | 4.8 |
| Alioui S. (2017) [19] | 3D DS | only one view | N/A | 3 | 2.5 |  |
| Schulze W. (2014) [18] | Photogrammetry | Least Squares | marker plate | 80 | 1.16 | 0.97 |
| **EEG** |  |  |  |  |  |  |
| Chen S. (2019) [14] | 3D DS | Least Squares | multiple repeats | 30 | 3.26 | 1.05 |
| Homölle S. (2019) [8] | 3D DS | Scanner Software | Magnetic digitizer | 61 | 9.4 | 10.9 |
| Cline C. C. (2018) [13] | IR-Scanner | Scanner Software | Magnetic digitizer | 128 | 1.73 | 0.37 |
|  | Magnetic digitizer |  | IR-Scanner |  | 2.98 | 0.89 |
|  | VR-Digitizer |  |  |  | 3.74 | 0.71 |
| Clausner T. (2017) [43] | Photogrammetry | Photogrammetry software | Face Scan | 68 | 1.30 | 0.6 |
|  |  |  | Magnetic digitizer |  | 7.80 | 2.1 |
| Butler R. (2017) [9] | MRI |  | manual mean | 63 | 0.5 |  |
| Dalal S. S. (2014) [12] | Magnetic digitizer | FaceScan + Flying Triangulation |  | 68 | 6.81 | 3.3 |
|  | Flying Triangulation |  | manual mean |  | 1.5 | 2.9 |
| Kössler L. (2010) [16] | Laser scanner | Scanner software | Magnetic digitizer | 68 | 1.83 | 1.16 |