Detection of Left Ventricular Systolic Dysfunction From Electrocardiographic Images

Originally published: https://doi.org/10.1161/CIRCULATIONAHA.122.062646. Circulation. 2023;148:765–777

Abstract

BACKGROUND:

Left ventricular (LV) systolic dysfunction is associated with a >8-fold increased risk of heart failure and a 2-fold risk of premature death. The use of ECG signals in screening for LV systolic dysfunction is limited by their availability to clinicians. We developed a novel deep learning–based approach that can use ECG images for the screening of LV systolic dysfunction.

METHODS:

Using 12-lead ECGs plotted in multiple different formats, and corresponding echocardiographic data recorded within 15 days from the Yale New Haven Hospital between 2015 and 2021, we developed a convolutional neural network algorithm to detect an LV ejection fraction <40%. The model was validated within clinical settings at Yale New Haven Hospital and externally on ECG images from Cedars Sinai Medical Center in Los Angeles, CA; Lake Regional Hospital in Osage Beach, MO; Memorial Hermann Southeast Hospital in Houston, TX; and Methodist Cardiology Clinic of San Antonio, TX. In addition, it was validated in the prospective Brazilian Longitudinal Study of Adult Health. Gradient-weighted class activation mapping was used to localize class-discriminating signals on ECG images.

RESULTS:

Overall, 385 601 ECGs with paired echocardiograms were used for model development. The model demonstrated high discrimination across various ECG image formats and calibrations in internal validation (area under the receiver operating characteristic curve [AUROC], 0.91; area under the precision-recall curve [AUPRC], 0.55) and in external sets of ECG images from Cedars Sinai (AUROC, 0.90 and AUPRC, 0.53), outpatient Yale New Haven Hospital clinics (AUROC, 0.94 and AUPRC, 0.77), Lake Regional Hospital (AUROC, 0.90 and AUPRC, 0.88), Memorial Hermann Southeast Hospital (AUROC, 0.91 and AUPRC, 0.88), Methodist Cardiology Clinic (AUROC, 0.90 and AUPRC, 0.74), and the Brazilian Longitudinal Study of Adult Health cohort (AUROC, 0.95 and AUPRC, 0.45). An ECG suggestive of LV systolic dysfunction portended >27-fold higher odds of LV systolic dysfunction on transthoracic echocardiogram (odds ratio, 27.5 [95% CI, 22.3–33.9] in the held-out set). Class-discriminative patterns localized to the anterior and anteroseptal leads (V2 and V3), corresponding to the left ventricle regardless of the ECG layout. A positive ECG screen in individuals with an LV ejection fraction ≥40% at the time of initial assessment was associated with a 3.9-fold increased risk of developing incident LV systolic dysfunction in the future (hazard ratio, 3.9 [95% CI, 3.3–4.7]; median follow-up, 3.2 years).

CONCLUSIONS:

We developed and externally validated a deep learning model that identifies LV systolic dysfunction from ECG images. This approach represents an automated and accessible screening strategy for LV systolic dysfunction, particularly in low-resource settings.

Clinical Perspective

What Is New?

  • A convolutional neural network model was developed and externally validated that accurately identifies left ventricular (LV) systolic dysfunction from ECG images across subgroups of age, sex, and race.

  • The model shows robust performance across multiple institutions and health settings, both when applied to ECG image databases and when clinicians upload single ECG images directly to a web-based application.

  • The approach provides information for both screening of LV systolic dysfunction and its risk on the basis of ECG images alone.

What Are the Clinical Implications?

  • Our model represents an automated screening strategy for LV systolic dysfunction on a variety of ECG layouts.

  • With availability of ECG images in practice, this approach overcomes implementation challenges of deploying an interoperable screening tool for LV systolic dysfunction in resource-limited settings.

  • This model is available in an online format to facilitate real-time screening for LV systolic dysfunction by clinicians.

Left ventricular (LV) systolic dysfunction is associated with a >8-fold increased risk of subsequent heart failure and nearly a 2-fold risk of premature death.1 Although early diagnosis can effectively lower this risk,2–4 individuals are often diagnosed after developing symptomatic disease because of a lack of effective screening strategies.5–7 The diagnosis traditionally relies on echocardiography, a specialized, resource-intensive imaging modality that is difficult to deploy at scale.8,9 Algorithms using raw signals from electrocardiography have been developed as a strategy to detect LV systolic dysfunction.10–12 However, clinicians, particularly in remote settings, do not have access to ECG signals. The lack of interoperability in signal storage formats from ECG devices further limits the broad uptake of such signal-based models.13 The use of ECG images is an opportunity to implement interoperable screening strategies for LV systolic dysfunction.

We previously developed a deep learning approach of format-independent inference from real-world ECG images.14 The approach can interpretably diagnose cardiac conduction and rhythm disorders using any layout of real-world 12-lead ECG images and can be accessed on web-based or application-based platforms. Extension of this artificial intelligence–driven approach to ECG images to screen for LV systolic dysfunction could rapidly broaden access to a low-cost, easily accessible, and scalable diagnostic approach to underdiagnosed and undertreated at-risk populations. This approach adapts deep learning for end users without disruption of data pipelines or clinical workflow. Moreover, the ability to add localization of predictive cues in the ECG images relevant to the LV systolic dysfunction can improve the uptake of these models in clinical practice.15

In this study, we present a model that accurately identifies an LV ejection fraction (LVEF) <40%, a threshold with therapeutic implications, from ECG images. We developed, tested, and externally validated this approach using paired ECG-echocardiographic data from large academic hospitals, rural hospital systems, and a prospective cohort study.

METHODS

The Yale institutional review board reviewed the study, approved the study protocol, and waived the need for informed consent, as the study represents a secondary analysis of existing data. The data cannot be shared publicly, although an online version of the model is publicly available for research use at https://www.cards-lab.org/ecgvision-lv. This web application represents a prototype of the eventual application of the model, with instructions for required image standards and a version that demonstrates an automated image standardization pipeline.

Data Source for Model Development

We used 12-lead ECG signal waveform data from the Yale New Haven Hospital (YNHH) collected between 2015 and 2021. These ECGs were recorded as standard 12-lead recordings sampled at a frequency of 500 Hz for 10 seconds and were recorded on multiple different machines; a majority were collected using Philips PageWriter machines and GE MAC machines. Among patients with an ECG, those with a corresponding transthoracic echocardiogram (TTE) within 15 days of obtaining the ECG were identified from YNHH electronic health records. LVEF values were extracted based on a cardiologist’s read of the nearest TTE to each ECG. To augment the evaluation of models built on an image data set generated from this YNHH signal waveform, 6 ECG image data sets were used for external validation.

Data Preprocessing

During model development, all ECGs were analyzed to determine whether they had 10 seconds of continuous recordings across all 12 leads. Each 10-second sample was passed through a 1-second median filter, and the filtered signal was subtracted from the original waveform to remove baseline drift in each lead. These steps mirror the processing that ECG machines apply before generating printed output from collected waveform data.
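The baseline-removal step can be sketched as follows. This is a minimal illustration rather than the study code: the function name `remove_baseline`, the odd-length window, and the truncation of the window at the edges of the recording are all assumptions.

```python
from statistics import median

def remove_baseline(signal, fs=500, window_s=1.0):
    """Subtract a running-median baseline estimate from one ECG lead.

    Assumptions (not from the paper): the window spans an odd number of
    samples and is truncated at the edges of the recording.
    """
    half = int(fs * window_s) // 2          # 250 samples each side at 500 Hz
    n = len(signal)
    baseline = [
        median(signal[max(0, i - half): min(n, i + half + 1)])
        for i in range(n)
    ]
    return [x - b for x, b in zip(signal, baseline)]

# Toy lead (hypothetical values): constant 0.3 mV offset plus one narrow spike.
lead = [0.3] * 50
lead[25] = 1.3
clean = remove_baseline(lead, fs=10, window_s=1.0)
```

Because a median is insensitive to brief deflections, the narrow spike survives while the constant offset is removed.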

ECG signals were transformed into ECG images using the Python library ecg-plot16 and stored at 100 dots per inch. Images were generated with a calibration of 10 mm/mV, which is standard for printed ECGs in most real-world settings. In sensitivity analyses, we evaluated model performance on images calibrated at 5 and 20 mm/mV. All images, including those in train, validation, and test sets, were converted to grayscale, followed by down-sampling to 300×300 pixels regardless of their original resolution using Python Image Library (PIL v9.2.0). To ensure that the model was adaptable to real-world images, which may vary in formats and the layout of leads, we created a data set with different plotting schemes for each signal waveform recording (Figure 1). This strategy has been used to train a format-independent image-based model for detecting conduction and rhythm disorders, as well as the hidden label of sex.14 The model in this study learned ECG lead-specific information based on the label, regardless of the location of the lead.
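To make the calibration settings concrete, the sketch below converts a gain in mm/mV into vertical pixels per millivolt at the stated 100 dots per inch. The helper name `pixels_per_millivolt` is hypothetical; the only fact used beyond the text is that 1 inch equals 25.4 mm.

```python
MM_PER_INCH = 25.4

def pixels_per_millivolt(calibration_mm_per_mv, dpi=100):
    """Vertical pixels spanned by 1 mV at a given gain and plot resolution."""
    return calibration_mm_per_mv * dpi / MM_PER_INCH

# The three calibrations evaluated in the study, before any down-sampling.
px_per_mv = {gain: pixels_per_millivolt(gain) for gain in (5, 10, 20)}
```

At 10 mm/mV a 1 mV deflection spans roughly 39 pixels before down-sampling, so halving or doubling the gain changes waveform height proportionally.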

Figure 1. Study outline. A, Data processing. B, Model training. C, Model validation. The transfer learning strategy (displayed in B) in developing the current model includes transferring model initialization weights from the previous algorithm originally trained to detect cardiac rhythm disorders and the hidden label of sex from ECG images. Transfer learning was used as initialization weights for the EfficientNet B3 convolutional neural network being trained to detect left ventricular systolic dysfunction. Other than the weights, clinical and sex labels were not input into the current model. EF indicates ejection fraction; ELSA-Brasil, Estudo Longitudinal de Saúde do Adulto (the Brazilian Longitudinal Study of Adult Health); FC, fully connected layers; and Grad-CAM, gradient-weighted class activation mapping.

Four formats of images were included in the training image data set (Figure 1). The first format was based on the standard printed ECG format in the United States, with four 2.5-second columns printed sequentially on the page. Each column contained 2.5-second intervals from 3 leads. The full 10-second recording of the lead I signal was included as the rhythm strip. The second format, a 2-rhythm format, added lead II as an additional rhythm strip to the standard format. The third layout was the alternate format, which consisted of 2 columns, the first with 6 simultaneous 5-second recordings from the limb leads, and the second with 6 simultaneous 5-second recordings from the precordial leads without a corresponding rhythm lead. The fourth format was a shuffled format, which had precordial leads in the first 2 columns and limb leads in the third and fourth. All images were rotated a random amount between –10 and 10 degrees before being input into the model to mimic variations seen in uploaded ECGs and to aid in prevention of overfitting.
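One way to represent the four training layouts is as a small lookup table, sketched below. Several details are assumptions made for illustration and are not stated in the text: the order of leads within each column follows the conventional printed order, the shuffled format keeps lead I as its rhythm strip, and the two columns of the alternate format show consecutive 5-second windows. All names (`LAYOUTS`, `lead_windows`) are hypothetical.

```python
LIMB = ["I", "II", "III", "aVR", "aVL", "aVF"]
PRECORDIAL = ["V1", "V2", "V3", "V4", "V5", "V6"]

def columns(leads, per_column):
    """Split a lead list into consecutive columns of `per_column` leads."""
    return [leads[i:i + per_column] for i in range(0, len(leads), per_column)]

# name -> (columns of leads, seconds shown per column, rhythm-strip leads)
LAYOUTS = {
    "standard":   (columns(LIMB + PRECORDIAL, 3), 2.5, ["I"]),
    "two_rhythm": (columns(LIMB + PRECORDIAL, 3), 2.5, ["I", "II"]),
    "alternate":  ([LIMB, PRECORDIAL], 5.0, []),
    # Assumption: the shuffled format keeps the lead I rhythm strip.
    "shuffled":   (columns(PRECORDIAL + LIMB, 3), 2.5, ["I"]),
}

def lead_windows(layout, fs=500):
    """Map each lead to the (start, stop) sample range drawn in its column.

    Assumes columns show consecutive time windows of the 10-second recording.
    """
    cols, seconds, _ = LAYOUTS[layout]
    width = int(seconds * fs)
    return {lead: (c * width, (c + 1) * width)
            for c, col in enumerate(cols) for lead in col}
```

For example, `lead_windows("standard")["V2"]` gives the samples for the third column, whereas in the shuffled layout the same lead falls in the first column.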

The process of converting ECG signals to images was independent of model development, ensuring that the model did not learn any aspects of the processing that generated images from the signals. All ECGs were converted to images in all different formats without conditioning on clinical labels. The validation required uploaded images to be upright, cropped to the waveform region, with no brightness and contrast consideration as long as the waveform was distinguishable from the background and lead labels were discernible.

Experimental Design

Each included ECG had a corresponding LVEF value from its nearest TTE within 15 days of recording. Low LVEF was defined as an LVEF <40%, the cutoff used as an indication for most guideline-directed pharmacotherapy for heart failure.4 Patients with at least one ECG within 15 days of its nearest TTE were randomly split into training, validation, and held-out test patient level sets (85%, 5%, and 10%; Figure S1). This sampling was stratified by whether a patient had ever had an LVEF <40% to ensure that cases of preserved and reduced LVEF were split proportionally among the sets. In the training cohort, all ECGs within 15 days of a TTE were included for all patients to maximize the data available. In validation and testing cohorts, only one ECG was included per patient to ensure independence of observations in the assessment of performance metrics. This ECG was randomly chosen among all ECGs within 15 days of a TTE. In addition, to ensure that model learning was not affected by the relatively lower frequency of an LVEF <40%, higher weights were given to these cases at the training stage based on the effective number of samples in the class sampling scheme.17
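The patient-level split described above can be sketched as follows. This is an illustrative sketch only: the function names, the fixed random seed, and the toy cohort are assumptions, and the study's actual implementation is not described in the text.

```python
import random

def patient_level_split(ever_low_ef, fractions=(0.85, 0.05, 0.10), seed=0):
    """Split patient IDs into train/validation/test sets, stratified by
    whether the patient ever had an LVEF <40%.

    ever_low_ef: dict mapping patient_id -> bool.
    """
    rng = random.Random(seed)
    splits = {"train": [], "validation": [], "test": []}
    for flag in (True, False):                      # one stratum at a time
        ids = sorted(pid for pid, low in ever_low_ef.items() if low == flag)
        rng.shuffle(ids)
        n_train = round(fractions[0] * len(ids))
        n_val = round(fractions[1] * len(ids))
        splits["train"] += ids[:n_train]
        splits["validation"] += ids[n_train:n_train + n_val]
        splits["test"] += ids[n_train + n_val:]
    return splits

def one_ecg_per_patient(ecgs_by_patient, patient_ids, seed=0):
    """For evaluation sets, keep a single randomly chosen ECG per patient."""
    rng = random.Random(seed)
    return {pid: rng.choice(ecgs_by_patient[pid]) for pid in patient_ids}

# Toy cohort (hypothetical): 1000 patients, 15% ever had an LVEF <40%.
cohort = {f"p{i}": (i % 20 < 3) for i in range(1000)}
splits = patient_level_split(cohort)
```

Splitting on patient IDs rather than on ECGs keeps all recordings from one person in a single set, which avoids leakage between training and evaluation.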

Model Training

We built a convolutional neural network model based on the EfficientNet-B3 architecture,18 which previously demonstrated an ability to learn and identify both rhythm and conduction disorders, as well as the hidden label of sex in real-world ECG images.14 The EfficientNet-B3 model requires images to be sampled at 300×300 square pixels, includes 384 layers, and has >10 million trainable parameters (Figure S2). We used transfer learning by initializing model weights as the pretrained EfficientNet-B3 weights used to predict the 6 physician-defined clinical labels and sex from Sangha et al.14 We first only unfroze the last 4 layers and trained the model with a learning rate of 0.01 for 2 epochs, and then unfroze all layers and trained with a learning rate of 5×10⁻⁶ for 6 epochs. We used an Adam optimizer, gradient clipping, and a minibatch size of 64 throughout training. The optimizer and learning rates were chosen after hyperparameter optimization. For both stages of training the model, we stopped training when validation loss did not improve in 3 consecutive epochs.
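The two-stage schedule and the stopping rule can be summarized in a framework-independent sketch. The dictionary keys, the helper `epochs_run`, and the example loss values are hypothetical; only the learning rates, epoch counts, batch size, and patience come from the text.

```python
STAGES = [
    # Stage 1: only the last 4 layers are trainable.
    {"trainable_layers": "last_4", "learning_rate": 1e-2, "max_epochs": 2},
    # Stage 2: all layers are trainable.
    {"trainable_layers": "all", "learning_rate": 5e-6, "max_epochs": 6},
]
BATCH_SIZE = 64
PATIENCE = 3

def epochs_run(val_losses, max_epochs, patience=PATIENCE):
    """Return how many epochs run before early stopping ends a stage.

    val_losses: validation loss after each epoch, in order.
    Training stops once `patience` consecutive epochs fail to improve on
    the best loss seen so far, or when max_epochs is reached.
    """
    best, stale = float("inf"), 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return min(len(val_losses), max_epochs)

# Hypothetical validation losses for stage 2: no improvement after epoch 2.
stopped_at = epochs_run([0.50, 0.40, 0.41, 0.42, 0.43, 0.30],
                        STAGES[1]["max_epochs"])
```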

We trained and validated our model on a generated image data set that had equal numbers of standard, 2-rhythm, alternate, and standard shuffled images (Figure 1). In sensitivity analyses, the model was validated on 3 novel ECG layouts constructed from the held-out set to assess its performance on ECG formats not encountered in the training process. These novel ECG outlines included 3-rhythm (with leads I, II, and V1 as the rhythm strip), no rhythm, and rhythm on top formats (with lead I as the rhythm strip located above the 12-lead; Figure S3). Additional sensitivity analyses were performed using ECG images calibrated at 5, 10, and 20 mm/mV (Figure S4). A custom class-balanced loss function (weighted binary cross-entropy) based on the effective number of samples was used given the lower frequency of the LVEF <40% label relative to those with an LVEF ≥40%. Furthermore, model performance was evaluated in a 5-fold cross-validation analysis using the original derivation (train and validation) set. A patient-level split stratified by LVEF <40% versus ≥40% was pursued in this analysis, and model performance was assessed on the held-out test set.
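A class-balanced binary cross-entropy based on the effective number of samples can be sketched as below. The value of `beta`, the normalization of the weights, and the toy class counts are assumptions chosen for illustration; the text does not report the settings used in the study.

```python
import math

def class_balanced_weights(counts, beta=0.9999):
    """Weights inversely proportional to the effective number of samples,
    E_n = (1 - beta**n) / (1 - beta), normalized to sum to the class count.

    beta is a hyperparameter; 0.9999 is an assumed value.
    """
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in counts]
    scale = len(counts) / sum(raw)
    return [w * scale for w in raw]

def weighted_bce(y_true, p_pred, w_neg, w_pos, eps=1e-7):
    """Mean binary cross-entropy with a per-class weight on each example."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)            # avoid log(0)
        total += -(w_pos * y * math.log(p)
                   + w_neg * (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Toy counts (hypothetical): 5000 ECGs with LVEF >=40%, 500 with LVEF <40%.
w_neg, w_pos = class_balanced_weights([5000, 500])
loss = weighted_bce([1, 0, 0, 1], [0.8, 0.1, 0.3, 0.4], w_neg, w_pos)
```

The rarer class receives the larger weight, so errors on LVEF <40% cases contribute more to the loss than errors on the majority class.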

External Validation

We pursued a series of validation studies. These represented both clinical and population-based cohort studies. Clinical validation represented nonsynthetic image data sets from clinical settings spanning consecutive patients undergoing outpatient echocardiography at the Cedars Sinai Medical Center in Los Angeles, CA, and stratified convenience samples of LV systolic dysfunction and non-LV systolic dysfunction ECGs from 4 different settings: (1) outpatient clinics of YNHH; (2) inpatient admissions at Lake Regional Hospital in Osage Beach, MO; (3) inpatient admissions at Memorial Hermann Southeast Hospital in Houston, TX; and (4) outpatient visits and inpatient admissions at Methodist Cardiology Clinic in San Antonio, TX. In addition, we validated our approach in the prospective cohort from Brazil, the Brazilian Longitudinal Study of Adult Health (ELSA-Brasil),19 with protocolized ECGs and echocardiograms in study participants.

Inclusion and exclusion criteria for external validation sets were similar to the internal YNHH data set. Patients were limited to those having a 12-lead ECG within 15 days of a TTE with reported LVEF. For patients with more than one TTE in this interval, the LVEF from the nearest TTE was used for analysis.

At Cedars Sinai, all index ECG images from consecutive patients undergoing outpatient visits between January and March 2019, representing 879 individuals, including 99 with an LVEF <40%, were included. These analyses were performed in a fully federated and blinded fashion without access to the ECGs by developers of the algorithm.

For the other clinical validation sites, a stratified convenience sample enriched for low LVEF was drawn. This was done to evaluate the broad use in a clinical setting by practicing clinicians without access to a research data set. Our preliminary assessments of LV systolic dysfunction prevalence in outpatient and inpatient settings were 10% and 20%, respectively. We sought to achieve twice this prevalence in our external validation data in these sites to ensure that our performance was not driven by patients with preserved LVEF and that the model could detect those with LV systolic dysfunction. In particular, a 1:4 ratio of ECGs corresponding to an LVEF <40% and ≥40% was sought at 3 of the 4 sites (YNHH, Memorial Hermann Southeast Hospital, and Methodist Cardiology Clinic). At the fourth site, Lake Regional Hospital, a 1:2 ratio was requested to better measure discriminative ability of the model in an inpatient-only setting.

In addition to the clinical validation studies, in which concurrent ECGs and echocardiograms are always clinically indicated, imposing a selection of the population, we evaluated our model in the ELSA-Brasil study, a community-based prospective cohort in Brazil that obtained ECGs and echocardiography from participants on the enrollment visit between 2008 and 2010. This set included data from 2577 individuals, including 30 from individuals with an LVEF <40%.

Before validation, patient identifiers, ECG measurements, and reported diagnoses were removed from all ECG images. The differences in ECG layouts and the procedures for validation are described in further detail in the Supplemental Material. Deidentified samples of ECG images are presented in Figure S5 (Cedars Sinai Medical Center), Figure S6 (YNHH and Lake Regional Hospital), Figure S7 (Memorial Hermann Southeast Hospital), and Figure S8 (Methodist Cardiology Clinic), and images are available from the authors on request.

Localization of Model Predictive Cues

We used gradient-weighted class activation mapping (Grad-CAM) to highlight which portions of an image were important for predicting an LVEF <40%.20 We calculated the gradients on the final stack of filters in our EfficientNet-B3 model for each prediction and performed a global average pooling of the gradients in each filter, emphasizing those that contributed to a prediction. We then multiplied these filters by their importance weights and combined them across filters to generate Grad-CAM heatmaps. We averaged class activation maps among 100 positive cases with the most confident model predictions for an LVEF <40% across ECG formats to determine the most important image areas for the prediction of low LVEF. We took an arithmetic mean across the heatmaps for a given image format and overlaid this average heatmap across a representative ECG before conversion of the image to grayscale. The Grad-CAM intensities were converted from their original scale (0–1) to a color range using the jet colormap array in the Python library matplotlib. This colormap was then overlaid on the original ECG image with an α of 0.3. The activation map, a 10×10 array, was upsampled to the original image size using the bilinear interpolation built into TensorFlow v2.8.0. We also evaluated the Grad-CAM for individual ECGs to evaluate the consistency of the information on individual examples.
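The core Grad-CAM arithmetic (pool the gradients per filter, weight the feature maps, combine) can be sketched without a deep learning framework. This sketch operates on plain nested lists and is not the study code; the ReLU step follows the standard Grad-CAM formulation and is an assumption here, as are the function names and toy values.

```python
def grad_cam(activations, gradients):
    """Grad-CAM heatmap from the final convolutional feature maps.

    activations, gradients: nested lists shaped [filter][row][col]; the
    gradients are d(score)/d(activation) for the LVEF <40% output.
    Returns a [row][col] map scaled to the 0-1 range.
    """
    rows, cols = len(activations[0]), len(activations[0][0])
    # Importance weight per filter: global average pooling of its gradients.
    weights = [sum(map(sum, g)) / (rows * cols) for g in gradients]
    # Weighted sum across filters, then ReLU (standard Grad-CAM; assumed).
    cam = [[max(0.0, sum(w * a[r][c] for w, a in zip(weights, activations)))
            for c in range(cols)] for r in range(rows)]
    peak = max(map(max, cam))
    return [[v / peak if peak > 0 else 0.0 for v in row] for row in cam]

def mean_heatmap(heatmaps):
    """Arithmetic mean of several equally sized heatmaps."""
    n = len(heatmaps)
    return [[sum(h[r][c] for h in heatmaps) / n
             for c in range(len(heatmaps[0][0]))]
            for r in range(len(heatmaps[0]))]

# Toy example (hypothetical): 2 filters over a 2x2 grid.
acts = [[[1, 0], [0, 0]], [[0, 0], [0, 2]]]
grads = [[[4, 4], [4, 4]], [[-1, -1], [-1, -1]]]
heat = grad_cam(acts, grads)
```

In the study the map is a 10×10 array that is then upsampled and color-mapped; those display steps are omitted here.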

Preprocessing Strategies for Noisy Input Data

Standard input requirements for our image-based model include ECG images limited to 12-lead tracings with an upright orientation, minimal rotation, solid background, and no peripheral annotations. To mitigate the impact of noisy input data on model predictions in real-world applications, we built in an automated preprocessing function that includes 2 major steps. In the first step, straightening and cropping, the input ECG image is automatically straightened to correct for rotations and then cropped to remove the peripheral elements. The output of this preprocessing step is a 12-lead tracing without surrounding annotations and patient identifiers.

In the second step, quality evaluation and standardization, the algorithm computes the mean pixel-level brightness and contrast values for input images and evaluates them against the brightness and contrast of images used in model development. The brightness and contrast are both scaled to the mean values of the development population before predictions. ECGs with extreme deviations of brightness and contrast (50% above or below the development set) are flagged as out of range so a better-quality image can be acquired and input.
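The quality check and standardization step might look like the sketch below. It is a guess at one reasonable implementation: treating contrast as the standard deviation of pixel intensities, the linear rescaling, the clipping to 0–255, and all names and reference values are assumptions not given in the text.

```python
from statistics import mean, pstdev

def brightness_contrast(pixels):
    """Mean intensity and standard deviation of a flat list of gray values.

    Using the standard deviation as 'contrast' is an assumption.
    """
    return mean(pixels), pstdev(pixels)

def standardize(pixels, ref_brightness, ref_contrast, tolerance=0.5):
    """Rescale an image to the reference brightness and contrast, or flag it.

    Returns (pixels_or_None, in_range). Images whose brightness or contrast
    deviates by more than `tolerance` (50%) from the reference are flagged.
    """
    b, c = brightness_contrast(pixels)
    in_range = (abs(b / ref_brightness - 1) <= tolerance
                and abs(c / ref_contrast - 1) <= tolerance)
    if not in_range or c == 0:
        return None, False
    scaled = [(p - b) * ref_contrast / c + ref_brightness for p in pixels]
    return [min(255.0, max(0.0, p)) for p in scaled], True

# Toy image (hypothetical): mostly light background with a few dark pixels.
image = [200] * 8 + [40] * 2
out, ok = standardize(image, ref_brightness=200, ref_contrast=50)
too_dark, dark_ok = standardize([20] * 5 + [10] * 5, 200, 50)
```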

We evaluated the model calibration across the variations of photo brightness and contrast. For this analysis, we used the Python Image Library to adjust the input image qualities. A total of 200 ECGs were randomly selected from the held-out test set in a 1:4 ratio for an LVEF <40% and ≥40%, respectively. Variations of the original image were generated with brightness and contrast between 0.5× and 1.5× the original values and were used in this sensitivity analysis.

Statistical Analysis

Categorical variables were presented as frequency and percentages, and continuous variables were presented as means and standard deviations or median and interquartile range, as appropriate. Model performance was evaluated in the held-out test set and external ECG image data sets. We used the area under the receiver operating characteristic curve (AUROC) to measure model discrimination. The cutoff for binary prediction of LV systolic dysfunction was set at 0.10 for all internal and external validations, based on the threshold that achieved a sensitivity of >90% in the internal validation set. We also assessed the area under the precision-recall curve (AUPRC), sensitivity, specificity, positive predictive value, negative predictive value, and diagnostic odds ratio. The 95% CIs for AUROC and AUPRC were calculated using the DeLong algorithm and bootstrapping with 1000 variations for each estimate, respectively.21,22 Model performance was assessed across demographic subgroups and ECG outlines, as described previously. We conducted further sensitivity analyses of model performance across ECG calibrations. We also evaluated model performance across PR intervals (>200 versus ≤200 ms) and after excluding ECGs with paced rhythms, conduction disorders, atrial fibrillation, and atrial flutter. Moreover, we assessed the association of the predicted probability of LV systolic dysfunction of the model across LVEF categories.
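Choosing a probability cutoff from a sensitivity target can be sketched as follows. Taking the largest cutoff that still meets the target is an assumption about how the threshold was selected, and the labels and scores are toy values rather than study data.

```python
def sensitivity_at(threshold, y_true, scores):
    """Fraction of true positives with a score at or above the threshold."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    return sum(s >= threshold for s in positives) / len(positives)

def pick_threshold(y_true, scores, target_sensitivity=0.90):
    """Largest observed score that still keeps sensitivity >= target."""
    for t in sorted(set(scores), reverse=True):
        if sensitivity_at(t, y_true, scores) >= target_sensitivity:
            return t
    return min(scores)

# Toy validation set (hypothetical): 10 positives followed by 10 negatives.
labels = [1] * 10 + [0] * 10
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.15, 0.05,
          0.6, 0.3, 0.12, 0.1, 0.08, 0.05, 0.04, 0.03, 0.02, 0.01]
cutoff = pick_threshold(labels, scores)
```

A cutoff fixed this way on the validation set is then applied unchanged to the test and external sets, which is why the observed sensitivity elsewhere can fall slightly below the target.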

Next, we evaluated the future development of LV systolic dysfunction in time-to-event models using a Cox proportional hazards model. In this analysis, we took the first temporal ECG from the patients in the held-out test set and then modeled the first development of an LVEF <40% across the groups of patients who screened positive but did not have concurrent LV systolic dysfunction (false positives), and those who screened negative (true negatives) from this first ECG, with censoring at death or the end of the study period in June 2021. In addition, we computed an adjusted hazard ratio that accounted for differences in age, sex, and baseline LVEF at the time of index screening for visualization of survival trends. Analytic packages used in model development and statistical analysis are reported in Table S1. All model development and statistical analyses were performed using Python 3.9.5, and the level of significance was set at an α of 0.05.

RESULTS

Study Population

Of the 2 135 846 ECGs obtained between 2015 and 2021, 440 072 were from patients who had TTEs within 15 days of obtaining the ECG. Overall, 433 027 had a complete ECG recording, representing 10 seconds of continuous recordings across all 12 leads. These ECGs were drawn from 116 210 unique patients and were split into train, validation, and test sets at a patient level (Figure S1).

A total of 116 210 individuals with 385 601 ECGs constituted the study population, representing those included in the training, validation, and test sets. Individuals in the model development population had a median age of 68 years (interquartile range, 56–78) at the time of ECG recording, and 59 282 (51.0%) were women. Overall, 75 928 (65.3%) were non-Hispanic White, 14 000 (12.0%) were non-Hispanic Black, 9349 (8.0%) were Hispanic, and 16 843 (14.5%) were from other races. A total of 56 895 (14.8%) ECGs had a corresponding echocardiogram with an LVEF <40%, 36 669 (9.5%) had an LVEF ≥40% but <50%, and 292 037 (75.7%) had an LVEF ≥50% (Table S2).

Detection of LV Systolic Dysfunction

The AUROC of the model for detecting an LVEF <40% on the held-out test set composed of standard images was 0.91, and its AUPRC was 0.55 (Figure 2). A probability threshold for predicting an LVEF <40% was chosen on the basis of a sensitivity of ≥0.90 in the validation subset, with a specificity of 0.77 at this threshold in the internal validation set. With this threshold, the model had sensitivity and specificity of 0.89 and 0.77 in the held-out test set and positive predictive value and negative predictive value of 0.26 and 0.99, respectively. Overall, an ECG suggestive of LV systolic dysfunction portended a >27-fold higher odds (odds ratio, 27.5 [95% CI, 22.3–33.9]) of LV systolic dysfunction on TTE (Table 1). The performance of the model was comparable across subgroups of age, sex, and race (Table 1; Figure 2). In a cross-validation analysis, model performance remained consistent across 5 splits and was similar to the performance of the original model (Table S3). Moreover, across successive deciles of the model-predicted probabilities, the proportion of individuals with LV systolic dysfunction increased, whereas the mean LVEF decreased (Figure S9).
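The reported screening metrics are mutually consistent, as the following check illustrates. The false-positive and true-negative counts (2469 and 8197) are those reported for the held-out test set; the true-positive and false-negative counts are approximations inferred here from the reported sensitivity and are not taken from the paper.

```python
def screening_metrics(tp, fn, fp, tn):
    """Standard 2x2 screening metrics, including the diagnostic odds ratio."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "diagnostic_odds_ratio": (tp * tn) / (fp * fn),
    }

# FP and TN are reported counts. TP and FN are inferred (assumption):
# 11 621 - 10 666 = 955 ECGs with LVEF <40%, of which about 89.2% are detected.
metrics = screening_metrics(tp=852, fn=103, fp=2469, tn=8197)
```

With these counts the sensitivity, specificity, predictive values, and odds ratio all land within rounding of the values in Table 1, and the low positive predictive value reflects the roughly 8% prevalence of an LVEF <40% in the test set.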

Table 1. Performance of Model on Test Images Across Demographic Subgroups in the Held-Out Test Set

| Labels | No. (%) | Positive predictive value | Negative predictive value | Specificity | Sensitivity | AUROC (95% CI) | AUPRC (95% CI) | F1 score |
|---|---|---|---|---|---|---|---|---|
| All | 11 621 (100) | 0.257 | 0.988 | 0.769 | 0.892 | 0.910 (0.901–0.919) | 0.545 (0.511–0.579) | 0.399 |
| Male | 5952 (51.2) | 0.285 | 0.984 | 0.735 | 0.897 | 0.901 (0.889–0.914) | 0.583 (0.539–0.621) | 0.433 |
| Female | 5668 (48.8) | 0.215 | 0.991 | 0.802 | 0.884 | 0.917 (0.903–0.932) | 0.470 (0.416–0.530) | 0.346 |
| ≥65 y | 6550 (56.4) | 0.252 | 0.985 | 0.717 | 0.896 | 0.892 (0.880–0.905) | 0.522 (0.480–0.561) | 0.393 |
| <65 y | 5068 (43.6) | 0.266 | 0.991 | 0.833 | 0.886 | 0.931 (0.916–0.945) | 0.590 (0.534–0.655) | 0.410 |
| Hispanic | 942 (8.1) | 0.253 | 0.992 | 0.802 | 0.908 | 0.926 (0.892–0.961) | 0.576 (0.453–0.696) | 0.396 |
| White | 7557 (65.0) | 0.261 | 0.988 | 0.770 | 0.895 | 0.910 (0.898–0.921) | 0.537 (0.498–0.580) | 0.404 |
| Black | 1417 (12.2) | 0.263 | 0.984 | 0.712 | 0.897 | 0.899 (0.872–0.925) | 0.590 (0.498–0.665) | 0.407 |
| Other | 1705 (14.7) | 0.231 | 0.987 | 0.787 | 0.864 | 0.912 (0.887–0.937) | 0.532 (0.437–0.625) | 0.364 |
| Atrial fibrillation or flutter | 1518 (13.1) | 0.274 | 0.974 | 0.572 | 0.912 | 0.858 (0.831–0.885) | 0.540 (0.470–0.613) | 0.421 |
| No atrial fibrillation or flutter | 10 103 (86.9) | 0.251 | 0.989 | 0.796 | 0.886 | 0.917 (0.907–0.927) | 0.548 (0.511–0.586) | 0.392 |
| Paced ECGs | 551 (4.7) | 0.360 | 0.983 | 0.302 | 0.987 | 0.821 (0.784–0.858) | 0.626 (0.549–0.712) | 0.528 |
| No paced ECGs | 11 070 (95.3) | 0.241 | 0.988 | 0.786 | 0.873 | 0.908 (0.898–0.919) | 0.527 (0.493–0.566) | 0.378 |
| PR interval >200 ms | 1253 (10.8) | 0.265 | 0.980 | 0.731 | 0.865 | 0.900 (0.871–0.929) | 0.582 (0.497–0.671) | 0.405 |
| PR interval ≤200 ms | 10 368 (89.2) | 0.255 | 0.988 | 0.773 | 0.896 | 0.911 (0.902–0.921) | 0.540 (0.503–0.579) | 0.398 |
| Left bundle branch block | 399 (3.4) | 0.328 | 0.953 | 0.277 | 0.963 | 0.804 (0.756–0.852) | 0.602 (0.504–0.701) | 0.489 |
| No left bundle branch block | 11 222 (96.6) | 0.249 | 0.988 | 0.782 | 0.883 | 0.911 (0.901–0.921) | 0.538 (0.503–0.575) | 0.389 |
| Right bundle branch block | 933 (8.0) | 0.238 | 0.987 | 0.697 | 0.909 | 0.882 (0.847–0.917) | 0.457 (0.352–0.547) | 0.377 |
| No right bundle branch block | 10 688 (92.0) | 0.259 | 0.988 | 0.775 | 0.890 | 0.912 (0.903–0.922) | 0.554 (0.519–0.590) | 0.401 |

AUPRC indicates area under precision recall curve; and AUROC, area under receiver operating characteristic curve.

*Sex information was not available for one patient, and age was not available for 3 patients of the total 11 621 patients in the held-out test set.

Figure 2. Model performance measures. Receiver-operating (A) and precision-recall (B) curves on images in held-out test set. C, Diagnostic odds ratios across age, sex, and race subgroups on standard-format images in the held-out test set. AUROC indicates area under receiver-operating characteristic curve; and AUPRC, area under precision-recall curve.

Model Performance Across ECG Formats and Calibrations

The model performance was comparable across the 4 original layouts of ECG images in the held-out set, with an AUROC of 0.91 in detecting concurrent LV systolic dysfunction (Table S4). The model had a sensitivity of 0.89, and a positive prediction conferred a 26- to 27-fold higher odds of LV systolic dysfunction on the standard and the 3 variations of the data. In sensitivity analyses, the model demonstrated similar performance in detecting LV systolic dysfunction from novel ECG formats that were not encountered previously, with an AUROC between 0.88 and 0.91 (Table S5).

The performance of the model was also consistent across ECG calibrations, with an AUROC between 0.88 and 0.91 on ECG calibrations of 5, 10, and 20 mm/mV and an AUROC of 0.909 (0.900–0.918) and an AUPRC of 0.539 (0.504–0.574), with mixed calibrations in the held-out test set. The mixed calibration was generated with a random sample of 5- and 20-mm/mV calibrations from the highest and lowest quartiles of voltages, respectively, in lead I (together representing 25% of the sample from the test set), along with 10 mm/mV (remaining 75% of test set; Table S6). Further sensitivity analyses demonstrated consistent model performance on ECGs: (1) without a prolonged PR interval (AUROC, 0.920 and AUPRC, 0.537; Table S7); (2) without paced rhythms (AUROC, 0.908 and AUPRC, 0.519; Table S8); and (3) without atrial fibrillation, atrial flutter, and conduction disorders (AUROC, 0.919 and AUPRC, 0.536; Table S9). Model performance was also consistent across subsets on the held-out test set based on the timing of the ECG relative to the echocardiogram (Table S10).

LV Systolic Dysfunction in Model-Predicted False Positives

Of the 10 666 ECGs in the held-out test set with an associated LVEF ≥40% on a proximate echocardiogram, the model classified 2469 (23.1%) as “false positives” and 8197 (76.9%) as true negatives. In further evaluation of false positives, 562 (22.8% of false positives) had evidence of mild LV systolic dysfunction, with an LVEF between 40% and 50% on concurrent echocardiography.

In this group of individuals, 4046 patients had at least one follow-up TTE, including 1125 (27.8%) false positives and 2921 (72.2%) true negatives on the initial index screen. There were 2665 and 6083 echocardiograms in the false positive and true negative populations during the follow-up, with the longest follow-up of 6.1 years. Overall, 264 (23.5%) patients with a model-predicted positive screen and 199 (6.8%) with a negative screen developed new LVEF <40% over the median follow-up of 3.2 years (interquartile range, 1.8–4.4 years; Figure 3). This represented a 3.9-fold higher risk of incident low LVEF on the basis of having a positive screening result (hazard ratio, 3.9 [95% CI, 3.3–4.7]). After adjustment for age, sex, and LVEF at the time of screening, patients with a positive screen had a 2.3-fold higher risk of incident low LVEF (adjusted hazard ratio, 2.3 [95% CI, 1.9–2.8]).

Figure 3. Cumulative hazard curves for incident left ventricular systolic dysfunction in model-predicted positive and negative screens among the members of the held-out test set with a left ventricular ejection fraction ≥40% and at least 1 follow-up measurement.

Localization of Predictive Cues for LV Systolic Dysfunction

Class activation heatmaps of the 100 positive cases with the most confident model predictions of reduced LVEF across 4 ECG layouts are presented in Figure 4. For all 4 image formats, the regions corresponding to leads V2 and V3 were the most important areas for prediction of reduced LVEF. Figure S10 represents the distribution of mean Grad-CAM signal intensities in the regions corresponding to leads V2 and V3 and the other regions of standard format ECGs in this sample. For the majority of cases, the Grad-CAM signal intensities in the V2 and V3 areas were higher than in the other regions of the ECG. Representative images of Grad-CAM analysis in sampled individuals with positive and negative screens in the held-out test set and nonsynthetic ECG images in validation sites are presented in Figures S11 and S12, respectively.
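The comparison of mean Grad-CAM intensity inside and outside the V2 and V3 region can be illustrated with a small sketch. It assumes the heatmap is already available as a 2-dimensional grid of intensities and that the lead region is a rectangular box; the function name, the box coordinates, and the toy values are hypothetical and are not taken from the study.

```python
def region_means(heatmap, box):
    """Mean heatmap intensity inside and outside a rectangular region.

    heatmap: list of rows, each a list of floats.
    box: (row_start, row_end, col_start, col_end), end indices exclusive.
    """
    row_start, row_end, col_start, col_end = box
    inside, outside = [], []
    for r, row in enumerate(heatmap):
        for c, value in enumerate(row):
            if row_start <= r < row_end and col_start <= c < col_end:
                inside.append(value)
            else:
                outside.append(value)
    return sum(inside) / len(inside), sum(outside) / len(outside)


# Toy heatmap: the right half stands in for the V2 and V3 area.
toy_heatmap = [
    [0.1, 0.1, 0.9, 0.9],
    [0.1, 0.1, 0.9, 0.9],
]
hypothetical_v2_v3_box = (0, 2, 2, 4)
mean_inside, mean_outside = region_means(toy_heatmap, hypothetical_v2_v3_box)
print(mean_inside > mean_outside)
```

Repeating this per ECG and plotting the two sets of means would give a distribution of the kind described for Figure S10.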

Figure 4. Gradient-weighted class activation mapping across ECG formats. A, Standard format. B, Two rhythm leads. C, Standard shuffled format. D, Alternate format. The heatmaps represent averages of the 100 positive cases with the most confident model predictions for a left ventricular ejection fraction <40%.

External Validation

The validation performance of the model was consistent and robust across each of the 6 validation data sets (Figure 5). The first validation set at Cedars Sinai Medical Center included 879 ECGs from consecutive patients who underwent outpatient echocardiography, including 99 (11%) individuals with an LVEF <40%. The model demonstrated an AUROC of 0.90 and an AUPRC of 0.53 in this set. Second, 147 ECG images drawn from YNHH outpatient clinics were used for validation and included 27 images (18%) from patients with an LVEF <40%. The model had an AUROC of 0.94 and AUPRC of 0.77 in validation on these images. The third image data set included ECG images from inpatient visits to the Lake Regional Hospital. It included 100 ECG images, with 43 images (43%) from patients with an LVEF <40%, with a model AUROC of 0.90 and AUPRC of 0.88. The fourth data set from Memorial Hermann Southeast Hospital included 50 ECG images, 11 (22%) from patients with an LVEF <40%, with a model AUROC and AUPRC of 0.91 and 0.88 on these images, respectively. The fifth validation set contained 50 ECG images from the Methodist Cardiology Clinic, which included 11 (22%) ECGs from patients with an LVEF <40%, with a model AUROC of 0.90 and an AUPRC of 0.74.

Figure 5. Receiver-operating curves for external validation sites. AUROC indicates area under receiver-operating characteristic curve; ELSA-Brasil, Brazilian Longitudinal Study of Adult Health; LRH, Lake Regional Hospital; and YNHH, Yale New Haven Hospital.

The sixth set included 2577 ECGs from prospectively enrolled individuals in the ELSA-Brasil study, including 30 with an LVEF <40%. The model demonstrated an AUROC of 0.95 and an AUPRC of 0.45 on this set. In a mixed sample of ECG-echocardiography data from all external validation sites, the model demonstrated an AUROC and AUPRC of 0.96 (0.950–0.969) and 0.63 (0.563–0.694), respectively, in detecting LV systolic dysfunction. The model performance on these 6 validation sets is outlined in Table 2 and Tables S11–S13.

Table 2. Performance of Model on External Validation Data Sets

| Site | Positive predictive value | Negative predictive value | Specificity | Sensitivity | AUROC | AUPRC | F1 score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Cedars Sinai Medical Center | 0.326 | 0.979 | 0.772 | 0.869 | 0.902 (0.877–0.926) | 0.533 (0.432–0.640) | 0.474 |
| Outpatient Clinics of Yale New Haven Hospital | 0.338 | 1.000 | 0.558 | 1.000 | 0.946 (0.910–0.982) | 0.775 (0.605–0.916) | 0.505 |
| Lake Regional Hospital | 0.538 | 0.955 | 0.368 | 0.977 | 0.901 (0.843–0.959) | 0.889 (0.810–0.946) | 0.694 |
| Memorial Hermann Southeast Hospital | 0.385 | 0.958 | 0.590 | 0.909 | 0.918 (0.790–1.000) | 0.888 (0.699–1.000) | 0.541 |
| Methodist Cardiology Clinic | 0.458 | 1.000 | 0.667 | 1.000 | 0.902 (0.816–0.989) | 0.738 (0.470–0.928) | 0.629 |
| ELSA-Brasil | 0.256 | 0.996 | 0.976 | 0.700 | 0.949 (0.915–0.983) | 0.449 (0.290–0.651) | 0.375 |
| All validation sites | 0.356 | 0.993 | 0.900 | 0.891 | 0.959 (0.950–0.969) | 0.631 (0.563–0.694) | 0.508 |

AUPRC indicates area under precision recall curve; AUROC, area under receiver operating characteristic curve; and ELSA-Brasil, Estudo Longitudinal de Saúde do Adulto (the Brazilian Longitudinal Study of Adult Health).
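The threshold-dependent columns of Table 2 all follow from a 2×2 confusion matrix. The sketch below recomputes them for the Memorial Hermann Southeast Hospital row. The cell counts are an assumption: they are not reported in the text and were inferred here as the only whole-number counts consistent with 50 images, 11 of them from patients with an LVEF <40%, and the rounded values in the table.

```python
def screening_metrics(tp, fp, fn, tn):
    """Standard screening metrics from confusion-matrix cell counts."""
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "ppv": ppv,
        "npv": npv,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Inferred (not reported) counts for Memorial Hermann Southeast Hospital:
# 11 images with LVEF < 40% and 39 images with LVEF >= 40%.
metrics = screening_metrics(tp=10, fp=16, fn=1, tn=23)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

AUROC and AUPRC cannot be recovered this way because they summarize performance over all thresholds rather than at a single operating point.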

Quality Assurance in Real-World Applications

We assessed our preprocessing pipeline in segmentation and quality standardization of real-world ECG images. Figure S13 represents examples of ECGs in electronic PDF format before and after preprocessing and demonstrates the automated removal of ECG annotations and patient identifiers from the image. Figures S14 and S15 demonstrate quality standardization of photographs of ECGs obtained by a smartphone with extreme variations of photo brightness, shadows, skew angles, and noise artifacts. Furthermore, we systematically evaluated our model calibration across the variations of photo brightness and contrast in a sample of 200 ECGs randomly selected from the held-out test set in a 1:4 ratio, for an LVEF <40% and ≥40%, respectively. We observed minimal changes in model-predicted probabilities despite 50% alterations in image brightness and contrast on preprocessed images (Figure S16). This effect remained consistent across ECGs from individuals with a low (Figure S17) and normal (Figure S18) LVEF. Table S14 presents the confusion matrices for model predictions at varying levels of input image brightness and contrast, with or without preprocessing.
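A brightness and contrast sensitivity check of the kind described above can be sketched as follows. The exact perturbation used in the study is not specified in this section, so the two adjustment functions below are assumptions based on a common convention (brightness scales pixel values; contrast scales each pixel's distance from the image mean). The `predict` argument stands for any function that maps an image to a probability and is a placeholder, not the study model.

```python
def clip(value):
    """Keep a pixel value in the 8-bit grayscale range."""
    return max(0.0, min(255.0, value))


def adjust_brightness(pixels, factor):
    """Scale every pixel; factor 1.0 leaves the image unchanged."""
    return [[clip(p * factor) for p in row] for row in pixels]


def adjust_contrast(pixels, factor):
    """Scale each pixel's distance from the image mean; 1.0 is unchanged."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [[clip(mean + factor * (p - mean)) for p in row] for row in pixels]


def max_probability_shift(predict, pixels, factors=(0.5, 0.75, 1.0, 1.25, 1.5)):
    """Largest absolute change in predicted probability over all perturbations."""
    baseline = predict(pixels)
    shifts = []
    for factor in factors:
        shifts.append(abs(predict(adjust_brightness(pixels, factor)) - baseline))
        shifts.append(abs(predict(adjust_contrast(pixels, factor)) - baseline))
    return max(shifts)


# Toy 2x2 grayscale image; a real ECG image would be far larger.
toy_image = [[0, 255], [255, 0]]
print(adjust_brightness(toy_image, 0.5))
```

The factors 0.5 and 1.5 correspond to the 50% alterations mentioned in the text; the intermediate values are illustrative.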

DISCUSSION

We developed and externally validated an automated deep learning algorithm that accurately identifies LV systolic dysfunction solely from ECG images. The algorithm has high discrimination and sensitivity, representing characteristics ideal for a screening strategy. It is robust to variations in the layouts of ECG waveforms and detects the location of ECG leads across multiple formats with consistent accuracy, making it suitable for implementation in a variety of settings. Moreover, the algorithm was developed and tested in a diverse population with high performance in subgroups of age, sex, and race, and across geographically dispersed academic and community health systems. It performed well in 6 external validation sites, spanning clinical settings as well as a prospective cohort study in which protocolized echocardiograms were performed concurrently with ECGs. An evaluation of the class-discriminating signals localized them to the anteroseptal and anterior leads regardless of the ECG layout, topologically corresponding to the left ventricle. Finally, among individuals who did not have a concurrently recorded low LVEF, a positive ECG screen was associated with a 3.9-fold increased risk of developing LV systolic dysfunction in the future compared with those with a negative screen, an association that remained significant after adjustment for age, sex, and baseline LVEF. Therefore, an ECG image-based approach can represent a screening as well as a predictive strategy for LV systolic dysfunction, particularly in low-resource settings.

Deep learning–based analysis of ECG images to screen for heart failure represents a novel application of artificial intelligence that has the potential to improve clinical care. Convolutional neural networks have previously been designed to detect low LVEF from ECG signals.10,11 Although reliance of signal-based models on voltage data is not computationally limited, their use in both retrospective and prospective settings requires access to a signal repository in which the ECG data architecture varies by ECG device vendors. Moreover, data are often not stored beyond generating printed ECG images, particularly in remote settings.23 Furthermore, widespread adoption of signal-based models is limited by the implementation barriers requiring health system–wide investments to incorporate them into clinical workflow, something that may not be available or cost-effective in low-resource settings and, to date, is not widely available in higher-resource settings such as the United States. The algorithm reported in this study overcomes these limitations by making detection of LV systolic dysfunction from ECGs interoperable across acquisition formats and directly available to clinicians who only have access to ECG images. Because scanned ECG images are the most common format of storage and use of ECGs, untrained operators can implement large-scale screening through chart review or automated applications to image repositories, a lower-resource task than optimizing tools for different machines.

The use of ECG images in our model overcomes the implementation challenges arising from black box algorithms. The origin of risk-discriminative signals in precordial leads of ECG images suggests an LV origin of the predictive signals. Moreover, the consistent observation of these predictive signals in the anteroseptal and anterior leads, regardless of the lead location on printed images, also serves as a control for the model predictions. Despite localizing the class-discriminative signals in the image to the left ventricle, heatmap analysis may not necessarily capture all the model predictive features, such as the duration of ECG segments, intervals, or ECG waveform morphologies that might have been used in model predictions. However, visual representations consistent with clinical knowledge could explain parts of the model prediction process and address the hesitancy in the uptake of these tools in clinical practice.24

An important finding was the significantly increased risk of incident LV systolic dysfunction among patients with model-predicted positive screen but an LVEF ≥40% on concurrent echocardiography. These findings demonstrate an electrocardiographic signature that may precede the development of echocardiographic evidence of LV systolic dysfunction. This was previously reported in signal-based models,10 further suggesting that the detection of LV systolic dysfunction on ECG images represents a similar underlying pathophysiological process. Moreover, we observed a linear relationship between the severity of LV systolic dysfunction and the model-predicted probabilities of low LVEF, supporting the biological plausibility of model predictions from paired ECG and echocardiography data. These observations suggest a potential role for artificial intelligence–based ECG models in risk stratification for future development of cardiovascular disease.25

Our study has certain limitations that merit consideration. First, we developed this model among patients with both ECGs and echocardiograms. Therefore, the selected training population likely had a clinical indication for echocardiography, differing from the broader real-world use of the algorithm for screening tests for LV systolic dysfunction among those without any clinical disease. The ability of our model to consistently distinguish LV systolic dysfunction across demographic subgroups and validation populations suggests robustness and generalizability of the effects, although prospective assessments in the intended screening setting are warranted. It is notable that the model demonstrated a higher specificity and lower sensitivity on the ELSA-Brasil cohort composed of younger and generally healthier individuals with a lower prevalence of LV systolic dysfunction compared with the other validation sets. Depending on the intended result of the screening approach and resource constraints with downstream testing, prediction thresholds for LV systolic dysfunction may need to be recalibrated when deployed in such settings. Second, our model development uses retrospective ECG and echocardiogram data. Thus, all limitations inherent to secondary analyses apply. Although the model has been externally validated in numerous settings, including the ELSA-Brasil cohort in which protocolized echocardiograms were performed without clinical indication, prospective assessments in the intended screening setting are necessary. Third, although we incorporated 4 ECG formats during its development and demonstrated that the model had a consistent performance on a range of commonly used and novel layouts that were not included in the development, we cannot ascertain whether it maintains performance on every novel format. 
Fourth, although the model development pursues preprocessing of the ECG signal for plotting images, these represent standard processes performed before ECG images are generated or printed by ECG machines. Therefore, any other processing of images is not required for real-world application, as demonstrated in the application of the model to the external validation sets. Fifth, our model was built on a single convolutional neural network architecture. We did not compare the performance of our model against alternative machine learning models and architectures for the detection of LV systolic dysfunction. This was based on our previous study that inferred that EfficientNet-B3 demonstrated good performance on ECG image classification tasks, although future studies could evaluate other architectures for these applications. Finally, although we include a prototype of a web-based application with automated standardization and predictions from ECG images, it represents only a demonstration of the eventual deployment of the model. It will require further development and validation before any clinical or large-scale deployment of such an application.
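The point made in the limitations about recalibrating thresholds for low-prevalence settings can be made concrete with Bayes' rule. At a fixed sensitivity and specificity, the positive predictive value falls as prevalence falls. The sketch below uses the ELSA-Brasil sensitivity and specificity from Table 2 and the prevalence implied by 30 of 2577 participants; the function names and the higher comparison prevalence are illustrative assumptions.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(disease | positive screen) from Bayes' rule."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1 - specificity) * (1 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


def negative_predictive_value(sensitivity, specificity, prevalence):
    """P(no disease | negative screen) from Bayes' rule."""
    true_negative_rate = specificity * (1 - prevalence)
    false_negative_rate = (1 - sensitivity) * prevalence
    return true_negative_rate / (true_negative_rate + false_negative_rate)


# ELSA-Brasil operating point from Table 2 and its observed prevalence.
elsa_prevalence = 30 / 2577
elsa_ppv = positive_predictive_value(0.700, 0.976, elsa_prevalence)
elsa_npv = negative_predictive_value(0.700, 0.976, elsa_prevalence)

# Same operating point at a hypothetical 11% prevalence, for comparison.
higher_prevalence_ppv = positive_predictive_value(0.700, 0.976, 0.11)

print(f"PPV at {elsa_prevalence:.1%} prevalence: {elsa_ppv:.3f}")
print(f"NPV at {elsa_prevalence:.1%} prevalence: {elsa_npv:.3f}")
print(f"PPV at 11.0% prevalence: {higher_prevalence_ppv:.3f}")
```

A deployment that prioritizes fewer downstream echocardiograms might raise the threshold to improve the positive predictive value, while one that prioritizes case finding might lower it; that choice is outside what this sketch can decide.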

Conclusions

We developed an automated algorithm to detect LV systolic dysfunction from ECG images, demonstrating a robust performance across subgroups of patient demographics, ECG formats and calibrations, and clinical practice settings. Given the ubiquitous availability of ECG images, this approach represents a strategy for automated screening of LV systolic dysfunction, especially in resource-limited settings.

ARTICLE INFORMATION

Acknowledgments

Dr Khera conceived the study and accessed the data. V. Sangha and Dr Khera developed the model. V. Sangha, Drs Nargesi and Dhingra, A. Khunte, and Dr Khera pursued the statistical analysis. V. Sangha and Dr Nargesi drafted the manuscript. All authors provided feedback regarding the study design and made critical contributions to writing the manuscript. Dr Khera supervised the study, procured funding, and is the guarantor. The model is available in an online format for research use at https://www.cards-lab.org/ecgvision-lv

Supplemental Material

External Validation Data and Procedures

Tables S1–S14

Figures S1–S18

Nonstandard Abbreviations and Acronyms

AUPRC

area under precision-recall curve

AUROC

area under receiver operating characteristic curve

ELSA-Brasil

Estudo Longitudinal de Saúde do Adulto (The Brazilian Longitudinal Study of Adult Health)

Grad-CAM

gradient-weighted class activation mapping

LVEF

left ventricular ejection fraction

TTE

transthoracic echocardiography

YNHH

Yale New Haven Hospital

Disclosures

Dr Mortazavi reported receiving grants from the National Institute of Biomedical Imaging and Bioengineering, National Heart, Lung, and Blood Institute, US Food and Drug Administration, and the US Department of Defense Advanced Research Projects Agency outside the submitted work; in addition, Dr Mortazavi has a pending patent on predictive models using electronic health records (US20180315507A1). Dr A.H. Ribeiro is funded by the Kjell och Märta Beijer Foundation. Dr Krumholz works under contract with the Centers for Medicare & Medicaid Services to support quality measurement programs, was a recipient of a research grant from Johnson & Johnson, through Yale University, to support clinical trial data sharing; was a recipient of a research agreement, through Yale University, from the Shenzhen Center for Health Information for work to advance intelligent disease prevention and health promotion; collaborates with the National Center for Cardiovascular Diseases in Beijing; receives payment from the Arnold & Porter Law Firm for work related to the Sanofi clopidogrel litigation, from the Martin Baughman Law Firm for work related to the Cook Celect IVC filter litigation, and from the Siegfried and Jensen Law Firm for work related to Vioxx litigation; chairs a cardiac scientific advisory board for UnitedHealth; was a member of the IBM Watson Health Life Sciences board; is a member of the advisory board for element science, the advisory board for Facebook, and the physician advisory board for Aetna; and is co-founder of Hugo Health, a personal health information platform, and co-founder of Refactor Health, a health care artificial intelligence–augmented data management company. Dr A.L.P. Ribeiro is supported in part by CNPq (465518/2014-1, 310790/2021-2, and 409604/2022-4) and by FAPEMIG (PPM-00428-17, RED-00081-16, and PPE-00030-21). V. Sangha and Dr Khera are the coinventors of US provisional patent application No. 63/346,610, “Articles and methods for format-independent detection of hidden cardiovascular disease from printed electrocardiographic images using deep learning.” Dr Khera receives support from the National Heart, Lung, and Blood Institute of the National Institutes of Health (under award K23HL153775) and the Doris Duke Charitable Foundation (under award 2022060). He receives support from the Blavatnik Foundation through the Blavatnik fund for innovation at Yale. He also receives research support, through Yale, from Bristol-Myers Squibb, and Novo Nordisk. He is an associate editor for JAMA. In addition to 63/346,610, Dr Khera is a coinventor of US provisional patent applications 63/177,117, 63/428,569, and 63/484,426. He is also a founder of Evidence2Health, a precision health platform to improve evidence-based cardiovascular care.

Footnotes

*V. Sangha and A.A. Nargesi contributed equally.

Supplemental Material is available at https://www.ahajournals.org/doi/suppl/10.1161/CIRCULATIONAHA.122.062646.

For Sources of Funding and Disclosures, see page 776.

Circulation is available at www.ahajournals.org/journal/circ

Correspondence to: Rohan Khera, MD, MS, 195 Church St, 6th Floor, New Haven, CT 06510.

REFERENCES

  • 1. Wang TJ, Evans JC, Benjamin EJ, Levy D, LeRoy EC, Vasan RS. Natural history of asymptomatic left ventricular systolic dysfunction in the community. Circulation. 2003; 108:977–982. doi: 10.1161/01.CIR.0000085166.44904.79
  • 2. Srivastava PK, DeVore AD, Hellkamp AS, Thomas L, Albert NM, Butler J, Patterson JH, Spertus JA, Williams FB, Duffy CI, et al. Heart failure hospitalization and guideline-directed prescribing patterns among heart failure with reduced ejection fraction patients. JACC Heart Fail. 2021; 9:28–38. doi: 10.1016/j.jchf.2020.08.017
  • 3. Wolfe NK, Mitchell JD, Brown DL. The independent reduction in mortality associated with guideline-directed medical therapy in patients with coronary artery disease and heart failure with reduced ejection fraction. Eur Heart J Qual Care Clin Outcomes. 2021; 7:416–421. doi: 10.1093/ehjqcco/qcaa032
  • 4. Heidenreich PA, Bozkurt B, Aguilar D, Allen LA, Byun JJ, Colvin MM, Deswal A, Drazner MH, Dunlay SM, Evers LR, et al. 2022 AHA/ACC/HFSA guideline for the management of heart failure: a report of the American College of Cardiology/American Heart Association Joint Committee on Clinical Practice Guidelines. Circulation. 2022; 145:e895–e1032. doi: 10.1161/CIR.0000000000001063
  • 5. Wang TJ, Levy D, Benjamin EJ, Vasan RS. The epidemiology of “asymptomatic” left ventricular systolic dysfunction: implications for screening. Ann Intern Med. 2003; 138:907–916. doi: 10.7326/0003-4819-138-11-200306030-00012
  • 6. Vasan RS, Benjamin EJ, Larson MG, Leip EP, Wang TJ, Wilson PWF, Levy D. Plasma natriuretic peptides for community screening for left ventricular hypertrophy and systolic dysfunction: the Framingham Heart Study. JAMA. 2002; 288:1252–1259. doi: 10.1001/jama.288.10.1252
  • 7. McDonagh TA, McDonald K, Maisel AS. Screening for asymptomatic left ventricular dysfunction using B-type natriuretic peptide. Congest Heart Fail. 2008; 14:5–8. doi: 10.1111/j.1751-7133.2008.tb00002.x
  • 8. Galasko GI, Barnes SC, Collinson P, Lahiri A, Senior R. What is the most cost-effective strategy to screen for left ventricular systolic dysfunction: natriuretic peptides, the electrocardiogram, hand-held echocardiography, traditional echocardiography, or their combination? Eur Heart J. 2006; 27:193–200. doi: 10.1093/eurheartj/ehi559
  • 9. Atherton JJ. Screening for left ventricular systolic dysfunction: is imaging a solution? JACC Cardiovasc Imaging. 2010; 3:421–428. doi: 10.1016/j.jcmg.2009.11.014
  • 10. Attia ZI, Kapa S, Lopez-Jimenez F, McKie PM, Ladewig DJ, Satam G, Pellikka PA, Enriquez-Sarano M, Noseworthy PA, Munger TM, et al. Screening for cardiac contractile dysfunction using an artificial intelligence-enabled electrocardiogram. Nat Med. 2019; 25:70–74. doi: 10.1038/s41591-018-0240-2
  • 11. Vaid A, Johnson KW, Badgeley MA, Somani SS, Bicak M, Landi I, Russak A, Zhao S, Levin MA, Freeman RS, et al. Using deep-learning algorithms to simultaneously identify right and left ventricular dysfunction from the electrocardiogram. JACC Cardiovasc Imaging. 2022; 15:395–410. doi: 10.1016/j.jcmg.2021.08.004
  • 12. Yao X, Rushlow DR, Inselman JW, McCoy RG, Thacher TD, Behnken EM, Bernard ME, Rosas SL, Akfaly A, Misra A, et al. Artificial intelligence-enabled electrocardiograms for identification of patients with low ejection fraction: a pragmatic, randomized clinical trial. Nat Med. 2021; 27:815–819. doi: 10.1038/s41591-021-01335-4
  • 13. Stamenov D, Gusev M, Armenski G. Interoperability of ECG standards. In: Proceedings of the 2018 41st International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), Opatija, Croatia, IEEE; 2018:0319–0323.
  • 14. Sangha V, Mortazavi BJ, Haimovich AD, Ribeiro AH, Brandt CA, Jacoby DL, Schulz WL, Krumholz HM, Ribeiro ALP, Khera R. Automated multilabel diagnosis on electrocardiographic images and signals. Nat Commun. 2022; 13:1583. doi: 10.1038/s41467-022-29153-3
  • 15. He J, Baxter SL, Xu J, Xu J, Zhou X, Zhang K. The practical implementation of artificial intelligence technologies in medicine. Nat Med. 2019; 25:30–36. doi: 10.1038/s41591-018-0307-0
  • 16. ECG Plot Python Library. Accessed on May 25, 2022. https://pypi.org/project/ecg-plot/
  • 17. Cui Y, Jia M, Lin T-Y, Song Y, Belongie S. Class-balanced loss based on effective number of samples. arXiv. 2019;1901.05555.
  • 18. Tan M, Le QV. EfficientNet: rethinking model scaling for convolutional neural networks. arXiv. 2020;1905.11946v5.
  • 19. Aquino EML, Barreto SM, Bensenor IM, Carvalho MS, Chor D, Duncan BB, Lotufo PA, Mill JG, Molina MDC, Mota ELA, et al. Brazilian longitudinal study of adult health (ELSA-Brasil): objectives and design. Am J Epidemiol. 2012; 175:315–324. doi: 10.1093/aje/kwr294
  • 20. Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra D. Grad-CAM: visual explanations from deep networks via gradient-based localization. In: Proceedings of 2017 IEEE International Conference on Computer Vision (ICCV). IEEE; 2017:618–626.
  • 21. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988; 44:837–845.
  • 22. Sun X, Xu W. Fast implementation of DeLong’s algorithm for comparing the areas under correlated receiver operating characteristic curves. IEEE Signal Process Lett. 2014; 21:1389–1393. doi: 10.1109/lsp.2014.2337313
  • 23. Siontis KC, Noseworthy PA, Attia ZI, Friedman PA. Artificial intelligence-enhanced electrocardiography in cardiovascular disease management. Nat Rev Cardiol. 2021; 18:465–478. doi: 10.1038/s41569-020-00503-2
  • 24. Quer G, Arnaout R, Henne M, Arnaout R. Machine learning and the future of cardiovascular care: JACC state-of-the-art review. J Am Coll Cardiol. 2021; 77:300–313. doi: 10.1016/j.jacc.2020.11.030
  • 25. Maurovich-Horvat P. Current trends in the use of machine learning for diagnostics and/or risk stratification in cardiovascular disease. Cardiovasc Res. 2021; 117:e67–e69. doi: 10.1093/cvr/cvab059
