Simple Risk Model Predicts Incidence of Atrial Fibrillation in a Racially and Geographically Diverse Population: the CHARGE‐AF Consortium

Background Tools for the prediction of atrial fibrillation (AF) may identify high‐risk individuals more likely to benefit from preventive interventions and serve as a benchmark to test novel putative risk factors. Methods and Results Individual‐level data from 3 large cohorts in the United States (Atherosclerosis Risk in Communities [ARIC] study, the Cardiovascular Health Study [CHS], and the Framingham Heart Study [FHS]), including 18 556 men and women aged 46 to 94 years (19% African Americans, 81% whites) were pooled to derive predictive models for AF using clinical variables. Validation of the derived models was performed in 7672 participants from the Age, Gene and Environment—Reykjavik study (AGES) and the Rotterdam Study (RS). The analysis included 1186 incident AF cases in the derivation cohorts and 585 in the validation cohorts. A simple 5‐year predictive model including the variables age, race, height, weight, systolic and diastolic blood pressure, current smoking, use of antihypertensive medication, diabetes, and history of myocardial infarction and heart failure had good discrimination (C‐statistic, 0.765; 95% CI, 0.748 to 0.781). Addition of variables from the electrocardiogram did not improve the overall model discrimination (C‐statistic, 0.767; 95% CI, 0.750 to 0.783; categorical net reclassification improvement, −0.0032; 95% CI, −0.0178 to 0.0113). In the validation cohorts, discrimination was acceptable (AGES C‐statistic, 0.664; 95% CI, 0.632 to 0.697 and RS C‐statistic, 0.705; 95% CI, 0.664 to 0.747) and calibration was adequate. Conclusion A risk model including variables readily available in primary care settings adequately predicted AF in diverse populations from the United States and Europe.

Atrial fibrillation (AF), a common cardiac arrhythmia, has emerged as a major public health problem as a result of wide prevalence, 1 close relation to stroke and mortality, 2 and associated costs. 3 Tools for the prediction of AF could help identify high-risk individuals and serve as a benchmark to test potential novel risk factors. To this end, the Framingham Heart Study (FHS) developed a risk score for AF, which included a number of variables easily obtained during routine clinical examination. 4 This risk score was recently validated in 2 additional population-based cohorts, the Age Gene/Environment Susceptibility-Reykjavik (AGES) Study and the Cardiovascular Health Study (CHS), where it demonstrated reasonable performance. 5 An alternative score has been developed in the Atherosclerosis Risk in Communities (ARIC) Study, with similar predictive capability. 6 These studies included atrial flutter in their definition of AF. This inclusion is reasonable because, even though atrial flutter and AF are electrophysiologically distinct, most patients with atrial flutter have or will develop AF and the risk of stroke associated with atrial flutter is similar to that observed in AF. 7,8 Previous risk models are limited as a result of being developed in single cohorts. Though the FHS risk score has predicted AF reasonably well in other populations, 5,6 it is unknown whether a risk model developed in a more geographically or racially diverse population would better predict AF. Previously developed models also require information from a 12-lead electrocardiogram, which might be unavailable in some primary care settings. Therefore, we developed and validated a new predictive score for AF (including atrial flutter) in 5 US and European cohorts participating in the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) AF consortium. 9

Methods
Study Cohorts
Participant-specific data from 3 community-based cohorts in the United States (ARIC, CHS, and FHS) were pooled to develop a risk score for predicting AF, and the validation of this score was performed in 2 additional cohorts in Europe (AGES and the Rotterdam Study [RS]). A brief description of each participating cohort is provided below. For each cohort, the baseline examination was chosen based on the availability of potential predictors and adequate follow-up for the development of AF. Participants were excluded from this analysis if they had AF at baseline, were younger than 46 or older than 94 years of age, had serum creatinine ≥2.0 mg/dL, identified themselves as other than white or African American (n=30 ARIC, n=32 CHS, and n=62 RS participants), or had missing values for any of the variables of interest. After applying exclusion criteria, the derivation cohort included 18 556 participants and the validation cohorts included a total of 7672 participants. The number of individuals excluded by cohort is provided in Table S1. Institutional Review Boards at the participating institutions approved the individual studies and study participants provided written informed consent.
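The exclusion criteria listed above can be read as a simple eligibility filter. The following is a minimal sketch, not the study's actual data-processing code (the analyses were run in SAS); the `Participant` record, its field names, and the race labels are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Participant:
    # Hypothetical record layout; field names are assumptions, not study variables.
    age: float
    prevalent_af: bool
    creatinine_mg_dl: Optional[float]
    race: str  # e.g. "white", "african_american", or another self-identified group
    has_missing_covariates: bool = False


def is_eligible(p: Participant) -> bool:
    """Apply the exclusion criteria described in the text."""
    if p.prevalent_af:
        return False  # AF at baseline
    if p.age < 46 or p.age > 94:
        return False  # outside the 46 to 94 year age range
    if p.creatinine_mg_dl is None or p.has_missing_covariates:
        return False  # missing values for variables of interest
    if p.creatinine_mg_dl >= 2.0:
        return False  # serum creatinine >= 2.0 mg/dL
    if p.race not in {"white", "african_american"}:
        return False  # self-identified as other than white or African American
    return True
```

Applying `is_eligible` to each cohort record would reproduce the logic of the filter; the per-cohort counts of excluded participants are those reported in Table S1 and are not recomputed here.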

Atherosclerosis Risk in Communities Study
The ARIC study recruited 15 792 men and women, aged 45 to 64 years, from 4 communities in the United States (Forsyth County, NC; Washington County, MD; Jackson, MS; and suburbs of Minneapolis, MN) in 1987-1989. 10 Participants were mostly white in the Minnesota and Washington County field centers, white and African American in Forsyth County, and exclusively African American in the Jackson field center. After study inception, participants had 3 follow-up examinations, each ≈3 years apart. For the present analysis, we included individuals attending the last follow-up examination (visit 4, conducted in 1996-1998, n=11 656), with this examination used as baseline in all models. Of these, 10 675 met inclusion criteria.

Cardiovascular Health Study
In 1989-1990, CHS recruited 5201 men and women 65 years or older from 4 communities (Forsyth County, NC; Washington County, MD; Sacramento County, CA; and Pittsburgh, PA). Because of the different age inclusion criteria there was no overlap in ARIC and CHS participants. In 1992-1993, 687 African Americans were recruited in 3 of the 4 communities to increase minority representation. 11 CHS participants had annual follow-up examinations through 1999 with ongoing surveillance for cardiovascular events from baseline through the present. The 1989-1990 examination was considered baseline for 3768 (≈65%) of the eligible CHS participants in this analysis, while 1992-1993 was the baseline examination for the rest (n=1275).

Framingham Heart Study
In 1971-1975, the FHS Offspring cohort recruited 5124 predominantly white men and women, offspring (and their spouses) from the Original FHS cohort with follow-up examinations every 4 to 8 years. 12 The current analysis included participants of the FHS Offspring cohort free of AF attending the 6th examination cycle (1995-1998, n=3113); 2838 met inclusion criteria and were included in the analysis.

Age, Gene/Environment Susceptibility Reykjavik Study
The original Reykjavik Study, conducted between 1967 and 1996, included ≈19 000 men and women living in the greater Reykjavik area, born between 1907 and 1935. 13 Survivors of this study were invited to be part of AGES, which recruited 5764 men and women in 2002-2006. Of these, 5427 had a complete clinic exam, and 4469 met inclusion criteria and were considered for this analysis.

Rotterdam Study
The RS, a prospective population-based study aimed to assess the determinants of chronic conditions in the elderly, examined 7983 men and women, aged 55 years and older, living in the Rotterdam suburb of Ommoord in 1989-1993. 14 Since then, participants have been continuously followed and were reexamined in 1993-1994, 1997-1999, 2002-2004 and 2008-2010. The present analysis included 3203 study participants examined in 1997-1999 meeting inclusion criteria.

Ascertainment of Incident AF
Incident AF cases in all 5 studies were ascertained from study electrocardiograms and hospitalization discharge diagnosis codes (ICD9-CM 427.3, 427.31 or 427.32, or ICD10 I48 in any position). [15][16][17][18] Individuals with atrial flutter were included as AF cases. AF ascertainment in FHS required additional adjudication of cases by study cardiologists using electrocardiographic and clinical data from the FHS clinic, outside hospital, or general practitioner records. 17
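As an illustration of the code-based part of this case definition, the sketch below flags a hospital discharge as an AF (or atrial flutter) case when any listed diagnosis matches the ICD codes named above. It is a simplified assumption of how such matching might be done: the dotted code format, the prefix match on I48 (so that subcodes such as I48.1 also count), and the function names are hypothetical, and the electrocardiogram review and FHS adjudication steps are not represented.

```python
# ICD-9-CM codes named in the text for AF and atrial flutter.
ICD9_AF_CODES = {"427.3", "427.31", "427.32"}


def is_af_code(code: str) -> bool:
    """Return True if a single diagnosis code matches the AF case definition."""
    normalized = code.strip().upper()
    if normalized in ICD9_AF_CODES:
        return True
    # Assumption: any ICD-10 code in the I48 family (e.g. I48, I48.0, I48.91) counts.
    return normalized.startswith("I48")


def discharge_has_af(codes) -> bool:
    """Check every diagnosis position of a discharge record, not only the first."""
    return any(is_af_code(c) for c in codes)
```

For example, `discharge_has_af(["410.9", "427.31"])` is true because the AF code appears in the second position, which mirrors the "in any position" rule.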

Other Measurements
In all 5 study cohorts, examinations included a 12-lead electrocardiogram, standardized measurements of anthropometry, blood pressures, blood lipids, and fasting glucose, as well as assessment of prior cardiovascular disease and medication use. [10][11][12][13]19 Details on measurement methods are provided in the online supplementary materials. Protocols for variable ascertainment and definitions of cardiovascular risk factors were comparable across cohorts.

Statistical Analysis
Derivation of the predictive model
Means and standard deviations and frequency distributions of relevant covariates were calculated by cohort and race. We initially ran cohort- and race-specific Cox proportional hazards models to assess individual predictors of AF after adjustment for age and sex in each cohort, with up to 7 years of follow-up. Variables considered included age, sex, height, weight, current smoking, systolic and diastolic blood pressure, use of antihypertensive medication, history of diabetes, fasting blood glucose, estimated glomerular filtration rate (eGFR) <60 mL/min per 1.73 m², 20 total blood cholesterol, HDL cholesterol, triglycerides, heart rate, electrocardiographic-derived left ventricular hypertrophy, PR interval, history of coronary artery bypass graft (CABG), history of heart failure, history of myocardial infarction, and history of stroke. We selected as candidate predictors for our pooled model any variable significantly associated with AF (P<0.05) in at least 2 of the 3 cohorts, and ran the final Cox proportional hazards model on our participant-specific pooled data using backward selection of variables (P<0.05 to remain in the model). Age, sex, and race interactions were tested, as was the assumption of proportional hazards. Model-based individual 5-year risk of AF was calculated. We evaluated model performance using the C-statistic, 21 discrimination slopes, 22 and Nam and D'Agostino's modified Hosmer-Lemeshow chi-square statistic for survival analysis. 23 To facilitate the use of our score in those clinical settings with limited access to electrocardiograms or blood tests, we first developed a predictive model that did not require information from electrocardiogram and blood tests (which we labeled "simple model"). We then developed a more complex model adding electrocardiographic variables and blood tests (labeled "augmented model"). Variables were retained in the models if they were significantly associated with AF incidence (P<0.05).
We calculated the added predictive value of the augmented model versus the simple model using the increment in the C-statistic and the categorical net reclassification improvement (NRI) with the following risk categories: <2.5%, 2.5% to 5%, >5%. 22
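To make the categorical NRI concrete, the sketch below computes it for the three risk categories named above. It uses the standard two-part definition (net upward movement among events plus net downward movement among non-events) and, as a simplifying assumption, treats the outcome as a known binary status at 5 years; a survival-data analysis such as the one described here would also have to handle censoring, which this illustration ignores. Function names are hypothetical.

```python
def risk_category(risk: float) -> int:
    """Map a predicted 5-year risk to 0 (<2.5%), 1 (2.5% to 5%), or 2 (>5%)."""
    if risk < 0.025:
        return 0
    if risk > 0.05:
        return 2
    return 1


def categorical_nri(old_risks, new_risks, events) -> float:
    """Categorical NRI of a new model versus an old model.

    events[i] is True if participant i developed AF (censoring is ignored here).
    """
    up_event = down_event = n_event = 0
    up_nonevent = down_nonevent = n_nonevent = 0
    for old, new, event in zip(old_risks, new_risks, events):
        shift = risk_category(new) - risk_category(old)
        if event:
            n_event += 1
            up_event += shift > 0
            down_event += shift < 0
        else:
            n_nonevent += 1
            up_nonevent += shift > 0
            down_nonevent += shift < 0
    if n_event == 0 or n_nonevent == 0:
        raise ValueError("need at least one event and one non-event")
    return (up_event - down_event) / n_event + (down_nonevent - up_nonevent) / n_nonevent
```

A value near zero, such as the one reported in the Results, indicates that the augmented model moved about as many participants into a less appropriate category as into a more appropriate one.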

Validation analysis
The models developed in the derivation cohorts were applied in AGES and the RS to estimate the 5-year risk of developing AF. As in the derivation analysis, model performance was assessed using the C-statistic, discrimination slopes, and Nam and D'Agostino's chi-square statistic. To improve fit in the validation cohorts, we recalibrated the models using the baseline survival of the respective cohort and the corresponding risk factor means. 24
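A common way to express this kind of calculation for a Cox model is risk = 1 − S0(5)^exp(Σ β_i (x_i − mean_i)), where S0(5) is the 5-year baseline survival and mean_i is the cohort mean of predictor i; recalibration then amounts to substituting the validation cohort's own S0(5) and means while keeping the derivation coefficients. The sketch below assumes that form. The coefficient, mean, and baseline survival values in the usage note are invented placeholders, not the estimates from Table 3.

```python
import math


def five_year_risk(coefs, values, means, baseline_survival):
    """Predicted 5-year risk from a Cox model centered at cohort means.

    coefs, values, and means are dicts keyed by predictor name;
    baseline_survival is the 5-year survival at the mean predictor values.
    """
    linear_predictor = sum(
        coefs[name] * (values[name] - means[name]) for name in coefs
    )
    return 1.0 - baseline_survival ** math.exp(linear_predictor)
```

For instance, with the placeholder inputs `coefs={"age": 0.1}`, `values={"age": 70}`, `means={"age": 60}`, and `baseline_survival=0.95`, the linear predictor is 1.0 and the risk is `1 - 0.95 ** math.e`. A participant whose predictors all equal the cohort means gets a risk of exactly `1 - baseline_survival`.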

Additional analyses
We compared the performance of the newly developed risk score with the previous FHS AF risk score. 4 To this end, we calculated model quality measures in the pooled data from ARIC, CHS, and FHS, and separately in AGES and RS after applying the AF risk function previously derived from FHS. 4 Because the presence of cardiac murmur, one of the variables included in the FHS AF risk score, was not available in AGES and RS, and given its low prevalence (<3% in the FHS cohort), 4 we assumed it to be absent for all participants in whom it was not ascertained. Finally, we compared calibration and discrimination of the derived risk model and the model independently derived including those same variables in each validation cohort. SAS-Software version 9.1 was used for all analyses.

Results
Baseline characteristics of eligible individuals by cohort and race (in ARIC and CHS) are presented in Table 1. The average age in years ranged from 60 in FHS to 76 in AGES, and the proportion of women was between 55% and 66% across cohorts. African Americans comprised 19% of the derivation sample. The prevalence of cardiovascular risk factors was generally higher in African Americans than in whites. The analysis included 1186 incident AF cases among 18 556 participants in the derivation cohorts, and 585 cases among the 7672 participants in the validation cohorts.

Derivation of the Predictive Model
Using a backward-selection algorithm in pooled data from ARIC, CHS and FHS, the following variables were included in the simple risk prediction score: age, race, height, weight, systolic blood pressure, diastolic blood pressure, current smoking, use of antihypertensive medication, diabetes, history of myocardial infarction, and history of heart failure. In addition to these variables, the PR interval and electrocardiogram-derived left ventricular hypertrophy were selected to be included in the augmented prediction score. The augmented score did not select variables requiring measurement of lipid levels, blood glucose, or creatinine. No significant interactions with age, sex, or race were observed. Table 3 includes the beta coefficients, standard errors, and hazard ratios with their 95% CIs corresponding to the final simple and augmented predictive models. The simple predictive model achieved good performance (C-statistic, 0.765; 95% CI, 0.748 to 0.781).
The addition of information from the electrocardiogram provided no gain in predictive ability (C-statistic, 0.767; 95% CI, 0.750 to 0.783). Inclusion of pulse pressure instead of systolic and diastolic blood pressure, or of body mass index or waist circumference instead of weight provided similar results (data not shown). Similarly, the categorical NRI showed that the addition of electrocardiographic variables did not improve the predictive ability of the model (NRI, −0.0032; 95% CI, −0.0178 to 0.0113; Table S2).
The distribution of predicted 5-year risk of AF in the derivation cohorts is provided in Figure 1 and the observed cumulative risk of AF by predicted risk based on the simple model is presented in Figure S1, separately for whites and African Americans. An Excel spreadsheet (available as a supplemental file) allows calculation of AF risk using this predictive model.
Calibration of both models was adequate in the entire derivation sample (Table 4, Figure 2) and individually in each derivation cohort (Table S3). Discrimination using the previously developed FHS AF risk score (C-statistic, 0.734; 95% CI, 0.717 to 0.750) was lower than with the CHARGE score.

Validation of the Predictive Model
The model developed in ARIC, CHS, and FHS was validated in 2 European cohorts, AGES and RS. Table 4 reports discrimination and calibration of the CHARGE-AF predictive models in the validation cohorts. C-statistic values were 0.664 in AGES and 0.705 in RS for the simple model, with similar results for the augmented model. Calibration of the predictive model after recalibration of the model using the average risk […] Figure 2). In RS, the new CHARGE score performed slightly better than the previous FHS risk score (C-statistic 0.705 for CHARGE simple score versus 0.686 for FHS score), whereas in AGES the CHARGE and FHS scores had similar discrimination (C-statistic 0.664 for CHARGE simple score versus 0.653 for FHS score).
Because of the relatively lower discrimination of the predictive model in AGES, we calculated the C-statistic of a model independently derived in the validation cohorts including the variables selected for the CHARGE risk model. Using this approach, the C-statistic in AGES was 0.668 (95% CI, 0.637 to 0.700) and in RS was 0.733 (95% CI, 0.690 to 0.776), not very different from values obtained using the CHARGE risk model (Table 4).
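The C-statistics compared above summarize how often the model ranks a participant who develops AF earlier as higher risk than one who remains free of AF longer. The sketch below computes Harrell's concordance index, a widely used version of this idea for censored data. This is an assumption made for illustration: the Methods cite reference 21 for the C-statistic used in the paper, and that estimator may differ in detail. The function name is hypothetical.

```python
def harrell_c(times, events, risks) -> float:
    """Harrell's concordance index for right-censored follow-up data.

    times[i]  : follow-up time for participant i
    events[i] : True if AF occurred at times[i], False if censored
    risks[i]  : predicted risk (higher means AF is expected sooner)
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is usable only if the earlier time is an event
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking, which gives a scale for reading the reported values of roughly 0.66 to 0.77.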

Discussion
In our individual-level pooled analysis of 3 large community-based prospective studies in the United States, we found that a simple risk model including variables routinely collected in a primary care setting is useful for predicting the future risk of AF. Discrimination ability of the model was comparable or superior to that of other risk stratification schemes developed for coronary heart disease or stroke. [24][25][26] The predictive model performed reasonably well in 2 additional cohorts in Europe when compared to the cohorts' own models. Including variables obtained from a 12-lead electrocardiogram provided no significant additional predictive ability. Models for the prediction of AF have been reported previously, but these were developed in single cohorts. 4,6 Although the FHS AF risk score has shown acceptable discrimination in populations other than the cohort in which it was developed, 5,6 important improvements of the CHARGE-AF model were the availability of participant-specific data from several cohorts and the larger sample size included in its development and validation. The CHARGE-AF model utilized more than 26 000 individuals with over 1750 AF cases. The geographic and racial diversity of the participating cohorts provided increased generalizability over and above the FHS AF risk score alone. A further advantage of the CHARGE-AF predictive model is that it does not require extra diagnostic tests beyond what is usually available in primary care settings. We also found that the CHARGE-AF model performed better than the original FHS AF score in the derivation and validation cohorts. However, lack of information on cardiac murmur in ARIC, CHS, RS and AGES limits the value of the FHS AF score in these cohorts. Similarly, we did not study discrimination of the ARIC risk score in the CHARGE cohorts since the ARIC score was derived in a middle-aged cohort (45 to 64 years old at baseline), whereas most individuals in the present analysis were older.
The CHARGE-AF predictive model shares some variables with previously developed risk scores for coronary heart disease, 24,25,27 heart failure, 28,29 stroke, 26 or general cardiovascular risk. 30 However, the weight of individual risk factors in these other models differs from the CHARGE-AF model and their ability to accurately predict AF has been shown to be inadequate. 6 The CHARGE-AF predictive model could have important research and clinical applications. The most immediate application might be to serve as a standard in evaluating the ability of putative novel clinical factors, biomarkers, subclinical measures, or '-omic' (eg, genomic, epigenomic, transcriptomic, proteomic, metabolomic) tests to reclassify an individual's risk of developing AF. In addition, the predictive model might be used to select high-risk individuals for trials of primary prevention of AF or intensive monitoring for AF detection. Our 5-year predictive model also may be useful once primary prevention strategies are developed, to facilitate identification of individuals more likely to benefit from them. Finally, given the association of some cardiovascular risk factors, such as hypertension, obesity, diabetes, or the metabolic syndrome, [31][32][33][34][35] with the risk of AF, the CHARGE-AF predictive model may, in the future, contribute to guidelines for selecting candidates for more aggressive risk factor control. Future randomized trials and observational studies should determine if such approaches are useful and cost-effective.
In the proposed predictive model we found that higher systolic blood pressure was associated with higher AF risk, whereas diastolic blood pressure was inversely associated with AF incidence. This observation is consistent with a previous report from the FHS in which pulse pressure was a better predictor of AF than systolic or diastolic blood pressure alone. 31 We chose to include systolic and diastolic blood pressure as separate variables in our model, instead of pulse pressure, because they are more commonly recorded in the clinical setting. Including pulse pressure provided results similar to those presented in the current analyses. Similarly, we included weight in the models even though waist circumference or body mass index, and not weight, may be the pathophysiologically relevant factors. In the derivation cohorts, however, models with waist circumference or body mass index offered similar discrimination ability. Which of these variables is more relevant from an etiopathogenic point of view needs to be addressed in future work.
Several variables included in the CHARGE-AF predictive model were part of both the published FHS and ARIC AF risk scores, including age, systolic blood pressure, use of antihypertensive medication, and history of heart failure (Table S4). Other variables in the CHARGE-AF model, however, were part of only one of the risk scores, such as race, smoking, height, diabetes, or myocardial infarction (in ARIC), and body mass index (in FHS). Similar to the ARIC model, 6 sex was not selected as a predictor in the CHARGE-AF model. Even though AF incidence is higher in men than women, our model suggests that sex differences in the distribution of AF predictors may account for this disparity. In the initial analysis, we observed an unexpected inverse association between total cholesterol and AF risk. Upon further adjustment, cholesterol levels did not show a significant association with AF. Of note, an inverse association of total and LDL cholesterol with AF risk was found in an analysis conducted in the ARIC study. 36
We observed that the model had lower discrimination ability in AGES (C-statistic, 0.67). Discrimination only minimally improved in a model derived specifically in AGES using the CHARGE-AF variables (C-statistic, 0.68). In contrast, discrimination of the CHARGE-AF model was better in RS (C-statistic, 0.71). We can only speculate about the reasons for these differences. AGES participants were, on average, older than participants from other cohorts. Also, cohort differences in the determination of AF or in the impact of genetic risk factors may partly explain these results.
[Figure caption fragment: ... Research in Genomic Epidemiology atrial fibrillation (CHARGE-AF) simple score model in the combined derivation cohorts, by race. The x-axis refers to deciles of predicted AF risk. Each bar in the graph represents the average observed and predicted AF risk.]

Strengths and Limitations
Our work has limitations that must be acknowledged. We restricted the age range of our risk score because very few individuals were younger than 46 or older than 94 years. The applicability of our risk model to individuals <46 or >94 years and to individuals not of African or European ancestry is uncertain. Our risk score will need to be validated outside the United States and Western Europe and in other ethnicities (eg, Asians and Hispanics). Similarly, since participants needed to attend a baseline cohort examination in order to be included, the generalizability of the risk score to hospitalized patients or non-ambulatory settings is unknown. Most of the cohorts relied on periodic clinic examinations and hospitalization ICD codes, leading to the potential for misclassification of AF, though validation studies in the ARIC study, CHS, and other populations have shown adequate validity of this case definition. 15,16,37 We also have shown previously that age- and race-specific incidence rates of AF in the derivation cohorts were similar in spite of the differences in AF ascertainment. 16 In addition, we note that AF is not infrequently asymptomatic or paroxysmal, and such cases may have been missed in our cohorts. We included initial, paroxysmal, persistent, and permanent AF, for which prediction may be heterogeneous. We acknowledge being unable to accurately comment on risk prediction for AF versus atrial flutter. We combined the 2 for several reasons, including that they frequently complicate each other's course, 7 they are reported to have similar risk factors, 8 and ICD codes may not accurately distinguish between them. 38,39 Furthermore, we did not account for measurement error in determining risk factors. We pooled participant-level data assuming a priori that the associations of risk factors with AF in the subjects representing 3 large US cohort studies are sufficiently homogeneous.
Strengths of our analysis include the large sample size, the number of AF cases included in the analysis, the inclusion of multiple cohort studies (which enhances generalizability), the availability of a large number of possible AF predictors, the racial diversity in the studied samples, and external replication.
In conclusion, we have developed a new risk model for the prediction of AF. The proposed model has the advantage of being simpler, using information readily available in a primary care setting, and having been developed in a larger population. Future research should determine whether biomarkers or genetic factors have value in the prediction of AF beyond that of clinical risk factors.