Whole Genome Sequence Analysis of the Plasma Proteome in Black Adults Provides Novel Insights Into Cardiovascular Disease
Plasma proteins are critical mediators of cardiovascular processes and are the targets of many drugs. Previous efforts to characterize the genetic architecture of the plasma proteome have been limited by a focus on individuals of European descent and by reliance on genotyping arrays and imputation. Here we describe whole genome sequence analysis of the plasma proteome in individuals with greater African ancestry, increasing our power to identify novel genetic determinants.
Proteomic profiling of 1301 proteins was performed in 1852 Black adults from the Jackson Heart Study using aptamer-based proteomics (SomaScan). Whole genome sequencing association analysis was performed for all variants with minor allele count ≥5. Results were validated using an alternative, antibody-based proteomic platform (Olink) and replicated in the Multi-Ethnic Study of Atherosclerosis and the HERITAGE Family Study (Health, Risk Factors, Exercise Training and Genetics).
We identify 569 genetic associations between 479 proteins and 438 unique genetic regions at a Bonferroni-adjusted significance level of 3.8×10⁻¹¹. These associations include 114 novel locus-protein relationships and an additional 217 novel sentinel variant-protein relationships. Novel cardiovascular findings include new protein associations at the APOE gene locus including ZAP70 (sentinel single nucleotide polymorphism [SNP] rs7412-T, β=0.61±0.05, P=3.27×10⁻³⁰) and MMP-3 (β=–0.60±0.05, P=1.67×10⁻³²), as well as a completely novel pleiotropic locus at the HPX gene, associated with 9 proteins. Further, the associations suggest new mechanisms of genetically mediated cardiovascular disease linked to African ancestry; we identify a novel association between variants linked to APOL1-associated chronic kidney and heart disease and the protein CKAP2 (rs73885319-G, β=0.34±0.04, P=1.34×10⁻¹⁷) as well as an association between ATTR amyloidosis and RBP4 levels in community-dwelling individuals without heart failure.
Taken together, these results provide evidence for the functional importance of variants in non-European populations, and suggest new biological mechanisms for ancestry-specific determinants of lipids, coagulation, and myocardial function.
What Is New?
This is the first study to examine the genetic architecture of the plasma proteome using whole genome sequencing in persons of African ancestry, providing a chance to look at rare, ancestry-specific variation.
This study adds 114 novel genomic loci associated with protein levels in human samples.
What Are the Clinical Implications?
A genetic variant associated with amyloidosis in persons of African ancestry is shown to be associated with RBP4 levels, even in those without cardiomyopathy, implicating RBP4 as a potential biomarker.
The circulating plasma proteome plays a fundamental role in human biological function and dysfunction. Circulating proteins both mediate and respond to disease, and are frequently the targets of pharmaceutical interventions. Several recent studies have coupled genotyping and proteomic profiling to understand the genetic basis for the individual differences observed in protein levels, which are known to be heritable.1–7 Such work has led to critical advances in our understanding of the genetic architecture of the plasma proteome and its relationship to disease, including factors specifically associated with cardiovascular risk.4,6,7 However, initial findings were derived nearly entirely in European populations such as the Framingham Heart Study using genotyping arrays. Further, individuals with increased African ancestry are known to harbor substantially more genetic diversity than those of European ancestry,8,9 and rare mutations found specifically among persons of African ancestry have been critical in expanding our knowledge of cardiovascular biology, as is the case for PCSK9.10 We hypothesized that coupling whole genome sequence analysis with plasma proteomics in individuals of African ancestry would greatly increase the power to identify novel genetic determinants of the plasma proteome, which would inform our understanding not only of ancestry-specific genetic variation but also of human cardiovascular biology in general.
Here we use whole genome sequence data and aptamer-based proteomic profiling of 1301 proteins on the SOMAscan platform in 1852 self-identified Black individuals from the JHS (Jackson Heart Study)11 to identify novel protein quantitative trait loci (pQTLs) determining protein levels. Associations were replicated in 980 participants from MESA (Multi-Ethnic Study of Atherosclerosis)12 and 708 from the HERITAGE Family Study (Health, Risk Factors, Exercise Training and Genetics; Table S1),13 and further validated using an alternate proteomic profiling platform in JHS. These data serve as the basis for an enhanced understanding of proteins highly relevant to cardiovascular homeostasis across diverse human populations.
Whole genomes for JHS and MESA, generated as part of the National Heart, Lung, and Blood Institute (NHLBI) TOPMed (Trans-Omics for Precision Medicine) program, are available through restricted access via the NHLBI database of Genotypes and Phenotypes (dbGaP). TOPMed accession numbers for JHS and MESA are phs000964/phs002256.v1.p1 and phs001416, respectively. Full genome-wide association study (GWAS) summary statistics for JHS (the discovery cohort) generated in this study will be available for general research use through controlled access at dbGaP accession phs001974: NHLBI TOPMed: Genomic Summary Results for the Trans-Omics for Precision Medicine program. For assistance in accessing the discovery data in JHS before full availability on dbGaP, investigators should contact the authors and follow JHS data access procedures (https://www.jacksonheartstudy.org/). GWAS data for the replication studies (MESA and HERITAGE) are fully included in the article. Individual-level proteomic and genomic data in the replication datasets are available through application to the respective cohorts.
The JHS study was approved by Jackson State University, Tougaloo College, and the University of Mississippi Medical Center Institutional Review Boards, and all participants provided written informed consent. All MESA participants provided written informed consent, and the study was approved by the Institutional Review Boards at The Lundquist Institute (formerly Los Angeles BioMedical Research Institute) at Harbor-University of California, Los Angeles, Medical Center, University of Washington, Wake Forest School of Medicine, Northwestern University, University of Minnesota, Columbia University, Johns Hopkins University, and University of California, Los Angeles. The human study protocols were approved by the Institutional Review Boards of Beth Israel Deaconess Medical Center, University of Washington, and the 4 clinical centers of HERITAGE.
The JHS, MESA, and the HERITAGE Family Study have all been previously described.11–13 In brief, JHS is a community-based longitudinal cohort study begun in 2000 of 5306 self-identified Black individuals from the Jackson, Mississippi, metropolitan statistical area.11 Included in the present study are samples collected at Visit 1 between 2000 and 2004 from 1852 individuals with whole genome sequencing (WGS)14 and proteomic profiling performed in batches (see Proteomic Profiling below).
MESA began in 2000 with 6814 men and women age 45 to 84 years recruited at 6 clinical centers across the United States. Participants were identified as belonging to 1 of 4 racial/ethnic groups: Black, Hispanic, Asian, or White. Included in the present study are 980 individuals selected randomly across all 4 racial/ethnic groups with proteomic profiling from Visit 1 between 2000 and 2002 and whole genome sequence analysis.12
HERITAGE enrolled a combination of self-identified White and Black family units, totaling 763 sedentary participants (62% White) between the ages of 17 and 65 years in a 20-week, graded endurance exercise training study across 4 clinical centers in the United States and Canada in 1994 to 1995.13 Included in the present study are a random subset of 708 individuals with baseline plasma samples and genotyping.
Proteomic profiling by SomaScan (aptamer-based affinity platform) and Olink (antibody-based affinity platform) has been described previously.6,15 Please see the Supplemental Methods for further details.
Genotyping and Imputation
WGS in JHS and MESA has been described previously.14,16 Included in the present study are participants included in Freeze 6 of the TOPMed project at the Northwest Genome Center at University of Washington and the Broad Institute. Samples underwent >30× WGS. Genotype calling with vt17 and quality control were performed by the Informatics Resource Center at the University of Michigan.
Genotyping in HERITAGE was performed on the Illumina Infinium Global Screening Array. Genotypes were called using Illumina’s GenCall on the basis of the top/bottom strand method. Genotype imputation was performed using the University of Michigan Imputation Server Minimac4 to reference panel TOPMed Freeze5.18 Phasing was performed with Eagle v2.4. Sites were excluded with call rate <90%, mismatched alleles, or invalid alleles (88% of sites retained).
All statistical methods are explained throughout the sections below.
Whole Genome Sequence Association Analysis
Across all 3 cohorts, proteomic measurements were standardized to a set of control samples (pooled plasma) that were part of each plate. The resulting values were log-transformed and scaled to a mean of 0 and SD of 1. In JHS, to account for batch effects, proteins were log-transformed and scaled within batch and then combined. In all cohorts, these log-transformed values were residualized on age, sex, batch, and principal components of ancestry 1 to 10 as determined by Genetic Estimation and Inference in Structured Samples (GENESIS).16,19,20 In HERITAGE and MESA, measurements were also residualized on race to account for nongenetic racial effects not captured by genetic ancestry. The resulting residuals were then inverse-normalized. The association between these values and genetic variants was tested using linear mixed effects models adjusted for age, sex, the genetic relationship matrix, and principal components 1 to 10 using the fastGWA model implemented in the GCTA software package (version 1.93.2beta/gcta64).21 Repeat adjustment was implemented to reduce type I error and improve statistical power.22 Variants with a minor allele count less than 5 in a given cohort were excluded from analysis in that cohort. A Bonferroni-adjusted significance threshold of 3.8×10⁻¹¹ (5×10⁻⁸/1301) was used for discovery in JHS. For variants in cis (<1 Mb from the transcription start site [TSS] of the coding gene for the associated protein),1 variants with P values <5×10⁻⁶ were also considered in a separate analysis, given the biological plausibility of such associations.
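The inverse-normalization step can be illustrated with a rank-based inverse normal transform. The following is a minimal sketch rather than the pipeline used in the study: the function name, the Blom offset of 3/8, and the use of average ranks for ties are all assumptions, because the text does not specify these details.

```python
from statistics import NormalDist


def inverse_normal_transform(values, offset=0.375):
    """Rank-based inverse normal transform of a list of residuals.

    offset=0.375 is the Blom constant (an assumption; the source does not
    state which offset was used). Tied values receive their average rank.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    # Map each rank to a quantile of the standard normal distribution.
    return [
        standard_normal.inv_cdf((r - offset) / (n - 2 * offset + 1))
        for r in ranks
    ]


transformed = inverse_normal_transform([3.0, 1.0, 2.0])
```

The transform preserves the ordering of the residuals while forcing their distribution toward a standard normal, which limits the influence of outlying protein measurements on the association test.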
Variance Explained for Each Protein
Single nucleotide polymorphism (SNP)–based heritability, h²SNP, was estimated using a linkage disequilibrium (LD)– and minor allele frequency (MAF)–stratified genomic relatedness matrix restricted maximum likelihood (GREML-LDMS) model implemented in the GCTA software. This method allows for fitting multiple genomic relatedness matrices with SNPs binned according to their regional LD and MAF.23 It is recommended for heritability estimation on WGS data.23,24 Using this model, we first calculated the segment-based (length of segment: 200 kb) LD scores and partitioned SNPs into 4 groups on the basis of the quartiles of the regional LD score. Genomic relatedness matrices for each of the 4 groups were then computed using SNPs binned into the corresponding group, and jointly fitted into a mixed effect model for estimating the heritability and variance. In our analysis, we allowed for a maximum of 1000 iterations. For all analyses, we adjusted for age, sex, and the first 10 principal components of genetic ancestry. Variance explained by the top performing variant (as determined by lowest P value) was estimated using the equation BETA²×2×AF1×(1–AF1)/VAR, where BETA was the beta estimate for the effect allele, AF1 was the allele frequency of the effect allele, and VAR was the variance of the protein residual used for WGS analysis. Variance explained by clinical covariates was estimated using linear regression of log-transformed protein level regressed on age, sex, systolic blood pressure, diabetes, use of hypertensive medication, current smoking status, and a history of coronary heart disease. Proteins whose total heritability could not be estimated by this model, which often occurs when heritability estimates are low,23 were excluded (N=185, Figure S1, Table S2).
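The variance-explained equation for a sentinel variant is simple enough to sketch directly. In the example below the function name and the default trait variance of 1 are assumptions (inverse-normalized residuals have variance close to 1), and the input values are taken from the TNFSF18 row of the sentinel-variant table purely for illustration, assuming the listed JHS allele frequency refers to the effect allele.

```python
def variance_explained(beta, effect_allele_freq, trait_variance=1.0):
    """Proportion of trait variance explained by one additive variant.

    Implements BETA^2 * 2 * AF1 * (1 - AF1) / VAR, where 2 * AF1 * (1 - AF1)
    is the genotype variance expected under Hardy-Weinberg equilibrium.
    """
    if not 0.0 <= effect_allele_freq <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    if trait_variance <= 0.0:
        raise ValueError("trait variance must be positive")
    genotype_variance = 2.0 * effect_allele_freq * (1.0 - effect_allele_freq)
    return beta ** 2 * genotype_variance / trait_variance


# Illustrative inputs: beta = 1.87 and allele frequency = 0.051, with the
# trait variance assumed to be 1 for inverse-normalized residuals.
example = variance_explained(beta=1.87, effect_allele_freq=0.051)
print(f"{example:.3f}")
```

Under these assumptions a variant with a large effect can explain a substantial share of variance even at an allele frequency of about 5%, because the effect size enters the equation squared.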
Defining Protein-Locus Associations and Sentinel Variants
To identify the broadest genomic region associated with a protein, we applied the following previously described algorithm:1 a 1-Mb region around each SNP associated with a given protein was defined. Beginning with the region containing the variant with the lowest P value, overlapping regions were merged together. This was repeated until no more overlapping regions existed for the given protein. The variant with the lowest P value in each region was identified as the sentinel variant. To describe regions associated with multiple proteins, regions with sentinel variants in LD with r²≥0.8 were described as the same region, exclusively for descriptive purposes. LD was determined using SNPClip, using data from individuals of African ancestry.25,26 Any variants within 1 Mb of the TSS for the cognate gene of a protein were considered “cis.”
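The region-merging procedure can be sketched as a standard interval merge. This is an illustration under stated assumptions, not the authors' code: the function name and data layout are hypothetical, and the 1-Mb region is interpreted here as ±500 kb around each variant (the `flank` parameter), which the text does not state explicitly. A position-sorted merge yields the same final regions as repeatedly merging overlaps starting from the most significant variant.

```python
from collections import defaultdict


def define_loci(hits, flank=500_000):
    """Merge per-variant windows into loci and pick a sentinel for each.

    hits: iterable of (chromosome, position, p_value) for a single protein.
    flank: half-width of the window around each variant (assumed value).
    """
    by_chrom = defaultdict(list)
    for chrom, pos, pvalue in hits:
        by_chrom[chrom].append((pos, pvalue))

    loci = []
    for chrom, variants in by_chrom.items():
        variants.sort()  # process variants in positional order
        current = None
        for pos, pvalue in variants:
            win_start, win_end = max(0, pos - flank), pos + flank
            if current is not None and win_start <= current["end"]:
                # Window overlaps the open locus: extend it.
                current["end"] = max(current["end"], win_end)
                if pvalue < current["sentinel_p"]:
                    current["sentinel_pos"] = pos
                    current["sentinel_p"] = pvalue
            else:
                # No overlap: start a new locus.
                current = {
                    "chrom": chrom,
                    "start": win_start,
                    "end": win_end,
                    "sentinel_pos": pos,
                    "sentinel_p": pvalue,
                }
                loci.append(current)
    return loci


# Made-up hits for one protein; only the first position echoes the table.
example_hits = [
    ("11", 6_440_254, 1e-145),
    ("11", 6_900_000, 1e-12),
    ("11", 9_000_000, 1e-13),
    ("19", 44_908_822, 3e-30),
]
example_loci = define_loci(example_hits)
```

In this toy input the first two chromosome 11 variants fall into one locus because their windows overlap, while the third variant is far enough away to form its own locus.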
Replication in MESA/HERITAGE
Associations between sentinel variants and proteins from JHS were evaluated in MESA and HERITAGE separately if association statistics were available (if minor allele count was <5 in either cohort, that variant was not considered in that cohort). Where association statistics were available in both cohorts, the 2 cohorts were meta-analyzed using the inverse-variance weighted method using fixed effects. The validation threshold was set at P<0.05 with consistent direction of effect.
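The replication step combines two cohort estimates and then applies a direction-and-significance check. The sketch below shows the textbook inverse-variance weighted fixed-effects estimator with a two-sided normal P value; the function names and the input numbers are hypothetical and are not taken from the study.

```python
import math
from statistics import NormalDist


def ivw_fixed_effect(estimates):
    """Inverse-variance weighted fixed-effects meta-analysis.

    estimates: list of (beta, standard_error) pairs, one per cohort.
    Returns the pooled beta, its standard error, and a two-sided P value.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    p_value = 2.0 * NormalDist().cdf(-abs(z))
    return pooled_beta, pooled_se, p_value


def replicates(discovery_beta, replication_beta, replication_p, alpha=0.05):
    """Replication rule: P below alpha and same direction of effect."""
    same_direction = discovery_beta * replication_beta > 0
    return replication_p < alpha and same_direction


# Hypothetical per-cohort estimates for one sentinel variant.
pooled_beta, pooled_se, pooled_p = ivw_fixed_effect([(0.5, 0.1), (0.3, 0.1)])
```

With equal standard errors the pooled estimate is the simple average of the two betas, and the pooled standard error shrinks by a factor of √2 relative to either cohort alone.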
Results from JHS, MESA, and HERITAGE were meta-analyzed together using mixed effects models in the “meta” package of R 4.0.5. Only variants with a P value for association with a given protein <1×10⁻⁵ in at least 2 of the studies were included.
Comparing With Previous pQTLs
To determine whether pQTLs were novel, we used the PhenoScanner package (version 2) for R.27,28 For each protein-locus association identified above, we divided the locus into segments of 1 Mb or less (the maximum permitted by the PhenoScanner API) if needed. The resulting region or regions were then passed to the phenoscanner function in R, with the following arguments: build was set to “38,” P value to 1×10⁻⁵, catalog to “pQTL,” and proxies to “None” (query date June 28, 2020). To supplement PhenoScanner, we reviewed the literature for additional studies using SomaScan or Olink to identify the genetic architecture of the plasma proteome and identified 3 not in PhenoScanner.2,6,7 Results from these studies were considered using the same criteria as above. If the protein linked to that region in our analysis was found to be previously associated with any variants in the region, this was considered a “previous” protein-locus association. For the subset of protein-locus associations that were previously described, we secondarily looked to see whether the sentinel SNP in JHS represented a novel genetic determinant. Sentinel SNPs were queried against both PhenoScanner and the 3 other studies to look for any variants associated with the same protein and in linkage disequilibrium with the new sentinel SNP. Again the phenoscanner function in R was used with the following arguments: build was set to “38,” P value to 1×10⁻⁵, catalog to “pQTL,” proxies to “EUR” (because these variants were discovered in European populations), and r2 to “0.5” (query date October 1, 2020).
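Splitting a locus into query-sized segments is a small bookkeeping step that can be sketched as follows. The function name and the half-open coordinate convention are assumptions for illustration; the sketch does not call PhenoScanner and does not reproduce its query syntax.

```python
def split_region(start, end, max_len=1_000_000):
    """Split [start, end) into contiguous segments no longer than max_len."""
    if end <= start:
        raise ValueError("end must be greater than start")
    segments = []
    seg_start = start
    while seg_start < end:
        seg_end = min(seg_start + max_len, end)
        segments.append((seg_start, seg_end))
        seg_start = seg_end
    return segments


# A hypothetical 2.5-Mb locus yields three segments.
example_segments = split_region(5_940_254, 8_440_254)
```

Each segment would then be queried separately, and the results pooled before deciding whether any previously reported variant for the same protein falls within the locus.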
Comparing With Previous GWAS Results
To determine overlap between clinical GWAS analyses and pQTLs in this analysis, we used the PhenoScanner package for R. All 569 sentinel SNPs as identified above were passed to the phenoscanner function in R with the following arguments: build was set to “38,” P value to “1×10⁻⁵,” catalog to “GWAS,” r2 to “0.5,” and proxies to “AFR” (query date October 15, 2020).
Comparing Results With ClinVar Data
The entirety of the ClinVar database was downloaded from the National Center for Biotechnology Information FTP site (https://ftp.ncbi.nlm.nih.gov/pub/clinvar/tab_delimited/variant_summary.txt.gz, accessed September 3, 2020).29 These data were merged to all variants associated with any protein in the JHS at a P value <5×10⁻⁶.
Reference allele frequencies from gnomAD30 and variant category from GENCODE31 were obtained from the Functional Annotation of Variants–Online Resource (available at https://favor.genohub.org, download date July 20, 2020).32
Whole Genome Association Analysis of Proteomic Profiling
We performed whole genome association analysis between 28.1 million variants with an allele count in JHS of at least 5 and 1301 plasma protein measures in 1852 self-identified Black individuals (61% women). Proteins exhibited a wide range of estimated total heritability (median heritability = 0.33, interquartile range 0.22–0.48, Figure S1, Table S2). Imputing proteins with nonconverged heritability estimates to 0 resulted in a median heritability of 0.29 (see Methods).
At a Bonferroni-adjusted significance cutoff (5×10⁻⁸/1301=3.8×10⁻¹¹), we identified 569 associations with 479 proteins encompassing 438 unique genetic loci (Figure 1, Table S3). Each locus is a genomic region containing at least 1 variant associated with a protein but often summarizing multiple nearby variants in varying degrees of linkage disequilibrium (see Methods). The variant with the lowest P value for association in each locus is considered the sentinel variant. Using this method, we identified 114 locus-protein associations not previously described. For previously described locus-protein associations, we identified novel sentinel variants in 217 loci.
Across these 569 associations, 329 (58%) of the sentinel variants for the given locus are within 1 Mb of the TSS of the cognate gene for that protein (termed cis), and 240 are nonlocal (termed trans). We identified an additional 183 suggestive cis associations when the P value threshold was lowered to 5×10⁻⁶ (Table S4). The majority of cis pQTLs were close to the TSS of the cognate gene, with 90% falling within 100 kb of the TSS (Figure 2A).
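The cis/trans rule reduces to a distance check against the TSS of the cognate gene. In the sketch below the function name is hypothetical and the TSS coordinates are made up for illustration; only the variant position (22:36265860) is taken from the sentinel-variant table.

```python
def classify_pqtl(variant_chrom, variant_pos, gene_chrom, gene_tss,
                  window=1_000_000):
    """Label an association cis if the variant lies within `window` bp of
    the TSS of the protein's cognate gene on the same chromosome."""
    same_chrom = variant_chrom == gene_chrom
    if same_chrom and abs(variant_pos - gene_tss) < window:
        return "cis"
    return "trans"


# Made-up TSS coordinates, used only to exercise both branches.
near_gene = classify_pqtl("22", 36_265_860, "22", 36_250_000)
other_chromosome = classify_pqtl("22", 36_265_860, "13", 52_000_000)
```

A variant on a different chromosome from the cognate gene is always trans under this rule, regardless of its coordinate.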
The majority of proteins (70%) with a significant pQTL were associated with a single locus. Three proteins were associated with 5 different loci: Ck-beta-8-1, cyclin-dependent kinase inhibitor 1B, and apolipoprotein L1 (Figure 2B).
Patterns observed in previous studies were replicated here: most loci (388, 89%) were associated with only 1 protein, although there were several pleiotropic loci including regions near the VTN, ABO, and APOE genes (Figure 2C), all of which have been implicated in cardiovascular disease.1,3,33–36 Sentinel variants were largely proximate to coding genes, with only 20% in intergenic regions (Figure 2D and 2E). There was a strong inverse relationship between effect size and MAF, consistent with previous pQTL studies (Figure 2F).1
In contrast with previous studies, a significant number of sentinel variants had allele frequencies that varied substantially from those observed in European populations: among non-Finnish Europeans in gnomAD, 166 (36%) of the 464 identified sentinel variants in JHS had MAF <1%, and 65 (14%) had MAF <0.0001%.30 Many of these variants were much more common in JHS: among the 166 variants with MAF <1% in non-Finnish Europeans, 71 had MAF >5% in JHS. Figure 3 illustrates the wide disparity between allele frequencies of all 569 sentinel variants in African versus non-Finnish European populations in gnomAD.30
We also completed proteomic profiling in 2 smaller cohorts, MESA (N=980, 53% women, 19% Black) and the HERITAGE Family Study (N=708, 56% women, 36% Black), each containing a subset of self-identified Black individuals, which were meta-analyzed (when possible) to validate the results. Consistent associations were observed for 90% of the 569 sentinel variants at a P value <0.05 with matching direction of effect. If a significance threshold adjusted for multiple comparisons is used (P<0.05/569=8.8×10⁻⁵), 72% replicate. Variants that did not replicate in some cases had lower MAF, falling below the minor allele count threshold of 5 in 1 of the 2 replication cohorts, reducing overall replication power (Table S3, Figure S2). Results from JHS, MESA, and HERITAGE were also meta-analyzed together. This analysis yielded 13 additional pQTLs: 9 trans and 4 cis (Table S5).
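The multiple-testing thresholds quoted for discovery and replication follow from dividing a base significance level by the number of tests. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Discovery: genome-wide level divided by the 1301 proteins tested.
discovery_threshold = bonferroni_threshold(5e-8, 1301)
# Replication: nominal level divided by the 569 sentinel associations.
replication_threshold = bonferroni_threshold(0.05, 569)

print(f"{discovery_threshold:.1e}", f"{replication_threshold:.1e}")
```

The discovery threshold starts from the conventional genome-wide level rather than 0.05 because each protein is itself tested against millions of variants.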
In a limited subsample of JHS participants (N=488), plasma samples were also profiled using the Olink Explore platform, which uses a completely distinct, immunoassay-based approach for protein measurement, which generally relies on polyclonal antibody conjugates.37 Of the 569 sentinel variant-protein associations, 318 could be compared on the Olink platform. These associations showed a consistent effect across the 2 platforms (correlation of effect, 0.82 [95% CI, 0.78–0.85]; Figure S3). Across all 318 comparisons, the median Soma-Olink correlation was 0.62 (interquartile range, 0.35–0.74). The direction of effect matched in 86%, and 51% of associations were confirmed at a Bonferroni (0.05/318) level of significance (Table S3). There were a small number of discordant associations where effects as measured by SOMA and Olink were significant but with opposing directions of effect, such as the association between rs5744204 and lipopolysaccharide-binding protein. These may indicate platform-specific binding effects, but still support a genetic effect on protein levels as the most likely explanation, save for the unlikely possibility of opposing effects on just the binding of reagents from each platform.
|Target full name||Target||Sentinel SNP||SNP (hg38)||JHS AF||Non-Finnish European AF gnomAD||African AF gnomAD||Consequence||Cis/trans||Nearest gene||β||SE||P value|
|Plasma serine protease inhibitor||PCI||rs1801020||5:177409531:A:G||0.551||0.750||0.551||5′UTR||Trans||F12||–0.32||0.03||1.1E-26|
|Kelch-like ECH-associated protein 1||KEAP1||rs769455||19:44908783:C:T||0.019||0.000||0.021||Missense||Trans||APOE||1.20||0.12||9.1E-25|
|Sonic hedgehog protein||Sonic hedgehog||rs7412||19:44908822:C:T||0.111||0.080||0.105||Missense||Trans||APOE||0.34||0.05||2.8E-11|
|Tyrosine-protein kinase ZAP-70||ZAP70||rs7412||19:44908822:C:T||0.111||0.080||0.105||Missense||Trans||APOE||0.61||0.05||3.3E-30|
|Bone morphogenetic protein receptor type 2||BMP RII||rs12117||11:6440254:G:A||0.051||0.000||0.055||Missense||Trans||HPX||0.81||0.07||6.4E-31|
|Natural killer cell receptor 2B4||CD244||rs12117||11:6440254:G:A||0.051||0.000||0.055||Missense||Trans||HPX||0.53||0.07||1.5E-13|
|Glial cell line–derived neurotrophic factor||GDNF||rs12117||11:6440254:G:A||0.051||0.000||0.055||Missense||Trans||HPX||0.65||0.07||1.6E-19|
|Tumor necrosis factor||TNF-α||rs12117||11:6440254:G:A||0.051||0.000||0.055||Missense||Trans||HPX||1.75||0.07||3.6E-126|
|Tumor necrosis factor ligand superfamily member 18||TNFSF18||rs12117||11:6440254:G:A||0.051||0.000||0.055||Missense||Trans||HPX||1.87||0.07||1.0E-145|
|Retinol-binding protein 4||RBP||rs76992529||18:31598655:G:A||0.018||0.000||0.016||Missense||Trans||TTR||–0.91||0.13||2.5E-13|
|Cytoskeleton-associated protein 2||CKAP2||rs73885319||22:36265860:A:G||0.232||0.000||0.220||Missense||Trans||APOL1||0.34||0.04||1.3E-17|
|Protein S100-A9||Calgranulin B||rs10430455||1:157733448:T:A||0.105||0.551||0.119||Intergenic||Trans||FCRL2||0.36||0.05||2.2E-11|
|Bactericidal permeability–increasing protein||BPI||rs2814778||1:159204893:T:C||0.837||0.004||0.814||5′UTR||Trans||ACKR1||–0.30||0.04||9.3E-12|
|C-X-C motif chemokine 11||I-TAC||rs2814778||1:159204893:T:C||0.837||0.004||0.814||5′UTR||Trans||ACKR1||–0.36||0.04||3.2E-16|
|C-X-C motif chemokine 16||CXCL16, soluble||rs2234355||3:45946488:G:A||0.441||0.002||0.433||Missense||Trans||CXCR6||0.52||0.03||5.7E-54|
Novel Genetic Determinants of Plasma Proteins Related to Thrombosis, Lipid Biology, and Myocardial Disease
To determine the novelty of the wide genomic regions identified as pQTLs by our analysis, we queried pQTL data available in PhenoScanner, a database of GWAS findings.27,28 Of the 569 protein-locus associations, 114 (20%) had not been previously identified (Figure 1, Table S3) at a P value <1×10⁻⁵. Of these 114 novel associations, 84 (74%) were trans associations. Sixty-two (54%) of the sentinel variants for these loci were uncommon (ie, MAF <1%) in non-Finnish European populations, but had a median MAF in JHS of 5% (interquartile range, 2%–12%). Novel pQTLs provide the opportunity to better understand biological pathways. As an example, a variant in the 5-prime untranslated region of F12, the gene for clotting factor XII, is observed to be a novel pQTL for thrombin and plasma serine protease inhibitor. This variant has previously been shown to affect thrombin generation and the coagulation cascade.38
Similar to previous studies, we identify multiple pleiotropic genetic loci that affect the levels of multiple proteins. The APOE locus is 1 such well-established locus, which is known to be associated with hypercholesterolemia, atherosclerotic heart disease, and Alzheimer disease.39 Our analysis reveals 6 new proteins associated with this gene at 3 distinct (r²<0.1) missense variants: rs7412, rs769455, and rs42935 (Figure 4A). These 6 proteins (β-endorphin, MMP-3 [matrix metalloproteinase-3], Sonic hedgehog, ZAP70 [zeta chain of T cell receptor associated protein kinase 70], Kelch-like ECH-associated protein 1, and matrix metalloproteinase-8) implicate new targets in understanding how APOE (apolipoprotein E) may mediate its effects. Indeed, APOE knockout mice, which develop atherosclerotic lesions that mimic human plaques, have shown reduced Zap70 activation.40 Further, MMP-3 levels have been shown to be elevated in affected areas of the brain among those with Alzheimer disease.41
The analysis also shows a new pleiotropic locus at HPX, the gene for hemopexin. The sentinel variant, rs12117, is nearly monoallelic in European populations, but has a MAF in JHS of 5%. Six proteins are shown to be affected by this variant: bone morphogenetic protein receptor type-2, natural killer cell receptor 2B4, K-Ras, glial cell line–derived neurotrophic factor, TNFα (tumor necrosis factor-α), and tumor necrosis factor ligand superfamily member 18. Another 3 proteins are associated with other variants either in or upstream of HPX (Figure 4A). It has been posited that hemopexin protects cells from oxidative stress by clearing heme, and TNFα is known to induce HPX expression in rats as an acute phase response.42 Further, HPX/APOE double knockout mice had accelerated atherosclerosis related to oxidative stress and changes in macrophage function. This role of hemopexin may be particularly important in Black patients with sickle-cell disease: murine models have shown the value of heme-scavenging by hemopexin in reducing inflammation in models of sickle-cell disease.43,44 Our findings suggest specific genetic variation may have a role in the immune functions of hemopexin. Although no members of our cohort had sickle-cell disease, 24 individuals did have both the minor allele of rs12117 and sickle-cell trait. However, no definitive interaction between these 2 variants and any protein could be identified. Unfortunately, given the low frequency of the variant in European-based GWAS, no clinical implications for rs12117 have been identified, although other variants in HPX have been linked to ulcerative colitis.45 Further data are needed; specifically, data from patients with sickle-cell disease would be of value.
Our analysis can implicate new biology related to previously described variants as well. The variant rs2066702 in ADH1B has been identified as a risk locus for alcohol dependence across multiple ancestry-specific GWAS.46 The same variant in our analysis is associated with levels of NAMPT (nicotinamide phosphoribosyltransferase; Figure S4A), which regulates intracellular nicotinamide adenine dinucleotide (NAD+) and plays a role in cardiac hypertrophy and adverse remodeling.47 Importantly, the minor allele of rs2066702 is protective against alcohol dependence, and it is this allele that is associated with higher levels of NAMPT, suggesting that alcohol use may deplete NAMPT in humans. Furthermore, previous murine studies have shown that ethanol administration diminished NAMPT levels, whereas overexpression of NAMPT was found to protect against steatosis.48
Conversely, the associations between well-described proteins and poorly understood genes can further elucidate biology. Levels of 2 proteins, plasminogen and angiostatin (itself a fragment of plasminogen), were linked to a variant upstream of GALNT7 (Figure S4B). Plasminogen and angiostatin each have a strong cis pQTL, supporting aptamer specificity for their measurement (Tables S3 and S4). Although plasminogen and angiostatin are critical factors in clot dissolution and angiogenesis inhibition,49,50 respectively, the biological role of GALNT7, a glycosyltransferase, has been linked by more limited evidence to cancer proliferation.51 The sentinel SNPs linked to these proteins in our analysis are monoallelic in European populations, so previous GWAS data do not exist. However, other variants at the GALNT7 locus have been linked to vascular disorders in the UK Biobank, including “Cause of death: peripheral vascular disease, unspecified” (P=1×10⁻²³), “Cause of death: vascular dementia, unspecified” (P=8×10⁻²⁰), and “Cause of death: chronic or unspecified with hemorrhage” (P=2×10⁻¹⁷), all 3 of which are plausibly mediated by plasminogen or angiostatin.27,28
Known Ancestry-Specific Loci Highlight Ancestry-Specific Cardiovascular Disease Pathways
Analysis of samples from individuals of greater African ancestry allows for assessment of specific loci known to be of particular clinical importance in individuals of African descent. We evaluated the proteomic signatures of 4 such well-described loci.
TTR (transthyretin) amyloidosis results from the misfolding of the transthyretin tetramer, ultimately resulting in abnormal protein deposition in myocardium and nerve tissue, leading to cardiomyopathy and neuropathy. Protein misfolding is accelerated in the presence of mutations in the TTR gene; specifically, rs76992529 encodes a V122I mutation that is found in 3% to 4% of Black individuals. In our data we show this variant to be a robust pQTL for RBP4 (retinol-binding protein 4), a binding partner of TTR.52 In individuals with TTR amyloidosis and overt myocardial disease (typically manifested as left ventricular thickening and diastolic dysfunction), RBP4 levels are known to be diminished; the normal transthyretin tetramer protects RBP4 from renal clearance.53 However, our data show that asymptomatic carriers of this mutation have diminished RBP4 levels as well, even in the absence of reported heart failure (Figure 5A). To further explore this finding, we leveraged extensive metabolite profiling in JHS.54 We found an unknown metabolite feature highly correlated with circulating RBP4 (Pearson correlation 0.64, [95% CI, 0.61–0.66]). As expected, the association between this metabolite and the V122I mutation was also strong (β=–0.76, P=4.6×10⁻¹⁴). This metabolite feature has a mass-to-charge ratio of 269.226, which strongly suggests its identity as a dehydrated form of retinol, the binding partner of RBP4, according to the Human Metabolome Database. These data further complement and validate our proteomic association of RBP4 and TTR. Larger datasets are needed to explore the functional consequences of these proteomic and metabolomic findings.
Two alleles in the APOL1 gene (rs73885319/rs60910145 or “G1” and rs71785313 or “G2”) are linked to chronic kidney disease and cardiovascular disease in JHS and are common in individuals with African ancestry.55–57 In JHS, rs73885319 has a MAF of 23%, whereas the variant is not present in persons of European ancestry in gnomAD. In addition to being associated with levels of APOL1 (apolipoprotein L1) in our analysis, it was also the sentinel SNP determining levels of CKAP2 (cytoskeleton associated protein 2, Figure 5B). CKAP2 has been linked to tumor formation because it has a role in mitosis, but has also been observed to be upregulated in renal tubular necrosis.58,59 In models adjusted for age, sex, body mass index, systolic blood pressure, presence of hypertension, presence of diabetes, HbA1c, and proteomic batch/plate, CKAP2 levels as measured by SomaScan were associated with increased estimated glomerular filtration rate in JHS (β=1.16, P=0.002). Because APOL1 risk variants are associated with renal disease, this could point to a protective role for CKAP2 in response to APOL1 genetic risk, which warrants further investigation as a potential therapeutic target.
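The covariate-adjusted association described above can be sketched as an ordinary least-squares fit in which the coefficient for the protein is estimated jointly with the adjustment variables. This is a minimal, standard-library illustration on made-up numbers. The helper name `ols_coefficients`, the reduced covariate set (age only), and the data are assumptions, and the sketch does not reproduce the model or P value reported here.

```python
def ols_coefficients(X, y):
    """Least-squares coefficients for design matrix X (rows of floats) and outcome y.

    Solves the normal equations (X'X) b = X'y by Gauss-Jordan elimination
    with partial pivoting. Intended for small illustrative problems only.
    """
    n, p = len(X), len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = []
    for a in range(p):
        row = [sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
        row.append(sum(X[i][a] * y[i] for i in range(n)))
        A.append(row)
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        A[col], A[pivot] = A[pivot], A[col]
        pv = A[col][col]
        A[col] = [v / pv for v in A[col]]
        for r in range(p):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[r][p] for r in range(p)]


# Made-up data: columns are intercept, standardized protein level, and age.
protein = [-1.0, 0.0, 1.0, 2.0, 0.5, -0.5]
age = [40.0, 50.0, 60.0, 45.0, 55.0, 65.0]
X = [[1.0, c, a] for c, a in zip(protein, age)]
# Outcome constructed with a known protein effect of 1.16 so the fit can be checked.
egfr = [130.0 + 1.16 * c - 0.8 * a for c, a in zip(protein, age)]

intercept, beta_protein, beta_age = ols_coefficients(X, egfr)
print(f"protein coefficient adjusted for age: {beta_protein:.2f}")
```

In a real analysis the remaining covariates (sex, body mass index, blood pressure, diabetes, HbA1c, batch) would enter as additional columns of the design matrix, and a statistics package would supply standard errors and P values.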
DARC (Duffy chemokine receptor) is a binding site crucial to malarial infection with Plasmodium vivax, but has also been shown to affect risk for cardiovascular outcomes in JHS.60 Under positive selection in sub-Saharan Africa, the FY*O allele of this gene is thus common in individuals of African descent, although it is present in only 0.4% of individuals of non-Finnish European descent in gnomAD.30 Levels of CCL14 (C-C motif chemokine ligand 14) and eotaxin have previously been linked to this gene, and to this list we now add protein S100-A9, CXCL11 (C-X-C motif chemokine ligand 11), and bactericidal permeability-increasing protein. Despite being linked to neutropenia, the Duffy-null allele has not been shown to lead to an increased risk of infection.61 However, there is evidence of a slower progression of HIV infection in the Duffy-null state.62 These results expand the list of inflammatory mediators affected by the Duffy-null state.
Last, the variant that causes sickle cell trait, rs334, has an allele frequency of 4% in JHS. This variant was associated with fractalkine (P=2.5×10-6). Previous work has linked fractalkine, an inflammatory cytokine, to incident heart failure, specifically in Black individuals.63
Protein Associations for Clinically Relevant Variants
Among the other 435 protein-locus pairs with previously identified pQTLs in the same region, 44 of the previous pQTLs were at P values >5×10-8, and 177 of the previous pQTLs differed from the sentinel variants identified in JHS (r2<0.5). Thus, even in genetic regions previously linked to a given protein, many sentinel variants identified in this analysis may point to novel genetic effects when combined with existing genetic databases (Tables S6 and S7). As an example, the variant rs2234355 in the CXCR6 gene is nearly monoallelic in European populations, but is common among African populations, and thus well represented in JHS (MAF 44%). The variant has been previously shown to be protective against Pneumocystis jirovecii infection in HIV-infected individuals, and was more common in those achieving viremic control.64,65 Interactions between CXCR6 (C-X-C chemokine receptor type 6) and its ligand CXCL16 (C-X-C motif chemokine ligand 16) have been posited as a potential mechanism; we show this variant to be a strong (P=5.7×10-54) sentinel pQTL for CXCL16, supporting this hypothesis. The relationship may also have cardiovascular consequences, because CXCL16 levels have been associated with acute coronary syndromes.66
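The r2<0.5 criterion above compares a previously reported pQTL with the sentinel variant identified here. As a minimal sketch, assuming unphased genotype dosages (0, 1, or 2 copies of the alternate allele) are available for both variants in the same individuals, r2 can be approximated as the squared Pearson correlation of the dosages. The function names and toy genotypes are hypothetical, and reference-panel tools may compute linkage disequilibrium differently.

```python
def genotype_r2(g1, g2):
    """Squared Pearson correlation between two genotype dosage vectors.

    This is the unphased (genotypic) approximation to LD r2, not a
    haplotype-based estimate.
    """
    if len(g1) != len(g2) or len(g1) < 2:
        raise ValueError("need two equal-length vectors with at least 2 samples")
    n = len(g1)
    m1, m2 = sum(g1) / n, sum(g2) / n
    s12 = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    s11 = sum((a - m1) ** 2 for a in g1)
    s22 = sum((b - m2) ** 2 for b in g2)
    if s11 == 0 or s22 == 0:
        raise ValueError("r2 is undefined for a monomorphic variant")
    return (s12 * s12) / (s11 * s22)


def differs_from_prior(r2, threshold=0.5):
    """True when a prior pQTL is in weak LD with the new sentinel variant."""
    return r2 < threshold


# Toy dosages for six hypothetical individuals.
sentinel = [0, 1, 2, 0, 1, 2]
prior_pqtl = [0, 0, 2, 2, 1, 1]
r2 = genotype_r2(sentinel, prior_pqtl)
print(f"r2 = {r2:.4f}, distinct signal: {differs_from_prior(r2)}")
```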
Our data represent a comprehensive effort to understand the genetic determinants of the circulating plasma proteome using whole genome sequence analysis in individuals with greater genetic diversity than those in previous analyses. We identify numerous novel genetic determinants of a wide range of circulating proteins, many of which are important in vascular and cardiac biology. Many of these genetic variants have known clinical implications, in which case our data delineate novel biology potentially linking genetic variation to disease. As an example, the genetic mutation associated with TTR amyloidosis in persons of African ancestry, rs76992529, is shown here to be associated with RBP4 levels in persons without overt cardiomyopathy. A recent study from the BioMe database found a similar difference among persons with this mutation and without cardiomyopathy.67 Our findings extend the small case-control biobank study to a large, well-defined prospective cohort, advancing RBP4 levels as a potential preclinical biomarker. Further studies are needed to determine if there is an interaction between this mutation, RBP4 levels, and incident cardiomyopathy.
In other cases, the proteomic associations identified represent the first meaningful annotation of a given genetic variant. Such is the case for rs12117, a missense variant in the gene for hemopexin. Despite a MAF of ≈2.6% in persons of African ancestry, little is known about this variant. Here, we describe it as a pleiotropic locus, affecting the levels of multiple inflammatory proteins. Given hemopexin’s role in heme-scavenging, identifying additional carriers, particularly those with sickle-cell disease, may offer critical insights, and the proteins identified here would be useful starting points. The paucity of genome-wide association data in diverse populations limits our ability to interrogate associations, such as rs12117, with tools such as Mendelian randomization, but hopefully highlights the need for greater inclusion of diverse populations in genetic research going forward. Greater diversity in genetic association studies will not only increase our understanding of functional genomics but may also help delineate gene-environment interactions that affect individuals of diverse ancestry. Indeed, our analysis identifies novel variants that are not particularly rare in Europeans, but are only now described in a cohort of Black Americans. This finding suggests the possibility of gene-environment interactions, including the effects of social and structural differences, which have biological/health effects at multiple levels (health care access, stress response, environmental toxins, etc).68 Such future work is important not only for the populations themselves, but also for optimum understanding of the genomic basis of biological variability and disease susceptibility.
Future work leveraging these data may also center on the intriguing finding of genetic variants that produce opposing effects on the SomaScan platform compared with the Olink platform. These variants, often protein-altering, likely affect binding on one platform, but the significant opposing effects suggest they are true pQTLs. Understanding the implications of such variants on a genome-wide scale may identify functionally important gene regions and inform interpretation of binding data.
Our study has several strengths: as mentioned, it is the largest analysis of its kind in a Black population, which gives it the power to detect many novel variants. The results are compared with 2 multiethnic populations and an alternate profiling platform. Our study also has several important limitations. Although this is the largest pQTL analysis in a Black population, the sample size for genome-wide association is relatively modest compared with many GWAS. This also informs a second limitation, the use of multi-ancestry cohorts for validation rather than a population of similar ancestry to JHS. This fact is related to limited availability of proteomic data in Black persons, and the desire to maintain an adequate sample size for validation of our original findings. For example, all 980 MESA participants with proteomics are included, regardless of their racial or ethnic identification, in the hope that statistical validation can be performed on as many variants as possible. Limiting MESA to only the Black participants would have left only 190 individuals. A further limitation is aptamer specificity on the SomaScan platform. Although cis pQTLs (both from this study and others) and validation on the Olink platform can confirm aptamer specificity, off-target effects may be falsely attributed as trans pQTLs, though we expect most cases of nonspecificity to bias toward the null. Aptamer validation efforts beyond those included here are ongoing across many groups.1,2,69,70
Taken together, our work highlights the importance of extending proteomics, genomics, and likely other -omics studies, to diverse populations, to identify important potential biomarkers and disease pathways not only in those populations but also in the human population at large.
Jackson Heart Study
The authors wish to thank the staff and participants of the JHS.
The authors thank Drs Arthur S. Leon, D.C. Rao, James S. Skinner, Tuomo Rankinen, Jacques Gagnon, and the late Jack H. Wilmore for contributions to the planning, data collection, and conduct of the HERITAGE project.
D.H.K., U.A.T., A.G.B., T.J.W., J.G.W., P.N., and R.E.G. contributed to the original concept of the project, planned these analyses, and formulated the methods. D.H.K., U.A.T., D.N., M.D.B., J.M.R., D.E.C., S.D., L.F., S.S., D.S., Y.G., M.E.H., A.C., J.G.W., and R.E.G. collected, organized, and contributed to the quality control and management of JHS proteomic data, via both the SomaLogic platform and the Olink platform. D.H.K., U.A.T., D.N., M.D.B., J.M.R., D.E.C., S.D., L.F., S.S., D.S., R.P.T., P.D., K.D.T., Y.L., W.C.J., X.G., J.Y., Y.-D.I.C., A.W.M., S.S.R., J.I.R., and R.E.G. collected, generated, organized, and contributed to the quality control and management of MESA proteomic data. D.H.K., U.A.T., D.N., M.D.B., J.M.R., D.E.C., S.D., L.F., S.S., D.S., C.B., M.A.S., and R.E.G. collected, generated, organized, and contributed to the quality control and management of HERITAGE Family Study proteomic and genomic data. D.J. and TOPMed performed WGS for JHS and MESA. D.H.K., U.A.T., A.G.B., A.P., Z.Y., P.N., and R.E.G. developed the WGS analysis pipeline and statistical methods across all proteomics data. D.H.K., U.A.T., A.G.B., P.N., J.G.W., and R.E.G. analyzed the data and wrote/revised the article.
Sources of Funding
Dr Katz is supported by an NHLBI T32 postdoctoral training grant (T32HL007374-40). Dr Tahir is supported by the Ruth L. Kirschstein postdoctoral individual National Research Service Award (F32HL150992). Dr Bick is supported by National Institutes of Health (NIH) DP5-OD029586-01 and is a recipient of a Career Award for Medical Scientists from the Burroughs Wellcome Fund. Dr Cruz is supported by the KL2/Catalyst Medical Research Investigator Training award from Harvard Catalyst (NIH/National Center for Advancing Translational Sciences Award TR002542). Dr Robbins is supported by the John S. LaDue Memorial Fellowship in Cardiology at Harvard Medical School. Dr Benson is supported by an NHLBI K08HL145095 award. Dr Natarajan is supported by NIH R01HL142711. Drs Gerszten, Wang, and Wilson are supported by NIH R01 DK081572. Drs Gerszten, Wang, and Vasan are supported by NIH R01 HL132320. Drs Gerszten and Vasan are supported by National Institute on Aging grant RF1AG063507.
Jackson Heart Study
JHS is supported and conducted in collaboration with Jackson State University (HHSN268201800013I), Tougaloo College (HHSN268201800014I), the Mississippi State Department of Health (HHSN268201800015I/HHSN26800001), and the University of Mississippi Medical Center (HHSN268201800010I, HHSN268201800011I, and HHSN268201800012I) contracts from the NHLBI and the National Institute for Minority Health and Health Disparities. Molecular data for the TOPMed program was supported by the NHLBI. Genome sequencing for “NHLBI TOPMed: The Jackson Heart Study” (phs000964.v1.p1) was performed at the Northwest Genomics Center (HHSN268201100037C). Core support including centralized genomic read mapping and genotype calling, along with variant quality metrics and filtering, were provided by the TOPMed Informatics Research Center (3R01HL-117626-02S1; contract HHSN268201800002I). Core support including phenotype harmonization, data management, sample-identity quality control, and general program coordination were provided by the TOPMed Data Coordinating Center (R01HL-120393; U01HL-120393; contract HHSN268201800001I). The authors gratefully acknowledge the studies and participants who provided biological samples and data for TOPMed.
TOPMed MESA Multi-Omics/MESA Study Acknowledgment
WGS for the TOPMed program was supported by the NHLBI. WGS for “NHLBI TOPMed: Multi-Ethnic Study of Atherosclerosis (MESA)” (phs001416.v1.p1) was performed at the Broad Institute of Massachusetts Institute of Technology and Harvard University (3U54HG003067-13S1). Centralized read mapping and genotype calling, along with variant quality metrics and filtering, were provided by the TOPMed Informatics Research Center (3R01HL-117626-02S1). Phenotype harmonization, data management, sample-identity QC, and general study coordination were provided by the TOPMed Data Coordinating Center (3R01HL-120393-02S1). The MESA projects are conducted and supported by the NHLBI in collaboration with MESA investigators.
Support for MESA is provided by contracts 75N92020D00001, HHSN268201500003I, N01-HC-95159, 75N92020D00005, N01-HC-95160, 75N92020D00002, N01-HC-95161, 75N92020D00003, N01-HC-95162, 75N92020D00006, N01-HC-95163, 75N92020D00004, N01-HC-95164, 75N92020D00007, N01-HC-95165, N01-HC-95166, N01-HC-95167, N01-HC-95168, N01-HC-95169, UL1-TR-000040, UL1-TR-001079, and UL1-TR-001420. It is also supported in part by the National Center for Advancing Translational Sciences, Clinical and Translational Science Institute (CTSI) grant UL1TR001881, and National Institute of Diabetes and Digestive and Kidney Disease Diabetes Research Center grant DK063491 to the Southern California Diabetes Endocrinology Research Center.
This research was partially funded by NHLBI grants HL-45670, HL-47317, HL-47321, HL-47323, and HL-47327, all in support of the HERITAGE Family Study. C.B. is partially funded by the John W. Barton Sr Chair in Genetics and Nutrition, and NIH Centers of Biomedical Research Excellence grant (NIH P30GM118430-01). Dr Sarzynski is supported by R01HL146462.
The views expressed in this article are those of the authors and do not necessarily represent the views of the NHLBI, the NIH, or the US Department of Health and Human Services.
dbGaP: database of Genotypes and Phenotypes
GWAS: genome-wide association study
JHS: Jackson Heart Study
MAF: minor allele frequency
MESA: Multi-Ethnic Study of Atherosclerosis
NHLBI: National Heart, Lung, and Blood Institute
NIH: National Institutes of Health
pQTL: protein quantitative trait locus
SNP: single nucleotide polymorphism
TOPMed: Trans-Omics for Precision Medicine
TSS: transcription start site
WGS: whole genome sequencing
Sun BB, Maranville JC, Peters JE, Stacey D, Staley JR, Blackshaw J, Burgess S, Jiang T, Paige E, Surendran P,. Genomic atlas of the human plasma proteome.Nature. 2018; 558:73–79. doi: 10.1038/s41586-018-0175-2CrossrefMedlineGoogle Scholar
Emilsson V, Ilkov M, Lamb JR, Finkel N, Gudmundsson EF, Pitts R, Hoover H, Gudmundsdottir V, Horman SR, Aspelund T,. Co-regulatory networks of human serum proteins link genetics to disease.Science. 2018; 361:769–773. doi: 10.1126/science.aaq1327CrossrefMedlineGoogle Scholar
Suhre K, Arnold M, Bhagwat AM, Cotton RJ, Engelke R, Raffler J, Sarwath H, Thareja G, Wahl A, DeLisle RK,. Connecting genetic risk to disease end points through the human blood plasma proteome.Nat Commun. 2017; 8:14357. doi: 10.1038/ncomms14357CrossrefMedlineGoogle Scholar
Yao C, Chen G, Song C, Keefe J, Mendelson M, Huan T, Sun BB, Laser A, Maranville JC, Wu H,. Genome-wide mapping of plasma protein QTLs identifies putatively causal genes and pathways for cardiovascular disease.Nat Commun. 2018; 9:3268. doi: 10.1038/s41467-018-05512-xCrossrefMedlineGoogle Scholar
Folkersen L, Fauman E, Sabater-Lleal M, Strawbridge RJ, Frånberg M, Sennblad B, Baldassarre D, Veglia F, Humphries SE, Rauramaa R,; IMPROVE Study Group. Mapping of 79 loci for 83 plasma protein biomarkers in cardiovascular disease.PLoS Genet. 2017; 13:e1006706. doi: 10.1371/journal.pgen.1006706CrossrefMedlineGoogle Scholar
Folkersen L, Gustafsson S, Wang Q, Hansen DH, Hedman ÅK, Schork A, Page K, Zhernakova DV, Wu Y, Peters J,. Genomic and drug target evaluation of 90 cardiovascular proteins in 30,931 individuals.Nat Metab. 2020; 2:1135–1148. doi: 10.1038/s42255-020-00287-2CrossrefMedlineGoogle Scholar
Benson MD, Yang Q, Ngo D, Zhu Y, Shen D, Farrell LA, Sinha S, Keyes MJ, Vasan RS, Larson MG,. Genetic architecture of the cardiovascular risk proteome.Circulation. 2018; 137:1158–1172. doi: 10.1161/CIRCULATIONAHA.117.029536LinkGoogle Scholar
- 8. 1000 Genomes Project Consortium,
Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, Korbel JO, Marchini JL, McCarthy S, McVean GA,. A global reference for human genetic variation.Nature. 2015; 526:68–74.CrossrefMedlineGoogle Scholar
McClellan JM, Lehner T, King MC. Gene discovery for complex traits: lessons from Africa.Cell. 2017; 171:261–264. doi: 10.1016/j.cell.2017.09.037CrossrefMedlineGoogle Scholar
Kotowski IK, Pertsemlidis A, Luke A, Cooper RS, Vega GL, Cohen JC, Hobbs HH. A spectrum of PCSK9 alleles contributes to plasma levels of low-density lipoprotein cholesterol.Am J Hum Genet. 2006; 78:410–422. doi: 10.1086/500615CrossrefMedlineGoogle Scholar
Taylor HA, Wilson JG, Jones DW, Sarpong DF, Srinivasan A, Garrison RJ, Nelson C, Wyatt SB. Toward resolution of cardiovascular health disparities in African Americans: design and methods of the Jackson Heart Study.Ethn Dis. 2005; 15(4 suppl 6):S6–S4.MedlineGoogle Scholar
Bild DE, Bluemke DA, Burke GL, Detrano R, Diez Roux AV, Folsom AR, Greenland P, Jacob DR, Kronmal R, Liu K,. Multi-Ethnic Study of Atherosclerosis: objectives and design.Am J Epidemiol. 2002; 156:871–881. doi: 10.1093/aje/kwf113CrossrefMedlineGoogle Scholar
Bouchard C, Leon AS, Rao DC, Skinner JS, Wilmore JH, Gagnon J. The HERITAGE family study. Aims, design, and measurement protocol.Med Sci Sports Exerc. 1995; 27:721–729.CrossrefMedlineGoogle Scholar
Taliun D, Harris DN, Kessler MD, Carlson J, Szpiech ZA, Torres R, Taliun SAG, Corvelo A, Gogarten SM, Kang HM,; NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program.Nature. 2021; 590:290–299. doi: 10.1038/s41586-021-03205-yCrossrefMedlineGoogle Scholar
Ngo D, Sinha S, Shen D, Kuhn EW, Keyes MJ, Shi X, Benson MD, O’Sullivan JF, Keshishian H, Farrell LA,. Aptamer-based proteomic profiling reveals novel candidate biomarkers and pathways in cardiovascular disease.Circulation. 2016; 134:270–285. doi: 10.1161/CIRCULATIONAHA.116.021803LinkGoogle Scholar
Raffield LM, Zakai NA, Duan Q, Laurie C, Smith JD, Irvin MR, Doyle MF, Naik RP, Song C, Manichaikul AW,; NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium, Hematology & Hemostasis TOPMed Working Group*. D-dimer in African Americans: whole genome sequence analysis and relationship to cardiovascular disease risk in the Jackson Heart Study.Arterioscler Thromb Vasc Biol. 2017; 37:2220–2227. doi: 10.1161/ATVBAHA.117.310073LinkGoogle Scholar
Tan A, Abecasis GR, Kang HM. Unified representation of genetic variants.Bioinformatics. 2015; 31:2202–2204. doi: 10.1093/bioinformatics/btv112CrossrefMedlineGoogle Scholar
Das S, Forer L, Schönherr S, Sidore C, Locke AE, Kwong A, Vrieze SI, Chew EY, Levy S, McGue M,. Next-generation genotype imputation service and methods.Nat Genet. 2016; 48:1284–1287. doi: 10.1038/ng.3656CrossrefMedlineGoogle Scholar
Polfus LM, Raffield LM, Wheeler MM, Tracy RP, Lange LA, Lettre G, Miller A, Correa A, Bowler RP, Bis JC,; NHLBI Trans-Omics for Precision Medicine Consortium. Whole genome sequence association with E-selectin levels reveals loss-of-function variant in African Americans.Hum Mol Genet. 2019; 28:515–523. doi: 10.1093/hmg/ddy360CrossrefMedlineGoogle Scholar
Conomos MP, Miller MB, Thornton TA. Robust inference of population structure for ancestry prediction and correction of stratification in the presence of relatedness.Genet Epidemiol. 2015; 39:276–293. doi: 10.1002/gepi.21896CrossrefMedlineGoogle Scholar
Jiang L, Zheng Z, Qi T, Kemper KE, Wray NR, Visscher PM, Yang J. A resource-efficient tool for mixed model association analysis of large-scale data.Nat Genet. 2019; 51:1749–1755. doi: 10.1038/s41588-019-0530-8CrossrefMedlineGoogle Scholar
Sofer T, Zheng X, Gogarten SM, Laurie CA, Grinde K, Shaffer JR, Shungin D, O’Connell JR, Durazo-Arvizo RA, Raffield L,; NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium. A fully adjusted two-stage procedure for rank-normalization in genetic association studies.Genet Epidemiol. 2019; 43:263–275. doi: 10.1002/gepi.22188CrossrefMedlineGoogle Scholar
Yang J, Bakshi A, Zhu Z, Hemani G, Vinkhuyzen AA, Lee SH, Robinson MR, Perry JR, Nolte IM, van Vliet-Ostaptchouk JV,; LifeLines Cohort Study. Genetic variance estimation with imputed variants finds negligible missing heritability for human height and body mass index.Nat Genet. 2015; 47:1114–1120. doi: 10.1038/ng.3390CrossrefMedlineGoogle Scholar
Evans LM, Tahmasbi R, Vrieze SI, Abecasis GR, Das S, Gazal S, Bjelland DW, de Candia TR, Goddard ME, Neale BM,; Haplotype Reference Consortium. Comparison of methods that use whole genome data to estimate the heritability and genetic architecture of complex traits.Nat Genet. 2018; 50:737–745. doi: 10.1038/s41588-018-0108-xCrossrefMedlineGoogle Scholar
Machiela MJ, Chanock SJ. LDlink: a web-based application for exploring population-specific haplotype structure and linking correlated alleles of possible functional variants.Bioinformatics. 2015; 31:3555–3557. doi: 10.1093/bioinformatics/btv402CrossrefMedlineGoogle Scholar
Myers TA, Chanock SJ, Machiela MJ. LDlinkR: An R package for rapidly calculating linkage disequilibrium statistics in diverse populations.Front Genet. 2020; 11:157. doi: 10.3389/fgene.2020.00157CrossrefMedlineGoogle Scholar
Kamat MA, Blackshaw JA, Young R, Surendran P, Burgess S, Danesh J, Butterworth AS, Staley JR. PhenoScanner V2: an expanded tool for searching human genotype-phenotype associations.Bioinformatics. 2019; 35:4851–4853. doi: 10.1093/bioinformatics/btz469CrossrefMedlineGoogle Scholar
Staley JR, Blackshaw J, Kamat MA, Ellis S, Surendran P, Sun BB, Paul DS, Freitag D, Burgess S, Danesh J,. PhenoScanner: a database of human genotype-phenotype associations.Bioinformatics. 2016; 32:3207–3209. doi: 10.1093/bioinformatics/btw373CrossrefMedlineGoogle Scholar
Landrum MJ, Lee JM, Benson M, Brown GR, Chao C, Chitipiralla S, Gu B, Hart J, Hoffman D, Jang W,. ClinVar: improving access to variant interpretations and supporting evidence.Nucleic Acids Res. 2018; 46(D1):D1062–D1067. doi: 10.1093/nar/gkx1153CrossrefMedlineGoogle Scholar
Karczewski KJ, Francioli LC, Tiao G, Cummings BB, Alföldi J, Wang Q, Collins RL, Laricchia KM, Ganna A, Birnbaum DP,; Genome Aggregation Database Consortium. The mutational constraint spectrum quantified from variation in 141,456 humans.Nature. 2020; 581:434–443. doi: 10.1038/s41586-020-2308-7CrossrefMedlineGoogle Scholar
Frankish A, Diekhans M, Ferreira AM, Johnson R, Jungreis I, Loveland J, Mudge JM, Sisu C, Wright J, Armstrong J,. GENCODE reference annotation for the human and mouse genomes.Nucleic Acids Res. 2019; 47(D1):D766–D773. doi: 10.1093/nar/gky955CrossrefMedlineGoogle Scholar
Li X, Li Z, Zhou H, Gaynor SM, Liu Y, Chen H, Sun R, Dey R, Arnett DK, Aslibekyan S,; NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium; TOPMed Lipids Working Group. Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale.Nat Genet. 2020; 52:969–983. doi: 10.1038/s41588-020-0676-4CrossrefMedlineGoogle Scholar
Reilly MP, Li M, He J, Ferguson JF, Stylianou IM, Mehta NN, Burnett MS, Devaney JM, Knouff CW, Thompson JR,; Myocardial Infarction Genetics Consortium; Wellcome Trust Case Control Consortium. Identification of ADAMTS7 as a novel locus for coronary atherosclerosis and association of ABO with myocardial infarction in the presence of coronary atherosclerosis: two genome-wide association studies.Lancet. 2011; 377:383–392. doi: 10.1016/S0140-6736(10)61996-4CrossrefMedlineGoogle Scholar
Wu O, Bayoumi N, Vickers MA, Clark P. ABO(H) blood groups and vascular disease: a systematic review and meta-analysis.J Thromb Haemost. 2008; 6:62–69. doi: 10.1111/j.1538-7836.2007.02818.xCrossrefMedlineGoogle Scholar
Luo M, Ji Y, Luo Y, Li R, Fay WP, Wu J. Plasminogen activator inhibitor-1 regulates the vascular expression of vitronectin.J Thromb Haemost. 2017; 15:2451–2460. doi: 10.1111/jth.13869CrossrefMedlineGoogle Scholar
Dankner R, Ben Avraham S, Harats D, Chetrit A. ApoE genotype, lipid profile, exercise, and the associations with cardiovascular morbidity and 18-year mortality.J Gerontol A Biol Sci Med Sci. 2020; 75:1887–1893. doi: 10.1093/gerona/glz232CrossrefMedlineGoogle Scholar
Assarsson E, Lundberg M, Holmquist G, Björkesten J, Thorsen SB, Ekman D, Eriksson A, Rennel Dickens E, Ohlsson S, Edfeldt G,. Homogenous 96-plex PEA immunoassay exhibiting high sensitivity, specificity, and excellent scalability.PLoS One. 2014; 9:e95192. doi: 10.1371/journal.pone.0095192CrossrefMedlineGoogle Scholar
Olson NC, Butenas S, Lange LA, Lange EM, Cushman M, Jenny NS, Walston J, Souto JC, Soria JM, Chauhan G,. Coagulation factor XII genetic variation, ex vivo thrombin generation, and stroke risk in the elderly: results from the Cardiovascular Health Study.J Thromb Haemost. 2015; 13:1867–1877.CrossrefMedlineGoogle Scholar
Bos MM, Noordam R, Blauw GJ, Slagboom PE, Rensen PCN, van Heemst D. The ApoE ε4 isoform: can the risk of diseases be reduced by environmental factors?J Gerontol A Biol Sci Med Sci. 2019; 74:99–107. doi: 10.1093/gerona/gly226CrossrefMedlineGoogle Scholar
Chyu KY, Lio WM, Dimayuga PC, Zhou J, Zhao X, Yano J, Trinidad P, Honjo T, Cercek B, Shah PK. Cholesterol lowering modulates T cell function in vivo and in vitro.PLoS One. 2014; 9:e92095. doi: 10.1371/journal.pone.0092095CrossrefMedlineGoogle Scholar
Yoshiyama Y, Asahina M, Hattori T. Selective distribution of matrix metalloproteinase-3 (MMP-3) in Alzheimer’s disease brain.Acta Neuropathol. 2000; 99:91–95. doi: 10.1007/pl00007428CrossrefMedlineGoogle Scholar
Immenschuh S, Song DX, Satoh H, Muller-Eberhard U. The type II hemopexin interleukin-6 response element predominates the transcriptional regulation of the hemopexin acute phase responsiveness.Biochem Biophys Res Commun. 1995; 207:202–208. doi: 10.1006/bbrc.1995.1173CrossrefMedlineGoogle Scholar
Vinchi F, Costa da Silva M, Ingoglia G, Petrillo S, Brinkman N, Zuercher A, Cerwenka A, Tolosano E, Muckenthaler MU. Hemopexin therapy reverts heme-induced proinflammatory phenotypic switching of macrophages in a mouse model of sickle cell disease.Blood. 2016; 127:473–486.CrossrefMedlineGoogle Scholar
Belcher JD, Chen C, Nguyen J, Abdulla F, Zhang P, Nguyen H, Nguyen P, Killeen T, Miescher SM, Brinkman N,. Haptoglobin and hemopexin inhibit vaso-occlusion and inflammation in murine sickle cell disease: role of heme oxygenase-1 induction.PLoS One. 2018; 13:e0196455. doi: 10.1371/journal.pone.0196455CrossrefMedlineGoogle Scholar
Jostins L, Ripke S, Weersma RK, Duerr RH, McGovern DP, Hui KY, Lee JC, Schumm LP, Sharma Y, Anderson CA,; International IBD Genetics Consortium (IIBDGC). Host-microbe interactions have shaped the genetic architecture of inflammatory bowel disease.Nature. 2012; 491:119–124. doi: 10.1038/nature11582CrossrefMedlineGoogle Scholar
Gelernter J, Sun N, Polimanti R, Pietrzak RH, Levey DF, Lu Q, Hu Y, Li B, Radhakrishnan K, Aslan M,; Department of Veterans Affairs Cooperative Studies Program (No. 575B); Million Veteran Program. Genome-wide association study of maximum habitual alcohol intake in >140,000 U.S. European and African American veterans yields novel risk loci.Biol Psychiatry. 2019; 86:365–376. doi: 10.1016/j.biopsych.2019.03.984CrossrefMedlineGoogle Scholar
Pillai VB, Sundaresan NR, Kim G, Samant S, Moreno-Vinasco L, Garcia JG, Gupta MP. Nampt secreted from cardiomyocytes promotes development of cardiac hypertrophy and adverse ventricular remodeling.Am J Physiol Heart Circ Physiol. 2013; 304:H415–H426. doi: 10.1152/ajpheart.00468.2012CrossrefMedlineGoogle Scholar
Xiong X, Yu J, Fan R, Zhang C, Xu L, Sun X, Huang Y, Wang Q, Ruan HB, Qian X. NAMPT overexpression alleviates alcohol-induced hepatic steatosis in mice.PLoS One. 2019; 14:e0212523. doi: 10.1371/journal.pone.0212523CrossrefMedlineGoogle Scholar
Loscalzo J, Braunwald E. Tissue plasminogen activator.N Engl J Med. 1988; 319:925–931. doi: 10.1056/NEJM198810063191407CrossrefMedlineGoogle Scholar
Koshida R, Ou J, Matsunaga T, Chilian WM, Oldham KT, Ackerman AW, Pritchard KA. Angiostatin: a negative regulator of endothelial-dependent vasodilation.Circulation. 2003; 107:803–806. doi: 10.1161/01.cir.0000057551.88851.09LinkGoogle Scholar
Li Y, Zeng C, Hu J, Pan Y, Shan Y, Liu B, Jia L. Long non-coding RNA-SNHG7 acts as a target of miR-34a to increase GALNT7 level and regulate PI3K/Akt/mTOR pathway in colorectal cancer progression.J Hematol Oncol. 2018; 11:89. doi: 10.1186/s13045-018-0632-2CrossrefMedlineGoogle Scholar
Zanotti G, Berni R. Plasma retinol-binding protein: structure and interactions with retinol, retinoids, and transthyretin.In: Vitamins & Hormones. Academic Press; 2004: 271–295. http://www.sciencedirect.com/science/article/pii/S0083672904690108. Accessed November 25, 2020.Google Scholar
Arvanitis M, Koch CM, Chan GG, Torres-Arancivia C, LaValley MP, Jacobson DR, Berk JL, Connors LH, Ruberg FL. Identification of transthyretin cardiac amyloidosis using serum retinol-binding protein 4 and a clinical prediction model.JAMA Cardiol. 2017; 2:305–313. doi: 10.1001/jamacardio.2016.5864CrossrefMedlineGoogle Scholar
Tahir UA, Katz DH, Zhao T, Ngo D, Cruz DE, Robbins JM, Chen ZZ, Peterson B, Benson MD, Shi X,. Metabolomic profiles and heart failure risk in black adults: insights from the Jackson Heart Study.Circ Heart Fail. 2021; 14:e007275. doi: 10.1161/CIRCHEARTFAILURE.120.007275LinkGoogle Scholar
Genovese G, Friedman DJ, Ross MD, Lecordier L, Uzureau P, Freedman BI, Bowden DW, Langefeld CD, Oleksyk TK, Uscinski Knob AL,. Association of trypanolytic ApoL1 variants with kidney disease in African Americans.Science. 2010; 329:841–845. doi: 10.1126/science.1193032CrossrefMedlineGoogle Scholar
Ito K, Bick AG, Flannick J, Friedman DJ, Genovese G, Parfenov MG, Depalma SR, Gupta N, Gabriel SB, Taylor HA,. Increased burden of cardiovascular disease in carriers of APOL1 genetic variants.Circ Res. 2014; 114:845–850. doi: 10.1161/CIRCRESAHA.114.302347LinkGoogle Scholar
Bick AG, Akwo E, Robinson-Cohen C, Lee K, Lynch J, Assimes TL, DuVall S, Edwards T, Fang H, Freiberg SM,; VA Million Veteran Program. Association of APOL1 risk alleles with cardiovascular disease in blacks in the Million Veteran Program.Circulation. 2019; 140:1031–1040. doi: 10.1161/CIRCULATIONAHA.118.036589LinkGoogle Scholar
Wang K, Huang R, Li G, Zeng F, Zhao Z, Liu Y, Hu H, Jiang T. CKAP2 expression is associated with glioma tumor growth and acts as a prognostic factor in high-grade glioma.Oncol Rep. 2018; 40:2036–2046. doi: 10.3892/or.2018.6611MedlineGoogle Scholar
Lin P, Pan Y, Chen H, Jiang L, Liao Y. Key genes of renal tubular necrosis: a bioinformatics analysis.Transl Androl Urol. 2020; 9:654–664. doi: 10.21037/tau.2019.11.24CrossrefMedlineGoogle Scholar
Kim S, Eliot M, Koestler DC, Wu WC, Kelsey KT. Association of neutrophil-to-lymphocyte ratio with mortality and cardiovascular disease in the Jackson Heart Study and modification by the Duffy antigen variant.JAMA Cardiol. 2018; 3:455–462. doi: 10.1001/jamacardio.2018.1042CrossrefMedlineGoogle Scholar
Legge SE, Christensen RH, Petersen L, Pardiñas AF, Bracher-Smith M, Knapper S, Bybjerg-Grauholm J, Baekvad-Hansen M, Hougaard DM, Werge T, et al. The Duffy-null genotype and risk of infection. Hum Mol Genet. 2020; 29:3341–3349. doi: 10.1093/hmg/ddaa208
Kulkarni H, Marconi VC, He W, Landrum ML, Okulicz JF, Delmar J, Kazandjian D, Castiblanco J, Ahuja SS, Wright EJ, et al. The Duffy-null state is associated with a survival advantage in leukopenic HIV-infected persons of African ancestry. Blood. 2009; 114:2783–2792. doi: 10.1182/blood-2009-04-215186
Katz DH, Tahir UA, Ngo D, Benson MD, Gao Y, Shi X, Nayor M, Keyes MJ, Larson MG, Hall ME, et al. Multiomic profiling in black and white populations reveals novel candidate pathways in left ventricular hypertrophy and incident heart failure specific to black adults. Circ Genom Precis Med. 2021; 14:e003191. doi: 10.1161/CIRCGEN.120.003191
Duggal P, An P, Beaty TH, Strathdee SA, Farzadegan H, Markham RB, Johnson L, O’Brien SJ, Vlahov D, Winkler CA. Genetic influence of CXCR6 chemokine receptor alleles on PCP-mediated AIDS progression among African Americans. Genes Immun. 2003; 4:245–250. doi: 10.1038/sj.gene.6363950
Picton ACP, Paximadis M, Chaisson RE, Martinson NA, Tiemessen CT. CXCR6 gene characterization in two ethnically distinct South African populations and association with viraemic disease control in HIV-1-infected black South African individuals. Clin Immunol. 2017; 180:69–79. doi: 10.1016/j.clim.2017.04.006
Andersen T, Ueland T, Ghukasyan Lakic T, Åkerblom A, Bertilsson M, Aukrust P, Michelsen AE, James SK, Becker RC, Storey RF, et al. C-X-C ligand 16 is an independent predictor of cardiovascular death and morbidity in acute coronary syndromes. Arterioscler Thromb Vasc Biol. 2019; 39:2402–2410. doi: 10.1161/ATVBAHA.119.312633
Kontorovich AR, Abul-Husn NS. Retinol binding protein 4 as a screening biomarker for hereditary TTR amyloidosis in African American adults with TTR V142I. J Card Fail. 2021; S1071-9164(21)00199-8. doi: 10.1016/j.cardfail.2021.05.009
Churchwell K, Elkind MSV, Benjamin RM, Carson AP, Chang EK, Lawrence W, Mills A, Odom TM, Rodriguez CJ, Rodriguez F, et al; American Heart Association. Call to action: structural racism as a fundamental driver of health disparities: a presidential advisory from the American Heart Association. Circulation. 2020; 142:e454–e468. doi: 10.1161/CIR.0000000000000936
Raffield LM, Dang H, Pratte KA, Jacobson S, Gillenwater LA, Ampleford E, Barjaktarevic I, Basta P, Clish CB, Comellas AP, et al; NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium. Comparison of proteomic assessment methods in multiple cohort studies. Proteomics. 2020; 20:e1900278. doi: 10.1002/pmic.201900278
Benson MD, Ngo D, Ganz P, Gerszten RE. Emerging affinity reagents for high throughput proteomics: trust, but verify. Circulation. 2019; 140:1610–1612. doi: 10.1161/CIRCULATIONAHA.119.039912
Carpenter MA, Crow R, Steffes M, Rock W, Heilbraun J, Evans G, Skelton T, Jensen R, Sarpong D. Laboratory, reading center, and coordinating center data management methods in the Jackson Heart Study. Am J Med Sci. 2004; 328:131–144. doi: 10.1097/00000441-200409000-00001
Gold L, Ayers D, Bertino J, Bock C, Bock A, Brody EN, Carter J, Dalby AB, Eaton BE, Fitzwater T, et al. Aptamer-based multiplexed proteomic technology for biomarker discovery. PLoS One. 2010; 5:e15004. doi: 10.1371/journal.pone.0015004
Levey AS, Stevens LA, Schmid CH, Zhang YL, Castro AF, Feldman HI, Kusek JW, Eggers P, Van Lente F, Greene T, et al; CKD-EPI (Chronic Kidney Disease Epidemiology Collaboration). A new equation to estimate glomerular filtration rate. Ann Intern Med. 2009; 150:604–612. doi: 10.7326/0003-4819-150-9-200905050-00006
Pandey A, Keshvani N, Ayers C, Correa A, Drazner MH, Lewis A, Rodriguez CJ, Hall ME, Fox ER, Mentz RJ, et al. Association of cardiac injury and malignant left ventricular hypertrophy with risk of heart failure in African Americans: the Jackson Heart Study. JAMA Cardiol. 2019; 4:51–58. doi: 10.1001/jamacardio.2018.4300
Marwick TH, Gillebert TC, Aurigemma G, Chirinos J, Derumeaux G, Galderisi M, Gottdiener J, Haluska B, Ofili E, Segers P, et al. Recommendations on the use of echocardiography in adult hypertension: a report from the European Association of Cardiovascular Imaging (EACVI) and the American Society of Echocardiography (ASE). J Am Soc Echocardiogr. 2015; 28:727–754.