Genome-Wide Analysis of Left Ventricular Maximum Wall Thickness in the UK Biobank Cohort Reveals a Shared Genetic Background With Hypertrophic Cardiomyopathy

Background: Left ventricular maximum wall thickness (LVMWT) is an important biomarker of left ventricular hypertrophy and provides diagnostic and prognostic information in hypertrophic cardiomyopathy (HCM). Limited information is available on the genetic determinants of LVMWT. Methods: We performed a genome-wide association study of LVMWT measured from the cardiovascular magnetic resonance examinations of 42 176 European individuals. We evaluated the genetic relationship between LVMWT and HCM by performing pairwise analysis using the data from the Hypertrophic Cardiomyopathy Registry in which the controls were randomly selected from UK Biobank individuals not included in the cardiovascular magnetic resonance sub-study. Results: Twenty-one genetic loci were discovered at P<5×10⁻⁸. Several novel candidate genes were identified including PROX1, PXN, and PTK2, with known functional roles in myocardial growth and sarcomere organization. The LVMWT genetic risk score is predictive of HCM in the Hypertrophic Cardiomyopathy Registry (odds ratio per SD: 1.18 [95% CI, 1.13–1.23]) with pairwise analyses demonstrating a moderate genetic correlation (rg=0.53) and substantial loci overlap (19/21). Conclusions: Our findings provide novel insights into the genetic underpinning of LVMWT and highlight its shared genetic background with HCM, supporting future endeavours to elucidate the genetic etiology of HCM.


UK Biobank sample selection
We excluded CMR studies with incomplete LV coverage (N slices < 6) or inadequate segmentation quality (number of segmented voxels < 150 for the basal slices and < 50 for the apical slices). Additionally, all CMR studies with (i) outlying values, defined as more than 3 × the interquartile range below the first quartile or above the third quartile, for unindexed, body-surface-area-indexed and height-indexed LV volumes, LV mass and LVMWT; (ii) nonphysiological measurements (LV end-diastolic volume < 75 ml, LV end-systolic volume < 25 ml and LV mass < 40 g); or (iii) LVMWT ≥ 13 mm were visually reviewed by two European Association of Cardiovascular Imaging (EACVI) level-3 certified analysts (N.A. and L.R.L.) to ascertain the accuracy of measurements.
We next applied a series of genotypic quality control (QC) checks to individuals with good-quality CMR studies, as outlined in Supplemental Figure 4. We removed individuals with missing genotype data (N = 1,066), discordance between self-reported and genetically inferred sex (N = 26), poor genotype quality (N = 79) and non-European ancestry (N = 1,492).
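The rules used to flag CMR studies for visual review can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function names are hypothetical, only unindexed measurements are checked for brevity, the nonphysiological thresholds are assumed to apply individually (any one triggers review), and the quartiles use linear interpolation, which may differ from the original software.

```python
from statistics import quantiles


def iqr_fences(values, k=3.0):
    """Return (lower, upper) fences at k x IQR beyond the first and third quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def is_outlier(values, k=3.0):
    """Flag each value lying below the lower fence or above the upper fence."""
    lower, upper = iqr_fences(values, k)
    return [v < lower or v > upper for v in values]


def needs_visual_review(lvmwt_mm, lvedv_ml, lvesv_ml, lvmass_g):
    """Flag studies meeting any review criterion (assumed 'any of', see lead-in)."""
    outlier_cols = [is_outlier(col) for col in (lvmwt_mm, lvedv_ml, lvesv_ml, lvmass_g)]
    flags = []
    for i in range(len(lvmwt_mm)):
        outlying = any(col[i] for col in outlier_cols)
        nonphysiological = lvedv_ml[i] < 75 or lvesv_ml[i] < 25 or lvmass_g[i] < 40
        thick_wall = lvmwt_mm[i] >= 13
        flags.append(outlying or nonphysiological or thick_wall)
    return flags
```

In practice the same outlier check would also be run on the body-surface-area-indexed and height-indexed values before combining the flags.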

Genetic analyses
We first selected a set of high-quality directly genotyped variants, known as model single nucleotide polymorphisms (model SNPs), by applying the following filters: minor allele frequency (MAF) > 5%, a Hardy-Weinberg equilibrium (HWE) P value threshold of 1 × 10⁻⁶, and missingness < 0.0015. The LVMWT phenotype was normalised using rank-based inverse normal transformation as it showed evidence of positive skewness (Supplemental Figure 5). We estimated the heritability explained by genotype (h²SNP) using BOLT-REML software 52. We next performed the discovery GWAS using a linear mixed model in BOLT-LMM software 53 with the model SNPs (329,861 variants) and ~9.9 million imputed variants with MAF ≥ 1% and INFO > 0.3. Both heritability and GWAS models were adjusted for age, sex, body surface area, mean arterial blood pressure corrected for antihypertensive medication use (by adding 15 mmHg to systolic blood pressure and 10 mmHg to diastolic blood pressure), genotyping array type (UK Biobank versus UK BiLEVE array), and imaging centre. Genome-wide significance was defined as a GWAS P value < 5 × 10⁻⁸. We defined a genomic locus as the region extending 500 kb upstream and downstream of the lead variant, that is, the variant with the smallest P value.
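As an illustration of the rank-based inverse normal transformation applied to the phenotype, the following Python sketch maps values to normal scores. It assumes the Blom offset (c = 3/8) and average ranks for ties; the source does not state which variant was used, and the function names are hypothetical.

```python
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_inverse_normal(values, c=3 / 8):
    """Map values to standard normal quantiles of (rank - c) / (n - 2c + 1)."""
    n = len(values)
    std_normal = NormalDist()
    return [
        std_normal.inv_cdf((rank - c) / (n - 2 * c + 1))
        for rank in average_ranks(values)
    ]
```

The transformation preserves the ordering of individuals while removing the positive skew, so a few very thick-walled hearts cannot dominate the association statistics.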

Conditional analysis
We examined the existence of secondary independent variants tagging the same GWAS loci by performing conditional analysis in genome-wide complex trait analysis (GCTA) software 54. A secondary signal was declared if: (i) the newly identified variant's original GWAS P value was lower than 1 × 10⁻⁶; (ii) there was < 1.5-fold difference between the lead variant and secondary association P values on a -log10 scale (i.e., if -log10(Plead)/-log10(Psec) < 1.5); and (iii) there was < 1.5-fold difference between the main association and conditional association P values on a -log10 scale (i.e., if -log10(Psec)/-log10(Pcond) < 1.5).
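The three criteria for declaring a secondary signal can be expressed as a short check. This is a sketch only: the function and argument names are hypothetical, and the guard for a conditional P value of 1 (where -log10 is zero) is an added assumption rather than part of the stated rule.

```python
import math


def is_secondary_signal(p_lead, p_sec, p_cond, p_max=1e-6, max_fold=1.5):
    """Apply criteria (i)-(iii) to one candidate secondary variant.

    p_lead: GWAS P value of the lead variant at the locus
    p_sec:  original GWAS P value of the candidate secondary variant
    p_cond: P value of the candidate after conditioning on the lead variant
    """
    log_lead = -math.log10(p_lead)
    log_sec = -math.log10(p_sec)
    log_cond = -math.log10(p_cond)
    if log_sec <= 0 or log_cond <= 0:
        # Assumed guard: a P value of 1 carries no evidence and avoids division by zero.
        return False
    return (
        p_sec < p_max                        # (i) original P value below 1e-6
        and log_lead / log_sec < max_fold    # (ii) lead vs secondary
        and log_sec / log_cond < max_fold    # (iii) main vs conditional
    )
```

For example, a candidate with P = 1 × 10⁻⁹ at a locus whose lead variant has P = 1 × 10⁻¹², and a conditional P = 1 × 10⁻⁸, gives ratios of 12/9 and 9/8 and therefore satisfies all three criteria.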

Percent variance
The proportion of variance explained by the genome-wide significant LVMWT loci was calculated as the difference in adjusted R² between the linear regression model containing all covariates plus all lead variants and the model containing only the analysis covariates.
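The calculation amounts to fitting two nested ordinary least-squares models and subtracting their adjusted R² values. The NumPy sketch below illustrates this under simplifying assumptions: the function names are hypothetical, and the inputs are plain numeric arrays with no missing data.

```python
import numpy as np


def adjusted_r2(y, predictors):
    """Adjusted R-squared of an OLS fit of y on the predictors plus an intercept."""
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), predictors])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    n, p = design.shape  # p counts the intercept column
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p)


def variance_explained(y, covariates, lead_variants):
    """Adjusted R-squared gained by adding lead-variant dosages to the covariates."""
    full = adjusted_r2(y, np.column_stack([covariates, lead_variants]))
    reduced = adjusted_r2(y, covariates)
    return full - reduced
```

Using adjusted rather than raw R² penalises the full model for its extra parameters, so adding uninformative variants does not inflate the estimate.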

Variant annotation
99% credible sets were created for each genome-wide significant locus using the Bayesian approach previously described by Wakefield 55. We used the Variant Effect Predictor (VEP) tool (v103) 56 to describe the type, consequence and predicted function, based on SIFT and PolyPhen-2, of all variants in the 99% credible sets. Non-synonymous variants were considered 'damaging' if concordantly predicted as having detrimental effects by both SIFT and PolyPhen-2. For non-coding variants, the CADD (v1.6) 57 and RegulomeDB (v2.0) 58 databases were used to ascertain functional relevance. Variants with a scaled CADD score > 20 or a RegulomeDB score ≤ 3 were considered functionally important. Fine-mapping of causal variants within credible sets was performed by colocalisation analysis of GWAS and cis-eQTL signals from cardiovascular tissues (aorta, coronary artery, left atrial appendage and left ventricle) in GTEx (v8) 59 using the coloc R package 60.
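A credible set of this kind can be sketched from summary statistics using Wakefield's approximate Bayes factor (ABF). The Python example below is illustrative only. It assumes a single causal variant per locus, equal prior probability for every variant, and a prior effect-size standard deviation of 0.15, none of which is specified in the text; the function names and variant identifiers are hypothetical.

```python
import math


def wakefield_log_abf(beta, se, prior_sd=0.15):
    """Log approximate Bayes factor in favour of association for one variant."""
    v = se ** 2        # sampling variance of the effect estimate
    w = prior_sd ** 2  # assumed prior variance of the true effect
    z2 = (beta / se) ** 2
    return 0.5 * math.log(v / (v + w)) + 0.5 * z2 * w / (v + w)


def credible_set(variants, coverage=0.99, prior_sd=0.15):
    """Return the smallest set of (variant_id, posterior) reaching the coverage.

    variants: list of (variant_id, beta, se) tuples for one locus.
    """
    log_abfs = [wakefield_log_abf(beta, se, prior_sd) for _, beta, se in variants]
    largest = max(log_abfs)
    # Subtract the maximum before exponentiating to avoid overflow.
    weights = [math.exp(value - largest) for value in log_abfs]
    total = sum(weights)
    posteriors = [(variants[i][0], weights[i] / total) for i in range(len(variants))]
    posteriors.sort(key=lambda item: item[1], reverse=True)

    selected, cumulative = [], 0.0
    for variant_id, posterior in posteriors:
        selected.append((variant_id, posterior))
        cumulative += posterior
        if cumulative >= coverage:
            break
    return selected
```

A locus with one dominant variant yields a credible set of size one, whereas a locus with several variants in tight linkage disequilibrium and similar test statistics yields a larger set that annotation tools must then prioritise.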

Pleiotropy analyses
We queried the lead variants and their close proxies (linkage disequilibrium [LD] r² ≥ 0.8) against PhenoScanner (v2) 61 and the GWAS Catalog (queried on 22nd August 2021) 62 to investigate trait pleiotropy. We also investigated the overlap of loci discovered in our LVMWT GWAS and a recently published LV mean wall thickness GWAS 24 in the same UKB cohort.

Long-range chromatin interaction (Hi-C) analysis
To identify distal candidate genes, we explored chromatin interaction (promoter capture Hi-C) data 63 for all lead and secondary variants and their proxies with regulatory potential (RegulomeDB score ≤ 3). We identified several genes whose promoter regions form significant chromatin interactions in heart atria, ventricles, and aorta.
Candidate genes at each locus were then collated based on evidence from: 6. genes which are nearest to or located within the 10 kb window of the lead variant or the region of LD block r² > 0.5, using the University of California, Santa Cruz (UCSC) known genes database

Gene-set and pathway enrichment analyses
We performed an unbiased gene-set and pathway enrichment analysis of GWAS signals in DEPICT using the LVMWT GWAS summary statistics with an association P value threshold of 1 × 10⁻⁵, as recommended by the authors of DEPICT. Additionally, we used the g:Profiler tool 65 to investigate whether our prioritised genes (supported by at least 2 lines of evidence as described above) are over-represented in a particular biological pathway. In g:Profiler, multiple-testing correction was performed by the bespoke ontology-focused g:SCS (Set Counts and Sizes) method 66 at a 5% threshold.

Phenome-wide association study
We examined the associations between the variants for LVMWT and other phenotypes by conducting a phenome-wide association study (PheWAS) in unrelated individuals of European ancestry from UK Biobank (N = 343,849) not included in our LVMWT GWAS. Logistic regression analyses adjusted for age, sex and the first ten genetic principal components (PCs) were performed for each locus-specific weighted allele score and 1,513 outcome phenotypes based on the hospital episode statistics defined according to the phecode system as previously described 67. Bonferroni correction was applied to the PheWAS P values (adjusted P < 3.3 × 10⁻⁵).
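The reported significance threshold is consistent with a Bonferroni correction across the 1,513 phecodes at a family-wise error rate of 0.05. The error rate is an assumption here, since the text gives only the resulting threshold, and the function name is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold controlling the family-wise error rate."""
    return alpha / n_tests


# Assumed family-wise alpha of 0.05 across the 1,513 outcome phenotypes.
threshold = bonferroni_threshold(0.05, 1513)
print(f"{threshold:.2e}")  # prints 3.30e-05
```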