Genetic‐Genomic Replication to Identify Candidate Mouse Atherosclerosis Modifier Genes

Objective Genetics plays a large role in atherosclerosis susceptibility in humans and mice. We attempted to confirm previously determined mouse atherosclerosis‐associated loci and use bioinformatics and transcriptomics to create a catalog of candidate atherosclerosis modifier genes at these loci. Methods and Results A strain intercross was performed between AKR and DBA/2 mice on the apoE−/− background generating 166 F2 progeny. Using the phenotype log10 of the aortic root lesion area, we identified 3 suggestive atherosclerosis quantitative trait loci (Ath QTLs). When combined with our prior strain intercross, we confirmed 3 significant Ath QTLs on chromosomes 2, 15, and 17, with combined logarithm of odds scores of 5.9, 5.3, and 5.6, respectively, which each met the genome‐wide 5% false discovery rate threshold. We identified all of the protein coding differences between these 2 mouse strains within the Ath QTL intervals. Microarray gene expression profiling was performed on macrophages and endothelial cells from this intercross to identify expression QTLs (eQTLs), the loci that are associated with variation in the expression levels of specific transcripts. Cross tissue eQTLs and macrophage eQTLs that replicated from a prior strain intercross were identified. These bioinformatic and eQTL analyses produced a comprehensive list of candidate genes that may be responsible for the Ath QTLs. Conclusions Replication studies for clinical traits as well as gene expression traits are worthwhile in identifying true versus false genetic associations. We have replicated 3 loci on mouse chromosomes 2, 15, and 17 that are associated with atherosclerosis. We have also identified protein coding differences and multiple replicated eQTLs, which may be useful in the identification of atherosclerosis modifier genes.

A therosclerosis is a complex disease with both environmental and genetic susceptibility components. The heritability of atherosclerotic coronary artery disease (CAD) in humans is evident from family history being a significant risk factor. 1,2 In addition, genome-wide association studies (GWAS) have identified multiple loci associated with CAD. 3 However, these studies have a tremendous statistical burden to overcome to meet the threshold of genome-wide signifi-cance, and thus much of the genetic contribution may underreported. Also, GWAS do not ascertain rare variants, and it is becoming increasingly clear that rare variants in aggregate can account for a significant portion of population variance for complex traits such as plasma triglycerides. 4 Thus, there is still impetus to identify novel genes and pathways that play a role in atherosclerosis susceptibility. Genetics also plays a role in lesion development in mouse models of atherosclerosis, as different inbred strains have markedly different aortic lesion areas. 5 Mouse models provide an opportunity to tease out the potential genetic modifiers for multigenic phenotypes. We have previously shown that AKR apoE À/À mice have %10-fold smaller aortic root lesions compared with DBA/2 apoE À/À mice when fed a chow diet. 5 A previous intercross between these 2 strains identified 2 significant and 4 suggestive quantitative trait loci (QTLs) for aortic root lesion area. Just as lesion area is a quantitative trait that can be used for gene mapping studies, gene expression levels can likewise be treated as a quantitative trait to map the expression QTLs (eQTLs), or the loci that control the expression of specific transcripts. 6 We had previously performed an eQTL analysis using macrophages from the F2 cohort of the AKR apoE À/À 9DBA/2 apoE À/À strain intercross. 7 Here, we report atherosclerosis (Ath) QTL and eQTL findings from a second independent strain intercross of these same 2 strains.
We found that both significant Ath QTLs in the prior cross were replicated in the new cross, whereas only one of the prior suggestive Ath QTLs was replicated. We carefully excluded analysis of transcriptome data from microarray probes that contained strain-specific sequence polymorphisms, and we still found robust replication of macrophage cis-acting eQTLs between the prior and new crosses. We also observed many cis-eQTLs that were conserved between macrophages and endothelial cells (ECs). However, transacting eQTLs were not well replicated between the 2 crosses, leading us to believe that there is a high falsepositive rate for the identification of trans-eQTLs. We compiled the lists of all protein coding differences between the AKR and DBA/2 strains, as well as the eQTLs, within the replicated Ath QTLs. These genes provide a comprehensive list of candidates that may be responsible for the observed Ath QTLs.

Mouse and Cell Studies
A DBA/2J apoE À/À 9AKR/J apoE À/À reciprocal strain intercross was performed to generate an F 1 cohort and the subsequent F 2 cohort of 89 males and 77 females. The F 2 mice were weaned at 21 days and placed on a chow diet. Mice were killed at 16 weeks of age. Femurs were collected from all mice, and the descending aortas were removed from males for culture of ECs as described later. Tail-tip DNA was prepared from each F 2 mouse by proteinase K digestion followed by ethanol precipitation. Lesion areas of the aortic root were quantified as previously described. 8 Genotyping was performed using Illumina Golden Gate mouse genotyping arrays. Genotyping calls were made using Illumina Genome Studio software. In all, 599 informative markers between AKR and DBA/2 were used for QTL and eQTL analyses.
Bone marrow-derived macrophages (BMMs) were derived as previously described. 7 To obtain cultured ECs from the F2 mice, descending aortas were isolated, cut into 2-mm sections, placed on top of matrigel-coated plates, and grown to confluence (%10 to 14 days) in DMEM supplemented as previously described. 9 Cells were treated with dispase and passaged twice before RNA isolation. This protocol was successful in obtaining EC cultures in 48 of 89 attempts. RNA was isolated using Qiagen's total RNA kits and digested with DNase I for 12 minutes at room temperature to remove genomic DNA. RNA integrity was confirmed by agarose gel electrophoresis. cDNA was synthesized using Illumina protocols and reagents and hybridized on Illumina Mouse Ref-8 v2 microarrays. All expression, phenotype, and genotype data are available in GEO (accession No. GSE35676).

Statistical Methods
Ath QTLs, using the log10 of aortic root lesion areas, were mapped using the R package qtl (R/qtl). 10 False discovery rates (FDRs) were estimated with 100 000 permutations using the scanone function in R/qtl. The Ath QTL CIs were calculated within the same software using the Bayesian credible interval function.
Gene expression data were loaded into the R-package lumi 11 log2 transformed and quantile normalized. eQTLs were mapped using the scanone function in the R/qtl. 10 Probes were mapped using BLAT 12 against the mouse mm9 reference genome. Probes that matched to multiple locations or annotated transcripts from Ensembl release 63 were discarded. Probes containing polymorphisms, either an indel or a single nucleotide polymorphism (SNP), or probes mapping to known structural variants between DBA/2 and AKR strains 13,14 were discarded because they could lead to identification of false cis-eQTLs. 15 This was performed by taking the genomic locations from BLAT in the University of California Santa Cruz Genome Browser and using Tabix 16 to retrieve sequence variants from the Mouse Genome Sequencing Project. 13,14 This filtering resulted in the removal of 2749 probes from the data set.
Human-mouse alignments generated previously by Schwarz et al 17 were used to obtain the human regions corresponding to our mouse Ath QTLs, and within these regions we identified human GWAS loci 18 related to CAD.
To compare eQTLs from the current and previous studies, 7 matches between Affymetrix and Illumina's probes were provided by Illumina (http://www.switchtoi.com/ probemapping.ilmn). To perform the combined cross-QTL analyses, the SNPs from each cross were imput to each other using the fill.geno function in R/qtl, with the simple assumption of no double crossovers. In the prior cross, there were 1947 markers, whereas in the new cross, there were 599 markers, of which only 170 overlapped between the 2, leading to 2376 total markers. There is no metric for imputation quality, but we found that using the original set of markers in the second cross versus the combined set of markers did not greatly alter QTL location or strength. A liberal 20-Mb window was used between studies to determine if an eQTL overlapped. A combined eQTL analysis was done with data from both studies, using cross membership as an additive covariate and sex as an interactive covariate.

Results and Discussion
Atherosclerosis QTL Replication in a New Cross F 2 mice were generated using a reciprocal cross strategy from apoE À/À mice on the AKR and DBA/2 backgrounds. Lesion areas in the aortic root were quantified in 166 F 2 mice (77 female and 89 male). A genome scan was performed for each F 2 mouse using 599 informative SNPs. We defined significant QTLs as those that have genome-wide FDRs of <10%, by permutation analysis. Combining both sexes and using sex as an interactive covariate, we identified 3 suggestive Ath QTLs at a logarithm of odds (LOD) score threshold of 2.0 on chromosomes 2, 15, and 17 with FDRs of 15%, 30%, and 16%, respectively (Table 1). Although these Ath QTLs only met the suggestive threshold due to small sample size, each of these Ath QTLs were detected in a prior AKR9DBA/2 strain intercross 19 : Ath28, a suggestive QTL on chromosome 2; Ath22, a significant QTL on chromosome 15; and Ath26, a significant QTL on chromosome 17. We performed a combined Ath QTL analysis using both crosses with cohort membership as an additive covariate and sex as an interactive covariate. As the platform and markers used to genotype the 2 crosses differed, R/qtl 20 was used to impute the genotypes between the 2 crosses. On chromosome 2, Ath28 was replicated and had a combined LOD score of 5.9; on chromosome 15, Ath22 was replicated with a combined LOD score of 5.3; and on chromosome 17, Ath26 was replicated with a combined LOD score of 5.6 ( Figure 1). All of these combined LOD scores met the genome-wide FDR threshold of <10%, and in fact they all had <5% FDR. However, the suggestive Ath QTLs identified in the first cross using sex as an interactive covariate on chromosomes 5, 3, and 13 19 were not replicated in the second cross. The approximate 95% Bayesian credible interval was obtained for all 3 loci (Table 1). Thus, the clinical trait mouse QTL model is partially reproducible for a phenotype as complex as lesion area, which has a fairly large coefficient of variation within inbred strains.
We identified the human chromosome segments orthologous to these mouse loci. We examined whether these human orthologous regions contained common genetic variants associated with CAD, myocardial infarction, or related risk factors by searching the National Human Genome Research Institute GWAS catalog 18 (Table 2). The known atherosclerosis-related risk factors included blood lipid levels, subclinical atherosclerosis, type 2 diabetes, and hypertension. There were no human CAD GWAS hits in the region in synteny with Ath28, although %21% of the Ath28 interval on chromosome 2 displayed no synteny due to complex expansion in the mouse genome after species divergence. The Ath22 locus on chromosome 15 contains the corresponding segment on human chromosome 5 that has been associated with subclinical atherosclerosis. 21 The Ath26 locus on chromosome 17 corresponds to human chromosomes 6 (primarily) and 19, including the major histocompatibility complex locus, and overlaps with multiple human GWAS loci for CAD and related risk factors ( Table 2).  (LOD) plots for the prior and new crosses of AKR apoE À/À and DBA/2 apoE À/À mice, respectively. The black line shows the LOD plot for the combined analysis using cross as an additive covariate. In all analyses, sex was used as an interactive covariate.

Protein Coding Differences Between AKR and DBA/2 Mice Residing in Ath QTLs
We used a variety of bioinformatic and genomic methods to identify candidate genes that may be responsible for the 3 replicated Ath QTLs. Using the mouse sequence data from 15 common inbred strains, 13,14 we identified all of the nonsynonymous protein changes in these 3 loci. These strain variable genes on chromosomes 2 (11 genes), 15 (23 genes), and 17 (258 genes) are listed in Table S1, with many genes having >1   amino acid substitution between these 2 strains. We identified many more strain variant genes for Ath26 on chromosome 17, because it is a very large 52-Mb gene-dense interval that contains the highly polymorphic mouse H2 major histocompatibility region. After exclusion of the major histocompatibility genes, we used Polyphen 2 22 to ascertain in silico the likelihood that each protein coding change would impair protein function, and we found numerous potential protein functional differences between the 2 strains (Tables 3 and S1). On chromosome 2, only 3 protein changes were predicted to be damaging in 1 strain relative to the other strain. Zbp1 is a Z-DNA binding protein, Tcfap2c is an AP-2 transcription factor involved in early development, and Zfp663 is a zinc-finger protein; none of these proteins have been previously implicated in atherosclerosis susceptibility. On chromosome 15, there were several protein coding changes that are predicted to be detrimental in 1 strain versus the other. Two components of the complement system, C6 and C7, have predicted functional differences, both with the AKR version being detrimental. The complement system has potential roles in cardiovascular disease as previously reviewed. 23 On chromosome 17, there were 50 genes with predicted detrimental changes, and many additional changes in major histocompatibility complex genes, that were not subjected to the Polyphen analysis. Some notable protein changes were found in Rab44, Collagen 11a2, and Notch4, which can alter cellular vesicular trafficking, extracellular matrix, and signal transduction, respectively, all potential atherosclerosis modifiers.

eQTLs in Bone Marrow-Derived Macrophages and Endothelial Cells
The global profile of gene expression in bone marrow-derived macrophages was assayed using Illumina microarrays from 79 female and 81 male F 2 mice. With the same 599 SNPs used to map the Ath QTLs, we mapped eQTLs, or loci that are associated with the expression of each transcript, using sex as an additive covariate. An eQTL was defined as a cis-eQTL if the eQTL mapped within 20 Mb of the probe position. A trans-eQTL is defined as the QTL mapping anywhere else on the genome. To eliminate spurious eQTLs, we filtered out 2479 expression array probes that contained an SNP or insertion/ deletion between these 2 strains, which could lead to altered probe hybridization impairing an accurate measure of gene expression. We validated that these strain variant probes would indeed lead to false cis-eQTLs with on average an LOD score that was double the LOD score of comparable probes containing no variant (LOD 14.7 versus 7.6, P<2.2910 À6 ). In addition, the transcripts with the nonreference SNP allele overlapping the probes were overwhelmingly called with lower expression values versus the transcripts containing the reference allele ( Figure S1). After filtering out probes that were not expressed above background in ≥25% of the samples, 9600 probes were evaluated for eQTLs. We used a stringent FDR cut-off of 5% to identify cis-eQTLs, which corresponded to an LOD score threshold of 2.4 and found 937 cis-eQTLs (Table S2). Table 4 shows the top 25 cis-eQTLs ranked by LOD score. Because trans-eQTLs are indirect, and often not as strong as cis-eQTLs, we applied both a liberal FDR cutoff of 30% and a stringent 5% cutoff. The 30% and 5% FDR thresholds corresponded to LOD scores of 2.81 and 3.75, respectively, with 3797 and 551 trans-eQTLs identified, respectively (Table S3). Table 5 shows the top 25 trans-eQTLs ranked by LOD score. Lusis has proposed that mouse strain effects on EC function may underlie some strain effects on atherosclerosis. 24 We successfully cultured primary aortic ECs from 48 male F 2 mice used in the atherosclerosis study and assayed global gene expression by microarray. As expected, these cells expressed high levels of canonical EC transcripts encoding the proteins Tie2, the Vegf receptors, and von Willebrand factor, all of which were lowly expressed in BMMs. We applied the same FDR thresholds as in the macrophage analysis to identify EC cis-and trans-eQTLs. At the 5% FDR threshold, corresponding to an LOD score of 2.47, we identified 440 cis-eQTLs (Table S4 and top 25 in Table 6). For trans-eQTLs, the 30% and 5% FDR thresholds corresponded to LOD scores of 2.70 and 3.92, with 4894 and 365 trans-eQTLs identified, respectively (Table S5 and top 25 in Table 7). To evaluate cross-tissue eQTLs, we counted the number of cis-eQTLs (5% FDR) and trans-eQTLs (30% FDR) that were found in both macrophages and ECs. We identified 156 cis-eQTLs (Table S6) common in both tissues, although our power was limited by the relatively small number of EC samples. We identified 12 cross-tissue cis-eQTLs that were located in the 3 replicated Ath QTLs in chromosomes 2, 15, and 17 (Table 8). An example of a cross-tissue cis-eQTL within an Ath QTL interval is Sys1, coding for the Golgi-localized integral membrane protein homolog (Figure 2). In contrast to the 156 cross-tissue cis-eQTLs, there were only 18 cross-tissue trans-eQTLs at the 30% FDR threshold that overlapped the 2 tissues (Table 9). A replicated trans-eQTL is defined one in which the trans-eQTL markers map within 10 Mb of each other. On inspection, it appears that 3 of these cross-tissue trans-eQTLs on chromosome 7 may in fact be cis-eQTLs, because the positions of the gene and the markers were on the chromosome 7 and only slightly greater that the 20-Mb cutoff used to classify cis-eQTLs. The low number of crosstissue trans-eQTLs has been noted in previous studies. 25 One of these cross-tissue trans-eQTLs mapped to the Ath22 interval on chromosome 15, which was associated with the expression of the Klf2 transcription factor on chromosome 8.

Macrophage eQTL Replication Between Different Crosses and Different Platforms
A macrophage eQTL study was performed in the previous AKR9DBA/2 F 2 intercross; however, different genetic   markers and different expression array platforms were used.
To examine replication of macrophage eQTLs between the current and previous study, we reanalyzed the prior data by imputing to the currently used 599 strain-specific SNPs and mapping the Affymetrix gene expression probes to the currently used Illumina probes. After filtering out probes not mapped to the Illumina platform or those that were excluded in our new cross, only 5678 probes remained for analysis. We then performed the eQTL analysis of the prior dataset using sex as an additive covariate and obtained cis-and trans-eQTLs at the same FDRs as described earlier (summary statistics in Table 10). Of the 738 and 482 cis-eQTLs identified in the prior and new crosses, respectively, 265 were replicated, representing 36% and 55% of the input cis-eQTLs in the old and new cross, respectively ( Figure 3, Table S7). The cis-eQTL replication percentage range (36% to 55%) in our study is somewhat lower than that of previously published replication study by van Nas et al that found a cis-eQTL replication rate of %50% to 60%. 25 However, van Nas et al used the same platform and genotyping markers across their 2 studies, whereas we used separate platforms. In addition, van Nas et al probably overestimated the replication rate, because they did not remove probes containing strain polymorphic SNPs as we did in our study. We demonstrated that inclusion of the strainpolymorphic probes leads to strong false-positive eQTLs. Sys1 not only had a cross-tissue cis-eQTL, but it is also an example of a cross-study replicated cis-eQTL in BMMs (Figure 2). The SNP rs3671849, within the Ath28 locus, displayed a strong additive effect on the expression of Sys1, with the DBA/2 allele expressed higher. This marker was associated with 51% and 42% of the variation in BMM Sys1 gene expression in the new and prior crosses, respectively, and 63% of the variation in EC Sys1 gene expression. There were only 5 trans-eQTLs that replicated between the 2 crosses, or 0.9% and 0.6% of the old cross and new cross trans-eQTLs, respectively, at a 5% FDR LOD score cutoff (Table 11). The LOD plots and allele effects on gene expression for the Lamb2 gene, which had a replicated trans-eQTL, is shown in Figure 4. Relaxing the FDR to 30% in both crosses resulted in 23 trans-eQTLs that replicated between the studies, or 6% and 4% of the old cross and new cross trans-eQTLs, respectively. This is lower than the %19% trans-eQTL replication rate observed by van Nas et al; however, the same caveats apply to our analysis concerning our use of 2 expression array and SNP platforms. 25 As an alternative to examining replication of eQTLs, we combined the data from both F 2 cohorts and performed a combined analysis of cis-and trans-macrophage eQTLs using sex and cross as additive covariates. The combined method has more power to identify eQTLs than the replication method because it uses a larger sample size and thus is not penalized by a near-miss false-negative result in 1 of the 2 crosses. In the combined analysis, there were 783 cis-eQTLs at a 5% FDR threshold (Tables 10 and S8). An example of a significant cis-eQTL found in the combined analysis, but not in the replicated analysis, is an eQTL for Wdr70, a WD40 repeat adapter protein. In the combined analysis, there were 160 cis-eQTLs that were found that were not found in either analysis. Furthermore, there were 703 and 3158 trans-eQTLs at the 5% and 30% FDR thresholds in the combined analysis, respectively (Tables 10 and S9).
We systematically searched for replicated eQTLs within the Ath QTL regions to identify potential atherosclerosis modifier candidate genes. In total, there were 14 genes that met this criterion, and for each we determined the correlation of macrophage gene expression and lesion area within the F 2 mice of the prior and current crosses. Twelve of these correlations had conserved directions in the 2 crosses (Table 12). At the Ath28 locus on chromosome 2, we identified 2 replicated macrophage cis-eQTLs, of which Sys1 may have a connection to cholesterol ester metabolism. Sys1, whose expression was positively associated with lesion area, is a Golgi-localized integral membrane protein that is essential for the targeting for several proteins to the Golgi complex and   Prss22 is a serine protease that converts prourokinase-type plasminogen activator into its enzymatically active form, abbreviated as uPA. 28 We found that Prss22 expression was inversely correlated with atherosclerosis; thus, we would predict that uPA activity may also be inversely correlated with atherosclerosis. However, this is not the case, as previous studies have shown that macrophage expression of uPA is positively associated with atherosclerosis in apoEdeficient mice. 29,30 Ltb, encoding lymphotoxin-b (a member of the tumor necrosis factor gene family), resides in the Ath26 locus, and its expression was positively correlated with lesion area. Lymphotoxin-b receptor signaling in the arterial media beneath atherosclerotic plauques has been found to promote tertiary lymphoid organogenesis. 31 In addition, circulating levels of lymphotoxin-b receptor in humans were positively associated with coronary artery calcium scores. 32 However, it is difficult to interpret whether these findings are relevant to our observed correlation of macrophage Ltb expression and lesion area. None of the other replicated eQTLs at the Ath loci had obvious known connections to pathways implicated in atherosclerosis.

Conclusions
We found that phenotypic QTLs for the complex trait of atherosclerosis were partially reproducible. Of the 6 Ath QTLs indentified in the prior cross, the 2 significant QTLs replicated, as did 1 of 4 suggestive QTLs in a combined analysis with the new cross. In the new and smaller cross, all 3 of the suggestive Ath QTLs were found in the prior cross. Based on these results, it may be prudent to replicate phenotypic QTLs before embarking on extensive gene discovery and fine mapping studies. We also report here than many cis-eQTLs can be replicated in independent crosses even when the genotyping and gene expression platforms used differed between the studies. This is not unexpected because the cis-eQTLs are direct and often have very strong effects on gene expression. However, we found a lower rate of trans-eQTL replication compared with another replication study. 25,33 Our conclusion is that many trans-eQTLs identified in mouse studies may be false positives, or very sensitive to environmental effects, making replication less likely.

Sources of Funding
This work was supported by National Institutes of Health (NIH), National Heart, Lung, and Blood Institute grant