Systems Biology With High-Throughput Sequencing Reveals Genetic Mechanisms Underlying the Metabolic Syndrome in the Lyon Hypertensive Rat
The metabolic syndrome (MetS) is a collection of co-occurring complex disorders including obesity, hypertension, dyslipidemia, and insulin resistance. The Lyon hypertensive and Lyon normotensive rats are models of MetS sensitivity and resistance, respectively. To identify genetic determinants and mechanisms underlying MetS, an F2 intercross between Lyon hypertensive and Lyon normotensive was comprehensively studied.
Methods and Results—
Multidimensional data were obtained including genotypes of 1536 single-nucleotide polymorphisms, 23 physiological traits, and >150 billion nucleotides of RNA-seq reads from the livers of F2 intercross offspring and parental rats. Phenotypic and expression quantitative trait loci (eQTL) were mapped. Application of systems biology methods identified 17 candidate MetS genes. Several putative causal cis-eQTL were identified corresponding with phenotypic QTL loci. We found an eQTL hotspot on rat chromosome 17 that is causally associated with multiple MetS-related traits and found RGD1562963, a gene regulated in cis by this eQTL hotspot, as the most likely eQTL driver gene directly affected by genetic variation between Lyon hypertensive and Lyon normotensive rats.
Our study sheds light on the intricate pathogenesis of MetS and demonstrates that systems biology with high-throughput sequencing is a powerful method to study the pathogenesis of complex genetic diseases.
The metabolic syndrome (MetS), characterized by hypertension, central obesity, dyslipidemia, and insulin resistance causes increased mortality because of cardiovascular and renal disease.1,2 The pathogenesis of the MetS is complex and is a collection of multifactorial traits each involving environmental and genetic interactions. Although twin studies, familial segregation, and intercorrelation analyses have all supported the existence of strong genetic influences on the MetS,3–5 the majority of the genetic determinants of the MetS remain to be identified.
Clinical Perspective on p 326
Identification of the genetic contribution to complex disease is aided by integrated studies involving genetic variation, transcriptional regulation, and animal models. The Lyon hypertensive (LH) rat was selectively bred for high blood pressure6; however, it has several features common to human MetS including high body weight, cholesterol, and triglycerides, increased insulin and insulin/glucose ratio, and high blood pressure.7 The Lyon normotensive (LN) control strain, concurrently bred for normal blood pressure from the same Sprague Dawley colony, is genetically similar to the LH; however, this strain is lean, has normal plasma lipids, and is normotensive. Consequently, the LH and LN strains are really to be considered as the Lyon MetS sensitive and Lyon MetS resistant rat strains, respectively, and represent a simplified multigenic model for better understanding the pathological links between MetS and its associated risk for cardiovascular disease.
Previous studies identified quantitative trait loci (QTL) related to MetS phenotypes in the LH rat, including blood pressure, body weight, plasma lipids, glucose, and insulin, among others.8 Although mapping QTL provide valuable information, integrated systems genetics approaches can complement positional cloning approaches to identify novel genes causing complex traits (reviewed in ref. 9). In this study, we used a systems genetics approach to identify genes involved in traits defining MetS in the LH rat. In a single F2 intercross between the LH and LN rat strains, we determined MetS phenotypes as well as liver mRNA levels by RNA-seq. Combined mapping of phenotypic QTL (pQTL) and expression QTL (eQTL) followed by coexpression network analyses led to strong candidate genes underlying the metabolic dysfunction in the LH rat.
Materials and Methods
Expanded methods are available in the Data Supplement, as indicated below.
LH/MRrrcAek, LN/MRrrcAek, and their F1 and F2 progeny were bred and maintained in an approved animal facility on a 12-hour light-dark cycle at the University of Iowa. The rats were provided chow (Teklad 7913) and water ad libidum unless otherwise noted as part of the experimental protocol. LH males were bred to LN females to produce F1 offspring, which were brother-sister mated to generate 169 F2 male progeny to be used in this study. All animal protocols were approved by the Institutional Animal Care and Use Committee (IACUC) at the University of Iowa.
DNA was isolated from tail or spleen samples from each F2 offspring and parental rats (DNEasy Blood and Tissue kit, Valencia, CA). Single-nucleotide polymorphism (SNP) genotyping was determined using a custom 1536 Illumina SNP chip, enriched with 453 SNPs selected to tag all haplotypes differing between the LH and LN genomes (Table I and Methods in the Data Supplement).10 Genotyping was performed according to manufacturer specifications at GeneSeek (Neogen Corporation). Only SNP calls that were polymorphic between LH and LN strains, and with GenCall scores >0.4 were accepted for analysis.
Beginning at 3 weeks of age, parental and F2 rats entered a 12-week phenotyping protocol (Table II and Methods in the Data Supplement). Briefly, body weight, blood pressure (by telemetry), and plasma measures of lipids, glucose, and leptin were determined. Ear and tail snips were taken for DNA isolation, and tissues were harvested at the end of the protocol. For each phenotype, differences between the F2 offspring and parental strains were determined using ANOVA. Phenotypic (p)QTL were mapped to the genome using R/qtl and the SNP genotypes determined in the F2 rats.11 Multiple testing error was corrected for by permutation testing (see Methods in the Data Supplement).
Total RNA from liver tissue obtained after completing the 12-week phenotyping protocol from LH (n=6), LN (n=6), and F2 (n=36) rats was sequenced on an Illumina HiSeq 2000 to produce ≈30 million 51-bp paired-end reads per sample (Table III in the Data Supplement). The sequenced F2 rats were selected to be enriched for the extremes of the phenotype distributions. Sequences were aligned to the rat genome using Tophat (Figure I in the Data Supplement),12 and for mapped reads, fragments per kilobase of transcript per million fragments mapped (FPKM) were determined using Cufflinks.13 Differentially expressed genes (DEGs) between LH and LN were determined using edgeR (version 3.0.0)14 and corrected for multiple testing using a false discovery rate <0.05. eQTL mapping was performed in the genotyped F2 offspring using FPKM data as normalized gene expression and the R/qtl and R/eqtl packages, with permutation testing to correct for multiple testing error.15 Gene Ontology (GO) enrichment analyses were performed using DAVID.16 Detailed methods can be found in the Data Supplement.
To infer causal relationship between gene expression and phenotypes (or another gene expression), we used the single marker analysis function from the Network Edge Orienting package17 implemented in R (see Methods in the Data Supplement). In this study, the eQTL peak or module QTL (mQTL) SNPs were used as anchor SNPs. Causality tests were conducted only when traits show correlation with expression value (Pearson R≥0.3). Causality was indicated when the root mean square error of approximation value was <0.05, which indicate good model fit, and the causality score was >0.3,18–20 representing 2× higher probability of causal model than any other 4 models. For causality tests with a large number of genes, we focused on global patterns rather than specific genes to avoid the pitfalls of multiple testing that may inflate the false-positive rate.
Network Construction and Analyses
Unsigned weighted gene coexpression network based on the gene expression of 36 F2 rats was constructed using WGCNA package in R,21 including 10 088 genes with mean FPKMs >1 across the cohort. Network modules were determined with ≥30 genes and minimum height for merging modules at 0.25 to obtain moderately large modules (Figure II and Methods in the Data Supplement). GO enrichment analysis of each module was conducted using DAVID.16 Module eigengenes were summarized by the first principal component of the module expression profiles. To determine mQTLs, clusters of genes that were regulated by the same eQTL hotspot were selected and tested for enrichment in particular modules using Fisher exact tests.
Quantitative Polymerase Chain Reaction Validation
Quantitative reverse transcriptase polymerase chain reaction was performed on liver RNA from all available F2 rats (n=144) and parental controls for the cis-eQTL genes RGD1562963, Aqp11, Prcp, and Mapk9 (see Methods in the Data Supplement). Gene expression (2−ΔΔCt) differences between groups were determined by ANOVA followed by post hoc tests for pairwise significance compared with the LH parental control. Correlations between gene expression and phenotypes were determined using Spearman correlation, with nominal P<0.05.
Genome Sequencing and Analysis
Genome sequencing and identification of variants were performed as previously described.22 Sequence data are available at the EBI Sequence Read Archive under accession number ERP002160. Sequence variants are available at the Rat Genome Database (http://rgd.mcw.edu).The Alibaba223 program was used to predict transcription factor binding sites using binding sites from TRANSFAC database (http://www.biobase-international.com/product/transcription-factor-binding-sites).
RNA-seq raw data and gene expression values of each sample have been deposited in the Gene Expression Omnibus database (http://www.ncbi.nlm.nih.gov/geo/) under the accession number GSE50027.
To identify loci and genes contributing to the phenotypes defining MetS, we phenotyped 169 male offspring from an F2 intercross between LH and LN rats, as well as male parental LH (n=13) and LN rats (n=20), for traits including blood pressure, plasma lipid, glucose, and leptin levels, and body weight and growth (Tables IV and V in the Data Supplement). For all blood pressure, body weight and growth measures, and plasma leptin levels, the F2 cohort had an intermediate phenotype, significantly differing from both LH and LN strains (P<0.01). At 18 weeks of age, plasma triglyceride levels in the F2 rats were more similar to LN rats, both with significantly lower levels than LH rats, whereas plasma cholesterol levels at 12 and 16 weeks of age were similar or higher in the F2s than LH.
Four hundred fifty-three SNPs were used to tag all haplotypes differing between LH and LN across the genomes of the two strains10,24 (Table I in the Data Supplement). pQTLs of 23 phenotypes were calculated based on the F2 cohort and analyzed using R/qtl.11 Seventeen pQTLs exceeding the 5% genome-wide threshold for significance (defined by permutation testing) were identified for plasma cholesterol, plasma leptin, body weight, growth curve, heart rate, and blood pressure on chromosomes 1, 2, 10, 15, and 17 (Figure 1; Table VI in the Data Supplement). Overlapping QTLs for plasma cholesterol and body weight were identified on RNO10, whereas overlapping QTLs for plasma leptin and blood pressure were identified on RNO17. All other QTLs were nonoverlapping.
eQTL Mapping and Parental Rat DEG Analyses
Liver mRNA from parental (n=6) and F2 (n=36 with extremes of the phenotypes) rats were sequenced to produce ≈30 million paired-end reads (51 bp) per sample. FPKM 13 was used as a normalized quantitative measure of gene expression, and eQTL mapping was performed using the R/eqtl package.15 We identified 1264 suggestive eQTLs (logarithm of the odds score [LOD]>3.82) and 276 significant eQTLs (Lod>4.84). The chromosomal locations of the regulated genes were plotted against the chromosomal locations of their associated SNPs (Figure 2) to visualize both cis-eQTLs (diagonal) and trans-eQTL hotspots (vertical). For example, we identified a single SNP (ENSRNOSNP962219), on RNO17 as an eQTL hotspot. This SNP is associated with 66 trans-eQTLs and 1 cis-eQTL at 29.9 Mb on RNO17 that exceed 5% genome-wide significance. The cis-regulated gene, RGD1562963, will be discussed in detail below. The genes sharing this eQTL hotspot have significant GO enrichment in genes involved in mitochondrial function including oxidative phosphorylation (false discovery rate=0.017) and comprise 22.4% of the 67 genes. Other trans-eQTL hotspots were identified on chromosomes 7, 8, and 14.
In liver from LH and LN rats, 610 DEGs were identified with a false discovery rate of 5% (Figure III in the Data Supplement). The numbers of genes upregulated and downregulated in the LH rats were similar (322 and 288, respectively). Genes upregulated in the LH rat were significantly enriched for immune response and inflammatory processes while downregulated genes were enriched for fatty acid metabolism (Figure III in the Data Supplement).
Integrate pQTLs, cis-eQTLs, and DEGs to Identify Candidate Causal Genes
Integration of the liver gene expression with SNPs and phenotypes was used to discover candidate genes responsible for the pathogenesis of MetS. Several criteria were set as filters to select the most promising eQTL-gene pairs causing MetS in the LH rat. First, the genes must be differentially expressed between LH and LN strains. Second, they must have ≥1 eQTL where the peak SNP colocalizes within the pQTL confidence interval; third, the average F2 FPKM must be >1, to avoid potential noise in RNA-seq of genes with low expression. Finally, to address whether the regulated genes are likely to cause the phenotypic outcomes, the Network Edge Orienting package17 was used to infer the causal relationships between genes meeting the above criteria (n=25) and all measured phenotypes, using the eQTL peak SNPs as anchors. Causality tests (LEO.NB.SingleMarker) were calculated between each of the genes and any correlated phenotypes (R≥0.3). Genes causally affecting ≥1 phenotype were ranked by their maximum scores, using a previously suggested causality score threshold of 0.3,18–20 equivalent to a 2-fold higher probability compared with any other model (for example, the phenotype modifies the gene’s expression). This approach identified 17 genes (66 SNP-gene-phenotype trios) (Table; Table VII in the Data Supplement). Of these 17 genes, 6 are cis-regulated and fall within pQTLs on RNO 1, 2, 10, and 17, whereas the remaining 11 are regulated in trans. One gene (Aqp11) has a cis-eQTL that shares the same peak SNP with pQTLs, and the average number of phenotypes causally affected by these 17 genes was 3.3.
|Gene Name||eQTL Chr||eQTL LOD||Maker Name of P||Type||Peak Position||LH Mean FPKM||LN Mean FPKM||DEG FDR||Gene Chr||Gene Start||Gene End||No. of Anchor*||No. of Traits†||Causality Score‡|
Cis-eQTLs are strong candidates for causing pQTLs, when the cis-regulated genes fall within the pQTL intervals. To investigate further causal cis-eQTL, we performed quantitative polymerase chain reaction for 4of the 6 cis-regulated genes (Aqp11, Prcp, Mapk9, and RGD1562963) in the entire cohort of F2 rats completing the study (n=144). Using the nearby SNP as a marker for the genotype for each gene, all 4 genes were found to have genotype-specific expression in the F2 rats (Figure 3A–3D), with allelic expression similar to that found in liver from the parental LH and LN rats.
Aqp11 and Prcp both fall within QTL on RNO1 for measures of body weight (Figure 1A; Table VI in the Data Supplement). Both RNA-seq and quantitative polymerase chain reaction determined liver Aqp11 expression is downregulated in the LH compared with the LN allele (Table; Figure 3A). Furthermore, expression of Aqp11 is negatively correlated with a colocalized pQTL trait—length adjusted body weight 16 weeks (nominal P<0.05; Figure 3E). Conversely, Prcp expression is significantly upregulated in LH liver (Table; Figure 3B). Prcp expression is also significantly correlated with all traits having colocalized pQTL (length adjusted body weight 12 weeks; length adjusted body weight 16 weeks; growth area under the curve; Figure 3E). These data suggest that genetic variation causing concurrent downregulation of Aqp11 and upregulation of Prcp cause obesity in the LH rat. Of note, both genes have been reported as MetS-related genes through mouse knockout studies (Table VII in the Data Supplement).25–27
Network Construction and mQTL Analyses
Although the integrated eQTL/pQTL approach identified regulated genes with gene expression and sequence variation between the LH and LN strains, as well as functional relevance to the mapped traits, the phenotypic variation because of single genes in a multifactorial disease like MetS may be limited. As such, the expression data generated from RNA-seq can also be used to unveil the mechanisms of pathogenesis of this disease at the network level. A weighted gene coexpression network was constructed based on the gene expression measures of the 36 F2 individuals, consisting of 10 088 genes with average FPKM >1. Fifteen modules were obtained from the network analysis. Figure 4A is a global view of the network. Together the top 5 modules contain 7781 genes, over 77% of all genes included in the data set and have strong GO enrichment (Figure 4B). Other modules, such as the cyan and pink modules, have strong GO enrichment in immune regulation and inflammatory response, similar to that of the enrichment in the upregulated genes in the LH compared with LN rats (Figure III in the Data Supplement).
Module eigengenes were calculated to represent the gene expression of each module, and correlations between multiple module eigengenes and multiple MetS-related traits were observed (Figure IV in the Data Supplement). For example, the gray module eigengene has strong correlation with leptin, glucose, white adipose, and multiple blood pressure–related traits. mQTLs were also determined, defined as an eQTL-peak SNP of multiple genes that show significant module enrichment based on the Fisher exact test (see Methods in the Data Supplement). In this analysis, we relaxed the eQTL Lod score threshold to Lod >3 to increase the power of mQTL detection. As shown in Figure 4C, genes sharing the same eQTL peak SNP are enriched in a limited number of modules.
Of the mQTLs, ENSRNOSNP962219 is located in a pQTL region for blood pressure and plasma leptin on RNO17 (Figure 1) and has the largest number of genes regulated by this eQTL hotspot. Other mQTLs, including the strong trans-eQTL hotspots on RNO 7, 8, and 14, are not located in pQTL regions and thus were not studied further. ENSRNOSNP962219 is an mQTL of both the turquoise (P=3.7×10–7, Fisher exact test) and blue modules (P=2.0×10–42, Fisher exact tests) (Figure 4C). These two modules are the top two largest modules and represent distinct functional categories. The turquoise module is enriched for genes involved in transcriptional regulation; the blue module is enriched for genes associated with translation, ribosomes, mitochondria, and oxidative phosphorylation (Figure 4B). As shown in Figure 5A, these two modules also show significant correlations with multiple phenotypes, including body weight, white adipose tissue, blood pressure, and plasma triglycerides. To test whether these two modules have causal relationship with phenotypes, we used ENSRNOSNP962219 as the anchor SNP to calculate the causality scores of the module eigengenes to the correlated phenotypes. As shown in Figure 5B, the blue module shows a causal relationship to white adipose, whereas the turquoise module shows a causal relationship to multiple phenotypes including body weight, white adipose, and blood pressure. Therefore, these 2 modules are predicted to causally affect most of the correlated phenotypes.
To study further the role that mQTL ENSRNOSNP962219 plays on gene expression regulation and MetS phenotypes, we focused on the 293 genes sharing the eQTL peak SNP ENSRNOSNP962219 at a Lod >3. Of these 293 genes, 279 (95.2%) fell into either the turquoise (118 genes) or blue modules (161 genes) (Figure 4C). For each of these 279 genes, we conducted causality tests to determine whether the gene’s expression causally affects phenotypes that are correlated with this specific gene. We found 136 (84.5%) of the 161 genes in blue module causally affect white adipose, and >80% of the 118 genes in turquoise causally affect adjusted body weight and diastolic blood pressure. Several genes show causal relationship to other phenotypes (Figure 5C). These results are consistent with the causality tests using eigengenes, suggesting that these 2 modules are responsible for many MetS phenotypes in the LH rat. Furthermore, it suggests that the turquoise network has pleiotropic effects on common MetS phenotypes.
RGD1562963 Is a Potential Driver Gene of the Gene Cluster Sharing a Common eQTL Hotspot Within pQTL for MetS Traits
As described above, ENSRNOSNP962219 defines a trans-eQTL hotspot, suggesting that genes in the hotspot are affected, directly or indirectly, by a common driver gene. In principle, the driver gene would be regulated in cis (ie, a cis-eQTL) at ENSRNOSNP962219 and have causal relationship to trans-regulated genes sharing the same eQTL peak. To find the driver gene, we selected the 10 genes in the 95% eQTL hotspot confidence interval (RNO17:1–39 Mb) with an eQTL Lod >3 (relaxed criteria). Only 1 cis-eQTL gene, RGD1562963, exceeded the Lod 4.8 threshold for significance (Lod 6.8; Figure 6A). Two other cis-eQTL genes had suggestive linkage, Sfxn1 and Prelid1 (both with Lod 3.93). Furthermore, only RGD1562963 is differentially expressed between the parental LH and LN strains and shows allele-specific regulation in the F2 cohort (Table; Figure 3D).
If a gene is cis-regulated, it also must have sequence variation regulating its expression. Genome sequencing of both the LH and LN strains identified ≈643 000 single-nucleotide variants (SNVs) and 327 000 indels between the strains.22 These variants cluster into haplotypes derived from different ancestral chromosomes.10 In the 95% confidence interval of the eQTL hotspot on RNO17, only 3 small haplotypes differ between LH and LN; one of these haplotypes (29.7–30.3 Mb) contains RGD1562963 (Figure 6B). There are 86 sequence variants between LH and LN rats in the genomic interval encoding RGD1562963 and the 5 kb upstream. In comparison, Sfxn1 has only 1 intronic indel and Prelid1 has no SNVs between the strains (Figure 6C). Three nonsynonymous variants (V36I, R135H, and C139Y) were identified in RGD1562963, although functional variant prediction tools (Polyphen and SIFT28,29) predict these variants to be benign. We also identified 3 variants within relatively conserved regions shown as conservation score peaks on the University of California, Santa Cruz genome browser conservation track based on comparisons of 9 vertebrates, and 19 variants within putative transcription factor binding sites for PHO2, Oct-1, and GATA-1 (Alibaba223) (Table VIII and Figure V in the Data Supplement). One variant (SNV1) in the predicted promoter of RGD1562963 fell into both categories and is a strong candidate for the functional variant regulating RGD1562963 expression.
The data implicate RGD1562963 as being cis-regulated in LH rats. However, to be the driver gene of the trans-eQTL cluster, it must also affect downstream trans-eQTL genes. Therefore, we performed causality testing between RGD1562963 and the 278 trans-eQTL genes that are located on different chromosomes (Lod >3). RGD1562963 is predicted to causally affect 100 of the 278 trans-eQTL genes (Figure 6D). These results support RGD1562963 as the likely driver gene of the gene cluster sharing the ENSRNOSNP962219 eQTL hotspot. Together the data indicate that the genetic dysregulation of RGD1562963 directly affects the turquoise and blue modules through the trans-eQTL genes, resulting in a cascade of dysregulation of the downstream module genes, which in turn affect the MetS traits.
The pathogenesis of the MetS is complex because it is a collection of several underlying multifactorial traits. To identify novel genes and pathways that underlie the individual components characterizing MetS, as well those that could explain the co-occurrence of these traits, we applied a systems genetics approach in LH and LN rats, inbred strains that show genetic MetS susceptibility and resistance, respectively. Through integrating pQTL and eQTL from a segregating F2 intercross between LH and LN, and parental gene expression differences, followed by coexpression network analysis and tests for causality, we identified 17 genes predicted to be causal for at least a subset of the MetS phenotypes. We identified several cis-regulated genes that fall within pQTL for MetS traits which are strong candidates for the pQTL (Table; Figure 3). Furthermore one of these cis-regulated genes (RGD1562963) also falls within a trans-eQTL hotspot on RNO17 and is a strong candidate driver gene of downstream gene regulation (Figure 6). The genes in the trans-eQTL hotspot fall within modules that are causally related to the measured MetS traits, with function related to global transcriptional regulation and mitochondrial oxidative phosphorylation, providing candidate mechanisms underlying a genetic component of MetS in the LH rat model (Figure 5).
The study identified pQTL for MetS traits on RNO1 (body weight measures and plasma cholesterol), 2 (growth, plasma cholesterol, heart rate), 10 (body weight measures and plasma cholesterol), 15 (heart rate), and 17 (blood pressure measures, plasma leptin). Comparing these results with our previous study,8 we found concordance of many of the pQTL identified in this study. However, there are also differences. For example, we did not confirm the previously mapped blood pressure QTL on RNO2 and RNO13. Furthermore, although body weight, plasma lipids, and blood pressure all mapped to RNO17 in the previous study, only the blood pressure QTL were replicated. Some of the differences could be because ofdifferences in cohort size, with the previous study including over 300 animals. Furthermore, the phenotypes in the previous study were measured at 29 to 31 weeks of age, compared with12 to 18 weeks of age in the present study, which could significantly influence the QTL detection. For instance, it is known that the blood pressure in LH continues to increase between 18 and 29 weeks of age.7,30 The fact that blood pressure solely mapped to RNO17 at a younger age could suggest that the RNO17 locus initiates hypertension while loci on RNO2 and RNO13 involve disease progression evident at later ages.
In the pQTL for body weight and growth on RNO1, we identified 2 cis-eQTL genes of note—Prcp (also known as angiotensinase C) and Aqp11 (aquaporin 11). Prcp encodes a lysosomal prolylcarboxypeptidase, which regulates both the renin–angiotensin and kallikrein–kinin systems.31 It has more recently been shown to be a key enzyme involved in the degradation of α-melanocyte–stimulating hormone to an inactive form unable to inhibit food intake.32 Deletion of Prcp in mice reduces body weight and attenuates the metabolic effects of diet-induced obesity.25,27 It is significantly upregulated in LH compared with LN liver (fold change=1.8; false discovery rate=9×10−9), and its expression is nominally correlated to the QTL traits, including body weight and plasma cholesterol, making it a candidate causal gene for the QTL (Figure 3E).
Another cis-regulated gene in the RNO1 pQTL is Aqp11, a member of the aquaporin major intrinsic protein family that facilitates the transport of water and small neutral solutes across cell membranes. As shown in Figure 3E, Aqp11 expression is downregulated in the LH, correlated with body weight, and causally affects leptin level and bodyweight. LH rats have a variant in the 3′ untranslated region (1g.154973845T>C) as well as in the 5′ upstream region (1g.154985253A>T) of Aqp11 that are in positions with conserved sequence in rat, mouse, human, dog, and opossum. Of note, the 5′ upstream variant is located in a predicted transcription factor binding site for C/EBPalp, which has been reported to bind to the promoter and modulate the expression of the gene encoding leptin, providing a possible causal variant for altered leptin levels.33 Interestingly, liver-specific knockout mice showed increased high-density lipoprotein cholesterol as well as periportal hepatic rough endoplasmic reticulum vacuolization associated with endoplasmic reticulum stress.26
The current and previous studies found QTL for multiple MetS traits on RNO17, which were replicated in consomic rat models.8,34 Therefore, we speculate that there may be a gene or genes having pleiotropic effects on the MetS spectrum. In this study, we identified a cis-regulated gene on RNO17, RGD1562963, as a putative driver gene regulating multiple downstream genes. The trans-eQTL genes fall within 2 coexpression modules which are causally related to body weight and blood pressure measures. Therefore, we speculate that RGD1562963 dysregulation causes a cascade of downstream gene dysregulation that ultimately impacts body weight and blood pressure. However, it must be noted that RGD1562963 expression alone was not significantly correlated with these traits; it showed only a nominal correlation to plasma cholesterol levels (Figure 3), which was not significant after correction for multiple testing of 23 traits. One possibility for these results is that RGD1562963 may be a driver gene but unrelated to the LH phenotypes. Another possibility is that the expression of MetS involves multiple pathways and the variation in a single gene, without considering its downstream effectors, may be insufficient to identify a causal relationship. The expression difference in RGD1562963 between the LH and LN genotypes, while consistent across our studies, is relatively modest (Figure 3). There may not be enough variation in RGD1562963 alone to detect significant correlation with the phenotypes, but its dysregulation may induce a cascade of downstream gene dysregulation (ie, the genes in the trans-eQTL hotspot), and the conglomeration of these genes show causal relationships to the MetS traits.
RGD1562963 has no functional annotation. Its ortholog in the human (C6Orf52) is also only annotated as an open reading frame with no predicted function; orthologs have also been identified in several other species. The closest sequence homology with RGD1562963 or its orthologs is with TRNAU1AP, coding for an RNA-binding protein in a complex that is required to generate selenoproteins, which are important antioxidants such as Glutathione peroxidases and Thioredoxin reductase.35,36 The RNA-binding domain characterizing TRNAU1AP (RRM domain) also has a role in regulating transcript stability.37 Both functions directly relate to the functional enrichment of the blue (mitochondrial oxidative phosphorylation) and turquoise (transcriptional regulation) modules. However, the sequence similarity between RGD1562963 (and its orthologs) and TRNA1AP do not include the RNA-binding domain or any known functional domain or motif, thus predicting that its function is premature and will be the focus of future studies.
The expression of RGD1562963 is also strongly associated with the blue coexpression module, including genes involved in mitochondria and increased oxidative phosphorylation. This is evidenced by the mRNA upregulation of several nicotinamide adenine dinucleotide dehydrogenase genes found in the blue coregulation module. Interestingly, genes involving oxidative phosphorylation have also been found to be upregulated in livers from diabetic obese patients compared with diabetic lean patients.38 Although increased respiration is likely aimed at maintaining homeostasis, chronic activation could eventually lead to opening of the mitochondrial permeability transition pore and apoptosis,39 causing further defects in lipid metabolism in the liver. Although the role of the mitochondria is well established in most traits characterizing MetS, further studies are needed to elucidate the mechanism of RGD1562963 and its downstream effects on the MetS.
Transcriptome studies only capture a temporal and spatial snapshot of gene expression. Some gene expression variation may be because of a consequence of the MetS state, rather than driving MetS. Furthermore, cell types may differ in the groups being compared. For example, there is significant enrichment of DEGs between LH and LN rats involved in inflammatory response; this may be because of immune cell infiltration into the liver rather than liver gene expression differences. To address these issues, gene expression could be measured prior to significant disease onset. Because the phenotypes are present by 5 weeks of age in the LH,40 a temporal study at early ages may better define the time of MetS initiation and provide more homogeneous cell types in the liver.
Systems genetics approaches are predictive and hypothesis generating but require experimental validation. From the F2 cohort of 169 male rats, a subset of 36 rats was selected for the eQTL studies. Although these numbers are in line with studies in RI strains, they are low for many eQTL studies in F2 cohorts. Although the rats were selected based on phenotypic extremes, power may be insufficient to detect some eQTL, leading to possible type II error. However, combining the eQTL mapping with systems biology approaches have led to both genes and pathways that are strong candidates for functional validation. Additional experimental validation is required to prove the role of RGD1562963 as a driver gene of the trans-eQTL hotspot and its role in MetS in the LH rat. For example, small interfering RNA knockdown of RGD1562963 in hepatocytes or in vivo should alter the expression of genes in the trans-eQTL hotspot and affect liver metabolism. These studies are ongoing. Studies in congenic or knockout strains involving RGD1562963 will be necessary to prove a causal role for this novel gene in the MetS traits in the LH rat.
This study uses a unique rat model of MetS (the LH and LN inbred rat strains) to integrate genetic, transcriptomics, and systems genetics approaches in the identification of candidate genes and mechanistic pathways causing the diverse phenotypes defining MetS. Through our studies we identified cis-regulated genes that are strong causal candidates for determining body weight and plasma lipid levels. One of these cis-regulated genes, RGD1562963, is also a putative master regulator of downstream pathways involving global transcriptional regulation and mitochondrial function. Future studies that validate and determine the role of these genes in the LH rat facilitates not only the translational studies of MetS-susceptibility genes in humans but also generalized disease mechanisms that offer in-roads to personalized therapies.
We thank Janet Beinhart for rat colony maintenance as well as Stephanie Dunkel and James Stewart II for technical assistance.
Sources of Funding
This work is supported by
Alberti KG, Eckel RH, Grundy SM, Zimmet PZ, Cleeman JI, Donato KA,; International Diabetes Federation Task Force on Epidemiology and Prevention; Hational Heart, Lung, and Blood Institute; American Heart Association; World Heart Federation; International Atherosclerosis Society; International Association for the Study of Obesity. Harmonizing the metabolic syndrome: a joint interim statement of the International Diabetes Federation Task Force on Epidemiology and Prevention; National Heart, Lung, and Blood Institute; American Heart Association; World Heart Federation; International Atherosclerosis Society; and International Association for the Study of Obesity.Circulation. 2009; 120:1640–1645. doi: 10.1161/CIRCULATIONAHA.109.192644.LinkGoogle Scholar
Ogden CL, Carroll MD, Curtin LR, McDowell MA, Tabak CJ, Flegal KM. Prevalence of overweight and obesity in the United States, 1999-2004.JAMA. 2006; 295:1549–1555. doi: 10.1001/jama.295.13.1549.CrossrefMedlineGoogle Scholar
Hong Y, Pedersen NL, Brismar K, de Faire U. Genetic and environmental architecture of the features of the insulin-resistance syndrome.Am J Hum Genet. 1997; 60:143–152.MedlineGoogle Scholar
Liese AD, Mayer-Davis EJ, Tyroler HA, Davis CE, Keil U, Schmidt MI,. Familial components of the multiple metabolic syndrome: the ARIC study.Diabetologia. 1997; 40:963–970. doi: 10.1007/s001250050775.CrossrefMedlineGoogle Scholar
Mitchell BD, Kammerer CM, Mahaney MC, Blangero J, Comuzzie AG, Atwood LD,. Genetic analysis of the IRS. Pleiotropic effects of genes influencing insulin levels on lipoprotein and obesity measures.Arterioscler Thromb Vasc Biol. 1996; 16:281–288.LinkGoogle Scholar
Dupont J, Dupont JC, Froment A, Milon H, Vincent M. Selection of three strains of rats with spontaneously different levels of blood pressure.Biomedicine. 1973; 19:36–41.MedlineGoogle Scholar
Vincent M, Cartier R, Privat P, Benzoni D, Samani NJ, Sassard J. Major cardiovascular risk factors in Lyon hypertensive rats. A correlation analysis in a segregating population.J Hypertens. 1996; 14:469–474.CrossrefMedlineGoogle Scholar
Bilusic M, Bataillard A, Tschannen MR, Gao L, Barreto NE, Vincent M,. Mapping the genetic determinants of hypertension, metabolic diseases, and related phenotypes in the lyon hypertensive rat.Hypertension. 2004; 44:695–701. doi: 10.1161/01.HYP.0000144542.57306.5e.LinkGoogle Scholar
Civelek M, Lusis AJ. Systems genetics approaches to understand complex traits.Nat Rev Genet. 2014; 15:34–48. doi: 10.1038/nrg3575.CrossrefMedlineGoogle Scholar
Ma MC, Atanur SS, Aitman TJ, Kwitek AE. Genomic structure of nucleotide diversity among Lyon rat models of metabolic syndrome.BMC Genomics. 2014; 15:197. doi: 10.1186/1471-2164-15-197.CrossrefMedlineGoogle Scholar
Broman KW, Wu H, Sen S, Churchill GA. R/qtl: QTL mapping in experimental crosses.Bioinformatics. 2003; 19:889–890.CrossrefMedlineGoogle Scholar
Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq.Bioinformatics. 2009; 25:1105–1111. doi: 10.1093/bioinformatics/btp120.CrossrefMedlineGoogle Scholar
Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ,. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation.Nat Biotechnol. 2010; 28:511–515. doi: 10.1038/nbt.1621.CrossrefMedlineGoogle Scholar
Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.Bioinformatics. 2010; 26:139–140. doi: 10.1093/bioinformatics/btp616.CrossrefMedlineGoogle Scholar
Cubillos FA, Yansouni J, Khalili H, Balzergue S, Elftieh S, Martin-Magniette ML,. Expression variation in connected recombinant populations of Arabidopsis thaliana highlights distinct transcriptome architectures.BMC Genomics. 2012; 13:117. doi: 10.1186/1471-2164-13-117.CrossrefMedlineGoogle Scholar
Huang da W, Sherman BT, Lempicki RA. Bioinformatics enrichment tools: Paths toward the comprehensive functional analysis of large gene lists.Nucleic Acids Res. 2009; 37:1–13.CrossrefMedlineGoogle Scholar
Aten JE, Fuller TF, Lusis AJ, Horvath S. Using genetic markers to orient the edges in quantitative trait networks: the NEO software.BMC Syst Biol. 2008; 2:34. doi: 10.1186/1752-0509-2-34.CrossrefMedlineGoogle Scholar
Plaisier CL, Horvath S, Huertas-Vazquez A, Cruz-Bautista I, Herrera MF, Tusie-Luna T,. A systems genetics approach implicates USF1, FADS3, and other causal candidate genes for familial combined hyperlipidemia.PLoS Genet. 2009; 5:e1000642. doi: 10.1371/journal.pgen.1000642.CrossrefMedlineGoogle Scholar
Presson AP, Sobel EM, Papp JC, Suarez CJ, Whistler T, Rajeevan MS,. Integrated weighted gene co-expression network analysis with an application to chronic fatigue syndrome.BMC Syst Biol. 2008; 2:95. doi: 10.1186/1752-0509-2-95.CrossrefMedlineGoogle Scholar
Park CC, Gale GD, de Jong S, Ghazalpour A, Bennett BJ, Farber CR,. Gene networks associated with conditional fear in mice identified using a systems genetics approach.BMC Syst Biol. 2011; 5:43. doi: 10.1186/1752-0509-5-43.CrossrefMedlineGoogle Scholar
Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis.BMC Bioinformatics. 2008; 9:559. doi: 10.1186/1471-2105-9-559.CrossrefMedlineGoogle Scholar
Atanur SS, Diaz AG, Maratou K, Sarkis A, Rotival M, Game L,. Genome sequencing reveals loci under artificial selection that underlie disease phenotypes in the laboratory rat.Cell. 2013; 154:691–703. doi: 10.1016/j.cell.2013.06.040.CrossrefMedlineGoogle Scholar
Grabe N. AliBaba2: context specific identification of transcription factor binding sites.In Silico Biol. 2002; 2:S1–S15.MedlineGoogle Scholar
Saar K, Beck A, Bihoreau MT, Birney E, Brocklebank D, Chen Y,. SNP and haplotype mapping for genetic analysis in the rat.Nat Genet. 2008; 40:560–566.CrossrefMedlineGoogle Scholar
Jeong JK, Szabo G, Raso GM, Meli R, Diano S. Deletion of prolyl carboxypeptidase attenuates the metabolic effects of diet-induced obesity.Am J Physiol Endocrinol Metab. 2012; 302:E1502–E1510. doi: 10.1152/ajpendo.00544.2011.CrossrefMedlineGoogle Scholar
Rojek A, Füchtbauer EM, Füchtbauer A, Jelen S, Malmendal A, Fenton RA,. Liver-specific Aquaporin 11 knockout mice show rapid vacuolization of the rough endoplasmic reticulum in periportal hepatocytes after amino acid feeding.Am J Physiol Gastrointest Liver Physiol. 2013; 304:G501–G515. doi: 10.1152/ajpgi.00208.2012.CrossrefMedlineGoogle Scholar
Wallingford N, Perroud B, Gao Q, Coppola A, Gyengesi E, Liu ZW,. Prolylcarboxypeptidase regulates food intake by inactivating alpha-MSH in rodents.J Clin Invest. 2009; 119:2291–2303. doi: 10.1172/JCI37209.MedlineGoogle Scholar
Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P,. A method and server for predicting damaging missense mutations.Nat Methods. 2010; 7:248–249. doi: 10.1038/nmeth0410-248.CrossrefMedlineGoogle Scholar
Kumar P, Henikoff S, Ng PC. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm.Nat Protoc. 2009; 4:1073–1081. doi: 10.1038/nprot.2009.86.CrossrefMedlineGoogle Scholar
Vincent M, Boussaïri EH, Cartier R, Lo M, Sassolas A, Cerutti C,. High blood pressure and metabolic disorders are associated in the Lyon hypertensive rat.J Hypertens. 1993; 11:1179–1185.CrossrefMedlineGoogle Scholar
Yang HY, Erdös EG. Second kininase in human blood plasma.Nature. 1967; 215:1402–1403.CrossrefMedlineGoogle Scholar
Jeong JK, Diano S. Prolyl carboxypeptidase and its inhibitors in metabolism.Trends Endocrinol Metab. 2013; 24:61–67. doi: 10.1016/j.tem.2012.11.001.CrossrefMedlineGoogle Scholar
Mason MM, He Y, Chen H, Quon MJ, Reitman M. Regulation of leptin promoter function by Sp1, C/EBP, and a novel factor.Endocrinology. 1998; 139:1013–1022. doi: 10.1210/endo.139.3.5792.CrossrefMedlineGoogle Scholar
Gilibert S, Kwitek AE, Hubner N, Tschannen M, Jacob HJ, Sassard J,. Effects of chromosome 17 on features of the metabolic syndrome in the Lyon hypertensive rat.Physiol Genomics. 2008; 33:212–217. doi: 10.1152/physiolgenomics.00262.2007.CrossrefMedlineGoogle Scholar
Brown KM, Arthur JR. Selenium, selenoproteins and human health: a review.Public Health Nutr. 2001; 4(2B):593–599.CrossrefMedlineGoogle Scholar
Xu XM, Carlson BA, Mix H, Zhang Y, Saira K, Glass RS,. Biosynthesis of selenocysteine on its tRNA in eukaryotes.PLoS Biol. 2007; 5:e4. doi: 10.1371/journal.pbio.0050004.CrossrefMedlineGoogle Scholar
Maris C, Dominguez C, Allain FH. The RNA recognition motif, a plastic RNA-binding platform to regulate post-transcriptional gene expression.FEBS J. 2005; 272:2118–2131. doi: 10.1111/j.1742-4658.2005.04653.x.CrossrefMedlineGoogle Scholar
Takamura T, Misu H, Matsuzawa-Nagata N, Sakurai M, Ota T, Shimizu A,. Obesity upregulates genes involved in oxidative phosphorylation in livers of diabetic patients.Obesity (Silver Spring). 2008; 16:2601–2609. doi: 10.1038/oby.2008.419.CrossrefMedlineGoogle Scholar
Deniaud A, Sharaf el dein O, Maillier E, Poncet D, Kroemer G, Lemaire C,. Endoplasmic reticulum stress induces calcium-dependent permeability transition, mitochondrial outer membrane permeabilization and apoptosis.Oncogene. 2008; 27:285–299. doi: 10.1038/sj.onc.1210638.CrossrefMedlineGoogle Scholar
Sassolas A, Vincent M, Benzoni D, Sassard J. Plasma lipids in genetically hypertensive rats of the Lyon strain.J Cardiovasc Pharmacol. 1981; 3:1008–1014.CrossrefMedlineGoogle Scholar
The metabolic syndrome (MetS)—hypertension, obesity, dyslipidemia, and insulin resistance or hyperglycemia—is a major risk factor for cardiovascular disease, type 2 diabetes mellitus, and end organ failure. Although genetic susceptibility to the defining features of MetS is evident, the effect size of the genes identified to date is low and there is poor understanding of the mechanisms linking the underlying disorders of MetS. Better understanding of such a complex pathogenesis is aided by comprehensive studies involving genetics, transcriptomics, and animal models. The Lyon hypertensive rat is a unique inbred model of MetS, exhibiting the phenotypes typically associated with MetS, whereas the genetically similar Lyon normotensive control strain displays no features of MetS, offering a simplified model system to identify the genetic basis and disease mechanism of MetS. Through an integrated systems genetics approach, the current study identified both putative causal genes for features of MetS, as well as a pathway linking its defining features. Understanding the pathways and mechanisms linking obesity to diabetes mellitus and cardiovascular disease is critical to preventing a dramatic increase in morbidity and mortality worldwide resulting from the obesity epidemic.