Identification and Initial Functional Characterization of a Human Vascular Cell–Enriched Long Noncoding RNA
Long noncoding RNAs (lncRNAs) represent a rapidly growing class of RNA genes with functions related primarily to transcriptional and post-transcriptional control of gene expression. There is a paucity of information about lncRNA expression and function in human vascular cells. Thus, we set out to identify novel lncRNA genes in human vascular smooth muscle cells and to gain insight into their role in the control of smooth muscle cell phenotypes.
Approach and Results—
RNA sequencing (RNA-seq) of human coronary artery smooth muscle cells revealed 31 unannotated lncRNAs, including a vascular cell–enriched lncRNA (Smooth muscle and Endothelial cell–enriched migration/differentiation-associated long NonCoding RNA [SENCR]). Strand-specific reverse transcription polymerase chain reaction (PCR) and rapid amplification of cDNA ends indicate that SENCR is transcribed antisense from the 5′ end of the FLI1 gene and exists as 2 splice variants. RNA fluorescence in situ hybridization and biochemical fractionation studies demonstrate SENCR is a cytoplasmic lncRNA. Consistent with this observation, knockdown studies reveal little to no cis-acting effect of SENCR on FLI1 or neighboring gene expression. RNA-seq experiments in smooth muscle cells after SENCR knockdown disclose decreased expression of Myocardin and numerous smooth muscle contractile genes, whereas several promigratory genes are increased. Reverse transcription PCR and Western blotting experiments validate several differentially expressed genes after SENCR knockdown. Loss-of-function studies in scratch wound and Boyden chamber assays support SENCR as an inhibitor of smooth muscle cell migration.
SENCR is a new vascular cell–enriched, cytoplasmic lncRNA that seems to stabilize the smooth muscle cell contractile phenotype.
Although the human genome is 30× larger than that of Caenorhabditis elegans, each species is endowed with a similar number of protein-coding genes, a fact seemingly in support of an abundance of junk DNA within our genome.1 Two major discoveries during the past 10 years challenge this decades-old concept. First, genome-wide RNA expression studies show widespread transcription across the mouse and human genomes with roughly equal amounts of polyadenylated and nonpolyadenylated RNA.2–7 Second, the combined efforts of the ENCyclopedia Of DNA Elements (ENCODE) Consortium and many other laboratories have revealed the existence of millions of codes that punctuate the human genome, most notably codes for transcription factor binding.8–12 These findings, coupled with the notion that much of the human genome is functional with 50% to 90% comprising transcribed sequences,13,14 debunk the concept of junk DNA and point to a genome replete with information essential for human life.
Much of the noncoding RNA (ncRNA) in a cell functions to orchestrate basic translation (transfer and ribosomal RNA); however, 2 broad classes of ncRNA expanded greatly at the turn of the millennium, primarily as a result of large-scale transcriptomics projects.2,3,15 These ncRNAs are classified subjectively as either short (processed transcript length <200 nucleotides) or long (processed transcript length >200 nucleotides). Short ncRNAs include small nucleolar RNA and their derivatives that act as guide RNAs to modify ribosomal and transfer RNAs,16 as well as microRNA, small interfering RNA, and PIWI-interacting RNA that use Argonaute proteins to mediate endonucleolytic cleavage of target RNAs.17
Long ncRNAs (lncRNAs) function in a myriad of biological processes and may be classified loosely based on their physical location in the genome. Long intervening ncRNAs (lincRNAs) are a subclass of lncRNAs found between 2 transcription units, and they exhibit active chromatin signatures similar to those found around protein-coding genes.18–20 LincRNAs may display tissue-specific patterns of expression and function principally as scaffold or guide RNAs that facilitate chromatin remodeling in cis or trans to directly influence gene transcription (nuclear lincRNAs) or effect changes in mRNA stability/protein translation (cytoplasmic lincRNAs).20–22 Examples of lincRNAs include the abundantly expressed MALAT1 that functions in processing of mRNAs23 and the epidermal prodifferentiating TINCR.24 A recent report defined very long intervening ncRNAs whose expression correlates with malignancy; these transcripts may encompass previously annotated lincRNAs.25 LincRNAs may also overlap transcriptional enhancers to effect cis-mediated changes in gene expression.26,27
Intragenic lncRNAs represent another subclass of RNA genes that reside on the sense or antisense strand of an overlapping gene. Sense lncRNAs have been reported only sporadically,28 although a recent report contends there exists a large number of ill-defined sense ncRNAs within introns.29 Antisense lncRNAs occur in a significant number of protein-coding genes and may overlap the 5′ or 3′ end of a gene, occur entirely within an intron, or overlap multiple exons.30–32 Antisense lncRNAs whose exons overlap protein-coding (or ncRNA) exons are known as natural antisense transcripts and these can function in cis or trans to negatively or positively regulate gene expression through RNA interactions with chromatin remodeling factors.33 Examples of natural antisense transcripts include the X chromosome inactivating XIST34 and the cell cycle regulator ANRIL.35 Some processed antisense lncRNAs do not overlap sense exons and thus may have unexpected functions (below). The number of human lncRNAs is soaring with the current catalogue of LNCipedia36 listing >32 000 (http://www.lncipedia.org/), a number that exceeds all protein-coding genes. Thus, lncRNAs embody a rapidly growing class of genes with functions related primarily to the regulation of gene/protein expression.
Cellular differentiation requires the coordinated activation of unique gene sets through transcription factors in association with cofactors over discrete cis elements. For example, vascular smooth muscle cell (SMC) differentiation is chiefly a function of ubiquitously expressed serum response factor37 binding a cardiovascular-restricted cofactor called myocardin (MYOCD)38 over CArG elements located in the proximal promoter region of many SMC-associated genes.39 Similarly, endothelial cell (EC) differentiation proceeds, in part, through the FOXC240 and ETV241 transcription factors binding a composite cis element, the FOX-ETS motif, found in promoter/enhancer sequences of several EC-specific genes.42 Normal differentiated properties of SMC and EC further require fine tuning of gene expression through the action of microRNAs.43 Because lncRNAs are prevalent and play key roles in modulating gene expression,44 they too may have functions linked to vascular cell phenotype. Little is known, however, about the expression or function of lncRNAs in vascular cells,45–49 and nothing is known about human-specific, vascular cell–selective lncRNAs. Accordingly, we performed RNA-seq in human coronary artery smooth muscle cells (HCASMC) as a first step toward understanding the potential role of lncRNAs in human SMC phenotypic control. Here, we report on the identification of 31 lncRNAs, including 1 named SENCR (for Smooth muscle and Endothelial cell–enriched migration/differentiation-associated long NonCoding RNA). We have characterized the expression, splicing, and localization of SENCR and have identified unique gene signatures on its knockdown in SMC. SENCR seems to play a role in maintaining the normal SMC differentiated state as its attenuated expression leads to reduced MYOCD and contractile gene expression with elevations in migratory genes that foster a hyper-motile state.
This report outlines the first foray into lncRNA discovery in human vascular cells and establishes a foundation for further inquiry into SENCR biology, as well as the identification, expression, and function of other human vascular-selective lncRNAs under normal and pathological cell states.
Materials and Methods
Materials and Methods are available in the online-only Supplement.
Identification and Validation of lncRNAs in HCASMC
We have developed a rigorous workflow for the identification and study of lncRNAs in primary-derived HCASMC using RNA-seq methodology (Figure I in the online-only Data Supplement). A total of 79.41% of filtered reads could be aligned to the human reference genome. Thirty-one lncRNAs met our strict inclusion criteria (Methods in the online-only Data Supplement) with the majority (22/31) falling into the lincRNA subclass (Table II in the online-only Data Supplement). Conventional reverse transcription polymerase chain reaction (RT-PCR) showed detectable expression of 21 of 31 lncRNAs in a panel of human cell types, including HCASMC and human umbilical vein EC (HUVEC; Figure 1A). Sequence analysis of the PCR products confirmed the identity of each lncRNA (not shown). The majority of HCASMC lncRNAs are distributed widely across human tissues with several detected in dated human plasma (Figure 1B and 1C). One of the lncRNAs (lncRNA9) exhibited a selective pattern of expression in cell lines (Figure 1A and 1D) and human tissues (Figure 1A and 1B). We refer to this lncRNA as SENCR because of its enriched expression in both smooth muscle and ECs (Figure 1A and 1D) and its proposed function (below).
SENCR Is a Vascular Cell–Selective Antisense lncRNA
RNA-seq alignment, 5′ RACE, and RT-PCR with oligo-dT and strand-specific primers established that SENCR comprises 3 exons and is transcribed in the antisense orientation from within the first intron of Friend leukemia virus integration 1 (FLI1), an important transcription factor programming EC and blood cell formation50 (Figure 2A). There is no overlap between SENCR and FLI1 exonic sequences, indicating that SENCR is not a natural antisense transcript33 (Figure 2A). The longest open reading frame flanked by start and stop codons is 61 amino acids; however, analysis of this and other predicted open reading frames in SENCR failed to reveal any known protein-coding domains, suggesting that this transcript has no or low protein-coding potential (not shown). Primers to exons 1 and 3 of SENCR showed the presence of 2 distinct PCR products (Figure 2B). Sequence analysis confirmed these products as full length (SENCR_V1) and an alternatively spliced variant (SENCR_V2) of the SENCR gene (Figure 2A and 2B). These sequences have been deposited in GenBank under accession numbers KF806591 and KF806590, respectively. We used specific primer pairs to examine SENCR isoform expression in a panel of human tissues and cell lines. Results showed SENCR_V1 to be more broadly expressed than SENCR_V2 (Figure 2C and 2D). In general, there was coincident expression of SENCR with FLI1, suggesting that these transcripts may be under similar transcriptional control processes (Figure 2E and 2F). Quantitative RT-PCR analysis suggested the FLI1 transcript to have higher expression than SENCR (Figure II in the online-only Data Supplement).
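The open reading frame assessment described above can be illustrated with a minimal sketch. It assumes a simple scan of the three forward reading frames for the longest ATG-to-stop stretch; the function name `longest_orf` and the toy sequence are hypothetical, and this is not the procedure or software actually used in the study.

```python
# Illustrative sketch only: scan one strand of a transcript for the longest
# open reading frame (ATG ... stop codon) across the three forward frames.
# The toy sequence below is invented; it is not SENCR sequence.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end, n_codons) for the longest ATG-to-stop ORF.

    start and end are 0-based; end is exclusive and includes the stop codon.
    n_codons counts the coding codons (ATG included, stop codon excluded).
    Returns None when no complete ORF is present.
    """
    seq = seq.upper().replace("U", "T")
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon of a new candidate ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if best is None or n_codons > best[2]:
                    best = (start, i + 3, n_codons)
                start = None  # resume the search after this stop codon
    return best


toy = "CCATGGCTTAAGGATGAAACCCGGGTTTTAGCC"
print(longest_orf(toy))
```

A short maximal ORF with no recognizable protein domains is consistent with, but does not by itself prove, a lack of coding potential; dedicated coding-potential tools weigh additional features.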
Exon 1 of FLI1 shows high conservation across 46 mammalian species; however, much less conservation exists across the 3 exons of SENCR (Figure 3), consistent with the fact that no orthologous SENCR transcripts have yet been found outside human/chimp lineages. Interestingly, exons 2 and 3 of SENCR harbor single nucleotide polymorphisms, suggesting potential deleterious effects on SENCR function (Figure 3). Analysis of ENCODE data on the UCSC Genome Browser (http://genome.ucsc.edu/) supports the enriched expression of SENCR in HUVEC with lower levels in other cell types. Further, there is a prominent HUVEC-associated H3K4me3 mark near exon 1 of SENCR, suggesting the presence of an active promoter (Figure 3). As a first step toward delineating SENCR transcription, we cloned and tested several luciferase reporter constructs. Luciferase assays showed little to no detectable SENCR promoter activity in HUVEC unless sequences encompassing the 5′ FLI1 promoter region were included, although even these reporters showed much lower activity than a control promoter construct (not shown). Collectively, these results define an alternatively spliced, vascular cell–enriched antisense lncRNA that overlaps the 5′ end of the FLI1 transcription factor yet, in its mature form, does not harbor exonic sequences that could undergo Watson–Crick base-pairing with corresponding exonic sequences in FLI1.
SENCR Is a Cytoplasmic lncRNA
Quantitative RT-PCR showed SENCR RNA to be most abundant in HUVEC with undetectable transcripts in HeLa cells (Figure 1D). We used high-resolution RNA fluorescence in situ hybridization51 in these 2 cell types to unambiguously discern the intracellular compartment where SENCR transcripts reside. Consistent with quantitative RT-PCR, no SENCR transcripts were seen in individual HeLa cells (Figure 4A, bottom). However, we observed variably low numbers of SENCR RNA molecules in the cytoplasm of individual HUVEC (Figure 4A, top and middle). We sometimes observed SENCR RNA in the nucleus although this probably reflects either active transcription or unprocessed RNA. The cytoplasmic, low-level expression of SENCR RNA contrasts with the higher-level nuclear accumulation of NEAT1 lncRNA as well as cytoplasmic PP1B mRNA (Figure 4A). Biochemical fractionation followed by RT-PCR further documented cytoplasmic localization of SENCR in both HUVEC and HCASMC. In contrast, the lncRNAs NEAT1 and XIST show predominantly nuclear accumulation in these cell types (Figure 4B; Figure III in the online-only Data Supplement). We next used 2 distinct probe sets to SENCR in HUVEC treated with a control dicer substrate RNA or 2 dicer substrate RNAs targeting different regions of SENCR to further demonstrate the specificity of the signal (Figure 4C). Quantitative analysis of coincident hybridization of each probe set demonstrated a likely underestimate of ≈0.8 copies of SENCR per cell, a value that was approximately halved on SENCR knockdown (Figure 4D). These results establish the cytoplasmic localization of SENCR and indicate its relatively weaker level of expression as compared with housekeeping mRNA molecules (PP1B) and at least one other lncRNA (NEAT1).
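The idea of scoring only coincident signals from 2 probe sets can be sketched as follows. This is a hedged illustration, not the image analysis used in the study: it assumes each probe set yields a list of (x, y) spot coordinates per cell, pairs spots greedily within a hypothetical distance cutoff (`max_dist`), and averages the paired counts across cells. All coordinates and counts are invented.

```python
import math


def coincident_spots(spots_a, spots_b, max_dist=2.0):
    """Count spots in set A paired one-to-one with a set B spot within max_dist.

    Greedy nearest-neighbor pairing; max_dist is an assumed cutoff in pixels.
    """
    unused = list(spots_b)
    matched = 0
    for ax, ay in spots_a:
        best_j, best_d = None, max_dist
        for j, (bx, by) in enumerate(unused):
            d = math.hypot(ax - bx, ay - by)
            if d <= best_d:
                best_j, best_d = j, d
        if best_j is not None:
            unused.pop(best_j)  # each B spot may be paired only once
            matched += 1
    return matched


def copies_per_cell(per_cell_counts):
    """Mean number of coincident spots per cell."""
    return sum(per_cell_counts) / len(per_cell_counts)


# Invented example: one cell with three spots per probe set, one coincident pair.
probe_set_1 = [(0, 0), (10, 10), (20, 20)]
probe_set_2 = [(0.5, 0.5), (10, 13), (40, 40)]
print(coincident_spots(probe_set_1, probe_set_2))

# Invented per-cell coincident counts for five cells.
print(copies_per_cell([1, 0, 2, 1, 0]))
```

Requiring coincidence discards true transcripts that are labeled by only one probe set, which is one reason such a count is likely to underestimate copy number.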
SENCR Knockdown Exerts Little Effect on FLI1 mRNA in Vascular Cells
Many lncRNAs that overlap protein-coding genes in the antisense orientation exert cis or trans effects on gene expression through the recruitment of chromatin remodeling factors.52 However, no uniform cis-acting effect on FLI1 or neighboring gene expression was observed on knocking down SENCR with multiple dicer substrate RNAs in HCASMC (Figure 5A–5C) or HUVEC (Figure 5D and 5E), consistent with its cytoplasmic localization. There was also little effect of SENCR knockdown on the nuclear accumulation of FLI1 protein or steady-state FLI1 protein levels (Figure 6G and 6H). Further, knockdown of FLI1 effected no significant change in levels of SENCR RNA (Figure 5F). We occasionally observed mild variation in FLI1 mRNA expression (either up or down) with some dicer substrate RNAs in certain isolates of vascular cells; however, these changes were sporadic and not reproducible when tested by multiple investigators. We therefore conclude that reducing SENCR RNA has little to no cis-acting effect on local gene expression.
SENCR Knockdown Alters the Normal Contractile Gene Program in HCASMC
Several cytoplasmic lncRNAs effect changes in a cell’s transcriptome through post-transcriptional control processes.53 As an initial step toward understanding the function of SENCR, we performed RNA-seq in HCASMC after knockdown of SENCR to assess changes in the transcriptome. Most sequencing reads were aligned to the reference genome and scatterplots of replicates showed similar transcript profiles (not shown). Statistical analysis of each set of replicates revealed hundreds of genes that were significantly induced or repressed on SENCR knockdown (Figure 6A; Table III in the online-only Data Supplement). Strikingly, many SMC contractile genes showed significant reduction in mRNA expression with SENCR knockdown (Figure 6B; Table III in the online-only Data Supplement). Gene ontology analysis using DAVID revealed biological processes associated with this reduced contractile gene signature (Table IV in the online-only Data Supplement). Of note, the key transcriptional switch for SMC contractile gene expression, MYOCD,39 was also reduced with SENCR knockdown (Figure 6B), and several dicer substrate RNAs to SENCR validated such downregulation in HCASMC (Figure 6D). We also confirmed reduced expression of several of the SMC contractile genes at both the mRNA level (Figure 6E) and the protein level (Figure 6G). Although the SMC contractile program was reduced with SENCR knockdown, several genes associated with cell migration were induced (Figure 6C; Table III in the online-only Data Supplement). DAVID analysis supported biological processes linked to cellular locomotion with SENCR knockdown (Table V in the online-only Data Supplement). We validated 2 migratory genes (MDK and PTN) at the mRNA level in HCASMC (Figure 6F) and HUVEC (Figure IV in the online-only Data Supplement). Collectively, these data show that reduced SENCR expression compromised the SMC contractile phenotype and promoted a promigratory gene signature.
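The notion of genes being induced or repressed on knockdown can be made concrete with a small sketch. It assumes normalized expression values for control and knockdown samples and a hypothetical log2 fold-change cutoff; the gene labels and numbers are placeholders, and the replicate-based significance testing used in the actual analysis is deliberately omitted.

```python
import math


def log2_fold_change(control, knockdown, pseudocount=1.0):
    """log2 ratio of knockdown to control expression.

    The pseudocount (an assumed value) avoids division by zero for
    genes with no detected reads.
    """
    return math.log2((knockdown + pseudocount) / (control + pseudocount))


def classify(control, knockdown, threshold=1.0):
    """Label a gene by fold change alone, with no significance test."""
    lfc = log2_fold_change(control, knockdown)
    if lfc >= threshold:
        return "induced"
    if lfc <= -threshold:
        return "repressed"
    return "unchanged"


# Placeholder genes with invented expression values (control, knockdown).
toy_expression = {
    "contractile_gene": (99, 24),
    "migratory_gene": (9, 39),
    "unchanged_gene": (50, 55),
}
for gene, (ctrl, kd) in toy_expression.items():
    print(gene, round(log2_fold_change(ctrl, kd), 2), classify(ctrl, kd))
```

In a volcano plot, each gene's fold change of this kind is plotted against a significance value derived from the replicates, so a fold-change cutoff alone would not reproduce the published gene lists.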
Attenuated SENCR Expression Confers a Hyper-Motile Phenotype in HCASMC
To ascertain whether the increase in promigratory gene expression on knockdown of SENCR translates into a functional phenotype, we performed 2 independent measures of cell migration. Using a scratch wound assay, we observed hyper-motile HCASMC with SENCR knockdown (Figure 7A and 7B). Many of these cells exhibited reorganization of the actin cytoskeleton with formation of lamellipodia, consistent with a migratory cell phenotype (Figure 7Af, arrows). Importantly, the increase in HCASMC migration could be completely rescued on simultaneous knockdown of either of 2 promigratory genes shown to be induced on knockdown of SENCR (Figure 7C; Figure V in the online-only Data Supplement). To further confirm this accentuated cell migration phenotype on knockdown of SENCR, we used a modified Boyden chamber assay. Consistent with the scratch wound assay, we noted that HCASMC migration was elevated with SENCR knockdown although not as much as that observed with the potent migratory stimulus, PDGF-BB (Figure 7D and 7E). We also observed augmented PDGF-BB–induced cell migration on concomitant knockdown of SENCR (Figure VI in the online-only Data Supplement). Taken together, these results strongly support a role for SENCR in the regulation of HCASMC differentiation and cellular motility.
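One common way to express scratch wound results is percent wound closure, sketched here under stated assumptions. The quantification used in this study is described in the online-only Supplement and may differ; the gap widths below are invented and the function names are hypothetical.

```python
def percent_wound_closure(width_start, width_end):
    """Percent of the initial gap closed between two time points.

    Widths may be in any consistent unit (for example, micrometers).
    """
    if width_start <= 0:
        raise ValueError("initial wound width must be positive")
    return 100.0 * (width_start - width_end) / width_start


def relative_migration(closure_knockdown, closure_control):
    """Closure in knockdown cells relative to control cells."""
    return closure_knockdown / closure_control


# Invented gap widths: control closes 25% of the gap, knockdown closes 50%.
control = percent_wound_closure(500, 375)
knockdown = percent_wound_closure(500, 250)
print(control, knockdown, relative_migration(knockdown, control))
```

A ratio above 1 would indicate faster gap closure after knockdown; because proliferation can also close a wound, such assays are usually paired with an independent measure such as the Boyden chamber.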
Contrary to the historical notion of pervasive junk DNA,1 most of the human genome is transcribed, signifying a treasure-trove of previously unrecognized functional DNA sequences. These include tens of millions of regulatory elements as well as the expansive class of lncRNA genes. LncRNA genes already outnumber protein-coding genes and they exhibit diverse functions related to gene expression and splicing; protein translation, activity, and trafficking; as well as the formation of specialized microenvironmental niches.54,55 Here, we present the first RNA-seq study in a human vascular cell type for the specific discovery of lncRNA genes. We used strict criteria and discovered 31 previously unannotated lncRNAs, 21 of which we validated in human cell lines and human tissues. In addition, we detected a few lncRNAs in dated human plasma, suggesting that these may have potential utility as biomarkers of clinical disease.56 One of the lncRNA genes, named here as SENCR, shows a selective pattern of expression in cells and tissues with highest levels in human vascular SMC and ECs. We discovered that SENCR undergoes alternative splicing, consistent with widespread splicing of transcripts across the human genome.57 SENCR overlaps the 5′ end of the FLI1 transcription factor in the antisense orientation, but does not seem to regulate local gene expression in cis. Indeed, our extensive RNA fluorescence in situ hybridization and biochemical fractionation studies clearly indicate SENCR to be a cytoplasmic lncRNA, supporting an extranuclear function. Using RNA-seq after knockdown of SENCR, we observed uniform decreases in expression of SMC contractile–associated genes as well as attenuated expression of the major transcriptional switch (Myocardin) for the differentiation of vascular SMC.39 However, knockdown of SENCR augments a promigratory gene signature that facilitates heightened SMC migration.
Thus, we have uncovered a new vascular cell–enriched lncRNA that seems to function in the maintenance of a normal, nonmotile SMC phenotype.
An analysis of 707 sense–antisense gene pairs annotated in the UCSC genome browser58 shows diversity in structural orientation, with most lncRNAs representing natural antisense transcripts (47.0%), followed by intronic (18.8%), divergent (16.4%), completely overlapping (7.4%), 5′ overlapping (7.1%), and 3′ overlapping (3.4%) lncRNAs (Table VI in the online-only Data Supplement). Much of what is known about sense–antisense gene pairs relates to natural antisense transcripts and effects on local gene expression through such processes as transcriptional interference, double-stranded RNA-mediated events, or the guidance of chromatin remodeling complexes that repress or enhance protein-coding gene expression in cis or trans.33,53,59 SENCR falls within the subclass of 5′ overlapping lncRNAs whose exons do not overlap with those of the sense protein-coding (or noncoding) gene. The terminal portion of intron 1 of SENCR overlaps a region of high homology, likely representing conserved sequences corresponding to the proximal 5′ promoter of FLI1. There is another island of homology within intron 2 of SENCR, suggesting SENCR could be a precursor for conserved small RNA molecules. Although the second and third exons of SENCR overlap the 5′ promoter region of FLI1, there is comparatively weak sequence conservation, suggesting SENCR does not sponge critical DNA-binding transcription factors necessary for FLI1 mRNA expression (Figure 3). In fact, SENCR and FLI1 seem to be coexpressed in several cells and tissues, including vascular SMC. This is entirely congruent with our inability to show a consistent effect of knocking down either SENCR or FLI1 on the other gene’s level of expression. It is interesting to note that there are few, if any, data on expression of FLI1 mRNA and protein in vascular SMC.
Further, the functionality of FLI1 in vascular SMC has not been assessed although an EC-specific knockout of Fli1 showed reduced pericytes and vascular SMC investing the dermal microvasculature.60 In light of FLI1 expression in vascular SMC as reported here, it will be important to directly assess the role of FLI1 in vascular SMC differentiation and function through conditional gene ablation studies.
We know little as to how sense–antisense gene pairs involving lncRNAs are transcriptionally controlled. Presumably, divergent (head to head) sense–antisense pairs share a common promoter as has been described for many bidirectionally transcribed protein-coding genes.61 However, it is completely unclear how other sense–antisense pairs may be transcribed, particularly a lncRNA that is coexpressed with the sense mRNA as shown in this report. Simultaneous expression of FLI1 and SENCR would seem unlikely because of transcriptional collision.62 How then might SENCR and FLI1 be transcribed? Perhaps there are shared promoter elements that facilitate alternating transcription between SENCR and FLI1. Consistent with this idea, no SENCR promoter activity was detected unless sequences encompassing the FLI1 5′ region were included, although the level of activity remained much lower when compared with an EC-restricted promoter (DLL4; not shown). Interestingly, a previous report showed undetectable activity of the FLI1 promoter in cells expressing high levels of FLI1 mRNA.63 This could imply there exists a remotely acting enhancer element critical for alternating transcription of SENCR and FLI1. Another possibility is that SENCR and FLI1 are monoallelically expressed in a mutually exclusive manner.64 Recently, single-cell RNA-seq analysis demonstrated that as much as 24% of autosomal genes exhibit monoallelic expression thus providing support for this hypothesis.65 Clearly, a major task for future investigative work will be to elucidate the transcriptional control of SENCR and other lncRNAs during vascular cell differentiation or pathological conditions.
Elucidating the function of lncRNAs has been hampered by the absence of any obvious lncRNA sequence code. One approach to begin understanding lncRNA function is to reduce the level of lncRNA expression and then evaluate the transcriptome of a cell type.66 In this study, we knocked down SENCR in HCASMC and found that the contractile phenotype of these cells was attenuated with concomitant increases in several promigratory genes leading to enhanced cell motility. The mechanism for such changes in cell phenotype is unknown at this time; however, because SENCR is localized to the cytoplasm, it seems unlikely that it acts through direct interaction with DNA or the recruitment of chromatin-modifying complexes to target genes as shown for many nuclear lncRNAs.53,67 It is more probable that SENCR functions in some post-transcriptional capacity to effect the observed changes in gene expression. Because all SMC contractile genes were attenuated with SENCR knockdown, a post-transcriptional mechanism would likely involve the targeting of a protein or RNA that is antecedent to the SMC contractile gene program. One possibility would be that SENCR sponges a low-abundance microRNA that otherwise would function to mute the SMC contractile gene program, similar to what has been shown for linc-MD1 in skeletal muscle.68 Other potential post-transcriptional mechanisms of action for SENCR include stabilization, destabilization, or enhanced ribosomal translation of pivotal RNA transcripts, as proposed for other recently defined lncRNAs.24,69–71 The results of this study provide a foundation for exploration of these and other possible mechanisms of SENCR activity using emerging biochemical tools to analyze lncRNA interactions with other macromolecules in the cytoplasm.72
The explosive rise of lncRNAs in human and mouse genomes has profound implications for future research in vascular biology. First, unlike microRNAs, which number ≈1000 and almost universally function through a predictable and well-defined process, lncRNAs number in the tens of thousands and their functions and mechanisms of action will be, arguably, as diverse as those for protein-coding genes. This will necessitate a global effort to define all lncRNAs in the vasculature (especially nonpolyadenylated) under normal and stress-induced conditions and delineate their mode of regulation and function. Second, lncRNAs such as SENCR are poorly conserved and lack easily defined sequences that would imply a clear function in blood vessels. The apparent lack of orthologous mouse lncRNA genes such as SENCR constrains the extent to which experimental analyses can be done in a rigorous and controlled manner to gain functional insights. However, mouse-specific lncRNAs may have limited translational relevance to the study of human development and disease. Nevertheless, lncRNAs that share little sequence homology but are structurally similar may exhibit comparable functions across species.73,74 In this context, there is a pressing need to gain insight into the structure of lncRNAs to develop lncRNA codes that would facilitate functional classification across species. As a first approximation of the structure of SENCR, we used mFold (http://mfold.rna.albany.edu) and RNAfold (http://rna.tbi.univie.ac.at/cgi-bin/RNAfold.cgi) and found it to exhibit a stable RNA structure (Figure VII in the online-only Data Supplement) with minimum free energies of −486 and −470 kcal/mol, respectively. Another implication of widespread lncRNA genes will be the need for extreme caution and strategic design in the creation of genetically altered mice, especially when targeting the 5′ end of a gene where inadvertent disruption of other sequences such as lncRNAs is likely to occur.
The emergence of precision-guided genome editing (eg, CRISPR/Cas9) will be of great value in this context.75 Finally, most genetic variation occurs in non–protein-coding sequence space,76 which is interposed with transcription factor binding sites such as CArG boxes11 and lncRNAs such as ANRIL.35 Historically, there has been a notable lack of understanding as to how noncoding sequence variations associated with disease perturb function in a cell. Now, with increasing efforts devoted to understanding noncoding sequences, there will be an effort to model human single nucleotide polymorphisms associated with vascular disease through, for example, CRISPR/Cas9-mediated point mutations in the mouse genome. In this context, it will be important to know whether the sequence variants in exons 2 and 3 of SENCR confer differential expression, localization, or function in a disease setting. Altered lncRNA expression of TIE1-AS146 and ANRIL48 has already been noted in human vascular disease.
In summary, we have developed a rigorous experimental pipeline for the discovery and study of lncRNAs in human vascular cells (Figure I in the online-only Data Supplement). This approach uncovered many previously unrecognized lncRNAs, including the human-specific, vascular cell–selective SENCR, which we show is an alternatively spliced and weakly expressed cytoplasmic 5′ overlapping antisense lncRNA. Loss-of-function studies support the concept of SENCR acting as a fine-tuner of the vascular SMC phenotype. Of note, SENCR is one of the first 5′ overlapping antisense lncRNAs (as defined here in Table VI in the online-only Data Supplement) to be studied in detail. Future work should aim to elucidate the regulatory control and function of SENCR in models of human vascular SMC and EC development as well as disease-associated processes.
ENCODE: ENCyclopedia Of DNA Elements
FLI1: Friend leukemia virus integration 1
HCASMC: human coronary artery smooth muscle cell(s)
HUVEC: human umbilical vein endothelial cell(s)
lincRNA: long intervening noncoding RNA
lncRNA: long noncoding RNA
RT-PCR: reverse transcription polymerase chain reaction
SENCR: smooth muscle and endothelial cell–enriched migration/differentiation-associated long noncoding RNA
SMC: smooth muscle cell
We gratefully acknowledge the expertise of the University of Rochester Genomics Research Center for performing the RNA-seq experiments and analyzing differences in protein-coding genes as well as generating the volcano plot.
Sources of Funding
This work was supported by grants from the National Institutes of Health (HL62572 and HL091168 to J.M. Miano; MH099452 to D. Zheng and partially by HL111770; and 5P01-CA013106 to D.L. Spector) and the American Heart Association (10SDG3670036 to X. Long and 12POST11950002 to R.D. Bell). J.H. Bergmann was funded by a DAAD postdoctoral fellowship.
1. Ohno S. So much “junk” DNA in our genome. Brookhaven Symp Biol. 1972; 23:366–370.
2. Okazaki Y, Furuno M, Kasukawa T, et al.; FANTOM Consortium; RIKEN Genome Exploration Research Group Phase I & II Team. Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs. Nature. 2002; 420:563–573.
3. Kapranov P, Cawley SE, Drenkow J, Bekiranov S, Strausberg RL, Fodor SP, Gingeras TR. Large-scale transcriptional activity in chromosomes 21 and 22. Science. 2002; 296:916–919.
4. Carninci P, Kasukawa T, Katayama S, et al.; FANTOM Consortium; RIKEN Genome Exploration Research Group and Genome Science Group (Genome Network Project Core Group). The transcriptional landscape of the mammalian genome. Science. 2005; 309:1559–1563.
5. Cheng J, Kapranov P, Drenkow J, et al. Transcriptional maps of 10 human chromosomes at 5-nucleotide resolution. Science. 2005; 308:1149–1154.
6. Mercer TR, Gerhardt DJ, Dinger ME, Crawford J, Trapnell C, Jeddeloh JA, Mattick JS, Rinn JL. Targeted RNA sequencing reveals the deep complexity of the human transcriptome. Nat Biotechnol. 2012; 30:99–104.
7. Livyatan I, Harikumar A, Nissim-Rafinia M, Duttagupta R, Gingeras TR, Meshorer E. Non-polyadenylated transcription in embryonic stem cells reveals novel non-coding RNA related to pluripotency and differentiation. Nucleic Acids Res. 2013; 41:6300–6315.
8. Gerstein MB, Kundaje A, Hariharan M, et al. Architecture of the human regulatory network derived from ENCODE data. Nature. 2012; 489:91–100.
9. Neph S, Vierstra J, Stergachis AB, et al. An expansive human regulatory lexicon encoded in transcription factor footprints. Nature. 2012; 489:83–90.
10. Zhang X, Odom DT, Koo SH, et al. Genome-wide analysis of cAMP-response element binding protein occupancy, phosphorylation, and target gene activation in human tissues. Proc Natl Acad Sci USA. 2005; 102:4459–4464.
11. Benson CC, Zhou Q, Long X, Miano JM. Identifying functional single nucleotide polymorphisms in the human CArGome. Physiol Genomics. 2011; 43:1038–1048.
12. Johnson R, Richter N, Bogu GK, Bhinge A, Teng SW, Choo SH, Andrieux LO, de Benedictis C, Jauch R, Stanton LW. A genome-wide screen for genetic variants that modify the recruitment of REST to its target genes. PLoS Genet. 2012; 8:e1002624.
13. The ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012; 489:57–74.
14. Belinky F, Bahir I, Stelzer G, Zimmerman S, Rosen N, Nativ N, Dalah I, Iny Stein T, Rappaport N, Mituyama T, Safran M, Lancet D. Non-redundant compendium of human ncRNA genes in GeneCards. Bioinformatics. 2013; 29:255–261.
15. Lagos-Quintana M, Rauhut R, Lendeckel W, Tuschl T. Identification of novel genes coding for small expressed RNAs. Science. 2001; 294:853–858.
16. Falaleeva M, Stamm S. Processing of snoRNAs as a new source of regulatory non-coding RNAs: snoRNA fragments form a new class of functional RNAs. Bioessays. 2013; 35:46–54.
17. Ghildiyal M, Zamore PD. Small silencing RNAs: an expanding universe. Nat Rev Genet. 2009; 10:94–108.
18. Guttman M, Amit I, Garber M, et al. Chromatin signature reveals over a thousand highly conserved large non-coding RNAs in mammals. Nature. 2009; 458:223–227.
19. Khalil AM, Guttman M, Huarte M, Garber M, Raj A, Rivea Morales D, Thomas K, Presser A, Bernstein BE, van Oudenaarden A, Regev A, Lander ES, Rinn JL. Many human large intergenic noncoding RNAs associate with chromatin-modifying complexes and affect gene expression. Proc Natl Acad Sci USA. 2009; 106:11667–11672.
20. Cabili MN, Trapnell C, Goff L, Koziol M, Tazon-Vega B, Regev A, Rinn JL. Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses. Genes Dev. 2011; 25:1915–1927.
21. Lee JT. Epigenetic regulation by long noncoding RNAs. Science. 2012; 338:1435–1439.
22. Ulitsky I, Bartel DP. lincRNAs: genomics, evolution, and mechanisms. Cell. 2013; 154:26–46.
23. Gutschner T, Hämmerle M, Diederichs S. MALAT1—a paradigm for long noncoding RNA function in cancer. J Mol Med (Berl). 2013; 91:791–801.
24. Kretz M, Siprashvili Z, Chu C, et al. Control of somatic tissue differentiation by the long non-coding RNA TINCR. Nature. 2013; 493:231–235.
25. St Laurent G, Shtokalo D, Dong B, Tackett MR, Fan X, Lazorthes S, Nicolas E, Sang N, Triche TJ, McCaffrey TA, Xiao W, Kapranov P. VlincRNAs controlled by retroviral elements are a hallmark of pluripotency and cancer. Genome Biol. 2013; 14:R73.
26. De Santa F, Barozzi I, Mietton F, Ghisletti S, Polletti S, Tusi BK, Muller H, Ragoussis J, Wei CL, Natoli G. A large fraction of extragenic RNA pol II transcription sites overlap enhancers. PLoS Biol. 2010; 8:e1000384.
27. Ørom UA, Derrien T, Beringer M, Gumireddy K, Gardini A, Bussotti G, Lai F, Zytnicki M, Notredame C, Huang Q, Guigo R, Shiekhattar R. Long noncoding RNAs with enhancer-like function in human cells. Cell. 2010; 143:46–58.
28. Tahira AC, Kubrusly MS, Faria MF, Dazzani B, Fonseca RS, Maracaja-Coutinho V, Verjovski-Almeida S, Machado MC, Reis EM. Long noncoding intronic RNAs are differentially expressed in primary and metastatic pancreatic cancer. Mol Cancer. 2011; 10:141.
29. St Laurent G, Shtokalo D, Tackett MR, Yang Z, Eremina T, Wahlestedt C, Urcuqui-Inchima S, Seilheimer B, McCaffrey TA, Kapranov P. Intronic RNAs constitute the major fraction of the non-coding RNA in mammalian cells. BMC Genomics. 2012; 13:504.
30. Chen J, Sun M, Kent WJ, Huang X, Xie H, Wang W, Zhou G, Shi RZ, Rowley JD. Over 20% of human transcripts might form sense-antisense pairs. Nucleic Acids Res. 2004; 32:4812–4820.
31. Katayama S, Tomaru Y, Kasukawa T, et al.; RIKEN Genome Exploration Research Group; Genome Science Group (Genome Network Project Core Group); FANTOM Consortium. Antisense transcription in the mammalian transcriptome. Science. 2005; 309:1564–1566.
32. Derrien T, Johnson R, Bussotti G, et al. The GENCODE v7 catalog of human long noncoding RNAs: analysis of their gene structure, evolution, and expression. Genome Res. 2012; 22:1775–1789.
33. Magistri M, Faghihi MA, St Laurent G, Wahlestedt C. Regulation of chromatin structure by long noncoding RNAs: focus on natural antisense transcripts. Trends Genet. 2012; 28:389–396.
34. Lee JT, Bartolomei MS. X-inactivation, imprinting, and long noncoding RNAs in health and disease. Cell. 2013; 152:1308–1323.
35. Pasmant E, Sabbagh A, Vidaud M, Bièche I. ANRIL, a long, noncoding RNA, is an unexpected major hotspot in GWAS. FASEB J. 2011; 25:444–448.
36. Volders PJ, Helsens K, Wang X, Menten B, Martens L, Gevaert K, Vandesompele J, Mestdagh P. LNCipedia: a database for annotated human lncRNA transcript sequences and structures. Nucleic Acids Res. 2013; 41(Database issue):D246–D251.
37. Norman C, Runswick M, Pollock R, Treisman R. Isolation and properties of cDNA clones encoding SRF, a transcription factor that binds to the c-fos serum response element. Cell. 1988; 55:989–1003.
38. Wang D, Chang PS, Wang Z, Sutherland L, Richardson JA, Small E, Krieg PA, Olson EN. Activation of cardiac gene expression by myocardin, a transcriptional cofactor for serum response factor. Cell. 2001; 105:851–862.
39. Chen J, Kitchen CM, Streb JW, Miano JM. Myocardin: a component of a molecular switch for smooth muscle differentiation. J Mol Cell Cardiol. 2002; 34:1345–1356.
40. Kume T. Foxc2 transcription factor: a newly described regulator of angiogenesis. Trends Cardiovasc Med. 2008; 18:224–228.
41. Lammerts van Bueren K, Black BL. Regulation of endothelial and hematopoietic development by the ETS transcription factor Etv2. Curr Opin Hematol. 2012; 19:199–205.
42. De Val S, Chi NC, Meadows SM, Minovitsky S, Anderson JP, Harris IS, Ehlers ML, Agarwal P, Visel A, Xu SM, Pennacchio LA, Dubchak I, Krieg PA, Stainier DY, Black BL. Combinatorial regulation of endothelial gene expression by ets and forkhead transcription factors. Cell. 2008; 135:1053–1064.
43. Small EM, Olson EN. Pervasive roles of microRNAs in cardiovascular biology. Nature. 2011; 469:336–342.
44. Rinn JL, Chang HY. Genome regulation by long noncoding RNAs. Annu Rev Biochem. 2012; 81:145–166.
45. Han DK, Khaing ZZ, Pollock RA, Haudenschild CC, Liau G. H19, a marker of developmental transition, is reexpressed in human atherosclerotic plaques and is regulated by the insulin family of growth factors in cultured rabbit smooth muscle cells. J Clin Invest. 1996; 97:1276–1285.
46. Li K, Blum Y, Verma A, et al. A noncoding antisense RNA in tie-1 locus regulates tie-1 function in vivo. Blood. 2010; 115:133–139.
47. Congrains A, Kamide K, Katsuya T, Yasuda O, Oguro R, Yamamoto K, Ohishi M, Rakugi H. CVD-associated non-coding RNA, ANRIL, modulates expression of atherogenic pathways in VSMC. Biochem Biophys Res Commun. 2012; 419:612–616.
48. Motterle A, Pu X, Wood H, Xiao Q, Gor S, Ng FL, Chan K, Cross F, Shohreh B, Poston RN, Tucker AT, Caulfield MJ, Ye S. Functional analyses of coronary artery disease associated variation on chromosome 9p21 in vascular smooth muscle cells. Hum Mol Genet. 2012; 21:4021–4029.
49. Leung A, Trac C, Jin W, Lanting L, Akbany A, Sætrom P, Schones DE, Natarajan R. Novel long noncoding RNAs are regulated by angiotensin II in vascular smooth muscle cells. Circ Res. 2013; 113:266–278.
50. Liu F, Walmsley M, Rodaway A, Patient R. Fli1 acts at the top of the transcriptional network driving blood and endothelial development. Curr Biol. 2008; 18:1234–1240.
51. Battich N, Stoeger T, Pelkmans L. Image-based transcriptomics in thousands of single human cells at single-molecule resolution. Nat Methods. 2013; 10:1127–1133.
52. Ponting CP, Oliver PL, Reik W. Evolution and functions of long noncoding RNAs. Cell. 2009; 136:629–641.
53. Fatica A, Bozzoni I. Long non-coding RNAs: new players in cell differentiation and development. Nat Rev Genet. 2014; 15:7–21.
54. Batista PJ, Chang HY. Long noncoding RNAs: cellular address codes in development and disease. Cell. 2013; 152:1298–1307.
55. Kung JT, Colognori D, Lee JT. Long noncoding RNAs: past, present, and future. Genetics. 2013; 193:651–669.
56. Arita T, Ichikawa D, Konishi H, Komatsu S, Shiozaki A, Shoda K, Kawaguchi T, Hirajima S, Nagata H, Kubota T, Fujiwara H, Okamoto K, Otsuji E. Circulating long non-coding RNAs in plasma of patients with gastric cancer. Anticancer Res. 2013; 33:3185–3193.
57. Pan Q, Shai O, Lee LJ, Frey BJ, Blencowe BJ. Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nat Genet. 2008; 40:1413–1415.
58. Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, Haussler D. The human genome browser at UCSC. Genome Res. 2002; 12:996–1006.
59. Lapidot M, Pilpel Y. Genome-wide natural antisense transcription: coupling its regulation to its different regulatory mechanisms. EMBO Rep. 2006; 7:1216–1222.
60. Asano Y, Stawski L, Hant F, Highland K, Silver R, Szalai G, Watson DK, Trojanowska M. Endothelial Fli1 deficiency impairs vascular homeostasis: a role in scleroderma vasculopathy. Am J Pathol. 2010; 176:1983–1998.
61. Wakano C, Byun JS, Di LJ, Gardner K. The dual lives of bidirectional promoters. Biochim Biophys Acta. 2012; 1819:688–693.
62. Hobson DJ, Wei W, Steinmetz LM, Svejstrup JQ. RNA polymerase II collision interrupts convergent transcription. Mol Cell. 2012; 48:365–374.
63. Barbeau B, Bergeron D, Beaulieu M, Nadjem Z, Rassart E. Characterization of the human and mouse Fli-1 promoter regions. Biochim Biophys Acta. 1996; 1307:220–232.
64. Raslova H, Komura E, Le Couédic JP, Larbret F, Debili N, Feunteun J, Danos O, Albagli O, Vainchenker W, Favier R. FLI1 monoallelic expression combined with its hemizygous loss underlies Paris-Trousseau/Jacobsen thrombopenia. J Clin Invest. 2004; 114:77–84.
65. Deng Q, Ramsköld D, Reinius B, Sandberg R. Single-cell RNA-seq reveals dynamic, random monoallelic gene expression in mammalian cells. Science. 2014; 343:193–196.
66. Panzitt K, Tschernatsch MM, Guelly C, Moustafa T, Stradner M, Strohmaier HM, Buck CR, Denk H, Schroeder R, Trauner M, Zatloukal K. Characterization of HULC, a novel gene with striking up-regulation in hepatocellular carcinoma, as noncoding RNA. Gastroenterology. 2007; 132:330–342.
67. Yang L, Froberg JE, Lee JT. Long noncoding RNAs: fresh perspectives into the RNA world. Trends Biochem Sci. 2014; 39:35–43.
68. Cesana M, Cacchiarelli D, Legnini I, Santini T, Sthandier O, Chinappi M, Tramontano A, Bozzoni I. A long noncoding RNA controls muscle differentiation by functioning as a competing endogenous RNA. Cell. 2011; 147:358–369.
69. Faghihi MA, Modarresi F, Khalil AM, Wood DE, Sahagan BG, Morgan TE, Finch CE, St Laurent G, Kenny PJ, Wahlestedt C. Expression of a noncoding RNA is elevated in Alzheimer’s disease and drives rapid feed-forward regulation of beta-secretase. Nat Med. 2008; 14:723–730.
70. Gong C, Maquat LE. lncRNAs transactivate STAU1-mediated mRNA decay by duplexing with 3′ UTRs via Alu elements. Nature. 2011; 470:284–288.
71. Carrieri C, Cimatti L, Biagioli M, Beugnet A, Zucchelli S, Fedele S, Pesce E, Ferrer I, Collavin L, Santoro C, Forrest AR, Carninci P, Biffo S, Stupka E, Gustincich S. Long non-coding antisense RNA controls Uchl1 translation through an embedded SINEB2 repeat. Nature. 2012; 491:454–457.
72. Zhu J, Fu H, Wu Y, Zheng X. Function of lncRNAs and approaches to lncRNA-protein interactions. Sci China Life Sci. 2013; 56:876–885.
73. Ulitsky I, Shkumatava A, Jan CH, Sive H, Bartel DP. Conserved function of lincRNAs in vertebrate embryonic development despite rapid sequence evolution. Cell. 2011; 147:1537–1550.
74. Torarinsson E, Sawera M, Havgaard JH, Fredholm M, Gorodkin J. Thousands of corresponding human and mouse genomic regions unalignable in primary sequence contain common RNA structure. Genome Res. 2006; 16:885–889.
75. Wang H, Yang H, Shivalila CS, Dawlaty MM, Cheng AW, Zhang F, Jaenisch R. One-step generation of mice carrying mutations in multiple genes by CRISPR/Cas-mediated genome engineering. Cell. 2013; 153:910–918.
76. Maurano MT, Humbert R, Rynes E, et al. Systematic localization of common disease-associated variation in regulatory DNA. Science. 2012; 337:1190–1195.
For the first time, RNA-seq has been performed in human coronary artery smooth muscle cells for the discovery of long noncoding RNA genes. We report the gene structure, expression, splicing, and spatial localization of a new vascular cell–selective long noncoding RNA we call SENCR. Although SENCR has no apparent cis effect on gene expression, its knockdown compromises the smooth muscle cell contractile gene program and elevates many promigratory genes. Accordingly, these cells exhibit a hypermotile phenotype, which can be reversed by knocking down 2 promigratory genes that are induced with SENCR knockdown. These results describe the first long noncoding RNA gene reported to be selectively expressed in human vascular cells and provide a framework for further study of long noncoding RNA genes during vascular cell development and in disease processes.