Article

The Idiopathic Pulmonary Fibrosis-Associated Single Nucleotide Polymorphism RS35705950 Is Transcribed in a MUC5B Promoter Associated Long Non-Coding RNA (AC061979.1)

1 Department of Applied Sciences, Faculty of Health and Life Sciences, Northumbria University, Ellison Building, Newcastle-Upon-Tyne NE1 8ST, UK
2 Translational and Clinical Research Institute, Faculty of Medical Sciences, Newcastle University, Central Parkway, Newcastle-Upon-Tyne NE1 3BZ, UK
3 Institute of General Physiology, University of Ulm, Albert-Einstein-Allee 11, D89081 Ulm, Germany
4 Institute of Pathology, MHH Hannover, 30625 Hannover, Germany
* Author to whom correspondence should be addressed.
Non-Coding RNA 2022, 8(6), 83; https://doi.org/10.3390/ncrna8060083
Submission received: 1 October 2022 / Revised: 25 November 2022 / Accepted: 30 November 2022 / Published: 8 December 2022

Abstract

LncRNAs are involved in regulatory processes in the human genome, including gene expression. The rs35705950 SNP, previously associated with IPF, overlaps with the recently annotated lncRNA AC061979.1, a 1712 nucleotide transcript located within the MUC5B promoter at chromosome 11p15.5. To document the expression pattern of the transcript, we processed 3.9 TBases of publicly available RNA-SEQ data across 27 independent studies involving lung airway epithelial cells. Epithelial lung cells showed expression of this putative pancRNA. The findings were independently validated in cell lines and primary cells. The rs35705950 SNP lies within a conserved region (from fish to primates) of the expressed sequence, indicating functional importance. These results implicate the rs35705950-containing AC061979.1 pancRNA as a novel component of the MUC5B expression control minicircuitry.

1. Introduction

Human DNA consists of protein-coding regions and non-coding regions. Protein-coding genomic regions are abundantly transcribed, evolutionarily conserved, mutationally sensitive sequences which impact cellular phenotype. These constitute approximately 1% of the human genome [1]. Non-coding regions of DNA, on the other hand, are more complex and can be divided into at least five structural types: (i) binding motifs for regulatory proteins, (ii) non-coding RNAs (ncRNAs), (iii) transposable elements, (iv) highly repetitive DNA—essential in gene regulation and chromosome maintenance, and (v) pseudogenes [2,3]. In 2003, the ENCODE (Encyclopedia of DNA Elements) project was launched to identify and classify functional elements in the human genome including non-coding transcripts. The project continues to grow with the results being made available on Ensembl and UCSC genome browser for both human and mouse [4].
Most of the ncRNAs predicted by ENCODE are expressed at low levels [4]. However, their abundance is not a proxy for their functionality [5]. For example, the predicted lncRNA ENST00000567151, or viability enhancing in lung cancer transcript (VELUCT), was found at only 0.01 copies per cell. Despite its low copy number, VELUCT expression was reported to be upregulated 5.2-fold in lung cancer cells, and its knockdown reduces the viability of multiple lung-cancer cell lines by as much as 90% [6].
Recently, lncRNAs (long ncRNAs) have been intensively studied due to their involvement in cancer [7], neurological conditions [8], pulmonary diseases [9], and the regulation of chromosome structure [10]. This research culminated in several published lncRNA databases: NONCODE [11], Lnc2Cancer [12], LncRNADisease [13], and LncRNAdb [14]. LncRNAs can regulate expression via at least two mechanisms: cis-acting lncRNAs (which regulate the expression of adjacent genes) and trans-acting lncRNAs (which regulate the expression of distant genes on other chromosomes) [15]. Most pancRNAs (promoter associated ncRNAs), to date, have been associated with an increased production of mRNA from the adjacent protein-coding gene, suggesting that pancRNAs might contribute to gene expression regulation. Protein-coding genes that possess pancRNAs also exhibit tri-methylated lysine 4 of histone 3 (H3K4me3) and acetylated lysine 27 of histone 3 (H3K27ac), whereas the pancRNA-free genes appear to lack such epigenetic signatures [16]. However, how pancRNAs change the expression of protein-coding genes remains unknown.
Furthermore, evidence is amassing concerning the expression of pancRNAs and the occurrence of epigenetic changes. Thus, typically, when single nucleotide polymorphisms (SNPs) appear in such non-coding transcript loci, the associated pancRNA secondary structure is disrupted, affecting expression patterns and impacting upon function [17]. Whilst expression changes in high-copy-number lncRNAs are easy to determine by routine RNA-SEQ, the effect of SNPs resulting in small changes in lncRNA expression levels is harder to study.
The G/T rs35705950 SNP found in the promoter of mucin 5B (MUC5B) on chromosome 11p15.5 [18,19] has one of the highest (~40%) [20] and most reproducible associations with idiopathic pulmonary fibrosis (IPF) across White, Hispanic, and Asian populations [21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37], with homozygous mutants exhibiting a higher risk of developing the disease [38] and higher mortality [39]. The polymorphism is implicated in the elevated transcription and translation of MUC5B in both healthy and diseased individuals [40]. This is evidenced via episomal expression of luciferase driven by TT or GG MUC5B promoters cloned from IPF patients in A549 alveolar epithelial cells [41]. Since MUC5B is one of the largest proteins encoded in the human genome, excessive expression is proposed to lead to elevated endoplasmic reticulum (ER) stress [42] through MUC5B protein recycling and the unfolded protein response, increasing cell sensitivity to exogenous insults and pro-apoptotic phenotypes. This is exacerbated in alveolar lung epithelia, where MUC5B aberrant mRNA expression is elevated but MUC5B protein production is not normally observed. Presently, the polymorphism is thought to (a) disrupt a 25 CpG motif differentially methylated region which is, counterintuitively, hypermethylated in IPF, and (b) enhance the binding of the transcription factor Forkhead Box Protein A2 (FOXA2), 32 bp downstream of the SNP, as evidenced by chromatin immunoprecipitation [18]. Given the distal effect of the SNP on the FOXA2 binding site and the emerging role of pancRNA in transcription regulation, we sought to determine whether an lncRNA transcript might be implicated in MUC5B expression and its transcriptional dysregulation in the context of the rs35705950 SNP.
To this end, we analysed publicly deposited RNA-SEQ datasets. However, most pipelines for novel transcript discovery are focused on small RNA populations or certain RNA species [43], and RNA-SEQ workflows typically involve polyadenylated transcript enrichment. This creates a classical signal-to-noise-ratio detection problem where selective signal acquisition and amplification during sequencing-library preparation may reduce non-polyadenylated transcript read frequencies to levels typically ascribed to background noise. Inspired by the application of very long baseline interferometry in expanding observation dynamic range beyond standard signal-to-noise-ratio limitations through signal integration from multiple sources operating similar data acquisition protocols [44], we applied composite analysis of third-party RNA-SEQ datasets to reveal the existence of such technically occluded transcripts. Overall, we describe a novel and simple computational method for performing such de-novo lncRNA transcript searches by aggregating data from diverse input sources and focusing analytical efforts across the RNA-SEQ-verse on specific genomic regions of interest.

2. Results

2.1. Rediscovery of the Non-Coding Transcript AC061979.1 in the Promoter Region of the MUC5B Gene

To identify a putative non-coding transcript in a “dark” intergenic region on the p-terminus of chromosome 11 on the human genome in the context of lung epithelia, we manually collected and interrogated a total of 3.9 TBases of publicly available RNA-SEQ data involving epithelial lung cells (see Table S1) to generate a summative transcriptional signal of the AC061979.1 locus (see Figure 1A). Concomitantly, the GENCODE [45] release 32 (GRCh38) described the putative transcript AC061979.1 (chromosome 11:1,218,530–1,220,242) mapping to the same region (see Figure 2). Interestingly, this novel lncRNA is reported to be subject to splicing, with rs35705950 mapping to the second nucleotide of exon 2 and, therefore, possibly altering AC061979.1 splicing; however, no evidence of splicing was immediately apparent through our analysis (see Figure 1) and no rs35705950-containing RNA-SEQ data from lung epithelia were found among the surveyed studies, with the exception of three donor samples across two independent studies (SRP096589/GSE93526 and SRP102483/GSE97036) where the expression of AC061979.1 was documented (see Table S1).
Manual inspection of each dataset indicated that almost all samples representing lung epithelial cell lines or primary lung epithelial cells showed evidence of expression in the locus. Of particular interest, however, was a dataset (SRP082973) that isolated only basal cells from the epithelium. Thus, within the same study, we compared basal cell and the epithelium (mix of basal, ciliated, columnar and secretory cells) RNA expression [46]. Interestingly, whilst reads from epithelial cell extracts mapped liberally to the AC061979.1 locus, sporadic alignments of only a couple of reads were detected among basal-cell RNA (Figure 1). Given that basal cells are a sub-type of human-airway epithelial cells not involved in mucous production, which act as stem cells for the other sub-types (ciliated, columnar and secretory cells) [47], these results potentially show the activation of ncRNA AC061979.1 after cell differentiation. Analysis of data from IPF studies (see Figure 1C) involving lung tissue (GSE52463 or SRP033095), primary cells (GSE116086 or SRP151008), and at single-cell level (GSE124685 or SRP175341) demonstrated only limited coverage of the locus in line with the low expression level indicated elsewhere, and confirmed the absence of AC061979.1 expression in fibroblasts. Taken together, these results suggest that expression at AC061979.1 is detectable in lung epithelia irrespective of the biological origin of the data or the precise sequencing protocol used, minimising the risk of batch-associated effects.
To determine the evolutionary importance of the DNA sequence harbouring the rs35705950 polymorphism, the human reference genome (GRCh38.p13) was aligned against nine vertebrate genome sequences: seven mammals (Rhesus monkey, baboon, marmoset, pig, sheep, rat, mouse), one fish (zebrafish) and one bird (chicken). High similarity was observed in exonic regions across primates, with phylogenetically distant mammals showing conservation only at the 5′ end of the second exon (see Figure 3A). This region appears to harbour at least three conserved loci, including a FOXA2 binding motif and four SMAD binding motifs, two of which reside in the putative intron (see Figure S1) and are found across mammalian species (see Figure 3B).
Although AC061979.1 transcript abundance in other species is limited by the lack of RNA-SEQ datasets to interrogate, taken together these results indicate a functional significance for this pancRNA, with the G/T rs35705950 SNP possibly being involved in differential splicing of AC061979.1.

2.2. MUC5B pancRNA Expression Validation

To independently validate the expression of AC061979.1, we first designed probe hydrolysis RT-qPCR assays for the putative spliced variant and holotranscript. These assays exhibited amplification efficiencies of 128.8% and 90.6% when tested against serial dilutions of a spliced AC061979.1 geneblock or A549 cell DNA extracts, respectively. Next, we obtained RNA extracts from adenocarcinoma human alveolar type-II epithelial cells (A549 cells) and cystic-fibrosis bronchial epithelial cells (CFBE41o-) representing alveolar and bronchial epithelial cells, respectively. To account for the potential impact of contact inhibition effects on AC061979.1 expression, total RNA was extracted at low (<30%) and high (>70%) confluence, and expression of the two AC061979.1 variants was assessed against 18S rRNA and MUC5B, across serial dilutions of total RNA. These analyses indicated that only the AC061979.1 full transcript was detectable, albeit at a very low copy number (see Figure 4). Thus, at 50 ng of RNA input per RT-PCR reaction, in A549 cells the pancRNA ΔCt to 18S was 22.44 (±5.94) at high confluence vs. 24.12 (±3.16) at low confluence, whereas in CFBE41o- the ΔCt was 21.92 (±5.53) at high confluence vs. 28.59 (±0.81) at low confluence (n = 3). Of note, where RNA extraction resulted in higher Ct values for 18S, the capacity to detect the pancRNA transcript was lost as concentrations dropped below the assay limit of detection, justifying the very high load of RNA template in the RT-PCR reactions.
The same assays were performed with undifferentiated (basal) and ALI-differentiated HAEpCs. The basal cells were cultivated to a confluence of 40–50% before extracting total RNA. Of the two AC061979.1 transcripts, only the full variant was detectable (Ct 36.19 ± 0.29), whereas the spliced variant was not detectable. Interestingly, MUC5B levels were below the assay detection limit; 18S was detected at a Ct of 10.63 (±0.14). ALI-differentiated HAEpCs in control conditions showed similar values to the basal cells (36.84 ± 1.38), whereas in IL-13-stimulated cells both AC061979.1 variants were below the detection limit. In differentiated epithelia, Cts for 18S were low for both controls and IL-13-treated cells (7.97 ± 0.06 and 8.53 ± 0.08, respectively). Similarly to the basal cells, MUC5B was not detectable. Efforts to define the 5′ and 3′ ends of the transcript by RLM-RACE failed to produce sequencing-grade amplicons, probably due to the low expression level of the transcript.
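Amplification efficiencies such as those quoted for the two assays are conventionally derived from the slope of a standard curve (Ct plotted against log10 of template input) as E = 10^(−1/slope) − 1. The sketch below assumes that conventional formula applies here, which the text does not state explicitly; the function name and the dilution series are hypothetical placeholders, not data from this study.

```python
import math


def amplification_efficiency(log10_inputs, cts):
    """Fit Ct = slope * log10(input) + intercept by ordinary least squares.

    Returns (slope, efficiency in percent), with E = 10**(-1/slope) - 1.
    """
    n = len(log10_inputs)
    mean_x = sum(log10_inputs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_inputs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_inputs, cts))
    slope = sxy / sxx
    efficiency_pct = (10 ** (-1.0 / slope) - 1.0) * 100.0
    return slope, efficiency_pct


# Hypothetical ten-fold dilution series (placeholder values, not study data).
# Each ten-fold dilution delays Ct by 1/log10(2) cycles, i.e. perfect doubling.
log10_inputs = [4, 3, 2, 1, 0]
cts = [20.0 + (4 - x) / math.log10(2) for x in log10_inputs]

slope, efficiency_pct = amplification_efficiency(log10_inputs, cts)
print(f"slope = {slope:.3f}, efficiency = {efficiency_pct:.1f}%")
```

Under this formula, an efficiency of 128.8% would correspond to a standard-curve slope of about −2.78 and an efficiency of 90.6% to a slope of about −3.57.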
To ascertain the relevance of the FOXA2 binding motif in exon 2 of AC061979.1, we examined FOXA2 expression levels in A549 cells and HAEpCs. This analysis indicated no statistically significant difference between A549 cells at different confluence levels (p = 0.1000), but a 10.6 ± 2.77-fold reduction in FOXA2 levels after IL-13 stimulation (p < 0.0001) consistent with the loss of AC061979.1 (see Figure S2).

3. Discussion

MUC5B dysregulation presently appears to be mechanistically involved in the development of the underlying pathology, particularly in the context of the IPF-associated SNP rs35705950. It contributes to mucus overproduction and expression in the alveolar microenvironment, leading to micro-injuries to alveolar epithelium and, across the lifetime of a carrier, excessive cell death and fibrosis [48]. Whilst in one study the polymorphism was found in 51% of the patients with IPF, but in only 23% of the control group [18], it is unclear at present if the onset of disease among rs35705950 positive controls is a matter of time, lifestyle, or additional genetic variability. However, the strong association and high incidence rate of the polymorphism in IPF make a compelling case for lifestyle management and preventative chronic or genome modifying treatments targeting MUC5B expression repression and IPF, such as small interfering RNA, antisense or genome/prime editing. The four putative SMAD binding sites within the AC061979.1 locus, two of which reside in the proposed intron, suggest complex interplay between SMAD as an inducer and FOXA2 as a repressor of MUC5B.
Helling et al. (2017) reported a binding motif for FOXA2, located 32 bp downstream of rs35705950, which overlaps with the second putative exon of the pancRNA AC061979.1, as reported in GENCODE v32 (see Figure 2). The protein-coding gene for FOXA2 originates on chromosome 20, p11.21, between the lncRNA LINC00261 and LNCNEF. Whilst LINC00261 (a.k.a. DEANR1 [49], FALCOR [50], and LCAL62 [51]) is widely studied for its role in non-small cell lung cancer, no study exists on LNCNEF to date. LINC00261 is an endoderm-associated lncRNA which recruits SMAD2/3 to induce the expression of FOXA2 [49,51]. FOXA2 transcription factor is known to have a role in lung development and homeostasis [52], MUC5B expression and IPF [53]; however, research in this area is limited.
Our RNA-SEQ-verse survey did not return an explicit splicing signal in line with RNA-SEQ observations associated with high copy number RNAs (see Figure 5); however, if the proposed splicing event is confirmed, the location of the SNP raises the possibility that aberrant AC061979.1 splicing might be occurring in the context of the rs35705950 SNP. In turn, improper AC061979.1 splicing could be driving aberrant biochemistry on the locus such as the FOXA2 association demonstrated by Helling et al. (2017). Such a finding would introduce the additional option of a splice-correcting treatment in preventing the onset of IPF among rs35705950 carriers. Importantly, this oligonucleotide therapeutic modality has been approved for clinical use in Duchenne muscular dystrophy (eteplirsen) and spinal muscular atrophy (nusinersen) without the need for drug-delivery solutions that otherwise plague efficacious oligonucleotide therapies for the lung [54].
LncRNAs can form complex biological systems by binding to other RNA molecules, regulatory proteins, or DNA. FENDRR is an lncRNA expressed in the nascent lateral mesoderm, in the promoter of Forkhead Box F1 (FOXF1), where it forms a triple helix with double-stranded DNA and increases the occupancy of the Polycomb repressive complex 2 (PRC2) at this site. Rescue experiments on FENDRR-knockdown cells wherein a construct expressing the lncRNA was placed randomly in the genome showed its biological role and that the transcript acts in trans [55]. Similarly, LINC00261-null cells were rescued by viruses expressing FOXA2, in the transcriptional activation of FOXA2, which is upstream of LINC00261 [49]. It is, thus, possible that the mechanism behind MUC5B regulation involves an assembly between the pancRNA AC061979.1 and other regulatory proteins or transcripts interacting with the promoter region of MUC5B acting in cis or in trans, including the competitive binding of FOXA2 or SMAD2/3. Although Helling et al. (2017) did not assess the importance of SMAD2/3 in MUC5B expression, Feldman et al. (2019) showed that phosphorylated SMAD levels are low in mucosecretory cells, and the inflammatory TGF-beta-dependent SMAD signaling inhibition enhanced mucin expression, as well as goblet-cell metaplasia and hyperplasia, supporting a role for SMAD proteins in MUC5B expression regulation [56]. Interactions with SMAD2/3 in the promoter of MUC5B and AC061979.1 are indeed possible due to the presence of the canonical SMAD binding element (SBE) CAGAC within the intronic region of the pancRNA, and the newly described GGC(GC)(CG) motif also known as 5GC SBE [57] within the first exon of AC061979.1 [56]. Moreover, SMAD2/3 does not necessarily need to occupy either of these SBEs on chromatin, because SMAD2/3 does not occupy the SBEs located within the LINC00261 gene [49]; instead, it interacts with LINC00261 directly at least under some experimental conditions [50].
LINC00261 is, therefore, an example of cis-acting ncRNA, whereas other lncRNAs such as EMT-associated lncRNA induced by TGF-β1 (ELIT-1) act in trans to bind SMAD to SMAD binding elements (SBEs) such as the CAGAC box [58]. Disruption of this proposed SMAD2/3, AC061979.1, and possibly the FOXA2 ribonucleoprotein and chromatin interaction network at the MUC5B promoter by rs35705950, for example, due to aberrant splicing, could explain MUC5B overexpression in IPF, given the pivotal role of SMAD proteins in resolving goblet-cell metaplasia and hyperplasia in inflammatory pulmonary disease [56].
To date, the FOXA2 binding site 32 bp downstream of rs35705950 has been shown to bind FOXA2 in episomal reporter systems but not by genome editing or CHIP-SEQ [18]. Our own genome-editing efforts with three separate single-guide RNAs to introduce the rs35705950 G/T transversion at Chr11:1,219,991 in A549 cells in support of CHIP-SEQ, RIP-SEQ, and proteomic experiments to resolve the MUC5B transcriptional complex, have so far proven to irreparably affect cell viability or fail in generating any detectable editing either by T7-EI or sequencing assays. Furthermore, no verified G/T or T/T lung epithelial cell line is currently available to support such mechanistic studies. As lncRNA–protein interaction is a hot research topic, recent studies have focused on developing computational methods for predicting these complex networks [59,60,61,62]. It is, thus, anticipated that with increasing understanding of lncRNA biology and characterisation of lncRNA structures and families, additional insights into AC061979.1 function might be obtained.
In this study, we developed a simple-to-use method for the targeted mining of the RNA-SEQ dataverse for lncRNA transcripts irrespective of their polyadenylation status. Our method is achievable on a public server in Galaxy (galaxyproject.org) with an extensive easy-to-follow guide available (see Scheme S1). It takes Sequence Read Archive (SRA) codes as input and outputs a .TXT file reporting the depth of coverage per position, making end-user memory requirements compatible with standard desktop/laptop computers or even smartphones. However, it can be adapted to run on a cluster without a graphical user interface (GUI). Using this method, we have been able to amass evidence through the analysis of 3.9 TBases of RNA-SEQ data across 27 publications documenting the expression of a novel pancRNA overlapping the IPF-associated rs35705950 SNP implicated in MUC5B overexpression, annotated as AC061979.1 by GENCODE. The results were replicated by qRT-PCR in A549 and CFBE41o- submerged cultures as well as in pHAECs.

4. Materials and Methods

4.1. RNA-SEQ Data Processing for Novel ncRNA Detection

To determine the existence of a MUC5B pancRNA, we manually collected publicly deposited RNA-SEQ data from 27 independent studies involving alveolar and bronchial samples from primary human tissue and in vitro experiments (see Table S1). RNA-SEQ reads above Q20 were mapped to the human reference genome GRCh38.p13 using HISAT2 [63]. Mapped reads were filtered with samtools view [64] and only read pairs mapping to chromosome 11, region 1,202,000–1,220,500, were kept. Subsequently, the depth of coverage per base was extracted from all datasets and collapsed. The results were visualised in R Studio (ggplot2). The pipeline can be performed in Galaxy (galaxyproject.org). An extensive step-by-step guide is available as a Supplementary File (Scheme S1).
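The final "collapse" step above can be illustrated with a minimal sketch that sums per-base depth across datasets. It assumes each dataset has already been reduced to three tab-separated columns (chromosome, 1-based position, depth), the layout written by `samtools depth` for a single alignment file. The function name, the chromosome label `chr11` (which depends on the reference index used) and the toy inputs are assumptions for illustration, not part of the published workflow.

```python
import io
from collections import defaultdict

# Region of interest from the text; the "chr11" label is an assumption and
# may need to be "11" depending on the reference index used for mapping.
REGION = ("chr11", 1_202_000, 1_220_500)


def collapse_depth(handles, region=REGION):
    """Sum per-base depth over several depth tables.

    Each handle yields lines of 'chrom<TAB>position<TAB>depth'. Positions
    outside the region are ignored. Returns {position: summed depth}.
    """
    chrom, start, end = region
    total = defaultdict(int)
    for handle in handles:
        for line in handle:
            if not line.strip():
                continue  # skip blank lines
            name, pos, depth = line.rstrip("\n").split("\t")[:3]
            pos = int(pos)
            if name == chrom and start <= pos <= end:
                total[pos] += int(depth)
    return dict(sorted(total.items()))


# Toy depth tables standing in for two independent datasets (not real data).
study_a = io.StringIO("chr11\t1219991\t2\nchr11\t1219992\t1\nchr11\t1300000\t9\n")
study_b = io.StringIO("chr11\t1219991\t3\nchr12\t1219991\t7\n")

collapsed = collapse_depth([study_a, study_b])
for position, depth in collapsed.items():
    print(position, depth, sep="\t")
```

Summing depth position by position is what allows reads that are too sparse to stand out in any single dataset to accumulate into a visible signal across many datasets.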

4.2. Multiple Sequence Alignment

To demonstrate the evolutionary importance of the region overlapping the promoter polymorphism rs35705950, we compared the human ncRNA with nucleotide sequences of nine other species from fish to primates: Rhesus monkey (Macaca mulatta), baboon (Papio anubis), white-tufted-ear marmoset (Callithrix jacchus), pig (Sus scrofa), sheep (Ovis aries), Norway rat (Rattus norvegicus), house mouse (Mus musculus), chicken (Gallus gallus) and zebrafish (Danio rerio). Alignments of genome sequences were undertaken using the AVID and Shuffle-LAGAN programs implemented through mVISTA (http://genome.lbl.gov/vista/mvista/submit.shtml, accessed on 10 January 2022) [65] with a match criterion of 70% identity over 50 bp [66]. All sequences used in the analysis are included in Table S2. Subsequently, we aligned the same genomic sequences with Clustal Omega for a nucleotide-by-nucleotide comparison.
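The match criterion of 70% identity over 50 bp can be read as a sliding-window test on a pairwise alignment. The sketch below is a simplified illustration of that idea and is not the mVISTA implementation: it assumes two already-aligned, equal-length sequences with `-` for gaps, counts a column as a match only when both bases are identical and not gaps, and reports every window that meets the threshold. The function name and the toy sequences are hypothetical.

```python
def conserved_windows(aligned_a, aligned_b, window=50, min_identity=0.70):
    """Return (start, identity) for each window meeting the identity threshold.

    Inputs are two rows of a pairwise alignment (same length, '-' for gaps).
    'start' is a 0-based alignment column.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = [
        int(a == b and a != "-")
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
    ]
    hits = []
    running = sum(matches[:window])  # matches in the first window
    for start in range(len(matches) - window + 1):
        if start > 0:
            # Slide by one column: add the new column, drop the old one.
            running += matches[start + window - 1] - matches[start - 1]
        identity = running / window
        if identity >= min_identity:
            hits.append((start, identity))
    return hits


# Toy alignment: the first 50 columns are identical, the last 30 all differ.
seq_a = "ACGT" * 20
seq_b = seq_a[:50] + seq_a[50:].translate(str.maketrans("ACGT", "CATG"))

hits = conserved_windows(seq_a, seq_b)
print(len(hits), hits[0], hits[-1])
```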

4.3. Cell Culture

A549 cells (passages 10–12) were thawed and seeded in T25 flasks at 37 °C, 5% CO2 in DMEM/F12 (1:1) (ThermoFisher Scientific, Cramlington, UK) with 10% FBS, 1% L-Glutamine, and 1% Penicillin/Streptomycin (Merck Life Science UK Ltd., Dorset, UK). Cells were cultivated until 50–60% confluency, then split into new T25 flasks until 20–30% confluency was reached (usually 24 h). CFBE41o- cells (passages 10–12) were thawed, then seeded in T25 flasks at 37 °C, 5% CO2 in MEM (Merck Life Science UK Ltd.) with 10% FBS, 1% L-Glutamine, 1% Pen/Strep (Merck Life Science UK Ltd.). Cells were cultivated until 50–60% confluency, then seeded into new T25 flasks until 20–30% or 50–60% confluency was reached for subsequent low-confluency or high-confluency total RNA extractions, respectively.
Primary human airway epithelial cells (pHAECs) from several donors (n = 1 basal cells, n = 4 ALI differentiated cells) were isolated from fresh tissues that were obtained during tumor resections or lung transplantation with the full consent of patients (ethics approval: ethics committee Medical School Hannover, project no. 2701-2015).
In addition, pHAEC basal cells (passage 4) were cultivated in T75 flasks in airway epithelial cell basal medium supplemented with airway epithelial cell growth medium supplement pack and with 5 μg/mL Plasmocin prophylactic, 100 μg/mL Primocin and 10 μg/mL Fungin (all from InvivoGen, Toulouse, France). Trypsinization with the Promocell DetachKit (Promocell, Heidelberg, Germany) and RNA extraction were performed at ~40–50% confluency.
pHAEC basal cells for air-liquid interface (ALI) culture (passage 2) were expanded as above in T75 flasks until 90% confluency. The cells were then trypsinized and seeded into Transwell filters (6.5 mm diameter, 4 μm pore size, Corning Costar, Kaiserslautern, Germany). Prior to cell seeding, filters were coated with 100 μL collagen solution (StemCell Technologies, Saint Égrève, France) and left to dry under a sterile hood overnight. Subsequently, the filters were exposed to UV light for 30 min and stored at 4 °C.
Cells were resuspended in growth medium, and 200 μL containing 4 × 10^4 cells were added apically to each filter; an additional 600 μL of the medium was added basolaterally. The medium was replaced every 48 h until 100% confluence was reached. Growth medium was then removed from the apical side, and on the basolateral side it was replaced with ALI differentiation medium ±10 ng/mL IL-13 (IL012; Merck Millipore). Once the ALI was established, medium was exchanged every second day until day 25–28 on ALI. At the endpoint of cultivation, RNA extraction was performed directly on the filter.

4.4. RNA Extraction

RNA extraction was carried out using the miRNeasy mini kit (Qiagen, Manchester, UK). Briefly, cells were detached by trypsinisation then resuspended in 0.7 mL Qiazol Lysis reagent with subsequent steps according to the supplier’s total RNA extraction protocol.
For pHAECs, cells were detached by trypsinisation then resuspended in 2.1 mL Lysis Solution RL from the my-Budget RNA Mini Kit (BioBudget, Krefeld, Germany), and RNA isolation was performed following the manufacturer's protocol. If not used immediately after lysis, the samples were stored at −80 °C. For pHAEC ALI cultures, 100 μL of Lysis Solution RL from the same kit was added to the filters apically and the samples were immediately frozen at −80 °C.

4.5. DNase Treatment and cDNA Synthesis

Total RNA was DNase treated using the Precision™ DNase kit (Primer Design, Southampton, UK), following the manufacturer's protocol. cDNA synthesis was carried out using the High Capacity cDNA Reverse Transcription Kit (ThermoFisher Scientific) following the manufacturer's protocol. A total of 1000 ng of total RNA was loaded into each 20 μL cDNA synthesis reaction.
For pHAECs, cDNA synthesis from the extracted RNA was performed using the SuperScript VILO cDNA Synthesis Kit (Thermo Fisher) following the manufacturer's protocol. A total of 400 ng of RNA was used for each reaction.

4.6. Real-Time Quantitative PCR

Custom primers and probes (Table S3) were designed using the PrimerQuest™ tool (Integrated DNA Technologies BVBA, Leuven, Belgium) and validated against an AC061979.1 geneblock (Integrated DNA Technologies) corresponding to the predicted spliced transcript. Inventoried predesigned assays for 18S and MUC5B were purchased from Thermo Fisher Scientific, and for FOXA2 from Qiagen. Real-time quantitative PCR was performed in 10 μL reactions containing 5 μL TaqMan Fast Advanced Master Mix (2×) (ThermoFisher Scientific), 900 nM forward primer, 900 nM reverse primer, 250 nM probe and 1 μL template on a StepOnePlus™ real-time PCR system (Thermo Fisher Scientific). After a UNG incubation at 50 °C for 2 min, initial denaturation at 95 °C for 2 min was followed by 40 cycles of 95 °C denaturation for 1 s and 60 °C annealing/extension for 20 s. Gene expression was calculated according to the delta Ct method [67]. Statistical analyses on gene expression were performed on data expressed as a fold difference to high confluence A549 samples, and control samples in a paired sample fashion for HAEpCs, respectively. GraphPad Prism v.9.4.1 (GraphPad Software, LLC, San Diego, CA, USA) was used for Kolmogorov–Smirnov tests for cell-line results and paired t-tests for HAEpC results. For RNA ligase-mediated 5′ and 3′ rapid amplification of cDNA ends (RLM-RACE), the FirstChoice RLM-RACE kit was used according to the manufacturer's instructions (Thermo Fisher) using pancRNA gene-specific primers for RT-PCR.
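The delta Ct calculation referred to above can be sketched as follows, assuming the common convention ΔCt = Ct(target) − Ct(reference) and a fold difference of 2^(−ΔΔCt) relative to a calibrator. The base of 2 presumes roughly 100% amplification efficiency, which is an assumption of this sketch rather than a statement about the assays. Function names are hypothetical; the only values taken from the text are the basal-cell Cts reported in Section 2.2.

```python
def delta_ct(ct_target, ct_reference):
    """Delta Ct: target Ct normalised to the reference gene (here 18S)."""
    return ct_target - ct_reference


def relative_expression(d_ct):
    """Expression relative to the reference gene, assuming perfect doubling."""
    return 2.0 ** (-d_ct)


def fold_difference(d_ct_sample, d_ct_calibrator):
    """Fold difference of a sample against a calibrator (2**-ddCt)."""
    return 2.0 ** (-(d_ct_sample - d_ct_calibrator))


# Basal HAEpC Cts from Section 2.2: full-length AC061979.1 vs. 18S rRNA.
basal_d_ct = delta_ct(36.19, 10.63)
print(f"basal delta Ct = {basal_d_ct:.2f}")

# Illustrative only: a sample one cycle "later" than its calibrator carries
# half as much target relative to the reference gene.
print(fold_difference(25.0, 24.0))
```

A larger ΔCt therefore means lower expression relative to 18S, which is why the ΔCt values above 20 reported in Section 2.2 indicate a very low copy number.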

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ncrna8060083/s1, Table S1: List of the studies used in the RNA-SEQ meta-analysis; Table S2: Overview of the intergenic region between MUC5AC and MUC5B and the rs35705950-MUC5B transcription-start-site (TSS) distance across four species; Table S3: Primers and probes used in the study; Scheme S1: Extensive workflow of the RNA-SEQ meta-analysis; Figure S1: AC061979.1 DNA sequence. Upper case letters: exons; lower case letters: intron; green highlight: SMAD2/3 binding motif; pink highlight: FOXA2 binding motif; Figure S2: FOXA2 expression levels in A549 cells and HAEpCs. The levels of FOXA2 mRNA were determined in A549 cells at low and high confluence levels and in HAEpCs with and without IL-13 stimulation. Expression, normalised to 18S rRNA levels, was calculated relative to high confluence A549s or paired unstimulated HAEpCs. ****: p < 0.0001.

Author Contributions

Conceptualization, S.A.M.; methodology, R.N., I.E., G.F., G.A. and M.F.; software, R.N.; validation, D.J.T.; formal analysis, R.N.; investigation, R.N.; resources, E.C.S., S.A.M. and P.B.; data curation, R.N.; writing—original draft preparation, R.N.; writing—review and editing, S.A.M. and E.C.S.; visualization, R.N.; supervision, S.A.M., S.V. and M.F.; project administration, S.A.M.; funding acquisition, S.A.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the ethics committee Medical School Hannover (project no. 2701-2015, approved on 23 April 2015).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

Publicly available datasets were analyzed in this study. The SRA/GSM accession numbers for each study can be found in Supplementary Table S1.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Carroll, S.B. Evo-devo and an expanding evolutionary synthesis: A genetic theory of morphological evolution. Cell 2008, 134, 25–36. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  2. Chen, Y.G.; Satpathy, A.T.; Chang, H.Y. Gene regulation in the immune system by long noncoding RNAs. Nat. Immunol. 2017, 18, 962–972. [Google Scholar] [CrossRef] [PubMed]
  3. Palazzo, A.F.; Gregory, T.R. The case for junk DNA. PLoS Genet. 2014, 10, e1004351. [Google Scholar] [CrossRef] [Green Version]
  4. The ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 2012, 489, 57. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  5. Palazzo, A.F.; Lee, E.S. Non-coding RNA: What is functional and what is junk? Front. Genet. 2015, 6, 2. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  6. Seiler, J.; Breinig, M.; Caudron-Herger, M.; Polycarpou-Schwarz, M.; Boutros, M.; Diederichs, S. The lncRNA VELUCT strongly regulates viability of lung cancer cells despite its extremely low abundance. Nucleic Acids Res. 2017, 45, 5458–5469. [Google Scholar] [CrossRef]
  7. Ji, P.; Diederichs, S.; Wang, W.; Böing, S.; Metzger, R.; Schneider, P.M.; Tidow, N.; Brandt, B.; Buerger, H.; Bulk, E.; et al. MALAT-1, a novel noncoding RNA, and thymosin β 4 predict metastasis and survival in early-stage non-small cell lung cancer. Oncogene 2003, 22, 8031–8041. [Google Scholar] [CrossRef] [Green Version]
  8. Bahrami, T.; Taheri, M.; Omrani, M.D.; Karimipoor, M. Associations Between Genomic Variants in lncRNA-TRPM2-AS and lncRNA-HNF1A-AS1 Genes and Risk of Multiple Sclerosis. J. Mol. Neurosci. 2020, 70, 1050–1055. [Google Scholar] [CrossRef]
  9. Kumar, P.; Sen, C.; Peters, K.; Frizzell, R.A.; Biswas, R. Comparative analyses of long non-coding RNA profiles in vivo in cystic fibrosis lung airway and parenchyma tissues. Respir. Res. 2019, 20, 284. [Google Scholar] [CrossRef] [Green Version]
  10. Gendrel, A.V.; Heard, E. Fifty years of X-inactivation research. Development 2011, 138, 5049–5055. [Google Scholar] [CrossRef]
  11. Zhao, Y.; Li, H.; Fang, S.; Kang, Y.; Wu, W.; Hao, Y.; Li, Z.; Bu, D.; Sun, N.; Zhang, M.Q.; et al. NONCODE 2016: An informative and valuable data source of long non-coding RNAs. Nucleic Acids Res. 2016, 44, D203–D208. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  12. Ning, S.; Zhang, J.; Wang, P.; Zhi, H.; Wang, J.; Liu, Y.; Gao, Y.; Guo, M.; Yue, M.; Wang, L.; et al. Lnc2Cancer: A manually curated database of experimentally supported lncRNAs associated with various human cancers. Nucleic Acids Res. 2016, 44, D980–D985. [Google Scholar] [CrossRef] [PubMed]
  13. Chen, G.; Wang, Z.; Wang, D.; Qiu, C.; Liu, M.; Chen, X.; Zhang, Q.; Yan, G.; Cui, Q. LncRNADisease: A database for long-non-coding RNA-associated diseases. Nucleic Acids Res. 2012, 41, D983–D986. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  14. Quek, X.C.; Thomson, D.W.; Maag, J.L.; Bartonicek, N.; Signal, B.; Clark, M.B.; Gloss, B.S.; Dinger, M.E. lncRNAdb v2. 0: Expanding the reference database for functional long noncoding RNAs. Nucleic Acids Res. 2015, 43, D168–D173. [Google Scholar] [CrossRef]
  15. Ma, L.; Bajic, V.B.; Zhang, Z. On the classification of long non-coding RNAs. RNA Biol. 2013, 10, 924–933. [Google Scholar] [CrossRef] [Green Version]
  16. Uesaka, M.; Agata, K.; Oishi, T.; Nakashima, K.; Imamura, T. Evolutionary acquisition of promoter-associated non-coding RNA (pancRNA) repertoires diversifies species-dependent gene activation mechanisms in mammals. BMC Genom. 2017, 18, 285. [Google Scholar] [CrossRef] [Green Version]
  17. Minotti, L.; Agnoletto, C.; Baldassari, F.; Corrà, F.; Volinia, S. SNPs and somatic mutation on long non-coding RNA: New frontier in the cancer studies? High-Throughput 2018, 7, 34. [Google Scholar] [CrossRef] [Green Version]
  18. Helling, B.A.; Gerber, A.N.; Kadiyala, V.; Sasse, S.K.; Pedersen, B.S.; Sparks, L.; Nakano, Y.; Okamoto, T.; Evans, C.M.; Yang, I.V.; et al. Regulation of MUC5B expression in idiopathic pulmonary fibrosis. Am. J. Respir. Cell Mol. Biol. 2017, 57, 91–99. [Google Scholar] [CrossRef]
  19. Evans, C.M.; Fingerlin, T.E.; Schwarz, M.I.; Lynch, D.; Kurche, J.; Warg, L.; Yang, I.V.; Schwartz, D.A. Idiopathic pulmonary fibrosis: A genetic disease that involves mucociliary dysfunction of the peripheral airways. Physiol. Rev. 2016, 96, 1567–1591. [Google Scholar] [CrossRef] [Green Version]
  20. Seibold, M.A.; Wise, A.L.; Speer, M.C.; Steele, M.P.; Brown, K.K.; Loyd, J.E.; Fingerlin, T.E.; Zhang, W.; Gudmundsson, G.; Groshong, S.D.; et al. A common MUC5B promoter polymorphism and pulmonary fibrosis. New Engl. J. Med. 2011, 364, 1503–1512. [Google Scholar] [CrossRef]
  21. Noth, I.; Zhang, Y.; Ma, S.F.; Flores, C.; Barber, M.; Huang, Y.; Broderick, S.M.; Wade, M.S.; Hysi, P.; Scuirba, J.; et al. Genetic variants associated with idiopathic pulmonary fibrosis susceptibility and mortality: A genome-wide association study. Lancet Respir. Med. 2013, 1, 309–317. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  22. Dressen, A.; Abbas, A.R.; Cabanski, C.; Reeder, J.; Ramalingam, T.R.; Neighbors, M.; Bhangale, T.R.; Brauer, M.J.; Hunkapiller, J.; Reeder, J.; et al. Analysis of protein-altering variants in telomerase genes and their association with MUC5B common variant status in patients with idiopathic pulmonary fibrosis: A candidate gene sequencing study. Lancet Respir. Med. 2018, 6, 603–614. [Google Scholar] [CrossRef] [PubMed]
  23. Hobbs, B.D.; Putman, R.K.; Araki, T.; Nishino, M.; Gudmundsson, G.; Gudnason, V.; Eiriksdottir, G.; Zilhao Nogueira, N.R.; Dupuis, J.; Xu, H.; et al. Overlap of genetic risk between interstitial lung abnormalities and idiopathic pulmonary fibrosis. Am. J. Respir. Crit. Care Med. 2019, 200, 1402–1413. [Google Scholar] [CrossRef] [PubMed]
  24. Wang, C.; Zhuang, Y.; Guo, W.; Cao, L.; Zhang, H.; Xu, L.; Fan, Y.; Zhang, D.; Wang, Y. Mucin 5B promoter polymorphism is associated with susceptibility to interstitial lung diseases in Chinese males. PLoS ONE 2014, 9, e104919. [Google Scholar] [CrossRef] [Green Version]
  25. Hunninghake, G.M.; Hatabu, H.; Okajima, Y.; Gao, W.; Dupuis, J.; Latourelle, J.C.; Nishino, M.; Araki, T.; Zazueta, O.E.; Kurugol, S.; et al. MUC5B promoter polymorphism and interstitial lung abnormalities. New Engl. J. Med. 2013, 368, 2192–2200. [Google Scholar] [CrossRef] [Green Version]
  26. Van der Vis, J.J.; Snetselaar, R.; Kazemier, K.M.; ten Klooster, L.; Grutters, J.C.; van Moorsel, C.H. Effect of M uc5b promoter polymorphism on disease predisposition and survival in idiopathic interstitial pneumonias. Respirology 2016, 21, 712–717. [Google Scholar] [CrossRef]
  27. Wei, R.; Li, C.; Zhang, M.; Jones-Hall, Y.L.; Myers, J.L.; Noth, I.; Liu, W. Association between MUC5B and TERT polymorphisms and different interstitial lung disease phenotypes. Transl. Res. 2014, 163, 494–502. [Google Scholar] [CrossRef] [Green Version]
  28. Peljto, A.L.; Zhang, Y.; Fingerlin, T.E.; Ma, S.F.; Garcia, J.G.; Richards, T.J.; Silveira, L.J.; Lindell, K.O.; Steele, M.P.; Loyd, J.E.; et al. Association between the MUC5B promoter polymorphism and survival in patients with idiopathic pulmonary fibrosis. JAMA 2013, 309, 2232–2239. [Google Scholar] [CrossRef]
  29. Peljto, A.L.; Selman, M.; Kim, D.S.; Murphy, E.; Tucker, L.; Pardo, A.; Lee, J.S.; Ji, W.; Schwarz, M.I.; Yang, I.V.; et al. The MUC5B promoter polymorphism is associated with idiopathic pulmonary fibrosis in a Mexican cohort but is rare among Asian ancestries. Chest 2015, 147, 460–464. [Google Scholar] [CrossRef] [Green Version]
  30. Stock, C.J.; Sato, H.; Fonseca, C.; Banya, W.A.; Molyneaux, P.L.; Adamali, H.; Russell, A.M.; Denton, C.P.; Abraham, D.J.; Hansell, D.M.; et al. Mucin 5B promoter polymorphism is associated with idiopathic pulmonary fibrosis but not with development of lung fibrosis in systemic sclerosis or sarcoidosis. Thorax 2013, 68, 436–441. [Google Scholar] [CrossRef]
  31. Borie, R.; Crestani, B.; Dieude, P.; Nunes, H.; Allanore, Y.; Kannengiesser, C.; Airo, P.; Matucci-Cerinic, M.; Wallaert, B.; Israel-Biet, D.; et al. The MUC5B variant is associated with idiopathic pulmonary fibrosis but not with systemic sclerosis interstitial lung disease in the European Caucasian population. PLoS ONE 2013, 8, e70621. [Google Scholar] [CrossRef] [PubMed]
  32. Kishore, A.; Žižková, V.; Kocourková, L.; Petrkova, J.; Bouros, E.; Nunes, H.; Loštáková, V.; Müller-Quernheim, J.; Zissel, G.; Kolek, V.; et al. Association study for 26 candidate loci in idiopathic pulmonary fibrosis patients from four European populations. Front. Immunol. 2016, 7, 274. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  33. Zhu, Q.Q.; Zhang, X.L.; Zhang, S.M.; Tang, S.W.; Min, H.Y.; Yi, L.; Xu, B.; Song, Y. Association between the MUC5B promoter polymorphism rs35705950 and idiopathic pulmonary fibrosis: A meta-analysis and trial sequential analysis in Caucasian and Asian populations. Medicine 2015, 94. [Google Scholar] [CrossRef] [PubMed]
  34. Horimasu, Y.; Ohshimo, S.; Bonella, F.; Tanaka, S.; Ishikawa, N.; Hattori, N.; Kohno, N.; Guzman, J.; Costabel, U. MUC 5 B promoter polymorphism in J apanese patients with idiopathic pulmonary fibrosis. Respirology 2015, 20, 439–444. [Google Scholar] [CrossRef] [PubMed]
  35. Deng, Y.; Li, Z.; Liu, J.; Wang, Z.; Cao, Y.; Mou, Y.; Fu, B.; Mo, B.; Wei, J.; Cheng, Z.; et al. Targeted resequencing reveals genetic risks in patients with sporadic idiopathic pulmonary fibrosis. Hum. Mutat. 2018, 39, 1238–1245. [Google Scholar] [CrossRef]
  36. Mathai, S.K.; Humphries, S.; Kropski, J.A.; Blackwell, T.S.; Powers, J.; Walts, A.D.; Markin, C.; Woodward, J.; Chung, J.H.; Brown, K.K.; et al. MUC5B variant is associated with visually and quantitatively detected preclinical pulmonary fibrosis. Thorax 2019, 74, 1131–1139. [Google Scholar] [CrossRef]
  37. Lorenzo-Salazar, J.M.; Ma, S.F.; Jou, J.; Hou, P.C.; Guillen-Guio, B.; Allen, R.J.; Jenkins, R.G.; Wain, L.V.; Oldham, J.M.; Noth, I.; et al. Novel idiopathic pulmonary fibrosis susceptibility variants revealed by deep sequencing. ERJ Open Res. 2019, 5. [Google Scholar] [CrossRef]
  38. Moore, C.; Blumhagen, R.Z.; Yang, I.V.; Walts, A.; Powers, J.; Walker, T.; Bishop, M.; Russell, P.; Vestal, B.; Cardwell, J.; et al. Resequencing study confirms that host defense and cell senescence gene variants contribute to the risk of idiopathic pulmonary fibrosis. Am. J. Respir. Crit. Care Med. 2019, 200, 199–208. [Google Scholar] [CrossRef]
  39. Jiang, H.; Hu, Y.; Shang, L.; Li, Y.; Yang, L.; Chen, Y. Association between MUC5B polymorphism and susceptibility and severity of idiopathic pulmonary fibrosis. Int. J. Clin. Exp. Pathol. 2015, 8, 14953. [Google Scholar]
  40. Stock, C.J.; Conti, C.; Montero-Fernandez, Á.; Caramori, G.; Molyneaux, P.L.; George, P.M.; Kokosi, M.; Kouranos, V.; Maher, T.M.; Chua, F.; et al. Interaction between the promoter MUC5B polymorphism and mucin expression: Is there a difference according to ILD subtype? Thorax 2020, 75, 901–903. [Google Scholar] [CrossRef]
  41. Nakano, Y.; Yang, I.V.; Walts, A.D.; Watson, A.M.; Helling, B.A.; Fletcher, A.A.; Lara, A.R.; Schwarz, M.I.; Evans, C.M.; Schwartz, D.A. MUC5B promoter variant rs35705950 affects MUC5B expression in the distal airways in idiopathic pulmonary fibrosis. Am. J. Respir. Crit. Care Med. 2016, 193, 464–466. [Google Scholar] [CrossRef] [PubMed]
  42. Chen, G.; Ribeiro, C.M.; Sun, L.; Okuda, K.; Kato, T.; Gilmore, R.C.; Martino, M.B.; Dang, H.; Abzhanova, A.; Lin, J.M.; et al. XBP1S regulates MUC5B in a promoter variant–dependent pathway in idiopathic pulmonary fibrosis airway epithelia. Am. J. Respir. Crit. Care Med. 2019, 200, 220–234. [Google Scholar] [CrossRef] [PubMed]
  43. Di Bella, S.; La Ferlita, A.; Carapezza, G.; Alaimo, S.; Isacchi, A.; Ferro, A.; Pulvirenti, A.; Bosotti, R. A benchmarking of pipelines for detecting ncRNAs from RNA-Seq data. Briefings Bioinform. 2020, 21, 1987–1998. [Google Scholar] [CrossRef] [PubMed]
  44. Akiyama, K.; Alberdi, A.; Alef, W.; Asada, K.; Azulay, R.; Baczko, A.K.; Ball, D.; Baloković, M.; Barrett, J.; Bintley, D.; et al. First m87 event horizon telescope results: iii—Data processing and calibration. Astrophys. J. Lett. 2019, 875, L3. [Google Scholar]
  45. Frankish, A.; Diekhans, M.; Ferreira, A.M.; Johnson, R.; Jungreis, I.; Loveland, J.; Mudge, J.M.; Sisu, C.; Wright, J.; Armstrong, J.; et al. GENCODE reference annotation for the human and mouse genomes. Nucleic Acids Res. 2019, 47, D766–D773. [Google Scholar] [CrossRef] [Green Version]
  46. Zhang, H.; Yang, J.; Walters, M.S.; Staudt, M.R.; Strulovici-Barel, Y.; Salit, J.; Mezey, J.G.; Leopold, P.L.; Crystal, R.G. Mandatory role of HMGA1 in human airway epithelial normal differentiation and post-injury regeneration. Oncotarget 2018, 9, 14324. [Google Scholar] [CrossRef] [Green Version]
  47. Hackett, N.R.; Shaykhiev, R.; Walters, M.S.; Wang, R.; Zwick, R.K.; Ferris, B.; Witover, B.; Salit, J.; Crystal, R.G. The human airway epithelial basal cell transcriptome. PLoS ONE 2011, 6, e18378. [Google Scholar] [CrossRef]
  48. Richeldi, L.; Collard, H.R.; Jones, M.G. Idiopathic pulmonary fibrosis. Lancet 2017, 389, 1941–1952. [Google Scholar] [CrossRef]
  49. Jiang, W.; Liu, Y.; Liu, R.; Zhang, K.; Zhang, Y. The lncRNA DEANR1 facilitates human endoderm differentiation by activating FOXA2 expression. Cell Rep. 2015, 11, 137–148. [Google Scholar] [CrossRef] [Green Version]
  50. Swarr, D.T.; Herriges, M.; Li, S.; Morley, M.; Fernandes, S.; Sridharan, A.; Zhou, S.; Garcia, B.A.; Stewart, K.; Morrisey, E.E. The long noncoding RNA Falcor regulates Foxa2 expression to maintain lung epithelial homeostasis and promote regeneration. Genes Dev. 2019, 33, 656–668. [Google Scholar] [CrossRef] [Green Version]
  51. Dang, H.X.; White, N.M.; Rozycki, E.B.; Felsheim, B.M.; Watson, M.A.; Govindan, R.; Luo, J.; Maher, C.A. Long non-coding RNA LCAL62/LINC00261 is associated with lung adenocarcinoma prognosis. Heliyon 2020, 6, e03521. [Google Scholar] [CrossRef]
  52. Choi, W.; Choe, S.; Lau, G.W. Inactivation of FOXA2 by respiratory bacterial pathogens and dysregulation of pulmonary mucus homeostasis. Front. Immunol. 2020, 11, 515. [Google Scholar] [CrossRef] [PubMed]
  53. Zhang, Q.; Wang, Y.; Qu, D.; Yu, J.; Yang, J. The possible pathogenesis of idiopathic pulmonary fibrosis considering MUC5B. BioMed Res. Int. 2019, 2019, 9712464. [Google Scholar] [PubMed] [Green Version]
  54. Kumar, M.; Moschos, S. Oligonucleotide therapies for the lung: Ready to return to the clinic? Mol. Ther. 2017, 25, 2604–2606. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  55. Grote, P.; Herrmann, B.G. The long non-coding RNA Fendrr links epigenetic control mechanisms to gene regulatory networks in mammalian embryogenesis. RNA Biol. 2013, 10, 1579–1585. [Google Scholar] [CrossRef] [Green Version]
  56. Feldman, M.B.; Wood, M.; Lapey, A.; Mou, H. SMAD signaling restricts mucous cell differentiation in human airway epithelium. Am. J. Respir. Cell Mol. Biol. 2019, 61, 322–331. [Google Scholar] [CrossRef]
  57. Martin-Malpartida, P.; Batet, M.; Kaczmarska, Z.; Freier, R.; Gomes, T.; Aragón, E.; Zou, Y.; Wang, Q.; Xi, Q.; Ruiz, L.; et al. Structural basis for genome wide recognition of 5-bp GC motifs by SMAD transcription factors. Nat. Commun. 2017, 8, 1–15. [Google Scholar]
  58. Sakai, S.; Ohhata, T.; Kitagawa, K.; Uchida, C.; Aoshima, T.; Niida, H.; Suzuki, T.; Inoue, Y.; Miyazawa, K.; Kitagawa, M. Long noncoding RNA ELIT-1 Acts as a Smad3 cofactor to facilitate TGFβ/Smad signaling and promote Epithelial–mesenchymal TRansition. Cancer Res. 2019, 79, 2821–2838. [Google Scholar] [CrossRef] [Green Version]
  59. Pyfrom, S.C.; Luo, H.; Payton, J.E. PLAIDOH: A novel method for functional prediction of long non-coding RNAs identifies cancer-specific LncRNA activities. BMC Genom. 2019, 20, 137. [Google Scholar] [CrossRef]
  60. Xiao, Y.; Zhang, J.; Deng, L. Prediction of lncRNA-protein interactions using HeteSim scores based on heterogeneous networks. Sci. Rep. 2017, 7, 3664. [Google Scholar] [CrossRef]
  61. Zhao, Q.; Zhang, Y.; Hu, H.; Ren, G.; Zhang, W.; Liu, H. IRWNRLPI: Integrating random walk and neighborhood regularized logistic matrix factorization for lncRNA-protein interaction prediction. Front. Genet. 2018, 9, 239. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  62. Peng, L.; Liu, F.; Yang, J.; Liu, X.; Meng, Y.; Deng, X.; Peng, C.; Tian, G.; Zhou, L. Probing lncRNA–Protein Interactions: Data Repositories, Models, and Algorithms. Front. Genet. 2019, 10, 1346. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  63. Kim, D.; Paggi, J.M.; Park, C.; Bennett, C.; Salzberg, S.L. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. 2019, 37, 907–915. [Google Scholar] [CrossRef] [PubMed]
  64. Li, H.; Handsaker, B.; Wysoker, A.; Fennell, T.; Ruan, J.; Homer, N.; Marth, G.; Abecasis, G.; Durbin, R. The sequence alignment/map format and SAMtools. Bioinformatics 2009, 25, 2078–2079. [Google Scholar] [CrossRef] [Green Version]
  65. Frazer, K.A.; Pachter, L.; Poliakov, A.; Rubin, E.M.; Dubchak, I. VISTA: Computational tools for comparative genomics. Nucleic Acids Res. 2004, 32, W273–W279. [Google Scholar] [CrossRef] [Green Version]
  66. Li, G.; Holland, P.W. The origin and evolution of ARGFX homeobox loci in mammalian radiation. BMC Evol. Biol. 2010, 10, 182. [Google Scholar] [CrossRef] [Green Version]
  67. Livak, K.J.; Schmittgen, T.D. Analysis of relative gene expression data using real-time quantitative PCR and the 2- ΔΔCT method. Methods 2001, 25, 402–408. [Google Scholar] [CrossRef]
Figure 1. RNA-SEQ data processing results. Depth of coverage spanning chromosome 11: 1,218,530–1,220,242 collected from (A) 27 studies (3.9 TBases), (B) a single dataset (SRP082973) comparing epithelial to basal cell expression, and (C) 3 IPF-related lung-tissue (SRP033095) and fibroblast (SRP151008 and SRP175341) studies. The position of rs35705950 is indicated by a red vertical line, and the AC061979.1 primary transcript and spliced exons are indicated in orange.
Figure 2. UCSC Genome Browser view of the genomic location of the annotated lncRNA AC061979.1. The putative pancRNA AC061979.1 (green) is located on chromosome 11 at 1,218,530–1,220,242. The transcription start site of MUC5B is shown in dark blue; thick lines indicate exons and thin lines indicate introns; rs35705950 is highlighted in red.
Figure 3. Conservation of the MUC5B-MUC5AC intergenic region across 10 species. (A) The genomic sequences were aligned using AVID in mVISTA: global pair-wise alignment between ~2000 nt spanning the human AC061979.1 transcript and the whole intergenic region of the other species (~20,000 nt; Table S2). Coloured peaks (purple: AC061979.1 exons; pink: intergenic regions) indicate at least 50 bp with 70% similarity. The grey rectangle indicates the conserved exon across mammals. (B) Multiple Sequence Alignment by ClustalOmega in Jalview shows that 100 nucleotides downstream of rs35705950 (red rectangle) there are (i) 15–25 bp conserved across mammals (purple shades by nucleotide similarity percentage), (ii) a FOXA2 binding site (grey rectangle), and (iii) a third conserved region approximately 10 nt downstream of the FOXA2 binding site.
Figure 4. AC061979.1 expression validation in cell culture. Two-fold serial dilutions of A549 (A,B) and CFBE41o- (C,D) RNA extracts obtained from low- (<30%; (A,C)) and high- (>70%; (B,D)) confluence cells were analysed by probe hydrolysis RT-qPCR for 18S rRNA (triangle), MUC5B (diamond) and AC061979.1 pancRNA (square) expression, with a no-RT reaction set for AC061979.1 included as a negative control (grey circles). Data are plotted on a log2-linear scale and are representative of three independent biological experiments, with duplicate technical replicate Ct values shown.
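For readers interpreting two-fold dilution series such as those in Figure 4, the sketch below shows one common way to estimate amplification efficiency from the slope of Ct against dilution step. It is an assumption-based illustration, not the analysis performed in the study: the function names and the Ct values are hypothetical. With perfect doubling per cycle, each two-fold dilution should raise Ct by about one cycle.

```python
def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def amplification_efficiency(ct_values):
    """Estimate PCR efficiency from Ct values of a two-fold dilution series.

    ct_values[k] is the Ct measured after k successive two-fold dilutions.
    The slope s (cycles per two-fold step) relates to efficiency E by
    1 + E = 2 ** (1 / s), so s = 1 corresponds to E = 1.0 (100%).
    """
    steps = list(range(len(ct_values)))
    slope = fit_slope(steps, ct_values)
    return 2.0 ** (1.0 / slope) - 1.0


# Hypothetical series: Ct rises by exactly one cycle per two-fold dilution.
cts = [20.0, 21.0, 22.0, 23.0, 24.0]
print(amplification_efficiency(cts))  # 1.0, i.e. 100% efficiency
```

A slope noticeably above one cycle per step would suggest efficiency below 100%, which matters when comparing assays by the 2^-ΔΔCt approach.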
Figure 5. UCSC Genome Browser view of the genomic location of the annotated lncRNA LINC00261 and FOXA2. LINC00261 and lncRNA neighboring enhancer of FOXA2 (LNCNEF) are shown in green; FOXA2 is shown in dark blue.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Neatu, R.; Enekwa, I.; Thompson, D.J.; Schwalbe, E.C.; Fois, G.; Abdelaal, G.; Veuger, S.; Frick, M.; Braubach, P.; Moschos, S.A. The Idiopathic Pulmonary Fibrosis-Associated Single Nucleotide Polymorphism RS35705950 Is Transcribed in a MUC5B Promoter Associated Long Non-Coding RNA (AC061979.1). Non-Coding RNA 2022, 8, 83. https://doi.org/10.3390/ncrna8060083

AMA Style

Neatu R, Enekwa I, Thompson DJ, Schwalbe EC, Fois G, Abdelaal G, Veuger S, Frick M, Braubach P, Moschos SA. The Idiopathic Pulmonary Fibrosis-Associated Single Nucleotide Polymorphism RS35705950 Is Transcribed in a MUC5B Promoter Associated Long Non-Coding RNA (AC061979.1). Non-Coding RNA. 2022; 8(6):83. https://doi.org/10.3390/ncrna8060083

Chicago/Turabian Style

Neatu, Ruxandra, Ifeanyi Enekwa, Dean J. Thompson, Edward C. Schwalbe, Giorgio Fois, Gina Abdelaal, Stephany Veuger, Manfred Frick, Peter Braubach, and Sterghios A. Moschos. 2022. "The Idiopathic Pulmonary Fibrosis-Associated Single Nucleotide Polymorphism RS35705950 Is Transcribed in a MUC5B Promoter Associated Long Non-Coding RNA (AC061979.1)" Non-Coding RNA 8, no. 6: 83. https://doi.org/10.3390/ncrna8060083
