1. Introduction
RNA structure is the primary determinant of RNA function. Functional structural motifs within RNA molecules allow for specific interactions with other cellular components, such as proteins, RNAs, or small metabolites [1]. The structure also determines the fate of transcripts, providing recognition signals for the processing or degradation machinery [2,3]. Thus, solving the RNA structure provides essential insight into RNA biology.
The tertiary structure is the most informative for RNA function. It illustrates the final arrangement of the RNA chain in space, showing which nucleotides are available for interactions and which are involved in intramolecular contacts contributing to RNA stability. Solving the tertiary structure of RNA is much more complex than solving that of proteins, mostly due to the higher flexibility and dynamics of RNA molecules. Thus, the secondary structure, which is defined by the pattern of base pairs observed within the RNA molecule, is often used to describe functional motifs in RNA research.
There are several methods available for the determination of the RNA secondary structure. Some of the most widely used methods are based on the mapping of the structurally accessible RNA regions, using chemical reagents which induce cleavage at single-stranded RNA regions (hydroxyl radicals, lead ions) or introduce chemical modifications which can be detected by primer extension (dimethyl sulfate, CMCT, kethoxal) [
4]. Furthermore, SHAPE (selective 2′-hydroxyl acylation analyzed by primer extension) has emerged as a valuable strategy for probing the RNA structure. SHAPE uses small hydroxyl-selective electrophilic reagents to probe the reactivity of the RNA ribose 2′-OH group [
5,
6]. SHAPE chemistry provides quantitative, reproducible, single-nucleotide resolution data because almost all ribonucleotides possess a free 2′-hydroxyl, and each position in the RNA is interrogated. The initial electrophile developed for SHAPE, N-methylisatoic anhydride (NMIA), is not very reactive; therefore, the RNA structures must be probed over tens of minutes. Given the dynamic nature of the RNA structure, this was a major disadvantage of the initial version of SHAPE chemistry. However, several superior reagents for SHAPE chemistry have since been created, including faster-reacting reagents capable of interrogating RNA structure on a timescale of seconds. During the last decade, the reactivity of multiple reagents has been extensively validated on numerous RNAs. These include, among others, 1-methyl-7-nitroisatoic anhydride (1M7) [
7], 1-methyl-6-nitroisatoic anhydride (1M6) [
8], 2-methyl-3-furoic acid imidazolide (FAI) and 2-methylnicotinic acid imidazolide (NAI) [
9], benzoyl cyanide (BzCN) [10], and the recently introduced 2-aminopyridine-3-carboxylic acid imidazolide (2A3) for the interrogation of RNA structures in vivo [
11]. The choice of the SHAPE reagent for RNA structure probing depends on its specific features, e.g., efficiency, accuracy, or the ability to permeate biological membranes.
In the last decade, high-throughput techniques that combine chemical probing with next-generation sequencing have started to shed new light on the sequence-structure relationship of RNA. Because of the single-nucleotide resolution measurements of the RNA structure, high-throughput sequencing-based approaches are providing genome-wide snapshots of the RNA structure. Several such techniques have been developed: SHAPE-Seq [
5,
6], DMS-Seq [
12,
13,
14], MAP-Seq (Multiplexed Accessibility Probing sequencing) [
15], SHAPE-MaP (SHAPE combined with mutational profiling) [16], and DMS-MaPseq [17]. Each follows the same general protocol, consisting of the following steps: (i) structure-dependent modification of the RNA; (ii) generation of a cDNA pool via reverse transcription (RT) of the modified RNA; (iii) construction of the sequencing library; (iv) sequencing of the library; (v) bioinformatic analysis. However, they differ in the approach to adduct detection: SHAPE-Seq uses RT conditions under which the enzyme stops at the modified nucleotide, whereas in SHAPE-MaP the reverse transcriptase incorporates noncomplementary nucleotides at the sites of SHAPE chemical adducts. The latter is possible due to the presence of Mn2+ during the RT reaction, as Mn2+ (among other divalent cations) has been shown to decrease the fidelity of DNA synthesis [18]. This increased error frequency in the presence of Mn2+ has been observed in multiple reverse transcriptases (e.g., [19,20]) and is the main principle of the mutational profiling approaches. Moreover, several new-generation reverse transcriptases with increased robustness and thermostability allow full-length cDNAs of highly structured RNAs to be obtained, and they may incorporate the modifications into the cDNA as random mutations. The most prominent example of such an enzyme is a thermostable group II intron RT (TGIRT-III), whose ability to read through methyl adducts in DMS-MaPseq has been shown recently in a breakthrough publication in the Nature Journal [21].
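The mutational profiling readout described above can be illustrated with a toy calculation. The following Python sketch is only an illustration of the principle, not the pipeline used in this study: it assumes gap-free alignments, and the function name, reference sequence, and reads are hypothetical. It counts, for each reference position, the fraction of covering reads that carry a noncomplementary base.

```python
from collections import Counter


def mutation_rates(reference, reads):
    """Per-position mismatch fraction for reads aligned gap-free to `reference`.

    `reads` is a list of (start, sequence) pairs, where `start` is the
    0-based reference position of the first read base (hypothetical input
    format chosen for this illustration).
    """
    depth = Counter()
    mismatches = Counter()
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos >= len(reference):
                break  # ignore read bases hanging past the reference end
            depth[pos] += 1
            if base != reference[pos]:
                mismatches[pos] += 1
    # Positions without coverage get None instead of a rate
    return [mismatches[p] / depth[p] if depth[p] else None
            for p in range(len(reference))]


# Toy data: position 2 (reference base A) is read as T in two of four reads
reference = "GGATCCA"
reads = [(0, "GGATCCA"), (0, "GGTTCCA"), (2, "ATCCA"), (2, "TTCC")]
rates = mutation_rates(reference, reads)
print(rates)
```

In a real experiment, the same quantity is computed for both the reagent-treated and the control sample, so that mutations caused by adducts can be distinguished from the background error of the reverse transcriptase.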
Such sequencing-based techniques are complex because they involve many steps. Surprisingly, very little work has been done to evaluate how the individual steps of RNA chemical probing combined with sequencing affect the quality of the resulting structural signal. To address the lack of direct comparisons between current protocols, we report a simple but crucial comparison of different mRNA isolation methods, chemical reagents for structure-dependent RNA modification, and enzymes used for cDNA synthesis. We believe that the results of our analyses are likely to have a far-reaching impact on how SHAPE experiments are conducted in many laboratories.
3. Discussion
After its publication in 2014 [
16], the SHAPE-MaP protocol became the most popular tool for experimental studies on RNA secondary structure. In this study, we decided to validate in detail the influence of the SHAPE reagent, buffer conditions, and the type of reverse transcriptase on the efficiency of the protocol. During the development of the original SHAPE-MaP protocol, manganese divalent cations were found to most efficiently promote SSII readthrough and mutational detection of 1M7, 1M6, and NMIA chemical adducts [
16]. The results of our study confirmed that the use of Mn2+-containing MaP buffer is a major factor providing a significant increase in the structural signal.
We observed that all reverse transcriptases suitable for the SHAPE-MaP protocol (SSII, SSIV, and TGIRT-III) could introduce mutations at the positions modified by both tested reagents (1M7 and BzCN); however, SSIV with a MaP buffer achieved the highest signal-to-noise ratio. One explanation could be the higher processivity and reaction speed of SuperScript IV RT compared with its predecessor, SSII. Our observation is also consistent with the generally high activity of the Moloney Murine Leukemia Virus Reverse Transcriptase (M-MuLV RT) in the presence of Mn2+ and the ability of manganese cations to promote the mutagenic behavior of DNA polymerases [22], since the SS RT enzymes are M-MuLV RT mutants.
However, it was surprising to observe that the length of the obtained cDNA was negatively affected when RT was performed in a buffer containing Mn2+ (Figure 3 and Figure 4). Recent research on the mutational detection of natural RNA modifications evaluated the performance of different reverse transcriptases in detecting Saccharomyces cerevisiae tRNA modifications in the presence of manganese cations [18]. The authors used SuperScript III, SuperScript IV, ProtoScript II, and EpiScript RT. They generally observed that Mn2+ increased the modification-induced mutation rates and nucleotide skipping, accompanied by increased readthrough as represented by longer cDNA products. The discrepancy with our results may be related to the different nature of the detected modifications: SHAPE reagents introduce adducts on the 2′-OH group of the ribose, whereas the tRNA modifications analyzed by Kristen et al. were mostly nucleobase adducts. Furthermore, we also observed differences in cDNA length between 1M7 and BzCN adduct-containing products. Therefore, the role of Mn2+ in the modulation of mutational adduct detection and readthrough depends on the localization and the type of the adduct.
1M7 was initially identified as a general-purpose SHAPE reagent because of its short reaction half-life of 14 s and its ability to probe the flexibility of all four ribonucleotides with similar reactivity [23]. BzCN, on the other hand, is particularly well suited for time-resolved analysis of RNA structures due to its extremely short but still manageable half-life of 0.25 s at 37 °C [10]. To our knowledge, a direct comparison of the reactivity of these two SHAPE reagents has never been performed for RNA pools. Only one study has addressed this point, testing the reactivity of 1M7 and BzCN on one isolated RNA molecule, RNase P RNA [10]. In that study, the first to report the utility of BzCN in SHAPE, Mortimer and Weeks found a robust correlation between 1M7 and BzCN reactivity on RNase P RNA. We therefore examined the adduct detection rates of 1M7- and BzCN-treated human mRNAs. BzCN showed a pattern of MaP mutation rates markedly different from that of 1M7. In general, 1M7 generated more robust mutation profiles and produced more structural information than BzCN. Overall, these results demonstrate that 1M7 is a robust human mRNA modifier in vitro, whereas BzCN performs less efficiently. The differences between the reagents observed in our study might be due to the type of RNA we used rather than the type of mRNA modification protocol, since BzCN has previously been shown to be a potent in vitro modification reagent [10,24].
Altogether, our results reveal that the latest member of the SuperScript RT family, SSIV, in combination with Mn2+-containing buffer, outperforms SSII in the mutational profiling of human mRNAs. However, the high sensitivity of mutational detection is coupled with an Mn2+-induced decrease in readthrough of SHAPE adducts, revealing a novel role of Mn2+ in the modulation of reverse transcriptase activity.
4. Materials and Methods
4.1. Cell Line
HEK293 cells were obtained from ATCC and maintained in DMEM (Biowest, Nuaillé, France) supplemented with 10% FBS, 10,000 units/mL penicillin G, 10 mg/mL streptomycin sulfate, and 25 μg/mL amphotericin B.
4.2. RNA Isolation
Total RNA was isolated using TRI Reagent (Mercator Medical S.A., Kraków, Poland), following the manufacturer’s instructions, and stored at −80 °C. The concentration of the total RNA was measured with a Qubit fluorometer using the RNA HS assay. The quality of the total cellular RNA was verified using an Agilent Bioanalyzer 2100 with an RNA 6000 Nano kit (Total RNA assay, Agilent Technologies, Inc., Santa Clara, CA, USA). The mRNA was isolated from the total RNA with Magnetight oligo (dT) particles (Merck, Darmstadt, Germany), Dynabeads Oligo (dT)25 (Thermo Fisher Scientific Inc., Waltham, MA, USA), or the Poly(A)Purist MAG kit (Thermo Fisher Scientific Inc., Waltham, MA, USA), following the manufacturer’s instructions, and stored at −80 °C. The concentration, integrity, and purity of the mRNA samples were assessed using an Agilent Bioanalyzer 2100 with an RNA 6000 Nano kit (mRNA assay, Agilent Technologies, Inc., Santa Clara, CA, USA).
4.3. Chemical Modification of RNA
A total of 1 µg of purified mRNA was folded by denaturation at 95 °C and quick cooling to 4 °C, followed by a 20 min incubation at 37 °C in 100 mM HEPES pH 8.0, 10 mM MgCl2, and 100 mM NaCl. Samples were then modified for 5 hydrolysis half-lives of the SHAPE reagents: 5 min at 37 °C with 10 mM 1-methyl-7-nitroisatoic anhydride (1M7), 15 s at 25 °C with 40 mM or 80 mM benzoyl cyanide (BzCN), and 5 s at 37 °C with 40 mM or 80 mM BzCN. A control sample containing DMSO instead of a SHAPE reagent was included. Modified RNA was then cleaned up and eluted in RNase-free water using RNA Clean and Concentrator-5 (Zymo Research, Irvine, CA, USA) and stored at −20 °C. The concentration, integrity, and purity of the modified mRNA samples were assessed using an Agilent Bioanalyzer 2100 with an RNA 6000 Nano kit (mRNA assay, Agilent Technologies, Inc., Santa Clara, CA, USA).
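The relation between incubation time and reagent hydrolysis can be checked with simple arithmetic. The Python sketch below assumes first-order (exponential) hydrolysis and uses the half-lives quoted in the Discussion (14 s for 1M7, with no temperature stated there, and 0.25 s for BzCN at 37 °C). The function names are illustrative, and the 25 °C BzCN condition is omitted because no half-life at that temperature is given in the text.

```python
def half_lives_elapsed(incubation_s, half_life_s):
    """Number of hydrolysis half-lives covered by an incubation."""
    return incubation_s / half_life_s


def fraction_remaining(incubation_s, half_life_s):
    """Fraction of unhydrolyzed reagent left, assuming first-order decay."""
    return 0.5 ** half_lives_elapsed(incubation_s, half_life_s)


# (incubation in seconds, half-life in seconds); half-lives as quoted in the Discussion
conditions = {
    "1M7, 5 min at 37 °C": (5 * 60, 14.0),
    "BzCN, 5 s at 37 °C": (5.0, 0.25),
}

for label, (incubation_s, half_life_s) in conditions.items():
    n = half_lives_elapsed(incubation_s, half_life_s)
    left = fraction_remaining(incubation_s, half_life_s)
    print(f"{label}: {n:.1f} half-lives, {left:.1e} of reagent remaining")
```

After exactly five half-lives, 0.5^5 (about 3%) of the reagent remains. Under the stated assumptions, both incubations listed here cover about 20 half-lives, so the reagent can be considered fully consumed by the end of the incubation.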
4.4. Reverse Transcription
A total of 200 ng of the modified RNA was subjected to reverse transcription with random 9-mer primers (New England Biolabs, Ipswich, MA, USA). Primers were annealed by incubation at 65 °C for 5 min. We used the following protocols: SuperScript II Reverse Transcriptase (SSII, Thermo Fisher Scientific Inc., Waltham, MA, USA) with Mn2+ buffer (SSII-Mn2+); SuperScript IV Reverse Transcriptase (SSIV, Thermo Fisher Scientific Inc., Waltham, MA, USA) with Mn2+ buffer (SSIV-Mn2+) or with a standard SSIV buffer (SSIV); and TGIRT-III Enzyme (InGex, St Louis, MO, USA) with a standard TGIRT buffer (TGIRT, InGex, St Louis, MO, USA). The conditions of reverse transcription were as follows.
SSII-Mn2+: A total of 8 μL freshly prepared Mn2+-containing MaP buffer (50 mM Tris pH 8.0, 75 mM KCl, 6 mM MnCl2, 10 mM DTT, and 0.5 mM dNTPs) was added to the RNA/primer mixture and incubated at 25 °C for 2 min, followed by the addition of 200 U SuperScript II. The reaction mixtures (total volume of 20 μL) were incubated at 25 °C for 10 min, 42 °C for 3 h, and 70 °C for 15 min. Samples were held on ice.
SSIV-Mn2+: A total of 8 μL freshly prepared Mn2+-containing MaP buffer (50 mM Tris pH 8.0, 75 mM KCl, 6 mM MnCl2, 10 mM DTT, and 0.5 mM dNTPs) was added to the RNA/primer mixture and incubated at 25 °C for 2 min, followed by the addition of 200 U SuperScript IV. The reaction mixtures (total volume of 20 μL) were incubated at 25 °C for 10 min, 42 °C for 3 h, and 70 °C for 15 min. Samples were held on ice.
SSIV: A total of 4 μL buffer (50 mM Tris-HCl pH 8.3, 75 mM KCl, and 3 mM MgCl2), 1 μL 100 mM DTT, 1 μL 10 mM dNTP, and 200 U SuperScript IV were added to the RNA/primer mixture. The complete reaction mixtures (total volume of 20 μL) were incubated at 25 °C for 10 min, 55 °C for 10 min, and 70 °C for 15 min. Samples were held on ice.
TGIRT: A total of 8 μL buffer (50 mM Tris-HCl pH 8.3, 75 mM KCl, and 3 mM MgCl2) was added to the RNA/primer mixture and incubated at 25 °C for 5 min, followed by the addition of 1 μL 100 mM DTT, 2 μL 10 mM dNTP mix and 200 U TGIRT-III Enzyme. The reaction mixtures (total volume of 20 μL) were incubated at 57 °C for 1.5 h and 70 °C for 15 min. Samples were held on ice.
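For the Mn2+-containing MaP buffer, the text does not state whether the listed concentrations refer to the buffer stock or to the final 20 μL reaction. The Python sketch below assumes they are final concentrations and derives the stock concentrations that 8 μL of buffer would then need to contain; the function name is illustrative.

```python
def stock_concentrations(final_mM, buffer_volume_uL, reaction_volume_uL):
    """Stock concentrations (mM) needed to reach `final_mM` after dilution.

    Applies C_stock * V_buffer = C_final * V_reaction to each component.
    """
    factor = reaction_volume_uL / buffer_volume_uL
    return {name: conc * factor for name, conc in final_mM.items()}


# Assumption: the concentrations listed in the text are final concentrations
final_mM = {"Tris pH 8.0": 50, "KCl": 75, "MnCl2": 6, "DTT": 10, "dNTPs": 0.5}

stock = stock_concentrations(final_mM, buffer_volume_uL=8, reaction_volume_uL=20)
for name, conc in stock.items():
    print(f"{name}: {conc} mM in the stock (2.5x)")
```

Under this assumption, 8 μL in a 20 μL reaction corresponds to a 2.5-fold concentrated stock. If the listed values instead describe the stock itself, the final concentrations would be 2.5-fold lower than listed.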
4.5. Second-Strand Synthesis
Reverse transcription reactions were purified using Microspin™ G-50 Columns (GE Healthcare, Chicago, IL, USA). Second-strand synthesis was performed using the NEBNext Ultra II Non-Directional RNA Second Strand Synthesis Module (New England Biolabs, Ipswich, MA, USA) with incubation at 16 °C for 2.5 h. The dsDNA from the second-strand synthesis reaction was purified using PureLink PCR micro spin columns (Thermo Fisher Scientific Inc., Waltham, MA, USA), according to the manufacturer’s instructions. The size and concentration of dsDNA were assessed using an Agilent Bioanalyzer 2100 with an HS DNA kit (Agilent Technologies, Inc., Santa Clara, CA, USA).
4.6. Library Preparation and Sequencing
Approximately 1–5 ng of purified dsDNA was fragmented and tagged with adapter sequences in a single step using the Nextera XT DNA Library Preparation Kit (Illumina Inc., San Diego, CA, USA) at 55 °C for 5 min, followed by incubation at 10 °C and sample neutralization, according to the manufacturer’s instructions. The index adapters and Nextera PCR Master Mix were added directly to the 25 μL of tagmented dsDNA, and the tagmented libraries were amplified with 14 cycles of PCR. This PCR adds the Index 1 (i7), Index 2 (i5), and full adapter sequences to the tagmented DNA. PCR reactions were cleaned up with Agencourt Ampure XP (Beckman Coulter, Brea, CA, USA) beads following the manufacturer’s protocol, eluting with 15 μL RNase-free water. No direct size selection was performed on the resulting adapter-ligated library. Libraries were assayed for quality using an Agilent Bioanalyzer 2100 HS DNA chip. Libraries were then sequenced on an Illumina MiSeq platform following the manufacturer’s standard cluster generation and sequencing protocols.
4.7. SHAPE-Seq Data Analysis
Fastq files generated from the Illumina sequencing were mapped to the human transcriptome GRCh38.p12 using bowtie2 [
25]. The average transcript coverage was analyzed with Samtools [
26]. Transcripts with an average coverage of ≥50.0 were considered further. The mutation rates were calculated using the ShapeMapper 2.1.3 software (https://weekslab.com/software/) [8]. As ShapeMapper operates simultaneously on the control and the treated sample, for each pair of control and treated samples, only transcripts that met the coverage criterion in both samples were processed. The mutation rate statistics were limited to bases with an effective coverage ≥100. The modification factors were calculated as the fraction of nucleotides with an effective coverage ≥100 that showed a mutation rate >0.001. A matrix of sample distances based on the mutation rate distributions was calculated as previously published [27].
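The coverage filters and the modification factor defined above can be expressed compactly. The following Python sketch is a minimal illustration under stated assumptions, not the code used in this study: the function names and the toy values are hypothetical, and the per-nucleotide effective depths and mutation rates are assumed to have already been extracted from the ShapeMapper output.

```python
def passes_coverage(mean_cov_control, mean_cov_treated, min_mean_coverage=50.0):
    """Keep a transcript only if both samples reach the mean coverage threshold."""
    return (mean_cov_control >= min_mean_coverage
            and mean_cov_treated >= min_mean_coverage)


def modification_factor(effective_depths, mutation_rates,
                        min_depth=100, rate_threshold=0.001):
    """Fraction of sufficiently covered nucleotides with a rate above threshold.

    Returns None when no nucleotide reaches `min_depth`.
    """
    eligible = [rate for depth, rate in zip(effective_depths, mutation_rates)
                if depth >= min_depth]
    if not eligible:
        return None
    modified = sum(1 for rate in eligible if rate > rate_threshold)
    return modified / len(eligible)


# Toy values (hypothetical): the second nucleotide is excluded by its depth of 80
depths = [250, 80, 120, 400, 100]
rates = [0.004, 0.02, 0.0005, 0.001, 0.003]

print(passes_coverage(63.2, 48.9))          # False: treated sample below 50.0
print(modification_factor(depths, rates))   # 0.5: two of four eligible nucleotides
```

Note that the comparison with the rate threshold is strict, so a nucleotide with a mutation rate of exactly 0.001 is not counted as modified.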