2.1. Detection of 21 Plant Viruses and One Plant Viroid by NGS in Apple Orchards Affected by RAD
Initial apple orchard surveys in the Okanagan Valley in 2018 revealed tree mortality of up to 40% in extreme cases, as estimated by visual inspection of selected orchards. One orchard had so few trees remaining that it was completely removed after sampling. Responses to a questionnaire sent to growers indicated that rapid decline symptoms occurred throughout the Okanagan and Similkameen Valleys, from as far south as Osoyoos, BC to as far north as Vernon, BC, suggesting that RAD is a widespread issue affecting most of the apple-growing regions of BC. To obtain a snapshot of the plant viruses present in orchards affected by RAD in the Okanagan and Similkameen Valleys of British Columbia (Canada), we collected 148 individual leaf samples from nine experimental and commercial conventionally managed orchards in Summerland in the South Okanagan Valley and from one commercial organic orchard in Cawston in the Similkameen Valley. The samples represented different rootstock/scion cultivar combinations and were collected from diseased trees as well as asymptomatic trees (
Table 1, see Material and Methods for more details). Next, we produced ten composite leaf samples for NGS. Each composite sample was a mixture of three selected individual leaf samples (
Table 2). This strategy allowed us to limit the number of samples sent for NGS sequencing, while ensuring that the selected samples represented most cultivars, orchard locations, sampling times, and types of symptoms.
A total of 71–103 million reads were obtained for each composite sample. Mapping to the reference genome of
Malus domestica (GDDH13v1.1) to eliminate host sequences resulted in 5.5–9.8 million unmapped reads.
De novo assembly from these unmapped reads generated 2937 to 10,535 contigs with length varying from 250 to 11,148 nucleotides (nts). Blast search and remapping of these contigs against a database of plant virus sequences revealed the presence of 21 viruses and one viroid (AHVd) in the ten NGS samples (
Table 3 and
Table 4). Twenty of these viruses and the viroid were previously described. In addition, an ilarvirus sequence was readily detected that was related to previously described plant ilarviruses, but clearly distinct. Other viruses were also detected that showed sequence identities to known fungal or insect viruses but are not reported here.
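The contig-screening step described above can be illustrated with a minimal Python sketch. It assumes that the BLAST results were exported in the standard 12-column tabular format (`-outfmt 6`) and that the best hit per contig is chosen by bit score; the identity threshold, the function name `best_hits`, and the toy hit lines are hypothetical and are not taken from the study.

```python
import csv
import io

# Default column order of BLAST tabular output (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(tabular_text, min_identity=0.0):
    """Return {contig_id: hit} keeping the highest-bit-score hit per contig.

    Hits below min_identity (percent) are ignored. The threshold is an
    illustrative assumption, not a value reported in the text.
    """
    best = {}
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for fields in reader:
        # Skip blank lines and comment lines.
        if not fields or fields[0].startswith("#"):
            continue
        row = dict(zip(BLAST_COLUMNS, fields))
        if float(row["pident"]) < min_identity:
            continue
        current = best.get(row["qseqid"])
        if current is None or float(row["bitscore"]) > float(current["bitscore"]):
            best[row["qseqid"]] = row
    return best


# Hypothetical toy hits (contig and virus names are placeholders).
EXAMPLE = "\n".join([
    "contig_1\tvirus_A\t98.5\t900\t13\t0\t1\t900\t1\t900\t0.0\t1600",
    "contig_1\tvirus_B\t82.0\t850\t150\t3\t1\t850\t10\t860\t1e-150\t700",
    "contig_2\tvirus_C\t71.0\t400\t116\t0\t1\t400\t1\t400\t1e-40\t180",
])

hits = best_hits(EXAMPLE, min_identity=80.0)
for contig, row in hits.items():
    print(contig, row["sseqid"], row["pident"])
```

With the 80% threshold, only `contig_1` is retained and it is assigned to its higher-scoring hit. A divergent sequence such as the novel ilarvirus reported here would need a lower threshold or a protein-level search to be picked up.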
The viroid AHVd was detected in seven composite samples with 100% genome sequence coverage. The highest number of reads mapping to AHVd was 180,962 in sample HX8 (
Table 4 and
Table S1), which was collected from three trees that did not show typical RAD symptoms (
Table 2).
The ubiquitous ASPV, ACLSV, and ASGV from the family
Betaflexiviridae were readily detected in our composite samples (
Table 4 and
Table S1). ASPV was detected in eight composite samples, with genome coverage ranging from 6.3% to 99%. Variability in ASPV sequences was very high, with nt sequence identity to the reference isolate ranging from 80% to 99%, suggesting the presence of multiple divergent variants in each composite sample. ACLSV and ASGV were found in six and three composite samples, respectively. For these two viruses, genome coverage ranged from 86% to 96% and sequence identity to the reference isolates ranged from 82% to 99%, again suggesting high sequence diversity in the BC isolates (
Table S1). Because multiple variants of ASPV, ACLSV, and ASGV were detected in each composite sample, assembly of near-complete genome sequences of individual variants proved difficult and was not investigated further. Cherry virus A (CVA), another member of the family
Betaflexiviridae, was detected at low incidence in only one composite sample (187 reads mapped to CVA in sample HX1,
Table 4 and
Table S1).
The luteovirus ALV1 was detected in five composite samples, with read numbers ranging from 1978 for sample HX4 to 39,292 for HX1. Nucleotide sequence identity to the reference ALV1 genome of the Pennsylvania isolate PA8 (NC_040680.1) ranged from 94% to 99% (
Table 4 and
Table S1), and genome coverage was above 100%, due to the longer genomic RNAs of BC isolates (see below). We selected three composite NGS samples, each of which was later found to include only one ALV1-positive tree, and assembled near full-length genome contigs. The selected samples were HX4 (including ALV1-positive individual sample BC52), HX7 (including ALV1-positive individual sample BC85), and HX10 (including ALV1-positive individual sample BC134). Sequences have been deposited in the NCBI database with accession numbers OP271661, OP271662, and OP271663 for contigs corresponding to individual samples BC52, BC85, and BC134, respectively (annotated sequences are also available as Text S1 to S3). Interestingly, the length of the assembled contigs ranged from 6362 to 6390 nts, whereas the complete genomic RNA length for the reference PA8 isolate was reported to be 6001 nts [
4]. An extension of at least 420 nts at the 3′ end of the ALV1 RNA 3′ untranslated region (UTR) was noted in all sequenced samples (
Figure 1A). Presence of this 3′ UTR extension in the BC isolates was confirmed by RT-PCR (
Figure 1B). The general genomic RNA organization of ALV1 BC isolates was similar to that described for the PA8 isolate with a few minor differences (
Figure 1A). All major ORFs were highly conserved between the BC and PA8 isolates including ORF0, which was already noted to have a sequence unique to ALV1 with no apparent sequence homologies with the corresponding ORF0 of other luteoviruses [
4]. Two small in-frame deletions were noted in the ORF5 of the BC isolates when compared to the PA8 isolate (
Figure 1A,
Figure S1). This region of ORF5 overlaps with small ORF5a and similar small deletions were also noted in ORF5a of the BC isolates (
Figure 1A,
Figure S1). Small ORFs 6 and 7, which were noted previously in the PA8 isolate, were also detected in most BC isolates, although ORF6 was absent from the BC134 contig (
Figure 1A). Finally, an additional small ORF was detected in the 3′ UTR extension of the BC isolates (ORF8,
Figure 1A). Although ORF8 was slightly larger in the BC52 contig due to a mutation of an in-frame stop codon upstream of the ORF, the C-terminal sequence of ORF8 was highly conserved among the BC isolates (
Figure S1). It is not known whether small ORFs 5a, 6, 7, or 8 are expressed in infected plants.
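The genome coverage above 100% reported for ALV1 can be illustrated with a short calculation. The sketch below assumes that coverage is expressed as assembled contig length relative to the reference genome length; this definition is an assumption made for illustration and may differ from the exact calculation used for Table S1. The lengths are those given in the text (contigs of 6362 to 6390 nts, PA8 reference of 6001 nts), and the function name `coverage_percent` is hypothetical.

```python
# Reference length of the PA8 isolate as reported in the text.
REFERENCE_LENGTH_NT = 6001


def coverage_percent(contig_length_nt, reference_length_nt=REFERENCE_LENGTH_NT):
    """Contig length as a percentage of the reference length.

    Assumed definition for illustration only: values above 100% mean the
    contig is longer than the reference genome.
    """
    if reference_length_nt <= 0:
        raise ValueError("reference length must be positive")
    return 100.0 * contig_length_nt / reference_length_nt


# Shortest and longest BC contig lengths given in the text.
for length_nt in (6362, 6390):
    print(f"{length_nt} nt -> {coverage_percent(length_nt):.1f}% of the reference")
```

Under this assumption, the BC contigs correspond to roughly 106% of the PA8 reference length.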
CCGaV, ARWaV-1, and ARWaV-2 from the family
Phenuiviridae were readily detected in three to four composite samples each with nt sequence identities of 99.3–99.9%, 93–99%, and 95–99.8% to the reference isolates, respectively (
Table 4 and
Table S1).
Numerous ilarviruses have been reported to infect fruit trees, including apple trees [
14]. Six known ilarviruses were detected in low abundance and incidence (
Table 4 and
Table S1). ApMV, tobacco streak virus (TSV), and prunus necrotic ringspot virus (PNRSV) were each found in only one composite sample. The partially sequenced AIV1, first identified in WA [
1], was found in two composite samples with only low read numbers. Solanum nigrum ilarvirus (SNIV) was detected in six composite samples, but also showed a low number of reads.
Several contigs from each of the ten composite samples showed around 90% nt sequence identity to the partially sequenced blacklegged tick-associated ilarvirus (BLTaIV), a poorly described virus discovered in an NGS study of environmental tick samples [
15]. The contigs were also related to known plant ilarviruses, although with only 52–71% nt sequence identity with the RNA1, RNA2, and RNA3 of the closest relative, citrus variegation virus (CVV). Further characterization of this virus, which we named apple ilarvirus 2 (AIV2), is presented below.
Only low read numbers were noted for other plant viruses including clover yellow mosaic virus and white clover mosaic virus (genus
Potexvirus), detected in five and four composite samples, respectively; turnip vein clearing virus (genus
Tobamovirus) and turnip mosaic virus (genus
Potyvirus) each detected in three composite samples; as well as turnip crinkle virus (genus
Betacarmovirus) and sowbane mosaic virus (genus
Sobemovirus) each detected in only one composite sample (
Table 4 and
Table S1).
2.2. Virus Prevalence in Apple Orchards Affected by RAD
RT-PCR was conducted on the 148 individual leaf samples (each corresponding to a single tree) to determine the field prevalence of the twelve viruses and one viroid for which there were more than 200 reads in the composite NGS samples (i.e., AHVd, ACLSV, ASGV, ASPV, AIV2, ApMV, SNIV, TSV, PNRSV, ALV1, ARWaV-1, ARWaV-2, and CCGaV, see
Table S2 for primer sequences). Of the 148 individual samples tested, 147 trees were found to be infected with at least one of the tested viruses (
Table 5). The majority of the trees were infected by more than two of the tested viruses (81.7% of the individual samples), with up to eight viruses and/or the viroid detected in a single tree (
Table 5). Thus, mixed infections were very common in the orchards.
Virus detection rates ranged from 63.5% of the individual samples for AIV2 to 0.7% for PNRSV (
Table 6). The most prevalent viruses were AIV2 (63.5%), CCGaV (61.2%), ASPV (54.7%), ACLSV (50%), and ASGV (41.9%). The last three viruses are known as latent viruses in apples and are widely detected worldwide. Thus, it was not surprising to find them at high prevalence in the collected samples. CCGaV was initially associated with concave gum disease in citrus but was recently detected at high prevalence in apple trees in Washington State, USA [
1,
5]. In contrast, AIV2, which was detected with the highest overall field prevalence, has not been previously reported in association with apple trees. Other ilarviruses were only detected at low incidence. TSV, SNIV, and ApMV were detected in 9.5%, 7.4%, and 4.1% of the individual samples, respectively, while PNRSV was only found in one sample. The viroid AHVd was detected in 23% of the samples. Three viruses newly identified in apple orchards elsewhere, ALV1 (first detected in PA) as well as ARWaV-1 and ARWaV-2 (both first reported from WA), were readily detected in orchards from the Okanagan and Similkameen Valleys of British Columbia. Indeed, these three viruses were found in 18.2%, 8.8%, and 15.5% of the samples, respectively (
Table 6).
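The detection rates above are simple proportions of RT-PCR-positive trees among the 148 trees tested. As a minimal sketch, the following Python snippet reproduces the two rates for which the text gives explicit counts (one PNRSV-positive tree, and 147 trees infected with at least one tested virus or viroid). The function name `prevalence_percent` is hypothetical; counts for the other viruses are not stated in the text and are deliberately not guessed here.

```python
# Number of individual trees tested by RT-PCR, as stated in the text.
TOTAL_TREES = 148


def prevalence_percent(positive, total=TOTAL_TREES):
    """Percentage of tested trees that were positive."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= positive <= total:
        raise ValueError("positive count must lie between 0 and total")
    return 100.0 * positive / total


# Counts taken directly from the text.
pnrsv = prevalence_percent(1)           # PNRSV found in a single tree
any_infection = prevalence_percent(147) # trees with at least one detection

print(f"PNRSV: {pnrsv:.1f}%")
print(f"At least one virus or viroid: {any_infection:.1f}%")
```

The first value rounds to the 0.7% reported for PNRSV; the second shows that nearly all sampled trees carried at least one of the tested agents.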
We compared virus incidence in samples collected from trees with RAD symptoms to that in samples collected from non-symptomatic trees (
Table 6). There was no clear association of a single virus with the presence of typical RAD symptoms, although the incidence of ALV1, ASGV, ARWaV-1, SNIV, ARWaV-2, and ApMV was slightly higher (1.2–4.8%) in samples collected from diseased trees. In contrast, the incidence of CCGaV, ASPV, TSV, AHVd, ACLSV, and AIV2 was slightly lower (1.9–13.1%) in these samples. As stated above, mixed infections were common. However, we could not find a clear association between a specific combination of viruses (or viroid) and the presence of RAD symptoms.
2.3. Complete Genome Sequence and Phylogenetic Analysis of Apple Ilarvirus 2
AIV2, which was identified in our NGS dataset, was the most prevalent virus in our individual samples. Thus, we decided to determine the complete genomic sequence of the virus and examine its phylogenetic relationship with other ilarviruses.
The complete sequence of the three AIV2 RNAs was obtained by compilation of the NGS datasets, 5′ and 3′ RACE as well as Sanger sequencing of at least three individual RT-PCR clones for each RNA (see Material and Methods). The sequence was deposited in the NCBI database under accession numbers ON932434 to ON932436 (annotated sequence is also available as Text S4). The genomic organization of the RNAs is similar to that of subgroups 1 and 2 ilarviruses [
16] (
Figure 2A). RNA1 contains a single ORF that encodes protein 1a (a putative replicase) of 1064 amino acids (aa) (120.44 kDa). Typical methyltransferase (aa 59–405) and helicase (aa 781–1032) domains were identified in this protein. RNA2 has two ORFs overlapping by 368 nts, coding for protein 2a of 816 aa (92.95 kDa), which has canonical RNA-dependent RNA polymerase motifs (aa 234–649), and protein 2b of 190 aa (21.73 kDa). Protein 2b is only found in ilarviruses of subgroups 1 and 2 and may function in viral movement and as a suppressor of RNA silencing [
16,
17]. RNA3 contains two non-overlapping ORFs, coding for a movement protein of 295 aa (33.12 kDa) and a coat protein of 219 aa (23.98 kDa). The first three nts in the 5′ UTRs were identical for all three RNAs and the first 67 nts in the 5′ UTR were identical for RNA1 and RNA2 (
Figure 2B). The last 191 nts of the 3′ end (including the 3′ UTR) were almost identical (with only three mismatches) for all three RNAs (
Figure 2C).
To investigate the relationships between AIV2 and other ilarviruses, phylogenetic trees were generated based on alignments of the deduced amino acid sequence of proteins 1a, 2a, and CP. AIV2 clustered with subgroup 2 ilarviruses, but was clearly distinct from other plant ilarviruses (
Figure 3). Nucleotide sequence identities with subgroup 2 ilarviruses were between 50 and 71% for RNA1, 49 and 67% for RNA2, and 42 and 57% for RNA3. As mentioned above, AIV2 is closely related to the partially sequenced and poorly characterized ilarvirus BLTaIV, which was found in an NGS dataset from blacklegged ticks collected on the East Coast of the USA [
15] (see CP phylogenetic tree,
Figure 3).
So far, all well-characterized ilarviruses have been reported from plants [
16,
18] and it is possible that the reported tick-associated ilarvirus was detected because of cross-contamination of the environmental tick samples with plant material, especially considering that it was only detected at low incidence in the tick NGS dataset [
15]. Although this seemed unlikely given the high prevalence of AIV2 in our samples, we also considered the possibility that detection of AIV2 could have been the result of cross-contamination of our leaf samples with ticks or similar arthropods (such as mites, which can be present in high numbers in apple orchards under conducive environmental conditions). We collected leaf samples from two individual trees that tested positive in our initial tests (trees BC1 and BC2). Samples were collected at different times of the growing season for three consecutive years. AIV2 could be detected by RT-PCR from the two trees each year, although it was most readily detected in June and July (
Figure 4). AIV2 was also detected in phloem tissues (stems peeled from the outer and inner bark, see Material and Methods) in a fourth consecutive year (
Figure 4). These phloem-enriched samples are unlikely to be contaminated with arthropods. We conclude that AIV2 is most likely a plant virus.