Article

Expansion of Kuravirus-like Phage Sequences within the Past Decade, including Escherichia Phage YF01 from Japan, Prompt the Creation of Three New Genera

Division of Materials Science and Chemical Engineering, Yokohama National University, Yokohama 240-8501, Kanagawa, Japan
*
Author to whom correspondence should be addressed.
Viruses 2023, 15(2), 506; https://doi.org/10.3390/v15020506
Submission received: 21 December 2022 / Revised: 3 February 2023 / Accepted: 9 February 2023 / Published: 11 February 2023
(This article belongs to the Section Bacterial Viruses)

Abstract

Bacteriophages, viruses that infect bacteria, are currently receiving significant attention amid an ever-growing global antibiotic resistance crisis. In tandem, a surge in the availability and affordability of next-generation and third-generation sequencing technologies has driven the deposition of a wealth of phage sequence data. Here, we have isolated a novel Escherichia phage, YF01, from a municipal wastewater treatment plant in Yokohama, Japan. We demonstrate that the YF01 phage shares a high similarity to a collection of thirty-five Escherichia and Shigella phages found in public databases, six of which have been previously classified into the Kuravirus genus by the International Committee on Taxonomy of Viruses (ICTV). Using modern phylogenetic approaches, we demonstrate that an expansion and reshaping of the current six-membered Kuravirus genus is required to accommodate all thirty-six member phages. Ultimately, we propose the creation of three additional genera, Vellorevirus, Jinjuvirus, and Yesanvirus, which will allow a more organized approach to the addition of future Kuravirus-like phages.

1. Introduction

Escherichia coli is a versatile and genetically diverse species commonly inhabiting the intestinal tracts of humans and animals as both a commensal and pathogenic microbe [1]. Pathogenic E. coli infecting humans are routinely released into the environment via sewage systems, which lead to wastewater treatment plants (WWTPs) [2]. Given the rising global public health concerns surrounding antibiotic resistance in pathogenic microbes, such as E. coli, alternative methods to control these pathogens are highly sought after. Bacteriophages (or phages) have received high attention as a potential therapeutic alternative to antibiotics [3,4,5].
Phages are viruses that multiply by infecting bacteria, often very specifically at the species or strain level. They play a key role both in microbial ecology and microbial evolution, having the ability to alter the population dynamics within microbial communities and modify bacterial genomes through horizontal gene transfer [5,6,7]. In biological WWTPs, phage concentrations are estimated to be approximately 10⁸–10⁹ particles per milliliter [8,9,10,11], which is higher than in many other ecosystems studied to date [6,10]. Studies have shown that some phages can control the growth of E. coli in biological WWTPs [12]. The use of multiple phages (a phage cocktail) can more effectively control phage resistance and the recovery of phage-resistant bacteria after treatment [13].
The advent of next-generation sequencing technologies, such as those offered by Illumina, has led to a preponderance of genomic sequencing. Newer “third-generation sequencing” technologies, such as those offered by Oxford Nanopore, facilitate the acquisition of long-sequence reads (up to Mbs), allowing for the generation of complete bacterial and other higher organism genomes. While commonly used in combination with highly accurate short-reads, more recent revisions of the technology, known as Q20, include qualitative improvements that facilitate the generation of near-perfect bacterial assemblies, and perfect viral assemblies, without the need for short-read polishing [14,15].
As of December 2022, there are currently ~1600 Escherichia phage assemblies listed on NCBI Genbank. The International Committee on Taxonomy of Viruses (ICTV) currently recognizes Escherichia phages belonging to at least 11 different families and 100 different genera [16]. Given the wealth of sequencing currently available, the genomic diversity of phages and the way we assign their taxonomy is constantly shifting and evolving. For example, the Phieco32-like virus genus was created in 2009, in response to the isolation and characterization of the Escherichia phage phiEco32, a novel 77 kb phage with a rare C3 podovirus morphotype [17]. The genus was subsequently renamed twice within the next few years (to Phieco32likevirus and then Phieco32virus), while also adding five new phage members to its ranks (Escherichia viruses ECBP2, NJ01, Septima11, SU10, and kv1721) [18,19,20]. The genus was again renamed to its current name of Kuravirus in 2019.
In this study, we isolated a novel E. coli phage, YF01, from a WWTP in Yokohama, Japan. The YF01 phage was sequenced and assembled using long-read Q20 Oxford Nanopore technology and shown to be related to phages from the ICTV-classified Kuravirus genus. Modern reticulate network analyses revealed that YF01 and the current six ICTV-classified Kuravirus phages shared high similarity with 29 other phages present in public databases. Using a combination of approaches, we demonstrate that these 36 phages represent four different genera under current ICTV guidelines. We further analyze the genomic features of the YF01 phage in comparison to other Kuravirus-like group phages and elucidate the core proteome of the group to demonstrate that the large terminase subunit and portal protein, among three other proteins, show high conservation and highly correlated phylogenetic reconstruction to whole genome-based methods. Ultimately, we recommend the expansion and reshaping of the Kuravirus genus to four separate genera to accommodate a total of 36 phages isolated worldwide.

2. Materials and Methods

2.1. Bacterial Strains and Media

Escherichia coli strain K12 (JCM20135), isolated from feces from a diphtheria convalescent sample, was used in this study. Bacterial cultures were grown in Nutrient Broth with NaCl (5 g L⁻¹ peptone, 3 g L⁻¹ beef extract and 5 g L⁻¹ NaCl, pH 7; NB) and NB with agar (with added 12 g L⁻¹ agar) at 25 °C. Cultures were grown under aerobic conditions.

2.2. Isolation and Purification of Phages

Activated sludge mixed liquor was collected in November 2021 from a municipal WWTP in Yokohama that receives urban wastewater. The activated sludge sample was transferred to a 15 mL tube and centrifuged at 2000× g for 5 min. The supernatant was filtered through a cellulose acetate (CA) filter with a 0.20 μm pore size (Advantec Toyo, Tokyo, Japan). A total of 1 mL of filtrate was added to NB and 100 µL of log phase E. coli K12. The suspension was incubated overnight at 25 °C. The overnight culture was centrifuged at 2000× g for 10 min and the supernatant filtered through a CA 0.20 μm pore filter (Advantec Toyo, Tokyo, Japan). E. coli colonies from a growth plate were taken with a sterile cotton swab and spread uniformly on NB agar medium. A total of 40 μL of filtrate was added to the plate dropwise, air-dried, and incubated at 25 °C overnight. A visible plaque was removed with the wide end of a tip and resuspended in SM buffer (200 mM NaCl, 10 mM MgSO4, 50 mM Tris-HCl, pH 7.5), a process repeated twice to ensure a pure phage isolate.

2.3. Transmission Electron Microscopy

TEM was performed as previously described [21]. Briefly, copper grids coated with carbon and formvar were first subjected to a glow discharge treatment for 60 s. A total of 20 µL of the YF01 phage (~1 × 10¹¹ PFU mL⁻¹) was incubated on a grid for 10 min. Grids were washed twice in MilliQ water and stained with 2% (w/v) uranyl acetate prior to a final wash in MilliQ water. Grids were dried prior to examination under a JEOL JEM02010HC electron microscope.

2.4. Verification of pH and Temperature Stability

To test phage stability, the pH of NB was adjusted to 3, 5, 7, 9, and 11 using NaOH and HCl. The pH-adjusted NB and YF01 phages (~1 × 10¹¹ PFU mL⁻¹) were mixed in a 9:1 ratio and incubated at 25 °C for 1 h. Phage PFU mL⁻¹ was then determined by spot test (20 µL) using a dilution series down to 10⁻⁸ and counting plaques at the dilution that yielded 10–100 plaques. For temperature testing, the NB and YF01 phages (~1 × 10¹¹ PFU mL⁻¹) were mixed in a 9:1 ratio and incubated for 1 h at different temperatures (4, 25, 37, 50, 60, 70, and 80 °C). Phage PFU mL⁻¹ was determined as above. Each experiment was repeated twice.
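The titer calculation implied by the spot test can be sketched as follows. This is a minimal illustration and not the authors' own script; the function name, its parameters, and the example plaque count are hypothetical, and only the 20 µL spot volume is taken from the text.

```python
def titer_pfu_per_ml(plaque_count, dilution_exponent, spot_volume_ul=20.0):
    """Estimate PFU per mL from one countable spot.

    plaque_count      -- plaques counted in the spot (ideally 10-100)
    dilution_exponent -- n for a spot taken from the 10^-n dilution
    spot_volume_ul    -- spotted volume in microliters (20 µL in the text)
    """
    if plaque_count < 0 or spot_volume_ul <= 0:
        raise ValueError("plaque count must be >= 0 and volume must be > 0")
    dilution_factor = 10 ** dilution_exponent   # undo the serial dilution
    volume_ml = spot_volume_ul / 1000.0         # convert µL to mL
    return plaque_count * dilution_factor / volume_ml


# Hypothetical example: 25 plaques in a 20 µL spot of the 10^-7 dilution.
example_titer = titer_pfu_per_ml(25, 7)
print(f"{example_titer:.2e} PFU/mL")
```

The sketch reports the titer of the suspension that was serially diluted; any earlier dilution step (such as the 9:1 mixing with NB) would have to be accounted for separately.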

2.5. Phage DNA Isolation

Genomic DNA from 1 mL phage filtrate (>1 × 10¹⁰ PFU mL⁻¹) was extracted using a zinc chloride phenol:chloroform extraction method [22]. Briefly, the phage filtrate was first treated with DNase I, RNase A, and MgCl2 to remove host contaminants before the precipitation of phage virions by the addition of 40 mM ZnCl2. Virions were resuspended in phage extraction buffer (400 mM NaCl, 20 mM EDTA, 0.5% (w/v) SDS and 50 µg mL⁻¹ proteinase K) and incubated for 1 h at 55 °C. An equal volume of phenol:chloroform:isoamyl alcohol (25:24:1) was added and the top layer was removed and DNA precipitated with isopropanol. The pellet was washed in 70% (v/v) ethanol and resuspended in 10 mM Tris-HCl (pH 8.5). DNA concentration and purity were measured using a Quantus Fluorometer (Promega, Tokyo, Japan) and NanoDrop One (Thermo Scientific, Tokyo, Japan), respectively. DNA integrity was assessed via agarose gel electrophoresis.

2.6. Phage DNA Sequencing and Assembly

DNA libraries were prepared from 1 µg of DNA using the Ligation Sequencing Kit 14 (SQK-LSK114), loaded on an R10.4.1 flow cell (FLO-MIN114) and sequenced using a MinION Mk1B (Oxford Nanopore, Tokyo, Japan). Simplex and duplex basecalling was performed using Guppy v6.4.2 in super accurate (SUP) mode. Read filtering was performed with Filtlong v0.2.1 (https://github.com/rrwick/Filtlong). Filtered mean read size was 17.8 kb with a mean read quality of Q20.3, as assessed by NanoPlot v1.40.0 [23]. Reads were assembled with Flye v2.9.1 [24] in high-quality mode to generate a complete phage assembly (704-fold coverage). Post-assembly polishing with Medaka v1.7.2 (https://github.com/nanoporetech/medaka) made no corrections to the assembly. Residual Nanopore adapter sequences were truncated from the termini of the final assembly by screening against Y-adapter top (5′-GGCGTCTGCTTGGGTGTTTAACCTTTTTTTTTTAATGTACTTCGTTCAGTTACGTATTGCT-3′) and Y-adapter bottom (5′-GCAATACGTAACTGAACGAAGT-3′) sequences.
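The adapter screening step could be sketched as below. This is an assumption-laden illustration rather than the procedure actually used: it relies on exact string matching within a hypothetical 100 nt terminal window, whereas residual adapters in real data may be partial or contain errors. The two adapter sequences are those quoted above; the function names and the `window` parameter are hypothetical.

```python
Y_ADAPTER_TOP = "GGCGTCTGCTTGGGTGTTTAACCTTTTTTTTTTAATGTACTTCGTTCAGTTACGTATTGCT"
Y_ADAPTER_BOTTOM = "GCAATACGTAACTGAACGAAGT"


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def trim_terminal_adapters(assembly, adapters, window=100):
    """Remove exact adapter matches (either strand) lying near the termini.

    `window` (hypothetical parameter) is how far from each end we search.
    """
    seq = assembly.upper()
    candidates = set(adapters) | {reverse_complement(a) for a in adapters}

    # 5' end: cut just after the right-most adapter hit inside the window.
    start = 0
    for adapter in candidates:
        pos = seq.rfind(adapter, 0, window)
        if pos != -1:
            start = max(start, pos + len(adapter))

    # 3' end: cut at the left-most adapter hit inside the last window.
    end = len(seq)
    search_from = max(start, len(seq) - window)
    for adapter in candidates:
        pos = seq.find(adapter, search_from)
        if pos != -1:
            end = min(end, pos)

    return seq[start:end]


# Toy example: a made-up 200 nt "genome" flanked by adapter sequence.
toy_genome = "ACGTTGCAAGGCTTAACCGG" * 10
toy_assembly = Y_ADAPTER_TOP + toy_genome + reverse_complement(Y_ADAPTER_TOP)
trimmed = trim_terminal_adapters(toy_assembly, [Y_ADAPTER_TOP, Y_ADAPTER_BOTTOM])
print(len(toy_assembly), "->", len(trimmed))
```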

2.7. Phylogenetic Analysis

For the vConTACT2 reticulate network analysis, the YF01 phage genome was first annotated with Prokka v1.14.6 [25] and the phage protein sequences were combined with protein sequences from all phage genomes present on Genbank in November 2022 (19,164 genomes) [26]. Gene2Genome was used to assign and map protein coding sequences prior to the use of vConTACT2 v0.9.19 using default settings [27]. The network was visualised with Cytoscape v3.9.1 [28] using the default layout for the entire network and then an edge-weighted, spring-embedded model in the zoom view. Kuravirus-like phage group matrix nucleotide similarity analyses were performed using VIRIDIC v1.0 [29] and visualised with R v4.0.3 and R Studio v2022.07.1 using the pheatmap package v.1.0.12. For VICTOR analysis, Kuravirus-like group protein sequences were generated using Prokka as above and analysed using the VICTOR3 pipeline [30]. Briefly, pairwise comparisons of the nucleotide sequences were conducted using the Genome-BLAST Distance Phylogeny (GBDP) method [31] under settings recommended for prokaryotic viruses [30]. The resulting intergenomic distances were used to infer a minimum evolution tree with branch support via FASTME including SPR postprocessing [32]. Branch support was inferred from 100 pseudo-bootstrap replicates each. Taxon boundaries at the species, genus, and family level were estimated through the OPTSIL program [33], the recommended clustering thresholds [30], and an F value (fraction of links required for cluster fusion) of 0.5 [34]. Phylogenetic trees inferred using the D0 formula (optimised for nucleotides) were displayed using iTOL v5 [35]. The phylogenetic reconstruction of group core proteins (terminase large subunit, portal protein, MCP, DNA helicase/primase, and a hypothetical protein) was performed with Phylogeny.fr [36] using MUSCLE v3.8.31 alignment [37] and PhyML v3.1 using the maximum likelihood method for reconstruction [38].

2.8. Genome Annotation

For the genome termini analysis, raw >Q30 Nanopore sequencing reads above 78,000 bp were filtered with Filtlong v0.2.1. Reads were then manually inspected for the presence of direct terminal repeats (DTRs). DTR sequences in other Kuravirus-like group phages were identified using the “Annotate from…” function, as performed previously [22]. To annotate the YF01 phage genome, the assembly was imported into Geneious Prime v2022.2.1 and genes were predicted with Glimmer3 [39] and manually inspected for the presence of ribosome binding sites (RBS). ORFs were annotated using a combination of the NCBI Conserved Domain Database (CDD) [40], profile hidden Markov model (HMM) similarity searches using HHpred [41], and the Virfam webserver [42]. tRNAs were identified using tRNAscan-SE [43] and Aragorn v1.2.41 [44]. Figures were generated using CLC Genomics WorkBench v9.5.5 and Clinker v0.0.12 [45]. Amino acid similarity calculations were performed using Clustal Omega v1.2.3 [46] in Geneious Prime v2022.2.1 with the Blosum 62 similarity matrix. Average amino acid similarity values presented in the text refer to the average value of all similarity values in the matrix. Raw data and calculations are available in the Supplementary Materials.

2.9. Codon Usage Analysis

Coding sequences (nucleotide) for YF01, phiEco32, ES17, KBNP1711, and ECBP2 phages were determined using Prokka, while those for E. coli K12 MG1655 (NC_000913) were accessed directly from Genbank. Codon usage was determined using the statistics function in Geneious Prime v2022.2.1. Codons whose usage differed by ≥2-fold compared to the host were plotted using Prism v7.0.

2.10. Pangenome Analysis

The core analysis of the Kuravirus-like group phages was performed using Roary [22]. Briefly, phage genomes were first annotated with Prokka. Roary v3.13.0 core gene alignment [47] was used with a reduced BLASTp identity threshold of 30% [48] and the representative core proteome (YF01 phage) was subjected to bioinformatic analyses (as described in the section above). Charts were generated with Prism v7.0.

3. Results and Discussion

3.1. Isolation and Characteristics of Escherichia Phage YF01

Escherichia phage YF01 was isolated from activated sludge collected from a municipal WWTP in Yokohama, Japan. The phage formed small, transparent, circular plaques approximately 1 mm in diameter (Figure 1a). Phage virions negatively stained with uranyl acetate and imaged under the transmission electron microscope displayed the unusual C3 podovirus morphotype with an elongated capsid and short tail (Figure 1b). We next tested the stability of the YF01 phage by incubating it at varying temperature and pH conditions. The YF01 phage was stable at temperatures from 4–50 °C, with a decline in phage activity (as measured by PFU mL⁻¹) noted from 60 °C and no observable phage activity above 70 °C (Figure 1c). The YF01 phage was relatively stable in the pH range of 3–11, with only a mild reduction (~2-fold) in phage activity noted after 1 h of storage at pH 3 (Figure 1d). The pH of activated sludge systems is typically close to neutral (~6.5–8 pH), and even with the activated sludge at the plant in Yokohama recording at the lower end of this range (pH 6.4), the YF01 phage is well suited to persisting in wastewater environments.

3.2. Genome Assembly and Phylogenetic Placement of the YF01 Phage

The YF01 phage was sequenced using long-read Q20 Oxford Nanopore technology, leading to the assembly of a 78,626 bp genome with 704-fold coverage (Table 1). The GC content of the YF01 phage, 42.1%, was markedly lower than that observed of its host E. coli (~50.8%), a common phenomenon among lytic phages [49].
To phylogenetically place the YF01 phage we first made use of vContact2, which is a network-based approach that measures the degree of protein content shared between viruses to compute viral clusters (VC) [27]. As vContact2 allows for the input of up to 1 million sequences, a global network phage analysis was performed by inputting the total protein content of all complete phage genome sequences present within the Genbank database as of November 2022 (19,164 phage sequences) in addition to the YF01 phage. In the resulting reticulate network (Figure 2), phages that were found to share statistically confident protein-level similarity with the YF01 phage (as indicated by edge-connection) were further extracted and visualized using an edge-weighted layout to observe their grouping more accurately (Figure 2; zoom panel). The YF01 phage (in blue) formed a VC with 34 other Escherichia (and 1 Shigella) phages (in yellow; Figure 2). Six of these phages, phiEco32 [17], ECBP2 [18], NJ01 [19], KBNP1711, SU10 [20], and 172-1, were previously classified by the ICTV as the Kuravirus genus. vContact2 further split the 36-membered VC into two sub-VCs (considered to be the equivalent to genera grouping [27]) of 24 and 12 phages, with YF01, phiEco32, NJ01, KBNP1711, SU10, and 172-1 phages and 18 non-ICTV-classified phages placed in the former and the ECBP2 phage and 11 non-ICTV-classified phages placed in the latter. Phages infecting other bacterial genera within the Gammaproteobacteria, such as Aeromonas phage Lah_6, Proteus phage Privateer, Salmonella phage 7-11, and Vibrio phage Vp_R1, showed the highest protein-level similarity to the 36-membered VC (Figure 2; zoom panel).
We took these 36 grouped phages, including the YF01 phage and the six ICTV-classified Kuravirus phages, and performed matrix nucleotide identity analysis to understand the nucleotide-level similarity more precisely. Using nucleotide similarity thresholds of 95% and 70%, for species and genus demarcation, respectively, as recommended by the ICTV, VIRIDIC grouped these phages into a total of thirty-three species and four genera G1-G4 (Figure 3; Supplementary Materials, Table S1). In contrast to the genera placings suggested by vContact2 (sub-VCs), here we saw a further split of the first sub-VC (24 phages) into three separate genera of nineteen (G1), three (G2), and two (G3) phages. Four of the ICTV-classified Kuravirus phages (phiEco32, NJ01, SU10, and 172-1 phages) grouped into the largest genus (G1), while KBNP1711 and ECBP2 phages grouped into two separate genera, G3 and G4, respectively. The apparent discrepancy between our genera groupings and the Kuravirus classifications by ICTV occurred due to the nucleotide similarity between the KBNP1711 and ECBP2 phages and the other ICTV-classified phages (such as the phiEco32 phage) falling below the currently accepted threshold of 70% (i.e., ~60.9% nucleotide similarity ECBP2 vs. phiEco32).
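How the 95% and 70% thresholds translate into species and genus groupings can be illustrated with a small sketch. It uses simple single-linkage clustering, which is an assumption made for illustration and may differ from the clustering implemented in VIRIDIC itself. Only the ECBP2 versus phiEco32 value (~60.9%) comes from the text; "phageA" and its similarity values are hypothetical.

```python
def cluster_by_threshold(names, similarity, threshold):
    """Single-linkage clustering of genomes.

    names      -- list of genome names
    similarity -- dict mapping frozenset({a, b}) to % nucleotide similarity
    threshold  -- pairs at or above this value are merged into one cluster
    """
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the cluster representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if similarity[frozenset((a, b))] >= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


names = ["phiEco32", "phageA", "ECBP2"]
similarity = {
    frozenset(("phiEco32", "ECBP2")): 60.9,   # value reported in the text
    frozenset(("phiEco32", "phageA")): 85.0,  # hypothetical
    frozenset(("phageA", "ECBP2")): 61.0,     # hypothetical
}

species = cluster_by_threshold(names, similarity, 95.0)
genera = cluster_by_threshold(names, similarity, 70.0)
print("species:", species)
print("genera:", genera)
```

In this toy case the three genomes remain three separate species, while phiEco32 and the hypothetical phageA share a genus and ECBP2 falls into a genus of its own, mirroring the separation of ECBP2 described above.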
We then performed a phylogenetic reconstruction of the clustered phage genomes using VICTOR, which is based on Genome BLAST Distance Phylogeny (GBDP), using whole phage genome sequence inputs [30]. Here we input the complete nucleotide sequence of the 36 clustered phages as well as the complete nucleotide sequences from 14 related phages infecting other Aeromonas, Cronobacter, Escherichia, Proteus, Salmonella, and Vibrio species. VICTOR phylogeny demonstrated that the 36 clustered phages formed a monophyletic group (Figure 4), with clear sub-clade formation representing the four genus groupings G1–G4, indicated by VIRIDIC (Figure 4, G1–4). OPTSIL taxon boundary prediction suggested, in contrast to our vContact2 and VIRIDIC analyses, that the 36 grouped phages belonged to a single genus. VICTOR was originally optimized against a large reference phage genome database recognized by the ICTV in 2017, and it is likely that the evolution of threshold criteria and the rapid expansion of ICTV-classified phages over the past 5 years has caused this apparent divergence.
The current ICTV classification of the six Kuravirus phages occurred between 2009 (the creation of the genus with phiEco32 phage) and 2015–2016 (the addition of ECBP2, NJ01, KBNP1711, SU10, and 172-1 phages). Given the substantial increase in the number of Kuravirus-like phage genomes deposited in public databases since then (30 new Kuravirus-like phages deposited in the past 9 years) and based upon the current criteria for the genome-based classification of viruses [48], we believe it would be sensible to revisit and reshape the Kuravirus genus into four separate genera based on the currently accepted similarity thresholds (95% and 70% for species and genus, respectively). We will recommend to the ICTV the identification of these four genera as Kuravirus (G1), Vellorevirus (G2), Jinjuvirus (G3), and Yesanvirus (G4), with the names of the new genera (G2–G4) representing either the geographical location of isolation or the locality of the research group of the founding phage member of each respective genus (myPSH1131, KBNP1711, and ECBP2 phages for Vellorevirus, Jinjuvirus, and Yesanvirus, respectively). At the present time we will refer to this grouping of 36 phages as the Kuravirus-like phage group or simply “the group” (Figure 4).

3.3. Genomic Organisation of the YF01 Phage

The 78,626 bp YF01 phage genome encodes 121 CDS, 66% of which are hypothetical proteins, and a single tRNA (Supplementary Materials, Table S2). The genome is characteristically organized into functional modules, with genes involved in virion morphogenesis and lysis module on the left (gp3-gp22) and genes involved in DNA replication and nucleotide metabolism on the right (gp23-gp81; Figure 5).

3.3.1. Genome Termini

Kuravirus-like group phages are known to exhibit short (~190 bp) direct terminal repeats (DTRs) indicative of a T7 phage-like replication strategy where the substrate for capsid packaging is large DNA concatemers. The presence of DTRs in the group was first determined in phiEco32 phage (193 bp DTRs) using a combination of restriction profiling, primer walking, and sub-cloning strategies [17]. In more modern approaches, DNA termini are often computed from raw short-read sequence data via the analysis of starting position coverage [63]. With this approach, the termini of only two other group phages have been determined, SU7 and Paul phages, with noted DTRs of 53 bp (cautioned by the authors) and 193 bp, respectively [56,61]. Given the YF01 phage was sequenced with nanopore long-read technology, we curated reads spanning the entire YF01 genome sequence (~78,600 bp; many >Q30 due to duplex strand basecalling) and analyzed the DNA termini. Apart from residual sequencing adapter sequence on the read termini, we noted the distinct presence of 193 bp direct repeats on most reads, consistent with our assembly, indicating that the YF01 phage, like phiEco32 and Paul phages, contains 193 bp DTRs. Using sequence similarity, we identified 192–193 bp DTRs in a total of 30 phages in the group, which share a highly conserved region between nucleotide positions ~130–170 bp (Supplementary Materials, Figure S1).
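The idea of spotting a direct terminal repeat on a genome-spanning read can be sketched in a few lines. The reads in this study were inspected manually, so the code below is only a hypothetical illustration: it assumes adapter-free sequence and exact identity between the two repeat copies, which raw reads will not always satisfy, and the function name and length limits are invented for the example.

```python
def direct_terminal_repeat(seq, min_len=20, max_len=1000):
    """Return the longest prefix of `seq` that is also its suffix.

    An empty string is returned when no exact repeat of at least
    `min_len` nucleotides is found. Both length limits are hypothetical.
    """
    seq = seq.upper()
    longest_allowed = min(max_len, len(seq) // 2)  # repeats may not overlap
    for k in range(longest_allowed, min_len - 1, -1):
        if seq[:k] == seq[-k:]:
            return seq[:k]
    return ""


# Toy read: a made-up 24 nt repeat flanking a made-up 50 nt body.
toy_repeat = "ACGTTGCAAGGCTTAACCGGATCG"
toy_read = toy_repeat + "TTGACCATGA" * 5 + toy_repeat
found = direct_terminal_repeat(toy_read)
print(len(found), found)
```

For real long reads, an alignment-based comparison of the two termini that tolerates mismatches and indels would be more appropriate than exact matching.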

3.3.2. Virion Morphogenesis and Lysis

The virion morphogenesis and lysis module in the YF01 phage shares high homology with other group phages, comprising a total of 20 genes transcribed in the forward direction (Figure 6). The terminase small and large subunits (gp3-4), which form the molecular motor responsible for packaging of the nascent phage DNA, are the first two genes in this module. While gp3 displayed no known domains via the NCBI conserved domains database (CDD), it did show similarity to the small terminase subunit in the Pseudomonas phage PaP3 using a more sensitive comparison of profile hidden Markov models (HMM) [41,64]. We did note that some members of the group, such as the 172-1 phage, have a gene further downstream of this module (equivalent to gp25 in the YF01 phage) annotated as a terminase small subunit. This was likely due to the presence of an HNH endonuclease domain; such domains are often associated with phage DNA packaging [65]. However, given that HNH endonuclease domains are numerous in phages (e.g., there are four in the YF01 phage) and the small and large terminase subunits are commonly co-located in the phage genome, we have instead assigned gp3 as the putative small terminase subunit in the YF01 phage.
The next genes in this module (gp5-9) encode proteins that assemble to form the rare C3 morphotype (an elongated phage capsid) seen in the YF01 phage (Figure 1b) and other group members, including the portal protein (gp5), scaffolding protein (gp7), major capsid protein (gp8), and adapter protein (gp9; Figure 6). The largest variance was observed in the major capsid protein (MCP) of the YF01 phage, compared to the representative members of the group (Figure 6). The MCP in the YF01 and other group phages [17] is produced in two forms: a shorter product corresponding to the natural start (ATG) and stop (TAA) codons of gp8 (nt. 7253–8314 in the YF01 phage, 353 a.a) and a larger product that occurs due to a ribosomal slippage event at a heptanucleotide slippery sequence (GGGAAAG, nt. 8293–8299 in the YF01 phage) near the natural stop codon of gp8, leading to an extended isoform using the −1 frame (nt. 7253–9930 in the YF01 phage, 892 a.a). This is a well-known phenomenon in the tail assembly chaperone in long-tailed phages (~3.5% slippage rate in the Escherichia phage Lambda [66]) but also occurs in the MCP of Escherichia phages T3 and T7 (~10% slippage rate) [67,68].
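The relationship between the quoted nucleotide coordinates and the two MCP lengths can be checked with a short calculation. The sketch below assumes that both coordinate ranges are 1-based, inclusive, and end with a stop codon, and that a −1 slip causes one nucleotide to be read twice; the function name is hypothetical.

```python
def protein_length(start_nt, end_nt, minus_one_frameshifts=0):
    """Amino acids encoded by an inclusive, 1-based span ending in a stop codon.

    Each -1 ribosomal frameshift makes the ribosome re-read one nucleotide,
    so it adds one nucleotide to the translated length.
    """
    span = end_nt - start_nt + 1
    translated_nt = span + minus_one_frameshifts
    if translated_nt % 3 != 0:
        raise ValueError("translated length is not a whole number of codons")
    return translated_nt // 3 - 1  # the stop codon encodes no amino acid


short_isoform = protein_length(7253, 8314)          # natural gp8 product
extended_isoform = protein_length(7253, 9930, 1)    # one -1 slip at GGGAAAG
print(short_isoform, extended_isoform)
```

Under these assumptions the calculation reproduces the reported lengths of 353 and 892 amino acids for the short and extended isoforms, respectively.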
In the YF01 phage, the extended MCP isoform is substantially larger (892 a.a) than those seen in many other group members (~500–550 a.a) and most similar in size to that seen in the Paul phage (894 a.a.) [56]. As observed in T3 phage [67], the extended MCP isoform results in the addition of an immunoglobulin (Ig)-like domain to the C-terminus of the MCP. Ig-like domains are extremely common, diverse (>68 domain types recognized), and widely distributed in phages [69]. The specific function of Ig-like domains remains yet to be fully understood; however, they appear to be only present in structural proteins [69], with several pieces of evidence indicating possible involvement in attachment to the host cell surface [69]. The specific implications of these Ig-like domains in Kuravirus-like group phages remains unclear.
Interestingly, the ultrastructural analysis of the group member, SU10 phage, revealed a capsid structure with apparent uniform MCP formation (i.e., lacking the extended isoform) [70]. Consistent with this observation, a band corresponding to an extended MCP isoform was absent via SDS-PAGE [70]. This contrasts with earlier work on SU10 phage, which indicated the presence of an extended MCP isoform by mass spectrometry [20]. Upon our inspection of the SU10 phage genome, we noted that while a heptanucleotide slippery sequence is present in the gene encoding MCP, a stop codon (TAG) occurs 39 nucleotides downstream in the −1 frame, leading to the premature termination of the extended MCP isoform, which would result in a product nearly indistinguishable in size (~7 a.a larger) from the natural MCP. This apparent loss of the extended MCP isoform is unique to SU10 phage among the group.
In the middle of this module lie two host lysis-related proteins, a holin (gp12) and an endolysin (gp13), which work in concert to induce host cell lysis at the conclusion of the phage lifecycle (Figure 6). Many members of the Kuravirus-like group have demonstrated good bacteriolytic activity against pathogenic and drug-resistant E. coli [17,50,52,53,62], the ability to eliminate E. coli from food matrices when used in phage cocktails [50], and have shown effectiveness in reducing E. coli load in vivo [53].
The next structural genes in this module encode numerous tail proteins (gp10-11, gp14-16; Figure 6). The structure and assembly of the SU10 phage provides us detailed insights into how these components may assemble in the YF01 phage, given the high sequence similarity between the two phages [70]. In the YF01 phage, the proximal (gp10) and distal (gp11) tail fibers likely assemble into hexameric long tail fibers, which are attached to the adapter complex [70]. The nozzle, composed of a hexamer of nozzle proteins (gp15), also attached to the adapter complex, is bound by short tail fibers (gp14) [70]. The three tail fiber genes showed some of the largest sequence variance across the group (Figure 6), a common finding among related phages [22], and may cause variances in the host range regarding members of the group. Finally, the tail needle, formed by gp16, extends from the channel formed by the nozzle.
The final genes in this module appear to encode ejection proteins (gp18-22) and may be packaged inside the phage capsid along with the genomic DNA and play roles during the infection process (Figure 6) [70].

3.3.3. DNA Replication and Nucleotide Metabolism

The genome of the YF01 phage contains an assortment of genes involved in DNA replication, repair, and nucleotide metabolism. These include a DNA polymerase (gp49) and separately encoded 5′-3′ exonuclease (gp27) and 3′-5′ exonuclease (gp70) subunits. The separation of the DNA polymerase and its exonuclease domains is common to all group members and is similarly observed in phages closely related to the group (as shown in Figure 4), such as the Salmonella phage 7-11, the Proteus phage Privateer, the Vibrio phage Vp_R1, and the Aeromonas phage BUCT695. The YF01 phage also encodes a dual-purpose DNA primase/helicase (gp71) and NAD-dependent DNA ligase (gp57). The YF01 phage, like many phages, encodes certain enzymes involved in nucleotide metabolism and modification. These include a deoxynucleoside monophosphate kinase (gp28), a thymidylate synthase (gp60), and a deoxycytidine triphosphate deaminase (gp68).
Within the YF01 phage genome, we also identified genes encoding an RNA polymerase sigma factor (gp30) and a small transcriptional regulator (gp76), which are highly conserved across the group (95.2% and 93.5% average similarity, respectively; Supplementary Materials, Table S3). Transcriptional studies of the group member phage, phiEco32, highlighted the temporal expression of genes, corresponding to “early”, “middle”, and “late” genes [71]. The expression profiles correspond to the location and function of genes in the phage genome, with early genes located on the far right of the phage genome (gp82-122 in the YF01 phage; most hypothetical), middle genes located in the middle of the phage genome corresponding to DNA replication genes (gp23-81 in the YF01 phage), and late genes located at the left of the phage genome corresponding to the virion morphogenesis and lysis genes (gp1-22 in the YF01 phage). In the phiEco32 phage, RNA polymerase holoenzymes assembled with the phage-encoded sigma factor (gp30 in the YF01 phage) were shown to drive expression of some middle genes and all late genes (i.e., drive expression later in the phage infection cycle) [71]. The transcriptional regulator (gp76 in the YF01 phage) appeared to have a dual function in the phiEco32 phage by (1) promoting the expression of the phage sigma factor and (2) shutting off early gene expression via physical interaction with host σ70-factor, essentially promoting the progression of the phage lifecycle to late stage genes [71].
The YF01 phage, like many of the other group phages [17], also encodes a single tRNA (Figure 5; magenta). Phages are known to encode specific tRNA genes to compensate for differences in their codon usage compared to their hosts [72,73]. In the YF01 phage, the tRNA gene has a TCT anticodon corresponding to an arginine (Arg) AGA codon. We analyzed codon usage across all ORFs in the YF01 phage to show AGA was the second most used Arg codon with a usage share of approximately 20.2%. Conversely, in the host E. coli strain (K12), AGA was the second least used of the six possible Arg codons, with a usage share of approximately 3.6%. This represents a ~5.6-fold increase in AGA codon usage in the YF01 phage over the host (Figure 7; yellow; Supplementary Materials, Table S5). The fold increase in AGA codon usage ranged from ~4.4 to 5.4 across other representative group members from G1–G4 versus E. coli K12 (Supplementary Materials, Table S5). While AGA showed the highest fold-increase usage in the YF01 phage, other codons were also found to be preferred when compared to the host (Figure 7; green). It remains unclear why the YF01 phage, Kuravirus-like group members, and other phages [74] carry only specific supplemental tRNA genes within their genomes despite the high usage frequency of additional codons.
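The codon usage comparison can be sketched as follows. The analysis itself was performed in Geneious Prime, so this code is only an illustrative approximation with hypothetical function names and a made-up toy sequence; the only values taken from the text are the AGA usage shares of 20.2% (YF01) and 3.6% (E. coli K12).

```python
from collections import Counter

# The six arginine codons of the standard genetic code (DNA alphabet).
ARG_CODONS = ("CGT", "CGC", "CGA", "CGG", "AGA", "AGG")


def codon_counts(coding_sequences):
    """Count in-frame codons across a list of coding sequences."""
    counts = Counter()
    for cds in coding_sequences:
        cds = cds.upper()
        usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
        for i in range(0, usable, 3):
            counts[cds[i:i + 3]] += 1
    return counts


def usage_share(counts, codon, synonymous=ARG_CODONS):
    """Percentage of one codon among its synonymous codons."""
    total = sum(counts[c] for c in synonymous)
    return 100.0 * counts[codon] / total if total else 0.0


def fold_change(phage_share, host_share):
    """Fold difference in usage share between phage and host."""
    return phage_share / host_share


# Toy sequence (made up): two AGA, one CGT, one CGC.
toy_share = usage_share(codon_counts(["AGAAGACGTCGC"]), "AGA")

# Usage shares reported in the text for YF01 and E. coli K12.
reported_fold = fold_change(20.2, 3.6)
print(toy_share, round(reported_fold, 1))
```

Applying `fold_change` to every codon and retaining those with a ratio of at least 2 would correspond to the ≥2-fold filter described in the Methods.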

3.4. The Core Proteome of Kuravirus-like Group Phages

Finally, we sought to determine the core protein-encoding genome of the Kuravirus-like phages. Our pangenome analysis of 32 group members (four phages were excluded due to the presence of minor assembly errors, as assessed by the completeness of the terminase large subunit ORF) revealed a core proteome of 63 proteins, which were shared amongst all member phages (53% of the total proteome in the case of the YF01 phage; Figure 8; Supplementary Materials, Table S4). This core proteome consisted of 16 early genes (all hypothetical or of unknown function), 28 middle genes encoding some DNA replication and transcription proteins, and 19 late genes encoding mostly virion structural components (Figure 8). Consistent with the above findings of larger sequence variance across the proximal and distal long tail fibers (and the lack of a clear homologue in some cases) and the short tail fiber, these components were not considered to be core among the group of phages. Conversely, the highly conserved RNA polymerase sigma factor and small transcriptional regulator unsurprisingly formed part of the core proteome. The large terminase subunit, portal protein, MCP (the non-extended isoform), DNA primase/helicase, and a hypothetical protein located in the early genes (no sequence or HMM profile similarity to known domains) were found to show the highest sequence conservation across the group (94.4–96.5% average amino acid identity; Supplementary Materials, Table S6). The phylogenetic reconstruction of the group phages using these proteins as markers resulted in groupings (G1–G4) that were relatively consistent with our earlier whole genome nucleotide (VIRIDIC and VICTOR) or whole proteome (vContact2) analyses, indicating the strength of these five core proteins in phylogenetic analyses (Supplementary Materials, Figure S2).
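The notion of a core proteome used above, proteins shared by every member phage, can be sketched as a set intersection. The following Python example is illustrative only and is not the pangenome software used in this study: it assumes the proteins have already been clustered into families, and the phage names and cluster IDs are hypothetical.

```python
def core_and_accessory(presence):
    """Split protein clusters into core (present in all phages) and accessory.

    presence: dict mapping a phage name to the set of protein cluster IDs it encodes.
    Returns (core_set, dict of phage name -> accessory set).
    """
    genomes = list(presence.values())
    core = set.intersection(*genomes) if genomes else set()
    accessory = {name: clusters - core for name, clusters in presence.items()}
    return core, accessory


def core_fraction(core, clusters):
    """Fraction of one phage's clusters that belong to the core proteome."""
    return len(core & clusters) / len(clusters) if clusters else 0.0


# Hypothetical presence/absence data for three phages (not real annotations).
presence = {
    "phageA": {"c1", "c2", "c3", "c4"},
    "phageB": {"c1", "c2", "c5"},
    "phageC": {"c1", "c2", "c3"},
}

core, accessory = core_and_accessory(presence)
fraction_a = core_fraction(core, presence["phageA"])

# Consistency check on the counts reported in the text: early + middle + late genes.
reported_core_size = 16 + 28 + 19
```

In this toy data the core is `{"c1", "c2"}`, and `reported_core_size` sums to the 63 core proteins stated in the text. A strict intersection is sensitive to assembly or annotation errors in any single genome, which is consistent with the exclusion of four phages with minor assembly errors before the analysis.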

4. Conclusions

This study describes the expansion and reshaping of the current ICTV-classified Kuravirus genus from six to thirty-six member phages using a combination of modern bioinformatics approaches. These phages are here grouped into four genera, namely Kuravirus and the three newly proposed genera, Vellorevirus, Jinjuvirus, and Yesanvirus. We also described a new member of this group, the Escherichia phage YF01, isolated from activated sludge in Yokohama, Japan. Genomic analyses described the organization of the YF01 phage genome and characterized the function of encoded proteins involved in phage DNA replication and metabolism, virion morphogenesis, host cell lysis, and transcriptional regulation. Lastly, we determined that sixty-three proteins form the core proteome of Kuravirus-like group phages, with five proteins, including the large terminase subunit and DNA primase/helicase, being among the strongest phylogenetic markers of Kuravirus-like phages.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/v15020506/s1, Supplementary Figure S1: The direct terminal repeat (DTR) sequence in Kuravirus-like cluster phages; Supplementary Figure S2: Phylogenetic reconstruction based on five high-identity Kuravirus-like phage cluster core proteins; Supplementary Table S1: VIRIDIC percentage intergenomic similarity raw data; Supplementary Table S2: Escherichia phage YF01 genome annotations; Supplementary Table S3: ClustalW protein sequence similarity matrices and average calculations for transcription genes; Supplementary Table S4: Kuravirus-like group core proteome list; Supplementary Table S5: Codon usage in YF01 phage and representative Kuravirus-like group phages versus the host; Supplementary Table S6: ClustalW protein sequence identity matrices and average calculations for core proteome.

Author Contributions

S.B. and T.N. conceived and designed the study. Y.F. isolated the phage and performed pH and temperature stability experiments. Y.F. and S.B. sequenced and assembled the phage genome. S.B. performed bioinformatic analyses on the YF01 phage and the Kuravirus-like group phages. S.B. wrote the manuscript in consultation with Y.F. and T.N. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by JSPS KAKENHI Grants (JP17KK0010, JP22H03771, JP22F20714). S.B. was funded by a JSPS Postdoctoral Fellowship (P20714).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The genome sequence of the Escherichia phage vB_EcoP-YF01 is available in NCBI Genbank under the accession number OQ025076. Raw sequence data are available under BioProject accession number PRJNA912594.

Acknowledgments

We thank Y. Kaneda for performing the transmission electron microscopy and S. Petrovski for their helpful discussion. We thank the Yokohama City Government for providing the activated sludge sample.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. Tenaillon, O.; Skurnik, D.; Picard, B.; Denamur, E. The population genetics of commensal Escherichia coli. Nat. Rev. Microbiol. 2010, 8, 207–217.
  2. Michael, I.; Rizzo, L.; McArdell, C.S.; Manaia, C.M.; Merlin, C.; Schwartz, T.; Dagot, C.; Fatta-Kassinos, D. Urban wastewater treatment plants as hotspots for the release of antibiotics in the environment: A review. Water Res. 2013, 47, 957–995.
  3. Torres-Barceló, C.; Hochberg, M.E. Evolutionary rationale for phages as complements of antibiotics. Trends Microbiol. 2016, 24, 249–256.
  4. Kutateladze, M.; Adamia, R. Bacteriophages as potential new therapeutics to replace or supplement antibiotics. Trends Biotechnol. 2010, 28, 591–595.
  5. Levin, B.R.; Bull, J.J. Population and evolutionary dynamics of phage therapy. Nat. Rev. Microbiol. 2004, 2, 166–173.
  6. Batinovic, S.; Wassef, F.; Knowler, S.A.; Rice, D.T.; Stanton, C.R.; Rose, J.; Tucci, J.; Nittami, T.; Vinh, A.; Drummond, G.R. Bacteriophages in Natural and Artificial Environments. Pathogens 2019, 8, 100.
  7. Salmond, G.P.; Fineran, P.C. A century of the phage: Past, present and future. Nat. Rev. Microbiol. 2015, 13, 777–786.
  8. Du, B.; Wang, Q.; Yang, Q.; Wang, R.; Yuan, W.; Yan, L. Responses of bacterial and bacteriophage communities to long-term exposure to antimicrobial agents in wastewater treatment systems. J. Hazard. Mater. 2021, 414, 125486.
  9. Ewert, D.L.; Paynter, M. Enumeration of bacteriophages and host bacteria in sewage and the activated-sludge treatment process. Appl. Environ. Microbiol. 1980, 39, 576–583.
  10. Otawa, K.; Lee, S.H.; Yamazoe, A.; Onuki, M.; Satoh, H.; Mino, T. Abundance, diversity, and dynamics of viruses on microorganisms in activated sludge processes. Microb. Ecol. 2007, 53, 143–152.
  11. Wu, Q.; Liu, W.-T. Determination of virus abundance, diversity and distribution in a municipal wastewater treatment plant. Water Res. 2009, 43, 1101–1109.
  12. Maal, K.B.; Delfan, A.S.; Salmanizadeh, S. Isolation and identification of two novel Escherichia coli bacteriophages and their application in wastewater treatment and coliform’s phage therapy. Jundishapur J. Microbiol. 2015, 8, e14945.
  13. Chan, B.K.; Abedon, S.T.; Loc-Carrillo, C. Phage cocktails and the future of phage therapy. Future Microbiol. 2013, 8, 769–783.
  14. Sereika, M.; Kirkegaard, R.H.; Karst, S.M.; Michaelsen, T.Y.; Sørensen, E.A.; Wollenberg, R.D.; Albertsen, M. Oxford Nanopore R10.4 long-read sequencing enables the generation of near-finished bacterial genomes from pure cultures and metagenomes without short-read or reference polishing. Nat. Methods 2022, 19, 823–826.
  15. Luo, J.; Meng, Z.; Xu, X.; Wang, L.; Zhao, K.; Zhu, X.; Qiao, Q.; Ge, Y.; Mao, L.; Cui, L. Systematic benchmarking of nanopore Q20+ kit in SARS-CoV-2 whole genome sequencing. Front. Microbiol. 2022, 13, 973367.
  16. Vitt, A.R.; Ahern, S.J.; Gambino, M.; Sørensen, M.C.; Brøndsted, L. Genome Sequences of 16 Escherichia coli Bacteriophages Isolated from Wastewater, Pond Water, Cow Manure, and Bird Feces. Microbiol. Resour. Announc. 2022, 11, e00608-22.
  17. Savalia, D.; Westblade, L.F.; Goel, M.; Florens, L.; Kemp, P.; Akulenko, N.; Pavlova, O.; Padovan, J.C.; Chait, B.T.; Washburn, M.P. Genomic and proteomic analysis of phiEco32, a novel Escherichia coli bacteriophage. J. Mol. Biol. 2008, 377, 774–789.
  18. Nho, S.-W.; Ha, M.-A.; Kim, K.-S.; Kim, T.-H.; Jang, H.-B.; Cha, I.-S.; Park, S.-B.; Kim, Y.-K.; Jung, T.-S. Complete Genome Sequence of the Bacteriophages ECBP1 and ECBP2 Isolated from Two Different Escherichia coli Strains. J. Virol. 2012, 86, 12439–12440.
  19. Li, Y.; Chen, M.; Tang, F.; Yao, H.; Lu, C.; Zhang, W. Complete genome sequence of the novel lytic avian pathogenic coliphage NJ01. J. Virol. 2012, 86, 13874–13875.
  20. Mirzaei, M.K.; Eriksson, H.; Kasuga, K.; Haggård-Ljungquist, E.; Nilsson, A.S. Genomic, proteomic, morphological, and phylogenetic analyses of vB_EcoP_SU10, a podoviridae phage with C3 morphology. PLoS ONE 2014, 9, e116294.
  21. Stanton, C.R.; Rice, D.T.; Beer, M.; Batinovic, S.; Petrovski, S. Isolation and characterisation of the Bundooravirus genus and phylogenetic investigation of the Salasmaviridae bacteriophages. Viruses 2021, 13, 1557.
  22. Batinovic, S.; Stanton, C.R.; Rice, D.T.F.; Rowe, B.; Beer, M.; Petrovski, S. Tyroviruses are a new group of temperate phages that infect Bacillus species in soil environments worldwide. BMC Genom. 2022, 23, 777.
  23. De Coster, W.; D’hert, S.; Schultz, D.T.; Cruts, M.; Van Broeckhoven, C. NanoPack: Visualizing and processing long-read sequencing data. Bioinformatics 2018, 34, 2666–2669.
  24. Kolmogorov, M.; Yuan, J.; Lin, Y.; Pevzner, P.A. Assembly of long, error-prone reads using repeat graphs. Nat. Biotechnol. 2019, 37, 540–546.
  25. Seemann, T. Prokka: Rapid prokaryotic genome annotation. Bioinformatics 2014, 30, 2068–2069.
  26. Cook, R.; Brown, N.; Redgwell, T.; Rihtman, B.; Barnes, M.; Clokie, M.; Stekel, D.J.; Hobman, J.; Jones, M.A.; Millard, A. Infrastructure for a Phage Reference database: Identification of large-scale biases in the current collection of cultured phage genomes. PHAGE 2021, 2, 214–223.
  27. Jang, H.B.; Bolduc, B.; Zablocki, O.; Kuhn, J.H.; Roux, S.; Adriaenssens, E.M.; Brister, J.R.; Kropinski, A.M.; Krupovic, M.; Lavigne, R. Taxonomic assignment of uncultivated prokaryotic virus genomes is enabled by gene-sharing networks. Nat. Biotechnol. 2019, 37, 632–639.
  28. Shannon, P.; Markiel, A.; Ozier, O.; Baliga, N.S.; Wang, J.T.; Ramage, D.; Amin, N.; Schwikowski, B.; Ideker, T. Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Res. 2003, 13, 2498–2504.
  29. Moraru, C.; Varsani, A.; Kropinski, A.M. VIRIDIC—A novel tool to calculate the intergenomic similarities of prokaryote-infecting viruses. Viruses 2020, 12, 1268.
  30. Meier-Kolthoff, J.P.; Göker, M. VICTOR: Genome-based phylogeny and classification of prokaryotic viruses. Bioinformatics 2017, 33, 3396–3404.
  31. Meier-Kolthoff, J.P.; Auch, A.F.; Klenk, H.-P.; Göker, M. Genome sequence-based species delimitation with confidence intervals and improved distance functions. BMC Bioinform. 2013, 14, 60.
  32. Lefort, V.; Desper, R.; Gascuel, O. FastME 2.0: A comprehensive, accurate, and fast distance-based phylogeny inference program. Mol. Biol. Evol. 2015, 32, 2798–2800.
  33. Göker, M.; García-Blázquez, G.; Voglmayr, H.; Tellería, M.T.; Martín, M.P. Molecular taxonomy of phytopathogenic fungi: A case study in Peronospora. PLoS ONE 2009, 4, e6319.
  34. Meier-Kolthoff, J.P.; Hahnke, R.L.; Petersen, J.; Scheuner, C.; Michael, V.; Fiebig, A.; Rohde, C.; Rohde, M.; Fartmann, B.; Goodwin, L.A. Complete genome sequence of DSM 30083T, the type strain (U5/41T) of Escherichia coli, and a proposal for delineating subspecies in microbial taxonomy. Stand. Genom. Sci. 2014, 9, 2.
  35. Letunic, I.; Bork, P. Interactive Tree of Life (iTOL) v5: An online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 2021, 49, W293–W296.
  36. Dereeper, A.; Guignon, V.; Blanc, G.; Audic, S.; Buffet, S.; Chevenet, F.; Dufayard, J.-F.; Guindon, S.; Lefort, V.; Lescot, M. Phylogeny.fr: Robust phylogenetic analysis for the non-specialist. Nucleic Acids Res. 2008, 36, W465–W469.
  37. Edgar, R.C. MUSCLE: A multiple sequence alignment method with reduced time and space complexity. BMC Bioinform. 2004, 5, 113.
  38. Guindon, S.; Dufayard, J.-F.; Lefort, V.; Anisimova, M.; Hordijk, W.; Gascuel, O. New algorithms and methods to estimate maximum-likelihood phylogenies: Assessing the performance of PhyML 3.0. Syst. Biol. 2010, 59, 307–321.
  39. Delcher, A.L.; Bratke, K.A.; Powers, E.C.; Salzberg, S.L. Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics 2007, 23, 673–679.
  40. Lu, S.; Wang, J.; Chitsaz, F.; Derbyshire, M.K.; Geer, R.C.; Gonzales, N.R.; Gwadz, M.; Hurwitz, D.I.; Marchler, G.H.; Song, J.S. CDD/SPARCLE: The conserved domain database in 2020. Nucleic Acids Res. 2020, 48, D265–D268.
  41. Zimmermann, L.; Stephens, A.; Nam, S.-Z.; Rau, D.; Kübler, J.; Lozajic, M.; Gabler, F.; Söding, J.; Lupas, A.N.; Alva, V. A completely reimplemented MPI bioinformatics toolkit with a new HHpred server at its core. J. Mol. Biol. 2018, 430, 2237–2243.
  42. Lopes, A.; Tavares, P.; Petit, M.-A.; Guérois, R.; Zinn-Justin, S. Automated classification of tailed bacteriophages according to their neck organization. BMC Genom. 2014, 15, 1027.
  43. Chan, P.P.; Lowe, T.M. tRNAscan-SE: Searching for tRNA genes in genomic sequences. In Gene Prediction; Springer: Berlin/Heidelberg, Germany, 2019; pp. 1–14.
  44. Laslett, D.; Canback, B. ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences. Nucleic Acids Res. 2004, 32, 11–16.
  45. Gilchrist, C.L.; Chooi, Y.-H. Clinker & clustermap.js: Automatic generation of gene cluster comparison figures. Bioinformatics 2021, 37, 2473–2475.
  46. Sievers, F.; Wilm, A.; Dineen, D.; Gibson, T.J.; Karplus, K.; Li, W.; Lopez, R.; McWilliam, H.; Remmert, M.; Söding, J. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol. Syst. Biol. 2011, 7, 539.
  47. Page, A.J.; Cummins, C.A.; Hunt, M.; Wong, V.K.; Reuter, S.; Holden, M.T.; Fookes, M.; Falush, D.; Keane, J.A.; Parkhill, J. Roary: Rapid large-scale prokaryote pan genome analysis. Bioinformatics 2015, 31, 3691–3693.
  48. Turner, D.; Kropinski, A.M.; Adriaenssens, E.M. A roadmap for genome-based phage taxonomy. Viruses 2021, 13, 506.
  49. Rocha, E.P.; Danchin, A. Base composition bias might result from competition for metabolic resources. Trends Genet. 2002, 18, 291–294.
  50. Shahin, K.; Bao, H.; Zhu, S.; Soleimani-Delfan, A.; He, T.; Mansoorianfar, M.; Wang, R. Bio-control of O157:H7, and colistin-resistant MCR-1-positive Escherichia coli using a new designed broad host range phage cocktail. LWT 2022, 154, 112836.
  51. van Mierlo, J.; Hagens, S.; Witte, S.; Klamert, S.; van der Straat, L.; Fieseler, L. Complete Genome Sequences of Escherichia coli Phages vB_EcoM-EP75 and vB_EcoP-EP335. Microbiol. Resour. Announc. 2019, 8, e00078-19.
  52. Manohar, P.; Tamhankar, A.J.; Lundborg, C.S.; Nachimuthu, R. Therapeutic characterization and efficacy of bacteriophage cocktails infecting Escherichia coli, Klebsiella pneumoniae, and Enterobacter species. Front. Microbiol. 2019, 10, 574.
  53. Manohar, P.; Tamhankar, A.J.; Lundborg, C.S.; Ramesh, N. Isolation, characterization and in vivo efficacy of Escherichia phage myPSH1131. PLoS ONE 2018, 13, e0206278.
  54. Gogokhia, L.; Buhrke, K.; Bell, R.; Hoffman, B.; Brown, D.G.; Hanke-Gogokhia, C.; Ajami, N.J.; Wong, M.C.; Ghazaryan, A.; Valentine, J.F.; et al. Expansion of Bacteriophages Is Linked to Aggravated Intestinal Inflammation and Colitis. Cell Host Microbe 2019, 25, 285–299.e8.
  55. Korf, I.H.E.; Meier-Kolthoff, J.P.; Adriaenssens, E.M.; Kropinski, A.M.; Nimtz, M.; Rohde, M.; van Raaij, M.J.; Wittmann, J. Still Something to Discover: Novel Insights into Escherichia coli Phage Diversity and Taxonomy. Viruses 2019, 11, 454.
  56. Holt, A.; Saldana, R.; Moreland, R.; Gill, J.J.; Liu, M.; Ramsey, J. Complete Genome Sequence of Escherichia coli Phage Paul. Microbiol. Resour. Announc. 2019, 8, e01093-19.
  57. Lu, H.; Yan, P.; Xiong, W.; Wang, J.; Liu, X. Genomic characterization of a novel virulent phage infecting Shigella flexneri and isolated from sewage. Virus Res. 2020, 283, 197983.
  58. Gibson, S.B.; Green, S.I.; Liu, C.G.; Salazar, K.C.; Clark, J.R.; Terwilliger, A.L.; Kaplan, H.B.; Maresso, A.W.; Trautner, B.W.; Ramig, R.F. Constructing and Characterizing Bacteriophage Libraries for Phage Therapy of Human Infections. Front. Microbiol. 2019, 10, 2537.
  59. Hinkley, T.C.; Garing, S.; Clute-Reinig, N.; Spencer, E.; Jain, P.; Alonzo, L.F.; Ny, A.-L.M.L. Genome Sequences of 38 Bacteriophages Infecting Escherichia coli, Isolated from Raw Sewage. Microbiol. Resour. Announc. 2020, 9, e00909-20.
  60. Loose, M.; Sáez Moreno, D.; Mutti, M.; Hitzenhammer, E.; Visram, Z.; Dippel, D.; Schertler, S.; Tišáková, L.P.; Wittmann, J.; Corsini, L.; et al. Natural Bred ε2-Phages Have an Improved Host Range and Virulence against Uropathogenic Escherichia coli over Their Ancestor Phages. Antibiotics 2021, 10, 1337.
  61. Koonjan, S.; Cooper, C.J.; Nilsson, A.S. Complete Genome Sequence of vB_EcoP_SU7, a Podoviridae Coliphage with the Rare C3 Morphotype. Microorganisms 2021, 9, 1576.
  62. Vera-Mansilla, J.; Sánchez, P.; Silva-Valenzuela, C.A.; Molina-Quiroz, R.C. Isolation and Characterization of Novel Lytic Phages Infecting Multidrug-Resistant Escherichia coli. Microbiol. Spectr. 2022, 10, e01678-21.
  63. Garneau, J.R.; Depardieu, F.; Fortier, L.-C.; Bikard, D.; Monot, M. PhageTerm: A tool for fast and accurate determination of phage termini and packaging mechanism using next-generation sequencing data. Sci. Rep. 2017, 7, 8292.
  64. Niazi, M.; Florio, T.J.; Yang, R.; Lokareddy, R.K.; Swanson, N.A.; Gillilan, R.E.; Cingolani, G. Biophysical analysis of Pseudomonas-phage PaP3 small terminase suggests a mechanism for sequence-specific DNA-binding by lateral interdigitation. Nucleic Acids Res. 2020, 48, 11721–11736.
  65. Kala, S.; Cumby, N.; Sadowski, P.D.; Hyder, B.Z.; Kanelis, V.; Davidson, A.R.; Maxwell, K.L. HNH proteins are a widespread component of phage DNA packaging machines. Proc. Natl. Acad. Sci. USA 2014, 111, 6022–6027.
  66. Xu, J.; Hendrix, R.W.; Duda, R.L. Conserved translational frameshift in dsDNA bacteriophage tail assembly genes. Mol. Cell 2004, 16, 11–21.
  67. Condreay, J.P.; Wright, S.E.; Molineux, I.J. Nucleotide sequence and complementation studies of the gene 10 region of bacteriophage T3. J. Mol. Biol. 1989, 207, 555–561.
  68. Dunn, J.J.; Studier, F.W. Complete nucleotide sequence of bacteriophage T7 DNA and the locations of T7 genetic elements. J. Mol. Biol. 1983, 166, 477–535.
  69. Fraser, J.S.; Yu, Z.; Maxwell, K.L.; Davidson, A.R. Ig-like domains on bacteriophages: A tale of promiscuity and deceit. J. Mol. Biol. 2006, 359, 496–507.
  70. Šiborová, M.; Füzik, T.; Procházková, M.; Nováček, J.; Benešík, M.; Nilsson, A.S.; Plevka, P. Tail proteins of phage SU10 reorganize into the nozzle for genome delivery. Nat. Commun. 2022, 13, 5622.
  71. Pavlova, O.; Lavysh, D.; Klimuk, E.; Djordjevic, M.; Ravcheev, D.A.; Gelfand, M.S.; Severinov, K.; Akulenko, N. Temporal regulation of gene expression of the Escherichia coli bacteriophage phiEco32. J. Mol. Biol. 2012, 416, 389–399.
  72. Bailly-Bechet, M.; Vergassola, M.; Rocha, E. Causes for the intriguing presence of tRNAs in phages. Genome Res. 2007, 17, 1486–1495.
  73. Chan, H.T.; Ku, H.; Low, Y.P.; Batinovic, S.; Kabwe, M.; Petrovski, S.; Tucci, J. Characterization of novel lytic bacteriophages of Achromobacter marplantensis isolated from a pneumonia patient. Viruses 2020, 12, 1138.
  74. Ku, H.; Kabwe, M.; Chan, H.T.; Stanton, C.; Petrovski, S.; Batinovic, S.; Tucci, J. Novel Drexlerviridae bacteriophage KMI8 with specific lytic activity against Klebsiella michiganensis and its biofilms. PLoS ONE 2021, 16, e0257102.
Figure 1. Isolation and properties of the Escherichia phage YF01. (a) Plaques (~1 mm) produced by the YF01 phage tested on E. coli K12 on solid medium. (b) Transmission electron microscopy of the YF01 phage. Scale bar, 100 nm. (c) Temperature and (d) pH stability of the YF01 phage.
Figure 2. Kuravirus-like group phages form a tight cluster using a reticulate network. A reticulate network of >19,000 phage genomes was generated with individual phages represented as nodes (circles) and phage intergenomic similarity represented as edges (lines). Escherichia phages are colored in orange. The zoom panel shows the tight clustering of the Kuravirus-like group phages (yellow) using an edge-weight spring embedded layout model.
Figure 3. Whole genome nucleotide similarity of phages within the Kuravirus-like group. The heatmap demonstrates the splitting of the Kuravirus-like group into four genera (shown on right) based on ≥70% intergenomic similarity to cluster into a genus. Bold entries indicate ICTV-classified Kuravirus phages. Raw percentage similarity data are shown in the Supplementary Materials, Table S1.
Figure 4. Phylogenetic placement of the Kuravirus-like group. Whole genome-based phylogeny (nucleotide) was inferred using Genome-BLAST Distance Phylogeny (100 bootstraps) using the formula D0 yielding an average support of 23%. OPTSIL clustering yielded forty-seven species clusters, five genus clusters and two family clusters. Phages classified by the ICTV are noted in bold. Proposed Kuravirus-like group is shown. Scale bar represents the number of nucleotide substitutions per site.
Figure 5. The genome of the Escherichia phage YF01. A modular arrangement of genes is noted, with those involved in early, middle, and late expression marked. Annotated genes with known function are listed below the map, including genes involved in DNA packaging, phage morphogenesis, host lysis, DNA replication and metabolism, and transcriptional regulation. tRNAs are also shown. Unknown function indicates genes that contain a conserved domain but have no assignable function in the phage lifecycle. Hypothetical indicates genes that do not contain any conserved domains. All YF01 phage genome annotations are shown in the Supplementary Materials, Table S2.
Figure 6. The virion morphogenesis and lysis module of the YF01 phage. Module alignment between the YF01 phage and representatives from G1–G4 Kuravirus-like group phages demonstrates high gene synteny and identity within the module. The major capsid protein (gp8) is expressed in two isoforms (natural and extended) via a ribosomal slippage event. Genes encoding the long tail fibers (gp10-11), short tail fiber (gp14), and nozzle protein (gp15) share less sequence identity. Gene color scheme is consistent with that described in Figure 5. Text in red points out relevant differences discussed in the main text. Scale bar represents 5 kb.
Figure 7. Codon usage bias in the YF01 phage. Codon usage was examined in the coding sequences of both the YF01 phage and E. coli K12. Codons whose frequency differed by two-fold or more were plotted, with positive values (in green) representing codons with higher fold-usage and negative values (in red) representing codons with lower fold-usage in the YF01 phage. The YF01 phage encodes a single tRNA (tRNAArg) with a TCT anticodon, consistent with the higher usage of the Arg-AGA (yellow) codon in the YF01 genome.
Figure 8. The core proteome of the Kuravirus-like group phages. Pangenome analysis indicated 63 proteins are shared by all group phages. Representatives from G1–G4 Kuravirus-like group phages are shown alongside the YF01 phage. The core and accessory proteomes of each phage are shown in white and black, respectively, with percentage values indicating the percentage of accessory proteins in each phage. The core proteome of the YF01 phage is further broken down by protein annotation.
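Table 1. Characteristics of Kuravirus-like group phages.

| Name | Host | Size (nt) | GC % | CDS ¹ | tRNA ² | Geography | Source | Year | Accession | Reference |
|---|---|---|---|---|---|---|---|---|---|---|
| YF01 | E. coli | 78,626 | 42.1 | 121 | 1 | Japan | Wastewater | 2021 | OQ025076 | This study |
| phiEco32 | E. coli | 77,554 | 42.3 | 124 | 1 | USA | Water | 2004 | EU330206 | [17] |
| ECBP2 | E. coli | 77,315 | 42.4 | 119 | 1 | South Korea | Unknown | 2012 ³ | JX415536 | [18] |
| NJ01 | E. coli | 77,448 | 42 | 132 | 1 | China | Animal | 2012 ³ | JX867715 | [19] |
| KBNP1711 | E. coli | 76,184 | 42.4 | 124 | 1 | South Korea | Unknown | 2013 ³ | KF981730 | |
| SU10 | E. coli | 77,327 | 42.1 | 124 | 1 | Sweden | Unknown | 2014 ³ | KM044272 | [20] |
| 172-1 | E. coli | 77,266 | 42 | 128 | 1 | China | Animal feces | 2014 ³ | KP308307 | |
| EK010 | E. coli | 78,078 | 42.1 | 120 | 1 | China | Wastewater | 2020 ³ | LC553734 | [50] |
| O18-011 | E. coli | 75,646 | 42.1 | 122 | 1 | China | Unknown | 2020 ³ | LC553735 | [50] |
| LAMP | E. coli | 68,521 | 42.2 | 96 | 1 | Russia | Animal feces | 2017 ³ | MG673519 | |
| EP335 | E. coli | 76,622 | 42.5 | 123 | 1 | Netherlands | Wastewater | 2018 ³ | MG748548 | [51] |
| myPSH2311 | E. coli | 68,712 | 42.3 | 118 | 0 | India | Wastewater | 2018 ³ | MG976803 | [52] |
| myPSH1131 | E. coli | 76,163 | 42.4 | 129 | 0 | India | Water | 2018 ³ | MG983840 | [53] |
| NC-B | E. coli | 76,641 | 42.1 | 116 | 1 | USA | Human feces | 2019 ³ | MK310183 | [54] |
| WFI101126 | E. coli | 77,307 | 42.1 | 130 | 1 | Germany | Sewage | 2015 | MK373770 | [55] |
| Paul | E. coli | 79,429 | 42 | 124 | 1 | USA | Water | 2018 | MN045231 | [56] |
| SGF2 | S. flexneri | 76,964 | 42.3 | 118 | 1 | China | Water | 2019 ³ | MN148435 | [57] |
| ES17 | E. coli | 75,007 | 42.1 | 120 | 1 | USA | Sewage | 2018 | MN508615 | [58] |
| EcoN5 | E. coli | 76,083 | 42.1 | 128 | 1 | Colombia | Unknown | 2019 ³ | MN715356 | |
| PGN6866 | E. coli | 78,549 | 42.3 | 129 | 1 | India | Sewage | 2020 ³ | MT127620 | |
| MN03 | E. coli | 77,187 | 42.2 | 124 | 1 | Bangladesh | Water | 2017 | MT129653 | |
| MN05 | E. coli | 76,899 | 42.2 | 126 | 1 | Bangladesh | Water | 2017 | MT129655 | |
| TH06 | E. coli | 77,678 | 42.1 | 123 | 1 | USA | Wastewater | 2020 ³ | MT446386 | [59] |
| TH34 | E. coli | 77,944 | 42.3 | 120 | 1 | USA | Wastewater | 2020 ³ | MT446407 | [59] |
| TH38 | E. coli | 81,552 | 42.2 | 132 | 1 | USA | Wastewater | 2020 ³ | MT446410 | [59] |
| TH42 | E. coli | 77,284 | 42.3 | 118 | 0 | USA | Wastewater | 2020 ³ | MT446413 | [59] |
| TH43 | E. coli | 77,980 | 42.4 | 118 | 1 | USA | Wastewater | 2020 ³ | MT446414 | [59] |
| DE5 | E. coli | 77,305 | 42.1 | 125 | 1 | China | Unknown | 2021 ³ | MW741821 | |
| 101114BS3 | E. coli | 75,747 | 42 | 113 | 1 | Austria | Wastewater | 2018 | MZ234015 | [60] |
| 101114UKE3 | E. coli | 75,747 | 42 | 113 | 1 | Austria | Wastewater | 2018 | MZ234017 | [60] |
| CHD5UKE1 | E. coli | 77,359 | 42.2 | 123 | 1 | Austria | Wastewater | 2018 | MZ234028 | [60] |
| SU7 | E. coli | 76,626 | 42.1 | 117 | 1 | Sweden | Wastewater | 2016 | MZ342906 | [61] |
| IME267 | E. coli | 76,631 | 42 | 117 | 1 | China | Unknown | 2021 ³ | MZ398243 | |
| IMEP8 | E. coli | 75,809 | 42.1 | 118 | 1 | China | Animal milk | 2021 | MZ648214 | |
| MLP3 | E. coli | 76,234 | 42.1 | 115 | 1 | Chile | Water | 2019 | OK148440 | [62] |
| E20-1 | E. coli | 77,938 | 42.2 | 122 | 1 | China | Wastewater | 2018 | OP293233 | |

¹,² CDS and tRNA numbers are reported from the Prokka annotation pipeline for consistency. ³ Unknown year of isolation; GenBank submission date listed.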

Batinovic, S.; Fujii, Y.; Nittami, T. Expansion of Kuravirus-like Phage Sequences within the Past Decade, including Escherichia Phage YF01 from Japan, Prompt the Creation of Three New Genera. Viruses 2023, 15, 506. https://doi.org/10.3390/v15020506