Article

The Complete Mitochondrial Genome of Northeast Asian Rove Beetle, Lordithon arcuatus (Solsky, 1871) and Performance of Site-Specific Mixture Models in Building the Mitogenomic Phylogeny of Staphylinidae (Insecta: Coleoptera)

1 State Key Laboratory of Black Soils Conservation and Utilization, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun 130102, China
2 University of Chinese Academy of Sciences, Beijing 100049, China
3 Key Laboratory of Animal Physiology, Biochemistry and Molecular Biology of Hebei Province, College of Life Sciences, Hebei Normal University, Shijiazhuang 050024, China
4 Key Laboratory of Vegetation Ecology, Ministry of Education, Northeast Normal University, Changchun 130024, China
5 State Environmental Protection Key Laboratory of Wetland Ecology and Vegetation Restoration, School of Environment, Northeast Normal University, Changchun 130117, China
6 Jilin Provincial Key Laboratory of Animal Resource Conservation and Utilization, Northeast Normal University, Changchun 130117, China
* Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Diversity 2023, 15(5), 588; https://doi.org/10.3390/d15050588
Submission received: 30 November 2022 / Revised: 18 April 2023 / Accepted: 20 April 2023 / Published: 23 April 2023

Simple Summary

This paper describes the first complete mitochondrial genome of an Asian Lordithon species (Coleoptera, Staphylinidae). The mitochondrial genome of Lordithon arcuatus (Solsky, 1871) is 18,290 bp long. Maximum likelihood and Bayesian phylogenetic analyses using 68 staphylinid taxa revealed that the mycetoporine representatives constituted a stable and fully supported clade. In addition, we evaluated the performance of mixture models in constructing the mitochondrial tree of staphylinids, and our findings suggest that the class-unlinked heterotachy models, despite having the lowest AIC or BIC value, may produce deceptive results.

Abstract

Lordithon species are typically mushroom-dwelling rove beetles that prey on maggots. This study presents the mitogenome of a Lordithon arcuatus specimen collected from Changbai Mountain in Jilin Province, China. The mitogenome is 18,290 bp long and comprises 13 protein-coding genes, 22 tRNAs, and 2 rRNAs. The base composition of the mitogenome is as follows: A = 38.80%, T = 37.93%, G = 8.94%, and C = 14.32%. Maximum likelihood and Bayesian phylogenetic trees were constructed using 68 representative staphylinid species, which showed that Lordithon, Bolitobius, and Ischnosoma form a stable and fully supported Mycetoporinae clade, whereas there was no consensus regarding the relationships among Tachyporinae taxa. Additionally, the performance of site-specific mixture models for inferring the phylogeny of staphylinids using mitogenomic data was assessed. The results suggest that heterotachy models should be used with caution, as they may yield an incorrect topology while misleadingly ranking first in AIC- or BIC-based model selection.

1. Introduction

The staphylinid genus Lordithon Thomson has 134 species worldwide, including 73 Palaearctic species [1], which are spread throughout the temperate regions of all continents, excluding Antarctica [2]. Twelve species have been documented in China, the majority of which are found in northern provinces [3]. The mitogenomic data currently available for this genus in GenBank come from a European species, L. exoletus (Erichson, 1839) (accession number: KX087309). Here, we present new mitogenome data for an Asian species, Lordithon arcuatus (Solsky, 1871), which is widespread across Northeast Asia, Siberia, and Russia’s Far East [4]. In China, this species is currently recorded only from Jilin Province [3]. Both Ban et al. [4] and our collection show that L. arcuatus is a mountainous species and can be captured using pitfall traps, flight-interception traps, and sifting from mushrooms. However, it was not present in our collecting funnels for canopy-dwelling insects at the same plots [5]. Its association with mushrooms is not evidence of fungivory. Instead, we observed some individuals eating maggots that were present in mushrooms, as summarized by Campbell [2]. In addition, previous studies with different data and models revealed a number of variations in the topologies of staphylinid phylogeny [6,7,8]. In particular, site heterogeneity of sequences, whereby each alignment site has undergone its own evolutionary process and thus has a specific rate or composition [9,10,11,12,13], has been identified as a significant contributor to estimation errors in beetle phylogenetics [14,15]. Therefore, the influence of datasets and models on phylogenetic inference also needs to be addressed, especially the performance of the mixture models, which were created to handle the site heterogeneity of sequences.

2. Material and Methods

2.1. Sample Collection, DNA Extraction, and Mitogenome Sequencing

The specimen for sequencing was collected in Antu County, Jilin Province, China (42.1075° N, 128.0956° E, 1400 m, July 2019). It was identified according to Li et al. [3] and Ban et al. [4]. Whole-genome DNA was isolated from the forebody using a modified CTAB (pH 8.0)-based DNA extraction protocol described in Zhao et al. [7]. The genomic DNA library was constructed using a NEBNext® Ultra™ DNA Library Prep Kit for Illumina (NEB, Ipswich, MA, USA), following the manufacturer’s recommendations. The DNA library was then sequenced, and about 20 Gb of 150 bp paired-end clean reads were generated on the Illumina NovaSeq 6000 platform (Novogene, Beijing, China); these data are deposited in the GenBank-SRA database (accession number: SRR24124725). The voucher specimen was preserved in 100% ethanol in a −20 °C freezer at Hebei Normal University.

2.2. Mitogenome Assembly, Annotation, and Bioinformatic Analysis

MitoZ v2.4 [16] was used to assemble the mitogenome, annotate protein-coding genes (PCGs) and ribosomal RNAs (rRNAs), and draw a map of the mitochondrial genome. The annotation of tRNAs and the prediction of their secondary structures were carried out on the MITOS Web Server (http://mitos2.bioinf.uni-leipzig.de/index.py), accessed on 14 October 2021 [17]. Geneious Prime v.2021.2.2 [18] was subsequently used to confirm the PCG boundaries by identifying the open reading frame (ORF) and calculating the sequence similarity with related species.
The base composition, amino acid usage, and relative synonymous codon usage (RSCU) were calculated using a Python script that relies on the CAI module [19]. From the base composition, the skews were calculated using the following formulas: AT skew = [A − T]/[A + T] and GC skew = [G − C]/[G + C] [20]. The complete mitogenome sequence of L. arcuatus with full annotation has been deposited in GenBank (accession number: OK501152).
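The skew formulas can be illustrated with a minimal Python sketch. This is not the script used in the study; the function names `base_skews` and `skew_from_percent` are hypothetical, and only unambiguous bases are assumed to be counted.

```python
from collections import Counter


def base_skews(seq):
    """Return base fractions, AT content, AT skew and GC skew of a DNA sequence.

    AT skew = (A - T) / (A + T); GC skew = (G - C) / (G + C).
    Only unambiguous bases (A, C, G, T) are counted (an assumption of this sketch).
    """
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    a, c, g, t = (counts[b] for b in "ACGT")
    return {
        "fractions": {b: counts[b] / total for b in "ACGT"},
        "at_content": (a + t) / total,
        "at_skew": (a - t) / (a + t) if a + t else 0.0,
        "gc_skew": (g - c) / (g + c) if g + c else 0.0,
    }


def skew_from_percent(x, y):
    """Skew of two base percentages, e.g. A% and T%, or G% and C%."""
    return (x - y) / (x + y)


# Whole-mitogenome percentages reported for L. arcuatus.
at_skew = skew_from_percent(38.80, 37.93)
gc_skew = skew_from_percent(8.94, 14.32)
print(round(at_skew, 2), round(gc_skew, 2))  # prints: 0.01 -0.23
```

Applied to the reported base percentages, the formulas give an AT skew of 0.01 and a GC skew of −0.23 for the entire sequence.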

2.3. Phylogenetic Analysis

In total, 68 representative species were sampled for phylogenetic analysis (Supplementary Material Table S1), comprising 4 from Mycetoporinae, 3 from Tachyporinae, 60 from various other rove beetle subfamilies, and 1 from Leiodidae, Sciodrepoides watsoni, as an outgroup. The mitochondrial sequences were retrieved using PhyloSuite v1.2.2 [21] and individually aligned with MAFFT v7.313 [22] using the iterative refinement method of E-INS-i for nucleotide alignment and L-INS-i for ribosomal RNA alignment. Gene alignments were concatenated into a matrix using PhyloSuite v1.2.2.
Seven datasets (matrices) were used in the phylogenetic analyses: (1) AA: the amino acid sequence of 13 PCGs (3524 sites, 13 partitions); (2) P2: the 2nd position of PCG nucleotide sequences (3405 sites, 13 partitions); (3) P12: the united 1st and 2nd positions of PCGs (6372 sites, 13 partitions); (4) P123: the united 1st, 2nd, and 3rd positions of PCGs (8783 sites, 13 partitions); (5) P2R: the 2nd position of PCGs, concatenated with 2 rRNAs (4852 sites, 15 partitions); (6) P12R: the united 1st and 2nd positions of PCGs, concatenated with 2 rRNAs (7819 sites, 15 partitions); (7) P123R: the united 1st, 2nd, and 3rd positions of PCGs, concatenated with 2 rRNAs (10,230 sites, 15 partitions). For each dataset, ambiguous sites and poorly aligned positions were eliminated separately using BMGE v1.12 (m = DNAPAM100:2 for nucleotide sequences, m = BLOSUM90 for amino acid sequences, h = 0.4 for all) [23]. Three partition schemes were applied to all seven datasets: (1) non-partitioned (NP, all loci were analyzed as a single partition); (2) fully partitioned (FP, each locus was a distinct partition with its own substitution matrix); (3) merge-partitioned (MP). ModelFinder [24] was used to identify the partitioning schemes and select the substitution models based on the Bayesian Information Criterion (BIC). In addition, two types of four-class heterotachy models [13], class-linked (+H4) and class-unlinked (*H4), were applied with the selected substitution model to the NP scheme of all datasets; two empirical profile mixture models [20], mtART+F+R7+C60 and C60+FO+R4 (alias for POISSON+G+FMIX), were applied to the AA dataset (Table 1). IQ-TREE v2.1.2 [25] was used to build the maximum likelihood (ML) trees.
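How the P2, P12, and P123 matrices relate to one another can be sketched in a few lines of Python. This is an illustration only, not the PhyloSuite and BMGE workflow used in the study: the function names and the toy alignment are hypothetical, and the input is assumed to be an in-frame, codon-aligned coding sequence for each taxon.

```python
def codon_positions(aligned_cds, positions):
    """Keep only the requested codon positions (1, 2, or 3) of an in-frame,
    codon-aligned coding sequence. Gap characters are retained so that all
    taxa stay in register."""
    if len(aligned_cds) % 3 != 0:
        raise ValueError("alignment length is not a multiple of 3")
    wanted = sorted({p - 1 for p in positions})
    if not wanted or any(i not in (0, 1, 2) for i in wanted):
        raise ValueError("positions must be drawn from 1, 2, 3")
    return "".join(
        aligned_cds[start + i]
        for start in range(0, len(aligned_cds), 3)
        for i in wanted
    )


def build_matrices(alignment):
    """Derive P2, P12 and P123 matrices from {taxon: codon-aligned sequence}."""
    schemes = {"P2": (2,), "P12": (1, 2), "P123": (1, 2, 3)}
    return {
        name: {taxon: codon_positions(seq, pos) for taxon, seq in alignment.items()}
        for name, pos in schemes.items()
    }


# Toy alignment (invented sequences, three codons each).
toy = {"taxon_A": "ATGTTA---", "taxon_B": "ATGCTAGGA"}
matrices = build_matrices(toy)
print(matrices["P2"]["taxon_B"], matrices["P12"]["taxon_A"])
```

The rRNA-containing matrices (P2R, P12R, P123R) would then be obtained by appending the rRNA alignments to the corresponding PCG matrix.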
The best tree for each combination of dataset, partitioning scheme, and model was obtained automatically from 4 independent runs, and the nodal support values (BS) were calculated using 1000 replicates of the Ultrafast Bootstrap [26]. Nodes with a BS value of 100 were considered fully supported, 95–99 strongly supported, 90–94 moderately supported, and <90 unsupported. The partition files and data matrices in Phylip format are available in Supplementary Materials.
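The support categories used throughout the Results can be written as a simple lookup. The function name `support_category` is hypothetical; the thresholds are those stated above.

```python
def support_category(bs):
    """Map an ultrafast bootstrap value (0-100) to the support category
    used in this study."""
    if not 0 <= bs <= 100:
        raise ValueError("bootstrap support must lie between 0 and 100")
    if bs == 100:
        return "fully supported"
    if bs >= 95:
        return "strongly supported"
    if bs >= 90:
        return "moderately supported"
    return "unsupported"


for value in (100, 95, 90, 88):
    print(value, support_category(value))
```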
As an alternative method, Bayesian inference (BI) under the CAT-GTR+G4 model was applied to suppress potential artifacts due to, for example, compositional heterogeneity across sites [27,28]. Three MCMC sampling chains, each with 8000 generations, were implemented in PhyloBayes MPI 1.9 [29]. After discarding the initial 2000 generations as burn-in, 2 chains attained convergence (maxdiff < 0.3), with an acceptable effective sample size (minimum effsize > 50), as suggested in the PhyloBayes tutorial. The programs “bpcomp” and “tracecomp” in the PhyloBayes MPI package were used to construct the consensus tree and diagnose the chain convergence. The nodal support was quantified using marginal posterior probability (PP).
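The convergence criteria quoted above can be expressed as a small check. This is a sketch under stated assumptions: the function name and its arguments are hypothetical, and the maxdiff and minimum effective size are assumed to be read by hand from the bpcomp and tracecomp reports rather than parsed from any file.

```python
def chains_converged(maxdiff, min_effsize, maxdiff_cutoff=0.3, effsize_cutoff=50):
    """Return True when a pair of chains meets the thresholds used here:
    maxdiff below 0.3 and a minimum effective sample size above 50."""
    return maxdiff < maxdiff_cutoff and min_effsize > effsize_cutoff


# Invented diagnostic values for illustration only.
print(chains_converged(maxdiff=0.12, min_effsize=210))  # acceptable run
print(chains_converged(maxdiff=0.45, min_effsize=210))  # maxdiff too large
```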

3. Results and Discussion

3.1. Genome Organization and Base Composition

The complete mitogenome of L. arcuatus (18,290 bp) (Figure 1) contains 13 PCGs (COX1–3, ND1–6, ND4L, CYTB, ATP6, and ATP8), 22 tRNA genes, and 2 rRNA genes (16S rRNA, or l-rRNA; and 12S rRNA, or s-rRNA). Neither MITOS nor Geneious, however, managed to recognize the control region. Notably, 23 of these 37 genes are located on the forward strand, while the rest are on the reverse strand (Supplementary Material Table S2). The longest gap (81 bp) was found between tRNA-Arg and tRNA-Asn. The AT content of the whole mitochondrial sequence is 76.72%, and the proportions of the four bases are as follows: A, 38.80%; T, 37.93%; C, 14.32%; and G, 8.94%. The AT and GC skews of the entire sequence are 0.01 and −0.23, respectively; it tends to be negative in PCGs but positive in tRNA and rRNA genes (Table 2).

3.2. Protein-Coding Genes and Codon Usage

The total length of the PCGs is 11,161 bp, which accounts for 61.02% of the mitogenome. The longest PCG is nad5 (ND5), while the shortest is ATP8. Four of the PCGs were identified on the reverse strand (nad1, nad4, nad4L, and nad5) (Supplementary Material Table S2). Three PCGs (cox2, nad4, and nad5) end with an incomplete stop codon (T/TA). The AT and GC skews of PCGs are shown in Supplementary Material Table S1. Among the codon positions, the second codon position exhibits the lowest AT skew, indicating that the proportion of T is greater than that of A.
The relative synonymous codon usage (RSCU) of the L. arcuatus mitogenome is shown in Table 3 and Figure 2. The third codon position contains 86.10% AT, which is significantly higher than the first and second codon positions. The codons and amino acids TTA (4.05, Leu), TCT (2.443, Ser), CGA (2.327, Arg), AGA (2.156, Ser), and GGA (2.062, Gly) correspond to the top five RSCU values.
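For readers unfamiliar with RSCU, the standard definition is the observed count of a codon divided by the count expected if all synonymous codons of that amino acid were used equally. The sketch below illustrates this definition; it is not the script used in the study, the function name `rscu` is hypothetical, and the glycine counts are invented for illustration rather than taken from Table 3.

```python
def rscu(codon_counts, synonymous_families):
    """Relative synonymous codon usage.

    codon_counts: {codon: observed count}
    synonymous_families: {amino acid: [codons encoding it]} under the
        genetic code in use (supplied by the caller in this sketch).
    RSCU = observed count / (family total / number of synonymous codons).
    """
    values = {}
    for amino_acid, codons in synonymous_families.items():
        family_total = sum(codon_counts.get(codon, 0) for codon in codons)
        if family_total == 0:
            continue  # amino acid absent from the data
        expected = family_total / len(codons)
        for codon in codons:
            values[codon] = codon_counts.get(codon, 0) / expected
    return values


# Invented counts for the four glycine codons.
families = {"Gly": ["GGA", "GGC", "GGG", "GGT"]}
counts = {"GGA": 100, "GGC": 10, "GGG": 30, "GGT": 60}
print(rscu(counts, families))
```

Within a family of n synonymous codons the RSCU values sum to n, so a value above 1 indicates a codon used more often than expected under equal usage.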

3.3. Ribosomal RNAs and Transfer RNAs

The two rRNA genes of L. arcuatus are located on the reverse strand, with lengths of 795 bp (12S rRNA) and 1270 bp (16S rRNA). Their AT contents are 76.86% (12S rRNA) and 82.05% (16S rRNA), and their AT skews are 0.04 (12S rRNA) and −0.03 (16S rRNA) (Table 2). The lengths of the 22 tRNA genes range from 62 to 70 bp, with an AT content of 77.87%. Most tRNAs have a characteristic clover-leaf secondary structure, except for tRNA-Ser1 (Figure 3). Fourteen of them lie on the forward strand, while the rest are on the reverse strand.

3.4. Phylogenetic Results

Thirty-seven ML trees and one BI tree were constructed (see the trees in Supplementary Materials). For each dataset, trees were built under homogeneous models with the three partition schemes and under mixture models with the NP scheme; among these, the tree with the lowest BIC value (extracted from IQ-TREE output files) was selected as the final resulting ML tree (Table 1). The tree under the NP-schemed mtART+C60 profile mixture model was chosen as the final tree for the amino acid dataset, a choice also reported in previous mitogenome-based phylogenetic research (e.g., Zhao et al. [6]). In contrast, the trees under the heterotachy models (+H4 and *H4), which account for rate variation among sites and lineages, had worse BIC scores. The final tree for the P2 and P2R datasets was built using a four-class partitioning scheme with separate models (different substitution matrices among partitions). For the remaining four datasets (P12, P12R, P123, and P123R), trees under the four-class-unlinked GTR models (GTR+F+R*H4) were ranked highest based on their BIC values. Apart from the profile mixture and heterotachy models, the partitioned models (the MP scheme with partition-specific substitution models) performed the best when reconstructing trees using mitogenomic data.
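The ranking criterion can be made concrete with the standard definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, where k is the number of free parameters and n the number of alignment sites. The sketch below is illustrative only: the function names are hypothetical, and the log-likelihoods and parameter counts are invented rather than taken from Table 1.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_sites):
    """Bayesian Information Criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_sites) - 2 * log_likelihood


def rank_by_bic(candidates, n_sites):
    """Sort {model name: (log-likelihood, number of parameters)} by ascending BIC."""
    scored = [(name, bic(lnl, k, n_sites)) for name, (lnl, k) in candidates.items()]
    return sorted(scored, key=lambda item: item[1])


# Invented values: a parameter-rich model with a higher likelihood
# can still obtain the lowest BIC despite its larger penalty term.
candidates = {
    "partitioned": (-100000.0, 150),
    "unlinked_heterotachy": (-97000.0, 600),
}
for name, score in rank_by_bic(candidates, n_sites=6372):
    print(name, round(score, 1))
```

A lower score only indicates a better penalized fit to the alignment; it does not by itself guarantee that the resulting topology is correct.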
Different datasets yielded various topologies, but few of them were strongly supported (Figure 4, Figure 5 and Figure 6). Additionally, a clear “regression” is that the supra-subfamily-level relationships shown in the present resulting trees were completely uncertain and radically different from those in corresponding trees built from the same data type and partition scheme in our previous study [6], in which datasets with similar but slightly larger taxon sampling were used. Since our workflow had not changed, we hypothesize that this is largely attributable to the change in outgroups.
Specifically, the diverse phylogenetic trees agreed that L. arcuatus and three additional mycetoporine species form a stable and fully supported clade (BS = 100, PP = 1): Bolitobius castaneus + (Ischnosoma splendidum + (L. arcuatus + L. exoletus)) (Figure 4, Figure 5, Figure 6 and Figure 7). This node is fully supported in most of the final resulting trees, with the exception of the GTR+F+R7*H4 tree of the P123 dataset, where its support was only BS = 88 (Figure 6D). However, in line with [30], Tachyporinae, which was believed to be closely related to Mycetoporinae, appeared to be a polyphyletic group in the results from the AA, P2, P12, and P2R datasets, or was recovered as a clade with weaker branch support in the other datasets (P123 and P12R datasets with BS < 90 in the partitioned model; P123R dataset with BS = 95 in the partitioned model and BS = 90 in the class-linked heterotachy model) (Figure 4, Figure 5, Figure 6 and Figure 7). As indicated in previous studies [7,8], the sequence of Tachinus subterraneus (KX087351) herein remains a “tricky outlier” whose position varied among trees and which was isolated from its taxonomic affiliation in most trees. Intriguingly, it is included in the Tachyporinae clade with very strong support (PP = 0.99) in our BI tree under the CAT-GTR+G4 model, which appears to indicate a long-branch attraction (LBA) artifact caused by site-compositional heterogeneity. However, the tree shown in Figure 1 of the study by Song et al. [8] was also constructed using mitogenomic AA data and a site-heterogeneous CAT model. Hence, further investigation is required to determine whether it is caused by the site-compositional heterogeneity or simply the identity of the sequence. A more practical alternative is to use new data from the same species or genus, as Tachinus is a typical Tachyporinae genus based on both morphological characteristics and independent molecular phylogenetic studies [7,30].
Although Yamamoto [31] taxonomically divided the former “Tachyporinae” (including Mycetoporinae) based on morphological phylogeny, the accurate position and relationship of the associated taxa remain unresolved. Phylogenetics with more precisely identified data and a larger sample size is required for convincing conclusions.
Staphylininae is well defined as a monophyletic group based on its pupal and larval characteristics [32], although different studies with different data have supported this to varying degrees [6,7,28,29,30,31,32], including our previous results using amino acid sequences of the complete mitogenome, which fully supported its monophyly [6]. In the present study, however, we found that heterotachy models, particularly the class-unlinked models, tend to aggravate the artifact of staphylinine separation, i.e., the staphylinine clade is interrupted by other taxa or split into distantly separated groups in five of the seven datasets (Figure 6), the two exceptions being the trees built under the GTR+F+R5*H4 model using the P12 (Figure 6C) or P12R matrix (Figure 6F). In contrast, only one or two cases could be found under partitioned models or class-linked heterotachy models (Figure 4 and Figure 5). In addition, the P2 dataset appears to be more susceptible to this artifact. As the heterotachy models were used with only four classes and only applied to NP-schemed datasets, in which the model parameters were universally shared across the dataset, we hypothesize that this issue may be alleviated by accounting for more heterogeneity in the model, e.g., a higher number of classes, partitioning schemes, and branch-length models across loci or partitions [33].
The position of Neophoninae is another issue worthy of consideration. This monotypic subfamily has been confirmed to be the sister group of Dasycerinae and Pselaphinae with either complete or very strong statistical support [6,7,30]. Our results generated using partitioned or profile mixture models (including the CAT-GTR+G4 model using the Bayesian method) perfectly recovered its sister-group relationship with Pselaphinae (Figure 4 and Figure 7), whereas those generated with heterotachy models using nucleotide datasets indicated an isolated Neophoninae branch against all the other subfamilies (Figure 5 and Figure 6). That being said, the majority of our results endorse a very basal position for Neophoninae, which is consistent with our previous study using mitogenomic data [6] but contradicts the multi-gene phylogenies (with nuclear genes), in which Neophoninae appeared to be a rather derived lineage in staphylinid trees [7,30].

4. Conclusions

Our research investigated the mitogenome of a Lordithon arcuatus specimen acquired from Northeast China and its phylogenetic relationship with three mycetoporine species. ML and BI analyses provide strong support for the conclusion that they form a monophyletic group. Comparing different models for amino acid and nucleotide mitogenomic data, we found that the profile mixture model plus the empirical matrix, which performs approximately as well as the site-heterogeneous CAT-GTR+G4 model, proves the most appropriate for amino acid data, whereas partitioned models provide a better and more reliable fit than heterotachy models for the nucleotide data of staphylinid mitogenomes, despite the fact that heterotachy models are more likely to be superior in AIC or BIC rankings.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/d15050588/s1, Table S1: Sequences used in phylogenetic analysis, including taxonomic information, GenBank accession number, length, and AT content; Datasets: The partition files and data matrices in Phylip format; Trees: All maximum likelihood (ML) and Bayesian inference (BI) trees; Table S2: Mitogenomic organization of L. arcuatus. A positive value in the “Gap or Overlap” column indicates the length of the gap between the current gene and the next; a negative value implies overlap.

Author Contributions

Conceptualization, T.-Y.Z. and D.-H.W.; methodology, Y.-N.S. and T.-Y.Z.; formal analysis, Y.-N.S.; investigation, L.L.; fieldwork and resources, Q.-Q.J.; data curation, L.L.; writing—original draft preparation, Y.-N.S.; writing—review and editing, Q.-Q.J. and L.L.; visualization, Y.-N.S.; supervision, D.-H.W. and L.L.; project administration, T.-Y.Z.; funding acquisition, D.-H.W. and L.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Science and Technology Fundamental Resources Investigation Program of China, grant number 2018FY100300; the Hebei Normal University Start-up Funds, grant number L2018B13; and the China Scholarship Council, grant number 202104910331.

Data Availability Statement

The data presented in this study are available on request from the corresponding authors; sequencing data have been deposited in the GenBank database.

Acknowledgments

We thank Changbai Mountain National Reserve, China, for providing permission for field collection. Additionally, we are deeply grateful to the editor and the five anonymous reviewers for their thoughtful comments and valuable feedback, which have significantly improved the quality of our manuscript and enriched our work.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Schülke, M.; Smetana, A. Staphylinidae. In Hydrophiloidea—Staphylinoidea; Löbl, I., Löbl, D., Eds.; Revised and Updated; Brill: Leiden, The Netherlands, 2015; pp. 304–1134.
  2. Campbell, J.M. A Revision of the Genus Lordithon Thomson of North and Central America (Coleoptera: Staphylinidae). Memoirs Entomol. Soc. Can. 1982, 114, 5–116.
  3. Li, L.; Hu, J.Y.; Peng, Z.; Tang, L.; Yin, Z.W.; Zhao, M.J. Catalogue of Chinese Coleoptera Volume 3—Staphylinidae; Science Press: Beijing, China, 2019.
  4. Ban, Y.-G.; Jeong, W.-J.; Ahn, K.-J. Taxonomy of Korean Lordithon Thomson (Coleoptera: Staphylinidae: Tachyporinae). J. Asia-Pac. Biodivers. 2019, 12, 545–557.
  5. Zheng, G.; Li, S.; Yang, X. Spider Diversity in Canopies of Xishuangbanna Rainforest (China) Indicates an Alarming Juggernaut Effect of Rubber Plantations. For. Ecol. Manag. 2015, 338, 200–207.
  6. Zhao, T.-Y.; He, L.; Xu, X.; Chen, Z.-N.; Gao, Y.-Y.; Lü, L. The First Mitochondrial Genome of Creophilus Leach and Platydracus Thomson (Coleoptera: Staphylinidae: Staphylinini) and Phylogenetic Implications. Zootaxa 2022, 5099, 179–200.
  7. Zhao, T.-Y.; Zhang, C.-J.; Lü, L. Comparative Description of the Mitochondrial Genome of Scaphidium formosanum Pic, 1915 (Coleoptera: Staphylinidae: Scaphidiinae). Zootaxa 2021, 4941, 487–510.
  8. Song, N.; Zhai, Q.; Zhang, Y. Higher-Level Phylogenetic Relationships of Rove Beetles (Coleoptera, Staphylinidae) Inferred from Mitochondrial Genome Sequences. Mitochondrial DNA Part A DNA Mapp. Seq. Anal. 2021, 32, 98–105.
  9. Yang, Z. Maximum Likelihood Phylogenetic Estimation from DNA Sequences with Variable Rates over Sites: Approximate Methods. J. Mol. Evol. 1994, 39, 306–314.
  10. Lopez, P.; Casane, D.; Philippe, H. Heterotachy, an Important Process of Protein Evolution. Mol. Biol. Evol. 2002, 19, 1–7.
  11. Quang, L.S.; Gascuel, O.; Lartillot, N. Empirical Profile Mixture Models for Phylogenetic Reconstruction. Bioinformatics 2008, 24, 2317–2323.
  12. Wang, H.-C.; Minh, B.Q.; Susko, E.; Roger, A.J. Modeling Site Heterogeneity with Posterior Mean Site Frequency Profiles Accelerates Accurate Phylogenomic Estimation. Syst. Biol. 2018, 67, 216–235.
  13. Crotty, S.; Minh, B.Q.; Bean, N.G.; Holland, B.R.; Tuke, J.; Jermiin, L.S.; Von Haeseler, A. GHOST: Recovering Historical Signal from Heterotachously Evolved Sequence Alignments. Syst. Biol. 2020, 69, 249–264.
  14. Vasilikopoulos, A.; Balke, M.; Kukowka, S.; Pflug, J.M.; Martin, S.; Meusemann, K.; Hendrich, L.; Mayer, C.; Maddison, D.R.; Niehuis, O.; et al. Phylogenomic Analyses Clarify the Pattern of Evolution of Adephaga (Coleoptera) and Highlight Phylogenetic Artefacts Due to Model Misspecification and Excessive Data Trimming. Syst. Entomol. 2021, 46, 991–1018.
  15. Cai, C.; Tihelka, E.; Giacomelli, M.; Lawrence, J.F.; Ślipiński, A.; Kundrata, R.; Yamamoto, S.; Thayer, M.K.; Newton, A.F.; Leschen, R.A.B.; et al. Integrated phylogenomics and fossil data illuminate the evolution of beetles. R. Soc. Open Sci. 2022, 9, 211771.
  16. Meng, G.; Li, Y.; Yang, C.; Liu, S. MitoZ: A Toolkit for Animal Mitochondrial Genome Assembly, Annotation and Visualization. Nucleic Acids Res. 2019, 47, e63.
  17. Bernt, M.; Donath, A.; Jühling, F.; Externbrink, F.; Florentz, C.; Fritzsch, G.; Pütz, J.; Middendorf, M.; Stadler, P.F. MITOS: Improved De Novo Metazoan Mitochondrial Genome Annotation. Mol. Phylogenet. Evol. 2013, 69, 313–319.
  18. Kearse, M.; Moir, R.; Wilson, A.; Stones-Havas, S.; Cheung, M.; Sturrock, S.; Buxton, S.; Cooper, A.; Markowitz, S.; Duran, C.; et al. Geneious Basic: An Integrated and Extendable Desktop Software Platform for the Organization and Analysis of Sequence Data. Bioinformatics 2012, 28, 1647–1649.
  19. Lee, B.D. Python Implementation of Codon Adaptation Index. J. Open Source Softw. 2018, 3, 905.
  20. Perna, N.; Kocher, T. Patterns of Nucleotide Composition at Fourfold Degenerate Sites of Animal Mitochondrial Genomes. J. Mol. Evol. 1995, 41, 353–358.
  21. Zhang, D.; Gao, F.; Jakovlić, I.; Zhou, H.; Zhang, J.; Li, W.X.; Wang, G.T. PhyloSuite: An Integrated and Scalable Desktop Platform for Streamlined Molecular Sequence Data Management and Evolutionary Phylogenetics Studies. Mol. Ecol. Resour. 2020, 20, 348–355.
  22. Katoh, K.; Standley, D.M. MAFFT Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability. Mol. Biol. Evol. 2013, 30, 772–780.
  23. Criscuolo, A.; Gribaldo, S. BMGE (Block Mapping and Gathering with Entropy): A New Software for Selection of Phylogenetic Informative Regions from Multiple Sequence Alignments. BMC Evol. Biol. 2010, 10, 210.
  24. Kalyaanamoorthy, S.; Minh, B.Q.; Wong, T.K.F.; Von Haeseler, A.; Jermiin, L.S. ModelFinder: Fast Model Selection for Accurate Phylogenetic Estimates. Nat. Methods 2017, 14, 587–589.
  25. Minh, B.Q.; Schmidt, H.A.; Chernomor, O.; Schrempf, D.; Woodhams, M.D.; von Haeseler, A.; Lanfear, R.; Teeling, E. IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era. Mol. Biol. Evol. 2020, 37, 1530–1534.
  26. Hoang, D.T.; Chernomor, O.; Von Haeseler, A.; Minh, B.Q.; Vinh, L.S. UFBoot2: Improving the Ultrafast Bootstrap Approximation. Mol. Biol. Evol. 2018, 35, 518–522.
  27. Lartillot, N.; Philippe, H. A Bayesian Mixture Model for Across-Site Heterogeneities in the Amino-Acid Replacement Process. Mol. Biol. Evol. 2004, 21, 1095–1109.
  28. Lartillot, N.; Brinkmann, H.; Philippe, H. Suppression of Long-Branch Attraction Artefacts in the Animal Phylogeny Using a Site-Heterogeneous Model. BMC Evol. Biol. 2007, 7, S4.
  29. Lartillot, N.; Rodrigue, N.; Stubbs, D.; Richer, J. PhyloBayes MPI: Phylogenetic Reconstruction with Infinite Mixtures of Profiles in a Parallel Environment. Syst. Biol. 2013, 62, 611–615.
  30. Yamamoto, S. Tachyporinae Revisited: Phylogeny, Evolution, and Higher Classification Based on Morphology, with Recognition of a New Rove Beetle Subfamily (Coleoptera: Staphylinidae). Biology 2021, 10, 323.
  31. Mckenna, D.D.; Farrell, B.D.; Caterino, M.S.; Farnum, C.W.; Hawks, D.C.; Maddison, D.R.; Seago, A.E.; Short, A.E.Z.; Newton, A.F.; Thayer, M.K. Phylogeny and Evolution of Staphyliniformia and Scarabaeiformia: Forest Litter as a Stepping Stone for Diversification of Nonphytophagous Beetles. Syst. Entomol. 2015, 40, 35–60.
  32. Lü, L.; Cai, C.-Y.; Zhang, X.; Newton, A.F.; Thayer, M.K.; Zhou, H.-Z. Linking Evolutionary Mode to Palaeoclimate Change Reveals Rapid Radiations of Staphylinoid Beetles in Low-Energy Conditions. Curr. Zool. 2021, 66, 435–444.
  33. Duchêne, D.; Tong, K.J.; Foster, C.S.P.; Duchêne, S.; Lanfear, R.; Ho, S.Y.W. Linking Branch Lengths across Sets of Loci Provides the Highest Statistical Support for Phylogenetic Inference. Mol. Biol. Evol. 2020, 37, 1202–1210.
Figure 1. Gene map of the mitochondrial genome of L. arcuatus. The blue bar diagram of the inner circle shows the GC content of every 50 sites; the red circle represents 50% GC content, and each black concentric circle represents a 5% increment. The outer circle shows the distribution of the genes; the genes inside are arranged clockwise (forward strand).
Figure 2. Relative synonymous codon usage (RSCU) of L. arcuatus. The codon families are listed in alphabetical order beneath the horizontal axis. The asterisk represents stop codons.
Figure 3. Secondary structure of tRNAs in the mitogenome of L. arcuatus. Red bases on the anticodon arm are anticodon. The number after the abbreviation of amino acid represents tRNAs with different anticodons.
Figure 4. Maximum likelihood trees built under the best model selected according to BIC rankings: (A) AA matrix with mtART+C60 profile mixture model; (B) AA matrix with C60 profile mixture model; (C) P2 matrix with MP scheme; (D) P12 matrix with MP scheme; (E) P123 matrix with MP scheme; (F) P2R matrix with MP scheme; (G) P12R matrix with MP scheme; (H) P123R matrix with MP scheme. The color of the circle at each node represents the bootstrap value (those with BS < 90 are not shown here). Branch length scales are located at the lower left of the trees. The asterisk indicates the species whose mitogenome was sequenced for this study. The “tricky outlier” Tachinus subterraneus is indicated by a black frame.
Figure 5. Maximum likelihood trees built with class-linked heterotachy models: (A) AA matrix with mtART+F+H4 model; (B) P2 matrix with TVM+F+R4+H4 model; (C) P12 matrix with GTR+F+R5+H4 model; (D) P123 matrix with GTR+F+R7+H4 model; (E) P2R matrix with GTR+F+R5+H4 model; (F) P12R matrix with GTR+F+R5+H4 model; (G) P123R matrix with GTR+F+R7+H4 model. Coloring, annotation, and legends are the same as in Figure 4. The “tricky outlier” Tachinus subterraneus is indicated by a black frame.
Figure 6. Maximum likelihood trees built with class-unlinked heterotachy models: (A) AA matrix with mtART+F*H4 model; (B) P2 matrix with TVM+F+R4*H4 model; (C) P12 matrix with GTR+F+R5*H4 model; (D) P123 matrix with GTR+F+R7*H4 model; (E) P2R matrix with GTR+F+R5*H4 model; (F) P12R matrix with GTR+F+R5*H4 model; (G) P123R matrix with GTR+F+R7*H4 model. Coloring, annotation, and legends are the same as in Figure 4. The “tricky outlier” Tachinus subterraneus is indicated by a black frame.
Figure 7. Bayesian consensus tree under the CAT-GTR+G4 model for non-partitioned AA dataset, without clade collapse. Nodes with PP < 0.97 are not shown. Coloring, annotation, and legends are the same as in Figure 4. The tip of Tachinus subterraneus is indicated by a black frame.
Table 1. Information on the ML trees, including different datasets, number of sites, partitioning schemes, substitution model, log-likelihood, AIC (Akaike Information Criterion), and BIC (Bayesian Information Criterion). FP: edge-unlinked full partitioning scheme; MP: merged and edge-unlinked partitioning scheme; NP: non-partitioning scheme (treating the entire sequence as a single locus); t: the number of partitions.
| Matrix (Sites) | Partition Scheme (t) | Model | ln(Lik) | AIC | BIC |
|---|---|---|---|---|---|
| AA (3524) | NP(1) | mtART+F+R6+C60 | −123,042.14 | 246,528.29 | 247,897.44 |
| | MP(3) | - | −125,256.50 | 250,952.99 | 252,309.81 |
| | NP(1) | C60+FO+R4 | −125,549.07 | 251,534.13 | 252,878.61 |
| | NP(1) | mtART+F+R6 | −125,798.97 | 251,921.94 | 252,921.05 |
| | FP(13) | - | −125,464.80 | 251,525.59 | 253,363.46 |
| | NP(1) | mtART+F+R6*H4 | −125,594.16 | 252,296.33 | 255,713.04 |
| | NP(1) | mtART+F+R6+H4 | −125,594.17 | 252,296.34 | 255,713.05 |
| P2 (3405) | MP(4) | - | −38,619.95 | 77,611.90 | 78,752.64 |
| | NP(1) | TVM+F+R4 | −39,019.88 | 78,331.76 | 79,227.18 |
| | FP(13) | - | −38,491.25 | 77,536.50 | 79,235.34 |
| | NP(1) | TVM+F+R4*H4 | −37,929.36 | 76,966.72 | 80,364.41 |
| | NP(1) | TVM+F+R4+H4 | −38,646.27 | 78,376.53 | 81,700.62 |
| P2_rRNA (4852) | MP(4) | - | −62,520.48 | 125,426.95 | 126,678.97 |
| | FP(15) | - | −62,357.30 | 125,322.60 | 127,294.70 |
| | NP(1) | GTR+F+R5 | −63,912.06 | 128,122.13 | 129,088.71 |
| | NP(1) | GTR+F+R5*H4 | −62,228.67 | 125,573.34 | 129,193.17 |
| | NP(1) | GTR+F+R5+H4 | −63,529.66 | 128,145.33 | 131,667.85 |
| P12 (6372) | NP(1) | GTR+F+R5*H4 | −98,269.24 | 197,654.49 | 201,426.38 |
| | MP(5) | - | −100,360.79 | 201,131.57 | 202,517.30 |
| | FP(13) | - | −100,335.13 | 201,218.25 | 203,070.40 |
| | NP(1) | GTR+F+R5 | −101,659.33 | 203,616.66 | 204,623.85 |
| | NP(1) | GTR+F+R5+H4 | −101,062.88 | 203,211.75 | 206,882.25 |
| P12_rRNA (7819) | NP(1) | GTR+F+R5*H4 | −122,181.55 | 245,479.10 | 249,365.19 |
| | MP(6) | - | −124,234.00 | 248,903.99 | 250,422.21 |
| | FP(15) | - | −124,177.93 | 248,963.86 | 251,081.01 |
| | NP(1) | GTR+F+R5 | −126,121.29 | 252,540.58 | 253,578.26 |
| | NP(1) | GTR+F+R5+H4 | −125,512.24 | 252,110.49 | 255,892.11 |
| P123 (8783) | NP(1) | GTR+F+R7*H4 | −211,109.53 | 423,335.05 | 427,286.01 |
| | MP(4) | - | −215,623.52 | 431,663.04 | 433,135.80 |
| | FP(13) | - | −215,506.09 | 431,692.18 | 434,099.57 |
| | NP(1) | GTR+F+R7 | −218,658.81 | 437,623.62 | 438,706.95 |
| | NP(1) | GTR+F+R7+H4 | −218,140.69 | 437,367.38 | 441,212.13 |
| P123_rRNA (10230) | NP(1) | GTR+F+R7*H4 | −235,257.15 | 471,630.31 | 475,666.37 |
| | MP(5) | - | −239,647.82 | 479,741.65 | 481,354.62 |
| | FP(15) | - | −239,509.47 | 479,754.95 | 482,416.72 |
| | NP(1) | GTR+F+R7 | −243,413.52 | 487,133.03 | 488,239.70 |
| | NP(1) | GTR+F+R7+H4 | −242,877.46 | 486,840.93 | 490,768.49 |
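The AIC and BIC columns of Table 1 can be cross-checked against the log-likelihoods with the standard definitions AIC = 2k − 2 ln L and BIC = k ln(n) − 2 ln L. The following is a minimal Python sketch, not the authors' workflow: the parameter count k is not reported in the table, so it is backed out from the AIC column, and n is assumed to be the number of alignment sites. The function names and row labels are hypothetical. It uses the P2 rows to show how the class-unlinked heterotachy model can rank first by AIC while the merged partitioning scheme ranks first by BIC.

```python
import math

def implied_k(ln_lik: float, aic: float) -> int:
    """Back out the number of free parameters k from AIC = 2k - 2 lnL."""
    return round((aic + 2.0 * ln_lik) / 2.0)

def bic(ln_lik: float, k: int, n_sites: int) -> float:
    """BIC = k * ln(n) - 2 lnL, with n assumed to be the alignment length."""
    return k * math.log(n_sites) - 2.0 * ln_lik

# (label, lnL, AIC, BIC) for the P2 matrix, copied from Table 1.
N_SITES_P2 = 3405
P2_ROWS = [
    ("MP(4)", -38619.95, 77611.90, 78752.64),
    ("NP TVM+F+R4", -39019.88, 78331.76, 79227.18),
    ("FP(13)", -38491.25, 77536.50, 79235.34),
    ("NP TVM+F+R4*H4", -37929.36, 76966.72, 80364.41),
    ("NP TVM+F+R4+H4", -38646.27, 78376.53, 81700.62),
]

# Lower is better for both criteria.
best_by_aic = min(P2_ROWS, key=lambda row: row[2])[0]
best_by_bic = min(P2_ROWS, key=lambda row: row[3])[0]

if __name__ == "__main__":
    for label, ln_lik, aic, bic_reported in P2_ROWS:
        k = implied_k(ln_lik, aic)
        print(f"{label:16s} k={k:4d} "
              f"BIC recomputed={bic(ln_lik, k, N_SITES_P2):.2f} "
              f"reported={bic_reported:.2f}")
    print("best by AIC:", best_by_aic)
    print("best by BIC:", best_by_bic)
```

Because BIC multiplies k by ln(n) rather than by 2, the several hundred extra parameters implied for the class-unlinked model are penalized far more heavily under BIC than under AIC, which is why the two criteria disagree for this matrix.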
Table 2. Base composition and skewness of L. arcuatus.
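| Location | A | T | C | G | Total | AT% | AT Skew | GC Skew |
|---|---|---|---|---|---|---|---|---|
| Whole mitochondrial genome | 7096 | 6937 | 2620 | 1636 | 18,290 | 76.72 | 0.01 | −0.23 |
| Protein-coding genes (PCGs) | 3560 | 4736 | 1487 | 1378 | 11,161 | 74.33 | −0.14 | −0.04 |
| 1st codon | 1282 | 1298 | 474 | 668 | 3722 | 69.32 | −0.01 | 0.17 |
| 2nd codon | 758 | 1756 | 700 | 506 | 3720 | 67.58 | −0.40 | −0.16 |
| 3rd codon | 1520 | 1682 | 313 | 204 | 3719 | 86.10 | −0.05 | −0.21 |
| PCGs of the forward strand | 2242 | 2725 | 1104 | 794 | 6865 | 72.35 | −0.10 | −0.16 |
| 1st codon | 825 | 687 | 364 | 413 | 2289 | 66.06 | 0.09 | 0.06 |
| 2nd codon | 471 | 1040 | 474 | 303 | 2288 | 66.04 | −0.38 | −0.22 |
| 3rd codon | 946 | 998 | 266 | 78 | 2288 | 84.97 | −0.03 | −0.55 |
| PCGs of the reverse strand | 1318 | 2011 | 383 | 584 | 4296 | 77.49 | −0.21 | 0.21 |
| 1st codon | 457 | 611 | 110 | 255 | 1433 | 74.53 | −0.14 | 0.40 |
| 2nd codon | 287 | 716 | 226 | 203 | 1432 | 70.04 | −0.43 | −0.05 |
| 3rd codon | 574 | 684 | 47 | 126 | 1431 | 87.91 | −0.09 | 0.46 |
| tRNAs | 561 | 558 | 140 | 178 | 1437 | 77.87 | 0.00 | 0.12 |
| tRNAs of the forward strand | 360 | 348 | 105 | 100 | 913 | 77.55 | 0.02 | −0.02 |
| tRNAs of the reverse strand | 201 | 210 | 35 | 78 | 524 | 78.44 | −0.02 | 0.38 |
| rRNAs | 825 | 828 | 142 | 269 | 2065 | 80.05 | 0.00 | 0.31 |
| l-rRNA | 507 | 535 | 77 | 151 | 1270 | 82.05 | −0.03 | 0.32 |
| s-rRNA | 318 | 293 | 65 | 118 | 795 | 76.86 | 0.04 | 0.29 |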
Table 3. Relative synonymous codon usage (RSCU) of L. arcuatus. The asterisk represents stop codons.
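| Codon | Amino Acid | Count | RSCU | Codon | Amino Acid | Count | RSCU | Codon | Amino Acid | Count | RSCU |
|---|---|---|---|---|---|---|---|---|---|---|---|
| TTT | Phe | 310 | 1.662 | CCA | Pro | 48 | 1.455 | AGT | Ser | 25 | 0.599 |
| TTC | Phe | 63 | 0.338 | CCG | Pro | 4 | 0.121 | AGC | Ser | 3 | 0.072 |
| TTA | Leu | 376 | 4.05 | CAT | His | 54 | 1.521 | AGA | Ser | 90 | 2.156 |
| TTG | Leu | 28 | 0.302 | CAC | His | 17 | 0.479 | AGG | Ser | 6 | 0.144 |
| TCT | Ser | 102 | 2.443 | CAA | Gln | 59 | 1.873 | GTT | Val | 63 | 1.546 |
| TCC | Ser | 24 | 0.575 | CAG | Gln | 4 | 0.127 | GTC | Val | 11 | 0.27 |
| TCA | Ser | 80 | 1.916 | CGT | Arg | 15 | 1.091 | GTA | Val | 75 | 1.84 |
| TCG | Ser | 4 | 0.096 | CGC | Arg | 2 | 0.145 | GTG | Val | 14 | 0.344 |
| TAT | Tyr | 126 | 1.527 | CGA | Arg | 32 | 2.327 | GCT | Ala | 85 | 2.012 |
| TAC | Tyr | 39 | 0.473 | CGG | Arg | 6 | 0.436 | GCC | Ala | 23 | 0.544 |
| TAA | * | 7 | 1.4 | ATT | Ile | 367 | 1.812 | GCA | Ala | 57 | 1.349 |
| TAG | * | 3 | 0.6 | ATC | Ile | 38 | 0.188 | GCG | Ala | 4 | 0.095 |
| TGT | Cys | 31 | 1.824 | ATA | Met | 221 | 1.713 | GAT | Asp | 56 | 1.697 |
| TGC | Cys | 3 | 0.176 | ATG | Met | 37 | 0.287 | GAC | Asp | 10 | 0.303 |
| TGA | Trp | 88 | 1.778 | ACT | Thr | 90 | 1.905 | GAA | Glu | 62 | 1.632 |
| TGG | Trp | 11 | 0.222 | ACC | Thr | 17 | 0.36 | GAG | Glu | 14 | 0.368 |
| CTT | Leu | 73 | 0.786 | ACA | Thr | 78 | 1.651 | GGT | Gly | 48 | 0.99 |
| CTC | Leu | 15 | 0.162 | ACG | Thr | 4 | 0.085 | GGC | Gly | 5 | 0.103 |
| CTA | Leu | 61 | 0.657 | AAT | Asn | 173 | 1.73 | GGA | Gly | 100 | 2.062 |
| CTG | Leu | 4 | 0.043 | AAC | Asn | 27 | 0.27 | GGG | Gly | 41 | 0.845 |
| CCT | Pro | 64 | 1.939 | AAA | Lys | 86 | 1.623 | | | | |
| CCC | Pro | 16 | 0.485 | AAG | Lys | 20 | 0.377 | | | | |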
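The RSCU values in Table 3 follow from the codon counts alone: for a codon family with n synonymous codons, RSCU is the observed count divided by the count expected if all n codons were used equally, that is, count × n / (family total). The Python sketch below illustrates this with the Phe and Leu counts from the table. The function name is hypothetical, and grouping all six Leu codons into one family is an assumption that matches the tabulated values.

```python
def rscu(family_counts: dict[str, int]) -> dict[str, float]:
    """Relative synonymous codon usage for one amino acid's codon family.

    RSCU = observed count / (family total / number of synonymous codons).
    A value above 1 means the codon is used more often than expected
    under equal usage of all synonymous codons.
    """
    total = sum(family_counts.values())
    if total == 0:
        raise ValueError("codon family has no observations")
    n_codons = len(family_counts)
    return {codon: count * n_codons / total
            for codon, count in family_counts.items()}

# Counts copied from Table 3.
phe = rscu({"TTT": 310, "TTC": 63})
leu = rscu({"TTA": 376, "TTG": 28, "CTT": 73, "CTC": 15, "CTA": 61, "CTG": 4})

if __name__ == "__main__":
    for name, values in (("Phe", phe), ("Leu", leu)):
        for codon, value in values.items():
            print(f"{name} {codon}: {value:.3f}")
```

By construction, the RSCU values within a family sum to the number of synonymous codons, so the strong preference for TTA (about 4.05 of a possible 6) leaves little usage for the G- and C-ending Leu codons.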
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Ji, Q.-Q.; Sun, Y.-N.; Lü, L.; Zhao, T.-Y.; Wu, D.-H. The Complete Mitochondrial Genome of Northeast Asian Rove Beetle, Lordithon arcuatus (Solsky, 1871) and Performance of Site-Specific Mixture Models in Building the Mitogenomic Phylogeny of Staphylinidae (Insecta: Coleoptera). Diversity 2023, 15, 588. https://doi.org/10.3390/d15050588