Article

Comparing the Efficiency of Single-Locus Species Delimitation Methods within Trochoidea (Gastropoda: Vetigastropoda)

1
Key Laboratory of Mariculture, Ministry of Education, Ocean University of China, Qingdao 266003, China
2
Sanya Oceanographic Institution, Ocean University of China, Sanya 572000, China
3
Laboratory for Marine Fisheries Science and Food Production Processes, Qingdao National Laboratory for Marine Science and Technology, Qingdao 266003, China
*
Author to whom correspondence should be addressed.
Genes 2022, 13(12), 2273; https://doi.org/10.3390/genes13122273
Submission received: 22 October 2022 / Revised: 26 November 2022 / Accepted: 29 November 2022 / Published: 2 December 2022
(This article belongs to the Special Issue Genetic Breeding and Genomics of Marine Shellfish)

Abstract

In the context of diminishing global biodiversity, the validity and practicality of species delimitation methods for the identification of many neglected and undescribed biodiverse species have been paid increasing attention. DNA sequence-based species delimitation methods are mainly classified into two categories, namely, distance-based and tree-based methods, and have been widely adopted in many studies. In the present study, we performed three distance-based (ad hoc threshold, ABGD, and ASAP) and four tree-based (sGMYC, mGMYC, PTP, and mPTP) analyses based on Trochoidea COI data and analyzed the discordance between them. Moreover, we also observed the performance of these methods at different taxonomic ranks (the genus, subfamily, and family ranks). The results suggested that the distance-based approach is generally superior to the tree-based approach, with the ASAP method being the most efficient. In terms of phylogenetic methods, the single threshold version performed better than the multiple threshold version of GMYC, and PTP showed higher efficiency than mPTP in delimiting species. Additionally, GMYC was found to be significantly influenced by taxonomic rank, showing poorer efficiency in datasets at the genus level than at higher levels. Finally, our results highlighted that cryptic diversity within Trochoidea (Mollusca: Vetigastropoda) might be underestimated, which provides quantitative evidence for excavating the cryptic lineages of these species.

1. Introduction

In the context of the gradual reduction in and loss of global biodiversity [1,2,3,4,5,6,7], it is necessary to use taxonomic knowledge to quantify and delimit species diversity. An increasing number of researchers support the use of molecular information to assist taxonomy, which can not only expedite the proper identification and delimitation of species, but can also provide effective means for the discovery and documentation of new taxa [8,9,10]. Methods for species delimitation relying on molecular information can generally be divided into single- or multi-locus methods [11]. A variety of approaches using single-gene data have become important for studies on hundreds or thousands of species [12,13]. Despite multi-locus studies having significant advantages in simultaneously considering several loci when delimiting species, single-locus methods still predominate in the literature [14,15,16]. Moreover, single-gene data also have much lower costs and have relatively smaller computational requirements. The use of single-locus data, rather than multi-locus data, as the primary evidence for identifying species, has enabled the rapid and effective generation of species hypotheses. As a consequence, the single-locus species delimitation method has become an extremely effective molecular tool when applied to integrative taxonomy, which is a multisource approach to exploring biodiversity [17]. Additionally, the single-gene methods are likely to continue to be used for many years. However, even if the single-locus methods have some advantages, the species hypotheses proposed by this approach are generally less robust and need to be complemented by other approaches.
Single-locus methods can generally be classified into two types: distance-based and tree-based methods [18,19,20]. Ever since the concept of a DNA barcode was proposed [21], a vast number of studies have aimed to verify its validity and practicality by comparing the species boundaries it recovers with morphospecies [22,23]. One of the first methods proposed for delimiting species from the cytochrome c oxidase subunit I (COI) barcode dataset was based on the pairwise nucleotide genetic distances of sequences [24]. This approach tries to find a “barcoding gap”, which rests on the assumption that the amount of genetic variation within species is smaller than the amount of variation between species [25]. Although efforts have been made to establish a fixed threshold to distinguish between intra- and inter-species divergence (e.g., 3% divergence [26] or the 10 × rule [21]), no single threshold applies to all groups of organisms when taxa with different evolutionary histories are analyzed together [25,27]. Therefore, the application of a group-specific ad hoc threshold should lead to more accurate delimitations. From this point of view, the automatic barcode gap discovery (ABGD) method was proposed [19] and has been widely used in several studies [13,28,29]. The method searches for a “barcode gap” in the distribution of pairwise sequence divergences, and the partitioning of the data is repeated until no further splitting occurs. The hierarchical clustering algorithm uses a series of prior intraspecific divergences to infer the one-sided confidence limit for intraspecific divergence. The barcode gap is used as a threshold to delimit molecular operational taxonomic units (MOTUs) [30]; the confidence limit is deduced, and the barcode gap is detected recursively until the MOTUs cannot be divided further [19]. Subsequently, an updated implementation of ABGD, named “assemble species by automatic partitioning” (ASAP), was described by Puillandre et al. [31]; it delimits species without prior biological insight into intraspecific diversity and provides an asap-score system used to rank the alternative partitioning schemes.
In contrast to the previous methods, tree-based methods use tree topology instead of relying on the clustering of haplotype sequences [32]. These methods are mainly of two types: the generalized mixed Yule coalescent (GMYC) [33] and Poisson tree processes (PTP) [34]. In recent years, the GMYC model has proven to be a practical tool by which to delimit species from single-locus information in datasets due to its robustness and accuracy [12,35,36,37]. This method takes an ultra-metric tree as an input and attempts to detect the transition of branching patterns from interspecific branches to intraspecific branches. The principle of this strategy is to estimate the rates of branching events in order to distinguish two categories of events, i.e., speciation and coalescence within species. GMYC methods were first implemented under a maximum likelihood framework, including a single-threshold [33] and a multi-threshold version [38], and then expanded to a Bayesian framework [28]. In contrast to GMYC, the PTP model is designed to use the number of substitutions rather than time to estimate branching processes and, thus, requires only an ML tree as input, which makes it faster and easier to run than GMYC [34,39]. In principle, the number of substitutions is expected to be significantly higher between species than within species. For PTP, assuming that each substitution has a small probability of generating a speciation, and that the number of substitutions in a population of a certain size is large, the process follows a Poisson distribution in continuous time [34]. The original PTP model was then expanded to its Bayesian implementation (bPTP), which adds Bayesian support (BS) values to the input tree. Recently, an improved method, named the multi-rate PTP (mPTP), was proposed by Kapli et al. [40], which fits a distinct exponential distribution to each delimited species to account explicitly for differences in the evolutionary process or sampling intensity.
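To make the PTP idea more concrete, the sketch below scores one candidate delimitation by fitting one exponential rate to between-species branch lengths and a second rate to within-species branch lengths, and compares that with a single-rate null model. It is a simplified illustration and not the PTP software: the function names and the toy branch lengths are hypothetical, and the search over tree partitions that PTP performs is omitted.

```python
import math

def exp_loglik(branch_lengths):
    """Maximized log-likelihood of branch lengths under one exponential rate."""
    n = len(branch_lengths)
    if n == 0:
        return 0.0
    total = sum(branch_lengths)
    if total <= 0:
        # All-zero branches give an unbounded rate estimate.
        raise ValueError("branch lengths must sum to a positive value")
    rate = n / total  # maximum likelihood estimate of the exponential rate
    return n * math.log(rate) - rate * total

def two_class_loglik(between, within):
    """Log-likelihood with separate rates for speciation and coalescent branches."""
    return exp_loglik(between) + exp_loglik(within)

# Hypothetical branch lengths in substitutions per site.
between_species = [0.10, 0.12, 0.08]
within_species = [0.002, 0.001, 0.003, 0.002]

null_ll = exp_loglik(between_species + within_species)
delimited_ll = two_class_loglik(between_species, within_species)
```

With these toy values the two-rate model fits much better than the single-rate null, which is the kind of signal a likelihood-based delimitation looks for. The guard against an all-zero class also hints at why zero-length terminal branches can be troublesome for tree-based methods.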
The performance of single-locus species delimitation methods has been tested in chordates, arthropods, molluscs, and other biological groups [13,27,41,42,43,44]. Although many studies have primarily aimed to examine species diversity, they have also effectively compared the results obtained from different species delimitation approaches. For example, Padula et al. [45] showed that the GMYC approach was inclined to split species compared with ABGD and PTP; Knutson and Gosliner [46] showed that the conservative results obtained by ABGD and PTP were mostly concordant, even if there were some differences. Trochoidea (Gastropoda: Vetigastropoda), which is the largest and most biodiverse superfamily within Vetigastropoda [24,47], consists of more than 2000 living species that are grouped into about 237 recognized genera in WoRMS (https://www.marinespecies.org/, accessed on 20 August 2022). It is an excellent model for comparative analyses of the efficiency of MOTU-delimitation algorithms in recovering species boundaries because a large number of molecular sequences have been published and used to study the complex phylogenetic relationships based on single-locus or genomic data [48,49,50]. Given that more empirical studies are needed to compare and contrast the results of species delimitation methods with nominal species, the primary goal of this study was to compare the efficiency of single-locus species delimitation methods within Trochoidea. Moreover, we were interested in the performance of species delimitation methods at different taxonomic levels. Finally, we highlight that the consistency of results obtained from multiple tests should be considered before drawing conclusions about the cryptic diversity of species.

2. Materials and Methods

2.1. DNA Extraction, PCR Amplification, and Sequencing

The total genomic DNA was extracted from 5 to 10 mg of foot tissue using the TIANamp Marine Animals DNA Kit (TIANGEN Biotech Beijing Co., Ltd., Beijing, China), following the manufacturer’s protocols. A partial region of the COI gene was amplified using a universal primer pair: LCO1490 (5′-GGTCAACAAATCATAAAGATATTGG-3′) and HCO2198 (5′-TAAACTTCAGGGTGACCAAAAAATCA-3′) [51]. The polymerase chain reaction (PCR) was carried out in a final volume of 25 µL and contained 22 µL of GoldenStar® T6 Super PCR Mix (1.1×) (Tsingke Biotechnology Co., Ltd., Beijing, China), 1 µL of each primer, and 1 µL of template DNA. Thermocycler conditions were set with initial denaturation at 94 °C for 3 min, followed by 35 cycles of denaturation at 94 °C for 30 s, annealing at 48 °C for 30 s, elongation at 72 °C for 1 min, and final elongation at 72 °C for 10 min. PCR products were confirmed by electrophoresis in 1.5% agarose gel and the qualified products were sent to Sangon Biotech (BGI TECH SOLUTIONS Co., Ltd., Beijing, China) for purification and bidirectional sequencing.

2.2. Sequence Analysis and Dataset Composition

In the present study, we combined 1248 previously published COI sequences of Trochoidea in GenBank (Table S1) with our 108 new COI sequences (Table S2). Chromatograms were examined and a single consensus sequence for each individual was generated using Seqman software v.7.1 (DNASTAR, Inc., Madison, WI, USA) [52]. In terms of sequence selection, sequences with ambiguous taxonomic labels (e.g., “Arene sp.”) were excluded because they cannot be compared against the results of species delimitation methods. Additionally, we only analyzed sequences that were at least 500 bp in length in order to meet the minimum length of sequences required for the DNA barcode [20]. A total of 1356 sequences was assigned to 40 datasets according to the taxonomic levels, resulting in 12 family, 7 subfamily, and 21 genus datasets (Table 1). Then, multiple sequence alignment was implemented using MAFFT v.7.4 [53], and the sequences were adjusted and trimmed using trimAl v1.4 [54] with default parameters in a Linux environment. For each dataset, identical haplotypes were removed using ALTER [55] because the GMYC model can only handle dichotomous branches [12,56].
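The selection and haplotype-collapsing steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the workflow actually run with ALTER: the function names are hypothetical, a label counts as ambiguous only when its second word is "sp." or is missing, and sequences are treated as identical only when their strings match exactly.

```python
def keep_record(label: str, seq: str, min_len: int = 500) -> bool:
    """Keep a record only if it is named to species level and long enough."""
    tokens = label.split()
    # Assumption: "Genus sp." style labels or genus-only labels are ambiguous.
    ambiguous = len(tokens) < 2 or tokens[1].rstrip(".") == "sp"
    n_bases = sum(ch in "ACGTacgt" for ch in seq)  # ignore gaps and Ns
    return (not ambiguous) and n_bases >= min_len

def collapse_haplotypes(records):
    """Group record ids by identical (case-insensitive) sequence."""
    haplotypes = {}
    for rec_id, seq in records:
        haplotypes.setdefault(seq.upper(), []).append(rec_id)
    return haplotypes

# Hypothetical toy records: (id, sequence).
toy_records = [("x", "ACGT"), ("y", "acgt"), ("z", "ACGA")]
toy_haplotypes = collapse_haplotypes(toy_records)
```

Keeping one representative per key of `toy_haplotypes` yields a dataset without duplicate sequences, which is the property the GMYC input tree needs.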
The existence of a “barcode gap” for each dataset was evaluated using the “Spider” package [57] in R version 3.6 [58] by calculating the difference between the minimum interspecific distance and the maximum intraspecific distance. Our results showed that a zero difference existed in the Angariidae dataset. ABGD cannot propose a primary partition and is therefore not suitable for species delimitation when no barcode gap exists in a dataset [19]. Since the Areneidae dataset was composed of only one sequence, it could not be applied for species delimitation. Moreover, tree-based methods were used on the phylogenetic trees by excluding the Skeneidae dataset with only one sequence per species. Therefore, in our study, all the analyses were performed on 37 haplotype datasets, including 9 families, 7 subfamilies, and 21 genus datasets (Figure 1). The datasets varied greatly in terms of species and COI sequence numbers, which ranged from the family dataset, composed of 130 species and 411 sequences, to the genus dataset, containing 3 species and 4 sequences (Table 1). The genus datasets included 8.3 species on average; the maximum number of species contained was 28, and the minimum was 3 species. The number of species in the subfamily datasets ranged from 6 to 80, with a mean of 28.9 species. The family datasets contained a mean of 32 species, with the number ranging from 2 to 130 species.
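The barcode-gap check described above (minimum interspecific distance minus maximum intraspecific distance) can be sketched in a few lines. The study used the Spider package in R; the Python below is only an illustration, with hypothetical function names, toy sequences, and an uncorrected p-distance chosen for simplicity.

```python
from itertools import combinations

def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites, skipping positions with gaps or Ns."""
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)

def barcode_gap(seqs: dict, species: dict) -> float:
    """Minimum interspecific distance minus maximum intraspecific distance."""
    intra, inter = [], []
    for i, j in combinations(seqs, 2):
        d = p_distance(seqs[i], seqs[j])
        (intra if species[i] == species[j] else inter).append(d)
    if not intra or not inter:
        raise ValueError("need both intra- and interspecific pairs")
    return min(inter) - max(intra)

# Hypothetical aligned toy sequences and their nominal species.
toy_seqs = {
    "a1": "ACGTACGTAC",
    "a2": "ACGTACGTAT",
    "b1": "TTGTACGAAC",
    "b2": "TTGTACGAAC",
}
toy_species = {"a1": "sp_A", "a2": "sp_A", "b1": "sp_B", "b2": "sp_B"}
gap = barcode_gap(toy_seqs, toy_species)
```

A positive value indicates a gap between intra- and interspecific variation; a value of zero or below corresponds to the situation in which no barcode gap exists. The function also raises an error when a dataset has no intraspecific pairs, for example when every species is represented by a single sequence.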

2.3. Phylogenetic Analysis

Prior to the species delimitation analyses, we selected the best-fit model of nucleotide substitution for the dataset in MEGA X [59] based on the Bayesian information criterion (BIC). Two phylogenetic trees were obtained for each dataset as follows: a maximum likelihood (ML) tree calculated in RAxML-HPC2 v8.2 [60] and an ultra-metric gene tree in BEAST v.2.6 [61]. ML trees were obtained in the Cyber Infrastructure for Phylogenetic Research (CIPRES) portal v.3.1 [62], under the GTR+γ model of evolution, using 200 heuristic independent runs and a bootstrap resampling of 1000 replicates. The ultra-metric trees for the GMYC analyses were inferred under a strict clock and a Yule tree prior, with all other priors left at the default values. The Markov chain length ranged from 10 × 10⁶ to 80 × 10⁷ generations, depending on the estimated sample size of each parameter of the model, with a burn-in of 0.1. Trace logs were visualized in Tracer version 1.7 [63] to assess convergence and adequate posterior sampling (ESS > 200). A maximum clade credibility (MCC) tree was created in TreeAnnotator v.2.6 [61] using mean heights for annotation. Additionally, FigTree 1.4.4 [64] was used to visualize and edit the trees.

2.4. Molecular Species Delimitation Analyses

In this study, three distance-based and four tree-based methods were used for the production of species hypotheses, as follows: (1) the ad hoc estimated threshold on each dataset [27]; (2) ABGD [19]; (3) ASAP [31]; (4) the single-threshold GMYC (sGMYC) model [33] and the multi-threshold GMYC (mGMYC) model [38]; and (5) the PTP model [34] and the multi-rate PTP (mPTP) model [40].
In order to assess the efficiency of the different species delimitation methods, we adopted the strategy used by Magoga et al. [27] to compare the results of a variety of species delimitation approaches. The efficiency of each method in delimiting species was evaluated by examining the correspondence between molecular delimited units and nominal species for each dataset. The efficiency of each method in recovering boundaries was assessed according to the following categories: (1) match: all sequences belonging to a species are assigned to an MOTU that contains no other members; (2) split: the sequences of a species are divided into two or more MOTUs; (3) merge: all the sequences of one species are placed in a single MOTU along with all the sequences of another species; (4) mixture: some sequences of a species are split, while others are merged.
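The four outcome categories can be expressed as a small classification routine. The sketch below is an assumption-laden illustration rather than the scoring script used in the study: the function name and toy data are hypothetical, and "merge" is simplified to mean that a species sits in a single MOTU shared with any other species.

```python
from collections import defaultdict

def classify_species(assignments):
    """Classify each nominal species as match, split, merge, or mixture.

    assignments: iterable of (sequence_id, nominal_species, motu).
    """
    motus_of = defaultdict(set)    # species -> MOTUs holding its sequences
    species_in = defaultdict(set)  # MOTU -> species found inside it
    for _seq_id, species, motu in assignments:
        motus_of[species].add(motu)
        species_in[motu].add(species)

    outcome = {}
    for species, motus in motus_of.items():
        shared = any(len(species_in[m]) > 1 for m in motus)
        if len(motus) == 1:
            outcome[species] = "merge" if shared else "match"
        else:
            outcome[species] = "mixture" if shared else "split"
    return outcome

# Hypothetical toy delimitation result.
toy = [
    ("s1", "A", "M1"), ("s2", "A", "M1"),                     # A alone in M1
    ("s3", "B", "M2"), ("s4", "B", "M3"),                     # B in two MOTUs
    ("s5", "C", "M4"), ("s6", "D", "M4"),                     # C and D share M4
    ("s7", "E", "M5"), ("s8", "E", "M6"), ("s9", "F", "M6"),  # E partly with F
]
result = classify_species(toy)
match_rate = sum(v == "match" for v in result.values()) / len(result)
```

Here `match_rate` is the share of nominal species recovered exactly, which is the quantity used as the measure of efficiency when methods are compared.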
As the threshold value of the barcode gap varies between groups [18], the R function “localMinima” of the “Spider” package was used to compute the minimum value in the density of the nucleotide distances as the most likely threshold for species delimitation in the dataset. Then, we used the R function “tclust” of the “Spider” package to cluster sequences at the previously identified ad hoc threshold value. ABGD analyses were run from the command line using the Kimura two-parameter (K2P) substitution model [65] and a gap width (X) of 1.5; when a gap was not found using this value, a width of 1.0 or 0.5 was set. The prior maximum value of intraspecific divergence ranged from 0.001 to 0.1, and the other parameters were left as default. For ABGD, only the primary partitions were considered since they were typically stable over a wider range of prior values. ASAP analyses were run under the Linux environment; K2P was selected as the nucleotide substitution model and the remaining parameters were left as default. This method mainly overcomes the limitations of ABGD in two respects. On the one hand, it no longer requires users to provide a prior limit to intraspecific diversity (p). On the other hand, it provides users with an ASAP score for each partition, which allows the users to choose the “best” partition of species delimitation [31]. For ASAP, we selected the partition that was closest to the delimitation of nominal species among the ten partitions. For GMYC delimitation, the sGMYC and mGMYC methods were both conducted in R, using the “Splits” (Species Limits by Threshold Statistics) package [66]. The main function of “Splits” tests the fit of a GMYC model against a null model of coalescence. For the PTP and mPTP methods, we performed ML and MCMC analyses using an ML tree in the command line, and both utilized the –multi option to incorporate differences in the rates of coalescence among species.
For each dataset, a sequence was selected as an outgroup to root the phylogenetic tree based on current systematic knowledge and it was removed before running the delimitation analysis. We performed 10 independent runs with the following settings: an MCMC chain of 10 million generations, sampled every 5000 generations, with the first million generations discarded as burn-in. Convergence was assessed by observing the output plot of generation vs. log-likelihood (created using the “-mcmc_log” command). The congruency of the independent runs was assessed according to the average standard deviation of the delimitation support values (ASDDSV), which approaches zero as the multiple MCMC runs converge on the same delimitation distribution. In addition, we assessed the confidence of the ML delimitation according to the average support values (ASV) over the 10 runs.
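The distance-threshold step described earlier in this section (K2P distances followed by clustering at an ad hoc threshold) can be sketched as below. This is not a reimplementation of the Spider functions or of ABGD: the function names, the toy sequences, the threshold of 0.1, and the single-linkage rule (two sequences join the same cluster if their distance is below the threshold) are all assumptions made for illustration.

```python
import math

PURINES = {"A", "G"}

def k2p_distance(a: str, b: str) -> float:
    """Kimura two-parameter distance between two aligned sequences."""
    sites = transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguous bases
        sites += 1
        if x == y:
            continue
        if (x in PURINES) == (y in PURINES):
            transitions += 1    # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    p, q = transitions / sites, transversions / sites
    # math.log raises ValueError if the pair is too divergent (saturated).
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)

def cluster_by_threshold(seqs: dict, threshold: float):
    """Single-linkage clusters of sequences closer than the threshold."""
    ids = list(seqs)
    parent = {i: i for i in ids}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for n, i in enumerate(ids):
        for j in ids[n + 1:]:
            if k2p_distance(seqs[i], seqs[j]) < threshold:
                parent[find(i)] = find(j)
    clusters = {}
    for i in ids:
        clusters.setdefault(find(i), []).append(i)
    return sorted(sorted(c) for c in clusters.values())

# Hypothetical aligned toy sequences (20 bp).
toy = {
    "a1": "ACGTACGTACGTACGTACGT",
    "a2": "ACGTACGTACGTACGTACGC",  # one transition relative to a1
    "b1": "GTACACGTACGTACGTACGT",  # four transitions relative to a1
}
motus = cluster_by_threshold(toy, threshold=0.1)
```

In practice the threshold would come from the density of pairwise distances in each dataset rather than being fixed in advance, which is the point of estimating it ad hoc.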

3. Results

In this study, the number of matches obtained from each analysis represents the efficiency of the species delimitation methods. Our results showed that molecular delimitation using ASAP resulted in the highest percentage of matches (86.3% on average) between the estimated MOTUs and the nominal species. A similar percentage of matches was obtained using the ad hoc nucleotide distance threshold (78.0% on average, Table S3) and ABGD (78.7% on average), which were slightly lower than when using sGMYC (79.3% on average) (Figure 2, Table S4). Regarding the results of sGMYC and mGMYC (79.8% and 70.9% on average, respectively), both demonstrated over-splitting of species compared with the other approaches. For the PTP analyses, the ASDDSV was ≤0.01 for all the datasets, suggesting good convergence of the 10 independent MCMC runs. Additionally, for the majority of the datasets (32 out of 37 datasets), ASV exceeded 50%, suggesting that the ML analysis was well-supported by our data. For the mPTP analyses, the ASDDSV was ≤0.01 in 36 out of 37 datasets; the results for the six datasets with ASV values under 50% should be considered as having no significance. In comparison with the percentage of matches obtained with PTP (73.1% on average), mPTP showed a lower proportion of matches (35.1% on average) and a higher percentage of merges (59.0% on average) (Figure 2 and Table S4).
There were also considerable similarities in the performance of GMYC and PTP regarding phylogenetic-coalescent methods. Our results showed that a lower number of matches was obtained with the multi-threshold version of GMYC and the multi-rate version of PTP than with their single-threshold and single-rate counterparts. Additionally, the GMYC method tended to split species, and molecular delimitation using sGMYC and mGMYC resulted in a higher percentage of splits (7.2% and 12.5% on average, respectively, Figure 2).
Taxonomic rank is one of the factors affecting the efficiency of species delimitation methods [27]. In our study, we tested these methods on several replicates of the same dataset, aggregated on the basis of increasing taxonomic rank, for example, by grouping all the species of one genus, all the genera of a subfamily, and all the subfamilies of a family. In genus-level datasets, we distinctly observed the better efficiency of the distance-based methods compared with the tree-based methods (Figure 3). In addition, the species delimitation efficiency of ASAP was the highest in the genus- and family-level datasets, while it was slightly inferior to sGMYC in the subfamily datasets. At the subfamily level, the ad hoc threshold, ASAP, and sGMYC showed almost equal efficiency in recovering species boundaries. Moreover, the values were mostly concentrated within a narrower range when the data were analyzed at the family level (Figure 3). In accordance with the conclusion of Magoga et al. [27], GMYC was significantly affected by taxonomic rank and had poor efficiency when used on our genus-level datasets compared with other levels (Figure 3).

4. Discussion

4.1. Species Delimitation Method Efficiency

Our results showed that different methods do produce different delimitation scenarios based on single-locus data, which is congruent with the findings of previous studies [32,67]. Considering all the approaches adopted in this study, the distance-based method analyses generally outperformed the coalescent-derived method analyses, with the highest efficiency being from ASAP, followed by sGMYC and ABGD. The ASAP method showed the 10 best partitions with species delimitation. Although lower ASAP scores indicated better partitions for species delimitation [31], only 78% of the analyzed datasets achieved a high percentage of matches in one of the two best ASAP-score partitions. This reminds us that we should not only consider the two optimal partitions of ASAP, but also that it is better to combine the partitions with biological knowledge or other characteristics to delimit species in an integrative taxonomy framework. In addition, only 64% of the analyzed datasets were identical to the partition selected as the “best” from the ABGD output. Certainly, it should be taken into consideration that these methods based on pairwise distance, to a large extent depend on the choices of the users, namely, the use of the ad hoc threshold, the initial and recursive partition of ABGD, or the partition selection of ASAP.
It is well-known that GMYC is prone to over-splitting species [45,56], especially the multi-threshold version of GMYC [68,69]. The mGMYC method is very sensitive to deep coalescent events since it allows many temporal thresholds for speciation, resulting in the overestimation of the number of species [70]. For the PTP model, species were significantly prone to being over-lumped by mPTP in our study (Figure 2), which was contrary to what was observed to occur in the splitting of haplotype-rich species [13,31]. There are many factors contributing to this result, such as the ratio of population sizes to species divergence times, effective population size, variation among species, uneven sampling, and gene flow [32,70,71]. A high proportion of singletons may also hinder the performance of these two methods in species delimitation [12,34]. Some issues lead to incongruence in tree-based methods for species delimitation. For instance, the zero-length terminal branches of trees were considered as a relevant point for the correct identification of the transition point between coalescent and speciation processes [39,42]. Moreover, the process of the intraspecific diversification can also result in phylogenetic arrangements with long branches, which might produce a deviation for estimating species diversity [70].
However, in our study, all the results obtained were only based on a single locus. Therefore, in future studies, we will consider repeating the same analyses using other loci to determine whether the conclusions remain unaltered. Additionally, for given datasets, the incongruence in the results across methods also confirmed the individual shortcomings of one or more of the methods used to delimit species. Therefore, a more appropriate means of delimiting species is to analyze these data using a wide range of methods.

4.2. Cryptic Diversity

Incongruence across one or more of the methods used for delimiting species is likely to be useful evidence in the detection of cryptic lineages. In this study, by comparing the inconsistencies between seven species delimitation approaches, we observed four genera (Turbo, Tegula, Monodonta, and Bolma) whose phylogenetic relationships remained ambiguous. Our results showed that several nominal species were included in a single MOTU, while a few recognized species were inferred to represent two or more MOTUs. In Turbo, these methods congruently inferred multiple isolated species-level lineages within Turbo castanea, while mGMYC additionally inferred multiple lineages within Turbo chrysostomus and Turbo cornutus (Figure 4). Conversely, PTP and mPTP inferred Turbo petholatus as a single lineage, whereas the other five methods split it into distinct lineages. Moreover, two morphologically similar, but genetically polyphyletic, groups were identified within Turbo (Turbo sp. A and Turbo sp. B). All the samples of Turbo sp. A were collected from Hainan province and all the specimens of Turbo sp. B were collected from Guangxi province (Table S2). Moreover, the genus Tegula was found to have some unclear taxonomic relationships on account of its higher intraspecific morphological variation. Tegula pfeifferi was observed to have two distinctive morphotypes based on its shell surface (smooth and ribbed) and Tegula xanthostigma had two shell colors (black and light brown) [49,72]. According to the recovered species boundaries using the ad hoc threshold with ASAP and sGMYC (Figure 5), two T. pfeifferi individuals of the ribbed type and all the samples of Tegula rustica were incorporated into one MOTU. However, other T. pfeifferi individuals with either a ribbed or smooth surface were assigned to another MOTU. Therefore, the phylogenetic relationship between T. rustica and T. pfeifferi needs to be further studied [49]. Additionally, for T. xanthostigma, individuals with different shell colors were divided into two MOTUs by all the methods, excluding ABGD and mGMYC (Figure 5), which suggested that the shell color morphotypes were genetically distinguishable. The conclusions above are consistent with the phenomenon described by Yamazaki et al. [49]. Moreover, in Monodonta, within three species, namely Monodonta labio, Monodonta australis, and Monodonta confusa, the number of MOTUs inferred by unilocus species delimitation methods did not match the number of morphological entities. All the approaches, except sGMYC and mGMYC, divided the individuals of M. labio into two lineages. One lineage not only included M. labio, but also contained one individual of M. confusa. M. australis and the remaining haplotypes of M. confusa were lumped by the distance-based methods, but the tree-based approaches could clearly distinguish the two species (Figure 6). The remaining haplotypes of M. confusa were assigned to three MOTUs by the GMYC method, which further demonstrated that the GMYC method is characterized by the over-splitting of species compared with the other approaches. In the genus Bolma, shell morphology was considered of poor significance in species identification due to its higher adaptation [73]. Some species belonging to Bolma showed high levels of intraspecific genetic differentiation, such as Bolma henica, Bolma castelinae, and Bolma fuscolineata (Figure 7). Although the discordances between nominal species and genetic entities based on COI or genomic data were studied within Trochoidea [49,73,74], we used more single-locus species delimitation methods to further confirm this phenomenon and provided evidence for probing the cryptic lineages of these species.
There are many factors resulting in the splitting of species. For example, the splitting of the oceans leads to high levels of genetic differentiation in M. labio [74]. In addition, different preferred habitats might trigger genetic divergence among populations, such as T. xanthostigma [49]. Certainly, the situation of lumping or mixing species also exists, which might be explained by incomplete lineage sorting in ancient lineages or the emergence of hybridization due to incomplete reproductive isolation [49,75]. For instance, M. confusa and M. labio were genetically distinguished, but one individual of M. confusa was gathered with M. labio, which may be due to hybridization having occurred between M. confusa and M. labio [74] (Figure 6). Furthermore, for shelled molluscs, the problem of shell polymorphism and conservatism is a huge challenge for species identification [73]; for instance, the high level of intraspecific shell polymorphism in Bolma and the different shell sculptures and colors existing in Tegula. In these cases, using the molecular data rather than morphological characters as primary evidence of species identification can objectively produce species hypotheses. Therefore, further taxonomic investigation into these lineages is essential. Finally, we suggest that additional molecular analyses using multi-locus data or SNPs obtained by next-generation sequencing are necessary for further studies of these ambiguous taxonomic problems.

5. Conclusions

In this study, we took advantage of multiple datasets composed of COI data to test a series of unilocus species delimitation methods based on pairwise genetic distance and phylogenetic trees. By examining the number of matches between nominal species and inferred MOTUs using these approaches, we found that the efficiency of the distance-based and tree-based approaches had certain variance in recovering species boundaries. The results suggested that the distance-based approach was generally superior to the tree-based approach, with the ASAP method being the most efficient. Additionally, the single-threshold version of GMYC performed better than the multi-threshold version in general. The PTP method showed higher efficiency than mPTP in delimiting species. Moreover, GMYC was found to be significantly influenced by the taxonomic rank, showing poorer efficiency in datasets at the genus level than at higher levels. In addition, our results further confirmed that the different species delimitation methods produced different results, which strongly supports the necessity of using a combination of methods before reaching a final species hypothesis. Moreover, we emphasize that there is a need to explore previously unrecognized species diversity in order to lay a foundation for taxonomic and systematic investigation. However, other factors affecting species delimitation should also be taken into consideration when drawing conclusions regarding the efficiency of these methods. Considering that we compared the efficiency between these methods based only on one locus, in the future, we will use other loci to repeat the same analyses to confirm whether the conclusions here remain unchanged.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/genes13122273/s1, Table S1: GenBank accession numbers of all COI sequences analyzed in our study; Table S2: Sample localities and GenBank accession numbers of the sequences; Table S3: Ad hoc nucleotide distance threshold values for species delimitation, estimated for each dataset; Table S4: Results of the molecular species delimitation analyses.

Author Contributions

B.G. conducted the experiments, analyzed the data, and wrote the manuscript. L.K. designed the study and edited the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the Hainan Provincial Joint Project of Sanya Yazhou Bay and Technology City Grant 320LH019, the National Natural Science Foundation of China under Grant 31772414, and the Fundamental Research Funds for the Central Universities under Grant 201964001.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original sequencing data have been submitted to GenBank (accession numbers ON908702-ON908706; ON908817-ON908836; ON908877-ON908879; ON908885-ON908888; ON908946-ON908951; OP457006-OP457075).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Houlahan, J.E.; Findlay, C.S.; Meyer, A.H.; Kuzmin, S.L.; Schmidt, B.R. Ecology—Global amphibian population declines—Reply. Nature 2001, 412, 500.
  2. Light, T.; Marchetti, M.P. Distinguishing between invasions and habitat changes as drivers of diversity loss among California’s freshwater fishes. Conserv. Biol. 2007, 21, 434–446.
  3. Winne, C.T.; Willson, J.D.; Todd, B.D.; Andrews, K.M.; Gibbons, J.W. Enigmatic decline of a protected population of Eastern Kingsnakes, Lampropeltis getula, in South Carolina. Copeia 2007, 2007, 507–519.
  4. King, D.I.; Lambert, J.D.; Buonaccorsi, J.P.; Prout, L.S. Avian population trends in the vulnerable montane forests of the Northern Appalachians, USA. Biodivers. Conserv. 2008, 17, 2691–2700.
  5. Costello, M.J.; May, R.M.; Stork, N.E. Can We Name Earth’s Species before They Go Extinct? Science 2013, 339, 413–416.
  6. Laurance, W.F. The Race to Name Earth’s Species. Science 2013, 339, 1275.
  7. Hallmann, C.A.; Sorg, M.; Jongejans, E.; Siepel, H.; Hofland, N.; Schwan, H.; Stenmans, W.; Müller, A.; Sumser, H.; Hörren, T. More than 75 percent decline over 27 years in total flying insect biomass in protected areas. PLoS ONE 2017, 12, e0185809.
  8. Kim, K.C.; Byrne, L.B. Biodiversity loss and the taxonomic bottleneck: Emerging biodiversity science. Ecol. Res. 2006, 21, 794–810.
  9. Mutanen, M.; Kaila, L.; Tabell, J. Wide-ranging barcoding aids discovery of one-third increase of species richness in presumably well-investigated moths. Sci. Rep. 2013, 3, 2901.
  10. Van Klink, R. Meta-analysis reveals declines in terrestrial but increases in freshwater insect abundances. Science 2020, 370, 1175.
  11. Carstens, B.C.; Pelletier, T.A.; Reid, N.M.; Satler, J.D. How to fail at species delimitation. Mol. Ecol. 2013, 22, 4369–4383.
  12. Fujisawa, T.; Barraclough, T.G. Delimiting Species Using Single-Locus Data and the Generalized Mixed Yule Coalescent Approach: A Revised Method and Evaluation on Simulated Data Sets. Syst. Biol. 2013, 62, 707–724.
  13. Hofmann, E.P.; Nicholson, K.E.; Luque-Montes, I.R.; Koehler, G.; Cerrato-Mendoza, C.A.; Medina-Flores, M.; Wilson, L.D.; Townsend, J.H. Cryptic Diversity, but to What Extent? Discordance between Single-Locus Species Delimitation Methods within Mainland Anoles (Squamata: Dactyloidae) of Northern Central America. Front. Genet. 2019, 10, 11.
  14. Yang, Z.; Rannala, B. Bayesian species delimitation using multilocus sequence data. Proc. Natl. Acad. Sci. USA 2010, 107, 9264–9269.
  15. Yang, Z.; Rannala, B. Unguided Species Delimitation Using DNA Sequence Data from Multiple Loci. Mol. Biol. Evol. 2014, 31, 3125–3135.
  16. Fujisawa, T.; Aswad, A.; Barraclough, T.G. A Rapid and Scalable Method for Multilocus Species Delimitation Using Bayesian Model Comparison and Rooted Triplets. Syst. Biol. 2016, 65, 759–771.
  17. Schlick-Steiner, B.C.; Steiner, F.M.; Seifert, B.; Stauffer, C.; Christian, E.; Crozier, R.H. Integrative Taxonomy: A Multisource Approach to Exploring Biodiversity. Annu. Rev. Entomol. 2010, 55, 421–438.
  18. Meier, R.; Shiyang, K.; Vaidya, G.; Ng, P.K.L. DNA barcoding and taxonomy in diptera: A tale of high intraspecific variability and low identification success. Syst. Biol. 2006, 55, 715–728.
  19. Puillandre, N.; Lambert, A.; Brouillet, S.; Achaz, G. ABGD, Automatic Barcode Gap Discovery for primary species delimitation. Mol. Ecol. 2012, 21, 1864–1877.
  20. Ratnasingham, S.; Hebert, P.D.N. A DNA-Based Registry for All Animal Species: The Barcode Index Number (BIN) System. PLoS ONE 2013, 8, e66213.
  21. Hebert, P.D.; Ratnasingham, S.; De Waard, J.R. Barcoding animal life: Cytochrome c oxidase subunit 1 divergences among closely related species. Proc. R. Soc. Lond. B Biol. Sci. 2003, 270, S96–S99.
  22. Clare, E.L.; Lim, B.K.; Engstrom, M.D.; Eger, J.L.; Hebert, P.D.N. DNA barcoding of Neotropical bats: Species identification and discovery within Guyana. Mol. Ecol. Notes 2007, 7, 184–190.
  23. Kerr, K.C.R.; Stoeckle, M.Y.; Dove, C.J.; Weigt, L.A.; Francis, C.M.; Hebert, P.D.N. Comprehensive DNA barcode coverage of North American birds. Mol. Ecol. Notes 2007, 7, 535–543.
  24. Sokal, R.R. Numerical taxonomy. Sci. Am. 1966, 215, 106–117.
  25. Meyer, C.P.; Paulay, G. DNA barcoding: Error rates based on comprehensive sampling. PLoS Biol. 2005, 3, 2229–2238.
  26. Smith, M.A.; Fisher, B.L.; Hebert, P.D.N. DNA barcoding for effective biodiversity assessment of a hyperdiverse arthropod group: The ants of Madagascar. Philos. Trans. R. Soc. B-Biol. Sci. 2005, 360, 1825–1834.
  27. Magoga, G.; Fontaneto, D.; Montagna, M. Factors affecting the efficiency of molecular species delimitation in a species-rich insect family. Mol. Ecol. Resour. 2021, 21, 1475–1489.
  28. Reid, N.M.; Carstens, B.C. Phylogenetic estimation error can decrease the accuracy of species delimitation: A Bayesian implementation of the general mixed Yule-coalescent model. BMC Evol. Biol. 2012, 12, 196.
  29. Kekkonen, M.; Hebert, P.D.N. DNA barcode-based delineation of putative species: Efficient start for taxonomic workflows. Mol. Ecol. Resour. 2014, 14, 706–715.
  30. Goldstein, P.Z.; DeSalle, R. Integrating DNA barcode data and taxonomic practice: Determination, discovery, and description. Bioessays 2011, 33, 135–147.
  31. Puillandre, N.; Brouillet, S.; Achaz, G. ASAP: Assemble species by automatic partitioning. Mol. Ecol. Resour. 2021, 21, 609–620.
  32. Ahrens, D.; Fujisawa, T.; Krammer, H.-J.; Eberle, J.; Fabrizi, S.; Vogler, A.P. Rarity and Incomplete Sampling in DNA-Based Species Delimitation. Syst. Biol. 2016, 65, 478–494.
  33. Pons, J.; Barraclough, T.G.; Gomez-Zurita, J.; Cardoso, A.; Duran, D.P.; Hazell, S.; Kamoun, S.; Sumlin, W.D.; Vogler, A.P. Sequence-based species delimitation for the DNA taxonomy of undescribed insects. Syst. Biol. 2006, 55, 595–609.
  34. Zhang, J.; Kapli, P.; Pavlidis, P.; Stamatakis, A. A general species delimitation method with applications to phylogenetic placements. Bioinformatics 2013, 29, 2869–2876.
  35. Barraclough, T.G.; Hughes, M.; Ashford-Hodges, N.; Fujisawa, T. Inferring evolutionarily significant units of bacterial diversity from broad environmental surveys of single-locus data. Biol. Lett. 2009, 5, 425–428.
  36. Papadopoulou, A.; Anastasiou, I.; Spagopoulou, F.; Stalimerou, M.; Terzopoulou, S.; Legakis, A.; Vogler, A.P. Testing the species—Genetic diversity correlation in the Aegean archipelago: Toward a haplotype-based macroecology? Am. Nat. 2011, 178, 241–255.
  37. Dellicour, S.; Flot, J.-F. Delimiting Species-Poor Data Sets using Single Molecular Markers: A Study of Barcode Gaps, Haplowebs and GMYC. Syst. Biol. 2015, 64, 900–908.
  38. Monaghan, M.T.; Wild, R.; Elliot, M.; Fujisawa, T.; Balke, M.; Inward, D.J.G.; Lees, D.C.; Ranaivosolo, R.; Eggleton, P.; Barraclough, T.G.; et al. Accelerated Species Inventory on Madagascar Using Coalescent-Based Models of Species Delineation. Syst. Biol. 2009, 58, 298–311.
  39. Tang, C.Q.; Humphreys, A.M.; Fontaneto, D.; Barraclough, T.G. Effects of phylogenetic reconstruction method on the robustness of species delimitation using single-locus data. Methods Ecol. Evol. 2014, 5, 1086–1094.
  40. Kapli, P.; Lutteropp, S.; Zhang, J.; Kobert, K.; Pavlidis, P.; Stamatakis, A.; Flouri, T. Multi-rate Poisson tree processes for single-locus species delimitation under maximum likelihood and Markov chain Monte Carlo. Bioinformatics 2017, 33, 1630–1638.
  41. Pentinsaari, M.; Vos, R.; Mutanen, M. Algorithmic single-locus species delimitation: Effects of sampling effort, variation and nonmonophyly in four methods and 1870 species of beetles. Mol. Ecol. Resour. 2017, 17, 393–404.
  42. da Silva, R.; Peloso, P.L.V.; Sturaro, M.J.; Veneza, I.; Sampaio, I.; Schneider, H.; Gomes, G. Comparative analyses of species delimitation methods with molecular data in snappers (Perciformes: Lutjaninae). Mitochondrial DNA Part A 2018, 29, 1108–1114.
  43. Nantarat, N.; Sutcharit, C.; Tongkerd, P.; Wade, C.M.; Naggs, F.; Panha, S. Phylogenetics and species delimitations of the operculated land snail Cyclophorus volvulus (Gastropoda: Cyclophoridae) reveal cryptic diversity and new species in Thailand. Sci. Rep. 2019, 9, 7041.
  44. Rodrigues, B.L.; Baton, L.A.; Shimabukuro, P.H.F. Single-locus DNA barcoding and species delimitation of the sandfly subgenus Evandromyia (Aldamyia). Med. Vet. Entomol. 2020, 34, 420–431.
  45. Padula, V.; Bahia, J.; Stoeger, I.; Camacho-Garcia, Y.; Malaquias, M.A.E.; Lucas Cervera, J.; Schroedl, M. A test of color-based taxonomy in nudibranchs: Molecular phylogeny and species delimitation of the Felimida clenchi (Mollusca: Chromodorididae) species complex. Mol. Phylogen. Evol. 2016, 103, 215–229.
  46. Knutson, V.L.; Gosliner, T.M. The first phylogenetic and species delimitation study of the nudibranch genus Gymnodoris reveals high species diversity (Gastropoda: Nudibranchia). Mol. Phylogen. Evol. 2022, 171, 107470.
  47. Geiger, D.; Nützel, A.; Sasaki, T. Vetigastropoda. In Phylogeny and Evolution of the Mollusca; University of California Press: Berkeley, CA, USA, 2008; pp. 297–330.
  48. Guo, E.; Yang, Y.; Kong, L.; Yu, H.; Liu, S.; Liu, Z.; Li, Q. Mitogenomic phylogeny of Trochoidea (Gastropoda: Vetigastropoda): New insights from increased complete genomes. Zool. Scr. 2020, 50, 43–57.
  49. Yamazaki, D.; Hirano, T.; Uchida, S.; Miura, O.; Chiba, S. Relationship between contrasting morphotypes and the phylogeny of the marine gastropod genus Tegula (Vetigastropoda: Tegulidae) in East Asia. J. Molluscan Stud. 2019, 85, 24–34.
  50. Ran, K.; Li, Q.; Qi, L.; Li, W.; Kong, L. DNA barcoding for identification of marine gastropod species from Hainan island, China. Fish. Res. 2020, 225, 105504.
  51. Folmer, O.; Black, M.; Hoeh, W.; Lutz, R.; Vrijenhoek, R. DNA primers for amplification of mitochondrial cytochrome c oxidase subunit I from diverse metazoan invertebrates. Mol. Mar. Biol. Biotechnol. 1994, 3, 294–299.
  52. Swindell, S.R.; Plasterer, T.N. SEQMAN: Contig assembly. In Methods in Molecular Biology; Sequence Data Analysis Guidebook; Swindell, S.R., Ed.; Springer: Berlin/Heidelberg, Germany, 1997; Volume 70, pp. 75–89.
  53. Katoh, K.; Standley, D.M. MAFFT Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability. Mol. Biol. Evol. 2013, 30, 772–780.
  54. Capella-Gutierrez, S.; Silla-Martinez, J.M.; Gabaldon, T. trimAl: A tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 2009, 25, 1972–1973.
  55. Glez-Pena, D.; Gomez-Blanco, D.; Reboiro-Jato, M.; Fdez-Riverola, F.; Posada, D. ALTER: Program-oriented conversion of DNA and protein alignments. Nucleic Acids Res. 2010, 38, W14–W18.
  56. Talavera, G.; Dinca, V.; Vila, R. Factors affecting species delimitations with the GMYC model: Insights from a butterfly survey. Methods Ecol. Evol. 2013, 4, 1101–1110.
  57. Brown, S.D.J.; Collins, R.A.; Boyer, S.; Lefort, M.-C.; Malumbres-Olarte, J.; Vink, C.J.; Cruickshank, R.H. Spider: An R package for the analysis of species identity and evolution, with particular reference to DNA barcoding. Mol. Ecol. Resour. 2012, 12, 562–565.
  58. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2019.
  59. Kumar, S.; Stecher, G.; Li, M.; Knyaz, C.; Tamura, K. MEGA X: Molecular Evolutionary Genetics Analysis across Computing Platforms. Mol. Biol. Evol. 2018, 35, 1547–1549.
  60. Stamatakis, A. RAxML version 8: A tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 2014, 30, 1312–1313.
  61. Bouckaert, R.; Vaughan, T.G.; Barido-Sottani, J.; Duchene, S.; Fourment, M.; Gavryushkina, A.; Heled, J.; Jones, G.; Kuehnert, D.; De Maio, N.; et al. BEAST 2.5: An advanced software platform for Bayesian evolutionary analysis. PLoS Comput. Biol. 2019, 15, e1006650.
  62. Miller, M.A.; Pfeiffer, W.; Schwartz, T. The CIPRES science gateway: A community resource for phylogenetic analyses. In Proceedings of the 2011 TeraGrid Conference: Extreme Digital Discovery, Salt Lake City, UT, USA, 18–21 July 2011; pp. 1–8.
  63. Rambaut, A.; Drummond, A.J.; Xie, D.; Baele, G.; Suchard, M.A. Posterior Summarization in Bayesian Phylogenetics Using Tracer 1.7. Syst. Biol. 2018, 67, 901–904.
  64. Rambaut, A. Figtree Version 1.4.0. 2015. Available online: http://tree.bio.ed.ac.uk/software/figtree/ (accessed on 10 January 2022).
  65. Kimura, M. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 1980, 16, 111–120.
  66. Ezard, T.; Fujisawa, T.; Barraclough, T.G. Splits: Species’ limits by threshold statistics. R Package Version 2009, 1, r29.
  67. Dellicour, S.; Flot, J.-F. The hitchhiker’s guide to single-locus species delimitation. Mol. Ecol. Resour. 2018, 18, 1234–1246.
  68. Miralles, A.; Vences, M. New Metrics for Comparison of Taxonomies Reveal Striking Discrepancies among Species Delimitation Methods in Madascincus Lizards. PLoS ONE 2013, 8, e68242.
  69. Renner, M.A.M.; Heslewood, M.M.; Patzak, S.D.F.; Schaefer-Verwimp, A.; Heinrichs, J. By how much do we underestimate species diversity of liverworts using morphological evidence? An example from Australasian Plagiochila (Plagiochilaceae: Jungermanniopsida). Mol. Phylogen. Evol. 2017, 107, 576–593.
  70. Esselstyn, J.A.; Evans, B.J.; Sedlock, J.L.; Khan, F.A.A.; Heaney, L.R. Single-locus species delimitation: A test of the mixed Yule-coalescent model, with an empirical application to Philippine round-leaf bats. Proc. R. Soc. B-Biol. Sci. 2012, 279, 3678–3686.
  71. Luo, A.; Ling, C.; Ho, S.Y.W.; Zhu, C.-D. Comparison of Methods for Molecular Species Delimitation across a Range of Speciation Scenarios. Syst. Biol. 2018, 67, 830–846.
  72. Okutani, T. Marine Mollusks in Japan; Tokai University Press: Tokyo, Japan, 2017; Volume 2.
  73. Castelin, M.; Williams, S.T.; Buge, B.; Maestrati, P.; Lambourdiere, J.; Ozawa, T.; Utge, J.; Couloux, A.; Alf, A.; Samadi, S. Untangling species identity in gastropods with polymorphic shells in the genus Bolma Risso, 1826 (Mollusca, Vetigastropoda). Eur. J. Taxon. 2017, 288, 1–21.
  74. Yamazaki, D.; Miura, O.; Ikeda, M.; Kijima, A.; Do Van, T.; Sasaki, T.; Chiba, S. Genetic diversification of intertidal gastropoda in an archipelago: The effects of islands, oceanic currents, and ecology. Mar. Biol. 2017, 164, 184.
  75. Dillon, R.T., Jr.; Robinson, J.D. The snails the dinosaurs saw: Are the pleurocerid populations of the Older Appalachians a relict of the Paleozoic Era? J. N. Am. Benthol. Soc. 2009, 28, 1–11.
Figure 1. Datasets analyzed, species delimitation methods, and the categorization of results in the study. “n=” refers to the total number of datasets. The red circle represents nominal species and the blue circle represents the MOTUs.
Figure 2. Efficiency of the species delimitation methods applied to the 37 analyzed datasets. The y-axis is the mean percentage of matches (orange), splits (green), merges (purple), and mixtures (yellow) observed for each approach. The primary partitions of ABGD and the partition closest to the delimitation of nominal species in ASAP were considered as the final results.
Figure 3. The y-axis represents match percentage observed for each method in three taxonomic levels (genus, subfamily, and family). The square in the bar represents the average of the data; the horizontal line in the bar represents the median of the data; the lines parallel to the columns are the lines between the maximum and minimum values in each set of data.
Figure 4. Comparison of species delimitation results of Turbo based on analysis of COI sequences of 98 individuals from 28 different nominal species and two unclassified clades (Turbo sp. A and Turbo sp. B) highlighted in red, with lineage assignments from the four tree-based (PTP, mPTP, sGMYC, mGMYC) and three distance-based (ad hoc threshold, ABGD, ASAP) methods. Each colored bar represents a species delimited by each method tested and gray bars represent nominal species. Gene tree is from a BEAST analysis and MCC tree is shown. Node values represent Bayesian posterior probabilities (≥0.95) for major clades.
Figure 5. Comparison of species delimitation results of Tegula based on analysis of COI sequences of 149 individuals from 12 different nominal species, with lineage assignments from the four tree-based (PTP, mPTP, sGMYC, mGMYC) and three distance-based (ad hoc threshold, ABGD, ASAP) methods. Each colored bar represents a species delimited by each method tested and gray bars represent nominal species. Gene tree is from a BEAST analysis and MCC tree is shown. Node values represent Bayesian posterior probabilities (≥0.95) for major clades.
Figure 6. Comparison of species delimitation results of Monodonta based on analysis of COI sequences of 112 individuals from six different nominal species, with lineage assignments from the four tree-based (PTP, mPTP, sGMYC, mGMYC) and three distance-based (ad hoc threshold, ABGD, ASAP) methods. Each colored bar represents a species delimited by each method tested and gray bars represent nominal species. Gene tree is from a BEAST analysis and MCC tree is shown. Node values represent Bayesian posterior probabilities (≥0.95) for major clades. The red branch represents one individual of M. confusa.
Figure 7. Comparison of species delimitation results of Bolma based on analysis of COI sequences of 91 individuals from 14 different nominal species, with lineage assignments from the four tree-based (PTP, mPTP, sGMYC, mGMYC) and three distance-based (ad hoc threshold, ABGD, ASAP) methods. Each colored bar represents a species delimited by each method tested and gray bars represent nominal species. Gene tree is from a BEAST analysis and MCC tree is shown. Node values represent Bayesian posterior probabilities (≥0.95) for major clades.
Table 1. The number of species and sequences contained in 40 datasets.
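| Dataset | Taxonomic Rank | Number of Nominal Species | Number of Sequences | Number of Haplotypes |
|---|---|---|---|---|
| Astralium | Genus | 12 | 21 | 21 |
| Austrocochlea | Genus | 7 | 58 | 32 |
| Bolma | Genus | 14 | 166 | 91 |
| Calliostoma | Genus | 11 | 58 | 27 |
| Cantharidus | Genus | 7 | 23 | 14 |
| Clanculus | Genus | 6 | 8 | 7 |
| Diloma | Genus | 10 | 18 | 17 |
| Gibbula | Genus | 6 | 44 | 40 |
| Homalopoma | Genus | 6 | 19 | 10 |
| Jujubinus | Genus | 4 | 5 | 5 |
| Lirularia | Genus | 3 | 6 | 6 |
| Lunella | Genus | 10 | 80 | 42 |
| Micrelenchus | Genus | 6 | 14 | 12 |
| Monodonta | Genus | 6 | 198 | 112 |
| Phorcus | Genus | 8 | 93 | 61 |
| Steromphala | Genus | 7 | 54 | 37 |
| Tectus | Genus | 3 | 5 | 4 |
| Tegula | Genus | 12 | 149 | 107 |
| Trochus | Genus | 3 | 19 | 7 |
| Turbo | Genus | 28 | 76 | 59 |
| Umbonium | Genus | 5 | 12 | 10 |
| Cantharidinae | Subfamily | 54 | 254 | 193 |
| Fossarininae | Subfamily | 6 | 9 | 7 |
| Monodontinae | Subfamily | 22 | 274 | 160 |
| Stomatellinae | Subfamily | 6 | 11 | 9 |
| Trochinae | Subfamily | 16 | 37 | 23 |
| Turbininae | Subfamily | 80 | 375 | 226 |
| Umboniinae | Subfamily | 18 | 32 | 25 |
| Angariidae | Family | 2 | 3 | 2 |
| Areneidae | Family | 1 | 1 | 1 |
| Calliostomatidae | Family | 17 | 64 | 33 |
| Colloniidae | Family | 9 | 22 | 13 |
| Liotiidae | Family | 2 | 4 | 3 |
| Margaritidae | Family | 9 | 19 | 12 |
| Phasianellidae | Family | 5 | 9 | 9 |
| Skeneidae | Family | 7 | 7 | 7 |
| Solariellidae | Family | 15 | 45 | 39 |
| Tegulidae | Family | 20 | 159 | 116 |
| Trochidae | Family | 130 | 624 | 411 |
| Turbinidae | Family | 81 | 377 | 228 |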
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Guo, B.; Kong, L. Comparing the Efficiency of Single-Locus Species Delimitation Methods within Trochoidea (Gastropoda: Vetigastropoda). Genes 2022, 13, 2273. https://doi.org/10.3390/genes13122273
