Runs of Homozygosity Analysis Reveals Genomic Diversity and Population Structure of an Indigenous Cattle Breed in Southwest China

Wang, Wei; Shi, Yi; He, Fang; Fang, Donghui; Gan, Jia; Wu, Fuqiu; AG, Yueda; Deng, Xiaodong; Cao, Qi; Duo, Chu; RZ, Wangdeng; Fu, Maozhong; Yi, Jun

doi:10.3390/ani12233239

Open AccessArticle

Runs of Homozygosity Analysis Reveals Genomic Diversity and Population Structure of an Indigenous Cattle Breed in Southwest China

by

Wei Wang

¹,

Yi Shi

¹,

Fang He

¹,

Donghui Fang

¹,

Jia Gan

¹,

Fuqiu Wu

²,

Yueda AG

¹,

Xiaodong Deng

¹,

Qi Cao

²,

Chu Duo

²,

Wangdeng RZ

²,

Maozhong Fu

¹ and

Jun Yi

^1,*

¹

Animal Breeding and Genetics Key Laboratory of Sichuan Province, Sichuan Animal Science Academy, Chengdu 610066, China

²

Jiulong County Agriculture, Animal Husbandry and Science and Technology Bureau, Jiulong 626200, China

^*

Author to whom correspondence should be addressed.

Animals 2022, 12(23), 3239; https://doi.org/10.3390/ani12233239

Submission received: 5 August 2022 / Revised: 6 October 2022 / Accepted: 11 November 2022 / Published: 22 November 2022

(This article belongs to the Collection Advances in Cattle Breeding, Genetics and Genomics)

Download

Browse Figures

Review Reports Versions Notes

Abstract

:

Simple Summary

For the reason that the systematic breeding programs and pedigree records are unavailable in the indigenous livestock breeds, it is hard to evaluate their genetic diversity and population structures using the traditional pedigree-based and demographic approaches. The Xieka cattle are an indigenous breed geographically distributed in the southeastern Sichuan, China, and have the long-term evolutionary adaptation to local subtropical highland environments. To explore efficient programs on genetic resources conservation and utilization, the genetic diversity and population structures of Xieka cattle were investigated in this study using genomic information. Our analyses revealed that this indigenous cattle breed have remained a relatively high degree of genetic diversity and have not suffered from the recently generated inbreeding. Furthermore, some candidate genomic regions and genes were suggested to be likely associated with the diverse production traits in cattle.

Abstract

In aiming to achieve sustainable development goals in the livestock industry, it is becoming increasingly necessary and important for the effective conservation of genetic resources. There are some indigenous cattle breeds in Sichuan, southwest China, for which, however, the genetic diversity and population structures still remain unknown because of the unavailability of systematic breeding programs and pedigree information. Xieka cattle are an indigenous breed locally distributed in southeastern Sichuan and have a long-term evolutionary adaptation to local environments and climates. In this study, we obtained 796,828 single nucleotide polymorphisms (SNPs) through sequencing the genomes of 30 Xieka cattle and used them for analyzing the genetic diversity and runs of homozygosity (ROH). The mean nucleotide diversity was 0.28 and 72% of SNPs were found to be in the heterozygous states. A total of 4377 ROH were detected with even distribution among all autosomes, and 74% of them were lower than 1 Mb in length. Meanwhile, only five ROH were found longer than 5 Mb. We further determined 19 significant genomic regions that were obviously enriched by ROH, in which 35 positional candidate genes were found. Some of these genes have been previously reported to be significantly associated with various production traits in cattle, such as meat quality, carcass performances, and diseases. In conclusion, the relatively high degree of genetic diversity of Xieka cattle was revealed using the genomic information, and the proposed candidate genes will help us optimize the breeding programs regarding this indigenous breed.

Keywords:

Xieka cattle; ROH; SNPs; inbreeding; Sichuan

1. Introduction

Because of the relative low productivity on the economically important traits in context of modern livestock industry, indigenous breeds are facing significantly decreasing population sizes and increasing inbreeding worldwide [1]. However, these gene pools are valuable due to their long-term evolutionary adaptation to the diverse local environments. There have been abundant indigenous cattle genetic resources in Sichuan Province, China, with seven officially recognized breeds (Bos taurus), as well as some unregistered breeds and populations. Wang and colleagues (2018) [2] investigated the genetic diversity among six indigenous cattle breeds in Sichuan using genome-wide single nucleotide polymorphisms (SNPs) that were obtained from the restriction site-associated DNA sequencing (RADseq) approach. However, the genomic diversity and population structures have not been studied yet for these unrecognized cattle breeds officially in Sichuan. The Xieka cattle are an indigenous breed with small body size, which have the average adult live weights of 370 Kg and 280 Kg for males and females, respectively (according to our field investigation). It is estimated that the current population size of Xieka cattle is ~15,000 having been mostly distributed in Jiulong county, which is geographically located in southeastern Ganzi Tibetan Autonomous Prefecture, Sichuan Province, China.

Using the genome-wide short tandem-repeat polymorphisms, Broman and Weber (1999) [3] first found the prevalent occurrences of long homozygous chromosomal segments in humans, and these homozygous segments were termed as the runs of homozygosity (ROH) in the accompanying editorial comments [4]. However, the relevant studies of ROH had not been comprehensively carried out until the presences of high-throughput genotyping technologies, including the second-generation genome sequencing approaches and SNP assays [5]. For the reason that ROH are theoretically mainly derived from parental inbreeding, they have been extensively involved in studying both population structures and demographic history in human and livestock [6,7]. Notably, Curik et al. (2014) [8] reviewed the application of ROH analysis for estimating inbreeding at the individual and population levels and compared different relevant population statistics. The genome sequence variants and pedigree data in Holstein cattle were comprehensively evaluated for different measures of inbreeding [9]. In domestic cattle, genome-wide ROH was first investigated using 777,962 SNPs among nine worldwide modern breeds [10], which provided an excellent research framework in relation to ROH in livestock. Subsequently, the ROH analyses of cattle were often reported, for instance, in the U.S. Holstein cattle regarding the artificial selection signatures [11] and bull fertility [12], and in Polish and Chinese beef cattle associated with the beef production traits [13,14].

Aiming to understand the current genetic landscape of the gene pool of Xieka cattle, in this study, we employed the genome resequencing data for ROH analyses and further investigating the genomic diversity and population structure regarding this indigenous breed in China.

2. Materials and Methods

2.1. Ethics Statement

All blood samples involved in this study were collected by veterinarians at the annual health inspection, which means that no ethical approval is required.

2.2. Animals and Sample Collection

A total of 30 adult Xieka cattle were collected in this study, consisting of 12 males and 18 females. To guarantee our sampling as being representative as possible, we recruited the candidate animals according to the two considerations. First, these sampled animals were separately raised on rural farmers and did not have known pedigree relationships with each other. Second, individual morphological characteristics were carefully checked to avoid possible hybrid offspring most likely from the exotic cattle breeds. Blood samples from the external jugular vein were collected and used for the genome sequencing.

2.3. Genomic DNA and Sequencing

Genomic DNA was extracted using the Axy-Prep Genomic DNA Miniprep Kit (Axygen Bioscience, Tewksbury, MA, USA). The paired-end sequencing libraries with 350 bp in length were constructed according to Illumina’s protocol (Illumina, San Diego, CA, USA). In brief, 0.5 μg of genomic DNA was fragmented, end-paired, and ligated to adaptors, respectively. The ligated fragments were subsequently fractionated on the agarose gels and purified by PCR amplification to produce sequencing libraries. The successfully constructed libraries were sequenced on the Illumina HiSeq platform and the 150 bp paired-end reads were finally generated (Novogene Co., Ltd., Beijing, China).

2.4. SNP Genotyping and Quality Controls

The raw sequencing reads in FASTQ format were subjected to quality filtering to discard low-quality reads using the fastp software v0.23.2 [15], which were categorized into one of the following types: (i) reads contaminated by adaptor sequences, (ii) reads containing unambiguous bases of N more than 10% of total length, and (iii) reads containing low-quality bases (i.e., the reported Quality value of base <5) more than 50% of the total length. If any member of the paired reads was marked as low quality, both pairs were discarded. After these steps, we obtained clean reads and mapped them to cattle reference genome (ARS-UCD1.2) using BWA mapper v0.7.17 with the default parameters [16]. Subsequently, we employed GATK toolkit v4.2.5.0 [17] for the SNP discovery and genotyping across all samples according to the GATK Best Practices recommendations [18], in which the duplicate removal, InDel realignment and hard filtering algorithms were performed with the default parameters.

After obtaining raw SNPs, we first extracted these biallelic SNPs with Quality value > 20, coverage depth > 3, and being located on the 29 autosomes using VCFtools v0.1.16 [19]. Second, these SNPs were further subjected to population-based quality filtering using PLINK software v1.9 [20], as requiring the minor allele frequency (MAF) > 0.05, missing genotype rate < 0.1, and no significant deviation from Hard-Weinberg equilibrium (HWE, p > 10⁻⁶). Finally, the missing genotypes were also imputed using Beagle software v5.3 with the default parameters [21].

2.5. Genetic Diversity and Runs of Homozygosity

First, we investigated the genomic distribution regarding all clean SNPs and calculated three genetic diversity statistics, including the MAF, or nucleotide diversity on a per-site basis [

= \frac{n}{n - 1} (1 - \sum x_{i}^{2})

, where

n

is the total number of sequences and

x_{i}

is the observed frequency of

i

allele], and observed heterozygosity using VCFtools v0.1.16 [19]; meanwhile, we calculated the pairwise individual relatedness based on the method of Yang et al. (2010) [22]. Second, we detected the genomic ROH using the detectRUNS (v0.9.6) R package [23]. The individual homozygous segments were determined in every animal using a sliding window of 50 SNPs, in which no more than one heterozygous SNPs and five missing SNPs were allowed. Subsequently, an effective ROH was defined by requiring the minimum number of 100 SNPs contained, at least 500 Kb in length, the minimum SNP density of 1 SNP per 50 Kb, the minimum proportion of homozygous overlap window of 0.05, and the maximum gap between continuous homozygous SNPs, which was 100 Kb. The significant genomic regions were determined if they were simultaneously coveraged by ROH among more than 30% of all samples.

2.6. Functional Analysis

Within the significant genomic regions, the annotated functional genes were explored using biomaRt R package [24]. The ARS-UCD1.2 assembly was used as reference genome. Regarding these candidate genes, functional enrichment analyses were conducted using the DAVID web tool (accessed on June 15, 2022) [25] regarding the Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The default Benjamini-Hochberg method was used for computing p values with the threshold of 0.05.

3. Results

3.1. SNPs and Genetic Diversity

We obtained a total of 1711 million raw paired sequencing reads, from which 1704 million clean reads (with the mean of 56.8 million per sample) were finally generated after quality filtering (Table S1 in Supplementary Materials). Against the reference genome, average mapping rate was 99.5% for these clean reads; and, by which, the 96.9% and 57.2% of genome sequences were covered by at least 1X and 4X sequencing reads, respectively (Table S2). A total of 48,699,312 raw SNPs were initially obtained, which resulted into 796,828 clean SNPs after being subjected to our custom processing steps. These high-quality SNPs were evenly distributed among the 29 autosomes (Figure 1A), with a transition/transversion ratio of 2.25.

Regarding the three statistics of genetic diversity that were calculated by SNPs, their statistical distributions are shown in Figure 1B. Among all SNPs, the mean and median of MAF were 0.19 and 0.15, respectively; meanwhile, the nucleotide diversity ranged from 0.09 to 0.51 (with the mean of 0.28). Furthermore, on average 72% (ranging from 68% to 76%) of SNPs were found to be in the heterozygous states among all the 30 samples. Among these animals, only a small proportion of pairwise comparisons showed the relatively high genetic relatedness (Figure 1C), and all of which had the mean and median of −0.03 and −0.05, respectively.

3.2. Genomic Patterns of Runs of Homozygosity

A total of 4377 ROH were detected among the 30 individuals, and all of them were evenly distributed across the 29 autosomes (Figure 1D). The highest and lowest numbers of ROH were observed in the chromosome BTA1 (n = 282) and BTA29 (n = 41), respectively. On the whole, there were 3226 ROH with length of 0.5–1 Mb (74%), 970 ROH of 1–2 Mb (22%), 176 ROH of 2–5 Mb (4%), and five ROH > 5 Mb. Within each individual, the length distributions of ROH are shown in Figure 2, which revealed the observable differences on their numbers and lengths of ROH; the mean (±standard deviation) of total ROH lengths was 132.9 ± 81.9 Mb, and whose proportions present in the genome (i.e., the ROH-based inbreeding coefficients) individually ranged from 1.6% to 16.7% (mean = 5.3%). Furthermore, the five longest ROH (>5 Mb) were found in five different animals (Figure 2).

There were 1263 SNPs being located within the ROH that were simultaneously found in more than 30% of all individuals. These SNPs were further clustered into 19 significant genomic regions that have been distributed among nine chromosomes, including the BTA2 (n = 2), BTA6 (n = 2), BTA9 (n = 2), BTA11 (n = 1), BTA15 (n = 1), BTA16 (n = 2), BTA17 (n = 1), BTA21 (n = 6), and BTA27 (n = 2), respectively (Table 1). Among them, the four longest genomic regions were found on both BTA15 (895 Kb in length) and BTA21 (703 Kb, 610 Kb, and 606 Kb in length).

3.3. Genes within the Significant Genomic Regions

Within these significant genomic regions, we found a total of 34 protein-coding and one microRNA genes (Table 1 and Table S3), and these genes were mainly distributed on the three chromosomes of BTA27, BTA21, and BTA11, respectively. Based on the functional enrichment analyses regarding these candidate genes, four GO terms (GO:0005887~integral component of plasma membrane, GO:0009986~cell surface, GO:0043005~neuron projection, and GO:0005783~endoplasmic reticulum) and one KEGG pathway (bta04080:Neuroactive ligand-receptor interaction) were revealed.

4. Discussion

Jiulong county has subtropical highland climates with annual mean temperature of 9.1–17.5 °C [26]. Meanwhile, this region has been relatively isolated because of geography and poor transportation, especially in these rural areas. Accordingly, we believe these local genetic resources of domestic cattle, as well as other livestock, would be valuable for achieving sustainable development goals in future. The Xieka cattle are an indigenous breed but have not been officially registered yet. Understanding genetic diversity and population structures is necessary for establishing efficient breeding and conservation programs. Unfortunately, it is difficult or impossible in such an indigenous breed to obtain the accurate pedigree information, which has restricted the possible application of traditional approaches [27]. In this study, therefore, we obtained genome-wide SNPs in Xieka cattle and used them for evaluating genomic diversity and population structure. To the best of our knowledge, this is the first genetic study in Xieka cattle using the genomic information.

Using genome-wide SNPs generated from RADseq approach, Wang et al. (2018) [2] analyzed six officially registered indigenous cattle breeds in Sichuan and found their mean nucleotide diversity was 0.19 with the ranges from 0.26 in Ganzi cattle to 0.31 in Pingwu cattle. In this study, we similarly found the comparable nucleotide diversity with a mean of 0.28, and most SNPs were in the heterozygous states. Compared with previous reports of ROH in cattle [10,13], it seems that a smaller number of ROH were detected in this study; however, the total numbers and distribution of length sizes could not be directly compared due to differences on sample sizes and ROH definitions among the different studies. According to the total length of ROH, the estimated inbreeding coefficients in Xieka cattle (5.3%) are lower than the previous relevant estimates in other cattle breeds, such as in U.S. Holstein cattle [12], and in Hereford, Montbeliarde, and others [13]. These results indicate that Xieka cattle have retained the considerable genetic diversity despite its relatively small population size in comparison with the geographically adjacent cattle breeds. Furthermore, more than 70% of the detected ROH in Xieka cattle have length less than 1 Mb and only a few ROH were longer than 5 Mb, which suggests the low inbreeding levels and also the absence of recent inbreeding. Taken together, our genomic analyses confidently revealed that the Xieka cattle have not suffered from the serious losses of genetic diversity and obvious inbreeding.

Based on the ROH analysis, nine genomic regions were found to be significantly enriched in low-fertility when compared with high-fertility bulls [12]. In Chinese Simmental cattle, Zhao et al. (2021) [14] conducted association analyses between the enriched genomic regions of ROH and beef production traits and found many biologically meaningful candidate genes. In eight different livestock and pet species, Gorssen et al. (2021) [28] conducted a comprehensive ROH analysis and made the results publicly available for finding potential artificial or natural selection signatures and candidate functional genes. Therefore, ROH analyses are becoming a state-of-the-art approach for exploring the population structures and mapping the critical genomic regions and candidate genes associated with the economically important traits in cattle and other livestock species. In this study, we found some significant genomic regions, especially located on the BTA21 and BTA15, that were obviously enriched by ROH. Furthermore, four GO terms and one KEGG pathway were revealed to be associated with the candidate genes that are located in these significant genomic regions. Among them, the neuroactive ligand-receptor interaction pathway was recently reported to be significantly associated with heat stress in Australian Holsteins, which may participate in maintaining metabolic homeostasis in cattle during thermal stress [29].

We further conducted literature searches for investigating the possible biological functions of candidate genes revealed by ROH analysis in this study. Among them, the CORIN (corin serine pepsidase) gene, a member of the trypsin superfamily located on BTA6, was reported to be associated with meat colour traits in Nellore cattle [30]. The SPAST (spastin) gene was proposed to be the affected gene of spinal dysmielination disease in cattle [31]. In humans, the genetic variants of TNFSF4 (TNF superfamily member 4) gene were significantly associated with the primary Sjögren’s syndrome [32]. In Original Braunvieh cattle, the TMEM201 (transmembrane protein 201) gene was suggested to be related to selection signatures [33]. Buchanan et al. (2016) [34] analyzed the triacylglycerol and phospholipid fatty acid fractions in angus cattle and reported INPP4B (inositol polyphosphate-4-phosphatase type II B) as a candidate gene. Other functional genes revealed in this study were also previously reported to be associated with diverse production traits, including the RCN2 (reticulocalbin 2), PSTPIP1 (proline-serine-threonine phosphatase interacting protein 1), ADRB3 (adrenoceptor beta 3) genes with carcass traits in Hanwoo and Qinchuan cattle [35,36], GABRB3 (gamma-aminobutyric acid type A receptor subunit beta3) and GABRB5 (gamma-aminobutyric acid type A receptor subunit alpha5) genes with temperament in beef cattle [37], ZNF703 (zinc finger protein 703) and ERLIN2 (ER lipid raft associated 2) genes with average daily gain in Nellore and with residual concentrate intake in Holstein [38], and GOT1L1 (Glutamic-oxaloacetic transaminase 1 like 1) as a cell indicator of stress in the cattle semen [39]. Interestingly, we also mapped a microRNA gene of bta-mir-2400 that was experimentally revealed for affecting the preadipocytes proliferation and differentiation in yak [40], regulating skeletal muscle satellite cells proliferation in Chinese Simmental cattle [41]. On the whole, the current literature evidence confirms that the revealed significant genomic regions are likely associated with the local environmental adaptations and other production traits in Xieka cattle.

5. Conclusions

In this study, we employed the genomic information and ROH analyses for investigating genetic diversity and population structure regarding an indigenous cattle breed that are geographically distributed in Southwest China. Our results revealed the comparable genetic diversity to other geographically adjacent indigenous cattle breeds. Furthermore, we determined the significant genomic regions that are enriched by ROH, in which some known and novel genes were suggested to be associated with production traits in cattle.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ani12233239/s1, Table S1: Sequencing results and quality filtering; Table S2: Mapping results of clean reads against reference genome; Table S3: The related information of candidate genes.

Author Contributions

Conceptualization: W.W. and J.Y.; Data analysis: W.W., Y.S., F.H., D.F., J.G. and J.Y.; Resources, F.W., Y.A.G., X.D., Q.C., C.D., W.R.Z., M.F. and J.Y.; Writing: W.W., Y.S. and J.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This study was funded by the Sichuan Province Science and Technology Planning Project (2021YFYZ0007 and 2021YFYZ0001), and Sichuan Beef Cattle Innovation Team of Modern Agricultural Technology System (SCCXTD-2022-13).

Institutional Review Board Statement

Ethical review and approval were waived for this study due to the involved blood samples were collected by veterinarians at the annual health inspection.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data supporting this study are included in the article and in the Supplementary Materials.

Conflicts of Interest

The authors declare no conflict of interest.

References

Kristensen, T.N.; Hoffmann, A.A.; Pertoldi, C.; Stronen, A.V. What can livestock breeders learn from conservation genetics and vice versa? Front. Genet. 2015, 6, 38. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Wang, W.; Gan, J.; Fang, D.; Tang, H.; Wang, H.; Yi, J.; Fu, M. Genome-wide SNP discovery and evaluation of genetic diversity among six Chinese indigenous cattle breeds in Sichuan. PLoS ONE 2018, 13, e0201534. [Google Scholar] [CrossRef] [PubMed]
Broman, K.W.; Weber, J.L. Long homozygous chromosomal segments in reference families from the centre d’Etude du polymorphisme humain. Am. J. Hum. Genet. 1999, 65, 1493–1500. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Clark, A.G. The size distribution of homozygous segments in the human genome. Am. J. Hum. Genet. 1999, 65, 1489–1492. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Reuter, J.A.; Spacek, D.V.; Snyder, M.P. High-throughput sequencing technologies. Mol. Cell 2015, 58, 586–597. [Google Scholar] [CrossRef] [Green Version]
Peripolli, E.; Munari, D.P.; Silva, M.V.G.B.; Lima, A.L.F.; Irgang, R.; Baldi, F. Runs of homozygosity: Current knowledge and applications in livestock. Anim. Genet. 2017, 48, 255–271. [Google Scholar] [CrossRef]
Ceballos, F.C.; Joshi, P.K.; Clark, D.W.; Ramsay, M.; Wilson, J.F. Runs of homozygosity: Windows into population history and trait architecture. Nat. Rev. Genet. 2018, 19, 220–234. [Google Scholar] [CrossRef]
Curik, I.; Ferenčaković, M.; Sölkner, J. Inbreeding and runs of homozygosity: A possible solution to an old problem. Livest. Sci. 2014, 166, 26–34. [Google Scholar] [CrossRef]
Alemu, S.W.; Kadri, N.K.; Harland, C.; Faux, P.; Charlier, C.; Caballero, A.; Druet, T. An evaluation of inbreeding measures using a whole-genome sequenced cattle pedigree. Heredity 2021, 126, 410–423. [Google Scholar] [CrossRef]
Purfield, D.C.; Berry, D.P.; McParland, S.; Bradley, D.G. Runs of homozygosity and population history in cattle. BMC Genet. 2012, 13, 70. [Google Scholar] [CrossRef]
Kim, E.S.; Cole, J.B.; Huson, H.; Wiggans, G.R.; Van Tassell, C.P.; Crooker, B.A.; Liu, G.; Da, Y.; Sonstegard, T.S. Effect of artificial selection on runs of homozygosity in US Holstein cattle. PLoS ONE 2013, 8, e80813. [Google Scholar] [CrossRef] [Green Version]
Nani, J.P.; Peñagaricano, F. Whole-genome homozygosity mapping reveals candidate regions affecting bull fertility in US Holstein cattle. BMC Genom. 2020, 21, 338. [Google Scholar] [CrossRef]
Szmatoła, T.; Gurgul, A.; Jasielczuk, I.; Ząbek, T.; Ropka-Molik, K.; Litwińczuk, Z.; Bugno-Poniewierska, M. A comprehensive analysis of runs of homozygosity of eleven cattle breeds representing different production types. Animals 2019, 9, 1024. [Google Scholar] [CrossRef] [Green Version]
Zhao, G.; Liu, Y.; Niu, Q.; Zheng, X.; Zhang, T.; Wang, Z.; Xu, L.; Zhu, B.; Gao, X.; Zhang, L.; et al. Runs of homozygosity analysis reveals consensus homozygous regions affecting production traits in Chinese Simmental beef cattle. BMC Genom. 2021, 22, 678. [Google Scholar] [CrossRef]
Chen, S.; Zhou, Y.; Chen, Y.; Gu, J. fastp: An ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 2018, 34, i884–i890. [Google Scholar] [CrossRef]
Li, H.; Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 2009, 25, 1754–1760. [Google Scholar] [CrossRef] [Green Version]
McKenna, A.; Hanna, M.; Banks, E.; Sivachenko, A.; Cibulskis, K.; Kernytsky, A.; Garimella, K.; Altshuler, D.; Gabriel, S.; Daly, M.; et al. The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010, 20, 1297–1303. [Google Scholar] [CrossRef] [Green Version]
Van der Auwera, G.A.; Carneiro, M.O.; Hartl, C.; Poplin, R.; Del Angel, G.; Levy-Moonshine, A.; Jordan, T.; Shakir, K.; Roazen, D.; Thibault, J.; et al. From FastQ data to high confidence variant calls: The Genome Analysis Toolkit best practices pipeline. Curr. Protoc. Bioinform. 2013, 43, 11.10.1–11.10.33. [Google Scholar] [CrossRef] [Green Version]
Danecek, P.; Auton, A.; Abecasis, G.; Albers, C.A.; Banks, E.; DePristo, M.A.; Handsaker, R.E.; Lunter, G.; Marth, G.T.; Sherry, S.T.; et al. The variant call format and VCFtools. Bioinformatics 2011, 27, 2156–2158. [Google Scholar] [CrossRef]
Purcell, S.; Neale, B.; Todd-Brown, K.; Thomas, L.; Ferreira, M.A.; Bender, D.; Maller, J.; Sklar, P.; De Bakker, P.I.; Daly, M.J.; et al. PLINK: A tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 2007, 81, 559–575. [Google Scholar] [CrossRef]
Browning, S.R.; Browning, B.L. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am. J. Hum. Genet. 2007, 81, 1084–1097. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Yang, J.; Benyamin, B.; McEvoy, B.P.; Gordon, S.; Henders, A.K.; Nyholt, D.R.; Madden, P.A.; Heath, A.C.; Martin, N.G.; Montgomery, G.W.; et al. Common SNPs explain a large proportion of the heritability for human height. Nat. Genet. 2010, 42, 565–569. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Biscarini, F.; Cozzi, P.; Gaspa, G.; Marras, G. detectRUNS: Detect Runs of Homozygosity and Runs of Heterozygosity in Diploid Genomes. 2019. R Package Version 0.9.6. Available online: https://CRAN.R-project.org/package=detectRUNS (accessed on 20 May 2022).
Smedley, D.; Haider, S.; Durinck, S.; Pandini, L.; Provero, P.; Allen, J.; Arnaiz, O.; Awedh, M.H.; Baldock, R.; Barbiera, G.; et al. The BioMart community portal: An innovative alternative to large, centralized data repositories. Nucleic Acids Res. 2015, 43, W589–W598. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Huang, D.W.; Sherman, B.T.; Lempicki, R.A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 2009, 4, 44–57. [Google Scholar] [CrossRef] [PubMed]
Tang, Y.; Xu, G.; Wan, R.; Wang, X.; Wang, J.; Li, P. Atmospheric Thermal and Dynamic Vertical Structures of Summer Hourly Precipitation in Jiulong of the Tibetan Plateau. Atmosphere 2021, 12, 505. [Google Scholar] [CrossRef]
Calboli, F.C.; Sampson, J.; Fretwell, N.; Balding, D.J. Population structure and inbreeding from pedigree analysis of purebred dogs. Genetics 2008, 179, 593–601. [Google Scholar] [CrossRef] [Green Version]
Gorssen, W.; Meyermans, R.; Janssens, S.; Buys, N. A publicly available repository of ROH islands reveals signatures of selection in different livestock and pet species. Genet. Sel. Evol. 2021, 53, 2. [Google Scholar] [CrossRef]
Cheruiyot, E.K.; Haile-Mariam, M.; Cocks, B.G.; MacLeod, I.M.; Xiang, R.; Pryce, J.E. New loci and neuronal pathways for resilience to heat stress in cattle. Sci. Rep. 2021, 11, 16619. [Google Scholar] [CrossRef]
Marín-Garzón, N.A.; Magalhães, A.F.B.; Mota, L.F.M.; Fonseca, L.F.S.; Chardulo, L.A.L.; Albuquerque, L.G. Genome-wide association study identified genomic regions and putative candidate genes affecting meat color traits in Nellore cattle. Meat Sci. 2021, 171, 108288. [Google Scholar] [CrossRef]
Gholap, P.N.; Kale, D.S.; Sirothia, A.R. Genetic diseases in cattle: A review. Res. J. Anim. Vet. Fish. Sci. 2014, 2, 24–33. [Google Scholar]
Nordmark, G.; Kristjansdottir, G.; Theander, E.; Appel, S.; Eriksson, P.; Vasaitis, L.; Kvarnström, M.; Delaleu, N.; Lundmark, P.; Lundmark, A.; et al. Association of EBF1, FAM167A (C8orf13)-BLK and TNFSF4 gene variants with primary Sjögren’s syndrome. Genes Immun. 2011, 12, 100–109. [Google Scholar] [CrossRef]
Bhati, M.; Kadri, N.K.; Crysnanto, D.; Pausch, H. Assessing genomic diversity and signatures of selection in Original Braunvieh cattle using whole-genome sequencing data. BMC Genom. 2020, 21, 27. [Google Scholar] [CrossRef] [Green Version]
Buchanan, J.W.; Reecy, J.M.; Garrick, D.J.; Duan, Q.; Beitz, D.C.; Koltes, J.E.; Saatchi, M.; Koesterke, L.; Mateescu, R.G. Deriving gene networks from SNP associated with triacylglycerol and phospholipid fatty acid fractions from Ribeyes of Angus cattle. Front. Genet. 2016, 7, 116. [Google Scholar] [CrossRef] [Green Version]
Yoon, D.; Ko, E. Association study between SNPs of the genes within bovine QTLs and meat quality of Hanwoo. J. Anim. Sci. 2016, 94, 145. [Google Scholar] [CrossRef] [Green Version]
Mei, C.G.; Gui, L.S.; Wang, H.C.; Tian, W.Q.; Li, Y.K.; Zan, L.S. Polymorphisms in adrenergic receptor genes in Qinchuan cattle show associations with selected carcass traits. Meat Sci. 2018, 135, 166–173. [Google Scholar] [CrossRef]
Costilla, R.; Kemper, K.E.; Byrne, E.M.; Porto-Neto, L.R.; Carvalheiro, R.; Purfield, D.C.; Doyle, J.L.; Berry, D.P.; Moore, S.S.; Wray, N.R.; et al. Genetic control of temperament traits across species: Association of autism spectrum disorder risk genes with cattle temperament. Genet. Sel. Evol. 2020, 52, 51. [Google Scholar] [CrossRef]
Manca, E.; Cesarani, A.; Falchi, L.; Atzori, A.S.; Gaspa, G.; Rossoni, A.; Macciotta, N.P.P.; Dimauro, C. Genome-wide association study for residual concentrate intake using different approaches in Italian Brown Swiss. Ital. J. Anim. Sci. 2021, 20, 1957–1967. [Google Scholar] [CrossRef]
Sanglard, L.P.; Nascimento, M.; Moriel, P.; Sommer, J.; Ashwell, M.; Poore, M.H.; Duarte, M.D.S.; Serão, N.V. Impact of energy restriction during late gestation on the muscle and blood transcriptome of beef calves after preconditioning. BMC Genom. 2018, 19, 702. [Google Scholar] [CrossRef] [Green Version]
Zhang, Y.; Ma, L.; Gu, Y.; Chang, Y.; Liang, C.; Guo, X.; Bao, P.; Chu, M.; Ding, X.; Yan, P. Bta-miR-2400 Targets SUMO1 to Affect Yak Preadipocytes Proliferation and Differentiation. Biology 2021, 10, 949. [Google Scholar] [CrossRef]
Zhang, W.W.; Tong, H.L.; Sun, X.F.; Hu, Q.; Yang, Y.; Li, S.F.; Yan, Y.Q.; Li, G.P. Identification of miR-2400 gene as a novel regulator in skeletal muscle satellite cells proliferation by targeting MYOG gene. Biochem. Biophys. Res. Commun. 2015, 463, 624–631. [Google Scholar] [CrossRef]

Figure 1. Results of SNPs, genetic diversity, and runs of homozygosity. All SNPs were illustrated for their genomic distribution, with the number of SNPs (y-axis) per 10 Mb genomic region (A), the calculated statistics of minor allele frequency (MAF), nucleotide diversity (Pi), and the observed heterozygosity (Ho) in (B), and the derived pairwise relatedness among individuals (C). Among the 29 autosomes, the numbers of runs of homozygosity are schematically shown in (D).

Figure 2. Numbers and lengths of runs of homozygosity. All runs of homozygosity are represented by the hollow points with their lengths showing on the left-hand y-axis. The individual genomic proportions (the red line) for total length of runs of homozygosity are shown in the right-hand y-axis.

Table 1. Distribution of the significant genomic regions and candidate genes.

Genomic Regions				N of SNPs	Candidate Genes
Chr	Start	End	Kb	N of SNPs	Candidate Genes
BTA2	81,899,298	82,248,318	349	65	None
BTA2	82,355,133	82,378,221	23	7	None
BTA6	66,234,825	66,436,164	201	57	ATP10D, CORIN
BTA6	81,161,826	81,281,680	120	22	EPHA5
BTA9	46,061,002	46,211,676	151	38	None
BTA9	46,270,931	46,319,445	49	8	None
BTA11	14,444,882	14,944,596	500	89	MEMO1, DPY30, SPAST, SLC30A6, NLRC4, YIPF4
BTA15	11,474,883	12,369,497	895	152	None
BTA16	40,264,305	40,823,586	559	92	TNFSF18, TNFSF4
BTA16	43,872,679	44,009,452	137	18	TMEM201, SLC25A33
BTA17	15,394,915	15,766,365	371	64	INPP4B
BTA21	2,945,936	3,556,313	610	136	None
	4,051,182	4,754,672	703	129	SCAPER, RCN2, PSTPIP1, TSPAN3, PEAK1
	31,947,095	32,283,141	336	70	PEAK1
	32,283,208	32,346,053	63	15	PEAK1, HMG20A, LINGO1
	32,490,146	32,802,506	312	89	None
	36,931,645	37,537,467	606	150	GABRB3, GABRA5
BTA27	32,891,817	33,308,815	417	56	ZNF703, ERLIN2, PLPBP, ADGRA2, BRF2, RAB11FIP1, GOT1L1, ADRB3, EIF4EBP1, ASH2L
BTA27	33,448,422	33,466,272	18	6	NSD3, bta-mir-2400

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Wang, W.; Shi, Y.; He, F.; Fang, D.; Gan, J.; Wu, F.; AG, Y.; Deng, X.; Cao, Q.; Duo, C.; et al. Runs of Homozygosity Analysis Reveals Genomic Diversity and Population Structure of an Indigenous Cattle Breed in Southwest China. Animals 2022, 12, 3239. https://doi.org/10.3390/ani12233239

AMA Style

Wang W, Shi Y, He F, Fang D, Gan J, Wu F, AG Y, Deng X, Cao Q, Duo C, et al. Runs of Homozygosity Analysis Reveals Genomic Diversity and Population Structure of an Indigenous Cattle Breed in Southwest China. Animals. 2022; 12(23):3239. https://doi.org/10.3390/ani12233239

Chicago/Turabian Style

Wang, Wei, Yi Shi, Fang He, Donghui Fang, Jia Gan, Fuqiu Wu, Yueda AG, Xiaodong Deng, Qi Cao, Chu Duo, and et al. 2022. "Runs of Homozygosity Analysis Reveals Genomic Diversity and Population Structure of an Indigenous Cattle Breed in Southwest China" Animals 12, no. 23: 3239. https://doi.org/10.3390/ani12233239

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Runs of Homozygosity Analysis Reveals Genomic Diversity and Population Structure of an Indigenous Cattle Breed in Southwest China

Abstract

Simple Summary

Abstract

1. Introduction

2. Materials and Methods

2.1. Ethics Statement

2.2. Animals and Sample Collection

2.3. Genomic DNA and Sequencing

2.4. SNP Genotyping and Quality Controls

2.5. Genetic Diversity and Runs of Homozygosity

2.6. Functional Analysis

3. Results

3.1. SNPs and Genetic Diversity

3.2. Genomic Patterns of Runs of Homozygosity

3.3. Genes within the Significant Genomic Regions

4. Discussion

5. Conclusions

Supplementary Materials

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI