Gene-Based Genome-Wide Association Study Identified Genes for Agronomic Traits in Maize
Simple Summary
Abstract
1. Introduction
2. Materials and Methods
2.1. Maize Genomic Data Processing
2.2. FaST-LMM for Genes
2.3. Implementation
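library(RcppArmadillo)  # provides fastLmPure()

# F test of the non-baseline columns of xstar against the null model.
# ystar: response vector; xstar: design matrix whose first column is fitted
# alone as the null model; w: matrix whose determinant enters the logL term.
lmm <- function(ystar, xstar, w){
  nobs <- length(ystar)  # undefined in the original listing; assumed to be the number of observations
  # Null model: first column only
  fit0 <- fastLmPure(y = ystar, X = as.matrix(xstar[, 1]))
  yd <- ystar - xstar[, 1] * fit0$coefficients[1]
  ssy <- sum(yd^2)
  # Full model: all columns
  fit <- fastLmPure(y = ystar, X = xstar)
  resi <- ystar - xstar %*% fit$coefficients
  sse <- sum(resi^2)
  ssr <- ssy - sse
  dfe <- fit$df.residual
  ve <- sse / dfe
  dfb <- ncol(xstar) - 1
  Fstat <- (ssr / dfb) / ve  # renamed from F, which shadows the FALSE shorthand
  p <- pf(Fstat, dfb, dfe, lower.tail = FALSE)  # upper-tail p-value
  logL <- log(det(w)) + nobs * log(ve)
  list(F = Fstat, p = p, logL = logL)
}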
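The F test in the listing above compares a null model (first column only) with the full model. The following is a minimal sketch for checking that calculation on simulated data, not the authors' code. It assumes base R's `lm.fit` as a stand-in for `fastLmPure` so that it runs without extra packages; the simulated data, the effect size, and the name `f_test` are all hypothetical. The result is cross-checked against `anova()` on nested `lm` fits.

```r
# Illustrative only: simulated data and the helper name f_test are assumptions.
set.seed(1)
n  <- 200
x0 <- rep(1, n)                     # baseline column (intercept)
g  <- matrix(rnorm(n * 3), n, 3)    # three hypothetical gene-level predictors
y  <- 0.5 * g[, 1] + rnorm(n)       # phenotype driven by the first predictor
X  <- cbind(x0, g)

f_test <- function(y, X) {
  fit0 <- lm.fit(x = X[, 1, drop = FALSE], y = y)  # null model: first column only
  fit1 <- lm.fit(x = X, y = y)                     # full model: all columns
  ssy  <- sum(fit0$residuals^2)                    # residual SS under the null
  sse  <- sum(fit1$residuals^2)                    # residual SS under the full model
  dfb  <- ncol(X) - 1                              # numerator degrees of freedom
  dfe  <- fit1$df.residual                         # denominator degrees of freedom
  Fstat <- ((ssy - sse) / dfb) / (sse / dfe)
  c(F = Fstat, p = pf(Fstat, dfb, dfe, lower.tail = FALSE))
}

res <- f_test(y, X)

# Cross-check against R's built-in comparison of nested linear models
ref <- anova(lm(y ~ 1), lm(y ~ g))
print(res)
print(ref)
```

The p-value is taken from the upper tail of the F distribution (`lower.tail = FALSE`), so a large F statistic yields a small p-value.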
3. Results
3.1. Simulations
3.2. Case Analyses
Population | Trait | ALLSNP | FPC | EPC | EPCLM |
---|---|---|---|---|---|
HAU maize | C16:0P | 0.381 | 1.041 | 1.061 | 5.163 |
HAU maize | C16:1P | 0.397 | 1.032 | 1.044 | 3.557 |
HAU maize | C18:0P | 1.199 | 0.978 | 1.000 | 2.913 |
HAU maize | C18:1P | 0.346 | 1.062 | 1.077 | 4.609 |
HAU maize | C18:2P | 0.355 | 1.056 | 1.069 | 5.647 |
HAU maize | C18:3P | 0.312 | 1.074 | 1.073 | 4.808 |
HAU maize | C20:0P | 0.382 | 1.036 | 1.043 | 5.305 |
HAU maize | C20:1P | 0.450 | 1.009 | 1.020 | 3.153 |
HAU maize | C22:0P | 0.545 | 1.035 | 1.047 | 4.007 |
HAU maize | C24:0P | 0.340 | 1.047 | 1.059 | 4.378 |
HAU maize | C16:0/C16:1 | 0.381 | 1.017 | 1.035 | 2.508 |
HAU maize | C16:0/C18:0 | 0.386 | 1.070 | 1.086 | 3.942 |
HAU maize | C18:0/C18:1 | 0.614 | 1.022 | 1.036 | 1.680 |
HAU maize | C18:1/C18:2 | 0.466 | 1.028 | 1.049 | 4.508 |
HAU maize | C18:2/C18:3 | 0.365 | 1.031 | 1.056 | 5.792 |
HAU maize | C18:0/C20:0 | 0.579 | 1.014 | 1.027 | 3.544 |
HAU maize | C20:0/C20:1 | 0.418 | 1.006 | 1.012 | 2.816 |
HAU maize | C20:0/C22:0 | 0.410 | 1.050 | 1.070 | 5.315 |
HAU maize | C22:0/C24:0 | 8.171 | 0.923 | 0.951 | 2.849 |
HAU maize | SFA/USFA | 0.373 | 1.035 | 1.055 | 5.384 |
AP maize | DTS | 0.866 | 1.063 | 1.080 | 80.330 |
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
Gene Shared among Traits | Trait Name |
---|---|
GRMZM2G169089 | C18:0P, C18:1P, C18:2P, C18:3P, C24:0P, C16:0/C18:0, C18:1/C18:2, C20:0/C22:0 |
GRMZM2G169114 | C18:0P, C18:1P, C18:2P, C18:3P, C24:0P, C18:1/C18:2, C20:0/C22:0 |
GRMZM2G173579 | C16:0P, C16:0/C16:1, C16:0/C18:0, SFA/USFA |
GRMZM2G173628 | C16:0P, C16:0/C16:1, C16:0/C18:0, SFA/USFA |
GRMZM2G064701 | C18:1P, C18:2P, C18:0/C18:1, C18:1/C18:2 |
GRMZM2G173615 | C16:0P, C16:0/C16:1, SFA/USFA |
GRMZM2G173641 | C16:0P, C16:0/C16:1, SFA/USFA |
GRMZM2G444801 | C16:0P, C16:0/C16:1, SFA/USFA |
GRMZM5G829544 | C16:0P, C16:0/C16:1, SFA/USFA |
GRMZM5G867927 | C18:1P, C18:2P, C18:0/C18:1, C18:1/C18:2 |
GRMZM2G029506 | C20:0P, C22:0P, C24:0P |
GRMZM2G125455 | C18:1P, C18:2P, C18:1/C18:2 |
GRMZM2G125544 | C18:1P, C18:2P |
GRMZM2G149138 | C18:1P, C18:2P, C18:1/C18:2 |
GRMZM2G173678 | C16:0P, C16:0/C16:1, SFA/USFA |
GRMZM2G335618 | C20:0P, C22:0P, C18:0/C20:0 |
GRMZM2G365292 | C18:1P, C18:2P, C18:1/C18:2 |
GRMZM2G444623 | C18:1P, C18:2P, C18:1/C18:2 |
GRMZM2G449817 | C22:0P, C24:0P, C20:0/C22:0 |
GRMZM2G005339 | C16:0P, SFA/USFA |
GRMZM2G075637 | C16:0P, C16:0/C18:0 |
GRMZM2G094871 | C18:1P, C18:1/C18:2 |
GRMZM2G101707 | C20:1P, C22:0P |
GRMZM2G103475 | C16:0P, SFA/USFA |
GRMZM2G109009 | C18:2P, C20:0/C22:0 |
GRMZM2G404897 | C16:0P, SFA/USFA |
GRMZM2G461671 | C18:1P, C18:1/C18:2 |
GRMZM5G899300 | C16:0P, SFA/USFA |
Gene | Function and Pathway | Number of Times Found by FPC | Number of Times Found by EPC |
---|---|---|---|
GRMZM2G169089 | linoleic acid1, triacylglycerol biosynthesis pathway | 9 | 8 |
GRMZM2G064701 | fatty acid desaturase1 | 3 | 4 |
GRMZM5G829544 | fatty acyl-ACP thioesterase2, oleate biosynthesis I (plants) | 0 | 4 |
GRMZM5G867927 | fatty acid desaturase1 | 3 | 4 |
GRMZM2G022558 | fatty acid elongase2 | 2 | 1 |
GRMZM2G370357 | lipid metabolic process | Not Available | 1 (From Table S3) |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhao, Y.; Gao, J.; Guo, X.; Su, B.; Wang, H.; Yang, R.; Jiang, L. Gene-Based Genome-Wide Association Study Identified Genes for Agronomic Traits in Maize. Biology 2022, 11, 1649. https://doi.org/10.3390/biology11111649