The 1000 bull genomes project supports the goal of accelerating the rates of genetic gain in domestic cattle while at the same time considering animal health and welfare by providing the annotated sequence variants and genotypes of key ancestor bulls. In the first phase of the 1000 bull genomes project, we sequenced the whole genomes of 234 cattle to an average of 8.3-fold coverage. This sequencing includes data for 129 individuals from the global Holstein-Friesian population, 43 individuals from the Fleckvieh breed and 15 individuals from the Jersey breed. We identified a total of 28.3 million variants, with an average of 1.44 heterozygous sites per kilobase for each individual. We demonstrate the use of this database in identifying a recessive mutation underlying embryonic death and a dominant mutation underlying lethal chondrodysplasia. We also performed genome-wide association studies for milk production and curly coat, using imputed sequence variants, and identified variants associated with these traits in cattle.
Btau_4.0 and UMD3.1 are two distinct cattle reference genome assemblies. In our previous study using the low density BovineSNP50 array, we reported a copy number variation (CNV) analysis on Btau_4.0 with 521 animals of 21 cattle breeds, yielding 682 CNV regions with a total length of 139.8 megabases.
In this study using the high density BovineHD SNP array, we performed high resolution CNV analyses on both Btau_4.0 and UMD3.1 with 674 animals of 27 cattle breeds. We first compared CNV results derived from these two different SNP array platforms on Btau_4.0. With two thirds of the animals shared between studies, on Btau_4.0 we identified 3,346 candidate CNV regions (CNVRs) representing 142.7 megabases (~4.70%) of the genome. With a similar total length but about five times as many events, the average CNVR length of the current Btau_4.0 dataset is significantly shorter than that of the previous one (42.7 kb vs. 205 kb). Although subsets of these two results overlapped, 64% (91.6 megabases) of the current dataset was not present in the previous study. We also performed similar analyses on UMD3.1 using these BovineHD SNP array results. Approximately 50% more and 20% longer CNVs were called on UMD3.1 as compared to those on Btau_4.0. However, a comparable set of CNVRs (3,438 regions with a total length of 146.9 megabases) was obtained. We suspect that these results are due to the UMD3.1 assembly's efforts of placing unplaced contigs and removing unmerged alleles. Selected CNVs were further experimentally validated, achieving a 73% PCR validation rate, which is considerably higher than the previous validation rate. About 20-45% of CNV regions overlapped with cattle RefSeq genes and Ensembl genes. Panther and IPA analyses indicated that these genes are involved in a wide spectrum of biological processes, including immune system function, lipid metabolism, and cell, organism and system development.
We present a comprehensive set of cattle CNVs at a higher resolution and sensitivity. We identified over 3,000 candidate CNV regions on both Btau_4.0 and UMD3.1, compared the current datasets with previous results, and examined the impact of genome assemblies on CNV calling.
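The abstract above reports CNV regions (CNVRs) and their average length but does not say how individual CNV calls were combined. A common convention, assumed here purely for illustration, is to merge overlapping CNV calls on the same chromosome into a single region. The function names and the toy coordinates below are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def merge_cnvs_into_regions(cnvs):
    """Merge overlapping CNV calls into CNV regions (CNVRs).

    cnvs: iterable of (chrom, start, end) tuples, with start < end.
    Assumption: calls that overlap or touch (start <= current end)
    are merged into one region.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in cnvs:
        by_chrom[chrom].append((start, end))

    regions = []
    for chrom in sorted(by_chrom):
        intervals = sorted(by_chrom[chrom])
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end:
                # Overlapping call: extend the current region.
                cur_end = max(cur_end, end)
            else:
                regions.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        regions.append((chrom, cur_start, cur_end))
    return regions


def mean_region_length_kb(regions):
    """Average region length in kilobases."""
    total_bp = sum(end - start for _, start, end in regions)
    return total_bp / len(regions) / 1000.0


# Toy example with made-up coordinates.
toy_cnvs = [
    ("chr1", 100, 500),
    ("chr1", 400, 900),
    ("chr1", 2000, 3000),
    ("chr2", 50, 150),
]
toy_regions = merge_cnvs_into_regions(toy_cnvs)
```

With merged regions in hand, summary figures such as the region count, the total length, and the average CNVR length follow directly from the interval list.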
The mineral composition of bovine milk plays an important role in determining its nutritional and cheese-making value. Concentrations of the main minerals predicted from mid-infrared spectra produced during milk recording, combined with cow genotypes, provide a unique opportunity to decipher the genetic determinism of these traits. The present study included 1 million test-day predictions of Ca, Mg, P, K, Na, and citrate content from 126,876 Montbéliarde cows, of which 19,586 had genotype data available. All investigated traits were highly heritable (0.50-0.58), with the exception of Na (0.32). A sequence-based genome-wide association study (GWAS) detected 50 QTL (18 affecting two to five traits) and positional candidate genes and variants, mostly located in non-coding sequences. In silico post-GWAS analyses highlighted 877 variants that could be regulatory SNPs altering transcription factor (TF) binding sites or located in non-coding RNA (mainly lncRNA). Furthermore, we found 47 positional candidate genes and 45 TFs highly expressed in mammary gland compared to 90 other bovine tissues. Among the mammary-specific genes, SLC37A1 and ANKH, encoding proteins involved in ion transport, were located in the most significant QTL. This study therefore highlights a comprehensive set of functional candidate genes and variants that affect milk mineral content.
Using GWAS to identify candidate genes associated with cattle morphology traits at a functional level is challenging. The main difficulty of identifying candidate genes and gene interactions associated with such complex traits is the long-range linkage disequilibrium (LD) phenomenon reported widely in dairy cattle. Systems biology approaches, such as combining the Association Weight Matrix (AWM) with a Partial Correlation in an Information Theory (PCIT) algorithm, can assist in overcoming this LD. Used in a multi-breed and multi-phenotype context, the AWM-PCIT could aid in identifying candidate genes and gene networks for udder traits with regulatory and functional significance. This study aims to use the AWM-PCIT algorithm as a post-GWAS analysis tool with the goal of identifying candidate genes underlying udder morphology. We used data from 78,440 dairy cows from three breeds with own phenotypes for five udder morphology traits, five production traits, somatic cell score and clinical mastitis. Cows were genotyped with medium (50k) or low-density (7 to 10k) chips and imputed to 50k. We performed a within-breed and within-trait GWAS. The GWAS showed 9,830 significant SNPs across the genome (p < 0.05). Of these, 5,010 SNPs did not map to a gene, and 4,820 SNPs were within 10 kb of a gene. After accounting for 1SNP:1gene, 3,651 SNPs were within 10 kb of a gene (set1), and 2,673 significant SNPs were further than 10 kb from a gene (set2). The two SNP sets formed a 6,324-SNP matrix, which was fitted in an AWM-PCIT considering udder depth/development as the key trait, resulting in 1,013 genes associated with udder morphology, mastitis and production phenotypes. The AWM-PCIT detected ten potential candidate genes for udder-related traits: ESR1, FGF2, FGFR2, GLI2, IQGAP3, PGR, PRLR, RREB1, BTRC, and TGFBR2.
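The SNP-to-gene step described above (SNPs within 10 kb of a gene, then one SNP per gene) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the abstract does not say how the single SNP per gene was chosen, so the rule used here (keep the SNP with the smallest p-value) is an assumption, and all names and coordinates are hypothetical.

```python
def distance_to_gene(pos, gene_start, gene_end):
    """Distance in bp from a SNP position to a gene (0 if inside it)."""
    if gene_start <= pos <= gene_end:
        return 0
    return min(abs(pos - gene_start), abs(pos - gene_end))


def assign_snps_to_genes(snps, genes, window=10_000):
    """Map each SNP to its nearest gene within `window` bp.

    snps:  list of (snp_id, chrom, pos, p_value)
    genes: list of (gene_name, chrom, start, end)
    Returns (best_per_gene, unmapped_ids), where best_per_gene maps a
    gene name to the (snp_id, p_value) kept for that gene.
    Assumption: when several SNPs map to one gene, the one with the
    smallest p-value is kept.
    """
    best_per_gene = {}
    unmapped_ids = []
    for snp_id, chrom, pos, p_value in snps:
        candidates = [
            (distance_to_gene(pos, start, end), name)
            for name, gene_chrom, start, end in genes
            if gene_chrom == chrom
        ]
        candidates = [c for c in candidates if c[0] <= window]
        if not candidates:
            unmapped_ids.append(snp_id)
            continue
        _, gene = min(candidates)  # nearest gene wins
        if gene not in best_per_gene or p_value < best_per_gene[gene][1]:
            best_per_gene[gene] = (snp_id, p_value)
    return best_per_gene, unmapped_ids


# Toy data with made-up gene names and coordinates.
toy_genes = [("geneA", "9", 1000, 5000), ("geneB", "20", 100000, 150000)]
toy_snps = [
    ("s1", "9", 3000, 0.01),    # inside geneA
    ("s2", "9", 12000, 0.001),  # 7 kb from geneA, smaller p-value
    ("s3", "9", 40000, 0.02),   # more than 10 kb from any gene
    ("s4", "20", 95000, 0.03),  # 5 kb from geneB
]
mapped, unmapped = assign_snps_to_genes(toy_snps, toy_genes)
```

In the study, SNPs further than 10 kb from a gene were retained as a second set rather than discarded, which corresponds to the `unmapped` list in this sketch.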
The objective of this study was to compare mapping precision and power of within-breed and multibreed genome-wide association studies (GWAS) and to compare the results obtained by the multibreed GWAS with 3 meta-analysis methods. The multibreed GWAS was expected to improve mapping precision compared with a within-breed GWAS because linkage disequilibrium is conserved over shorter distances across breeds than within breeds. The multibreed GWAS was also expected to increase detection power for quantitative trait loci (QTL) segregating across breeds. GWAS were performed for production traits in dairy cattle, using imputed full genome sequences of 16,031 bulls, originating from 6 French and Danish dairy cattle populations. Our results show that a multibreed GWAS can be a valuable tool for the detection and fine mapping of quantitative trait loci. The number of QTL detected with the multibreed GWAS was larger than the number detected by the within-breed GWAS, indicating an increase in power, especially when the 2 Holstein populations were combined. The largest number of QTL was detected when all populations were combined. The analysis combining all breeds was, however, dominated by Holstein, and QTL segregating in other breeds but not in Holstein were sometimes overshadowed by larger QTL segregating in Holstein. Therefore, the GWAS combining all breeds except Holstein was useful to detect such peaks. Combining all breeds except Holstein resulted in smaller QTL intervals on average, but this outcome was not the case when the Holstein populations were included in the analysis. Although no decrease in the average QTL size was observed, mapping precision did improve for several QTL. Out of 3 different multibreed meta-analysis methods, the weighted z-scores model gave results most similar to the full multibreed GWAS and can be useful as an alternative to a full multibreed GWAS. Differences between the multibreed GWAS and the meta-analyses were larger when different breeds were combined than when the 2 Holstein populations were combined.
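The weighted z-scores meta-analysis mentioned above can be illustrated for a single variant. The abstract does not give the weighting scheme, so this sketch assumes the common choice of weighting each population's z-score by the square root of its sample size; the function name and the example numbers are hypothetical.

```python
import math


def weighted_z_meta(z_scores, sample_sizes):
    """Combine per-population z-scores for one variant.

    Assumption: weights are sqrt(sample size), a common convention;
    the study's actual weights are not stated in the abstract.
    Returns (combined_z, two_sided_p).
    """
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    combined_z = numerator / denominator
    # Two-sided p-value under a standard normal null distribution.
    p_value = math.erfc(abs(combined_z) / math.sqrt(2))
    return combined_z, p_value


# Two equally sized populations with the same signed z-score.
z_equal, p_equal = weighted_z_meta([2.0, 2.0], [100, 100])

# Opposite-sign effects of equal weight cancel out.
z_cancel, p_cancel = weighted_z_meta([2.0, -2.0], [100, 100])
```

Because the z-scores keep their sign, effects in the same direction reinforce each other across populations while opposite effects cancel, which is one reason a meta-analysis of this kind can approximate a joint multibreed analysis.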
Genetic merit, or breeding values as referred to in livestock and crop breeding programs, is one of the keys to the successful selection of animals in commercial farming systems. The developments in statistical methods during the twentieth century and single nucleotide polymorphism (SNP) chip technologies in the twenty-first century have revolutionized agricultural production, by allowing highly accurate predictions of breeding values for selection candidates at a very early age. Nonetheless, for many breeding populations, realized accuracies of predicted breeding values (PBV) remain below the theoretical maximum, even when the reference population is sufficiently large, and SNPs included in the model are in sufficient linkage disequilibrium (LD) with the quantitative trait locus (QTL). This is particularly noticeable over generations, as we observe the so-called erosion of the effects of SNPs due to recombinations, accompanied by the erosion of the accuracy of prediction. While accurately quantifying the erosion at the individual SNP level is a difficult and unresolved task, quantifying the erosion of the accuracy of prediction is a more tractable problem. In this paper, we describe a method that uses the relationship between reference and target populations to calculate expected values for the accuracies of predicted breeding values for non-phenotyped individuals accounting for erosion. The accuracy of the expected values was evaluated through simulations, and a further evaluation was performed on real data.
Using simulations, we empirically confirmed that our expected values for the accuracy of PBV accounting for erosion were able to correctly determine the prediction accuracy of breeding values for non-phenotyped individuals. When comparing the expected to the realized accuracies of PBV with real data, only one out of the four traits evaluated presented accuracies that were significantly higher than expected, approaching $\sqrt{h^2}$.
We defined an index of genetic correlation between reference and target populations, which summarizes the expected overall erosion due to differences in allele frequencies and LD patterns between populations. We used this correlation along with a trait's heritability to derive expected values for the accuracy of PBV accounting for the erosion, and demonstrated that our derived expected accuracy is a reliable metric.
MicroRNAs are small noncoding RNAs that have important roles in the lactation process and milk biosynthesis. Some polymorphisms have been studied in various livestock species from the perspective of pathology or production traits. To target variants that could be the causal variants of dairy traits, genetic variants of microRNAs expressed in the mammary gland or present in milk and localized in dairy quantitative trait loci (QTLs) were investigated in bovine, caprine, and ovine species. In this study, a total of 59,124 (out of 28 million), 13,427 (out of 87 million), and 4,761 (out of 38 million) genetic variants in microRNAs expressed in the mammary gland or present in milk were identified in bovine, caprine, and ovine species, respectively. A total of 4,679 of these detected bovine genetic variants are located in dairy QTLs. In caprine species, 127 genetic variants are localized in dairy QTLs. In ovine species, no genetic variant was identified in dairy QTLs. This study leads to the detection of microRNA genetic variants of interest in the context of dairy production, taking advantage of whole genome data to identify genetic variants in microRNAs that are expressed in the mammary gland and localized in dairy QTLs.
Fertility is an economically important trait in livestock. Poor fertility in dairy cattle can be due to loss-of-function variants affecting any essential gene that causes early embryonic mortality in homozygotes. To identify fertility-associated quantitative trait loci, we performed single-marker association analyses for 8 fertility traits in Holstein, Jersey, and Nordic Red Dairy cattle using imputed whole-genome sequence variants including SNPs, indels, and large deletions. We then performed stepwise selection of independent markers from GWAS loci using conditional and joint association analyses. From single-marker analyses for fertility traits, we reported genome-wide significant associations of 30,384 SNPs, 178 indels, and 3 deletions in Holstein; 23,481 SNPs, 189 indels, and 13 deletions in Nordic Red; and 17 SNPs in Jersey cattle. Conditional and joint association analyses identified 37 and 23 independent associations in Holstein and Nordic Red Dairy cattle, respectively. Fertility-associated GWAS loci were enriched for developmental and cellular processes (Gene Ontology enrichment, false discovery rate < 0.05). For these quantitative trait loci regions (top marker and 500 kb of surrounding regions), we proposed several candidate genes with functional annotations corresponding to embryonic lethality and various fertility-related phenotypes in mouse and cattle. The inclusion of these top markers in future releases of the custom SNP chip used for genomic evaluations will enable their validation in independent populations and improve the accuracy of genomic predictions.
Meiotic recombination is an essential biological process that ensures proper chromosome segregation and creates genetic diversity. Individual variation in global recombination rates has been shown to be heritable in several species, and variants significantly associated with this trait have been identified. Recombination on the sex chromosome has often been ignored in these studies although this trait may be particularly interesting as it may correspond to a biological process distinct from that on autosomes. For instance, recombination in males is restricted to the pseudo-autosomal region (PAR). We herein used a large cattle pedigree with more than 100,000 genotyped animals to improve the genetic map of the X chromosome and to study the genetic architecture of individual variation in recombination rate on the sex chromosome (XRR). The length of the genetic map was 46.4 and 121.2 cM in males and females, respectively, but the recombination rate in the PAR was six times higher in males. The heritability of crossover (CO) counts on the X chromosome was comparable to that of autosomes in males (0.011) but larger than that of autosomes in females (0.024). XRR was highly correlated (0.76) with global recombination rate (GRR) in females, suggesting that both traits might be governed by shared variants. In agreement, a set of eleven previously identified variants associated with GRR had correlated effects on female XRR (0.86). In males, XRR and GRR appeared to be distinct traits, although more accurate CO counts on the PAR would be valuable to confirm these results.
Genome-wide association studies (GWAS) were performed at the sequence level to identify candidate mutations that affect the expression of six major milk proteins in Montbéliarde (MON), Normande (NOR), and Holstein (HOL) dairy cattle. Whey protein (α-lactalbumin and β-lactoglobulin) and casein (αs1, αs2, β, and κ) contents were estimated by mid-infrared (MIR) spectrometry, with medium to high accuracy (0.59 ≤ R² ≤ 0.92), for 848,068 test-day milk samples from 156,660 cows in the first three lactations. Milk composition was evaluated as average test-day measurements adjusted for environmental effects. Next, we genotyped a subset of 8080 cows (2967 MON, 2737 NOR, and 2306 HOL) with the BovineSNP50 Beadchip. For each breed, genotypes were first imputed to high-density (HD) using HD single nucleotide polymorphism (SNP) genotypes of 522 MON, 546 NOR, and 776 HOL bulls. The resulting HD SNP genotypes were subsequently imputed to the sequence level using 27 million high-quality sequence variants selected from Run4 of the 1000 Bull Genomes consortium (1147 bulls). Within-breed, multi-breed, and conditional GWAS were performed.
Thirty-four distinct genomic regions were identified. Three regions on chromosomes 6, 11, and 20 had very significant effects on milk composition and were shared across the three breeds. Other significant effects, which partially overlapped across breeds, were found on almost all the autosomes. Multi-breed analyses provided a larger number of significant genomic regions with smaller confidence intervals than within-breed analyses. Combinations of within-breed, multi-breed, and conditional analyses led to the identification of putative causative variants in several candidate genes that presented significant protein-protein interactions enrichment, including those with previously described effects on milk composition (SLC37A1, MGST1, ABCG2, CSN1S1, CSN2, CSN1S2, CSN3, PAEP, DGAT1, AGPAT6) and those with effects reported for the first time here (ALPL, ANKH, PICALM).
GWAS applied to fine-scale phenotypes, multiple breeds, and whole-genome sequences appears to be effective for identifying candidate gene variants. However, although we identified functional links between some candidate genes and milk phenotypes, the causality between candidate variants and milk protein composition remains to be demonstrated. Nevertheless, the identification of potential causative mutations that underlie milk protein composition may have immediate applications for improvements in cheese-making.