One of the seminal events since 2019 has been the outbreak of the SARS-CoV-2 pandemic. Countries have adopted various policies to deal with it, but they also differ in their socio-geographical characteristics and public health care facilities. Our study aimed to investigate differences between epidemiological parameters across countries.
The analysed data come from the SARS-CoV-2 repository provided by Johns Hopkins University. Separately for each country, we estimated recovery and mortality rates using the SIRD model applied to the first 30, 60, 150, and 300 days of the pandemic. Moreover, a mixture of normal distributions was fitted to the number of confirmed cases and deaths during the first 300 days. The estimated means and variances of the peaks were used to identify countries with outlying parameters.
For the 300-day period, Belgium, Cyprus, France, the Netherlands, Serbia, and the UK were classified as outliers by all three outlier detection methods. Yemen was classified as an outlier in each of the four timeframes considered, due to high mortality rates. During the first 300 days of the pandemic, the majority of countries underwent three peaks in the number of confirmed cases, except Australia and Kazakhstan, which had two peaks.
We observed heterogeneity between countries in recovery and mortality rates. Liechtenstein was the "positive" outlier, with low mortality rates and high recovery rates. In contrast, Yemen was a "negative" outlier, with high mortality in all four periods considered and low recovery for 30 and 60 days.
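To make the SIRD model mentioned above concrete, the following is a minimal sketch of its four compartments (susceptible, infected, recovered, deceased) integrated with a simple forward Euler scheme. The function name `simulate_sird` and all parameter values are illustrative assumptions, not the estimates or the fitting procedure of the study; `gamma` plays the role of the recovery rate and `mu` the mortality rate.

```python
def simulate_sird(n, i0, beta, gamma, mu, days, dt=0.1):
    """Integrate the SIRD equations with fixed-step forward Euler.

    dS/dt = -beta*S*I/N
    dI/dt =  beta*S*I/N - gamma*I - mu*I
    dR/dt =  gamma*I          (gamma: recovery rate)
    dD/dt =  mu*I             (mu: mortality rate)
    Returns one (S, I, R, D) tuple per day, including day 0.
    """
    s, i, r, d = float(n - i0), float(i0), 0.0, 0.0
    steps_per_day = round(1 / dt)
    trajectory = [(s, i, r, d)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infected = beta * s * i / n * dt
            new_recovered = gamma * i * dt
            new_dead = mu * i * dt
            s -= new_infected
            i += new_infected - new_recovered - new_dead
            r += new_recovered
            d += new_dead
        trajectory.append((s, i, r, d))
    return trajectory


# Hypothetical parameter values, chosen only for illustration.
trajectory = simulate_sird(n=1_000_000, i0=10, beta=0.3, gamma=0.1, mu=0.01, days=30)
```

In an estimation setting, `gamma` and `mu` would be chosen so that the simulated recovered and deceased counts match the reported ones over the chosen window (30, 60, 150, or 300 days).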
Since global temperature is expected to rise by 2 °C in 2050, heat stress may become the most severe environmental factor. In the study, we illustrate the application of mixed linear models for the analysis of whole transcriptome expression in livers and adrenal tissues of Sprague-Dawley rats obtained by a heat stress experiment. By applying those models, we considered four sources of variation in transcript expression, comprising transcripts (1), genes (2), Gene Ontology terms (3), and Reactome pathways (4), and focussed on accounting for the similarity within each source, which was expressed as a covariance matrix. Models based on transcript or gene levels explained a larger proportion of log fold change than models fitting the functional components of Gene Ontology terms or Reactome pathways. In the liver, among the most significant genes were PNKD and TRIP12. In the adrenal tissue, one transcript of the SUCO gene was expressed more strongly in the control group than in the heat-stress group. PLEC had two transcripts that were significantly overexpressed in the heat-stress group. PER3 was significant only at the gene level. Moving to the functional scale, five Gene Ontologies and one Reactome pathway were significant in the liver. They can be grouped into ontologies related to DNA repair, histone ubiquitination, the regulation of embryonic development and cytoplasmic translation. Linear mixed models are valuable tools for the analysis of high-throughput biological data. Their main advantages are the ability to incorporate information on covariance between observations and to circumvent the problem of multiple testing.
Humans have been influencing climate change by burning fossil fuels, farming livestock, and cutting down rainforests, which has led to a rise in global temperature. Global warming affects animals by causing heat stress, which negatively affects their health, biological functions, and reproduction. On the molecular level, heat stress has been shown to change gene expression levels and therefore to cause changes in the proteome and metabolome. Many studies have demonstrated the importance of the microbiome, which is considered an individual's "second genome". Physiological changes caused by heat stress may impact the microbiome composition. In this study, we identified fecal microbiota associated with heat stress, which was quantified by three metrics (rectal temperature, drooling, and respiratory scores) represented by their Estimated Breeding Values. We analyzed the microbiota from 136 fecal samples of Chinese Holstein cows through a 16S rRNA gene sequencing approach. Statistical modeling was performed using a negative binomial regression. The analysis revealed a total of 24 genera and 12 phyla associated with heat stress metrics. Rhizobium and Pseudobutyrivibrio were the most significant genera, while Acidobacteria and Gemmatimonadetes were the most significant phyla. Phylogenetic analysis revealed that the three heat stress indicators quantify different metabolic ways in which animals react to heat stress. Other studies have already reported that those genera had significantly increased abundance in mice exposed to stressor-induced changes. This study provides insights into the analysis of microbiome composition in cattle using heat stress measured as a continuous variable. The bacteria highly associated with heat stress were highlighted and can be used as biomarkers in further microbiological studies.
The new ARS-UCD1.2 assembly of the bovine genome has considerable improvements over the previous assembly and thus more accurate identification of patterns of genetic variation can be achieved with it. We explored differences in genetic variation between autosomes, the X chromosome, and the Y chromosome. In particular, variant densities, annotations, lengths (only for InDels), nucleotide divergence, and Tajima's D statistics between chromosomes were considered. Whole-genome DNA sequences of 217 individuals representing different cattle breeds were examined. The analysis included the alignment to the new reference genome and variant identification. In total, 23,655,295 SNPs and 3,758,781 InDels were detected. In contrast to autosomes, both sex chromosomes had negative values of Tajima's D and lower nucleotide divergence. That implies a correlation between nucleotide diversity and recombination rate, which is reduced for sex chromosomes. Moreover, the accumulation of nonsynonymous mutations on the Y chromosome could be associated with loss of recombination. Also, the relatively lower effective population size for sex chromosomes leads to a lower expected density of variants.
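As an illustration of the Tajima's D statistic referred to above, the sketch below computes it from a small set of aligned haplotypes using the standard formula (the difference between the mean number of pairwise differences and Watterson's estimator, scaled by its estimated standard deviation). The input encoding (strings of "0"/"1" per haplotype) and the function name `tajimas_d` are assumptions made for this example and do not reflect the pipeline used in the study.

```python
from math import sqrt


def tajimas_d(haplotypes):
    """Tajima's D for equal-length haplotypes coded as strings of '0'/'1'.

    Returns None when the statistic is undefined (fewer than 4 sequences,
    no segregating sites, or a non-positive variance estimate).
    """
    n = len(haplotypes)
    if n < 4:
        return None
    n_sites = len(haplotypes[0])
    # Number of copies of allele '1' at each site.
    counts = [sum(h[j] == "1" for h in haplotypes) for j in range(n_sites)]
    segregating = [k for k in counts if 0 < k < n]
    s = len(segregating)
    if s == 0:
        return None

    # Mean number of pairwise differences (pi), summed over sites.
    n_pairs = n * (n - 1) / 2
    pi = sum(k * (n - k) for k in segregating) / n_pairs

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * s + e2 * s * (s - 1)
    if variance <= 0:
        return None
    return (pi - s / a1) / sqrt(variance)


# Toy data: only rare (singleton) variants, which pushes D below zero.
d_rare = tajimas_d(["1000", "0100", "0010", "0001", "0000"])
# Toy data: only intermediate-frequency variants, which pushes D above zero.
d_common = tajimas_d(["11", "11", "00", "00"])
```

An excess of rare variants yields a negative value, which is the direction reported above for both sex chromosomes.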
The single-step model is becoming increasingly popular for national genetic evaluations of dairy cattle due to the benefits that it offers, such as joint breeding value estimation for genotyped and ungenotyped animals. However, the complexity of the model due to a large number of correlated effects can lead to significant computational challenges, especially in terms of accuracy and efficiency of the preconditioned conjugate gradient method used for the estimation. The aim of this study was to investigate the effect of pedigree depth on the model's overall convergence rate as well as on the convergence of different components of the model, in the context of the single-step single nucleotide polymorphism best linear unbiased prediction (SNP-BLUP) model. The results demonstrate that the dataset with a truncated pedigree converged twice as fast as the full dataset. Still, both datasets showed very high Pearson correlations between predicted breeding values. In addition, by comparing the top 50 bulls between the two datasets we found a high correlation between their rankings. We also analysed the specific convergence patterns underlying different animal groups and model effects, which revealed heterogeneity in convergence behaviour. Effects of SNPs converged the fastest while those of genetic groups converged the slowest, which reflects the difference in information content available in the dataset for those effects. Pre-selection criteria for the SNP set based on minor allele frequency had no impact on either the rate or pattern of their convergence. Among different groups of individuals, genotyped animals with phenotype data converged the fastest, while non-genotyped animals without own records required the largest number of iterations. We conclude that pedigree structure markedly impacts the convergence rate of the optimisation, which is more efficient for the truncated than for the full dataset.
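For readers unfamiliar with the preconditioned conjugate gradient (PCG) method mentioned above, here is a minimal sketch for a small dense symmetric positive definite system. It assumes a diagonal (Jacobi) preconditioner and a relative-residual stopping rule; the preconditioner, the convergence criterion, and the function name `pcg` are illustrative choices, not those of the evaluation software used in the study.

```python
def pcg(a, b, tol=1e-10, max_iter=1000):
    """Solve a x = b for symmetric positive definite `a` (list of lists).

    Uses a Jacobi (diagonal) preconditioner. Returns (x, iterations).
    """
    n = len(b)

    def matvec(v):
        return [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    b_norm2 = dot(b, b)
    x = [0.0] * n
    if b_norm2 == 0:
        return x, 0  # trivial system: the zero vector is the solution

    m_inv = [1.0 / a[i][i] for i in range(n)]  # inverse of the diagonal
    r = list(b)                                # residual b - a x for x = 0
    z = [m_inv[i] * r[i] for i in range(n)]    # preconditioned residual
    p = list(z)                                # first search direction
    rz = dot(r, z)

    for iteration in range(1, max_iter + 1):
        ap = matvec(p)
        alpha = rz / dot(p, ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * ap[i] for i in range(n)]
        # Stop when the relative residual norm falls below tol.
        if dot(r, r) <= tol ** 2 * b_norm2:
            return x, iteration
        z = [m_inv[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [z[i] + beta * p[i] for i in range(n)]
        rz = rz_new
    return x, max_iter


# Toy 2x2 system with exact solution (1/11, 7/11).
solution, n_iterations = pcg([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

The number of iterations returned is the quantity whose dependence on pedigree depth is examined above. In a real evaluation the coefficient matrix is far too large to store densely, so the matrix-vector product is computed without forming the matrix.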
A serious drawback underlying the biological annotation of whole-genome sequence data is the p >> n problem, which means that the number of polymorphic variants (p) is much larger than the number of available phenotypic records (n). We propose a way to circumvent the problem by combining a LASSO logistic regression with deep learning to classify cows as susceptible or resistant to mastitis, based on single nucleotide polymorphism (SNP) genotypes. Among several architectures, the one with 204,642 SNPs was selected as the best. This architecture was composed of two layers with, respectively, 7 and 46 units per layer, implementing respective drop-out rates of 0.210 and 0.358. The classification of the test data resulted in AUC = 0.750, accuracy = 0.650, sensitivity = 0.600, and specificity = 0.700. Significant SNPs were selected based on the SHapley Additive exPlanation (SHAP). As a final result, one GO term related to biological process and thirteen GO terms related to molecular function were significantly enriched in the gene set that corresponded to the significant SNPs. Our findings revealed that the optimal approach can correctly predict susceptibility or resistance status for approximately 65% of cows. Genes marked by the most significant SNPs are related to the immune response and protein synthesis.
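The relation between the reported accuracy, sensitivity, and specificity can be illustrated with a short sketch. The confusion-matrix counts below are hypothetical (they are not the study's test set) and were chosen only so that the resulting metrics equal the reported values. The sketch also assumes that "susceptible" is treated as the positive class, which the text does not state.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),                # true positive rate
        "specificity": tn / (tn + fp),                # true negative rate
        "accuracy": (tp + tn) / (tp + fn + tn + fp),  # overall hit rate
    }


# Hypothetical balanced test set: 10 positive and 10 negative animals.
metrics = classification_metrics(tp=6, fn=4, tn=7, fp=3)
```

With equally sized classes, accuracy is the mean of sensitivity and specificity, so values of 0.600 and 0.700 are consistent with an accuracy of 0.650, that is, roughly 65% of animals classified correctly.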
In Bos taurus the universality of the reference genome is biased towards genetic variation represented by only two related individuals representing the same Hereford breed. Therefore, results of genetic analyses based on this reference may not be reliable. The 1000 Bull Genomes resource allows for identification of breed-specific polymorphisms and for the construction of breed-specific reference genomes. Whole-genome sequences of 936 bulls allowed us to construct seven breed-specific reference genomes of Bos taurus for Angus, Brown Swiss, Fleckvieh, Hereford, Jersey, Limousin and Simmental. In order to identify breed-specific variants, all detected SNPs were filtered within breed to satisfy two criteria: a proportion of missing genotypes not higher than 7% and an alternative allele frequency equal to unity. The highest number of breed-specific SNPs was identified for Jersey (130,070) and the lowest for the Simmental breed (197). Such breed-specific polymorphisms were annotated to coding regions overlapping with 78 genes in Angus, 140 in Brown Swiss, 132 in Fleckvieh, 100 in Hereford, 643 in Jersey, 10 in Limousin and no genes in Simmental. For most of the breeds, the majority of breed-specific variants from coding regions were synonymous. However, most of the Fleckvieh-specific and Hereford-specific polymorphisms were missense mutations. Since the identified variants are characteristic of the analysed breeds, they form the basis of phenotypic differences observed between them, which result from different breeding programmes. Breed-specific reference genomes can enhance the accuracy of SNP-driven inferences such as genome-wide association studies or SNP genotype imputation.
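The within-breed filter described above can be sketched as a single predicate. The genotype encoding (number of alternative alleles, 0, 1, or 2, with `None` for a missing call) and the function name `passes_within_breed_filter` are assumptions made for this example. Only the two stated criteria are implemented; any additional comparison across breeds is not shown.

```python
def passes_within_breed_filter(genotypes, max_missing=0.07):
    """Check one SNP within one breed.

    `genotypes` holds one value per animal: the alternative-allele count
    (0, 1, or 2) or None for a missing genotype. The SNP passes when at
    most `max_missing` of genotypes are missing and the alternative
    allele frequency among called genotypes equals 1.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return False
    missing_rate = (len(genotypes) - len(called)) / len(genotypes)
    if missing_rate > max_missing:
        return False
    # Each animal carries two alleles; counts are integers, so the
    # frequency is exactly 1.0 only when every called genotype is 2.
    alt_frequency = sum(called) / (2 * len(called))
    return alt_frequency == 1.0


fixed_snp = [2] * 19 + [None]         # 5% missing, alternative allele fixed
sparse_snp = [2] * 9 + [None]         # 10% missing
segregating_snp = [2] * 19 + [1]      # one heterozygous animal
```

Here `fixed_snp` passes, while `sparse_snp` fails on the missingness threshold and `segregating_snp` fails because the alternative allele is not fixed.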
Climate change affects animal physiology. In particular, rising ambient temperatures reduce animal vitality due to heat stress, and this can be observed at various levels, including the genome, transcriptome, and microbiome. In a previous study, microbiota highly associated with changes in cattle physiology, which included rectal temperature, drooling score and respiratory score, were identified under heat stress conditions. In the present study, we selected genes differentially expressed between individuals representing different additive genetic effects on the heat stress response in cattle under production conditions. Moreover, a correlation network analysis was performed to identify interactions between the transcriptome and microbiome for 71 Chinese Holstein cows sequenced for mRNA from blood samples and for 16S rRNA genes from fecal samples. Bioinformatics analysis comprised: i) clustering and classification of 16S rRNA sequence reads, ii) mapping cows' transcripts to the reference genome and their expression quantification, and iii) statistical analysis of both data types, including differential gene expression analysis and gene set enrichment analysis. A weighted co-expression network analysis was carried out to assess changes in the association between gene expression and microbiota abundance as well as to find hub genes/microbiota responsible for the regulation of gene expression under heat stress. We found 1,851 differentially expressed genes shared by the three heat stress phenotypes. These genes were predominantly associated with the cytokine-cytokine receptor interaction pathway. The interaction analysis revealed three modules of genes and microbiota associated with rectal temperature. Two hubs of those modules were bacterial species, demonstrating the importance of the microbiome in the regulation of gene expression during heat stress.
Genes and microbiota from the significant modules can be used as biomarkers of heat stress in cattle.
Copy number variants (CNVs) may cover up to 12% of the whole genome and have substantial impact on phenotypes. We used 5,867 duplications and 33,181 deletions available from the 1000 Genomes Project to characterise genomic regions vulnerable to CNV formation and to identify sequence features characteristic of those regions. The GC content for deletions was lower and for duplications was higher than for randomly selected regions. In regions flanking deletions and downstream of duplications, GC content was higher than in the random sequences, but upstream of duplications it was lower. In duplications and downstream of deletion regions, the percentage of low-complexity sequences was not different from the randomised data. In deletions and upstream of CNVs, it was higher, while downstream of duplications it was lower as compared to random sequences. The majority of CNVs intersected with genic regions, mainly with introns. GC content may be associated with CNV formation, and CNVs, especially duplications, are initiated in low-complexity regions. Moreover, CNVs located within or overlapping introns indicate their role in shaping intron variability. Genic CNV regions were enriched in many essential biological processes such as cell adhesion, synaptic transmission, transport, cytoskeleton organization, immune response and metabolic mechanisms, which indicates that these large-scale variants play important biological roles.
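The GC-content comparison described above relies on computing the GC fraction of a CNV and of its flanking regions. A minimal sketch follows; the flank length, the toy sequence, and the function names `gc_content` and `gc_profile` are illustrative assumptions, and "upstream" and "downstream" are taken here simply as lower and higher coordinates on the forward strand.

```python
def gc_content(seq):
    """Fraction of G and C among unambiguous bases; None if there are none."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc / (gc + at) if gc + at else None


def gc_profile(chrom, start, end, flank):
    """GC content of a CNV at [start, end) and of its two flanking regions."""
    return {
        "upstream": gc_content(chrom[max(0, start - flank):start]),
        "cnv": gc_content(chrom[start:end]),
        "downstream": gc_content(chrom[end:end + flank]),
    }


# Toy sequence: AT-only flank, GC-rich "CNV", mixed flank.
toy_chromosome = "ATATATAT" + "GCGCGCAT" + "GGCCAATT"
profile = gc_profile(toy_chromosome, start=8, end=16, flank=8)
```

The same functions applied to randomly placed regions of matching length would give the reference distribution against which CNV values are compared.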
Undoubtedly, genetic factors play an important role in susceptibility and resistance to COVID-19. In this study, we conducted a genome-wide association study (GWAS). Out of 15,489,173 SNPs, we identified 18,191 significant SNPs for the severe and 11,799 SNPs for the resistant phenotype, showing that a great number of loci were significant in different COVID-19 representations. The majority of variants were synonymous (60.56% for severe, 58.46% for resistant phenotype) or located in introns (55.77% for severe, 59.83% for resistant phenotype). We identified the most significant SNPs for a severe outcome (in an AJAP1 intron) and for COVID-19 resistance (in a FIG4 intron). We found no missense variants with a potential causal function in resistance to COVID-19; however, two missense variants were significant for a severe phenotype (in PM20D1 and LRP4 exons). None of the aforementioned SNPs and missense variants found in this study have been previously associated with COVID-19.