Abstract
Background
The shape of the pig scapula is complex and is important for sow robustness and health. To better understand the relationship between the 3D shape of the scapula and functional traits, it is necessary to build a model that explains most of the morphological variation between animals. This requires point correspondence, i.e. a map that explains which points represent the same piece of tissue among individuals. The objective of this study was to further develop an automated computational pipeline for the segmentation of computed tomography (CT) scans to incorporate 3D modelling of the scapula, and to develop a genetic prediction model for 3D morphology.
Results
The surface voxels of the scapula were identified on 2143 CT-scanned pigs, and point correspondence was established by predicting the coordinates of 1234 semi-landmarks on each animal, using the coherent point drift algorithm. A subsequent principal component analysis showed that the first 10 principal components covered more than 80% of the total variation in 3D shape of the scapula. Using principal component scores as phenotypes in a genetic model, estimates of heritability ranged from 0.4 to 0.8 (with standard errors from 0.07 to 0.08). To validate the entire computational pipeline, a statistical model was trained to predict scapula shape based on marker genotype data. The mean prediction reliability averaged over the whole scapula was equal to 0.18 (standard deviation = 0.05) with a higher reliability in convex than in concave regions.
Conclusions
Estimates of heritability of the principal components were high and indicated that the computational pipeline that processes CT data to principal component phenotypes was associated with little error. Furthermore, we showed that it is possible to predict the 3D shape of scapula based on marker genotype data. Taken together, these results show that the proposed computational pipeline closes the gap between a point cloud representing the shape of an animal and its underlying genetic components.
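The step from corresponding semi-landmarks to principal component phenotypes can be illustrated with a minimal sketch. It assumes that the landmark coordinates are already in correspondence and stored as an array of shape (animals, landmarks, 3); the function names `shape_pca` and `n_components_for` are hypothetical and are not taken from the study's pipeline.

```python
import numpy as np

def shape_pca(landmarks):
    """PCA of corresponding 3D landmarks (illustrative sketch).

    landmarks: array of shape (n_animals, n_landmarks, 3).
    Returns (scores, explained), where `explained` holds the share of
    total variance carried by each principal component.
    """
    landmarks = np.asarray(landmarks, dtype=float)
    n_animals = landmarks.shape[0]
    # Flatten each animal's coordinates into a single row vector.
    X = landmarks.reshape(n_animals, -1)
    # Centre on the mean shape so that components describe variation around it.
    Xc = X - X.mean(axis=0)
    # SVD of the centred data; singular values come out in descending order.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U * s                      # principal component scores per animal
    explained = s**2 / np.sum(s**2)     # variance share per component
    return scores, explained

def n_components_for(explained, threshold=0.80):
    """Smallest number of leading components whose cumulative share reaches `threshold`."""
    return int(np.searchsorted(np.cumsum(explained), threshold) + 1)
```

The score columns returned by `shape_pca` would play the role of the phenotypes analysed in the genetic model, one trait per component.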
In pigs, crossbreeding aims at exploiting heterosis, but heterosis is difficult to quantify. Heterozygosity at genetic markers is easier to measure and could potentially be used as an indicator of heterosis. The objective of this study was to investigate the effect of heterozygosity on various maternal and production traits in purebred and crossbred pigs. The proportion of heterozygosity at genetic markers across the genome for each individual was included in the prediction model as a fixed regression across or within breeds.
Estimates of regression coefficients of heterozygosity showed large effects for some traits. For maternal traits, regression coefficient estimates were always in a favourable direction, while for production, meat and slaughter quality traits, they were both favourable and unfavourable. Traits with the largest estimated effects of heterozygosity were total number born, litter weight at 3 weeks, weight at 150 days, and age at 40 kg. Estimates of regression coefficients on heterozygosity differed between breeds. Traits with the largest effect of heterozygosity also showed a significant (P < 0.05) increase in prediction accuracy when heterozygosity was included in the model compared to the model without heterozygosity.
For traits with the largest estimates of regression coefficients on heterozygosity, the inclusion of heterozygosity in the model improved prediction accuracy. Using models that include heterozygosity would result in selecting different animals for breeding, which has the potential to improve genetic gain for these traits. This is most beneficial when crossbreds or several breeds are included in the estimation of breeding values and is relevant to all species, not only pigs. Thus, our results show that including heterozygosity in the model is beneficial for some traits, likely due to dominant gene action.
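As an illustration of the heterozygosity covariate, the sketch below computes the proportion of heterozygous markers per individual and fits it as a fixed regression. It assumes genotypes coded as 0/1/2 allele counts with `np.nan` for missing calls, and it uses ordinary least squares with an intercept only; the study's model also contained breeding values and other effects, which are omitted here. The function names are hypothetical.

```python
import numpy as np

def heterozygosity(genotypes):
    """Proportion of heterozygous markers per individual.

    genotypes: (n_individuals, n_markers) array coded 0/1/2,
    with np.nan for missing calls (assumed coding).
    """
    G = np.asarray(genotypes, dtype=float)
    called = ~np.isnan(G)          # markers with a genotype call
    het = G == 1                   # heterozygous calls (nan compares False)
    return het.sum(axis=1) / called.sum(axis=1)

def fit_fixed_regression(y, het):
    """Least-squares fit of phenotype on intercept + heterozygosity.

    Returns (intercept, regression coefficient on heterozygosity).
    """
    het = np.asarray(het, dtype=float)
    X = np.column_stack([np.ones_like(het), het])
    beta, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return beta[0], beta[1]
```

A within-breed regression could be sketched in the same way by fitting one heterozygosity column per breed instead of a single shared column.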
Sequence-based genome-wide association studies (GWAS) provide high statistical power to identify candidate causal mutations when a large number of individuals with both sequence variant genotypes and phenotypes is available. A meta-analysis combines summary statistics from multiple GWAS and increases the power to detect trait-associated variants without requiring access to data at the individual level of the GWAS mapping cohorts. Because linkage disequilibrium between adjacent markers is conserved only over short distances across breeds, a multi-breed meta-analysis can improve mapping precision.
To maximise the power to identify quantitative trait loci (QTL), we combined the results of nine within-population GWAS that used imputed sequence variant genotypes of 94,321 cattle from eight breeds, to perform a large-scale meta-analysis for fat and protein percentage in cattle. The meta-analysis detected (p ≤ 10^…) 138 QTL for fat percentage and 176 QTL for protein percentage. This was more than the number of QTL detected in all within-population GWAS together (124 QTL for fat percentage and 104 QTL for protein percentage). Among all the lead variants, 100 QTL for fat percentage and 114 QTL for protein percentage had the same direction of effect in all within-population GWAS. This indicates either persistence of the linkage phase between the causal variant and the lead variant across breeds or that some of the lead variants might indeed be causal or tightly linked with causal variants. The percentage of intergenic variants was substantially lower for significant variants than for non-significant variants, and significant variants had mostly moderate to high minor allele frequencies. Significant variants were also clustered in genes that are known to be relevant for fat and protein percentages in milk.
Our study identified a large number of QTL associated with fat and protein percentage in dairy cattle. We demonstrated that large-scale multi-breed meta-analysis reveals more QTL at the nucleotide resolution than within-population GWAS. Significant variants were more often located in genic regions than non-significant variants and a large part of them was located in potentially regulatory regions.
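How summary statistics from several GWAS can be combined without individual-level data may be sketched as follows. The abstract does not state which weighting scheme was used, so the fixed-effect inverse-variance method shown here is an assumption chosen for illustration, and `meta_analyse` is a hypothetical name.

```python
import math

def meta_analyse(betas, ses):
    """Fixed-effect inverse-variance meta-analysis for a single variant.

    betas: effect estimates from the individual GWAS (same allele coding).
    ses:   their standard errors.
    Returns (pooled effect, pooled standard error, z-score, two-sided p-value).
    """
    weights = [1.0 / se**2 for se in ses]            # precision of each study
    w_sum = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    se = math.sqrt(1.0 / w_sum)
    z = beta / se
    # Two-sided p-value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p
```

When the studies agree in direction of effect, the pooled standard error is smaller than that of any single study, which is why a variant can reach significance in the meta-analysis without doing so in any within-population GWAS.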
It has been debated whether intensive selection for growth and carcass yield in pig breeding programmes can affect the size of internal organs, and thereby reduce the animal's ability to handle stress and increase the risk of sudden deaths. To explore the respiratory and circulatory system in pigs, a deep-learning-based computational pipeline was built to extract the size of lungs and hearts from CT-scan images. This pipeline was applied to CT images from 11,000 boar selection candidates acquired during the last decade. Further, heart and lung volumes were analysed genetically and correlated with production traits. Both heart and lung volumes were heritable, with h² estimated at 0.35 and 0.34, respectively, in Landrace, and 0.28 and 0.4 in Duroc. Both volumes were positively correlated with lean meat percentage, and lung volume was negatively genetically correlated with growth (r_g = -0.48 ± 0.07 for Landrace and r_g = -0.44 ± 0.07 for Duroc). The main findings suggest that the current pig breeding programmes could, as an indirect response to selection, affect the size of hearts and lungs. The presented methods can be used to monitor the development of internal organs in the future.
The main aim of this study was to create an automated method for the measurement of the scrotal circumference (SC) of Norwegian Red bulls using 3D images of the scrotum based on convolutional neural networks. The study population was bull calves recruited for performance testing before the selection of bulls for semen production in the breeding program. Bulls were measured at four different time points: upon arrival in quarantine (Q) and thereafter at approximately 6, 9 and 12 months of age. 3D images were captured and manual SC measurements were taken at all time points. In our approach, SC could be calculated without direct contact with the bull, using only 3D images and a simple, user-friendly application into which the images are uploaded. The results show that SC measurements obtained using semantic segmentation are comparable with manual measurements. The mean prediction error differed significantly between age groups Q, 6, 9 and 12, and was -3.07 cm, -3.02 cm, -1.79 cm and -1.11 cm, respectively. The results also show a significant difference in the measurement error of the SC depending on the quality of the images. Images were categorised into three quality groups. For good prediction accuracy, we recommend capturing 3D images of quality 2 (full circle) from individuals older than 6 months.
The main aim of single-step genomic predictions was to facilitate optimal selection in populations consisting of both genotyped and non-genotyped individuals. However, in spite of intensive research, biases still occur, which make it difficult to perform optimal selection across groups of animals. The objective of this study was to investigate whether incomplete genotype datasets with errors could be a potential source of level-bias between genotyped and non-genotyped animals and between animals genotyped on different single nucleotide polymorphism (SNP) panels in single-step genomic predictions.
Incomplete and erroneous genotypes of young animals caused biases in breeding values between groups of animals. Systematic noise or missing data for less than 1% of the SNPs in the genotype data had substantial effects on the differences in breeding values between genotyped and non-genotyped animals, and between animals genotyped on different chips. The breeding values of young genotyped individuals were biased upward, and the magnitude was up to 0.8 genetic standard deviations, compared with breeding values of non-genotyped individuals. Similarly, the magnitude of a small value added to the diagonal of the genomic relationship matrix affected the level of average breeding values between groups of genotyped and non-genotyped animals. Cross-validation accuracies and regression coefficients were not sensitive to these factors.
Because, historically, different SNP chips have been used for genotyping different parts of a population, fine-tuning of imputation within and across SNP chips and handling of missing genotypes are crucial for reducing bias. Although all the SNPs used for estimating breeding values are present on the chip used for genotyping young animals, incompleteness and some genotype errors might lead to level-biases in breeding values.
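The "small value added to the diagonal of the genomic relationship matrix" mentioned above can be made concrete with a short sketch. It assumes a commonly used construction of the matrix (centred 0/1/2 genotypes scaled by twice the sum of p(1 - p) over markers) and complete genotypes; the construction actually used in the study, the value of `eps`, and the function name are assumptions made for illustration.

```python
import numpy as np

def genomic_relationship(M, eps=0.01):
    """Genomic relationship matrix with `eps` added to the diagonal.

    M: (n_animals, n_markers) genotypes coded 0/1/2 without missing values.
    Allele frequencies are estimated from M itself (an assumption).
    """
    M = np.asarray(M, dtype=float)
    p = M.mean(axis=0) / 2.0                 # observed allele frequencies
    Z = M - 2.0 * p                          # centre each marker column
    G = Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))
    # Centring with observed frequencies makes G singular; the diagonal
    # term makes it invertible, and its size is a tuning choice.
    return G + eps * np.eye(M.shape[0])
```

Because the inverse of this matrix enters the single-step equations, the choice of `eps` is one of the settings that can shift the average level of breeding values between genotyped and non-genotyped groups.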
The VPH/Physiome Project is developing the model encoding standards CellML (cellml.org) and FieldML (fieldml.org) as well as web-accessible model repositories based on these standards (models.physiome.org). Freely available open source computational modelling software is also being developed to solve the partial differential equations described by the models and to visualise results. The OpenCMISS code (opencmiss.org), described here, has been developed by the authors over the last six years to replace the CMISS code that has supported a number of organ system Physiome projects.
OpenCMISS is designed to encompass multiple sets of physical equations and to link subcellular and tissue-level biophysical processes into organ-level processes. In the Heart Physiome project, for example, the large deformation mechanics of the myocardial wall need to be coupled to both ventricular flow and embedded coronary flow, and the reaction–diffusion equations that govern the propagation of electrical waves through myocardial tissue need to be coupled with equations that describe the ion channel currents that flow through the cardiac cell membranes.
In this paper we discuss the design principles and distributed memory architecture behind the OpenCMISS code. We also discuss the design of the interfaces that link the sets of physical equations across common boundaries (such as fluid-structure coupling), or between spatial fields over the same domain (such as coupled electromechanics), and the concepts behind CellML and FieldML that are embodied in the OpenCMISS data structures. We show how all of these provide a flexible infrastructure for combining models developed across the VPH/Physiome community.
We investigate the impact of different data modalities for cattle weight estimation. For this purpose, we collect and present our own cattle dataset representing the data modalities: RGB, depth, combined RGB and depth, segmentation, and combined segmentation and depth information. We explore a recent vision-transformer-based zero-shot model proposed by Meta AI Research for producing the segmentation data modality and for extracting the cattle-only region from the images. For experimental analysis, we consider three baseline deep learning models. The objective is to assess how the integration of diverse data sources influences the accuracy and robustness of the deep learning models considering four different performance metrics: mean absolute error (MAE), root mean squared error (RMSE), mean absolute percentage error (MAPE), and R-squared (R2). We explore the synergies and challenges associated with each modality and their combined use in enhancing the precision of cattle weight prediction. Through comprehensive experimentation and evaluation, we aim to provide insights into the effectiveness of different data modalities in improving the performance of established deep learning models, facilitating informed decision-making for precision livestock management systems.
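The four performance metrics named above have standard definitions, which the following sketch computes for a set of observed and predicted weights. The function name `regression_metrics` is hypothetical, and MAPE is expressed in percent, which assumes that no observed weight is zero.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """MAE, RMSE, MAPE (in percent) and R2 for predicted weights."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err**2))
    mape = 100.0 * np.mean(np.abs(err / y_true))     # requires y_true != 0
    ss_res = np.sum(err**2)                          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean())**2)     # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": r2}
```

MAE and RMSE are in the unit of the trait (here kilograms), whereas MAPE and R2 are scale-free, which makes them easier to compare across datasets with different weight ranges.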
We propose optimized deep learning (DL) models for automatic analysis of udder conformation traits of cattle. One of the traits is represented by supernumerary teats, that is, teats in excess of the normal number. Supernumerary teats are the most common congenital heritable condition in cattle. The major advantage of our proposed method is its capability to automatically select the relevant images and thereafter perform supernumerary teat classification when limited data are available. For this purpose, we perform experimental analysis on the image dataset that we collected using a handheld device consisting of a combined depth and RGB camera. To disclose the underlying characteristics of our data, we consider the uniform manifold approximation and projection (UMAP) technique. Furthermore, for comprehensive evaluation, we explore the impact of different data augmentation techniques on the performances of DL models. We also explore the impact of only RGB data and the combination of RGB and depth data on the performances of the DL models. For this purpose, we integrate the three channels of RGB data with the depth channel to generate four channels of data. We present the results of all the models in terms of four performance metrics, namely accuracy, F-score, precision, and sensitivity. The experimental results reveal that a higher level of data augmentation improves the performances of the DL models by approximately 10%. Our proposed method also outperforms the reference methods recently introduced in the literature.
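The integration of the three RGB channels with the depth channel into four-channel data can be sketched as below. The scaling of RGB to [0, 1] and the per-image min-max scaling of depth are assumptions made for illustration; the abstract does not describe the preprocessing, and `stack_rgbd` is a hypothetical name.

```python
import numpy as np

def stack_rgbd(rgb, depth):
    """Combine an RGB image and a depth map into one four-channel array.

    rgb:   (height, width, 3) array with values in 0..255.
    depth: (height, width) array of depth readings.
    Returns a float32 array of shape (height, width, 4).
    """
    rgb = np.asarray(rgb, dtype=np.float32)
    depth = np.asarray(depth, dtype=np.float32)
    if rgb.ndim != 3 or rgb.shape[2] != 3 or rgb.shape[:2] != depth.shape:
        raise ValueError("rgb must be (H, W, 3) and depth must be (H, W)")
    rgb = rgb / 255.0                                 # assumed scaling
    d_min, d_max = depth.min(), depth.max()
    if d_max > d_min:
        depth = (depth - d_min) / (d_max - d_min)     # assumed min-max scaling
    else:
        depth = np.zeros_like(depth)                  # flat depth map
    # Append depth as the fourth channel along the last axis.
    return np.concatenate([rgb, depth[..., None]], axis=-1)
```

A model pretrained on three-channel images would need its first layer adapted to accept the extra channel; how that was handled in the study is not stated in the abstract.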
Abstract
Bias and inflation in genomic evaluation with the single-step methods have been reported in several studies. Incompatibility between the base-populations of the pedigree-based and the genomic relationship matrix (G) could be a reason for these biases. Inappropriate ways of accounting for missing parents could be another reason for biases in genetic evaluations with or without genomic information. To handle these problems, we fitted and evaluated a fixed covariate (J) that contains ones for genotyped animals and zeros for unrelated non-genotyped animals, or pedigree-based regression coefficients for related non-genotyped animals. We also evaluated alternative ways of fitting the J covariate together with genetic groups on biases and stability of breeding value estimates, and of including it into G as a random effect. In a whole vs. partial data set comparison, four scenarios were investigated for the partial data: genotypes missing, phenotypes missing, both genotypes and phenotypes missing, and pedigree missing. Fitting J either as fixed or random reduced level-bias and inflation and increased stability of genomic predictions as compared to the basic model where neither J nor genetic groups were fitted. In most models, genomic predictions were largely biased for scenarios with missing genotype and phenotype information. The biases were reduced for models which combined group and J effects. Models with these corrected group covariates performed better than the recently published model where genetic groups were encapsulated and fitted as random via the Quaas and Pollak transformation. In our Norwegian Red cattle data, a model which combined group and J regression coefficients was preferred because it showed least bias and highest stability of genomic predictions across the scenarios.
Towards an unbiased and stable combination of information from genotyped and non-genotyped animals in genomic prediction models.
Lay Summary
Our study dealt with strategies to reduce biases (inflation and level-bias) and improve a parameter related to accuracy (stability) of genomic predictions of breeding values that combine genotyped and non-genotyped animals, which are denoted as single-step genomic predictions. We tried to remedy incompatibilities between the pedigree-based and the genomics-based relationship matrices by fitting a covariate (J) that corrects for base-population differences that may occur between both relationship matrices. We also evaluated alternative ways to combine the J covariate and genetic group effects to account for missing parental information, which often occurs in practical breeding schemes. We found that fitting J either as fixed or random reduced level-bias and inflation and increased stability of genomic predictions as compared to the basic model where neither J nor genetic groups were fitted. Level-biases and inflation of breeding value estimates were reduced, and stability of genomic predictions improved for models which combined group and J effects. A model which fits group regression coefficients minus the part that could be explained from pedigree was recommended because it showed least bias and highest stability across the scenarios and has theoretical justification.
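The structure of the J covariate (ones for genotyped animals, zeros for unrelated non-genotyped animals, pedigree-based regression coefficients for related non-genotyped animals) can be illustrated with a small sketch. It assumes that the regression coefficients are obtained by regressing on the genotyped animals through the pedigree relationship matrix A, i.e. J_n = A_ng A_gg^{-1} 1; this formula and the function name `j_covariate` are assumptions for illustration and may differ from the exact definition used in the study.

```python
import numpy as np

def j_covariate(A, genotyped):
    """J covariate from a pedigree relationship matrix (illustrative sketch).

    A:         (n, n) pedigree-based numerator relationship matrix.
    genotyped: boolean mask of length n, True for genotyped animals.
    """
    A = np.asarray(A, dtype=float)
    g = np.asarray(genotyped, dtype=bool)
    J = np.ones(A.shape[0])                    # genotyped animals receive 1
    A_ng = A[np.ix_(~g, g)]                    # non-genotyped x genotyped block
    A_gg = A[np.ix_(g, g)]                     # genotyped x genotyped block
    # Assumed form: regression of non-genotyped animals on genotyped ones.
    J[~g] = A_ng @ np.linalg.solve(A_gg, np.ones(int(g.sum())))
    return J

# Toy pedigree: animals 0 and 1 are the non-genotyped parents of the
# genotyped animal 2; animal 3 is non-genotyped and unrelated to the others.
A_toy = np.array([
    [1.0, 0.0, 0.5, 0.0],
    [0.0, 1.0, 0.5, 0.0],
    [0.5, 0.5, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
])
J_toy = j_covariate(A_toy, [False, False, True, False])
```

In this toy pedigree the parents of the genotyped animal receive intermediate coefficients, while the unrelated animal receives zero, matching the pattern described in the abstract.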