Whole-genome sequencing projects covering millions of subjects produce enormous genotype datasets, imposing a heavy memory burden and long computation times. Here, we present GBC, a toolkit for rapidly compressing large-scale genotypes into highly addressable byte-encoding blocks under an optimized parallel framework. We demonstrate that GBC is up to 1000 times faster than state-of-the-art methods at accessing and managing compressed large-scale genotypes while maintaining a competitive compression ratio. We also show that conventional analyses would be substantially sped up if built on GBC to access the genotypes of a large population. GBC's data structure and algorithms are valuable for accelerating large-scale genomic research.
This book is the first in a projected series on Evolutionary Cell Biology, the intent of which is to demonstrate the essential role of cellular mechanisms in transforming the genotype into the phenotype by transforming gene activity into evolutionary change in morphology. This book, Cells in Evolutionary Biology, evaluates the evolution of cells themselves and the role cells have been viewed to play as agents of change at other levels of biological organization. Chapters explore Darwin’s use of cells in his theory of evolution and how Weismann’s theory of the separation of germ plasm from body cells brought cells to center stage in understanding how acquired changes to cells within generations are not passed on to future generations.
Chapter 7 of this book is freely available as a downloadable Open Access PDF under a Creative Commons Attribution-Non Commercial-No Derivatives 3.0 license. https://s3-us-west-2.amazonaws.com/tandfbis/rt-files/docs/Open+Access+Chapters/9781315155968_oachapter7.pdf
Hepatitis B virus (HBV) genotypes E to J are understudied genotypes. Genotype E is found almost exclusively in West Africa. Genotypes F and H are found in America and are rare in other parts of the world. The distribution of genotype G is not completely known. Genotypes I and J are found in Asia and probably result from recombination events with other genotypes. The number of reported sequences for HBV genotypes E to J is small compared to other genotypes, which could impact phylogenetic and pairwise distance analyses. Genotype F is the most divergent of the HBV genotypes and is subdivided into six subgenotypes F1 to F6. Genotype E may be a recent genotype circulating almost exclusively in sub-Saharan Africa. Genotype J is a putative genotype originating from a single Japanese patient. The paucity of data from sub-Saharan Africa and Latin America is due to the under-representation of these regions in clinical and research cohorts. The purpose of this review is to highlight the need for further research on HBV genotypes E to J, which appear to be overlooked genotypes.
Low‐coverage whole genome sequencing (lcWGS) has emerged as a powerful and cost‐effective approach for population genomic studies in both model and nonmodel species. However, with read depths too low to confidently call individual genotypes, lcWGS requires specialized analysis tools that explicitly account for genotype uncertainty. A growing number of such tools have become available, but it can be difficult to get an overview of what types of analyses can be performed reliably with lcWGS data, and how the distribution of sequencing effort between the number of samples analysed and per‐sample sequencing depths affects inference accuracy. In this introductory guide to lcWGS, we first illustrate how the per‐sample cost for lcWGS is now comparable to RAD‐seq and Pool‐seq in many systems. We then provide an overview of software packages that explicitly account for genotype uncertainty in different types of population genomic inference. Next, we use both simulated and empirical data to assess the accuracy of allele frequency, genetic diversity, and linkage disequilibrium estimation, detection of population structure, and selection scans under different sequencing strategies. Our results show that spreading a given amount of sequencing effort across more samples with lower depth per sample consistently improves the accuracy of most types of inference, with a few notable exceptions. Finally, we assess the potential for using imputation to bolster inference from lcWGS data in nonmodel species, and discuss current limitations and future perspectives for lcWGS‐based population genomics research. With this overview, we hope to make lcWGS more approachable and stimulate its broader adoption.
Despite the importance of climate‐adjusted provenancing to mitigate the effects of environmental change, climatic considerations alone are insufficient when restoring highly degraded sites. Here we propose a comprehensive landscape genomic approach to assist the restoration of moderately disturbed and highly degraded sites. To illustrate it, we employ genomic data sets comprising thousands of single nucleotide polymorphisms from two plant species suitable for the restoration of iron‐rich Amazonian Savannas. We first use a subset of neutral loci to assess genetic structure and determine the genetic neighbourhood size. We then identify genotype‐phenotype‐environment associations, map adaptive genetic variation, and predict adaptive genotypes for restoration sites. Whereas local provenances were found optimal to restore a moderately disturbed site, a mixture of genotypes seemed the most promising strategy to recover a highly degraded mining site. We discuss how our results can help define site‐adjusted provenancing strategies, and argue that our methods can be more broadly applied to assist other restoration initiatives.
see also the Perspective by Yessica Rico
Eastern oysters in the northern Gulf of Mexico are facing rapid environmental changes and can respond to this change via plasticity or evolution. Plasticity can act as an immediate buffer against environmental change, but this buffering could impact the organism's ability to evolve in subsequent generations. While plasticity and evolution are not mutually exclusive, the relative contribution and interaction between them remains unclear. In this study, we investigate the roles of plastic and evolved responses to environmental variation and Perkinsus marinus infection in Crassostrea virginica by using a common garden experiment with 80 oysters from six families outplanted at two field sites naturally differing in salinity. We use growth data, P. marinus infection intensities, 3′ RNA sequencing (TagSeq) and low‐coverage whole‐genome sequencing to identify the effect of genotype, environment and genotype‐by‐environment interaction on the oyster's response to site. As one of the first studies to characterize the joint effects of genotype and environment on transcriptomic and morphological profiles in a natural setting, we demonstrate that C. virginica has a highly plastic response to environment and that this response is parallel among genotypes. We also find that genes responding to genotype have distinct and opposing profiles compared to genes responding to environment with regard to expression levels, Ka/Ks ratios and nucleotide diversity. Our findings suggest that C. virginica may be able to buffer the immediate impacts of future environmental changes by altering gene expression and physiology, but the lack of genetic variation in plasticity suggests limited capacity for evolved responses.
Studying how the fitness benefits of mutualism differ among a wide range of partner genotypes, and at multiple spatial scales, can shed light on the processes that maintain mutualism and structure coevolutionary interactions. Using legumes and rhizobia from three natural populations, I studied the symbiotic fitness benefits for both partners in 108 plant maternal family by rhizobium strain combinations. Genotype-by-genotype (G x G) interactions among local genotypes and among partner populations determined, in part, the benefits of mutualism for both partners; for example, the fitness effects of particular rhizobium strains ranged from uncooperative to mutualistic depending on the plant family. Correlations between plant and rhizobium fitness benefits suggest a trade-off, and therefore a potential conflict, between the interests of the two partners. These results suggest that legume-rhizobium mutualisms are dynamic at multiple spatial scales, and that strictly additive models of mutualism benefits may ignore dynamics potentially important to both the maintenance of genetic variation and the generation of geographic patterns in coevolutionary interactions.
Landscape genomics is an emerging research field that aims to identify the environmental factors that shape adaptive genetic variation and the gene variants that drive local adaptation. Its development has been facilitated by next‐generation sequencing, which allows for screening thousands to millions of single nucleotide polymorphisms in many individuals and populations at reasonable costs. In parallel, data sets describing environmental factors have greatly improved and increasingly become publicly accessible. Accordingly, numerous analytical methods for environmental association studies have been developed. Environmental association analysis identifies genetic variants associated with particular environmental factors and has the potential to uncover adaptive patterns that are not discovered by traditional tests for the detection of outlier loci based on population genetic differentiation. We review methods for conducting environmental association analysis including categorical tests, logistic regressions, matrix correlations, general linear models and mixed effects models. We discuss the advantages and disadvantages of different approaches, provide a list of dedicated software packages and their specific properties, and stress the importance of incorporating neutral genetic structure in the analysis. We also touch on additional important aspects such as sampling design, environmental data preparation, pooled and reduced‐representation sequencing, candidate‐gene approaches, linearity of allele–environment associations and the combination of environmental association analyses with traditional outlier detection tests. We conclude by summarizing expected future directions in the field, such as the extension of statistical approaches, environmental association analysis for ecological gene annotation, and the need for replication and post hoc validation studies.