Natural history collections (NHCs) are the foundation of historical baselines for assessing anthropogenic impacts on biodiversity. Along these lines, the online mobilization of specimens via digitization (the conversion of specimen data into accessible digital content) has greatly expanded the use of NHCs across a diversity of disciplines. We broaden the current vision of digitization (Digitization 1.0), whereby specimens are digitized within NHCs, to include new approaches that rely on digitized products rather than the physical specimen (Digitization 2.0). Digitization 2.0 builds on the data, workflows, and infrastructure produced by Digitization 1.0 to create digital-only workflows that facilitate digitization, curation, and data links, thus returning value to physical specimens by creating new layers of annotation, empowering a global community, and developing automated approaches to advance biodiversity discovery and conservation. These efforts will transform large-scale biodiversity assessments to address fundamental questions, including those pertaining to critical issues of global change.
Cytonuclear discordance is commonly observed in phylogenetic studies, yet few studies have tested whether these patterns reflect incomplete lineage sorting or organellar introgression.
Here, we used whole-chloroplast sequence data in combination with over 1000 nuclear single-nucleotide polymorphisms to clarify the extent of cytonuclear discordance in wild annual sunflowers (Helianthus), and to test alternative explanations for such discordance.
Our phylogenetic analyses indicate that cytonuclear discordance is widespread within this group, both in terms of the relationships among species and among individuals within species. Simulations of chloroplast evolution show that incomplete lineage sorting cannot explain these patterns in most cases. Instead, most of the observed discordance is better explained by cytoplasmic introgression. Molecular tests of evolution further indicate that selection may have played a role in driving patterns of plastid variation – although additional experimental work is needed to fully evaluate the importance of selection on organellar variants in different parts of the geographic range.
Overall, this study represents one of the most comprehensive tests of the drivers of cytonuclear discordance and highlights the potential for gene flow to lead to extensive organellar introgression in hybridizing taxa.
Flax (Linum usitatissimum) is an ancient crop that is widely cultivated as a source of fiber, oil and medicinally relevant compounds. To accelerate crop improvement, we performed whole‐genome shotgun sequencing of the nuclear genome of flax. Seven paired‐end libraries ranging in size from 300 bp to 10 kb were sequenced using an Illumina genome analyzer. A de novo assembly, composed exclusively of deep‐coverage (approximately 94× raw, approximately 69× filtered) short‐sequence reads (44–100 bp), produced a set of scaffolds with N50 = 694 kb, including contigs with N50 = 20.1 kb. The contig assembly contained 302 Mb of non‐redundant sequence representing an estimated 81% genome coverage. Up to 96% of published flax ESTs aligned to the whole‐genome shotgun scaffolds. However, comparisons with independently sequenced BACs and fosmids showed some mis‐assembly of regions at the genome scale. A total of 43 384 protein‐coding genes were predicted in the whole‐genome shotgun assembly, and up to 93% of published flax ESTs and 86% of A. thaliana genes aligned to these predicted genes, indicating excellent coverage and accuracy at the gene level. Analysis of the synonymous substitution rates (Ks) observed within duplicate gene pairs was consistent with a recent (5–9 MYA) whole‐genome duplication in flax. Within the predicted proteome, we observed enrichment of many conserved domains (Pfam‐A) that may contribute to the unique properties of this crop, including agglutinin proteins. Together, these results show that de novo assembly, based solely on whole‐genome shotgun short‐sequence reads, is an efficient means of obtaining nearly complete genome sequence information for some plant species.
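The scaffold and contig N50 values quoted above follow the standard definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. A minimal sketch of that calculation is given below; the function name `n50` and the toy lengths are illustrative assumptions and are not taken from the flax assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest and accumulated until
    the running total reaches half of the summed length; the length of
    the sequence that crosses that point is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical contig lengths (arbitrary units), for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # total = 300, half = 150, reached at 80 + 70 -> 70
```

Because N50 depends only on the length distribution, it says nothing about correctness of the assembly, which is why the abstract pairs it with EST alignment rates and BAC/fosmid comparisons.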
Summary
Demand for cannabidiol (CBD), the predominant cannabinoid in hemp (Cannabis sativa), has favored cultivars producing unprecedented quantities of CBD. We investigated the ancestry of a new cultivar and cannabinoid synthase genes in relation to cannabinoid inheritance.
A nanopore‐based assembly anchored to a high‐resolution linkage map provided a chromosome‐resolved genome for CBDRx, a potent CBD‐type cultivar. We measured cannabinoid synthase expression by cDNA sequencing and conducted a population genetic analysis of diverse Cannabis accessions. Quantitative trait locus mapping of cannabinoids in a hemp × marijuana segregating population was also performed.
Cannabinoid synthase paralogs are arranged in tandem arrays embedded in long terminal repeat retrotransposons on chromosome 7. Although CBDRx is predominantly of marijuana ancestry, the genome has cannabidiolic acid synthase (CBDAS) introgressed from hemp and lacks a complete sequence for tetrahydrocannabinolic acid synthase (THCAS). Three additional genomes, including one with complete THCAS, confirmed this genomic structure. Only CBDAS was expressed in CBD‐type Cannabis, while both CBDAS and THCAS were expressed in a cultivar with an intermediate tetrahydrocannabinol (THC) : CBD ratio.
Although variation among cannabinoid synthase loci might affect the THC : CBD ratio, variability among cultivars in overall cannabinoid content (potency) was also associated with other chromosomes.
The development of modern crops typically involves both selection and hybridization, but to date most studies have focused on the former. In the present study, we explore how both processes, and their interactions, have molded the genome of the cultivated sunflower (Helianthus annuus), a globally important oilseed. To identify genes targeted by selection during the domestication and improvement of sunflower, and to detect post‐domestication hybridization with wild species, we analyzed transcriptome sequences of 80 genotypes, including wild, landrace, and modern lines of H. annuus, as well as two cross‐compatible wild relatives, Helianthus argophyllus and Helianthus petiolaris. Outlier analyses identified 122 and 15 candidate genes associated with domestication and improvement, respectively. As in several previous studies, genes putatively involved in oil biosynthesis were the most extreme outliers. Additionally, several promising associations were observed with previously mapped quantitative trait loci (QTLs), such as those for branching. Admixture analyses revealed that all the modern cultivar genomes we examined contained one or more introgressions from wild populations, with every chromosome having evidence of introgression in at least one modern line. Cumulatively, introgressions cover c. 10% of the cultivated sunflower genome. Surprisingly, introgressions do not avoid candidate domestication genes, probably because of the reintroduction of branching.
We present a draft genome assembly for the tropical liverwort, Marchantia inflexa, which adds to a growing body of genomic resources for bryophytes and provides an important perspective on the evolution and diversification of land plants. We specifically address questions related to sex chromosome evolution, sexual dimorphisms, and the genomic underpinnings of dehydration tolerance. This assembly leveraged the recently published genome of a related liverwort, M. polymorpha, to improve scaffolding and annotation, aid in the identification of sex-linked sequences, and quantify patterns of sequence differentiation within Marchantia. We find that genes on sex chromosomes are under greater diversifying selection than autosomal and organellar genes. Interestingly, this is driven primarily by divergence of male-specific genes, while divergence of other sex-linked genes is similar to that of autosomal genes. Through analysis of sex-specific read coverage, we identify and validate genetic sex markers for M. inflexa, which will enable diagnosis of sex for non-reproductive individuals. To investigate dehydration tolerance, we capitalized on a difference between genetic lines, which allowed us to identify multiple dehydration-associated genes, two of which were sex-linked, suggesting that dehydration tolerance may be impacted by sex-specific genes.
• We present fully resolved phylogenies of Gracilariales across three genomic compartments.
• Nuclear, plastidial and mitochondrial phylogenies are highly congruent.
• Phylogenomics and trait evolution analyses support Gracilaria s.l. as a unique genus.
• Gracilaria s.l. and Gracilariopsis are comparable taxonomic ranks in our dated phylogeny.
• Genome’s architecture provides apomorphies to support our taxonomic decisions.
The Gracilariales is a highly diverse, widely distributed order of red algae (Rhodophyta) that forms a well-supported clade. Aside from their ecological importance, species of Gracilariales provide important sources of agarans and possess bioactive compounds with medicinal and pharmaceutical use. Recent phylogenetic analyses from a small number of genes have greatly advanced our knowledge of evolutionary relationships in this clade, yet several key nodes were not especially well resolved. We assembled a phylogenomic data set containing 79 nuclear genes, 195 plastid genes, and 24 mitochondrial genes from species representing all three major Gracilariales lineages: Melanthalia, Gracilariopsis, and Gracilaria sensu lato. This data set leads to a fully resolved phylogeny of Gracilariales, which is highly consistent across genomic compartments. In agreement with previous findings, Melanthalia obtusata was sister to a clade including Gracilaria s.l. and Gracilariopsis, which were each resolved as well-supported clades. Our results also clarified the long-standing uncertainty about relationships in Gracilaria s.l., which was not resolved by single- and multi-gene approaches. We further characterized the divergence time, organellar genome architecture, and morphological trait evolution in Gracilariales to better facilitate its taxonomic treatment. Gracilariopsis and Gracilaria s.l. are comparable taxonomic ranks, based on the overlapping time range of their divergence. The genomic structure of plastids and mitochondria is highly conserved within each clade but differs slightly among these clades in gene content. For example, the plastid gene petP is lost in Gracilaria s.l. and the mitochondrial gene trnH is in different positions in the genomes of Gracilariopsis and Gracilaria s.l.
Our analyses of ancestral character evolution provide evidence that the main characters used to delimit genera in Gracilariales, such as spermatangia type and features of cystocarp anatomy, overlap in subclades of Gracilaria s.l. We discuss the taxonomy of Gracilariales in light of these results and propose an objective and practical classification, which is in agreement with the criteria of monophyly, exclusive characters, predictability and nomenclatural stability.
Domesticated plants and animals often display dramatic responses to selection, but the origins of the genetic diversity underlying these responses remain poorly understood. Despite domestication and improvement bottlenecks, the cultivated sunflower remains highly variable genetically, possibly due to hybridization with wild relatives. To characterize genetic diversity in the sunflower and to quantify contributions from wild relatives, we sequenced 287 cultivated lines, 17 Native American landraces and 189 wild accessions representing 11 compatible wild species. Cultivar sequences failing to map to the sunflower reference were assembled de novo for each genotype to determine the gene repertoire, or 'pan-genome', of the cultivated sunflower. Assembled genes were then compared to the wild species to estimate origins. Results indicate that the cultivated sunflower pan-genome comprises 61,205 genes, of which 27% vary across genotypes. Approximately 10% of the cultivated sunflower pan-genome is derived through introgression from wild sunflower species, and 1.5% of genes originated solely through introgression. Gene ontology functional analyses further indicate that genes associated with biotic resistance are over-represented among introgressed regions, an observation consistent with breeding records. Analyses of allelic variation associated with downy mildew resistance provide an example in which such introgressions have contributed to resistance to a globally challenging disease.
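The statement that a fraction of pan-genome genes "vary across genotypes" can be read as a split between core genes (present in every genotype) and variable genes (absent from at least one). The sketch below illustrates that bookkeeping on a toy presence/absence table. The function name `pan_genome_summary`, the input layout, and the toy gene and line names are assumptions made for illustration; the abstract does not describe the study's actual pipeline at this level of detail.

```python
def pan_genome_summary(presence, genotypes):
    """Summarize core vs. variable genes in a presence/absence table.

    presence  : dict mapping gene id -> set of genotypes carrying the gene
    genotypes : iterable of all genotypes surveyed
    A gene is 'core' if every genotype carries it, otherwise 'variable'.
    """
    if not presence:
        raise ValueError("presence table is empty")
    all_genotypes = set(genotypes)
    core = {gene for gene, carriers in presence.items()
            if all_genotypes <= set(carriers)}
    variable = set(presence) - core
    return {
        "total": len(presence),
        "core": len(core),
        "variable": len(variable),
        "variable_fraction": len(variable) / len(presence),
    }


# Hypothetical toy data: four genes scored in three cultivated lines.
toy_presence = {
    "geneA": {"line1", "line2", "line3"},
    "geneB": {"line1", "line3"},
    "geneC": {"line1", "line2", "line3"},
    "geneD": {"line2"},
}
summary = pan_genome_summary(toy_presence, ["line1", "line2", "line3"])
print(summary["variable_fraction"])  # 2 of 4 genes are variable -> 0.5
```

In practice the presence/absence calls themselves (mapping thresholds, assembly filters) dominate the result, so a summary like this is only as reliable as the table it is given.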
Populus trichocarpa is an ecologically important tree across western North America. We used a large population sample of 498 accessions over a wide geographical area genotyped with a 34K Populus SNP array to quantify geographical patterns of genetic variation in this species (landscape genomics). We present evidence that three processes contribute to the observed patterns: (1) introgression from the sister species P. balsamifera, (2) isolation by distance (IBD), and (3) natural selection. Introgression was detected only at the margins of the species' distribution. IBD was significant across the sampled area as a whole, but no evidence of restricted gene flow was detected in a core of drainages from southern British Columbia (BC). We identified a large number of FST outliers. Gene Ontology analyses revealed that FST outliers are overrepresented in genes involved in circadian rhythm and response to red/far-red light when the entire dataset is considered, whereas in southern BC heat response genes are overrepresented. We also identified strong correlations between geoclimate variables and allele frequencies at FST outlier loci that provide clues regarding the selective pressures acting at these loci.
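For readers unfamiliar with FST outliers, the idea can be sketched with the textbook heterozygosity-based definition for a biallelic SNP, FST = (HT - HS) / HT, followed by flagging the loci in the upper tail of the empirical distribution. This is an illustration only: the abstract does not state which FST estimator or outlier test the study used, and the function names, the equal weighting of populations, the quantile cutoff, and the toy allele frequencies below are all assumptions.

```python
def fst_biallelic(freqs):
    """Wright-style FST for one biallelic SNP.

    freqs : allele frequencies of one allele in each population,
            with populations weighted equally (an assumption).
    """
    if not freqs:
        raise ValueError("need at least one population")
    p_bar = sum(freqs) / len(freqs)
    h_total = 2 * p_bar * (1 - p_bar)                   # expected heterozygosity, pooled
    h_sub = sum(2 * p * (1 - p) for p in freqs) / len(freqs)  # mean within populations
    if h_total == 0:
        return 0.0                                      # locus is fixed everywhere
    return (h_total - h_sub) / h_total


def flag_outliers(fst_by_locus, quantile=0.95):
    """Return loci whose FST is at or above an empirical quantile cutoff."""
    values = sorted(fst_by_locus.values())
    cutoff = values[min(len(values) - 1, int(quantile * len(values)))]
    return {locus for locus, f in fst_by_locus.items() if f >= cutoff}


# Hypothetical allele frequencies in two populations for four SNPs.
toy_freqs = {
    "snp1": [0.5, 0.5],
    "snp2": [0.2, 0.8],
    "snp3": [0.0, 1.0],
    "snp4": [0.45, 0.55],
}
toy_fst = {locus: fst_biallelic(f) for locus, f in toy_freqs.items()}
print(flag_outliers(toy_fst, quantile=0.75))  # {'snp3'}
```

A purely empirical cutoff like this always flags some loci, which is one reason outlier scans are usually interpreted alongside demographic models and environmental correlations, as in the abstract above.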
The genomics of local adaptation is an increasingly active field, providing insights into the forces driving ecological speciation and the repeatability of evolution. Demography and gene flow play an important role in determining the paths by which parallel evolution occurs and the genomic signatures of adaptation. In the annual sunflowers, hybridization between species has repeatedly led to the colonization of extreme habitats, such as sand dunes. In a new case of adaptation to sand dunes that occurs in populations of H. petiolaris growing at Great Sand Dunes National Park and Preserve (Colorado), we wished to determine the age and long‐term migration patterns of the system, as well as its ancestry. We addressed these questions with restriction‐associated DNA (RAD) sequence data, aligned to a reference transcriptome. In an isolation with migration model using RAD sequences, coalescent analysis showed that the dune ecotype originated since the last ice age, which is very recent compared with the hybrid dune species, H. anomalus. Large effective population sizes and substantial numbers of gene migrants per generation between dune and nondune ecotypes explained the highly heterogeneous divergence observed among loci. Analysis of RAD‐derived SNPs identified heterogeneous divergence between the dune and nondune ecotypes and also identified the dune ecotype's nearest relative. Our results did not support the hypothesis that the dune ecotype has hybrid ancestry, suggesting that adaptation of sunflowers to dunes has occurred by multiple mechanisms. The ancestry and long‐term history of gene flow between incipient sunflower species provide valuable context for our understanding of ecological speciation and parallel adaptation.