With the advent of chromatin‐interaction maps, chromosome‐level genome assemblies have become a reality for a wide range of organisms. Scaffolding quality is, however, difficult to judge. To explore this gap, we generated multiple chromosome‐scale genome assemblies of an emerging wild animal model for carcinogenesis, the California sea lion (Zalophus californianus). Short‐read assemblies were scaffolded with two independent chromatin interaction mapping data sets (Hi‐C and Chicago), and long‐read assemblies with three data types (Hi‐C, optical maps and 10X linked reads) following the “Vertebrate Genomes Project (VGP)” pipeline. In both approaches, 18 major scaffolds recovered the karyotype (2n = 36), with scaffold N50s of 138 and 147 Mb, respectively. Synteny relationships at the chromosome level with other pinniped genomes (2n = 32–36), ferret (2n = 34), red panda (2n = 36) and domestic dog (2n = 78) were consistent across approaches and recovered known fissions and fusions. Comparative chromosome painting and multicolour chromosome tiling with a panel of 264 genome‐integrated single‐locus canine bacterial artificial chromosome probes provided independent evaluation of genome organization. Broad‐scale discrepancies between the approaches were observed within chromosomes, most commonly in translocations centred around centromeres and telomeres, which were better resolved in the VGP assembly. Genomic and cytological approaches agreed on near‐perfect synteny of the X chromosome, and in combination allowed detailed investigation of autosomal rearrangements between dog and sea lion. This study presents high‐quality genomes of an emerging cancer model and highlights that even highly fragmented short‐read assemblies scaffolded with Hi‐C can yield reliable chromosome‐level scaffolds suitable for comparative genomic analyses.
Abstract
Background: The ring-tailed lemur (Lemur catta) is a charismatic strepsirrhine primate endemic to Madagascar. These lemurs are of particular interest, given their status as a flagship species and widespread publicity in the popular media. Unfortunately, a recent population decline has reduced the census population to <2,500 individuals in the wild and led to the species' classification as endangered by the IUCN. As is the case for most strepsirrhine primates, only a limited amount of genomic research has been conducted on L. catta, in part owing to the lack of genomic resources. Results: We generated a new high-quality reference genome assembly for L. catta (mLemCat1) that conforms to the standards of the Vertebrate Genomes Project. This new long-read assembly is composed of Pacific Biosciences continuous long reads (CLR data), Optical Mapping Bionano reads, Arima HiC data, and 10X linked reads. The contiguity and completeness of the assembly are extremely high, with scaffold and contig N50 values of 90.982 and 10.570 Mb, respectively. Additionally, when compared to other high-quality primate assemblies, L. catta has the lowest reported number of Alu elements, which results predominantly from a lack of AluS and AluY elements. Conclusions: mLemCat1 is an excellent genomic resource not only for the ring-tailed lemur community, but also for other members of the Lemuridae family, and is the first very long read assembly for a strepsirrhine.
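The scaffold and contig N50 values quoted in these abstracts summarize contiguity: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. A minimal Python sketch of that calculation follows; the function name `n50` and the toy lengths are illustrative assumptions, not data or code from any of the studies.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are taken from longest to shortest until their cumulative
    length reaches at least half of the total; the length of the last
    sequence added is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical scaffold lengths (arbitrary units), not real assembly data.
toy_scaffolds = [100, 80, 60, 40, 20]
print(n50(toy_scaffolds))  # 80
```

With the toy input, the total length is 300, and the two longest scaffolds (100 + 80) are the first to reach half of it, so the N50 is 80.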
Abstract
Background
The tufted duck is a non-model organism that experiences high mortality in highly pathogenic avian influenza outbreaks. It belongs to the same bird family (Anatidae) as the mallard, one of the best-studied natural hosts of low-pathogenic avian influenza viruses. Studies in non-model bird species are crucial to disentangle the role of the host response in avian influenza virus infection in the natural reservoir. Such an endeavour requires a high-quality genome assembly and transcriptome.
Findings
This study presents the first high-quality, chromosome-level reference genome assembly of the tufted duck using the Vertebrate Genomes Project pipeline. We sequenced RNA (complementary DNA) from brain, ileum, lung, ovary, spleen, and testis using Illumina short-read and Pacific Biosciences long-read sequencing platforms, which were used for annotation. We found 34 autosomes plus Z and W sex chromosomes in the curated genome assembly, with 99.6% of the sequence assigned to chromosomes. Functional annotation revealed 14,099 protein-coding genes that generate 111,934 transcripts, which implies a mean of 7.9 isoforms per gene. We also identified 246 small RNA families.
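The mean isoform count follows directly from the two annotation totals reported above. A short Python check of that arithmetic, using only the figures given in the abstract (the variable names are illustrative):

```python
# Totals as reported in the abstract.
protein_coding_genes = 14_099
transcripts = 111_934

# Mean number of transcripts (isoforms) per protein-coding gene.
mean_isoforms = transcripts / protein_coding_genes
print(f"{mean_isoforms:.1f}")  # 7.9
```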
Conclusions
This annotated genome contributes to continuing research into the host response in avian influenza virus infections in a natural reservoir. Our findings from a comparison between short-read and long-read reference transcriptomics contribute to a deeper understanding of these competing options. In this study, both technologies complemented each other. We expect this annotation to be a foundation for further comparative and evolutionary genomic studies, including many waterfowl relatives with differing susceptibilities to avian influenza viruses.
The lack of efficient high-throughput methods for enrichment of specific sequences from genomic DNA represents a key bottleneck in exploiting the enormous potential of next-generation sequencers. Such methods would allow for a systematic and targeted analysis of relevant genomic regions. Recent studies reported sequence enrichment using a hybridization step to specific DNA capture probes as a possible solution to the problem. However, so far no method has provided sufficient depths of coverage for reliable base calling over the entire target regions. We report a strategy to multiply the enrichment performance and consequently improve depth and breadth of coverage for desired target sequences by applying two iterative cycles of hybridization with microfluidic Geniom biochips. Using this strategy, we enriched and then sequenced the cancer-related genes BRCA1 and TP53 and a set of 1000 individual dbSNP regions of 500 bp using Illumina technology. We achieved overall enrichment factors of up to 1062-fold and average coverage depths of 470-fold. Combined with high coverage uniformity, this resulted in nearly complete consensus coverages with >86% of target region covered at 20-fold or higher. Analysis of SNP calling accuracies after enrichment revealed excellent concordance with the reference sequence, closely mirroring the previously reported performance of Illumina sequencing conducted without sequence enrichment.
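The "covered at 20-fold or higher" figure is a breadth-of-coverage statistic: the fraction of target positions whose read depth meets a threshold. A minimal Python sketch of that calculation is given below; the function name `breadth_at_depth` and the per-base depths are hypothetical illustrations, and the study's own pipeline is not described in the abstract.

```python
def breadth_at_depth(depths, min_depth=20):
    """Return the fraction of positions with depth >= min_depth."""
    if not depths:
        raise ValueError("depths must be non-empty")
    covered = sum(1 for depth in depths if depth >= min_depth)
    return covered / len(depths)


# Hypothetical per-base depths over a 10-bp target, not data from the study.
toy_depths = [0, 15, 20, 35, 470, 500, 19, 22, 40, 60]
print(breadth_at_depth(toy_depths))  # 0.7
```

Seven of the ten toy positions reach a depth of 20, giving a breadth of 0.7 at that threshold.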
There are several protocols and kits for the extraction of circulating RNAs from plasma, followed by quantification of specific genes via RT-qPCR. Due to the marginal amount of cell-free RNA in plasma samples, the total RNA yield is insufficient to perform Next-Generation Sequencing (NGS), the state-of-the-art technology in massively parallel sequencing that enables a comprehensive characterization of the whole transcriptome. Screening the transcriptome for biomarker signatures accelerates progress in biomarker profiling for molecular diagnostics, early disease detection or food safety. Therefore, the aim was to optimize a method that enables the extraction of sufficient amounts of total RNA from bovine plasma to generate good-quality small RNA Sequencing (small RNA-Seq) data. An increased volume of plasma (9 ml) was processed using the Qiagen miRNeasy Serum/Plasma Kit in combination with the QIAvac24 Plus system, a vacuum manifold that enables handling of high volumes during RNA isolation. 35 ng of total RNA were passed on to cDNA library preparation followed by small RNA high-throughput sequencing analysis on the Illumina HiSeq2000 platform. Raw sequencing reads were processed by a data analysis pipeline using different free software solutions. Sequencing data were trimmed, quality checked, gradually selected for miRNAs/piRNAs and aligned to small RNA reference annotation indexes. Mapping to human reference indexes yielded 4.8 ± 2.8% mature miRNAs and 1.4 ± 0.8% piRNAs, and mapping to Bos taurus reference indexes yielded 5.0 ± 2.9% mature miRNAs.
Balanced chromosome abnormalities (BCAs) occur at a high frequency in healthy and diseased individuals, but cost-efficient strategies to identify BCAs and evaluate whether they contribute to a phenotype have not yet become widespread. Here we apply genome-wide mate-pair library sequencing to characterize structural variation in a patient with unclear neurodevelopmental disease (NDD) and complex de novo BCAs at the karyotype level. Nucleotide-level characterization of the clinically described BCA breakpoints revealed disruption of at least three NDD candidate genes (LINC00299, NUP205, PSMD14) that gave rise to abnormal mRNAs and could be assumed to be disease-causing. However, unbiased genome-wide analysis of the sequencing data for cryptic structural variation was key to revealing an additional submicroscopic inversion that truncates the schizophrenia- and bipolar disorder-associated brain transcription factor ZNF804A as an equally likely NDD-driving gene. Deep sequencing of fluorescent-sorted wild-type and derivative chromosomes confirmed the clinically undetected BCA. Moreover, deep sequencing further validated the high accuracy of mate-pair library sequencing in detecting structural variants larger than 10 kb, suggesting that this approach is powerful for clinical-grade genome-wide structural variant detection. Our study supports previous evidence for a role of ZNF804A in NDD and highlights the need for a more comprehensive assessment of structural variation in karyotypically abnormal individuals and patients with neurocognitive disease to avoid diagnostic deception.