• The cotton fibre serves as a valuable experimental system to study cell wall synthesis in plants, but our understanding of the genetic regulation of this process during fibre development remains limited.
• We performed a genome-wide association study (GWAS) and identified 28 genetic loci associated with fibre quality in allotetraploid cotton. To investigate the regulatory roles of these loci, we sequenced fibre transcriptomes of 251 cotton accessions and identified 15 330 expression quantitative trait loci (eQTL).
• Analysis of local eQTL and GWAS data prioritised 13 likely causal genes for differential fibre quality in a transcriptome-wide association study (TWAS). Characterisation of distal eQTL revealed unequal genetic regulation patterns between two subgenomes, highlighted by an eQTL hotspot (Hot216) that established a genome-wide genetic network regulating the expression of 962 genes. The primary regulatory role of Hot216, and specifically the gene encoding a KIP-related protein, was found to be the transcriptional regulation of genes responsible for cell wall synthesis, which contributes to fibre length by modulating the developmental transition from rapid cell elongation to secondary cell wall synthesis.
• This study uncovered the genetic regulation of fibre-cell development and revealed the molecular basis of the temporal modulation of secondary cell wall synthesis during plant cell elongation.
Current usage of the name Ulva lactuca, the generitype of Ulva, remains uncertain. Genetic analyses were performed on the U. lactuca Linnaean holotype, the U. fasciata epitype, the U. fenestrata holotype, the U. lobata lectotype, and the U. stipitata lectotype. The U. lactuca holotype is nearly identical in rbcL sequence to the epitype of U. fasciata, a warm temperate to tropical species, rather than the cold temperate species to which the name U. lactuca has generally been applied. We hypothesize that the holotype specimen of U. lactuca came from the Indo‐Pacific rather than northern Europe. Our analyses indicate that U. fasciata and U. lobata are heterotypic synonyms of U. lactuca. Ulva fenestrata is the earliest name for northern hemisphere, cold temperate Atlantic and Pacific species, with U. stipitata a junior synonym. DNA sequencing of type specimens provides an unequivocal method for applying names to Ulva species.
In this paper we consider the problem of encoding data into repeat-free sequences, that is, sequences required to contain every <inline-formula> <tex-math notation="LaTeX">k </tex-math></inline-formula>-tuple at most once (for a predefined <inline-formula> <tex-math notation="LaTeX">k </tex-math></inline-formula>). First, the capacity of the repeat-free constraint is calculated. Then, an efficient algorithm, which uses two bits of redundancy, is presented to encode length-<inline-formula> <tex-math notation="LaTeX">n </tex-math></inline-formula> sequences for <inline-formula> <tex-math notation="LaTeX">k=2+2\log (n) </tex-math></inline-formula>. This algorithm is then improved to support any value of <inline-formula> <tex-math notation="LaTeX">k </tex-math></inline-formula> of the form <inline-formula> <tex-math notation="LaTeX">k=a\log (n) </tex-math></inline-formula>, for <inline-formula> <tex-math notation="LaTeX">1< a </tex-math></inline-formula>, while its redundancy is <inline-formula> <tex-math notation="LaTeX">o(n) </tex-math></inline-formula>. We also calculate the capacity of repeat-free sequences when combined with local constraints which are given by a constrained system, and the capacity of multi-dimensional repeat-free codes.
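The repeat-free constraint described in the abstract above can be illustrated with a minimal Python sketch. It only checks the constraint and counts valid sequences by brute force for very small lengths; it is not the paper's encoding algorithm, and the function names `is_repeat_free` and `count_repeat_free` are hypothetical.

```python
from itertools import product


def is_repeat_free(seq, k):
    """Return True if every length-k window (k-tuple) of seq occurs at most once."""
    seen = set()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in seen:
            return False
        seen.add(window)
    return True


def count_repeat_free(n, k, alphabet="01"):
    """Brute-force count of length-n repeat-free sequences (feasible for small n only)."""
    return sum(
        is_repeat_free("".join(symbols), k)
        for symbols in product(alphabet, repeat=n)
    )


print(is_repeat_free("0110", 2))  # windows 01, 11, 10 are all distinct
print(is_repeat_free("0101", 2))  # window 01 appears twice
print(count_repeat_free(3, 2))    # number of valid binary sequences of length 3
```

Counting valid sequences for growing lengths in this way gives a rough empirical feel for how restrictive the constraint is, although the exponential enumeration is practical only for toy sizes.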
Selective breeding is increasingly recognized as a key component of sustainable production of aquaculture species. The uptake of genomic technology in aquaculture breeding has traditionally lagged behind terrestrial farmed animals. However, the rapid development and application of sequencing technologies has allowed aquaculture to narrow the gap, leading to substantial genomic resources for all major aquaculture species. While high‐density single‐nucleotide polymorphism (SNP) arrays for some species have been developed recently, direct genotyping by sequencing (GBS) techniques have underpinned many of the advances in aquaculture genetics and breeding to date. In particular, restriction‐site associated DNA sequencing (RAD‐Seq) and subsequent variations have been extensively applied to generate population‐level SNP genotype data. These GBS techniques are not dependent on prior genomic information such as a reference genome assembly for the species of interest. As such, they have been widely utilized by researchers and companies focussing on nonmodel aquaculture species with relatively small research communities. Applications of RAD‐Seq techniques have included generation of genetic linkage maps, performing genome‐wide association studies, improvements of reference genome assemblies and, more recently, genomic selection for traits of interest to aquaculture like growth, sex determination or disease resistance. In this review, we briefly discuss the history of GBS, the nuances of the various GBS techniques, bioinformatics approaches and application of these techniques to various aquaculture species.
Bananas (Musa spp.), including dessert and cooking types, are giant perennial monocotyledonous herbs of the order Zingiberales, a sister group to the well-studied Poales, which include cereals. Bananas are vital for food security in many tropical and subtropical countries and the most popular fruit in industrialized countries. The Musa domestication process started some 7,000 years ago in Southeast Asia. It involved hybridizations between diverse species and subspecies, fostered by human migrations, and selection of diploid and triploid seedless, parthenocarpic hybrids thereafter widely dispersed by vegetative propagation. Half of the current production relies on somaclones derived from a single triploid genotype (Cavendish). Pests and diseases have gradually become adapted, representing an imminent danger for global banana production. Here we describe the draft sequence of the 523-megabase genome of a Musa acuminata doubled-haploid genotype, providing a crucial stepping-stone for genetic improvement of banana. We detected three rounds of whole-genome duplications in the Musa lineage, independently of those previously described in the Poales lineage and the one we detected in the Arecales lineage. This first monocotyledon high-continuity whole-genome sequence reported outside Poales represents an essential bridge for comparative genome analysis in plants. As such, it clarifies commelinid-monocotyledon phylogenetic relationships, reveals Poaceae-specific features and has led to the discovery of conserved non-coding sequences predating monocotyledon-eudicotyledon divergence.
Error-Correcting Codes for Nanopore Sequencing
Banerjee, Anisha; Yehezkeally, Yonatan; Wachter-Zeh, Antonia ...
IEEE Transactions on Information Theory, 07/2024, Volume 70, Issue 7
Journal Article, Peer reviewed, Open access
Nanopore sequencing, superior to other sequencing technologies for DNA storage in multiple aspects, has recently attracted considerable attention. Its high error rates, however, demand thorough research on practical and efficient coding schemes to enable accurate recovery of stored data. To this end, we consider a simplified model of a nanopore sequencer inspired by Mao et al., incorporating intersymbol interference and measurement noise. Essentially, our channel model passes a sliding window of length <inline-formula> <tex-math notation="LaTeX">\ell </tex-math></inline-formula> over a <inline-formula> <tex-math notation="LaTeX">q </tex-math></inline-formula>-ary input sequence that outputs the composition of the enclosed <inline-formula> <tex-math notation="LaTeX">\ell </tex-math></inline-formula> bits, and shifts by <inline-formula> <tex-math notation="LaTeX">\delta </tex-math></inline-formula> positions with each time step. In this context, the composition of a q-ary vector <inline-formula> <tex-math notation="LaTeX">{\boldsymbol x} </tex-math></inline-formula> specifies the number of occurrences in <inline-formula> <tex-math notation="LaTeX">{\boldsymbol x} </tex-math></inline-formula> of each symbol in <inline-formula> <tex-math notation="LaTeX">\lbrace 0,1,\ldots, q-1\rbrace </tex-math></inline-formula>. The resulting compositions vector, termed the read vector, may also be corrupted by t substitution errors. By employing graph-theoretic techniques, we deduce that for <inline-formula> <tex-math notation="LaTeX">\delta =1 </tex-math></inline-formula>, at least <inline-formula> <tex-math notation="LaTeX">\log \log n </tex-math></inline-formula> symbols of redundancy are required to correct a single (<inline-formula> <tex-math notation="LaTeX">t=1 </tex-math></inline-formula>) substitution.
Finally, for <inline-formula> <tex-math notation="LaTeX">\ell \geq 3 </tex-math></inline-formula>, we exploit some inherent characteristics of read vectors to arrive at an error-correcting code that is of optimal redundancy up to a (small) additive constant for this setting. This construction is also found to be optimal for the case of reconstruction from two noisy read vectors.
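The channel model in the abstract above (a sliding window that reports the composition of the symbols it covers) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the paper's definition: it uses only windows that lie fully inside the input, which may differ from the paper's treatment of the sequence boundaries, and the names `composition` and `read_vector` are hypothetical.

```python
from collections import Counter


def composition(window, q):
    """Count how often each symbol 0..q-1 occurs in the window."""
    counts = Counter(window)
    return tuple(counts.get(symbol, 0) for symbol in range(q))


def read_vector(x, ell, delta=1, q=2):
    """Slide a length-ell window over x in steps of delta and record each composition.

    Assumption: only windows fully contained in x are used.
    """
    return [
        composition(x[start:start + ell], q)
        for start in range(0, len(x) - ell + 1, delta)
    ]


# Binary input, window length 2, shift 1
print(read_vector([0, 1, 1, 0], ell=2, delta=1, q=2))

# Ternary input, window length 3, shift 2
print(read_vector([0, 1, 2, 2, 1], ell=3, delta=2, q=3))
```

A substitution error in this setting would replace one of the composition tuples in the returned list with a different tuple.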
Advances in high-throughput sequencing techniques now allow relatively easy and affordable sequencing of large portions of the genome, even for nonmodel organisms. Many phylogenetic studies reduce costs by focusing their sequencing efforts on a selected set of targeted loci, commonly enriched using sequence capture. The advantage of this approach is that it recovers a consistent set of loci, each with high sequencing depth, which leads to more confidence in the assembly of target sequences. High sequencing depth can also be used to identify phylogenetically informative allelic variation within sequenced individuals, but allele sequences are infrequently assembled in phylogenetic studies. Instead, many scientists perform their phylogenetic analyses using contig sequences which result from the de novo assembly of sequencing reads into contigs containing only canonical nucleobases, and this may reduce both statistical power and phylogenetic accuracy. Here, we develop an easy-to-use pipeline to recover allele sequences from sequence capture data, and we use simulated and empirical data to demonstrate the utility of integrating these allele sequences into analyses performed under the multispecies coalescent model. Our empirical analyses of ultraconserved element locus data collected from the South American hummingbird genus Topaza demonstrate that phased allele sequences carry sufficient phylogenetic information to infer the genetic structure, lineage divergence, and biogeographic history of a genus that diversified during the last 3 myr. The phylogenetic results support the recognition of two species and suggest a high rate of gene flow across large distances of rainforest habitats but rare admixture across the Amazon River. Our simulations provide evidence that analyzing allele sequences leads to more accurate estimates of tree topology and divergence times than the more common approach of using contig sequences.
Leprosy urgently needs a precise and early diagnostic tool. The sensitivity of direct (bacilli staining, Mycobacterium leprae DNA) and indirect (antibody levels, T cell assays) diagnostic methods varies with the clinical form. Recently, PCR-based M. leprae DNA detection has been shown to differentially diagnose leprosy from other dermatological conditions. However, accuracy can still be improved, especially for use with less invasive clinical samples. We tested different commercial DNA extraction kits (DNeasy Blood & Tissue, QIAamp DNA Microbiome, Maxwell 16 DNA Purification, PowerSoil DNA Isolation) as well as in-house phenol-chloroform and Trizol/FastPrep methods. Extraction was performed on M. leprae-infected mouse footpads and different clinical samples from leprosy patients (skin biopsies and scrapings, lesion, oral and nasal swabs, body hair, blood on FTA cards, peripheral whole blood). We observed that the Microbiome kit was able to enrich for mycobacterial DNA, most likely due to the enzymatic digestion cocktail along with the mechanical disruption involved in this method. Consequently, we observed a significant increase in sensitivity in skin biopsies from paucibacillary leprosy patients using a duplex qPCR targeting 16S rRNA (M. leprae) and 18S rRNA (mammal) in the StepOnePlus system. Our data showed that M. leprae DNA was best detected in skin biopsies and skin scrapings, independent of the extraction method or the clinical form. For multibacillary patients, detection of M. leprae DNA in nasal swabs indicates that a much less invasive sample could be used for DNA sequencing in relapse analysis and drug resistance monitoring. Overall, DNA extracted with the Microbiome kit presented the best bacilli detection rate for paucibacillary cases, indicating that investments in extraction methods with mechanical and DNA digestion should be made.
The revolution and acceleration in DNA sequencing over the past three decades has driven the development of new biomolecular tools like environmental DNA (eDNA) metabarcoding for characterizing marine biodiversity. In order to operationalize eDNA approaches for routine NOAA observatories, new bioinformatic programs and improved organismal reference barcodes are needed to serve accurate and reliable biological data in a timely manner. To address these needs, we present Rapid Exploration and Visualization through an Automated Metabarcoding Pipeline (REVAMP), which provides streamlined end-to-end data processing from raw reads to data exploration, visualization, and hypothesis generation. One benefit of REVAMP is the ability to iteratively assess marker gene and reference database performance. Here, we used a filtered reference database that only included sequences uploaded prior to specified date cutoffs from 1995 to 2022 to analyze changes in eDNA metabarcoding taxonomic assignments, revealing patterns of uneven improvement in taxonomic assignment depth and accuracy across time, region, and marker sets. This work highlights the need for targeted reference sequencing efforts for key regional taxa and the importance of such efforts for improving eDNA biomonitoring approaches in the future.
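The date-cutoff filtering of a reference database mentioned in the abstract above can be sketched generically in Python. This is not REVAMP's implementation or interface: the record layout, the accession labels, the upload dates, and the function name `filter_by_cutoff` are all hypothetical and serve only to illustrate the idea.

```python
from datetime import date


def filter_by_cutoff(records, cutoff):
    """Keep only reference records uploaded strictly before the cutoff date."""
    return [record for record in records if record["uploaded"] < cutoff]


# Hypothetical reference records (made-up accession labels and dates)
reference = [
    {"accession": "REF_A", "uploaded": date(1998, 5, 1)},
    {"accession": "REF_B", "uploaded": date(2010, 9, 15)},
    {"accession": "REF_C", "uploaded": date(2021, 2, 3)},
]

# Rebuild the filtered database for a series of cutoffs and see how it grows
for year in (1995, 2005, 2015, 2022):
    subset = filter_by_cutoff(reference, date(year, 1, 1))
    print(year, [record["accession"] for record in subset])
```

Rerunning taxonomic assignment against each filtered subset would then show how assignments change as the reference database grows.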
Our study was conducted to detect virulence genes in Serratia marcescens, a species that carries many virulence genes and causes nosocomial infections in immunocompromised persons and neonates. A total of 24/100 (24%) S. marcescens isolates were obtained from neonates suffering from meningitis. They were identified using culture characteristics and biochemical tests and confirmed by polymerase chain reaction (PCR) targeting the 16S rRNA gene. All virulence genes, including the fimA gene that encodes type-1 fimbria, the bsmB gene that encodes exopolysaccharide production, and the ampC gene that encodes β-lactamase enzymes, were detected using PCR. The results revealed that the S. marcescens isolates carried the 16S rRNA gene (100%), the fimA gene (54%), the bsmB gene (71%) and the ampC gene (100%). Finally, the fimA, bsmB, and ampC genes were sequenced to determine their nucleotide sequences. The genes in the local S. marcescens isolates showed 98% similarity to those of S. marcescens isolates registered globally in NCBI GenBank.
Keywords: fimA gene, bsmB gene, ampC gene, Serratia marcescens, DNA sequences.