Generating chromosome-scale, haplotype-resolved assemblies is important for functional studies. However, current de novo assemblers are either haploid assemblers that discard allelic information or diploid assemblers that can only tackle genomes of low complexity.
Here, using robust programs, we build a diploid genome assembly pipeline called gcaPDA (gamete cells assisted Phased Diploid Assembler), which exploits haploid gamete cells to assist in resolving haplotypes. We demonstrate the effectiveness of gcaPDA on simulated HiFi reads of the maize genome, which is highly heterozygous and repetitive, and on real data from rice.
Because it can cope with complex genomes and imposes fewer restrictions on application than most diploid assemblers, gcaPDA is likely to find broad application in studies of eukaryotic genomes.
Optical mapping is a high-throughput sequencing technology that carries long-range genome information without the risk of PCR artifacts. Because of their long span, optical maps leave far fewer gaps when used for genome assembly. However, a high risk of errors poses an enormous challenge to optical map assembly. Here we propose an iterative algorithm for de novo optical map assembly. In each iteration, only significant pairwise alignments that exceed strict thresholds are used to construct accurate contigs. These contigs then serve as input molecules for the next iteration of assembly. The strict thresholds ensure good quality of the local assembly, while the iterative method progressively retains the connectivity between contigs. In practice, our IOMA (iterative optical map assembler) outperforms two popular assemblers used in the community on both simulated and real E. coli datasets.
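The iterative scheme described above can be illustrated with a minimal Python sketch. It is not the IOMA implementation: every name in it (`overlap_len`, `merge`, `assemble_once`, `iterative_assemble`, `min_overlap`, `tol`) is hypothetical. Molecules are assumed to be ordered lists of restriction-fragment lengths, and a simple suffix-prefix match within a relative size tolerance stands in for a real statistical alignment score. The sketch shows only the control flow: keep alignments that pass a strict threshold, merge them into contigs, and feed those contigs back in as the molecules of the next iteration.

```python
from typing import List, Tuple

Molecule = List[float]  # ordered restriction-fragment lengths, e.g. in kb


def overlap_len(a: Molecule, b: Molecule, tol: float) -> int:
    """Longest k such that the last k fragments of `a` match the first k
    fragments of `b` within relative tolerance `tol` (toy alignment score)."""
    for k in range(min(len(a), len(b)), 0, -1):
        if all(abs(x - y) <= tol * max(x, y) for x, y in zip(a[-k:], b[:k])):
            return k
    return 0


def merge(a: Molecule, b: Molecule, k: int) -> Molecule:
    """Join `a` and `b` over a k-fragment overlap, averaging the overlapping
    fragment sizes as a crude consensus."""
    consensus = [(x + y) / 2 for x, y in zip(a[-k:], b[:k])]
    return a[:-k] + consensus + b[k:]


def assemble_once(
    mols: List[Molecule], min_overlap: int, tol: float
) -> Tuple[List[Molecule], bool]:
    """One iteration: use only alignments that pass the strict threshold,
    and let each molecule take part in at most one merge."""
    candidates = []
    for i, a in enumerate(mols):
        for j, b in enumerate(mols):
            if i != j:
                k = overlap_len(a, b, tol)
                if k >= min_overlap:  # assumed strict significance threshold
                    candidates.append((k, i, j))
    candidates.sort(reverse=True)  # longest overlaps first

    used = set()
    contigs = []
    for k, i, j in candidates:
        if i in used or j in used:
            continue
        contigs.append(merge(mols[i], mols[j], k))
        used.update((i, j))
    # Molecules without a significant alignment are carried over unchanged.
    contigs.extend(m for idx, m in enumerate(mols) if idx not in used)
    return contigs, bool(used)


def iterative_assemble(
    mols: List[Molecule], min_overlap: int = 3, tol: float = 0.05, max_iter: int = 10
) -> List[Molecule]:
    """Feed the contigs of each iteration back in as the next input."""
    contigs = [list(m) for m in mols]
    for _ in range(max_iter):
        contigs, merged = assemble_once(contigs, min_overlap, tol)
        if not merged:
            break
    return contigs
```

In this sketch, molecules that share fewer than `min_overlap` matching fragments are never joined, which mirrors the use of strict thresholds to protect local assembly quality. Connectivity is recovered progressively, because a contig produced in one iteration can be extended again in the next.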