Peer reviewed Open access
  • Dupsifter: a lightweight du...
    Morrison, Jacob; Zhou, Wanding; Johnson, Benjamin K; Shen, Hui

    Bioinformatics (Oxford, England), 12/2023, Volume: 39, Issue: 12
    Journal Article

    Abstract
    Summary: In whole genome sequencing data, polymerase chain reaction amplification results in duplicate DNA fragments coming from the same location in the genome. The process of preparing a whole genome bisulfite sequencing (WGBS) library, on the other hand, can create two DNA fragments from the same location that should not be considered duplicates. Currently, only one WGBS-aware duplicate marking tool exists. However, it only works with the output from a single tool, does not accept streaming input or output, and requires a substantial amount of memory relative to the input size. Dupsifter provides an aligner-agnostic duplicate marking tool that is lightweight, has streaming capabilities, and is memory efficient.
    Availability and implementation: Source code and binaries are freely available at https://github.com/huishenlab/dupsifter under the MIT license. Dupsifter is implemented in C and is supported on macOS and Linux.
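    The distinction the abstract draws, that two WGBS fragments from the same location are not necessarily duplicates, can be illustrated with a minimal sketch. This is not Dupsifter's implementation (which is written in C); the `Fragment` record, the `mark_duplicates` function, and the `"OT"`/`"OB"` strand labels (original top / original bottom) are hypothetical names chosen for illustration. The sketch assumes each fragment already carries its bisulfite strand of origin and simply includes that strand in the duplicate key.

```python
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class Fragment(NamedTuple):
    """Hypothetical, simplified view of an aligned fragment."""
    name: str
    chrom: str
    start: int
    end: int
    bs_strand: str  # assumed label for the bisulfite strand of origin: "OT" or "OB"


def mark_duplicates(fragments: Iterable[Fragment]) -> Dict[str, bool]:
    """Return {fragment name: is_duplicate}, reading the input once in order.

    A fragment is flagged as a duplicate only if an earlier fragment had the
    same coordinates AND the same bisulfite strand. Fragments that share
    coordinates but come from different strands are kept as distinct.
    """
    seen: Set[Tuple[str, int, int, str]] = set()
    flags: Dict[str, bool] = {}
    for frag in fragments:
        key = (frag.chrom, frag.start, frag.end, frag.bs_strand)
        flags[frag.name] = key in seen
        seen.add(key)
    return flags


fragments = [
    Fragment("r1", "chr1", 100, 250, "OT"),
    Fragment("r2", "chr1", 100, 250, "OT"),  # same location, same strand as r1
    Fragment("r3", "chr1", 100, 250, "OB"),  # same location, other strand
]
flags = mark_duplicates(fragments)
```

    In this toy input only `r2` is flagged: it shares both coordinates and strand with `r1`, whereas `r3` shares the coordinates but originates from the other strand. A strand-unaware key of `(chrom, start, end)` would flag `r3` as well.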