UNI-MB - logo
UMNIK - logo
 
E-resources
Full text
Peer reviewed
  • DeepHost: phage host predic...
    Ruohan, Wang; Xianglilan, Zhang; Jianping, Wang; Shuai Cheng, L I

    Briefings in bioinformatics, 01/2022, Volume: 23, Issue: 1
    Journal Article

    Abstract Next-generation sequencing expands the known phage genomes rapidly. Unlike culture-based methods, the hosts of phages discovered from next-generation sequencing data remain uncharacterized. The high diversity of the phage genomes makes the host assignment task challenging. To solve the issue, we proposed a phage host prediction tool—DeepHost. To encode the phage genomes into matrices, we design a genome encoding method that applied various spaced $k$-mer pairs to tolerate sequence variations, including insertion, deletions, and mutations. DeepHost applies a convolutional neural network to predict host taxonomies. DeepHost achieves the prediction accuracy of 96.05% at the genus level (72 taxonomies) and 90.78% at the species level (118 taxonomies), which outperforms the existing phage host prediction tools by 10.16–30.48% and achieves comparable results to BLAST. For the genomes without hits in BLAST, DeepHost obtains the accuracy of 38.00% at the genus level and 26.47% at the species level, making it suitable for genomes of less homologous sequences with the existing datasets. DeepHost is alignment-free, and it is faster than BLAST, especially for large datasets. DeepHost is available at https://github.com/deepomicslab/DeepHost.