RiboDetector rapidly and accurately detects rRNA sequences in metagenomic, metatranscriptomic, and noncoding RNA (ncRNA) sequencing data. It is optimized for both CPUs and GPUs. Compared with existing software, it runs 10-50x faster and produces ~10x fewer misclassifications.
RiboDetector is a software tool based on a Bi-directional Long Short-Term Memory (BiLSTM) neural network that rapidly and accurately identifies rRNA reads in transcriptomic, metagenomic, metatranscriptomic, noncoding RNA, and ribosome profiling sequence data. Compared with state-of-the-art approaches, RiboDetector produced at least six times fewer misclassifications on the benchmark datasets. Importantly, the few false positives of RiboDetector were not enriched in particular Gene Ontology (GO) terms, suggesting a low bias for downstream functional profiling. RiboDetector also generalized well to novel rRNA sequences that diverge from the training data, with sequence identities of <90%. On a personal computer, RiboDetector processed 40M reads in less than 6 min, which was ∼50 times faster than other methods in GPU mode and ∼15 times faster in CPU mode. RiboDetector is available under a GPL v3.0 license at https://github.com/hzi-bifo/RiboDetector.
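
As a rough sanity check on the throughput figure, the arithmetic can be sketched as below. The inputs (40M reads, 6 min) are taken from the text above; the function name `min_reads_per_second` is hypothetical and not part of RiboDetector. Because the run finished in less than 6 min, the result is only a lower bound.

```python
def min_reads_per_second(n_reads: int, max_minutes: float) -> float:
    """Lower bound on throughput when n_reads finish within max_minutes."""
    if max_minutes <= 0:
        raise ValueError("max_minutes must be positive")
    # Convert the time limit to seconds, then divide reads by seconds.
    return n_reads / (max_minutes * 60.0)


# Figures quoted in the text: 40M reads in under 6 minutes.
rate = min_reads_per_second(40_000_000, 6)
print(f"at least {rate:,.0f} reads/s")  # at least 111,111 reads/s
```

In other words, the reported run corresponds to more than about 111,000 reads per second.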