New algorithm makes offline disease ID possible.
The Garvan Institute of Medical Research has partnered with the University of NSW to take genome analysis ‘offline’ by adapting the algorithms that perform DNA analysis so they require far less computing power than current tools.
Medical practitioners fighting the Ebola and Zika viruses in Guinea and Brazil have already used small genome sequencing devices that can clip on to a smartphone, but these devices still require high-performance computer workstations or reliable internet connections to identify genes.
Devices like the Oxford Nanopore Technologies MinION can create over a terabyte of data in 48 hours, but their use still isn’t commonplace. Identifying an unknown sample means comparing, or ‘aligning’, its DNA against a reference database, a step that requires around 16 GB of RAM, which is beyond the capabilities of most mid-range laptops and flagship smartphones.
For cash-strapped medical programs in developing countries or during large-scale outbreaks, that kind of processing power isn’t easy to come by at scale, and a reliable internet connection can be just as hard to find.
In a new paper released in Nature, Garvan’s Genomic Technologies lead Dr Martin Smith and his team detail a computational method that reduces the memory needed to align sequences from 11 GB to 2 GB, well within the reach of mid-range smartphones.
The researchers adapted the Minimap2 program, which aligns DNA sequencing ‘reads’ to a reference library of known genomes.
This reference library is usually indexed, which helps to map sequencing reads to their corresponding positions in a genome.
“The challenge, so far, has been that the reference index requires too much computer memory,” Smith said.
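To make the idea of an indexed reference concrete, here is a minimal Python sketch of how an index can map a read to its position in a genome. It is a toy illustration only, not Minimap2's actual method: it stores every short substring (k-mer) of the reference, and the function names, the tiny reference string and the choice of k = 5 are all assumptions made for the example.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Map every k-mer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def map_read(read, index, k):
    """Return the reference offset supported by the most k-mer hits, or None."""
    votes = defaultdict(int)
    for i in range(len(read) - k + 1):
        # Each hit at reference position `pos` implies the read starts at pos - i.
        for pos in index.get(read[i:i + k], ()):
            votes[pos - i] += 1
    if not votes:
        return None
    return max(votes, key=votes.get)


# Hypothetical toy data: a 30-base "genome" and a read copied from it.
reference = "ACGTTGCAAGCTTAGGCTAACGGATCCGTA"
index = build_kmer_index(reference, k=5)
read = reference[12:24]
print(map_read(read, index, k=5))
```

Because the index holds an entry for every position in the reference, its size grows with the reference itself, which is why a library of many genomes can demand more memory than a phone has.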