Reconstructing the genome of an organism from high-throughput sequencing (HTS) experiments when no reference genome is available (de novo assembly) is a challenging problem that has been approached in several ways over the past decade. Although numerous methods are available and many reconstruct genomes reasonably well, generalized template libraries and interchangeable data structures and methods for serial, multithreaded, and distributed processing are lacking. In this work we propose a novel set of cache-oblivious generic data structures for serial, multithreaded, and distributed processing of HTS data, aimed at building de Bruijn or k-mer graphs for use in de novo assembly and related HTS data analytics problems.
Milicchio, F. (2016). High-performance data structures for de novo assembly of genomes: cache oblivious generic programming. In Proceedings of the 7th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics.
Title: | High-performance data structures for de novo assembly of genomes: cache oblivious generic programming |
Authors: | |
Publication date: | 2016 |
Citation: | Milicchio, F. (2016). High-performance data structures for de novo assembly of genomes: cache oblivious generic programming. In Proceedings of the 7th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics. |
Handle: | http://hdl.handle.net/11590/320473 |
Appears in categories: | 4.1 Contribution in conference proceedings |