Parallel Algorithm for indexing large DNA Sequences Using MapReduce on Hadoop
Conference proceedings article
Publication Details
Author list: Kaniwa F, Dinakenyane O, Kuthadi VM
Place: NEW YORK
Publication year: 2017
Journal: 2017 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM) (2156-1125)
Journal acronym: IEEE INT C BIOINFORM
Start page: 1576
End page: 1582
Number of pages: 7
eISBN: 978-1-5090-3050-7
ISSN: 2156-1125
eISSN: 2156-1133
Languages: English-Great Britain (EN-GB)
Abstract
MapReduce has recently become a very successful parallel processing technique. The latest DNA sequencing technologies can now generate huge DNA sequences easily and cheaply. Consequently, mining patterns in these sequences is a challenge for single-core processor systems, which deliver unsatisfactory performance. In this paper, we address this challenge by using MapReduce on the Hadoop platform together with a successful data structure called the generalized suffix tree. Our experimental results show that the proposed approach can index long sequences with better performance than previous related approaches.
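To make the general idea concrete, the following is a minimal sketch of suffix indexing in a MapReduce style. It is not the authors' algorithm, which this record does not describe. It assumes a common partitioning scheme in which the map step keys every suffix by its leading characters and the reduce step sorts each bucket independently. It runs in a single Python process rather than on Hadoop, and it builds sorted suffix buckets as a simplified stand-in for a generalized suffix tree. All function names and the `prefix_len` parameter are hypothetical.

```python
from collections import defaultdict


def map_suffixes(seq_id, sequence, prefix_len=1):
    """Map step (assumed scheme): emit (prefix, (seq_id, offset)) per suffix."""
    for offset in range(len(sequence)):
        yield sequence[offset:offset + prefix_len], (seq_id, offset)


def shuffle(pairs):
    """Group mapped values by key, as a MapReduce framework would do."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_bucket(positions, sequences):
    """Reduce step: sort one bucket's suffixes lexicographically."""
    return sorted(positions, key=lambda p: (sequences[p[0]][p[1]:], p))


def build_index(sequences, prefix_len=1):
    """Index several sequences at once; each bucket is built independently."""
    pairs = (
        pair
        for seq_id, seq in sequences.items()
        for pair in map_suffixes(seq_id, seq, prefix_len)
    )
    return {
        prefix: reduce_bucket(positions, sequences)
        for prefix, positions in shuffle(pairs).items()
    }


def find(index, sequences, pattern, prefix_len=1):
    """Return sorted (seq_id, offset) pairs where pattern occurs."""
    if len(pattern) < prefix_len:
        raise ValueError("pattern must be at least prefix_len characters long")
    bucket = index.get(pattern[:prefix_len], [])
    return sorted(
        (s, o) for s, o in bucket if sequences[s].startswith(pattern, o)
    )


if __name__ == "__main__":
    # Toy input, invented for illustration only.
    sequences = {"s1": "ACGTAC", "s2": "TACG"}
    index = build_index(sequences)
    print(find(index, sequences, "AC"))
```

Because each bucket depends only on the suffixes that share its prefix, the reduce calls have no dependencies on one another, which is the property that lets such a scheme be spread across many workers.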
Keywords
DNA Sequences, MapReduce, Parallel Algorithm, Suffix Trees