"Several worldwide pandemics, such as influenza, human immunodeficiency virus, and coronavirus, are caused by viral quasispecies. Characterization of quasispecies harboring in a host is essential to unveil the mechanisms that are at the base of the pathogen evolution, infection and spread at the epidemic level. Next generation sequencing (NGS) produces many thousands of sequence fragments from a single sample, allowing the full genome sequencing at high resolution. In this work, an original approach for the de novo assembly (reconstruction of a full genome without the need of a reference genome) of NGS reads into the quasispecies present in the sample is introduced, using biased random walks over an overlap graph construction. The proposed framework is shown to be successful in reconstructing viral quasispecies at different diversities, using both simulated and empirical data. In addition, a broad set of measures describing topological properties of the overlap graphs is examined, in order to highlight differences in the data sets and therefore in the population structures."
Prosperi, M., Meloni, S., Fanti, I., Panzieri, S., Ulivi, G., Salemi, M. (2012). Characterization of de novo assemblies of quasispecies from next-generation sequencing via complex network modeling. SCIENTIFIC RESEARCH AND ESSAYS, 2997-3009 [10.5897/SRE12.242].
Characterization of de novo assemblies of quasispecies from next-generation sequencing via complex network modeling
PANZIERI, Stefano; ULIVI, Giovanni
2012-01-01
Abstract
"Several worldwide pandemics, such as influenza, human immunodeficiency virus, and coronavirus, are caused by viral quasispecies. Characterization of quasispecies harboring in a host is essential to unveil the mechanisms that are at the base of the pathogen evolution, infection and spread at the epidemic level. Next generation sequencing (NGS) produces many thousands of sequence fragments from a single sample, allowing the full genome sequencing at high resolution. In this work, an original approach for the de novo assembly (reconstruction of a full genome without the need of a reference genome) of NGS reads into the quasispecies present in the sample is introduced, using biased random walks over an overlap graph construction. The proposed framework is shown to be successful in reconstructing viral quasispecies at different diversities, using both simulated and empirical data. In addition, a broad set of measures describing topological properties of the overlap graphs is examined, in order to highlight differences in the data sets and therefore in the population structures."I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.