Speaker
Description
Pangenomes are becoming increasingly popular data structures for genomic analyses because they compactly represent the genetic diversity within populations. Constructing a pangenome graph, however, is still a time-consuming and expensive process. A promising approach to pangenome construction is to progressively augment a pangenome graph with additional high-quality assemblies. Currently, there is no method for augmenting a pangenome graph with unassembled reads from newly sequenced samples without first aligning the reads to a reference genome and performing variant calling and genotyping on the new individuals. In this work, we present the first assembly-free and mapping-free approach for augmenting an existing pangenome graph using unassembled long reads from an individual not already present in the pangenome. Our approach consists of three steps: finding sample-specific sequences in the reads using efficient indexes, clustering the reads that correspond to the same novel variant(s), and building, for each variant separately, a consensus sequence to be added to the pangenome graph. An additional advantage of our approach is that it collects and characterizes the variants specific to new individuals while updating the graph topology. However, as we will demonstrate, evaluating the accuracy of this characterization poses several challenges that make the task particularly complex. Using simulated reads and real pangenome graphs provided by the Human Pangenome Reference Consortium (HPRC), we demonstrate the effectiveness of the proposed approach. Software and code are freely available at github.com/ldenti/palss and github.com/ldenti/svbench-fw.
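
The three steps described above (finding sample-specific sequences, clustering, consensus) can be illustrated with a toy sketch. This is not the palss implementation: the plain k-mer set is an assumed stand-in for the efficient indexes mentioned in the abstract, the greedy clustering and majority-vote consensus are simplifications, and all function names and the choice of `k` are hypothetical.

```python
from collections import Counter


def kmers(seq, k):
    """All overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_index(haplotypes, k):
    """Toy index (assumption): the set of all k-mers in the pangenome haplotypes."""
    return {km for hap in haplotypes for km in kmers(hap, k)}


def specific_segments(read, index, k):
    """Maximal substrings of read covered only by k-mers absent from the index."""
    segments, start, end = [], None, None
    for i, km in enumerate(kmers(read, k)):
        if km not in index:
            if start is None:
                start = i
            end = i + k
        elif start is not None:
            segments.append(read[start:end])
            start = None
    if start is not None:
        segments.append(read[start:end])
    return segments


def cluster_segments(segments, k):
    """Greedy clustering: a segment joins the first cluster it shares a k-mer with."""
    clusters = []  # list of (k-mer set, member segments)
    for seg in segments:
        seg_kmers = set(kmers(seg, k))
        for kmer_set, members in clusters:
            if kmer_set & seg_kmers:
                kmer_set |= seg_kmers  # in-place update of the cluster's k-mer set
                members.append(seg)
                break
        else:
            clusters.append((seg_kmers, [seg]))
    return [members for _, members in clusters]


def consensus(members):
    """Naive consensus (assumption): the most frequent segment in the cluster."""
    return Counter(members).most_common(1)[0][0]


def candidate_sequences(reads, haplotypes, k=4):
    """One consensus sequence per cluster of sample-specific segments."""
    index = build_index(haplotypes, k)
    segments = [s for r in reads for s in specific_segments(r, index, k)]
    return [consensus(c) for c in cluster_segments(segments, k)]


if __name__ == "__main__":
    haplotypes = ["ACGTTGCAAGCTTAGGCATC"]
    # Two reads carrying the same hypothetical insertion, one read matching the graph.
    reads = [
        "ACGTTGCAAGGGGGGCTTAGGCATC",
        "GTTGCAAGGGGGGCTTAGGC",
        "ACGTTGCAAGCTTAGGCATC",
    ]
    print(candidate_sequences(reads, haplotypes))
```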
| Faculty department | Department of Applied Informatics |
|---|---|
| Print poster | I request that the poster be printed at the faculty |