Defense: Accurate Genome Analysis with Nanopore Sequencing Using Deep Neural Networks

Kishwar Shafin
Biomolecular Engineering & Bioinformatics PhD Candidate
Location: Hybrid
Advisor: Benedict Paten

Location: Engineering 2, Room 599

Join us on Zoom: https://ucsc.zoom.us/j/91977905535?pwd=eVZnVGFxYWdRY01LVHZjMHJDcndEZz09/ Passcode: pepper

Description: Nanopore sequencing, commercialized by Oxford Nanopore Technologies (ONT), is a high-throughput genome sequencing platform. Unlike traditional sequencing-by-synthesis methods, nanopore sequencing measures changes in electrical current to sense the nucleotide sequence as it passes through the pore. The signal-to-base conversion process introduces unique error patterns, which makes it difficult to design analysis methods that rely on hand-crafted features. Deep learning is a subfield of machine learning that uses multiple layers to progressively learn complex patterns in the input data, making it well suited to nanopore genome analysis.

Here I will present a set of methods I developed, based on deep neural networks, that improve human genome assembly and variant calling with nanopore sequencing data. I will demonstrate a pipeline that performs de novo assembly of eleven human genomes in nine days. I will introduce PEPPER-Margin-DeepVariant, a haplotype-aware variant caller that produces state-of-the-art results for nanopore long reads. I will show how these methods were applied to validate and correct errors in the first complete human genome assembly. Finally, I will demonstrate the utility of PEPPER-Margin-DeepVariant, paired with highly multiplexed nanopore sequencing, for rapidly identifying disease-causing variants in critically ill patients admitted to the neonatal intensive care unit (NICU).