Michael Han

PhD Candidate, Department of Computer Science

Harvard University

Description: Attacks today are increasingly difficult to detect, and the damage they cause continues to skyrocket. For example, a data breach takes over 200 days on average to identify and costs about $4 million to rectify. More than 18,000 organizations were affected by the late-2020 SolarWinds supply chain attack. Devastating attacks that make headlines (e.g., Equifax, Target, and Kaseya) are no longer isolated, rare incidents.

In this talk, I will present my work on leveraging kernel-level data provenance to detect system intrusions. Kernel-level data provenance describes system activity as a directed acyclic graph that represents interactions between low-level kernel objects such as processes, files, and sockets. I will describe CamFlow, an OS infrastructure that captures such provenance graphs with negligible performance overhead. I will then describe a host intrusion detection system (IDS), called Unicorn, that uses provenance graphs to detect particularly dangerous attacks called advanced persistent threats (APTs). APTs are the main cause of many of today's large-scale data breaches. Unicorn applies machine learning to provenance graphs to identify system anomalies caused by APTs in real time without a priori attack knowledge.

I will close the talk by discussing challenges and opportunities in provenance-based intrusion detection, including efforts to develop a robust IDS that not only provides timely anomaly detection, but also explains the manner in which an attack unfolds.

Speaker Bio: Xueyuan (Michael) Han is a computer science doctoral candidate advised by Professor James Mickens at Harvard University and Professor Margo Seltzer at the University of British Columbia. His research interests lie at the intersection of systems, security, and privacy. His work focuses on combining practical system design and machine learning to detect host intrusions, and on designing language-level frameworks that respect user directives for handling private data. He has previously spent time at the University of Cambridge, Microsoft Research, and NEC Labs America. He is a Siebel Scholar and holds a B.S. in computer science from UCLA.

