Date: 2022-06-04

Liang Huang

Associate Professor of Computer Science

Oregon State University

Description: Human sentence processing is well known to be incremental and linear-time: you never wait until the end of a sentence to start parsing it. This is in sharp contrast with computer processing of natural language, which often needs a full sentence as input (thus not incremental) and is slow (superlinear time). How can we teach computers to understand, generate, and translate human languages the way we do every day? This talk presents two success stories of human-inspired incremental algorithms in NLP. First, we showcase our recent breakthrough in the extremely challenging task of simultaneous translation, where the translation happens concurrently with the source-language speech. Inspired by human interpreters, I invented a prefix-to-prefix framework tailored to this problem that naturally enables anticipation and controllable latency. This breakthrough has rejuvenated community-wide interest in this hard problem and has been covered by numerous media reports. I will also discuss our recent efforts towards simultaneous speech-to-speech translation. Second, inspired by psycholinguistics and compiler theory, I designed the first linear-time dynamic programming algorithm for incremental parsing, which searches over exponentially many candidates while mimicking local ambiguity packing in psycholinguistics.

On the other hand, my algorithms can also be adapted from language to biology, thanks to the deep connection between syntax and biological structures. For example, I adapted the above-mentioned linear-time parsing algorithm to RNA structure prediction, achieving the first linear-time RNA folding algorithm. Quite unexpectedly, this algorithm has been widely used during the COVID-19 pandemic, where the excessive length of the SARS-CoV-2 genome renders all other folding algorithms impractical. Building on this, I further developed two parsing-based algorithms to fight COVID-19: one for mRNA vaccine design, used by 30+ vaccine companies worldwide, and the other to find the “Achilles’ heels” of SARS-CoV-2 genomes (PNAS, 2021).

Speaker Bio: Liang Huang (PhD, 2008, UPenn, CS) is an Associate Professor of Computer Science at Oregon State University and a Distinguished Scientist at Baidu Research USA. Before that, he held faculty and scientist positions at CUNY, USC, and Google. He develops fast and principled algorithms for natural language processing and adapts them to computational biology. His recognitions include an ACL 2019 keynote, a CVPR 2021 invited talk, the ACL 2008 best paper award, an EMNLP 2016 best paper honorable mention, several best paper nominations (ACL 2007, EMNLP 2008, ACL 2010, SIGMOD 2018), and a University Teaching Prize at Penn. Several of his algorithms have become de facto standards in NLP textbooks. During COVID-19, he has done impactful work to help fight the pandemic, including an mRNA design algorithm used by 30+ vaccine companies worldwide and a fast algorithm to find the “Achilles’ heels” of SARS-CoV-2 genomes (PNAS, 2021), both of which were inspired by natural language parsing techniques.