[BK21 x ERC Seminar] Dr. Krzysztof Choromanski, Google DeepMind, Wed., Nov. 15, 4:00 PM
November 15, 2023

Date/Time: Wednesday, November 15, 4:00-5:00 PM
Venue: Building 43-2, Room B102
Speaker: Dr. Krzysztof Choromanski, Google DeepMind
Title: Towards Practical Robotics Transformers
Abstract: Transformer architectures have revolutionized modern machine learning, quickly overtaking regular deep neural networks in practically all of its fields: from large language models through vision and speech to Robotics. One of the main challenges in using them to model long-range interactions (critical for such applications as bioinformatics, e.g. genome modeling) and in settings with strict latency constraints (e.g. Robotics) remains the prohibitively expensive quadratic space and time complexity (in the length of their input sequences) of their core attention modules. Attention-linearization techniques applying kernel methods and random features led to one of the most mathematically rigorous ways of addressing this problem, and to the birth of various scalable Transformer architectures (such as the class of low-rank implicit-attention Transformers called Performers).
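To make the linearization idea concrete, the standard rebracketing behind kernel-based attention can be written as follows (a generic sketch in conventional notation, not taken from the talk itself). If the softmax kernel is approximated by a randomized feature map $\phi$, i.e. $\exp(q^{\top}k) \approx \phi(q)^{\top}\phi(k)$, then

$$\mathrm{Att}(Q,K,V)_i = \frac{\sum_{j}\exp(q_i^{\top}k_j)\,v_j^{\top}}{\sum_{j}\exp(q_i^{\top}k_j)} \approx \frac{\phi(q_i)^{\top}\left(\sum_{j}\phi(k_j)\,v_j^{\top}\right)}{\phi(q_i)^{\top}\sum_{j}\phi(k_j)},$$

where the bracketed sums over the keys are computed once in $O(Lmd)$ time ($L$ = sequence length, $m$ = number of random features, $d$ = head dimension), so the $L \times L$ attention matrix is never materialized.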
In this talk, I will summarize the recent progress made on scaling up Transformers with kernel features and present related open mathematical problems. I will discuss those in the context of the new, rapidly growing field of Robotics Transformers. The talk will provide an introduction to modern attention-linearization algorithms based on low-rank factorization (such as the FAVOR, FAVOR+, and FAVOR# mechanisms, QMC techniques, and methods producing topologically aware modulation of the regular attention modules in Transformers via RF-based linearizations of various graph kernels). Applications of linear attention in Robotics Transformers will be illustrated with the Performer-MPC controllers and the class of SARA-RTs (new Robotics Transformers recently used to speed up RT-2 models and controllers leveraging Point Cloud Transformers, with no loss of quality).
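For readers new to these mechanisms, below is a minimal NumPy sketch of the FAVOR+ idea referenced above: positive random features giving an unbiased estimate of the softmax kernel, combined with the rebracketing shown earlier. All function names, shapes, and the plain-NumPy setting are illustrative assumptions on my part; the full FAVOR+ mechanism additionally uses orthogonal random projections and numerical-stability tricks that this sketch omits.

```python
import numpy as np

def positive_random_features(x, omega):
    # phi(x) = exp(omega^T x - ||x||^2 / 2) / sqrt(m): in expectation over
    # Gaussian omega, phi(q)^T phi(k) equals the softmax kernel exp(q^T k).
    m = omega.shape[0]
    proj = x @ omega.T                                    # (L, m)
    norm = 0.5 * np.sum(x ** 2, axis=-1, keepdims=True)   # (L, 1)
    return np.exp(proj - norm) / np.sqrt(m)

def favor_plus_attention(Q, K, V, num_features=256, seed=0):
    # Linear-attention sketch: O(L * m * d) time instead of O(L^2 * d).
    L, d = Q.shape
    rng = np.random.default_rng(seed)
    omega = rng.normal(size=(num_features, d))  # i.i.d. Gaussian projections
    scale = d ** 0.25  # dividing q and k by d^(1/4) yields exp(q^T k / sqrt(d))
    Qp = positive_random_features(Q / scale, omega)       # (L, m)
    Kp = positive_random_features(K / scale, omega)       # (L, m)
    # Rebracketing: phi(Q) @ (phi(K)^T V) never forms the L x L matrix.
    kv = Kp.T @ V                                         # (m, d), computed once
    numer = Qp @ kv                                       # (L, d)
    denom = Qp @ Kp.sum(axis=0)                           # (L,) normalizer
    return numer / denom[:, None]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    Q, K, V = (rng.normal(size=(512, 64)) for _ in range(3))
    print(favor_plus_attention(Q, K, V).shape)  # (512, 64)
```

Note that the numerator and the normalizer reuse the same (m, d)- and (m,)-sized summaries of all keys, which is exactly what brings the cost down from quadratic to linear in the sequence length.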
Bio: Dr. Krzysztof Choromanski is a research scientist at Google DeepMind and an adjunct assistant professor at Columbia University. He obtained his Ph.D. from the IEOR Department at Columbia University, where he worked on various problems in structural graph theory (in particular, the celebrated Erdős-Hajnal Conjecture and random graphs). His current interests include Robotics, scalable Transformer architectures (also for topologically rich inputs), the theory of random features, and structured neural networks. Krzysztof is one of the co-founders of the class of Performers, the first Transformer architectures providing efficient unbiased estimation of the regular softmax-kernel matrices used in Transformers.