Rulin Shao

I am a first-year PhD student at the University of Washington, advised by Prof. Pang Wei Koh and Prof. Luke Zettlemoyer. From January 2023 to June 2023, I worked as an applied scientist at AWS, focusing on large-scale pretraining for Amazon Bedrock. I obtained my master's degree in Machine Learning at CMU, advised by Prof. Eric Xing, and my undergraduate degree in Mathematics at XJTU.

I'm interested in the scaling properties and robustness of retrieval-based LMs, and in efficient system design that makes large-scale retrieval more accessible to academics. I am also interested in long-context chatbots and vision-language models.

Google Scholar  /  GitHub  /  Twitter  /  LinkedIn  /  Email

  • 2023-10 Gave a talk at AWS on Long-context LLMs! [slides]
  • 2023-10 Gave a talk at SAMPL Lab on scaling up retrieval-based language models!
  • 2023-10 Introducing LightSeq , a better sequence parallelism solution for long-context LLM training! It is highly optimized for decoders and is agnostic to the model architecture, allowing you to scale your sequence length without an upper bound!
  • 2023-08 Introducing VisIT-Bench , a new vision-language instruction following benchmark inspired by real-world use cases.
  • 2023-06 Introducing our latest work on long-context models (LongChat) and benchmarks (LongEval) in this LMSYS blog post! Stay tuned!
  • 2023-01 I graduated from CMU and joined AWS as a full-time Applied Scientist!


My boyfriend asked me to put him here :). He is awesome!