News
- 2024-09 Joined Meta as a visiting researcher!
- 2024-07 Introducing MassiveDS, a 1.4T-token open-source datastore, and a scaling study for retrieval-based language models! We show that scaling the data used at test time yields better compute-optimal scaling curves.
- 2024-06 Started internship at Meta FAIR!
- 2023-10 Gave a talk at AWS on long-context LLMs! [slides]
- 2023-10 Gave a talk at SAMPL Lab on scaling up retrieval-based language models!
- 2023-10 Introducing LightSeq, a better sequence parallelism solution for long-context LLM training! It is highly optimized for decoders and is not entangled with the model architecture, allowing you to scale up your sequence length without an upper bound!
- 2023-09 Started PhD at University of Washington!
- 2023-06 Introducing our latest work on long-context models (LongChat) and a benchmark (LongEval) in this LMSys blog! Stay tuned!
- 2023-01 Joined AWS as a full-time Applied Scientist working on language model pretraining!