Governors Island @ NYC, Jun. 2022
Photo not (😏) taken by my friend Nancy


Jiao Sun

Hi there 👋 I am a Ph.D. candidate (yay! 🎉) and an Amazon Fellow at USC, where I have been working on NLP since Fall 2019 with Xuezhe Ma (advisor), Nanyun (Violet) Peng, and Swabha Swayamdipta. Before that, I finished my master's @ IIIS, led by Andrew Chi-Chih Yao, at Tsinghua University.

I work on trustworthy generation. More specifically, I build controlled text generation models and enhance the robustness of NLG systems, including NLG evaluation and data efficiency.

I have also been actively working on Large Language Models (LLMs): 1) I pretrained models with over 60 TB of data and then fine-tuned with mT5 on TPUs, ranging from mT5 Base to mT5 XXL (13 billion parameters). 2) I distilled PaLM's dialect-rewriting ability into a smaller model based on EdiT5. 3) I was part of the LIMA effort, where we trained a high-quality LLM with only 1000 examples. In addition, I have spent time working on fairness problems and advocating fair AI techniques, including event bias on Wikipedia [Best Paper Nomination @ ACL 2021] and on greeting cards [Best Paper Honorable Mention @ CHI 2022].

Feel free to drop me a line! jiaosun.thu@gmail.com


News
  • 2023.08   Thrilled to be selected as a 2023 EECS Rising Star!
  • 2022.05   Excited to start my internship at Google Research!
  • 2022.03   Best paper honorable mention at CHI 2022!
  • 2021.08   AESOP and ESTER got accepted by EMNLP 2021, stay tuned!
  • I follow the Boston Celtics and the Miami Heat.

Experience

May 2023 - Nov 2023

Student Researcher, Google Research
Host: Cyrus Rashtchian

May 2022 - Nov 2022

Student Researcher, Google DeepMind
Hosts: Sebastian Gehrmann, Jacob Eisenstein

Jan 2022 - May 2022

Research Internship, Amazon Alexa AI
Hosts: Nanyun Violet Peng, Anjali Narayan-Chen

May 2021 - Aug 2021

Research Internship, IBM Thomas J. Watson Research Center
Hosts: Justin D Weisz, Q. Vera Liao

Invited Talks


2023

03/15@Wikimedia Research

Event Gender Bias in Wikipedia
Host: Emily Lescak  

02/14@Allen Institute for AI

Discovering and Addressing the Pitfalls in Practical Text Generation
Host: Noah A. Smith

2022

12/13 Interview@Amazon Science

09/23@USC-Amazon Center

Context Situated Pun Generation
Host: Salman Avestimehr

Service

2022

AAAI 2022, CSCW 2022, EMNLP 2022

2023

AAAI 2023, CHI 2023, CSCW 2023, TL4NLP@NeurIPS 2023