
I’m a Ph.D. candidate at Stanford CS, advised by Chelsea Finn and part of the IRIS lab. I am affiliated with SAIL, CRFM, and the ML Group at Stanford. My research is supported by an OpenAI Superalignment Fellowship and a KFAS PhD Scholarship.

Previously, during my mandatory military service in South Korea, I worked as a research scientist at Kakao and AITRICS, collaborating with Juho Lee. Before that, I completed my master’s degree (CS, advised by Seungjin Choi) and my undergraduate degree (mathematics) at POSTECH.

Here are some key questions that guide my research:

  • Teaching strong models: Strong pre-trained models already know much of what we want to teach them. Post-training seems to be more about eliciting the appropriate pre-existing capabilities than instilling entirely new information. Can we develop a more effective paradigm for “teaching” models, one that leverages the capabilities already present in pre-trained models?
  • Underspecification: No dataset fully specifies its intended task. How can we make models recognize and represent the multitude of possible realities consistent with given data? What is the best way to leverage such diverse hypotheses?
  • Understanding information: Any piece of data carries an underlying essence (“information”) that exists independently of its specific representation. How can we better conceptualize this notion of information, and understand the mechanisms by which machine learning models extract, store, and communicate it?
  • Mitigating risks: What strategies can we employ to handle the reality that machine learning systems sometimes generate erroneous or harmful outputs?

Selected Papers

Clarify: Improving Model Robustness with Natural Language Corrections

Yoonho Lee, Michelle Lam, Helena Vasconcelos, Michael S. Bernstein, Chelsea Finn

NeurIPS 2023 workshops XAIA, ICBINB

AutoFT: Learning an Objective for Robust Fine-Tuning

Caroline Choi*, Yoonho Lee*, Annie S. Chen, Allan Zhou, Aditi Raghunathan, Chelsea Finn

NeurIPS 2023 workshop DistShift

Project and Probe: Sample-Efficient Domain Adaptation by Interpolating Orthogonal Features

Annie S. Chen*, Yoonho Lee*, Amrith Setlur, Sergey Levine, Chelsea Finn

ICLR 2024 (spotlight)

DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature

Eric Mitchell, Yoonho Lee, Alexander Khazatsky, Christopher D. Manning, Chelsea Finn

ICML 2023 (long oral)

Surgical Fine-Tuning Improves Adaptation to Distribution Shifts

Yoonho Lee*, Annie S. Chen*, Fahim Tajwar, Ananya Kumar, Huaxiu Yao, Percy Liang, Chelsea Finn

ICLR 2023

Set Transformer: A Framework for Attention-based Permutation-Invariant Neural Networks

Juho Lee, Yoonho Lee, Jungtaek Kim, Adam Kosiorek, Seungjin Choi, Yee Whye Teh

ICML 2019