Photo of Yoonho Lee

I’m a final-year PhD student at Stanford CS advised by Chelsea Finn. My research is supported by OpenAI and KFAS.

Continual learning is important: we want systems that learn during deployment. True continual learning goes beyond fast local adaptation toward long-horizon improvement from rich, heterogeneous, history-dependent experience.

Text space is a natural medium for this kind of continual learning. Many forms of textual feedback carry useful information about how to improve a system: execution traces, scored or annotated trajectories, human critiques, retrieved articles, and experiment logs all indicate what went wrong and what to try next. Standard learning pipelines compress such information too early into a scalar objective to compute gradients against. External textual artifacts give us a natural place to keep and directly reason over this information without immediately discarding its structure.

Active research directions, all aimed at operationalizing true continual learning:

  1. Better text optimizers. Optimizing text artifacts against a given objective is a fundamental building block of continual learning in text space. My recent contribution here is a design pattern that preserves all previous information by storing it in a filesystem, enabling long-horizon credit assignment without blowing up the context window.
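The pattern above can be sketched in a few lines. This is a minimal illustration under my own assumptions, not the actual implementation: every candidate artifact and its feedback are written to disk, so later iterations can revisit the full history without holding it all in the context window. The names (`record_step`, `best_so_far`, the `opt_history/` layout) are hypothetical.

```python
import json
from pathlib import Path

HISTORY = Path("opt_history")  # hypothetical layout: one JSON file per step
HISTORY.mkdir(exist_ok=True)

def record_step(step: int, artifact: str, feedback: str, score: float) -> Path:
    """Persist one optimization step so no information is ever discarded."""
    path = HISTORY / f"step_{step:04d}.json"
    path.write_text(json.dumps(
        {"artifact": artifact, "feedback": feedback, "score": score}))
    return path

def best_so_far() -> dict:
    """Scan the full on-disk history for the highest-scoring artifact."""
    steps = [json.loads(p.read_text())
             for p in sorted(HISTORY.glob("step_*.json"))]
    return max(steps, key=lambda s: s["score"])

def optimize(propose, evaluate, n_steps: int = 3) -> dict:
    """Core loop: evaluate, log to the filesystem, then propose a revision.

    In a real system `propose` would be an LLM call free to read any
    subset of the history files; here it is an arbitrary stand-in.
    """
    artifact = "initial draft"
    for step in range(n_steps):
        feedback, score = evaluate(artifact)
        record_step(step, artifact, feedback, score)
        artifact = propose(artifact, feedback)
    return best_so_far()
```

Because the history lives outside the model, the context window only needs to hold whatever slice of it the current proposal step chooses to read.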

  2. Deciding what to optimize. Models seem increasingly capable of autonomously hill-climbing almost any reasonable fixed objective, and progress on text optimization will accelerate this trend. The bottleneck, then, shifts to determining what should be optimized. Concretely, I am interested in learning pipelines where the optimizee and objective evolve together: new artifacts expose weaknesses in the current objective, and new objectives redirect the next round of search.

  3. Understanding text-space learning. Learning in text space works surprisingly well, yet we have little understanding of its properties. I am interested in building a clean theoretical foundation for understanding the “text hypothesis class” and the algorithms that operate on it.

For a technical overview, see my blog posts on Feedback Descent and Meta-Harness or the selected papers below.

2026

Preprint

Agentic search over LLM harnesses using filesystem access to full execution history. Outperforms hand-designed systems on text classification, math reasoning, and agentic coding.
2026

Oral, MemAgents Workshop @ ICLR 2026
RSI Workshop @ ICLR 2026

Operationalizes the core text optimization loop, accumulating “why better” signals from pairwise comparisons across up to a thousand iterations.
2026

ICLR 2026
Spotlight, ES-FoMo Workshop @ ICML 2025
Oral, Ram2 Workshop @ CoLM 2025

A hierarchical RL framework for training LLMs to discover and use textual abstractions for reasoning. Demonstrates that information useful for solving reasoning problems can be represented in pure text form.
2024

UIST 2024
XAIA Workshop @ NeurIPS 2023
ICBINB Workshop @ NeurIPS 2023

An interface for teaching vision models directly through natural language. The human feedback specifies the concept to (un)learn via gradient descent.

My name (윤호) is pronounced roughly like “you know” said quickly, with stress on ‘you’.

Feel free to reach out via email! I’m planning to be on the job market in early 2027, for both academic and industry positions.