Kwanyoung Park
I'm an undergraduate at Seoul National University, actively engaged in research in Reinforcement Learning (RL).
I'm majoring in Computer Science and Engineering, with a minor in Mathematical Sciences.
Currently, I am tackling offline RL problems with model-based approaches at Yonsei University's RLLAB under the guidance of Youngwoon Lee.
Previously, I did a research internship at the Human-centered Computer Systems Lab at Seoul National University with Youngki Lee.
During the internship, my primary focus was on bridging the gap between machine learning processes and human-like learning processes.
Email /
CV /
Google Scholar /
Github /
Twitter
Research
The goal of my research is to develop a "foundational embodied agent": an embodied agent that generalizes and rapidly adapts to novel tasks with few demonstrations.
To enable this, foundational models must develop priors about the world, capturing task distributions (e.g., manipulation, goal-reaching) and environmental dynamics (e.g., physical properties).
My current focus is on three key approaches:
- World Models:
World models, which can be trained on abundant task-agnostic data, offer rich, grounded learning signals via simulated trajectories, enabling improved generalization and planning.
However, world models are often inaccurate, especially in capturing physical knowledge critical for embodied agents.
How can we address these inaccuracies in model-based RL, and how can we train better world models?
- Offline RL:
While behavior cloning (BC) on large demonstration datasets has proven instrumental in building foundation models for robotics, it cannot learn stitching or corrective behaviors and therefore requires extensive optimal data.
Can we leverage offline RL to utilize suboptimal datasets for training and fine-tuning foundational embodied agents?
- Unsupervised RL:
Unsupervised RL enables task-agnostic pretraining through intrinsic rewards.
Similar to the success of unsupervised pretraining in vision and NLP domains, can we train transferable representations and skills in RL and effectively apply them to downstream tasks?
Publications
(* denotes equal contribution)
Model-based Offline Reinforcement Learning with Lower Expectile Q-Learning
Kwanyoung Park,
Youngwoon Lee
ICLR, 2025
paper /
code /
website /
twitter
We introduce a novel model-based offline RL method, Lower Expectile Q-learning (LEQ), which enhances long-horizon task performance by mitigating the high bias in model-based value estimation via lower expectile regression of λ-returns.
Our empirical results show that LEQ significantly outperforms previous model-based offline RL methods on long-horizon tasks, such as the D4RL AntMaze tasks, matching or surpassing the performance of model-free approaches.
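For intuition, here is a minimal sketch of the lower-expectile regression at the heart of LEQ (illustrative only, not the paper's training code; the function name, tensor shapes, and τ value are assumptions):

```python
import torch

def lower_expectile_loss(q_pred, lam_return, tau=0.1):
    """Expectile regression of Q against lambda-returns.

    With tau < 0.5, residuals where the target exceeds the prediction
    (diff > 0) receive weight tau < 1 - tau, so the learned Q tracks a
    lower expectile of the return distribution, yielding conservative
    value estimates under model (rollout) error.
    """
    diff = lam_return - q_pred                    # TD-style residual
    weight = torch.abs(tau - (diff < 0).float())  # tau if diff > 0, else 1 - tau
    return (weight * diff.pow(2)).mean()
```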
TLDR: Unsupervised Goal-Conditioned RL via Temporal Distance-Aware Representations
Junik Bae,
Kwanyoung Park,
Youngwoon Lee
CoRL, 2024
paper /
code /
website /
twitter
We propose a novel unsupervised goal-conditioned RL method, TLDR, which leverages TemporaL Distance-aware Representations.
Our approach selects faraway goals to initiate exploration and computes intrinsic exploration rewards and goal-reaching rewards based on temporal distance.
Our experimental results in robotic locomotion and manipulation environments demonstrate that our method significantly outperforms previous unsupervised GCRL methods in achieving a wide variety of states.
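As a rough sketch of the idea (hypothetical code, not the paper's implementation; `phi`, the buffers, and the reward shaping are assumptions), a temporal distance-aware embedding can drive both goal selection and reward computation:

```python
import numpy as np

def select_faraway_goal(phi, visited_states, candidate_goals):
    """Pick the candidate goal whose minimum distance (in the temporal
    distance-aware embedding phi) to any visited state is largest,
    steering exploration toward poorly covered regions."""
    z_visited = np.stack([phi(s) for s in visited_states])
    scores = [np.linalg.norm(z_visited - phi(g), axis=-1).min()
              for g in candidate_goals]
    return candidate_goals[int(np.argmax(scores))]

def goal_reaching_reward(phi, s_next, goal):
    # Denser than a sparse success signal: negative temporal
    # distance between the next state and the goal.
    return -np.linalg.norm(phi(s_next) - phi(goal))
```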
VECA: A New Benchmark and Toolkit for General Cognitive Development
Kwanyoung Park*,
Hyunseok Oh*,
Youngki Lee
AAAI, 2022   (Oral Presentation, Acceptance Rate: 384/9,251 = 4.15%)
paper /
code
We present VECA (Virtual Environment for Cognitive Assessment), which consists of two main components: (i) the first benchmark to assess the overall cognitive development of an AI agent, and (ii) a novel toolkit to generate diverse and distinct cognitive tasks.
The VECA benchmark virtually implements the cognitive scales of the Bayley Scales of Infant and Toddler Development-IV (Bayley-4), the gold-standard developmental assessment for human infants and toddlers.
Toddler-Guidance Learning: Impacts of Critical Period on Multimodal AI Agents
Junseok Park,
Kwanyoung Park,
Hyunseok Oh,
Ganghun Lee,
Minsu Lee,
Youngki Lee,
Byoung-Tak Zhang
ICMI, 2021   (Oral Presentation)
paper /
code
We investigate the emergence of critical periods in multimodal reinforcement learning.
We show that performance on RL and transfer learning tasks depends on what guidance is given to the agent, and when.
Learning task-agnostic representation via toddler-inspired learning
Kwanyoung Park,
Junseok Park,
Hyunseok Oh,
Byoung-Tak Zhang,
Youngki Lee
NeurIPS Workshop, 2020
paper /
code
A toddler's learning procedure consists of interactive experiences, which yield task-agnostic representations.
Inspired by this procedure, we pretrain an agent on a visual navigation task and show that the representations obtained during RL transfer to various vision tasks.
Website template from Jon Barron.