Kwanyoung Park

I'm an undergraduate at Seoul National University, actively engaged in research on Reinforcement Learning (RL). I'm majoring in Computer Science and Engineering, with a minor in Mathematical Sciences. Currently, I am tackling offline RL problems at Yonsei University's RLLAB under the guidance of Youngwoon Lee.

Previously, I did a research internship at the Human-centered Computer Systems Lab at Seoul National University with Youngki Lee. During the internship, my primary focus was bridging the gap between machine learning processes and human-like learning processes. I also contributed to the development of NeRF models tailored for on-device applications.

Email  /  CV  /  Google Scholar  /  Github  /  Twitter


Research

The goal of my research is to bridge the gap between machine learning processes and human-like learning processes, focusing on improving data efficiency of reinforcement learning agents.

So far, I have explored the differences and similarities between reinforcement learning agents and the cognition of human learners (toddlers). I have also explored how to evaluate human-like artificial intelligence from the perspective of human cognitive development. Currently, I'm focusing on how to utilize offline data to accelerate the learning process of reinforcement learning agents.

Publications and Preprints

(* denotes equal contribution)

Tackling Long-Horizon Tasks with Model-based Offline Reinforcement Learning
Kwanyoung Park, Youngwoon Lee
Preprint, 2024
paper / code / website / twitter

We introduce a novel model-based offline RL method, Lower Expectile Q-learning (LEQ), which enhances long-horizon task performance by mitigating the high bias in model-based value estimation via expectile regression of λ-returns. Our empirical results show that LEQ significantly outperforms previous model-based offline RL methods on long-horizon tasks, such as the D4RL AntMaze tasks, matching or surpassing the performance of model-free approaches.
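
For the curious, here is a minimal sketch of the core idea: regress Q-values toward λ-returns computed on model rollouts using an expectile loss with τ < 0.5, so the critic fits a lower expectile of the return and overestimation from model errors is suppressed. All function names and shapes below are illustrative, not the actual LEQ implementation.

```python
# Illustrative sketch of lower-expectile Q-learning on lambda-returns.
# Names and shapes are assumptions for exposition, not the LEQ codebase.
import torch

def lambda_returns(rewards, values, gamma=0.99, lam=0.95):
    """rewards: (H,); values: (H+1,) estimates V(s_0..s_H) along a model rollout.
    Backward recursion: G_t = r_t + gamma * ((1-lam) * V(s_{t+1}) + lam * G_{t+1})."""
    returns = torch.zeros_like(rewards)
    next_return = values[-1]  # bootstrap from the final value estimate
    for t in reversed(range(rewards.shape[0])):
        next_return = rewards[t] + gamma * ((1 - lam) * values[t + 1] + lam * next_return)
        returns[t] = next_return
    return returns

def lower_expectile_loss(q_pred, target, tau=0.1):
    """Asymmetric squared loss |tau - 1{u<0}| * u^2. With tau < 0.5,
    overestimation (target < Q) is penalized more than underestimation,
    so Q fits a lower expectile of the target distribution."""
    u = target - q_pred
    weight = torch.abs(tau - (u < 0).float())
    return (weight * u.pow(2)).mean()
```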

TLDR: Unsupervised Goal-Conditioned RL via Temporal Distance-Aware Representations
Junik Bae, Kwanyoung Park, Youngwoon Lee
CoRL, 2024
paper / code / website

We propose a novel unsupervised goal-conditioned RL method, TLDR, which leverages TemporaL Distance-aware Representations. Our approach selects faraway goals to initiate exploration and uses temporal distance to compute intrinsic exploration rewards and goal-reaching rewards. Our experimental results in robotic locomotion and manipulation environments demonstrate that our method significantly outperforms previous unsupervised GCRL methods in achieving a wide variety of states.
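
As a rough sketch of how a temporal-distance representation can drive both goal selection and rewards (all names below are hypothetical placeholders, not our actual codebase): if phi is trained so that ||phi(s) - phi(g)|| approximates the number of environment steps from s to g, then faraway-goal selection and distance-based rewards follow directly.

```python
# Rough sketch: exploration with a temporal-distance representation phi,
# where ||phi(s) - phi(g)|| approximates steps between s and g.
import numpy as np

def select_goal(phi, visited_states, candidate_goals):
    """Choose the candidate goal that is temporally farthest from visited states."""
    z_visited = phi(visited_states)   # (N, d) embeddings of visited states
    z_goals = phi(candidate_goals)    # (M, d) embeddings of candidate goals
    dists = np.linalg.norm(z_goals[:, None] - z_visited[None], axis=-1)  # (M, N)
    # Farthest from its nearest visited state, i.e., far from *every* visited state.
    return candidate_goals[dists.min(axis=1).argmax()]

def goal_reaching_reward(phi, s, s_next, g):
    """Reward progress: the decrease in estimated temporal distance to the goal."""
    z_s, z_next, z_g = phi(s), phi(s_next), phi(g)
    return np.linalg.norm(z_s - z_g) - np.linalg.norm(z_next - z_g)
```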

VECA: A New Benchmark and Toolkit for General Cognitive Development
Kwanyoung Park*, Hyunseok Oh*, Youngki Lee
AAAI, 2022   (Oral Presentation, Acceptance Rate: 384/9,251 = 4.15%)
paper / code

We present VECA (Virtual Environment for Cognitive Assessment), which consists of two main components: (i) the first benchmark to assess the overall cognitive development of an AI agent, and (ii) a novel toolkit to generate diverse and distinct cognitive tasks. The VECA benchmark virtually implements the cognitive scale of the Bayley Scales of Infant and Toddler Development-IV (Bayley-4), the gold-standard developmental assessment for human infants and toddlers.

Toddler-Guidance Learning: Impacts of Critical Period on Multimodal AI Agents
Junseok Park, Kwanyoung Park, Hyunseok Oh, Ganghun Lee, Minsu Lee, Youngki Lee, Byoung-Tak Zhang
ICMI, 2021   (Oral Presentation)
paper / code

We investigate the emergence of critical periods in multimodal reinforcement learning. We show that performance on RL tasks and transfer learning depends on what guidance is given to the agent and when it is given.

Learning task-agnostic representation via toddler-inspired learning
Kwanyoung Park, Junseok Park, Hyunseok Oh, Youngki Lee, Byoung-Tak Zhang
NeurIPS Workshop, 2020
paper / code

A toddler's learning procedure consists of interactive experiences, resulting in task-agnostic representations. Inspired by these procedures, we pretrain an agent on a visual navigation task and show that the representations obtained during the RL task extend to various vision tasks.

Website template from Jon Barron.