Scalable Offline Model-Based RL with
Action Chunks

UC Berkeley, Yonsei University

Abstract


Challenges

  • Long-horizon tasks require long-horizon rollouts, because the bias introduced by bootstrapping in value updates compounds over many steps.
  • This creates a unique challenge in model-based RL, where prediction errors from the learned world model also compound over the rollout.
  • Therefore, we need a way to perform long-horizon rollouts while keeping the world model's prediction errors small.
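The compounding-error problem above can be seen in a toy setting. The sketch below is our own illustration (not the paper's setup): we roll out a slightly mis-specified linear dynamics model alongside the true dynamics and measure how the gap between the two trajectories grows with rollout horizon.

```python
import numpy as np

rng = np.random.default_rng(0)

A_true = np.array([[1.0, 0.1], [0.0, 1.0]])            # true dynamics: x' = A x
A_model = A_true + 0.01 * rng.standard_normal((2, 2))  # learned model with a small error

def rollout_gap(horizon: int) -> float:
    """L2 distance between true and model rollouts from the same start state."""
    x_true = np.array([1.0, 1.0])
    x_model = x_true.copy()
    for _ in range(horizon):
        x_true = A_true @ x_true      # ground-truth trajectory
        x_model = A_model @ x_model   # model trajectory drifts away step by step
    return float(np.linalg.norm(x_true - x_model))

gaps = [rollout_gap(h) for h in (1, 10, 50)]  # gap grows with horizon
```

Even with a per-step error of ~1%, the gap grows super-linearly with horizon, which is why naive long model rollouts are unreliable.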

Model-based RL with Action Chunking (MAC)


Action chunking
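Action chunking, in its standard form, has the policy emit a sequence of H consecutive actions per query, executed open-loop; this shortens the effective decision horizon from T steps to T/H decisions. The sketch below uses stand-in components (a random chunk policy and a toy environment), not the paper's implementation; `H = 4` is an illustrative hyperparameter.

```python
import numpy as np

H = 4  # chunk length (illustrative, not the paper's value)

def chunk_policy(obs: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Stand-in policy: returns H consecutive actions in a single query."""
    return rng.uniform(-1.0, 1.0, size=(H, 2))

def toy_env_step(obs: np.ndarray, action: np.ndarray) -> np.ndarray:
    """Stand-in dynamics: drift the state by the action."""
    return obs + 0.1 * action

def rollout(obs: np.ndarray, steps: int, rng: np.random.Generator):
    """Run `steps` environment steps with only steps // H policy queries."""
    queries = 0
    for _ in range(steps // H):
        chunk = chunk_policy(obs, rng)   # one decision ...
        queries += 1
        for a in chunk:                  # ... executed open-loop for H steps
            obs = toy_env_step(obs, a)
    return obs, queries

final_obs, queries = rollout(np.zeros(2), steps=100, rng=np.random.default_rng(0))
```

With H = 4, a 100-step rollout needs only 25 policy queries, so bootstrapped value updates and model rollouts operate over a 4x shorter effective horizon.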



Flow rejection sampling
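The paper's flow rejection sampling presumably couples a flow-based generative policy with value-guided selection. The following is only a generic best-of-N rejection sketch with stand-in components (a random chunk sampler in place of a flow policy, and a toy critic), not the paper's actual algorithm; `N` is an assumed hyperparameter.

```python
import numpy as np

N = 16  # number of candidate chunks per decision (assumed hyperparameter)

def sample_chunk(rng: np.random.Generator) -> np.ndarray:
    """Stand-in for drawing one action chunk from a flow-based policy."""
    return rng.uniform(-1.0, 1.0, size=(4, 2))

def q_value(obs, chunk: np.ndarray) -> float:
    """Stand-in critic; here it simply prefers small-magnitude actions."""
    return -float(np.sum(chunk ** 2))

def select_chunk(obs, rng: np.random.Generator) -> np.ndarray:
    """Best-of-N rejection: sample candidate chunks, keep the highest-value one."""
    candidates = [sample_chunk(rng) for _ in range(N)]
    scores = [q_value(obs, c) for c in candidates]
    return candidates[int(np.argmax(scores))]
```

The design intuition: the generative policy keeps samples close to the data distribution (avoiding out-of-distribution action chunks), while the critic steers the final choice toward high-value behavior.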



Experiments


Long-horizon tasks with 100M-scale dataset


Reward-based benchmarks with 1M-scale dataset


BibTeX


@article{park2025_MAC,
  title={Scalable Offline Model-Based RL with Action Chunking},
  author={Park, Kwanyoung and Park, Seohong and Lee, Youngwoon and Levine, Sergey},
  journal={arXiv preprint},
  year={2025}
}