# FlowRL **Repository Path**: ByteDance/FlowRL ## Basic Information - **Project Name**: FlowRL - **Description**: Official implementation of "Flow Based Policy for Online Reinforcement Learning" - **Primary Language**: Unknown - **License**: Apache-2.0 - **Default Branch**: main - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2025-07-01 - **Last Updated**: 2025-09-13 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README
👋 Hi, everyone!
We are ByteDance Seed team.

You can get to know us better through the following channels👇

![seed logo](https://github.com/user-attachments/assets/c42e675e-497c-4508-8bb9-093ad4d1f216) # Flow-based Polciy for Online Reinforcement Learning


We are delighted to introduce FlowRL. It is a new approach for online reinforcement learning that integrates flow-based policy representation with Wasserstein-2-regularized optimization. This creates a promising framework that integrates generative policies with reinforcement learning. ## News [2025/06/10]🔥We release the PyTorch version of the code. ## Introduction FlowRL is an Actor-Critic framework that leverages flow-based policy representation and integrates Wasserstein-2-regularized optimization. By implicitly constraining the current policy to the optimal behavioral policy via W2 distance, FlowRL achieves superior performance on challenging benchmarks like the DM_Control (Dog domain, Humanoid domain) and Humanoid_Bench. ## Getting Started 1. **Setup Conda Environment:** Create an environment with ```bash conda create -n flowrl python=3.11 ``` 2. **Clone this Repository:** ```bash git clone https://github.com/bytedance/FlowRL.git cd FlowRL ``` 3. **Install FlowRL Dependencies:** ```bash pip install -r requirements.txt ``` 4. **Training Examples:** - Run a single training instance: ```bash python3 main.py --domain dog --task run ``` - Run parallel training: ```bash bash scripts/train_parallel.sh ``` ## License This project is licensed under the Apache License 2.0. See the LICENSE file for details. ## TODO - [ ] Release JAX version source code ## Citation If you find FlowRL useful for your research and applications, please consider giving us a star ⭐ or cite us using: ```bibtex @article{lv2025flow, title={Flow-Based Policy for Online Reinforcement Learning}, author={Lv, Lei and Li, Yunfei and Luo, Yu and Sun, Fuchun and Kong, Tao and Xu, Jiafeng and Ma, Xiao}, journal={arXiv preprint arXiv:2506.12811}, year={2025} } ``` ## About [ByteDance Seed Team](https://seed.bytedance.com/) Founded in 2023, ByteDance Seed Team is dedicated to crafting the industry's most advanced AI foundation models. The team aspires to become a world-class research team and make significant contributions to the advancement of science and society.