# Satori

**Repository Path**: cchencode/satori

## Basic Information

- **Project Name**: Satori
- **Description**: Playing mahjong with Transformer + Reinforcement Learning
- **Primary Language**: Python
- **License**: MulanPSL-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 6
- **Created**: 2024-03-25
- **Last Updated**: 2024-03-25

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Satori

A Riichi Mahjong agent trained with a Transformer model and reinforcement learning.

> This project is at an early stage. PRs and discussion are welcome. (The weights and dataset have not been released.)
>
> To do:
>
> 1. `Misc/libc/evaluate.cc` does not yet account for every yaku when computing hand scores; the missing ones need to be added.
> 2. Four models need to be trained: discard, chi, peng, reach. The last three do not have enough data, and the discard model has not been trained for enough epochs. More hardware and time are needed; contributions of CPU and GPU resources are welcome.

## How to use Satori

1. In the `Misc/libc` directory, run `make_lib.sh` to build the C++ library.
2. Run `tests/run_test.py`. For example, for the hand `6678m3445p4567s44z` with `dora=3z`, run the following from `tests/`:

   `python run_test.py --use_rl_model=1 --ht=6678m3445p4567s44z --di=2z --sw=1z --rw=1z`

   This gives the policy $\pi(s)$ for that state.

## How to generate your dataset

1. Download the log archives from the [tenhou website](https://tenhou.net/sc/raw/) and place them in the `SL/logs/` folder.
2. Run `SL/spider.py`. It parses the log files and stores `.json` files containing the game records in the `SL/games/` folder.
3. Run `SL/game_loader.py`. It parses the `.json` files and writes the data to `.pkl` files. Once the size exceeds 4.2 GB, the data is automatically packed into a `.zip` and placed in the `SL/dataset/` folder.
4. You can then run `SL/SL.py` or `SL/SL_ddp.py` to start training.

## Supervised Learning (SL)

[go to SL/README.md](./SL/README.md)

## Reinforcement Learning (RL)

[go to RL/README.md](./RL/README.md)

## References

1. Building a Computer Mahjong Player via Deep Convolutional Neural Networks
2. Suphx: Mastering Mahjong with Deep Reinforcement Learning
3. Mathematical Foundations of Reinforcement Learning, Shiyu Zhao
4. Proximal Policy Optimization Algorithms
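
## Sketch: reading the tile notation

The hand string `6678m3445p4567s44z` passed to `run_test.py` uses the common compact notation in which a run of ranks is followed by a suit letter (`m`, `p`, `s`, or `z` for honors). The example also states `dora=3z` while passing `--di=2z`, which is consistent with `--di` being the dora indicator, although that reading is an assumption. The sketch below is not code from this repository: `parse_hand` and `dora_from_indicator` are hypothetical helpers that illustrate the notation, and red fives (sometimes written as `0`) are not handled.

```python
SUITS = "mpsz"  # m/p/s are the number suits; z = honors (1z-4z winds, 5z-7z dragons)


def parse_hand(text):
    """Expand compact notation, e.g. '6678m44z' -> ['6m', '6m', '7m', '8m', '4z', '4z']."""
    tiles, pending = [], []
    for ch in text:
        if ch.isdigit():
            pending.append(ch)  # ranks wait until their suit letter arrives
        elif ch in SUITS:
            if not pending:
                raise ValueError(f"suit {ch!r} has no ranks before it")
            tiles.extend(rank + ch for rank in pending)
            pending = []
        else:
            raise ValueError(f"unexpected character {ch!r}")
    if pending:
        raise ValueError("trailing ranks without a suit letter")
    return tiles


def dora_from_indicator(indicator):
    """Return the dora tile for a given indicator tile (standard Riichi rule)."""
    rank, suit = int(indicator[0]), indicator[1]
    if suit == "z":
        if rank <= 4:
            nxt = rank % 4 + 1        # winds wrap around: 1 -> 2 -> 3 -> 4 -> 1
        else:
            nxt = (rank - 4) % 3 + 5  # dragons wrap around: 5 -> 6 -> 7 -> 5
    else:
        nxt = rank % 9 + 1            # number tiles wrap around: 9 -> 1
    return f"{nxt}{suit}"


hand = parse_hand("6678m3445p4567s44z")
dora = dora_from_indicator("2z")
```

For the hand above, `parse_hand` returns 14 tiles, and the indicator `2z` maps to the dora `3z`, matching the example command.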
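
## Sketch: size-capped dataset shards

The dataset step describes `SL/game_loader.py` writing `.pkl` data and packing it into a `.zip` once a size threshold (4.2 GB) is exceeded. The sketch below shows one way such a rollover could work. It is not the repository's implementation: `write_shards`, `read_shard`, the shard naming scheme, and the `max_bytes` parameter are all hypothetical.

```python
import os
import pickle
import zipfile


def _zip_and_remove(pkl_path):
    """Pack one .pkl file into a .zip next to it and delete the original."""
    zip_path = pkl_path[: -len(".pkl")] + ".zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(pkl_path, arcname=os.path.basename(pkl_path))
    os.remove(pkl_path)
    return zip_path


def write_shards(samples, out_dir, max_bytes, prefix="shard"):
    """Pickle samples one by one; start a new zipped shard once max_bytes is exceeded."""
    os.makedirs(out_dir, exist_ok=True)
    zips = []
    fh, pkl_path = None, None
    for sample in samples:
        if fh is None:
            pkl_path = os.path.join(out_dir, f"{prefix}_{len(zips):04d}.pkl")
            fh = open(pkl_path, "wb")
        pickle.dump(sample, fh)
        if fh.tell() > max_bytes:  # threshold crossed: close and pack this shard
            fh.close()
            zips.append(_zip_and_remove(pkl_path))
            fh = None
    if fh is not None:  # pack the last, partially filled shard
        fh.close()
        zips.append(_zip_and_remove(pkl_path))
    return zips


def read_shard(zip_path):
    """Yield every pickled sample stored in one zipped shard."""
    with zipfile.ZipFile(zip_path) as zf:
        with zf.open(zf.namelist()[0]) as fh:
            while True:
                try:
                    yield pickle.load(fh)
                except EOFError:  # no more pickled objects in the stream
                    return
```

A call such as `write_shards(samples, "SL/dataset", max_bytes=int(4.2e9))` would mirror the threshold mentioned above. Each shard can exceed the cap by at most one sample, because the size check happens after each write.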