# CycleGAN-VC2

**Repository Path**: segmentationFaults/CycleGAN-VC2

## Basic Information

- **Project Name**: CycleGAN-VC2
- **Description**: No description available
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 1
- **Created**: 2021-05-10
- **Last Updated**: 2022-07-06

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# **CycleGAN-VC2-PyTorch**

[GitHub](https://github.com/jackaduma/CycleGAN-VC2) [Donate via PayPal](https://paypal.me/jackaduma?locale.x=zh_XC)

[**中文说明**](./README.zh-CN.md) | [**English**](./README.md)

------

This code is a **PyTorch** implementation of the paper [CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion](https://arxiv.org/abs/1904.04631), a strong piece of work on **voice conversion / voice cloning**.

- [x] Dataset
  - [ ] VC
  - [x] Chinese male speakers (S0913 from [AISHELL-Speech](https://openslr.org/33/) & [GaoXiaoSong: a Chinese star](https://en.wikipedia.org/wiki/Gao_Xiaosong))
- [x] Usage
  - [x] Training
  - [x] Example
- [ ] Demo
- [x] Reference

------

## **Update**

**2020.11.17**: fixed issues: re-implemented the second-step adversarial loss.

**2020.08.27**: added the second-step adversarial loss, contributed by [Jeffery-zhang-nfls](https://github.com/Jeffery-zhang-nfls).

## **CycleGAN-VC2**

### [**Project Page**](http://www.kecl.ntt.co.jp/people/kaneko.takuhiro/projects/cyclegan-vc2/index.html)

To advance research on non-parallel VC, the paper proposes CycleGAN-VC2, an improved version of CycleGAN-VC that incorporates three new techniques: an improved objective (two-step adversarial losses), an improved generator (2-1-2D CNN), and an improved discriminator (PatchGAN).

------

**This repository contains:**

1. [model code](model_tf.py) implementing the paper's networks.
2. [audio preprocessing script](preprocess_training.py) you can use to build a cache from the [training data](data).
3. [training script](train.py) to train the model.
4. [examples of voice conversion](converted_sound/): converted results after training.
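The two-step adversarial losses mentioned above apply a second adversarial loss to the cyclically reconstructed features, not only to the converted ones, to counteract the over-smoothing caused by the cycle-consistency loss. Below is a minimal PyTorch sketch of that idea; the module names (`G_A2B`, `G_B2A`, `D_B`, `D_A2`) and the toy shapes are illustrative assumptions, not this repository's actual classes.

```python
import torch
import torch.nn as nn

mse = nn.MSELoss()  # LSGAN-style least-squares adversarial objective


def two_step_adversarial_loss(real_A, G_A2B, G_B2A, D_B, D_A2):
    """Generator-side adversarial terms for the A -> B direction.

    Step 1: the converted features fake_B must fool discriminator D_B.
    Step 2: the cyclically reconstructed features cycle_A must fool a
    second discriminator D_A2, which mitigates the over-smoothing
    introduced by the cycle-consistency loss.
    """
    fake_B = G_A2B(real_A)     # forward conversion A -> B
    cycle_A = G_B2A(fake_B)    # reconstruction B -> A

    d_fake = D_B(fake_B)
    d_cycle = D_A2(cycle_A)
    loss_adv_1 = mse(d_fake, torch.ones_like(d_fake))
    loss_adv_2 = mse(d_cycle, torch.ones_like(d_cycle))
    return loss_adv_1 + loss_adv_2


# Toy usage with stand-in networks (the actual repo uses 2-1-2D CNN
# generators and PatchGAN discriminators, not these Identity/Linear stubs).
if __name__ == "__main__":
    real_A = torch.randn(1, 35, 128)  # (batch, MCEP dims, frames); illustrative shape
    G_A2B = G_B2A = nn.Identity()
    D_B = D_A2 = nn.Sequential(nn.Flatten(), nn.Linear(35 * 128, 1))
    print(two_step_adversarial_loss(real_A, G_A2B, G_B2A, D_B, D_A2))
```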
------

## **Table of Contents**

- [**CycleGAN-VC2-PyTorch**](#cyclegan-vc2-pytorch)
  - [**CycleGAN-VC2**](#cyclegan-vc2)
    - [**Project Page**](#project-page)
  - [**Table of Contents**](#table-of-contents)
  - [**Requirement**](#requirement)
  - [**Usage**](#usage)
    - [**preprocess**](#preprocess)
    - [**train**](#train)
  - [**Pretrained**](#pretrained)
  - [**Demo**](#demo)
  - [**Reference**](#reference)
  - [**Donation**](#donation)
  - [**License**](#license)

------

## **Requirement**

```bash
pip install -r requirements.txt
```

## **Usage**

### **preprocess**

```bash
python preprocess_training.py
```

is short for

```bash
python preprocess_training.py --train_A_dir ./data/S0913/ --train_B_dir ./data/gaoxiaosong/ --cache_folder ./cache/
```

(A short sketch for inspecting the resulting cache files is included at the end of this README.)

### **train**

```bash
python train.py
```

is short for

```bash
python train.py --logf0s_normalization ./cache/logf0s_normalization.npz --mcep_normalization ./cache/mcep_normalization.npz --coded_sps_A_norm ./cache/coded_sps_A_norm.pickle --coded_sps_B_norm ./cache/coded_sps_B_norm.pickle --model_checkpoint ./model_checkpoint/ --resume_training_at ./model_checkpoint/_CycleGAN_CheckPoint --validation_A_dir ./data/S0913/ --output_A_dir ./converted_sound/S0913 --validation_B_dir ./data/gaoxiaosong/ --output_B_dir ./converted_sound/gaoxiaosong/
```

------

## **Pretrained**

A pretrained model that converts between S0913 and GaoXiaoSong can be downloaded from [Google Drive](https://drive.google.com/file/d/1iamizL98NWIPw4pw0nF-7b6eoBJrxEfj/view?usp=sharing) (735 MB).

------

## **Demo**

Samples:

**Reference speaker A:** [S0913 (./data/S0913/BAC009S0913W0351.wav)](https://drive.google.com/file/d/14zU1mI8QtoBwb8cHkNdZiPmXI6Mj6pVW/view?usp=sharing)

**Reference speaker B:** [GaoXiaoSong (./data/gaoxiaosong/gaoxiaosong_1.wav)](https://drive.google.com/file/d/1s0ip6JwnWmYoWFcEQBwVIIdHJSqPThR3/view?usp=sharing)

**Speaker A's speech converted to speaker B's voice:** [Converted from S0913 to GaoXiaoSong (./converted_sound/S0913/BAC009S0913W0351.wav)](https://drive.google.com/file/d/1S4vSNGM-T0RTo_aclxRgIPkUJ7NEqmjU/view?usp=sharing)

------

## **Reference**

1. **CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion**. [Paper](https://arxiv.org/abs/1904.04631), [Project](http://www.kecl.ntt.co.jp/people/kaneko.takuhiro/projects/cyclegan-vc2/index.html)
2. **Parallel-Data-Free Voice Conversion Using Cycle-Consistent Adversarial Networks**. [Paper](https://arxiv.org/abs/1711.11293), [Project](http://www.kecl.ntt.co.jp/people/kaneko.takuhiro/projects/cyclegan-vc/)
3. **Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks**. [Paper](https://arxiv.org/abs/1703.10593), [Project](https://junyanz.github.io/CycleGAN/), [Code](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix)
4. **Image-to-Image Translation with Conditional Adversarial Nets**. [Paper](https://arxiv.org/abs/1611.07004), [Project](https://phillipi.github.io/pix2pix/), [Code](https://github.com/phillipi/pix2pix)

------

## Donation

If this project helps you save development time, you can buy me a cup of coffee :)

AliPay (支付宝)
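As a small aside to the **preprocess**/**train** steps above, the cache files passed to `train.py` can be inspected with plain NumPy and pickle. This is a rough sketch assuming the default `./cache/` paths from the train command; the keys and shapes inside the files are whatever `preprocess_training.py` stored, so the script only prints what it finds.

```python
import pickle
import numpy as np

# Inspect the cache produced by preprocess_training.py.
# Paths are the defaults used by the train command in this README;
# the stored keys/shapes are not documented here, so just print them.

for npz_path in ("./cache/logf0s_normalization.npz", "./cache/mcep_normalization.npz"):
    archive = np.load(npz_path)
    print(npz_path, "->", archive.files)  # names of the arrays stored in the archive

for pkl_path in ("./cache/coded_sps_A_norm.pickle", "./cache/coded_sps_B_norm.pickle"):
    with open(pkl_path, "rb") as f:
        coded_sps = pickle.load(f)
    size = len(coded_sps) if hasattr(coded_sps, "__len__") else "unknown"
    print(pkl_path, "->", type(coded_sps).__name__, size)
```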