# HelloMeme
**Repository Path**: mirrors_sudoconf/HelloMeme
## Basic Information
- **Project Name**: HelloMeme
- **Description**: The official HelloMeme GitHub site
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2024-12-14
- **Last Updated**: 2025-10-05
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
HelloMeme: Integrating Spatial Knitting Attentions to Embed High-Level and Fidelity-Rich Conditions in Diffusion Models
HelloVision | HelloGroup Inc.
## New Features/Updates
- [`ExperimentsOnSKAttentions`](https://github.com/HelloVision/ExperimentsOnSKAttentions) for ablation experiments.
- `12/12/2024` Added HelloMeme V2 (code synchronized from the [`ComfyUI`](https://github.com/HelloVision/ComfyUI_HelloMeme) repo).
- `11/14/2024` Added the `HMControlNet2` module.
- `11/12/2024` Added a newly fine-tuned version of [`Animatediff`](https://huggingface.co/songkey/hm_animatediff_frame12) with a patch size of 12, which uses less VRAM (tested on a 2080 Ti).
- `11/5/2024` Added the [`ComfyUI`](https://github.com/HelloVision/ComfyUI_HelloMeme) interface for HelloMeme.
- `11/1/2024` Released the code for the core functionalities.
## Introduction
This repository contains the official code implementation of the paper [`HelloMeme`](https://arxiv.org/pdf/2410.22901). Any updates related to the code or models from the paper will be posted here. The code for the ablation experiments discussed in the paper will be added to the [`ExperimentsOnSKAttentions`](https://github.com/HelloVision/ExperimentsOnSKAttentions) section. Additionally, we plan to release a `ComfyUI` interface for HelloMeme, with updates posted here as well.
## Getting Started
### 1. Create a Conda Environment
```bash
conda create -n hellomeme python=3.10.11
conda activate hellomeme
```
### 2. Install PyTorch and FFmpeg
To install the latest version of PyTorch, please refer to the official [PyTorch](https://pytorch.org/get-started/locally/) website for detailed installation instructions. Additionally, the code will invoke the system's ffmpeg command for video and audio editing, so the runtime environment must have ffmpeg pre-installed. For installation guidance, please refer to the official [FFmpeg](https://ffmpeg.org/) website.
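For reference, a typical setup might look like the following (the exact PyTorch command depends on your platform and CUDA version, so use the selector on the PyTorch site; the second command simply checks that ffmpeg is on your `PATH`):
```bash
# Example only: pick the command that matches your CUDA version from pytorch.org.
pip install torch torchvision torchaudio

# Verify that ffmpeg is available (install it via your package manager if this fails).
ffmpeg -version
```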
### 3. Install dependencies
```bash
pip install diffusers transformers einops scipy opencv-python tqdm pillow onnxruntime onnx safetensors accelerate peft
```
> [!IMPORTANT]
>
> Note the version of diffusers required: frequent updates to diffusers may lead to dependency conflicts. We will periodically check the repo's compatibility with the latest diffusers version. The currently tested and supported version is **diffusers==0.31.0**.
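If you run into dependency conflicts, pinning the tested release is one option:
```bash
pip install diffusers==0.31.0
```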
### 4. Clone the repository
```bash
git clone https://github.com/HelloVision/HelloMeme
cd HelloMeme
```
### 5. Run the code
```bash
python inference_image.py # for image generation
python inference_video.py # for video generation
```
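If your machine has multiple GPUs, you can pin the scripts to a single device with the standard CUDA environment variable (a general PyTorch/CUDA convention, not a flag of these scripts):
```bash
CUDA_VISIBLE_DEVICES=0 python inference_video.py
```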
### 6. Install for Gradio App
We recommend setting up the environment with conda.
```bash
pip install gradio
pip install "imageio[ffmpeg]"
python app.py
```
When the app is run for the first time, all required models will be downloaded automatically.
The longer the driving video, the more VRAM is required.
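Once started, Gradio prints a local URL in the terminal (typically `http://127.0.0.1:7860`); open it in a browser to use the interface.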
## Examples
### Image Generation
The input for the image generation script `inference_image.py` consists of a reference image and a drive image, as shown in the figure below:
*(Figure: reference image | drive image)*
The output of the image generation script is shown below:
### Video Generation
The input for the video generation script `inference_video.py` consists of a reference image and a drive video, as shown in the figure below:
*(Figure: reference image | drive video)*
The output of the video generation script is shown below:
> [!NOTE]
>
> If the face in the driving video has significant movement (such as evident camera motion), it is recommended to set the `trans_ratio` parameter to 0 to prevent distorted outputs:
>
> `inference_video(engines, ref_img_path, drive_video_path, save_path, trans_ratio=0.0)`
## Pretrained Models
Our models are all hosted on [Hugging Face](https://huggingface.co/songkey), and the startup script will download them automatically. The specific model information is as follows:
| Model | Size | Info |
|-------|------|------|
| `songkey/hm_reference` | 312M | Weights of the ReferenceAdapter module |
| `songkey/hm_control` | 149M | Weights of the HMControlNet module |
| `songkey/hm_animatediff` | 835M | Weights of the fine-tuned Animatediff (patch size 16) |
| `songkey/hm_animatediff_frame12` | 835M | Weights of the fine-tuned Animatediff (patch size 12) |
| `hello_3dmm.onnx` | 311M | Face RT extractor |
| `hello_arkit_blendshape.onnx` | 9.11M | Extracts ARKit blendshape parameters |
| `hello_face_det.onnx` | 317K | Face detector |
| `hello_face_landmark.onnx` | 2.87M | Face landmarks (222 points) |
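If you prefer to pre-fetch the diffusion-related weights instead of waiting for the automatic download, something like the following should work, assuming the `songkey/...` entries above are Hugging Face repository IDs and that `huggingface_hub` (which provides `huggingface-cli`) is installed as a dependency of diffusers:
```bash
# Optional: pre-download the Hugging Face repositories listed in the table above.
huggingface-cli download songkey/hm_reference
huggingface-cli download songkey/hm_control
huggingface-cli download songkey/hm_animatediff
huggingface-cli download songkey/hm_animatediff_frame12
```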
Our pipeline also supports loading stylized base models (in `safetensors` format). For video generation tasks, customized portrait models such as [**Realistic Vision V6.0 B1**](https://civitai.com/models/4201/realistic-vision-v60-b1) can produce better results. Download checkpoints into `pretrained_models/` and LoRAs into `pretrained_models/loras/`.
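For example, a possible layout (the file names here are purely illustrative):
```bash
# Illustrative only: the actual checkpoint/LoRA file names depend on what you download.
mkdir -p pretrained_models/loras
mv ~/Downloads/realisticVisionV60B1.safetensors pretrained_models/
mv ~/Downloads/portrait_style.safetensors pretrained_models/loras/
```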
## Acknowledgements
Thanks to Hugging Face for providing [diffusers](https://huggingface.co/docs/diffusers), which has greatly enhanced development efficiency in diffusion-related work. We also drew considerable inspiration from [MagicAnimate](https://github.com/magic-research/magic-animate) and [EMO](https://github.com/HumanAIGC/EMO), and [Animatediff](https://github.com/guoyww/AnimateDiff) allowed us to implement the video version at a very low cost. Finally, we thank our colleagues **Shengjie Wu** and **Zemin An**, whose foundational modules played a significant role in this work.
## Citation
```bibtex
@misc{zhang2024hellomemeintegratingspatialknitting,
title={HelloMeme: Integrating Spatial Knitting Attentions to Embed High-Level and Fidelity-Rich Conditions in Diffusion Models},
author={Shengkai Zhang and Nianhong Jiao and Tian Li and Chaojie Yang and Chenhui Xue and Boya Niu and Jun Gao},
year={2024},
eprint={2410.22901},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2410.22901},
}
```
## Contact
**Shengkai Zhang** (songkey@pku.edu.cn)