# UI-TARS-desktop
**Repository Path**: kingleon99/UI-TARS-desktop
## Basic Information
- **Project Name**: UI-TARS-desktop
- **Description**: A GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 2
- **Created**: 2025-04-28
- **Last Updated**: 2025-04-28
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
> [!IMPORTANT]
> **\[2025-03-18\]** We released a **technical preview** version of a new desktop app - [Agent TARS](./apps/agent-tars/README.md), a multimodal AI agent that leverages browser operations by visually interpreting web pages and seamlessly integrating with command lines and file systems.
# UI-TARS Desktop
UI-TARS Desktop is a GUI Agent application based on [UI-TARS (Vision-Language Model)](https://github.com/bytedance/UI-TARS) that allows you to control your computer using natural language.
Paper | Hugging Face Models | Discord | ModelScope  
Desktop Application | Midscene (use in browser)
## Showcases
| Instruction |
| :---: |
| Please help me open the autosave feature of VS Code and delay AutoSave operations for 500 milliseconds in the VS Code setting. |
| Could you help me check the latest open issue of the UI-TARS-Desktop project on GitHub? |
## News
- **\[2025-04-17\]** - We're thrilled to announce the release of the new UI-TARS Desktop application v0.1.0, featuring a redesigned Agent UI. The application improves the computer-use experience, introduces new browser operation features, and supports [the advanced UI-TARS-1.5 model](https://seed-tars.com/1.5) for improved performance and precise control.
- **\[2025-02-20\]** - Introduced the [UI-TARS SDK](./docs/sdk.md), a powerful cross-platform toolkit for building GUI automation agents.
- **\[2025-01-23\]** - We updated the **[Cloud Deployment](./docs/deployment.md#cloud-deployment)** section in the Chinese version, [GUI Model Deployment Tutorial](https://bytedance.sg.larkoffice.com/docx/TCcudYwyIox5vyxiSDLlgIsTgWf#U94rdCxzBoJMLex38NPlHL21gNb), with new information related to the ModelScope platform. You can now use the ModelScope platform for deployment.
## Features
- Natural language control powered by a Vision-Language Model
- Screenshot and visual recognition support
- Precise mouse and keyboard control
- Cross-platform support (Windows/macOS/Browser)
- Real-time feedback and status display
- Private and secure: fully local processing
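
Taken together, the features above describe a perceive, decide, act loop: capture a screenshot, ask the vision-language model for the next action given the instruction, carry it out with the mouse or keyboard, and report status. The following is a minimal Python sketch of that loop under stated assumptions; it is not the project's implementation. `Action`, `run_agent`, and the injected `capture`, `predict`, and `execute` callables are hypothetical names introduced for illustration, and the stubs stand in for a real screenshot tool, model endpoint, and input driver.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    """One GUI step proposed by the model (hypothetical schema)."""
    kind: str                    # e.g. "click", "type", or "finished"
    x: Optional[int] = None      # screen coordinates for pointer actions
    y: Optional[int] = None
    text: Optional[str] = None   # text payload for keyboard actions


def run_agent(
    instruction: str,
    capture: Callable[[], bytes],
    predict: Callable[[str, bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Repeat screenshot -> model prediction -> input action until finished."""
    history: List[Action] = []
    for step in range(max_steps):
        screenshot = capture()                            # perceive
        action = predict(instruction, screenshot, history)  # decide
        history.append(action)
        print(f"step {step + 1}: {action.kind}")          # status feedback
        if action.kind == "finished":
            break
        execute(action)                                   # act
    return history


# Stubs so the sketch runs without a display or a model (assumptions).
def fake_capture() -> bytes:
    return b"fake-png-bytes"


def fake_predict(instruction: str, screenshot: bytes, history: List[Action]) -> Action:
    script = [
        Action("click", x=100, y=200),
        Action("type", text="hello"),
        Action("finished"),
    ]
    # Replay a fixed script, one action per step already taken.
    return script[min(len(history), len(script) - 1)]


executed: List[Action] = []
history = run_agent(
    "Type hello into the search box", fake_capture, fake_predict, executed.append
)
```

Passing the screenshot, model, and input functions in as parameters keeps the loop testable without a display; a real setup would replace the stubs with actual screen capture, a model call, and an input driver.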
## Quick Start
See [Quick Start](./docs/quick-start.md).
## Deployment
See [Deployment](https://github.com/bytedance/UI-TARS/blob/main/README_deploy.md).
## Contributing
See [CONTRIBUTING.md](./CONTRIBUTING.md).
## SDK (Experimental)
See [@ui-tars/sdk](./docs/sdk.md).
## License
UI-TARS Desktop is licensed under the Apache License 2.0.
## Citation
If you find our paper and code useful in your research, please consider giving us a star :star: and a citation :pencil:
```BibTeX
@article{qin2025ui,
title={UI-TARS: Pioneering Automated GUI Interaction with Native Agents},
author={Qin, Yujia and Ye, Yining and Fang, Junjie and Wang, Haoming and Liang, Shihao and Tian, Shizuo and Zhang, Junda and Li, Jiahao and Li, Yunxin and Huang, Shijue and others},
journal={arXiv preprint arXiv:2501.12326},
year={2025}
}
```