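diff --git a/docs/engines/autotrain-engine.md b/docs/engines/autotrain-engine.md deleted file mode 100644 index 93354f60a75c4f65682fd3f00c35dd7a88636e55..0000000000000000000000000000000000000000 --- a/docs/engines/autotrain-engine.md +++ /dev/null @@ -1,54 +0,0 @@ -# Training Engine - -The Training Engine is a no-code fine-tuning tool that handles fine-tuning jobs across several domains, including text processing, large language models (LLM), and text-to-image (DreamBooth). - -## Who It Is For - -If you would rather not spend a lot of time on the technical details of model training, the Training Engine is a good choice, because it covers fine-tuning for most of today's popular, state-of-the-art open-source models. - -Simple UI interactions save you time on the technical details of model training. The goal of the Training Engine is to let every user train models simply and easily, not just scientists or AI engineers. - -## How to Deploy the Training Engine - -### 1. Create the Engine - -You can deploy your Training Engine as follows: - -- [Create an app](/apps/overview#创建应用), choose Docker as your app SDK, and for the template select - AutoTrain. You also need to fill in your access token ([click here to create a token](https://gitee.com/profile/personal_access_tokens)). We recommend setting the app to private. - -![alt text](/img/engines/autotrain-engine/image-5.png) - -### 2. Update the Engine (Optional) - -After creating the Training Engine, if you need to update the token or run into an error, open the app's settings page, choose the "功能" (Features) option, and select factory reset (出厂重启); your training program will be reset. - -![alt text](/img/engines/autotrain-engine/image.png) - -Wait for the build and startup to finish, and you will see the training interface. - -![alt text](/img/engines/autotrain-engine/img.png) - -### 3. Start Training - -Starting from bert-base-chinese, train a simple work-task classification model: - -- First, on the page, set the task to `文本分类` (text classification) and the base model to `bert-base-chinese`. - -![alt text](/img/engines/autotrain-engine/img_2.png) - -- Upload your prepared data in the required format. - -![img.png](/img/engines/autotrain-engine/img_3.png) - -- Click the start-training button to begin the fine-tuning job. - -You can follow training progress in the training log at the bottom of the page. - -![alt text](/img/engines/autotrain-engine/train-log.png) - -When training finishes, the model is automatically uploaded to your account, and you can view it on your profile page. - -![img.png](/img/engines/autotrain-engine/img_4.png) - -You can try it out in the model widget (模型挂件), or deploy it to the [Model Engine](/docs/engines/model-engine) as needed to serve external requests. diff --git a/docs/engines/img.png b/docs/engines/img.png deleted file mode 100644 index e1829cc26858e840df5d85934ab5f857be530e7a..0000000000000000000000000000000000000000 Binary files a/docs/engines/img.png and /dev/null differ diff --git a/docs/training.md b/docs/training.md new file mode 100644 index 0000000000000000000000000000000000000000..64e4c31fe192b34648f6ea48635d5efcf99d5d22 --- /dev/null +++ b/docs/training.md @@ -0,0 +1,13 @@ +# Model Fine-tuning + +LoRA 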
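model training is complicated work! It takes scripts and dependencies, Python, Torch 2.0, and a lot of intimidating jargon, and once you have mastered all that, there are still heavy hardware requirements! +We want LoRA training to be as accessible as possible, so we have simplified the trainer as far as we can: new users can use it even if they have never trained anything before, while experienced users still get advanced options. + +Currently supported: + +- Image generation: SDXL (FLUX is in development) +- Large language models (planned) + + + +See the [Getting Started Guide](/training/guide) for detailed usage instructions. diff --git a/docs/training/guide.md b/docs/training/guide.md new file mode 100644 index 0000000000000000000000000000000000000000..62d9435b8b9d8622ca8baf0c0a8f98c7a4bf00ce --- /dev/null +++ b/docs/training/guide.md @@ -0,0 +1,84 @@ +# User Guide + +## Why Fine-tune? +Fine-tuning a large model means taking a base model that has already been trained on large-scale data and further adjusting and optimizing it with a small dataset from a specific domain or task. + +A large model has already learned a great deal of general language knowledge, but its results on a specific task or domain may still fall short; fine-tuning lets the model understand specific data and handle specific tasks. + +This guide uses a dataset of 5 photos of one particular dog to make the model "remember" that dog's features. +:::tip +Fine-tuning is currently in closed beta. To get access, scan the QR code with WeChat and contact our assistant. + +![](/img/training/guide/contact-qrcode.png) +::: + +## Step 1: Create a Fine-tuning Task +In the left sidebar of the Gitee AI console, find `模型微调` (Model Fine-tuning). Click `新建任务` (New Task) in the upper-right corner, enter a task name, and click `创建任务` (Create Task). + +![entry](/img/training/guide/create-task.png) +![entry](/img/training/guide/create-task-form.png) + +## Step 2: Prepare the Dataset +The dataset strongly affects fine-tuning results. It consists of two parts: the data and the tags. + +We recommend that the images show the target object from multiple angles. Uploads support png, jpg, and jpeg images, up to 5 MB per image and at most 100 images. +> Sample images for this tutorial + +| | | | | | +|-----------------------------------|-----------------------------------|-----------------------------------|-----------------------------------|-----------------------------------| +| ![dogs](/img/training/dogs/0.jpg) | ![dogs](/img/training/dogs/1.jpg) | ![dogs](/img/training/dogs/2.jpg) | ![dogs](/img/training/dogs/3.jpg) | ![dogs](/img/training/dogs/4.jpg) | + +To upload images to the fine-tuning task, click the `上传图片` (Upload Images) button and select the image files. + +![entry](/img/training/guide/task-data-form.png) +![entry](/img/training/guide/task-data-upload.png) + + +After the upload completes, click the `打标/裁剪` (Tag/Crop) button, and the system will tag your images. + +![entry](/img/training/guide/task-data-processing.png) +![entry](/img/training/guide/task-data-tags.png) + +Click an image to view the automatic tagging result in a pop-up; you can edit it as needed. + +- Delete tags you consider unrelated to the image +- Add any tags you think are needed + +![entry](/img/training/guide/task-data-tags-edit.png) + +## Step 3: Configure Fine-tuning Parameters + +After the task is created, open the task settings page. The `参数设置` (Parameter Settings) panel on the left covers model selection, number of epochs, repeats per image, and so on; set them according to your needs. + +- **Epochs (循环轮次)**: how many times the model learns the entire dataset (each epoch saves one 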
`lora` model) +- **Repeats per image (单张学习次数)**: how many times the model learns each image within each epoch (循环轮次) + +Together, these two settings determine the training time and also affect the training results. + +![entry](/img/training/guide/task-data-args.png) + + +## Step 4: Train the Model + +Click the `开始训练` (Start Training) button in the lower-right corner. The task enters the queue; after a short wait, the system allocates resources to your task and training begins. + +![entry](/img/training/guide/task-result-padding.png) +![entry](/img/training/guide/task-result-training.png) + +When training finishes, you can view the results, including the model's `loss`. The trainer saves the model once per epoch (`循环轮次`) and generates 3 images at the same time for evaluating the training results. + +![entry](/img/training/guide/task-result-success.png) + + +## Step 5: Save the Model + +First decide where the model will be saved: click the settings on the page, then choose an existing model repository in the selection box or create a new one. + +![entry](/img/training/guide/task-result-save-set-repo.png) +![entry](/img/training/guide/task-result-save.png) + + +After training finishes, each epoch (`循环轮次`) yields one `lora` model. Based on the sample images, pick the `lora` model you want to keep and click the `保存到仓库` (Save to Repository) button; after a short wait, the model is saved to your repository. + +![entry](/img/training/guide/task-result-save-btn.png)![entry](/img/training/guide/task-result-save-result.png) + diff --git a/docusaurus.config.ts b/docusaurus.config.ts index e6f7abb525b96248531fcbbea90457b27a80d786..8c8f54c04a1883454022cc1f8e5a6035587f8950 100644 --- a/docusaurus.config.ts +++ b/docusaurus.config.ts @@ -142,48 +142,48 @@ const config: Config = { footer: { style: 'light', /* - links: [ - { - title: 'Docs', - items: [ - { - label: 'Tutorial', - to: '/docs/intro', - }, - ], - }, - { - title: 'Community', - items: [ - { - label: 'Stack Overflow', - href: 'https://stackoverflow.com/questions/tagged/docusaurus', - }, - { - label: 'Discord', - href: 'https://discordapp.com/invite/docusaurus', - }, - { - label: 'Twitter', - href: 'https://twitter.com/docusaurus', - }, - ], - }, - { - title: 'More', - items: [ - { - label: 'Blog', - to: '/blog', - }, - { - label: 'Gitee AI', - href: 'https://ai.gitee.com', - }, - ], - }, - ], - */ + links: [ + { + title: 'Docs', + items: [ + { + label: 'Tutorial', + to: '/docs/intro', + }, + ], + }, + { + title: 'Community', + items: [ + { + label: 'Stack Overflow', + href: 'https://stackoverflow.com/questions/tagged/docusaurus', + }, + { + label: 'Discord', + href: 
'https://discordapp.com/invite/docusaurus', + }, + { + label: 'Twitter', + href: 'https://twitter.com/docusaurus', + }, + ], + }, + { + title: 'More', + items: [ + { + label: 'Blog', + to: '/blog', + }, + { + label: 'Gitee AI', + href: 'https://ai.gitee.com', + }, + ], + }, + ], + */ copyright: `Copyright © ${new Date().getFullYear()} Gitee AI`, }, prism: { diff --git a/sidebars.ts b/sidebars.ts index 50ed1681971563f188be013ca3cd91bf5f5e266b..4dbed6435b25e33d8ae84cbe16e0ea36041c15f5 100644 --- a/sidebars.ts +++ b/sidebars.ts @@ -183,15 +183,26 @@ const sidebars: SidebarsConfig = { { type: 'category', collapsed: true, - label: '引擎', + label: '模型微调', + link: { + type: 'doc', + id: 'training', + }, items: [ { type: 'doc', - id: 'engines/model-engine', + id: 'training/guide', }, + ], + }, + { + type: 'category', + collapsed: true, + label: '引擎', + items: [ { type: 'doc', - id: 'engines/autotrain-engine', + id: 'engines/model-engine', }, ], }, diff --git a/static/img/training/dogs/0.jpg b/static/img/training/dogs/0.jpg new file mode 100644 index 0000000000000000000000000000000000000000..35f93ee5da9a1b4293392d8de35cb8ea8e3f8fa5 Binary files /dev/null and b/static/img/training/dogs/0.jpg differ diff --git a/static/img/training/dogs/1.jpg b/static/img/training/dogs/1.jpg new file mode 100644 index 0000000000000000000000000000000000000000..ac74166e236f9f476bde39cdc030ae6f8cdcccb7 Binary files /dev/null and b/static/img/training/dogs/1.jpg differ diff --git a/static/img/training/dogs/2.jpg b/static/img/training/dogs/2.jpg new file mode 100644 index 0000000000000000000000000000000000000000..ac39e176e11fa5c8c9fe1909cf6f3b3f4d38d917 Binary files /dev/null and b/static/img/training/dogs/2.jpg differ diff --git a/static/img/training/dogs/3.jpg b/static/img/training/dogs/3.jpg new file mode 100644 index 0000000000000000000000000000000000000000..8f5c123028d3de4ba3725faaa9af52031b652a66 Binary files /dev/null and b/static/img/training/dogs/3.jpg differ diff --git 
a/static/img/training/dogs/4.jpg b/static/img/training/dogs/4.jpg new file mode 100644 index 0000000000000000000000000000000000000000..4ae7a7e278bd7abeca949cbd664f453fb24fc04e Binary files /dev/null and b/static/img/training/dogs/4.jpg differ diff --git a/static/img/training/guide/contact-qrcode.png b/static/img/training/guide/contact-qrcode.png new file mode 100644 index 0000000000000000000000000000000000000000..2ca5d897c688ebc633f730b97c60d50629e01251 Binary files /dev/null and b/static/img/training/guide/contact-qrcode.png differ diff --git a/static/img/training/guide/create-task-form.png b/static/img/training/guide/create-task-form.png new file mode 100644 index 0000000000000000000000000000000000000000..b6946b7c979af49e24a152607c4d890358c0ce31 Binary files /dev/null and b/static/img/training/guide/create-task-form.png differ diff --git a/static/img/training/guide/create-task.png b/static/img/training/guide/create-task.png new file mode 100644 index 0000000000000000000000000000000000000000..8f39cb717de46b252df6d5f0d6eaefcd011861fa Binary files /dev/null and b/static/img/training/guide/create-task.png differ diff --git a/static/img/training/guide/task-data-args.png b/static/img/training/guide/task-data-args.png new file mode 100644 index 0000000000000000000000000000000000000000..3d7ef1106dbacb9c6cb7ed2af71da67bfadb161b Binary files /dev/null and b/static/img/training/guide/task-data-args.png differ diff --git a/static/img/training/guide/task-data-form.png b/static/img/training/guide/task-data-form.png new file mode 100644 index 0000000000000000000000000000000000000000..3ebf3ef8cf7dd16ffb4170453e820f83a5aaa49f Binary files /dev/null and b/static/img/training/guide/task-data-form.png differ diff --git a/static/img/training/guide/task-data-processing.png b/static/img/training/guide/task-data-processing.png new file mode 100644 index 0000000000000000000000000000000000000000..2a0340c5e64e1afb79bb7caeb252997a650c0058 Binary files /dev/null and 
b/static/img/training/guide/task-data-processing.png differ diff --git a/static/img/training/guide/task-data-tags-edit.png b/static/img/training/guide/task-data-tags-edit.png new file mode 100644 index 0000000000000000000000000000000000000000..b4fd7e10fbb1aa869576da3cda6e46e2e4000dd9 Binary files /dev/null and b/static/img/training/guide/task-data-tags-edit.png differ diff --git a/static/img/training/guide/task-data-tags.png b/static/img/training/guide/task-data-tags.png new file mode 100644 index 0000000000000000000000000000000000000000..a2518be051464c89e7d39ada722b5696f0597e2d Binary files /dev/null and b/static/img/training/guide/task-data-tags.png differ diff --git a/static/img/training/guide/task-data-upload.png b/static/img/training/guide/task-data-upload.png new file mode 100644 index 0000000000000000000000000000000000000000..6319cf8fea84255b9ac48ceb3e6eb1a297226edf Binary files /dev/null and b/static/img/training/guide/task-data-upload.png differ diff --git a/static/img/training/guide/task-result-padding.png b/static/img/training/guide/task-result-padding.png new file mode 100644 index 0000000000000000000000000000000000000000..75c53bbce43c99a52242f378a4b899a28aff3cd8 Binary files /dev/null and b/static/img/training/guide/task-result-padding.png differ diff --git a/static/img/training/guide/task-result-save-btn.png b/static/img/training/guide/task-result-save-btn.png new file mode 100644 index 0000000000000000000000000000000000000000..a1993ee41046b5cbb1e71bc7f4dbba358f893dfd Binary files /dev/null and b/static/img/training/guide/task-result-save-btn.png differ diff --git a/static/img/training/guide/task-result-save-result.png b/static/img/training/guide/task-result-save-result.png new file mode 100644 index 0000000000000000000000000000000000000000..7f6ea771e5e262e24bf37b86e96dc93e380a7ed5 Binary files /dev/null and b/static/img/training/guide/task-result-save-result.png differ diff --git a/static/img/training/guide/task-result-save-set-repo.png 
b/static/img/training/guide/task-result-save-set-repo.png new file mode 100644 index 0000000000000000000000000000000000000000..579c1fc234e12e1473cc439c16c0d446e69608bf Binary files /dev/null and b/static/img/training/guide/task-result-save-set-repo.png differ diff --git a/static/img/training/guide/task-result-save.png b/static/img/training/guide/task-result-save.png new file mode 100644 index 0000000000000000000000000000000000000000..a6c11dac0011dd7d5fd2da12dae68dce33f0a0fa Binary files /dev/null and b/static/img/training/guide/task-result-save.png differ diff --git a/static/img/training/guide/task-result-success.png b/static/img/training/guide/task-result-success.png new file mode 100644 index 0000000000000000000000000000000000000000..c129b557fec5f8ddf10b91bfb3582437a8a33f7b Binary files /dev/null and b/static/img/training/guide/task-result-success.png differ diff --git a/static/img/training/guide/task-result-training.png b/static/img/training/guide/task-result-training.png new file mode 100644 index 0000000000000000000000000000000000000000..cda7f35c2c525c0b3b45692aefdfc1a4d7e3bc48 Binary files /dev/null and b/static/img/training/guide/task-result-training.png differ