dtx-datase #31

Closed
wants to merge 3 commits into from
11 changes: 11 additions & 0 deletions docs/zh/docs/dtx/Dataset/Dataset.md
@@ -0,0 +1,11 @@
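# Manage datasets

- Switch namespace: on the dataset page, click the namespace to switch. Each namespace shows only the datasets that belong to it.

- Search datasets: on the dataset page, click the search box and enter a query. The results appear in the dataset list.

- Edit a dataset: on the dataset page, click the edit button to open the update-dataset page, where you can edit the dataset information.

- Delete a dataset: on the dataset page, click the delete button to delete the dataset.

![Dataset management](images/Dataset5.png)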
53 changes: 53 additions & 0 deletions docs/zh/docs/dtx/Dataset/create-Dataset.md
@@ -0,0 +1,53 @@
# Datasets

!!!note
Collaborator

Suggested change
!!!note
!!! note

refer to: https://squidfunk.github.io/mkdocs-material/reference/admonitions/


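A dataset is a collection of data, usually presented in tabular form. Each column represents a particular variable, and each row corresponds to one member of the dataset. A row lists the value of every variable for that member, such as the height and weight of an object, or the values of random numbers. Each individual value is called a datum.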
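
As a rough illustration of the definition above, the following sketch models a tiny dataset as rows and columns. The column names and values are made up for this example and do not come from the product.

```python
# Hypothetical sample data; the column names are illustrative only.
columns = ["name", "height_cm", "weight_kg"]
rows = [
    ["object_a", 170.0, 65.0],
    ["object_b", 182.5, 80.2],
]

# Each row is one member of the dataset; pair its values with the column names.
records = [dict(zip(columns, row)) for row in rows]

# A single value, such as the height of the first member, is one datum.
first_height = records[0]["height_cm"]
print(first_height)
```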

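- Click Datasets under Model Fine-tuning to open the dataset management page.

![Datasets](images/Dataset1.png)

## Create a dataset

1. Click the Create Dataset button to open the dataset creation page.

2. Enter the dataset name (required): it cannot be empty and can contain 1-50 characters.

3. Enter the dataset label (optional): it can contain 1-50 characters and serves as a note on the dataset name.

4. Select the dataset language (required): the current options are Chinese and English, and you can select both.

5. Select the namespace of the dataset (required): this is the namespace the dataset belongs to.

6. Select a license: CC-BY-NC is recommended, but you can choose another license.

7. Select the number of entries (required): the range you select corresponds to the size of the data file you upload.

![Datasets 2](images/Dataset2.png)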
Collaborator

Suggested change
![Datasets 2](images/Dataset2.png)
![Datasets 2](images/Dataset2.png)


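8. Select the dataset task type (required): the current options are text generation, QA dialogue, text classification, and summarization.

- Subtask type (multiple subtasks can be added): click Add, then choose a subtask type from language modeling, masked language modeling, and natural language processing.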
Collaborator

Suggested change
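- Subtask type (multiple subtasks can be added): click Add, then choose a subtask type from language modeling, masked language modeling, and natural language processing.
Subtask type (multiple subtasks can be added): click **Add**, then choose a subtask type from language modeling, masked language modeling, and natural language processing.

All UI terms should be in bold.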


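9. Fill in the dataset information (either plugin configuration or standard configuration):

- Plugin configuration: turn on plugin configuration to select a plugin. If the plugin requires parameters, enter them in the plugin configuration.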
Collaborator

Suggested change
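- Plugin configuration: turn on plugin configuration to select a plugin. If the plugin requires parameters, enter them in the plugin configuration
- Plugin configuration: if plugin configuration is turned on, you can enter the runtime parameters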


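![Plugin configuration](images/Dataset3.png)

- Standard configuration: enter the sub-dataset name, the training dataset address, the validation dataset address, and the test dataset address.

- Training dataset address: click Upload Data File, then click Upload File to upload the data file.

- Validation dataset address: click Upload Data File, then click Upload File to upload the data file.

- Test dataset address: click Upload Data File, then click Upload File to upload the data file.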
Comment on lines +41 to +45
Collaborator

Suggested change
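- Training dataset address: click Upload Data File, then click Upload File to upload the data file.
- Validation dataset address: click Upload Data File, then click Upload File to upload the data file.
- Test dataset address: click Upload Data File, then click Upload File to upload the data file.
Training dataset address, validation dataset address, test dataset address: upload a data file for each.

Are you padding the text with filler?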


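![Standard configuration](images/Dataset4.png)

10. Feature mapping: enter the column names used in the data file.

11. Data source information: enter the name, description, and link of the data source.

12. Click OK to create the dataset.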
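
The field rules in the steps above can be sketched as a small validation routine. This is a minimal sketch only: `DatasetForm`, `validate`, and the option sets are hypothetical names that mirror the form fields described here, not an actual API of the product, and the check assumes the 1-50 limit is counted in characters.

```python
from dataclasses import dataclass

# Hypothetical option sets taken from steps 4 and 8 above.
ALLOWED_LANGUAGES = {"Chinese", "English"}
ALLOWED_TASK_TYPES = {
    "text generation",
    "QA dialogue",
    "text classification",
    "summarization",
}


@dataclass
class DatasetForm:
    """Hypothetical container for the required and optional form fields."""
    name: str
    namespace: str
    languages: list
    task_type: str
    label: str = ""            # optional, step 3
    license: str = "CC-BY-NC"  # recommended choice, step 6


def validate(form: DatasetForm) -> list:
    """Return a list of problems; an empty list means the form passes."""
    errors = []
    if not 1 <= len(form.name) <= 50:
        errors.append("name must be 1-50 characters")
    if form.label and len(form.label) > 50:
        errors.append("label must be 1-50 characters when provided")
    if not form.languages or not set(form.languages) <= ALLOWED_LANGUAGES:
        errors.append("select Chinese, English, or both")
    if not form.namespace:
        errors.append("namespace is required")
    if form.task_type not in ALLOWED_TASK_TYPES:
        errors.append("unknown task type")
    return errors


form = DatasetForm(
    name="demo",
    namespace="default",
    languages=["Chinese", "English"],
    task_type="summarization",
)
print(validate(form))  # prints []
```

Returning a list of messages instead of raising on the first failure lets a caller show every problem with the form at once.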
Peauntyang marked this conversation as resolved.
Binary file added docs/zh/docs/dtx/Dataset/images/Dataset1.png
Binary file added docs/zh/docs/dtx/Dataset/images/Dataset2.png
Binary file added docs/zh/docs/dtx/Dataset/images/Dataset3.png
Binary file added docs/zh/docs/dtx/Dataset/images/Dataset4.png
Binary file added docs/zh/docs/dtx/Dataset/images/Dataset5.png