docs(zh-cn): Reviewed 40_uploading-a-dataset-to-the-hub.srt #528

Merged: 2 commits, Apr 10, 2023

60 changes: 30 additions & 30 deletions subtitles/zh-CN/40_uploading-a-dataset-to-the-hub.srt
@@ -5,7 +5,7 @@

2
00:00:05,490 --> 00:00:07,950
- 将数据集上传到中心
- 将数据集上传到 hub
- Uploading a dataset to the hub.

3
@@ -30,7 +30,7 @@ The first thing you need to do

7
00:00:14,670 --> 00:00:17,400
是在集线器上创建一个新的数据集存储库
是在 hub 上创建一个新的数据集仓库
is create a new dataset repository on the hub.

8
@@ -40,7 +40,7 @@ So, just click on your profile icon

9
00:00:19,260 --> 00:00:21,750
并选择新建数据集按钮
并选择 New Dataset 按钮
and select the New Dataset button.

10
@@ -50,17 +50,17 @@ Next, we need to assign an owner of the dataset.

11
00:00:24,750 --> 00:00:26,970
默认情况下,这将是你的中心帐户
默认情况下,所有者是你的 hub 帐户
By default, this will be your hub account,

12
00:00:26,970 --> 00:00:28,170
但你也可以创建数据集
但你也可以
but you can also create datasets

13
00:00:28,170 --> 00:00:30,585
在你所属的任何组织下
以你所属的组织的名义创建数据集
under any organization that you belong to.

14
@@ -80,12 +80,12 @@ Public datasets can be accessed by anyone

17
00:00:39,810 --> 00:00:41,670
而私人数据集只能被访问
而私人数据集只能
while private datasets can only be accessed

18
00:00:41,670 --> 00:00:43,653
由你或你的组织成员
允许你或你的组织成员访问
by you or members of your organization.

19
@@ -95,7 +95,7 @@ And with that, we can go ahead and create the dataset.

20
00:00:48,690 --> 00:00:51,060
现在你在集线器上有一个空的数据集存储库
现在你在 hub 上有一个空的数据集仓库
Now that you have an empty dataset repository on the hub,

21
@@ -110,17 +110,17 @@ You can do this with git,

23
00:00:55,050 --> 00:00:57,960
但最简单的方法是选择上传文件按钮
但最简单的方法是选择 Upload file 按钮
but the easiest way is by selecting the Upload file button.

24
00:00:57,960 --> 00:00:59,160
然后,你可以继续
然后,你可以继续下一步
And then, you can just go ahead

25
00:00:59,160 --> 00:01:02,243
并直接从你的机器上传文件
直接从你的机器上传文件
and upload the files directly from your machine.

26
@@ -130,12 +130,12 @@ After you've uploaded your files,

27
00:01:03,846 --> 00:01:05,670
你会看到它们出现在存储库中
你会在 Files and versions 选项卡下
you'll see them appear in the repository

28
00:01:05,670 --> 00:01:07,320
在文件和版本选项卡下
看到它们出现在仓库中
under the Files and versions tab.

29
@@ -145,27 +145,27 @@ The last step is to create a dataset card.

30
00:01:11,370 --> 00:01:13,590
记录良好的数据集更有用
对其他人来说,归档良好的数据集貌似更有用,
Well-documented datasets are more likely to be useful

31
00:01:13,590 --> 00:01:15,600
给其他人,因为他们提供了决定的背景
因为他们提供了影响决策的上下文信息
to others as they provide the context to decide

32
00:01:15,600 --> 00:01:17,370
数据集是否相关
包括数据集是否与需求相关
whether the dataset is relevant

33
00:01:17,370 --> 00:01:18,450
或者有没有偏见
或者有没有误差
or whether there are any biases

34
00:01:18,450 --> 00:01:20,673
或与使用数据集相关的风险
或使用该数据可能遇到的风险
or risks associated with using the dataset.

35
@@ -175,12 +175,12 @@ On the Hugging Face Hub,

36
00:01:22,710 --> 00:01:25,650
此信息存储在每个存储库的自述文件中
此信息存储在每个仓库的 README 文件中
this information is stored in each repositories README file.

37
00:01:25,650 --> 00:01:27,988
你应该采取两个主要步骤
你需要执行两个主要步骤
There are two main steps that you should take.

38
@@ -190,27 +190,27 @@ First, you need to create some metadata

39
00:01:30,651 --> 00:01:32,010
这将允许你的数据集
这将允许其他人在 hub 上
that will allow your dataset

40
00:01:32,010 --> 00:01:34,590
其他人可以在集线器上轻松找到
轻松找到你的数据集
to be easily found by others on the hub.

41
00:01:34,590 --> 00:01:35,670
你可以创建此元数据
你可以使用数据集标记应用程序
You can create this metadata

42
00:01:35,670 --> 00:01:37,860
使用数据集标记应用程序
创建此元数据
using the datasets tagging application,

43
00:01:37,860 --> 00:01:40,620
我们将在视频说明中链接到它
我们将在视频说明信息中包含它的链接
which we'll link to in the video description.

44
@@ -230,12 +230,12 @@ and we provide a template

47
00:01:45,240 --> 00:01:47,090
我们还将在视频中链接到
相关链接也会包含在下面的视频信息内容中
that we'll also link to in the video.

48
00:01:48,480 --> 00:01:50,280
一旦你的数据集在集线器上
一旦你的数据集在 hub 上
And once your dataset is on the hub,

49
@@ -245,12 +245,12 @@ you can load it using the trusty load_dataset function.

50
00:01:53,400 --> 00:01:55,015
只需提供你的存储库的名称
只需提供你的仓库的名称
Just provide the name of your repository

51
00:01:55,015 --> 00:01:57,843
和一个 data_files 参数,你就可以开始了
和一个 data_files 参数,你就可以开始使用了
and a data_files argument, and you're good to go.

52