From f47740c6893ddfbe1d9352a9f167224373f29538 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E6=9D=8E=E6=B4=8B?= <45715979+innovation64@users.noreply.github.com>
Date: Tue, 14 Feb 2023 00:54:07 +0800
Subject: [PATCH] docs(zh-cn): Reviewed No. 08 - What happens inside the
 pipeline function? (PyTorch) (#454)

---
 ...inside-the-pipeline-function-(pytorch).srt | 52 +++++++++----------
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/subtitles/zh-CN/08_what-happens-inside-the-pipeline-function-(pytorch).srt b/subtitles/zh-CN/08_what-happens-inside-the-pipeline-function-(pytorch).srt
index ca6c0276f..fffc4139f 100644
--- a/subtitles/zh-CN/08_what-happens-inside-the-pipeline-function-(pytorch).srt
+++ b/subtitles/zh-CN/08_what-happens-inside-the-pipeline-function-(pytorch).srt
@@ -5,7 +5,7 @@
 2
 00:00:05,340 --> 00:00:07,563
-- 管道函数内部发生了什么?
+- pipeline 函数内部发生了什么?
 - What happens inside the pipeline function?

 3
@@ -25,22 +25,22 @@ of the Transformers library.
 6
 00:00:15,090 --> 00:00:16,860
-更具体地说,我们将看看
+更具体地说,我们将了解
 More specifically, we will look

 7
 00:00:16,860 --> 00:00:19,200
-在情绪分析管道中,
+在情绪分析的 pipeline 中,
 at the sentiment analysis pipeline,

 8
 00:00:19,200 --> 00:00:22,020
-以及它是如何从以下两个句子开始的,
+它是如何从以下两个句子开始的,
 and how it went from the two following sentences,

 9
 00:00:22,020 --> 00:00:23,970
-正负标签
+得到正负标签
 to the positive and negative labels

 10
@@ -50,12 +50,12 @@ with their respective scores.
 11
 00:00:26,760 --> 00:00:29,190
-正如我们在管道演示中看到的那样,
+正如我们在 pipeline 展示中看到的那样,
 As we have seen in the pipeline presentation,

 12
 00:00:29,190 --> 00:00:31,860
-管道分为三个阶段。
+pipeline 分为三个阶段。
 there are three stages in the pipeline.

 13
 First, we convert the raw texts to numbers

 14
 00:00:34,620 --> 00:00:37,173
-该模型可以理解使用分词器。
+也就是模型借助分词器能够理解的数字。
 the model can make sense of using a tokenizer.

 15
 Then those numbers go through the model,

 16
 00:00:40,530 --> 00:00:41,943
-输出逻辑。
+输出 logits。
 which outputs logits.
 17
 00:00:42,780 --> 00:00:45,600
-最后,后处理步骤变换
+最后,后处理步骤将
 Finally, the post-processing steps transforms

 18
 00:00:45,600 --> 00:00:48,150
-那些登录到标签和分数。
+那些 logits 转换为标签和分数。
 those logits into labels and scores.

 19
@@ -100,17 +100,17 @@ and how to replicate them using the Transformers library,
 21
 00:00:53,640 --> 00:00:56,043
-从第一阶段开始,标记化。
+从第一阶段开始,token 化。
 beginning with the first stage, tokenization.

 22
 00:00:57,915 --> 00:01:00,360
-令牌化过程有几个步骤。
+token 化过程有几个步骤。
 The tokenization process has several steps.

 23
 00:01:00,360 --> 00:01:04,950
-首先,文本被分成称为标记的小块。
+首先,文本被分成称为 token 的小块。
 First, the text is split into small chunks called tokens.

 24
@@ -120,7 +120,7 @@ They can be words, parts of words or punctuation symbols.
 25
 00:01:08,550 --> 00:01:11,580
-然后 tokenizer 将有一些特殊的标记,
+然后 tokenizer 会添加一些特殊的 token,
 Then the tokenizer will had some special tokens,

 26
@@ -130,17 +130,17 @@ if the model expect them.
 27
 00:01:13,500 --> 00:01:16,860
-这里的模型在开头使用期望 CLS 令牌
+这里的模型期望在开头有一个 CLS token
 Here the model uses expects a CLS token at the beginning

 28
 00:01:16,860 --> 00:01:19,743
-以及用于分类的句子末尾的 SEP 标记。
+以及在待分类句子的末尾有一个 SEP token。
 and a SEP token at the end of the sentence to classify.

 29
 00:01:20,580 --> 00:01:24,180
-最后,标记器将每个标记与其唯一 ID 匹配
+最后,tokenizer 将每个 token 与其唯一 ID 匹配
 Lastly, the tokenizer matches each token to its unique ID

 30
@@ -180,7 +180,7 @@ Here the checkpoint used by default
 37
 00:01:45,360 --> 00:01:47,280
-用于情绪分析管道
+用于情绪分析的 pipeline
 for the sentiment analysis pipeline

 38
@@ -250,7 +250,7 @@ Looking at the result, we see we have a dictionary
 51
 00:02:25,590 --> 00:02:26,670
-用两把钥匙。
+包含两个键。
 with two keys.

 52
@@ -265,7 +265,7 @@ with zero where the padding is applied.
 54
 00:02:32,550 --> 00:02:34,260
-第二把钥匙,注意面具,
+第二个键,attention mask,
 The second key, attention mask,

 55
@@ -280,7 +280,7 @@ so the model does not pay attention to it.
 57
 00:02:38,940 --> 00:02:42,090
-这就是标记化步骤中的全部内容。
+这就是 token 化步骤中的全部内容。
 This is all what is inside the tokenization step.
 58
@@ -350,7 +350,7 @@ for our classification problem.
 71
 00:03:15,030 --> 00:03:19,230
-这里的张量有两个句子,每个句子有 16 个标记,
+这里的张量有两个句子,每个句子有 16 个 token,
 Here the tensor has two sentences, each of 16 tokens,

 72
@@ -425,12 +425,12 @@ This is because each model
 86
 00:03:57,270 --> 00:04:00,810
-每个模型都会返回 logits。
+Transformers 库的每个模型都会返回 logits。
 of the Transformers library returns logits.

 87
 00:04:00,810 --> 00:04:02,250
-为了理解这些逻辑,
+为了理解这些 logits,
 To make sense of those logits,

 88
@@ -505,7 +505,7 @@ This is how our classifier built
 102
 00:04:37,950 --> 00:04:40,230
-使用管道功能选择了那些标签
+使用 pipeline 函数选择了那些标签
 with the pipeline function picked those labels

 103
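Reviewer's note: the subtitles above describe the pipeline's post-processing stage, where the model's raw logits are turned into labels with scores. The step can be sketched in plain Python; the logit values below are made-up placeholders (not real model output), and `id2label` is an assumed stand-in for the mapping the model config provides:

```python
import math

def softmax(logits):
    """Turn a list of logits into probabilities, mirroring what the
    post-processing step does with a softmax over the model output."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Placeholder logits for two sentences (illustrative values only).
batch_logits = [[-1.56, 1.61], [4.17, -3.35]]
# Hypothetical label mapping, standing in for the checkpoint's config.
id2label = {0: "NEGATIVE", 1: "POSITIVE"}

for logits in batch_logits:
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    print({"label": id2label[best], "score": round(probs[best], 4)})
```

In the course code itself this stage is done with `torch.nn.functional.softmax` on the model's output tensor, with the labels read from the model config; the pure-Python version above only illustrates the arithmetic.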