huggingface · tyisme614 · Dec 27, 2022 · Dec 30, 2022
diff --git a/subtitles/zh-CN/66_the-post-processing-step-in-question-answering-(pytorch).srt b/subtitles/zh-CN/66_the-post-processing-step-in-question-answering-(pytorch).srt
@@ -5,12 +5,12 @@
 
 2
 00:00:05,940 --> 00:00:08,913
-- 问答任务中的后处理步骤。
+- 问答任务中的后处理操作。
 - The post-processing step in a question answering task.
 
 3
 00:00:10,440 --> 00:00:12,180
-在做答题时，
+在做问答任务时，
 When doing question answering,
 
 4
@@ -20,7 +20,7 @@ the processing of the initial dataset
 
 5
 00:00:14,550 --> 00:00:17,370
-意味着将示例拆分为多个功能，
+意味着以多个特征拆分示例，
 implies splitting examples in several features,
 
 6
@@ -30,37 +30,37 @@ which may or may not contain the answer.
 
 7
 00:00:21,000 --> 00:00:22,740
-通过模型传递这些特征
+由于我们的标签是词元的索引  
 Passing those features through the model
 
 8
 00:00:22,740 --> 00:00:25,830
-将为我们提供开始和结束位置的 logits，
+其对应于答案的起始和结束，
 will give us logits for the start and end positions,
 
 9
 00:00:25,830 --> 00:00:28,650
-因为我们的标签是令牌的索引
+通过模型传递这些特征
 since our labels are the indices of the token
 
 10
 00:00:28,650 --> 00:00:31,050
-对应于开始和结束的答案。
+将为我们提供开始和结束位置的对数值。
 that correspond to the start and end the answer.
 
 11
 00:00:32,664 --> 00:00:35,490
-然后我们必须以某种方式将这些 logits 转换为答案，
+然后我们必须以某种方式将这些对数值转换为答案，
 We must then somehow convert those logits into an answer,
 
 12
 00:00:35,490 --> 00:00:38,610
-然后从每个功能给出的各种答案中选择一个
+然后从每个特征给出的各种答案中选择一个
 and then pick one of the various answers each feature gives
 
 13
 00:00:38,610 --> 00:00:40,893
-成为给定示例的答案。
+作为给定示例的答案。
 to be the answer for a given example.
 
 14
@@ -70,12 +70,12 @@ For the processing step,
 
 15
 00:00:43,500 --> 00:00:45,750
-你应该参考下面链接的视频。
+你可以参考下面链接的视频。
 you should refer to the video linked below.
 
 16
 00:00:45,750 --> 00:00:47,820
-验证并没有太大的不同，
+验证方面也没有太大的变化，
 It's not very different for validation,
 
 17
@@ -85,42 +85,42 @@ we just need to add a few lines to keep track of two things.
 
 18
 00:00:51,660 --> 00:00:54,960
-我们保留它们，而不是丢弃偏移映射，
+我们保留 offset mapping，而不是丢弃它们，
 Instead of discarding the offset mappings, we keep them,
 
 19
 00:00:54,960 --> 00:00:55,793
-也包括在其中
+并且通过设置特殊词元的偏移量
 and also include in them
 
 20
 00:00:55,793 --> 00:00:58,350
-上下文在哪里的信息
+以及将问题设置为 None
 the information of where the context is
 
 21
 00:00:58,350 --> 00:01:00,690
-通过设置特殊标记的偏移量
+将所保留的 offset mapping 
 by setting the offsets of the special tokens
 
 22
 00:01:00,690 --> 00:01:02,253
-和无的问题。
+包含在上下文的信息里。
 and the question to None.
 
 23
 00:01:03,480 --> 00:01:06,630
-然后我们还跟踪每个功能的示例 ID，
+然后我们还跟踪每个特征的示例 ID，
 Then we also keep track of the example ID for each feature,
 
 24
 00:01:06,630 --> 00:01:08,280
-能够映射回特征
+能够将特征映射回
 to be able to map back feature
 
 25
 00:01:08,280 --> 00:01:10,503
-他们起源的例子。
+他们初始的例子。
 to the examples that they originated from.
 
 26
@@ -130,22 +130,22 @@ If you don't want to compute the validation loss,
 
 27
 00:01:14,100 --> 00:01:15,990
-你不需要包含所有特殊代码
+你不需要包含所有这些
 you won't need to include all the special code
 
 28
 00:01:15,990 --> 00:01:18,420
-我们用来创建标签的。
+用来创建标签的特殊代码。
 that we used to create the labels.
 
 29
 00:01:18,420 --> 00:01:21,090
-完成后，我们可以应用该预处理功能
+完成后，我们可以调用 map 方法
 With this done, we can apply that preprocessing function
 
 30
 00:01:21,090 --> 00:01:22,890
-使用映射方法。
+应用该预处理功能。
 using the map method.
 
 31
@@ -155,7 +155,7 @@ We take the SQUAD dataset
 
 32
 00:01:24,090 --> 00:01:26,840
-比如问答视频的预处理。
+就和问答视频的预处理中所用到的一样。
 like in the preprocessing for question-answering video.
 
 33
@@ -165,12 +165,12 @@ Once this is done, the next step is to create our model.
 
 34
 00:01:30,540 --> 00:01:31,710
-我们使用默认模型
+我们在这里的问答管道背后
 We use the default model
 
 35
 00:01:31,710 --> 00:01:33,930
-在这里的问答管道背后，
+使用默认模型，
 behind the question-answering pipeline here,
 
 36
@@ -190,17 +190,17 @@ so we create a PyTorch DataLoader with our features.
 
 39
 00:01:42,657 --> 00:01:44,520
-有了它，我们可以计算和收集
+有了它，我们可以使用标准的 PyTorch 评估循环
 With it, we can compute and gather
 
 40
 00:01:44,520 --> 00:01:46,650
-所有的开始和结束都是这样的，
+来计算和收集所有
 all the start and end logits like this,
 
 41
 00:01:46,650 --> 00:01:49,653
-使用标准的 PyTorch 评估循环。
+像这样的开始和结束的对数值。
 with a standard PyTorch evaluation loop.
 
 42
@@ -215,7 +215,7 @@ First, we'll need a map from example to features,
 
 44
 00:01:56,340 --> 00:01:57,873
-我们可以这样创建。
+我们像这样创建。
 which we can create like this.
 
 45
@@ -225,22 +225,22 @@ Now, for the main part of the post-processing,
 
 46
 00:02:00,810 --> 00:02:04,230
-让我们看看如何从 logits 中提取答案。
+让我们看看如何从对数中提取答案。
 let's see how to extract an answer from the logits.
 
 47
 00:02:04,230 --> 00:02:05,760
-我们可以只取最好的索引
+我们可以针对开始和结束对数值
 We could just take the best index
 
 48
 00:02:05,760 --> 00:02:07,980
-对于开始和结束登录并完成，
+只取最好的索引，
 for the start and end logits and be done,
 
 49
 00:02:07,980 --> 00:02:10,380
-但如果我们的模型预测了一些不可能的事情，
+但如果我们的模型预测得出了不可思议的结果，
 but if our model predicts something impossible,
 
 50
@@ -250,7 +250,7 @@ like tokens in the question,
 
 51
 00:02:12,150 --> 00:02:13,940
-我们将查看更多的 logits。
+我们将查看更多的对数值。
 we'll look at more of the logits.
 
 52
@@ -260,12 +260,12 @@ Note that in the question-answering pipeline,
 
 53
 00:02:17,070 --> 00:02:18,870
-我们将分数归因于每个答案
+我们基于概率将分数归因
 we attributed score to each answer
 
 54
 00:02:18,870 --> 00:02:20,430
-基于概率，
+于每个答案，
 based on the probabilities,
 
 55
@@ -275,7 +275,7 @@ which we did not compute here.
 
 56
 00:02:22,350 --> 00:02:25,560
-就逻辑而言，我们在分数中的乘法
+就对数而言，我们在分数中的乘法
 In terms of logits, the multiplication we had in the scores
 
 57
@@ -285,27 +285,27 @@ becomes an addition.
 
 58
 00:02:28,110 --> 00:02:29,010
-要走得快，
+要快速完成，
 To go fast,
 
 59
 00:02:29,010 --> 00:02:31,800
-我们不会查看所有可能的开始和结束日志，
+我们不会查看所有可能的开始和结束的对数，
 we don't look at all possible start and end logits,
 
 60
 00:02:31,800 --> 00:02:34,050
-但是最好的 20 个就足够了。
+仅需最好的 20 个就足够了。
 but the 20 best one is enough.
 
 61
 00:02:34,050 --> 00:02:36,570
-我们忽略产生不可能答案的逻辑
+我们忽略产生不可能答案和答案太长
 We ignore the logits that spawn impossible answers
 
 62
 00:02:36,570 --> 00:02:38,550
-或回答太长。
+的对数值。
 or answer that are too long.
 
 63
@@ -315,7 +315,7 @@ As we saw in the preprocessing, the labels 0,0
 
 64
 00:02:41,430 --> 00:02:43,230
-对应不回答。
+对应无答案。
 correspond to a no answer.
 
 65
@@ -330,12 +330,12 @@ to get the answer inside the context.
 
 67
 00:02:47,910 --> 00:02:49,107
-我们来看看预测答案
+我们来看看对于第一个特征
 Let's have a look at the predicted answer
 
 68
 00:02:49,107 --> 00:02:50,370
-对于第一个功能，
+预测的答案，
 for the first feature,
 
 69
@@ -345,7 +345,7 @@ which is the answer with the best score
 
 70
 00:02:51,930 --> 00:02:53,640
-或最好的逻辑分数
+或最好的对数分数
 or the best logit score
 
 71
@@ -355,7 +355,7 @@ since the SoftMax is an increasing function.
 
 72
 00:02:56,280 --> 00:02:58,230
-模型做对了。
+这个模型就是合适的。
 The model got it right.
 
 73
@@ -365,12 +365,12 @@ Next we just have to loop this for every example,
 
 74
 00:03:00,690 --> 00:03:03,720
-为每个选择具有最佳 logit 分数的答案
+在示例生成的所有特征中
 picking for each the answer with the best logit score
 
 75
 00:03:03,720 --> 00:03:06,750
-在示例生成的所有功能中。
+选择具有最佳对数分数的答案。
 in all the features the example generated.
 
 76