From 4dfa82378566009898ae96077f51767d423159c6 Mon Sep 17 00:00:00 2001
From: Yuan
Date: Mon, 12 Dec 2022 18:11:09 +0800
Subject: [PATCH] docs(zh-cn): Reviewed 57_what-is-perplexity.srt (#401)

---
 subtitles/zh-CN/57_what-is-perplexity.srt | 68 +++++++++++------------
 1 file changed, 34 insertions(+), 34 deletions(-)

diff --git a/subtitles/zh-CN/57_what-is-perplexity.srt b/subtitles/zh-CN/57_what-is-perplexity.srt
index 8c1141ddd..b78d4fe1f 100644
--- a/subtitles/zh-CN/57_what-is-perplexity.srt
+++ b/subtitles/zh-CN/57_what-is-perplexity.srt
@@ -15,22 +15,22 @@
 4
00:00:05,379 --> 00:00:06,720
-- 在这段视频中,我们来看看
+- 在这段视频中,我们来了解一下
- In this video, we take a look

5
00:00:06,720 --> 00:00:09,483
-在称为困惑度的神秘测深指标上。
+被称为困惑度的评估指标。
at the mysterious sounding metric called perplexity.

6
00:00:11,070 --> 00:00:12,630
-你可能遇到过困惑
+你在研究生成模型时
You might have encountered perplexity

7
00:00:12,630 --> 00:00:14,970
-在阅读生成模型时。
+可能遇到过困惑度。
when reading about generative models.

8
@@ -40,7 +40,7 @@ You can see two examples here,
9
00:00:16,680 --> 00:00:18,577
-一张来自原始变压器纸,
+一个来自最初的 transformer 论文,
one from the original transformer paper,

10
@@ -50,17 +50,17 @@ one from the original transformer paper,
11
00:00:19,950 --> 00:00:23,340
-另一篇来自最近的 GPT-2 论文。
+另一个来自最近的 GPT-2 论文。
and the other one from the more recent GPT-2 paper.

12
00:00:23,340 --> 00:00:25,740
-困惑度是衡量绩效的常用指标
+困惑度是衡量语言模型的性能
Perplexity is a common metric to measure the performance

13
00:00:25,740 --> 00:00:27,150
-的语言模型。
+的常用指标。
of language models.

14
@@ -70,7 +70,7 @@ The smaller its value, the better the performance.
15
00:00:30,000 --> 00:00:32,950
-但它究竟意味着什么,我们又该如何计算呢?
+但它究竟意味着什么,我们又该如何计算得到它呢?
But what does it actually mean and how can we calculate it?

16
@@ -80,32 +80,32 @@ A very common quantity in machine learning
17
00:00:36,180 --> 00:00:37,650
-是可能性。
+是似然性。
is the likelihood.
18
00:00:37,650 --> 00:00:39,240
-我们可以计算可能性
+我们可以计算似然性
We can calculate the likelihood

19
00:00:39,240 --> 00:00:42,390
-作为每个标记概率的乘积。
+即每个词元概率的乘积。
as the product of each token's probability.

20
00:00:42,390 --> 00:00:44,730
-这意味着对于每个令牌,
+这意味着对于每个词元,
What this means is that for each token,

21
00:00:44,730 --> 00:00:47,340
-我们使用语言模型来预测它的概率
+我们使用语言模型基于之前的词元
we use the language model to predict its probability

22
00:00:47,340 --> 00:00:49,560
-基于之前的标记。
+来预测它的概率。
based on the previous tokens.

23
@@ -115,12 +115,12 @@ In the end, we multiply all probabilities
24
00:00:52,050 --> 00:00:53,253
-得到的可能性。
+从而得到似然性。
to get the likelihood.

25
00:00:55,892 --> 00:00:57,000
-有可能,
+通过似然性,
With the likelihood,

26
@@ -135,27 +135,27 @@ the cross-entropy.

28
00:01:01,200 --> 00:01:03,450
-你可能已经听说过交叉熵
+当你接触损失函数时
You might have already heard about cross-entropy

29
00:01:03,450 --> 00:01:05,670
-在查看损失函数时。
+可能已经听说过交叉熵。
when looking at loss function.

30
00:01:05,670 --> 00:01:09,210
-它通常用作分类中的损失函数。
+它通常在分类中作为损失函数使用。
It is often used as a loss function in classification.

31
00:01:09,210 --> 00:01:11,610
-在语言建模中,我们预测下一个标记
+在语言建模中,我们基于之前的词元
In language modeling, we predict the next token

32
00:01:11,610 --> 00:01:12,930
-基于之前的令牌,
+预测下一个词元,
based on the previous token,

33
@@ -185,32 +185,32 @@ with its inputs as labels.

38
00:01:23,580 --> 00:01:26,433
-然后损失对应于交叉熵。
+其损失即对应于交叉熵。
The loss then corresponds to the cross-entropy.

39
00:01:29,130 --> 00:01:31,110
-我们现在只差一个手术了
+现在对于计算困惑度
We are now only a single operation away

40
00:01:31,110 --> 00:01:33,510
-从计算困惑度。
+我们只差一个操作了。
from calculating the perplexity.

41
00:01:33,510 --> 00:01:37,710
-通过对交叉熵取幂,我们得到了困惑。
+通过对交叉熵取幂,我们得到了困惑度。
By exponentiating the cross-entropy, we get the perplexity.

42
00:01:37,710 --> 00:01:40,260
-所以你看到困惑是密切相关的
+所以你可以发现困惑度和损失
So you see that the perplexity is closely related

43
00:01:40,260 --> 00:01:41,163
-到损失。
+是密切相关的。
to the loss.
44
@@ -220,27 +220,27 @@ Plugging in previous results

45
00:01:43,380 --> 00:01:47,010
-表明这相当于求幂
+表明这相当于对每个词元
shows that this is equivalent to exponentiating

46
00:01:47,010 --> 00:01:51,033
-每个令牌的负平均锁定概率。
+的负平均对数概率取幂。
the negative average lock probability of each token.

47
00:01:52,050 --> 00:01:54,630
-请记住,损失只是一个弱代理
+请记住,损失只是针对模型
Keep in mind that the loss is only a weak proxy

48
00:01:54,630 --> 00:01:57,360
-用于模型生成高质量文本的能力
+生成高质量文本的能力的一个弱代理
for a model's ability to generate quality text

49
00:01:57,360 --> 00:02:00,510
-困惑也是如此。
+困惑度也是如此。
and the same is true for perplexity.

50
@@ -255,7 +255,7 @@ more sophisticated metrics
52
00:02:03,840 --> 00:02:07,413
-例如生成式任务上的 BLEU 或 ROUGE。
+例如生成式任务上的 BLEU 或 ROUGE。
such as BLEU or ROUGE on generative tasks.

53
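
The subtitles above walk through the chain likelihood → cross-entropy → perplexity. A minimal Python sketch of that chain, using only the standard library; the per-token probabilities are invented purely for illustration, not produced by a real language model:

```python
import math

# Hypothetical probabilities a language model assigns to each observed
# token in a short sequence (illustrative values only).
token_probs = [0.25, 0.10, 0.50, 0.05]

# Likelihood: the product of each token's probability.
likelihood = math.prod(token_probs)

# Cross-entropy: the negative average log-probability per token.
cross_entropy = -sum(math.log(p) for p in token_probs) / len(token_probs)

# Perplexity: the exponential of the cross-entropy, which equals
# the negative average log-probability exponentiated.
perplexity = math.exp(cross_entropy)

print(round(perplexity, 4))  # → 6.3246
```

Note that the same result can be written as `likelihood ** (-1 / N)` for `N` tokens, which makes the "exponentiating the negative average log probability" phrasing in the subtitles concrete.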