Commit

docs(zh-cn): Reviewed 57_what-is-perplexity.srt (#401)
tyisme614 authored Dec 12, 2022
1 parent 461e1ac commit 4dfa823
Showing 1 changed file with 34 additions and 34 deletions.
68 changes: 34 additions & 34 deletions subtitles/zh-CN/57_what-is-perplexity.srt
@@ -15,22 +15,22 @@

4
00:00:05,379 --> 00:00:06,720
- 在这段视频中,我们来看看
- 在这段视频中,我们来了解一下
- In this video, we take a look

5
00:00:06,720 --> 00:00:09,483
在称为困惑度的神秘测深指标上
在称为困惑度的评估指标上
at the mysterious sounding metric called perplexity.

6
00:00:11,070 --> 00:00:12,630
你可能遇到过困惑
你在研究生成模型时
You might have encountered perplexity

7
00:00:12,630 --> 00:00:14,970
在阅读生成模型时
可能遇到过困惑度
when reading about generative models.

8
@@ -40,7 +40,7 @@ You can see two examples here,

9
00:00:16,680 --> 00:00:18,577
一张来自原始变压器纸
一个来自最初的 transformer 论文
one from the original transformer paper,

10
@@ -50,17 +50,17 @@ one from the original transformer paper,

11
00:00:19,950 --> 00:00:23,340
另一篇来自最近的 GPT-2 论文。
另一个来自最近的 GPT-2 论文。
and the other one from the more recent GPT-2 paper.

12
00:00:23,340 --> 00:00:25,740
困惑度是衡量绩效的常用指标
困惑度是衡量语言模型的性能
Perplexity is a common metric to measure the performance

13
00:00:25,740 --> 00:00:27,150
的语言模型
的常用指标
of language models.

14
@@ -70,7 +70,7 @@ The smaller its value, the better the performance.

15
00:00:30,000 --> 00:00:32,950
但它究竟意味着什么,我们又该如何计算呢
但它究竟意味着什么,我们又该如何计算得到它呢
But what does it actually mean and how can we calculate it?

16
@@ -80,32 +80,32 @@ A very common quantity in machine learning

17
00:00:36,180 --> 00:00:37,650
是可能性
是相似度
is the likelihood.

18
00:00:37,650 --> 00:00:39,240
我们可以计算可能性
我们可以计算似然性
We can calculate the likelihood

19
00:00:39,240 --> 00:00:42,390
作为每个标记概率的乘积
作为每个词元的概率的乘积
as the product of each token's probability.

20
00:00:42,390 --> 00:00:44,730
这意味着对于每个令牌
这意味着对于每个词元
What this means is that for each token,

21
00:00:44,730 --> 00:00:47,340
我们使用语言模型来预测它的概率
我们使用语言模型基于之前的词元
we use the language model to predict its probability

22
00:00:47,340 --> 00:00:49,560
基于之前的标记
来预测它的概率
based on the previous tokens.

23
@@ -115,12 +115,12 @@ In the end, we multiply all probabilities

24
00:00:52,050 --> 00:00:53,253
得到的可能性
从而得到似然性
to get the likelihood.

25
00:00:55,892 --> 00:00:57,000
有可能
通过似然性
With the likelihood,

26
@@ -135,27 +135,27 @@ the cross-entropy.

28
00:01:01,200 --> 00:01:03,450
你可能已经听说过交叉熵
当你接触损失函数时
You might have already heard about cross-entropy

29
00:01:03,450 --> 00:01:05,670
在查看损失函数时
可能已经听说过交叉熵
when looking at loss function.

30
00:01:05,670 --> 00:01:09,210
它通常用作分类中的损失函数
它通常在分类中作为损失函数使用
It is often used as a loss function in classification.

31
00:01:09,210 --> 00:01:11,610
在语言建模中,我们预测下一个标记
在语言建模中,我们基于之前的词元
In language modeling, we predict the next token

32
00:01:11,610 --> 00:01:12,930
基于之前的令牌
预测下一个词元
based on the previous token,

33
@@ -185,32 +185,32 @@ with its inputs as labels.

38
00:01:23,580 --> 00:01:26,433
然后损失对应于交叉熵
其损失与交叉熵相关
The loss then corresponds to the cross-entropy.

39
00:01:29,130 --> 00:01:31,110
我们现在只差一个手术了
现在对于计算困惑度
We are now only a single operation away

40
00:01:31,110 --> 00:01:33,510
从计算困惑度
我们现在只差一个操作了
from calculating the perplexity.

41
00:01:33,510 --> 00:01:37,710
通过对交叉熵取幂,我们得到了困惑
通过对交叉熵取幂,我们得到了困惑度
By exponentiating the cross-entropy, we get the perplexity.

42
00:01:37,710 --> 00:01:40,260
所以你看到困惑是密切相关的
所以你可以发现困惑度和损失
So you see that the perplexity is closely related

43
00:01:40,260 --> 00:01:41,163
到损失
是密切相关的
to the loss.

44
@@ -220,27 +220,27 @@ Plugging in previous results

45
00:01:43,380 --> 00:01:47,010
表明这相当于求幂
表明这相当于对每个词元
shows that this is equivalent to exponentiating

46
00:01:47,010 --> 00:01:51,033
每个令牌的负平均锁定概率
的负平均锁定概率取幂值
the negative average lock probability of each token.

47
00:01:52,050 --> 00:01:54,630
请记住,损失只是一个弱代理
请记住,损失只是针对模型
Keep in mind that the loss is only a weak proxy

48
00:01:54,630 --> 00:01:57,360
用于模型生成高质量文本的能力
生成高质量文本的能力的一个弱代理
for a model's ability to generate quality text

49
00:01:57,360 --> 00:02:00,510
困惑也是如此
困惑度也是如此
and the same is true for perplexity.

50
@@ -255,7 +255,7 @@ more sophisticated metrics

52
00:02:03,840 --> 00:02:07,413
例如生成任务上的 BLEU 或 ROUGE。
例如生成式任务上的 BLEU 或 ROUGE。
such as BLEU or ROUGE on generative tasks.

53
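The reviewed subtitles walk through a concrete recipe for perplexity: feed a causal language model its own inputs as labels, read off the cross-entropy loss (the negative average log-probability of each token given the previous ones, i.e. exp(-(1/N) * sum_i log p(x_i | x_<i)) once exponentiated), and that exponentiated loss is the perplexity. The sketch below illustrates that recipe with the Transformers library; the checkpoint name "gpt2" and the example sentence are illustrative assumptions only and are not part of the reviewed subtitle file or this commit.

# Illustrative sketch only: perplexity as the exponentiated cross-entropy of a causal LM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any causal language model checkpoint would do
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

text = "Perplexity is a common metric to measure the performance of language models."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Passing the input ids as labels makes the model return the average
    # cross-entropy of predicting each token from the tokens before it.
    outputs = model(**inputs, labels=inputs["input_ids"])

cross_entropy = outputs.loss            # negative average log-probability per token
perplexity = torch.exp(cross_entropy)   # perplexity = exp(cross-entropy)
print(f"cross-entropy: {cross_entropy.item():.3f}  perplexity: {perplexity.item():.3f}")

As the subtitles note, a lower value is better, but both the loss and its exponentiated form are only weak proxies for generation quality, which is why metrics such as BLEU or ROUGE are preferred for generative tasks.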
