Add captions for tasks videos #464

Merged: 2 commits, Jan 5, 2023
6 changes: 3 additions & 3 deletions subtitles/README.md
@@ -37,16 +37,16 @@ For example, in the `zh-CN` subtitles, each block has the following format:
```
1
00:00:05,850 --> 00:00:07,713
- 欢迎来到 Hugging Face 课程。
- Welcome to the Hugging Face Course.
欢迎来到 Hugging Face 课程。
Welcome to the Hugging Face Course.
```

To upload the SRT file to YouTube, we need the subtitle in monolingual format, i.e. the above block should read:

```
1
00:00:05,850 --> 00:00:07,713
- 欢迎来到 Hugging Face 课程。
欢迎来到 Hugging Face 课程。
```

To handle this, we provide a script that converts the bilingual SRT files to monolingual ones. To perform the conversion, run:
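The conversion script's invocation is not shown in this diff. As an illustration of what such a conversion does, here is a hypothetical sketch (not the repository's actual script): each bilingual block keeps only its index, timestamp, and first text line.

```python
def bilingual_to_monolingual(srt_text: str) -> str:
    """Keep only the index, timestamp, and first (target-language) text line
    of each subtitle block, dropping the second-language line."""
    blocks = srt_text.strip().split("\n\n")
    out = []
    for block in blocks:
        lines = block.splitlines()
        # lines[0] = block index, lines[1] = timestamp, lines[2] = first text line
        out.append("\n".join(lines[:3]))
    return "\n\n".join(out) + "\n"
```

Running this over a bilingual file produces monolingual blocks like the one shown in the README example.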
7 changes: 7 additions & 0 deletions subtitles/en/metadata_tasks.csv
@@ -0,0 +1,7 @@
id,title,link,srt_filename
wVHdVlPScxA,🤗 Tasks: Token Classification,https://www.youtube.com/watch?v=wVHdVlPScxA&list=PLo2EIpI_JMQtyEr-sLJSy5_SnLCb4vtQf&index=1,subtitles/en/tasks_00_🤗-tasks-token-classification.srt
ajPx5LwJD-I,🤗 Tasks: Question Answering,https://www.youtube.com/watch?v=ajPx5LwJD-I&list=PLo2EIpI_JMQtyEr-sLJSy5_SnLCb4vtQf&index=2,subtitles/en/tasks_01_🤗-tasks-question-answering.srt
Vpjb1lu0MDk,🤗 Tasks: Causal Language Modeling,https://www.youtube.com/watch?v=Vpjb1lu0MDk&list=PLo2EIpI_JMQtyEr-sLJSy5_SnLCb4vtQf&index=3,subtitles/en/tasks_02_🤗-tasks-causal-language-modeling.srt
mqElG5QJWUg,🤗 Tasks: Masked Language Modeling,https://www.youtube.com/watch?v=mqElG5QJWUg&list=PLo2EIpI_JMQtyEr-sLJSy5_SnLCb4vtQf&index=4,subtitles/en/tasks_03_🤗-tasks-masked-language-modeling.srt
yHnr5Dk2zCI,🤗 Tasks: Summarization,https://www.youtube.com/watch?v=yHnr5Dk2zCI&list=PLo2EIpI_JMQtyEr-sLJSy5_SnLCb4vtQf&index=5,subtitles/en/tasks_04_🤗-tasks-summarization.srt
1JvfrvZgi6c,🤗 Tasks: Translation,https://www.youtube.com/watch?v=1JvfrvZgi6c&list=PLo2EIpI_JMQtyEr-sLJSy5_SnLCb4vtQf&index=6,subtitles/en/tasks_05_🤗-tasks-translation.srt
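The CSV above maps each YouTube video id to its title, playlist link, and SRT filename. A minimal sketch of consuming it with Python's `csv` module (the sample row below is abbreviated from the file above; the link column is shortened):

```python
import csv
import io

# Hypothetical consumption of metadata_tasks.csv: pair each video id
# with its SRT filename. Sample row abbreviated from the file above.
sample = (
    "id,title,link,srt_filename\n"
    "wVHdVlPScxA,🤗 Tasks: Token Classification,"
    "https://www.youtube.com/watch?v=wVHdVlPScxA,"
    "subtitles/en/tasks_00_🤗-tasks-token-classification.srt\n"
)
rows = list(csv.DictReader(io.StringIO(sample)))
for row in rows:
    print(row["id"], "->", row["srt_filename"])
```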
77 changes: 77 additions & 0 deletions subtitles/en/raw/tasks.md
@@ -0,0 +1,77 @@
Note: the following transcripts are associated with Merve Noyan's videos in the Hugging Face Tasks playlist: https://www.youtube.com/playlist?list=PLo2EIpI_JMQtyEr-sLJSy5_SnLCb4vtQf

Token Classification video

Welcome to the Hugging Face tasks series! In this video we’ll take a look at the token classification task.
Token classification is the task of assigning a label to each token in a sentence. There are various token classification tasks and the most common are Named Entity Recognition and Part-of-Speech Tagging.
Let’s take a quick look at the Named Entity Recognition task. The goal of this task is to find the entities in a piece of text, such as person, location, or organization. This task is formulated as labelling each token with one class for each entity, and another class for tokens that have no entity.
Another token classification task is part-of-speech tagging. The goal of this task is to label the words with a particular part of speech, such as noun, pronoun, adjective, verb and so on. This task is formulated as labelling each token with its part of speech.
Token classification models are evaluated on Accuracy, Recall, Precision and F1-Score. The metrics are calculated for each of the classes. We count true positives, false positives and false negatives to calculate precision and recall, and take their harmonic mean to get the F1-Score. Then we calculate it for every class and take the overall average to evaluate our model.
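The per-class computation described above can be sketched in plain Python (an illustrative sketch, not the evaluation code used for the course; libraries like seqeval handle entity-level subtleties):

```python
def macro_f1(true_labels, pred_labels):
    """Per-class precision/recall/F1 from TP/FP/FN counts,
    macro-averaged over all classes."""
    classes = set(true_labels) | set(pred_labels)
    f1s = []
    for c in classes:
        tp = sum(1 for t, p in zip(true_labels, pred_labels) if t == c and p == c)
        fp = sum(1 for t, p in zip(true_labels, pred_labels) if t != c and p == c)
        fn = sum(1 for t, p in zip(true_labels, pred_labels) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # F1 is the harmonic mean of precision and recall
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        f1s.append(f1)
    return sum(f1s) / len(f1s)
```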
An example dataset used for this task is CoNLL-2003. Here, each token belongs to a certain named entity class, denoted by the indices of the list containing the labels.
You can extract important information from invoices using named entity recognition models, such as date, organization name or address.
For more information about the Token classification task, check out the Hugging Face course.


Question Answering video

Welcome to the Hugging Face tasks series. In this video, we will take a look at the Question Answering task.
Question answering is the task of extracting an answer from a given document.
Question answering models take a context, which is the document you want to search in, and a question and return an answer. Note that the answer is not generated, but extracted from the context. This type of question answering is called extractive.
The task is evaluated on two metrics, exact match and F1-Score.
As the name implies, exact match looks for an exact match between the predicted answer and the correct answer.
A common metric used is the F1-Score, which is calculated over tokens that are predicted correctly and incorrectly. It is the harmonic mean of two metrics called precision and recall, which are widely used in classification problems.
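Exact match and a token-level F1 along these lines can be sketched as follows (a simplified illustration; SQuAD's official evaluation additionally normalizes answers, e.g. stripping articles and punctuation):

```python
from collections import Counter

def exact_match(pred: str, gold: str) -> int:
    """1 if the predicted answer matches the gold answer exactly (case-insensitive)."""
    return int(pred.strip().lower() == gold.strip().lower())

def token_f1(pred: str, gold: str) -> float:
    """Token-overlap F1: harmonic mean of token precision and recall."""
    pred_toks, gold_toks = pred.lower().split(), gold.lower().split()
    common = Counter(pred_toks) & Counter(gold_toks)  # clipped token overlap
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_toks)
    recall = overlap / len(gold_toks)
    return 2 * precision * recall / (precision + recall)
```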
An example dataset used for this task is called SQuAD. This dataset contains contexts, questions and the answers that are obtained from English Wikipedia articles.
You can use question answering models to automatically answer the questions asked by your customers. You simply need a document containing information about your business and query through that document with the questions asked by your customers.
For more information about the Question Answering task, check out the Hugging Face course.


Causal Language Modeling video

Welcome to the Hugging Face tasks series! In this video we’ll take a look at Causal Language Modeling.
Causal language modeling is the task of predicting the next
word in a sentence, given all the previous words. This task is very similar to the autocorrect function that you might have on your phone.
These models take a sequence to be completed and output the completed sequence.
Classification metrics can’t be used as there’s no single correct answer for completion. Instead, we evaluate the distribution of the text completed by the model.
A common metric to do so is the cross-entropy loss. Perplexity is also a widely used metric and it is calculated as the exponential of the cross-entropy loss.
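The relationship between cross-entropy and perplexity can be shown directly (a toy sketch using per-token negative log-likelihoods; real evaluations get these from the model's output probabilities):

```python
import math

def perplexity(token_nll):
    """Perplexity = exp(mean cross-entropy), given per-token
    negative log-likelihoods (natural log)."""
    cross_entropy = sum(token_nll) / len(token_nll)
    return math.exp(cross_entropy)
```

For example, if the model assigns each token a probability of 1/4, the per-token negative log-likelihood is log 4 and the perplexity is 4.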
You can use any dataset with plain text and tokenize the text to prepare the data.
Causal language models can be used to generate code.
For more information about the Causal Language Modeling task, check out the Hugging Face course.


Masked Language Modeling video

Welcome to the Hugging Face tasks series! In this video we’ll take a look at Masked Language Modeling.
Masked language modeling is the task of predicting which words should fill in the blanks of a sentence.
These models take a masked text as the input and output the possible values for that mask.
Masked language modeling is handy before fine-tuning your model for your task. For example, if you need to use a model in a specific domain, say, biomedical documents, models like BERT will treat your domain-specific words as rare tokens. If you train a masked language model using your biomedical corpus and then fine-tune your model on a downstream task, you will get better performance.
Classification metrics can’t be used as there’s no single correct answer to mask values. Instead, we evaluate the distribution of the mask values.
A common metric to do so is the cross-entropy loss. Perplexity is also a widely used metric and it is calculated as the exponential of the cross-entropy loss.
You can use any dataset with plain text and tokenize the text to mask the data.
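The data preparation step can be sketched as follows (an illustrative word-level sketch; real pipelines such as BERT's mask subword token ids, typically around 15% of them, and sometimes substitute random tokens instead of a mask):

```python
import random

def mask_tokens(tokens, mask_token="[MASK]", prob=0.15, seed=0):
    """Randomly replace ~prob of the tokens with a mask token.
    Returns the masked sequence and, per position, the original
    token to predict (None where the token was left unchanged)."""
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < prob:
            masked.append(mask_token)
            labels.append(tok)   # the model must predict the original token
        else:
            masked.append(tok)
            labels.append(None)  # ignored in the loss
    return masked, labels
```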
For more information about the Masked Language Modeling task, check out the Hugging Face course.


Summarization video

Welcome to the Hugging Face tasks series. In this video, we will take a look at the Text Summarization task.
Summarization is the task of producing a shorter version of a document while preserving the relevant and important information in the document.
Summarization models take a document to be summarized and output the summarized text.
This task is evaluated on the ROUGE score. It’s based on the overlap between the produced sequence and the correct sequence.
You might see this as ROUGE-1, which is the overlap of single tokens, and ROUGE-2, the overlap of consecutive token pairs. ROUGE-N refers to the overlap of n consecutive tokens. Here we see an example of how overlaps take place.
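A simplified, recall-oriented ROUGE-N overlap can be sketched as follows (illustrative only; the full ROUGE metric also has precision and F-measure variants, stemming, and ROUGE-L):

```python
from collections import Counter

def rouge_n_recall(candidate: str, reference: str, n: int = 1) -> float:
    """Fraction of the reference's n-grams that also appear in the candidate."""
    def ngrams(text):
        toks = text.lower().split()
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    cand, ref = ngrams(candidate), ngrams(reference)
    overlap = sum((cand & ref).values())  # clipped n-gram overlap
    total = sum(ref.values())
    return overlap / total if total else 0.0
```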
An example dataset used for this task is called Extreme Summarization, XSUM. This dataset contains texts and their summarized versions.
You can use summarization models to summarize research papers which would enable researchers to easily pick papers for their reading list.
For more information about the Summarization task, check out the Hugging Face course.


Translation video

Welcome to the Hugging Face tasks series. In this video, we will take a look at the Translation task.
Translation is the task of translating text from one language to another.
These models take a text in the source language and output the translation of that text in the target language.
The task is evaluated on the BLEU score.
The score ranges from 0 to 1, where 1 means the translation perfectly matches the reference and 0 means it does not match at all.
BLEU is calculated over consecutive tokens called n-grams: a unigram is a single token, a bigram is a pair of tokens, and an n-gram is n consecutive tokens.
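The clipped n-gram precision at the heart of BLEU can be sketched as follows (a simplified single-reference sketch, without BLEU's brevity penalty or the geometric mean over several n):

```python
from collections import Counter

def ngram_precision(candidate: str, reference: str, n: int) -> float:
    """Clipped n-gram precision: candidate n-grams are only credited
    up to the number of times they occur in the reference."""
    def ngrams(text):
        toks = text.lower().split()
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    cand, ref = ngrams(candidate), ngrams(reference)
    if not cand:
        return 0.0
    clipped = sum((cand & ref).values())
    return clipped / sum(cand.values())
```

Clipping is what stops a degenerate candidate like "the the the" from scoring perfectly against a reference containing "the" once.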
Machine translation datasets contain pairs of text in one language and its translation into another language.
These models can help you build conversational agents across different languages.
One option is to translate the training data used for the chatbot and train a separate chatbot.
Another option is to put translation models around your chatbot: translate user inputs from the user's language into the language your chatbot is trained on, do intent classification, then take the chatbot's output and translate it back into the user's language.
For more information about the Translation task, check out the Hugging Face course.
116 changes: 116 additions & 0 deletions subtitles/en/tasks_00_🤗-tasks-token-classification.srt
@@ -0,0 +1,116 @@
1
00:00:04,520 --> 00:00:07,400
Welcome to the Hugging Face tasks series!

2
00:00:07,400 --> 00:00:11,870
In this video we’ll take a look at the token
classification task.

3
00:00:11,870 --> 00:00:17,900
Token classification is the task of assigning
a label to each token in a sentence.

4
00:00:17,900 --> 00:00:23,310
There are various token classification tasks
and the most common are Named Entity Recognition

5
00:00:23,310 --> 00:00:26,430
and Part-of-Speech Tagging.

6
00:00:26,430 --> 00:00:31,640
Let’s take a quick look at the Named Entity
Recognition task.

7
00:00:31,640 --> 00:00:38,400
The goal of this task is to find the entities
in a piece of text, such as person, location,

8
00:00:38,400 --> 00:00:40,210
or organization.

9
00:00:40,210 --> 00:00:45,250
This task is formulated as labelling each
token with one class for each entity, and

10
00:00:45,250 --> 00:00:51,719
another class for tokens that have no entity.

11
00:00:51,719 --> 00:00:55,670
Another token classification task is part-of-speech
tagging.

12
00:00:55,670 --> 00:01:01,399
The goal of this task is to label the words
for a particular part of a speech, such as

13
00:01:01,399 --> 00:01:05,900
noun, pronoun, adjective, verb and so on.

14
00:01:05,900 --> 00:01:11,270
This task is formulated as labelling each
token with parts of speech.

15
00:01:11,270 --> 00:01:19,659
Token classification models are evaluated
on Accuracy, Recall, Precision and F1-Score.

16
00:01:19,659 --> 00:01:22,950
The metrics are calculated for each of the
classes.

17
00:01:22,950 --> 00:01:28,040
We calculate true positive, true negative
and false positives to calculate precision

18
00:01:28,040 --> 00:01:31,829
and recall, and take their harmonic mean to
get F1-Score.

19
00:01:31,829 --> 00:01:42,329
Then we calculate it for every class and take
the overall average to evaluate our model.

20
00:01:42,329 --> 00:01:45,680
An example dataset used for this task is ConLL2003.

21
00:01:45,680 --> 00:01:51,750
Here, each token belongs to a certain named
entity class, denoted as the indices of the

22
00:01:51,750 --> 00:01:55,380
list containing the labels.

23
00:01:55,380 --> 00:02:00,720
You can extract important information from
invoices using named entity recognition models,

24
00:02:00,720 --> 00:02:07,070
such as date, organization name or address.

25
00:02:07,070 --> 00:02:16,840
For more information about the Token classification
task, check out the Hugging Face course.
87 changes: 87 additions & 0 deletions subtitles/en/tasks_01_🤗-tasks-question-answering.srt
@@ -0,0 +1,87 @@
1
00:00:04,400 --> 00:00:06,480
Welcome to the Hugging Face tasks series.  

2
00:00:07,200 --> 00:00:10,080
In this video, we will take a look 
at the Question Answering task. 

3
00:00:13,120 --> 00:00:17,200
Question answering is the task of 
extracting an answer in a given document. 

4
00:00:21,120 --> 00:00:25,600
Question answering models take a context, 
which is the document you want to search in,  

5
00:00:26,240 --> 00:00:31,440
and a question and return an answer. 
Note that the answer is not generated,  

6
00:00:31,440 --> 00:00:37,600
but extracted from the context. This type 
of question answering is called extractive. 

7
00:00:42,320 --> 00:00:46,960
The task is evaluated on two 
metrics, exact match and F1-Score. 

8
00:00:49,680 --> 00:00:52,320
As the name implies, exact match looks for an  

9
00:00:52,320 --> 00:00:57,840
exact match between the predicted 
answer and the correct answer. 

10
00:01:00,080 --> 00:01:05,520
A common metric used is the F1-Score, which 
is calculated over tokens that are predicted  

11
00:01:05,520 --> 00:01:10,960
correctly and incorrectly. It is calculated 
over the average of two metrics called  

12
00:01:10,960 --> 00:01:16,560
precision and recall which are metrics that 
are used widely in classification problems. 

13
00:01:20,880 --> 00:01:28,240
An example dataset used for this task is called 
SQuAD. This dataset contains contexts, questions  

14
00:01:28,240 --> 00:01:32,080
and the answers that are obtained 
from English Wikipedia articles. 

15
00:01:35,440 --> 00:01:39,520
You can use question answering models to 
automatically answer the questions asked  

16
00:01:39,520 --> 00:01:46,480
by your customers. You simply need a document 
containing information about your business  

17
00:01:47,200 --> 00:01:53,840
and query through that document with 
the questions asked by your customers. 

18
00:01:55,680 --> 00:02:06,160
For more information about the Question Answering 
task, check out the Hugging Face course.
63 changes: 63 additions & 0 deletions subtitles/en/tasks_02_🤗-tasks-causal-language-modeling.srt
@@ -0,0 +1,63 @@
1
00:00:04,560 --> 00:00:06,640
Welcome to the Hugging Face tasks series!  

2
00:00:07,200 --> 00:00:10,400
In this video we’ll take a look 
at Causal Language Modeling. 

3
00:00:13,600 --> 00:00:16,880
Causal language modeling is 
the task of predicting the next 

4
00:00:16,880 --> 00:00:21,920
word in a sentence, given all the 
previous words. This task is very  

5
00:00:21,920 --> 00:00:29,920
similar to the autocorrect function 
that you might have on your phone. 

6
00:00:29,920 --> 00:00:34,720
These models take a sequence to be 
completed and outputs the complete sequence. 

7
00:00:38,640 --> 00:00:44,160
Classification metrics can’t be used as there’s 
no single correct answer for completion.  

8
00:00:44,960 --> 00:00:49,280
Instead, we evaluate the distribution 
of the text completed by the model. 

9
00:00:50,800 --> 00:00:55,440
A common metric to do so is the 
cross-entropy loss. Perplexity is  

10
00:00:55,440 --> 00:01:01,280
also a widely used metric and it is calculated 
as the exponential of the cross-entropy loss. 

11
00:01:05,200 --> 00:01:11,840
You can use any dataset with plain text 
and tokenize the text to prepare the data. 

12
00:01:15,040 --> 00:01:18,240
Causal language models can 
be used to generate code. 

13
00:01:22,480 --> 00:01:33,200
For more information about the Causal Language 
Modeling task, check out the Hugging Face course.