Commit

Merge branch 'huggingface:main' into main
haruki-N authored Dec 2, 2022
2 parents e2cadb3 + d0a93b9 commit bd00a2d
Showing 407 changed files with 65,894 additions and 3,387 deletions.
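
Most of the `.mdx` changes in this commit append an explicit `[[anchor]]` to each heading. As a rough illustration, here is a minimal sketch of how such anchors could be derived from the heading text. The slug rule is an assumption inferred from the before/after heading pairs visible in this diff, not doc-builder's actual implementation, and the helper names `heading_anchor` and `with_anchor` are hypothetical.

```python
import re


def heading_anchor(title: str) -> str:
    """Hypothetical helper: derive an anchor slug from heading text.

    Assumed rule (inferred from this diff, not from doc-builder's source):
    lowercase, drop everything except ASCII letters, digits, spaces and
    hyphens, then join the remaining words with hyphens.
    """
    lowered = title.lower()
    # Drop punctuation and emoji; keep letters, digits, whitespace, hyphens.
    kept = re.sub(r"[^a-z0-9\s-]", "", lowered)
    # Collapse runs of whitespace into single hyphens.
    return "-".join(kept.split())


def with_anchor(heading_line: str) -> str:
    """Append [[anchor]] to a Markdown heading line such as '## FAQ'."""
    hashes, _, title = heading_line.partition(" ")
    return f"{hashes} {title}[[{heading_anchor(title)}]]"


if __name__ == "__main__":
    for line in ("# Introduction", "## Welcome to the 🤗 Course!", "## FAQ"):
        print(with_anchor(line))
```

Under this assumed rule the sketch reproduces the pairs shown in the diff, for example `## Welcome to the 🤗 Course!` becomes `## Welcome to the 🤗 Course![[welcome-to-the-course]]`. Headings with non-ASCII text (as in the translated chapters) would need a different rule.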
5 changes: 1 addition & 4 deletions .github/workflows/build_pr_documentation.yml
@@ -9,7 +9,7 @@ concurrency:

jobs:
build:
uses: huggingface/doc-builder/.github/workflows/build_pr_documentation.yml@use_hf_hub
uses: huggingface/doc-builder/.github/workflows/build_pr_documentation.yml@main
with:
commit_sha: ${{ github.event.pull_request.head.sha }}
pr_number: ${{ github.event.number }}
@@ -18,6 +18,3 @@ jobs:
additional_args: --not_python_module
languages: ar bn de en es fa fr gj he hi id it ja ko pt ru th tr vi zh-CN zh-TW
hub_base_path: https://moon-ci-docs.huggingface.co
secrets:
token: ${{ secrets.HF_DOC_PUSH }}
comment_bot_token: ${{ secrets.HUGGINGFACE_PUSH }}
7 changes: 2 additions & 5 deletions .github/workflows/delete_doc_comment.yml
@@ -7,10 +7,7 @@ on:

jobs:
delete:
uses: huggingface/doc-builder/.github/workflows/delete_doc_comment.yml@use_hf_hub
uses: huggingface/doc-builder/.github/workflows/delete_doc_comment.yml@main
with:
pr_number: ${{ github.event.number }}
package: course
secrets:
token: ${{ secrets.HF_DOC_PUSH }}
comment_bot_token: ${{ secrets.HUGGINGFACE_PUSH }}
package: course
2 changes: 1 addition & 1 deletion README.md
@@ -18,7 +18,7 @@ This repo contains the content that's used to create the **[Hugging Face course]
| [Bahasa Indonesia](https://huggingface.co/course/id/chapter1/1) (WIP) | [`chapters/id`](https://github.com/huggingface/course/tree/main/chapters/id) | [@gstdl](https://github.com/gstdl) |
| [Italian](https://huggingface.co/course/it/chapter1/1) (WIP) | [`chapters/it`](https://github.com/huggingface/course/tree/main/chapters/it) | [@CaterinaBi](https://github.com/CaterinaBi), [@ClonedOne](https://github.com/ClonedOne), [@Nolanogenn](https://github.com/Nolanogenn), [@EdAbati](https://github.com/EdAbati), [@gdacciaro](https://github.com/gdacciaro) |
| [Japanese](https://huggingface.co/course/ja/chapter1/1) (WIP) | [`chapters/ja`](https://github.com/huggingface/course/tree/main/chapters/ja) | [@hiromu166](https://github.com/@hiromu166), [@younesbelkada](https://github.com/@younesbelkada), [@HiromuHota](https://github.com/@HiromuHota) |
| [Korean](https://huggingface.co/course/ko/chapter1/1) (WIP) | [`chapters/ko`](https://github.com/huggingface/course/tree/main/chapters/ko) | [@Doohae](https://github.com/Doohae) |
| [Korean](https://huggingface.co/course/ko/chapter1/1) (WIP) | [`chapters/ko`](https://github.com/huggingface/course/tree/main/chapters/ko) | [@Doohae](https://github.com/Doohae), [@wonhyeongseo](https://github.com/wonhyeongseo) |
| [Portuguese](https://huggingface.co/course/pt/chapter1/1) (WIP) | [`chapters/pt`](https://github.com/huggingface/course/tree/main/chapters/pt) | [@johnnv1](https://github.com/johnnv1), [@victorescosta](https://github.com/victorescosta), [@LincolnVS](https://github.com/LincolnVS) |
| [Russian](https://huggingface.co/course/ru/chapter1/1) (WIP) | [`chapters/ru`](https://github.com/huggingface/course/tree/main/chapters/ru) | [@pdumin](https://github.com/pdumin), [@svv73](https://github.com/svv73) |
| [Thai](https://huggingface.co/course/th/chapter1/1) (WIP) | [`chapters/th`](https://github.com/huggingface/course/tree/main/chapters/th) | [@peeraponw](https://github.com/peeraponw), [@a-krirk](https://github.com/a-krirk), [@jomariya23156](https://github.com/jomariya23156), [@ckingkan](https://github.com/ckingkan) |
8 changes: 4 additions & 4 deletions chapters/en/chapter0/1.mdx
@@ -1,4 +1,4 @@
# Introduction
# Introduction[[introduction]]

Welcome to the Hugging Face course! This introduction will guide you through setting up a working environment. If you're just starting the course, we recommend you first take a look at [Chapter 1](/course/chapter1), then come back and set up your environment so you can try the code yourself.

@@ -10,7 +10,7 @@ Note that we will not be covering the Windows system. If you're running on Windo

Most of the course relies on you having a Hugging Face account. We recommend creating one now: [create an account](https://huggingface.co/join).

## Using a Google Colab notebook
## Using a Google Colab notebook[[using-a-google-colab-notebook]]

Using a Colab notebook is the simplest possible setup; boot up a notebook in your browser and get straight to coding!

@@ -46,7 +46,7 @@ This installs a very light version of 🤗 Transformers. In particular, no speci

This will take a bit of time, but then you'll be ready to go for the rest of the course!

## Using a Python virtual environment
## Using a Python virtual environment[[using-a-python-virtual-environment]]

If you prefer to use a Python virtual environment, the first step is to install Python on your system. We recommend following [this guide](https://realpython.com/installing-python/) to get started.

@@ -99,7 +99,7 @@ which python
/home/<user>/transformers-course/.env/bin/python
```

### Installing dependencies
### Installing dependencies[[installing-dependencies]]

As in the previous section on using Google Colab instances, you'll now need to install the packages required to continue. Again, you can install the development version of 🤗 Transformers using the `pip` package manager:

10 changes: 5 additions & 5 deletions chapters/en/chapter1/1.mdx
@@ -1,18 +1,18 @@
# Introduction
# Introduction[[introduction]]

<CourseFloatingBanner
chapter={1}
classNames="absolute z-10 right-0 top-0"
/>

## Welcome to the 🤗 Course!
## Welcome to the 🤗 Course![[welcome-to-the-course]]

<Youtube id="00GKzGyWFEs" />

This course will teach you about natural language processing (NLP) using libraries from the [Hugging Face](https://huggingface.co/) ecosystem — [🤗 Transformers](https://github.com/huggingface/transformers), [🤗 Datasets](https://github.com/huggingface/datasets), [🤗 Tokenizers](https://github.com/huggingface/tokenizers), and [🤗 Accelerate](https://github.com/huggingface/accelerate) — as well as the [Hugging Face Hub](https://huggingface.co/models). It's completely free and without ads.


## What to expect?
## What to expect?[[what-to-expect]]

Here is a brief overview of the course:

@@ -33,7 +33,7 @@ This course:

After you've completed this course, we recommend checking out DeepLearning.AI's [Natural Language Processing Specialization](https://www.coursera.org/specializations/natural-language-processing?utm_source=deeplearning-ai&utm_medium=institutions&utm_campaign=20211011-nlp-2-hugging_face-page-nlp-refresh), which covers a wide range of traditional NLP models like naive Bayes and LSTMs that are well worth knowing about!

## Who are we?
## Who are we?[[who-are-we]]

About the authors:

@@ -55,7 +55,7 @@ About the authors:

**Leandro von Werra** is a machine learning engineer in the open-source team at Hugging Face and also a co-author of the O’Reilly book [Natural Language Processing with Transformers](https://www.oreilly.com/library/view/natural-language-processing/9781098136789/). He has several years of industry experience bringing NLP projects to production by working across the whole machine learning stack.

## FAQ
## FAQ[[faq]]

Here are some answers to frequently asked questions:

5 changes: 2 additions & 3 deletions chapters/en/chapter1/10.mdx
@@ -1,6 +1,6 @@
<!-- DISABLE-FRONTMATTER-SECTIONS -->

# End-of-chapter quiz
# End-of-chapter quiz[[end-of-chapter-quiz]]

<CourseFloatingBanner
chapter={1}
@@ -140,7 +140,6 @@ result = classifier("This is a course about the Transformers library")

### 6. True or false? A language model usually does not need labels for its pretraining.


<Question
choices={[
{
@@ -155,7 +154,7 @@ result = classifier("This is a course about the Transformers library")
]}
/>

### 7. Select the sentence that best describes the terms "model," "architecture," and "weights."
### 7. Select the sentence that best describes the terms "model", "architecture", and "weights".

<Question
choices={[
6 changes: 3 additions & 3 deletions chapters/en/chapter1/2.mdx
@@ -1,4 +1,4 @@
# Natural Language Processing
# Natural Language Processing[[natural-language-processing]]

<CourseFloatingBanner
chapter={1}
@@ -7,7 +7,7 @@

Before jumping into Transformer models, let's do a quick overview of what natural language processing is and why we care about it.

## What is NLP?
## What is NLP?[[what-is-nlp]]

NLP is a field of linguistics and machine learning focused on understanding everything related to human language. The aim of NLP tasks is not only to understand single words individually, but to be able to understand the context of those words.

@@ -21,6 +21,6 @@ The following is a list of common NLP tasks, with some examples of each:

NLP isn't limited to written text though. It also tackles complex challenges in speech recognition and computer vision, such as generating a transcript of an audio sample or a description of an image.

## Why is it challenging?
## Why is it challenging?[[why-is-it-challenging]]

Computers don't process information in the same way as humans. For example, when we read the sentence "I am hungry," we can easily understand its meaning. Similarly, given two sentences such as "I am hungry" and "I am sad," we're able to easily determine how similar they are. For machine learning (ML) models, such tasks are more difficult. The text needs to be processed in a way that enables the model to learn from it. And because language is complex, we need to think carefully about how this processing must be done. There has been a lot of research done on how to represent text, and we will look at some methods in the next chapter.
24 changes: 12 additions & 12 deletions chapters/en/chapter1/3.mdx
@@ -1,4 +1,4 @@
# Transformers, what can they do?
# Transformers, what can they do?[[transformers-what-can-they-do]]

<CourseFloatingBanner chapter={1}
classNames="absolute z-10 right-0 top-0"
@@ -15,7 +15,7 @@ In this section, we will look at what Transformer models can do and use our firs
If you want to run the examples locally, we recommend taking a look at the <a href="/course/chapter0">setup</a>.
</Tip>

## Transformers are everywhere!
## Transformers are everywhere![[transformers-are-everywhere]]

Transformer models are used to solve all kinds of NLP tasks, like the ones mentioned in the previous section. Here are some of the companies and organizations using Hugging Face and Transformer models, who also contribute back to the community by sharing their models:

@@ -29,7 +29,7 @@ The [🤗 Transformers library](https://github.com/huggingface/transformers) pro

Before diving into how Transformer models work under the hood, let's look at a few examples of how they can be used to solve some interesting NLP problems.

## Working with pipelines
## Working with pipelines[[working-with-pipelines]]

<Youtube id="tiZFewofSLM" />

@@ -82,7 +82,7 @@ Some of the currently [available pipelines](https://huggingface.co/transformers/

Let's have a look at a few of these!

## Zero-shot classification
## Zero-shot classification[[zero-shot-classification]]

We'll start by tackling a more challenging task where we need to classify texts that haven't been labelled. This is a common scenario in real-world projects because annotating text is usually time-consuming and requires domain expertise. For this use case, the `zero-shot-classification` pipeline is very powerful: it allows you to specify which labels to use for the classification, so you don't have to rely on the labels of the pretrained model. You've already seen how the model can classify a sentence as positive or negative using those two labels — but it can also classify the text using any other set of labels you like.

@@ -111,7 +111,7 @@ This pipeline is called _zero-shot_ because you don't need to fine-tune the mode
</Tip>


## Text generation
## Text generation[[text-generation]]

Now let's see how to use a pipeline to generate some text. The main idea here is that you provide a prompt and the model will auto-complete it by generating the remaining text. This is similar to the predictive text feature that is found on many phones. Text generation involves randomness, so it's normal if you don't get the same results as shown below.

@@ -139,7 +139,7 @@ You can control how many different sequences are generated with the argument `nu
</Tip>


## Using any model from the Hub in a pipeline
## Using any model from the Hub in a pipeline[[using-any-model-from-the-hub-in-a-pipeline]]

The previous examples used the default model for the task at hand, but you can also choose a particular model from the Hub to use in a pipeline for a specific task — say, text generation. Go to the [Model Hub](https://huggingface.co/models) and click on the corresponding tag on the left to display only the supported models for that task. You should get to a page like [this one](https://huggingface.co/models?pipeline_tag=text-generation).

@@ -174,13 +174,13 @@ Once you select a model by clicking on it, you'll see that there is a widget ena

</Tip>

### The Inference API
### The Inference API[[the-inference-api]]

All the models can be tested directly through your browser using the Inference API, which is available on the Hugging Face [website](https://huggingface.co/). You can play with the model directly on this page by inputting custom text and watching the model process the input data.

The Inference API that powers the widget is also available as a paid product, which comes in handy if you need it for your workflows. See the [pricing page](https://huggingface.co/pricing) for more details.

## Mask filling
## Mask filling[[mask-filling]]

The next pipeline you'll try is `fill-mask`. The idea of this task is to fill in the blanks in a given text:

@@ -210,7 +210,7 @@ The `top_k` argument controls how many possibilities you want to be displayed. N

</Tip>

## Named entity recognition
## Named entity recognition[[named-entity-recognition]]

Named entity recognition (NER) is a task where the model has to find which parts of the input text correspond to entities such as persons, locations, or organizations. Let's look at an example:

@@ -238,7 +238,7 @@ We pass the option `grouped_entities=True` in the pipeline creation function to

</Tip>

## Question answering
## Question answering[[question-answering]]

The `question-answering` pipeline answers questions using information from a given context:

@@ -258,7 +258,7 @@ question_answerer(

Note that this pipeline works by extracting information from the provided context; it does not generate the answer.

## Summarization
## Summarization[[summarization]]

Summarization is the task of reducing a text into a shorter text while keeping all (or most) of the important aspects referenced in the text. Here's an example:

@@ -303,7 +303,7 @@ summarizer(
Like with text generation, you can specify a `max_length` or a `min_length` for the result.


## Translation
## Translation[[translation]]

For translation, you can use a default model if you provide a language pair in the task name (such as `"translation_en_to_fr"`), but the easiest way is to pick the model you want to use on the [Model Hub](https://huggingface.co/models). Here we'll try translating from French to English:
