diff --git a/.github/workflows/build_pr_documentation.yml b/.github/workflows/build_pr_documentation.yml
index 5ae7db12d..45e3b3e09 100644
--- a/.github/workflows/build_pr_documentation.yml
+++ b/.github/workflows/build_pr_documentation.yml
@@ -9,7 +9,7 @@ concurrency:
 jobs:
   build:
-    uses: huggingface/doc-builder/.github/workflows/build_pr_documentation.yml@use_hf_hub
+    uses: huggingface/doc-builder/.github/workflows/build_pr_documentation.yml@main
     with:
       commit_sha: ${{ github.event.pull_request.head.sha }}
       pr_number: ${{ github.event.number }}
@@ -18,6 +18,3 @@ jobs:
       additional_args: --not_python_module
       languages: ar bn de en es fa fr gj he hi id it ja ko pt ru th tr vi zh-CN zh-TW
       hub_base_path: https://moon-ci-docs.huggingface.co
-    secrets:
-      token: ${{ secrets.HF_DOC_PUSH }}
-      comment_bot_token: ${{ secrets.HUGGINGFACE_PUSH }}
diff --git a/.github/workflows/delete_doc_comment.yml b/.github/workflows/delete_doc_comment.yml
index 0ec59d485..9ec2aaf44 100644
--- a/.github/workflows/delete_doc_comment.yml
+++ b/.github/workflows/delete_doc_comment.yml
@@ -7,10 +7,7 @@ on:
 jobs:
   delete:
-    uses: huggingface/doc-builder/.github/workflows/delete_doc_comment.yml@use_hf_hub
+    uses: huggingface/doc-builder/.github/workflows/delete_doc_comment.yml@main
     with:
       pr_number: ${{ github.event.number }}
-      package: course
-    secrets:
-      token: ${{ secrets.HF_DOC_PUSH }}
-      comment_bot_token: ${{ secrets.HUGGINGFACE_PUSH }}
\ No newline at end of file
+      package: course
\ No newline at end of file
diff --git a/README.md b/README.md
index db20052fa..67b42d9b4 100644
--- a/README.md
+++ b/README.md
@@ -18,7 +18,7 @@ This repo contains the content that's used to create the **[Hugging Face course]
 | [Bahasa Indonesia](https://huggingface.co/course/id/chapter1/1) (WIP) | [`chapters/id`](https://github.com/huggingface/course/tree/main/chapters/id) | [@gstdl](https://github.com/gstdl) |
 | [Italian](https://huggingface.co/course/it/chapter1/1) (WIP) | [`chapters/it`](https://github.com/huggingface/course/tree/main/chapters/it) | [@CaterinaBi](https://github.com/CaterinaBi), [@ClonedOne](https://github.com/ClonedOne), [@Nolanogenn](https://github.com/Nolanogenn), [@EdAbati](https://github.com/EdAbati), [@gdacciaro](https://github.com/gdacciaro) |
 | [Japanese](https://huggingface.co/course/ja/chapter1/1) (WIP) | [`chapters/ja`](https://github.com/huggingface/course/tree/main/chapters/ja) | [@hiromu166](https://github.com/@hiromu166), [@younesbelkada](https://github.com/@younesbelkada), [@HiromuHota](https://github.com/@HiromuHota) |
-| [Korean](https://huggingface.co/course/ko/chapter1/1) (WIP) | [`chapters/ko`](https://github.com/huggingface/course/tree/main/chapters/ko) | [@Doohae](https://github.com/Doohae) |
+| [Korean](https://huggingface.co/course/ko/chapter1/1) (WIP) | [`chapters/ko`](https://github.com/huggingface/course/tree/main/chapters/ko) | [@Doohae](https://github.com/Doohae), [@wonhyeongseo](https://github.com/wonhyeongseo) |
 | [Portuguese](https://huggingface.co/course/pt/chapter1/1) (WIP) | [`chapters/pt`](https://github.com/huggingface/course/tree/main/chapters/pt) | [@johnnv1](https://github.com/johnnv1), [@victorescosta](https://github.com/victorescosta), [@LincolnVS](https://github.com/LincolnVS) |
 | [Russian](https://huggingface.co/course/ru/chapter1/1) (WIP) | [`chapters/ru`](https://github.com/huggingface/course/tree/main/chapters/ru) | [@pdumin](https://github.com/pdumin), [@svv73](https://github.com/svv73) |
 | [Thai](https://huggingface.co/course/th/chapter1/1) (WIP) | [`chapters/th`](https://github.com/huggingface/course/tree/main/chapters/th) | [@peeraponw](https://github.com/peeraponw), [@a-krirk](https://github.com/a-krirk), [@jomariya23156](https://github.com/jomariya23156), [@ckingkan](https://github.com/ckingkan) |
diff --git a/chapters/en/chapter0/1.mdx b/chapters/en/chapter0/1.mdx
index 6ab7c8e23..0f8bac262 100644
--- a/chapters/en/chapter0/1.mdx
+++ b/chapters/en/chapter0/1.mdx
@@ -1,4 +1,4 @@
-# Introduction
+# Introduction[[introduction]]
 Welcome to the Hugging Face course! This introduction will guide you through setting up a working environment. If you're just starting the course, we recommend you first take a look at [Chapter 1](/course/chapter1), then come back and set up your environment so you can try the code yourself.
@@ -10,7 +10,7 @@ Note that we will not be covering the Windows system. If you're running on Windo
 Most of the course relies on you having a Hugging Face account. We recommend creating one now: [create an account](https://huggingface.co/join).
-## Using a Google Colab notebook
+## Using a Google Colab notebook[[using-a-google-colab-notebook]]
 Using a Colab notebook is the simplest possible setup; boot up a notebook in your browser and get straight to coding!
@@ -46,7 +46,7 @@ This installs a very light version of 🤗 Transformers. In particular, no speci
 This will take a bit of time, but then you'll be ready to go for the rest of the course!
-## Using a Python virtual environment
+## Using a Python virtual environment[[using-a-python-virtual-environment]]
 If you prefer to use a Python virtual environment, the first step is to install Python on your system. We recommend following [this guide](https://realpython.com/installing-python/) to get started.
@@ -99,7 +99,7 @@ which python
 /home//transformers-course/.env/bin/python
 ```
-### Installing dependencies
+### Installing dependencies[[installing-dependencies]]
 As in the previous section on using Google Colab instances, you'll now need to install the packages required to continue. Again, you can install the development version of 🤗 Transformers using the `pip` package manager:
diff --git a/chapters/en/chapter1/1.mdx b/chapters/en/chapter1/1.mdx
index 2c66a5250..e828633c7 100644
--- a/chapters/en/chapter1/1.mdx
+++ b/chapters/en/chapter1/1.mdx
@@ -1,18 +1,18 @@
-# Introduction
+# Introduction[[introduction]]
-## Welcome to the 🤗 Course!
+## Welcome to the 🤗 Course![[welcome-to-the-course]]
 This course will teach you about natural language processing (NLP) using libraries from the [Hugging Face](https://huggingface.co/) ecosystem — [🤗 Transformers](https://github.com/huggingface/transformers), [🤗 Datasets](https://github.com/huggingface/datasets), [🤗 Tokenizers](https://github.com/huggingface/tokenizers), and [🤗 Accelerate](https://github.com/huggingface/accelerate) — as well as the [Hugging Face Hub](https://huggingface.co/models). It's completely free and without ads.
-## What to expect?
+## What to expect?[[what-to-expect]]
 Here is a brief overview of the course:
@@ -33,7 +33,7 @@ This course:
 After you've completed this course, we recommend checking out DeepLearning.AI's [Natural Language Processing Specialization](https://www.coursera.org/specializations/natural-language-processing?utm_source=deeplearning-ai&utm_medium=institutions&utm_campaign=20211011-nlp-2-hugging_face-page-nlp-refresh), which covers a wide range of traditional NLP models like naive Bayes and LSTMs that are well worth knowing about!
-## Who are we?
+## Who are we?[[who-are-we]]
 About the authors:
@@ -55,7 +55,7 @@ About the authors:
 **Leandro von Werra** is a machine learning engineer in the open-source team at Hugging Face and also a co-author of the O’Reilly book [Natural Language Processing with Transformers](https://www.oreilly.com/library/view/natural-language-processing/9781098136789/). He has several years of industry experience bringing NLP projects to production by working across the whole machine learning stack..
-## FAQ
+## FAQ[[faq]]
 Here are some answers to frequently asked questions:
diff --git a/chapters/en/chapter1/10.mdx b/chapters/en/chapter1/10.mdx
index 7c1b22080..cb0ca145c 100644
--- a/chapters/en/chapter1/10.mdx
+++ b/chapters/en/chapter1/10.mdx
@@ -1,6 +1,6 @@
-# End-of-chapter quiz
+# End-of-chapter quiz[[end-of-chapter-quiz]]
-### 7. Select the sentence that best describes the terms "model," "architecture," and "weights."
+### 7. Select the sentence that best describes the terms "model", "architecture", and "weights".
 setup.
-## Transformers are everywhere!
+## Transformers are everywhere![[transformers-are-everywhere]]
 Transformer models are used to solve all kinds of NLP tasks, like the ones mentioned in the previous section. Here are some of the companies and organizations using Hugging Face and Transformer models, who also contribute back to the community by sharing their models:
@@ -29,7 +29,7 @@ The [🤗 Transformers library](https://github.com/huggingface/transformers) pro
 Before diving into how Transformer models work under the hood, let's look at a few examples of how they can be used to solve some interesting NLP problems.
-## Working with pipelines
+## Working with pipelines[[working-with-pipelines]]
@@ -82,7 +82,7 @@ Some of the currently [available pipelines](https://huggingface.co/transformers/
 Let's have a look at a few of these!
-## Zero-shot classification
+## Zero-shot classification[[zero-shot-classification]]
 We'll start by tackling a more challenging task where we need to classify texts that haven't been labelled. This is a common scenario in real-world projects because annotating text is usually time-consuming and requires domain expertise. For this use case, the `zero-shot-classification` pipeline is very powerful: it allows you to specify which labels to use for the classification, so you don't have to rely on the labels of the pretrained model. You've already seen how the model can classify a sentence as positive or negative using those two labels — but it can also classify the text using any other set of labels you like.
@@ -111,7 +111,7 @@ This pipeline is called _zero-shot_ because you don't need to fine-tune the mode
-## Text generation
+## Text generation[[text-generation]]
 Now let's see how to use a pipeline to generate some text. The main idea here is that you provide a prompt and the model will auto-complete it by generating the remaining text. This is similar to the predictive text feature that is found on many phones. Text generation involves randomness, so it's normal if you don't get the same results as shown below.
@@ -139,7 +139,7 @@ You can control how many different sequences are generated with the argument `nu
-## Using any model from the Hub in a pipeline
+## Using any model from the Hub in a pipeline[[using-any-model-from-the-hub-in-a-pipeline]]
 The previous examples used the default model for the task at hand, but you can also choose a particular model from the Hub to use in a pipeline for a specific task — say, text generation. Go to the [Model Hub](https://huggingface.co/models) and click on the corresponding tag on the left to display only the supported models for that task. You should get to a page like [this one](https://huggingface.co/models?pipeline_tag=text-generation).
@@ -174,13 +174,13 @@ Once you select a model by clicking on it, you'll see that there is a widget ena
-### The Inference API
+### The Inference API[[the-inference-api]]
 All the models can be tested directly through your browser using the Inference API, which is available on the Hugging Face [website](https://huggingface.co/). You can play with the model directly on this page by inputting custom text and watching the model process the input data. The Inference API that powers the widget is also available as a paid product, which comes in handy if you need it for your workflows. See the [pricing page](https://huggingface.co/pricing) for more details.
-## Mask filling
+## Mask filling[[mask-filling]]
 The next pipeline you'll try is `fill-mask`. The idea of this task is to fill in the blanks in a given text:
@@ -210,7 +210,7 @@ The `top_k` argument controls how many possibilities you want to be displayed. N
-## Named entity recognition
+## Named entity recognition[[named-entity-recognition]]
 Named entity recognition (NER) is a task where the model has to find which parts of the input text correspond to entities such as persons, locations, or organizations. Let's look at an example:
@@ -238,7 +238,7 @@ We pass the option `grouped_entities=True` in the pipeline creation function to
-## Question answering
+## Question answering[[question-answering]]
 The `question-answering` pipeline answers questions using information from a given context:
@@ -258,7 +258,7 @@ question_answerer(
 Note that this pipeline works by extracting information from the provided context; it does not generate the answer.
-## Summarization
+## Summarization[[summarization]]
 Summarization is the task of reducing a text into a shorter text while keeping all (or most) of the important aspects referenced in the text. Here's an example:
@@ -303,7 +303,7 @@ summarizer(
 Like with text generation, you can specify a `max_length` or a `min_length` for the result.
-## Translation
+## Translation[[translation]]
 For translation, you can use a default model if you provide a language pair in the task name (such as `"translation_en_to_fr"`), but the easiest way is to pick the model you want to use on the [Model Hub](https://huggingface.co/models). Here we'll try translating from French to English:
diff --git a/chapters/en/chapter1/4.mdx b/chapters/en/chapter1/4.mdx
index 6792a8a57..7097771f9 100644
--- a/chapters/en/chapter1/4.mdx
+++ b/chapters/en/chapter1/4.mdx
@@ -1,4 +1,4 @@
-# How do Transformers work?
+# How do Transformers work?[[how-do-transformers-work]]
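The `zero-shot-classification` pipeline discussed in the `chapters/en/chapter1/3.mdx` context above can be sketched as follows. This is a minimal sketch, assuming 🤗 Transformers is installed with a deep-learning backend such as PyTorch; the example text and candidate labels are illustrative, and with no model argument the pipeline downloads a default checkpoint on first use:

```python
from transformers import pipeline

# Build a zero-shot classifier; without an explicit model, the pipeline
# selects a default checkpoint, which is downloaded on first use.
classifier = pipeline("zero-shot-classification")

# Candidate labels are supplied at inference time, so no fine-tuning on
# these labels is needed -- this is what makes the task "zero-shot".
result = classifier(
    "This is a course about the Transformers library",
    candidate_labels=["education", "politics", "business"],
)

# The result pairs each candidate label with a probability; the
# probabilities over the candidate labels sum to 1.
print(result["labels"])
print(result["scores"])
```

The same `pipeline()` entry point covers the other tasks quoted above (`text-generation`, `fill-mask`, `question-answering`, `summarization`, and so on) by changing the task name.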