Commit

zh-CN - Chapter 6 finished
yaoqih committed Aug 1, 2022
1 parent aebb46e commit e69fce2
Showing 46 changed files with 3,527 additions and 119 deletions.
3 changes: 1 addition & 2 deletions chapters/de/chapter3/3_tf.mdx
@@ -85,8 +85,7 @@ model.compile(
metrics=["accuracy"],
)
model.fit(
- tf_train_dataset,
- validation_data=tf_validation_dataset,
+ tf_train_dataset, validation_data=tf_validation_dataset,
)
```

4 changes: 1 addition & 3 deletions chapters/en/chapter1/3.mdx
@@ -150,9 +150,7 @@ from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")
generator(
- "In this course, we will teach you how to",
- max_length=30,
- num_return_sequences=2,
+ "In this course, we will teach you how to", max_length=30, num_return_sequences=2,
)
```

5 changes: 1 addition & 4 deletions chapters/en/chapter2/2.mdx
@@ -39,10 +39,7 @@ from transformers import pipeline

classifier = pipeline("sentiment-analysis")
classifier(
- [
- "I've been waiting for a HuggingFace course my whole life.",
- "I hate this so much!",
- ]
+ ["I've been waiting for a HuggingFace course my whole life.", "I hate this so much!",]
)
```

3 changes: 1 addition & 2 deletions chapters/en/chapter3/3_tf.mdx
@@ -85,8 +85,7 @@ model.compile(
metrics=["accuracy"],
)
model.fit(
- tf_train_dataset,
- validation_data=tf_validation_dataset,
+ tf_train_dataset, validation_data=tf_validation_dataset,
)
```

2 changes: 1 addition & 1 deletion chapters/en/chapter5/4.mdx
@@ -88,7 +88,7 @@ Here the `rss` attribute refers to the _resident set size_, which is the fractio

```py
print(f"Number of files in dataset : {pubmed_dataset.dataset_size}")
- size_gb = pubmed_dataset.dataset_size / (1024**3)
+ size_gb = pubmed_dataset.dataset_size / (1024 ** 3)
print(f"Dataset size (cache file) : {size_gb:.2f} GB")
```

4 changes: 1 addition & 3 deletions chapters/en/chapter6/8.mdx
@@ -404,9 +404,7 @@ Great! Now that we're done, we can save the tokenizer like before, and wrap it i
from transformers import PreTrainedTokenizerFast

wrapped_tokenizer = PreTrainedTokenizerFast(
- tokenizer_object=tokenizer,
- bos_token="<|endoftext|>",
- eos_token="<|endoftext|>",
+ tokenizer_object=tokenizer, bos_token="<|endoftext|>", eos_token="<|endoftext|>",
)
```

17 changes: 4 additions & 13 deletions chapters/en/chapter7/2.mdx
@@ -413,9 +413,7 @@ Now we can just pass them to the `TFAutoModelForTokenClassification.from_pretrai
from transformers import TFAutoModelForTokenClassification

model = TFAutoModelForTokenClassification.from_pretrained(
- model_checkpoint,
- id2label=id2label,
- label2id=label2id,
+ model_checkpoint, id2label=id2label, label2id=label2id,
)
```

@@ -663,9 +661,7 @@ Now we can just pass them to the `AutoModelForTokenClassification.from_pretraine
from transformers import AutoModelForTokenClassification

model = AutoModelForTokenClassification.from_pretrained(
- model_checkpoint,
- id2label=id2label,
- label2id=label2id,
+ model_checkpoint, id2label=id2label, label2id=label2id,
)
```

@@ -774,10 +770,7 @@ First we need to build the `DataLoader`s from our datasets. We'll reuse our `dat
from torch.utils.data import DataLoader

train_dataloader = DataLoader(
- tokenized_datasets["train"],
- shuffle=True,
- collate_fn=data_collator,
- batch_size=8,
+ tokenized_datasets["train"], shuffle=True, collate_fn=data_collator, batch_size=8,
)
eval_dataloader = DataLoader(
tokenized_datasets["validation"], collate_fn=data_collator, batch_size=8
@@ -788,9 +781,7 @@ Next we reinstantiate our model, to make sure we're not continuing the fine-tuni

```py
model = AutoModelForTokenClassification.from_pretrained(
- model_checkpoint,
- id2label=id2label,
- label2id=label2id,
+ model_checkpoint, id2label=id2label, label2id=label2id,
)
```

5 changes: 1 addition & 4 deletions chapters/en/chapter7/4.mdx
@@ -795,10 +795,7 @@ from torch.utils.data import DataLoader

tokenized_datasets.set_format("torch")
train_dataloader = DataLoader(
- tokenized_datasets["train"],
- shuffle=True,
- collate_fn=data_collator,
- batch_size=8,
+ tokenized_datasets["train"], shuffle=True, collate_fn=data_collator, batch_size=8,
)
eval_dataloader = DataLoader(
tokenized_datasets["validation"], collate_fn=data_collator, batch_size=8
3 changes: 1 addition & 2 deletions chapters/en/chapter7/5.mdx
@@ -928,8 +928,7 @@ for epoch in range(num_train_epochs):
for step, batch in enumerate(eval_dataloader):
with torch.no_grad():
generated_tokens = accelerator.unwrap_model(model).generate(
- batch["input_ids"],
- attention_mask=batch["attention_mask"],
+ batch["input_ids"], attention_mask=batch["attention_mask"],
)

generated_tokens = accelerator.pad_across_processes(
5 changes: 1 addition & 4 deletions chapters/en/chapter7/7.mdx
@@ -1029,10 +1029,7 @@ validation_set = validation_dataset.remove_columns(["example_id", "offset_mappin
validation_set.set_format("torch")

train_dataloader = DataLoader(
- train_dataset,
- shuffle=True,
- collate_fn=default_data_collator,
- batch_size=8,
+ train_dataset, shuffle=True, collate_fn=default_data_collator, batch_size=8,
)
eval_dataloader = DataLoader(
validation_set, collate_fn=default_data_collator, batch_size=8
4 changes: 1 addition & 3 deletions chapters/es/chapter1/3.mdx
@@ -153,9 +153,7 @@ from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")
generator(
- "In this course, we will teach you how to",
- max_length=30,
- num_return_sequences=2,
+ "In this course, we will teach you how to", max_length=30, num_return_sequences=2,
)
```

5 changes: 1 addition & 4 deletions chapters/fa/chapter2/2.mdx
@@ -43,10 +43,7 @@ from transformers import pipeline

classifier = pipeline("sentiment-analysis")
classifier(
- [
- "I've been waiting for a HuggingFace course my whole life.",
- "I hate this so much!",
- ]
+ ["I've been waiting for a HuggingFace course my whole life.", "I hate this so much!",]
)
```

4 changes: 1 addition & 3 deletions chapters/hi/chapter1/3.mdx
@@ -166,9 +166,7 @@ from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")
generator(
- "In this course, we will teach you how to",
- max_length=30,
- num_return_sequences=2,
+ "In this course, we will teach you how to", max_length=30, num_return_sequences=2,
)
```

3 changes: 1 addition & 2 deletions chapters/hi/chapter3/3_tf.mdx
@@ -85,8 +85,7 @@ model.compile(
metrics=["accuracy"],
)
model.fit(
- tf_train_dataset,
- validation_data=tf_validation_dataset,
+ tf_train_dataset, validation_data=tf_validation_dataset,
)
```

4 changes: 1 addition & 3 deletions chapters/it/chapter1/3.mdx
@@ -150,9 +150,7 @@ from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")
generator(
- "In this course, we will teach you how to",
- max_length=30,
- num_return_sequences=2,
+ "In this course, we will teach you how to", max_length=30, num_return_sequences=2,
)
```

17 changes: 4 additions & 13 deletions chapters/ja/chapter7/2.mdx
@@ -419,9 +419,7 @@ label2id = {v: k for k, v in id2label.items()}
from transformers import TFAutoModelForTokenClassification

model = TFAutoModelForTokenClassification.from_pretrained(
- model_checkpoint,
- id2label=id2label,
- label2id=label2id,
+ model_checkpoint, id2label=id2label, label2id=label2id,
)
```

@@ -685,9 +683,7 @@ label2id = {v: k for k, v in id2label.items()}
from transformers import AutoModelForTokenClassification

model = AutoModelForTokenClassification.from_pretrained(
- model_checkpoint,
- id2label=id2label,
- label2id=label2id,
+ model_checkpoint, id2label=id2label, label2id=label2id,
)
```

@@ -806,10 +802,7 @@ trainer.push_to_hub(commit_message="Training complete")
from torch.utils.data import DataLoader

train_dataloader = DataLoader(
- tokenized_datasets["train"],
- shuffle=True,
- collate_fn=data_collator,
- batch_size=8,
+ tokenized_datasets["train"], shuffle=True, collate_fn=data_collator, batch_size=8,
)
eval_dataloader = DataLoader(
tokenized_datasets["validation"], collate_fn=data_collator, batch_size=8
@@ -820,9 +813,7 @@ eval_dataloader = DataLoader(

```py
model = AutoModelForTokenClassification.from_pretrained(
- model_checkpoint,
- id2label=id2label,
- label2id=label2id,
+ model_checkpoint, id2label=id2label, label2id=label2id,
)
```

5 changes: 1 addition & 4 deletions chapters/ja/chapter7/4.mdx
@@ -817,10 +817,7 @@ from torch.utils.data import DataLoader

tokenized_datasets.set_format("torch")
train_dataloader = DataLoader(
- tokenized_datasets["train"],
- shuffle=True,
- collate_fn=data_collator,
- batch_size=8,
+ tokenized_datasets["train"], shuffle=True, collate_fn=data_collator, batch_size=8,
)
eval_dataloader = DataLoader(
tokenized_datasets["validation"], collate_fn=data_collator, batch_size=8
3 changes: 1 addition & 2 deletions chapters/ja/chapter7/5.mdx
@@ -940,8 +940,7 @@ for epoch in range(num_train_epochs):
for step, batch in enumerate(eval_dataloader):
with torch.no_grad():
generated_tokens = accelerator.unwrap_model(model).generate(
- batch["input_ids"],
- attention_mask=batch["attention_mask"],
+ batch["input_ids"], attention_mask=batch["attention_mask"],
)

generated_tokens = accelerator.pad_across_processes(
5 changes: 1 addition & 4 deletions chapters/ja/chapter7/7.mdx
@@ -1039,10 +1039,7 @@ validation_set = validation_dataset.remove_columns(["example_id", "offset_mappin
validation_set.set_format("torch")

train_dataloader = DataLoader(
- train_dataset,
- shuffle=True,
- collate_fn=default_data_collator,
- batch_size=8,
+ train_dataset, shuffle=True, collate_fn=default_data_collator, batch_size=8,
)
eval_dataloader = DataLoader(
validation_set, collate_fn=default_data_collator, batch_size=8
4 changes: 1 addition & 3 deletions chapters/ko/chapter1/3.mdx
@@ -150,9 +150,7 @@ from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")
generator(
- "In this course, we will teach you how to",
- max_length=30,
- num_return_sequences=2,
+ "In this course, we will teach you how to", max_length=30, num_return_sequences=2,
)
```

4 changes: 1 addition & 3 deletions chapters/pt/chapter1/3.mdx
@@ -152,9 +152,7 @@ from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")
generator(
- "In this course, we will teach you how to",
- max_length=30,
- num_return_sequences=2,
+ "In this course, we will teach you how to", max_length=30, num_return_sequences=2,
)
```

5 changes: 1 addition & 4 deletions chapters/pt/chapter2/2.mdx
@@ -39,10 +39,7 @@ from transformers import pipeline

classifier = pipeline("sentiment-analysis")
classifier(
- [
- "I've been waiting for a HuggingFace course my whole life.",
- "I hate this so much!",
- ]
+ ["I've been waiting for a HuggingFace course my whole life.", "I hate this so much!",]
)
```

2 changes: 1 addition & 1 deletion chapters/pt/chapter5/4.mdx
@@ -88,7 +88,7 @@ Aqui o atributo `rss` refere-se ao _tamanho do conjunto residente_, que é a fra

```py
print(f"Number of files in dataset : {pubmed_dataset.dataset_size}")
- size_gb = pubmed_dataset.dataset_size / (1024**3)
+ size_gb = pubmed_dataset.dataset_size / (1024 ** 3)
print(f"Dataset size (cache file) : {size_gb:.2f} GB")
```

4 changes: 1 addition & 3 deletions chapters/ru/chapter1/3.mdx
@@ -153,9 +153,7 @@ from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")
generator(
- "In this course, we will teach you how to",
- max_length=30,
- num_return_sequences=2,
+ "In this course, we will teach you how to", max_length=30, num_return_sequences=2,
)
```

5 changes: 1 addition & 4 deletions chapters/ru/chapter2/2.mdx
@@ -39,10 +39,7 @@ from transformers import pipeline

classifier = pipeline("sentiment-analysis")
classifier(
- [
- "I've been waiting for a HuggingFace course my whole life.",
- "I hate this so much!",
- ]
+ ["I've been waiting for a HuggingFace course my whole life.", "I hate this so much!",]
)
```

3 changes: 1 addition & 2 deletions chapters/ru/chapter3/3_tf.mdx
@@ -85,8 +85,7 @@ model.compile(
metrics=["accuracy"],
)
model.fit(
- tf_train_dataset,
- validation_data=tf_validation_dataset,
+ tf_train_dataset, validation_data=tf_validation_dataset,
)
```

4 changes: 1 addition & 3 deletions chapters/th/chapter1/3.mdx
@@ -151,9 +151,7 @@ from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")
generator(
- "In this course, we will teach you how to",
- max_length=30,
- num_return_sequences=2,
+ "In this course, we will teach you how to", max_length=30, num_return_sequences=2,
)
```

5 changes: 1 addition & 4 deletions chapters/th/chapter2/2.mdx
@@ -39,10 +39,7 @@ from transformers import pipeline

classifier = pipeline("sentiment-analysis")
classifier(
- [
- "I've been waiting for a HuggingFace course my whole life.",
- "I hate this so much!",
- ]
+ ["I've been waiting for a HuggingFace course my whole life.", "I hate this so much!",]
)
```

3 changes: 1 addition & 2 deletions chapters/th/chapter3/3_tf.mdx
@@ -86,8 +86,7 @@ model.compile(
metrics=["accuracy"],
)
model.fit(
- tf_train_dataset,
- validation_data=tf_validation_dataset,
+ tf_train_dataset, validation_data=tf_validation_dataset,
)
```

4 changes: 1 addition & 3 deletions chapters/th/chapter6/8.mdx
@@ -429,9 +429,7 @@ tokenizer.decode(encoding.ids)
from transformers import PreTrainedTokenizerFast

wrapped_tokenizer = PreTrainedTokenizerFast(
- tokenizer_object=tokenizer,
- bos_token="<|endoftext|>",
- eos_token="<|endoftext|>",
+ tokenizer_object=tokenizer, bos_token="<|endoftext|>", eos_token="<|endoftext|>",
)
```
