improve docs and samples for glossaries and custom models #18587

Merged
update readme with new file types for glossaries
MIDDLEEAST\v-moshaban committed Aug 23, 2021
commit 45cfbad70a7d7ccf31733c5364c315e0901e977c
16 changes: 9 additions & 7 deletions sdk/translation/azure-ai-translation-document/README.md
@@ -334,13 +334,11 @@ Glossaries are domain-specific dictionaries. For example, if you want to transla

Document Translation supports glossaries in the following formats:

|**File Type**|**Extension**|**Description**|
|----------------|-------------|-------------|
|Localization Interchange File Format|.xlf. , xliff|A parallel document format, export of Translation Memory systems. The languages used are defined inside the file.|
|Tab Separated Values/TAB|.tsv/.tab|A tab-delimited raw-data file used by spreadsheet programs.

For .tsv/.tab files, you can read more about the format [here][tsv_files_wikipedia]
, and you can also see a sample file [here][sample_tsv_file].
|**File Type**|**Extension**|**Description**|**Samples**|
|---------------|---------------|---------------|---------------|
|Tab-Separated Values/TAB|.tsv, .tab|Read more on [Wikipedia][tsv_files_wikipedia]|[glossary_sample.tsv][sample_tsv_file]|
|Comma-Separated Values|.csv|Read more on [Wikipedia][csv_files_wikipedia]|[glossary_sample.csv][sample_csv_file]|
|Localization Interchange File Format|.xlf, .xliff|Read more on [Wikipedia][xlf_files_wikipedia]|[glossary_sample.xlf][sample_xlf_file]|

#### **How to Use Glossaries in Document Translation**
To use glossaries with Document Translation, first upload your glossary file to a blob container, then provide the SAS URL of that blob container, which holds the glossary files, to Document Translation, as in the code sample [sample_translation_with_glossaries.py][sample_translation_with_glossaries]. (Note: Don't confuse this blob container with the source container, i.e. the one that holds the documents you want to translate.)
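
As a rough illustration of the steps above, the sketch below picks a glossary format from the file extension and passes the glossary to a translation job. The helper names `glossary_format` and `translate_with_glossary` are hypothetical, and the `"CSV"` and `"XLIFF"` format strings are assumptions extrapolated from the `"TSV"` value. The sketch also assumes `glossary_url` is a SAS URL that resolves to the uploaded glossary, and the exact `begin_translation` signature may differ between package versions, so treat the linked sample as the reference.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Assumed mapping from glossary file extension to the service's format name.
# "TSV" follows the SDK sample; "CSV" and "XLIFF" are assumptions.
GLOSSARY_FORMATS = {
    ".tsv": "TSV",
    ".tab": "TSV",
    ".csv": "CSV",
    ".xlf": "XLIFF",
    ".xliff": "XLIFF",
}


def glossary_format(glossary_url: str) -> str:
    """Infer the glossary format from the URL path, ignoring the SAS query string."""
    suffix = PurePosixPath(urlsplit(glossary_url).path).suffix.lower()
    if suffix not in GLOSSARY_FORMATS:
        raise ValueError(f"Unsupported glossary extension: {suffix!r}")
    return GLOSSARY_FORMATS[suffix]


def translate_with_glossary(endpoint, key, source_url, target_url,
                            glossary_url, target_language):
    """Hypothetical wrapper: start a translation job that applies one glossary."""
    # Imported lazily so the helper above works without the Azure packages installed.
    from azure.core.credentials import AzureKeyCredential
    from azure.ai.translation.document import (
        DocumentTranslationClient,
        TranslationGlossary,
    )

    client = DocumentTranslationClient(endpoint, AzureKeyCredential(key))
    glossary = TranslationGlossary(
        glossary_url=glossary_url,
        file_format=glossary_format(glossary_url),
    )
    # source_url and target_url are SAS URLs for the document containers,
    # which are separate from the container that holds the glossary.
    poller = client.begin_translation(
        source_url, target_url, target_language, glossaries=[glossary]
    )
    return poller.result()
```

Keeping the format lookup separate from the client call means it can be checked locally before any job is submitted.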
@@ -475,7 +473,11 @@ This project has adopted the [Microsoft Open Source Code of Conduct][code_of_con

[custom_translation_article]: https://docs.microsoft.com/azure/cognitive-services/translator/custom-translator/quickstart-build-deploy-custom-model
[tsv_files_wikipedia]: https://wikipedia.org/wiki/Tab-separated_values
[xlf_files_wikipedia]: https://wikipedia.org/wiki/XLIFF
[csv_files_wikipedia]: https://wikipedia.org/wiki/Comma-separated_values
[sample_tsv_file]: https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/translation/azure-ai-translation-document/samples/assets/glossary_sample.tsv
[sample_csv_file]: https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/translation/azure-ai-translation-document/samples/assets/glossary_sample.csv
[sample_xlf_file]: https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/translation/azure-ai-translation-document/samples/assets/glossary_sample.xlf

[cla]: https://cla.microsoft.com
[code_of_conduct]: https://opensource.microsoft.com/codeofconduct/
@@ -0,0 +1,4 @@
skull,le crâne
body,corps
heart,cœur
lungs,poumons
@@ -1,4 +1,3 @@
English French
skull le crâne
body corps
heart cœur
Empty file.