-
Notifications
You must be signed in to change notification settings - Fork 770
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[FA] First draft of Chapter2/Page2 (#129)
* merged branch0 into main. no toctree yet. * updated toctree. * Updated the glossary with terms from chapter0. * Second draft in collab w/ @schoobani. Added empty chapter1 for preview. * Glossary typo fix. * Added missing backticks. * Removed a couple of bad indefinite articles I forgot. * First draft of ch2/p2. Adds to glossary. Trans. guidelines moved out. * Fixed missing diacritics, fixed the py/tf switch placing. Other fixes. * Changed the equivalent for prediction per @kambizG 's direction. * Redid ambiguous passage in translation per @lewtun 's direction.
- Loading branch information
1 parent
f851ec7
commit dea95d5
Showing
4 changed files
with
574 additions
and
48 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,67 @@ | ||
1. Use this style guide for Persian. https://apll.ir/wp-content/uploads/2018/10/D-1394.pdf | ||
We deviate from this style guide in the following ways: | ||
- Use 'ء' as a possessive suffix instead of 'ی' | ||
|
||
2. Don't translate industry-accepted acronyms. e.g. TPU or GPU. | ||
|
||
3. Translate acronyms used for convenience in English to full Persian form. e.g. ML | ||
|
||
4. Keep voice active and consistent. Don't overdo it but try to avoid a passive voice. | ||
|
||
5. Refer and contribute to the glossary frequently to stay on top of the latest | ||
choices we make. This minimizes the amount of editing that is required. | ||
|
||
6. Keep POV consistent. | ||
|
||
7. Smaller sentences are better sentences. Apply with nuance. | ||
|
||
8. If translating a technical word, keep the choice of Persian equivalent consistent. | ||
If workspace is mohit-e-kari don't just switch to faza-ye-kari. This ofc | ||
does not apply for non-technical choices, as in those cases variety actually | ||
helps keep the text engaging. | ||
|
||
9. This is merely a translation. Don't add any technical/contextual information | ||
not present in the original text. Also don't leave stuff out. The creative | ||
choices in composing this information were the original authors' to make. | ||
Our creative choices are in doing a quality translation. | ||
|
||
10. Note on translating indefinite articles(a, an) to Persian. Persian syntax does | ||
not have these. Common problem in translations is using | ||
the equivalent of 'One'("Yek", "Yeki") as a direct word-for-word substitute | ||
which makes for ugly Persian. Be creative with Persian word order to avoid this. | ||
|
||
For example, the sentence "we use a library" should be translated as | ||
"ما از کتابخانهای استفاده میکنیم" and not "ما از یک کتابخانه استفاده میکنیم". | ||
"Yek" here implies a number(1) to the Persian reader, but the original text | ||
was emphasizing the indefinite nature of the library and not its count. | ||
|
||
11. Be exact when choosing equivalents for technical words. Package is package. | ||
Library is library. Don't mix and match. | ||
|
||
12. Library names are not brand names, nor are they concepts, so no need to | ||
transliterate. They are shorter namespaces so just use the English form. | ||
For example we will transliterate 'Tokenizer' when it is used to indicate | ||
a concept but not when it is the name of a library. | ||
|
||
13. You need to surround the entire doc with a <div dir="rtl">, also separately | ||
for lists. They render in LTR with the current version of Huggingface docs | ||
if you don't do this and the preview will look ugly. Refer to our previously | ||
submitted pages. | ||
|
||
Also with function names that include ellipses you need to surround them with | ||
a SPAN. i.e. <span dir="ltr">pipeline()</span> | ||
These are stopgap measures for the preview to render correctly. | ||
|
||
14. Put namespaces in backticks. Library names are shorter namespaces. | ||
|
||
15. Persian sentences follow the format Subject-Object-Verb. That means it is | ||
generally expected that the sentences end in a verb. Again there is a lot of | ||
gray area here and we need to apply nuance. But try to keep that in mind. | ||
For example maybe don't end sentences in "هستند یا نه". Maybe, bc if you | ||
do, that is your creative choice and we respect that. | ||
|
||
16. If a Persian equivalent for a word in the glossary has diacritics, they | ||
are important and should be repeated in every occurrence of the word. | ||
While we do want to facilitate our reader's access to the English source | ||
material by introducing loanwords and transliterating where we can, we | ||
also want to avoid confusion in the correct pronunciation of loanwords. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.