This repository has been archived by the owner on Dec 22, 2021. It is now read-only.

RFC for Gatsby Docs Localization #42

Merged
merged 6 commits into from
Oct 21, 2019

Conversation

tesseralis
Contributor

@tesseralis tesseralis commented Sep 13, 2019

Plan to localize Gatsby using git/text-based translations (like React).

Currently debating between two options: text/git-based translations (like React does) and a SaaS translation platform (like Crowdin or Transifex).

Will update this once we decide on something with more specific details, but right now I'd like comments from folks who've used these platforms before about what they prefer.

@jonniebigodes

Why not, as a first experiment, go with git/text or Markdown and see how it goes? We could set up, or more specifically assign, a team of at least two people per language, and when more people know about the SaaS approach, the translations could be moved there.


### Drawbacks

The main drawback of this approach is that the Gatsby docs currently live in a monorepo with the rest of the Gatsby source, which means we can't just do a copy of the entire website and use `git merge` like we do with React (and Vue).
Contributor

We do similar copying out of directories for our starters — our bot checks on each commit to master if there's any changes to a starter and then commits that — we could use the same logic to create PRs to language repos — https://github.com/gatsbyjs/gatsby-starter-default

Contributor Author

Oh interesting!

So we could probably do something like:

  1. Create a copy of the docs/ directory in a new repo in a new language
  2. Update that directory whenever there are changes to a file in the original repo
  3. Copy the results of those repos into a single readonly gatsby-i18n repo that is built and deployed as the actual gatsby website
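The second step above could be sketched roughly as follows. This is only an illustration, not the actual bot: the `LANGUAGES` list, the function names, and the idea of receiving the changed file paths as a plain list are all assumptions made for the example.

```python
from pathlib import PurePosixPath

# Hypothetical language codes; the real list would come from whichever
# translation repos actually exist.
LANGUAGES = ["es", "ja", "pt-BR"]


def docs_changes(changed_files, docs_root="docs"):
    """Keep only the changed paths that live under the top-level docs directory."""
    root = PurePosixPath(docs_root)
    return [f for f in changed_files if root in PurePosixPath(f).parents]


def sync_plan(changed_files, languages=LANGUAGES):
    """Map each language repo to the doc files that need a sync pull request."""
    changed_docs = docs_changes(changed_files)
    if not changed_docs:
        # Commits that only touch package source produce no translation work.
        return {}
    return {lang: list(changed_docs) for lang in languages}
```

A commit that only touches `packages/` yields an empty plan, so language repos would only see pull requests when something under `docs/` actually changed.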

@crowdin-support crowdin-support Sep 15, 2019

In Crowdin you can use one repo for a different project and with the GitHub integration you can specify which files should be uploaded for localization and which ones should be ignored. No need to make any copies and bother with merging. Also our team is ready to help with the setup 24/7 :)


The main drawback of this approach is that the Gatsby docs currently live in a monorepo with the rest of the Gatsby source, which means we can't just do a copy of the entire website and use `git merge` like we do with React (and Vue).

This is also an issue if we want translations to live under a path (e.g. `gatsbyjs.org/ja` rather than `ja.gatsbyjs.org`).
Contributor

For this, we could clone each language repo for doing the build.

Contributor

It'd be a lot simpler to have one site & a better experience as people could easily jump between translations e.g. with a dropdown.

The best way to encourage others to write translations is to make it simpler to write them. I'm translating the docs (tutorials) to Spanish myself.

One thing about the language dropdown: because the docs are constantly changing and hopefully a lot of people will help translate them, if a web user changes the dropdown language and that page has not been translated yet, it's a little harder to handle those cases gracefully.

Another way I suggest is how Dan Abramov does it on his blog:
[Screenshot]

One drawback of this approach is that it is harder for a web user to change the overall language.

hope this helps!

Contributor Author

@tesseralis tesseralis Sep 13, 2019

We could do what wikipedia does: only list languages in the dropdown that have the current page translated. You could have a "More Languages" link at the bottom to take you to a page listing all the languages (like https://reactjs.org/languages)

We'd love your help on the Spanish documentation!
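The Wikipedia-style dropdown could be computed along these lines. This is a minimal sketch: the `translated_pages` mapping (language code to the set of page paths that have a translation) is an assumed data shape, not something the Gatsby site provides.

```python
def languages_for_page(page_path, translated_pages, source_lang="en"):
    """Return the language codes to list in the dropdown for one page.

    The source language is always offered; other languages appear only
    when they have a translation of this particular page.
    """
    available = [source_lang]
    for lang in sorted(translated_pages):
        if lang != source_lang and page_path in translated_pages[lang]:
            available.append(lang)
    return available
```

A page with no translations would then show only English, and a "More Languages" link could cover everything else.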

@jonniebigodes jonniebigodes Sep 14, 2019

which does the trick and looks real nice and simple. A while back I started the Portuguese translation and made some progress on it, so that both pt-pt and pt-br could read it without big issues.

@alexandrtovmach

Finally, the process has started.

About the GitHub approach: it'll make a mess of PRs and issues if we work with one repository. If we create a repository for each locale, it'll be a mess of repos that's really hard to manage. Just take a look at NodeJS (https://github.com/nodejs/). This is especially true for Gatsby, with its very proactive community.

About CrowdIn: I vote for it, because it's an application created for exactly this purpose, translations in open source. Right now I'm trying to stabilize all the translation processes for NodeJS, and we decided to go with CrowdIn for all types of content (docs, new website, old website). Before, it was a mix of different approaches, but CrowdIn is like a standard.

@micahgodbolt

Any way to have GitHub format the MD file in a more readable way? I like the idea of being able to leave reviews, but scrolling to read entire lines is very difficult. Maybe just manual line breaks?

@mhmdAljefri

Git-based sounds good. And we may find a better way to DRY up website components and shared logic.

@tesseralis
Contributor Author

@alexandernanberg As I said in the RFC, I have a lot of reservations about Crowdin as a program. Could you look at some of the drawbacks I posted and let me know your thoughts? Have you worked with Transifex and have any thoughts on that?

@micahgodbolt there should be a way to do soft-wraps. The React docs have it enabled and we've never had a problem with it.

@joerlop

joerlop commented Sep 14, 2019

Would love to help translate into Spanish if needed :)

@Darking360

Greetings @tesseralis so cool to see this going on 🚀

We translated the React docs with the text/git-based translations option and it worked really well. I think that having all the translations in a repo and letting people use a tool that they use on a daily basis instead of a new one is much faster to start with, and with GitHub you can do everything inside the page if you want to: code review (super important for making notes about missing accents or language-specific words) and much more 💪 And we can have a bot like the one being used in the React docs to keep everything up to date @tesseralis 🤔

On the other hand, SaaS like Crowdin are ok, I haven't used them but I remember trying to jump into the NodeJS translations and they use Crowdin I think, and it has a learning curve, I remember opening it and having no clue on where or how to start 😅

My point is that I vote for text/git-based translations, and the best proof that it works are the React docs 🥇 We can test maybe some docs pages and how they behave with some languages first, I think that's the way @tesseralis started with the React docs, with some test languages, try the approach and then scale it 🤔

I think that's it, I'd love to help with the Spanish docs, and I'm sure a ton of people more that helped with React translations would jump in here since they're familiar with the process ❤️ Let me know 💯

@jonniebigodes

@Darking360 what happened to you with Crowdin happened to me when trying to help out translating the Material UI docs. I had the documentation page for Crowdin open on one monitor and the documentation itself on another, and I was scratching my head trying to figure it out.

@tesseralis
Contributor Author

@jonniebigodes the quality of their documentation is... lacking to say the least.

I do have a test project on Transifex if you all want to see how it compares: https://www.transifex.com/shosple-colupis/gatsby-test/

@jonniebigodes

jonniebigodes commented Sep 14, 2019

@tesseralis i'm interested in how it works. just sent a collaboration request so that i can see how it works.

@varenaggarwal

I would love to help translate in Hindi.

@carburo

carburo commented Sep 14, 2019

Thanks, @tesseralis! It's great to see this going.
The text/git-based approach would be my choice, it's simple and approachable for developers. As @Darking360 already said very eloquently we've had a great experience with the React Docs using this method. There are always trade-offs, of course, but it has proven to be a successful way to localize this kind of documentation.
And, of course, I would love to help with the Spanish translation. ❤️

@joostdecock

I run a multilingual Gatsby site. At first, we used git-based translation, recently we've switched to crowdin.

Crowdin has some annoying issues, but I tolerate them because translators love it, which means we have better translation since we switched.

The two main drawbacks are:

  • While crowdin can import pre-existing translation for key-value pairs (like json or yaml files for i18n) it cannot do that for free text (markdown or MDX). Which means that you have to start from zero for docs, blog posts and so on.
  • Crowdin routinely messes up the syntax and you end up with a broken markdown file because of unclosed tags and so on. This is especially common with MDX with components that take props in object literal notation

I've created an issue for that second one. They say they're working on it, but I haven't seen much progress.

These drawbacks are significant. But I still stick with Crowdin because the benefits outweigh these annoyances:

  • Did I mention translators love it?
  • It significantly lowers the bar for people to contribute (git is not for everyone)
  • Their suggestions for auto-translation are really good, making translation much faster
  • Everything is broken up into single sentences. This is a change in workflow that takes some getting used to (and the reason they can't import prior translations for flowed text) but I actually prefer it because:
    • translating a lengthy page can be intimidating. But now, you can jump in and translate a couple of sentences whenever you've got a few minutes to spare
    • when part of a page is rewritten, Crowdin will pick up on that, notify translators that there's new strings to translate, and then present only what's changed
  • there's a translation step followed by an approval step which allows QA

I have no experience with Transifex, but I would recommend Crowdin if you want to attract more translators.
Perhaps Gatsby's sheer popularity means that there's plenty of volunteers, and those volunteers are comfortable with git, in which case I don't think the benefits of Crowdin outweigh the pain points.

Just my 2 cents, I hope it can help the discussion.

@fanyak

fanyak commented Sep 14, 2019

Would love to help translate in Greek

@jumpalottahigh

jumpalottahigh commented Sep 14, 2019

I've used both platforms. CrowdIn with VS Code's translation project and the Git based approach in translating the official React docs.

I have no strong opinions about which one is better, but personally I prefer the Git approach. I feel like a lot could be done there by bots to sync content updates, which makes it easier for translators like me to instantly translate incoming PRs in small chunks.

Thanks @tesseralis for working on this with the Gatsby community too!

@SamuelAlev

SamuelAlev commented Sep 14, 2019

I'm willing to help translating in French.
I'm up for the Git based approach at first mainly for its simplicity (👍 to React docs) and the fact we can use whatever tool we want to write the docs is really appreciated.

@thib92

thib92 commented Sep 14, 2019

I don’t have enough context to give an opinion on the method to manage translations, but I contributed on a few pages for the French translation of the React docs, and the Git approach seemed good to me.

Either way I’d be glad to help with the French translation.


We had considered using Crowdin for the localization of the React docs but realized that Crowdin had some [major drawbacks](https://reactjs.org/blog/2019/02/23/is-react-translated-yet.html):

* It has a steep learning curve and is hard for new translators to get accustomed to.

For new translators it takes about an hour to understand how it works, and on the other hand there are a ton of translators who already have experience working with CrowdIn. It's a platform for translators, so "hard for new translators to get accustomed to" is not a valid argument.

Contributor Author

Is that really the case? Most people who have commented here seem to prefer the git-based approach. I think the issue boils down to: would you rather have people who are technical but are new to translation, or experienced/professional translators who aren't as familiar with the product? We went with the former for React and I think it worked well for us. The question is if we'd like the same for Gatsby.

@wardpeet wardpeet Sep 20, 2019

Our community is big and awesome. I'm pretty sure we have enough people willing to translate with Gatsby knowledge. So, git-based translations seem in most people's comfort zone.

* The GitHub integration isn't customizable: it takes three hours to compile all the languages and just publishes something in a single directory.

Should be handled by our build process, so I don't see any issues here

Contributor Author

From the people who I've talked to that have actually done or attempted to do the integration, it's a huge pain. If you have experience with this and we decide to go with crowdin, please feel free to help.

* Translation quality issues:
* Doesn't handle web markup well; sites translated with Crowdin sometimes have invalid markup

Yes, that's one of the problems, but it can be handled by using a specific format. Git Markdown is more or less stable.

Contributor Author

Most of the websites I checked did use Markdown, but it got... converted during the upload process to their own custom syntax?

* (as far as I can tell) no way to verify that a section/page has to pass a quality check before it gets published

Nope, we can configure CrowdIn to create PRs with changes


Other drawbacks of Crowdin:

* No way to prioritize which pages should get translated first

Not very useful, I think

Contributor Author

@tesseralis tesseralis Sep 14, 2019

On the contrary, it's very useful. With React, we could point people to translate important pages (like the Home Page and tutorial) before things like Hooks.

Same thing with Gatsby: We'd want to translate important stuff like the tutorial before hyper-specific technical stuff.

I discovered that's possible with CrowdIn:
[Screenshot]

Related documentation on prioritizing files:
https://support.crowdin.com/files-management/#prioritizing-files

By default, files & folders are sorted in the alphabetical order

The GitHub integration isn't customizable: it takes three hours to compile all the languages and just publishes something in a single directory.

If there are thousands of files and dozens of languages, it may indeed take a while to finalize the first synchronization. During future syncs, only updated files and languages will be synced, sometimes it will be a matter of seconds or just a couple of minutes to synchronize dozens of files


* Hard to attribute translators
* Paid program ($125/mo for the 'Bronze' Organization Plan)

We shouldn't use it

@RamirezAlex

I agree with @Darking360 and @carburo about using text/git-based. It just feels more intuitive to work with.

@tesseralis I would love to help with the Spanish documentation as I already did with the React one.
Great to see this going. 👍

@ghost

ghost commented Sep 14, 2019

I would like to help translating to Spanish if there is any opportunity.

@horacioh

@fabiobernalt @RamirezAlex I started translating to Spanish on my fork (link above). Happy to coordinate a full Spanish translation with you! :)

I started with the tutorials since I’m also doing a video series about it. So let’s talk and make this happen! 💪🏻💪🏻

@denisqua

Dan from Crowdin

I created a sample project from your GitHub repo, just to show you what your translation project might look like.

A few things that seem nice:

  • It seems like ~25% of your project is duplicated text. All of it will be translated automatically, which saves a lot of effort for the volunteers
  • the Translation Memory will save even more for similar texts
  • I included screenshots, in case you might want to translate them also

We have 24x7 tech support ready to help

Here's the demo project https://crowdin.com/project/gatsby-js-docs-demo

@jonniebigodes

Looks like what I said initially about two or more people per language, or more precisely a team for each language, is coming together. The Spanish team is looking good, and we already have some people for French, Greek, Russian and Portuguese. Looks like this is coming together 👍

@dies

dies commented Sep 14, 2019

> looks like i said initially about 2 or more people for language, or more precisely some teams for each language is coming together, the spanish team is looking good. Already got some people for french, greek, russian and portuguese. Looks like this is coming together 👍

@jonniebigodes, sorry, but I have to argue. Imagine that someone changes one sentence in the sources in a year.

With Crowdin, notifications about changes are usually sent to the volunteers automatically. They then use Translation Memory and update the wording in the translation for every language. Changes that require translators' attention would not be efficiently managed with GitHub.

still, my main concern is that if you start with text/git-based, then you won't be able to switch to a translation management system

@Axolodev

Any chance I can help with the Spanish translation too? Seems like there's already a great team, but I would love to help too!

@tesseralis
Contributor Author

@joostdecock thank you for your thoughtful comments! In our case, I definitely think our pool of translators would be more familiar with git and GitHub. I think it's important that the translators are people who have used and understand Gatsby, just as it was with React, rather than people who have experience translating but little domain knowledge.

@denisqua Thanks for your comments and setting up a test repo! Where did you get the statistic that 25% of all the gatsby docs are duplicated texts? That doesn't seem like a big issue, since you could just do a find-replace.

@dies For React, we have automation that creates pull requests when new content is available and notifies the translation maintainers. You could argue that they could just choose to ignore those notifications (and indeed, some of the less popular languages do) but the same is true for any platform, including Crowdin.

* [ ] Support for language-specific fonts
* [ ] UI to toggle between languages
* [ ] Preserve hashes when switching languages (e.g. `/en/docs/quick-start/#use-the-gatsby-cli` should go to `/es/docs/quick-start/#use-the-gatsby-cli` even though the heading has been translated to Spanish)
* [ ] Make Algolia only search for results within each language
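The hash-preserving item in this checklist could work roughly like the sketch below. It assumes the language code is the first path segment (as in the `/en/docs/...` example) and that translated pages keep the English heading IDs; `switch_language` is a hypothetical helper name, not an existing Gatsby API.

```python
from urllib.parse import urlsplit, urlunsplit


def switch_language(url, target_lang):
    """Swap the leading language segment of a URL and keep its #fragment."""
    parts = urlsplit(url)
    segments = parts.path.split("/")
    if len(segments) < 2:
        # No leading "/<lang>/" segment to replace; leave the URL alone.
        return url
    # segments[0] is "" because the path starts with "/";
    # segments[1] is assumed to be the language code.
    segments[1] = target_lang
    return urlunsplit(parts._replace(path="/".join(segments)))
```

Because only the path is rewritten, the query string and the fragment pass through untouched, which is what lets the link land on the same section in the other language.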

I'm not so sure about this. When searching for "starters", even in Spanish, I would expect to see results in English. But if I search for "plantillas", then I would expect results in Spanish. So I wouldn't restrict the search to only the current language, especially because maybe not everything is translated. Or maybe you don't know the Spanish name of the thing you are looking for, but you find the English equivalent, and from there you can discover whether there is a translation in another language.

Contributor Author

I think that's a good point, but if we have ten languages and you search for, say, "themes", you shouldn't get the same article in ten different languages. I think at most you should get the language you're on and the source language (English).

It's definitely something we should research more.
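As a starting point for that research, the "current language plus English" rule could look something like this. The shape of a search hit (a dict with `page` and `lang` keys) is an assumption for illustration and is not Algolia's actual response format.

```python
def filter_results(results, current_lang, source_lang="en"):
    """Keep only hits in the reader's language or the source language.

    Result order is preserved, so the search engine's ranking still applies.
    """
    allowed = {current_lang, source_lang}
    return [hit for hit in results if hit["lang"] in allowed]
```

With ten languages indexed, a search for "themes" from the Spanish site would then return at most the Spanish and English versions of each article.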

@KyleAMathews
Contributor

@muescha @tesseralis is getting paid to work full-time on this for the next month or two and is authorized to make final decisions about the directions we're taking. The RFC is to invite input but isn't a "voting" system. She's inviting input and then making decisions.

@muescha
Contributor

muescha commented Sep 20, 2019

@KyleAMathews thx for the info :)


The following things need to be created:

* [ ] A bot to track changes to the English docs and create pull requests
Contributor

Since gatsbyjs is a monorepo (compared to reactjs, which has many repos):

That means that for every commit (even one fixing a typo) the bot generates a pull request per language. If we have about 30 languages, as React does, do we then get a wave of 30 pull requests to the monorepo?

Contributor Author

Even in React, which has 45 different translations in progress, we only have around 15 with "complete" translations, and a handful more with maybe 50% completion. The bot will only make pull requests to languages that have translations: for most pages, this will be the 15 completed translations plus one or two more.

Also, even though right now it says "every commit", the React bot runs on a weekly schedule: it batches all the updates from within a week. So we can do something similar or even run on different schedules per language: more popular languages can run the bot with every commit, while less active languages can run once a week, or even not at all.
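A rough sketch of per-language scheduling follows; it is not how the React bot is implemented. The `SCHEDULE_DAYS` table, the function names, and the data shapes are all made up for the example.

```python
from datetime import date, timedelta

# Hypothetical per-language schedule in days: 0 means "on every run",
# 7 means weekly, None means the bot is switched off for that language.
SCHEDULE_DAYS = {"es": 0, "ja": 7, "fr": None}


def languages_due(today, last_run, schedule_days=SCHEDULE_DAYS):
    """Return the languages whose sync interval has elapsed."""
    due = []
    for lang, interval in schedule_days.items():
        if interval is None:
            continue
        previous = last_run.get(lang)
        if previous is None or today - previous >= timedelta(days=interval):
            due.append(lang)
    return due


def languages_to_notify(changed_page, translated_pages, due):
    """Of the due languages, keep those that already translated this page."""
    return [lang for lang in due if changed_page in translated_pages.get(lang, set())]
```

Combining the two checks keeps the pull-request volume proportional to the number of active translations of a page, not to the number of language repos.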

Contributor

Thx for info

How about making one repo for each language (like React) and then pushing/syncing completed pages to the monorepo?

Then each language can have its own issues and tracking...

@horacioh

@tesseralis I just made this PR to include the first translation to the main repo 🎉

hope it's all OK so you can start experimenting with it :)
When this is merged, I will talk to all the spanish translators to start moving the in progress translations to the main repo with the new structure

thanks!

@sheplu

sheplu commented Sep 22, 2019

I can help with the French translation too. I already worked with React ;)

@jpedroschmitz

Hey guys, I can help translating the docs to Brazilian Portuguese 🇧🇷. We have a great and amazing community being built around here 🚀. I would love to help 💜

@alexandrtovmach

Please take a look at the PR with the Russian translation. For now, I've tried to go with the following flow:

  1. Copied /docs/tutorial -> /translations/LANG_CODE/tutorial
  2. Created a first commit with the English version, as proposed here
  3. Removed all media files until we have a plan
  4. Translated /translations/LANG_CODE/tutorial/index.md
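Steps 1 and 3 of this flow could be scripted roughly as below. The `MEDIA_SUFFIXES` set and the `seed_translation` helper are assumptions for illustration; which files count as media would be up to each translation team.

```python
import shutil
from pathlib import Path

# Assumed list of media extensions to leave out of the copy.
MEDIA_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".svg", ".mp4"}


def seed_translation(repo_root, lang_code, section="tutorial"):
    """Copy docs/<section> to translations/<lang_code>/<section>, skipping media."""
    source = Path(repo_root) / "docs" / section
    target = Path(repo_root) / "translations" / lang_code / section

    def skip_media(directory, names):
        # copytree calls this once per directory with the entry names in it;
        # whatever is returned here is not copied.
        return [name for name in names if Path(name).suffix.lower() in MEDIA_SUFFIXES]

    shutil.copytree(source, target, ignore=skip_media)
    return target
```

Committing the untouched English copy first (step 2) means later diffs show only the translated text.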

@lex111

lex111 commented Sep 24, 2019

I would be happy to join the translation Gatsby into Russian. I previously translated the React documentation, so I'm familiar with the process.

@orta

orta commented Sep 25, 2019

Thanks for this RFC (and the subsequent public conversation) - I've been privately thinking about how we can translate the TypeScript docs and site, and all of the arguments and discussions are exactly the same as what I've been thinking internally.

As I'll be lagging behind the implementation of Gatsby's setup in getting the TS docs up and running, I'd like to make a request if I can: please consider finding the generic parts of the reactbot (and the subsequent gatsbytranslatebot, maybe) and seeing if there's a common core with some unique config per repo. It's likely that we'll share the same problems on TypeScript, and I'd love to help out when I get there.

Good luck ❤️

@jonniebigodes

@jpedroschmitz see my comment above, do you want to proceed with a pt-br or try and come up with a unified approach to avoid bloating even more the repo with both pt-pt and pt-br translations?

@jpedroschmitz

@jonniebigodes I personally prefer to have 2 separate translations, one for pt-br and other for pt-pt, just like React docs translations.

@alexandrtovmach

@tesseralis I've started working on translations in a separate repository for Russian and ran into a problem. We need some linter to avoid formatting issues. It should be something universal for all translation repos, so maybe we can discuss it here before each translation group implements it by themselves.

@tesseralis
Contributor Author

@orta of course! That's the plan.

@jonniebigodes @jpedroschmitz would you like me to create a Portuguese channel in the gatsby discord so you can discuss this more (and bring other people in?)

@alexandrtovmach What sort of linter are you thinking of? Feel free to install a linter for the Russian docs. For React, we let different languages manage their own linters (e.g. TextLint for Japanese), since different linters have different levels of support for different languages.

@alexandrtovmach

@tesseralis eslint/prettier + husky/lint-staged for markdown formatting

@jpedroschmitz

@tesseralis, that would be great 🚀

Contributor

@sidharthachatterjee sidharthachatterjee left a comment

LGTM

@tesseralis tesseralis merged commit 15ca267 into gatsbyjs:master Oct 21, 2019
@MichaelDeBoey

Is the workflow already in place to suggest translation repos? 🤔

@tesseralis
Contributor Author

@MichaelDeBoey yes! If you make a new issue on the gatsbyjs repo you'll see an option to "Request a translation". Follow the instructions there to make it. We're just about to start making the different translation repos.

@Zorig

Zorig commented Oct 24, 2019

@tesseralis would love to participate, I have experience from reactjs.org though.

@ilyaspiridonov

Would you consider GitLocalize ? It's free and ideal for GitHub/markdown
