Native and complete i18n support #1744

Closed
abourget opened this issue Jan 3, 2016 · 26 comments

@abourget
Contributor

abourget commented Jan 3, 2016

As stated in #1734


@spf13 I was also looking at https://godoc.org/github.com/nicksnyder/go-i18n/i18n ...wouldn't it be awesome if, in addition to good support for i18n in terms of content, we had great translation capabilities?

We could load i18n/{{$.Site.RenderLanguage}}.yaml as a go-i18n translation on start, just like we load .Site.Data. This would require Multilingual to be enabled.

I can see quite a nice mapping from the examples there to something Hugo-ish:

T("Hello {{.Person}}", map[string]interface{}{
    "Person": "Bob",
})

to something like {{ i18n "Hello {{.Person}}" (dict "Person" "Bob") }} or something like that.

Or T("You have {{.Count}} unread emails.", 2) -> {{ i18n "You.. {{.Count}} unread mails" .Site.MailCountWhatever }}

I'd add a native i18n function, or T (or both, one aliasing the other ?) to handle that.

What do you think ? What's your take on vendoring other libraries ?
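
To make the mapping concrete, here is a minimal sketch of how such a template function could be wired up with Go's text/template. It does not use go-i18n; the `translations` map, `translate`, and `dict` are hypothetical stand-ins for illustration only, and the French string is made up.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"text/template"
)

// translations is a hypothetical in-memory catalog (message ID -> translated
// template). A real integration would load this from translation files.
var translations = map[string]string{
	"Hello {{.Person}}": "Bonjour {{.Person}}",
}

// translate looks up id, falls back to the id itself when no translation
// exists, and renders the result as a template with the first argument.
func translate(id string, args ...interface{}) (string, error) {
	text, ok := translations[id]
	if !ok {
		text = id
	}
	tmpl, err := template.New("msg").Parse(text)
	if err != nil {
		return "", err
	}
	var data interface{}
	if len(args) > 0 {
		data = args[0]
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// dict builds a map from alternating key/value arguments (illustrative helper).
func dict(pairs ...interface{}) (map[string]interface{}, error) {
	if len(pairs)%2 != 0 {
		return nil, fmt.Errorf("dict: odd number of arguments (%d)", len(pairs))
	}
	m := make(map[string]interface{}, len(pairs)/2)
	for i := 0; i < len(pairs); i += 2 {
		key, ok := pairs[i].(string)
		if !ok {
			return nil, fmt.Errorf("dict: key %v is not a string", pairs[i])
		}
		m[key] = pairs[i+1]
	}
	return m, nil
}

func main() {
	// Register the function under both names, one aliasing the other.
	funcs := template.FuncMap{"i18n": translate, "T": translate, "dict": dict}
	page := template.Must(template.New("page").Funcs(funcs).Parse(
		`{{ i18n "Hello {{.Person}}" (dict "Person" "Bob") }}` + "\n"))
	if err := page.Execute(os.Stdout, nil); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```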

@spf13
Contributor

spf13 commented Jan 3, 2016

Fine with vendoring.

I think this should be a bit different. I think we should integrate it a
bit deeper and tighter. How about this...

Multilingual doesn't need to be enabled. If we have translations / language
files then it just works.

For the translations, I think we should have a dedicated folder (can be inside data)
for translation maps.

Then we have a template function & shortcode that looks up the appropriate
value for the current language.

so Hello {{ T "person" }}

would read from

/data/translations/en.yaml
/data/translations/es.yaml

etc.

Just ideas of what feels cleaner to me. I think having the mapping in the
content / templates is odd.

Does that make sense?
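
A rough sketch of the lookup described above, assuming the per-language files have already been read into maps. The `catalogs` variable, the `lookup` function, and the fallback to a default language are assumptions for illustration, not anything Hugo defines.

```go
package main

import "fmt"

// catalogs maps a language code to its translation map, standing in for
// files such as /data/translations/en.yaml and /data/translations/es.yaml.
var catalogs = map[string]map[string]string{
	"en": {"person": "person", "greeting": "Hello"},
	"es": {"person": "persona"},
}

// defaultLang is an assumed fallback language.
const defaultLang = "en"

// lookup returns the translation of key for lang, falling back to
// defaultLang and finally to the key itself.
func lookup(lang, key string) string {
	if v, ok := catalogs[lang][key]; ok {
		return v
	}
	if v, ok := catalogs[defaultLang][key]; ok {
		return v
	}
	return key
}

func main() {
	fmt.Println(lookup("es", "person"))   // found in the Spanish map
	fmt.Println(lookup("es", "greeting")) // falls back to English
	fmt.Println(lookup("fr", "missing"))  // falls back to the key
}
```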

@abourget
Contributor Author

abourget commented Jan 3, 2016

yeah it does :) I just took a bite at this one.. :) wanna take a look ? 1881c36

@abourget
Contributor Author

abourget commented Jan 3, 2016

about vendoring, I've seen this tool here: https://github.com/kardianos/govendor which manipulates the vendor dirs.. this would seem to be the best way to do vendoring, because it would all be native, yet still have some annotations..

On other projects, I used git subtree but sometimes it's more difficult to manage for others.. and it's less discoverable.. govendor is pretty discoverable.

@abourget
Contributor Author

abourget commented Jan 3, 2016

so the commit I made implements the full blown i18n integration! it's actually pretty simple, yet slick..

You can do translations in templates, it supports i18n in themes, and your own in /i18n ...
Your own translations will override (string by string) those in the theme.
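
The string-by-string override can be pictured as a simple map merge. This is only a sketch of the idea with made-up names (`mergeTranslations`, the sample IDs), not the code from the commit:

```go
package main

import "fmt"

// mergeTranslations returns a new map holding every theme entry, with any
// site entry of the same ID replacing it.
func mergeTranslations(theme, site map[string]string) map[string]string {
	merged := make(map[string]string, len(theme)+len(site))
	for id, text := range theme {
		merged[id] = text
	}
	for id, text := range site {
		merged[id] = text // the site's own translation wins
	}
	return merged
}

func main() {
	theme := map[string]string{"home": "Welcome home!", "more": "Read more"}
	site := map[string]string{"home": "Welcome to my site!"}

	merged := mergeTranslations(theme, site)
	fmt.Println(merged["home"]) // overridden by the site
	fmt.Println(merged["more"]) // inherited from the theme
}
```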

@abourget
Contributor Author

abourget commented Jan 3, 2016

It does not depend on Multilingual being enabled .. except it relies on RenderLanguage .. which was added in my Multilingual PR.. RenderLanguage was made to default to en, so it has a value when Multilingual is not enabled.. tweaking it in a config file will shift all translations.

I've also added the watchers for i18n.. and I've not reused the data/ dir.. to make it cleaner for the goi18n translation tool.. there'll be nothing around except i18n stuff

was surprised, but go-i18n supports yaml too, so translations can look like:

- id: home
  translation: "Welcome home!"
- id: sweet_home
  translation: "Welcome to my place and yours"
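
For illustration, the same list-of-entries shape can be read with the standard library alone if it is written as JSON. The sketch below assumes a JSON equivalent of the YAML above and only handles plain string translations; `entry` and `parseEntries` are hypothetical names, not go-i18n's API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// entry mirrors one item of the translation list: an ID and its text.
type entry struct {
	ID          string `json:"id"`
	Translation string `json:"translation"`
}

// parseEntries decodes a JSON list of entries and indexes it by ID.
func parseEntries(data []byte) (map[string]string, error) {
	var entries []entry
	if err := json.Unmarshal(data, &entries); err != nil {
		return nil, err
	}
	byID := make(map[string]string, len(entries))
	for _, e := range entries {
		byID[e.ID] = e.Translation
	}
	return byID, nil
}

func main() {
	data := []byte(`[
  {"id": "home", "translation": "Welcome home!"},
  {"id": "sweet_home", "translation": "Welcome to my place and yours"}
]`)
	byID, err := parseEntries(data)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(byID["sweet_home"])
}
```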

@vtolstov

vtolstov commented Jan 3, 2016

why not try to use something for gettext? I think this is standard....

@abourget
Contributor Author

abourget commented Jan 3, 2016

The whole web is JSON based... these tools have matured a lot since one page apps became widespread. Gettext you mean the file format or the C library ?

I for one wouldn't want to bring in a C library.. and I don't see much missing in the tools for JSON/Yaml.

There are even translation services that know and handle go-i18n specifically.

@vtolstov

vtolstov commented Jan 3, 2016

The gettext file format. Go has native tools to handle this format, and it allows using standard tools for translation and plural forms.... Also, gettext is fast.

@abourget
Contributor Author

abourget commented Jan 4, 2016

Not sure about performance, but having a human readable format and no need to run some compiler is interesting to me.

You can take a look at my go-i18n-based branch.. perhaps you could propose a gettext-based version ?

@bep
Member

bep commented Jan 4, 2016

gettext isn't an option for us. I have studied this some ... and I wish the stdlib guys could come up with something a little bit closer to the ... stdlib. As in: A standard. But we cannot wait for that to happen. And now the winner seems to be go-i18n -- so that is a sensible choice.

@spf13
Contributor

spf13 commented Jan 4, 2016

I have very limited experience here and will defer to the significant portion of the Hugo community which is multilingual and has first hand experience here.

My two cents is that it seems like the current approach is sound and can provide a lot of needed functionality for Hugo to really become quite robust for multilingual sites and documentation.

@anthonyfok
Member

Cc: @coderzh 😉

@bep
Member

bep commented Jan 5, 2016

CC: @RickCogley

@RickCogley
Contributor

thanks @bep

@coderzh
Contributor

coderzh commented Jan 5, 2016

@anthonyfok Interesting, I think it must be easy to use, like spf13's example:

Hello {{ T "person" }}

would read from

/data/translations/en.yaml
/data/translations/es.yaml

And we can have a look at android's i18n solution: http://developer.android.com/intl/zh-cn/guide/topics/resources/localization.html

@abourget
Contributor Author

abourget commented Jan 5, 2016

Folks take a look at 1881c36

@abourget
Contributor Author

abourget commented Jan 6, 2016

So I've created a PR with more details here: abourget#1

Since it's based off the Multilingual PR.. I'm not sure how I could've made a PR in this repo to be based on something that isn't in this repo yet.. I can re-create a PR when we're good..

@coffeepunk
Contributor

I think this is great, this is practically the number one feature request for me personally when it comes to Hugo. Using yaml, toml or json for the translation I feel is fine, mostly since that is the approach I most often use in Rails or Padrino.

The real key is to be able to use, as proposed in other comments, {{T "keyword"}} or {{T "some string here"}} in templates; that would be a suitable solution for most Hugo-powered websites.

From my experience, what's usually translated are things like navigation / menus, links and buttons, shorter instructions, or key parts of themes or templates. The actual content is stored inside a DB or, as in Hugo's case, in .md files.

For documentation heavy sites, wouldn't you put most of the content in .md files and have them tagged in Front Matter which language or locale that is used and not in translation files? Much like the way it is done in the multilingual tutorial?

I hope that this will be in a Hugo release very soon, super excited! Great work!

@pdfforge

Having read through the documentation, I know that you intend to go with a yml- or json-based approach to the translations. I would like to advertise the gettext/po format though, as it is very widespread and is established among translators as well. Other approaches are fine when you create the content and translations yourself, but when you start involving translators (or at "worst" translation agencies), it would make sense to use a format they can use without the need to convert or learn something new.

In gettext, the source language string is the translation key and default translation in one. The downside is, that the format itself is rather ugly. You get support for languages with multiple plural forms for free though, just as suggested in the first comments.

It seems tempting to implement "that little bit of translation" instead of using a complex solution, and that is also what we felt was the thing to do. We are still trying to recover from this and to properly integrate languages with multiple plural forms, and we are struggling with acceptance among our translators.

Maybe you can give this some reconsideration, as this is an architectural decision that could be rather hard to change in the future. We are using Hugo for a small project right now and want to extend the usage to a multi-lingual site, so we have some interest in solid support. (and a benchmark of how not to implement it ;-))

@bep
Member

bep commented Jul 14, 2016

As to the PO format, that is what is used by WordPress as well.

@abourget
Contributor Author

a few things here:

I agree gettext is a more popular choice than go-i18n. On the flipside, a huge lot of browser based translation tools do not use gettext, and rely on json/yaml files. gettext has competition out there :)

Featurewise, https://github.com/nicksnyder/go-i18n#go-i18n- supports pluralization in 200 languages, and is based on CLDR data.. I don't see that we'd be missing features, would we ? One downside of go-i18n is it doesn't handle Currency and Number formatting, but I'm not sure it's a feature gettext would provide though, would it ? Also, https://github.com/maximilien/i18n4go wraps go-i18n and adds additional tooling and extraction methods.
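
To give a feel for what plural support means in practice, here is a deliberately simplified sketch. It hard-codes the English rule only ("one" for exactly 1, "other" otherwise) instead of using CLDR data, and `pluralForms`, `englishCategory`, and `pluralize` are hypothetical names, not go-i18n's API.

```go
package main

import (
	"fmt"
	"strings"
)

// pluralForms maps a plural category ("one", "other", ...) to message text.
type pluralForms map[string]string

// englishCategory implements the English integer rule only; other languages
// need other rules, which is what the CLDR data provides.
func englishCategory(n int) string {
	if n == 1 {
		return "one"
	}
	return "other"
}

// pluralize picks the form for n and substitutes the count placeholder.
func pluralize(forms pluralForms, n int) string {
	text, ok := forms[englishCategory(n)]
	if !ok {
		text = forms["other"]
	}
	return strings.ReplaceAll(text, "{{.Count}}", fmt.Sprint(n))
}

func main() {
	unread := pluralForms{
		"one":   "You have {{.Count}} unread email.",
		"other": "You have {{.Count}} unread emails.",
	}
	fmt.Println(pluralize(unread, 1))
	fmt.Println(pluralize(unread, 2))
}
```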

Regarding file formats.. many tools (like https://phraseapp.com/docs/guides/formats/ which supports go-i18n's format) have adapted to many file formats.. Go hasn't settled on the Go way of doing i18n yet.. it's on the roadmap though, but I think if we change courses at some point, I don't see many issues with converting file formats with a tool. Converting back and forth from Chrome Ext i18n's json format seems pretty straightforward (and could even be useful today with go-i18n if your translator doesn't support go-i18n natively). I also like the fact that go-i18n chose a list format, instead of a json object, so that things can be (re-)sorted by tooling.
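
As a sketch of how small such a converter could be, the following turns a Chrome-extension-style messages object into a sorted id/translation list. It assumes only the `message` field matters and ignores Chrome's placeholder syntax; `chromeMessage`, `listEntry`, and `chromeToList` are made-up names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// chromeMessage is one value in a Chrome extension messages.json object.
type chromeMessage struct {
	Message     string `json:"message"`
	Description string `json:"description,omitempty"`
}

// listEntry is one item in the list-style translation format.
type listEntry struct {
	ID          string `json:"id"`
	Translation string `json:"translation"`
}

// chromeToList converts the keyed object into a list sorted by ID, so the
// output is stable no matter how the input object was ordered.
func chromeToList(data []byte) ([]listEntry, error) {
	var msgs map[string]chromeMessage
	if err := json.Unmarshal(data, &msgs); err != nil {
		return nil, err
	}
	entries := make([]listEntry, 0, len(msgs))
	for id, m := range msgs {
		entries = append(entries, listEntry{ID: id, Translation: m.Message})
	}
	sort.Slice(entries, func(i, j int) bool { return entries[i].ID < entries[j].ID })
	return entries, nil
}

func main() {
	in := []byte(`{
  "sweet_home": {"message": "Welcome to my place and yours"},
  "home": {"message": "Welcome home!", "description": "Landing page title"}
}`)
	entries, err := chromeToList(in)
	if err != nil {
		fmt.Println("convert error:", err)
		return
	}
	out, err := json.MarshalIndent(entries, "", "  ")
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	fmt.Println(string(out))
}
```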

Probably most importantly, I need to be able to deploy on a single commit / push.. gettext needs build processes (building the .mo files from the .po files) which need tooling and infrastructure. With those YAML/JSON files, someone can modify translations directly on Github for example and have it render using hugo with no other dependencies. gettext was designed for a reason, yet in a different computing era, and an intermediate database (the .mo file) might have been a good idea at that point, but seems to me more like a burden today.

I'm not saying go-i18n is the best thing on Earth (even if it is simple and well designed).. but I haven't found many other libraries as complete that are pure Go. Not relying on CGO I think is a pretty worthy requirement.

I did search for gettext implementations in Go when I started the multilingual PR, but didn't find any. If things have changed, it might be worth reconsidering. Do you have anything else to recommend ?

@abourget
Contributor Author

@pdfforge what do you think ?

@pdfforge

First off, I am sorry that I have not read into the topic in much detail before. I have not been aware of the internals of the go-i18n project and they do seem to have spent some time considering the different alternatives and designing theirs.

Even though the po format is a bit strange, I personally don't find it so strange that it would justify creating a new format, but certainly that's not what is intended here either. Many projects use po files without compiling them to mo, which makes it much more dynamic and versioning-friendly, as they are still text files. As far as I know gettext/po does not support Currency and Number formatting though, so go-i18n and gettext seem pretty much on par. I do agree that the tooling around it makes the difference in the end, and the json files look like they would not be that hard to convert if the need arose, so it probably makes sense to use go-i18n if it is actively maintained.

There are some gettext/po implementations in go, but they seem quite abandoned from what I have seen so far.

@abourget
Contributor Author

Ok, thanks, if you find anything worth looking at before the i18n is merged.. please let me know! I'd definitely want to have the best solution out there :)

@anthonyfok
Member

@abourget: Thank you for all the great work and insights you have put in. Sorry, I haven't been able to follow this thread closely, but to answer your question:

I did search for gettext implementations in Go when I started the multilingual PR, but didn't find any. If things have changed, it might be worth reconsidering. Do you have anything else to recommend ?

How about this one, https://github.com/chai2010/gettext-go/ ? I came upon it when I tried to search about this topic using the keywords "pure go gettext" and arrived at lxc/incus#761 where the LXD project seemingly successfully switched over from the GCC+libintl-based https://github.com/gosexy/gettext to https://github.com/chai2010/gettext-go/ about a year ago.

At first glance, https://github.com/chai2010/gettext-go/ seems to be a pretty well-maintained pure Go implementation of gettext because the repository was updated only 7 days ago. However, there are only 14 commits recorded on GitHub, presumably with the bulk of the previous development happening on the now defunct code.google.com/p/gettext-go/gettext.

I haven't looked at https://github.com/chai2010/gettext-go/ in any detail, so I cannot comment on its feature completeness or usability. All I can say is that LXD has used it for some time, but then I saw this (just now):

commit a5b0e82adfc5a14a381e138e33f0e42bf1df76cf
Author: Stéphane Graber <[email protected]>
Date:   Fri Nov 27 03:42:58 2015 -0500

    Rework translation support

    This switches us back to gosexy/gettext as our current gettext
    implementation just doesn't work on Linux (doesn't respect normal locale
    paths).

    To make things simpler, use a separate i18n module for translation and
    ship a basic shim for non-Linux.

    Signed-off-by: Stéphane Graber <[email protected]>

So LXD no longer uses http://godoc.org/github.com/chai2010/gettext-go/gettext. Not sure what story is behind that, but I presume that was probably the same reason why you didn't consider gettext when you first started working on i18n for Hugo.

Personally, I think it is great that you are using go-i18n as a framework, and I definitely don't want to give up JSON-based translation support over gettext. (I am greedy, I would love both JSON and gettext to be supported eventually, haha!) That said, such gettext support should probably be implemented in go-i18n as one of the optional backend engines, and not directly in Hugo, if the go-i18n authors would agree.

So, for the moment, for those of us who would want gettext support in Hugo, consider working towards that goal with the go-i18n project. That would benefit the greater Go community and not just Hugo too! 😉

bep added a commit that referenced this issue Aug 4, 2016
Work In Progress!

This commit makes a rework of the build and rebuild process to better suit a multi-site setup.

This also includes a complete overhaul of the site tests. Previously these were a messy mix that
were testing just small parts of the build chain, some of it testing code-paths not even used in
"real life". Now all tests that depend on a built site follow the same real production code path.

See #2309
Closes #2211
Closes #477
Closes #1744
bep added a commit that referenced this issue Aug 8, 2016
bep added a commit that referenced this issue Aug 13, 2016
bep added a commit that referenced this issue Aug 15, 2016
bep added a commit that referenced this issue Aug 20, 2016
bep added a commit that referenced this issue Aug 20, 2016
@bep bep closed this as completed in 708bc78 Sep 6, 2016
tychoish pushed a commit to tychoish/hugo that referenced this issue Aug 13, 2017
@github-actions

github-actions bot commented Apr 3, 2022

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 3, 2022