Support decoding and encoding of LaTeX characters #161

koppor · 2015-09-11T14:42:18Z

There is latex2utf8, whose source is at https://github.com/fc7/LaTeX-Decode. The opposite direction is covered by LaTeX::Encode.

We should think about including this in JabRef, possibly in the cleanup functionality.

See also #160.

koppor · 2015-09-12T17:20:38Z

This is related to sf bug #721. It seems that this functionality runs during import and not in "Cleanup entries". The existing functionality should be compared with that of latex2utf8. Then one of the two should be chosen and integrated into "Cleanup entries".

oscargus · 2015-09-23T18:08:46Z

I've browsed the code and (having written most of the current JabRef converter, in HTMLConverter), I'd say that the current one supports more characters (although there may be some missing which are worthwhile adding). Also, the current implementation supports converting from HTML. I would assume that it should be possible to use the same table to do the reverse conversion.
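To illustrate the idea of reusing one table for both directions, here is a minimal sketch in Python (not JabRef's actual code). The table contents and the function names are hypothetical, and the inversion only works cleanly if each character has exactly one LaTeX spelling in the table; with several spellings per character, the last one would win.

```python
# Hypothetical excerpt of a conversion table: LaTeX command -> Unicode character.
# The real table in HTMLConverter is far larger and has a different layout.
LATEX_TO_UNICODE = {
    r'\"{a}': "ä",
    r"\'{e}": "é",
    r"\ss{}": "ß",
    r"\o{}": "ø",
}

# Reverse table, derived once from the forward table (assumes a one-to-one mapping).
UNICODE_TO_LATEX = {char: latex for latex, char in LATEX_TO_UNICODE.items()}


def latex_to_unicode(text: str) -> str:
    """Replace every known LaTeX command with its Unicode character."""
    # Longer commands are replaced first so a short command cannot
    # break up a longer one that contains it.
    for latex in sorted(LATEX_TO_UNICODE, key=len, reverse=True):
        text = text.replace(latex, LATEX_TO_UNICODE[latex])
    return text


def unicode_to_latex(text: str) -> str:
    """Replace every known Unicode character with its LaTeX command."""
    return "".join(UNICODE_TO_LATEX.get(ch, ch) for ch in text)
```

Characters and commands that are not in the table pass through unchanged, so a round trip such as `latex_to_unicode(unicode_to_latex(s))` returns `s` for plain input that contains no LaTeX commands.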

oscargus · 2015-09-23T21:12:04Z

I started merging the missing characters that were present in latex2utf8 and will provide a PR in a few days.

oscargus · 2015-10-04T17:55:22Z

There's a huge list at http://www.w3.org/Math/characters/unicode.xml

koppor · 2015-12-05T12:15:54Z

The Unicode converter converts from Unicode to LaTeX and not vice versa, right? On my first try, it did not treat the Author field, but on the second try it did. I need to investigate what could have gone wrong.

oscargus · 2015-12-05T12:26:59Z

Correct. It should be possible to do it the other way around as well, similar to the export formatters XMLChars, RTFChars, and HTMLChars.

Especially, one would like to use the huge list in HTMLConverter for HTMLChars (and maybe XMLChars) as well. I think one major issue here is how to deal with {\"{a}} vs \"{a} vs \"a vs {\"a}, but looking at e.g. HtmlCharsMap, it seems like there is a solution for that in HTMLChars, so probably only a matter of converting the LaTeX commands in HTMLConverter to the same format as in HtmlCharsMap.
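As a rough illustration of the variant problem (a Python sketch, not what HTMLChars or HtmlCharsMap actually do), the four spellings can be reduced to one canonical form before any table lookup. The choice of `\"{a}` as the canonical form, the set of accent commands, and all names below are assumptions made for this example. Note that dropping the outer braces may matter in BibTeX fields where braces protect capitalization.

```python
import re

# Accent commands covered by this sketch (an assumed subset): \" \' \` \^ \~
ACCENTS = r"""["'`^~]"""

# \"a  ->  accent command directly followed by a bare letter
BARE = re.compile(r"\\(" + ACCENTS + r")([A-Za-z])")

# {\"{a}}  ->  canonical form wrapped in one extra pair of braces
WRAPPED = re.compile(r"\{(\\" + ACCENTS + r"\{[A-Za-z]\})\}")


def normalize_accents(text: str) -> str:
    """Rewrite {\\"{a}}, \\"a and {\\"a} to the single form \\"{a}."""
    # Step 1: brace the bare letter, so \"a becomes \"{a} and {\"a} becomes {\"{a}}.
    text = BARE.sub(r"\\\1{\2}", text)
    # Step 2: drop the outer braces, so {\"{a}} becomes \"{a}.
    return WRAPPED.sub(r"\1", text)
```

After this step, a conversion table only needs one entry per accented character instead of four.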

This is something that I have been thinking about, but so far I have not found the time/motivation to do it.

oscargus · 2015-12-05T23:05:50Z

There is also a class FormatChars that converts LaTeX to Unicode, which could be extended to cover everything in HTMLConverter.

oscargus · 2016-02-20T09:23:42Z

With #841 there's a huge step towards having quite good conversion in both directions.

koppor · 2016-02-21T09:21:06Z

Refs #160

koppor · 2016-03-25T12:18:56Z

Refs #1013.

tobiasdiez · 2016-04-08T21:46:40Z

What is the status of this issue? It seems like both conversion directions are present as cleanup operations.

oscargus · 2016-04-09T12:02:25Z

Agreed. Of course, it can always be improved, but I believe it is one of the better conversions (apart from LaTeX).

lenhard · 2016-04-13T18:51:36Z

This issue can be closed thanks to the cleanup operations.

koppor added type: feature ui labels Sep 11, 2015

koppor changed the title ~~Support encoding and ecoding of LaTeX characters~~ Support encoding and encoding of LaTeX characters Sep 24, 2015

koppor mentioned this issue Oct 7, 2015

Combining accents not working in Unicode converter (and more) #207

Closed

koppor changed the title ~~Support encoding and encoding of LaTeX characters~~ Support decoding and encoding of LaTeX characters Dec 5, 2015

koppor mentioned this issue Feb 21, 2016

Store LaTeX-free version inside each BibTeX entry #518

Closed

koppor mentioned this issue Feb 21, 2016

Cleanup: Offer conversion from Unicode to LaTeX #809

Closed

lenhard closed this as completed Apr 13, 2016
