Keep unknown commands #9
Comments
Parsing-wise this is a little tricky, because braces are generally not kept in the output unless they are escaped (the same goes for consecutive spaces). Creating a special case here can get quite messy. The problem is that even if we add special cases to preserve braces, the arity of an unknown command is, by definition, unknown, so we still don't know how many of the brace groups that follow should be kept. E.g. the output for
From my understanding, you have a problem with the output when an unknown command is not followed by spaces. E.g.
test("Unknown commands") {
  LaTeX2Unicode.convert("\\this \\is \\alpha test") shouldBe "{\\this} {\\is} α test"
  LaTeX2Unicode.convert("\\unknown command") shouldBe "{\\unknown} command"
  LaTeX2Unicode.convert("\\unknown{} empty params") shouldBe "{\\unknown} empty params"
  LaTeX2Unicode.convert("\\unknown{cmd}") shouldBe "{\\unknown}cmd"
}
I think we can go with the assumption
Not sure I can agree. For example, if
Generally, I think it's really tricky to come up with well-defined rules that guess the arity of an unknown command while sensibly covering all these edge cases. Even with such rules, context-dependent arity guessing would add a lot of parsing complexity, likely with a performance penalty, too.
I agree that it is very tricky. In our use case, we just need symbol replacement, so all TeX commands that do not directly produce symbols can be left untranslated. This especially includes superscript, italics and boldface. Converting superscripts is causing huge issues (see JabRef/jabref#3644 and JabRef/jabref#2596). Would it be possible to introduce a flag (or a second conversion method?) that replaces symbols only? That would help us a lot and leave the other features intact, without any issues with unknown commands.
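To make the symbols-only idea concrete, here is a rough standalone sketch. It is not part of the LaTeX2Unicode API: `SymbolsOnly`, `symbols` and `replaceSymbols` are hypothetical names, and the symbol table is only a tiny sample. Everything that is not in the table, including braces and unknown commands, is left exactly as written.

```scala
import scala.util.matching.Regex

object SymbolsOnly {
  // Hypothetical sample table; a real one would be far larger.
  val symbols: Map[String, String] = Map(
    "alpha" -> "α",
    "beta"  -> "β",
    "infty" -> "∞"
  )

  // A backslash followed by either another backslash (the LaTeX line break
  // `\\`) or a run of ASCII letters (a control word). Matching `\\` as a unit
  // keeps the second backslash from being read as the start of a command.
  private val command: Regex = """\\(\\|[a-zA-Z]+)""".r

  // Replace commands found in the table; return every other match unchanged.
  def replaceSymbols(input: String): String =
    command.replaceAllIn(input, m =>
      Regex.quoteReplacement(symbols.getOrElse(m.group(1), m.matched)))
}
```

With this sketch, `SymbolsOnly.replaceSymbols("x\\textsuperscript{2} \\alpha")` leaves `\textsuperscript{2}` alone and only turns `\alpha` into `α`. It deliberately does no parsing of arguments, so the arity question never comes up.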
I think that's a separate issue. From my understanding, the issue here is that LaTeX generally doesn't keep unescaped brackets. Alternatively, maybe we can disregard the standard LaTeX behavior and offer an option to always keep the brackets in the output?
+1 for the option to always keep the brackets in outputs!
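As a rough illustration of what "always keep the brackets" could mean for an unknown command, the sketch below finds the end of a command together with every balanced brace group that follows it directly, so that the whole span can be copied verbatim. This is only a sketch under assumptions: `KeepUnknown`, `groupEnd` and `verbatimEnd` are hypothetical names, not anything in LaTeX2Unicode, and "take all directly following groups" is a guess that sidesteps the arity question rather than answering it.

```scala
import scala.annotation.tailrec

object KeepUnknown {
  // End index (exclusive) of the balanced brace group opening at `open`,
  // or -1 if it is never closed. A backslash skips the character after it,
  // so escaped braces such as \{ do not affect the nesting depth.
  def groupEnd(s: String, open: Int): Int = {
    @tailrec
    def loop(i: Int, depth: Int): Int =
      if (i >= s.length) -1
      else s.charAt(i) match {
        case '\\'              => loop(i + 2, depth)
        case '{'               => loop(i + 1, depth + 1)
        case '}' if depth == 1 => i + 1
        case '}'               => loop(i + 1, depth - 1)
        case _                 => loop(i + 1, depth)
      }
    loop(open, 0)
  }

  // End index (exclusive) of the command starting at `start` (which must
  // point at a backslash) plus all brace groups that follow it directly.
  def verbatimEnd(s: String, start: Int): Int = {
    val nameEnd = s.indexWhere(c => !c.isLetter, start + 1) match {
      case -1 => s.length
      case i  => i
    }
    @tailrec
    def groups(i: Int): Int =
      if (i < s.length && s.charAt(i) == '{') {
        val end = groupEnd(s, i)
        if (end == -1) i else groups(end) // stop at an unbalanced group
      } else i
    groups(nameEnd)
  }
}
```

For `"\\unknown{} empty params"`, `verbatimEnd(s, 0)` stops right after the empty braces, so `s.substring(0, end)` keeps `\unknown{}` intact. The obvious caveat is over-capturing: if the unknown command actually takes fewer arguments than there are groups, the extra groups are ordinary LaTeX groups whose braces would normally disappear, and this approach keeps them anyway.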
Hi @tomtung
You recently introduced the feature that unknown commands are preserved.
However, parameters in braces are moved out of their braces, and empty braces are stripped.
We discussed this at the JabRef dev call and came to the conclusion that it would be beneficial if the commands are just kept as they are if they are unknown.
I added a few tests that show the intended behavior.
Could you help us in bringing this into Scala code?
Or in general, do you agree with our thoughts?
Best regards,
Stefan and the JabRef team