Encoding Issues with FF/Win10 #274
It's working on Windows, and is correctly displayed in the browser console:
So I guess it is indeed a problem with the Python code. Just to make sure, could you try changing the (default) encoding in JabRef / your (test) library? @reox, since you apparently know Python, could you please play around with the "jabrefHost.py" script in the JabRef installation location? For example, the decoding is done at: https://github.com/JabRef/jabref/blob/master/buildres/linux/jabrefHost.py#L49. |
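For readers who want to experiment with the decoding step, here is a minimal sketch of how a native messaging host typically reads one message from the browser: a 4-byte length prefix in native byte order, followed by UTF-8 encoded JSON. This is not copied from jabrefHost.py; the function names and the `text` key are hypothetical and only serve the illustration.

```python
import io
import json
import struct


def read_native_message(stream):
    """Read one native-messaging frame: 4-byte length, then UTF-8 JSON."""
    raw_length = stream.read(4)
    if len(raw_length) < 4:
        raise EOFError("stream ended before a full length prefix was read")
    # The length prefix is an unsigned 32-bit integer in native byte order.
    (length,) = struct.unpack("=I", raw_length)
    payload = stream.read(length)
    # Decoding with anything other than UTF-8 here would mangle non-ASCII text.
    return json.loads(payload.decode("utf-8"))


def encode_native_message(obj):
    """Build a frame the way a browser would send it (for testing only)."""
    payload = json.dumps(obj, ensure_ascii=False).encode("utf-8")
    return struct.pack("=I", len(payload)) + payload


# Round trip with a micro sign (U+00B5); "text" is a hypothetical key.
frame = encode_native_message({"text": "5 \u00b5m"})
message = read_native_message(io.BytesIO(frame))
```

If the micro sign survives this round trip, the framing and decoding are not where the character gets damaged.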
No, I cannot reproduce the issue. On Ubuntu both the snap and flatpak show the correct character. |
How can I see this? When I'm opening the browser console I just see:
btw I'm on Windows - is the python script the correct one? I also changed the default encoding to windows-1251 but I get the same result. |
If you are on Windows, then the powershell script is used and not the python one. But I've double-checked it and the most recent version uses utf8 correctly before sending it to JabRef. So based on
I guess that JabRef is for some reason trying to decode it in latin1 instead of utf8. What is the encoding of your library (Library preferences)? |
It is set to UTF8 and mode biblatex and also the bib file itself is indeed utf8:
I also resolved the issue of mixed CRLF/LF and set them all to LF now, but that did not change the import result :/ Can I somehow debug what is sent from the browser to JabRef? Is there a temp file I can watch? |
Strange... Currently, it is not written to a temporary file, but you can do this at the following point: |
Okay thanks, I just did that:
but maybe that is because Out-File writes UTF-16 by default? Anyway, in the file the µ is correct. |
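One way to check what encoding a dumped file actually has is to look at its first bytes for a byte order mark (BOM). The sketch below is a hypothetical helper, not part of JabRef or the extension; it only recognizes files that start with a BOM and returns `None` otherwise (which is what a UTF-8 file without BOM looks like).

```python
import codecs


def sniff_bom(data):
    """Return the encoding implied by a leading byte order mark, or None."""
    # Check UTF-32 before UTF-16: the UTF-32 LE BOM starts with the UTF-16 LE BOM.
    candidates = [
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF32_LE, "utf-32-le"),
        (codecs.BOM_UTF32_BE, "utf-32-be"),
        (codecs.BOM_UTF16_LE, "utf-16-le"),
        (codecs.BOM_UTF16_BE, "utf-16-be"),
    ]
    for bom, name in candidates:
        if data.startswith(bom):
            return name
    return None


# Simulated file contents containing a micro sign (U+00B5).
utf16_dump = codecs.BOM_UTF16_LE + "\u00b5".encode("utf-16-le")
utf8_bom_dump = "\u00b5".encode("utf-8-sig")
utf8_plain_dump = "\u00b5".encode("utf-8")

results = [sniff_bom(d) for d in (utf16_dump, utf8_bom_dump, utf8_plain_dump)]
```

For a real file, read the first few bytes in binary mode (`open(path, "rb").read(4)`) and pass them to the helper.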
And it is working correctly if you import it from the cmd line |
that does not like it at all:
|
I recently updated to JabRef 5.3 but there is still this issue. If I download the Bibtex file from Elsevier directly (i.e. Cite -> Export citation to bibtex) I can import it into JabRef using --importToOpen. I also found out that this issue |
What happens if you use |
I just tried that but then the script crashes.
The results:
As said, utf8NoBom crashes... Unfortunately, I cannot see why - the Firefox extension simply says "Error while sending to JabRef".
It looks like I'm running PS5:
I downloaded v7 now, and in the v7 shell it seems to work with -Encoding.
Now, they are all the same :D Unfortunately, the characters are still broken. However, I can now import the dumped bib file and the characters are all correct there. |
As a workaround, I created a temporary file and import that one:
This seems to work flawlessly! |
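The workaround itself lives in the PowerShell host script, and its code is not preserved in this thread. Purely as an illustration of the idea, the Python sketch below writes the BibTeX text to a temporary file as UTF-8 without BOM and builds (but does not run) an import command using the `--importToOpen` option mentioned above. The helper names, the sample entry, and the `JabRef.exe` path are assumptions.

```python
import os
import tempfile


def write_temp_bib(bibtex):
    """Write BibTeX text to a temporary .bib file as UTF-8 without a BOM."""
    # delete=False so the file still exists when another process opens it.
    with tempfile.NamedTemporaryFile(
        mode="w", encoding="utf-8", suffix=".bib", delete=False, newline="\n"
    ) as handle:
        handle.write(bibtex)
        return handle.name


def build_import_command(jabref_exe, bib_path):
    """Build (but do not run) the command line for importing the file."""
    return [jabref_exe, "--importToOpen", bib_path]


# Hypothetical entry containing a micro sign (U+00B5).
entry = "@article{example, title = {5 \u00b5m particles}}\n"
path = write_temp_bib(entry)
command = build_import_command("JabRef.exe", path)  # hypothetical executable path

# Read the raw bytes back to confirm what ended up on disk, then clean up.
with open(path, "rb") as handle:
    data = handle.read()
os.remove(path)
```

Passing a file path instead of the BibTeX text itself keeps non-ASCII characters out of the command line, which is where the encoding appears to get lost.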
Whooo, nice! I'm glad you found a workaround. May I ask you to open a PR at the main jabref repo with the changes to the powershell script https://github.com/JabRef/jabref/blob/main/buildres/windows/JabRefHost.ps1? Your approach of writing to a temporary file also fixes JabRef/jabref#7374. |
The only issue is: it seems to only work with ps7... Thus, the script would have to check whether ps5 is in use and fall back to the old method, or use the tmpfile if ps7 is in use. |
You could use |
could work yes - however for 5.1 there seems to be no way to write a file without BOM, or at least I can not make it work. Then you also have to switch in the bat file to the correct interpreter. The MSDN tells me that the 5.1 interpreter is called |
What are the issues one encounters when using utf8 with BOM? Yes, you are right the bat file also needs to be changed to run pwsh instead of powershell. The following might be helpful for this: |
see #274 (comment) then it can not be imported in jabref. |
Ah ok, I thought it was a problem with utf16. This is a bit of a shot in the dark, but does it work if you use |
This resolves an issue where the encoding somehow got lost when using the Jabref Browser extension. It will now write a temporary file with UTF-8 encoding rather than passing the bibtex on the commandline. See JabRef/JabRef-Browser-Extension#274
yes! I created a PR here: JabRef/jabref#7918 |
Nice! Thanks a lot for your continued work on this issue. Very much appreciated ❤️ |
thank you for the powershell tricks ;) btw I hope that it does not break other users' experience on other windows versions though 😅 |
* Write temporary file on bib import
  This resolves an issue where the encoding somehow got lost when using the Jabref Browser extension. It will now write a temporary file with UTF-8 encoding rather than passing the bibtex on the commandline. See JabRef/JabRef-Browser-Extension#274
* adding changelog entry
Co-authored-by: Sebastian Bachmann <[email protected]>
This is similar to issue #33 (I also tested with the link there, and get the same buggy result)
Running version 2.4 in Firefox 85.0 on Windows 10 with JabRef 5.2--2020-12-24--6a2a512
For example, with this article: https://www.sciencedirect.com/science/article/pii/S8756328220301976
All umlauts and special characters are mangled when importing, for example µ will get μ.
A quick check in python shows that there is indeed some latin1/utf8 mixup:
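The snippet from the original report is not preserved here. A minimal sketch of such a check, assuming the text was encoded as UTF-8 on one side and decoded as Latin-1 on the other, could look like this:

```python
original = "\u00b5"  # MICRO SIGN, as it appears in the article

# Encode as UTF-8 (what the sender emits), then decode as Latin-1
# (what a misconfigured receiver would do).
utf8_bytes = original.encode("utf-8")
mangled = utf8_bytes.decode("latin-1")

print(utf8_bytes)                   # b'\xc2\xb5'
print(ascii(mangled))               # '\xc2\xb5'
print(len(original), len(mangled))  # 1 2

# The damage is reversible as long as no bytes were dropped.
repaired = mangled.encode("latin-1").decode("utf-8")
```

One character turning into two is the typical signature of this kind of mixup: each byte of the UTF-8 sequence is interpreted as a separate Latin-1 character.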
my Jabref library is configured as UTF-8.
I'm not sure whether this bug comes from the extension or whether JabRef itself is the issue again (JabRef/jabref#2013).