Cleanup entry "Move DOIs from note and URL field to DOI field and remove http prefix" incorrectly recognizes URLs ending with "2010/stuff" as DOIs #6880
Comments
Can I please do this?
@PremKolar Sure, go ahead!
This is not as straightforward as I thought.
I don't think there is a way to safely detect these in a URL or in some other field. My only idea was to keep the entry in its original field when a short DOI is found, so that no information is lost in case of ambiguity. But this would still inevitably put wrong data in the DOI field sometimes, e.g. when the URL field is https://www.abc.de/10/abcd or when the Note field reads 01/10/2012. Anyone willing to share their thoughts?
This one isn't matched because there's no 10, though, right? I think that detecting what comes before the 10 and ensuring that it's a valid separator would already be a great improvement. Another option is to query DOI validity (I think there's already something like this in automatically searching for DOIs for an entry). If the matched DOI isn't valid (I don't think 10/summary is, for example), then it shouldn't be moved to the DOI field.
Exactly, that's the second problem. OK, yes, validating the DOI is of course the obvious solution to this problem. Thanks for the idea!
Please keep in mind that the cleanup actions can be executed for all entries in your library. So if you have thousands of entries, you would generate thousands of requests to the DOI resolver.
Right.
JabRef version 5.2--2020-09-06--c0b139a on Windows 10 10.0 amd64, Java 14.0.2
Steps to reproduce the behavior:
as a `.bib` file.
Note that the new source is
This URL is not a DOI link, though! Presumably this is because the matcher code at `jabref/src/main/java/org/jabref/model/entry/identifier/DOI.java` (lines 30 to 77 in ba68c09) considers all non-space text starting with `http://` or `https://`, followed by `10/`, followed by any non-space text, to be a DOI. This is absurd. The character immediately preceding the `10`, `doi:`, or `urn:` should at the very least be required to be a URL separator character such as `/`, `:`, `?`, `&`, or `=`.
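The separator requirement proposed in this issue could be sketched in Java roughly as follows. This is a minimal sketch, not JabRef's actual matcher: the class name `ShortDoiCandidate`, the method `find`, and the pattern itself are hypothetical, and accepting a candidate at the very start of the text is an added assumption.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ShortDoiCandidate {

    // Hypothetical pattern (not the one in DOI.java): "10/" followed by
    // non-space text, accepted only at the start of the input or directly
    // after one of the separator characters / : ? & =
    private static final Pattern CANDIDATE =
            Pattern.compile("(?:^|(?<=[/:?&=]))10/\\S+");

    // Returns the first short-DOI-like candidate in the text, if any.
    static Optional<String> find(String text) {
        Matcher matcher = CANDIDATE.matcher(text);
        return matcher.find() ? Optional.of(matcher.group()) : Optional.empty();
    }

    private ShortDoiCandidate() {
    }
}
```

With this check, a made-up URL such as `https://example.org/archive/2010/stuff` no longer yields a candidate, because the `10` is preceded by a digit. The ambiguous cases raised in the comments, `https://www.abc.de/10/abcd` and `01/10/2012`, still match, so the separator check narrows the problem rather than solving it.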