ACM Portal fetcher returns invalid Bibtex entries #2552

Closed
mew1033 opened this issue Feb 16, 2017 · 14 comments
Labels
bug (confirmed bugs or reports that are very likely to be bugs), fetcher

Comments

@mew1033

mew1033 commented Feb 16, 2017

JabRef version 3.8.2 on Windows 10

Steps to reproduce:

1. Search for anything in the ACM Portal search.
2. Select a result.
3. The next page is blank.

@stefan-kolb stefan-kolb added the "bug" label Feb 16, 2017
@stefan-kolb
Member

stefan-kolb commented Feb 16, 2017

This is a problem with invalid keys: note the comma in the ID.

@article{Lang,Bellcore:1990:CRF:378956.378963, author = {Lang, Bellcore, Lawrence J. and Watson, James and Services, Ameritech}, title = {Connecting Remote FDDI Installations with Single-mode Fiber, Dedicated Lines, or SMDS}, journal = {SIGCOMM Comput. Commun. Rev.}, issue_date = {July 1 1990}, volume = {20}, number = {3}, month = jul, year = {1990}, issn = {0146-4833}, pages = {72--82}, numpages = {11}, url = {http://doi.acm.org/10.1145/378956.378963}, doi = {10.1145/378956.378963}, acmid = {378963}, publisher = {ACM}, address = {New York, NY, USA},}
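One possible workaround on the fetcher side would be to repair the key before handing the entry to the parser. The following is only a minimal sketch under that assumption; `CitationKeySanitizer` and `sanitize` are hypothetical names, not part of JabRef, and dropping the commas from the key is just one of several possible repair strategies. It treats everything between the opening brace and the first `, name =` as the key.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of JabRef): removes commas from a citation key. */
final class CitationKeySanitizer {

    // Group 1: "@type{"; group 2: the raw key (matched lazily);
    // group 3: the first ", fieldname =" that follows the key.
    private static final Pattern HEADER =
            Pattern.compile("^(\\s*@\\w+\\s*\\{)(.*?)(,\\s*\\w+\\s*=)", Pattern.DOTALL);

    static String sanitize(String rawEntry) {
        Matcher m = HEADER.matcher(rawEntry);
        if (!m.find()) {
            return rawEntry; // header not recognized, leave the entry untouched
        }
        // Drop commas and whitespace inside the key so that only one key token remains.
        String key = m.group(2).replaceAll("[,\\s]+", "");
        return m.group(1) + key + m.group(3) + rawEntry.substring(m.end());
    }

    public static void main(String[] args) {
        String raw = "@article{Lang,Bellcore:1990:CRF:378956.378963, "
                + "author = {Lang, Bellcore, Lawrence J.}, year = {1990},}";
        System.out.println(sanitize(raw));
    }
}
```

Only the key is changed; commas inside field values such as the author list are left alone because the lazy match stops at the first `, name =`.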

@koppor
Member

koppor commented Feb 16, 2017

The syntax of the entry is wrong:

Database file #1: paper.bib
"," immediately follows a field name---line 14 of file paper.bib
 :  @article{Lang,Bellcore:1990:CRF:378956.378963
 :                                               , author = {Lang, Bellcore, Lawrence J. and Watson, James and Services, Ameritech}, title = {Connecting Remote FDDI Installations with Single-mode Fiber, Dedicated Lines, or SMDS}, journal = {SIGCOMM Comput. Commun. Rev.}, issue_date = {July 1 1990}, volume = {20}, number = {3}, month = jul, year = {1990}, issn = {0146-4833}, pages = {72--82}, numpages = {11}, url = {http://doi.acm.org/10.1145/378956.378963}, doi = {10.1145/378956.378963}, acmid = {378963}, publisher = {ACM}, address = {New York, NY, USA},}
I'm skipping whatever remains of this entry

biblatex:

> biber paper
INFO - This is Biber 2.7
INFO - Logfile is 'paper.blg'
INFO - Reading 'paper.bcf'
INFO - Found 11 citekeys in bib section 0
INFO - Processing section 0
INFO - Looking for bibtex format file 'paper.bib' for section 0
INFO - Decoding LaTeX character macros into UTF-8
INFO - Found BibTeX data source 'paper.bib'
WARN - Entry Lang does not parse correctly
ERROR - BibTeX subsystem: C:\Users\koppor\AppData\Local\Temp\zIc7BZXNwO\paper.bib_244.utf8, line 91, syntax error: found ",", expected "="

@stefan-kolb
Member

@mew1033 Can you post the data that causes the problem so we can check? I guess it is invalid data coming from ACM, as in my example.

@stefan-kolb
Member

@lenhard Shouldn't this throw an exception? It does not. Can you comment on the parser's behavior?

    // Expected: the comma inside the key makes the entry invalid, so parsing should fail.
    @Test(expected = ParseException.class)
    public void parseEntryWithInvalidBibTeXKeyThrowsException() throws ParseException {
        parser.parseEntries("@article{Lang,Bellcore:1990:CRF:378956.378963, author={Ed von Test}}");
    }

@tobiasdiez
Member

I think this should indeed throw an exception, since in this format Lang is the BibTeX key and Bellcore:1990:CRF:378956.378963 is an invalid field-value pair.
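To illustrate that reading of the entry, here is a rough sketch of a header check. It is not JabRef's parser; `EntryHeaderCheck` and `firstProblem` are made-up names. It takes everything up to the first comma as the key and then requires the next token to be followed by `=`.

```java
import java.util.Optional;

/** Hypothetical check (not JabRef code): is the key followed by a "name = value" pair? */
final class EntryHeaderCheck {

    /** Returns a description of the first header problem, or empty if none was found. */
    static Optional<String> firstProblem(String entry) {
        int open = entry.indexOf('{');
        int firstComma = entry.indexOf(',', open + 1);
        if (open < 0 || firstComma < 0) {
            return Optional.of("no '{' or no ',' after the key");
        }
        // Everything between '{' and the first comma is read as the key.
        String key = entry.substring(open + 1, firstComma).trim();

        // Scan the next token; it must end at '=' to be a field name.
        int i = firstComma + 1;
        while (i < entry.length() && "=,}".indexOf(entry.charAt(i)) < 0) {
            i++;
        }
        String token = entry.substring(firstComma + 1, i).trim();
        boolean endsWithEquals = i < entry.length() && entry.charAt(i) == '=';
        if (endsWithEquals || token.isEmpty()) {
            return Optional.empty(); // "name =" found, or an entry with a key only
        }
        return Optional.of("key '" + key + "' is followed by '" + token
                + "', which is not a field-value pair");
    }

    public static void main(String[] args) {
        String entry = "@article{Lang,Bellcore:1990:CRF:378956.378963, author={Ed von Test}}";
        System.out.println(firstProblem(entry).orElse("no problem found"));
    }
}
```

For the entry from the test, the key is read as `Lang` and the token `Bellcore:1990:CRF:378956.378963` ends at a comma instead of `=`, which matches the errors reported by bibtex and Biber.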

@matthiasgeiger matthiasgeiger added the "status: waiting-for-feedback" label (the submitter or other users need to provide more information about the issue) Feb 16, 2017
@mew1033
Author

mew1033 commented Feb 16, 2017

@stefan-kolb Here's an example of what I did (I used a different publication, but everything on the ACM website is the same).
First I searched for something. In this case I used a DOI, but searching by name has the same problem.
[Screenshot: 2017-02-16 14_48_25-jabref]

Then I selected an article.
[Screenshot: 2017-02-16 14_48_02-title]

And then the final page was blank. :-(
[Screenshot: 2017-02-16 14_48_13-]

@stefan-kolb
Member

The exact DOI you posted here works as expected for me.

@mew1033
Author

mew1033 commented Feb 16, 2017

@stefan-kolb That's odd... Is there anything I can do to provide logs/feedback as to why it's not working for me?

@stefan-kolb
Member

Check the logs to see if there is any message.

@stefan-kolb
Member

@lenhard This is partly a problem with the parser, as it swallows problems with entries and just tries to continue parsing. See parseAndAddEntry:

        } catch (IOException ex) {
            LOGGER.debug("Could not parse entry", ex);
            parserResult.addWarning(Localization.lang("Error occurred when parsing entry") + ": '" + ex.getMessage()
                    + "'. " + Localization.lang("Skipped entry."));

        }

Maybe we don't need such robustness? Sure, it still reads all the other entries, but there is still an invalid entry. With the new logic (by Tobias?), the parser result and its warnings are somewhat hidden, so such entries are effectively just ignored. WDYT?

@lenhard
Member

lenhard commented Feb 18, 2017

I'm rather indifferent as to which approach we adopt here. I guess we can also just throw a ParserException and crash the whole run instead. There are good reasons for both options.

We could even implement two different modes for parsing: a strict mode (that crashes if there are warnings) and a non-strict mode (that tries to read anything it can).
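The two modes could look roughly like the sketch below. This is an illustration only, not JabRef's BibtexParser: `TwoModeParser`, `Mode`, `Result`, and `parseKey` are hypothetical names, and `parseKey` is a stand-in that only validates the entry header and extracts the key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of strict vs. non-strict parsing (not JabRef code). */
final class TwoModeParser {

    enum Mode { STRICT, LENIENT }

    static final class Result {
        final List<String> keys = new ArrayList<>();
        final List<String> warnings = new ArrayList<>();
    }

    /** Stand-in for a real entry parser: validates the header and returns the key. */
    static String parseKey(String entry) {
        // Valid header: "@type{key," followed by "name = ..." or the closing brace.
        String header = "(?s)\\s*@\\w+\\s*\\{[^,{}\\s]+\\s*,\\s*(\\w+\\s*=.*|\\})\\s*";
        if (!entry.matches(header)) {
            String start = entry.substring(0, Math.min(30, entry.length()));
            throw new IllegalArgumentException("Malformed entry header near '" + start + "'.");
        }
        return entry.substring(entry.indexOf('{') + 1, entry.indexOf(',')).trim();
    }

    static Result parseAll(List<String> entries, Mode mode) {
        Result result = new Result();
        for (String entry : entries) {
            try {
                result.keys.add(parseKey(entry));
            } catch (IllegalArgumentException ex) {
                if (mode == Mode.STRICT) {
                    throw ex; // strict: abort the whole run on the first bad entry
                }
                // lenient: record a warning and continue with the next entry
                result.warnings.add(ex.getMessage() + " Skipped entry.");
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> entries = Arrays.asList(
                "@article{A1, title = {T}}",
                "@article{Lang,Bellcore:1990:CRF:378956.378963, author={Ed von Test}}");
        Result lenient = parseAll(entries, Mode.LENIENT);
        System.out.println(lenient.keys + " / " + lenient.warnings);
    }
}
```

Whichever mode is the default, the caller still has to look at `warnings` after a lenient run; otherwise skipped entries stay hidden, which is the concern raised above.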

@Siedlerchr
Member

I would prefer a non-strict but robust mode with error logging.
Regarding usability, a single wrong input should not break the whole system or input process, and a working result should be reachable with minimal effort even if the input contains errors (ISO 9241, Part 10).

@stefan-kolb stefan-kolb removed the "status: waiting-for-feedback" label Feb 19, 2017
@mew1033
Author

mew1033 commented Feb 19, 2017

Aaaaand, today it's working. Weird.

@stefan-kolb stefan-kolb changed the title from "ACM Portal fetcher doesn't work on second page" to "ACM Portal fetcher returns invalid Bibtex entries" Feb 20, 2017
@LinusDietz
Member

I guess we can close this now.
