Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new parser for MathSciNet search #11055

Merged
merged 27 commits into from
Mar 21, 2024

Conversation

subhramit
Copy link
Member

@subhramit subhramit commented Mar 19, 2024

Closes #10996
Closes #8170
The issue exposed a bug where MathSciNet search gives the following parsing error:

Screenshot 2024-03-19 181336
(Replicated locally)

To fix the issue, a new parser has been built for MathSciNet which parses the retrieved JSON results correctly. A test has been added as well.
Files modified:
jabref/src/main/java/org/jabref/logic/importer/fetcher/MathSciNet.java
jabref/src/test/java/org/jabref/logic/importer/fetcher/MathSciNet.java
The search now works fine:
Screenshot 2024-03-19 235840

Mandatory checks

  • Change in CHANGELOG.md described in a way that is understandable for the average user (if applicable)
  • Tests created for changes (if applicable)
  • Manually tested changed features in running JabRef (always required)
  • Screenshots added in PR description (for UI changes)
  • Checked developer's documentation: Is the information available and up to date? If not, I outlined it in this pull request.
  • Checked documentation: Is the information available and up to date? If not, I created an issue at https://github.com/JabRef/user-documentation/issues or, even better, I submitted a pull request to the documentation repository.

@calixtus
Copy link
Member

calixtus commented Mar 19, 2024

Hi, thanks for your pull request, on first look, your changes are promising. Thanks especially for providing a test with your changes. However, there are some small issues with your codestyle and the MathSciNetTest is failing. Don't worry about the other fetchertests for now.
You can read about setting up your workspace, so it automatically cares about the codestyle here:
https://devdocs.jabref.org/getting-into-the-code/guidelines-for-setting-up-a-local-workspace/intellij-13-code-style.html
You can also fix the OpenRewrite test by running gradle rewriteRun from the terminal/console.

@subhramit
Copy link
Member Author

Hi, thanks for your pull request, on first look, your changes are promising. Thanks especially for providing a test with your changes. However, there are some small issues with your codestyle and the MathSciNetTest is failing. Don't worry about the other fetchertests for now. You can read about setting up your workspace, so it automatically cares about the codestyle here: https://devdocs.jabref.org/getting-into-the-code/guidelines-for-setting-up-a-local-workspace/intellij-13-code-style.html You can also fix the OpenRewrite test by running gradle rewriteRun from the terminal/console.

Hey, thanks! I missed the codestyle extension. I'll also have a look at the test and get back.

@subhramit subhramit force-pushed the fix-for-issue-10996 branch from e421224 to 3dc1f55 Compare March 19, 2024 22:03
@subhramit
Copy link
Member Author

Done! @calixtus

@subhramit subhramit force-pushed the fix-for-issue-10996 branch from cbb8a05 to e1e1daa Compare March 20, 2024 18:21
@subhramit subhramit requested a review from koppor March 20, 2024 18:27
private Optional<String> getOrNull(JSONObject item, List<String> keys) {
Object value = item;
for (String key : keys) {
if (value instanceof JSONObject) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (value instanceof JSONObject) {
if (value instanceof JSONObject obj) {
....

You can use the new instance of pattern matching syntax, which makes the extra casting step necessary
https://docs.oracle.com/en/java/javase/17/language/pattern-matching-instanceof-operator.html#GUID-843060B5-240C-4F47-A7B0-95C42E5B08A7

Copy link
Member Author

@subhramit subhramit Mar 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, I just went through the docs. If I'm not wrong, did you mean that it would make the extra casting step "unnecessary"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the casts and followed the new syntax. It is odd that IntelliJ is giving me red squiggly lines and a suggestion to cast it back. Will ignore for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay nope, I added the obj syntax but seems like the cast is necessary, else I'm getting this compilation error:
error: cannot find symbol
value = value.opt(key);
^
symbol: method opt(String)
location: variable value of type Object

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the examples in the linked docs, you need to use value = obj.get... After the if

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it!

@subhramit subhramit requested review from koppor and Siedlerchr March 21, 2024 01:27
@subhramit
Copy link
Member Author

subhramit commented Mar 21, 2024

I've made these new set of changes (wherever feasible, otherwise I've commented on your suggestions) and re-requested for review. I have learnt more from this single pull request regarding production-ready code than I ever have from my three years of CS degree. Thank you so much for your valuable inputs and patience. @koppor @Siedlerchr

Optional<BibEntry> fetchedEntry = fetcher.performSearchById("3537908");
assertEquals(Optional.of(ratiuEntry), fetchedEntry);
void getParser() throws Exception {
String fileName = "/importer/fetcher/jsonTest.json";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Java has a special folder for resources src/main/resoures and for test src/test/resources See the other importer test files there

koppor
koppor previously requested changes Mar 21, 2024
Copy link
Member

@koppor koppor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some small code comments

return Optional.of(authorsString);
}

private String getKeywords(JSONObject primaryClass) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also use the Optional<String> thing you did at toAuhtors. Because of consistency.

Comment on lines 176 to 179
// Skip author and keywords fields as they are already set
if (field == StandardField.AUTHOR || field == StandardField.KEYWORDS) {
continue;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just remove StandardField.AUTHOR and StandardField.KEYWORDS from FIELD_MAPPING. Think, this is the only "use" in the code, which savely can be removed.

// Handle articleUrl and mrnumber fields separately, as they are non-nested properties in the JSON and can be retrieved as Strings directly
String doi = item.optString("articleUrl", "");
if (!doi.isEmpty()) {
entry.setField(StandardField.DOI, doi);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try to route through DOI#parse. If that optional is present, use doi.getNormalized(), otherwise use the doi variable. In That way, the http prefix is removed if it is a valid doi, but the full string is kept in case of a DOI parsing error.

return Optional.empty();
}

// Method to change character set, to fix output string encoding
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Convert to full JavaDoc:

/**
 * ... text ...
 */

@subhramit
Copy link
Member Author

Made the third round of changes. @koppor @Siedlerchr
Awaiting more comments if any!

@subhramit subhramit requested review from koppor and Siedlerchr March 21, 2024 15:16
fix tests, rename resources
@Siedlerchr
Copy link
Member

I took the opportunity to fix some issues:

  1. You probably accidentally removed the BibTex Parser. Some json resutls already deliver
  2. I renamed the resource file
  3. Checked the test and found that the search by id returns a json which contains bibtex that we can direclty throw at our parser

@Siedlerchr Siedlerchr enabled auto-merge March 21, 2024 22:16
@subhramit
Copy link
Member Author

I took the opportunity to fix some issues:

  1. You probably accidentally removed the BibTex Parser. Some json resutls already deliver
  2. I renamed the resource file
  3. Checked the test and found that the search by id returns a json which contains bibtex that we can direclty throw at our parser

Thank you so much. (1) is a huge one, don't know if this was because of sleepless coding or a slip of keyboard. (2) will remember this convention. (3) nice catch!

@Siedlerchr Siedlerchr added this pull request to the merge queue Mar 21, 2024
Merged via the queue into JabRef:main with commit 295035a Mar 21, 2024
20 of 21 checks passed
@subhramit subhramit deleted the fix-for-issue-10996 branch March 21, 2024 22:56
@subhramit subhramit changed the title Fix for MathSciNet search Add new parser for MathSciNet search Aug 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Mathscinet search is not working MathSciNet import per ID and search does not work (anymore)
4 participants