Fix Google Scholar fetcher for downloading a single entry #7075
@@ -14,6 +14,17 @@ Fetchers are the implementation of the [search using online services](https://do

On Windows, you have to log-off and log-on to let IntelliJ know about the environment variable change. Execute the gradle task "processResources" in the group "others" within IntelliJ to ensure the values have been correctly written. Now, the fetcher tests should run without issues.

## Change the log levels to enable debugging
You can also just start JabRef with -debug as program argument.
String infoPageUrl = BASIC_SEARCH_URL + "q=info:" + matcher.group(1) + ":scholar.google.com/&output=cite&scirp=0&hl=en";
LOGGER.debug("Using infoPageUrl {}", infoPageUrl);
URLDownload infoPageUrlDownload = new URLDownload(infoPageUrl);
If you want to reuse the connection, you should use Unirest or jsoup.
Refs #6369
# Conflicts: # src/main/java/org/jabref/logic/importer/fetcher/GoogleScholar.java
Co-authored-by: Dominik Voigt <[email protected]>
…(and log URL) Co-authored-by: Dominik Voigt <[email protected]>
…new one) Co-authored-by: Dominik Voigt <[email protected]>
You can search either by author or by title; for a title, you need to put it in quotes. I don't understand your search query above.
Google Scholar also works without quotes. #convenience. I'm not sure whether our Google Scholar implementation should behave differently from their web page.
entry.setField(StandardField.YEAR, "2013");
entry.setField(StandardField.PAGES, "41--44");
BibEntry entry = new BibEntry(StandardEntryType.InProceedings)
        .withCitationKey("geiger2013detecting")
There is nothing wrong with the `set...` methods. The `with...` methods were added to quickly add one or two field values, mostly in lambda expressions, e.g. `map(entry -> entry.withField(...))`.
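To illustrate the difference between the two styles, here is a minimal sketch. `SketchEntry` is a hypothetical stand-in written for this example, not JabRef's `BibEntry`; only the idea that `with...` returns the entry (and therefore fits into a lambda expression) is taken from the comment above.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical stand-in for an entry class; only the two setter styles are modeled.
class SketchEntry {
    private final Map<String, String> fields = new LinkedHashMap<>();

    // Classic setter: returns nothing, so it needs a statement of its own.
    void setField(String name, String value) {
        fields.put(name, value);
    }

    // Fluent variant: returns the entry itself, so it can be used as an expression.
    SketchEntry withField(String name, String value) {
        setField(name, value);
        return this;
    }

    String getField(String name) {
        return fields.get(name);
    }
}

class WithVersusSet {
    // withField fits into a one-line lambda; setField would need a block body plus a return.
    static List<SketchEntry> markAll(List<SketchEntry> entries) {
        return entries.stream()
                      .map(entry -> entry.withField("year", "2013"))
                      .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Setter style: perfectly fine when several fields are filled one after another.
        SketchEntry viaSetters = new SketchEntry();
        viaSetters.setField("year", "2013");
        viaSetters.setField("pages", "41--44");

        List<SketchEntry> marked = markAll(List.of(new SketchEntry(), new SketchEntry()));
        System.out.println(viaSetters.getField("pages") + " " + marked.get(1).getField("year"));
    }
}
```

Both styles end up with the same field values; the fluent one only saves the block body inside the lambda.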
@Override
public String solve(String queryURL) {
    // slim implementation of https://news.kynosarges.org/2014/05/01/simulating-platform-runandwait/
    final CountDownLatch doneLatch = new CountDownLatch(1);
You need to listen for the web engine ready event; see the preview tab viewer, where we add the highlight JS stuff.
previewView.getEngine().getLoadWorker().stateProperty().addListener((observable, oldValue, newValue) -> {
    if (newValue != Worker.State.SUCCEEDED) {
        return;
    }
    // The page has finished loading; continue here.
});
See https://openjfx.io/javadoc/11/javafx.web/javafx/scene/web/WebEngine.html
> You need to listen for the web engine ready event; see the preview tab viewer, where we add the highlight JS stuff.

Does this happen synchronously? The interface for the Captcha solver is designed in a synchronous way. Otherwise, all fetchers need to be changed.
I'll be away anyway for the next days. Thus, you are free to experiment 😅
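One way to keep a synchronous `solve` while the page loads asynchronously is to block on the `CountDownLatch` until the "succeeded" callback fires. The following is only a sketch of that idea using plain `java.util.concurrent`; `AsyncLoader` and `BlockingSolverSketch` are hypothetical names standing in for the web engine and the solver, and no JavaFX is involved.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

class BlockingSolverSketch {
    // Hypothetical asynchronous API: calls onSucceeded (possibly on another thread) once loading is done.
    interface AsyncLoader {
        void load(String url, Consumer<String> onSucceeded);
    }

    // Synchronous facade: returns the loaded content, or "" on timeout or interruption.
    static String solve(AsyncLoader loader, String queryUrl, long timeoutSeconds) {
        CountDownLatch doneLatch = new CountDownLatch(1);
        AtomicReference<String> content = new AtomicReference<>("");

        loader.load(queryUrl, result -> {
            content.set(result);   // publish the result first ...
            doneLatch.countDown(); // ... then release the waiting caller
        });

        try {
            // Blocks the calling thread. If the callback needs this same thread to run,
            // the wait can only end through the timeout.
            if (!doneLatch.await(timeoutSeconds, TimeUnit.SECONDS)) {
                return "";
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "";
        }
        return content.get();
    }

    public static void main(String[] args) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        AsyncLoader loader = (url, callback) -> worker.execute(() -> callback.accept("content of " + url));
        System.out.println(solve(loader, "https://example.org", 5));
        worker.shutdown();
    }
}
```

With this shape the fetchers keep calling a blocking method, and only the solver knows about the listener.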
This directly competes with #5943, where the browser is used to communicate with Google Scholar. We should write an ADR ^^.
Since this is on the 5.3 milestone, are we, at least for now, taking this approach?
Yes, we take this approach here. The other one heavily relies on JabRef's internal save handling. This is currently handled by @Siedlerchr in #6694. I hope we will make progress somehow.
DevCall decision: A work-around using our browser extension exists. Thus, this is no longer a high priority.
Other implementation hint: https://github.com/JetBrains/jcef
Google Scholar changed their page (IMHO).
First page:
Second "page":
Clicking on "Cite" loads the following content:
There, BibTeX can be downloaded as usual.
Example URL: https://scholar.google.ch/scholar?q=info:RExzBa3OlkQJ:scholar.google.com/&output=cite&scirp=0&hl=en
Third page
Example URL: https://scholar.googleusercontent.com/scholar.bib?q=info:RExzBa3OlkQJ:scholar.google.com/&output=citation&scisdr=CgVYZoPbEOvJ014vy3E:AAGBfm0AAAAAX6Mq03ED_BBuflXyRuQujflFTqExM8uU&scisig=AAGBfm0AAAAAX6Mq0_wSs1k5gywcNDtaUBn0PeTKsRGQ&scisf=4&ct=citation&cd=-1&hl=en
Block notice:
Summary
While programming, I got as far as the last step. The issue is that after 10 tries, I get banned and cannot continue.
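For reference, the URL of the second step can be assembled from the entry id as in the sketch below. The value of `BASIC_SEARCH_URL` is an assumption derived from the example URL above (the diff only shows the constant's name), and the `INFO_ID` pattern and both method names are hypothetical helpers for this example.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CitePageUrlSketch {
    // Assumed value, taken from the example URL; the diff only shows the constant's name.
    static final String BASIC_SEARCH_URL = "https://scholar.google.ch/scholar?";

    // Hypothetical pattern: recovers the entry id from a "q=info:<id>:scholar.google.com/" query.
    static final Pattern INFO_ID = Pattern.compile("q=info:([^:&]+):scholar\\.google\\.com/");

    // Same concatenation as in the diff, with the id passed in directly.
    static String citePageUrl(String entryId) {
        return BASIC_SEARCH_URL + "q=info:" + entryId + ":scholar.google.com/&output=cite&scirp=0&hl=en";
    }

    // Returns the entry id contained in a cite-page or .bib URL, if there is one.
    static Optional<String> entryId(String url) {
        Matcher matcher = INFO_ID.matcher(url);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String url = citePageUrl("RExzBa3OlkQJ");
        System.out.println(url);
        System.out.println(entryId(url).orElse("no id found"));
    }
}
```

The id is the same in the second and third step, so it can be logged once and compared when a request gets blocked.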