Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Use v8 engine for CSL #2250

Merged
merged 4 commits into from
Dec 5, 2016

Conversation

shitikanth
Copy link
Contributor

@shitikanth shitikanth commented Nov 8, 2016

Changes made:

  • Added a compile dependency on com.eclipsesource.j2v8 in build.gradle

We are using citeproc-java library in CitationStyleGenerator to generate previews, which itself is basically a wrapper around the javascript library citeproc-js. Switching to V8 engine dramatically improves the time taken to generate these previews. For example, on my laptop, generating preview used to take about 20s for the first preview, and 10s for each subsequent preview; using V8 reduces both by a factor of 10.

  • Manually tested changed features in running JabRef

…build.gradle.

Citeproc-java detects that V8 is present in the classpath and starts using it
automatically.
@matthiasgeiger
Copy link
Member

Hi and thanks for your contribution.

However, I suppose it will be difficult to actually use com.eclipsesource.j2v8 in this way: If the concrete packages are in fact platform dependent it won't be possible to provide a platform independet jar file anylonger.

@shitikanth
Copy link
Contributor Author

Ooh good point! If the slow performance of CSL citation generation isn't an issue for others, I wouldn't bother about it too much. Maybe we can include V8 in the platform specific binaries though?

}
if (dep == null) {
logger.error("Could not find V8 runtime compatible to this system")
return null
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we just return an empty string here so that it compiles at other platforms, too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that is a good idea, but the jar we produce will still not be platform independent (as @matthiasgeiger mentioned). Maybe we can define different gradle tasks for a platform independent jar and platform specific builds.

What do you guys feel about just directly implementing citeproc natively in Java? I can take it up if you think that it will be worthwhile.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seeing that (i) the current rendering is very slow and (ii) others will surely use that library too, citeproc natively in Java is a huge win for the community.

I fear, however, that it will be a huge effort and we would also need that effort in JabRef itself.

Copy link

@rmzelle rmzelle Dec 24, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you guys feel about just directly implementing citeproc natively in Java? I can take it up if you think that it will be worthwhile.

Hi all, CSL maintainer here. I have never worked on a citeproc implementation myself, but just a friendly warning that supporting the full CSL 1.0.1 specification is a significant undertaking. I'm cc'ing some developers who might be able to give you some additional pointers/warnings.

(cc @michel-kraemer (citeproc-java), @fbennett (citeproc-js), @inukshuk (citeproc-ruby))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can confirm this. In fact, I initially thought about creating a pure Java implementation myself, but citeproc-js is so mature and well-maintained, you'll probably never reach the same quality -- at least I would never be able to do so.

Anyway, there are a couple of options to improve the performance of citeproc-java. You already discovered you can use V8, although it isn't documented anywhere. You might be able to improve the performance further by caching the processor instance. But maybe you already do so, I don't know your code well enough. If you're interested we can discuss this topic further. I'd be more than happy to help.

@koppor
Copy link
Member

koppor commented Nov 8, 2016

Note that the code is based o https://github.com/michel-kraemer/citeproc-java/blob/master/citeproc-java/build.gradle#L26

https://github.com/michel-kraemer/citeproc-java/blob/master/citeproc-java/build.gradle#L56:

    if (getV8Dependency() != null) {
        provided getV8Dependency()
}

Maybe replace provided by compile?

@koppor
Copy link
Member

koppor commented Nov 13, 2016

I would vote for generating specific JARs and package them for MacOSX and Windows. For linux, we could generate two JARs, too.

@koppor
Copy link
Member

koppor commented Nov 13, 2016

Could you also update external-libraries.txt as described at https://github.com/JabRef/jabref/blob/master/CONTRIBUTING.md#when-adding-a-library. We need that to ease debian packaging (refs JabRef#135).

@koppor
Copy link
Member

koppor commented Nov 13, 2016

I used your patch here with the new engine. It is very fast. Seeing that, I think, we should invest time for a correct build of JabRef and not reimplementing the whole functionality in Java. That time should better be spent in JabRef itself 😇

@koppor koppor changed the title Use v8 engine for CSL [WIP] Use v8 engine for CSL Nov 13, 2016
@Siedlerchr
Copy link
Member

Another option would be to check if it works wirh jdk8 nashorn engine as discussed here michel-kraemer/citeproc-java#28

@shitikanth
Copy link
Contributor Author

@Siedlerchr, I did try building with the nashorn engine, the performance wasn't improved by very much at least on my machine.

Id: com.eclipse.j2v8
Project: J2V8
URL: https://github.com/eclipsesource/J2V8
Licence: EPL- 1.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you remove the space before 1.0 so it is a valid SPDX license identifier?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@koppor
Copy link
Member

koppor commented Nov 28, 2016

@shitikanth Could you try to adapt the gradle build file to create a plattform-specific jar and a plattform-neutral jar? The speedup of the V8 engine is huge and I would really like to get this into the next JabRef release.

@shitikanth
Copy link
Contributor Author

@koppor I am not familiar with JabRef's build/release process, so it will help me if can you specify what behaviour you want. For example, does the following expected behaviour look okay to you?

  • ./gradlew releaseJar: creates platform specific build like JabRef--linux_x86-64.jar
  • ./gradlew shadowJar: creates a platform independent build.

@matthiasgeiger
Copy link
Member

I think during release the shadowJar variant is used which creates a fat-jar. The needed behavior would be to create 5 different fat-Jars: Windows x86, Win x64, Linux, MacOSX, and independent - which is then used to create the platform specific installers using install4j.

@stefan-kolb Can you assist here? Is it possible to use different jars in install4j?

@stefan-kolb
Copy link
Member

Yes, shadowJar is used right now as far as I know. As for the executables, we probably need to add more launchers inside install4j for every configuration.

@koppor koppor self-assigned this Dec 2, 2016
@koppor koppor added this to the v3.8 milestone Dec 2, 2016
@koppor
Copy link
Member

koppor commented Dec 5, 2016

To illustrate the performance gain, I made two videos:

without V8

without-v8

with V8

with-v8

@stefan-kolb
Copy link
Member

Hm, I ask myself why this stuff is taking so long anyway? It is just a String creation wtf?

@koppor
Copy link
Member

koppor commented Dec 5, 2016

http://eclipsesource.com/blogs/2014/11/17/highly-efficient-java-javascript-integration/:

Unlike other JS Runtimes (including JV8, Jav8), J2V8 takes a primitive based approach, resulting in much less garbage.

I am not in the internals of JavaScript engines and cannot really compare Rhino and Nashorn to J2V8.

@koppor
Copy link
Member

koppor commented Dec 5, 2016

IMHO the alternative is to implement citeproc-java directly in Java as suggested by @shitikanth at #2250 (comment). Maybe, this could be a project for the time between the years? 😇

@koppor koppor changed the base branch from master to use-V8-for-citeproc December 5, 2016 17:35
@koppor koppor merged commit de2823d into JabRef:use-V8-for-citeproc Dec 5, 2016
@koppor
Copy link
Member

koppor commented Dec 5, 2016

I merged to use-V8-for-citeproc to enable other developers continuing working on it.

koppor pushed a commit that referenced this pull request Dec 5, 2016
Add a dependency to com.eclipsesource.J2V8 (Java Bindings for V8) in build.gradle. Citeproc-java detects that V8 is present in the classpath and starts using it automatically.
@michel-kraemer
Copy link
Contributor

Hm, I ask myself why this stuff is taking so long anyway? It is just a String creation wtf?

The citeproc.js file is 664 KB large and the code is very complex. It seems Nashorn is not very fast when compiling and running such a file for the first time. However, since it uses JIT compilation, it gets very fast the longer the program runs and the more often you execute the file.

koppor pushed a commit that referenced this pull request Apr 17, 2017
Add a dependency to com.eclipsesource.J2V8 (Java Bindings for V8) in build.gradle. Citeproc-java detects that V8 is present in the classpath and starts using it automatically.
@koppor koppor mentioned this pull request Sep 1, 2017
9 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants