Backend › Extensions
What if you want to add features to the Datashare backend?
Unlike plugins, which modify the Datashare frontend, extensions extend the backend. Two extension points are defined:
- NLP pipelines: you can add a new Java NLP pipeline to Datashare
- HTTP API: you can add HTTP endpoints to Datashare and call the Java API you need from those endpoints
Since version 7.5.0, instead of modifying Datashare directly, you can isolate your code as a specific set of features and then configure Datashare to use it. Each Datashare user can pick the extensions they need or want, and have a fully customized installation of our search platform.
- Getting started
- Installing and Removing registered extensions
- Create your first extension
- Installing and Removing your custom extension
When starting, Datashare can receive an extensionsDir option pointing to your extensions directory. In this example, let's call it /home/user/extensions:
mkdir /home/user/extensions
datashare --extensionsDir=/home/user/extensions
You can list the official Datashare extensions like this:
$ datashare -m CLI --extensionList
2020-08-29 09:27:51,219 [main] INFO Main - Running datashare
extension datashare-extension-nlp-opennlp
OPENNLP Pipeline
7.0.0
https://github.com/ICIJ/datashare-extension-nlp-opennlp/releases/download/7.0.0/datashare-nlp-opennlp-7.0.0-jar-with-dependencies.jar
Extension to extract NER entities with OPENNLP
NLP
...
If you know what you are looking for, you can pass a regular expression to --extensionList to filter the extension list.
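To get a feel for such a filter before running the CLI, the sketch below applies a regular expression to a list of extension ids in plain Java. The ids are the two that appear on this page. The class name `ExtensionFilter` and the use of a partial match (`find()`) are assumptions made for illustration, not Datashare's actual matching code.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class ExtensionFilter {
    // Keeps the ids in which the regular expression matches somewhere (partial match).
    // Assumption: Datashare may match differently, for example against the full id.
    static List<String> filter(List<String> ids, String regex) {
        Pattern pattern = Pattern.compile(regex);
        return ids.stream()
                .filter(id -> pattern.matcher(id).find())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> ids = List.of(
                "datashare-extension-nlp-opennlp",
                "datashare-extension-nlp-mitie");
        System.out.println(filter(ids, ".*mitie.*")); // prints [datashare-extension-nlp-mitie]
    }
}
```

Writing the pattern as `.*mitie.*` gives the same result whether the tool uses a partial or a full match, which makes it a safe choice when the matching mode is unknown.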
You can install an extension by giving its id and the directory where the Datashare extensions are stored:
$ datashare -m CLI --extensionInstall datashare-extension-nlp-mitie --extensionsDir "/home/user/extensions"
2020-08-29 09:34:30,927 [main] INFO Main - Running datashare
2020-08-29 09:34:32,632 [main] INFO Extension - downloading from url https://github.com/ICIJ/datashare-extension-nlp-mitie/releases/download/7.0.0/datashare-nlp-mitie-7.0.0-jar-with-dependencies.jar
2020-08-29 09:34:36,324 [main] INFO Extension - installing extension from file /tmp/tmp218535941624710718.jar into /home/user/extensions
Then if you launch Datashare with the same extension location, the extension will be loaded.
When you want to stop using an extension, you can either delete the jar from the extensions folder by hand or remove it with datashare --extensionDelete:
$ datashare -m CLI --extensionDelete datashare-extension-nlp-mitie --extensionsDir "/home/user/extensions/"
2020-08-29 09:40:11,033 [main] INFO Main - Running datashare
2020-08-29 09:40:11,249 [main] INFO Extension - removing extension datashare-extension-nlp-mitie jar /home/user/extensions/datashare-nlp-mitie-7.0.0-jar-with-dependencies.jar
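Before deleting a jar by hand, it helps to see which jars are in the extensions directory. The Java sketch below lists them. The class name `ExtensionsDirListing` is hypothetical, and treating every `*.jar` file directly inside the directory as an installed extension is an assumption about how Datashare scans it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ExtensionsDirListing {
    // Returns the names of the jar files directly inside dir, sorted alphabetically.
    static List<String> listJars(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                    .filter(Files::isRegularFile)
                    .map(p -> p.getFileName().toString())
                    .filter(name -> name.endsWith(".jar"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the directory used in the examples on this page.
        Path dir = Path.of(args.length > 0 ? args[0] : "/home/user/extensions");
        if (!Files.isDirectory(dir)) {
            System.out.println("no such directory: " + dir);
            return;
        }
        listJars(dir).forEach(System.out::println);
    }
}
```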
You can create a "simple" Java project like https://github.com/ICIJ/datashare-extension-nlp-mitie (as simple as a Java project can be, right?), with your preferred build tool.
You will have to add a dependency on the latest version of datashare-api.jar to be able to implement your NLP pipeline.
With the datashare-api dependency, you can then create a class implementing Pipeline or extending AbstractPipeline. When Datashare loads the jar, it looks for an implementation of the Pipeline interface.
Unfortunately, you will also have to make a pull request to datashare-api to add a new type of pipeline. We will remove this step in the future.
Build the jar with its dependencies and install it in /home/user/extensions, then start Datashare with extensionsDir set to /home/user/extensions. Your extension will be loaded by Datashare.
Finally, your pipeline will be listed in the available pipelines in the UI, when doing NER.
Making an HTTP extension works the same way as for NLP: you create a Java project that builds a jar.
The only dependency you need is fluent-http, because Datashare looks for the fluent-http annotations @Get, @Post, @Put...
For example, we can create a small class like:
package org.myorg;

import net.codestory.http.annotations.Get;
import net.codestory.http.annotations.Prefix;

@Prefix("myorg")
public class FooResource {
    @Get("foo")
    public String getFoo() {
        return "hello from foo extension";
    }
}
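To illustrate what "looking for annotations" means in general, here is a sketch of annotation discovery with plain Java reflection. It deliberately does not use fluent-http: the `Get` annotation, `HelloResource` and `RouteScanner` below are local stand-ins defined only for this example. How Datashare and fluent-http actually scan the classes of a jar is not shown here.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

class RouteScanner {
    // Local stand-in for a routing annotation; NOT net.codestory.http.annotations.Get.
    @Retention(RetentionPolicy.RUNTIME) // must be kept at runtime to be visible to reflection
    @Target(ElementType.METHOD)
    @interface Get {
        String value();
    }

    // Hypothetical resource class, shaped like FooResource above.
    static class HelloResource {
        @Get("foo")
        public String getFoo() {
            return "hello from foo extension";
        }

        // Not annotated, so the scanner ignores it.
        public String helper() {
            return "not a route";
        }
    }

    // Maps each @Get path to the name of the method that handles it.
    static Map<String, String> scan(Class<?> resource) {
        Map<String, String> routes = new TreeMap<>();
        for (Method method : resource.getDeclaredMethods()) {
            Get get = method.getAnnotation(Get.class);
            if (get != null) {
                routes.put(get.value(), method.getName());
            }
        }
        return routes;
    }

    public static void main(String[] args) {
        System.out.println(scan(HelloResource.class)); // prints {foo=getFoo}
    }
}
```

The practical consequence for an extension author is that only annotated methods become endpoints; helper methods on the same class are left alone.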
Build the jar, copy it to /home/user/extensions, then start Datashare:
$ datashare --extensionsDir /home/user/extensions/
# ... starting logs
2020-08-29 11:03:59,776 [Thread-0] INFO ExtensionLoader - loading jar /home/user/extensions/my-extension.jar
2020-08-29 11:03:59,779 [Thread-0] INFO CorsFilter - adding Cross-Origin Request filter allows *
2020-08-29 11:04:00,314 [Thread-0] INFO Fluent - Production mode
2020-08-29 11:04:00,331 [Thread-0] INFO Fluent - Server started on port 8080
Et voilà 🔮! You can query your new endpoint. Easy, right?
$ curl localhost:8080/myorg/foo
hello from foo extension
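The same check can be scripted from Java with the JDK's built-in `java.net.http.HttpClient` (Java 11 or later). This is only a sketch: the default URL assumes Datashare listens on port 8080 as in the log above, and the class name `FooEndpointCheck` is hypothetical.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class FooEndpointCheck {
    // Sends a GET request and returns the response body as text.
    // Throws IOException when the server answers with a status other than 200.
    static String get(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("unexpected status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumes a running Datashare with the extension loaded.
        String url = args.length > 0 ? args[0] : "http://localhost:8080/myorg/foo";
        try {
            System.out.println(get(url));
        } catch (IOException e) {
            System.out.println("could not query " + url + ": " + e);
        }
    }
}
```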
You can also install and remove your custom extension with the Datashare CLI.
Install it from the jar file with:
$ datashare -m CLI --extensionInstall /home/user/src/my-extension/dist/my-extension.jar --extensionsDir "/home/user/extensions"
2020-07-27 10:02:32,381 [main] INFO Main - Running datashare
2020-07-27 10:02:32,596 [main] INFO ExtensionService - installing extension from file /home/user/src/my-extension/dist/my-extension.jar into /home/user/extensions
And remove it:
$ datashare -m CLI --extensionDelete my-extension.jar --extensionsDir "/home/user/extensions"
2020-08-29 10:45:37,363 [main] INFO Main - Running datashare
2020-08-29 10:45:37,579 [main] INFO Extension - removing extension my-extension jar /home/user/extensions/my-extension.jar