Commit

Merge remote-tracking branch 'origin/develop'
# Conflicts:
#	README.md
smierz committed Nov 5, 2021
2 parents 184113c + 7ea1f5d commit 00ec451
Showing 62 changed files with 892 additions and 351 deletions.
19 changes: 17 additions & 2 deletions CHANGELOG.md
@@ -3,17 +3,32 @@ All notable changes to this project will be documented in this file.

## [Unreleased]

## [1.2.0] - 2021-11-05
### Added
- add queries for Crossref work and personPlusWorks
- add tests for all controllers
- add new parameter for ROR affiliation to ORCID query

### Changed
- refactor all controllers so GET-path matches resource path
- improve README
- remove Dockerfiles and instead use Spring Docker capabilities
- use Spring profiles to determine how and where data is written


## [1.1.0] - 2021-06-22
### Added
- query ORCID for person and current employees at ROR organization
- check GraphQL queries for response code 200 and additionally if response body contains error messages (no data)

### Changed
- refactor all queries to follow three steps: source, extraction, mapping to vivo-rdf
- make all controllers use HTTP-GET requests
- centralize input validation
- export to VIVO in chunks
- simplify error handling
- document methods in swagger-UI and remove response section


## [Renamed Project] - 2021-05-25
As new data sources were integrated and the name datacitecommons2vivo was not reflecting
14 changes: 0 additions & 14 deletions Dockerfile

This file was deleted.

23 changes: 0 additions & 23 deletions DockerfileBuild

This file was deleted.

99 changes: 43 additions & 56 deletions README.md
@@ -1,79 +1,66 @@
![Java](https://img.shields.io/badge/java-%23ED8B00.svg?style=for-the-badge&logo=java&logoColor=white)
![Spring](https://img.shields.io/badge/spring-%236DB33F.svg?style=for-the-badge&logo=spring&logoColor=white)
![Docker](https://img.shields.io/badge/docker-%230db7ed.svg?style=for-the-badge&logo=docker&logoColor=white)

[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)
![Open Source Love](https://badges.frapsoft.com/os/v3/open-source.svg?v=102)

## generate2vivo
generate2vivo is an extensible Data Ingest Tool for the open source software VIVO.
It currently queries metadata from Datacite Commons, ROR and ORCID
generate2vivo is an extensible Data Ingest and Transformation Tool for the open source software VIVO.
It currently contains queries for metadata from [Datacite Commons](https://commons.datacite.org/), [Crossref](https://www.crossref.org/), [ROR](https://ror.org/) and [ORCID](https://orcid.org/)
and maps them to the VIVO ontology using [sparql-generate](https://ci.mines-stetienne.fr/sparql-generate/index.html).
The resulting RDF data can be exported to a VIVO instance directly or returned in a HTTP response.
The resulting RDF data can be exported to a VIVO instance (or any SPARQL endpoint) directly or it can be returned in JSON-LD.

<hr style="border:1px solid gray"> </hr>

- [Available queries](#available-queries)
+ [Datacite Commons](#datacite-commons)
+ [ROR](#ror)
+ [ORCID](#orcid)
- [Features](#features)
- [Installation](#installation)
- [Run in Command Line](#run-in-command-line)
- [Extensible](#extensible)
- [Wiki resources](#wiki-resources)


### Features
![generate2vivo features](https://raw.githubusercontent.com/wiki/vivo-community/generate2vivo/images/generate2vivo.png)

### Available queries
The data sources and queries that are currently available are listed below.
The starting point was the sparql-generate library, which we use as the engine for our transformations; the transformations themselves are defined in separate GENERATE queries. Note that code and queries are kept separate, which allows users
* to write or change queries without going into the code
* to reuse the queries (meaning you can drop the code and use only the queries, for example with the command line tool provided on the sparql-generate website)
* to reuse the code (meaning you can drop the queries if the data sources are not relevant for you and use only the code with your own queries)

##### Datacite Commons
For [Datacite Commons](https://commons.datacite.org/) the following queries are available:
* `organization`: This method gets data about an organization by passing a ROR id.
* `organizationPlusPeople`: This method gets data about an organization and its affiliated people by passing a ROR id.
* `organizationPlusPeoplePlusPublications`: This method gets data about an organization, its affiliated people and their respective publications by passing a ROR id.
* `person`: This method gets data about a person by passing an ORCID id.
* `personPlusPublications`: This method gets data about a person and their publications by passing an ORCID id.
* `work`: This method gets data about a work by passing a DOI.

##### ROR
For [ROR](https://ror.org/) two queries are available:
* `organization`: This method gets data about an organization by passing a ROR id.
* `organizationPlusChildren`: This method gets data about an organization and all of its sub-organizations by passing a ROR id.
In addition, we gave the application a REST API so that other programs or services can communicate with it via HTTP requests, which allows generate2vivo to be integrated into an existing data ingest process.

##### ORCID
For [ORCID](https://orcid.org/) the following queries are available:
* `personPlusWorks`: This method gets data about a person and their works by passing an ORCID id.
* `currentEmployeesPlusWorks`: This method gets data about an organization's current employees and their works by passing a ROR id.

We also added output functionality: the generated data can either be exported directly into a VIVO instance via its SPARQL API, or, if you want to check the data before importing or are using a messaging service like Apache Kafka, it can be returned as JSON-LD for post-processing.


### Installation
0. Prerequisites: You need to have maven and a JDK for Java 11 installed.
1. Clone the repository to a local folder using `git clone https://github.com/vivo-community/generate2vivo.git`
2. Change into the folder where the repository has been cloned.
3. Open `src/main/resources/application.properties` and change your VIVO details accordingly.
2. Change into the folder where the repository has been cloned.
3. Open `src/main/resources/application.properties` and change your VIVO details accordingly.
If you don't provide `vivo.url`, `vivo.email` or `vivo.password`, the application will not import the mapped data into VIVO but return the triples as JSON-LD.
3. Run the application:
* If you have maven and a JDK for Java 11 installed, you can run the application directly via `mvn spring-boot:run`.

* Alternatively you can compile & run the application in Docker (with or without Java setup):
```bash
# with Java setup:
mvn package
docker build -t g2v .
docker run -p 9000:9000 -t g2v

# without Java setup:
docker build -f DockerfileBuild -t g2v .
docker run -p 9000:9000 -t g2v
```
4. Run the application:
* You can run the application directly via `mvn spring-boot:run`.
* Or alternatively you can run the application in Docker:
```bash
mvn spring-boot:build-image
docker run -p 9000:9000 generate2vivo:latest
```

5. A minimal swagger-ui will be available at `http://localhost:9000/swagger-ui/`.
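The `application.properties` step above can be sketched as a minimal config fragment. The property names `vivo.url`, `vivo.email` and `vivo.password` are the ones mentioned in the README; the values below are placeholders, not real credentials:

```properties
# Target VIVO instance for direct import.
# Omit all three properties to get the generated triples back as JSON-LD instead.
vivo.url=http://localhost:8080/vivo
vivo.email=vivo_root@mydomain.edu
vivo.password=changeme
```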

### Run in Command Line
Alternatively you can run the queries from the command line using the sparql-generate executable JAR-file.
All queries are placed in folder `src/main/resources/sparqlg` and come with a `sparql-generate-conf.json`.
Its structure and use are explained in detail on the [sparql-generate website](https://ci.mines-stetienne.fr/sparql-generate/language-cli.html).
### Wiki resources
Additional resources are available in the GitHub wiki, e.g.

### Extensible
The software is easily extensible, meaning you can add and remove data sources.
* _[data sources & queries](https://github.com/vivo-community/generate2vivo/wiki/data-sources-&-queries)_ :
A detailed overview of the data sources and their queries.

For example, if you are not interested in using Datacite Commons, just remove the folder from `src/main/resources/sparqlg`
and the respective controller in the package `eu.tib.controller`.
* _[using generate2vivo](https://github.com/vivo-community/generate2vivo/wiki/using-generate2vivo)_: A short user tutorial with screenshots on how to use the swagger-UI to execute a query.

On the other hand, if you would like to add a data source:
* add a folder with your queries under `src/main/resources/sparqlg` and include a `sparql-generate-conf.json`
(its structure is described on the [sparql-generate website](https://ci.mines-stetienne.fr/sparql-generate/language-cli.html)).
* add a controller in `eu.tib.controller` that retrieves your input and calls your query via `responseService.buildResponse(queryid, input)`
* the connection between the controller and the corresponding query is made by the `queryid`: supply the path (within the resources folder) to your `sparql-generate-conf.json`.
* put your input into a Map; every key-value pair will be available in your query as a binding, where ?key is replaced with the value.
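The binding convention described above can be sketched in plain Java. The query id, ROR id and helper method below are hypothetical; only the `Map`-based binding pattern mirrors what the existing controllers do:

```java
import java.util.Map;

// Sketch of the query wiring: a query id pointing at a
// sparql-generate-conf.json plus a Map of input bindings.
public class BindingExample {

    // Hypothetical path (within src/main/resources) to a sparql-generate-conf.json
    static final String QUERY_ID = "sparqlg/mysource/myQuery";

    // Every key-value pair becomes a binding: ?ror is replaced with the value.
    static Map<String, String> buildBindings(String ror) {
        return Map.of("ror", ror);
    }

    public static void main(String[] args) {
        Map<String, String> input = buildBindings("https://ror.org/000000000");
        // In the real application a controller would now call something like:
        // responseService.buildResponse(QUERY_ID, input);
        System.out.println(QUERY_ID + " " + input);
    }
}
```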
* _[run queries in cmd line](https://github.com/vivo-community/generate2vivo/wiki/install-&-run#run-in-command-line)_ :
An alternative way of running the queries via cmd line with the provided Jar from sparql-generate.

* _[dev guide](https://github.com/vivo-community/generate2vivo/wiki/dev-guide)_ : resources specifically for developers, e.g. on how to add a data source or query, how to put variables from code into the query.
16 changes: 9 additions & 7 deletions pom.xml
Expand Up @@ -13,7 +13,7 @@

<groupId>eu.tib</groupId>
<artifactId>generate2vivo</artifactId>
<version>1.1.0</version>
<version>1.2.0</version>
<description>Extensible Data Ingest Tool for VIVO. Contains data sources like Datacite Commons, ORCID and ROR.</description>

<properties>
@@ -79,19 +79,21 @@
<version>6.0.16.Final</version>
</dependency>

<dependency>
<groupId>com.graphql-java</groupId>
<artifactId>graphql-java</artifactId>
<scope>test</scope>
<version>16.2</version>
</dependency>
</dependencies>

<build>
<plugins>
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
<configuration>
<layers>
<enabled>true</enabled>
</layers>
<image>
<name>generate2vivo</name>
</image>
</configuration>
</plugin>
</plugins>
</build>
63 changes: 63 additions & 0 deletions src/main/java/eu/tib/controller/CrossrefController.java
@@ -0,0 +1,63 @@
package eu.tib.controller;

import eu.tib.controller.validation.InputValidator;
import eu.tib.service.WriteResultService;
import io.swagger.annotations.Api;
import io.swagger.annotations.ApiOperation;
import io.swagger.annotations.ApiParam;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.ResponseEntity;
import org.springframework.validation.annotation.Validated;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

import javax.validation.Valid;
import javax.validation.constraints.Pattern;
import java.util.Map;

@Slf4j
@Validated
@RestController
@RequestMapping(value = "/crossref")
@Api(value = "Controller", tags = {"crossref"})
public class CrossrefController {

    @Autowired
    private WriteResultService wrService;

    @ApiOperation(value = "Retrieve data about a person and their works from Crossref", notes = "This method gets data about a person and their works from Crossref by passing an ORCID id.")
    @GetMapping(value = "/personPlusWorks", produces = "application/json")
    public ResponseEntity<String> getPersonPlusWorks(
            @Valid @Pattern(regexp = InputValidator.orcid)
            @ApiParam("Complete Orcid URL consisting of https://orcid.org/ plus id")
            @RequestParam String orcid,
            @ApiParam("email for polite requests")
            @RequestParam(defaultValue = "") String email) {

        final String id = "sparqlg/crossref/personPlusWorks";
        log.info("Incoming Request for " + id + " with orcid: " + orcid);

        final String CURSOR = "*"; // starting cursor for pagination
        return wrService.execute(id, Map.of("orcid", orcid,
                "polite_mail", email,
                "cursor", CURSOR));
    }

    @ApiOperation(value = "Retrieve data about a work from Crossref", notes = "This method gets data about a work from Crossref by passing a DOI.")
    @GetMapping(value = "/work", produces = "application/json")
    public ResponseEntity<String> getWork(
            @Valid @Pattern(regexp = InputValidator.doi)
            @ApiParam("DOI of the publication")
            @RequestParam String doi,
            @ApiParam("email for polite requests")
            @RequestParam(defaultValue = "") String email) {

        final String id = "sparqlg/crossref/work";
        log.info("Incoming Request for " + id + " with doi: " + doi);

        return wrService.execute(id, Map.of("doi", doi, "polite_mail", email));
    }
}
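Assuming the application is running locally on port 9000 (as in the README), the `work` endpoint above could be exercised as sketched below. The DOI is a placeholder, and the exact DOI format accepted depends on the `InputValidator.doi` pattern:

```shell
# Query Crossref metadata for a (placeholder) DOI; the email enables polite requests.
curl -G "http://localhost:9000/crossref/work" \
  --data-urlencode "doi=https://doi.org/10.1000/xyz123" \
  --data-urlencode "email=me@example.org"
```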
