docs and proc name changes
vga91 committed Jan 16, 2025
1 parent 49e5315 commit 0370471
Showing 8 changed files with 114 additions and 87 deletions.
159 changes: 93 additions & 66 deletions docs/asciidoc/modules/ROOT/pages/database-integration/load-jdbc.adoc
@@ -528,22 +528,37 @@ CALL apoc.load.jdbc('jdbc:derby:derbyDB', 'PERSON',[],{credentials:{user:'apoc',

== Load JDBC - Analytics

You can use the `apoc.load.jdbc.analytics(<cypherQuery>, <jdbcUrl>, <sqlQueryOverTemporaryTable>, $config)`
You can use the `apoc.jdbc.analytics(<cypherQuery>, <jdbcUrl>, <sqlQueryOverTemporaryTable>, <paramsList>, $config)`
to create a temporary table starting from a Cypher query
and delegate complex analytics to the database defined JDBC URL.

Please note that the returning SQL column names have to be consistent with the one provided by the Cypher query
Please note that the returning SQL column names have to be consistent with the one provided by the Cypher query.

In addition to the configurations of the `apoc.load.jdbc` procedure, the `apoc.jdbc.analytics` provides the following ones:

[cols="1m,2,1"]
|===
|tableName| the temporary table name | neo4j_tmp_table
|provider| the SQL provider, to handle data type based on it, possible values are "POSTGRES", "MYSQL" and "DEFAULT" | "DEFAULT"
|===


It is possible to specify a provider in the config parameters.
The default value is "DUCKDB".

You can reproduce the following queries using some nodes:

[source, cypher]
----
CREATE (:Movie {title:'The Matrix', language: 'en', released:1999, tagline:'Welcome to the Real World', qty: 5})
CREATE (:Movie {title:'The Matrix Reloaded', language: 'en', released:2003, tagline:'Free your mind', qty: 6})
CREATE (:Movie {title:'The Matrix Revolutions', language: 'en', released:2003, tagline:'Everything that has a beginning has an end', qty: 11})
CREATE (:Movie {title:"The Devil's Advocate", language: 'en', released:1997, tagline:'Evil has its winning ways', qty: 3})
CREATE (:Movie {title:"A Few Good Men", language: 'it', released:1992, tagline:"In the heart of the nation's capital, in a courthouse of the U.S. government, one man will stop at nothing to keep his honor, and one will stop at nothing to find the truth.", qty: 7})
CREATE (:Movie {title:"Top Gun", released:1986, language: 'it', tagline:'I feel the need, the need for speed.', qty: 12})
CREATE (:City {country: 'NL', name: 'Amsterdam', year: 2000, population: 1005})
CREATE (:City {country: 'NL', name: 'Amsterdam', year: 2010, population: 1065})
CREATE (:City {country: 'NL', name: 'Amsterdam', year: 2020, population: 1158})
CREATE (:City {country: 'US', name: 'Seattle', year: 2000, population: 564})
CREATE (:City {country: 'US', name: 'Seattle', year:2010, population: 608})
CREATE (:City {country: 'US', name: 'Seattle', year: 2020, population: 738})
CREATE (:City {country: 'US', name: 'New York City', year: 2000, population: 8015})
CREATE (:City {country: 'US', name: 'New York City', year: 2010, population: 8175})
CREATE (:City {country: 'US', name: 'New York City', year: 2020, population: 8772})
----

=== DuckDB
@@ -552,54 +567,66 @@ Example to get the rank of the current row with gaps.
Fields of the SQL query should be consistent with the Cypher query.
For detailed information go to https://duckdb.org/docs/sql/functions/window_functions.html#rank

It is possible to specify a provider in the config parameters.
The default value is "DUCKDB".


[source,cypher]
----
CALL apoc.load.jdbc.analytics(
"MATCH (n:Movie) RETURN n.title AS title, n.released AS released, n.language AS language, n.tagline AS tagline",
CALL apoc.jdbc.analytics(
"MATCH (n:City) RETURN n.country AS country, n.name AS name, n.year AS year, n.population AS population",
$url,
"SELECT
title,
released,
language,
tagline,
RANK() OVER (PARTITION BY language ORDER BY released DESC) AS rank
FROM temp_table
ORDER BY rank, title, tagline;"
country,
name,
year,
population,
RANK() OVER (PARTITION BY country ORDER BY year DESC) AS rank
FROM %s
ORDER BY rank, country, name;"
)
----

Another example to get a Pivot table using window functions

[source,cypher]
----
CALL apoc.load.jdbc.analytics(
"MATCH (n:Movie) RETURN n.title AS title, n.released AS released, n.language AS language, n.tagline AS tagline",
CALL apoc.jdbc.analytics(
"MATCH (n:City) RETURN n.country AS country, n.name AS name, n.year AS year, n.population AS population",
$url,
"WITH ranked_data AS (
SELECT
title,
released,
language,
tagline,
qty,
ROW_NUMBER() OVER (PARTITION BY language ORDER BY released DESC) AS rank
FROM temp_table
ORDER BY rank, title, tagline)
SELECT *
FROM ranked_data
PIVOT (
sum(qty)
FOR
language IN ('en', 'it')
GROUP BY released
)"
SELECT
country,
name,
year,
population,
ROW_NUMBER() OVER (PARTITION BY country ORDER BY year DESC) AS rank
FROM %s
ORDER BY rank, country, name
)
SELECT *
FROM ranked_data
PIVOT (
sum(population)
FOR country IN ('NL', 'US')
GROUP BY year
)"
)
----

Or using directly a `PIVOT <table> ON <column>` clause:

[source,cypher]
----
CALL apoc.jdbc.analytics(
"MATCH (n:City) RETURN n.country AS country, n.name AS name, n.year AS year, n.population AS population",
$url,
"PIVOT %s
ON year
USING sum(population)
ORDER by name"
)
----


=== MySQL

Returns the rank of the current row within its partition, with gaps.
@@ -608,17 +635,17 @@ https://dev.mysql.com/doc/refman/8.4/en/window-function-descriptions.html#functi

[source,cypher]
----
CALL apoc.load.jdbc.analytics(
"MATCH (n:Movie) RETURN n.title AS title, n.released AS released, n.language AS language, n.tagline AS tagline",
CALL apoc.jdbc.analytics(
"MATCH (n:City) RETURN n.country AS country, n.name AS name, n.year AS year, n.population AS population",
$url,
"SELECT
title,
released,
language,
tagline,
RANK() OVER (PARTITION BY language ORDER BY released DESC) AS 'rank'
FROM temp_table
ORDER BY title, tagline",
country,
name,
year,
population,
RANK() OVER (PARTITION BY country ORDER BY year DESC) AS 'rank'
FROM %s
ORDER BY country, name;",
$params,
{ provider: "MYSQL" })
----
@@ -627,17 +654,17 @@ Here an example of ROW_NUMBER window function with MySQL:

[source,cypher]
----
CALL apoc.load.jdbc.analytics(
"MATCH (n:Movie) RETURN n.title AS title, n.released AS released, n.language AS language, n.tagline AS tagline",
CALL apoc.jdbc.analytics(
"MATCH (n:City) RETURN n.country AS country, n.name AS name, n.year AS year, n.population AS population",
$url,
"SELECT
title,
released,
language,
tagline,
ROW_NUMBER() OVER (PARTITION BY language ORDER BY released DESC) AS 'rank'
FROM temp_table
ORDER BY title, tagline",
country,
name,
year,
population,
ROW_NUMBER() OVER (PARTITION BY country ORDER BY year DESC) AS 'rank'
FROM %s
ORDER BY country, name;",
$params,
{ provider: "MYSQL" })
----
@@ -648,17 +675,17 @@ Here an example with Window functions.

[source,cypher]
----
CALL apoc.load.jdbc.analytics(
"MATCH (n:Movie) RETURN n.title AS title, n.released AS released, n.language AS language, n.tagline AS tagline",
CALL apoc.jdbc.analytics(
"MATCH (n:City) RETURN n.country AS country, n.name AS name, n.year AS year, n.population AS population",
$url,
"SELECT
title,
released,
language,
tagline,
RANK() OVER (PARTITION BY language ORDER BY released DESC) rank
FROM temp_table
ORDER BY rank, title, tagline",
country,
name,
year,
population,
RANK() OVER (PARTITION BY country ORDER BY year DESC) rank
FROM %s
ORDER BY rank, country, name;",
$params,
{ provider: "POSTGRES" })
----
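The SQL examples above leave the table name as a `%s` placeholder. In the integration tests further down in this commit, that placeholder is filled with `String.formatted` and the default table name `neo4j_tmp_table`. The sketch below mirrors that step in isolation; the class `TempTableSql` and its `render` helper are hypothetical names used only for illustration, and the sketch makes no claim about whether the procedure itself performs this substitution.

```java
public class TempTableSql {
    // Default temporary table name, matching Analytics.TABLE_NAME_DEFAULT_CONF_KEY in this commit.
    static final String DEFAULT_TABLE = "neo4j_tmp_table";

    // Hypothetical helper: replaces the single %s placeholder with the table name.
    static String render(String sqlTemplate, String tableName) {
        return sqlTemplate.formatted(tableName);
    }

    public static void main(String[] args) {
        // Same shape as the RANK() query used in the docs and tests.
        String template = """
                SELECT country, name, year, population,
                       RANK() OVER (PARTITION BY country ORDER BY year DESC) AS rank
                FROM %s
                ORDER BY rank, country, name;
                """;
        System.out.println(render(template, DEFAULT_TABLE));
    }
}
```

When a custom `tableName` is passed in the config, the same name would be passed to `render` in place of `DEFAULT_TABLE`.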
4 changes: 2 additions & 2 deletions extended-it/src/test/java/apoc/load/MySQLJdbcTest.java
@@ -80,7 +80,7 @@ public void testLoadJdbcAnalytics() {
FROM %s
ORDER BY country, name;
""".formatted(Analytics.TABLE_NAME_DEFAULT_CONF_KEY);
testResult(db, "CALL apoc.load.jdbc.analytics($queryCypher, $url, $sql, [], $config)",
testResult(db, "CALL apoc.jdbc.analytics($queryCypher, $url, $sql, [], $config)",
map(
"queryCypher", cypher,
"sql", sql,
@@ -105,7 +105,7 @@ public void testLoadJdbcAnalyticsWindow() {
ORDER BY country, name;
""".formatted(Analytics.TABLE_NAME_DEFAULT_CONF_KEY);

testResult(db, "CALL apoc.load.jdbc.analytics($queryCypher, $url, $sql, [], $config)",
testResult(db, "CALL apoc.jdbc.analytics($queryCypher, $url, $sql, [], $config)",
map(
"queryCypher", cypher,
"sql", sql,
20 changes: 10 additions & 10 deletions extended-it/src/test/java/apoc/load/PostgresJdbcTest.java
@@ -148,17 +148,17 @@ public void testLoadJdbcAnalytics() {
String cypher = "MATCH (n:City) RETURN n.country AS country, n.name AS name, n.year AS year, n.population AS population";

String sql = """
SELECT
country,
name,
year,
population,
RANK() OVER (PARTITION BY country ORDER BY year DESC) rank
FROM %s
ORDER BY rank, country, name;
SELECT
country,
name,
year,
population,
RANK() OVER (PARTITION BY country ORDER BY year DESC) rank
FROM %s
ORDER BY rank, country, name;
""".formatted(Analytics.TABLE_NAME_DEFAULT_CONF_KEY);

testResult(db, "CALL apoc.load.jdbc.analytics($queryCypher, $url, $sql, [], $config)",
testResult(db, "CALL apoc.jdbc.analytics($queryCypher, $url, $sql, [], $config)",
map(
"queryCypher", cypher,
"sql", sql,
@@ -183,7 +183,7 @@ public void testLoadJdbcAnalyticsWindow() {
ORDER BY rank, country, name;
""".formatted(Analytics.TABLE_NAME_DEFAULT_CONF_KEY);

testResult(db, "CALL apoc.load.jdbc.analytics($queryCypher, $url, $sql, [], $config)",
testResult(db, "CALL apoc.jdbc.analytics($queryCypher, $url, $sql, [], $config)",
map(
"queryCypher", cypher,
"sql", sql,
8 changes: 4 additions & 4 deletions extended/src/main/java/apoc/load/Analytics.java
@@ -32,8 +32,8 @@ public class Analytics {
public static final String TABLE_NAME_DEFAULT_CONF_KEY = "neo4j_tmp_table";

enum Provider {
DEFAULT,
POSTGRES,
DUCKDB,
MYSQL
}

@@ -46,16 +46,16 @@ enum Provider {
@Context
public Transaction tx;

@Procedure("apoc.load.jdbc.analytics")
@Description("apoc.load.jdbc.analytics(<cypherQuery>, <jdbcUrl>, <sqlQueryOverTemporaryTable>, $config) - to create a temporary table starting from a Cypher query and delegate complex analytics to the database defined JDBC URL ")
@Procedure("apoc.jdbc.analytics")
@Description("apoc.jdbc.analytics(<cypherQuery>, <jdbcUrl>, <sqlQueryOverTemporaryTable>, <paramsList>, $config) - to create a temporary table starting from a Cypher query and delegate complex analytics to the database defined JDBC URL ")
public Stream<RowResult> aggregate(
@Name("neo4jQuery") String neo4jQuery,
@Name("jdbc") String urlOrKey,
@Name("sqlQuery") String sqlQuery,
@Name(value = "params", defaultValue = "[]") List<Object> params,
@Name(value = "config",defaultValue = "{}") Map<String, Object> config) {
AtomicReference<String> createTable = new AtomicReference<>();
final Provider provider = Provider.valueOf((String) config.getOrDefault(PROVIDER_CONF_KEY, Provider.DUCKDB.name()));
final Provider provider = Provider.valueOf((String) config.getOrDefault(PROVIDER_CONF_KEY, Provider.DEFAULT.name()));
final String tableName = (String) config.getOrDefault(TABLE_NAME_CONF_KEY, TABLE_NAME_DEFAULT_CONF_KEY);

AtomicReference<String> columns = new AtomicReference<>();
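The provider lookup in this hunk relies on `Enum.valueOf`, which is case-sensitive and throws `IllegalArgumentException` for a name that is not an enum constant. The sketch below isolates that behaviour. It assumes the hunk replaces `DUCKDB` with `DEFAULT` (the page has lost its +/- markers) and that the config key is `provider`, as the documentation table suggests; `ProviderLookup` and `resolve` are hypothetical names, not part of the commit.

```java
import java.util.Map;

public class ProviderLookup {
    // Assumed post-commit enum: DEFAULT takes the place of DUCKDB.
    enum Provider { DEFAULT, POSTGRES, MYSQL }

    // Assumption: the value of PROVIDER_CONF_KEY is not shown in the diff; "provider" is taken from the docs.
    static final String PROVIDER_CONF_KEY = "provider";

    // Same expression as in the diff: fall back to DEFAULT when the key is absent.
    static Provider resolve(Map<String, Object> config) {
        return Provider.valueOf((String) config.getOrDefault(PROVIDER_CONF_KEY, Provider.DEFAULT.name()));
    }

    public static void main(String[] args) {
        System.out.println(resolve(Map.of()));
        System.out.println(resolve(Map.of(PROVIDER_CONF_KEY, "MYSQL")));
        try {
            // Under the assumption above, the old default name is no longer accepted.
            resolve(Map.of(PROVIDER_CONF_KEY, "DUCKDB"));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: DUCKDB");
        }
    }
}
```

If that reading of the hunk is right, callers that previously passed `provider: "DUCKDB"` explicitly would need to drop the key or switch to `"DEFAULT"`.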
2 changes: 1 addition & 1 deletion extended/src/main/resources/extended.txt
@@ -94,7 +94,7 @@ apoc.load.html
apoc.load.htmlPlainText
apoc.load.jdbc
apoc.load.jdbcUpdate
apoc.load.jdbc.analytics
apoc.jdbc.analytics
apoc.load.ldap
apoc.load.parquet
apoc.load.xls
2 changes: 1 addition & 1 deletion extended/src/test/java/apoc/load/AbstractJdbcTest.java
@@ -16,7 +16,7 @@ public abstract class AbstractJdbcTest {

protected static java.sql.Time time = java.sql.Time.valueOf("15:37:00");

protected static final String ANALYTICS_CYPHER_FILE = "movies-analytics.cypher";
protected static final String ANALYTICS_CYPHER_FILE = "dataset-analytics.cypher";

public void assertResult(Map<String, Object> row) {
Map<String, Object> expected = Util.map("NAME", "John", "SURNAME", null, "HIRE_DATE", hireDate.toLocalDate(), "EFFECTIVE_FROM_DATE",
6 changes: 3 additions & 3 deletions extended/src/test/java/apoc/load/DuckDBJdbcTest.java
@@ -89,7 +89,7 @@ public void testLoadJdbcAnalytics() {
ORDER BY rank, country, name;
"""
.formatted(Analytics.TABLE_NAME_DEFAULT_CONF_KEY);
testResult(db, "CALL apoc.load.jdbc.analytics($queryCypher, $url, $sql)",
testResult(db, "CALL apoc.jdbc.analytics($queryCypher, $url, $sql)",
map(
"queryCypher", cypher,
"sql", sql,
@@ -186,7 +186,7 @@ FOR country IN ('NL', 'US')
)
""".formatted(Analytics.TABLE_NAME_DEFAULT_CONF_KEY);;

testResult(db, "CALL apoc.load.jdbc.analytics($queryCypher, $url, $sql)",
testResult(db, "CALL apoc.jdbc.analytics($queryCypher, $url, $sql)",
map(
"queryCypher", cypher,
"sql", sql,
@@ -232,7 +232,7 @@ USING sum(population)
ORDER by name
""".formatted(customTable);

testResult(db, "CALL apoc.load.jdbc.analytics($queryCypher, $url, $sql, [], $config)",
testResult(db, "CALL apoc.jdbc.analytics($queryCypher, $url, $sql, [], $config)",
map(
"queryCypher", cypher,
"sql", sql,