[WIP] Fixes 3737: Add support for AWS Bedrock for ml procedures #388

Closed
wants to merge 12 commits into from
2 changes: 1 addition & 1 deletion build.gradle
@@ -130,7 +130,7 @@ subprojects {

ext {
// NB: due to version.json generation by parsing this file, the next line must not have any if/then/else logic
- neo4jVersion = "5.14.0"
+ neo4jVersion = "5.12.0"
// instead we apply the override logic here
neo4jVersionEffective = project.hasProperty("neo4jVersionOverride") ? project.getProperty("neo4jVersionOverride") : neo4jVersion
testContainersVersion = '1.18.3'
1 change: 1 addition & 0 deletions docs/asciidoc/modules/ROOT/nav.adoc
@@ -62,6 +62,7 @@ include::partial$generated-documentation/nav.adoc[]
* xref:ml/index.adoc[]
** xref:ml/vertexai.adoc[]
** xref:ml/openai.adoc[]
** xref:ml/bedrock.adoc[]

* xref:background-operations/index.adoc[]
** xref::background-operations/apoc-load-directory-async.adoc[]
225 changes: 225 additions & 0 deletions docs/asciidoc/modules/ROOT/pages/ml/bedrock.adoc
@@ -0,0 +1,225 @@
[[aws-bedrock]]
= AWS Bedrock procedures


These procedures leverage the https://aws.amazon.com/bedrock/[Amazon Bedrock API].


Here is a list of all available AWS Bedrock procedures:


[opts=header, cols="1, 4", separator="|"]
|===
|name| description
|apoc.ml.bedrock.custom(body, $config)| To create a customizable Bedrock API call
|apoc.ml.bedrock.list($config)| To get the list of foundation or custom models
|apoc.ml.bedrock.jurassic(body, $config)| To create an API call to the `Jurassic-2` model
|apoc.ml.bedrock.anthropic.claude(body, $config)| To create an API call to the `Claude` model
|apoc.ml.bedrock.titan.embed(body, $config)| To create an API call to the `Titan Embedding` model
|apoc.ml.bedrock.stability(body, $config)| To create an API call to the `Stable Diffusion` model
|===

All the procedures are built on top of the `apoc.ml.bedrock.custom` procedure
and support the same config parameters but, unlike `custom`,
each one has its own default parameters and model ID.

Moreover, they return data shaped after the called API,
instead of a generic `Object`.


== Config

.Config parameters
[opts=header, cols="1,1,2,5"]
|===
| name | type | default | description
| keyId | String | null | The AWS key ID. It can also be set in `apoc.conf`, with the key `apoc.aws.key.id`. As an alternative to the keyId-secretKey pair, we can directly pass the https://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-header-based-auth.html[AWS V4 signature] via the `headers` config
| secretKey | String | null | The AWS secret access key. It can also be set in `apoc.conf`, with the key `apoc.aws.secret.key`. As an alternative to the keyId-secretKey pair, we can directly pass the https://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-header-based-auth.html[AWS V4 signature] via the `headers` config
| region | String | the one calculated from `endpoint` config | The AWS region
| endpoint | String | see below | The AWS endpoint.
| method | String | `"POST"` (or `"GET"` with the `apoc.ml.bedrock.list` procedure) | The HTTP Method
| headers | Map<String, Object> | `{Content-Type: 'application/json', Accept: '\*/*'}` | The HTTP headers
| modelId | String | see below | The model ID (ignored by the `bedrock.list` procedure)
| path | String | "foundation-models" | The endpoint path (valid only with the `bedrock.list` procedure).
The resulting endpoint has the form `https://bedrock.us-east-1.amazonaws.com/<path>`, which by default is `https://bedrock.us-east-1.amazonaws.com/foundation-models`
|===

The `endpoint` config takes precedence over the `modelId` one.
For all procedures except `bedrock.list`, the default `endpoint` is `"https://bedrock-runtime.us-east-1.amazonaws.com/model/<modelID>/invoke"`.
The `<modelID>` part must be configured when using the `bedrock.custom` procedure,
while the `bedrock.jurassic`, `bedrock.anthropic.claude`, `bedrock.titan.embed` and `bedrock.stability` procedures
default to `"ai21.j2-ultra-v1"`, `"anthropic.claude-v2"`, `"amazon.titan-embed-text-v1"` and `"stability.stable-diffusion-xl-v0"` respectively.
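
As a minimal sketch of how `modelId` and `endpoint` relate, the two calls below should target the same URL, assuming the default `us-east-1` runtime endpoint described above and credentials that are already configured.
The request body is borrowed from the Titan embedding model and is only illustrative.

[source,cypher]
----
// modelId only: the default endpoint template is filled in with it
CALL apoc.ml.bedrock.custom(
  {inputText: "Hello World"},
  {modelId: "amazon.titan-embed-text-v1"}
) YIELD value
RETURN value;

// explicit endpoint: it takes precedence over any modelId
CALL apoc.ml.bedrock.custom(
  {inputText: "Hello World"},
  {endpoint: "https://bedrock-runtime.us-east-1.amazonaws.com/model/amazon.titan-embed-text-v1/invoke"}
) YIELD value
RETURN value;
----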



== Authentication settings

To authenticate against the Bedrock services, we can set the following entries in the `apoc.conf` file.

.apoc.conf
[source,properties]
----
apoc.aws.key.id=<AWS Key ID>
apoc.aws.secret.key=<AWS Secret Access Key>
----

Alternatively, we can pass them as `$config` parameters, e.g. `{keyId: '<AWS Key ID>', secretKey: '<AWS Secret Access Key>'}`.

We can also pass an `Authorization` header via the `headers` config,
e.g. `{headers: {Authorization: 'AWS4-HMAC-SHA256 <CredentialAndSignature..>', ...other entries...} }`.

Note that the default `Content-Type: application/json` and `Accept: \*/*` header entries
are always sent with the HTTP request, unless overridden via the `headers` config.
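
For instance, the credentials could be supplied as query parameters rather than hard-coded in the query text.
This is only a sketch: `$awsKeyId` and `$awsSecretKey` are hypothetical parameter names that the client application would have to provide.

[source,cypher]
----
// keyId and secretKey are the config entries from the table above;
// the two $parameters are assumptions, not part of the procedure
CALL apoc.ml.bedrock.titan.embed(
  {inputText: "Hello World"},
  {keyId: $awsKeyId, secretKey: $awsSecretKey}
) YIELD inputTextTokenCount, embedding
RETURN inputTextTokenCount, size(embedding) AS dimensions
----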


In the following examples,
we assume that the key ID and the secret access key are set via `apoc.conf`.

== Usage Examples

=== Ad-hoc model procedure

.apoc.ml.bedrock.jurassic
[source,cypher]
----
CALL apoc.ml.bedrock.jurassic({prompt: "Review: Extremely old cabinets",
maxTokens: 50,
temperature: 0,
topP: 1.0})
----

.Results
[opts="header",cols="3"]
|===
| id | promptTokens | completions
| 1234 | [
{
"textRange": {
"start": 0,
"end": 6
},
"topTokens": null,
"generatedToken": {
"token": "▁Review",
"raw_logprob": -7.870129585266113,
"logprob": -7.870129585266113
}
}
,
{ ....
| [
{
"data": {
"text": "
These cabinets are very old and outdated. They are made of wood and have a simple, classic design. The cabinets are in good condition, but they could use some updating.",
"tokens": [
{ ...
|===


We can also change the `modelId`, for example to `"ai21.j2-mid-v1"`, which returns the same response fields.

[source,cypher]
----
CALL apoc.ml.bedrock.jurassic(<BODY>, {modelId: "ai21.j2-mid-v1"})
----



.apoc.ml.bedrock.anthropic.claude
[source,cypher]
----
CALL apoc.ml.bedrock.anthropic.claude({
prompt: "\n\nHuman: Hello world\n\nAssistant:",
max_tokens_to_sample: 300,
temperature: 0.5,
top_k: 250,
top_p: 1,
stop_sequences: ["\\n\\nHuman:"],
anthropic_version: "bedrock-2023-05-31"
})
----


.Results
[opts="header",cols="2"]
|===
| completion| stopReason
| " Hello!" | "stop_sequence"
|===


.apoc.ml.bedrock.titan.embed
[source,cypher]
----
CALL apoc.ml.bedrock.titan.embed({inputText: "Hello World"})
----

.Results
[opts="header",cols="2"]
|===
| inputTextTokenCount| embedding
| 2 | [0.32421875, 0.35546875, 0.625, 0.20019531, 1.328125, 0.6171875, 0.11425781, -0.00074005127, ....
|===
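
A typical use of the embedding is to store it on the node the text came from.
The following is a sketch under assumptions: the `Movie` label and the `plot` and `embedding` properties are hypothetical, and `embedding` is assumed to be a list of numbers, as the result above suggests.
Each matched node triggers one API call, hence the `LIMIT`.

[source,cypher]
----
MATCH (m:Movie)
WHERE m.plot IS NOT NULL
WITH m LIMIT 10
// one Bedrock request per node
CALL apoc.ml.bedrock.titan.embed({inputText: m.plot})
YIELD embedding
SET m.embedding = embedding
RETURN count(m) AS updated
----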


.apoc.ml.bedrock.stability
[source,cypher]
----
CALL apoc.ml.bedrock.stability({
text_prompts: [{text: "picture of a bird", weight: 1.0}],
cfg_scale: 5,
seed: 123,
steps: 70,
style_preset: "photographic"
})
----

.Results
[opts="header"]
|===
| base64Image
| "iVBORw0KGgoAAAANSUhEUgAAAgAAAAIACAIAAAB7GkOtAAABjmVYSWZNTQAqAAAACAAGAQAABAAAAAEAAAIAAQEABAAA...."
|===



=== List of models

[source,cypher]
----
CALL apoc.ml.bedrock.list()
----

.Results
[opts="header"]
|===
| modelId | modelArn |modelName |providerName |responseStreamingSupported|customizationsSupported|inferenceTypesSupported|inputModalities |outputModalities
| "amazon.titan-tg1-large" |"arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-tg1-large" |"Titan Text Large" |"Amazon" |true |["FINE_TUNING"] |["ON_DEMAND"] |["TEXT"] |["TEXT"]
| "amazon.titan-e1t-medium" |"arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-e1t-medium" |"Titan Text Embeddings" |"Amazon" |null |[] |["ON_DEMAND"] |["TEXT"] |["EMBEDDING"]
| ... |... |... |... |null |[] |... |... |...
|===
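
Since the result is a regular procedure output, it can be filtered with `YIELD ... WHERE`.
As a small sketch that relies only on the columns shown above, the query below keeps the models that produce embeddings:

[source,cypher]
----
CALL apoc.ml.bedrock.list()
YIELD modelId, providerName, outputModalities
WHERE "EMBEDDING" IN outputModalities
RETURN modelId, providerName
----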


== Custom AWS API Call

Via the `apoc.ml.bedrock.custom` procedure we can create a customizable Bedrock API request, choosing the HTTP method, the endpoint, the region and any additional headers.
This is useful both to https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html[invoke a model]
whose response is incompatible with the previous procedures, and to use any other Bedrock API.

For example, we can call the https://docs.aws.amazon.com/bedrock/latest/APIReference/API_GetModelInvocationLoggingConfiguration.html[GetModelInvocationLoggingConfiguration API]
by executing the following query (note that the `body` parameter is null, since the API does not have a request body):

[source,cypher]
----
CALL apoc.ml.bedrock.custom(null,{
endpoint: "https://bedrock.us-east-1.amazonaws.com/logging/modelinvocations",
method: "GET"
})
----

.Results
[opts="header"]
|===
| value
| `{ "loggingConfig": {"cloudWatchConfig": { ... }}}`
|===
1 change: 1 addition & 0 deletions docs/asciidoc/modules/ROOT/pages/ml/index.adoc
@@ -9,3 +9,4 @@ This section includes:

* xref::ml/vertexai.adoc[]
* xref::ml/openai.adoc[]
* xref::ml/bedrock.adoc[]
1 change: 1 addition & 0 deletions extended/build.gradle
@@ -79,6 +79,7 @@ dependencies {
implementation group: 'org.jsoup', name: 'jsoup', version: '1.15.3'
implementation group: 'com.opencsv', name: 'opencsv', version: '5.7.1'
implementation group: 'us.fatehi', name: 'schemacrawler', version: '15.04.01'
implementation group: 'uk.co.lucasweb', name: 'aws-v4-signer-java', version: '1.3'

// These will be dependencies not packaged with the .jar
// They need to be provided either through the database or in an extra .jar
2 changes: 2 additions & 0 deletions extended/src/main/java/apoc/ExtendedApocConfig.java
@@ -30,6 +30,8 @@ public class ExtendedApocConfig extends LifecycleAdapter
public static final String APOC_UUID_ENABLED_DB = "apoc.uuid.enabled.%s";
public static final String APOC_UUID_FORMAT = "apoc.uuid.format";
public static final String APOC_OPENAI_KEY = "apoc.openai.key";
public static final String APOC_AWS_KEY_ID = "apoc.aws.key.id";
public static final String APOC_AWS_SECRET_KEY = "apoc.aws.secret.key";
public enum UuidFormatType { hex, base64 }

// These were earlier added via the Neo4j config using the ApocSettings.java class
5 changes: 3 additions & 2 deletions extended/src/main/java/apoc/get/GetProcedures.java
@@ -4,6 +4,7 @@
import apoc.result.NodeResult;
import apoc.result.RelationshipResult;
import org.neo4j.graphdb.Transaction;
import org.neo4j.kernel.impl.coreapi.InternalTransaction;
import org.neo4j.procedure.Context;
import org.neo4j.procedure.Description;
import org.neo4j.procedure.Name;
@@ -20,13 +21,13 @@ public class GetProcedures {
@Procedure
@Description("apoc.get.nodes(node|id|[ids]) - quickly returns all nodes with these id's")
public Stream<NodeResult> nodes(@Name("nodes") Object ids) {
- return new Get(tx).nodes(ids);
+ return new Get((InternalTransaction) tx).nodes(ids);
}

@Procedure
@Description("apoc.get.rels(rel|id|[ids]) - quickly returns all relationships with these id's")
public Stream<RelationshipResult> rels(@Name("relationships") Object ids) {
- return new Get(tx).rels(ids);
+ return new Get((InternalTransaction) tx).rels(ids);
}

}