feat: [discoveryengine] support import data from Cloud Spanner, BigTable, SQL and Firestore (#5218)

* feat: support import data from Cloud Spanner, BigTable, SQL and Firestore
feat: support standalone ranking API
feat: support layout detection and more chunking features
feat: support advanced search boosting
docs: keep the API doc up-to-date with recent changes

PiperOrigin-RevId: 621906335

Source-Link: googleapis/googleapis@624b052

Source-Link: googleapis/googleapis-gen@3c68efb
Copy-Tag: eyJwIjoicGFja2FnZXMvZ29vZ2xlLWNsb3VkLWRpc2NvdmVyeWVuZ2luZS8uT3dsQm90LnlhbWwiLCJoIjoiM2M2OGVmYjIzZjcwNjJmZWU2MjA5MmNhMDdlOWE5YjQ5NzJkNjM4MCJ9

* 🦉 Updates from OwlBot post-processor

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

---------

Co-authored-by: Owl Bot <gcf-owl-bot[bot]@users.noreply.github.com>
gcf-owl-bot[bot] and gcf-owl-bot[bot] authored Apr 5, 2024
1 parent 9ed38db commit cc25e93
Showing 113 changed files with 22,774 additions and 4,281 deletions.
2 changes: 2 additions & 0 deletions packages/google-cloud-discoveryengine/README.md
@@ -195,6 +195,7 @@ Samples are in the [`samples/`](https://github.com/googleapis/google-cloud-node/
| Document_service.create_document | [source code](https://github.com/googleapis/google-cloud-node/blob/main/packages/google-cloud-discoveryengine/samples/generated/v1alpha/document_service.create_document.js) | [![Open in Cloud Shell][shell_img]](https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/googleapis/google-cloud-node&page=editor&open_in_editor=packages/google-cloud-discoveryengine/samples/generated/v1alpha/document_service.create_document.js,packages/google-cloud-discoveryengine/samples/README.md) |
| Document_service.delete_document | [source code](https://github.com/googleapis/google-cloud-node/blob/main/packages/google-cloud-discoveryengine/samples/generated/v1alpha/document_service.delete_document.js) | [![Open in Cloud Shell][shell_img]](https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/googleapis/google-cloud-node&page=editor&open_in_editor=packages/google-cloud-discoveryengine/samples/generated/v1alpha/document_service.delete_document.js,packages/google-cloud-discoveryengine/samples/README.md) |
| Document_service.get_document | [source code](https://github.com/googleapis/google-cloud-node/blob/main/packages/google-cloud-discoveryengine/samples/generated/v1alpha/document_service.get_document.js) | [![Open in Cloud Shell][shell_img]](https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/googleapis/google-cloud-node&page=editor&open_in_editor=packages/google-cloud-discoveryengine/samples/generated/v1alpha/document_service.get_document.js,packages/google-cloud-discoveryengine/samples/README.md) |
| Document_service.get_processed_document | [source code](https://github.com/googleapis/google-cloud-node/blob/main/packages/google-cloud-discoveryengine/samples/generated/v1alpha/document_service.get_processed_document.js) | [![Open in Cloud Shell][shell_img]](https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/googleapis/google-cloud-node&page=editor&open_in_editor=packages/google-cloud-discoveryengine/samples/generated/v1alpha/document_service.get_processed_document.js,packages/google-cloud-discoveryengine/samples/README.md) |
| Document_service.import_documents | [source code](https://github.com/googleapis/google-cloud-node/blob/main/packages/google-cloud-discoveryengine/samples/generated/v1alpha/document_service.import_documents.js) | [![Open in Cloud Shell][shell_img]](https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/googleapis/google-cloud-node&page=editor&open_in_editor=packages/google-cloud-discoveryengine/samples/generated/v1alpha/document_service.import_documents.js,packages/google-cloud-discoveryengine/samples/README.md) |
| Document_service.list_documents | [source code](https://github.com/googleapis/google-cloud-node/blob/main/packages/google-cloud-discoveryengine/samples/generated/v1alpha/document_service.list_documents.js) | [![Open in Cloud Shell][shell_img]](https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/googleapis/google-cloud-node&page=editor&open_in_editor=packages/google-cloud-discoveryengine/samples/generated/v1alpha/document_service.list_documents.js,packages/google-cloud-discoveryengine/samples/README.md) |
| Document_service.purge_documents | [source code](https://github.com/googleapis/google-cloud-node/blob/main/packages/google-cloud-discoveryengine/samples/generated/v1alpha/document_service.purge_documents.js) | [![Open in Cloud Shell][shell_img]](https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/googleapis/google-cloud-node&page=editor&open_in_editor=packages/google-cloud-discoveryengine/samples/generated/v1alpha/document_service.purge_documents.js,packages/google-cloud-discoveryengine/samples/README.md) |
@@ -208,6 +209,7 @@ Samples are in the [`samples/`](https://github.com/googleapis/google-cloud-node/
| Engine_service.tune_engine | [source code](https://github.com/googleapis/google-cloud-node/blob/main/packages/google-cloud-discoveryengine/samples/generated/v1alpha/engine_service.tune_engine.js) | [![Open in Cloud Shell][shell_img]](https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/googleapis/google-cloud-node&page=editor&open_in_editor=packages/google-cloud-discoveryengine/samples/generated/v1alpha/engine_service.tune_engine.js,packages/google-cloud-discoveryengine/samples/README.md) |
| Engine_service.update_engine | [source code](https://github.com/googleapis/google-cloud-node/blob/main/packages/google-cloud-discoveryengine/samples/generated/v1alpha/engine_service.update_engine.js) | [![Open in Cloud Shell][shell_img]](https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/googleapis/google-cloud-node&page=editor&open_in_editor=packages/google-cloud-discoveryengine/samples/generated/v1alpha/engine_service.update_engine.js,packages/google-cloud-discoveryengine/samples/README.md) |
| Estimate_billing_service.estimate_data_size | [source code](https://github.com/googleapis/google-cloud-node/blob/main/packages/google-cloud-discoveryengine/samples/generated/v1alpha/estimate_billing_service.estimate_data_size.js) | [![Open in Cloud Shell][shell_img]](https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/googleapis/google-cloud-node&page=editor&open_in_editor=packages/google-cloud-discoveryengine/samples/generated/v1alpha/estimate_billing_service.estimate_data_size.js,packages/google-cloud-discoveryengine/samples/README.md) |
| Rank_service.rank | [source code](https://github.com/googleapis/google-cloud-node/blob/main/packages/google-cloud-discoveryengine/samples/generated/v1alpha/rank_service.rank.js) | [![Open in Cloud Shell][shell_img]](https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/googleapis/google-cloud-node&page=editor&open_in_editor=packages/google-cloud-discoveryengine/samples/generated/v1alpha/rank_service.rank.js,packages/google-cloud-discoveryengine/samples/README.md) |
| Recommendation_service.recommend | [source code](https://github.com/googleapis/google-cloud-node/blob/main/packages/google-cloud-discoveryengine/samples/generated/v1alpha/recommendation_service.recommend.js) | [![Open in Cloud Shell][shell_img]](https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/googleapis/google-cloud-node&page=editor&open_in_editor=packages/google-cloud-discoveryengine/samples/generated/v1alpha/recommendation_service.recommend.js,packages/google-cloud-discoveryengine/samples/README.md) |
| Schema_service.create_schema | [source code](https://github.com/googleapis/google-cloud-node/blob/main/packages/google-cloud-discoveryengine/samples/generated/v1alpha/schema_service.create_schema.js) | [![Open in Cloud Shell][shell_img]](https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/googleapis/google-cloud-node&page=editor&open_in_editor=packages/google-cloud-discoveryengine/samples/generated/v1alpha/schema_service.create_schema.js,packages/google-cloud-discoveryengine/samples/README.md) |
| Schema_service.delete_schema | [source code](https://github.com/googleapis/google-cloud-node/blob/main/packages/google-cloud-discoveryengine/samples/generated/v1alpha/schema_service.delete_schema.js) | [![Open in Cloud Shell][shell_img]](https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/googleapis/google-cloud-node&page=editor&open_in_editor=packages/google-cloud-discoveryengine/samples/generated/v1alpha/schema_service.delete_schema.js,packages/google-cloud-discoveryengine/samples/README.md) |
@@ -48,6 +48,34 @@ message Chunk {
string title = 2;
}

// Page span of the chunk.
message PageSpan {
// The start page of the chunk.
int32 page_start = 1;

// The end page of the chunk.
int32 page_end = 2;
}

// Metadata of the current chunk. This field is only populated on
// [SearchService.Search][google.cloud.discoveryengine.v1alpha.SearchService.Search]
// API.
message ChunkMetadata {
// The previous chunks of the current chunk. The number is controlled by
// [SearchRequest.ContentSearchSpec.ChunkSpec.num_previous_chunks][google.cloud.discoveryengine.v1alpha.SearchRequest.ContentSearchSpec.ChunkSpec.num_previous_chunks].
// This field is only populated on
// [SearchService.Search][google.cloud.discoveryengine.v1alpha.SearchService.Search]
// API.
repeated Chunk previous_chunks = 1;

// The next chunks of the current chunk. The number is controlled by
// [SearchRequest.ContentSearchSpec.ChunkSpec.num_next_chunks][google.cloud.discoveryengine.v1alpha.SearchRequest.ContentSearchSpec.ChunkSpec.num_next_chunks].
// This field is only populated on
// [SearchService.Search][google.cloud.discoveryengine.v1alpha.SearchService.Search]
// API.
repeated Chunk next_chunks = 2;
}

// The full resource name of the chunk.
// Format:
// `projects/{project}/locations/{location}/collections/{collection}/dataStores/{data_store}/branches/{branch}/documents/{document_id}/chunks/{chunk_id}`.
@@ -56,7 +84,7 @@ message Chunk {
// characters.
string name = 1;

// Unique chunk id of the current chunk.
// Unique chunk ID of the current chunk.
string id = 2;

// Content is a string from a document (parsed content).
@@ -69,4 +97,10 @@ message Chunk {
// It contains derived data that are not in the original input document.
google.protobuf.Struct derived_struct_data = 4
[(google.api.field_behavior) = OUTPUT_ONLY];

// Page span of the chunk.
PageSpan page_span = 6;

// Output only. Metadata of the current chunk.
ChunkMetadata chunk_metadata = 7 [(google.api.field_behavior) = OUTPUT_ONLY];
}
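As an illustration of how a caller might consume the new `page_span` and `chunk_metadata` fields, the sketch below stitches a chunk together with its neighbours and formats its page range. It works on plain objects only. The camelCase property names (`pageSpan`, `chunkMetadata`, `previousChunks`, `nextChunks`, `content`) are an assumption about how the generated Node.js client surfaces these proto fields, the helpers `contextWindow` and `describePageSpan` are hypothetical, and the code assumes `previous_chunks` and `next_chunks` arrive in reading order, which the proto comments do not state.

```javascript
// Hypothetical helper: join previous chunks, the chunk itself, and next
// chunks into one context string. Assumes the neighbour lists are already
// in reading order (not stated in the proto comments).
function contextWindow(chunk) {
  const meta = chunk.chunkMetadata || {};
  const previous = meta.previousChunks || [];
  const next = meta.nextChunks || [];
  return [...previous, chunk, ...next]
    .map((c) => c.content || '')
    .filter((text) => text.length > 0)
    .join('\n');
}

// Hypothetical helper: render the page span as "page N" or "pages N-M".
function describePageSpan(chunk) {
  const span = chunk.pageSpan;
  if (!span) return 'unknown pages';
  return span.pageStart === span.pageEnd
    ? `page ${span.pageStart}`
    : `pages ${span.pageStart}-${span.pageEnd}`;
}

// Made-up chunk shaped like a search result with one neighbour on each side.
const exampleChunk = {
  id: 'c2',
  content: 'Second paragraph.',
  pageSpan: { pageStart: 4, pageEnd: 5 },
  chunkMetadata: {
    previousChunks: [{ id: 'c1', content: 'First paragraph.' }],
    nextChunks: [{ id: 'c3', content: 'Third paragraph.' }],
  },
};

console.log(describePageSpan(exampleChunk));
console.log(contextWindow(exampleChunk));
```

Because `chunk_metadata` is only populated by `SearchService.Search`, both helpers fall back to the chunk alone when the metadata is absent.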
@@ -39,6 +39,18 @@ option (google.api.resource_definition) = {
type: "discoveryengine.googleapis.com/Location"
pattern: "projects/{project}/locations/{location}"
};
option (google.api.resource_definition) = {
type: "discoveryengine.googleapis.com/GroundingConfig"
pattern: "projects/{project}/locations/{location}/groundingConfigs/{grounding_config}"
};
option (google.api.resource_definition) = {
type: "discoveryengine.googleapis.com/RankingConfig"
pattern: "projects/{project}/locations/{location}/rankingConfigs/{ranking_config}"
};
option (google.api.resource_definition) = {
type: "healthcare.googleapis.com/FhirStore"
pattern: "projects/{project}/locations/{location}/datasets/{dataset}/fhirStores/{fhir_store}"
};
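The three new resource patterns can be exercised with a small formatter and parser. This is a minimal sketch, not the path-template helpers that ship with the generated client: `formatResourceName` and `parseResourceName` are hypothetical names, and the project, location, and config IDs in the usage lines are made up.

```javascript
// Patterns copied from the resource definitions in this hunk.
const PATTERNS = {
  groundingConfig:
    'projects/{project}/locations/{location}/groundingConfigs/{grounding_config}',
  rankingConfig:
    'projects/{project}/locations/{location}/rankingConfigs/{ranking_config}',
  fhirStore:
    'projects/{project}/locations/{location}/datasets/{dataset}/fhirStores/{fhir_store}',
};

// Hypothetical helper: substitute every {placeholder} with a value.
// Rejects missing values and values containing "/" (one segment each).
function formatResourceName(pattern, values) {
  return pattern.replace(/\{(\w+)\}/g, (_, key) => {
    const value = values[key];
    if (typeof value !== 'string' || value === '' || value.includes('/')) {
      throw new Error(`Missing or invalid value for "${key}"`);
    }
    return value;
  });
}

// Hypothetical helper: extract the placeholder values from a resource name,
// or return null when the name does not match the pattern.
function parseResourceName(pattern, name) {
  const keys = [];
  const source = pattern.replace(/\{(\w+)\}/g, (_, key) => {
    keys.push(key);
    return '([^/]+)';
  });
  const match = new RegExp(`^${source}$`).exec(name);
  if (!match) return null;
  return Object.fromEntries(keys.map((key, i) => [key, match[i + 1]]));
}

const rankingConfigName = formatResourceName(PATTERNS.rankingConfig, {
  project: 'my-project',
  location: 'global',
  ranking_config: 'my-config',
});
console.log(rankingConfigName);
console.log(parseResourceName(PATTERNS.rankingConfig, rankingConfigName));
```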

// The industry vertical associated with the
// [DataStore][google.cloud.discoveryengine.v1alpha.DataStore].
@@ -52,6 +64,9 @@ enum IndustryVertical {

// The media industry vertical.
MEDIA = 2;

// The healthcare FHIR vertical.
HEALTHCARE_FHIR = 7;
}

// The type of solution.
@@ -67,6 +82,11 @@ enum SolutionType {

// Used for use cases related to the Generative AI agent.
SOLUTION_TYPE_CHAT = 3;

// Used for use cases related to the Generative Chat agent.
// It's used for the Generative chat engine only; the associated data stores
// must be enrolled with the `SOLUTION_TYPE_CHAT` solution.
SOLUTION_TYPE_GENERATIVE_CHAT = 4;
}

// Tiers of search features. Different tiers might have different
@@ -202,3 +202,22 @@ message Document {
google.protobuf.Timestamp index_time = 13
[(google.api.field_behavior) = OUTPUT_ONLY];
}

// Document captures all raw metadata information of items to be recommended or
// searched.
message ProcessedDocument {
// Output format of the processed document.
oneof processed_data_format {
// The JSON string representation of the processed document.
string json_data = 2;
}

// Required. Full resource name of the referenced document, in the format
// `projects/*/locations/*/collections/*/dataStores/*/branches/*/documents/*`.
string document = 1 [
(google.api.field_behavior) = REQUIRED,
(google.api.resource_reference) = {
type: "discoveryengine.googleapis.com/Document"
}
];
}
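Since `json_data` is the only member of the `processed_data_format` oneof in this commit, a caller will typically need to decode it before use. The sketch below shows one way to do that on a plain object. The `jsonData` property name is an assumption about the generated Node.js client's camelCase mapping, `decodeProcessedDocument` is a hypothetical helper, and the payload in the usage lines is invented because the commit does not describe the JSON schema.

```javascript
// Hypothetical helper: decode the JSON payload of a ProcessedDocument-like
// object and keep the referenced document name alongside it.
function decodeProcessedDocument(processed) {
  if (typeof processed.jsonData !== 'string' || processed.jsonData === '') {
    throw new Error(`No json_data set for ${processed.document}`);
  }
  try {
    return {
      document: processed.document,
      data: JSON.parse(processed.jsonData),
    };
  } catch (err) {
    throw new Error(
      `json_data for ${processed.document} is not valid JSON: ${err.message}`
    );
  }
}

// Invented payload, for illustration only.
const decoded = decodeProcessedDocument({
  document:
    'projects/my-project/locations/global/collections/default_collection/dataStores/my-store/branches/0/documents/doc-1',
  jsonData: '{"example": true}',
});
console.log(decoded.data);
```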
@@ -158,6 +158,19 @@ service DocumentService {
metadata_type: "google.cloud.discoveryengine.v1alpha.PurgeDocumentsMetadata"
};
}

// Gets the parsed layout information for a
// [Document][google.cloud.discoveryengine.v1alpha.Document].
rpc GetProcessedDocument(GetProcessedDocumentRequest)
returns (ProcessedDocument) {
option (google.api.http) = {
get: "/v1alpha/{name=projects/*/locations/*/dataStores/*/branches/*/documents/*}:getProcessedDocument"
additional_bindings {
get: "/v1alpha/{name=projects/*/locations/*/collections/*/dataStores/*/branches/*/documents/*}:getProcessedDocument"
}
};
option (google.api.method_signature) = "name";
}
}
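The `google.api.http` option above gives `GetProcessedDocument` two GET bindings, one with a `collections/*` segment and one without. As a rough sketch of what those bindings accept, the hypothetical helper `getProcessedDocumentPath` below checks a document name against both templates (treating each `*` as exactly one path segment) and returns the REST path. It does not perform any network call, and the resource IDs in the usage lines are made up.

```javascript
// One regular expression per HTTP binding; each "*" in the template becomes
// a single non-empty path segment.
const DOCUMENT_NAME_PATTERNS = [
  /^projects\/[^/]+\/locations\/[^/]+\/dataStores\/[^/]+\/branches\/[^/]+\/documents\/[^/]+$/,
  /^projects\/[^/]+\/locations\/[^/]+\/collections\/[^/]+\/dataStores\/[^/]+\/branches\/[^/]+\/documents\/[^/]+$/,
];

// Hypothetical helper: build the REST path for GetProcessedDocument, or throw
// when the name matches neither binding.
function getProcessedDocumentPath(name) {
  if (!DOCUMENT_NAME_PATTERNS.some((pattern) => pattern.test(name))) {
    throw new Error(`Not a valid document resource name: ${name}`);
  }
  return `/v1alpha/${name}:getProcessedDocument`;
}

console.log(
  getProcessedDocumentPath(
    'projects/my-project/locations/global/dataStores/my-store/branches/0/documents/doc-1'
  )
);
```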

// Request message for
@@ -322,3 +335,54 @@ message DeleteDocumentRequest {
}
];
}

// Request message for
// [DocumentService.GetProcessedDocument][google.cloud.discoveryengine.v1alpha.DocumentService.GetProcessedDocument]
// method.
message GetProcessedDocumentRequest {
// The type of processing to return in the response.
enum ProcessedDocumentType {
// Default value.
PROCESSED_DOCUMENT_TYPE_UNSPECIFIED = 0;

// Available for all data store parsing configs.
PARSED_DOCUMENT = 1;

// Only available if ChunkingConfig is enabled on the data store.
CHUNKED_DOCUMENT = 2;
}

// The format of the returned processed document. If unspecified, defaults to
// JSON.
enum ProcessedDocumentFormat {
// Default value.
PROCESSED_DOCUMENT_FORMAT_UNSPECIFIED = 0;

// Output format will be a JSON string representation of the processed document.
JSON = 1;
}

// Required. Full resource name of
// [Document][google.cloud.discoveryengine.v1alpha.Document], such as
// `projects/{project}/locations/{location}/collections/{collection}/dataStores/{data_store}/branches/{branch}/documents/{document}`.
//
// If the caller does not have permission to access the
// [Document][google.cloud.discoveryengine.v1alpha.Document], regardless of
// whether or not it exists, a `PERMISSION_DENIED` error is returned.
//
// If the requested [Document][google.cloud.discoveryengine.v1alpha.Document]
// does not exist, a `NOT_FOUND` error is returned.
string name = 1 [
(google.api.field_behavior) = REQUIRED,
(google.api.resource_reference) = {
type: "discoveryengine.googleapis.com/Document"
}
];

// Required. What type of processing to return.
ProcessedDocumentType processed_document_type = 2
[(google.api.field_behavior) = REQUIRED];

// What format output should be. If unspecified, defaults to JSON.
ProcessedDocumentFormat processed_document_format = 3;
}
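To make the required and optional fields of `GetProcessedDocumentRequest` concrete, here is a minimal sketch of a client-side request builder. The enum tables mirror the values in the message above. Everything else is an assumption: `buildGetProcessedDocumentRequest` is a hypothetical helper, the camelCase keys and string enum names reflect how the generated Node.js client is expected to accept requests, and rejecting the `UNSPECIFIED` type up front is a guess at how the server treats the required field.

```javascript
// Enum values copied from GetProcessedDocumentRequest.
const ProcessedDocumentType = Object.freeze({
  PROCESSED_DOCUMENT_TYPE_UNSPECIFIED: 0,
  PARSED_DOCUMENT: 1,
  CHUNKED_DOCUMENT: 2,
});

const ProcessedDocumentFormat = Object.freeze({
  PROCESSED_DOCUMENT_FORMAT_UNSPECIFIED: 0,
  JSON: 1,
});

// Hypothetical helper: validate the inputs and return a request object.
function buildGetProcessedDocumentRequest({ name, type, format }) {
  if (typeof name !== 'string' || name === '') {
    throw new Error('name is required');
  }
  const knownTypes = Object.keys(ProcessedDocumentType);
  // Assumption: an unspecified type is treated as missing for a REQUIRED field.
  if (!knownTypes.includes(type) || ProcessedDocumentType[type] === 0) {
    throw new Error(`processed_document_type must be one of: ${knownTypes.slice(1).join(', ')}`);
  }
  const request = { name, processedDocumentType: type };
  if (format !== undefined) {
    if (!Object.keys(ProcessedDocumentFormat).includes(format)) {
      throw new Error(`Unknown processed_document_format: ${format}`);
    }
    request.processedDocumentFormat = format;
  }
  // When format is omitted, the proto comment says the output defaults to JSON.
  return request;
}

console.log(
  buildGetProcessedDocumentRequest({
    name: 'projects/my-project/locations/global/collections/default_collection/dataStores/my-store/branches/0/documents/doc-1',
    type: 'CHUNKED_DOCUMENT',
  })
);
```

Per the enum comments, `CHUNKED_DOCUMENT` is only available when chunking is enabled on the data store, so a caller may want to fall back to `PARSED_DOCUMENT` otherwise.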