Add document limits to index and bulk pages #4537

Merged
merged 4 commits into from
Jul 19, 2023
Changes from 1 commit
3 changes: 3 additions & 0 deletions _api-reference/document-apis/bulk.md
@@ -14,6 +14,9 @@ Introduced 1.0
The bulk operation lets you add, update, or delete multiple documents in a single request. Compared to individual OpenSearch indexing requests, the bulk operation has significant performance benefits. Whenever practical, we recommend batching indexing operations into bulk requests.


Beginning in OpenSearch 2.9, the bulk operation enforces a memory limit on each document in the request: a document must be 512 MB or less.
{: .note}
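
As a rough illustration of how add, update, and delete operations can share a single bulk request, the following Python sketch assembles a newline-delimited JSON body. The helper `build_bulk_body`, the `movies` index, and the sample documents are hypothetical names chosen for this example; the sketch only builds the request body and does not send it to a cluster.

```python
import json


def build_bulk_body(actions):
    """Serialize (action, metadata, source) tuples into a bulk request body.

    `source` is None for actions that carry no document line, such as delete.
    """
    lines = []
    for action, metadata, source in actions:
        # Action line, for example {"index": {"_index": "movies", "_id": "1"}}
        lines.append(json.dumps({action: metadata}))
        if source is not None:
            # Optional document (or partial document) line
            lines.append(json.dumps(source))
    # A bulk body is newline-delimited and must end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical sample operations against a hypothetical "movies" index.
actions = [
    ("index", {"_index": "movies", "_id": "1"}, {"title": "Example A"}),
    ("update", {"_index": "movies", "_id": "1"}, {"doc": {"year": 2023}}),
    ("delete", {"_index": "movies", "_id": "2"}, None),
]

body = build_bulk_body(actions)
print(body, end="")
```

The resulting string would typically be sent as the body of a `POST _bulk` request by whichever HTTP client you already use.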

## Example

```json
```
4 changes: 4 additions & 0 deletions _im-plugin/index.md
@@ -16,6 +16,8 @@ You index data using the OpenSearch REST API. Two APIs exist: the index API and

For situations in which new data arrives incrementally (for example, customer orders from a small business), you might use the index API to add documents individually as they arrive. For situations in which the flow of data is less frequent (for example, weekly updates to a marketing website), you might prefer to generate a file and send it to the `_bulk` API. For large numbers of documents, lumping requests together and using the `_bulk` API offers superior performance. If your documents are enormous, however, you might need to index them individually.

To keep indexing manageable through both the index API and the `_bulk` API, each document in an index must be smaller than 512 MB.
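
A client could check document sizes before sending them. The Python sketch below is one possible approach, not a description of how OpenSearch itself measures documents: it assumes that the size of the UTF-8 encoded JSON is a reasonable approximation and that "512mb" means 512 × 1024 × 1024 bytes. The function names are hypothetical.

```python
import json

# Limit as stated in this change. Treating it as 512 * 1024 * 1024 bytes
# is an assumption made for this sketch.
MAX_DOC_BYTES = 512 * 1024 * 1024


def document_size(doc):
    """Approximate a document's size as the length of its UTF-8 encoded JSON."""
    return len(json.dumps(doc).encode("utf-8"))


def partition_by_size(docs, limit=MAX_DOC_BYTES):
    """Split documents into those under the limit and those at or over it."""
    accepted, rejected = [], []
    for doc in docs:
        if document_size(doc) < limit:
            accepted.append(doc)
        else:
            rejected.append(doc)
    return accepted, rejected


# Demonstration with a deliberately tiny limit so the split is visible.
docs = [{"a": "x"}, {"a": "x" * 100}]
small, large = partition_by_size(docs, limit=50)
print(len(small), len(large))
```

Documents in the second list could then be trimmed, split, or reported before any request is made.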


## Introduction to indexing

@@ -91,6 +93,8 @@ OpenSearch indexes have the following naming restrictions:

`:`, `"`, `*`, `+`, `/`, `\`, `|`, `?`, `#`, `>`, or `<`
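
Assuming the characters listed above are the ones an index name must not contain, a client-side check might look like the following Python sketch. It covers only the characters visible in this hunk; any other naming restrictions are outside its scope, and the function name is hypothetical.

```python
# Characters taken from the list above; treating them as disallowed in
# index names is the assumption this sketch relies on.
FORBIDDEN_CHARS = set(':"*+/\\|?#><')


def forbidden_chars_in(name):
    """Return the listed characters that appear in `name`, sorted."""
    return sorted(set(name) & FORBIDDEN_CHARS)


print(forbidden_chars_in("logs-2023"))
print(forbidden_chars_in("a/b:c"))
```

An empty result means the name contains none of the listed characters; it does not by itself guarantee that the name is valid.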



## Read data

After you index a document, you can retrieve it by sending a GET request to the same endpoint that you used for indexing: