Add understanding workloads section #6164
Signed-off-by: Naarcha-AWS <[email protected]>
## General search clusters

For benchmarking clusters built for general search use cases, start with [nyc_taxis](https://github.com/opensearch-project/opensearch-benchmark-workloads/tree/main/nyc_taxis). The `nyc_taxis` workload data about the rides performed by yellow taxis in New York in 2015.
"workload data about" -> "workload contains data about"
Consider the following when deciding which workload would work best for benchmarking your cluster:

- Consider the use case of your cluster.
- Consider what data types your cluster uses by comparing it to the data structure of the documents contained in the workload. Each workload contains an example document so you can compare data types. Also, you can go to the `index.json` file in the workload to see the data type.
"to see the data type" -> "to see the index mappings and data types"
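To make the data type comparison concrete, here is a minimal Python sketch. It assumes both your cluster's mappings and the workload's `index.json` mappings are available as dictionaries with a `properties` tree. The helper names `flatten_types` and `compare_field_types` and the sample fields are hypothetical, chosen only for illustration; they are not part of OpenSearch Benchmark.

```python
def flatten_types(mappings, prefix=""):
    """Return {dotted.field.name: type} for a mappings 'properties' tree."""
    types = {}
    for name, spec in mappings.get("properties", {}).items():
        path = f"{prefix}{name}"
        if "properties" in spec:
            # Nested object: recurse and prefix child names with the parent path.
            types.update(flatten_types(spec, prefix=f"{path}."))
        else:
            types[path] = spec.get("type", "object")
    return types


def compare_field_types(cluster_mappings, workload_mappings):
    """Return (types both use, types only the cluster uses), each sorted."""
    cluster_types = set(flatten_types(cluster_mappings).values())
    workload_types = set(flatten_types(workload_mappings).values())
    return sorted(cluster_types & workload_types), sorted(cluster_types - workload_types)


# Illustrative mappings only; real ones come from your cluster and the workload.
cluster = {"properties": {
    "price": {"type": "scaled_float"},
    "location": {"type": "geo_point"},
    "title": {"type": "text"},
}}
workload = {"properties": {
    "passenger_count": {"type": "integer"},
    "trip_distance": {"type": "scaled_float"},
    "pickup_location": {"type": "geo_point"},
}}

shared, missing = compare_field_types(cluster, workload)
print("covered by workload:", shared)
print("not covered by workload:", missing)
```

A long "not covered" list suggests the workload's documents may not resemble your cluster's data closely enough for a representative benchmark.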
## _operations and _test-procedures

To make the workload more human-readable, operations and test procedures are separated into two different directors.
"different directors" -> "different directories"
The index.json file contains all of the data mappings and parameters used to index any documents contained inside the workload, as well as the index settings needed when the `create-index` operation is run during the workload.

For example, in the `nyc_taxis` workload, the `settings` array gives you the ability to customize the number of shards, replicas, and tells the index whether to cache queries or requests. All mappings are based off of a single document, usually called in the `files.txt` file, and includes each mapping parameter and its format, as shown in the following example:
Could we rephrase these two paragraphs to something such as:
When OSB creates an index for the workload, it uses the index settings and mappings template found in index.json. Mappings in index.json are based off of the mappings of a single document from the workload's corpus, which can be found in files.txt. For example, the following is the index.json for nyc_taxis workload. Users can customize fields such as number_of_shards, number_of_replicas, query_cache_enabled, and requests_cache_enabled.
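As a rough illustration of the settings and mappings described in this thread, the following Python sketch builds an index body with the four customizable fields mentioned above. The function `build_index_body` is hypothetical, the setting keys are standard OpenSearch index settings that are assumed (not confirmed by this thread) to match what the workload uses, and the three mapped fields are an illustrative subset rather than the actual contents of the `nyc_taxis` `index.json`.

```python
import json


def build_index_body(number_of_shards=1, number_of_replicas=0,
                     query_cache_enabled=False, requests_cache_enabled=False):
    """Return an index body shaped like a settings-plus-mappings template."""
    return {
        "settings": {
            # Assumed setting names; check the workload's index.json for the real ones.
            "index.number_of_shards": number_of_shards,
            "index.number_of_replicas": number_of_replicas,
            "index.queries.cache.enabled": query_cache_enabled,
            "index.requests.cache.enable": requests_cache_enabled,
        },
        "mappings": {
            "properties": {
                # Illustrative fields: each maps a document field to a type and format.
                "passenger_count": {"type": "integer"},
                "pickup_datetime": {"type": "date", "format": "yyyy-MM-dd HH:mm:ss"},
                "pickup_location": {"type": "geo_point"},
            }
        },
    }


# Override the shard and replica counts, leaving both caches disabled.
body = build_index_body(number_of_shards=3, number_of_replicas=1)
print(json.dumps(body, indent=2))
```

The defaults here are placeholders; a real workload may template these values differently.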
### Schedule

The `schedule` element contains a list of actions and operations that are run by the workload. Operations run according to the order in which they appear in the `schedule`. The following example illustrates a `schedule` with multiple operations, each defined by its `operation-type`:
Maybe we can rephrase this to something like:
The schedule element contains a list of operations that are run according to the order they appear in.
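The ordering behavior can be sketched in a few lines of Python. This is only a model of the idea, not how OpenSearch Benchmark executes workloads: `run_schedule` and the no-op handlers are hypothetical, and the schedule entries are simplified to the `operation-type` field discussed here.

```python
# A simplified schedule: each entry wraps one operation with its operation-type.
schedule = [
    {"operation": {"operation-type": "delete-index"}},
    {"operation": {"operation-type": "create-index"}},
    {"operation": {"operation-type": "bulk", "bulk-size": 5000}},
    {"operation": {"operation-type": "search"}},
]


def run_schedule(schedule, handlers):
    """Run each operation in list order and return the executed operation types."""
    executed = []
    for task in schedule:
        operation = task["operation"]
        op_type = operation["operation-type"]
        handlers[op_type](operation)  # dispatch on the operation-type
        executed.append(op_type)
    return executed


# No-op handlers stand in for the real work each operation would perform.
handlers = {
    name: (lambda operation: None)
    for name in ("delete-index", "create-index", "bulk", "search")
}

order = run_schedule(schedule, handlers)
print(order)
```

Because the list is walked front to back, moving an entry in `schedule` is the only thing needed to change when that operation runs.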
Left a few comments
@Naarcha-AWS @natebower Please see my review comments. Thank you, Melissa
Co-authored-by: Melissa Vagi <[email protected]> Signed-off-by: Naarcha-AWS <[email protected]>
@Naarcha-AWS Nice job on this 😄. Please tag me on the rewrites of lines 107 and 161 in the second file so that I can verify them. Thanks!
Co-authored-by: Nathan Bower <[email protected]> Co-authored-by: Melissa Vagi <[email protected]> Signed-off-by: Naarcha-AWS <[email protected]>
* Add understanding workloads section.
* Add additional anatomy sections
* Add section headers
* Fix link
* Fix typos
* Change example to fix build error.
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
* Update anatomy-of-a-workload.md
* Fix build errors
* Update anatomy-of-a-workload.md
* Update anatomy-of-a-workload.md
* Update concepts.md
* Update index.md

---------

Signed-off-by: Naarcha-AWS <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]>
Co-authored-by: Nathan Bower <[email protected]>
(cherry picked from commit bf4ae72)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>