Commit

Add vector search with embedding generation workload
Signed-off-by: Vesa Pehkonen <[email protected]>
vpehkone committed Mar 11, 2024
0 parents commit 23b079f
Showing 177 changed files with 61,764 additions and 0 deletions.
1 change: 1 addition & 0 deletions .github/CODEOWNERS
@@ -0,0 +1 @@
* @IanHoang @gkamat @beaioun @cgchinmay
19 changes: 19 additions & 0 deletions .github/workflows/add-untriaged.yml
@@ -0,0 +1,19 @@
name: Apply 'untriaged' label during issue lifecycle

on:
  issues:
    types: [opened, reopened, transferred]

jobs:
  apply-label:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/github-script@v6
        with:
          script: |
            github.rest.issues.addLabels({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              labels: ['untriaged']
            })
40 changes: 40 additions & 0 deletions .github/workflows/backport.yml
@@ -0,0 +1,40 @@
---
name: Backport
on:
  pull_request_target:
    types:
      - closed
      - labeled

jobs:
  backport:
    name: Backport
    runs-on: ubuntu-latest
    # Only react to merged PRs for security reasons.
    # See https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#pull_request_target.
    if: >
      github.event.pull_request.merged
      && (
        github.event.action == 'closed'
        || (
          github.event.action == 'labeled'
          && contains(github.event.label.name, 'backport')
        )
      )
    permissions:
      contents: write
      pull-requests: write
    steps:
      - name: GitHub App token
        id: github_app_token
        uses: tibdex/[email protected]
        with:
          app_id: ${{ secrets.APP_ID }}
          private_key: ${{ secrets.APP_PRIVATE_KEY }}
          installation_id: 22958780

      - name: Backport
        uses: VachaShah/[email protected]
        with:
          github_token: ${{ steps.github_app_token.outputs.token }}
          head_template: backport/backport-<%= number %>-to-<%= base %>
16 changes: 16 additions & 0 deletions .github/workflows/delete-backport-branch.yml
@@ -0,0 +1,16 @@
---
name: Delete merged branch of the backport PRs
on:
  pull_request:
    types:
      - closed

jobs:
  delete-branch:
    runs-on: ubuntu-latest
    if: startsWith(github.event.pull_request.head.ref,'backport/')
    steps:
      - name: Delete merged branch
        uses: SvanBoxel/delete-merged-branch@main
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
103 changes: 103 additions & 0 deletions .gitignore
@@ -0,0 +1,103 @@
## https://github.com/github/gitignore/blob/master/Global/OSX.gitignore

.DS_Store
.AppleDouble
.LSOverride

# Icon must end with two \r
Icon


# Thumbnails
._*

# Files that might appear in the root of a volume
.DocumentRevisions-V100
.fseventsd
.Spotlight-V100
.TemporaryItems
.Trashes
.VolumeIcon.icns

# Directories potentially created on remote AFP share
.AppleDB
.AppleDesktop
Network Trash Folder
Temporary Items
.apdisk

## kinda based on https://github.com/github/gitignore/blob/master/Global/JetBrains.gitignore

*.iml

## Directory-based project format:
.idea/

## https://github.com/github/gitignore/blob/master/Python.gitignore

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*,cover
.hypothesis/
junit-*.xml

# Translations
*.mo
*.pot

# Django stuff:
*.log

# Sphinx documentation
docs/_build/

# PyBuilder
target/

#Pickles
*.pk

# pyenv
.python-version
22 changes: 22 additions & 0 deletions .whitesource
@@ -0,0 +1,22 @@
{
  "scanSettings": {
    "configMode": "AUTO",
    "configExternalURL": "",
    "projectToken": "",
    "baseBranches": []
  },
  "checkRunSettings": {
    "vulnerableCheckRunConclusionLevel": "failure",
    "displayMode": "diff",
    "useMendCheckNames": true
  },
  "issueSettings": {
    "minSeverityLevel": "LOW",
    "issueType": "DEPENDENCY"
  },
  "remediateSettings": {
    "workflowRules": {
      "enabled": true
    }
  }
}
12 changes: 12 additions & 0 deletions MAINTAINERS.md
@@ -0,0 +1,12 @@
## Overview

This document contains a list of maintainers in this repo. See [opensearch-project/.github/RESPONSIBILITIES.md](https://github.com/opensearch-project/.github/blob/main/RESPONSIBILITIES.md#maintainer-responsibilities) that explains what the role of maintainer means, what maintainers do in this and other repos, and how they should be doing it. If you're interested in contributing, and becoming a maintainer, see [CONTRIBUTING](CONTRIBUTING.md).

## Current Maintainers

| Maintainer | GitHub ID | Affiliation |
| ---------------- | ----------------------------------------------------- | ----------- |
| Ian Hoang | [IanHoang](https://github.com/IanHoang) | Amazon |
| Govind Kamat | [gkamat](https://github.com/gkamat) | Amazon |
| Mingyang Shi | [beaioun](https://github.com/beaioun) | OSCI |
| Chinmay Gadgil | [cgchinmay](https://github.com/cgchinmay) | Amazon |
13 changes: 13 additions & 0 deletions NOTICE
@@ -0,0 +1,13 @@
OpenSearch
Copyright 2022 OpenSearch Contributors

This product includes software developed by Elasticsearch (http://www.elastic.co) which includes the following Notices:

Rally
Copyright 2012-2019 Elasticsearch B.V.

Rally Tracks
Copyright 2012-2019 Elasticsearch B.V.

Rally Teams
Copyright 2017-2019 Elasticsearch B.V.
35 changes: 35 additions & 0 deletions README.md
@@ -0,0 +1,35 @@
OpenSearch Benchmark Workloads
------------

This repository contains the default workload specifications for the OpenSearch benchmarking tool [OpenSearch Benchmark](https://github.com/opensearch-project/OpenSearch-Benchmark).

You should not need to use this repository directly, except if you want to look under the hood or create your own workloads.

How to Contribute
-----------------

If you want to contribute a workload, please ensure that it works against the main version of OpenSearch (i.e. submit PRs against the `main` branch). We can then check whether it's feasible to backport the track to earlier OpenSearch/Elasticsearch versions.

After making changes to a workload, it's recommended for developers to run a simple test with that workload in `test-mode` to determine if there are any breaking changes.

See all details in the [contributor guidelines](https://github.com/opensearch-project/opensearch-benchmark/blob/main/CONTRIBUTING.md).

Backporting changes
-------------------

With each pull request, maintainers of this repository will be responsible for determining if a change can be backported.
Backporting a change involves cherry-picking a commit onto the branches which correspond to earlier versions of OpenSearch/Elasticsearch.
This ensures that workloads work for the latest `main` version of OpenSearch as well as older versions.

Changes should be `git cherry-pick`ed from `main` to the most recent version of OpenSearch and backward from there.
Example:
```
main → OpenSearch 2 → OpenSearch 1 → Elasticsearch 7 → ...
```
In the case of a merge conflict for a backported change a new pull request should be raised which merges the change.


License
-------

There is no single license for this repository. Licenses are chosen per workload. They are typically licensed under the same terms as the source data. See the README files of each workload for more details.
3 changes: 3 additions & 0 deletions big5/CHANGES.txt
@@ -0,0 +1,3 @@

The JSON files comprising the workload have been modified to conform to OpenSearch Benchmark terminology and comply with OpenSearch features. The "challenges" directory has been renamed to "test_procedures" for the same reason.
