Cohere vector benchmarks #444

Merged

TattdCodeMonkey
Contributor

This PR adds the cohere_vector Rally track, which the Search team will use to benchmark vector search with ~30M vectors. We will also add tracks for smaller datasets so we can compare performance at different dataset sizes.

Note:

  • The Cohere vectors had to be normalized to unit length before they could be ingested.
  • I don't believe we support the correct similarity algorithm for Cohere, so the benchmark is not expected to produce relevant results. For now it is a performance test only.
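
For readers unfamiliar with the normalization step in the first note, here is a minimal sketch of scaling a vector to unit length. It is an illustration only, not the script used in this PR: the function name `normalize_to_unit_length` is hypothetical, and the track's actual parsing script may do this differently (for example with a numeric library). A likely reason for the requirement is that the index mapping uses a similarity that only accepts unit-length vectors, but that is an assumption.

```python
import math
from typing import List, Sequence


def normalize_to_unit_length(vector: Sequence[float]) -> List[float]:
    """Scale a vector so that its L2 norm is 1.0.

    Hypothetical helper for illustration; not taken from the track's code.
    """
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        # A zero vector has no direction, so it cannot be normalized.
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


# A 3-4-5 triangle makes the result easy to check by hand.
example = normalize_to_unit_length([3.0, 4.0])
example_norm = math.sqrt(sum(x * x for x in example))
```

Normalizing changes only the magnitude of each vector, not its direction, so cosine similarity between vectors is unaffected.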

@TattdCodeMonkey added the "new workload" label (Any work related to adding a new track or functionality within a track) on Aug 4, 2023
Contributor

@jimczi left a comment

lgtm

cohere_vector/operations/default.json (outdated, resolved)
@TattdCodeMonkey merged commit 98a66d1 into elastic:master on Aug 21, 2023
8 checks passed
inqueue pushed a commit to inqueue/rally-tracks that referenced this pull request Dec 6, 2023
* cohere vector track

* normalize vectors for cohere track

* cohere_vector: add queries & track updates

* cohere_vector: updated README, index & parse docs

* cohere_vector: track with compressed size

* cohere_vector: fix formatting issues

* cohere_vector: split dataset into 11 files

Updated the parse documents script to output multiple json files instead
of one large file.

* cohere_vector: remove script_score challenge

* cohere_vector: fix linting

* cohere_vector: fix track files

* cohere_vector: add test files for track

* cohere_vector: updated README for creating N document files

* Updated readme with sizing requirements
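
The commit message above mentions changing the parse documents script to write multiple JSON files instead of one large file. A minimal sketch of that idea follows, assuming newline-delimited JSON output with one document per line. The function name `write_in_chunks`, the `documents-N.json` file naming, and the sample documents are all hypothetical; the script in this PR may be organized differently.

```python
import json
import os
import tempfile
from typing import Dict, Iterable, List


def write_in_chunks(
    docs: Iterable[Dict],
    out_dir: str,
    docs_per_file: int,
    prefix: str = "documents",  # hypothetical naming scheme
) -> List[str]:
    """Write docs as newline-delimited JSON, starting a new file every docs_per_file docs."""
    paths: List[str] = []
    handle = None
    try:
        for i, doc in enumerate(docs):
            if i % docs_per_file == 0:
                # Close the previous chunk before opening the next one.
                if handle is not None:
                    handle.close()
                path = os.path.join(out_dir, f"{prefix}-{len(paths) + 1}.json")
                paths.append(path)
                handle = open(path, "w", encoding="utf-8")
            handle.write(json.dumps(doc) + "\n")
    finally:
        if handle is not None:
            handle.close()
    return paths


# Usage: 5 small made-up documents, 2 per file, gives 3 files.
with tempfile.TemporaryDirectory() as tmp:
    sample_docs = ({"id": i, "emb": [0.0, 1.0]} for i in range(5))
    paths = write_in_chunks(sample_docs, tmp, docs_per_file=2)
    file_count = len(paths)
    with open(paths[-1], encoding="utf-8") as f:
        last_file_lines = len(f.readlines())
```

Because the documents are consumed from an iterable one at a time, the full dataset never has to be held in memory, which matters at the ~30M vector scale described in this PR.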
Labels
new workload Any work related to adding a new track or functionality within a track
2 participants