First version of the wikipedia dataset + creating the track. #429

Merged
merged 23 commits into from
Sep 18, 2023

@afoucret afoucret commented Jul 3, 2023

The changes in this PR create the Wikipedia dataset and a Rally track for benchmarking stateful and serverless environments.

@saikatsarkar056 saikatsarkar056 marked this pull request as ready for review August 7, 2023 18:24
@saikatsarkar056 saikatsarkar056 self-assigned this Aug 7, 2023
@dliappis dliappis self-requested a review August 9, 2023 08:57
@dliappis dliappis left a comment

Thank you. This is phenomenally good and well polished work.
I left a few comments, mainly about documenting all the track parameters and about the use of random.

wikipedia/README.md (resolved)
wikipedia/track.py (outdated, resolved)
wikipedia/track.py (resolved)
Co-authored-by: Dimitrios Liappis <[email protected]>
wikipedia/README.md (outdated, resolved)
afoucret and others added 3 commits September 5, 2023 14:44
Co-authored-by: Dimitrios Liappis <[email protected]>
- compile query cleaning regexp
- make the number of search iterations configurable (higher by default)
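For readers unfamiliar with the first change, here is a minimal sketch of what compiling a query-cleaning regexp once at module level can look like. The pattern and the names `_QUERY_CLEAN_RE` and `clean_query` are hypothetical and are not taken from this PR; the actual expression in `wikipedia/track.py` may differ.

```python
import re

# Hypothetical pattern: characters that query_string-style syntax treats specially.
# Compiled once at import time so it is not re-parsed for every query.
_QUERY_CLEAN_RE = re.compile(r'[+\-=&|><!(){}\[\]^"~*?:\\/]')


def clean_query(raw: str) -> str:
    """Replace special characters with spaces and collapse repeated whitespace."""
    return " ".join(_QUERY_CLEAN_RE.sub(" ", raw).split())
```

A call such as `clean_query('foo (bar) "baz"')` would strip the parentheses and quotes while leaving the words intact.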
@jimczi jimczi left a comment

LGTM


    def params(self):
        result = {
            "body": {"query": {"query_string": {"query": next(self._queries_iterator), "default_field": self._params["search-fields"]}}},
Contributor

Let's use a multi-match query to avoid the reserved keywords (and, or, ...)?

Contributor Author

I will do this in a follow-up. We still have a few runs to do, and I would prefer to keep the same query for now to get comparable results.
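To illustrate the trade-off discussed in this thread, the sketch below builds both request bodies side by side. It is only an illustration under assumptions, not the follow-up change itself: the helper names and the field names in the usage note are hypothetical, and the PR does not show whether `search-fields` is a single field or a list, so the helper accepts either.

```python
def query_string_body(text, default_field):
    # Same shape as the excerpt under review: query_string parses its input,
    # so words such as AND / OR act as operators.
    return {"query": {"query_string": {"query": text, "default_field": default_field}}}


def multi_match_body(text, fields):
    # multi_match analyzes the text as plain terms and does not parse operators.
    # It expects a list of field names, so a single string is wrapped (assumption:
    # the track parameter may be either a string or a list).
    if isinstance(fields, str):
        fields = [fields]
    return {"query": {"multi_match": {"query": text, "fields": list(fields)}}}
```

For example, `multi_match_body("cats and dogs", ["title", "content"])` treats `and` as an ordinary term, whereas the `query_string` form may interpret it as an operator. Switching query types changes what is measured, which is why keeping the original query until the remaining runs finish keeps results comparable.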

@afoucret afoucret merged commit a40f531 into elastic:master Sep 18, 2023
8 checks passed
@b-deam b-deam mentioned this pull request Oct 24, 2023
inqueue pushed a commit to inqueue/rally-tracks that referenced this pull request Dec 6, 2023
4 participants