Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cirrus: Store podman machine benchmark data #17735

Merged
merged 1 commit into from
Mar 21, 2023

Conversation

cevich
Copy link
Member

@cevich cevich commented Mar 10, 2023

Future work will present podman-machine benchmark data in some useful
format for analysis. However, this data is currently only stored as CI
artifacts. When CI runs on the main branch, after a PR merges, utilize
a pair of purpose-built containers to retrieve then upload the data into
a GCE firestore database. This operation should not be critical, such
that any faults will not cause the entire CI build to be marked as a
failure.

Note: This uses a container with code from the containers/automation repo. to do the actual database insertion.

Does this PR introduce a user-facing change?

None

@openshift-ci openshift-ci bot added release-note-none do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Mar 10, 2023
@cevich
Copy link
Member Author

cevich commented Mar 14, 2023

@edsantiago you mentioned in my demo that keying the data keyed on task is useless. I could fairly easily change it to being keyed by timestamp, do you think that would be more useful?

@edsantiago
Copy link
Member

This is a tough one. To describe the nature of the problem, let's refer to your idea this morning of "comparing against yesterday's results": how exactly does one find that baseline? Task and build IDs are not useful directly, there needs to be a prior step (query by PR, or by cron and date, or ????). That's a solvable problem, it just takes some thinking.

Speaking of thinking, would it make sense to key on Build ID instead of task? Reason: to get a task ID you need to query for both PR and CI-job-with-a-given-name. To get a Build ID, all you need to query is for PR.

@cevich
Copy link
Member Author

cevich commented Mar 14, 2023

Task and build IDs are not useful directly,

Yes. My understanding of firestore is the document "names" don't really matter much. In order for any queries to work, you have to setup indexes on the keys you're interested in.

Query docs

So as-is (documents keyed by task id), something similar to this could be used (assuming an appropriate index is created):

tasks_ref = db.collection(u'tasks')
yesterday = datetime.whatever()
query = tasks_ref.where(u'stamp', u'>=', yesterday)

The code would then simply iterate through the returned documents, and operate on their stored benchmark values as needed.

Anyway, that's all future work. My point is, I don't think the collection-names or document-IDs matter all that much in NOSQL. What's important is the indices you setup for the document-keys. At least that's my understanding, please disagree if I'm mistaken.

Ref: Document Model

@cevich
Copy link
Member Author

cevich commented Mar 15, 2023

Thinking about this more...I think the most reliable thing would be comparing vs an average of the most recent top-3 data-points. That can (in theory) already be accomplished, so then maybe I just need to add a descending index on the task/stamp field. I'll do some experimentation to see if I can make that work.

@cevich
Copy link
Member Author

cevich commented Mar 15, 2023

Yep, using the WebUI query-builder. I can query /benchmarks/x86_64/tasks ordered descending by the timestamp field stamp, and limit the result to 3 entries. No index required, it "just works".

Future work will present podman-machine benchmark data in some useful
format for analysis.  However, this data is currently only stored as CI
artifacts.  When CI runs on the main branch, after a PR merges, utilize
a pair of purpose-built containers to retrieve then upload the data into
a GCE firestore database.  This operation should not be critical, such
that any faults will not cause the entire CI build to be marked as a
failure.

Signed-off-by: Chris Evich <[email protected]>
@cevich cevich changed the title [WIP] Cirrus: Store podman machine benchmark data Cirrus: Store podman machine benchmark data Mar 15, 2023
@cevich
Copy link
Member Author

cevich commented Mar 15, 2023

Force-push: Stripped off the 'DO NOT MERGE' commit. Meaning, the uploading of benchmarks to firestore will only happen from the main branch. i.e. not from PRs (where we want to do comparisons).

@cevich cevich marked this pull request as ready for review March 15, 2023 19:05
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 15, 2023
@edsantiago
Copy link
Member

Meaning, the uploading of benchmarks to firestore will only happen from the main branch. i.e. not from PRs

Thank you, that addresses my panicky concern.

@cevich
Copy link
Member Author

cevich commented Mar 21, 2023

I believe this PR is ready to go.

@edsantiago
Copy link
Member

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 21, 2023
@rhatdan
Copy link
Member

rhatdan commented Mar 21, 2023

/approve

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 21, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cevich, rhatdan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 21, 2023
@openshift-merge-robot openshift-merge-robot merged commit 3820554 into containers:main Mar 21, 2023
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 5, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 5, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. release-note-none
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants