Regex search in job traces #245
Conversation
Thanks a lot for this contribution @pkovtunov 🙇!
It looks like an interesting use case. I believe some parts of the implementation can be simplified, but otherwise I would be keen to get this merged 👍
@@ -128,6 +128,18 @@ pull:
    # discovered project refs (optional, default: 30)
    interval_seconds: 30

  metrics_with_traces:
I think we could avoid adding another scheduler for the polling of this metric as it can be done during the fetch of the other job's metrics 🤔
The trace processing can take a while, depending on how long the logs are and the job count. In our particular case we want to fetch these metrics only once a day, to keep the load on both sides (exporter and GitLab) as low as possible.
Another approach could be to trigger trace metrics only on every n-th polling, but that would add yet another configuration parameter.
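For what it's worth, the "every n-th polling" alternative mentioned here is easy to express. This is only an illustrative sketch (the function name and the single-scheduler wiring are made up, not code from the PR):

```go
package main

import "fmt"

// shouldPullTraces decides whether a given polling tick should also fetch
// job traces, implementing the "every n-th polling" idea: one scheduler,
// with the trace-enriched fetch piggybacking on every n-th tick.
// n <= 0 disables trace pulls entirely.
func shouldPullTraces(tick, n int) bool {
	return n > 0 && tick%n == 0
}

func main() {
	// With n = 4, only every fourth poll carries the trace-enriched fetch.
	for tick := 1; tick <= 8; tick++ {
		fmt.Printf("tick %d: pullJobTraces=%v\n", tick, shouldPullTraces(tick, 4))
	}
}
```

The trade-off noted above still applies: `n` becomes one more configuration parameter, whereas two separate schedulers keep the two intervals independent.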
    # discovered project refs (optional, default: 1800)
    interval_seconds: 1800

trace_rules:
I would omit this parameter here and use the structure directly within the project settings to avoid confusion
My approach was to reduce redundancy in our case: we (and our business partners) have a large number of different projects, and it is cleaner to define the rules only once and then reference them in the projects.
I suppose for most other use cases it would be fine to put the rules directly into the project/pull/job section. That's what you're talking about, right?
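To make the "define once, reference per project" idea concrete, a config could look roughly like this. The key names below are illustrative only, not the PR's actual schema:

```yaml
# Hypothetical layout: rule definitions shared across projects.
trace_rules:
  deprecations:
    regexp: '(?i)deprecat(ed|ion)'
  kubectl_version:
    regexp: 'kubectl v[0-9]+\.[0-9]+'

projects:
  - name: group/app-one
    pull:
      jobs:
        trace_rules: [deprecations, kubectl_version]  # referenced, not redefined
  - name: group/app-two
    pull:
      jobs:
        trace_rules: [deprecations]
```

The alternative the maintainer suggests would inline the `regexp` entries directly under each project's pull/job section, at the cost of repeating them across projects.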
@@ -5,6 +5,7 @@ import "github.com/prometheus/client_golang/prometheus"
var (
	defaultLabels = []string{"project", "topics", "kind", "ref", "variables"}
	jobLabels     = []string{"stage", "job_name", "runner_description"}
	traceLabels   = []string{"job_id", "trace_rule"}
I believe we already have a metric for the job_id 🤔
Ah yes, I probably just put it there to make the PromQL statement simpler for the new Grafana dashboard:
sort_desc(gitlab_ci_pipeline_job_trace_match_count{project=~"$PROJECT", ref=~"$REF", job=~"$JOB", trace_rule=~"$TRACE_RULE"} > 0)
It's a table view with ID in the first column. I've seen that you also have JobIDs in the Job Dashboard - by joining/merging the statements, right?
	"github.com/mvisonneau/gitlab-ci-pipelines-exporter/pkg/schemas"
	log "github.com/sirupsen/logrus"
)

-func pullRefPipelineJobsMetrics(ref schemas.Ref) error {
+func pullRefPipelineJobsMetrics(ref schemas.Ref, pullJobTraces bool) error {
Could we set pullJobTraces as part of the ref to avoid overloading the subsequent functions?
I would also be glad about that, but I have no good idea how to implement it, and no Go experience.
We have two separate schedules for normal and "enriched" metrics (with traces). Both trigger the same metric fetch mechanism, with the pullJobTraces flag set to false/true respectively. If the flag were part of the ref, we would have to switch it there every time.
Another idea: is it possible to define the flag on the scheduler itself and read the value in exporter/job.go?
I suppose I need your help here :)
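A minimal sketch of the maintainer's suggestion, assuming a trimmed-down stand-in for `schemas.Ref` with a hypothetical `PullJobTraces` field (this is illustration only, not the project's actual struct):

```go
package main

import "fmt"

// Ref is a trimmed stand-in for schemas.Ref; PullJobTraces is the
// hypothetical field that would replace the extra function parameter.
type Ref struct {
	Project       string
	Name          string
	PullJobTraces bool
}

// pullRefPipelineJobsMetrics no longer needs a second argument: the flag
// travels with the ref itself, so downstream signatures stay untouched.
func pullRefPipelineJobsMetrics(ref Ref) string {
	if ref.PullJobTraces {
		return fmt.Sprintf("%s/%s: jobs + traces", ref.Project, ref.Name)
	}
	return fmt.Sprintf("%s/%s: jobs only", ref.Project, ref.Name)
}

func main() {
	// Each scheduler would build its refs with the flag already set,
	// so the two schedules can share the same fetch path.
	fmt.Println(pullRefPipelineJobsMetrics(Ref{Project: "group/app", Name: "main", PullJobTraces: true}))
	fmt.Println(pullRefPipelineJobsMetrics(Ref{Project: "group/app", Name: "main"}))
}
```

This addresses the author's concern: nothing has to be "switched" on a shared ref, because each scheduler constructs its own ref values with the flag pre-set.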
This looks like a cool feature 👍 Are there any plans for when this will be merged / released?
Force-pushed from f1d1bc5 to 718e730
This contribution is based on our requirement to find and count certain patterns in job traces. For example, you can search for (custom) deprecation warnings in CI/CD pipelines and visualize the hits in Grafana. My team needs such an overview to keep an eye on K8s deployments via GitLab and to monitor the usage of our tooling and the running versions. That's how this contribution was born.
For job trace metrics, the job metrics have to be enabled. The job trace scraping interval is configured separately to reduce the API load. It's recommended to enable job trace metrics only for the projects/refs you need to analyze.
Further configuration is fairly simple and documented in the README.md.
I can provide some example snippets of regex rules if needed.
It's my first contribution to a Go project (and, to be honest, my first experience with Go), so any hints and improvements are welcome :)