feat(devops): filter circleci config no-ops #4731

Merged
merged 49 commits into from
Feb 23, 2024
Changes from 23 commits
Commits
d115e13
initial circleci config filter
ludamad Feb 23, 2024
9f6b7f1
jiggle content hash
ludamad Feb 23, 2024
8242725
jiggle content hash
ludamad Feb 23, 2024
c9d4bc8
jiggle content hash
ludamad Feb 23, 2024
1acbecf
unjiggle. ship it
ludamad Feb 23, 2024
106d245
Merge branch 'master' into ad/test-transform-circleci
ludamad Feb 23, 2024
9095bf2
give exec perms to generate script
ludamad Feb 23, 2024
eb9139d
fix script
ludamad Feb 23, 2024
1faaa15
fix script
ludamad Feb 23, 2024
4a8a19b
fix script
ludamad Feb 23, 2024
1c2f9c4
introspect
ludamad Feb 23, 2024
7f04598
inspect
ludamad Feb 23, 2024
fab1582
inspect
ludamad Feb 23, 2024
5b51c6a
ok
ludamad Feb 23, 2024
9028dad
inspect-it
ludamad Feb 23, 2024
7898435
Update
ludamad Feb 23, 2024
5a8648f
basic test
ludamad Feb 23, 2024
1dbdea0
[ci debug]
ludamad Feb 23, 2024
732bda3
ecr_login needed! ci debug helped
ludamad Feb 23, 2024
eab1e9a
compatible image??
ludamad Feb 23, 2024
b9ac952
[ci debug]
ludamad Feb 23, 2024
c60c683
token
ludamad Feb 23, 2024
399a191
token
ludamad Feb 23, 2024
888cc1b
.
ludamad Feb 23, 2024
d8640c3
heuristic
ludamad Feb 23, 2024
4ced479
robustness via aztec_manifest_key
ludamad Feb 23, 2024
5e0b0d2
add aztec keys to everything
ludamad Feb 23, 2024
bc26261
redo with fixed ecr_login
ludamad Feb 23, 2024
68eed43
fixes
ludamad Feb 23, 2024
ad7f690
[ci debug]
ludamad Feb 23, 2024
490a6af
[ci debug] test
ludamad Feb 23, 2024
9659ab7
test
ludamad Feb 23, 2024
b4ad276
test [ci debug]
ludamad Feb 23, 2024
dacea60
better print
ludamad Feb 23, 2024
cfb06f5
revert
ludamad Feb 23, 2024
e34259b
better print
ludamad Feb 23, 2024
1cc46e1
better print
ludamad Feb 23, 2024
3b42f10
productionize
ludamad Feb 23, 2024
aca0cb2
reinstate config
ludamad Feb 23, 2024
0556f38
cleanup
ludamad Feb 23, 2024
1c2821c
fix rebuild patterns
ludamad Feb 23, 2024
1a6c1fe
more printing
ludamad Feb 23, 2024
9a796aa
fix
ludamad Feb 23, 2024
12a4fd7
try not using xlarge
ludamad Feb 23, 2024
1b97faf
Merge remote-tracking branch 'origin/master' into ad/test-transform-c…
ludamad Feb 23, 2024
2d92791
seems worth saving 8 seconds?
ludamad Feb 23, 2024
3685d1a
revert. test
ludamad Feb 23, 2024
00c3fea
revert. test worked.
ludamad Feb 23, 2024
d5203fe
dont need to run pip
ludamad Feb 23, 2024
32 changes: 30 additions & 2 deletions .circleci/config.yml
@@ -16,8 +16,10 @@

version: 2.1

setup: true # have a dynamic config step
orbs:
  slack: circleci/[email protected]
  continuation: circleci/[email protected]
  slack: circleci/[email protected]

parameters:
  workflow:
@@ -69,6 +71,27 @@ setup_env: &setup_env
command: ./build-system/scripts/setup_env "$CIRCLE_SHA1" "$CIRCLE_TAG" "$CIRCLE_JOB" "$CIRCLE_REPOSITORY_URL" "$CIRCLE_BRANCH" "$CIRCLE_PULL_REQUEST"

jobs:
  # Dynamically filter our code, quickly figuring out which jobs we can skip.
  generate-config:
    machine:
      image: ubuntu-2204:2023.07.2
    resource_class: large
    steps:
      - *checkout
      - *setup_env
      - run:
          name: Generate Pipeline generated_config.yml file
          command: |
            echo "$AWS_SECRET_ACCESS_KEY"
            # ability to query ECR images
            ecr_login
            # convert to json, run a python script, write out json (which is readable as YAML)
            pip install PyYAML
            build-system/scripts/generate_circleci_config.py > .circleci/generated_config.yml
            cat .circleci/generated_config.yml
            echo "[]" > .circleci/generated_config.yml
      - continuation/continue:
          configuration_path: .circleci/generated_config.yml
  # Noir
  noir-x86_64:
    docker:
@@ -1207,9 +1230,14 @@ bb_test: &bb_test

# Workflows.
workflows:
  setup-workflow:
    jobs:
      - generate-config
  system:
    when:
      equal: [system, << pipeline.parameters.workflow >>]
      # Used to generate a dynamic 'system' workflow
      # This is rewritten to 'system' on the real workflow (otherwise this is ignored by circleci)
      equal: [NEVER, << pipeline.parameters.workflow >>]
    jobs:
      # Noir
      - noir-x86_64: *defaults
97 changes: 97 additions & 0 deletions build-system/scripts/generate_circleci_config.py
@@ -0,0 +1,97 @@
#!/usr/bin/env python3
# Reads .circleci/config.yml (parsed as YAML) and prints a filtered config as JSON to stdout,
# omitting the jobs we don't need to run.
# NOTE: This uses the build manifest YAML file to filter the dependency graph in CircleCI, BUT the mapping is not one-to-one.
# Heuristic: a CircleCI job is associated with a manifest job if one of its run commands contains the build_manifest.yml job name.
import json
import yaml
from concurrent.futures import ProcessPoolExecutor, as_completed
import subprocess

# same functionality as query_manifest rebuildPatterns but in bulk
def get_manifest_job_names():
    manifest = yaml.safe_load(open("build_manifest.yml"))
    return list(manifest)

def has_associated_manifest_job(circleci_job, manifest_names):
    steps = circleci_job.get("steps", [])
    for step in steps:
        run_info = step.get("run", {})
        command = run_info.get("command", "")
        for manifest_name in manifest_names:
            if manifest_name in command:
                return True
Collaborator

I'd aim to make this more robust. Checking manifest_name in command can produce a lot of false positives, and small changes such as rewriting the run command in the shorthand syntax (i.e. run: echo foo instead of run: command: echo foo) break it.

Perhaps we can search each run command (shorthand or not) for the build commands we know (e.g. build, cond_spot_run, etc.) and extract the argument holding the job name, to avoid false positives. Also, I'm not sure whether we should account for jobs that depend on more than one build-manifest project (possibly not).

And as I finished writing this I noticed that this is still in draft, so you had probably already considered all this and I just jumped the gun. Apologies if that's the case.
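
A rough sketch of that idea, for illustration only: it reads the command from either run syntax and only accepts a manifest name when it is the first argument of a known build command. The helper names and the contents of KNOWN_BUILD_COMMANDS are assumptions (only cond_spot_run_build appears in this PR), not code from the branch.

```python
import shlex

# Assumed set of build helpers; adjust to whatever the build system really uses.
KNOWN_BUILD_COMMANDS = {"build", "cond_spot_run_build"}

def get_run_command(step):
    """Return the shell command of a run step, or "" for any other step."""
    if not isinstance(step, dict):
        return ""  # bare string steps such as "checkout"
    run_info = step.get("run", "")
    if isinstance(run_info, str):
        return run_info  # shorthand form: `run: echo foo`
    return run_info.get("command", "")

def extract_manifest_names(step, manifest_names):
    """Yield manifest names passed as the first argument of a known build command."""
    for line in get_run_command(step).splitlines():
        try:
            tokens = shlex.split(line, comments=True)
        except ValueError:
            continue  # e.g. unbalanced quotes; skip the line rather than guess
        if len(tokens) >= 2 and tokens[0] in KNOWN_BUILD_COMMANDS and tokens[1] in manifest_names:
            yield tokens[1]
```

With this, a step like run: echo noir no longer matches the noir manifest entry, while cond_spot_run_build noir 32 does in both the long and the shorthand form.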

Collaborator Author

Good suggestion

Collaborator

I just tested quickly using the circleci local CLI for validating the config yaml, and it seems to allow any property for a step, so we could do:

jobs:
  # Noir
  noir-x86_64:
    docker:
      - image: aztecprotocol/alpine-build-image
    resource_class: small
    steps:
      - *checkout
      - *setup_env
      - run:
          name: "Build"
          command: cond_spot_run_build noir 32
          aztec_project: "noir"

Though I didn't test actually running it, so maybe there's another validation later down the road.

Collaborator Author

That does seem to work, and I agree it's the less footgunny option (plus, if you're confused, you can search for this string and discover this script).
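
For illustration, matching on an explicit tag could look roughly like the sketch below. It assumes the tag sits under run with the key aztec_project, as in the example above; the key finally used in the branch may differ, and get_aztec_projects is a hypothetical helper name.

```python
def get_aztec_projects(circleci_job):
    """Collect the explicit aztec_project tags found on a job's run steps."""
    projects = set()
    for step in circleci_job.get("steps", []):
        if not isinstance(step, dict):
            continue  # bare string steps such as "checkout"
        run_info = step.get("run")
        # Shorthand `run: <command>` is a plain string and cannot carry a tag.
        if isinstance(run_info, dict) and "aztec_project" in run_info:
            projects.add(run_info["aztec_project"])
    return projects

def has_associated_manifest_job(circleci_job, manifest_names):
    """True if the job is explicitly tagged with at least one manifest name."""
    return bool(get_aztec_projects(circleci_job) & set(manifest_names))
```

Untagged jobs are never matched, so they are always kept in the generated workflow; that errs on the side of running too much rather than skipping a job by accident.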

    return False

def get_already_built_circleci_job_names(circleci_jobs):
    manifest_names = list(get_already_built_manifest_job_names())
    for job_name, circleci_job in circleci_jobs.items():
        if has_associated_manifest_job(circleci_job, manifest_names):
            yield job_name

# Helper for multiprocessing
def _get_already_built_manifest_job_names(manifest_name):
    # strip any trailing newline so the cache tag is well-formed
    content_hash = subprocess.check_output(['calculate_content_hash', manifest_name]).decode("utf-8").strip()
    completed = subprocess.run(["check_rebuild", f"cache-{content_hash}", manifest_name], stdout=subprocess.DEVNULL)
    if completed.returncode == 0:
        return manifest_name
    else:
        return None

def get_already_built_manifest_job_names():
    manifest_names = get_manifest_job_names()

    with ProcessPoolExecutor() as executor:
        futures = {executor.submit(_get_already_built_manifest_job_names, key): key for key in manifest_names}
        for future in as_completed(futures):
            result = future.result()
            if result is not None:
                yield result

def remove_jobs_from_workflow(jobs, to_remove):
    """
    Removes jobs from a given CircleCI JSON workflow.

    Parameters:
    jobs (list): The workflow's job entries, each a single-key dict mapping a job name to its settings (including "requires").
    to_remove (list): The list of jobs to be removed from the workflow.

    Returns:
    list: The remaining job entries, with removed jobs also pruned from "requires".
    """

    new_jobs = []
    # Remove specified jobs
    for job in jobs:
        key = next(iter(job))
        if key in to_remove:
            continue
        # remove our filtered jobs from the dependency graph via the requires attribute
        # (use the to_remove parameter rather than the jobs_to_remove global)
        job[key]["requires"] = [r for r in job[key].get("requires", []) if r not in to_remove]
        new_jobs.append(job)
    return new_jobs
import sys

def eprint(*args, **kwargs):
    print(*args, file=sys.stderr, **kwargs)

if __name__ == '__main__':
    # Load the CircleCI config (YAML) into a Python dictionary
    workflow_dict = yaml.safe_load(open('.circleci/config.yml'))

    # List of jobs to remove
    jobs_to_remove = list(get_already_built_circleci_job_names(workflow_dict["jobs"]))

    # Get rid of workflow setup step and setup flag
    workflow_dict["setup"] = False
    del workflow_dict["workflows"]["setup-workflow"]
    # Remove the jobs and get the new workflow
    workflow_dict["workflows"]["system"]["jobs"] = remove_jobs_from_workflow(workflow_dict["workflows"]["system"]["jobs"], jobs_to_remove)
    workflow_dict["workflows"]["system"]["when"] = {"equal":["system","<< pipeline.parameters.workflow >>"]}
    # Convert the new workflow to a JSON string (JSON is also readable as YAML)
    new_workflow_json_str = json.dumps(workflow_dict, indent=2)
    for t in workflow_dict["workflows"]["system"]["jobs"]:
        eprint("KEPT", t)
    print(new_workflow_json_str)
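
To make the pruning step concrete, here is a small standalone sketch of how removing an already-built job also drops it from the requires lists of the jobs that stay. It is a variant written for illustration, not the function from this diff: it additionally tolerates bare-string workflow entries and copies entries instead of mutating them. The job names noir-packages and end are hypothetical.

```python
import copy

def remove_jobs_from_workflow(jobs, to_remove):
    """Drop jobs in to_remove and prune them from the remaining jobs' requires lists."""
    to_remove = set(to_remove)
    new_jobs = []
    for job in jobs:
        # A workflow entry is either a bare job name or a single-key mapping.
        name = job if isinstance(job, str) else next(iter(job))
        if name in to_remove:
            continue
        if isinstance(job, dict):
            job = copy.deepcopy(job)  # leave the caller's config untouched
            settings = job[name] or {}
            if "requires" in settings:
                settings["requires"] = [r for r in settings["requires"] if r not in to_remove]
            job[name] = settings
        new_jobs.append(job)
    return new_jobs

workflow_jobs = [
    {"noir-x86_64": {}},
    {"noir-packages": {"requires": ["noir-x86_64"]}},  # hypothetical downstream job
    "end",                                              # hypothetical bare-string entry
]
filtered = remove_jobs_from_workflow(workflow_jobs, ["noir-x86_64"])
print(filtered)
```

Here noir-x86_64 is treated as already built, so it disappears from the workflow and noir-packages no longer waits on it.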
1 change: 1 addition & 0 deletions build-system/scripts/image_exists
@@ -1,4 +1,5 @@
#!/usr/bin/env bash
[ -n "${BUILD_SYSTEM_DEBUG:-}" ] && set -x # conditionally trace
set -eu
# Returns true if the given image exists in the current ECR.
aws ecr describe-images --region=$ECR_REGION --repository-name=$1 --image-ids=imageTag=$2 > /dev/null 2>&1