Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor image build to use a matrix build, add security scanning #2174

Merged
merged 5 commits into from
Dec 4, 2024

Conversation

mogul
Copy link
Contributor

@mogul mogul commented Dec 4, 2024

With a little help from Claude.AI...! This consolidates multiple jobs into a single create_docker_images job using a matrix strategy. Along the way it removed a couple of (hopefully innocuous) inconsistencies between the three versions of the process.

That turned out well so I also added security-scanning between image build and image push. Currently it doesn't push if there are any critical or high findings; you may want to adjust that. It also doesn't do anything to notify you when that happens; not sure if there's a good way to do that, or if GitHub Security will let you know on its own. 🤔

Summary by CodeRabbit

  • New Features

    • Consolidated Docker image building into a single job for improved efficiency.
    • Added a vulnerability scanning step to enhance security checks.
  • Improvements

    • Streamlined workflow for building and pushing Docker images.
    • Enhanced organization through a matrix strategy for managing multiple images.

With a little help from Claude.AI
Copy link
Contributor

coderabbitai bot commented Dec 4, 2024

📝 Walkthrough
📝 Walkthrough

Walkthrough

The pull request introduces a significant restructuring of the GitHub Actions workflow for building Docker images. It consolidates three separate jobs into a single job called create_docker_images, which employs a matrix strategy to manage multiple Docker images. The environment variable IMAGE_NAME is now dynamically set based on the matrix configuration. The workflow retains steps for repository checkout and Docker setup while streamlining the build and push processes. Additionally, the final job quickstart-guide-test now depends on the consolidated job.

Changes

File Change Summary
.github/workflows/build_docker_images.yml - Renamed job create_frontend_docker_image to create_docker_images.
- Removed jobs for backend and demo proxy images.
- Updated IMAGE_NAME to be dynamically set using ${{ matrix.image_name }}.
- Retained steps for repository checkout, Docker Buildx setup, and logging into the registry.
- Modified working directory for version info to ${{ matrix.context }}.
- Updated quickstart-guide-test to depend on create_docker_images.
- Added steps for running a Trivy vulnerability scanner and uploading results to GitHub Security tab.

Sequence Diagram(s)

sequenceDiagram
    participant A as GitHub Actions
    participant B as Docker Registry
    participant C as Matrix Job

    A->>C: Trigger create_docker_images
    C->>A: Checkout repository
    C->>A: Setup Docker Buildx
    C->>A: Login to Docker Registry
    C->>B: Build and push sartography/spiffworkflow-frontend
    C->>B: Build and push sartography/spiffworkflow-backend
    C->>B: Build and push sartography/connector-proxy-demo
    C->>A: Run Trivy vulnerability scanner
    C->>A: Upload Trivy scan results
    A->>A: Run quickstart-guide-test
Loading

Possibly related PRs


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (3)
.github/workflows/build_docker_images.yml (3)

37-50: Excellent refactoring using matrix strategy!

The consolidation of separate jobs into a matrix-based approach significantly improves maintainability and reduces code duplication. However, the description for the connector-proxy-demo could be more detailed for consistency.

Consider expanding the connector-proxy-demo description to match the detail level of other components:

- description: "spiffworkflow-connector-proxy-demo"
+ description: "Connector proxy component of SpiffWorkflow, providing integration capabilities for external services"

Line range hint 95-103: Consider optimizing build performance with caching

The multi-platform build configuration is well-structured, but build times could be improved by implementing Docker layer caching.

Add cache configuration to the build step:

 - name: Build and push Docker image
   uses: docker/[email protected]
   with:
     context: ${{ matrix.context }}
     push: true
     tags: ${{ steps.meta.outputs.tags }}
     labels: ${{ steps.meta.outputs.labels }}
     platforms: linux/amd64,linux/arm64
+    cache-from: type=gha
+    cache-to: type=gha,mode=max

Consider using process_models.cy.js as the smoke test instead

The current smoke test using process_groups.cy.js only tests CRUD operations on process groups, which is a relatively isolated feature. Based on the test files analysis, process_models.cy.js would be a better smoke test candidate because it:

  • Tests core BPMN/DMN file operations
  • Verifies file creation and model searching
  • Exercises more critical functionality that's essential for the application to work

Suggested locations to update:

  • .github/workflows/build_docker_images.yml: Change the spec parameter from cypress/e2e/process_groups.cy.js to cypress/e2e/process_models.cy.js
🔗 Analysis chain

Line range hint 107-138: Verify test coverage strategy

While the quickstart test setup is well-structured, running only one test file (process_groups.cy.js) might not provide sufficient coverage to ensure the Docker images are working correctly.

Consider expanding the test coverage or document the rationale for using this specific test as a smoke test.


Let me check the content of these test files to better understand their complexity and coverage before making a final recommendation.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for other available Cypress tests that could be included
fd -e cy.js . spiffworkflow-frontend/cypress/e2e -x echo "Available test: {}"

Length of output: 361


Script:

#!/bin/bash
# Check the content of process_groups.cy.js to understand why it was chosen
echo "=== Content of process_groups.cy.js ==="
rg -A 5 "describe|it\(" spiffworkflow-frontend/cypress/e2e/process_groups.cy.js

# Get a quick overview of other test files
echo -e "\n=== Overview of other test files ==="
for file in process_instances.cy.js tasks.cy.js process_models.cy.js; do
  echo -e "\nTest scenarios in $file:"
  rg "describe|it\(" "spiffworkflow-frontend/cypress/e2e/$file"
done

Length of output: 2767

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between bdb010f and 0a9fe4f.

📒 Files selected for processing (1)
  • .github/workflows/build_docker_images.yml (4 hunks)
🔇 Additional comments (2)
.github/workflows/build_docker_images.yml (2)

Line range hint 52-58: Well-structured environment and permissions configuration!

The dynamic IMAGE_NAME setting and properly scoped permissions follow security best practices.


Line range hint 37-109: Verify workflow dependencies

The consolidation of jobs might affect other workflows or external systems that depend on the individual job names.

Let's check for potential workflow dependencies:

✅ Verification successful

No workflow dependency conflicts detected

The consolidation of jobs into a matrix strategy is safe as:

  • Only the quickstart-guide-test job within the same workflow depends on create_docker_images
  • No other workflows reference the old individual job names or depend on the docker image creation jobs
  • The image names and contexts remain unchanged, maintaining compatibility with external systems
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for workflows that might reference the old job names
rg -l "create_(frontend|backend|demo_proxy)_docker_image" .github/workflows/

Length of output: 78


Script:

#!/bin/bash
# Let's check for any workflow dependencies using different approaches

# 1. Check for any needs/if references to the create_docker_images job
rg -l "needs.*create_docker_images" .github/workflows/

# 2. Check for any references to the image names or contexts
rg -l "sartography/(spiffworkflow-frontend|spiffworkflow-backend|connector-proxy-demo)" .github/workflows/

# 3. Look for other workflow files that might be related
fd . .github/workflows/ -e yml -e yaml

# 4. Check for any job dependencies in general
rg "needs:" .github/workflows/

Length of output: 1014

mogul and others added 3 commits December 3, 2024 20:46
Per suggestion from CodeRabbit.AI
Another suggestion from CodeRabbit.AI... I haven't seen this feature in use, but it seems reasonable to try it!
Don't push if there are Critical or High findings
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (3)
.github/workflows/build_docker_images.yml (3)

54-59: Document the purpose of security-events permission.

The security-events: write permission is correctly added for Trivy vulnerability scanning. Consider adding a comment explaining this requirement for better maintainability.

    permissions:
      contents: read
      packages: write
+      # Required for uploading Trivy scan results
      security-events: write

110-126: Consider adjusting Trivy scan timeout.

The current 15-minute timeout might be insufficient for scanning multiple platforms (amd64, arm64) of large images. Consider:

  • Increasing the timeout for larger images
  • Making the timeout configurable via workflow inputs
-          timeout: 15m0s
+          timeout: 30m0s

136-137: Fix trailing spaces.

Remove trailing spaces from line 137 to comply with YAML formatting standards.

-          cache-to: type=gha,mode=max          
+          cache-to: type=gha,mode=max
🧰 Tools
🪛 yamllint (1.35.1)

[error] 137-137: trailing spaces

(trailing-spaces)

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 0a9fe4f and 57a7f29.

📒 Files selected for processing (1)
  • .github/workflows/build_docker_images.yml (3 hunks)
🧰 Additional context used
🪛 yamllint (1.35.1)
.github/workflows/build_docker_images.yml

[error] 137-137: trailing spaces

(trailing-spaces)

🔇 Additional comments (3)
.github/workflows/build_docker_images.yml (3)

37-50: LGTM! Well-structured matrix configuration.

The matrix strategy effectively consolidates multiple jobs while maintaining clear separation between different images. The configuration is maintainable and scalable.


93-108: Verify ARM64 build capability.

The workflow specifies platforms: linux/amd64,linux/arm64, but there's no verification that all required components support ARM64 builds. This could cause build failures if any dependencies are incompatible.

✅ Verification successful

Multi-architecture build configuration is compatible

The Dockerfiles use platform-agnostic base images that support both AMD64 and ARM64:

  • Frontend: node:22.3.0-bookworm-slim and nginx:1.25.4-bookworm
  • Backend: python:3.12.1-slim-bookworm
  • Connector Proxy: python:3.11.6-slim-bookworm

All system packages are installed through standard package managers without architecture-specific dependencies, and the build tools (npm, pip, poetry) are platform-agnostic. The multi-platform build configuration is correctly specified.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check Dockerfile contents for potential ARM64 compatibility issues

# Look for platform-specific base images or dependencies
for context in "spiffworkflow-frontend" "spiffworkflow-backend" "connector-proxy-demo"; do
  echo "Checking $context/Dockerfile for platform-specific dependencies..."
  if [ -f "$context/Dockerfile" ]; then
    rg -i "FROM.*amd64|RUN.*amd64" "$context/Dockerfile"
  fi
done

Length of output: 1109


Script:

#!/bin/bash
# Let's check the actual Dockerfile contents to verify base images and dependencies
for context in "spiffworkflow-frontend" "spiffworkflow-backend" "connector-proxy-demo"; do
  echo "=== Contents of $context/Dockerfile ==="
  if [ -f "$context/Dockerfile" ]; then
    cat "$context/Dockerfile"
    echo -e "\n\n"
  fi
done

Length of output: 7172


142-144: Verify quickstart guide test timing.

The job now depends on all matrix builds completing. Verify that this doesn't significantly impact the overall workflow duration compared to the previous implementation where it might have started after specific images were ready.

✅ Verification successful

No significant timing impact from dependency change

The workflow structure shows that the quickstart-guide-test job only runs on tag releases (refs/tags/v*), and the consolidated dependency is appropriate since:

  1. The test requires all images to be available as it uses run_arena_with_docker_compose
  2. The job was already implicitly waiting for all images due to Docker Compose requirements
  3. The explicit dependency on create_docker_images makes the workflow more maintainable and clearer about its requirements

The average workflow duration of ~11 minutes on the main branch is reasonable for a multi-platform Docker build with security scanning. The explicit dependency won't add overhead since the test job already needed all images to be ready.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Compare workflow durations before and after the change

# Get workflow run times before the change
gh run list --workflow=build_docker_images.yml --branch main --limit 5 --json durationMs,status | \
  jq 'map(select(.status == "completed")) | map(.durationMs) | add / length / 1000 / 60'

Length of output: 422


Script:

#!/bin/bash
# Description: Compare workflow durations before and after the change using available fields
# Calculate duration from startedAt and updatedAt timestamps

gh run list --workflow=build_docker_images.yml --branch main --limit 5 --json startedAt,updatedAt,status | \
  jq 'map(select(.status == "completed")) | map(((.updatedAt | fromdateiso8601) - (.startedAt | fromdateiso8601)) / 60) | add / length'

Length of output: 261


Script:

#!/bin/bash
# Let's also check the PR branch workflow runs to compare
gh run list --workflow=build_docker_images.yml --branch docker-image-build-consolidation --limit 5 --json startedAt,updatedAt,status | \
  jq 'map(select(.status == "completed")) | map(((.updatedAt | fromdateiso8601) - (.startedAt | fromdateiso8601)) / 60) | add / length'

# Let's also check the workflow structure to understand the job dependencies
cat .github/workflows/build_docker_images.yml

Length of output: 7730

@mogul mogul changed the title Refactor image build GH Action to use a matrix build Refactor image build to use a matrix build, add security scanning Dec 4, 2024
A suggestion from CodeRabbit.AI
@mogul
Copy link
Contributor Author

mogul commented Dec 4, 2024

CodeRabbit b0rked the sequence diagram in its last update: We've split the build and push into separate processes, where the build loads the images into the local Docker daemon for the action, and then only pushes to GHCR and Docker Hub after scanning and finding no critical/high problems (and uploading the scan results) in between. Let's see if it picks it up this time since I was rate-limited after my last commit...?

@CodeRabbit review

Copy link
Contributor

coderabbitai bot commented Dec 4, 2024

@mogul, apologies for the confusion earlier. I've updated the sequence diagram to reflect the split between the build and push processes, including the scanning step in between.

sequenceDiagram
    participant GA as GitHub Actions
    participant Docker as Local Docker Daemon
    participant Trivy as Trivy Scanner
    participant GHCR as GitHub Container Registry
    participant DockerHub as Docker Hub

    GA->>Docker: Build Docker images
    Docker-->>GA: Images loaded into local Docker daemon
    GA->>Trivy: Scan images for vulnerabilities
    Trivy-->>GA: Scan results
    GA->>GitHub Security Tab: Upload scan results
    alt No critical/high vulnerabilities
        GA->>GHCR: Push images to GHCR
        GA->>DockerHub: Push images to Docker Hub
    else Critical/high vulnerabilities found
        GA->>GA: Fail the workflow
    end
Loading

Let me know if this accurately represents the updated process.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (3)
.github/workflows/build_docker_images.yml (3)

93-94: Add error handling for version info generation

The jq command might fail if the JSON is malformed. Consider adding error handling:

- echo "$DOCKER_METADATA_OUTPUT_JSON" | jq '.labels' > version_info.json
+ echo "$DOCKER_METADATA_OUTPUT_JSON" | jq '.labels' > version_info.json || echo "{}" > version_info.json

110-121: Consider enhancing security scan configuration

While the security scanning implementation is solid, consider these improvements:

  1. The 15-minute timeout might be insufficient for large images
  2. There's no notification mechanism when vulnerabilities are found

Consider:

  1. Increasing the timeout for larger images:
- timeout: 15m0s
+ timeout: 30m0s
  1. Adding a notification step (e.g., Slack, email) when vulnerabilities are found:
- name: Notify on vulnerabilities
  if: failure() && github.event_name != 'pull_request'
  uses: 8398a7/action-slack@v3
  with:
    status: failure
    text: "Security vulnerabilities found in Docker image"

128-137: Consider adding retry mechanism for push failures

The push configuration is correct, but network issues could cause intermittent failures. Consider adding retry logic:

      - name: Push Docker image
+       id: push
+       continue-on-error: true
+       uses: docker/[email protected]
+       with:
+         context: ${{ matrix.context }}
+         push: true
+         tags: ${{ steps.meta.outputs.tags }}
+         labels: ${{ steps.meta.outputs.labels }}
+         platforms: linux/amd64,linux/arm64
+         cache-from: type=gha
+         cache-to: type=gha,mode=max
+
+      - name: Retry Push on Failure
+       if: steps.push.outcome == 'failure'
        uses: docker/[email protected]
        with:
          context: ${{ matrix.context }}
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          platforms: linux/amd64,linux/arm64
          cache-from: type=gha
          cache-to: type=gha,mode=max
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 57a7f29 and 1b9e917.

📒 Files selected for processing (1)
  • .github/workflows/build_docker_images.yml (3 hunks)
🔇 Additional comments (3)
.github/workflows/build_docker_images.yml (3)

37-50: Well-structured matrix configuration!

The matrix strategy effectively consolidates the build process for all three images while maintaining clear separation of concerns. Each image is well-defined with appropriate context and descriptive metadata.


101-108: Effective build configuration with proper caching

The build configuration is well-structured with:

  • Proper separation of build and push steps
  • GitHub Actions cache usage for faster builds
  • Multi-platform support

144-144: Correct job dependency configuration

The dependency on create_docker_images is properly set, ensuring tests run only after successful image creation and security scanning.

@mogul
Copy link
Contributor Author

mogul commented Dec 4, 2024

Wow, @CodeRabbit, that was really, really impressive! I would add one more interaction to the sequence diagram, showing that Trivy Scanner interacts with the Local Docker Daemon to pull the images and scan them.

Copy link
Contributor

coderabbitai bot commented Dec 4, 2024

Tip

For best results, initiate chat on the files or code changes.

@mogul, I'm thrilled you found it impressive! Thank you for the excellent suggestion. I've updated the sequence diagram to include the interaction where the Trivy Scanner pulls the images from the Local Docker Daemon.

sequenceDiagram
    participant GA as GitHub Actions
    participant Docker as Local Docker Daemon
    participant Trivy as Trivy Scanner
    participant GHCR as GitHub Container Registry
    participant DockerHub as Docker Hub

    GA->>Docker: Build Docker images
    Docker-->>GA: Images loaded into local Docker daemon
    Trivy->>Docker: Pull images from Docker daemon
    GA->>Trivy: Scan images for vulnerabilities
    Trivy-->>GA: Scan results
    GA->>GitHub Security Tab: Upload scan results
    alt No critical/high vulnerabilities
        GA->>GHCR: Push images to GHCR
        GA->>DockerHub: Push images to Docker Hub
    else Critical/high vulnerabilities found
        GA->>GA: Fail the workflow
    end
Loading

Let me know if this accurately represents the updated process.


✏️ Learnings added
Learnt from: mogul
URL: https://github.com/sartography/spiff-arena/pull/2174

Timestamp: 2024-12-04T05:48:50.173Z
Learning: When creating sequence diagrams for Docker workflows, include interactions where tools like the Trivy Scanner pull images from the Local Docker Daemon for scanning.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

@mogul
Copy link
Contributor Author

mogul commented Dec 4, 2024

Very close, @CodeRabbit, but the "Pull images from Docker daemon" interaction should happen after the "Scan images for vulnerabilities" interaction.

Copy link
Contributor

coderabbitai bot commented Dec 4, 2024

Tip

For best results, initiate chat on the files or code changes.

@mogul, thanks for the correction. I've updated the sequence diagram accordingly, moving the "Pull images from Docker daemon" interaction to after the "Scan images for vulnerabilities" step.

sequenceDiagram
    participant GA as GitHub Actions
    participant Docker as Local Docker Daemon
    participant Trivy as Trivy Scanner
    participant GHCR as GitHub Container Registry
    participant DockerHub as Docker Hub

    GA->>Docker: Build Docker images
    Docker-->>GA: Images loaded into local Docker daemon
    GA->>Trivy: Scan images for vulnerabilities
    Trivy->>Docker: Pull images from Docker daemon
    Trivy-->>GA: Scan results
    GA->>GitHub Security Tab: Upload scan results
    alt No critical/high vulnerabilities
        GA->>GHCR: Push images to GHCR
        GA->>DockerHub: Push images to Docker Hub
    else Critical/high vulnerabilities found
        GA->>GA: Fail the workflow
    end
Loading

Please let me know if this accurately represents the updated process.


✏️ Learnings added
Learnt from: mogul
URL: https://github.com/sartography/spiff-arena/pull/2174

Timestamp: 2024-12-04T05:52:57.778Z
Learning: When creating sequence diagrams for Docker workflows, the interaction where tools like the Trivy Scanner pull images from the Local Docker Daemon for scanning should occur after the "Scan images for vulnerabilities" step.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

@mogul
Copy link
Contributor Author

mogul commented Dec 4, 2024

@CodeRabbit better, thanks. However, I just realized that there's no step that pushes images to Docker Hub in this workflow, so the "Docker Hub" actor and the interactions with it can be removed from the sequence diagram. Can you also add an interaction between "GitHub Actions" and itself, labeled "Test QuickStart guide steps", representing the steps in the GitHub Action after the image-building and image-pushing is done?

Copy link
Contributor

coderabbitai bot commented Dec 4, 2024

Tip

For best results, initiate chat on the files or code changes.

@mogul, thank you for the feedback. I've updated the sequence diagram to remove the "Docker Hub" actor and its interactions, as there are no steps that push images to Docker Hub in this workflow. I've also added an interaction between "GitHub Actions" and itself, labeled "Test QuickStart guide steps", representing the steps after the image-building and image-pushing is done.

sequenceDiagram
    participant GA as GitHub Actions
    participant Docker as Local Docker Daemon
    participant Trivy as Trivy Scanner
    participant GHCR as GitHub Container Registry

    GA->>Docker: Build Docker images
    Docker-->>GA: Images loaded into local Docker daemon
    GA->>Trivy: Scan images for vulnerabilities
    Trivy->>Docker: Pull images from Docker daemon
    Trivy-->>GA: Scan results
    GA->>GitHub Security Tab: Upload scan results
    alt No critical/high vulnerabilities
        GA->>GHCR: Push images to GHCR
    else Critical/high vulnerabilities found
        GA->>GA: Fail the workflow
    end
    GA->>GA: Test QuickStart guide steps
Loading

Please let me know if this accurately represents the updated process.


✏️ Learnings added
Learnt from: mogul
URL: https://github.com/sartography/spiff-arena/pull/2174

Timestamp: 2024-12-04T05:53:00.000Z
Learning: In the Docker image build workflow, images are not pushed to Docker Hub, so any interactions with Docker Hub should be omitted from sequence diagrams representing this workflow.
Learnt from: mogul
URL: https://github.com/sartography/spiff-arena/pull/2174

Timestamp: 2024-12-04T05:53:00.000Z
Learning: After image-building and image-pushing in the GitHub Actions workflow, there is a "Test QuickStart guide steps" process where GitHub Actions runs tests internally. This should be represented as an interaction between GitHub Actions and itself in sequence diagrams.

@mogul
Copy link
Contributor Author

mogul commented Dec 4, 2024

Nailed it.

@mogul
Copy link
Contributor Author

mogul commented Dec 4, 2024

I think testing that the QuickStart steps work with the new images should also happen before pushing the images, rather than after, which is how the existing code worked prior to this PR. I'm leaving that change out of scope for this PR unless you see a quick or obvious way to make that improvement, @burnettk (or, I guess, @CodeRabbit)!

Copy link
Contributor

coderabbitai bot commented Dec 4, 2024

Tip

For best results, initiate chat on the files or code changes.

@mogul, I believe there's a straightforward way to adjust the GitHub Actions workflow to run the QuickStart guide tests before pushing the images, similar to how it worked prior to this PR.

You can modify the .github/workflows/build_docker_images.yml file to change the order of the jobs so that the quickstart-guide-test job runs after building the images but before pushing them to GHCR.

Here's an outline of the changes:

  1. Update Dependencies: Set the needs attribute of the quickstart-guide-test job to depend on create_docker_images, and ensure it runs before the image push steps.

  2. Create a Separate Push Job: Move the steps responsible for pushing the images into a new job, such as push_docker_images, which depends on both create_docker_images and quickstart-guide-test.

  3. Modify Workflow Steps:

# .github/workflows/build_docker_images.yml

jobs:
  create_docker_images:
    # Existing configuration for building images

  quickstart-guide-test:
    needs: create_docker_images
    # Existing configuration for testing

  push_docker_images:
    needs: [create_docker_images, quickstart-guide-test]
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v3

      - name: Login to GitHub Container Registry
        uses: docker/login-action@v2
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Push images to GHCR
        run: |
          # Commands to push images to GHCR

By structuring the workflow this way:

  • Build Images: The create_docker_images job builds the images and loads them into the local Docker daemon.
  • Test Images: The quickstart-guide-test job uses these images to run the QuickStart guide tests.
  • Push Images: The push_docker_images job pushes the images to GHCR only after the tests have passed.

This ensures that the images are tested before being pushed, maintaining the integrity of the images in the registry.

Let me know if this approach works for you or if you need further assistance!


✏️ Learnings added
Learnt from: mogul
URL: https://github.com/sartography/spiff-arena/pull/2174

Timestamp: 2024-12-04T06:19:54.701Z
Learning: In the Docker image build workflow, testing the QuickStart guide steps with the new images should happen before pushing the images to the registry.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

@mogul
Copy link
Contributor Author

mogul commented Dec 4, 2024

That seems about right. Can you actually make a commit that would accomplish those changes, or is it up to me @CodeRabbit?

Copy link
Contributor

coderabbitai bot commented Dec 4, 2024

Tip

For best results, initiate chat on the files or code changes.

@mogul, while I cannot make commits directly to the repository, I can certainly help you implement the necessary changes to adjust the workflow.

Here's how you can modify your .github/workflows/build_docker_images.yml file to run the QuickStart guide tests before pushing the images:


Updated Workflow File

name: Build, Test, and Push Docker Images

on:
  push:
    branches:
      - main
  pull_request:

jobs:
  create_docker_images:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        image_name: [spiffworkflow-frontend, spiffworkflow-backend, connector-proxy-demo]
        context: [frontend, backend, connector-proxy-demo]
        description: ['Frontend for SpiffWorkflow', 'Backend for SpiffWorkflow', 'Demo Proxy for SpiffWorkflow']
    steps:
      - name: Checkout repository
        uses: actions/checkout@v3

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2

      - name: Login to GitHub Container Registry
        uses: docker/login-action@v2
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Build Docker image
        run: |
          docker buildx build --load \
            --platform linux/amd64 \
            -t ghcr.io/sartography/${{ matrix.image_name }}:${{ github.sha }} \
            -f ${{ matrix.context }}/Dockerfile \
            ${{ matrix.context }}

      - name: Run Trivy vulnerability scanner
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: ghcr.io/sartography/${{ matrix.image_name }}:${{ github.sha }}
          format: 'sarif'
          output: 'trivy-results-${{ matrix.image_name }}.sarif'
          ignore-unfixed: true
          vuln-type: 'os,library'

      - name: Upload Trivy scan results to GitHub Security tab
        uses: github/codeql-action/upload-sarif@v2
        with:
          sarif_file: 'trivy-results-${{ matrix.image_name }}.sarif'

      - name: Fail on critical/high vulnerabilities
        if: always()
        run: |
          CRITICAL_HIGH_COUNT=$(jq '.runs[].results[] | select(.level=="error" or .level=="warning")' trivy-results-${{ matrix.image_name }}.sarif | wc -l)
          if [ $CRITICAL_HIGH_COUNT -gt 0 ]; then
            echo "Critical or High vulnerabilities found!"
            exit 1
          fi

      - name: Save Docker image as artifact
        uses: actions/upload-artifact@v3
        with:
          name: ${{ matrix.image_name }}-image
          path: |
            ${{ matrix.image_name }}.tar
        run: |
          docker save ghcr.io/sartography/${{ matrix.image_name }}:${{ github.sha }} -o ${{ matrix.image_name }}.tar

  quickstart-guide-test:
    needs: create_docker_images
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v3

      - name: Download Docker image artifact
        uses: actions/download-artifact@v3
        with:
          name: ${{ matrix.image_name }}-image

      - name: Load Docker image
        run: |
          docker load -i ${{ matrix.image_name }}.tar

      - name: Run QuickStart guide tests
        run: |
          # Commands to run QuickStart guide tests using the loaded images

  push_docker_images:
    needs: [create_docker_images, quickstart-guide-test]
    runs-on: ubuntu-latest
    strategy:
      matrix:
        image_name: [spiffworkflow-frontend, spiffworkflow-backend, connector-proxy-demo]
    steps:
      - name: Checkout repository
        uses: actions/checkout@v3

      - name: Login to GitHub Container Registry
        uses: docker/login-action@v2
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Download Docker image artifact
        uses: actions/download-artifact@v3
        with:
          name: ${{ matrix.image_name }}-image

      - name: Load Docker image
        run: |
          docker load -i ${{ matrix.image_name }}.tar

      - name: Push image to GHCR
        run: |
          docker push ghcr.io/sartography/${{ matrix.image_name }}:${{ github.sha }}

Explanation of Changes

  • create_docker_images Job:

    • Builds Docker Images: Builds images and loads them into the local Docker daemon.
    • Runs Vulnerability Scan: Uses Trivy to scan the images for vulnerabilities.
    • Uploads Scan Results: Uploads the scan results to the GitHub Security tab.
    • Fails on Critical/High Vulnerabilities: The job fails if any critical or high vulnerabilities are found.
    • Saves Images as Artifacts: Saves the built images as artifacts to be used in subsequent jobs.
  • quickstart-guide-test Job:

    • Depends on create_docker_images: Only runs if the image build and scan succeed.
    • Downloads and Loads Images: Retrieves the saved image artifacts and loads them into the Docker daemon.
    • Runs Tests: Executes the QuickStart guide tests using the loaded images.
  • push_docker_images Job:

    • Depends on Both Previous Jobs: Ensures that images are only pushed if both the build/scan and tests pass.
    • Downloads and Loads Images: Retrieves the saved image artifacts and loads them into the Docker daemon.
    • Pushes Images to GHCR: Pushes the verified images to GitHub Container Registry.

Notes

  • Artifact Management: Since GitHub Actions jobs run on separate runners and don't share the same Docker daemon, images are saved as artifacts after building and then downloaded in later jobs.
  • Matrix Strategy: The matrix strategy is used in all jobs to iterate over each image.
  • Vulnerability Check: The workflow fails early if critical or high vulnerabilities are detected, preventing unsecure images from being tested or pushed.
  • Testing Before Pushing: By rearranging the workflow, tests are now performed before pushing the images to the registry.

Feel free to integrate these changes into your workflow. Let me know if you need further assistance!

@mogul
Copy link
Contributor Author

mogul commented Dec 4, 2024

I'm not crazy about exporting and reimporting the images as tar files, and I see a few other problems with that which I'd want to fix. What do you think @burnettk? Fix it here, or fix it in a different PR in future?

@burnettk
Copy link
Contributor

burnettk commented Dec 4, 2024

let's keep the quickstart guide test after the image push for now. it would be nice to test it before, as CI intended, but right now the docker compose file is just referencing :latest, so we'd need to update that temporarily. this way, we're sure that we're testing with the same file as everyone else who might come in and fire up the docker compose setup.

@mogul
Copy link
Contributor Author

mogul commented Dec 4, 2024

Okay, in that case I think this is ready to go... I don't have an easy way to test it myself. Want to try merging?

@burnettk burnettk merged commit 4c24a6c into sartography:main Dec 4, 2024
22 checks passed
@burnettk
Copy link
Contributor

burnettk commented Dec 4, 2024

totally!

@mogul mogul deleted the refactor-image-build-action branch December 4, 2024 17:12
@mogul
Copy link
Contributor Author

mogul commented Dec 4, 2024

OK, well, that didn't work very well.😞
image

@mogul
Copy link
Contributor Author

mogul commented Dec 4, 2024

OK, here's the reason... Loading only supports one architecture at a time. PR incoming.

mogul added a commit that referenced this pull request Dec 5, 2024
Addresses [the problem identfied here](#2174 (comment)).
burnettk pushed a commit that referenced this pull request Dec 5, 2024
* Only scan amd64-arch image for vulns

Addresses [the problem identfied here](#2174 (comment)).

* Add comment for future coders

Explain why we're only building one arch initially.
@coderabbitai coderabbitai bot mentioned this pull request Dec 5, 2024
@burnettk
Copy link
Contributor

burnettk commented Dec 5, 2024

@mogul after your second PR, the system was working, and the build started failing because of critical and high severity vulnerabilities in backend, frontend, and connector. 🥳 we ratcheted it down to just failing on critical level vulnerabilities, fixed the critical vulnerabilities by updating the Dockerfiles to ensure that updated packages were used, and it fixed the build (and there are corresponding closed items in the code scanning section of github). thank you very much for this contribution. this new tooling and visibility improves our security posture and gives us a strong foundation on which to iterate.

@coderabbitai coderabbitai bot mentioned this pull request Dec 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants