fix: Scope Issue with the 'entry' variable when looking up remote images and tests additions #9211

3droj7 · 2023-12-06T08:59:19Z

Related: #9177

Description
The variable "entry" was not overridden before triggering the lookupRemote method.
The reason is because the entry variable was redeclared within an if block, resulting in a new, locally scoped variable. This redeclaration prevented the updates made to entry within this block from being reflected outside of it.

The bug was introduced in my last PR (9181)... I had no idea that in go, the "if" statement can define a whole new scope

I've also added some tests, so it would test the scenario where we have tags in the remote which shouldn't be built and also fixed another test where it uses missingPull functionality on the remote

…ty and add more tests to test the process of looking up remotely without building the images

codecov · 2023-12-06T09:07:37Z

Codecov Report

Attention: 295 lines in your changes are missing coverage. Please review.

Comparison is base (290280e) 70.48% compared to head (8bf8162) 63.64%.
Report is 1084 commits behind head on main.

Files	Patch %	Lines
cmd/skaffold/app/cmd/exec.go	16.32%	40 Missing and 1 partial ⚠️
cmd/skaffold/app/cmd/filter.go	47.27%	22 Missing and 7 partials ⚠️
cmd/skaffold/app/cmd/lsp.go	28.12%	23 Missing ⚠️
cmd/skaffold/app/cmd/verify.go	23.33%	23 Missing ⚠️
cmd/skaffold/app/cmd/fix.go	51.16%	17 Missing and 4 partials ⚠️
cmd/skaffold/app/cmd/inspect_job_manifest_paths.go	60.00%	15 Missing and 1 partial ⚠️
cmd/skaffold/app/cmd/inspect_namespaces.go	50.00%	13 Missing and 1 partial ⚠️
...md/skaffold/app/cmd/inspect_config_dependencies.go	45.83%	12 Missing and 1 partial ⚠️
cmd/skaffold/app/cmd/lint.go	42.85%	12 Missing ⚠️
cmd/skaffold/app/cmd/inspect_build_env.go	60.71%	11 Missing ⚠️
... and 21 more

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #9211      +/-   ##
==========================================
- Coverage   70.48%   63.64%   -6.85%     
==========================================
  Files         515      632     +117     
  Lines       23150    32552    +9402     
==========================================
+ Hits        16317    20717    +4400     
- Misses       5776    10240    +4464     
- Partials     1057     1595     +538

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

ericzzzzzzz · 2023-12-06T17:07:16Z

hi, @3droj7
there is a test failing on this pr, could you take a look?

2/integration/TestBuildDependenciesCache
    helper.go:260: Running [skaffold build --default-repo us-central1-docker.pkg.dev/k8s-skaffold/testing --cache-artifacts=true -p concurrency-0] in testdata/build-dependencies
    helper.go:260: Ran [skaffold build --default-repo us-central1-docker.pkg.dev/k8s-skaffold/testing --cache-artifacts=true -p concurrency-0] in 3.238 seconds
time="2023-12-06T15:12:02Z" level=info msg="DOCKER_HOST env is not set, using the host from docker context." subtask=-1 task=DevLoop
    build_dependencies_test.go:172: expected image 'us-central1-docker.pkg.dev/k8s-skaffold/testing/image1:latest' not present
    build_dependencies_test.go:173: expected image 'us-central1-docker.pkg.dev/k8s-skaffold/testing/image2:latest' not present
    build_dependencies_test.go:174: expected image 'us-central1-docker.pkg.dev/k8s-skaffold/testing/image3:latest' not present
    build_dependencies_test.go:175: expected image 'us-central1-docker.pkg.dev/k8s-skaffold/testing/image4:latest' not present

3droj7 · 2023-12-07T10:19:08Z

@ericzzzzzzz
I'm getting the same output even before my commits (in d49be7a)

=== RUN   TestBuildDependenciesCache
    helper.go:260: Running [skaffold build --default-repo us-central1-docker.pkg.dev/k8s-skaffold/testing --cache-artifacts=true -p concurrency-0] in testdata/build-dependencies
    helper.go:260: Ran [skaffold build --default-repo us-central1-docker.pkg.dev/k8s-skaffold/testing --cache-artifacts=true -p concurrency-0] in 10.125 seconds
time="2023-12-07T12:08:51+02:00" level=info msg="DOCKER_HOST env is not set, using the host from docker context." subtask=-1 task=DevLoop
    build_dependencies_test.go:108: expected image 'us-central1-docker.pkg.dev/k8s-skaffold/testing/image1:latest' not present
    build_dependencies_test.go:109: expected image 'us-central1-docker.pkg.dev/k8s-skaffold/testing/image2:latest' not present
    build_dependencies_test.go:110: expected image 'us-central1-docker.pkg.dev/k8s-skaffold/testing/image3:latest' not present
    build_dependencies_test.go:111: expected image 'us-central1-docker.pkg.dev/k8s-skaffold/testing/image4:latest' not present
=== RUN   TestBuildDependenciesCache/no_change
    build_dependencies_test.go:75: skipping integration test
=== RUN   TestBuildDependenciesCache/change_1
    build_dependencies_test.go:75: skipping integration test
=== RUN   TestBuildDependenciesCache/change_2
    build_dependencies_test.go:75: skipping integration test
=== RUN   TestBuildDependenciesCache/change_3
    build_dependencies_test.go:75: skipping integration test
=== RUN   TestBuildDependenciesCache/change_4
    build_dependencies_test.go:75: skipping integration test
=== RUN   TestBuildDependenciesCache/change_all
    build_dependencies_test.go:75: skipping integration test
--- FAIL: TestBuildDependenciesCache (10.25s)
    --- SKIP: TestBuildDependenciesCache/no_change (0.00s)
    --- SKIP: TestBuildDependenciesCache/change_1 (0.00s)
    --- SKIP: TestBuildDependenciesCache/change_2 (0.00s)
    --- SKIP: TestBuildDependenciesCache/change_3 (0.00s)
    --- SKIP: TestBuildDependenciesCache/change_4 (0.00s)
    --- SKIP: TestBuildDependenciesCache/change_all (0.00s)
FAIL
FAIL    github.com/GoogleContainerTools/skaffold/v2/integration 10.683s
FAIL

What am I missing here? Are there some different logs from this PR that you can post here?

… digest in the remote (but it shouldn't use the cache if the tagged image exists remotely)

3droj7 · 2023-12-08T09:44:22Z

@ericzzzzzzz It seems that the PR integration tests are passing after my last commit but the "kokoro ci build" still fails and I'm not allowed to see the logs.
Can you post the logs here if there's something critical in them?

ericzzzzzzz · 2023-12-08T16:51:21Z

Hi @3droj7 the log is the same as I pasted before, I haven't got a chance to look into the detail,

but it seems that you need to have a remote registry and your active k8s context should be a remote cluster.

3droj7 · 2023-12-10T21:58:28Z

@ericzzzzzzz This is really weird because the same tests are passing in the PR integration tests -

=== RUN   TestBuildDependenciesCache
    helper.go:260: Running [skaffold build --default-repo us-central1-docker.pkg.dev/k8s-skaffold/testing --cache-artifacts=true -p concurrency-0] in testdata/build-dependencies
    helper.go:260: Ran [skaffold build --default-repo us-central1-docker.pkg.dev/k8s-skaffold/testing --cache-artifacts=true -p concurrency-0] in 29.859 seconds
=== RUN   TestBuildDependenciesCache/no_change
    util.go:92: Assinged test TestBuildDependenciesCache/no_change to partition: 3
    build_dependencies_test.go:144: Running [skaffold build --default-repo us-central1-docker.pkg.dev/k8s-skaffold/testing --cache-artifacts=true -p concurrency-0] in testdata/build-dependencies
    build_dependencies_test.go:144: Ran [skaffold build --default-repo us-central1-docker.pkg.dev/k8s-skaffold/testing --cache-artifacts=true -p concurrency-0] in 5[83](https://github.com/GoogleContainerTools/skaffold/actions/runs/7131236413/job/19419365399?pr=9211#step:17:84).92145ms
=== RUN   TestBuildDependenciesCache/change_1
    util.go:92: Assinged test TestBuildDependenciesCache/change_1 to partition: 1
    build_dependencies_test.go:139: skipping non-GCP integration test that doesn't match partition 3
=== RUN   TestBuildDependenciesCache/change_2
    util.go:92: Assinged test TestBuildDependenciesCache/change_2 to partition: 0
    build_dependencies_test.go:139: skipping non-GCP integration test that doesn't match partition 3
=== RUN   TestBuildDependenciesCache/change_3
    util.go:92: Assinged test TestBuildDependenciesCache/change_3 to partition: 3
    build_dependencies_test.go:144: Running [skaffold build --default-repo us-central1-docker.pkg.dev/k8s-skaffold/testing --cache-artifacts=true -p concurrency-0] in testdata/build-dependencies
    build_dependencies_test.go:144: Ran [skaffold build --default-repo us-central1-docker.pkg.dev/k8s-skaffold/testing --cache-artifacts=true -p concurrency-0] in 31.173 seconds
=== RUN   TestBuildDependenciesCache/change_4
    util.go:[92](https://github.com/GoogleContainerTools/skaffold/actions/runs/7131236413/job/19419365399?pr=9211#step:17:93): Assinged test TestBuildDependenciesCache/change_4 to partition: 2
    build_dependencies_test.go:139: skipping non-GCP integration test that doesn't match partition 3
=== RUN   TestBuildDependenciesCache/change_all
    util.go:92: Assinged test TestBuildDependenciesCache/change_all to partition: 1
    build_dependencies_test.go:139: skipping non-GCP integration test that doesn't match partition 3
--- PASS: TestBuildDependenciesCache (61.64s)
    --- PASS: TestBuildDependenciesCache/no_change (0.59s)
    --- SKIP: TestBuildDependenciesCache/change_1 (0.00s)
    --- SKIP: TestBuildDependenciesCache/change_2 (0.00s)
    --- PASS: TestBuildDependenciesCache/change_3 (31.18s)

Besides that, we're currently using the compiled version of Skaffold with this fix in our environments specifically because we have a remote registry and a remote cluster, and so far we didn't have any issue.
But still it would be very helpful if you could send me the logs.

What do you think is the difference between the integration tests environment and the kokoro CI build environment?

ericzzzzzzz · 2023-12-11T20:17:24Z

Hi @3droj7 I've already posted the whole log

2/integration/TestBuildDependenciesCache
    helper.go:260: Running [skaffold build --default-repo us-central1-docker.pkg.dev/k8s-skaffold/testing --cache-artifacts=true -p concurrency-0] in testdata/build-dependencies
    helper.go:260: Ran [skaffold build --default-repo us-central1-docker.pkg.dev/k8s-skaffold/testing --cache-artifacts=true -p concurrency-0] in 3.238 seconds
time="2023-12-06T15:12:02Z" level=info msg="DOCKER_HOST env is not set, using the host from docker context." subtask=-1 task=DevLoop
    build_dependencies_test.go:172: expected image 'us-central1-docker.pkg.dev/k8s-skaffold/testing/image1:latest' not present
    build_dependencies_test.go:173: expected image 'us-central1-docker.pkg.dev/k8s-skaffold/testing/image2:latest' not present
    build_dependencies_test.go:174: expected image 'us-central1-docker.pkg.dev/k8s-skaffold/testing/image3:latest' not present
    build_dependencies_test.go:175: expected image 'us-central1-docker.pkg.dev/k8s-skaffold/testing/image4:latest' not present

The internal platform uses a remote registry and a remote cluster, while github action uses minikube for testing, note that remote registry and cluster are long running service and are shared when multiple CI instances are running tests, github action CI testing environments are isolated, but the test failure might not be caused by resource conflicts, normally we see test failures caused by race condition are failed on some runs, on some of the testing machines, but this test is failing on every machine.

3droj7 · 2023-12-17T15:54:37Z

Hey @ericzzzzzzz I've now had some time to review the issue, and it seems that the checkImageExists function verifies the local existence of an image. However, since the images already exist in the remote registry, it doesn't trigger a rebuild. This was the fix I implemented for #9177 .

I believe the solution could be to bypass the initial checkImageExists check (in TestBuildDependenciesCache). This step is important because the images might exist remotely. Additionally, altering the tagging policy would ensure that tags change in response to any modifications in dependencies within the tests.

Alternatively, we could change the dependency beforehand. This way, the tags would differ from those in the remote registry, ensuring that the images are always built locally.

What do you think?

ericzzzzzzz · 2023-12-27T17:09:16Z

hi @3droj7 , Thank you for the investigation! Yeah, let's bypass the test against a remote cluster, you can achieve this by moving

skaffold/integration/build_dependencies_test.go

Line 139 in d4369f3

MarkIntegrationTest(t, CanRunWithoutGcp)

to the beginning of the test funciton.

yudppp · 2024-01-10T11:10:04Z

pkg/skaffold/build/cache/lookup.go

+		c.cacheMutex.Lock()
+		c.artifactCache[hash] = entry
+		c.cacheMutex.Unlock()
+		return found{hash: hash}


Hi @ericzzzzzzz @3droj7,

I have a concern about the lookupRemote function. It appears that the function may always return found when the same tag is reused, even if the digests are different.

To fix this issue, the lookupRemote function should check the digest of the image in the cache. If the digests are different, the function should return needsRemoteTagging.

Is this correct?

Thanks.

I submitted this issue.
#9248

…mote images and tests additions (GoogleContainerTools#9211)" This reverts commit ffe769c.

* Revert "fix(lookupRemote): fixed lookup.go lookupRemote to compare remote and cached digests (#9278)" This reverts commit 9ff4546. * Revert "fix: Scope Issue with the 'entry' variable when looking up remote images and tests additions (#9211)" This reverts commit ffe769c. * Revert "fix: puling images when working with a remote repository (#9177) (#9181)" This reverts commit 9376dc0.

fix: new retrieved entry is not used in the lookup remote functionali…

b7e4b9d

…ty and add more tests to test the process of looking up remotely without building the images

pull-request-size bot added the size/M label Dec 6, 2023

ericzzzzzzz added the kokoro:run runs the kokoro jobs on a PR label Dec 6, 2023

kokoro-team removed the kokoro:run runs the kokoro jobs on a PR label Dec 6, 2023

ericzzzzzzz added the kokoro:force-run forces a kokoro re-run on a PR label Dec 7, 2023

kokoro-team removed the kokoro:force-run forces a kokoro re-run on a PR label Dec 7, 2023

fix: In case there's an entry in the cache it might not find the same…

df53a31

… digest in the remote (but it shouldn't use the cache if the tagged image exists remotely)

pull-request-size bot added size/L and removed size/M labels Dec 7, 2023

ericzzzzzzz added the kokoro:run runs the kokoro jobs on a PR label Dec 7, 2023

kokoro-team removed the kokoro:run runs the kokoro jobs on a PR label Dec 7, 2023

ericzzzzzzz added the kokoro:force-run forces a kokoro re-run on a PR label Dec 11, 2023

kokoro-team removed the kokoro:force-run forces a kokoro re-run on a PR label Dec 11, 2023

fix: build dependency test will be bypassed against a remote cluster

8bf8162

ericzzzzzzz added the kokoro:run runs the kokoro jobs on a PR label Jan 2, 2024

kokoro-team removed the kokoro:run runs the kokoro jobs on a PR label Jan 2, 2024

ericzzzzzzz added the kokoro:force-run forces a kokoro re-run on a PR label Jan 2, 2024

kokoro-team removed the kokoro:force-run forces a kokoro re-run on a PR label Jan 2, 2024

ericzzzzzzz added the kokoro:force-run forces a kokoro re-run on a PR label Jan 2, 2024

kokoro-team removed the kokoro:force-run forces a kokoro re-run on a PR label Jan 2, 2024

ericzzzzzzz approved these changes Jan 2, 2024

View reviewed changes

ericzzzzzzz merged commit ffe769c into GoogleContainerTools:main Jan 2, 2024
17 checks passed

yudppp reviewed Jan 10, 2024

View reviewed changes

ericzzzzzzz mentioned this pull request Feb 15, 2024

fix(lookupRemote): fixed lookup.go lookupRemote caching mechanism #9306

Closed

ericzzzzzzz added a commit to ericzzzzzzz/skaffold that referenced this pull request Feb 15, 2024

Revert "fix: Scope Issue with the 'entry' variable when looking up re…

cf426e0

…mote images and tests additions (GoogleContainerTools#9211)" This reverts commit ffe769c.

ericzzzzzzz mentioned this pull request Feb 15, 2024

fix: revert cache lookup changes #9313

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: Scope Issue with the 'entry' variable when looking up remote images and tests additions #9211

fix: Scope Issue with the 'entry' variable when looking up remote images and tests additions #9211

3droj7 commented Dec 6, 2023

codecov bot commented Dec 6, 2023 •

edited

Loading

ericzzzzzzz commented Dec 6, 2023

3droj7 commented Dec 7, 2023

3droj7 commented Dec 8, 2023

ericzzzzzzz commented Dec 8, 2023

3droj7 commented Dec 10, 2023 •

edited

Loading

ericzzzzzzz commented Dec 11, 2023

3droj7 commented Dec 17, 2023

ericzzzzzzz commented Dec 27, 2023

yudppp Jan 10, 2024

yudppp Jan 10, 2024

fix: Scope Issue with the 'entry' variable when looking up remote images and tests additions #9211

fix: Scope Issue with the 'entry' variable when looking up remote images and tests additions #9211

Conversation

3droj7 commented Dec 6, 2023

codecov bot commented Dec 6, 2023 • edited Loading

Codecov Report

ericzzzzzzz commented Dec 6, 2023

3droj7 commented Dec 7, 2023

3droj7 commented Dec 8, 2023

ericzzzzzzz commented Dec 8, 2023

3droj7 commented Dec 10, 2023 • edited Loading

ericzzzzzzz commented Dec 11, 2023

3droj7 commented Dec 17, 2023

ericzzzzzzz commented Dec 27, 2023

yudppp Jan 10, 2024

Choose a reason for hiding this comment

yudppp Jan 10, 2024

Choose a reason for hiding this comment

codecov bot commented Dec 6, 2023 •

edited

Loading

3droj7 commented Dec 10, 2023 •

edited

Loading