Mark destructure gradient test as broken #1797
Temporary measure to unblock CI while we investigate the root cause. I think `@test_broken` is better than increasing the tolerance, as the latter makes it easier to forget about this one entirely.
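For context, a minimal sketch of what `@test_broken` does, using Julia's `Test` standard library. The arrays and the tolerance are hypothetical stand-ins, not the actual values from the Flux test suite:

```julia
using Test

# Hypothetical stand-ins for a reference gradient and a slightly inaccurate computed one.
expected = [1.0, 2.0, 3.0]
computed = expected .+ 1e-4

@testset "strict gradient check (illustrative)" begin
    # The strict comparison currently evaluates to false, so it is recorded as
    # Broken instead of Fail, and the test set as a whole still passes.
    @test_broken isapprox(computed, expected; rtol = 1e-6)
end
```

Unlike a loosened tolerance, the `Broken` entry stays visible in the test summary, and if the comparison starts passing again `Test` reports an unexpected pass, so the marker cannot be silently forgotten.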
Once the build has completed, you can preview any updated documentation at this URL: https://fluxml.ai/Flux.jl/previews/PR1797/ in ~20 minutes
Happy to merge this for now. Can we let go of the PR comment bot? It seems extraneous to most PRs, and the links are always generated, so we aren't missing out.
@logankilpatrick is there a way to have the bot only trigger on markdown file, docstring changes and/or changes in
bors r+
1797: Mark destructure gradient test as broken r=DhairyaLGandhi a=ToucheSir Temporary measure to unblock CI while we investigate the root cause. I think ``@test_broken`` is better than increasing the tolerance, as the latter makes it easier to forget about this one entirely. Co-authored-by: Brian Chen <[email protected]>
Build failed:
Do we have a flaky test here? https://buildkite.com/julialang/flux-dot-jl/builds/1893#efc52f50-bab7-4968-a07c-e223aa23f13f/402-581 failed on the bors run but not on the commit itself.
bors r+
1797: Mark destructure gradient test as broken r=ToucheSir a=ToucheSir Temporary measure to unblock CI while we investigate the root cause. I think ``@test_broken`` is better than increasing the tolerance, as the latter makes it easier to forget about this one entirely. Co-authored-by: Brian Chen <[email protected]>
Build failed:
https://github.com/FluxML/Flux.jl/blob/master/test/cuda/layers.jl#L48 behaves differently on the PR branch vs
That test was added recently and overall seems to be reasonable. (Grouped)Conv layer tests weren't flaky earlier, although we need some cleanups both here and in nnlib. That said, we shouldn't have a special case in there if we can avoid it.
Wondering if you could do a run with Julia 1.6 as a sanity check.
I don't have a machine with 1.6 and a free GPU right now, but I can try over the weekend if someone doesn't get to this first.
I was thinking of upper-bounding GHA CI to 1.6 and doing a try with bors.
Do we have GPU instances on GHA? AFAICT the non-CUDA grad tests have been rock stable.
Buildkite* sorry my bad.
@ToucheSir it's ok to mark it as broken, but we should also add a test with a higher tolerance (if it passes) so that at least we are aware of further regressions.
Let's merge this as quickly as possible, without bors if needed; having broken CI is pretty bad.
The issue is not about a passing test with higher tolerance, since the code path should not be inducing any more error. I think it's a good idea to have a placeholder which can detect any other breakage.
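One way the placeholder idea could look, sketched with Julia's `Test` standard library. The values and both tolerances are assumptions chosen for illustration, not what the Flux test suite uses:

```julia
using Test

# Hypothetical tolerances: the strict one is what the test should eventually meet,
# the loose one only guards against further regressions.
strict_rtol = 1e-6
loose_rtol = 1e-3

# Hypothetical stand-ins for the reference and computed gradients.
expected = [1.0, 2.0, 3.0]
computed = expected .+ 1e-4

@testset "gradient check with placeholder (illustrative)" begin
    # Known failure: recorded as Broken so it stays visible in the summary.
    @test_broken isapprox(computed, expected; rtol = strict_rtol)
    # Placeholder: fails if the error grows beyond the loose tolerance.
    @test isapprox(computed, expected; rtol = loose_rtol)
end
```

If the strict comparison starts passing again, `@test_broken` reports an unexpected pass, which is the cue to switch it back to `@test` and drop the placeholder.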
bors try
try
Build succeeded:
bors r+
1797: Mark destructure gradient test as broken r=DhairyaLGandhi a=ToucheSir Temporary measure to unblock CI while we investigate the root cause. I think ``@test_broken`` is better than increasing the tolerance, as the latter makes it easier to forget about this one entirely. Co-authored-by: Brian Chen <[email protected]>
bors r+
1797: Mark destructure gradient test as broken r=DhairyaLGandhi a=ToucheSir Temporary measure to unblock CI while we investigate the root cause. I think ``@test_broken`` is better than increasing the tolerance, as the latter makes it easier to forget about this one entirely. Co-authored-by: Brian Chen <[email protected]>
Build failed:
bors r+
Build succeeded:
@DhairyaLGandhi thanks! Do you know what (if anything) changed on CI? Edit: this is strange, master is still not happy :(