Evaluation error is blocking jobsets #83647

Closed
FRidh opened this issue Mar 29, 2020 · 16 comments
Labels
0.kind: bug Something is broken 1.severity: blocker This is preventing another PR or issue from being completed 1.severity: channel blocker Blocks a channel

@FRidh
Member

FRidh commented Mar 29, 2020

Describe the bug

hydra-eval-jobs returned exit code 1:
warning: SQLite database '/nix/var/nix/db/db.sqlite' is busy
warning: SQLite database '/nix/var/nix/db/db.sqlite' is busy
warning: SQLite database '/nix/var/nix/db/db.sqlite' is busy
warning: SQLite database '/nix/var/nix/db/db.sqlite' is busy
warning: SQLite database '/nix/var/nix/db/db.sqlite' is busy
trace: `mkStrict' is obsolete; use `mkOverride 0' instead.
trace: `lib.nixpkgsVersion` is deprecated, use `lib.version` instead!
trace: Warning: `showVal` is deprecated and will be removed in the next release, please use `traceSeqN`
trace: lib.zip is deprecated, use lib.zipAttrsWith instead
warning: SQLite database '/nix/var/nix/db/db.sqlite' is busy
warning: SQLite database '/nix/var/nix/db/db.sqlite' is busy
skipping job with illegal name 'http_parser.rb'
skipping job with illegal name 'http_parser.rb'
skipping job with illegal name 'http_parser.rb'
skipping job with illegal name 'http_parser.rb'
skipping job with illegal name 'sourcemap.vim'
trace: `mkStrict' is obsolete; use `mkOverride 0' instead.
trace: `lib.nixpkgsVersion` is deprecated, use `lib.version` instead!
trace: Warning: `showVal` is deprecated and will be removed in the next release, please use `traceSeqN`
trace: lib.zip is deprecated, use lib.zipAttrsWith instead
warning: SQLite database '/nix/var/nix/db/db.sqlite' is busy
warning: SQLite database '/nix/var/nix/db/db.sqlite' is busy
warning: SQLite database '/nix/var/nix/db/db.sqlite' is busy
warning: SQLite database '/nix/var/nix/db/db.sqlite' is busy
warning: SQLite database '/nix/var/nix/db/db.sqlite' is busy
warning: SQLite database '/nix/var/nix/db/db.sqlite' is busy
warning: SQLite database '/nix/var/nix/db/db.sqlite' is busy
error: unexpected EOF reading a line

The actual issue is masked by a Hydra bug, NixOS/hydra#728.

@FRidh FRidh added 0.kind: bug Something is broken 1.severity: blocker This is preventing another PR or issue from being completed labels Mar 29, 2020
@knedlsepp
Member

@vcunat vcunat added the 1.severity: channel blocker Blocks a channel label Mar 29, 2020
@vcunat vcunat changed the title Evaluation error is blocking staging-next Evaluation error is blocking jobsets Mar 29, 2020
@vcunat
Member

vcunat commented Mar 29, 2020

I'm unable to reproduce any error locally (without Hydra). As an experiment on the unstable-small jobset, I took the last successful evaluation, which is on 05f0934, and forced a retry with the same commit. It failed with:

hydra-eval-jobs returned exit code 1:
error: unexpected EOF reading a line

On the other hand, 19.09 seems unaffected 🤷‍♂️ This is just hard to debug when the real errors get hidden.

@worldofpeace worldofpeace pinned this issue Mar 29, 2020
@worldofpeace worldofpeace added this to the 20.03 milestone Mar 29, 2020
@samueldr
Member

Likely fix for the Hydra issue:

Once fixed, evals should show errors again.

@samueldr
Member

samueldr commented Apr 1, 2020

It may be a nix-side regression!

While my nix-daemon was /nix/store/nk4px4bjp0kiss27n5dyrwsj9xgflwhp-nix-2.3, it successfully eval'd nixos/release-small.nix.

After switching to the revision of Nix shown in the Hydra footer (3e7aab8), the issue is cleanly reproducible.

I'll add that it also reproduces quickly! 1m10.799s, rather than more than an hour!

@samueldr
Member

samueldr commented Apr 1, 2020

In Nix:

There are only 'skip'ped commits left to test.
The first bad commit could be any of:
401b5bc5418f3eb6d57da9d9e66df055f8bce122
75db069f927ffaf38ac6ef2d8143926b724ca935
d700eecea9a274c1b45549141f40180ac74454ce
22a754c091f765061f59bef5ce091268493bb138
887030f211dcd062a73021b1cc289992502b35e4
d37dc71e3cf077fa5d24a9bf8395deae21cc4410
We cannot bisect more!

All had to be skipped as the tests won't pass.
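
When a range contains commits that cannot be tested, `git bisect run` can do the skipping automatically: it treats a script exit code of 125 as "skip this commit", 0 as good, and other codes from 1 to 127 as bad. The following is only a sketch of such a helper, not what was used here. The build and evaluation commands are hypothetical placeholders passed in as arguments.

```python
import subprocess
import sys

# `git bisect run` treats exit code 125 as "this commit cannot be tested, skip it".
SKIP = 125


def bisect_exit_code(build_ok, eval_ok):
    """Map the outcome of one bisect step to an exit code for `git bisect run`."""
    if not build_ok:
        return SKIP          # build or tests broken: this commit tells us nothing
    return 0 if eval_ok else 1  # 0 = good commit, 1 = bad commit


def run(cmd):
    """Run a shell command and report whether it exited successfully."""
    return subprocess.run(cmd, shell=True).returncode == 0


def main(build_cmd, eval_cmd):
    # Both commands are assumptions supplied by the caller, not real Nix tooling.
    build_ok = run(build_cmd)
    eval_ok = build_ok and run(eval_cmd)
    return bisect_exit_code(build_ok, eval_ok)


if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv[1], sys.argv[2]))
```

Saved under a hypothetical name such as `bisect_check.py`, it would be invoked as `git bisect run python3 bisect_check.py '<build command>' '<eval command>'`. If every remaining commit ends up skipped, git still stops with the same "We cannot bisect more!" message shown above.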


EDIT: It looks like NixOS/nix@75db069 may be the regression.

@edolstra
Member

edolstra commented Apr 1, 2020

Yes, it's a known issue: NixOS/nix#3462

@samueldr
Member

samueldr commented Apr 1, 2020

See also: #83863

@flokli
Contributor

flokli commented Apr 3, 2020

Just to make sure, is NixOS/nix#3462 currently still blocking 20.03 evaluations? Can we move hydra.nixos.org to a version before that?

I think it is critical to have some Hydra evaluations done before the 20.03 release.

cc @disassembler @worldofpeace

@jtojnar
Member

jtojnar commented Apr 3, 2020

We reverted #73966, so the particular instance that triggered NixOS/nix#3462 is no longer happening and the evals are running: https://hydra.nixos.org/job/nixos/release-20.03/tested#tabs-status

@samueldr
Member

samueldr commented Apr 3, 2020

This was fixed by @jtojnar. Closing as per the short discussion in the go-no-go meeting.

@samueldr samueldr closed this as completed Apr 3, 2020
@flokli flokli unpinned this issue Apr 3, 2020
@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/go-no-go-meeting-nixos-20-03-markhor/6495/16

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/nixos-unstable-hasnt-been-updated-in-a-while/6584/8

@andir
Member

andir commented Oct 24, 2020

It appears that this issue has returned on the unstable channel. All that is being logged is a JSON error (something is null).

@vcunat
Member

vcunat commented Oct 24, 2020

It will surely come from a different place, but we can recycle this issue.

@vcunat vcunat reopened this Oct 24, 2020
@samueldr
Member

If this is like #99236, I have doubts that this is related.

The following error, from #99236, didn't result in an error message being eaten by the Hydra evaluator:

error: [json.exception.type_error.302] type must be string, but is null

While in the past the following would:

error: unexpected EOF reading a line

It would be better to follow up in a new issue instead (linking to suspected previous issues), imo.


What's that with error messages being eaten by the hydra evaluator?

And a new instance found (unrelated to the current issue)

@andir
Member

andir commented Oct 26, 2020

The eval issue is gone. It turned out to be due to one of the changes I made to the test infrastructure. This has been fixed in #101645.

As a result of the joint work on this, @samueldr proposed NixOS/hydra#825, which would (hopefully) allow us to spot the source of these errors much more quickly. The eval output on Hydra is usually very noisy, so any pointer towards where the actual error is coming from is helpful.
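
To illustrate how noisy that output is, here is a rough sketch of a filter that separates `error:` lines from the repeated warnings and traces in an eval log. It is not what NixOS/hydra#825 does. The function name is made up for this example, and the sample lines are copied from the log at the top of this issue.

```python
from collections import Counter


def summarize_eval_log(text):
    """Split a log into its error lines and a count of every other distinct line."""
    errors = []
    noise = Counter()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("error:"):
            errors.append(line)
        else:
            noise[line] += 1
    return errors, noise


# A few lines taken from the eval output quoted in the issue description.
SAMPLE = """\
warning: SQLite database '/nix/var/nix/db/db.sqlite' is busy
warning: SQLite database '/nix/var/nix/db/db.sqlite' is busy
skipping job with illegal name 'http_parser.rb'
trace: lib.zip is deprecated, use lib.zipAttrsWith instead
warning: SQLite database '/nix/var/nix/db/db.sqlite' is busy
error: unexpected EOF reading a line
"""

errors, noise = summarize_eval_log(SAMPLE)
for line in errors:
    print(line)
for line, count in noise.most_common():
    print(f"{count:>3}x {line}")
```

This only makes the existing output easier to scan. It cannot recover an error message that the evaluator never printed, which was the problem with `unexpected EOF reading a line`.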
