Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fuzzer engine, spurious failures #102

Closed
holiman opened this issue Oct 18, 2018 · 6 comments
Closed

Fuzzer engine, spurious failures #102

holiman opened this issue Oct 18, 2018 · 6 comments

Comments

@holiman
Copy link
Collaborator

holiman commented Oct 18, 2018

Sometimes, this happens:

2018-10-18 09:33:58,001 - INFO - Starting processes for ['geth', 'parity'] on test DEFAULT-Wed_09_22_38-11214-573185
2018-10-18 09:33:58,179 - INFO - All procs finished for test DEFAULT-Wed_09_22_38-11214-573183
2018-10-18 09:33:58,232 - INFO - Processed 3228 steps for geth on test DEFAULT-Wed_09_22_38-11214-573183, pTime:53.07 ms
Traceback (most recent call last):
  File "utilities/fuzzerweb.py", line 59, in <module>
    main()
  File "utilities/fuzzerweb.py", line 55, in main
    f.startFuzzing()
  File "/datadrive/evmlab/utilities/fuzzer.py", line 593, in startFuzzing
    self.postprocess_test(test)
  File "/datadrive/evmlab/utilities/fuzzer.py", line 524, in postprocess_test
    data = end_processes(test)
  File "/datadrive/evmlab/utilities/fuzzer.py", line 424, in end_processes
    with open(test.tempTraceLocation(client_name)) as output:
FileNotFoundError: [Errno 2] No such file or directory: '/datadrive/evmlabtemp/logs/DEFAULT-Wed_09_22_38-11214-573183-parity.trace.log'

I ran the testcase in question manually inside the parity docker image, and it worked fine. But for some reason, the file didn't exist. Disk was not full. It happened after 573182 testcases, so it doesn't happen often.

@holiman holiman changed the title Spurious failures Fuzzer engine, spurious failures Oct 18, 2018
@tintinweb
Copy link
Member

Is this the codeline that stores the trace? https://github.com/ethereum/evmlab/blame/master/utilities/fuzzer.py#L143

@holiman
Copy link
Collaborator Author

holiman commented Oct 25, 2018

@holiman holiman reopened this Nov 12, 2018
@holiman
Copy link
Collaborator Author

holiman commented Nov 12, 2018

It reappeared again, but I have some more log output this time:

2018-11-10 23:54:22,508 - INFO - Starting processes for ['geth', 'parity'] on test DEFAULT-Sat_09_04_09-5401-363088
2018-11-10 23:54:22,519 - INFO - All procs finished for test DEFAULT-Sat_09_04_09-5401-363085
2018-11-10 23:54:22,520 - INFO - Processed 8 steps for geth on test DEFAULT-Sat_09_04_09-5401-363085, pTime:0.30 ms
2018-11-10 23:54:22,520 - WARNING - The file /datadrive/evmlabtemp/logs/DEFAULT-Sat_09_04_09-5401-363085-parity.trace.log could not be found!
2018-11-10 23:54:22,520 - WARNING - Socket event [1][17]
2018-11-10 23:54:22,520 - WARNING - Socket data b'\x01\x00\x00\x00\x00\x00\x01drpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused "process_linux.go:87: adding pid 10540 to cgroups caused \\"failed to write 10540 to cgroup.procs: write /sys/fs/cgroup/cpu,cpuacct/docker/d5fede65f70d4f14ddddc0449c4a8d27ab4f1fa9ffb6352baef5d700c48f7a58/cgroup.procs: invalid argument\\""\n\r\n'b''
2018-11-10 23:54:22,520 - INFO - Processed 8 steps for parity on test DEFAULT-Sat_09_04_09-5401-363085, pTime:0.22 ms
2018-11-10 23:54:22,520 - INFO - Removing test artefacts ['/datadrive/evmlabtemp/testfiles/DEFAULT-Sat_09_04_09-5401-363085-test.json', '/datadrive/evmlabtemp/logs/DEFAULT-Sat_09_04_09-5401-363085-geth.trace.log', '/datadrive/evmlabtemp/logs/DEFAULT-Sat_09_04_09-5401-363085-parity.trace.log']
Traceback (most recent call last):
  File "utilities/fuzzerweb.py", line 61, in <module>
    main()
  File "utilities/fuzzerweb.py", line 56, in main
    executor.startFuzzing()
  File "/datadrive/evmlab/utilities/fuzzer.py", line 353, in startFuzzing
    self.postprocess_test(test, reporting=self._fuzzer._config.enable_reporting)
  File "/datadrive/evmlab/utilities/fuzzer.py", line 278, in postprocess_test
    failingTestcase = self._fuzzer.processTraces(test, forceSave=self._fuzzer._config.force_save)
  File "/datadrive/evmlab/utilities/fuzzer.py", line 487, in processTraces
    test.removeFiles()
  File "/datadrive/evmlab/utilities/fuzzer.py", line 147, in removeFiles
    os.remove(f)
FileNotFoundError: [Errno 2] No such file or directory: '/datadrive/evmlabtemp/logs/DEFAULT-Sat_09_04_09-5401-363085-parity.trace.log'
2018-11-11 01:02:14,011 - INFO - 127.0.0.1 - - [11/Nov/2018 01:02:14] "GET / HTTP/1.0" 200 -

So, apparently, this is what happens when we try to do the docker_exec

\x01\x00\x00\x00\x00\x00\x01drpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused "process_linux.go:87: adding pid 10540 to cgroups caused \\"failed to write 10540 to cgroup.procs: write /sys/fs/cgroup/cpu,cpuacct/docker/d5fede65f70d4f14ddddc0449c4a8d27ab4f1fa9ffb6352baef5d700c48f7a58/cgroup.procs: invalid argument\\""\n\r\n

@holiman
Copy link
Collaborator Author

holiman commented Nov 12, 2018

Related: moby/moby#31230
possibly loosely related: opencontainers/runc#1884

@holiman
Copy link
Collaborator Author

holiman commented Nov 13, 2018

Actually, now that we know what it is. we don't really have to fix it. If there is any output from the process, we know the instantiation failed (otherwise the output would have wound up into the file, since we pipe everything to file). And then we can just rerun the testcase, and move on.

@holiman
Copy link
Collaborator Author

holiman commented Nov 17, 2018

Same, on the second machine with more up-to-date code:

2018-11-13 01:28:00,790 - INFO - All procs finished for test DEFAULT-Mon_14_35_06-13647-751926
2018-11-13 01:28:00,791 - WARNING - The file /datadrive/evmlabtemp/logs/DEFAULT-Mon_14_35_06-13647-751926-geth.trace.log could not be found!
2018-11-13 01:28:00,791 - WARNING - Socket event [17][17]
2018-11-13 01:28:00,792 - WARNING - Socket data b'\x01\x00\x00\x00\x00\x00\x01drpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused "process_linux.go:87: adding pid 11733 to cgroups caused \\"failed to write 11733 to cgroup.procs: write /sys/fs/cgroup/cpu,cpuacct/docker/2981fdfce1455048285fa0f10812256b87a959e572e49633eea95d0041ae936d/cgroup.procs: invalid argument\\""\n\r\n'b''
2018-11-13 01:28:00,792 - INFO - Processed 0 steps for geth on test DEFAULT-Mon_14_35_06-13647-751926, pTime:0.77 ms 
2018-11-13 01:28:00,792 - INFO - Processed 4 steps for parity on test DEFAULT-Mon_14_35_06-13647-751926, pTime:0.35 ms 
2018-11-13 01:28:00,793 - WARNING - CONSENSUS BUG!!!
2018-11-13 01:28:00,793 - INFO - Saving testcase as /root/tmp/evmlab/artefacts/DEFAULT-Mon_14_35_06-13647-751926-test.json
2018-11-13 01:28:00,794 - INFO - Saving trace as /root/tmp/evmlab/artefacts/DEFAULT-Mon_14_35_06-13647-751926-geth.trace.log
Traceback (most recent call last):
  File "/usr/lib/python3.6/shutil.py", line 544, in move
    os.rename(src, real_dst)
OSError: [Errno 18] Invalid cross-device link: '/datadrive/evmlabtemp/logs/DEFAULT-Mon_14_35_06-13647-751926-geth.trace.log' -> '/root/tmp/evmlab/artefacts/DEFAULT-Mon_14_35_06-13647-751926-geth.trace.log'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "utilities/fuzzerweb.py", line 61, in <module>
    main()
  File "utilities/fuzzerweb.py", line 56, in main
    executor.startFuzzing()
  File "/datadrive/evmlab/utilities/fuzzer.py", line 357, in startFuzzing
    self.postprocess_test(test, reporting=self._fuzzer._config.enable_reporting)
  File "/datadrive/evmlab/utilities/fuzzer.py", line 281, in postprocess_test
    failingTestcase = self._fuzzer.processTraces(test, forceSave=self._fuzzer._config.force_save)
  File "/datadrive/evmlab/utilities/fuzzer.py", line 526, in processTraces
    test.saveArtefacts()
  File "/datadrive/evmlab/utilities/fuzzer.py", line 181, in saveArtefacts
    shutil.move(f, newloc)
  File "/usr/lib/python3.6/shutil.py", line 558, in move
    copy_function(src, real_dst)
  File "/usr/lib/python3.6/shutil.py", line 257, in copy2
    copyfile(src, dst, follow_symlinks=follow_symlinks)
  File "/usr/lib/python3.6/shutil.py", line 120, in copyfile
    with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: '/datadrive/evmlabtemp/logs/DEFAULT-Mon_14_35_06-13647-751926-geth.trace.log'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants