Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Auditbeat] Flaky file_integrity.test_recursive test #7731

Closed
ruflin opened this issue Jul 25, 2018 · 5 comments
Closed

[Auditbeat] Flaky file_integrity.test_recursive test #7731

ruflin opened this issue Jul 25, 2018 · 5 comments
Assignees
Labels
Auditbeat flaky-test Unstable or unreliable test cases.

Comments

@ruflin
Copy link
Contributor

ruflin commented Jul 25, 2018

Error:

09:03:03 ======================================================================
09:03:03 ERROR: file_integrity monitors watched directories (recursive).
09:03:03 ----------------------------------------------------------------------
09:03:03 Traceback (most recent call last):
09:03:03   File "/go/src/github.com/elastic/beats/auditbeat/tests/system/test_file_integrity.py", line 167, in test_recursive
09:03:03     self.wait_log_contains("\"path\": \"{0}\"".format(escape_path(file2)), ignore_case=True)
09:03:03   File "/go/src/github.com/elastic/beats/auditbeat/tests/system/../../../metricbeat/tests/system/../../../libbeat/tests/system/beat/beat.py", line 348, in wait_log_contains
09:03:03     name=name)
09:03:03   File "/go/src/github.com/elastic/beats/auditbeat/tests/system/../../../metricbeat/tests/system/../../../libbeat/tests/system/beat/beat.py", line 325, in wait_until
09:03:03     "Waited {} seconds.".format(max_timeout))
09:03:03 TimeoutError: Timeout waiting for 'log_contains' to be true. Waited 10 seconds.
09:03:03 
09:03:03 ----------------------------------------------------------------------
09:03:03 XML: /go/src/github.com/elastic/beats/auditbeat/build/TEST-system.xml

Log

2018-07-25T09:02:29.744Z	INFO	instance/beat.go:544	Home path: [/go/src/github.com/elastic/beats/auditbeat/build/system-tests/run/test_file_integrity.Test.test_recursive] Config path: [/go/src/github.com/elastic/beats/auditbeat/build/system-tests/run/test_file_integrity.Test.test_recursive] Data path: [/go/src/github.com/elastic/beats/auditbeat/build/system-tests/run/test_file_integrity.Test.test_recursive/data] Logs path: [/go/src/github.com/elastic/beats/auditbeat/build/system-tests/run/test_file_integrity.Test.test_recursive/logs]
2018-07-25T09:02:29.744Z	DEBUG	[beat]	instance/beat.go:571	Beat metadata path: /go/src/github.com/elastic/beats/auditbeat/build/system-tests/run/test_file_integrity.Test.test_recursive/data/meta.json
2018-07-25T09:02:29.746Z	INFO	instance/beat.go:551	Beat UUID: bacab130-fc52-4ced-b486-b3cca1426ca4
2018-07-25T09:02:29.746Z	DEBUG	[seccomp]	seccomp/seccomp.go:109	Loading syscall filter	{"seccomp_filter": {"no_new_privs":true,"flag":"tsync","policy":{"default_action":"errno","syscalls":[{"names":["accept","accept4","access","arch_prctl","bind","brk","clock_gettime","clone","close","connect","dup","dup2","epoll_create","epoll_create1","epoll_ctl","epoll_pwait","epoll_wait","exit","exit_group","fchdir","fchmod","fchown","fcntl","fdatasync","flock","fstat","fsync","ftruncate","futex","getcwd","getdents","getdents64","geteuid","getgid","getpeername","getpid","getppid","getrandom","getrusage","getsockname","getsockopt","gettid","gettimeofday","getuid","inotify_add_watch","inotify_init1","inotify_rm_watch","ioctl","kill","listen","lseek","lstat","madvise","mincore","mkdirat","mmap","mprotect","munmap","nanosleep","newfstatat","open","openat","pipe","pipe2","poll","pread64","pselect6","pwrite64","read","readlink","readlinkat","recvfrom","recvmmsg","recvmsg","rename","renameat","rt_sigaction","rt_sigprocmask","rt_sigreturn","sched_getaffinity","sched_yield","sendfile","sendmmsg","sendmsg","sendto","set_robust_list","setitimer","setsockopt","shutdown","sigaltstack","socket","stat","statfs","sysinfo","tgkill","time","tkill","uname","unlink","unlinkat","wait4","waitid","write","writev"],"action":"allow"}]}}}
2018-07-25T09:02:29.746Z	INFO	[seccomp]	seccomp/seccomp.go:116	Syscall filter successfully installed
2018-07-25T09:02:29.746Z	INFO	[beat]	instance/beat.go:768	Beat info	{"system_info": {"beat": {"path": {"config": "/go/src/github.com/elastic/beats/auditbeat/build/system-tests/run/test_file_integrity.Test.test_recursive", "data": "/go/src/github.com/elastic/beats/auditbeat/build/system-tests/run/test_file_integrity.Test.test_recursive/data", "home": "/go/src/github.com/elastic/beats/auditbeat/build/system-tests/run/test_file_integrity.Test.test_recursive", "logs": "/go/src/github.com/elastic/beats/auditbeat/build/system-tests/run/test_file_integrity.Test.test_recursive/logs"}, "type": "auditbeat", "uuid": "bacab130-fc52-4ced-b486-b3cca1426ca4"}}}
2018-07-25T09:02:29.746Z	INFO	[beat]	instance/beat.go:777	Build info	{"system_info": {"build": {"commit": "unknown", "libbeat": "7.0.0-alpha1", "time": "1754-08-30T22:43:41.128Z", "version": "7.0.0-alpha1"}}}
2018-07-25T09:02:29.746Z	INFO	[beat]	instance/beat.go:780	Go runtime info	{"system_info": {"go": {"os":"linux","arch":"amd64","max_procs":4,"version":"go1.10.3"}}}
2018-07-25T09:02:29.747Z	INFO	[beat]	instance/beat.go:784	Host info	{"system_info": {"host": {"architecture":"x86_64","boot_time":"2018-03-16T08:08:25Z","containerized":false,"hostname":"ebf247864475","ips":["127.0.0.1/8","172.18.0.5/16"],"kernel_version":"4.4.0-1052-aws","mac_addresses":["02:42:ac:12:00:05"],"os":{"family":"debian","platform":"debian","name":"Debian GNU/Linux","version":"9 (stretch)","major":9,"minor":0,"patch":0,"codename":"stretch"},"timezone":"UTC","timezone_offset_sec":0,"id":"3e608929fbd39b959f388bf468c9f0b1"}}}
2018-07-25T09:02:29.747Z	INFO	[beat]	instance/beat.go:813	Process info	{"system_info": {"process": {"capabilities": {"inheritable":["chown","dac_override","dac_read_search","fowner","fsetid","kill","setgid","setuid","setpcap","linux_immutable","net_bind_service","net_broadcast","net_admin","net_raw","ipc_lock","ipc_owner","sys_module","sys_rawio","sys_chroot","sys_ptrace","sys_pacct","sys_admin","sys_boot","sys_nice","sys_resource","sys_time","sys_tty_config","mknod","lease","audit_write","audit_control","setfcap","mac_override","mac_admin","syslog","wake_alarm","block_suspend","audit_read"],"permitted":["chown","dac_override","dac_read_search","fowner","fsetid","kill","setgid","setuid","setpcap","linux_immutable","net_bind_service","net_broadcast","net_admin","net_raw","ipc_lock","ipc_owner","sys_module","sys_rawio","sys_chroot","sys_ptrace","sys_pacct","sys_admin","sys_boot","sys_nice","sys_resource","sys_time","sys_tty_config","mknod","lease","audit_write","audit_control","setfcap","mac_override","mac_admin","syslog","wake_alarm","block_suspend","audit_read"],"effective":["chown","dac_override","dac_read_search","fowner","fsetid","kill","setgid","setuid","setpcap","linux_immutable","net_bind_service","net_broadcast","net_admin","net_raw","ipc_lock","ipc_owner","sys_module","sys_rawio","sys_chroot","sys_ptrace","sys_pacct","sys_admin","sys_boot","sys_nice","sys_resource","sys_time","sys_tty_config","mknod","lease","audit_write","audit_control","setfcap","mac_override","mac_admin","syslog","wake_alarm","block_suspend","audit_read"],"bounding":["chown","dac_override","dac_read_search","fowner","fsetid","kill","setgid","setuid","setpcap","linux_immutable","net_bind_service","net_broadcast","net_admin","net_raw","ipc_lock","ipc_owner","sys_module","sys_rawio","sys_chroot","sys_ptrace","sys_pacct","sys_admin","sys_boot","sys_nice","sys_resource","sys_time","sys_tty_config","mknod","lease","audit_write","audit_control","setfcap","mac_override","mac_admin","syslog","wake_alarm","block_suspend","audit_read"],"ambient":null}, "cwd": "/go/src/github.com/elastic/beats/auditbeat", "exe": "/go/src/github.com/elastic/beats/auditbeat/auditbeat.test", "name": "auditbeat.test", "pid": 28824, "ppid": 28624, "seccomp": {"mode":"filter"}, "start_time": "2018-07-25T09:02:28.900Z"}}}
2018-07-25T09:02:29.747Z	INFO	instance/beat.go:273	Setup Beat: auditbeat; Version: 7.0.0-alpha1
2018-07-25T09:02:29.747Z	DEBUG	[beat]	instance/beat.go:290	Initializing output plugins
2018-07-25T09:02:29.747Z	DEBUG	[processors]	processors/processor.go:66	Processors: 
2018-07-25T09:02:29.747Z	DEBUG	[rotator]	file/rotator.go:135	Initialized file rotator	{"rotator": {"filename": "/go/src/github.com/elastic/beats/auditbeat/build/system-tests/run/test_file_integrity.Test.test_recursive/output/auditbeat", "max_size_bytes": 1024000, "max_backups": 7, "permissions": "-rw-------"}}
2018-07-25T09:02:29.747Z	INFO	fileout/file.go:94	Initialized file output. path=/go/src/github.com/elastic/beats/auditbeat/build/system-tests/run/test_file_integrity.Test.test_recursive/output/auditbeat max_size_bytes=1024000 max_backups=7 permissions=-rw-------
2018-07-25T09:02:29.747Z	DEBUG	[publish]	pipeline/consumer.go:137	start pipeline event consumer
2018-07-25T09:02:29.747Z	INFO	pipeline/module.go:98	Beat name: ebf247864475
2018-07-25T09:02:29.747Z	DEBUG	[modules]	beater/metricbeat.go:98	Register [ModuleFactory:[], MetricSetFactory:[auditd/auditd, file_integrity/file]]
2018-07-25T09:02:29.748Z	DEBUG	[processors]	processors/processor.go:66	Processors: 
2018-07-25T09:02:29.748Z	DEBUG	[file_integrity]	file_integrity/metricset.go:103	Initialized the file event reader. Running as euid=0
2018-07-25T09:02:29.748Z	INFO	instance/beat.go:367	auditbeat start running.
2018-07-25T09:02:29.748Z	INFO	[monitoring]	log/log.go:114	Starting metrics logging every 30s
2018-07-25T09:02:29.748Z	DEBUG	[module]	module/wrapper.go:117	Starting Wrapper[name=file_integrity, len(metricSetWrappers)=1]
2018-07-25T09:02:29.748Z	DEBUG	[module]	module/wrapper.go:171	file_integrity/file will start after 7.837305943s
2018-07-25T09:02:37.586Z	DEBUG	[module]	module/wrapper.go:179	Starting metricSetWrapper[module=file_integrity, name=file, host=]
2018-07-25T09:02:37.592Z	INFO	[file_integrity]	file_integrity/eventreader_fsnotify.go:76	Started fsnotify watcher	{"file_path": ["/tmp/tmpgZqLu1auditbeat_test"], "recursive": true}
2018-07-25T09:02:37.648Z	DEBUG	[file_integrity]	file_integrity/eventreader_fsnotify.go:95	Received fsnotify event	{"file_path": "/tmp/tmpgZqLu1auditbeat_test/subdir", "event_flags": "CREATE"}
2018-07-25T09:02:37.649Z	DEBUG	[file_integrity]	file_integrity/eventreader_fsnotify.go:95	Received fsnotify event	{"file_path": "/tmp/tmpgZqLu1auditbeat_test/subdir/file.txt", "event_flags": "CREATE"}
2018-07-25T09:02:37.649Z	DEBUG	[file_integrity]	file_integrity/metricset.go:224	File changed since it was last seen	{"file_path": "/tmp/tmpgZqLu1auditbeat_test/subdir", "took": 527361, "event": {"old": null, "new": {"timestamp":"2018-07-25T09:02:37.649047821Z","path":"/tmp/tmpgZqLu1auditbeat_test/subdir","info":{"inode":49693370,"uid":0,"gid":0,"sid":"","owner":"root","group":"root","size":4096,"mtime":"2018-07-25T09:02:37.646808735Z","ctime":"2018-07-25T09:02:37.646808735Z","type":"dir","mode":493,"setuid":false,"setgid":false,"origin":null},"source":"fsnotify","action":"created"}}}
2018-07-25T09:02:37.650Z	DEBUG	[publish]	pipeline/processor.go:308	Publish event: {
  "@timestamp": "2018-07-25T09:02:37.649Z",
  "@metadata": {
    "beat": "auditbeat",
    "type": "doc",
    "version": "7.0.0-alpha1"
  },
  "file": {
    "path": "/tmp/tmpgZqLu1auditbeat_test/subdir",
    "mtime": "2018-07-25T09:02:37.646Z",
    "type": "dir",
    "uid": 0,
    "mode": "0755",
    "gid": 0,
    "owner": "root",
    "group": "root",
    "inode": "49693370",
    "ctime": "2018-07-25T09:02:37.646Z"
  },
  "beat": {
    "name": "ebf247864475",
    "hostname": "ebf247864475",
    "version": "7.0.0-alpha1"
  },
  "host": {
    "name": "ebf247864475"
  },
  "event": {
    "module": "file_integrity",
    "action": [
      "created"
    ]
  }
}
2018-07-25T09:02:37.651Z	DEBUG	[file_integrity]	file_integrity/metricset.go:224	File changed since it was last seen	{"file_path": "/tmp/tmpgZqLu1auditbeat_test/subdir/file.txt", "took": 95578, "event": {"old": null, "new": {"timestamp":"2018-07-25T09:02:37.649621128Z","path":"/tmp/tmpgZqLu1auditbeat_test/subdir/file.txt","info":{"inode":49693406,"uid":0,"gid":0,"sid":"","owner":"root","group":"root","size":12,"mtime":"2018-07-25T09:02:37.646808735Z","ctime":"2018-07-25T09:02:37.646808735Z","type":"file","mode":420,"setuid":false,"setgid":false,"origin":null},"source":"fsnotify","action":"created","hash":{"sha1":"430ce34d020724ed75a196dfc2ad67c77772d169"}}}}
2018-07-25T09:02:37.651Z	DEBUG	[publish]	pipeline/processor.go:308	Publish event: {
  "@timestamp": "2018-07-25T09:02:37.649Z",
  "@metadata": {
    "beat": "auditbeat",
    "type": "doc",
    "version": "7.0.0-alpha1"
  },
  "hash": {
    "sha1": "430ce34d020724ed75a196dfc2ad67c77772d169"
  },
  "beat": {
    "name": "ebf247864475",
    "hostname": "ebf247864475",
    "version": "7.0.0-alpha1"
  },
  "host": {
    "name": "ebf247864475"
  },
  "event": {
    "module": "file_integrity",
    "action": [
      "created"
    ]
  },
  "file": {
    "ctime": "2018-07-25T09:02:37.646Z",
    "uid": 0,
    "mode": "0644",
    "group": "root",
    "inode": "49693406",
    "mtime": "2018-07-25T09:02:37.646Z",
    "owner": "root",
    "path": "/tmp/tmpgZqLu1auditbeat_test/subdir/file.txt",
    "size": 12,
    "type": "file",
    "gid": 0
  }
}
2018-07-25T09:02:37.751Z	DEBUG	[memqueue]	memqueue/ackloop.go:160	ackloop: receive ack [0: 0, 2]
2018-07-25T09:02:37.751Z	DEBUG	[memqueue]	memqueue/eventloop.go:535	broker ACK events: count=2, start-seq=1, end-seq=2

2018-07-25T09:02:37.751Z	DEBUG	[memqueue]	memqueue/ackloop.go:128	ackloop: return ack to broker loop:2
2018-07-25T09:02:37.751Z	DEBUG	[memqueue]	memqueue/ackloop.go:131	ackloop:  done send ack
2018-07-25T09:02:47.686Z	DEBUG	[file_integrity]	file_integrity/eventreader_fsnotify.go:95	Received fsnotify event	{"file_path": "/tmp/tmpgZqLu1auditbeat_test/subdir/file.txt", "event_flags": "REMOVE"}
2018-07-25T09:02:47.687Z	DEBUG	[file_integrity]	file_integrity/metricset.go:224	File changed since it was last seen	{"file_path": "/tmp/tmpgZqLu1auditbeat_test/subdir/file.txt", "took": 30022, "event": {"old": {"timestamp":"2018-07-25T09:02:37.649621128Z","path":"/tmp/tmpgZqLu1auditbeat_test/subdir/file.txt","info":{"inode":49693406,"uid":0,"gid":0,"sid":"","owner":"","group":"","size":12,"mtime":"2018-07-25T09:02:37.646808735Z","ctime":"2018-07-25T09:02:37.646808735Z","type":"file","mode":420,"setuid":false,"setgid":false,"origin":null},"source":"fsnotify","action":"created","hash":{"sha1":"430ce34d020724ed75a196dfc2ad67c77772d169"}}, "new": {"timestamp":"2018-07-25T09:02:47.687090387Z","path":"/tmp/tmpgZqLu1auditbeat_test/subdir/file.txt","info":null,"source":"fsnotify","action":"deleted"}}}
2018-07-25T09:02:47.687Z	DEBUG	[file_integrity]	file_integrity/eventreader_fsnotify.go:95	Received fsnotify event	{"file_path": "/tmp/tmpgZqLu1auditbeat_test/subdir", "event_flags": "REMOVE"}
2018-07-25T09:02:47.687Z	DEBUG	[publish]	pipeline/processor.go:308	Publish event: {
  "@timestamp": "2018-07-25T09:02:47.687Z",
  "@metadata": {
    "beat": "auditbeat",
    "type": "doc",
    "version": "7.0.0-alpha1"
  },
  "event": {
    "module": "file_integrity",
    "action": [
      "deleted"
    ]
  },
  "file": {
    "path": "/tmp/tmpgZqLu1auditbeat_test/subdir/file.txt"
  },
  "beat": {
    "name": "ebf247864475",
    "hostname": "ebf247864475",
    "version": "7.0.0-alpha1"
  },
  "host": {
    "name": "ebf247864475"
  }
}
2018-07-25T09:02:47.688Z	DEBUG	[file_integrity]	file_integrity/metricset.go:224	File changed since it was last seen	{"file_path": "/tmp/tmpgZqLu1auditbeat_test/subdir", "took": 14361, "event": {"old": {"timestamp":"2018-07-25T09:02:37.649047821Z","path":"/tmp/tmpgZqLu1auditbeat_test/subdir","info":{"inode":49693370,"uid":0,"gid":0,"sid":"","owner":"","group":"","size":0,"mtime":"1970-01-01T00:00:00Z","ctime":"1970-01-01T00:00:00Z","type":"dir","mode":493,"setuid":false,"setgid":false,"origin":null},"source":"fsnotify","action":"created"}, "new": {"timestamp":"2018-07-25T09:02:47.68733801Z","path":"/tmp/tmpgZqLu1auditbeat_test/subdir","info":null,"source":"fsnotify","action":"deleted"}}}
2018-07-25T09:02:47.688Z	DEBUG	[publish]	pipeline/processor.go:308	Publish event: {
  "@timestamp": "2018-07-25T09:02:47.687Z",
  "@metadata": {
    "beat": "auditbeat",
    "type": "doc",
    "version": "7.0.0-alpha1"
  },
  "event": {
    "module": "file_integrity",
    "action": [
      "deleted"
    ]
  },
  "file": {
    "path": "/tmp/tmpgZqLu1auditbeat_test/subdir"
  },
  "beat": {
    "version": "7.0.0-alpha1",
    "name": "ebf247864475",
    "hostname": "ebf247864475"
  },
  "host": {
    "name": "ebf247864475"
  }
}
@ruflin ruflin added Auditbeat flaky-test Unstable or unreliable test cases. labels Jul 25, 2018
@andrewkroh
Copy link
Member

andrewkroh commented Jul 25, 2018

It looks like the OS never sent an event for more.txt.

cwurm pushed a commit to cwurm/beats that referenced this issue Aug 24, 2018
There have been spurious test failures in
test_file_integrity.Test.test_recursive (elastic#7731).
This makes sure all errors encountered in recursiveWatcher
are caught and logged, and also adds a debug message
when a new recursive watch is added.
@cwurm
Copy link
Contributor

cwurm commented Aug 24, 2018

Even before more.txt, it never finds subdir/other/. This has been happening a few times each of the last few months. It's always on Linux, and always test_recursive (and not test_non_recursive which is similar). It might be a problem in fsnotify - unfortunately, it does not have any debug output. It might also be something in recursiveWatcher before it's reported.

Maybe something related to the concurrent map access in #8009 in which case #8027 should help.

I opened #8087 to add error catching everywhere the event could have been held up before being reported. If this happens again, we should have more information either way - a new error message or we've excluded a possible cause and need to dive into fsnotify (I hope not, it does not look pretty).

@ruflin ruflin changed the title [Auditbeat] File integrity flaky test [Auditbeat] Flaky file_integrity.test_recursive test Aug 27, 2018
@cwurm
Copy link
Contributor

cwurm commented Aug 28, 2018

I've merged #8087. I suggest we keep this issue open for a month or so to see if it still happens.

cwurm pushed a commit that referenced this issue Aug 28, 2018
There have been spurious test failures in test_file_integrity.Test.test_recursive (#7731). This makes sure all errors encountered in recursiveWatcher are caught and logged, and also adds a debug message when a new recursive watch is added.
cwurm pushed a commit to cwurm/beats that referenced this issue Sep 18, 2018
There have been spurious test failures in test_file_integrity.Test.test_recursive (elastic#7731). This makes sure all errors encountered in recursiveWatcher are caught and logged, and also adds a debug message when a new recursive watch is added.

(cherry picked from commit e5f16e2)
cwurm pushed a commit that referenced this issue Sep 19, 2018
There have been spurious test failures in test_file_integrity.Test.test_recursive (#7731). This makes sure all errors encountered in recursiveWatcher are caught and logged, and also adds a debug message when a new recursive watch is added.

(cherry picked from commit e5f16e2)
ruflin added a commit to ruflin/beats that referenced this issue Dec 12, 2018
Skipping this test as it was flaky several times. See elastic#7731 for report.
ruflin added a commit that referenced this issue Dec 12, 2018
Skipping this test as it was flaky several times. See #7731 for report.
@cwurm
Copy link
Contributor

cwurm commented Feb 9, 2019

This exact test failure does not seem to be happening anymore. There were others in this test that were fixed since then. Closing for now.

@cwurm cwurm closed this as completed Feb 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Auditbeat flaky-test Unstable or unreliable test cases.
Projects
None yet
Development

No branches or pull requests

3 participants