Use seccomp policy to avoid unnecessary sync operations #44
Good plan! I hadn't realised you could force syscalls to succeed using seccomp too.
Hmm, looks like the version of |
Sync operations are really slow on btrfs. They're also pointless, since if the computer crashes while we're doing a build then we'll just throw it away and start again anyway.

This commit provides a seccomp policy that causes all sync operations to "fail", with errno 0 ("success"). On my machine, this reduces the time to `apt-get install -y shared-mime-info` from 18.5s to 4.7s.

Based on https://bblank.thinkmo.de/using-seccomp-to-filter-sync-operations.html

Use `--fast-sync` to enable the new behaviour (requires the latest runc).
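For readers unfamiliar with the mechanism, here is a minimal sketch of what such a policy might look like when generated from OCaml. It is not the code from this PR: the function names, the exact syscall list, and the use of plain string formatting are assumptions for illustration. The field names (`defaultAction`, `architectures`, `syscalls`, `errnoRet`) follow the seccomp section of the OCI runtime config as I understand it.

```ocaml
(* Sketch only: builds the "seccomp" section of an OCI runtime config as a
   JSON string, using just the standard library. *)

(* Assumed list of sync-related syscalls; not taken from the PR itself. *)
let sync_syscalls =
  ["fsync"; "fdatasync"; "msync"; "sync"; "syncfs"; "sync_file_range"]

(* Render a list of strings as a JSON array of string literals. The names
   used here are plain ASCII identifiers, so no JSON escaping is needed. *)
let json_strings xs =
  "[" ^ String.concat ", " (List.map (Printf.sprintf "%S") xs) ^ "]"

(* With [fast_sync], every sync syscall is intercepted and returns
   immediately with errno 0, so the caller sees "success" without any
   data being flushed. Everything else is allowed as usual. *)
let seccomp_policy ~fast_sync ~architectures =
  let syscalls =
    if fast_sync then
      Printf.sprintf
        {|[{"names": %s, "action": "SCMP_ACT_ERRNO", "errnoRet": 0}]|}
        (json_strings sync_syscalls)
    else "[]"
  in
  Printf.sprintf
    {|{"defaultAction": "SCMP_ACT_ALLOW", "architectures": %s, "syscalls": %s}|}
    (json_strings architectures) syscalls

let () =
  print_endline
    (seccomp_policy ~fast_sync:true ~architectures:["SCMP_ARCH_X86_64"])
```

The resulting object would sit under `linux.seccomp` in the container's `config.json`. A real implementation would more likely build the value with a JSON library rather than format strings.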
This should allow `linux32` to work.
    match get_machine () with
    | "x86_64" -> ["SCMP_ARCH_X86_64"; "SCMP_ARCH_X86"; "SCMP_ARCH_X32"]
    | "aarch64" -> ["SCMP_ARCH_AARCH64"; "SCMP_ARCH_ARM"]
    | _ -> []
could we enumerate this somehow so that it'll fail on an unknown arch? Otherwise we'll run into this when adding riscv-32 in the future. (or could just make a note to remember to update this somewhere when we get around to riscv32)
there is logic in https://github.com/avsm/osrelease/blob/master/lib/osrelease.ml that I could release that does all the arch detection (based on opam's), if that helps
Could do. But when we add a new multi-arch platform then we'll test it and discover the problem immediately anyway.
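For reference, the stricter variant discussed in this thread could be sketched roughly as follows. This is an illustration, not code from the PR: `seccomp_archs` is a hypothetical name, and it takes the machine string as an argument (in the PR the value comes from `get_machine ()`) so that it can be exercised on its own.

```ocaml
(* Sketch only: map a machine type (as reported by e.g. `uname -m`) to the
   seccomp architectures to list in the policy, reporting an error for
   machine types that nobody has considered yet instead of silently
   returning an empty list. *)
let seccomp_archs machine =
  match machine with
  | "x86_64" -> Ok ["SCMP_ARCH_X86_64"; "SCMP_ARCH_X86"; "SCMP_ARCH_X32"]
  | "aarch64" -> Ok ["SCMP_ARCH_AARCH64"; "SCMP_ARCH_ARM"]
  | other ->
    Error (Printf.sprintf "No seccomp architecture list for machine type %S" other)

let () =
  match seccomp_archs "riscv32" with
  | Ok archs -> print_endline (String.concat ", " archs)
  | Error msg -> prerr_endline msg
```

The trade-off is that every supported single-architecture platform would then need an explicit entry (presumably returning `Ok []`), whereas the wildcard case in the PR accepts them without any change.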
Merging now to fix cluster performance problems. Can be improved later if needed.
Might help with problems such as this:

```
[11030132.006555] INFO: task ocluster-worker:602217 blocked for more than 120 seconds.
[11030132.015596] Not tainted 5.4.0-40-generic #44-Ubuntu
[11030132.022547] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11030132.032061] ocluster-worker D 0 602217 1 0x00004000
[11030132.032069] Call Trace:
[11030132.032092] __schedule+0x2e3/0x740
[11030132.032106] ? __switch_to_asm+0x40/0x70
[11030132.032116] ? __switch_to_asm+0x34/0x70
[11030132.032126] schedule+0x42/0xb0
[11030132.032130] schedule_preempt_disabled+0xe/0x10
[11030132.032132] __mutex_lock.isra.0+0x182/0x4f0
[11030132.032142] ? try_to_del_timer_sync+0x54/0x80
[11030132.032145] __mutex_lock_slowpath+0x13/0x20
[11030132.032148] mutex_lock+0x2e/0x40
[11030132.032199] btrfs_start_delalloc_roots+0x60/0x280 [btrfs]
[11030132.032238] flush_space+0x5dd/0x740 [btrfs]
[11030132.032281] ? lock_extent_buffer_for_io+0x370/0x370 [btrfs]
[11030132.032325] ? __clear_extent_bit+0x201/0x4a0 [btrfs]
[11030132.032372] priority_reclaim_metadata_space.isra.0+0x18b/0x220 [btrfs]
[11030132.032429] ? can_overcommit.part.0+0x5f/0xc0 [btrfs]
[11030132.032466] btrfs_reserve_metadata_bytes+0x578/0x950 [btrfs]
[11030132.032501] ? btrfs_truncate_inode_items+0x35e/0xdb0 [btrfs]
[11030132.032505] ? __mutex_lock.isra.0+0x429/0x4f0
[11030132.032557] ? __btrfs_block_rsv_release+0x1c1/0x300 [btrfs]
[11030132.032595] btrfs_block_rsv_refill+0x7d/0xa0 [btrfs]
[11030132.032628] evict_refill_and_join+0x39/0xd0 [btrfs]
[11030132.032670] btrfs_evict_inode+0x417/0x4c0 [btrfs]
[11030132.032689] evict+0xd2/0x1b0
[11030132.032698] iput+0x148/0x210
[11030132.032708] dentry_unlink_inode+0xc6/0x110
[11030132.032720] d_delete+0x76/0x80
[11030132.032727] vfs_rmdir+0x179/0x1a0
[11030132.032732] do_rmdir+0x18c/0x1c0
[11030132.032736] __x64_sys_rmdir+0x17/0x20
[11030132.032744] do_syscall_64+0x57/0x190
[11030132.032747] entry_SYSCALL_64_after_hwframe+0x44/0xa9
```
CHANGES:

- Add support for nested / multi-stage builds (@talex5 ocurrent/obuilder#48 ocurrent/obuilder#49).
  This allows you to use a large build environment to create a binary and then copy that into a smaller runtime environment. It's also useful to get better caching if two things can change independently (e.g. you want to build your software and also a linting tool, and be able to update either without rebuilding the other).

- Add healthcheck feature (@talex5 ocurrent/obuilder#52).
  - Checks that Docker is running.
  - Does a test build using busybox.

- Clean up left-over runc containers on restart (@talex5 ocurrent/obuilder#53).
  If btrfs crashes and makes the filesystem read-only then after rebooting there will be stale runc directories. New jobs with the same IDs would then fail.

- Remove dependency on dockerfile (@talex5 ocurrent/obuilder#51).
  This also allows us more control over the formatting (e.g. putting a blank line between stages in multi-stage builds).

- Record log output from docker pull (@talex5 ocurrent/obuilder#46).
  Otherwise, it's not obvious why we've stopped at a pull step, or what is happening.

- Improve formatting of OBuilder specs (@talex5 ocurrent/obuilder#45).

- Use seccomp policy to avoid unnecessary sync operations (@talex5 ocurrent/obuilder#44).
  Sync operations are really slow on btrfs. They're also pointless, since if the computer crashes while we're doing a build then we'll just throw it away and start again anyway. Use a seccomp policy that causes all sync operations to "fail", with errno 0 ("success"). On my machine, this reduces the time to `apt-get install -y shared-mime-info` from 18.5s to 4.7s. Use `--fast-sync` to enable the new behaviour (it requires runc 1.0.0-rc92).

- Use a mutex to avoid concurrent btrfs operations (@talex5 ocurrent/obuilder#43).
  Btrfs deadlocks enough as it is. Don't stress it further by trying to do two things at once.

Internal changes:

- Improve handling of file redirections (@talex5 ocurrent/obuilder#46).
  Instead of making the caller do all the work of closing the file descriptors safely, add an `FD_move_safely` mode.

- Travis tests: ensure apt cache is up-to-date (@talex5 ocurrent/obuilder#50).