-
Notifications
You must be signed in to change notification settings - Fork 431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create a way for offline building #17
Comments
Agreed! Offline building will be supported (it is one of the requirements). At the moment, you may try pre |
I saw in the docs something about charge fetch. But for nix it would be best I can do it out of tree and link the folder in tree later. |
I will try to build it via nix next week. If that succeeds, I'm happy to close this issue.
|
Seems like i had an to old version of some dependencies |
If you have the time to try, it would be great! The issue may remain opened since building in general (and offline building in particular) is not yet set in stone.
The branch is intended to always be building and working (CI was just added to keep it that way, too). Of course, mistakes could happen. |
I'm currently seeing this warning, is
(sorry, it's my first time working on the kernel) |
No, it is something that appears the first time when you haven't configured or built the kernel previously (so after a (Don't worry about asking!) |
This line seems to be the problem if you build with O=$something Line 232 in f60edb2
|
Not sure what I did, but I got it to die with this error:
|
What about changing from @ojeda What do you think about this? should I open an Issue/PR for this? |
Good catch, we need to avoid reading the file when not needed.
Do you mean avoiding conditional compilation altogether?
Does
Definitely! Please open the issue. |
Yes, that was my idea. We maybe could create special cases, like some debug values or so, but I think the should be opt-in
I'm currently trying to figure this out and ask in the rust zulipchat.
Opened as #22 |
No, conditional compilation must work -- nobody will settle on something so fundamental (myself included!). It is not just about bloat or performance (which are both a big no-no), but also about complexity (we'd need to define what subset goes in), divergence (developers expect all
Thanks a lot x2! |
Thus std aware cargo is the biggest blocker for me at the moment. Is it reasonable to disable it, until it is stable in cargo? |
I got a new error, so got build-std working for now (unsure how). But now the build.rs tries to access files which are not there. Not sure if this is because I'm calling make with
|
exceptions may be traversed using list_for_each_entry_rcu() outside of an RCU read side critical section BUT under the protection of decgroup_mutex. Hence add the corresponding lockdep expression to fix the following false-positive warning: [ 2.304417] ============================= [ 2.304418] WARNING: suspicious RCU usage [ 2.304420] 5.5.4-stable #17 Tainted: G E [ 2.304422] ----------------------------- [ 2.304424] security/device_cgroup.c:355 RCU-list traversed in non-reader section!! Signed-off-by: Amol Grover <[email protected]> Signed-off-by: James Morris <[email protected]>
…s metrics" test Linux 5.9 introduced perf test case "Parse and process metrics" and on s390 this test case always dumps core: [root@t35lp67 perf]# ./perf test -vvvv -F 67 67: Parse and process metrics : --- start --- metric expr inst_retired.any / cpu_clk_unhalted.thread for IPC parsing metric: inst_retired.any / cpu_clk_unhalted.thread Segmentation fault (core dumped) [root@t35lp67 perf]# I debugged this core dump and gdb shows this call chain: (gdb) where #0 0x000003ffabc3192a in __strnlen_c_1 () from /lib64/libc.so.6 #1 0x000003ffabc293de in strcasestr () from /lib64/libc.so.6 #2 0x0000000001102ba2 in match_metric(list=0x1e6ea20 "inst_retired.any", n=<optimized out>) at util/metricgroup.c:368 #3 find_metric (map=<optimized out>, map=<optimized out>, metric=0x1e6ea20 "inst_retired.any") at util/metricgroup.c:765 #4 __resolve_metric (ids=0x0, map=<optimized out>, metric_list=0x0, metric_no_group=<optimized out>, m=<optimized out>) at util/metricgroup.c:844 #5 resolve_metric (ids=0x0, map=0x0, metric_list=0x0, metric_no_group=<optimized out>) at util/metricgroup.c:881 #6 metricgroup__add_metric (metric=<optimized out>, metric_no_group=metric_no_group@entry=false, events=<optimized out>, events@entry=0x3ffd84fb878, metric_list=0x0, metric_list@entry=0x3ffd84fb868, map=0x0) at util/metricgroup.c:943 #7 0x00000000011034ae in metricgroup__add_metric_list (map=0x13f9828 <map>, metric_list=0x3ffd84fb868, events=0x3ffd84fb878, metric_no_group=<optimized out>, list=<optimized out>) at util/metricgroup.c:988 #8 parse_groups (perf_evlist=perf_evlist@entry=0x1e70260, str=str@entry=0x12f34b2 "IPC", metric_no_group=<optimized out>, metric_no_merge=<optimized out>, fake_pmu=fake_pmu@entry=0x1462f18 <perf_pmu.fake>, metric_events=0x3ffd84fba58, map=0x1) at util/metricgroup.c:1040 #9 0x0000000001103eb2 in metricgroup__parse_groups_test( evlist=evlist@entry=0x1e70260, map=map@entry=0x13f9828 <map>, str=str@entry=0x12f34b2 "IPC", metric_no_group=metric_no_group@entry=false, metric_no_merge=metric_no_merge@entry=false, metric_events=0x3ffd84fba58) at util/metricgroup.c:1082 #10 0x00000000010c84d8 in __compute_metric (ratio2=0x0, name2=0x0, ratio1=<synthetic pointer>, name1=0x12f34b2 "IPC", vals=0x3ffd84fbad8, name=0x12f34b2 "IPC") at tests/parse-metric.c:159 #11 compute_metric (ratio=<synthetic pointer>, vals=0x3ffd84fbad8, name=0x12f34b2 "IPC") at tests/parse-metric.c:189 #12 test_ipc () at tests/parse-metric.c:208 ..... ..... omitted many more lines This test case was added with commit 218ca91 ("perf tests: Add parse metric test for frontend metric"). When I compile with make DEBUG=y it works fine and I do not get a core dump. It turned out that the above listed function call chain worked on a struct pmu_event array which requires a trailing element with zeroes which was missing. The marco map_for_each_event() loops over that array tests for members metric_expr/metric_name/metric_group being non-NULL. Adding this element fixes the issue. Output after: [root@t35lp46 perf]# ./perf test 67 67: Parse and process metrics : Ok [root@t35lp46 perf]# Committer notes: As Ian remarks, this is not s390 specific: <quote Ian> This also shows up with address sanitizer on all architectures (perhaps change the patch title) and perhaps add a "Fixes: <commit>" tag. ================================================================= ==4718==ERROR: AddressSanitizer: global-buffer-overflow on address 0x55c93b4d59e8 at pc 0x55c93a1541e2 bp 0x7ffd24327c60 sp 0x7ffd24327c58 READ of size 8 at 0x55c93b4d59e8 thread T0 #0 0x55c93a1541e1 in find_metric tools/perf/util/metricgroup.c:764:2 #1 0x55c93a153e6c in __resolve_metric tools/perf/util/metricgroup.c:844:9 #2 0x55c93a152f18 in resolve_metric tools/perf/util/metricgroup.c:881:9 #3 0x55c93a1528db in metricgroup__add_metric tools/perf/util/metricgroup.c:943:9 #4 0x55c93a151996 in metricgroup__add_metric_list tools/perf/util/metricgroup.c:988:9 #5 0x55c93a1511b9 in parse_groups tools/perf/util/metricgroup.c:1040:8 #6 0x55c93a1513e1 in metricgroup__parse_groups_test tools/perf/util/metricgroup.c:1082:9 #7 0x55c93a0108ae in __compute_metric tools/perf/tests/parse-metric.c:159:8 #8 0x55c93a010744 in compute_metric tools/perf/tests/parse-metric.c:189:9 #9 0x55c93a00f5ee in test_ipc tools/perf/tests/parse-metric.c:208:2 #10 0x55c93a00f1e8 in test__parse_metric tools/perf/tests/parse-metric.c:345:2 #11 0x55c939fd7202 in run_test tools/perf/tests/builtin-test.c:410:9 #12 0x55c939fd6736 in test_and_print tools/perf/tests/builtin-test.c:440:9 #13 0x55c939fd58c3 in __cmd_test tools/perf/tests/builtin-test.c:661:4 #14 0x55c939fd4e02 in cmd_test tools/perf/tests/builtin-test.c:807:9 #15 0x55c939e4763d in run_builtin tools/perf/perf.c:313:11 #16 0x55c939e46475 in handle_internal_command tools/perf/perf.c:365:8 #17 0x55c939e4737e in run_argv tools/perf/perf.c:409:2 #18 0x55c939e45f7e in main tools/perf/perf.c:539:3 0x55c93b4d59e8 is located 0 bytes to the right of global variable 'pme_test' defined in 'tools/perf/tests/parse-metric.c:17:25' (0x55c93b4d54a0) of size 1352 SUMMARY: AddressSanitizer: global-buffer-overflow tools/perf/util/metricgroup.c:764:2 in find_metric Shadow bytes around the buggy address: 0x0ab9a7692ae0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ab9a7692af0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ab9a7692b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ab9a7692b10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ab9a7692b20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 =>0x0ab9a7692b30: 00 00 00 00 00 00 00 00 00 00 00 00 00[f9]f9 f9 0x0ab9a7692b40: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 0x0ab9a7692b50: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 0x0ab9a7692b60: f9 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00 0x0ab9a7692b70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ab9a7692b80: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb Shadow gap: cc </quote> I'm also adding the missing "Fixes" tag and setting just .name to NULL, as doing it that way is more compact (the compiler will zero out everything else) and the table iterators look for .name being NULL as the sentinel marking the end of the table. Fixes: 0a507af ("perf tests: Add parse metric test for ipc metric") Signed-off-by: Thomas Richter <[email protected]> Reviewed-by: Sumanth Korikkar <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Vasily Gorbik <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
The evsel->unit borrows a pointer of pmu event or alias instead of owns a string. But tool event (duration_time) passes a result of strdup() caused a leak. It was found by ASAN during metric test: Direct leak of 210 byte(s) in 70 object(s) allocated from: #0 0x7fe366fca0b5 in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x920b5) #1 0x559fbbcc6ea3 in add_event_tool util/parse-events.c:414 #2 0x559fbbcc6ea3 in parse_events_add_tool util/parse-events.c:1414 #3 0x559fbbd8474d in parse_events_parse util/parse-events.y:439 #4 0x559fbbcc95da in parse_events__scanner util/parse-events.c:2096 #5 0x559fbbcc95da in __parse_events util/parse-events.c:2141 #6 0x559fbbc28555 in check_parse_id tests/pmu-events.c:406 #7 0x559fbbc28555 in check_parse_id tests/pmu-events.c:393 #8 0x559fbbc28555 in check_parse_cpu tests/pmu-events.c:415 #9 0x559fbbc28555 in test_parsing tests/pmu-events.c:498 #10 0x559fbbc0109b in run_test tests/builtin-test.c:410 #11 0x559fbbc0109b in test_and_print tests/builtin-test.c:440 #12 0x559fbbc03e69 in __cmd_test tests/builtin-test.c:695 #13 0x559fbbc03e69 in cmd_test tests/builtin-test.c:807 #14 0x559fbbc691f4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312 #15 0x559fbbb071a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364 #16 0x559fbbb071a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408 #17 0x559fbbb071a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538 #18 0x7fe366b68cc9 in __libc_start_main ../csu/libc-start.c:308 Fixes: f0fbb11 ("perf stat: Implement duration_time as a proper event") Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
The test_generic_metric() missed to release entries in the pctx. Asan reported following leak (and more): Direct leak of 128 byte(s) in 1 object(s) allocated from: #0 0x7f4c9396980e in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10780e) #1 0x55f7e748cc14 in hashmap_grow (/home/namhyung/project/linux/tools/perf/perf+0x90cc14) #2 0x55f7e748d497 in hashmap__insert (/home/namhyung/project/linux/tools/perf/perf+0x90d497) #3 0x55f7e7341667 in hashmap__set /home/namhyung/project/linux/tools/perf/util/hashmap.h:111 #4 0x55f7e7341667 in expr__add_ref util/expr.c:120 #5 0x55f7e7292436 in prepare_metric util/stat-shadow.c:783 #6 0x55f7e729556d in test_generic_metric util/stat-shadow.c:858 #7 0x55f7e712390b in compute_single tests/parse-metric.c:128 #8 0x55f7e712390b in __compute_metric tests/parse-metric.c:180 #9 0x55f7e712446d in compute_metric tests/parse-metric.c:196 #10 0x55f7e712446d in test_dcache_l2 tests/parse-metric.c:295 #11 0x55f7e712446d in test__parse_metric tests/parse-metric.c:355 #12 0x55f7e70be09b in run_test tests/builtin-test.c:410 #13 0x55f7e70be09b in test_and_print tests/builtin-test.c:440 #14 0x55f7e70c101a in __cmd_test tests/builtin-test.c:661 #15 0x55f7e70c101a in cmd_test tests/builtin-test.c:807 #16 0x55f7e7126214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312 #17 0x55f7e6fc41a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364 #18 0x55f7e6fc41a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408 #19 0x55f7e6fc41a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538 #20 0x7f4c93492cc9 in __libc_start_main ../csu/libc-start.c:308 Fixes: 6d432c4 ("perf tools: Add test_generic_metric function") Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
The metricgroup__add_metric() can find multiple match for a metric group and it's possible to fail. Also it can fail in the middle like in resolve_metric() even for single metric. In those cases, the intermediate list and ids will be leaked like: Direct leak of 3 byte(s) in 1 object(s) allocated from: #0 0x7f4c938f40b5 in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x920b5) #1 0x55f7e71c1bef in __add_metric util/metricgroup.c:683 #2 0x55f7e71c31d0 in add_metric util/metricgroup.c:906 #3 0x55f7e71c3844 in metricgroup__add_metric util/metricgroup.c:940 #4 0x55f7e71c488d in metricgroup__add_metric_list util/metricgroup.c:993 #5 0x55f7e71c488d in parse_groups util/metricgroup.c:1045 #6 0x55f7e71c60a4 in metricgroup__parse_groups_test util/metricgroup.c:1087 #7 0x55f7e71235ae in __compute_metric tests/parse-metric.c:164 #8 0x55f7e7124650 in compute_metric tests/parse-metric.c:196 #9 0x55f7e7124650 in test_recursion_fail tests/parse-metric.c:318 #10 0x55f7e7124650 in test__parse_metric tests/parse-metric.c:356 #11 0x55f7e70be09b in run_test tests/builtin-test.c:410 #12 0x55f7e70be09b in test_and_print tests/builtin-test.c:440 #13 0x55f7e70c101a in __cmd_test tests/builtin-test.c:661 #14 0x55f7e70c101a in cmd_test tests/builtin-test.c:807 #15 0x55f7e7126214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312 #16 0x55f7e6fc41a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364 #17 0x55f7e6fc41a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408 #18 0x55f7e6fc41a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538 #19 0x7f4c93492cc9 in __libc_start_main ../csu/libc-start.c:308 Fixes: 83de0b7 ("perf metric: Collect referenced metrics in struct metric_ref_node") Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
This is a big PR, but most of it is interdependent to the rest. - Shared Rust infrastructure: `libkernel`, `libmodule`, `libcore`, `liballoc`, `libcompiler_builtins`. + The Rust modules are now much smaller since they do not contain several copies of those libraries. Our example `.ko` on release is just 12 KiB, down from 1.3 MiB. For reference: `vmlinux` on release w/ Rust is 23 MiB (compressed: 2.1 MiB) `vmlinux` on release w/o Rust is 22 MiB (compressed: 1.9 MiB) i.e. the bulk is now shared. + Multiple builtin modules are now supported since their symbols do not collide against each other (fixes #9). + Faster compilation (less crates to compile & less repetition). + We achieve this by compiling all the shared code to `.rlib`s (and the `.so` for the proc macro). For loadable modules, we need to rely on the upcoming v0 Rust mangling scheme, plus we need to export the Rust symbols needed by the `.ko`s. - Simpler, flat file structure: now a small driver may only need a single file like `drivers/char/rust_example.rs`, like in C. + All the `rust/*` and `driver/char/rust_example/*` files moved to fit in the new structure. Way less files around! - Only `rust-lang/{rust,rust-bindgen,compiler-builtins}` as dependencies. + Also helps with the faster compilation. - Offline builds, always; i.e. there is no "online compilation" anymore (fixes #17). - No more interleaved Cargo output (fixes #29). - One less nightly dependency (Cargo's `build-std`); since now we manage the cross-compilation ourselves (should fix #27). - Since now a kernel can be "Rust-enabled", a new `CONFIG_RUST` option is added to enable/disable it manually, regardless of whether one has `rustc` available or not (`CONFIG_HAS_RUST`). - Improved handling of `rustc` flags (`opt-level`, `debuginfo`, etc.), following what the user selected for C (no Cargo profiles). - Added Kconfig menu for tweaking `rustc` options, like overflow checks. - This rewrite of the Kbuild support is cleaner, i.e. less hacks in general handling paths (e.g. no more `shell readlink` for `O=`). - Duplicated the example driver 3 times so that we can test in the CI that 2 builtins and 2 loadables work, all at the same time. - Updated the quick start guide. - Updated CI `.config`s: + Add the new options and test with 2 builtins and 2 loadables. At the same time, remove the matrix test for builtin/loadable. + Debug: more things enabled (debuginfo, kgdb, unit testing, etc.) that mimic more what a developer would have. Running the CI will be slightly slower, but should be OK. + Release: disabled `EXPERT` and changed a few things to make it look more like a normal configuration. + Also update both configs to v5.9 while I was at it. (I could have split a few of these ones off into another PR, but anyway it is for the CI only and I had already done it). - Less `extern crate`s needed since we pass it via `rustc` (closer to idiomatic 2018 edition Rust code). Things to note: - There is one more nightly feature used (the new Rust mangling scheme), but we know that one will be stable (and the default one, later on). - The hack at `exports.c` to export symbols to loadable modules. - The hack at `allocator.rs` to get the `__rust_*()` functions. There are a few TODOs that we can improve later if we agree on this: - Kbuild: + Actually use the `*.d` files. + Complete `make clean`. + Support single-object compilation. + Pass `objtool` to make the ORC unwinder work. + Echo the building of the rust/* libraries and the bindgen call. - Figure out how to pick symbols to export automatically from Rust code instead of managing the list by hand. Perhaps we could use a no-op macro on the Rust code, which is then parse by a script to pick up the symbols: pub fn foo() {} export_symbol!(foo); Signed-off-by: Miguel Ojeda <[email protected]>
This is a big PR, but most of it is interdependent to the rest. - Shared Rust infrastructure: `libkernel`, `libmodule`, `libcore`, `liballoc`, `libcompiler_builtins`. + The Rust modules are now much smaller since they do not contain several copies of those libraries. Our example `.ko` on release is just 12 KiB, down from 1.3 MiB. For reference: `vmlinux` on release w/ Rust is 23 MiB (compressed: 2.1 MiB) `vmlinux` on release w/o Rust is 22 MiB (compressed: 1.9 MiB) i.e. the bulk is now shared. + Multiple builtin modules are now supported since their symbols do not collide against each other (fixes #9). + Faster compilation (less crates to compile & less repetition). + We achieve this by compiling all the shared code to `.rlib`s (and the `.so` for the proc macro). For loadable modules, we need to rely on the upcoming v0 Rust mangling scheme, plus we need to export the Rust symbols needed by the `.ko`s. - Simpler, flat file structure: now a small driver may only need a single file like `drivers/char/rust_example.rs`, like in C. + All the `rust/*` and `driver/char/rust_example/*` files moved to fit in the new structure. Way less files around! - Only `rust-lang/{rust,rust-bindgen,compiler-builtins}` as dependencies. + Also helps with the faster compilation. - Offline builds, always; i.e. there is no "online compilation" anymore (fixes #17). - No more interleaved Cargo output (fixes #29). - One less nightly dependency (Cargo's `build-std`); since now we manage the cross-compilation ourselves (should fix #27). - Since now a kernel can be "Rust-enabled", a new `CONFIG_RUST` option is added to enable/disable it manually, regardless of whether one has `rustc` available or not (`CONFIG_HAS_RUST`). - Improved handling of `rustc` flags (`opt-level`, `debuginfo`, etc.), following what the user selected for C (no Cargo profiles). - Added Kconfig menu for tweaking `rustc` options, like overflow checks. - This rewrite of the Kbuild support is cleaner, i.e. less hacks in general handling paths (e.g. no more `shell readlink` for `O=`). - Duplicated the example driver 3 times so that we can test in the CI that 2 builtins and 2 loadables work, all at the same time. - Updated the quick start guide. - Updated CI `.config`s: + Add the new options and test with 2 builtins and 2 loadables. At the same time, remove the matrix test for builtin/loadable. + Debug: more things enabled (debuginfo, kgdb, unit testing, etc.) that mimic more what a developer would have. Running the CI will be slightly slower, but should be OK. + Release: disabled `EXPERT` and changed a few things to make it look more like a normal configuration. + Also update both configs to v5.9 while I was at it. (I could have split a few of these ones off into another PR, but anyway it is for the CI only and I had already done it). - Less `extern crate`s needed since we pass it via `rustc` (closer to idiomatic 2018 edition Rust code). Things to note: - There is one more nightly feature used (the new Rust mangling scheme), but we know that one will be stable (and the default one, later on). - The hack at `exports.c` to export symbols to loadable modules. - The hack at `allocator.rs` to get the `__rust_*()` functions. There are a few TODOs that we can improve later if we agree on this: - Kbuild: + Actually use the `*.d` files. + Complete `make clean`. + Support single-object compilation. + Pass `objtool` to make the ORC unwinder work. + Echo the building of the rust/* libraries and the bindgen call. - Figure out how to pick symbols to export automatically from Rust code instead of managing the list by hand. Perhaps we could use a no-op macro on the Rust code, which is then parse by a script to pick up the symbols: pub fn foo() {} export_symbol!(foo); Signed-off-by: Miguel Ojeda <[email protected]>
This is a big PR, but most of it is interdependent to the rest. - Shared Rust infrastructure: `libkernel`, `libmodule`, `libcore`, `liballoc`, `libcompiler_builtins`. + The Rust modules are now much smaller since they do not contain several copies of those libraries. Our example `.ko` on release is just 12 KiB, down from 1.3 MiB. For reference: `vmlinux` on release w/ Rust is 23 MiB (compressed: 2.1 MiB) `vmlinux` on release w/o Rust is 22 MiB (compressed: 1.9 MiB) i.e. the bulk is now shared. + Multiple builtin modules are now supported since their symbols do not collide against each other (fixes #9). + Faster compilation (less crates to compile & less repetition). + We achieve this by compiling all the shared code to `.rlib`s (and the `.so` for the proc macro). For loadable modules, we need to rely on the upcoming v0 Rust mangling scheme, plus we need to export the Rust symbols needed by the `.ko`s. - Simpler, flat file structure: now a small driver may only need a single file like `drivers/char/rust_example.rs`, like in C. + All the `rust/*` and `driver/char/rust_example/*` files moved to fit in the new structure. Way less files around! - Only `rust-lang/{rust,rust-bindgen,compiler-builtins}` as dependencies. + Also helps with the faster compilation. - Offline builds, always; i.e. there is no "online compilation" anymore (fixes #17). - No more interleaved Cargo output (fixes #29). - One less nightly dependency (Cargo's `build-std`); since now we manage the cross-compilation ourselves (should fix #27). - Since now a kernel can be "Rust-enabled", a new `CONFIG_RUST` option is added to enable/disable it manually, regardless of whether one has `rustc` available or not (`CONFIG_HAS_RUST`). - Improved handling of `rustc` flags (`opt-level`, `debuginfo`, etc.), following what the user selected for C (no Cargo profiles). - Added Kconfig menu for tweaking `rustc` options, like overflow checks. - This rewrite of the Kbuild support is cleaner, i.e. less hacks in general handling paths (e.g. no more `shell readlink` for `O=`). - Duplicated the example driver 3 times so that we can test in the CI that 2 builtins and 2 loadables work, all at the same time. - Updated the quick start guide. - Updated CI `.config`s: + Add the new options and test with 2 builtins and 2 loadables. At the same time, remove the matrix test for builtin/loadable. + Debug: more things enabled (debuginfo, kgdb, unit testing, etc.) that mimic more what a developer would have. Running the CI will be slightly slower, but should be OK. + Release: disabled `EXPERT` and changed a few things to make it look more like a normal configuration. + Also update both configs to v5.9 while I was at it. (I could have split a few of these ones off into another PR, but anyway it is for the CI only and I had already done it). - Less `extern crate`s needed since we pass it via `rustc` (closer to idiomatic 2018 edition Rust code). Things to note: - There is one more nightly feature used (the new Rust mangling scheme), but we know that one will be stable (and the default one, later on). - The hack at `exports.c` to export symbols to loadable modules. - The hack at `allocator.rs` to get the `__rust_*()` functions. There are a few TODOs that we can improve later if we agree on this: - Kbuild: + Actually use the `*.d` files. + Complete `make clean`. + Support single-object compilation. + Pass `objtool` to make the ORC unwinder work. + Echo the building of the rust/* libraries and the bindgen call. - Figure out how to pick symbols to export automatically from Rust code instead of managing the list by hand. Perhaps we could use a no-op macro on the Rust code, which is then parse by a script to pick up the symbols: pub fn foo() {} export_symbol!(foo); Signed-off-by: Miguel Ojeda <[email protected]>
This is a big PR, but most of it is interdependent to the rest. - Shared Rust infrastructure: `libkernel`, `libmodule`, `libcore`, `liballoc`, `libcompiler_builtins`. + The Rust modules are now much smaller since they do not contain several copies of those libraries. Our example `.ko` on release is just 12 KiB, down from 1.3 MiB. For reference: `vmlinux` on release w/ Rust is 23 MiB (compressed: 2.1 MiB) `vmlinux` on release w/o Rust is 22 MiB (compressed: 1.9 MiB) i.e. the bulk is now shared. + Multiple builtin modules are now supported since their symbols do not collide against each other (fixes #9). + Faster compilation (less crates to compile & less repetition). + We achieve this by compiling all the shared code to `.rlib`s (and the `.so` for the proc macro). For loadable modules, we need to rely on the upcoming v0 Rust mangling scheme, plus we need to export the Rust symbols needed by the `.ko`s. - Simpler, flat file structure: now a small driver may only need a single file like `drivers/char/rust_example.rs`, like in C. + All the `rust/*` and `driver/char/rust_example/*` files moved to fit in the new structure. Way less files around! - Only `rust-lang/{rust,rust-bindgen,compiler-builtins}` as dependencies. + Also helps with the faster compilation. - Offline builds, always; i.e. there is no "online compilation" anymore (fixes #17). - No more interleaved Cargo output (fixes #29). - One less nightly dependency (Cargo's `build-std`); since now we manage the cross-compilation ourselves (should fix #27). - Since now a kernel can be "Rust-enabled", a new `CONFIG_RUST` option is added to enable/disable it manually, regardless of whether one has `rustc` available or not (`CONFIG_HAS_RUST`). - Improved handling of `rustc` flags (`opt-level`, `debuginfo`, etc.), following what the user selected for C (no Cargo profiles). - Added Kconfig menu for tweaking `rustc` options, like overflow checks. - This rewrite of the Kbuild support is cleaner, i.e. less hacks in general handling paths (e.g. no more `shell readlink` for `O=`). - Duplicated the example driver 3 times so that we can test in the CI that 2 builtins and 2 loadables work, all at the same time. - Updated the quick start guide. - Updated CI `.config`s: + Add the new options and test with 2 builtins and 2 loadables. At the same time, remove the matrix test for builtin/loadable. + Debug: more things enabled (debuginfo, kgdb, unit testing, etc.) that mimic more what a developer would have. Running the CI will be slightly slower, but should be OK. + Release: disabled `EXPERT` and changed a few things to make it look more like a normal configuration. + Also update both configs to v5.9 while I was at it. (I could have split a few of these ones off into another PR, but anyway it is for the CI only and I had already done it). - Less `extern crate`s needed since we pass it via `rustc` (closer to idiomatic 2018 edition Rust code). Things to note: - There is one more nightly feature used (the new Rust mangling scheme), but we know that one will be stable (and the default one, later on). - The hack at `exports.c` to export symbols to loadable modules. - The hack at `allocator.rs` to get the `__rust_*()` functions. There are a few TODOs that we can improve later if we agree on this: - Kbuild: + Actually use the `*.d` files. + Complete `make clean`. + Support single-object compilation. + Pass `objtool` to make the ORC unwinder work. + Echo the building of the rust/* libraries and the bindgen call. - Figure out how to pick symbols to export automatically from Rust code instead of managing the list by hand. Perhaps we could use a no-op macro on the Rust code, which is then parse by a script to pick up the symbols: pub fn foo() {} export_symbol!(foo); Signed-off-by: Miguel Ojeda <[email protected]>
This is a big PR, but most of it is interdependent to the rest. - Shared Rust infrastructure: `libkernel`, `libmodule`, `libcore`, `liballoc`, `libcompiler_builtins`. + The Rust modules are now much smaller since they do not contain several copies of those libraries. Our example `.ko` on release is just 12 KiB, down from 1.3 MiB. For reference: `vmlinux` on release w/ Rust is 23 MiB (compressed: 2.1 MiB) `vmlinux` on release w/o Rust is 22 MiB (compressed: 1.9 MiB) i.e. the bulk is now shared. + Multiple builtin modules are now supported since their symbols do not collide against each other (fixes #9). + Faster compilation (less crates to compile & less repetition). + We achieve this by compiling all the shared code to `.rlib`s (and the `.so` for the proc macro). For loadable modules, we need to rely on the upcoming v0 Rust mangling scheme, plus we need to export the Rust symbols needed by the `.ko`s. - Simpler, flat file structure: now a small driver may only need a single file like `drivers/char/rust_example.rs`, like in C. + All the `rust/*` and `driver/char/rust_example/*` files moved to fit in the new structure: less files around. - Only `rust-lang/{rust,rust-bindgen,compiler-builtins}` as dependencies. + Also helps with the faster compilation. - Dependency handling integration with `Kbuild`/`fixdep`. + Changes to the Rust standard library, kernel headers (bindings), `rust/` source files, `.rs` changes, etc. all trigger recompilation of the proper things. + Works as expected with parallel support (`-j`). - Proper `make clean` support. - Offline builds by default (there is no "online compilation" anymore; fixes #17). - No more interleaved Cargo output (fixes #29). - One less nightly dependency (Cargo's `build-std`); since now we manage the cross-compilation ourselves (should fix #27). - Since now a kernel can be "Rust-enabled", a new `CONFIG_RUST` option is added to enable/disable it manually, regardless of whether one has `rustc` available or not (`CONFIG_HAS_RUST`). - Improved handling of `rustc` flags (`opt-level`, `debuginfo`, etc.), following what the user selected for C (no Cargo profiles). - Added Kconfig menu for tweaking `rustc` options, like overflow checks. - This rewrite of the Kbuild support is cleaner, i.e. less hacks in general handling paths (e.g. no more `shell readlink` for `O=`). - Duplicated the example driver 3 times so that we can test in the CI that 2 builtins and 2 loadables work, all at the same time. - Updated the quick start guide. - Updated CI `.config`s: + Add the new options and test with 2 builtins and 2 loadables. At the same time, remove the matrix test for builtin/loadable. + Debug: more things enabled (debuginfo, kgdb, unit testing, etc.) that mimic more what a developer would have. Running the CI will be slightly slower, but should be OK. + Release: disabled `EXPERT` and changed a few things to make it look more like a normal configuration. + Also update both configs to v5.9 while I was at it. (I could have split a few of these ones off into another PR, but anyway it is for the CI only and I had already done it). - Less `extern crate`s needed since we pass it via `rustc` (closer to idiomatic 2018 edition Rust code). Things to note: - There is one more nightly feature used (the new Rust mangling scheme), but we know that one will be stable (and the default one, later on). - The hack at `exports.c` to export symbols to loadable modules. - The hack at `allocator.rs` to get the `__rust_*()` functions. Signed-off-by: Miguel Ojeda <[email protected]>
This is a big PR, but most of it is interdependent to the rest. - Shared Rust infrastructure: `libkernel`, `libmodule`, `libcore`, `liballoc`, `libcompiler_builtins`. + The Rust modules are now much smaller since they do not contain several copies of those libraries. Our example `.ko` on release is just 12 KiB, down from 1.3 MiB. For reference: `vmlinux` on release w/ Rust is 23 MiB (compressed: 2.1 MiB) `vmlinux` on release w/o Rust is 22 MiB (compressed: 1.9 MiB) i.e. the bulk is now shared. + Multiple builtin modules are now supported since their symbols do not collide against each other (fixes #9). + Faster compilation (less crates to compile & less repetition). + We achieve this by compiling all the shared code to `.rlib`s (and the `.so` for the proc macro). For loadable modules, we need to rely on the upcoming v0 Rust mangling scheme, plus we need to export the Rust symbols needed by the `.ko`s. - Simpler, flat file structure: now a small driver may only need a single file like `drivers/char/rust_example.rs`, like in C. + All the `rust/*` and `driver/char/rust_example/*` files moved to fit in the new structure: less files around. - Only `rust-lang/{rust,rust-bindgen,compiler-builtins}` as dependencies. + Also helps with the faster compilation. - Dependency handling integration with `Kbuild`/`fixdep`. + Changes to the Rust standard library, kernel headers (bindings), `rust/` source files, `.rs` changes, command-line changes, flag changes, etc. all trigger recompilation as needed. + Works as expected with parallel support (`-j`). - Proper `make clean` support. - Offline builds by default (there is no "online compilation" anymore; fixes #17). - No interleaved Cargo output (fixes #29). - No nightly dependency on Cargo's `build-std`; since now we manage the cross-compilation ourselves (should fix #27). - Since now a kernel can be "Rust-enabled", a new `CONFIG_RUST` option is added to enable/disable it manually, regardless of whether one has `rustc` available or not (`CONFIG_HAS_RUST`). - Improved handling of `rustc` flags (`opt-level`, `debuginfo`, etc.), following what the user selected for C (no Cargo profiles). - Added Kconfig menu for tweaking relevant `rustc` options, like overflow checks, debug assertions, optimization level, etc. - This rewrite of the Kbuild support is cleaner, i.e. less hacks in general handling paths (e.g. no more `shell readlink` for `O=`). - Duplicated the example driver 3 times so that we can test in the CI that 2 builtins and 2 loadables work, all at the same time. - Updated the quick start guide. - Updated CI `.config`s: + Add the new options and test with 2 builtins and 2 loadables. At the same time, remove the matrix test for builtin/loadable. + Debug: more things enabled (debuginfo, kgdb, unit testing, etc.) that mimic more what a developer would have. Running the CI will be slightly slower, but should be OK. + Release: disabled `EXPERT` and changed a few things to make it look more like a normal configuration. + Also update both configs to v5.9 while I was at it. (I could have split a few of these ones off into another PR, but anyway it is for the CI only and I had already done it). - Less `extern crate`s needed since we pass it via `rustc` (closer to idiomatic 2018 edition Rust code). Things to note: - There is two more nightly features used: + The new Rust mangling scheme: we know it will be stable (and the default on, later on). + The binary dep-info output: if we remove all other nightly features, this one can easily go too. - The hack at `exports.c` to export symbols to loadable modules. - The hack at `allocator.rs` to get the `__rust_*()` functions. Signed-off-by: Miguel Ojeda <[email protected]>
This is a big PR, but most of it is interdependent to the rest. - Shared Rust infrastructure: `libkernel`, `libmodule`, `libcore`, `liballoc`, `libcompiler_builtins`. + The Rust modules are now much smaller since they do not contain several copies of those libraries. Our example `.ko` on release is just 12 KiB, down from 1.3 MiB. For reference: `vmlinux` on release w/ Rust is 23 MiB (compressed: 2.1 MiB) `vmlinux` on release w/o Rust is 22 MiB (compressed: 1.9 MiB) i.e. the bulk is now shared. + Multiple builtin modules are now supported since their symbols do not collide against each other (fixes #9). + Faster compilation (less crates to compile & less repetition). + We achieve this by compiling all the shared code to `.rlib`s (and the `.so` for the proc macro). For loadable modules, we need to rely on the upcoming v0 Rust mangling scheme, plus we need to export the Rust symbols needed by the `.ko`s. - Simpler, flat file structure: now a small driver may only need a single file like `drivers/char/rust_example.rs`, like in C. + All the `rust/*` and `driver/char/rust_example/*` files moved to fit in the new structure: less files around. - Only `rust-lang/{rust,rust-bindgen,compiler-builtins}` as dependencies. + Also helps with the faster compilation. - Dependency handling integration with `Kbuild`/`fixdep`. + Changes to the Rust standard library, kernel headers (bindings), `rust/` source files, `.rs` changes, command-line changes, flag changes, etc. all trigger recompilation as needed. + Works as expected with parallel support (`-j`). - Automatic generation of the `exports.c` list: + Instead of manually handling the list, all non-local functions available in `core`, `alloc` and `kernel` are exported, so all modules should work, regardless of what they need, and without failing linking due to symbols in the manual list not existing (e.g. due to differences in config options). + They are a lot, though: * ~6k Rust symbols vs. ~4k C symbols in release. * However, 4k of those are `bindings_raw` (i.e. duplicated C ones), which shouldn't be exported. Thus we should look into making `bindings_raw` private to the crate (at the moment, the (first) Rust example requires `<kernel::bindings...::miscdevice as Default>::default`). + Licensing: * `kernel`'s symbols are exported as GPL. * `core`'s and `alloc`'s symbols are exported as non-GPL so that third-parties can build Rust modules as long as they write their own kernel support infrastructure, i.e. without taking advantage of `kernel`. This seemed to make the most sense compared to other exports from the kernel, plus it follows more closely the original licence of the crates. - Support for GCC-compiled kernels: + The generated bindings do not have meaningful differences in our release config, between GCC 10.1 and Clang 11. + Other configs (e.g. our debug one) may add/remove types and functions. That is fine unless we use them form our bindings. + However, there are config options that may not work (e.g. the randstruct GCC plugin if we use one of those structs). - Support for `arm64` architecture: + Added to the CI: BusyBox is cross-compiled on the fly (increased timeout a bit to match). + Requires weakening of a few compiler builtins and adding `copy_{from,to}_user` helpers. - Proper `make clean` support. - Offline builds by default (there is no "online compilation" anymore; fixes #17). - No interleaved Cargo output (fixes #29). - No nightly dependency on Cargo's `build-std`; since now we manage the cross-compilation ourselves (should fix #27). - "Big" kallsyms symbol support: + I already raised ksym names from 128 to 256 back when I wrote the first integration. However, Rust symbols can be huge in debug/non-optimized, so I increased it again to 512; plus the module name from 56 to 248. + In turn, this required tuning the table format to support 2-byte lengths for ksyms. Compression at generation and kernel decompression is covered, although it may be the case that some script/tool also requires changes to understand the new table format. - Since now a kernel can be "Rust-enabled", a new `CONFIG_RUST` option is added to enable/disable it manually, regardless of whether one has `rustc` available or not (`CONFIG_HAS_RUST`). - Improved handling of `rustc` flags (`opt-level`, `debuginfo`, etc.), by default following what the user selected for C, but customizable through a Kconfig menu. As well as options for tweaking overflow checks and debug assertions. - This rewrite of the Kbuild support is cleaner, i.e. less hacks in general handling paths (e.g. no more `shell readlink` for `O=`). - Duplicated the example driver 3 times so that we can test in the CI that 2 builtins and 2 loadables work, all at the same time. - Do not export any helpers' symbols. - Updated the quick start guide. - Updated CI: + Now we always test with 2 builtins and 2 loadables Rust example drivers, removing the matrix test for builtin/loadable. + Added `toolchain` to matrix: now we test building with GCC, Clang or a full LLVM toolchain. + Added `arch` to matrix: now we test both arm64 and x86_64. + Added `rustc` to matrix: now we test with a very recent nightly as well. + Only build `output == build` once to reduce the number of combinations. + Debug x86_64 config: more things enabled (debuginfo, kgdb, unit testing, etc.) that mimic more what a developer would have. Running the CI will be slightly slower, but should be OK. Also enable `-C opt-level=0` to test that such an extreme works and also to see how much bloated everything becomes. + Release x86_64 config: disabled `EXPERT` and changed a few things to make it look more like a normal desktop configuration, although it is still pretty minimal. + The configs for arm64 are `EXPERT` and `EMBEDDED` ones, very minimal, for the particular CPU we are simulating. + Update configs to v5.10. + Use `$GITHUB_ENV` to simplify. - Less `extern crate`s needed since we pass it via `rustc` (closer to idiomatic 2018 edition Rust code). Things to note: - There is two more nightly features used: + The new Rust mangling scheme: we know it will be stable (and the default on, later on). + The binary dep-info output: if we remove all other nightly features, this one can easily go too. - The hack at `exports.c` to export symbols to loadable modules. - The hack at `allocator.rs` to get the `__rust_*()` functions. - The hack to get the proper flags for bindgen on GCC builds. Signed-off-by: Miguel Ojeda <[email protected]>
This is a big PR, but most of it is interdependent to the rest. - Shared Rust infrastructure: `libkernel`, `libmodule`, `libcore`, `liballoc`, `libcompiler_builtins`. + The Rust modules are now much smaller since they do not contain several copies of those libraries. Our example `.ko` on release is just 12 KiB, down from 1.3 MiB. For reference: `vmlinux` on release w/ Rust is 23 MiB (compressed: 2.1 MiB) `vmlinux` on release w/o Rust is 22 MiB (compressed: 1.9 MiB) i.e. the bulk is now shared. + Multiple builtin modules are now supported since their symbols do not collide against each other (fixes #9). + Faster compilation (less crates to compile & less repetition). + We achieve this by compiling all the shared code to `.rlib`s (and the `.so` for the proc macro). For loadable modules, we need to rely on the upcoming v0 Rust mangling scheme, plus we need to export the Rust symbols needed by the `.ko`s. - Simpler, flat file structure: now a small driver may only need a single file like `drivers/char/rust_example.rs`, like in C. + All the `rust/*` and `driver/char/rust_example/*` files moved to fit in the new structure: less files around. - Only `rust-lang/{rust,rust-bindgen,compiler-builtins}` as dependencies. + Also helps with the faster compilation. - Dependency handling integration with `Kbuild`/`fixdep`. + Changes to the Rust standard library, kernel headers (bindings), `rust/` source files, `.rs` changes, command-line changes, flag changes, etc. all trigger recompilation as needed. + Works as expected with parallel support (`-j`). - Automatic generation of the `exports.c` list: + Instead of manually handling the list, all non-local functions available in `core`, `alloc` and `kernel` are exported, so all modules should work, regardless of what they need, and without failing linking due to symbols in the manual list not existing (e.g. due to differences in config options). + They are a lot, though: * ~6k Rust symbols vs. ~4k C symbols in release. * However, 4k of those are `bindings_raw` (i.e. duplicated C ones), which shouldn't be exported. Thus we should look into making `bindings_raw` private to the crate (at the moment, the (first) Rust example requires `<kernel::bindings...::miscdevice as Default>::default`). + Licensing: * `kernel`'s symbols are exported as GPL. * `core`'s and `alloc`'s symbols are exported as non-GPL so that third-parties can build Rust modules as long as they write their own kernel support infrastructure, i.e. without taking advantage of `kernel`. This seemed to make the most sense compared to other exports from the kernel, plus it follows more closely the original licence of the crates. - Support for GCC-compiled kernels: + The generated bindings do not have meaningful differences in our release config, between GCC 10.1 and Clang 11. + Other configs (e.g. our debug one) may add/remove types and functions. That is fine unless we use them form our bindings. + However, there are config options that may not work (e.g. the randstruct GCC plugin if we use one of those structs). - Support for `arm64` architecture: + Added to the CI: BusyBox is cross-compiled on the fly (increased timeout a bit to match). + Requires weakening of a few compiler builtins and adding `copy_{from,to}_user` helpers. - Proper `make clean` support. - Offline builds by default (there is no "online compilation" anymore; fixes #17). - No interleaved Cargo output (fixes #29). - No nightly dependency on Cargo's `build-std`; since now we manage the cross-compilation ourselves (should fix #27). - "Big" kallsyms symbol support: + I already raised ksym names from 128 to 256 back when I wrote the first integration. However, Rust symbols can be huge in debug/non-optimized, so I increased it again to 512; plus the module name from 56 to 248. + In turn, this required tuning the table format to support 2-byte lengths for ksyms. Compression at generation and kernel decompression is covered, although it may be the case that some script/tool also requires changes to understand the new table format. - Since now a kernel can be "Rust-enabled", a new `CONFIG_RUST` option is added to enable/disable it manually, regardless of whether one has `rustc` available or not (`CONFIG_HAS_RUST`). - Improved handling of `rustc` flags (`opt-level`, `debuginfo`, etc.), by default following what the user selected for C, but customizable through a Kconfig menu. As well as options for tweaking overflow checks and debug assertions. - This rewrite of the Kbuild support is cleaner, i.e. less hacks in general handling paths (e.g. no more `shell readlink` for `O=`). - Duplicated the example driver 3 times so that we can test in the CI that 2 builtins and 2 loadables work, all at the same time. - Do not export any helpers' symbols. - Updated the quick start guide. - Updated CI: + Now we always test with 2 builtins and 2 loadables Rust example drivers, removing the matrix test for builtin/loadable. + Added `toolchain` to matrix: now we test building with GCC, Clang or a full LLVM toolchain. + Added `arch` to matrix: now we test both arm64 and x86_64. + Added `rustc` to matrix: now we test with a very recent nightly as well. + Only build `output == build` once to reduce the number of combinations. + Debug x86_64 config: more things enabled (debuginfo, kgdb, unit testing, etc.) that mimic more what a developer would have. Running the CI will be slightly slower, but should be OK. Also enable `-C opt-level=0` to test that such an extreme works and also to see how much bloated everything becomes. + Release x86_64 config: disabled `EXPERT` and changed a few things to make it look more like a normal desktop configuration, although it is still pretty minimal. + The configs for arm64 are `EXPERT` and `EMBEDDED` ones, very minimal, for the particular CPU we are simulating. + Update configs to v5.10. + Use `$GITHUB_ENV` to simplify. - Less `extern crate`s needed since we pass it via `rustc` (closer to idiomatic 2018 edition Rust code). Things to note: - There is two more nightly features used: + The new Rust mangling scheme: we know it will be stable (and the default on, later on). + The binary dep-info output: if we remove all other nightly features, this one can easily go too. - The hack at `exports.c` to export symbols to loadable modules. - The hack at `allocator.rs` to get the `__rust_*()` functions. - The hack to get the proper flags for bindgen on GCC builds. Signed-off-by: Miguel Ojeda <[email protected]>
This is a big PR, but most of it is interdependent to the rest. - Shared Rust infrastructure: `libkernel`, `libmodule`, `libcore`, `liballoc`, `libcompiler_builtins`. + The Rust modules are now much smaller since they do not contain several copies of those libraries. Our example `.ko` on release is just 12 KiB, down from 1.3 MiB. For reference: `vmlinux` on release w/ Rust is 23 MiB (compressed: 2.1 MiB) `vmlinux` on release w/o Rust is 22 MiB (compressed: 1.9 MiB) i.e. the bulk is now shared. + Multiple builtin modules are now supported since their symbols do not collide against each other (fixes #9). + Faster compilation (less crates to compile & less repetition). + We achieve this by compiling all the shared code to `.rlib`s (and the `.so` for the proc macro). For loadable modules, we need to rely on the upcoming v0 Rust mangling scheme, plus we need to export the Rust symbols needed by the `.ko`s. - Simpler, flat file structure: now a small driver may only need a single file like `drivers/char/rust_example.rs`, like in C. + All the `rust/*` and `driver/char/rust_example/*` files moved to fit in the new structure: less files around. - Only `rust-lang/{rust,rust-bindgen,compiler-builtins}` as dependencies. + Also helps with the faster compilation. - Dependency handling integration with `Kbuild`/`fixdep`. + Changes to the Rust standard library, kernel headers (bindings), `rust/` source files, `.rs` changes, command-line changes, flag changes, etc. all trigger recompilation as needed. + Works as expected with parallel support (`-j`). - Automatic generation of the `exports.c` list: + Instead of manually handling the list, all non-local functions available in `core`, `alloc` and `kernel` are exported, so all modules should work, regardless of what they need, and without failing linking due to symbols in the manual list not existing (e.g. due to differences in config options). + They are a lot, though: * ~6k Rust symbols vs. ~4k C symbols in release. * However, 4k of those are `bindings_raw` (i.e. duplicated C ones), which shouldn't be exported. Thus we should look into making `bindings_raw` private to the crate (at the moment, the (first) Rust example requires `<kernel::bindings...::miscdevice as Default>::default`). + Licensing: * `kernel`'s symbols are exported as GPL. * `core`'s and `alloc`'s symbols are exported as non-GPL so that third-parties can build Rust modules as long as they write their own kernel support infrastructure, i.e. without taking advantage of `kernel`. This seemed to make the most sense compared to other exports from the kernel, plus it follows more closely the original licence of the crates. - Support for GCC-compiled kernels: + The generated bindings do not have meaningful differences in our release config, between GCC 10.1 and Clang 11. + Other configs (e.g. our debug one) may add/remove types and functions. That is fine unless we use them form our bindings. + However, there are config options that may not work (e.g. the randstruct GCC plugin if we use one of those structs). - Support for `arm64` architecture: + Added to the CI: BusyBox is cross-compiled on the fly (increased timeout a bit to match). + Requires weakening of a few compiler builtins and adding `copy_{from,to}_user` helpers. - Proper `make clean` support. - Offline builds by default (there is no "online compilation" anymore; fixes #17). - No interleaved Cargo output (fixes #29). - No nightly dependency on Cargo's `build-std`; since now we manage the cross-compilation ourselves (should fix #27). - "Big" kallsyms symbol support: + I already raised ksym names from 128 to 256 back when I wrote the first integration. However, Rust symbols can be huge in debug/non-optimized, so I increased it again to 512; plus the module name from 56 to 248. + In turn, this required tuning the table format to support 2-byte lengths for ksyms. Compression at generation and kernel decompression is covered, although it may be the case that some script/tool also requires changes to understand the new table format. - Since now a kernel can be "Rust-enabled", a new `CONFIG_RUST` option is added to enable/disable it manually, regardless of whether one has `rustc` available or not (`CONFIG_HAS_RUST`). - Improved handling of `rustc` flags (`opt-level`, `debuginfo`, etc.), by default following what the user selected for C, but customizable through a Kconfig menu. As well as options for tweaking overflow checks and debug assertions. - This rewrite of the Kbuild support is cleaner, i.e. less hacks in general handling paths (e.g. no more `shell readlink` for `O=`). - Duplicated the example driver 3 times so that we can test in the CI that 2 builtins and 2 loadables work, all at the same time. - Do not export any helpers' symbols. - Updated the quick start guide. - Updated CI: + Now we always test with 2 builtins and 2 loadables Rust example drivers, removing the matrix test for builtin/loadable. + Added `toolchain` to matrix: now we test building with GCC, Clang or a full LLVM toolchain. + Added `arch` to matrix: now we test both arm64 and x86_64. + Added `rustc` to matrix: now we test with a very recent nightly as well. + Only build `output == build` once to reduce the number of combinations. + Debug x86_64 config: more things enabled (debuginfo, kgdb, unit testing, etc.) that mimic more what a developer would have. Running the CI will be slightly slower, but should be OK. Also enable `-C opt-level=0` to test that such an extreme works and also to see how much bloated everything becomes. + Release x86_64 config: disabled `EXPERT` and changed a few things to make it look more like a normal desktop configuration, although it is still pretty minimal. + The configs for arm64 are `EXPERT` and `EMBEDDED` ones, very minimal, for the particular CPU we are simulating. + Update configs to v5.10. + Use `$GITHUB_ENV` to simplify. - Less `extern crate`s needed since we pass it via `rustc` (closer to idiomatic 2018 edition Rust code). Things to note: - There is two more nightly features used: + The new Rust mangling scheme: we know it will be stable (and the default on, later on). + The binary dep-info output: if we remove all other nightly features, this one can easily go too. - The hack at `exports.c` to export symbols to loadable modules. - The hack at `allocator.rs` to get the `__rust_*()` functions. - The hack to get the proper flags for bindgen on GCC builds. Signed-off-by: Miguel Ojeda <[email protected]>
This is a big PR, but most of it is interdependent to the rest. - Shared Rust infrastructure: `libkernel`, `libmodule`, `libcore`, `liballoc`, `libcompiler_builtins`. + The Rust modules are now much smaller since they do not contain several copies of those libraries. Our example `.ko` on release is just 12 KiB, down from 1.3 MiB. For reference: `vmlinux` on release w/ Rust is 23 MiB (compressed: 2.1 MiB) `vmlinux` on release w/o Rust is 22 MiB (compressed: 1.9 MiB) i.e. the bulk is now shared. + Multiple builtin modules are now supported since their symbols do not collide against each other (fixes #9). + Faster compilation (less crates to compile & less repetition). + We achieve this by compiling all the shared code to `.rlib`s (and the `.so` for the proc macro). For loadable modules, we need to rely on the upcoming v0 Rust mangling scheme, plus we need to export the Rust symbols needed by the `.ko`s. - Simpler, flat file structure: now a small driver may only need a single file like `drivers/char/rust_example.rs`, like in C. + All the `rust/*` and `driver/char/rust_example/*` files moved to fit in the new structure: less files around. - Only `rust-lang/{rust,rust-bindgen,compiler-builtins}` as dependencies. + Also helps with the faster compilation. - Dependency handling integration with `Kbuild`/`fixdep`. + Changes to the Rust standard library, kernel headers (bindings), `rust/` source files, `.rs` changes, command-line changes, flag changes, etc. all trigger recompilation as needed. + Works as expected with parallel support (`-j`). - Automatic generation of the `exports.c` list: + Instead of manually handling the list, all non-local functions available in `core`, `alloc` and `kernel` are exported, so all modules should work, regardless of what they need, and without failing linking due to symbols in the manual list not existing (e.g. due to differences in config options). + They are a lot, though: * ~6k Rust symbols vs. ~4k C symbols in release. * However, 4k of those are `bindings_raw` (i.e. duplicated C ones), which shouldn't be exported. Thus we should look into making `bindings_raw` private to the crate (at the moment, the (first) Rust example requires `<kernel::bindings...::miscdevice as Default>::default`). + Licensing: * `kernel`'s symbols are exported as GPL. * `core`'s and `alloc`'s symbols are exported as non-GPL so that third-parties can build Rust modules as long as they write their own kernel support infrastructure, i.e. without taking advantage of `kernel`. This seemed to make the most sense compared to other exports from the kernel, plus it follows more closely the original licence of the crates. - Support for GCC-compiled kernels: + The generated bindings do not have meaningful differences in our release config, between GCC 10.1 and Clang 11. + Other configs (e.g. our debug one) may add/remove types and functions. That is fine unless we use them form our bindings. + However, there are config options that may not work (e.g. the randstruct GCC plugin if we use one of those structs). - Support for `arm64` architecture: + Added to the CI: BusyBox is cross-compiled on the fly (increased timeout a bit to match). + Requires weakening of a few compiler builtins and adding `copy_{from,to}_user` helpers. - Proper `make clean` support. - Offline builds by default (there is no "online compilation" anymore; fixes #17). - No interleaved Cargo output (fixes #29). - No nightly dependency on Cargo's `build-std`; since now we manage the cross-compilation ourselves (should fix #27). - "Big" kallsyms symbol support: + I already raised ksym names from 128 to 256 back when I wrote the first integration. However, Rust symbols can be huge in debug/non-optimized, so I increased it again to 512; plus the module name from 56 to 248. + In turn, this required tuning the table format to support 2-byte lengths for ksyms. Compression at generation and kernel decompression is covered, although it may be the case that some script/tool also requires changes to understand the new table format. - Since now a kernel can be "Rust-enabled", a new `CONFIG_RUST` option is added to enable/disable it manually, regardless of whether one has `rustc` available or not (`CONFIG_HAS_RUST`). - Improved handling of `rustc` flags (`opt-level`, `debuginfo`, etc.), by default following what the user selected for C, but customizable through a Kconfig menu. As well as options for tweaking overflow checks and debug assertions. - This rewrite of the Kbuild support is cleaner, i.e. less hacks in general handling paths (e.g. no more `shell readlink` for `O=`). - Duplicated the example driver 3 times so that we can test in the CI that 2 builtins and 2 loadables work, all at the same time. - Do not export any helpers' symbols. - Updated the quick start guide. - Updated CI: + Now we always test with 2 builtins and 2 loadables Rust example drivers, removing the matrix test for builtin/loadable. + Added `toolchain` to matrix: now we test building with GCC, Clang or a full LLVM toolchain. + Added `arch` to matrix: now we test both arm64 and x86_64. + Added `rustc` to matrix: now we test with a very recent nightly as well. + Only build `output == build` once to reduce the number of combinations. + Debug x86_64 config: more things enabled (debuginfo, kgdb, unit testing, etc.) that mimic more what a developer would have. Running the CI will be slightly slower, but should be OK. Also enable `-C opt-level=0` to test that such an extreme works and also to see how much bloated everything becomes. + Release x86_64 config: disabled `EXPERT` and changed a few things to make it look more like a normal desktop configuration, although it is still pretty minimal. + The configs for arm64 are `EXPERT` and `EMBEDDED` ones, very minimal, for the particular CPU we are simulating. + Update configs to v5.10. + Use `$GITHUB_ENV` to simplify. - Less `extern crate`s needed since we pass it via `rustc` (closer to idiomatic 2018 edition Rust code). Things to note: - There is two more nightly features used: + The new Rust mangling scheme: we know it will be stable (and the default on, later on). + The binary dep-info output: if we remove all other nightly features, this one can easily go too. - The hack at `exports.c` to export symbols to loadable modules. - The hack at `allocator.rs` to get the `__rust_*()` functions. - The hack to get the proper flags for bindgen on GCC builds. Signed-off-by: Miguel Ojeda <[email protected]>
This is a big PR, but most of it is interdependent to the rest. - Shared Rust infrastructure: `libkernel`, `libmodule`, `libcore`, `liballoc`, `libcompiler_builtins`. + The Rust modules are now much smaller since they do not contain several copies of those libraries. Our example `.ko` on release is just 12 KiB, down from 1.3 MiB. For reference: `vmlinux` on release w/ Rust is 23 MiB (compressed: 2.1 MiB) `vmlinux` on release w/o Rust is 22 MiB (compressed: 1.9 MiB) i.e. the bulk is now shared. + Multiple builtin modules are now supported since their symbols do not collide against each other (fixes #9). + Faster compilation (less crates to compile & less repetition). + We achieve this by compiling all the shared code to `.rlib`s (and the `.so` for the proc macro). For loadable modules, we need to rely on the upcoming v0 Rust mangling scheme, plus we need to export the Rust symbols needed by the `.ko`s. - Simpler, flat file structure: now a small driver may only need a single file like `drivers/char/rust_example.rs`, like in C. + All the `rust/*` and `driver/char/rust_example/*` files moved to fit in the new structure: less files around. - Only `rust-lang/{rust,rust-bindgen,compiler-builtins}` as dependencies. + Also helps with the faster compilation. - Dependency handling integration with `Kbuild`/`fixdep`. + Changes to the Rust standard library, kernel headers (bindings), `rust/` source files, `.rs` changes, command-line changes, flag changes, etc. all trigger recompilation as needed. + Works as expected with parallel support (`-j`). - Automatic generation of the `exports.c` list: + Instead of manually handling the list, all non-local functions available in `core`, `alloc` and `kernel` are exported, so all modules should work, regardless of what they need, and without failing linking due to symbols in the manual list not existing (e.g. due to differences in config options). + They are a lot, though: * ~6k Rust symbols vs. ~4k C symbols in release. * However, 4k of those are `bindings_raw` (i.e. duplicated C ones), which shouldn't be exported. Thus we should look into making `bindings_raw` private to the crate (at the moment, the (first) Rust example requires `<kernel::bindings...::miscdevice as Default>::default`). + Licensing: * `kernel`'s symbols are exported as GPL. * `core`'s and `alloc`'s symbols are exported as non-GPL so that third-parties can build Rust modules as long as they write their own kernel support infrastructure, i.e. without taking advantage of `kernel`. This seemed to make the most sense compared to other exports from the kernel, plus it follows more closely the original licence of the crates. - Support for GCC-compiled kernels: + The generated bindings do not have meaningful differences in our release config, between GCC 10.1 and Clang 11. + Other configs (e.g. our debug one) may add/remove types and functions. That is fine unless we use them form our bindings. + However, there are config options that may not work (e.g. the randstruct GCC plugin if we use one of those structs). - Support for `arm64` architecture: + Added to the CI: BusyBox is cross-compiled on the fly (increased timeout a bit to match). + Requires weakening of a few compiler builtins and adding `copy_{from,to}_user` helpers. - Proper `make clean` support. - Offline builds by default (there is no "online compilation" anymore; fixes #17). - No interleaved Cargo output (fixes #29). - No nightly dependency on Cargo's `build-std`; since now we manage the cross-compilation ourselves (should fix #27). - "Big" kallsyms symbol support: + I already raised ksym names from 128 to 256 back when I wrote the first integration. However, Rust symbols can be huge in debug/non-optimized, so I increased it again to 512; plus the module name from 56 to 248. + In turn, this required tuning the table format to support 2-byte lengths for ksyms. Compression at generation and kernel decompression is covered, although it may be the case that some script/tool also requires changes to understand the new table format. - Since now a kernel can be "Rust-enabled", a new `CONFIG_RUST` option is added to enable/disable it manually, regardless of whether one has `rustc` available or not (`CONFIG_HAS_RUST`). - Improved handling of `rustc` flags (`opt-level`, `debuginfo`, etc.), by default following what the user selected for C, but customizable through a Kconfig menu. As well as options for tweaking overflow checks and debug assertions. - This rewrite of the Kbuild support is cleaner, i.e. less hacks in general handling paths (e.g. no more `shell readlink` for `O=`). - Duplicated the example driver 3 times so that we can test in the CI that 2 builtins and 2 loadables work, all at the same time. - Do not export any helpers' symbols. - Updated the quick start guide. - Updated CI: + Now we always test with 2 builtins and 2 loadables Rust example drivers, removing the matrix test for builtin/loadable. + Added `toolchain` to matrix: now we test building with GCC, Clang or a full LLVM toolchain. + Added `arch` to matrix: now we test both arm64 and x86_64. + Added `rustc` to matrix: now we test with a very recent nightly as well. + Only build `output == build` once to reduce the number of combinations. + Debug x86_64 config: more things enabled (debuginfo, kgdb, unit testing, etc.) that mimic more what a developer would have. Running the CI will be slightly slower, but should be OK. Also enable `-C opt-level=0` to test that such an extreme works and also to see how much bloated everything becomes. + Release x86_64 config: disabled `EXPERT` and changed a few things to make it look more like a normal desktop configuration, although it is still pretty minimal. + The configs for arm64 are `EXPERT` and `EMBEDDED` ones, very minimal, for the particular CPU we are simulating. + Update configs to v5.10. + Use `$GITHUB_ENV` to simplify. - Less `extern crate`s needed since we pass it via `rustc` (closer to idiomatic 2018 edition Rust code). Things to note: - There is two more nightly features used: + The new Rust mangling scheme: we know it will be stable (and the default on, later on). + The binary dep-info output: if we remove all other nightly features, this one can easily go too. - The hack at `exports.c` to export symbols to loadable modules. - The hack at `allocator.rs` to get the `__rust_*()` functions. - The hack to get the proper flags for bindgen on GCC builds. Signed-off-by: Miguel Ojeda <[email protected]>
This is a big PR, but most of it is interdependent to the rest. - Shared Rust infrastructure: `libkernel`, `libmodule`, `libcore`, `liballoc`, `libcompiler_builtins`. + The Rust modules are now much smaller since they do not contain several copies of those libraries. Our example `.ko` on release is just 12 KiB, down from 1.3 MiB. For reference: `vmlinux` on release w/ Rust is 23 MiB (compressed: 2.1 MiB) `vmlinux` on release w/o Rust is 22 MiB (compressed: 1.9 MiB) i.e. the bulk is now shared. + Multiple builtin modules are now supported since their symbols do not collide against each other (fixes #9). + Faster compilation (less crates to compile & less repetition). + We achieve this by compiling all the shared code to `.rlib`s (and the `.so` for the proc macro). For loadable modules, we need to rely on the upcoming v0 Rust mangling scheme, plus we need to export the Rust symbols needed by the `.ko`s. - Simpler, flat file structure: now a small driver may only need a single file like `drivers/char/rust_example.rs`, like in C. + All the `rust/*` and `driver/char/rust_example/*` files moved to fit in the new structure: less files around. - Only `rust-lang/{rust,rust-bindgen,compiler-builtins}` as dependencies. + Also helps with the faster compilation. - Dependency handling integration with `Kbuild`/`fixdep`. + Changes to the Rust standard library, kernel headers (bindings), `rust/` source files, `.rs` changes, command-line changes, flag changes, etc. all trigger recompilation as needed. + Works as expected with parallel support (`-j`). - Automatic generation of the `exports.c` list: + Instead of manually handling the list, all non-local functions available in `core`, `alloc` and `kernel` are exported, so all modules should work, regardless of what they need, and without failing linking due to symbols in the manual list not existing (e.g. due to differences in config options). + They are a lot, though: * ~6k Rust symbols vs. ~4k C symbols in release. * However, 4k of those are `bindings_raw` (i.e. duplicated C ones), which shouldn't be exported. Thus we should look into making `bindings_raw` private to the crate (at the moment, the (first) Rust example requires `<kernel::bindings...::miscdevice as Default>::default`). + Licensing: * `kernel`'s symbols are exported as GPL. * `core`'s and `alloc`'s symbols are exported as non-GPL so that third-parties can build Rust modules as long as they write their own kernel support infrastructure, i.e. without taking advantage of `kernel`. This seemed to make the most sense compared to other exports from the kernel, plus it follows more closely the original licence of the crates. - Support for GCC-compiled kernels: + The generated bindings do not have meaningful differences in our release config, between GCC 10.1 and Clang 11. + Other configs (e.g. our debug one) may add/remove types and functions. That is fine unless we use them form our bindings. + However, there are config options that may not work (e.g. the randstruct GCC plugin if we use one of those structs). - Support for `arm64` architecture: + Added to the CI: BusyBox is cross-compiled on the fly (increased timeout a bit to match). + Requires weakening of a few compiler builtins and adding `copy_{from,to}_user` helpers. - Support for custom `--sysroot` via `KRUSTCFLAGS`. - Proper `make clean` support. - Offline builds by default (there is no "online compilation" anymore; fixes #17). - No interleaved Cargo output (fixes #29). - No nightly dependency on Cargo's `build-std`; since now we manage the cross-compilation ourselves (should fix #27). - "Big" kallsyms symbol support: + I already raised ksym names from 128 to 256 back when I wrote the first integration. However, Rust symbols can be huge in debug/non-optimized, so I increased it again to 512; plus the module name from 56 to 248. + In turn, this required tuning the table format to support 2-byte lengths for ksyms. Compression at generation and kernel decompression is covered, although it may be the case that some script/tool also requires changes to understand the new table format. - Since now a kernel can be "Rust-enabled", a new `CONFIG_RUST` option is added to enable/disable it manually, regardless of whether one has `rustc` available or not (`CONFIG_HAS_RUST`). - Improved handling of `rustc` flags (`opt-level`, `debuginfo`, etc.), by default following what the user selected for C, but customizable through a Kconfig menu. As well as options for tweaking overflow checks and debug assertions. - This rewrite of the Kbuild support is cleaner, i.e. less hacks in general handling paths (e.g. no more `shell readlink` for `O=`). - Duplicated the example driver 3 times so that we can test in the CI that 2 builtins and 2 loadables work, all at the same time. - Do not export any helpers' symbols. - Updated the quick start guide. - Updated CI: + Now we always test with 2 builtins and 2 loadables Rust example drivers, removing the matrix test for builtin/loadable. + Added `toolchain` to matrix: now we test building with GCC, Clang or a full LLVM toolchain. + Added `arch` to matrix: now we test both arm64 and x86_64. + Added `rustc` to matrix: now we test with a very recent nightly as well. + Only build `output == build` once to reduce the number of combinations. + Debug x86_64 config: more things enabled (debuginfo, kgdb, unit testing, etc.) that mimic more what a developer would have. Running the CI will be slightly slower, but should be OK. Also enable `-C opt-level=0` to test that such an extreme works and also to see how much bloated everything becomes. + Release x86_64 config: disabled `EXPERT` and changed a few things to make it look more like a normal desktop configuration, although it is still pretty minimal. + The configs for arm64 are `EXPERT` and `EMBEDDED` ones, very minimal, for the particular CPU we are simulating. + Update configs to v5.10. + Use `$GITHUB_ENV` to simplify. - Less `extern crate`s needed since we pass it via `rustc` (closer to idiomatic 2018 edition Rust code). Things to note: - There is two more nightly features used: + The new Rust mangling scheme: we know it will be stable (and the default on, later on). + The binary dep-info output: if we remove all other nightly features, this one can easily go too. - The hack at `exports.c` to export symbols to loadable modules. - The hack at `allocator.rs` to get the `__rust_*()` functions. - The hack to get the proper flags for bindgen on GCC builds. Signed-off-by: Miguel Ojeda <[email protected]>
The evlist and the cpu/thread maps should be released together. Otherwise following error was reported by Asan. Note that this test still has memory leaks in DSOs so it still fails even after this change. I'll take a look at that too. # perf test -v 26 26: Object code reading : --- start --- test child forked, pid 154184 Looking at the vmlinux_path (8 entries long) symsrc__init: build id mismatch for vmlinux. symsrc__init: cannot get elf header. Using /proc/kcore for kernel data Using /proc/kallsyms for symbols Parsing event 'cycles' mmap size 528384B ... ================================================================= ==154184==ERROR: LeakSanitizer: detected memory leaks Direct leak of 439 byte(s) in 1 object(s) allocated from: #0 0x7fcb66e77037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154 #1 0x55ad9b7e821e in dso__new_id util/dso.c:1256 #2 0x55ad9b8cfd4a in __machine__addnew_vdso util/vdso.c:132 #3 0x55ad9b8cfd4a in machine__findnew_vdso util/vdso.c:347 #4 0x55ad9b845b7e in map__new util/map.c:176 #5 0x55ad9b8415a2 in machine__process_mmap2_event util/machine.c:1787 #6 0x55ad9b8fab16 in perf_tool__process_synth_event util/synthetic-events.c:64 #7 0x55ad9b8fab16 in perf_event__synthesize_mmap_events util/synthetic-events.c:499 #8 0x55ad9b8fbfdf in __event__synthesize_thread util/synthetic-events.c:741 #9 0x55ad9b8ff3e3 in perf_event__synthesize_thread_map util/synthetic-events.c:833 #10 0x55ad9b738585 in do_test_code_reading tests/code-reading.c:608 #11 0x55ad9b73b25d in test__code_reading tests/code-reading.c:722 #12 0x55ad9b6f28fb in run_test tests/builtin-test.c:428 #13 0x55ad9b6f28fb in test_and_print tests/builtin-test.c:458 #14 0x55ad9b6f4a53 in __cmd_test tests/builtin-test.c:679 #15 0x55ad9b6f4a53 in cmd_test tests/builtin-test.c:825 #16 0x55ad9b760cc4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313 #17 0x55ad9b5eaa88 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365 #18 0x55ad9b5eaa88 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409 #19 0x55ad9b5eaa88 in main /home/namhyung/project/linux/tools/perf/perf.c:539 #20 0x7fcb669acd09 in __libc_start_main ../csu/libc-start.c:308 ... SUMMARY: AddressSanitizer: 471 byte(s) leaked in 2 allocation(s). test child finished with 1 ---- end ---- Object code reading: FAILED! Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
It's later supposed to be either a correct address or NULL. Without the initialization, it may contain an undefined value which results in the following segmentation fault: # perf top --sort comm -g --ignore-callees=do_idle terminates with: #0 0x00007ffff56b7685 in __strlen_avx2 () from /lib64/libc.so.6 Rust-for-Linux#1 0x00007ffff55e3802 in strdup () from /lib64/libc.so.6 Rust-for-Linux#2 0x00005555558cb139 in hist_entry__init (callchain_size=<optimized out>, sample_self=true, template=0x7fffde7fb110, he=0x7fffd801c250) at util/hist.c:489 Rust-for-Linux#3 hist_entry__new (template=template@entry=0x7fffde7fb110, sample_self=sample_self@entry=true) at util/hist.c:564 Rust-for-Linux#4 0x00005555558cb4ba in hists__findnew_entry (hists=hists@entry=0x5555561d9e38, entry=entry@entry=0x7fffde7fb110, al=al@entry=0x7fffde7fb420, sample_self=sample_self@entry=true) at util/hist.c:657 Rust-for-Linux#5 0x00005555558cba1b in __hists__add_entry (hists=hists@entry=0x5555561d9e38, al=0x7fffde7fb420, sym_parent=<optimized out>, bi=bi@entry=0x0, mi=mi@entry=0x0, sample=sample@entry=0x7fffde7fb4b0, sample_self=true, ops=0x0, block_info=0x0) at util/hist.c:288 Rust-for-Linux#6 0x00005555558cbb70 in hists__add_entry (sample_self=true, sample=0x7fffde7fb4b0, mi=0x0, bi=0x0, sym_parent=<optimized out>, al=<optimized out>, hists=0x5555561d9e38) at util/hist.c:1056 Rust-for-Linux#7 iter_add_single_cumulative_entry (iter=0x7fffde7fb460, al=<optimized out>) at util/hist.c:1056 Rust-for-Linux#8 0x00005555558cc8a4 in hist_entry_iter__add (iter=iter@entry=0x7fffde7fb460, al=al@entry=0x7fffde7fb420, max_stack_depth=<optimized out>, arg=arg@entry=0x7fffffff7db0) at util/hist.c:1231 Rust-for-Linux#9 0x00005555557cdc9a in perf_event__process_sample (machine=<optimized out>, sample=0x7fffde7fb4b0, evsel=<optimized out>, event=<optimized out>, tool=0x7fffffff7db0) at builtin-top.c:842 Rust-for-Linux#10 deliver_event (qe=<optimized out>, qevent=<optimized out>) at builtin-top.c:1202 Rust-for-Linux#11 0x00005555558a9318 in do_flush (show_progress=false, oe=0x7fffffff80e0) at util/ordered-events.c:244 Rust-for-Linux#12 __ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP, timestamp=timestamp@entry=0) at util/ordered-events.c:323 Rust-for-Linux#13 0x00005555558a9789 in __ordered_events__flush (timestamp=<optimized out>, how=<optimized out>, oe=<optimized out>) at util/ordered-events.c:339 Rust-for-Linux#14 ordered_events__flush (how=OE_FLUSH__TOP, oe=0x7fffffff80e0) at util/ordered-events.c:341 Rust-for-Linux#15 ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP) at util/ordered-events.c:339 Rust-for-Linux#16 0x00005555557cd631 in process_thread (arg=0x7fffffff7db0) at builtin-top.c:1114 Rust-for-Linux#17 0x00007ffff7bb817a in start_thread () from /lib64/libpthread.so.0 Rust-for-Linux#18 0x00007ffff5656dc3 in clone () from /lib64/libc.so.6 If you look at the frame Rust-for-Linux#2, the code is: 488 if (he->srcline) { 489 he->srcline = strdup(he->srcline); 490 if (he->srcline == NULL) 491 goto err_rawdata; 492 } If he->srcline is not NULL (it is not NULL if it is uninitialized rubbish), it gets strdupped and strdupping a rubbish random string causes the problem. Also, if you look at the commit 1fb7d06, it adds the srcline property into the struct, but not initializing it everywhere needed. Committer notes: Now I see, when using --ignore-callees=do_idle we end up here at line 2189 in add_callchain_ip(): 2181 if (al.sym != NULL) { 2182 if (perf_hpp_list.parent && !*parent && 2183 symbol__match_regex(al.sym, &parent_regex)) 2184 *parent = al.sym; 2185 else if (have_ignore_callees && root_al && 2186 symbol__match_regex(al.sym, &ignore_callees_regex)) { 2187 /* Treat this symbol as the root, 2188 forgetting its callees. */ 2189 *root_al = al; 2190 callchain_cursor_reset(cursor); 2191 } 2192 } And the al that doesn't have the ->srcline field initialized will be copied to the root_al, so then, back to: 1211 int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al, 1212 int max_stack_depth, void *arg) 1213 { 1214 int err, err2; 1215 struct map *alm = NULL; 1216 1217 if (al) 1218 alm = map__get(al->map); 1219 1220 err = sample__resolve_callchain(iter->sample, &callchain_cursor, &iter->parent, 1221 iter->evsel, al, max_stack_depth); 1222 if (err) { 1223 map__put(alm); 1224 return err; 1225 } 1226 1227 err = iter->ops->prepare_entry(iter, al); 1228 if (err) 1229 goto out; 1230 1231 err = iter->ops->add_single_entry(iter, al); 1232 if (err) 1233 goto out; 1234 That al at line 1221 is what hist_entry_iter__add() (called from sample__resolve_callchain()) saw as 'root_al', and then: iter->ops->add_single_entry(iter, al); will go on with al->srcline with a bogus value, I'll add the above sequence to the cset and apply, thanks! Signed-off-by: Michael Petlan <[email protected]> CC: Milian Wolff <[email protected]> Cc: Jiri Olsa <[email protected]> Fixes: 1fb7d06 ("perf report Use srcline from callchain for hist entries") Link: https //lore.kernel.org/r/[email protected] Reported-by: Juri Lelli <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
When bringing down the netdevice or system shutdown, a panic can be triggered while accessing the sysfs path because the device is already removed. [ 755.549084] mlx5_core 0000:12:00.1: Shutdown was called [ 756.404455] mlx5_core 0000:12:00.0: Shutdown was called ... [ 757.937260] BUG: unable to handle kernel NULL pointer dereference at (null) [ 758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280 crash> bt ... PID: 12649 TASK: ffff8924108f2100 CPU: 1 COMMAND: "amsd" ... #9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778 [exception RIP: dma_pool_alloc+0x1ab] RIP: ffffffff8ee11acb RSP: ffff89240e1a3968 RFLAGS: 00010046 RAX: 0000000000000246 RBX: ffff89243d874100 RCX: 0000000000001000 RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffff89243d874090 RBP: ffff89240e1a39c0 R8: 000000000001f080 R9: ffff8905ffc03c00 R10: ffffffffc04680d4 R11: ffffffff8edde9fd R12: 00000000000080d0 R13: ffff89243d874090 R14: ffff89243d874080 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core] #11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core] #12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core] #13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core] #14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core] #15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core] #16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core] #17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46 #18 [ffff89240e1a3d48] speed_show at ffffffff8f277208 #19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3 #20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf #21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596 #22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10 #23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5 #24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff #25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f #26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92 crash> net_device.state ffff89443b0c0000 state = 0x5 (__LINK_STATE_START| __LINK_STATE_NOCARRIER) To prevent this scenario, we also make sure that the netdevice is present. Signed-off-by: suresh kumar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
When doing slub_debug test, kfence's 'test_memcache_typesafe_by_rcu' kunit test case cause a use-after-free error: BUG: KASAN: use-after-free in kobject_del+0x14/0x30 Read of size 8 at addr ffff888007679090 by task kunit_try_catch/261 CPU: 1 PID: 261 Comm: kunit_try_catch Tainted: G B N 6.0.0-rc5-next-20220916 Rust-for-Linux#17 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x34/0x48 print_address_description.constprop.0+0x87/0x2a5 print_report+0x103/0x1ed kasan_report+0xb7/0x140 kobject_del+0x14/0x30 kmem_cache_destroy+0x130/0x170 test_exit+0x1a/0x30 kunit_try_run_case+0xad/0xc0 kunit_generic_run_threadfn_adapter+0x26/0x50 kthread+0x17b/0x1b0 </TASK> The cause is inside kmem_cache_destroy(): kmem_cache_destroy acquire lock/mutex shutdown_cache schedule_work(kmem_cache_release) (if RCU flag set) release lock/mutex kmem_cache_release (if RCU flag not set) In some certain timing, the scheduled work could be run before the next RCU flag checking, which can then get a wrong value and lead to double kmem_cache_release(). Fix it by caching the RCU flag inside protected area, just like 'refcnt' Fixes: 0495e33 ("mm/slab_common: Deleting kobject in kmem_cache_destroy() without holding slab_mutex/cpu_hotplug_lock") Signed-off-by: Feng Tang <[email protected]> Reviewed-by: Hyeonggon Yoo <[email protected]> Reviewed-by: Waiman Long <[email protected]> Signed-off-by: Vlastimil Babka <[email protected]>
When a system with E810 with existing VFs gets rebooted the following hang may be observed. Pid 1 is hung in iavf_remove(), part of a network driver: PID: 1 TASK: ffff965400e5a340 CPU: 24 COMMAND: "systemd-shutdow" #0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb #1 [ffffaad04005fae8] schedule at ffffffff8b323e2d #2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc #3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930 #4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf] #5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513 #6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa #7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc #8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e #9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429 #10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4 #11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice] #12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice] #13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice] #14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1 #15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386 #16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870 #17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6 #18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159 #19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc #20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d #21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169 #22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b RIP: 00007f1baa5c13d7 RSP: 00007fffbcc55a98 RFLAGS: 00000202 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1baa5c13d7 RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead RBP: 00007fffbcc55ca0 R8: 0000000000000000 R9: 00007fffbcc54e90 R10: 00007fffbcc55050 R11: 0000000000000202 R12: 0000000000000005 R13: 0000000000000000 R14: 00007fffbcc55af0 R15: 0000000000000000 ORIG_RAX: 00000000000000a9 CS: 0033 SS: 002b During reboot all drivers PM shutdown callbacks are invoked. In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE. In ice_shutdown() the call chain above is executed, which at some point calls iavf_remove(). However iavf_remove() expects the VF to be in one of the states __IAVF_RUNNING, __IAVF_DOWN or __IAVF_INIT_FAILED. If that's not the case it sleeps forever. So if iavf_shutdown() gets invoked before iavf_remove() the system will hang indefinitely because the adapter is already in state __IAVF_REMOVE. Fix this by returning from iavf_remove() if the state is __IAVF_REMOVE, as we already went through iavf_shutdown(). Fixes: 9745780 ("iavf: Add waiting so the port is initialized in remove") Fixes: a841733 ("iavf: Fix race condition between iavf_shutdown and iavf_remove") Reported-by: Marius Cornea <[email protected]> Signed-off-by: Stefan Assmann <[email protected]> Reviewed-by: Michal Kubiak <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
Upon physical link change, firmware reports to the kernel about the change along with the details like speed, lmac_type_id, etc. Kernel derives lmac_type based on lmac_type_id received from firmware. In a few scenarios, firmware returns an invalid lmac_type_id, which is resulting in below kernel panic. This patch adds the missing validation of the lmac_type_id field. Internal error: Oops: 96000005 [#1] PREEMPT SMP [ 35.321595] Modules linked in: [ 35.328982] CPU: 0 PID: 31 Comm: kworker/0:1 Not tainted 5.4.210-g2e3169d8e1bc-dirty #17 [ 35.337014] Hardware name: Marvell CN103XX board (DT) [ 35.344297] Workqueue: events work_for_cpu_fn [ 35.352730] pstate: 40400089 (nZcv daIf +PAN -UAO) [ 35.360267] pc : strncpy+0x10/0x30 [ 35.366595] lr : cgx_link_change_handler+0x90/0x180 Fixes: 61071a8 ("octeontx2-af: Forward CGX link notifications to PFs") Signed-off-by: Hariprasad Kelam <[email protected]> Signed-off-by: Sunil Kovvuri Goutham <[email protected]> Signed-off-by: Sai Krishna <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
The following processes run into a deadlock. CPU 41 was waiting for CPU 29 to handle a CSD request while holding spinlock "crashdump_lock", but CPU 29 was hung by that spinlock with IRQs disabled. PID: 17360 TASK: ffff95c1090c5c40 CPU: 41 COMMAND: "mrdiagd" !# 0 [ffffb80edbf37b58] __read_once_size at ffffffff9b871a40 include/linux/compiler.h:185:0 !# 1 [ffffb80edbf37b58] atomic_read at ffffffff9b871a40 arch/x86/include/asm/atomic.h:27:0 !# 2 [ffffb80edbf37b58] dump_stack at ffffffff9b871a40 lib/dump_stack.c:54:0 # 3 [ffffb80edbf37b78] csd_lock_wait_toolong at ffffffff9b131ad5 kernel/smp.c:364:0 # 4 [ffffb80edbf37b78] __csd_lock_wait at ffffffff9b131ad5 kernel/smp.c:384:0 # 5 [ffffb80edbf37bf8] csd_lock_wait at ffffffff9b13267a kernel/smp.c:394:0 # 6 [ffffb80edbf37bf8] smp_call_function_many at ffffffff9b13267a kernel/smp.c:843:0 # 7 [ffffb80edbf37c50] smp_call_function at ffffffff9b13279d kernel/smp.c:867:0 # 8 [ffffb80edbf37c50] on_each_cpu at ffffffff9b13279d kernel/smp.c:976:0 # 9 [ffffb80edbf37c78] flush_tlb_kernel_range at ffffffff9b085c4b arch/x86/mm/tlb.c:742:0 #10 [ffffb80edbf37cb8] __purge_vmap_area_lazy at ffffffff9b23a1e0 mm/vmalloc.c:701:0 #11 [ffffb80edbf37ce0] try_purge_vmap_area_lazy at ffffffff9b23a2cc mm/vmalloc.c:722:0 #12 [ffffb80edbf37ce0] free_vmap_area_noflush at ffffffff9b23a2cc mm/vmalloc.c:754:0 #13 [ffffb80edbf37cf8] free_unmap_vmap_area at ffffffff9b23bb3b mm/vmalloc.c:764:0 #14 [ffffb80edbf37cf8] remove_vm_area at ffffffff9b23bb3b mm/vmalloc.c:1509:0 #15 [ffffb80edbf37d18] __vunmap at ffffffff9b23bb8a mm/vmalloc.c:1537:0 #16 [ffffb80edbf37d40] vfree at ffffffff9b23bc85 mm/vmalloc.c:1612:0 #17 [ffffb80edbf37d58] megasas_free_host_crash_buffer [megaraid_sas] at ffffffffc020b7f2 drivers/scsi/megaraid/megaraid_sas_fusion.c:3932:0 #18 [ffffb80edbf37d80] fw_crash_state_store [megaraid_sas] at ffffffffc01f804d drivers/scsi/megaraid/megaraid_sas_base.c:3291:0 #19 [ffffb80edbf37dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 #20 [ffffb80edbf37dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 #21 [ffffb80edbf37de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 #22 [ffffb80edbf37e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 #23 [ffffb80edbf37ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 #24 [ffffb80edbf37ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 #25 [ffffb80edbf37ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 #26 [ffffb80edbf37f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 #27 [ffffb80edbf37f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0 PID: 17355 TASK: ffff95c1090c3d80 CPU: 29 COMMAND: "mrdiagd" !# 0 [ffffb80f2d3c7d30] __read_once_size at ffffffff9b0f2ab0 include/linux/compiler.h:185:0 !# 1 [ffffb80f2d3c7d30] native_queued_spin_lock_slowpath at ffffffff9b0f2ab0 kernel/locking/qspinlock.c:368:0 # 2 [ffffb80f2d3c7d58] pv_queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/paravirt.h:674:0 # 3 [ffffb80f2d3c7d58] queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/qspinlock.h:53:0 # 4 [ffffb80f2d3c7d68] queued_spin_lock at ffffffff9b8961a6 include/asm-generic/qspinlock.h:90:0 # 5 [ffffb80f2d3c7d68] do_raw_spin_lock_flags at ffffffff9b8961a6 include/linux/spinlock.h:173:0 # 6 [ffffb80f2d3c7d68] __raw_spin_lock_irqsave at ffffffff9b8961a6 include/linux/spinlock_api_smp.h:122:0 # 7 [ffffb80f2d3c7d68] _raw_spin_lock_irqsave at ffffffff9b8961a6 kernel/locking/spinlock.c:160:0 # 8 [ffffb80f2d3c7d88] fw_crash_buffer_store [megaraid_sas] at ffffffffc01f8129 drivers/scsi/megaraid/megaraid_sas_base.c:3205:0 # 9 [ffffb80f2d3c7dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 #10 [ffffb80f2d3c7dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 #11 [ffffb80f2d3c7de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 #12 [ffffb80f2d3c7e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 #13 [ffffb80f2d3c7ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 #14 [ffffb80f2d3c7ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 #15 [ffffb80f2d3c7ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 #16 [ffffb80f2d3c7f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 #17 [ffffb80f2d3c7f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0 The lock is used to synchronize different sysfs operations, it doesn't protect any resource that will be touched by an interrupt. Consequently it's not required to disable IRQs. Replace the spinlock with a mutex to fix the deadlock. Signed-off-by: Junxiao Bi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Mike Christie <[email protected]> Cc: [email protected] Signed-off-by: Martin K. Petersen <[email protected]>
/ # ls /proc/rust_demo/ rust_proc_fs / # cat /proc/rust_demo/rust_proc_fs [ 43.097837] BUG: kernel NULL pointer dereference, address: 0000000000000002 [ 43.098565] #PF: supervisor instruction fetch in kernel mode [ 43.099176] #PF: error_code(0x0010) - not-present page [ 43.099832] PGD 5490067 P4D 5490067 PUD 572a067 PMD 0 [ 43.100501] Oops: 0010 [#1] PREEMPT SMP NOPTI [ 43.101051] CPU: 0 PID: 122 Comm: cat Tainted: G E 6.3.0+ Rust-for-Linux#17 [ 43.101745] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 [ 43.101745] RIP: 0010:0x2 [ 43.101745] Code: Unable to access opcode bytes at 0xffffffffffffffd8. [ 43.101745] RSP: 0018:ffff88800573fc68 EFLAGS: 00010202 [ 43.101745] RAX: ffff8880057339c0 RBX: 0000000000000001 RCX: 0000000000000000 [ 43.101745] RDX: 0000000000000000 RSI: ffff888005729700 RDI: ffff888004f5e308 [ 43.101745] RBP: ffff88800573fca0 R08: ffff88800573fc28 R09: ffff8880056b5ac8 [ 43.101745] R10: 0000000031a4a4f0 R11: 0000000000000002 R12: ffff8880056b5ac8 [ 43.101745] R13: ffff888005729700 R14: ffff888005469480 R15: ffff888004f5e308 [ 43.101745] FS: 00000000020ee3c0(0000) GS:ffff888007a00000(0000) knlGS:0000000000000000 [ 43.101745] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 43.101745] CR2: ffffffffffffffd8 CR3: 000000000561e000 CR4: 00000000000006f0 [ 43.101745] Call Trace: [ 43.101745] <TASK> [ 43.101745] ? proc_reg_open+0xf1/0x1d0 [ 43.101745] ? proc_reg_mmap+0x110/0x110 [ 43.101745] do_dentry_open+0x166/0x450 [ 43.101745] vfs_open+0x2d/0x30 [ 43.101745] path_openat+0xa8b/0xc50 [ 43.101745] do_filp_open+0xa1/0x130 [ 43.101745] ? getname_flags+0x50/0x1e0 [ 43.101745] ? alloc_fd+0x146/0x190 [ 43.101745] do_sys_openat2+0x6d/0x130 [ 43.101745] __x64_sys_openat+0x71/0x80 [ 43.101745] do_syscall_64+0x35/0x50 [ 43.101745] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 43.101745] RIP: 0033:0x4ad15b [ 43.101745] Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00 00 00 85 c0 75 67 44 89 e2 48 89 ee bf 9c ff ff ff b5 [ 43.101745] RSP: 002b:00007fffb2dcd4e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 [ 43.101745] RAX: ffffffffffffffda RBX: 00007fffb2dcd830 RCX: 00000000004ad15b [ 43.101745] RDX: 0000000000000000 RSI: 00007fffb2dcefb2 RDI: 00000000ffffff9c [ 43.101745] RBP: 00007fffb2dcefb2 R08: 0000000000000001 R09: 0000000000000000 [ 43.101745] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 43.101745] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000001 [ 43.101745] </TASK> [ 43.101745] Modules linked in: rust_proc(E) [ 43.101745] CR2: 0000000000000002 [ 43.101745] ---[ end trace 0000000000000000 ]--- [ 43.101745] RIP: 0010:0x2 [ 43.101745] Code: Unable to access opcode bytes at 0xffffffffffffffd8. [ 43.101745] RSP: 0018:ffff88800573fc68 EFLAGS: 00010202 [ 43.101745] RAX: ffff8880057339c0 RBX: 0000000000000001 RCX: 0000000000000000 [ 43.101745] RDX: 0000000000000000 RSI: ffff888005729700 RDI: ffff888004f5e308 [ 43.101745] RBP: ffff88800573fca0 R08: ffff88800573fc28 R09: ffff8880056b5ac8 [ 43.101745] R10: 0000000031a4a4f0 R11: 0000000000000002 R12: ffff8880056b5ac8 [ 43.101745] R13: ffff888005729700 R14: ffff888005469480 R15: ffff888004f5e308 [ 43.101745] FS: 00000000020ee3c0(0000) GS:ffff888007a00000(0000) knlGS:0000000000000000 [ 43.101745] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 43.101745] CR2: ffffffffffffffd8 CR3: 000000000561e000 CR4: 00000000000006f0 [ 43.101745] note: cat[122] exited with irqs disabled Killed
Fix an error detected by memory sanitizer: ``` ==4033==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x55fb0fbedfc7 in read_alias_info tools/perf/util/pmu.c:457:6 #1 0x55fb0fbea339 in check_info_data tools/perf/util/pmu.c:1434:2 #2 0x55fb0fbea339 in perf_pmu__check_alias tools/perf/util/pmu.c:1504:9 #3 0x55fb0fbdca85 in parse_events_add_pmu tools/perf/util/parse-events.c:1429:32 #4 0x55fb0f965230 in parse_events_parse tools/perf/util/parse-events.y:299:6 #5 0x55fb0fbdf6b2 in parse_events__scanner tools/perf/util/parse-events.c:1822:8 #6 0x55fb0fbdf8c1 in __parse_events tools/perf/util/parse-events.c:2094:8 #7 0x55fb0fa8ffa9 in parse_events tools/perf/util/parse-events.h:41:9 #8 0x55fb0fa8ffa9 in test_event tools/perf/tests/parse-events.c:2393:8 #9 0x55fb0fa8f458 in test__pmu_events tools/perf/tests/parse-events.c:2551:15 #10 0x55fb0fa6d93f in run_test tools/perf/tests/builtin-test.c:242:9 #11 0x55fb0fa6d93f in test_and_print tools/perf/tests/builtin-test.c:271:8 #12 0x55fb0fa6d082 in __cmd_test tools/perf/tests/builtin-test.c:442:5 #13 0x55fb0fa6d082 in cmd_test tools/perf/tests/builtin-test.c:564:9 #14 0x55fb0f942720 in run_builtin tools/perf/perf.c:322:11 #15 0x55fb0f942486 in handle_internal_command tools/perf/perf.c:375:8 #16 0x55fb0f941dab in run_argv tools/perf/perf.c:419:2 #17 0x55fb0f941dab in main tools/perf/perf.c:535:3 ``` Fixes: 7b723db ("perf pmu: Be lazy about loading event info files from sysfs") Signed-off-by: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: Kan Liang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
/ # ls /proc/rust_demo/ rust_proc_fs / # cat /proc/rust_demo/rust_proc_fs [ 43.097837] BUG: kernel NULL pointer dereference, address: 0000000000000002 [ 43.098565] #PF: supervisor instruction fetch in kernel mode [ 43.099176] #PF: error_code(0x0010) - not-present page [ 43.099832] PGD 5490067 P4D 5490067 PUD 572a067 PMD 0 [ 43.100501] Oops: 0010 [#1] PREEMPT SMP NOPTI [ 43.101051] CPU: 0 PID: 122 Comm: cat Tainted: G E 6.3.0+ Rust-for-Linux#17 [ 43.101745] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 [ 43.101745] RIP: 0010:0x2 [ 43.101745] Code: Unable to access opcode bytes at 0xffffffffffffffd8. [ 43.101745] RSP: 0018:ffff88800573fc68 EFLAGS: 00010202 [ 43.101745] RAX: ffff8880057339c0 RBX: 0000000000000001 RCX: 0000000000000000 [ 43.101745] RDX: 0000000000000000 RSI: ffff888005729700 RDI: ffff888004f5e308 [ 43.101745] RBP: ffff88800573fca0 R08: ffff88800573fc28 R09: ffff8880056b5ac8 [ 43.101745] R10: 0000000031a4a4f0 R11: 0000000000000002 R12: ffff8880056b5ac8 [ 43.101745] R13: ffff888005729700 R14: ffff888005469480 R15: ffff888004f5e308 [ 43.101745] FS: 00000000020ee3c0(0000) GS:ffff888007a00000(0000) knlGS:0000000000000000 [ 43.101745] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 43.101745] CR2: ffffffffffffffd8 CR3: 000000000561e000 CR4: 00000000000006f0 [ 43.101745] Call Trace: [ 43.101745] <TASK> [ 43.101745] ? proc_reg_open+0xf1/0x1d0 [ 43.101745] ? proc_reg_mmap+0x110/0x110 [ 43.101745] do_dentry_open+0x166/0x450 [ 43.101745] vfs_open+0x2d/0x30 [ 43.101745] path_openat+0xa8b/0xc50 [ 43.101745] do_filp_open+0xa1/0x130 [ 43.101745] ? getname_flags+0x50/0x1e0 [ 43.101745] ? alloc_fd+0x146/0x190 [ 43.101745] do_sys_openat2+0x6d/0x130 [ 43.101745] __x64_sys_openat+0x71/0x80 [ 43.101745] do_syscall_64+0x35/0x50 [ 43.101745] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 43.101745] RIP: 0033:0x4ad15b [ 43.101745] Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00 00 00 85 c0 75 67 44 89 e2 48 89 ee bf 9c ff ff ff b5 [ 43.101745] RSP: 002b:00007fffb2dcd4e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 [ 43.101745] RAX: ffffffffffffffda RBX: 00007fffb2dcd830 RCX: 00000000004ad15b [ 43.101745] RDX: 0000000000000000 RSI: 00007fffb2dcefb2 RDI: 00000000ffffff9c [ 43.101745] RBP: 00007fffb2dcefb2 R08: 0000000000000001 R09: 0000000000000000 [ 43.101745] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 43.101745] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000001 [ 43.101745] </TASK> [ 43.101745] Modules linked in: rust_proc(E) [ 43.101745] CR2: 0000000000000002 [ 43.101745] ---[ end trace 0000000000000000 ]--- [ 43.101745] RIP: 0010:0x2 [ 43.101745] Code: Unable to access opcode bytes at 0xffffffffffffffd8. [ 43.101745] RSP: 0018:ffff88800573fc68 EFLAGS: 00010202 [ 43.101745] RAX: ffff8880057339c0 RBX: 0000000000000001 RCX: 0000000000000000 [ 43.101745] RDX: 0000000000000000 RSI: ffff888005729700 RDI: ffff888004f5e308 [ 43.101745] RBP: ffff88800573fca0 R08: ffff88800573fc28 R09: ffff8880056b5ac8 [ 43.101745] R10: 0000000031a4a4f0 R11: 0000000000000002 R12: ffff8880056b5ac8 [ 43.101745] R13: ffff888005729700 R14: ffff888005469480 R15: ffff888004f5e308 [ 43.101745] FS: 00000000020ee3c0(0000) GS:ffff888007a00000(0000) knlGS:0000000000000000 [ 43.101745] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 43.101745] CR2: ffffffffffffffd8 CR3: 000000000561e000 CR4: 00000000000006f0 [ 43.101745] note: cat[122] exited with irqs disabled Killed
/ # ls /proc/rust_demo/ rust_proc_fs / # cat /proc/rust_demo/rust_proc_fs [ 43.097837] BUG: kernel NULL pointer dereference, address: 0000000000000002 [ 43.098565] #PF: supervisor instruction fetch in kernel mode [ 43.099176] #PF: error_code(0x0010) - not-present page [ 43.099832] PGD 5490067 P4D 5490067 PUD 572a067 PMD 0 [ 43.100501] Oops: 0010 [#1] PREEMPT SMP NOPTI [ 43.101051] CPU: 0 PID: 122 Comm: cat Tainted: G E 6.3.0+ Rust-for-Linux#17 [ 43.101745] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 [ 43.101745] RIP: 0010:0x2 [ 43.101745] Code: Unable to access opcode bytes at 0xffffffffffffffd8. [ 43.101745] RSP: 0018:ffff88800573fc68 EFLAGS: 00010202 [ 43.101745] RAX: ffff8880057339c0 RBX: 0000000000000001 RCX: 0000000000000000 [ 43.101745] RDX: 0000000000000000 RSI: ffff888005729700 RDI: ffff888004f5e308 [ 43.101745] RBP: ffff88800573fca0 R08: ffff88800573fc28 R09: ffff8880056b5ac8 [ 43.101745] R10: 0000000031a4a4f0 R11: 0000000000000002 R12: ffff8880056b5ac8 [ 43.101745] R13: ffff888005729700 R14: ffff888005469480 R15: ffff888004f5e308 [ 43.101745] FS: 00000000020ee3c0(0000) GS:ffff888007a00000(0000) knlGS:0000000000000000 [ 43.101745] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 43.101745] CR2: ffffffffffffffd8 CR3: 000000000561e000 CR4: 00000000000006f0 [ 43.101745] Call Trace: [ 43.101745] <TASK> [ 43.101745] ? proc_reg_open+0xf1/0x1d0 [ 43.101745] ? proc_reg_mmap+0x110/0x110 [ 43.101745] do_dentry_open+0x166/0x450 [ 43.101745] vfs_open+0x2d/0x30 [ 43.101745] path_openat+0xa8b/0xc50 [ 43.101745] do_filp_open+0xa1/0x130 [ 43.101745] ? getname_flags+0x50/0x1e0 [ 43.101745] ? alloc_fd+0x146/0x190 [ 43.101745] do_sys_openat2+0x6d/0x130 [ 43.101745] __x64_sys_openat+0x71/0x80 [ 43.101745] do_syscall_64+0x35/0x50 [ 43.101745] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 43.101745] RIP: 0033:0x4ad15b [ 43.101745] Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00 00 00 85 c0 75 67 44 89 e2 48 89 ee bf 9c ff ff ff b5 [ 43.101745] RSP: 002b:00007fffb2dcd4e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 [ 43.101745] RAX: ffffffffffffffda RBX: 00007fffb2dcd830 RCX: 00000000004ad15b [ 43.101745] RDX: 0000000000000000 RSI: 00007fffb2dcefb2 RDI: 00000000ffffff9c [ 43.101745] RBP: 00007fffb2dcefb2 R08: 0000000000000001 R09: 0000000000000000 [ 43.101745] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 43.101745] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000001 [ 43.101745] </TASK> [ 43.101745] Modules linked in: rust_proc(E) [ 43.101745] CR2: 0000000000000002 [ 43.101745] ---[ end trace 0000000000000000 ]--- [ 43.101745] RIP: 0010:0x2 [ 43.101745] Code: Unable to access opcode bytes at 0xffffffffffffffd8. [ 43.101745] RSP: 0018:ffff88800573fc68 EFLAGS: 00010202 [ 43.101745] RAX: ffff8880057339c0 RBX: 0000000000000001 RCX: 0000000000000000 [ 43.101745] RDX: 0000000000000000 RSI: ffff888005729700 RDI: ffff888004f5e308 [ 43.101745] RBP: ffff88800573fca0 R08: ffff88800573fc28 R09: ffff8880056b5ac8 [ 43.101745] R10: 0000000031a4a4f0 R11: 0000000000000002 R12: ffff8880056b5ac8 [ 43.101745] R13: ffff888005729700 R14: ffff888005469480 R15: ffff888004f5e308 [ 43.101745] FS: 00000000020ee3c0(0000) GS:ffff888007a00000(0000) knlGS:0000000000000000 [ 43.101745] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 43.101745] CR2: ffffffffffffffd8 CR3: 000000000561e000 CR4: 00000000000006f0 [ 43.101745] note: cat[122] exited with irqs disabled Killed
When creating ceq_0 during probing irdma, cqp.sc_cqp will be sent as a cqp_request to cqp->sc_cqp.sq_ring. If the request is pending when removing the irdma driver or unplugging its aux device, cqp.sc_cqp will be dereferenced as wrong struct in irdma_free_pending_cqp_request(). PID: 3669 TASK: ffff88aef892c000 CPU: 28 COMMAND: "kworker/28:0" #0 [fffffe0000549e38] crash_nmi_callback at ffffffff810e3a34 #1 [fffffe0000549e40] nmi_handle at ffffffff810788b2 #2 [fffffe0000549ea0] default_do_nmi at ffffffff8107938f #3 [fffffe0000549eb8] do_nmi at ffffffff81079582 #4 [fffffe0000549ef0] end_repeat_nmi at ffffffff82e016b4 [exception RIP: native_queued_spin_lock_slowpath+1291] RIP: ffffffff8127e72b RSP: ffff88aa841ef778 RFLAGS: 00000046 RAX: 0000000000000000 RBX: ffff88b01f849700 RCX: ffffffff8127e47e RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff83857ec0 RBP: ffff88afe3e4efc8 R8: ffffed15fc7c9dfa R9: ffffed15fc7c9dfa R10: 0000000000000001 R11: ffffed15fc7c9df9 R12: 0000000000740000 R13: ffff88b01f849708 R14: 0000000000000003 R15: ffffed1603f092e1 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000 -- <NMI exception stack> -- #5 [ffff88aa841ef778] native_queued_spin_lock_slowpath at ffffffff8127e72b #6 [ffff88aa841ef7b0] _raw_spin_lock_irqsave at ffffffff82c22aa4 #7 [ffff88aa841ef7c8] __wake_up_common_lock at ffffffff81257363 #8 [ffff88aa841ef888] irdma_free_pending_cqp_request at ffffffffa0ba12cc [irdma] #9 [ffff88aa841ef958] irdma_cleanup_pending_cqp_op at ffffffffa0ba1469 [irdma] #10 [ffff88aa841ef9c0] irdma_ctrl_deinit_hw at ffffffffa0b2989f [irdma] #11 [ffff88aa841efa28] irdma_remove at ffffffffa0b252df [irdma] #12 [ffff88aa841efae8] auxiliary_bus_remove at ffffffff8219afdb #13 [ffff88aa841efb00] device_release_driver_internal at ffffffff821882e6 #14 [ffff88aa841efb38] bus_remove_device at ffffffff82184278 #15 [ffff88aa841efb88] device_del at ffffffff82179d23 #16 [ffff88aa841efc48] ice_unplug_aux_dev at ffffffffa0eb1c14 [ice] #17 [ffff88aa841efc68] ice_service_task at ffffffffa0d88201 [ice] #18 [ffff88aa841efde8] process_one_work at ffffffff811c589a #19 [ffff88aa841efe60] worker_thread at ffffffff811c71ff #20 [ffff88aa841eff10] kthread at ffffffff811d87a0 #21 [ffff88aa841eff50] ret_from_fork at ffffffff82e0022f Fixes: 44d9e52 ("RDMA/irdma: Implement device initialization definitions") Link: https://lore.kernel.org/r/[email protected] Suggested-by: "Ismail, Mustafa" <[email protected]> Signed-off-by: Shifeng Li <[email protected]> Reviewed-by: Shiraz Saleem <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
Add test to validate BPF verifier's register range bounds tracking logic. The main bulk is a lot of auto-generated tests based on a small set of seed values for lower and upper 32 bits of full 64-bit values. Currently we validate only range vs const comparisons, but the idea is to start validating range over range comparisons in subsequent patch set. When setting up initial register ranges we treat registers as one of u64/s64/u32/s32 numeric types, and then independently perform conditional comparisons based on a potentially different u64/s64/u32/s32 types. This tests lots of tricky cases of deriving bounds information across different numeric domains. Given there are lots of auto-generated cases, we guard them behind SLOW_TESTS=1 envvar requirement, and skip them altogether otherwise. With current full set of upper/lower seed value, all supported comparison operators and all the combinations of u64/s64/u32/s32 number domains, we get about 7.7 million tests, which run in about 35 minutes on my local qemu instance without parallelization. But we also split those tests by init/cond numeric types, which allows to rely on test_progs's parallelization of tests with `-j` option, getting run time down to about 5 minutes on 8 cores. It's still something that shouldn't be run during normal test_progs run. But we can run it a reasonable time, and so perhaps a nightly CI test run (once we have it) would be a good option for this. We also add a small set of tricky conditions that came up during development and triggered various bugs or corner cases in either selftest's reimplementation of range bounds logic or in verifier's logic itself. These are fast enough to be run as part of normal test_progs test run and are great for a quick sanity checking. Let's take a look at test output to understand what's going on: $ sudo ./test_progs -t reg_bounds_crafted #191/1 reg_bounds_crafted/(u64)[0; 0xffffffff] (u64)< 0:OK ... #191/115 reg_bounds_crafted/(u64)[0; 0x17fffffff] (s32)< 0:OK ... #191/137 reg_bounds_crafted/(u64)[0xffffffff; 0x100000000] (u64)== 0:OK Each test case is uniquely and fully described by this generated string. E.g.: "(u64)[0; 0x17fffffff] (s32)< 0". This means that we initialize a register (R6) in such a way that verifier knows that it can have a value in [(u64)0; (u64)0x17fffffff] range. Another register (R7) is also set up as u64, but this time a constant (zero in this case). They then are compared using 32-bit signed < operation. Resulting TRUE/FALSE branches are evaluated (including cases where it's known that one of the branches will never be taken, in which case we validate that verifier also determines this as a dead code). Test validates that verifier's final register state matches expected state based on selftest's own reg_state logic, implemented from scratch for cross-checking purposes. These test names can be conveniently used for further debugging, and if -vv verboseness is requested we can get a corresponding verifier log (with mark_precise logs filtered out as irrelevant and distracting). Example below is slightly redacted for brevity, omitting irrelevant register output in some places, marked with [...]. $ sudo ./test_progs -a 'reg_bounds_crafted/(u32)[0; U32_MAX] (s32)< -1' -vv ... VERIFIER LOG: ======================== func#0 @0 0: R1=ctx(off=0,imm=0) R10=fp0 0: (05) goto pc+2 3: (85) call bpf_get_current_pid_tgid#14 ; R0_w=scalar() 4: (bc) w6 = w0 ; R0_w=scalar() R6_w=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff)) 5: (85) call bpf_get_current_pid_tgid#14 ; R0_w=scalar() 6: (bc) w7 = w0 ; R0_w=scalar() R7_w=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff)) 7: (b4) w1 = 0 ; R1_w=0 8: (b4) w2 = -1 ; R2=4294967295 9: (ae) if w6 < w1 goto pc-9 9: R1=0 R6=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff)) 10: (2e) if w6 > w2 goto pc-10 10: R2=4294967295 R6=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff)) 11: (b4) w1 = -1 ; R1_w=4294967295 12: (b4) w2 = -1 ; R2_w=4294967295 13: (ae) if w7 < w1 goto pc-13 ; R1_w=4294967295 R7=4294967295 14: (2e) if w7 > w2 goto pc-14 14: R2_w=4294967295 R7=4294967295 15: (bc) w0 = w6 ; [...] R6=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff)) 16: (bc) w0 = w7 ; [...] R7=4294967295 17: (ce) if w6 s< w7 goto pc+3 ; R6=scalar(id=1,smin=0,smax=umax=4294967295,smin32=-1,var_off=(0x0; 0xffffffff)) R7=4294967295 18: (bc) w0 = w6 ; [...] R6=scalar(id=1,smin=0,smax=umax=4294967295,smin32=-1,var_off=(0x0; 0xffffffff)) 19: (bc) w0 = w7 ; [...] R7=4294967295 20: (95) exit from 17 to 21: [...] 21: (bc) w0 = w6 ; [...] R6=scalar(id=1,smin=umin=umin32=2147483648,smax=umax=umax32=4294967294,smax32=-2,var_off=(0x80000000; 0x7fffffff)) 22: (bc) w0 = w7 ; [...] R7=4294967295 23: (95) exit from 13 to 1: [...] 1: [...] 1: (b7) r0 = 0 ; R0_w=0 2: (95) exit processed 24 insns (limit 1000000) max_states_per_insn 0 total_states 2 peak_states 2 mark_read 1 ===================== Verifier log above is for `(u32)[0; U32_MAX] (s32)< -1` use cases, where u32 range is used for initialization, followed by signed < operator. Note how we use w6/w7 in this case for register initialization (it would be R6/R7 for 64-bit types) and then `if w6 s< w7` for comparison at instruction #17. It will be `if R6 < R7` for 64-bit unsigned comparison. Above example gives a good impression of the overall structure of a BPF programs generated for reg_bounds tests. In the future, this "framework" can be extended to test not just conditional jumps, but also arithmetic operations. Adding randomized testing is another possibility. Some implementation notes. We basically have our own generics-like operations on numbers, where all the numbers are stored in u64, but how they are interpreted is passed as runtime argument enum num_t. Further, `struct range` represents a bounds range, and those are collected together into a minimal `struct reg_state`, which collects range bounds across all four numberical domains: u64, s64, u32, s64. Based on these primitives and `enum op` representing possible conditional operation (<, <=, >, >=, ==, !=), there is a set of generic helpers to perform "range arithmetics", which is used to maintain struct reg_state. We simulate what verifier will do for reg bounds of R6 and R7 registers using these range and reg_state primitives. Simulated information is used to determine branch taken conclusion and expected exact register state across all four number domains. Implementation of "range arithmetics" is more generic than what verifier is currently performing: it allows range over range comparisons and adjustments. This is the intended end goal of this patch set overall and verifier logic is enhanced in subsequent patches in this series to handle range vs range operations, at which point selftests are extended to validate these conditions as well. For now it's range vs const cases only. Note that tests are split into multiple groups by their numeric types for initialization of ranges and for comparison operation. This allows to use test_progs's -j parallelization to speed up tests, as we now have 16 groups of parallel running tests. Overall reduction of running time that allows is pretty good, we go down from more than 30 minutes to slightly less than 5 minutes running time. Acked-by: Eduard Zingerman <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Shung-Hsi Yu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
Petr Machata says: ==================== mlxsw: Support CFF flood mode The registers to configure to initialize a flood table differ between the controlled and CFF flood modes. In therefore needs to be an op. Add it, hook up the current init to the existing families, and invoke the op. PGT is an in-HW table that maps addresses to sets of ports. Then when some HW process needs a set of ports as an argument, instead of embedding the actual set in the dynamic configuration, what gets configured is the address referencing the set. The HW then works with the appropriate PGT entry. Among other allocations, the PGT currently contains two large blocks for bridge flooding: one for 802.1q and one for 802.1d. Within each of these blocks are three tables, for unknown-unicast, multicast and broadcast flooding: . . . | 802.1q | 802.1d | . . . | UC | MC | BC | UC | MC | BC | \______ _____/ \_____ ______/ v v FID flood vectors Thus each FID (which corresponds to an 802.1d bridge or one VLAN in an 802.1q bridge) uses three flood vectors spread across a fairly large region of PGT. This way of organizing the flood table (called "controlled") is not very flexible. E.g. to decrease a bridge scale and store more IP MC vectors, one would need to completely rewrite the bridge PGT blocks, or resort to hacks such as storing individual MC flood vectors into unused part of the bridge table. In order to address these shortcomings, Spectrum-2 and above support what is called CFF flood mode, for Compressed FID Flooding. In CFF flood mode, each FID has a little table of its own, with three entries adjacent to each other, one for unknown-UC, one for MC, one for BC. This allows for a much more fine-grained approach to PGT management, where bits of it are allocated on demand. . . . | FID | FID | FID | FID | FID | . . . |U|M|B|U|M|B|U|M|B|U|M|B|U|M|B| \_____________ _____________/ v FID flood vectors Besides the FID table organization, the CFF flood mode also impacts Router Subport (RSP) table. This table contains flood vectors for rFIDs, which are FIDs that reference front panel ports or LAGs. The RSP table contains two entries per front panel port and LAG, one for unknown-UC traffic, and one for everything else. Currently, the FW allocates and manages the table in its own part of PGT. rFIDs are marked with flood_rsp bit and managed specially. In CFF mode, rFIDs are managed as all other FIDs. The driver therefore has to allocate and maintain the flood vectors. Like with bridge FIDs, this is more work, but increases flexibility of the system. The FW currently supports both the controlled and CFF flood modes. To shed complexity, in the future it should only support CFF flood mode. Hence this patchset, which adds CFF flood mode support to mlxsw. Since mlxsw needs to maintain both the controlled mode as well as CFF mode support, we will keep the layout as compatible as possible. The bridge tables will stay in the same overall shape, just their inner organization will change from flood mode -> FID to FID -> flood mode. Likewise will RSP be kept as a contiguous block of PGT memory, as was the case when the FW maintained it. - The way FIDs get configured under the CFF flood mode differs from the currently used controlled mode. The simple approach of having several globally visible arrays for spectrum.c to statically choose from no longer works. Patch #1 thus privatizes all FID initialization and finalization logic, and exposes it as ops instead. - Patch #2 renames the ops that are specific to the controlled mode, to make room in the namespace for the CFF variants. Patch #3 extracts a helper to compute flood table base out of mlxsw_sp_fid_flood_table_mid(). - The op fid_setup configured fid_offset, i.e. the number of this FID within its family. For rFIDs in CFF mode, to determine this number, the driver will need to do fallible queries. Thus in patch #4, make the FID setup operation fallible as well. - Flood mode initialization routine differs between the controlled and CFF flood modes. The controlled mode needs to configure flood table layout, which the CFF mode does not need to do. In patch #5, move mlxsw_sp_fid_flood_table_init() up so that the following patch can make use of it. In patch #6, add an op to be invoked per table (if defined). - The current way of determining PGT allocation size depends on the number of FIDs and number of flood tables. RFIDs however have PGT footprint depending not on number of FIDs, but on number of ports and LAGs, because which ports an rFID should flood to does not depend on the FID itself, but on the port or LAG that it references. Therefore in patch #7, add FID family ops for determining PGT allocation size. - As elaborated above, layout of PGT will differ between controlled and CFF flood modes. In CFF mode, it will further differ between rFIDs and other FIDs (as described at previous patch). The way to pack the SFMR register to configure a FID will likewise differ from controlled to CFF. Thus in patches #8 and #9 add FID family ops to determine PGT base address for a FID and to pack SFMR. - Patches #10 and #11 add more bits for RSP support. In patch #10, add a new traffic type enumerator, for non-UC traffic. This is a combination of BC and MC traffic, but the way that mlxsw maps these mnemonic names to actual traffic type configurations requires that we have a new name to describe this class of traffic. Patch #11 then adds hooks necessary for RSP table maintenance. As ports come and go, and join and leave LAGs, it is necessary to update flood vectors that the rFIDs use. These new hooks will make that possible. - Patches #12, #13 and #14 introduce flood profiles. These have been implicit so far, but the way that CFF flood mode works with profile IDs requires that we make them explicit. Thus in patch #12, introduce flood profile objects as a set of flood tables that FID families then refer to. The FID code currently only uses a single flood profile. In patch #13, add a flood profile ID to flood profile objects. In patch #14, when in CFF mode, configure SFFP according to the existing flood profiles (or the one that exists as of that point). - Patches #15 and #16 add code to implement, respectively, bridge FIDs and RSP FIDs in CFF mode. - In patch #17, toggle flood_mode_prefer_cff on Spectrum-2 and above, which makes the newly-added code live. ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
Test runners on debug kernels occasionally fail with: # # RUN tls_err.13_aes_gcm.poll_partial_rec_async ... # # tls.c:1883:poll_partial_rec_async:Expected poll(&pfd, 1, 5) (0) == 1 (1) # # tls.c:1870:poll_partial_rec_async:Expected status (256) == 0 (0) # # poll_partial_rec_async: Test failed at step #17 # # FAIL tls_err.13_aes_gcm.poll_partial_rec_async # not ok 699 tls_err.13_aes_gcm.poll_partial_rec_async # # FAILED: 698 / 699 tests passed. This points to the second poll() in the test which is expected to wait for the sender to send the rest of the data. Apparently under some conditions that doesn't happen within 5ms, bump the timeout to 20ms. Fixes: 23fcb62 ("selftests: tls: add tests for poll behavior") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
An errant disk backup on my desktop got into debugfs and triggered the following deadlock scenario in the amdgpu debugfs files. The machine also hard-resets immediately after those lines are printed (although I wasn't able to reproduce that part when reading by hand): [ 1318.016074][ T1082] ====================================================== [ 1318.016607][ T1082] WARNING: possible circular locking dependency detected [ 1318.017107][ T1082] 6.8.0-rc7-00015-ge0c8221b72c0 Rust-for-Linux#17 Not tainted [ 1318.017598][ T1082] ------------------------------------------------------ [ 1318.018096][ T1082] tar/1082 is trying to acquire lock: [ 1318.018585][ T1082] ffff98c44175d6a0 (&mm->mmap_lock){++++}-{3:3}, at: __might_fault+0x40/0x80 [ 1318.019084][ T1082] [ 1318.019084][ T1082] but task is already holding lock: [ 1318.020052][ T1082] ffff98c4c13f55f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: amdgpu_debugfs_mqd_read+0x6a/0x250 [amdgpu] [ 1318.020607][ T1082] [ 1318.020607][ T1082] which lock already depends on the new lock. [ 1318.020607][ T1082] [ 1318.022081][ T1082] [ 1318.022081][ T1082] the existing dependency chain (in reverse order) is: [ 1318.023083][ T1082] [ 1318.023083][ T1082] -> Rust-for-Linux#2 (reservation_ww_class_mutex){+.+.}-{3:3}: [ 1318.024114][ T1082] __ww_mutex_lock.constprop.0+0xe0/0x12f0 [ 1318.024639][ T1082] ww_mutex_lock+0x32/0x90 [ 1318.025161][ T1082] dma_resv_lockdep+0x18a/0x330 [ 1318.025683][ T1082] do_one_initcall+0x6a/0x350 [ 1318.026210][ T1082] kernel_init_freeable+0x1a3/0x310 [ 1318.026728][ T1082] kernel_init+0x15/0x1a0 [ 1318.027242][ T1082] ret_from_fork+0x2c/0x40 [ 1318.027759][ T1082] ret_from_fork_asm+0x11/0x20 [ 1318.028281][ T1082] [ 1318.028281][ T1082] -> #1 (reservation_ww_class_acquire){+.+.}-{0:0}: [ 1318.029297][ T1082] dma_resv_lockdep+0x16c/0x330 [ 1318.029790][ T1082] do_one_initcall+0x6a/0x350 [ 1318.030263][ T1082] kernel_init_freeable+0x1a3/0x310 [ 1318.030722][ T1082] kernel_init+0x15/0x1a0 [ 1318.031168][ T1082] ret_from_fork+0x2c/0x40 [ 1318.031598][ T1082] ret_from_fork_asm+0x11/0x20 [ 1318.032011][ T1082] [ 1318.032011][ T1082] -> #0 (&mm->mmap_lock){++++}-{3:3}: [ 1318.032778][ T1082] __lock_acquire+0x14bf/0x2680 [ 1318.033141][ T1082] lock_acquire+0xcd/0x2c0 [ 1318.033487][ T1082] __might_fault+0x58/0x80 [ 1318.033814][ T1082] amdgpu_debugfs_mqd_read+0x103/0x250 [amdgpu] [ 1318.034181][ T1082] full_proxy_read+0x55/0x80 [ 1318.034487][ T1082] vfs_read+0xa7/0x360 [ 1318.034788][ T1082] ksys_read+0x70/0xf0 [ 1318.035085][ T1082] do_syscall_64+0x94/0x180 [ 1318.035375][ T1082] entry_SYSCALL_64_after_hwframe+0x46/0x4e [ 1318.035664][ T1082] [ 1318.035664][ T1082] other info that might help us debug this: [ 1318.035664][ T1082] [ 1318.036487][ T1082] Chain exists of: [ 1318.036487][ T1082] &mm->mmap_lock --> reservation_ww_class_acquire --> reservation_ww_class_mutex [ 1318.036487][ T1082] [ 1318.037310][ T1082] Possible unsafe locking scenario: [ 1318.037310][ T1082] [ 1318.037838][ T1082] CPU0 CPU1 [ 1318.038101][ T1082] ---- ---- [ 1318.038350][ T1082] lock(reservation_ww_class_mutex); [ 1318.038590][ T1082] lock(reservation_ww_class_acquire); [ 1318.038839][ T1082] lock(reservation_ww_class_mutex); [ 1318.039083][ T1082] rlock(&mm->mmap_lock); [ 1318.039328][ T1082] [ 1318.039328][ T1082] *** DEADLOCK *** [ 1318.039328][ T1082] [ 1318.040029][ T1082] 1 lock held by tar/1082: [ 1318.040259][ T1082] #0: ffff98c4c13f55f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: amdgpu_debugfs_mqd_read+0x6a/0x250 [amdgpu] [ 1318.040560][ T1082] [ 1318.040560][ T1082] stack backtrace: [ 1318.041053][ T1082] CPU: 22 PID: 1082 Comm: tar Not tainted 6.8.0-rc7-00015-ge0c8221b72c0 Rust-for-Linux#17 3316c85d50e282c5643b075d1f01a4f6365e39c2 [ 1318.041329][ T1082] Hardware name: Gigabyte Technology Co., Ltd. B650 AORUS PRO AX/B650 AORUS PRO AX, BIOS F20 12/14/2023 [ 1318.041614][ T1082] Call Trace: [ 1318.041895][ T1082] <TASK> [ 1318.042175][ T1082] dump_stack_lvl+0x4a/0x80 [ 1318.042460][ T1082] check_noncircular+0x145/0x160 [ 1318.042743][ T1082] __lock_acquire+0x14bf/0x2680 [ 1318.043022][ T1082] lock_acquire+0xcd/0x2c0 [ 1318.043301][ T1082] ? __might_fault+0x40/0x80 [ 1318.043580][ T1082] ? __might_fault+0x40/0x80 [ 1318.043856][ T1082] __might_fault+0x58/0x80 [ 1318.044131][ T1082] ? __might_fault+0x40/0x80 [ 1318.044408][ T1082] amdgpu_debugfs_mqd_read+0x103/0x250 [amdgpu 8fe2afaa910cbd7654c8cab23563a94d6caebaab] [ 1318.044749][ T1082] full_proxy_read+0x55/0x80 [ 1318.045042][ T1082] vfs_read+0xa7/0x360 [ 1318.045333][ T1082] ksys_read+0x70/0xf0 [ 1318.045623][ T1082] do_syscall_64+0x94/0x180 [ 1318.045913][ T1082] ? do_syscall_64+0xa0/0x180 [ 1318.046201][ T1082] ? lockdep_hardirqs_on+0x7d/0x100 [ 1318.046487][ T1082] ? do_syscall_64+0xa0/0x180 [ 1318.046773][ T1082] ? do_syscall_64+0xa0/0x180 [ 1318.047057][ T1082] ? do_syscall_64+0xa0/0x180 [ 1318.047337][ T1082] ? do_syscall_64+0xa0/0x180 [ 1318.047611][ T1082] entry_SYSCALL_64_after_hwframe+0x46/0x4e [ 1318.047887][ T1082] RIP: 0033:0x7f480b70a39d [ 1318.048162][ T1082] Code: 91 ba 0d 00 f7 d8 64 89 02 b8 ff ff ff ff eb b2 e8 18 a3 01 00 0f 1f 84 00 00 00 00 00 80 3d a9 3c 0e 00 00 74 17 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 5b c3 66 2e 0f 1f 84 00 00 00 00 00 53 48 83 [ 1318.048769][ T1082] RSP: 002b:00007ffde77f5c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 1318.049083][ T1082] RAX: ffffffffffffffda RBX: 0000000000000800 RCX: 00007f480b70a39d [ 1318.049392][ T1082] RDX: 0000000000000800 RSI: 000055c9f2120c00 RDI: 0000000000000008 [ 1318.049703][ T1082] RBP: 0000000000000800 R08: 000055c9f2120a94 R09: 0000000000000007 [ 1318.050011][ T1082] R10: 0000000000000000 R11: 0000000000000246 R12: 000055c9f2120c00 [ 1318.050324][ T1082] R13: 0000000000000008 R14: 0000000000000008 R15: 0000000000000800 [ 1318.050638][ T1082] </TASK> amdgpu_debugfs_mqd_read() holds a reservation when it calls put_user(), which may fault and acquire the mmap_sem. This violates the established locking order. Bounce the mqd data through a kernel buffer to get put_user() out of the illegal section. Fixes: 445d85e ("drm/amdgpu: add debugfs interface for reading MQDs") Cc: [email protected] # v6.5+ Reviewed-by: Shashank Sharma <[email protected]> Signed-off-by: Johannes Weiner <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
vhost_worker will call tun call backs to receive packets. If too many illegal packets arrives, tun_do_read will keep dumping packet contents. When console is enabled, it will costs much more cpu time to dump packet and soft lockup will be detected. net_ratelimit mechanism can be used to limit the dumping rate. PID: 33036 TASK: ffff949da6f20000 CPU: 23 COMMAND: "vhost-32980" #0 [fffffe00003fce50] crash_nmi_callback at ffffffff89249253 #1 [fffffe00003fce58] nmi_handle at ffffffff89225fa3 #2 [fffffe00003fceb0] default_do_nmi at ffffffff8922642e #3 [fffffe00003fced0] do_nmi at ffffffff8922660d #4 [fffffe00003fcef0] end_repeat_nmi at ffffffff89c01663 [exception RIP: io_serial_in+20] RIP: ffffffff89792594 RSP: ffffa655314979e8 RFLAGS: 00000002 RAX: ffffffff89792500 RBX: ffffffff8af428a0 RCX: 0000000000000000 RDX: 00000000000003fd RSI: 0000000000000005 RDI: ffffffff8af428a0 RBP: 0000000000002710 R8: 0000000000000004 R9: 000000000000000f R10: 0000000000000000 R11: ffffffff8acbf64f R12: 0000000000000020 R13: ffffffff8acbf698 R14: 0000000000000058 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #5 [ffffa655314979e8] io_serial_in at ffffffff89792594 #6 [ffffa655314979e8] wait_for_xmitr at ffffffff89793470 #7 [ffffa65531497a08] serial8250_console_putchar at ffffffff897934f6 #8 [ffffa65531497a20] uart_console_write at ffffffff8978b605 #9 [ffffa65531497a48] serial8250_console_write at ffffffff89796558 #10 [ffffa65531497ac8] console_unlock at ffffffff89316124 #11 [ffffa65531497b10] vprintk_emit at ffffffff89317c07 #12 [ffffa65531497b68] printk at ffffffff89318306 #13 [ffffa65531497bc8] print_hex_dump at ffffffff89650765 #14 [ffffa65531497ca8] tun_do_read at ffffffffc0b06c27 [tun] #15 [ffffa65531497d38] tun_recvmsg at ffffffffc0b06e34 [tun] #16 [ffffa65531497d68] handle_rx at ffffffffc0c5d682 [vhost_net] #17 [ffffa65531497ed0] vhost_worker at ffffffffc0c644dc [vhost] #18 [ffffa65531497f10] kthread at ffffffff892d2e72 #19 [ffffa65531497f50] ret_from_fork at ffffffff89c0022f Fixes: ef3db4a ("tun: avoid BUG, dump packet on GSO errors") Signed-off-by: Lei Chen <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Acked-by: Jason Wang <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
…rnel/git/netfilter/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for net-next: Patch #1 skips transaction if object type provides no .update interface. Patch #2 skips NETDEV_CHANGENAME which is unused. Patch #3 enables conntrack to handle Multicast Router Advertisements and Multicast Router Solicitations from the Multicast Router Discovery protocol (RFC4286) as untracked opposed to invalid packets. From Linus Luessing. Patch #4 updates DCCP conntracker to mark invalid as invalid, instead of dropping them, from Jason Xing. Patch #5 uses NF_DROP instead of -NF_DROP since NF_DROP is 0, also from Jason. Patch #6 removes reference in netfilter's sysctl documentation on pickup entries which were already removed by Florian Westphal. Patch #7 removes check for IPS_OFFLOAD flag to disable early drop which allows to evict entries from the conntrack table, also from Florian. Patches #8 to #16 updates nf_tables pipapo set backend to allocate the datastructure copy on-demand from preparation phase, to better deal with OOM situations where .commit step is too late to fail. Series from Florian Westphal. Patch #17 adds a selftest with packetdrill to cover conntrack TCP state transitions, also from Florian. Patch #18 use GFP_KERNEL to clone elements from control plane to avoid quick atomic reserves exhaustion with large sets, reporter refers to million entries magnitude. * tag 'nf-next-24-05-12' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: nf_tables: allow clone callbacks to sleep selftests: netfilter: add packetdrill based conntrack tests netfilter: nft_set_pipapo: remove dirty flag netfilter: nft_set_pipapo: move cloning of match info to insert/removal path netfilter: nft_set_pipapo: prepare pipapo_get helper for on-demand clone netfilter: nft_set_pipapo: merge deactivate helper into caller netfilter: nft_set_pipapo: prepare walk function for on-demand clone netfilter: nft_set_pipapo: prepare destroy function for on-demand clone netfilter: nft_set_pipapo: make pipapo_clone helper return NULL netfilter: nft_set_pipapo: move prove_locking helper around netfilter: conntrack: remove flowtable early-drop test netfilter: conntrack: documentation: remove reference to non-existent sysctl netfilter: use NF_DROP instead of -NF_DROP netfilter: conntrack: dccp: try not to drop skb in conntrack netfilter: conntrack: fix ct-state for ICMPv6 Multicast Router Discovery netfilter: nf_tables: remove NETDEV_CHANGENAME from netdev chain event handler netfilter: nf_tables: skip transaction if update object is not implemented ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
The code in ocfs2_dio_end_io_write() estimates number of necessary transaction credits using ocfs2_calc_extend_credits(). This however does not take into account that the IO could be arbitrarily large and can contain arbitrary number of extents. Extent tree manipulations do often extend the current transaction but not in all of the cases. For example if we have only single block extents in the tree, ocfs2_mark_extent_written() will end up calling ocfs2_replace_extent_rec() all the time and we will never extend the current transaction and eventually exhaust all the transaction credits if the IO contains many single block extents. Once that happens a WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to this error. This was actually triggered by one of our customers on a heavily fragmented OCFS2 filesystem. To fix the issue make sure the transaction always has enough credits for one extent insert before each call of ocfs2_mark_extent_written(). Heming Zhao said: ------ PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error" PID: xxx TASK: xxxx CPU: 5 COMMAND: "SubmitThread-CA" #0 machine_kexec at ffffffff8c069932 #1 __crash_kexec at ffffffff8c1338fa #2 panic at ffffffff8c1d69b9 #3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2] #4 __ocfs2_abort at ffffffffc0c88387 [ocfs2] #5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2] #6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2] #7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2] #8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2] #9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2] #10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2] #11 dio_complete at ffffffff8c2b9fa7 #12 do_blockdev_direct_IO at ffffffff8c2bc09f #13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2] #14 generic_file_direct_write at ffffffff8c1dcf14 #15 __generic_file_write_iter at ffffffff8c1dd07b #16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2] #17 aio_write at ffffffff8c2cc72e #18 kmem_cache_alloc at ffffffff8c248dde #19 do_io_submit at ffffffff8c2ccada #20 do_syscall_64 at ffffffff8c004984 #21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: c15471f ("ocfs2: fix sparse file & data ordering issue in direct io") Signed-off-by: Jan Kara <[email protected]> Reviewed-by: Joseph Qi <[email protected]> Reviewed-by: Heming Zhao <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
A sysfs reader can race with a device reset or removal, attempting to read device state when the device is not actually present. eg: [exception RIP: qed_get_current_link+17] Rust-for-Linux#8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede] Rust-for-Linux#9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3 Rust-for-Linux#10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4 Rust-for-Linux#11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300 Rust-for-Linux#12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c Rust-for-Linux#13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b Rust-for-Linux#14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3 Rust-for-Linux#15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1 Rust-for-Linux#16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f Rust-for-Linux#17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb crash> struct net_device.state ffff9a9d21336000 state = 5, state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100). The device is not present, note lack of __LINK_STATE_PRESENT (0b10). This is the same sort of panic as observed in commit 4224cfd ("net-sysfs: add check for netdevice being present to speed_show"). There are many other callers of __ethtool_get_link_ksettings() which don't have a device presence check. Move this check into ethtool to protect all callers. Fixes: d519e17 ("net: export device speed and duplex via sysfs") Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show") Signed-off-by: Jamie Bainbridge <[email protected]> Link: https://patch.msgid.link/8bae218864beaa44ed01628140475b9bf641c5b0.1724393671.git.jamie.bainbridge@gmail.com Signed-off-by: Jakub Kicinski <[email protected]>
During noirq suspend phase the Raspberry Pi power driver suffer of firmware property timeouts. The reason is that the IRQ of the underlying BCM2835 mailbox is disabled and rpi_firmware_property_list() will always run into a timeout [1]. Since the VideoCore side isn't consider as a wakeup source, set the IRQF_NO_SUSPEND flag for the mailbox IRQ in order to keep it enabled during suspend-resume cycle. [1] PM: late suspend of devices complete after 1.754 msecs WARNING: CPU: 0 PID: 438 at drivers/firmware/raspberrypi.c:128 rpi_firmware_property_list+0x204/0x22c Firmware transaction 0x00028001 timeout Modules linked in: CPU: 0 PID: 438 Comm: bash Tainted: G C 6.9.3-dirty #17 Hardware name: BCM2835 Call trace: unwind_backtrace from show_stack+0x18/0x1c show_stack from dump_stack_lvl+0x34/0x44 dump_stack_lvl from __warn+0x88/0xec __warn from warn_slowpath_fmt+0x7c/0xb0 warn_slowpath_fmt from rpi_firmware_property_list+0x204/0x22c rpi_firmware_property_list from rpi_firmware_property+0x68/0x8c rpi_firmware_property from rpi_firmware_set_power+0x54/0xc0 rpi_firmware_set_power from _genpd_power_off+0xe4/0x148 _genpd_power_off from genpd_sync_power_off+0x7c/0x11c genpd_sync_power_off from genpd_finish_suspend+0xcc/0xe0 genpd_finish_suspend from dpm_run_callback+0x78/0xd0 dpm_run_callback from device_suspend_noirq+0xc0/0x238 device_suspend_noirq from dpm_suspend_noirq+0xb0/0x168 dpm_suspend_noirq from suspend_devices_and_enter+0x1b8/0x5ac suspend_devices_and_enter from pm_suspend+0x254/0x2e4 pm_suspend from state_store+0xa8/0xd4 state_store from kernfs_fop_write_iter+0x154/0x1a0 kernfs_fop_write_iter from vfs_write+0x12c/0x184 vfs_write from ksys_write+0x78/0xc0 ksys_write from ret_fast_syscall+0x0/0x54 Exception stack(0xcc93dfa8 to 0xcc93dff0) [...] PM: noirq suspend of devices complete after 3095.584 msecs Link: raspberrypi/firmware#1894 Fixes: 0bae6af ("mailbox: Enable BCM2835 mailbox support") Signed-off-by: Stefan Wahren <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: Jassi Brar <[email protected]>
In binder_add_freeze_work() we iterate over the proc->nodes with the proc->inner_lock held. However, this lock is temporarily dropped in order to acquire the node->lock first (lock nesting order). This can race with binder_node_release() and trigger a use-after-free: ================================================================== BUG: KASAN: slab-use-after-free in _raw_spin_lock+0xe4/0x19c Write of size 4 at addr ffff53c04c29dd04 by task freeze/640 CPU: 5 UID: 0 PID: 640 Comm: freeze Not tainted 6.11.0-07343-ga727812a8d45 Rust-for-Linux#17 Hardware name: linux,dummy-virt (DT) Call trace: _raw_spin_lock+0xe4/0x19c binder_add_freeze_work+0x148/0x478 binder_ioctl+0x1e70/0x25ac __arm64_sys_ioctl+0x124/0x190 Allocated by task 637: __kmalloc_cache_noprof+0x12c/0x27c binder_new_node+0x50/0x700 binder_transaction+0x35ac/0x6f74 binder_thread_write+0xfb8/0x42a0 binder_ioctl+0x18f0/0x25ac __arm64_sys_ioctl+0x124/0x190 Freed by task 637: kfree+0xf0/0x330 binder_thread_read+0x1e88/0x3a68 binder_ioctl+0x16d8/0x25ac __arm64_sys_ioctl+0x124/0x190 ================================================================== Fix the race by taking a temporary reference on the node before releasing the proc->inner lock. This ensures the node remains alive while in use. Fixes: d579b04 ("binder: frozen notification") Cc: [email protected] Reviewed-by: Alice Ryhl <[email protected]> Acked-by: Todd Kjos <[email protected]> Signed-off-by: Carlos Llamas <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
Current loop calls vfs_statfs() while holding the q->limits_lock. If FS takes some locking in vfs_statfs callback, this may lead to ABBA locking bug (at least, FAT fs has this issue actually). So this patch calls vfs_statfs() outside q->limits_locks instead, because looks like no reason to hold q->limits_locks while getting discord configs. Chain exists of: &sbi->fat_lock --> &q->q_usage_counter(io)Rust-for-Linux#17 --> &q->limits_lock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&q->limits_lock); lock(&q->q_usage_counter(io)Rust-for-Linux#17); lock(&q->limits_lock); lock(&sbi->fat_lock); *** DEADLOCK *** Reported-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=a5d8c609c02f508672cc Reviewed-by: Ming Lei <[email protected]> Signed-off-by: OGAWA Hirofumi <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
Some build systems (like nix) do not allow network access while building, but then provide a download phase, where network access is allowed.
For this to work with cargo, we would have to create a cargo vendor dir, and call cargo with the flags, to not update the index and use the vendor dir.
The text was updated successfully, but these errors were encountered: