Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

syzkaller: WARNING in __mptcp_move_skbs_from_subflow #450

Closed
cpaasch opened this issue Oct 20, 2023 · 1 comment
Closed

syzkaller: WARNING in __mptcp_move_skbs_from_subflow #450

cpaasch opened this issue Oct 20, 2023 · 1 comment
Assignees
Labels
bisected Git commit introducing the bug is known bug reproducer Has a simple program to reproduce the bug syzkaller

Comments

@cpaasch
Copy link
Member

cpaasch commented Oct 20, 2023

syzkaller-id: 7bc336bf049b8e3d7efa860b71901d5e094b33ac

HEAD: cd8bdf5

Trace:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
Modules linked in:
CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 #45
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
Code: 00 00 48 3b 84 24 40 02 00 00 0f 85 ee 00 00 00 89 d8 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc e8 fc b7 59 fd <0f> 0b e9 6b ff ff ff e8 f0 b7 59 fd 43 0f b6 04 2c 84 c0 0f 85 88
RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
PKRU: 55555554
Call Trace:
 <IRQ>
netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
 mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
 subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
 tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
 tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
 tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
 tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
 ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
 ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
 ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
 __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
 process_backlog+0x353/0x660 net/core/dev.c:5974
 __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
 net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
 __do_softirq+0x184/0x524 kernel/softirq.c:553
 do_softirq+0xdd/0x130 kernel/softirq.c:454
 </IRQ>
 <TASK>
 __local_bh_enable_ip+0x7b/0x80 kernel/softirq.c:381
 local_bh_enable include/linux/bottom_half.h:33 [inline]
 rcu_read_unlock_bh include/linux/rcupdate.h:819 [inline]
 __dev_queue_xmit+0x133b/0x33c0 net/core/dev.c:4376
 dev_queue_xmit include/linux/netdevice.h:3085 [inline]
 neigh_hh_output include/net/neighbour.h:526 [inline]
 neigh_output include/net/neighbour.h:540 [inline]
 ip6_finish_output2+0xeef/0x1760 net/ipv6/ip6_output.c:135
 ip6_output+0x207/0x530 include/linux/netfilter.h:293
 ip6_xmit+0xe66/0x19f0 include/net/dst.h:458
 inet6_csk_xmit+0x2dc/0x450 net/ipv6/inet6_connection_sock.c:135
 __tcp_transmit_skb+0x1e61/0x34a0 net/ipv4/tcp_output.c:1408
 tcp_write_xmit+0x178d/0x63f0 net/ipv4/tcp_output.c:1426
 __tcp_push_pending_frames+0x95/0x2f0 net/ipv4/tcp_output.c:2928
 __mptcp_push_pending+0x4eb/0x910 net/mptcp/protocol.c:1485
 mptcp_sendmsg+0x12bd/0x1600 net/mptcp/protocol.c:1886
 __sock_sendmsg+0xa7/0x230 net/socket.c:730
 ____sys_sendmsg+0x507/0x7b0 net/socket.c:2558
 ___sys_sendmsg+0x238/0x2c0 net/socket.c:2612
 __sys_sendmmsg+0x276/0x4d0 net/socket.c:2698
 __do_sys_sendmmsg net/socket.c:2727 [inline]
 __se_sys_sendmmsg net/socket.c:2724 [inline]
 __x64_sys_sendmmsg+0xa0/0xb0 net/socket.c:2724
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x45/0xa0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8
RIP: 0033:0x45c4f9
Code: fc ff 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 3c fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f239c47dc18 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
RAX: ffffffffffffffda RBX: 000000000079c050 RCX: 000000000045c4f9
RDX: 0000000000000003 RSI: 0000000020001540 RDI: 0000000000000005
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000079c05c
R13: 0000000000021000 R14: 000000000079c050 R15: 00007f239c47e700
 </TASK>
---[ end trace 0000000000000000 ]---
------------[ cut here ]------------

Kconfig:
Kconfig_k9_kasan.txt

Reproducer:

# {Threaded:true Repeat:true RepeatTimes:0 Procs:1 Slowdown:1 Sandbox:namespace SandboxArg:0 Leak:false NetInjection:false NetDevices:true NetReset:false Cgroups:true BinfmtMisc:true CloseFDs:true KCSAN:false DevlinkPCI:false NicVF:false USB:false VhciInjection:false Wifi:false IEEE802154:false Sysctl:true Swap:false UseTmpDir:true HandleSegv:false Repro:false Trace:false LegacyOptions:{Collide:false Fault:false FaultCall:0 FaultNth:0}}
r0 = socket$nl_route(0x10, 0x3, 0x0)
sendmsg$nl_route(r0, &(0x7f0000000200)={0x0, 0x0, &(0x7f00000001c0)={&(0x7f0000000000)=@newlink={0x30, 0x10, 0x1, 0x0, 0x0, {}, [@IFLA_GROUP={0x8}, @IFLA_GSO_MAX_SIZE={0x8, 0x29, 0x7adaa}]}, 0x30}}, 0x0)
sendmmsg$inet(0xffffffffffffffff, 0x0, 0x0, 0xc001)
r1 = socket$inet6_mptcp(0xa, 0x1, 0x106)
bind$inet6(r1, &(0x7f0000000180)={0xa, 0x4e23, 0x0, @loopback}, 0x1c)
listen(r1, 0x0)
r2 = socket$inet6_mptcp(0xa, 0x1, 0x106)
connect$inet6(r2, &(0x7f00000000c0)={0xa, 0x4e23, 0x0, @loopback}, 0x63)
r3 = accept(r1, 0x0, 0x0)
recvfrom$inet6(r2, &(0x7f0000000700)=""/76, 0xe75a, 0x0, 0x0, 0x0)
sendmmsg$unix(r3, &(0x7f0000001540)=[{{0x0, 0x0, &(0x7f0000000580)=[{&(0x7f0000001c80)="e0a8985994af1767d52b44ee04125a1414e3a5fa6e85630cfcc6e0b6a0e6c201d876f0a55c20650abcd110f5c3a5d22ef08c6e01733dac21c02b29d465826372804711dcbf8725c3213d7c86c3b8b74696563c8a17f9e936e283927ef877ef6884a4acdf36c67de9b12c65fa7afdd5faaf19f9274ae022d9cb3b18693449c6c7a4b6f8220834441024ea32460b8161e7c9bc71253b9068932020611f41c87881d7156395d2fbd8ed0f80e107c9894679c4b38fc68cd4589c723954674c434b7deea70bee841931dce9a972381e6328fb681271d07ab96db8abe33625e4f0fa298d5e1f2cfc10054c9343eefa2c23018641774eb079a816f8bf4a0f454052047241cea63b6635d03661148e5f90a05cc9a8f45d68e03788483c6bd9f44248ac03770c02668533d0670dd7809ec35d01a6308a08f44b883e37189a897a0c994e2b60cf964dc062e7edc18b5dcc051d5681e9580433a7555ec1b97612208cdf3e89c0afa2765ef20a83698edb4714bf6286519333381a1a612ee9dcbff6d716ae5ed48bc49a5523364e6b6a47ed3ee84db1488013b4aa8bdb9bd518799e16c61cb60957f9d824b64eb62121865d6a991075b42b3c8f474bfa405ec108da0de5068c1f6dc175c57bc23916b2ff33dca13f26841fae1f94f73d7bec1a19831495e92ed9d1c826bd54ed53ead3b62dc68eebbb80e41fc988f4775b017ed77980cd4612186b714b7bb8262a5bef177adfcda267fef6c554bf6d8c3fd1e654ec7626c69478ccd0f8456739b1821a288194eab31a1269a9962525247f52ea1c57104edf8668cadb58cac2ea656035bac0f71278765fecbd595f1183d3eae508801f2df4372981b198e2bebd34edfe651633e7f4536c65a911fcc1fe074949457c2dd6c13bfd4138905880241e00f9d1829ca9459eea428d38d70145f77a943fac6fa15bcef85a5cd23bd04ffdd9529e3d56cd0cb30c45e211d20ded313ab39f7ed45a02941a600a7256c534ac6dc4bc82d8e2e0b6a519940b55e7548e742f1c337ca268e3dd7d42b654d80862651f612505757db3eb4bf603b625e70e734bf3f7f53e10c3d0e379f2c460e286dbe385a102c7470ac6a732fe1fd3559963af5764bd2b4b910f0ae54a6df04a76e9431e72509f212fa6e99d4bb1b320e595b93f0d728f1341793196a5a72b72b7c340e5497a11d520fdb67b5501e1bbaab0707bfb13fb85ba450628248f2f47e5e7794599207bf79c2c86ea91cf6b07faa0c6f2a85edfea6b5f6fcc2f44cd290bdecae708da2e7cf1c50535831896b73db88974de1cebd5a56e81904e9abfee0683d86145cac6234427fc8d03289a920874eb732a1c9556cc62aebc4004dd9056bf16fab33e37e053284e98248cb7f41e892cbb93998741b4426ab3ea2f8c36c0f8bc1c5a1cca54b163f031ed422f74de7adfbd77eb212c860be5231aba04db092ff9a95d9105b4c4ee0d73c13ebfa14048e4bd938609536778871cf779e20d6b4b63e33405fd52d0b0bd2a7941ab6b369002c7320839cefa006f105a9f9f060ab94ce6069c5e4a942599fb1e9a1f8674359bba6616c023ed8c6f01f60e280b1699c7ea5bc9b56284b00b5ae7911c6f18cb3c65bbbfe7106f69fd7e2ff8091d8a14c848869212fafcd50806f154694c7bf5fa22f857bab5e5bca15eff4603fb1873734a6bf1956234922bdbef5f1d92d9ec8b588072befe4cc5d89570c2e5ce8006d901a61210e8c2320cbfb80102e4da9f0592288aff72de9460def0a4b1d5f4185aae573cde262cf7d0a28aeb7483b15438688cfc77984d0bbfc36cbd912d4d30191c1444f8d538cceef6f628a73616bccf787db7939ace25f97fc987726c0e050466967567c0f9800d9372dfce70829a67975fd76de12f4df3747e524a0dd156f227cda885edb945f673c68087ad7955ac44dc45ac211a789f8bc07f7ca66257bb3cbd2d5ecbe995be6d20c963c16fc6f12ba8861546728f9555c8d8db92585ceaafb55be016b759064830e219eeb24354d5ba15a653732e6baead1dc72010bbda439d092cdab29b37f65fe001adac89ee6f5d66764fa73aba4f6a1fa6037a52312478fb695b2ad3af6a65dbcadf6e46f9816085fc7b4732fe3b04852a0ec2a5870d6d5706f138e3cae4f7865cee8e7e8c39ca43118494ad871e3c49da7b0689b67d33521fbbbf4a98c3eb07509aa139a34438a99ac437e7882efad3b9fef52cb683b6b598e6c21cbe1d0c817b91f53ac098ff1db71c036333f7c2fe168e693eb7c2b3292b1e44207ee0b6aec72987b0146913eed40cf99e69112ecb5b90dc73ff09d5e2f9711028eb61ca8020bdee46154fc3c681c54471b2b6e702d7cf0dd2a92530a5d8e5f72fb90e96734412235b231cb8ef2adb16266961e7d0e823a7ff295a7218603a780c298a49cda33c56ac75415a5e2d1a67229d2d0587ec18477373493913c9fbffb295a2e1868ebeceb6b7ede35f0bd9c42852de636ef854f3378e80147280f8515038f0906ed2a604bcc77f64270cecaeda42c92afa82756202c8f72a60a54616bf89777d5adbb36792e136f0f1506c8198df0986d1d5d0a07a3a495648915df743a74030fdb7949dfabfdf0db628767a23fbaec2bd576fa950e918d4de43e313d306bee33781c7e3a1a643be2bc2792a04e3ea9e47db145aa0cd28ad0bc40dd8e795603ad501724caceb7cc1687dc89709a7b5e83dec2fc2f3adacc42b877b4651042496b573596a756909fc9e860733eaa43861e18016691583f0c53ddde3210d9930e16e7d6279544306a01f3a7bb21898ea0ac673069412e8336eb82916af258fbef768495bdf389f1bd04f5a3b8a94075d7bbd18e71ab8666d48c476c232cb7083c17c5dab97480d4996bac6f327bdffe50d578fe261df0064899102bf87e5d1a1aa2ca72ffe5072265ea1af4d81d00ff396245d550598b759315a4c7b3f300344e451b144179afc9bcf52c4d98ff67eacc55b0c7a3925082c3614a7605a4407f2351face6ed3589b45be662d8919bf14207bdb28850991fda95dde682c3c3c8cff1eee17d4dfde405d248d0fbf460169033f4d4f79e032fa132594f3fe55527b43763cee16a828badca7b478ade7a49c722ac44dc9821381d3328bdcb28b1cb5ae6280a07c67d7e96df9949d6828a1fc7fb3dff5ad5785e19e899904aa39d79626a2b83614daa61acf9ef7e6d8862dc77c0cf37572e895ed0c1a835526b44a39ed1bfcf3b423ed51e5d729f8ce99b2c5853aa9f05aac1a27f62f6f43e00514748b2b194e68fd7205bc2b6ab69e12413d6299744efa2719aac39279cae001673f07925e22c0cbd869bb93d3311a3920f66b3dca5a57ca61a100e504b1529c7e60e1a9801113701a46d544200553b9c0e31edb8940ee9b52cb414012c78971a126f9de370191a32cf10cbe649681aa2cc4fa1c7be82f888473a4e64a5c73f9ba1468deb8871f25de23c585ad0ca63e104ea718ed27731539a608b0e5f5a5de5382cfea474287e3f64f648f5a1e7d00b9c69a2c13d65a894a62ca2a3a722bd17ea5a9dfdaf4b06c1f41f81656228544a53619f565a4655fa5626d7816ac5bf8c87722c3428a1cd98925b6e48b0683d0f7c058051d53851c52f9a5c3a56c09a83cb0ec7d10c185aedc7429e9c71a657993b520e25b829f8770a8f14eccd5b8f71e20a3e73f22327428d6afb946327430774eead920cfde5732ace5f2f0065dd1415d1e311883ced8aefb0cad44372d328adf466a97b63f037652efe5601873b0c3eb95e95e7925f5bdb8a55c44ead49a70da993fa3975b2d18b016a186289981de7b0a0520cd51cf424178d133d34bdbe4766a29a65c297fa8a1950ffbb2a0f6ad43a091e75bfadbb07f646eee6f0f64f2740e9b12cc5c19e19fad42f7864aa98db8e57843f8b628d9a7974729f9b82d3a71e4e71f44d4f36622779d1a5487e7cff8ac46e37c4c9e0a721a21e0b493f0854aa1f7111c0c305b6c5842c14d8190869622e8b599cb6cf5b0f34a611d94bba4bc74bc62be291b806b0dc2fe71444d0fba57e928705d35a23047e925a55be6be8a3da76ca21ac86124e1330e6ce6be7a645ae7dce4e3cba918841ffbcea010a6d697b831732fe900951a6bf1344c41a7160a6fb8f0a3b119328e4aac0e3ac4ee26850a2848a04fa0715b2f1724ec4bd9887ce75264632386d453f9d2a061ca8849a8d45f7897b37053267d82967fca4cc62fea4ab9dc33c68de5b5055f6c71d9cc2f39e2d5e75da6f1cf772667e0a08bec52359ca47f791afe4e11e97920776e87c5615073714aac8e9dd1a39de37231a350ad7e3e7aa3272bb81499b08298366711997e669186441332f0179692e7834534e94998fff28fbfebb2514e4e4b6710c246bf002305cfcfdb3dbc8cef03614e9c5fb1232a385d788163f583b5f98319bfbd8d8ddff2a81fb85e95819e245de15d8021897ba3fe18197c49923dde7ed269783b61f146b55eec398c0576dcd18343a889c7120c5d2a8ed18666addff6d7f865e24888146a0a714b1064f548f5921db43c6accc916403a91518b65639ab461adb730bb95d07cd754003071b317a68ac1582e4136be7d006ab4bfc90a999330b79079cd5286a66355dd8900306d7fa57db308dba5165c4e04620b02518b151bf5dcf7d19839319fd44a730b0ee2788ce62eb242c38c255fc1b4ff79c7f8abdcf6f997ae2344bc3ba3d74aa73febd708f8e579324205a872115f26fb856b81d047b595b2e1822d02e338abe7dbf44c89163f7538fcbd5717052774ba29855635bba48626806ad505abb88d6b85e22f60a339da1bea74b231a9133972236effd86e9c3db9c700566adab9cb32c474a1bb1f0206cef707a8994d4710c11b99464fb666b2d292776afd8d42dab3ad9a4eb120d63f7e2a54a1f3a6f07c5ba3d333397a6657333dfe0783490e3849efc7e5ac3e82aafc64be649cc312a0412889bf5ecc34881741cbc8e8f19ef0f1702093f2e9b8443ad68514e11f54185dfe66e279ddc663487c16618b65395e46dbd9f80c7e6a302b686c8d2b7274ac4d93589b5cec6ec665b316a292e26834b9b69e189617a0d5db64a13a71410343cc80fc0b10ebdedcc6e933c52f98894fc78771bdc77cfc250ecddfd04eec6c69135653c9ade989b0bfd57b4b38ba931af6fbdce5f3b3ea1f394dbc12e9af5c4b1591ecf3c8df09db505ffdb7b2593fb610f56893e912079c303986fc18bc18f81699d546e86a2f251c9eb6e2e655da3624226a2595f0e980fb300b08cef30108c436b75a0c4eaea632263a1d0089288e90a98520b36b03e1fb31560783ccdb3c16b8b2316de79083e1a1b39a9e28d8898d12bbdd67e8834694647c5aaa7f583cc0e85818f9e532305894e0f66001845961438074a6b648932c6ecc5102dcf0aeb1d1904b95b3af15c60bd533d13ba2f66f6539d35cb51aaffd2430171d5792314f5359ded036916e28621b14905eca137429f09178c3f4aa71ed4581d981939b9142ac9db39464cdbd47ff7fa4896f677805dee0bfdcabcc744bfadb1cedba1ce2ef97b9ecfd5b98328e09f7c6c0e9f37b0178f5fcbe484c2d66086fd911703b4b8c8748db9eb7ba7dc26699df516800d41ad66e5d27df5821c9e5cde81dda846fb978afd13a870011560510a477706c8eb66617fd4618ca80b6d87bb623b29ab1ffa96d9f4f98cdb8fb87b0d23d980df08b6a6d506fee6f7ae95a5ce7d53999a51668fb83ceab65d90bcf79d7fd4a1bda6b370949d024a06abd36d49d507dd00d3", 0x1000}, {&(0x7f0000000340)="f1d0d8bc12a764f9a01153a9e89c73c7a285eab8a84607206f3dbf41ee91498af913d21f1c8ed2918ac4883a16ff74cb8887acc44ad76fc23f4671801db1c6a6f4f273d26985ec0e712efea777b4aa90c1dd4903629d52ae22935ad0ac0912bcd1e93118f8afe9dbd752273d381083dc18dc3270cd3b88b38ca04b674670c89d1db8b0f25512b2808625147c2bd4a4eb5c7c3c6785d92b5863bba7b6727d48", 0x9f}, {&(0x7f0000000280)="89f2dad152e888b38c1703caf6c53cb59b05ddd78febff0c2960e7bba2ef6f721792a7e7e8ac4e5d9d3901a93828f96dbd8f74ff7a07cfb35aea7d5c242d77f950b0e31cc40b2b21b6dcc0f3c0eec6d5f9695e4048e6cd9061e4291b54347e88656c694da476db5ef0e8992653", 0x6d}, {&(0x7f0000000140)="ef8bd1721fa0c754d2bf340cb25141aac405d3e1f0e9adf8d6916cab67c8122d7d1967e13979870e9197bf988e131e", 0x2f}, {&(0x7f0000002c80)="12593d70db84dc86b925fb320b7d253abf2e9519de7b191cc07b6d05cbb33aa9c0b5db656951f950234a3fde0e7ce9035cfa20bf54fa661c72d5fe86462fbeb8917e858da02dd88d07508855dc685fcfc6ae304b99bfdf28a6f4afdd32a95710e77cd474ee5a3228136189450f3a34cdb2878221c32982bb117f68c69af538c3e67c54fbfa9f44d977761a6dd3f334edf22ee479c4521156017637f213741599b6dd9c6f1d50c09b97e442f7ab6c7b6c4e40fa45e8d467b2744af1d49737d60fcaaf4b905dd4584c4a8b9a29960b3171daf10806b2050910e22fff479cdb0a7f444d8042794834db75dbe1b95028a2729b79b06e3d479228f7fb64a3b2c88b0dd65e7eb84e28cc374b302c2c1569b2390980a31277b984afce25c4ec02e33a37e12328362b312402b93443108bb66be2f0c8d54aa32993e8e26dd438f2e09c05912e8bab83dd357c1be7a1870ef3f409900141c4d312aef63fcef77c48503ac954b3bda179986656dc508571ba7cacb9ad392bf607492adf31261e5a2f40532d7c9c625a73eb66bacc945b80a2cd6606ff6e30e09c46068a494014372584e68a00a5ae6b9fd8ce3d2aa662946e4fdc9a73e235002bc5063319df29c064345f553da12883f875eb6b7c6b102d89323db4e65a9b4a9d87344007ed7a099bae9dd959c39d5763503f0aba1b58f9b55a6b7e7ebe5450f6d432b2389a0959b3a0d8868f5f84db8ff423531cb8ea2b78ab32e94e1e454c48af0cf652a065aab47bd61c86aef77e7068a11dd521c8141cf16e0caa00c4f08bfdf1590e6c8685ba4fbb3fc4e0b2834279baff944935e7656ff903e7caa812b91575f415aafac0339c290762cd721c48af98671b3b0002fa6e8bfaf8eacd9b1d93b427d24a565124f4256d6f889e9548019a4e2a70052cd71843af279d5bb754d907d990ca1ba2b06185f88484fbd00e74d30edbd196f1a08553ad3831f0482a7e007432da83943dc660c3174856245b5eed1a95a22539fa990211ddc0c2ecac7a30fceae431aa23c2f623313a5596eae160acb7225a4dfdcf9c8f073d3d61a975a64080f10e6b6dcbf3225c1a2e71a0b76fca967a91b4c52e0d27c68aa56daff6a6979d0d8c82634fbafc9899afa320a1fb5781edc9fcb5e6ecb10d8eb4a9847979844dd56965f02770b3592a433e38bfd96634f025d6ad5dc256b8744fefcad5a2b1fbf9ede7b0f07dcf616583f73a864cc02541d60f2ae6d5338705e0ead5cd5f668de02802e535a66c1d2020f8cd42956d5c165be1078fd7163645d341bedbe97cbbfea9f70d731dd5deb248df948a5623c132522995a46a16f9c649cfe2bd6b13dcd24ede42da4fa760f5d369e091d4ddc5b84c3e11ea800efc433f5e6d853d1e41b3fc547a4ec67fae3f1a35d71335fc914665d37789c8f691611bf2dcf16ca9eccb5a80d2e35a9c34b5a9dc3a78b2d1e6e2857dbb4a656704911b5936a6ea88aac03af6e924844bc8ef0d9de29b622302ad3b16b4a003a15734e44c2a55e10344fcf765aa53bf4574abe518f9c38433a0a5ce1ebf5ca5ef71ae7b39136b7211013d49d87785e351b63c50dee4ee28be69e6246693e2dacbeea6b2cba6467c9f618aec8b8a01025be17e84515277ea358756f544f45441506de211abf6be9ba4d9e29b1630c3ce9bf7e8c815bb78da7dc6f1e60f67077604a087c502f40a7f6bcebbf385c5bfabcde46be3917e7b25864edcd5484ccdd447912d91f0fc95b9a7281b0970333757f0979ad8875f07d4688f46dfd46cbefeee1c09f66f0fdfedc5c7ace81634fd824d7b8d322a67b0ff477b82bdfa06259092ce24fed35a3bede5ac6a77c3431c0d7daf5bb576b723c1802ad0f9358fec36610b8adcba1813984c0734585ed8abc335dcadcace2e8a0198259e4af43546e2c783b20d8f8da0040525ae113ce9b37f02099f1f1bf08236088b92f20f0784acaadb5ba900d58a86c5ccf66888b64edf73659e49d36da29efd8d2b86a27277b210ea952808cbccfb380af0bbe56b64079b7bb0214a6bcf25d095ffb94873f219870f46d793eb5d2ad676b3d896ffe4d7d86a237eff8841cba4aca5cb1d6c9bf233d3c456b85edee1b36676debb7df33af268b2255f33920215bd84fd11000a4d8e356adc976349849067383eec478a93dda065e98bf96503aef719d67ebb9e137c13c45c15e7d2c3db1ac961ffd555cdf188a18ede652f19b29df8a989adc42a09c0efb923fcf542023327247383566be5a125c0b4cd4d0f2a926d07cfe8a1a15596f89fa7bf9a34ee647f1f1cd88fe3177d2c8e47074071ebe8aa2bcd864f1a849325f956c2db0be1c9c688dc348ccf7599e9f88c9b37ae8973070c1d28baef36a74aeb74284ea5e2fa849031f531bd26bbd0ae22cb4f009cddd90f7a4211547949f6ea34b5dde0260edfd830ce5744cfc36024bb82131b94bc9b1a6b79daafc57f1cef254605e12c2f9931551e6ff7d3b25716a830491369095bbbd210bf44a7d5795e1c1dce37dcf7589a2dd4cfa64476fe80970f3ec74a814d47a5f3bd8e34646567dd9578d1d269eb9f9c3d676bf7e8f656db900d5912b7f012aeccaa9c5e1e19918b731ed741cda769ddc005b4c1a49976192bd05dd24913fb9132824487ee9895cf94ba13d7d67b833ccd6aee1de2c8230757bfe2af71217a26416922b13c7b00e982f4251e55606d6fe993cdcadf499bb07a835b36b1c3fb18c86beeda0076a3119f3483b5bdcc65e5c8ecefcce3721819fe2ae7ca814479c678d8f841d8c8603bc2d055254aef15f92ec1135e196912dab7ece66cb901efe0b47a02202a6d60f94608d8e78b2523d0b9584d1564c9ef5cc7e4036e57b53ba3cdd07fe169da60e1752c811e5ac5412637185bbf58f7c3c9eb439fccdcb0a10e20ee56b68b4e926bf53a143dfe59e34789ca96f958e365f01f601d7e7734080b37309a3ebc3eafb4b972c0bf8896ac90a1079c04114478c41d763394c6cab86ae80d93ea2b8030d962b060c40b8700431bc55fe5ecd8238f3597690bf0adfa3edae1e602e13fdd0bd91b9afc79d2214f7b0e5aaa52ca5966189ba7c216ffe91b34149d8eb9996d98a6bc868f5ed6367f6f33508f00c02a715faf5e1623c49b4d866a4185c5831ad8b432184c4aa06051565bef766e6710687c34bb3dd076bc9b484846a78312bfd4b6c3cdaf3241dd236c84f09b0923fafb31f4914cef97424c8ba98ff55f4f3d18ab6cc6fdf3e8e730d58af73bb217fc2c3075c85ff7694084388b91a5b6432c79a7cc6aaeacb4e16f6564be1a16af09147e10cfd1d919c55c22389015164809abf5008d4c17a88a277a2e034c96d2e6e0fdcddcc91b4369bc50e5af727d6f60969354532bed60f88d398e5209e8ac18401829fae212532cf401aafac44b54b5b9080669d2f68f4a38a546d9cef8371a6e18322d52ed8c81eef9186d78c4e456cd84efdd756980dda921e8d5412730b29969038fcfa25a686901b3bb3f279690be2c72863491a60169b87487244ee88a719796e2896ec4c7a2ba780d2cf16e50925ca252119d3de6a62f151226f11fc2395711fa7f722e5aa58f5b5ef011ad349930b4aa4f18547c0b387a76736778693b42933748416caf5f09cdcbd0041253befc859a124e0d626e464be9aa0ec81ee3658e3823af9b9495ea6785e98876a9501d57f85af7fdea7fb56f4ef5de06a5d090f86fe402534c58f8fa106c27f6b45c0c7fec368e5d6aa1d4aca5086941384cda33212fdc8f6e197c9045dd577b170d7b2c67f93cf69ab5bf97a452a8b144d51a2585e107ff9b81684085950ca8ecd4e0d8f2b6e9991d22f5961c7ebd3da97343b884a34022eac5b7b4d301c2580a1549cac89a56a29d842e6199a8174b25c555957e76e28ce5db4a9e7669632bf3648d32925c5c8f1442b0142260f74bd5692e0464691086591abacca3ccbf5eb768f6dea05856a9adc9d9dfea5fb03c1e93488ecacd13b4a202757a003bf681501d6e5bdf82e9a1be05baf9443a38147acff3b7c36a890ba9abb68cf36dbcce9dd4a7333d6b658dbe0119da038e0c8272da45aea3e7f2394975294564ad451c1e503d57f563cde98421e2f035c3832323e4d1d4dee4a0cee1a22d66515c4b95bae01845e1cde3bdfed47e3aab5c8d618f961e4afe41130153bcee50acba9ab15d39d6a7729c8a922616b06b8d1b68f388eaa1bca20ef952556d08a4099aed1ed43dc897e4720438b0245409bebbaddb98223e44acdbed83969e2e2926e5fccd042dccbf6874052cc48b22236d6fa17d990b97ba8480e8dd7d0d5a1e8de7454d185a580b79cee9ee307a03900ecbfc72f8206c85ad43b2be12af7d951e12f540fa4a06cec35c74141d5a77fd9e4e2481f3e8b351e44d47b043787dc82535a988a7d77145c860a86fb2d8c1c9b2e48c7f07f5e6a1b3b099bf324d5c78ca153c411ba28feffebc3c6e4582efa0b8cc25ae5861e4759f36d07c488defaec03ad5d6e21f6a6ce72df1033c28ee171060256979206e516891687853e93c032433d337b2aced17a5f53efd81082c3c458bc04082ed31468b74efd6ea1f10c1d8b4121c4f94698df3306c64f32a19aeac72dc266aae71e9c6c00172522cb49ae71636c5e8144ea63de8d188395535d27bce9641a21f5351a17073a1ae2ccca8b979178d517ae489706d19050073a4779c4b24efc05c12ddafb9f1d4b4e72fd691d123068741eb352004e54f424111163d365e1a0650e08444a5e85af068f8ee67027a17e5801b46ae41dd5bf2f48da0281c807ef1d5dbd6afd9e7028a6da56595714344e8fed387f9095a46880c2d47d77b515ad564315abd1170de71bbfcf0dbaf4e76ccc7bcc4f788d88e82038633e8f84e50a6d6d72dd670399aaef3b4e184cd2ec23d2054acbcce1e277a6dc33f9230b64612f75b91ff972f8872ab4e91ea8aa8de91bf4427a9595e73f8cd0f46845cd836012168487193ff4d0721906a0bf808960783f352b6aa7d0046d06760f67578c65b065e4de750ea244527a5e28d1680f37a79c6c5ee2b029ab75f9859c4226da078f33686986ac06f730918ad128a69f82c88f3f750b3f2a1f808f1ed923facbc28c68597100e877cee716a37fc39036348d20f48f94ad2ace6138c655d088014237c3ba0c6961f5a684d121b876e6999d84ae154a599c7177c8dfdb33092e43b65558481caacc86df606a14c5acdbc4ba5dc32f4b732268e38e9b99b3e0030425e8018a1123f391c07ffccab699181f527181b9f8e7fe0eb7370bc67c6114d36b824c1075d568e5175d7e58e55ce72d2e8a1e7594d3b05740b624b065e0f39003f7535e45e79d6a701e77cd0805a33bb556c8d507663963c76f538a0cb306182b675d45ce41e22d1e982d48b93c9ac2b73481d0342264e6b2d7164eb6e4d5602e1acbf523ade02edb9c820700ed73fbd34e2812ab704ceb4998cb7c51feeb1d4a79e72b9b7512644ceaca3d804c90494d0776cc37fe196c0d22344e606937aa6829b24b1", 0xf46}], 0x5}}, {{0x0, 0x0, &(0x7f0000000bc0)=[{&(0x7f0000000ac0)="96", 0x1}], 0x1}}, {{0x0, 0x0, &(0x7f0000001340)=[{&(0x7f00000011c0)="fc", 0x1e6a8}], 0x1}}], 0x3, 0x0)

C repro:
repro.c.txt

@cpaasch cpaasch added bug syzkaller reproducer Has a simple program to reproduce the bug labels Oct 20, 2023
@cpaasch
Copy link
Member Author

cpaasch commented Oct 20, 2023

Bisected to a3a65a4

@cpaasch cpaasch added the bisected Git commit introducing the bug is known label Oct 20, 2023
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this issue Oct 27, 2023
After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger then 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 torvalds#45
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
PKRU: 55555554
Call Trace:
 <IRQ>
 mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
 subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
 tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
 tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
 tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
 tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
 ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
 ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
 ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
 __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
 process_backlog+0x353/0x660 net/core/dev.c:5974
 __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
 net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
 __do_softirq+0x184/0x524 kernel/softirq.c:553
 do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: multipath-tcp/mptcp_net-next#450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Signed-off-by: Paolo Abeni <[email protected]>
matttbe pushed a commit that referenced this issue Nov 7, 2023
After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger then 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 #45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: #450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
@matttbe matttbe closed this as completed in 70544ae Nov 7, 2023
matttbe pushed a commit that referenced this issue Nov 7, 2023
After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger then 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 #45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: #450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
matttbe pushed a commit that referenced this issue Nov 7, 2023
After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger then 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 #45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: #450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
matttbe pushed a commit that referenced this issue Nov 7, 2023
After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger then 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 #45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: #450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
matttbe pushed a commit that referenced this issue Nov 7, 2023
After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger then 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 #45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: #450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
matttbe pushed a commit that referenced this issue Nov 7, 2023
After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger then 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 #45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: #450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
matttbe pushed a commit that referenced this issue Nov 8, 2023
After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger then 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 #45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: #450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
matttbe pushed a commit that referenced this issue Nov 9, 2023
After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger then 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 #45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: #450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
matttbe pushed a commit that referenced this issue Nov 9, 2023
After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger then 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 #45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: #450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
jenkins-tessares pushed a commit that referenced this issue Nov 10, 2023
After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger then 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 #45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: #450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
jenkins-tessares pushed a commit that referenced this issue Nov 10, 2023
After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger then 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 #45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: #450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
matttbe pushed a commit that referenced this issue Nov 13, 2023
After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger then 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 #45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: #450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
matttbe pushed a commit that referenced this issue Nov 13, 2023
After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger then 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 #45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: #450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
jenkins-tessares pushed a commit that referenced this issue Nov 14, 2023
After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger then 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 #45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: #450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
jenkins-tessares pushed a commit that referenced this issue Nov 14, 2023
After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger then 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 #45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: #450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this issue Nov 15, 2023
After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger than 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 torvalds#45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: multipath-tcp/mptcp_net-next#450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Cc: [email protected]
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-1-7b9cd6a7b7f4@kernel.org
Signed-off-by: Jakub Kicinski <[email protected]>
andyshrk pushed a commit to andyshrk/linux that referenced this issue Nov 22, 2023
After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger than 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 torvalds#45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: multipath-tcp/mptcp_net-next#450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Cc: [email protected]
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-1-7b9cd6a7b7f4@kernel.org
Signed-off-by: Jakub Kicinski <[email protected]>
mj22226 pushed a commit to mj22226/linux that referenced this issue Nov 24, 2023
commit 9fce92f upstream.

After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger than 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 torvalds#45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: multipath-tcp/mptcp_net-next#450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Cc: [email protected]
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-1-7b9cd6a7b7f4@kernel.org
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
mj22226 pushed a commit to mj22226/linux that referenced this issue Nov 26, 2023
commit 9fce92f upstream.

After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger than 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 torvalds#45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: multipath-tcp/mptcp_net-next#450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Cc: [email protected]
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-1-7b9cd6a7b7f4@kernel.org
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Kaz205 pushed a commit to Kaz205/linux that referenced this issue Nov 26, 2023
commit 9fce92f upstream.

After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger than 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 torvalds#45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: multipath-tcp/mptcp_net-next#450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Cc: [email protected]
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-1-7b9cd6a7b7f4@kernel.org
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Kaz205 pushed a commit to Kaz205/linux that referenced this issue Nov 27, 2023
commit 9fce92f upstream.

After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger than 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 torvalds#45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: multipath-tcp/mptcp_net-next#450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Cc: [email protected]
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-1-7b9cd6a7b7f4@kernel.org
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
mj22226 pushed a commit to mj22226/linux that referenced this issue Nov 28, 2023
commit 9fce92f upstream.

After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger than 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 torvalds#45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: multipath-tcp/mptcp_net-next#450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Cc: [email protected]
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-1-7b9cd6a7b7f4@kernel.org
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
sj-aws pushed a commit to amazonlinux/linux that referenced this issue Nov 28, 2023
commit 9fce92f upstream.

After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger than 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 #45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: multipath-tcp/mptcp_net-next#450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Cc: [email protected]
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-1-7b9cd6a7b7f4@kernel.org
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Whissi pushed a commit to Whissi/linux-stable that referenced this issue Nov 28, 2023
commit 9fce92f upstream.

After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger than 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 #45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: multipath-tcp/mptcp_net-next#450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Cc: [email protected]
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-1-7b9cd6a7b7f4@kernel.org
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Whissi pushed a commit to Whissi/linux-stable that referenced this issue Nov 28, 2023
commit 9fce92f upstream.

After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger than 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 #45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: multipath-tcp/mptcp_net-next#450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Cc: [email protected]
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-1-7b9cd6a7b7f4@kernel.org
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Relms12345 added a commit to Relms12345/linux that referenced this issue Nov 28, 2023
commit 6ac30d748bb080752d4078d482534b68d62f685f
Author: Greg Kroah-Hartman <[email protected]>
Date:   Tue Nov 28 17:07:23 2023 +0000

    Linux 6.1.64

    Link: https://lore.kernel.org/r/[email protected]
    Tested-by: Salvatore Bonaccorso <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Tested-by: Ron Economos <[email protected]>
    Tested-by: SeongJae Park <[email protected]>
    Tested-by: Pavel Machek (CIP) <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Tested-by: Florian Fainelli <[email protected]>
    Tested-by: Pavel Machek (CIP) <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Tested-by: SeongJae Park <[email protected]>
    Tested-by: Florian Fainelli <[email protected]>
    Tested-by: Nam Cao <[email protected]>
    Tested-by: Jon Hunter <[email protected]>
    Tested-by: Guenter Roeck <[email protected]>
    Tested-by: Conor Dooley <[email protected]>
    Tested-by: Allen Pais <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 04ff8a5107a56ad6ba87c1e89c7c520e851e4ffa
Author: Conor Dooley <[email protected]>
Date:   Thu Jun 29 12:33:34 2023 +0100

    RISC-V: drop error print from riscv_hartid_to_cpuid()

    commit 52909f1768023656d5c429873e2246a134289a95 upstream.

    As of commit 2ac874343749 ("RISC-V: split early & late of_node to
    hartid mapping") my CI complains about newly added pr_err() messages
    during boot, for example:
    [    0.000000] Couldn't find cpu id for hartid [0]
    [    0.000000] riscv-intc: unable to find hart id for /cpus/cpu@0/interrupt-controller

    Before the split, riscv_of_processor_hartid() contained a check for
    whether the cpu was "available", before calling riscv_hartid_to_cpuid(),
    but after the split riscv_of_processor_hartid() can be called for cpus
    that are disabled.

    Most callers of riscv_hartid_to_cpuid() already report custom errors
    where it falls, making this print superfluous in those case. In other
    places, the print adds nothing - see riscv_intc_init() for example.

    Fixes: 2ac874343749 ("RISC-V: split early & late of_node to hartid mapping")
    Signed-off-by: Conor Dooley <[email protected]>
    Link: https://lore.kernel.org/r/20230629-paternity-grafted-b901b76d04a0@wendy
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 9e1e0887ea21e9fef0f1a2a3ad715f9a3aa9535d
Author: Robert Richter <[email protected]>
Date:   Fri May 19 23:54:35 2023 +0200

    cxl/port: Fix NULL pointer access in devm_cxl_add_port()

    commit a70fc4ed20a6118837b0aecbbf789074935f473b upstream.

    In devm_cxl_add_port() the port creation may fail and its associated
    pointer does not contain a valid address. During error message
    generation this invalid port address is used. Fix that wrong address
    access.

    Fixes: f3cd264c4ec1 ("cxl: Unify debug messages when calling devm_cxl_add_port()")
    Signed-off-by: Robert Richter <[email protected]>
    Reviewed-by: Dave Jiang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Dan Williams <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit c88cfbb18a5e498f405836b11f1dd31c54d7a7de
Author: Victor Shih <[email protected]>
Date:   Tue Nov 7 17:57:41 2023 +0800

    mmc: sdhci-pci-gli: GL9755: Mask the replay timer timeout of AER

    commit 85dd3af64965c1c0eb7373b340a1b1f7773586b0 upstream.

    Due to a flaw in the hardware design, the GL9755 replay timer frequently
    times out when ASPM is enabled. As a result, the warning messages will
    often appear in the system log when the system accesses the GL9755
    PCI config. Therefore, the replay timer timeout must be masked.

    Fixes: 36ed2fd32b2c ("mmc: sdhci-pci-gli: A workaround to allow GL9755 to enter ASPM L1.2")
    Signed-off-by: Victor Shih <[email protected]>
    Acked-by: Adrian Hunter <[email protected]>
    Acked-by: Kai-Heng Feng <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Ulf Hansson <[email protected]>
    Signed-off-by: Victor Shih <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 2132941b453fc933dde89b8f96f0a4439ded5c74
Author: Vicki Pfau <[email protected]>
Date:   Thu Mar 23 18:32:43 2023 -0700

    Input: xpad - add VID for Turtle Beach controllers

    commit 1999a6b12a3b5c8953fc9ec74863ebc75a1b851d upstream.

    This adds support for the Turtle Beach REACT-R and Recon Xbox controllers

    Signed-off-by: Vicki Pfau <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Dmitry Torokhov <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 2fa74d29fc1899c237d51bf9a6e132ea5c488976
Author: Steven Rostedt (Google) <[email protected]>
Date:   Tue Oct 31 12:24:53 2023 -0400

    tracing: Have trace_event_file have ref counters

    commit bb32500fb9b78215e4ef6ee8b4345c5f5d7eafb4 upstream.

    The following can crash the kernel:

     # cd /sys/kernel/tracing
     # echo 'p:sched schedule' > kprobe_events
     # exec 5>>events/kprobes/sched/enable
     # > kprobe_events
     # exec 5>&-

    The above commands:

     1. Change directory to the tracefs directory
     2. Create a kprobe event (doesn't matter what one)
     3. Open bash file descriptor 5 on the enable file of the kprobe event
     4. Delete the kprobe event (removes the files too)
     5. Close the bash file descriptor 5

    The above causes a crash!

     BUG: kernel NULL pointer dereference, address: 0000000000000028
     #PF: supervisor read access in kernel mode
     #PF: error_code(0x0000) - not-present page
     PGD 0 P4D 0
     Oops: 0000 [#1] PREEMPT SMP PTI
     CPU: 6 PID: 877 Comm: bash Not tainted 6.5.0-rc4-test-00008-g2c6b6b1029d4-dirty #186
     Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
     RIP: 0010:tracing_release_file_tr+0xc/0x50

    What happens here is that the kprobe event creates a trace_event_file
    "file" descriptor that represents the file in tracefs to the event. It
    maintains state of the event (is it enabled for the given instance?).
    Opening the "enable" file gets a reference to the event "file" descriptor
    via the open file descriptor. When the kprobe event is deleted, the file is
    also deleted from the tracefs system which also frees the event "file"
    descriptor.

    But as the tracefs file is still opened by user space, it will not be
    totally removed until the final dput() is called on it. But this is not
    true with the event "file" descriptor that is already freed. If the user
    does a write to or simply closes the file descriptor it will reference the
    event "file" descriptor that was just freed, causing a use-after-free bug.

    To solve this, add a ref count to the event "file" descriptor as well as a
    new flag called "FREED". The "file" will not be freed until the last
    reference is released. But the FREE flag will be set when the event is
    removed to prevent any more modifications to that event from happening,
    even if there's still a reference to the event "file" descriptor.

    Link: https://lore.kernel.org/linux-trace-kernel/[email protected]/
    Link: https://lore.kernel.org/linux-trace-kernel/[email protected]

    Cc: [email protected]
    Cc: Mark Rutland <[email protected]>
    Fixes: f5ca233e2e66d ("tracing: Increase trace array ref count on enable and filter files")
    Reported-by: Beau Belgrave <[email protected]>
    Tested-by: Beau Belgrave <[email protected]>
    Reviewed-by: Masami Hiramatsu (Google) <[email protected]>
    Signed-off-by: Steven Rostedt (Google) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 6460508dce00f5438d95e3ee7096c925e30e72e2
Author: Michael Ellerman <[email protected]>
Date:   Tue Aug 22 00:28:19 2023 +1000

    powerpc/powernv: Fix fortify source warnings in opal-prd.c

    commit feea65a338e52297b68ceb688eaf0ffc50310a83 upstream.

    As reported by Mahesh & Aneesh, opal_prd_msg_notifier() triggers a
    FORTIFY_SOURCE warning:

      memcpy: detected field-spanning write (size 32) of single field "&item->msg" at arch/powerpc/platforms/powernv/opal-prd.c:355 (size 4)
      WARNING: CPU: 9 PID: 660 at arch/powerpc/platforms/powernv/opal-prd.c:355 opal_prd_msg_notifier+0x174/0x188 [opal_prd]
      NIP opal_prd_msg_notifier+0x174/0x188 [opal_prd]
      LR  opal_prd_msg_notifier+0x170/0x188 [opal_prd]
      Call Trace:
        opal_prd_msg_notifier+0x170/0x188 [opal_prd] (unreliable)
        notifier_call_chain+0xc0/0x1b0
        atomic_notifier_call_chain+0x2c/0x40
        opal_message_notify+0xf4/0x2c0

    This happens because the copy is targeting item->msg, which is only 4
    bytes in size, even though the enclosing item was allocated with extra
    space following the msg.

    To fix the warning define struct opal_prd_msg with a union of the header
    and a flex array, and have the memcpy target the flex array.

    Reported-by: "Aneesh Kumar K.V" <[email protected]>
    Reported-by: Mahesh Salgaonkar <[email protected]>
    Tested-by: Mahesh Salgaonkar <[email protected]>
    Reviewed-by: Mahesh Salgaonkar <[email protected]>
    Signed-off-by: Michael Ellerman <[email protected]>
    Link: https://msgid.link/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 4c55be0855344187d0970874b6e1215b21a68b61
Author: Lewis Huang <[email protected]>
Date:   Thu Oct 19 17:22:21 2023 +0800

    drm/amd/display: Change the DMCUB mailbox memory location from FB to inbox

    commit 5911d02cac70d7fb52009fbd37423e63f8f6f9bc upstream.

    [WHY]
    Flush command sent to DMCUB spends more time for execution on
    a dGPU than on an APU. This causes cursor lag when using high
    refresh rate mouses.

    [HOW]
    1. Change the DMCUB mailbox memory location from FB to inbox.
    2. Only change windows memory to inbox.

    Cc: Mario Limonciello <[email protected]>
    Cc: Alex Deucher <[email protected]>
    Cc: [email protected]
    Reviewed-by: Nicholas Kazlauskas <[email protected]>
    Acked-by: Alex Hung <[email protected]>
    Signed-off-by: Lewis Huang <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 68d774eb10e261ac6d176da2379f97a62878ef22
Author: Tianci Yin <[email protected]>
Date:   Wed Nov 1 09:47:13 2023 +0800

    drm/amd/display: Enable fast plane updates on DCN3.2 and above

    commit 435f5b369657cffee4b04db1f5805b48599f4dbe upstream.

    [WHY]
    When cursor moves across screen boarder, lag cursor observed,
    since subvp settings need to sync up with vblank that causes
    cursor updates being delayed.

    [HOW]
    Enable fast plane updates on DCN3.2 to fix it.

    Cc: Mario Limonciello <[email protected]>
    Cc: Alex Deucher <[email protected]>
    Cc: [email protected]
    Reviewed-by: Aurabindo Pillai <[email protected]>
    Acked-by: Alex Hung <[email protected]>
    Signed-off-by: Tianci Yin <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit fb5c134ca589fe670430acc9e7ebf2691ca2476d
Author: Mario Limonciello <[email protected]>
Date:   Wed Nov 8 13:31:57 2023 -0600

    drm/amd/display: fix a NULL pointer dereference in amdgpu_dm_i2c_xfer()

    commit b71f4ade1b8900d30c661d6c27f87c35214c398c upstream.

    When ddc_service_construct() is called, it explicitly checks both the
    link type and whether there is something on the link which will
    dictate whether the pin is marked as hw_supported.

    If the pin isn't set or the link is not set (such as from
    unloading/reloading amdgpu in an IGT test) then fail the
    amdgpu_dm_i2c_xfer() call.

    Cc: [email protected]
    Fixes: 22676bc500c2 ("drm/amd/display: Fix dmub soft hang for PSR 1")
    Link: https://github.com/fwupd/fwupd/issues/6327
    Signed-off-by: Mario Limonciello <[email protected]>
    Reviewed-by: Harry Wentland <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 51ffa1a3792e3570ae2eb84d003c329b3d71da6c
Author: Christian König <[email protected]>
Date:   Thu Nov 9 10:14:14 2023 +0100

    drm/amdgpu: lower CS errors to debug severity

    commit 17daf01ab4e3e5a5929747aa05cc15eb2bad5438 upstream.

    Otherwise userspace can spam the logs by using incorrect input values.

    Signed-off-by: Christian König <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit c52aac5884bc58e304d4c9cb8441baf8443ea189
Author: Christian König <[email protected]>
Date:   Thu Nov 9 10:12:39 2023 +0100

    drm/amdgpu: fix error handling in amdgpu_bo_list_get()

    commit 12f76050d8d4d10dab96333656b821bd4620d103 upstream.

    We should not leak the pointer where we couldn't grab the reference
    on to the caller because it can be that the error handling still
    tries to put the reference then.

    Signed-off-by: Christian König <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 2ab6c1237bd4a961b8d5032671510a028fb9f0f6
Author: Alex Deucher <[email protected]>
Date:   Tue Oct 17 15:40:01 2023 -0400

    drm/amdgpu: don't use ATRM for external devices

    commit 432e664e7c98c243fab4c3c95bd463bea3aeed28 upstream.

    The ATRM ACPI method is for fetching the dGPU vbios rom
    image on laptops and all-in-one systems.  It should not be
    used for external add in cards.  If the dGPU is thunderbolt
    connected, don't try ATRM.

    v2: pci_is_thunderbolt_attached only works for Intel.  Use
        pdev->external_facing instead.
    v3: dev_is_removable() seems to be what we want

    Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925
    Reviewed-by: Mario Limonciello <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 965dce07a4fc5b15c07c73124f5016240a7250ef
Author: Alex Deucher <[email protected]>
Date:   Tue Oct 17 16:30:00 2023 -0400

    drm/amdgpu: don't use pci_is_thunderbolt_attached()

    commit 7b1c6263eaf4fd64ffe1cafdc504a42ee4bfbb33 upstream.

    It's only valid on Intel systems with the Intel VSEC.
    Use dev_is_removable() instead.  This should do the right
    thing regardless of the platform.

    Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925
    Reviewed-by: Mario Limonciello <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 8e54a91d3e66b9730861f10345238ff5ef979d3d
Author: Alex Deucher <[email protected]>
Date:   Wed Nov 1 15:48:14 2023 -0400

    drm/amdgpu/smu13: drop compute workload workaround

    commit 23170863ea0a0965d224342c0eb2ad8303b1f267 upstream.

    This was fixed in PMFW before launch and is no longer
    required.

    Reviewed-by: Yang Wang <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Cc: [email protected] # 6.1.x
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 454d0cdd7c127bb0ad06b53c52e94ca2c9a83b20
Author: Ma Jun <[email protected]>
Date:   Tue Oct 31 11:11:04 2023 +0800

    drm/amd/pm: Fix error of MACO flag setting code

    commit 7f3e6b840fa8b0889d776639310a5dc672c1e9e1 upstream.

    MACO only works if BACO is supported

    Signed-off-by: Ma Jun <[email protected]>
    Reviewed-by: Kenneth Feng <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Cc: [email protected] # 6.1.x
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 07e94f204f38b0d36eb377b3cda088b4a8b6f9a2
Author: Kunwu Chan <[email protected]>
Date:   Fri Nov 3 11:09:22 2023 +0000

    drm/i915: Fix potential spectre vulnerability

    commit 1a8e9bad6ef563c28ab0f8619628d5511be55431 upstream.

    Fix smatch warning:
    drivers/gpu/drm/i915/gem/i915_gem_context.c:847 set_proto_ctx_sseu()
    warn: potential spectre issue 'pc->user_engines' [r] (local cap)

    Fixes: d4433c7600f7 ("drm/i915/gem: Use the proto-context to handle create parameters (v5)")
    Cc: <[email protected]> # v5.15+
    Signed-off-by: Kunwu Chan <[email protected]>
    Reviewed-by: Tvrtko Ursulin <[email protected]>
    Signed-off-by: Tvrtko Ursulin <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit 27b086382c22efb7e0a16442f7bdc2e120108ef3)
    Signed-off-by: Jani Nikula <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 9457636a49265bdec14f3b747a4911ea9b7d468c
Author: Ville Syrjälä <[email protected]>
Date:   Tue Oct 31 18:08:00 2023 +0200

    drm/i915: Bump GLK CDCLK frequency when driving multiple pipes

    commit 0cb89cd42fd22bbdec0b046c48f35775f5b88bdb upstream.

    On GLK CDCLK frequency needs to be at least 2*96 MHz when accessing
    the audio hardware. Currently we bump the CDCLK frequency up
    temporarily (if not high enough already) whenever audio hardware
    is being accessed, and drop it back down afterwards.

    With a single active pipe this works just fine as we can switch
    between all the valid CDCLK frequencies by changing the cd2x
    divider, which doesn't require a full modeset. However with
    multiple active pipes the cd2x divider trick no longer works,
    and thus we end up blinking all displays off and back on.

    To avoid this let's just bump the CDCLK frequency to >=2*96MHz
    whenever multiple pipes are active. The downside is slightly
    higher power consumption, but that seems like an acceptable
    tradeoff. With a single active pipe we can stick to the current
    more optiomal (from power comsumption POV) behaviour.

    Cc: [email protected]
    Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/9599
    Signed-off-by: Ville Syrjälä <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Reviewed-by: Jani Nikula <[email protected]>
    (cherry picked from commit 451eaa1a614c911f5a51078dcb68022874e4cb12)
    Signed-off-by: Jani Nikula <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit e973f40de16125f3f85a07db68a2ad4a0aeb42c2
Author: Bas Nieuwenhuizen <[email protected]>
Date:   Tue Oct 17 16:01:35 2023 +0200

    drm/amd/pm: Handle non-terminated overdrive commands.

    commit 08e9ebc75b5bcfec9d226f9e16bab2ab7b25a39a upstream.

    The incoming strings might not be terminated by a newline
    or a 0.

    (found while testing a program that just wrote the string
     itself, causing a crash)

    Cc: [email protected]
    Fixes: e3933f26b657 ("drm/amd/pp: Add edit/commit/show OD clock/voltage support in sysfs")
    Signed-off-by: Bas Nieuwenhuizen <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit dc4542861ec8dde92c3c8a5139bc412860aebe60
Author: Jan Kara <[email protected]>
Date:   Fri Oct 13 14:13:50 2023 +0200

    ext4: properly sync file size update after O_SYNC direct IO

    commit 91562895f8030cb9a0470b1db49de79346a69f91 upstream.

    Gao Xiang has reported that on ext4 O_SYNC direct IO does not properly
    sync file size update and thus if we crash at unfortunate moment, the
    file can have smaller size although O_SYNC IO has reported successful
    completion. The problem happens because update of on-disk inode size is
    handled in ext4_dio_write_iter() *after* iomap_dio_rw() (and thus
    dio_complete() in particular) has returned and generic_file_sync() gets
    called by dio_complete(). Fix the problem by handling on-disk inode size
    update directly in our ->end_io completion handler.

    References: https://lore.kernel.org/all/[email protected]
    Reported-by: Gao Xiang <[email protected]>
    CC: [email protected]
    Fixes: 378f32bab371 ("ext4: introduce direct I/O write using iomap infrastructure")
    Signed-off-by: Jan Kara <[email protected]>
    Tested-by: Joseph Qi <[email protected]>
    Reviewed-by: "Ritesh Harjani (IBM)" <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit e1d0f68bc07fee57d1855355dbb94092b895a9f4
Author: Kemeng Shi <[email protected]>
Date:   Sun Aug 27 01:47:01 2023 +0800

    ext4: add missed brelse in update_backups

    commit 9adac8b01f4be28acd5838aade42b8daa4f0b642 upstream.

    add missed brelse in update_backups

    Signed-off-by: Kemeng Shi <[email protected]>
    Reviewed-by: Theodore Ts'o <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 1793dc461e5a081087ab4d34b39b838bdce3f7e9
Author: Kemeng Shi <[email protected]>
Date:   Sun Aug 27 01:47:03 2023 +0800

    ext4: remove gdb backup copy for meta bg in setup_new_flex_group_blocks

    commit 40dd7953f4d606c280074f10d23046b6812708ce upstream.

    Wrong check of gdb backup in meta bg as following:
    first_group is the first group of meta_bg which contains target group, so
    target group is always >= first_group. We check if target group has gdb
    backup by comparing first_group with [group + 1] and [group +
    EXT4_DESC_PER_BLOCK(sb) - 1]. As group >= first_group, then [group + N] is
    > first_group. So no copy of gdb backup in meta bg is done in
    setup_new_flex_group_blocks.

    No need to do gdb backup copy in meta bg from setup_new_flex_group_blocks
    as we always copy updated gdb block to backups at end of
    ext4_flex_group_add as following:

    ext4_flex_group_add
      /* no gdb backup copy for meta bg any more */
      setup_new_flex_group_blocks

      /* update current group number */
      ext4_update_super
        sbi->s_groups_count += flex_gd->count;

      /*
       * if group in meta bg contains backup is added, the primary gdb block
       * of the meta bg will be copy to backup in new added group here.
       */
      for (; gdb_num <= gdb_num_end; gdb_num++)
        update_backups(...)

    In summary, we can remove wrong gdb backup copy code in
    setup_new_flex_group_blocks.

    Signed-off-by: Kemeng Shi <[email protected]>
    Reviewed-by: Theodore Ts'o <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 80ddcf21e7e022b392d9ae8363c0353251a95034
Author: Zhang Yi <[email protected]>
Date:   Thu Aug 24 17:26:04 2023 +0800

    ext4: correct the start block of counting reserved clusters

    commit 40ea98396a3659062267d1fe5f99af4f7e4f05e3 upstream.

    When big allocate feature is enabled, we need to count and update
    reserved clusters before removing a delayed only extent_status entry.
    {init|count|get}_rsvd() have already done this, but the start block
    number of this counting isn't correct in the following case.

      lblk            end
       |               |
       v               v
              -------------------------
              |                       | orig_es
              -------------------------
                       ^              ^
          len1 is 0    |     len2     |

    If the start block of the orig_es entry founded is bigger than lblk, we
    passed lblk as start block to count_rsvd(), but the length is correct,
    finally, the range to be counted is offset. This patch fix this by
    passing the start blocks to 'orig_es->lblk + len1'.

    Signed-off-by: Zhang Yi <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit ec4ba3d62f0fdde57cfaaeb7f1df85609b9a86ef
Author: Kemeng Shi <[email protected]>
Date:   Sun Aug 27 01:47:02 2023 +0800

    ext4: correct return value of ext4_convert_meta_bg

    commit 48f1551592c54f7d8e2befc72a99ff4e47f7dca0 upstream.

    Avoid to ignore error in "err".

    Signed-off-by: Kemeng Shi <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 32b9fb9a67ec70bbe3afe931b0ea44203150a49a
Author: Ojaswin Mujoo <[email protected]>
Date:   Mon Sep 18 16:15:50 2023 +0530

    ext4: mark buffer new if it is unwritten to avoid stale data exposure

    commit 2cd8bdb5efc1e0d5b11a4b7ba6b922fd2736a87f upstream.

    ** Short Version **

    In ext4 with dioread_nolock, we could have a scenario where the bh returned by
    get_blocks (ext4_get_block_unwritten()) in __block_write_begin_int() has
    UNWRITTEN and MAPPED flag set. Since such a bh does not have NEW flag set we
    never zero out the range of bh that is not under write, causing whatever stale
    data is present in the folio at that time to be written out to disk. To fix this
    mark the buffer as new, in case it is unwritten, in ext4_get_block_unwritten().

    ** Long Version **

    The issue mentioned above was resulting in two different bugs:

    1. On block size < page size case in ext4, generic/269 was reliably
    failing with dioread_nolock. The state of the write was as follows:

      * The write was extending i_size.
      * The last block of the file was fallocated and had an unwritten extent
      * We were near ENOSPC and hence we were switching to non-delayed alloc
        allocation.

    In this case, the back trace that triggers the bug is as follows:

      ext4_da_write_begin()
        /* switch to nodelalloc due to low space */
        ext4_write_begin()
          ext4_should_dioread_nolock() // true since mount flags still have delalloc
          __block_write_begin(..., ext4_get_block_unwritten)
            __block_write_begin_int()
              for(each buffer head in page) {
                /* first iteration, this is bh1 which contains i_size */
                if (!buffer_mapped)
                  get_block() /* returns bh with only UNWRITTEN and MAPPED */
                /* second iteration, bh2 */
                  if (!buffer_mapped)
                    get_block() /* we fail here, could be ENOSPC */
              }
              if (err)
                /*
                 * this would zero out all new buffers and mark them uptodate.
                 * Since bh1 was never marked new, we skip it here which causes
                 * the bug later.
                 */
                folio_zero_new_buffers();
          /* ext4_wrte_begin() error handling */
          ext4_truncate_failed_write()
            ext4_truncate()
              ext4_block_truncate_page()
                __ext4_block_zero_page_range()
                  if(!buffer_uptodate())
                    ext4_read_bh_lock()
                      ext4_read_bh() -> ... ext4_submit_bh_wbc()
                        BUG_ON(buffer_unwritten(bh)); /* !!! */

    2. The second issue is stale data exposure with page size >= blocksize
    with dioread_nolock. The conditions needed for it to happen are same as
    the previous issue ie dioread_nolock around ENOSPC condition. The issue
    is also similar where in __block_write_begin_int() when we call
    ext4_get_block_unwritten() on the buffer_head and the underlying extent
    is unwritten, we get an unwritten and mapped buffer head. Since it is
    not new, we never zero out the partial range which is not under write,
    thus writing stale data to disk. This can be easily observed with the
    following reproducer:

     fallocate -l 4k testfile
     xfs_io -c "pwrite 2k 2k" testfile
     # hexdump output will have stale data in from byte 0 to 2k in testfile
     hexdump -C testfile

    NOTE: To trigger this, we need dioread_nolock enabled and write happening via
    ext4_write_begin(), which is usually used when we have -o nodealloc. Since
    dioread_nolock is disabled with nodelalloc, the only alternate way to call
    ext4_write_begin() is to ensure that delayed alloc switches to nodelalloc ie
    ext4_da_write_begin() calls ext4_write_begin(). This will usually happen when
    ext4 is almost full like the way generic/269 was triggering it in Issue 1 above.
    This might make the issue harder to hit. Hence, for reliable replication, I used
    the below patch to temporarily allow dioread_nolock with nodelalloc and then
    mount the disk with -o nodealloc,dioread_nolock. With this you can hit the stale
    data issue 100% of times:

    @@ -508,8 +508,8 @@ static inline int ext4_should_dioread_nolock(struct inode *inode)
      if (ext4_should_journal_data(inode))
        return 0;
      /* temporary fix to prevent generic/422 test failures */
    - if (!test_opt(inode->i_sb, DELALLOC))
    -   return 0;
    + // if (!test_opt(inode->i_sb, DELALLOC))
    + //  return 0;
      return 1;
     }

    After applying this patch to mark buffer as NEW, both the above issues are
    fixed.

    Signed-off-by: Ojaswin Mujoo <[email protected]>
    Cc: [email protected]
    Reviewed-by: Jan Kara <[email protected]>
    Reviewed-by: "Ritesh Harjani (IBM)" <[email protected]>
    Link: https://lore.kernel.org/r/d0ed09d70a9733fbb5349c5c7b125caac186ecdf.1695033645.git.ojaswin@linux.ibm.com
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit f0cc1368fafd2542f09d18a75aa32288bc49d11b
Author: Kemeng Shi <[email protected]>
Date:   Sun Aug 27 01:47:00 2023 +0800

    ext4: correct offset of gdb backup in non meta_bg group to update_backups

    commit 31f13421c004a420c0e9d288859c9ea9259ea0cc upstream.

    Commit 0aeaa2559d6d5 ("ext4: fix corruption when online resizing a 1K
    bigalloc fs") found that primary superblock's offset in its group is
    not equal to offset of backup superblock in its group when block size
    is 1K and bigalloc is enabled. As group descriptor blocks are right
    after superblock, we can't pass block number of gdb to update_backups
    for the same reason.

    The root casue of the issue above is that leading 1K padding block is
    count as data block offset for primary block while backup block has no
    padding block offset in its group.

    Remove padding data block count to fix the issue for gdb backups.

    For meta_bg case, update_backups treat blk_off as block number, do no
    conversion in this case.

    Signed-off-by: Kemeng Shi <[email protected]>
    Reviewed-by: Theodore Ts'o <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit af075d06b34f79476bcd4e2b07c8632d206dad78
Author: Max Kellermann <[email protected]>
Date:   Tue Sep 19 10:18:23 2023 +0200

    ext4: apply umask if ACL support is disabled

    commit 484fd6c1de13b336806a967908a927cc0356e312 upstream.

    The function ext4_init_acl() calls posix_acl_create() which is
    responsible for applying the umask.  But without
    CONFIG_EXT4_FS_POSIX_ACL, ext4_init_acl() is an empty inline function,
    and nobody applies the umask.

    This fixes a bug which causes the umask to be ignored with O_TMPFILE
    on ext4:

     https://github.com/MusicPlayerDaemon/MPD/issues/558
     https://bugs.gentoo.org/show_bug.cgi?id=686142#c3
     https://bugzilla.kernel.org/show_bug.cgi?id=203625

    Reviewed-by: "J. Bruce Fields" <[email protected]>
    Cc: [email protected]
    Signed-off-by: Max Kellermann <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit e795a56654fd6078cc6c8f88d6debebf511c21ae
Author: Heiner Kallweit <[email protected]>
Date:   Tue Nov 21 09:09:33 2023 +0100

    Revert "net: r8169: Disable multicast filter for RTL8168H and RTL8107E"

    commit 6a26310273c323380da21eb23fcfd50e31140913 upstream.

    This reverts commit efa5f1311c4998e9e6317c52bc5ee93b3a0f36df.

    I couldn't reproduce the reported issue. What I did, based on a pcap
    packet log provided by the reporter:
    - Used same chip version (RTL8168h)
    - Set MAC address to the one used on the reporters system
    - Replayed the EAPOL unicast packet that, according to the reporter,
      was filtered out by the mc filter.
    The packet was properly received.

    Therefore the root cause of the reported issue seems to be somewhere
    else. Disabling mc filtering completely for the most common chip
    version is a quite big hammer. Therefore revert the change and wait
    for further analysis results from the reporter.

    Cc: [email protected]
    Signed-off-by: Heiner Kallweit <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit eb2f435be2c46eea47f60baca419b69f01abf6ad
Author: Andrey Konovalov <[email protected]>
Date:   Wed Aug 30 16:16:15 2023 +0100

    media: qcom: camss: Fix csid-gen2 for test pattern generator

    commit 87889f1b7ea40d2544b49c62092e6ef2792dced7 upstream.

    In the current driver csid Test Pattern Generator (TPG) doesn't work.
    This change:
    - fixes writing frame width and height values into CSID_TPG_DT_n_CFG_0
    - fixes the shift by one between test_pattern control value and the
      actual pattern.
    - drops fixed VC of 0x0a which testing showed prohibited some test
      patterns in the CSID to produce output.
    So that TPG starts working, but with the below limitations:
    - only test_pattern=9 works as it should
    - test_pattern=8 and test_pattern=7 produce black frame (all zeroes)
    - the rest of test_pattern's don't work (yavta doesn't get the data)
    - regardless of the CFA pattern set by 'media-ctl -V' the actual pixel
      order is always the same (RGGB for any RAW8 or RAW10P format in
      4608x2592 resolution).

    Tested with:

    RAW10P format, VC0:
     media-ctl -V '"msm_csid0":0[fmt:SRGGB10/4608x2592 field:none]'
     media-ctl -V '"msm_vfe0_rdi0":0[fmt:SRGGB10/4608x2592 field:none]'
     media-ctl -l '"msm_csid0":1->"msm_vfe0_rdi0":0[1]'
     v4l2-ctl -d /dev/v4l-subdev6 -c test_pattern=9
     yavta -B capture-mplane --capture=3 -n 3 -f SRGGB10P -s 4608x2592 /dev/video0

    RAW10P format, VC1:
     media-ctl -V '"msm_csid0":2[fmt:SRGGB10/4608x2592 field:none]'
     media-ctl -V '"msm_vfe0_rdi1":0[fmt:SRGGB10/4608x2592 field:none]'
     media-ctl -l '"msm_csid0":2->"msm_vfe0_rdi1":0[1]'
     v4l2-ctl -d /dev/v4l-subdev6 -c test_pattern=9
     yavta -B capture-mplane --capture=3 -n 3 -f SRGGB10P -s 4608x2592 /dev/video1

    RAW8 format, VC0:
     media-ctl --reset
     media-ctl -V '"msm_csid0":0[fmt:SRGGB8/4608x2592 field:none]'
     media-ctl -V '"msm_vfe0_rdi0":0[fmt:SRGGB8/4608x2592 field:none]'
     media-ctl -l '"msm_csid0":1->"msm_vfe0_rdi0":0[1]'
     yavta -B capture-mplane --capture=3 -n 3 -f SRGGB8 -s 4608x2592 /dev/video0

    Fixes: eebe6d00e9bf ("media: camss: Add support for CSID hardware version Titan 170")
    Cc: [email protected]
    Signed-off-by: Andrey Konovalov <[email protected]>
    Signed-off-by: Bryan O'Donoghue <[email protected]>
    Reviewed-by: Laurent Pinchart <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit eeab07ddd020e6990ba55b47721348beab5dcaaf
Author: Bryan O'Donoghue <[email protected]>
Date:   Wed Aug 30 16:16:13 2023 +0100

    media: qcom: camss: Fix invalid clock enable bit disjunction

    commit d8f7e1a60d01739a1d78db2b08603089c6cf7c8e upstream.

    define CSIPHY_3PH_CMN_CSI_COMMON_CTRL5_CLK_ENABLE BIT(7)

    disjunction for gen2 ? BIT(7) : is a nop we are setting the same bit
    either way.

    Fixes: 4abb21309fda ("media: camss: csiphy: Move to hardcode CSI Clock Lane number")
    Cc: [email protected]
    Signed-off-by: Bryan O'Donoghue <[email protected]>
    Reviewed-by: Konrad Dybcio <[email protected]>
    Reviewed-by: Laurent Pinchart <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 18a06f2eeb841185336da7fd3fd5dfd239f23014
Author: Bryan O'Donoghue <[email protected]>
Date:   Wed Aug 30 16:16:12 2023 +0100

    media: qcom: camss: Fix missing vfe_lite clocks check

    commit b6e1bdca463a932c1ac02caa7d3e14bf39288e0c upstream.

    check_clock doesn't account for vfe_lite which means that vfe_lite will
    never get validated by this routine. Add the clock name to the expected set
    to remediate.

    Fixes: 7319cdf189bb ("media: camss: Add support for VFE hardware version Titan 170")
    Cc: [email protected]
    Signed-off-by: Bryan O'Donoghue <[email protected]>
    Reviewed-by: Laurent Pinchart <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit ddc424aedbd379f277870db20883d38d34639e5a
Author: Bryan O'Donoghue <[email protected]>
Date:   Wed Aug 30 16:16:11 2023 +0100

    media: qcom: camss: Fix VFE-480 vfe_disable_output()

    commit 7f24d291350426d40b36dfbe6b3090617cdfd37a upstream.

    vfe-480 is copied from vfe-17x and has the same racy idle timeout bug as in
    17x.

    Fix the vfe_disable_output() logic to no longer be racy and to conform
    to the 17x way of quiescing and then resetting the VFE.

    Fixes: 4edc8eae715c ("media: camss: Add initial support for VFE hardware version Titan 480")
    Cc: [email protected]
    Signed-off-by: Bryan O'Donoghue <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 0f3e5f93fe77bc16e632686b7571d296f91a76be
Author: Bryan O'Donoghue <[email protected]>
Date:   Wed Aug 30 16:16:10 2023 +0100

    media: qcom: camss: Fix VFE-17x vfe_disable_output()

    commit 3143ad282fc08bf995ee73e32a9e40c527bf265d upstream.

    There are two problems with the current vfe_disable_output() routine.

    Firstly we rightly use a spinlock to protect output->gen2.active_num
    everywhere except for in the IDLE timeout path of vfe_disable_output().
    Even if that is not racy "in practice" somehow it is by happenstance not
    by design.

    Secondly we do not get consistent behaviour from this routine. On
    sc8280xp 50% of the time I get "VFE idle timeout - resetting". In this
    case the subsequent capture will succeed. The other 50% of the time, we
    don't hit the idle timeout, never do the VFE reset and subsequent
    captures stall indefinitely.

    Rewrite the vfe_disable_output() routine to

    - Quiesce write masters with vfe_wm_stop()
    - Set active_num = 0

    remembering to hold the spinlock when we do so followed by

    - Reset the VFE

    Testing on sc8280xp and sdm845 shows this to be a valid fix.

    Fixes: 7319cdf189bb ("media: camss: Add support for VFE hardware version Titan 170")
    Cc: [email protected]
    Signed-off-by: Bryan O'Donoghue <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 04ef31a3e38ad207aee87d8a89290152b9000074
Author: Bryan O'Donoghue <[email protected]>
Date:   Wed Aug 30 16:16:09 2023 +0100

    media: qcom: camss: Fix vfe_get() error jump

    commit 26bda3da00c3edef727a6acb00ed2eb4b22f8723 upstream.

    Right now it is possible to do a vfe_get() with the internal reference
    count at 1. If vfe_check_clock_rates() returns non-zero then we will
    leave the reference count as-is and

    run:
    - pm_runtime_put_sync()
    - vfe->ops->pm_domain_off()

    skip:
    - camss_disable_clocks()

    Subsequent vfe_put() calls will when the ref-count is non-zero
    unconditionally run:

    - pm_runtime_put_sync()
    - vfe->ops->pm_domain_off()
    - camss_disable_clocks()

    vfe_get() should not attempt to roll-back on error when the ref-count is
    non-zero as the upper layers will still do their own vfe_put() operations.

    vfe_put() will drop the reference count and do the necessary power
    domain release, the cleanup jumps in vfe_get() should only be run when
    the ref-count is zero.

    [   50.095796] CPU: 7 PID: 3075 Comm: cam Not tainted 6.3.2+ #80
    [   50.095798] Hardware name: LENOVO 21BXCTO1WW/21BXCTO1WW, BIOS N3HET82W (1.54 ) 05/26/2023
    [   50.095799] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    [   50.095802] pc : refcount_warn_saturate+0xf4/0x148
    [   50.095804] lr : refcount_warn_saturate+0xf4/0x148
    [   50.095805] sp : ffff80000c7cb8b0
    [   50.095806] x29: ffff80000c7cb8b0 x28: ffff16ecc0e3fc10 x27: 0000000000000000
    [   50.095810] x26: 0000000000000000 x25: 0000000000020802 x24: 0000000000000000
    [   50.095813] x23: ffff16ecc7360640 x22: 00000000ffffffff x21: 0000000000000005
    [   50.095815] x20: ffff16ed175f4400 x19: ffffb4d9852942a8 x18: ffffffffffffffff
    [   50.095818] x17: ffffb4d9852d4a48 x16: ffffb4d983da5db8 x15: ffff80000c7cb320
    [   50.095821] x14: 0000000000000001 x13: 2e656572662d7265 x12: 7466612d65737520
    [   50.095823] x11: 00000000ffffefff x10: ffffb4d9850cebf0 x9 : ffffb4d9835cf954
    [   50.095826] x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 0000000000057fa8
    [   50.095829] x5 : ffff16f813fe3d08 x4 : 0000000000000000 x3 : ffff621e8f4d2000
    [   50.095832] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff16ed32119040
    [   50.095835] Call trace:
    [   50.095836]  refcount_warn_saturate+0xf4/0x148
    [   50.095838]  device_link_put_kref+0x84/0xc8
    [   50.095843]  device_link_del+0x38/0x58
    [   50.095846]  vfe_pm_domain_off+0x3c/0x50 [qcom_camss]
    [   50.095860]  vfe_put+0x114/0x140 [qcom_camss]
    [   50.095869]  csid_set_power+0x2c8/0x408 [qcom_camss]
    [   50.095878]  pipeline_pm_power_one+0x164/0x170 [videodev]
    [   50.095896]  pipeline_pm_power+0xc4/0x110 [videodev]
    [   50.095909]  v4l2_pipeline_pm_use+0x5c/0xa0 [videodev]
    [   50.095923]  v4l2_pipeline_pm_get+0x1c/0x30 [videodev]
    [   50.095937]  video_open+0x7c/0x100 [qcom_camss]
    [   50.095945]  v4l2_open+0x84/0x130 [videodev]
    [   50.095960]  chrdev_open+0xc8/0x250
    [   50.095964]  do_dentry_open+0x1bc/0x498
    [   50.095966]  vfs_open+0x34/0x40
    [   50.095968]  path_openat+0xb44/0xf20
    [   50.095971]  do_filp_open+0xa4/0x160
    [   50.095974]  do_sys_openat2+0xc8/0x188
    [   50.095975]  __arm64_sys_openat+0x6c/0xb8
    [   50.095977]  invoke_syscall+0x50/0x128
    [   50.095982]  el0_svc_common.constprop.0+0x4c/0x100
    [   50.095985]  do_el0_svc+0x40/0xa8
    [   50.095988]  el0_svc+0x2c/0x88
    [   50.095991]  el0t_64_sync_handler+0xf4/0x120
    [   50.095994]  el0t_64_sync+0x190/0x198
    [   50.095996] ---[ end trace 0000000000000000 ]---

    Fixes: 779096916dae ("media: camss: vfe: Fix runtime PM imbalance on error")
    Cc: [email protected]
    Signed-off-by: Bryan O'Donoghue <[email protected]>
    Reviewed-by: Laurent Pinchart <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 3166c3af55fe197f332d7cd83982e8b06dba5bd4
Author: Bryan O'Donoghue <[email protected]>
Date:   Wed Aug 30 16:16:06 2023 +0100

    media: qcom: camss: Fix pm_domain_on sequence in probe

    commit 7405116519ad70b8c7340359bfac8db8279e7ce4 upstream.

    We need to make sure camss_configure_pd() happens before
    camss_register_entities() as the vfe_get() path relies on the pointer
    provided by camss_configure_pd().

    Fix the ordering sequence in probe to ensure the pointers vfe_get() demands
    are present by the time camss_register_entities() runs.

    In order to facilitate backporting to stable kernels I've moved the
    configure_pd() call pretty early on the probe() function so that
    irrespective of the existence of the old error handling jump labels this
    patch should still apply to -next circa Aug 2023 to v5.13 inclusive.

    Fixes: 2f6f8af67203 ("media: camss: Refactor VFE power domain toggling")
    Cc: [email protected]
    Signed-off-by: Bryan O'Donoghue <[email protected]>
    Reviewed-by: Laurent Pinchart <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 6dcb2605c284fbd54963547ab4df1c91e591a959
Author: Victor Shih <[email protected]>
Date:   Tue Nov 7 17:57:40 2023 +0800

    mmc: sdhci-pci-gli: GL9750: Mask the replay timer timeout of AER

    commit 015c9cbcf0ad709079117d27c2094a46e0eadcdb upstream.

    Due to a flaw in the hardware design, the GL9750 replay timer frequently
    times out when ASPM is enabled. As a result, the warning messages will
    often appear in the system log when the system accesses the GL9750
    PCI config. Therefore, the replay timer timeout must be masked.

    Fixes: d7133797e9e1 ("mmc: sdhci-pci-gli: A workaround to allow GL9750 to enter ASPM L1.2")
    Signed-off-by: Victor Shih <[email protected]>
    Acked-by: Adrian Hunter <[email protected]>
    Acked-by: Kai-Heng Feng <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Ulf Hansson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit f7164cb0371f194bb1ce6309897b58f6ad7e32b5
Author: ChunHao Lin <[email protected]>
Date:   Fri Nov 10 01:33:59 2023 +0800

    r8169: add handling DASH when DASH is disabled

    commit 0ab0c45d8aaea5192328bfa6989673aceafc767c upstream.

    For devices that support DASH, even DASH is disabled, there may still
    exist a default firmware that will influence device behavior.
    So driver needs to handle DASH for devices that support DASH, no
    matter the DASH status is.

    This patch also prepares for "fix network lost after resume on DASH
    systems".

    Fixes: ee7a1beb9759 ("r8169:call "rtl8168_driver_start" "rtl8168_driver_stop" only when hardware dash function is enabled")
    Cc: [email protected]
    Signed-off-by: ChunHao Lin <[email protected]>
    Reviewed-by: Heiner Kallweit <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 862565f32494c7e8d88f49a02989c792c3926841
Author: ChunHao Lin <[email protected]>
Date:   Fri Nov 10 01:34:00 2023 +0800

    r8169: fix network lost after resume on DASH systems

    commit 868c3b95afef4883bfb66c9397482da6840b5baf upstream.

    Device that support DASH may be reseted or powered off during suspend.
    So driver needs to handle DASH during system suspend and resume. Or
    DASH firmware will influence device behavior and causes network lost.

    Fixes: b646d90053f8 ("r8169: magic.")
    Cc: [email protected]
    Reviewed-by: Heiner Kallweit <[email protected]>
    Signed-off-by: ChunHao Lin <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 9e9e2107ae3637ed143d87f86025b47ddb502060
Author: Paolo Abeni <[email protected]>
Date:   Tue Nov 14 00:16:16 2023 +0100

    mptcp: fix setsockopt(IP_TOS) subflow locking

    commit 7679d34f97b7a09fd565f5729f79fd61b7c55329 upstream.

    The MPTCP implementation of the IP_TOS socket option uses the lockless
    variant of the TOS manipulation helper and does not hold such lock at
    the helper invocation time.

    Add the required locking.

    Fixes: ffcacff87cd6 ("mptcp: Support for IP_TOS for MPTCP setsockopt()")
    Cc: [email protected]
    Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/457
    Signed-off-by: Paolo Abeni <[email protected]>
    Reviewed-by: Mat Martineau <[email protected]>
    Signed-off-by: Matthieu Baerts <[email protected]>
    Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-4-7b9cd6a7b7f4@kernel.org
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit dba6f08cef1944116be7a480a4f8e51faca2a184
Author: Geliang Tang <[email protected]>
Date:   Tue Nov 14 00:16:15 2023 +0100

    mptcp: add validity check for sending RM_ADDR

    commit 8df220b29282e8b450ea57be62e1eccd4996837c upstream.

    This patch adds the validity check for sending RM_ADDRs for userspace PM
    in mptcp_pm_remove_addrs(), only send a RM_ADDR when the address is in the
    anno_list or conn_list.

    Fixes: 8b1c94da1e48 ("mptcp: only send RM_ADDR in nl_cmd_remove")
    Cc: [email protected]
    Signed-off-by: Geliang Tang <[email protected]>
    Reviewed-by: Mat Martineau <[email protected]>
    Signed-off-by: Matthieu Baerts <[email protected]>
    Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-3-7b9cd6a7b7f4@kernel.org
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 70ff9b65a72885b3a2dfde6709da1f19b85fa696
Author: Paolo Abeni <[email protected]>
Date:   Tue Nov 14 00:16:13 2023 +0100

    mptcp: deal with large GSO size

    commit 9fce92f050f448a0d1ddd9083ef967d9930f1e52 upstream.

    After the blamed commit below, the TCP sockets (and the MPTCP subflows)
    can build egress packets larger than 64K. That exceeds the maximum DSS
    data size, the length being misrepresent on the wire and the stream being
    corrupted, as later observed on the receiver:

      WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
      CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 #45
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
      netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
      RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
      RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
      RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
      netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
      RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
      RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
      R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
      R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
      FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
      PKRU: 55555554
      Call Trace:
       <IRQ>
       mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
       subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
       tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
       tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
       tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
       tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
       ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
       ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
       ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
       __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
       process_backlog+0x353/0x660 net/core/dev.c:5974
       __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
       net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
       __do_softirq+0x184/0x524 kernel/softirq.c:553
       do_softirq+0xdd/0x130 kernel/softirq.c:454

    Address the issue explicitly bounding the maximum GSO size to what MPTCP
    actually allows.

    Reported-by: Christoph Paasch <[email protected]>
    Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/450
    Fixes: 7c4e983c4f3c ("net: allow gso_max_size to exceed 65536")
    Cc: [email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Reviewed-by: Mat Martineau <[email protected]>
    Signed-off-by: Matthieu Baerts <[email protected]>
    Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-1-7b9cd6a7b7f4@kernel.org
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 16fcda24b17507f2bb584e14dec5285301528f52
Author: Roman Gushchin <[email protected]>
Date:   Tue Nov 7 09:18:02 2023 -0800

    mm: kmem: drop __GFP_NOFAIL when allocating objcg vectors

    commit 24948e3b7b12e0031a6edb4f49bbb9fb2ad1e4e9 upstream.

    Objcg vectors attached to slab pages to store slab object ownership
    information are allocated using gfp flags for the original slab
    allocation.  Depending on slab page order and the size of slab objects,
    objcg vector can take several pages.

    If the original allocation was done with the __GFP_NOFAIL flag, it
    triggered a warning in the page allocation code.  Indeed, order > 1 pages
    should not been allocated with the __GFP_NOFAIL flag.

    Fix this by simply dropping the __GFP_NOFAIL flag when allocating the
    objcg vector.  It effectively allows to skip the accounting of a single
    slab object under a heavy memory pressure.

    An alternative would be to implement the mechanism to fallback to order-0
    allocations for accounting metadata, which is also not perfect because it
    will increase performance penalty and memory footprint of the kernel
    memory accounting under memory pressure.

    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Roman Gushchin <[email protected]>
    Reported-by: Christoph Lameter <[email protected]>
    Closes: https://lkml.kernel.org/r/[email protected]
    Acked-by: Shakeel Butt <[email protected]>
    Cc: Matthew Wilcox <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit a7fd033550271895e2a87ffc8303a03ffdeb9096
Author: Stefan Roesch <[email protected]>
Date:   Mon Nov 6 10:19:18 2023 -0800

    mm: fix for negative counter: nr_file_hugepages

    commit a48d5bdc877b85201e42cef9c2fdf5378164c23a upstream.

    While qualifiying the 6.4 release, the following warning was detected in
    messages:

    vmstat_refresh: nr_file_hugepages -15664

    The warning is caused by the incorrect updating of the NR_FILE_THPS
    counter in the function split_huge_page_to_list.  The if case is checking
    for folio_test_swapbacked, but the else case is missing the check for
    folio_test_pmd_mappable.  The other functions that manipulate the counter
    like __filemap_add_folio and filemap_unaccount_folio have the
    corresponding check.

    I have a test case, which reproduces the problem. It can be found here:
      https://github.com/sroeschus/testcase/blob/main/vmstat_refresh/madv.c

    The test case reproduces on an XFS filesystem. Running the same test
    case on a BTRFS filesystem does not reproduce the problem.

    AFAIK version 6.1 until 6.6 are affected by this problem.

    [[email protected]: whitespace fix]
    [[email protected]: test for folio_test_pmd_mappable()]
      Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Stefan Roesch <[email protected]>
    Co-debugged-by: Johannes Weiner <[email protected]>
    Acked-by: Johannes Weiner <[email protected]>
    Reviewed-by: Matthew Wilcox (Oracle) <[email protected]>
    Reviewed-by: David Hildenbrand <[email protected]>
    Reviewed-by: Yang Shi <[email protected]>
    Cc: Rik van Riel <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 2594bdaa16b47d304cbe3f3438632b7175b3d747
Author: Victor Shih <[email protected]>
Date:   Tue Sep 12 17:17:10 2023 +0800

    mmc: sdhci-pci-gli: A workaround to allow GL9750 to enter ASPM L1.2

    commit d7133797e9e1b72fd89237f68cb36d745599ed86 upstream.

    When GL9750 enters ASPM L1 sub-states, it will stay at L1.1 and will not
    enter L1.2. The workaround is to toggle PM state to allow GL9750 to enter
    ASPM L1.2.

    Signed-off-by: Victor Shih <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Ulf Hansson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 97fb6013f318d77077413376d61691e930c2226c
Author: Nam Cao <[email protected]>
Date:   Tue Aug 29 20:25:00 2023 +0200

    riscv: kprobes: allow writing to x0

    commit 8cb22bec142624d21bc85ff96b7bad10b6220e6a upstream.

    Instructions can write to x0, so we should simulate these instructions
    normally.

    Currently, the kernel hangs if an instruction who writes to x0 is
    simulated.

    Fixes: c22b0bcb1dd0 ("riscv: Add kprobes supported")
    Cc: [email protected]
    Signed-off-by: Nam Cao <[email protected]>
    Reviewed-by: Charlie Jenkins <[email protected]>
    Acked-by: Guo Ren <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 645257ad8d307f7f11984a1e7ca713349ef12944
Author: Song Shuai <[email protected]>
Date:   Tue Aug 29 21:39:20 2023 -0700

    riscv: correct pt_level name via pgtable_l5/4_enabled

    commit e59e5e2754bf983fc58ad18f99b5eec01f1a0745 upstream.

    The pt_level uses CONFIG_PGTABLE_LEVELS to display page table names.
    But if page mode is downgraded from kernel cmdline or restricted by
    the hardware in 64BIT, it will give a wrong name.

    Like, using no4lvl for sv39, ptdump named the 1G-mapping as "PUD"
    that should be "PGD":

    0xffffffd840000000-0xffffffd900000000    0x00000000c0000000         3G PUD     D A G . . W R V

    So select "P4D/PUD" or "PGD" via pgtable_l5/4_enabled to correct it.

    Fixes: e8a62cc26ddf ("riscv: Implement sv48 support")
    Reviewed-by: Alexandre Ghiti <[email protected]>
    Signed-off-by: Song Shuai <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit fb1b16f04135b50a577370f3ac00f7e8e402a3bd
Author: Song Shuai <[email protected]>
Date:   Wed Aug 9 11:10:23 2023 +0800

    riscv: mm: Update the comment of CONFIG_PAGE_OFFSET

    commit 559fe94a449cba5b50a7cffea60474b385598c00 upstream.

    Since the commit 011f09d12052 set sv57 as default for CONFIG_64BIT,
    the comment of CONFIG_PAGE_OFFSET should be updated too.

    Fixes: 011f09d12052 ("riscv: mm: Set sv57 on defaultly")
    Signed-off-by: Song Shuai <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Cc: [email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 9f74b261e4e2c39168b3e10743b9e606b2521422
Author: Nathan Chancellor <[email protected]>
Date:   Wed Nov 8 14:12:15 2023 +0800

    LoongArch: Mark __percpu functions as always inline

    commit 71945968d8b128c955204baa33ec03bdd91bdc26 upstream.

    A recent change to the optimization pipeline in LLVM reveals some
    fragility around the inlining of LoongArch's __percpu functions, which
    manifests as a BUILD_BUG() failure:

      In file included from kernel/sched/build_policy.c:17:
      In file included from include/linux/sched/cputime.h:5:
      In file included from include/linux/sched/signal.h:5:
      In file included from include/linux/rculist.h:11:
      In file included from include/linux/rcupdate.h:26:
      In file included from include/linux/irqflags.h:18:
      arch/loongarch/include/asm/percpu.h:97:3: error: call to '__compiletime_assert_51' declared with 'error' attribute: BUILD_BUG failed
         97 |                 BUILD_BUG();
            |                 ^
      include/linux/build_bug.h:59:21: note: expanded from macro 'BUILD_BUG'
         59 | #define BUILD_BUG() BUILD_BUG_ON_MSG(1, "BUILD_BUG failed")
            |                     ^
      include/linux/build_bug.h:39:37: note: expanded from macro 'BUILD_BUG_ON_MSG'
         39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
            |                                     ^
      include/linux/compiler_types.h:425:2: note: expanded from macro 'compiletime_assert'
        425 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
            |         ^
      include/linux/compiler_types.h:413:2: note: expanded from macro '_compiletime_assert'
        413 |         __compiletime_assert(condition, msg, prefix, suffix)
            |         ^
      include/linux/compiler_types.h:406:4: note: expanded from macro '__compiletime_assert'
        406 |                         prefix ## suffix();                             \
            |                         ^
      <scratch space>:86:1: note: expanded from here
         86 | __compiletime_assert_51
            | ^
      1 error generated.

    If these functions are not inlined (which the compiler is free to do
    even with functions marked with the standard 'inline' keyword), the
    BUILD_BUG() in the default case cannot be eliminated since the compiler
    cannot prove it is never used, resulting in a build failure due to the
    error attribute.

    Mark these functions as __always_inline to guarantee inlining so that
    the BUILD_BUG() only triggers when the default case genuinely cannot be
    eliminated due to an unexpected size.

    Cc:  <[email protected]>
    Closes: https://github.com/ClangBuiltLinux/linux/…
wanghao75 pushed a commit to openeuler-mirror/kernel that referenced this issue Dec 4, 2023
stable inclusion
from stable-v6.6.3
commit 57ced2eb77343a91d28f4a73675b05fe7b555def
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I8LBQP

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=57ced2eb77343a91d28f4a73675b05fe7b555def

--------------------------------

commit 9fce92f050f448a0d1ddd9083ef967d9930f1e52 upstream.

After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger than 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 #45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: multipath-tcp/mptcp_net-next#450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Cc: [email protected]
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-1-7b9cd6a7b7f4@kernel.org
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Zheng Zengkai <[email protected]>
johnny-mnemonic pushed a commit to linux-ia64/linux-stable-rc that referenced this issue Dec 31, 2023
commit 9fce92f upstream.

After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger than 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 #45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: multipath-tcp/mptcp_net-next#450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Cc: [email protected]
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-1-7b9cd6a7b7f4@kernel.org
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
matrix-wsk added a commit to deepin-community/kernel that referenced this issue Jan 25, 2024
* i2c: designware: Disable TX_EMPTY irq while waiting for block length byte

commit e8183fa10c25c7b3c20670bf2b430ddcc1ee03c0 upstream.

During SMBus block data read process, we have seen high interrupt rate
because of TX_EMPTY irq status while waiting for block length byte (the
first data byte after the address phase). The interrupt handler does not
do anything because the internal state is kept as STATUS_WRITE_IN_PROGRESS.
Hence, we should disable TX_EMPTY IRQ until I2C DesignWare receives
first data byte from I2C device, then re-enable it to resume SMBus
transaction.

It takes 0.789 ms for host to receive data length from slave.
Without the patch, i2c_dw_isr() is called 99 times by TX_EMPTY interrupt.
And it is none after applying the patch.

Cc: [email protected]
Co-developed-by: Chuong Tran <[email protected]>
Signed-off-by: Chuong Tran <[email protected]>
Signed-off-by: Tam Nguyen <[email protected]>
Acked-by: Jarkko Nikula <[email protected]>
Reviewed-by: Serge Semin <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* s390/ap: fix AP bus crash on early config change callback invocation

commit e14aec23025eeb1f2159ba34dbc1458467c4c347 upstream.

Fix kernel crash in AP bus code caused by very early invocation of the
config change callback function via SCLP.

After a fresh IML of the machine the crypto cards are still offline and
will get switched online only with activation of any LPAR which has the
card in it's configuration. A crypto card coming online is reported
to the LPAR via SCLP and the AP bus offers a callback function to get
this kind of information. However, it may happen that the callback is
invoked before the AP bus init function is complete. As the callback
triggers a synchronous AP bus scan, the scan may already run but some
internal states are not initialized by the AP bus init function resulting
in a crash like this:

  [   11.635859] Unable to handle kernel pointer dereference in virtual kernel address space
  [   11.635861] Failing address: 0000000000000000 TEID: 0000000000000887
  [   11.635862] Fault in home space mode while using kernel ASCE.
  [   11.635864] AS:00000000894c4007 R3:00000001fece8007 S:00000001fece7800 P:000000000000013d
  [   11.635879] Oops: 0004 ilc:1 [#1] SMP
  [   11.635882] Modules linked in:
  [   11.635884] CPU: 5 PID: 42 Comm: kworker/5:0 Not tainted 6.6.0-rc3-00003-g4dbf7cdc6b42 #12
  [   11.635886] Hardware name: IBM 3931 A01 751 (LPAR)
  [   11.635887] Workqueue: events_long ap_scan_bus
  [   11.635891] Krnl PSW : 0704c00180000000 0000000000000000 (0x0)
  [   11.635895]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
  [   11.635897] Krnl GPRS: 0000000001000a00 0000000000000000 0000000000000006 0000000089591940
  [   11.635899]            0000000080000000 0000000000000a00 0000000000000000 0000000000000000
  [   11.635901]            0000000081870c00 0000000089591000 000000008834e4e2 0000000002625a00
  [   11.635903]            0000000081734200 0000038000913c18 000000008834c6d6 0000038000913ac8
  [   11.635906] Krnl Code:>0000000000000000: 0000                illegal
  [   11.635906]            0000000000000002: 0000                illegal
  [   11.635906]            0000000000000004: 0000                illegal
  [   11.635906]            0000000000000006: 0000                illegal
  [   11.635906]            0000000000000008: 0000                illegal
  [   11.635906]            000000000000000a: 0000                illegal
  [   11.635906]            000000000000000c: 0000                illegal
  [   11.635906]            000000000000000e: 0000                illegal
  [   11.635915] Call Trace:
  [   11.635916]  [<0000000000000000>] 0x0
  [   11.635918]  [<000000008834e4e2>] ap_queue_init_state+0x82/0xb8
  [   11.635921]  [<000000008834ba1c>] ap_scan_domains+0x6fc/0x740
  [   11.635923]  [<000000008834c092>] ap_scan_adapter+0x632/0x8b0
  [   11.635925]  [<000000008834c3e4>] ap_scan_bus+0xd4/0x288
  [   11.635927]  [<00000000879a33ba>] process_one_work+0x19a/0x410
  [   11.635930] Discipline DIAG cannot be used without z/VM
  [   11.635930]  [<00000000879a3a2c>] worker_thread+0x3fc/0x560
  [   11.635933]  [<00000000879aea60>] kthread+0x120/0x128
  [   11.635936]  [<000000008792afa4>] __ret_from_fork+0x3c/0x58
  [   11.635938]  [<00000000885ebe62>] ret_from_fork+0xa/0x30
  [   11.635942] Last Breaking-Event-Address:
  [   11.635942]  [<000000008834c6d4>] ap_wait+0xcc/0x148

This patch improves the ap_bus_force_rescan() function which is
invoked by the config change callback by checking if a first
initial AP bus scan has been done. If not, the force rescan request
is simple ignored. Anyhow it does not make sense to trigger AP bus
re-scans even before the very first bus scan is complete.

Cc: [email protected]
Reviewed-by: Holger Dengler <[email protected]>
Signed-off-by: Harald Freudenberger <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* net: ethtool: Fix documentation of ethtool_sprintf()

commit f55d8e60f10909dbc5524e261041e1d28d7d20d8 upstream.

This function takes a pointer to a pointer, unlike sprintf() which is
passed a plain pointer. Fix up the documentation to make this clear.

Fixes: 7888fe53b706 ("ethtool: Add common function for filling out strings")
Cc: Alexander Duyck <[email protected]>
Cc: Justin Stitt <[email protected]>
Cc: [email protected]
Signed-off-by: Andrew Lunn <[email protected]>
Reviewed-by: Justin Stitt <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* net: dsa: lan9303: consequently nested-lock physical MDIO

commit 5a22fbcc10f3f7d94c5d88afbbffa240a3677057 upstream.

When LAN9303 is MDIO-connected two callchains exist into
mdio->bus->write():

1. switch ports 1&2 ("physical" PHYs):

virtual (switch-internal) MDIO bus (lan9303_switch_ops->phy_{read|write})->
  lan9303_mdio_phy_{read|write} -> mdiobus_{read|write}_nested

2. LAN9303 virtual PHY:

virtual MDIO bus (lan9303_phy_{read|write}) ->
  lan9303_virt_phy_reg_{read|write} -> regmap -> lan9303_mdio_{read|write}

If the latter functions just take
mutex_lock(&sw_dev->device->bus->mdio_lock) it triggers a LOCKDEP
false-positive splat. It's false-positive because the first
mdio_lock in the second callchain above belongs to virtual MDIO bus, the
second mdio_lock belongs to physical MDIO bus.

Consequent annotation in lan9303_mdio_{read|write} as nested lock
(similar to lan9303_mdio_phy_{read|write}, it's the same physical MDIO bus)
prevents the following splat:

WARNING: possible circular locking dependency detected
5.15.71 #1 Not tainted
------------------------------------------------------
kworker/u4:3/609 is trying to acquire lock:
ffff000011531c68 (lan9303_mdio:131:(&lan9303_mdio_regmap_config)->lock){+.+.}-{3:3}, at: regmap_lock_mutex
but task is already holding lock:
ffff0000114c44d8 (&bus->mdio_lock){+.+.}-{3:3}, at: mdiobus_read
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&bus->mdio_lock){+.+.}-{3:3}:
       lock_acquire
       __mutex_lock
       mutex_lock_nested
       lan9303_mdio_read
       _regmap_read
       regmap_read
       lan9303_probe
       lan9303_mdio_probe
       mdio_probe
       really_probe
       __driver_probe_device
       driver_probe_device
       __device_attach_driver
       bus_for_each_drv
       __device_attach
       device_initial_probe
       bus_probe_device
       deferred_probe_work_func
       process_one_work
       worker_thread
       kthread
       ret_from_fork
-> #0 (lan9303_mdio:131:(&lan9303_mdio_regmap_config)->lock){+.+.}-{3:3}:
       __lock_acquire
       lock_acquire.part.0
       lock_acquire
       __mutex_lock
       mutex_lock_nested
       regmap_lock_mutex
       regmap_read
       lan9303_phy_read
       dsa_slave_phy_read
       __mdiobus_read
       mdiobus_read
       get_phy_device
       mdiobus_scan
       __mdiobus_register
       dsa_register_switch
       lan9303_probe
       lan9303_mdio_probe
       mdio_probe
       really_probe
       __driver_probe_device
       driver_probe_device
       __device_attach_driver
       bus_for_each_drv
       __device_attach
       device_initial_probe
       bus_probe_device
       deferred_probe_work_func
       process_one_work
       worker_thread
       kthread
       ret_from_fork
other info that might help us debug this:
 Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&bus->mdio_lock);
                               lock(lan9303_mdio:131:(&lan9303_mdio_regmap_config)->lock);
                               lock(&bus->mdio_lock);
  lock(lan9303_mdio:131:(&lan9303_mdio_regmap_config)->lock);
*** DEADLOCK ***
5 locks held by kworker/u4:3/609:
 #0: ffff000002842938 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work
 #1: ffff80000bacbd60 (deferred_probe_work){+.+.}-{0:0}, at: process_one_work
 #2: ffff000007645178 (&dev->mutex){....}-{3:3}, at: __device_attach
 #3: ffff8000096e6e78 (dsa2_mutex){+.+.}-{3:3}, at: dsa_register_switch
 #4: ffff0000114c44d8 (&bus->mdio_lock){+.+.}-{3:3}, at: mdiobus_read
stack backtrace:
CPU: 1 PID: 609 Comm: kworker/u4:3 Not tainted 5.15.71 #1
Workqueue: events_unbound deferred_probe_work_func
Call trace:
 dump_backtrace
 show_stack
 dump_stack_lvl
 dump_stack
 print_circular_bug
 check_noncircular
 __lock_acquire
 lock_acquire.part.0
 lock_acquire
 __mutex_lock
 mutex_lock_nested
 regmap_lock_mutex
 regmap_read
 lan9303_phy_read
 dsa_slave_phy_read
 __mdiobus_read
 mdiobus_read
 get_phy_device
 mdiobus_scan
 __mdiobus_register
 dsa_register_switch
 lan9303_probe
 lan9303_mdio_probe
...

Cc: [email protected]
Fixes: dc7005831523 ("net: dsa: LAN9303: add MDIO managed mode support")
Signed-off-by: Alexander Sverdlin <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* net: phylink: initialize carrier state at creation

commit 02d5fdbf4f2b8c406f7a4c98fa52aa181a11d733 upstream.

Background: Turris Omnia (Armada 385); eth2 (mvneta) connected to SFP bus;
SFP module is present, but no fiber connected, so definitely no carrier.

After booting, eth2 is down, but netdev LED trigger surprisingly reports
link active. Then, after "ip link set eth2 up", the link indicator goes
away - as I would have expected it from the beginning.

It turns out, that the default carrier state after netdev creation is
"carrier ok". Some ethernet drivers explicitly call netif_carrier_off
during probing, others (like mvneta) don't - which explains the current
behaviour: only when the device is brought up, phylink_start calls
netif_carrier_off.

Fix this for all drivers using phylink, by calling netif_carrier_off in
phylink_create.

Fixes: 089381b27abe ("leds: initial support for Turris Omnia LEDs")
Cc: [email protected]
Suggested-by: Andrew Lunn <[email protected]>
Signed-off-by: Klaus Kudielka <[email protected]>
Reviewed-by: Russell King (Oracle) <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* gfs2: don't withdraw if init_threads() got interrupted

commit 0cdc6f44e9fdc2d20d720145bf99a39f611f6d61 upstream.

In gfs2_fill_super(), when mounting a gfs2 filesystem is interrupted,
kthread_create() can return -EINTR.  When that happens, we roll back
what has already been done and abort the mount.

Since commit 62dd0f98a0e5 ("gfs2: Flag a withdraw if init_threads()
fails), we are calling gfs2_withdraw_delayed() in gfs2_fill_super();
first via gfs2_make_fs_rw(), then directly.  But gfs2_withdraw_delayed()
only marks the filesystem as withdrawing and relies on a caller further
up the stack to do the actual withdraw, which doesn't exist in the
gfs2_fill_super() case.  Because the filesystem is marked as withdrawing
/ withdrawn, function gfs2_lm_unmount() doesn't release the dlm
lockspace, so when we try to mount that filesystem again, we get:

    gfs2: fsid=gohan:gohan0: Trying to join cluster "lock_dlm", "gohan:gohan0"
    gfs2: fsid=gohan:gohan0: dlm_new_lockspace error -17

Since commit b77b4a4815a9 ("gfs2: Rework freeze / thaw logic"), the
deadlock this gfs2_withdraw_delayed() call was supposed to work around
cannot occur anymore because freeze_go_callback() won't take the
sb->s_umount semaphore unconditionally anymore, so we can get rid of the
gfs2_withdraw_delayed() in gfs2_fill_super() entirely.

Reported-by: Alexander Aring <[email protected]>
Signed-off-by: Andreas Gruenbacher <[email protected]>
Cc: [email protected] # v6.5+
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* i2c: i801: fix potential race in i801_block_transaction_byte_by_byte

commit f78ca48a8ba9cdec96e8839351e49eec3233b177 upstream.

Currently we set SMBHSTCNT_LAST_BYTE only after the host has started
receiving the last byte. If we get e.g. preempted before setting
SMBHSTCNT_LAST_BYTE, the host may be finished with receiving the byte
before SMBHSTCNT_LAST_BYTE is set.
Therefore change the code to set SMBHSTCNT_LAST_BYTE before writing
SMBHSTSTS_BYTE_DONE for the byte before the last byte. Now the code
is also consistent with what we do in i801_isr_byte_done().

Reported-by: Jean Delvare <[email protected]>
Closes: https://lore.kernel.org/linux-i2c/[email protected]/
Cc: [email protected]
Acked-by: Andi Shyti <[email protected]>
Signed-off-by: Heiner Kallweit <[email protected]>
Reviewed-by: Jean Delvare <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* f2fs: do not return EFSCORRUPTED, but try to run online repair

commit 50a472bbc79ff9d5a88be8019a60e936cadf9f13 upstream.

If we return the error, there's no way to recover the status as of now, since
fsck does not fix the xattr boundary issue.

Cc: [email protected]
Signed-off-by: Jaegeuk Kim <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* f2fs: set the default compress_level on ioctl

commit f5f3bd903a5d3e3b2ba89f11e0e29db25e60c048 upstream.

Otherwise, we'll get a broken inode.

 # touch $FILE
 # f2fs_io setflags compression $FILE
 # f2fs_io set_coption 2 8 $FILE

[  112.227612] F2FS-fs (dm-51): sanity_check_compress_inode: inode (ino=8d3fe) has unsupported compress level: 0, run fsck to fix

Cc: [email protected]
Signed-off-by: Jaegeuk Kim <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* f2fs: avoid format-overflow warning

commit e0d4e8acb3789c5a8651061fbab62ca24a45c063 upstream.

With gcc and W=1 option, there's a warning like this:

fs/f2fs/compress.c: In function ‘f2fs_init_page_array_cache’:
fs/f2fs/compress.c:1984:47: error: ‘%u’ directive writing between
1 and 7 bytes into a region of size between 5 and 8
[-Werror=format-overflow=]
 1984 |  sprintf(slab_name, "f2fs_page_array_entry-%u:%u", MAJOR(dev),
		MINOR(dev));
      |                                               ^~

String "f2fs_page_array_entry-%u:%u" can up to 35. The first "%u" can up
to 4 and the second "%u" can up to 7, so total size is "24 + 4 + 7 = 35".
slab_name's size should be 35 rather than 32.

Cc: [email protected]
Signed-off-by: Su Hui <[email protected]>
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* f2fs: split initial and dynamic conditions for extent_cache

commit f803982190f0265fd36cf84670aa6daefc2b0768 upstream.

Let's allocate the extent_cache tree without dynamic conditions to avoid a
missing condition causing a panic as below.

 # create a file w/ a compressed flag
 # disable the compression
 # panic while updating extent_cache

F2FS-fs (dm-64): Swapfile: last extent is not aligned to section
F2FS-fs (dm-64): Swapfile (3) is not align to section: 1) creat(), 2) ioctl(F2FS_IOC_SET_PIN_FILE), 3) fallocate(2097152 * N)
Adding 124996k swap on ./swap-file.  Priority:0 extents:2 across:17179494468k
==================================================================
BUG: KASAN: null-ptr-deref in instrument_atomic_read_write out/common/include/linux/instrumented.h:101 [inline]
BUG: KASAN: null-ptr-deref in atomic_try_cmpxchg_acquire out/common/include/asm-generic/atomic-instrumented.h:705 [inline]
BUG: KASAN: null-ptr-deref in queued_write_lock out/common/include/asm-generic/qrwlock.h:92 [inline]
BUG: KASAN: null-ptr-deref in __raw_write_lock out/common/include/linux/rwlock_api_smp.h:211 [inline]
BUG: KASAN: null-ptr-deref in _raw_write_lock+0x5a/0x110 out/common/kernel/locking/spinlock.c:295
Write of size 4 at addr 0000000000000030 by task syz-executor154/3327

CPU: 0 PID: 3327 Comm: syz-executor154 Tainted: G           O      5.10.185 #1
Hardware name: emulation qemu-x86/qemu-x86, BIOS 2023.01-21885-gb3cc1cd24d 01/01/2023
Call Trace:
 __dump_stack out/common/lib/dump_stack.c:77 [inline]
 dump_stack_lvl+0x17e/0x1c4 out/common/lib/dump_stack.c:118
 __kasan_report+0x16c/0x260 out/common/mm/kasan/report.c:415
 kasan_report+0x51/0x70 out/common/mm/kasan/report.c:428
 kasan_check_range+0x2f3/0x340 out/common/mm/kasan/generic.c:186
 __kasan_check_write+0x14/0x20 out/common/mm/kasan/shadow.c:37
 instrument_atomic_read_write out/common/include/linux/instrumented.h:101 [inline]
 atomic_try_cmpxchg_acquire out/common/include/asm-generic/atomic-instrumented.h:705 [inline]
 queued_write_lock out/common/include/asm-generic/qrwlock.h:92 [inline]
 __raw_write_lock out/common/include/linux/rwlock_api_smp.h:211 [inline]
 _raw_write_lock+0x5a/0x110 out/common/kernel/locking/spinlock.c:295
 __drop_extent_tree+0xdf/0x2f0 out/common/fs/f2fs/extent_cache.c:1155
 f2fs_drop_extent_tree+0x17/0x30 out/common/fs/f2fs/extent_cache.c:1172
 f2fs_insert_range out/common/fs/f2fs/file.c:1600 [inline]
 f2fs_fallocate+0x19fd/0x1f40 out/common/fs/f2fs/file.c:1764
 vfs_fallocate+0x514/0x9b0 out/common/fs/open.c:310
 ksys_fallocate out/common/fs/open.c:333 [inline]
 __do_sys_fallocate out/common/fs/open.c:341 [inline]
 __se_sys_fallocate out/common/fs/open.c:339 [inline]
 __x64_sys_fallocate+0xb8/0x100 out/common/fs/open.c:339
 do_syscall_64+0x35/0x50 out/common/arch/x86/entry/common.c:46

Cc: [email protected]
Fixes: 72840cccc0a1 ("f2fs: allocate the extent_cache by default")
Reported-and-tested-by: [email protected]
Signed-off-by: Jaegeuk Kim <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* media: lirc: drop trailing space from scancode transmit

commit c8a489f820179fb12251e262b50303c29de991ac upstream.

When transmitting, infrared drivers expect an odd number of samples; iow
without a trailing space. No problems have been observed so far, so
this is just belt and braces.

Fixes: 9b6192589be7 ("media: lirc: implement scancode sending")
Cc: [email protected]
Signed-off-by: Sean Young <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* media: sharp: fix sharp encoding

commit 4f7efc71891462ab7606da7039f480d7c1584a13 upstream.

The Sharp protocol[1] encoding has incorrect timings for bit space.

[1] https://www.sbprojects.net/knowledge/ir/sharp.php

Fixes: d35afc5fe097 ("[media] rc: ir-sharp-decoder: Add encode capability")
Cc: [email protected]
Reported-by: Joe Ferner <[email protected]>
Closes: https://sourceforge.net/p/lirc/mailman/message/38604507/
Signed-off-by: Sean Young <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* media: venus: hfi_parser: Add check to keep the number of codecs within range

commit 0768a9dd809ef52440b5df7dce5a1c1c7e97abbd upstream.

Supported codec bitmask is populated from the payload from venus firmware.
There is a possible case when all the bits in the codec bitmask is set. In
such case, core cap for decoder is filled  and MAX_CODEC_NUM is utilized.
Now while filling the caps for encoder, it can lead to access the caps
array beyong 32 index. Hence leading to OOB write.
The fix counts the supported encoder and decoder. If the count is more than
max, then it skips accessing the caps.

Cc: [email protected]
Fixes: 1a73374a04e5 ("media: venus: hfi_parser: add common capability parser")
Signed-off-by: Vikash Garodia <[email protected]>
Signed-off-by: Stanimir Varbanov <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* media: venus: hfi: fix the check to handle session buffer requirement

commit b18e36dfd6c935da60a971310374f3dfec3c82e1 upstream.

Buffer requirement, for different buffer type, comes from video firmware.
While copying these requirements, there is an OOB possibility when the
payload from firmware is more than expected size. Fix the check to avoid
the OOB possibility.

Cc: [email protected]
Fixes: 09c2845e8fe4 ("[media] media: venus: hfi: add Host Firmware Interface (HFI)")
Reviewed-by: Nathan Hebert <[email protected]>
Signed-off-by: Vikash Garodia <[email protected]>
Signed-off-by: Stanimir Varbanov <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* media: venus: hfi: add checks to handle capabilities from firmware

commit 8d0b89398b7ebc52103e055bf36b60b045f5258f upstream.

The hfi parser, parses the capabilities received from venus firmware and
copies them to core capabilities. Consider below api, for example,
fill_caps - In this api, caps in core structure gets updated with the
number of capabilities received in firmware data payload. If the same api
is called multiple times, there is a possibility of copying beyond the max
allocated size in core caps.
Similar possibilities in fill_raw_fmts and fill_profile_level functions.

Cc: [email protected]
Fixes: 1a73374a04e5 ("media: venus: hfi_parser: add common capability parser")
Signed-off-by: Vikash Garodia <[email protected]>
Signed-off-by: Stanimir Varbanov <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* media: ccs: Correctly initialise try compose rectangle

commit 724ff68e968b19d786870d333f9952bdd6b119cb upstream.

Initialise the try sink compose rectangle size to the sink compose
rectangle for binner and scaler sub-devices. This was missed due to the
faulty condition that lead to the compose rectangles to be initialised for
the pixel array sub-device where it is not relevant.

Fixes: ccfc97bdb5ae ("[media] smiapp: Add driver")
Cc: [email protected]
Signed-off-by: Sakari Ailus <[email protected]>
Reviewed-by: Laurent Pinchart <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* drm/mediatek/dp: fix memory leak on ->get_edid callback audio detection

commit dab12fa8d2bd3868cf2de485ed15a3feef28a13d upstream.

The sads returned by drm_edid_to_sad() needs to be freed.

Fixes: e71a8ebbe086 ("drm/mediatek: dp: Audio support for MT8195")
Cc: Guillaume Ranquet <[email protected]>
Cc: Bo-Chen Chen <[email protected]>
Cc: AngeloGioacchino Del Regno <[email protected]>
Cc: Dmitry Osipenko <[email protected]>
Cc: Chun-Kuang Hu <[email protected]>
Cc: Philipp Zabel <[email protected]>
Cc: Matthias Brugger <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: <[email protected]> # v6.1+
Signed-off-by: Jani Nikula <[email protected]>
Reviewed-by: Chen-Yu Tsai <[email protected]>
Link: https://patchwork.kernel.org/project/dri-devel/patch/[email protected]/
Signed-off-by: Chun-Kuang Hu <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* drm/mediatek/dp: fix memory leak on ->get_edid callback error path

commit fcaf9761fd5884a64eaac48536f8c27ecfd2e6bc upstream.

Setting new_edid to NULL leaks the buffer.

Fixes: f70ac097a2cf ("drm/mediatek: Add MT8195 Embedded DisplayPort driver")
Cc: Markus Schneider-Pargmann <[email protected]>
Cc: Guillaume Ranquet <[email protected]>
Cc: Bo-Chen Chen <[email protected]>
Cc: CK Hu <[email protected]>
Cc: AngeloGioacchino Del Regno <[email protected]>
Cc: Dmitry Osipenko <[email protected]>
Cc: Chun-Kuang Hu <[email protected]>
Cc: Philipp Zabel <[email protected]>
Cc: Matthias Brugger <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: <[email protected]> # v6.1+
Signed-off-by: Jani Nikula <[email protected]>
Reviewed-by: Guillaume Ranquet <[email protected]>
Link: https://patchwork.kernel.org/project/dri-devel/patch/[email protected]/
Signed-off-by: Chun-Kuang Hu <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* dm-bufio: fix no-sleep mode

commit 2a695062a5a42aead8c539a344168d4806b3fda2 upstream.

dm-bufio has a no-sleep mode. When activated (with the
DM_BUFIO_CLIENT_NO_SLEEP flag), the bufio client is read-only and we
could call dm_bufio_get from tasklets. This is used by dm-verity.

Unfortunately, commit 450e8dee51aa ("dm bufio: improve concurrent IO
performance") broke this and the kernel would warn that cache_get()
was calling down_read() from no-sleeping context. The bug can be
reproduced by using "veritysetup open" with the "--use-tasklets"
flag.

This commit fixes dm-bufio, so that the tasklet mode works again, by
expanding use of the 'no_sleep_enabled' static_key to conditionally
use either a rw_semaphore or rwlock_t (which are colocated in the
buffer_tree structure using a union).

Signed-off-by: Mikulas Patocka <[email protected]>
Cc: [email protected]	# v6.4
Fixes: 450e8dee51aa ("dm bufio: improve concurrent IO performance")
Signed-off-by: Mike Snitzer <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* dm-verity: don't use blocking calls from tasklets

commit 28f07f2ab4b3a2714f1fefcc58ada4bcc195f806 upstream.

The commit 5721d4e5a9cd enhanced dm-verity, so that it can verify blocks
from tasklets rather than from workqueues. This reportedly improves
performance significantly.

However, dm-verity was using the flag CRYPTO_TFM_REQ_MAY_SLEEP from
tasklets which resulted in warnings about sleeping function being called
from non-sleeping context.

BUG: sleeping function called from invalid context at crypto/internal.h:206
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 14, name: ksoftirqd/0
preempt_count: 100, expected: 0
RCU nest depth: 0, expected: 0
CPU: 0 PID: 14 Comm: ksoftirqd/0 Tainted: G        W 6.7.0-rc1 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x32/0x50
 __might_resched+0x110/0x160
 crypto_hash_walk_done+0x54/0xb0
 shash_ahash_update+0x51/0x60
 verity_hash_update.isra.0+0x4a/0x130 [dm_verity]
 verity_verify_io+0x165/0x550 [dm_verity]
 ? free_unref_page+0xdf/0x170
 ? psi_group_change+0x113/0x390
 verity_tasklet+0xd/0x70 [dm_verity]
 tasklet_action_common.isra.0+0xb3/0xc0
 __do_softirq+0xaf/0x1ec
 ? smpboot_thread_fn+0x1d/0x200
 ? sort_range+0x20/0x20
 run_ksoftirqd+0x15/0x30
 smpboot_thread_fn+0xed/0x200
 kthread+0xdc/0x110
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork+0x28/0x40
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork_asm+0x11/0x20
 </TASK>

This commit fixes dm-verity so that it doesn't use the flags
CRYPTO_TFM_REQ_MAY_SLEEP and CRYPTO_TFM_REQ_MAY_BACKLOG from tasklets. The
crypto API would do GFP_ATOMIC allocation instead, it could return -ENOMEM
and we catch -ENOMEM in verity_tasklet and requeue the request to the
workqueue.

Signed-off-by: Mikulas Patocka <[email protected]>
Cc: [email protected]	# v6.0+
Fixes: 5721d4e5a9cd ("dm verity: Add optional "try_verify_in_tasklet" feature")
Signed-off-by: Mike Snitzer <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* nfsd: fix file memleak on client_opens_release

commit bc1b5acb40201a0746d68a7d7cfc141899937f4f upstream.

seq_release should be called to free the allocated seq_file

Cc: [email protected] # v5.3+
Signed-off-by: Mahmoud Adam <[email protected]>
Reviewed-by: Jeff Layton <[email protected]>
Fixes: 78599c42ae3c ("nfsd4: add file to display list of client's opens")
Reviewed-by: NeilBrown <[email protected]>
Tested-by: Jeff Layton <[email protected]>
Signed-off-by: Chuck Lever <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* NFSD: Update nfsd_cache_append() to use xdr_stream

commit 49cecd8628a9855cd993792a0377559ea32d5e7c upstream.

When inserting a DRC-cached response into the reply buffer, ensure
that the reply buffer's xdr_stream is updated properly. Otherwise
the server will send a garbage response.

Cc: [email protected] # v6.3+
Reviewed-by: Jeff Layton <[email protected]>
Tested-by: Jeff Layton <[email protected]>
Signed-off-by: Chuck Lever <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* LoongArch: Mark __percpu functions as always inline

commit 71945968d8b128c955204baa33ec03bdd91bdc26 upstream.

A recent change to the optimization pipeline in LLVM reveals some
fragility around the inlining of LoongArch's __percpu functions, which
manifests as a BUILD_BUG() failure:

  In file included from kernel/sched/build_policy.c:17:
  In file included from include/linux/sched/cputime.h:5:
  In file included from include/linux/sched/signal.h:5:
  In file included from include/linux/rculist.h:11:
  In file included from include/linux/rcupdate.h:26:
  In file included from include/linux/irqflags.h:18:
  arch/loongarch/include/asm/percpu.h:97:3: error: call to '__compiletime_assert_51' declared with 'error' attribute: BUILD_BUG failed
     97 |                 BUILD_BUG();
        |                 ^
  include/linux/build_bug.h:59:21: note: expanded from macro 'BUILD_BUG'
     59 | #define BUILD_BUG() BUILD_BUG_ON_MSG(1, "BUILD_BUG failed")
        |                     ^
  include/linux/build_bug.h:39:37: note: expanded from macro 'BUILD_BUG_ON_MSG'
     39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
        |                                     ^
  include/linux/compiler_types.h:425:2: note: expanded from macro 'compiletime_assert'
    425 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
        |         ^
  include/linux/compiler_types.h:413:2: note: expanded from macro '_compiletime_assert'
    413 |         __compiletime_assert(condition, msg, prefix, suffix)
        |         ^
  include/linux/compiler_types.h:406:4: note: expanded from macro '__compiletime_assert'
    406 |                         prefix ## suffix();                             \
        |                         ^
  <scratch space>:86:1: note: expanded from here
     86 | __compiletime_assert_51
        | ^
  1 error generated.

If these functions are not inlined (which the compiler is free to do
even with functions marked with the standard 'inline' keyword), the
BUILD_BUG() in the default case cannot be eliminated since the compiler
cannot prove it is never used, resulting in a build failure due to the
error attribute.

Mark these functions as __always_inline to guarantee inlining so that
the BUILD_BUG() only triggers when the default case genuinely cannot be
eliminated due to an unexpected size.

Cc:  <[email protected]>
Closes: https://github.com/ClangBuiltLinux/linux/issues/1955
Fixes: 46859ac8af52 ("LoongArch: Add multi-processor (SMP) support")
Link: https://github.com/llvm/llvm-project/commit/1a2e77cf9e11dbf56b5720c607313a566eebb16e
Suggested-by: Nick Desaulniers <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
Signed-off-by: Huacai Chen <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* riscv: Using TOOLCHAIN_HAS_ZIHINTPAUSE marco replace zihintpause

commit dd16ac404a685cce07e67261a94c6225d90ea7ba upstream.

Actually it is a part of Conor's
commit aae538cd03bc ("riscv: fix detection of toolchain
Zihintpause support").
It is looks like a merge issue. Samuel's
commit 0b1d60d6dd9e ("riscv: Fix build with
CONFIG_CC_OPTIMIZE_FOR_SIZE=y") do not base on Conor's commit and
revert to __riscv_zihintpause. So this patch can fix it.

Signed-off-by: Minda Chen <[email protected]>
Fixes: 3c349eacc559 ("Merge patch "riscv: Fix build with CONFIG_CC_OPTIMIZE_FOR_SIZE=y"")
Reviewed-by: Conor Dooley <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Cc: [email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* riscv: put interrupt entries into .irqentry.text

commit 87615e95f6f9ccd36d4a3905a2d87f91967ea9d2 upstream.

The interrupt entries are expected to be in the .irqentry.text section.
For example, for kprobes to work properly, exception code cannot be
probed; this is ensured by blacklisting addresses in the .irqentry.text
section.

Fixes: 7db91e57a0ac ("RISC-V: Task implementation")
Signed-off-by: Nam Cao <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Cc: [email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* riscv: mm: Update the comment of CONFIG_PAGE_OFFSET

commit 559fe94a449cba5b50a7cffea60474b385598c00 upstream.

Since the commit 011f09d12052 set sv57 as default for CONFIG_64BIT,
the comment of CONFIG_PAGE_OFFSET should be updated too.

Fixes: 011f09d12052 ("riscv: mm: Set sv57 on defaultly")
Signed-off-by: Song Shuai <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Cc: [email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* riscv: correct pt_level name via pgtable_l5/4_enabled

commit e59e5e2754bf983fc58ad18f99b5eec01f1a0745 upstream.

The pt_level uses CONFIG_PGTABLE_LEVELS to display page table names.
But if page mode is downgraded from kernel cmdline or restricted by
the hardware in 64BIT, it will give a wrong name.

Like, using no4lvl for sv39, ptdump named the 1G-mapping as "PUD"
that should be "PGD":

0xffffffd840000000-0xffffffd900000000    0x00000000c0000000         3G PUD     D A G . . W R V

So select "P4D/PUD" or "PGD" via pgtable_l5/4_enabled to correct it.

Fixes: e8a62cc26ddf ("riscv: Implement sv48 support")
Reviewed-by: Alexandre Ghiti <[email protected]>
Signed-off-by: Song Shuai <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* riscv: kprobes: allow writing to x0

commit 8cb22bec142624d21bc85ff96b7bad10b6220e6a upstream.

Instructions can write to x0, so we should simulate these instructions
normally.

Currently, the kernel hangs if an instruction who writes to x0 is
simulated.

Fixes: c22b0bcb1dd0 ("riscv: Add kprobes supported")
Cc: [email protected]
Signed-off-by: Nam Cao <[email protected]>
Reviewed-by: Charlie Jenkins <[email protected]>
Acked-by: Guo Ren <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* mmc: sdhci-pci-gli: A workaround to allow GL9750 to enter ASPM L1.2

commit d7133797e9e1b72fd89237f68cb36d745599ed86 upstream.

When GL9750 enters ASPM L1 sub-states, it will stay at L1.1 and will not
enter L1.2. The workaround is to toggle PM state to allow GL9750 to enter
ASPM L1.2.

Signed-off-by: Victor Shih <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Ulf Hansson <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* mm: fix for negative counter: nr_file_hugepages

commit a48d5bdc877b85201e42cef9c2fdf5378164c23a upstream.

While qualifiying the 6.4 release, the following warning was detected in
messages:

vmstat_refresh: nr_file_hugepages -15664

The warning is caused by the incorrect updating of the NR_FILE_THPS
counter in the function split_huge_page_to_list.  The if case is checking
for folio_test_swapbacked, but the else case is missing the check for
folio_test_pmd_mappable.  The other functions that manipulate the counter
like __filemap_add_folio and filemap_unaccount_folio have the
corresponding check.

I have a test case, which reproduces the problem. It can be found here:
  https://github.com/sroeschus/testcase/blob/main/vmstat_refresh/madv.c

The test case reproduces on an XFS filesystem. Running the same test
case on a BTRFS filesystem does not reproduce the problem.

AFAIK version 6.1 until 6.6 are affected by this problem.

[[email protected]: whitespace fix]
[[email protected]: test for folio_test_pmd_mappable()]
  Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Stefan Roesch <[email protected]>
Co-debugged-by: Johannes Weiner <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Reviewed-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Reviewed-by: Yang Shi <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* mm: kmem: drop __GFP_NOFAIL when allocating objcg vectors

commit 24948e3b7b12e0031a6edb4f49bbb9fb2ad1e4e9 upstream.

Objcg vectors attached to slab pages to store slab object ownership
information are allocated using gfp flags for the original slab
allocation.  Depending on slab page order and the size of slab objects,
objcg vector can take several pages.

If the original allocation was done with the __GFP_NOFAIL flag, it
triggered a warning in the page allocation code.  Indeed, order > 1 pages
should not been allocated with the __GFP_NOFAIL flag.

Fix this by simply dropping the __GFP_NOFAIL flag when allocating the
objcg vector.  It effectively allows to skip the accounting of a single
slab object under a heavy memory pressure.

An alternative would be to implement the mechanism to fallback to order-0
allocations for accounting metadata, which is also not perfect because it
will increase performance penalty and memory footprint of the kernel
memory accounting under memory pressure.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Roman Gushchin <[email protected]>
Reported-by: Christoph Lameter <[email protected]>
Closes: https://lkml.kernel.org/r/[email protected]
Acked-by: Shakeel Butt <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* mptcp: deal with large GSO size

commit 9fce92f050f448a0d1ddd9083ef967d9930f1e52 upstream.

After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger than 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 #45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/450
Fixes: 7c4e983c4f3c ("net: allow gso_max_size to exceed 65536")
Cc: [email protected]
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-1-7b9cd6a7b7f4@kernel.org
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* mptcp: add validity check for sending RM_ADDR

commit 8df220b29282e8b450ea57be62e1eccd4996837c upstream.

This patch adds the validity check for sending RM_ADDRs for userspace PM
in mptcp_pm_remove_addrs(), only send a RM_ADDR when the address is in the
anno_list or conn_list.

Fixes: 8b1c94da1e48 ("mptcp: only send RM_ADDR in nl_cmd_remove")
Cc: [email protected]
Signed-off-by: Geliang Tang <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-3-7b9cd6a7b7f4@kernel.org
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* mptcp: fix setsockopt(IP_TOS) subflow locking

commit 7679d34f97b7a09fd565f5729f79fd61b7c55329 upstream.

The MPTCP implementation of the IP_TOS socket option uses the lockless
variant of the TOS manipulation helper and does not hold such lock at
the helper invocation time.

Add the required locking.

Fixes: ffcacff87cd6 ("mptcp: Support for IP_TOS for MPTCP setsockopt()")
Cc: [email protected]
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/457
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-4-7b9cd6a7b7f4@kernel.org
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* selftests: mptcp: fix fastclose with csum failure

commit 7cefbe5e1dacc7236caa77e9d072423f21422fe2 upstream.

Running the mp_join selftest manually with the following command line:

  ./mptcp_join.sh -z -C

leads to some failures:

  002 fastclose server test
  # ...
  rtx                                 [fail] got 1 MP_RST[s] TX expected 0
  # ...
  rstrx                               [fail] got 1 MP_RST[s] RX expected 0

The problem is really in the wrong expectations for the RST checks
implied by the csum validation. Note that the same check is repeated
explicitly in the same test-case, with the correct expectation and
pass successfully.

Address the issue explicitly setting the correct expectation for
the failing checks.

Reported-by: Xiumei Mu <[email protected]>
Fixes: 6bf41020b72b ("selftests: mptcp: update and extend fastclose test-cases")
Cc: [email protected]
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Matthieu Baerts <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-5-7b9cd6a7b7f4@kernel.org
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* r8169: fix network lost after resume on DASH systems

commit 868c3b95afef4883bfb66c9397482da6840b5baf upstream.

Device that support DASH may be reseted or powered off during suspend.
So driver needs to handle DASH during system suspend and resume. Or
DASH firmware will influence device behavior and causes network lost.

Fixes: b646d90053f8 ("r8169: magic.")
Cc: [email protected]
Reviewed-by: Heiner Kallweit <[email protected]>
Signed-off-by: ChunHao Lin <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* r8169: add handling DASH when DASH is disabled

commit 0ab0c45d8aaea5192328bfa6989673aceafc767c upstream.

For devices that support DASH, even DASH is disabled, there may still
exist a default firmware that will influence device behavior.
So driver needs to handle DASH for devices that support DASH, no
matter the DASH status is.

This patch also prepares for "fix network lost after resume on DASH
systems".

Fixes: ee7a1beb9759 ("r8169:call "rtl8168_driver_start" "rtl8168_driver_stop" only when hardware dash function is enabled")
Cc: [email protected]
Signed-off-by: ChunHao Lin <[email protected]>
Reviewed-by: Heiner Kallweit <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* mmc: sdhci-pci-gli: GL9750: Mask the replay timer timeout of AER

commit 015c9cbcf0ad709079117d27c2094a46e0eadcdb upstream.

Due to a flaw in the hardware design, the GL9750 replay timer frequently
times out when ASPM is enabled. As a result, the warning messages will
often appear in the system log when the system accesses the GL9750
PCI config. Therefore, the replay timer timeout must be masked.

Fixes: d7133797e9e1 ("mmc: sdhci-pci-gli: A workaround to allow GL9750 to enter ASPM L1.2")
Signed-off-by: Victor Shih <[email protected]>
Acked-by: Adrian Hunter <[email protected]>
Acked-by: Kai-Heng Feng <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Ulf Hansson <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* media: qcom: camss: Fix pm_domain_on sequence in probe

commit 7405116519ad70b8c7340359bfac8db8279e7ce4 upstream.

We need to make sure camss_configure_pd() happens before
camss_register_entities() as the vfe_get() path relies on the pointer
provided by camss_configure_pd().

Fix the ordering sequence in probe to ensure the pointers vfe_get() demands
are present by the time camss_register_entities() runs.

In order to facilitate backporting to stable kernels I've moved the
configure_pd() call pretty early on the probe() function so that
irrespective of the existence of the old error handling jump labels this
patch should still apply to -next circa Aug 2023 to v5.13 inclusive.

Fixes: 2f6f8af67203 ("media: camss: Refactor VFE power domain toggling")
Cc: [email protected]
Signed-off-by: Bryan O'Donoghue <[email protected]>
Reviewed-by: Laurent Pinchart <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* media: qcom: camss: Fix vfe_get() error jump

commit 26bda3da00c3edef727a6acb00ed2eb4b22f8723 upstream.

Right now it is possible to do a vfe_get() with the internal reference
count at 1. If vfe_check_clock_rates() returns non-zero then we will
leave the reference count as-is and

run:
- pm_runtime_put_sync()
- vfe->ops->pm_domain_off()

skip:
- camss_disable_clocks()

Subsequent vfe_put() calls will when the ref-count is non-zero
unconditionally run:

- pm_runtime_put_sync()
- vfe->ops->pm_domain_off()
- camss_disable_clocks()

vfe_get() should not attempt to roll-back on error when the ref-count is
non-zero as the upper layers will still do their own vfe_put() operations.

vfe_put() will drop the reference count and do the necessary power
domain release, the cleanup jumps in vfe_get() should only be run when
the ref-count is zero.

[   50.095796] CPU: 7 PID: 3075 Comm: cam Not tainted 6.3.2+ #80
[   50.095798] Hardware name: LENOVO 21BXCTO1WW/21BXCTO1WW, BIOS N3HET82W (1.54 ) 05/26/2023
[   50.095799] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   50.095802] pc : refcount_warn_saturate+0xf4/0x148
[   50.095804] lr : refcount_warn_saturate+0xf4/0x148
[   50.095805] sp : ffff80000c7cb8b0
[   50.095806] x29: ffff80000c7cb8b0 x28: ffff16ecc0e3fc10 x27: 0000000000000000
[   50.095810] x26: 0000000000000000 x25: 0000000000020802 x24: 0000000000000000
[   50.095813] x23: ffff16ecc7360640 x22: 00000000ffffffff x21: 0000000000000005
[   50.095815] x20: ffff16ed175f4400 x19: ffffb4d9852942a8 x18: ffffffffffffffff
[   50.095818] x17: ffffb4d9852d4a48 x16: ffffb4d983da5db8 x15: ffff80000c7cb320
[   50.095821] x14: 0000000000000001 x13: 2e656572662d7265 x12: 7466612d65737520
[   50.095823] x11: 00000000ffffefff x10: ffffb4d9850cebf0 x9 : ffffb4d9835cf954
[   50.095826] x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 0000000000057fa8
[   50.095829] x5 : ffff16f813fe3d08 x4 : 0000000000000000 x3 : ffff621e8f4d2000
[   50.095832] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff16ed32119040
[   50.095835] Call trace:
[   50.095836]  refcount_warn_saturate+0xf4/0x148
[   50.095838]  device_link_put_kref+0x84/0xc8
[   50.095843]  device_link_del+0x38/0x58
[   50.095846]  vfe_pm_domain_off+0x3c/0x50 [qcom_camss]
[   50.095860]  vfe_put+0x114/0x140 [qcom_camss]
[   50.095869]  csid_set_power+0x2c8/0x408 [qcom_camss]
[   50.095878]  pipeline_pm_power_one+0x164/0x170 [videodev]
[   50.095896]  pipeline_pm_power+0xc4/0x110 [videodev]
[   50.095909]  v4l2_pipeline_pm_use+0x5c/0xa0 [videodev]
[   50.095923]  v4l2_pipeline_pm_get+0x1c/0x30 [videodev]
[   50.095937]  video_open+0x7c/0x100 [qcom_camss]
[   50.095945]  v4l2_open+0x84/0x130 [videodev]
[   50.095960]  chrdev_open+0xc8/0x250
[   50.095964]  do_dentry_open+0x1bc/0x498
[   50.095966]  vfs_open+0x34/0x40
[   50.095968]  path_openat+0xb44/0xf20
[   50.095971]  do_filp_open+0xa4/0x160
[   50.095974]  do_sys_openat2+0xc8/0x188
[   50.095975]  __arm64_sys_openat+0x6c/0xb8
[   50.095977]  invoke_syscall+0x50/0x128
[   50.095982]  el0_svc_common.constprop.0+0x4c/0x100
[   50.095985]  do_el0_svc+0x40/0xa8
[   50.095988]  el0_svc+0x2c/0x88
[   50.095991]  el0t_64_sync_handler+0xf4/0x120
[   50.095994]  el0t_64_sync+0x190/0x198
[   50.095996] ---[ end trace 0000000000000000 ]---

Fixes: 779096916dae ("media: camss: vfe: Fix runtime PM imbalance on error")
Cc: [email protected]
Signed-off-by: Bryan O'Donoghue <[email protected]>
Reviewed-by: Laurent Pinchart <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* media: qcom: camss: Fix VFE-17x vfe_disable_output()

commit 3143ad282fc08bf995ee73e32a9e40c527bf265d upstream.

There are two problems with the current vfe_disable_output() routine.

Firstly we rightly use a spinlock to protect output->gen2.active_num
everywhere except for in the IDLE timeout path of vfe_disable_output().
Even if that is not racy "in practice" somehow it is by happenstance not
by design.

Secondly we do not get consistent behaviour from this routine. On
sc8280xp 50% of the time I get "VFE idle timeout - resetting". In this
case the subsequent capture will succeed. The other 50% of the time, we
don't hit the idle timeout, never do the VFE reset and subsequent
captures stall indefinitely.

Rewrite the vfe_disable_output() routine to

- Quiesce write masters with vfe_wm_stop()
- Set active_num = 0

remembering to hold the spinlock when we do so followed by

- Reset the VFE

Testing on sc8280xp and sdm845 shows this to be a valid fix.

Fixes: 7319cdf189bb ("media: camss: Add support for VFE hardware version Titan 170")
Cc: [email protected]
Signed-off-by: Bryan O'Donoghue <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* media: qcom: camss: Fix VFE-480 vfe_disable_output()

commit 7f24d291350426d40b36dfbe6b3090617cdfd37a upstream.

vfe-480 is copied from vfe-17x and has the same racy idle timeout bug as in
17x.

Fix the vfe_disable_output() logic to no longer be racy and to conform
to the 17x way of quiescing and then resetting the VFE.

Fixes: 4edc8eae715c ("media: camss: Add initial support for VFE hardware version Titan 480")
Cc: [email protected]
Signed-off-by: Bryan O'Donoghue <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* media: qcom: camss: Fix missing vfe_lite clocks check

commit b6e1bdca463a932c1ac02caa7d3e14bf39288e0c upstream.

check_clock doesn't account for vfe_lite which means that vfe_lite will
never get validated by this routine. Add the clock name to the expected set
to remediate.

Fixes: 7319cdf189bb ("media: camss: Add support for VFE hardware version Titan 170")
Cc: [email protected]
Signed-off-by: Bryan O'Donoghue <[email protected]>
Reviewed-by: Laurent Pinchart <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* media: qcom: camss: Fix set CSI2_RX_CFG1_VC_MODE when VC is greater than 3

commit e655d1ae9703286cef7fda8675cad62f649dc183 upstream.

VC_MODE = 0 implies a two bit VC address.
VC_MODE = 1 is required for VCs with a larger address than two bits.

Fixes: eebe6d00e9bf ("media: camss: Add support for CSID hardware version Titan 170")
Cc: [email protected]
Signed-off-by: Bryan O'Donoghue <[email protected]>
Reviewed-by: Laurent Pinchart <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* media: qcom: camss: Fix invalid clock enable bit disjunction

commit d8f7e1a60d01739a1d78db2b08603089c6cf7c8e upstream.

define CSIPHY_3PH_CMN_CSI_COMMON_CTRL5_CLK_ENABLE BIT(7)

disjunction for gen2 ? BIT(7) : is a nop we are setting the same bit
either way.

Fixes: 4abb21309fda ("media: camss: csiphy: Move to hardcode CSI Clock Lane number")
Cc: [email protected]
Signed-off-by: Bryan O'Donoghue <[email protected]>
Reviewed-by: Konrad Dybcio <[email protected]>
Reviewed-by: Laurent Pinchart <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* media: qcom: camss: Fix csid-gen2 for test pattern generator

commit 87889f1b7ea40d2544b49c62092e6ef2792dced7 upstream.

In the current driver csid Test Pattern Generator (TPG) doesn't work.
This change:
- fixes writing frame width and height values into CSID_TPG_DT_n_CFG_0
- fixes the shift by one between test_pattern control value and the
  actual pattern.
- drops fixed VC of 0x0a which testing showed prohibited some test
  patterns in the CSID to produce output.
So that TPG starts working, but with the below limitations:
- only test_pattern=9 works as it should
- test_pattern=8 and test_pattern=7 produce black frame (all zeroes)
- the rest of test_pattern's don't work (yavta doesn't get the data)
- regardless of the CFA pattern set by 'media-ctl -V' the actual pixel
  order is always the same (RGGB for any RAW8 or RAW10P format in
  4608x2592 resolution).

Tested with:

RAW10P format, VC0:
 media-ctl -V '"msm_csid0":0[fmt:SRGGB10/4608x2592 field:none]'
 media-ctl -V '"msm_vfe0_rdi0":0[fmt:SRGGB10/4608x2592 field:none]'
 media-ctl -l '"msm_csid0":1->"msm_vfe0_rdi0":0[1]'
 v4l2-ctl -d /dev/v4l-subdev6 -c test_pattern=9
 yavta -B capture-mplane --capture=3 -n 3 -f SRGGB10P -s 4608x2592 /dev/video0

RAW10P format, VC1:
 media-ctl -V '"msm_csid0":2[fmt:SRGGB10/4608x2592 field:none]'
 media-ctl -V '"msm_vfe0_rdi1":0[fmt:SRGGB10/4608x2592 field:none]'
 media-ctl -l '"msm_csid0":2->"msm_vfe0_rdi1":0[1]'
 v4l2-ctl -d /dev/v4l-subdev6 -c test_pattern=9
 yavta -B capture-mplane --capture=3 -n 3 -f SRGGB10P -s 4608x2592 /dev/video1

RAW8 format, VC0:
 media-ctl --reset
 media-ctl -V '"msm_csid0":0[fmt:SRGGB8/4608x2592 field:none]'
 media-ctl -V '"msm_vfe0_rdi0":0[fmt:SRGGB8/4608x2592 field:none]'
 media-ctl -l '"msm_csid0":1->"msm_vfe0_rdi0":0[1]'
 yavta -B capture-mplane --capture=3 -n 3 -f SRGGB8 -s 4608x2592 /dev/video0

Fixes: eebe6d00e9bf ("media: camss: Add support for CSID hardware version Titan 170")
Cc: [email protected]
Signed-off-by: Andrey Konovalov <[email protected]>
Signed-off-by: Bryan O'Donoghue <[email protected]>
Reviewed-by: Laurent Pinchart <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* Revert "net: r8169: Disable multicast filter for RTL8168H and RTL8107E"

commit 6a26310273c323380da21eb23fcfd50e31140913 upstream.

This reverts commit efa5f1311c4998e9e6317c52bc5ee93b3a0f36df.

I couldn't reproduce the reported issue. What I did, based on a pcap
packet log provided by the reporter:
- Used same chip version (RTL8168h)
- Set MAC address to the one used on the reporters system
- Replayed the EAPOL unicast packet that, according to the reporter,
  was filtered out by the mc filter.
The packet was properly received.

Therefore the root cause of the reported issue seems to be somewhere
else. Disabling mc filtering completely for the most common chip
version is a quite big hammer. Therefore revert the change and wait
for further analysis results from the reporter.

Cc: [email protected]
Signed-off-by: Heiner Kallweit <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

* ext4: fix race between writepages and remount

commit 745f17a4166e79315e4b7f33ce89d03e75a76983 upstream.

We got a WARNING in ext4_add_complete_io:
==================================================================
 WARNING: at fs/ext4/page-io.c:231 ext4_put_io_end_defer+0x182/0x250
 CPU: 10 PID: 77 Comm: ksoftirqd/10 Tainted: 6.3.0-rc2 #85
 RIP: 0010:ext4_put_io_end_defer+0x182/0x250 [ext4]
 [...]
 Call Trace:
  <TASK>
  ext4_end_bio+0xa8/0x240 [ext4]
  bio_endio+0x195/0x310
  blk_update_request+0x184/0x770
  scsi_end_request+0x2f/0x240
  scsi_io_completion+0x75/0x450
  scsi_finish_command+0xef/0x160
  scsi_complete+0xa3/0x180
  blk_complete_reqs+0x60/0x80
  blk_done_softirq+0x25/0x40
  __do_softirq+0x119/0x4c8
  run_ksoftirqd+0x42/0x70
  smpboot_thread_fn+0x136/0x3c0
  kthread+0x140/0x1a0
  ret_from_fork+0x2c/0x50
==================================================================

Above issue may happen as follows:

            cpu1                        cpu2
----------------------------|----------------------------
mount -o dioread_lock
ext4_writepages
 ext4_do_writepages
  *if (ext4_should_dioread_nolock(inode))*
    // rsv_blocks is not assigned here
                                 mount -o remount,dioread_nolock
  ext4_journal_start_with_reserve
   __ext4_journal_start
    __ext4_journal_start_sb
     jbd2__journal_start
      *if (rsv_blocks)*
        // h_rsv_handle is not initialized here
  mpage_map_and_submit_extent
    mpage_map_one_extent
      dioread_nolock = ext4_should_dioread_nolock(inode)
      if (dioread_nolock && (map->m_flags & EXT4_MAP_UNWRITTEN))
        mpd->io_submit.io_end->handle = handle->h_rsv_handle
        ext4_set_io_unwritten_flag
          io_end->flag |= EXT4_IO_END_UNWRITTEN
      // now io_end->handle is NULL but has EXT4_IO_END_UNWRITTEN flag

scsi_finish_command
 scsi_io_completion
  scsi_io_completion_action
   scsi_end_request
    blk_update_request
     req_bio_endio
      bio_endio
       bio->bi_end_io  > ext4_end_bio
        ext4_put_io_end_defer
	 ext4_add_complete_io
	  // trigger WARN_ON(!io_end->handle && sbi->s_journal);

The immediate cause of this problem is that ext4_should_dioread_nolock()
function returns inconsistent values in the ext4_do_writepages() and
mpage_map_one_extent(). There are four conditions in this function that
can be changed at mount time to cause this problem. These four conditions
can be divided into two categories:

    (1) journal_data and EXT4_EXTENTS_FL, which can be changed by ioctl
    (2) DELALLOC and DIOREAD_NOLOCK, which can be changed by remount

The two in the first category have been fixed by commit c8585c6fcaf2
("ext4: fix races between changing inode journal mode and ext4_writepages")
and commit cb85f4d23f79 ("ext4: fix race between writepages and enabling
EXT4_EXTENTS_FL") respectively.

Two cases in the other category have not yet been fixed, and the above
issue is caused by this situation. We refer to the fix for the first
category, when applying options during remount, we grab s_writepages_rwsem
to avoid racing with writepages ops to trigger this problem.

Fixes: 6b523df4fb5a ("ext4: use transaction reservation for extent conversion in ext4_end_io")
Cc: [email protected]
Signed-off-by: Baokun Li <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Theodo…
snajpa pushed a commit to vpsfreecz/linux that referenced this issue Mar 2, 2024
commit 9fce92f upstream.

After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger than 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 torvalds#45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: multipath-tcp/mptcp_net-next#450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Cc: [email protected]
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-1-7b9cd6a7b7f4@kernel.org
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
snajpa pushed a commit to vpsfreecz/linux that referenced this issue Mar 2, 2024
commit 9fce92f upstream.

After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger than 64K. That exceeds the maximum DSS
data size, the length being misrepresent on the wire and the stream being
corrupted, as later observed on the receiver:

  WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
  CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 torvalds#45
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
  RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
  RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
  netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
  RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
  RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
  R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
  R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
  FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
  PKRU: 55555554
  Call Trace:
   <IRQ>
   mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
   subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
   tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
   tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
   tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
   tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
   ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
   ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
   ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
   __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
   process_backlog+0x353/0x660 net/core/dev.c:5974
   __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
   net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
   __do_softirq+0x184/0x524 kernel/softirq.c:553
   do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <[email protected]>
Closes: multipath-tcp/mptcp_net-next#450
Fixes: 7c4e983 ("net: allow gso_max_size to exceed 65536")
Cc: [email protected]
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-1-7b9cd6a7b7f4@kernel.org
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bisected Git commit introducing the bug is known bug reproducer Has a simple program to reproduce the bug syzkaller
Projects
None yet
Development

No branches or pull requests

2 participants