NAS-124109 / 24.04 / Merge v6.1.50 to add support for AVX extensions #120

usaleem-ix · 2023-09-14T19:49:56Z

In Cobia nightlies, support for AVX extension is not available. Due to this, Fletcher4 and RAIDZ performance is hit.

Currently in Cobia nightlies:

root@truenas[~]# cat /sys/module/zfs/parameters/zfs_vdev_raidz_impl
cycle [fastest] original scalar sse2 ssse3 # 
root@truenas[~]# cat /sys/module/zfs/parameters/zfs_fletcher_4_impl 
[fastest] scalar superscalar superscalar4 sse2 ssse3 # 
root@truenas[~]# cat /proc/spl/kstat/zfs/fletcher_4_bench
0 0 0x01 -1 0 8297888349 8256517936506
implementation   native         byteswap       
scalar           3952550218     3127757469     
superscalar      5292677479     4073405329     
superscalar4     4782032535     4162250508     
sse2             8355853056     4976411809     
ssse3            8354812782     7700633411     
fastest          sse2           ssse3

After the update with v6.1.50:

root@truenas[~]# cat /sys/module/zfs/parameters/zfs_vdev_raidz_impl
cycle [fastest] original scalar sse2 ssse3 avx2 #
root@truenas[~]# cat /sys/module/zfs/parameters/zfs_fletcher_4_impl
[fastest] scalar superscalar superscalar4 sse2 ssse3 avx2 #  
root@truenas[~]# cat /proc/spl/kstat/zfs/fletcher_4_bench
0 0 0x01 -1 0 6880632918 197638602372
implementation   native         byteswap       
scalar           3919565944     3087609160     
superscalar      5287129973     4141508055     
superscalar4     4763883430     4326696588     
sse2             7875843343     5013934356     
ssse3            8365560879     7826325634     
avx2             14541263999    13464817127    
fastest          avx2           avx2

This also applies to Dragonfish since we are using the same kernel version in both releases.

[ Upstream commit 4e3733f ] There is a slight issue with error handling code inside nfs42_proc_getxattr(). If page allocating loop fails then we free the failing page array element which is NULL but __free_page() can't deal with NULL args. Found by Linux Verification Center (linuxtesting.org). Fixes: a1f2673 ("NFSv4.2: improve page handling for GETXATTR") Signed-off-by: Fedor Pchelkin <[email protected]> Reviewed-by: Benjamin Coddington <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit f4e89f1 ] Another highly rare error case when a page allocating loop (inside __nfs4_get_acl_uncached, this time) is not properly unwound on error. Since pages array is allocated being uninitialized, need to free only lower array indices. NULL checks were useful before commit 62a1573 ("NFSv4 fix acl retrieval over krb5i/krb5p mounts") when the array had been initialized to zero on stack. Found by Linux Verification Center (linuxtesting.org). Fixes: 62a1573 ("NFSv4 fix acl retrieval over krb5i/krb5p mounts") Signed-off-by: Fedor Pchelkin <[email protected]> Reviewed-by: Benjamin Coddington <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 895cedc ] On server-initiated disconnect, rpcrdma_xprt_disconnect() was DMA- unmapping the Receive buffers, but rpcrdma_post_recvs() neglected to remap them after a new connection had been established. The result was immediate failure of the new connection with the Receives flushing with LOCAL_PROT_ERR. Fixes: 671c450 ("xprtrdma: Fix oops in Receive handler after device removal") Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit c1ebead ] It's just open coded and matches. Note that Thomas said that his version apparently failed for some reason, but hey maybe we should try again. Signed-off-by: Daniel Vetter <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Thomas Zimmermann <[email protected]> Cc: Javier Martinez Canillas <[email protected]> Cc: Helge Deller <[email protected]> Cc: [email protected] Tested-by: Thomas Zimmmermann <[email protected]> Reviewed-by: Thomas Zimmermann <[email protected]> Signed-off-by: Thomas Zimmermann <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Stable-dep-of: 5ae3716 ("video/aperture: Only remove sysfb on the default vga pci device") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 9b539c4 ] It's not exactly the same since the open coded version doesn't set primary correctly. But that's a bugfix, so shouldn't hurt really. Signed-off-by: Daniel Vetter <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: [email protected] Reviewed-by: Thomas Zimmermann <[email protected]> Signed-off-by: Thomas Zimmermann <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Stable-dep-of: 5ae3716 ("video/aperture: Only remove sysfb on the default vga pci device") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 80e9939 ] This one nukes all framebuffers, which is a bit much. In reality gma500 is igpu and never shipped with anything discrete, so there should not be any difference. v2: Unfortunately the framebuffer sits outside of the pci bars for gma500, and so only using the pci helpers won't be enough. Otoh if we only use non-pci helper, then we don't get the vga handling, and subsequent refactoring to untangle these special cases won't work. It's not pretty, but the simplest fix (since gma500 really is the only quirky pci driver like this we have) is to just have both calls. v4: - fix Daniel's S-o-b address v5: - add back an S-o-b tag with Daniel's Intel address Signed-off-by: Daniel Vetter <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Signed-off-by: Thomas Zimmermann <[email protected]> Cc: Patrik Jakobsson <[email protected]> Cc: Thomas Zimmermann <[email protected]> Cc: Javier Martinez Canillas <[email protected]> Reviewed-by: Javier Martinez Canillas <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Stable-dep-of: 5ae3716 ("video/aperture: Only remove sysfb on the default vga pci device") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 62aeaea ] Only really pci devices have a business setting this - it's for figuring out whether the legacy vga stuff should be nuked too. And with the preceding two patches those are all using the pci version of this. Which means for all other callers primary == false and we can remove it now. v2: - Reorder to avoid compile fail (Thomas) - Include gma500, which retained it's called to the non-pci version. v4: - fix Daniel's S-o-b address v5: - add back an S-o-b tag with Daniel's Intel address Signed-off-by: Daniel Vetter <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Signed-off-by: Thomas Zimmermann <[email protected]> Cc: Thomas Zimmermann <[email protected]> Cc: Javier Martinez Canillas <[email protected]> Cc: Maarten Lankhorst <[email protected]> Cc: Maxime Ripard <[email protected]> Cc: Deepak Rawat <[email protected]> Cc: Neil Armstrong <[email protected]> Cc: Kevin Hilman <[email protected]> Cc: Jerome Brunet <[email protected]> Cc: Martin Blumenstingl <[email protected]> Cc: Thierry Reding <[email protected]> Cc: Jonathan Hunter <[email protected]> Cc: Emma Anholt <[email protected]> Cc: Helge Deller <[email protected]> Cc: David Airlie <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Acked-by: Martin Blumenstingl <[email protected]> Acked-by: Thierry Reding <[email protected]> Reviewed-by: Javier Martinez Canillas <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Stable-dep-of: 5ae3716 ("video/aperture: Only remove sysfb on the default vga pci device") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 7450cd2 ] Otherwise it's a bit silly, and we might throw out the driver for the screen the user is actually looking at. I haven't found a bug report for this case yet, but we did get bug reports for the analog case where we're throwing out the efifb driver. v2: Flip the check around to make it clear it's a special case for kicking out the vgacon driver only (Thomas) v4: - fixes to commit message - fix Daniel's S-o-b address v5: - add back an S-o-b tag with Daniel's Intel address Link: https://bugzilla.kernel.org/show_bug.cgi?id=216303 Signed-off-by: Daniel Vetter <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Signed-off-by: Thomas Zimmermann <[email protected]> Cc: Thomas Zimmermann <[email protected]> Cc: Javier Martinez Canillas <[email protected]> Cc: Helge Deller <[email protected]> Cc: [email protected] Reviewed-by: Javier Martinez Canillas <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Stable-dep-of: 5ae3716 ("video/aperture: Only remove sysfb on the default vga pci device") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit f1d599d ] A few reasons for this: - It's really the only one where this matters. I tried looking around, and I didn't find any non-pci vga-compatible controllers for x86 (since that's the only platform where we had this until a few patches ago), where a driver participating in the aperture claim dance would interfere. - I also don't expect that any future bus anytime soon will not just look like pci towards the OS, that's been the case for like 25+ years by now for practically everything (even non non-x86). - Also it's a bit funny if we have one part of the vga removal in the pci function, and the other in the generic one. v2: Rebase. v4: - fix Daniel's S-o-b address v5: - add back an S-o-b tag with Daniel's Intel address Signed-off-by: Daniel Vetter <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Signed-off-by: Thomas Zimmermann <[email protected]> Cc: Thomas Zimmermann <[email protected]> Cc: Javier Martinez Canillas <[email protected]> Cc: Helge Deller <[email protected]> Cc: [email protected] Reviewed-by: Javier Martinez Canillas <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Stable-dep-of: 5ae3716 ("video/aperture: Only remove sysfb on the default vga pci device") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 40613da ] When using ACPI PCI hotplug, hotplugging a device with large BARs may fail if bridge windows programmed by firmware are not large enough. Reproducer: $ qemu-kvm -monitor stdio -M q35 -m 4G \ -global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=on \ -device id=rp1,pcie-root-port,bus=pcie.0,chassis=4 \ disk_image wait till linux guest boots, then hotplug device: (qemu) device_add qxl,bus=rp1 hotplug on guest side fails with: pci 0000:01:00.0: [1b36:0100] type 00 class 0x038000 pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x03ffffff] pci 0000:01:00.0: reg 0x14: [mem 0x00000000-0x03ffffff] pci 0000:01:00.0: reg 0x18: [mem 0x00000000-0x00001fff] pci 0000:01:00.0: reg 0x1c: [io 0x0000-0x001f] pci 0000:01:00.0: BAR 0: no space for [mem size 0x04000000] pci 0000:01:00.0: BAR 0: failed to assign [mem size 0x04000000] pci 0000:01:00.0: BAR 1: no space for [mem size 0x04000000] pci 0000:01:00.0: BAR 1: failed to assign [mem size 0x04000000] pci 0000:01:00.0: BAR 2: assigned [mem 0xfe800000-0xfe801fff] pci 0000:01:00.0: BAR 3: assigned [io 0x1000-0x101f] qxl 0000:01:00.0: enabling device (0000 -> 0003) Unable to create vram_mapping qxl: probe of 0000:01:00.0 failed with error -12 However when using native PCIe hotplug '-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off' it works fine, since kernel attempts to reassign unused resources. Use the same machinery as native PCIe hotplug to (re)assign resources. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Igor Mammedov <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Acked-by: Rafael J. Wysocki <[email protected]> Cc: [email protected] Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit f641519 ] cpu_has_octeon_cache was tied to 0 for generic cpu-features, whith this generic kernel built for octeon CPU won't boot. Just enable this flag by cpu_type. It won't hurt orther platforms because compiler will eliminate the code path on other processors. Signed-off-by: Jiaxun Yang <[email protected]> Signed-off-by: Thomas Bogendoerfer <[email protected]> Stable-dep-of: 5487a7b ("MIPS: cpu-features: Use boot_cpu_type for CPU type based features") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 5487a7b ] Some CPU feature macros were using current_cpu_type to mark feature availability. However current_cpu_type will use smp_processor_id, which is prohibited under preemptable context. Since those features are all uniform on all CPUs in a SMP system, use boot_cpu_type instead of current_cpu_type to fix preemptable kernel. Cc: [email protected] Signed-off-by: Jiaxun Yang <[email protected]> Signed-off-by: Thomas Bogendoerfer <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit be22255 ] Since t_checkpoint_io_list was stop using in jbd2_log_do_checkpoint() now, it's time to remove the whole t_checkpoint_io_list logic. Signed-off-by: Zhang Yi <[email protected]> Reviewed-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]> Stable-dep-of: 46f881b ("jbd2: fix a race when checking checkpoint buffer busy") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit b98dba2 ] journal_clean_one_cp_list() and journal_shrink_one_cp_list() are almost the same, so merge them into journal_shrink_one_cp_list(), remove the nr_to_scan parameter, always scan and try to free the whole checkpoint list. Signed-off-by: Zhang Yi <[email protected]> Reviewed-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]> Stable-dep-of: 46f881b ("jbd2: fix a race when checking checkpoint buffer busy") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 46f881b ] Before removing checkpoint buffer from the t_checkpoint_list, we have to check both BH_Dirty and BH_Lock bits together to distinguish buffers have not been or were being written back. But __cp_buffer_busy() checks them separately, it first check lock state and then check dirty, the window between these two checks could be raced by writing back procedure, which locks buffer and clears buffer dirty before I/O completes. So it cannot guarantee checkpointing buffers been written back to disk if some error happens later. Finally, it may clean checkpoint transactions and lead to inconsistent filesystem. jbd2_journal_forget() and __journal_try_to_free_buffer() also have the same problem (journal_unmap_buffer() escape from this issue since it's running under the buffer lock), so fix them through introducing a new helper to try holding the buffer lock and remove really clean buffer. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217490 Cc: [email protected] Suggested-by: Jan Kara <[email protected]> Signed-off-by: Zhang Yi <[email protected]> Reviewed-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit ee8b94c ] Got kmemleak errors with the following ltp can_filter testcase: for ((i=1; i<=100; i++)) do ./can_filter & sleep 0.1 done ============================================================== [<00000000db4a4943>] can_rx_register+0x147/0x360 [can] [<00000000a289549d>] raw_setsockopt+0x5ef/0x853 [can_raw] [<000000006d3d9ebd>] __sys_setsockopt+0x173/0x2c0 [<00000000407dbfec>] __x64_sys_setsockopt+0x61/0x70 [<00000000fd468496>] do_syscall_64+0x33/0x40 [<00000000b7e47d51>] entry_SYSCALL_64_after_hwframe+0x61/0xc6 It's a bug in the concurrent scenario of unregister_netdevice_many() and raw_release() as following: cpu0 cpu1 unregister_netdevice_many(can_dev) unlist_netdevice(can_dev) // dev_get_by_index() return NULL after this net_set_todo(can_dev) raw_release(can_socket) dev = dev_get_by_index(, ro->ifindex); // dev == NULL if (dev) { // receivers in dev_rcv_lists not free because dev is NULL raw_disable_allfilters(, dev, ); dev_put(dev); } ... ro->bound = 0; ... call_netdevice_notifiers(NETDEV_UNREGISTER, ) raw_notify(, NETDEV_UNREGISTER, ) if (ro->bound) // invalid because ro->bound has been set 0 raw_disable_allfilters(, dev, ); // receivers in dev_rcv_lists will never be freed Add a net_device pointer member in struct raw_sock to record bound can_dev, and use rtnl_lock to serialize raw_socket members between raw_bind(), raw_release(), raw_setsockopt() and raw_notify(). Use ro->dev to decide whether to free receivers in dev_rcv_lists. Fixes: 8d0caed ("can: bcm/raw/isotp: use per module netdevice notifier") Reviewed-by: Oliver Hartkopp <[email protected]> Acked-by: Oliver Hartkopp <[email protected]> Signed-off-by: Ziyang Xuan <[email protected]> Link: https://lore.kernel.org/all/[email protected] Cc: [email protected] Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 11c9027 ] syzbot complained about a lockdep issue [1] Since raw_bind() and raw_setsockopt() first get RTNL before locking the socket, we must adopt the same order in raw_release() [1] WARNING: possible circular locking dependency detected 6.5.0-rc1-syzkaller-00192-g78adb4bcf99e #0 Not tainted ------------------------------------------------------ syz-executor.0/14110 is trying to acquire lock: ffff88804e4b6130 (sk_lock-AF_CAN){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1708 [inline] ffff88804e4b6130 (sk_lock-AF_CAN){+.+.}-{0:0}, at: raw_bind+0xb1/0xab0 net/can/raw.c:435 but task is already holding lock: ffffffff8e3df368 (rtnl_mutex){+.+.}-{3:3}, at: raw_bind+0xa7/0xab0 net/can/raw.c:434 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (rtnl_mutex){+.+.}-{3:3}: __mutex_lock_common kernel/locking/mutex.c:603 [inline] __mutex_lock+0x181/0x1340 kernel/locking/mutex.c:747 raw_release+0x1c6/0x9b0 net/can/raw.c:391 __sock_release+0xcd/0x290 net/socket.c:654 sock_close+0x1c/0x20 net/socket.c:1386 __fput+0x3fd/0xac0 fs/file_table.c:384 task_work_run+0x14d/0x240 kernel/task_work.c:179 resume_user_mode_work include/linux/resume_user_mode.h:49 [inline] exit_to_user_mode_loop kernel/entry/common.c:171 [inline] exit_to_user_mode_prepare+0x210/0x240 kernel/entry/common.c:204 __syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline] syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:297 do_syscall_64+0x44/0xb0 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x63/0xcd -> #0 (sk_lock-AF_CAN){+.+.}-{0:0}: check_prev_add kernel/locking/lockdep.c:3142 [inline] check_prevs_add kernel/locking/lockdep.c:3261 [inline] validate_chain kernel/locking/lockdep.c:3876 [inline] __lock_acquire+0x2e3d/0x5de0 kernel/locking/lockdep.c:5144 lock_acquire kernel/locking/lockdep.c:5761 [inline] lock_acquire+0x1ae/0x510 kernel/locking/lockdep.c:5726 lock_sock_nested+0x3a/0xf0 net/core/sock.c:3492 lock_sock include/net/sock.h:1708 [inline] raw_bind+0xb1/0xab0 net/can/raw.c:435 __sys_bind+0x1ec/0x220 net/socket.c:1792 __do_sys_bind net/socket.c:1803 [inline] __se_sys_bind net/socket.c:1801 [inline] __x64_sys_bind+0x72/0xb0 net/socket.c:1801 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(rtnl_mutex); lock(sk_lock-AF_CAN); lock(rtnl_mutex); lock(sk_lock-AF_CAN); *** DEADLOCK *** 1 lock held by syz-executor.0/14110: stack backtrace: CPU: 0 PID: 14110 Comm: syz-executor.0 Not tainted 6.5.0-rc1-syzkaller-00192-g78adb4bcf99e #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/03/2023 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106 check_noncircular+0x311/0x3f0 kernel/locking/lockdep.c:2195 check_prev_add kernel/locking/lockdep.c:3142 [inline] check_prevs_add kernel/locking/lockdep.c:3261 [inline] validate_chain kernel/locking/lockdep.c:3876 [inline] __lock_acquire+0x2e3d/0x5de0 kernel/locking/lockdep.c:5144 lock_acquire kernel/locking/lockdep.c:5761 [inline] lock_acquire+0x1ae/0x510 kernel/locking/lockdep.c:5726 lock_sock_nested+0x3a/0xf0 net/core/sock.c:3492 lock_sock include/net/sock.h:1708 [inline] raw_bind+0xb1/0xab0 net/can/raw.c:435 __sys_bind+0x1ec/0x220 net/socket.c:1792 __do_sys_bind net/socket.c:1803 [inline] __se_sys_bind net/socket.c:1801 [inline] __x64_sys_bind+0x72/0xb0 net/socket.c:1801 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7fd89007cb29 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fd890d2a0c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000031 RAX: ffffffffffffffda RBX: 00007fd89019bf80 RCX: 00007fd89007cb29 RDX: 0000000000000010 RSI: 0000000020000040 RDI: 0000000000000003 RBP: 00007fd8900c847a R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 000000000000000b R14: 00007fd89019bf80 R15: 00007ffebf8124f8 </TASK> Fixes: ee8b94c ("can: raw: fix receiver memory leak") Reported-by: syzbot <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Cc: Ziyang Xuan <[email protected]> Cc: Oliver Hartkopp <[email protected]> Cc: [email protected] Cc: Marc Kleine-Budde <[email protected]> Link: https://lore.kernel.org/all/[email protected] Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 72c2112 ] Pointer variables of void * type do not require type cast. Signed-off-by: Yu Zhe <[email protected]> Reviewed-by: Muhammad Usama Anjum <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]> Stable-dep-of: 4cfca53 ("s390/zcrypt: fix reply buffer calculations for CCA replies") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 4cfca53 ] The length information for available buffer space for CCA replies is covered with two fields in the T6 header prepended on each CCA reply: fromcardlen1 and fromcardlen2. The sum of these both values must not exceed the AP bus limit for this card (24KB for CEX8, 12KB CEX7 and older) minus the always present headers. The current code adjusted the fromcardlen2 value in case of exceeding the AP bus limit when there was a non-zero value given from userspace. Some tests now showed that this was the wrong assumption. Instead the userspace value given for this field should always be trusted and if the sum of the two fields exceeds the AP bus limit for this card the first field fromcardlen1 should be adjusted instead. So now the calculation is done with this new insight in mind. Also some additional checks for overflow have been introduced and some comments to provide some documentation for future maintainers of this complicated calculation code. Furthermore the 128 bytes of fix overhead which is used in the current code is not correct. Investigations showed that for a reply always the same two header structs are prepended before a possible payload. So this is also fixed with this patch. Signed-off-by: Harald Freudenberger <[email protected]> Reviewed-by: Holger Dengler <[email protected]> Cc: [email protected] Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit b2f59e9 ] We always assumed that a device might either have AUX or FLAT CCS, but this is an approximation that is not always true, e.g. PVC represents an exception. Set the basis for future finer selection by implementing a boolean gen12_needs_ccs_aux_inv() function that tells whether aux invalidation is needed or not. Currently PVC is the only exception to the above mentioned rule. Requires: 059ae7ae2a1c ("drm/i915/gt: Cleanup aux invalidation registers") Signed-off-by: Andi Shyti <[email protected]> Cc: Matt Roper <[email protected]> Cc: Jonathan Cavitt <[email protected]> Cc: <[email protected]> # v5.8+ Reviewed-by: Matt Roper <[email protected]> Reviewed-by: Andrzej Hajda <[email protected]> Reviewed-by: Nirmoy Das <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit c827655) Signed-off-by: Tvrtko Ursulin <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 78a6ccd ] All memory traffic must be quiesced before requesting an aux invalidation on platforms that use Aux CCS. Fixes: 972282c ("drm/i915/gen12: Add aux table invalidate for all engines") Requires: a2a4aa0eef3b ("drm/i915: Add the gen12_needs_ccs_aux_inv helper") Signed-off-by: Jonathan Cavitt <[email protected]> Signed-off-by: Andi Shyti <[email protected]> Cc: <[email protected]> # v5.8+ Reviewed-by: Nirmoy Das <[email protected]> Reviewed-by: Andrzej Hajda <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit ad8ebf1) Signed-off-by: Tvrtko Ursulin <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 0fde2f2 ] For platforms that use Aux CCS, wait for aux invalidation to complete by checking the aux invalidation register bit is cleared. Fixes: 972282c ("drm/i915/gen12: Add aux table invalidate for all engines") Signed-off-by: Jonathan Cavitt <[email protected]> Signed-off-by: Andi Shyti <[email protected]> Cc: <[email protected]> # v5.8+ Reviewed-by: Nirmoy Das <[email protected]> Reviewed-by: Andrzej Hajda <[email protected]> Reviewed-by: Matt Roper <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit d459c86) Signed-off-by: Tvrtko Ursulin <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 6a35f22 ] Perform some refactoring with the purpose of keeping in one single place all the operations around the aux table invalidation. With this refactoring add more engines where the invalidation should be performed. Fixes: 972282c ("drm/i915/gen12: Add aux table invalidate for all engines") Signed-off-by: Andi Shyti <[email protected]> Cc: Jonathan Cavitt <[email protected]> Cc: Matt Roper <[email protected]> Cc: <[email protected]> # v5.8+ Reviewed-by: Andrzej Hajda <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 76ff778) Signed-off-by: Tvrtko Ursulin <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit b71645d ] Trace ring buffer can no longer record anything after executing following commands at the shell prompt: # cd /sys/kernel/tracing # cat tracing_cpumask fff # echo 0 > tracing_cpumask # echo 1 > snapshot # echo fff > tracing_cpumask # echo 1 > tracing_on # echo "hello world" > trace_marker -bash: echo: write error: Bad file descriptor The root cause is that: 1. After `echo 0 > tracing_cpumask`, 'record_disabled' of cpu buffers in 'tr->array_buffer.buffer' became 1 (see tracing_set_cpumask()); 2. After `echo 1 > snapshot`, 'tr->array_buffer.buffer' is swapped with 'tr->max_buffer.buffer', then the 'record_disabled' became 0 (see update_max_tr()); 3. After `echo fff > tracing_cpumask`, the 'record_disabled' become -1; Then array_buffer and max_buffer are both unavailable due to value of 'record_disabled' is not 0. To fix it, enable or disable both array_buffer and max_buffer at the same time in tracing_set_cpumask(). Link: https://lkml.kernel.org/r/[email protected] Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Fixes: 71babb2 ("tracing: change CPU ring buffer state from tracing_cpumask") Signed-off-by: Zheng Yejian <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit eecb91b ] Kmemleak report a leak in graph_trace_open(): unreferenced object 0xffff0040b95f4a00 (size 128): comm "cat", pid 204981, jiffies 4301155872 (age 99771.964s) hex dump (first 32 bytes): e0 05 e7 b4 ab 7d 00 00 0b 00 01 00 00 00 00 00 .....}.......... f4 00 01 10 00 a0 ff ff 00 00 00 00 65 00 10 00 ............e... backtrace: [<000000005db27c8b>] kmem_cache_alloc_trace+0x348/0x5f0 [<000000007df90faa>] graph_trace_open+0xb0/0x344 [<00000000737524cd>] __tracing_open+0x450/0xb10 [<0000000098043327>] tracing_open+0x1a0/0x2a0 [<00000000291c3876>] do_dentry_open+0x3c0/0xdc0 [<000000004015bcd6>] vfs_open+0x98/0xd0 [<000000002b5f60c9>] do_open+0x520/0x8d0 [<00000000376c7820>] path_openat+0x1c0/0x3e0 [<00000000336a54b5>] do_filp_open+0x14c/0x324 [<000000002802df13>] do_sys_openat2+0x2c4/0x530 [<0000000094eea458>] __arm64_sys_openat+0x130/0x1c4 [<00000000a71d7881>] el0_svc_common.constprop.0+0xfc/0x394 [<00000000313647bf>] do_el0_svc+0xac/0xec [<000000002ef1c651>] el0_svc+0x20/0x30 [<000000002fd4692a>] el0_sync_handler+0xb0/0xb4 [<000000000c309c35>] el0_sync+0x160/0x180 The root cause is descripted as follows: __tracing_open() { // 1. File 'trace' is being opened; ... *iter->trace = *tr->current_trace; // 2. Tracer 'function_graph' is // currently set; ... iter->trace->open(iter); // 3. Call graph_trace_open() here, // and memory are allocated in it; ... } s_start() { // 4. The opened file is being read; ... *iter->trace = *tr->current_trace; // 5. If tracer is switched to // 'nop' or others, then memory // in step 3 are leaked!!! ... } To fix it, in s_start(), close tracer before switching then reopen the new tracer after switching. And some tracers like 'wakeup' may not update 'iter->private' in some cases when reopen, then it should be cleared to avoid being mistakenly closed again. Link: https://lore.kernel.org/linux-trace-kernel/[email protected] Fixes: d7350c3 ("tracing/core: make the read callbacks reentrants") Signed-off-by: Zheng Yejian <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 05f3d5b ] On SDP interfaces, frame oversize and undersize errors are observed as driver is not considering packet sizes of all subscribers of the link before updating the link config. This patch fixes the same. Fixes: 9b7dd87 ("octeontx2-af: Support to modify min/max allowed packet lengths") Signed-off-by: Hariprasad Kelam <[email protected]> Signed-off-by: Sunil Goutham <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit f05bd8e ] The devlink code is hard to navigate with 13kLoC in one file. I really like the way Michal split the ethtool into per-command files and core. It'd probably be too much to split it all up, but we can at least separate the core parts out of the per-cmd implementations and put it in a directory so that new commands can be separate files. Move the code, subsequent commit will do a partial split. Reviewed-by: Jacob Keller <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]> Stable-dep-of: 2ebbc97 ("devlink: add missing unregister linecard notification") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 2ebbc97 ] Cited fixes commit introduced linecard notifications for register, however it didn't add them for unregister. Fix that by adding them. Fixes: c246f9b ("devlink: add support to create line card and expose to user") Signed-off-by: Jiri Pirko <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

…rio gates [ Upstream commit d44036c ] The blamed commit resolved a bug where frames would still get stuck at egress, even though they're smaller than the maxSDU[tc], because the driver did not take into account the extra 33 ns that the queue system needs for scheduling the frame. It now takes that into account, but the arithmetic that we perform in vsc9959_tas_remaining_gate_len_ps() is buggy, because we operate on 64-bit unsigned integers, so gate_len_ns - VSC9959_TAS_MIN_GATE_LEN_NS may become a very large integer if gate_len_ns < 33 ns. In practice, this means that we've introduced a regression where all traffic class gates which are permanently closed will not get detected by the driver, and we won't enable oversize frame dropping for them. Before: mscc_felix 0000:00:00.5: port 0: max frame size 1526 needs 12400000 ps, 1152000 ps for mPackets at speed 1000 mscc_felix 0000:00:00.5: port 0 tc 0 min gate len 1000000, sending all frames mscc_felix 0000:00:00.5: port 0 tc 1 min gate len 0, sending all frames mscc_felix 0000:00:00.5: port 0 tc 2 min gate len 0, sending all frames mscc_felix 0000:00:00.5: port 0 tc 3 min gate len 0, sending all frames mscc_felix 0000:00:00.5: port 0 tc 4 min gate len 0, sending all frames mscc_felix 0000:00:00.5: port 0 tc 5 min gate len 0, sending all frames mscc_felix 0000:00:00.5: port 0 tc 6 min gate len 0, sending all frames mscc_felix 0000:00:00.5: port 0 tc 7 min gate length 5120 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 615 octets including FCS After: mscc_felix 0000:00:00.5: port 0: max frame size 1526 needs 12400000 ps, 1152000 ps for mPackets at speed 1000 mscc_felix 0000:00:00.5: port 0 tc 0 min gate len 1000000, sending all frames mscc_felix 0000:00:00.5: port 0 tc 1 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 2 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 3 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 4 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 5 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 6 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 7 min gate length 5120 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 615 octets including FCS Fixes: 11afdc6 ("net: dsa: felix: tc-taprio intervals smaller than MTU should send at least one packet") Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 76f3329 ] *prot->memory_pressure is read/writen locklessly, we need to add proper annotations. A recent commit added a new race, it is time to audit all accesses. Fixes: 2d0c88e ("sock: Fix misuse of sk_under_memory_pressure()") Fixes: 4d93df0 ("[SCTP]: Rewrite of sctp buffer management code") Signed-off-by: Eric Dumazet <[email protected]> Cc: Abel Wu <[email protected]> Reviewed-by: Shakeel Butt <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

commit 111cd11 upstream. Turns out percpu_cpuset_rwsem - commit 1243dc5 ("cgroup/cpuset: Convert cpuset_mutex to percpu_rwsem") - wasn't such a brilliant idea, as it has been reported to cause slowdowns in workloads that need to change cpuset configuration frequently and it is also not implementing priority inheritance (which causes troubles with realtime workloads). Convert percpu_cpuset_rwsem back to regular cpuset_mutex. Also grab it only for SCHED_DEADLINE tasks (other policies don't care about stable cpusets anyway). Signed-off-by: Juri Lelli <[email protected]> Reviewed-by: Waiman Long <[email protected]> Signed-off-by: Tejun Heo <[email protected]> [ Conflict in kernel/cgroup/cpuset.c due to pulling new code/comments. Reject all new code. Remove BUG_ON() about rwsem that doesn't exist on mainline. ] Signed-off-by: Qais Yousef (Google) <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 6c24849 upstream. Qais reported that iterating over all tasks when rebuilding root domains for finding out which ones are DEADLINE and need their bandwidth correctly restored on such root domains can be a costly operation (10+ ms delays on suspend-resume). To fix the problem keep track of the number of DEADLINE tasks belonging to each cpuset and then use this information (followup patch) to only perform the above iteration if DEADLINE tasks are actually present in the cpuset for which a corresponding root domain is being rebuilt. Reported-by: Qais Yousef (Google) <[email protected]> Link: https://lore.kernel.org/lkml/[email protected]/ Signed-off-by: Juri Lelli <[email protected]> Reviewed-by: Waiman Long <[email protected]> Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Qais Yousef (Google) <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit c0f78fd upstream. update_tasks_root_domain currently iterates over all tasks even if no DEADLINE task is present on the cpuset/root domain for which bandwidth accounting is being rebuilt. This has been reported to introduce 10+ ms delays on suspend-resume operations. Skip the costly iteration for cpusets that don't contain DEADLINE tasks. Reported-by: Qais Yousef (Google) <[email protected]> Link: https://lore.kernel.org/lkml/[email protected]/ Signed-off-by: Juri Lelli <[email protected]> Reviewed-by: Waiman Long <[email protected]> Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Qais Yousef (Google) <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 8598910 upstream. While moving a set of tasks between exclusive cpusets, cpuset_can_attach() -> task_can_attach() calls dl_cpu_busy(..., p) for DL BW overflow checking and per-task DL BW allocation on the destination root_domain for the DL tasks in this set. This approach has the issue of not freeing already allocated DL BW in the following error cases: (1) The set of tasks includes multiple DL tasks and DL BW overflow checking fails for one of the subsequent DL tasks. (2) Another controller next to the cpuset controller which is attached to the same cgroup fails in its can_attach(). To address this problem rework dl_cpu_busy(): (1) Split it into dl_bw_check_overflow() & dl_bw_alloc() and add a dedicated dl_bw_free(). (2) dl_bw_alloc() & dl_bw_free() take a `u64 dl_bw` parameter instead of a `struct task_struct *p` used in dl_cpu_busy(). This allows to allocate DL BW for a set of tasks too rather than only for a single task. Signed-off-by: Dietmar Eggemann <[email protected]> Signed-off-by: Juri Lelli <[email protected]> Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Qais Yousef (Google) <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 2ef269e upstream. cpuset_can_attach() can fail. Postpone DL BW allocation until all tasks have been checked. DL BW is not allocated per-task but as a sum over all DL tasks migrating. If multiple controllers are attached to the cgroup next to the cpuset controller a non-cpuset can_attach() can fail. In this case free DL BW in cpuset_cancel_attach(). Finally, update cpuset DL task count (nr_deadline_tasks) only in cpuset_attach(). Suggested-by: Waiman Long <[email protected]> Signed-off-by: Dietmar Eggemann <[email protected]> Signed-off-by: Juri Lelli <[email protected]> Reviewed-by: Waiman Long <[email protected]> Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Qais Yousef (Google) <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

…ug onwards commit 583893a upstream. Previously, on unplug events, the TMU mode was disabled first followed by the Time Synchronization Handshake, irrespective of whether the tb_switch_tmu_rate_write() API was successful or not. However, this caused a problem with Thunderbolt 3 (TBT3) devices, as the TSPacketInterval bits were always enabled by default, leading the host router to assume that the device router's TMU was already enabled and preventing it from initiating the Time Synchronization Handshake. As a result, TBT3 monitors experienced display flickering from the second hot plug onwards. To address this issue, we have modified the code to only disable the Time Synchronization Handshake during TMU disable if the tb_switch_tmu_rate_write() function is successful. This ensures that the TBT3 devices function correctly and eliminates the display flickering issue. Co-developed-by: Sanath S <[email protected]> Signed-off-by: Sanath S <[email protected]> Signed-off-by: Sanjay R Mehta <[email protected]> Cc: [email protected] Signed-off-by: Mika Westerberg <[email protected]> [ USB4v2 introduced support for uni-directional TMU mode as part of d49b4f0 ("thunderbolt: Add support for enhanced uni-directional TMU mode") This is not a stable candidate commit, so adjust the code for backport. ] Signed-off-by: Mario Limonciello <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 9c7c4bc upstream. sizeof(struct ublksrv_io_cmd) is 16bytes, which can be held in 64byte SQE, so not necessary to check IO_URING_F_SQE128. With this change, we get chance to save half SQ ring memory. Fixed: 71f28f3 ("ublk_drv: add io_uring based userspace block driver") Signed-off-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit c275a17 upstream. Commit ee8b94c ("can: raw: fix receiver memory leak") introduced a new reference to the CAN netdevice that has assigned CAN filters. But this new ro->dev reference did not maintain its own refcount which lead to another KASAN use-after-free splat found by Eric Dumazet. This patch ensures a proper refcount for the CAN nedevice. Fixes: ee8b94c ("can: raw: fix receiver memory leak") Reported-by: Eric Dumazet <[email protected]> Cc: Ziyang Xuan <[email protected]> Signed-off-by: Oliver Hartkopp <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

…folio for sharing check commit 0e0e9bd upstream. Commit 98b211d ("madvise: convert madvise_free_pte_range() to use a folio") replaced the page_mapcount() with folio_mapcount() to check whether the folio is shared by other mapping. It's not correct for large folios. folio_mapcount() returns the total mapcount of large folio which is not suitable to detect whether the folio is shared. Use folio_estimated_sharers() which returns a estimated number of shares. That means it's not 100% correct. It should be OK for madvise case here. User-visible effects is that the THP is skipped when user call madvise. But the correct behavior is THP should be split and processed then. NOTE: this change is a temporary fix to reduce the user-visible effects before the long term fix from David is ready. Link: https://lkml.kernel.org/r/[email protected] Fixes: 98b211d ("madvise: convert madvise_free_pte_range() to use a folio") Signed-off-by: Yin Fengwei <[email protected]> Reviewed-by: Yu Zhao <[email protected]> Reviewed-by: Ryan Roberts <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Vishal Moola (Oracle) <[email protected]> Cc: Yang Shi <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 1bd3a76 upstream. Commit 41320b1 ("scsi: snic: Fix possible memory leak if device_add() fails") fixed the memory leak caused by dev_set_name() when device_add() failed. However, it did not consider that 'tgt' has already been released when put_device(&tgt->dev) is called. Remove kfree(tgt) in the error path to avoid double free of 'tgt' and move put_device(&tgt->dev) after the removed kfree(tgt) to avoid a use-after-free. Fixes: 41320b1 ("scsi: snic: Fix possible memory leak if device_add() fails") Signed-off-by: Zhu Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 60c5fd2 upstream. The raid_component_add() function was added to the kernel tree via patch "[SCSI] embryonic RAID class" (2005). Remove this function since it never has had any callers in the Linux kernel. And also raid_component_release() is only used in raid_component_add(), so it is also removed. Signed-off-by: Zhu Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Bart Van Assche <[email protected]> Fixes: 04b5b5c ("scsi: core: Fix possible memory leak if device_add() fails") Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit 2746f13 ] The COMMON_CLK config is not enabled in some of the architectures. This causes below build issues: pwm-rz-mtu3.c:(.text+0x114): undefined reference to `clk_rate_exclusive_put' pwm-rz-mtu3.c:(.text+0x32c): undefined reference to `clk_rate_exclusive_get' Fix these issues by moving clk_rate_exclusive_{get,put} inside COMMON_CLK code block, as clk.c is enabled by COMMON_CLK. Fixes: 55e9b8b ("clk: add clk_rate_exclusive api") Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Biju Das <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Stephen Boyd <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

…node_to_map() [ Upstream commit 661efa2 ] Fix the below random NULL pointer crash during boot by serializing pinctrl group and function creation/remove calls in rzg2l_dt_subnode_to_map() with mutex lock. Crash log: pc : __pi_strcmp+0x20/0x140 lr : pinmux_func_name_to_selector+0x68/0xa4 Call trace: __pi_strcmp+0x20/0x140 pinmux_generic_add_function+0x34/0xcc rzg2l_dt_subnode_to_map+0x314/0x44c rzg2l_dt_node_to_map+0x164/0x194 pinctrl_dt_to_map+0x218/0x37c create_pinctrl+0x70/0x3d8 While at it, add comments for bitmap_lock and lock. Fixes: c4c4637 ("pinctrl: renesas: Add RZ/G2L pin and gpio controller driver") Tested-by: Chris Paterson <[email protected]> Signed-off-by: Biju Das <[email protected]> Reviewed-by: Geert Uytterhoeven <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

…node_to_map() [ Upstream commit f982b9d ] Fix the below random NULL pointer crash during boot by serializing pinctrl group and function creation/remove calls in rzv2m_dt_subnode_to_map() with mutex lock. Crash logs: pc : __pi_strcmp+0x20/0x140 lr : pinmux_func_name_to_selector+0x68/0xa4 Call trace: __pi_strcmp+0x20/0x140 pinmux_generic_add_function+0x34/0xcc rzv2m_dt_subnode_to_map+0x2e4/0x418 rzv2m_dt_node_to_map+0x15c/0x18c pinctrl_dt_to_map+0x218/0x37c create_pinctrl+0x70/0x3d8 While at it, add a comment for lock. Fixes: 92a9b82 ("pinctrl: renesas: Add RZ/V2M pin and gpio controller driver") Signed-off-by: Biju Das <[email protected]> Reviewed-by: Geert Uytterhoeven <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

…group,{add,remove}_function} [ Upstream commit 8fcc1c4 ] The pinctrl group and function creation/remove calls expect caller to take care of locking. Add lock around these functions. Fixes: b59d0e7 ("pinctrl: Add RZ/A2 pin and gpio controller") Signed-off-by: Biju Das <[email protected]> Reviewed-by: Geert Uytterhoeven <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit e531fdb ] If a signal callback releases the sw_sync fence, that will trigger a deadlock as the timeline_fence_release recurses onto the fence->lock (used both for signaling and the the timeline tree). To avoid that, temporarily hold an extra reference to the signalled fences until after we drop the lock. (This is an alternative implementation of https://patchwork.kernel.org/patch/11664717/ which avoids some potential UAF issues with the original patch.) v2: Remove now obsolete comment, use list_move_tail() and list_del_init() Reported-by: Bas Nieuwenhuizen <[email protected]> Fixes: d3c6dd1 ("dma-buf/sw_sync: Synchronize signal vs syncpt free") Signed-off-by: Rob Clark <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Christian König <[email protected]> Signed-off-by: Christian König <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit ab4109f ] If a GPIO simulator device is unbound with interrupts still requested, we will hit a use-after-free issue in __irq_domain_deactivate_irq(). The owner of the irq domain must dispose of all mappings before destroying the domain object. Fixes: cb8c474 ("gpio: sim: new testing module") Signed-off-by: Bartosz Golaszewski <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 6e39c1a ] Associate the swnode of the GPIO device's (which is the interrupt controller here) with the irq domain. Otherwise the interrupt-controller device attribute is a no-op. Fixes: cb8c474 ("gpio: sim: new testing module") Signed-off-by: Bartosz Golaszewski <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit c008323 ] Lenovo 82SJ doesn't have DMIC connected like 82V2 does. Narrow the match down to only cover 82V2. Reported-by: [email protected] Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217063 Fixes: 2232b2d ("ASoC: amd: yc: Add Lenovo Yoga Slim 7 Pro X to quirks table") Signed-off-by: Mario Limonciello <[email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected] Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit cfeb6ae ] The current implementation of append may cause duplicate data and/or incorrect ranges to be returned to a reader during an update. Although this has not been reported or seen, disable the append write operation while the tree is in rcu mode out of an abundance of caution. During the analysis of the mas_next_slot() the following was artificially created by separating the writer and reader code: Writer: reader: mas_wr_append set end pivot updates end metata Detects write to last slot last slot write is to start of slot store current contents in slot overwrite old end pivot mas_next_slot(): read end metadata read old end pivot return with incorrect range store new value Alternatively: Writer: reader: mas_wr_append set end pivot updates end metata Detects write to last slot last lost write to end of slot store value mas_next_slot(): read end metadata read old end pivot read new end pivot return with incorrect range set old end pivot There may be other accesses that are not safe since we are now updating both metadata and pointers, so disabling append if there could be rcu readers is the safest action. Link: https://lkml.kernel.org/r/[email protected] Fixes: 54a611b ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

commit fd0a7ec upstream. The vangogh driver just gained a link time dependency that now causes randconfig builds to fail: x86_64-linux-ld: sound/soc/amd/vangogh/pci-acp5x.o: in function `snd_acp5x_probe': pci-acp5x.c:(.text+0xbb): undefined reference to `snd_amd_acp_find_config' Fixes: e89f45e ("ASoC: amd: vangogh: Add check for acp config flags in vangogh platform") Signed-off-by: Arnd Bergmann <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

Link: https://lore.kernel.org/r/[email protected] Tested-by: Conor Dooley <[email protected]> Tested-by: Linux Kernel Functional Testing <[email protected]> Tested-by: Salvatore Bonaccorso <[email protected]> Tested-by: Joel Fernandes (Google) <[email protected]> Tested-by: SeongJae Park <[email protected]> Tested-by: Bagas Sanjaya <[email protected]> Tested-by: Sudip Mukherjee <[email protected]> Tested-by: Takeshi Ogasawara <[email protected]> Tested-by: Shuah Khan <[email protected]> Tested-by: Florian Fainelli <[email protected]> Tested-by: Guenter Roeck <[email protected]> Tested-by: Jon Hunter <[email protected]> Tested-by: Pavel Machek (CIP) <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

This is the 6.1.50 stable release Signed-off-by: Umer Saleem <[email protected]>

bugclerk · 2023-09-14T19:50:33Z

Jira URL: https://ixsystems.atlassian.net/browse/NAS-124109

Fedor Pchelkin and others added 30 commits August 30, 2023 16:10

jlelli and others added 23 commits August 30, 2023 16:11

Merge tag 'v6.1.50' into nwrelmrg

f125942

This is the 6.1.50 stable release Signed-off-by: Umer Saleem <[email protected]>

usaleem-ix requested review from amotin and ixhamza September 14, 2023 19:49

amotin approved these changes Sep 14, 2023

View reviewed changes

amotin merged commit 6752e97 into truenas/linux-6.1 Sep 14, 2023
6 checks passed

amotin deleted the nwrelmrg branch September 14, 2023 19:52

bugclerk added the no-time-tracked label Sep 14, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

NAS-124109 / 24.04 / Merge v6.1.50 to add support for AVX extensions #120

NAS-124109 / 24.04 / Merge v6.1.50 to add support for AVX extensions #120

usaleem-ix commented Sep 14, 2023

bugclerk commented Sep 14, 2023

NAS-124109 / 24.04 / Merge v6.1.50 to add support for AVX extensions #120

NAS-124109 / 24.04 / Merge v6.1.50 to add support for AVX extensions #120

Conversation

usaleem-ix commented Sep 14, 2023

bugclerk commented Sep 14, 2023