-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ci: add perf list check #16
Closed
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Signed-off-by: Qinghao Shi <[email protected]>
RevySR
pushed a commit
that referenced
this pull request
Jun 29, 2024
commit 9d274c1 upstream. We have been seeing crashes on duplicate keys in btrfs_set_item_key_safe(): BTRFS critical (device vdb): slot 4 key (450 108 8192) new key (450 108 8192) ------------[ cut here ]------------ kernel BUG at fs/btrfs/ctree.c:2620! invalid opcode: 0000 [#1] PREEMPT SMP PTI CPU: 0 PID: 3139 Comm: xfs_io Kdump: loaded Not tainted 6.9.0 #6 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014 RIP: 0010:btrfs_set_item_key_safe+0x11f/0x290 [btrfs] With the following stack trace: #0 btrfs_set_item_key_safe (fs/btrfs/ctree.c:2620:4) #1 btrfs_drop_extents (fs/btrfs/file.c:411:4) #2 log_one_extent (fs/btrfs/tree-log.c:4732:9) #3 btrfs_log_changed_extents (fs/btrfs/tree-log.c:4955:9) #4 btrfs_log_inode (fs/btrfs/tree-log.c:6626:9) #5 btrfs_log_inode_parent (fs/btrfs/tree-log.c:7070:8) #6 btrfs_log_dentry_safe (fs/btrfs/tree-log.c:7171:8) #7 btrfs_sync_file (fs/btrfs/file.c:1933:8) #8 vfs_fsync_range (fs/sync.c:188:9) #9 vfs_fsync (fs/sync.c:202:9) #10 do_fsync (fs/sync.c:212:9) #11 __do_sys_fdatasync (fs/sync.c:225:9) #12 __se_sys_fdatasync (fs/sync.c:223:1) #13 __x64_sys_fdatasync (fs/sync.c:223:1) #14 do_syscall_x64 (arch/x86/entry/common.c:52:14) #15 do_syscall_64 (arch/x86/entry/common.c:83:7) #16 entry_SYSCALL_64+0xaf/0x14c (arch/x86/entry/entry_64.S:121) So we're logging a changed extent from fsync, which is splitting an extent in the log tree. But this split part already exists in the tree, triggering the BUG(). This is the state of the log tree at the time of the crash, dumped with drgn (https://github.com/osandov/drgn/blob/main/contrib/btrfs_tree.py) to get more details than btrfs_print_leaf() gives us: >>> print_extent_buffer(prog.crashed_thread().stack_trace()[0]["eb"]) leaf 33439744 level 0 items 72 generation 9 owner 18446744073709551610 leaf 33439744 flags 0x100000000000000 fs uuid e5bd3946-400c-4223-8923-190ef1f18677 chunk uuid d58cb17e-6d02-494a-829a-18b7d8a399da item 0 key (450 INODE_ITEM 0) itemoff 16123 itemsize 160 generation 7 transid 9 size 8192 nbytes 8473563889606862198 block group 0 mode 100600 links 1 uid 0 gid 0 rdev 0 sequence 204 flags 0x10(PREALLOC) atime 1716417703.220000000 (2024-05-22 15:41:43) ctime 1716417704.983333333 (2024-05-22 15:41:44) mtime 1716417704.983333333 (2024-05-22 15:41:44) otime 17592186044416.000000000 (559444-03-08 01:40:16) item 1 key (450 INODE_REF 256) itemoff 16110 itemsize 13 index 195 namelen 3 name: 193 item 2 key (450 XATTR_ITEM 1640047104) itemoff 16073 itemsize 37 location key (0 UNKNOWN.0 0) type XATTR transid 7 data_len 1 name_len 6 name: user.a data a item 3 key (450 EXTENT_DATA 0) itemoff 16020 itemsize 53 generation 9 type 1 (regular) extent data disk byte 303144960 nr 12288 extent data offset 0 nr 4096 ram 12288 extent compression 0 (none) item 4 key (450 EXTENT_DATA 4096) itemoff 15967 itemsize 53 generation 9 type 2 (prealloc) prealloc data disk byte 303144960 nr 12288 prealloc data offset 4096 nr 8192 item 5 key (450 EXTENT_DATA 8192) itemoff 15914 itemsize 53 generation 9 type 2 (prealloc) prealloc data disk byte 303144960 nr 12288 prealloc data offset 8192 nr 4096 ... So the real problem happened earlier: notice that items 4 (4k-12k) and 5 (8k-12k) overlap. Both are prealloc extents. Item 4 straddles i_size and item 5 starts at i_size. Here is the state of the filesystem tree at the time of the crash: >>> root = prog.crashed_thread().stack_trace()[2]["inode"].root >>> ret, nodes, slots = btrfs_search_slot(root, BtrfsKey(450, 0, 0)) >>> print_extent_buffer(nodes[0]) leaf 30425088 level 0 items 184 generation 9 owner 5 leaf 30425088 flags 0x100000000000000 fs uuid e5bd3946-400c-4223-8923-190ef1f18677 chunk uuid d58cb17e-6d02-494a-829a-18b7d8a399da ... item 179 key (450 INODE_ITEM 0) itemoff 4907 itemsize 160 generation 7 transid 7 size 4096 nbytes 12288 block group 0 mode 100600 links 1 uid 0 gid 0 rdev 0 sequence 6 flags 0x10(PREALLOC) atime 1716417703.220000000 (2024-05-22 15:41:43) ctime 1716417703.220000000 (2024-05-22 15:41:43) mtime 1716417703.220000000 (2024-05-22 15:41:43) otime 1716417703.220000000 (2024-05-22 15:41:43) item 180 key (450 INODE_REF 256) itemoff 4894 itemsize 13 index 195 namelen 3 name: 193 item 181 key (450 XATTR_ITEM 1640047104) itemoff 4857 itemsize 37 location key (0 UNKNOWN.0 0) type XATTR transid 7 data_len 1 name_len 6 name: user.a data a item 182 key (450 EXTENT_DATA 0) itemoff 4804 itemsize 53 generation 9 type 1 (regular) extent data disk byte 303144960 nr 12288 extent data offset 0 nr 8192 ram 12288 extent compression 0 (none) item 183 key (450 EXTENT_DATA 8192) itemoff 4751 itemsize 53 generation 9 type 2 (prealloc) prealloc data disk byte 303144960 nr 12288 prealloc data offset 8192 nr 4096 Item 5 in the log tree corresponds to item 183 in the filesystem tree, but nothing matches item 4. Furthermore, item 183 is the last item in the leaf. btrfs_log_prealloc_extents() is responsible for logging prealloc extents beyond i_size. It first truncates any previously logged prealloc extents that start beyond i_size. Then, it walks the filesystem tree and copies the prealloc extent items to the log tree. If it hits the end of a leaf, then it calls btrfs_next_leaf(), which unlocks the tree and does another search. However, while the filesystem tree is unlocked, an ordered extent completion may modify the tree. In particular, it may insert an extent item that overlaps with an extent item that was already copied to the log tree. This may manifest in several ways depending on the exact scenario, including an EEXIST error that is silently translated to a full sync, overlapping items in the log tree, or this crash. This particular crash is triggered by the following sequence of events: - Initially, the file has i_size=4k, a regular extent from 0-4k, and a prealloc extent beyond i_size from 4k-12k. The prealloc extent item is the last item in its B-tree leaf. - The file is fsync'd, which copies its inode item and both extent items to the log tree. - An xattr is set on the file, which sets the BTRFS_INODE_COPY_EVERYTHING flag. - The range 4k-8k in the file is written using direct I/O. i_size is extended to 8k, but the ordered extent is still in flight. - The file is fsync'd. Since BTRFS_INODE_COPY_EVERYTHING is set, this calls copy_inode_items_to_log(), which calls btrfs_log_prealloc_extents(). - btrfs_log_prealloc_extents() finds the 4k-12k prealloc extent in the filesystem tree. Since it starts before i_size, it skips it. Since it is the last item in its B-tree leaf, it calls btrfs_next_leaf(). - btrfs_next_leaf() unlocks the path. - The ordered extent completion runs, which converts the 4k-8k part of the prealloc extent to written and inserts the remaining prealloc part from 8k-12k. - btrfs_next_leaf() does a search and finds the new prealloc extent 8k-12k. - btrfs_log_prealloc_extents() copies the 8k-12k prealloc extent into the log tree. Note that it overlaps with the 4k-12k prealloc extent that was copied to the log tree by the first fsync. - fsync calls btrfs_log_changed_extents(), which tries to log the 4k-8k extent that was written. - This tries to drop the range 4k-8k in the log tree, which requires adjusting the start of the 4k-12k prealloc extent in the log tree to 8k. - btrfs_set_item_key_safe() sees that there is already an extent starting at 8k in the log tree and calls BUG(). Fix this by detecting when we're about to insert an overlapping file extent item in the log tree and truncating the part that would overlap. CC: [email protected] # 6.1+ Reviewed-by: Filipe Manana <[email protected]> Signed-off-by: Omar Sandoval <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
RevySR
pushed a commit
that referenced
this pull request
Jun 29, 2024
[ Upstream commit 79f18a4 ] When queues are started, netif_napi_add() and napi_enable() are called. If there are 4 queues and only 3 queues are used for the current configuration, only 3 queues' napi should be registered and enabled. The ionic_qcq_enable() checks whether the .poll pointer is not NULL for enabling only the using queue' napi. Unused queues' napi will not be registered by netif_napi_add(), so the .poll pointer indicates NULL. But it couldn't distinguish whether the napi was unregistered or not because netif_napi_del() doesn't reset the .poll pointer to NULL. So, ionic_qcq_enable() calls napi_enable() for the queue, which was unregistered by netif_napi_del(). Reproducer: ethtool -L <interface name> rx 1 tx 1 combined 0 ethtool -L <interface name> rx 0 tx 0 combined 1 ethtool -L <interface name> rx 0 tx 0 combined 4 Splat looks like: kernel BUG at net/core/dev.c:6666! Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI CPU: 3 PID: 1057 Comm: kworker/3:3 Not tainted 6.10.0-rc2+ #16 Workqueue: events ionic_lif_deferred_work [ionic] RIP: 0010:napi_enable+0x3b/0x40 Code: 48 89 c2 48 83 e2 f6 80 b9 61 09 00 00 00 74 0d 48 83 bf 60 01 00 00 00 74 03 80 ce 01 f0 4f RSP: 0018:ffffb6ed83227d48 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff97560cda0828 RCX: 0000000000000029 RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff97560cda0a28 RBP: ffffb6ed83227d50 R08: 0000000000000400 R09: 0000000000000001 R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000 R13: ffff97560ce3c1a0 R14: 0000000000000000 R15: ffff975613ba0a20 FS: 0000000000000000(0000) GS:ffff975d5f780000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f8f734ee200 CR3: 0000000103e50000 CR4: 00000000007506f0 PKRU: 55555554 Call Trace: <TASK> ? die+0x33/0x90 ? do_trap+0xd9/0x100 ? napi_enable+0x3b/0x40 ? do_error_trap+0x83/0xb0 ? napi_enable+0x3b/0x40 ? napi_enable+0x3b/0x40 ? exc_invalid_op+0x4e/0x70 ? napi_enable+0x3b/0x40 ? asm_exc_invalid_op+0x16/0x20 ? napi_enable+0x3b/0x40 ionic_qcq_enable+0xb7/0x180 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8] ionic_start_queues+0xc4/0x290 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8] ionic_link_status_check+0x11c/0x170 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8] ionic_lif_deferred_work+0x129/0x280 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8] process_one_work+0x145/0x360 worker_thread+0x2bb/0x3d0 ? __pfx_worker_thread+0x10/0x10 kthread+0xcc/0x100 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2d/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 Fixes: 0f3154e ("ionic: Add Tx and Rx handling") Signed-off-by: Taehee Yoo <[email protected]> Reviewed-by: Brett Creeley <[email protected]> Reviewed-by: Shannon Nelson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.