NULL pointer dereference #15399

Open
siilike opened this issue Oct 13, 2023 · 4 comments
Labels
Component: Encryption ("native encryption" feature)
Component: Send/Recv ("zfs send/recv" feature)
Type: Defect (incorrect behavior, e.g. crash, hang)

Comments

siilike commented Oct 13, 2023

System information

| Type | Version/Name |
| --- | --- |
| Distribution Name | Debian |
| Distribution Version | 11.8 |
| Kernel Version | 5.10.0-20 |
| Architecture | amd64 |
| OpenZFS Version | 2.1.11-1~bpo11+1 |

Describe the problem you're observing

NULL pointer dereference while receiving an encrypted dataset. Does not seem to be the same as any other open or closed issue.

Describe how to reproduce the problem

First time this has happened.

Include any warning/errors/backtraces from the system logs

[851506.092015] BUG: kernel NULL pointer dereference, address: 0000000000000000
[851506.092067] #PF: supervisor read access in kernel mode
[851506.092111] #PF: error_code(0x0000) - not-present page
[851506.092155] PGD 3556a1067 P4D 3556a1067 PUD 3b7f4b067 PMD 0 
[851506.092203] Oops: 0000 [#1] SMP NOPTI
[851506.092246] CPU: 1 PID: 2648078 Comm: receive_writer Tainted: P           OE     5.10.0-20-amd64 #1 Debian 5.10.158-2
[851506.092298] Hardware name: System manufacturer System Product Name/PRIME A320M-K, BIOS 5603 10/14/2020
[851506.092444] RIP: 0010:abd_borrow_buf_copy+0x21/0x90 [zfs]
[851506.092490] Code: 15 5b 11 00 0f 1f 44 00 00 0f 1f 44 00 00 41 55 41 54 55 48 89 fd 48 83 ec 10 65 48 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 <f6> 07 01 74 25 4c 8b 6f 48 48 8b 44 24 08 65 48 2b 04 25 28 00 00
[851506.092554] RSP: 0018:ffffa3de1ff4fa18 EFLAGS: 00010246
[851506.092595] RAX: 0000000000000000 RBX: ffff9438bcabad00 RCX: 0000000000000000
[851506.092638] RDX: 0000000000004000 RSI: 0000000000004000 RDI: 0000000000000000
[851506.092682] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[851506.092725] R10: 0000000000000004 R11: 0000000000000000 R12: 0000000000000010
[851506.092769] R13: 0000000000004000 R14: 0000000000000000 R15: 0000000000000020
[851506.092813] FS:  0000000000000000(0000) GS:ffff9438de640000(0000) knlGS:0000000000000000
[851506.092857] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[851506.092896] CR2: 0000000000000000 CR3: 000000023903c000 CR4: 00000000001506e0
[851506.092939] Call Trace:
[851506.093060]  zio_crypt_copy_dnode_bonus+0x2e/0x120 [zfs]
[851506.093144]  arc_buf_fill+0x3f9/0xce0 [zfs]
[851506.093186]  ? ___slab_alloc+0x32c/0x580
[851506.093267]  arc_untransform+0x1d/0x80 [zfs]
[851506.093349]  dbuf_read_verify_dnode_crypt+0xf4/0x160 [zfs]
[851506.093456]  dbuf_read_impl.constprop.0+0x52e/0x6e0 [zfs]
[851506.093544]  ? dbuf_cons+0xa7/0xc0 [zfs]
[851506.093595]  ? spl_kmem_cache_alloc+0xaf/0x7d0 [spl]
[851506.093687]  ? dbuf_rele_and_unlock+0x132/0x670 [zfs]
[851506.093772]  ? arc_buf_access+0x14c/0x250 [zfs]
[851506.093818]  ? _cond_resched+0x16/0x50
[851506.093860]  ? _cond_resched+0x16/0x50
[851506.093902]  ? mutex_lock+0xe/0x30
[851506.093985]  ? aggsum_add+0x180/0x1a0 [zfs]
[851506.094071]  dbuf_read+0xda/0x5e0 [zfs]
[851506.094165]  ? dnode_hold_impl+0x9a3/0x1080 [zfs]
[851506.094211]  ? _cond_resched+0x16/0x50
[851506.094323]  dmu_bonus_hold_by_dnode+0x86/0x1a0 [zfs]
[851506.094424]  receive_object+0x410/0xca0 [zfs]
[851506.094511]  ? dmu_object_next+0x95/0x120 [zfs]
[851506.094555]  ? kfree+0xba/0x490
[851506.094642]  ? receive_writer_thread+0xbd/0xad0 [zfs]
[851506.094681]  ? kfree+0x410/0x490
[851506.094720]  ? _cond_resched+0x16/0x50
[851506.094805]  receive_writer_thread+0x1cc/0xad0 [zfs]
[851506.094853]  ? thread_generic_wrapper+0x62/0x80 [spl]
[851506.094892]  ? kfree+0xba/0x490
[851506.094980]  ? receive_process_write_record+0x1a0/0x1a0 [zfs]
[851506.095027]  ? thread_generic_wrapper+0x6f/0x80 [spl]
[851506.095070]  thread_generic_wrapper+0x6f/0x80 [spl]
[851506.095114]  ? __thread_exit+0x20/0x20 [spl]
[851506.095155]  kthread+0x11b/0x140
[851506.095192]  ? __kthread_bind_mask+0x60/0x60
[851506.095232]  ret_from_fork+0x22/0x30
[851506.095270] Modules linked in: usblp ipmi_devintf ipmi_msghandler wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libcurve25519_generic libchacha amdgpu edac_mce_amd snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi kvm_amd snd_hda_intel snd_intel_dspcfg soundwire_intel soundwire_generic_allocation ccp rng_core kvm snd_usb_audio snd_soc_core gpu_sched eeepc_wmi ttm asus_wmi battery snd_compress soundwire_cadence sparse_keymap snd_usbmidi_lib snd_hda_codec sp5100_tco irqbypass rfkill wmi_bmof snd_rawmidi pcspkr snd_seq_device snd_hda_core mc snd_hwdep fam15h_power watchdog k10temp soundwire_bus snd_pcm drm_kms_helper snd_timer snd soundcore cec i2c_algo_bit button acpi_cpufreq evdev sg parport_pc ppdev nfsd auth_rpcgss nfs_acl lp parport lockd grace drm fuse configfs sunrpc ip_tables x_tables autofs4 zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) dm_crypt dm_mod
[851506.095332]  hid_generic sd_mod crc32_pclmul usbhid uas crc32c_intel hid usb_storage ahci libahci ghash_clmulni_intel mpt3sas nvme libata aesni_intel r8169 libaes crypto_simd nvme_core cryptd glue_helper t10_pi crc_t10dif raid_class scsi_transport_sas crct10dif_generic i2c_piix4 realtek mdio_devres libphy xhci_pci scsi_mod crct10dif_pclmul crct10dif_common xhci_hcd usbcore video usb_common gpio_amdpt gpio_generic wmi
[851506.095615] CR2: 0000000000000000
[851506.095653] ---[ end trace a247c66f70488d90 ]---
[851506.095774] RIP: 0010:abd_borrow_buf_copy+0x21/0x90 [zfs]
[851506.096941] Code: 15 5b 11 00 0f 1f 44 00 00 0f 1f 44 00 00 41 55 41 54 55 48 89 fd 48 83 ec 10 65 48 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 <f6> 07 01 74 25 4c 8b 6f 48 48 8b 44 24 08 65 48 2b 04 25 28 00 00
[851506.097005] RSP: 0018:ffffa3de1ff4fa18 EFLAGS: 00010246
[851506.097049] RAX: 0000000000000000 RBX: ffff9438bcabad00 RCX: 0000000000000000
[851506.097096] RDX: 0000000000004000 RSI: 0000000000004000 RDI: 0000000000000000
[851506.097144] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[851506.097190] R10: 0000000000000004 R11: 0000000000000000 R12: 0000000000000010
[851506.097237] R13: 0000000000004000 R14: 0000000000000000 R15: 0000000000000020
[851506.097284] FS:  0000000000000000(0000) GS:ffff9438de640000(0000) knlGS:0000000000000000
[851506.097333] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[851506.097376] CR2: 0000000000000000 CR3: 000000023903c000 CR4: 00000000001506e0
@siilike siilike added the Type: Defect Incorrect behavior (e.g. crash, hang) label Oct 13, 2023

siilike (Author) commented Oct 15, 2023

Second time:

[184225.465838] BUG: kernel NULL pointer dereference, address: 0000000000000000
[184225.465893] #PF: supervisor read access in kernel mode
[184225.465939] #PF: error_code(0x0000) - not-present page
[184225.465986] PGD 175d83067 P4D 175d83067 PUD 175d82067 PMD 0 
[184225.466039] Oops: 0000 [#1] SMP NOPTI
[184225.466088] CPU: 0 PID: 2255992 Comm: receive_writer Tainted: P           OE     5.10.0-20-amd64 #1 Debian 5.10.158-2
[184225.466142] Hardware name: System manufacturer System Product Name/PRIME A320M-K, BIOS 5603 10/14/2020
[184225.466310] RIP: 0010:abd_borrow_buf_copy+0x21/0x90 [zfs]
[184225.466363] Code: 15 5b 11 00 0f 1f 44 00 00 0f 1f 44 00 00 41 55 41 54 55 48 89 fd 48 83 ec 10 65 48 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 <f6> 07 01 74 25 4c 8b 6f 48 48 8b 44 24 08 65 48 2b 04 25 28 00 00
[184225.466435] RSP: 0018:ffffae3f5764b9b0 EFLAGS: 00010246
[184225.466484] RAX: 0000000000000000 RBX: ffff93b71ac7c000 RCX: 0000000000000000
[184225.466534] RDX: 0000000000004000 RSI: 0000000000004000 RDI: 0000000000000000
[184225.466581] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[184225.466628] R10: 0000000000000003 R11: 0000000000000000 R12: 0000000000000010
[184225.466674] R13: 0000000000004000 R14: 0000000000000000 R15: 0000000000000020
[184225.466721] FS:  0000000000000000(0000) GS:ffff93b81e600000(0000) knlGS:0000000000000000
[184225.466768] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[184225.466811] CR2: 0000000000000000 CR3: 000000017603e000 CR4: 00000000001506f0
[184225.466858] Call Trace:
[184225.467020]  zio_crypt_copy_dnode_bonus+0x2e/0x120 [zfs]
[184225.467146]  arc_buf_fill+0x3f9/0xce0 [zfs]
[184225.467234]  arc_untransform+0x1d/0x80 [zfs]
[184225.467321]  dbuf_read_verify_dnode_crypt+0xf4/0x160 [zfs]
[184225.467402]  dbuf_read_impl.constprop.0+0x2c4/0x6e0 [zfs]
[184225.467449]  ? _cond_resched+0x16/0x50
[184225.467545]  ? dbuf_create+0x43c/0x610 [zfs]
[184225.467631]  dbuf_read+0xda/0x5e0 [zfs]
[184225.467750]  dmu_tx_check_ioerr+0x64/0xd0 [zfs]
[184225.467879]  dmu_tx_hold_free_impl+0x12f/0x250 [zfs]
[184225.467994]  dmu_free_long_range+0x242/0x4d0 [zfs]
[184225.468083]  dmu_free_long_object+0x22/0xd0 [zfs]
[184225.468166]  receive_freeobjects+0x82/0x100 [zfs]
[184225.468255]  receive_writer_thread+0x565/0xad0 [zfs]
[184225.468304]  ? thread_generic_wrapper+0x62/0x80 [spl]
[184225.468347]  ? kfree+0xba/0x490
[184225.468436]  ? receive_process_write_record+0x1a0/0x1a0 [zfs]
[184225.468487]  ? thread_generic_wrapper+0x6f/0x80 [spl]
[184225.468535]  thread_generic_wrapper+0x6f/0x80 [spl]
[184225.468588]  ? __thread_exit+0x20/0x20 [spl]
[184225.468637]  kthread+0x11b/0x140
[184225.468680]  ? __kthread_bind_mask+0x60/0x60
[184225.468728]  ret_from_fork+0x22/0x30
[184225.468770] Modules linked in: rpcsec_gss_krb5 ipmi_devintf ipmi_msghandler wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libcurve25519_generic libchacha amdgpu edac_mce_amd kvm_amd ccp rng_core snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_usb_audio snd_hda_intel kvm snd_intel_dspcfg soundwire_intel soundwire_generic_allocation snd_soc_core snd_compress soundwire_cadence snd_hda_codec snd_hda_core snd_usbmidi_lib gpu_sched snd_hwdep snd_rawmidi ttm soundwire_bus snd_seq_device mc eeepc_wmi snd_pcm asus_wmi battery irqbypass wmi_bmof sparse_keymap pcspkr rfkill fam15h_power k10temp sp5100_tco snd_timer watchdog drm_kms_helper snd soundcore cec i2c_algo_bit acpi_cpufreq button evdev sg nfsd auth_rpcgss nfs_acl lockd grace parport_pc drm ppdev lp parport fuse sunrpc configfs ip_tables x_tables autofs4 zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE)
[184225.468828]  dm_crypt dm_mod hid_generic usbhid crc32_pclmul uas crc32c_intel usb_storage hid sd_mod ghash_clmulni_intel mpt3sas ahci raid_class libahci xhci_pci xhci_hcd r8169 scsi_transport_sas libata nvme realtek nvme_core mdio_devres aesni_intel libaes scsi_mod crypto_simd libphy cryptd glue_helper t10_pi crc_t10dif crct10dif_generic i2c_piix4 crct10dif_pclmul crct10dif_common usbcore usb_common wmi video gpio_amdpt gpio_generic
[184225.469141] CR2: 0000000000000000
[184225.469189] ---[ end trace 3b5aa52298dfc36e ]---
[184225.469323] RIP: 0010:abd_borrow_buf_copy+0x21/0x90 [zfs]
[184225.469370] Code: 15 5b 11 00 0f 1f 44 00 00 0f 1f 44 00 00 41 55 41 54 55 48 89 fd 48 83 ec 10 65 48 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 <f6> 07 01 74 25 4c 8b 6f 48 48 8b 44 24 08 65 48 2b 04 25 28 00 00
[184225.469437] RSP: 0018:ffffae3f5764b9b0 EFLAGS: 00010246
[184225.469484] RAX: 0000000000000000 RBX: ffff93b71ac7c000 RCX: 0000000000000000
[184225.469537] RDX: 0000000000004000 RSI: 0000000000004000 RDI: 0000000000000000
[184225.469584] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[184225.469630] R10: 0000000000000003 R11: 0000000000000000 R12: 0000000000000010
[184225.469675] R13: 0000000000004000 R14: 0000000000000000 R15: 0000000000000020
[184225.469722] FS:  0000000000000000(0000) GS:ffff93b81e600000(0000) knlGS:0000000000000000
[184225.469770] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[184225.469812] CR2: 0000000000000000 CR3: 000000017603e000 CR4: 00000000001506f0

rincebrain (Contributor) commented

Looks like one of the panics in #11679 to me.

The bug is, to my understanding, that when native encryption was added, ZFS started keeping a potential additional copy of the data (the encrypted one) alongside the potential decrypted copy. That violates assumptions elsewhere in the code that, in a specific edge case, there is only one potential thing whose ownership needs checking. The result is a race condition: the violated assumption has to line up with a specific series of operations that are only safe if a single holder is looking at the data, which the code takes for granted because it holds a reference to it, ...
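
To make the shape of that explanation concrete, here is a deliberately tiny toy model. It is not ZFS code: `CachedBlock`, `release_if_unowned`, `copy_bonus` and all field names are invented for illustration, and the real code paths are certainly more involved. It only shows how an ownership check that looks at one of two copies can free data that another holder still expects to be there.

```python
class CachedBlock:
    """Hypothetical stand-in for a cached block that may hold two copies."""

    def __init__(self, encrypted):
        self.encrypted = encrypted   # raw (encrypted) copy
        self.decrypted = None        # decrypted copy, filled in on demand
        self.encrypted_holds = 0     # holders of the encrypted copy
        self.decrypted_holds = 0     # holders of the decrypted copy


def release_if_unowned(block):
    # Assumed pre-encryption logic: only one kind of hold is checked,
    # so holds on the encrypted copy are never considered.
    if block.decrypted_holds == 0:
        block.encrypted = None
        block.decrypted = None


def copy_bonus(block):
    # A reader that trusts its hold and dereferences the encrypted copy.
    return block.encrypted[:4]


def racy_interleaving():
    """Replay one unlucky ordering of two threads, step by step."""
    block = CachedBlock(b"ciphertext")
    block.encrypted_holds += 1       # thread A takes a hold on the encrypted copy
    release_if_unowned(block)        # thread B sees no decrypted holds, frees both
    try:
        copy_bonus(block)            # thread A now reads a freed (None) copy
    except TypeError:
        return "dereferenced freed copy"
    return "ok"
```

In Python the failure surfaces as a `TypeError` on `None`; the kernel analogue would be a NULL pointer dereference like the ones logged in this issue. Whether the real bug follows exactly this ordering is not established here.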

@rincebrain rincebrain added Component: Send/Recv "zfs send/recv" feature Component: Encryption "native encryption" feature labels Oct 15, 2023

siilike (Author) commented Apr 25, 2024

More:

[236300.768680] BUG: kernel NULL pointer dereference, address: 0000000000000000
[236300.768736] #PF: supervisor read access in kernel mode
[236300.768781] #PF: error_code(0x0000) - not-present page
[236300.768827] PGD 0 P4D 0 
[236300.768871] Oops: 0000 [#1] PREEMPT SMP NOPTI
[236300.768917] CPU: 1 PID: 2723265 Comm: receive_writer Tainted: P           OE      6.1.0-18-amd64 #1  Debian 6.1.76-1
[236300.768975] Hardware name: System manufacturer System Product Name/PRIME A320M-K, BIOS 5603 10/14/2020
[236300.769028] RIP: 0010:abd_borrow_buf_copy+0x20/0x80 [zfs]
[236300.769219] Code: ff ff ff e8 52 4a e6 d2 66 90 0f 1f 44 00 00 41 54 55 53 48 89 fb 48 83 ec 10 65 48 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 <f6> 07 01 74 24 4c 8b 67 48 48 8b 44 24 08 65 48 2b 04 25 28 00 00
[236300.769288] RSP: 0018:ffffaf069ddcb9f8 EFLAGS: 00010246
[236300.769332] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002
[236300.769379] RDX: 0000000000004000 RSI: 0000000000004000 RDI: 0000000000000000
[236300.769426] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[236300.769474] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000020
[236300.769521] R13: ffffffffc0e5afe0 R14: 0000000000000000 R15: ffffffffc0e5b000
[236300.769567] FS:  0000000000000000(0000) GS:ffff9dd85e640000(0000) knlGS:0000000000000000
[236300.769615] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[236300.769657] CR2: 0000000000000000 CR3: 000000019aa96000 CR4: 00000000001506e0
[236300.769702] Call Trace:
[236300.769742]  <TASK>
[236300.769781]  ? __die_body.cold+0x1a/0x1f
[236300.769826]  ? page_fault_oops+0xd2/0x2b0
[236300.769871]  ? exc_page_fault+0x70/0x170
[236300.769914]  ? asm_exc_page_fault+0x22/0x30
[236300.769958]  ? abd_borrow_buf_copy+0x20/0x80 [zfs]
[236300.770137]  zio_crypt_copy_dnode_bonus+0x2e/0x130 [zfs]
[236300.770286]  arc_buf_fill+0x391/0xd10 [zfs]
[236300.770403]  arc_untransform+0x1d/0x80 [zfs]
[236300.770519]  dbuf_read_verify_dnode_crypt+0x11b/0x190 [zfs]
[236300.770640]  dbuf_read_impl.constprop.0+0x107/0x690 [zfs]
[236300.770761]  ? dbuf_rele_and_unlock+0xf3/0x770 [zfs]
[236300.770882]  ? dbuf_create+0x401/0x5d0 [zfs]
[236300.771001]  ? __kmem_cache_alloc_node+0x191/0x2a0
[236300.771049]  dbuf_read+0xd4/0x620 [zfs]
[236300.771172]  dmu_tx_check_ioerr+0x61/0xd0 [zfs]
[236300.771299]  dmu_tx_hold_free_impl+0x126/0x260 [zfs]
[236300.771422]  dmu_free_long_range+0x250/0x4d0 [zfs]
[236300.771541]  dmu_free_long_object+0x22/0xd0 [zfs]
[236300.771660]  receive_freeobjects+0xa0/0x120 [zfs]
[236300.771782]  receive_writer_thread+0x529/0xaa0 [zfs]
[236300.771904]  ? start_cfs_bandwidth.part.0+0x50/0x50
[236300.771947]  ? set_user_nice+0x162/0x270
[236300.771987]  ? receive_process_write_record+0x180/0x180 [zfs]
[236300.772114]  ? __thread_exit+0x20/0x20 [spl]
[236300.772166]  thread_generic_wrapper+0x5a/0x70 [spl]
[236300.772215]  kthread+0xda/0x100
[236300.772255]  ? kthread_complete_and_exit+0x20/0x20
[236300.772297]  ret_from_fork+0x22/0x30
[236300.772343]  </TASK>
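
For comparing reports like the three above, a small script can pull the faulting function and the reliable call-trace frames out of a pasted log. This is only a sketch: `summarize_oops` and the regular expressions are my own names and assume the dmesg layout seen in this issue (bracketed timestamps, `RIP:` line, `Call Trace:` header, `? ` prefix on unreliable frames).

```python
import re

# Matches e.g. "RIP: 0010:abd_borrow_buf_copy+0x21/0x90 [zfs]"
RIP_RE = re.compile(r"RIP: [0-9a-f]{4}:([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")
# Matches reliable frames only; frames marked "? " do not match and are skipped.
FRAME_RE = re.compile(r"^\[\s*\d+\.\d+\]\s+(\w[\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_oops(log_text):
    """Return (faulting_function, reliable_frames) for the first oops found."""
    rip = None
    frames = []
    in_trace = False
    for line in log_text.splitlines():
        if rip is None:
            match = RIP_RE.search(line)
            if match:
                rip = match.group(1)
        if "Call Trace:" in line:
            in_trace = True
            continue
        if in_trace:
            # The trace ends at the module list or the closing TASK marker.
            if "Modules linked in" in line or "</TASK>" in line:
                break
            match = FRAME_RE.match(line)
            if match:
                frames.append(match.group(1))
    return rip, frames


# Lines copied from the first report in this issue.
SAMPLE = """\
[851506.092444] RIP: 0010:abd_borrow_buf_copy+0x21/0x90 [zfs]
[851506.092939] Call Trace:
[851506.093060]  zio_crypt_copy_dnode_bonus+0x2e/0x120 [zfs]
[851506.093144]  arc_buf_fill+0x3f9/0xce0 [zfs]
[851506.093186]  ? ___slab_alloc+0x32c/0x580
[851506.093267]  arc_untransform+0x1d/0x80 [zfs]
"""

if __name__ == "__main__":
    function, frames = summarize_oops(SAMPLE)
    print(function, frames)
```

Reading the three logs by eye gives the same picture: each faults in `abd_borrow_buf_copy` and shares the frames from `zio_crypt_copy_dnode_bonus` down to `dbuf_read`, while the caller differs (`receive_object` in the first report, `receive_freeobjects` in the later two).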

amotin (Member) commented Apr 25, 2024

I expect this is already fixed in master by #16104. The fix is likely too recent for the upcoming 2.2.4, but it should appear in the following 2.2.5 release.
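
A quick way to check a system against that expectation is to compare the loaded module version with 2.2.5. The sketch below assumes the fix really does ship in 2.2.5 (the comment above only expects it to) and ignores any backports to other branches. `parse_version` and `at_or_after_expected_fix` are illustrative helper names; `/sys/module/zfs/version` is where the loaded Linux module reports its version.

```python
import re
from pathlib import Path

# Assumption: the fix lands in 2.2.5, as expected (not confirmed) above.
EXPECTED_FIX = (2, 2, 5)


def parse_version(text):
    """Extract (major, minor, patch) from strings like '2.1.11-1~bpo11+1'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def at_or_after_expected_fix(text):
    """True if the version is 2.2.5 or later; compares numerically, not as text."""
    return parse_version(text) >= EXPECTED_FIX


if __name__ == "__main__":
    version_file = Path("/sys/module/zfs/version")
    if version_file.exists():
        loaded = version_file.read_text().strip()
        print(loaded, at_or_after_expected_fix(loaded))
    else:
        print("zfs module not loaded")
```

Comparing tuples of integers avoids the string-ordering trap where "2.1.11" would sort before "2.1.9".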
