Kernel Panic after a couple of minutes while using APFS inside zvol on mirror-pool #90

Open
JMoVS opened this issue Jul 13, 2021 · 26 comments

Comments

@JMoVS
Contributor

JMoVS commented Jul 13, 2021

System information

| Type | Version/Name |
| --- | --- |
| Distribution Name | macOS |
| Distribution Version | 11, Big Sur |
| Linux Kernel | |
| Architecture | ARM M1 |
| ZFS Version | 2.1 rc1 |
| SPL Version | |

Describe the problem you're observing

Kernel panic after around 15 minutes of running a Time Machine backup to an APFS volume inside a zvol.

Describe how to reproduce the problem

Start a Time Machine backup to an APFS volume inside a zvol, then wait.

Include any warning/errors/backtraces from the system logs

https://gist.github.com/JMoVS/21e46b2813bf2196fe1876bdccc6f029 (copy-pasted version; the direct file is at https://gist.github.com/JMoVS/31ab6dc11516b200799258d5fd6a6905)
https://gist.github.com/JMoVS/1f0231732ea493c816b67ad40a80ad95
https://gist.github.com/JMoVS/33ac87f3c7faa127d6e026d24e96b69d

@JMoVS
Contributor Author

JMoVS commented Jul 14, 2021

Here is another panic, just minutes after importing the pool:
https://gist.github.com/JMoVS/688585ca6495d33f48dd45f587c7dc6e

@JMoVS
Contributor Author

JMoVS commented Jul 14, 2021

And one more after import:

panic(cpu 7 caller 0xfffffe0019c40888): Kernel data abort. at pc 0xfffffe00185a9e10, lr 0xcfa77e00185c2ddc (saved state: 0xfffffe3fe9882f10)
	  x0: 0x0000000000000000  x1:  0xfffffe4002fc1ac0  x2:  0x0000000000000000  x3:  0xfffffe3fe9883128
	  x4: 0x00000000ffffffff  x5:  0x0000000000000002  x6:  0x0000000000000000  x7:  0x0000000000000000
	  x8: 0x0000000000000000  x9:  0xfffffe001d21d000  x10: 0xfffffe400000d800  x11: 0x0000000000000007
	  x12: 0x0000000000000008 x13: 0xfffffe0017013bd8  x14: 0xfffffe0017013df8  x15: 0xfffffe408cee3648
	  x16: 0xfffffe00195a4694 x17: 0xffffffffffffffff  x18: 0x0000000000000000  x19: 0xfffffe167fb44a00
	  x20: 0xfffffe3fe9883cd0 x21: 0x0000000000000003  x22: 0x00000000c0205a02  x23: 0xfffffe3fe9883cb0
	  x24: 0x00000000c0205a02 x25: 0x0000000020007450  x26: 0xfffffe3fe9883cd0  x27: 0xfffffe167cda4260
	  x28: 0xfffffe001d235000 fp:  0xfffffe3fe9883290  lr:  0xcfa77e00185c2ddc  sp:  0xfffffe3fe9883260
	  pc:  0xfffffe00185a9e10 cpsr: 0x60401208         esr: 0x96000006          far: 0x00000000000000e1

Debugger message: panic
Memory ID: 0x6
OS release type: User
OS version: 20F71
Kernel version: Darwin Kernel Version 20.5.0: Sat May  8 05:10:31 PDT 2021; root:xnu-7195.121.3~9/RELEASE_ARM64_T8101
Fileset Kernelcache UUID: FB10CC0AB8BAC020BC47A50D64476F11
Kernel UUID: 07259C53-9EF7-32FF-821D-8F28A5985DFA
iBoot version: iBoot-6723.120.36
secure boot?: YES
Paniclog version: 13
KernelCache slide: 0x0000000011824000
KernelCache base:  0xfffffe0018828000
Kernel slide:      0x000000001236c000
Kernel text base:  0xfffffe0019370000
Kernel text exec base:  0xfffffe001943c000
mach_absolute_time: 0x93910d05d
Epoch Time:        sec       usec
  Boot    : 0x60ee78c8 0x000cabcd
  Sleep   : 0x00000000 0x00000000
  Wake    : 0x00000000 0x00000000
  Calendar: 0x60ee7f35 0x000bd5c8

CORE 0 recently retired instr at 0xfffffe00195ae6f4
CORE 1 recently retired instr at 0xfffffe00195ae6f4
CORE 2 recently retired instr at 0xfffffe00195ae6f4
CORE 3 recently retired instr at 0xfffffe00195ae6f4
CORE 4 recently retired instr at 0xfffffe00195ae6f8
CORE 5 recently retired instr at 0xfffffe00195ae6f8
CORE 6 recently retired instr at 0xfffffe00195ae6f8
CORE 7 recently retired instr at 0xfffffe00195ad240
Panicked task 0xfffffe1672b7e1f8: 448 pages, 2 threads: pid 874: net.the-color-bl
Panicked thread: 0xfffffe16702e4cc0, backtrace: 0xfffffe3fe9882680, tid: 11437
		  lr: 0xfffffe001948abe4  fp: 0xfffffe3fe98826f0
		  lr: 0xfffffe001948a9c8  fp: 0xfffffe3fe9882760
		  lr: 0xfffffe00195b3a70  fp: 0xfffffe3fe9882780
		  lr: 0xfffffe00195a52b8  fp: 0xfffffe3fe9882830
		  lr: 0xfffffe00194437e8  fp: 0xfffffe3fe9882840
		  lr: 0xfffffe001948a658  fp: 0xfffffe3fe9882bd0
		  lr: 0xfffffe001948a658  fp: 0xfffffe3fe9882c40
		  lr: 0xfffffe0019c3c3e8  fp: 0xfffffe3fe9882c60
		  lr: 0xfffffe0019c40888  fp: 0xfffffe3fe9882dd0
		  lr: 0xfffffe00195a7294  fp: 0xfffffe3fe9882e40
		  lr: 0xfffffe00195a51e0  fp: 0xfffffe3fe9882ef0
		  lr: 0xfffffe00194437e8  fp: 0xfffffe3fe9882f00
		  lr: 0xfffffe00185c2ddc  fp: 0xfffffe3fe9883290
		  lr: 0xfffffe00185c2ddc  fp: 0xfffffe3fe9883320
		  lr: 0xfffffe00185a9d64  fp: 0xfffffe3fe9883380
		  lr: 0xfffffe00185a666c  fp: 0xfffffe3fe9883560
		  lr: 0xfffffe00185a9a74  fp: 0xfffffe3fe98835e0
		  lr: 0xfffffe00185c7304  fp: 0xfffffe3fe9883670
		  lr: 0xfffffe00185c3cf4  fp: 0xfffffe3fe98836d0
		  lr: 0xfffffe00184d7cac  fp: 0xfffffe3fe9883810
		  lr: 0xfffffe001856b794  fp: 0xfffffe3fe9883850
		  lr: 0xfffffe00185645e0  fp: 0xfffffe3fe9883940
		  lr: 0xfffffe0018570d9c  fp: 0xfffffe3fe9883a10
		  lr: 0xfffffe001970be4c  fp: 0xfffffe3fe9883a90
		  lr: 0xfffffe00196fef40  fp: 0xfffffe3fe9883ca0
		  lr: 0xfffffe00199b7584  fp: 0xfffffe3fe9883db0
		  lr: 0xfffffe0019a917c0  fp: 0xfffffe3fe9883e40
		  lr: 0xfffffe00195a4f94  fp: 0xfffffe3fe9883ef0
		  lr: 0xfffffe00194437e8  fp: 0xfffffe3fe9883f00
      Kernel Extensions in backtrace:
         org.openzfsonosx.zfs(2.1)[0BF8CB05-9B3B-3182-8DE6-AF14261D75B8]@0xfffffe0018410000->0xfffffe00186fffff
            dependency: com.apple.iokit.IOStorageFamily(2.1)[8A98ACCC-D34C-36E3-93F6-D8A2F39C2A51]@0xfffffe001b958000->0xfffffe001b977fff

last started kext at 24394817680: com.apple.filesystems.smbfs	3.6 (addr 0xfffffe0019344000, size 65536)

@JMoVS
Contributor Author

JMoVS commented Jul 14, 2021

I can no longer import the pool without panicking:

panic(cpu 5 caller 0xfffffe0023c40888): Kernel data abort. at pc 0xfffffe00225a9e10, lr 0xed8b7e00225c2ddc (saved state: 0xfffffe3feaf52f10)
	  x0: 0x0000000000000000  x1:  0xfffffe4002f61d00  x2:  0x0000000000000000  x3:  0xfffffe3feaf53128
	  x4: 0x00000000ffffffff  x5:  0x0000000000000002  x6:  0x0000000000000000  x7:  0x0000000000000000
	  x8: 0x0000000000000000  x9:  0xfffffe002721d000  x10: 0xfffffe400000d800  x11: 0x0000000000000000
	  x12: 0x0000000000000001 x13: 0xfffffe0021013bd8  x14: 0xfffffe0021013df8  x15: 0xfffffe4084af7e70
	  x16: 0xfffffe00235a4694 x17: 0xffffffffffffffff  x18: 0x0000000000000000  x19: 0xfffffe167d7e9f00
	  x20: 0xfffffe3feaf53cd0 x21: 0x0000000000000003  x22: 0x00000000c0205a02  x23: 0xfffffe3feaf53cb0
	  x24: 0x00000000c0205a02 x25: 0x0000000020007450  x26: 0xfffffe3feaf53cd0  x27: 0xfffffe16757138a0
	  x28: 0xfffffe0027235000 fp:  0xfffffe3feaf53290  lr:  0xed8b7e00225c2ddc  sp:  0xfffffe3feaf53260
	  pc:  0xfffffe00225a9e10 cpsr: 0x60401208         esr: 0x96000006          far: 0x00000000000000e1

Debugger message: panic
Memory ID: 0x6
OS release type: User
OS version: 20F71
Kernel version: Darwin Kernel Version 20.5.0: Sat May  8 05:10:31 PDT 2021; root:xnu-7195.121.3~9/RELEASE_ARM64_T8101
Fileset Kernelcache UUID: FB10CC0AB8BAC020BC47A50D64476F11
Kernel UUID: 07259C53-9EF7-32FF-821D-8F28A5985DFA
iBoot version: iBoot-6723.120.36
secure boot?: YES
Paniclog version: 13
KernelCache slide: 0x000000001b824000
KernelCache base:  0xfffffe0022828000
Kernel slide:      0x000000001c36c000
Kernel text base:  0xfffffe0023370000
Kernel text exec base:  0xfffffe002343c000
mach_absolute_time: 0x20cb82a07
Epoch Time:        sec       usec
  Boot    : 0x60ee7f47 0x000b2fa5
  Sleep   : 0x00000000 0x00000000
  Wake    : 0x00000000 0x00000000
  Calendar: 0x60ee80b0 0x0008bc74

CORE 0 recently retired instr at 0xfffffe00235ae6f4
CORE 1 recently retired instr at 0xfffffe00235ae6f4
CORE 2 recently retired instr at 0xfffffe00235ae6f4
CORE 3 recently retired instr at 0xfffffe00235ae6f4
CORE 4 recently retired instr at 0xfffffe00235ae6f8
CORE 5 recently retired instr at 0xfffffe00235ad240
CORE 6 recently retired instr at 0xfffffe00235ae6f8
CORE 7 recently retired instr at 0xfffffe00235ae6f8
Panicked task 0xfffffe166dd92f08: 440 pages, 1 threads: pid 1209: zpool
Panicked thread: 0xfffffe166c9d0cc0, backtrace: 0xfffffe3feaf52680, tid: 15557
		  lr: 0xfffffe002348abe4  fp: 0xfffffe3feaf526f0
		  lr: 0xfffffe002348a9c8  fp: 0xfffffe3feaf52760
		  lr: 0xfffffe00235b3a70  fp: 0xfffffe3feaf52780
		  lr: 0xfffffe00235a52b8  fp: 0xfffffe3feaf52830
		  lr: 0xfffffe00234437e8  fp: 0xfffffe3feaf52840
		  lr: 0xfffffe002348a658  fp: 0xfffffe3feaf52bd0
		  lr: 0xfffffe002348a658  fp: 0xfffffe3feaf52c40
		  lr: 0xfffffe0023c3c3e8  fp: 0xfffffe3feaf52c60
		  lr: 0xfffffe0023c40888  fp: 0xfffffe3feaf52dd0
		  lr: 0xfffffe00235a7294  fp: 0xfffffe3feaf52e40
		  lr: 0xfffffe00235a51e0  fp: 0xfffffe3feaf52ef0
		  lr: 0xfffffe00234437e8  fp: 0xfffffe3feaf52f00
		  lr: 0xfffffe00225c2ddc  fp: 0xfffffe3feaf53290
		  lr: 0xfffffe00225c2ddc  fp: 0xfffffe3feaf53320
		  lr: 0xfffffe00225a9d64  fp: 0xfffffe3feaf53380
		  lr: 0xfffffe00225a666c  fp: 0xfffffe3feaf53560
		  lr: 0xfffffe00225a9a74  fp: 0xfffffe3feaf535e0
		  lr: 0xfffffe00225c7304  fp: 0xfffffe3feaf53670
		  lr: 0xfffffe00225c3cf4  fp: 0xfffffe3feaf536d0
		  lr: 0xfffffe00224d7cac  fp: 0xfffffe3feaf53810
		  lr: 0xfffffe002256b794  fp: 0xfffffe3feaf53850
		  lr: 0xfffffe00225645e0  fp: 0xfffffe3feaf53940
		  lr: 0xfffffe0022570d9c  fp: 0xfffffe3feaf53a10
		  lr: 0xfffffe002370be4c  fp: 0xfffffe3feaf53a90
		  lr: 0xfffffe00236fef40  fp: 0xfffffe3feaf53ca0
		  lr: 0xfffffe00239b7584  fp: 0xfffffe3feaf53db0
		  lr: 0xfffffe0023a917c0  fp: 0xfffffe3feaf53e40
		  lr: 0xfffffe00235a4f94  fp: 0xfffffe3feaf53ef0
		  lr: 0xfffffe00234437e8  fp: 0xfffffe3feaf53f00
      Kernel Extensions in backtrace:
         org.openzfsonosx.zfs(2.1)[0BF8CB05-9B3B-3182-8DE6-AF14261D75B8]@0xfffffe0022410000->0xfffffe00226fffff
            dependency: com.apple.iokit.IOStorageFamily(2.1)[8A98ACCC-D34C-36E3-93F6-D8A2F39C2A51]@0xfffffe0025958000->0xfffffe0025977fff

@JMoVS
Contributor Author

JMoVS commented Jul 14, 2021

Read-only import:

panic(cpu 4 caller 0xfffffe002dc40888): Invalid kernel stack pointer (probable overflow). at pc 0xfffffe002d4430d8, lr 0xe5ef7e002d523468 (saved state: 0xfffffe3069edbcb0)
	  x0: 0x0000000000000010  x1:  0x0000000000000000  x2:  0xfffffe3088a50088  x3:  0x000000000003ffff
	  x4: 0x0000000000000000  x5:  0x00000000000000a0  x6:  0x0000000000000000  x7:  0x0000000000000000
	  x8: 0x0000000000000000  x9:  0x0000000000000001  x10: 0x0000000000040000  x11: 0x0000000000040000
	  x12: 0x0000000000000000 x13: 0x0000000000000000  x14: 0xfffffe002c410000  x15: 0xfffffe002c700000
	  x16: 0x0000000000000001 x17: 0xfffffe00311b8e40  x18: 0x0000000000000000  x19: 0x0000000000040000
	  x20: 0xfffffe0035510120 x21: 0x00000000000000a0  x22: 0x000000000003ffff  x23: 0x0000000000000000
	  x24: 0x0000000000000000 x25: 0x0000000000000000  x26: 0x0000000000000000  x27: 0x0000000000000001
	  x28: 0x0000000000000000 fp:  0xfffffe3088a500f0  lr:  0xe5ef7e002d523468  sp:  0xfffffe3088a4fc80
	  pc:  0xfffffe002d4430d8 cpsr: 0x204013c8         esr: 0x96000047          far: 0xfffffe3088a4fc88

Debugger message: panic
Memory ID: 0x6
OS release type: User
OS version: 20F71
Kernel version: Darwin Kernel Version 20.5.0: Sat May  8 05:10:31 PDT 2021; root:xnu-7195.121.3~9/RELEASE_ARM64_T8101
Fileset Kernelcache UUID: FB10CC0AB8BAC020BC47A50D64476F11
Kernel UUID: 07259C53-9EF7-32FF-821D-8F28A5985DFA
iBoot version: iBoot-6723.120.36
secure boot?: YES
Paniclog version: 13
KernelCache slide: 0x0000000025824000
KernelCache base:  0xfffffe002c828000
Kernel slide:      0x000000002636c000
Kernel text base:  0xfffffe002d370000
Kernel text exec base:  0xfffffe002d43c000
mach_absolute_time: 0x2d69b9828
Epoch Time:        sec       usec
  Boot    : 0x60ee90cd 0x000a8bda
  Sleep   : 0x00000000 0x00000000
  Wake    : 0x00000000 0x00000000
  Calendar: 0x60ee92c3 0x00061000

CORE 0 recently retired instr at 0xfffffe002d5ae6f4
CORE 1 recently retired instr at 0xfffffe002d5ae6f4
CORE 2 recently retired instr at 0xfffffe002d5ae6f4
CORE 3 recently retired instr at 0xfffffe002d5ae6f4
CORE 4 recently retired instr at 0xfffffe002d5ad240
CORE 5 recently retired instr at 0xfffffe002d5ae6f8
CORE 6 recently retired instr at 0xfffffe002d5ae6f8
CORE 7 recently retired instr at 0xfffffe002d5ae6f8
Panicked task 0xfffffe167964f590: 114 pages, 1 threads: pid 1923: mount_apfs
Panicked thread: 0xfffffe166c5ed320, backtrace: 0xfffffe3069edb5c0, tid: 22456
		  lr: 0xfffffe002d48abe4  fp: 0xfffffe3069edb630
		  lr: 0xfffffe002d48a9c8  fp: 0xfffffe3069edb6a0
		  lr: 0xfffffe002d5b3a70  fp: 0xfffffe3069edb6c0
		  lr: 0xfffffe002dc408dc  fp: 0xfffffe3069edb6e0
		  lr: 0xfffffe002d4439bc  fp: 0xfffffe3069edb6f0
		  lr: 0xfffffe002d48a658  fp: 0xfffffe3069edba80
		  lr: 0xfffffe002d48a658  fp: 0xfffffe3069edbaf0
		  lr: 0xfffffe002dc3c3e8  fp: 0xfffffe3069edbb10
		  lr: 0xfffffe002dc40888  fp: 0xfffffe3069edbc80
		  lr: 0xfffffe002d5a82cc  fp: 0xfffffe3069edbc90
		  lr: 0xfffffe002d44399c  fp: 0xfffffe3069edbca0
		  lr: 0xfffffe002d523468  fp: 0xfffffe3088a500f0
		  lr: 0xfffffe002db04a9c  fp: 0xfffffe3088a50130
		  lr: 0xfffffe002c6ec5c0  fp: 0xfffffe3088a501e0
		  lr: 0xfffffe002c6f9d2c  fp: 0xfffffe3088a50220
		  lr: 0xfffffe002c6f721c  fp: 0xfffffe3088a503a0
		  lr: 0xfffffe002c6f2e4c  fp: 0xfffffe3088a50610
		  lr: 0xfffffe002c6f4238  fp: 0xfffffe3088a50660
		  lr: 0xfffffe002c6f84b8  fp: 0xfffffe3088a50950
		  lr: 0xfffffe002c6f2e4c  fp: 0xfffffe3088a50bc0
		  lr: 0xfffffe002c6f4238  fp: 0xfffffe3088a50c10
		  lr: 0xfffffe002c6f2e4c  fp: 0xfffffe3088a50e80
		  lr: 0xfffffe002c6f4238  fp: 0xfffffe3088a50ed0
		  lr: 0xfffffe002c6f2e4c  fp: 0xfffffe3088a51140
		  lr: 0xfffffe002c6f4238  fp: 0xfffffe3088a51190
		  lr: 0xfffffe002c6f2e4c  fp: 0xfffffe3088a51400
		  lr: 0xfffffe002c6f4238  fp: 0xfffffe3088a51450
		  lr: 0xfffffe002c6e2ce8  fp: 0xfffffe3088a514f0
		  lr: 0xfffffe002c6dbcd8  fp: 0xfffffe3088a51530
		  lr: 0xfffffe002c6db54c  fp: 0xfffffe3088a515f0
		  lr: 0xfffffe002c6e2d44  fp: 0xfffffe3088a51690
		  lr: 0xfffffe002c6dbcd8  fp: 0xfffffe3088a516d0
		  lr: 0xfffffe002c6db54c  fp: 0xfffffe3088a51790
		  lr: 0xfffffe002c412a0c  fp: 0xfffffe3088a517d0
		  lr: 0xfffffe002c4103cc  fp: 0xfffffe3088a51820
		  lr: 0xfffffe002c422a2c  fp: 0xfffffe3088a51860
		  lr: 0xfffffe002c41aa9c  fp: 0xfffffe3088a51910
		  lr: 0xfffffe002c41626c  fp: 0xfffffe3088a51970
		  lr: 0xfffffe002c418f18  fp: 0xfffffe3088a51e80
		  lr: 0xfffffe002c436ed0  fp: 0xfffffe3088a51fa0
		  lr: 0xfffffe002c4364c0  fp: 0xfffffe3088a520f0
		  lr: 0xfffffe002c4452d0  fp: 0xfffffe3088a52200
		  lr: 0xfffffe002c446988  fp: 0xfffffe3088a52280
		  lr: 0xfffffe002c5c5c48  fp: 0xfffffe3088a52320
		  lr: 0xfffffe002c5c9240  fp: 0xfffffe3088a52370
		  lr: 0xfffffe002f95efcc  fp: 0xfffffe3088a523f0
		  lr: 0xfffffe002f963a74  fp: 0xfffffe3088a524a0
		  lr: 0xfffffe002f96e530  fp: 0xfffffe3088a52560
		  lr: 0xfffffe002d7164f4  fp: 0xfffffe3088a525c0
		  lr: 0xfffffe002d70f328  fp: 0xfffffe3088a52600
      Kernel Extensions in backtrace:
         com.apple.iokit.IOStorageFamily(2.1)[8A98ACCC-D34C-36E3-93F6-D8A2F39C2A51]@0xfffffe002f958000->0xfffffe002f977fff
         org.openzfsonosx.zfs(2.1)[0BF8CB05-9B3B-3182-8DE6-AF14261D75B8]@0xfffffe002c410000->0xfffffe002c6fffff
            dependency: com.apple.iokit.IOStorageFamily(2.1)[8A98ACCC-D34C-36E3-93F6-D8A2F39C2A51]@0xfffffe002f958000->0xfffffe002f977fff

last started kext at 8603072333: com.apple.driver.CoreStorageFsck	554 (addr 0xfffffe002ce88000, size 16384)

https://gist.github.com/JMoVS/37427666f7aae402b0a443fea2ada817

@lundman
Contributor

lundman commented Jul 14, 2021

Invalid kernel stack pointer (probable overflow).

That is most likely the issue there, and setting a larger stack might help.
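
For reference, the esr values in the register dumps can be decoded by hand. A minimal sketch, assuming the standard ARMv8-A ESR_EL1 layout (EC in bits 31:26, IL in bit 25, and for data aborts WnR in bit 6 and DFSC in bits 5:0). The function name and the small lookup tables are illustrative and not part of any existing tool:

```python
# Decode the `esr` field of an arm64 panic log (ARMv8-A ESR_EL1 layout).
# Only the few codes relevant to this thread are named; anything else
# is reported as "unknown". The WnR bit is only meaningful for data aborts.

EC_NAMES = {
    0x24: "data abort from a lower exception level",
    0x25: "data abort taken without a change in exception level",
}

DFSC_NAMES = {
    0x04: "translation fault, level 0",
    0x05: "translation fault, level 1",
    0x06: "translation fault, level 2",
    0x07: "translation fault, level 3",
}


def decode_esr(esr: int) -> dict:
    """Split an ESR value into the fields that matter for a data abort."""
    ec = (esr >> 26) & 0x3F      # exception class, bits 31:26
    dfsc = esr & 0x3F            # data fault status code, bits 5:0
    return {
        "ec": ec,
        "ec_name": EC_NAMES.get(ec, "unknown"),
        "il": (esr >> 25) & 1,   # instruction length bit
        "write": bool((esr >> 6) & 1),  # WnR: True = write, False = read
        "dfsc": dfsc,
        "dfsc_name": DFSC_NAMES.get(dfsc, "unknown"),
    }


if __name__ == "__main__":
    # The two esr values that appear in the panic logs in this issue.
    for esr in (0x96000006, 0x96000047):
        print(hex(esr), decode_esr(esr))
```

Both values decode to a data abort taken in the kernel with a translation fault. 0x96000006 is a read, with far 0xe1, which looks like a field access through a NULL pointer. 0x96000047 is a write, and in those panics far is exactly sp + 8, which fits a store onto a stack that has run past its end.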

@lundman
Contributor

lundman commented Jul 14, 2021

org.openzfsonosx.zfs(2.1)[0BF8CB05-9B3B-3182-8DE6-AF14261D75B8]@ 0xfffffe0022410000->0xfffffe00226fffff
0xfffffe0022410000 - 0x88000 = 0xFFFFFE0022388000

		  lr: 0xfffffe002348abe4  fp: 0xfffffe3feaf526f0
		  lr: 0xfffffe002348a9c8  fp: 0xfffffe3feaf52760
...

# atos -o module/os/macos/zfs/zfs -arch arm64e -l 0xFFFFFE0022388000 0xfffffe002348abe4 0xfffffe002348a9c8 0xfffffe00235b3a70 0xfffffe00235a52b8 0xfffffe00234437e8 0xfffffe002348a658 0xfffffe002348a658 0xfffffe0023c3c3e8 0xfffffe0023c40888 0xfffffe00235a7294 0xfffffe00235a51e0 0xfffffe00234437e8 0xfffffe00225c2ddc 0xfffffe00225c2ddc 0xfffffe00225a9d64 0xfffffe00225a666c 0xfffffe00225a9a74 0xfffffe00225c7304 0xfffffe00225c3cf4 0xfffffe00224d7cac 0xfffffe002256b794 0xfffffe00225645e0 0xfffffe0022570d9c 0xfffffe002370be4c 0xfffffe00236fef40 0xfffffe00239b7584 0xfffffe0023a917c0 0xfffffe00235a4f94 0xfffffe00234437e8

vmem_init.initial_default_block (in zfs) + 10709988
vmem_init.initial_default_block (in zfs) + 10709448
vmem_init.initial_default_block (in zfs) + 11926128
vmem_init.initial_default_block (in zfs) + 11866808
vmem_init.initial_default_block (in zfs) + 10418152
vmem_init.initial_default_block (in zfs) + 10708568
vmem_init.initial_default_block (in zfs) + 10708568
0xfffffe0023c3c3e8
0xfffffe0023c40888
vmem_init.initial_default_block (in zfs) + 11874964
vmem_init.initial_default_block (in zfs) + 11866592
vmem_init.initial_default_block (in zfs) + 10418152
zvol_replay_write (in zfs) (zvol.c:528)
zvol_replay_write (in zfs) (zvol.c:528)
zil_replay_log_record (in zfs) (zil.c:3564)
zil_parse (in zfs) (zil.c:408)
zil_replay (in zfs) (zil.c:3619)
zvol_os_create_minor (in zfs) (zvol_os.c:0)
zvol_create_minors_recursive (in zfs) (zvol.c:1160)
spa_import (in zfs) (spa.c:6219)
zfs_ioc_pool_import (in zfs) (zfs_ioctl.c:1561)
zfsdev_ioctl_common (in zfs) (zfs_ioctl.c:7666)
zfsdev_ioctl (in zfs) (zfs_ioctl_os.c:198)
vmem_init.initial_default_block (in zfs) + 13336140
vmem_init.initial_default_block (in zfs) + 13283136
vmem_init.initial_default_block (in zfs) + 16135556
0xfffffe0023a917c0
vmem_init.initial_default_block (in zfs) + 11866004
vmem_init.initial_default_block (in zfs) + 10418152

Probably the zvol replay code we fixed today.
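
The address arithmetic above can be scripted so that only frames inside the kext are handed to atos. Frames outside the kext range come back as meaningless `vmem_init.initial_default_block + N` offsets, as the output shows. A rough sketch: the 0x88000 offset is copied from the calculation above as-is, the helper names are made up for illustration, and only the atos flags already used in the command above appear here:

```python
import shlex

# Kext range from the "Kernel Extensions in backtrace" line of the panic log.
KEXT_START = 0xfffffe0022410000
KEXT_END = 0xfffffe00226fffff
# Offset taken from the calculation in this comment, not derived here.
LOAD_OFFSET = 0x88000

# A few lr values from the zpool import panic, for illustration.
BACKTRACE_LR = [
    0xfffffe002348abe4,  # outside the kext range
    0xfffffe00225c2ddc,
    0xfffffe00225a9d64,
    0xfffffe00225a666c,
    0xfffffe002370be4c,  # outside the kext range
]


def kext_frames(addresses, start, end):
    """Keep only return addresses that fall inside the kext's range."""
    return [a for a in addresses if start <= a <= end]


def atos_command(binary, addresses, arch="arm64e"):
    """Build the atos argument list with the adjusted load address."""
    load_address = KEXT_START - LOAD_OFFSET
    return (
        ["atos", "-o", binary, "-arch", arch, "-l", hex(load_address)]
        + [hex(a) for a in addresses]
    )


if __name__ == "__main__":
    frames = kext_frames(BACKTRACE_LR, KEXT_START, KEXT_END)
    print(shlex.join(atos_command("module/os/macos/zfs/zfs", frames)))
```

The script only prints the command line; running it still needs a zfs binary that matches the loaded kext UUID.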

@JMoVS
Contributor Author

JMoVS commented Jul 14, 2021

The import problem is fixed with rc2. I am now monitoring general use again.

@JMoVS
Contributor Author

JMoVS commented Jul 14, 2021

Import works again. I have now tried volblocksize 128k with rc2 and got a new panic:

https://gist.github.com/JMoVS/aa6483efbf0dbaeb76bcbee7f07467bf

@JMoVS
Contributor Author

JMoVS commented Jul 16, 2021

New panic with the custom pkg:

panic(cpu 2 caller 0xfffffe002feb8888): Invalid kernel stack pointer (probable overflow). at pc 0xfffffe002f6bb0d8, lr 0x17d4fe0030c9d8f4 (saved state: 0xfffffe3069aa3cb0)
	  x0: 0xfffffe00337f85a0  x1:  0x000000000000000a  x2:  0x0000000000000002  x3:  0x000001554d26bf0e
	  x4: 0x0000000000000000  x5:  0xfffffe3ff2a68140  x6:  0xfffffe3ff2a68110  x7:  0xfffffe3ff2a68178
	  x8: 0x00000000000000c0  x9:  0x0000000000000001  x10: 0x0000000000000001  x11: 0x0000000000000000
	  x12: 0x0000000000000000 x13: 0x0000000000000000  x14: 0x0000000000000000  x15: 0x0000000000000000
	  x16: 0xfffffe002f819f14 x17: 0x0000000000004a0e  x18: 0x0000000000000000  x19: 0x000001554d26bf0e
	  x20: 0x0000000000000002 x21: 0x0000000000000004  x22: 0xfffffe00337f85b0  x23: 0xfffffe00337f83c0
	  x24: 0xfffffe00337f8540 x25: 0xfffffe00337f8540  x26: 0x0000000000000000  x27: 0x0000000000000001
	  x28: 0x0000000000000001 fp:  0xfffffe3ff2a680e0  lr:  0x17d4fe0030c9d8f4  sp:  0xfffffe3ff2a67cc0
	  pc:  0xfffffe002f6bb0d8 cpsr: 0x204013c8         esr: 0x96000047          far: 0xfffffe3ff2a67cc8

Debugger message: panic
Memory ID: 0x6
OS release type: User
OS version: 20F71
Kernel version: Darwin Kernel Version 20.5.0: Sat May  8 05:10:31 PDT 2021; root:xnu-7195.121.3~9/RELEASE_ARM64_T8101
Fileset Kernelcache UUID: FB10CC0AB8BAC020BC47A50D64476F11
Kernel UUID: 07259C53-9EF7-32FF-821D-8F28A5985DFA
iBoot version: iBoot-6723.120.36
secure boot?: YES
Paniclog version: 13
KernelCache slide: 0x0000000027a9c000
KernelCache base:  0xfffffe002eaa0000
Kernel slide:      0x00000000285e4000
Kernel text base:  0xfffffe002f5e8000
Kernel text exec base:  0xfffffe002f6b4000
mach_absolute_time: 0x1554e99113d
Epoch Time:        sec       usec
  Boot    : 0x60efd868 0x000e9c95
  Sleep   : 0x60f16436 0x00042463
  Wake    : 0x60f16457 0x0003e7d5
  Calendar: 0x60f191eb 0x0009433c

CORE 0 recently retired instr at 0xfffffe002f8266f4
CORE 1 recently retired instr at 0xfffffe002f8266f4
CORE 2 recently retired instr at 0xfffffe002f825240
CORE 3 recently retired instr at 0xfffffe002f8266f4
CORE 4 recently retired instr at 0xfffffe002f8266f8
CORE 5 recently retired instr at 0xfffffe002f8266f8
CORE 6 recently retired instr at 0xfffffe002f8266f8
CORE 7 recently retired instr at 0xfffffe002f8266f8
Panicked task 0xfffffe166c621a20: 9759 pages, 7 threads: pid 759: backupd
Panicked thread: 0xfffffe166c5e3960, backtrace: 0xfffffe3069aa35c0, tid: 653893
		  lr: 0xfffffe002f702be4  fp: 0xfffffe3069aa3630
		  lr: 0xfffffe002f7029c8  fp: 0xfffffe3069aa36a0
		  lr: 0xfffffe002f82ba70  fp: 0xfffffe3069aa36c0
		  lr: 0xfffffe002feb88dc  fp: 0xfffffe3069aa36e0
		  lr: 0xfffffe002f6bb9bc  fp: 0xfffffe3069aa36f0
		  lr: 0xfffffe002f702658  fp: 0xfffffe3069aa3a80
		  lr: 0xfffffe002f702658  fp: 0xfffffe3069aa3af0
		  lr: 0xfffffe002feb43e8  fp: 0xfffffe3069aa3b10
		  lr: 0xfffffe002feb8888  fp: 0xfffffe3069aa3c80
		  lr: 0xfffffe002f8202cc  fp: 0xfffffe3069aa3c90
		  lr: 0xfffffe002f6bb99c  fp: 0xfffffe3069aa3ca0
		  lr: 0xfffffe0030c9d8f4  fp: 0xfffffe3ff2a680e0
		  lr: 0xfffffe0030ca9b94  fp: 0xfffffe3ff2a68100
		  lr: 0xfffffe002f820ffc  fp: 0xfffffe3ff2a68200
		  lr: 0xfffffe002f72a1d4  fp: 0xfffffe3ff2a68290
		  lr: 0xfffffe002f729e6c  fp: 0xfffffe3ff2a68320
		  lr: 0xfffffe002f728930  fp: 0xfffffe3ff2a68370
		  lr: 0xfffffe002f6bbaa8  fp: 0xfffffe3ff2a68380
		  lr: 0xfffffe0032021afc  fp: 0xfffffe3ff2a68750
		  lr: 0xfffffe0032044328  fp: 0xfffffe3ff2a687d0
		  lr: 0xfffffe00320208b0  fp: 0xfffffe3ff2a68860
		  lr: 0xfffffe0032006d38  fp: 0xfffffe3ff2a68950
		  lr: 0xfffffe002fdde2a0  fp: 0xfffffe3ff2a689c0
		  lr: 0xfffffe0031ea6880  fp: 0xfffffe3ff2a68a50
		  lr: 0xfffffe00320059b4  fp: 0xfffffe3ff2a68b00
		  lr: 0xfffffe0031f03420  fp: 0xfffffe3ff2a68ba0
		  lr: 0xfffffe002fdde2a0  fp: 0xfffffe3ff2a68c10
		  lr: 0xfffffe0031f02a70  fp: 0xfffffe3ff2a68ca0
		  lr: 0xfffffe00320a9fb8  fp: 0xfffffe3ff2a68ce0
		  lr: 0xfffffe00320a3fec  fp: 0xfffffe3ff2a68d50
		  lr: 0xfffffe0031b2ec14  fp: 0xfffffe3ff2a68db0
		  lr: 0xfffffe0031b2f1ec  fp: 0xfffffe3ff2a68de0
		  lr: 0xfffffe0031b46444  fp: 0xfffffe3ff2a68e50
		  lr: 0xfffffe0031b44960  fp: 0xfffffe3ff2a68eb0
		  lr: 0xfffffe0031bd6fcc  fp: 0xfffffe3ff2a68f30
		  lr: 0xfffffe0031bdba74  fp: 0xfffffe3ff2a68fe0
		  lr: 0xfffffe002e728b38  fp: 0xfffffe3ff2a69020
		  lr: 0xfffffe002e7249c8  fp: 0xfffffe3ff2a69090
		  lr: 0xfffffe002e7803e0  fp: 0xfffffe3ff2a69120
		  lr: 0xfffffe002e82cc84  fp: 0xfffffe3ff2a69190
		  lr: 0xfffffe002e826e5c  fp: 0xfffffe3ff2a691e0
		  lr: 0xfffffe002e791a8c  fp: 0xfffffe3ff2a69260
		  lr: 0xfffffe002e82cc84  fp: 0xfffffe3ff2a692d0
		  lr: 0xfffffe002e826e5c  fp: 0xfffffe3ff2a69320
		  lr: 0xfffffe002e791a8c  fp: 0xfffffe3ff2a693a0
		  lr: 0xfffffe002e82c96c  fp: 0xfffffe3ff2a69410
		  lr: 0xfffffe002e826e5c  fp: 0xfffffe3ff2a69460
		  lr: 0xfffffe002e6919dc  fp: 0xfffffe3ff2a69970
		  lr: 0xfffffe002e6aeed0  fp: 0xfffffe3ff2a69a90
		  lr: 0xfffffe002e6ae4c0  fp: 0xfffffe3ff2a69be0
      Kernel Extensions in backtrace:
         com.apple.iokit.IOStorageFamily(2.1)[8A98ACCC-D34C-36E3-93F6-D8A2F39C2A51]@0xfffffe0031bd0000->0xfffffe0031beffff
         com.apple.iokit.IOUSBHostFamily(1.2)[87ECD783-C526-3D75-9F17-6D08FBA23E1B]@0xfffffe0031e7c000->0xfffffe0031f37fff
            dependency: com.apple.driver.AppleSMC(3.1.9)[49B08A04-DEEA-3C28-AEBC-1925E668B6CF]@0xfffffe0030b38000->0xfffffe0030b63fff
            dependency: com.apple.driver.usb.AppleUSBCommon(1.0)[F534782F-3D7E-3A09-AFA6-8E5A98FC6BF0]@0xfffffe0030fbc000->0xfffffe0030fc3fff
            dependency: com.apple.driver.AppleUSBHostMergeProperties(1.2)[779142B0-94CE-3667-BB0A-4D2F086FE98B]@0xfffffe0031f90000->0xfffffe0031f93fff
         com.apple.iokit.IOSCSIArchitectureModelFamily(436.121.1)[A229CB3B-F9D1-376A-B929-2C946EFDA492]@0xfffffe0031b28000->0xfffffe0031b43fff
         com.apple.iokit.IOSCSIBlockCommandsDevice(436.121.1)[931AEC6B-C16F-35D8-92A1-E47E3763ECEC]@0xfffffe0031b44000->0xfffffe0031b57fff
            dependency: com.apple.iokit.IOSCSIArchitectureModelFamily(436.121.1)[A229CB3B-F9D1-376A-B929-2C946EFDA492]@0xfffffe0031b28000->0xfffffe0031b43fff
            dependency: com.apple.iokit.IOStorageFamily(2.1)[8A98ACCC-D34C-36E3-93F6-D8A2F39C2A51]@0xfffffe0031bd0000->0xfffffe0031beffff
         com.apple.iokit.IOUSBMassStorageDriver(184.121.1)[B82C866E-16BD-39D7-A062-299875A117CB]@0xfffffe0032090000->0xfffffe00320b3fff
            dependency: com.apple.iokit.IOPCIFamily(2.9)[B0CD5169-B0A8-3682-BB05-ADBC1D65D831]@0xfffffe0031b08000->0xfffffe0031b23fff
            dependency: com.apple.iokit.IOSCSIArchitectureModelFamily(436.121.1)[A229CB3B-F9D1-376A-B929-2C946EFDA492]@0xfffffe0031b28000->0xfffffe0031b43fff
            dependency: com.apple.iokit.IOStorageFamily(2.1)[8A98ACCC-D34C-36E3-93F6-D8A2F39C2A51]@0xfffffe0031bd0000->0xfffffe0031beffff
            dependency: com.apple.iokit.IOUSBHostFamily(1.2)[87ECD783-C526-3D75-9F17-6D08FBA23E1B]@0xfffffe0031e7c000->0xfffffe0031f37fff
         com.apple.driver.usb.AppleUSBXHCI(1.2)[A03522C6-4221-3C53-B568-B9BC0D6D889A]@0xfffffe0031ff0000->0xfffffe0032053fff
            dependency: com.apple.driver.AppleARMPlatform(1.0.2)[F784412E-33CC-3859-B0CB-D0A62C9E26CB]@0xfffffe0030010000->0xfffffe003005ffff
            dependency: com.apple.driver.usb.AppleUSBCommon(1.0)[F534782F-3D7E-3A09-AFA6-8E5A98FC6BF0]@0xfffffe0030fbc000->0xfffffe0030fc3fff
            dependency: com.apple.iokit.IOAccessoryManager(1.0.0)[ECF49BAE-1493-32AA-A772-CC0DAC56C2AD]@0xfffffe0031524000->0xfffffe00315cbfff
            dependency: com.apple.iokit.IOUSBHostFamily(1.2)[87ECD783-C526-3D75-9F17-6D08FBA23E1B]@0xfffffe0031e7c000->0xfffffe0031f37fff
         com.apple.driver.AppleT8103CLPCv3(1.0)[4BB216DD-DE35-3A5C-BD56-EC1DFECDE838]@0xfffffe0030c90000->0xfffffe0030cc3fff
            dependency: com.apple.driver.AppleARMPlatform(1.0.2)[F784412E-33CC-3859-B0CB-D0A62C9E26CB]@0xfffffe0030010000->0xfffffe003005ffff
            dependency: com.apple.driver.ApplePMGR(1)[5761657E-DCB4-3AA9-A3B8-C82FE84C2A0E]@0xfffffe00309e8000->0xfffffe0030a23fff
            dependency: com.apple.iokit.IOReportFamily(47)[A0C25545-295B-3538-AA20-D7ED2DFACC78]@0xfffffe0031b24000->0xfffffe0031b27fff
            dependency: com.apple.iokit.IOSurface(290.8.1)[C5FD4B4A-C048-353D-BE45-72BEF2BE75BE]@0xfffffe0031bf8000->0xfffffe0031c17fff
            dependency: com.apple.kec.Libm(1)[A20A98FB-6353-31A9-85C0-594B4D1F55BB]@0xfffffe00320e0000->0xfffffe00320e3fff
         org.openzfsonosx.zfs(2.1)[8FA54BF4-AB61-3D35-B53F-FD01D0780AE7]@0xfffffe002e688000->0xfffffe002e977fff
            dependency: com.apple.iokit.IOStorageFamily(2.1)[8A98ACCC-D34C-36E3-93F6-D8A2F39C2A51]@0xfffffe0031bd0000->0xfffffe0031beffff

last started kext at 96693450488: com.apple.filesystems.smbfs	3.6 (addr 0xfffffe002f5bc000, size 65536)
loaded kexts:
org.openzfsonosx.zfs	2.1.0

Beware: I am still unable to verify 100% which kext version is really loaded. Judging by the strings in the kext and the fact that the caches were rebuilt, I hope it is indeed the right version.

@lundman
Contributor

lundman commented Jul 16, 2021

20 xnu functions, then:
buf_strategy_iokit (in zfs) (ldi_iokit.cpp:0)
ldi_strategy (in zfs) (ldi_osx.c:2258)
vdev_disk_io_start (in zfs) (vdev_disk.c:631)
zio_vdev_io_start (in zfs) (zio.c:3857)
zio_nowait (in zfs) (zio.c:2284)
vdev_mirror_io_start (in zfs) (vdev_mirror.c:686)
zio_vdev_io_start (in zfs) (zio.c:3857)
zio_nowait (in zfs) (zio.c:2284)
vdev_mirror_io_start (in zfs) (vdev_mirror.c:686)
zio_vdev_io_start (in zfs) (zio.c:3729)
zio_nowait (in zfs) (zio.c:2284)
arc_read (in zfs) (arc.c:6452)
dbuf_read_impl (in zfs) (dbuf.c:1527)
dbuf_read (in zfs) (dbuf.c:1658)
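
To get a feel for how much stack this call chain uses, the fp column of the panic log can be diffed. A rough sketch, assuming the 14 fp values below are the frames inside the zfs kext from the backupd panic above (the entries whose lr lies between 0xfffffe002e688000 and 0xfffffe002e977fff), and that the gap between neighbouring frame pointers approximates one function's frame size. Which function owns each gap depends on the compiler's frame layout, so that attribution is a guess:

```python
# fp values of the 14 zfs frames in the backupd panic, in backtrace order
# (most recent call first, so addresses grow toward the older callers).
ZFS_FRAME_POINTERS = [
    0xfffffe3ff2a69020,
    0xfffffe3ff2a69090,
    0xfffffe3ff2a69120,
    0xfffffe3ff2a69190,
    0xfffffe3ff2a691e0,
    0xfffffe3ff2a69260,
    0xfffffe3ff2a692d0,
    0xfffffe3ff2a69320,
    0xfffffe3ff2a693a0,
    0xfffffe3ff2a69410,
    0xfffffe3ff2a69460,
    0xfffffe3ff2a69970,
    0xfffffe3ff2a69a90,
    0xfffffe3ff2a69be0,
]


def frame_gaps(frame_pointers):
    """Distance in bytes between each pair of neighbouring frame pointers."""
    return [b - a for a, b in zip(frame_pointers, frame_pointers[1:])]


def summarize(frame_pointers):
    """Total span of the listed frames and the single largest gap."""
    gaps = frame_gaps(frame_pointers)
    largest = max(gaps)
    return {
        "total": frame_pointers[-1] - frame_pointers[0],
        "largest": largest,
        "largest_index": gaps.index(largest),
    }


if __name__ == "__main__":
    print(summarize(ZFS_FRAME_POINTERS))
```

With these values the visible zfs frames span 3008 bytes, and a single gap of 1296 bytes sits between the last zio_nowait entry and the arc_read entry. The backtrace stops at dbuf_read, so the full chain is likely deeper than what is listed here.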

@JMoVS
Contributor Author

JMoVS commented Jul 22, 2021

Another panic:

panic(cpu 0 caller 0xfffffe0021c35320): Invalid kernel stack pointer (probable overflow). at pc 0xfffffe00214370d8, lr 0x57b57e00215581f8 (saved state: 0xfffffe30291dbcb0)
	  x0: 0x0000000000000000  x1:  0x0000000000000000  x2:  0xfffffe3feb6c80b8  x3:  0x000000000003ffff
	  x4: 0x0000000000000000  x5:  0x000000000000009c  x6:  0x0000000000000000  x7:  0x0000000000000004
	  x8: 0x0000000000000000  x9:  0x0000000000000001  x10: 0x0000000000040000  x11: 0x0000000000040000
	  x12: 0x0000000000000000 x13: 0x0000000000000000  x14: 0xfffffe0020410000  x15: 0xfffffe00206f4000
	  x16: 0x0000000000000001 x17: 0xfffffe00251a8e40  x18: 0x0000000000000000  x19: 0xfffffe002520c000
	  x20: 0x0000000000000000 x21: 0x0000000000000010  x22: 0x0000000000000000  x23: 0x0000000000000000
	  x24: 0x0000000000000000 x25: 0x0000000000040000  x26: 0xfffffe0025209000  x27: 0xfffffe0025209000
	  x28: 0xfffffe0025209000 fp:  0xfffffe3feb6c8060  lr:  0x57b57e00215581f8  sp:  0xfffffe3feb6c7c60
	  pc:  0xfffffe00214370d8 cpsr: 0x204013c8         esr: 0x96000047          far: 0xfffffe3feb6c7c68

Debugger message: panic
Memory ID: 0x6
OS release type: User
OS version: 20G71
Kernel version: Darwin Kernel Version 20.6.0: Wed Jun 23 00:26:27 PDT 2021; root:xnu-7195.141.2~5/RELEASE_ARM64_T8101
Fileset Kernelcache UUID: 30D51067A55177473519BD211C009F6E
Kernel UUID: AC4A14A7-8A8E-3AE6-85A6-55E6B2502BF9
iBoot version: iBoot-6723.140.2
secure boot?: YES
Paniclog version: 13
KernelCache slide: 0x0000000019818000
KernelCache base:  0xfffffe002081c000
Kernel slide:      0x000000001a360000
Kernel text base:  0xfffffe0021364000
Kernel text exec base:  0xfffffe0021430000
mach_absolute_time: 0x1bf96d62f3
Epoch Time:        sec       usec
  Boot    : 0x60f937c1 0x000b4106
  Sleep   : 0x00000000 0x00000000
  Wake    : 0x00000000 0x00000000
  Calendar: 0x60f94b48 0x000ce9c0

CORE 0 recently retired instr at 0xfffffe00215a16a4
CORE 1 recently retired instr at 0xfffffe00215a2d6c
CORE 2 recently retired instr at 0xfffffe00215a2d6c
CORE 3 recently retired instr at 0xfffffe00215a2d6c
CORE 4 recently retired instr at 0xfffffe00215a2d70
CORE 5 recently retired instr at 0xfffffe00215a2d70
CORE 6 recently retired instr at 0xfffffe00215a2d70
CORE 7 recently retired instr at 0xfffffe00215a2d70
CORE 0 PVH locks held: None
CORE 1 PVH locks held: None
CORE 2 PVH locks held: None
CORE 3 PVH locks held: None
CORE 4 PVH locks held: None
CORE 5 PVH locks held: None
CORE 6 PVH locks held: None
CORE 7 PVH locks held: None
CORE 0 is the one that panicked. Check the full backtrace for details.
CORE 1: PC=0xfffffe00214a9a64, LR=0xfffffe00214a9a5c, FP=0xfffffe3ff09f3ee0
CORE 2: PC=0xfffffe0022a06558, LR=0xfffffe0022a05b74, FP=0xfffffe3ff022bc20
CORE 3: PC=0xfffffe0021440c28, LR=0xfffffe0021440c28, FP=0x0000000000000000
CORE 4: PC=0xfffffe00214a9a64, LR=0xfffffe00214a9a5c, FP=0xfffffe3fe2a03ee0
CORE 5: PC=0xfffffe00214a9a64, LR=0xfffffe00214a9a5c, FP=0xfffffe3ffcbcbee0
CORE 6: PC=0xfffffe00214a9a64, LR=0xfffffe00214a9a5c, FP=0xfffffe3ff0683ee0
CORE 7: PC=0xfffffe00214a9a64, LR=0xfffffe00214a9a5c, FP=0xfffffe3fe3afbee0
Panicked task 0xfffffe16682b9398: 5766 pages, 8 threads: pid 3822: backupd
Panicked thread: 0xfffffe1676c7b960, backtrace: 0xfffffe30291db5c0, tid: 54723
		  lr: 0xfffffe002147eb68  fp: 0xfffffe30291db630
		  lr: 0xfffffe002147e94c  fp: 0xfffffe30291db6a0
		  lr: 0xfffffe00215a81c8  fp: 0xfffffe30291db6c0
		  lr: 0xfffffe0021c35374  fp: 0xfffffe30291db6e0
		  lr: 0xfffffe00214379bc  fp: 0xfffffe30291db6f0
		  lr: 0xfffffe002147e5dc  fp: 0xfffffe30291dba80
		  lr: 0xfffffe002147e5dc  fp: 0xfffffe30291dbaf0
		  lr: 0xfffffe0021c30e80  fp: 0xfffffe30291dbb10
		  lr: 0xfffffe0021c35320  fp: 0xfffffe30291dbc80
		  lr: 0xfffffe002159c734  fp: 0xfffffe30291dbc90
		  lr: 0xfffffe002143799c  fp: 0xfffffe30291dbca0
		  lr: 0xfffffe00215581f8  fp: 0xfffffe3feb6c8060
		  lr: 0xfffffe0021517354  fp: 0xfffffe3feb6c8120
		  lr: 0xfffffe0021af93c4  fp: 0xfffffe3feb6c8160
		  lr: 0xfffffe00206e1e18  fp: 0xfffffe3feb6c8210
		  lr: 0xfffffe00206ef654  fp: 0xfffffe3feb6c8250
		  lr: 0xfffffe00206ecb44  fp: 0xfffffe3feb6c83d0
		  lr: 0xfffffe00206e8774  fp: 0xfffffe3feb6c8640
		  lr: 0xfffffe00206e9b60  fp: 0xfffffe3feb6c8690
		  lr: 0xfffffe00206edde0  fp: 0xfffffe3feb6c8980
		  lr: 0xfffffe00206e8774  fp: 0xfffffe3feb6c8bf0
		  lr: 0xfffffe00206e9b60  fp: 0xfffffe3feb6c8c40
		  lr: 0xfffffe00206e8774  fp: 0xfffffe3feb6c8eb0
		  lr: 0xfffffe00206e9b60  fp: 0xfffffe3feb6c8f00
		  lr: 0xfffffe00206e8774  fp: 0xfffffe3feb6c9170
		  lr: 0xfffffe00206e9b60  fp: 0xfffffe3feb6c91c0
		  lr: 0xfffffe00206e8774  fp: 0xfffffe3feb6c9430
		  lr: 0xfffffe00206e9b60  fp: 0xfffffe3feb6c9480
		  lr: 0xfffffe00206d858c  fp: 0xfffffe3feb6c9520
		  lr: 0xfffffe00206d158c  fp: 0xfffffe3feb6c9560
		  lr: 0xfffffe00206d0e00  fp: 0xfffffe3feb6c9630
		  lr: 0xfffffe00205ac884  fp: 0xfffffe3feb6c9670
		  lr: 0xfffffe00204222d0  fp: 0xfffffe3feb6c96b0
		  lr: 0xfffffe0020416540  fp: 0xfffffe3feb6c9770
		  lr: 0xfffffe00204188d4  fp: 0xfffffe3feb6c9c80
		  lr: 0xfffffe0020436ed0  fp: 0xfffffe3feb6c9da0
		  lr: 0xfffffe00204364c0  fp: 0xfffffe3feb6c9ef0
		  lr: 0xfffffe00204452d0  fp: 0xfffffe3feb6ca000
		  lr: 0xfffffe00204467a8  fp: 0xfffffe3feb6ca080
		  lr: 0xfffffe00205c5750  fp: 0xfffffe3feb6ca120
		  lr: 0xfffffe00205c8c88  fp: 0xfffffe3feb6ca1c0
		  lr: 0xfffffe002394efcc  fp: 0xfffffe3feb6ca240
		  lr: 0xfffffe0023953a74  fp: 0xfffffe3feb6ca2f0
		  lr: 0xfffffe002395e530  fp: 0xfffffe3feb6ca3b0
		  lr: 0xfffffe002170ad1c  fp: 0xfffffe3feb6ca410
		  lr: 0xfffffe0021703b50  fp: 0xfffffe3feb6ca450
		  lr: 0xfffffe0023fd34dc  fp: 0xfffffe3feb6ca4d0
		  lr: 0xfffffe0023fd3118  fp: 0xfffffe3feb6ca590
		  lr: 0xfffffe00240c8fb4  fp: 0xfffffe3feb6ca650
		  lr: 0xfffffe00240c8694  fp: 0xfffffe3feb6ca780
      Kernel Extensions in backtrace:
         com.apple.iokit.IOStorageFamily(2.1)[A41978E8-2B18-341C-8935-2AACD1565F0F]@0xfffffe0023948000->0xfffffe0023967fff
         com.apple.filesystems.apfs(1677.141.1)[CC40AF90-D379-339D-857A-BCFCC5378DF6]@0xfffffe0023fd0000->0xfffffe00240e3fff
            dependency: com.apple.driver.AppleEffaceableStorage(1.0)[608C5FF9-9F38-3E74-8CE1-F90AFF18F8ED]@0xfffffe0022278000->0xfffffe002227ffff
            dependency: com.apple.iokit.CoreAnalyticsFamily(1)[34337684-DA05-3170-8DDB-259DFEB3D159]@0xfffffe0022df0000->0xfffffe0022df7fff
            dependency: com.apple.iokit.IOStorageFamily(2.1)[A41978E8-2B18-341C-8935-2AACD1565F0F]@0xfffffe0023948000->0xfffffe0023967fff
            dependency: com.apple.kec.corecrypto(11.1)[7740FC9A-DE55-35F9-99FB-46C0F386BB51]@0xfffffe0024124000->0xfffffe002416ffff
         org.openzfsonosx.zfs(2.1.99)[8C444041-C4DA-3CD8-BD7E-CA38F1B5FB39]@0xfffffe0020410000->0xfffffe00206f3fff
            dependency: com.apple.iokit.IOStorageFamily(2.1)[A41978E8-2B18-341C-8935-2AACD1565F0F]@0xfffffe0023948000->0xfffffe0023967fff

last started kext at 32925186575: com.apple.driver.CoreStorageFsck	554.140.2 (addr 0xfffffe0020e7c000, size 16384)
loaded kexts:
org.openzfsonosx.zfs	2.1.99

(base) justin@Sergeant-Peppair ~ % log show --source --predicate 'process == "kernel" AND sender == "zfs" and message CONTAINS "stackavailable"' --style compact --last 1h
Filtering the log data using "process == "kernel" AND sender == "zfs" AND composedMessage CONTAINS "stackavailable""
Skipping info and debug messages, pass --info and/or --debug to include.
Timestamp               Ty Process[PID:TID]
2021-07-22 12:35:06.728 Df kernel[0:d5c3] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 4264
2021-07-22 12:35:06.728 Df kernel[0:d5c3] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 4024
2021-07-22 12:35:06.728 Df kernel[0:d5c3] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 3992
2021-07-22 12:35:06.728 Df kernel[0:d5c3] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 3944
2021-07-22 12:35:06.728 Df kernel[0:d5c3] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 3864
2021-07-22 12:35:06.728 Df kernel[0:d5c3] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 3768
2021-07-22 10:42:01.058 Df kernel[0:954] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 14872
2021-07-22 10:42:01.058 Df kernel[0:954] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 14776
2021-07-22 10:42:01.058 Df kernel[0:954] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 13560
2021-07-22 10:42:01.058 Df kernel[0:954] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 13128
2021-07-22 10:42:01.060 Df kernel[0:954] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 12952
2021-07-22 10:42:01.072 Df kernel[0:954] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 12776
2021-07-22 10:42:01.076 Df kernel[0:954] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 12488
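
The log output above can be reduced to one number per thread: the lowest `stackavailable` value that thread reported. This is a minimal sketch in Python, assuming the lines are pasted in by hand (only four are included here); the helper name `lowest_per_thread` is made up for illustration and is not part of any ZFS tooling.

```python
import re

# Sample lines copied from the `log show` output above.
LOG = """\
2021-07-22 12:35:06.728 Df kernel[0:d5c3] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 4264
2021-07-22 12:35:06.728 Df kernel[0:d5c3] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 3768
2021-07-22 10:42:01.058 Df kernel[0:954] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 14872
2021-07-22 10:42:01.076 Df kernel[0:954] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 12488
"""

# Captures the TID from "kernel[PID:TID]" and the trailing byte count.
LINE_RE = re.compile(r"kernel\[\d+:([0-9a-f]+)\].*stackavailable (\d+)")


def lowest_per_thread(text):
    """Return {tid: lowest stackavailable value seen for that thread}."""
    lows = {}
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m:
            tid, avail = m.group(1), int(m.group(2))
            lows[tid] = min(avail, lows.get(tid, avail))
    return lows


lows = lowest_per_thread(LOG)
for tid, avail in sorted(lows.items(), key=lambda kv: kv[1]):
    print(f"thread 0x{tid}: lowest stackavailable {avail} bytes")
```

Sorting by the lowest value puts the thread that came closest to running out of stack first.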

@lundman
Contributor

lundman commented Jul 22, 2021

vmem_init.initial_default_block (in zfs) + 10709864
vmem_init.initial_default_block (in zfs) + 10709324
vmem_init.initial_default_block (in zfs) + 11928008
0xfffffe0021c35374
vmem_init.initial_default_block (in zfs) + 10418620
vmem_init.initial_default_block (in zfs) + 10708444
vmem_init.initial_default_block (in zfs) + 10708444
0xfffffe0021c30e80
0xfffffe0021c35320
vmem_init.initial_default_block (in zfs) + 11880244
vmem_init.initial_default_block (in zfs) + 10418588
vmem_init.initial_default_block (in zfs) + 11600376
vmem_init.initial_default_block (in zfs) + 11334484
0xfffffe0021af93c4
osif_malloc (in zfs) (spl-seg_kmem.c:178)
spl_vmem_malloc_if_no_pressure (in zfs) (spl-vmem.c:1341)
xnu_alloc_throttled (in zfs) (spl-vmem.c:2497)
vmem_xalloc (in zfs) (spl-vmem.c:1541)
vmem_alloc (in zfs) (spl-vmem.c:1775)
vmem_bucket_alloc (in zfs) (spl-vmem.c:3061)
vmem_xalloc (in zfs) (spl-vmem.c:1541)
vmem_alloc (in zfs) (spl-vmem.c:1775)
vmem_xalloc (in zfs) (spl-vmem.c:1541)
vmem_alloc (in zfs) (spl-vmem.c:1775)
vmem_xalloc (in zfs) (spl-vmem.c:1541)
vmem_alloc (in zfs) (spl-vmem.c:1775)
vmem_xalloc (in zfs) (spl-vmem.c:1541)
vmem_alloc (in zfs) (spl-vmem.c:1775)
kmem_slab_create (in zfs) (spl-kmem.c:1118)
kmem_slab_alloc (in zfs) (spl-kmem.c:1339)
kmem_cache_alloc (in zfs) (spl-kmem.c:2190)
zio_data_buf_alloc (in zfs) (zio.c:343)
arc_get_data_buf (in zfs) (arc.c:5221)
arc_buf_alloc_impl (in zfs) (arc.c:2866)
arc_read (in zfs) (arc.c:6055)
dbuf_read_impl (in zfs) (dbuf.c:1527)
dbuf_read (in zfs) (dbuf.c:1658)
dmu_buf_hold_array_by_dnode (in zfs) (dmu.c:573)
dmu_read_uio_dnode (in zfs) (dmu.c:1207)
zvol_os_read_zv (in zfs) (zvol_os.c:331)
org_openzfsonosx_zfs_zvol_device::doAsyncReadWrite(IOMemoryDescriptor*, unsigned long long, unsigned long long, IOStorageAttributes*, IOStorageCompletion*) (in zfs) (zvolIO.cpp:0)
0xfffffe002394efcc
0xfffffe0023953a74
0xfffffe002395e530
vmem_init.initial_default_block (in zfs) + 13380892
vmem_init.initial_default_block (in zfs) + 13351760
0xfffffe0023fd34dc
0xfffffe0023fd3118
0xfffffe00240c8fb4
0xfffffe00240c8694
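
To see which kext a raw `lr` value falls in, it can be compared against the load ranges printed under "Kernel Extensions in backtrace" in the first panic log. A minimal sketch in Python, assuming the ranges and addresses are copied by hand from that log; the helper name `owner` is hypothetical.

```python
# Load ranges copied from "Kernel Extensions in backtrace" in the first panic log.
KEXTS = {
    "com.apple.iokit.IOStorageFamily": (0xfffffe0023948000, 0xfffffe0023967fff),
    "com.apple.filesystems.apfs": (0xfffffe0023fd0000, 0xfffffe00240e3fff),
    "org.openzfsonosx.zfs": (0xfffffe0020410000, 0xfffffe00206f3fff),
}


def owner(lr):
    """Return the name of the listed kext whose range contains lr, if any."""
    for name, (start, end) in KEXTS.items():
        if start <= lr <= end:
            return name
    return "kernel or unlisted kext"


# A few lr values taken from the panicked thread's backtrace.
for lr in (0xfffffe00206e1e18, 0xfffffe002394efcc,
           0xfffffe0023fd34dc, 0xfffffe0021c35374):
    print(f"{lr:#x} -> {owner(lr)}")
```

Addresses that match none of the listed ranges are reported as "kernel or unlisted kext" rather than guessed at.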

@lundman
Contributor

lundman commented Jul 22, 2021

Stack frame sizes in bytes, per function:

176    osif_malloc
64    spl_vmem_malloc_if_no_pressure
384    xnu_alloc_throttled
624    vmem_xalloc
80    vmem_alloc
752    vmem_bucket_alloc
624    vmem_xalloc
80    vmem_alloc
624    vmem_xalloc
80    vmem_alloc
624    vmem_xalloc
80    vmem_alloc
624    vmem_xalloc
80    vmem_alloc
160    kmem_slab_create
64    kmem_slab_alloc
208    kmem_cache_alloc
64    zio_data_buf_alloc
64    arc_get_data_buf
192    arc_buf_alloc_impl
1296    arc_read
288    dbuf_read_impl
336    dbuf_read
272    dmu_buf_hold_array_by_dnode
128    dmu_read_uio_dnode
160    zvol_os_read_zv
160    org_openzfsonosx_zfs_zvol_device::doAsyncReadWrite

Total: 8288 bytes

~/sizes.sh allcompile.txt | awk '{total += $1} END {print "Total:", total, "bytes"}'
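
The sizes listed above are consistent with the differences between consecutive `fp` values in the first panic backtrace. What `sizes.sh` actually does is not shown in this thread, so the following is only a sketch in Python of that cross-check, assuming each frame's size is the gap between its `fp` and the `fp` on the line before it. The helper name `frame_deltas` is made up for illustration.

```python
import re

# Consecutive frames copied from the first panic backtrace (backupd thread).
FRAMES = """\
lr: 0xfffffe0021af93c4  fp: 0xfffffe3feb6c8160
lr: 0xfffffe00206e1e18  fp: 0xfffffe3feb6c8210
lr: 0xfffffe00206ef654  fp: 0xfffffe3feb6c8250
lr: 0xfffffe00206ecb44  fp: 0xfffffe3feb6c83d0
lr: 0xfffffe00206e8774  fp: 0xfffffe3feb6c8640
lr: 0xfffffe00206e9b60  fp: 0xfffffe3feb6c8690
lr: 0xfffffe00206edde0  fp: 0xfffffe3feb6c8980
"""


def frame_deltas(text):
    """Return the byte gaps between consecutive fp values."""
    fps = [int(m.group(1), 16) for m in re.finditer(r"fp: (0x[0-9a-f]+)", text)]
    return [b - a for a, b in zip(fps, fps[1:])]


deltas = frame_deltas(FRAMES)
print(deltas)

# fp of the doAsyncReadWrite frame minus fp of the frame just before osif_malloc.
zfs_total = 0xfffffe3feb6ca1c0 - 0xfffffe3feb6c8160
print("Total:", zfs_total, "bytes")
```

The six gaps line up with the first six entries of the list (osif_malloc through vmem_bucket_alloc), and the end-to-end difference equals the 8288-byte total.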

@JMoVS
Contributor Author

JMoVS commented Jul 22, 2021

This test had compression enabled, whereas the previous ones didn't. I don't know whether that makes a difference.

@JMoVS
Contributor Author

JMoVS commented Jul 23, 2021

panic(cpu 5 caller 0xfffffe002dc35320): Invalid kernel stack pointer (probable overflow). at pc 0xfffffe002d4370d8, lr 0x4c94fe002d932e94 (saved state: 0xfffffe3058453cb0)
	  x0: 0xfffffe167276baa0  x1:  0x0000000000000008  x2:  0xfffffe3ff02e0017  x3:  0x0000000000000005
	  x4: 0x0000000000000000  x5:  0xb78bfe002d4d2fd0  x6:  0x0000000000000000  x7:  0x0000000000000000
	  x8: 0x0000000000000001  x9:  0xfffffe002d36b73c  x10: 0xfffffe167276bba0  x11: 0xfffffe167276bbb0
	  x12: 0x000000007b2ef46d x13: 0x000000000e84d235  x14: 0xfffffe301e404000  x15: 0x000000000003ae6b
	  x16: 0xfffffe166bbf3180 x17: 0xfffffe166bbf3180  x18: 0x0000000000000000  x19: 0xfffffe167276baa0
	  x20: 0x0000000000000000 x21: 0xfffffe1673de35d8  x22: 0x0000000000000000  x23: 0xfffffe16738fd200
	  x24: 0xfffffe1673de35d8 x25: 0xfffffe00311cd000  x26: 0xfffffe1673de35d8  x27: 0x0000000000000000
	  x28: 0x0000000000000005 fp:  0xfffffe3ff02e00a0  lr:  0x4c94fe002d932e94  sp:  0xfffffe3ff02dfc90
	  pc:  0xfffffe002d4370d8 cpsr: 0x204013c8         esr: 0x96000047          far: 0xfffffe3ff02dfc98

Debugger message: panic
Memory ID: 0x6
OS release type: User
OS version: 20G71
Kernel version: Darwin Kernel Version 20.6.0: Wed Jun 23 00:26:27 PDT 2021; root:xnu-7195.141.2~5/RELEASE_ARM64_T8101
Fileset Kernelcache UUID: 30D51067A55177473519BD211C009F6E
Kernel UUID: AC4A14A7-8A8E-3AE6-85A6-55E6B2502BF9
iBoot version: iBoot-6723.140.2
secure boot?: YES
Paniclog version: 13
KernelCache slide: 0x0000000025818000
KernelCache base:  0xfffffe002c81c000
Kernel slide:      0x0000000026360000
Kernel text base:  0xfffffe002d364000
Kernel text exec base:  0xfffffe002d430000
mach_absolute_time: 0x245e790641
Epoch Time:        sec       usec
  Boot    : 0x60fa7daf 0x000b29c0
  Sleep   : 0x00000000 0x00000000
  Wake    : 0x00000000 0x00000000
  Calendar: 0x60fa9715 0x0001d25e

CORE 0 recently retired instr at 0xfffffe002d5a2d6c
CORE 1 recently retired instr at 0xfffffe002d5a2d6c
CORE 2 recently retired instr at 0xfffffe002d5a2d6c
CORE 3 recently retired instr at 0xfffffe002d5a2d6c
CORE 4 recently retired instr at 0xfffffe002d5a2d70
CORE 5 recently retired instr at 0xfffffe002d5a16a4
CORE 6 recently retired instr at 0xfffffe002d5a2d70
CORE 7 recently retired instr at 0xfffffe002d5a2d70
CORE 0 PVH locks held: None
CORE 1 PVH locks held: None
CORE 2 PVH locks held: None
CORE 3 PVH locks held: None
CORE 4 PVH locks held: None
CORE 5 PVH locks held: None
CORE 6 PVH locks held: None
CORE 7 PVH locks held: None
CORE 0: PC=0x000000019eabfb0c, LR=0x000000019eabf518, FP=0x000000016b91e330
CORE 1: PC=0x000000019ea5cb98, LR=0x0000000199fc34ac, FP=0x000000016b713e60
CORE 2: PC=0x000000019e867a7c, LR=0x000000019e876dd4, FP=0x000000016dda6a40
CORE 3: PC=0x00007ffe9471fbe0, LR=0x00007ffe949cf270, FP=0x000000030df270e0
CORE 4: PC=0xfffffe002c43a910, LR=0xfffffe002c43a8a4, FP=0xfffffe3ff22f3390
CORE 5 is the one that panicked. Check the full backtrace for details.
CORE 6: PC=0xfffffe002d5a4470, LR=0xfffffe002ea28fc0, FP=0xfffffe3fef9fba60
CORE 7: PC=0x00007ffe9b6f9400, LR=0x00007ffe9b6f93b8, FP=0x0000000201255ee0
Panicked task 0xfffffe16770154e8: 9370 pages, 9 threads: pid 3111: backupd
Panicked thread: 0xfffffe166d504000, backtrace: 0xfffffe30584535c0, tid: 52480
		  lr: 0xfffffe002d47eb68  fp: 0xfffffe3058453630
		  lr: 0xfffffe002d47e94c  fp: 0xfffffe30584536a0
		  lr: 0xfffffe002d5a81c8  fp: 0xfffffe30584536c0
		  lr: 0xfffffe002dc35374  fp: 0xfffffe30584536e0
		  lr: 0xfffffe002d4379bc  fp: 0xfffffe30584536f0
		  lr: 0xfffffe002d47e5dc  fp: 0xfffffe3058453a80
		  lr: 0xfffffe002d47e5dc  fp: 0xfffffe3058453af0
		  lr: 0xfffffe002dc30e80  fp: 0xfffffe3058453b10
		  lr: 0xfffffe002dc35320  fp: 0xfffffe3058453c80
		  lr: 0xfffffe002d59c734  fp: 0xfffffe3058453c90
		  lr: 0xfffffe002d43799c  fp: 0xfffffe3058453ca0
		  lr: 0xfffffe002d932e94  fp: 0xfffffe3ff02e00a0
		  lr: 0xfffffe002d936d38  fp: 0xfffffe3ff02e0120
		  lr: 0xfffffe002d959188  fp: 0xfffffe3ff02e0160
		  lr: 0xfffffe002d95e1f4  fp: 0xfffffe3ff02e0180
		  lr: 0xfffffe002d4d317c  fp: 0xfffffe3ff02e0250
		  lr: 0xfffffe002d4d2da8  fp: 0xfffffe3ff02e02e0
		  lr: 0xfffffe002d4d22bc  fp: 0xfffffe3ff02e03c0
		  lr: 0xfffffe002d4d67a8  fp: 0xfffffe3ff02e0420
		  lr: 0xfffffe002d9ada64  fp: 0xfffffe3ff02e0440
		  lr: 0xfffffe002d9a8af4  fp: 0xfffffe3ff02e0460
		  lr: 0xfffffe002dadbb90  fp: 0xfffffe3ff02e04a0
		  lr: 0xfffffe002dadba44  fp: 0xfffffe3ff02e0790
		  lr: 0xfffffe002d499d10  fp: 0xfffffe3ff02e08f0
		  lr: 0xfffffe002d499b9c  fp: 0xfffffe3ff02e0910
		  lr: 0xfffffe002c6e1cf4  fp: 0xfffffe3ff02e09d0
		  lr: 0xfffffe002c6ef6b8  fp: 0xfffffe3ff02e0a10
		  lr: 0xfffffe002c6ecba8  fp: 0xfffffe3ff02e0b90
		  lr: 0xfffffe002c6e87d8  fp: 0xfffffe3ff02e0e00
		  lr: 0xfffffe002c6e9bc4  fp: 0xfffffe3ff02e0e50
		  lr: 0xfffffe002c6ede44  fp: 0xfffffe3ff02e1140
		  lr: 0xfffffe002c6e87d8  fp: 0xfffffe3ff02e13b0
		  lr: 0xfffffe002c6e9bc4  fp: 0xfffffe3ff02e1400
		  lr: 0xfffffe002c6e87d8  fp: 0xfffffe3ff02e1670
		  lr: 0xfffffe002c6e9bc4  fp: 0xfffffe3ff02e16c0
		  lr: 0xfffffe002c6e87d8  fp: 0xfffffe3ff02e1930
		  lr: 0xfffffe002c6e9bc4  fp: 0xfffffe3ff02e1980
		  lr: 0xfffffe002c6e87d8  fp: 0xfffffe3ff02e1bf0
		  lr: 0xfffffe002c6e9bc4  fp: 0xfffffe3ff02e1c40
		  lr: 0xfffffe002c6d8598  fp: 0xfffffe3ff02e1ce0
		  lr: 0xfffffe002c6d1598  fp: 0xfffffe3ff02e1d20
		  lr: 0xfffffe002c6d0e0c  fp: 0xfffffe3ff02e1df0
		  lr: 0xfffffe002c5ac884  fp: 0xfffffe3ff02e1e30
		  lr: 0xfffffe002c4222d0  fp: 0xfffffe3ff02e1e70
		  lr: 0xfffffe002c416540  fp: 0xfffffe3ff02e1f30
		  lr: 0xfffffe002c4188d4  fp: 0xfffffe3ff02e2440
		  lr: 0xfffffe002c436ed0  fp: 0xfffffe3ff02e2560
		  lr: 0xfffffe002c4364c0  fp: 0xfffffe3ff02e26b0
		  lr: 0xfffffe002c4452d0  fp: 0xfffffe3ff02e27c0
		  lr: 0xfffffe002c4467a8  fp: 0xfffffe3ff02e2840
      Kernel Extensions in backtrace:
         org.openzfsonosx.zfs(2.1.99)[1E5B5E73-442C-3CB5-B981-822C9FE7C853]@0xfffffe002c410000->0xfffffe002c6f3fff
            dependency: com.apple.iokit.IOStorageFamily(2.1)[A41978E8-2B18-341C-8935-2AACD1565F0F]@0xfffffe002f948000->0xfffffe002f967fff

last started kext at 63220395454: com.apple.driver.CoreStorageFsck	554.140.2 (addr 0xfffffe002ce7c000, size 16384)
loaded kexts:
org.openzfsonosx.zfs	2.1.99
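
A rough way to read the backtrace above is to split the stack in use into the part under zfs frames and the part below them. This is a sketch in Python of that arithmetic, assuming the values are copied by hand from this panic log and that gaps between `fp` values approximate frame sizes. The listing stops at a zfs frame, so the zfs figure is a lower bound. The variable names are hypothetical.

```python
# Values copied from the panic log above.
SP = 0xfffffe3ff02dfc90             # sp at the time of the panic
FP_BEFORE_ZFS = 0xfffffe3ff02e0910  # fp on the last line whose lr is outside the zfs kext range
FP_LAST = 0xfffffe3ff02e2840        # fp on the last line of the listed backtrace

# Stack used by everything that ran after control left zfs code.
below_zfs = FP_BEFORE_ZFS - SP

# Stack used by the listed frames whose lr lies inside the zfs kext range.
zfs_frames = FP_LAST - FP_BEFORE_ZFS

print("below zfs frames:", below_zfs, "bytes")
print("listed zfs frames:", zfs_frames, "bytes")
print("listed total:", below_zfs + zfs_frames, "bytes")
```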

14:05 https://www.lundman.net/OpenZFSonOsX-2.1.99-Big.Sur-11-arm64.pkg 71ee657ce334422f83310c0411973b2e

no new messages appeared here:

2021-07-23 11:15:28.195 Df kernel[0:bdac] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 4808
2021-07-23 11:15:28.750 Df kernel[0:bdac] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 4616
2021-07-23 11:15:29.246 Df kernel[0:bdac] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 4584
2021-07-23 11:15:29.747 Df kernel[0:bdac] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 4536
2021-07-23 11:15:30.247 Df kernel[0:bdac] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 4520
2021-07-23 11:15:30.747 Df kernel[0:bdac] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 4360
2021-07-23 11:21:48.233 Df kernel[0:cd00] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 4344
2021-07-23 11:21:48.733 Df kernel[0:cd00] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 4312
2021-07-23 11:21:49.232 Df kernel[0:cd00] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 4264
2021-07-23 11:21:49.733 Df kernel[0:cd00] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 4024
2021-07-23 11:21:50.232 Df kernel[0:cd00] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 3992
2021-07-23 11:21:50.732 Df kernel[0:cd00] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 3944
2021-07-23 11:21:51.233 Df kernel[0:cd00] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 3864
2021-07-23 11:21:51.733 Df kernel[0:cd00] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 3768

@JMoVS
Contributor Author

JMoVS commented Jul 23, 2021

(base) justin@SergeantPeppair ~ % log show --source --predicate 'process == "kernel" AND sender == "zfs" and message CONTAINS "stackavailable"' --style compact --last 1h
Filtering the log data using "process == "kernel" AND sender == "zfs" AND composedMessage CONTAINS "stackavailable""
Skipping info and debug messages, pass --info and/or --debug to include.
Timestamp               Ty Process[PID:TID]
2021-07-23 11:21:48.209 Df kernel[0:cd00] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 4344
2021-07-23 11:21:48.708 Df kernel[0:cd00] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 4312
2021-07-23 11:21:49.207 Df kernel[0:cd00] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 4264
2021-07-23 11:21:49.708 Df kernel[0:cd00] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 4024
2021-07-23 11:21:50.207 Df kernel[0:cd00] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 3992
2021-07-23 11:21:50.708 Df kernel[0:cd00] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 3944
2021-07-23 11:21:51.208 Df kernel[0:cd00] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 3864
2021-07-23 11:21:51.708 Df kernel[0:cd00] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 3768
2021-07-23 10:17:42.263 Df kernel[0:891] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 14872
2021-07-23 10:17:42.758 Df kernel[0:891] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 14776
2021-07-23 10:17:43.257 Df kernel[0:891] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 13560
2021-07-23 10:17:43.758 Df kernel[0:891] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 13128
2021-07-23 10:17:44.265 Df kernel[0:891] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 12952
2021-07-23 10:17:44.780 Df kernel[0:891] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 12776
2021-07-23 10:17:45.289 Df kernel[0:891] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 12488

@JMoVS
Contributor Author

JMoVS commented Jul 23, 2021

justin@SergeantPeppair ~ % log show --source --predicate 'process == "kernel" AND sender == "zfs" and message CONTAINS "stackavailable"' --style compact --last 4h
Filtering the log data using "process == "kernel" AND sender == "zfs" AND composedMessage CONTAINS "stackavailable""
Skipping info and debug messages, pass --info and/or --debug to include.
Timestamp               Ty Process[PID:TID]
2021-07-23 08:29:01.956 Df kernel[0:893] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 14872
2021-07-23 08:29:02.453 Df kernel[0:893] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 14776
2021-07-23 08:29:02.953 Df kernel[0:893] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 13560
2021-07-23 08:29:03.453 Df kernel[0:893] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 13128
2021-07-23 08:29:03.959 Df kernel[0:893] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 12952
2021-07-23 08:29:04.469 Df kernel[0:893] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 12776
2021-07-23 08:29:04.974 Df kernel[0:893] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 12488
2021-07-23 11:12:58.789 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 12472
2021-07-23 11:12:59.288 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 12264
2021-07-23 11:12:59.787 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 12104
2021-07-23 11:13:00.288 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 12024
2021-07-23 11:13:00.789 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 12008
2021-07-23 11:13:01.289 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 11688
2021-07-23 11:13:01.788 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 11576
2021-07-23 11:13:02.288 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 11160
2021-07-23 11:13:02.819 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 10808
2021-07-23 11:13:03.320 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 10456
2021-07-23 11:13:03.820 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 10248
2021-07-23 11:13:04.348 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 9896
2021-07-23 11:13:04.875 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 9128
2021-07-23 11:13:05.368 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 9000
2021-07-23 11:13:05.867 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 8472
2021-07-23 11:13:06.368 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 8424
2021-07-23 11:13:06.868 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 8392
2021-07-23 11:13:07.367 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 8344
2021-07-23 11:13:07.868 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 8104
2021-07-23 11:13:08.368 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 8072
2021-07-23 11:13:08.868 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 8024
2021-07-23 11:13:09.368 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 7944
2021-07-23 11:13:09.868 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 7848
2021-07-23 11:14:00.675 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 7832
2021-07-23 11:14:01.168 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 7784
2021-07-23 11:14:01.668 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 7704
2021-07-23 11:14:02.168 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 7608
2021-07-23 11:14:08.634 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 7528
2021-07-23 11:14:09.128 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 7496
2021-07-23 11:14:09.628 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 7448
2021-07-23 11:14:10.129 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 7368
2021-07-23 11:14:10.627 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 7272
2021-07-23 11:14:11.129 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 7240
2021-07-23 11:14:11.629 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 6952
2021-07-23 11:14:12.127 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 6840
2021-07-23 11:14:12.630 Df kernel[0:a204] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 6344
2021-07-23 11:15:19.249 Df kernel[0:bc4d] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 5848
2021-07-23 11:15:19.747 Df kernel[0:bc4d] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 5720
2021-07-23 11:15:24.161 Df kernel[0:bd73] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 5688
2021-07-23 11:15:24.660 Df kernel[0:bd73] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 5448
2021-07-23 11:15:25.159 Df kernel[0:bd73] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 5256
2021-07-23 11:15:25.658 Df kernel[0:bd73] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 5128
2021-07-23 11:15:26.166 Df kernel[0:bd73] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 4968
2021-07-23 11:15:28.176 Df kernel[0:bdac] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 4808
2021-07-23 11:15:28.731 Df kernel[0:bdac] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 4616
2021-07-23 11:15:29.227 Df kernel[0:bdac] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 4584
2021-07-23 11:15:29.728 Df kernel[0:bdac] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 4536
2021-07-23 11:15:30.228 Df kernel[0:bdac] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 4520
2021-07-23 11:15:30.728 Df kernel[0:bdac] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 4360
2021-07-23 11:21:48.209 Df kernel[0:cd00] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 4344
2021-07-23 11:21:48.708 Df kernel[0:cd00] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 4312
2021-07-23 11:21:49.207 Df kernel[0:cd00] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 4264
2021-07-23 11:21:49.708 Df kernel[0:cd00] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 4024
2021-07-23 11:21:50.207 Df kernel[0:cd00] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 3992
2021-07-23 11:21:50.708 Df kernel[0:cd00] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 3944
2021-07-23 11:21:51.208 Df kernel[0:cd00] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 3864
2021-07-23 11:21:51.708 Df kernel[0:cd00] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 3768
2021-07-23 10:17:42.263 Df kernel[0:891] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 14872
2021-07-23 10:17:42.758 Df kernel[0:891] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 14776
2021-07-23 10:17:43.257 Df kernel[0:891] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 13560
2021-07-23 10:17:43.758 Df kernel[0:891] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 13128
2021-07-23 10:17:44.265 Df kernel[0:891] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 12952
2021-07-23 10:17:44.780 Df kernel[0:891] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 12776
2021-07-23 10:17:45.289 Df kernel[0:891] (zfs) <zfs`kmem_cache_alloc> SPL: New LOW stackavailable 12488

@JMoVS
Contributor Author

JMoVS commented Jul 23, 2021

(base) justin@SergeantPeppair ~ % log show --last 2h | grep OSKernelStackRemaining
2021-07-23 10:17:42.263614+0000 0x891      Default     0x0                  0      0    kernel: (zfs) SPL: NEW OSKernelStackRemaining 15240

rottegift added a commit to rottegift/openzfs that referenced this issue Jul 24, 2021
In openzfsonosx#90
a user reported panics on an M1 with the message

"Invalid kernel stack pointer (probable overflow)."

In several of these, a deep multi-arena allocation
was in progress (several vmem_alloc/vmem_xalloc calls reaching
all the way down through vmem_bucket_alloc and
xnu_alloc_throttled, and ultimately to osif_malloc).

The stack frames above the first vmem_alloc were also fairly large.

This commit sets a dynamically sysctl-tunable threshold
(8k default) for remaining stack size as reported by xnu.
If we do not have more bytes than that when vmem_alloc()
is called, then the actual allocation will be done in a
separate worker thread which will start with a nearly
empty stack that is much more likely to hold the various
frames all the way through our code boundary with the
kernel and beyond.

The xnu / mach thread_call API (osfmk/kern/thread_call.h)
is used to avoid circular dependencies with taskq. The
mechanism is per-arena, costing a quick stack-depth check
per vmem_alloc() but allowing for wildly varying stack
depths above the first vmem_alloc() call.

Vmem arenas now have two further kstats: the lowest amount
of available stack space seen at a vmem_alloc() into it,
and the number of times the allocation work has been done
in a thread_call worker.

* some spl-vmem.c functions are given inline hints

These are small functions with no or very few automatic
variables that were good candidates for clang/llvm's
inlining heuristics before we switched to building
the kext with -finline-hint-functions.

* remove some (unrelated) unused variables which escaped
previous commits, eliminating a couple of compile-time warnings.
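
The hand-off described in this commit message can be illustrated with a user-space toy. This is not the kernel code: it is a sketch in Python, assuming the remaining stack is passed in as a number (Python cannot query it), with a `ThreadPoolExecutor` standing in for the thread_call worker. The class `Arena` and all of its attributes are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


class Arena:
    """Toy stand-in for a vmem arena; every name here is hypothetical."""

    def __init__(self, name, threshold=8 * 1024):
        self.name = name
        self.threshold = threshold      # tunable, 8k default as in the commit message
        self.lowest_stack_seen = None   # analogue of the first new kstat
        self.worker_allocs = 0          # analogue of the second new kstat
        self._worker = ThreadPoolExecutor(max_workers=1)

    def alloc(self, size, stack_remaining):
        # Track the lowest remaining stack seen at any allocation into this arena.
        if self.lowest_stack_seen is None or stack_remaining < self.lowest_stack_seen:
            self.lowest_stack_seen = stack_remaining
        if stack_remaining > self.threshold:
            return self._do_alloc(size)  # enough stack left: allocate inline
        # Not more than the threshold: do the work on a thread with a fresh stack.
        self.worker_allocs += 1
        return self._worker.submit(self._do_alloc, size).result()

    def _do_alloc(self, size):
        return bytearray(size)

    def close(self):
        self._worker.shutdown()


arena = Arena("demo")
a = arena.alloc(64, stack_remaining=12488)  # above the threshold: inline
b = arena.alloc(64, stack_remaining=3768)   # below the threshold: worker thread
print(arena.lowest_stack_seen, arena.worker_allocs)
arena.close()
```

The caller still blocks until the allocation finishes. The only change is which stack the deep allocation path runs on.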
rottegift added a commit to rottegift/openzfs that referenced this issue Jul 25, 2021
rottegift added a commit that referenced this issue Jul 25, 2021
@JMoVS
Contributor Author

JMoVS commented Jul 28, 2021

Panic after setting the threshold to 4000:

panic(cpu 1 caller 0xfffffe001bd01320): Invalid kernel stack pointer (probable overflow). at pc 0xfffffe001b5030d8, lr 0xc58efe001b572110 (saved state: 0xfffffe30638bfcb0)
	  x0: 0x000000000000000a  x1:  0x000001f47db440a9  x2:  0x0000000000000000  x3:  0x0000000000000080
	  x4: 0xfffffe1667612ca0  x5:  0xfffffe166e3b4660  x6:  0x0000000000000001  x7:  0xffffffffffffffff
	  x8: 0xfffffe00235125d0  x9:  0x000002002df92698  x10: 0xfffffe1667612ca0  x11: 0x0000040000000000
	  x12: 0x0000000000000000 x13: 0x0000000000000001  x14: 0x00000000604013c8  x15: 0x0000000000000000
	  x16: 0x0291fe001b668d7c x17: 0x0000000000004a0e  x18: 0x0000000000000000  x19: 0xfffffe166e3b4660
	  x20: 0xfffffe1667612ca0 x21: 0x0000000000000000  x22: 0x000001f47db440a9  x23: 0x000000000000000a
	  x24: 0x0000000000000001 x25: 0x0000000000000080  x26: 0xfffffe001f288000  x27: 0xfffffe001f2b9000
	  x28: 0xfffffe001f2d9000 fp:  0xfffffe3fef1a80d0  lr:  0xc58efe001b572110  sp:  0xfffffe3fef1a7c90
	  pc:  0xfffffe001b5030d8 cpsr: 0x204013c8         esr: 0x96000047          far: 0xfffffe3fef1a7c98

Debugger message: panic
Memory ID: 0x6
OS release type: User
OS version: 20G80
Kernel version: Darwin Kernel Version 20.6.0: Wed Jun 23 00:26:27 PDT 2021; root:xnu-7195.141.2~5/RELEASE_ARM64_T8101
Fileset Kernelcache UUID: E46841F89DC3FD7ACEC6F404AC995579
Kernel UUID: AC4A14A7-8A8E-3AE6-85A6-55E6B2502BF9
iBoot version: iBoot-6723.140.2
secure boot?: YES
Paniclog version: 13
KernelCache slide: 0x00000000138e4000
KernelCache base:  0xfffffe001a8e8000
Kernel slide:      0x000000001442c000
Kernel text base:  0xfffffe001b430000
Kernel text exec base:  0xfffffe001b4fc000
mach_absolute_time: 0x1f47dbbf5e1
Epoch Time:        sec       usec
  Boot    : 0x60ffd690 0x0003cece
  Sleep   : 0x6101a8dc 0x000e5e72
  Wake    : 0x6101a9b4 0x00078085
  Calendar: 0x6101c8b9 0x00084134

CORE 0 recently retired instr at 0xfffffe001b66ed6c
CORE 1 recently retired instr at 0xfffffe001b66d6a4
CORE 2 recently retired instr at 0xfffffe001b66ed6c
CORE 3 recently retired instr at 0xfffffe001b66ed6c
CORE 4 recently retired instr at 0xfffffe001b66ed70
CORE 5 recently retired instr at 0xfffffe001b66ed70
CORE 6 recently retired instr at 0xfffffe001b66ed70
CORE 7 recently retired instr at 0xfffffe001b66ed70
CORE 0 PVH locks held: None
CORE 1 PVH locks held: None
CORE 2 PVH locks held: None
CORE 3 PVH locks held: None
CORE 4 PVH locks held: None
CORE 5 PVH locks held: None
CORE 6 PVH locks held: None
CORE 7 PVH locks held: None
CORE 0: PC=0xfffffe001a55ffc4, LR=0xfffffe001a55ffc4, FP=0xfffffe3fe7dfaf40
CORE 1 is the one that panicked. Check the full backtrace for details.
CORE 2: PC=0xfffffe001b5a7338, LR=0xfffffe001b55c0c0, FP=0xfffffe3fef3b3ba0
CORE 3: PC=0xfffffe001b575a64, LR=0xfffffe001b575a5c, FP=0xfffffe30428a3ee0
CORE 4: PC=0xfffffe001b575a64, LR=0xfffffe001b575a5c, FP=0xfffffe3fef4e3ee0
CORE 5: PC=0xfffffe001b575a64, LR=0xfffffe001b575a5c, FP=0xfffffe3fef38bee0
CORE 6: PC=0xfffffe001b575a64, LR=0xfffffe001b575a5c, FP=0xfffffe3feba4bee0
CORE 7: PC=0xfffffe001b575a64, LR=0xfffffe001b575a5c, FP=0xfffffe3fe5e3bee0
Panicked task 0xfffffe166bb294e8: 26321 pages, 7 threads: pid 647: mds_stores
Panicked thread: 0xfffffe166e3b4660, backtrace: 0xfffffe30638bf5c0, tid: 883240
		  lr: 0xfffffe001b54ab68  fp: 0xfffffe30638bf630
		  lr: 0xfffffe001b54a94c  fp: 0xfffffe30638bf6a0
		  lr: 0xfffffe001b6741c8  fp: 0xfffffe30638bf6c0
		  lr: 0xfffffe001bd01374  fp: 0xfffffe30638bf6e0
		  lr: 0xfffffe001b5039bc  fp: 0xfffffe30638bf6f0
		  lr: 0xfffffe001b54a5dc  fp: 0xfffffe30638bfa80
		  lr: 0xfffffe001b54a5dc  fp: 0xfffffe30638bfaf0
		  lr: 0xfffffe001bcfce80  fp: 0xfffffe30638bfb10
		  lr: 0xfffffe001bd01320  fp: 0xfffffe30638bfc80
		  lr: 0xfffffe001b668734  fp: 0xfffffe30638bfc90
		  lr: 0xfffffe001b50399c  fp: 0xfffffe30638bfca0
		  lr: 0xfffffe001b572110  fp: 0xfffffe3fef1a80d0
		  lr: 0xfffffe001b572110  fp: 0xfffffe3fef1a8160
		  lr: 0xfffffe001b571da8  fp: 0xfffffe3fef1a81f0
		  lr: 0xfffffe001b57086c  fp: 0xfffffe3fef1a8240
		  lr: 0xfffffe001b669b60  fp: 0xfffffe3fef1a8260
		  lr: 0xfffffe001bc103f0  fp: 0xfffffe3fef1a82a0
		  lr: 0xfffffe001dd3262c  fp: 0xfffffe3fef1a83a0
		  lr: 0xfffffe001bc26bc8  fp: 0xfffffe3fef1a8410
		  lr: 0xfffffe001dd669ec  fp: 0xfffffe3fef1a84c0
		  lr: 0xfffffe001bc26bc8  fp: 0xfffffe3fef1a8530
		  lr: 0xfffffe001dd6148c  fp: 0xfffffe3fef1a85d0
		  lr: 0xfffffe001dd5eaf0  fp: 0xfffffe3fef1a8670
		  lr: 0xfffffe001dd473e0  fp: 0xfffffe3fef1a8710
		  lr: 0xfffffe001bc26bc8  fp: 0xfffffe3fef1a8780
		  lr: 0xfffffe001dd46a70  fp: 0xfffffe3fef1a8810
		  lr: 0xfffffe001deedfb8  fp: 0xfffffe3fef1a8850
		  lr: 0xfffffe001dee7fec  fp: 0xfffffe3fef1a88c0
		  lr: 0xfffffe001d972c14  fp: 0xfffffe3fef1a8920
		  lr: 0xfffffe001d9731ec  fp: 0xfffffe3fef1a8950
		  lr: 0xfffffe001d98a444  fp: 0xfffffe3fef1a89c0
		  lr: 0xfffffe001d988960  fp: 0xfffffe3fef1a8a20
		  lr: 0xfffffe001da1afcc  fp: 0xfffffe3fef1a8aa0
		  lr: 0xfffffe001da1fa74  fp: 0xfffffe3fef1a8b50
		  lr: 0xfffffe001a57d3f8  fp: 0xfffffe3fef1a8b90
		  lr: 0xfffffe001a579428  fp: 0xfffffe3fef1a8c00
		  lr: 0xfffffe001a5d4dcc  fp: 0xfffffe3fef1a8c90
		  lr: 0xfffffe001a6818a4  fp: 0xfffffe3fef1a8d00
		  lr: 0xfffffe001a67ba7c  fp: 0xfffffe3fef1a8d50
		  lr: 0xfffffe001a5e6478  fp: 0xfffffe3fef1a8dd0
		  lr: 0xfffffe001a6818a4  fp: 0xfffffe3fef1a8e40
		  lr: 0xfffffe001a67ba7c  fp: 0xfffffe3fef1a8e90
		  lr: 0xfffffe001a5e6478  fp: 0xfffffe3fef1a8f10
		  lr: 0xfffffe001a68158c  fp: 0xfffffe3fef1a8f80
		  lr: 0xfffffe001a67ba7c  fp: 0xfffffe3fef1a8fd0
		  lr: 0xfffffe001a4e65fc  fp: 0xfffffe3fef1a94e0
		  lr: 0xfffffe001a503af0  fp: 0xfffffe3fef1a9600
		  lr: 0xfffffe001a5030e0  fp: 0xfffffe3fef1a9750
		  lr: 0xfffffe001a511ef0  fp: 0xfffffe3fef1a9860
		  lr: 0xfffffe001a5133c8  fp: 0xfffffe3fef1a98e0
      Kernel Extensions in backtrace:
         com.apple.iokit.IOStorageFamily(2.1)[A41978E8-2B18-341C-8935-2AACD1565F0F]@0xfffffe001da14000->0xfffffe001da33fff
         com.apple.iokit.IOUSBHostFamily(1.2)[6CFAFC24-387A-3E4C-82F2-35F742CCB3B6]@0xfffffe001dcc0000->0xfffffe001dd7bfff
            dependency: com.apple.driver.AppleSMC(3.1.9)[A1733D39-958E-3CA9-845C-9719D057859F]@0xfffffe001c978000->0xfffffe001c9a3fff
            dependency: com.apple.driver.usb.AppleUSBCommon(1.0)[6A4EFC7B-4630-3C80-B24B-1CFF38B8C0B2]@0xfffffe001cdfc000->0xfffffe001ce03fff
            dependency: com.apple.driver.AppleUSBHostMergeProperties(1.2)[659D6E3C-E285-302F-8739-284DC6C78A1A]@0xfffffe001ddd4000->0xfffffe001ddd7fff
         com.apple.iokit.IOSCSIArchitectureModelFamily(436.140.1)[FACB1737-CE01-3E0C-9C75-D5E522DCFC4D]@0xfffffe001d96c000->0xfffffe001d987fff
         com.apple.iokit.IOSCSIBlockCommandsDevice(436.140.1)[68323F13-F102-378E-A85F-847410107EE1]@0xfffffe001d988000->0xfffffe001d99bfff
            dependency: com.apple.iokit.IOSCSIArchitectureModelFamily(436.140.1)[FACB1737-CE01-3E0C-9C75-D5E522DCFC4D]@0xfffffe001d96c000->0xfffffe001d987fff
            dependency: com.apple.iokit.IOStorageFamily(2.1)[A41978E8-2B18-341C-8935-2AACD1565F0F]@0xfffffe001da14000->0xfffffe001da33fff
         com.apple.iokit.IOUSBMassStorageDriver(184.140.2)[A8A5FECD-3428-3697-9858-2756A22AC437]@0xfffffe001ded4000->0xfffffe001def7fff
            dependency: com.apple.iokit.IOPCIFamily(2.9)[663AF8F3-8DF8-346B-84B5-A116C931421D]@0xfffffe001d94c000->0xfffffe001d967fff
            dependency: com.apple.iokit.IOSCSIArchitectureModelFamily(436.140.1)[FACB1737-CE01-3E0C-9C75-D5E522DCFC4D]@0xfffffe001d96c000->0xfffffe001d987fff
            dependency: com.apple.iokit.IOStorageFamily(2.1)[A41978E8-2B18-341C-8935-2AACD1565F0F]@0xfffffe001da14000->0xfffffe001da33fff
            dependency: com.apple.iokit.IOUSBHostFamily(1.2)[6CFAFC24-387A-3E4C-82F2-35F742CCB3B6]@0xfffffe001dcc0000->0xfffffe001dd7bfff
         org.openzfsonosx.zfs(2.1.99)[1212C13F-37BD-3313-A9C6-8C7262974126]@0xfffffe001a4dc000->0xfffffe001a7bffff
            dependency: com.apple.iokit.IOStorageFamily(2.1)[A41978E8-2B18-341C-8935-2AACD1565F0F]@0xfffffe001da14000->0xfffffe001da33fff

last started kext at 4649608201: com.apple.filesystems.smbfs	3.6 (addr 0xfffffe001b404000, size 65536)
loaded kexts:
org.openzfsonosx.zfs	2.1.99
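
For reference, the sp, far and fp values in the panic above can be compared with plain arithmetic. The sketch below only uses numbers copied from that log; `frame_sizes` is an illustrative helper, and since the backtrace may be truncated, the span it computes should be read as a lower bound on the stack in use.

```python
def frame_sizes(fps):
    """Bytes between consecutive frame pointers, innermost frame first."""
    return [outer - inner for inner, outer in zip(fps, fps[1:])]


# Register values from the panic log above.
sp = 0xfffffe3fef1a7c90
far = 0xfffffe3fef1a7c98

# First few fp values on the panicked thread's own stack (innermost first),
# and the last fp printed in the backtrace.
fps = [
    0xfffffe3fef1a80d0,
    0xfffffe3fef1a8160,
    0xfffffe3fef1a81f0,
    0xfffffe3fef1a8240,
    0xfffffe3fef1a8260,
    0xfffffe3fef1a82a0,
    0xfffffe3fef1a83a0,
]
outermost_fp = 0xfffffe3fef1a98e0

sizes = frame_sizes(fps)            # per-frame sizes in decimal bytes
below_fp = fps[0] - sp              # bytes between innermost fp and sp
fault_offset = far - sp             # where the faulting access landed
visible_span = outermost_fp - sp    # stack covered by the printed backtrace

print("frame sizes:", sizes)
print("fp - sp:", below_fp, "far - sp:", fault_offset)
print("sp to outermost printed fp:", visible_span, "bytes")
```

The faulting address sits 8 bytes above sp, which is consistent with the fault happening while a new frame was being written.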

lundman pushed a commit that referenced this issue Jul 30, 2021
@JMoVS
Contributor Author

JMoVS commented Aug 1, 2021

setup change:

This time the zvol is inside a ZFS encryption root, with skein checksum and compression set to zstd initially, changed to lz4 briefly and then back again. The APFS on top of it is unencrypted.

{"bug_type":"210","timestamp":"2021-08-01 20:13:00.00 +0200","os_version":"macOS 11.5.1 (20G80)","incident_id":"62EF507D-5F41-4D00-BFA0-92A2D4D79CCA"}
{
  "build" : "macOS 11.5.1 (20G80)",
  "product" : "MacBookAir10,1",
  "kernel" : "Darwin Kernel Version 20.6.0: Wed Jun 23 00:26:27 PDT 2021; root:xnu-7195.141.2~5\/RELEASE_ARM64_T8101",
  "incident" : "62EF507D-5F41-4D00-BFA0-92A2D4D79CCA",
  "crashReporterKey" : "403AAFF1-90BE-5D70-C84C-F7EB6830469C",
  "date" : "2021-08-01 20:13:00.80 +0200",
  "panicString" : "panic(cpu 3 caller 0xfffffe0017c39320): Invalid kernel stack pointer (probable overflow). at pc 0xfffffe001743b0d8, lr 0x73b77e00174940c0 (saved state: 0xfffffe305a4c3cb0)\n\t  x0: 0xfffffe001b1f7128  x1:  0xfffffe10000c8198  x2:  0x0000000000000004  x3:  0xfffffe001b1da558\n\t  x4: 0x0000000000000000  x5:  0x0000000000000000  x6:  0xf9fffe001a3f6688  x7:  0xf1b77e0017b5eabc\n\t  x8: 0xfffffe001a32ecc0  x9:  0x0000000000000007  x10: 0xfffffe001b1bca00  x11: 0xfffffe001b1bd430\n\t  x12: 0xfffffe001b1bd418 x13: 0x000000000029adbb  x14: 0x0000000000000000  x15: 0xfffffe001b20073c\n\t  x16: 0x1601fe001a3f26b0 x17: 0xfffffe001a3f2608  x18: 0x0000000000000000  x19: 0x0000000000000088\n\t  x20: 0xfffffe001a32ea10 x21: 0xfffffe001b1f7128  x22: 0x0000000000000000  x23: 0x0000000000000000\n\t  x24: 0xf3c47e0019c6a1ec x25: 0xfffffe2331c6e648  x26: 0x0000000000000002  x27: 0x00000000aaaaaaaa\n\t  x28: 0xfffffe1667b54028 fp:  0xfffffe3fe3620040  lr:  0x73b77e00174940c0  sp:  0xfffffe3fe361fcb0\n\t  pc:  0xfffffe001743b0d8 cpsr: 0x204013c8         esr: 0x96000047          far: 0xfffffe3fe361fcb8\n\nDebugger message: panic\nMemory ID: 0x6\nOS release type: User\nOS version: 20G80\nKernel version: Darwin Kernel Version 20.6.0: Wed Jun 23 00:26:27 PDT 2021; root:xnu-7195.141.2~5\/RELEASE_ARM64_T8101\nFileset Kernelcache UUID: E46841F89DC3FD7ACEC6F404AC995579\nKernel UUID: AC4A14A7-8A8E-3AE6-85A6-55E6B2502BF9\niBoot version: iBoot-6723.140.2\nsecure boot?: YES\nPaniclog version: 13\nKernelCache slide: 0x000000000f81c000\nKernelCache base:  0xfffffe0016820000\nKernel slide:      0x0000000010364000\nKernel text base:  0xfffffe0017368000\nKernel text exec base:  0xfffffe0017434000\nmach_absolute_time: 0x3d9e9e3d46\nEpoch Time:        sec       usec\n  Boot    : 0x6106b820 0x0008dfe2\n  Sleep   : 0x00000000 0x00000000\n  Wake    : 0x00000000 0x00000000\n  Calendar: 0x6106e32c 0x000ad12f\n\nCORE 0 recently retired instr at 0xfffffe00175a6d6c\nCORE 1 recently retired 
instr at 0xfffffe00175a6d6c\nCORE 2 recently retired instr at 0xfffffe00175a6d6c\nCORE 3 recently retired instr at 0xfffffe00175a56a4\nCORE 4 recently retired instr at 0xfffffe00175a6d70\nCORE 5 recently retired instr at 0xfffffe00175a6d70\nCORE 6 recently retired instr at 0xfffffe00175a6d70\nCORE 7 recently retired instr at 0xfffffe00175a6d70\nCORE 0 PVH locks held: None\nCORE 1 PVH locks held: None\nCORE 2 PVH locks held: None\nCORE 3 PVH locks held: None\nCORE 4 PVH locks held: None\nCORE 5 PVH locks held: None\nCORE 6 PVH locks held: None\nCORE 7 PVH locks held: None\nCORE 0: PC=0xfffffe0018a2fba4, LR=0xfffffe0018a2f064, FP=0xfffffe3fecb83d60\nCORE 1: PC=0xfffffe00174ada64, LR=0xfffffe00174ada5c, FP=0xfffffe3fea41bee0\nCORE 2: PC=0xfffffe00174ada64, LR=0xfffffe00174ada5c, FP=0xfffffe30b555bee0\nCORE 3 is the one that panicked. Check the full backtrace for details.\nCORE 4: PC=0xfffffe00174ada64, LR=0xfffffe00174ada5c, FP=0xfffffe3fed7cbee0\nCORE 5: PC=0xfffffe00174ada64, LR=0xfffffe00174ada5c, FP=0xfffffe3042aebee0\nCORE 6: PC=0xfffffe00174ada64, LR=0xfffffe00174ada5c, FP=0xfffffe3fea40bee0\nCORE 7: PC=0xfffffe00174ada64, LR=0xfffffe00174ada5c, FP=0xfffffe3fea3cbee0\nPanicked task 0xfffffe16677421f8: 16208 pages, 12 threads: pid 346: ArqAgent\nPanicked thread: 0xfffffe166bb43300, backtrace: 0xfffffe305a4c35c0, tid: 73104\n\t\t  lr: 0xfffffe0017482b68  fp: 0xfffffe305a4c3630\n\t\t  lr: 0xfffffe001748294c  fp: 0xfffffe305a4c36a0\n\t\t  lr: 0xfffffe00175ac1c8  fp: 0xfffffe305a4c36c0\n\t\t  lr: 0xfffffe0017c39374  fp: 0xfffffe305a4c36e0\n\t\t  lr: 0xfffffe001743b9bc  fp: 0xfffffe305a4c36f0\n\t\t  lr: 0xfffffe00174825dc  fp: 0xfffffe305a4c3a80\n\t\t  lr: 0xfffffe00174825dc  fp: 0xfffffe305a4c3af0\n\t\t  lr: 0xfffffe0017c34e80  fp: 0xfffffe305a4c3b10\n\t\t  lr: 0xfffffe0017c39320  fp: 0xfffffe305a4c3c80\n\t\t  lr: 0xfffffe00175a0734  fp: 0xfffffe305a4c3c90\n\t\t  lr: 0xfffffe001743b99c  fp: 0xfffffe305a4c3ca0\n\t\t  lr: 0xfffffe00174940c0  fp: 
0xfffffe3fe3620040\n\t\t  lr: 0xfffffe0017b31b98  fp: 0xfffffe3fe3620060\n\t\t  lr: 0xfffffe0017b36e70  fp: 0xfffffe3fe36200b0\n\t\t  lr: 0xfffffe0017b48300  fp: 0xfffffe3fe36200f0\n\t\t  lr: 0xfffffe0019c6a62c  fp: 0xfffffe3fe36201f0\n\t\t  lr: 0xfffffe0017b5ebc8  fp: 0xfffffe3fe3620260\n\t\t  lr: 0xfffffe0019c9e9ec  fp: 0xfffffe3fe3620310\n\t\t  lr: 0xfffffe0017b5ebc8  fp: 0xfffffe3fe3620380\n\t\t  lr: 0xfffffe0019c9948c  fp: 0xfffffe3fe3620420\n\t\t  lr: 0xfffffe0019c96af0  fp: 0xfffffe3fe36204c0\n\t\t  lr: 0xfffffe0019c7f3e0  fp: 0xfffffe3fe3620560\n\t\t  lr: 0xfffffe0017b5ebc8  fp: 0xfffffe3fe36205d0\n\t\t  lr: 0xfffffe0019c7ea70  fp: 0xfffffe3fe3620660\n\t\t  lr: 0xfffffe0019e25fb8  fp: 0xfffffe3fe36206a0\n\t\t  lr: 0xfffffe0019e1ffec  fp: 0xfffffe3fe3620710\n\t\t  lr: 0xfffffe00198aac14  fp: 0xfffffe3fe3620770\n\t\t  lr: 0xfffffe00198ab1ec  fp: 0xfffffe3fe36207a0\n\t\t  lr: 0xfffffe00198c2444  fp: 0xfffffe3fe3620810\n\t\t  lr: 0xfffffe00198c0960  fp: 0xfffffe3fe3620870\n\t\t  lr: 0xfffffe0019952fcc  fp: 0xfffffe3fe36208f0\n\t\t  lr: 0xfffffe0019957a74  fp: 0xfffffe3fe36209a0\n\t\t  lr: 0xfffffe00164b53f8  fp: 0xfffffe3fe36209e0\n\t\t  lr: 0xfffffe00164b1428  fp: 0xfffffe3fe3620a50\n\t\t  lr: 0xfffffe001650cdcc  fp: 0xfffffe3fe3620ae0\n\t\t  lr: 0xfffffe00165b98a4  fp: 0xfffffe3fe3620b50\n\t\t  lr: 0xfffffe00165b3a7c  fp: 0xfffffe3fe3620ba0\n\t\t  lr: 0xfffffe001651e478  fp: 0xfffffe3fe3620c20\n\t\t  lr: 0xfffffe00165b98a4  fp: 0xfffffe3fe3620c90\n\t\t  lr: 0xfffffe00165b3a7c  fp: 0xfffffe3fe3620ce0\n\t\t  lr: 0xfffffe001651e478  fp: 0xfffffe3fe3620d60\n\t\t  lr: 0xfffffe00165b958c  fp: 0xfffffe3fe3620dd0\n\t\t  lr: 0xfffffe00165b3a7c  fp: 0xfffffe3fe3620e20\n\t\t  lr: 0xfffffe001641e5fc  fp: 0xfffffe3fe3621330\n\t\t  lr: 0xfffffe001643baf0  fp: 0xfffffe3fe3621450\n\t\t  lr: 0xfffffe001643b0e0  fp: 0xfffffe3fe36215a0\n\t\t  lr: 0xfffffe001643f2a8  fp: 0xfffffe3fe3621770\n\t\t  lr: 0xfffffe001643f278  fp: 0xfffffe3fe3621940\n\t\t  lr: 0xfffffe001643ca8c  fp: 
0xfffffe3fe3621990\n\t\t  lr: 0xfffffe0016440178  fp: 0xfffffe3fe36219c0\n      Kernel Extensions in backtrace:\n         com.apple.iokit.IOStorageFamily(2.1)[A41978E8-2B18-341C-8935-2AACD1565F0F]@0xfffffe001994c000->0xfffffe001996bfff\n         com.apple.iokit.IOUSBHostFamily(1.2)[6CFAFC24-387A-3E4C-82F2-35F742CCB3B6]@0xfffffe0019bf8000->0xfffffe0019cb3fff\n            dependency: com.apple.driver.AppleSMC(3.1.9)[A1733D39-958E-3CA9-845C-9719D057859F]@0xfffffe00188b0000->0xfffffe00188dbfff\n            dependency: com.apple.driver.usb.AppleUSBCommon(1.0)[6A4EFC7B-4630-3C80-B24B-1CFF38B8C0B2]@0xfffffe0018d34000->0xfffffe0018d3bfff\n            dependency: com.apple.driver.AppleUSBHostMergeProperties(1.2)[659D6E3C-E285-302F-8739-284DC6C78A1A]@0xfffffe0019d0c000->0xfffffe0019d0ffff\n         com.apple.iokit.IOSCSIArchitectureModelFamily(436.140.1)[FACB1737-CE01-3E0C-9C75-D5E522DCFC4D]@0xfffffe00198a4000->0xfffffe00198bffff\n         com.apple.iokit.IOSCSIBlockCommandsDevice(436.140.1)[68323F13-F102-378E-A85F-847410107EE1]@0xfffffe00198c0000->0xfffffe00198d3fff\n            dependency: com.apple.iokit.IOSCSIArchitectureModelFamily(436.140.1)[FACB1737-CE01-3E0C-9C75-D5E522DCFC4D]@0xfffffe00198a4000->0xfffffe00198bffff\n            dependency: com.apple.iokit.IOStorageFamily(2.1)[A41978E8-2B18-341C-8935-2AACD1565F0F]@0xfffffe001994c000->0xfffffe001996bfff\n         com.apple.iokit.IOUSBMassStorageDriver(184.140.2)[A8A5FECD-3428-3697-9858-2756A22AC437]@0xfffffe0019e0c000->0xfffffe0019e2ffff\n            dependency: com.apple.iokit.IOPCIFamily(2.9)[663AF8F3-8DF8-346B-84B5-A116C931421D]@0xfffffe0019884000->0xfffffe001989ffff\n            dependency: com.apple.iokit.IOSCSIArchitectureModelFamily(436.140.1)[FACB1737-CE01-3E0C-9C75-D5E522DCFC4D]@0xfffffe00198a4000->0xfffffe00198bffff\n            dependency: com.apple.iokit.IOStorageFamily(2.1)[A41978E8-2B18-341C-8935-2AACD1565F0F]@0xfffffe001994c000->0xfffffe001996bfff\n            dependency: 
com.apple.iokit.IOUSBHostFamily(1.2)[6CFAFC24-387A-3E4C-82F2-35F742CCB3B6]@0xfffffe0019bf8000->0xfffffe0019cb3fff\n         org.openzfsonosx.zfs(2.1)[9D9F176D-94C1-34BE-A3C6-CE64313F02AF]@0xfffffe0016414000->0xfffffe00166f7fff\n            dependency: com.apple.iokit.IOStorageFamily(2.1)[A41978E8-2B18-341C-8935-2AACD1565F0F]@0xfffffe001994c000->0xfffffe001996bfff\n\nlast started kext at 24575130415: com.apple.filesystems.smbfs\t3.6 (addr 0xfffffe001733c000, size 65536)\nloaded kexts:\norg.openzfsonosx.zfs\t2.1.0\ncom.apple.filesystems.smbfs\t3.6\ncom.apple.filesystems.autofs\t3.0\ncom.apple.fileutil\t20.036.15\ncom.apple.driver.AppleBluetoothMultitouch\t99\ncom.apple.driver.AppleTopCaseHIDEventDriver\t4050.1\ncom.apple.iokit.IOBluetoothSerialManager\t8.0.5d7\ncom.apple.driver.AppleBiometricServices\t1\ncom.apple.driver.CoreStorageFsck\t554.140.2\ncom.apple.driver.usb.AppleUSBHostBillboardDevice\t1.0\ncom.apple.iokit.SCSITaskUserClient\t436.140.1\ncom.apple.driver.BCMWLANFirmware4378.Hashstore\t1\ncom.apple.driver.CoreKDL\t1\ncom.apple.driver.SEPHibernation\t1\ncom.apple.driver.DiskImages.ReadWriteDiskImage\t493.0.0\ncom.apple.driver.DiskImages.UDIFDiskImage\t493.0.0\ncom.apple.driver.DiskImages.RAMBackingStore\t493.0.0\ncom.apple.driver.DiskImages.FileBackingStore\t493.0.0\ncom.apple.driver.AppleUSBDeviceNCM\t5.0.0\ncom.apple.driver.AppleThunderboltIP\t4.0.3\ncom.apple.driver.AppleSmartBatteryManager\t161.0.0\ncom.apple.filesystems.apfs\t1677.141.1\ncom.apple.driver.AppleALSColorSensor\t1.0.0d1\ncom.apple.driver.AppleAOPVoiceTrigger\t11.5\ncom.apple.driver.AppleSmartIO2\t1\ncom.apple.driver.ApplePMP\t1\ncom.apple.driver.AppleT8020SOCTuner\t1\ncom.apple.driver.AppleFileSystemDriver\t3.0.1\ncom.apple.driver.AppleAVD\t385\ncom.apple.nke.l2tp\t1.9\ncom.apple.filesystems.tmpfs\t1\ncom.apple.IOTextEncryptionFamily\t1.0.0\ncom.apple.filesystems.hfs.kext\t556.100.11\ncom.apple.security.BootPolicy\t1\ncom.apple.BootCache\t40\ncom.apple.AppleFSCompression.AppleFSCompressionTy
peZlib\t1.0.0\ncom.apple.AppleFSCompression.AppleFSCompressionTypeDataless\t1.0.0d1\ncom.apple.driver.ApplePMPFirmware\t1\ncom.apple.driver.AppleT8103CLPCv3\t1\ncom.apple.AppleEmbeddedSimpleSPINORFlasher\t1\ncom.apple.driver.AppleDPDisplayTCON\t1\ncom.apple.driver.AppleCS42L83Audio\t442.26\ncom.apple.driver.AppleTAS5770LAmp\t442.26\ncom.apple.driver.AppleSPMIPMU\t1.0.1\ncom.apple.AGXG13G\t173.28.7\ncom.apple.driver.AppleAVE2\t401.73.4\ncom.apple.driver.AppleJPEGDriver\t4.6.0\ncom.apple.driver.AppleMobileDispH13G-DCP\t140.0\ncom.apple.driver.usb.AppleUSBHostT8103\t1\ncom.apple.driver.AudioDMAController-T8103\t1.60.5\ncom.apple.driver.AppleS5L8960XNCO\t1\ncom.apple.driver.AppleT8103PMGR\t1\ncom.apple.driver.Apple

@lundman
Contributor

lundman commented Aug 1, 2021

0xfffffe1667b54028
vmem_init.initial_default_block (in zfs) + 10709864
vmem_init.initial_default_block (in zfs) + 10709324
vmem_init.initial_default_block (in zfs) + 11928008
0xfffffe0017c39374
vmem_init.initial_default_block (in zfs) + 10418620
vmem_init.initial_default_block (in zfs) + 10708444
vmem_init.initial_default_block (in zfs) + 10708444
0xfffffe0017c34e80
0xfffffe0017c39320
vmem_init.initial_default_block (in zfs) + 11880244
vmem_init.initial_default_block (in zfs) + 10418588
vmem_init.initial_default_block (in zfs) + 10780864
0xfffffe0017b31b98
0xfffffe0017b36e70
0xfffffe0017b48300
0xfffffe0019c6a62c
0xfffffe0017b5ebc8
0xfffffe0019c9e9ec
0xfffffe0017b5ebc8
0xfffffe0019c9948c
0xfffffe0019c96af0
0xfffffe0019c7f3e0
0xfffffe0017b5ebc8
0xfffffe0019c7ea70
0xfffffe0019e25fb8
0xfffffe0019e1ffec
0xfffffe00198aac14
0xfffffe00198ab1ec
0xfffffe00198c2444
0xfffffe00198c0960
0xfffffe0019952fcc
0xfffffe0019957a74
buf_strategy_iokit (in zfs) (ldi_iokit.cpp:0)
ldi_strategy (in zfs) (ldi_osx.c:2258)
vdev_disk_io_start (in zfs) (vdev_disk.c:631)
zio_vdev_io_start (in zfs) (zio.c:3857)
zio_nowait (in zfs) (zio.c:2284)
vdev_mirror_io_start (in zfs) (vdev_mirror.c:686)
zio_vdev_io_start (in zfs) (zio.c:3857)
zio_nowait (in zfs) (zio.c:2284)
vdev_mirror_io_start (in zfs) (vdev_mirror.c:686)
zio_vdev_io_start (in zfs) (zio.c:3729)
zio_nowait (in zfs) (zio.c:2284)
arc_read (in zfs) (arc.c:6452)
dbuf_read_impl (in zfs) (dbuf.c:1527)
dbuf_read (in zfs) (dbuf.c:1658)
dbuf_hold_impl (in zfs) (dbuf.c:3425)
dbuf_hold_impl (in zfs) (dbuf.c:3425)
dbuf_hold_level (in zfs) (dbuf.c:3511)
dbuf_hold (in zfs) (dbuf.c:3504)
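
The raw addresses left in the listing above can at least be attributed to a kext by using the load ranges printed under "Kernel Extensions in backtrace" in the Aug 1 panic. A minimal sketch, assuming those ranges are copied correctly; `locate` is an illustrative helper, and the offset it returns is relative to the kext load address, so turning it into a symbol still requires the matching binary.

```python
# Load ranges copied from the "Kernel Extensions in backtrace" section
# of the Aug 1 panic (start and end addresses are inclusive).
KEXT_RANGES = {
    "com.apple.iokit.IOStorageFamily": (0xfffffe001994c000, 0xfffffe001996bfff),
    "com.apple.iokit.IOUSBHostFamily": (0xfffffe0019bf8000, 0xfffffe0019cb3fff),
    "com.apple.iokit.IOSCSIArchitectureModelFamily": (0xfffffe00198a4000, 0xfffffe00198bffff),
    "com.apple.iokit.IOSCSIBlockCommandsDevice": (0xfffffe00198c0000, 0xfffffe00198d3fff),
    "com.apple.iokit.IOUSBMassStorageDriver": (0xfffffe0019e0c000, 0xfffffe0019e2ffff),
    "org.openzfsonosx.zfs": (0xfffffe0016414000, 0xfffffe00166f7fff),
}


def locate(lr):
    """Return (kext name, offset from load address), or (None, None)."""
    for name, (start, end) in KEXT_RANGES.items():
        if start <= lr <= end:
            return name, lr - start
    return None, None


# A few lr values from the same backtrace.
for lr in (0xfffffe0019c6a62c, 0xfffffe0019957a74,
           0xfffffe00164b53f8, 0xfffffe0017b5ebc8):
    name, offset = locate(lr)
    where = f"{name} + {offset:#x}" if name else "not in a listed kext"
    print(f"{lr:#x}: {where}")
```

Addresses that fall outside every listed range are not in any of these kexts.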

@JMoVS
Contributor Author

JMoVS commented Aug 4, 2021

stack trace with newest code drop (rc3 + cherry pick)

panic(cpu 0 caller 0xfffffe001a40d320): Invalid kernel stack pointer (probable overflow). at pc 0xfffffe0019c0f0d8, lr 0x5ebafe001b1f5b94 (saved state: 0xfffffe302908bcb0)
	  x0: 0xfffffe001dd483c0  x1:  0x000000000000000a  x2:  0x0000000000000000  x3:  0x000002725ba130e1
	  x4: 0x0000000000000000  x5:  0xfffffe3fe94b80a0  x6:  0xfffffe3fe94b8070  x7:  0xfffffe3fe94b80d8
	  x8: 0xfffffe001dd46240  x9:  0x0000000000002180  x10: 0xfffffe001dd48180  x11: 0x0000008000000000
	  x12: 0x0000000000000000 x13: 0x0000000000000001  x14: 0x00000000604013c8  x15: 0x0000000000000000
	  x16: 0xc08ffe0019d74d7c x17: 0x0000000000004a0e  x18: 0x0000000000000000  x19: 0xfffffe1670d28660
	  x20: 0xfffffe166eb3b960 x21: 0x0000000000000000  x22: 0x000002725ba130e1  x23: 0x000000000000000a
	  x24: 0x0000000000000000 x25: 0x0000000000000203  x26: 0xfffffe001d994000  x27: 0xfffffe001cb05000
	  x28: 0xfffffe001d9e5000 fp:  0xfffffe3fe94b8060  lr:  0x5ebafe001b1f5b94  sp:  0xfffffe3fe94b7c20
	  pc:  0xfffffe0019c0f0d8 cpsr: 0x204013c8         esr: 0x96000047          far: 0xfffffe3fe94b7c28

Debugger message: panic
Memory ID: 0x6
OS release type: User
OS version: 20G80
Kernel version: Darwin Kernel Version 20.6.0: Wed Jun 23 00:26:27 PDT 2021; root:xnu-7195.141.2~5/RELEASE_ARM64_T8101
Fileset Kernelcache UUID: E46841F89DC3FD7ACEC6F404AC995579
Kernel UUID: AC4A14A7-8A8E-3AE6-85A6-55E6B2502BF9
iBoot version: iBoot-6723.140.2
secure boot?: YES
Paniclog version: 13
KernelCache slide: 0x0000000011ff0000
KernelCache base:  0xfffffe0018ff4000
Kernel slide:      0x0000000012b38000
Kernel text base:  0xfffffe0019b3c000
Kernel text exec base:  0xfffffe0019c08000
mach_absolute_time: 0x2725ba8b833
Epoch Time:        sec       usec
  Boot    : 0x6109175b 0x0004f813
  Sleep   : 0x00000000 0x00000000
  Wake    : 0x00000000 0x00000000
  Calendar: 0x610acd2d 0x000d2425

CORE 0 recently retired instr at 0xfffffe0019d796a4
CORE 1 recently retired instr at 0xfffffe0019d7ad6c
CORE 2 recently retired instr at 0xfffffe0019d7ad6c
CORE 3 recently retired instr at 0xfffffe0019d7ad6c
CORE 4 recently retired instr at 0xfffffe0019d7ad70
CORE 5 recently retired instr at 0xfffffe0019d7ad70
CORE 6 recently retired instr at 0xfffffe0019d7ad70
CORE 7 recently retired instr at 0xfffffe0019d7ad70
CORE 0 PVH locks held: None
CORE 1 PVH locks held: None
CORE 2 PVH locks held: None
CORE 3 PVH locks held: None
CORE 4 PVH locks held: None
CORE 5 PVH locks held: None
CORE 6 PVH locks held: None
CORE 7 PVH locks held: None
CORE 0 is the one that panicked. Check the full backtrace for details.
CORE 1: PC=0xfffffe0019cb3af4, LR=0xfffffe0019cb3af0, FP=0xfffffe3fea183be0
CORE 2: PC=0x0000000180b40384, LR=0x00000001980086b0, FP=0x000000016f0ed1f0
CORE 3: PC=0xfffffe0019ee6924, LR=0xfffffe0019d6602c, FP=0xfffffe30b576bd70
CORE 4: PC=0xfffffe0019c81a64, LR=0xfffffe0019c81a5c, FP=0xfffffe3ff7773ee0
CORE 5: PC=0xfffffe0019c81a64, LR=0xfffffe0019c81a5c, FP=0xfffffe3fe9123ee0
CORE 6: PC=0xfffffe0019c81a64, LR=0xfffffe0019c81a5c, FP=0xfffffe3fe989bee0
CORE 7: PC=0xfffffe0019c81a64, LR=0xfffffe0019c81a5c, FP=0xfffffe3069c2bee0
Panicked task 0xfffffe16676bc150: 5089 pages, 13 threads: pid 333: ArqAgent
Panicked thread: 0xfffffe1670d28660, backtrace: 0xfffffe302908b5c0, tid: 915160
		  lr: 0xfffffe0019c56b68  fp: 0xfffffe302908b630
		  lr: 0xfffffe0019c5694c  fp: 0xfffffe302908b6a0
		  lr: 0xfffffe0019d801c8  fp: 0xfffffe302908b6c0
		  lr: 0xfffffe001a40d374  fp: 0xfffffe302908b6e0
		  lr: 0xfffffe0019c0f9bc  fp: 0xfffffe302908b6f0
		  lr: 0xfffffe0019c565dc  fp: 0xfffffe302908ba80
		  lr: 0xfffffe0019c565dc  fp: 0xfffffe302908baf0
		  lr: 0xfffffe001a408e80  fp: 0xfffffe302908bb10
		  lr: 0xfffffe001a40d320  fp: 0xfffffe302908bc80
		  lr: 0xfffffe0019d74734  fp: 0xfffffe302908bc90
		  lr: 0xfffffe0019c0f99c  fp: 0xfffffe302908bca0
		  lr: 0xfffffe001b1f5b94  fp: 0xfffffe3fe94b8060
		  lr: 0xfffffe0019d75464  fp: 0xfffffe3fe94b8160
		  lr: 0xfffffe0019c7e110  fp: 0xfffffe3fe94b81f0
		  lr: 0xfffffe0019c7dda8  fp: 0xfffffe3fe94b8280
		  lr: 0xfffffe0019c7c86c  fp: 0xfffffe3fe94b82d0
		  lr: 0xfffffe0019c0faa8  fp: 0xfffffe3fe94b82e0
		  lr: 0xfffffe0018e04774  fp: 0xfffffe3fe94b8720
		  lr: 0xfffffe0018de7428  fp: 0xfffffe3fe94b87a0
		  lr: 0xfffffe0018dd6e4c  fp: 0xfffffe3fe94b8960
		  lr: 0xfffffe0018dc3ee8  fp: 0xfffffe3fe94b8af0
		  lr: 0xfffffe0018d935e0  fp: 0xfffffe3fe94b8c90
		  lr: 0xfffffe0018d95664  fp: 0xfffffe3fe94b8e60
		  lr: 0xfffffe0018c548c4  fp: 0xfffffe3fe94b8f10
		  lr: 0xfffffe0018bfb324  fp: 0xfffffe3fe94b9030
		  lr: 0xfffffe0018bfada4  fp: 0xfffffe3fe94b9070
		  lr: 0xfffffe0018bed48c  fp: 0xfffffe3fe94b9180
		  lr: 0xfffffe0018bef23c  fp: 0xfffffe3fe94b9240
		  lr: 0xfffffe0018bf14f4  fp: 0xfffffe3fe94b9750
		  lr: 0xfffffe0018c0faf0  fp: 0xfffffe3fe94b9870
		  lr: 0xfffffe0018c0f0e0  fp: 0xfffffe3fe94b99c0
		  lr: 0xfffffe0018c1def0  fp: 0xfffffe3fe94b9ad0
		  lr: 0xfffffe0018c1f3c8  fp: 0xfffffe3fe94b9b50
		  lr: 0xfffffe0018d9e4e8  fp: 0xfffffe3fe94b9bf0
		  lr: 0xfffffe0018da1a20  fp: 0xfffffe3fe94b9c90
		  lr: 0xfffffe001c126fcc  fp: 0xfffffe3fe94b9d10
		  lr: 0xfffffe001c12ba74  fp: 0xfffffe3fe94b9dc0
		  lr: 0xfffffe001c136530  fp: 0xfffffe3fe94b9e80
		  lr: 0xfffffe0019ee2d1c  fp: 0xfffffe3fe94b9ee0
		  lr: 0xfffffe0019edbb50  fp: 0xfffffe3fe94b9f20
		  lr: 0xfffffe001c7ab4dc  fp: 0xfffffe3fe94b9fa0
		  lr: 0xfffffe001c7ab118  fp: 0xfffffe3fe94ba060
		  lr: 0xfffffe001c8a0fb4  fp: 0xfffffe3fe94ba120
		  lr: 0xfffffe001c8a0694  fp: 0xfffffe3fe94ba250
		  lr: 0xfffffe001c87d548  fp: 0xfffffe3fe94ba320
		  lr: 0xfffffe001c87d44c  fp: 0xfffffe3fe94ba350
		  lr: 0xfffffe001c885eec  fp: 0xfffffe3fe94ba470
		  lr: 0xfffffe001c872cec  fp: 0xfffffe3fe94ba490
		  lr: 0xfffffe001c8a5564  fp: 0xfffffe3fe94ba550
		  lr: 0xfffffe001c8a1070  fp: 0xfffffe3fe94ba610
      Kernel Extensions in backtrace:
         com.apple.iokit.IOStorageFamily(2.1)[A41978E8-2B18-341C-8935-2AACD1565F0F]@0xfffffe001c120000->0xfffffe001c13ffff
         com.apple.driver.AppleT8103CLPCv3(1.0)[87EC4A6E-F110-3054-8106-3FCBFF333A59]@0xfffffe001b1dc000->0xfffffe001b20ffff
            dependency: com.apple.driver.AppleARMPlatform(1.0.2)[5E9CCC2E-8DAD-3602-9B36-6A976B6F7995]@0xfffffe001a564000->0xfffffe001a5b3fff
            dependency: com.apple.driver.ApplePMGR(1)[5AE074DE-9313-34A2-A16E-854B40FB4625]@0xfffffe001af34000->0xfffffe001af6ffff
            dependency: com.apple.iokit.IOReportFamily(47)[11A4640E-66CF-399D-BD06-F13C57BF7D16]@0xfffffe001c074000->0xfffffe001c077fff
            dependency: com.apple.iokit.IOSurface(290.8.1)[E1A3CC6B-7556-3503-BB75-53264C4F53AA]@0xfffffe001c148000->0xfffffe001c167fff
            dependency: com.apple.kec.Libm(1)[6655447E-7F98-322A-A310-6E1F0203833A]@0xfffffe001c630000->0xfffffe001c633fff
         com.apple.filesystems.apfs(1677.141.1)[CC40AF90-D379-339D-857A-BCFCC5378DF6]@0xfffffe001c7a8000->0xfffffe001c8bbfff
            dependency: com.apple.driver.AppleEffaceableStorage(1.0)[608C5FF9-9F38-3E74-8CE1-F90AFF18F8ED]@0xfffffe001aa50000->0xfffffe001aa57fff
            dependency: com.apple.iokit.CoreAnalyticsFamily(1)[34337684-DA05-3170-8DDB-259DFEB3D159]@0xfffffe001b5c8000->0xfffffe001b5cffff
            dependency: com.apple.iokit.IOStorageFamily(2.1)[A41978E8-2B18-341C-8935-2AACD1565F0F]@0xfffffe001c120000->0xfffffe001c13ffff
            dependency: com.apple.kec.corecrypto(11.1)[7740FC9A-DE55-35F9-99FB-46C0F386BB51]@0xfffffe001c8fc000->0xfffffe001c947fff
         org.openzfsonosx.zfs(2.1)[4BC82C1C-30A2-373A-AB8C-C78E58BA4B32]@0xfffffe0018be8000->0xfffffe0018ecbfff
            dependency: com.apple.iokit.IOStorageFamily(2.1)[A41978E8-2B18-341C-8935-2AACD1565F0F]@0xfffffe001c120000->0xfffffe001c13ffff

last started kext at 1799898525404: com.apple.driver.CoreStorageFsck	554.140.2 (addr 0xfffffe0019654000, size 16384)
loaded kexts:
org.openzfsonosx.zfs	2.1.0
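
The esr value in these panics (0x96000047) can be split into its fields to see what kind of fault was taken. The sketch below assumes the usual AArch64 ESR_EL1 layout (EC in bits 31:26, IL in bit 25, and for data aborts WnR in bit 6 and DFSC in bits 5:0); only the fields needed here are decoded, and the name tables are deliberately partial.

```python
# Partial name tables: only the values that occur in this log.
EC_NAMES = {0x25: "Data Abort taken without a change in exception level"}
DFSC_NAMES = {0x07: "Translation fault, level 3"}


def decode_esr(esr):
    """Split an ESR_EL1 value into the fields relevant to a data abort."""
    iss = esr & 0x1ffffff
    return {
        "ec": (esr >> 26) & 0x3f,   # exception class
        "il": (esr >> 25) & 1,      # instruction length bit
        "wnr": (iss >> 6) & 1,      # 1 = the faulting access was a write
        "dfsc": iss & 0x3f,         # data fault status code
    }


# Values from the Aug 4 panic above.
esr = 0x96000047
sp = 0xfffffe3fe94b7c20
far = 0xfffffe3fe94b7c28

fields = decode_esr(esr)
print(EC_NAMES.get(fields["ec"], hex(fields["ec"])))
print(DFSC_NAMES.get(fields["dfsc"], hex(fields["dfsc"])))
print("write access" if fields["wnr"] else "read access")
print("far - sp =", far - sp)
```

Under that reading, the fault is a write to an unmapped address 8 bytes above sp, which fits the "probable overflow" wording of the panic string.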

@lundman
Contributor

lundman commented Aug 4, 2021

vmem_init.initial_default_block (in zfs) + 10709864
vmem_init.initial_default_block (in zfs) + 10709324
vmem_init.initial_default_block (in zfs) + 11928008
0xfffffe001a40d374
vmem_init.initial_default_block (in zfs) + 10418620
vmem_init.initial_default_block (in zfs) + 10708444
vmem_init.initial_default_block (in zfs) + 10708444
0xfffffe001a408e80
0xfffffe001a40d320
vmem_init.initial_default_block (in zfs) + 11880244
vmem_init.initial_default_block (in zfs) + 10418588
0xfffffe001b1f5b94
vmem_init.initial_default_block (in zfs) + 11883620
vmem_init.initial_default_block (in zfs) + 10871056
vmem_init.initial_default_block (in zfs) + 10870184
vmem_init.initial_default_block (in zfs) + 10864748
vmem_init.initial_default_block (in zfs) + 10418856
aes_encrypt_block (in zfs) (aes_impl.c:140)
ccm_decrypt_final (in zfs) (ccm.c:542)
aes_decrypt_atomic (in zfs) (aes.c:1160)
crypto_decrypt (in zfs) (kcf_cipher.c:680)
zio_do_crypt_uio (in zfs) (zio_crypt.c:462)
zio_do_crypt_data (in zfs) (zio_crypt.c:1992)
spa_do_crypt_abd (in zfs) (dsl_crypt.c:2835)
arc_hdr_decrypt (in zfs) (arc.c:1879)
arc_fill_hdr_crypt (in zfs) (arc.c:1961)
arc_buf_fill (in zfs) (arc.c:2056)
arc_buf_alloc_impl (in zfs) (arc.c:2883)
arc_read (in zfs) (arc.c:6055)
dbuf_read_impl (in zfs) (dbuf.c:1527)
dbuf_read (in zfs) (dbuf.c:1658)
dmu_buf_hold_array_by_dnode (in zfs) (dmu.c:573)
dmu_read_uio_dnode (in zfs) (dmu.c:1207)
zvol_os_read_zv (in zfs) (zvol_os.c:331)
org_openzfsonosx_zfs_zvol_device::doAsyncReadWrite(IOMemoryDescriptor*, unsigned long long, unsigned long long, IOStorageAttributes*, IOStorageCompletion*) (in zfs) (zvolIO.cpp:0)
0xfffffe001c126fcc
0xfffffe001c12ba74
0xfffffe001c136530
vmem_init.initial_default_block (in zfs) + 13380892
vmem_init.initial_default_block (in zfs) + 13351760
0xfffffe001c7ab4dc
0xfffffe001c7ab118
0xfffffe001c8a0fb4
0xfffffe001c8a0694
0xfffffe001c87d548
0xfffffe001c87d44c
0xfffffe001c885eec
0xfffffe001c872cec
0xfffffe001c8a5564
0xfffffe001c8a1070

@lundman
Contributor

lundman commented Aug 4, 2021

128    aes_encrypt_block
128    ccm_decrypt_final
448    aes_decrypt_atomic
400    crypto_decrypt
416    zio_do_crypt_uio
464    zio_do_crypt_data
176    spa_do_crypt_abd
288    arc_hdr_decrypt
64    arc_fill_hdr_crypt
272    arc_buf_fill
192    arc_buf_alloc_impl
1296    arc_read
288    dbuf_read_impl
336    dbuf_read
272    dmu_buf_hold_array_by_dnode
128    dmu_read_uio_dnode
160    zvol_os_read_zv
160    org_openzfsonosx_zfs_zvol_device::doAsyncReadWrite
Total: 5616 bytes

@lundman
Contributor

lundman commented Aug 5, 2021

Using the fp: values to show stack frame sizes (current fp, previous fp, frame size in decimal bytes), we have:

0xfffffe3fe94b8160 0xfffffe3fe94b8060 256
0xfffffe3fe94b81f0 0xfffffe3fe94b8160 144
0xfffffe3fe94b8280 0xfffffe3fe94b81f0 144
0xfffffe3fe94b82d0 0xfffffe3fe94b8280 80
0xfffffe3fe94b82e0 0xfffffe3fe94b82d0 16
0xfffffe3fe94b8720 0xfffffe3fe94b82e0 1088
0xfffffe3fe94b87a0 0xfffffe3fe94b8720 128 aes_encrypt_block
0xfffffe3fe94b8960 0xfffffe3fe94b87a0 448
0xfffffe3fe94b8af0 0xfffffe3fe94b8960 400
0xfffffe3fe94b8c90 0xfffffe3fe94b8af0 416
0xfffffe3fe94b8e60 0xfffffe3fe94b8c90 464
0xfffffe3fe94b8f10 0xfffffe3fe94b8e60 176
0xfffffe3fe94b9030 0xfffffe3fe94b8f10 288
0xfffffe3fe94b9070 0xfffffe3fe94b9030 64
0xfffffe3fe94b9180 0xfffffe3fe94b9070 272
0xfffffe3fe94b9240 0xfffffe3fe94b9180 192
0xfffffe3fe94b9750 0xfffffe3fe94b9240 1296 arc_read
0xfffffe3fe94b9870 0xfffffe3fe94b9750 288
0xfffffe3fe94b99c0 0xfffffe3fe94b9870 336
0xfffffe3fe94b9ad0 0xfffffe3fe94b99c0 272
0xfffffe3fe94b9b50 0xfffffe3fe94b9ad0 128
0xfffffe3fe94b9bf0 0xfffffe3fe94b9b50 160
0xfffffe3fe94b9c90 0xfffffe3fe94b9bf0 160 org_openzfsonosx_zfs_zvol_device::doAsyncReadWrite
0xfffffe3fe94b9d10 0xfffffe3fe94b9c90 128
0xfffffe3fe94b9dc0 0xfffffe3fe94b9d10 176
0xfffffe3fe94b9e80 0xfffffe3fe94b9dc0 192
0xfffffe3fe94b9ee0 0xfffffe3fe94b9e80 96
0xfffffe3fe94b9f20 0xfffffe3fe94b9ee0 64
0xfffffe3fe94b9fa0 0xfffffe3fe94b9f20 128
0xfffffe3fe94ba060 0xfffffe3fe94b9fa0 192
0xfffffe3fe94ba120 0xfffffe3fe94ba060 192
0xfffffe3fe94ba250 0xfffffe3fe94ba120 304
0xfffffe3fe94ba320 0xfffffe3fe94ba250 208
0xfffffe3fe94ba350 0xfffffe3fe94ba320 48
0xfffffe3fe94ba470 0xfffffe3fe94ba350 288
0xfffffe3fe94ba490 0xfffffe3fe94ba470 32
0xfffffe3fe94ba550 0xfffffe3fe94ba490 192
0xfffffe3fe94ba610 0xfffffe3fe94ba550 192

cat p | awk '{ print $4;}' | perl -lne 'use bignum qw/hex/; if ($.==1){$p=$_} else{ printf("$_ $p %d\n", (hex $_ - hex $p)); $p=$_} END{print $p}'
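
The same frame-size calculation can be sketched in Python, which may be easier to adapt than the one-liner. This is only an illustration: the list holds the first few fp values from the listing above, and the function name `frame_sizes` is made up for this example.

```python
# Frame-pointer values copied from the first rows of the listing above,
# lowest address first. Each frame's size is the distance between two
# neighbouring frame pointers.
FRAME_POINTERS = [
    "0xfffffe3fe94b8060",
    "0xfffffe3fe94b8160",
    "0xfffffe3fe94b81f0",
    "0xfffffe3fe94b8280",
    "0xfffffe3fe94b82d0",
    "0xfffffe3fe94b82e0",
    "0xfffffe3fe94b8720",
    "0xfffffe3fe94b87a0",
]


def frame_sizes(frame_pointers):
    """Return (current_fp, previous_fp, size_in_bytes) for each adjacent pair."""
    values = [int(fp, 16) for fp in frame_pointers]
    return [
        (hex(cur), hex(prev), cur - prev)
        for prev, cur in zip(values, values[1:])
    ]


if __name__ == "__main__":
    for cur, prev, size in frame_sizes(FRAME_POINTERS):
        # Sizes are printed in decimal bytes.
        print(f"{cur} {prev} {size}")
```

Feeding it every fp value from the panic log should give one row per frame, in the same column order as the listing.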

lundman pushed a commit to openzfsonosx/openzfs-fork that referenced this issue Feb 22, 2022
In openzfsonosx/openzfs#90
a user reported panics on an M1 with the message

"Invalid kernel stack pointer (probable overflow)."

In at least several of these a deep multi-arena allocation
was in progress (several vmem_alloc/vmem_xalloc reaching
all the way down through vmem_bucket_alloc,
xnu_alloc_throttled, and ultimately to osif_malloc).

The stack frames above the first vmem_alloc were also fairly large.

This commit sets a dynamically sysctl-tunable threshold
(8k default) for remaining stack size as reported by xnu.
If we do not have more bytes than that when vmem_alloc()
is called, then the actual allocation will be done in a
separate worker thread which will start with a nearly
empty stack that is much more likely to hold the various
frames all the way through our code boundary with the
kernel and beyond.

The xnu / mach thread_call API (osfmk/kern/thread_call.h)
is used to avoid circular dependencies with taskq, and the
mechanism is per-arena costing a quick stack-depth check
per vmem_alloc() but allowing for wildly varying stack
depths above the first vmem_alloc() call.

Vmem arenas now have two further kstats: the lowest amount
of available stack space seen at a vmem_alloc() into it,
and the number of times the allocation work has been done
in a thread_call worker.

* some spl_vmem.c functions are given inline hints

These are small functions with no or very few automatic
variables that were good candidates for clang/llvm's
inlining heuristics before we switched to building
the kext with -finline-hint-functions.

* remove some (unrelated) unused variables which escaped
previous commits, eliminating a couple compile-time
warnings.
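
As a rough illustration of the threshold idea in this commit message (not the actual spl_vmem.c code), here is a toy Python model. `Arena`, `alloc`, and the `stack_remaining` argument are invented for the sketch; in the kext the remaining stack is reported by xnu and the hand-off uses thread_call rather than a Python thread.

```python
import threading

SPLIT_STACK_BELOW = 8 * 1024  # the commit's default threshold (8k)


class Arena:
    """Toy stand-in for a vmem arena; it only tracks the two new statistics."""

    def __init__(self, name):
        self.name = name
        self.lowest_stack_seen = None  # lowest remaining stack seen at alloc time
        self.worker_allocs = 0         # allocations handed off to a worker

    def _do_alloc(self, size):
        # Placeholder for the real (deep, multi-arena) allocation work.
        return bytearray(size)

    def alloc(self, size, stack_remaining):
        """Allocate directly, or in a fresh thread when little stack is left.

        stack_remaining is supplied by the caller in this sketch.
        """
        if self.lowest_stack_seen is None or stack_remaining < self.lowest_stack_seen:
            self.lowest_stack_seen = stack_remaining

        if stack_remaining > SPLIT_STACK_BELOW:
            return self._do_alloc(size)

        # Not enough headroom: do the work on a thread that starts with a
        # nearly empty stack, and wait for it to finish.
        result = []
        worker = threading.Thread(
            target=lambda: result.append(self._do_alloc(size)))
        worker.start()
        worker.join()
        self.worker_allocs += 1
        return result[0]
```

The check costs one comparison per allocation, which matches the trade-off the commit message describes: a cheap test on every call in exchange for tolerating very different stack depths above the first vmem_alloc().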
lundman pushed a commit to openzfsonosx/openzfs-fork that referenced this issue Sep 27, 2022
lundman added a commit to openzfsonosx/openzfs-fork that referenced this issue Sep 28, 2022
Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: configure.ac add cmd/os/macos/zsysctl

Upstream: configure.ac changes

Upstream: Makefile

Upstream: Add macOS to headers

Attempt to group most of the sweeping changes to headers in there,
unless they fit better with an individual commit

Signed-off-by: Jorgen Lundman <[email protected]>

It appears FreeBSD did the same for zfs_ioctl_register_dataset_nolog()
as they use it, so following suit for zfs_ioctl_register_pool()

Upstream: macOS default mount is /Volumes

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: add IO calls for iokit

Is this the best way? We could add ", func, private" to the
existing IO, and either send by uio, or by func(private).

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: Allow cmd/zfs mount unmount of snapshots

"zfs mount dataset@snapshot" as mounting of snapshot has to be done
manually from userland in macOS.

Add zfs_rollback_os() call to the rollback logic, so platforms can
handle their specific requirements.

macOS: need to kick Finder to update.

Signed-off-by: Jorgen Lundman <[email protected]>

upstream: hack - retry destroy until diskarb goes away

A more portable solution is perhaps desired.

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: Add macOS support

Add all files required for the macOS port. Add new cmd/os/ for tools
which are only expected to be used on macOS.

This supports all macOS versions up to Big Sur (11.x).

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: Additional work

macOS: employ advanced lock synchronisation

macOS: handle additional lookups to delay waiting for mount

macOS: handle rapid snapshot auto mounts

Re-implement snapshots to always mount on "lookup()". This handles the
deadlock when cwd is changed to the snapshot directory before mount.

Then add some logic to attempt to not-mount in some situations, i.e.
listing inside the ".zfs/snapshot" directory. If a listing there is
started, we ignore mount requests until it is complete, by
storing the thread ID and pid of the listing process.

Any access below ".zfs/snapshot" will clear the ignore, i.e. cause
the mount to happen.

macOS: userland unmount to disable auto_snapshot

to avoid triggering a mount. Also make kernel remember 5 pid+tid to
ignore.

macOS: Do not truncate returned name in case-correcting lookups

macOS: also don't truncate further down

macOS: fix leak in ldi handle_set_wce_iokit

The parent device needs to be released if it was retained.

macOS: add zvol_os_is_zvol()

Or we are unable to create zpools inside zvols.

Also cleanup zvolIO.cpp to be cstyle compliant and correcting
obvious leaks.

macOS: fix zfs_vnop_lookup() and linkid

zfs_vnop_lookup() failed to "remember" the name used to lookup in the
cache_lookup() success case, making us return the incorrect name in
future zfs_vnop_getattr() calls, most noticeably in realpath().

linkid logic for Finder was not converting XNU inode to avoid the
first 16 inodes.

macOS: Return nametoolong when formD is lacking space

Originally it was returning "Operation not supported" which isn't quite as
useful to the user.

Hopefully nothing checks that it must return ENOTSUP.

macOS: change vnop_lookup to use cache.

To give formD and formC more room to work with, we
always allocate MAXPATHLEN, so we might as well use a kmem_cache.

macOS: rmdir -p is far too eager

macOS: dir link count doesn't count files.

To be like upstream:

drwxr-xr-x  2 root  wheel   2 Jun 16 17:37 .
touch a
drwxr-xr-x  2 root  wheel   3 Jun 16 17:37 .

Where 2nd field is "number of directories" (2) and
5th field is "number of files and directories" (3)

macOS: move sa_setup() to after zap_lookup()

This is the order Linux calls them, so we should minimise differences.

macOS: clean up handling of readonly with vfs_mount

to follow what upstream does.

macOS: parentID also needs to be mapped to XNU id

macOS: add cmd/os/macos/zsysctl

macOS: bring in cmd/os/macos/zsysctl and mount_zfs

macOS: Makefile.am for mount_zfs [squash]

macOS: squash

macOS: strip selinux functions [squash]

macOS: move getmntany into libzfs

zvol.c change

fix zfs.h

macOS: run zsysctl if /etc/zfs/zsysctl.conf exists

macOS: re-implement most of xattrs

We had some differences between how ZOL and macOS behaved when going
between xattr=sa and xattr=on datasets (send/recv), and a fairly large
amount of duplicate code.

Take ZOL zpl_xattr for the sa/on logic, change it to take "uio" for
the data buffer. Also pass in "cr" as we can.

The finderinfo logic stays in the vnop handlers, leaving the imported
source very close to ZOL.

Everything with xattrs, and decmpfs needs to be tested :)

macOS: Add uio type for IOKit iomem support

Add another UIO seg type, UIO_FUNCSPACE (alongside UIO_SYSSPACE and
UIO_USERSPACE), to handle the IOKit IOMemoryDescriptor type. When zvolIO
needs to issue IO on volumes, it will set up a uio with iov_base as "iomem".
As dmu_read_uio_dnode() (and write) filters down to zfs_uiomove(),
spl-uio will handle the type by calling the registered IO function
"zvolIO_strategy" instead of memcpy/bcopy calls.

zvolIO_strategy() will call iomem->writeBytes (readBytes) as required.
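
To make the dispatch concrete, here is a small Python model of the idea; it is a sketch, not the spl-uio code. All class and function names in it (`UioSeg`, `Uio`, `uiomove`, `FakeIOMemoryDescriptor`, `zvol_strategy`) are hypothetical stand-ins, and only the read direction (data copied out to the caller) is shown.

```python
from enum import Enum


class UioSeg(Enum):
    USERSPACE = 0
    SYSSPACE = 1
    FUNCSPACE = 2  # iov_base is an opaque handle, not addressable memory


class Uio:
    def __init__(self, segflg, iov_base, func=None):
        self.segflg = segflg
        self.iov_base = iov_base  # a buffer, or a handle for FUNCSPACE
        self.func = func          # registered IO function for FUNCSPACE
        self.offset = 0


def uiomove(data, uio):
    """Move `data` into the uio's destination."""
    if uio.segflg is UioSeg.FUNCSPACE:
        # Hand the bytes to the registered callback instead of copying.
        uio.func(uio.iov_base, uio.offset, data)
    else:
        # Plain memory: the equivalent of a memcpy/bcopy.
        uio.iov_base[uio.offset:uio.offset + len(data)] = data
    uio.offset += len(data)


class FakeIOMemoryDescriptor:
    """Hypothetical stand-in for an IOKit IOMemoryDescriptor."""

    def __init__(self, size):
        self.buf = bytearray(size)

    def write_bytes(self, offset, data):
        self.buf[offset:offset + len(data)] = data


def zvol_strategy(iomem, offset, data):
    """Stand-in for the registered IO function."""
    iomem.write_bytes(offset, data)
```

The point of the extra segment type is that the common DMU code never needs to know about IOKit; only the lowest-level move routine branches on the type.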

Model zvol_os.c calls zvol_os_read_zv() (and write) on ZOL sources again
to ensure as little divergence as possible.

Restore dmu.c to contain no macOS changes

macOS: Fix abd leak, kmem_free correct size of abd_t

... for macOS and FreeBSD, and improve macOS abd performance (#56)

* Cleanup in macos abd_os.c to fix abd_t leak

Fix a leak of abd_t that manifested mostly when using
raidzN with at least as many columns as N (e.g. a
four-disk raidz2 but not a three-disk raidz2).
Sufficiently heavy raidz use would eventually run a system
out of memory.

The leak was introduced as a fix for a panic caused by
calculating the wrong size of an abd_t at free time if the
abd_t had been made using abd_get_offset_impl, since it
carried along the unnecessary tails of large ABDs, leading
to a mismatch between abd->abd_size and the original
allocation size of the abd_t.  This would feed kmem_free a
bad size, which produces a heap corruption panic.

The fix now carries only the necessary chunk pointers,
leading to smaller abd_ts (especially those of
abd_get_zeros() ABDs) and a performance gain from the
reduction in copying and allocation activity.

We now calculate the correct size for the abd_t at free time.

This requires passing the number of bytes wanted in a
scatter ABD to abd_get_offset_scatter().

Additionally:

* Switch abd_cache arena to FIRSTFIT, which empirically
improves performance.

* Make abd_chunk_cache more performant and debuggable.

* Allocate the abd_zero_buf from abd_chunk_cache rather
than the heap.

* Don't try to reap non-existent qcaches in abd_cache arena.

* KM_PUSHPAGE->KM_SLEEP when allocating chunks from their
own arena

- having fixed the abd leaks, return to using KMF_LITE,
but leave a commented example of audit kmem debugging

- having made this work, abd_orig_size is no longer needed
as a way to track the size originally kmem_zalloc-ed for
a scatter abd_t

* Update FreeBSD abd_os.c with the fix, and let Linux build

* Minimal change to fix FreeBSD's abd_get_offset_scatter()
carrying too many chunks for the desired ABD size

* A size argument is added to abd_get_offset_scatter() for
FreeBSD and macOS, which is unused by Linux

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: ASM changes to support macOS

Due to some differences in how the assemblers work, macOS will have its own copies.
It would be desirable to change all assembler files to use asm_linkage.h
and the macros inside for better portability.

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: module/zfs/spa.c

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: zfs-tests to support macOS

Start to add macOS support to the zfs-tester environment, much more work
is required still.

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: Changes to dprintf for macOS

Prefer to always have the option to turn printfs on, even in RELEASE
builds

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: macOS currently has own zfs_fsync

Hoping to remove it eventually.

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: work around different API for sbuf_finish()

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: Why is linux even trying to look at etc/launchd

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: Missing empty taskq for userland

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: Add crypto errata1 for projectquota-less datasets

There was a short window of 2.0 releases before rc4 where a crypto
dataset would enable projectquota but fail to start it. Add
a work-around for that issue. It is expected this commit will
be removed in the near future.

Datasets with crypto will generate the proper local_mac, and will not
be able to be imported with the broken 2.0 version.

Fixed datasets should work on other platforms again.

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: import -d does not go through os/macos/ sources

On macOS we need to prioritise /dev/disk over /dev/rdisk, but
the common code makes no adjustment based on os preferred names.

Possibly we should call an os/ function to set
the priority.

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: Test for NULL vd

It seems we managed to get a deadman triggered during export?

 : 0xffffff8004ebda40 mach_kernel : _return_from_trap + 0xe0
 : 0xffffff7f8942bbbf org.openzfsonosx.zfs : _vdev_deadman + 0x1f
 : 0xffffff7f8941149a org.openzfsonosx.zfs : _spa_deadman + 0xca
 : 0xffffff7f896a6246 org.openzfsonosx.zfs : _taskq_thread + 0x4a6

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: zdb inode mapping fix

Upstream: realpath vdev directory paths

This is already the behavior of
zpool_find_import_scan, so do the same in
make_leaf_vdev and zfs_strcmp_pathname.

On macOS, /var is a symlink to private/var so when
the user inputs an import path starting with /var,
it is eventually converted automatically by zfs to
the realpath starting with /private/var. This
causes problems later finding vdevs as string
comparisons between paths starting with
/private/var and paths starting with /var fail, so
make sure we are always using the vdev directory's
realpath. Note that basenames are preserved so as
not to compromise invariant symlinks.
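
A minimal Python sketch of the approach described above: resolve symlinks in the directory part of a vdev path only, and keep the basename untouched. The function name `canonical_vdev_path` is made up for this example and is not a libzfs function.

```python
import os


def canonical_vdev_path(path):
    """Resolve symlinks in the directory part only; keep the basename as given."""
    directory, base = os.path.split(path)
    return os.path.join(os.path.realpath(directory), base)


if __name__ == "__main__":
    # On macOS, where /var is a symlink to private/var, this would be
    # expected to print a path under /private/var with the same basename.
    print(canonical_vdev_path("/var/run/disk/by-id/example"))
```

Because the basename is never resolved, an invariant symlink keeps its stable name even though the directory it lives in is canonicalised.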

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: dirname -> zfs_dirnamelen [squash]

Forgot to actually change it to zfs_dirnamelen

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: set default macOS invariant disks path

InvariantDisk (udev analogue for macOS) does not
use /dev/disk in order to avoid subdirectories in
/dev. Instead, the default path for the invariant
symlinks is /var/run/disk/by-*, a root owned
temporary directory.

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: stub-out zpool_read_label for APPLE

It does not work for macOS platform, we have our own based on the
old pre-lio style.

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: cppcheck fixes

Upstream: fit in with recent man page changes

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: use correct libcurl.4.dylib name

This fix isn't exactly great either.

macOS: destroy snapshots

squash

renamed zpool_disable_volume_os

macOS: rename zed zvol symlink script and variables

macOS: handle 2-arg pthread_setname_np()

By taking it out completely.

macOS: Add snapshot and zvol events (uio.h fixes)

It turns out that it could not see readv/writev because
our macos/sys/uio.h was testing for _LIBSPL_SYS_UIO_H, as
set by the top-level libspl/include/sys/uio.h, and was therefore
skipped over if the includes came in the wrong order.

Upstream: libzfs.h abi requires changes

macOS: compile fixes after rebase

macOS: changes to zfs_file after rebase

macOS: compile fixes after rebase

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: Make zvol list be non-static

Until we can agree on a solution that works for everyone.

macOS: rename fallthrough to zfs_fallthrough

macOS: Compile fixes for latest rebase

macOS: Update arcstat and arc_summary

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: Correct CPUID features lookup

Account for surprise A, B, D, C order of registers.

Add fixes to compile on ARM64, but functionality is missing.

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: Set name after zfs_vnop_create() for NFS

NFS would fail with open(..., O_EXCL) if we do not set the name after
zfs_vnop_create(). nfsd handles O_EXCL differently, in that it
always assumes VA_EXCLUSIVE is set (and will receive EEXIST). nfsd
uses atime to store a pseudo-random unique ID, then call VNOP_CREATE().
If it succeeded in creating the file (this nfs client won over any other
nfs clients) then the atime ID will match. nfsd will then call
vnode_setattr() with the correct atime.

If the name is not set by ZFS, it fails before calling vnode_setattr()
with call stack:
  mac_vnode_check_open()
    vn_getpath_ext_with_mntlen()
      build_path_with_parent()

Also correct fhtovp/vptofh to handle XNU remapped inodes.

Remove atime checks for 48bit overflow from pre 64bit days.

zfs_vnop_create() is also given a vattr struct, we should reply with
the attr we handled - this saves XNU from calling fallback setattr().

Clean up zfs_vnop_getattr() to only set va_active for vattrs that were
asked for, rather than blindly setting vattrs. Some xnu code checks
that va_active == va_enabled, so if we set too many it can force
XNU to call fallback.

Actually handle atime in setattr()/getattr(), as it lives in
zp->z_atime struct.

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: fix default ADDEDTIME getattr

Logic would return 0 date reply for entries without ADDEDTIME
(which is only added after moving)

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: Also set name in other vnop create calls.

Unsure if NFS will bug on symlink and link, but we might
as well call update. mknod is handled in the call to create.

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: Add verbose kstat for RAW type

The RAW kstat type as used by nodes like:

kstat.zfs.misc.dbgmsg
kstat.zfs.misc.dbufs

can get really large, and there is no way to skip them when issuing
a "sysctl -a" or similar request. This can slow down the process
considerably, while it holds the locks.

The RAW kstat type will now automatically add a "verbose" leaf as well,
defaulting to "0" (do not display). To see the RAW information set
the verbose value to 1.

kstat.zfs.misc.dbufs.verbose: 0

kstat.zfs.misc.dbufs.dbufs:
pool             objset   object   level    blkid    offset     dbsize
[...]
kstat.zfs.misc.dbufs.verbose: 1

Conveniently, this command works:

sudo sysctl kstat.zfs.misc.dbgmsg.verbose=1 kstat.zfs.misc.dbgmsg.dbgmsg
     kstat.zfs.misc.dbgmsg.verbose=0

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: change wmsum to be a struct

instead of being clever with pointers, and prepare
for possible future expansion.

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: Handle ZFS_MODULE_PARAMS as sysctl, take 2

Modelled on FreeBSD approach, made to work on macOS.

Attempt to stay close to legacy macOS tunable names
but some are now slightly different.

Retire the macOS kstat versions, replace with ZFS_MODULE_IMPL.

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: Also copy out sysctl for ZFS_MODULE_VIRTUAL

Upstream: take out Linux code in zfeature

macOS: move ioctl_fd (back) into libzfs_core

macOS: fix up clock_gettime for zfs-tester

macOS: build fix for monterey

macOS: bring in cmd/os/macos/zsysctl and mount_zfs

macOS: also include all source files

Split deep vmem_alloc()/vmem_xalloc() stacks

In openzfsonosx/openzfs#90
a user reported panics on an M1 with the message

"Invalid kernel stack pointer (probable overflow)."

In at least several of these a deep multi-arena allocation
was in progress (several vmem_alloc/vmem_xalloc reaching
all the way down through vmem_bucket_alloc,
xnu_alloc_throttled, and ultimately to osif_malloc).

The stack frames above the first vmem_alloc were also fairly large.

This commit sets a dynamically sysctl-tunable threshold
(8k default) for remaining stack size as reported by xnu.
If we do not have more bytes than that when vmem_alloc()
is called, then the actual allocation will be done in a
separate worker thread which will start with a nearly
empty stack that is much more likely to hold the various
frames all the way through our code boundary with the
kernel and beyond.

The xnu / mach thread_call API (osfmk/kern/thread_call.h)
is used to avoid circular dependencies with taskq, and the
mechanism is per-arena costing a quick stack-depth check
per vmem_alloc() but allowing for wildly varying stack
depths above the first vmem_alloc() call.

Vmem arenas now have two further kstats: the lowest amount
of available stack space seen at a vmem_alloc() into it,
and the number of times the allocation work has been done
in a thread_call worker.

* some spl_vmem.c functions are given inline hints

These are small functions with no or very few automatic
variables that were good candidates for clang/llvm's
inlining heuristics before we switched to building
the kext with -finline-hint-functions.

* remove some (unrelated) unused variables which escaped
previous commits, eliminating a couple compile-time
warnings.

Sub-PAGE_SIZE ABDs instead of linear ABDs

Previously, when making an ABD of size less than
zfs_abd_chunk_size (4k by default), we would make a linear
ABD, which would allocate memory out of the zio caches.

The subpage chunks are in multiples of SPA_MINBLOCKSIZE,
with each multiple (up to PAGE_SIZE minus
SPA_MINBLOCKSIZE) having its own kmem_cache. These
kmem_caches are parented to a subpage vmem_cache that
takes 128k allocations from the PAGE_SIZE abd_chunk_cache.
ABDs whose size falls within SPA_MINBLOCKSIZE bytes of
PAGE_SIZE and all larger ABDs are served by the PAGE_SIZE
ABD cache.
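
The size-class rule in this commit message can be written out as a short Python sketch. The constants are assumptions for illustration: SPA_MINBLOCKSIZE is taken as 512 bytes and PAGE_SIZE as 4k (matching the default chunk size mentioned above), and `abd_chunk_cache_size` is a made-up name.

```python
SPA_MINBLOCKSIZE = 512  # assumed value for this sketch
PAGE_SIZE = 4096        # assumed 4k, as in the default chunk size above


def abd_chunk_cache_size(abd_size):
    """Return the chunk size of the cache that would back an ABD of abd_size bytes."""
    if abd_size <= 0:
        raise ValueError("ABD size must be positive")
    # Sizes within SPA_MINBLOCKSIZE bytes of PAGE_SIZE, and anything larger,
    # use PAGE_SIZE chunks.
    if abd_size > PAGE_SIZE - SPA_MINBLOCKSIZE:
        return PAGE_SIZE
    # Otherwise round up to the next multiple of SPA_MINBLOCKSIZE; each
    # multiple has its own sub-page cache.
    return -(-abd_size // SPA_MINBLOCKSIZE) * SPA_MINBLOCKSIZE
```

With these assumed constants there are seven sub-page classes (512 through 3584 bytes), and a 3585-byte ABD already lands in the PAGE_SIZE cache.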

Upstream: fix M1 --enable-debug build failure

cannot #pragma diagnostic pop without a matching #pragma
diagnostic push

use -finline-hint-functions and not HAVE_LARGE_STACKS

We appear to have a stack overflow problem.

HAVE_LARGE_STACKS is default.  It drives the decision
about whether (HAVE) or not (!HAVE) to do txg sync context
(frequent) and pool initialization (much less frequent)
zio work in the same thread as the present __zio_execute,
or whether it should be pushed to the head of the line of
zios to be serviced asynchronously by another thread.

Let's not define HAVE_LARGE_STACKS when building the kext
for macOS.

Clang's -finline-hint-functions inlines all functions
explicitly hinted as inline "static inline foo(...) { }"
or equipped with an __attribute__((always_inline)), but
does not inline other functions, even if they are static.

Clang & LLVM's inlining bumps the stack frame size to
include automatic variables in the inlined functions,
growing the stack even for invocations where the inlined
function will not be reached.  This has led to large stack
frames in recursively called functions, notably
dsl_scan_visitbp, which was dealt with by removing the
always inline attribute.

Globally enabling -finline-hint-functions reduces the
number of inlined functions enough to make un-inlining
such functions by hand unnecessary, while still inlining
obvious wins (e.g. tiny functions called frequently from all
over the source tree, such as atomic_add_64_nv()).

remove exponential moving average code

Its utility for tracking relatively long term (~ seconds)
movements of the calculated spl_free value has declined
with our switch to "pure" and away from using macOS kernel
variables which track momentary VM page demand and
consequent changes in ARC.

macOS allows for floating point to be used in kernel
extensions, but this *may* have a cost on ARM in the form
of larger stack frames, which is an acute problem.

This code therefore should go away, rather than be put
behind a compile-time flag.

macOS: fix autoimport script

Use absolute path to zsysctl.

Ensure that the org.openzfsonosx.zpool-import service is loaded before
kickstarting it.

macOS: zfs_resume_fs can panic accessing NULL vp

macOS: silence zvol_in_zvol

and use SET_ERROR() while we are at it.

macOS: zvol_replay must call zil_open

Fix some SPL warnings

* large frame size -> IOMalloc/IOFree
* loss of precision in int -> short by way of bitmasking
* __maybe_unused for the pattern:

const retval r __maybe_unused = f();
ASSERT0(r);

in non-debug builds
* variables used possibly uninitialized

macOS: Silence warning in uio

A race to thread_call_enter1() could deadlock

If multiple concurrent deep-stack allocation requests race
to vmem_alloc(), the "winner" of the race could be
cancelled by one of the other racers, and so the cancelled
"winner" would never see the done flag, and would spend
the rest of eternity stuck in a cv_wait() loop, hanging
the thread that wanted memory.

This commit uses a per-arena busy flag to block the later
racers from reaching thread_call_enter1() until the
race-winner's in-worker-thread memory allocation is
complete.

Additionally, the worker does less work updating stats,
and only takes a mutex around the cv_signal(). The parent
also checks for lost and duplicate cv_signals() error
conditions.
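
The busy-flag idea can be modelled in a few lines of Python. This is a toy sketch rather than the kext's code: `ArenaWorker` and `alloc_in_worker` are invented names, and a Python thread stands in for the thread_call worker.

```python
import threading


class ArenaWorker:
    """Toy model: one worker slot per arena, guarded by a busy flag."""

    def __init__(self):
        self._cv = threading.Condition()
        self._busy = False

    def alloc_in_worker(self, size):
        # Later racers wait here until the current winner has finished, so
        # they can never re-arm (and thereby cancel) the winner's call.
        with self._cv:
            while self._busy:
                self._cv.wait()
            self._busy = True

        result = []
        done = threading.Event()

        def worker():
            # Placeholder for the real allocation done in the worker thread.
            result.append(bytearray(size))
            done.set()

        threading.Thread(target=worker).start()
        done.wait()

        # Release the slot and wake any racers that were blocked above.
        with self._cv:
            self._busy = False
            self._cv.notify_all()
        return result[0]
```

Serialising on the flag trades some concurrency for correctness: only one deep-stack allocation per arena is in flight at a time, but no caller can be left waiting on a call that was cancelled.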

macOS: zvolRename needs to wait

zvolRename needs to wait for IOKit to settle the changes, with a
timeout. Rework the code to reuse the wait logic in a function.

Remove old delay() hack.

Most easily tested with zfs-tester run over:
cli_root/zfs_create, zvol/zvol_cli, zvol/zvol_misc
which results in testpool/vol.33979-renamed "is busy".

macOS: Use vdev_disk_taskq when stack space limited.

As we can trigger stack overflow in the IO path (especially with zvol)
we detect if available space is below tunable spl_vmem_split_stack_below.

In addition to this, remove small kmem_alloc of ldi_buf_t, vdev_buf_t and
ldi_iokit_buf_t for each IO by attaching ldi_buf_t into zio_t.
(See ZIO_OS_FIELDS)

Track lowest stack remaining seen in vdev_disk as kstat

It may be useful to know if we are being handed an
especially deep stack in vdev_disk_io_start(), rather
than just that we have been called with less than
the threshold remaining.

Additionally update variable names for clarity, notably
reflecting that spl_stack_split_below is not just for vmem
any more.

Issue zvol_read/write async when needed

macOS: Address 3 different ways to compress HFS

Handle 2 xattr holding the compressed data stream, and
detect when UF_COMPRESSED is being set, and if file size is
zero we return zero. Makes 'tar' retry the compression.

Upstream: test for -finline-hint-functions

macOS: thread_call_allocate fix for older OS

macOS: add support for sharenfs and sharesmb

macOS: implement kcred for zfs_replay

Some calls used by zfs_replay can not handle a NULL kcred and will
panic. We use available functions to fetch the kernel cred, but
it is somewhat of a hack, as we release it before using.

Alternatively, we could hold reference in zfsvfs_setup() before
calling zil_replay() and release after, with the hopes that
kcred isn't used many other places.

macOS: Fix sysctl macros and missing prototypes

macOS: Add disable_trashes tunable

zfs_disable_trashes

Upstream: wrap in ifdef for SEEK_HOLE

macOS: fix for earlier macOS

also needs:
-               AC_MSG_FAILURE([*** clock_gettime is missing in libc and librt])
+               AC_MSG_RESULT([*** clock_gettime is missing in libc and librt])

and remove -finline-hint-functions

macOS: wrap mkostemp/s in OSX10.12 checks

macOS: bzero the tqe in the allocation construction phase

the tqent_next and tqent_prev fields are random, which causes problems
with the IS_EMPTY() check, causing an assertion in zio_taskq_dispatch

macOS: Clean up vmem_alloc_in_worker_thread for boot hang

vmem_alloc_in_worker_thread() would use a local stack "cb" referenced
by the thread, possibly after the stack was released.

spl_lowest_alloc_stack_remaining would also trigger async calls when
not necessary, each time we got a new low, which was problematic
during spl-kmem startup.

macOS: M1 kext must contain x64 binary

Otherwise notarize fails.

macOS: minor cstyle fix

commit before git bisect

Upstream: fixing all the Makefiles for new build system

Upstream: Changes and fixes to common files

macOS: fixes and updates

macOS: replace bcmp / bcopy / bfree

macOS: cstyle fixes

macOS: compile fixes

Upstream: stop Linux from compiling macOS source files.

It doesn't seem to work.

Upstream: continued makefile fixes

Upstream: Linux squats on mount_zfs, so hack around it
lundman added a commit to openzfsonosx/openzfs-fork that referenced this issue Sep 28, 2022
Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: configure.ac add cmd/os/macos/zsysctl

Upstream: configure.ac changes

Upstream: Makefile

Upstream: Add macOS to headers

Attempt to group most of the sweeping changes to headers in there,
unless they fit better with an individual commit

Signed-off-by: Jorgen Lundman <[email protected]>

It appears FreeBSD did the same for zfs_ioctl_register_dataset_nolog()
as they use it, so following suit for zfs_ioctl_register_pool()

Upstream: macOS default mount is /Volumes

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: add IO calls for iokit

Is this the best way? We could add ", func, private" to the
existing IO, and either send by uio, or by func(private).

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: Allow cmd/zfs mount unmount of snapshots

"zfs mount dataset@snapshot" as mounting of snapshot has to be done
manually from userland in macOS.

Add zfs_rollback_os() call to the rollback logic, so platforms can
do specific requirements.

macOS: need to kick Finder to update.

Signed-off-by: Jorgen Lundman <[email protected]>

upstream: hack - retry destroy until diskarb goes away

A more portable solution is perhaps desired.

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: Add macOS support

Add all files required for the macOS port. Add new cmd/os/ for tools
which are only expected to be used on macOS.

This supports all macOS versions up to Big Sur (11.x)

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: Additional work

macOS: employ advanced lock synchronisation

macOS: handle additional lookups to delay waiting for mount

macOS: handle rapid snapshot auto mounts

Re-implement snapshots to always mount on "lookup()". This handles the
deadlock when cwd is changed to the snapshot directory before mount.

Then add some logic to attempt to not-mount in some situations, ie
listing inside ".zfs/snapshot" directory. If a listing there is
started, we ignore mount requests until it is complete - by
storing the thread ID and pid of the listing process.

Any access below ".zfs/snapshot", will clear the ignore, ie, cause
the mount to happen.

macOS: userland unmount to disable auto_snapshot

to avoid triggering a mount. Also make kernel remember 5 pid+tid to
ignore.

macOS: Do not truncate returned name in case correcting lookups

macOS: also don't truncate further down

macOS: fix leak in ldi handle_set_wce_iokit

The parent device needs to be released if it was retained.

macOS: add zvol_os_is_zvol()

Or we are unable to create zpools inside zvols.

Also clean up zvolIO.cpp to be cstyle compliant and correct
obvious leaks.

macOS: fix zfs_vnop_lookup() and linkid

zfs_vnop_lookup() failed to "remember" the name used to lookup in the
cache_lookup() success case, making us return the incorrect name in
future zfs_vnop_getattr() - most noticeably in realpath().

linkid logic for Finder was not converting XNU inode to avoid the
first 16 inodes.

macOS: Return nametoolong when formD is lacking space

Originally it was returning "Operation not supported" which isn't quite as
useful to the user.

Hopefully nothing checks that it must return ENOTSUP.

macOS: change vnop_lookup to use cache.

To give more room for formD formC to work with, we
always allocate MAXPATHLEN, so we might as well use a kmem_cache.

macOS: rmdir -p is far too eager

macOS: dir link count doesn't count files.

To be like upstream:

drwxr-xr-x  2 root  wheel   2 Jun 16 17:37 .
touch a
drwxr-xr-x  2 root  wheel   3 Jun 16 17:37 .

Where 2nd field is "number of directories" (2) and
5th field is "number of files and directories" (3)

macOS: move sa_setup() to after zap_lookup()

This is the order Linux calls them, so we should minimise differences.

macOS: clean up handling of readonly with vfs_mount

to follow what upstream does.

macOS: parentID also needs to be mapped to XNU id

macOS: add cmd/os/macos/zsysctl

macOS: bring in cmd/os/macos/zsysctl and mount_zfs

macOS: Makefile.am for mount_zfs [squash]

macOS: squash

macOS: strip selinux functions [squash]

macOS: move getmntany into libzfs

zvol.c change

fix zfs.h

macOS: run zsysctl if /etc/zfs/zsysctl.conf exists

macOS: re-implement most of xattrs

We had some differences between how ZOL and macOS behaved when going
between xattr=sa and xattr=on datasets (send/recv) and fairly large
duplicate code.

Take ZOL zpl_xattr for the sa/on logic, change it to take "uio" for
the data buffer. Also pass in "cr" as we can.

The finderinfo logic stays in the vnop handlers, leaving the imported
source very close to ZOL.

Everything with xattrs, and decmpfs needs to be tested :)

macOS: Add uio type for IOKit iomem support

Add another UIO seg type, UIO_FUNCSPACE (UIO_SYSSPACE, UIO_USERSPACE) to
handle the IOkit IOMemoryDescriptor type. When zvolIO needs to issue
IO on volumes, it will setup a uio with iov_base as "iomem".
As dmu_read_dnode_uio() (and write) filters down to zfs_uiomove(),
spl-uio will handle the type to call registered IO function
"zvolIO_strategy" instead of memcpy/bcopy calls.

zvolIO_strategy() will call iomem->writeBytes (readBytes) as required.
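
The dispatch described above can be sketched in plain C. Everything prefixed demo_ is hypothetical and only models the idea of a function-backed segment type; it is not the spl-uio code, and demo_iomem_t merely stands in for an IOMemoryDescriptor.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { DEMO_UIO_READ, DEMO_UIO_WRITE } demo_rw_t;
typedef enum { DEMO_SYSSPACE, DEMO_FUNCSPACE } demo_seg_t;

/* Callback: move len bytes between data and an opaque target. */
typedef size_t (*demo_io_func_t)(void *target, size_t offset,
    void *data, size_t len, demo_rw_t rw);

typedef struct demo_uio {
	demo_seg_t	seg;	/* how "base" must be interpreted */
	void		*base;	/* buffer, or opaque handle for FUNCSPACE */
	size_t		offset;
	size_t		resid;
	demo_io_func_t	func;	/* only used for DEMO_FUNCSPACE */
} demo_uio_t;

/* READ copies from p into the uio target, WRITE copies out of it. */
static int
demo_uiomove(void *p, size_t n, demo_rw_t rw, demo_uio_t *uio)
{
	size_t cnt = (n < uio->resid) ? n : uio->resid;

	assert(uio->seg != DEMO_FUNCSPACE || uio->func != NULL);
	if (uio->seg == DEMO_FUNCSPACE) {
		/* No memcpy here: the registered function does the IO. */
		if (uio->func(uio->base, uio->offset, p, cnt, rw) != cnt)
			return (-1);
	} else if (rw == DEMO_UIO_READ) {
		memcpy((char *)uio->base + uio->offset, p, cnt);
	} else {
		memcpy(p, (char *)uio->base + uio->offset, cnt);
	}
	uio->offset += cnt;
	uio->resid -= cnt;
	return (0);
}

/* Hypothetical stand-in for the IOKit memory descriptor. */
typedef struct demo_iomem {
	char	bytes[64];
} demo_iomem_t;

/* Plays the role of the strategy function named in the commit. */
static size_t
demo_strategy(void *target, size_t offset, void *data, size_t len,
    demo_rw_t rw)
{
	demo_iomem_t *iomem = target;

	if (offset > sizeof (iomem->bytes) ||
	    len > sizeof (iomem->bytes) - offset)
		return (0);
	if (rw == DEMO_UIO_READ)
		memcpy(iomem->bytes + offset, data, len); /* like writeBytes */
	else
		memcpy(data, iomem->bytes + offset, len); /* like readBytes */
	return (len);
}
```

The point of the design is that the common read/write path keeps calling one move routine, and only the segment type decides whether bytes are copied directly or handed to the registered function.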

Model zvol_os.c calls zvol_os_read_zv() (and write) on ZOL sources again
to ensure as little divergence as possible.

Restore dmu.c to contain no macOS changes

macOS: Fix abd leak, kmem_free correct size of abd_t

... for macOS and Freebsd, and improve macOS abd performance (#56)

* Cleanup in macos abd_os.c to fix abd_t leak

Fix a leak of abd_t that manifested mostly when using
raidzN with at least as many columns as N (e.g. a
four-disk raidz2 but not a three-disk raidz2).
Sufficiently heavy raidz use would eventually run a system
out of memory.

The leak was introduced as a fix for a panic caused by
calculating the wrong size of an abd_t at free time if the
abd_t had been made using abd_get_offset_impl, since it
carried along the unnecessary tails of large ABDs, leading
to a mismatch between abd->abd_size and the original
allocation size of the abd_t.  This would feed kmem_free a
bad size, which produces a heap corruption panic.

The fix now carries only the necessary chunk pointers,
leading to smaller abd_ts (especially those of
abd_get_zeros() ABDs) and a performance gain from the
reduction in copying and allocation activity.

We now calculate the correct size for the abd_t at free time.

This requires passing the number of bytes wanted in a
scatter ABD to abd_get_offset_scatter().

Additionally:

* Switch abd_cache arena to FIRSTFIT, which empirically
improves performance.

* Make abd_chunk_cache more performant and debuggable.

* Allocate the abd_zero_buf from abd_chunk_cache rather
than the heap.

* Don't try to reap non-existent qcaches in abd_cache arena.

* KM_PUSHPAGE->KM_SLEEP when allocating chunks from their
own arena

- having fixed the abd leaks, return to using KMF_LITE,
but leave a commented example of audit kmem debugging

- having made this work, abd_orig_size is no longer needed
as a way to track the size originally kmem_zalloc-ed for
a scatter abd_t

* Update FreeBSD abd_os.c with the fix, and let Linux build

* Minimal change to fix FreeBSD's abd_get_offset_scatter()
carrying too many chunks for the desired ABD size

* A size argument is added to abd_get_offset_scatter() for
FreeBSD and macOS, which is unused by Linux

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: ASM changes to support macOS

Due to some differences in assembler work, macOS will have own copies.
It would be desirable to change all assembler files to use asm_linkage.h
and the macros inside for better portability.

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: module/zfs/spa.c

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: zfs-tests to support macOS

Start to add macOS support to the zfs-tester environment, much more work
is required still.

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: Changes to dprintf for macOS

Prefer to always have the option to turn printfs on, even in RELEASE
builds

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: macOS currently has own zfs_fsync

Hoping to remove it eventually.

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: work around different API for sbuf_finish()

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: Why is linux even trying to look at etc/launchd

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: Missing empty taskq for userland

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: Add crypto errata1 for projectquota-less datasets

There was a short window of 2.0 releases before rc4 where a crypto
dataset would enable projectquota but fail to start it. Add
a work-around for that issue. It is expected this commit will
be removed in the near future.

datasets with crypto will generate the proper local_mac, and will not
be able to be imported with the broken 2.0 version.

Fixed dataset should work on other platforms again.

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: import -d does not go through os/macos/ sources

On macOS we need to prioritise /dev/disk over /dev/rdisk, but
the common code makes no adjustment based on os preferred names.

Potentially we should call an os/ function to set
the priority.

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: Test for NULL vd

It seems we managed to get a deadman triggered during export?

 : 0xffffff8004ebda40 mach_kernel : _return_from_trap + 0xe0
 : 0xffffff7f8942bbbf org.openzfsonosx.zfs : _vdev_deadman + 0x1f
 : 0xffffff7f8941149a org.openzfsonosx.zfs : _spa_deadman + 0xca
 : 0xffffff7f896a6246 org.openzfsonosx.zfs : _taskq_thread + 0x4a6

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: zdb inode mapping fix

Upstream: realpath vdev directory paths

This is already the behavior of
zpool_find_import_scan, so do the same in
make_leaf_vdev and zfs_strcmp_pathname.

On macOS, /var is a symlink to private/var so when
the user inputs an import path starting with /var,
it is eventually converted automatically by zfs to
the realpath starting with /private/var. This
causes problems later finding vdevs as string
comparisons between paths starting with
/private/var and paths starting with /var fail, so
make sure we are always using the vdev directory's
realpath. Note that basenames are preserved so as
not to compromise invariant symlinks.
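
A minimal userland sketch of "realpath the directory, keep the basename" follows. The function name demo_realpath_dir() is hypothetical and this is not the libzfs implementation; it only uses the POSIX realpath() call.

```c
#define	_XOPEN_SOURCE 700
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Resolve only the directory part of "path" and re-attach the last
 * component untouched, so a symlink named by the basename itself
 * (for example an invariant disk symlink) is not followed.
 * Returns 0 on success, -1 on failure.
 */
static int
demo_realpath_dir(const char *path, char *out, size_t outlen)
{
	char dir[PATH_MAX];
	char resolved[PATH_MAX];
	const char *slash, *base;
	size_t dirlen;
	int n;

	assert(path != NULL && out != NULL);
	slash = strrchr(path, '/');
	if (slash == NULL) {
		/* No directory part: resolve the current directory. */
		(void) snprintf(dir, sizeof (dir), ".");
		base = path;
	} else {
		dirlen = (size_t)(slash - path);
		if (dirlen == 0)
			dirlen = 1;	/* keep "/" for paths like "/disk0" */
		if (dirlen >= sizeof (dir))
			return (-1);
		memcpy(dir, path, dirlen);
		dir[dirlen] = '\0';
		base = slash + 1;
	}

	if (realpath(dir, resolved) == NULL)
		return (-1);

	n = snprintf(out, outlen, "%s%s%s", resolved,
	    strcmp(resolved, "/") == 0 ? "" : "/", base);
	if (n < 0 || (size_t)n >= outlen)
		return (-1);
	return (0);
}
```

With a helper like this, a path typed as /var/... and the same path stored as /private/var/... compare equal on macOS, because both are reduced to the same resolved directory before the string comparison.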

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: dirname -> zfs_dirnamelen [squash]

Forgot to actually change it to zfs_dirnamelen

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: set default macOS invariant disks path

InvariantDisk (udev analogue for macOS) does not
use /dev/disk in order to avoid subdirectories in
/dev. Instead, the default path for the invariant
symlinks is /var/run/disk/by-*, a root owned
temporary directory.

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: stub-out zpool_read_label for APPLE

It does not work for macOS platform, we have our own based on the
old pre-lio style.

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: cppcheck fixes

Upstream: fit in with recent man page changes

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: use correct libcurl.4.dylib name

This fix isn't exactly great either.

macOS: destroy snapshots

squash

renamed zpool_disable_volume_os

macOS: rename zed zvol symlink script and variables

macOS: handle 2-arg pthread_setname_np()

By taking it out completely.

macOS: Add snapshot and zvol events (uio.h fixes)

It turns out that it could not see readv/writev because
our macos/sys/uio.h was testing for the _LIBSPL_SYS_UIO_H as
set by the top level libspl/include/sys/uio.h and was therefore
skipped over if includes came in the wrong order.

Upstream: libzfs.h abi requires changes

macOS: compile fixes after rebase

macOS: changes to zfs_file after rebase

macOS: compile fixes after rebase

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: Make zvol list be non-static

Until we can agree on a solution that works for everyone.

macOS: rename fallthrough to zfs_fallthrough

macOS: Compile fixes for latest rebase

macOS: Update arcstat and arc_summary

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: Correct CPUID features lookup

Account for surprise A, B, D, C order of registers.

Add fixes to compile on ARM64, but functionality is missing.

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: Set name after zfs_vnop_create() for NFS

NFS would fail with open(..., O_EXCL) if we do not set the name after
zfs_vnop_create(). nfsd handles O_EXCL differently, in that it
always assumes VA_EXCLUSIVE is set (and will receive EEXIST). nfsd
uses atime to store a pseudo-random unique ID, then call VNOP_CREATE().
If it succeeded in creating the file (this nfs client won over any other
nfs clients) then the atime ID will match. nfsd will then call
vnode_setattr() with the correct atime.

If the name is not set by ZFS, it fails before calling vnode_setattr()
with call stack:
  mac_vnode_check_open()
    vn_getpath_ext_with_mntlen()
      build_path_with_parent()

Also correct fhtovp/vptofh to handle XNU remapped inodes.

Remove atime checks for 48bit overflow from pre 64bit days.

zfs_vnop_create() is also given a vattr struct, we should reply with
the attr we handled - this saves XNU from calling fallback setattr().

Clean up zfs_vnop_getattr() to only set va_active for vattrs that were
asked for, rather than blindly setting vattrs. Some xnu code checks
that va_active == va_enabled, so if we set too many it can force
XNU to call fallback.

Actually handle atime in setattr()/getattr(), as it lives in
zp->z_atime struct.

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: fix default ADDEDTIME getattr

Logic would return 0 date reply for entries without ADDEDTIME
(which is only added after moving)

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: Also set name in other vnop create calls.

Unsure if NFS will bug on symlink and link, but we might
as well call update. mknod is handled in the call to create.

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: Add verbose kstat for RAW type

The RAW kstat type as used by nodes like:

kstat.zfs.misc.dbgmsg
kstat.zfs.misc.dbufs

can get really large, and there is no way to skip them when issuing
a "sysctl -a" or similar request. This can slow down the process
considerably, while it holds the locks.

The RAW kstat type will now automatically add a "verbose" leaf as well,
defaulting to "0" (do not display). To see the RAW information set
the verbose value to 1.

kstat.zfs.misc.dbufs.verbose: 0

kstat.zfs.misc.dbufs.dbufs:
pool             objset   object   level    blkid    offset     dbsize
[...]
kstat.zfs.misc.dbufs.verbose: 1

Conveniently, this command works:

sudo sysctl kstat.zfs.misc.dbgmsg.verbose=1 kstat.zfs.misc.dbgmsg.dbgmsg
     kstat.zfs.misc.dbgmsg.verbose=0

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: change wmsum to be a struct

instead of being clever with pointers, and prepare
for possible future expansion.

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: Handle ZFS_MODULE_PARAMS as sysctl, take 2

Modelled on FreeBSD approach, made to work on macOS.

Attempt to stay close to legacy macOS tunable names
but some are now slightly different.

Retire the macOS kstat versions, replace with ZFS_MODULE_IMPL.

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: Also copy out sysctl for ZFS_MODULE_VIRTUAL

Upstream: take out Linux code in zfeature

macOS: move ioctl_fd (back) into libzfs_core

macOS: fix up clock_gettime for zfs-tester

macOS: build fix for monterey

macOS: bring in cmd/os/macos/zsysctl and mount_zfs

macOS: also include all source files

Split deep vmem_alloc()/vmem_xalloc() stacks

In openzfsonosx/openzfs#90
a user reported panics on an M1 with the message

"Invalid kernel stack pointer (probable overflow)."

In at least several of these a deep multi-arena allocation
was in progress (several vmem_alloc/vmem_xalloc reaching
all the way down through vmem_bucket_alloc,
xnu_alloc_throttled, and ultimately to osif_malloc).

The stack frames above the first vmem_alloc were also fairly large.

This commit sets a dynamically sysctl-tunable threshold
(8k default) for remaining stack size as reported by xnu.
If we do not have more bytes than that when vmem_alloc()
is called, then the actual allocation will be done in a
separate worker thread which will start with a nearly
empty stack that is much more likely to hold the various
frames all the way through our code boundary with the
kernel and beyond.

The xnu / mach thread_call API (osfmk/kern/thread_call.h)
is used to avoid circular dependencies with taskq, and the
mechanism is per-arena costing a quick stack-depth check
per vmem_alloc() but allowing for wildly varying stack
depths above the first vmem_alloc() call.

Vmem arenas now have two further kstats: the lowest amount
of available stack space seen at a vmem_alloc() into it,
and the number of times the allocation work has been done
in a thread_call worker.
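
The threshold check and worker hand-off can be modelled in userland with pthreads. This is a sketch under stated assumptions: the demo_ names are hypothetical, the remaining stack is passed in as a parameter instead of being queried from xnu, a joined pthread stands in for the thread_call worker, and the counters are not thread-safe.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* 8k default, as stated in the commit text. */
static size_t demo_split_stack_below = 8 * 1024;
/* Models the two per-arena kstats described above. */
static size_t demo_lowest_stack_seen = (size_t)-1;
static unsigned long demo_worker_allocs = 0;

struct demo_alloc_req {
	size_t	size;
	void	*result;
};

/* Runs on a fresh, nearly empty stack. */
static void *
demo_alloc_worker(void *arg)
{
	struct demo_alloc_req *req = arg;

	req->result = malloc(req->size);
	return (NULL);
}

static void *
demo_alloc_split(size_t size, size_t stack_remaining)
{
	struct demo_alloc_req req = { size, NULL };
	pthread_t tid;

	assert(size > 0);
	if (stack_remaining < demo_lowest_stack_seen)
		demo_lowest_stack_seen = stack_remaining;

	/* Enough room left: allocate on the caller's own stack. */
	if (stack_remaining > demo_split_stack_below)
		return (malloc(size));

	/*
	 * Too deep: do the allocation in a worker. "req" lives on this
	 * stack, so we must not return before the worker has finished.
	 */
	if (pthread_create(&tid, NULL, demo_alloc_worker, &req) != 0)
		return (NULL);
	(void) pthread_join(tid, NULL);
	demo_worker_allocs++;
	return (req.result);
}
```

The cost on the common path is one comparison per call; only callers that arrive with a deep stack pay for the hand-off and the wait.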

* some spl_vmem.c functions are given inline hints

These are small functions with no or very few automatic
variables that were good candidates for clang/llvm's
inlining heuristics before we switched to building
the kext with -finline-hint-functions.

* remove some (unrelated) unused variables which escaped
previous commits, eliminating a couple compile-time
warnings.

Sub-PAGE_SIZE ABDs instead of linear ABDs

Previously, when making an ABD of size less than
zfs_abd_chunk_size (4k by default), we would make a linear
ABD, which would allocate memory out of the zio caches.

The subpage chunks are in multiples of SPA_MINBLOCKSIZE,
with each multiple (up to PAGE_SIZE minus
SPA_MINBLOCKSIZE) having its own kmem_cache. These
kmem_caches are parented to a subpage vmem_cache that
takes 128k allocations from the PAGE_SIZE abd_chunk_cache.
ABDs whose size falls within SPA_MINBLOCKSIZE bytes of
PAGE_SIZE and all larger ABDs are served by the PAGE_SIZE
ABD cache.
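
The size-class rule in the paragraph above can be written down as a small function. This is an illustrative sketch: demo_abd_chunk_class() and DEMO_PAGE_SIZE are hypothetical names, and a 4k page is assumed to match the stated default chunk size.

```c
#include <assert.h>
#include <stddef.h>

#define	SPA_MINBLOCKSIZE	512u
#define	DEMO_PAGE_SIZE		4096u	/* assumption for this sketch */

/*
 * Return the chunk size that would serve an ABD of "size" bytes:
 * a multiple of SPA_MINBLOCKSIZE up to PAGE_SIZE - SPA_MINBLOCKSIZE,
 * or PAGE_SIZE for anything within SPA_MINBLOCKSIZE bytes of a page
 * and for everything larger.
 */
static size_t
demo_abd_chunk_class(size_t size)
{
	size_t cls;

	if (size == 0)
		size = 1;
	if (size > DEMO_PAGE_SIZE - SPA_MINBLOCKSIZE)
		return (DEMO_PAGE_SIZE);

	/* Round up to the next multiple of SPA_MINBLOCKSIZE. */
	cls = ((size + SPA_MINBLOCKSIZE - 1) / SPA_MINBLOCKSIZE) *
	    SPA_MINBLOCKSIZE;
	assert(cls >= SPA_MINBLOCKSIZE && cls < DEMO_PAGE_SIZE);
	return (cls);
}
```

Under these assumptions there are seven sub-page classes (512 through 3584 bytes), each of which would map to its own kmem_cache.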

Upstream: fix M1 --enable-debug build failure

cannot #pragma diagnostic pop without a matching #pragma
diagnostic push

use -finline-hint-functions and not HAVE_LARGE_STACKS

We appear to have a stack overflow problem.

HAVE_LARGE_STACKS is default.  It drives the decision
about whether (HAVE) or not (!HAVE) to do txg sync context
(frequent) and pool initialization (much less frequent)
zio work in the same thread as the present __zio_execute,
or whether it should be pushed to the head of the line of
zios to be serviced asynchronously by another thread.

Let's not define HAVE_LARGE_STACKS when building the kext
for macOS.

Clang's -finline-hint-functions inlines all functions
explicitly hinted as inline "static inline foo(...) { }"
or equipped with an __attribute__((always_inline)), but
does not inline other functions, even if they are static.

Clang & LLVM's inlining bumps the stack frame size to
include automatic variables in the inlined functions,
growing the stack even for invocations where the inlined
function will not be reached.  This has led to large stack
frames in recursively called functions, notably
dsl_scan_visitbp, which was dealt with by removing the
always inline attribute.

Globally enabling -finline-hint-functions reduces the
number of inlined functions enough to make un-inlining
such functions unnecessary, while still inlining obvious
wins (e.g. tiny functions called frequently from all over
the source tree, such as atomic_add_64_nv()).

remove exponential moving average code

Its utility for tracking relatively long term (~ seconds)
movements of the calculated spl_free value has declined
with our switch to "pure" and away from using macOS kernel
variables which track momentary VM page demand and
consequent changes in ARC.

macOS allows for floating point to be used in kernel
extensions, but this *may* have a cost on ARM in the form
of larger stack frames, which is an acute problem.

This code therefore should go away, rather than be put
behind a compile-time flag.

macOS: fix autoimport script

Use absolute path to zsysctl.

Ensure that the org.openzfsonosx.zpool-import service is loaded before
kickstarting it.

macOS: zfs_resume_fs can panic accessing NULL vp

macOS: silence zvol_in_zvol

and use SET_ERROR() while we are at it.

macOS: zvol_replay must call zil_open

Fix some SPL warnings

* large frame size -> IOMalloc/IOFree
* loss of precision in int -> short by way of bitmasking
* __maybe_unused for the pattern:

const retval r __maybe_unused = f();
ASSERT0(r);

in non-debug builds
* variables used possibly uninitialized

macOS: Silence warning in uio

A race to thread_call_enter1() could deadlock

If multiple concurrent deep-stack allocation requests race
to vmem_alloc(), the "winner" of the race could be
cancelled by one of the other racers, and so the cancelled
"winner" would never see the done flag, and would spend
the rest of eternity stuck in a cv_wait() loop, hanging
the thread that wanted memory.

This commit uses a per-arena busy flag to block the later
racers from reaching thread_call_enter1() until the
race-winner's in-worker-thread memory allocation is
complete.

Additionally, the worker does less work updating stats,
and only takes a mutex around the cv_signal(). The parent
also checks for lost and duplicate cv_signals() error
conditions.

macOS: zvolRename needs to wait

zvolRename needs to wait for IOKit to settle the changes, with a
timeout. Rework the code to reuse the wait logic in a function.

Remove old delay() hack.

Most easily tested with zfs-tester run over;
cli_root/zfs_create, zvol/zvol_cli, zvol/zvol_misc
which results in testpool/vol.33979-renamed "is busy".

macOS: Use vdev_disk_taskq when stack space limited.

As we can trigger stack overflow in the IO path (especially with zvol)
we detect if available space is below tunable spl_vmem_split_stack_below.

In addition to this, remove small kmem_alloc of ldi_buf_t, vdev_buf_t and
ldi_iokit_buf_t for each IO by attaching ldi_buf_t into zio_t.
(See ZIO_OS_FIELDS)

Track lowest stack remaining seen in vdev_disk as kstat

It may be useful to know if we are being handed an
especially deep stack in vdev_disk_io_start(), rather
than just that we have been called with less than
the threshold remaining.

Additionally update variable names for clarity, notably
reflecting that spl_stack_split_below is not just for vmem
any more.

Issue zvol_read/write async when needed

macOS: Address 3 different ways to compress HFS

Handle 2 xattr holding the compressed data stream, and
detect when UF_COMPRESSED is being set, and if file size is
zero we return zero. Makes 'tar' retry the compression.

Upstream: test for -finline-hint-functions

macOS: thread_call_allocate fix for older OS

macOS: add support for sharenfs and sharesmb

macOS: implement kcred for zfs_replay

Some calls used by zfs_replay can not handle a NULL kcred and will
panic. We use available functions to fetch the kernel cred, but
it is somewhat of a hack, as we release it before using.

Alternatively, we could hold reference in zfsvfs_setup() before
calling zil_replay() and release after, with the hopes that
kcred isn't used many other places.

macOS: Fix sysctl macros and missing prototypes

macOS: Add disable_trashes tunable

zfs_disable_trashes

Upstream: wrap in ifdef for SEEK_HOLE

macOS: fix for earlier macOS

also needs:
-               AC_MSG_FAILURE([*** clock_gettime is missing in libc and librt])
+               AC_MSG_RESULT([*** clock_gettime is missing in libc and librt])

and remove -finline-hint-functions

macOS: wrap mkostemp/s in OSX10.12 checks

macOS: bzero the tqe in the allocation construction phase

the tqent_next and tqent_prev fields are random, which causes problems
with the IS_EMPTY() check, causing an assertion in zio_taskq_dispatch

macOS: Clean up vmem_alloc_in_worker_thread for boot hang

vmem_alloc_in_worker_thread() would use a local stack "cb" referenced
by the thread, possibly after the stack was released.

spl_lowest_alloc_stack_remaining would also trigger async calls when
not necessary, each time we got a new low, which was problematic
during spl-kmem startup.

macOS: M1 kext must contain x64 binary

Otherwise notarize fails.

macOS: minor cstyle fix

commit before git bisect

Upstream: fixing all the Makefiles for new build system

Upstream: Changes and fixes to common files

macOS: fixes and updates

macOS: replace bcmp / bcopy / bfree

macOS: cstyle fixes

macOS: compile fixes

Upstream: stop Linux from compiling macOS source files.

It doesn't seem to work.

Upstream: continued makefile fixes

Upstream: Linux squats on mount_zfs, so hack around it
lundman added a commit to openzfsonosx/openzfs-fork that referenced this issue Oct 6, 2022
Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: configure.ac add cmd/os/macos/zsysctl

Upstream: configure.ac changes

Upstream: Makefile

Upstream: Add macOS to headers

Attempt to group most of the sweeping changes to headers in there,
unless they fit better with an individual commit

Signed-off-by: Jorgen Lundman <[email protected]>

It appears FreeBSD did the same for zfs_ioctl_register_dataset_nolog()
as they use it, so following suit for zfs_ioctl_register_pool()

Upstream: macOS default mount is /Volumes

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: add IO calls for iokit

Is this the best way? We could add ", func, private" to the
existing IO, and either send by uio, or by func(private).

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: Allow cmd/zfs mount unmount of snapshots

"zfs mount dataset@snapshot" as mounting of snapshot has to be done
manually from userland in macOS.

Add zfs_rollback_os() call to the rollback logic, so platforms can
do specific requirements.

macOS: need to kick Finder to update.

Signed-off-by: Jorgen Lundman <[email protected]>

upstream: hack - retry destroy until diskarb goes away

A more portable solution is perhaps desired.

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: Add macOS support

Add all files required for the macOS port. Add new cmd/os/ for tools
which are only expected to be used on macOS.

This supports all macOS versions up to Big Sur (11.x)

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: Additional work

macOS: employ advanced lock synchronisation

macOS: handle additional lookups to delay waiting for mount

macOS: handle rapid snapshot auto mounts

Re-implement snapshots to always mount on "lookup()". This handles the
deadlock when cwd is changed to the snapshot directory before mount.

Then add some logic to attempt to not-mount in some situations, ie
listing inside ".zfs/snapshot" directory. If a listing there is
started, we ignore mount requests until it is complete - by
storing the thread ID and pid of the listing process.

Any access below ".zfs/snapshot", will clear the ignore, ie, cause
the mount to happen.

macOS: userland unmount to disable auto_snapshot

to avoid triggering a mount. Also make kernel remember 5 pid+tid to
ignore.

macOS: Do not truncate returned name in case correcting lookups

macOS: also don't truncate further down

macOS: fix leak in ldi handle_set_wce_iokit

The parent device needs to be released if it was retained.

macOS: add zvol_os_is_zvol()

Or we are unable to create zpools inside zvols.

Also clean up zvolIO.cpp to be cstyle compliant and correct
obvious leaks.

macOS: fix zfs_vnop_lookup() and linkid

zfs_vnop_lookup() failed to "remember" the name used to lookup in the
cache_lookup() success case, making us return the incorrect name in
future zfs_vnop_getattr() - most noticeably in realpath().

linkid logic for Finder was not converting XNU inode to avoid the
first 16 inodes.

macOS: Return nametoolong when formD is lacking space

Originally it was returning "Operation not supported" which isn't quite as
useful to the user.

Hopefully nothing checks that it must return ENOTSUP.

macOS: change vnop_lookup to use cache.

To give more room for formD formC to work with, we
always allocate MAXPATHLEN, so we might as well use a kmem_cache.

macOS: rmdir -p is far too eager

macOS: dir link count doesn't count files.

To be like upstream:

drwxr-xr-x  2 root  wheel   2 Jun 16 17:37 .
touch a
drwxr-xr-x  2 root  wheel   3 Jun 16 17:37 .

Where 2nd field is "number of directories" (2) and
5th field is "number of files and directories" (3)

macOS: move sa_setup() to after zap_lookup()

This is the order Linux calls them, so we should minimise differences.

macOS: clean up handling of readonly with vfs_mount

to follow what upstream does.

macOS: parentID also needs to be mapped to XNU id

macOS: add cmd/os/macos/zsysctl

macOS: bring in cmd/os/macos/zsysctl and mount_zfs

macOS: Makefile.am for mount_zfs [squash]

macOS: squash

macOS: strip selinux functions [squash]

macOS: move getmntany into libzfs

zvol.c change

fix zfs.h

macOS: run zsysctl if /etc/zfs/zsysctl.conf exists

macOS: re-implement most of xattrs

We had some differences between how ZOL and macOS behaved when going
between xattr=sa and xattr=on datasets (send/recv) and fairly large
duplicate code.

Take ZOL zpl_xattr for the sa/on logic, change it to take "uio" for
the data buffer. Also pass in "cr" as we can.

The finderinfo logic stays in the vnop handlers, leaving the imported
source very close to ZOL.

Everything with xattrs, and decmpfs needs to be tested :)

macOS: Add uio type for IOKit iomem support

Add another UIO seg type, UIO_FUNCSPACE (UIO_SYSSPACE, UIO_USERSPACE) to
handle the IOkit IOMemoryDescriptor type. When zvolIO needs to issue
IO on volumes, it will setup a uio with iov_base as "iomem".
As dmu_read_dnode_uio() (and write) filters down to zfs_uiomove(),
spl-uio will handle the type to call registered IO function
"zvolIO_strategy" instead of memcpy/bcopy calls.

zvolIO_strategy() will call iomem->writeBytes (readBytes) as required.

Model zvol_os.c calls zvol_os_read_zv() (and write) on ZOL sources again
to ensure as little divergence as possible.

Restore dmu.c to contain no macOS changes

macOS: Fix abd leak, kmem_free correct size of abd_t

... for macOS and Freebsd, and improve macOS abd performance (#56)

* Cleanup in macos abd_os.c to fix abd_t leak

Fix a leak of abd_t that manifested mostly when using
raidzN with at least as many columns as N (e.g. a
four-disk raidz2 but not a three-disk raidz2).
Sufficiently heavy raidz use would eventually run a system
out of memory.

The leak was introduced as a fix for a panic caused by
calculating the wrong size of an abd_t at free time if the
abd_t had been made using abd_get_offset_impl, since it
carried along the unnecessary tails of large ABDs, leading
to a mismatch between abd->abd_size and the original
allocation size of the abd_t.  This would feed kmem_free a
bad size, which produces a heap corruption panic.

The fix now carries only the necessary chunk pointers,
leading to smaller abd_ts (especially those of
abd_get_zeros() ABDs) and a performance gain from the
reduction in copying and allocation activity.

We now calculate the correct size for the abd_t at free time.

This requires passing the number of bytes wanted in a
scatter ABD to abd_get_offset_scatter().

Additionally:

* Switch abd_cache arena to FIRSTFIT, which empirically
improves performance.

* Make abd_chunk_cache more performant and debuggable.

* Allocate the abd_zero_buf from abd_chunk_cache rather
than the heap.

* Don't try to reap non-existent qcaches in abd_cache arena.

* KM_PUSHPAGE->KM_SLEEP when allocating chunks from their
own arena

- having fixed the abd leaks, return to using KMF_LITE,
but leave a commented example of audit kmem debugging

- having made this work, abd_orig_size is no longer needed
as a way to track the size originally kmem_zalloc-ed for
a scatter abd_t

* Update FreeBSD abd_os.c with the fix, and let Linux build

* Minimal change to fix FreeBSD's abd_get_offset_scatter()
carrying too many chunks for the desired ABD size

* A size argument is added to abd_get_offset_scatter() for
FreeBSD and macOS, which is unused by Linux

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: ASM changes to support macOS

Due to some differences in how the assemblers work, macOS will have
its own copies. It would be desirable to change all assembler files to
use asm_linkage.h and the macros inside for better portability.

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: module/zfs/spa.c

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: zfs-tests to support macOS

Start to add macOS support to the zfs-tester environment, much more work
is required still.

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: Changes to dprintf for macOS

Prefer to always have the option to turn printfs on, even in RELEASE
builds

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: macOS currently has own zfs_fsync

Hoping to remove it eventually.

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: work around different API for sbuf_finish()

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: Why is linux even trying to look at etc/launchd

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: Missing empty taskq for userland

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: Add crypto errata1 for projectquota-less datasets

There was a short window of 2.0 releases before rc4 where a crypto
dataset would enable projectquota but fail to start it. Add
a work-around for that issue. It is expected this commit will
be removed in the near future.

Datasets with crypto will generate the proper local_mac, and will not
be able to be imported with the broken 2.0 version.

Fixed dataset should work on other platforms again.

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: import -d does not go through os/macos/ sources

On macOS we need to prioritise /dev/disk over /dev/rdisk, but
the common code makes no adjustment based on os preferred names.

Potentially we should call an os/ function to set
the priority.

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: Test for NULL vd

It seems we managed to get a deadman triggered during export?

 : 0xffffff8004ebda40 mach_kernel : _return_from_trap + 0xe0
 : 0xffffff7f8942bbbf org.openzfsonosx.zfs : _vdev_deadman + 0x1f
 : 0xffffff7f8941149a org.openzfsonosx.zfs : _spa_deadman + 0xca
 : 0xffffff7f896a6246 org.openzfsonosx.zfs : _taskq_thread + 0x4a6

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: zdb inode mapping fix

Upstream: realpath vdev directory paths

This is already the behavior of
zpool_find_import_scan, so do the same in
make_leaf_vdev and zfs_strcmp_pathname.

On macOS, /var is a symlink to private/var so when
the user inputs an import path starting with /var,
it is eventually converted automatically by zfs to
the realpath starting with /private/var. This
causes problems later finding vdevs as string
comparisons between paths starting with
/private/var and paths starting with /var fail, so
make sure we are always using the vdev directory's
realpath. Note that basenames are preserved so as
not to compromise invariant symlinks.
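
A minimal sketch of the approach described above, assuming a hypothetical helper name (sketch_resolve_vdev_dir) rather than the actual libzfs code: only the directory part of the path goes through realpath(3), and the basename is re-attached untouched so that a symlink in the final component is not followed.

```c
#define	_XOPEN_SOURCE 700
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: write into `out` the path whose directory
 * component has been resolved with realpath(3) and whose final
 * component is copied verbatim.  Returns 0 on success, -1 on error.
 */
int
sketch_resolve_vdev_dir(const char *path, char *out, size_t outlen)
{
	char dir[PATH_MAX];
	char rdir[PATH_MAX];
	const char *slash;
	size_t dirlen;
	int n;

	assert(path != NULL && out != NULL);
	slash = strrchr(path, '/');
	if (slash == NULL) {
		/* No directory component: nothing to resolve. */
		n = snprintf(out, outlen, "%s", path);
		return ((n >= 0 && (size_t)n < outlen) ? 0 : -1);
	}

	/* Keep a lone leading "/" as the directory. */
	dirlen = (slash == path) ? 1 : (size_t)(slash - path);
	if (dirlen >= sizeof (dir))
		return (-1);
	memcpy(dir, path, dirlen);
	dir[dirlen] = '\0';

	if (realpath(dir, rdir) == NULL)
		return (-1);

	/* Avoid a doubled slash when the resolved directory is "/". */
	n = snprintf(out, outlen, "%s%s%s", rdir,
	    strcmp(rdir, "/") == 0 ? "" : "/", slash + 1);
	return ((n >= 0 && (size_t)n < outlen) ? 0 : -1);
}
```

On macOS, passing "/var/run/disk/by-id/foo" to such a helper would be expected to give a path under /private/var, with "foo" left exactly as given.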

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: dirname -> zfs_dirnamelen [squash]

Forgot to actually change it to zfs_dirnamelen

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: set default macOS invariant disks path

InvariantDisk (udev analogue for macOS) does not
use /dev/disk in order to avoid subdirectories in
/dev. Instead, the default path for the invariant
symlinks is /var/run/disk/by-*, a root owned
temporary directory.

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: stub-out zpool_read_label for APPLE

It does not work for macOS platform, we have our own based on the
old pre-lio style.

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: cppcheck fixes

Upstream: fit in with recent man page changes

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: use correct libcurl.4.dylib name

This fix isn't exactly great either.

macOS: destroy snapshots

squash

renamed zpool_disable_volume_os

macOS: rename zed zvol symlink script and variables

macOS: handle 2-arg pthread_setname_np()

By taking it out completely.

macOS: Add snapshot and zvol events (uio.h fixes)

It turns out that it could not see readv/writev because
our macos/sys/uio.h was testing for the _LIBSPL_SYS_UIO_H
guard set by the top-level libspl/include/sys/uio.h, and was
therefore skipped if the includes came in the wrong order.

Upstream: libzfs.h abi requires changes

macOS: compile fixes after rebase

macOS: changes to zfs_file after rebase

macOS: compile fixes after rebase

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: Make zvol list be non-static

Until we can agree on a solution that works for everyone.

macOS: rename fallthrough to zfs_fallthrough

macOS: Compile fixes for latest rebase

macOS: Update arcstat and arc_summary

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: Correct CPUID features lookup

Account for surprise A, B, D, C order of registers.

Add fixes to compile on ARM64, but the functionality is missing.

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: Set name after zfs_vnop_create() for NFS

NFS would fail with open(..., O_EXCL) if we do not set the name after
zfs_vnop_create(). nfsd handles O_EXCL differently, in that it
always assumes VA_EXCLUSIVE is set (and will receive EEXIST). nfsd
uses atime to store a pseudo-random unique ID, then call VNOP_CREATE().
If it succeeded in creating the file (this nfs client won over any other
nfs clients) then the atime ID will match. nfsd will then call
vnode_setattr() with the correct atime.

If the name is not set by ZFS, it fails before calling vnode_setattr()
with call stack:
  mac_vnode_check_open()
    vn_getpath_ext_with_mntlen()
      build_path_with_parent()

Also correct fhtovp/vptofh to handle XNU remapped inodes.

Remove atime checks for 48bit overflow from pre 64bit days.

zfs_vnop_create() is also given a vattr struct, we should reply with
the attr we handled - this saves XNU from calling fallback setattr().

Clean up zfs_vnop_getattr() to only set va_active for vattrs that were
asked for, rather than blindly setting vattrs. Some xnu code checks
that va_active == va_enabled, so if we set too many it can force
XNU to call fallback.

Actually handle atime in setattr()/getattr(), as it lives in
zp->z_atime struct.

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: fix default ADDEDTIME getattr

Logic would return 0 date reply for entries without ADDEDTIME
(which is only added after moving)

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: Also set name in other vnop create calls.

Unsure if NFS will bug on symlink and link, but we might
as well call update. mknod is handled in the call to create.

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: Add verbose kstat for RAW type

The RAW kstat type as used by nodes like:

kstat.zfs.misc.dbgmsg
kstat.zfs.misc.dbufs

can get really large, and there is no way to skip them when issuing
a "sysctl -a" or similar request. This can slow down the process
considerably, while it holds the locks.

The RAW kstat type will now automatically add a "verbose" leaf as well,
defaulting to "0" (do not display). To see the RAW information set
the verbose value to 1.

kstat.zfs.misc.dbufs.verbose: 0

kstat.zfs.misc.dbufs.dbufs:
pool             objset   object   level    blkid    offset     dbsize
[...]
kstat.zfs.misc.dbufs.verbose: 1

Conveniently, this command works:

sudo sysctl kstat.zfs.misc.dbgmsg.verbose=1 kstat.zfs.misc.dbgmsg.dbgmsg
     kstat.zfs.misc.dbgmsg.verbose=0

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: change wmsum to be a struct

instead of being clever with pointers, and prepare
for possible future expansion.

Signed-off-by: Jorgen Lundman <[email protected]>

macOS: Handle ZFS_MODULE_PARAMS as sysctl, take 2

Modelled on FreeBSD approach, made to work on macOS.

Attempt to stay close to legacy macOS tunable names
but some are now slightly different.

Retire the macOS kstat versions, replace with ZFS_MODULE_IMPL.

Signed-off-by: Jorgen Lundman <[email protected]>

Upstream: Also copy out sysctl for ZFS_MODULE_VIRTUAL

Upstream: take out Linux code in zfeature

macOS: move ioctl_fd (back) into libzfs_core

macOS: fix up clock_gettime for zfs-tester

macOS: build fix for monterey

macOS: bring in cmd/os/macos/zsysctl and mount_zfs

macOS: also include all source files

Split deep vmem_alloc()/vmem_xalloc() stacks

In openzfsonosx/openzfs#90
a user reported panics on an M1 with the message

"Invalid kernel stack pointer (probable overflow)."

In at least several of these a deep multi-arena allocation
was in progress (several vmem_alloc/vmem_xalloc reaching
all the way down through vmem_bucket_alloc,
xnu_alloc_throttled, and ultimately to osif_malloc).

The stack frames above the first vmem_alloc were also fairly large.

This commit sets a dynamically sysctl-tunable threshold
(8k default) for remaining stack size as reported by xnu.
If we do not have more bytes than that when vmem_alloc()
is called, then the actual allocation will be done in a
separate worker thread which will start with a nearly
empty stack that is much more likely to hold the various
frames all the way through our code boundary with the
kernel and beyond.

The xnu / mach thread_call API (osfmk/kern/thread_call.h)
is used to avoid circular dependencies with taskq, and the
mechanism is per-arena costing a quick stack-depth check
per vmem_alloc() but allowing for wildly varying stack
depths above the first vmem_alloc() call.

Vmem arenas now have two further kstats: the lowest amount
of available stack space seen at a vmem_alloc() into it,
and the number of times the allocation work has been done
in a thread_call worker.
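
The per-call decision and the two counters can be sketched as below. This is an illustration only: the names (sketch_arena_stats_t, sketch_should_split) are hypothetical, the remaining stack size is passed in as a parameter instead of being queried from xnu, and real kernel code would need atomic updates for the statistics.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Default threshold from the commit text: 8k of remaining stack. */
#define	SKETCH_SPLIT_BELOW_DEFAULT	(8 * 1024)

/* Hypothetical per-arena counters mirroring the two new kstats. */
typedef struct sketch_arena_stats {
	size_t		lowest_stack_seen;	/* least remaining stack seen */
	uint64_t	worker_allocs;		/* allocations sent to a worker */
} sketch_arena_stats_t;

void
sketch_arena_stats_init(sketch_arena_stats_t *st)
{
	assert(st != NULL);
	st->lowest_stack_seen = SIZE_MAX;
	st->worker_allocs = 0;
}

/*
 * Return true when the allocation should be carried out in a worker
 * thread, i.e. when we do not have more than `threshold` bytes of
 * stack left.  Not thread-safe; a sketch of the logic only.
 */
bool
sketch_should_split(sketch_arena_stats_t *st, size_t remaining,
    size_t threshold)
{
	assert(st != NULL);
	if (remaining < st->lowest_stack_seen)
		st->lowest_stack_seen = remaining;
	if (remaining > threshold)
		return (false);
	st->worker_allocs++;
	return (true);
}
```

The check costs one comparison per allocation, while the expensive hand-off only happens on the deep-stack path.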

* some spl_vmem.c functions are given inline hints

These are small functions with no or very few automatic
variables that were good candidates for clang/llvm's
inlining heuristics before we switched to building
the kext with -finline-hint-functions.

* remove some (unrelated) unused variables which escaped
previous commits, eliminating a couple compile-time
warnings.

Sub-PAGE_SIZE ABDs instead of linear ABDs

Previously, when making an ABD of size less than
zfs_abd_chunk_size (4k by default), we would make a linear
ABD, which would allocate memory out of the zio caches.

The subpage chunks are in multiples of SPA_MINBLOCKSIZE,
with each multiple (up to PAGE_SIZE minus
SPA_MINBLOCKSIZE) having its own kmem_cache. These
kmem_caches are parented to a subpage vmem_cache that
takes 128k allocations from the PAGE_SIZE abd_chunk_cache.
ABDs whose size falls within SPA_MINBLOCKSIZE bytes of
PAGE_SIZE and all larger ABDs are served by the PAGE_SIZE
ABD cache.
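
One reading of the size-to-cache mapping described above can be sketched as follows. The function name and the 4k PAGE_SIZE are assumptions for illustration (page size differs between platforms); SPA_MINBLOCKSIZE is 512 bytes in OpenZFS.

```c
#include <assert.h>
#include <stddef.h>

#define	SKETCH_MINBLOCKSIZE	512UL	/* SPA_MINBLOCKSIZE */
#define	SKETCH_PAGE_SIZE	4096UL	/* assumed for this sketch */

/* One cache per multiple of 512 from 512 up to PAGE_SIZE - 512. */
#define	SKETCH_SUBPAGE_CACHES	\
	(SKETCH_PAGE_SIZE / SKETCH_MINBLOCKSIZE - 1)

/*
 * Return the index of the subpage cache serving an ABD of `size`
 * bytes (0 is the 512-byte cache), or -1 when the PAGE_SIZE chunk
 * cache should be used instead.
 */
int
sketch_subpage_cache_index(size_t size)
{
	assert(size > 0);
	if (size > SKETCH_PAGE_SIZE - SKETCH_MINBLOCKSIZE)
		return (-1);
	/* Round up to the next multiple of the minimum block size. */
	return ((int)((size + SKETCH_MINBLOCKSIZE - 1) /
	    SKETCH_MINBLOCKSIZE) - 1);
}
```

With these assumed values there are seven subpage caches (512 through 3584 bytes), and anything larger than 3584 bytes falls through to the PAGE_SIZE cache.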

Upstream: fix M1 --enable-debug build failure

cannot #pragma diagnostic pop without a matching #pragma
diagnostic push

use -finline-hint-functions and not HAVE_LARGE_STACKS

We appear to have a stack overflow problem.

HAVE_LARGE_STACKS is default.  It drives the decision
about whether (HAVE) or not (!HAVE) to do txg sync context
(frequent) and pool initialization (much less frequent)
zio work in the same thread as the present __zio_execute,
or whether it should be pushed to the head of the line of
zios to be serviced asynchronously by another thread.

Let's not define HAVE_LARGE_STACKS when building the kext
for macOS.

Clang's -finline-hint-functions inlines all functions
explicitly hinted as inline "static inline foo(...) { }"
or equipped with an __attribute__((always_inline)), but
does not inline other functions, even if they are static.

Clang & LLVM's inlining bumps the stack frame size to
include automatic variables in the inlined functions,
growing the stack even for invocations where the inlined
function will not be reached.  This has led to large stack
frames in recursively called functions, notably
dsl_scan_visitbp, which was dealt with by removing the
always inline attribute.

Globally enabling -finline-hint-functions reduces the
number of inlined functions enough to make un-inlining
such functions one by one unnecessary, while still inlining
obvious wins (e.g. tiny functions called frequently from all
over the source tree, such as atomic_add_64_nv()).

remove exponential moving average code

Its utility for tracking relatively long term (~ seconds)
movements of the calculated spl_free value has declined
with our switch to "pure" and away from using macOS kernel
variables which track momentary VM page demand and
consequent changes in ARC.

macOS allows for floating point to be used in kernel
extensions, but this *may* have a cost on ARM in the form
of larger stack frames, which is an acute problem.

This code therefore should go away, rather than be put
behind a compile-time flag.

macOS: fix autoimport script

Use absolute path to zsysctl.

Ensure that the org.openzfsonosx.zpool-import service is loaded before
kickstarting it.

macOS: zfs_resume_fs can panic accessing NULL vp

macOS: silence zvol_in_zvol

and use SET_ERROR() while we are at it.

macOS: zvol_replay must call zil_open

Fix some SPL warnings

* large frame size -> IOMalloc/IOFree
* loss of precision in int -> short by way of bitmasking
* __maybe_unused for the pattern:

const retval r __maybe_unused = f();
ASSERT0(r);

in non-debug builds
* variables used possibly uninitialized

macOS: Silence warning in uio

A race to thread_call_enter1() could deadlock

If multiple concurrent deep-stack allocation requests race
to vmem_alloc(), the "winner" of the race could be
cancelled by one of the other racers, and so the cancelled
"winner" would never see the done flag, and would spend
the rest of eternity stuck in a cv_wait() loop, hanging
the thread that wanted memory.

This commit uses a per-arena busy flag to block the later
racers from reaching thread_call_enter1() until the
race-winner's in-worker-thread memory allocation is
complete.

Additionally, the worker does less work updating stats,
and only takes a mutex around the cv_signal(). The parent
also checks for lost and duplicate cv_signal() error
conditions.

macOS: zvolRename needs to wait

zvolRename needs to wait for IOKit to settle the changes, with a
timeout. Rework the code to reuse the wait logic in a function.

Remove old delay() hack.

Most easily tested with zfs-tester run over:
cli_root/zfs_create, zvol/zvol_cli, zvol/zvol_misc
which results in testpool/vol.33979-renamed "is busy".

macOS: Use vdev_disk_taskq when stack space limited.

As we can trigger stack overflow in the IO path (especially with zvol)
we detect if available space is below tunable spl_vmem_split_stack_below.

In addition to this, remove small kmem_alloc of ldi_buf_t, vdev_buf_t and
ldi_iokit_buf_t for each IO by attaching ldi_buf_t into zio_t.
(See ZIO_OS_FIELDS)

Track lowest stack remaining seen in vdev_disk as kstat

It may be useful to know if we are being handed an
especially deep stack in vdev_disk_io_start(), rather
than just that we have been called with less than
the threshold remaining.

Additionally update variable names for clarity, notably
reflecting that spl_stack_split_below is not just for vmem
any more.

Issue zvol_read/write async when needed

macOS: Address 3 different ways to compress HFS

Handle 2 xattr holding the compressed data stream, and
detect when UF_COMPRESSED is being set, and if file size is
zero we return zero. Makes 'tar' retry the compression.

Upstream: test for -finline-hint-functions

macOS: thread_call_allocate fix for older OS

macOS: add support for sharenfs and sharesmb

macOS: implement kcred for zfs_replay

Some calls used by zfs_replay can not handle a NULL kcred and will
panic. We use available functions to fetch the kernel cred, but
it is somewhat of a hack, as we release it before using.

Alternatively, we could hold reference in zfsvfs_setup() before
calling zil_replay() and release after, with the hopes that
kcred isn't used many other places.

macOS: Fix sysctl macros and missing prototypes

macOS: Add disable_trashes tunable

zfs_disable_trashes

Upstream: wrap in ifdef for SEEK_HOLE

macOS: fix for earlier macOS

also needs:
-               AC_MSG_FAILURE([*** clock_gettime is missing in libc and librt])
+               AC_MSG_RESULT([*** clock_gettime is missing in libc and librt])

and remove -finline-hint-functions

macOS: wrap mkostemp/s in OSX10.12 checks

macOS: bzero the tqe in the allocation construction phase

the tqent_next and tqent_prev fields are random, which causes problems
with the IS_EMPTY() check, causing an assertion in zio_taskq_dispatch

macOS: Clean up vmem_alloc_in_worker_thread for boot hang

vmem_alloc_in_worker_thread() would use a local stack "cb" referenced
by the thread, possibly after the stack was released.

spl_lowest_alloc_stack_remaining would also trigger async calls when
not necessary, each time we got a new low, which was problematic
during spl-kmem startup.

macOS: M1 kext must contain x64 binary

Otherwise notarize fails.

macOS: minor cstyle fix

commit before git bisect

Upstream: fixing all the Makefiles for new build system

Upstream: Changes and fixes to common files

macOS: fixes and updates

macOS: replace bcmp / bcopy / bfree

macOS: cstyle fixes

macOS: compile fixes

Upstream: stop Linux from compiling macOS source files.

It doesn't seem to work.

Upstream: continued makefile fixes

Upstream: Linux squats on mount_zfs, so hack around it

macOS: ZFS_ENTER changes, blake3 tunable

Huh where did zvol_wait go

Fixes for cstyle, make install etc

Remove all _impl_get() functions

apparently we do MODULE_PARAM_VIRTUAL some other way now

Bring back zfs_vdev_raidz_impl_get()

I guess only one vdev_raidz_impl_get

Signed-off-by: Jorgen Lundman <[email protected]>

ABI changes

Signed-off-by: Jorgen Lundman <[email protected]>

Mac: Build without librt

Signed-off-by: Andrew Innes <[email protected]>

Workflow to build OpenZFS on mac

Signed-off-by: Andrew Innes <[email protected]>

spa_activate_os() - just one will be sufficient

Initialize all members of kcf_create_mech_entry()

Silence SPL startup debug

Signed-off-by: Jorgen Lundman <[email protected]>

Add darwin to default.cfg.in

Signed-off-by: Jorgen Lundman <[email protected]>

enum ZFS_PROP and zfs_prop_register must match.

Signed-off-by: Jorgen Lundman <[email protected]>

Attempting to rw_destroy() uninitialized rwlock

Move the rw_init() higher, so all "goto error;" will work.

Make sure kcf_mech_tabs is set to zero.

Definitely unsure what is going on here. The address for the tabs
keeps moving, as in:

class = 2;
printf("kcf_mech_tabs_tab[class].met_tab is %p and class is %d\n",
   kcf_mech_tabs_tab[class].met_tab, class);
printf("kcf_mech_tabs_tab[  2  ].met_tab is %p and class is %d\n",
   kcf_mech_tabs_tab[2].met_tab, class);

kcf_mech_tabs_tab[class].met_tab is 0xffffff7f8b933d90 and class is 2
kcf_mech_tabs_tab[  2  ].met_tab is 0xffffff7f8b930d90 and class is 2
..................................................^

which is most peculiar, so the memory that the 3 tabs set up, is
full of garbage, no empty slot is found and it fails to add the
ciphers, digests and macs.

If we set a struct to nothing, or to = { 0 }; like:

static kcf_mech_entry_t kcf_digest_mechs_tab[KCF_MAXDIGEST] = { 0 };

The original kcf_mech_tabs_tab is all zero, but when we start to use
it and it shifts by 0x30000 it is garbage. This is true even if we
set it to:

    = {{{0}, 0, 0, {0}}}

But for some bizarre reason, if we set the first element to something
not-zero, like:
  = {
       { "NOTUSED", 0, 0, {0}},
       { {0}, 0, 0, {0}}
    }

Then it works, yes the first entry is busy, but [1] and onwards is
finally set to zero, and the pointer does not slide by 0x30000.

I wish I knew why.

Signed-off-by: Jorgen Lundman <[email protected]>

Scripts no longer needed

now everything is built in root

zfs_get_data() can deadlock

Instead of passing ZGET_FLAG_ASYNC from zfs_get_data(), let's
try calling just vnode_get() instead, as it will not msleep().

If we still hang, we need to #ifdef zfs_get_data().

Signed-off-by: Jorgen Lundman <[email protected]>

Remove most of the warnings.

Signed-off-by: Jorgen Lundman <[email protected]>

Remove AARCH64 from M1 compiles.

Sadly, no standard NEON on M1, at least, not yet.

Signed-off-by: Jorgen Lundman <[email protected]>

fixes