Kernel Panic: Kernel Data Abort (OpenZFS 2.1.0, M1 Ultra) #799
Another one:
Another one (different process panicking this time):
lundman just released a new beta version for Monterey on arm64; perhaps you can give it a shot?
The new version doesn't help.
Hi, as of the start of this week I have one of these 128GiB M1 Ultras too, and now know what this problem is. Bear with us, this is the first time we've had a chance to run on a big arm Mac "in anger", and doing kernel extension development on the arm Macs is somewhat different in practice from doing it on Intel Macs. I will commit a fix (and we will put out a package) in a couple of days, but in the meanwhile there is a workaround that should solve this for you entirely.
You can also add the arc_max tunable line, sketched just below, into /etc/zfs/zsysctl.conf. At boot time, by default, /usr/local/sbin/zsysctl will be run and will set the tunable for you. This caps the maximum ARC at 16 GiB. Almost certainly you can dial that up, but keep it well, well below 64 GiB, which is what the default setting (0) uses. You can run the sysctl at any time the kernel extension is loaded and running, and dial it up and down; it should be safe enough to do so as long as you stay sufficiently below half your system RAM. 16 GiB is very conservative; I have tested that value extensively and cannot get the machine to panic.

The problem in a nutshell is that we limit the entire zfs kernel extension (practically, but not exactly, all of its allocations) to half of a system's RAM: in our case, 64 GiB. In the present code, ARC is capped by that 64 GiB rather than by some lower value. This leads to two potential problems.

The first problem is that, because of the constraints on knowing how much memory is available from the kernel to our kernel extension, we are reactive to memory pressure. It's hard to make a 128 GiB Mac run critically low on memory, so on my system I have so far been unable to see a pressure event at all (and I have tried!). As a result, ARC may grow to its maximum extent, 64 GiB, because nothing is telling it to stop growing. Unfortunately, this leaves no room for the other parts of our kernel extension which need memory.

The second problem arises when our code discovers that nearly all 64 GiB are in use. When that is observed, some low-level code generates an artificial pressure event and waits for it to take effect. That signal was not strong (a multiple of the desired allocation size, which translates into between kilobytes and a couple of megabytes). On a machine like the M1 Mac Studios, with their many fast I/O ports and little chance of being CPU-bound when doing I/O, demands for new memory can easily outrace these attempts to free small amounts, leading to a period of stalled allocations. The result is that a client I/O (say, HFS or APFS inside a zvol, our command-line zfs and zpool tools, or running /bin/ls on a dataset) can be delayed for a potentially long time under certain system workloads, and some of those clients are intolerant of such delays. In particular, HFS appears to decide to time out and panic in some cases.

Two probable fixes for this are in the code I am running and testing now, and will in due course be committed to our trees. One change scales the artificial pressure signal up with increasing system memory. The other caps the ARC below the 50%-of-total-extension-memory threshold. This will get rid of this exact panic, which I can reproduce easily (keep the arc_max tunable at the default, and do a Time Machine backup into a zvol on an otherwise quiet system). It will also avoid other panics that can arise from the same root cause.
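A minimal sketch of such a line, assuming the tunable is exposed under the O3X kstat.zfs.darwin.tunable naming scheme (the exact sysctl name here is my assumption; 17179869184 bytes = 16 GiB):

```
# /etc/zfs/zsysctl.conf
# Assumed tunable name; caps the ARC at 16 GiB (17179869184 bytes).
kstat.zfs.darwin.tunable.zfs_arc_max=17179869184
```

The same cap can be applied at runtime while the kext is loaded, e.g. sudo sysctl kstat.zfs.darwin.tunable.zfs_arc_max=17179869184.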
Awesome! I will try that right now and see what happens; do let me know if I can otherwise help with testing in any way. Now if we can just figure out what causes the boot loop when installing the same kext in Ventura... :)
I can use
Great debugging work @rottegift, I really appreciate it.
@dmzimmerman do a "sudo launchctl list | grep -i zfs", and see if it says something like:
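Something like the following, where the PID and status values are hypothetical placeholders:

```
$ sudo launchctl list | grep -i zfs
234     0       org.openzfsonosx.zconfigd
```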
The columns are process ID, status, and launchctl job label. Also try "sudo launchctl list org.openzfsonosx.zconfigd".
And do look in /var/log/org.openzfsonosx.*log and /var/log/org.openzfsonosx.*err to see if there's any evidence of failures (or even of zconfigd starting). zconfigd should be running.
You could run it by hand (use sudo) to see whether it emits any output; it should say at least something like:
@jawbroken: thank you. I'm still working on a few ARM things I discovered, and will land it all at once in a few days. As far as I can tell, on a 128GiB machine it is safe to use a 16 GiB arc_max. With modern compressed ARC, hopefully that's not such a low constraint to suffer with for a few days.
Unfortunately, I have to report that it still panics for me, even with a 16GiB arc_max. It takes a lot longer, though, and I was pushing it pretty hard.
Well, I think I found the problem:
The contents of my
so I really don't understand what's happening there.
Update: tl;dr: the kernel doesn't like giving our kext more than approximately 32 GiB of memory on 128GiB Mac Studios.

Previously, in our second-lowest layer of memory allocation, there was plumbing to wait in a loop for memory to appear in some circumstances. On our systems, those circumstances were always either us running close to 50% of system RAM (a self-imposed limit) or learning from the kernel that system memory was recently (or even currently) in short supply. The waiting would also happen if we received a NULL from the (essentially) malloc "system" call to the kernel. For years we were using a since-deprecated kernel malloc call that was guaranteed never to fail: it would give you the memory requested or panic the system. More recently we switched to IOMallocAligned(), which, it turns out, is allowed to return NULL whenever it likes.

As far as I can tell, we have only ever seen many NULLs being returned from IOMallocAligned() on these 128GiB Mac Studios. (I guess it could have happened on an Intel Mac or Hack with 128GiB+ system RAM, but a brief search showed no complaints suggesting anyone had discovered this bug; it could be that on Intel platforms IOMallocAligned() returns NULL so rarely that it might as well be never. One might never notice the effect of a NULL if retrying produced a successful allocation within a few milliseconds or a few thousand passes through the waiting loop.)

I ran an experiment to simplify our second-lowest allocation layer on macOS 12+ builds. I think the 12+ condition is reasonable: macOS Monterey benefits from years of memory system development by Apple, and additionally all our Mac Studio 128GiB machines, when running macOS, must be running 12+. The results are promising. As expected, ARC grows as quickly as it can, causing (directly and indirectly) us to allocate a little more (!) than 32 GiB of memory, after which we start seeing NULL almost every time we use IOMallocAligned() for more. Nothing seems to break, and ARC correctly transitions into a much more slowly-and-tentatively-growing mode, causing us to use IOMallocAligned() much less frequently.

I can already think of a few ways to try to get past the (slightly-more-than-)32GiB effective limit, but I want to UTSL and /usr/bin/zprint to see exactly how it's enforced on us. However, give or take cosmetic changes, I have a workaround that avoids the panic without the need to set an explicit non-default arc_max at runtime via sysctl. I am likely to commit the change either tonight or Monday. [ETA (pardon the acronym pun): more likely Monday or Tuesday, as I will want to test whether a variety of other Macs and VMs, with different OS vintages, are happy with a total removal of the relevant code, instead of an #ifdef removing it for macOS 12+ only. Over time, how often we have to pester xnu with requests for more memory, or to return memory, has been reduced via changes in the middle layers of our memory management. However, I don't want to make a change that causes panics on 10.11 or even earlier, or on machines (including virtual machines) with only 4 GiB total RAM.]

Until then, setting a sufficiently low arc_max via sysctl will almost certainly prevent the entire kext from attempting to allocate more than 32 GiB of memory, and thus it will not have to deal with these strings of NULLs from IOMallocAligned(). 16GiB is still my best advice for the upper limit.
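To make the failure mode concrete, here is a minimal C sketch of the kind of retry loop described above. This is not the actual O3X code; the function name and the pressure/wait details are illustrative assumptions:

```c
#include <mach/vm_param.h>   /* PAGE_SIZE */
#include <IOKit/IOLib.h>     /* IOMallocAligned, IOSleep */

/* Sketch of a second-lowest-layer allocator that loops while
 * IOMallocAligned() returns NULL. On a 128 GiB Mac Studio, new
 * allocation demand can outrace the small amounts freed by the
 * artificial pressure signal, so callers can stall here for a
 * long time: long enough for clients such as HFS to time out
 * and panic. */
static void *
osif_malloc_sketch(vm_size_t size)
{
    void *buf;

    while ((buf = IOMallocAligned(size, PAGE_SIZE)) == NULL) {
        /* Hypothetical stand-in for generating an artificial
         * pressure event and waiting for it to take effect. */
        IOSleep(1); /* milliseconds */
    }
    return buf;
}
```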
With this experimental code I am seeing a 26GiB ARC at the moment, and I have seen at least as much as kstat.spl.misc.spl_misc.os_mem_alloc: 35,670,573,056 of memory having been obtained from osif_malloc()->IOMallocAligned(). So my kext is using somewhat, but not much, more than 33 GiB (we make some allocations via other kernel interfaces too, and those don't pass through osif_malloc() and therefore don't increment that particular kstat).

@dmzimmerman: the only thing I can think of is that you may have installed a recent package built by lundman without first making sure that a previous install was truly gone. Personally, I prefer /usr/local/zfs/{bin,sbin,libexec,...} to /usr/local/bin et al. I did get surprised in resorting to a package lundman built for me personally for bootstrapping purposes, because I had always built my own installation (rather than using a release) and had put zpool, zfs, etc. in /usr/local/sbin, and so had to update the PATHs of a few of my scripts. However, I had cleaned out my previously installed openzfsonosx code (post-migration-from-Intel-Mac), so zconfigd/zsysctl was doing the right thing. I think (having done a very brief UTSL) that packages have been using /usr/local/zfs since at least May 2021.
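For anyone who wants to watch the same counter on their own system, the kstat named above should be readable via sysctl; the value shown here is just the figure from this comment:

```
$ sysctl kstat.spl.misc.spl_misc.os_mem_alloc
kstat.spl.misc.spl_misc.os_mem_alloc: 35670573056
```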
@rottegift: I think you may have misread my comment (or I don't understand what you're saying in response). My
@dmzimmerman - The output of zsysctl should be identical to what you would get by copying and pasting the non-commented-out lines from /etc/zfs/zsysctl.conf to just after "sudo sysctl ". What you are reporting is unexpected, and not anything I have ever seen from zsysctl. It's also not a complex tool, in coding terms.

I don't know the provenance of your zsysctl. Packaging has never been my department, and it might be that whatever installed your zsysctl correctly put it there (and likewise put all the other userland files in concordant locations). My guess was that your zsysctl is old (and maybe file and otool -L on it can prove this), but that's only a guess. Is there any chance your zsysctl is being run under Rosetta? This is really a question for lundman, and frankly the subject for a different issue number / standalone problem report. (FWIW, he did scratch his head about it yesterday on IRC.)

Yes, you should try going lower than 16GiB. Off the top of my head, I can think of workloads that might drive the ARC-size-to-total-memory-allocated ratio upwards. However, do verify that the sysctl has been set, in case there is any doubt about that. If it is set before import or other activity might exercise ARC, you should see e.g.:
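Output along these lines, where the exact sysctl names are my assumption based on the O3X naming scheme (17179869184 bytes = 16 GiB):

```
$ sysctl kstat.zfs.darwin.tunable.zfs_arc_max kstat.zfs.misc.arcstats.c_max
kstat.zfs.darwin.tunable.zfs_arc_max: 17179869184
kstat.zfs.misc.arcstats.c_max: 17179869184
```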
(The last line reports a variable generated by ARC.)
Fair enough. My
On my M1 Ultra with 128GB RAM, I've been getting panics like the one below.
The interesting thing about them is that what's in the backtrace is com.apple.filesystems.hfs.kext, but I'm pretty sure it's a ZFS panic, and it's just manifesting this way because I have com.apple.mimic turned on.