Never boots, saying rpool is in use from other system #2195

Closed
jgoerzen opened this issue Mar 20, 2014 · 19 comments

@jgoerzen

On every boot on this system, I get this message:

FAIL: zpool import -c /etc/zfs/zpool.cache -N rpool . Retrying...
FAIL: zpool import -N -d /dev/disk/by-id rpool . Retrying...

Command: zpool import -N -d /dev/disk/by-id rpool
Message: cannot import 'rpool': pool may be in use from other system
use '-f' to import anyway
Error: 1

Manually import the root pool at the command prompt then exit.
Hint: Try: zpool import -R / -N rpool

Running:

zpool import -fR / -N rpool; exit

lets the system boot.

This issue was first reported on the mailing list at https://groups.google.com/a/zfsonlinux.org/d/topic/zfs-discuss/RggMKyj-64A/discussion but no resolution was reached. I am unsure if it is a zfs or zfs-pkg bug. I do not appear to have hostid issues. The disk is never in use by another system.

@ryao
Contributor

ryao commented Mar 20, 2014

The problem is that the hostid for the pool in the initramfs does not match the hostid of the actual pool. Run zpool set cachefile= rpool and then rebuild your initramfs. That should clear this problem. If it does not, the initramfs software needs modification to store the system hostid correctly. You could work around it by using the spl.spl_hostid kernel command-line parameter to override it, should your initramfs software support that.
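
For concreteness, that sequence would look roughly like this on Debian (update-initramfs is the generator mentioned later in this thread; other distributions use dracut or genkernel):

    zpool set cachefile= rpool    # reset cachefile to the default so /etc/zfs/zpool.cache is regenerated
    update-initramfs -u -k all    # rebuild the initramfs so it ships the fresh cachefile

The kernel command-line override would be something like spl.spl_hostid=0x12345678, where the value here is only a placeholder.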

@jgoerzen
Author

I've rebuilt the initramfs numerous times. The hostid command from within the initramfs, the SPL message in dmesg from mere seconds after boot, and the hostid command on the running system all show the same hostid as well.
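
For reference, those checks (plus the pool-label check suggested later in this thread) can be spelled out as follows; the disk path is illustrative:

    hostid                                        # hostid of the running system
    dmesg | grep -i hostid                        # the SPL hostid message from early boot
    zdb -l /dev/disk/by-id/<disk> | grep hostid   # hostid stamped on the pool's labels

All of these should agree; a mismatch is what makes zpool import demand -f.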

zpool set cachefile= rpool, by itself, does nothing. If I then rebuild the initramfs, things progress a little further. However, it still doesn't boot, with this message:

FAIL: zpool import -c /etc/zfs/zpool.cache -N rpool . Retrying...
FAIL: zpool import -N -d /dev/disk/by-id rpool . Retrying...

Command: zpool import -N -d /dev/disk/by-id rpool
Message: cannot import 'rpool': a pool with that name is already created/imported,
and no additional pools with that name were found
Error: 1

followed by the same hints.

This time, I just have to run exit to boot.

One other note: something keeps changing cachefile back to none on this pool; it does not stay at the empty string.

@jgoerzen
Author

@FransUrbo You might also be interested in this discussion.

@ryao
Contributor

ryao commented Mar 20, 2014

The command I provided should regenerate zpool.cache. Which distribution and initramfs generator are you using? It looks like the software is not able to handle verbatim import from the cachefile. In that case, making an empty cachefile would work. I believe that I solved this problem in Gentoo by modifying genkernel to autodetect verbatim import from the cachefile and skip that step.
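
As a sketch, shipping an empty cachefile could look like this (whether the initramfs scripts then skip the cachefile import depends on the generator):

    zpool set cachefile=none rpool   # stop ZFS from rewriting the cachefile
    : > /etc/zfs/zpool.cache         # truncate the cachefile to zero bytes
    update-initramfs -u -k all       # rebuild so the initramfs carries the empty file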

@jgoerzen
Author

This is Debian with its default initramfs generator (hence the @FransUrbo cc).

It was working fine with the 0.6.2 support; the 0.6.3 dailies have shown this breakage.

@jgoerzen
Author

How could I go about debugging hostid issues?

@FransUrbo
Contributor

In the current released version, the initrd would first try to import the pool without using force. If that failed, it would then try a forced import.

In my dailies, I've removed that (because it's considered, if not 'evil', at least 'bad practice'). This unfortunately means that more and more people are reporting import failures. It only affects people who have their root on ZFS.

The reason for this is that the pool isn't exported properly when the system shuts down. The new init scripts in the dailies do try to do this correctly (and that code is sound!). Unfortunately, since one is booting from the filesystem/pool, it cannot be exported (because it is in use by the very script that tries to export it!).

Technically, this is not a problem, because the filesystem is remounted read-only a few moments earlier. It only becomes a problem at the next import (it will be reported as 'in use by another system', because the pool wasn't exported properly).
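
Conceptually, the failing ordering looks like this (a sketch of the sequence described above, not the literal init script):

    zpool export rpool      # attempted at shutdown; fails because rpool still backs the running root
    mount -o remount,ro /   # what effectively happens instead: root goes read-only, pool stays imported
    # result: no clean export, so the next boot reports "pool may be in use from other system"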

I have no real, good way of solving this, unfortunately :(. @behlendorf mentioned somewhere (a couple of years ago) that the hostid 'stuff' will eventually be removed (because it no longer serves a purpose, if I remember correctly). Then this might go away.

But until then, adding the 'zfsforce=yes' option on the kernel command line will help. It is not the good and proper way, but it will work. And if you ONLY use your pool on one single computer with only one OS (as opposed to importing it on multiple computers with many different operating systems), there won't be any problem.
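
On a GRUB-based setup that typically means editing /etc/default/grub and regenerating the config; the boot=zfs value below is only a placeholder for whatever options are already present:

    GRUB_CMDLINE_LINUX="boot=zfs zfsforce=yes"   # in /etc/default/grub
    update-grub                                  # regenerate grub.cfg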

@behlendorf behlendorf added this to the 0.6.5 milestone Mar 21, 2014
@behlendorf behlendorf added the Bug label Mar 21, 2014
@behlendorf
Contributor

Yes, I'd like to remove the existing hostid implementation in favor of proper multi-mount protection. That work is described in #745 and should resolve this issue; however, we haven't yet scheduled anyone to do it.

@jgoerzen
Author

Thanks everyone for your help.

The way I see it, there are at least these three open questions:

  1. Why does zpool import fail without -f, given that everything I can see suggests that the hostid matches everywhere?

  2. Why does zpool set cachefile= work around that first problem?

  3. Why does the system claim the pool is already imported after setting cachefile=?

@FransUrbo, have you seen that third problem anywhere before? I can readily duplicate all of these.

@agijsberts

Are there any updates on this issue?

This still seems to be relevant with release 0.6.3: I only manage to boot from ZFS without errors (the same ones jgoerzen reported) if I build the initramfs without zpool.cache (and with zfsforce=1). It even gives the error when I write an explicit /etc/hostid and also include that file in the initramfs. Like jgoerzen, I double-checked that this hostid matches the pool's hostid and the hostid used by SPL during boot.

@behlendorf
Contributor

@agijsberts For reasons like this, in 0.6.3 we've set the default hostid to 0, which disables the hostid check. What you're going to want to do is make sure the hostid for your system gets set to 0 on boot by removing your /etc/hostid file. Then force import the pool and export it. At this point your pool should no longer contain a specific hostid and will no longer perform this check. You can verify that's the case by running zdb -l <device> on any of the disks and making sure there is no hostid entry listed.
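
Spelled out as commands (the disk path is illustrative; a root pool would need to be exported from an environment where it is not busy, e.g. a rescue system):

    rm /etc/hostid                                # system hostid falls back to 0
    zpool import -f rpool                         # one last forced import
    zpool export rpool                            # a clean export clears the hostid from the labels
    zdb -l /dev/disk/by-id/<disk> | grep hostid   # should now print nothing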

If you need to run a fail over configuration in the future you'll need to explicitly create /etc/hostid files to enable this support. See openzfs/spl#224.

@agijsberts

@behlendorf Thanks for your suggestions; they sound like the (default) setup I had originally. To be sure, I removed /etc/hostid and rebuilt the initramfs. After export/import, the pools no longer have any hostid attached and SPL reports hostid=00000000. Unfortunately, I still get the following error (the same as the second one reported by jgoerzen):

FAIL: zpool import -c /etc/zfs/zpool.cache -N zroot -f. Retrying...
FAIL: zpool import -N -d /dev/disk/by-id zroot -f . Retrying...

Command: zpool import -N -d /dev/disk/by-id zroot -f
Message: cannot import 'zroot': a pool with that name is already created/imported,
and no additional pools with that name were found
Error: 1

At this point the pools are actually mounted and I can resume system boot simply by exiting the emergency shell (CTRL-D), so to my untrained eye it appears that it tries to import the pool twice.

So far the only ways I have found to avoid this error are either (1) to build the initramfs without zpool.cache or (2) to explicitly set /etc/hostid. It might be a user error somewhere, but I'm drawing a blank as to what it could be.

@StephanieSunshine

I just did an install this morning using Linux Mint Debian (MATE) 64-bit rolling release with a ZFS (0.6.3) root, and I'm experiencing this problem as well. I have tried removing /etc/hostid, adding zfsforce=1, and zpool set cachefile= as ryao had suggested, and nothing is working. Every boot I'm forced to type zpool import -f -N rpool ; exit to get it started.

I tried zdb -l | grep hostid and I see nothing. I did notice that hostname was set to '(none)'; could this be a problem?

Anyone else have any other suggestions?

@agijsberts

@FuzzySunshine This refers to a different problem than the one discussed here, but it seems you are trying to boot from the ZFS root dataset: did you try removing the trailing slash from the cmdline in grub.cfg? (See zfsonlinux/grub#15.) Also make sure to try rebuilding the initramfs without zpool.cache.

@StephanieSunshine

@agijsberts Thank you for replying :)

I looked at the bug you linked and I don't think it applies, because I ended up writing my own grub.cfg line: " linux /vmlinuz-3.11-2-amd64 bootfs=rpool/ROOT/debian-live-1 boot=zfs ro ". I did try deleting zpool.cache and recreating the initramfs (update-initramfs -u -k all), with no luck. I did manage to figure out that when the boot does bork, if I force import and then export rpool and then reboot instead of exiting, the next boot completes just fine without any interaction.

Any suggestions?

@agijsberts

@FuzzySunshine You're right, in your case the cmdline issue does not apply. Make sure, though, to include zfsforce=1 in the cmdline; this is absolutely required (see FransUrbo's comment above).

I'm not a ZFS developer, so unfortunately I can merely suggest the things that helped in my case. As a temporary solution, you can also try writing /etc/hostid explicitly. Then export+import your pools, double-check with zdb that the hostid has been set (iirc zdb converts the bytes to decimal), and recreate the initramfs. ZFS is moving away from this reliance on hostid, but at least this might help you right now (it worked in my case). If it doesn't, you might want to move the issue to the mailing list, where more people might see it.
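
A sketch of that procedure: glibc reads /etc/hostid as four native-endian bytes, so on a little-endian machine a hostid of 0xdeadbeef (an arbitrary example value) is written least-significant byte first:

    printf '\xef\xbe\xad\xde' > /etc/hostid   # hostid is now 0xdeadbeef
    hostid                                    # verify: prints deadbeef
    zpool export zroot && zpool import zroot  # re-stamp the pool (from an environment where it is not busy)
    update-initramfs -u -k all                # carry the same /etc/hostid into the initramfs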

@l1k
Contributor

l1k commented Oct 6, 2014

Likely fixed by #2766 if Dracut is used.

behlendorf pushed a commit to behlendorf/zfs that referenced this issue Oct 7, 2014
Make use of Dracut's ability to restore the initramfs on shutdown and
pivot to it, allowing for a clean unmount and export of the ZFS root.
No need to force-import on every reboot anymore.

Signed-off-by: Lukas Wunner <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue openzfs#2195
Issue openzfs#2476
Issue openzfs#2498
Issue openzfs#2556
Issue openzfs#2563
Issue openzfs#2575
Issue openzfs#2600
Issue openzfs#2755
Issue openzfs#2766
@behlendorf
Contributor

The combination of d94fd5f and 07a3312, which are now in master, should resolve this issue.

@behlendorf behlendorf modified the milestones: 0.6.4, 0.6.5 Oct 31, 2014
ryao pushed a commit to ryao/zfs that referenced this issue Nov 29, 2014
@behlendorf
Contributor

Closing, this was fixed in master.
