Overcommitting storage silently corrupts data #652
-
Thanks for the report. We believe that this problem has been addressed in subsequent stratisd releases, which are available in RHEL 9. Here is some documentation on the design changes that we made in Stratis 3 to make extension of Stratis filesystems more robust: https://stratis-storage.github.io/thin-provisioning-redesign/ . If you reproduce the same problem in a more recent version of Stratis, please let us know; we will be eager to investigate. Thanks!
-
@mulkieran This is NOT fixed on RHEL 9.2 with Stratis 3.4.1. I just re-ran the steps in my original post on a fresh system and see the same behavior. I've updated my original post to include those versions.
-
@crackerjam Can you confirm that The pool itself is supposed to maintain information about itself, and can give some warnings. When you first create the pool, it will assess its state and may issue a warning, which will be seen with the
-
@mulkieran Correct, /dev/sdb is a 2GiB disk.
That doesn't seem to be correct. If I just format the raw disk with XFS directly and try to run the copy, I receive:
Think of a scenario where you have a Stratis volume that receives and stores critical business documents. Sure, ideally you would have monitoring set up for disk usage, but let's say you don't. With any other filesystem, your application would start receiving errors when the disk is full. Users will see that something's going on; it will be clear that things aren't working properly. Things will grind to a halt, but you won't lose data. With this issue, everything just appears to silently work. Files show up on disk and report correct sizes. However, when you read the files, they're junk. Depending on the use case, you could go weeks or months without noticing that all of your data is just evaporating. This should be treated as a critical-priority issue.
-
@crackerjam In a case like this, you would be advised to create a pool with the no-overprovisioning mode set. In such a pool, the size of the XFS filesystem cannot be made any larger than the actual space that is available to it. In that case, the filesystem will behave like the one that you placed on the raw device. For the raw device, check the filesystem size. You'll notice that it is significantly smaller than the default filesystem size, 1 TiB, that Stratis creates when overprovisioning is allowed.
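The size check suggested above could be done along these lines. This is a minimal sketch, assuming GNU `df` (for the `--output` option); `fs_size_bytes` is a hypothetical helper name, not part of Stratis.

```shell
# fs_size_bytes PATH: print the total size, in bytes, reported by the
# filesystem that contains PATH. Relies on GNU df's --output option.
fs_size_bytes() {
    # -B1 selects 1-byte blocks; the first output line is a header,
    # so keep only the last line and strip the padding spaces.
    df -B1 --output=size "$1" | tail -n 1 | tr -d ' '
}

# Example: report the size of the filesystem holding the current directory.
echo "filesystem size: $(fs_size_bytes .) bytes"
```

Running `fs_size_bytes /mnt/test` on the reproduction system and comparing the result with the size of the backing device (for example from `lsblk -b -d -n -o SIZE /dev/sdb`) would show whether the filesystem advertises more space than the pool can actually supply.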
-
@mulkieran I don't think that's really an appropriate stance. What is 'a case like this'? A case where you don't want your files to get corrupted? No filesystem, ever, should report back that a write succeeded when it did not, regardless of options used on creation. That's just a core tenet of how filesystems work. It is especially true here, when I'm using the default options. I have to be honest, I only discovered Stratis as part of some RHEL training. Personally, and professionally, if I ever get a whiff of anyone thinking about using Stratis I will tell them to steer clear due to how this bug is being handled. In my opinion, this is a massive issue that needs to be resolved if you want this to be taken seriously as a replacement for LVM, and I can only imagine how many other data-destroying bugs there are that have been ignored because "well you should have just used the right command arguments". If you'd like any more assistance testing scenarios around this, please feel free to let me know. Otherwise, it seems like you have a particular opinion here that I probably won't be able to change.
-
@crackerjam I'm afraid that you are failing to understand how filesystems work. To produce an analogous situation with LVM, simply create a thin pool on your 2 GiB device, construct a thin device of, say, 1 TiB on that thin pool, create an XFS filesystem on it, and do your cp as before. The cp will return success, but the file will not be copied in toto. It is surprising to some people that that is how filesystems behave, but it is a consequence of the original design choices made for filesystems in a simple world that did not include thinly provisioned devices supplied by kernel modules. If you want to do a copy that is sure to fail if there is really no place to write the bytes, you will have to explicitly sync your data.
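A copy with an explicit sync, as described above, might look like the following. This is a minimal sketch, assuming GNU `dd` (for `conv=fsync` and `status=none`); `safe_copy` is a hypothetical helper name, and whether the error surfaces on a given kernel and storage stack is not something this thread confirms.

```shell
# safe_copy SRC DST: copy SRC to DST, force the data to stable storage,
# then verify the result. Returns 1 if the copy or sync fails, 2 if the
# contents differ afterwards.
safe_copy() {
    src=$1
    dst=$2
    # conv=fsync makes dd call fsync() on the output file before exiting,
    # so a writeback failure can be reported as a non-zero exit status
    # instead of being lost after the process has already exited.
    dd if="$src" of="$dst" bs=1M conv=fsync status=none || return 1
    # Byte-for-byte comparison as a second check.
    cmp -s "$src" "$dst" || return 2
}
```

In the reproduction scenario this would be called as `safe_copy /root/CentOS-7-x86_64-Everything-2009.iso /mnt/test/CentOS-7-x86_64-Everything-2009.iso`, with the exit status checked by the caller. The design choice here is to sync through the same file descriptor that did the writing, rather than running a separate `sync` afterwards.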
-
Now on RHEL 9.3, the problem is still not fixed, and if we use the "no-overprovision" option, the file system size cannot grow any more. This is a very damning point. It's almost impossible for me to use Stratis in a production environment.
-
Well, I am surprised to see this, but it is indeed still an issue with the 3.6.8 daemon and the 3.6.2 CLI. The file system itself is not corrupted; rather, the file that appeared to be copied successfully was truncated or zero-sized.
-
Trying to copy a 10GB file to a 2GB Stratis filesystem appears to succeed, but in reality causes the file to be corrupted. To reproduce:
stratisd and stratis-cli (version 3.4.1):
yum -y install stratisd stratis-cli; systemctl enable --now stratisd
stratis pool create testpool /dev/sdb
stratis filesystem create testpool testfilesystem
mkdir -p /mnt/test
/etc/fstab:
/dev/stratis/testpool/testfilesystem /mnt/test xfs defaults,x-systemd.requires=stratisd.service 0 0
mount -a
wget -P /root http://mirrors.rit.edu/centos/7.9.2009/isos/x86_64/CentOS-7-x86_64-Everything-2009.iso
cp /root/CentOS-7-x86_64-Everything-2009.iso /mnt/test
md5sum /root/CentOS-7-x86_64-Everything-2009.iso
md5sum /mnt/test/CentOS-7-x86_64-Everything-2009.iso
These checksums will come back different, indicating data corruption.
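The final comparison can be scripted so that a mismatch produces a non-zero exit status. This is a minimal sketch; `same_md5` is a hypothetical helper name, and it only compares digests of whatever the reads return.

```shell
# same_md5 FILE_A FILE_B: return 0 if both files have the same MD5 digest,
# 1 if the digests differ, 2 if either file cannot be read.
same_md5() {
    [ -r "$1" ] && [ -r "$2" ] || return 2
    # Reading from stdin keeps the file name out of md5sum's output,
    # so the first field is just the digest.
    sum_a=$(md5sum < "$1" | cut -d ' ' -f 1)
    sum_b=$(md5sum < "$2" | cut -d ' ' -f 1)
    [ "$sum_a" = "$sum_b" ]
}
```

For the steps above, `same_md5 /root/CentOS-7-x86_64-Everything-2009.iso /mnt/test/CentOS-7-x86_64-Everything-2009.iso || echo "checksum mismatch"` would flag the corrupted copy.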