
Attempt to write saved data in a resilient manner #1001

Merged
merged 9 commits into 1.21.x from feature/resillient-io on Jul 17, 2024

Conversation

FiniteReality
Member

I'm implementing a naive solution here for saving NBT data in a resilient manner, to prevent data corruption when a server/game crashes during saving. This should prevent data attachments (especially large ones) from being corrupted, as any unfinished writes are directed to a temp file instead. These temp files are automatically cleaned up on load, if detected, using a naive algorithm which likely has terrible performance characteristics.

This should resolve #775, at least partially. It would be better if we could guarantee the write goes through, but if the JVM hard crashes (e.g. access violation, power loss) I don't see a way around needing something like this.
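In other words: write everything to a sibling temp file, force it to disk, then atomically move it over the old file, so a crash mid-write can only ever lose the temp file. A minimal sketch of that idea (hypothetical names, not the actual patch):

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Illustrative sketch of the temp-file-then-move approach described above.
public final class ResilientWrite {
    public static void write(Path target, byte[] data) throws IOException {
        Path temp = target.resolveSibling(target.getFileName() + ".tmp");
        try (FileChannel channel = FileChannel.open(temp,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            channel.write(ByteBuffer.wrap(data));
            channel.force(true); // ensure the bytes actually reach the storage device
        }
        // Replace the old file only once the new contents are fully on disk;
        // a crash before this point leaves the old file untouched.
        Files.move(temp, target,
                StandardCopyOption.REPLACE_EXISTING, StandardCopyOption.ATOMIC_MOVE);
    }
}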

@neoforged-pr-publishing

neoforged-pr-publishing bot commented May 25, 2024

  • Publish PR to GitHub Packages

Last commit published: e8ac624b542217cac433f4f15517566bf5ab851a.

PR Publishing

The artifacts published by this PR:

Repository Declaration

In order to use the artifacts published by the PR, add the following repository to your buildscript:

repositories {
    maven {
        name 'Maven for PR #1001' // https://github.com/neoforged/NeoForge/pull/1001
        url 'https://prmaven.neoforged.net/NeoForge/pr1001'
        content {
            includeModule('net.neoforged', 'testframework')
            includeModule('net.neoforged', 'neoforge')
        }
    }
}

MDK installation

In order to set up an MDK using the latest PR version, run the following commands in a terminal.
The script works on both *nix and Windows, as long as you have the JDK bin folder on your PATH.
The script will clone the MDK into a folder named NeoForge-pr1001.
In PowerShell, you will need to remove the -L flag from the curl invocation.

mkdir NeoForge-pr1001
cd NeoForge-pr1001
curl -L https://prmaven.neoforged.net/NeoForge/pr1001/net/neoforged/neoforge/21.0.107-beta-pr-1001-feature-resillient-io/mdk-pr1001.zip -o mdk.zip
jar xf mdk.zip
rm mdk.zip || del mdk.zip

To test a production environment, you can download the installer from here.

@FiniteReality
Member Author

I'm not 100% sure if putting the patch in NbtIo is the best place, as there are certain SavedData implementations which themselves do this kind of resilient saving - the player's saved data is one example. It may be worth moving the implementation into SavedData directly, rather than NbtIo. Thoughts?

@Technici4n
Member

I wanted to look into how vanilla handles this for its main level data file. There is some convoluted logic to make it work in many corner cases.

@FiniteReality
Member Author

> I wanted to look into how vanilla handles this for its main level data file. There is some convoluted logic to make it work in many corner cases.

Looking at it, it seems that it uses the same trick (RegionFile#writeToExternalFile creates a temp file, writes to it, and then moves it into place) for oversize chunks, but the main region file is simply a FileChannel created with StandardOpenOption.DSYNC, which forces synchronous writes. According to the NIO docs:

The SYNC and DSYNC options are used when opening a file to require that updates to the file are written synchronously to the underlying storage device. In the case of the default provider, and the file resides on a local storage device, and the seekable channel is connected to a file that was opened with one of these options, then an invocation of the write method is only guaranteed to return when all changes made to the file by that invocation have been written to the device. These options are useful for ensuring that critical information is not lost in the event of a system crash. If the file does not reside on a local device then no such guarantee is made. Whether this guarantee is possible with other provider implementations is provider specific.

My guess is that internally this acts as a write-through for any caches and buffers for the region file that's open. Everything else on top of this is just shuffling I/O around so that it doesn't happen on the main thread, as they become blocking writes.
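For reference, opening such a channel looks something like this (a sketch, not vanilla's actual code):

import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch: with DSYNC, each write() only returns once the data (though not
// necessarily the file metadata) has reached the underlying storage device,
// at least for files on local file systems.
static FileChannel openSynchronousChannel(Path file) throws IOException {
    return FileChannel.open(file,
            StandardOpenOption.CREATE, StandardOpenOption.READ,
            StandardOpenOption.WRITE, StandardOpenOption.DSYNC);
}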

Since we're writing to a temporary file and moving that into place, all buffers should be appropriately flushed before the move completes anyway. We could consider moving to an I/O pool but I don't think this is necessary, and would be a rather invasive patch in comparison.

@Technici4n
Member

Technici4n commented May 26, 2024

Doesn't MC also have some recovery system where it attempts to read files that are updated but haven't been moved yet? If it's just falling back to LevelDirectory#oldDataFile it's not as fancy as I remember...

The atomic move should be good enough.

@FiniteReality
Member Author

> Doesn't MC also have some recovery system where it attempts to read files that are updated but haven't been moved yet?

I couldn't find anything fancy like that. I assume they couldn't do that anyway, because it would risk data loss; a partially completed save could still have partially valid data and load successfully, even though the older file is more likely to be correct.

@FiniteReality force-pushed the feature/resillient-io branch from fbc0902 to 94148f2 on June 2, 2024 20:50
@FiniteReality marked this pull request as ready for review on June 2, 2024 20:51
@FiniteReality force-pushed the feature/resillient-io branch from 94148f2 to 4479ecb on June 5, 2024 18:53
@Matyrobbrt added the "enhancement: New (or improvement to existing) feature or request" label on Jun 11, 2024
@FiniteReality force-pushed the feature/resillient-io branch from 4479ecb to 36c2e54 on June 16, 2024 02:01
@FiniteReality changed the base branch from 1.20.x to 1.21.x on June 16, 2024 02:02
@FiniteReality added the "1.21: Targeted at Minecraft 1.21" label on Jun 16, 2024
@sciwhiz12 requested a review from Technici4n on July 5, 2024 02:02
@Technici4n previously approved these changes on Jul 15, 2024
@Technici4n requested a review from sciwhiz12 on July 15, 2024 08:40
@FiniteReality removed the request for review from sciwhiz12 on July 16, 2024 21:24
@shartte self-requested a review on July 16, 2024 21:26
@Technici4n
Member

Does closing the channel not flush it already?

@shartte
Contributor

shartte commented Jul 17, 2024

> Does closing the channel not flush it already?

No, not that we know of. You are thinking of a Java-level flush; we need an OS-level fsync.
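A minimal illustration of the difference (illustrative only, not the actual patch):

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative: closing a channel releases the Java-side handle, but only
// force() performs an OS-level fsync that persists the bytes to the device.
static void writeAndSync(Path path, ByteBuffer data) throws IOException {
    try (FileChannel channel = FileChannel.open(path,
            StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
        channel.write(data);
        channel.force(true); // fsync file contents and metadata
    } // close() alone does not guarantee the data has hit the disk
}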

@Technici4n
Member

Ok. Just the extra parentheses and we're good then.

@shartte requested a review from Technici4n on July 17, 2024 19:54
@shartte enabled auto-merge (squash) on July 17, 2024 19:56
@shartte merged commit 2c72560 into neoforged:1.21.x on Jul 17, 2024
4 checks passed
raoulvdberge added a commit to refinedmods/refinedstorage2 that referenced this pull request on Jul 20, 2024:

We no longer need to implement this ourselves for NeoForge. For Fabric, keep the existing patch.

See: neoforged/NeoForge#1001
Labels
1.21: Targeted at Minecraft 1.21
enhancement: New (or improvement to existing) feature or request
Development
Successfully merging this pull request may close these issues:
Add more protections to SavedData for data corruption

5 participants