Datasets use atomic write when persisting to disk #307

Merged
merged 2 commits into from
Nov 11, 2024

Commits on Nov 8, 2024

  1. Datasets use atomic write when persisting to disk

    Partially fixes: #1954
    
    Prior to this change, dataset.download() used a plain file write to
    persist downloaded data to disk.
    
    This meant another process or thread could check for the file and
    attempt to read it before the full content was written.
    
    This change writes to a temporary file and then renames it over the
    target, so the file is updated atomically.
    
    If a process is already reading a file that the new version overwrites,
    the previous file node is unlinked rather than being overwritten, so the
    read will work as expected.
    
    This will allow us to back out the optimistic pre-loading of dataset
    data before it is needed, which causes 404 errors when running tests on
    machines without the appropriate permissions to download UK data.
    EC2 Default User committed Nov 8, 2024
    3408b5d
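
The temporary-file-plus-rename approach described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the helper name `atomic_write` and its signature are hypothetical, and the sketch assumes the temporary file lives in the same directory as the target so that the rename stays on a single filesystem.

```python
import os
import tempfile
from pathlib import Path


def atomic_write(path, data: bytes) -> None:
    """Hypothetical helper: write `data` to `path` without exposing a partial file."""
    path = Path(path)
    # Create the temporary file next to the target so the final rename
    # does not cross a filesystem boundary.
    fd, tmp_name = tempfile.mkstemp(
        dir=path.parent, prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Push the bytes to disk before the file becomes visible
            # under its final name.
            os.fsync(tmp.fileno())
        # os.replace swaps the name in one step; readers see either the
        # old file or the complete new one.
        os.replace(tmp_name, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

On POSIX systems, a reader that already has the old file open keeps reading the old contents, because the rename only changes which file the name points to. A caller would use it as `atomic_write("data.h5", payload)`, where the file name here is purely illustrative.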

Commits on Nov 11, 2024

  1. e49633f