some copied files are corrupted (chunks replaced by zeros) #15526
I'm also hit by this bug, happened after upgrade to ZFS 2.2.0. Mounting tmpfs on portage TMPDIR (build directory for go) solves the issue. Currently on kernel 6.5.9, originally happened on 6.5.7. No idea if it's worth noting, but the remaining data in the files seems to be a repetition of the same base64 encoded data. Decoding it gave me nothing comprehensible. |
After downgrading coreutils from 9.3 to 8.32, I am no longer able to reproduce this corruption. My understanding is 8.32 predates the switch to automatically using reflink?
This is the Go BuildID. Compare the output of a non-corrupted build:
With the data remaining in the file:
|
Had the same issue here... Upgraded from 2.1.13 to 2.2.0 last month right after it came out. I'm on Ubuntu and built the ZFS package from the 2.2.0 tag. I am discovering random files that were being stored after the 2.2.0 upgrade have suffered silent corruption and either have repetitious data like @thulle reported or have large blocks of contiguous zeroes instead of the data that they should be filled with. I don't use or build
|
This seems to solve the corruption issue on my end too. |
#11900 (comment) onwards is highly relevant. See also #14753 and #11900 as a whole. It would be interesting to know if people here have all upgraded their pool for the new block cloning feature or if there's a mix here. |
feature@block_cloning active here |
My understanding is that block cloning is enabled by default when upgrading to 2.2.0, but may not be getting used unless an application like Can people post the result of this:
(where |
|
My understanding is this: If the result is Any non-zero number for either field and you may OR may not have silent corruption. |
That is likely but not really safe to say yet, we don't have enough data. |
Is it accurate to say it's being used when the FICLONE ioctl is used, and possibly when |
I wouldn't even feel comfortable saying that yet, I agree it's likely, but I'd really like more certainty before declaring "you're safe if not this". |
Agreed, we don't know what we don't know, we need to:
I don't know if there's a more robust way of figuring this out outside of that. And of course I'm handwaving 2 which could be a heavy lift. |
I'd assume, not having looked, that it's something like copy_file_range isn't dirtying things so the thing that triggers a force txg sync on SEEK_DATA/SEEK_HOLE with a dirty thing isn't firing. But as I'm somewhat occupied recovering from being badly ill at home, my time to test and debug this is rather finite, so I wouldn't assume I'll be doing that in a timely fashion. |
I agree with @rincebrain on the rough theory. I did look, and landed on code I've looked at before while trying to get my head around corner cases in block cloning. I have been suspicious of it for a while now, and remain so. When we start to clone a block, we indicate intent to modify the target dbuf by calling In
void
dmu_buf_will_clone(dmu_buf_t *db_fake, dmu_tx_t *tx)
{
    dmu_buf_impl_t *db = (dmu_buf_impl_t *)db_fake;

    /*
     * Block cloning: We are going to clone into this block, so undirty
     * modifications done to this block so far in this txg. This includes
     * writes and clones into this block.
     */
    mutex_enter(&db->db_mtx);
    VERIFY(!dbuf_undirty(db, tx));
    ASSERT0P(dbuf_find_dirty_eq(db, tx->tx_txg));
    if (db->db_buf != NULL) {
        arc_buf_destroy(db->db_buf, db);
        db->db_buf = NULL;
    }
    mutex_exit(&db->db_mtx);

    dmu_buf_will_not_fill(db_fake, tx);
}

void
dmu_buf_will_not_fill(dmu_buf_t *db_fake, dmu_tx_t *tx)
{
    dmu_buf_impl_t *db = (dmu_buf_impl_t *)db_fake;

    mutex_enter(&db->db_mtx);
    db->db_state = DB_NOFILL;
    DTRACE_SET_STATE(db, "allocating NOFILL buffer");
    mutex_exit(&db->db_mtx);

    dbuf_noread(db);
    (void) dbuf_dirty(db, tx);
}
It seems there's a window there where the lock is down and the dbuf is not-dirty. That may be a place where a second thread can add a change to that block, which then gets trampled. Unfortunately
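To make the suspected window concrete, here is a toy, single-threaded model of the ordering described above. It is only a sketch: `toy_dbuf`, `toy_undirty`, `toy_redirty` and `toy_looks_clean` are made-up names, not OpenZFS code, and it does not show that this window is the actual cause of the corruption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a dbuf; none of these names exist in OpenZFS. */
enum toy_state { TOY_CACHED, TOY_NOFILL };

struct toy_dbuf {
    bool dirty;             /* is there a pending change in this txg? */
    enum toy_state state;
};

/* Step 1: models the locked part of dmu_buf_will_clone(): drop dirty state. */
void
toy_undirty(struct toy_dbuf *db)
{
    assert(db != NULL);
    db->dirty = false;
}

/* Step 2: models dmu_buf_will_not_fill(): mark NOFILL, then dirty again. */
void
toy_redirty(struct toy_dbuf *db)
{
    assert(db != NULL);
    db->state = TOY_NOFILL;
    db->dirty = true;
}

/* What another thread asking "anything pending here?" would conclude. */
bool
toy_looks_clean(const struct toy_dbuf *db)
{
    assert(db != NULL);
    return (!db->dirty);
}
```

Anything that inspects the buffer between step 1 and step 2 sees it as clean, even though a clone into it is in flight. In the real code those two steps are separated by a `mutex_exit()`, which is the gap being pointed at.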
|
With FreeBSD 14.0-RELEASE, there's complementary use of the
Whilst 14.0 is not yet announced, I can't imagine a change to the value at this time. |
Time to upstream that and add Linux support, with how broken this is. |
I am curious if it is broken in the same way on FreeBSD. Has anyone managed to reproduce this with intentional reflink copies? From what I recall the implementer's main platform was FreeBSD. Also some of these stack traces are blown assertions in the ZIL. Are we certain all of these are in regard to the BRT feature? I know a lot of the more recent optimizations with regard to ZIL messed with lock granularity, do we know that these things haven't happened with BRT not being leveraged? |
Here is probably not the right place to debate whether the other bugs around the ZIL are or are not from BRT. At least in 15529, I only tagged the ones where it was explicitly the case that block cloning was happening. I'm also not sure what you mean by "intentional reflink support" here - in both FreeBSD and Linux's case, they're calling a copy_file_range analogue that is going to reflink if it can and do a boring copy otherwise, nobody here is explicitly trying FICLONE, which has last I checked no analogue on FreeBSD at all. |
The |
By intentional I mean doing cp with a reflink argument, assuming the BSD variant of cp has it. I'd expect that by default it doesn't try to use it automatically like coreutils' --reflink=auto behavior. When something like make is copying things around with quickly generated build artifacts, I expect you're dealing with code that is more likely to hit race conditions than when a file that's been stable on disk for a few seconds is copied once with the BRT-backed clone feature. |
My understanding is that cp on FreeBSD just invokes copy_file_range in every case, and as I said, has no FICLONE analogue. |
Ah gotcha, so the benefits of the reduced syscall are there without requiring repeated userspace / kernelspace round trips, but the BRT based shortcut and space savings are prohibited behind that sysctl. Makes sense. I am curious if FICLONE is the issue here and copy_file_range worked just fine? |
Since the code, if I'm reading it right, for FICLONE just invokes the same backend function, it shouldn't be FICLONE specifically. I would guess differing semantics about how easy races are to trigger between the two platforms' memory management, but I really don't know. |
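For anyone who wants to try an explicit clone rather than relying on what `cp` picks, a Linux-only sketch of the FICLONE path could look like this. `clone_or_copy` is a hypothetical helper, not something from coreutils or ZFS; it assumes the Linux `<linux/fs.h>` header is available and falls back to a plain read/write loop when the filesystem refuses the clone.

```c
#include <assert.h>
#include <errno.h>
#include <linux/fs.h>   /* FICLONE */
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the filesystem to make dst_fd share blocks with
 * src_fd via FICLONE; if that is refused, copy the bytes by hand.
 * Returns 1 if cloned, 0 if copied with read/write, -1 on error.
 */
int
clone_or_copy(int src_fd, int dst_fd)
{
    char buf[65536];
    ssize_t n;

    if (ioctl(dst_fd, FICLONE, src_fd) == 0)
        return (1);

    /* Clone refused (for example EOPNOTSUPP or EXDEV): plain copy. */
    if (lseek(src_fd, 0, SEEK_SET) < 0 || lseek(dst_fd, 0, SEEK_SET) < 0)
        return (-1);
    while ((n = read(src_fd, buf, sizeof (buf))) > 0) {
        char *p = buf;

        /* write() may be short, so loop until the chunk is out. */
        while (n > 0) {
            ssize_t w = write(dst_fd, p, (size_t)n);

            if (w < 0)
                return (-1);
            p += w;
            n -= w;
        }
    }
    return (n < 0 ? -1 : 0);
}
```

The return value tells you which path was taken, so a reproducer can record whether a given copy was really a clone or an ordinary copy.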
Pinging @pjd on block cloning. I'm seeing if I can make a non-gentoo reproducer. |
Since I have 1.42GB of possibly affected files I thought I'd check for stretches of zeroes in the other files to see if anything other than compiling go might have triggered this. Currently dumping a list of all file blocks to check for multiple references to the same block. |
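A scan for stretches of zeroes can be sketched as below. `longest_zero_run` and `has_suspicious_zero_run` are hypothetical helpers for illustration, and the threshold is an assumption you would pick yourself; a long zero run is only a hint, since sparse files and padded formats contain zeroes legitimately.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the length of the longest run of consecutive zero bytes in buf.
 * Hypothetical helper for illustration; not part of ZFS or coreutils.
 */
size_t
longest_zero_run(const unsigned char *buf, size_t len)
{
    size_t best = 0, cur = 0;

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0) {
            cur++;
            if (cur > best)
                best = cur;
        } else {
            cur = 0;
        }
    }
    return (best);
}

/* Flag a buffer that contains at least `threshold` contiguous zero bytes. */
int
has_suspicious_zero_run(const unsigned char *buf, size_t len, size_t threshold)
{
    return (longest_zero_run(buf, len) >= threshold);
}
```

Running this over file contents with a threshold around the dataset's record size would give a list of candidates to compare against known-good copies.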
I don't think zdb knows how to dump it as it is now, no. I don't think it'd be hard to teach it, though. |
It's not hard to dump, but also it doesn't really show what you want. Each entry is just the offset within a vdev (half a DVA), and the number of references to it. It doesn't know what a file is, nor even really what a block pointer is. Probably the dumbest version is to extend the in-memory mini-BRT used in
Hmm, now I think about it, it might not help anyway. If the problem is that there was a race, and we applied a change in the wrong order, such that zeroes got written, then the clone itself was never performed - we did a real write, just with zero content. So there's no clone to look for. |
I'm using coreutils 8.32, so basically any distro that isn't using coreutils 9.x is completely unaffected by this issue? |
not completely, but it takes away many possible points of creating corruption (this also assumes the distro didn't backport the SEEK_HOLE changes, like EL9 did, iirc) |
To add to @classabbyamp's answer, any software (besides coreutils) which tries to find holes in files it is both reading and writing at the same time is potentially affected. However, the potential timing window is so small and the access pattern so specific that the likelihood of actually tripping on this bug is rather small. |
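For reference, the hole-finding access pattern looks roughly like the following walk over a file's data segments with `lseek(SEEK_DATA)` and `lseek(SEEK_HOLE)`. This is a minimal sketch, not what coreutils actually does: `count_data_bytes` is a hypothetical helper, and the fallback `#define`s use the Linux and FreeBSD values in case the libc headers do not expose the constants.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SEEK_DATA
#define SEEK_DATA 3     /* value used by Linux and FreeBSD */
#define SEEK_HOLE 4     /* value used by Linux and FreeBSD */
#endif

/*
 * Sum the sizes of the data segments the filesystem reports for fd.
 * Moves the file offset of fd. Returns -1 on error.
 */
off_t
count_data_bytes(int fd)
{
    off_t end = lseek(fd, 0, SEEK_END);
    off_t pos = 0, total = 0;

    if (end < 0)
        return (-1);
    while (pos < end) {
        off_t data = lseek(fd, pos, SEEK_DATA);
        off_t hole;

        if (data < 0) {
            if (errno == ENXIO)
                break;      /* nothing but a hole up to EOF */
            return (-1);
        }
        hole = lseek(fd, data, SEEK_HOLE);
        if (hole < 0)
            return (-1);
        total += hole - data;   /* [data, hole) is reported as data */
        pos = hole;
    }
    return (total);
}
```

A sparse-aware copier that trusts this walk skips whatever is reported as a hole, so if a freshly written range is wrongly reported as a hole, the copy ends up with zeroes there.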
In that case, would zfs send/recv be affected and would rsync be affected? Or would that only really be the case if coreutils 9.x is present? I did find some ppas I could look into like ppa:arter97/zfs and ppa:patrickdk/zfs and ppa:ofthesun/zfs at least so I can probably patch my Ubuntu systems, but I have a feeling patching my Proxmox systems may require me learning how to build zfs from source, although I did find kneutron/ansitest has a couple zfs build scripts so maybe that will help me learn building zfs from source. Thanks for all the additional information, it's quite interesting learning about this bug even though it's been patched already. Edit: Sounds like there's a newer Proxmox with this bug already fixed/backported so that's good at least. Edit 2: I have gone with ppa:arter97/zfs and upgraded my Ubuntu systems to zfs 2.2.2 and upgraded my Proxmox systems to Proxmox 8.1 which supposedly includes the zfs fix backported to 2.2.0, so I suppose all is well for me now. |
commit 4073574
Author: José Luis Salvador Rufo
Date: Fri Dec 1 08:33:05 2023 +0100

    package/zfs: bump version to 2.2.2

    This release contains an important fix for a data corruption bug. Full details are in the issue [1] and bug fix [2].

    1. openzfs/zfs#15526
    2. openzfs/zfs#15571
protected]> commit 1cdd069 Author: Fabrice Fontaine <[email protected]> Date: Sun Sep 24 17:09:26 2023 +0200 package/memcached: bump to version 1.6.21 - Send first patch upstream - Drop second and third patches (already in version) and so drop autoreconf https://github.com/memcached/memcached/wiki/ReleaseNotes1618 https://github.com/memcached/memcached/wiki/ReleaseNotes1619 https://github.com/memcached/memcached/wiki/ReleaseNotes1620 https://github.com/memcached/memcached/wiki/ReleaseNotes1621 Signed-off-by: Fabrice Fontaine <[email protected]> Signed-off-by: Peter Korsgaard <[email protected]> (cherry picked from commit 6ce55ab) Signed-off-by: Peter Korsgaard <[email protected]> commit 8b0ba84 Author: Fabrice Fontaine <[email protected]> Date: Tue Nov 28 21:12:50 2023 +0100 package/vlc: security bump to version 3.0.20 Fix CVE-2023-47359: Videolan VLC prior to version 3.0.20 contains an incorrect offset read that leads to a Heap-Based Buffer Overflow in function GetPacket() and results in a memory corruption. Fix CVE-2023-47360: Videolan VLC prior to version 3.0.20 contains an Integer underflow that leads to an incorrect packet length. 
https://code.videolan.org/videolan/vlc/-/blob/3.0.20/NEWS Signed-off-by: Fabrice Fontaine <[email protected]> Signed-off-by: Peter Korsgaard <[email protected]> (cherry picked from commit d675873) Signed-off-by: Peter Korsgaard <[email protected]> commit 31ddad9 Author: Bernd Kuhls <[email protected]> Date: Tue Oct 17 22:20:57 2023 +0200 package/vlc: bump version to 3.0.19 Rebased patch 0006 due to upstream commit https://code.videolan.org/videolan/vlc/-/commit/3f9fc44176cc5505132977885799fa988c5e7701 Release notes: https://code.videolan.org/videolan/vlc/-/blob/3.0.19/NEWS Signed-off-by: Bernd Kuhls <[email protected]> Signed-off-by: Peter Korsgaard <[email protected]> (cherry picked from commit f45fa3b) Signed-off-by: Peter Korsgaard <[email protected]> commit 69f4ee8 Author: Brandon Maier <[email protected]> Date: Tue Nov 28 19:55:07 2023 +0000 docs/website: fix favicon When the favicon image was added in f26e613 (docs/website: add favicon.png), it was added to a different directory then where the header's icon link points. This causes the favicon to fail to load with 404. While we are here, remove the "shortcut" rel attribute as it is non-standard and it's recommended not to use it[1]. 
[1] https://developer.mozilla.org/en-US/docs/Web/HTML/Attributes/rel#sect4 Signed-off-by: Brandon Maier <[email protected]> Signed-off-by: Peter Korsgaard <[email protected]> (cherry picked from commit 8ad1a2e) Signed-off-by: Peter Korsgaard <[email protected]> commit 66acf39 Author: Fabrice Fontaine <[email protected]> Date: Mon Nov 27 22:27:12 2023 +0100 package/motion: fix webp build Fix the following build failure raised since bump of webp to version 1.3.2 in commit c88c1d3: /home/autobuild/autobuild/instance-9/output-1/host/lib/gcc/aarch64_be-buildroot-linux-uclibc/13.2.0/../../../../aarch64_be-buildroot-linux-uclibc/bin/ld: picture.o: undefined reference to symbol 'WebPMemoryWriterClear' /home/autobuild/autobuild/instance-9/output-1/host/lib/gcc/aarch64_be-buildroot-linux-uclibc/13.2.0/../../../../aarch64_be-buildroot-linux-uclibc/bin/ld: /home/autobuild/autobuild/instance-9/output-1/host/aarch64_be-buildroot-linux-uclibc/sysroot/usr/lib64/libwebp.so.7: error adding symbols: DSO missing from command line Fixes: - http://autobuild.buildroot.org/results/9b859a701debeaddf1f9909e16adc6811a620576 Signed-off-by: Fabrice Fontaine <[email protected]> Signed-off-by: Yann E. MORIN <[email protected]> (cherry picked from commit 1267a23) Signed-off-by: Peter Korsgaard <[email protected]> commit 30bfbf6 Author: Fabrice Fontaine <[email protected]> Date: Mon Nov 27 22:25:58 2023 +0100 package/exfatprogs: security bump to version 1.2.2 Fix CVE-2023-45897: exfatprogs before 1.2.2 allows out-of-bounds memory access, such as in read_file_dentry_set. https://github.com/exfatprogs/exfatprogs/blob/1.2.2/NEWS Signed-off-by: Fabrice Fontaine <[email protected]> Signed-off-by: Yann E. 
MORIN <[email protected]> (cherry picked from commit 07dad08) Signed-off-by: Peter Korsgaard <[email protected]> commit b68a880 Author: Peter Seiderer <[email protected]> Date: Tue Aug 8 20:09:58 2023 +0200 board/raspberrypi/config_4_64bit.txt: remove testing dtoverlay entries (vc4-kms-v3d-pi4, imx219) Remove private/testing dtoverlay entries (vc4-kms-v3d-pi4, imx219 and commented out ov5647) wrongly introduced by commit 689b9ac ("package/rpi-firmware: rework boot/config file handling") [1]. [1] https://git.buildroot.net/buildroot/commit/?id=689b9ac439ab7b507c8982b6102bddf59d03efbf Signed-off-by: Peter Seiderer <[email protected]> Signed-off-by: Yann E. MORIN <[email protected]> (cherry picked from commit fbf0a6e) Signed-off-by: Peter Korsgaard <[email protected]> commit ec866af Author: Gaël PORTAY <[email protected]> Date: Mon Nov 20 22:41:50 2023 +0100 board/raspberrypi: fix autoprobing of bluetooth driver The commit 689b9ac (package/rpi-firmware: rework boot/config file handling) has split in two the property: dtoverlay=miniuart-bt,krnbt=on Into: dtoverlay=miniuart-bt dtoverlay=krnbt=on The initial property contained the dtbo file miniuart-bt[1] and its parameter krnbt=on[2][3]. The first syntax is correct while the second is not. The krnbt=on is not a dtoverlay[4] but a dtparam[5]. Therefore the property dtparam must be used instead. 
This fixes: # cat /sys/firmware/devicetree/base/chosen/user-warnings Failed to load overlay 'krnbt=on' [1]: https://github.com/raspberrypi/linux/blob/rpi-5.10.y/arch/arm/boot/dts/overlays/miniuart-bt-overlay.dts [2]: https://github.com/raspberrypi/linux/blob/rpi-5.10.y/arch/arm/boot/dts/overlays/miniuart-bt-overlay.dts#L91 [3]: https://github.com/raspberrypi/linux/blob/rpi-5.10.y/arch/arm/boot/dts/overlays/README#L213-L215 [4]: https://www.raspberrypi.com/documentation/computers/config_txt.html#dtoverlay [5]: https://www.raspberrypi.com/documentation/computers/config_txt.html#dtparam Signed-off-by: Gaël PORTAY <[email protected]> Signed-off-by: Yann E. MORIN <[email protected]> (cherry picked from commit 5be42d8) Signed-off-by: Peter Korsgaard <[email protected]> commit d8bc17f Author: Fabrice Fontaine <[email protected]> Date: Sun Nov 26 23:57:17 2023 +0100 package/exfatprogs: add EXFATPROGS_CPE_ID_VENDOR cpe:2.3:a:namjaejeon:exfatprogs is a valid CPE identifier for this package: https://nvd.nist.gov/products/cpe/detail/F174A846-F275-4AD8-A0E3-6D0CEFDFF308 Signed-off-by: Fabrice Fontaine <[email protected]> Signed-off-by: Yann E. MORIN <[email protected]> (cherry picked from commit 3da6267) Signed-off-by: Peter Korsgaard <[email protected]> commit ec2238b Author: Maxim Kochetkov <[email protected]> Date: Thu Nov 23 09:15:00 2023 +0300 package/postgresql: security bump version to 15.5 Release notes: https://www.postgresql.org/about/news/postgresql-161-155-1410-1313-1217-and-1122-released-2749/ Fixes CVE-2023-5868, CVE-2023-5869, CVE-2023-5870. Signed-off-by: Maxim Kochetkov <[email protected]> Signed-off-by: Yann E. MORIN <[email protected]> (cherry picked from commit 4d549c0) Signed-off-by: Peter Korsgaard <[email protected]> commit 8212d48 Author: Thomas Petazzoni <[email protected]> Date: Thu Nov 16 14:51:35 2023 +0100 package/netsnmp: revert back to 5.9.3, backport security fix In commit 13fc9dc, netsnmp was bumped from 5.9.3 to 5.9.4 to fix two CVEs. 
However, even though it's a minor version bump, there are actually 163 commits upstream between those two minor releases, and some of them are breaking existing use-cases. In particular upstream a2cb167514ac0c7e1b04e8f151e0b015501362e0 now requires that config_() macros in MIB files are terminated with a semicolon, causing a build breakage with existing MIB files that were totally valid with 5.9.3. This commit therefore proposes to revert back to 5.9.3, by reverting those two commits: 56caafc package/netsnmp: fix musl build 13fc9dc package/netsnmp: security bump to version 5.9.4 and instead backport the one upstream commit that fixes both CVEs. Signed-off-by: Thomas Petazzoni <[email protected]> [[email protected]: fix typo as reported by Baruch] Signed-off-by: Yann E. MORIN <[email protected]> (cherry picked from commit 44243b4) Signed-off-by: Peter Korsgaard <[email protected]> commit bc63ab9 Author: Gaël PORTAY <[email protected]> Date: Wed Nov 22 02:04:08 2023 +0100 board/raspberrypi/readme.txt: fix typos Signed-off-by: Gaël PORTAY <[email protected]> Signed-off-by: Yann E. MORIN <[email protected]> (cherry picked from commit acd833c) Signed-off-by: Peter Korsgaard <[email protected]> commit 29e2700 Author: José Luis Salvador Rufo <[email protected]> Date: Sun Nov 12 23:11:17 2023 +0100 package/zfs: fix zfs autotools cross-compilation This commit addresses a long-standing bug encountered during ZFS compilation in cross-platform environments. The issue arises because ZFS autoconf triggers a `make modules` to detect if the kernel can compile modules [1]. The problem occurs when autoconf uses the host environment instead of the cross-platform environment. To fix this, we export necessary environment variables to ensure that ZFS autoconf utilizes the cross-platform environment correctly. 
This patch resolves ZFS cross-platform compilations: - http://autobuild.buildroot.net/results/ebeab256101bcba38c35fd55075c414e62f92caa/ - http://autobuild.buildroot.net/results/03b9f12a106bf100eec695a92b83bf09b22c68b0/ - http://autobuild.buildroot.net/results/c2da90337463607c2fadfeac7ad72e5c3899a61f/ - http://autobuild.buildroot.net/results/465a249f92d2f5db7ac4b61b4111e6cbaaa15688/ - http://autobuild.buildroot.net/results/7e2d3277e26fa5b0c8073a0e8b9e82f47ade9697/ - http://autobuild.buildroot.net/results/a8fb87336b09fef8787a7889dfcccf14fe1215b9/ - https://gitlab.com/kubu93/buildroot/-/jobs/1522848483 And fix a few emails: - alpine.DEB.2.22.394.2108181630280.2028262@ridzo [build zfs into buildroot for raspberry pi 4] - https://lists.buildroot.org/pipermail/buildroot/2021-August/621696.html - https://lists.buildroot.org/pipermail/buildroot/2021-August/621345.html - https://lists.buildroot.org/pipermail/buildroot/2022-July/646379.html - https://lists.buildroot.org/pipermail/buildroot/2023-June/668467.html [1] This is the full callback, you can just check the last link: - https://github.com/openzfs/zfs/blob/zfs-2.1.12/config/kernel-declare-event-class.m4#L7C11-L7C11 - https://github.com/openzfs/zfs/blob/zfs-2.1.12/config/kernel.m4#L883 - https://github.com/openzfs/zfs/blob/zfs-2.1.12/config/kernel.m4#L868 - https://github.com/openzfs/zfs/blob/zfs-2.1.12/config/kernel.m4#L668 Signed-off-by: José Luis Salvador Rufo <[email protected]> Signed-off-by: Yann E. MORIN <[email protected]> (cherry picked from commit 7fe685c) Signed-off-by: Peter Korsgaard <[email protected]> commit 76699a7 Author: Yann E. MORIN <[email protected]> Date: Sun Nov 26 17:11:18 2023 +0100 package/zfs: don't download patch generated from github Git-generated patches embed the short-hash of the objects in the repository. 
The length of those short hashes are subject to change in at least three cases: - the number of objects in the repository increases, so git increases the length of short hashes to get a good change there is no collision; - the git configuration changes, see core.abbrev in git-config; - the heuristic to compute the length changes in a newer git version. Since the bump to zfs 2.1.4 in commit 68dfd09, the patch generated by github has changed, causing download failures: wget --passive-ftp -nd -t 3 -O '/home/ymorin/dev/buildroot/O/master/build/.bc3f12bfac152a0c28951cec92340ba14f9ccee9.patch.uoFq9e/output' 'https://github.com/openzfs/zfs/commit/bc3f12bfac152a0c28951cec92340ba14f9ccee9.patch' --2023-11-26 16:53:25-- https://github.com/openzfs/zfs/commit/bc3f12bfac152a0c28951cec92340ba14f9ccee9.patch Resolving github.com (github.com)... 140.82.121.3 Connecting to github.com (github.com)|140.82.121.3|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 2976 (2.9K) [text/plain] Saving to: ‘/home/ymorin/dev/buildroot/O/master/build/.bc3f12bfac152a0c28951cec92340ba14f9ccee9.patch.uoFq9e/output’ /home/ymorin/dev/buildroot/O/ 100%[================================================>] 2.91K --.-KB/s in 0s 2023-11-26 16:53:25 (15.0 MB/s) - ‘/home/ymorin/dev/buildroot/O/master/build/.bc3f12bfac152a0c28951cec92340ba14f9ccee9.patch.uoFq9e/output’ saved [2976/2976] ERROR: while checking hashes from package/zfs//zfs.hash ERROR: bc3f12bfac152a0c28951cec92340ba14f9ccee9.patch has wrong sha256 hash: ERROR: expected: 96a27353fe717ff2c8b95deb8b009c4eb750303c6400e2d8a2582ab1ec12b25a ERROR: got : 246c80f66abca5a7e0c41cc7c56eec0b4cb7f16b142262480401142bbc2f999f ERROR: Incomplete download, or man-in-the-middle (MITM) attack And indeed, the length of short hashes has increased by one since then. Fix that by bundling the patch, with the short hashes that were known then, so that it matches the sha256 we had for it. Signed-off-by: Yann E. 
MORIN <[email protected]> (cherry picked from commit 2c3946f) Signed-off-by: Peter Korsgaard <[email protected]> commit b1a3096 Author: Nicolas Cavallari <[email protected]> Date: Wed Nov 22 16:47:36 2023 +0100 package/gcc: fix disabling the documentation gcc.mk attempts to disable building the documentation by setting MAKEINFO=missing, but it is not working. If makeinfo is installed and recent enough, gcc still uses it. This can be checked easily: grep BUILD_INFO='info' host-gcc-initial-*/build/gcc/config.log It happens because the root ./configure script will check $MAKEINFO --version (aka 'missing --version') and will overwrite it with MAKEINFO='missing makeinfo' because the version does not match. Having MAKEINFO='missing makeinfo' is a problem because 'missing makeinfo' will actually attempt to run 'makeinfo' before failing with an error message. If makeinfo is installed on the host, then 'missing makeinfo' will successfully run makeinfo anyway. Many gcc subprojects will check $MAKEINFO --version and enable building the documentation if it is recent enough. This patch overrides these checks by forcing gcc_cv_prog_makeinfo_modern=no. Building the GCC documentation can fail with the wrong makeinfo version. It happened at least when building GCC 11.3.0 with makeinfo 7.1. Signed-off-by: Nicolas Cavallari <[email protected]> Signed-off-by: Yann E. MORIN <[email protected]> (cherry picked from commit f7b9d3a) Signed-off-by: Peter Korsgaard <[email protected]> commit d3302c3 Author: Peter Korsgaard <[email protected]> Date: Wed Nov 15 12:26:42 2023 +0100 package/intel-microcode: security bump to version 20231114 Includes fixes for INTEL-SA-00950: https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00950.html https://lock.cmpxchg8b.com/reptar.html https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files/releases/tag/microcode-20231114 Signed-off-by: Peter Korsgaard <[email protected]> Signed-off-by: Yann E. 
MORIN <[email protected]> (cherry picked from commit c544075) Signed-off-by: Peter Korsgaard <[email protected]>
Same thing here. I am using Proxmox 7.4-3 with coreutils version 8.32-4+b1, zfsutils-linux 2.1.11-pve1, zfs-2.1.11-pve1 and zfs-kmod-2.1.11-pve1. I've tested 640 million files (64 threads * 10 million files/thread) and haven't been able to replicate the issue on my end either. When I check my Solaris 10 1/13 system, GNU coreutils is NOT installed by default. As far as I can see (until I can find information otherwise, or someone else can educate me), the core software libraries used for system administration are installed under the Solaris package SUNWadmc, so I am guessing that that's where the tool This suggests that the probability of Solaris ZFS being affected by this specific issue should be rather low, given that, as far as I can tell, it doesn't use the same method to copy files as GNU coreutils. Thanks. |
The bug could be hit by anything that tries to find holes in a file that is also being written to, which includes but is not limited to coreutils 9.x. Anything that seeks to the next hole in a file, while it is being written to, during a very specific moment of the write operation, could end up incorrectly reading zeroes. |
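To make "seeks to the next hole" concrete, here is a minimal sketch of the kind of hole/data walk a sparse-aware tool performs. It uses Python's `os.lseek` with `os.SEEK_DATA` and `os.SEEK_HOLE`; the function name `data_segments` is made up for illustration, and this is not how any particular tool is implemented. If the filesystem wrongly reports "no data" for a file that is still being written, a walk like this returns an empty list and the caller treats the file as one big hole.

```python
import errno
import os


def data_segments(path):
    """Return a list of (start, end) byte ranges the filesystem reports as data."""
    segments = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                # Jump to the next byte that the filesystem reports as data.
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno == errno.ENXIO:
                    break  # no data at or after offset: the rest is a hole
                raise
            # Jump to the end of this data run (EOF counts as a hole).
            end = os.lseek(fd, start, os.SEEK_HOLE)
            if end <= start:
                break  # defensive: file changed underneath us
            segments.append((start, end))
            offset = end
    finally:
        os.close(fd)
    return segments
```

On a fully written file this should report data from offset 0 up to the file size; an empty result for a non-empty, non-sparse file is the symptom described above.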
Can anyone answer my question in comment #15526 (comment)? |
Thank you, @0x5c. Unfortunately, since Oracle bought Sun Microsystems, Solaris has been closed source since 2010, so it would be impossible to tell from the outside. |
If the bug is present in that version of ZFS, then it should theoretically be possible to trigger with anything that seeks to the next hole in a file, regardless of what that program may be. If coreutils 9.x can build for solaris, then that could be a way to try to reproduce the bug. Even a minimal program that just does the equivalent to coreutils |
Add a test for the dirty dnode SEEK_HOLE/SEEK_DATA bug described in #15526 The bug was fixed in #15571 and was backported to 2.2.2 and 2.1.14. This test case is just to make sure it does not come back. seekflood.c originally written by Rob Norris. Reviewed-by: Graham Perrin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Rob Norris <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes #15608
Over its history, the dirty dnode test has been changed between checking for a dnode being on `os_dirty_dnodes` (`dn_dirty_link`) and checking `dn_dirty_record`.

de198f2 Fix lseek(SEEK_DATA/SEEK_HOLE) mmap consistency
2531ce3 Revert "Report holes when there are only metadata changes"
ec4f9b8 Report holes when there are only metadata changes
454365b Fix dirty check in dmu_offset_next()
66aca24 SEEK_HOLE should not block on txg_wait_synced()

Also illumos/illumos-gate@c543ec060d illumos/illumos-gate@2bcf0248e9

It turns out both are actually required. In the case of appending data to a newly created file, the dnode proper is dirtied (at least to change the blocksize) and dirty records are added. Thus, a single logical operation is represented by separate dirty indicators, and must not be separated.

The incorrect dirty check becomes a problem when the first block of a file is being appended to while another process is calling lseek to skip holes. There is a small window where the dnode part is undirtied while there are still dirty records. In this case, `lseek(fd, 0, SEEK_DATA)` would not know that the file is dirty, and would go to `dnode_next_offset()`. Since the object has no data blocks yet, it returns `ESRCH`, indicating no data found, which results in `ENXIO` being returned to `lseek()`'s caller.

Since coreutils 9.2, `cp` performs sparse copies by default, that is, it uses `SEEK_DATA` and `SEEK_HOLE` against the source file and attempts to replicate the holes in the target. When it hits the bug, its initial search for data fails, and it goes on to call `fallocate()` to create a hole over the entire destination file.

This has come up more recently as users upgrade their systems, getting OpenZFS 2.2 as well as a newer coreutils. However, this problem has been reproduced against 2.1, as well as on FreeBSD 13 and 14.

This change simply updates the dirty check to check both types of dirty.
If there's anything dirty at all, we immediately go to the "wait for sync" stage. It doesn't really matter after that; both changes are on disk, so the dirty fields should be correct. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Rich Ercolani <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes openzfs#15571 Closes openzfs#15526
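The failure mode described in this commit message can be illustrated with a small sparse-aware copy. This is only a sketch in Python under stated assumptions: the name `sparse_copy` is hypothetical, it extends the destination with `ftruncate()` instead of the `fallocate()` call mentioned above, and it is not the coreutils implementation. The point is the `ENXIO` branch: if the very first `SEEK_DATA` wrongly reports "no data", nothing is copied and the destination ends up as a file of the right size that reads back as zeros.

```python
import errno
import os


def sparse_copy(src, dst, chunk=1 << 20):
    """Copy src to dst, copying only the ranges reported as data."""
    src_fd = os.open(src, os.O_RDONLY)
    dst_fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        size = os.fstat(src_fd).st_size
        offset = 0
        while offset < size:
            try:
                start = os.lseek(src_fd, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno != errno.ENXIO:
                    raise
                # "No more data": everything from offset to EOF is
                # treated as a hole. A bogus ENXIO here loses data.
                break
            end = min(os.lseek(src_fd, start, os.SEEK_HOLE), size)
            if end <= start:
                break  # defensive: source changed underneath us
            pos = start
            while pos < end:
                buf = os.pread(src_fd, min(chunk, end - pos), pos)
                if not buf:
                    break  # source was truncated while copying
                view = memoryview(buf)
                while view:
                    written = os.pwrite(dst_fd, view, pos)
                    view = view[written:]
                    pos += written
            offset = end
        # Extend the destination to full size; uncopied ranges read as zeros.
        os.ftruncate(dst_fd, size)
    finally:
        os.close(src_fd)
        os.close(dst_fd)
```

Comparing the source and destination byte for byte after such a copy (for example with `cmp`) is a simple way to notice the corruption, since the destination size alone looks correct.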
Previously, dmu_buf_will_clone() would roll back any dirty record, but would not clean out the modified data nor reset the state before releasing the lock. That leaves the last-written data in db_data, but the dbuf in the wrong state. This is eventually corrected when the dbuf state is made NOFILL, and dbuf_noread() called (which clears out the old data), but at this point it's too late, because the lock was already dropped with that invalid state. Any caller acquiring the lock before the call into dmu_buf_will_not_fill() can find what appears to be a clean, readable buffer, and would take the wrong state from it: it should be getting the data from the cloned block, not from earlier (unwritten) dirty data. Even after the state was switched to NOFILL, the old data was still not cleaned out until dbuf_noread(), which is another gap for a caller to take the lock and read the wrong data. This commit fixes all this by properly cleaning up the previous state and then setting the new state before dropping the lock. The DBUF_VERIFY() calls confirm that the dbuf is in a valid state when the lock is down. Sponsored-by: Klara, Inc. Sponsored-By: OpenDrives Inc. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Pawel Jakub Dawidek <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes openzfs#15566 Closes openzfs#15526
I found out why the second copy from my comment #15526 (comment) is done by seeking holes/data: ZFS reports a number of blocks that is less than the size of the file, which is why the second copy is treated as sparse when it shouldn't be. Of course, that wouldn't be good for the reproducer, but the real question is: do we have another bug in ZFS in reporting the number of blocks? Because of that, cp doesn't use copy_file_range as it should, which would be faster :) |
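For readers unfamiliar with the block-count heuristic mentioned here, a rough sketch in Python follows. It is an assumption about the general shape of such a check (allocated 512-byte blocks compared against the file size), not a copy of the coreutils code, and the function names are invented for illustration. A file whose allocated block count has not caught up with its size yet will look sparse to this kind of test.

```python
import os


def blocks_suggest_sparse(st_size, st_blocks):
    """True if fewer 512-byte blocks are allocated than the size implies."""
    return st_blocks < st_size // 512


def looks_sparse(path):
    """Apply the heuristic to a real file via stat()."""
    st = os.stat(path)
    # st_blocks is documented as the number of 512-byte blocks allocated.
    return blocks_suggest_sparse(st.st_size, st.st_blocks)
```

For example, a 1 MiB file reporting only 8 allocated blocks would be classified as sparse, while the same file reporting 2048 blocks would not.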
As a mitigation until more is understood and fixes are tested & reviewed, change the default of zfs_dmu_offset_next_sync from 1 to 0, as it was before 05b3eb6d232009db247882a39d518e7282630753 upstream. There are no reported cases of The Bug being hit with zfs_dmu_offset_next_sync=0: that does not mean this is a cure or a real fix, but it _appears_ to be at least effective in reducing the chances of it happening. By itself, it's a safe change anyway, so it feels worth us doing while we wait. Note that The Bug has been reproduced on 2.1.x as well, hence we do it for both 2.1.13 and 2.2.1. Bug: openzfs/zfs#11900 Bug: openzfs/zfs#15526 Bug: https://bugs.gentoo.org/917224 Signed-off-by: Sam James <[email protected]>
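To check which value of the tunable a running Linux system is actually using, a sketch like the following can help. It assumes the usual Linux layout in which loaded module parameters are exposed under `/sys/module/<module>/parameters/`; the helper name `read_zfs_tunable` and its `base` argument are made up for illustration, and on other platforms (for example FreeBSD) the value is exposed differently.

```python
import os


def read_zfs_tunable(name, base="/sys/module/zfs/parameters"):
    """Return the current value of a ZFS module parameter, or None if absent."""
    try:
        with open(os.path.join(base, name)) as f:
            return f.read().strip()
    except FileNotFoundError:
        # Module not loaded, or this build has no such parameter.
        return None


if __name__ == "__main__":
    value = read_zfs_tunable("zfs_dmu_offset_next_sync")
    print("zfs_dmu_offset_next_sync =", value)
```

A result of `None` simply means the parameter file was not found; it says nothing about whether the system is affected.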
We're still hitting this bug in 2.2.3 and it hasn't been fully fixed yet. I found this issue here, more people are still experiencing it: #15933 Just wanted to keep people aware and to keep checking your data for silent corruption. |
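For anyone who wants to check existing files, here is a rough heuristic sketch in Python that reports block-aligned runs of zero bytes, which is how the corruption in this issue shows up. The function name `zero_runs` and the 4096-byte block size are arbitrary choices for illustration. It cannot tell corruption apart from files that legitimately contain zeros (sparse images, padded binaries), so a hit only means the file deserves a comparison against a known-good copy.

```python
def zero_runs(path, block=4096, min_blocks=1):
    """Return (offset, length) pairs for runs of all-zero blocks in a file."""
    runs = []
    run_start = None
    offset = 0
    with open(path, "rb") as f:
        while True:
            buf = f.read(block)
            if not buf:
                break
            if buf.count(0) == len(buf):
                # Block is entirely zero: start or extend a run.
                if run_start is None:
                    run_start = offset
            elif run_start is not None:
                runs.append((run_start, offset - run_start))
                run_start = None
            offset += len(buf)
    if run_start is not None:
        runs.append((run_start, offset - run_start))
    # Drop runs shorter than the requested minimum.
    return [run for run in runs if run[1] >= min_blocks * block]
```

Raising `min_blocks` reduces noise from small, legitimate zero padding at the cost of missing short corrupted ranges.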
System information
Describe the problem you're observing
When installing the Go compiler with Portage, many of the internal compiler commands end up corrupted: most of each file's contents is replaced by zeros.
I'm able to reproduce on two separate machines running 6.5.11 and ZFS 2.2.0.
ZFS does not see any errors with the pool.
Describe how to reproduce the problem
`emerge -1 dev-lang/go`, where Portage's TMPDIR is on ZFS.
Files such as `/usr/lib/go/pkg/tool/linux_amd64/compile` are corrupted.

I was able to reproduce with and without Portage's "native-extensions" feature. I was unable to reproduce after changing Portage's TMPDIR to another filesystem (such as tmpfs).
Include any warning/errors/backtraces from the system logs