xz: adjust arch-specific options for better kernel compression
mainline inclusion
from mainline-v6.12-rc1
category: feature

Conflict: remove
ia64)           BCJ=--ia64; LZMA2OPTS=pb=4 ;;
0df8e97: ("scripts: clean up IA-64 code")

Use LZMA2 options that match the arch-specific alignment of instructions.
This change reduces compressed kernel size 0-2 % depending on the arch.
On 1-byte-aligned x86 it makes no difference and on 4-byte-aligned archs
it helps the most.

Use the ARM-Thumb filter for ARM-Thumb2 kernels.  This reduces compressed
kernel size about 5 %.[1] Previously such kernels were compressed using
the ARM filter which didn't do anything useful with ARM-Thumb2 code.

Add BCJ filter support for ARM64 and RISC-V.  Compared to unfiltered XZ or
plain LZMA, the compressed kernel size is reduced about 5 % on ARM64 and 7
% on RISC-V.  A new enough version of the xz tool is required: 5.4.0 for
ARM64 and 5.6.0 for RISC-V.  With an old xz version, a message is printed
to standard error and the kernel is compressed without the filter.

Update lib/decompress_unxz.c to match the changes to xz_wrap.sh.

Update the CONFIG_KERNEL_XZ help text in init/Kconfig:
  - Add the RISC-V and ARM64 filters.
  - Clarify that the PowerPC filter is for big endian only.
  - Omit IA-64.

Link: https://lore.kernel.org/lkml/[email protected]/ [1]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Lasse Collin <[email protected]>
Reviewed-by: Sam James <[email protected]>
Cc: Simon Glass <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Jubin Zhong <[email protected]>
Cc: Jules Maselbas <[email protected]>
Cc: Emil Renner Berthing <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Joel Stanley <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Krzysztof Kozlowski <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Rui Li <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit 7472ff8adad8655f38b060a602f66e59c93c4793)
Signed-off-by: Wentao Guan <[email protected]>
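
The version thresholds named in the commit message (5.4.0 for ARM64, 5.6.0 for RISC-V) show up in the script as the integers 50040002 and 50060002. The sketch below illustrates how such integers are formed, assuming the liblzma encoding of major * 10000000 + minor * 10000 + patch * 10 + stability, where stability 2 means a stable release. The helper names `encode_xz_version` and `parse_xz_version` and the `sample` text are hypothetical and are not part of xz_wrap.sh.

```shell
#!/bin/sh
# Hypothetical helper: build the integer that "xz --robot --version"
# reports for a given release.
# Arguments: major minor patch stability (0 = alpha, 1 = beta, 2 = stable).
encode_xz_version()
{
	echo $(( $1 * 10000000 + $2 * 10000 + $3 * 10 + $4 ))
}

# Hypothetical helper: pull the XZ_VERSION value out of robot-mode output,
# using the same sed expression as the patch.
parse_xz_version()
{
	printf '%s\n' "$1" | sed -n 's/^XZ_VERSION=//p'
}

# Assumed sample of robot-mode output for a stable 5.4.0 release.
sample='XZ_VERSION=50040002
LIBLZMA_VERSION=50040002'

version=$(parse_xz_version "$sample")

if [ "$version" -ge "$(encode_xz_version 5 4 0 2)" ]; then
	echo "ARM64 filter available"
fi

if [ "$version" -lt "$(encode_xz_version 5 6 0 2)" ]; then
	echo "RISC-V filter not available"
fi
```

Because the last digit encodes stability, a comparison against 50040002 also rejects 5.4.0 alpha and beta builds, which report 50040000 and 50040001 under this encoding.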
opsiff committed Jan 14, 2025
1 parent 0825e8e commit 9bf37e0
Showing 3 changed files with 152 additions and 10 deletions.
5 changes: 3 additions & 2 deletions init/Kconfig
@@ -317,8 +317,9 @@ config KERNEL_XZ
 	  BCJ filters which can improve compression ratio of executable
 	  code. The size of the kernel is about 30% smaller with XZ in
 	  comparison to gzip. On architectures for which there is a BCJ
-	  filter (i386, x86_64, ARM, IA-64, PowerPC, and SPARC), XZ
-	  will create a few percent smaller kernel than plain LZMA.
+	  filter (i386, x86_64, ARM, ARM64, RISC-V, big endian PowerPC,
+	  and SPARC), XZ will create a few percent smaller kernel than
+	  plain LZMA.
 
 	  The speed is about the same as with LZMA: The decompression
 	  speed of XZ is better than that of bzip2 but worse than gzip
14 changes: 12 additions & 2 deletions lib/decompress_unxz.c
@@ -126,11 +126,21 @@
 #ifdef CONFIG_X86
 #	define XZ_DEC_X86
 #endif
-#ifdef CONFIG_PPC
+#if defined(CONFIG_PPC) && defined(CONFIG_CPU_BIG_ENDIAN)
 #	define XZ_DEC_POWERPC
 #endif
 #ifdef CONFIG_ARM
-#	define XZ_DEC_ARM
+#	ifdef CONFIG_THUMB2_KERNEL
+#		define XZ_DEC_ARMTHUMB
+#	else
+#		define XZ_DEC_ARM
+#	endif
+#endif
+#ifdef CONFIG_ARM64
+#	define XZ_DEC_ARM64
+#endif
+#ifdef CONFIG_RISCV
+#	define XZ_DEC_RISCV
 #endif
 #ifdef CONFIG_IA64
 #	define XZ_DEC_IA64
143 changes: 137 additions & 6 deletions scripts/xz_wrap.sh
@@ -6,15 +6,146 @@
 #
 # Author: Lasse Collin <[email protected]>
 
+# This has specialized settings for the following archs. However,
+# XZ-compressed kernel isn't currently supported on every listed arch.
+#
+# Arch          Align   Notes
+# arm           2/4     ARM and ARM-Thumb2
+# arm64           4
+# csky            2
+# loongarch       4
+# mips          2/4     MicroMIPS is 2-byte aligned
+# parisc          4
+# powerpc         4     Uses its own wrapper for compressors instead of this.
+# riscv         2/4
+# s390            2
+# sh              2
+# sparc           4
+# x86             1
+
+# A few archs use 2-byte or 4-byte aligned instructions depending on
+# the kernel config. This function is used to check if the relevant
+# config option is set to "y".
+is_enabled()
+{
+	grep -q "^$1=y$" include/config/auto.conf
+}
+
+# XZ_VERSION is needed to disable features that aren't available in
+# old XZ Utils versions.
+XZ_VERSION=$($XZ --robot --version) || exit
+XZ_VERSION=$(printf '%s\n' "$XZ_VERSION" | sed -n 's/^XZ_VERSION=//p')
+
+# Assume that no BCJ filter is available.
 BCJ=
-LZMA2OPTS=
 
+# Set the instruction alignment to 1, 2, or 4 bytes.
+#
+# Set the BCJ filter if one is available.
+# It must match the #ifdef usage in lib/decompress_unxz.c.
 case $SRCARCH in
-	x86)            BCJ=--x86 ;;
-	powerpc)        BCJ=--powerpc ;;
-	ia64)           BCJ=--ia64; LZMA2OPTS=pb=4 ;;
-	arm)            BCJ=--arm ;;
-	sparc)          BCJ=--sparc ;;
+	arm)
+		if is_enabled CONFIG_THUMB2_KERNEL; then
+			ALIGN=2
+			BCJ=--armthumb
+		else
+			ALIGN=4
+			BCJ=--arm
+		fi
+		;;
+
+	arm64)
+		ALIGN=4
+
+		# ARM64 filter was added in XZ Utils 5.4.0.
+		if [ "$XZ_VERSION" -ge 50040002 ]; then
+			BCJ=--arm64
+		else
+			echo "$0: Upgrading to xz >= 5.4.0" \
+				"would enable the ARM64 filter" \
+				"for better compression" >&2
+		fi
+		;;
+
+	csky)
+		ALIGN=2
+		;;
+
+	loongarch)
+		ALIGN=4
+		;;
+
+	mips)
+		if is_enabled CONFIG_CPU_MICROMIPS; then
+			ALIGN=2
+		else
+			ALIGN=4
+		fi
+		;;
+
+	parisc)
+		ALIGN=4
+		;;
+
+	powerpc)
+		ALIGN=4
+
+		# The filter is only for big endian instruction encoding.
+		if is_enabled CONFIG_CPU_BIG_ENDIAN; then
+			BCJ=--powerpc
+		fi
+		;;
+
+	riscv)
+		if is_enabled CONFIG_RISCV_ISA_C; then
+			ALIGN=2
+		else
+			ALIGN=4
+		fi
+
+		# RISC-V filter was added in XZ Utils 5.6.0.
+		if [ "$XZ_VERSION" -ge 50060002 ]; then
+			BCJ=--riscv
+		else
+			echo "$0: Upgrading to xz >= 5.6.0" \
+				"would enable the RISC-V filter" \
+				"for better compression" >&2
+		fi
+		;;
+
+	s390)
+		ALIGN=2
+		;;
+
+	sh)
+		ALIGN=2
+		;;
+
+	sparc)
+		ALIGN=4
+		BCJ=--sparc
+		;;
+
+	x86)
+		ALIGN=1
+		BCJ=--x86
+		;;
+
+	*)
+		echo "$0: Arch-specific tuning is missing for '$SRCARCH'" >&2
+
+		# Guess 2-byte-aligned instructions. Guessing too low
+		# should hurt less than guessing too high.
+		ALIGN=2
+		;;
+esac
+
+# Select the LZMA2 options matching the instruction alignment.
+case $ALIGN in
+	1)  LZMA2OPTS= ;;
+	2)  LZMA2OPTS=lp=1 ;;
+	4)  LZMA2OPTS=lp=2,lc=2 ;;
+	*)  echo "$0: ALIGN wrong or missing" >&2; exit 1 ;;
 esac
 
 # Use single-threaded mode because it compresses a little better
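
The alignment-to-options mapping in the `case $ALIGN` block can be read as follows: `lp` is the number of literal position bits, so `lp=1` suits 2-byte-aligned code and `lp=2` suits 4-byte-aligned code. xz requires lc + lp to be at most 4 and defaults to lc=3, lp=0, which is presumably why the 4-byte case also lowers `lc` to 2. The sketch below restates that mapping as a function and builds an illustrative filter argument string. The function names and the `dict=32MiB` value are assumptions made for this example; the remainder of xz_wrap.sh is not shown in the diff above.

```shell
#!/bin/sh
# Hypothetical helper mirroring the "case $ALIGN" block from the patch.
# Prints the LZMA2 options for a 1-, 2-, or 4-byte instruction alignment.
lzma2_opts_for_align()
{
	case $1 in
		1) echo "" ;;           # keep xz defaults: lc=3, lp=0
		2) echo "lp=1" ;;       # lc stays 3, so lc + lp = 4
		4) echo "lp=2,lc=2" ;;  # lc lowered so that lc + lp <= 4
		*) return 1 ;;          # unknown alignment
	esac
}

# Hypothetical helper: combine an optional BCJ flag with the LZMA2 options.
# dict=32MiB is an assumed value chosen only for illustration.
build_filter_args()
{
	bcj=$1
	opts=$(lzma2_opts_for_align "$2") || return 1
	echo "${bcj:+$bcj }--lzma2=${opts:+$opts,}dict=32MiB"
}

build_filter_args --arm64 4
build_filter_args --x86 1
build_filter_args "" 2
```

The resulting strings could be passed to `xz` on the command line to compare compressed sizes by hand, provided the installed xz supports the chosen BCJ filter.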