Linux glibc multilib issue with different "march" #1393

Open
baibeta opened this issue Dec 18, 2023 · 9 comments

Comments

@baibeta

baibeta commented Dec 18, 2023

For the newlib multilib build, there are different dirs such as rv64ima, rv64imafd, rv64imafdc.
However, for the Linux glibc multilib build, the multilib dir name only contains lib(XLEN)/ABI; there are no march dirs.

For example, I built with glibc_multilib_names="rv64imafd-lp64d rv64imafdc-lp64d", generated t-linux-multilib with "multilib-generator rv64imafd-lp64d--c", and then ran "make linux -j16".
There is only one "libc.a" in "lib64/lp64d/".

However, for the newlib build, there are different libc.a files in "rv64imafd/lp64d/" and "rv64imafdc/lp64d/".

If I build a program with -march=rv64gc or -march=rv64g, the "march" option only affects my code; the code in libc.a is not affected. The toolchain cannot choose a libc.a with or without the "c" extension.

I found this in "gcc/gcc/config/riscv/t-linux":
# Only XLEN and ABI affect Linux multilib dir names, e.g. /lib32/ilp32d/
MULTILIB_DIRNAMES := $(patsubst rv32%,lib32,$(patsubst rv64%,lib64,$(MULTILIB_DIRNAMES)))
It changes every rv64xxx name into lib64.
Does anyone know why glibc is designed this way?
Why doesn't the multilib directory distinguish between march values?
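
As a rough illustration of what the nested patsubst calls in t-linux do, here is a minimal shell sketch. The function name linux_multilib_dirnames is hypothetical, and it assumes MULTILIB_DIRNAMES is a whitespace-separated list in which arch names and ABI names are separate words; it only mimics the word-by-word rewrite and is not GCC's actual build logic.

```shell
# Hypothetical helper: mimic
#   $(patsubst rv32%,lib32,$(patsubst rv64%,lib64,$(MULTILIB_DIRNAMES)))
# Each word starting with rv32/rv64 is replaced wholesale by lib32/lib64;
# any other word (e.g. an ABI name) passes through unchanged.
linux_multilib_dirnames() {
    for word in "$@"; do
        case "$word" in
            rv32*) printf 'lib32\n' ;;
            rv64*) printf 'lib64\n' ;;
            *)     printf '%s\n' "$word" ;;
        esac
    done
}

# Two different arch names collapse to the same directory name.
linux_multilib_dirnames rv64imafd rv64imafdc lp64d
# prints:
#   lib64
#   lib64
#   lp64d
```

Under that assumption, rv64imafd and rv64imafdc both end up as lib64, which would be why only one lib64/lp64d directory appears.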

Given the current design, how can we statically compile glibc for different -march options?

Thanks

@TommyMurphyTM1234
Collaborator

Given the current design, how can we statically compile glibc for different -march options?

Can you clarify what exactly you mean by this please?

FWIW, in my opinion and experience the whole area of multilibs and trying to have a single toolchain that supports multiple arch/abi and extension combinations is a bit of a mess, very confusing, inconsistent, and arguably broken in places. The ongoing increasing number of new ratified and custom extensions is just exacerbating this further. I was trying to clarify some issues here but haven't completed this yet:

I'm leaning towards the approach of building separate toolchains for each specific arch/abi (e.g. configure --with-arch=... --with-abi=...) that I need to use rather than trying to build one that supports many. KISS!

@baibeta
Author

baibeta commented Dec 21, 2023

Thanks for your reply.

I mean that after the Linux toolchain is built, there is only one libc.so in "install_dir/sysroot/lib64/lp64d".

But after the newlib toolchain is built, there are several libc.a files, for example in "install_dir/newlib_tuple/lib/rv64imafd/lp64d" and "install_dir/newlib_tuple/lib/rv64imafdc/lp64d".

If I disassemble libc.so in "/sysroot/lib64/lp64d" using objdump, I find that its functions, such as "printf", use compressed (C extension) instructions.
But for the newlib multilib build, I can choose to use the libc.a in "lib/rv64imafd/lp64d" or "lib/rv64imafdc/lp64d". There are no C instructions in "lib/rv64imafd/lp64d".
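
A quicker check than reading a full disassembly might be the ELF header flags. The sketch below is only a hedged example: has_rvc_flag is a hypothetical helper, and it assumes that `readelf -h` prints a flags line of the form "Flags: 0x5, RVC, double-float ABI" for objects built with the C extension.

```shell
# Hypothetical helper: read `readelf -h` output on stdin and succeed if the
# ELF header flags line mentions RVC (compressed instructions).
# Assumption: the line looks like "  Flags:   0x5, RVC, double-float ABI".
has_rvc_flag() {
    grep -Eq '^[[:space:]]*Flags:.*[ ,]RVC(,|[[:space:]]|$)'
}

# Example use (the library path is a placeholder; adjust to your install tree):
#   riscv64-unknown-linux-gnu-readelf -h "$LIBC" | has_rvc_flag \
#       && echo "RVC flag set" || echo "RVC flag not set"
```

For a static archive, readelf prints one header per member, so this reports success if any member carries the flag.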

If I want to statically link my program and run it on a RISC-V machine without C extension support, the newlib multilib build is fine, but the binary from the glibc multilib build will crash.

I know I can fix the problem by building glibc without multilib and with the default march=rv64imafd.
But I still want to know why glibc does not support multilibs for different 'march' options.

Thanks

@TommyMurphyTM1234
Collaborator

TommyMurphyTM1234 commented Dec 21, 2023

But I still want to know why glibc does not support multilibs for different 'march' options.

Maybe because the default set of multilibs enabled by --enable-multilib for the Linux toolchain does not include any non C (compressed instructions) extension libraries?

I think that these defaults are used because in the RISC-V world there is a general assumption that a Linux target will support the C (compressed instructions) extension (or more specifically something like rvXXgc_zicsr_zifencei?). Although there was at least one implementation that did not and there were many questions here and on the mailing lists on how to enable toolchain/software support for it because of this "exception". E.g.:

Unfortunately the old way of simply regenerating t-linux-multilib using multilib-generator to specify a custom set of multilibs (e.g. in your case one(s) without C extension support) no longer seems to work as far as I know and have tried. My recollection is that this file is basically ignored these days even if removing it causes errors:

Maybe you can hand edit configure.ac or the Makefile generated by configure to modify the default set of multilibs but I thought that I tried that and it didn't work either? But I could be wrong as I am completely bamboozled by the multiple inconsistent multilib support mechanisms at this stage and I am leaning towards giving up on them to be honest in favour of multiple specific individual arch/abi toolchains instead.

@TommyMurphyTM1234
Collaborator

TommyMurphyTM1234 commented Dec 21, 2023

You might get there before me but try this...

Change this line:

[AC_SUBST(glibc_multilib_names,"rv32imac-ilp32 rv32imafdc-ilp32d rv64imac-lp64 rv64imafdc-lp64d")],

so that it also specifies rv64imafd-lp64d as a Linux/GLIBC multilib:

[AC_SUBST(glibc_multilib_names,"rv32imac-ilp32 rv32imafdc-ilp32d rv64imac-lp64 rv64imafdc-lp64d rv64imafd-lp64d")],

and then build as usual:

# If you did a previous build...
# rm -rf <prefix-dir>
# make distclean

./configure --prefix=<prefix-dir> --enable-multilib
make linux

<prefix-dir>/bin/riscv64-unknown-linux-gnu-gcc -print-multi-lib
...

This doesn't do exactly what you mentioned in your first post - i.e. building rv64imafd/lp64d and reusing that if rv64imafdc/lp64d is specified - but it may still be sufficient for your needs?

@TommyMurphyTM1234
Collaborator

Never mind - as I suspected earlier I tried this before and it doesn't work. In spite of changing the list of multilibs in configure.ac it still just does its own thing:

./installed-tools/bin/riscv64-unknown-linux-gnu-gcc -print-multi-lib
.;
lib32/ilp32;@march=rv32imac@mabi=ilp32
lib32/ilp32d;@march=rv32imafdc@mabi=ilp32d
lib64/lp64;@march=rv64imac@mabi=lp64
lib64/lp64d;@march=rv64imafdc@mabi=lp64d

./installed-tools/bin/riscv64-unknown-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=./installed-tools/bin/riscv64-unknown-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/home/user/Downloads/issue-1393/installed-tools/libexec/gcc/riscv64-unknown-linux-gnu/13.2.0/lto-wrapper
Target: riscv64-unknown-linux-gnu
Configured with: /home/user/Downloads/issue-1393/gcc/configure --target=riscv64-unknown-linux-gnu --prefix=/home/user/Downloads/issue-1393/installed-tools --with-sysroot=/home/user/Downloads/issue-1393/installed-tools/sysroot --with-newlib --without-headers --disable-shared --disable-threads --with-system-zlib --enable-tls --enable-languages=c --disable-libatomic --disable-libmudflap --disable-libssp --disable-libquadmath --disable-libgomp --disable-nls --disable-bootstrap --src=.././gcc --enable-multilib --with-abi=lp64d --with-arch=rv64imafdc --with-tune=rocket --with-isa-spec=20191213 'CFLAGS_FOR_TARGET=-O2    -mcmodel=medlow' 'CXXFLAGS_FOR_TARGET=-O2    -mcmodel=medlow'
Thread model: single
Supported LTO compression algorithms: zlib
gcc version 13.2.0 (GCC)

So only these multilibs:

  • rv32imac/ilp32
  • rv32imafdc/ilp32d
  • rv64imac/lp64
  • rv64imafdc/lp64d

and no sign of this:

  • rv64imafd/lp64d

@TommyMurphyTM1234
Collaborator

TommyMurphyTM1234 commented Dec 21, 2023

If you do this then you get a toolchain with support for the following arch/abis:

  • Toolchain default in the absence of any explicit -march=... -mabi=... when compiling code
    • rv64imafd/lp64d
  • Multilibs
    • rv32imac/ilp32
    • rv32imafdc/ilp32d
    • rv64imac/lp64
    • rv64imafdc/lp64d
./configure --prefix=`pwd`/installed-tools --enable-multilib --with-arch=rv64imafd --with-abi=lp64d
make linux

...

./installed-tools/bin/riscv64-unknown-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=./installed-tools/bin/riscv64-unknown-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/home/user/Downloads/issue-1393/installed-tools/libexec/gcc/riscv64-unknown-linux-gnu/13.2.0/lto-wrapper
Target: riscv64-unknown-linux-gnu
Configured with: /home/user/Downloads/issue-1393/gcc/configure --target=riscv64-unknown-linux-gnu --prefix=/home/user/Downloads/issue-1393/installed-tools --with-sysroot=/home/user/Downloads/issue-1393/installed-tools/sysroot --with-pkgversion=gc891d8dc23e --with-system-zlib --enable-shared --enable-tls --enable-languages=c,c++,fortran --disable-libmudflap --disable-libssp --disable-libquadmath --disable-libsanitizer --disable-nls --disable-bootstrap --src=.././gcc --enable-multilib --with-abi=lp64d --with-arch=rv64imafd --with-tune=rocket --with-isa-spec=20191213 'CFLAGS_FOR_TARGET=-O2    -mcmodel=medlow' 'CXXFLAGS_FOR_TARGET=-O2    -mcmodel=medlow'
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 13.2.0 (gc891d8dc23e)

./installed-tools/bin/riscv64-unknown-linux-gnu-gcc -print-multi-lib
.;
lib32/ilp32;@march=rv32imac@mabi=ilp32
lib32/ilp32d;@march=rv32imafdc@mabi=ilp32d
lib64/lp64;@march=rv64imac@mabi=lp64
lib64/lp64d;@march=rv64imafdc@mabi=lp64d

But it still doesn't address your question about why the Linux multilibs are organised on disk as they are and why, unlike the bare-metal/newlib toolchain, this doesn't seem to allow for multilibs distinguished by additional extension support...

@TommyMurphyTM1234
Collaborator

I found this is in "gcc/gcc/config/riscv/t-linux":
# Only XLEN and ABI affect Linux multilib dir names, e.g. /lib32/ilp32d/
MULTILIB_DIRNAMES := $(patsubst rv32%,lib32,$(patsubst rv64%,lib64,$(MULTILIB_DIRNAMES)))
It will change all rv64xxx into lib64.
Does anyone know why glibc is designed this way?
Why doesn't the multilib directory distinguish between marchs?

Given the current design, how can we statically compile glibc for different -march options?

I suspect that this is an issue that needs to be flagged/discussed and maybe addressed upstream in the GCC project.

@TommyMurphyTM1234
Collaborator

FWIW the issue is sort of covered here indirectly from the issue that I originally flagged:

@hauhsu

hauhsu commented Jul 19, 2024

I used to have the same confusion. After working on this for years, my understanding is:

Newlib is for bare-metal (embedded) platforms, which usually have optimized, limited hardware (to reduce cost) for a specific use case. Say hardware-1 is used only for scenario-1, which needs rv32imac; hardware-2 only for scenario-2, which needs rv32imafdc; and so on. Thus the toolchain needs to build different multilibs for those requirements. And we don't want to build just one maximally compatible multilib (say rv32i), since embedded systems usually have many other concerns (like code size).
To distinguish all those multilibs, the multilib path encoding includes march and mabi.
Here are some examples:

riscv64-unknown-elf-gcc -march=rv32i -mabi=ilp32 => rv32i/ilp32
riscv64-unknown-elf-gcc -march=rv32ic -mabi=ilp32 => rv32i/ilp32
riscv64-unknown-elf-gcc -march=rv64imac -mabi=lp64 => rv64imac/lp64
...

Note that the multilib path encodes both march and mabi.

Glibc is for platforms that run an OS, usually PCs and servers. These platforms are general purpose, powerful, and able to support different scenarios. There are fewer constraints compared with embedded systems. For example, your PC probably has floating-point support, but not all the applications running on it use floating point. (This is not the case for embedded systems. The application doesn't need FP? Remove FP support from the hardware!)

Also, the OS and Glibc are able to choose suitable implementations based on the hardware capabilities at run time. That means Glibc might have memcpy implemented for both rv64gc and rv64gcv (RVV optimized). When the function is executed, Glibc asks the OS kernel whether RVV is supported on the CPU to decide which implementation of memcpy to use. (Search for ifunc for more information.)

Plus, the Glibc toolchain is usually packaged with OS distributions. So Ubuntu has its own toolchain package, and Arch Linux has its own toolchain package. The OS distributions decide which extensions are best.
I think this echoes @palmer-dabbelt's comment: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111065#c3

... for Linux we assumed ISA compatibility was up to the distro

For all the reasons above, we don't need a complex multilib path encoding like the one for embedded systems, so we only encode XLEN and mabi. Here are some examples:

riscv64-unknown-linux-gnu-gcc -march=rv32imac -mabi=ilp32 => lib32/ilp32
riscv64-unknown-linux-gnu-gcc -march=rv32imaf -mabi=ilp32 => lib32/ilp32
...
riscv64-unknown-linux-gnu-gcc -march=rv64imafd -mabi=lp64 => lib64/lp64
riscv64-unknown-linux-gnu-gcc -march=rv64imafdc -mabi=lp64 => lib64/lp64

Note that different march values share the same multilib path encoding.
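
To see which directory a given toolchain actually selects, GCC's -print-multi-directory option can be queried for each march. The loop below is a small sketch, not a definitive procedure: print_multi_dirs is a hypothetical helper, and the compiler name and march list in the example are placeholders for whatever your installation provides.

```shell
# Hypothetical helper: for one ABI and several march values, print the
# multilib directory the given compiler driver reports.
# Usage: print_multi_dirs <compiler> <mabi> <march>...
print_multi_dirs() {
    pmd_cc=$1
    pmd_abi=$2
    shift 2
    for pmd_arch in "$@"; do
        printf '%s => %s\n' "$pmd_arch" \
            "$("$pmd_cc" -march="$pmd_arch" -mabi="$pmd_abi" -print-multi-directory)"
    done
}

# Only runs if a Linux-targeted RISC-V GCC is on PATH (name is an assumption).
if command -v riscv64-unknown-linux-gnu-gcc >/dev/null 2>&1; then
    print_multi_dirs riscv64-unknown-linux-gnu-gcc lp64d rv64imafd rv64imafdc
fi
```

Running the same helper with riscv64-unknown-elf-gcc should make the difference between the two encodings visible side by side.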


You might have already read the great blog post written by @palmer-dabbelt:
https://www.sifive.com/blog/all-aboard-part-5-risc-v-multilib
(Although this was written when RISC-V was just emerging, I don't think things have changed much since then.)
