[Owain testing] Standing up the stack on Kathleen #61

Open
owainkenwayucl opened this issue Nov 13, 2024 · 15 comments
Labels: a puzzlement, bug, working info

Comments

@owainkenwayucl

As a result of trying to help with CMake in #58 I've been trying to stand everything up on Kathleen, as ccspapp.

Because it started there, the site is named "owain-issue58". Sorry.

This led to #60.

Most of the base install on Kathleen now installs (spack -e base install), but with some failures:

==> Error: beast2-2.7.4-rikpxdve6hqrxxlip26e3n2decji6eix: Package was not installed
==> Error: castep-23.1-ucwspmytbk3c4uu7qampn4ebpuhfdhpz: Package was not installed
==> Error: castep-24.1-lrbxb3yrppp5jonztcawb3mkydt2gt22: Package was not installed
==> Error: gromacs-2023.5-u62yewfylhpwfkxzr5mvnu5ck2l5fipu: Package was not installed
==> Error: gromacs-2024.3-ogai2ly4strsjcp4og4hqeu4n4baokvo: Package was not installed
==> Error: hdf5-1.14.3-bpqekariph2hh7ha3ygnjuelvtgrg37i: Package was not installed
==> Error: lammps-20240829-fhg2sunktv5m7qdgqx732yeew5kn4467: Package was not installed
==> Error: namd-2.14-lfzr5jvycp2ssvwbz7pxhmhqmabnkdmp: Package was not installed
==> Error: namd-3.0-lpx5kxpvuzcpkpr6chatj7byxdtcupxy: Package was not installed
==> Error: netcdf-c-4.9.2-pgnfkfvli7xnfdm37vzu5eqsmb3b5lkq: Package was not installed
==> Error: netcdf-fortran-4.6.1-n3yhrus2xxrohntuu6dcj7gntfgnk5s6: Package was not installed
==> Error: openmpi-4.1.6-xtzhq2emwlwbqflxuyhnhwomd44qxw3s: Package was not installed
==> Error: Installation request failed.  Refer to reported errors for failing package(s).

These failures seem to stem from various dependencies not being able to find libcrypt.so.2:

==> Error: ProcessError: Command exited with status 127:
    './autogen.sh'

1 error found in build log:
     1    ==> numactl: Executing phase: 'autoreconf'
     2    ==> [2024-11-13-09:52:11.076603] './autogen.sh'
  >> 3    /lustre/shared/ucl/apps/spack/0.22/owain-issue58/spack/opt/spack/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__
          /__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__/linux-centos7-x86_64_v3/gcc-10.2.
          1/perl-5.40.0-cznthvrecpjllyyx3dagchkcqpwooivw/bin/perl: error while loading shared libraries: libcrypt.so.2: cannot open shared object file: No 
          such file or directory
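
To see which shared libraries a binary pulled from the build cache cannot resolve on a given node, one option is to parse the output of ldd. The following is a minimal sketch, assuming ldd's usual "name => not found" line format; the function names missing_libs and check_binary are hypothetical helpers, and the sample text is made up for illustration.

```python
import subprocess


def missing_libs(ldd_output):
    """Return the sonames that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Resolved and unresolved entries both look like "name => target".
        name, sep, target = line.strip().partition(" => ")
        if sep and target.startswith("not found"):
            missing.append(name)
    return missing


def check_binary(path):
    """Run ldd on a binary (hypothetical helper) and list unresolved libraries."""
    result = subprocess.run(["ldd", path], capture_output=True, text=True, check=False)
    return missing_libs(result.stdout)


# Made-up ldd output resembling the perl failure above.
SAMPLE = """\
    linux-vdso.so.1 (0x00007ffd0a5f2000)
    libcrypt.so.2 => not found
    libc.so.6 => /lib64/libc.so.6 (0x00007f2b1c000000)
"""

print(missing_libs(SAMPLE))
```

Pointing check_binary at the cached perl binary should list libcrypt.so.2 on a system that only ships libcrypt.so.1.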

This issue is just to capture my notes as I work on this!

@owainkenwayucl
Author

That library isn't on Myriad in the usual locations so........

@owainkenwayucl owainkenwayucl self-assigned this Nov 13, 2024
@giordano
Member

> error while loading shared libraries: libcrypt.so.2: cannot open shared object file: No such file or directory

libcrypt.so.2 should be coming from glibc 🤔 Spack relies on glibc coming from the system, so it's a bit worrying that this library can't be found (which raises the question of how it ended up being linked against it in the first place), unless Kathleen has an old glibc that only provides the old libcrypt.so.1.

@owainkenwayucl
Author

> libcrypt.so.2 should be coming from glibc 🤔 Spack relies on glibc coming from the system, bit worrying this library can't be found (which begs the question how it ended up being linked to that in the first place), unless on kathleen we have an old glibc which has the old libcrypt.so.1

Myriad also only has libcrypt.so.1. This is pulling things from the general Spack build cache, which is presumably why it's linked against libcrypt.so.2.

@giordano
Member

> This is pulling things from the general spack build cache which is presumably why it's linked against libcrypt.so.2

Ah, then that's problematic 😕

@haampie
Collaborator

haampie commented Nov 13, 2024

libcrypt.so is generally not problematic whether glibc provides it or not, because recipes should force it to come from depends_on("libxcrypt"). If they don't, that's a packaging bug.

However, with Perl it's tricky because libxcrypt requires Perl and Perl requires libxcrypt. It's a known issue:

besser82/libxcrypt#187

So in Spack libxcrypt is not a dependency of perl, but perl's configure script will enable it anyhow if it can find it as part of glibc; I don't think the configure script allows one to disable it explicitly.

@owainkenwayucl
Author

Thanks!

@heatherkellyucl
Collaborator

Ah right, this is one where we need a perl built and added to our own build cache first so that we don't hit this problem (which has been the case for all the previous work I was doing). Then it won't try to get it from manylinux.

@heatherkellyucl
Collaborator

I was thinking that adding system perl as an external might resolve this, but that issue says: "But to configure libxcrypt, you need perl 5.14 with open.pm, which is not available on RHEL and derivatives by default, and you may not have access to yum install perl-open." So that probably isn't going to help us.

@giordano
Member

How about installing libxcrypt from the RHEL repos and using that as an external? 😅 Still, avoiding externals would be better to keep things self-consistent.

@heatherkellyucl
Collaborator

heatherkellyucl commented Nov 14, 2024

A discussion in the Spack Slack about never reusing Perl from a buildcache while reusing everything else pointed at the examples in spack/spack#42782 of excluding things from being reused by the concretizer. Not sure if that's useful here.

The perl we get from first_compiler.yaml was like this when the binary cache was not added. The gcc 11.2.1 here is our starting external compiler.

==> Installed packages
-- linux-rhel7-cascadelake / gcc@11.2.1 -------------------------
[email protected]+cpanm+opcode+open+shared+threads build_system=generic
    [email protected]+cxx~docs+stl build_system=autotools patches=26090f4,b231fcc
    [email protected]~debug~pic+shared build_system=generic
        [email protected] build_system=autotools
            [email protected] build_system=autotools libs=shared,static
    [email protected] build_system=generic
    [email protected] build_system=autotools
        [email protected] build_system=autotools patches=bbf97f1
            [email protected]~symlinks+termlib abi=none build_system=autotools patches=7a351bc
                [email protected] build_system=autotools
    [email protected] build_system=autotools patches=be65fec,e179c43
    [email protected]~guile build_system=generic
    [email protected]+optimize+pic+shared build_system=makefile

@owainkenwayucl
Author

(adding system perl only removed netcdf from the list of things that don't install)

@owainkenwayucl
Author

The perl from the firstcompiler run is in the build-cache but it doesn't help.

@owainkenwayucl
Author

We've tried various things.

Some things definitely break things badly.

If you set

  include_concrete:
  - /path/that/doesnt/exist

it will break spack env for all environments, e.g. you can't run spack env remove <other environment name>.

Related to that, include_concrete doesn't expand environment variables in its paths.
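
Given both problems, it may be worth sanity-checking the paths before they go into include_concrete. The following is a rough sketch, not anything Spack provides: check_include_paths is a hypothetical helper, and it only assumes that each entry should be a literal path to an existing directory.

```python
import os


def check_include_paths(paths):
    """Return (path, problem) pairs for entries that look unsafe to include."""
    problems = []
    for path in paths:
        if "$" in path or path.startswith("~"):
            # These would need expanding by hand before being written out.
            problems.append((path, "contains an unexpanded variable or ~"))
        elif not os.path.isdir(path):
            problems.append((path, "directory does not exist"))
    return problems


# The first entry is the example from above; the second is made up.
for path, problem in check_include_paths(["/path/that/doesnt/exist", "$HOME/envs/base"]):
    print(f"{path}: {problem}")
```

An empty result means every entry is a literal path to a directory that exists; it says nothing about whether that directory is a valid concretized environment.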

@heatherkellyucl
Collaborator

heatherkellyucl commented Nov 14, 2024

So having the binary cache available caused much chaos and several hours of going round in circles.

One of the environments Owain got ended up like this:

-- linux-rhel7-cascadelake / [email protected] -------------------------
[email protected]     [email protected]   [email protected]    [email protected]    [email protected]      [email protected]              [email protected]  [email protected]
[email protected]      [email protected]  [email protected]  [email protected]  lammps@20240829  [email protected]        [email protected]
[email protected]  [email protected]  [email protected]  [email protected]  [email protected]        [email protected]  [email protected]

==> Installed packages
-- linux-centos7-x86_64_v3 / gcc@10.2.1 -------------------------
[email protected]        [email protected]             [email protected]  [email protected]    [email protected]                     [email protected]
[email protected]      [email protected]        [email protected]          [email protected]     [email protected]                [email protected]
[email protected]  [email protected]        [email protected]    [email protected]  [email protected]             [email protected]
[email protected]          [email protected]            [email protected]        [email protected]    [email protected]           [email protected]
[email protected]          [email protected]            [email protected]      [email protected]    [email protected]            [email protected]
[email protected]         [email protected]           [email protected]        [email protected]   [email protected]                [email protected]
[email protected]         [email protected]          [email protected]     [email protected]   [email protected]         [email protected]
[email protected]          [email protected]           [email protected]     [email protected]   [email protected]      [email protected]
[email protected]          [email protected]         [email protected]     [email protected]   [email protected]               [email protected]
[email protected]       [email protected]  [email protected]       [email protected]   [email protected]  [email protected]
[email protected]          [email protected]       [email protected]       [email protected]     [email protected]              [email protected]
[email protected]      [email protected]          [email protected]           [email protected]     [email protected]                [email protected]
[email protected]      [email protected]         [email protected]            [email protected]     [email protected]                [email protected]
[email protected]   [email protected]         [email protected]          [email protected]        [email protected]

-- linux-rhel7-cascadelake / [email protected] -------------------------
[email protected]    [email protected]  [email protected]    [email protected]  [email protected]
[email protected]  [email protected]          [email protected]  [email protected]    [email protected]

-- linux-rhel7-cascadelake / [email protected] -------------------------
[email protected]         [email protected]           [email protected]         [email protected]     [email protected]     [email protected]
[email protected]    [email protected]           [email protected]            [email protected]     [email protected]  [email protected]
[email protected]     [email protected]         [email protected]             [email protected]  [email protected]  [email protected]
[email protected]  [email protected]          [email protected]        [email protected]  [email protected]     [email protected]
[email protected]        [email protected]       [email protected]  [email protected]         [email protected]         [email protected]
[email protected]    [email protected]      [email protected]    [email protected]         [email protected]                [email protected]
[email protected]       [email protected]           [email protected]         [email protected]       [email protected]              [email protected]
[email protected]      [email protected]        [email protected]      [email protected]        [email protected]                [email protected]
[email protected]         [email protected]  [email protected]     [email protected]           [email protected]              [email protected]
[email protected]         [email protected]       [email protected]    [email protected]      [email protected]                 [email protected]
[email protected]       [email protected]_1    [email protected]   [email protected]      [email protected]

-- linux-rhel7-x86_64_v3 / [email protected] ---------------------------
[email protected]  [email protected]
==> 160 installed packages

The linux-centos7-x86_64_v3 / gcc@10.2.1 chunk all came from the binary cache, but we shouldn't be getting anything that depends on gcc 10, so we don't want it to do that (and it had the same perl issue anyway).

I do need to look at the target arch: we aren't specifying anything for it right now, and we've ended up with a variety, which is likely not helping: linux-rhel7-x86_64_v3, linux-rhel7-cascadelake, linux-centos7-x86_64_v3. (I don't know whether it was the gcc version or the cascadelake target that meant more of our linux-rhel7-cascadelake / [email protected] set went unused.)
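
To get a quick view of how many different OS/target combinations an environment has ended up with, the group headers of spack find output can be tallied. This is only a sketch: it assumes headers of the form "-- platform-os-target / compiler ---" as in the listing above, count_targets is a hypothetical helper, and the sample text (including the gcc@12.3.0 version) is made up for illustration.

```python
import re
from collections import Counter

# Matches group headers such as "-- linux-rhel7-cascadelake / gcc@11.2.1 -----".
HEADER = re.compile(
    r"^-- (?P<platform>[^-\s]+)-(?P<os>[^-\s]+)-(?P<target>\S+) / (?P<compiler>\S+) -+$"
)


def count_targets(find_output):
    """Count header groups per (os, target) pair in spack-find-style output."""
    counts = Counter()
    for line in find_output.splitlines():
        match = HEADER.match(line.strip())
        if match:
            counts[(match["os"], match["target"])] += 1
    return counts


# Made-up headers for illustration only.
SAMPLE = """\
-- linux-centos7-x86_64_v3 / gcc@10.2.1 -------------------------
-- linux-rhel7-cascadelake / gcc@11.2.1 -------------------------
-- linux-rhel7-cascadelake / gcc@12.3.0 -------------------------
-- linux-rhel7-x86_64_v3 / gcc@11.2.1 ---------------------------
"""

for (os_name, target), n in sorted(count_targets(SAMPLE).items()):
    print(os_name, target, n)
```

More than one (os, target) pair in the result is a hint that the target isn't pinned anywhere.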

I don't know what the minimum is that we need to install locally to still make things in the binary cache usable and perl not be broken. A libxcrypt at the same time as our first perl, with that chunk as x86_64_v3 and not allowing the binary cache?

@giordano
Member

@haampie would you be able to rebuild perl in the buildcache to use libxcrypt, which uses the first perl in the buildcache (basically do the bootstrap process)? Feels like the current situation isn't ideal 🥲


4 participants