Nightly Docker container build fails due to missing GLIBCXX library #2836

Closed
zaneselvans opened this issue Sep 7, 2023 · 0 comments · Fixed by #2837
Labels

  • cloud: Stuff that has to do with adapting PUDL to work in cloud computing context.
  • nightly-builds: Anything having to do with nightly builds or continuous deployment.

zaneselvans commented Sep 7, 2023

For the last week or so the Docker container builds associated with our nightly deployments have been failing.

Ultimately the error comes from attempting to run pudl_setup, so the environment has been solved and the container is essentially built, but we get a runtime error that apparently has to do with pandas. Could this error be related to the upgrade to pandas 2.0?

[stage-0 11/11] RUN --mount=type=bind,source=.git,target=/home/catalyst/pudl/.git     conda run --no-capture-output --prefix /home/catalyst/env pip install --no-cache-dir -e './[dev,doc,test,datasette]' &&     conda run --no-capture-output --prefix /home/catalyst/env pudl_setup:
122.2   File "/home/catalyst/env/lib/python3.11/site-packages/pandas/core/frame.py", line 182, in <module>
122.2     from pandas.core.generic import NDFrame
122.2   File "/home/catalyst/env/lib/python3.11/site-packages/pandas/core/generic.py", line 179, in <module>
122.2     from pandas.core.window import (
122.2   File "/home/catalyst/env/lib/python3.11/site-packages/pandas/core/window/__init__.py", line 1, in <module>
122.2     from pandas.core.window.ewm import (
122.2   File "/home/catalyst/env/lib/python3.11/site-packages/pandas/core/window/ewm.py", line 11, in <module>
122.2     import pandas._libs.window.aggregations as window_aggregations
122.2 ImportError: /lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /home/catalyst/env/lib/python3.11/site-packages/pandas/_libs/window/aggregations.cpython-311-x86_64-linux-gnu.so)
122.3 ERROR conda.cli.main_run:execute(41): `conda run pudl_setup` failed. (See above for error)
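
The ImportError above says the system libstdc++.so.6 does not define the GLIBCXX_3.4.29 version tag. As a rough way to confirm that from inside the container, here is a minimal sketch that scans the library's bytes for version tags, much like running `strings` on it and filtering for GLIBCXX. The function names are made up for illustration, and the library path is simply the one from the traceback.

```python
import re
from pathlib import Path

# Version tags appear in the shared library as byte strings like b"GLIBCXX_3.4.29".
_GLIBCXX_TAG = re.compile(rb"GLIBCXX_(\d+(?:\.\d+)*)")


def glibcxx_versions(blob: bytes) -> list[tuple[int, ...]]:
    """Return the sorted, de-duplicated GLIBCXX versions named in a binary blob."""
    found = {
        tuple(int(part) for part in match.group(1).split(b"."))
        for match in _GLIBCXX_TAG.finditer(blob)
    }
    return sorted(found)


def provides(blob: bytes, required: str) -> bool:
    """True if the blob names the exact required tag, e.g. 'GLIBCXX_3.4.29'."""
    wanted = tuple(int(part) for part in required.removeprefix("GLIBCXX_").split("."))
    return wanted in glibcxx_versions(blob)


if __name__ == "__main__":
    # Path copied from the ImportError; it may differ on other systems.
    lib = Path("/lib/x86_64-linux-gnu/libstdc++.so.6")
    if lib.exists():
        blob = lib.read_bytes()
        versions = glibcxx_versions(blob)
        if versions:
            print("newest GLIBCXX tag:", ".".join(map(str, versions[-1])))
        print("has GLIBCXX_3.4.29:", provides(blob, "GLIBCXX_3.4.29"))
```

If the newest tag printed is lower than 3.4.29, the base image's libstdc++ is older than what the compiled extension module was built against.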

Attempting to reproduce the problem locally

Building the container myself I noticed some things:

  • The mambaforge image that we have been using is more than a year out of date.
  • We're initially creating the Python environment with Python 3.10.
  • We've got the google-cloud-sdk package pinned to a very old version.
  • My container build fails when attempting to create the pudl-test environment because mamba claims there's no google-cloud-sdk package available, but that package definitely appears to exist and installs just fine locally. Outside of Docker, I have no trouble creating the pudl-test environment.
  • I updated to a recent base image, Python 3.11, and the most recent version available for google-cloud-sdk, and still got the same error:
18.62 Looking for: ["python[version='>=3.11,<3.12']", "geopandas[version='>=0.13,<0.14']", "pip[version='>=22,<24']", "shapely[version='>=2,<3']", "python-snappy[version='>=0.6,<1']", "setuptools[version='<69']", "sqlite[version='>=3.36,<4']", "tox[version='>=4,<5']", "google-cloud-sdk[version='>=386,<446']", 'sqlite-utils~=3.29']
18.62
18.62
19.97 Could not solve for environment specs
19.97 The following package could not be installed
19.97 └─ google-cloud-sdk >=386,<446  does not exist (perhaps a typo or a missing channel).

Despite that, the package shows up in a search:

$ mamba search "google-cloud-sdk>=445"
Loading channels: done
# Name                       Version           Build  Channel
google-cloud-sdk             445.0.0 py310hbe9552e_0  conda-forge
google-cloud-sdk             445.0.0 py311h267d04e_0  conda-forge
google-cloud-sdk             445.0.0  py38h10201cd_0  conda-forge
google-cloud-sdk             445.0.0  py39h2804cbe_0  conda-forge

Wondering if it might be different on different hardware platforms... seems like it's there just fine for Linux too:

$ mamba search "google-cloud-sdk>=445[subdir=linux-64]"    
Loading channels: done
# Name                       Version           Build  Channel
google-cloud-sdk             445.0.0 py310hff52083_0  conda-forge
google-cloud-sdk             445.0.0 py311h38be061_0  conda-forge
google-cloud-sdk             445.0.0  py38h578d9bd_0  conda-forge
google-cloud-sdk             445.0.0  py39h4162558_0  conda-forge
google-cloud-sdk             445.0.0  py39hf3d152e_0  conda-forge

However, when running the build-deploy-pudl action on GitHub with the new versions, it has no trouble building the environment, so I'm not sure what the issue is locally.
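
As a sanity check on the pin itself, the sketch below tests a version string against the `>=386,<446` bounds from the solver message. It is a deliberately simplified stand-in for conda's real spec matching: it only handles dotted numeric versions and the five comparison operators shown, and the `satisfies` helper is hypothetical.

```python
import operator

_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}


def _key(text: str, width: int) -> tuple[int, ...]:
    """Turn '445.0.0' into (445, 0, 0), zero-padded to a common width."""
    parts = [int(part) for part in text.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def satisfies(version: str, spec: str) -> bool:
    """Check a numeric version against a comma-separated spec like '>=386,<446'."""
    for clause in spec.split(","):
        clause = clause.strip()
        # Try two-character operators first so '>=' is not read as '>'.
        for symbol in (">=", "<=", "==", ">", "<"):
            if clause.startswith(symbol):
                bound = clause[len(symbol):].strip()
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
        width = max(version.count("."), bound.count(".")) + 1
        if not _OPS[symbol](_key(version, width), _key(bound, width)):
            return False
    return True


print(satisfies("445.0.0", ">=386,<446"))
```

Version 445.0.0 falls inside those bounds, so the bounds alone do not explain why the solver reports that the package does not exist.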

The eventual call to pudl_setup still fails with the new software environment. I note that the following packages don't have any wheels available for download from PyPI, so pip is building the wheels itself:

timezonefinder
odfpy
restructuredtext-lint
stringcase
unicodecsv
pendulum
linear-tsv
petl

Another thing that seems a little funny is that pip uninstalls the most recent versions of a few packages at the end of the pip install step. I assume this is because they were installed by conda, then found to conflict with other pip dependencies, and so were downgraded to whatever version satisfied those dependencies:

#17 63.79   Attempting uninstall: urllib3
#17 63.80     Found existing installation: urllib3 2.0.4
#17 63.80     Uninstalling urllib3-2.0.4:
#17 63.81       Successfully uninstalled urllib3-2.0.4
#17 67.60   Attempting uninstall: numpy
#17 67.62     Found existing installation: numpy 1.25.2
#17 67.67     Uninstalling numpy-1.25.2:
#17 67.90       Successfully uninstalled numpy-1.25.2
#17 76.91   Attempting uninstall: pandas
#17 76.93     Found existing installation: pandas 2.1.0
#17 77.06     Uninstalling pandas-2.1.0:
#17 77.55       Successfully uninstalled pandas-2.1.0

The final installed versions were:

  • urllib3==1.26.16
  • numpy==1.24.4
  • pandas==2.0.3
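
To spot this kind of silent replacement without reading the whole build log, a short script can pull the "Found existing installation" lines out of pip's output and compare them with the versions that ended up installed. This is a minimal sketch using the log lines and final versions quoted above; the function names are made up for illustration.

```python
import re

_FOUND = re.compile(r"Found existing installation: (\S+) (\S+)")


def replaced_packages(log: str) -> dict[str, str]:
    """Map each package pip uninstalled to the version it had before."""
    return {name: version for name, version in _FOUND.findall(log)}


def version_changes(log: str, final: dict[str, str]) -> list[tuple[str, str, str]]:
    """List (package, version_before, version_after) where the version changed."""
    before = replaced_packages(log)
    return sorted(
        (name, old, final[name])
        for name, old in before.items()
        if name in final and final[name] != old
    )


# Excerpt of the build log and the final versions reported in this issue.
LOG = """\
#17 63.80     Found existing installation: urllib3 2.0.4
#17 67.62     Found existing installation: numpy 1.25.2
#17 76.93     Found existing installation: pandas 2.1.0
"""
FINAL = {"urllib3": "1.26.16", "numpy": "1.24.4", "pandas": "2.0.3"}

for name, old, new in version_changes(LOG, FINAL):
    print(f"{name}: {old} -> {new}")
```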

I added version pins for these packages to the test-environment.yml file so that they wouldn't get replaced by the later pip install, but that did not fix the missing GLIBCXX problem. I still can't test locally because mamba can't find google-cloud-sdk inside the Docker build, for reasons I do not understand.

Looking more closely at when the Docker build issue started, it was actually on August 29th, a day before the pandas 2.0 branch got merged in. The first failing build used pandas 1.5.3, so maybe it's not related to that at all... What got merged into dev on August 28th?

Turns out the culprit was the update to PyArrow 13.

@zaneselvans zaneselvans moved this from New to In progress in Catalyst Megaproject Sep 7, 2023
@zaneselvans zaneselvans added cloud Stuff that has to do with adapting PUDL to work in cloud computing context. nightly-builds Anything having to do with nightly builds or continuous deployment. labels Sep 7, 2023
@zaneselvans zaneselvans linked a pull request Sep 7, 2023 that will close this issue
@zaneselvans zaneselvans moved this from In progress to In review in Catalyst Megaproject Sep 7, 2023
@zaneselvans zaneselvans moved this from In review to Done in Catalyst Megaproject Sep 7, 2023