Circular dependency for installing GDAL using mosaic.setup_gdal() #524
Comments
These are the instructions: https://databrickslabs.github.io/mosaic/usage/install-gdal.html. It is not circular. |
Maybe the subject string was not the best; I've updated the subject. Let me clarify the problem I face with DBR 13.3. The instructions do not include details on how to install On DBR 13.3 (which has no GDAL library) a user cannot install mosaic, and thus cannot run
|
@smartkiwi did you ever figure this out? I'm stuck at the same location. |
Again, not a circular dependency. The following is what the docs are conveying:
Providing the signature to the code for
|
If you are running on a "Single Node" Spark instance (vs. a cluster) and do not want to set up an init script, then just manually run the contents of the generated script in a cell in your notebook, from here, e.g. something like the following (you are root when running in the notebook, so no sudo):
Then no cluster restart needed for |
Hi Michael, Thanks for sharing this so quickly. The issue that I'm having is on the first step,
Note: you may need to restart the kernel using dbutils.library.restartPython() to use updated packages.
Collecting databricks-mosaic
Downloading databricks_mosaic-0.4.2-py3-none-any.whl.metadata (828 bytes)
Collecting geopandas<0.14.4,>=0.14 (from databricks-mosaic)
Downloading geopandas-0.14.3-py3-none-any.whl.metadata (1.5 kB)
Collecting h3<4.0,>=3.7 (from databricks-mosaic)
Downloading h3-3.7.7-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.metadata (4.9 kB)
Requirement already satisfied: ipython>=7.22.0 in /databricks/python3/lib/python3.10/site-packages (from databricks-mosaic) (8.10.0)
Collecting keplergl==0.3.2 (from databricks-mosaic)
Downloading keplergl-0.3.2.tar.gz (9.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.7/9.7 MB 74.0 MB/s eta 0:00:00
Installing build dependencies: started
Installing build dependencies: finished with status 'done'
Getting requirements to build wheel: started
Getting requirements to build wheel: finished with status 'done'
Preparing metadata (pyproject.toml): started
Preparing metadata (pyproject.toml): finished with status 'done'
Collecting pyspark<3.5,>=3.4 (from databricks-mosaic)
Downloading pyspark-3.4.3.tar.gz (311.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 311.4/311.4 MB 49.2 MB/s eta 0:00:00
Preparing metadata (setup.py): started
Preparing metadata (setup.py): finished with status 'done'
Requirement already satisfied: ipywidgets<8,>=7.0.0 in /databricks/python3/lib/python3.10/site-packages (from keplergl==0.3.2->databricks-mosaic) (7.7.2)
Collecting traittypes>=0.2.1 (from keplergl==0.3.2->databricks-mosaic)
Downloading traittypes-0.2.1-py2.py3-none-any.whl.metadata (1.0 kB)
Requirement already satisfied: pandas>=0.23.0 in /databricks/python3/lib/python3.10/site-packages (from keplergl==0.3.2->databricks-mosaic) (1.4.4)
Collecting Shapely>=1.6.4.post2 (from keplergl==0.3.2->databricks-mosaic)
Downloading shapely-2.0.6-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.metadata (7.0 kB)
Collecting fiona>=1.8.21 (from geopandas<0.14.4,>=0.14->databricks-mosaic)
Downloading fiona-1.9.6.tar.gz (411 kB)
Installing build dependencies: started
Installing build dependencies: finished with status 'done'
Getting requirements to build wheel: started
Getting requirements to build wheel: finished with status 'error'
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> [3 lines of output]
<string>:86: DeprecationWarning: The 'warn' function is deprecated, use 'warning' instead
WARNING:root:Failed to get options via gdal-config: [Errno 2] No such file or directory: 'gdal-config'
CRITICAL:root:A GDAL API version must be specified. Provide a path to gdal-config using a GDAL_CONFIG environment variable or use a GDAL_VERSION environment variable.
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip. |
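The log above fails because the fiona source build cannot find gdal-config and no GDAL_VERSION is set. A minimal sketch of a pre-flight check that could be run in a notebook cell before the pip install is shown below. The function name detect_gdal_version and its env parameter are hypothetical, and checking GDAL_VERSION before GDAL_CONFIG is a choice made for this sketch, not a statement about how fiona resolves them. It relies only on the Python standard library and on gdal-config --version, which prints the installed GDAL version.

```python
import os
import shutil
import subprocess
from typing import Mapping, Optional


def detect_gdal_version(
    config_binary: str = "gdal-config",
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return a GDAL version string a source build could pick up, or None.

    Hypothetical helper for illustration; not part of mosaic or fiona.
    """
    env = os.environ if env is None else env

    # The pip log names GDAL_VERSION as one way to supply the version.
    explicit = env.get("GDAL_VERSION")
    if explicit:
        return explicit

    # The pip log also names GDAL_CONFIG as a path to gdal-config;
    # otherwise fall back to searching PATH for the binary.
    candidate = env.get("GDAL_CONFIG") or shutil.which(config_binary)
    if not candidate:
        return None

    try:
        result = subprocess.run(
            [candidate, "--version"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # Binary missing, not executable, or exited with an error.
        return None
    return result.stdout.strip() or None


if __name__ == "__main__":
    version = detect_gdal_version()
    if version is None:
        print("GDAL not found: install the system GDAL packages before pip.")
    else:
        print(f"GDAL {version} found.")
```

If the check returns None, the system GDAL packages need to be installed first; only then is a source build of fiona likely to get past the "Getting requirements to build wheel" step.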
Referencing the Installation Guide, please run on an Assigned Cluster and see if that clears up your issue. Also, refer to the pending release of 0.4.3, PR #568, for any additional Python library "version fixing" that might now be required in DBR 13.3 (notably, we are going to identify a range for numpy, as version 2.0 is no longer compatible with the installed scikit-learn version). |
@mjohns-databricks thanks for recommending running on an Assigned Cluster. Initially it didn't work either, but after running
%sh
sudo apt update
sudo apt install -y cmake libgdal-dev
we were able to run
Thanks again. |
GDAL installation helper is not usable as part of mosaic library
Currently the mosaic.setup_gdal() helper requires GDAL to be installed. This makes it difficult for users to use.
Versions:
The Install GDAL documentation doesn't work because of this.
https://github.com/databrickslabs/mosaic/blob/main/docs/source/usage/install-gdal.rst
To Reproduce
Steps to reproduce the behavior:
Running
%pip install databricks-mosaic
in a Databricks notebook on vanilla DBR 13.3 fails with an error that GDAL is not found.
Expected behavior
Documentation and tooling should be improved to allow users to install GDAL first, without requiring mosaic to be installed.
Alternatively, there should be some way to install the mosaic library without its GDAL dependencies, so that users can use the mosaic.setup_gdal function.