This is the official Python SDK for Bacalhau, named bacalhau-sdk.
It is a high-level SDK that ships the client-side logic (e.g. signing requests) needed to query the endpoints.
Please take a look at the examples for snippets to create, list and inspect jobs.
Under the hood, this SDK uses bacalhau-apiclient (autogenerated via Swagger/OpenAPI) to interact with the API.
Please use this SDK in your Python projects instead of the lower-level bacalhau-apiclient.
The latter is listed as a dependency of this SDK and will be installed automatically when you follow the installation instructions below.
- List, create and inspect Bacalhau jobs using Python objects 🎈
- Use the production network, or set the following environment variables to target any other Bacalhau network:
  - BACALHAU_API_HOST
  - BACALHAU_API_PORT
- Generate a key pair for signing requests, stored in the path specified by the BACALHAU_DIR env var (default: ~/.bacalhau)
$ pip install bacalhau-sdk
Clone the public repository:
$ git clone https://github.com/bacalhau-project/bacalhau/
Once you have a copy of the source, you can install it with:
$ cd bacalhau/python/
$ pip install .
Like the Bacalhau CLI, this SDK signs requests with a key pair stored in BACALHAU_DIR.
If a key pair is not found there, it will create one for you.
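The lookup described above can be sketched as follows. This is only an illustration: `resolve_bacalhau_dir` is a hypothetical helper, not part of bacalhau-sdk, and the SDK's own resolution logic may differ. It assumes nothing beyond what is stated here, namely that BACALHAU_DIR names the directory and that ~/.bacalhau is the default.

```python
import os
from pathlib import Path


def resolve_bacalhau_dir() -> Path:
    """Return the directory named by BACALHAU_DIR, or ~/.bacalhau if unset.

    Hypothetical helper for illustration; not an SDK function.
    """
    configured = os.environ.get("BACALHAU_DIR")
    if configured:
        return Path(configured).expanduser()
    return Path.home() / ".bacalhau"


if __name__ == "__main__":
    key_dir = resolve_bacalhau_dir()
    status = "exists" if key_dir.is_dir() else "does not exist yet"
    print(f"Key pair directory: {key_dir} ({status})")
```

Running this before the first SDK call shows which directory will be used for the key pair.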
Let's submit a Hello World job and then fetch its output data's CID.
We start by importing this SDK, bacalhau_sdk, which is used to create and submit a job create request.
Then we import bacalhau_apiclient (installed automatically with this SDK), which provides the object models that compose a job create request.
These models are used to build the request object that is passed to the put method of Jobs.
import pprint

from bacalhau_apiclient.models.job import Job
from bacalhau_apiclient.models.task import Task
from bacalhau_apiclient.models.all_of_execution_published_result import SpecConfig
from bacalhau_apiclient.models.api_put_job_request import (
    ApiPutJobRequest as PutJobRequest,
)
from bacalhau_sdk.jobs import Jobs

# Describe a single Docker task that echoes "Hello World".
task = Task(
    name="My Main task",
    engine=SpecConfig(
        type="docker",
        params=dict(
            Image="ubuntu:latest",
            Entrypoint=["/bin/bash"],
            Parameters=["-c", "echo Hello World"],
        ),
    ),
    publisher=SpecConfig(),  # empty publisher spec
)

# Wrap the task in a batch job and build the create request.
job = Job(name="A Simple Docker Job", type="batch", count=1, tasks=[task])
put_job_request = PutJobRequest(job=job)

# Submit the job and print the response.
jobs = Jobs()
put_job_response = jobs.put(put_job_request)
pprint.pprint(put_job_response)
The script above prints the following object; the id value under job is our newly created job ID!
{
    'job': {
        'constraints': [],
        'count': 1,
        'create_time': 1719930417160015000,
        'id': 'j-df836f6a-6920-44e4-9683-27f6ec1921e2',
        'labels': {},
        'meta': {'bacalhau.org/requester.id': 'node-0'},
        'modify_time': 1719930417160015000,
        'name': 'A Simple Docker Job',
        'namespace': 'default',
        'priority': 0,
        'revision': 1,
        'state': {'message': None, 'state_type': 'Pending'},
        'tasks': [{'engine': {'params': {'Entrypoint': ['/bin/bash'],
                                         'Image': 'ubuntu:latest',
                                         'Parameters': ['-c',
                                                        'echo Hello World']},
                              'type': 'docker'},
                   'env': None,
                   'input_sources': None,
                   'meta': None,
                   'name': 'My Main task',
                   'network': {'domains': None, 'type': 'None'},
                   'publisher': {'params': None, 'type': ''},
                   'resources': {'cpu': None,
                                 'disk': None,
                                 'gpu': None,
                                 'memory': None},
                   'result_paths': None,
                   'timeouts': {'execution_timeout': None,
                                'queue_timeout': None,
                                'total_timeout': 1800}}],
        'type': 'batch',
        'version': 0
    }
}
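To pull individual fields out of a response like the one printed above, a minimal sketch is shown here. It works on a trimmed plain-dictionary copy of that output, so it runs without a Bacalhau network. Two assumptions are made: that the real response object can be converted to such a dictionary (generated API client models commonly offer a to_dict() method, but check this for bacalhau-apiclient), and that create_time is expressed in nanoseconds since the Unix epoch.

```python
from datetime import datetime, timezone

# Trimmed copy of the printed response, as a plain dictionary.
response = {
    "job": {
        "id": "j-df836f6a-6920-44e4-9683-27f6ec1921e2",
        "name": "A Simple Docker Job",
        "create_time": 1719930417160015000,
        "state": {"message": None, "state_type": "Pending"},
    }
}

job_id = response["job"]["id"]
state = response["job"]["state"]["state_type"]

# Assumption: create_time is nanoseconds since the Unix epoch.
created = datetime.fromtimestamp(
    response["job"]["create_time"] // 1_000_000_000, tz=timezone.utc
)

print(f"job {job_id} is {state}, created {created.isoformat()}")
```

Keeping the job ID in a variable makes it easy to pass to later calls that inspect the job.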
We can then use the results method to fetch, among other fields, the output data's CID.
from bacalhau_sdk.jobs import Jobs

jobs = Jobs()
print(jobs.results(job_id="710a0bc2-81d1-4025-8f80-5327ca3ce170"))
The line above prints the following dictionary:
{'results': [{'data': {'cid': 'QmYEqqNDdDrsRhPRShKHzsnZwBq3F59Ti3kQmv9En4i5Sw',
                       'metadata': None,
                       'name': 'job-710a0bc2-81d1-4025-8f80-5327ca3ce170-shard-0-host-QmYgxZiySj3MRkwLSL4X2MF5F9f2PMhAE3LV49XkfNL1o3',
                       'path': None,
                       'source_path': None,
                       'storage_source': 'IPFS',
                       'url': None},
              'node_id': 'QmYgxZiySj3MRkwLSL4X2MF5F9f2PMhAE3LV49XkfNL1o3',
              'shard_index': None}]}
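Given a dictionary shaped like the one above, the CIDs can be collected with a short helper. This is a sketch on a trimmed copy of the printed output; `output_cids` is a hypothetical name introduced here, not an SDK function, and it assumes the result is available as a plain dictionary.

```python
# Trimmed copy of the printed results dictionary.
results = {
    "results": [
        {
            "data": {
                "cid": "QmYEqqNDdDrsRhPRShKHzsnZwBq3F59Ti3kQmv9En4i5Sw",
                "storage_source": "IPFS",
            },
            "node_id": "QmYgxZiySj3MRkwLSL4X2MF5F9f2PMhAE3LV49XkfNL1o3",
        }
    ]
}


def output_cids(results_dict: dict) -> list:
    """Collect the CID of every result entry that has one (hypothetical helper)."""
    cids = []
    for entry in results_dict.get("results") or []:
        data = entry.get("data") or {}
        if data.get("cid"):
            cids.append(data["cid"])
    return cids


if __name__ == "__main__":
    for cid in output_cids(results):
        print(cid)
```

Entries without a data block or without a CID are skipped rather than raising an error, which keeps the helper usable on jobs that published nothing.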
Congrats, that was a good start! 🎈 Please find more code snippets in the examples folder (more examples will be published in the near future).
You can set the environment variables BACALHAU_API_HOST and BACALHAU_API_PORT to point this SDK to your Bacalhau API (e.g. local devstack).
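As an illustration of how these two variables select a target, the sketch below reads them with fallbacks. `resolve_api_endpoint` is a hypothetical helper, not the SDK's own configuration code, and the fallback host and port passed in the demo are placeholders chosen for this sketch rather than values confirmed by the SDK.

```python
import os
from typing import Tuple


def resolve_api_endpoint(default_host: str, default_port: int) -> Tuple[str, int]:
    """Read BACALHAU_API_HOST and BACALHAU_API_PORT, falling back to the given defaults.

    Hypothetical helper for illustration; not an SDK function.
    """
    host = os.environ.get("BACALHAU_API_HOST", default_host)
    port = int(os.environ.get("BACALHAU_API_PORT", default_port))
    return host, port


if __name__ == "__main__":
    # Placeholder defaults, used only when the variables are unset.
    host, port = resolve_api_endpoint("localhost", 1234)
    print(f"Targeting Bacalhau API at {host}:{port}")
```

If you set the variables from Python with os.environ instead of the shell, doing so before the SDK is imported is the safer order, since the point at which the SDK reads them is not specified here.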
We use Poetry to manage this package; take a look at the official Poetry docs to install it. Note that all targets in the Makefile use poetry as well!
To develop this SDK locally, create a dedicated poetry virtual environment and install the root package (i.e. bacalhau_sdk) and its dependencies:
$ poetry install --no-interaction --with test,dev -vvv
Creating virtualenv bacalhau-sdk-9mIcLX8U-py3.9 in /Users/enricorotundo/Library/Caches/pypoetry/virtualenvs
Using virtualenv: /Users/enricorotundo/Library/Caches/pypoetry/virtualenvs/bacalhau-sdk-9mIcLX8U-py3.9
Installing dependencies from lock file
...
Note that the command above installs the root package (i.e. bacalhau_sdk) in editable mode, that is, any change to its source code is reflected immediately without the need to re-package and re-install it. Easy peasy!
Then install the pre-commit hooks and test them:
$ make install-pre-commit
$ make pre-commit