diff --git a/artifact-manager.md b/artifact-manager.md
index db9ae6d8..5a429e09 100644
--- a/artifact-manager.md
+++ b/artifact-manager.md
@@ -706,70 +706,227 @@ datasets = await artifact_manager.list(collection.id)
 print("Datasets in the gallery:", datasets)
 ```
 
-
 ## HTTP API for Accessing Artifacts and Download Counts
 
-The `Artifact Manager` provides an HTTP endpoint for retrieving artifact manifests, data, and download statistics. This is useful for public-facing web applications that need to access datasets, models, or applications.
+The `Artifact Manager` provides an HTTP API for retrieving artifact manifests, data, file statistics, and managing zip files. These endpoints are designed for public-facing web applications that need to interact with datasets, models, or applications.
 
-### Endpoints:
+---
+### Artifact Metadata and File Access Endpoints
 
- - `/{workspace}/artifacts/{artifact_alias}` for fetching the artifact manifest.
- - `/{workspace}/artifacts/{artifact_alias}/children` for listing all artifacts in a collection.
- - `/{workspace}/artifacts/{artifact_alias}/files` for listing all files in the artifact.
- - `/{workspace}/artifacts/{artifact_alias}/files/{file_path:path}` for downloading a file from the artifact (will be redirected to a pre-signed URL).
+#### Endpoints:
+- `/{workspace}/artifacts/{artifact_alias}`: Fetch the artifact manifest.
+- `/{workspace}/artifacts/{artifact_alias}/children`: List all artifacts in a collection.
+- `/{workspace}/artifacts/{artifact_alias}/files`: List all files in the artifact.
+- `/{workspace}/artifacts/{artifact_alias}/files/{file_path:path}`: Download a file from the artifact (redirects to a pre-signed URL).
 
-### Request Format:
+#### Request Format:
 
 - **Method**: `GET`
-- **Headers**:
-  - `Authorization`: Optional. The user's token for accessing private artifacts (obtained via the login logic or created by `api.generate_token()`). Not required for public artifacts.
-
-### Path Parameters:
+- **Headers**:
+  - `Authorization`: Optional. The user's token for accessing private artifacts (obtained via login logic or created by `api.generate_token()`). Not required for public artifacts.
 
-The path parameters are used to specify the artifact or file to access. The following parameters are supported:
+#### Path Parameters:
 
 - **workspace**: The workspace in which the artifact is stored.
-- **artifact_alias**: The alias or id of the artifact to access. This can be an artifact id generated by `create` or `edit` function, or it can be an alias of the artifact under the current workspace. Note that this artifact_alias should not contain the workspace.
-- **file_path**: Optional, the relative path to a file within the artifact. This is optional and only required when downloading a file.
+- **artifact_alias**: The alias or ID of the artifact to access. This can be generated by `create` or `edit` functions or be an alias under the current workspace.
+- **file_path**: (Optional) The relative path to a file within the artifact.
+
+#### Response Examples:
+
+- **Artifact Manifest**:
+  ```json
+  {
+    "manifest": {
+      "name": "Example Dataset",
+      "description": "A dataset for testing.",
+      "version": "1.0.0"
+    },
+    "view_count": 150,
+    "download_count": 25
+  }
+  ```
+
+- **Files in Artifact**:
+  ```json
+  [
+    {"name": "example.txt", "type": "file"},
+    {"name": "nested", "type": "directory"}
+  ]
+  ```
+
+- **Download File**: A redirect to a pre-signed URL for the file.
 
-### Query Parameters:
-
-Qury parameters are passed after the `?` in the URL and are used to control the behavior of the API. The following query parameters are supported:
+---
 
-- **stage**: A boolean flag to fetch the staged version of the manifest. Default is `False`.
-- **silent**: A boolean flag to suppress the view count increment. Default is `False`.
+### Dynamic Zip File Creation Endpoint
 
-- **keywords**: A list of search terms used for fuzzy searching across all manifest fields, separated by commas.
-- **filters**: A dictionary of filters to apply to the search, in the format of a JSON string.
-- **mode**: The mode for combining multiple conditions. Default is `AND`.
-- **offset**: The number of artifacts to skip before listing results. Default is `0`.
-- **limit**: The maximum number of artifacts to return. Default is `100`.
-- **order_by**: The field used to order results. Default is ascending by id.
-- **silent**: A boolean flag to prevent incrementing the view count for the parent artifact when listing children, listing files, or reading the artifact. Default is `False`.
+#### Endpoint:
 
-### Response:
+- `/{workspace}/artifacts/{artifact_alias}/create-zip-file`: Stream a dynamically created zip file containing selected or all files in the artifact.
 
-For `/{workspace}/artifacts/{artifact_alias}`, the response will be a JSON object representing the artifact manifest. For `/{workspace}/artifacts/{artifact_alias}/__files__/{file_path:path}`, the response will be a pre-signed URL to download the file. The artifact manifest will also include any metadata such as download statistics, e.g. `view_count`, `download_count`. For private artifacts, make sure if the user has the necessary permissions.
+#### Request Format:
 
-For `/{workspace}/artifacts/{artifact_alias}/children`, the response will be a list of artifacts in the collection.
+- **Method**: `GET`
+- **Query Parameters**:
+  - **file**: (Optional) A list of files to include in the zip file. If omitted, all files in the artifact are included.
+  - **token**: (Optional) User token for private artifact access.
+  - **version**: (Optional) The version of the artifact to fetch files from.
 
-For `/{workspace}/artifacts/{artifact_alias}/files`, the response will be a list of files in the artifact, each file is a dictionary with the `name` and `type` fields.
+#### Response:
 
-For `/{workspace}/artifacts/{artifact_alias}/files/{file_path:path}`, the response will be a pre-signed URL to download the file.
+- Streams the zip file back to the client.
+- **Headers**:
+  - `Content-Disposition`: Attachment with the artifact alias as the filename.
 
-### Example: Fetching a public artifact with download statistics
+#### Example Usage:
 
 ```python
 import requests
 
 SERVER_URL = "https://hypha.aicell.io"
 workspace = "my-workspace"
-response = requests.get(f"{SERVER_URL}/{workspace}/artifacts/example-dataset")
-if response.ok:
-    artifact = response.json()
-    print(artifact["manifest"]["name"])  # Output: Example Dataset
-    print(artifact["download_count"])  # Output: Download count for the dataset
+artifact_alias = "example-dataset"
+files = ["example.txt", "nested/example2.txt"]
+
+response = requests.get(
+    f"{SERVER_URL}/{workspace}/artifacts/{artifact_alias}/create-zip-file",
+    params={"file": files},
+    stream=True,
+)
+if response.status_code == 200:
+    with open("artifact_files.zip", "wb") as f:
+        for chunk in response.iter_content(chunk_size=8192):
+            f.write(chunk)
+    print("Zip file created successfully.")
 else:
     print(f"Error: {response.status_code}")
 ```
+
+---
+
+### Zip File Access Endpoints
+
+These endpoints allow direct access to zip file contents stored in the artifact without requiring the entire zip file to be downloaded or extracted.
+
+#### Endpoints:
+
+1. **`/{workspace}/artifacts/{artifact_alias}/zip-files/{zip_file_path:path}?path=...`**
+   - Access the contents of a zip file, specifying the path within the zip file using a query parameter (`?path=`).
+
+2. **`/{workspace}/artifacts/{artifact_alias}/zip-files/{zip_file_path:path}/~/{path:path|}`**
+   - Access the contents of a zip file, separating the zip file path and the internal path using `/~/`.
+
+---
+
+#### Endpoint 1: `/{workspace}/artifacts/{artifact_alias}/zip-files/{zip_file_path:path}?path=...`
+
+##### Functionality:
+
+- **If `path` ends with `/`:** Lists the contents of the directory specified by `path` inside the zip file.
+- **If `path` specifies a file:** Streams the file content from the zip.
+
+##### Request Format:
+
+- **Method**: `GET`
+- **Path Parameters**:
+  - **workspace**: The workspace in which the artifact is stored.
+  - **artifact_alias**: The alias or ID of the artifact to access.
+  - **zip_file_path**: Path to the zip file within the artifact.
+- **Query Parameters**:
+  - **path**: (Optional) The relative path inside the zip file. Defaults to the root directory.
+
+##### Response Examples:
+
+1. **Listing Directory Contents**:
+   ```json
+   [
+     {"type": "file", "name": "example.txt", "size": 123, "last_modified": 1732363845.0},
+     {"type": "directory", "name": "nested"}
+   ]
+   ```
+
+2. **Fetching a File**:
+   Streams the file content from the zip.
+
+---
+
+#### Endpoint 2: `/{workspace}/artifacts/{artifact_alias}/zip-files/{zip_file_path:path}/~/{path:path}`
+
+##### Functionality:
+
+- **If `path` ends with `/`:** Lists the contents of the directory specified by `path` inside the zip file.
+- **If `path` specifies a file:** Streams the file content from the zip.
+
+##### Request Format:
+
+- **Method**: `GET`
+- **Path Parameters**:
+  - **workspace**: The workspace in which the artifact is stored.
+  - **artifact_alias**: The alias or ID of the artifact to access.
+  - **zip_file_path**: Path to the zip file within the artifact.
+  - **path**: (Optional) The relative path inside the zip file. Defaults to the root directory.
+
+##### Response Examples:
+
+1. **Listing Directory Contents**:
+   ```json
+   [
+     {"type": "file", "name": "example.txt", "size": 123, "last_modified": 1732363845.0},
+     {"type": "directory", "name": "nested"}
+   ]
+   ```
+
+2. **Fetching a File**:
+   Streams the file content from the zip.
+
+---
+
+#### Example Usage for Both Endpoints
+
+##### Listing Directory Contents:
+
+```python
+import requests
+
+SERVER_URL = "https://hypha.aicell.io"
+workspace = "my-workspace"
+artifact_alias = "example-dataset"
+zip_file_path = "example.zip"
+
+# Using the query parameter method
+response = requests.get(
+    f"{SERVER_URL}/{workspace}/artifacts/{artifact_alias}/zip-files/{zip_file_path}",
+    params={"path": "nested/"}
+)
+print(response.json())
+
+# Using the tilde method
+response = requests.get(
+    f"{SERVER_URL}/{workspace}/artifacts/{artifact_alias}/zip-files/{zip_file_path}/~/nested/"
+)
+print(response.json())
+```
+
+##### Fetching a File:
+
+```python
+# Using the query parameter method
+response = requests.get(
+    f"{SERVER_URL}/{workspace}/artifacts/{artifact_alias}/zip-files/{zip_file_path}",
+    params={"path": "nested/example2.txt"},
+    stream=True,
+)
+with open("example2.txt", "wb") as f:
+    for chunk in response.iter_content(chunk_size=8192):
+        f.write(chunk)
+
+# Using the tilde method
+response = requests.get(
+    f"{SERVER_URL}/{workspace}/artifacts/{artifact_alias}/zip-files/{zip_file_path}/~/nested/example2.txt",
+    stream=True,
+)
+with open("example2.txt", "wb") as f:
+    for chunk in response.iter_content(chunk_size=8192):
+        f.write(chunk)
+```
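
Review note: the two zip-file URL forms documented in this patch differ only in where the inner path is placed, so a small helper can make that choice explicit and keep the examples from repeating the URL template. The following is a minimal sketch and not part of the patch. `zip_entry_url` and its `style` argument are hypothetical names introduced here, and percent-encoding the segments with `urllib.parse` is an assumption about what the server accepts rather than documented behavior.

```python
from urllib.parse import quote, urlencode


def zip_entry_url(server_url, workspace, artifact_alias, zip_file_path,
                  inner_path="", style="query"):
    """Build a URL for a path inside a zip file stored in an artifact.

    Hypothetical helper, not part of the Artifact Manager API.
    style="query" yields the `?path=` form, style="tilde" the `/~/` form.
    """
    root = server_url.rstrip("/")
    base = (
        f"{root}/{quote(workspace)}/artifacts/"
        f"{quote(artifact_alias)}/zip-files/{quote(zip_file_path)}"
    )
    if style == "query":
        # `path` is documented as optional, so omit it for the zip root.
        if not inner_path:
            return base
        # Assumption: keeping "/" unescaped is acceptable to the server.
        query = urlencode({"path": inner_path}, safe="/")
        return f"{base}?{query}"
    if style == "tilde":
        return f"{base}/~/{quote(inner_path)}"
    raise ValueError(f"unknown style: {style!r}")


if __name__ == "__main__":
    args = ("https://hypha.aicell.io", "my-workspace",
            "example-dataset", "example.zip")
    # A trailing "/" requests a directory listing, per the endpoint description.
    print(zip_entry_url(*args, inner_path="nested/"))
    print(zip_entry_url(*args, inner_path="nested/example2.txt", style="tilde"))
```

Either returned URL can be passed to `requests.get` as in the patch's examples. The query form may be more convenient when the inner path comes from user input, since the encoding is handled in one place.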