Use database for storing artifacts (#695)
* support _token and remove prefix in login

* allow pass workspace when login

* Support current workspace for http endpoint

* fix login optional

* increase page size for list workspaces

* Bump version for hypha-rpc 0.20.38

* Update change logs and login instructions

* Support artifact endpoint via http

* Implement sql database

* change _id to _prefix

* clean up

* redirect login

* use sql to store workspace info

* add stage_files

* Update helm charts

* Fix workspaces db

* skip default database uri

* Use in-memory sql for artifacts

* Fix workspace loading error

* restore workspace info

* rename it to test-3

* restore version
oeway authored Oct 7, 2024
1 parent 81ced64 commit a30119e
Showing 17 changed files with 956 additions and 509 deletions.
74 changes: 70 additions & 4 deletions docs/artifact-manager.md
@@ -1,16 +1,17 @@
# Artifact Manager

The `Artifact Manager` is a builtin hypha service for indexing, managing, and storing resources such as datasets, AI models, and applications. It is designed to provide a structured way to manage datasets and similar resources, enabling efficient listing, uploading, updating, and deleting of files.
The `Artifact Manager` is a built-in Hypha service for indexing, managing, and storing resources such as datasets, AI models, and applications. It provides a structured way to manage datasets and similar resources, enabling efficient listing, uploading, updating, and deleting of files.

A typical use case for the `Artifact Manager` is as a backend for a single-page web application displaying a gallery of datasets, AI models, applications or other type of resources. The default metadata of an artifact is designed to render a grid of cards on a webpage.
A typical use case for the `Artifact Manager` is as a backend for a single-page web application that displays a gallery of datasets, AI models, applications, or other types of resources. The default metadata of an artifact is designed to render a grid of cards on a webpage.

**Note:** The `Artifact Manager` is only available when your Hypha server has S3 storage enabled.

**Note:** The `Artifact Manager` is only available when your hypha server enabled s3 storage.

## Getting Started

### Step 1: Connecting to the Artifact Manager Service

To use the `Artifact Manager`, you first need to connect to the Hypha server. This API allows you to create, read, edit, and delete datasets in the artifact registry (stored in s3 bucket for each workspace).
To use the `Artifact Manager`, you first need to connect to the Hypha server. This API allows you to create, read, edit, and delete datasets in the artifact registry (stored in an S3 bucket for each workspace).

```python
from hypha_rpc.websocket_client import connect_to_server
@@ -216,6 +217,18 @@ await artifact_manager.commit(prefix="collections/schema-dataset-gallery/valid-d
print("Valid dataset committed.")
```

### Step 3: Accessing the collection via HTTP API

You can access the collection via the HTTP API to retrieve the schema and datasets.
This can be used for rendering a gallery of datasets on a webpage.

```javascript
// Fetch the schema for the collection
fetch("https://hypha.aicell.io/my-workspace/artifact/public/collections/schema-dataset-gallery")
  .then(response => response.json())
  .then(data => console.log("Schema:", data.collection_schema));
```

## API Reference

This section details the core functions provided by the `Artifact Manager` for creating, managing, and validating artifacts such as datasets and collections.
@@ -441,3 +454,56 @@ await artifact_manager.commit(prefix="collections/dataset-gallery/example-datase
datasets = await artifact_manager.list(prefix="collections/dataset-gallery")
print("Datasets in the gallery:", datasets)
```


## HTTP API for Accessing Artifacts

The `Artifact Manager` provides an HTTP endpoint for retrieving artifact manifests and data. This is useful for public-facing web applications that need to access datasets, models, or applications.

### Endpoint: `/{workspace}/artifact/{path:path}`

- **Workspace**: The workspace in which the artifact is stored.
- **Path**: The relative path to the artifact.
  - For public artifacts, the path must begin with `public/`.
  - For private artifacts, the path does not include the `public/` prefix and requires proper authentication.

### Request Format:

- **Method**: `GET`
- **Parameters**:
  - `workspace`: The workspace in which the artifact is stored.
  - `path`: The path to the artifact (e.g., `public/collections/dataset-gallery/example-dataset`).
  - `stage` (optional): A boolean flag to indicate whether to fetch the staged version of the manifest (`_manifest.yaml`). Default is `False`.
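
The request format above can be sketched as a small URL builder. This is a minimal illustration, not part of Hypha: `artifact_url` is a hypothetical helper, and sending `stage` as the query string `stage=true` is an assumption about how the boolean flag is encoded.

```python
from urllib.parse import quote, urlencode


def artifact_url(server_url, workspace, path, stage=False):
    """Build the URL for GET /{workspace}/artifact/{path} (hypothetical helper)."""
    # quote() leaves "/" unescaped by default, so nested prefixes stay path segments.
    url = f"{server_url.rstrip('/')}/{quote(workspace)}/artifact/{quote(path.lstrip('/'))}"
    if stage:
        # Assumption: the optional flag is passed as the query parameter `stage=true`.
        url += "?" + urlencode({"stage": "true"})
    return url


print(artifact_url(
    "https://hypha.aicell.io",
    "my-workspace",
    "public/collections/dataset-gallery/example-dataset",
))
```

Keeping the URL construction in one function makes the `public/` prefix rule easy to see: the only difference between a public and a private request is the `path` argument.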

### Response:

- **For public artifacts**: Returns the artifact manifest if it exists under the `public/` prefix.
- **For private artifacts**: Returns the artifact manifest if the user has the necessary permissions.

### Example:

#### Fetching a public artifact:

```python
import requests

SERVER_URL = "https://hypha.aicell.io"
workspace = "my-workspace"
response = requests.get(f"{SERVER_URL}/{workspace}/artifact/public/collections/dataset-gallery/example-dataset")
if response.ok:
    artifact = response.json()
    print(artifact["name"])  # Output: Example Dataset
else:
    print(f"Error: {response.status_code}")
```

#### Fetching a private artifact:

```python
# Reuses `requests`, `SERVER_URL`, and `workspace` from the previous example.
response = requests.get(f"{SERVER_URL}/{workspace}/artifact/collections/private-dataset-gallery/private-example-dataset")
if response.ok:
    artifact = response.json()
    print(artifact["name"])  # Output: Private Example Dataset
else:
    print(f"Error: {response.status_code}")
```
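
Since private artifacts require authentication, a request for one would normally carry credentials. The sketch below assumes the server accepts a token in an `Authorization: Bearer` header; that header scheme, the helper name `build_private_artifact_request`, and the placeholder token are assumptions, not something this document specifies. It uses only the standard library so the request can be inspected without being sent.

```python
from urllib.request import Request


def build_private_artifact_request(server_url, workspace, path, token):
    """Prepare (but do not send) a GET request for a private artifact."""
    url = f"{server_url.rstrip('/')}/{workspace}/artifact/{path.lstrip('/')}"
    # Assumption: the token is presented as a Bearer token.
    return Request(url, headers={"Authorization": f"Bearer {token}"}, method="GET")


req = build_private_artifact_request(
    "https://hypha.aicell.io",
    "my-workspace",
    "collections/private-dataset-gallery/private-example-dataset",
    token="<your-token>",  # placeholder, replace with a real token
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it. With `requests`, the same header can be supplied through the `headers` argument of `requests.get`.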
14 changes: 14 additions & 0 deletions helm-charts/hypha-server/templates/deployment.yaml
@@ -44,12 +44,26 @@ spec:
          args: {{- toYaml .Values.startupCommand.args | nindent 12 }}
          env:
            {{- toYaml .Values.env | nindent 12 }}
          volumeMounts:
            - name: {{ .Values.persistence.volumeName }}
              mountPath: {{ .Values.persistence.mountPath }}
          livenessProbe:
            {{- toYaml .Values.livenessProbe | nindent 12 }}
          readinessProbe:
            {{- toYaml .Values.readinessProbe | nindent 12 }}
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
      volumes:
        - name: {{ .Values.persistence.volumeName }}
          persistentVolumeClaim:
            claimName: {{ .Values.persistence.existingClaim | default (include "hypha-server.fullname" .) }}
            {{- if not .Values.persistence.existingClaim }}
            accessModes:
              {{- toYaml .Values.persistence.accessModes | nindent 14 }}
            resources:
              requests:
                storage: {{ .Values.persistence.size }}
            {{- end }}
      {{- with .Values.nodeSelector }}
      nodeSelector:
        {{- toYaml . | nindent 8 }}
17 changes: 11 additions & 6 deletions helm-charts/hypha-server/values.yaml
@@ -102,12 +102,6 @@ env:
        key: JWT_SECRET
  - name: PUBLIC_BASE_URL
    value: "https://hypha.amun.ai"
  # Use the pod's UID as the server ID
  # This is important to ensure Hypha Server can handle multiple replicas
  - name: HYPHA_SERVER_ID
    valueFrom:
      fieldRef:
        fieldPath: metadata.uid

# Define command-line arguments here
startupCommand:
@@ -117,3 +111,14 @@ startupCommand:
    - "--port=9520"
    - "--public-base-url=$(PUBLIC_BASE_URL)"
    # - "--redis-uri=redis://redis.hypha.svc.cluster.local:6379/0"
    - "--database-uri=sqlite+aiosqlite:///app/data/artifacts.db"

# Persistence Configuration
persistence:
  volumeName: hypha-app-storage
  mountPath: /app/data
  storageClass: ""
  accessModes:
    - ReadWriteOnce
  size: 5Gi
  existingClaim: ""  # If you have an existing claim, specify it here. Otherwise, a new PVC will be created.
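
The `--database-uri` value above appears to follow SQLAlchemy's URL syntax, in which three slashes after a SQLite scheme introduce a relative file path and four slashes an absolute one. The sketch below uses a hypothetical helper, `sqlite_file_from_uri`, to show which file each form names. Whether the relative form ends up inside the `/app/data` mount depends on the container's working directory, which this chart excerpt does not show.

```python
def sqlite_file_from_uri(uri: str) -> str:
    """Return the file path named by a SQLite URL of the form <scheme>:///<path>."""
    scheme, sep, path = uri.partition(":///")
    if not sep or not scheme.startswith("sqlite"):
        raise ValueError(f"not a SQLite file URL: {uri!r}")
    # Three slashes: `path` is relative to the working directory.
    # Four slashes: `path` keeps its leading "/" and is absolute.
    return path


relative = sqlite_file_from_uri("sqlite+aiosqlite:///app/data/artifacts.db")
absolute = sqlite_file_from_uri("sqlite+aiosqlite:////app/data/artifacts.db")
print(relative)  # app/data/artifacts.db
print(absolute)  # /app/data/artifacts.db
```

If the server process starts in `/`, both forms resolve to the same file on the persistent volume; otherwise the four-slash form is the unambiguous way to target `mountPath`.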
2 changes: 1 addition & 1 deletion hypha/VERSION
@@ -1,3 +1,3 @@
{
"version": "0.20.38"
"version": "0.20.37.post4"
}
6 changes: 3 additions & 3 deletions hypha/apps.py
@@ -65,12 +65,12 @@ def close(_) -> None:

        self.event_bus.on_local("shutdown", close)

    async def setup_workspace(self, overwrite=True, context=None):
    async def setup_applications_collection(self, overwrite=True, context=None):
        """Set up the workspace."""
        ws = context["ws"]
        # Create a collection in the workspace
        manifest = {
            "id": "description",
            "id": "applications",
            "type": "collection",
            "name": "Applications",
            "description": f"A collection of applications for workspace {ws}",
@@ -205,7 +205,7 @@ async def install(
        try:
            await self.artifact_manager.read("applications", context=context)
        except KeyError:
            await self.setup_workspace(overwrite=True, context=context)
            await self.setup_applications_collection(overwrite=True, context=context)
        # Create artifact using the artifact controller
        prefix = f"applications/{app_id}"
        await self.artifact_manager.create(