Commit

Merge pull request #19 from kerthcet/feat/agent
Update docs
InftyAI-Agent authored Nov 10, 2024
2 parents cc79215 + 67ab1c6 commit 0cf38c8
Showing 6 changed files with 9 additions and 24 deletions.
21 changes: 6 additions & 15 deletions README.md
@@ -27,11 +27,11 @@ _Name Story: the inspiration of the name `Manta` is coming from Dota2, called [M
## Features Overview

-- **Preheat Models**: Models could be preloaded to the cluster, or even specified nodes to accelerate the model serving.
-- **Model Caching**: Once models are downloaded, origin access is no longer necessary, but from other node peers.
-- **Plugin Framework**: _Filter_ and _Score_ extension points could be extended with your own logic to pick up the best candidates in the form of plugin.
-- **Model LCM**: Manage the model lifecycles automatically with different configurations.
-- **Memory Management(WIP)**: Specify the maximum reserved memory for use, and GC with LRU algorithm.
+- **Model Preheat**: Models could be preloaded to clusters, to specified nodes to accelerate the model serving.
+- **Model Cache**: Models will be cached after downloading for faster model loading.
+- **Model Lifecycle Management**: Manage the model lifecycle automatically with different policies, like `Retain` or `Delete`.
+- **Plugin Framework**: _Filter_ and _Score_ plugins could be extended to pick up the best candidates.
+- **Memory Management(WIP)**: Manage the reserved memories for caching, together with LRU algorithm for GC.

## Quick Start

@@ -41,15 +41,14 @@ Read the [Installation](./docs//installation.md) for guidance.

### Preheat Models

-A toy sample to preload the `Qwen/Qwen2.5-0.5B-Instruct` model:
+A sample to preload the `Qwen/Qwen2.5-0.5B-Instruct` model:

```yaml
apiVersion: manta.io/v1alpha1
kind: Torrent
metadata:
  name: torrent-sample
spec:
-  replicas: 1
  hub:
    repoID: Qwen/Qwen2.5-0.5B-Instruct
```
@@ -62,7 +61,6 @@ kind: Torrent
metadata:
  name: torrent-sample
spec:
-  replicas: 1
  hub:
    repoID: Qwen/Qwen2.5-0.5B-Instruct
  nodeSelector:
@@ -79,20 +77,13 @@ kind: Torrent
metadata:
  name: torrent-sample
spec:
-  replicas: 1
  hub:
    repoID: Qwen/Qwen2.5-0.5B-Instruct
  reclaimPolicy: Delete
```

More details refer to the [APIs](https://github.com/InftyAI/Manta/blob/main/api/v1alpha1/torrent_types.go).
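
As a rough sketch only, the fields that appear in the samples above (`hub.repoID`, `nodeSelector`, `reclaimPolicy`) could presumably be combined in a single `Torrent`. The `zone: zone1` label is borrowed from the commented-out selector in `config/samples/_v1alpha1_torrent.yaml` and is purely illustrative; whether the controller accepts this exact combination is an assumption, not something this commit confirms.

```yaml
# Hypothetical combined manifest; field names come from the samples in this diff.
apiVersion: manta.io/v1alpha1
kind: Torrent
metadata:
  name: torrent-sample
spec:
  hub:
    repoID: Qwen/Qwen2.5-0.5B-Instruct
  # Illustrative label, assumed to exist on the target nodes.
  nodeSelector:
    zone: zone1
  # Policy applied to the file replicas when the Torrent is deleted.
  reclaimPolicy: Delete
```

`replicas` is left out on purpose, so the CRD default of 1 would apply.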

-## Roadmap
-
-- Support GC policy with LRU algorithm
-- More integrations with serving projects
-- Support file chunking
-
## Community

Join us for more discussions:
2 changes: 1 addition & 1 deletion agent/deploy/daemonset.yaml
@@ -26,7 +26,7 @@ spec:
containers:
- name: agent
# image: inftyai/manta-agent:v0.0.1
-image: inftyai/test:manta-agent-110811
+image: inftyai/test:manta-agent-111001
ports:
- containerPort: 9090
resources:
2 changes: 0 additions & 2 deletions api/v1alpha1/torrent_types.go
@@ -78,9 +78,7 @@ type TorrentSpec struct {
// URI *URIProtocol `json:"uri,omitempty"`

// Replicas represents the replication number of each object.
-// The real Replicas number could be greater than the desired Replicas.
// +kubebuilder:default=1
-// +kubebuilder:validation:Maximum=99
// +optional
Replicas *int32 `json:"replicas,omitempty"`
// ReclaimPolicy represents how to handle the file replicas when Torrent is deleted.
5 changes: 1 addition & 4 deletions config/crd/bases/manta.io_torrents.yaml
@@ -96,11 +96,8 @@ spec:
type: string
replicas:
default: 1
-description: |-
-Replicas represents the replication number of each object.
-The real Replicas number could be greater than the desired Replicas.
+description: Replicas represents the replication number of each object.
format: int32
-maximum: 99
type: integer
type: object
status:
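
For illustration, and assuming nothing beyond the schema shown in this hunk: `replicas` keeps `default: 1`, so omitting it should yield one replica, while an explicit value could be set as below. With `maximum: 99` dropped, the OpenAPI schema itself no longer caps the value; whether any other validation (for example a webhook) still applies is not visible in this commit.

```yaml
# Hypothetical Torrent that sets replicas explicitly instead of relying on the default.
apiVersion: manta.io/v1alpha1
kind: Torrent
metadata:
  name: torrent-sample
spec:
  # Desired replication number of each object; defaults to 1 when omitted.
  replicas: 2
  hub:
    repoID: Qwen/Qwen2.5-0.5B-Instruct
```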
2 changes: 1 addition & 1 deletion config/manager/kustomization.yaml
@@ -5,4 +5,4 @@ kind: Kustomization
images:
- name: controller
newName: inftyai/manta
-newTag: "110905"
+newTag: "111001"
1 change: 0 additions & 1 deletion config/samples/_v1alpha1_torrent.yaml
@@ -9,7 +9,6 @@ metadata:
app.kubernetes.io/created-by: controller
name: torrent-sample
spec:
-replicas: 2
reclaimPolicy: Delete
# nodeSelector:
# zone: zone1