Commit

feat: simplify download management, model file should be able to individually introduced
wsxiaoys committed Nov 2, 2023
1 parent 36ffeb6 commit 40fc7b8
Showing 13 changed files with 224 additions and 245 deletions.
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -1,7 +1,9 @@
# v0.5.0 [Unreleased]

## Notice

* llama.cpp backend (CPU, Metal) now requires a redownload of gguf model due to upstream format changes: https://github.com/TabbyML/tabby/pull/645 https://github.com/ggerganov/llama.cpp/pull/3252
* With tabby fully migrated to the `llama.cpp` serving stack, the `--model` and `--chat-model` options now accept local file paths instead of a directory path containing both the `tabby.json` and `ggml` files, as was the case previously.

## Features

21 changes: 21 additions & 0 deletions Cargo.lock

3 changes: 3 additions & 0 deletions MODEL_SPEC.md
@@ -1,5 +1,8 @@
# Tabby Model Specification (Unstable)

> [!WARNING]
> This documentation is no longer valid. Tabby accepts gguf files directly since the release of v0.5; see https://github.com/TabbyML/registry-tabby for details.
Tabby organizes the model within a directory. This document provides an explanation of the necessary contents for supporting model serving. An example model directory can be found at https://huggingface.co/TabbyML/StarCoder-1B

The minimal Tabby model directory should include the following contents:
1 change: 1 addition & 0 deletions crates/tabby-common/Cargo.toml
@@ -14,6 +14,7 @@ reqwest = { workspace = true, features = [ "json" ] }
tokio = { workspace = true, features = ["rt", "macros"] }
uuid = { version = "1.4.1", features = ["v4"] }
tantivy.workspace = true
anyhow.workspace = true

[features]
testutils = []
1 change: 1 addition & 0 deletions crates/tabby-common/src/lib.rs
@@ -3,6 +3,7 @@ pub mod events;
pub mod index;
pub mod languages;
pub mod path;
pub mod registry;
pub mod usage;

use std::{
36 changes: 1 addition & 35 deletions crates/tabby-common/src/path.rs
@@ -51,38 +51,4 @@ pub fn events_dir() -> PathBuf {
    tabby_root().join("events")
}

pub struct ModelDir(PathBuf);

impl ModelDir {
    pub fn new(model: &str) -> Self {
        Self(models_dir().join(model))
    }

    pub fn from(path: &str) -> Self {
        Self(PathBuf::from(path))
    }

    pub fn path(&self) -> &PathBuf {
        &self.0
    }

    pub fn path_string(&self, name: &str) -> String {
        self.0.join(name).display().to_string()
    }

    pub fn cache_info_file(&self) -> String {
        self.path_string(".cache_info.json")
    }

    pub fn metadata_file(&self) -> String {
        self.path_string("tabby.json")
    }

    pub fn ggml_q8_0_file(&self) -> String {
        self.path_string("ggml/q8_0.gguf")
    }

    pub fn ggml_q8_0_v2_file(&self) -> String {
        self.path_string("ggml/q8_0.v2.gguf")
    }
}
mod registry {}
83 changes: 83 additions & 0 deletions crates/tabby-common/src/registry.rs
@@ -0,0 +1,83 @@
use std::{fs, path::PathBuf};

use anyhow::Result;
use serde::{Deserialize, Serialize};

use crate::path::models_dir;

#[derive(Serialize, Deserialize)]
pub struct ModelInfo {
    pub name: String,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub prompt_template: Option<String>,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub chat_template: Option<String>,
    pub urls: Vec<String>,
    pub sha256: String,
}

fn models_json_file(registry: &str) -> PathBuf {
    models_dir().join(registry).join("models.json")
}

async fn load_remote_registry(registry: &str) -> Result<Vec<ModelInfo>> {
    let value = reqwest::get(format!(
        "https://raw.githubusercontent.com/{}/registry-tabby/main/models.json",
        registry
    ))
    .await?
    .json()
    .await?;
    fs::create_dir_all(models_dir().join(registry))?;
    serdeconv::to_json_file(&value, models_json_file(registry))?;
    Ok(value)
}

fn load_local_registry(registry: &str) -> Result<Vec<ModelInfo>> {
    Ok(serdeconv::from_json_file(models_json_file(registry))?)
}

#[derive(Default)]
pub struct ModelRegistry {
    pub name: String,
    pub models: Vec<ModelInfo>,
}

impl ModelRegistry {
    pub async fn new(registry: &str) -> Self {
        Self {
            name: registry.to_owned(),
            models: load_remote_registry(registry).await.unwrap_or_else(|err| {
                load_local_registry(registry).unwrap_or_else(|_| {
                    panic!(
                        "Failed to fetch model organization <{}>: {:?}",
                        registry, err
                    )
                })
            }),
        }
    }

    pub fn get_model_path(&self, name: &str) -> PathBuf {
        models_dir()
            .join(&self.name)
            .join(name)
            .join("ggml/q8_0.v2.gguf")
    }

    pub fn get_model_info(&self, name: &str) -> &ModelInfo {
        self.models
            .iter()
            .find(|x| x.name == name)
            .unwrap_or_else(|| panic!("Invalid model_id <{}/{}>", self.name, name))
    }
}
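
Reviewer aside (not part of the commit): `ModelRegistry::new` above tries the remote `models.json` first, falls back to the locally cached copy, and reports the remote error if both fail. A minimal synchronous sketch of that pattern, assuming a hypothetical helper named `load_with_fallback` that does not exist in the Tabby codebase, and returning an error instead of panicking:

```rust
/// Hypothetical helper (not in the commit): run `primary`, and only if it
/// fails run `fallback`. When both fail, the error from `primary` is
/// returned, mirroring how the panic message above reports the remote error.
fn load_with_fallback<T, E>(
    primary: impl FnOnce() -> Result<T, E>,
    fallback: impl FnOnce() -> Result<T, E>,
) -> Result<T, E> {
    match primary() {
        Ok(value) => Ok(value),
        // Discard the fallback's own error and keep the primary one.
        Err(primary_err) => fallback().map_err(|_| primary_err),
    }
}
```

Returning a `Result` here is a design choice of the sketch, not of the commit: it lets the caller decide whether an unreachable registry with no cached copy should abort the process.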

pub fn parse_model_id(model_id: &str) -> (&str, &str) {
    let parts: Vec<_> = model_id.split('/').collect();
    if parts.len() != 2 {
        panic!("Invalid model id {}", model_id);
    }

    (parts[0], parts[1])
}
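
Reviewer aside (not part of the commit): `parse_model_id` splits a `<registry>/<name>` id and panics on anything else, and `get_model_path` maps the pair onto `<models_dir>/<registry>/<name>/ggml/q8_0.v2.gguf`. The sketch below illustrates both steps with the standard library only. The names `try_parse_model_id` and `model_path`, and the explicit `models_root` parameter, are assumptions made for illustration; the commit itself uses `models_dir()` and panics rather than returning `Option`.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical non-panicking variant of `parse_model_id`.
/// Unlike the original, which only checks that there are exactly two
/// '/'-separated parts, this sketch also rejects empty parts.
fn try_parse_model_id(model_id: &str) -> Option<(&str, &str)> {
    let (registry, name) = model_id.split_once('/')?;
    if registry.is_empty() || name.is_empty() || name.contains('/') {
        return None;
    }
    Some((registry, name))
}

/// Same directory layout as `ModelRegistry::get_model_path`, but with the
/// models root passed in explicitly (an assumption of this sketch).
fn model_path(models_root: &Path, registry: &str, name: &str) -> PathBuf {
    models_root
        .join(registry)
        .join(name)
        .join("ggml/q8_0.v2.gguf")
}
```

With these definitions, an id such as `TabbyML/StarCoder-1B` parses into `("TabbyML", "StarCoder-1B")`, while an id with no slash or with more than one slash yields `None`.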
1 change: 1 addition & 0 deletions crates/tabby-download/Cargo.toml
@@ -17,3 +17,4 @@ urlencoding = "2.1.3"
serde_json = { workspace = true }
cached = { version = "0.46.0", features = ["async", "proc_macro"] }
async-trait = { workspace = true }
sha256 = "1.4.0"
46 changes: 0 additions & 46 deletions crates/tabby-download/src/cache_info.rs

This file was deleted.
