feat(api): new api that integrates with node-llama-cpp@beta
ido-pluto committed May 9, 2024
1 parent 4b770b7 commit 176f74e
Showing 19 changed files with 813 additions and 698 deletions.
1 change: 0 additions & 1 deletion .gitignore
@@ -112,7 +112,6 @@ web_modules/
.yarn-integrity

# dotenv environment variable files
.env
.env.development.local
.env.test.local
.env.production.local
43 changes: 34 additions & 9 deletions README.md
@@ -27,7 +27,7 @@ Make sure you have [Node.js](https://nodejs.org/en/) (**download current**) installed
```bash
npm install -g catai

catai install vicuna-7b-16k-q4_k_s
catai install llama3-8b-openhermes-dpo-q3_k_s
catai up
```

@@ -57,6 +57,7 @@ Commands:
active Show active model
remove|rm [options] [models...] Remove a model
uninstall Uninstall server and delete all models
node-llama-cpp|cpp [options] Node llama.cpp CLI - recompile node-llama-cpp binaries
help [command] display help for command
```
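
The new `node-llama-cpp|cpp` command forwards its options to the node-llama-cpp CLI, which is useful when the binaries need to be recompiled (for example, when no prebuilt binary matches your platform). A hedged example — the available sub-commands and flags come from the node-llama-cpp version that is installed, so start from its own help output:

```bash
# List what the bundled node-llama-cpp CLI accepts before running a rebuild
catai node-llama-cpp --help

# The shorter alias works the same way
catai cpp --help
```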

@@ -92,14 +93,6 @@ This package uses [node-llama-cpp](https://github.com/withcatai/node-llama-cpp)
- linux-ppc64le
- win32-x64-msvc

### Memory usage
Runs on most modern computers. Unless your computer is very very old, it should work.

According to [a llama.cpp discussion thread](https://github.com/ggerganov/llama.cpp/issues/13), here are the memory requirements:

- 7B => ~4 GB
- 13B => ~8 GB
- 30B => ~16 GB

### Good to know
- All download data will be downloaded to the `~/catai` folder by default.
@@ -125,6 +118,38 @@ const data = await response.text();

For more information, please read the [API guide](https://github.com/withcatai/catai/blob/main/docs/api.md)
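
As a rough sketch of what a remote call looks like from another process — the URL, port, and request shape below are assumptions for illustration, not the documented routes; the API guide linked above is the source of truth:

```ts
// Sketch only: the path "/api/prompt" and the body shape are assumed placeholders,
// and http://127.0.0.1:3000 is assumed to be where `catai up` serves by default.
const response = await fetch("http://127.0.0.1:3000/api/prompt", {
    method: "POST",
    headers: {"Content-Type": "application/json"},
    body: JSON.stringify({prompt: "Write me a short story"})
});

const data = await response.text();
console.log(data);
```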

## Development API + Node-llama-cpp@beta integration

You can use the model with [node-llama-cpp@beta](https://github.com/withcatai/node-llama-cpp/pull/105)

CatAI enables you to easily manage the models and chat with them.

```ts
import {downloadModel, getModelPath} from 'catai';
import {getLlama, LlamaChatSession} from "node-llama-cpp";

// Download the model; this is skipped if the model is already installed
await downloadModel(
    "https://huggingface.co/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF/resolve/main/Meta-Llama-3-8B-Instruct.Q2_K.gguf?download=true",
    "llama3"
);

// Resolve the local path of the model that CatAI manages
const modelPath = getModelPath("llama3");

const llama = await getLlama();
const model = await llama.loadModel({
    modelPath
});

const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});

const a1 = await session.prompt("Hi there, how are you?");
console.log("AI: " + a1);
```
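
Because the `LlamaChatSession` keeps the conversation history, follow-up prompts can be sent on the same session — a small continuation of the sketch above (the wording of the second prompt is just illustrative):

```ts
// Continue the same session; earlier messages remain part of the context
const a2 = await session.prompt("And what can you help me with?");
console.log("AI: " + a2);
```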

## Configuration

You can edit the configuration via the web UI.
Expand Down
9 changes: 8 additions & 1 deletion docs/api.md
@@ -11,7 +11,7 @@ Enables you to chat with the model locally on your computer.
```ts
import {createChat} from 'catai';

const chat = await createChat();
const chat = await createChat(); // using the default model installed

const response = await chat.prompt('Write me a 100-word story', token => {
    process.stdout.write(token);
@@ -20,6 +20,13 @@ const response = await chat.prompt('Write me a 100-word story', token => {
console.log(`Total text length: ${response.length}`);
```

You can also specify the model you want to use:

```ts
import {createChat} from 'catai';
const chat = await createChat({model: "llama3"});
```
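
The model-specific chat streams tokens through the same `prompt` callback shown earlier. A short combined sketch — "llama3" here is an assumed tag for a model you have already installed:

```ts
import {createChat} from 'catai';

// "llama3" is an assumed local model tag — use whichever model you installed
const chat = await createChat({model: "llama3"});

const response = await chat.prompt('Write me a haiku about cats', token => {
    process.stdout.write(token);
});

console.log(`\nTotal text length: ${response.length}`);
```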

If you want to install the model on the fly, please read the [install-api guide](./install-api.md)

## Remote API
Expand Down
35 changes: 17 additions & 18 deletions docs/install-api.md
@@ -6,10 +6,9 @@ You can install models on the fly using the `FetchModels` class.
import {FetchModels} from 'catai';

const allModels = await FetchModels.fetchModels();
const firstModel = Object.keys(allModels)[0];

const installModel = new FetchModels({
    download: firstModel,
    download: "https://huggingface.co/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF/resolve/main/Meta-Llama-3-8B-Instruct.Q2_K.gguf?download=true",
    latest: true,
    model: {
        settings: {
@@ -23,27 +22,27 @@ await installModel.startDownload();

After the download is finished, this model will be the active model.
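
Since the newly downloaded model becomes the active one, a chat created without options will pick it up. A minimal sketch, reusing the `createChat` API documented in the API guide:

```ts
import {createChat} from 'catai';

// No options needed: the model installed above is now the active model
const chat = await createChat();

const response = await chat.prompt('Introduce yourself in one sentence', token => {
    process.stdout.write(token);
});
```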

## Configuration

You can change the active model by changing the `CatAIDB`

```ts
import {CatAIDB} from 'catai';

CatAIDB.db.activeModel = Object.keys(CatAIDB.db.models)[0];

await CatAIDB.saveDB();
```

You also can change the model settings by changing the `CatAIDB`

```ts
import {CatAIDB} from 'catai';

const selectedModel = CatAIDB.db.models[CatAIDB.db.activeModel];
selectedModel.settings.context = 4096;

await CatAIDB.saveDB();
```

For extra information about the configuration, please read the [configuration guide](./configuration.md)

## Using with node-llama-cpp@beta

You can download the model and use it directly with node-llama-cpp@beta

```ts
import {getLlama, LlamaChatSession} from "node-llama-cpp";
import {getModelPath} from 'catai';

const modelPath = getModelPath("llama3");

const llama = await getLlama();
const model = await llama.loadModel({
    modelPath
});

const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});

const a1 = await session.prompt("Hi there, how are you?");
console.log("AI: " + a1);
```

For more information on how to use the model, please refer to the [node-llama-cpp beta pull request](https://github.com/withcatai/node-llama-cpp/pull/105)
