
Saturn Cloud LLM (Un)Framework

The Saturn Cloud LLM Framework is a set of tools for several application-level and functional tasks around LLMs.

Application Tasks:

  • RAG QA
  • Text Summarization
  • NER
  • Automated Tagging

Functional Tasks:

  • fine tuning
  • batch inference
  • model serving

This repository is designed to be a framework for the common tasks one often performs with LLMs. It is also designed to be easy to read, so that if your needs go beyond what is provided here, you can easily fork this repository or build your own framework on top of your existing code. You should fork this repository or build your own framework as soon as this repository stops making your life easier.

You can use this framework without using Saturn Cloud. You can also build LLMs on Saturn Cloud without using this framework. This framework is offered as an easy and useful way to get started.

Structure of the repository

  • llm module - this module contains all the "library" code to facilitate LLM applications, as well as LLM functional tasks.
  • build_examples - this directory contains scripts used to prepare the data used in the examples. Users are not expected to use this directory.
  • starting_points - this directory contains code templates that you can implement in order to apply this repository to your own data.
  • examples - this directory contains examples of using the framework on sample datasets. You can think of examples as the code in starting_points, implemented for specific datasets.

Concepts

This repository uses a few concepts:

Model Config

We have a registry of common models supported by this framework. Model Configs include common parameters for each model, as well as the PromptFormat for the model.
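
For illustration, a model config might conceptually bundle the pieces below. The names used here (ModelConfig, MODEL_REGISTRY, Llama2Format, and the field names) are illustrative only, not the framework's actual API:

from dataclasses import dataclass, field

# Illustrative sketch only: these names are not the framework's actual API;
# they show what a model config conceptually bundles together.
@dataclass
class ModelConfig:
    model_id: str        # e.g. a HuggingFace Hub identifier
    prompt_format: str   # name of the PromptFormat the model was trained with
    load_kwargs: dict = field(default_factory=dict)  # common loading parameters

MODEL_REGISTRY = {
    "llama-2-7b-chat": ModelConfig(
        model_id="meta-llama/Llama-2-7b-chat-hf",
        prompt_format="Llama2Format",
        load_kwargs={"torch_dtype": "float16"},
    ),
    "vicuna-7b": ModelConfig(
        model_id="lmsys/vicuna-7b-v1.5",
        prompt_format="VicunaFormat",
    ),
}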

Prompt Format

This is the format that was used to train the model. It is a good idea to use the Prompt Format for a given model, though it is not always essential.

For example, the Llama 2 chat model expects prompts to follow this style:

<s>[INST] <<SYS>>
{system_message}
<</SYS>>

{input} [/INST] {response} </s>

Whereas Vicuna expects prompts to follow this style:

<s> {system_message}
USER: {input}
ASSISTANT: {response}
</s>

Not all models have a Prompt Format. For example the Llama 2 base model was trained on a corpus of text that didn't have system messages or user/assistant roles.
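
To make the difference concrete, here is a minimal sketch of how the same system message, input, and response could be rendered into each of the two styles above. The function names are illustrative and not part of the framework:

# Illustrative helpers based on the two templates shown above; the framework's
# PromptFormat classes may differ.
def render_llama2_chat(system_message: str, user_input: str, response: str) -> str:
    # Llama 2 chat style
    return (
        f"<s>[INST] <<SYS>>\n{system_message}\n<</SYS>>\n\n"
        f"{user_input} [/INST] {response} </s>"
    )

def render_vicuna(system_message: str, user_input: str, response: str) -> str:
    # Vicuna style
    return (
        f"<s> {system_message}\n"
        f"USER: {user_input}\n"
        f"ASSISTANT: {response}\n</s>"
    )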

Prompts

Prompts include specific system messages, examples (for few-shot learning) and templates for inputs, responses, and contexts. Prompts map to the problem you are trying to solve, and can be mixed and matched with different PromptFormats that map to different models.
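
For example, a Prompt for summarization with a single few-shot example might look roughly like the sketch below. The examples parameter is a guess at the interface and may not match the framework's exact Prompt signature:

# Hypothetical sketch: the `examples` parameter is a guess and may not match the
# framework's actual Prompt signature.
prompt = Prompt(
    system_message="Please summarize the following conversation",
    input_template="Conversation: {text}",
    response_template="Summary: {text}",
    examples=[
        {
            "input": "A: Lunch at noon tomorrow? B: Sure, see you then.",
            "response": "They agree to have lunch together at noon tomorrow.",
        },
    ],
)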

Note

It is important that your Prompt or Prompt Format contain enough information to tell the language model when it should start servicing your request.

For example, let's assume that you've constructed a Prompt object that renders the following piece of data for fine tuning:

Please summarize the following conversation
A: Hi Tom, are you busy tomorrow’s afternoon?
B: I’m pretty sure I am. What’s up?
A: Can you go with me to the animal shelter?.
B: What do you want to do?
A: I want to get a puppy for my son.

Tom and his friend are going to the animal shelter to get a puppy for Tom's son.

If you fine tuned a model with data like this and then attempted to use it to generate summaries, you might feed in this prompt:

Please summarize the following conversation
A: Hi Tom, are you busy tomorrow’s afternoon?
B: I’m pretty sure I am. What’s up?
A: Can you go with me to the animal shelter?.
B: What do you want to do?
A: I want to get a puppy for my son.

The model would likely produce the following output:

B: That will make him so happy.
A: Yeah, we’ve discussed it many times. I think he’s ready now.
B: That’s good. Raising a dog is a tough issue. Like having a baby ;-)
A: I'll get him one of those little dogs.

This is because nothing in the input indicated that the conversation was over, and that the model should begin producing the summary. If you were using a model such as Vicuna, the inherent PromptFormat would solve this problem:

User: Please summarize the following conversation
A: Hi Tom, are you busy tomorrow’s afternoon?
B: I’m pretty sure I am. What’s up?
A: Can you go with me to the animal shelter?.
B: What do you want to do?
A: I want to get a puppy for my son.
Assistant: Tom and his friend are going to the animal shelter to get a puppy for Tom's son.

The presence of the "Assistant:" string indicates to the language model that it's time to produce the summary.

A better approach would be to bake this information into the Prompt object, so that the format of the model is irrelevant.

Prompt(
  system_message="Please summarize the following conversation", 
  input_template="Conversation: {text}",
  response_template="Summary: {text}"
)

This would result in the following piece of training data when used with the Llama 2 base model:

Please summarize the following conversation
Conversation: A: Hi Tom, are you busy tomorrow’s afternoon?
B: I’m pretty sure I am. What’s up?
A: Can you go with me to the animal shelter?.
B: What do you want to do?
A: I want to get a puppy for my son.
Summary: Tom and his friend are going to the animal shelter to get a puppy for Tom's son.

Or, if used with Vicuna:

Please summarize the following conversation
User: Conversation: A: Hi Tom, are you busy tomorrow’s afternoon?
B: I’m pretty sure I am. What’s up?
A: Can you go with me to the animal shelter?.
B: What do you want to do?
A: I want to get a puppy for my son.
Assistant: Summary: Tom and his friend are going to the animal shelter to get a puppy for Tom's son.

The User and Assistant roles help Vicuna identify the components of the prompt, while the Conversation and Summary templating serves as a callback to the task in the system message. During inference we can also use the PromptFormat's roles as early stopping conditions: if the model goes on to generate a new User: ... message after the summary, we know the request is complete, and the final output can be cleaned up before being returned to the user.
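
As a rough sketch of that cleanup step (the function name and stop strings are illustrative, not the framework's API):

def truncate_at_next_turn(generated: str, stop_strings=("User:", "USER:")) -> str:
    # Cut the generated text at the first marker that starts a new user turn,
    # keeping only the completed summary.
    cut = len(generated)
    for stop in stop_strings:
        idx = generated.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return generated[:cut].strip()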

Configuration

We've written scripts for tasks such as fine tuning, batch inference, and model serving, so that you can ideally run these tasks without having to write any code at all. To do so, we rely on a lightweight YAML configuration to direct the specifics of each task.

Note

Sometimes the configuration delegates to other code/classes, for example:

  • load_dataset, load_from_disk for referencing HuggingFace datasets
  • UserAssistantFormat, VicunaFormat for PromptFormats
  • ZeroShotQA and FewShotQA classes for Prompts

These configurations are specified with a method and a kwargs value. method is a string that has been registered against an existing Python function in code. You can also call methods that we haven't registered with the following syntax: path.to.module::name. The kwargs entry is a dictionary of parameters that the method expects.
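
As an illustration, a task configuration might look like the snippet below. The top-level keys and values are examples only and do not reflect the framework's exact schema; they just show the method/kwargs pattern and the path.to.module::name syntax:

import yaml

# Hypothetical configuration: the keys and values are illustrative, not the
# framework's exact schema.
config_text = """
dataset:
  method: load_from_disk              # a registered method name
  kwargs:
    dataset_path: ./data/my_dataset
prompt_format:
  method: VicunaFormat                # a registered PromptFormat
prompt:
  method: ZeroShotQA                  # a registered Prompt
metrics:
  method: my_package.scoring::rouge   # an unregistered function, called by module path
  kwargs:
    use_stemmer: true
"""
config = yaml.safe_load(config_text)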
