
LLMT - Large Language Model Transformer

Transforms all files in a specified directory using a large language model. For each transformation you can specify a prompt, the analyzer (LLM provider) and model to use, and conditions that select which files are transformed. The transformed files are saved in a new directory that preserves the original file structure.

Usage

Command line usage

Docker

docker run -v $(pwd):/data ghcr.io/blwsh/llmt analyze

Note

When using the openai analyzer you need to provide an API key by setting the OPENAI_TOKEN environment variable.

Release

Download the latest release from the releases page. Extract the archive and run the binary.

# --config is optional and defaults to config.yaml in the current directory.
# analyze transforms the files in ./myProject and writes them as markdown to ../docs (maintains file structure).
llmt --config <config_file> analyze ./myProject ../docs

Configuration

Example config.yaml file:

#$schema: https://raw.githubusercontent.com/blwsh/llmt/main/schema.json
version: "0.1"

analyzers:
  - prompt: Write docs for this file
    analyzer: openai
    model: gpt-4o-mini
    regex: ^.+\.php$
    not_in:
      - vendor

Analyzer configuration

| Field | Description | Required | Default |
|---|---|---|---|
| prompt | The prompt to use for the language model. | yes | |
| analyzer | Specifies which LLM to use. See Available analyzers for the full list of analyzers. | yes | |
| model | The model to use for the language model. If you use a fine-tuned OpenAI model, set its name here. | yes | |
| regex | A regex to match the file path. | no | |
| not_in | A list of directories to exclude from the analysis. | no | |
| in | A list of directories to include in the analysis. Note: not_in takes precedence over in. | no | |

See schema.json for the full config schema.
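The interaction between regex, in and not_in can be sketched in Go as follows. This is only an illustration of the documented rule that not_in takes precedence over in; the type `fileFilter` and the functions `newFilter`, `hasDir` and `shouldAnalyze` are hypothetical names, and matching directories by path component is an assumption, so llmt's actual matching may differ.

```go
package main

import (
	"fmt"
	"path/filepath"
	"regexp"
	"strings"
)

// fileFilter mirrors the regex, in and not_in fields of one analyzer entry.
// It is a hypothetical type for illustration, not part of llmt.
type fileFilter struct {
	regex *regexp.Regexp // nil means "match every path"
	in    []string       // empty means "any directory"
	notIn []string
}

// newFilter compiles pattern once; an empty pattern matches every path.
func newFilter(pattern string, in, notIn []string) fileFilter {
	f := fileFilter{in: in, notIn: notIn}
	if pattern != "" {
		f.regex = regexp.MustCompile(pattern)
	}
	return f
}

// hasDir reports whether any directory component of path equals one of dirs.
// Assumption: directories are compared by name, component by component.
func hasDir(path string, dirs []string) bool {
	parts := strings.Split(filepath.ToSlash(filepath.Dir(path)), "/")
	for _, part := range parts {
		for _, d := range dirs {
			if part == d {
				return true
			}
		}
	}
	return false
}

// shouldAnalyze checks not_in first, so an exclusion wins over an inclusion.
func (f fileFilter) shouldAnalyze(path string) bool {
	if hasDir(path, f.notIn) {
		return false
	}
	if len(f.in) > 0 && !hasDir(path, f.in) {
		return false
	}
	return f.regex == nil || f.regex.MatchString(path)
}

func main() {
	// Same regex and not_in values as the example config.yaml.
	f := newFilter(`^.+\.php$`, nil, []string{"vendor"})
	for _, p := range []string{"src/index.php", "vendor/lib/a.php", "README.md"} {
		fmt.Println(p, f.shouldAnalyze(p))
	}
}
```

With the values from the example config, only src/index.php passes: the vendor file is excluded by not_in and README.md does not match the regex.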

Available analyzers
| Analyzer | Description |
|---|---|
| openai | Uses the OpenAI API to transform the files. You need to provide an API key via the OPENAI_TOKEN environment variable. |
| ollama | Uses the Ollama API to transform the files. You can override the Ollama URL by setting the OLLAMA_HOST environment variable. |
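When embedding these analyzers in your own program, the two environment variables can be read with a small fallback helper. The sketch below is not taken from llmt: `envOr` is a hypothetical helper, and the fallback URL is Ollama's usual local address, which is an assumption about what a sensible default looks like rather than a statement of llmt's own default.

```go
package main

import (
	"fmt"
	"os"
)

// envOr returns the value of key, or fallback when the variable is unset or empty.
// Hypothetical helper for illustration; not part of llmt.
func envOr(key, fallback string) string {
	if v, ok := os.LookupEnv(key); ok && v != "" {
		return v
	}
	return fallback
}

func main() {
	// Assumption: http://localhost:11434 is Ollama's usual local address;
	// the default used by llmt itself may differ.
	host := envOr("OLLAMA_HOST", "http://localhost:11434")
	fmt.Println("ollama host:", host)

	// The openai analyzer needs a token, so an empty value is worth reporting early.
	if envOr("OPENAI_TOKEN", "") == "" {
		fmt.Println("OPENAI_TOKEN is not set; the openai analyzer needs it")
	}
}
```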

Go usage

You can find a comprehensive list of examples here. Below is a simple example which has similar behaviour to the command line analyze command.

package main

import (
	"context"
	"os"
	"strings"

	"github.com/blwsh/llmt/pkg/analyzer"
	"github.com/blwsh/llmt/pkg/analyzer/item_analyzer/openai"
	"github.com/blwsh/llmt/pkg/analyzer/project_analyzer/chat"
)

func main() {
	// Analyze ./myProject and write the results to ../docs.
	chat.New().
		AnalyzeProject(context.Background(), "./myProject", "../docs", []analyzer.FileAnalyzerConfig{
			{
				Prompt:   "document this file's behaviour",
				Analyzer: openai.New("OPENAI_TOKEN_HERE", "gpt-4o-mini"),
				// Only process PHP files.
				Condition: func(path string) bool { return strings.HasSuffix(path, ".php") },
				// Write each result to its destination path.
				// os.WriteFile replaces the deprecated ioutil.WriteFile.
				ResultHandler: func(destFilepath string, result string) error {
					return os.WriteFile(destFilepath, []byte(result), 0644)
				},
			},
		})
}

Condition decides which files are processed, and ResultHandler decides what happens to each result.
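As an illustration, here is a slightly richer pair of functions with the same shapes as the Condition and ResultHandler fields in the snippet above. The names `phpOutsideVendor` and `writeResult` are hypothetical, and the block deliberately does not import llmt, so it only shows the function signatures taken from that snippet. Whether llmt already creates output directories is not stated here, so the `os.MkdirAll` call is a defensive assumption.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// phpOutsideVendor has the same shape as the Condition field:
// it accepts .php files unless a path component is "vendor".
func phpOutsideVendor(path string) bool {
	if !strings.HasSuffix(path, ".php") {
		return false
	}
	for _, part := range strings.Split(filepath.ToSlash(path), "/") {
		if part == "vendor" {
			return false
		}
	}
	return true
}

// writeResult has the same shape as the ResultHandler field.
// It creates missing parent directories before writing the result.
func writeResult(destFilepath string, result string) error {
	if err := os.MkdirAll(filepath.Dir(destFilepath), 0o755); err != nil {
		return fmt.Errorf("create output directory: %w", err)
	}
	return os.WriteFile(destFilepath, []byte(result), 0o644)
}

func main() {
	fmt.Println(phpOutsideVendor("src/index.php"))    // true
	fmt.Println(phpOutsideVendor("vendor/lib/a.php")) // false

	// Write a sample result into a temporary location.
	dest := filepath.Join(os.TempDir(), "llmt-example", "src", "index.php.md")
	if err := writeResult(dest, "# index.php\n"); err != nil {
		fmt.Println("write failed:", err)
	}
}
```

Functions like these could be assigned to the Condition and ResultHandler fields in place of the inline closures.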

Tip

For a more complete version of the snippet above, see the examples/overview directory.

Development

All contributions welcome! For the next release I want to offer more analyzers with greater configurability. I also plan on adding more hooks for file analyzers to allow for more complex transformations.