Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Llama.cpp Support #183

Draft
wants to merge 9 commits into
base: main
Choose a base branch
from
Draft

Add Llama.cpp Support #183

wants to merge 9 commits into from

Conversation

bayedieng
Copy link
Contributor

@bayedieng bayedieng commented Aug 27, 2024

This PR adds support for Llama.cpp and closes #167.

@AlexCheema
Copy link
Contributor

AlexCheema commented Sep 5, 2024

Hey @bayedieng just checking in. Anything I can help with to move this along?

@bayedieng
Copy link
Contributor Author

Hey @AlexCheema I indeed was initially having trouble understanding the codebase however it's clearer now (Inheritance can be confusing). I wrote a basic sharded inference engine class and will proceed with the implementation.

My plan is to largely follow the implementation of the pytorch and tinygrad inference engines implementation with the only exception being skipping the tokenizer part of the problem. The Llama CPP API tokenizer is tied to the Llama class being instantiated. Also, the Tokenizer being defined in the other implementations don't seem to be tokenizing inputs but rather applies a chat template in the handle_chat_completions function of the ChatGPT API. I will be implementing it manually later in the call stack.

I will let you know if I have any further questions and am looking to have atleast a one working ai model being inferenced later today.

exo/models.py Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BOUNTY - $500] Llama.cpp inference engine
2 participants