
[Local App Snippet] support non conversational LLMs #954

Open · wants to merge 13 commits into main
Conversation

mishig25 (Collaborator) commented on Oct 7, 2024

Description

Most GGUF files on the Hub are instruct/conversational, but not all of them. Previously, local app snippets assumed that all GGUFs are instruct/conversational.
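
For context, here is a minimal sketch of the branching this enables, in the spirit of packages/tasks/src/local-apps.ts. The `ModelData` shape, the `conversational` tag check, and the function name are illustrative assumptions, not necessarily the PR's exact code:

// Illustrative sketch only: pick the snippet shape based on whether the
// model is conversational. Field and helper names are assumptions.
interface ModelData {
	id: string;
	tags: string[];
}

const isConversational = (model: ModelData): boolean => model.tags.includes("conversational");

function vllmCurlSnippet(model: ModelData): string {
	if (isConversational(model)) {
		// Chat-style (instruct) models go through the chat completions endpoint.
		return [
			`curl -X POST "http://localhost:8000/v1/chat/completions" \\`,
			`	-H "Content-Type: application/json" \\`,
			`	--data '{`,
			`		"model": "${model.id}",`,
			`		"messages": [`,
			`			{"role": "user", "content": "Hello!"}`,
			`		]`,
			`	}'`,
		].join("\n");
	}
	// Plain text-generation models get a raw prompt completion instead.
	return [
		`curl -X POST "http://localhost:8000/v1/completions" \\`,
		`	-H "Content-Type: application/json" \\`,
		`	--data '{`,
		`		"model": "${model.id}",`,
		`		"prompt": "Once upon a time",`,
		`		"max_tokens": 150,`,
		`		"temperature": 0.5`,
		`	}'`,
	].join("\n");
}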

vLLM

https://huggingface.co/meta-llama/Llama-3.2-3B?local-app=vllm

mishig@machine:~$ curl -X POST "http://localhost:8000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
                "model": "meta-llama/Llama-3.2-3B",
                "prompt": "Once upon a time",
                "max_tokens": 150,
                "temperature": 0.5
        }'

{"id":"cmpl-157aad50ba6d45a5a7e2641a3c8157dd","object":"text_completion","created":1728293162,"model":"meta-llama/Llama-3.2-3B","choices":[{"index":0,"text":" there was a man who was very generous and kind to everyone. He was a good man and a good person. One day he was walking down the street and he saw a man who was very poor and starving. The man was so hungry that he was crying and shaking. The man was so hungry that he was crying and shaking. The man was so hungry that he was crying and shaking. The man was so hungry that he was crying and shaking. The man was so hungry that he was crying and shaking. The man was so hungry that he was crying and shaking. The man was so hungry that he was crying and shaking. The man was so hungry that he was crying and shaking. The man was so hungry that he was crying and shaking","logprobs":null,"finish_reason":"length","stop_reason":null,"prompt_logprobs":null}],"usage":{"prompt_tokens":5,"total_tokens":155,"completion_tokens":150}}

llama.cpp

https://huggingface.co/mlabonne/gemma-2b-GGUF?local-app=llama.cpp

llama-cli \
  --hf-repo "mlabonne/gemma-2b-GGUF" \
  --hf-file gemma-2b.Q2_K.gguf \
  -p "Once upon a time "

llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
        repo_id="mlabonne/gemma-2b-GGUF",
        filename="gemma-2b.Q2_K.gguf",
)

output = llm(
        "Once upon a time ",
        max_tokens=512,
        echo=True
)

print(output)

@mishig25 mishig25 marked this pull request as ready for review October 7, 2024 10:07
Base automatically changed from fix_vlmm_snippet to main October 7, 2024 10:08
packages/tasks/src/local-apps.ts (outdated review thread, resolved)
packages/tasks/src/model-libraries-snippets.ts (outdated review thread, resolved)
Vaibhavs10 (Member) left a comment:


Minor nit, but important, especially wrt llama.cpp.

packages/tasks/src/local-apps.ts (outdated review thread, resolved)
packages/tasks/src/model-libraries-snippets.ts (outdated review thread, resolved)
packages/tasks/src/local-apps.ts (outdated review thread, resolved)
mishig25 (Collaborator, Author) commented on Oct 7, 2024

Added test cases in packages/tasks/src/local-apps.spec.ts and packages/tasks/src/model-libraries-snippets.spec.ts, since the examples are getting more complex and this way we can be sure not to break any existing examples.

`curl -X POST "http://localhost:8000/v1/chat/completions" \\`,
`	-H "Content-Type: application/json" \\`,
`	--data '{`,
`		"model": "${model.id}",`,
`		"messages": [`,
`		{"role": "user", "content": "Hello!"}`,
A member left a comment:

Suggested change:
-	`	{"role": "user", "content": "Hello!"}`,
+	`	{"role": "user", "content": "What is the capital of France?"}`,

Minor suggestion: "Hello!" looks a bit too terse. Perhaps we can unify the instruct examples to be the same as llama-cpp-python and so on.
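
For completeness, a minimal sketch of the kind of regression test such spec files can hold, reusing the illustrative vllmCurlSnippet from the sketch above (vitest as the runner is an assumption, and these are not the PR's exact assertions):

import { describe, expect, it } from "vitest";

describe("vLLM local-app snippet", () => {
	it("uses /v1/completions for non-conversational models", () => {
		const model = { id: "meta-llama/Llama-3.2-3B", tags: [] };
		// The non-conversational branch should produce a raw prompt completion.
		expect(vllmCurlSnippet(model)).toContain("/v1/completions");
		expect(vllmCurlSnippet(model)).toContain(`"prompt": "Once upon a time"`);
	});

	it("uses /v1/chat/completions for conversational models", () => {
		const model = { id: "meta-llama/Llama-3.2-3B-Instruct", tags: ["conversational"] };
		// The conversational branch should target the chat endpoint.
		expect(vllmCurlSnippet(model)).toContain("/v1/chat/completions");
	});
});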
