Help Needed with API Call in Colang File using Llama #502
-
Can you run with
-
I use the following distributions file:
The full code is in nemo.ipynb:
and then:
Also, I'm running this in a Jupyter instance on SageMaker with GPU usage.
-
Hi @andgonzalez-technisys! From what I see, the issue is not the calling of the API. In your logs, I see that the completion contains the prompt as well, which will mess up the parsing; the LLM actually predicts correctly. The second issue I see is that the LLM does not stop and keeps producing tokens, probably until it reaches the limit. This can be fixed by tweaking the prompts for the model you are using.
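A minimal sketch of the two fixes above, assuming the model is loaded through a standard transformers text-generation pipeline (the model name and parameters here are illustrative, not the poster's actual setup):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Hypothetical model id; substitute your quantized Llama 3 checkpoint.
model_name = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=100,
    # Fix 1: return only the completion, not prompt + completion,
    # so the guardrails parser does not see the echoed prompt.
    return_full_text=False,
    # Fix 2: stop at the end-of-sequence token instead of generating
    # until the token limit is reached.
    eos_token_id=tokenizer.eos_token_id,
)
```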
-
I want to call an API from my Colang file with a quantized Llama 3 model. I have registered the provider, and the bot responds well with the rail. However, when I try a simple example like asking for the weather API (weather.co), the bot does not execute the action; it just answers using the LLM. Is it possible to call an action with a quantized Llama 3, for example? Please, I need your help.
Colang file:
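(The original snippet was not captured in this thread. Below is a minimal Colang 1.0 sketch of a rail that calls a weather action; the example utterances and the `get_weather` action name are assumptions, not the poster's actual code.)

```colang
# Sketch only: user intent, flow, and action name are illustrative.
define user ask weather
  "What is the weather like?"
  "How is the weather today?"

define flow
  user ask weather
  # Execute the custom action registered in actions.py (assumed name).
  $answer = execute get_weather
  bot $answer
```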
actions.py:
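(Also not captured. A minimal sketch of what such an action could look like; the endpoint, parameters, and response shape for weather.co are placeholders, not the real API.)

```python
import aiohttp

from nemoguardrails.actions import action


@action(name="get_weather")
async def get_weather(location: str = "London"):
    """Fetch current weather and return a short string for the bot to say."""
    # Placeholder URL and params: the real weather.co API may differ.
    url = "https://weather.co/api/current"
    async with aiohttp.ClientSession() as session:
        async with session.get(url, params={"q": location}) as resp:
            data = await resp.json()
    return f"Weather in {location}: {data}"
```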
config.yml:
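(Not captured either. The key part is that the main model's `engine` must match the provider name registered in config.py; the name `hf_pipeline_llama3` below is an assumption.)

```yaml
# Sketch: the engine name must match the one passed to register_llm_provider().
models:
  - type: main
    engine: hf_pipeline_llama3
```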
The model-loading code is the same as in:
https://github.com/NVIDIA/NeMo-Guardrails/blob/develop/examples/configs/llm/hf_pipeline_llama2/config.py
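For reference, the registration pattern in that linked example looks roughly like the sketch below (simplified; see the linked config.py for the authoritative version, and note that the checkpoint name and quantization flag here are illustrative assumptions):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

from nemoguardrails.llm.helpers import get_llm_instance_wrapper
from nemoguardrails.llm.providers import (
    HuggingFacePipelineCompatible,
    register_llm_provider,
)

model_name = "meta-llama/Meta-Llama-3-8B-Instruct"  # hypothetical checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    load_in_4bit=True,  # assumed quantization setting; adjust to your setup
)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=100)

# Wrap the pipeline and register it under the engine name used in config.yml.
hf_llm = HuggingFacePipelineCompatible(pipeline=pipe)
register_llm_provider(
    "hf_pipeline_llama3",
    get_llm_instance_wrapper(llm_instance=hf_llm, llm_type="hf_pipeline_llama3"),
)
```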
Example: after I add the weather.co file and ask, the response's answer is empty:

res["content"]
""

It's empty. Then, when I try to ask about the weather, the response is:

" "
I use:
I would be very grateful for your help.