[llama-index] serialization error and missing token counts with llamaindex and gemini #1207
Labels: bug (Something isn't working), instrumentation (Adding instrumentations to open source packages), triage (Issues that require triage)
Discussed in Arize-ai/phoenix#6070
Originally posted by hristogg January 16, 2025
Hi, I am using Arize Phoenix to trace a LlamaIndex workflow that uses Google's LLMs: Vertex AI Gemini, Gemini through the API, and their text-embedding models.
I have two issues. The first is that the token count stays empty and I cannot figure out how to add token counts to the tracing. The second is that when I created a mock-up workflow to experiment without my complex logic, I hit another error around pydantic serialization, which is curious since I don't get this error in my core workflow. :)
If someone can take a look and help me figure out how to count tokens properly and why this serialization is failing, it would be greatly appreciated.
Here is how I do the instrumentation:
from phoenix.otel import register
from openinference.instrumentation.llama_index import LlamaIndexInstrumentor

tracer_provider = register(
    project_name="test",  # default is 'default'
    endpoint="https://app.phoenix.arize.com/v1/traces",
)
LlamaIndexInstrumentor().instrument(tracer_provider=tracer_provider)
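While the span attributes stay empty, a possible client-side workaround is LlamaIndex's TokenCountingHandler. This is only a sketch: the handler accepts any callable mapping a string to a token list, and the tiktoken encoding below is a stand-in assumption, not Gemini's actual tokenizer, so counts will be approximate.

# Workaround sketch (assumption): count tokens client-side with
# TokenCountingHandler instead of relying on trace attributes.
# tiktoken's cl100k_base is a stand-in tokenizer, not Gemini's own.
import tiktoken
from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager, TokenCountingHandler

token_counter = TokenCountingHandler(
    tokenizer=tiktoken.get_encoding("cl100k_base").encode,
)
Settings.callback_manager = CallbackManager([token_counter])

# After running the workflow:
print(token_counter.total_llm_token_count)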
and my test workflow logic:
from llama_index.core import Settings
from llama_index.core.llms import ChatMessage, MessageRole
from llama_index.core.prompts import ChatPromptTemplate
from llama_index.core.workflow import Context, StartEvent, StopEvent, Workflow, step
from llama_index.llms.vertex import Vertex

llm = Vertex(
    model="gemini-pro",
    temperature=0,
    max_tokens=3000,
    # safety_settings=safety_config,
    credentials=credentials,  # service account credentials loaded elsewhere
)
Settings.llm = llm

class TestWorkflow(Workflow):
    @step
    async def answer_q(self, ctx: Context, ev: StartEvent) -> StopEvent:
        question = ev.question
        qa_prompt_str = (
            "Give an answer to the question below in the language it is asked.\n"
            "---------------------\n"
            "{question}\n"
        )
        chat_text_qa_msgs = [
            ChatMessage(
                role=MessageRole.SYSTEM,
                content="Always answer the question, even if the context isn't helpful.",
            ),
            ChatMessage(role=MessageRole.USER, content=qa_prompt_str),
        ]
        formatted_prompt = ChatPromptTemplate(chat_text_qa_msgs)
        question_to_pass = formatted_prompt.format_messages(question=question)
        print(question_to_pass)
        answer = await llm.achat(question_to_pass)
        return StopEvent(result=answer)
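For completeness, here is a minimal sketch of how such a workflow is driven; the question text is illustrative. Keyword arguments passed to run() are exposed as attributes on the StartEvent, which is how ev.question above gets its value.

import asyncio

async def main():
    workflow = TestWorkflow(timeout=60)
    # kwargs to run() become attributes on the StartEvent
    result = await workflow.run(question="What is the capital of France?")
    print(result)

asyncio.run(main())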
Here is the error as well:
ERROR:openinference.instrumentation.llama_index._handler:Error serializing to JSON: PydanticSerializationError: Unable to serialize unknown type: <class 'google.cloud.aiplatform_v1beta1.types.prediction_service.GenerateContentResponse'>
Traceback (most recent call last):
  File "C:\Users\hgospodinov\venv\container_work\Lib\site-packages\openinference\instrumentation\llama_index\_handler.py", line 253, in process_output
    self[OUTPUT_VALUE] = result.model_dump_json(exclude_unset=True)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\hgospodinov\venv\container_work\Lib\site-packages\pydantic\main.py", line 441, in model_dump_json
    return self.__pydantic_serializer__.to_json(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.PydanticSerializationError: Error serializing to JSON: PydanticSerializationError: Unable to serialize unknown type: <class 'google.cloud.aiplatform_v1beta1.types.prediction_service.GenerateContentResponse'>
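A guess at the root cause, as a minimal sketch: LlamaIndex's ChatResponse keeps the raw provider response in an Any-typed raw field, and pydantic cannot JSON-serialize the Vertex proto object stored there. Both classes below are hypothetical stand-ins, not the real GenerateContentResponse or ChatResponse.

# Sketch of the suspected failure mode: a pydantic model with an
# Any-typed field holding a non-pydantic object, as ChatResponse.raw
# does with the Vertex proto response.
from typing import Any, Optional
from pydantic import BaseModel

class FakeProtoResponse:  # hypothetical stand-in for GenerateContentResponse
    pass

class ChatResponseLike(BaseModel):  # simplified stand-in for ChatResponse
    content: str = ""
    raw: Optional[Any] = None

resp = ChatResponseLike(content="hello", raw=FakeProtoResponse())
try:
    resp.model_dump_json(exclude_unset=True)
except Exception as exc:
    # pydantic_core raises PydanticSerializationError:
    # "Unable to serialize unknown type: <class '...FakeProtoResponse'>"
    print(type(exc).__name__, exc)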