Rerank-english-v3.0 doesn't return re-ranked documents (v4.57) #202

Open
pchamart opened this issue Jul 29, 2024 · 3 comments


@pchamart

Rerank-english-v3.0 doesn't return re-ranked documents

!pip show cohere

Name: cohere
Version: 4.57
Summary: Python SDK for the Cohere API
Home-page: 
Author: Cohere
Author-email: 
License: 
Location: /Users/xxxxx/miniconda3/envs/rag/lib/python3.10/site-packages
Requires: aiohttp, backoff, fastavro, importlib_metadata, requests, urllib3

Enter API key

import os
from getpass import getpass

import cohere

cohere_api_key = os.getenv("COHERE_API_KEY") or getpass("Enter Cohere API key: ")
co = cohere.Client(api_key=cohere_api_key)
print(co.datasets.list())

Enter Cohere API key: ········
DatasetsListResponse(datasets=None)

List of docs matched from Vector Store

query = "What are action groups in Amazon Bedrock"
print(matched_docs)

docs = ['Components in agents for Amazon BedrockBehind the scenes, agents for Amazon Bedrock automate the prompt engineering and orchestration of user-requested tasks.They can securely augment the prompts with company-specific information to provide responses back to the user in natural language.The agent breaks the user-requested task into multiple steps and orchestrates subtasks with the help of FMs.Action groups are tasks that the agent can perform autonomously....',
'Select a foundation model:In the Select model screen, you select a model.Provide clear and precise instructions to the agent about what tasks to perform and how to interact with the users.Add action groups:An action is a task the agent can perform by making API calls.A set of actions comprise an action group.You provide an API schema that defines all the APIs in the action group.You must provide an API schema in the OpenAPI schema JSON format.The Lambda function contains the business logic needed to perform API calls.You must associate a Lambda function to each action group.Give the action group a name and a description for the action.Select the Lambda function, provide an API schema file and selectNext.In the final step, review the agent configuration and selectCreate Agent....',
'General availability of Agents for Amazon Bedrock to help execute multistep tasks using systems, data sources, and company knowledge.LLMs are great at having conversations and generating content, but customers want their applications to be able to do even morelike take actions, solve problems, and interact with a range of systems to complete multi-step tasks like booking travel, filing insurance claims, or ordering replacement parts...'
]

Rerank the docs

RERANK_MODEL_NAME = "rerank-english-v3.0" # another option is rerank-multilingual-v2.0
reranked_results = co.rerank(
    model=RERANK_MODEL_NAME, query=query, documents=matched_docs, top_n=3
)
print(reranked_results)

RerankResponse(
    id='64dc0d1d-7b67-49ef-8f62-c6c1bc6a07a6',
    results=[
        RerankResponseResultsItem(document=None, index=0, relevance_score=0.99911696),
        RerankResponseResultsItem(document=None, index=1, relevance_score=0.99895966),
        RerankResponseResultsItem(document=None, index=2, relevance_score=0.67362124)
    ],
    meta=ApiMeta(
        api_version=ApiMetaApiVersion(version='1', is_deprecated=None, is_experimental=None),
        billed_units=ApiMetaBilledUnits(
            input_tokens=None,
            output_tokens=None,
            search_units=1,
            classifications=None
        ),
        tokens=None,
        warnings=None
    )
)

Print reranked results

for idx, r in enumerate(reranked_results):
  print(f"Document Rank: {idx + 1}, Document Index: {r.index}")
  print(f"Document: {r.document['text']}")
  print(f"Relevance Score: {r.relevance_score:.2f}")
  print("\n")

Document Rank: 1, Document Index: <built-in method index of tuple object at 0x11a9b9f40>
Traceback (most recent call last):
  in <module>:3
    1 for idx, r in enumerate(reranked_results):
    2   print(f"Document Rank: {idx + 1}, Document Index: {r.index}")
  ❱ 3   print(f"Document: {r.document['text']}")
    4   print(f"Relevance Score: {r.relevance_score:.2f}")
    5   print("\n")
AttributeError: 'tuple' object has no attribute 'document'

@ai-yann
Collaborator

ai-yann commented Oct 23, 2024

Thanks for sharing this issue. Which notebook in this repo are you referring to?

@praveenc

I used the code sample in the developer docs here: https://docs.cohere.com/v2/page/rerank-demo

results = co.rerank(query=query, model=MODEL_NAME, documents=docs, top_n=3) # Change top_n to change the number of results returned. If top_n is not passed, all results will be returned.
for idx, r in enumerate(results):
  print(f"Document Rank: {idx + 1}, Document Index: {r.index}")
  print(f"Document: {r.document['text']}")
  print(f"Relevance Score: {r.relevance_score:.2f}")
  print("\n")

@ai-yann
Collaborator

ai-yann commented Nov 8, 2024

@praveenc, thanks for reporting this! I've reproduced the issue and found that there's a mismatch between our documentation and the actual behavior.

The docs show:

# Documentation example
for idx, r in enumerate(results):
    print(f"Document Rank: {idx + 1}, Document Index: {r.index}")
    print(f"Document: {r.document['text']}")
    print(f"Relevance Score: {r.relevance_score:.2f}")

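But the actual rerank response behaves differently:

  • Each result item comes back with document=None, so r.document['text'] has nothing to read
  • Iterating over the response object directly yields tuples, not result objects with index, document and relevance_score attributes; the result items are in results.results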
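
The tuple behavior can be reproduced without calling the API. This is a minimal sketch, assuming the response object iterates like a pydantic-style model that yields (field name, value) pairs. `FakeRerankResponse` and `FakeResultItem` are hypothetical stand-ins written for this illustration, not SDK classes:

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class FakeResultItem:
    """Hypothetical stand-in for one rerank result item."""
    index: int
    relevance_score: float
    document: Optional[dict] = None


@dataclass
class FakeRerankResponse:
    """Hypothetical stand-in for the rerank response object."""
    id: str
    results: List[FakeResultItem]

    def __iter__(self):
        # Assumption: mimic a pydantic-style iterator that yields
        # (field_name, value) pairs instead of the result items.
        for f in fields(self):
            yield f.name, getattr(self, f.name)


response = FakeRerankResponse(
    id="example-id",
    results=[FakeResultItem(index=0, relevance_score=0.99)],
)

# Iterating the response itself yields tuples such as ("id", "example-id").
first = next(iter(response))
print(type(first).__name__)        # tuple
print(callable(first.index))       # True: tuple.index is a built-in method
print(hasattr(first, "document"))  # False, which is where the AttributeError comes from

# The result items live in the .results list.
item = response.results[0]
print(item.index, item.relevance_score)
```

This lines up with the output in the original report: `r.index` printed as a built-in method of a tuple, and `r.document` raised AttributeError.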

Here's the working code that matches the current behavior:

for idx, r in enumerate(results.results):  # Use results.results to get the list
    print(f"Document Rank: {idx + 1}")
    print(f"Document: {docs[r.index]}")  # Use index to reference original docs
    print(f"Relevance Score: {r.relevance_score:.2f}")
    print("\n")
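
The same index lookup can be wrapped in a small helper. This is a minimal sketch: `pair_with_documents` is a hypothetical name, not part of the SDK, and the `SimpleNamespace` objects stand in for real result items so that the snippet runs without an API key:

```python
from types import SimpleNamespace
from typing import List, Sequence, Tuple


def pair_with_documents(
    result_items: Sequence, docs: Sequence[str]
) -> List[Tuple[int, int, float, str]]:
    """Return (rank, original_index, relevance_score, text) for each result item."""
    paired = []
    for rank, item in enumerate(result_items, start=1):
        # item.index points back into the list that was sent as `documents`.
        paired.append((rank, item.index, item.relevance_score, docs[item.index]))
    return paired


# Stand-in data for illustration only; these are not real API results.
docs = ["doc about agents", "doc about action groups", "doc about availability"]
fake_items = [
    SimpleNamespace(index=1, relevance_score=0.98, document=None),
    SimpleNamespace(index=0, relevance_score=0.75, document=None),
]

for rank, index, score, text in pair_with_documents(fake_items, docs):
    print(f"Document Rank: {rank}, Document Index: {index}")
    print(f"Document: {text}")
    print(f"Relevance Score: {score:.2f}")
```

With a real response, the first argument would be `results.results` and the second the same list that was passed as `documents`.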

I'll create a PR to update our docs to reflect this current behavior. @praveenc, could you please try this modified version and confirm if it works for you?
