feat(anthropic): Add Anthropic PDF support (document type) in invoke #7496

Merged on Jan 18, 2025 (8 commits)

adhambadr
Contributor

PDF support has been part of Anthropic's message types since claude-3-5-sonnet-20240620.

You can send a PDF document to Anthropic, which extracts the text, converts each page to an image, and supplies the LLM with both (text + screenshot) for every page, enabling deep-dive analysis, text extraction, and more. It's pretty neat and especially handy for structured output. This PR adds support for it in the LangChain ecosystem, since using the document type currently throws an unsupported type error before the request is passed to the LLM.

I added support for the document source type and simplified the source object, so you can pass either just the base64 string or the object exactly as in Anthropic's documentation.
I also added a working example, runnable with yarn example examples/src/prompts/pdf_document.ts
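
To illustrate the two accepted shapes of source, here is a minimal sketch of how a bare base64 string could be normalized into the explicit object form shown in Anthropic's documentation. The helper name toDocumentBlock and the local interfaces are hypothetical and are not taken from this PR's implementation.

```typescript
// Explicit base64 document source, following the shape in Anthropic's docs.
interface Base64PdfSource {
  type: "base64";
  media_type: "application/pdf";
  data: string;
}

interface DocumentBlock {
  type: "document";
  source: Base64PdfSource;
}

// Hypothetical helper (an assumption, not the PR's code): accept either a
// bare base64 string or a full source object, and always return the
// explicit form.
function toDocumentBlock(source: string | Base64PdfSource): DocumentBlock {
  if (typeof source === "string") {
    return {
      type: "document",
      source: { type: "base64", media_type: "application/pdf", data: source },
    };
  }
  // Already in the explicit form: pass it through unchanged.
  return { type: "document", source };
}
```

With a helper like this, callers could write source: base64 for the common case and still pass the full object when they need to be explicit.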

Here is an example usage:

import * as fs from "node:fs";
import { ChatAnthropic } from "@langchain/anthropic";

const llm = new ChatAnthropic({
  model: "claude-3-5-sonnet-20240620",
  // API key (read from the ANTHROPIC_API_KEY environment variable if not passed here)
});

// Option 1: read a local file
const file = fs.readFileSync("test.pdf");
const base64 = file.toString("base64");

// Option 2: load remotely (web environment); use this instead of Option 1
// const res = await fetch("https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf");
// const buffer = await res.arrayBuffer();
// const base64 = Buffer.from(buffer).toString("base64");

const prompt = "Summarize for me the contents of this document";
const { content } = await llm.invoke([
  {
    role: "user",
    content: [
      {
        type: "text",
        text: prompt,
      },
      {
        type: "document",
        source: base64,
      },
    ],
  },
]);

console.log(content);
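
Note that the content of a LangChain message can be either a plain string or an array of typed blocks. If you only want the text of the summary, a small helper along these lines may be useful. This is a sketch: the name contentToText and the simplified local types are assumptions for illustration, and the real LangChain types are richer.

```typescript
// Simplified local stand-ins for message content (assumed for this sketch).
type ContentBlock = { type: string; text?: string };
type MessageContent = string | ContentBlock[];

// Hypothetical helper: return the text of a response whether it arrives
// as a plain string or as a list of typed blocks. Non-text blocks are skipped.
function contentToText(content: MessageContent): string {
  if (typeof content === "string") {
    return content;
  }
  return content
    .filter((block) => block.type === "text" && typeof block.text === "string")
    .map((block) => block.text as string)
    .join("");
}
```

In the example above, this would be called as console.log(contentToText(content)), possibly with a cast if the library's content type is wider than the local one.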

This is my first PR to this project, so apologies if I missed something crucial. Feedback and improvements are welcome, as well as a shoutout to my Twitter.

Supported models as of Jan 2025:

  1. claude-3-5-sonnet-20240620
  2. claude-3-5-sonnet-20241022
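
Since only some models accept PDFs, callers may want to check the model name before attaching a document. A minimal sketch, assuming the list above is complete as of Jan 2025; supportsPdf and PDF_MODELS are hypothetical names and not part of @langchain/anthropic:

```typescript
// Models listed above as supporting PDF input (as of Jan 2025).
const PDF_MODELS: readonly string[] = [
  "claude-3-5-sonnet-20240620",
  "claude-3-5-sonnet-20241022",
];

// Hypothetical guard: true only for an exact match against the list above.
function supportsPdf(model: string): boolean {
  return PDF_MODELS.indexOf(model) !== -1;
}
```

The list is hard-coded here, so it would need updating as further models gain PDF support.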

@dosubot bot added the size:M (This PR changes 30-99 lines, ignoring generated files) and auto:improvement (Medium size change to existing code to handle new use-cases) labels on Jan 10, 2025
@jacoblee93
Collaborator

Ah nice!

@jacoblee93 (Collaborator) left a comment

Thanks for flagging this - see comment!

libs/langchain-anthropic/src/utils/message_inputs.ts (outdated, resolved)
@camwardy

Thanks for adding this @adhambadr, hopefully this can get merged soon as it'd be really useful for us!

@jacoblee93 changed the title from "Add Anthropic PDF support (document type) in invoke" to "feat(anthropic): Add Anthropic PDF support (document type) in invoke" on Jan 18, 2025
@jacoblee93 merged commit 94467fa into langchain-ai:main on Jan 18, 2025
33 of 34 checks passed
@jacoblee93
Collaborator

Thank you!
