feat(anthropic): Add Anthropic PDF support (document type) in invoke #7496

Merged on Jan 18, 2025 (8 commits)
48 changes: 48 additions & 0 deletions examples/src/prompts/pdf_document.ts
@@ -0,0 +1,48 @@
import * as fs from "node:fs";
import { ChatAnthropic } from "@langchain/anthropic";

export const run = async () => {
const llm = new ChatAnthropic({
// As of Jan 2025, only claude-3-5-sonnet-20240620 and claude-3-5-sonnet-20241022 support base64-encoded PDF documents.
model: "claude-3-5-sonnet-20240620",
});

// PDF needs to be in Base64.
const getLocalFile = async (path: string) => {
const localFile = fs.readFileSync(path); // readFileSync is synchronous and returns a Buffer, so no await is needed
const base64File = localFile.toString("base64");
return base64File;
};

// Or fetch the PDF from a remote URL and encode it.
const getRemoteFile = async (url: string) => {
const response = await fetch(url);
const arrayBuffer = await response.arrayBuffer();
const base64File = Buffer.from(arrayBuffer).toString("base64");
return base64File;
};

const base64 = await getRemoteFile(
"https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf"
);

const prompt = "Summarise the contents of this PDF";

const response = await llm.invoke([
{
role: "user",
content: [
{
type: "text",
text: prompt,
},
{
type: "document",
source: base64,
},
],
},
]);
console.log(response.content);
return response.content;
};
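The message shape used in the example above can be factored into a small helper. This is a minimal sketch and not part of the PR: `buildPdfMessage` and the type names are hypothetical, and it assumes the PDF has already been base64-encoded by something like `getLocalFile` or `getRemoteFile`.

```typescript
// Hypothetical types mirroring the content parts used in the example above.
type TextPart = { type: "text"; text: string };
type DocumentPart = { type: "document"; source: string };
type UserMessage = { role: "user"; content: Array<TextPart | DocumentPart> };

// Hypothetical helper: pairs a text prompt with a base64-encoded PDF
// in a single user message, in the same order as the example.
function buildPdfMessage(prompt: string, base64Pdf: string): UserMessage {
  if (base64Pdf.length === 0) {
    throw new Error("base64Pdf must not be empty");
  }
  return {
    role: "user",
    content: [
      { type: "text", text: prompt },
      { type: "document", source: base64Pdf },
    ],
  };
}
```

Under these assumptions, the call in the example would become something like `llm.invoke([buildPdfMessage(prompt, base64)])`.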
14 changes: 14 additions & 0 deletions libs/langchain-anthropic/src/utils/message_inputs.ts
@@ -131,6 +131,20 @@ function _formatContent(content: MessageContent) {
source,
...(cacheControl ? { cache_control: cacheControl } : {}),
};
} else if (contentPart.type === "document") {
// PDF
return {
type: "document",
source:
typeof contentPart.source === "string"
? {
media_type: "application/pdf",
type: "base64",
data: contentPart.source,
}
: contentPart.source,
...(cacheControl ? { cache_control: cacheControl } : {}),
};
} else if (
textTypes.find((t) => t === contentPart.type) &&
"text" in contentPart
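The new `document` branch can be read in isolation as a small normalisation step: a string source is wrapped as a base64 PDF source, while an object source is passed through unchanged. The sketch below restates that logic as a standalone function for illustration only. `formatDocumentPart` and the types are hypothetical names, and it assumes `cache_control` is read directly off the part rather than computed separately as in `_formatContent`.

```typescript
// Hypothetical types for illustration; the real code works on MessageContent parts.
type CacheControl = { type: string };
type DocumentInput = {
  type: "document";
  source: string | Record<string, unknown>;
  cache_control?: CacheControl; // assumption: carried on the part itself
};

// Hypothetical standalone version of the "document" branch in the diff.
function formatDocumentPart(part: DocumentInput) {
  const source =
    typeof part.source === "string"
      ? // A bare string is treated as base64-encoded PDF data.
        { media_type: "application/pdf", type: "base64", data: part.source }
      : // An object is assumed to be a complete source and is passed through.
        part.source;
  return {
    type: "document" as const,
    source,
    // Only include cache_control when it is set.
    ...(part.cache_control ? { cache_control: part.cache_control } : {}),
  };
}
```

Accepting both forms keeps the short string form convenient for callers while leaving room for explicit source objects.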