StructuredOutputParser can't handle nested triple backticks #6734
Comments
Hey @hnykda! I'm here to help you with your LangChain.js issue. I'm a bot designed to assist with bug fixes, answer questions, and support contributors. Let's tackle this bug together! 🤖
Here is the relevant code snippet from the parse method:
async parse(text: string): Promise<z.infer<T>> {
  try {
    const json = text.includes("```")
      ? text.trim().split(/```(?:json)?/)[1]
      : text.trim();
    const escapedJson = json
      .replace(/"([^"\\]*(\\.[^"\\]*)*)"/g, (_match, capturedGroup) => {
        const escapedInsideQuotes = capturedGroup.replace(/\n/g, "\\n");
        return `"${escapedInsideQuotes}"`;
      })
      .replace(/\n/g, "");
    return await this.schema.parseAsync(JSON.parse(escapedJson));
  } catch (e) {
    throw new OutputParserException(
      `Failed to parse. Text: "${text}". Error: ${e}`,
      text
    );
  }
}
For handling JSON content that includes nested triple backticks, you can use the parseJsonMarkdown function:
export function parseJsonMarkdown(s: string, parser = parsePartialJson) {
  s = s.trim();
  const match = /```(json)?(.*)```/s.exec(s);
  if (!match) {
    return parser(s);
  } else {
    return parser(match[2]);
  }
}
This function uses a regular expression to detect and extract JSON content enclosed in triple backticks and then parses it using the given parser. You might want to consider using this utility function or modifying the parser to handle this case.
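To make the difference between the two extraction strategies concrete, here is a minimal sketch that runs both on the same input. The sample text is made up, and plain JSON.parse stands in for parsePartialJson, so this only approximates what the library does. The fence string is built with repeat() purely so the example can be shown inside a fenced block.

```typescript
// Three backticks, built with repeat() only so this example can sit in a fenced block.
const FENCE = "`".repeat(3);

// Hypothetical model output: a JSON object wrapped in a json fence, where a
// string value itself contains a fenced snippet.
const inner = JSON.stringify({ answer: `Run ${FENCE}ls -la${FENCE} first` });
const text = `${FENCE}json\n${inner}\n${FENCE}`;

// Same extraction as in parse(): split on every fence and keep the second piece.
const viaSplit = text.trim().split(new RegExp(`${FENCE}(?:json)?`))[1];

// Same extraction as in parseJsonMarkdown(): greedy match up to the last fence.
const match = new RegExp(`${FENCE}(json)?(.*)${FENCE}`, "s").exec(text.trim());
const viaRegex = match ? match[2] : text.trim();

// JSON.parse is a stand-in here; returns undefined instead of throwing.
function tryParse(candidate: string): unknown {
  try {
    return JSON.parse(candidate);
  } catch {
    return undefined;
  }
}

console.log(tryParse(viaSplit)); // the split cuts the object at the first inner fence
console.log(tryParse(viaRegex)); // the greedy match keeps the whole object
```

With this input the split-based extraction yields a truncated fragment that is not valid JSON, while the greedy match returns the complete object.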
Well, I know, bot, that's what I linked. I am sharing this because I think this is unexpected behavior, and it took me an hour to figure out what was happening, because I didn't expect that LangChain would just split the string if there is something in between. I am not sure how to solve this, but maybe at least the error message could be improved, or one could try to first parse "as is" and only then remove those backticks.
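The "parse as is first" idea could be sketched roughly as follows. This is not LangChain.js code: extractJsonLenient is a hypothetical helper, and it assumes the fence, if present, wraps the entire text with nothing before or after it.

```typescript
// Three backticks, built with repeat() only so this example can sit in a fenced block.
const TICKS = "`".repeat(3);

// Hypothetical helper, not part of LangChain.js.
function extractJsonLenient(text: string): unknown {
  const trimmed = text.trim();
  try {
    // 1. Try the text unchanged, so backticks inside string values are harmless.
    return JSON.parse(trimmed);
  } catch {
    // 2. Only then strip one leading fence (optionally tagged "json") and one
    //    trailing fence, leaving any fences inside the JSON untouched.
    const opening = new RegExp(`^${TICKS}(?:json)?\\s*`);
    const closing = new RegExp(`\\s*${TICKS}$`);
    return JSON.parse(trimmed.replace(opening, "").replace(closing, ""));
  }
}
```

Because only the outermost fence is removed, both a bare JSON object containing backticks and a fenced object containing backticks parse. Text with prose around the fence would still throw under this assumption.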
I'm having the same issue. I'm trying to build an extraction chain, but sometimes, even with the same prompt, it throws the parse error. A not-so-good workaround that I'm using until it's fixed is retry logic. For example:
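The snippet from this comment is not available here. A minimal sketch of what such retry logic might look like, with withRetry as a hypothetical helper name and an arbitrary default of three attempts:

```typescript
// Hypothetical helper: re-runs an async operation until it succeeds or the
// attempt budget is used up, then rethrows the last error.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Remember the failure (e.g. a parse error) and try again.
      lastError = error;
    }
  }
  throw lastError;
}
```

The operation passed in would be whatever call produces and parses the model output. Retrying only hides the problem when the model happens to emit different text on the next attempt; it does not fix the split itself.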
Hi, my group and I are taking a look at this (as part of a University of Toronto course project). We're currently investigating whether the proposed solution works.
Hello, I'm working with @tlandart on this. It seems @hnykda's idea could work, though we think it is probably cleaner to handle this case explicitly in the try block by only splitting when the triple backticks are outside of the curly brackets. It could look something like this:
...
try {
  const pattern = /```\s*\{/;
  const json = pattern.test(text)
    ? text.trim().split(/```(?:json)?/)[1]
    : text.trim();
  return await this.schema.parseAsync(JSON.parse(json));
}
...
We have tested this approach with some possible values and it seems to behave as expected, but if there is anything we have overlooked, please let us know. We're first-time contributors, so any comments on this would be appreciated!
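One case that may be worth adding to those test values: the pattern in the snippet allows only whitespace between the fence and the opening brace, so a fence that carries a json tag would not match it. The sketch below compares that pattern with a variant that also accepts the tag. The variant and the extract helper are assumptions for illustration, not the contributors' code.

```typescript
// Three backticks, built with repeat() only so this example can sit in a fenced block.
const MARK = "`".repeat(3);

// Pattern from the snippet above: fence, optional whitespace, then "{".
const original = new RegExp(`${MARK}\\s*\\{`);
// Variant (an assumption): also allow an optional "json" tag after the fence.
const withTag = new RegExp(`${MARK}(?:json)?\\s*\\{`);

// Hypothetical helper mirroring the proposed branch: split only if the pattern matches.
function extract(text: string, pattern: RegExp): string {
  return pattern.test(text)
    ? text.trim().split(new RegExp(`${MARK}(?:json)?`))[1]
    : text.trim();
}

// Made-up inputs: a tagged fence, and bare JSON with backticks inside a string value.
const tagged = `${MARK}json\n{"a": 1}\n${MARK}`;
const bareNested = JSON.stringify({ a: `use ${MARK}code${MARK}` });

console.log(original.test(tagged)); // tagged fence is not recognized
console.log(withTag.test(tagged)); // tagged fence is recognized
console.log(JSON.parse(extract(bareNested, withTag))); // bare JSON is left unsplit
```

Either pattern leaves bare JSON with inner backticks alone. A fenced object whose string values also contain fences would still be split in the middle, because the split itself is unchanged.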
Looks valid to me. Including some tests in your PR would help. 🤷
Checked other resources
Example Code
Error Message and Stack Trace (if applicable)
Description
StructuredOutputParser fails on that scenario, partly because this will just randomly split the string in the middle...
System Info