Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(core): Add support for boolean metadata attributes in FunctionalTranslator #7407

Merged

Conversation

nick-w-nick
Copy link
Contributor

While adding support for boolean as a type in the castValue function to allow them to be used in metadata filters that are created within a StructuredQuery, I noticed that this would cause the FunctionalTranslator implementation to have a gap in functionality, where booleans were technically supported as filter inputs, but would not be handled correctly during the query translation without modifications.

To remedy this, I've added a new utility method called getAllowedComparatorsForType which returns the allowed comparators that can be used with a given type. This enables guards around using incorrect combinations of values and comparators.

Another added benefit of this utility is that we can easily get the list of comparators for each expected type, making it possible to generate tests for every comparator for each data type automatically. I've gone ahead and added an MVP set of tests for the FunctionalTranslator, limited to the getAllowedComparatorsForType and visitComparison methods, to not overcrowd the PR with unrelated changes.

@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Dec 19, 2024
Copy link

vercel bot commented Dec 19, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
langchainjs-docs ✅ Ready (Inspect) Visit Preview Dec 30, 2024 2:52am
1 Skipped Deployment
Name Status Preview Comments Updated (UTC)
langchainjs-api-refs ⬜️ Ignored (Inspect) Dec 30, 2024 2:52am

@dosubot dosubot bot added the auto:improvement Medium size change to existing code to handle new use-cases label Dec 19, 2024
Comment on lines +36 to +39
const sq6 = new StructuredQuery(
"",
new Comparison(Comparators.eq, "isReleased", true)
);
Copy link
Contributor Author

@nick-w-nick nick-w-nick Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't technically a required test addition since I didn't modify the actual Comparison logic, but since I added support for boolean types in Comparison, I thought it was worthwhile to include it here just for posterity.

@@ -17,6 +17,7 @@ const correctQuery = new StructuredQuery(
]),
new Comparison(Comparators.lt, "length", 180),
new Comparison(Comparators.eq, "genre", "pop"),
new Comparison(Comparators.eq, "hasLyrics", true),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as this comment

Comment on lines +6 to +9
describe("FunctionalTranslator", () => {
const translator = new FunctionalTranslator();

describe("getAllowedComparatorsForType", () => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in the PR description, I only added the bare minimum tests needed to ensure the modification to the existing functionality did not break the visitComparison logic.

const value = inputValuesByAttribute[attribute];
const validDocuments = validDocumentsByComparator[comparator];
for (const validDocument of validDocuments) {
test(`${value} -> ${comparator} -> ${validDocument.metadata[attribute]}`, () => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a test fails, the generated test name looks like input_value -> comparator -> document_value, like shown in the below screenshot for a failed eq comparator check, where the input value is value and the document value is invalid-value.

image

@jacoblee93 jacoblee93 changed the title feat(langchain-core): Add support for boolean metadata attributes in FunctionalTranslator feat(core): Add support for boolean metadata attributes in FunctionalTranslator Dec 19, 2024
Copy link
Collaborator

@jacoblee93 jacoblee93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable to me!

@dosubot dosubot bot added the lgtm PRs that are ready to be merged as-is label Dec 19, 2024
@jacoblee93
Copy link
Collaborator

@nick-w-nick seems like it's breaking some types in community:

Error: src/structured_query/supabase.ts(225,7): error TS2345: Argument of type 'string | number | boolean' is not assignable to parameter of type 'string | number'.
  Type 'boolean' is not assignable to type 'string | number'.
Error: src/structured_query/supabase.ts(243,44): error TS2345: Argument of type 'string | number | boolean' is not assignable to parameter of type 'string | number'.
Error: src/structured_query/vectara.ts(122,7): error TS2345: Argument of type 'string | number | boolean' is not assignable to parameter of type 'Value'.

Any way we can make this backwards compatible?

@nick-w-nick
Copy link
Contributor Author

@jacoblee93 The type issues have been resolved, could you re-run the CI? Thanks!

@@ -94,7 +94,7 @@ export class QueryTransformer {
return new Comparison(
funcName as Comparator,
traverse(node.args[0]) as string,
traverse(node.args[1]) as string | number
traverse(node.args[1]) as string | number | boolean
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this change is required, will this will mean that langchain will no longer support older versions of core without your other changes?

Is there a way we can do this while only changing core?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After looking into this, TraverseType already expects boolean, it's just that the typecasting on traverse(node.args[1]) is forcibly setting it to expect string | number.

To allow backward compatibility, I've reverted the typecasting to be what it was originally, while still allowing boolean return values from traverse.

@jacoblee93 jacoblee93 merged commit 53feade into langchain-ai:main Dec 30, 2024
36 checks passed
@jacoblee93
Copy link
Collaborator

Thank you for your patience! Will do a release today

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto:improvement Medium size change to existing code to handle new use-cases lgtm PRs that are ready to be merged as-is size:L This PR changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants