Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(core): Add support for boolean metadata attributes in FunctionalTranslator #7407

Merged
Merged
Show file tree
Hide file tree
Changes from 22 commits
Commits
Show all changes
34 commits
Select commit Hold shift + click to select a range
f967361
added isBoolean function to allow the value to be casted
nick-w-nick Dec 19, 2024
bf549bc
added function to get allowed comparators based on input type
nick-w-nick Dec 19, 2024
1b97531
updated ValueType types for applicable operators
nick-w-nick Dec 19, 2024
2f43b70
added boolean to visitor comparison result type
nick-w-nick Dec 19, 2024
f83470d
added test for isBoolean
nick-w-nick Dec 19, 2024
1a92845
Merge branch 'main' into feature/add-boolean-attribute-support
nick-w-nick Dec 19, 2024
cb8c08b
added boolean to Comparison class
nick-w-nick Dec 19, 2024
74b643a
updated query parser test to include boolean comparisons
nick-w-nick Dec 19, 2024
9d8c25e
updated query chain test to include a boolean attribute and comparison
nick-w-nick Dec 19, 2024
15c46bd
added boolean type to query transform traversal
nick-w-nick Dec 19, 2024
00e7721
updated function to accept an input type rather than the value itself
nick-w-nick Dec 19, 2024
db0b4ed
fixed conditional to throw error if comparator is not valid with the …
nick-w-nick Dec 19, 2024
f55a5ce
added basic tests for FunctionalTranslator
nick-w-nick Dec 19, 2024
379e466
fixed typo
nick-w-nick Dec 19, 2024
10523ae
removed extra symbol
nick-w-nick Dec 19, 2024
d538d8b
lint fixes
nick-w-nick Dec 19, 2024
2bd18a7
readded symbol
nick-w-nick Dec 19, 2024
8fb1fcd
added clarifying comments to test fakes
nick-w-nick Dec 19, 2024
fc79a32
updated test to get input values by attribute instead of type
nick-w-nick Dec 19, 2024
56ee535
moved variable into test scenario
nick-w-nick Dec 19, 2024
af87e11
added clarifying comments
nick-w-nick Dec 19, 2024
d3376c4
refactored repeated tests into function that generates the tests
nick-w-nick Dec 19, 2024
f432c9b
updated test value for consistency
nick-w-nick Dec 19, 2024
824d090
Format
jacoblee93 Dec 19, 2024
eeaafd4
added boolean to vectara types
nick-w-nick Dec 20, 2024
21f1504
added supabase boolean column functionality
nick-w-nick Dec 20, 2024
dd2863e
Merge branch 'main' into feature/add-boolean-attribute-support
nick-w-nick Dec 20, 2024
21d5f7b
Revert "added boolean to vectara types"
nick-w-nick Dec 21, 2024
6ff9229
Revert "added supabase boolean column functionality"
nick-w-nick Dec 21, 2024
83a9db6
made comparison generic with a backwards compatible default
nick-w-nick Dec 21, 2024
62eac92
updated typecasting
nick-w-nick Dec 26, 2024
b1ead94
formatting fixes
nick-w-nick Dec 26, 2024
356be4e
Merge branch 'main' into feature/add-boolean-attribute-support
nick-w-nick Dec 27, 2024
79c243e
Fix missing import
jacoblee93 Dec 30, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion langchain-core/src/structured_query/base.ts
Original file line number Diff line number Diff line change
Expand Up @@ -100,7 +100,7 @@ export class BasicTranslator<
this.allowedComparators.indexOf(func as Comparator) === -1
) {
throw new Error(
`Comparator ${func} not allowed. Allowed operators: ${this.allowedComparators.join(
`Comparator ${func} not allowed. Allowed comparators: ${this.allowedComparators.join(
jacoblee93 marked this conversation as resolved.
Show resolved Hide resolved
", "
)}`
);
Expand Down
31 changes: 28 additions & 3 deletions langchain-core/src/structured_query/functional.ts
Original file line number Diff line number Diff line change
Expand Up @@ -17,8 +17,8 @@ import { castValue, isFilterEmpty } from "./utils.js";
* the result of a comparison operation.
*/
type ValueType = {
eq: string | number;
ne: string | number;
eq: string | number | boolean;
ne: string | number | boolean;
lt: string | number;
lte: string | number;
gt: string | number;
Expand Down Expand Up @@ -65,7 +65,29 @@ export class FunctionalTranslator extends BaseTranslator {
formatFunction(): string {
throw new Error("Not implemented");
}


/**
* Returns the allowed comparators for a given data type.
* @param input The input value to get the allowed comparators for.
* @returns An array of allowed comparators for the input data type.
*/
getAllowedComparatorsForType(inputType: string): Comparator[] {
switch (inputType) {
case "string": {
return [Comparators.eq, Comparators.ne, Comparators.gt, Comparators.gte, Comparators.lt, Comparators.lte];
}
case "number": {
return [Comparators.eq, Comparators.ne, Comparators.gt, Comparators.gte, Comparators.lt, Comparators.lte];
}
case "boolean": {
return [Comparators.eq, Comparators.ne];
}
default: {
throw new Error(`Unsupported data type: ${inputType}`);
}
}
}

/**
* Returns a function that performs a comparison based on the provided
* comparator.
Expand Down Expand Up @@ -159,6 +181,9 @@ export class FunctionalTranslator extends BaseTranslator {
const { comparator, attribute, value } = comparison;
const undefinedTrue = [Comparators.ne];
if (this.allowedComparators.includes(comparator)) {
if (!this.getAllowedComparatorsForType(typeof value).includes(comparator)) {
throw new Error(`'${comparator}' comparator not allowed to be used with ${typeof value}`);
}
const comparatorFunction = this.getComparatorFunction(comparator);
return (document: Document) => {
const documentValue = document.metadata[attribute];
Expand Down
4 changes: 2 additions & 2 deletions langchain-core/src/structured_query/ir.ts
Original file line number Diff line number Diff line change
Expand Up @@ -82,7 +82,7 @@ export type VisitorOperationResult = {
*/
export type VisitorComparisonResult = {
[attr: string]: {
[comparator: string]: string | number;
[comparator: string]: string | number | boolean;
};
};

Expand Down Expand Up @@ -155,7 +155,7 @@ export class Comparison extends FilterDirective {
constructor(
public comparator: Comparator,
public attribute: string,
public value: string | number
public value: string | number | boolean
) {
super();
}
Expand Down
246 changes: 246 additions & 0 deletions langchain-core/src/structured_query/tests/functional.test.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,246 @@
import { test, expect } from "@jest/globals";
import { Document } from "../../documents/document.js";
import { FunctionalTranslator } from "../functional.js";
import { Comparators, Visitor } from "../ir.js";

describe("FunctionalTranslator", () => {
const translator = new FunctionalTranslator();

describe("getAllowedComparatorsForType", () => {
Comment on lines +6 to +9
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in the PR description, I only added the bare minimum tests needed to ensure the modification to the existing functionality did not break the visitComparison logic.

test("string", () => {
expect(translator.getAllowedComparatorsForType("string")).toEqual([
Comparators.eq,
Comparators.ne,
Comparators.gt,
Comparators.gte,
Comparators.lt,
Comparators.lte,
]);
});
test("number", () => {
expect(translator.getAllowedComparatorsForType("number")).toEqual([
Comparators.eq,
Comparators.ne,
Comparators.gt,
Comparators.gte,
Comparators.lt,
Comparators.lte,
]);
});
test("boolean", () => {
expect(translator.getAllowedComparatorsForType("boolean")).toEqual([
Comparators.eq,
Comparators.ne,
]);
});
test("unsupported", () => {
expect(() =>
translator.getAllowedComparatorsForType("unsupported")
).toThrow("Unsupported data type: unsupported");
});
});

describe("visitComparison", () => {
describe("returns true or false for valid comparisons", () => {
const attributesByType = {
string: "stringValue",
number: "numberValue",
boolean: "booleanValue",
};

const inputValuesByAttribute: { [key in string]: string | number | boolean } = {
stringValue: "value",
numberValue: 1,
booleanValue: true,
};

// documents that will match against the comparison
const validDocumentsByComparator: { [key in string]: Document<Record<string, unknown>>[] } = {
[Comparators.eq]: [
new Document({
pageContent: "",
metadata: {
stringValue: "value",
numberValue: 1,
booleanValue: true,
},
}),
],
[Comparators.ne]: [
new Document({
pageContent: "",
metadata: {
stringValue: "not-value",
numberValue: 0,
booleanValue: false,
},
}),
],
[Comparators.gt]: [
new Document({
pageContent: "",
metadata: {
stringValue: "valueee",
numberValue: 2,
booleanValue: true,
},
}),
],
[Comparators.gte]: [
// test for greater than
new Document({
pageContent: "",
metadata: {
stringValue: "valueee",
numberValue: 2,
booleanValue: true,
},
}),
// test for equal to
new Document({
pageContent: "",
metadata: {
stringValue: "value",
numberValue: 1,
booleanValue: true,
},
}),
],
[Comparators.lt]: [
new Document({
pageContent: "",
metadata: {
stringValue: "val",
numberValue: 0,
booleanValue: true,
},
}),
],
[Comparators.lte]: [
// test for less than
new Document({
pageContent: "",
metadata: {
stringValue: "val",
numberValue: 0,
booleanValue: true,
},
}),
// test for equal to
new Document({
pageContent: "",
metadata: {
stringValue: "value",
numberValue: 1,
booleanValue: true,
},
}),
],
};

// documents that will not match against the comparison
const invalidDocumentsByComparator: { [key in string]: Document<Record<string, unknown>>[] } = {
[Comparators.eq]: [
new Document({
pageContent: "",
metadata: {
stringValue: "not-value",
numberValue: 0,
booleanValue: false,
},
}),
],
[Comparators.ne]: [
new Document({
pageContent: "",
metadata: {
stringValue: "value",
numberValue: 1,
booleanValue: true,
},
}),
],
[Comparators.gt]: [
new Document({
pageContent: "",
metadata: {
stringValue: "value",
numberValue: 1,
booleanValue: true,
},
}),
],
[Comparators.gte]: [
new Document({
pageContent: "",
metadata: {
stringValue: "val",
numberValue: 0,
booleanValue: false,
},
}),
],
[Comparators.lt]: [
new Document({
pageContent: "",
metadata: {
stringValue: "valueee",
numberValue: 2,
booleanValue: true,
},
}),
],
[Comparators.lte]: [
new Document({
pageContent: "",
metadata: {
stringValue: "valueee",
numberValue: 2,
booleanValue: true,
},
}),
],
};

function generateComparatorTestsForType(type: "string" | "number" | "boolean") {
const comparators = translator.getAllowedComparatorsForType(type);
for (const comparator of comparators) {
const attribute = attributesByType[type];
const value = inputValuesByAttribute[attribute];
const validDocuments = validDocumentsByComparator[comparator];
for (const validDocument of validDocuments) {
test(`${value} -> ${comparator} -> ${validDocument.metadata[attribute]}`, () => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a test fails, the generated test name looks like input_value -> comparator -> document_value, like shown in the below screenshot for a failed eq comparator check, where the input value is value and the document value is invalid-value.

image

const comparison = translator.visitComparison({
attribute,
comparator,
value,
exprName: "Comparison",
accept: (visitor: Visitor) => visitor,
});
const result = comparison(validDocument);
expect(result).toBeTruthy();
});
}
const invalidDocuments = invalidDocumentsByComparator[comparator];
for (const invalidDocument of invalidDocuments) {
test(`${value} -> ${comparator} -> ${invalidDocument.metadata[attribute]}`, () => {
const comparison = translator.visitComparison({
attribute,
comparator,
value,
exprName: "Comparison",
accept: (visitor: Visitor) => visitor,
});
const result = comparison(invalidDocument);
expect(result).toBeFalsy();
});
}
}
}

generateComparatorTestsForType("string");
generateComparatorTestsForType("number");
generateComparatorTestsForType("boolean");
});
});
});
9 changes: 8 additions & 1 deletion langchain-core/src/structured_query/tests/utils.test.ts
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
/* eslint-disable no-process-env */
import { test, expect } from "@jest/globals";
import { castValue, isFloat, isInt, isString } from "../utils.js";
import { castValue, isFloat, isInt, isString, isBoolean } from "../utils.js";

test("Casting values correctly", () => {
const stringString = [
Expand Down Expand Up @@ -28,6 +28,8 @@ test("Casting values correctly", () => {

const floatFloat = ["1.1", 2.2, 3.3];

const booleanBoolean = [true, false];

stringString.map(castValue).forEach((value) => {
expect(typeof value).toBe("string");
expect(isString(value)).toBe(true);
Expand All @@ -54,4 +56,9 @@ test("Casting values correctly", () => {
expect(typeof value).toBe("number");
expect(isFloat(value)).toBe(true);
});

booleanBoolean.map(castValue).forEach((value) => {
expect(typeof value).toBe("boolean");
expect(isBoolean(value)).toBe(true);
});
});
11 changes: 10 additions & 1 deletion langchain-core/src/structured_query/utils.ts
Original file line number Diff line number Diff line change
Expand Up @@ -72,20 +72,29 @@ export function isString(value: unknown): boolean {
);
}

/**
* Checks if the provided value is a boolean.
*/
export function isBoolean(value: unknown): boolean {
return typeof value === "boolean";
}

/**
* Casts a value that might be string or number to actual string or number.
* Since LLM might return back an integer/float as a string, we need to cast
* it back to a number, as many vector databases can't handle number as string
* values as a comparator.
*/
export function castValue(input: unknown): string | number {
export function castValue(input: unknown): string | number | boolean {
let value;
if (isString(input)) {
value = input as string;
} else if (isInt(input)) {
value = parseInt(input as string, 10);
} else if (isFloat(input)) {
value = parseFloat(input as string);
} else if (isBoolean(input)) {
value = Boolean(input);
} else {
throw new Error("Unsupported value type");
}
Expand Down
2 changes: 1 addition & 1 deletion langchain/src/chains/query_constructor/parser.ts
Original file line number Diff line number Diff line change
Expand Up @@ -94,7 +94,7 @@ export class QueryTransformer {
return new Comparison(
funcName as Comparator,
traverse(node.args[0]) as string,
traverse(node.args[1]) as string | number
traverse(node.args[1]) as string | number | boolean
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this change is required, will this will mean that langchain will no longer support older versions of core without your other changes?

Is there a way we can do this while only changing core?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After looking into this, TraverseType already expects boolean, it's just that the typecasting on traverse(node.args[1]) is forcibly setting it to expect string | number.

To allow backward compatibility, I've reverted the typecasting to be what it was originally, while still allowing boolean return values from traverse.

);
}
throw new Error("Comparator must have exactly 2 arguments");
Expand Down
Loading