Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature: batch evaluations backend #2 #164

Merged
merged 1 commit into from
Sep 11, 2024

Conversation

geclos
Copy link
Collaborator

@geclos geclos commented Sep 11, 2024

This commit implements the full cycle of a batch evaluation. Now users can run batch evaluations on top of their prompts with dataset data.

@geclos geclos force-pushed the feature/batch_evaluations_backend_2 branch 6 times, most recently from c502a49 to c596eb3 Compare September 11, 2024 14:26
@@ -174,55 +171,5 @@ describe('POST /run', () => {
id: '0',
})
})

it('enqueue the document log', async () => {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

goes via events system from the service now, as it should have been

parameters: '',
config: '',
duration: '',
cost: '',
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all the parameters that do not come from the providerlog are a pita to fetch so I'm just leaving them blank for now

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember I made an object with functions to generate all parameters from a documentLog, and that it was also used to get the name of these parameters without having to repeat them somewhere else. Any idea of what happened to it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:\ no idea tbh

return [key, JSON.parse(value)]
} catch (e) {
return [key, value]
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to improve the compiler when a parameter is a JSON @csansoon . We need compiler parameters to accept jsons so that users can perform operations over them while writing their prompts (e.g messages.user.first), but if a json is gonna get interpolated into the final resolved content we should stringify it first otherwise it gets stringified to [Object object]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's easy! :D
We'll have to decide whether to interpolate it with a formatting or not

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

standard JSON.stringify(data, null, 2) will work ok for now

@@ -137,7 +137,7 @@ export function DocumentLogMetadata({
trigger={
<div className='flex flex-row items-center gap-x-1'>
<Text.H5 color='foregroundMuted'>
{formatCostInMillicents(documentLog.cost_in_millicents ?? 0)}
{formatCostInMillicents(documentLog.costInMillicents ?? 0)}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

always camel case in JS please

documentUuid: string
draft?: Commit
}): Promise<TypedResult<DocumentLogWithMetadata[], Error>> {
return computeDocumentLogsWithMetadata(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just moved code around, this fetch is not returning the original repository scope which should be one of the constraints of repositories so I moved to its own service. Will probably remove this repository method at some point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is only used once. Move it now or at least put a TODO. Same effort

@@ -17,12 +24,13 @@ export const documentLogs = latitudeSchema.table(
onUpdate: 'cascade',
}),
resolvedContent: text('resolved_content').notNull(),
parameters: json('parameters').notNull(),
parameters: jsonb('parameters').$type<Record<string, unknown>>().notNull(),
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

always use jsonb (better performance and less disk space)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the difference? Is json stored just as a string?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how json is stored tbh, jsonb is stored in binary format and is generally understood to be faster and occupy less space

customIdentifier: text('custom_identifier'),
duration: bigint('duration', { mode: 'number' }).notNull(),
...timestamps(),
},
(table) => ({
documentLogUuidIdx: index('document_log_uuid_idx').on(table.documentUuid),
commitIdIdx: index('commit_id_idx').on(table.commitId),
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't forget your indices please

duration: await duration,
},
}),
),
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made documentlog creation transactional with the chain run so that you know if the response succeeded a document log was also created

return draft
? or(isNotNull(commits.mergedAt), eq(commits.id, draft.id))
: isNotNull(commits.mergedAt)
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nothing fancy, I DRYed some of the code in that long method from DocumentVersionsRepository since I need it in two places now

if (result.error) return result

result.value.response.then((response) => {
publisher.publishLater({
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this should be also transactional with the evaluation run, it probably should

@@ -42,11 +43,16 @@ export async function createProviderLog(
apiKeyId,
documentLogUuid,
generatedAt,
costInMillicents,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is only so that we can inject it from test factories tbh

system: formatRoleMessages(MessageRole.system),
assistant: formatRoleMessages(MessageRole.assistant),
}
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this formats providerlogs data into the parameters that evaluations expect as described in this notion

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!


return { batchId }
},
return { batchId }
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nothing really changed here you can skip it

console.error(err)
}

throw err
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added some logging in dev

Comment on lines -17 to +13
cost_in_millicents: number | null
durationInMs: number | null
costInMillicents: number | null
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, much better!

@geclos geclos force-pushed the feature/batch_evaluations_backend_2 branch from c596eb3 to 74bf958 Compare September 11, 2024 14:42
csansoon
csansoon previously approved these changes Sep 11, 2024
Copy link
Contributor

@csansoon csansoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments, but looking good

@@ -14,7 +14,7 @@ export const runPromptAction = authProcedure
.input(
z.object({
prompt: z.string(),
parameters: z.object({ messages: z.string(), last_message: z.string() }),
parameters: z.record(z.any()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why any?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be any parameters

andresgutgon
andresgutgon previously approved these changes Sep 11, 2024
@geclos geclos dismissed stale reviews from andresgutgon and csansoon via 72c22b3 September 11, 2024 15:23
@geclos geclos force-pushed the feature/batch_evaluations_backend_2 branch from 74bf958 to 72c22b3 Compare September 11, 2024 15:23
This commit implements the full cycle of a batch evaluation. Now users
can run batch evaluations on top of their prompts with dataset data.
@geclos geclos force-pushed the feature/batch_evaluations_backend_2 branch from 72c22b3 to a44ba69 Compare September 11, 2024 15:27
@geclos geclos merged commit 6daddd4 into main Sep 11, 2024
3 checks passed
@geclos geclos deleted the feature/batch_evaluations_backend_2 branch September 11, 2024 15:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants