-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feature: batch evaluations backend #2 #164
Conversation
c502a49
to
c596eb3
Compare
@@ -174,55 +171,5 @@ describe('POST /run', () => { | |||
id: '0', | |||
}) | |||
}) | |||
|
|||
it('enqueue the document log', async () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
goes via events system from the service now, as it should have been
parameters: '', | ||
config: '', | ||
duration: '', | ||
cost: '', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all the parameters that do not come from the providerlog are a pita to fetch so I'm just leaving them blank for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember I made an object with functions to generate all parameters from a documentLog
, and that it was also used to get the name of these parameters without having to repeat them somewhere else. Any idea of what happened to it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:\ no idea tbh
return [key, JSON.parse(value)] | ||
} catch (e) { | ||
return [key, value] | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need to improve the compiler when a parameter is a JSON @csansoon . We need compiler parameters to accept jsons so that users can perform operations over them while writing their prompts (e.g messages.user.first
), but if a json is gonna get interpolated into the final resolved content we should stringify it first otherwise it gets stringified to [Object object]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's easy! :D
We'll have to decide whether to interpolate it with a formatting or not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
standard JSON.stringify(data, null, 2)
will work ok for now
@@ -137,7 +137,7 @@ export function DocumentLogMetadata({ | |||
trigger={ | |||
<div className='flex flex-row items-center gap-x-1'> | |||
<Text.H5 color='foregroundMuted'> | |||
{formatCostInMillicents(documentLog.cost_in_millicents ?? 0)} | |||
{formatCostInMillicents(documentLog.costInMillicents ?? 0)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
always camel case in JS please
documentUuid: string | ||
draft?: Commit | ||
}): Promise<TypedResult<DocumentLogWithMetadata[], Error>> { | ||
return computeDocumentLogsWithMetadata( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just moved code around, this fetch is not returning the original repository scope which should be one of the constraints of repositories so I moved to its own service. Will probably remove this repository method at some point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is only used once. Move it now or at least put a TODO. Same effort
@@ -17,12 +24,13 @@ export const documentLogs = latitudeSchema.table( | |||
onUpdate: 'cascade', | |||
}), | |||
resolvedContent: text('resolved_content').notNull(), | |||
parameters: json('parameters').notNull(), | |||
parameters: jsonb('parameters').$type<Record<string, unknown>>().notNull(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
always use jsonb (better performance and less disk space)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the difference? Is json
stored just as a string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how json is stored tbh, jsonb is stored in binary format and is generally understood to be faster and occupy less space
customIdentifier: text('custom_identifier'), | ||
duration: bigint('duration', { mode: 'number' }).notNull(), | ||
...timestamps(), | ||
}, | ||
(table) => ({ | ||
documentLogUuidIdx: index('document_log_uuid_idx').on(table.documentUuid), | ||
commitIdIdx: index('commit_id_idx').on(table.commitId), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't forget your indices please
duration: await duration, | ||
}, | ||
}), | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made documentlog creation transactional with the chain run so that you know if the response succeeded a document log was also created
return draft | ||
? or(isNotNull(commits.mergedAt), eq(commits.id, draft.id)) | ||
: isNotNull(commits.mergedAt) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nothing fancy, I DRYed some of the code in that long method from DocumentVersionsRepository since I need it in two places now
if (result.error) return result | ||
|
||
result.value.response.then((response) => { | ||
publisher.publishLater({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this should be also transactional with the evaluation run, it probably should
@@ -42,11 +43,16 @@ export async function createProviderLog( | |||
apiKeyId, | |||
documentLogUuid, | |||
generatedAt, | |||
costInMillicents, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is only so that we can inject it from test factories tbh
system: formatRoleMessages(MessageRole.system), | ||
assistant: formatRoleMessages(MessageRole.assistant), | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this formats providerlogs data into the parameters that evaluations expect as described in this notion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
|
||
return { batchId } | ||
}, | ||
return { batchId } | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nothing really changed here you can skip it
console.error(err) | ||
} | ||
|
||
throw err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added some logging in dev
cost_in_millicents: number | null | ||
durationInMs: number | null | ||
costInMillicents: number | null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, much better!
c596eb3
to
74bf958
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comments, but looking good
@@ -14,7 +14,7 @@ export const runPromptAction = authProcedure | |||
.input( | |||
z.object({ | |||
prompt: z.string(), | |||
parameters: z.object({ messages: z.string(), last_message: z.string() }), | |||
parameters: z.record(z.any()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why any?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be any parameters
74bf958
to
72c22b3
Compare
This commit implements the full cycle of a batch evaluation. Now users can run batch evaluations on top of their prompts with dataset data.
72c22b3
to
a44ba69
Compare
This commit implements the full cycle of a batch evaluation. Now users can run batch evaluations on top of their prompts with dataset data.