Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Evaluation dashboard #146

Merged
merged 1 commit into from
Sep 11, 2024
Merged

Evaluation dashboard #146

merged 1 commit into from
Sep 11, 2024

Conversation

csansoon
Copy link
Contributor

@csansoon csansoon commented Sep 9, 2024

image

@csansoon csansoon added the 🚧 wip Work in progress label Sep 9, 2024
@csansoon csansoon force-pushed the evaluation-dashboard branch 5 times, most recently from 0ba2fe4 to 4125fed Compare September 9, 2024 13:40
Comment on lines 71 to 141
onSelect={() =>
navigate.push(
ROUTES.projects
.detail({ id: document.projectId })
.commits.detail({ uuid: HEAD_COMMIT })
.documents.detail({ uuid: document.documentUuid }).logs.root, // TODO: Navigate to the document evaluation details page
)
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Navigating to document logs as a mock rn. It will redirect to the document evaluation details page when it's available.

Copy link
Collaborator

@geclos geclos Sep 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's available! check routes in main

@csansoon csansoon force-pushed the evaluation-dashboard branch 2 times, most recently from fa19382 to 6dd0730 Compare September 9, 2024 14:19
Comment on lines 67 to 70
async getConnectedDocumentsWithMetadata(
evaluationId: number,
): Promise<TypedResult<ConnectedDocumentWithMetadata[], LatitudeError>> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Attention: black magic ahead. Complex sql query to fetch aggregated data from connected evaluations. The idea being: Display all documents connected to this evaluation, and the costs and metadata of the results. The problem is that the relations from connectedEvaluations to this info is not really straight-forward.

@csansoon csansoon force-pushed the evaluation-dashboard branch 4 times, most recently from 1b39558 to 86fd8e0 Compare September 9, 2024 16:02
@csansoon csansoon marked this pull request as ready for review September 9, 2024 16:12
@csansoon csansoon removed the 🚧 wip Work in progress label Sep 9, 2024
@csansoon csansoon force-pushed the evaluation-dashboard branch from 86fd8e0 to 91a2107 Compare September 10, 2024 07:39
<Text.H4 color='foregroundMuted' noWrap>
{promptPath}
{Boolean(promptPath) && '/'}
<Text.H4 noWrap>{promptName}</Text.H4>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would truncate this name after a certain number of characters, the preview you shared doesn't look very good with very long path names 😅

</TableCell>
<TableCell>
{isProjectsLoading ? (
<Skeleton className='w-full h-4 bg-muted animate-pulse' />
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice 👌🏼

.from(documentVersionsScope.scope)
.where(isNotNull(documentVersionsScope.scope.mergedAt)) // Only account for merged documents
.groupBy(documentVersionsScope.scope.documentUuid)
.as('latest_document_versions')
Copy link
Collaborator

@geclos geclos Sep 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm this should be in the document versions repo but i understand why it ain't straightforward to put it there... anyhow let's continue and we'll think on how to improve our repos later on

.where(eq(this.scope.evaluationId, evaluationId))

return Result.ok(result.filter((r) => r.deletedAt == null)) // Only show non-removed documents
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this merits a test case tbh

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😭

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Venga que cursor te los escribe solo casi

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

☝️ Rewrote the queries to add a modal value, and simplify the overall query complexity. Working on tests now!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! Had to modify some factories and a few other functions, but the test env is a bit cleaner now!

@csansoon csansoon force-pushed the evaluation-dashboard branch 6 times, most recently from f485efe to d4bfb94 Compare September 10, 2024 14:44
): PromisedResult<ConnectedEvaluation[], Error> {
return Transaction.call(
async (tx): PromisedResult<ConnectedEvaluation[], Error> => {
const documentVersionsScope = new DocumentVersionsRepository(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to put this inside a transaction. Keep it outside.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to use a transaction, because it creates new evaluations from the templates before creating the connections! If anything fails in between those two actions, I'd like to undo the changes.

@@ -43,3 +44,29 @@ export async function createEvaluation(
return Result.ok({ ...result[0]!, metadata: metadataTable[0]! })
}, db)
}

export async function importLlmAsJudgeEvaluation(
Copy link
Contributor

@andresgutgon andresgutgon Sep 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put this method where it is used. No need to put in create service

) {
// Note: otherwise the comparison fails with "invalid input syntax for type uuid: 'non-existent-uuid'""
return false
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't do this. Wrap in a transaction and it will return a result error. This is a legit SQL error

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I don't want to return false for any error! I only want to return false if there's not any document in the scope with the given uuid. If there is an error from any other reason, I still want the error to be thrown.

eq(selectedDocuments.documentUuid, modalValueCount.documentUuid),
)

return Result.ok(result.filter((r) => r.deletedAt == null)) // Only show non-removed documents
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WOW 😂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! 😂

andresgutgon
andresgutgon previously approved these changes Sep 10, 2024
workspace: ctx.workspace,
documentUuid: input.documentUuid,
evaluationUuids: input.evaluationUuids,
templateIds: input.templateIds,
Copy link
Collaborator

@geclos geclos Sep 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Services should always receive model instances, not identifiers, and tenancy checks should live at the boundaries of the application (actions, in this case) 🙏🏼 This is mainly to ensure developers always implement tenancy when adding public endpoints.

I know you are passing the workspace and probably implementing tenancy through that internally but it breaks the pattern and sets us up for the next dev to implement a service without a proper tenancy check.

await documentVersionsScope.existsDocumentWithUuid(documentUuid)
if (!documentExists) {
return Result.error(new NotFoundError('Document not found'))
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not this service responsibility

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This service had tests that checked that it failed when the document didn't exist

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, not. Actually the tests were about the server action, not the service...

`The following evaluations were not found: ${missingEvaluationUuids?.join(', ')}`,
),
)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

coolio but it belongs to the action not this service

Copy link
Collaborator

@geclos geclos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome work 👌🏼 my only gripe is with tenancy checks

@csansoon csansoon merged commit 7050747 into main Sep 11, 2024
3 checks passed
@csansoon csansoon deleted the evaluation-dashboard branch September 11, 2024 07:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants