Feature: Run evaluations in batch #154

geclos · 2024-09-09T22:34:20Z

This commit implements evaluation runs in batch.

packages/core/src/lib/disk.ts

packages/jobs/src/job-definitions/batchEvaluations/runBatchEvaluationJob.ts

packages/jobs/src/job-definitions/batchEvaluations/runDocumentJob.ts

andresgutgon · 2024-09-10T08:28:35Z

apps/web/src/actions/datasets/preview.ts

 import { DatasetsRepository } from '@latitude-data/core/repositories'
 import { previewDataset } from '@latitude-data/core/services/datasets/preview'
-import disk from '$/lib/disk'


What's the reason for doing this? Is it for use in jobs? If you manage to make it work, that's fine. But it's also fine for each app to init its own disk singleton.

we need it in jobs yes

andresgutgon · 2024-09-10T12:01:30Z

apps/web/src/actions/evaluations/runBatch.ts

+      runCount: z.number(),
+      offset: z.number().optional().default(0),
+      parameters: z.record(z.number()).optional(),
+      evaluationId: z.number(),


Shouldn't this be an array?

andresgutgon · 2024-09-10T13:54:36Z

apps/web/src/actions/evaluations/runBatch.ts

+      documentUuid: z.string(),
+      commitUuid: z.string(),
+      runCount: z.number(),
+      offset: z.number().optional().default(0),


Also a limit, no?

And also a boolean to pick all the rows

it's gonna be fromLine and toLine btw

Re "And also a boolean to pick all the rows": better to handle this in the frontend; don't send the limits to the backend if the user wants everything.
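
To illustrate the suggestion above, here is a minimal sketch of how optional limits could behave when the frontend simply omits them. The names `fromLine` and `toLine` come from this thread; the `RowRange` type, the `selectRows` helper, and the 0-based, end-exclusive semantics are assumptions for illustration and are not taken from the PR.

```typescript
// Hypothetical sketch: option names follow the thread, semantics are assumed.
interface RowRange {
  fromLine?: number // assumed 0-based, inclusive
  toLine?: number // assumed 0-based, exclusive
}

function selectRows<T>(rows: T[], range: RowRange = {}): T[] {
  const { fromLine, toLine } = range
  // Array.prototype.slice treats an undefined end as "until the last item",
  // so omitting both limits returns every row and no extra boolean is needed.
  return rows.slice(fromLine ?? 0, toLine)
}

const csvRows = ['row0', 'row1', 'row2', 'row3']
const everything = selectRows(csvRows) // no limits sent
const middle = selectRows(csvRows, { fromLine: 1, toLine: 3 })
```

With this shape, "pick all the rows" is just the absence of limits, which keeps the backend input smaller.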

andresgutgon · 2024-09-10T18:01:44Z

apps/web/src/actions/evaluations/runBatch.ts

+      commitUuid: z.string(),
+      runCount: z.number(),
+      offset: z.number().optional().default(0),
+      parameters: z.record(z.number()).optional(),


Why is this a record of numbers?

it maps parameters to the column index of rows in a CSV; are you doing it differently in the frontend?

But you also need the names of the parameters in the document, no? It's a map between document params and CSV headers, or am I misunderstanding?

yes the key is the parameter name and the value is the column index

so in this case let's assume the CSV has last_name and name in columns 3 and 4; the parameters map would be

{ last_name: 3, name: 4 }, the keys being the document parameters and the values being the CSV column indices

It's true, headers is an array 👍
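
The mapping discussed above can be sketched as follows. This is only an illustration under stated assumptions: `ParameterMap` and `buildParameters` are hypothetical names that do not appear in the PR, and column indices are assumed to be 0-based.

```typescript
// Hypothetical helper, not taken from the PR.
// Keys are document parameter names, values are CSV column indices.
type ParameterMap = Record<string, number>

function buildParameters(
  row: string[],
  parameters: ParameterMap,
): Record<string, string> {
  const result: Record<string, string> = {}
  for (const [name, columnIndex] of Object.entries(parameters)) {
    const value = row[columnIndex]
    if (value === undefined) {
      // Fail loudly if the map points at a column the row does not have.
      throw new Error(`Column ${columnIndex} not found for parameter "${name}"`)
    }
    result[name] = value
  }
  return result
}

// Example row where, counting from 0, columns 3 and 4 hold last_name and name.
const exampleRow = ['1', 'a@example.com', 'ES', 'Doe', 'Jane']
const documentParameters = buildParameters(exampleRow, { last_name: 3, name: 4 })
```

Here `documentParameters` holds one value per document parameter, looked up by column position rather than by header name.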

geclos marked this pull request as draft September 9, 2024 22:34

geclos force-pushed the feature/run_evaluations_in_batch branch from 06b22b1 to 36f3ab5 Compare September 9, 2024 22:40