Improve the way LLM eval runs in the background #6
I would rather switch to async I/O for the HTTP requests, where we can wait for thousands of responses without affecting server performance. We always delegate the heavy work to another server, and I think this works well for us. If you worry about scaling the worker nodes, I would start solving that only once the waiting times become too bad for users in some use case; somebody would have to simulate it first. Personally, I think I will never need it in factgenie.
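To illustrate the suggestion above, here is a minimal sketch of waiting on many responses concurrently with `asyncio`. The function names are hypothetical, and `asyncio.sleep` stands in for an actual HTTP call to an inference server:

```python
import asyncio

# Hypothetical stand-in for an HTTP request to an LLM inference server;
# asyncio.sleep simulates network latency without blocking the event loop.
async def fake_llm_call(example_id: int) -> str:
    await asyncio.sleep(0.01)
    return f"annotation-{example_id}"

async def annotate_all(n_examples: int) -> list:
    # All requests wait concurrently, so total time is roughly one
    # round-trip rather than n_examples round-trips.
    tasks = [fake_llm_call(i) for i in range(n_examples)]
    return await asyncio.gather(*tasks)

results = asyncio.run(annotate_all(1000))
```

Because the coroutines only yield while waiting, a single thread can keep thousands of requests in flight without tying up server workers.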
We currently have no specialized solution whatsoever for running LLM evals in the background.
After receiving the request, we simply start to iterate over the examples to annotate on the backend. We check the `running` flag at each iteration of the loop and stop if the flag is set to `False`. Surprisingly, this seems to work quite OK so far, probably because Flask takes care of the threading.
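The flag-checked loop described above can be sketched as follows. This is an illustrative reconstruction, not factgenie's actual code; the campaign ID and annotation logic are placeholders:

```python
# Shared state mapping campaign IDs to their "running" flags; a request
# handler would flip the flag to False to cancel a run.
running = {"my_campaign": True}

def run_eval(campaign_id: str, examples: list) -> list:
    annotations = []
    for example in examples:
        # Checked on every iteration so a stop request takes effect
        # before the next example is sent for annotation.
        if not running.get(campaign_id, False):
            break
        annotations.append(f"annotated:{example}")
    return annotations

out = run_eval("my_campaign", ["a", "b", "c"])
```

The fragility is visible here: the flag lives in plain shared state, so concurrent campaigns, worker restarts, or multiple server processes can all leave it inconsistent.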
However, it seems to be too YOLO. I also expect it not to work robustly, especially if users start launching multiple tasks at once.
At first, I also tried using Python threads manually in the code, something along the lines of:
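The original snippet is not included in the issue; a hypothetical reconstruction of that kind of manual-threading attempt might look like this, with the blocking `join()` being a likely culprit for the unresponsive frontend:

```python
import threading

# Hypothetical reconstruction; function and variable names are illustrative.
def run_eval(examples: list, results: list) -> None:
    for example in examples:
        results.append(f"annotated:{example}")

results = []
thread = threading.Thread(target=run_eval, args=(["a", "b"], results))
thread.start()
# Calling join() inside a request handler blocks that handler until the
# eval finishes, which would make the frontend appear frozen.
thread.join()
```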
But that actually rendered the frontend unresponsive (I might have just messed it up, though). In any case, implementing a more principled solution would be much appreciated.