Implement a way to easily run tests results on an issue #102

Open
gentlementlegen opened this issue Aug 26, 2024 · 25 comments · Fixed by #170 · May be fixed by #181

Comments

@gentlementlegen
Member

For testing and fine-tuning, it would be handy to have a way to run conversation-rewards manually against any pull request within a sandbox, so the incentives can be adjusted as desired.

My first thought would be a /simulate-rewards issue_url command, or something similar, that generates the results within the issue / PR where it is run, without generating the permits. That would allow testing and tuning without needing to open / close issues manually to trigger a run.
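A minimal sketch of how such a command might be recognized in a comment body, written in Python purely for illustration. The names `parse_simulation_command` and `SimulationRequest`, the URL pattern, and the `dry_run` flag are all assumptions; none of them come from the plugin's actual code.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical command name, taken from the suggestion above; not an existing bot command.
COMMAND = "/simulate-rewards"

# Accepts issue or pull-request URLs on github.com (assumed URL shape).
ISSUE_URL = re.compile(
    r"^https://github\.com/(?P<owner>[\w.-]+)/(?P<repo>[\w.-]+)"
    r"/(?:issues|pull)/(?P<number>\d+)/?$"
)


@dataclass(frozen=True)
class SimulationRequest:
    owner: str
    repo: str
    number: int
    dry_run: bool = True  # assumption: a simulation never generates permits


def parse_simulation_command(comment_body: str) -> Optional[SimulationRequest]:
    """Return a request if the comment is `/simulate-rewards <url>`, else None."""
    parts = comment_body.strip().split()
    if len(parts) != 2 or parts[0] != COMMAND:
        return None
    match = ISSUE_URL.match(parts[1])
    if match is None:
        return None
    return SimulationRequest(match["owner"], match["repo"], int(match["number"]))
```

Anything that is not exactly the command followed by one well-formed URL is ignored, so ordinary comments never trigger a run.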

@0x4007
Member

0x4007 commented Aug 26, 2024

How about from a CLI?

@gentlementlegen
Member Author

gentlementlegen commented Aug 26, 2024

It can be run locally even now, as long as you provide a valid OpenAI key. But it requires you to clone, set up the env, etc., which feels less handy than just writing a comment on a testing PR, even more so for third parties who want to test.

You could imagine going to ubiquibot-sandbox, opening an issue, and simply typing /rewards url_to_any_issue to get a preview of the results, then changing the configuration and repeating.

@0x4007
Member

0x4007 commented Aug 26, 2024

Ideally we should be able to live tweak. For example, running this client side and caching all the comments etc.

Then we can edit the config and instantly see what the rewards would be. This requires a client, hence CLI. It would be significantly faster to fine tune incentives this way, with a cache of some sort.
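The idea of caching the comments once and re-running only the scoring step could be sketched as follows. This is an illustrative Python sketch under stated assumptions: `load_comments`, `compute_rewards`, the JSON file layout, and the `word_rate` config key are all hypothetical, and the scoring is a toy stand-in rather than the plugin's real formula.

```python
import json
from pathlib import Path
from typing import Callable


def load_comments(issue_key: str, cache_dir: Path, fetch: Callable[[str], list]) -> list:
    """Return the comments for an issue, calling `fetch` only on a cache miss.

    `issue_key` is assumed to be filesystem-safe, e.g. "owner_repo_102".
    """
    cache_file = cache_dir / f"{issue_key}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text(encoding="utf-8"))
    comments = fetch(issue_key)  # the slow network call happens here, once
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(comments), encoding="utf-8")
    return comments


def compute_rewards(comments: list, config: dict) -> dict:
    """Toy scoring: word count per author times a configurable per-word rate."""
    rewards: dict = {}
    for comment in comments:
        words = len(comment["body"].split())
        author = comment["author"]
        rewards[author] = rewards.get(author, 0.0) + words * config["word_rate"]
    return rewards
```

With the comments on disk, editing the config and calling `compute_rewards` again needs no network access, which is what makes the feedback close to instant.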

@gentlementlegen
Member Author

Except for the caching part, this is already runnable client side as long as you provide the full environment and set up the project, the database, etc. Caching and a CLI can be added. We can combine this with a GitHub command so it can run both locally and within GitHub.

@0x4007
Member

0x4007 commented Aug 26, 2024

Sure. The main thing I am requesting is instant feedback.

@gentlementlegen
Member Author

Caching probably cannot be achieved as easily outside of a local environment, but it could certainly happen locally.

@0x4007
Member

0x4007 commented Aug 26, 2024

Cool, set a time for it and let's focus on the caching bit. Running locally with instant feedback is more useful than test runs on GitHub via a command.

@gentlementlegen
Member Author

I think both are quite nice, because doing it from GitHub allows very quick testing without needing the whole local setup, which is handy for testing pull requests. I think this is quite a long task, however.

@0x4007
Member

0x4007 commented Aug 27, 2024

Make them two separate tasks and let's prioritize the caching and instant results on local.

Contributor

ubiquity-os bot commented Oct 14, 2024

@gentlementlegen the deadline is at Mon, Oct 21, 2:56 PM UTC


@KodeSage

Hello @gentlementlegen, is the issue still available?
I would like to work on it, but I will need some clarification on how to run it locally so I can test it and get a general insight into the project, as I did not understand much from the docs.

@0x4007
Member

0x4007 commented Oct 17, 2024

I wish I could help you with advice but I never tried running this locally.

In other news I realize that we could host a UI from this plugin for testing. That could be more convenient than local setup for normal use in the future, but that can be a lower priority task.

@KodeSage

> I wish I could help you with advice but I never tried running this locally.
>
> In other news I realize that we could host a UI from this plugin for testing. That could be more convenient than local setup for normal use in the future, but that can be a lower priority task.

Alright, thank you.
But could you still give me an overview of the project?

@gentlementlegen
Member Author

@KodeSage I started it and got sidetracked by other urgent matters. This is a bit urgent, so I'd prefer taking care of it, but if you feel confident enough to carry this on, please do. To run this locally there could be a few different approaches, ideally without having to rely on the GitHub API. Caching is also important to save on LLM usage.

@0x4007
Member

0x4007 commented Oct 19, 2024

> Caching is also important to save on LLM usage.

I wasn't considering caching for this purpose, but this is a bit restrictive. If we are tweaking the prompt, then of course we need to run the LLM every time.

However, if you want to scope this task to accommodate changes ONLY to the quantitative "formatting" score, then it makes sense to cache the LLM score per saved issue as well!
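One way to reconcile both points is to key the cached LLM score on the prompt as well as the comment text, so that a prompt change forces a fresh LLM call while a formatting-only change reuses the stored score. The Python sketch below illustrates that idea; `LlmScoreCache` and the `score_with_llm` callback are hypothetical names, not part of the plugin.

```python
import hashlib
from typing import Callable, Dict


class LlmScoreCache:
    """Memoizes an LLM-derived score per (prompt, comment body) pair."""

    def __init__(self, score_with_llm: Callable[[str, str], float]) -> None:
        # `score_with_llm` stands in for whatever actually calls the model.
        self._score_with_llm = score_with_llm
        self._scores: Dict[str, float] = {}

    def score(self, prompt: str, comment_body: str) -> float:
        # Hash prompt and body together: editing either one yields a new key.
        raw = f"{prompt}\x00{comment_body}".encode("utf-8")
        key = hashlib.sha256(raw).hexdigest()
        if key not in self._scores:
            self._scores[key] = self._score_with_llm(prompt, comment_body)
        return self._scores[key]
```

Because the formatting configuration is not part of the key, tweaking it never invalidates a stored LLM score. The cache here lives in memory only; persisting `_scores` per saved issue would be a separate step.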

@gentlementlegen gentlementlegen self-assigned this Oct 24, 2024

@gentlementlegen the deadline is at Thu, Oct 31, 5:34 AM UTC


Passed the deadline and no activity is detected, removing assignees: @gentlementlegen.


@gentlementlegen gentlementlegen self-assigned this Nov 1, 2024

@gentlementlegen the deadline is at Fri, Nov 8, 10:25 AM UTC


A new workroom has been created for this task. Join chat


Passed the deadline and no activity is detected, removing assignees: @gentlementlegen.

@gentlementlegen
Member Author

@0x4007 I feel like I am fighting against the bot lol. I do have an open pull request, but I wanted to break it into two PRs; I guess I can link both.

@gentlementlegen gentlementlegen self-assigned this Nov 3, 2024

@gentlementlegen the deadline is at Sun, Nov 10, 4:44 AM UTC

@gentlementlegen gentlementlegen linked a pull request Nov 5, 2024 that will close this issue
@gentlementlegen
Member Author

This should not have been closed because of the draft; maybe it only checks for "OPEN" PRs (which could also make sense). Either way, reopening.

3 participants