Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Remove resource offers in DealNegotiating[0] state on disconnect #475

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

bgins
Copy link
Contributor

@bgins bgins commented Dec 18, 2024

Summary

This pull request makes the following changes:

  • Fix remove resource offers on resource provider disconnect
  • Rename removeResourceOfferByResourceProvider to removeUnmatchedResourceOffers

This pull request fixes a bug where we sometimes remove the wrong resource offer on disconnect. See #467 for details.

The updated code removes resource offers in a unmatched DealNegotiating[0] state to avoid matching them when their resource provider is not connected. This approach has the additional benefit of removing multiple resource offers if the resource provider has offered them.

Task/Issue reference

Closes: #467

Test plan

We have included a temporary commit (994bef2) to demonstrate resource offer removal. The following steps are recommended for testing:

  • Start the stack
  • Stop the resource provider and observe the resource offers before and after. All offers should be removed because the one posted offer was in a DealNegotiating[0] state.
  • Restart the resource provider and run a job
  • Wait a moment, and stop the resource provider. The resource provider should have added a new resource offer in a DealNegotiating[0] state that gets removed, but the resource offer from running the job (in a ResultsAccepted[3]) state should be preserved.

Related issues or PRs (optional)

Epic: https://github.com/Lilypad-Tech/internal/issues/367

@bgins bgins requested a review from a team as a code owner December 18, 2024 16:34
@cla-bot cla-bot bot added the cla-signed label Dec 18, 2024
@bgins bgins changed the title Bgins/fix resource offer cleanup fix: Remove resource offers in DealNegotiating[0] state on disconnect Dec 18, 2024
@github-actions github-actions bot added the fix label Dec 18, 2024
Comment on lines +416 to +417
// TODO Remove before merge
fmt.Printf("+++ resource offers before removal: %+v\n", resourceOffers)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] Reminder to remove once we're ready to merge

if offer.State == 0 {
err = controller.store.RemoveResourceOffer(offer.ID)
if err != nil {
controller.log.Error("remove resource offer failed: %s", err)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should log some context around the resource offer to help debug issues

Copy link
Contributor Author

@bgins bgins Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah great idea! Added the resource provider ID and the ID of the offer that could not be removed.

@@ -404,23 +404,35 @@ func (controller *SolverController) addResourceOffer(resourceOffer data.Resource
return ret, nil
}

func (controller *SolverController) removeResourceOfferByResourceProvider(ID string) error {
// Remove resource offers in an unmatched DealNegotiating[0] state
func (controller *SolverController) removeUnmatchedResourceOffers(ID string) error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] lets change ID to resourceProviderID for more clarity cause I got confused at first and though this was supposed to be the resource offer id 😅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah totally. We lost some semantic meaning renaming the function. Pushed an update change ID to resourceProviderID.

Comment on lines 428 to 432
// TODO Remove before merge
offersAfter, err := controller.store.GetResourceOffers(store.GetResourceOffersQuery{
ResourceProvider: ID,
})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] Reminder to remove once we're ready to merge

if err != nil {
return err
}
fmt.Printf("+++ resource offers after removal: %+v\n", offersAfter)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] Reminder to remove once we're ready to merge

We remove resource offers in a unmatched DealNegotiating[0] state to
avoid matching them when their resource provider is not connected.
@bgins bgins force-pushed the bgins/fix-resource-offer-cleanup branch from 994bef2 to 6a7cc7d Compare December 19, 2024 21:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Resource offer cleanup sometimes removes the wrong offers
2 participants