Fix worker IP update to use one single DB session #858

benoit74 · 2023-10-30T10:07:29Z

Rationale

Fix #857

Changes (original ones, see below for updated list)

use on single DB session instead of two which obviously leads to inconsistent results
add log information about which worker has changed IP, including previous and new IP
refresh USES_WORKERS_IPS_WHITELIST dynamically so that it can be manipulated in tests
removed previous fix on this topic, since it was not working / needed
simplified code paths when comparing / setting IPs

Remarks

I'm not a huge fan of the dynamic refresh of USES_WORKERS_IPS_WHITELIST but this is the less intrusive change I found to keep this in constants.py since this file is useful to have the list of available environment variables
I've done a quick and dirty change with the IpUpdater class ; this should not be done like this, we should have a cleaner code architecture + use a mocking library, but again, this is not a small change
as mentioned in the issue, I fixed only the bug, the other part should be moved to a specific issue from my PoV

rgaudin

I don't like this implementation neither.

First refreshable_constant does noting. It simply returns what is passed. This is confusing and should be avoided.
You may want to look at functools.partial I think.

Next is this faux-dynamic refresh of environ. It's not possible to change the environment of a running process so this brings additional confusion and only works in your tests because pytest fires processes.
Testing should mostly adapt to codebase or require refactoring but here it's a detrimental change of codebase to accomodate tests.

Another thing is I think we wont write the IP to the DB should the policy update fail (it's possible, it's a remote call). Are we OK with this? Maybe a comment would be helpful.

dispatcher/backend/src/tests/integration/routes/workers/test_worker.py

benoit74 · 2023-10-30T14:18:33Z

I honestly don't know how to fix code to match your expectations. As stated, existing code architecture prevented me to find a clean and simple way to run tests. And I didn't knew about the impossibility to update a process environment, my bad.

The only solution I see to move forward would be to delete the tests, since it looks they are detrimental to code quality. It makes me very sad because I really prefer working and tested code to clean code. But that's life and I don't have time to refactor the Zimfarm codebase, at least I'm not supposed to do it as far as I've understood.

I agree about the fact that we wont write the IP to the DB should the policy update fail. Was already the case before, still the case now. Rather than a comment, I think we should try/catch the remote IP update to ensure data is persisted or commit the transaction before making the remote IP update.

Feel free to take over this PR and finish the work if it is simpler.

rgaudin · 2023-10-30T15:21:23Z

Can you describe the problem/issue? Maybe we can think of a better way to solve it

benoit74 · 2023-10-30T16:45:09Z

This is my issue (only for one test):

I want to ensure IPs sent to code handling Wasabi policies is correct (assuming this code is correct)
for this I need to get result of build_workers_whitelist to check its value
I would like to not call the method directly to stay in same DB session (otherwise bug is not triggered btw)
so what I found is that I need to
- set USES_WORKERS_IPS_WHITELIST environment variable to 1 so that IP whitelist code is triggered
- capture result of IP computation to check its value
- ensure that tests won't try to call Wasabi

benoit74 · 2023-11-02T10:47:14Z

Any idea about how to move this forward?

benoit74 · 2023-11-02T15:35:29Z

#863 should probably be tackled at the same time

benoit74 · 2023-11-14T15:05:51Z

I almost completely rewrote the tests + added a short-term fix to persist IP changes asap in DB, before we can tackle #863 which is indeed not straightforward)

Changes

use on single DB session instead of two which leads to inconsistent results
use a new kind of dbsession decorator which does not manage transaction for us ; this allows to call commit ourselves whenever we see fit
commit IP changes to DB before calling Wasabi so that this is persisted no matter what happens later
add log information about which worker has changed IP, including previous and new IP
remove default value when computing USES_WORKERS_IPS_WHITELIST, useless
simplified code paths when comparing / setting IPs

rgaudin

I'd also like my two comments from previous revision to be addressed ; either in discussion or in code.

dispatcher/backend/src/routes/requested_tasks/requested_task.py

dispatcher/backend/src/common/external.py

benoit74 · 2023-11-17T09:20:48Z

I rebased the commits on latest main changes maybe a bit too soon, I should have wait the end of the review, sorry.

I fixed the last comment which wasn't yet fixed, the code was dead indeed, I just forgot to remove it from the tests ... my bad, sorry again.

AFAIK the other comment has already been fixed, I hope I'm not mistaken.

rgaudin

Thank you

benoit74 force-pushed the fix_worker_ip_update branch 3 times, most recently from bc3391d to ff11212 Compare October 30, 2023 10:28

benoit74 marked this pull request as ready for review October 30, 2023 10:30

benoit74 requested a review from rgaudin October 30, 2023 10:30

rgaudin requested changes Oct 30, 2023

View reviewed changes

dispatcher/backend/src/tests/integration/routes/workers/test_worker.py Outdated Show resolved Hide resolved

dispatcher/backend/src/tests/integration/routes/workers/test_worker.py Outdated Show resolved Hide resolved

benoit74 self-assigned this Nov 9, 2023

benoit74 force-pushed the fix_worker_ip_update branch from ff11212 to 45b05ee Compare November 14, 2023 13:40

benoit74 requested a review from rgaudin November 14, 2023 15:05

rgaudin requested changes Nov 16, 2023

View reviewed changes

dispatcher/backend/src/routes/requested_tasks/requested_task.py Show resolved Hide resolved

dispatcher/backend/src/common/external.py Outdated Show resolved Hide resolved

benoit74 added 4 commits November 17, 2023 10:07

Fix worker IP update to use one single DB session

06bef54

Commit Worker IP change to DB asap

30ed071

Rework / simplify handling of external IP update

efea331

Add docstring + fix name

ddead22

benoit74 force-pushed the fix_worker_ip_update branch from a638fc2 to ddead22 Compare November 17, 2023 09:07

Remove dead code

2ddee9b

benoit74 force-pushed the fix_worker_ip_update branch from 5f0dc29 to 2ddee9b Compare November 17, 2023 09:19

benoit74 requested a review from rgaudin November 17, 2023 09:21

rgaudin approved these changes Nov 21, 2023

View reviewed changes

rgaudin merged commit e208419 into main Nov 21, 2023
5 checks passed

rgaudin deleted the fix_worker_ip_update branch November 21, 2023 13:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix worker IP update to use one single DB session #858

Fix worker IP update to use one single DB session #858

benoit74 commented Oct 30, 2023 •

edited

Loading

rgaudin left a comment

benoit74 commented Oct 30, 2023

rgaudin commented Oct 30, 2023

benoit74 commented Oct 30, 2023

benoit74 commented Nov 2, 2023

benoit74 commented Nov 2, 2023

benoit74 commented Nov 14, 2023

rgaudin left a comment

benoit74 commented Nov 17, 2023

rgaudin left a comment

Fix worker IP update to use one single DB session #858

Fix worker IP update to use one single DB session #858

Conversation

benoit74 commented Oct 30, 2023 • edited Loading

Rationale

Changes (original ones, see below for updated list)

Remarks

rgaudin left a comment

Choose a reason for hiding this comment

benoit74 commented Oct 30, 2023

rgaudin commented Oct 30, 2023

benoit74 commented Oct 30, 2023

benoit74 commented Nov 2, 2023

benoit74 commented Nov 2, 2023

benoit74 commented Nov 14, 2023

Changes

rgaudin left a comment

Choose a reason for hiding this comment

benoit74 commented Nov 17, 2023

rgaudin left a comment

Choose a reason for hiding this comment

benoit74 commented Oct 30, 2023 •

edited

Loading