4541 Introduced clean_up_docket_judges command to clean up assigned_to and referred_to #4555

Open
wants to merge 3 commits into
base: main

albertisfu
Contributor

As described in #4541, this PR introduces the clean_up_docket_judges command, which cleans up the assigned_to and referred_to fields when their corresponding assigned_to_str and referred_to_str values have changed and the new judges are not in the database.

This works by checking all the dockets in the database where referred_to and assigned_to are not null. Then, the docket's assigned_to_str and referred_to_str values are used to perform a Judge lookup using lookup_judge_by_full_name.

If the current assigned_to and referred_to values are different from the ones returned by the lookup (which includes a None value in case the Judge is not found), the docket is updated.
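The comparison described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the PR's actual code: `Docket` is a plain dataclass standing in for the Django model, `lookup` is a hypothetical callable standing in for lookup_judge_by_full_name, and judges are represented by integer IDs purely for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Optional


# Hypothetical stand-in for the Docket model; only the fields discussed here.
@dataclass
class Docket:
    court_id: str
    date_filed: Optional[date]
    assigned_to_str: str = ""
    referred_to_str: str = ""
    assigned_to: Optional[int] = None  # judge ID, or None if no judge is linked
    referred_to: Optional[int] = None


# Assumed to mirror the (name, court_id, date_filed) call shape used in the PR.
Lookup = Callable[[str, str, Optional[date]], Optional[int]]


def clean_up_docket(d: Docket, lookup: Lookup) -> bool:
    """Re-resolve both judge fields; return True if the docket was changed."""
    new_assigned = (
        lookup(d.assigned_to_str, d.court_id, d.date_filed)
        if d.assigned_to_str
        else None
    )
    new_referred = (
        lookup(d.referred_to_str, d.court_id, d.date_filed)
        if d.referred_to_str
        else None
    )
    if (new_assigned, new_referred) == (d.assigned_to, d.referred_to):
        return False  # nothing changed, so no save is needed
    d.assigned_to = new_assigned
    d.referred_to = new_referred
    return True  # the real command would call d.save() at this point
```

A lookup that returns None (judge not found) clears the stale link, which matches the behavior described in the PR text.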

Since this save() will trigger an Elasticsearch update, it's better to do it slowly. I set a default sleep of 0.1 seconds between each update. We can monitor how this performs and adjust the wait time between iterations accordingly.

The command can be run as:
manage.py clean_up_docket_judges --iteration-wait 0.1
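A rough sketch of how the wait option and the throttled loop could fit together is shown below. It uses plain argparse (which Django management commands build on) instead of a real command class so that it runs on its own. `build_parser` and `run_throttled` are hypothetical names, and sleeping only after an actual update is an assumed reading of "between each update".

```python
import argparse
import time
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def build_parser() -> argparse.ArgumentParser:
    """Parser exposing the --iteration-wait option with the 0.1 s default."""
    parser = argparse.ArgumentParser(prog="clean_up_docket_judges")
    parser.add_argument(
        "--iteration-wait",
        type=float,
        default=0.1,
        help="Seconds to sleep after each docket update.",
    )
    return parser


def run_throttled(
    items: Iterable[T],
    update: Callable[[T], bool],
    iteration_wait: float,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Apply `update` to each item and pause after every real change.

    `update` returns True when it modified (and saved) the item. Unchanged
    items trigger no save, so the loop does not sleep for them.
    """
    updated = 0
    for item in items:
        if update(item):
            updated += 1
            sleep(iteration_wait)
    return updated
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.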

Member

@mlissner mlissner left a comment

Looks great, thanks. One performance suggestion though, that I think could help.

Comment on lines +34 to +45
        async_to_sync(lookup_judge_by_full_name)(
            d.referred_to_str, d.court_id, d.date_filed
        )
        if d.referred_to_str
        else None
    )
    new_assigned = (
        async_to_sync(lookup_judge_by_full_name)(
            d.assigned_to_str, d.court_id, d.date_filed
        )
        if d.assigned_to_str
        else None
Member

Could you use lru_cache here to avoid zillions of DB lookups? Maybe set a maxsize of, say, 30k so that we catch most judge names?
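One way the suggestion might look is sketched below, under stated assumptions. `lookup_judge` is a hypothetical synchronous stand-in for the `async_to_sync(lookup_judge_by_full_name)` call in the diff, and `db_calls` only exists to make the simulated database traffic visible. The cache is placed around a synchronous function on purpose: applying `functools.lru_cache` directly to an `async def` would cache the coroutine object, which cannot be awaited a second time.

```python
from datetime import date
from functools import lru_cache
from typing import List, Optional, Tuple

# Records every simulated database query so the effect of the cache is visible.
db_calls: List[Tuple[str, str, Optional[date]]] = []


def lookup_judge(
    name: str, court_id: str, date_filed: Optional[date]
) -> Optional[int]:
    """Hypothetical sync stand-in for the judge lookup; returns a judge ID."""
    db_calls.append((name, court_id, date_filed))
    return {"Jane Roe": 7}.get(name)


@lru_cache(maxsize=30_000)  # the 30k figure comes from the review comment
def cached_lookup_judge(
    name: str, court_id: str, date_filed: Optional[date]
) -> Optional[int]:
    """Memoized lookup; all three arguments must be hashable."""
    return lookup_judge(name, court_id, date_filed)
```

Because `date_filed` is part of the cache key, the same judge name in the same court is looked up again for each distinct filing date, so the hit rate depends on how the keys repeat across dockets. Results of None (judge not found) are cached as well.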
