DEPRECATED! implement validator for company existion #2601
This pull request is split into 5 parts for easier review. Changed files are located in these folders:
The output list of invalid org contacts currently includes all Estonian org-type objects regardless of their role. Since we set ForceDelete only on domains where such an object is the registrant, we need to either generate a sub-list or add a role indicator to the output, so that the entries relevant to ForceDelete can be filtered out.
The latest test again produced multiple instances of the same entity. More importantly, each entity was matched with only one domain, so if a company had 3 domains, force delete was set on only one of them.
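The two review points above (filter by registrant role, and cover every domain of a company) can be illustrated with a minimal sketch. The `Row` struct, the `"registrant"` role string, and the `force_delete_targets` helper are hypothetical names chosen for illustration; they are not taken from this PR.

```ruby
# Hypothetical output row: one line per (contact, role, domain) combination.
Row = Struct.new(:ident, :role, :domain)

# Returns a hash of company ident => unique list of domains where that
# company acts as registrant. Rows with other roles are ignored, and
# repeated rows for the same entity collapse into a single entry.
def force_delete_targets(rows)
  rows.select { |row| row.role == "registrant" }
      .group_by(&:ident)
      .transform_values { |company_rows| company_rows.map(&:domain).uniq }
end
```

Grouping by company before acting is one way to ensure that a company with 3 domains yields 3 force delete candidates rather than one.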
bundle exec rake company_status:check_all -- --open_data_file_path=lib/tasks/data/ettevotja_rekvisiidid__lihtandmed.csv --missing_companies_output_path=lib/tasks/data/missing_companies_in_business_registry.csv --deleted_companies_output_path=lib/tasks/data/deleted_companies_from_business_registry.csv --download_path=https://avaandmed.ariregister.rik.ee/sites/default/files/avaandmed/ettevotja_rekvisiidid__lihtandmed.csv.zip --soft_delete_enable=false
This rake task performs the following actions:
The task accepts the following parameters:
open_data_file_path
- specifies where the data is saved and retrieved from. Default value: lib/tasks/data/ettevotja_rekvisiidid__lihtandmed.csv
missing_companies_output_path
- specifies the path where companies not found in the business registry will be saved. Default value: lib/tasks/data/missing_companies_in_business_registry.csv
deleted_companies_output_path
- specifies the path where companies that have been removed from the registry will be saved. Default value: deleted_companies_from_business_registry.csv
download_path
- specifies where the data will be downloaded from. Default value: https://avaandmed.ariregister.rik.ee/sites/default/files/avaandmed/ettevotja_rekvisiidid__lihtandmed.csv.zip
soft_delete
- indicates whether to run soft deletion for companies that have been removed, gone bankrupt, or are missing from the business registry (passed as --soft_delete_enable in the command above). Default value: false
Since every parameter has a default value, none of them is required; they exist only for greater flexibility. You can therefore run the following command:
bundle exec rake company_status:check_all
and the data will be available in the directory
lib/tasks/data
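As a rough illustration of how such defaults and overrides could be handled, here is a minimal sketch using Ruby's standard `OptionParser`. The `parse_company_status_options` helper and the `DEFAULT_OPTIONS` constant are hypothetical; the actual task in this PR may parse its arguments differently. The default values are copied from the description above as written.

```ruby
require "optparse"

# Defaults copied from the parameter list in the description (as written there).
DEFAULT_OPTIONS = {
  open_data_file_path: "lib/tasks/data/ettevotja_rekvisiidid__lihtandmed.csv",
  missing_companies_output_path: "lib/tasks/data/missing_companies_in_business_registry.csv",
  deleted_companies_output_path: "deleted_companies_from_business_registry.csv",
  download_path: "https://avaandmed.ariregister.rik.ee/sites/default/files/avaandmed/ettevotja_rekvisiidid__lihtandmed.csv.zip",
  soft_delete_enable: false
}.freeze

# Hypothetical helper: parses the arguments that follow `--` on the command
# line and returns the defaults merged with any overrides.
def parse_company_status_options(argv)
  options = DEFAULT_OPTIONS.dup
  parser = OptionParser.new do |opts|
    opts.on("--open_data_file_path=PATH") { |v| options[:open_data_file_path] = v }
    opts.on("--missing_companies_output_path=PATH") { |v| options[:missing_companies_output_path] = v }
    opts.on("--deleted_companies_output_path=PATH") { |v| options[:deleted_companies_output_path] = v }
    opts.on("--download_path=URL") { |v| options[:download_path] = v }
    # The flag value arrives as a string, so compare it explicitly.
    opts.on("--soft_delete_enable=BOOL") { |v| options[:soft_delete_enable] = (v == "true") }
  end
  parser.parse(argv)
  options
end
```

Calling `parse_company_status_options([])` returns the defaults unchanged, which mirrors running the task without any parameters.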
The job:
CompanyRegisterStatusJob.perform_later(days_interval = 14, spam_time_delay = 0.2, batch_size = 100, download_open_data_file_url='https://avaandmed.ariregister.rik.ee/sites/default/files/avaandmed/ettevotja_rekvisiidid__lihtandmed.csv.zip')
This job accepts the following parameters:
days_interval
- selects domains that were last checked more than {days_interval} days ago.
spam_time_delay
- the time delay between queries to the business registry.
batch_size
- the size of the batch for processing. This is needed for optimization.
download_open_data_file_url
- the URL from which to download the business registry data.
As indicated above, all these values have defaults and can be overridden if necessary.
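To make the roles of `days_interval`, `batch_size`, and `spam_time_delay` concrete, here is a minimal plain-Ruby sketch. The `Contact` struct and both helper methods are hypothetical and do not reflect the job's real implementation; treating never-checked records as due is also an assumption.

```ruby
require "date"

# Hypothetical stand-in for a record that stores when it was last checked.
Contact = Struct.new(:ident, :checked_at)

# Selects contacts last checked more than days_interval days ago.
# Assumption: contacts that were never checked (nil) are also due.
def due_for_check(contacts, days_interval:, today: Date.today)
  cutoff = today - days_interval
  contacts.select { |c| c.checked_at.nil? || c.checked_at < cutoff }
end

# Walks the contacts in slices of batch_size and pauses spam_time_delay
# seconds after each lookup so the business registry is not flooded.
def check_in_batches(contacts, batch_size:, spam_time_delay:)
  contacts.each_slice(batch_size) do |batch|
    batch.each do |contact|
      yield contact
      sleep(spam_time_delay)
    end
  end
end
```

With the defaults shown above, the job would correspond to `days_interval: 14`, `batch_size: 100`, and `spam_time_delay: 0.2`.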
What the job does:
kandeliik is Kustutamiskanne dokumentide hoidjata.
application.yml
file and it has this structure:
POTENTIAL PROBLEM: If we check a large set of companies on a single day and the next check is scheduled, say, a year later, all of those companies become due at the same time, so the job may have to process the same large list exactly one year later. This should be kept in mind.
This PR is related to internetee/company_register#6
Related tickets: internetee/company_register#4 internetee/company_register#5