Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WFA migration not run for all records #1524

Closed
Tracked by #1540
mahalakshme opened this issue Feb 20, 2024 · 35 comments
Closed
Tracked by #1540

WFA migration not run for all records #1524

mahalakshme opened this issue Feb 20, 2024 · 35 comments
Assignees

Comments

@mahalakshme
Copy link

mahalakshme commented Feb 20, 2024

Issue:

The migration here ran for this issue doesn't seem to have run for all rows.

Eg: On running the below SQL,

select program_encounter.name, observations->>'c5f418f8-af4a-4a10-a70d-54f088c6a8c6' as WFAzscore, manual_update_history, last_modified_date_time from public.program_encounter where individual_id=332143 and encounter_date_time is not null order by id desc;

found that, manual_update_history and last_modified_date_time were not updated for many rows.

Analysis:

Previously this card was not tested. We thought Code review is sufficient. May be we should have QAed it.

Acceptance criteria:

Make sure the SQL to update WFA for existing data is correct and updates all applicable records for the org in focus.
---- As part of this verification, make sure to check the 2nd case mentioned here along with other cases

Comment (Vivek)

  • better to update all program encounters again based on the logic
  • use an sql to check the count of records that have been not been updated and whether that is the expected number (before commit)
  • it would be good if the SQL updates only rows that have a different value. since we already have the id in the update the search will be fast and it wouldn't unnecessarily create syncs
  • this will allow us to run this again after few days when all user device data has been pushed
@mahalakshme mahalakshme converted this from a draft issue Feb 20, 2024
@mahalakshme mahalakshme moved this from In Analysis to Ready in Avni Product Feb 20, 2024
@himeshr himeshr self-assigned this Feb 23, 2024
@himeshr himeshr moved this from Ready to In Progress in Avni Product Feb 23, 2024
@himeshr
Copy link

himeshr commented Feb 23, 2024

Root cause for missing updates:

The issue, here is that as per acceptance criteria mentioned here, we have only updated z-score where values in the range [< -3.0 and > 3.0 ].
Whereas the core issue also has an additional acceptance criteria that, all values should be rounded to 2 decimal places.

This info was lost due to large number of cards and tickets present for this issue.

Fix to be done:

As part of this card, we'll only address the second issue, i.e. to update z-scores where values is (>= -3.0 and <= 3.0), to have 2 decimal place values.

Support in code for rounding to 2 decimal places

Also, found that code changes to handle rounding to 2 decimal places has already been done as part of this commit.

@himeshr
Copy link

himeshr commented Feb 23, 2024

We have to only execute commands part of file https://github.com/avniproject/data-fixes/blob/main/z-score-update/output_round_2.sql

Ignore commands in output.sql, as they have already been run as part of release 6.1

@himeshr himeshr moved this from In Progress to Code Review Ready in Avni Product Feb 23, 2024
@himeshr
Copy link

himeshr commented Feb 23, 2024

Note, that, the values for some encounters, could still end up with single decimal precision, this is due to the value being something like '2.10' which numerically, get transformed to 2.1 in code.
And since the concept is a numeric concept, we should be fine saving this as such, without having to set it to 2.10

@mahalakshme
Copy link
Author

@himeshr I think 2.10 has higher precision than 2.1. If its possible to set it so, then we can

@1t5j0y 1t5j0y moved this from Code Review Ready to In Code Review in Avni Product Feb 29, 2024
@1t5j0y 1t5j0y moved this from In Code Review to QA Ready in Avni Product Feb 29, 2024
@mahalakshme mahalakshme moved this from QA Ready to In QA in Avni Product Mar 1, 2024
@mahalakshme
Copy link
Author

@himeshr this can be run on prod, so will move it to progress, just to keep track.

Things to make sure:

  • All rows are updated - for the ones created before 6.1.0 release.
  • ETL is updated with the updated values

@mahalakshme mahalakshme moved this from In QA to In Progress in Avni Product Mar 1, 2024
@himeshr
Copy link

himeshr commented Mar 1, 2024

Pending deploy to prod..

@himeshr
Copy link

himeshr commented Mar 4, 2024

Applied the data-fix to prod:
Sample test sql response:

openchs=> select program_encounter.name, observations->>'c5f418f8-af4a-4a10-a70d-54f088c6a8c6' as WFAzscore, manual_update_history, last_modified_date_time from public.program_encounter where individual_id=332143 and encounter_date_time is not null order by id desc;
          name           | wfazscore |        manual_update_history         |  last_modified_date_time
-------------------------+-----------+--------------------------------------+----------------------------
 Growth Monitoring Visit | -2.22     | 04/03/2024 11:03:23 - data-fixes#1.1 | 2024-03-04 11:27:30.771+00
 Growth Monitoring Visit | -2.21     | 04/03/2024 11:03:23 - data-fixes#1.1 | 2024-03-04 11:27:23.949+00
 Growth Monitoring Visit | -1.89     | 04/03/2024 11:03:23 - data-fixes#1.1 | 2024-03-04 11:27:27.347+00
 Growth Monitoring Visit | -2.12     | 04/03/2024 11:03:23 - data-fixes#1.1 | 2024-03-04 11:27:29.102+00
 Growth Monitoring Visit | -2.08     | 04/03/2024 11:03:23 - data-fixes#1.1 | 2024-03-04 11:27:32.48+00
 Growth Monitoring Visit | -2.22     | 04/03/2024 11:03:23 - data-fixes#1.1 | 2024-03-04 11:27:27.076+00
 Growth Monitoring Visit | -2.29     | 04/03/2024 11:03:23 - data-fixes#1.1 | 2024-03-04 11:27:24.594+00
 Growth Monitoring Visit | -2.29     | 04/03/2024 11:03:23 - data-fixes#1.1 | 2024-03-04 11:27:33.312+00
 Growth Monitoring Visit | -2.24     | 04/03/2024 11:03:23 - data-fixes#1.1 | 2024-03-04 11:27:32.431+00
 Growth Monitoring Visit | -4.47     | 22/01/2024 08:01:50 - data-fixes#1   | 2024-01-22 08:28:51.215+00
 Growth Monitoring Visit | -4.47     | 22/01/2024 08:01:50 - data-fixes#1   | 2024-01-22 08:28:51.215+00
(11 rows)

@himeshr himeshr closed this as completed Mar 4, 2024
@github-project-automation github-project-automation bot moved this from In Progress to Done in Avni Product Mar 4, 2024
@mahalakshme
Copy link
Author

@himeshr the above things mentioned this comment not made sure - ETL failed when I changed the status of analytics for the org in focus since was not seeing updated values in ETL

@mahalakshme mahalakshme reopened this Mar 4, 2024
@github-project-automation github-project-automation bot moved this from Done to Triaged in Avni Product Mar 4, 2024
@mahalakshme mahalakshme moved this from Triaged to QA Failed in Avni Product Mar 4, 2024
@himeshr
Copy link

himeshr commented Mar 5, 2024

I see the following ETL logs for JSS org, which seem to indicate ETL completed successfully on Prod:


ubuntu@ip-10-100-1-136:/var/log/avni-etl-service$ grep -i  'ETL for schema' etl-service.log* | grep -e ' jss[, ]'
etl-service.log.1:2024-03-05 04:23:45,289 INFO  Avni ETL_Worker-1 [etl.service.EtlService] Running ETL for Schema: jss, DB User: jss, Schema User: jss
etl-service.log.1:2024-03-05 04:23:53,460 INFO  Avni ETL_Worker-1 [etl.service.EtlService] Completed ETL for schema jss with dbUser jss and schemaUser jss
etl-service.log.10:2024-03-04 07:36:39,589 INFO  Avni ETL_Worker-1 [etl.service.EtlService] Running ETL for Schema: jss, DB User: jss, Schema User: jss
etl-service.log.10:2024-03-04 07:36:54,863 INFO  Avni ETL_Worker-1 [etl.service.EtlService] Completed ETL for schema jss with dbUser jss and schemaUser jss
etl-service.log.10:2024-03-04 09:42:59,702 INFO  Avni ETL_Worker-1 [etl.service.EtlService] Running ETL for Schema: jss, DB User: jss, Schema User: jss
etl-service.log.10:2024-03-04 09:43:16,197 INFO  Avni ETL_Worker-1 [etl.service.EtlService] Completed ETL for schema jss with dbUser jss and schemaUser jss
etl-service.log.2:2024-03-05 01:23:45,285 INFO  Avni ETL_Worker-1 [etl.service.EtlService] Running ETL for Schema: jss, DB User: jss, Schema User: jss
etl-service.log.2:2024-03-05 01:23:49,965 INFO  Avni ETL_Worker-1 [etl.service.EtlService] Completed ETL for schema jss with dbUser jss and schemaUser jss
etl-service.log.2:2024-03-05 02:53:45,288 INFO  Avni ETL_Worker-1 [etl.service.EtlService] Running ETL for Schema: jss, DB User: jss, Schema User: jss
etl-service.log.2:2024-03-05 02:53:50,701 INFO  Avni ETL_Worker-1 [etl.service.EtlService] Completed ETL for schema jss with dbUser jss and schemaUser jss
etl-service.log.3:2024-03-04 23:53:45,282 INFO  Avni ETL_Worker-1 [etl.service.EtlService] Running ETL for Schema: jss, DB User: jss, Schema User: jss
etl-service.log.3:2024-03-04 23:53:50,510 INFO  Avni ETL_Worker-1 [etl.service.EtlService] Completed ETL for schema jss with dbUser jss and schemaUser jss
etl-service.log.4:2024-03-04 22:23:45,310 INFO  Avni ETL_Worker-1 [etl.service.EtlService] Running ETL for Schema: jss, DB User: jss, Schema User: jss
etl-service.log.4:2024-03-04 22:23:50,228 INFO  Avni ETL_Worker-1 [etl.service.EtlService] Completed ETL for schema jss with dbUser jss and schemaUser jss
etl-service.log.5:2024-03-04 19:23:45,376 INFO  Avni ETL_Worker-1 [etl.service.EtlService] Running ETL for Schema: jss, DB User: jss, Schema User: jss
etl-service.log.5:2024-03-04 19:23:50,351 INFO  Avni ETL_Worker-1 [etl.service.EtlService] Completed ETL for schema jss with dbUser jss and schemaUser jss
etl-service.log.5:2024-03-04 20:53:45,311 INFO  Avni ETL_Worker-1 [etl.service.EtlService] Running ETL for Schema: jss, DB User: jss, Schema User: jss
etl-service.log.5:2024-03-04 20:53:49,770 INFO  Avni ETL_Worker-1 [etl.service.EtlService] Completed ETL for schema jss with dbUser jss and schemaUser jss
etl-service.log.6:2024-03-04 17:53:45,289 INFO  Avni ETL_Worker-1 [etl.service.EtlService] Running ETL for Schema: jss, DB User: jss, Schema User: jss
etl-service.log.6:2024-03-04 17:53:50,318 INFO  Avni ETL_Worker-1 [etl.service.EtlService] Completed ETL for schema jss with dbUser jss and schemaUser jss
etl-service.log.7:2024-03-04 16:23:45,284 INFO  Avni ETL_Worker-1 [etl.service.EtlService] Running ETL for Schema: jss, DB User: jss, Schema User: jss
etl-service.log.7:2024-03-04 16:23:50,207 INFO  Avni ETL_Worker-1 [etl.service.EtlService] Completed ETL for schema jss with dbUser jss and schemaUser jss
etl-service.log.8:2024-03-04 13:24:25,706 INFO  Avni ETL_Worker-1 [etl.service.EtlService] Running ETL for Schema: jss, DB User: jss, Schema User: jss
etl-service.log.8:2024-03-04 13:51:24,627 INFO  Avni ETL_Worker-1 [etl.service.EtlService] Completed ETL for schema jss with dbUser jss and schemaUser jss
etl-service.log.8:2024-03-04 14:53:45,289 INFO  Avni ETL_Worker-1 [etl.service.EtlService] Running ETL for Schema: jss, DB User: jss, Schema User: jss
etl-service.log.8:2024-03-04 14:53:50,418 INFO  Avni ETL_Worker-1 [etl.service.EtlService] Completed ETL for schema jss with dbUser jss and schemaUser jss

@mahalakshme
Copy link
Author

@himeshr yes, yesterday it had failed, not it has passed.

@mahalakshme mahalakshme moved this from QA Failed to In QA in Avni Product Mar 5, 2024
@mahalakshme
Copy link
Author

@himeshr for many ids I found, WFA didnt have double point precision. To find out if it is bcoz of trailing zero, checked one randomly and found that was not the case. This is for program encounter with id, 913973.

When calculated via calculator it was found to be, -2.69. Kindly find the attached image below for the calculation.
Screenshot 2024-03-05 at 4 43 00 PM

In db it is present as -2.7 with single decimal precision.

@mahalakshme mahalakshme moved this from In QA to QA Failed in Avni Product Mar 5, 2024
@himeshr
Copy link

himeshr commented Mar 5, 2024

The update file does have the sql query to make this correction.
I found a large number of program_encounters without migration_history, which shouldn't be the case, for jss.
Might have inadvertently terminated the sql transactions running in the background.

Will re-trigger the psql update in a background process with command logs captured using nohup to validate that the entire sql file was executed correctly on the prod db.

@himeshr himeshr moved this from QA Failed to In Progress in Avni Product Mar 5, 2024
@mahalakshme
Copy link
Author

mahalakshme commented Mar 25, 2024

@vinayvenu @petmongrels In this comment, I feel the following:

  • 1st point moved to this card
  • 2nd point is part of AC, hence can remain in this card
  • 3rd point I feel we need to fix since we cant tell them to update since 1st point is still an issue, hence I feel we need to fix. And I feel the fix is similar to that of 2nd point. Hence I feel it can be part of this card.

For 2nd and 3rd point whoever does it, can make it foolproof to write the test as well.

Let me know once you both review.

@mahalakshme
Copy link
Author

@himeshr also I had already mentioned this

so 2nd point is within the scope, for other points I ve mentioned the reasons above.

@petmongrels
Copy link

To do this following is probably required.

  • JS based verification of data so that manual testing is not required and we know when we are done. This may cover only double precision if that is the only part remaining.
  • A script that can be run on incremental pushes that may happen to our data from the user's devices where the calculation is incorrect. We can run this once after say 15 days of running it the first time. Whatever remains after 15 days, we leave it as it is.

@mahalakshme
Copy link
Author

@petmongrels I see the above 2 points will address the points 2 and 3 here and I see solution for both will be the same. So I am thinking both can be part of this card. Let me know your thoughts.

@petmongrels
Copy link

we will need two cards I think.

  1. to fix and verify (along with the script to run later)
  2. just to run later after 15 days

This card is burdened by context. If I pick up I don't know what to work on, so likely will ignore everything mentioned here and just go by the automated test. The test should be written based on the acceptance criteria of remaining work.

oh and there is one more thing. what happens if we run the test and find out that the client is not sending the right numbers to the server. so we perhaps needs a third card to very sometime in between. depending on what we see in the data we will have to take call how to go about fixing them.

@mahalakshme
Copy link
Author

mahalakshme commented Mar 25, 2024

@petmongrels Its implicit that whoever fixes it needs to also make sure its run for all rows. I am not convinced with the reason of context because of which it need to be moved to separate card. I feel for the 2nd point here the AC still remains the same. The comments here can help the person who picks up the card, to be aware of the issues that he might face while fixing.

for 3rd point, I will create a separate card, which can be picked up after release.

@vinayvenu if you think separate card needs to be created for 2nd point here and if you have bandwidth, you can create. I ve already spent too much time testing this card and dont want to spend additional time for this.

@mahalakshme
Copy link
Author

mahalakshme commented Mar 25, 2024

@vinayvenu @petmongrels For the 2nd point here, It is like duplicate analysis work for me if I need to create separate card, because development was not complete and I need to understand now what developer has completed. If this continues, then the analyst time will not be spent efficiently.

@petmongrels
Copy link

Will we be able to specify the requirement in form of JS based test? If so then we can first write that test in a separate card and code review it. After that we can play this card. Actually there should be nothing to QA on this card after that.

@mahalakshme
Copy link
Author

@petmongrels I think it is upto the developer how he wants to verify it, whether JS or SQL or any other method. I think testing after development by developer is part of all cards. So I cant understand why it should be part of separate card.

Also I think QA process should not be missed because test will be written by developer and it also can have issues. So from a QA perspective whatever needs to be verified can be verified.

@mahalakshme
Copy link
Author

@petmongrels 1st point not an issue(my mistake - checked last_modified_date_time instead of encounter_date_time for identifying user's version), 3rd point can be moved to separate card, this card is just for 2nd point.

Have added this: ---- As part of this verification, make sure to check the 2nd case mentioned here to the AC of this card.

@mahalakshme mahalakshme moved this from In Analysis Review to Ready in Avni Product Mar 25, 2024
@mahalakshme mahalakshme moved this from Ready to In Analysis Review in Avni Product Mar 25, 2024
@mahalakshme mahalakshme moved this from In Analysis Review to Ready in Avni Product Mar 25, 2024
@himeshr himeshr moved this from Ready to In Progress in Avni Product Mar 26, 2024
@himeshr himeshr self-assigned this Mar 26, 2024
@himeshr himeshr moved this from In Progress to Code Review Ready in Avni Product Mar 26, 2024
@himeshr
Copy link

himeshr commented Mar 26, 2024

Dry run on Local DB was successful.
updateSql.txt
localDBDumpDryRunResults.txt

@petmongrels petmongrels moved this from Code Review Ready to In Code Review in Avni Product Mar 27, 2024
@petmongrels
Copy link

Review notes

  • when verifying the wfa number (not whether it has been updated), expected value should be from the health modules output as that has been assumed to be the right now
  • we can decide to run this one more time, after this, at some point. after updates to the value from the mobile device after that will not be updated and it will remain wrong

@petmongrels petmongrels moved this from In Code Review to QA Ready in Avni Product Mar 27, 2024
@himeshr
Copy link

himeshr commented Mar 27, 2024

Successfully executed the Updates for the input.json generated on 26 March 2024 from JSS prod organisation.
Validation was also successful, all updates had been correctly applied to db.

prodSqlExecutionOutput.txt

@himeshr
Copy link

himeshr commented Mar 27, 2024

Pending task:

After 15 day, perform steps mentioned in Readme to make z-score corrections to "over-written" entries or the "newly inserted entries from old version of app".

@himeshr
Copy link

himeshr commented Mar 28, 2024

No Additional QA required for now.
It needs to only be re-executed sometime around April 10th 2024, 15 days from March 26th 2024.
Moving to hold for now, to avoid QA team from picking it up for validation.

@himeshr himeshr moved this from QA Ready to Hold in Avni Product Mar 28, 2024
@himeshr himeshr moved this from Hold to In Progress in Avni Product Apr 15, 2024
@himeshr
Copy link

himeshr commented Apr 15, 2024

Ran the migration one last time today, for all records which were out of sync as of 15 April 2024, 1:50 PM IST.
validationSqlOutput.txt

@himeshr himeshr closed this as completed Apr 15, 2024
@github-project-automation github-project-automation bot moved this from In Progress to Done in Avni Product Apr 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

No branches or pull requests

4 participants