Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Foundations for Warnings System #161

Merged
merged 16 commits into from
Dec 13, 2024
Merged

Foundations for Warnings System #161

merged 16 commits into from
Dec 13, 2024

Conversation

kennsippell
Copy link
Member

@kennsippell kennsippell commented Apr 29, 2024

Property Values

Currently, a staged place stores only the "original values" which come from the user. When code wants the corresponding "formatted value" for that propertly, it uses the validation library to format it on demand. When importing large CSV lists of CHUs/CHPs (like the Narok dataset), this results in >300,000 calls of Validation.formatSingle. This isn't that slow; but it slowed down a lot when we started autogenerating properties #47.

The work happening in #20 requires more formatted of property data. Although the performance gains from this change aren't huge (below), I view these changes as a required foundation for that work lest things get even slower.

This change therefore validates and formats the inputs once. The result is persisted as an IPropertyValue which contains .formatted and .original values. This adds complexity in the place datamodel, but simplifies upstream classes like RemotePlaceResolver.

The interface to the Validation library has changed significantly. New interface is to create a new ValidatedPropertyValue().

Remote Place Caching

Currently, the ChtApi.getPlacesWithType downloads all documents of a particular type and caches a subset of the data (id, name, lineage) as a RemotePlace. The warning system is going to need more data about these places so we can check for uniqueness (eg. unique facility codes). The doc's name information should also now be stored as "Property Value" and so this resulted in a reduced role of ChtApi (just fetches docs now), an increased role for RemotePlaceCache (it maps the doc down to a reduced set for caching), and a bit more complexity in Config.

Performance Measures

For Narok upload performance set (1640 CHPs)
Before: 1170ms
After: 601ms

Also includes

Overall, I think this is a fairly risky change because it touches every scenario we support and so we should test it well.

@kennsippell kennsippell marked this pull request as draft April 29, 2024 20:32
@kennsippell kennsippell requested a review from freddieptf May 1, 2024 23:12
@kennsippell kennsippell marked this pull request as ready for review May 1, 2024 23:21
@kennsippell
Copy link
Member Author

Hey @freddieptf - This is a big PR but it was born out of my attempt to split up #20 into smaller PRs. Let me know if you think I should split it up even more for easier review. Let me know if you'd like to review in person.

@kennsippell kennsippell mentioned this pull request May 8, 2024
src/config/index.ts Outdated Show resolved Hide resolved
src/config/index.ts Outdated Show resolved Hide resolved
Config.getUserRoleConfig(contactType),
];

Config.getPropertyWithName(contactType.place_properties, 'name');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not related to the PR but nothing hints that getPropertyWithName might actually throw an error which i think makes it hard to reason about error handling in functions that call it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true. now that we assert it on startup, maybe we don't need it in the property? we know it is also there

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we don't need it in the property

Yeah i think it would be clearer if getPropertyWithName returned a string or undefined; then the caller can choose to assert if undefined and throw if needed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we assert at startup, we can guarantee it is never undefined at runtime
this allows us to safely make it a string which takes the burden off the calling code
i personally like this since it leads to cleaner calling code. would you rather i not assert in the get and just use typescript operators to cast?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't we get a runtime error if the property name that was being accessed was not asserted at start up like name is? If this function is only used for getting name then maybe it can be renamed and args changed?

src/index.ts Outdated Show resolved Hide resolved
@@ -172,26 +161,15 @@ export class ChtApi {
};

getPlacesWithType = async (placeType: string)
: Promise<RemotePlace[]> => {
const url = `medic/_design/medic-client/_view/contacts_by_type_freetext`;
: Promise<any[]> => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this array of type RemotePlace by default?

Copy link
Member Author

@kennsippell kennsippell May 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not yet. Right now it returns the raw documents from CouchDB. I use any to capture that.
RemotePlaceCache.fetchRemotePlaces now converts the CouchDB doc into a RemotePlace. I didn't want to cache the full object from CouchDB because it could be quite large and I want things to work with our validation/formatting/generation stuff.

public static async add(place: Place, chtApi: ChtApi): Promise<void> {
const domainStore = await RemotePlaceCache.getDomainStore(chtApi, place.type.name);
domainStore.push(place.asRemotePlace());
public static add(place: Place, chtApi: ChtApi): void {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we can figure out if a local place is created on a remote instance, do we still need this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason we need add is so we don't have to refresh the full cache (heavy requests to instance) after creating a place through the management tool. We just shim the "local one" into the cache cache and it pretends to be remote (which it is, if everything went well).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes i understand that, the reason i bring it up is the comments here

// sadly, sometimes invalid or uncreated objects "pretend" to be remote
// should reconsider this naming
type: 'remote' | 'local' | 'invalid';

I noticed we can convert a Place to a Remote place even when it's not created on the server. I think it'd make sense to treat remote places strictly as places we fetched from the server then we could rid of the type field. During search, the remote places can be joined with places that have isCreated set to true (maybe under an interface defining the fields needed?). Does that make sense?

Copy link
Member Author

@kennsippell kennsippell May 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

During search, the remote places can be joined with places that have isCreated set to true (maybe under an interface defining the fields needed?). Does that make sense?

I've tried this, and it's a bit complex. Mostly because you need the configuration object when convertingto get data from the Place into a similar structure that comes in RemotePlace. I spent like an entire day on it and I think you're right that what we have here is still unsatisfying.

I suspect Place should be called something like StagedPlace and it represents a row in the table to be created. I think RemotePlace initially represented only a place from CHT but it has been confused and overburdened. But I'm struggling to untangle it now. I agree that we need a third type which they both implement like SearchablePlace (or something alike)...

I think this is probably a lot of code churn and this PR is already large + is the base of #20. Doing that work here would cause a complex merge. Would you mind if we address it in a follow-up ticket?

Copy link
Member Author

@kennsippell kennsippell May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@freddieptf Thoughts? Do you think it is important enough to do it in this PR?

Copy link
Collaborator

@freddieptf freddieptf May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think that can be addressed in another PR

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are your views on doing the validation during creation instead of introducing the PropertyValue classes? I think PlaceFactory would be perfect for this and some of the existing methods in Place

Copy link
Collaborator

@freddieptf freddieptf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks mostly fine to me, my only concern is the growing amount of Indirection in the codebase introduced by classes/interfaces. For a CRUD-ish app I think things would be much simpler if we heavily relied on immutable objects and pure functions

Copy link
Collaborator

@freddieptf freddieptf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some observations:

  • Casing isn't fully preserved, if i have an existing place with a name in all lowercase, eg. county and search for it I get the formatted name County
  • The name formatting might be a bit aggressive. A silly example, i can't create a CHU with the name Example CHU One because the CHU will get stripped out and the name formatted to Example One Community Health Unit
  • Saving credentials file is broken

@kennsippell
Copy link
Member Author

For the first case, aren't these all names and should be capitalized? Why would we want an uncapitalized name?

Is the second case "example CHU one" real or hypothetical?

@freddieptf
Copy link
Collaborator

For the first case, aren't these all names and should be capitalized? Why would we want an uncapitalized name?

They should be, i was just noting that what is shown on the search results isn't necessarily what is on the instance in this case.

Is the second case "example CHU one" real or hypothetical?

Hypothetical, i was demonstrating that formatting has a lot of magic right now which might not be desirable in some cases.

@kennsippell
Copy link
Member Author

kennsippell commented Dec 2, 2024

@freddieptf Just checking if you need anything from me to merge this PR forward?
I believe the above are observations - is that correct? Not feedback you're wanting addressed here?

@freddieptf
Copy link
Collaborator

freddieptf commented Dec 2, 2024

Saving credentials file is broken

When testing i couldn't download the credentials file, this needs to be addressed.

@kennsippell
Copy link
Member Author

OK @freddieptf, I've fixed that reported problem with the credentails file.

Copy link
Collaborator

@freddieptf freddieptf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be good to go after fixing lint error

@kennsippell kennsippell merged commit 90ef80d into main Dec 13, 2024
1 check passed
@kennsippell kennsippell deleted the format-once branch December 13, 2024 18:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants