Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DO NOT MERGE - [PERF-286] Update pipeclean to work more generically with an Ed-Fi API #66

Draft
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

stephenfuqua
Copy link
Contributor

⚠️ Tested against Meadowlark - there are some known problems both with meadowlark and, at this time, with the performance tests

⛔ Not yet tested against the ODS/API

Probably needs some additional work to get it ready for ODS/API.

@stephenfuqua stephenfuqua marked this pull request as draft March 1, 2023 21:38
@@ -15,8 +15,9 @@ class CourseOfferingClient(EdFiAPIClient):

def create_with_dependencies(self, **kwargs):
school_id = kwargs.pop("schoolId", SchoolClient.shared_elementary_school_id())
school_year = kwargs.get("schoolYear", 2014)
session_reference = self.session_client.create_with_dependencies(
school_year = kwargs.pop("schoolYear", 2014)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pop instead of get: without the pop, the schoolYear ends up directly in the "root" of the course_offering, which is not correct. ODS/API ignores that overposting, but Meadowlark does not.

else:
return None

return headers[header].split("/")[-1].strip()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ODS/API and Meadowlark return different cases. Per MDN: "An HTTP header consists of its case-insensitive name". There's probably a better solution available, like lower-casing all headers and looking for "location". This works for now.

),
educationOrganizationReference__educationOrganizationId=kwargs.pop(
"educationOrganizationId", SchoolClient.shared_elementary_school_id()
),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed temporary variable assignment while diagnosing something else. Not a very consequential refactor, and I didn't bother following that pattern elsewhere.

@@ -470,7 +467,7 @@ def create_with_dependencies(self, **kwargs):
gradingPeriodReference__periodSequence=period_reference["attributes"][
"periodSequence"
],
objectiveCompetencyObjectiveReference__objective=objective_reference[
competencyObjectiveReference__objective=objective_reference[
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of many examples where something was simply wrong, perhaps because the original programmer was looking at the Data Standard instead of the API schema, and thus did not realize that the objective prefix would be removed.

# force first letter to be capitalized
descriptor_type = descriptor_type[:1].capitalize() + descriptor_type[1:]

return f"{'uri://ed-fi.org/'}{descriptor_type}#{value}"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The case sensitivity on strings is important in Meadowlark, so XyzDescriptor and xyzDescriptor are two different things.

namespace = "uri://ed-fi.org"
learningResourceMetadataURI = "uri://ed-fi.org/whatever"
contentClassDescriptor = "uri://ed-fi.org/ContentClassDescriptor#Presentation"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dealing with the known problem of a required Choice: ODS/API does not enforce, and Meadowlark does. I have proposed that we standardize on what the ODS/API does, which means changing Meadowlark. In the meantime, these extra fields at least let the processing continue without error against Meadowlark.



class GradingPeriodFactory(APIFactory):
periodSequence = UniquePrimaryKeyAttribute()
periodSequence = 1
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

king this random was previously causing failures when trying to create other resources that rely on a GradingPeriod. That probably worked fine with the ODS/API because it has been using the Northridge database. For meadowlark, which would probably take 12+ hours to load Northridge at this point, I only pre-loaded a very minimal data set, not including grading periods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must have copied and pasted the text above into the PR, but missed the beginning of the first sentence. I don't know what I was trying to say at the beginning. Point is, I didn't see any reason to have a randomly-assigned periodSequence, and it was making life harder for working with Meadowlark.

@@ -28,73 +30,9 @@ class StudentFactory(APIFactory):
lastSurname = "Jones"
generationCodeSuffix = "JR"
personalTitlePrefix = "Mr"
hispanicLatinoEthnicity = False
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Someone must have been looking at the Data Standard 2.x definition instead of 3.x when writing this!

),
programName="Gifted and Talented",
programTypeDescriptor=build_descriptor("ProgramType", ProgramClient.shared_program_name),
programName=ProgramClient.shared_program_name,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pattern repeats across a number of Programs. Again because I loaded a much more minimal dataset, the tests could not rely on "Gifted and Talented" to exist already. I thought about trying to create the additional program automatically, but instead just connected to the one Program that the pipeclean tests are already creating: Bilingual.

default_attributes["educationOrganizationReference"][
"educationOrganizationId"
] = SchoolClient.shared_elementary_school_id()
# This resource has no non-key attributes, so there is nothing to update
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ODS/API interestingly does not invalidate a PUT request that tries to change a natural key - instead it just ignores those changes. Meadowlark invalidates the request and responds with 400.

def it_formats_using_current_year() -> None:
month = 12
day = 1
expected = str(date.today().year) + "-12-01"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For practice, I rewrote these tests using pytest-describe.

@stephenfuqua stephenfuqua changed the title [PERF-286] Update pipeclean to work more generically with an Ed-Fi API DO NOT MERGE - [PERF-286] Update pipeclean to work more generically with an Ed-Fi API Jun 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant