Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update NIHMS metadata, Adding Grant and Funder to bulk_meta.xml #79

Merged
merged 35 commits into from
Jan 12, 2024

Conversation

tsande16
Copy link
Contributor

@tsande16 tsande16 commented Dec 15, 2023

Adding the Grant and Funder information to the serialized XML bulk_meta.xml provided to NIHMS.

Unit test added/updated:

  • NihmsMetadataSerializerTest
    • testSerializedMetaDataGrants()
  • NihmsAssemblerIT
    • testPackageMetadata()

Update 1.10.2024:

  • The funder-mapping now exists in the options of the assembler in respositories.json
  • This json file will need to be updated in STAGE and PROD

@tsande16 tsande16 self-assigned this Dec 15, 2023
@tsande16 tsande16 linked an issue Dec 18, 2023 that may be closed by this pull request
- Add tests for the grants and changed test data
- Should be 8 authors now, adding 1 more PI + 2 co-pis
Copy link
Contributor

@rpoet-jh rpoet-jh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job, Tim. Just a few comments.

@@ -210,7 +210,40 @@ private void write_metadata(Document doc) {
}
}

// Could add grant information here if it was useful.
// Can only be used for a set list of funders
List<DepositMetadata.Grant> grantsList = metadata.getGrantsMetadata();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is already very long. It would be good to pull this new code out into a new method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree write_metadata is rather long. I added the grants here because it fit with the rest. I didn't want to touch what was already written, but perhaps I could create a method for each component of the XML file: manuscript, article, journal etc? Something along the lines of:

private void write_metadata(Document doc) {
    Element root = initializeRootElement(doc);

    addManuscriptMetadata(doc, root);
    addArticleMetadata(doc, root);
    addJournalMetadata(doc, root);
    addManuscriptTitle(doc, root);
    addContacts(doc, root);
    addGrants(doc, root);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about creating a method just for the grants, but it would seem awkward for only that component to have it's own method?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, your suggestion would be good clean up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Element grantElement = doc.createElement("grant");
grantsElement.appendChild(grantElement);

if (grant.getGrantId() != null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at these conditionals, it is possible we could end up with a grant element with no attributes. Should the conditionals be changed to check for at least one non-null attribute before creating the grant element?

Same question for the PI element below.

Copy link
Contributor Author

@tsande16 tsande16 Dec 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good point. Now that I'm thinking about it some more, I think all these attributes represent a minimum set. I would say at least for the grant element it should definitely have an award number and funder before writing out the grant element.

  • I checked the data in stage, and there doesn't exist a grant without an awardnumber or funder.
  • In staging, all PIs associated with a grant have a first name and last name. There are some cases where email is null.

So maybe all attributes are required, except for the email? I'm not sure if a grant should exist without a PI. I will have to double check that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check has been added to ensure at least 1 nihms funder grant exists and then will create the grant element.

grantElement.setAttribute("id", grant.getGrantId());
}

if (grant.getFunder() != null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be changed to use StringUtils.isNotBlank?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -241,6 +244,49 @@ public void testPackageMetadata() throws Exception {
candidate.getType() == person.getType()));
});

//Ensure that the grants in the metadata matches a Grant on the submission, Check the attributes of a grant on
//submission against what is found in the metadata
List<Element> grantElements = asList(root.getElementsByTagName("grant"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably cleaner to pull this out into a separate method if not too many params are needed to be passed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question as the first comment. Should I do the same for the other components?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method isn't as long, so probably just adding a separate method to test the grant/funder change is fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@tsande16
Copy link
Contributor Author

@markpatton @rpoet-jh

From our discussion this morning. Will need to add the funders abbreviations that aligns with the NIHMS dtd. Will add a configuration in the repositories.json and add a property to RepositoryDepositConfig

@tsande16 tsande16 requested a review from rpoet-jh January 11, 2024 13:29
Copy link
Contributor

@rpoet-jh rpoet-jh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job, Tim! Just a couple minor comments.

@@ -28,6 +31,9 @@ public class RepositoryDepositConfig {
@JsonProperty("mapping")
private StatusMapping statusMapping;

@JsonProperty("funder-mapping")
private Map<String, String> funderMapping = new HashMap<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, that's not longer used. Good catch!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

* @return True if at least one nihms funderKey exists in the metadata grant list
*/
private boolean funderLocalKeyExists(List<String> metaDataGrantFunderKeys) {
for (String key : metaDataGrantFunderKeys) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can use Stream.anyMatch here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's right. Much cleaner using Stream.anyMatch

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -31,7 +31,44 @@
"algorithms": [
"sha512",
"md5"
]
],
"funder-mapping": {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking we don't want this map here in the src repositories.json since it is specific to JHU. I think it should be removed, which means you will need to create a new test-repositories.json most likely to test your code, but I think it is cleaner to this for the community.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

- This is institution specific. Has been moved to the test resources for testing JHU specific needs.
@tsande16 tsande16 merged commit 0c6a2e7 into main Jan 12, 2024
2 checks passed
@tsande16 tsande16 deleted the 828-update-nihms-metadata-grant-funder branch January 12, 2024 21:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update Nihms metadata to add grant and funder info
2 participants