Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update NIHMS metadata, Adding Grant and Funder to bulk_meta.xml #79

Merged
merged 35 commits into from
Jan 12, 2024
Merged
Show file tree
Hide file tree
Changes from 33 commits
Commits
Show all changes
35 commits
Select commit Hold shift + click to select a range
b3c6343
Add Grant class
tsande16 Dec 14, 2023
72d15a9
Update DepositSubmissionMapper
tsande16 Dec 14, 2023
21b7584
Add createGrant
tsande16 Dec 14, 2023
d67069c
Add writing grants to write_metadata
tsande16 Dec 15, 2023
6b6573f
Update testSerializedMetaDataGrants
tsande16 Dec 15, 2023
a90e231
Add funder and grant to testPackageMetadata()
tsande16 Dec 15, 2023
a3e4080
Remove convert doc to string
tsande16 Dec 18, 2023
ab148a3
Add new grants to sample1.json
tsande16 Dec 18, 2023
c02a9a0
Add assertions
tsande16 Dec 18, 2023
7771af8
Update DepositSubmissionModelBuilderIT
tsande16 Dec 18, 2023
23ffc13
Update commonContributorsAndFewAuthors()
tsande16 Dec 19, 2023
d571614
Refactor write_metadata
tsande16 Dec 20, 2023
f0d6471
Refactor testPackageMetadata()
tsande16 Dec 20, 2023
9123db4
Fix grantElement
tsande16 Dec 20, 2023
b1e2591
Cleanup
tsande16 Dec 20, 2023
40dbdee
Add funder-mapping to repositories.json and RepositoryDepositConfig
tsande16 Dec 20, 2023
cbcc853
Cleanup
tsande16 Dec 20, 2023
85fcb7d
Add funder-mapping to repositories.json
tsande16 Jan 3, 2024
1ad3b52
Add localKey to metadata model
tsande16 Jan 8, 2024
39874de
Add getRepositoryConfig to Packager
tsande16 Jan 9, 2024
a70b638
Remove getRepositoryConfig to Packager
tsande16 Jan 9, 2024
3a0c45f
Move funder mapping to the assembler options
tsande16 Jan 9, 2024
bc1b17d
Add check for funderKey in NihmsMetadataSerializer
tsande16 Jan 9, 2024
6cb893d
Cleanup
tsande16 Jan 9, 2024
5be99d0
Add package option and funderMapping check
tsande16 Jan 10, 2024
6d18160
Update test data and IT setup
tsande16 Jan 10, 2024
9b63b14
Fix funderKey
tsande16 Jan 10, 2024
8000aa2
Fix NihmsAssemblerIT
tsande16 Jan 10, 2024
68a08ba
Cleanup
tsande16 Jan 10, 2024
1d05b45
Cleanup
tsande16 Jan 10, 2024
adc236c
Update NihmsMetadataSerializer constructor
tsande16 Jan 10, 2024
6d28894
Cleanup
tsande16 Jan 10, 2024
1d79d1d
Update packageOptions in NihmsThreadedAssemblyIT
tsande16 Jan 10, 2024
9152306
Cleanup
tsande16 Jan 12, 2024
c5da2e4
Remove funder-mapping from respositories.json
tsande16 Jan 12, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -98,4 +98,11 @@ enum OPTS {

}

/**
* Funder Mapping for Nihms Packages
*/
interface FunderMapping {
String KEY = "FUNDER-MAPPING";
}

}
Original file line number Diff line number Diff line change
Expand Up @@ -100,6 +100,8 @@ public DepositSubmission createDepositSubmission(Submission submissionEntity, Li
metadata.setJournalMetadata(journal);
ArrayList<DepositMetadata.Person> persons = new ArrayList<>();
metadata.setPersons(persons);
ArrayList<DepositMetadata.Grant> grants = new ArrayList<>();
metadata.setGrantsMetadata(grants);

// Data from the Submission resource
submission.setId(submissionEntity.getId());
Expand Down Expand Up @@ -129,6 +131,7 @@ public DepositSubmission createDepositSubmission(Submission submissionEntity, Li
// Data from the User resources for the PI and CoPIs
User piEntity = grantEntity.getPi();
persons.add(createPerson(piEntity, DepositMetadata.PERSON_TYPE.pi));
grants.add(createGrant(piEntity, grantEntity));

for (User copiEntity : grantEntity.getCoPis()) {
persons.add(createPerson(copiEntity, DepositMetadata.PERSON_TYPE.copi));
Expand Down Expand Up @@ -188,6 +191,15 @@ private DepositMetadata.Person createAuthor(String fullName) {
return person;
}

private DepositMetadata.Grant createGrant(User userEntity, Grant grantEntity) {
DepositMetadata.Grant grant = new DepositMetadata.Grant();
grant.setGrantId(grantEntity.getAwardNumber());
grant.setGrantPi(createPerson(userEntity, DepositMetadata.PERSON_TYPE.pi));
grant.setFunder(grantEntity.getPrimaryFunder().getName());
grant.setFunderLocalKey(grantEntity.getPrimaryFunder().getLocalKey());
return grant;
}

/**
* Convenience method for retrieving a string property.
*
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,9 @@
*/
package org.eclipse.pass.deposit.config.repository;

import java.util.HashMap;
import java.util.Map;

import com.fasterxml.jackson.annotation.JsonProperty;

/**
Expand All @@ -28,6 +31,9 @@ public class RepositoryDepositConfig {
@JsonProperty("mapping")
private StatusMapping statusMapping;

@JsonProperty("funder-mapping")
private Map<String, String> funderMapping = new HashMap<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, that's not longer used. Good catch!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


public DepositProcessing getDepositProcessing() {
return depositProcessing;
}
Expand All @@ -44,6 +50,14 @@ public void setStatusMapping(StatusMapping statusMapping) {
this.statusMapping = statusMapping;
}

public Map<String, String> getFunderMapping() {
return funderMapping;
}

public void setFunderMapping(Map<String, String> funderMapping) {
this.funderMapping = funderMapping;
}

@Override
public boolean equals(Object o) {
if (this == o) {
Expand Down Expand Up @@ -72,6 +86,6 @@ public int hashCode() {
@Override
public String toString() {
return "RepositoryDepositConfig{" + "depositProcessing=" + depositProcessing +
", statusMapping=" + statusMapping + '}';
", statusMapping=" + statusMapping + " funderMapping=" + funderMapping + "}";
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,10 @@
import java.time.LocalDate;
import java.time.Period;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
Expand All @@ -30,6 +33,8 @@
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.apache.commons.lang3.StringUtils;
import org.eclipse.pass.deposit.assembler.PackageOptions;
import org.eclipse.pass.deposit.assembler.SizedStream;
import org.eclipse.pass.deposit.model.DepositMetadata;
import org.eclipse.pass.deposit.model.JournalPublicationType;
Expand All @@ -48,9 +53,19 @@ public class NihmsMetadataSerializer implements StreamingSerializer {
private static final Logger LOG = LoggerFactory.getLogger(NihmsMetadataSerializer.class);

private DepositMetadata metadata;
private Map<String, String> funderMapping = new HashMap<>();

public NihmsMetadataSerializer(DepositMetadata metadata) {
public NihmsMetadataSerializer(DepositMetadata metadata, Map<String, Object> packageOptions) {
this.metadata = metadata;
Object funderMappingObj = packageOptions.get(PackageOptions.FunderMapping.KEY);
if (funderMappingObj instanceof Map<?, ?>) {
this.funderMapping = ((Map<?, ?>) funderMappingObj).entrySet().stream()
.collect(Collectors.toMap(
e -> (String) e.getKey(),
e -> (String) e.getValue()));
} else {
throw new ClassCastException("FunderMapping is not a Map<String, String>");
}
}

@Override
Expand Down Expand Up @@ -92,16 +107,27 @@ private void add_text_element(Document doc, Element parent, String name, String
}

private void write_metadata(Document doc) {
DepositMetadata.Manuscript manuscript = metadata.getManuscriptMetadata();
DepositMetadata.Article article = metadata.getArticleMetadata();
DepositMetadata.Journal journal = metadata.getJournalMetadata();

Element root = doc.createElement("manuscript-submit");
doc.appendChild(root);

if (manuscript.getNihmsId() != null) {
addManuscriptMetadata(root);
addArticleMetadata(root);
addJournalMetadata(doc, root);
addManuscriptTitle(doc, root);
addContacts(doc, root);
addGrants(doc, root);
}

private void addManuscriptMetadata(Element root) {
DepositMetadata.Manuscript manuscript = metadata.getManuscriptMetadata();

if (StringUtils.isNotBlank(manuscript.getNihmsId())) {
root.setAttribute("manuscript-id", manuscript.getNihmsId());
}
}

private void addArticleMetadata(Element root) {
DepositMetadata.Article article = metadata.getArticleMetadata();

if (article != null && metadata.getArticleMetadata().getEmbargoLiftDate() != null) {
LocalDate lift = article.getEmbargoLiftDate().toLocalDate();
Expand All @@ -128,6 +154,10 @@ private void write_metadata(Document doc) {

root.setAttribute("doi", path);
}
}

private void addJournalMetadata(Document doc, Element root) {
DepositMetadata.Journal journal = metadata.getJournalMetadata();

// There is an optional agency attribute.
// Should only be used for non-NIH funders when grant information is also given
Expand All @@ -143,7 +173,7 @@ private void write_metadata(Document doc) {
// See https://github.com/OA-PASS/metadata-schemas/pull/28 and
// https://github.com/OA-PASS/jhu-package-providers/issues/16
journal.getIssnPubTypes().values().forEach(issnPubType -> {
if (issnPubType.pubType == null || issnPubType.issn == null || issnPubType.issn.trim().isEmpty()) {
if (issnPubType.pubType == null || StringUtils.isBlank(issnPubType.issn)) {
LOG.debug("Discarding incomplete ISSN: {}", issnPubType);
return;
}
Expand All @@ -159,15 +189,21 @@ private void write_metadata(Document doc) {
add_text_element(doc, journal_meta, "journal-title", journal.getJournalTitle());
}

if (manuscript != null && manuscript.title != null) {
}

private void addManuscriptTitle(Document doc, Element root) {
DepositMetadata.Manuscript manuscript = metadata.getManuscriptMetadata();

if (manuscript != null && StringUtils.isNotBlank(manuscript.title)) {
add_text_element(doc, root, "manuscript-title", manuscript.title);
}

// Could add a citation element if we had all the information
}

private void addContacts(Document doc, Element root) {
List<DepositMetadata.Person> persons = metadata.getPersons();

if (persons != null && persons.size() > 0) {
if (persons != null && !persons.isEmpty()) {
Element contacts = doc.createElement("contacts");
root.appendChild(contacts);

Expand All @@ -177,20 +213,20 @@ private void write_metadata(Document doc) {
Element p = doc.createElement("person");
contacts.appendChild(p);

if (person.getFirstName() != null) {
if (StringUtils.isNotBlank(person.getFirstName())) {
p.setAttribute("fname", person.getFirstName());
} else {
if (person.getFullName() != null) {
if (StringUtils.isNotBlank(person.getFullName())) {
p.setAttribute("fname", person.getFullName().split("\\s")[0]);
}
}
if (person.getMiddleName() != null) {
if (StringUtils.isNotBlank(person.getMiddleName())) {
p.setAttribute("mname", person.getMiddleName());
}
if (person.getLastName() != null) {
if (StringUtils.isNotBlank(person.getLastName())) {
p.setAttribute("lname", person.getLastName());
} else {
if (person.getFullName() != null) {
if (StringUtils.isNotBlank(person.getFullName())) {
String[] split = person.getFullName().split("\\s");
if (split.length > 2) {
// middle name is present
Expand All @@ -201,16 +237,76 @@ private void write_metadata(Document doc) {
}
}
}
if (person.getEmail() != null) {
if (StringUtils.isNotBlank(person.getEmail())) {
p.setAttribute("email", person.getEmail());
}

p.setAttribute("person-type", "author");
}
}
}
}

private void addGrants(Document doc, Element root) {
List<DepositMetadata.Grant> grantsList = metadata.getGrantsMetadata();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is already very long. It would be good to pull this new code out into a new method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree write_metadata is rather long. I added the grants here because it fit with the rest. I didn't want to touch what was already written, but perhaps I could create a method for each component of the XML file: manuscript, article, journal etc? Something along the lines of:

private void write_metadata(Document doc) {
    Element root = initializeRootElement(doc);

    addManuscriptMetadata(doc, root);
    addArticleMetadata(doc, root);
    addJournalMetadata(doc, root);
    addManuscriptTitle(doc, root);
    addContacts(doc, root);
    addGrants(doc, root);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about creating a method just for the grants, but it would seem awkward for only that component to have it's own method?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, your suggestion would be good clean up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


if (grantsList != null && !grantsList.isEmpty()) {
List<String> funderKeys = grantsList.stream()
.map(DepositMetadata.Grant::getFunderLocalKey)
.toList();

//Only create the top level grant element if at least one of the keys is associated with nihms
if (funderLocalKeyExists(funderKeys)) {

Element grantsElement = doc.createElement("grants");
root.appendChild(grantsElement);

// Could add grant information here if it was useful.
// Can only be used for a set list of funders
for (DepositMetadata.Grant grant : grantsList) {
if (funderMapping.containsKey(grant.getFunderLocalKey())) {
if (StringUtils.isNotBlank(grant.getFunder())) {
Element grantElement = doc.createElement("grant");
grantsElement.appendChild(grantElement);

//use the nihms abbreviations for funders, the accepted list is in the nihms DTD
grantElement.setAttribute("funder", funderMapping.get(grant.getFunderLocalKey()));

if (StringUtils.isNotBlank(grant.getGrantId())) {
grantElement.setAttribute("id", grant.getGrantId());
}

DepositMetadata.Person pi = grant.getGrantPi();
if (pi != null) {
Element piElement = doc.createElement("PI");
grantElement.appendChild(piElement);

if (StringUtils.isNotBlank(pi.getFirstName())) {
piElement.setAttribute("fname", pi.getFirstName());
}
if (StringUtils.isNotBlank(pi.getLastName())) {
piElement.setAttribute("lname", pi.getLastName());
}
if (StringUtils.isNotBlank(pi.getEmail())) {
piElement.setAttribute("email", pi.getEmail());
}
}
}
}
}
}
}
}

/**
* Used to determine if one of the grants localKeys exists in the list of accepted nihms funders
*
* @return True if at least one nihms funderKey exists in the metadata grant list
*/
private boolean funderLocalKeyExists(List<String> metaDataGrantFunderKeys) {
for (String key : metaDataGrantFunderKeys) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can use Stream.anyMatch here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's right. Much cleaner using Stream.anyMatch

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

if (funderMapping.containsKey(key)) {
return true;
}
}
return false;
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -74,7 +74,7 @@ public static String getNonCollidingFilename(String fileName, DepositFileType fi
public void start(DepositSubmission submission, List<DepositFileResource> custodialResources,
Map<String, Object> packageOptions) {
manifestSerializer = new NihmsManifestSerializer(submission.getManifest());
metadataSerializer = new NihmsMetadataSerializer(submission.getMetadata());
metadataSerializer = new NihmsMetadataSerializer(submission.getMetadata(), packageOptions);
}

/**
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -424,6 +424,7 @@ static Function<Deposit, TransportResponse> performDeposit(DepositWorkerContext

try {
packager = dc.packager();

packageStream = packager.getAssembler().assemble(
dc.depositSubmission(), packager.getAssemblerOptions());
packagerConfig = packager.getConfiguration();
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,44 @@
"algorithms": [
"sha512",
"md5"
]
],
"funder-mapping": {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking we don't want this map here in the src repositories.json since it is specific to JHU. I think it should be removed, which means you will need to create a new test-repositories.json most likely to test your code, but I think it is cleaner to this for the community.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

"johnshopkins.edu:funder:300032": "ahqr",
"johnshopkins.edu:funder:300293": "cdc",
"johnshopkins.edu:funder:300859": "cdc",
"johnshopkins.edu:funder:301459": "va",
"johnshopkins.edu:funder:300453": "epa",
"johnshopkins.edu:funder:303444": "hhmi",
"johnshopkins.edu:funder:300484": "nih",
"johnshopkins.edu:funder:300865": "nih",
"johnshopkins.edu:funder:300869": "nih",
"johnshopkins.edu:funder:300866": "nih",
"johnshopkins.edu:funder:308302": "nih",
"johnshopkins.edu:funder:305950": "nih",
"johnshopkins.edu:funder:302727": "nih",
"johnshopkins.edu:funder:300863": "nih",
"johnshopkins.edu:funder:300874": "nih",
"johnshopkins.edu:funder:300861": "nih",
"johnshopkins.edu:funder:300842": "nih",
"johnshopkins.edu:funder:303587": "nih",
"johnshopkins.edu:funder:303586": "nih",
"johnshopkins.edu:funder:306099": "nih",
"johnshopkins.edu:funder:300858": "nih",
"johnshopkins.edu:funder:301479": "nih",
"johnshopkins.edu:funder:300870": "nih",
"johnshopkins.edu:funder:303589": "nih",
"johnshopkins.edu:funder:300860": "nih",
"johnshopkins.edu:funder:300852": "nih",
"johnshopkins.edu:funder:302822": "nih",
"johnshopkins.edu:funder:302592": "nih",
"johnshopkins.edu:funder:303585": "nih",
"johnshopkins.edu:funder:303574": "nih",
"johnshopkins.edu:funder:300867": "nih",
"johnshopkins.edu:funder:303580": "nih",
"johnshopkins.edu:funder:301978": "nih",
"johnshopkins.edu:funder:305204": "aspr",
"johnshopkins.edu:funder:303395": "fda"
}
}
},
"transport-config": {
Expand Down
Loading