Additional Custom Metadata Fields #170
Generally yes, there are just a couple of things to clarify.
Namespaces
We have to assign a "short" identifier for each namespace. I propose the following:
|
Hi @slint, thanks for your reply! Ok, if I understand you correctly, you need me to amend the provided table with the types of fields and human-friendly labels. Is that correct? Regarding the namespace observation and the Now, I sent this by email but I think it got lost with our server issues from last week, so I'm sending it here again:
Thanks in advance, |
I think we should add all terms to vocab.plazi.org
Just wanted to point out that the human-friendly names are present in the ontology: Here's an extract of the
|
@slint not sure if you saw my reply to your comment, but I think I still need some guidance! Thanks in advance! |
I am sorry, I quickly read through and missed some points.
That's optional for the time being since it's only for visually showing them up on the Zenodo record page in the sidebar. I think based on Reto's recommendation for the labels, it could be easily done after we add them to the accepted terms (so no rush on this one).
I had no idea either, just read through DublinCore and found out randomly :) Now that I think of it, but maybe this is a longer discussion, aren't the creators and contributor fields that we already have on the form covering these values? We already actually serialize these in our DublinCore export format with the
So if, for example, you have "John Smith (ORCiD: 1234)" and "Jane Doe (ORCiD: 5678)" you could submit something like:
{
  ...,
  "custom": {
    "dwc:identifiedBy": [
      "John Smith",
      "1234",
      "Jane Doe",
      "5678"
    ]
  }
}
This would allow matching a search query on this custom keyword by any of the values, the exact name and/or the ORCiD (which might be good enough for now). Related to the |
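The flattening described in this comment can be sketched in Python. This is only an illustration: `flatten_people` is a hypothetical helper, not part of the Zenodo API, and the names and identifiers are the placeholder values from the example.

```python
def flatten_people(people):
    """Flatten (name, orcid) pairs into a single list of searchable values."""
    values = []
    for name, orcid in people:
        values.append(name)
        if orcid:  # people without an identifier contribute only their name
            values.append(orcid)
    return values


# Placeholder people taken from the example in the comment.
people = [("John Smith", "1234"), ("Jane Doe", "5678")]
custom = {"dwc:identifiedBy": flatten_people(people)}
```

Because every value sits in the same list, a search on the keyword can match either the exact name or the identifier, at the cost of no longer knowing which identifier belongs to which name.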
Hi @slint, I think you're right about the creator... but I'm checking because there is also a Thanks a lot, and, let me know if you need anything! |
All terms have been added to Sandbox and Production, so the following metadata example should be possible now:
curl -X POST "https://sandbox.zenodo.org/api/deposit/depositions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  --data @- << EOF
{
  "metadata": {
    ...
    "custom": {
      "ac:subjectOrientation": ["dorsal"],
      "dwc:identifiedBy": [
        "John Smith",
        "1234",
        "Jane Doe",
        "5678"
      ],
      "dc:rightsHolder": ["John Smith"]
    }
  }
}
EOF
@mguidoti maybe we can briefly discuss the next steps in today's call. |
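For anyone scripting the upload rather than using curl, the same request could be assembled with Python's standard library. This is a rough sketch: `build_deposit_request` is a hypothetical helper, the token is a placeholder, and the metadata carries only the custom block because the other fields are elided in the curl example.

```python
import json
import urllib.request


def build_deposit_request(metadata, access_token,
                          base_url="https://sandbox.zenodo.org"):
    """Build (but do not send) a POST request equivalent to the curl call."""
    body = json.dumps({"metadata": metadata}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/deposit/depositions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
        method="POST",
    )


request = build_deposit_request(
    {
        "custom": {
            "ac:subjectOrientation": ["dorsal"],
            "dc:rightsHolder": ["John Smith"],
        }
    },
    access_token="REPLACE_ME",  # placeholder, not a real token
)
# Sending is left to the caller, e.g. urllib.request.urlopen(request)
```

Serializing with `json.dumps` also sidesteps hand-written JSON mistakes such as trailing commas.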
The only two fields from the .csv missing in the table above are the
Could you confirm, map what you needed, and let me know when I can push it? The idea as we discussed in the previous meeting is to push to sandbox for evaluation first. Thanks! |
@mguidoti But the
IMHO one value per name and identifier is a way we can make this work for search purposes as well, and in case there are in the future specific fields for adding an identifier only, we can revisit and update the metadata accordingly. On our side both Sandbox and production systems have the custom keywords configuration deployed, so we can start testing the first uploads and see if the result looks good. We can have a call as well to check together and move things forward. |
Actually, I just came across tdwg/dwc#102, which argues about potentially adding the I'll also share here later a suggested mapping from the .csv file you shared, to the custom metadata with some full examples. On a side note, I'm not a big fan of the column/bar ( |
@mguidoti here's the first example of a Zenodo request JSON metadata mapping, using the first line from the .csv file. Some quick points to be figured out or discussed:
{
    'metadata': {
        # I used "dwc:eventDate", but it could be left empty (and
        # automatically take the current date of publishing)
        'publication_date': '2018-11-21',
        'upload_type': 'physicalobject',
        # TODO: What should be the title and description?
        'title': '???',
        'description': '???',
        #
        'related_identifiers': [
            # TODO: If this is the "label"/"master" record, it should link to the different photos of the specimen
            {
                'identifier': '10.5281/zenodo.XYZ',
                'relation': 'hasPart',
                'resource_type': 'image-photo',  # TODO: Or should it be "image-figure"?
            },
            {
                'identifier': '10.5281/zenodo.XYZ',
                'relation': 'hasPart',
                'resource_type': 'image-photo',  # TODO: Or should it be "image-figure"?
            },
        ],
        'license': 'cc-by',  # TODO: Or other?
        'communities': [{'identifier': 'biosyslit'}],
        # I concatenated "ac:captureDevice", and "ac:resourceCreationTechnique", but can also be left blank
        'method': 'GIGAmacro Magnify2, full-frame DSLR, 65 mm f2.8 macro-lens, twin-flash. focus stacking, 3:1, scale=5 mm',
        'creators': [{
            'name': 'T. Dikow',
            'orcid': 'https://orcid.org/0000-0003-4816-2909',
            'affiliation': 'USNM',  # Is this correct?
        }],
        'locations': [
            {
                'lat': '-23.56333', 'lon': '15.03278',
                'place': 'Namib-Naukluft National Park, Gobabeb, dunes W of Kuiseb riverbed',
            },
        ],
        'custom': {
            'dwc:scientificName': ['Eremohaplomydas gobabebensis Boschert and Dikow, 2021'],
            'dwc:scientificNameID': ['http://zoobank.org/745D49C1-62B8-4884-9F7F-2B82523373D3'],
            'dwc:catalogNumber': ['USNMENT01518012'],
            'dwc:kingdom': ['Animalia'],
            'dwc:phylum': ['Arthropoda'],
            'dwc:class': ['Insecta'],
            'dwc:order': ['Diptera'],
            'dwc:family': ['Mydidae'],
            'dwc:genus': ['Eremohaplomydas'],
            'dwc:specificEpithet': ['gobabebensis'],
            'dwc:scientificNameAuthor': ['Boschert and Dikow'],
            'dwc:scientificNameAuthorYear': ['2021'],
            'dwc:basisOfRecord': ['PreservedSpecimen'],
            'dwc:lifeStage': ['Adult'],
            'dwc:sex': ['male'],
            'dwc:individualCount': ['1'],
            'dwc:institutionCode': ['USNM'],
            'dwc:collectionCode': ['Entomology'],
            'dwc:typeStatus': ['Paratype'],
            # After discussions, we'll be using GBIF's DWC extension for ORCIDs
            'dwc:identifiedBy': ['Boschert, C.', 'Dikow, T.'],
            'gbif-dwc:identifiedByID': ['https://orcid.org/0000-0003-4816-2909'],
            # Note that if we had multiple ORCIDs we should be storing them as separate values:
            # 'gbif-dwc:identifiedByID': ['https://orcid.org/0000-0003-4816-2909', 'https://orcid.org/0000-0003-1234-5678'],
            'dwc:dateIdentified': ['2019'],
            'dwc:country': ['Namibia'],
            'dwc:stateProvince': ['Erongo'],
            'dwc:locality': ['Namib-Naukluft National Park, Gobabeb, dunes W of Kuiseb riverbed'],
            'dwc:decimalLatitude': ['-23.56333'],
            'dwc:decimalLongitude': ['15.03278'],
            'dwc:verbatimElevation': ['401 m'],
            'dwc:eventDate': ['2018-11-21'],
            # Using GBIF's DWC extension for ORCIDs
            'dwc:recordedBy': ['Dikow, T.'],
            'gbif-dwc:recordedByID': ['https://orcid.org/0000-0003-4816-2909'],
            'dwc:preparations': ['Pinned'],
            'ac:captureDevice': ['GIGAmacro Magnify2, full-frame DSLR, 65 mm f2.8 macro-lens, twin-flash'],
            'ac:resourceCreationTechnique': ['focus stacking, 3:1, scale=5 mm'],
            'ac:subjectOrientation': ['dorsal'],
            'ac:subjectPart': ['whole organism habitus'],
            'dc:rightsHolder': ['Smithsonian Institution - public domain'],
        }
    }
} |
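The custom block of a mapping like this could be generated from the .csv rather than written by hand. The sketch below assumes, purely for illustration, that the column headers are already the prefixed term names; the real layout of the shared .csv is not shown in this thread, and `row_to_custom` is a hypothetical helper.

```python
import csv
import io


def row_to_custom(row):
    """Wrap each non-empty column value in a one-element list, keyed by term."""
    return {
        term: [value.strip()]
        for term, value in row.items()
        if value and value.strip()  # empty cells are left out entirely
    }


# Tiny stand-in for the real file, with a few values from the example above.
sample = io.StringIO(
    "dwc:kingdom,dwc:family,dwc:sex,dwc:lifeStage\n"
    "Animalia,Mydidae,male,\n"
)
rows = list(csv.DictReader(sample))
custom = row_to_custom(rows[0])
```

Multi-valued columns (for example several identifiers in one cell) would need an extra splitting step that depends on the separator used in the file.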
Ok, so, replying to the points you raised: regarding
I think the Yes, totally agree for the I would say that the I guess it's ok to replicate the info from I guess we're close..! |
If the data can be captured in the
Would something like I'm not sure if there's already some domain-specific convention for that though, so @mguidoti, @myrmoteras (or someone else) might have some insight.
Some logical concatenation of already existing information might be enough.
Sounds good, we leave it empty and get automatically the publishing date 👍
I agree, just wasn't sure about the hierarchy/organization of the different objects.
Looks like it! Let's get this shipped 🙌 🚀 🐜 |
I would recommend following what GBIF does, because they are the trendsetter thanks to their usage and thus contribute to it becoming a "standard". Also, there is a large effort in the biodiv community to assign ORCIDs to persons, which will increase their use. Our ORCIDs might be scarce, but the few will play a decisive role, especially in the publishing world (Pensoft, EJT) |
In that case, I've added the
'custom': {
    ...,
    'dwc:identifiedBy': ['Boschert, C.', 'Dikow, T.'],
    'gbif-dwc:identifiedByID': ['https://orcid.org/0000-0003-4816-2909'],
    # Note that if we had multiple ORCIDs we should be storing them as separate values:
    # 'gbif-dwc:identifiedByID': ['https://orcid.org/0000-0003-4816-2909', 'https://orcid.org/0000-0003-1234-5678'],
    'dwc:recordedBy': ['Dikow, T.'],
    'gbif-dwc:recordedByID': ['https://orcid.org/0000-0003-4816-2909'],
    ...
},
... |
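Keeping names and ORCIDs in separate terms could be produced with something like the following sketch. `people_to_custom` and the input record shape are assumptions made for illustration; only the term names and the values come from the example above.

```python
def people_to_custom(people, name_term, id_term):
    """Put names under one term and any known ORCIDs under a second term."""
    names = [person["name"] for person in people]
    orcids = [person["orcid"] for person in people if person.get("orcid")]
    custom = {name_term: names}
    if orcids:  # omit the ID term when nobody has an ORCID
        custom[id_term] = orcids
    return custom


identifiers = [
    {"name": "Boschert, C."},
    {"name": "Dikow, T.", "orcid": "https://orcid.org/0000-0003-4816-2909"},
]
custom = people_to_custom(
    identifiers, "dwc:identifiedBy", "gbif-dwc:identifiedByID"
)
```

Note that the two lists are not positionally aligned when some people lack an ORCID, so the link between a name and its identifier is not recoverable from the custom fields alone.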
Great timing! I just finished a dashboard for @myrmoteras and will be finally looking at this today. Cheers! |
@slint,
I'm having some issues with our email server, so @myrmoteras asked me to post this as a GitHub issue.
I basically sent you an email late last week asking for these additional custom metadata fields so I can push a small digital specimens dataset:
I think you told me in the past that this list is exactly what you need to make these additions, right?
Oh, and please, note that some of these you already added...
Cheers!