Upgrade old version of datacite documents on EZID #2003

Open
taojing2002 opened this issue Oct 28, 2024 · 9 comments

@taojing2002
Contributor

taojing2002 commented Oct 28, 2024

We got this email from EZID:

Hello,

You are receiving this message because the EZID team has identified that an account associated with your email address includes DataCite DOI records registered in a deprecated schema version (<4.x).

As was indicated in our message about adding support for schema v4.5, DOI registrations using schema versions older than v4.x will no longer be supported by DataCite beginning in January 2025. Please see the DataCite documentation here (https://datacite.org/blog/deprecating-schema-3/) for additional details about this change.

Beginning in December of 2024, records in deprecated schema versions will be updated to the most recent version of the DataCite schema (v4.5) on your behalf to ensure the continued functioning of EZID. The changes are relatively minor, but to meet the requirements of this schema version, default values will be assigned to two specific fields in both EZID and the DataCite DOI registration:

* resourceType will be set to "(:unav)"
* resourceTypeGeneral will be set to "Other"

If you do not wish for these updates to be automatically applied, we encourage you to upgrade the schema version of your DOIs at your earliest convenience. All other account holders associated with these records will be contacted as well.
If you have any questions, please reach out and we’ll be happy to provide additional guidance.
All the best,
Adam
Adam Buttrick, Product Manager
University of California Curation Center (UC3)
California Digital Library
University of California Office of the President

Since Metacat currently generates DataCite 4.3 metadata, Matt suggested that we simply trigger a re-registration of all DOIs with older DataCite metadata versions. That should work, but we need to figure out which DOIs live on which server when two servers share the same shoulder.

@taojing2002
Contributor Author

This is the list that came with the email.
[email protected]_dois.csv

@rushirajnenuji
Member

From thread:

Yes, it looks like we're currently using DataCite 4.3, and from the attached CSV, it looks like we have about 2200 ADC DOIs, about 40 with prefix 10.25494_p6, and about 200 KNB DOIs. It seems like a lot (all?) of those are v3.x. But I think we should have the required metadata for the latest schema, so updating them sounds good.
Examples:

  1. https://ezid.cdlib.org/manage/display_xml/doi:10.5063/f1gq6vvv
  2. https://ezid.cdlib.org/manage/display_xml/doi:10.5063/f13b5xhm
  3. https://ezid.cdlib.org/manage/display_xml/doi:10.25494/p6qp4h
  4. https://ezid.cdlib.org/manage/display_xml/doi:10.18739/a2fn10t0w
  5. https://ezid.cdlib.org/manage/display_xml/doi:10.18739/a29300
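A rough way to tally the per-prefix counts mentioned above is sketched below. This is only an illustration, not the tool that produced those numbers: the class and method names (`DoiPrefixCounter`, `prefixOf`, `countByPrefix`) are hypothetical, and it assumes the DOIs have already been read from the CSV as plain strings such as `doi:10.5063/f1gq6vvv`, since the CSV column layout is not shown in this thread.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: counts DOIs per registrant prefix (e.g. 10.5063, 10.18739).
final class DoiPrefixCounter {

    /** Returns the text before the first '/', after dropping an optional "doi:" scheme. */
    static String prefixOf(String doi) {
        String s = doi.trim().toLowerCase();
        if (s.startsWith("doi:")) {
            s = s.substring(4);
        }
        int slash = s.indexOf('/');
        return slash < 0 ? s : s.substring(0, slash);
    }

    /** Tallies how many DOIs fall under each prefix, sorted by prefix string. */
    static Map<String, Integer> countByPrefix(List<String> dois) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String doi : dois) {
            counts.merge(prefixOf(doi), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // The five example identifiers listed in this comment.
        List<String> sample = List.of(
            "doi:10.5063/f1gq6vvv",
            "doi:10.5063/f13b5xhm",
            "doi:10.25494/p6qp4h",
            "doi:10.18739/a2fn10t0w",
            "doi:10.18739/a29300");
        System.out.println(countByPrefix(sample));
    }
}
```

A prefix alone does not identify the hosting node when two servers share a shoulder, so a count like this is only a first pass.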

@mbjones
Member

mbjones commented Oct 28, 2024

@taojing2002 Note that I think this is a duplicate of issue #1949, and they both should probably close when we fix it.

Would it be worthwhile to bring the Metacat release for DataCite up to 4.5 before we run this whole re-registration? Because DataCite 4.5 is compatible with 4.4, and 4.4 is compatible with 4.3, bringing it up to date might only require a one-line code change to update the schema version header:

diff --git a/src/edu/ucsb/nceas/metacat/doi/datacite/DataCiteMetadataFactory.java b/src/edu/ucsb/nceas/metacat/doi/datacite/DataCiteMetadataFactory.java
index 0a7e4ab1..f42cd293 100644
--- a/src/edu/ucsb/nceas/metacat/doi/datacite/DataCiteMetadataFactory.java
+++ b/src/edu/ucsb/nceas/metacat/doi/datacite/DataCiteMetadataFactory.java
@@ -69,7 +69,7 @@ public abstract class DataCiteMetadataFactory {
     public static final String EN = "en";
     public static final String XML_LANG= "xml:lang";
     public static final String NAMESPACE = "http://datacite.org/schema/kernel-4";
-    public static final String SCHEMALOCATION = "https://schema.datacite.org/meta/kernel-4.3/metadata.xsd";
+    public static final String SCHEMALOCATION = "https://schema.datacite.org/meta/kernel-4.5/metadata.xsd";
     public static final String RESOURCE = "resource";
     public static final String CREATORS = "creators";
     public static final String CREATOR = "creator";
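To double-check which records still carry a deprecated (<4.x) schema after re-registration, the kernel version can be read back out of the `schemaLocation` value. The sketch below is an assumption-laden illustration, not Metacat code: `KernelVersion`, `majorVersion`, and `isDeprecated` are hypothetical names, and it assumes the location follows the `.../meta/kernel-<version>/metadata.xsd` pattern seen in the diff above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts the DataCite kernel version from a schemaLocation URL.
final class KernelVersion {

    // Matches "kernel-4.5/", "kernel-3.1/", or "kernel-4/"; group 1 is the major version.
    private static final Pattern KERNEL = Pattern.compile("kernel-(\\d+)(?:\\.(\\d+))?/");

    /** Returns the major kernel version, or throws if the URL has no kernel segment. */
    static int majorVersion(String schemaLocation) {
        Matcher m = KERNEL.matcher(schemaLocation);
        if (!m.find()) {
            throw new IllegalArgumentException("No kernel version in: " + schemaLocation);
        }
        return Integer.parseInt(m.group(1));
    }

    /** True for schema versions older than 4.x, the range the EZID email calls deprecated. */
    static boolean isDeprecated(String schemaLocation) {
        return majorVersion(schemaLocation) < 4;
    }

    public static void main(String[] args) {
        System.out.println(isDeprecated("https://schema.datacite.org/meta/kernel-4.5/metadata.xsd"));
        System.out.println(isDeprecated("http://schema.datacite.org/meta/kernel-3.1/metadata.xsd"));
    }
}
```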

@taojing2002
Contributor Author

taojing2002 commented Oct 29, 2024 via email

@taojing2002
Contributor Author

After analyzing the redirect URLs, I found that those DOIs come from ADC, KNB and OPC. I wrote and ran a script to submit the update requests for those DOIs. I believe all of the OPC and ADC ones worked. However, dozens of DOIs from KNB are not in the current shoulder list in the KNB configuration, so they didn't work. I will leave those updates to EZID.
These are the DOIs that we didn't submit because they don't have system metadata records in our member nodes:

10.18739/a2rp4d
redirect to https://arcticdata.io/catalog/#view/urn:uuid:1b29fabe-8930-48eb-b48c-dc7f99c2b077
10.5063/f1vm496b
redirect to https://github.com/ropensci/redland-bindings/tree/master/R/redland
10.5063/f1qv3jgm
redirect to https://github.com/ropensci/datapack
10.5063/f1m61h5x
redirect to https://github.com/DataONEorg/rdataone
10.5063/f1gf0rf6
redirect to https://github.com/NCEAS/recordr

I will leave those five objects to the EZID update as well.
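For anyone repeating the redirect analysis, the classification step could look roughly like the sketch below. It is not the script that was actually run. `RedirectClassifier` and `classify` are hypothetical names, and the host-to-node table is an assumption: only the `arcticdata.io` redirect appears in this thread, while the KNB and OPC hostnames are guesses that should be checked against each node's configuration.

```java
import java.net.URI;
import java.util.Map;

// Hypothetical helper: maps a DOI's redirect target to the member node that hosts it.
final class RedirectClassifier {

    // Assumed host-to-node table; the KNB and OPC hostnames are not confirmed in this thread.
    static final Map<String, String> NODE_BY_HOST = Map.of(
        "arcticdata.io", "ADC",
        "knb.ecoinformatics.org", "KNB",
        "opc.dataone.org", "OPC");

    /**
     * Returns the node label for a redirect URL, "EXTERNAL" for hosts outside the
     * table (e.g. GitHub repositories), or "UNKNOWN" when the URL has no host.
     * URI.create throws IllegalArgumentException for a malformed URL.
     */
    static String classify(String redirectUrl) {
        String host = URI.create(redirectUrl.trim()).getHost();
        if (host == null) {
            return "UNKNOWN";
        }
        return NODE_BY_HOST.getOrDefault(host.toLowerCase(), "EXTERNAL");
    }

    public static void main(String[] args) {
        System.out.println(classify(
            "https://arcticdata.io/catalog/#view/urn:uuid:1b29fabe-8930-48eb-b48c-dc7f99c2b077"));
        System.out.println(classify("https://github.com/NCEAS/recordr"));
    }
}
```

Keying on the redirect host rather than the DOI prefix sidesteps the shared-shoulder problem, because the target URL names the server directly.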

@mbjones
Member

mbjones commented Nov 22, 2024

@taojing2002 thanks!

@rushirajnenuji or @doulikecookiedough could one of you update these software records (and maybe enhance them with ORCIDs/RORs/Funding info where that isn't too burdensome)?

@rushirajnenuji
Member

Hi @mbjones - yes, will do. It doesn't seem that any of those were registered with XML: no XML object is found in EZID, so it seems the default datacite profile was used to generate these objects.

I'll use one of our latest citation objects as a reference, and update each of these with updated DataCite XML targeting version 4.5:

10.5063/f1vm496b
redirect to https://github.com/ropensci/redland-bindings/tree/master/R/redland
10.5063/f1qv3jgm
redirect to https://github.com/ropensci/datapack
10.5063/f1m61h5x
redirect to https://github.com/DataONEorg/rdataone
10.5063/f1gf0rf6
redirect to https://github.com/NCEAS/recordr

@doulikecookiedough
Contributor

@mbjones I can help with creating/enhancing the datacite.xml documents for the 4 GitHub repos. Is my assumption correct that the DOI that redirects to the ADC dataset does not need updating, as it is just a dataset?

@rushirajnenuji Could you please help me update the EZIDs once the new citation documents are ready? Sorry but I don't have access to the EZID online interface or the credentials to push changes via the python client.

@doulikecookiedough
Contributor

I've connected with @rushirajnenuji on Slack and will assist with getting this completed.
