Primer Xlink Issue #19

Open
thabing opened this issue Oct 28, 2016 · 20 comments

@thabing
Member

thabing commented Oct 28, 2016

Hi all,

I have just discovered that the primer doesn't mention the xlink:type attribute alongside xlink:href at all. All the other attributes are listed, but not that one, and it's mandatory in the XLink schema we use and in the XLink standard (as I have understood it). I received a comment that, because it isn't mentioned, people assume it isn't meant to be used, and they then run into validation problems.

So can we update the text?

Best,
Karin

@thabing
Member Author

thabing commented Oct 28, 2016

To add: in smLinkType, where the xlink attributes are referenced individually, the type attribute is missing.

Best,
Karin

@leahprescott leahprescott self-assigned this May 3, 2018
@leahprescott
Contributor

Leah will make changes to Primer - we will need to ask Glenn to make changes to the overview page

@aelkiss
Contributor

aelkiss commented Nov 29, 2018

The XLink specification itself does not require xlink:type as I understand it - one of the options for conformance for an element is if "it does not have a type attribute from the XLink namespace and it adheres to the conformance constraints imposed by the XLink simple element type, as prescribed in this specification." (https://www.w3.org/TR/xlink11/#markup-reqs).

Additionally, the spec says later on "The value of the type attribute must be supplied unless the element is a simple link (https://www.w3.org/TR/xlink11/#dt-simplelink) and an href attribute in the XLink namespace is supplied. In the latter case, the value "simple" is implied for the type attribute. If a value is supplied for the type attribute, its value must be one of "simple", "extended", "locator", "arc", "resource", "title", or "none"." (https://www.w3.org/TR/xlink11/#link-types)

My understanding was that our xlink schema was what caused xlink:type to be required in this case, but I'm having trouble reproducing the validation error in Oxygen. This example validates fine for me:

<?xml version="1.0"?>
<mets xmlns="http://www.loc.gov/METS/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:dc="http://purl.org/dc/terms/"
    xsi:schemaLocation="http://www.loc.gov/METS/ http://www.loc.gov/standards/mets/mets.xsd 
    http://www.w3.org/1999/xlink http://www.loc.gov/standards/xlink/xlink.xsd">
    <metsHdr />
    <dmdSec ID="dmd001">
        <mdRef LOCTYPE="URN" MIMETYPE="application/xml" MDTYPE="EAD" xlink:href="urn:x-nyu:fales1735" />
    </dmdSec>   
    <structMap>
        <div/>
    </structMap>
</mets>

Could someone provide an example where xlink:type is not provided but xlink:href is, and the document fails validation?
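
For comparison, here is a sketch of the same document with the type stated explicitly. Per the XLink passage quoted above, "simple" is the value implied when only xlink:href is supplied, so the two forms should be equivalent. This variant is an illustration and has not been re-validated here.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the example above with xlink:type spelled out. -->
<mets xmlns="http://www.loc.gov/METS/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xsi:schemaLocation="http://www.loc.gov/METS/ http://www.loc.gov/standards/mets/mets.xsd
    http://www.w3.org/1999/xlink http://www.loc.gov/standards/xlink/xlink.xsd">
    <metsHdr />
    <dmdSec ID="dmd001">
        <!-- "simple" is the value the spec says is implied when only xlink:href is given -->
        <mdRef LOCTYPE="URN" MIMETYPE="application/xml" MDTYPE="EAD"
            xlink:type="simple" xlink:href="urn:x-nyu:fales1735" />
    </dmdSec>
    <structMap>
        <div/>
    </structMap>
</mets>
```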

@thabing
Member Author

thabing commented Nov 29, 2018

Unable to reproduce validation failure. We need an example

@karinbredenberg
Contributor

karinbredenberg commented Jan 24, 2019

It's an old thing (2007 and on).
One of the problems I find in my mail archive is that it occurs when you embed EAD2002 in METS: the XLink schemas start to collide when they all use the same XLink namespace but point to different schemas with different optional attributes (a mail to the METS list from 2007). So if you haven't moved to EAD3, the error might still occur.
And it goes on: it happens when you embed something that uses another XLink schema but has the same namespace for XLink.
Still, as I read the XLink schema, in the simpleLink attribute group, which as I understand it is used in the METS schema, the type is not optional, so we should use it.

@karinbredenberg
Contributor

I've gone through my archive, and it's still my comment from the 24th of January above that explains the most: having the same namespace but using different versions of the XLink schema causes the problem.

@aelkiss
Contributor

aelkiss commented Mar 21, 2019

The EAD2002 schema https://www.loc.gov/ead/ead.xsd and the METS schema both import http://www.loc.gov/standards/xlink/xlink.xsd. The following minimal example that includes both the EAD and METS schema along with the XLink schema validates for me:

<?xml version="1.0"?>
<mets xmlns="http://www.loc.gov/METS/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns:ead="urn:isbn:1-931666-22-9" xmlns:xlink="http://www.w3.org/1999/xlink"
  xsi:schemaLocation="http://www.loc.gov/METS/ http://www.loc.gov/standards/mets/mets.xsd
  urn:isbn:1-931666-22-9 https://www.loc.gov/ead/ead.xsd
  http://www.w3.org/1999/xlink http://www.loc.gov/standards/xlink/xlink.xsd  
  ">
  <metsHdr/>
  <dmdSec ID="DMD1">
    <mdWrap MDTYPE="EAD">
      <xmlData>
        <ead:ead>
          <ead:eadheader>
            <ead:eadid/>
            <ead:filedesc>
              <ead:titlestmt>
                <ead:titleproper/>
              </ead:titlestmt>
            </ead:filedesc>
          </ead:eadheader>
          <ead:frontmatter/>
          <ead:archdesc level="item">
            <ead:runner/>
            <ead:did>
              <ead:abstract/>
            </ead:did>
          </ead:archdesc>
        </ead:ead>
      </xmlData>
    </mdWrap>
  </dmdSec>
  <dmdSec ID="DMD2">
    <mdRef LOCTYPE="URN" MIMETYPE="application/xml" MDTYPE="EAD" xlink:href="urn:x-nyu:fales1735" />
  </dmdSec>   
  <structMap>
    <div/>
  </structMap>
</mets>

@andreasnef
Contributor

I also only have a vague memory of such a problem many years ago. However, two things that I discovered while trying to find the context:

@aelkiss
Contributor

aelkiss commented Apr 18, 2019

For http://www.w3.org/1999/xlink.xsd (and the basically-identical https://www.w3.org/XML/2008/06/xlink.xsd) the issue is that the attribute groups have different names (e.g. simpleAttrs in the W3C schema vs. simpleLink in the LOC schema.)

The particular errors you would get then appear to be dependent on the order the schemas are declared in xsi:schemaLocation. I am not sure whether there is a normative specification for how namespaces with multiply-declared schema should be handled, but at least for Xerces (via JHOVE or Oxygen) the first one loaded seems to win. It would be worth testing with another XML Schema implementation to see if it exhibits the same behavior.

I've outlined some possible paths forward below. I did some testing with a simple example that just imports a different xlink schema in addition to the METS schema, but I do want to do a little bit of testing with some more complex examples to validate these approaches.

  1. Document and promote a workaround. If we produce an XLink schema that imports the W3C schema (and therefore includes both the simpleLink and simpleAttrs attribute groups), then we could tell people running into this issue to list it as the first xsi:schemaLocation value to ensure that it is the schema that gets used for XLink. I think this would be a good option at least for the short term, since it doesn't have any potential for breakage (for people not already running into this issue) and doesn't require any coordination with other groups.

  2. Longer term, we could work towards updating the METS, PREMIS, and EAD schemas to reference the W3C schema instead of the LOC schema. At least for the METS schema, this seems to be OK if you update the attribute group names. I haven't looked at the PREMIS or EAD schemas to see if it would cause issues there. This would require a lot of coordination and testing with existing files, but should ultimately eliminate the problem.

  3. Update the LOC xlink schema to be the workaround from option 1. That has the same issue with needing to test existing METS files to ensure nothing breaks, but wouldn't need as much coordination with other schemas that reference the LOC xlink schema, since it would just be an addition to the schema rather than a change.

@aelkiss
Contributor

aelkiss commented Apr 18, 2019

I still can't reproduce the original issue (validation complaint about missing xlink:type with xlink:href) even with the W3C XLink schema - as-is, it just complains that the simpleLink or simpleAttrs groups aren't defined depending on the schema load order, and a simple example validates just fine with a version of the METS schema that references the attribute groups from the W3C XLink schema.

@aelkiss
Contributor

aelkiss commented Apr 18, 2019

The "workaround schema" can't just import the W3C schema for a couple reasons - one, you can't import a schema for the same namespace as the one you're defining a schema for; two, it declares the same attributes as the LOC schema. Still, it looks like a compromise schema that declares the attributes once but declares the attribute groups both from the LOC and W3C schemas is possible.
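
To make the idea concrete, here is a minimal sketch of what such a compromise schema might look like. It is an illustration under assumptions, not the actual schema: it covers only the simple-link case, the attribute types are simplified to plain strings where the real schemas use enumerations, and a usable version would also need the remaining attributes and groups (extended, locator, arc, and so on) that the LOC and W3C schemas define.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical compromise schema (sketch): attributes declared once,
     attribute groups offered under both the LOC and the W3C names. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:xlink="http://www.w3.org/1999/xlink"
           targetNamespace="http://www.w3.org/1999/xlink">

  <!-- Global attributes, each declared a single time.
       Types are simplified for this sketch. -->
  <xs:attribute name="type" type="xs:string"/>
  <xs:attribute name="href" type="xs:anyURI"/>
  <xs:attribute name="role" type="xs:string"/>
  <xs:attribute name="arcrole" type="xs:string"/>
  <xs:attribute name="title" type="xs:string"/>
  <xs:attribute name="show" type="xs:string"/>
  <xs:attribute name="actuate" type="xs:string"/>

  <!-- Group name used by the LOC schema (referenced by METS). -->
  <xs:attributeGroup name="simpleLink">
    <xs:attribute ref="xlink:type" fixed="simple"/>
    <xs:attribute ref="xlink:href"/>
    <xs:attribute ref="xlink:role"/>
    <xs:attribute ref="xlink:arcrole"/>
    <xs:attribute ref="xlink:title"/>
    <xs:attribute ref="xlink:show"/>
    <xs:attribute ref="xlink:actuate"/>
  </xs:attributeGroup>

  <!-- Group name used by the W3C schema, defined here as an alias
       so that schemas written against either name can resolve it. -->
  <xs:attributeGroup name="simpleAttrs">
    <xs:attributeGroup ref="xlink:simpleLink"/>
  </xs:attributeGroup>

</xs:schema>
```

In this sketch xlink:type is optional with a fixed value of "simple", so an element carrying only xlink:href would still be accepted.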

@aelkiss
Contributor

aelkiss commented Apr 18, 2019

I think I would definitely not recommend option 3 above (to make the LOC XLink schema into this compromise schema) as the compromise schema is definitely a hack. But I think having the compromise schema available (option 1) is a good option. We can talk if option 2 (changing existing schemas to use the W3C XLink schema) is something to work towards longer term or if option 1 is sufficient.

@aelkiss
Contributor

aelkiss commented Apr 18, 2019

I verified with a more complex METS file that if I (1) change the METS schema to reference the W3C XLink schema (updating the referenced attribute groups appropriately) and then (2) validate a METS file that references both this changed METS schema and the existing PREMIS schema, the file does not validate. If I prepend the compromise XLink schema to the xsi:schemaLocation attribute, then it does. I will post the compromise xlink schema and the updated METS schema as a gist.
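
As a rough illustration of the prepending step, the instance document might declare its schema locations like this. The two example.org URLs are placeholders for wherever the compromise XLink schema and the modified METS schema would be hosted; they are not real locations. Pairs for other schemas (for example PREMIS) would follow the XLink pair in the same attribute.

```xml
<?xml version="1.0"?>
<!-- Sketch: the XLink namespace/location pair is listed first, so that a
     processor that keeps the first schema loaded for a namespace (as
     observed with Xerces) picks up the compromise schema. -->
<mets xmlns="http://www.loc.gov/METS/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xsi:schemaLocation="http://www.w3.org/1999/xlink https://example.org/schemas/xlink-compromise.xsd
    http://www.loc.gov/METS/ https://example.org/schemas/mets-w3c-xlink.xsd">
    <metsHdr/>
    <dmdSec ID="DMD1">
        <mdRef LOCTYPE="URN" MIMETYPE="application/xml" MDTYPE="EAD" xlink:href="urn:x-nyu:fales1735"/>
    </dmdSec>
    <structMap>
        <div/>
    </structMap>
</mets>
```

Because the outcome depends on load order, this arrangement relies on processor behavior rather than on anything guaranteed by the documents themselves.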

@karinbredenberg
Contributor

EAD had its own xlink schema way back around 2007. When I raised the problem, work was coordinated to make the two schemas the same, but it took some time.

@aelkiss
Contributor

aelkiss commented Apr 18, 2019

@andreasnef I looked through the SEDA site at https://francearchives.fr/seda/, but I don't see any complete examples of SEDA XML. Do you have an example either just of SEDA or of METS that embeds SEDA?

@aelkiss
Contributor

aelkiss commented Apr 18, 2019

A fourth option would be to remove the XLink schema entirely from the METS schema.

Bertrand mentions that this option has been brought up a few times in the past.

In talking with people from W3C, they thought the XLink schema was dead. It has not been widely adopted; web browsers don't understand XLink. EAD2002 does use XLink. EAD3 does not use XLink at all. The PREMIS2 schema at least imports the LOC XLink schema. It does not appear that PREMIS3 references the XLink schema at all.

Not aware of any users that really take advantage of the XLink attributes. Bertrand would be in favor of getting rid of it.

@andreasnef
Contributor

> @andreasnef I looked through the SEDA site at https://francearchives.fr/seda/, but I don't see any complete examples of SEDA XML. Do you have an example either just of SEDA or of METS that embeds SEDA?

I just verified some of our reference examples for SEDA, but there are none that actually use xlink attributes. I checked with the SEDA (2.1) schemas, and while all of them define the namespace, only a couple actually import the xlink.xsd, and only one of these two actually specifies an (optional) attribute of this namespace.

So, the SEDA example is probably yet another case where it was introduced a while ago and now has the same issues as with METS...

@ntra00

ntra00 commented May 14, 2019

On Aaron's option 2 above,
"2. Longer term, we could work towards updating the METS, PREMIS, and EAD schemas to reference the W3C schema instead of the LOC schema.".
The reason we came up with our own (for MODS and METS) is that the W3C site was being hit by validators every time a METS file was accessed.

@BertrandCaron
Contributor

I asked Baptiste Nichele (who was until recently working at the Service interministériel des archives de France on the SEDA standard) and he confirmed that the XLINK dependency was introduced in version 2. It was inherited from the standard MEDONA, but he considers that another simple way of referencing would have done the trick as well.
