Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugfixes for 1.0.2 #393

Merged
merged 3 commits into from
Jun 18, 2021
Merged

Conversation

denismaier
Copy link
Member

@denismaier denismaier commented Jun 16, 2021

This makes @form optional again. With this change 1.0 styles can again be validated.

Edit: This also re-adds title-short and container-title-short.

Question: How can I now check how that compares to 1.0.1?

@bwiernik
Copy link
Member

@denismaier
Copy link
Member Author

@denismaier v1.0.1...denismaier:fix-form-again

Thanks. I quick glance at the diff shows that there have been more changes. But: It's really hard to tell if they are significant. Some are only whitespace changes, formatting changes, code moved to other files, or other minor changes. But I cannot easily see if there are also real changes.

@bwiernik
Copy link
Member

bwiernik commented Jun 16, 2021

I read through the changes there and they all seem to be referred to in the release log here: https://github.com/citation-style-language/schema/releases/tag/v1.0.2

The moving of cs:choose is unchanged as far as I can see. The csl-data.json file is not changed other than the move. The deletion of code marked obsolete is fine. There are two changes of ? or * that were made by Rintze 9 years ago. Also fine.

@denismaier
Copy link
Member Author

Ok, I've tested with:

  • apa
  • chicago-author-date
  • chicago-fullnote-bibliography
  • chicago-fullnote-bibliography-with-ibid
  • modern-language-association
  • nature
  • society-of-biblical-literature-fullnote-bibliography
  • zeitschrift-fur-religionswissenschaft-note

The only errors I'm still getting should be legitimate:

  • => we need to use "sub-verbo" in such a case, then it should work.
  • => title-short has been removed; can we still test for that?

@bwiernik
Copy link
Member

We can find and replace those I think? @adam3smith

@adam3smith
Copy link
Member

adam3smith commented Jun 17, 2021

wait, why did we remove title-short? Wasn't that added in 1.0.1 specifically to allow testing for the presence of a short title?

(and I don't understand the sub-verbo part, sorry)

@georgd
Copy link

georgd commented Jun 17, 2021

When I asked how testing would work, the answer was to test like <if variable="title" form="short">. I can’t find where it was, though. Probably, removing title-short should be deferred to 1.1 when the new title-structure is introduced.

@denismaier
Copy link
Member Author

denismaier commented Jun 17, 2021

When I asked how testing would work, the answer was to test like <if variable="title" form="short">. I can’t find where it was, though. Probably, removing title-short should be deferred to 1.1 when the new title-structure is introduced.

Either this, or we need to add a test condition form=short. (I don't think that exists yet, right?)
Given that 1.0.2 should be a minor release and not introduce new features like test conditions or so, re-adding title-short and container-title-short is probably the way to go.

@bwiernik
Copy link
Member

@adam3smith You're right. That being removed would have been an error introduced at the same time as the @form change due to munging of the title stuff across 1.1 and master branches. title-short and container-title-short should both still be there in 1.0.2

@bwiernik
Copy link
Member

@adam3smith The "sub-verbo" thing is that in 1.0.1 the term was called "sub verbo" which was not testable because attributes are space-delimited lists. Rintze added "sub-verbo" as an alias for testing. CSL 1.0.2 removes the bug by renaming the locator "sub-verbo"

@denismaier denismaier changed the title Make @form optional again Bugfixes for 1.0.2 Jun 17, 2021
@cormacrelf
Copy link

cormacrelf commented Jun 17, 2021

@bwiernik: The deletion of code marked obsolete is fine.

I don't know what this refers to but I think @denismaier is right in preserving every deprecation in the schema. I can't tell if you're all on the same page there -- it seems the solution to some problems is to re-add to the spec, and to others, it's to do a find-and-replace across the styles repo. IMO the RNC should be able to validate all styles targeting 1.0+. If it can't, then it's a bit of a stretch to say that these changes are truly deprecations, because you'd suddenly be telling people their valid 1.0.1 styles are no longer valid.

I wonder -- can you move all the deprecated items to a separate file? Does RNC let you merge in some deprecations into the regular schema, even in deeply nested nodes? I think this is what the MathML schema does over here, but I'm not sure. If you did, then making use of deprecated items could be made an error, if you didn't include the deprecations.rnc file. That way you can easily add a flag to an online validator service, or run CI on the styles repo with two passes (deprecated uses permitted and then not permitted) and make errors in the latter one only appear as warnings.

@cormacrelf
Copy link

@bwiernik: title-short and container-title-short should both still be there in 1.0.2

Also no matter what other features you add to replace title-short and friends, they should still be in the schema in CSL 1.1, CSL 1.2, and forever more, not just 1.0.2. Old dumps of CSL-JSON from a reference library should absolutely continue to work, and so should styles targeting 1.0.1 that use these variables. I don't think anything should ever be actually removed from either the data or XML schemas. Anything that's deprecated should be purely a recommendation to stop using it in the text of the specification and warnings in validators and processors (not hard errors) to remind people.

@bdarcus
Copy link
Member

bdarcus commented Jun 17, 2021

@cormacrelf you maybe want to look at the 1.1 branch and see what you think.

We tried to raise these questions when we were working on that last year, but didn't really get any feedback.

Our general conclusion was versioned schemas + versioned files should give us more flexibility to deprecate and remove things.

It might be 1.1 really gets called 2.0 in the end. IDK.

@bwiernik
Copy link
Member

Yes, like @bdarcus said, after discussion, it seemed reasonable that requiring versions to be indicated in CSL JSON from 1.1 onwards permitted it to remove elements or change a data structure (e.g., to change title from a string to an object). CSL JSON without a version would be assumed to be CSL 1.0 compatible.

@adam3smith
Copy link
Member

OK, but, with apologies for re-opening that, is that the right decision?
Is there a real, tangible benefit to removing elements that will make old styles and data incompatible that outweighs the costs Cormac points to?
It's possible we'll come across situations where that's the case, but removing old variables (rather than marking them as deprecated) wouldn't seem to be among those.

@denismaier
Copy link
Member Author

I think that's an important discussion, but maybe we should really focus now on getting 1.0.2 out and not so much on the general question of how to deal with old stuff.

So, I don't think anyone here opposes adding title-short and friends back in. Removing them was a mistake, which is now undone. "sub verbo" is more tricky as removing it will break some styles. But OTOH it was a bug in the first place.

@adam3smith
Copy link
Member

Agreed that the larger discussion should go somewhere else and doesn't affect 1.0.2.

"sub verbo" is more tricky as removing it will break some styles. But OTOH it was a bug in the first place.

No strong feelings on this, but I'd be comfortable removing it because a) it only affects styles and not the data model and b) the number of styles it'll affect will be minimal.

@bwiernik
Copy link
Member

Agreed.

@denismaier
Copy link
Member Author

After running trang tests pass again. Good to merge now?

@adam3smith
Copy link
Member

adam3smith commented Jun 18, 2021 via email

@bwiernik bwiernik merged commit ea19f7a into citation-style-language:v1.0 Jun 18, 2021
@bdarcus
Copy link
Member

bdarcus commented Sep 6, 2021

@cormacrelf - on this specific question below (which I neglected to answer earlier; was probably pressed for time):

I wonder -- can you move all the deprecated items to a separate file? Does RNC let you merge in some deprecations into the regular schema, even in deeply nested nodes?

I think so. But I think the schemas then become more difficult to maintain (they necessarily get more complicated than they already are).

But I can experiment a bit again, and post findings, if we settle on this as necessary.

That way you can easily add a flag to an online validator service, or run CI on the styles repo with two passes (deprecated uses permitted and then not permitted) and make errors in the latter one only appear as warnings.

Why not just use different versions of the schema, e.g. different files, to validate different versions?

It's clear there are upsides; what's the downside?

PS - maybe we should call 1.1 instead 2.0?

@bdarcus
Copy link
Member

bdarcus commented Sep 6, 2021

Might (not sure ATM) be feasible to have the following schemas:

  1. 1.0
  2. 2.0 (aka 1.1)
  3. 1+2

The first two would be discrete versions of the same files, while the last would simply import and customize those.

Main thing is, we need the schemas to be:

  1. easy to maintain for us
  2. work well for style authors using a validating editor
  3. of course, work well for existing developers, but also ...
  4. be practical if a new developer three years from now wants to create a new processing library that only targets the latest version (for example, not to support CSL 1.0)

@cormacrelf
Copy link

cormacrelf commented Sep 6, 2021

Thanks for pushing for comment, Bruce, it's worthwhile :) I understand there to be at least five different questions to answer here:

  1. Whether to deprecate or delete things in CSL 1.1 (XML), or equivalently, whether to call CSL 1.1 "CSL 2.0"
  2. Whether to split into two CSL RNC schemas, one for 1.0 and one for 1.1
  3. Whether to deprecate or delete things in CSL-JSON 1.1 (the other two PRs I just submitted)
  4. Whether to call CSL-JSON 1.1 “CSL-JSON 2.0"
  5. Whether to do the same schema splitting for the CSL-JSON JSON-Schema

My thinking so far is:

  1. I cannot stress highly enough how bad I think calling it (the XML language) CSL 2.0 would be. Please don't do this! None of the changes necessarily entail dropping support for old styles. The only way to break compatibility for them is to somehow get implementations new and old to stop supporting 1.0 features/syntax/etc, and the surest way to make that happen is to call it "2.0" and thereby signal that implementers can ignore 1.0 support. I believe the spec should explicitly say "compliant implementations must support all features of previous versions of the spec." And preferably also list the deprecated ones. I am adamant about this. (There's also some room for forward-compatibility guidance, like the CSS Specification § 3.1.)
  2. Don't care nearly as much. But read this 17-year-old blog post, it's very enlightening and I hope we can use some of these strategies in 2021. I love the annotations, didn't know RNC had that, but XSLT sounds tricky. If it can work, awesome. Next best bet is probably building the validator with a "Walsh union" of the current file + 1.0 from git history.
  3. ... and
  4. Not so clear. I will think about this some more. The next CSL-JSON release is shaping up to be pretty different. The latitude for mixing deprecated items with the new way of doing things should not be tolerated nearly as much as with CSL; these are not written by hand, they're generated by computer programs which shouldn't be in limbo between versions. That's a clear difference in my view. I'm happy for processors to parse "either 1.0 or 2.0 CSL-JSON" in the same routine, and they likely will. I'm ~ 75% convinced CSL-JSON should go 2.0. (Edit: hm, not quite, pandoc makes big use of of manual CSL-JSON typed in as YAML. I'm less sure now but still open. We should definitely let them type title: "plain string" though!)
  5. Probably not so necessary in general, deprecating looks to be easier in JSON-Schema, it has apparently supported deprecated: true since 2019. But if the answer to question 4 is 2.0, then yes for 2.0, keep separate schemas. It seems there is an option to factor out defintions of date-variable etc such that you could keep 1.0 abreast of new variables. It's just JSON, you can process JSON files in any programming language very easily. This would break my own tests (I fetch the schema and try to test somewhat exhaustively), but I can live. It would also break the URL in Distribute schemas? #398.

@bdarcus
Copy link
Member

bdarcus commented Sep 9, 2021

This is generally in order of your items, but doesn't perfectly align, so I'm not numbering them @cormacrelf.

What to call it is mostly a marketing question, and I actually don't care. The exception is what the spec says about supporting previous versions. I don't have a strong view here either. Unless there are objections, let's table that 2.0 idea then.

I'm aware of strategies to do this, which is why I'd rather not. The schemas are already difficult to manage; not because they're poorly designed, I think, but because the logic is complicated.

I'm less knowledgeable about JSON Schema, other than coming to the conclusion it's much less powerful than RNG. So things we can contemplate in the latter we can't in the former.

If we add back title as string literal, then we will need to add language to the spec about what to do with it. I don't have the link handy ATM for our previous discussions on this, but splitting titles gets hairy, and I don't think we should require CSL processors to worry about that. So we'd need to figure out some details here. In short, how should/must a CSL 1.1. processor handle a request for printing a subtitle, if there is no subtitle input data, but there is a title? Maybe we should start a new issue just for this? Are there any other issues here that need focused discussion that warrants a new issue?

@bdarcus
Copy link
Member

bdarcus commented Oct 15, 2021

Just bumping this, since we haven't resolved it :-)

@cormacrelf
Copy link

cormacrelf commented Oct 15, 2021

I'm aware of strategies to do this, which is why I'd rather not. The schemas are already difficult to manage; not because they're poorly designed, I think, but because the logic is complicated.

That's why I liked the "do not manage it using actual RNC kept in the repo, just write a script to merge an old branch with the new for a validator" option

I will try this out and report how it goes

@bdarcus
Copy link
Member

bdarcus commented Oct 15, 2021

That's why I liked the "do not manage it using actual RNC kept in the repo, just write a script to merge an old branch with the new for a validator" option.

But my question is "why?" What's the advantage of having one "schema", however it's constructed and maintained, validating multiple versions, rather than multiple schemas?

@cormacrelf
Copy link

There is no validation-wise difference between a user or computer program looking at the version attribute and applying e.g. the 1.0.2 validator, and a relax-ng validator unconditionally applying the Walsh disjunction of 1.0.1 | 1.0.2 | 1.0.x | 1.1 | .... The disjunction is better for technical reasons including that you do not need to replicate the version selection logic anywhere, any RNC tooling will do it for you (in editors, for the styles repo, for a website, etc.) and you can therefore just publish one document that will validate any CSL style according to the version it says it is compatible with. That's pretty obviously a good thing.

But it is largely orthogonal to the choice of how you handle deprecations. It doesn't do any magic for that, because if you decide to remove deprecated features, there is no choice among the multi-schema options (whether baked into the RNC or manually selected) that will correctly validate the use of a deprecated feature in a version post-deprecation, which was the whole point of deprecating rather than removing and making breaking changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants