Adopt community XQuery file extension convention #102

joewiz
Member

@joewiz joewiz commented May 4, 2022

  • main modules: .xq
  • library modules: .xqm

Requires eeditiones/tei-publisher-lib#7 and eeditiones/tei-publisher-components#92.

@joewiz
Member Author

joewiz commented May 5, 2022

For the motivation, I'll quote from my post at https://gitlab.existsolutions.com/tei-publisher/tei-publisher-app/-/issues/152#note_8478:

Might I also suggest taking the opportunity during a cleanup effort in this app to apply consistent naming for XQuery file extensions? We currently mix .xq, .xql, and .xqm in this and generated apps. While arbitrary, there is an important difference between main modules and library modules for developers, and rather than continuing to mix conventions, we could demonstrate the use of best practices. The community convention (most recently discussed at eXist-db/documentation#258) is as follows:

  • .xq for main modules

  • .xqm for library modules

The only exception would be controller.xql, which is hard coded in the eXist core. This is, of course, a main module.

I'd be happy to take this on so that it's not a drain on the other important cleanup tasks defined above - just let me know.

And in a followup post in the same thread:

I'll provide a little more background. I like to root my own thinking on the topic in the spec, describe what leeway the spec gives us, and then describe community conventions:

The XQuery spec is quite spare on the topic of file extensions:

The most common file extensions in use for XQuery are .xq and .xquery.

In contrast, the spec provides clear definitions for main modules vs. library modules:

A module that does not contain a Query Body is called a library module. A library module consists of a module declaration followed by a Prolog.

A main module consists of a Prolog followed by a Query Body.
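
To make the spec's distinction concrete, here is a minimal Python sketch that labels a module as main or library by checking whether its text opens with a module declaration. The two sample modules, their `http://example.com/ns/hello` namespace, and the `module_kind` helper are hypothetical, and the regular expressions are a rough approximation rather than a real XQuery parser (nested comments, for instance, are not handled).

```python
import re

# XQuery comments look like (: ... :). Nested comments are not handled here.
_COMMENT = re.compile(r"\(:.*?:\)", re.DOTALL)

# Optional version declaration, e.g.  xquery version "3.1";
_VERSION_DECL = re.compile(r"\s*xquery\s+(?:version|encoding)\b[^;]*;")

# Module declaration, e.g.  module namespace hello = "http://example.com/ns/hello";
_MODULE_DECL = re.compile(r"\s*module\s+namespace\s+[\w.-]+\s*=")

# Hypothetical library module: a module declaration followed by a prolog.
LIBRARY = '''xquery version "3.1";
(: hypothetical library module :)
module namespace hello = "http://example.com/ns/hello";
declare function hello:greet($name as xs:string) as xs:string {
    "Hello, " || $name
};
'''

# Hypothetical main module: a prolog followed by a query body.
MAIN = '''xquery version "3.1";
import module namespace hello = "http://example.com/ns/hello" at "hello.xqm";
hello:greet("world")
'''


def module_kind(source: str) -> str:
    """Return "library" if the text opens with a module declaration, else "main"."""
    text = _COMMENT.sub(" ", source)
    version = _VERSION_DECL.match(text)
    if version:
        text = text[version.end():]
    return "library" if _MODULE_DECL.match(text) else "main"


print(module_kind(LIBRARY), module_kind(MAIN))
```

Note that `import module namespace ...` in the main module is not mistaken for a module declaration, because the check is anchored at the start of the text that follows the version declaration.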

From this I conclude the following:

  1. The spec doesn't mandate a particular file extension, or system of file extensions, for XQuery. It merely says that .xq and .xquery are the most common.
  2. The spec doesn't say whether communities should adopt a single file extension for XQuery files. A community is free to arbitrarily intermix .xq, .xquery (oXygen's default for XQuery files), .xqy (used commonly in the MarkLogic community), .xql or .xqm (both used fairly interchangeably in many eXist community projects), or .txt for that matter.
  3. The spec doesn't discuss whether communities should use file extensions to distinguish between main modules and library modules. We could simply use .xq for all XQuery files, for example. But if a community does want to use file extensions to make this distinction, it would make sense to select a mnemonic device or rationale for the choice, document the choice, and stick with it & enforce it:
    1. .xq for main modules, .xql for Library modules
    2. .xq for main modules, .xqm for library Modules
    3. .xqm for Main modules, .xql for Library modules
    4. .xql for main modules (or XQuery Language), .xqm for library Modules
    5. (For extra points, use .xq for all modules! Just look inside the file, and if you see a "module" declaration, you know it's a Library module; if you don't see one, you'll know it's just a Main module, !!@&$#!)

It seems like TEI Publisher has selected option 4 here. That's fine.

However, the eXist book recommends option 2, and this was the result of fairly extensive discussions started in this 2009 exist-open thread, prompted by a question about the proliferation of XQuery file extensions: https://sourceforge.net/p/exist/mailman/exist-open/thread/2e0234190911192025v117160a7sf904bd67a7101c06%40mail.gmail.com/. This thread is what led the eXist book to adopt option 2 as its recommendation. This decision was in part based on BaseX's use of this convention, and the agreement of everyone who weighed in from the community. That's the closest thing we have to an eXist community convention. We've never enforced it on any community apps (despite eXist-db/documentation#258 having been open for over a year), so you'll still see many apps that freely mix file extensions even in a single app. Just because we have a good, if loosely enforced, convention doesn't mean TEI Publisher needs to follow it. I'd welcome you to do so, but do whatever works and makes sense.

For reference, I should also mention that I raised the question again in 2013 on the xquery-talk list, where the discussion included more than just eXist folks: http://x-query.com/pipermail/talk/2013-January/004030.html.

Since that thread 2-3 years ago, the XQuery for Humanists book came out - adding its endorsement of the .xq and .xqm convention. Also, eXist, as of 5.3.0, supports controller.xq and gives it higher precedence than controller.xql - see eXist-db/exist#3739. The standard app generator for eXist, generator-exist, generates all apps using this convention as of its v6.0.0 release - eXist-db/generator-exist#614. The public-repo app made this change in eXist-db/public-repo@8ce2c69, and eXide has a PR much like this one: eXist-db/eXide#461.
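
As a rough illustration of option 2 together with the controller exception, the following Python sketch computes the name a module would have under the convention. The `suggested_name` helper and the file paths in the usage lines are hypothetical; the caller is assumed to already know whether a module is a main or library module. Nothing here reflects how this PR or generator-exist actually performed the renames.

```python
from pathlib import PurePosixPath

# Extensions under the .xq / .xqm convention (option 2 in the list above).
EXPECTED_EXTENSION = {"main": ".xq", "library": ".xqm"}

# Controller file names left untouched: controller.xql is the name hard coded
# in the eXist core, and controller.xq is supported as of eXist 5.3.0.
CONTROLLER_NAMES = {"controller.xql", "controller.xq"}


def suggested_name(path: str, kind: str) -> str:
    """Return the path a module would have under the convention.

    `kind` must be "main" or "library"; deciding which one applies is left
    to the caller (for example, by looking for a module declaration).
    """
    p = PurePosixPath(path)
    if p.name in CONTROLLER_NAMES:
        return str(p)
    return str(p.with_suffix(EXPECTED_EXTENSION[kind]))


# Hypothetical paths, for illustration only.
print(suggested_name("modules/example-lib.xql", "library"))
print(suggested_name("modules/example-view.xql", "main"))
print(suggested_name("controller.xql", "main"))
```

Keeping the controller names in an explicit set means an app that must still run on eXist versions before 5.3.0 is not pushed toward a rename it cannot use.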

Thus, for pedagogical and practical purposes, adopting this convention consistently will help both new and longstanding contributors to TEI Publisher and other eXist-based projects instantly understand the nature of each XQuery module in this project by glancing at the file extension. As an example of one such user, I constantly find myself guessing whether an .xql file is a library module or main module and find it frustrating that I have to manually examine the file to know for sure.

This PR already updates the documentation where relevant, and I would be happy to contribute to the "updating" article in the documentation for the release where this is included - conceivably, the forthcoming TEI Publisher 8 release. I imagine that the lib and components repos might also need major releases. But this move puts all of the projects on an even more solid foundation, so I would argue that it is worthy of inclusion in a major release.

Comments welcome!

@tuurma
Member

tuurma commented Mar 15, 2023

Hi @joewiz, while we agree in principle on naming conventions, after some discussion we determined that the disturbance to existing Publisher editions and the potential for confusion among users are too great to justify the change. I will therefore close this PR. Nevertheless, many thanks for the work and consideration that went into this proposal.

@tuurma tuurma closed this Mar 15, 2023
@joewiz
Member Author

joewiz commented Mar 16, 2023

@tuurma Thanks for your consideration of the PR. While I agree the proposed shift would have been painful for current users migrating to a new version, the pain would have been limited to one upgrade cycle, and it would have relieved confusion and made it easier for new and existing users to understand the project and contribute to it. The continued inconsistent use of file extensions (not to mention mismatches between module base names, namespace prefixes, and namespace URIs) will continue to impose its own pain - a type of technical debt. I'm sure you've considered this in the calculus, and I fully accept the decision; I thought it worth articulating the trade-off one more time. There may come a day when reworking the naming is worth it.

Also, it might be worth adopting a consistent naming convention for new code. A statement to the effect of, say:

As of 2023-03-16, new XQuery modules contributed to the TEI Publisher repository should use the .xq file extension for main modules and .xqm for library modules. Module namespace URIs should begin with the http://teipublisher.com/ns prefix and end with a suffix that matches the filename, and the namespace prefix should either match the filename or be a logical abbreviation/truncation thereof. For example, ... Existing code that does not conform to this convention should not be changed, for backwards compatibility.
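
A minimal Python sketch of how such a rule could be checked mechanically. The `naming_problems` helper, the module name `example.xqm`, and the `/` separator between the base URI and the filename are assumptions made for illustration, not part of the proposed statement. "Logical abbreviation" is approximated as simple truncation; anything looser would need human judgement.

```python
from pathlib import PurePosixPath

# Base URI taken from the proposed statement.
NS_BASE = "http://teipublisher.com/ns"


def naming_problems(filename: str, prefix: str, uri: str) -> list:
    """List the ways a module's names deviate from the proposed convention."""
    stem = PurePosixPath(filename).stem
    problems = []
    # Assumption: base URI and filename are joined with "/".
    expected_uri = f"{NS_BASE}/{stem}"
    if uri != expected_uri:
        problems.append(f"namespace URI should be {expected_uri}")
    # Truncation check: the filename (without extension) starts with the prefix.
    if not prefix or not stem.startswith(prefix):
        problems.append(f"prefix {prefix!r} is not a truncation of {stem!r}")
    return problems


# Hypothetical module, prefix, and URIs.
print(naming_problems("modules/example.xqm", "ex", "http://teipublisher.com/ns/example"))
print(naming_problems("modules/example.xqm", "foo", "http://example.com/other"))
```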

@tuurma
Member

tuurma commented Mar 17, 2023

@joewiz I agree that we may want to rework the naming at some point, perhaps already for the 9 release if, as it seems to me now, it may include further breaking changes.

Meanwhile, would you rather keep this PR open or, as I would actually prefer, open an issue about the conventions you mention, i.e. not only module extensions but also prefixes and URIs?

@joewiz
Member Author

joewiz commented Mar 22, 2023

@tuurma I like your idea to open a new issue. I'll aim to do so ASAP.
