Add first draft of URL specification, refs #93 #216
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,362 @@ | ||
# URI Specification v1.0.0 | ||
|
||
- [Introduction notes](#introduction-notes) | ||
- [Indexable Object](#indexable-object) | ||
- [Attributes](#attributes) | ||
- [name](#name) | ||
- [pod](#pod) | ||
- [kind](#kind) | ||
- [subkinds](#subkinds) | ||
- [categories](#categories) | ||
- [Indexable object types](#indexable-object-types) | ||
- [Primary indexable objects](#primary-indexable-objects) | ||
- [Pod blocks](#pod-blocks) | ||
- [Name attribute](#name-attribute) | ||
- [Kind Type](#kind-type) | ||
- [Kind Language and Programs](#kind-language-and-programs) | ||
- [Multiple pod blocks](#multiple-pod-blocks) | ||
- [Secondary indexable objects](#secondary-indexable-objects) | ||
- [Kind::Routine](#kindroutine) | ||
- [Attributes](#attributes-1) | ||
- [Kind::Syntax](#kindsyntax) | ||
- [Attributes](#attributes-2) | ||
- [Kind::Reference](#kindreference) | ||
- [Attributes](#attributes-3) | ||
- [URI setting](#uri-setting) | ||
- [URI setting](#uri-setting-1) | ||
- [URI rewriting](#uri-rewriting) | ||
- [Examples](#examples) | ||
- [Get all URLs from Primary objects](#get-all-urls-from-primary-objects) | ||
- [Get all URLs from Secondary objects](#get-all-urls-from-secondary-objects) | ||
- [Classification of secondary objects by name](#classification-of-secondary-objects-by-name) | ||
|
||
## Introduction notes | ||
|
||
This specification sets common guidelines and rules for URLs in the [official documentation](https://docs.raku.org). It must be implemented by the tools that generate the HTML pages of the doc site ([Documentable](https://github.com/Raku/Documentable) at this moment). For that reason, the described behavior is the one **currently** implemented in `Documentable:ver<2.0.0>`. | ||
|
||
Right now, there are some tests for URL generation in `Documentable`, but they are scattered and insufficient, so when this specification is finished, a dedicated set of tests will have to be created, possibly as a JSON spec file, as [Mustache](https://github.com/mustache/spec/tree/master/specs) does. | ||
|
||
The official documentation contains different kinds of pages: some are generated directly from a single [source file](https://docs.raku.org/type/Associative), while others are generated by grouping certain parts of several [source files](https://docs.raku.org/routine/(%7C),%20infix%20%E2%88%AA). To represent this information in a manageable way, we use certain data structures, naming conventions and metadata, but everything is based on **indexable objects**. | ||
|
||
## Indexable Object | ||
|
||
An indexable object is a set of information documenting one thing or several related things that can be referred to. In order to clarify this definition, you can think of an indexable object as the documentation for a certain [type](https://github.com/Raku/doc/blob/master/doc/Type/Any.pod6), for some [method](https://github.com/Raku/doc/blob/aec4740ded31770c799b5e236d9e5d423b8f988b/doc/Type/Any.pod6#L19-L34) or a [tutorial](https://github.com/Raku/doc/blob/master/doc/Language/grammar_tutorial.pod6). Even [references](https://github.com/Raku/doc/blob/master/doc/Type/Any.pod6#L102) are indexable objects. | ||
|
||
We can extract information from these objects and attach additional information to them, in order to classify them and create URIs that refer to them. **All indexable objects** share the following attributes, with different values: | ||
|
||
Every indexable object gets its own URI?
Yep. Though URIs of secondary objects are not unique (all secondary objects with the same |
||
### Attributes | ||
|
||
#### name | ||
|
||
A relatively short string that names the object. For instance: `Any`, `X::AdHoc`, `101-basics`, etc. See [indexable object types](#indexable-object-types). | ||
|
||
#### pod | ||
|
||
Pod representing the indexable object. This pod *does not have to be* a `=begin pod ... =end pod` block. It *must* be a [Pod::Block](https://docs.raku.org/type/Pod::Block) or an array containing `Pod::Block`s. | ||
|
||
#### kind | ||
|
||
This is a **fixed** list of values that gives a coarse classification of what the pod represents. | ||
|
||
~~~perl6 | ||
enum Kind (Type, Language, Programs, Syntax, Routine, Reference); | ||
~~~ | ||
|
||
The first three cannot be easily deduced from the indexable object, so they need to be **specified by the user**, whereas the last three can be deduced without much trouble. Set one of these values on the indexable object, depending on the documentation you are trying to represent: | ||
|
||
- `Type`: when the documentation is about a class, a role or an enum. | ||
- `Language`: when the documentation is related to the language itself. | ||
- `Programs`: when the documentation describes a program: a debugger, for instance. | ||
|
||
Automatically deduced: | ||
|
||
- `Syntax`: when the documentation is related to a `twigil`, `constant`, `variable`, `quote` or `declarator`<sup>1</sup>. | ||
- `Routine`: when the documentation is related to a `sub`, `method`, `term`, `routine`, `submethod`, `trait`, `infix`, `prefix`, `postfix`, `circumfix`, `postcircumfix` or a `listop`. | ||
- `Reference`: when the documentation is just an `X<>` element. | ||
|
||
#### subkinds | ||
|
||
This is used as a more granular classification of indexable objects, based on the contents of the documentation. The value of these subkinds depends on the kind of the indexable object: | ||
|
||
- `Type`: specified by the user. | ||
- `Language`: specified by the user. | ||
- `Programs`: specified by the user. | ||
- `Routine`: deduced from the pod. A list containing a subset of these values: `infix, prefix, postfix, circumfix, postcircumfix, listop, sub, method, term, routine, submethod, trait`. | ||
- `Syntax`: deduced from the pod. A list containing a subset of these values: `twigil, constant, variable, quote, declarator` <sup>2</sup>. | ||
- `Reference`: indexable objects of this kind always have the same `subkinds` value: `['reference']`. | ||
|
||
#### categories | ||
|
||
This is also used as a more granular classification of indexable objects. However, this classification is not based entirely on the contents of the documentation. Its value also depends on the kind of the indexable object: | ||
|
||
- `Type`: specified by the user. | ||
- `Language`: specified by the user. | ||
- `Programs`: specified by the user. | ||
- `Routine`: same as `subkinds`, except if `subkinds` contains one of the following values: `infix, prefix, postfix, circumfix, postcircumfix, listop`. In that case, `categories` is always `['operator']`. | ||
- `Syntax`: same as `subkinds`<sup>3</sup>. | ||
- `Reference`: `categories` is not set for indexable objects of this kind. | ||
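
The deduction rule for `Routine` and `Syntax` can be sketched as follows. This is a minimal illustration written in Python rather than Raku, not the code used by `Documentable`. The helper name `deduce_categories` is hypothetical, and the singular label `operator` is the one used in the `Kind::Routine` attributes section of this document.

```python
# Subkinds that are grouped under the single "operator" category.
OPERATOR_SUBKINDS = {"infix", "prefix", "postfix", "circumfix", "postcircumfix", "listop"}


def deduce_categories(kind, subkinds):
    """Return the categories of a Routine or Syntax object (hypothetical helper)."""
    kind = kind.lower()
    if kind == "routine" and any(s in OPERATOR_SUBKINDS for s in subkinds):
        # Any operator-like subkind collapses the categories to one value.
        return ["operator"]
    if kind in ("routine", "syntax"):
        # Otherwise the categories mirror the subkinds.
        return list(subkinds)
    # Type, Language and Programs take their categories from the user.
    raise ValueError(f"categories are not deduced for kind {kind!r}")
```

For example, `deduce_categories("Routine", ["method"])` mirrors the subkinds, while any operator subkind yields `["operator"]`.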
|
||
## Indexable object types | ||
|
||
### Primary indexable objects | ||
|
||
#### Pod blocks | ||
|
||
A primary indexable object is created from a `pod block`. A pod block is just a pod structure like this one: | ||
|
||
~~~perl6 | ||
=begin pod | ||
... | ||
=end pod | ||
~~~ | ||
|
||
But that one is not *valid*. For a pod block to be a primary indexable object, it needs to comply with the following rules: | ||
|
||
- It must have a `=TITLE`. | ||
- It must have a `=SUBTITLE`. | ||
- It must contain three different key/value pairs following the format: `:kind(<string>) :subkind(<string>) :category(<string>)`. | ||
- `:kind` has to be exactly one of the stringified versions of the first three `Kind`s: `:kind("Type")`, `:kind("Language")` or `:kind("Programs")`. | ||
- `:subkind` is an arbitrary string. | ||
- `:category` is an arbitrary string. | ||
|
||
So, a valid primary indexable object is something like this: | ||
|
||
~~~perl6 | ||
=begin pod :kind("Language") :subkind("Language") :category("migration") | ||
=TITLE Perl to Raku guide - functions | ||
=SUBTITLE Builtin functions in Perl to Raku | ||
=end pod | ||
~~~ | ||
|
||
These key/value pairs are where you set the values of [subkinds](#subkinds) and [categories](#categories) for the first three kinds. | ||
|
||
Are those values fixed by this document? |
||
#### Name attribute | ||
|
||
The `name` attribute depends on the kind specified by the user in the primary indexable object: | ||
|
||
##### Kind Type | ||
|
||
In this case, the last word of the `=TITLE` element is taken as the name. So, if we have the following primary indexable object: | ||
|
||
~~~perl6 | ||
=begin pod :kind("Type") :subkind("class") :category("basic") | ||
=TITLE class Any | ||
=SUBTITLE Thing/object | ||
class Any is Mu {} | ||
=end pod | ||
~~~ | ||
|
||
Its name would be `Any`. | ||
|
||
##### Kind Language and Programs | ||
|
||
In this case, due to the arbitrariness of the `=TITLE` element, we cannot deduce a name, so we take the name of the file, stripping out the extension. So, if we have the following primary indexable object: | ||
|
||
~~~perl6 | ||
=begin pod :kind("Language") :subkind("Language") :category("migration") | ||
=TITLE Perl to Raku guide - functions | ||
=SUBTITLE Builtin functions in Perl to Raku | ||
=end pod | ||
~~~ | ||
|
||
stored in `/SomeDirectory/perl-raku-guide.pod6`, its name would be `perl-raku-guide`. | ||
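
The two naming rules above can be sketched together. This is an illustrative Python sketch, not the `Documentable` implementation. The function `primary_name` and its parameters are hypothetical names introduced only for this example.

```python
from pathlib import PurePosixPath


def primary_name(kind, title, path):
    """Deduce the name of a primary indexable object (hypothetical helper).

    kind  -- value of the :kind key, e.g. "Type" or "Language"
    title -- contents of the =TITLE element
    path  -- path of the source file holding the pod block
    """
    kind = kind.lower()
    if kind == "type":
        # Kind Type: the last word of =TITLE is the name.
        return title.split()[-1]
    if kind in ("language", "programs"):
        # Kind Language and Programs: file name without its extension.
        return PurePosixPath(path).stem
    raise ValueError(f"not a primary kind: {kind!r}")
```

With the examples from this section, `primary_name("Type", "class Any", "doc/Type/Any.pod6")` gives `Any`, and the migration guide stored in `/SomeDirectory/perl-raku-guide.pod6` gets the name `perl-raku-guide`.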
|
||
#### Multiple pod blocks | ||
|
||
Several primary indexable objects of `Kind::Type` can be written in the same file as follows: | ||
|
||
~~~perl6 | ||
=begin pod :kind("Type") :subkind("class") :category("basic") | ||
=TITLE class Any | ||
=SUBTITLE Thing/object | ||
class Any is Mu {} | ||
=end pod | ||
|
||
=begin pod :kind("Type") :subkind("enum") :category("basic") | ||
=TITLE enum Bool | ||
=SUBTITLE Logical Boolean | ||
=end pod | ||
~~~ | ||
|
||
Don't know what to make of this. If this document is a URL generation document, it should include all of it, not the directory part. Right now, the name of the file is taken directly from the name of the primary file. What happens in this case?
Multi class files are only supported for types, so they have different names. I will clarify that. |
||
They will be treated as two independent primary indexable objects. | ||
|
||
### Secondary indexable objects | ||
|
||
A primary indexable object can contain a lot of documentation. For instance, [Any](https://github.com/Raku/doc/blob/master/doc/Type/Any.pod6) has a very long list of methods. To provide more granular documentation, we can extract certain parts of that pod and create more indexable objects from them. | ||
|
||
#### Kind::Routine | ||
|
||
To detect those parts, we use [pod headers](https://docs.raku.org/type/Pod::Heading). Not all pod headers are valid: they need to follow one of these formats: | ||
|
||
- `[T|t]he <single-name> <subkind>` | ||
- `<subkind> <name>` | ||
|
||
where | ||
|
||
- `<subkind>` is one element of `infix, prefix, postfix, circumfix, postcircumfix, listop, sub, method, term, routine, submethod, trait`. | ||
- `<single-name>` is a single word (without spaces). | ||
- `<name>` can be formed by several words separated by spaces. | ||
|
||
##### Attributes | ||
|
||
- `kind` is set to `Kind::Routine`. | ||
- `name` is set to `<single-name>` or `<name>`. | ||
- `subkinds` is set to `(<subkind>)`. | ||
- `categories`: | ||
- If subkind is one of `infix, prefix, postfix, circumfix, postcircumfix, listop`, then it will be set to `("operator")`. | ||
- If subkind is one of `sub, method, term, routine, submethod, trait`, then it will be set to the same value as `subkinds`. | ||
|
||
#### Kind::Syntax | ||
|
||
To detect those parts, we use [pod headers](https://docs.raku.org/type/Pod::Heading). Not all pod headers are valid: they need to follow one of these formats: | ||
|
||
Don't say "if necessary". It's going to be used in the URL fragment. So this should be part of the URL specification too. |
||
- `[T|t]he <single-name> <subkind>` | ||
- `<subkind> <name>` | ||
|
||
where | ||
|
||
- `<subkind>` is one element of `twigil, constant, variable, quote, declarator`. | ||
- `<single-name>` is a single word (without spaces). | ||
- `<name>` can be formed by several words separated by spaces. | ||
|
||
##### Attributes | ||
|
||
- `kind` is set to `Kind::Syntax`. | ||
- `name` is set to `<single-name>` or `<name>`. | ||
- `subkinds` is set to `(<subkind>)`. | ||
- `categories`: will be set to same value as `subkinds`. | ||
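
The two header formats shared by `Kind::Routine` and `Kind::Syntax` can be recognized with a small parser. The sketch below is in Python and only illustrates the rules as written in this document. The names `kind_of` and `parse_heading` are hypothetical, and real headers in the documentation may need handling that this sketch does not cover.

```python
ROUTINE_SUBKINDS = {
    "infix", "prefix", "postfix", "circumfix", "postcircumfix", "listop",
    "sub", "method", "term", "routine", "submethod", "trait",
}
SYNTAX_SUBKINDS = {"twigil", "constant", "variable", "quote", "declarator"}


def kind_of(subkind):
    """Map a subkind word to its kind, or None if it is not a known subkind."""
    if subkind in ROUTINE_SUBKINDS:
        return "routine"
    if subkind in SYNTAX_SUBKINDS:
        return "syntax"
    return None


def parse_heading(heading):
    """Return kind, name and subkinds for a valid header, else None."""
    words = heading.split()
    # Format 1: "[T|t]he <single-name> <subkind>"
    if len(words) == 3 and words[0] in ("The", "the"):
        kind = kind_of(words[2])
        if kind:
            return {"kind": kind, "name": words[1], "subkinds": [words[2]]}
    # Format 2: "<subkind> <name>", where <name> may span several words
    if len(words) >= 2:
        kind = kind_of(words[0])
        if kind:
            return {"kind": kind, "name": " ".join(words[1:]), "subkinds": [words[0]]}
    return None
```

A header such as `method sort` matches the second format, `The ^ twigil` matches the first, and a plain header such as `Introduction` produces no secondary object.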
|
||
`=headn X<>`<sup>1</sup><sup>2</sup><sup>3</sup> is also a valid header. | ||
|
||
#### Kind::Reference | ||
|
||
These secondary indexable objects come from `X<>` elements (see [Pod::FormattingCode](https://docs.raku.org/type/Pod::FormattingCode)). They have to be written as follows: | ||
|
||
~~~perl6 | ||
X<text|meta> | ||
~~~ | ||
|
||
`meta` is a string containing several groups of words, with groups separated by `;` and the words inside each group separated by `,`. For instance: `foo, bar; w`. Raku would interpret that `meta` attribute as `[ [foo bar] [w] ]`, that is, a list containing two lists: one with two elements and another with a single element. | ||
|
||
From a single `X<>` element, several secondary indexable objects can be created, one for every group of words found in `meta`. For instance: | ||
|
||
~~~perl6 | ||
X<text|a;b,c;d> | ||
~~~ | ||
|
||
Would be interpreted as if you had typed: | ||
|
||
~~~perl6 | ||
X<text|a> | ||
X<text|b,c> | ||
X<text|d> | ||
~~~ | ||
|
||
`text` or `meta` can be empty strings, but not both at the same time, so `X<|meta>` and `X<text>` are valid references. | ||
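
The splitting of `meta` into groups, and the expansion of one `X<>` element into several references, can be sketched like this. The code is an illustrative Python sketch, not how Raku parses Pod. `parse_meta` and `expand_reference` are hypothetical names, and trimming whitespace around words is an assumption based on the `foo, bar; w` example.

```python
def parse_meta(meta):
    """Split a meta string into groups (on ';') of words (on ',')."""
    if not meta.strip():
        return []
    return [[word.strip() for word in group.split(",")] for group in meta.split(";")]


def expand_reference(text, meta):
    """Return one (text, group) pair per group found in meta."""
    if not text and not meta:
        raise ValueError("text and meta cannot both be empty")
    # An empty meta still yields one reference, with an empty group;
    # how that reference is named is covered in the Attributes section.
    groups = parse_meta(meta) or [[]]
    return [(text, group) for group in groups]
```

Under these assumptions, `expand_reference("text", "a;b,c;d")` yields three pairs, one each for the groups `a`, `b,c` and `d`.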
|
||
##### Attributes | ||
|
||
In all cases, `kind` and `subkinds` are set to `Kind::Reference` and `['reference']` respectively. `categories` attribute is not set in these indexable objects. | ||
|
||
`name` setting depends on `meta` variable: | ||
|
||
- `meta` is an empty string: `meta` is set to `[text]`, so the reference is interpreted as `X<text|text>`. | ||
- `meta` has only one element: `name` is set to the stringified version of `meta`. So `X<|a>` would get the name `a`. | ||
- `meta` has more than one element: `name` is set to the last element followed by the remaining elements in parentheses. So `X<|a,b,c>` would get the name `c (a b)`. | ||
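
These three cases fit in a few lines. The sketch below is in Python and purely illustrative. `reference_name` is a hypothetical helper that receives the `text` part and one already-parsed group of `meta` words.

```python
def reference_name(text, group):
    """Compute the name of a reference from its text and one meta group."""
    if not group:
        # Empty meta: behave as if the reference were X<text|text>.
        group = [text]
    if len(group) == 1:
        return group[0]
    # Several words: the last one, then the others in parentheses.
    return f"{group[-1]} ({' '.join(group[:-1])})"
```

So `reference_name("", ["a", "b", "c"])` gives `c (a b)`, matching the `X<|a,b,c>` example.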
|
||
##### URI setting | ||
|
||
The URI of these indexable objects depends on the primary indexable object where the reference was found. The URI is formed as follows: | ||
|
||
~~~perl6 | ||
"{$origin.uri}#index-entry-{$meta}-{$index-text}" | ||
~~~ | ||
|
||
Maybe a table that clarifies how any URI is generated?
I have not added a table yet because there are over 60 valid headers and references. Nonetheless, I have added a list of tests to check that the document describes the current behavior in |
||
where: | ||
- `$origin.uri` is the URI assigned to the primary indexable object where the reference was found. | ||
- `$meta` is the concatenation by `-` of the groups found in `meta`. | ||
- `$index-text` is `text`. | ||
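
The same template can be written as a short Python sketch. It is an illustration only. `reference_uri` is a hypothetical name, and because this document does not say how the words inside one group are serialized, each group is passed in as an already-built string.

```python
def reference_uri(origin_uri, groups, text):
    """Build the URI of a reference found inside a primary object.

    origin_uri -- URI of the primary indexable object
    groups     -- meta groups, each one already serialized as a string
    text       -- the text part of the X<> element
    """
    meta = "-".join(groups)  # groups are concatenated with "-"
    return f"{origin_uri}#index-entry-{meta}-{text}"
```

For instance, a reference with groups `a` and `b` and text `text`, found in the object whose URI is `/type/Any`, would get `/type/Any#index-entry-a-b-text` under these assumptions.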
|
||
## URI setting | ||
|
||
All indexable objects have an associated *Uniform Resource Identifier* or URI. It is formed based on the common attributes of all indexable objects, as follows: | ||
|
||
~~~perl6 | ||
"/{$kind.lc}/{$name}" | ||
~~~ | ||
|
||
## URI rewriting | ||
|
||
`Raku` accepts a huge range of symbols in names, so the `name` attribute can contain characters that are problematic in a URI. For this reason, `name` needs to be slightly altered to generate valid URLs. The alteration consists of these replacements: | ||
|
||
~~~perl6 | ||
/ => $SOLIDUS | ||
% => $PERCENT_SIGN | ||
^ => $CIRCUMFLEX_ACCENT | ||
# => $NUMBER_SIGN | ||
' ' => _ | ||
~~~ | ||
I never really liked these, any thoughts on changing them to something else? For example, CIRCUMFLEX_ACCENT can be simplified to CARET, but maybe there's an even better way?
To tell you the truth, I do not know. Some of those characters are there because they are not valid in paths. So maybe they should only be changed in the filename of the HTML page and redirected from the tool serving the pages. With a dynamic site, we would not need those I think. |
||
|
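
Putting the URI template and the replacement table together, URI generation can be sketched as below. This is a Python illustration of the rules in this document, not the `Documentable` code. `rewrite_name` and `object_uri` are hypothetical names, and applying the rewriting inside the URI template is an assumption drawn from the text above.

```python
# Replacement table from this section: one character in, one string out.
REPLACEMENTS = str.maketrans({
    "/": "$SOLIDUS",
    "%": "$PERCENT_SIGN",
    "^": "$CIRCUMFLEX_ACCENT",
    "#": "$NUMBER_SIGN",
    " ": "_",
})


def rewrite_name(name):
    """Apply every replacement in a single pass over the name."""
    return name.translate(REPLACEMENTS)


def object_uri(kind, name):
    """Build "/{kind.lc}/{name}" using the rewritten name."""
    return f"/{kind.lower()}/{rewrite_name(name)}"
```

A single pass is enough here because none of the replacement strings contains a character that is itself replaced.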
||
## Examples | ||
|
||
This specification is intended to be independent of the tool used, but as this is the first version, it is entirely based on the behavior of `Documentable:ver<2.0.0>`. The following examples let you check the concepts explained above for yourself: | ||
|
||
##### Get all URLs from Primary objects | ||
|
||
~~~perl6 | ||
use Documentable:ver<2.0.0>; | ||
use Documentable::Registry:ver<2.0.0>; | ||
|
||
my $registry = Documentable::Registry.new( | ||
:topdir("doc"), | ||
:dirs(DOCUMENTABLE-DIRS), | ||
:!verbose, | ||
); | ||
|
||
$registry.compose; | ||
|
||
say $registry.documentables.map({.url}); | ||
~~~ | ||
|
||
##### Get all URLs from Secondary objects | ||
|
||
~~~perl6 | ||
use Documentable:ver<2.0.0>; | ||
use Documentable::Registry:ver<2.0.0>; | ||
|
||
my $registry = Documentable::Registry.new( | ||
:topdir("doc"), | ||
:dirs(DOCUMENTABLE-DIRS), | ||
:!verbose, | ||
); | ||
|
||
$registry.compose; | ||
|
||
say $registry.definitions.map({.url}); | ||
~~~ | ||
|
||
##### Classification of secondary objects by name | ||
|
||
~~~perl6 | ||
use Documentable:ver<2.0.0>; | ||
use Documentable::Registry:ver<2.0.0>; | ||
|
||
my $registry = Documentable::Registry.new( | ||
:topdir("doc"), | ||
:dirs(DOCUMENTABLE-DIRS), | ||
:!verbose, | ||
); | ||
|
||
$registry.compose; | ||
|
||
my %routine-documents = $registry.lookup("routine", :by<kind>).categorize({.name}); | ||
my %syntax-documents = $registry.lookup("syntax", :by<kind>).categorize({.name}); | ||
|
||
say %routine-documents<⊅>; | ||
~~~ | ||
|
||
<sup>1</sup> Certain kinds of headers (`=headn X<>`) are marked as `Syntax` too, but there is no logical reason to do so, so this needs to be fixed. This behavior is inherited from the old `htmlify.p6`. | ||
|
||
<sup>2</sup> Additionally, `=headn X<>` is an indexable object whose subkinds are its meta part. So, for instance, `=headn X<|foo>` is an indexable object of kind `Syntax` with subkinds set to `('foo')`. This also has to be changed, but once again, this behavior is inherited from `htmlify.p6`. | ||
|
||
<sup>3</sup> The behavior described in <sup>2</sup> for `subkinds` also applies to `categories`. | ||
Shouldn't this document also talk about backward compatibility? (Cool URIs don't change)
I do not think so. In addition, backward compatibility is going to be kind of hard because right now URIs are generated very weirdly in some cases. In any case, most URIs are not going to change. What will change (I think) is fragment generation.
JJ just told me about this PR. Some comments:
|
Probably add here some general principles. What we want from this URL scheme, things like compatibility, testability, things like that.