Raku · antoniogamiz · Jul 31, 2020 · Jul 31, 2020 · JJ · Jul 31, 2020
@@ -0,0 +1,362 @@
+# URI Specification v1.0.0
+
+- [Introduction notes](#introduction-notes)
+- [Indexable Object](#indexable-object)
+  - [Attributes](#attributes)
+    - [name](#name)
+    - [pod](#pod)
+    - [kind](#kind)
+    - [subkinds](#subkinds)
+    - [categories](#categories)
+- [Indexable object types](#indexable-object-types)
+  - [Primary indexable objects](#primary-indexable-objects)
+    - [Pod blocks](#pod-blocks)
+    - [Name attribute](#name-attribute)
+      - [Kind Type](#kind-type)
+      - [Kind Language and Programs](#kind-language-and-programs)
+    - [Multiple pod blocks](#multiple-pod-blocks)
+  - [Secondary indexable objects](#secondary-indexable-objects)
+    - [Kind::Routine](#kindroutine)
+      - [Attributes](#attributes-1)
+    - [Kind::Syntax](#kindsyntax)
+      - [Attributes](#attributes-2)
+    - [Kind::Reference](#kindreference)
+      - [Attributes](#attributes-3)
+      - [URI setting](#uri-setting)
+- [URI setting](#uri-setting-1)
+- [URI rewriting](#uri-rewriting)
+- [Examples](#examples)
+      - [Get all URLs from Primary objects](#get-all-urls-from-primary-objects)
+      - [Get all URLs from Secondary objects](#get-all-urls-from-secondary-objects)
+      - [Classification of secondary objects by name](#classification-of-secondary-objects-by-name)
+
+## Introduction notes
+
+This specification sets common guidelines and rules for URLs in the [official documentation](https://docs.raku.org). It must be implemented by the tools that generate the HTML pages of the doc site ([Documentable](https://github.com/Raku/Documentable) at the moment). For that reason, the described behavior is the one **currently** implemented in `Documentable:ver<2.0.0>`.
+
+Right now, `Documentable` has some tests for URL generation, but they are scattered and insufficient, so once this specification is finished, a dedicated set of tests will have to be created, possibly as a JSON spec file, as [Mustache](https://github.com/mustache/spec/tree/master/specs) does.
+
+The official documentation contains many different things, from pages generated directly from a single [source file](https://docs.raku.org/type/Associative) to pages generated by grouping certain parts of several [source files](https://docs.raku.org/routine/(%7C),%20infix%20%E2%88%AA). To represent this information in a manageable way, we use certain data structures, naming conventions and metadata, but everything is based on **indexable objects**.
+
+## Indexable Object
+
+An indexable object is a set of information documenting one thing or several related things that can be referred to. In order to clarify this definition, you can think of an indexable object as the documentation for a certain [type](https://github.com/Raku/doc/blob/master/doc/Type/Any.pod6), for some [method](https://github.com/Raku/doc/blob/aec4740ded31770c799b5e236d9e5d423b8f988b/doc/Type/Any.pod6#L19-L34) or a [tutorial](https://github.com/Raku/doc/blob/master/doc/Language/grammar_tutorial.pod6). Even [references](https://github.com/Raku/doc/blob/master/doc/Type/Any.pod6#L102) are indexable objects.
+
+We can extract and attach additional information to these objects in order to classify them and create URIs that refer to them. **All indexable objects** share these attributes, but with different values:
+
+### Attributes
+
+#### name
+
+A relatively short string that names the object. For instance: `Any`, `X::AdHoc`, `101-basics`, etc. See [indexable object types](#indexable-object-types).
+
+#### pod
+
+Pod representing the indexable object. This pod *does not have to be* a `=begin pod ... =end pod` block. It *must* be a [Pod::Block](https://docs.raku.org/type/Pod::Block) or an array containing `Pod::Block`s.
+
+#### kind
+
+This is a **fixed** list of values that gives a coarse classification of what the pod represents.
+
+~~~perl6
+enum Kind (Type, Language, Programs, Syntax, Routine, Reference);
+~~~
+
+The first three cannot be easily deduced from the indexable object, so they need to be **specified by the user**, whereas the last three can be deduced without much trouble. You should assign one of these to an indexable object depending on the documentation you are trying to represent:
+
+- `Type`: when the documentation is about a class, a role or an enum.
+- `Language`: when the documentation is related to the language itself.
+- `Programs`: when the documentation describes a program: a debugger, for instance.
+
+Automatically deduced:
+
+- `Syntax`: when the documentation is related to a `twigil`, `constant`, `variable`, `quote` or `declarator`<sup>1</sup>.
+- `Routine`: when the documentation is related to a `sub`, `method`, `term`, `routine`, `submethod`, `trait`, `infix`, `prefix`, `postfix`, `circumfix`, `postcircumfix` or a `listop`.
+- `Reference`: when the documentation is just an `X<>` element.
+
+#### subkinds
+
+This is used as a more granular classification of indexable objects, based on the contents of the documentation. The value of these subkinds depends on the kind of the indexable object:
+
+- `Type`: specified by the user.
+- `Language`: specified by the user.
+- `Programs`: specified by the user.
+- `Routine`: deduced from the pod. A list containing a subset of these values: `infix, prefix, postfix, circumfix, postcircumfix, listop, sub, method, term, routine, submethod, trait`.
+- `Syntax`: deduced from the pod. A list containing a subset of these values: `twigil, constant, variable, quote, declarator`<sup>2</sup>.
+- `Reference`: indexable objects of this kind always have the same `subkinds` value: `['reference']`.
+
+#### categories
+
+This is also used as a more granular classification of indexable objects. However, this classification is not based entirely on the contents of the documentation. Its value also depends on the kind of the indexable object:
+
+- `Type`: specified by the user.
+- `Language`: specified by the user.
+- `Programs`: specified by the user.
+- `Routine`: same as `subkinds`, except if `subkinds` contains one of the following values: `infix, prefix, postfix, circumfix, postcircumfix, listop`. In that case, `categories` is always `['operator']`.
+- `Syntax`: same as `subkinds`<sup>3</sup>.
+- `Reference`: the `categories` attribute is not set for indexable objects of this kind.
+
+## Indexable object types
+
+### Primary indexable objects
+
+#### Pod blocks
+
+A primary indexable object is created from a `pod block`. A pod block is just a pod structure like this one:
+
+~~~perl6
+=begin pod
+...
+=end pod
+~~~
+
+But that's not a *valid* one. For a pod block to be a primary indexable object, it needs to comply with some rules:
+
+- It must have a `=TITLE`.
+- It must have a `=SUBTITLE`.
+- It must contain three different key/value pairs following the format: `:kind(<string>) :subkind(<string>) :category(<string>)`.
+  - `:kind` has to be exactly one of the stringified versions of the first three `Kind`s: `:kind("Type")`, `:kind("Language")` or `:kind("Programs")`.
+  - `:subkind` is an arbitrary string.
+  - `:category` is an arbitrary string.
+
+So, a valid primary indexable object is something like this:
+
+~~~perl6
+=begin pod :kind("Language") :subkind("Language") :category("migration")
+=TITLE Perl to Raku guide - functions
+=SUBTITLE Builtin functions in Perl to Raku
+=end pod
+~~~
+
+These key/value pairs set the value of [subkinds](#subkinds) and [categories](#categories) for the first three kinds.
+
+#### Name attribute
+
+The `name` attribute depends on the kind specified by the user in the primary indexable object:
+
+##### Kind Type
+
+In this case, the last word of the `=TITLE` element is taken as name. So, if we have the following primary indexable object:
+
+~~~perl6
+=begin pod :kind("Type") :subkind("class") :category("basic")
+=TITLE class Any
+=SUBTITLE Thing/object
+    class Any is Mu {}
+=end pod
+~~~
+
+Its name would be `Any`.
+
+##### Kind Language and Programs
+
+In this case, due to the arbitrariness of the `=TITLE` element, we cannot deduce a name, so we take the name of the file, stripping out the extension. So, if we have the following primary indexable object:
+
+~~~perl6
+=begin pod :kind("Language") :subkind("Language") :category("migration")
+=TITLE Perl to Raku guide - functions
+=SUBTITLE Builtin functions in Perl to Raku
+=end pod
+~~~
+
+stored in `/SomeDirectory/perl-raku-guide.pod6`, its name would be `perl-raku-guide`.
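+
+The two naming rules above can be sketched as follows. This is only an illustration: `primary-name` and its named parameters are hypothetical and are not part of `Documentable`'s API.
+
+~~~perl6
+# Hypothetical helper illustrating the naming rules; not Documentable's API.
+sub primary-name(Str :$kind!, Str :$title!, Str :$filename!) {
+    $kind eq 'Type'
+        ?? $title.words.tail                                    # last word of =TITLE
+        !! $filename.IO.basename.subst(/ '.' <-[.]>+ $ /, '')   # file name, extension stripped
+}
+
+say primary-name(:kind<Type>, :title('class Any'), :filename('Any.pod6'));
+# Any
+say primary-name(:kind<Language>, :title('Perl to Raku guide - functions'),
+                 :filename('/SomeDirectory/perl-raku-guide.pod6'));
+# perl-raku-guide
+~~~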
+
+#### Multiple pod blocks
+
+Several primary indexable objects of `Kind::Type` can be written in the same file as follows:
+
+~~~perl6
+=begin pod :kind("Type") :subkind("class") :category("basic")
+=TITLE class Any
+=SUBTITLE Thing/object
+    class Any is Mu {}
+=end pod
+
+=begin pod :kind("Type") :subkind("enum") :category("basic")
+=TITLE enum Bool
+=SUBTITLE Logical Boolean
+=end pod
+~~~
+
+They will be treated as two independent primary indexable objects.
+
+### Secondary indexable objects
+
+A primary indexable object can contain a lot of documentation. For instance, [Any](https://github.com/Raku/doc/blob/master/doc/Type/Any.pod6) has a very long list of methods. In order to give more granular documentation, we can extract certain parts of that pod and create more indexable objects.
+
+#### Kind::Routine
+
+To detect those parts, we use [pod headers](https://docs.raku.org/type/Pod::Heading). Not all pod headers are valid, though: they need to follow one of these formats:
+
+- `[T|t]he <single-name> <subkind>`
+- `<subkind> <name>`
+
+where
+
+- `<subkind>` is one element of `infix, prefix, postfix, circumfix, postcircumfix, listop, sub, method, term, routine, submethod, trait`.
+- `<single-name>` is a single word (without spaces).
+- `<name>` can be formed by several words separated by spaces.
+
+##### Attributes
+
+- `kind` is set to `Kind::Routine`.
+- `name` is set to `<single-name>` or `<name>`.
+- `subkinds` is set to `(<subkind>)`.
+- `categories`:
+  - If subkind is one of `infix, prefix, postfix, circumfix, postcircumfix, listop`, then it will be set to `("operator")`.
+  - If subkind is one of `sub, method, term, routine, submethod, trait`, then it will be set to the same value as `subkinds`.
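+
+As a rough sketch of the rules above, the following code recognizes both header formats and fills in the attributes. The sub name `parse-routine-heading`, the regexes and the returned hash layout are assumptions made for illustration; `Documentable` itself may implement this differently.
+
+~~~perl6
+# Hypothetical sketch; none of these names belong to Documentable's API.
+my constant @OPERATORS = <infix prefix postfix circumfix postcircumfix listop>;
+my constant @ROUTINES  = <sub method term routine submethod trait>;
+
+sub parse-routine-heading(Str $heading) {
+    my @subkinds = |@OPERATORS, |@ROUTINES;
+    my ($name, $subkind);
+    # Format 1: "[T|t]he <single-name> <subkind>"
+    if $heading ~~ / ^ <[Tt]> 'he' \s+ (\S+) \s+ (\w+) $ / and ~$1 (elem) @subkinds {
+        ($name, $subkind) = ~$0, ~$1;
+    }
+    # Format 2: "<subkind> <name>", where <name> may contain spaces
+    elsif $heading ~~ / ^ (\w+) \s+ (.+) $ / and ~$0 (elem) @subkinds {
+        ($subkind, $name) = ~$0, ~$1;
+    }
+    else {
+        return Nil;   # not a valid Kind::Routine header
+    }
+    # Operator-like subkinds share a single category.
+    my @categories = $subkind (elem) @OPERATORS ?? 'operator' !! $subkind;
+    %( :kind<Routine>, :$name, :subkinds[$subkind], :@categories );
+}
+
+my %m = parse-routine-heading('trait is export');
+say %m<name>;                                       # is export
+say %m<categories>;                                 # [trait]
+say parse-routine-heading('infix ∪')<categories>;   # [operator]
+~~~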
+
+#### Kind::Syntax
+
+To detect those parts, we use [pod headers](https://docs.raku.org/type/Pod::Heading). Not all pod headers are valid, though: they need to follow one of these formats:
+
+- `[T|t]he <single-name> <subkind>`
+- `<subkind> <name>`
+
+where
+
+- `<subkind>` is one element of `twigil, constant, variable, quote, declarator`.
+- `<single-name>` is a single word (without spaces).
+- `<name>` can be formed by several words separated by spaces.
+
+##### Attributes
+
+- `kind` is set to `Kind::Syntax`.
+- `name` is set to `<single-name>` or `<name>`.
+- `subkinds` is set to `(<subkind>)`.
+- `categories` is set to the same value as `subkinds`.
+
+`=headn X<>`<sup>1</sup><sup>2</sup><sup>3</sup> is also a valid header.
+
+#### Kind::Reference
+
+These secondary indexable objects come from `X<>` elements (see [Pod::FormattingCode](https://docs.raku.org/type/Pod::FormattingCode)). They have to be written as follows:
+
+~~~perl6
+X<text|meta>
+~~~
+
+`meta` is a string containing several groups of words, separated by `;`, with the words inside each group separated by `,`. For instance: `foo, bar; w`. Raku would interpret that `meta` attribute as `[ [foo bar] [w] ]`, that is, a list containing two lists: one with two elements and another with a single element.
+
+From a single `X<>` element, several secondary indexable objects can be created, one for every group of words found in `meta`. For instance:
+
+~~~perl6
+X<text|a;b,c;d>
+~~~
+
+Would be interpreted as if you had typed:
+
+~~~perl6
+X<text|a>
+X<text|b,c>
+X<text|d>
+~~~
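+
+The grouping described above can be reproduced with two nested splits. This is a minimal sketch; `parse-meta` is a hypothetical name, not a `Documentable` routine.
+
+~~~perl6
+# Hypothetical helper: split meta into groups (by ';') and words (by ',').
+sub parse-meta(Str $meta) {
+    $meta.split(';').map({ .split(',').map(*.trim).Array }).Array
+}
+
+say parse-meta('foo, bar; w');      # [[foo bar] [w]]
+say parse-meta('a;b,c;d').elems;    # 3, one secondary indexable object per group
+~~~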
+
+`text` or `meta` can be empty strings, but not both at the same time, so `X<|meta>` and `X<text>` are valid references.
+
+##### Attributes
+
+In all cases, `kind` and `subkinds` are set to `Kind::Reference` and `['reference']` respectively. The `categories` attribute is not set in these indexable objects.
+
+The value of `name` depends on `meta`:
+
+- `meta` is an empty string: `meta` is set to `[text]`, so the element is interpreted as `X<text|text>`.
+- `meta` has only one element: `name` is set to the stringified version of `meta`. So `X<|a>` would get the name `a`.
+- `meta` has more than one element: `name` is set to a rearranged version of `meta`. So `X<|a,b,c>` would get the name `c (a b)`.
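+
+A possible reading of these three cases is sketched below. The rule for several words (last word first, the remaining words in parentheses) is inferred only from the `c (a b)` example, and `reference-name` is a hypothetical helper that receives one already-split group of `meta`.
+
+~~~perl6
+# Hypothetical helper; the multi-word rule is inferred from the example above.
+sub reference-name(Str $text, @group) {
+    # An empty meta falls back to the text of the reference.
+    my @words = @group.elems ?? @group !! [$text];
+    @words.elems == 1
+        ?? @words[0]
+        !! @words.tail ~ ' (' ~ @words[^(@words.elems - 1)].join(' ') ~ ')'
+}
+
+say reference-name('text', []);         # text
+say reference-name('',     ['a']);      # a
+say reference-name('',     <a b c>);    # c (a b)
+~~~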
+
+##### URI setting
+
+The URI of these indexable objects depends on the primary indexable object where the reference was found. The URI is formed as follows:
+
+~~~perl6
+"{$origin.uri}#index-entry-{$meta}-{$index-text}"
+~~~
+
+where:
+
+- `$origin.uri` is the URI assigned to the primary indexable object where the reference was found.
+- `$meta` is the concatenation by `-` of the groups found in `meta`.
+- `$index-text` is `text`.
+
+## URI setting
+
+All indexable objects have an associated *Uniform Resource Identifier* or URI. It is formed based on the common attributes of all indexable objects, as follows:
+
+~~~perl6
+"/{$kind.lc}/{$name}"
+~~~
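+
+For illustration, the template can be wrapped in a small sub. Here `uri` is a hypothetical name and the kind is passed as a plain string rather than as a `Kind` value.
+
+~~~perl6
+# Hypothetical helper applying the template above to a kind and a name.
+sub uri(Str $kind, Str $name) { "/{$kind.lc}/{$name}" }
+
+say uri('Type', 'Any');                    # /type/Any
+say uri('Language', 'perl-raku-guide');    # /language/perl-raku-guide
+~~~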
+
+## URI rewriting
+
+`Raku` accepts a huge range of symbols in identifiers and operators, so the `name` attribute can contain characters that are problematic in a URI. For this reason, `name` needs to be slightly altered to generate valid URLs, by making these replacements:
+
+~~~perl6
+/   => $SOLIDUS
+%   => $PERCENT_SIGN
+^   => $CIRCUMFLEX_ACCENT
+#   => $NUMBER_SIGN
+' ' => _
+~~~
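+
+One way to apply this table in a single pass is `Str.trans` with a list of keys and a list of replacements. This is a sketch under that assumption; `rewrite-name` is a hypothetical name and the real tool may perform the replacements differently.
+
+~~~perl6
+# Hypothetical helper: apply the replacement table above in one pass.
+sub rewrite-name(Str $name) {
+    $name.trans(
+        ['/', '%', '^', '#', ' '] =>
+        ['$SOLIDUS', '$PERCENT_SIGN', '$CIRCUMFLEX_ACCENT', '$NUMBER_SIGN', '_']
+    )
+}
+
+say rewrite-name('infix /');    # infix_$SOLIDUS
+say rewrite-name('%%');         # $PERCENT_SIGN$PERCENT_SIGN
+~~~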
+
+## Examples
+
+This specification is intended to be independent of the tool used, but as this is the first version, it is entirely based on the behavior of `Documentable:ver<2.0.0>`. The following examples let you check the concepts explained above for yourself:
+
+##### Get all URLs from Primary objects
+
+~~~perl6
+use Documentable:ver<2.0.0>;
+use Documentable::Registry:ver<2.0.0>;
+
+my $registry = Documentable::Registry.new(
+    :topdir("doc"),
+    :dirs(DOCUMENTABLE-DIRS),
+    :!verbose,
+);
+
+$registry.compose;
+
+say $registry.documentables.map({.url});
+~~~
+
+##### Get all URLs from Secondary objects
+
+~~~perl6
+use Documentable:ver<2.0.0>;
+use Documentable::Registry:ver<2.0.0>;
+
+my $registry = Documentable::Registry.new(
+    :topdir("doc"),
+    :dirs(DOCUMENTABLE-DIRS),
+    :!verbose,
+);
+
+$registry.compose;
+
+say $registry.definitions.map({.url});
+~~~
+
+##### Classification of secondary objects by name
+
+~~~perl6
+use Documentable:ver<2.0.0>;
+use Documentable::Registry:ver<2.0.0>;
+
+my $registry = Documentable::Registry.new(
+    :topdir("doc"),
+    :dirs(DOCUMENTABLE-DIRS),
+    :!verbose,
+);
+
+$registry.compose;
+
+my %routine-documents = $registry.lookup("routine", :by<kind>).categorize({.name});
+my %syntax-documents = $registry.lookup("syntax", :by<kind>).categorize({.name});
+
+say %routine-documents<⊅>;
+~~~
+
+<sup>1</sup> Certain kinds of headers (`=headn X<>`) too, but there is no logical reason to mark those headers as `Syntax`, so that needs to be fixed. This behavior is inherited from the old `htmlify.p6`.
+
+<sup>2</sup> Additionally, `=headn X<>` is an indexable object whose subkinds are its meta part. For instance, `=headn X<|foo>` is an indexable object of kind `Syntax` with subkinds set to `('foo')`. This also has to be changed, but once again, this behavior is inherited from `htmlify.p6`.
+
+<sup>3</sup> What happens with `subkinds` in <sup>2</sup> also happens with `categories`.