Document Tree model

Jon Ludlam edited this page Nov 5, 2020 · 3 revisions

Introduction

Currently odoc has a notion of packages, to which each module or page belongs. This proposal generalises that scheme to allow better handling of libraries and to make the sidebar more useful. We replace the 'package' notion with a general parent-child relationship as follows.

  • Allow pages (mld files) and modules to have a parent/child relationship
  • Pages can have page children too
  • Only top-level modules need to have specified page parents
  • Replace --package with appropriate use of --parent
  • Pages refer to their children with a tag like {!child:page-Foo} or {!child:Module} - the latter will insert the text module Module : sig ... end with a link to the child. Note we already have {!modules:...} which we might use instead.
  • Pages that don't explicitly refer to their children inline will get them appended to the bottom
  • The sidebar on the page will be constructed from the resulting parent-child tree
  • Support {!modulelist:Foo} and {!modulelist:page-Bar} to list the modules in Foo or the children of a page

This effectively adds support for libraries, both wrapped and unwrapped, by allowing modules to be parented under a library page, which is in turn parented under a package page. This is of course optional, and so wrapped libraries may be parented directly under a package page. Dune/Odig could create these files if they're not explicitly written.

The sidebar would be created by finding the top-most parent and creating a tree from there, using the headings in pages as the nodes of the tree. Rendering of the sidebar tree would render all headings in the current page as usual, but also the headings from sibling pages and ancestors.
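As a rough illustration of this construction, the following Python sketch models pages with a parent link, finds the top-most parent, and renders an indented outline from there. The `Page` class, its fields and the text rendering are assumptions made for this example and are not odoc's data structures; for simplicity it renders the headings of every page in the tree.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Page:
    """Hypothetical stand-in for a compiled page; not an odoc type."""
    name: str
    headings: List[str] = field(default_factory=list)
    parent: Optional["Page"] = None
    children: List["Page"] = field(default_factory=list)

    def add_child(self, child: "Page") -> "Page":
        child.parent = self
        self.children.append(child)
        return child


def top_most(page: Page) -> Page:
    """Follow parent links until a page without a parent is reached."""
    while page.parent is not None:
        page = page.parent
    return page


def sidebar(current: Page) -> List[str]:
    """Render the whole tree as indented lines, marking the current page with '*'."""
    lines: List[str] = []

    def walk(page: Page, depth: int) -> None:
        marker = "*" if page is current else " "
        lines.append(f"{'  ' * depth}{marker} {page.name}")
        for heading in page.headings:
            lines.append(f"{'  ' * (depth + 1)}- {heading}")
        for child in page.children:
            walk(child, depth + 1)

    walk(top_most(current), 0)
    return lines


# Hypothetical three-page document.
root = Page("manual", headings=["Overview"])
intro = root.add_child(Page("intro", headings=["Install", "First steps"]))
api = root.add_child(Page("api", headings=["Modules"]))
print("\n".join(sidebar(intro)))
```

Starting from any page gives the same tree, since the walk always begins at the top-most parent; only the marker moves.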

Mld files that have children would be output as mldname/index.html whereas mld files without children would be output as mldname.html. This preserves compatibility with the current scheme.
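The output rule can be sketched in a few lines of Python. This is only a model of the rule as stated above, under the assumption that every ancestor page contributes one directory; the `output_path` function and the page names (including the leaf page `changes`) are hypothetical.

```python
from typing import Dict, List, Optional


def output_path(name: str,
                parents: Dict[str, Optional[str]],
                children: Dict[str, List[str]]) -> str:
    """Return the HTML path for page `name` under the rule described above."""
    # Collect ancestor names, top-most first. Every ancestor has children
    # by construction, so each one contributes a directory.
    dirs: List[str] = []
    ancestor = parents.get(name)
    while ancestor is not None:
        dirs.insert(0, ancestor)
        ancestor = parents.get(ancestor)
    if children.get(name):
        # A page with children becomes a directory with an index.html.
        return "/".join(dirs + [name, "index.html"])
    # A page without children is written as a single file.
    return "/".join(dirs + [name + ".html"])


# Hypothetical tree: packages -> astring -> changes (a leaf page).
PARENTS = {"packages": None, "astring": "packages", "changes": "astring"}
CHILDREN = {"packages": ["astring"], "astring": ["changes"]}

for page in ("packages", "astring", "changes"):
    print(output_path(page, PARENTS, CHILDREN))
```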

Benefits

There are three broad areas in which this suggestion will offer benefits over the current mechanism.

Directory structure

This allows a more flexible directory structure to be created. For example, in order to create a docs.ocaml.org site, we would like to be able to document multiple versions of packages simultaneously. To do that, we need to allow HTML documents to be written beneath a path that isn't simply the package name. For example, we may wish to arrange for the documentation of module Astring in package astring version 0.8.5 to be found at the following path:

/packages/astring/0.8.5/Astring/index.html

This can be done by creating the following set of mld files:

  1. packages.mld:
     {0 Packages}
     {2 A}
     {!child:astring}
     ...
  2. packages/astring.mld:
     {0 Package 'astring'}
     {1 Versions}
     {!child:0\.8\.5}
  3. packages/astring/0.8.5.mld:
     {0 Modules}
     {!child:Astring}

Once these are compiled and we have a page-0.8.5.odoc file, we compile the module astring.cmti, specifying page-0.8.5.odoc as its parent. Path and reference resolution then proceed as normal, and when output as HTML, the resolved paths will point into this directory structure as expected.
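To make the chain of declarations concrete, here is a small Python sketch that extracts the {!child:...} tags from the three example pages and checks that each step of the path is declared by its parent. It is a toy model, not how odoc parses mld files: the regular expression, the `child_names` and `is_declared_chain` helpers, and the convention of keying pages by their path without the .mld extension are all assumptions of this example.

```python
import re
from typing import Dict, List

# Matches {!child:NAME}; NAME may contain backslash-escaped characters.
CHILD_TAG = re.compile(r"\{!child:((?:\\.|[^}\\])+)\}")


def child_names(mld_text: str) -> List[str]:
    """Return the names in every {!child:...} tag, with escapes removed."""
    return [re.sub(r"\\(.)", r"\1", match.group(1))
            for match in CHILD_TAG.finditer(mld_text)]


def is_declared_chain(files: Dict[str, str], chain: List[str]) -> bool:
    """Check that each element of `chain` is a declared child of the previous one."""
    for depth in range(1, len(chain)):
        parent_text = files.get("/".join(chain[:depth]))
        if parent_text is None or chain[depth] not in child_names(parent_text):
            return False
    return True


# The three example pages, keyed by path without the .mld extension.
FILES = {
    "packages": "{0 Packages}\n{2 A}\n{!child:astring}",
    "packages/astring": r"""{0 Package 'astring'}
{1 Versions}
{!child:0\.8\.5}""",
    "packages/astring/0.8.5": "{0 Modules}\n{!child:Astring}",
}

chain = ["packages", "astring", "0.8.5", "Astring"]
if is_declared_chain(FILES, chain):
    print("/" + "/".join(chain) + "/index.html")
```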

Subpackages

Many packages declare 'subpackages' - for example, core_kernel defines the main package 'core_kernel' as well as several sub-packages, 'core_kernel.moption', 'core_kernel.uuid', etc. These packages contain similarly named libraries. These can be fitted into the above structure by introducing child pages below the 'version' page for each sub-package. For example, we might have:

  1. packages/core_kernel/v0.14.0.mld:
     {0 Modules}
     {!child:Core_kernel}

     {0 Sub packages}
     {!child:moption}
     {!child:uuid}
     ...
  2. packages/core_kernel/v0.14.0/moption.mld:
     {0 Modules}
     {!child:Moption}
     ...

Local docs

For building local documentation, we don't normally have more than one version of a package installed at once. Therefore the structure might be simplified to omit the version layer specified above and have packages available under a simpler tree:

/packages/astring/Astring/index.html

which would be achieved by having the child module Astring parented under the package mld file rather than the version mld file.

Reference resolution

The current recommendation is that all sub-packages of all direct and indirect dependencies be in scope, by adding include paths to the link command. This is partly because, unlike in the compile phase, odoc does not currently attempt to look them up, which is in turn partly because we cannot specify them as accurately as we can for compilation, where we have a digest. So, for example, if a library links against core_kernel.moption, a reference in its comments like {!module-Rope} would resolve to the Rope module in core_kernel.rope, and a reference like {!module-Md5_lib} would resolve to the module Md5_lib in package base.md5, as base is a dependency of core_kernel. This comes with the danger of ambiguity, as it may require odoc to link in sub-packages that could not be linked together as an OCaml program: for example, there might be module name clashes, which we currently have no way to disambiguate.

This parent/child relationship gives us a way to be more precise about this. There are options here:

  1. Keep the current strategy of having more in scope, but warn if there are ambiguities.
  2. Restrict the current strategy somewhat by only having in scope elements in the direct and indirect dependencies (i.e., don't expand it to all sub-packages in dependent packages)
  3. Restrict the current strategy further by only having in scope elements in the direct dependencies
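The difference between these options can be sketched in Python. The dependency graph, library names and module tables below are invented for illustration; the functions are not part of odoc. `direct_scope` corresponds to option 3, `transitive_scope` to option 2, and `ambiguities` is the kind of check that option 1 would need in order to warn.

```python
from typing import Dict, List, Set

# Hypothetical libraries: "app" depends on "liba", which depends on two
# libraries that both define a module named Util.
DEPS: Dict[str, List[str]] = {
    "app": ["liba"],
    "liba": ["libb", "libc"],
    "libb": [],
    "libc": [],
}
MODULES: Dict[str, List[str]] = {
    "liba": ["Rope"],
    "libb": ["Util"],
    "libc": ["Util"],
}


def direct_scope(lib: str) -> Set[str]:
    """Option 3: only the direct dependencies are in scope."""
    return set(DEPS[lib])


def transitive_scope(lib: str) -> Set[str]:
    """Option 2: direct and indirect dependencies are in scope."""
    seen: Set[str] = set()
    stack = list(DEPS[lib])
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(DEPS[dep])
    return seen


def ambiguities(scope: Set[str]) -> Dict[str, List[str]]:
    """Option 1's warning: module names provided by more than one library in scope."""
    providers: Dict[str, List[str]] = {}
    for lib in sorted(scope):
        for module in MODULES.get(lib, []):
            providers.setdefault(module, []).append(lib)
    return {name: libs for name, libs in providers.items() if len(libs) > 1}


print(ambiguities(direct_scope("app")))
print(ambiguities(transitive_scope("app")))
```

In this toy graph the narrower scope has no clash, while the wider one reports Util as provided by both libb and libc.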

Furthermore, we can offset these restrictions in the scope of linking by allowing references to modules as children of mld pages. For example, in order to refer to Md5_lib if it's not a direct (or indirect) dependency, we might use the reference {!packages.base.md5.Md5_lib} - this allows us to link to an 'impossible' set of packages that would otherwise have module-name clashes, as long as the driver can tell odoc how to find the right odoc files.
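A minimal sketch of resolving such a reference, assuming the page tree is available as nested dictionaries and that no component contains an escaped dot. The `TREE` contents and the `resolve` function are hypothetical and only illustrate walking from a top-level page down to a module.

```python
from typing import Dict, List, Optional

# Hypothetical page tree: each key is a page or module, each value its children.
TREE: Dict[str, dict] = {
    "packages": {
        "base": {
            "md5": {"Md5_lib": {}},
        },
    },
}


def resolve(reference: str, tree: Dict[str, dict] = TREE) -> Optional[List[str]]:
    """Walk the tree one dotted component at a time; None if any step is missing."""
    node = tree
    path: List[str] = []
    for component in reference.split("."):
        if component not in node:
            return None
        path.append(component)
        node = node[component]
    return path


path = resolve("packages.base.md5.Md5_lib")
if path is not None:
    print("/" + "/".join(path) + "/index.html")
```

Because the walk starts from a named page rather than from the set of libraries in scope, two modules with the same name under different pages never compete with each other.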

This brings in a difficulty, related to the structure outlined above. The reference {!packages.base.md5.Md5_lib} would fit in nicely with documentation created locally, but for the docs.ocaml.org site with a version number in between, it doesn't work. What it does do though is highlight that we've not specified which version should be used - latest? something specific? We again have options:

  1. Require a version number - at least this way we can be sure it'll always work (though not if the thing we're linking to comes from a different package...!)
  2. Have a way to 'alias' mld files - so the version-free reference would work. For example, we might have {!packs.base.md5} where the file packs/base.mld contains:
{alias:md5:package.base.v0\.14\.0.md5}
  3. Put the aliases to the latest version in the package file. For example, packages/base.mld might contain:
{0 Package 'base'}
{1 Versions}
{!child:v0\.14\.0}
...
{alias:md5:v0\.14\.0.md5}
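The third option can be modelled as follows. This Python sketch assumes one possible reading of the alias tag, namely that an alias declared in a page expands to a path relative to that page; the `PAGES` table, its layout and the `resolve` function are invented for this example, and the guard against runaway alias expansion is likewise an assumption.

```python
from typing import Any, Dict, List, Optional

# Hypothetical compiled pages keyed by path. The alias in packages/base models
# {alias:md5:v0\.14\.0.md5}, read as a path relative to the declaring page.
PAGES: Dict[str, Dict[str, Any]] = {
    "packages": {"children": {"base"}, "aliases": {}},
    "packages/base": {"children": {"v0.14.0"},
                      "aliases": {"md5": ["v0.14.0", "md5"]}},
    "packages/base/v0.14.0": {"children": {"md5"}, "aliases": {}},
    "packages/base/v0.14.0/md5": {"children": {"Md5_lib"}, "aliases": {}},
}


def resolve(reference: List[str], max_expansions: int = 16) -> Optional[str]:
    """Resolve reference components to a path, expanding aliases on the way."""
    if not reference or reference[0] not in PAGES:
        return None
    path = [reference[0]]
    pending = list(reference[1:])
    expansions = 0
    while pending:
        name = pending.pop(0)
        page = PAGES.get("/".join(path))
        if page is None:
            return None
        if name in page["aliases"]:
            # Replace the alias by its target and keep resolving from this page.
            expansions += 1
            if expansions > max_expansions:
                return None  # give up on cyclic or very deep aliases
            pending = list(page["aliases"][name]) + pending
        elif name in page["children"]:
            path.append(name)
        else:
            return None
    return "/".join(path)


print(resolve(["packages", "base", "md5", "Md5_lib"]))
print(resolve(["packages", "base", "v0.14.0", "md5", "Md5_lib"]))
```

Both the version-free and the fully versioned reference end at the same path, which is the property the alias is meant to provide.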

link-deps

This clearly has an impact on how odoc link-deps works.

Currently link-deps only looks at the roots required when resolving Paths and Fragments, not References, and so drivers must augment this in the way described above. If instead we use this new reference resolution strategy, we can be more specific. Unfortunately this isn't totally straightforward. With a reference like {!packages.base.md5.Md5_lib} we probably don't want to say we depend upon page-packages.odoc and everything that depends upon it, as this will be the entire universe of packages. However, without actually resolving the references it's not possible to distinguish simply between pages, labels and so on, without requiring people to write {!page-packages.page-base.page-md5.module-Md5_lib}.
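The distinction between the two spellings can be seen by splitting a reference into components. The following Python sketch is an assumption-laden toy, not odoc's reference parser: it only knows the `page-` and `module-` prefixes that appear in this text, and it treats a backslash-escaped dot as part of a name.

```python
import re
from typing import List, Optional, Tuple

# Only the kinds mentioned in the text above; an assumption of this sketch.
KINDS = ("page", "module")


def parse_reference(reference: str) -> List[Tuple[Optional[str], str]]:
    """Split a reference into (kind, name) pairs; kind is None when unstated."""
    result: List[Tuple[Optional[str], str]] = []
    # Split on dots that are not preceded by a backslash.
    for part in re.split(r"(?<!\\)\.", reference):
        kind: Optional[str] = None
        for candidate in KINDS:
            prefix = candidate + "-"
            if part.startswith(prefix):
                kind, part = candidate, part[len(prefix):]
                break
        result.append((kind, part.replace("\\.", ".")))
    return result


def needs_resolution(reference: str) -> bool:
    """True if some component's kind can only be found by resolving it."""
    return any(kind is None for kind, _ in parse_reference(reference))


print(parse_reference("packages.base.md5.Md5_lib"))
print(parse_reference("page-packages.page-base.page-md5.module-Md5_lib"))
```

Only the second, fully annotated form tells link-deps which components are pages without resolving anything, which is the trade-off described above.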

Perhaps instead what we should do is punt this problem to the driver (e.g. dune), and have a way to specify externally which additional packages the docs depend upon?

Alternatively we could again specify them inline (which could also bring them into scope? -- though we still need the full paths to resolve if a single mld file wants to reference multiple clashing module names).

Document tree

It is often convenient to split a document into multiple pages, and to have a way to quickly and easily navigate between these pages. The Phoenix docs are one example: there, the sidebar serves both to navigate between the different pages and to navigate within a page, all in one tree structure.

Related solutions include MDN, which has an "On this page" list similar to our current "Contents" tree and an additional "Related Topics" list below it containing links to the other docs within the document tree; it also has breadcrumbs. Another is Stripe, which has a similar 'On this page' list in the right-hand sidebar and a tree-like structure on the left for navigating between the pages.

The idea is to be able to build a site a bit more like the Ocsigen sites, e.g. Tyxml or Lwt. These both have a sidebar on the left to navigate between the different modules, and this is arranged in a sensible order.

We can once again use this parent-child relationship model to address this, by constructing a page-tree of the different pages, either containing the current page's contents within it (like hexdocs) or 2 separate trees (MDN or Stripe style).

For something like the docs.ocaml.org site it doesn't make sense to maintain this tree all the way down to the alphabetized list of packages, so we could mark one page as the 'pagetree root', and only construct this tree if we find an ancestor odoc file that has been marked that way.
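One way to picture the 'pagetree root' marker is as a flag that is searched for while walking up the parent links. In this Python sketch the `Node` class and its `pagetree_root` field are hypothetical, and a marked page is treated as its own root, which the text above leaves open.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical page or module with a link to its parent."""
    name: str
    parent: Optional["Node"] = None
    pagetree_root: bool = False  # hypothetical marker set on one page


def find_pagetree_root(node: Node) -> Optional[Node]:
    """Return the nearest marked page at or above `node`, or None if there is none."""
    current: Optional[Node] = node
    while current is not None:
        if current.pagetree_root:
            return current
        current = current.parent
    return None


# packages -> astring (marked as the root) -> tutorial
packages = Node("packages")
astring = Node("astring", parent=packages, pagetree_root=True)
tutorial = Node("tutorial", parent=astring)

for page in (packages, astring, tutorial):
    found = find_pagetree_root(page)
    print(page.name, "->", found.name if found else "no page tree")
```

Pages above the marked one, such as the alphabetized package list, find no root and therefore get no page tree.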