blog post: https://github.com/metanorma/metanorma.org/issues/821 #823

Open
wants to merge 4 commits into
base: main
117 changes: 117 additions & 0 deletions _posts/2024-09-30-metanorma-error-log.adoc
@@ -0,0 +1,117 @@
---
layout: post
title: "Metanorma error logs"
date: 2024-09-30
categories: documentation

authors:
- name: Nick Nicholas
email: [email protected]
social_links:
- https://github.com/opoudjis

excerpt: >-
Metanorma error logs are captured to file, and can be used to find problems with standards documents.
---

Users of Metanorma will be familiar with the flurry of complaints that the overhelpfully pedantic
Metanorma floods the console window with, whenever it compiles a document. Observant users of Metanorma
may even have noticed that there are fewer types of error flooding the console window than there used to be:
Metanorma grammar errors are no longer being shown on screen.

Metanorma is indeed on the pedantic side when it comes to error reporting, and it is obsessive about
identifying discrepancies with SDOs' house style, which SDOs themselves are not always as concerned about.
The error log can be useful for document authors, and in order to make it useful, Metanorma has been saving
a copy of the error log to disk, including those Metanorma grammar errors. We describe here how to navigate
the error log for a document. You will also find this functionality documented on
link:/author/topics/output/validation/[Validation].

Metanorma logs errors as belonging to four levels of severity, from 0 to 3. Severity 0 errors are fatal errors,
which crash Metanorma: Metanorma determines that it is not possible to compile a document until that error is removed.
Severity 3 errors are minor, information-only warnings. You must address Severity 0 errors,
to get documents to compile at all, and you should also address Severity 1 errors.

TIP: Issues that cause
Severity 0 errors are outlined in the documentation; e.g. duplicate anchors, non-existent images, or
malformed index markup.

Metanorma also classifies errors into categories, and displays errors of the same category together;
that lets you go through the same class of error as a group. Metanorma defines the following classes of error,
which are documented in the link above:

* Style: these are potential violations of the house style of the SDO, and should be at least looked at.
* Metanorma XML Syntax: these issues can usually be ignored, as the XML syntax check is quite strict,
and they are demoted to Severity 2.
* Relaton: these are issues with the reference requested from the Relaton bibliographic database (e.g. the requested
document identifier does not exist).
* Issues with Metanorma Asciidoc markup:
** Anchors
** AsciiDoc Input
** Bibliography: syntax
** Crossreferences
** Document Attributes
** Images
** Include
** Maths
** Requirements
** Table
** Terms

If you want to exclude errors from the error log from a given severity up, use the document attribute
`:log-filter-severity:`. So `:log-filter-severity: 2` instructs Metanorma to leave out from the log errors
of Severity 2 or 3.

If you want to exclude errors from the error log for certain error classes, use the document attribute
`:log-filter-category:`, with a comma-delimited list of classes. So
`:log-filter-category: Crossreferences,Document Attributes,Metanorma XML Syntax` instructs Metanorma
to omit from the log errors of the classes Crossreferences, Document Attributes, and Metanorma XML Syntax.
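
The combined effect of the two filter attributes can be sketched in Python as follows. This is only an illustration of the filtering rule described above, not Metanorma's own implementation: `LogEntry`, `parse_categories` and `filter_log` are hypothetical names, and the sample entries are made up.

```python
from dataclasses import dataclass


@dataclass
class LogEntry:
    """Hypothetical stand-in for one row of the error log."""
    category: str
    severity: int
    message: str


def parse_categories(value: str) -> set[str]:
    """Split a comma-delimited :log-filter-category: value into class names."""
    return {name.strip() for name in value.split(",") if name.strip()}


def filter_log(entries, filter_severity=None, filter_category=""):
    """Drop entries at or above filter_severity, and entries in filtered classes."""
    excluded = parse_categories(filter_category)
    return [
        entry for entry in entries
        if (filter_severity is None or entry.severity < filter_severity)
        and entry.category not in excluded
    ]


# Made-up sample entries, one per severity level.
entries = [
    LogEntry("Anchors", 0, "made-up fatal error"),
    LogEntry("Crossreferences", 1, "made-up cross-reference error"),
    LogEntry("Style", 2, "made-up style warning"),
    LogEntry("Images", 3, "made-up information-only warning"),
]

kept = filter_log(
    entries,
    filter_severity=2,
    filter_category="Crossreferences,Document Attributes",
)
print([entry.category for entry in kept])
```

With a severity filter of 2, the Severity 2 and 3 entries are dropped, and the category filter removes the Crossreferences entry regardless of its severity.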


The error log for a document `{document-name}.adoc` is generated with the filename `{document-name}.err.html`,
and you can open it in any browser. When you do, the first information you will find is a list of
error classes, with the count of errors in each class, by severity:

____
*Style*: Severity 2: **9** errors

*Metanorma XML Syntax*: Severity 2: **7** errors
____

Each error class is hyperlinked to the listing of errors for that class, so that you can quickly navigate to that
part of the error log.
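
The summary at the top of the log amounts to counting errors per class and severity. A minimal Python sketch of that tally, assuming errors are available as `(class, severity)` pairs (a hypothetical representation chosen for illustration, not Metanorma's internal one):

```python
from collections import Counter


def summarise(errors):
    """Count errors for each (class, severity) pair."""
    return Counter(errors)


def render_summary(counts):
    """Render one summary line per (class, severity) pair, sorted by class name."""
    return [
        f"{category}: Severity {severity}: {count} errors"
        for (category, severity), count in sorted(counts.items())
    ]


# Made-up input matching the counts in the sample summary above.
sample = [("Style", 2)] * 9 + [("Metanorma XML Syntax", 2)] * 7

for line in render_summary(summarise(sample)):
    print(line)
```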

The listing of errors under each class is styled by Severity:

* Severity 0 is boldface against a pink background
* Severity 1 is boldface
* Severity 2 is in normal type
* Severity 3 is italicised and grey

Each error is presented in a table, with enough information for users to make sense of the error reported:

____
|===
| Line | ID | Message | Context | Severity

| 000064 | *_a172c0b4-4751-941d-b6c5-344768eb7b1b* | Formula +_a172c0b4-4751-941d-b6c5-344768eb7b1b+ has not been cross-referenced within document a|
....
<formula id="_a172c0b4-4751-941d-b6c5-344768eb7b1b">
<asciimath>dot Theta ne ddot Theta</asciimath>
</formula>
....
| 2
|===
____

Line:: The line number in the AsciiDoc source where the issue has occurred. This information is only rarely
recoverable from error reporting.

ID:: The identifier of the location in the AsciiDoc source where the issue has occurred. Depending on when the issue
is identified, this could be the clause number, the GUID or anchor of the paragraph or block, or the line number
of the Metanorma XML file. If the error is aligned with an anchor, a hyperlink is given to the corresponding
anchor in the generated HTML file, so that users can see where the error has happened in the generated output.

Message:: The error message describing the error.

Context:: Where applicable, two lines of Metanorma XML surrounding the error location; this helps narrow down
the location of the error, if the Line or ID are not enough to do so.
58 changes: 58 additions & 0 deletions _posts/2024-10-05-max-data-uri-size.adoc
@@ -0,0 +1,58 @@
---
layout: post
title: "Maximum Data URI size"
date: 2024-10-05
categories: documentation

authors:
- name: Nick Nicholas
email: [email protected]
social_links:
- https://github.com/opoudjis

excerpt: >-
Metanorma images are by default encoded within the generated XML file as Data URIs. In order to prevent processing
problems, they are also by default constrained to 10 MB in size.
---

Images, audio files, and video files are by default encoded in Metanorma as https://en.wikipedia.org/wiki/Data_URI_scheme[inline Data URIs]:
rather than referencing an external file for the image, the documents generated by Metanorma (including the XML file
that it takes as its starting point) represent the image inside of the file, as a (very long) URI.
The same is done (though as an XML element rather than a URI) with the potentially even longer representation
of file attachments, which Alex Dyuzhev recently wrote about in link:/_posts/2024-08-20-pdf-attachments/[PDF Attachments].
(Attachments are just as valid for HTML as for PDF output.)
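
For readers unfamiliar with the scheme, here is a minimal Python sketch of how a file's bytes become a base64 Data URI. It illustrates the general Data URI format only, not Metanorma's own code; `to_data_uri` is a hypothetical helper name.

```python
import base64


def to_data_uri(payload: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 Data URI with the given MIME type."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# The first six bytes of a GIF file, used here as a tiny stand-in payload.
sample = to_data_uri(b"GIF89a", "image/gif")
print(sample)  # data:image/gif;base64,R0lGODlh
```

Six bytes turn into eight base64 characters, which is where the one-third size overhead of Data URIs comes from.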

There is an advantage to this internal representation of files,
for distributing Metanorma documents: if you generate an HTML document, you can
send it somewhere else as a single file, without needing to take care of the separate media files or file attachments it invokes.
After all, you already do so for Word documents and for PDFs.

There is a disadvantage to doing this, if the media file becomes so big that software starts having trouble
with processing those URIs. Browsers think nothing of a URI 100 KB or 1 MB large; but by the time the URI
needs to represent a video file 100 MB or 1 GB in size, as we have found, bad things start happening.

To prevent bad things from happening, we have put the following safeguards in place:

* First of all, the default to represent media files as Data URIs can be turned off, by setting the document attribute
`:data-uri-image: false`. If you do so, then the media files in your document are referenced, in the Metanorma XML files and the HTML output,
as links to those external files, rather than bundling them inside the file. In that case, it is the Word and PDF
outputs that need to convert the media files into internally bundled representations. And you will need to take care
to include those media files when you upload the generated HTML file anywhere.

* You can do the same with file attachments, through `:data-uri-attachments: false`. In that case, again, any file attachments
will be referenced as links, rather than bundling them inside the file, and you will need to handle them the same way you handle
media files. The catch is that, unlike with media files, HTML cannot make sense of Data URI encoding for an arbitrary attachment,
so you will have to distribute the HTML file with its attachments as separate files anyway: `:data-uri-attachments: false`
only shortens the XML files, it does not make the HTML any different. (In the case of HTML rendering, any attachments
bundled with the file are exported to a folder called `_{document-name}_attachments`.)

* In order to prevent users inadvertently generating Data URIs too big for a browser to handle, we set the maximum allowed
Data URI size by default to 14 MB (corresponding to a 10 MB media file). If the Data URI needed to represent a media file is
bigger than that, we now abort execution, with a warning that you need to change file configuration, to make sure you know what
you are doing. You can deal with this warning in one of three ways:
** Set `:data-uri-attachments: false`
** Set `:data-uri-maxsize:` to a byte size big enough to capture your file. (Remember that Data URI encodings are one third larger
than the binary files they encode.) So if you have a 1 GB media file, you will need to set `:data-uri-maxsize: 1400000000`,
to prevent aborting.
** Set `:data-uri-maxsize: 0`, if you want to throw caution to the winds, and have no maximum Data URI size for your document.
In which case, we admire your courage...
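
The size arithmetic behind those figures can be checked with a short Python sketch. It assumes that the limit is compared against the full length of the base64 Data URI and that "14 MB" means 14,000,000 bytes; both are assumptions made for illustration, and `data_uri_length` and `exceeds_limit` are hypothetical names rather than Metanorma functions.

```python
def data_uri_length(file_size: int, mime_type: str) -> int:
    """Length in characters of a base64 Data URI for a file of file_size bytes."""
    # Base64 turns every 3 bytes into 4 characters, padding the last group.
    base64_chars = 4 * ((file_size + 2) // 3)
    return len(f"data:{mime_type};base64,") + base64_chars


def exceeds_limit(file_size: int, mime_type: str, max_size: int) -> bool:
    """True if the Data URI is longer than max_size; a max_size of 0 means no limit."""
    if max_size == 0:
        return False
    return data_uri_length(file_size, mime_type) > max_size


ASSUMED_DEFAULT_MAX = 14_000_000  # assumption: "14 MB" read as 14,000,000 bytes

# A 10 MB file fits under the assumed default; a 1 GB file does not.
print(exceeds_limit(10_000_000, "video/mp4", ASSUMED_DEFAULT_MAX))
print(exceeds_limit(1_000_000_000, "video/mp4", ASSUMED_DEFAULT_MAX))
# Raising the limit to 1400000000 is enough for the 1 GB file.
print(exceeds_limit(1_000_000_000, "video/mp4", 1_400_000_000))
```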
Binary file added assets/blog/authors/[email protected]
21 changes: 14 additions & 7 deletions author/iso/ref/document-attributes.adoc
Expand Up @@ -56,10 +56,11 @@ https://www.iso.org/deliverables-all.html[ISO deliverables: The different types
`technical-corrigendum`::: Technical Corrigendum (Cor) [added in https://github.com/metanorma/isodoc/releases/tag/v1.3.25]
`amendment`::: Amendment (Amd) [added in https://github.com/metanorma/isodoc/releases/tag/v1.3.25]
`committee-document`::: Committee Document (for internal documents) [added in https://github.com/metanorma/metanorma-iso/releases/tag/v2.7.2]
`addendum`::: Addendum (Add) [added in https://github.com/metanorma/metanorma-iso/releases/tag/v2.8.8]

`:updates-document-type:`::
(only when `doctype` is set to `amendment` or `technical-corrigendum`)
The document type that this amendment or technical corrigendum is
(only when `doctype` is set to `amendment`, `technical-corrigendum`, or `addendum`)
The document type that this amendment, technical corrigendum or addendum is
updating [added in https://github.com/metanorma/isodoc/releases/tag/v1.3.25].

`:docsubtype:`:: A subclass of doctype for which special processing rules apply.
Expand Down Expand Up @@ -204,7 +205,7 @@ The ISO document identifier is assembled out of these metadata elements:
publisher:: publisher of the document
document stage:: stage of development of document, according to the Harmonized Stage Codes
document number:: numeric identifier of document
update number:: serial number of update (for amendments and technical corrigenda)
update number:: serial number of update (for amendments, addenda, and technical corrigenda)
document type:: type of ISO deliverable
copyright year:: year of publication of document
language:: language of document
Expand Down Expand Up @@ -364,15 +365,15 @@ The patterns are as follows:
* `ISO/IEC TR 33333-2:2030(E)`
====

*Amendments*, *Technical Corrigendum*::
*Amendment*, *Technical Corrigendum*, *Addendum*::
`{source document ID}/{document type and stage} {update number} (: {copyright year}) ({ISO 639 language code})?` +
+
[example]
.Examples of ISO Amendment and Technical Corrigendum document identifiers
.Examples of ISO Amendment, Technical Corrigendum, and Addendum document identifiers
====
* `ISO 33333-2:2030/DIS Cor 2:2031`
* `ISO 33333-2:2030/DIS Amd 2:2031`
* `ISO 33333-2:2030/Cor 2:2032`
* `ISO/IEC 33333-2:2030/Cor 2:2032`
* `ISO/IEC 33333-2:2030/Add 2:2032`
====


Expand Down Expand Up @@ -438,9 +439,15 @@ of the document (mandatory).
`:title-amendment-{en,fr}:`:: (only when `doctype` is set to `amendment` or `technical-corrigendum`)
The English or French title of the amendment [added in https://github.com/metanorma/isodoc/releases/tag/v1.3.25]

`:title-addendum-{en,fr}:`:: (only when `doctype` is set to `addendum`)
The English or French title of the addendum [added in https://github.com/metanorma/metanorma-iso/releases/tag/v2.8.8]

`:amendment-number:`:: (only when `doctype` is set to `amendment`)
The number of the amendment [added in https://github.com/metanorma/isodoc/releases/tag/v1.3.25]

`:addendum-number:`:: (only when `doctype` is set to `addendum`)
The number of the addendum [added in https://github.com/metanorma/metanorma-iso/releases/tag/v2.8.8]

`:corrigendum-number:`:: (only when `doctype` is set to `technical-corrigendum`)
The number of the technical corrigendum [added in https://github.com/metanorma/isodoc/releases/tag/v1.3.25]

19 changes: 11 additions & 8 deletions author/iso/topics/markup.adoc
Expand Up @@ -822,16 +822,17 @@ The entries in the Bibliography are reordered (and, for numerical references, re
The bracketed reference numbers are expected to be correct and in order (accounting for the fact that references to standards will end up numbered): they are not overridden in rendering.


== Amendments and technical corrigenda
== Amendments, addenda, and technical corrigenda

=== General

Amendments and technical corrigenda [added in https://github.com/metanorma/isodoc/releases/tag/v1.3.25]
and addenda [added in https://github.com/metanorma/metanorma-iso/releases/tag/v2.8.8]
have the following particularities in their markup.

=== Dates

Amendments and technical corrigenda bear two dates in their identifiers: the date of the source document,
Amendments, technical corrigenda and addenda bear two dates in their identifiers: the date of the source document,
and the date of the update. The latter date is given as the `:copyright-year:` attribute (and may be given
in more detail as the `:updated-date:` attribute). The former date is given as the `:created-date:` attribute;
if it is missing, the `:copyright-year:` is used instead.
Expand All @@ -845,7 +846,8 @@ If this is a corrigendum to an addendum, the source identifier must be that of t

=== Clauses

There are no special clauses: clauses describe the location at which changes are applied. So
In amendments and technical corrigenda (but not addenda),
there are no special clauses: clauses describe the location at which changes are applied. So
`== Terms and definitions` does not introduce a Terms section: it describes the changes to be applied
to the Terms section of the existing document. For the same reason, there are no annexes or
distinct bibliographies.
Expand All @@ -864,7 +866,8 @@ link:/author/topics/inline_markup/changes/[Machine-readable changes].

=== Form

The document takes the form of clauses describing what is to be amended; the amendments themselves are quoted.
The document in amendments and technical corrigenda takes the form of clauses describing what is to be amended;
the amendments themselves are quoted.

Because the quoted material consists of snippets with little context, auto-numbering will not yield sensible results,
and neither will cross-referencing autonumbered blocks or clauses. For that reason, amendments and technical corrigenda
Expand All @@ -889,11 +892,11 @@ preceding digits, counting from the decimal sign. This also applies to digits
following the decimal sign.

[example]
.Rendering numbers using `stem` encoding in an ISO or IEC document
The encoding of `stem&#x200c;:[60007.12345]` in an ISO or IEC document in any language
.Rendering numbers using `number` encoding in an ISO or IEC document
The encoding of `number&#x200c;:60007.12345[]` in an ISO or IEC document in any language
is displayed as _60 007,123 45_.

[example]
.Rendering numbers using `stem` encoding in an ISO or IEC document without treatment on the hanging digit
The encoding of `stem&#x200c;:[2345.6789]` in an ISO or IEC document in any language
.Rendering numbers using `number` encoding in an ISO or IEC document without treatment on the hanging digit
The encoding of `number&#x200c;:2345.6789[]` in an ISO or IEC document in any language
is displayed as _2 345,678 9_.
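
The grouping rule illustrated by these two examples can be sketched in Python. This is a reading of the rule as stated here (groups of three digits counted outwards from the decimal sign, comma as decimal sign), not Metanorma's implementation; `format_iso_number` is a hypothetical name, a plain space stands in for whatever spacing character the rendering actually uses, and signs and exponents are not handled.

```python
def format_iso_number(text: str, group_sep: str = " ", decimal_sign: str = ",") -> str:
    """Group digits in threes on both sides of the decimal sign."""
    integer, _, fraction = text.partition(".")

    # Integer part: groups of three counted leftwards from the decimal sign.
    reversed_digits = integer[::-1]
    integer_groups = [
        reversed_digits[i:i + 3][::-1] for i in range(0, len(reversed_digits), 3)
    ][::-1]
    result = group_sep.join(integer_groups)

    # Fractional part: groups of three counted rightwards from the decimal sign.
    if fraction:
        fraction_groups = [fraction[i:i + 3] for i in range(0, len(fraction), 3)]
        result += decimal_sign + group_sep.join(fraction_groups)
    return result


print(format_iso_number("60007.12345"))  # 60 007,123 45
print(format_iso_number("2345.6789"))    # 2 345,678 9
```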
10 changes: 10 additions & 0 deletions author/topics/collections/configuration.adoc
Expand Up @@ -171,6 +171,11 @@ coverpage and boilerplate) [added in https://github.com/metanorma/metanorma/rele
This is automatically included when `sectionsplit` is set in the
Metanorma file, to break a single document up into multiple HTML files.

`recompile-xml`:::
If a file is specified as being `*.adoc`, force it to be recompiled to XML, whether or not
a corresponding XML file is already there. Used to refresh XML files; normally, to save time, AsciiDoc is
not recompiled to XML [added in https://github.com/metanorma/metanorma/releases/tag/v2.0.6].

`format`::
Specifies the possible output formats for the collection as a list,
as an alternative to the `-f` argument of `metanorma collection`.
Expand Down Expand Up @@ -226,6 +231,11 @@ YAML files, or AsciiDoc documents.

* If a document is an AsciiDoc document, it is compiled to a Metanorma
Semantic XML document in preprocessing [added in https://github.com/metanorma/metanorma/releases/tag/v2.0.0].
The document is *not* recompiled to a Metanorma Semantic XML document if a corresponding XML
document has already been generated in the same directory [added in https://github.com/metanorma/metanorma/releases/tag/v2.0.6];
this is because compiling documents is so time consuming. The document *is* recompiled to a Metanorma Semantic XML document,
even if a corresponding XML document has already been generated, provided the directive `recompile-xml`
is present.

* If a document is a YAML file, it is assumed to be a collection manifest itself,
and its manifest is recursively read into the current manifest at that point of the
7 changes: 6 additions & 1 deletion author/topics/output/validation.adoc
Expand Up @@ -43,11 +43,16 @@ lists the ISO-specific content style rules that Metanorma warns about when compi
derived from ISO/IEC DIR 2 and from the ISO House Style specification. These issues will not prevent compilation,
and they are not always correct, but they do catch the kinds of issues that SDO editorial review is supposed to identify.

== Relaton

Relaton:: These are issues with requests to the Relaton bibliographic database; typically, they are
reports that the requested document identifier in the bibliography does not exist.

== Markup issues

Anchors:: Issue with identifiers of document elements, or resources (including URIs)
AsciiDoc Input:: Issue with AsciiDoc markup, likely to prevent parsing of document
Bibliography:: Issue with bibliographic markup
Bibliography:: Issue with bibliographic markup (as opposed to bibliography fetches)
Crossreferences:: Issue with cross-reference to document elements
Document Attributes:: Issue with content of AsciiDoc document attributes
Images:: Issue with images