diff --git a/_posts/2024-09-30-metanorma-error-log.adoc b/_posts/2024-09-30-metanorma-error-log.adoc new file mode 100644 index 00000000..1e759037 --- /dev/null +++ b/_posts/2024-09-30-metanorma-error-log.adoc @@ -0,0 +1,117 @@ +--- +layout: post +title: "Metanorma error logs" +date: 2024-09-30 +categories: documentation + +authors: + - name: Nick Nicholas + email: nick.nicholas@ribose.com + social_links: + - https://github.com/opoudjis + +excerpt: >- + Metanorma error logs are captured to file, and can be used to find problems with standards documents. +--- + +Users of Metanorma will be familiar with the flurry of complaints that the overhelpfully pedantic +Metanorma floods the console window with, whenever it compiles a document. Observant users of Metanorma +may even have noticed that there are fewer types of error flooding the console window than there used to be: +Metanorma grammar errors are no longer being shown on screen. + +Metanorma is indeed on the pedantic side when it comes to error reporting, and it is obsessive about +identifying discrepancies with SDOs' house style, that SDOs themselves are not always as concerned about. +The error log can be useful for document authors, and in order to make it useful, Metanorma has been saving +a copy of the error log to disk. Including those Metanorma grammar errors. We describe here how to navigate +the error log for a document. You will also find this functionality documented on +link:/author/topics/output/validation/[Validation]. + +Metanorma logs errors as belonging to four levels of severity, from 0 to 3. Severity 0 errors are fatal errors, +which crash Metanorma: Metanorma determines that it is not possible to compile a document until that error is removed. +Severity 3 errors are minor, information-only warnings. You must address Severity 0 errors, +to get documents to compile at all, and you should also address Severity 1 errors. 
+ +TIP: Issues that cause +Severity 0 errors are outlined in the documentation; e.g. duplicate anchors, non-existent images, or +malformed index markup. + +Metanorma also classifies errors into categories, and displays errors of the same category together; +that lets you go through the same class of error as a group. Metanorma defines the following classes of error, +which are documented in the link above: + +* Style: these are potential violations of the house style of the SDO, and should be at least looked at. +* Metanorma XML Syntax: these issues can usually be ignored, as the XML syntax check is quite strict, +and they are demoted to Severity 2 +* Relaton: Issues with the reference requested from the Relaton bibliographic database (e.g. the requested +document identifier does not exist) +* Issues with Metanorma Asciidoc markup: +** Anchors +** AsciiDoc Input +** Bibliography: syntax +** Crossreferences +** Document Attributes +** Images +** Include +** Maths +** Requirements +** Table +** Terms + +If you want to exclude errors from the error log from a given severity up, use the document attribute +`:log-filter-severity:`. So `:log-filter-severity: 2` instructs Metanorma to leave out from the log errors +of Severity 2 or 3. + +If you want to exclude errors from the error log for certain error classes, use the document attribute +`:log-filter-category:`, with a comma-delimited list of classes. So +`:log-filter-category: Crossreferences,Document Attributes,Metanorma XML Syntax` instructs Metanorma +to omit from the log errors of the classes Crossreferences, Document Attributes, and Metanorma XML Syntax. + + +The error log for a document `{document-name}.adoc` is generated with the filename `{document-name}.err.html`, +and you can open it in any browser. 
When you do, the first information you will find is a list of +error classes, with the count of errors in each class, by severity: + +____ +*Style*: Severity 2: **9** errors + +*Metanorma XML Syntax*: Severity 2: **7** errors +____ + +Each error class is hyperlinked to the listing of errors for that class, so that you can quickly navigate to that +part of the error log. + +The listing of errors under each class is styled by Severity: + +* Severity 0 is boldface against a pink background +* Severity 1 is boldface +* Severity 2 is in normal type +* Severity 3 is italicised and grey + +Each error is presented in a table, with enough information for users to make sense of the error reported: + +____ +|=== +| Line | ID | Message | Context | Severity + +| 000064 | *_a172c0b4-4751-941d-b6c5-344768eb7b1b* | Formula +_a172c0b4-4751-941d-b6c5-344768eb7b1b+ has not been cross-referenced within document a| +.... + + dot Theta ne ddot Theta + +.... +| 2 +|=== +____ + +Line:: The line number of the source Asciidoctor where the issue has occurred. This information is only rarely +recoverable from error reporting. + +ID:: The identifier of the location in the Asciidoctor where the issue has occurred. Depending on when the issue +is identified, this could be the clause number, the GUID or anchor of the paragraph or block, or the line number +of the Metanorma XML file. If the error is aligned with an anchor, a hyperlink is given to the corresponding +anchor in the generated HTML file, so that users can see where the error has happened in the generated output. + +Message:: The error message describing the error. + +Context:: Where applicable, two lines of Metanorma XML surrounding the error location; this helps narrow down +the location of the error, if the Line or ID are not enough to do so. 
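The effect of the two filtering attributes described in the post (`:log-filter-severity:` and `:log-filter-category:`) can be sketched as a small Python model. This is only an illustration of the semantics as the post states them, not Metanorma's implementation: the `LogEntry` class, the `filter_log` and `parse_category_list` functions, and the sample messages are all hypothetical names invented for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    """One row of an error log (hypothetical model, for illustration only)."""
    category: str
    severity: int  # 0 = fatal ... 3 = information only
    message: str


def parse_category_list(value):
    """Split a comma-delimited attribute value into category names."""
    return [name.strip() for name in value.split(",") if name.strip()]


def filter_log(entries, filter_severity=None, filter_categories=()):
    """Return the entries that would remain in the log.

    Entries whose severity is `filter_severity` or higher are left out,
    as are entries whose category is listed in `filter_categories`.
    """
    excluded = set(filter_categories)
    kept = []
    for entry in entries:
        if filter_severity is not None and entry.severity >= filter_severity:
            continue
        if entry.category in excluded:
            continue
        kept.append(entry)
    return kept


# Hypothetical sample data; the messages are placeholders, not real Metanorma output.
entries = [
    LogEntry("Style", 2, "placeholder style warning"),
    LogEntry("Anchors", 0, "placeholder fatal anchor error"),
    LogEntry("Metanorma XML Syntax", 2, "placeholder syntax warning"),
    LogEntry("Crossreferences", 3, "placeholder cross-reference note"),
]

# Models `:log-filter-severity: 2` (drops Severity 2 and 3).
by_severity = filter_log(entries, filter_severity=2)

# Models `:log-filter-category: Crossreferences,Document Attributes,Metanorma XML Syntax`.
by_category = filter_log(
    entries,
    filter_categories=parse_category_list(
        "Crossreferences,Document Attributes,Metanorma XML Syntax"
    ),
)
```

In this model the two filters are independent and can be combined; whether Metanorma combines them in exactly this way is an assumption of the sketch.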
diff --git a/_posts/2024-10-05-max-data-uri-size.adoc b/_posts/2024-10-05-max-data-uri-size.adoc new file mode 100644 index 00000000..51ef858f --- /dev/null +++ b/_posts/2024-10-05-max-data-uri-size.adoc @@ -0,0 +1,58 @@ +--- +layout: post +title: "Maximum Data URI size" +date: 2024-10-05 +categories: documentation + +authors: + - name: Nick Nicholas + email: nick.nicholas@ribose.com + social_links: + - https://github.com/opoudjis + +excerpt: >- + Metanorma images are by default encoded within the generated XML file as Data URIs. In order to prevent processing + problems, they are also by default constrained to 10 MB in size. +--- + +Images, audio files, and video files are by default encoded in Metanorma as https://en.wikipedia.org/wiki/Data_URI_scheme[inline Data URIs]: +rather than referencing an external file for the image, the documents generated by Metanorma (including the XML file +that it takes as its starting point) represent the image inside of the file, as a (very long) URI. +The same is done (though as an XML element rather than a URI) with the potentially even longer representation +of file attachments, which Alex Dyuzhev recently wrote about in link:/_posts/2024-08-20-pdf-attachments/[PDF Attachments]. +(Attachments are just as valid for HTML as for PDF output.) + +There is an advantage to this internal representation of files, +for distributing Metanorma documents: if you generate an HTML document, you can +send it somewhere else as a single file, without needing to take care of the separate media files or file attachments it invokes. +After all, you already do so for Word documents and for PDFs. + +There is a disadvantage to doing this, if the media file becomes so big that software starts having trouble +with processing those URIs. Browsers think nothing of a URI 100 KB or 1 MB large; but by the time the URI +needs to represent a video file 100 MB or 1 GB in size, as we have found, bad things start happening. 
+ +To prevent bad things from happening, we have put the following safeguards in place: + +* First of all, the default to represent media files as Data URIs can be turned off, by setting the document attribute +`:data-uri-image: false`. If you do so, then the media files in your document are referenced, in the Metanorma XML files and the HTML output, +as links to those external files, rather than bundling them inside the file. In that case, it is the Word and PDF +outputs that need to convert the media files into internally bundled representations. And you will need to take care +to include those media files when you upload the generated HTML file anywhere. + +* You can do the same with file attachments, through `:data-uri-attachments: false`. In that case, again, any file attachments +will be referenced as links, rather than bundling them inside the file, and you will need to handle them the same way you handle +attachments. The catch is that, unlike media files, HTML cannot make sense of Data URI encoding for an arbitrary attachment, +so you will have to distribute the HTML file with its attachments as separate files anyway: `:data-uri-attachments: false` +only shortens the XML files, it does not make the HTML any different. (In the case of HTML rendering, any attachments +bundled with the file are exported to a folder called `_{document-name}_attachments`.) + +* In order to prevent users inadvertently generating Data URIs too big for a browser to handle, we set the maximum allowed +Data URI size by default to 14 MB (corresponding to a 10 MB media file). If the Data URI needed to represent a media file is +bigger than that, we now abort execution, with a warning that you need to change file configuration, to make sure you know what +you are doing. You can deal with this warning in one of three ways: +** Set `:data-uri-attachments: false` +** Set `data-uri-maxsize` to a byte size big enough to capture your file. 
(Remember that Data URI encodings are one third larger +than the binary files they encode). So if you have a 1 GB media file, you will need to set `data-uri-maxsize: 1400000000`, +to prevent aborting. +** Set `data-uri-maxsize: 0`, if you want to throw caution to the winds, and have no maximum Data URI size for your document. +In which case, we admire your courage... diff --git a/assets/blog/authors/nick.nicholas@ribose.com.jpg b/assets/blog/authors/nick.nicholas@ribose.com.jpg new file mode 100644 index 00000000..7b804564 Binary files /dev/null and b/assets/blog/authors/nick.nicholas@ribose.com.jpg differ diff --git a/author/iso/ref/document-attributes.adoc b/author/iso/ref/document-attributes.adoc index f0da33ae..84dbc47d 100644 --- a/author/iso/ref/document-attributes.adoc +++ b/author/iso/ref/document-attributes.adoc @@ -56,10 +56,11 @@ https://www.iso.org/deliverables-all.html[ISO deliverables: The different types `technical-corrigendum`::: Technical Corrigendum (Cor) [added in https://github.com/metanorma/isodoc/releases/tag/v1.3.25] `amendment`::: Amendment (Amd) [added in https://github.com/metanorma/isodoc/releases/tag/v1.3.25] `committee-document`::: Committee Document (for internal documents) [added in https://github.com/metanorma/metanorma-iso/releases/tag/v2.7.2] +`addendum`::: Addendum (Add) [added in https://github.com/metanorma/metanorma-iso/releases/tag/v2.8.8] `:updates-document-type:`:: -(only when `doctype` is set to `amendment` or `technical-corrigendum`) -The document type that this amendment or technical corrigendum is +(only when `doctype` is set to `amendment`, `technical-corrigendum`, or `addendum`) +The document type that this amendment, technical corrigendum or addendum is updating [added in https://github.com/metanorma/isodoc/releases/tag/v1.3.25]. `:docsubtype:`:: A subclass of doctype for which special processing rules apply. 
@@ -204,7 +205,7 @@ The ISO document identifier is assembled out of these metadata elements: publisher:: publisher of the document document stage:: stage of development of document, according to the Harmonized Stage Codes document number:: numeric identifier of document -update number:: serial number of update (for amendments and technical corrigenda) +update number:: serial number of update (for amendments, addenda, and technical corrigenda) document type:: type of ISO deliverable copyright year:: year of publication of document language:: language of document @@ -364,15 +365,15 @@ The patterns are as follows: * `ISO/IEC TR 33333-2:2030(E)` ==== -*Amendments*, *Technical Corrigendum*:: +*Amendment*, *Technical Corrigendum*, *Addendum*:: `{source document ID}/{document type and stage} {update number} (: {copyright year}) ({ISO 639 language code})?` + + [example] -.Examples of ISO Amendment and Technical Corrigendum document identifiers +.Examples of ISO Amendment, Technical Corrigendum, and Addendum document identifiers ==== -* `ISO 33333-2:2030/DIS Cor 2:2031` +* `ISO 33333-2:2030/DIS Amd 2:2031` * `ISO 33333-2:2030/Cor 2:2032` -* `ISO/IEC 33333-2:2030/Cor 2:2032` +* `ISO/IEC 33333-2:2030/Add 2:2032` ==== @@ -438,9 +439,15 @@ of the document (mandatory). 
`:title-amendment-{en,fr}:`:: (only when `doctype` is set to `amendment` or `technical-corrigendum`) The English or French title of the amendment [added in https://github.com/metanorma/isodoc/releases/tag/v1.3.25] +`:title-addendum-{en,fr}:`:: (only when `doctype` is set to `addendum`) +The English or French title of the addendum [added in https://github.com/metanorma/metanorma-iso/releases/tag/v2.8.8] + `:amendment-number:`:: (only when `doctype` is set to `amendment`) The number of the amendment [added in https://github.com/metanorma/isodoc/releases/tag/v1.3.25] +`:addendum-number:`:: (only when `doctype` is set to `addendum`) +The number of the addendum [added in https://github.com/metanorma/metanorma-iso/releases/tag/v2.8.8] + `:corrigendum-number:`:: (only when `doctype` is set to `technical-corrigendum`) The number of the technical corrigendum [added in https://github.com/metanorma/isodoc/releases/tag/v1.3.25] diff --git a/author/iso/topics/markup.adoc b/author/iso/topics/markup.adoc index 5893640d..f4b93c24 100644 --- a/author/iso/topics/markup.adoc +++ b/author/iso/topics/markup.adoc @@ -822,16 +822,17 @@ The entries in the Bibliography are reordered (and, for numerical references, re The bracketed reference numbers are expected to be correct and in order (accounting for the fact that references to standards will end up numbered): they are not overridden in rendering. -== Amendments and technical corrigenda +== Amendments, addenda, and technical corrigenda === General Amendments and technical corrigenda [added in https://github.com/metanorma/isodoc/releases/tag/v1.3.25] +and addenda [added in https://github.com/metanorma/metanorma-iso/releases/tag/v2.8.8] have the following particularities in their markup. 
=== Dates -Amendments and technical corrigenda bear two dates in their identifiers: the date of the source document, +Amendments, technical corrigenda and addenda bear two dates in their identifiers: the date of the source document, and the date of the update. The latter date is given as the `:copyright-year:` attribute (and may be given in more detail as the `:updated-date:` attribute. The former date is given as the `:created-date:` attribute; if it is missing, the `:copyright-year:` is used instead. @@ -845,7 +846,8 @@ If this is a corrigendum to an addendum, the source identifier must be that of t === Clauses -There are no special clauses: clauses describe the location at which changes are applied. So +In amendments and technical corrigenda (but not addenda), +there are no special clauses: clauses describe the location at which changes are applied. So `== Terms and definitions` does not introduce a Terms section: it describes the changes to be applied to the Terms section of the existing document. For the same reason, there are no annexes or distinct bibliographies. @@ -864,7 +866,8 @@ link:/author/topics/inline_markup/changes/[Machine-readable changes]. === Form -The document takes the form of clauses describing what is to be amended; the amendments themselves are quoted. +The document in amendments and technical corrigenda takes the form of clauses describing what is to be amended; +the amendments themselves are quoted. Because the quoted material are snippets with little context, auto-numbering will not yield sensible results, and neither will cross-referencing autonumbered blocks or clauses. For that reason, amendments and technical corrigenda @@ -889,11 +892,11 @@ preceding digits, counting from the decimal sign. This also applies to digits following the decimal sign. 
[example] -.Rendering numbers using `stem` encoding in an ISO or IEC document -The encoding of `stem‌:[60007.12345]` in an ISO or IEC document in any language +.Rendering numbers using `number` encoding in an ISO or IEC document +The encoding of `number‌:60007.12345[]` in an ISO or IEC document in any language is displayed as _60 007,123 45_. [example] -.Rendering numbers using `stem` encoding in an ISO or IEC document without treatment on the hanging digit -The encoding of `stem‌:[2345.6789]` in an ISO or IEC document in any language +.Rendering numbers using `number` encoding in an ISO or IEC document without treatment on the hanging digit +The encoding of `number‌:2345.6789[]` in an ISO or IEC document in any language is displayed as _2 345,678 9_. diff --git a/author/topics/collections/configuration.adoc b/author/topics/collections/configuration.adoc index bb330b89..808e156a 100644 --- a/author/topics/collections/configuration.adoc +++ b/author/topics/collections/configuration.adoc @@ -171,6 +171,11 @@ coverpage and boilerplate) [added in https://github.com/metanorma/metanorma/rele This is automatically included when `sectionsplit` is set in the Metanorma file, to break a single document up into multiple HTML files. +`recompile-xml`::: +If a file is specified as being `*.adoc`, force it to be recompiled to XML, whether or not +a corresponding XML file is already there. Used to refresh XML files; normally Asciidoc is +not recompiled to XML to save time [added in https://github.com/metanorma/metanorma/releases/tag/v2.0.6]. + `format`:: Specifies the possible output formats for the collection as a list, as an alternative to the `-f` argument of `metanorma collection`. @@ -226,6 +231,11 @@ YAML files, or AsciiDoc documents. * If a document is a AsciiDoc documents, it is compiled to a Metanorma Semantic XML document in preprocessing [added in https://github.com/metanorma/metanorma/releases/tag/v2.0.0]. 
+The document is *not* recompiled to a Metanorma Semantic XML document if a corresponding XML +document has already been generated in the same directory [added in https://github.com/metanorma/metanorma/releases/tag/v2.0.6]; +this is because compiling documents is so time consuming. The document *is* recompiled to a Metanorma Semantic XML document, +even if a corresponding XML document has already been generated, provided the directive `recompile-xml` +is present. * If a document is a YAML file, it is assumed to be a collection manifest itself, and its manifest is recursively read into the current manifest at that point of the diff --git a/author/topics/output/validation.adoc b/author/topics/output/validation.adoc index 26bfcc1e..0a80dd64 100644 --- a/author/topics/output/validation.adoc +++ b/author/topics/output/validation.adoc @@ -43,11 +43,16 @@ lists the ISO-specific content style rules that Metanorma warns about when compi derived from ISO/IEC DIR 2 and from the ISO House Style specification. These issues will not prevent compilation, and they are not always correct, but they do catch the kinds of issues that SDO editorial review is supposed to identify. +== Relaton + +Relaton:: These are issues with requests to the Relaton bibliographic database; typically, they are +reports that the requested document identifier in the bibliography does not exist. + == Markup issues Anchors:: Issue with identifiers of document elements, or resources (including URIs) AsciiDoc Input:: Issue with AsciiDoc markup, likely to prevent parsing of document -Bibliography:: Issue with bibliographic markup +Bibliography:: Issue with bibliographic markup (as opposed to bibliography fetches) Crossreferences:: Issue with cross-reference to document elements Document Attributes:: Issue with content of AsciiDoc document attributes Images:: Issue with images