update: add `pandoc` version v1.19.2.1 #2234

renovate · 2024-11-15T20:09:48Z

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| pandoc | minor | `1.17.0.2` -> `1.19.2.1` |

Release Notes

jgm/pandoc (pandoc)

`v1.19.2.1`

- Require skylighting >= 0.1.1.4.
- Adjust test output for skylighting version.
- Relax upper bounds on blaze-html and blaze-markup.

`v1.19.2`

- Add dependencies on texmath and skylighting to the executable.
  This is needed for dependency version numbers to be available,
  with Cabal > 2.

`v1.19.1`

- Set PANDOC_VERSION environment variable for filters (#2640).
  This allows filters to check the pandoc version that produced
  the JSON they are receiving.
- Docx reader: Ensure one-row tables don't have header (#3285,
  Jesse Rosenthal). Tables in MS Word are set by default to have
  special first-row formatting, which pandoc uses to determine whether
  or not they have a header. This means that one-row tables will, by
  default, have only a header -- which we imagine is not what people
  want. This change ensures that a one-row table is not understood to
  be a header only. Note that this makes it impossible to produce a
  header-only table from docx, even though such a table is legal in
  pandoc. But we believe that in nearly all cases, a header-only table
  would be an accidental (and unwelcome) result.
- HTML reader:
  - Fixed some bad regressions in the HTML table parser (#3280).
    These regressions led to the introduction of empty rows
    in some circumstances.
  - Understand style=width: as well as width in col (#3286).
- RST reader:
  - Print warnings when keys, substitutions, or notes are not found.
    Previously the parsers failed and we got raw text. Now we get a
    link with an empty URL, or empty inlines in the case of a note or
    substitution.
  - Fix hyperlink aliases (#3283).
- Man writer: Ensure that periods are escaped at beginning of line
  (#3270).
- LaTeX writer: Fix unnumbered headers when used with --top-level
  (#3272, Albert Krewinkel). Fix interaction of top-level
  divisions part or chapter with unnumbered headers when
  emitting LaTeX. Unnumbered headers are now always written using
  starred commands (like \subsection*{}).
- LaTeX template: use comma not semicolon to separate keywords for
  pdfkeywords. Thanks to Wandmalfarbe.
- Markdown writer: Fixed incorrect word wrapping (#3277).
  Previously pandoc would sometimes wrap lines too early due to
  this bug.
- Text.Pandoc.Pretty: Added afterBreak [API change]. This makes it
  possible to insert escape codes for content that needs escaping at the
  beginning of a line.
- Removed old MathMLInHTML.js from 2004, which should no longer
  be needed for MathML with modern browsers.
- Fixed tests with dynamic linking (#2709).
- Makefile: Use stack instead of cabal for targets. This is just
  a convenience for developers.
- Fixed bash completion of filenames with space (#2749).
- MANUAL: improved documentation on how to create a custom
  reference.docx.
- Fix minor spelling typos in the manual (#3273, Anthony Geoghegan)
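
The `PANDOC_VERSION` variable introduced in v1.19.1 lets a filter adapt to the pandoc release that invoked it. Here is a minimal sketch in Python, assuming the value is a dot-separated list of integers such as `1.19.2.1`. The helper names `pandoc_version` and `at_least` are hypothetical and not part of pandoc or any filter library.

```python
import os


def pandoc_version(environ=os.environ):
    """Return the pandoc version as a tuple of ints, or None if unset.

    Assumption: PANDOC_VERSION looks like "1.19.2.1" (digits and dots only).
    """
    raw = environ.get("PANDOC_VERSION")
    if not raw:
        # Variable missing: pandoc older than 1.19.1, or the filter
        # was started by hand rather than by pandoc.
        return None
    return tuple(int(part) for part in raw.split("."))


def at_least(version, minimum):
    """True if a known version tuple is >= the given minimum tuple."""
    return version is not None and version >= minimum
```

When the variable is unset, the filter cannot tell which pandoc produced its input, so treating `None` as "unknown" rather than as "old" is probably the safer default.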

`v1.19`

- Add dependencies on texmath and skylighting to the executable.
  This is needed for dependency version numbers to be available,
  with Cabal > 2.

`v1.18`

- Added --list-input-formats, --list-output-formats,
  --list-extensions, --list-highlight-languages, and
  --list-highlight-styles (#3173). Removed list of highlighting
  languages from --version output. Removed list of input and output
  formats from default --help output.
- Added --reference-location=block|section|document option
  (Jesse Rosenthal). This determines whether Markdown link references
  and footnotes are placed at the end of the document, the end of the
  section, or the end of the top-level block.
- Added --top-level-division=section|chapter|part (Albert Krewinkel).
  This determines what a level-1 header corresponds to in LaTeX,
  ConTeXt, DocBook, and TEI output. The default is section.
  The --chapters option has been deprecated in favor of
  --top-level-division=chapter.
- Added LineBlock constructor for Block (Albert Krewinkel). This
  is now used in parsing RST and Markdown line blocks, DocBook
  linegroup/line combinations, and Org-mode VERSE blocks.
  Previously Para blocks with hard linebreaks were used. LineBlocks
  are handled specially in the following output formats: AsciiDoc
  (as [verse] blocks), ConTeXt (\startlines/\endlines),
  HTML (div with a style), Markdown (line blocks if line_blocks
  is enabled), Org-mode (VERSE blocks), RST (line blocks). In
  other output formats, a paragraph with hard linebreaks is emitted.
- Allow binary formats to be written to stdout (but not to a tty) (#2677).
  This only works on POSIX systems, since we use the unix library to check
  whether output is to a tty. On Windows, pandoc works as before and always
  requires an output file parameter for binary formats.
- Changed JSON output format (Jesse Rosenthal). Previously we used
  generically generated JSON, but this was subject to change depending
  on the version of aeson pandoc was compiled with. To ensure stability,
  we switched to using manually written ToJSON and FromJSON
  instances, and encoding the API version. Note: pandoc filter
  libraries will need to be revised to handle the format change.
  Here is a summary of the essential changes:
  - The toplevel JSON format is now {"pandoc-api-version" : [MAJ, MIN, REV], "meta" : META, "blocks": BLOCKS}
    instead of [{"unMeta": META}, [BLOCKS]].
    Decoding fails if the major and minor version numbers don't
    match.
  - Leaf nodes no longer have an empty array for their "c" value.
    Thus, for example, a Space is encoded as {"t":"Space"}
    rather than {"t":"Space","c":[]} as before.
- Removed tests/Tests/Arbitrary.hs and added a Text.Pandoc.Arbitrary
  module to pandoc-types (Jesse Rosenthal). This makes it easier
  to use QuickCheck with pandoc types outside of pandoc itself.
- Add bracketed_spans Markdown extension, enabled by default
  in pandoc markdown. This allows you to create a native span
  using this syntax: [Here is my span]{#id .class key="val"}.
- Added angle_brackets_escapable Markdown extension (#2846).
  This is needed because GitHub-flavored Markdown has a slightly
  different set of escapable symbols than original Markdown;
  it includes angle brackets.
- Export Text.Pandoc.Error in Text.Pandoc [API change].
- Print highlighting-kate version in --version.
- Text.Pandoc.Options:
  - Extension has new constructors Ext_bracketed_spans and
    Ext_angle_brackets_escapable [API change].
  - Added ReferenceLocation type [API change] (Jesse Rosenthal).
  - Added writerReferenceLocation field to WriterOptions (Jesse
    Rosenthal).
- --filter: we now check $DATADIR/filters for filters before
  looking in the path (#3127, Jesse Rosenthal, thanks to Jakob
  Voß for the idea). Filters placed in this directory need not
  be executable; if the extension is .hs, .php, .pl, .js,
  or .rb, pandoc will run the right interpreter.
- For --webtex, replace deprecated Google Chart API by CodeCogs as
  default (Kolen Cheung).
- Removed raw_tex extension from markdown_mmd defaults (Kolen Cheung).
- Execute .js filters with node (Jakob Voß).
- Textile reader:
  - Support bc.. extended code blocks (#3037). Also, remove trailing
    newline in code blocks (consistently with Markdown reader).
  - Improve table parsing. We now handle cell and row attributes, mostly
    by skipping them. However, alignments are now handled properly.
    Since in pandoc alignment is per-column, not per-cell, we
    try to divine column alignments from cell alignments.
    Table captions are also now parsed, and textile indicators
    for thead and tfoot no longer cause parse failure. (However,
    a row designated as tfoot will just be a regular row in pandoc.)
  - Improve definition list parsing. We now allow multiple terms
    (which we concatenate with linebreaks). An exponential parsing
    bug (#3020) is also fixed.
  - Disallow empty URL in explicit link (#3036).
- RST reader:
  - Use Div instead of BlockQuote for admonitions (#3031).
    The Div has class admonition and (if relevant) one of the
    following: attention, caution, danger, error, hint,
    important, note, tip, warning. Note: This will change
    the rendering of some RST documents! The word ("Warning", "Attention",
    etc.) is no longer added; that must be done with CSS or a filter.
  - A Div is now used for sidebar as well.
  - Skip whitespace before note (Jesse Rosenthal, #3163). RST requires a
    space before a footnote marker. We discard those spaces so that each
    footnote is adjacent to the text that comes before it. This is in line with
    what rst2latex does.
  - Allow empty lines when parsing line blocks (Albert Krewinkel).
- Markdown reader:
  - Allow empty lines when parsing line blocks (Albert Krewinkel).
  - Allow attributes on autolinks (#3183, Daniele D'Orazio).
- LaTeX reader:
  - More robust parsing of unknown environments (#3026).
    We no longer fail on things like ^ inside options for tikz.
  - Be more forgiving of non-standard characters, e.g. ^ outside of math.
    Some custom environments give these a meaning, so we should try not to
    fall over when we encounter them.
  - Drop duplicate * in bibtexKeyChars (Albert Krewinkel)
- MediaWiki reader:
  - Fix for unquoted attribute values in mediawiki tables (#3053).
    Previously an unquoted attribute value in a table row
    could cause parsing problems.
  - Improved treatment of verbatim constructions (#3055).
    Previously these yielded strings of alternating Code and Space
    elements; we now incorporate the spaces into the Code. Emphasis
    etc. is still possible inside these.
  - Properly interpret XML tags in pre environments (#3042). They are meant
    to be interpreted as literal text.
- EPUB reader: don't add root path to data: URIs (#3150).
  Thanks to @lep for the bug report and patch.
- Org reader (Albert Krewinkel):
  - Preserve indentation of verse lines (#3064). Leading spaces in verse
    lines are converted to non-breaking spaces, so indentation is preserved.
  - Ensure image sources are proper links. Image sources as those in plain
    images, image links, or figures, must be proper URIs or relative file
    paths to be recognized as images. This restriction is now enforced
    for all image sources. This also fixes the reader's usage of uncleaned
    image sources, leading to file: prefixes not being deleted from
    figure images. Thanks to @bsag for noticing this bug.
  - Trim verse lines properly (Albert Krewinkel).
  - Extract meta parsing code to module. Parsing of meta-data is well
    separable from other block parsing tasks. Moving into new module to
    get small files and clearly arranged code.
  - Read markup only for special meta keys. Most meta-keys should be read
    as normal string values, only a few are interpreted as marked-up text.
  - Allow multiple, comma-separated authors. Multiple authors can be
    specified in the #+AUTHOR meta line if they are given as a
    comma-separated list.
  - Give precedence to later meta lines. The last meta-line of any given
    type is the significant line. Previously the value of the first line
    was kept, even if more lines of the same type were encountered.
  - Read LaTeX_header as header-includes. LaTeX-specific header commands
    can be defined in #+LaTeX_header lines. They are parsed as
    format-specific inlines to ensure that they will only show up in LaTeX
    output.
  - Set documentclass meta from LaTeX_class.
  - Set classoption meta from LaTeX_class_options.
  - Read HTML_head as header-includes. HTML-specific head content can be
    defined in #+HTML_head lines. They are parsed as format-specific
    inlines to ensure that they will only show up in HTML output.
  - Respect author export option. The author option controls whether
    the author should be included in the final markup. Setting
    #+OPTIONS: author:nil will drop the author from the final meta-data
    output.
  - Respect email export option. The email option controls whether the
    email meta-field should be included in the final markup. Setting
    #+OPTIONS: email:nil will drop the email field from the final
    meta-data output.
  - Respect creator export option. The creator option controls whether
    the creator meta-field should be included in the final markup. Setting
    #+OPTIONS: creator:nil will drop the creator field from the final
    meta-data output. Org-mode recognizes the special value comment for
    this field, causing the creator to be included in a comment. This is
    difficult to translate to Pandoc internals and is hence interpreted the
    same as other truish values (i.e. the meta field is kept if it's
    present).
  - Respect unnumbered header property (#3095). Sections with the unnumbered
    property should, as the name implies, be excluded from the automatic
    numbering of sections provided by some output formats. The Pandoc
    convention for this is to add an "unnumbered" class to the header. The
    reader treats properties as key-value pairs per default, so a special
    case is added to translate the above property to a class instead.
  - Allow figure with empty caption (Albert Krewinkel, #3161).
    A #+CAPTION attribute before an image is enough to turn an image into
    a figure. This wasn't the case because the parseFromString function,
    which processes the caption value, would fail on empty values. Adding
    a newline character to the caption value fixes this.
- Docx reader:
  - Use XML convenience functions (Jesse Rosenthal).
    The functions isElem and elemName (defined in Docx/Util.hs) make
    the code a lot cleaner than the original XML.Light functions, but they
    had been used inconsistently. This puts them in wherever applicable.
  - Handle anchor spans with content in headers. Previously, we would only
    be able to figure out internal links to a header in a docx if the
    anchor span was empty. We change that to read the inlines out of the
    first anchor span in a header.
  - Let headers use existing id. Previously we always generated an id for
    headers (since they wouldn't bring one from Docx). Now we let it use an
    existing one if possible. This should allow us to recurse through anchor
    spans.
  - Use all anchor spans for header ids. Previously we only used the first
    anchor span to affect header ids. This allows us to use all the anchor
    spans in a header, whether they're nested or not (#3088).
  - Test for nested anchor spans in header. This ensures that anchor spans
    in header with content (or with other anchor spans inside) will resolve
    to links to a header id properly.
- ODT reader (Hubert Plociniczak):
  - Include list's starting value. Previously the starting value of
    list items was hardcoded to 1. In reality, an ODT list style
    definition can provide a new starting value in one of its
    attributes.
  - Infer caption from the text following the image.
    A frame can contain other frames with text boxes.
  - Add fig: to title for Image with a caption (as expected
    by pandoc's writers).
  - Basic support for images in ODT documents.
  - Don't duplicate text for anchors (#3143). When creating an anchor
    element we were adding its representation as well as the original
    content, leading to text duplication.
- DocBook writer:
  - Include an anchor element when a div or span has an id (#3102).
    Note that DocBook does not have a class attribute, but at least this
    provides an anchor for internal links.
- LaTeX writer:
  - Don't use * for unnumbered paragraph, subparagraph. The starred
    variants don't exist. This helps with part of #3058...it gets rid of
    the spurious *s. But we still have numbers on the 4th and 5th level
    headers.
  - Properly escape backticks in verbatim (#3121, Jesse Rosenthal).
    Otherwise they can cause unintended ligatures like ?`.
  - Handle NARROW NO-BREAK SPACE in LaTeX as \, (Vaclav Zeman).
  - Don't include [htbp] placement for figures (#3103, Václav Haisman).
    This allows figure placement defaults to be changed by the user
    in the template.
- TEI writer: remove heuristic to detect book template (Albert Krewinkel).
  TEI doesn't have <book> elements but only generic <divN> division
  elements. Checking the template for a trailing </book> is nonsensical.
- MediaWiki writer: transform filename with underscores in images (#3052).
  foo bar.jpg becomes foo_bar.jpg. This was already done
  for internal links, but it also needs to happen for images.
- ICML writer: replace partial function (!!) in table handling (#3175,
  Mauro Bieg).
- Man writer: allow section numbers that are not a single digit (#3089).
- AsciiDoc writer: avoid unnecessary use of "unconstrained" emphasis
  (#3068). In AsciiDoc, you must use a special form of emphasis
  (double __) for intraword emphasis. Pandoc was previously using
  this more than necessary.
- EPUB writer: use stringify instead of plain writer for metadata
  (#3066). This means that underscores won't be used for emphasis,
  or CAPS for bold. The metadata fields will just have unadorned
  text.
- Docx writer:
  - Implement user-defined styles (Jesse Rosenthal). Divs and Spans
    with a custom-style key in the attributes will apply the corresponding
    key to the contained blocks or inlines.
  - Add ReaderT env to the docx writer (Jesse Rosenthal).
  - Clean up and streamline RTL behavior (Jesse Rosenthal, #3140).
    You can set dir: rtl in YAML metadata, or use -M dir=rtl
    on the command line. For finer-grained control, you can set
    the dir attribute in Div or Span elements.
- Org writer (Albert Krewinkel):
  - Remove blank line after figure caption. Org-mode only treats an image
    as a figure if it is directly preceded by a caption.
  - Ensure blank line after figure. An Org-mode figure should be surrounded
    by blank lines. The figure would be recognized regardless, but images
    in the following line would unintentionally be treated as figures as
    well.
  - Ensure link targets are paths or URLs. Org-mode treats links as
    document internal searches unless the link target looks like a URL or
    file path, either relative or absolute. This change ensures that this
    is always the case.
  - Translate language identifiers. Pandoc and Org-mode use different
    programming language identifiers. An additional translation between
    those identifiers is added to avoid unexpected behavior. This fixes a
    problem where language specific source code would sometimes be output
    as example code.
  - Drop space before footnote markers (Albert Krewinkel, #3162).
    The writer no longer adds an extra space before footnote markers.
- Markdown writer:
  - Don't emit HTML for tables unless raw_html extension is set (#3154).
    Emit [TABLE] if no suitable table formats are enabled and raw HTML
    is disabled.
  - Check for the raw_html extension before emitting a raw HTML block.
  - Abstract out note/ref function (Jesse Rosenthal).
  - Add ReaderT monad for environment variables (Jesse Rosenthal).
- HTML, EPUB, slidy, revealjs templates: Use <p> instead of <h1> for
  subtitle, author, date (#3119). Note that, as a result of this change,
  authors may need to update CSS.
- revealjs template: Added notes-server option
  (jgm/pandoc-templates#212, Yoan Blanc).
- Beamer template:
  - Restore whitespace between paragraphs. This was
    a regression in the last release (jgm/pandoc-templates#207).
  - Added themeoptions variable (Carsten Gips).
  - Added beamerarticle variable. This causes the beamerarticle
    package to be loaded in beamer, to produce an article from beamer
    slides. (Carsten Gips)
  - Added support for fontfamilies structured variable
    (Artem Klevtsov).
  - Added hypersetup options (Jake Zimmerman).
- LaTeX template:
  - Added dummy definition for \institute.
    This isn't a standard command, and we want to avoid a crash when
    institute is used with the default template.
  - Define default figure placement (Václav Haisman), since pandoc
    no longer includes [htbp] for figures. Users with custom templates
    will want to add this. See #3103.
  - Use footnote package to fix notes in tables (jgm/pandoc-templates#208,
    Václav Haisman).
- Moved template compiling/rendering code to a separate library,
  doctemplates. This allows the pandoc templating system to be
  used independently.
- Text.Pandoc.Error: Fix out of index error in handleError
  (Matthew Pickering). The fix is to not try to show the exact line when
  it would cause an out-of-bounds error as a result of included files.
- Text.Pandoc.Shared: Add linesToBlock function (Albert Krewinkel).
- Text.Pandoc.Parsing.emailAddress: tighten up parsing of email
  addresses. Technically **@user is a valid email address, but if we
  allow things like this, we get bad results in markdown flavors
  that autolink raw email addresses (see #2940). So we exclude a few
  valid email addresses in order to avoid these more common bad cases.
- Text.Pandoc.PDF: Don't crash with nonexistent image (#3100). Instead,
  emit the alt text, emphasized. This accords with what the ODT writer
  currently does. The user will still get a warning about a nonexistent
  image.
- Fix example in API documentation (#3176, Thomas Weißschuh).
- Tell where to get tarball in INSTALL (#3062).
- Rename README to MANUAL.txt and add GitHub-friendly README.md
  (Albert Krewinkel, Kolen Cheung).
- Replace COPYING with Markdown version COPYING.md from GNU (Kolen Cheung).
- MANUAL.txt:
  - Put note on structured vars in separate paragraph (#2148, Albert
    Krewinkel). Make it clearer that structured author variables require a
    custom template.
  - Note that --katex works best with html5 (#3077).
  - Fix the LaTeX and EPUB links in manual (Morton Fox).
  - Document biblio-title variable.
- Improve spacing of footnotes in --help output (Waldir Pimenta).
- Update KaTeX to v0.6.0 (Kolen Cheung).
- Allow latest dependencies.
- Use texmath 0.8.6.6 (#3040).
- Allow http-client 0.4.30, which is the version in stackage lts.
  Previously we required 0.5.
  Remove CPP conditionals for earlier versions.
- Remove support for GHC < 7.8 (Jesse Rosenthal).
  - Remove Compat.Monoid.
  - Remove an inline monad compatibility macro.
  - Remove Text.Pandoc.Compat.Except.
  - Remove directory compat.
  - Change constraint on mtl.
  - Remove unnecessary CPP condition in UTF8.
  - Bump base lower bound to 4.7.
  - Remove 7.6 build from .travis.yaml.
  - Bump supported ghc version in CONTRIBUTING.md.
  - Add note about GHC version support to INSTALL.
  - Remove GHC 7.6 from list of tested versions (Albert Krewinkel).
  - Remove TagSoup compat.
  - Add EOL note to time compat module. Because time 1.4 is a boot library
    for GHC 7.8, we will support the compatibility module as long as we
    support 7.8. But we should be clear about when we will no longer need
    it.
  - Remove blaze-html CPP conditional.
  - Remove unnecessary CPP in custom Prelude.
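
The JSON format change described in the v1.18 notes above means that a filter meant to work across versions has to accept both top-level layouts and tolerate leaf nodes that carry no `"c"` key. Here is a minimal sketch in Python, assuming the input has already been decoded (for example with `json.loads`). The function names `split_document` and `node_content` are hypothetical and do not come from pandoc or any filter library.

```python
def split_document(doc):
    """Return (api_version, meta, blocks) for either top-level layout.

    api_version is None for the old layout, which did not encode it.
    """
    if isinstance(doc, dict) and "pandoc-api-version" in doc:
        # New layout (pandoc 1.18 and later):
        # {"pandoc-api-version": [MAJ, MIN, REV], "meta": META, "blocks": BLOCKS}
        return tuple(doc["pandoc-api-version"]), doc["meta"], doc["blocks"]
    if isinstance(doc, list) and len(doc) == 2:
        # Old layout: [{"unMeta": META}, [BLOCKS]]
        return None, doc[0]["unMeta"], doc[1]
    raise ValueError("unrecognized pandoc JSON layout")


def node_content(node, default=None):
    """Return a node's "c" value.

    Leaf nodes such as Space are now encoded as {"t": "Space"}, so the
    key may be absent; the old encoding used an empty list instead.
    """
    return node.get("c", default)
```

Passing `default=[]` to `node_content` makes a new-style leaf node look like the old encoding, which may be the least invasive way to keep existing filter code working.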

`v1.17.2`

- Added Zim Wiki writer, template and tests. zimwiki is now
  a valid output format. (Alex Ivkin)
- Changed email-obfuscation default to no obfuscation (#2988).
  - writerEmailObfuscation in defaultWriterOptions is now
    NoObfuscation.
  - the default for the command-line --email-obfuscation option is
    now none.
- Docbook writer: Declare xlink namespace in Docbook5 output (Ivo Clarysse).
- Org writer:
  - Support arbitrary raw inlines (Albert Krewinkel).
    Org mode allows arbitrary raw inlines ("export snippets" in Emacs
    parlance) to be included as @@format:raw foreign format text@@.
  - Improve Div handling (Albert Krewinkel). Div blocks handling is
    changed to make the output look more like idiomatic org mode:
    - Div-wrapped content is output as-is if the div's attribute is the
      null attribute.
    - Div containers with an id but neither classes nor key-value pairs
      are unwrapped and the id is added as an anchor.
    - Divs with classes associated with greater block elements are
      wrapped in a #+BEGIN...#+END block.
    - The old behavior for Divs with more complex attributes is kept.
- HTML writer: Better support for raw LaTeX environments (#2758).
  Previously we just passed all raw TeX through when MathJax
  was used for HTML math. This passed through too much.
  With this patch, only raw LaTeX environments that MathJax
  can handle get passed through.
  This patch also causes raw LaTeX environments to be treated
  as math, when possible, with MathML and WebTeX output.
- Markdown writer: use raw HTML for simple, pipe tables with linebreaks
  (#2993). Markdown line breaks involve a newline, and simple and pipe
  tables can't contain one.
- Make --webtex work with the Markdown writer (#1177).
  This is a convenient option for people using
  websites whose Markdown flavors don't provide for math.
- Docx writer:
  - Set paragraph to FirstPara after display math (Jesse Rosenthal).
    We treat display math like block quotes, and apply FirstParagraph style
    to paragraphs that follow them. These can be styled as the user
    wishes. (But, when the user is using indentation, this allows for
    paragraphs to continue after display math without indentation.)
  - Use actual creation time as doc prop (Jesse Rosenthal).
    Previously, we had used the user-supplied date, if available, for Word's
    document creation metadata. This could lead to weird results, as in
    cases where the user post-dates a document (so the modification might be
    prior to the creation). Here we use the actual computer time to set the
    document creation.
- LaTeX writer:
  - Don't URI-escape image source (#2825). Usually this is a local file,
    and replacing spaces with %20 ruins things.
  - Allow 'standout' as a beamer frame option (#3007).
    ## Slide title {.standout}.
- RST reader: Fixed links with no explicit link text. The link
  `<foo>`_ should have foo as both its link text and its URL.
  See RST spec at http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html#embedded-uris-and-aliases
  Closes Debian #828167 -- reported by Christian Heller.
- Textile reader:
  - Fixed attributes (#2984). Attributes can't be followed by
    a space. So, _(class)emph_ but _(noclass) emph_.
  - Fixed exponential parsing bug (#3020).
  - Fix overly aggressive interpretation as images (#2998).
    Spaces are not allowed in the image URL in textile.
- LaTeX reader:
  - Fix \cite so it is a NormalCitation not AuthorInText.
  - Strip off double quotes around image source if present (#2825).
    Avoids interpreting these as part of the literal filename.
- Org reader:
  - Add semicolon to list of special chars (Albert Krewinkel).
    Semicolons are used as special characters in citation syntax. This
    ensures the correct parsing of Pandoc-style citations:
    [prefix; @key; suffix]. Previously, parsing would have failed unless
    there was a space or other special character as the last character.
  - Add support for "Berkeley-style" cites (Albert Krewinkel, #1978).
    A specification for an official Org-mode citation syntax was drafted by
    Richard Lawrence and enhanced with the help of others on the orgmode
    mailing list. Basic support for this citation style is added to the
    reader.
  - Support arbitrary raw inlines (Albert Krewinkel).
    Org mode allows arbitrary raw inlines ("export snippets" in Emacs
    parlance) to be included as @@format:raw foreign format text@@.
  - Remove partial functions (Albert Krewinkel, #2991).
    Partial functions like head lead to avoidable errors and should be
    avoided. They are replaced with total functions.
  - Support figure labels (Albert Krewinkel, #2496, #2999).
    Figure labels given as #+LABEL: thelabel are used as the ID of the
    respective image. This allows e.g. the LaTeX to add proper \label
    markup.
  - Improve tag and properties type safety (Albert Krewinkel).
    Specific newtype definitions are used to replace stringly typing of tags
    and properties. Type safety is increased while readability is improved.
  - Parse as headlines, convert to blocks (Albert Krewinkel).
    Emacs org-mode is based on outline-mode, which treats documents as trees
    with headlines as nodes. The reader is refactored to parse into a
    similar tree structure. This simplifies transformations acting on
    document (sub-)trees.
  - Refactor comment tree handling (Albert Krewinkel).
    Comment trees were handled after parsing, as pattern matching on lists
    is easier than matching on sequences. The new method of reading
    documents as trees allows for more elegant subtree removal.
  - Support archived trees export options (Albert Krewinkel).
    Handling of archived trees can be modified using the arch option.
    Archived trees are either dropped, exported completely, or collapsed to
    include just the header when the arch option is nil, non-nil, or
    headline, respectively.
  - Put export setting parser into module (Albert Krewinkel).
    Export option parsing is distinct enough from general block parsing to
    justify putting it into a separate module.
  - Support headline levels export setting (Albert Krewinkel).
    The depths of headlines can be modified using the H option. Deeper
    headlines will be converted to lists.
  - Replace ugly code with view pattern (Albert Krewinkel).
    Some less-than-smart code required a pragma switching off overlapping
    pattern warnings in order to compile seamlessly. Using view patterns
    makes the code easier to read and also doesn't require overlapping
    pattern checks to be disabled.
  - Fix parsing of verbatim inlines (Albert Krewinkel, #3016).
    Org rules for allowed characters before or after markup chars were not
    checked for verbatim text. This resulted in wrong parsing outcomes
    if the verbatim text contained e.g. space-enclosed markup characters as
    part of the text (=is_substr = True=). Forcing the parser to update
    the positions of allowed/forbidden markup border characters fixes this.
- LaTeX template: fix for obscure hyperref/xelatex issue.
  Here's a minimal case:
```
\documentclass[]{article}
\usepackage{hyperref}
\begin{document}
\section{\%á}
\end{document}
```
  Without this change, this fails on the second invocation of xelatex.
  This affects inputs like # %á with PDF output via xelatex.
- trypandoc: call results 'html' instead of 'result'.
  This is for better compatibility with babelmark2.
- Document MultiMarkdown as input/output format (Albert Krewinkel, #2973).
  MultiMarkdown was only mentioned as a supported Markdown dialect but not
  as a possible input or output format. A brief mention is added
  everywhere the other supported markdown dialects are mentioned.
- Document Org mode as a format containing raw HTML (Albert Krewinkel)
  Raw HTML is kept when the output format is Emacs Org mode.
- Implement RawInline and RawBlock in sample lua custom writer (#2985).
- Text.Pandoc.Shared:
  - Introduce blocksToInlines function (Jesse Rosenthal).
    This is a lossy function for converting [Block] -> [Inline]. Its main
    use, at the moment, is for docx comments, which can contain arbitrary
    blocks (except for footnotes), but which will be converted to spans.
    This is, at the moment, pretty useless for everything but the basic
    Para and Plain comments. It can be improved, but the docx reader
    should probably emit a warning if the comment contains more than this.
  - Add BlockQuote to blocksToInlines (Jesse Rosenthal).
  - Add further formats for normalizeDate (Jesse Rosenthal).
    We want to avoid illegal dates -- in particular years with greater than
    four digits. We attempt to parse series of digits first as %Y%m%d, then
    %Y%m, and finally %Y.
  - normalizeDate should reject illegal years (Jesse Rosenthal).
    We only allow years between 1601 and 9999, inclusive. ISO 8601
    actually says that years are supposed to start with 1583, but MS Word
    only allows 1601-9999. This should prevent corrupted Word files when the
    date is out of that range or is parsed incorrectly.
  - Improve year sanity check in normalizeDate (Jesse Rosenthal).
    Previously we parsed a list of dates, took the first one, and then
    tested its year range. That meant that if the first one failed, we
    returned nothing, regardless of what the others did. Now we test for
    sanity before running msum over the list of Maybe values. Anything
    failing the test will be Nothing, so will not be a candidate.
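
The normalizeDate behaviour described above can be sketched in Python as follows. This is only an illustration, not pandoc's Haskell implementation: the names `sane`, `first_sane`, `parse_with` and `normalize_date` are hypothetical, only the three digit formats listed above are modelled, and the ISO `YYYY-MM-DD` output form is an assumption.

```python
from datetime import date, datetime
from typing import Iterable, Optional

# Year range taken from the notes above (the range MS Word accepts).
MIN_YEAR, MAX_YEAR = 1601, 9999


def sane(candidate: Optional[date]) -> Optional[date]:
    """Return the candidate only if it parsed and its year is in range."""
    if candidate is not None and MIN_YEAR <= candidate.year <= MAX_YEAR:
        return candidate
    return None


def first_sane(candidates: Iterable[Optional[date]]) -> Optional[date]:
    """Check every candidate for sanity *before* picking the first success,
    so an out-of-range first parse cannot hide a later valid one."""
    for candidate in candidates:
        checked = sane(candidate)
        if checked is not None:
            return checked
    return None


def parse_with(fmt: str, text: str) -> Optional[date]:
    """Try a single format; a failed parse becomes None."""
    try:
        return datetime.strptime(text, fmt).date()
    except ValueError:
        return None


def normalize_date(text: str) -> Optional[str]:
    """Try %Y%m%d, then %Y%m, then %Y, as listed in the notes above."""
    formats = ["%Y%m%d", "%Y%m", "%Y"]
    result = first_sane(parse_with(fmt, text) for fmt in formats)
    return result.isoformat() if result is not None else None
```

The older behaviour described above corresponds to taking the first successful parse and only then checking its year; filtering each candidate first is what lets a later format still succeed.
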
- Docx reader:
  - Add simple comment functionality. (Jesse Rosenthal).
    This adds simple track-changes comment parsing to the docx reader. It is
    turned on with --track-changes=all. All comments are converted to
    inlines, which can list some information. In the future a warning will be
    added for comments with formatting that seems like it will be excessively
    denatured. Note that comments can extend across blocks. For that reason
    there are two spans: comment-start and comment-end. comment-start
    will contain the comment. comment-end will always be empty. The two
    will be associated by a numeric id.
  - Enable warnings in top-level reader (Jesse Rosenthal).
    Previously we had only allowed for warnings in the parser. Now we allow
    for them in the Docx.hs as well. The warnings are simply concatenated.
  - Add warning for advanced comment formatting. (Jesse Rosenthal).
    We can't guarantee we'll convert every comment correctly, though we'll
    do the best we can. This warns if the comment includes something other
    than Para or Plain.
  - Add tests for warnings. (Jesse Rosenthal).
  - Add tests for comments (Jesse Rosenthal).
    We test for comments, using all track-changes options. Note that we
    should only output comments if --track-changes=all. We also test for
    emitting warnings if there is complicated formatting.
- README: update to include track-changes comments. (Jesse Rosenthal)
- Improved Windows installer - don't ignore properties set on command-line.
  See #2708. Needs testing to see if this resolves the issue.
  Thanks to @nkalvi.
- Process markdown extensions on command line in L->R order (#2995).
  Previously they were processed, very unintuitively, in R->L
  order, so that markdown-tex_math_dollars+tex_math_dollars
  had tex_math_dollars disabled.
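
To make the ordering concrete, here is a small Python sketch of left-to-right processing of `+ext` / `-ext` modifiers. It is not pandoc's parser; `parse_format_spec` and its `defaults` parameter are hypothetical names used only for illustration.

```python
import re
from typing import Iterable, Set, Tuple


def parse_format_spec(spec: str, defaults: Iterable[str] = ()) -> Tuple[str, Set[str]]:
    """Split 'format+ext-ext...' and apply the modifiers left to right."""
    match = re.match(r"([^+-]+)(.*)$", spec)
    if match is None:
        raise ValueError(f"no format name in {spec!r}")
    name, modifiers = match.groups()
    enabled = set(defaults)
    # findall returns the (sign, extension) pairs in source order,
    # so a later modifier overrides an earlier one.
    for sign, ext in re.findall(r"([+-])([^+-]+)", modifiers):
        if sign == "+":
            enabled.add(ext)
        else:
            enabled.discard(ext)
    return name, enabled
```

With this ordering, `markdown-tex_math_dollars+tex_math_dollars` ends with the extension enabled, because the last modifier wins.
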
- Added secnumdepth variable to LaTeX template (#2920).
- Include table of contents in README.html in Windows package.
- Writers: treat SoftBreak as space for stripping (Jesse Rosenthal)
  In Writers.Shared, we strip leading and trailing spaces for display
  math. Since SoftBreak's are treated as spaces, we should strip those
  too.
- beamer, latex templates: pass biblatexoptions directly in package load.
  This allows runtime options to be used. Fixes jgm/pandoc-citeproc#201.
- CPP workaround for deprecation of parseUrl in http-client.
- Removed some redundant class constraints.
- make_osx_package.sh - use OSX env variable.
- Added winpkg target to Makefile. This downloads the windows package
  from appveyor and signs it using the key.
- Document Org mode as a format containing raw TeX (Albert Krewinkel).
  Raw TeX is kept verbatim when the output format is Emacs Org mode.
- Support math with haddock-library >= 1.4.
- Removed -rtsopts from library stanza. It has no effect, and Hackage
  wouldn't accept the package.
- Update library dependency versions.

`v1.17.1`

Compare Source

New output format: docbook5 (Ivo Clarysse).
- Text.Pandoc.Options: Add writerDocBook5 to WriterOptions
  (API change).
- Org writer:
  - Add :PROPERTIES: drawer support (Albert Krewinkel, #1962).
    This allows header attributes to be added to org documents in the form
    of :PROPERTIES: drawers. All available attributes are stored as
    key/value pairs. This reflects the way the org reader handles
    :PROPERTIES: blocks.
  - Add drawer capability (Carlos Sosa). For the implementation of the
    Drawer element in the Org Writer, we make use of a generic Block
    container with attributes. The presence of a drawer class defines
    that the Div constructor is a drawer. The first class defines the
    drawer name to use. The key-value list in the attributes defines
    the keys to add inside the Drawer. Lastly, the list of Block elements
    contains miscellaneous blocks elements to add inside of the Drawer.
  - Use CUSTOM_ID in properties (Albert Krewinkel). The ID property is
    reserved for internal use by Org-mode and should not be used.
    The CUSTOM_ID property is to be used instead; it is converted to the
    ID property for certain export formats.
- LaTeX writer:
  - Ignore --incremental unless output format is beamer (#2843).
  - Fix polyglossia to babel env mapping (Mauro Bieg, #2728).
    Allow for optional argument in square brackets.
  - Recognize la-x-classic as Classical Latin (Andrew Dunning).
    This allows one to access the hyphenation patterns in CTAN's
    hyph-utf8.
  - Add missing languages from hyph-utf8 (Andrew Dunning).
  - Improve use of \strut with \minipage inside tables
    (Jose Luis Duran). This improves spacing in multiline
    tables.
  - Use {} around options containing special chars (#2892).
  - Avoid lazy foldl.
  - Don't escape underscore in labels (#2921). Previously they were
    escaped as ux5f.
  - brazilian -> brazil for polyglossia (#2953).
- HTML writer: Ensure mathjax link is added when math appears in footnote
  (#2881). Previously if a document only had math in a footnote, the
  MathJax link would not be added.
- EPUB writer: set navpage variable on nav page.
  This allows templates to treat it differently.
- DocBook writer:
  - Use docbook5 if writerDocbook5 is set (Ivo Clarysse).
  - Properly handle ulink/link (Ivo Clarysse).
- EPUB reader:
  - Unescape URIs in spine (#2924).
  - Normalise link id (Mauro Bieg).
- Docx Reader:
  - Parse moveTo and moveFrom (Jesse Rosenthal).
    moveTo and moveFrom are track-changes tags that are used when a
    block of text is moved in the document. We now recognize these tags and
    treat them the same as insert and delete, respectively. So,
    --track-changes=accept will show the moved version, while
    --track-changes=reject will show the original version.
  - Tests for track-changes moving (Jesse Rosenthal).
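
The mapping of moveTo and moveFrom onto insert and delete described above can be sketched as below. This is a simplified model, not the docx reader's code: the `Run` tuple representation and the `resolve` function are hypothetical, and `--track-changes=all` is not modelled.

```python
from typing import List, Optional, Tuple

# A run of text tagged with its track-changes kind (None = unchanged text).
Run = Tuple[Optional[str], str]

# moveTo is treated like insert and moveFrom like delete, as described above.
AS_INSERT = {"insert", "moveTo"}
AS_DELETE = {"delete", "moveFrom"}


def resolve(runs: List[Run], mode: str) -> str:
    """Return the text that survives under 'accept' or 'reject'."""
    if mode not in ("accept", "reject"):
        raise ValueError(f"unsupported mode: {mode!r}")
    # Accepting drops deleted/moved-from text; rejecting drops
    # inserted/moved-to text.
    dropped = AS_DELETE if mode == "accept" else AS_INSERT
    return "".join(text for change, text in runs if change not in dropped)
```

For a moved phrase, accepting keeps the text at its new position and rejecting keeps it at its original position.
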
- ODT, EPUB, Docx readers: throw PandocError on unzip failure
  (Jesse Rosenthal). Previously, readDocx, readEPUB, and readOdt
  would error out if zip-archive failed. We change the archive extraction
  step from toArchive to toArchiveOrFail, which returns an Either value.
- Markdown, HTML readers: be more forgiving about unescaped & in
  HTML (#2410). We are now more forgiving about parsing invalid HTML with
  unescaped & as raw HTML. (Previously any unescaped &
  would cause pandoc not to recognize the string as raw HTML.)
- Markdown reader:
  - Fix pandoc title blocks with lines ending in 2 spaces (#2799).
  - Added -s to markdown-reader-more test.
- HTML reader: fixed bug in pClose. This caused exponential parsing
  behavior in documents with unclosed tags in dl, dd, dt.
- MediaWiki reader: Allow spaces before ! in MediaWiki table header
  (roblabla).
- RST reader: Support :class: option for code block in RST reader
  (Sidharth Kapur).
- Org reader (all Albert Krewinkel, except where noted otherwise):
  - Stop padding short table rows.
    Emacs Org-mode doesn't add any padding to table rows. The first
    row (header or first body row) is used to determine the column count,
    no other magic is performed.
  - Refactor rows-to-table conversion. This refactors
    the code converting a list of table lines to an org table ADT.
    The old code was simplified and is now slightly less ugly.
  - Fix handling of empty table cells, rows (Albert Krewinkel, #2616).
    This fixes Org mode parsing of some corner cases regarding empty cells
    and rows. Empty cells weren't parsed correctly, e.g. ||| should be
    two empty cells, but would be parsed as a single cell containing a pipe
    character. Empty rows were parsed as alignment rows and dropped from
    the output.
  - Fix spacing after LaTeX-style symbols.
    The org-reader was dropping space after unescaped LaTeX-style symbol
    commands: \ForAll \Auml resulted in ∀Ä but should give ∀ Ä
    instead. This seems to be because the LaTeX-reader treats the
    command-terminating space as part of the command. Dropping the trailing
    space from the symbol-command fixes this issue.
  - Print empty table rows. Empty table rows should not
    be dropped from the output, so row-height is always set to be at least 1.
  - Move parser state into separate module.
    The org reader code has become large and confusing. Extracting smaller
    parts into submodules should help to clean things up.
  - Add support for sub/superscript export options.
    Org-mode allows export settings to be specified via #+OPTIONS lines.
    Disabling simple sub- and superscripts is one of these export options;
    this option is now supported.
  - Support special strings export option. Parsing of special strings
    (like ... as ellipsis or -- as en dash) can be toggled using the -
    option.
  - Support emphasized text export option. Parsing of emphasized text can
    be toggled using the * option. This influences parsing of text marked
    as emphasized, strong, strikeout, and underline. Parsing of inline math,
    code, and verbatim text is not affected by this option.
  - Support smart quotes export option. Reading of smart quotes can be
    toggled using the ' option.
  - Parse but ignore export options. All known export options are parsed
    but ignored.
  - Refactor block attribute handling. A parser state attribute was used
    to keep track of block attributes defined in meta-lines. Global state
    is undesirable, so block attributes are no longer saved as part of the
    parser state. Old functions and the respective part of the parser state
    are removed.
  - Use custom anyLine. Additional state changes need to be made after
    a newline is parsed, otherwise markup may not be recognized correctly.
    This fixes a bug where markup after certain block-types would not be
    recognized.
  - Add support for ATTR_HTML attributes (#1906).
    Arbitrary key-value pairs can be added to some block types using a
    #+ATTR_HTML line before the block. Emacs Org-mode only includes these
    when exporting to HTML, but since we cannot make this distinction here,
    the attributes are always added. The functionality is now supported
    for figures.
  - Add :PROPERTIES: drawer support (#1877).
    Headers can have optional :PROPERTIES: drawers associated with them.
    These drawers contain key/value pairs like the header's id. The
    reader adds all listed pairs to the header's attributes; id and
    class attributes are handled specially to match the way Attr are
    defined. This also changes behavior of how drawers of unknown type
    are handled. Instead of including all unknown drawers, those are not
    read/exported, thereby matching current Emacs behavior.
  - Use CUSTOM_ID in properties. See above on Org writer changes.
  - Respect drawer export setting. The d export option can be used
    to control which drawers are exported and which are discarded.
    Basic support for this option is added here.
  - Ignore leading space in org code blocks (Emanuel Evans, #2862).
    Also fix up tab handling for leading whitespace in code blocks.
  - Support new syntax for export blocks. Org-mode version 9
    uses a new syntax for export blocks. Instead of #+BEGIN_<FORMAT>,
    where <FORMAT> is the format of the block's content, the new
    format uses #+BEGIN_export <FORMAT> instead. Both types are
    supported.
  - Refactor BEGIN...END block parsing.
  - Fix handling of whitespace in blocks, allowing content to be indented
    less than the block header.
  - Support org-ref style citations. The org-ref package is an
    org-mode extension commonly used to manage citations in org
    documents. Basic support for the cite:citeKey and
    [[cite:citeKey][prefix text::suffix text]] syntax is added.
  - Split code into separate modules, making for cleaner code and
    better decoupling.
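
As an illustration of the empty-cell fix listed above for the Org reader (#2616), the following Python sketch splits one table row into cells so that `|||` yields two empty cells. It is not the org reader's parser; `split_row` is a hypothetical helper, and treating the trailing pipe as optional is an assumption.

```python
from typing import List


def split_row(line: str) -> List[str]:
    """Split one Org table row into its cells, keeping empty cells."""
    body = line.strip()
    if not body.startswith("|"):
        raise ValueError("an Org table row must start with '|'")
    body = body[1:]          # drop the opening pipe
    if body.endswith("|"):
        body = body[:-1]     # drop the closing pipe, if present
    # Every remaining pipe separates two cells, so consecutive pipes
    # produce empty strings rather than a cell containing '|'.
    return [cell.strip() for cell in body.split("|")]
```

Under this model `|||` has one separator between the outer pipes, hence two empty cells, which matches the behaviour the fix describes.
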
- Added docbook5 template.
- --mathjax improvements:
  - Use new CommonHTML output for MathJax (updated default MathJax URL,
    #2858).
  - Change default mathjax setup to use TeX-AMS_CHTML configuration.
    This is designed for cases where the input is always TeX and maximal
    conformity with TeX is desired. It seems to be smaller and load faster
    than what we used before. See #2858.
  - Load the full MathJax config to maximize loading speed (KolenCheung).
- Bumped upper version bounds to allow use of latest packages
  and compilation with ghc 8.
- Require texmath 0.8.6.2. Closes several texmath-related bugs (#2775,
  #2310, #2824). This fixes behavior of roots, e.g.
  \sqrt[3]{x}, and issues with sub/superscript positioning
  and matrix column alignment in docx.
- README:
  - Clarified documentation of implicit_header_references ([#2904](https://redirect.github.com/jgm/pandoc/issue

Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Enabled.

♻ Rebasing: Whenever PR is behind base branch, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.

This PR was generated by Mend Renovate. View the repository job log.

github-actions · 2024-11-15T20:14:48Z

🎉 This PR is included in version 1.0.1 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

update: add pandoc version v1.19.2.1

e2e6ea5

renovate bot enabled auto-merge (squash) November 15, 2024 20:09

twangboy approved these changes Nov 15, 2024

renovate bot merged commit af94584 into master Nov 15, 2024
14 checks passed

github-actions bot added the released label Nov 15, 2024

renovate bot deleted the renovate/pandoc-1.x branch November 15, 2024 21:32
