update: add pandoc version v1.19.2.1 #2234

Merged
merged 1 commit into from
Nov 15, 2024

renovate[bot]
Contributor

@renovate renovate bot commented Nov 15, 2024

This PR contains the following updates:

Package Update Change
pandoc minor 1.17.0.2 -> 1.19.2.1

Release Notes

jgm/pandoc (pandoc)

v1.19.2.1

Compare Source

  • Require skylighting >= 0.1.1.4.
    • Adjust test output for skylighting version.
    • Relax upper bounds on blaze-html and blaze-markup.

v1.19.2

Compare Source

  • Add dependencies on texmath and skylighting to the executable.
    This is needed for dependency version numbers to be available,
    with Cabal > 2.

v1.19.1

Compare Source

  • Set PANDOC_VERSION environment variable for filters (#​2640).
    This allows filters to check the pandoc version that produced
    the JSON they are receiving.

    • Docx reader: Ensure one-row tables don't have header (#​3285,
      Jesse Rosenthal). Tables in MS Word are set by default to have
      special first-row formatting, which pandoc uses to determine whether
      or not they have a header. This means that one-row tables will, by
      default, have only a header -- which we imagine is not what people
      want. This change ensures that a one-row table is not understood to
      be a header only. Note that this means that it is impossible to
      produce a header-only table from docx, even though it is legal
      pandoc. But we believe that in nearly all cases, it will be an
      accidental (and unwelcome) result.

    • HTML reader:

      • Fixed some bad regressions in HTML table parser (#3280).
        These regressions led to the introduction of empty rows
        in some circumstances.
      • Understand style=width: as well as width in col (#​3286).
    • RST reader:

      • Print warnings when keys, substitutions, or notes are not found.
        Previously the parsers failed and we got raw text. Now we get a
        link with an empty URL, or empty inlines in the case of a note or
        substitution.

      • Fix hyperlink aliases (#​3283).

    • Man writer: Ensure that periods are escaped at beginning of line
      (#​3270).

    • LaTeX writer: Fix unnumbered headers when used with --top-level
      (#​3272, Albert Krewinkel). Fix interaction of top-level
      divisions part or chapter with unnumbered headers when
      emitting LaTeX. Headers are ensured to be written using
      starred commands (like \subsection*{}).

    • LaTeX template: use comma not semicolon to separate keywords for
      pdfkeywords. Thanks to Wandmalfarbe.

    • Markdown writer: Fixed incorrect word wrapping (#​3277).
      Previously pandoc would sometimes wrap lines too early due to
      this bug.

    • Text.Pandoc.Pretty: Added afterBreak [API change]. This makes it
      possible to insert escape codes for content that needs escaping at the
      beginning of a line.

    • Removed old MathMLInHTML.js from 2004, which should no longer
      be needed for MathML with modern browsers.

    • Fixed tests with dynamic linking (#​2709).

    • Makefile: Use stack instead of cabal for targets. This is just
      a convenience for developers.

    • Fixed bash completion of filenames with space (#​2749).

    • MANUAL: improved documentation on how to create a custom
      reference.docx.

    • Fix minor spelling typos in the manual (#​3273, Anthony Geoghegan)
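
As a rough illustration of the PANDOC_VERSION entry in the v1.19.1 notes above, a filter written in Python could read the variable along these lines. The helper name pandoc_version is hypothetical and not part of pandoc or any filter library; only the environment variable name comes from the release notes.

```python
import os


def pandoc_version(environ=None):
    """Return the pandoc version as a tuple of ints, or None if unavailable.

    `environ` defaults to os.environ; passing a dict makes the helper easy
    to test. This helper is a hypothetical sketch, not a pandoc API.
    """
    env = os.environ if environ is None else environ
    raw = env.get("PANDOC_VERSION")
    if not raw:
        # Older pandoc versions do not set the variable at all.
        return None
    try:
        return tuple(int(part) for part in raw.split("."))
    except ValueError:
        # Unexpected format: treat it as unknown rather than guessing.
        return None
```

A filter could then compare the tuple against a threshold, for example `pandoc_version() >= (1, 18)`, after checking that the result is not None.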

v1.19

Compare Source

  • Add dependencies on texmath and skylighting to the executable.
    This is needed for dependency version numbers to be available,
    with Cabal > 2.

v1.18

Compare Source

  • Added --list-input-formats, --list-output-formats,
    --list-extensions, --list-highlight-languages, and
    --list-highlight-styles (#​3173). Removed list of highlighting
    languages from --version output. Removed list of input and output
    formats from default --help output.

    • Added --reference-location=block|section|document option
      (Jesse Rosenthal). This determines whether Markdown link references
      and footnotes are placed at the end of the document, the end of the
      section, or the end of the top-level block.

    • Added --top-level-division=section|chapter|part (Albert Krewinkel).
      This determines what a level-1 header corresponds to in LaTeX,
      ConTeXt, DocBook, and TEI output. The default is section.
      The --chapters option has been deprecated in favor of
      --top-level-division=chapter.

    • Added LineBlock constructor for Block (Albert Krewinkel). This
      is now used in parsing RST and Markdown line blocks, DocBook
      linegroup/line combinations, and Org-mode VERSE blocks.
      Previously Para blocks with hard linebreaks were used. LineBlocks
      are handled specially in the following output formats: AsciiDoc
      (as [verse] blocks), ConTeXt (\startlines/\endlines),
      HTML (div with a style), Markdown (line blocks if line_blocks
      is enabled), Org-mode (VERSE blocks), RST (line blocks). In
      other output formats, a paragraph with hard linebreaks is emitted.

    • Allow binary formats to be written to stdout (but not to tty) (#​2677).
      Only works on posix, since we use the unix library to check whether
      output is to tty. On Windows, pandoc works as before and always requires
      an output file parameter for binary formats.

    • Changed JSON output format (Jesse Rosenthal). Previously we used
      generically generated JSON, but this was subject to change depending
      on the version of aeson pandoc was compiled with. To ensure stability,
      we switched to using manually written ToJSON and FromJSON
      instances, and encoding the API version. Note: pandoc filter
      libraries will need to be revised to handle the format change.
      Here is a summary of the essential changes:

      • The toplevel JSON format is now {"pandoc-api-version" : [MAJ, MIN, REV], "meta" : META, "blocks": BLOCKS}
        instead of [{"unMeta": META}, [BLOCKS]].
        Decoding fails if the major and minor version numbers don't
        match.
      • Leaf nodes no longer have an empty array for their "c" value.
        Thus, for example, a Space is encoded as {"t":"Space"}
        rather than {"t":"Space","c":[]} as before.
    • Removed tests/Tests/Arbitrary.hs and added a Text.Pandoc.Arbitrary
      module to pandoc-types (Jesse Rosenthal). This makes it easier
      to use QuickCheck with pandoc types outside of pandoc itself.

    • Add bracketed_spans Markdown extension, enabled by default
      in pandoc markdown. This allows you to create a native span
      using this syntax: [Here is my span]{#id .class key="val"}.

    • Added angle_brackets_escapable Markdown extension (#​2846).
      This is needed because github flavored Markdown has a slightly
      different set of escapable symbols than original Markdown;
      it includes angle brackets.

    • Export Text.Pandoc.Error in Text.Pandoc [API change].

    • Print highlighting-kate version in --version.

    • Text.Pandoc.Options:

      • Extension has new constructors Ext_bracketed_spans and
        Ext_angle_brackets_escapable [API change].
      • Added ReferenceLocation type [API change] (Jesse Rosenthal).
      • Added writerReferenceLocation field to WriterOptions (Jesse
        Rosenthal).
    • --filter: we now check $DATADIR/filters for filters before
      looking in the path (#​3127, Jesse Rosenthal, thanks to Jakob
      Voß for the idea). Filters placed in this directory need not
      be executable; if the extension is .hs, .php, .pl, .js,
      or .rb, pandoc will run the right interpreter.

    • For --webtex, replace deprecated Google Chart API by CodeCogs as
      default (Kolen Cheung).

    • Removed raw_tex extension from markdown_mmd defaults (Kolen Cheung).

    • Execute .js filters with node (Jakob Voß).

    • Textile reader:

      • Support bc.. extended code blocks (#​3037). Also, remove trailing
        newline in code blocks (consistently with Markdown reader).
      • Improve table parsing. We now handle cell and row attributes, mostly
        by skipping them. However, alignments are now handled properly.
        Since in pandoc alignment is per-column, not per-cell, we
        try to divine column alignments from cell alignments.
        Table captions are also now parsed, and textile indicators
        for thead and tfoot no longer cause parse failure. (However,
        a row designated as tfoot will just be a regular row in pandoc.)
      • Improve definition list parsing. We now allow multiple terms
        (which we concatenate with linebreaks). An exponential parsing
        bug (#​3020) is also fixed.
      • Disallow empty URL in explicit link (#​3036).
    • RST reader:

      • Use Div instead of BlockQuote for admonitions (#​3031).
        The Div has class admonition and (if relevant) one of the
        following: attention, caution, danger, error, hint,
        important, note, tip, warning. Note: This will change
        the rendering of some RST documents! The word ("Warning", "Attention",
        etc.) is no longer added; that must be done with CSS or a filter.
      • A Div is now used for sidebar as well.
      • Skip whitespace before note (Jesse Rosenthal, #​3163). RST requires a
        space before a footnote marker. We discard those spaces so that footnotes
        will be adjacent to the text that comes before them. This is in line with
        what rst2latex does.
      • Allow empty lines when parsing line blocks (Albert Krewinkel).
    • Markdown reader:

      • Allow empty lines when parsing line blocks (Albert Krewinkel).
      • Allow attributes on autolinks (#​3183, Daniele D'Orazio).
    • LaTeX reader:

      • More robust parsing of unknown environments (#​3026).
        We no longer fail on things like ^ inside options for tikz.
      • Be more forgiving of non-standard characters, e.g. ^ outside of math.
        Some custom environments give these a meaning, so we should try not to
        fall over when we encounter them.
      • Drop duplicate * in bibtexKeyChars (Albert Krewinkel)
    • MediaWiki reader:

      • Fix for unquoted attribute values in mediawiki tables (#​3053).
        Previously an unquoted attribute value in a table row
        could cause parsing problems.
      • Improved treatment of verbatim constructions (#​3055).
        Previously these yielded strings of alternating Code and Space
        elements; we now incorporate the spaces into the Code. Emphasis
        etc. is still possible inside these.
      • Properly interpret XML tags in pre environments (#​3042). They are meant
        to be interpreted as literal text.
    • EPUB reader: don't add root path to data: URIs (#​3150).
      Thanks to @​lep for the bug report and patch.

    • Org reader (Albert Krewinkel):

      • Preserve indentation of verse lines (#​3064). Leading spaces in verse
        lines are converted to non-breaking spaces, so indentation is preserved.
      • Ensure image sources are proper links. Image sources, such as those in
        plain images, image links, or figures, must be proper URIs or relative file
        paths to be recognized as images. This restriction is now enforced
        for all image sources. This also fixes the reader's usage of uncleaned
        image sources, leading to file: prefixes not being deleted from
        figure images. Thanks to @​bsag for noticing this bug.
      • Trim verse lines properly (Albert Krewinkel).
      • Extract meta parsing code to module. Parsing of meta-data is well
        separable from other block parsing tasks. Moving into new module to
        get small files and clearly arranged code.
      • Read markup only for special meta keys. Most meta-keys should be read
        as normal string values, only a few are interpreted as marked-up text.
      • Allow multiple, comma-separated authors. Multiple authors can be
        specified in the #+AUTHOR meta line if they are given as a
        comma-separated list.
      • Give precedence to later meta lines. The last meta-line of any given
        type is the significant line. Previously the value of the first line
        was kept, even if more lines of the same type were encountered.
      • Read LaTeX_header as header-includes. LaTeX-specific header commands
        can be defined in #+LaTeX_header lines. They are parsed as
        format-specific inlines to ensure that they will only show up in LaTeX
        output.
      • Set documentclass meta from LaTeX_class.
      • Set classoption meta from LaTeX_class_options.
      • Read HTML_head as header-includes. HTML-specific head content can be
        defined in #+HTML_head lines. They are parsed as format-specific
        inlines to ensure that they will only show up in HTML output.
      • Respect author export option. The author option controls whether
        the author should be included in the final markup. Setting
        #+OPTIONS: author:nil will drop the author from the final meta-data
        output.
      • Respect email export option. The email option controls whether the
        email meta-field should be included in the final markup. Setting
        #+OPTIONS: email:nil will drop the email field from the final
        meta-data output.
      • Respect creator export option. The creator option controls whether
        the creator meta-field should be included in the final markup. Setting
        #+OPTIONS: creator:nil will drop the creator field from the final
        meta-data output. Org-mode recognizes the special value comment for
        this field, causing the creator to be included in a comment. This is
        difficult to translate to Pandoc internals and is hence interpreted the
        same as other truish values (i.e. the meta field is kept if it's
        present).
      • Respect unnumbered header property (#3095). Sections with the unnumbered
        property should, as the name implies, be excluded from the automatic
        numbering of sections provided by some output formats. The Pandoc
        convention for this is to add an "unnumbered" class to the header. The
        reader treats properties as key-value pairs per default, so a special
        case is added to translate the above property to a class instead.
      • Allow figure with empty caption (Albert Krewinkel, #​3161).
        A #+CAPTION attribute before an image is enough to turn an image into
        a figure. This wasn't the case because the parseFromString function,
        which processes the caption value, would fail on empty values. Adding
        a newline character to the caption value fixes this.
    • Docx reader:

      • Use XML convenience functions (Jesse Rosenthal).
        The functions isElem and elemName (defined in Docx/Util.hs) make
        the code a lot cleaner than the original XML.Light functions, but they
        had been used inconsistently. This puts them in wherever applicable.
      • Handle anchor spans with content in headers. Previously, we would only
        be able to figure out internal links to a header in a docx if the
        anchor span was empty. We change that to read the inlines out of the
        first anchor span in a header.
      • Let headers use existing id. Previously we always generated an id for
        headers (since they wouldn't bring one from Docx). Now we let it use an
        existing one if possible. This should allow us to recurse through anchor
        spans.
      • Use all anchor spans for header ids. Previously we only used the first
        anchor span to affect header ids. This allows us to use all the anchor
        spans in a header, whether they're nested or not (#​3088).
      • Test for nested anchor spans in header. This ensures that anchor spans
        in header with content (or with other anchor spans inside) will resolve
        to links to a header id properly.
    • ODT reader (Hubert Plociniczak)

      • Include list's starting value. Previously the starting value of
        list items was hardcoded to 1. In reality ODT's list
        style definition can provide a new starting value in one of its
        attributes.
      • Infer caption from the text following the image.
        Frame can contain other frames with the text boxes.
      • Add fig: to title for Image with a caption (as expected
        by pandoc's writers).
      • Basic support for images in ODT documents.
      • Don't duplicate text for anchors (#​3143). When creating an anchor
        element we were adding its representation as well as the original
        content, leading to text duplication.
    • DocBook writer:

      • Include an anchor element when a div or span has an id (#​3102).
        Note that DocBook does not have a class attribute, but at least this
        provides an anchor for internal links.
    • LaTeX writer:

      • Don't use * for unnumbered paragraph, subparagraph. The starred
        variants don't exist. This helps with part of #​3058...it gets rid of
        the spurious *s. But we still have numbers on the 4th and 5th level
        headers.
      • Properly escape backticks in verbatim (#​3121, Jesse Rosenthal).
        Otherwise they can cause unintended ligatures like ?`.
      • Convert NARROW NO-BREAK SPACE to \, in LaTeX output (Vaclav Zeman).
      • Don't include [htbp] placement for figures (#​3103, Václav Haisman).
        This allows figure placement defaults to be changed by the user
        in the template.
    • TEI writer: remove heuristic to detect book template (Albert Krewinkel).
      TEI doesn't have <book> elements but only generic <divN> division
      elements. Checking the template for a trailing </book> is nonsensical.

    • MediaWiki writer: transform filename with underscores in images (#​3052).
      foo bar.jpg becomes foo_bar.jpg. This was already done
      for internal links, but it also needs to happen for images.

    • ICML writer: replace partial function (!!) in table handling (#​3175,
      Mauro Bieg).

    • Man writer: allow section numbers that are not a single digit (#​3089).

    • AsciiDoc writer: avoid unnecessary use of "unconstrained" emphasis
      (#​3068). In AsciiDoc, you must use a special form of emphasis
      (double __) for intraword emphasis. Pandoc was previously using
      this more than necessary.

    • EPUB writer: use stringify instead of plain writer for metadata
      (#​3066). This means that underscores won't be used for emphasis,
      or CAPS for bold. The metadata fields will just have unadorned
      text.

    • Docx Writer:

      • Implement user-defined styles (Jesse Rosenthal). Divs and Spans
        with a custom-style key in the attributes will apply the corresponding
        key to the contained blocks or inlines.
      • Add ReaderT env to the docx writer (Jesse Rosenthal).
      • Clean up and streamline RTL behavior (Jesse Rosenthal, #​3140).
        You can set dir: rtl in YAML metadata, or use -M dir=rtl
        on the command line. For finer-grained control, you can set
        the dir attribute in Div or Span elements.
    • Org writer (Albert Krewinkel):

      • Remove blank line after figure caption. Org-mode only treats an image
        as a figure if it is directly preceded by a caption.
      • Ensure blank line after figure. An Org-mode figure should be surrounded
        by blank lines. The figure would be recognized regardless, but images
        in the following line would unintentionally be treated as figures as
        well.
      • Ensure link targets are paths or URLs. Org-mode treats links as
        document internal searches unless the link target looks like a URL or
        file path, either relative or absolute. This change ensures that this
        is always the case.
      • Translate language identifiers. Pandoc and Org-mode use different
        programming language identifiers. An additional translation between
        those identifiers is added to avoid unexpected behavior. This fixes a
        problem where language specific source code would sometimes be output
        as example code.
      • Drop space before footnote markers (Albert Krewinkel, #​3162).
        The writer no longer adds an extra space before footnote markers.
    • Markdown writer:

      • Don't emit HTML for tables unless raw_html extension is set (#​3154).
        Emit [TABLE] if no suitable table formats are enabled and raw HTML
        is disabled.
      • Check for the raw_html extension before emitting a raw HTML block.
      • Abstract out note/ref function (Jesse Rosenthal).
      • Add ReaderT monad for environment variables (Jesse Rosenthal).
    • HTML, EPUB, slidy, revealjs templates: Use <p> instead of <h1> for
      subtitle, author, date (#​3119). Note that, as a result of this change,
      authors may need to update CSS.

    • revealjs template: Added notes-server option
      (jgm/pandoc-templates#212, Yoan Blanc).

    • Beamer template:

      • Restore whitespace between paragraphs. This was
        a regression in the last release (jgm/pandoc-templates#207).
      • Added themeoptions variable (Carsten Gips).
      • Added beamerarticle variable. This causes the beamerarticle
        package to be loaded in beamer, to produce an article from beamer
        slides. (Carsten Gips)
      • Added support for fontfamilies structured variable
        (Artem Klevtsov).
      • Added hypersetup options (Jake Zimmerman).
    • LaTeX template:

      • Added dummy definition for \institute.
        This isn't a standard command, and we want to avoid a crash when
        institute is used with the default template.
      • Define default figure placement (Václav Haisman), since pandoc
        no longer includes [htbp] for figures. Users with custom templates
        will want to add this. See #​3103.
      • Use footnote package to fix notes in tables (jgm/pandoc-templates#208,
        Václav Haisman).
    • Moved template compiling/rendering code to a separate library,
      doctemplates. This allows the pandoc templating system to be
      used independently.

    • Text.Pandoc.Error: Fix out of index error in handleError
      (Matthew Pickering). The fix is to not try to show the exact line when
      it would cause an out-of-bounds error as a result of included files.

    • Text.Pandoc.Shared: Add linesToBlock function (Albert Krewinkel).

    • Text.Pandoc.Parsing.emailAddress: tighten up parsing of email
      addresses. Technically **@user is a valid email address, but if we
      allow things like this, we get bad results in markdown flavors
      that autolink raw email addresses (see #​2940). So we exclude a few
      valid email addresses in order to avoid these more common bad cases.

    • Text.Pandoc.PDF: Don't crash with nonexistent image (#​3100). Instead,
      emit the alt text, emphasized. This accords with what the ODT writer
      currently does. The user will still get a warning about a nonexistent
      image.

    • Fix example in API documentation (#​3176, Thomas Weißschuh).

    • Tell where to get tarball in INSTALL (#​3062).

    • Rename README to MANUAL.txt and add GitHub-friendly README.md
      (Albert Krewinkel, Kolen Cheung).

    • Replace COPYING with Markdown version COPYING.md from GNU (Kolen Cheung).

    • MANUAL.txt:

      • Put note on structured vars in separate paragraph (#​2148, Albert
        Krewinkel). Make it clearer that structured author variables require a
        custom template.
      • Note that --katex works best with html5 (#​3077).
      • Fix the LaTeX and EPUB links in manual (Morton Fox).
      • Document biblio-title variable.
    • Improve spacing of footnotes in --help output (Waldir Pimenta).

    • Update KaTeX to v0.6.0 (Kolen Cheung).

    • Allow latest dependencies.

    • Use texmath 0.8.6.6 (#​3040).

    • Allow http-client 0.4.30, which is the version in stackage lts.
      Previously we required 0.5.
      Remove CPP conditionals for earlier versions.

    • Remove support for GHC < 7.8 (Jesse Rosenthal).

      • Remove Compat.Monoid.
      • Remove an inline monad compatibility macro.
      • Remove Text.Pandoc.Compat.Except.
      • Remove directory compat.
      • Change constraint on mtl.
      • Remove unnecessary CPP condition in UTF8.
      • Bump base lower bound to 4.7.
      • Remove 7.6 build from .travis.yaml.
      • Bump supported ghc version in CONTRIBUTING.md.
      • Add note about GHC version support to INSTALL.
      • Remove GHC 7.6 from list of tested versions (Albert Krewinkel).
      • Remove TagSoup compat.
      • Add EOL note to time compat module. Because time 1.4 is a boot library
        for GHC 7.8, we will support the compatibility module as long as we
        support 7.8. But we should be clear about when we will no longer need
        it.
      • Remove blaze-html CPP conditional.
      • Remove unnecessary CPP in custom Prelude.
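
The JSON format change described in the v1.18 notes above means filter code may have to cope with both layouts. The following is a minimal sketch, assuming the document has already been decoded from JSON. The function names split_document and node_content are hypothetical and do not belong to pandoc or any filter library; only the two top-level layouts and the leaf-node rule are taken from the notes.

```python
def split_document(doc):
    """Return (api_version, meta, blocks) for either top-level layout.

    api_version is None for the old layout, which carried no version.
    Hypothetical helper for illustration only.
    """
    # New layout: {"pandoc-api-version": [...], "meta": META, "blocks": BLOCKS}
    if isinstance(doc, dict) and "pandoc-api-version" in doc:
        return tuple(doc["pandoc-api-version"]), doc["meta"], doc["blocks"]
    # Old layout: [{"unMeta": META}, [BLOCKS]]
    if (isinstance(doc, list) and len(doc) == 2
            and isinstance(doc[0], dict) and "unMeta" in doc[0]):
        return None, doc[0]["unMeta"], doc[1]
    raise ValueError("unrecognized pandoc JSON layout")


def node_content(node):
    """Return a node's "c" value, treating a missing key as an empty list.

    Leaf nodes such as {"t": "Space"} no longer carry "c" in the new
    format, while the old format encoded them with "c": [].
    """
    return node.get("c", [])
```

Treating a missing "c" as an empty list lets the same traversal code handle a Space node from either format.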

v1.17.2

Compare Source

  • Added Zim Wiki writer, template and tests. zimwiki is now
    a valid output format. (Alex Ivkin)

    • Changed email-obfuscation default to no obfuscation (#​2988).

      • writerEmailObfuscation in defaultWriterOptions is now
        NoObfuscation.
      • the default for the command-line --email-obfuscation option is
        now none.
    • Docbook writer: Declare xlink namespace in Docbook5 output (Ivo Clarysse).

    • Org writer:

      • Support arbitrary raw inlines (Albert Krewinkel).
        Org mode allows arbitrary raw inlines ("export snippets" in Emacs
        parlance) to be included as @@format:raw foreign format text@@.
      • Improve Div handling (Albert Krewinkel). Div blocks handling is
        changed to make the output look more like idiomatic org mode:
        • Div-wrapped content is output as-is if the div's attribute is the
          null attribute.
        • Div containers with an id but neither classes nor key-value pairs
          are unwrapped and the id is added as an anchor.
        • Divs with classes associated with greater block elements are
          wrapped in a #+BEGIN...#+END block.
        • The old behavior for Divs with more complex attributes is kept.
    • HTML writer: Better support for raw LaTeX environments (#​2758).
      Previously we just passed all raw TeX through when MathJax
      was used for HTML math. This passed through too much.
      With this patch, only raw LaTeX environments that MathJax
      can handle get passed through.
      This patch also causes raw LaTeX environments to be treated
      as math, when possible, with MathML and WebTeX output.

    • Markdown writer: use raw HTML for simple, pipe tables with linebreaks
      (#​2993). Markdown line breaks involve a newline, and simple and pipe
      tables can't contain one.

    • Make --webtex work with the Markdown writer (#​1177).
      This is a convenient option for people using
      websites whose Markdown flavors don't provide for math.

    • Docx writer:

      • Set paragraph to FirstPara after display math (Jesse Rosenthal).
        We treat display math like block quotes, and apply FirstParagraph style
        to paragraphs that follow them. These can be styled as the user
        wishes. (But, when the user is using indentation, this allows for
        paragraphs to continue after display math without indentation.)
      • Use actual creation time as doc prop (Jesse Rosenthal).
        Previously, we had used the user-supplied date, if available, for Word's
        document creation metadata. This could lead to weird results, as in
        cases where the user post-dates a document (so the modification might be
        prior to the creation). Here we use the actual computer time to set the
        document creation.
    • LaTeX writer:

      • Don't URI-escape image source (#​2825). Usually this is a local file,
        and replacing spaces with %20 ruins things.
      • Allow 'standout' as a beamer frame option (#​3007).
        ## Slide title {.standout}.
    • RST reader: Fixed links with no explicit link text. The link
      `<foo>`_ should have foo as both its link text and its URL.
      See RST spec at http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html#embedded-uris-and-aliases
      Closes Debian #​828167 -- reported by Christian Heller.

    • Textile reader:

      • Fixed attributes (#​2984). Attributes can't be followed by
        a space. So, _(class)emph_ but _(noclass) emph_.
      • Fixed exponential parsing bug (#​3020).
      • Fix overly aggressive interpretation as images (#​2998).
        Spaces are not allowed in the image URL in textile.
    • LaTeX reader:

      • Fix \cite so it is a NormalCitation not AuthorInText.
      • Strip off double quotes around image source if present (#​2825).
        Avoids interpreting these as part of the literal filename.
    • Org reader:

      • Add semicolon to list of special chars (Albert Krewinkel)
        Semicolons are used as special characters in citations syntax. This
        ensures the correct parsing of Pandoc-style citations:
        [prefix; @key; suffix]. Previously, parsing would have failed unless
        there was a space or other special character as the last character.
      • Add support for "Berkeley-style" cites (Albert Krewinkel, #​1978).
        A specification for an official Org-mode citation syntax was drafted by
        Richard Lawrence and enhanced with the help of others on the orgmode
        mailing list. Basic support for this citation style is added to the
        reader.
      • Support arbitrary raw inlines (Albert Krewinkel).
        Org mode allows arbitrary raw inlines ("export snippets" in Emacs
        parlance) to be included as @@format:raw foreign format text@@.
      • Remove partial functions (Albert Krewinkel, #​2991).
        Partial functions like head lead to avoidable errors and should be
        avoided. They are replaced with total functions.
      • Support figure labels (Albert Krewinkel, #​2496, #​2999).
        Figure labels given as #+LABEL: thelabel are used as the ID of the
        respective image. This allows e.g. the LaTeX to add proper \label
        markup.
      • Improve tag and properties type safety (Albert Krewinkel).
        Specific newtype definitions are used to replace stringly typing of tags
        and properties. Type safety is increased while readability is improved.
      • Parse as headlines, convert to blocks (Albert Krewinkel).
        Emacs org-mode is based on outline-mode, which treats documents as trees
        with headlines as nodes. The reader is refactored to parse into a
        similar tree structure. This simplifies transformations acting on
        document (sub-)trees.
      • Refactor comment tree handling (Albert Krewinkel).
        Comment trees were handled after parsing, as pattern matching on lists
        is easier than matching on sequences. The new method of reading
        documents as trees allows for more elegant subtree removal.
      • Support archived trees export options (Albert Krewinkel).
        Handling of archived trees can be modified using the arch option.
        Archived trees are either dropped, exported completely, or collapsed to
        include just the header when the arch option is nil, non-nil, or
        headline, respectively.
      • Put export setting parser into module (Albert Krewinkel).
        Export option parsing is distinct enough from general block parsing to
        justify putting it into a separate module.
      • Support headline levels export setting (Albert Krewinkel).
        The depths of headlines can be modified using the H option. Deeper
        headlines will be converted to lists.
      • Replace ugly code with view pattern (Albert Krewinkel).
        Some less-than-smart code required a pragma switching off overlapping
        pattern warnings in order to compile seamlessly. Using view patterns
        makes the code easier to read and also doesn't require overlapping
        pattern checks to be disabled.
      • Fix parsing of verbatim inlines (Albert Krewinkel, #​3016).
        Org rules for allowed characters before or after markup chars were not
        checked for verbatim text. This resulted in wrong parsing outcomes
        if the verbatim text contained e.g. space-enclosed markup characters as
        part of the text (=is_substr = True=). Forcing the parser to update
        the positions of allowed/forbidden markup border characters fixes this.
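
The archived-trees setting described in the Org reader notes above can be illustrated with a small Python sketch. This is not pandoc's Haskell implementation: the `Headline` class and `export_headline` function are hypothetical names, and the sketch assumes archived trees are identified by an `ARCHIVE` tag on the headline.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Headline:
    """Hypothetical stand-in for a parsed Org headline (not pandoc's type)."""
    title: str
    tags: List[str] = field(default_factory=list)
    body: List[str] = field(default_factory=list)
    children: List["Headline"] = field(default_factory=list)


def export_headline(h: Headline, arch: str) -> Optional[Headline]:
    """Apply the arch export setting to one headline and its subtree."""
    if "ARCHIVE" in h.tags:  # assumption: archived trees carry this tag
        if arch == "nil":
            return None  # archived tree is dropped
        if arch == "headline":
            return Headline(h.title, h.tags)  # collapsed to just the header
        # any other (non-nil) value: the tree is exported completely
    kept = []
    for child in h.children:
        exported = export_headline(child, arch)
        if exported is not None:
            kept.append(exported)
    return Headline(h.title, h.tags, h.body, kept)
```

Working on a tree rather than a flat block list is what makes the subtree removal a simple recursive filter.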
    • LaTeX template: fix for obscure hyperref/xelatex issue.
      Here's a minimal case:

      \documentclass[]{article}
      \usepackage{hyperref}
      \begin{document}
      \section{\%á}
      \end{document}
      

      Without this change, this fails on the second invocation of xelatex.
      This affects inputs like # %á with pdf output via xelatex.

    • trypandoc: call results 'html' instead of 'result'.
      This is for better compatibility with babelmark2.

    • Document MultiMarkdown as input/output format (Albert Krewinkel, #​2973).
      MultiMarkdown was only mentioned as a supported Markdown dialect but not
      as a possible input or output format. A brief mention is added
      everywhere the other supported markdown dialects are mentioned.

    • Document Org mode as a format containing raw HTML (Albert Krewinkel)
      Raw HTML is kept when the output format is Emacs Org mode.

    • Implement RawInline and RawBlock in sample lua custom writer (#​2985).

    • Text.Pandoc.Shared:

      • Introduce blocksToInlines function (Jesse Rosenthal).
        This is a lossy function for converting [Block] -> [Inline]. Its main
        use, at the moment, is for docx comments, which can contain arbitrary
        blocks (except for footnotes), but which will be converted to spans.
        This is, at the moment, pretty useless for everything but the basic
        Para and Plain comments. It can be improved, but the docx reader
        should probably emit a warning if the comment contains more than this.
      • Add BlockQuote to blocksToInlines (Jesse Rosenthal).
      • Add further formats for normalizeDate (Jesse Rosenthal).
        We want to avoid illegal dates -- in particular years with greater than
        four digits. We attempt to parse series of digits first as %Y%m%d, then
        %Y%m, and finally %Y.
      • normalizeDate should reject illegal years (Jesse Rosenthal).
        We only allow years between 1601 and 9999, inclusive. ISO 8601
        actually says that years are supposed to start with 1583, but MS Word
        only allows 1601-9999. This should prevent corrupted Word files when
        the date is out of that range or is parsed incorrectly.
      • Improve year sanity check in normalizeDate (Jesse Rosenthal).
        Previously we parsed a list of dates, took the first one, and then
        tested its year range. That meant that if the first one failed, we
        returned nothing, regardless of what the others did. Now we test for
        sanity before running msum over the list of Maybe values. Anything
        failing the test will be Nothing, so will not be a candidate.
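
The normalizeDate behavior described above (try %Y%m%d, then %Y%m, then %Y, and reject years outside 1601-9999 before choosing a candidate) can be sketched in Python. This is an illustration under assumptions, not the Haskell code: `parse_sane` and `normalize_digits` are hypothetical names, only the all-digit formats are covered, and the YYYY-MM-DD output shape is assumed.

```python
from datetime import datetime
from typing import Optional

# Formats for all-digit input, tried in the order given in the notes.
DIGIT_FORMATS = ["%Y%m%d", "%Y%m", "%Y"]


def parse_sane(text: str, fmt: str) -> Optional[str]:
    """Parse with one format; return None on failure or an illegal year."""
    try:
        parsed = datetime.strptime(text, fmt)
    except ValueError:
        return None
    if not 1601 <= parsed.year <= 9999:
        return None  # sanity check happens per candidate, before selection
    return parsed.date().isoformat()


def normalize_digits(text: str) -> Optional[str]:
    """Return the first candidate that both parses and passes the year check."""
    for fmt in DIGIT_FORMATS:
        result = parse_sane(text, fmt)
        if result is not None:
            return result
    return None
```

Because each candidate is checked before the first success is picked, a candidate with an illegal year no longer blocks later formats from being tried.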
    • Docx reader:

      • Add simple comment functionality. (Jesse Rosenthal).
        This adds simple track-changes comment parsing to the docx reader. It is
        turned on with --track-changes=all. All comments are converted to
        inlines, which can list some information. In the future a warning will be
        added for comments with formatting that seems like it will be excessively
        denatured. Note that comments can extend across blocks. For that reason
        there are two spans: comment-start and comment-end. comment-start
        will contain the comment. comment-end will always be empty. The two
        will be associated by a numeric id.
      • Enable warnings in top-level reader (Jesse Rosenthal).
        Previously we had only allowed for warnings in the parser. Now we allow
        for them in the Docx.hs as well. The warnings are simply concatenated.
      • Add warning for advanced comment formatting. (Jesse Rosenthal).
        We can't guarantee we'll convert every comment correctly, though we'll
        do the best we can. This warns if the comment includes something other
        than Para or Plain.
      • Add tests for warnings. (Jesse Rosenthal).
      • Add tests for comments (Jesse Rosenthal).
        We test for comments, using all track-changes options. Note that we
        should only output comments if --track-changes=all. We also test for
        emitting warnings if there is complicated formatting.
    • README: update to include track-changes comments. (Jesse Rosenthal)

    • Improved Windows installer - don't ignore properties set on command-line.
      See #​2708. Needs testing to see if this resolves the issue.
      Thanks to @​nkalvi.

    • Process markdown extensions on command line in L->R order (#​2995).
      Previously they were processed, very unintuitively, in R->L
      order, so that markdown-tex_math_dollars+tex_math_dollars
      had tex_math_dollars disabled.
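
A minimal Python sketch of left-to-right processing, for illustration only: `parse_format_spec` is a hypothetical helper, not pandoc's option parser, and the default extension set is passed in by the caller.

```python
import re
from typing import Set, Tuple


def parse_format_spec(spec: str, defaults: Set[str]) -> Tuple[str, Set[str]]:
    """Split 'format+ext-ext...' and apply the modifiers left to right."""
    base = re.match(r"[^+-]*", spec).group(0)
    enabled = set(defaults)
    # findall yields (sign, name) pairs in the order they appear in spec
    for sign, name in re.findall(r"([+-])([^+-]+)", spec):
        if sign == "+":
            enabled.add(name)
        else:
            enabled.discard(name)
    return base, enabled
```

With this ordering the last modifier for an extension wins, so markdown-tex_math_dollars+tex_math_dollars ends with tex_math_dollars enabled.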

    • Added secnumdepth variable to LaTeX template (#​2920).

    • Include table of contents in README.html in Windows package.

    • Writers: treat SoftBreak as space for stripping (Jesse Rosenthal)
      In Writers.Shared, we strip leading and trailing spaces for display
      math. Since SoftBreaks are treated as spaces, we should strip those
      too.

    • beamer, latex templates: pass biblatexoptions directly in package load.
      This allows runtime options to be used. Fixes jgm/pandoc-citeproc#201

    • CPP workaround for deprecation of parseUrl in http-client.

    • Removed some redundant class constraints.

    • make_osx_package.sh - use OSX env variable.

    • Added winpkg target to Makefile. This downloads the windows package
      from appveyor and signs it using the key.

    • Document Org mode as a format containing raw TeX (Albert Krewinkel).
      Raw TeX is kept verbatim when the output format is Emacs Org mode.

    • Support math with haddock-library >= 1.4.

    • Removed -rtsopts from library stanza. It has no effect, and Hackage
      wouldn't accept the package.

    • Update library dependency versions.

v1.17.1

Compare Source

  • New output format: docbook5 (Ivo Clarysse).

    • Text.Pandoc.Options: Add writerDocBook5 to WriterOptions
      (API change).

    • Org writer:

      • Add :PROPERTIES: drawer support (Albert Krewinkel, #​1962).
        This allows header attributes to be added to org documents in the form
        of :PROPERTIES: drawers. All available attributes are stored as
        key/value pairs. This reflects the way the org reader handles
        :PROPERTIES: blocks.
      • Add drawer capability (Carlos Sosa). For the implementation of the
        Drawer element in the Org Writer, we make use of a generic Block
        container with attributes. The presence of a drawer class defines
        that the Div constructor is a drawer. The first class defines the
        drawer name to use. The key-value list in the attributes defines
        the keys to add inside the Drawer. Lastly, the list of Block elements
        contains miscellaneous block elements to add inside of the Drawer.
      • Use CUSTOM_ID in properties (Albert Krewinkel). The ID property is
        reserved for internal use by Org-mode and should not be used.
        The CUSTOM_ID property is to be used instead; it is converted to the
        ID property for certain export formats.
    • LaTeX writer:

      • Ignore --incremental unless output format is beamer (#​2843).
      • Fix polyglossia to babel env mapping (Mauro Bieg, #​2728).
        Allow for optional argument in square brackets.
      • Recognize la-x-classic as Classical Latin (Andrew Dunning).
        This allows one to access the hyphenation patterns in CTAN's
        hyph-utf8.
      • Add missing languages from hyph-utf8 (Andrew Dunning).
      • Improve use of \strut with \minipage inside tables
        (Jose Luis Duran). This improves spacing in multiline
        tables.
      • Use {} around options containing special chars (#​2892).
      • Avoid lazy foldl.
      • Don't escape underscore in labels (#​2921). Previously they were
        escaped as ux5f.
      • brazilian -> brazil for polyglossia (#​2953).
    • HTML writer: Ensure mathjax link is added when math appears in footnote
      (#​2881). Previously if a document only had math in a footnote, the
      MathJax link would not be added.

    • EPUB writer: set navpage variable on nav page.
      This allows templates to treat it differently.

    • DocBook writer:

      • Use docbook5 if writerDocbook5 is set (Ivo Clarysse).
      • Properly handle ulink/link (Ivo Clarysse).
    • EPUB reader:

      • Unescape URIs in spine (#​2924).
      • Normalise link id (Mauro Bieg).
    • Docx Reader:

      • Parse moveTo and moveFrom (Jesse Rosenthal).
        moveTo and moveFrom are track-changes tags that are used when a
        block of text is moved in the document. We now recognize these tags and
        treat them the same as insert and delete, respectively. So,
        --track-changes=accept will show the moved version, while
        --track-changes=reject will show the original version.
      • Tests for track-changes moving (Jesse Rosenthal).
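
The mapping of moved text onto insertions and deletions can be sketched as follows. This is a simplified Python illustration, not the docx reader's code: the `resolve` function and the flat list of (tag, text) runs are assumptions made for the example.

```python
from typing import List, Tuple

# moveTo is treated like an insertion and moveFrom like a deletion.
KIND = {"ins": "insert", "moveTo": "insert", "del": "delete", "moveFrom": "delete"}


def resolve(runs: List[Tuple[str, str]], mode: str) -> str:
    """Resolve tracked changes; tag is 'plain', 'ins', 'del', 'moveTo' or 'moveFrom'."""
    # accept keeps inserted text and drops deleted text; reject does the opposite
    keep = {"accept": "insert", "reject": "delete"}[mode]
    return "".join(
        text for tag, text in runs if tag == "plain" or KIND[tag] == keep
    )
```

For a block moved from the start of a paragraph to its end, `accept` yields the text in its new position and `reject` yields it in its original position.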
    • ODT, EPUB, Docx readers: throw PandocError on unzip failure
      (Jesse Rosenthal). Previously, readDocx, readEPUB, and readOdt
      would error out if zip-archive failed. We change the archive extraction
      step from toArchive to toArchiveOrFail, which returns an Either value.

    • Markdown, HTML readers: be more forgiving about unescaped & in
      HTML (#​2410). We are now more forgiving about parsing invalid HTML with
      unescaped & as raw HTML. (Previously any unescaped &
      would cause pandoc not to recognize the string as raw HTML.)

    • Markdown reader:

      • Fix pandoc title blocks with lines ending in 2 spaces (#​2799).
      • Added -s to markdown-reader-more test.
    • HTML reader: fixed bug in pClose. This caused exponential parsing
      behavior in documents with unclosed tags in dl, dd, dt.

    • MediaWiki reader: Allow spaces before ! in MediaWiki table header
      (roblabla).

    • RST reader: Support :class: option for code block in RST reader
      (Sidharth Kapur).

    • Org reader (all Albert Krewinkel, except where noted otherwise):

      • Stop padding short table rows.
        Emacs Org-mode doesn't add any padding to table rows. The first
        row (header or first body row) is used to determine the column count,
        no other magic is performed.
      • Refactor rows-to-table conversion. This refactors
        the code converting a list of table lines to an org table ADT.
        The old code was simplified and is now slightly less ugly.
      • Fix handling of empty table cells, rows (Albert Krewinkel, #​2616).
        This fixes Org mode parsing of some corner cases regarding empty cells
        and rows. Empty cells weren't parsed correctly, e.g. ||| should be
        two empty cells, but would be parsed as a single cell containing a pipe
        character. Empty rows were parsed as alignment rows and dropped from
        the output.
      • Fix spacing after LaTeX-style symbols.
        The org-reader was dropping space after unescaped LaTeX-style symbol
        commands: \ForAll \Auml resulted in ∀Ä but should give ∀ Ä
        instead. This seems to be because the LaTeX-reader treats the
        command-terminating space as part of the command. Dropping the trailing
        space from the symbol-command fixes this issue.
      • Print empty table rows. Empty table rows should not
        be dropped from the output, so row-height is always set to be at least 1.
      • Move parser state into separate module.
        The org reader code has become large and confusing. Extracting smaller
        parts into submodules should help to clean things up.
      • Add support for sub/superscript export options.
        Org-mode allows export settings to be specified via #+OPTIONS lines.
        Disabling simple sub- and superscripts is one of these export options;
        this option is now supported.
      • Support special strings export option. Parsing of special strings
        (like ... as ellipsis or -- as en dash) can be toggled using the -
        option.
      • Support emphasized text export option. Parsing of emphasized text can
        be toggled using the * option. This influences parsing of text marked
        as emphasized, strong, strikeout, and underline. Parsing of inline math,
        code, and verbatim text is not affected by this option.
      • Support smart quotes export option. Reading of smart quotes can be
        toggled using the ' option.
      • Parse but ignore export options. All known export options are parsed
        but ignored.
      • Refactor block attribute handling. A parser state attribute was used
        to keep track of block attributes defined in meta-lines. Global state
        is undesirable, so block attributes are no longer saved as part of the
        parser state. Old functions and the respective part of the parser state
        are removed.
      • Use custom anyLine. Additional state changes need to be made after
        a newline is parsed, otherwise markup may not be recognized correctly.
        This fixes a bug where markup after certain block-types would not be
        recognized.
      • Add support for ATTR_HTML attributes (#​1906).
        Arbitrary key-value pairs can be added to some block types using a
        #+ATTR_HTML line before the block. Emacs Org-mode only includes these
        when exporting to HTML, but since we cannot make this distinction here,
        the attributes are always added. The functionality is now supported
        for figures.
      • Add :PROPERTIES: drawer support (#​1877).
        Headers can have optional :PROPERTIES: drawers associated with them.
        These drawers contain key/value pairs like the header's id. The
        reader adds all listed pairs to the header's attributes; id and
        class attributes are handled specially to match the way Attr are
        defined. This also changes behavior of how drawers of unknown type
        are handled. Instead of including all unknown drawers, those are not
        read/exported, thereby matching current Emacs behavior.
      • Use CUSTOM_ID in properties. See above on Org writer changes.
      • Respect drawer export setting. The d export option can be used
        to control which drawers are exported and which are discarded.
        Basic support for this option is added here.
      • Ignore leading space in org code blocks (Emanuel Evans, #​2862).
        Also fix up tab handling for leading whitespace in code blocks.
      • Support new syntax for export blocks. Org-mode version 9
        uses a new syntax for export blocks. Instead of #+BEGIN_<FORMAT>,
        where <FORMAT> is the format of the block's content, the new
        format uses #+BEGIN_export <FORMAT> instead. Both types are
        supported.
      • Refactor BEGIN...END block parsing.
      • Fix handling of whitespace in blocks, allowing content to be indented
        less than the block header.
      • Support org-ref style citations. The org-ref package is an
        org-mode extension commonly used to manage citations in org
        documents. Basic support for the cite:citeKey and
        [[cite:citeKey][prefix text::suffix text]] syntax is added.
      • Split code into separate modules, making for cleaner code and
        better decoupling.
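
The two export-block syntaxes mentioned in the Org reader notes above (#+BEGIN_<FORMAT> and #+BEGIN_export <FORMAT>) can be told apart roughly as in this Python sketch. It is an illustration, not the reader's parser: `export_block_format` is a hypothetical helper, and the list of non-export block names is an assumption.

```python
import re
from typing import Optional

BEGIN_RE = re.compile(r"^\s*#\+BEGIN_(\w+)(?:\s+(\w+))?", re.IGNORECASE)

# Assumed set of block types that are not export blocks.
NON_EXPORT = {"src", "example", "quote", "verse", "center", "comment"}


def export_block_format(line: str) -> Optional[str]:
    """Return the export format named by a block header line, if any."""
    m = BEGIN_RE.match(line)
    if m is None:
        return None
    kind = m.group(1).lower()
    arg = m.group(2)
    if kind == "export":
        # new syntax: the format is the argument after BEGIN_export
        return arg.lower() if arg else None
    if kind in NON_EXPORT:
        return None
    # old syntax: the block name itself is the format
    return kind
```

Both `#+BEGIN_HTML` and `#+BEGIN_export html` map to the same format name here, which mirrors the statement that both types are supported.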
    • Added docbook5 template.

    • --mathjax improvements:

      • Use new CommonHTML output for MathJax (updated default MathJax URL,
        #​2858).
      • Change default mathjax setup to use TeX-AMS_CHTML configuration.
        This is designed for cases where the input is always TeX and maximal
        conformity with TeX is desired. It seems to be smaller and load faster
        than what we used before. See #​2858.
      • Load the full MathJax config to maximize loading speed (KolenCheung).
    • Bumped upper version bounds to allow use of latest packages
      and compilation with ghc 8.

    • Require texmath 0.8.6.2. Closes several texmath-related bugs (#​2775,
      #​2310, #​2310, #​2824). This fixes behavior of roots, e.g.
      \sqrt[3]{x}, and issues with sub/superscript positioning
      and matrix column alignment in docx.

    • README:


Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Enabled.

Rebasing: Whenever PR is behind base branch, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


  • If you want to rebase/retry this PR, check this box

This PR was generated by Mend Renovate. View the repository job log.

@renovate renovate bot enabled auto-merge (squash) November 15, 2024 20:09
@renovate renovate bot merged commit af94584 into master Nov 15, 2024
14 checks passed

🎉 This PR is included in version 1.0.1 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

@renovate renovate bot deleted the renovate/pandoc-1.x branch November 15, 2024 21:32