py-pdf · andersonhc · Nov 21, 2024 · May 13, 2024 · May 13, 2024 · May 14, 2024
@@ -1350,4 +1350,4 @@
   "contributorsPerLine": 7,
   "skipCi": true,
   "commitType": "docs"
-}
+}
@@ -21,6 +21,7 @@ This can also be enabled programmatically with `warnings.simplefilter('default',
 * new optional parameter `border` for table cells [issue #1192](https://github.com/py-pdf/fpdf2/issues/1192) users can define specific borders (left, right, top, bottom) for individual cells
 * [`FPDF.write_html()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.write_html): now parses `<title>` tags to set the [document title](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.set_title). By default, it is added as PDF metadata, but not rendered in the document body. However, this can be enabled by passing `render_title_tag=True` to `FPDF.write_html()`.
 * support for LZWDecode compression [issue #1271](https://github.com/py-pdf/fpdf2/issues/1271)
+* support for [page labels](https://py-pdf.github.io/fpdf2/PageLabels.html) and a [reference table of contents](https://py-pdf.github.io/fpdf2/DocumentOutlineAndTableOfContents.html) implementation
 ### Fixed
 * support for `align=` in [`FPDF.table()`](https://py-pdf.github.io/fpdf2/Tables.html#setting-table-column-widths). Due to this correction, tables are now properly horizontally aligned on the page by default. This was always specified in the documentation, but was not in effect until now. You can revert to have left-aligned tables by passing `align="LEFT"` to `FPDF.table()`.
 * `FPDF.set_text_shaping(False)` was broken since version 2.7.8 and is now working properly - [issue #1287](https://github.com/py-pdf/fpdf2/issues/1287)

@@ -1,39 +1,82 @@
-# Document outline & table of contents #
+# Document Outline & Table of Contents
 
-Quoting [Wikipedia](https://en.wikipedia.org/wiki/Table_of_contents), a **table of contents** is:
-> a list, usually found on a page before the start of a written work, of its chapter or section titles or brief descriptions with their commencing page numbers.
+## Overview
+
+This document explains how to implement and customize the Document Outline (also known as Bookmarks) and Table of Contents (ToC) features in `fpdf2`.
+
+---
 
-Now quoting the 6th edition of the PDF format reference (v1.7 - 2006) :
+## Document Outline (Bookmarks)
+
+Document outlines allow users to navigate quickly through sections in the PDF by creating a hierarchical structure of clickable links.
+
+Quoting the 6th edition of the PDF format reference (v1.7 - 2006) :
 > A PDF document may optionally display a **document outline** on the screen, allowing the user to navigate interactively
 > from one part of the document to another. The outline consists of a tree-structured hierarchy of outline items
 > (sometimes called bookmarks), which serve as a visual table of contents to display the document’s structure to the user.
 
 For example, there is how a document outline looks like in [Sumatra PDF Reader](https://www.sumatrapdfreader.org/free-pdf-reader.html):
 
-![](document-outline.png)
+![Document Outline Example](document-outline.png)
 
-Since `fpdf2.3.3`, both features are supported through the use of the [`start_section`](fpdf/fpdf.html#fpdf.fpdf.FPDF.start_section) method,
-that adds an entry in the internal "outline" table used to render both features.
+Since `fpdf2.3.3`, you can use the [`start_section`](fpdf/fpdf.html#fpdf.fpdf.FPDF.start_section) method to add entries in the internal "outline" table, which is used to render both the outline and ToC.
 
 Note that by default, calling `start_section` only records the current position in the PDF and renders nothing.
-However, you can configure **global title styles** by calling [`set_section_title_styles`](fpdf/fpdf.html#fpdf.fpdf.FPDF.set_section_title_styles),
-after which call to `start_section` will render titles visually using the styles defined.
+However, you can configure **global title styles** by calling [`set_section_title_styles`](fpdf/fpdf.html#fpdf.fpdf.FPDF.set_section_title_styles), after which calls to `start_section` will render titles visually using the styles defined.
+
+To provide a document outline to the PDF you generate, you just have to call the `start_section` method for every hierarchical section you want to define.
+
+### Nested outlines
+
+Outlines can be nested by specifying different levels. Higher-level outlines (e.g., level 0) appear at the top, while sub-levels (e.g., level 1, level 2) are indented.
+
+```python
+pdf.start_section(name="Chapter 1: Introduction", level=0)
+pdf.start_section(name="Section 1.1: Background", level=1)
+```
+
+---
+
+## Table of Contents
+
+Quoting [Wikipedia](https://en.wikipedia.org/wiki/Table_of_contents), a **table of contents** is:
+> a list, usually found on a page before the start of a written work, of its chapter or section titles or brief descriptions with their commencing page numbers.
+
+### Inserting a Table of Contents
+
+Use the [`insert_toc_placeholder`](fpdf/fpdf.html#fpdf.fpdf.FPDF.insert_toc_placeholder) method to define a placeholder for the ToC. A page break is triggered after inserting the ToC.
 
-To provide a document outline to the PDF you generate, you just have to call the `start_section` method
-for every hierarchical section you want to define.
+**Parameters:**
+- **render_toc_function**: Function called to render the ToC, receiving two parameters: `pdf`, an FPDF instance, and `outline`, a list of `fpdf.outline.OutlineSection`.
+- **pages**: The number of pages that the ToC will span, including the current one. A page break occurs for each page specified.
+- **allow_extra_pages**: If `True`, allows unlimited additional pages to be added to the ToC as needed. These extra ToC pages are initially created at the end of the document and then reordered when the final PDF is produced.
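
As a rough illustration of what a `render_toc_function` has to do with the `outline` it receives, the sketch below formats entries as indented text lines. It deliberately does not import `fpdf2`: `Entry` is a hypothetical stand-in that only assumes each outline entry exposes a name, a level and a page number, and `format_toc_lines` is an illustrative helper, not part of the library. A real callback would write each line to the `pdf` instance instead of returning strings.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    """Hypothetical stand-in for an outline entry (not fpdf2's class)."""

    name: str
    level: int
    page_number: int


def format_toc_lines(outline, width=40, indent=2):
    """Return one text line per entry: indented title, dot leaders, page number."""
    lines = []
    for entry in outline:
        # Deeper levels are shifted right by `indent` spaces per level.
        title = " " * (indent * entry.level) + entry.name
        page = str(entry.page_number)
        # Fill the gap between title and page number with dots (at least two).
        dots = "." * max(2, width - len(title) - len(page) - 2)
        lines.append(f"{title} {dots} {page}")
    return lines


outline = [
    Entry("Chapter 1: Introduction", level=0, page_number=2),
    Entry("Section 1.1: Background", level=1, page_number=3),
]
for line in format_toc_lines(outline):
    print(line)
```

Keeping the formatting separate from the drawing calls makes it possible to check the layout logic without generating a PDF.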
 
-If you also want to insert a table of contents somewhere,
-call [`insert_toc_placeholder`](fpdf/fpdf.html#fpdf.fpdf.FPDF.insert_toc_placeholder)
-wherever you want to put it.
-Note that a page break will always be triggered after inserting the table of contents.
+**Note**: Enabling `allow_extra_pages` may affect page numbering for headers or footers. Since extra ToC pages are added after the document content, they might cause page numbers to appear out of sequence. To maintain consistent numbering, use [Page Labels](PageLabels.md) to assign a specific numbering style to the ToC pages. When using Page Labels, any extra ToC pages will follow the numbering style of the first ToC page.
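
The reordering of extra ToC pages can be pictured with plain lists. This is only a conceptual sketch, not how `fpdf2` stores or moves pages internally: `move_extra_toc_pages` and the page names are hypothetical. It shows overflow ToC pages being created after the content and then moved next to the placeholder page, which is why a page number captured at creation time can look out of sequence in the final document.

```python
def move_extra_toc_pages(pages, toc_index, extra_count):
    """Move the last `extra_count` pages so they directly follow `pages[toc_index]`."""
    if extra_count == 0:
        return list(pages)
    body, extra = pages[:-extra_count], pages[-extra_count:]
    return body[: toc_index + 1] + extra + body[toc_index + 1 :]


# Creation order: the ToC placeholder is at index 1, two overflow pages come last.
created = ["cover", "toc-1", "chapter-1", "chapter-2", "toc-2", "toc-3"]
final = move_extra_toc_pages(created, toc_index=1, extra_count=2)
print(final)
```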
 
-## With HTML ##
+### Reference Implementation
 
-When using [`FPDF.write_html`](HTML.md), a document outline is automatically built.
-You can insert a table of content with the special `<toc>` tag.
+_New in [:octicons-tag-24: 2.8.2](https://github.com/py-pdf/fpdf2/blob/master/CHANGELOG.md)_
+
+The `fpdf.outline.TableOfContents` class provides a reference implementation of the ToC, which can be used as-is or subclassed.
+
+```python
+from fpdf import FPDF
+from fpdf.outline import TableOfContents
+
+pdf = FPDF()
+pdf.add_page()
+toc = TableOfContents()
+pdf.insert_toc_placeholder(toc.render_toc, allow_extra_pages=True)
+```
+
+---
+
+## Using Outlines and ToC with HTML
+
+When using [`FPDF.write_html`](HTML.md), a document outline is automatically generated, and a ToC can be added with the `<toc>` tag.
+
+To customize ToC styling, override the `render_toc` method in a subclass:
 
-Custom styling of the table of contents can be achieved by overriding the `render_toc` method
-in a subclass of `FPDF`:
 ```python
 from fpdf import FPDF, HTML2FPDF
 
@@ -59,7 +102,9 @@ pdf.write_html("""<toc></toc>
 pdf.output("html_toc.pdf")
 ```
 
-## Code samples ##
+---
+
+## Additional Code Samples
 
 The regression tests are a good place to find code samples.
 

@@ -0,0 +1,103 @@
+# Page Labels
+
+_New in [:octicons-tag-24: 2.8.2](https://github.com/py-pdf/fpdf2/blob/master/CHANGELOG.md)_
+
+## Overview
+
+In a PDF document, each page is identified by an integer page index, representing the page's position within the document. Optionally, a document can also define **page labels** to visually display page identifiers. 
+
+**Page labels** can be customized. For example, a document might begin with front matter numbered in roman numerals and transition to arabic numerals for the main content. In this case:
+- The first page (index `0`) would have a label `i`
+- The twelfth page (index `11`) would have label `xii`
+- The thirteenth page (index `12`) would start with label `1`
+
+The most popular PDF readers, such as Sumatra PDF and Adobe Acrobat Reader, will accurately display page labels as configured in the PDF. However, not all PDF readers support this feature, and some may not honor or display page labels correctly. In particular, browser-based PDF viewers, like those in Chrome and Edge, currently do not display page labels and will only show default page numbering.
+
+![Page Labels in Sumatra and Acrobat](page-labels.png)
+
+---
+
+## Page Label Components
+
+A **page label** consists of three main parts: `Style`, `Prefix`, and `Start`.
+
+### 1. Style
+The style defines the numbering format for the numeric portion of each page label. Available styles are:
+
+- **"D"**: Decimal Arabic numerals (1, 2, 3, ...)
+- **"R"**: Uppercase Roman numerals (I, II, III, ...)
+- **"r"**: Lowercase Roman numerals (i, ii, iii, ...)
+- **"A"**: Uppercase letters (A to Z, then AA to ZZ, and so on)
+- **"a"**: Lowercase letters (a to z, then aa to zz, and so on)
+
+### 2. Prefix
+The prefix is an optional string added before the numeric portion of each page label. For instance, a prefix of `"Appendix-"` with a style of `"D"` might result in labels like "Appendix-1", "Appendix-2", etc.
+
+### 3. Start
+The starting number for the first page of a labeled section. This is the initial numeric value applied to the first page of the label range.
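
To make the interaction of the three components concrete, here is a minimal sketch of how a viewer might turn them into a label string. It is independent of `fpdf2`: `to_roman`, `to_letters` and `format_page_label` are illustrative names, and the letter styles assume the repeated-letter reading of "A to Z, then AA to ZZ" (AA, BB, ..., ZZ).

```python
def to_roman(number):
    """Convert a positive integer to uppercase Roman numerals."""
    symbols = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    result = ""
    for value, symbol in symbols:
        while number >= value:
            result += symbol
            number -= value
    return result


def to_letters(number):
    """1 -> A, 26 -> Z, 27 -> AA, 52 -> ZZ, 53 -> AAA (the letter is repeated)."""
    letter = chr(ord("A") + (number - 1) % 26)
    return letter * ((number - 1) // 26 + 1)


def format_page_label(offset, style="D", prefix="", start=1):
    """Label of the page located `offset` pages after the first page of a range."""
    number = start + offset
    if style == "D":
        numeric = str(number)
    elif style == "R":
        numeric = to_roman(number)
    elif style == "r":
        numeric = to_roman(number).lower()
    elif style == "A":
        numeric = to_letters(number)
    elif style == "a":
        numeric = to_letters(number).lower()
    else:
        raise ValueError(f"unknown style: {style!r}")
    return prefix + numeric


print(format_page_label(11, style="r"))                     # xii
print(format_page_label(1, style="D", prefix="Appendix-"))  # Appendix-2
print(format_page_label(26, style="A"))                     # AA
```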
+
+---
+
+## Using Page Labels in `fpdf2`
+
+You can add page labels directly when adding a new page using the `add_page()` method or update them later using `set_page_label()`.
+
+### Adding a Page with Labels in `add_page()`
+
+When adding a page, you can specify the values for `label_style`, `label_prefix`, and `label_start` to define the page label. Here’s how to do it:
+
+```python
+from fpdf import FPDF
+
+pdf = FPDF()
+
+# Add a page with specific label parameters
+pdf.add_page(
+    label_style="r",           # Lowercase Roman numerals
+    label_prefix="Preface-",   # Prefix for the label
+    label_start=1              # Start numbering at 1
+)
+pdf.output("document_with_labels.pdf")
+```
+
+### Modifying Page Labels with `set_page_label()`
+
+You can also modify page labels after a page has been added by using `set_page_label()`. This is helpful for setting a new label after inserting a ToC placeholder or after any other action that automatically adds a page break. Keep in mind that `set_page_label()` always takes effect after the header has been rendered, so in that situation prefer writing the label in the footer only.
+
+```python
+# Set a page label with style, prefix, and start value
+pdf.set_page_label(
+    label_style="D",           # Decimal Arabic numerals
+    label_prefix="Chapter-",   # Prefix for the label
+    label_start=1              # Start numbering at 1
+)
+```
+
+### Retrieving the Current Page Label with `get_page_label()`
+
+If you need to get the current page label, for example, to display it in a header or footer, you can use the `get_page_label()` method.
+
+---
+
+## Example Usage
+
+Below is a complete example that demonstrates adding multiple pages with different page label styles and prefixes:
+
+```python
+from fpdf import FPDF
+
+pdf = FPDF()
+
+# Adding front matter with lowercase Roman numerals
+pdf.add_page(label_style="r", label_start=1)  # Starts with "i", "ii", "iii", etc.
+
+# Adding main content with decimal numbers and a prefix
+pdf.add_page(label_style="D", label_prefix="Chapter-", label_start=1)  # "Chapter-1", "Chapter-2", etc.
+
+# Adding an appendix section with uppercase letters
+pdf.add_page(label_style="A", label_prefix="Appendix-", label_start=1)  # "Appendix-A", "Appendix-B", etc.
+
+pdf.output("labeled_document.pdf")
+```
+
+This example creates a document with three sections, each using a different labeling style and prefix.
@@ -19,7 +19,7 @@ class CoerciveEnum(Enum):
     "An enumeration that provides a helper to coerce strings into enumeration members."
 
     @classmethod
-    def coerce(cls, value):
+    def coerce(cls, value, case_sensitive=False):
         """
         Attempt to coerce `value` into a member of this enumeration.
 
@@ -48,7 +48,7 @@ def coerce(cls, value):
             except ValueError:
                 pass
             try:
-                return cls[value.upper()]
+                return cls[value] if case_sensitive else cls[value.upper()]
             except KeyError:
                 pass
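
The effect of the new `case_sensitive` flag can be sketched with a small standalone enum. This is a simplified stand-in, not the `CoerciveEnum` implementation: `SketchEnum` and `LabelStyle` are hypothetical names, and the lookup by value before the lookup by name is an assumption inferred from the `except ValueError` context line above.

```python
from enum import Enum


class SketchEnum(Enum):
    """Simplified stand-in for an enum with a coercion helper."""

    @classmethod
    def coerce(cls, value, case_sensitive=False):
        if isinstance(value, cls):
            return value
        if isinstance(value, str):
            try:
                # Lookup by member value (assumed first step).
                return cls(value)
            except ValueError:
                pass
            try:
                # Lookup by member name, optionally ignoring the case of `value`.
                return cls[value] if case_sensitive else cls[value.upper()]
            except KeyError:
                pass
        raise ValueError(f"{value!r} is not a valid {cls.__name__}")


class LabelStyle(SketchEnum):
    NUMBER = "D"
    UPPER_ROMAN = "R"
    LOWER_ROMAN = "r"


print(LabelStyle.coerce("r"))            # value lookup distinguishes "r" from "R"
print(LabelStyle.coerce("lower_roman"))  # name lookup, case ignored by default
```

With `case_sensitive=True`, the name lookup in this sketch only succeeds for an exact member name such as `"LOWER_ROMAN"`.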
 
@@ -193,7 +193,7 @@ class Align(CoerciveEnum):
     "Justify text"
 
     @classmethod
-    def coerce(cls, value):
+    def coerce(cls, value, case_sensitive=False):
         if value == "":
             return cls.L
         return super(cls, cls).coerce(value)
@@ -213,7 +213,7 @@ class VAlign(CoerciveEnum):
     "Place text at the bottom of the cell, but obey the cells padding"
 
     @classmethod
-    def coerce(cls, value):
+    def coerce(cls, value, case_sensitive=False):
         if value == "":
             return cls.M
         return super(cls, cls).coerce(value)
@@ -400,7 +400,7 @@ class TableCellFillMode(CoerciveEnum):
     "Fill only table cells in even columns"
 
     @classmethod
-    def coerce(cls, value):
+    def coerce(cls, value, case_sensitive=False):
         "Any class that has a .should_fill_cell() method is considered a valid 'TableCellFillMode' (duck-typing)"
         if callable(getattr(value, "should_fill_cell", None)):
             return value
@@ -472,7 +472,7 @@ def is_fill(self):
         return self in (self.F, self.DF)
 
     @classmethod
-    def coerce(cls, value):
+    def coerce(cls, value, case_sensitive=False):
         if not value:
             return cls.D
         if value == "FD":
@@ -1009,6 +1009,28 @@ class TextDirection(CoerciveEnum):
     "bottom to top"
 
 
+class PageLabelStyle(CoerciveEnum):
+    "Style of the page label"
+
+    NUMBER = intern("D")
+    "decimal arabic numerals"
+
+    UPPER_ROMAN = intern("R")
+    "uppercase roman numerals"
+
+    LOWER_ROMAN = intern("r")
+    "lowercase roman numerals"
+
+    UPPER_LETTER = intern("A")
+    "uppercase letters A to Z, AA to ZZ, AAA to ZZZ and so on"
+
+    LOWER_LETTER = intern("a")
+    "lowercase letters a to z, aa to zz, aaa to zzz and so on"
+
+    NONE = None
+    "no label"
+
+
 class Duplex(CoerciveEnum):
     "The paper handling option that shall be used when printing the file from the print dialog."