Implementation of rowspan in tables #1088

mjasperse · 2024-01-03T09:20:14Z

Hi team! I've been working on this for a while, apologies in advance for the huge PR. The changes are primarily motivated by my use-case, so am happy to discuss whether things should be taken out or handled differently.

The key problem with rowspan is that the rows can't be treated independently any more and there needs to be multiple layout passes to choose how to allocate padding to ensure no cell is truncated. Furthermore the mapping between "column index" and "table index" is nontrivial because it's perturbed both by colspan and rowspan intrusions. This makes it quite hard to generate a valid table on the first try.

To resolve this I decided to do a table "normalisation" step, where a None entry is inserted into Row.cells to simplify keeping track of where a spanned cell exists, so the index has clear meaning and doesn't need remapping. This greatly simplified the debugging process because you can easily inspect the book-keeping to ensure it matches expectations. This is my third attempt at the rowspan algorithm because of how hard it is to handle edge-cases reliably...

I also removed the requirement that all rows have the same number of cells, instead scaling the whole table to the max number of cells in a row. This means that if you get a rowspan or colspan wrong somewhere, you can get some visual feedback from the PDF output on what is wrong instead of just getting an assertion error with table index, which really helps identify what needs fixing in the table. One option is to emit a warning in this case?

During development I also found it very helpful to be able to define spans using placeholder elements, as requested in #970. My solution is to use an enum object to identify where the span should occur, so it cannot be confused with valid cell contents. This is somewhat similar to extended markdown and I find more intuitive for generating from some table sources - see test_table_with_rowspan_and_pgbreak() for an example. Personally I found this a useful feature but I also see the objections in #970 so I'm open to separating it out.

I've added a bunch of test cases to track the expected behaviour - particularly mess relating to pagebreaks. I anticipate adding some more as I think of other edge cases or run into issues with using it in our application.

There will be merge conflicts with my other PRs, but would prefer to hold off rebasing until after review.

Resolves #1064

Checklist:

The GitHub pipeline is OK (green),
meaning that both pylint (static code analyzer) and black (code formatter) are happy with the changes of this PR.
A unit test is covering the code added / modified by this PR
This PR is ready to be merged
In case of a new feature, docstrings have been added, with also some documentation in the docs/ folder
A mention of the change is present in CHANGELOG.md

By submitting this pull request, I confirm that my contribution is made under the terms of the GNU LGPL 3.0 license.

mjasperse · 2024-01-16T22:45:04Z

Hi @Lucas-C and @gmischler, hope you're well! Anything I can do to assist with review of my PRs?

andersonhc · 2024-01-27T04:05:25Z

@mjasperse can you please resolve the conflicts? I will work on reviewing this PR next.

No longer relevant after rowspan refactor

Returns to frozen dataclasses, row-rendering mechanism

Previous behaviour allowed cascading of font style, in particular test case test_table_with_capitalized_font_family_and_emphasis

codecov-commenter · 2024-01-27T09:43:43Z

Codecov Report

Attention: 14 lines in your changes are missing coverage. Please review.

Comparison is base (6a8678b) 93.38% compared to head (3d606ae) 93.32%.

❗ Current head 3d606ae differs from pull request most recent head b9a862c. Consider uploading reports for the commit b9a862c to get more accurate results

Files	Patch %	Lines
fpdf/table.py	91.25%	8 Missing and 6 partials ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1088      +/-   ##
==========================================
- Coverage   93.38%   93.32%   -0.07%     
==========================================
  Files          29       29              
  Lines        8591     8701     +110     
  Branches     1894     1929      +35     
==========================================
+ Hits         8023     8120      +97     
- Misses        354      362       +8     
- Partials      214      219       +5

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

andersonhc · 2024-02-02T17:49:29Z

I ran several comparisons between current current master and mjasperse's branch, up to a 100k lines table (5500+ pages file) and there is no significant performance change - mjasperse's version is actually faster in most of the profile runs.

mjasperse requested a review from gmischler as a code owner January 3, 2024 09:20

andersonhc approved these changes Jan 26, 2024

View reviewed changes

mjasperse added 26 commits January 27, 2024 20:05

table: Initial functional implementation of rowspan

664b145

table: Permit combination of colspan+rowspan

20492da

table: Introduced TableSpan enum placeholder for rowspan/colspan

e00cb1a

table: WIP refactor of rowspan regularisation

beec54f

Docs: Added rowspan to tables page

63abb7c

table: Fixed book-keeping for combined rowspan and colspan

225b787

table: Dropped cols_count and column_indices from Row

1407bf8

No longer relevant after rowspan refactor

table: Fixed gutter accumulation at end of table

a69b35d

html: Added cellpadding and cellspacing table attribute handling

eccd68b

table: refactored rowspan processing

63525bd

Returns to frozen dataclasses, row-rendering mechanism

table: Check for rowspans across heading boundary

be0d2a9

html: Fixed table border reset when no <th> present

638ae3e

test: Adding test cases for rowspan implementation

ab2733e

table: Fixed handling of RHS border in case of colspan

8b7f82e

table: Fixed get_cell_border for rowspan

00aaebd

table: Dropped requirement that all rows have the same number of columns

deb18c2

html: Handle border=0 in table definition

34e4215

table: Use single disable_writing call for consistency

4be92cc

Previous behaviour allowed cascading of font style, in particular test case test_table_with_capitalized_font_family_and_emphasis

table: Fix calculation of row heights with images

782e805

table: Replaced assertion errors with FPDFException

9828a08

test: Added test case for varying number of table columns

360221a

changelog: Updated for table changes

5e321e8

doc: Updated docstring of fpdf.table()

2b13b72

table: Fixed indexing for combined rowspan and colspan

e8e3e21

table: Fix pagebreaks with overlapping spans and added test

d271a5b

table: minor refactor and blacked

037da4e

mjasperse added 5 commits January 27, 2024 20:17

table: Improved rowspan padding accumulation, added image test

07486a2

table: Minor changes per pylint suggestions

0c8f479

table: Fixed regression in pagebreak calculation

54068ed

html: Updated docs for new table attributes

e9dac08

table: Fixed outer border handling after pagebreak

a708d2b

mjasperse force-pushed the feature/table_rowspan branch from f099a32 to a708d2b Compare January 27, 2024 09:31

Merge branch 'master' into feature/table_rowspan

b9a862c

andersonhc merged commit 1effbb0 into py-pdf:master Feb 2, 2024
11 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implementation of rowspan in tables #1088

Implementation of rowspan in tables #1088

mjasperse commented Jan 3, 2024 •

edited

Loading

mjasperse commented Jan 16, 2024

andersonhc commented Jan 27, 2024

codecov-commenter commented Jan 27, 2024 •

edited

Loading

andersonhc commented Feb 2, 2024 •

edited

Loading

Implementation of rowspan in tables #1088

Implementation of rowspan in tables #1088

Conversation

mjasperse commented Jan 3, 2024 • edited Loading

mjasperse commented Jan 16, 2024

andersonhc commented Jan 27, 2024

codecov-commenter commented Jan 27, 2024 • edited Loading

Codecov Report

andersonhc commented Feb 2, 2024 • edited Loading

mjasperse commented Jan 3, 2024 •

edited

Loading

codecov-commenter commented Jan 27, 2024 •

edited

Loading

andersonhc commented Feb 2, 2024 •

edited

Loading