-
Notifications
You must be signed in to change notification settings - Fork 253
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implementation of rowspan in tables #1088
Conversation
Hi @Lucas-C and @gmischler, hope you're well! Anything I can do to assist with review of my PRs? |
@mjasperse can you please resolve the conflicts? I will work on reviewing this PR next. |
No longer relevant after rowspan refactor
Returns to frozen dataclasses, row-rendering mechanism
Previous behaviour allowed cascading of font style, in particular test case test_table_with_capitalized_font_family_and_emphasis
f099a32
to
a708d2b
Compare
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #1088 +/- ##
==========================================
- Coverage 93.38% 93.32% -0.07%
==========================================
Files 29 29
Lines 8591 8701 +110
Branches 1894 1929 +35
==========================================
+ Hits 8023 8120 +97
- Misses 354 362 +8
- Partials 214 219 +5 ☔ View full report in Codecov by Sentry. |
I ran several comparisons between current current master and mjasperse's branch, up to a 100k lines table (5500+ pages file) and there is no significant performance change - mjasperse's version is actually faster in most of the profile runs. |
Hi team! I've been working on this for a while, apologies in advance for the huge PR. The changes are primarily motivated by my use-case, so am happy to discuss whether things should be taken out or handled differently.
The key problem with rowspan is that the rows can't be treated independently any more and there needs to be multiple layout passes to choose how to allocate padding to ensure no cell is truncated. Furthermore the mapping between "column index" and "table index" is nontrivial because it's perturbed both by colspan and rowspan intrusions. This makes it quite hard to generate a valid table on the first try.
To resolve this I decided to do a table "normalisation" step, where a
None
entry is inserted intoRow.cells
to simplify keeping track of where a spanned cell exists, so the index has clear meaning and doesn't need remapping. This greatly simplified the debugging process because you can easily inspect the book-keeping to ensure it matches expectations. This is my third attempt at the rowspan algorithm because of how hard it is to handle edge-cases reliably...I also removed the requirement that all rows have the same number of cells, instead scaling the whole table to the max number of cells in a row. This means that if you get a
rowspan
orcolspan
wrong somewhere, you can get some visual feedback from the PDF output on what is wrong instead of just getting an assertion error with table index, which really helps identify what needs fixing in the table. One option is to emit a warning in this case?During development I also found it very helpful to be able to define spans using placeholder elements, as requested in #970. My solution is to use an enum object to identify where the span should occur, so it cannot be confused with valid cell contents. This is somewhat similar to extended markdown and I find more intuitive for generating from some table sources - see
test_table_with_rowspan_and_pgbreak()
for an example. Personally I found this a useful feature but I also see the objections in #970 so I'm open to separating it out.I've added a bunch of test cases to track the expected behaviour - particularly mess relating to pagebreaks. I anticipate adding some more as I think of other edge cases or run into issues with using it in our application.
There will be merge conflicts with my other PRs, but would prefer to hold off rebasing until after review.
Resolves #1064
Checklist:
The GitHub pipeline is OK (green),
meaning that both
pylint
(static code analyzer) andblack
(code formatter) are happy with the changes of this PR.A unit test is covering the code added / modified by this PR
This PR is ready to be merged
In case of a new feature, docstrings have been added, with also some documentation in the
docs/
folderA mention of the change is present in
CHANGELOG.md
By submitting this pull request, I confirm that my contribution is made under the terms of the GNU LGPL 3.0 license.