Pagedest (addlink) not working as intended with positional parameters #1453

Open
chukuwa-isobe opened this issue Nov 23, 2022 · 10 comments
Labels
is-bug From a users perspective, this is a bug - a violation of the expected behavior with a compliant PDF

Comments

@chukuwa-isobe

chukuwa-isobe commented Nov 23, 2022

If I create a link using positional arguments, it jumps to an unintended page.

Environment

$ python -m platform
Windows-10-10.0

$ python -c "import PyPDF2;print(PyPDF2.__version__)"
2.10.7

Code + PDF

I want to use the "fit" option, so I pass the arguments positionally.

from PyPDF2 import PdfWriter, PdfReader
from PyPDF2.generic import RectangleObject

in_file = r"D:\xxxxxxxxxxx\in.pdf"
out_file = r"D:\xxxxxxxxxxx\out.pdf"
writer = PdfWriter()
reader = PdfReader(in_file)

for current_page in reader.pages:
    writer.add_page(current_page)

writer.addLink(0, 1, [97, 758, 162, 768], [1, 1, 1], "/XYZ", 0, 0, 0)  # <-- here

with open(out_file, "wb") as o:
    writer.write(o)

Then when I click the link it jumps from page 1 to page "3".
When I open the link properties with Acrobat, the link destination is set to page "2", and after closing the properties with OK without updating anything, clicking the link again correctly jumps to page "2".
Opening the properties of every created link and clicking OK is not practical, so please suggest how to solve this.

Traceback

This is the complete Traceback I see:

TODO

@pubpub-zz
Collaborator

@chukuwa-isobe,
can you provide example PDF files (in and out) that show the issue, please?

@chukuwa-isobe
Author

chukuwa-isobe commented Nov 25, 2022

Dear pubpub-zz,

I have uploaded in.pdf and out.pdf.
Two links are created on the first page of out.pdf.
If I click the first link, it jumps to page 3 (the issue), but if I click the second link, it jumps to page 2.
Both links specify the same target page, so they should behave the same.

from PyPDF2 import PdfFileWriter, PdfFileReader
from PyPDF2.generic import RectangleObject

inFile = r"xxxxxxxxxxxxxxxx\in.pdf"
outFile = r"xxxxxxxxxxxxxxxx\out.pdf"
pdf_writer = PdfFileWriter()
pdf_reader = PdfFileReader(open(inFile, "rb"))

num_of_pages = pdf_reader.getNumPages()

for page in range(num_of_pages):
    current_page = pdf_reader.getPage(page)
    pdf_writer.addPage(current_page)

pdf_writer.addLink(0, 1, [57, 721, 186, 731], [1, 1, 1], "/XYZ", 0, 0, 0)  # 1st BOX
pdf_writer.addLink(
    pagenum=0,
    pagedest=1,
    rect=RectangleObject([98, 708, 130, 718]),
    border=[1, 1, 1],
    fit="/XYZ",
)  # 2nd BOX

with open(outFile, "wb") as o:
    pdf_writer.write(o)

Is this not the right way to write it?

in.pdf
out.pdf

@MartinThoma MartinThoma added the is-bug From a users perspective, this is a bug - a violation of the expected behavior with a compliant PDF label Nov 25, 2022
@corsair20141

Is there any progress on this?

When I try to run the "Add Link" example(s) from the documentation they don't work either...

@ssjkamei
Contributor

ssjkamei commented Oct 7, 2024

Is this a feature that has been removed from PdfWriter?
When I looked at the changelog, I saw that addLink was changed to add_link, but I couldn't find any mention of its deprecation.

@stefan6419846
Collaborator

To follow PEP 8, many names have been changed, including addLink to add_link. In addition, we refactored the annotation handling to provide one generic interface for adding any type of annotation to the writer.

For the link-specific annotations, see https://pypdf.readthedocs.io/en/stable/user/adding-pdf-annotations.html#link

@ssjkamei
Contributor

ssjkamei commented Oct 7, 2024

@stefan6419846 Thank you very much. I will check it out.

@ssjkamei
Contributor

ssjkamei commented Oct 7, 2024

I have confirmed that it still occurs.

from pypdf import PdfWriter
from pypdf.annotations import FreeText, Link
from pypdf.generic import Fit, RectangleObject

out_file = r"out.pdf"
writer = PdfWriter()

for i in range(4):
    writer.add_blank_page(width=596, height=842)
    free_text_annotation = FreeText(
        text="Page" + str(i + 1),
        rect=(50, 770, 200, 800),
        bold=True,
    )
    writer.add_annotation(i, free_text_annotation)

annotation = Link(
    target_page_index=1,
    rect=[60, 730, 200, 750],
    border=[1, 1, 1],
    fit=Fit(fit_type="/XYZ"),
)  # 1st BOX to page 3(Issue)
writer.add_annotation(page_number=0, annotation=annotation)
annotation = Link(
    target_page_index=1,
    rect=RectangleObject([90, 700, 200, 720]),
    border=[1, 1, 1]
)  # 2nd BOX to page 2
writer.add_annotation(page_number=0, annotation=annotation)
annotation = Link(
    target_page_index=1,
    rect=[120, 670, 200, 690],
    border=[1, 1, 1],
    fit=Fit(fit_type="/XYZ", fit_args=(0, 0, 0))
)  # 3rd BOX to page 3(Issue)
writer.add_annotation(page_number=0, annotation=annotation)

with open(out_file, "wb") as o:
    writer.write(o)

@ssjkamei
Contributor

ssjkamei commented Oct 7, 2024

The destination should contain an IndirectObject that references the page, but an integer page number is written instead. According to the specification, integer page numbers are only used for destinations in remote documents.

Wrong: /Dest [ 1 /XYZ 0.0 0.0 0.0 ]
Correct: /Dest [ 3 0 R /XYZ 0.0 0.0 0.0 ]

Below is an excerpt from the PDF 1.7 specification.

12.3.2.2 Explicit Destinations
Table 151 shows the allowed syntactic forms for specifying a destination explicitly in a PDF file. In each case, "page" is an indirect reference to a page object (except in a remote go-to action; see 12.6.4.3, "Remote Go-To Actions"). All coordinate values (left, right, top, and bottom) shall be expressed in the default user space coordinate system. The page's bounding box is the smallest rectangle enclosing all of its contents. (If any side of the bounding box lies outside the page's crop box, the corresponding side of the crop box shall be used instead; see 14.11.2, "Page Boundaries," for further discussion of the crop box.)
No page object can be specified for a destination associated with a remote go-to action (see 12.6.4.3, "Remote Go-To Actions") because the destination page is in a different PDF document. In this case, the page parameter specifies an integer page number within the remote document instead of a page object in the current document.
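
The two /Dest forms can be told apart mechanically. The following is a rough sketch using only the standard library, not part of pypdf: it scans raw PDF bytes for destinations whose first element is a bare integer versus an indirect reference. The function name classify_dests and the regular expressions are illustrative assumptions, and the scan only sees annotation dictionaries that are stored uncompressed in the file.

```python
import re

# "/Dest [ <int> /Name ...": a bare integer page number (invalid for local links).
INT_DEST = re.compile(rb"/Dest\s*\[\s*(\d+)\s+/")
# "/Dest [ <obj> <gen> R /Name ...": an indirect reference to a page object.
REF_DEST = re.compile(rb"/Dest\s*\[\s*(\d+)\s+(\d+)\s+R\s*/")


def classify_dests(pdf_bytes: bytes) -> dict:
    """Count integer-page and indirect-reference destinations in raw PDF bytes."""
    return {
        "integer_page": len(INT_DEST.findall(pdf_bytes)),
        "indirect_ref": len(REF_DEST.findall(pdf_bytes)),
    }


# The two destination forms quoted in this comment, embedded in dummy dictionaries.
sample = (
    b"<< /Dest [ 1 /XYZ 0.0 0.0 0.0 ] >>\n"
    b"<< /Dest [ 3 0 R /XYZ 0.0 0.0 0.0 ] >>"
)
print(classify_dests(sample))  # {'integer_page': 1, 'indirect_ref': 1}
```

To check a generated file, the same function could be given the bytes read from out.pdf; a non-zero "integer_page" count would suggest the problem described here.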

@ssjkamei
Contributor

ssjkamei commented Oct 7, 2024

It seems necessary to look up the IndirectObject of the target page and set it in the destination when add_annotation is executed. However, I think this would be hard to get right unless the reference to the IndirectObject is resolved at write time rather than at that point.

I haven't read the code in much detail, so if I were to implement it myself, it would be quite a while from now. I am hoping someone else will implement this.
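
The idea of deferring the lookup until write time can be illustrated with a toy model. This is only a sketch under stated assumptions, not pypdf's implementation: PendingDest, resolve_dest, and the (object number, generation) pairs in page_refs are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PendingDest:
    """Hypothetical placeholder: a link target recorded as a page index."""

    page_index: int
    fit: str
    fit_args: Tuple[float, ...] = ()


def resolve_dest(dest: PendingDest, page_refs: List[Tuple[int, int]]) -> str:
    """Serialize a destination, replacing the page index with 'obj gen R'."""
    obj_num, gen = page_refs[dest.page_index]
    parts = [f"{obj_num} {gen} R", dest.fit]
    parts.extend(str(float(arg)) for arg in dest.fit_args)
    return "[ " + " ".join(parts) + " ]"


# Assumed object numbers for a four-page document; real values depend on the file.
page_refs = [(2, 0), (3, 0), (4, 0), (5, 0)]
pending = PendingDest(page_index=1, fit="/XYZ", fit_args=(0, 0, 0))
print(resolve_dest(pending, page_refs))  # [ 3 0 R /XYZ 0.0 0.0 0.0 ]
```

Because the page index is only translated once the final object numbers are known, the placeholder stays valid even if pages are added or reordered before writing.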

@stefan6419846
Collaborator

This might be related to #2443 and #2450. The PR is on hold, as it breaks external links.
