Help to modify a docx file #3

Open
ManuelJm opened this issue May 6, 2022 · 3 comments
Comments

@ManuelJm

ManuelJm commented May 6, 2022

Hi, I'm trying to understand how to modify text in a Word docx file, but looking at the libopc samples I could only find examples for reading and writing, and they use separate objects/functions.

Is it not possible to modify a document on the fly, without creating a reader and a writer?
If not, what's the easiest way to copy the original and make the change with a new writer?

@jschroedl
Collaborator

The OPC container can be opened in read/write mode using opcContainerOpenMode::OPC_OPEN_READ_WRITE.
It's defined in container.h.

I am editing PowerPoint files, not docx, but I open mine like this:

mContainer = opcContainerOpen(_X(fileName.value()), OPC_OPEN_READ_WRITE, this, nullptr);

@ManuelJm
Author

ManuelJm commented May 6, 2022

Yes, I did it like that, but I don't know how to modify the text values of nodes and commit the changes to the docx file.
The samples do not cover this, and there is no API documentation yet as far as I know.
It seems the XML is wrapped in mce typedefs for the reader and writer.
If you tell me I can use mceWriter to open an existing part and keep the original values, I'll try, but it seems the writer is meant for writing new content.
I'm confused.

@jschroedl jschroedl reopened this May 6, 2022
@jschroedl
Collaborator

jschroedl commented May 6, 2022

Sorry, didn't mean to close this one. libopc is focused on the packaging of the files and on managing the relationship pieces. For example, inserting an image requires setting up relationships and adding the image to the package.

To work on the actual document, you end up reading the part of interest as XML content into an xmlDocPtr, changing the XML, and then rewriting that XML document.

It's been a while and I have my own C++ wrapper code but it looks roughly like:

  • opcPartFind + opcContainerOpenInputStream() to get to a specific named part of the file.
  • Pass the stream's buffer to xmlReadMemory to read the part's XML into a libxml2 object.
  • Find and change the piece of text (or whatever) with XPath queries, updating the XML document with new content.
  • Create a stream/buffer to save out the xml using opcContainerCreateOutputStream for the desired part.
  • Get the xml to write using xmlDocDumpMemory()
  • Write that xml to the doc using opcContainerWriteOutputStream then close with opcContainerCloseOutputStream.
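
The read and write ends of the steps above both move a part's bytes through a stream in chunks, while xmlReadMemory and xmlDocDumpMemory work on one contiguous buffer. A minimal sketch of that glue follows. It deliberately does not call libopc or libxml2: `ReadChunk`, `WriteChunk`, `readAll`, `writeAll` and `memoryReader` are hypothetical names introduced here for illustration, and the callables are stand-ins for whatever read/write calls the input and output streams expose (check the libopc headers for the exact signatures before wiring them in).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a chunked read call: fills up to `len` bytes
// into `buf` and returns the number of bytes produced (0 = end of stream).
using ReadChunk = std::function<std::uint32_t(std::uint8_t* buf, std::uint32_t len)>;

// Hypothetical stand-in for a chunked write call: consumes up to `len`
// bytes from `buf` and returns the number of bytes accepted.
using WriteChunk = std::function<std::uint32_t(const std::uint8_t* buf, std::uint32_t len)>;

constexpr std::uint32_t kChunkSize = 4096;

// Drain a chunked reader into one contiguous buffer, so the whole part can
// be handed to an in-memory XML parser in a single call.
std::vector<std::uint8_t> readAll(const ReadChunk& readChunk) {
    assert(readChunk);  // an empty callable would be a programming error
    std::vector<std::uint8_t> out;
    std::uint8_t chunk[kChunkSize];
    for (;;) {
        const std::uint32_t n = readChunk(chunk, kChunkSize);
        if (n == 0) break;  // end of stream
        out.insert(out.end(), chunk, chunk + n);
    }
    return out;
}

// Push a whole buffer (for example, a serialized XML document) through a
// chunked writer. Returns false if the writer stops accepting bytes early.
bool writeAll(const WriteChunk& writeChunk, const std::vector<std::uint8_t>& data) {
    assert(writeChunk);
    std::size_t done = 0;
    while (done < data.size()) {
        const std::uint32_t want = static_cast<std::uint32_t>(
            std::min<std::size_t>(data.size() - done, kChunkSize));
        const std::uint32_t n = writeChunk(data.data() + done, want);
        if (n == 0) return false;  // writer refused more data
        done += n;
    }
    return true;
}

// In-memory fake reader, only so the sketch can be exercised without a
// real container on disk.
ReadChunk memoryReader(std::string src) {
    std::size_t pos = 0;
    return [src = std::move(src), pos](std::uint8_t* buf, std::uint32_t len) mutable
               -> std::uint32_t {
        const std::size_t n = std::min<std::size_t>(len, src.size() - pos);
        std::memcpy(buf, src.data() + pos, n);
        pos += n;
        return static_cast<std::uint32_t>(n);
    };
}
```

In real use, the vector returned by `readAll` would be the buffer passed to xmlReadMemory, and the bytes produced by xmlDocDumpMemory would be what `writeAll` sends to the output stream before it is closed.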

Sorry it's not clearer, but libopc is pretty general and mainly helps with managing the overall document container, looking up parts, adding contained images, etc.
