xfa-tools

Tool to extract XFA data from a PDF.

This is aimed at developers who want to write a parser for XFA, and at users who want to get whatever data they can out of a PDF in a format that can be read in a text editor.

Dependencies

These instructions are for Arch or Parabola GNU/Linux. Note that you need Python 2, because pdfminer requires it.

% sudo pacman -S python2 python2-pip
% sudo pip2 install pdfminer

I recommend you install jq from the AUR too.

xfa-extract

Takes a PDF filename on the command line and extracts the XFA data as a JSON array of pairs. I chose an array of pairs because I don't yet know whether keys can be duplicated: at the time of writing I had only inspected one PDF file.
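
To find out whether a given PDF actually produces duplicated keys, the output of xfa-extract can be checked with a few lines of Python. This is a minimal sketch, not part of the repository: the function name `duplicate_keys` and the sample data are hypothetical, and it assumes only the shape described above, a JSON array of `[key, value]` pairs.

```python
import json
from collections import Counter


def duplicate_keys(pairs):
    """Return the sorted list of keys that occur more than once in `pairs`."""
    counts = Counter(key for key, _value in pairs)
    return sorted(key for key, n in counts.items() if n > 1)


# Hypothetical sample in the assumed shape: [[key, value], ...].
# Real input would come from the output of xfa-extract instead.
sample = json.loads(
    '[["template", "<template/>"], ["datasets", "<a/>"], ["datasets", "<b/>"]]'
)
print(duplicate_keys(sample))
```

If the list is empty for a given file, converting that file's output to a JSON object loses nothing.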

Here's an example of how to get the template out in human-readable format (assuming you also have jq and xmllint on your $PATH):

% ./xfa-extract ../vat101i-notes.pdf \
    | ./json-alist-to-object \
    | jq -r '.template' \
    | xmllint --format - > ../template.xml

You can then open the template.xml file in a text editor, or run it through a dedicated parser.

json-alist-to-object

This is a simple filter that takes a JSON array of pairs on STDIN and outputs a JSON object.
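
The conversion itself can be sketched as follows. This illustrates the idea and is not the script's actual source: the name `alist_to_object` and the sample input are hypothetical, and the choice that a later duplicate key overwrites an earlier one is an assumption of this sketch. A real filter would read the array from STDIN (for example with `json.load(sys.stdin)`) rather than from a string.

```python
import json


def alist_to_object(pairs):
    """Build a dict from an iterable of [key, value] pairs.

    Assumption of this sketch: if a key appears more than once,
    the last value wins.
    """
    obj = {}
    for key, value in pairs:
        obj[key] = value
    return obj


# Hypothetical input in the array-of-pairs shape.
sample = '[["template", "<template/>"], ["config", "<config/>"]]'
print(json.dumps(alist_to_object(json.loads(sample)), sort_keys=True))
```

Because a JSON object cannot reliably hold the same key twice, any duplicated keys in the array are collapsed by this step.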

Copyright and License

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
