Script throws AttributeError when parsing NCX navList #27
Comments
Fixed layouts are not part of the epub2 spec, so you should be feeding kindlegen an epub3 file with the page-list properly formatted in the nav, not a toc.ncx. Either that or use Sigil to create a proper ncx from a properly formatted nav. You should also be unpacking the mobi specifically for epub3, not 2.
Happy to debug this, but I will need a copy of the source epub3 and the resulting kindlegen-generated mobi so that I can see what is going on. The error you are getting just means the mobi8 index used for the toc info is missing required fields.
… On Jun 7, 2019, at 8:51 PM, malsagulo ***@***.***> wrote:
Not sure if you're even interested in coming back to this... but just for the record, I'm getting an AttributeError thrown when it tries to parse my NCX file for my fixed-layout book. I have a pageList in there acting as a sub-index for images/illustrations (which I got out of this part of the specification; Kindle Previewer 3 recognizes it). It looks something like this:
<!-- on same level as navMap toc -->
<navList>
  <navLabel><text>Images</text></navLabel>
  <navTarget id="i1">
    <navLabel><text>image caption</text></navLabel>
    <content src="pg1.html" />
  </navTarget>
  <navTarget id="i2">
    <navLabel><text>image caption</text></navLabel>
    <content src="pg3.html" />
  </navTarget>
  <navTarget id="i3">
    <navLabel><text>image caption</text></navLabel>
    <content src="pg7.html" />
  </navTarget>
  <!-- so on and so forth -->
</navList>
If this section is in the NCX file, the script throws the following error:
Traceback (most recent call last):
  File "KindleUnpack/lib/kindleunpack.py", line 1020, in <module>
    sys.exit(main())
  File "KindleUnpack/lib/kindleunpack.py", line 1008, in main
    unpackBook(infile, outdir, apnxfile, epubver, use_hd)
  File "KindleUnpack/lib/kindleunpack.py", line 923, in unpackBook
    process_all_mobi_headers(files, apnxfile, sect, mhlst, K8Boundary, False, epubver, use_hd)
  File "KindleUnpack/lib/kindleunpack.py", line 840, in process_all_mobi_headers
    processMobi8(mh, metadata, sect, files, rscnames, pagemapproc, k8resc, obfuscate_data, apnxfile, epubver)
  File "KindleUnpack/lib/kindleunpack.py", line 530, in processMobi8
    [junk1, junk2, junk3, fid, junk4, off] = ncxmap['pos_fid'].split(':')
AttributeError: 'NoneType' object has no attribute 'split'
Here's how I'm running the script:
./KindleGen_Mac/kindlegen project-fixed/fixed.opf -c2 -verbose -o fixed.mobi
And here's the full verbose console output on the off chance it's of use to you:
KindleUnpack v0.82
Based on initial mobipocket version Copyright © 2009 Charles M. Hannum ***@***.***>
Extensive Extensions and Improvements Copyright © 2009-2014
by: P. Durrant, K. Hendricks, S. Siebert, fandrieu, DiapDealer, nickredding, tkeo.
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, version 3.
Unpacking Book...
Palm DB type: BOOKMOBI, 143 sections.
Unpacking a Combination M8/KF8 book...
First Image, last Image 42 72
Processing Mobipocket 6 section of book...
Mobi Version: 6
Codec: utf-8
Title: Malsagulo's Book
Huffdic compression
Unpacking images, resources, fonts, etc
Extracting image: image00042.jpeg from section 42
Extracting image: image00043.jpeg from section 43
Extracting image: image00044.jpeg from section 44
Extracting image: image00045.jpeg from section 45
Extracting image: image00046.jpeg from section 46
Extracting image: image00047.jpeg from section 47
Extracting image: image00048.jpeg from section 48
Extracting image: image00049.jpeg from section 49
Extracting image: image00050.jpeg from section 50
Extracting image: image00051.jpeg from section 51
Extracting image: image00052.jpeg from section 52
Extracting image: image00053.jpeg from section 53
Extracting image: image00054.jpeg from section 54
Extracting image: image00055.jpeg from section 55
Extracting image: image00056.jpeg from section 56
Extracting image: image00057.jpeg from section 57
Extracting image: image00058.jpeg from section 58
Extracting image: image00059.jpeg from section 59
Extracting image: image00060.jpeg from section 60
Extracting image: image00061.jpeg from section 61
Extracting image: image00062.jpeg from section 62
Extracting image: image00063.jpeg from section 63
Extracting image: image00064.jpeg from section 64
Extracting image: image00065.jpeg from section 65
Extracting image: image00066.jpeg from section 66
Extracting image: image00067.jpeg from section 67
Extracting image: image00068.jpeg from section 68
Extracting image: cover00069.jpeg from section 69
Extracting image: image00071.jpeg from section 71
Extracting Page Map Information
File contains kindlegen source archive, extracting as kindlegensrc.zip
File contains kindlegen build log, extracting as kindlegenbuild.log
Unpacking raw markup language
Write ncx
Find link anchors
Insert data into html
Insert hrefs into html
Remove empty anchors from html
Insert image references into html
Building an opf for mobi7/azw4.
Processing K8 section of book...
Mobi Version: 8
Codec: utf-8
Title: Malsagulo's Book
Huffdic compression
Unpacking images, resources, fonts, etc
Extracting Page Map Information
Unpacking raw markup language
Processing ncx / toc
Traceback (most recent call last):
  File "KindleUnpack/lib/kindleunpack.py", line 1020, in <module>
    sys.exit(main())
  File "KindleUnpack/lib/kindleunpack.py", line 1008, in main
    unpackBook(infile, outdir, apnxfile, epubver, use_hd)
  File "KindleUnpack/lib/kindleunpack.py", line 923, in unpackBook
    process_all_mobi_headers(files, apnxfile, sect, mhlst, K8Boundary, False, epubver, use_hd)
  File "KindleUnpack/lib/kindleunpack.py", line 840, in process_all_mobi_headers
    processMobi8(mh, metadata, sect, files, rscnames, pagemapproc, k8resc, obfuscate_data, apnxfile, epubver)
  File "KindleUnpack/lib/kindleunpack.py", line 530, in processMobi8
    [junk1, junk2, junk3, fid, junk4, off] = ncxmap['pos_fid'].split(':')
AttributeError: 'NoneType' object has no attribute 'split'
Otherwise, thanks for the useful tool. It's been genuinely helpful up to this point.
|
I'm, er, reluctant to send the entire project, but I can maybe see about copying and pasting some kind of test case together for you. This is my first Kindle project, so it is entirely possible I've done something (or many things) wrong somewhere. But if I did, I'm not seeing it.
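Yeah, I gave that a shot a little while after I posted. The error still occurs with the --epub_version=3 flag set. Also did a little bit of cursory debugging on my own. The problem seems to be coming from the ncxExtract.parseNCX method in mobi_ncx.py -- in fact, specifically from inside this particular if statement. Since there's no value set unless tag is of a certain value, tmp['pos_fid'] is defaulting to None. But this method over in kindleunpack.py seems to be expecting that property to be a String. You might be able to get around the entire problem just by checking for None in or around that line.
I don't believe the problem is with how my nav is formatted. Not only was I working directly from the spec, I checked the whole thing against the original DTD. I even went back and added the playOrder attribute to everything, just because the DTD said it was required (even though it doesn't actually do anything anymore.) Error still occurred. So I don't think we're looking at a validation or formatting problem here. |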
Without pos:fid there is no link. My guess is your extra ncx entries outside of the navMap simply have not been seen before and will need some work to decode them from test cases.
These things are typically referred to now as LandMarks in epub3 and are normally part of the guide inside the opf in epub2.
If you can create a test case that shows this issue, I will try to reverse engineer the new index tags that are being generated by your extra piece of ncx and see if we can grok it.
Kevin
… On Jun 8, 2019, at 12:51 AM, malsagulo ***@***.***> wrote:
I'm, er, reluctant to send the entire project, but I can maybe see about copying and pasting some kind of test case together for you. This is my first Kindle project, so it is entirely possible I've done something (or many things) wrong somewhere. But if I did, I'm not seeing it.
You should also be unpacking the mobi specifically for epub3, not 2.
Yeah, I gave that a shot a little while after I posted. The error still occurs with the --epub_version=3 flag set. Also did a little bit of cursory debugging on my own. The problem seems to be coming from the ncxExtract.parseNCX method in mobi_ncx.py -- in fact, specifically from inside this particular if statement. Since there's no value set unless tag is of a certain value, tmp['pos_fid'] is defaulting to None. But this method over in kindleunpack.py seems to be expecting that property to be a String. You might be able to get around the entire problem just by checking for None in or around that line.
Either that or use Sigil to create a proper ncx from a properly formatted nav.
I don't believe the problem is with how my nav is formatted. Not only was I working directly from the spec, I checked the whole thing against the original DTD. I even went back and added the playOrder attribute to everything, just because the DTD said it was required (even though it doesn't actually do anything anymore.) Error still occurred. So I don't think we're looking at a validation or formatting problem here.
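For what it's worth, the None check suggested above might look roughly like the sketch below. This is not the actual KindleUnpack code: the helper names (`parse_pos_fid`, `linkable_entries`) and the sample `pos_fid` value are made up for illustration. The only thing taken from the traceback is that `pos_fid` is split on `:` into six fields, with the fourth and sixth used as `fid` and `off`.

```python
def parse_pos_fid(pos_fid):
    """Return (fid, off) from a 'pos_fid' string, or None if it is unusable.

    Assumption: six colon-separated fields, as implied by the unpacking
    on the line that raises in the traceback.
    """
    if pos_fid is None:
        # Entry never had a pos_fid set (the navList case in this report).
        return None
    parts = pos_fid.split(':')
    if len(parts) != 6:
        return None
    _, _, _, fid, _, off = parts
    return fid, off


def linkable_entries(ncx_data):
    """Yield (entry, fid, off) for entries that carry a usable pos_fid."""
    for entry in ncx_data:
        parsed = parse_pos_fid(entry.get('pos_fid'))
        if parsed is None:
            continue  # skip instead of crashing on None.split(':')
        fid, off = parsed
        yield entry, fid, off


if __name__ == '__main__':
    # Hypothetical sample data; real pos_fid values will look different.
    sample = [
        {'text': 'Chapter 1', 'pos_fid': 'a:b:c:0001:d:0000000010'},
        {'text': 'image caption', 'pos_fid': None},
    ]
    for entry, fid, off in linkable_entries(sample):
        print(entry['text'], fid, off)
```

Note that a guard like this would only stop the crash. Entries without a pos_fid would simply be dropped from the output, since without it there is nothing to link to.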
|
@duzhor -- why are you commenting in this issue? This sounds like an entirely different bug. If you want him to address it, open up a new issue. |
There are many tools to "borkify" the text of an epub to make it meaningless. I believe there is a plugin for Sigil that does this, and calibre has this capability as well. So, if at all possible: for me to debug this and add support, I really need a working test case. Please consider creating a simple standalone test case from a copy of the original epub that has had all of its text changed, and posting it here so that I can use it to reverse out how navList items are encoded in mobis and add support for that to KindleUnpack. |
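If neither of those tools is handy, the general idea of scrambling the text can be sketched in a few lines of Python. This is only an illustration of the technique, not what the Sigil plugin or calibre actually do. The function names are hypothetical, and the regex-based tag handling is an assumption that holds for simple XHTML but will also scramble the contents of comments, scripts, and styles.

```python
import random
import re
import string

# Match a whole character entity (kept as-is) or a single ASCII letter.
_TOKEN = re.compile(r'&#?\w+;|[A-Za-z]')


def scramble_text(text, rng):
    """Replace each ASCII letter with a random letter of the same case."""
    def repl(match):
        tok = match.group(0)
        if tok.startswith('&'):
            return tok  # leave entities such as &amp; untouched
        letters = string.ascii_lowercase if tok.islower() else string.ascii_uppercase
        return rng.choice(letters)
    return _TOKEN.sub(repl, text)


def scramble_markup(xhtml, seed=0):
    """Scramble only the text between tags; tags and attributes are kept."""
    rng = random.Random(seed)
    return re.sub(
        r'>([^<>]+)(?=<)',
        lambda m: '>' + scramble_text(m.group(1), rng),
        xhtml,
    )


if __name__ == '__main__':
    print(scramble_markup('<p class="c1">Hello, World &amp; more 42</p>'))
```

Digits, punctuation, whitespace, and non-ASCII characters are left alone, so the length and layout of each page stay the same. File names, ids, and link targets are untouched, which matters here because the navList entries point at them.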
Not sure if you're even interested in coming back to this... but just for the record, I'm getting an AttributeError thrown when it tries to parse my NCX file for my fixed-layout book. I have a pageList in there acting as a sub-index for images/illustrations (which I got out of this part of the specification; Kindle Previewer 3 recognizes it). It looks something like this:
If this section is in the NCX file, the script throws the following error:
Here's how I'm running the script:
And here's the full verbose console output on the off chance it's of use to you:
Otherwise, thanks for the useful tool. It's been a big help -- and still will be, so long as I remember to comment out that part of the NCX. :)