Script throws AttributeError when parsing NCX navList #27

Open
malsagulo opened this issue Jun 8, 2019 · 5 comments

malsagulo commented Jun 8, 2019

Not sure if you're even interested in coming back to this... but just for the record, I'm getting an AttributeError when the script parses the NCX file for my fixed-layout book. I have a navList in there acting as a sub-index for images/illustrations (which I got out of this part of the specification; Kindle Previewer 3 recognizes it). It looks something like this:

<!-- on same level as navMap toc -->
<navList>
  <navLabel><text>Images</text></navLabel>
  <navTarget id="i1">
    <navLabel><text>image caption</text></navLabel>
    <content src="pg1.html" />
  </navTarget>
  <navTarget id="i2">
    <navLabel><text>image caption</text></navLabel>
    <content src="pg3.html" />
  </navTarget>
  <navTarget id="i3">
    <navLabel><text>image caption</text></navLabel>
    <content src="pg7.html" />
  </navTarget>
  <!-- so on and so forth -->
</navList>

If this section is in the NCX file, the script throws the following error:

Traceback (most recent call last):
  File "KindleUnpack/lib/kindleunpack.py", line 1020, in <module>
    sys.exit(main())
  File "KindleUnpack/lib/kindleunpack.py", line 1008, in main
    unpackBook(infile, outdir, apnxfile, epubver, use_hd)
  File "KindleUnpack/lib/kindleunpack.py", line 923, in unpackBook
    process_all_mobi_headers(files, apnxfile, sect, mhlst, K8Boundary, False, epubver, use_hd)
  File "KindleUnpack/lib/kindleunpack.py", line 840, in process_all_mobi_headers
    processMobi8(mh, metadata, sect, files, rscnames, pagemapproc, k8resc, obfuscate_data, apnxfile, epubver)
  File "KindleUnpack/lib/kindleunpack.py", line 530, in processMobi8
    [junk1, junk2, junk3, fid, junk4, off] = ncxmap['pos_fid'].split(':')
AttributeError: 'NoneType' object has no attribute 'split'

Here's how I'm running the script:

python KindleUnpack/lib/kindleunpack.py -s fixed.mobi tmp

And here's the full verbose console output on the off chance it's of use to you:

KindleUnpack v0.82
   Based on initial mobipocket version Copyright © 2009 Charles M. Hannum <[email protected]>
   Extensive Extensions and Improvements Copyright © 2009-2014 
       by:  P. Durrant, K. Hendricks, S. Siebert, fandrieu, DiapDealer, nickredding, tkeo.
   This program is free software: you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation, version 3.
Unpacking Book...
Palm DB type: BOOKMOBI, 143 sections.
Unpacking a Combination M8/KF8 book...
First Image, last Image 42 72
Processing Mobipocket 6 section of book...
Mobi Version: 6
Codec: utf-8
Title: Malsagulo's Book
Huffdic compression
Unpacking images, resources, fonts, etc
Extracting image: image00042.jpeg from section 42
Extracting image: image00043.jpeg from section 43
Extracting image: image00044.jpeg from section 44
Extracting image: image00045.jpeg from section 45
Extracting image: image00046.jpeg from section 46
Extracting image: image00047.jpeg from section 47
Extracting image: image00048.jpeg from section 48
Extracting image: image00049.jpeg from section 49
Extracting image: image00050.jpeg from section 50
Extracting image: image00051.jpeg from section 51
Extracting image: image00052.jpeg from section 52
Extracting image: image00053.jpeg from section 53
Extracting image: image00054.jpeg from section 54
Extracting image: image00055.jpeg from section 55
Extracting image: image00056.jpeg from section 56
Extracting image: image00057.jpeg from section 57
Extracting image: image00058.jpeg from section 58
Extracting image: image00059.jpeg from section 59
Extracting image: image00060.jpeg from section 60
Extracting image: image00061.jpeg from section 61
Extracting image: image00062.jpeg from section 62
Extracting image: image00063.jpeg from section 63
Extracting image: image00064.jpeg from section 64
Extracting image: image00065.jpeg from section 65
Extracting image: image00066.jpeg from section 66
Extracting image: image00067.jpeg from section 67
Extracting image: image00068.jpeg from section 68
Extracting image: cover00069.jpeg from section 69
Extracting image: image00071.jpeg from section 71
Extracting Page Map Information
File contains kindlegen source archive, extracting as kindlegensrc.zip
File contains kindlegen build log, extracting as kindlegenbuild.log
Unpacking raw markup language
Write ncx
Find link anchors
Insert data into html
Insert hrefs into html
Remove empty anchors from html
Insert image references into html
Building an opf for mobi7/azw4.
Processing K8 section of book...
Mobi Version: 8
Codec: utf-8
Title: Malsagulo's Book
Huffdic compression
Unpacking images, resources, fonts, etc
Extracting Page Map Information
Unpacking raw markup language
Processing ncx / toc
Traceback (most recent call last):
  File "KindleUnpack/lib/kindleunpack.py", line 1020, in <module>
    sys.exit(main())
  File "KindleUnpack/lib/kindleunpack.py", line 1008, in main
    unpackBook(infile, outdir, apnxfile, epubver, use_hd)
  File "KindleUnpack/lib/kindleunpack.py", line 923, in unpackBook
    process_all_mobi_headers(files, apnxfile, sect, mhlst, K8Boundary, False, epubver, use_hd)
  File "KindleUnpack/lib/kindleunpack.py", line 840, in process_all_mobi_headers
    processMobi8(mh, metadata, sect, files, rscnames, pagemapproc, k8resc, obfuscate_data, apnxfile, epubver)
  File "KindleUnpack/lib/kindleunpack.py", line 530, in processMobi8
    [junk1, junk2, junk3, fid, junk4, off] = ncxmap['pos_fid'].split(':')
AttributeError: 'NoneType' object has no attribute 'split'

Otherwise, thanks for the useful tool. It's been a big help -- and still will be, so long as I remember to comment out that part of the NCX. :)
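
For anyone who would rather script that workaround than comment the section out by hand, here is a minimal sketch that drops every navList element from an NCX document before the book is built. It is not part of KindleUnpack; `strip_navlists` and `NCX_NS` are hypothetical names, and the sketch assumes the NCX uses the standard DAISY namespace with navList as a direct child of the root element.

```python
import xml.etree.ElementTree as ET

# Assumption: the NCX declares the standard DAISY NCX namespace.
NCX_NS = "http://www.daisy.org/z3986/2005/ncx/"


def strip_navlists(ncx_text):
    """Return (cleaned_xml, removed_count) with all top-level navList elements removed."""
    # Keep the default namespace unprefixed when serializing.
    ET.register_namespace("", NCX_NS)
    root = ET.fromstring(ncx_text)
    removed = 0
    # findall returns a list, so removing while looping over it is safe.
    for nav_list in root.findall("{%s}navList" % NCX_NS):
        root.remove(nav_list)
        removed += 1
    return ET.tostring(root, encoding="unicode"), removed


sample = """<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <navMap>
    <navPoint id="p1" playOrder="1">
      <navLabel><text>Page 1</text></navLabel>
      <content src="pg1.html" />
    </navPoint>
  </navMap>
  <navList>
    <navLabel><text>Images</text></navLabel>
    <navTarget id="i1">
      <navLabel><text>image caption</text></navLabel>
      <content src="pg1.html" />
    </navTarget>
  </navList>
</ncx>"""

cleaned, removed = strip_navlists(sample)
print(removed)
print(cleaned)
```

Note that ElementTree discards XML comments and the DOCTYPE declaration on a round trip, so this is only suitable for a throwaway copy of the NCX.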

kevinhendricks (Owner) commented Jun 8, 2019 via email

malsagulo (Author) commented

I'm, er, reluctant to send the entire project, but I can maybe see about copying and pasting some kind of test case together for you. This is my first Kindle project, so it is entirely possible I've done something (or many things) wrong somewhere. But if I did, I'm not seeing it.

> You should also be unpacking the mobi specifically for epub3, not 2.

Yeah, I gave that a shot a little while after I posted. The error still occurs with the --epub_version=3 flag set. I also did a bit of cursory debugging on my own. The problem seems to come from the ncxExtract.parseNCX method in mobi_ncx.py, specifically from inside this particular if statement. Since no value is set unless tag has a certain value, tmp['pos_fid'] defaults to None. But this method over in kindleunpack.py seems to expect that property to be a string. You might be able to get around the entire problem just by checking for None in or around that line.
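
A minimal sketch of the kind of None check meant here, not a tested patch. `split_pos_fid` is a hypothetical helper, and the six-part colon-separated layout of `pos_fid` (for example `kindle:pos:fid:0001:off:0000000000`) is an assumption inferred from the six-name unpacking in the traceback.

```python
def split_pos_fid(ncx_entry):
    """Return (fid, off) for an NCX entry, or None when 'pos_fid' is unusable.

    Hypothetical helper: the dict layout mirrors the ncxmap['pos_fid']
    lookup in the traceback, nothing more.
    """
    pos_fid = ncx_entry.get('pos_fid')
    if pos_fid is None:
        # Entries such as navList targets may carry no position data.
        return None
    parts = pos_fid.split(':')
    if len(parts) != 6:
        # Assumption: a valid value has exactly six colon-separated fields.
        return None
    _, _, _, fid, _, off = parts
    return fid, off


entries = [
    {'name': 'Chapter 1', 'pos_fid': 'kindle:pos:fid:0001:off:0000000000'},
    {'name': 'image caption', 'pos_fid': None},
]

# Callers would skip entries that resolve to None instead of crashing.
resolved = [split_pos_fid(entry) for entry in entries]
print(resolved)
```

This only avoids the crash. The navList entries would still be dropped from the unpacked output until their encoding in the mobi is understood.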

> Either that or use Sigil to create a proper ncx from a properly formatted nav.

I don't believe the problem is with how my nav is formatted. Not only was I working directly from the spec, I checked the whole thing against the original DTD. I even went back and added the playOrder attribute to everything, just because the DTD said it was required (even though it doesn't actually do anything anymore). The error still occurred. So I don't think we're looking at a validation or formatting problem here.

@kevinhendricks
Copy link
Owner

kevinhendricks commented Jun 8, 2019 via email

@malsagulo
Copy link
Author

> TypeError: ord() expected a character, but string of length 0 found

@duzhor -- why are you commenting in this issue? This sounds like an entirely different bug. If you want him to address it, open up a new issue.

Repository owner deleted a comment from duzhor Jun 20, 2019
Repository owner deleted a comment from duzhor Jun 20, 2019
Repository owner deleted a comment from duzhor Jun 20, 2019

kevinhendricks (Owner) commented

There are many tools to "borkify" the text of an epub to make it meaningless. I believe there is a Sigil plugin that does this, and calibre has this capability as well.

So, for me to debug this and add support, I really need a working test case if at all possible.

Please consider creating a simple standalone test case from a copy of the original epub that has had all of its text changed, and posting it here so that I can use it to reverse-engineer how navList items are encoded in mobis and add support for that to KindleUnpack.
