[mac] Error after downloading #3

Open
kunstreich opened this issue Oct 8, 2010 · 14 comments

Comments

@kunstreich

Traceback (most recent call last):
  File "/path/to/springer", line 279, in <module>
    main(sys.argv[1:])
  File "/path/to/springer", line 195, in main
    pdfcat(fileList, bookTitlePath)
  File "/path/to/springer", line 28, in pdfcat
    subprocess.Popen(command, shell=False).wait()
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/subprocess.py", line 595, in __init__
    errread, errwrite)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/subprocess.py", line 1106, in _execute_child
    raise child_exception
OSError: [Errno 13] Permission denied
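
Errno 13 raised from inside `subprocess.Popen` generally means the program was located but the OS refused to execute it, for example because the file lacks the execute bit. The cause is not confirmed in this thread. As a hedged sketch in Python 3 (the thread itself runs Python 2.6), a pre-flight check like the following could report the problem before the merge step; the helper name `explain_unusable_tool` is hypothetical and not part of the script:

```python
import os
import shutil


def explain_unusable_tool(tool):
    """Return a human-readable reason `tool` cannot be run, or None if it can."""
    # shutil.which only returns paths that exist and are executable.
    if shutil.which(tool) is not None:
        return None
    # The file is there, but the execute permission check failed.
    if os.path.isfile(tool):
        return "%s exists but is not executable (try chmod +x)" % tool
    return "%s was not found on PATH" % tool


if __name__ == "__main__":
    # "pdftk" is the external tool the original script shells out to.
    problem = explain_unusable_tool("pdftk")
    print(problem or "pdftk looks runnable")
```

Calling this before `subprocess.Popen` turns a bare `OSError` into a message that says whether the tool is missing or merely not executable.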
@yvesf
Contributor

yvesf commented Oct 8, 2010

I replaced the pdftk calls with the pure-Python pyPdf library.
Please try this branch; you need to run git submodule init and git submodule update to clone pyPdf under lib/
http://github.com/yvesf/springer_download/
(this branch also adds SOCKS support, plus some ugliness in how it modifies sys.path)

@kunstreich
Author

Hey yvesf,

nice job! It works like a charm. I'm especially excited about the proxy support, as I prefer to access SpringerLink this way. Do you have instructions? May I help you test?

Thank you.

@yvesf
Contributor

yvesf commented Oct 11, 2010

Thanks for the response. I've made some minor changes, now applied in my master branch. It would be nice if you could do some testing.

You can use SOCKS like this:

   ssh #you@your ssh login server# -D 1234
   ./springer_download.py --socksaddr=localhost --socksport=1234 -l http://Spring-LINK

Please note: I've changed the sanitizeFilename routine (hope it works):

   def sanitizeFilename(filename):
  -    p1 = subprocess.Popen(["echo", filename], stdout=subprocess.PIPE)
  -    p2 = subprocess.Popen(["iconv", "-f", "UTF-8", "-t" ,"ASCII//TRANSLIT"], stdin=p1.stdout, stdout=subprocess.PIPE)
  -    return re.sub("\s+", "_", p2.communicate()[0].strip().replace("/", "-"))
  +    return re.sub("\s+", "_", unicode(filename).encode("ascii", "replace").replace("/","-"))

@kunstreich
Author

There seems to be some kind of timeout. I'm trying this over a flaky cellular network right now. Will do some serious testing tomorrow morning.

fetching book information...
    http://springerlink.com/content/978-3-531-15883-9/contents/
^CTraceback (most recent call last):
  File "/path/to/springer_download.py", line 302, in <module>
    main(sys.argv[1:])
  File "/path/to/springer_download.py", line 86, in main
    page = loader.open(link).read()
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib.py", line 203, in open
    return getattr(self, name)(url)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib.py", line 338, in open_http
    h.endheaders()
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/httplib.py", line 868, in endheaders
    self._send_output()
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/httplib.py", line 740, in _send_output
    self.send(msg)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/httplib.py", line 699, in send
    self.connect()
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/httplib.py", line 683, in connect
    self.timeout)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/socket.py", line 505, in create_connection
    sock.connect(sa)
  File "/path/to/lib/socksipy/socks.py", line 369, in connect
    self.__negotiatesocks5(destpair[0],destpair[1])
  File "/path/to/lib/socksipy/socks.py", line 228, in __negotiatesocks5
    resp = self.__recvall(4)
  File "/path/to/lib/socksipy/socks.py", line 141, in __recvall
    data = data + self.recv(bytes-len(data))
KeyboardInterrupt

@yvesf
Contributor

yvesf commented Oct 12, 2010

Weird. Have you tried using your SOCKS connection with a web browser?
Additionally, you could test your connection without SOCKS; I think at most front-matter.pdf and back-matter.pdf should load.

@kunstreich
Author

SOCKS support works as expected: no problems at all over a stable connection. Great. But I had some problems yesterday with merging the downloaded PDFs. I'm sorry, but I can't provide error messages. I think it was due to some special characters. I will investigate further. Merge upstream!

@kunstreich
Author

I have observed some strange behavior: I'm on a Mac, and the script fails or times out if I have a web proxy (an unrelated ad blocker) activated. If I change the network settings and deactivate the HTTP proxy, everything works fine.
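
One possible explanation, not confirmed in this thread: Python's urllib picks up proxy settings from the environment and, on a Mac, may also read them from the system network configuration, so requests could be routed to the ad-blocker proxy. A minimal sketch of opting out of that, written for Python 3's `urllib.request` (the script in this thread runs on Python 2.6, and the helper name `build_direct_opener` is hypothetical):

```python
import urllib.request


def build_direct_opener():
    """Return an opener that ignores system and environment proxy settings."""
    # An empty mapping tells ProxyHandler to use no proxies at all, instead of
    # the defaults it would otherwise look up from the environment or the OS.
    no_proxy = urllib.request.ProxyHandler({})
    return urllib.request.build_opener(no_proxy)


if __name__ == "__main__":
    opener = build_direct_opener()
    # List the handlers that will actually process requests.
    print([type(h).__name__ for h in opener.handlers])
```

Requests made through `opener.open(url)` would then bypass any configured HTTP proxy, which is the same effect as deactivating it in the network settings.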

@kunstreich
Author

there is also an issue with the sanitizeFilename function:

fetching book information...
    http://springerlink.com/content/978-3-531-13634-9/contents/
Traceback (most recent call last):
  File "/path/to/springer_download.py", line 302, in <module>
    main(sys.argv[1:])
  File "/path/to/springer_download.py", line 123, in main
    bookTitlePath = curDir + "/%s.pdf" % sanitizeFilename(bookTitle)
  File "/path/to/springer_download.py", line 279, in sanitizeFilename
    return re.sub("\s+", "_", unicode(filename).encode("ascii", "replace").replace("/","-"))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 11: ordinal not in range(128)

@yvesf
Contributor

yvesf commented Oct 15, 2010

I made an error in sanitizeFilename. It should work with the latest commit, yvesf/springer_download@ad1659a60b01e6b3ed54
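
For reference, the UnicodeDecodeError in the traceback comes from calling `unicode()` on a UTF-8 byte string, which Python 2 implicitly decodes as ASCII. A minimal sketch of a sanitizer that avoids this, written for Python 3 and not necessarily what the commit does; the name `sanitize_filename` and the NFKD normalization step are assumptions:

```python
import re
import unicodedata


def sanitize_filename(title):
    """Reduce a book title to an ASCII-only, filesystem-friendly name."""
    # Decode explicitly as UTF-8 instead of relying on a default codec.
    if isinstance(title, bytes):
        title = title.decode("utf-8", "replace")
    # NFKD splits accented letters into a base letter plus combining marks,
    # so dropping non-ASCII characters keeps the base letter.
    decomposed = unicodedata.normalize("NFKD", title)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Same substitutions as the original routine: "/" to "-", whitespace to "_".
    return re.sub(r"\s+", "_", ascii_only.strip().replace("/", "-"))


if __name__ == "__main__":
    print(sanitize_filename("Einführung in die/Theorie".encode("utf-8")))
```

Characters without an ASCII base letter are dropped rather than transliterated, so this is lossier than the original iconv-based version.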

@kunstreich
Author

yay, fantastic. thank you yvesf!

@milianw
Owner

milianw commented Oct 16, 2010

anything I should merge into the original branch?

@kunstreich
Author

@milianw yvesf's proxy support is really nice. Please merge.

@yvesf
Contributor

yvesf commented Nov 19, 2010

I like the additional extraction of metadata (but not the idea of storing it in so-called NFO files).
The coding style is not inefficient, but it is dirty. Invasive changes like that are hard to merge back into main.
Although Windows support isn't my focus, I don't think that including various binaries (Windows PE, .NET, DLLs) is the way to go, not to mention possible licensing issues.
Finally, you should create a new bug-tracker entry for this topic; this one is about Mac support.
