buffer in a tmpfile #1

jbenet · 2015-04-03T20:46:06Z

hashpipe needs to buffer all the data as the hash isn't decidable until the last bit. we could buffer on disk with a tmpfile instead of memory.

jbenet · 2015-04-04T01:42:20Z

18:39 <Luke> jbenet: if I use hashpipe, it has to read the whole input into memory, right?
18:40 <jbenet> Luke: yeah, for now. https://github.com/jbenet/hashpipe/issues/1
18:40 <Luke> jbenet: yeah but either way that has to live somewhere. swapping might also take care of it
18:41 <jbenet> Luke: yeah. ideally it would be an option with a specified filename, in case you need to control it.

jbenet · 2015-04-04T01:51:47Z

<Tv`> jbenet: you could use an unlinked tempfile

thanks @tv42!

jmscott · 2015-04-04T03:40:45Z

buffering is a tricky issue and truly breaks the advantages of a pipeline. in practice a solid script would always check the exit code of hashpipe, regardless of the output. perhaps an option to hashpipe?

jbenet · 2015-04-04T03:45:57Z

It should be an option at the very least

jmscott · 2015-04-04T22:42:34Z

buffering the blob still does not remove the need to check the exit code in a pipeline of any consequence, so why buffer? for example, hypothetically, the write of the buffered, verified data on the pipeline could fail with a partial write to the output. if hashpipe fails to digest properly, simply abort with a burp to stderr and a non-zero exit code and let the caller worry about how to harden the environment. in other words, the exit code indicates correctness as much as the output.

jbenet · 2015-04-04T23:01:14Z

> if hashpipe fails to digest properly, simply abort with a burp to stderr and a non-zero exit code

yep, this is what it already does.

the buffering is to avoid nuking the machine's memory. swap may work fine, but it's still not ideal. (e.g. if i know i'm about to get a 100MB exec, i may want to buffer it in a file of my choosing)

jmscott · 2015-04-05T00:11:55Z

so hashpipe buffers (to temp file) the whole blob before writing to output?
if so then why? the exit code would indicate correctness, not the output.
a caller will have to check the exit code anyway in a strict environment,
so it's not clear to me what advantage buffering gives.

-j

jbenet · 2015-04-05T01:35:38Z

it does not currently buffer to a temp file. im saying there should be an option to.

> the buffering is to avoid nuking the machine's memory.

suppose the file being input is 1TB.

kpcyrd · 2015-10-19T21:19:42Z

@jmscott assuming ./a | hashpipe $hash | ./b, ./b has no access to hashpipe's exit code, but will start processing as soon as it gets data on stdin.

@jbenet you could chunk the file, calculate a hash for each chunk and then hash the list of hashes. On start, it reads and verifies the list of hashes with argv[1] and is then able to verify and write each chunk to stdout.

jbenet · 2015-10-19T21:21:32Z

no, ./b should NOT receive any data until the whole hash is verified. this is a well known attack vector. attackers can cut the download mid-stream and leave things in inconsistent states.

kpcyrd · 2015-10-19T21:27:13Z

@jbenet I think you misread my comment. ./b starts processing data as soon as it gets any, which must not happen until the hash is verified. This is the reason why it can't just write to stdout and expect ./b to wait for an exit code.

kpcyrd · 2015-10-19T21:30:40Z

Never mind, noticed you've been referring to the 2nd part.
