Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

lep: request all the things #2

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
184 changes: 184 additions & 0 deletions XXX-request-all-the-things.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,184 @@
| Title | Request all the things |
|--------|-------------------------|
| Author | @saghul |
| Status | DRAFT |
| Date | 2014-11-27 07:43:59 |


## Overview

This proposal describes a new approach for dealing with operations in libuv. As of
right now, handles define an entity which is capable of performing certain operations.
These operations are sometimes a result of a request being sent and some other times a
result of a callback (which was passed by the user) being called. This proposal aims
to make this behavior more consistent, by turning several operations that currently
just take a callback into a request form.


### uv_read

(This was previously discussed, but it’s added here for completeness).

Instead of using a callback passed to `uv_read_start`, the plan is to use a `uv_read`
function which performs a single read operation. The initial prototype was defined
as follows:

~~~~
int uv_read(uv_read_t* req, uv_stream_t* handle, uv_buf_t[] bufs, unsigned int nbufs, uv_read_cb, cb)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Long line.

EDIT: Is there a guideline for that? I would suggest 80 columns.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, I kept them between 80 and 100, except the code blocks, but I can fix that.

~~~~

The read callback is defined as:

~~~~
typedef void (*uv_read_cb)(uv_read_t* req, int status)
~~~~

This approach has one problem, though: memory for reading needs to be allocated upfront,
which might not be desirable in all cases. For this reason, a secondary version which takes
an allocation callback is also proposed:

~~~~
int uv_read2(uv_read_t* req, uv_stream_t* handle, uv_alloc_cb alloc_cb, uv_read_cb cb)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the name uv_read2(). Maybe uv_read_alloc()? It better describes what it does. We should rename uv_write2() as well, of course, for consistency.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

~~~~

Applications can use one or the other or mixed without problems.

Implementation details: we probably will want to have some `bufsml` of size 4 where we
copy the structures when the request is created, like `uv_write` does. Thus, the user can
pass a `uv_buf_t` array which is allocated on the stack, as long as the memory in each `buf->base`
is valid until the request callback is called.

Inline reading: if there are no conditions which would prevent otherwise, we could try to do
a read on the spot. This should work ok if the user provided preallocated buffers, because
we can hold on to them if we get EAGAIN. If `uv_read2` is used, instead, we won’t attempt
to read on the spot because the allocation callback would have to be called and we’d end
up holding on to the buffer for too long, thus defeating the purpose of deferred allocation.
A best effort inline reading function is also proposed:

~~~~
int uv_try_read(uv_stream_t* handle, uv_buf_t[] bufs, int nbufs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the return value should be ssize_t?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't ssize_t's lower bound supposed to be -1? We return uv errors which are < -1.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

POSIX requires that it has a range of at least -1 to 32,767 but in practice it's a signed int or long. I suppose an int is alright; it's consistent with uv_try_write() and 2^31-1 is still pretty big.

Apropos signed vs. unsigned, nbufs should probably be unsigned; its uv_try_write() counterpart is. (And bufs is const.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I'll address all these tonight. Thanks a lot Ben!

~~~~

It does basically the analogous to `uv_try_write`, that is, attempt to read inline and
doesn’t queue a request if it doesn’t succeed.

### uv_stream_poll

In case `uv_read` and `uv_read2` are not enough, another way to read or write on streams
would be to get a callback when the stream is readable / writable, and use the `uv_try_*`
family of functions to perform the reads and writes inline. The proposed API for this:

~~~~
int uv_stream_poll(uv_stream_poll_t* req, uv_stream_t* handle, int events, uv_stream_poll cb)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I like uv_stream_poll() + uv_try_read() the best, even more so than uv_read(). I think we can turn the latter into something that is API sugar on top of the former?

~~~~

`events` would be a mask composed of `UV_READABLE` and / or `UV_WRITABLE`.

The callback is defined as:

~~~~
typedef void (*uv_stream_poll_cb)(uv_stream_poll_t* req, int status)
~~~~


### uv_timeout

Currently libuv implements repeating timers in the form of a handle. The current implementation
does not account for the time taken during the callback, and this has caused some trouble
every now and then, since people have different expectations when it comes to repeating timers.

This proposal removes the timer handle and makes timers a request, which gets its callback
called when the timeout is hit:

~~~~
int uv_timeout(uv_timeout_t* req, uv_loop_t* loop, int timeout, uv_timeout_cb cb)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

timeout should be an uint64_t or maybe even a double. For all its faults, libev does get timers right; you specify the timeout as a floating point value and the fractional part gets rounded up to platform granularity. E.g. 1.2345 becomes 1230 ms or 1,234,500 us, depending on whether the platform supports sub-millisecond precision.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ups, that should be uint64_t, but I'm open to the double approach.

~~~~

Timers are one shot, so no assumptions are made and repeating timers can be easily
implemented on top (by users).

The callback takes the following form:

~~~~
typedef void (*uv_timeout_cb)(uv_timeout_t* req, int status)
~~~~

The status argument would indicate success or failure. One possible failure is cancellation,
which would make status == `UV_ECANCELED`.

Implementation detail: Timers will be the first thing to be processed after polling for i/o.


### uv_callback

In certain environments users would like to get a callback called by the event loop, but
scheduling this callback would happen from a different thread. This can be implemented using
`uv_async_t` handles in combination with some sort of thread safe queue, but it’s not
straightforward. Also, many have fallen in the trap of `uv_async_send` coalescing calls,
that is, calling the function X times does yield the callback being called X times; it’s
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/does/does not/?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed.

called at least once.

`uv_callback` requests will queue the given callback, so that it’s called “as soon as
possible” by the event loop. 2 API calls are provided, in order to make the thread-safe
version explicit:

~~~~
int uv_callback(uv_callback_t* req, uv_loop_t* loop, uv_callback_cb cb)
int uv_callback_threadsafe(uv_callback_t* req, uv_loop_t* loop, uv_callback_cb cb)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not entirely clear to me what the thread-safe version would do. Block until the event loop processes the callback?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it was an implementation detail, but maybe it's worth mentioning. The loop keeps track of active requests using a queue, so we cannot really insert anything there from another thread. My idea was to have a separate queue for callback requests which are sent from a different thread, then, as part of the loop processing, those requests are fully initialized and added to the normal queue. This other queue would need to be thread safe, using locking, atomic ops or whatever.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just added a note about this.

~~~~

The callback definition:

~~~~
typedef void (*uv_callback_cb)(uv_callback_t* req, int status)
~~~~

The introduction of `uv_callback` would deprecate and remove `uv_async_t` handles.
Now, in some cases it might be desired to just wakeup the event loop, and having to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/wakeup/wake up/

create a request might be too much, thus, the following API call is also proposed:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm experimenting with a new API to allow queue's of work to be done, where the callback of each queue returns an int. If the return value is != 0 then the queue stops being processed. With that, I'd probably have to add a queue complete callback.

Anyway, I got part of the idea from GCD and I'll be implementing it in libnub for testing. This is just an FYI.


~~~~
void uv_loop_wakeup(const uv_loop_t* loop)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can the loop argument be const here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd just write on the write end of the pipe, can't we have it const?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, like that. I've been playing around with atomic operations, actually (set a loop flag atomically and make the receiving thread check that before polling.) That requires a non-const loop.

And, from an ideological perspective, arguments should really only be const when the operation has no side effect. A function returning void is always called purely for its side effects, of course.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I'll make it non-const then, and we can investigate if atomic ops or the pipe / eventfd are a better approach.

~~~~

Which would just wakeup the event loop in case it was blocked waiting for i/o.

Implementation detail: the underlying mechanism for `uv_async_t` would remain (at least on Unix).

Note: As a result of this addition, `uv_idle_t` handles will be deprecated an removed.
It may not seem obvious at first, but `uv_callback` achieves the same: the loop won’t block
for i/o if any `uv_callback` request is pending. This becomes even more obvious with the
“‘pull based’ event loop” proposal.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we change “‘pull based’ event loop” to use ASCII characters (i.e. " and ')?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.



### uv_accept / uv_listen

Currently there is no way to stop listening for incoming connections. Making the concept
of accepting connections also request based makes the API more consistent and easier
to use: if the user decides so (maybe because she is getting EMFILE because she ran
out of file descriptors, for example) she can stop accepting new connections.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry. I'm not a fan of pronouns. Possibly change it to:

for example, to stop accepting new connections because EMFILE is received because file descriptors have been depleted.


New API:

~~~~
int uv_listen(uv_stream_t* stream, int backlog)
~~~~

The uv_listen function loses its callback, becoming the equivalent of `listen(2)`.

~~~~
int uv_accept(uv_accept_t* req, uv_stream_t* stream, uv_accept_cb cb)
typedef void (*uv_accept_cb)(uv_accept_t* req, int status)
~~~~

Once a connection is accepted the request callback will be called with status == 0.
The `req->fd` field will contain a `uv_os_fd_t` value, which the user can use together

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That in particular won't work on windows. It is not common to map an open socket to a "CRT fd" and it'd limit the number of open sockets to 2000. An uv_os_handle_t could work though.

Pragmatics and all that, but it has always my belief that if properly designed libuv would not need to expose raw file descriptors / handles to the user, ever. That makes for a better cross-platform abstraction, and it also clarifies the issue of ownership; if OS fds/sockets are completely hidden from the user it's clear that libuv is responsible for managing the resource.

At the same time I am well aware that being religious about this has led to some less than pretty APIs, like the way the uv_listen callback receives incoming connections in version 1.0 makes the user do casts and switches based on the handle type. The infamous "julia pipe" issue can also be partly attributed to this: casually creating and discarding a libuv handle is super unpractical.

So I'm +1 if changed to uv_os_handle_t (which is to be defined as HANDLE on windows, and int on unix), but expect me to always be a bit sad about this api.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uv_os_fd_t is already a HANDLE: https://github.com/libuv/libuv/blob/master/include/uv-win.h#L233 ✨ I added it for uv_fileno which doesn't return CRT fds, but Windows HANDLE thingies. So, we seem to be on the same page here 👍

with `uv_tcp_open` for example. (This needs further investigation to verify it would
work in all cases).


### A note on uv_cancel

Gradually, `uv_cancel` needs to be improved to allow for cancelling any kind of requests.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uv_cancel short-circuits an operation, but the callback still gets called (with ECANCELED as the status). That makes it useful for canceling things that might otherwise take a long time, however it doesn't allow the embedder forget about an operation entirely.

That makes it rather pointless to uv_cancel an uv_callback; the uv_callback never takes a long time so uv_cancel would merely change the status code but not do anything else.

It is not always possible to cancel an operation and also prevent it's callback from executing. But sometimes it is possible - think timers, or -indeed- an uv_callback. We should probably have a separate api for this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes it rather pointless to uv_cancel an uv_callback; the uv_callback never takes a long time so uv_cancel would merely change the status code but not do anything else.

You're right, I'm not sure what I was thinking here...

It is not always possible to cancel an operation and also prevent it's callback from executing. But sometimes it is possible - think timers, or -indeed- an uv_callback. We should probably have a separate api for this.

I think we could leave the API we have for uv_cancel like it is, for now. For timers, it does make sense to be able to cancel them, and the callback would be called with status == UV_ECANCELED, so you know it didn't actually kick in.

Ideally it would be nice to avoid getting the callback called with UV_ECANCELLED in case uv_cancel returned "true", but I'm not sure this can be done in all cases.

Some of them might be a bit harder, but `uv_timeout` and `uv_callback` should be easy
enough to do.