Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase code readability/clarity #212

Open
kristinapathak opened this issue May 14, 2020 · 4 comments
Open

Increase code readability/clarity #212

kristinapathak opened this issue May 14, 2020 · 4 comments

Comments

@kristinapathak
Copy link
Contributor

func (obs *CaduceusOutboundSender) dispatcher() {

I added comments to the dispatcher() function and the outbound sender in general, but overall the outbound sender is not very easy to understand. If possible, we should figure out if there's a way to maintain performance while also improving readability.

@johnabass
Copy link
Collaborator

johnabass commented May 14, 2020

The dispatcher method has this basic form:

for {
    select {
        case msg, ok := <- messages:
            if !ok {
                // stop handling messages
                // this case happens when messages is closed AND empty
            }

            // handle msg
            // this WILL execute after the messages channel is closed but NOT empty
    }
}

This code is basically trying too hard. The for/range idiom in golang works for this:

// this for loop will exit in the exact same way as the existing, verbose loop:
// messages will have to be closed and empty
for msg := range messages {
     // handle msg
}

See: https://gobyexample.com/range-over-channels

@johnabass
Copy link
Collaborator

The code uses a chan *wrp.Message. This is a common usage pattern for coders coming from pointer-happy languages, like C. However, it isn't necessary in golang and it can cause the gc to work much harder.

Using a chan wrp.Message works much better in most cases. Copying the message onto what is essentially a doubly-ended queue winds up being much more efficient than having the gc track references to each element in a channel. It is true that wrp.Message itself has references, like slices. But those aren't always set and thus do not always involve the gc.

Don't forget that a pointer in any garbage-collected language isn't just a pointer. It's a handle, and the thing it points to can change due to the gc rearranging memory. When you have a lot of *wrp.Message elements hanging around waiting to be processed, the gc has to work arbitrarily harder just to check objects that application code knows for certain aren't going anywhere until they're pulled off the channel. Using by-value semantics, e.g. wrp.Message instead of *wrp.Message, eases the load on the gc considerably in any situation where channels have a lot of elements that take a nontrivial amount of time to process. Which is almost every situation...

As always, profiling will tell you if this is a performance improvement. But it almost certainly will be.

@johnabass
Copy link
Collaborator

Another optimization around channels to consider: Use a chan []byte instead of a chan *wrp.Message. Individual goroutines that dispatch (or maybe handle?) messages would then decode the []byte. This allows reuse of instances, thus easing up on memory allocations:

var wmsg wrp.Message
d := wrp.NewDecoderBytes(nil, wrp.Msgpack) // you get the benefit of reflection caching here
for msg := range messages {
    d.ResetBytes(msg)
    wmsp = wrp.Message{} // null out, just in case
    d.Decode(&wmsg) // TODO: error handling
    go.send(wmsg) // pass by value, ie a copy
}

If you need additional information around a message, like maybe expiry, then just pass a struct with appropriate fields on the channel instead of just a []byte.

type Envelope struct {
    Expiry time.Time
    Raw []byte
}

c := make(chan Envelope, 100)
for e := range c {
    // etc
}

This also allows caduceus to drop messages without expending the compute to decode messages, since decoding is delayed to the point at which the message is handled.

A slight disadvantage with this approach is that caduceus can receive invalid messages which will be enqueued. This is easily mitigated with judicious limits on the http.Handler code, such as a max HTTP entity size to prevent a DDoS attack.

@kristinapathak
Copy link
Contributor Author

@johnabass per your first comment (#212 (comment)), the more complex form is needed to load a new channel if it has changed:

msgQueue := obs.queue.Load().(chan *wrp.Message)

Without the for-select, we can't load something new when we are cut off. If we changed it to the traditional for-range, how could we still do this?

@kristinapathak kristinapathak removed their assignment Oct 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants