#Functional Programming with High Performance Actors
- Download AsyncFP-1.3.3
- Change Log
- Slides
- PearlTree
##Introduction
Why do we need a new approach to programming? On the one hand, Solid State Disks (SSDs) have shifted the balance between I/O and CPU. Where before there might not have been much benefit to faster code, that is simply no longer the case: improvements in execution speed can now significantly improve program performance.
On the other hand, computers are getting fatter, not faster. An i9 chip has 12 hardware threads. A Sparc T4 chip has 64. But writing programs that actually run faster as you add more hardware threads is difficult--which is one of the reasons why single-threaded platforms like node.js are so popular. Passing data between threads (messaging) is inherently slow. Programs which work well with multiple threads typically pass data between threads in large chunks, e.g. data-flow based software. So moving everything to an asynchronous model like actors is unlikely to work well when a lot of messages are involved.
As we move to use more threads in our software, managing state becomes more complex. Locks are not a particularly good solution, as the code is prone to errors--especially as the code changes over time. Actors provide an attractive solution to managing state, as do concurrent data structures.
But with current technologies we end up developing programs with an architecture specific to the problem at hand, and re-architecting them as the requirements change. This is a labor-intensive approach that minimizes code reuse, particularly when using actors, as method calls must be refactored into asynchronous messages or asynchronous messages into method calls. To maximize reuse, an actor should neither know nor care whether its message exchanges with another actor are synchronous or asynchronous. And if most message exchanges are synchronous, we can have many small actors working at a fairly low level without slowing the program to a crawl.
Another issue is flow control and its impact on throughput. Messaging is typically one-way, and it takes extra effort to add flow control. But without some flow control, the systems we develop behave badly under load. Also, if you send one message at a time and do not proceed until a response is received, most actor implementations suffer a severe drop in throughput. What is needed is a messaging system that behaves more like method calls while still sustaining a reasonably high throughput rate.
High performance is critical these days, since computers are not getting any faster. An aggravating factor is that, except for blocking I/O and heavy computation (> 1 millisecond), generally the more you can do on a single thread the faster the program runs. So we use several strategies to achieve this:
- Actors do not require a dedicated thread. This reduces the number of threads needed, so we can dedicate a generous amount of RAM to our thread stacks and still have plenty of memory for other things.
- Outgoing messages are buffered in an ArrayList. This means that the number of thread switches is no longer proportional to the load placed on a program. Passing arrays of messages also works well with multiprocessor caching architectures.
- Rather than treating messages as events, messages are used to implement request/response. Flow control is then implicit, which improves performance under high loads. And when a request is sent, a response is always returned unless an exception has been thrown.
- Multiple actors share the same input message queue (mailbox), which reduces blocking on outgoing messages. More importantly, actors sharing the same mailbox always operate on the same thread, so a request/response between these actors is done as a method call without having to enqueue the request and response messages.
- Next, we distinguish the actors which may perform I/O or other blocking operations. So when a message is sent to a non-blocking actor which has an empty mailbox, the active thread can take control and process the message synchronously. This eliminates the delay which would otherwise occur when sending a message to an idle actor.
- Finally, actors support multiple types of message bindings. So for example, an actor can immediately respond to a request with a thread-safe object, e.g. an immutable object, regardless of the source of the request.
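The mailbox-sharing and synchronous-dispatch strategies above can be illustrated with a minimal sketch in plain Scala. This is illustrative only; the names and API here are hypothetical, not AsyncFP's:

```scala
import scala.collection.mutable

// Illustrative sketch, not AsyncFP's API: a request between actors that
// share a mailbox is dispatched as a plain method call; otherwise it is
// buffered and processed later on the target's thread.
class Mailbox {
  val pending = mutable.ArrayBuffer[() => Unit]() // buffered asynchronous requests

  def run(): Unit =
    while (pending.nonEmpty) pending.remove(0)() // deliver buffered requests
}

class Actor(val mailbox: Mailbox) {
  def request[A](target: Actor, op: () => A)(respond: A => Unit): Unit =
    if (target.mailbox eq mailbox)
      respond(op())                                   // same mailbox: synchronous, no enqueue
    else
      target.mailbox.pending += (() => respond(op())) // different mailbox: enqueue
}
```

With this scheme a request always produces a response, and actors sharing a mailbox pay no messaging overhead at all.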
Here's a quick look at how we implemented the above:
- Speed --New in release 2.0.0
With high-performance actors we can handle more messages, and we'll want to employ a functional programming approach, e.g. find, map, filter etc., to keep our code clear and readable. We can also use state machines to linearize code that would otherwise be ugly and deeply nested.
Here's a quick look at:
- Sequence, which supports a functional programming approach. --New in release 2.0.0
- And Chain, a simple state machine that helps keep the code linear. --New in release 2.0.0
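As a rough illustration of what a chain of operations buys you (illustrative only; AsyncFP's actual Chain API may differ), here is a toy chain in plain Scala where each step receives the previous step's result, keeping the code linear instead of deeply nested:

```scala
// Toy Chain sketch (hypothetical names, not AsyncFP's API): each op
// consumes the previous result, so composed operations read top to bottom.
class Chain[A](run: () => A) {
  def op[B](step: A => B): Chain[B] = new Chain(() => step(run()))
  def result(): A = run()
}

object Chain {
  def start[A](a: => A): Chain[A] = new Chain(() => a)
}

val total = Chain.start(List(1, 2, 3, 4))
  .op(_.filter(_ % 2 == 0)) // keep the even numbers
  .op(_.map(_ * 10))        // scale them
  .op(_.sum)                // and total the result
  .result()
```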
##Current Features
- Vertical scaling is achieved organically, without having to resort to application-specific architecture. This maximizes code reuse.
- Message processing is configurable.
- Always synchronous messages.
- Always asynchronous messages.
- Synchronous or asynchronous depending on the relationship between the requesting and target actor.
- Many readers or one writer transaction model.
- Replies do not block threads, so they can be used to increase code readability without impeding scalability.
- Chains of operations can be composed, with messages built using the results of earlier operations.
- Interoperable with Scala Reactor.
- Sequences operate both synchronously and asynchronously. Head, tail, merge, intersection operations are also supported.
- Composite actors, with components included as needed to meet dependency requirements.
- Composite actors are used with Dependency Injection, for greater flexibility in system configuration.
- Subsystems are implemented using hierarchical dependency injection.
- Systems and subsystems can have configurable properties.
- Incremental (lazy) deserialization, for high-speed updates of a persistent store. Only the data which is accessed needs to be deserialized, and only the data which is updated needs to be reserialized.
- Actor and component configuration handled by their respective factories. This keeps the code simple but flexible, with different factories employed for different uses.
- Deserialization makes use of a factory registry. Configuration data is then not serialized, but supplied by the factory when an actor is reinstantiated.
- Basic support for transactional persistence.
- Crash-proof persistence.
- Transaction logs can be used to rebuild the database.
- Batch updates and opportunistic locking. --New in release 1.3.0
- Swift only periodically updates its datastore. On restart, the tail of the old log file is used to create a current version of the datastore. The transaction rate is now limited primarily by the flushes to the log file. --New in release 1.3.2
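The incremental (lazy) deserialization idea listed above can be sketched in a few lines of plain Scala (hypothetical names, not AsyncFP's implementation): keep the raw bytes, decode only on first access, and re-encode only if the value was actually updated:

```scala
// Lazy-deserialization sketch (illustrative only): raw bytes are decoded
// on first access; serialization reuses the raw bytes unless the value
// has been updated.
class LazyCell(raw: Array[Byte]) {
  private var decoded: Option[String] = None
  private var dirty = false

  def get: String = decoded.getOrElse {
    val v = new String(raw, "UTF-8") // decode only when first accessed
    decoded = Some(v)
    v
  }

  def set(v: String): Unit = { decoded = Some(v); dirty = true }

  def serialize: Array[Byte] =
    if (dirty) decoded.get.getBytes("UTF-8") // re-encode only if updated
    else raw                                 // untouched data passes through as-is
}
```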
##Short and Long Term Goals
- Rework this home page.
- A faster database (Falcon) which consolidates updates to reduce the number of times the log must be flushed.
- Tables (using binary search) for faster serialization/deserialization of large collections.
- Copy-on-Write (CoW) Hierarchical database.
- High-performance and highly-scalable b-trees.
- 2-Stage commit.
- Versioning and many readers and single writer (in contrast to many readers or single writer).
The issues that need to be addressed for b-trees include:
- B-tree nodes need to be tables (arrays or lists) accessed using a binary search, rather than a TreeMap structure. This avoids deserializing all the keys in a node and constructing a TreeMap. The reason is that only a small portion of the data in a node may be accessed or updated while the node is resident in memory, and the less data that is processed, the better.
- The plan is to use Copy-on-Write (CoW) to implement the b-trees, for robustness and because CoW makes a transactional database easy to implement. But CoW has a scalability issue: when a block is updated, it is copied to a new location, which requires an update to the block that references it, recursively, up to the root block. For high performance, the new location needs to be recorded in the highest possible block of the tree, preferably the root block. This significantly reduces the number of blocks that must be updated each time a small change is made to the database, but it complicates the logic considerably.
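The path-copying behavior of CoW described above can be sketched with a toy binary tree in plain Scala (an illustration of the technique, not the planned b-tree implementation): updating one leaf copies only the nodes on the path from the root, while the old version remains intact and unchanged subtrees are shared:

```scala
// Copy-on-write sketch (illustrative only): an update copies the nodes
// along the path from the root to the changed leaf; everything else is
// shared between the old and new versions of the tree.
sealed trait Node
case class Leaf(value: Int) extends Node
case class Branch(left: Node, right: Node) extends Node

// path: false = go left, true = go right
def update(node: Node, path: List[Boolean], v: Int): Node = (node, path) match {
  case (_: Leaf, Nil)                => Leaf(v)
  case (Branch(l, r), false :: rest) => Branch(update(l, rest, v), r) // copy branch, share r
  case (Branch(l, r), true :: rest)  => Branch(l, update(r, rest, v)) // copy branch, share l
  case _                             => sys.error("path does not match tree shape")
}
```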
##Orientation
- Scalability Without Architecture
- Speed Simplifies Everything
- Factories Revisited
- Composite Actors and Dependency Injection
##Blip Project - Beta
Callbacks make everything easy!
- Introduction
- Tutorial
- Design Decisions
- Mailbox Messages
- Message Flow
##Building on Blip
- Incremental Deserialization - Alpha
- Persistence - Alpha
##Contact Us!
Feel free to join the discussion group at https://groups.google.com/group/agilewikidevelopers.
Bill La Forge
Independent Research Engineer
[email protected]
Twitter: laforge49
Pearltree
Development Environment
Licence - LGPL
Other Links of Interest