This repository has been archived by the owner on Feb 3, 2023. It is now read-only.

riker for hash table #226

Merged
merged 57 commits into from
Aug 30, 2018

Contributor

@thedavidmeister thedavidmeister commented Aug 18, 2018

IMPORTANT: tests are failing CI because the docker box uses stable and we need nightly. test locally for review

fixes #135
fixes #137

this introduces the Riker actor system for #135

the problem

the state tree (redux style state) is great for small, well known items of data but struggles with:

  • generic traits: trait info propagates to the root of the state tree and everything that touches it
  • lifetimes: the same issue as generic traits
  • large data: the state tree is cloned on every "mutation" by design
  • external resources: any external state/service must be a black box resource ID, so "time travel" is meaningless in this context
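
a minimal sketch of the first and third points, assuming a much simplified `HashTable` trait. the trait signature, `AgentState`, `State` and `reduce_commit` are all hypothetical names for illustration, not the code in this repo:

```rust
use std::collections::HashMap;

/// hypothetical, simplified stand-in for the `HashTable` trait
trait HashTable: Clone {
    fn commit(&mut self, key: &str, value: &str);
    fn get(&self, key: &str) -> Option<String>;
}

#[derive(Clone, Default)]
struct MemTable {
    entries: HashMap<String, String>,
}

impl HashTable for MemTable {
    fn commit(&mut self, key: &str, value: &str) {
        self.entries.insert(key.to_string(), value.to_string());
    }
    fn get(&self, key: &str) -> Option<String> {
        self.entries.get(key).cloned()
    }
}

// the type parameter has to be repeated at every level up to the root
#[derive(Clone)]
struct Chain<HT: HashTable> {
    table: HT,
}

#[derive(Clone)]
struct AgentState<HT: HashTable> {
    chain: Chain<HT>,
}

#[derive(Clone)]
struct State<HT: HashTable> {
    agent: AgentState<HT>,
}

/// redux style "mutation": clone the whole tree, then change the copy
fn reduce_commit<HT: HashTable>(old: &State<HT>, key: &str, value: &str) -> State<HT> {
    let mut new = old.clone(); // this also clones every entry in the table
    new.agent.chain.table.commit(key, value);
    new
}

fn main() {
    let s0 = State { agent: AgentState { chain: Chain { table: MemTable::default() } } };
    let s1 = reduce_commit(&s0, "a", "1");
    assert_eq!(s0.agent.chain.table.get("a"), None);
    assert_eq!(s1.agent.chain.table.get("a"), Some("1".to_string()));
}
```

every struct and function between the table and the root has to carry `<HT: HashTable>`, and the cost of each reduce grows with the size of the table.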

(image: black ink in water; actual photo of generic traits spreading through our application state)

this solution: riker actors

http://riker.rs/ implementing the riker library for actors

the first implementation is for hash tables

riker core concepts:

  • protocol: a set of valid messages that can be sent (e.g. an enum)
  • actor system: manages all the actors for a given protocol
  • actor: anything implementing Actor that creates new actor instances and defines message receive
  • actor instance: an instance of the actor struct that has internal state and is tracked by the actor system
  • actor ref(erence): an ActorRef<MyProtocol> that can tell messages to the actor instance it references via the actor system

the approach in this PR is to implement the HashTable trait for actor refs. the actor ref is passed an "inner table" that also implements the same HashTable trait at construction time. the actor ref becomes a standardised, transparent wrapper around the inner table implementation.

this means that calling table.commit() and table_actor_ref.commit() do the same thing

the benefit of table actor refs:

  • known size at compile time, safe as properties of structs/enums
  • small size, almost free to clone
  • safe to share across threads and copy, no Arc reference counting, no locks, etc.
  • safe to drop (the actor system maintains a URI style lookup)
  • known type, no onerous generic trait handling
  • no onerous lifetimes

implementation deets

the 1:1 API implementation between actors and their inner table is achieved by internally blocking on an ask from riker patterns - https://github.com/riker-rs/riker-patterns

the actor ref methods implementing HashTable send messages to itself

calling table_actor_ref.commit(entry) looks like this:

  1. the actor ref constructs a HashTableProtocol::Commit message including the entry
  2. the actor ref calls its own ask method, which builds a future using riker's ask
  3. the actor ref blocks on its internal future
  4. the referenced actor receives the Commit message and matches/destructures this into the entry
  5. the entry is passed to the commit() method of the inner table
  6. the actor's inner table, implementing HashTable, does something with commit (e.g. MemTable inserts into a standard Rust, in-memory HashMap)
  7. the return value of the inner table commit is inserted into a CommitResult message
  8. the CommitResult message is sent by the actor back to the actor ref's internal future
  9. the actor ref stops blocking
  10. the CommitResult message is destructured by the actor ref so that the return of commit satisfies the HashTable trait implementation
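
the steps above can be sketched without riker at all. this is only an illustration of the ask/block/reply shape: it swaps riker's actor system and `ask` future for a plain thread and `std::sync::mpsc` channels. `Entry`, `Ask`, `TableRef`, `spawn_table_actor` and the trait signature are hypothetical names, not the types in this PR:

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Sender};
use std::thread;

#[derive(Clone, Debug, PartialEq)]
struct Entry {
    key: String,
    content: String,
}

/// hypothetical, simplified stand-in for the `HashTable` trait
trait HashTable {
    fn commit(&mut self, entry: &Entry) -> Result<(), String>;
    fn get(&self, key: &str) -> Result<Option<Entry>, String>;
}

#[derive(Default)]
struct MemTable {
    entries: HashMap<String, Entry>,
}

impl HashTable for MemTable {
    fn commit(&mut self, entry: &Entry) -> Result<(), String> {
        self.entries.insert(entry.key.clone(), entry.clone());
        Ok(())
    }
    fn get(&self, key: &str) -> Result<Option<Entry>, String> {
        Ok(self.entries.get(key).cloned())
    }
}

/// the protocol: every message an actor or actor ref may send
enum HashTableProtocol {
    Commit(Entry),
    CommitResult(Result<(), String>),
    Get(String),
    GetResult(Result<Option<Entry>, String>),
}

/// a request paired with the channel the actor should reply on
struct Ask {
    message: HashTableProtocol,
    reply_to: Sender<HashTableProtocol>,
}

/// stand-in for the actor ref: non-generic, holds only a mailbox address
#[derive(Clone)]
struct TableRef {
    mailbox: Sender<Ask>,
}

/// the "actor": owns the inner table on its own thread (steps 4 to 8)
fn spawn_table_actor<HT: HashTable + Send + 'static>(mut inner: HT) -> TableRef {
    let (mailbox, inbox) = channel::<Ask>();
    thread::spawn(move || {
        for Ask { message, reply_to } in inbox {
            let response = match message {
                HashTableProtocol::Commit(entry) => {
                    HashTableProtocol::CommitResult(inner.commit(&entry))
                }
                HashTableProtocol::Get(key) => HashTableProtocol::GetResult(inner.get(&key)),
                other => other, // results are not requests, echo them back
            };
            let _ = reply_to.send(response);
        }
    });
    TableRef { mailbox }
}

impl TableRef {
    /// steps 2, 3 and 9: send the message, then block until the reply arrives
    fn block_on_ask(&self, message: HashTableProtocol) -> HashTableProtocol {
        let (reply_to, reply) = channel();
        self.mailbox.send(Ask { message, reply_to }).expect("actor stopped");
        reply.recv().expect("actor dropped the reply channel")
    }
}

impl HashTable for TableRef {
    fn commit(&mut self, entry: &Entry) -> Result<(), String> {
        // steps 1 and 10: build the message, destructure the response
        match self.block_on_ask(HashTableProtocol::Commit(entry.clone())) {
            HashTableProtocol::CommitResult(result) => result,
            _ => Err("unexpected response to Commit".to_string()),
        }
    }
    fn get(&self, key: &str) -> Result<Option<Entry>, String> {
        match self.block_on_ask(HashTableProtocol::Get(key.to_string())) {
            HashTableProtocol::GetResult(result) => result,
            _ => Err("unexpected response to Get".to_string()),
        }
    }
}

fn main() {
    let mut table = spawn_table_actor(MemTable::default());
    let entry = Entry { key: "k".to_string(), content: "v".to_string() };
    assert_eq!(table.commit(&entry), Ok(()));
    // a cloned ref talks to the same actor, so it sees the same data
    assert_eq!(table.clone().get("k"), Ok(Some(entry)));
}
```

`spawn_table_actor` is generic over the inner table but `TableRef` is not, which is the property that lets a ref sit in the state tree with a known type.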

riker's ask returns a future from the futures 0.2.2 crate. table_actor.ask calls block_on and then unwrap on that future. both the blocking and the unwrap should be handled better in the future.
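
one possible direction for handling the block and the unwrap better, sketched with std channels rather than riker's `ask`: return a `Result` and bound the wait with a timeout. `AskError`, `Ask`, `ask` and `spawn_echo` are hypothetical names for this sketch only:

```rust
use std::sync::mpsc::{channel, RecvTimeoutError, Sender};
use std::thread;
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum AskError {
    ActorStopped,
    TimedOut,
}

/// a request paired with the channel the actor should reply on
struct Ask {
    message: String,
    reply_to: Sender<String>,
}

/// like a blocking ask, but failures come back as values instead of panics
fn ask(mailbox: &Sender<Ask>, message: String, timeout: Duration) -> Result<String, AskError> {
    let (reply_to, reply) = channel();
    mailbox
        .send(Ask { message, reply_to })
        .map_err(|_| AskError::ActorStopped)?;
    reply.recv_timeout(timeout).map_err(|e| match e {
        RecvTimeoutError::Timeout => AskError::TimedOut,
        RecvTimeoutError::Disconnected => AskError::ActorStopped,
    })
}

/// toy actor that echoes every message back to the asker
fn spawn_echo() -> Sender<Ask> {
    let (mailbox, inbox) = channel::<Ask>();
    thread::spawn(move || {
        for Ask { message, reply_to } in inbox {
            let _ = reply_to.send(format!("echo: {}", message));
        }
    });
    mailbox
}

fn main() {
    let live = spawn_echo();
    let reply = ask(&live, "hi".to_string(), Duration::from_secs(1));
    assert_eq!(reply, Ok("echo: hi".to_string()));

    // a mailbox whose receiver is gone models a stopped actor
    let (dead, inbox) = channel::<Ask>();
    drop(inbox);
    let reply = ask(&dead, "hi".to_string(), Duration::from_secs(1));
    assert_eq!(reply, Err(AskError::ActorStopped));
}
```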

limitations, tradeoffs

i wish that this were free

nightly rust

IIRC riker (or futures, or both) needs nightly rust (TODO: double check this, document why)

rust futures

the futures story for rust is "WIP"

even so, futures look far more pragmatic than thread/channel juggling for many situations

for example, the observer/sensor/event-loop state model we implemented ad-hoc looks a lot like some of the future/poll/task system internals

dependency on riker

once/if we merge this, we're pretty much in bed with riker moving forward:

  • it makes little sense to reinvent our approach, the same problems in our state tree apply to logging, network state, etc.
  • riker is a relatively new crate
    • sub 20 stars on github
    • ambitious but incomplete roadmap
    • incomplete docs
    • ??? team, 1 person? funded? motivated by? open to collaboration?
  • the same things that make riker powerful (opinionated, implementing a very specific approach) also mean we must pay attention along the way (discussion, careful prototyping, etc.)
  • riker is clearly the type of thing that we'd find ourselves reaching for a lot (at least for external state, possibly even for some internal state too) once it is in there

riker ask actor/future vs. futures poll

rust has limited/awkward callback support so futures in rust (unlike basically every other language with futures) are poll based, and must be "driven" externally

the native future model is designed primarily to be composable on the level of poll. the whole thing only works because nested poll calls in nested futures can bubble their results.

the usefulness of futures comes in large part through the various abstractions for controlling how nested/parallel poll results are called, merged, blocked on, etc.
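
a small sketch of what "nested poll calls bubble their results" means. note the assumption: this uses the `std::future::Future` trait as it exists in current stable rust, not the futures 0.2.2 API this PR depends on. `Countdown`, `Shout`, `NoopWake` and `drive` are made up for the example:

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// leaf future: pending for `remaining` polls, then ready
struct Countdown {
    remaining: u32,
}

impl Future for Countdown {
    type Output = &'static str;
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.remaining == 0 {
            Poll::Ready("done")
        } else {
            self.remaining -= 1;
            cx.waker().wake_by_ref(); // ask to be polled again
            Poll::Pending
        }
    }
}

/// outer future: its poll calls the inner poll and bubbles the result up
struct Shout<F> {
    inner: F,
}

impl<F: Future<Output = &'static str> + Unpin> Future for Shout<F> {
    type Output = String;
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<String> {
        match Pin::new(&mut self.inner).poll(cx) {
            Poll::Ready(s) => Poll::Ready(s.to_uppercase()),
            Poll::Pending => Poll::Pending,
        }
    }
}

/// waker that does nothing, enough for a busy-polling driver
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// the external "driver": nothing happens unless something calls poll
fn drive<F: Future + Unpin>(mut fut: F) -> (F::Output, u32) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = Pin::new(&mut fut).poll(&mut cx) {
            return (out, polls);
        }
    }
}

fn main() {
    let (out, polls) = drive(Shout { inner: Countdown { remaining: 2 } });
    assert_eq!(out, "DONE");
    assert_eq!(polls, 3);
}
```

the outer future only makes progress because the same task context is handed down through each nested poll, which is the piece that is not directly reachable when asking through riker.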

riker on the other hand is all about independent actors sending and receiving messages asynchronously. the need for futures to make this work is equal parts implementation detail and "adapter" for the broader rust ecosystem. i get the feeling that if there was a viable non-futures approach, then riker would use that instead.

for example, i couldn't figure out how to usefully nest ask futures across multiple actors like we could nest poll calls in a vanilla future. the underlying futures task context (needed by poll) is hidden somewhere in the riker internals. nested blocking on ask is a compiler error.

at this point i'm willing to chalk the friction up to my own inexperience with futures/riker and the relative immaturity of both libraries. it's certainly not clear that we have a hard requirement for nested actors... after all, i managed to find a way around it for the HashTable use-case.

Why not use...

Structs

We could fix the generic trait issues by:

  • making a wrapper struct with an inner hash table
  • the wrapper struct goes in the state chain with a known type
  • the new method of the wrapper takes a <HT: HashTable> and acts as a buffer

the problem (ignoring likely unknown size issues for the inner table) is that cloning the wrapper struct also clones the inner table. some inner implementations might be a stateless reference (e.g. a URL) and be safe to clone. Many will be stateful (e.g. MemTable) and so can't be cloned safely.

riker actor references are always stateless. the actor + actor system quarantines state for us, regardless of the inner implementation.
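
a sketch of the wrapper struct approach and where it goes wrong, assuming a boxed trait object for the inner table. the trait signature, `box_clone` and `TableWrapper` are hypothetical and only illustrate the cloning problem:

```rust
use std::collections::HashMap;

/// hypothetical, object-safe stand-in for the `HashTable` trait
trait HashTable {
    fn commit(&mut self, key: &str, value: &str);
    fn len(&self) -> usize;
    /// lets a boxed table be cloned without knowing its concrete type
    fn box_clone(&self) -> Box<dyn HashTable>;
}

#[derive(Clone, Default)]
struct MemTable {
    entries: HashMap<String, String>,
}

impl HashTable for MemTable {
    fn commit(&mut self, key: &str, value: &str) {
        self.entries.insert(key.to_string(), value.to_string());
    }
    fn len(&self) -> usize {
        self.entries.len()
    }
    fn box_clone(&self) -> Box<dyn HashTable> {
        Box::new(self.clone())
    }
}

/// known, non-generic type that could sit in the state tree
struct TableWrapper {
    inner: Box<dyn HashTable>,
}

impl TableWrapper {
    fn new<HT: HashTable + 'static>(inner: HT) -> Self {
        TableWrapper { inner: Box::new(inner) }
    }
}

impl Clone for TableWrapper {
    fn clone(&self) -> Self {
        // cloning the wrapper copies the whole inner table
        TableWrapper { inner: self.inner.box_clone() }
    }
}

fn main() {
    let mut original = TableWrapper::new(MemTable::default());
    let snapshot = original.clone();
    original.inner.commit("a", "1");
    // the two copies have silently diverged
    assert_eq!(original.inner.len(), 1);
    assert_eq!(snapshot.inner.len(), 0);
}
```

the generic parameter is gone, but every clone of the state tree now forks a stateful table.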

Actix

Actix looks great:

  • actor system
  • stable rust (i think)
  • many github stars (~1500)
  • nested actors seem to work well, with clearer support for call/response actor comms

but has these limitations that looked like dealbreakers when i reviewed with the team:

  • roadmap is not covering what we would want in a generalised actor framework (e.g. persistence, event logs, actors over network, pluggable backends, etc.)
  • seems more monolithic, wasn't as clear how to plug it into our existing systems
  • the API broadly doesn't match our mental model of what we want to achieve here
    • for example, it doesn't seem to have actor references to pass around and plug into our state tree
  • docs are surprisingly unmaintained, riker has far more info, most of the actix docs are "TODO"

changes

💥 💥 💥
(screenshot, 2018-08-19)
💥 💥 💥

  • adds riker
  • adds futures 0.2.2
  • adds riker config symlinked into place as toml
  • defines a protocol for HashTable actors
  • creates an actor system for HashTable through lazy static
  • more explicit naming, e.g. get vs. get_entry and get_pair
  • extends trait bounds on HashTable to be actor ref friendly
  • implements HashTable for actor ref
  • implements an actor for hash tables
  • refactor Chain to use the HashTable actor ref instead of the inner implementation
  • implement MemTable for agent state
  • test that commit/get actions and zome functions can round trip data

followups

@thedavidmeister
Contributor Author

this actually does a round trip!

time to start polishing..

@thedavidmeister thedavidmeister changed the title from "135 riker one sys" to "WIP: riker for hash table" Aug 19, 2018
@thedavidmeister thedavidmeister mentioned this pull request Aug 20, 2018
        self.top_pair.clone()
    /// getter for the chain
    pub fn chain(&self) -> Chain {
        self.chain.clone()
Contributor

Maybe don't do this. Cloning like this can get out of hand when overused. Better to return an immutable reference to self.chain.

Contributor Author

@sphinxc0re ok i'll change it, i have some questions about this and mutability but they can wait until later
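
for reference, the borrowing getter suggested in this thread might look roughly like the sketch below. the `Chain` and `AgentState` definitions here are simplified stand-ins, not the real structs:

```rust
/// simplified stand-in for the real `Chain`
#[derive(Clone, Debug, PartialEq)]
struct Chain {
    top_pair: Option<String>,
}

/// simplified stand-in for the real agent state
struct AgentState {
    chain: Chain,
}

impl AgentState {
    /// getter for the chain: borrows instead of cloning
    fn chain(&self) -> &Chain {
        &self.chain
    }
}

fn main() {
    let state = AgentState { chain: Chain { top_pair: Some("pair".to_string()) } };

    // no copy is made, the getter hands back the field itself
    assert!(std::ptr::eq(state.chain(), &state.chain));

    // callers that need ownership opt in to the clone explicitly
    let owned: Chain = state.chain().clone();
    assert_eq!(owned, state.chain);
}
```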

@thedavidmeister thedavidmeister changed the title from "WIP: riker for hash table" to "riker for hash table" Aug 30, 2018
Member

@zippy zippy left a comment

Fabulous. Big step forward.

impl AskSelf for ActorRef<Protocol> {
    fn block_on_ask(&self, message: Protocol) -> Protocol {
        let a = ask(&(*SYS), self, message);
        block_on(a).unwrap()
Member

can we make this a Result?

/// every action and the result of that action
// @TODO this will blow up memory, implement as some kind of dropping/FIFO with a limit?
// @see https://github.com/holochain/holochain-rust/issues/166
actions: HashMap<ActionWrapper, ActionResponse>,
chain: Chain,
Member

I like it much more that the state has a chain rather than a top_pair

}

#[test]
/// show two things here:
Member

nice test!!

)

(func
(export "commit_dispatch")
Member

I think this will conflict with #268 so someone will have to do the merge...

@thedavidmeister thedavidmeister merged commit d5e0c68 into develop Aug 30, 2018
0. the actor ref calls its own `ask` method, which builds a future using riker's `ask`
0. the actor ref blocks on its internal future
0. the referenced actor receives the `Commit` message and matches/destructures this into the entry
0. the entry is passed to the `commit()` method of the inner table
Collaborator

I am a bit concerned about all those commit()s here, which should actually be put()s.
I've created ticket #274 as a follow-up task to make these function names match the mental model we have...
