Messaging Megaissue #25829

Open
20 of 42 tasks
mariusandra opened this issue Oct 25, 2024 · 0 comments

mariusandra commented Oct 25, 2024

Broadcasts

RFC: https://github.com/PostHog/product-internal/pull/663

MVP: #25719 split into 5 parts:

  • feat(messaging): different types of hog functions #25788
    • Migration to add type to the HogFunction model. Type will be set to destination for all existing functions.
    • The plugin server only runs destination functions on events, but also loads email functions (shared provider code) into memory.
    • The backend returns hog functions based on type, and the frontend only asks for destination
    • Tests for all of the above
  • feat(hog): importing modules in hog #25796
    • Add a system to synchronously import bytecode/files/code/functions in Hog
    • Pass loaded code around in the vmState
    • Tests
    • Backport to Python HogVM
    • Release new version
  • feat(messaging): ui #25811
    • Create a "messaging" menu item with "broadcasts" and "providers"
    • Refactor "destinationsLogic" and "DestinationsTable" into something more abstract and use it for all 3 (made a new component)
    • Separate view/edit pages for the new function types
    • Show matching persons
    • Custom testing data
  • feat(messaging): test sending e-mails #25825
    • Hardcode just one import source, provider/email, which exports sendEmail() (runs the team's email provider's bytecode)
    • Can invoke/test both providers and broadcasts
    • Tests
  • feat(messaging): actually send the message #25830
    • Make a query to fetch all persons
    • Execute the broadcast for each person (simplest possible solution)
    • Tests
    • Make it not suck
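
The MVP steps above boil down to two ideas: functions are selected by their type, and a broadcast is executed once per matching person. The following is a minimal Python sketch of that flow, not the actual PostHog implementation. `HogFunction`, `Person`, `functions_of_type`, `run_broadcast` and the `"broadcast"` type name are hypothetical stand-ins, and `send_email` stands in for whatever the team's email provider exports.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional


@dataclass
class HogFunction:
    # Hypothetical, trimmed-down stand-in for the HogFunction model.
    id: str
    type: str  # e.g. "destination" or "email"; "broadcast" is assumed here


@dataclass
class Person:
    # Hypothetical person row; only the fields this sketch needs.
    id: str
    email: Optional[str]


def functions_of_type(functions: Iterable[HogFunction], type_: str) -> List[HogFunction]:
    """Return only the functions of one type, e.g. the ones run on events."""
    return [f for f in functions if f.type == type_]


def run_broadcast(persons: Iterable[Person], send_email: Callable[[str], None]) -> Dict[str, int]:
    """Simplest possible broadcast: call the provider once per person."""
    stats = {"sent": 0, "failed": 0, "skipped": 0}
    for person in persons:
        if not person.email:
            # Nothing to send to; count it so the totals still add up.
            stats["skipped"] += 1
            continue
        try:
            send_email(person.email)
            stats["sent"] += 1
        except Exception:
            # One failing recipient should not abort the whole broadcast.
            stats["failed"] += 1
    return stats
```

Returning counters instead of raising on the first failure keeps the loop simple while still giving the caller something to log; retries and cancellation are left out on purpose.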

Future work to do and things to think about:

  • Add a lot more email providers
  • Error saving cohort filter on broadcast (it only gets used in a query, so the error is wrong)
  • Logs and metrics pages for both broadcasts and providers
    • How to store (tag) logs? Do we emit the same log twice, once for the provider and once for the broadcast?
  • Deduplicate e-mails in a broadcast job (currently our team gets 20k matches)
    • What's the canonical person for an email? How do we select the right one?
    • What's the distinct_id of the user we'll use for any emitted "email sent" posthog events?
  • Capture PostHog events when sending mails ('email sent', 'email sending failed', etc)
  • Where and how do we broadcast?
    • Should we broadcast directly and synchronously over HTTP like now? (Django waits for the CDP API, which runs the async work in a blocking way)
    • Should we use the Python HogVM instead? (avoids failing when the plugin server is down)
    • Should we push all the messages to cdp_function_callbacks via Kafka?
    • Insert directly to cyclotron via Postgres?
  • Cancellation/retry support for broadcasts (e.g. an error halfway through queuing 100k emails that we then want to retry)
  • Metrics about sent/delivered/bounced (and capture these events)
    • Add "http"/"webhook" function type, make a URL that returns 200 and routes the incoming data to the function (think segment's source functions)
  • Add explicit export fun sendEmail syntax instead of the hacky return { 'sendEmail': sendEmail }
  • Layout templates, cloning broadcasts, defaults
  • Support more than one email provider, or block adding a second one
  • Release as early access
  • Release as public beta
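
To make the deduplication question above concrete, here is a small Python sketch of one possible rule. It assumes the canonical person for an email is the earliest-created one, which is only an illustrative choice since the issue leaves that decision open. `Person`, `created_at` and `dedupe_by_email` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, List


@dataclass
class Person:
    # Hypothetical person row; created_at is assumed to be available.
    id: str
    email: str
    created_at: datetime


def dedupe_by_email(persons: Iterable[Person]) -> List[Person]:
    """Keep one person per normalised email address.

    Assumption: the earliest-created person wins. Any other rule
    (most recently seen, most events, ...) would slot in the same way.
    """
    canonical: Dict[str, Person] = {}
    for person in persons:
        key = person.email.strip().lower()
        if not key:
            continue  # no usable address, nothing to send
        current = canonical.get(key)
        if current is None or person.created_at < current.created_at:
            canonical[key] = person
    return list(canonical.values())
```

Whichever person is picked here would also be the natural source of the distinct_id for any "email sent" event, but that mapping is not modelled in this sketch.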

Workflows

The work on Hog above gives us an interesting way to build workflows (aka the second part of messaging, which is currently out of scope):

  • Each node in a workflow can be its own Hog function that lives in its own path like workflow/{id}/{node_id}
  • All of these can be imported with import('workflow/{id}/node/{id}')
  • Each imported module runs with its own globals, so each Hog function gets its own inputs, including encrypted ones
  • The entire workflow compiles into one large hog function (looped switch statement, similar to this) that imports and calls the different nodes as needed. Something like this:
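// workflow.hog
// Sketch only: `id` stands for the workflow's id.
// `node` holds the name of the next node to run; null ends the workflow.
let node := 'n1'
let retries := 0
// One handler per node. Each handler runs its node and picks the next one.
let logic := {
  'n1': () -> {
    print('Entering node 1')
    import(f'workflow/{id}/n1').run()
    node := 'n2'
  },
  'n2': () -> {
    try {
      print('Entering node 2')
      import(f'workflow/{id}/n2').run()
      node := 'n3'
    } catch () {
      // On failure `node` stays 'n2', so the loop below runs it again.
      print('Error')
      retries := retries + 1
      if (retries > 3) throw Error('Enough')
      print('Retrying')
      sleep(1000)
    }
  },
  'n3': () -> {
    print('Entering node 3')
    import(f'workflow/{id}/n3').run()
    node := null
  },
}
// Looped dispatch: keep calling the current node's handler until none is left.
while (node) {
  logic[node]()
}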
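
The control flow of the Hog sketch above can also be modelled in plain Python, which makes the loop-and-dispatch shape easy to experiment with. This is only an illustration under a few assumptions: `run_workflow`, `nodes`, `transitions` and `max_retries` are hypothetical names, every node (not just `n2`) is retried on failure, and the delay between retries is left out.

```python
from typing import Callable, Dict, List, Optional


def run_workflow(
    nodes: Dict[str, Callable[[], None]],
    transitions: Dict[str, Optional[str]],
    start: str,
    max_retries: int = 3,
) -> List[str]:
    """Run nodes one at a time until a node transitions to None.

    nodes maps a node name to the callable that executes it;
    transitions maps a node name to the next node name (or None to stop).
    Returns the list of node names in the order they were attempted.
    """
    node: Optional[str] = start
    retries = 0  # shared across the whole run, like in the Hog sketch
    trace: List[str] = []
    while node is not None:
        trace.append(node)
        try:
            nodes[node]()
            node = transitions[node]
        except Exception:
            retries += 1
            if retries > max_retries:
                raise RuntimeError("Enough")
            # node is unchanged, so the next iteration retries it
    return trace
```

Keeping the transitions in a table rather than inside each handler is a design choice of this sketch; the Hog version sets the next node inside each handler, which allows branching on the node's result.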