dhedlund edited this page May 18, 2011 · 12 revisions

Overview

(assumes an authenticated HTTPS interface)

The messaging hub provides message delivery services for notifiers that may be behind transient or unreliable data connections. Notifiers can connect to the messaging hub and drop off messages that need to be delivered either immediately or in the future. Notifiers can also query the messaging hub for the delivery status of an individual message or a collection of messages.

Supported Delivery Endpoints

  • SMS (what provider?)
  • IVR (INTELLIVR)

External Interface

Authentication

It is assumed that notifiers will authenticate with the messaging hub using HTTP basic auth over HTTPS. If authentication fails, the server will return a 401 Unauthorized response.

Web Resources

Message Upload

To upload messages to the messaging hub, the notifier should send a PUT request to /messages with a Content-Type header of application/json. The payload should be a JSON array of message record objects, each conforming to the message record format below. The server responds with either 200 OK or 400 Bad Request.
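As a minimal sketch of the upload described above, the following builds (but does not send) the PUT request using Python's standard library. The host name, credentials, and all record values are hypothetical, and the date format shown is an assumption, since this page does not specify one.

```python
import base64
import json
import urllib.request


def build_upload_request(base_url, username, password, messages):
    """Build (but do not send) a PUT /messages request with a JSON array body."""
    body = json.dumps(messages).encode("utf-8")
    # HTTP basic auth: base64 of "username:password".
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/messages",
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
        },
    )


# Hypothetical payload: one message record in a JSON array.
request = build_upload_request(
    "https://hub.example.org",
    "notifier1",
    "secret",
    [{
        "id": "msg-001",
        "action": "MESSAGE_NEW",
        "first_name": "Ada",
        "phone_number": "5550100",
        "template_id": "welcome-1",
        "delivery_method": "SMS",
        "delivery_date": "2011-05-20",  # assumed format
        "preferred_time": "9-18",
    }],
)
```

Passing the request to urllib.request.urlopen would send it; a 400 response would surface as an HTTPError.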

Status Updates

To request status updates from the messaging hub, the notifier should send a GET request to /message_updates/YYYYMMDD-YYYYMMDD or /message_updates/YYYYMMDDHHMMSS-YYYYMMDDHHMMSS with an Accept header of application/json. The response body is a JSON array of status update objects, each conforming to the status update format below. The server returns 200 OK if the request was successful and 400 Bad Request if the date range was invalid. The date range is interpreted in the notifier's timezone. Ranges are "from-to" (not through), so the end of the range is excluded. If using the YYYYMMDD format, today's date is considered invalid and returns a 400 error because the result set is still incomplete.
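The range rules above could be checked on the notifier side before a request is made. This is only a sketch: the function name is hypothetical, and it assumes that a reversed or empty range counts as invalid and that a date-only range is rejected whenever it reaches today's date.

```python
from datetime import date, datetime


def message_updates_path(start, end, today):
    """Build the /message_updates path for a from-to range, or raise ValueError."""
    start_is_dt = isinstance(start, datetime)
    end_is_dt = isinstance(end, datetime)
    if start_is_dt != end_is_dt:
        raise ValueError("start and end must both be dates or both be datetimes")
    if start >= end:
        # Assumption: an empty or reversed range is treated as invalid.
        raise ValueError("start must be before end")
    if start_is_dt:
        fmt = "%Y%m%d%H%M%S"
    else:
        if end >= today:
            # Date-only ranges that touch today would be incomplete.
            raise ValueError("date-only ranges may not include today's date")
        fmt = "%Y%m%d"
    return f"/message_updates/{start.strftime(fmt)}-{end.strftime(fmt)}"


path = message_updates_path(date(2011, 5, 1), date(2011, 5, 8), today=date(2011, 5, 18))
```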

Message Record Format

  • id: unique identifier used to track/update a message
  • action: MESSAGE_NEW, MESSAGE_UPDATE or MESSAGE_CANCEL
  • first_name: first name of recipient
  • phone_number: phone number to deliver message to
  • template_id: template id of message to deliver
  • delivery_method: SMS, IVR
  • delivery_date: preferred delivery date (notifier's timezone)
  • delivery_expires: if specified, don't attempt delivery on or after this date (notifier's timezone)
  • preferred_time: hour or hour range in 24-hour format (e.g. 10, 9-18), relative to notifier's timezone
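The preferred_time field is the only one with its own small syntax, so a parsing sketch may help. The function name is hypothetical, and the sketch assumes valid hours are 0 to 23 and that a range whose start is after its end is invalid; neither rule is stated on this page.

```python
def parse_preferred_time(value):
    """Parse '10' or '9-18' into (start_hour, end_hour); None if blank or invalid."""
    if value is None or not str(value).strip():
        return None
    parts = str(value).strip().split("-")
    if len(parts) > 2:
        return None
    try:
        hours = [int(part) for part in parts]
    except ValueError:
        return None
    if any(hour < 0 or hour > 23 for hour in hours):
        return None
    start, end = hours[0], hours[-1]
    if start > end:
        # Assumption: overnight ranges such as 22-6 are not supported.
        return None
    return (start, end)
```

A single hour such as "10" is returned as the range (10, 10).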

Status Update Format

  • id: unique identifier used to track/update a message
  • status: SUCCESS, TEMP_FAIL or PERM_FAIL
  • error: MESSAGE_EXPIRED, TEMP_DELIVERY_FAIL, etc.
  • message: more verbose error message

Queuing and Status Updates (Bulk)

Initial Drop-off

When a notifier connects to the messaging hub, it can upload a list of one or more notifications for delivery. Each notification must contain a unique identifier that will be used for tracking delivery status and for preventing accidental duplication of message delivery (idempotence). The notifier may also use this identifier for canceling or updating a specific message. Once the data payload has been uploaded, the message hub will perform a quick data validation check and either accept or reject the entire payload. If rejected, it will almost always be because the notifier provided data in a format the messaging hub could not understand.

Data Validation and Enqueuing

Unique identifiers provided in data payloads are only considered to be unique across messages for a single notifier. Each identifier is combined with the notifier's account identifier to create a universally unique identifier (UUID) across the messaging hub. Once a payload has been accepted, the messaging hub goes through each message in the payload, generates a UUID, performs some basic validations and then adds each one to a database and the future message queue using the current date as the trigger time. Messages are only added to the future queue if all validations pass.
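One possible way to combine the notifier's account identifier with a message identifier is a name-based UUID. This is a sketch under assumptions: the page does not say how the two are combined, and the namespace and function name below are hypothetical.

```python
import uuid

# Hypothetical namespace for this sketch; any fixed UUID would serve.
HUB_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "messaging-hub.example.org")


def hub_uuid(notifier_id, message_id):
    """Derive a hub-wide identifier from an integer notifier id and a message id."""
    # The ":" separator keeps (1, "23") distinct from (12, "3").
    return str(uuid.uuid5(HUB_NAMESPACE, f"{notifier_id}:{message_id}"))
```

Because the result is deterministic, uploading the same message twice yields the same identifier, which is what makes duplicate detection possible.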

Basic Validations (applies to all messages)

  1. The action must be supported (MESSAGE_NEW, MESSAGE_UPDATE or MESSAGE_CANCEL). Rejected with INVALID_ACTION error if not supported.
  2. A first name must be provided. Rejected with MISSING_FIRST_NAME if not present.
  3. A phone number must be provided. Rejected with MISSING_PHONE_NUMBER if not present.
  4. The template id must exist in the database. Rejected with INVALID_TEMPLATE if it doesn't exist.
  5. The delivery date must be a valid date. Rejected with INVALID_DELIVERY_DATE if not valid.
  6. The delivery expires date must be a valid date or blank. Rejected with INVALID_DELIVERY_EXPIRES if not valid. A blank delivery expires date is assumed to be 7 days after the delivery date.
  7. The preferred time must be a valid hour, hour range or blank. If invalid, the preferred time will be changed to blank which allows delivery to occur at any reasonable time.
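The validations above can be sketched as a single function that returns the first matching error code. The function names are hypothetical, the set of known template ids stands in for the database lookup, and ISO dates (YYYY-MM-DD) are assumed because the page does not fix a date format.

```python
import re
from datetime import date, timedelta

SUPPORTED_ACTIONS = {"MESSAGE_NEW", "MESSAGE_UPDATE", "MESSAGE_CANCEL"}
_PREFERRED = re.compile(r"(\d{1,2})(?:-(\d{1,2}))?")


def _parse_date(value):
    try:
        return date.fromisoformat(value)  # assumed wire format
    except (TypeError, ValueError):
        return None


def _valid_preferred_time(text):
    match = _PREFERRED.fullmatch(text)
    if not match:
        return False
    return all(int(group) <= 23 for group in match.groups() if group is not None)


def validate(record, known_template_ids):
    """Return (error_code, normalized_record); error_code is None on success."""
    if record.get("action") not in SUPPORTED_ACTIONS:
        return "INVALID_ACTION", None
    if not record.get("first_name"):
        return "MISSING_FIRST_NAME", None
    if not record.get("phone_number"):
        return "MISSING_PHONE_NUMBER", None
    if record.get("template_id") not in known_template_ids:
        return "INVALID_TEMPLATE", None
    delivery_date = _parse_date(record.get("delivery_date"))
    if delivery_date is None:
        return "INVALID_DELIVERY_DATE", None
    raw_expires = record.get("delivery_expires")
    if raw_expires in (None, ""):
        # Blank expiry defaults to 7 days after the delivery date.
        expires = delivery_date + timedelta(days=7)
    else:
        expires = _parse_date(raw_expires)
        if expires is None:
            return "INVALID_DELIVERY_EXPIRES", None
    raw_time = record.get("preferred_time")
    preferred = "" if raw_time is None else str(raw_time).strip()
    if preferred and not _valid_preferred_time(preferred):
        preferred = ""  # invalid preferred time is blanked, not rejected
    normalized = dict(record, delivery_date=delivery_date,
                      delivery_expires=expires, preferred_time=preferred)
    return None, normalized
```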

Message Processing and Delivery

The two queues responsible for managing message delivery on the message hub are the Future Events Queue and the Active Events Queue. The future events queue keeps track of events that need to trigger in the future. Workers watching the queue are responsible for moving message events onto the active events queue once a trigger time has been reached. The active events queue contains all events that should be processed as soon as possible. Each queue has support for multiple types of events. Once an event makes it into the active queue, it is processed by a worker based on its event type:
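The relationship between the two queues can be illustrated with an in-memory sketch. This is not how DelayedJob stores jobs; the class and method names are hypothetical and only show the promotion step that workers perform.

```python
import heapq
from collections import deque
from datetime import datetime, timedelta
from itertools import count


class EventQueues:
    """In-memory stand-in for the future and active events queues."""

    def __init__(self):
        self._future = []      # heap of (trigger_time, sequence, event)
        self._sequence = count()  # tie-breaker so events are never compared
        self.active = deque()  # events ready to be processed

    def schedule(self, trigger_time, event):
        heapq.heappush(self._future, (trigger_time, next(self._sequence), event))

    def promote_due(self, now):
        """Move every event whose trigger time has been reached to the active queue."""
        moved = 0
        while self._future and self._future[0][0] <= now:
            _, _, event = heapq.heappop(self._future)
            self.active.append(event)
            moved += 1
        return moved


queues = EventQueues()
noon = datetime(2011, 5, 18, 12, 0)
queues.schedule(noon + timedelta(hours=1), "later")
queues.schedule(noon - timedelta(minutes=1), "due")
promoted_at_noon = queues.promote_due(noon)
```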

message_new Event

  1. If the record already exists in the database with the respective UUID, the message record is rejected with an ALREADY_EXISTS error.
  2. A new message record is created in the database with the message's UUID.
  3. A delivery event for the message is then added to the future events message queue using a combination of the delivery date and preferred time as its trigger time. The preferred time will only be used as a guide; the closer to the expires date, the further the trigger time will deviate from the preferred time (i.e. get more aggressive the closer we are to expiring).
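The steps above can be sketched with a dict standing in for the database and a list for the future events queue. All names are hypothetical, times are naive (timezone conversion is left out), and the default hour used when no preferred time is given is an assumption.

```python
from datetime import date, datetime, time

DEFAULT_HOUR = 9  # hypothetical "reasonable time" when preferred time is blank


def first_trigger_time(delivery_date, preferred_hours):
    """Combine the delivery date with the start of the preferred hour range."""
    start_hour = preferred_hours[0] if preferred_hours else DEFAULT_HOUR
    return datetime.combine(delivery_date, time(hour=start_hour))


def handle_message_new(db, future_queue, uuid, record):
    """Return an error code, or None once the delivery event is enqueued."""
    if uuid in db:
        return "ALREADY_EXISTS"
    db[uuid] = dict(record, status="NEW")
    trigger = first_trigger_time(record["delivery_date"], record.get("preferred_hours"))
    future_queue.append((trigger, "message_deliver", uuid))
    return None
```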

message_update Event

  1. If the record doesn't exist in the database, it will be treated as if it was a new action.
  2. If message data is the same as in the database, nothing further will occur.
  3. If message data is different but has already been delivered successfully, reject with ALREADY_DELIVERED error.
  4. Otherwise, an attempt will be made to remove the message's delivery event from the future events message queue.
  5. An attempt will be made to remove the message's delivery event from the active events message queue if it exists and can be removed safely.
  6. If successfully removed from the queue, the message record is updated and a new delivery event is enqueued in the future events message queue.
  7. If the message could not be removed from one of the queues because its delivery was active/imminent, a retry counter is incremented and the update event is re-added to the future events queue. The update fails permanently if the retry counter exceeds a configurable limit.
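A rough sketch of the update flow follows, with lists standing in for the two queues and a set of in-flight UUIDs marking deliveries that cannot be removed safely. Every name, the retry limit, and the outcome strings other than ALREADY_DELIVERED are hypothetical.

```python
MAX_RETRIES = 3  # hypothetical value for the configurable limit


def _remove_delivery(queue, uuid):
    """Drop any pending delivery event for this message from a queue."""
    queue[:] = [e for e in queue
                if not (e["type"] == "message_deliver" and e["uuid"] == uuid)]


def handle_message_update(db, future, active, in_flight, event):
    uuid, data = event["uuid"], event["data"]
    record = db.get(uuid)
    if record is None:
        # Step 1: unknown record, treat like a new message.
        db[uuid] = {"data": data, "status": "NEW"}
        future.append({"type": "message_deliver", "uuid": uuid})
        return "created"
    if record["data"] == data:
        return "unchanged"
    if record["status"] == "SUCCESS":
        return "ALREADY_DELIVERED"
    if uuid in in_flight:
        # Step 7: delivery is active/imminent, so retry the update later.
        retries = event.get("retries", 0) + 1
        if retries > MAX_RETRIES:
            return "failed_permanently"
        future.append(dict(event, retries=retries))
        return "retry_scheduled"
    # Steps 4-6: remove the old delivery event, update, and re-enqueue.
    _remove_delivery(future, uuid)
    _remove_delivery(active, uuid)
    record["data"] = data
    future.append({"type": "message_deliver", "uuid": uuid})
    return "updated"
```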

message_cancel Event

  1. If the record doesn't exist in the database, reject with MESSAGE_NOT_FOUND error.
  2. If message has already been canceled, nothing further will occur.
  3. If message has already been delivered successfully, reject with ALREADY_DELIVERED error.
  4. Otherwise, an attempt will be made to remove the message's delivery event from the future events message queue.
  5. An attempt will be made to remove the message's delivery event from the active events message queue if it exists and can be removed safely.
  6. If successfully removed from the queue, the message record is updated with a status of canceled.
  7. If the message could not be removed from one of the queues because its delivery was active/imminent, a retry counter is incremented and the cancel event is re-added to the future events queue. The cancel fails permanently if the retry counter exceeds a configurable limit.
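The cancel flow can be sketched in the same spirit, using lists for the queues and a set of in-flight UUIDs for deliveries that cannot be removed safely. The names, the retry limit, and the CANCELED status string are assumptions; the messages schema on this page does not list a canceled status value.

```python
MAX_CANCEL_RETRIES = 3  # hypothetical value for the configurable limit


def _drop_delivery(queue, uuid):
    """Drop any pending delivery event for this message from a queue."""
    queue[:] = [e for e in queue
                if not (e["type"] == "message_deliver" and e["uuid"] == uuid)]


def handle_message_cancel(db, future, active, in_flight, event):
    uuid = event["uuid"]
    record = db.get(uuid)
    if record is None:
        return "MESSAGE_NOT_FOUND"
    if record["status"] == "CANCELED":
        return "unchanged"
    if record["status"] == "SUCCESS":
        return "ALREADY_DELIVERED"
    if uuid in in_flight:
        # Delivery is active/imminent: retry the cancel later.
        retries = event.get("retries", 0) + 1
        if retries > MAX_CANCEL_RETRIES:
            return "failed_permanently"
        future.append(dict(event, retries=retries))
        return "retry_scheduled"
    _drop_delivery(future, uuid)
    _drop_delivery(active, uuid)
    record["status"] = "CANCELED"
    return "canceled"
```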

message_deliver Event

  1. Checks if the message has passed its delivery expiration date. Reject with MESSAGE_EXPIRED if expiration date has passed.
  2. Checks if the delivery method is supported for the requested template. If not supported, checks if any alternative delivery method is supported for the same template. Reject with INVALID_TEMPLATE if the template no longer exists.
  3. Process/evaluate template and attempt to deliver using selected delivery method. If temporary failure, reject with TEMP_DELIVERY_FAIL and store any relevant error messages returned from delivery endpoint. If permanent failure, reject with PERM_DELIVERY_FAIL and store any relevant error messages.
  4. Calculate next delivery attempt based on current date, delivery expires date and preferred time, decreasing the delay between attempts as the expiration date gets closer.
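The page does not give a formula for step 4, so the following is only one possible back-off policy: wait a fixed fraction of the time remaining until expiry, clamped between a minimum and maximum delay. The constants and function name are hypothetical, and the preferred time is ignored here for brevity.

```python
from datetime import datetime, timedelta

MIN_DELAY = timedelta(minutes=15)  # hypothetical lower bound between attempts
MAX_DELAY = timedelta(hours=24)    # hypothetical upper bound between attempts


def next_attempt(now, expires):
    """Return the next attempt time, or None if no attempt fits before expiry."""
    remaining = expires - now
    if remaining <= timedelta(0):
        return None
    # A quarter of the remaining time: the delay shrinks as expiry approaches.
    delay = min(MAX_DELAY, max(MIN_DELAY, remaining / 4))
    candidate = now + delay
    # Delivery must not be attempted on or after the expiry time.
    return candidate if candidate < expires else None
```

With a week remaining the delay is capped at 24 hours; with four hours remaining it drops to one hour.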

Error and Delivery Status Updates

Each time a message is rejected or delivered successfully, its status and last run date are updated in the database and a logger is called. The logger appends to a long-term log file and updates aggregate statistics tables. Aggregate statistics tables can be used for analytics or for identifying delivery outages.

To request error/success statistics from the messaging hub, the notifier connects to the hub and provides a starting time. A list of all status updates associated with messages starting from that time is returned to the notifier. Any status updates recorded within the last few seconds are skipped; this should help ensure that subsequent requests from the notifier neither overlap with nor skip message updates from a previous request. When making requests, the notifier should take the time of the last message update, add one second, and use that as the starting time.
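The polling rule above can be sketched from both sides. The updated_at field and the five-second settle window are assumptions: the status update format on this page does not include a timestamp field, and "the last few seconds" is not quantified.

```python
from datetime import datetime, timedelta

SETTLE_WINDOW = timedelta(seconds=5)  # hypothetical "last few seconds"


def visible_updates(updates, start, now):
    """Hub side: updates from `start` onward, skipping very recent ones."""
    cutoff = now - SETTLE_WINDOW
    return [u for u in updates if start <= u["updated_at"] <= cutoff]


def next_start(returned_updates, previous_start):
    """Notifier side: time of the last returned update plus one second."""
    if not returned_updates:
        return previous_start
    latest = max(u["updated_at"] for u in returned_updates)
    return latest + timedelta(seconds=1)
```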

Database/Backend

Message Templates

The messaging hub is responsible for keeping a database of notification messages that can be delivered via each endpoint. Notifiers do not provide message content directly, but rather select a message template from the notification database for delivery. A message program is a collection of one or more pre-defined messages along with a recommended delivery schedule and additional metadata. Notifiers are currently expected to have a local copy of any message program they want to deliver messages for, including relevant template ids.

Message Data Store

A copy of all messages will be stored in a relational database table to allow indexed querying of messages and their current status. Messages will be automatically purged from the database one week after their last run date or expiration date, whichever comes later. Message deletion will be delayed for any messages that have not been delivered to the notifier.
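The purge rule can be written as a small predicate. This sketch reads "not been delivered to the notifier" as meaning the message's status has not yet been reported back to the notifier, which is an interpretation; the function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(weeks=1)


def purge_time(last_run_at, delivery_expires):
    """One week after the later of the last run date and the expiration date."""
    latest = delivery_expires if last_run_at is None else max(last_run_at, delivery_expires)
    return latest + RETENTION


def is_purgeable(now, last_run_at, delivery_expires, reported_to_notifier):
    """A message is purged only after retention has passed and it has been reported."""
    return reported_to_notifier and now >= purge_time(last_run_at, delivery_expires)
```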

messages Schema (note: it might be better to split status out into its own table or keep non-status info only in the message queue)

  • uuid: string, unique
  • first_name: string
  • phone_number: string
  • template_id: string
  • delivery_method: string (valid: SMS, IVR)
  • delivery_start: datetime (UTC, converted from calculation based on delivery_date + preferred_time)
  • delivery_expires: datetime (UTC, converted from calculation based on delivery_expires + preferred_time)
  • delivery_window: number of hours in preferred delivery range (from delivery_date to delivery_date + delivery_window)
  • notifier_id: integer
  • status: string (valid: SUCCESS, TEMP_FAIL, PERM_FAIL or NEW)
  • error: string
  • message: text
  • last_run_at: datetime (UTC)

Indexes: [uuid], [last_run_at, notifier_id]

notifiers Schema

  • id: integer, primary key
  • username: string
  • password: string (HTTP basic auth password)
  • timezone: string
  • last_login_at: datetime (UTC)
  • last_status_req_at: datetime (UTC)

Indexes: [id], [username]

Message Queues

DelayedJob (github.com/collectiveidea/delayed_job) will be used for handling the future and active worker queues. Resque (with resque-scheduler) could act as a viable alternative; Resque would cut down coding time by providing queue introspection via resque-web, but comes at the cost of increased memory requirements due to its reliance on the Redis data store. DelayedJob has a pluggable database backend and can use ActiveRecord if integrated into a Rails application.

Timezones

All dates used internally are in UTC. All dates made visible to the notifier are in the notifier's timezone.
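The conversion at the boundary can be sketched as follows. To stay self-contained it uses a fixed UTC offset, which is a simplification: the notifiers schema stores the timezone as a string, so a real implementation would more likely resolve a named zone and handle daylight saving time. The function names are hypothetical.

```python
from datetime import date, datetime, time, timedelta, timezone


def local_to_utc(local_date, hour, utc_offset_hours):
    """Convert a notifier-local date and hour to an aware UTC datetime."""
    zone = timezone(timedelta(hours=utc_offset_hours))
    local = datetime.combine(local_date, time(hour=hour), tzinfo=zone)
    return local.astimezone(timezone.utc)


def utc_to_local(moment_utc, utc_offset_hours):
    """Convert an aware UTC datetime back to the notifier's local time."""
    return moment_utc.astimezone(timezone(timedelta(hours=utc_offset_hours)))


# 09:00 on 2011-05-20 for a notifier at UTC-7, stored internally as UTC.
stored = local_to_utc(date(2011, 5, 20), 9, -7)
```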