[3.5 M1][MULE DEVKIT] Making OAuth2 work OOTB

Mariano Gonzalez edited this page Jun 18, 2013 · 6 revisions

Motivation

Although Devkit does a great job generating OAuth-enabled connectors, using them is still quite painful. The main complexities that need to be tackled are:

  • In multitenant apps, a common use case requires looking up the accessTokenId before actually using a processor.
  • The user currently has to account for the case in which the token is not there yet, which translates into exceptions or filters that the user has to manage.
  • Better integration with CloudHub features such as Customer Management and Insights is needed.

We want to enhance the OAuth2 experience in connectors so that the following are achieved:

  • It just works automatically for single tenant applications
  • It just works automatically for CloudHub multitenant applications
  • It is fault tolerant when the token has not been set yet

The scope of this spec is limited to OAuth2 connectors.

Use cases

Avoiding accessTokenId lookups

So far we've come across two scenarios in which we need to perform a lookup before actually using an accessTokenId.

API's natural access token id doesn't match the iApp's user

Consider the BoxMyPics sample iApp. It's a multitenant app in which I register to synchronize my Facebook pictures with my Box account. Since it is a multitenant iApp, I need some kind of userId that separates my tokens from someone else's. Let's assume that my iApp user id is [email protected]. When I OAuth against Facebook, I get a token whose id is gonzalez.mariano.100 (that's actually my real Facebook id, I'm not making this stuff up). So, before actually hitting any Facebook operation, I need to perform some kind of lookup to translate [email protected] into gonzalez.mariano.100. This is a pain.

The id of two different tokens might collide

Continuing with the BoxMyPics example, let's assume for a minute that my Facebook user id is in fact [email protected]. This is not a real possibility, but bear with me. It turns out that I now also need to OAuth against Box. Here I run a huge risk of finding out that my Box user id is also [email protected]. So each time I OAuth into Box, I'd be overwriting the Facebook token and vice versa. I'm sure you can quickly think of workarounds for this problem, but again, this is a pain.

Being fault tolerant about inexistent keys

If a tenant who hasn't authorized yet tries to perform an OAuth protected operation in a polling flow, right now you get an ObjectStore exception saying that there's no value for the given access token id. Common strategies in this situation are to either add a filter that stops the flow or throw an exception. That can be automated through configuration. As a user, I want better support for this scenario so that it's clearer and easier to handle.

Semantic consistency for the message processor

The authorize message processor is an intercepting one that redirects the browser to initiate the OAuth dance. If the dance is successful, the connector eventually gets a callback that should continue with the rest of the flow. However, since the whole process is asynchronous by its very nature, the MuleMessage that runs the rest of the flow is different from the one that initiated the authorize process. That means that all your message properties and the payload are gone.

This is very confusing for the user, and you end up with a flow whose behavior doesn't match its semantics. It's also a problem when using CloudHub's customer tenancy: since the tenantId variable is lost, values get persisted in the global ObjectStore instead of the tenant bucket.

As users, we want this behavior to be consistent. This bug is documented in DEVKIT-264.

Enhance CloudHub integration

Due to the asynchronous nature of the authorize operation, CloudHub could benefit from having notifications of when the authorization operations are started and finished. This would help improve the Insights and Customer Management functionalities.

Design

Configuration

This is an example of what a flow looks like when you need to look up the access token id:

<enricher target="#[flowVars['tokenId']]">
    <objectstore:retrieve key="#[flowVars['someUserIdentifier']]" />
</enricher>
<google-calendars:get-events accessTokenId="#[flowVars['tokenId']]" />

For CloudHub users

Right now, the accessTokenId that a token is given when persisted is provided by the connector through a method annotated with @OAuthAccessTokenIdentifier. If instead of doing that, we simply default the accessTokenId to the connector's config name, then we wouldn't have an issue anymore since:

  • For single tenant applications, we don't have many tokens to identify in the same place
  • For multitenant applications, each tenant has a separate ObjectStore so there's no conflict in repeating the keys.

With this solution, the above is reduced to:

<google-calendars:get-events />

For On-Premise users

For on-premise users the same solution holds for single-tenant applications. However, if an on-premise user wants to implement multi tenancy on their own, defaulting the accessTokenId doesn't quite solve the problem because they can't benefit from CloudHub's multi tenancy support. That means no ObjectStore auto-partitioning, which would lead to token overwriting if we just blindly default to the config's name. What we can do for these users is remove the need for an accessTokenId lookup by allowing the user to force the accessTokenId when authorizing, like this:

<google-calendars:authorize accessTokenId="#[flowVars['myTenantId']]" />

Then, I could just refer to that token directly without the need of a lookup:

<google-calendars:get-events accessTokenId="#[flowVars['myTenantId']]" />
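The key resolution described in this section can be sketched in a few lines of Java. This is only an illustration under stated assumptions: TokenKeyResolver and PartitionedTokenStore are hypothetical names that do not exist in Devkit or Mule, and a map of maps stands in for CloudHub's per-tenant ObjectStore partitions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these types exist in Devkit or Mule.
class TokenKeyResolver {

    /** Returns the explicit accessTokenId when one is given, else the connector config name. */
    static String resolveKey(String explicitAccessTokenId, String configName) {
        if (explicitAccessTokenId != null && !explicitAccessTokenId.isEmpty()) {
            return explicitAccessTokenId;
        }
        return configName;
    }
}

// Stand-in for a store that is partitioned per tenant (the CloudHub case).
class PartitionedTokenStore {
    private final Map<String, Map<String, String>> partitions = new HashMap<>();

    void store(String tenantId, String key, String token) {
        partitions.computeIfAbsent(tenantId, t -> new HashMap<>()).put(key, token);
    }

    /** Returns the token stored under the key in the tenant's partition, or null if absent. */
    String retrieve(String tenantId, String key) {
        Map<String, String> partition = partitions.get(tenantId);
        return partition == null ? null : partition.get(key);
    }
}
```

With partitioning, two tenants can both use the default key (the config name) without overwriting each other. Without it, the explicit accessTokenId is what keeps the entries apart.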

Being fault tolerant about inexistent keys

What happens right now if a tenant who hasn't authorized yet tries to perform an OAuth protected operation? You get an ObjectStore exception saying that the given accessTokenId doesn't exist. Common strategies in this situation are to either add a filter that stops the flow or throw an exception. That can be automated through configuration:

<google-calendars:config-with-oauth name="google-calendars" consumerKey="${google.apiclient}" consumerSecret="${google.apisecret}" onNoToken="[STOP_FLOW|EXCEPTION]">
    <google-calendars:oauth-callback-config connector-ref="${oauth.http.connector}" domain="${oauth.url}" localPort="${https.port}" async="false" path="oauth2callback" />
</google-calendars:config-with-oauth>

The new onNoToken property can have two possible values:

  • STOP_FLOW: This pretty much acts as a filter and kills the execution of the flow
  • EXCEPTION: It throws an exception saying that the token is still not there. This exception should be clearer than the ObjectStoreException we get today
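As a rough illustration of the two onNoToken values, the lookup could behave along these lines. Only STOP_FLOW and EXCEPTION come from this spec; the OnNoToken enum, TokenLookup, and TokenNotAvailableException are hypothetical names, and an empty Optional is just one assumed way of telling the caller to stop the flow.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical names throughout; only the two policy values come from the spec.
enum OnNoToken { STOP_FLOW, EXCEPTION }

class TokenNotAvailableException extends RuntimeException {
    TokenNotAvailableException(String accessTokenId) {
        super("No OAuth token stored yet for accessTokenId '" + accessTokenId + "'");
    }
}

class TokenLookup {
    private final Map<String, String> tokens = new HashMap<>();
    private final OnNoToken policy;

    TokenLookup(OnNoToken policy) {
        this.policy = policy;
    }

    void save(String accessTokenId, String token) {
        tokens.put(accessTokenId, token);
    }

    /**
     * Returns the token when present. When it is absent, STOP_FLOW yields an empty
     * Optional (the caller stops the flow) and EXCEPTION throws a descriptive error.
     */
    Optional<String> find(String accessTokenId) {
        String token = tokens.get(accessTokenId);
        if (token != null) {
            return Optional.of(token);
        }
        if (policy == OnNoToken.STOP_FLOW) {
            return Optional.empty();
        }
        throw new TokenNotAvailableException(accessTokenId);
    }
}
```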

Behavior

As explained above, all of these features rely on DEVKIT-264 being fixed. These are the options that were discussed to fix this issue:

Caching the original event

Before sending the redirect to the browser, keep the event in a cache. Upon reception of the callback, the cached event will be sent through the rest of the pipeline. This option was discarded because:

  • The cache should have a relatively short TTL for its items to avoid running out of memory.
  • If a load balancer is placed in front of Mule, there's no guarantee that the instance serving the callback is the same that generated the redirect and is holding the cache entry (this is the CloudHub case)

Buffering the event in ObjectStore (recommended one thus far):

Follow the approach of pre-existing message processors such as or EventCorrelator: use ObjectStore to persist the event before doing the redirect. The ObjectStore key would be <>-auth-event, so that at most one entry exists per tenant. When the callback is received, the entry is retrieved and removed.

Pros:

  • Fixes DEVKIT-264
  • Maintains flow semantic consistency
  • Is consistent with the overall architecture of Mule
  • Would work with Mule HA
  • Would work in CloudHub since the ObjectStore would be shared across workers

Cons:

  • Depending on the ObjectStore implementation, persistence of some events might fail due to properties not being serializable
  • If the payload is an InputStream then it needs to be consumed prior to persistence
  • Depending on the ObjectStore implementation, it might run into resource consumption issues for high volumes
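The buffering option could look roughly like the following sketch. It makes several assumptions: AuthEventBuffer is a hypothetical class, a map of byte arrays stands in for Mule's ObjectStore, plain Java serialization stands in for whatever the real store uses, and the key prefix (left blank in the text above) is assumed to be a tenant id.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; a map of bytes stands in for Mule's ObjectStore.
class AuthEventBuffer {
    private final Map<String, byte[]> store = new HashMap<>();

    // Assumption: the key prefix is a tenant id, giving at most one entry per tenant.
    private static String key(String tenantId) {
        return tenantId + "-auth-event";
    }

    /** Serializes the event and stores it before the redirect, replacing any previous entry. */
    void buffer(String tenantId, Serializable event) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            // Throws NotSerializableException if the event holds non-serializable content.
            out.writeObject(event);
        }
        store.put(key(tenantId), bytes.toByteArray());
    }

    /** Retrieves and removes the buffered event when the callback arrives; null if none. */
    Object restore(String tenantId) throws IOException, ClassNotFoundException {
        byte[] data = store.remove(key(tenantId));
        if (data == null) {
            return null;
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }
}
```

The serialization step is where the first con shows up: an event carrying a non-serializable property fails at buffer time, before the redirect is sent.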

Discard the event, auto generate the state:

The OAuth protocol provides a state query param that can be set when initiating the dance. Whatever value is sent there is returned to you unchanged when receiving the callback. Another option would be to not keep the event at all and simply auto generate the state field from the message's outbound properties. Upon reception of the callback, whatever was an outbound property before the authorize started will be available to the user as an inbound property.

Pros:

  • No magic. 100% in line with the protocol
  • Easy to scale

Cons:

  • Breaks backwards compatibility (at least contract-wise, you could argue that the feature never worked in the first place) with documentation of all Devkit versions and all OAuth2 connectors
  • Introduces semantic inconsistency. The flow would not behave as it reads. There's no other MP in Mule that behaves like that
  • Since the state is a query param, the redirection might fail due to the url being too long. Mule would have no way of receiving that error since the redirection happens at a browser level.
  • Forces users to have a deeper knowledge of the OAuth protocol than we currently ask of them
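To make the state-based option concrete, here is one possible way of packing outbound properties into a URL-safe state value and unpacking them on the callback. OAuthStateCodec and its wire format (Base64url pairs joined with "." and "~") are assumptions made for this sketch, not something the spec or Devkit defines.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical codec; the wire format is an assumption made for this sketch.
class OAuthStateCodec {

    /** Packs properties as base64url(key).base64url(value) pairs joined by '~'. */
    static String encode(Map<String, String> outboundProperties) {
        StringBuilder state = new StringBuilder();
        for (Map.Entry<String, String> entry : outboundProperties.entrySet()) {
            if (state.length() > 0) {
                state.append('~');
            }
            state.append(b64(entry.getKey())).append('.').append(b64(entry.getValue()));
        }
        return state.toString();
    }

    /** Reverses encode(); the result would become the inbound properties after the callback. */
    static Map<String, String> decode(String state) {
        Map<String, String> properties = new LinkedHashMap<>();
        if (state.isEmpty()) {
            return properties;
        }
        for (String pair : state.split("~")) {
            String[] parts = pair.split("\\.", -1); // keep empty values
            properties.put(text(parts[0]), text(parts[1]));
        }
        return properties;
    }

    private static String b64(String value) {
        return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(value.getBytes(StandardCharsets.UTF_8));
    }

    private static String text(String encoded) {
        return new String(Base64.getUrlDecoder().decode(encoded), StandardCharsets.UTF_8);
    }
}
```

The encoded value grows with every property, which is the URL length risk listed in the cons.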

CloudHub notifications

Right before sending the authorization redirect to the browser, the Authorize message processor will fire a synchronous notification signaling the start of an authorization dance. This notification will carry the underlying MuleEvent.

Afterwards, upon reception of the callback, another synchronous notification will be fired signaling the end of that authorization. This notification will carry the reconstructed MuleEvent.
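The two notifications could be modelled as a simple synchronous listener contract. This is a sketch only: AuthorizationListener and AuthorizationNotifier are hypothetical names, and Object stands in for MuleEvent so that the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical listener contract; Object stands in for MuleEvent.
interface AuthorizationListener {
    void onAuthorizationStart(Object originalEvent);

    void onAuthorizationEnd(Object reconstructedEvent);
}

class AuthorizationNotifier {
    private final List<AuthorizationListener> listeners = new ArrayList<>();

    void register(AuthorizationListener listener) {
        listeners.add(listener);
    }

    /** Called right before the redirect is sent; listeners run on the calling thread. */
    void fireStart(Object originalEvent) {
        for (AuthorizationListener listener : listeners) {
            listener.onAuthorizationStart(originalEvent);
        }
    }

    /** Called when the callback is received, with the reconstructed event. */
    void fireEnd(Object reconstructedEvent) {
        for (AuthorizationListener listener : listeners) {
            listener.onAuthorizationEnd(reconstructedEvent);
        }
    }
}
```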

Risks

  • Unforeseen side effects of whatever solution is picked for DEVKIT-264

Other products impact

  • Devkit will have to implement changes in the generation of some processors and the XSD
  • Studio will have to change the OAuth editors
  • CloudHub will have to start listening for the authorize notifications

Migration Impact

From a config point of view

At the config level, these changes are 100% backwards compatible. Although there would be a new recommended way of using the connectors, the old configs will still work. However, applications that upgrade to connectors using these new features will be forced to keep managing the access token ids as they do today, since the default access token keys will not match the ones currently in use.

From a behavior point of view

The main behavioral change comes from solving DEVKIT-264:

  • If we choose the "Buffering the event in ObjectStore" solution, then applications in which the Mule message is not serializable at the moment of authorizing will start to fail. This is however considered a non-issue, since whatever information was in the message at that point is lost under the current behavior anyway. Additionally, such a situation is considered highly unusual and outside the use cases we've seen
  • If we choose the "Discard the event, auto generate the state" solution, then we're breaking backwards compatibility with the contracts and documentation on all Devkit versions and all OAuth2 connectors

Documentation impact

  • The overall Devkit documentation needs to be updated
  • Devkit's documentation generator will have to be updated so that new connectors refer to the new config