Polly Roadmap
The roadmap indicates currently envisaged or candidate development directions for Polly.
Community feedback on features you would like, or priorities, is welcome.
Comment on overall direction, new features, or prioritisation requests under the roadmap placeholder issue.
If you're interested in following and contributing to the development direction of Polly, join our public Slack channel for real-time discussion of ideas.
may be under active development, may be found in an upcoming release
In dev now for v6.x
New fuller proposal: PROPOSAL--Polly-eventing-and-metrics-architecture
Slack channel: Metrics slack channel
GitHub discussion: #326
With the number of resilience strategies now available in Polly v5, emitting Policy events/statistics, and timings to completion/failure of Policy executions, could be valuable.
The planned architecture envisages three layers:
- raise raw events from policies
- aggregate those events with Rx or another strategy, to form relevant metrics/stats
- transform those metrics to the format required by any particular dashboard.
The layered architecture is intended to allow users to push metrics to any dashboard (eg AppInsights; Influx; Prometheus), including users writing their own plugins for custom dashboards if desired.
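As a rough illustration of the three layers, the sketch below batches raw events and aggregates them with LINQ (standing in for "Rx or another strategy"), then formats them for a dashboard. Every type name here (ExecutionEvent, PolicyStats, MetricsPipeline) and the output line format are assumptions made for illustration; none of them are Polly APIs or part of the proposal.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Layer 1 (hypothetical): a raw event as a policy might raise it.
public sealed class ExecutionEvent
{
    public ExecutionEvent(string policyKey, bool succeeded, TimeSpan duration)
    {
        PolicyKey = policyKey;
        Succeeded = succeeded;
        Duration = duration;
    }

    public string PolicyKey { get; }
    public bool Succeeded { get; }
    public TimeSpan Duration { get; }
}

// Layer 2 (hypothetical): statistics aggregated from a batch of raw events.
public sealed class PolicyStats
{
    public string PolicyKey { get; set; }
    public int Executions { get; set; }
    public int Failures { get; set; }
    public double MeanDurationMs { get; set; }
}

public static class MetricsPipeline
{
    // Groups raw events by policy and computes counts and mean duration per policy.
    public static IEnumerable<PolicyStats> Aggregate(IEnumerable<ExecutionEvent> events) =>
        events.GroupBy(e => e.PolicyKey).Select(g => new PolicyStats
        {
            PolicyKey = g.Key,
            Executions = g.Count(),
            Failures = g.Count(e => !e.Succeeded),
            MeanDurationMs = g.Average(e => e.Duration.TotalMilliseconds)
        });

    // Layer 3 (hypothetical): transform stats into an assumed plain-text line format.
    public static string Format(PolicyStats stats) =>
        string.Format(
            CultureInfo.InvariantCulture,
            "polly,policy={0} executions={1},failures={2},mean_ms={3:F1}",
            stats.PolicyKey, stats.Executions, stats.Failures, stats.MeanDurationMs);
}
```

Keeping the formatter separate from the aggregation is what would let a different dashboard be targeted by swapping only the third layer.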
prototyping started, discussion transitioning to development
next after high-priority items
This proposes execute overloads giving the ability to execute delegates taking input parameters, without using closures, as discussed in #271:
Task<TResult> Policy.ExecuteAsync<T1, TResult>(Func<T1, Context, CancellationToken, Task<TResult>> action, T1 input1, Context context, CancellationToken cancellationToken, bool continueOnCapturedContext)
Polly v6.0 plans to deliver this.
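To illustrate the motivation: with the existing overloads, a delegate that needs an input value typically captures it in a closure. One partial workaround is to carry the value in Context, so the lambda itself captures nothing. The following is a minimal sketch, assuming the Execute(Func<Context, TResult>, Context) overload and the Context indexer found in recent Polly versions; the LoadOrder helper and the "orderId" key are hypothetical.

```csharp
using System;
using Polly;

public static class ClosureFreeExecution
{
    // Hypothetical operation standing in for a real call to a downstream system.
    private static string LoadOrder(int orderId) => $"order-{orderId}";

    public static string Run(int orderId)
    {
        var retry = Policy.Handle<InvalidOperationException>().Retry(3);

        // With a closure: the lambda captures orderId, so a closure object is allocated per call.
        string viaClosure = retry.Execute(() => LoadOrder(orderId));

        // Workaround: pass the input through Context; this lambda captures no variables.
        var context = new Context("LoadOrder") { ["orderId"] = orderId };
        string viaContext = retry.Execute(ctx => LoadOrder((int)ctx["orderId"]), context);

        if (viaClosure != viaContext)
        {
            throw new InvalidOperationException("Both routes should produce the same result.");
        }

        return viaContext;
    }
}
```

The workaround still allocates a Context and boxes value-type inputs, and the cast back to int is not checked at compile time, which is the gap that strongly-typed input overloads would close.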
Several user requests concern refining circuit-breaker behaviour:
Polly v6.0 proposes adding a CustomCircuitBreaker(ICircuitController myCustomController) to open up circuit-breaker customisation.
awaiting/seeking development resource, or evaluating importance
The number of overloads available to configure policies and execute actions can cause confusion.
A possible mitigation could be splitting up the fluent syntax of step 2, policy configuration. Currently the mandatory parameters of policy configuration (eg how many retries to make) and optional delegate hooks (eg OnRetry) are configured in the same overload. Separating out the optional delegate hooks into postfix methods as below could reduce the burden.
// Syntax under consideration
Policy
.Handle<HttpException>()
.Or<WebException>()
.Retry(3, i => TimeSpan.FromSeconds(Math.Pow(2,i)))
.OnRetry((ex, timespan) => ...)
.Execute(...);
A partial syntax proposal to address split sync/async policies now exists: comment welcome.
This would likely be implemented in combination with the syntax refresh above. This is a large piece of work requiring a major time investment.
not currently in line for active development
A common resilience strategy is to have failover among a series of possible endpoints or systems to try, for a particular operation.
The existing FallbackPolicy provides a single fallback for a failed execution. A FailoverPolicy would go further and automate failover among sources in a round-robin fashion: execute against system A; if that fails, execute against B; if that fails, execute against C, etc (eventually wrapping back to A).
// Possible syntax:
// configuration
FailoverPolicy failoverPolicy = Policy
.Handle<WhateverException>() // These exceptions would cause failover to the next provider
.Failover(IEnumerable<TProvider> providers); // OR: .Failover(Func<TProvider> getNextProvider)
// usage
TResult result = failoverPolicy.Execute<TProvider, TResult>(Func<TProvider, TResult> func);
failoverPolicy.Execute<TProvider>(Action<TProvider> action);
Although one configuration overload specifies IEnumerable<TProvider> providers, this need not be a fixed collection. It is easy to code an IEnumerable iterating over a dynamic collection. Equally, this can be an infinite enumerable, looping back to the start when it hits the end, if desired.
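A minimal sketch of such an enumerable is shown below. It is plain C# and does not depend on the proposed FailoverPolicy; the ProviderSequences class and RoundRobin method are hypothetical names, and the caller is assumed to supply a delegate that returns a consistent snapshot of the current providers.

```csharp
using System;
using System.Collections.Generic;

public static class ProviderSequences
{
    // Cycles endlessly over the providers, re-reading the list on every step so that
    // providers added or removed at runtime are picked up. Ends only if the list is empty.
    public static IEnumerable<TProvider> RoundRobin<TProvider>(
        Func<IReadOnlyList<TProvider>> getCurrentProviders)
    {
        if (getCurrentProviders == null) throw new ArgumentNullException(nameof(getCurrentProviders));
        return Iterate(getCurrentProviders);
    }

    private static IEnumerable<TProvider> Iterate<TProvider>(
        Func<IReadOnlyList<TProvider>> getCurrentProviders)
    {
        int index = 0;
        while (true)
        {
            IReadOnlyList<TProvider> snapshot = getCurrentProviders();
            if (snapshot.Count == 0)
            {
                yield break; // nothing left to fail over to
            }

            index %= snapshot.Count; // wrap around, and cope with a list that has shrunk
            yield return snapshot[index];
            index++;
        }
    }
}
```

For a list containing A, B and C, taking the first four items of ProviderSequences.RoundRobin<string>(() => endpoints) yields A, B, C, A.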
FailoverPolicy would also support a method for manually failing over to the next endpoint, .FailOver(). Any other metric/trigger you care about could thus be used to trigger failover. For instance, .FailOver() could be called from the onBreak delegate of a circuit-breaker, so that eg an AdvancedCircuitBreaker threshold triggers failing over, in a more automated version of the process described here.
Implementation consideration: The double-generic nature of the method .Execute<TProvider, TResult>(Func<TProvider, TResult> func) presents a challenge for Polly v5. Execute methods taking a strongly-typed input parameter do not exist in Polly v5; and even if they were added specifically on FailoverPolicy, that would not make them available throughout a PolicyWrap, for when FailoverPolicy might be used within a PolicyWrap.
The PollyExecutables proposed for Polly v6 solve all these problems. They open up the path for Execute methods taking strongly-typed input parameters. And the executable instance representing an execution with a strongly-typed input parameter can be passed all the way through a PolicyWrap.
Moved to lower priority. The concept fits well with resilience, but other options exist in many cases, such as network load-balancers, cloud traffic-management tools, in-built multi-node-targeting in APIs which connect to third-party systems.
The following existing issues describe possibilities in the meantime: #199 and #262
A policy to limit the number of calls placed through the policy per timespan. Useful when calling a third-party system which imposes a rate-limit on its API, provided that rate-limit is known. Perhaps taking a similar approach (with refinements) to something like Jack Leitch's RateGate.
Compare BulkheadPolicy. While BulkheadPolicy is a parallelism-throttle, RateLimitPolicy would be a serial throttle.
Note: A rate-limiting design of the RateGate kind, which 'holds back' already-executing hot Tasks or threads until it is their time/turn to proceed, is intrinsically vulnerable to memory bulges (all those waiting executions have to be held in memory) in high-volume scenarios where fresh requests consistently outstrip the permitted rate. Two possibilities to deal with this are co-operative demand control (aka back-pressure) and load-shedding.
In high-volume scenarios where you have control over the producers, co-operative demand control by back-pressure is recommended; Akka streams is a mature solution for this.
For those whose scenario is amenable to Rx, there may be the option of in-built operators such as throttle, buffer, sample and window.
A RateLimitPolicy in Polly could still be useful, particularly if we can provide options for configurable load-limits/load-shedding. Possibilities include:
- a configurable upper-bound on the number of executions (across all threads) that are allowed to queue. (This would have nice symmetry with the queue on BulkheadPolicy.)
- shedding actions which have been queuing longer than a configurable TimeSpan (which TimeoutPolicy already provides).
Discussion in our Slack channel around here also drew out the difference between being the rate-abiding or rate-imposing party. Being the rate-imposing party sounds as simple as rejecting excess calls immediately rather than allowing them to queue.
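A minimal sketch of the rate-imposing case follows: a fixed-window counter that rejects excess calls immediately and never queues them. It is plain C#, not a Polly type; the RejectingRateLimiter name, the fixed-window strategy and the injectable clock are all assumptions made for illustration. A fixed window permits bursts around window boundaries, so a real design might prefer a sliding window or token bucket.

```csharp
using System;

// Hypothetical type for illustration only; not part of Polly.
public sealed class RejectingRateLimiter
{
    private readonly int _maxCallsPerWindow;
    private readonly TimeSpan _window;
    private readonly Func<DateTimeOffset> _clock;
    private readonly object _lock = new object();
    private DateTimeOffset _windowStart;
    private int _callsInWindow;

    public RejectingRateLimiter(int maxCallsPerWindow, TimeSpan window, Func<DateTimeOffset> clock = null)
    {
        if (maxCallsPerWindow <= 0) throw new ArgumentOutOfRangeException(nameof(maxCallsPerWindow));
        if (window <= TimeSpan.Zero) throw new ArgumentOutOfRangeException(nameof(window));

        _maxCallsPerWindow = maxCallsPerWindow;
        _window = window;
        _clock = clock ?? (() => DateTimeOffset.UtcNow); // injectable clock keeps the type testable
        _windowStart = _clock();
    }

    // Returns true if the call may proceed. False means "reject now": nothing is queued,
    // so no waiting executions accumulate in memory.
    public bool TryAcquire()
    {
        lock (_lock)
        {
            DateTimeOffset now = _clock();
            if (now - _windowStart >= _window)
            {
                _windowStart = now; // start a fresh window
                _callsInWindow = 0;
            }

            if (_callsInWindow >= _maxCallsPerWindow)
            {
                return false;
            }

            _callsInWindow++;
            return true;
        }
    }
}
```

A caller would check TryAcquire() before executing and respond to a false result with its own rejection, for example a 429 response.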
See also the important alternative below regarding Retry-After.
Many Azure APIs impose a rate-limit on usage dependent on the pricing tier, eg CosmosDB, and many of the cognitive services.
If you are seeking a Rate-Limit policy in connection with these, be sure to explore the alternative of Retry policies honouring the Retry-After header of a 429 (Too Many Requests) response. CosmosDB and many cognitive services APIs return 429 responses indicating when a request may be retried. Polly already offers WaitAndRetry overloads which can calculate the duration to wait based on the returned result (ie the Retry-After header in this case).
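A minimal sketch of such a policy for HttpClient calls is shown below, assuming the WaitAndRetryAsync overload whose sleepDurationProvider receives the DelegateResult (and which also takes an onRetryAsync delegate). The RetryAfterPolicies class, the GetWait helper and the exponential-backoff fallback used when no header is present are assumptions made for illustration.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using Polly;

public static class RetryAfterPolicies
{
    // Works out how long to wait before the next try, preferring the server's Retry-After hint.
    public static TimeSpan GetWait(HttpResponseMessage response, int retryAttempt)
    {
        var retryAfter = response?.Headers.RetryAfter;

        if (retryAfter?.Delta != null)
        {
            return retryAfter.Delta.Value; // header given as a number of seconds
        }

        if (retryAfter?.Date != null)
        {
            // Header given as an absolute date: wait until then, never a negative time.
            TimeSpan untilDate = retryAfter.Date.Value - DateTimeOffset.UtcNow;
            return untilDate > TimeSpan.Zero ? untilDate : TimeSpan.Zero;
        }

        // Assumed fallback when the server gives no hint: exponential backoff.
        return TimeSpan.FromSeconds(Math.Pow(2, retryAttempt));
    }

    // Retries only on 429 responses, waiting as long as the response asks.
    public static IAsyncPolicy<HttpResponseMessage> Create(int retryCount = 3) =>
        Policy
            .HandleResult<HttpResponseMessage>(r => r.StatusCode == (HttpStatusCode)429)
            .WaitAndRetryAsync(
                retryCount,
                sleepDurationProvider: (attempt, outcome, context) => GetWait(outcome.Result, attempt),
                onRetryAsync: (outcome, wait, attempt, context) => Task.CompletedTask);
}
```

Usage would be along the lines of await RetryAfterPolicies.Create().ExecuteAsync(() => httpClient.GetAsync(url)), where httpClient and url are the caller's own.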
Consider introducing configuration-provider interfaces and/or POCOs such as discussed here and exemplified here.
The original intention was that these POCOs might describe the numeric configuration parameters of a policy, such as 'number of consecutive faults before breaking'. This in turn would allow external configuration sources to be mapped to these POCOs, allowing the creation of configuration-helper plugins for various configuration sources.
Assessment: This would be simple for policies whose key configuration details are numeric, such as circuit-breakers and bulkheads.
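For the numeric case, the mapping could be as small as the sketch below. The CircuitBreakerSettings POCO, its property names and default values, and the ConfiguredPolicies helper are hypothetical; only Policy.Handle<TException>() and CircuitBreaker(int, TimeSpan) are existing Polly calls. How the POCO gets populated (JSON file, environment variables, a configuration binder) is left to the configuration source.

```csharp
using System;
using Polly;

// Hypothetical POCO; property names and defaults are illustrative, not a Polly API.
public sealed class CircuitBreakerSettings
{
    public int ExceptionsAllowedBeforeBreaking { get; set; } = 5;
    public double DurationOfBreakSeconds { get; set; } = 30;
}

public static class ConfiguredPolicies
{
    // Builds a circuit-breaker from externally supplied numeric settings.
    public static ISyncPolicy CreateCircuitBreaker(CircuitBreakerSettings settings)
    {
        if (settings == null) throw new ArgumentNullException(nameof(settings));

        return Policy
            .Handle<Exception>()
            .CircuitBreaker(
                settings.ExceptionsAllowedBeforeBreaking,
                TimeSpan.FromSeconds(settings.DurationOfBreakSeconds));
    }
}
```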
A challenge would be policies offering configurations which are far from purely numeric. WaitAndRetry(), for instance, has two key use cases which define waits between retries as dynamic functions: exponential backoff and jitter. These could be parameterized by the numeric parameters defining the exponential backoff and jitter. However, WaitAndRetry still offers a purely dynamic Func<int, TimeSpan> option and an IEnumerable<TimeSpan> option, which are not definable at all through numeric parameters; a retry configuration POCO mapped to a config source could only ever provide partial coverage of the available retry options.
Timeout policy also falls in this category, as it has an entirely dynamic Func<TimeSpan> configuration option.
Current view: Low priority for the Polly team's time (as against deeper resilience challenges where that time can be invested). It is relatively easy for users to develop links to their own configuration sources for the subset of Polly options they choose to use. Moved to inactive.
Comment on this roadmap here
Expressions of interest in developing any of the above functionality are welcome! (For major features, contact the Polly team first to avoid duplication of effort.) Some issues are also marked as 'up-for-grabs' at any time, in the Issues list. See also the notes for contributors!
The roadmap is published for transparency and to solicit community input, but is by its nature indicative and subject to change: proposed features may be more difficult to implement than envisaged, or may be down-prioritised as we continue to seek the balance for the library between power and simplicity.
February 2018: dynamic reconfiguration during running
November 2017: numerous small enhancements including durations to wait in wait and retry, based on error response.
October 2017: CachePolicy (thanks to @seanfarrow for much contribution to the thinking).
June 2017: PolicyRegistry (thanks to @ankitbko); interfaces.
May 2017: Share rich information between execution and control delegates (blog)
February 2017: NoOp Policy (thanks to @lakario)
December 2016: Polly v5.0.3 RTM:
- Full .NET Standard 1.0 compatibility
- Bulkhead policy
- Timeout policy (including walking away from executions with no in-built timeout)
- Fallback policy
- PolicyWrap
- PolicyKeys and ContextData
- Rationalised .NET40 async support
October 2016: Polly v5.0-alpha, with four new policies: Bulkhead, Timeout, Fallback, and PolicyWrap
July 2016: .NET Core 1.0 RTM support; .NET Standard 1.0 support
June 2016: Policies to handle return values
June 2016: Polly.Net40Async
April 2016: Advanced Circuit Breaker
Mar 2016: Full ContextualPolicy support
Feb 2016: Manual control and public state for circuit breaker, for health/performance monitoring
Jan 2016: Full async support.