Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor TaskEntity to TaskEntity<TState> #175

Merged
merged 9 commits into from
Aug 25, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
213 changes: 67 additions & 146 deletions src/Abstractions/Entities/TaskEntity.cs
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,6 @@
// Licensed under the MIT License.

using System.Reflection;
using System.Threading.Tasks;

namespace Microsoft.DurableTask.Entities;

Expand All @@ -12,9 +11,7 @@ namespace Microsoft.DurableTask.Entities;
/// <remarks>
/// <para><b>Entity State</b></para>
/// <para>
/// All entity implementations are required to be serializable by the configured <see cref="DataConverter"/>. An entity
/// will have its state deserialized before executing an operation, and then the new state will be the serialized value
/// of the <see cref="ITaskEntity"/> implementation instance post-operation.
/// The state of an entity can be retrieved and updated via <see cref="TaskEntityOperation.Context"/>.
/// </para>
/// </remarks>
public interface ITaskEntity
Expand All @@ -30,6 +27,7 @@ public interface ITaskEntity
/// <summary>
/// An <see cref="ITaskEntity"/> which dispatches its operations to public instance methods or properties.
/// </summary>
/// <typeparam name="TState">The state type held by this entity.</typeparam>
/// <remarks>
/// <para><b>Method Binding</b></para>
/// <para>
Expand Down Expand Up @@ -71,176 +69,99 @@ public interface ITaskEntity
///
/// <para><b>Entity State</b></para>
/// <para>
/// Unchanged from <see cref="ITaskEntity"/>. Entity state is the serialized value of the entity after an operation
/// completes.
/// Entity state will be hydrated into the <see cref="TaskEntity{TState}.State"/> property. See <see cref="State"/> for
/// more information.
/// </para>
/// </remarks>
public abstract class TaskEntity : ITaskEntity
public abstract class TaskEntity<TState> : ITaskEntity
{
/**
* TODO:
* 1. Consider caching a compiled delegate for a given operation name.
*/
static readonly BindingFlags InstanceBindingFlags
= BindingFlags.Public | BindingFlags.Instance | BindingFlags.IgnoreCase;

/// <inheritdoc/>
public ValueTask<object?> RunAsync(TaskEntityOperation operation)
{
Check.NotNull(operation);
if (!this.TryDispatchMethod(operation, out object? result, out Type returnType))
{
throw new NotSupportedException($"No suitable method found for entity operation '{operation}'.");
}

if (typeof(Task).IsAssignableFrom(returnType))
{
// Task or Task<T>
return new(AsGeneric((Task)result!, returnType)); // we assume a declared Task return type is never null.
}

if (returnType == typeof(ValueTask))
{
// ValueTask
return AsGeneric((ValueTask)result!); // we assume a declared ValueTask return type is never null.
}

if (returnType.IsGenericType && returnType.GetGenericTypeDefinition() == typeof(ValueTask<>))
{
// ValueTask<T>
return AsGeneric(result!, returnType); // No inheritance, have to do purely via reflection.
}

return new(result);
}
/// <summary>
/// Gets a value indicating whether dispatching operations to <see cref="State"/> is allowed. State dispatch
/// will only be attempted if entity-level dispatch does not succeed. Default is <c>false</c>. Dispatching to state
/// follows the same rules as dispatching to this entity.
/// </summary>
protected virtual bool AllowStateDispatch => false;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sebastianburckhardt want to draw your attention to this. I set the default to false because I feel like that is safer. Don't want users accidentally dispatching to state when they don't realize this property & behavior exists.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting this to false by default makes sense for the way this interface is currently defined. However, I think there is still a gap here.

I think the dilemma here is about whether we are trying to provide (a) a flexible interface that is constructed logically and can support many different types of TState and (b) an interface that aggressively minimizes boilerplate for the most common and simplest use where the user wants to just write the entity just as a simple single class with methods to define the operations, i.e. they only want to look at the class TState and have the existence of TaskEntity<TState> be an implementation detail that gets hidden in some minimal boilerplate or perhaps even hidden completely at some point.

As currently designed, TaskEntity<TState> matches the requirements for (a), but not (b). I realize it may not be possible to satisfy both requirements at the same time..

But we really also need something for (b) because it just seems to matter the most in my experience - it is how we want to write all our samples and benchmarks. I remember that for DF we designed multiple APIs but when I was writing applications it turned out I only ever wanted the single-class one because the code looks so much nicer that way. Having the state indirection is a pretty big hassle IMO, and forcing our users to do that is a step backwards (allowing them to do it is fine).

So, I still think we need to somehow find a way to make that simple single-class experience with minimal boilerplate possible. Maybe use a derived class ClassBasedEntity<TClass> and then constrain all the choices to keep things simple:

  • TClass must be a class with a default constructor
  • InitializeState calls the default constructor
  • DeleteState remains available on the context so entities can delete themselves
  • All operations are dispatched to the inner class first, then the outer class

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the current priorities I would be o.k. with delaying the design and implementation of this "easier to use" single-class version and focus on the more verbose and flexible TaskEntity for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per our offline discussion, current experience is sufficient for now. All of these scenarios are possible, but might not be a 1 to 1 with entities in-proc. Which is OK, we can expect some changes during migration. We will revisit this later based on user demand.


static bool TryGetInput(ParameterInfo parameter, TaskEntityOperation operation, out object? input)
{
if (!operation.HasInput)
{
if (parameter.HasDefaultValue)
{
input = parameter.DefaultValue;
return true;
}
/// <summary>
/// Gets or sets the state for this entity.
/// </summary>
/// <remarks>
/// <para><b>Initialization</b></para>
/// <para>
/// This will be hydrated as part of <see cref="RunAsync(TaskEntityOperation)"/>. <see cref="InitializeState"/> will
/// be called when state is <c>null</c> <b>at the start of an operation only</b>.
/// </para>
/// <para><b>Persistence</b></para>
/// <para>
/// The contents of this property will be persisted to <see cref="TaskEntityContext.SetState(object?)"/> at the end
/// of the operation.
/// </para>
/// <para><b>Deletion</b></para>
/// <para>
/// Deleting entity state is possible by setting this to <c>null</c>. Setting to default of a value-type will
/// <b>not</b> perform a delete. This means deleting entity state is only possible for reference types or using <c>?</c>
/// on a value-type (ie: <c>TaskEntity&lt;int?&gt;</c>).
/// </para>
/// </remarks>
protected TState State { get; set; } = default!;

input = null;
return false;
}
/// <summary>
/// Gets the entity operation.
/// </summary>
protected TaskEntityOperation Operation { get; private set; } = null!;

input = operation.GetInput(parameter.ParameterType);
return true;
}
/// <summary>
/// Gets the entity context.
/// </summary>
protected TaskEntityContext Context => this.Operation.Context;

static async Task<object?> AsGeneric(Task task, Type declared)
/// <inheritdoc/>
public ValueTask<object?> RunAsync(TaskEntityOperation operation)
{
await task;
if (declared.IsGenericType && declared.GetGenericTypeDefinition() == typeof(Task<>))
this.Operation = Check.NotNull(operation);
object? state = operation.Context.GetState(typeof(TState));
this.State = state is null ? this.InitializeState() : (TState)state;
if (!operation.TryDispatch(this, out object? result, out Type returnType)
&& !this.TryDispatchState(out result, out returnType))
{
return declared.GetProperty("Result", BindingFlags.Public | BindingFlags.Instance).GetValue(task);
throw new NotSupportedException($"No suitable method found for entity operation '{operation}'.");
}

return null;
return TaskEntityHelpers.UnwrapAsync(this.Context, () => this.State, result, returnType);
}

static ValueTask<object?> AsGeneric(ValueTask t)
/// <summary>
/// Initializes the entity state. This is only called when there is no current state for this entity.
/// </summary>
/// <returns>The entity state.</returns>
/// <remarks>The default implementation uses <see cref="Activator.CreateInstance()"/>.</remarks>
protected virtual TState InitializeState()
{
static async Task<object?> Await(ValueTask t)
{
await t;
return null;
}

if (t.IsCompletedSuccessfully)
if (Nullable.GetUnderlyingType(typeof(TState)) is Type t)
{
return default;
// Activator.CreateInstance<Nullable<T>>() returns null. To avoid this, we will instantiate via underlying
// type if it is Nullable<T>. This keeps the experience consistent between value and reference type. If an
// implementation wants null, they must override this method and explicitly provide null.
return (TState)Activator.CreateInstance(t);
}

return new(Await(t));
return Activator.CreateInstance<TState>();
}

static ValueTask<object?> AsGeneric(object result, Type type)
bool TryDispatchState(out object? result, out Type returnType)
{
// result and type here must be some form of ValueTask<T>.
if ((bool)type.GetProperty("IsCompletedSuccessfully").GetValue(result))
{
return new(type.GetProperty("Result").GetValue(result));
}
else
{
Task t = (Task)type.GetMethod("AsTask", BindingFlags.Instance | BindingFlags.Public)
.Invoke(result, null);
return new(t.ToGeneric<object?>());
}
}

bool TryDispatchMethod(TaskEntityOperation operation, out object? result, out Type returnType)
{
Type t = this.GetType();

// Will throw AmbiguousMatchException if more than 1 overload for the method name exists.
MethodInfo? method = t.GetMethod(operation.Name, InstanceBindingFlags);
if (method is null)
if (!this.AllowStateDispatch)
{
result = null;
returnType = typeof(void);
return false;
}

ParameterInfo[] parameters = method.GetParameters();
object?[] inputs = new object[parameters.Length];

int i = 0;
ParameterInfo? inputResolved = null;
ParameterInfo? contextResolved = null;
ParameterInfo? operationResolved = null;
foreach (ParameterInfo parameter in parameters)
if (this.State is null)
{
if (parameter.ParameterType == typeof(TaskEntityContext))
{
ThrowIfDuplicateBinding(contextResolved, parameter, "context", operation);
inputs[i] = operation.Context;
contextResolved = parameter;
}
else if (parameter.ParameterType == typeof(TaskEntityOperation))
{
ThrowIfDuplicateBinding(operationResolved, parameter, "operation", operation);
inputs[i] = operation;
operationResolved = parameter;
}
else
{
ThrowIfDuplicateBinding(inputResolved, parameter, "input", operation);
if (TryGetInput(parameter, operation, out object? input))
{
inputs[i] = input;
inputResolved = parameter;
}
else
{
throw new InvalidOperationException($"Error dispatching {operation} to '{method}'.\n" +
$"There was an error binding parameter '{parameter}'. The operation expected an input value, " +
"but no input was provided by the caller.");
}
}

i++;
throw new InvalidOperationException("Attempting to dispatch to state, but entity state is null.");
}

result = method.Invoke(this, inputs);
returnType = method.ReturnType;
return true;

static void ThrowIfDuplicateBinding(
ParameterInfo? existing, ParameterInfo parameter, string bindingConcept, TaskEntityOperation operation)
{
if (existing is not null)
{
throw new InvalidOperationException($"Error dispatching {operation} to '{parameter.Member}'.\n" +
$"Unable to bind {bindingConcept} to '{parameter}' because it has " +
$"already been bound to parameter '{existing}'. Please remove the duplicate parameter in method " +
$"'{parameter.Member}'.\nEntity operation: {operation}.");
}
}
return this.Operation.TryDispatch(this.State, out result, out returnType);
}
}
17 changes: 11 additions & 6 deletions src/Abstractions/Entities/TaskEntityContext.cs
Original file line number Diff line number Diff line change
Expand Up @@ -53,11 +53,16 @@ public abstract void StartOrchestration(
TaskName name, object? input = null, StartOrchestrationOptions? options = null);

/// <summary>
/// Deletes the state of this entity after the current operation completes.
/// Gets the current state for the entity this context is for. This will return <c>null</c> if no state is present,
/// regardless if <paramref name="type"/> is a value-type or not.
/// </summary>
/// <remarks>
/// The state deletion only takes effect after the current operation completes. Any state changes made during the
/// current operation will be ignored in favor of the deletion.
/// </remarks>
public abstract void DeleteState();
/// <param name="type">The type to retrieve the state as.</param>
/// <returns>The entity state.</returns>
public abstract object? GetState(Type type);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For clarity, it may be nice to document how GetState handles default states, in particular for non-object types.
For example, if I do GetState(typeof(int)) on an entity whose state is null (i.e. it has not been set, or it has been set to zero), then we would expect GetState to return 0 ?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option would be simply to support only nullable types for entities, then we don't have this strange corner case.

I think I prefer this more restrictive solution.... there is something very strange about setting an integer entity to zero and then see it disappear from storage. It is clever, but this will invariably confuse a lot of people.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the object? return type was clear that if state is not present, you get null. I can clarify that in a comment. We already have this same behavior with DataConverter.

An alternative is bool TryGetState<T>(out T? state), but this then becomes ambiguous in the false case: was there no state to begin with, or did the state not map to T? Throwing from here violates the expectations of the Try* pattern. I think will just clarify the comment for GetState.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the object? return type was clear that if state is not present, you get null

The problem with that is that you get strange behaviors. For example, for TaskEntity<int>: after the user assigns this.State = 0 then this.State == null which looks broken (not logically sound). The reason is that assigning the default value (0 is the default value of int) is considered deletion and therefore removes the entity state completely.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per our offline discussion, setting TState to default of a struct (0 for an int in this example), will not delete state. Users will need to declare TaskEntity<int?> and then assign this.State = null. Otherwise, they can implement a Reset operation which performs this.State = this.InitializeState(); - this does not "delete" state, but resets the entity to be identical to as if state was null as far as the user is concerned.

We will also add a built in delete operation for them to rely on.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to clarify: reset is not identical to delete as far as the user is concerned because a reset entity will still be returned by subsequent queries, while a deleted entity is not.

Our users definitely care about this distinction. we had endless discussions about this and many issues, e.g. Azure/azure-functions-durable-extension#931 (comment).
The lesson we learnt is (a) users struggle to discover how to delete entities unless we tell them very clearly how to do that, and (b) the runtime needs to reliably remove deleted entities from subsequent entity queries.


/// <summary>
/// Sets the entity state. Setting of <c>null</c> will delete entity state.
/// </summary>
/// <param name="state">The state to set.</param>
public abstract void SetState(object? state);
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to add an overload method for deleting state:

public void DeleteState() { this.SetState(null); }

It's redundant (and we can say that in the comment) but when I write or read code I would much rather use DeleteState() than SetState(null) since it makes it much more explicit what is happening (records being removed from storage and no longer returned by queries).

100 changes: 100 additions & 0 deletions src/Abstractions/Entities/TaskEntityHelpers.cs
Original file line number Diff line number Diff line change
@@ -0,0 +1,100 @@
// Copyright (c) Microsoft Corporation.
// Licensed under the MIT License.

using System.Reflection;

namespace Microsoft.DurableTask.Entities;

/// <summary>
/// Helpers for task entities.
/// </summary>
static class TaskEntityHelpers
{
/// <summary>
/// Unwraps a dispatched result for a <see cref="TaskEntityOperation"/> into a <see cref="ValueTask{Object}"/>.
/// </summary>
/// <param name="context">The entity context.</param>
/// <param name="state">Delegate to resolve new state for the entity.</param>
/// <param name="result">The result of the operation.</param>
/// <param name="resultType">The declared type of the result (may be different that actual type).</param>
/// <returns>A value task which holds the result of the operation and sets state before it completes.</returns>
public static ValueTask<object?> UnwrapAsync(
TaskEntityContext context, Func<object?> state, object? result, Type resultType)
{
// NOTE: Func<object?> is used for state so that we can lazily resolve it AFTER the operation has ran.
Check.NotNull(context);
Check.NotNull(resultType);

if (typeof(Task).IsAssignableFrom(resultType))
{
// Task or Task<T>
// We assume a declared Task return type is never null.
return new(UnwrapTask(context, state, (Task)result!, resultType));
}

if (resultType == typeof(ValueTask))
{
// ValueTask
// We assume a declared ValueTask return type is never null.
return UnwrapValueTask(context, state, (ValueTask)result!);
}

if (resultType.IsGenericType && resultType.GetGenericTypeDefinition() == typeof(ValueTask<>))
{
// ValueTask<T>
// No inheritance, have to do purely via reflection.
return UnwrapValueTaskOfT(context, state, result!, resultType);
}

context.SetState(state());
return new(result);
}

static async Task<object?> UnwrapTask(TaskEntityContext context, Func<object?> state, Task task, Type declared)
{
await task;
context.SetState(state());
if (declared.IsGenericType && declared.GetGenericTypeDefinition() == typeof(Task<>))
{
return declared.GetProperty("Result", BindingFlags.Public | BindingFlags.Instance).GetValue(task);
}

return null;
}

static ValueTask<object?> UnwrapValueTask(TaskEntityContext context, Func<object?> state, ValueTask t)
{
async Task<object?> Await(ValueTask t)
{
await t;
context.SetState(state());
return null;
}

if (t.IsCompletedSuccessfully)
{
context.SetState(state());
return default;
}

return new(Await(t));
}

static ValueTask<object?> UnwrapValueTaskOfT(
TaskEntityContext context, Func<object?> state, object result, Type type)
{
// Result and type here must be some form of ValueTask<T>.
// TODO: can this amount of reflection be avoided?
if ((bool)type.GetProperty("IsCompletedSuccessfully").GetValue(result))
{
context.SetState(state());
return new(type.GetProperty("Result").GetValue(result));
}
else
{
Task t = (Task)type.GetMethod("AsTask", BindingFlags.Instance | BindingFlags.Public).Invoke(result, null);
Type taskType = typeof(Task<>).MakeGenericType(type.GetGenericArguments()[0]);
return new(UnwrapTask(context, state, t, taskType));
}
}
}
Loading