Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor TaskEntity to TaskEntity<TState> #175

Merged
merged 9 commits into from
Aug 25, 2023

Conversation

jviau
Copy link
Member

@jviau jviau commented Aug 11, 2023

This PR adjusts TaskEntity, which previously had that instance itself be treated as entity state to TaskEntity<TState>, where a property on the class is the state. The reason for this change is to separate dependency injection & state hydration. DI is not performed on the entity instance and state hydration is performed on the property.

Additional changes:

  1. TaskEntityContext.GetState and SetState methods added.
  2. TaskEntityContext.DeleteState removed in favor of using SetState(null) from above.
  3. Operations will by default attempt to dispatch to the state object if entity-level dispatch has no matches.
  4. TaskEntityOperation and TaskEntityContext are now properties on TaskEntity<TState>. giving an alternative to param binding.
  5. Unit tests added to cover more scenarios
  6. Logic from TaskEntity has been moved to TaskEntityOperationExtensions and to TaskEntityHelpers.

Copy link
Member

@sebastianburckhardt sebastianburckhardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First set of comments. I like the approach, but there are still some design questions I think we need to address.

/// will be persisted to <see cref="TaskEntityContext.SetState(object?)"/> when the operation completes. Deleting
/// entity state can be accomplished by setting this to default(<typeparamref name="TState"/>).
/// </remarks>
protected TState State { get; set; } = default!; // leave null-checks to end implementation.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not super familiar with how ! is intended to be used. I thought it means "I know for sure this is not null, so you don't need to try to prove it" but your use here seems to suggest that it can also be used to say "I don't care whether this is null or not". Is this an o.k. way to use the !?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes the ! operator is for suppressing nullability warnings. You can generalize it as appropriate for use when static analysis does not work for your scenario. In our case, it doesn't work, we don't know the deserialization logic of TState, we don't know if they want to throw on null or not.

Although, after some thought I think TState? is appropriate here. Because we will initialize to default(TState). I just want to verify it works as intended with structs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am undecided if TState or TState? is better here. The common scenario is for TState to be non-null for most operation dispatches (since we initialize via Activator.CreateInstance now). However, there are a couple scenarios where null is okay when interacting with this property:

  1. Deleting state would be done via this.State = default!; (which is null for reference types).
  2. Overriding InitializeState and setting to null is also allowed - it is then up to users to decide how null state behaves.

So, which is better? TState? to show that it can be null, and then users need to use !. everywhere. OR TState to avoid the ! because it is typically not null.

Copy link
Member

@sebastianburckhardt sebastianburckhardt Aug 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about Optional<TState>?

Since the backend really does make a distinction between "not present" (meaning there is no record for this entity in storage) and "default value" it may be just be safer to maintain that distinction throughout the API. Otherwise weird things can happen, especially for types where the default value is not null.

We could also just copy how Optional<T> is implemented and have a boolean HasState

Copy link
Member

@sebastianburckhardt sebastianburckhardt Aug 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see that Optional<TState> would be readonly so we can't just use that one directly. But here is something similar that could work I think:

  1. a boolean property HasState which determines whether the entity has state or not, and
  2. an always non-null TState State
  3. An InitializeState that returns an always non-null state
    and then call InitializeState implicitly whenever users access the State getter and HasState=false

Copy link
Member Author

@jviau jviau Aug 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per our offline discussion, I have switched back to TState from TState?. We will not use an optional type and have an experience of only deleting state when it is null (and not default of a struct).

We will add a future implicit delete operation which will delete state for reference or value type.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will add a future implicit delete operation which will delete state for reference or value type.

Can we add this operation now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the implicit delete code ready for another PR.

src/Abstractions/Entities/TaskEntity.cs Outdated Show resolved Hide resolved
public abstract void DeleteState();
/// <param name="type">The type to retrieve the state as.</param>
/// <returns>The entity state.</returns>
public abstract object? GetState(Type type);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For clarity, it may be nice to document how GetState handles default states, in particular for non-object types.
For example, if I do GetState(typeof(int)) on an entity whose state is null (i.e. it has not been set, or it has been set to zero), then we would expect GetState to return 0 ?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option would be simply to support only nullable types for entities, then we don't have this strange corner case.

I think I prefer this more restrictive solution.... there is something very strange about setting an integer entity to zero and then see it disappear from storage. It is clever, but this will invariably confuse a lot of people.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the object? return type was clear that if state is not present, you get null. I can clarify that in a comment. We already have this same behavior with DataConverter.

An alternative is bool TryGetState<T>(out T? state), but this then becomes ambiguous in the false case: was there no state to begin with, or did the state not map to T? Throwing from here violates the expectations of the Try* pattern. I think will just clarify the comment for GetState.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the object? return type was clear that if state is not present, you get null

The problem with that is that you get strange behaviors. For example, for TaskEntity<int>: after the user assigns this.State = 0 then this.State == null which looks broken (not logically sound). The reason is that assigning the default value (0 is the default value of int) is considered deletion and therefore removes the entity state completely.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per our offline discussion, setting TState to default of a struct (0 for an int in this example), will not delete state. Users will need to declare TaskEntity<int?> and then assign this.State = null. Otherwise, they can implement a Reset operation which performs this.State = this.InitializeState(); - this does not "delete" state, but resets the entity to be identical to as if state was null as far as the user is concerned.

We will also add a built in delete operation for them to rely on.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to clarify: reset is not identical to delete as far as the user is concerned because a reset entity will still be returned by subsequent queries, while a deleted entity is not.

Our users definitely care about this distinction. we had endless discussions about this and many issues, e.g. Azure/azure-functions-durable-extension#931 (comment).
The lesson we learnt is (a) users struggle to discover how to delete entities unless we tell them very clearly how to do that, and (b) the runtime needs to reliably remove deleted entities from subsequent entity queries.

src/Abstractions/Entities/TaskEntityHelpers.cs Outdated Show resolved Hide resolved
foreach (ParameterInfo parameter in parameters)
this.Operation = Check.NotNull(operation);
object? state = operation.Context.GetState(typeof(TState));
this.State = state is null ? default! : (TState)state;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is an important feature still missing here, regarding automatic first initialization of the entity state.

For most applications, the first time an entity is accessed, its state has to be somehow constructed/initialized.

Of course one could write code for this in every operation, but that would be very inconvenient and error-prone for the user. With the design proposed here, users would be essentially forced to do this (or add their own extensions that do it).

For reference, in the current DF interface we accommodate this "convenient API for initial creation" by

a. (for the function-based API) support specifying an initializer when calling GetState, and
b. (for the object-based API) automatically call the default constructor if the entity is null.

I think we need something to support convenient initial creation in this API as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a virtual method to TaskEntity to initialize state. This should cover the scenario.

/// Gets a value indicating whether dispatching operations to <see cref="State"/> is allowed. State dispatch
/// will only be attempted if entity-level dispatch does not succeed. Default is <c>false</c>.
/// </summary>
protected virtual bool AllowStateDispatch => false;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sebastianburckhardt want to draw your attention to this. I set the default to false because I feel like that is safer. Don't want users accidentally dispatching to state when they don't realize this property & behavior exists.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting this to false by default makes sense for the way this interface is currently defined. However, I think there is still a gap here.

I think the dilemma here is about whether we are trying to provide (a) a flexible interface that is constructed logically and can support many different types of TState and (b) an interface that aggressively minimizes boilerplate for the most common and simplest use where the user wants to just write the entity just as a simple single class with methods to define the operations, i.e. they only want to look at the class TState and have the existence of TaskEntity<TState> be an implementation detail that gets hidden in some minimal boilerplate or perhaps even hidden completely at some point.

As currently designed, TaskEntity<TState> matches the requirements for (a), but not (b). I realize it may not be possible to satisfy both requirements at the same time..

But we really also need something for (b) because it just seems to matter the most in my experience - it is how we want to write all our samples and benchmarks. I remember that for DF we designed multiple APIs but when I was writing applications it turned out I only ever wanted the single-class one because the code looks so much nicer that way. Having the state indirection is a pretty big hassle IMO, and forcing our users to do that is a step backwards (allowing them to do it is fine).

So, I still think we need to somehow find a way to make that simple single-class experience with minimal boilerplate possible. Maybe use a derived class ClassBasedEntity<TClass> and then constrain all the choices to keep things simple:

  • TClass must be a class with a default constructor
  • InitializeState calls the default constructor
  • DeleteState remains available on the context so entities can delete themselves
  • All operations are dispatched to the inner class first, then the outer class

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the current priorities I would be o.k. with delaying the design and implementation of this "easier to use" single-class version and focus on the more verbose and flexible TaskEntity for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per our offline discussion, current experience is sufficient for now. All of these scenarios are possible, but might not be a 1 to 1 with entities in-proc. Which is OK, we can expect some changes during migration. We will revisit this later based on user demand.

public abstract object? GetState(Type type);

/// <summary>
/// Sets the entity state. Setting of <c>null</c> will clear entity state.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest 'delete' instead of 'clear' since it will delete records from storage, not just reset the entity to its initial state. This makes a difference, for example when querying entities.

/// Sets the entity state. Setting of <c>null</c> will clear entity state.
/// </summary>
/// <param name="state">The state to set.</param>
public abstract void SetState(object? state);
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to add an overload method for deleting state:

public void DeleteState() { this.SetState(null); }

It's redundant (and we can say that in the comment) but when I write or read code I would much rather use DeleteState() than SetState(null) since it makes it much more explicit what is happening (records being removed from storage and no longer returned by queries).

Copy link
Member

@sebastianburckhardt sebastianburckhardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After extensive offline discussions, I think we have converged on a design. Some additional changes will be done in separate PRs.

@jviau jviau merged commit ca8d7ad into microsoft:feature/entities Aug 25, 2023
2 checks passed
@jviau jviau deleted the entity-state branch August 25, 2023 17:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants