Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhance Managed_Resource to allow implementation of in-memory caches #11577

Open
wants to merge 24 commits into
base: develop
Choose a base branch
from

Conversation

JaroslavTulach
Copy link
Member

@JaroslavTulach JaroslavTulach commented Nov 18, 2024

Pull Request Description

Allows connection between refresh button and caches as requested by #11485:

  • please use Managed_Resource.new system_resource=True to create a reference that "can disappear"
  • please call EnsoContext.getResourceManager().scheduleFinalization() to clear the caches when the refresh button is pressed

ResourceManager provides scheduleFinalization method that allows cleanup of "system managed" references

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

  • The documentation has been updated, if necessary.
  • All code follows the
    Scala,
    Java,
  • Unit tests have been written where possible.

@JaroslavTulach JaroslavTulach added the CI: Clean build required CI runners will be cleaned before and after this PR is built. label Nov 18, 2024
@JaroslavTulach JaroslavTulach self-assigned this Nov 18, 2024
@jdunkerley
Copy link
Member

jdunkerley commented Nov 18, 2024

How do we create a "static" instance of one of these?

Guess we just put one into Java as usual.

@JaroslavTulach JaroslavTulach changed the title Using Ref.new allow_gc=True to implement in-memory caches Using Ref.new allow_gc=True to implement in-memory caches Nov 18, 2024

Arguments:
- value: The value to be contained in the ref.
- allow_gc: Is the value eligible for being cleared by the system at any time?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we replace this with an atom type reference_type?

e.g.

type Reference_Type
    Weak
    Regular

Then if we ever want to also add support for soft references, it is just the matter of adding yet another variant to Reference_Type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now I am thinking about:

type Ref_Auto_Clean
       Never # regular reference
       Explicit # only per user request
       Soon # using `WeakReference` disappears as soon as "possible"
       Managed estimated_memory_size:Integer time_in_ms_to_recompute:Integer # automatic when needed

the ..Explicit cleaning policy would solve @GregoryTravis's

the reference ... collected, then I know that the refresh button has been clicked

We would connect cleanup of ..Explicit directly to refresh button click and no GC.

Or we could map all above types to estimates of memory and recomputation time:

  • Never - zero memory, maximal time
  • Explicit - zero memory, zero time
  • Soon - some memory, zero time
  • Managed - some memory and some time

Copy link
Member

@Akirathan Akirathan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting idea. LGTM. I am a bit afraid about transient failures in the RefTest.

@GregoryTravis
Copy link
Contributor

@JaroslavTulach I'll get this connected to my cache and write a test for it.

@GregoryTravis
Copy link
Contributor

GregoryTravis commented Nov 26, 2024

@JaroslavTulach
I implemented the logic we discussed -- my cache checks a reference to see if it has been cleared. The reference is created here.

@4e6
I modified RecomputeContextCmd to clear weak references on reload. This is just a proof-of-concept, I don't know if I did it correctly.

@JaroslavTulach JaroslavTulach linked an issue Nov 27, 2024 that may be closed by this pull request
@JaroslavTulach JaroslavTulach changed the title Using Ref.new allow_gc=True to implement in-memory caches Enhance Managed_Resource to allow implementation of in-memory caches Nov 28, 2024
private Value ref:Managed_Resource

new obj -> Cache =
on_finalize _ = 0
Copy link
Member Author

@JaroslavTulach JaroslavTulach Nov 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@GregoryTravis - At the end I decided to use enhanced Managed_Resource as it already has support for GC and finalizers. Feel free to implement a cleanup logic in this on_finalize method.

If things seem to work for you and I get approvals from engine guys, let's integrate!

register : Any -> (Any -> Nothing) -> Managed_Resource
register resource function = @Builtin_Method "Managed_Resource.register"
register : Any -> (Any -> Nothing) -> Boolean -> Managed_Resource
register resource function system_finalization_allowed=Boolean.False =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
register resource function system_finalization_allowed=Boolean.False =
register resource function system_finalization_allowed=False =

I think the preferred convention for importing Boolean is:

from project.Data.Boolean import Boolean, False, True

Value returned from the `action` or `Nothing` if the managed resource
was already finalized
with : (Any -> Any) -> Any -> Any
with self ~action ~on_missing=Nothing = with_builtin self action on_missing...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be much better to default to returning a dataflow error, or perhaps even a panic because it is an internal error.

Suggested change
with self ~action ~on_missing=Nothing = with_builtin self action on_missing...
with self ~action ~on_missing=(Panic.throw (Illegal_State.Error "The Managed_Resource is used after it has been finalized.")) = with_builtin self action on_missing...

Defaulting to Nothing may make it too easy for "usage after finalize" to go unnoticed and lead to hard to debug issues.

* @see #register
*/
@CompilerDirectives.TruffleBoundary
public final synchronized void scheduleFinalization() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick, but IMO scheduleFinalizationOfSystemResources could be better

Suggested change
public final synchronized void scheduleFinalization() {
public final synchronized void scheduleFinalizationOfSystemResources() {

The name scheduleFinalization sounds like it may finalize all references.

Documentation clarifies this, so I'm ok to keep it as is though.

Comment on lines +16 to +36
/**
* An Enso runtime representation of a managed resource.
*
* <p><b>Instance of this class</b> is convoluted with instances playing various roles:
*
* <ul>
* <li>this {@link ManagedResource} points to {@code resource}
* <li>this {@link ManagedResource} points to {@link PhantomReference} that is "phantom
* referencing" {@code this}
* <li>the implementation of {@link PhantomReference} is {@code Item} in {@link ResourceManager}
* <li>the {@code Item} <em>"phantom referencing"</em> {@code this} {@link ManagedResource} is
* stored in {@link ResourceManager} {@code pendingItems} collection.
* <li>the {@code Item} has a pointer to the {@code resource} as well
* <li>the {@code Item} has a pointer to the {@code finalizer} function
* </ul>
*
* Once all this braided chunk of objects is eligible for GC because <b>nobody holds pointer to
* {@link ManagedResource}</b>, the {@code Item} is put into {@link ResourceManager} {@code
* referenceQueue} and process by the intricate machinery of {@link ResourceManager} and its {@code
* ProcessItems} processor.
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very appreciated

Copy link
Member

@radeusgd radeusgd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good but I'd change the default on_missing to something more 'visible' than Nothing. We are having plenty of issues caused by accidentally discarding errors, so it's better if we program more 'defensively' against this.

Copy link
Member

@Akirathan Akirathan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I have created #11714 to potentially remove now even more deprecated AutomaticParallelism IR pass and Name.Special IR elements.

Value returned from the `action` or `Nothing` if the managed resource
was already finalized
with : (Any -> Any) -> Any -> Any
with self ~action ~on_missing=Nothing = with_builtin self action on_missing...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remind me, what is the semantics of on_missing...?

resourceManager.unpark(mr);
}
} else {
return onMissing;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the docs of Managed_Resource.with, on_missing is "an optional action", not a constant. Clarify and/or add a test. If it is an action, it should be suspended thunk and called appropriatelly. Maybe even add @Suspended annotation to the parameter?

@@ -1963,16 +1962,6 @@ class IrToTruffle(
"a Self occurence must be resolved"
).target
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By removing handling of Special names here, and removing NewRefNode and co., you are making AutomaticParallelism IR pass even more obsolete than it was before (which seems like a good thing because I am convinced that AutomaticParallelism has never actually done anything and will not even work if enabled). It is worth thinking about getting rid of that pass, along with the Name.Special IR definitions altogether.

}

@Test
public void regularReference() throws Exception {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After copy-pasting from RefTest.java, please, at least rename names of the tests, as well as the private static fields like newRef, createRef, etc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI: Clean build required CI runners will be cleaned before and after this PR is built.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refresh button should clear HTTP cache
7 participants