Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-2821: Support for Timeouts on Updates #2822

Merged
merged 1 commit into from
Nov 18, 2024

Conversation

Aklakan
Copy link
Contributor

@Aklakan Aklakan commented Nov 5, 2024

GitHub issue resolved #2821

Pull request Description:

  • Fixed inconsistent context handling in the update builder machinery and made it consistent with that in the query builder machinery.

  • The context of update requests is now forwarded to the query execution

  • UpdateEngineWorker sets the remaining query timeout if the newly introduced ARQ.updateTimeout has been set. Note, that it is possible to set an initial timeout via the ARQ.updateTimeout from which internally an ARQ.queryTimeout setting is derived: An update request would abort if the corresponding query execution timed out on the initial timeout - so only updates with a WHERE clause are affected.

  • I did not add initialTimeout methods to the update exec builders because perhaps initial timeout on updates are a bit to specific for the general interface - opinions?

  • Added UpdateProcessor.getContext() for consistency with QueryExec. This is for discussion.

  • Added UpdateProcessor.abort() which can abort the underlying query execution if it is based on QueryIterator.

  • Added test cases to the TestUpdateExecutionCancel class.

  • For HTTP-based updates, the overall update timeout is forwarded to the existing HTTP machinery.

  • There is now a class TestSPARQLProtocolTimeout which tests setting the timeout on the update builder.

  • QueryIteratorBase.requestingCancel was updated to check the Thread.interrupted() so that when Fuseki gets shut down (e.g. during unit testing) then any threads still busy executing queries can cancel execution in a timely manner.

  • Added query/update tests that register a local function that checks for the availability of the cancel signal. Please also see the test cases which register the function that checks for the cancel signal in three ways for both queries and updates:

    1. via query/update builder.set
    2. via query/update builder.context
    3. via query/update exec.getContext()

  • Tests are included.
  • Commits have been squashed to remove intermediate development commit messages.
  • Key commit messages start with the issue number (GH-xxxx)

By submitting this pull request, I acknowledge that I am making a contribution to the Apache Software Foundation under the terms and conditions of the Contributor's Agreement.


See the Apache Jena "Contributing" guide.

@Aklakan Aklakan force-pushed the 2024-11-05_cancel-signal-fix branch from fa6b5bf to 3f50e50 Compare November 5, 2024 17:10
@afs afs self-assigned this Nov 5, 2024
return thisBuilder();
}

public Y set(Symbol symbol, boolean value) {
ensureContext();
this.context.set(symbol, value);
this.contextAcc.set(symbol, value);
return thisBuilder();
}

private void ensureContext() {
Copy link
Contributor Author

@Aklakan Aklakan Nov 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The empty ensureContext method is akin to ExecHTTPBuilder. It could be removed there and in this file (ExecUpdateHTTPBuilder).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good idea to remove it. "Later" rarely happens.

@Aklakan Aklakan force-pushed the 2024-11-05_cancel-signal-fix branch 2 times, most recently from dd2ef56 to 1db78a5 Compare November 6, 2024 14:08
@@ -537,7 +537,7 @@ protected static Iterator<Binding> evalBindings(Query query, DatasetGraph dsg, B
}

// Not QueryExecDataset.dataset(...) because of initialBinding.
QueryExecDatasetBuilder builder = QueryExecDatasetBuilder.create().dataset(dsg).query(query);
QueryExecDatasetBuilder builder = QueryExecDatasetBuilder.create().dataset(dsg).query(query).context(context);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far that context was never passed from the update engine to the query engine - so non-global custom settings (function registries, timeouts, etc) were lost here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment on the main PR. This is potentially something that impacts users and we need to consider migration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is now an updateRemainingQueryTimeout that will set ARQ.queryTimeout based on the value of ARQ.updateTimeout and the time spent in the UpdateEngineWorker.

@afs
Copy link
Member

afs commented Nov 7, 2024

This PR is a good step forward.

This PR is really two things:

  • The bug about manual query execution abort
  • A new feature - having timeouts on update.

The first is as little as move one line fix :-)

Having timeouts on update is good to have and the way it is done in this PR, timing out the WHERE clause of a modify DELETE-INSERT-WHERE, seems the best way (I'd played around with update timeout ... but stalled on that) because it protects against long-running WHERE.

There are various operations for updates and a general timeout on an update isn't a one change in one place. Update can be several operations all in one request so there could an overall timeout (caveat the issue of various operations).

The PR does impact on applications. This is not to say I think that it is bad (or good) idea but to draw out a consequence.

If I've read the PR with it comes down to this one line: pass the context to the WHERE evaluation. If a global timeout for queries is set, then that will be picked up by the update execution where it wasn't 5.2 and earlier.

For Fuseki, I would guess that timeout is usually per-endpoint; it is possible to set it database or server wide as well.

For an application using Jena as a library, an application wide timeout might be present and that would impact updates with this PR.

return thisBuilder();
}

public Y set(Symbol symbol, boolean value) {
ensureContext();
this.context.set(symbol, value);
this.contextAcc.set(symbol, value);
return thisBuilder();
}

private void ensureContext() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good idea to remove it. "Later" rarely happens.

@@ -537,7 +537,7 @@ protected static Iterator<Binding> evalBindings(Query query, DatasetGraph dsg, B
}

// Not QueryExecDataset.dataset(...) because of initialBinding.
QueryExecDatasetBuilder builder = QueryExecDatasetBuilder.create().dataset(dsg).query(query);
QueryExecDatasetBuilder builder = QueryExecDatasetBuilder.create().dataset(dsg).query(query).context(context);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment on the main PR. This is potentially something that impacts users and we need to consider migration.

@Aklakan
Copy link
Contributor Author

Aklakan commented Nov 8, 2024

If a global timeout for queries is set, then that will be picked up by the update execution where it wasn't 5.2 and earlier.

  • One option would be to introduce ARQ.updateTimeout and have the UpdateEngineWorker (or perhaps already at updateExecBuilder.build()) do
cxt.set(ARQ.queryTimeout, cxt.get(ARQ.updateTimeout))
  • Furthermore, there could be an additional ARQ.requestType context symbol for query and update requests (and perhaps dataset ones; c.f. LinkDataset). And the appropriate timeout value is picked based on the context's request type which might be cleaner than having to mutate the context - i.e. no need to copy the updateTimeout value to queryTimeout.

I'd say that the basic updateTimeout setting should affect the overall timeout of an update request - not per sub-update - so the remaining time would have to be passed on to the query exec. Hm, in this case the context would have to be mutated or copied.

The suggestions should be backward compatible with the existing behavior (but may require a change for computing the effective timeout in the query exec).

@Aklakan Aklakan marked this pull request as draft November 10, 2024 21:22
@Aklakan Aklakan force-pushed the 2024-11-05_cancel-signal-fix branch 6 times, most recently from f433c17 to 375244b Compare November 11, 2024 12:34
try {
InputStream in = retainedConnection;
if (in != null) {
in.close();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose this will just hang trying to consume any remaining data. Perhaps a test case could be created against the Fuseki server.

@Aklakan Aklakan force-pushed the 2024-11-05_cancel-signal-fix branch 2 times, most recently from 2dcfb44 to e1e54e2 Compare November 11, 2024 14:52
@Aklakan Aklakan marked this pull request as ready for review November 11, 2024 14:53
@Aklakan
Copy link
Contributor Author

Aklakan commented Nov 11, 2024

I have consolidated the timeout logic (initial and overall) into the Timeouts class.
There is now ARQ.updateTimeout. The UpdateEngineWorker tracks its own elapsed processing time and updates its context with the remaining query time out before starting the underlying query execution.

The update timeouts are now only used to update ARQ.queryTimeout in the context.
If no update timeout is set then the UpdateEngineWorker will unset the query timeout which should be consistent with the previous behavior.

Conversely, the update timeouts do not affect the sinks: In principle a sink could also refuse to accept changes once the timeout is reached - but if that's really needed it could be a future update.

@Aklakan Aklakan changed the title GH-2821: Cancel Signal unavailable without timeouts - breaks manual abort. GH-2821: Support for Timeouts on Updates Nov 11, 2024
@Aklakan Aklakan force-pushed the 2024-11-05_cancel-signal-fix branch 3 times, most recently from 9c84c57 to d9cbb32 Compare November 16, 2024 16:31

/** Standard update syntax with substitution should work when comments are involved. */
@Test
public void standard_syntax_update_remote_2b() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strictly speaking, this additional test case could go into a separate commit. It tests for an oversight I introduced with #2580 where update strings were joined on the semicolon without a preceding newline. In such cases, joining update strings that ended on a comment would produce syntax errors.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem having it here.
Using "rebase and merge" means the git history doesn't retain the branch structure.

@Aklakan Aklakan force-pushed the 2024-11-05_cancel-signal-fix branch 2 times, most recently from d8ae6d9 to b276f97 Compare November 17, 2024 16:44
@@ -70,7 +70,7 @@ protected QueryIteratorBase(AtomicBoolean cancelSignal) {
}

private boolean requestingCancel() {
return requestingCancel != null && requestingCancel.get() ;
return (requestingCancel != null && requestingCancel.get()) || Thread.interrupted() ;
Copy link
Contributor Author

@Aklakan Aklakan Nov 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The additional interrupted check is needed in order to cleanly stop the Fuseki server in TestSPARQLProtocolTimeout which extends AbstractFusekiTest: When shutting down the Fuseki server, Jetty attempts to stop the remaining threads via interruption. However, so far this flag was not picked up by the QueryIterator machinery, causing jetty to abandon the thread and log the warning from org.eclipse.jetty.util.thread.QueuedThreadPool:

17:27:10 WARN  QueuedThreadPool :: Couldn't stop Thread[qtp1495161082-24,5,main]

@Aklakan
Copy link
Contributor Author

Aklakan commented Nov 18, 2024

I updated this PR's summary in the first post. The main changes are the introduction of ARQ.updateTimeout and the additional cancel check based on Thread.interrupted() in QueryIteratorBase.requestingCancel().

It's ready for another review.

Copy link
Member

@afs afs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
A couple of unused imports which I can sort out after the merge.

import org.apache.jena.query.ARQ;
import org.apache.jena.sparql.ARQConstants;
import org.apache.jena.sparql.ARQException;
import org.apache.jena.sparql.core.DatasetGraph;
import org.apache.jena.sparql.engine.ExecutionContext;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused import.


/** Standard update syntax with substitution should work when comments are involved. */
@Test
public void standard_syntax_update_remote_2b() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem having it here.
Using "rebase and merge" means the git history doesn't retain the branch structure.

@Aklakan
Copy link
Contributor Author

Aklakan commented Nov 18, 2024

I removed the two unused imports.

@afs afs merged commit 6313965 into apache:main Nov 18, 2024
@afs
Copy link
Member

afs commented Nov 18, 2024

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support for Timeout on Updates
2 participants