-
Notifications
You must be signed in to change notification settings - Fork 652
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-2821: Support for Timeouts on Updates #2822
Conversation
fa6b5bf
to
3f50e50
Compare
return thisBuilder(); | ||
} | ||
|
||
public Y set(Symbol symbol, boolean value) { | ||
ensureContext(); | ||
this.context.set(symbol, value); | ||
this.contextAcc.set(symbol, value); | ||
return thisBuilder(); | ||
} | ||
|
||
private void ensureContext() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The empty ensureContext method is akin to ExecHTTPBuilder. It could be removed there and in this file (ExecUpdateHTTPBuilder
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good idea to remove it. "Later" rarely happens.
dd2ef56
to
1db78a5
Compare
@@ -537,7 +537,7 @@ protected static Iterator<Binding> evalBindings(Query query, DatasetGraph dsg, B | |||
} | |||
|
|||
// Not QueryExecDataset.dataset(...) because of initialBinding. | |||
QueryExecDatasetBuilder builder = QueryExecDatasetBuilder.create().dataset(dsg).query(query); | |||
QueryExecDatasetBuilder builder = QueryExecDatasetBuilder.create().dataset(dsg).query(query).context(context); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So far that context was never passed from the update engine to the query engine - so non-global custom settings (function registries, timeouts, etc) were lost here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment on the main PR. This is potentially something that impacts users and we need to consider migration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is now an updateRemainingQueryTimeout that will set ARQ.queryTimeout based on the value of ARQ.updateTimeout and the time spent in the UpdateEngineWorker.
This PR is a good step forward. This PR is really two things:
The first is as little as move one line fix :-) Having timeouts on update is good to have and the way it is done in this PR, timing out the WHERE clause of a modify There are various operations for updates and a general timeout on an update isn't a one change in one place. Update can be several operations all in one request so there could an overall timeout (caveat the issue of various operations). The PR does impact on applications. This is not to say I think that it is bad (or good) idea but to draw out a consequence. If I've read the PR with it comes down to this one line: pass the context to the WHERE evaluation. If a global timeout for queries is set, then that will be picked up by the update execution where it wasn't 5.2 and earlier. For Fuseki, I would guess that timeout is usually per-endpoint; it is possible to set it database or server wide as well. For an application using Jena as a library, an application wide timeout might be present and that would impact updates with this PR. |
jena-arq/src/test/java/org/apache/jena/sparql/api/TestQueryExecutionCancel.java
Outdated
Show resolved
Hide resolved
jena-arq/src/test/java/org/apache/jena/sparql/api/TestQueryExecutionCancel.java
Outdated
Show resolved
Hide resolved
jena-arq/src/test/java/org/apache/jena/sparql/api/TestQueryExecutionCancel.java
Outdated
Show resolved
Hide resolved
return thisBuilder(); | ||
} | ||
|
||
public Y set(Symbol symbol, boolean value) { | ||
ensureContext(); | ||
this.context.set(symbol, value); | ||
this.contextAcc.set(symbol, value); | ||
return thisBuilder(); | ||
} | ||
|
||
private void ensureContext() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good idea to remove it. "Later" rarely happens.
jena-arq/src/main/java/org/apache/jena/http/sys/ExecUpdateHTTPBuilder.java
Outdated
Show resolved
Hide resolved
jena-arq/src/main/java/org/apache/jena/sparql/exec/UpdateExecDatasetBuilder.java
Show resolved
Hide resolved
jena-arq/src/main/java/org/apache/jena/sparql/exec/http/UpdateExecHTTP.java
Outdated
Show resolved
Hide resolved
jena-arq/src/test/java/org/apache/jena/sparql/api/TestQueryExecutionCancel.java
Outdated
Show resolved
Hide resolved
jena-arq/src/test/java/org/apache/jena/sparql/api/TestQueryExecutionCancel.java
Outdated
Show resolved
Hide resolved
@@ -537,7 +537,7 @@ protected static Iterator<Binding> evalBindings(Query query, DatasetGraph dsg, B | |||
} | |||
|
|||
// Not QueryExecDataset.dataset(...) because of initialBinding. | |||
QueryExecDatasetBuilder builder = QueryExecDatasetBuilder.create().dataset(dsg).query(query); | |||
QueryExecDatasetBuilder builder = QueryExecDatasetBuilder.create().dataset(dsg).query(query).context(context); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment on the main PR. This is potentially something that impacts users and we need to consider migration.
cxt.set(ARQ.queryTimeout, cxt.get(ARQ.updateTimeout))
I'd say that the basic updateTimeout setting should affect the overall timeout of an update request - not per sub-update - so the remaining time would have to be passed on to the query exec. Hm, in this case the context would have to be mutated or copied. The suggestions should be backward compatible with the existing behavior (but may require a change for computing the effective timeout in the query exec). |
f433c17
to
375244b
Compare
try { | ||
InputStream in = retainedConnection; | ||
if (in != null) { | ||
in.close(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose this will just hang trying to consume any remaining data. Perhaps a test case could be created against the Fuseki server.
2dcfb44
to
e1e54e2
Compare
I have consolidated the timeout logic (initial and overall) into the The update timeouts are now only used to update ARQ.queryTimeout in the context. Conversely, the update timeouts do not affect the sinks: In principle a sink could also refuse to accept changes once the timeout is reached - but if that's really needed it could be a future update. |
9c84c57
to
d9cbb32
Compare
|
||
/** Standard update syntax with substitution should work when comments are involved. */ | ||
@Test | ||
public void standard_syntax_update_remote_2b() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strictly speaking, this additional test case could go into a separate commit. It tests for an oversight I introduced with #2580 where update strings were joined on the semicolon without a preceding newline. In such cases, joining update strings that ended on a comment would produce syntax errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem having it here.
Using "rebase and merge" means the git history doesn't retain the branch structure.
d8ae6d9
to
b276f97
Compare
@@ -70,7 +70,7 @@ protected QueryIteratorBase(AtomicBoolean cancelSignal) { | |||
} | |||
|
|||
private boolean requestingCancel() { | |||
return requestingCancel != null && requestingCancel.get() ; | |||
return (requestingCancel != null && requestingCancel.get()) || Thread.interrupted() ; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The additional interrupted check is needed in order to cleanly stop the Fuseki server in TestSPARQLProtocolTimeout
which extends AbstractFusekiTest
: When shutting down the Fuseki server, Jetty attempts to stop the remaining threads via interruption. However, so far this flag was not picked up by the QueryIterator
machinery, causing jetty to abandon the thread and log the warning from org.eclipse.jetty.util.thread.QueuedThreadPool
:
17:27:10 WARN QueuedThreadPool :: Couldn't stop Thread[qtp1495161082-24,5,main]
I updated this PR's summary in the first post. The main changes are the introduction of It's ready for another review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
A couple of unused imports which I can sort out after the merge.
jena-arq/src/main/java/org/apache/jena/sparql/engine/iterator/QueryIterProcessBinding.java
Show resolved
Hide resolved
import org.apache.jena.query.ARQ; | ||
import org.apache.jena.sparql.ARQConstants; | ||
import org.apache.jena.sparql.ARQException; | ||
import org.apache.jena.sparql.core.DatasetGraph; | ||
import org.apache.jena.sparql.engine.ExecutionContext; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unused import.
|
||
/** Standard update syntax with substitution should work when comments are involved. */ | ||
@Test | ||
public void standard_syntax_update_remote_2b() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem having it here.
Using "rebase and merge" means the git history doesn't retain the branch structure.
b276f97
to
ef354a5
Compare
I removed the two unused imports. |
Thank you! |
GitHub issue resolved #2821
Pull request Description:
Fixed inconsistent context handling in the update builder machinery and made it consistent with that in the query builder machinery.
The context of update requests is now forwarded to the query execution
UpdateEngineWorker
sets the remaining query timeout if the newly introducedARQ.updateTimeout
has been set. Note, that it is possible to set an initial timeout via theARQ.updateTimeout
from which internally anARQ.queryTimeout
setting is derived: An update request would abort if the corresponding query execution timed out on the initial timeout - so only updates with aWHERE
clause are affected.I did not add
initialTimeout
methods to the update exec builders because perhaps initial timeout on updates are a bit to specific for the general interface - opinions?Added
UpdateProcessor.getContext()
for consistency withQueryExec
. This is for discussion.Added
UpdateProcessor.abort()
which can abort the underlying query execution if it is based onQueryIterator
.Added test cases to the
TestUpdateExecutionCancel
class.For HTTP-based updates, the overall update timeout is forwarded to the existing HTTP machinery.
There is now a class
TestSPARQLProtocolTimeout
which tests setting the timeout on the update builder.QueryIteratorBase.requestingCancel
was updated to check theThread.interrupted()
so that when Fuseki gets shut down (e.g. during unit testing) then any threads still busy executing queries can cancel execution in a timely manner.Added query/update tests that register a local function that checks for the availability of the cancel signal. Please also see the test cases which register the function that checks for the cancel signal in three ways for both queries and updates:
By submitting this pull request, I acknowledge that I am making a contribution to the Apache Software Foundation under the terms and conditions of the Contributor's Agreement.
See the Apache Jena "Contributing" guide.