The apoc.path.expand procedure doesn't stop if dbms.memory.transaction.database_max_size is reached #56

Closed
neo-technology-build-agent opened this issue Sep 1, 2022 · 2 comments

Comments

@neo-technology-build-agent
Collaborator

Issue by vga91
Wednesday Oct 27, 2021 at 09:48 GMT
Originally opened as neo4j-contrib/neo4j-apoc-procedures#2264


With dbms.memory.transaction.database_max_size set, a query that uses apoc.path.expand can still exhaust the heap.
The query should be terminated once the memory it uses exceeds the limit defined in dbms.memory.transaction.database_max_size.

Steps to Reproduce

  1. Deploy Neo4j 4.3 and set dbms.memory.transaction.database_max_size=10M
  2. Create a graph using the following:
CREATE CONSTRAINT a_id IF NOT EXISTS ON (a:A) ASSERT a.id IS UNIQUE;
WITH range(0,9) AS ids
UNWIND ids as id
MERGE (a:A {id: id})
SET a.prop = toInteger(10 * rand())
RETURN a;
WITH range(0,0) AS ids
UNWIND ids as id
MATCH (a1:A), (a2:A)
WHERE a1 <> a2
MERGE (a1)-[r:REL {id: id}]->(a2);
  3. Query the graph using apoc.path.expand:
MATCH (srcA:A {id:0})
CALL apoc.path.expand(srcA, ">REL", "+A", 1, 100)
YIELD path
WITH srcA,
relationships(path)[0] AS srcRel,
relationships(path)[length(path) - 1] AS dstRel,
nodes(path)[length(path)] AS dstA,
path AS path
RETURN DISTINCT srcA.id, srcA.prop, dstA.id, dstA.prop
LIMIT 1000;
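
To confirm that the graph from step 2 has the expected shape, a quick count can help. This is a minimal sketch, assuming the two creation statements above ran against an otherwise empty database. With 10 :A nodes and one REL relationship for every ordered pair of distinct nodes, the counts should come out as 10 and 90.

```cypher
// Count the :A nodes created in step 2 (expected: 10).
MATCH (a:A)
RETURN count(a) AS nodeCount;

// Count the directed REL relationships between them (expected: 10 * 9 = 90).
MATCH (:A)-[r:REL]->(:A)
RETURN count(r) AS relCount;
```

Because every node is connected to every other node in both directions, the number of distinct paths of length 1 to 100 starting from a single node is very large, which is presumably why the expansion in step 3 consumes so much memory even on this small graph.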

Observed Behaviour

The heap is consumed, which triggers long GC pauses, and system performance degrades significantly.
The query is not terminated, even though dbms.memory.transaction.database_max_size is set.

Expected Behaviour

As per the documentation on dbms.memory.transaction.database_max_size, the amount of memory the query can use should be restricted. If the configured threshold is breached, the query should be terminated.

@neo-technology-build-agent
Collaborator Author

Comment by jexp
Thursday Oct 28, 2021 at 08:58 GMT


I don't think this is the responsibility of APOC, but rather of the traversal framework, the procedure infrastructure, or the overall database memory pool handling (e.g. where node/rel objects are created and handed out).

Also, if we start doing it here, it would mean that we have to do it for all 650 APOC procedures and functions.

Also, as it requires a KernelTransaction to be injected, each procedure/function would have to be put on the allowlist for sandboxed execution, which defeats the purpose for all the "regular" functionality that APOC offers and is also a breaking change.

@hvub
Collaborator

hvub commented Nov 19, 2024

This example can also be expressed without APOC using QPPs by replacing
CALL apoc.path.expand(srcA, ">REL", "+A", 1, 100)
with
MATCH path = (srcA) ( ()-[:REL]->(:A) ){1, 100}
This should work for most apoc.path.expand calls. Some may be syntactically a bit more involved than this particular example.
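
Applied to the full reproduction query, the replacement might look roughly like the following. This is a sketch rather than a tested drop-in: it assumes a Neo4j version that supports quantified path patterns (newer than the 4.3 used in the original report), and it drops the srcRel and dstRel projections because the original query never returns them.

```cypher
// Anchor the start node exactly as in the original query.
MATCH (srcA:A {id: 0})
// Expand 1 to 100 outgoing REL hops, each landing on an :A node.
MATCH path = (srcA) ( ()-[:REL]->(:A) ){1, 100}
// The last node of the path is the destination.
WITH srcA, nodes(path)[length(path)] AS dstA
RETURN DISTINCT srcA.id, srcA.prop, dstA.id, dstA.prop
LIMIT 1000;
```

Since the expansion is then planned and executed by Cypher itself rather than inside a procedure call, its memory use should be visible to the transaction memory tracking, which is the point of the suggestion.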

Also note, as pointed out here, that "any local memory within a CALL is not counted towards the transaction memory limit due to current limitations in our procedure API". This may improve in the future.
