Upon closer inspection, the cfg_only boolean appears to be the reverse of what it should be. We would expect that:
If cfg_only is true, we keep only CFG nodes.
If cfg_only is false, we keep non-CFG nodes as well.
However, what this code actually does is the reverse, because of the not cfg_only condition. Is this intended behaviour? It results in graph_input_full missing many nodes, since it keeps only CFG nodes, which is not what "full" suggests.
To check, compare the output of the above with cfg_only = True and with cfg_only = False.
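A minimal sketch of the difference, in case it helps to reproduce the comparison. This is not the repository's code: the node records, the `is_cfg` field, and both function names are hypothetical and only illustrate the expected semantics against the inverted ones described here.

```python
# Hypothetical node records; the "is_cfg" field is an assumption for illustration.
nodes = [
    {"id": 0, "is_cfg": True},
    {"id": 1, "is_cfg": False},
    {"id": 2, "is_cfg": True},
    {"id": 3, "is_cfg": False},
]


def keep_nodes_expected(nodes, cfg_only):
    """Expected semantics: cfg_only=True drops non-CFG nodes, False keeps all."""
    kept = []
    for node in nodes:
        if cfg_only and not node["is_cfg"]:
            continue  # skip non-CFG nodes only when cfg_only is set
        kept.append(node)
    return kept


def keep_nodes_inverted(nodes, cfg_only):
    """Inverted semantics: the filter fires when cfg_only is NOT set."""
    kept = []
    for node in nodes:
        if not cfg_only and not node["is_cfg"]:
            continue  # non-CFG nodes are dropped in the supposedly "full" case
        kept.append(node)
    return kept


if __name__ == "__main__":
    for flag in (True, False):
        print(
            f"cfg_only={flag}:",
            "expected keeps", len(keep_nodes_expected(nodes, flag)),
            "| inverted keeps", len(keep_nodes_inverted(nodes, flag)),
        )
```

If the node counts for the two flag values are swapped relative to the "expected" column, the condition is inverted in the way described.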
tl;dr: which GGNN input should we generate to replicate the results? In the provided data, for example, data/ggnn_input/devign has cfg, cfg_dfg, and dfg, i.e. no AST. But in the paper, you mention using the CPG, which includes the AST.
I presume they use the Python clang API to parse each statement (node) in the CFG into an AST, because the AST nodes are to some extent independent of the CFG and PDG edges. I don't know whether this assumption is right.
In the following code:
ReVeal/data_processing/create_ggnn_data.py
Lines 301 to 303 in bef6c92
It seems like you're only using CFG nodes from Joern's output, and discarding the rest. Is this correct?