-
Notifications
You must be signed in to change notification settings - Fork 76
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Invalidate variables based on annotations of asm statements #715
Conversation
src/framework/constraints.ml
Outdated
@@ -675,7 +675,7 @@ struct | |||
| Entry f -> tf_entry var edge prev_node f | |||
| Ret (r,fd) -> tf_ret var edge prev_node r fd | |||
| Test (p,b) -> tf_test var edge prev_node p b | |||
| ASM (_, _, _) -> tf_asm var edge prev_node (* TODO: use ASM fields for something? *) | |||
| ASM _ -> tf_asm var edge prev_node (* TODO: use ASM fields for something? *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The relevant data should be taken from the ASM
constructor and passed as actual arguments instead of looking at ctx.edge
in the transfer function implementation.
Sorry we haven't provided feedback yet, we have an upcoming paper deadline on Friday AOE and will do so after for all your PRs 😄 |
@@ -42,7 +42,7 @@ let pretty () = function | |||
| Entry (f) -> Pretty.text "(body)" | |||
| Ret (Some e,f) -> Pretty.dprintf "return %a" dn_exp e | |||
| Ret (None,f) -> Pretty.dprintf "return" | |||
| ASM (_,_,_) -> Pretty.text "ASM ..." | |||
| ASM _ -> Pretty.text "ASM ..." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that we actually do something better than ignoring asm, it might be a good idea to also have the edge printout reflect the actual data.
|
||
asm("nop"); | ||
|
||
assert(x == 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test is named "discard-local", but in what sense is the local being discarded if it isn't invalidated here.
: "=r"(x)); | ||
} | ||
|
||
assert(x == 1); // UNKNOWN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unlike the UNKNOWN
in previous tests, which are simply because we don't analyze the asm instructions and assume the worst about nop
, if I understand this correctly, it's actually unknowable, not just due to our imprecision. Therefore:
assert(x == 1); // UNKNOWN | |
assert(x == 1); // UNKNOWN! |
If that's the case, then the same applies to following tests as well.
@@ -9,6 +9,7 @@ type t = | |||
| AssignSpawnedThread of lval * ThreadIdDomain.Thread.t (** Assign spawned thread's ID to lval. *) | |||
| Access of {var_opt: CilType.Varinfo.t option; write: bool} (** Access varinfo (unknown if None). *) | |||
| Assign of {lval: CilType.Lval.t; exp: CilType.Exp.t} (** Used to simulate old [ctx.assign]. *) | |||
| Invalidate of {addr_list: PreValueDomain.AD.bucket BatMap.Int.t list} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing this is the actual type without inlining the implementation detail about buckets:
| Invalidate of {addr_list: PreValueDomain.AD.bucket BatMap.Int.t list} | |
| Invalidate of {addr_list: PreValueDomain.AD.t list} |
"enabled": { | ||
"title": "sem.asm.enabled", | ||
"description": "", | ||
"type": "boolean", | ||
"default": true | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would be the use case for ever disabling asm semantics (and whatever that would mean)?
Wouldn't enabling the other options, which make additional safety assumptions about asm statements already give the old behavior, where possible changes from asm statements were completely ignored?
"basic-preserve-globals": { | ||
"title": "sem.asm.basic-preserve-globals", | ||
"description": "If enabled, do preserve globals across basic asm statements", | ||
"type": "boolean", | ||
"default": false | ||
}, | ||
"memory-preserve-globals": { | ||
"title": "sem.asm.memory-preserve-globals", | ||
"description": "If enabled, do preserve globals across advanced asm statements with memory clobber annotation", | ||
"type": "boolean", | ||
"default": false | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency, it might make sense to negate these options since the pre-existing sem.unknown_function.invalidate.globals
is about invalidating as opposed to preserving.
M.tracel "invalidate" "Setting addresses [%a] to values [%a]\n" (d_list ", " AD.pretty) addrs (d_list ", " VD.pretty) vs | ||
); | ||
set_many ?ctx ask gs st invalids' | ||
invalidate_core ~ctx false (List.map (fun exp -> (exp, InvalidateTransitive)) exps); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The inv_globs
argument for invalidate_core
should probably be named, otherwise it's confusing what this false
means here.
| Events.Invalidate {addr_list} -> | ||
let invalidate_address a = | ||
let t = AD.get_type a in | ||
let v = get ask ctx.global st a None in (* None here is ok, just causes us to be a bit less precise *) | ||
let nv = VD.invalidate_value ask t v in | ||
(a, t, nv) | ||
in | ||
let invalids = List.map invalidate_address addr_list in | ||
if M.tracing && addr_list <> [] then ( | ||
let addrs = List.map (Tuple3.first) invalids in | ||
let vs = List.map (Tuple3.third) invalids in | ||
M.tracel "invalidate" "Setting addresses [%a] to values [%a]\n" (d_list ", " AD.pretty) addrs (d_list ", " VD.pretty) vs | ||
); | ||
set_many ~ctx ask ctx.global ctx.local invalids |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason for converting this into an event instead of simply extracting the functions for all globals invalidation etc inside the base analysis itself?
All the calls to invalidate_core
look awkward, because they have to delegate outside to the events system to perform these after the actual transfer function completes, but in all three cases the transfer function just returns ctx.local
as the updated state. This seems like an unnecessarily roundabout way to extract common logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason for converting this into an event instead of simply extracting the functions for all globals invalidation etc inside the base analysis itself?
The idea of making invalidation an event based mechanism was that other analyses could be informed and then do some proper handling. One could e.g. imagine that an relational analysis that handles referenced variables might need to know when some variables are potentially invalidated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just afraid that in light of #719 with the need to distinguish non-transitive vs transitive accesses also for reads/frees and writes other than invalidation, this won't be general enough.
It's also quite base analysis specific by using its ValueDomain.Addr
. Other analyses like apron or var_eq, which currently also do similar invalidation, aren't using base analysis's address abstraction, but have their own ways of working with invalidated lvalues/rvalues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just afraid that in light of #719 with the need to distinguish non-transitive vs transitive accesses also for reads/frees and writes other than invalidation, this won't be general enough.
The idea for invalidate events would be that the Base
analysis collects all lvals that have to be invalidated. This way, other analyses would not have to implement the collection of transitively reachable lvals.
It's also quite base analysis specific by using its ValueDomain.Addr
One could add a utility function that takes elements from the ValueDomain.Addr
and translates them back to the lval
they are on (without offset information, in case the offset is not concrete).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea for invalidate events would be that the
Base
analysis collects all lvals that have to be invalidated. This way, other analyses would not have to implement the collection of transitively reachable lvals.
They don't currently reimplement it either, but just use the ReachableFrom
query where needed. Since #590 those queries are cached, so multiple analyses asking for the same reachability comes at no extra cost.
We will probably have to see how this plays together with the general non-transitive vs transitive accessing changes from #719 and #720.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about this more and how it compares to #720 and the Access
event. It might make sense to have the Invalidate
event after all because:
Access
event is emitted for all writes, not just invalidating ones. Although this could be changed by adding an extra field to that event.Access
event is emitted only during multithreaded mode, because it's all about race detection. But these invalidations also need to happen in singlethreaded mode. Although this could also be changed (and accesses from singlethreaded mode wouldn't race with anything).
But given that access analysis (and its event) is really about race detection, it might be better to keep it that way.
@@ -1899,34 +1899,34 @@ struct | |||
in | |||
List.concat_map do_exp exps | |||
|
|||
let invalidate ?ctx ask (gs:glob_fun) (st:store) (exps: exp list): store = | |||
type invalidate_mode = InvalidateSelf | InvalidateTransitive | InvalidateSelfAndTransitive [@@deriving show] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is more of a note to us than you, but the notion of transitive vs non-transitive invalidation gets a lot more general in light of #719 because it doesn't apply to just invalidation but every form of access.
let lvals_self = List.filter_map (function (_, InvalidateTransitive) -> None | (Lval lval, _) -> Some lval | _ -> None) inv_exps in | ||
let exps_self = List.filter_map (function (_, InvalidateTransitive) | (Lval _, _) -> None | (exp, _) -> Some exp) inv_exps in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's with the special Lval
cases here in addition to the invalidate_mode
based ones? Because matching the outermost lval expression is kind of fragile: it might also be a cast expression of an lval (or any number of casts). Seems like there's a more general concept here at play.
{ | ||
asm("nop"); | ||
|
||
assert(x == 0); // UNKNOWN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes sense to have additional tests for the configurable options you implement, e.g. you might add a test for what's currently sem.asm.basic-preserve-global
.
asm_volatile_goto("1:" | ||
".byte " __stringify(BYTES_NOP5) "\n\t" JUMP_TABLE_ENTRY |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This section might benefit from better indentation.
@@ -0,0 +1,45 @@ | |||
// see https://github.com/goblint/cil/issues/81 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this test currently only about parsing and not crashing Goblint? Maybe add an explicit comment about that, and the current limitations.
Unassigning myself as I currently do not have the bandwith to invest time here. |
Thank you again for this PR. One of the student teams is now working on implementing this more generally for the Goblint practical. Therefore, I am closing this PR. |
Instead of simply ignoring
asm
statements use their annotations to invalidate variables that are potentially modified by them.asm goto
cil#92 that are required for this analysis. This commit will have to be replace in the future after i get to properly creating the CFG forasm goto
statements.TODO
asm goto
asm goto
cil#92