Invalidate variables based on annotations of asm statements #715

bottbenj · 2022-05-01T21:53:41Z

Instead of simply ignoring asm statements use their annotations to invalidate variables that are potentially modified by them.

e3af268 are the changes required to make this compile after the changes in Add support for parsing asm goto cil#92 that are required for this analysis. This commit will have to be replace in the future after i get to properly creating the CFG for asm goto statements.
92b89d5 introduces an event to invalidate variables and makes the two existing invalidates in base use it.
e3af268 the actual code to collect which variables to invalidate for an asm

TODO

use the invalidate event outside of base
proper handling of asm goto
more tests
Update goblint-cil opam pin to Add support for parsing asm goto cil#92
???

sim642 · 2022-05-02T02:07:01Z

src/framework/constraints.ml

@@ -675,7 +675,7 @@ struct
      | Entry f        -> tf_entry var edge prev_node f
      | Ret (r,fd)     -> tf_ret var edge prev_node r fd
      | Test (p,b)     -> tf_test var edge prev_node p b
-      | ASM (_, _, _)  -> tf_asm var edge prev_node (* TODO: use ASM fields for something? *)
+      | ASM _          -> tf_asm var edge prev_node (* TODO: use ASM fields for something? *)


The relevant data should be taken from the ASM constructor and passed as actual arguments instead of looking at ctx.edge in the transfer function implementation.

michael-schwarz · 2022-05-03T15:09:06Z

Sorry we haven't provided feedback yet, we have an upcoming paper deadline on Friday AOE and will do so after for all your PRs 😄

sim642 · 2022-05-09T08:56:08Z

src/framework/edge.ml

@@ -42,7 +42,7 @@ let pretty () = function
  | Entry (f) -> Pretty.text "(body)"
  | Ret (Some e,f) -> Pretty.dprintf "return %a" dn_exp e
  | Ret (None,f) -> Pretty.dprintf "return"
-  | ASM (_,_,_) -> Pretty.text "ASM ..."
+  | ASM _ -> Pretty.text "ASM ..."


Now that we actually do something better than ignoring asm, it might be a good idea to also have the edge printout reflect the actual data.

sim642 · 2022-05-09T08:58:05Z

tests/regression/56-asm/02-basic-discard-local.c

+
+  asm("nop");
+
+  assert(x == 0);


The test is named "discard-local", but in what sense is the local being discarded if it isn't invalidated here.

sim642 · 2022-05-09T09:04:53Z

tests/regression/56-asm/30-simple-branch.c

+        : "=r"(x));
+  }
+
+  assert(x == 1); // UNKNOWN


Unlike the UNKNOWN in previous tests, which are simply because we don't analyze the asm instructions and assume the worst about nop, if I understand this correctly, it's actually unknowable, not just due to our imprecision. Therefore:

Suggested change

assert(x == 1); // UNKNOWN

assert(x == 1); // UNKNOWN!

If that's the case, then the same applies to following tests as well.

sim642 · 2022-05-09T09:08:31Z

src/domains/events.ml

@@ -9,6 +9,7 @@ type t =
  | AssignSpawnedThread of lval * ThreadIdDomain.Thread.t (** Assign spawned thread's ID to lval. *)
  | Access of {var_opt: CilType.Varinfo.t option; write: bool} (** Access varinfo (unknown if None). *)
  | Assign of {lval: CilType.Lval.t; exp: CilType.Exp.t} (** Used to simulate old [ctx.assign]. *)
+  | Invalidate of {addr_list: PreValueDomain.AD.bucket BatMap.Int.t list}


I'm guessing this is the actual type without inlining the implementation detail about buckets:

Suggested change

| Invalidate of {addr_list: PreValueDomain.AD.bucket BatMap.Int.t list}

| Invalidate of {addr_list: PreValueDomain.AD.t list}

sim642 · 2022-05-09T09:12:45Z

src/util/options.schema.json

+            "enabled": {
+              "title": "sem.asm.enabled",
+              "description": "",
+              "type": "boolean",
+              "default": true
+            },


What would be the use case for ever disabling asm semantics (and whatever that would mean)?
Wouldn't enabling the other options, which make additional safety assumptions about asm statements already give the old behavior, where possible changes from asm statements were completely ignored?

sim642 · 2022-05-09T09:18:04Z

src/util/options.schema.json

+            "basic-preserve-globals": {
+              "title": "sem.asm.basic-preserve-globals",
+              "description": "If enabled, do preserve globals across basic asm statements",
+              "type": "boolean",
+              "default": false
+            },
+            "memory-preserve-globals": {
+              "title": "sem.asm.memory-preserve-globals",
+              "description": "If enabled, do preserve globals across advanced asm statements with memory clobber annotation",
+              "type": "boolean",
+              "default": false
+            },


For consistency, it might make sense to negate these options since the pre-existing sem.unknown_function.invalidate.globals is about invalidating as opposed to preserving.

sim642 · 2022-05-09T09:20:02Z

src/analyses/base.ml

-      M.tracel "invalidate" "Setting addresses [%a] to values [%a]\n" (d_list ", " AD.pretty) addrs (d_list ", " VD.pretty) vs
-    );
-    set_many ?ctx ask gs st invalids'
+    invalidate_core ~ctx false (List.map (fun exp -> (exp, InvalidateTransitive)) exps);


The inv_globs argument for invalidate_core should probably be named, otherwise it's confusing what this false means here.

sim642 · 2022-05-09T09:22:05Z

src/analyses/base.ml

+    | Events.Invalidate {addr_list} ->
+      let invalidate_address a =
+        let t = AD.get_type a in
+        let v = get ask ctx.global st a None in (* None here is ok, just causes us to be a bit less precise *)
+        let nv =  VD.invalidate_value ask t v in
+        (a, t, nv)
+      in
+      let invalids = List.map invalidate_address addr_list in
+      if M.tracing && addr_list <> [] then (
+        let addrs = List.map (Tuple3.first) invalids in
+        let vs = List.map (Tuple3.third) invalids in
+        M.tracel "invalidate" "Setting addresses [%a] to values [%a]\n" (d_list ", " AD.pretty) addrs (d_list ", " VD.pretty) vs
+      );
+      set_many ~ctx ask ctx.global ctx.local invalids


What's the reason for converting this into an event instead of simply extracting the functions for all globals invalidation etc inside the base analysis itself?
All the calls to invalidate_core look awkward, because they have to delegate outside to the events system to perform these after the actual transfer function completes, but in all three cases the transfer function just returns ctx.local as the updated state. This seems like an unnecessarily roundabout way to extract common logic.

What's the reason for converting this into an event instead of simply extracting the functions for all globals invalidation etc inside the base analysis itself?

The idea of making invalidation an event based mechanism was that other analyses could be informed and then do some proper handling. One could e.g. imagine that an relational analysis that handles referenced variables might need to know when some variables are potentially invalidated.

I'm just afraid that in light of #719 with the need to distinguish non-transitive vs transitive accesses also for reads/frees and writes other than invalidation, this won't be general enough.

It's also quite base analysis specific by using its ValueDomain.Addr. Other analyses like apron or var_eq, which currently also do similar invalidation, aren't using base analysis's address abstraction, but have their own ways of working with invalidated lvalues/rvalues.

I'm just afraid that in light of #719 with the need to distinguish non-transitive vs transitive accesses also for reads/frees and writes other than invalidation, this won't be general enough.

The idea for invalidate events would be that the Base analysis collects all lvals that have to be invalidated. This way, other analyses would not have to implement the collection of transitively reachable lvals.

It's also quite base analysis specific by using its ValueDomain.Addr

One could add a utility function that takes elements from the ValueDomain.Addr and translates them back to the lval they are on (without offset information, in case the offset is not concrete).

The idea for invalidate events would be that the Base analysis collects all lvals that have to be invalidated. This way, other analyses would not have to implement the collection of transitively reachable lvals.

They don't currently reimplement it either, but just use the ReachableFrom query where needed. Since #590 those queries are cached, so multiple analyses asking for the same reachability comes at no extra cost.

We will probably have to see how this plays together with the general non-transitive vs transitive accessing changes from #719 and #720.

I thought about this more and how it compares to #720 and the Access event. It might make sense to have the Invalidate event after all because:

Access event is emitted for all writes, not just invalidating ones. Although this could be changed by adding an extra field to that event.

Access event is emitted only during multithreaded mode, because it's all about race detection. But these invalidations also need to happen in singlethreaded mode. Although this could also be changed (and accesses from singlethreaded mode wouldn't race with anything).

But given that access analysis (and its event) is really about race detection, it might be better to keep it that way.

sim642 · 2022-05-09T09:26:34Z

src/analyses/base.ml

@@ -1899,34 +1899,34 @@ struct
    in
    List.concat_map do_exp exps

-  let invalidate ?ctx ask (gs:glob_fun) (st:store) (exps: exp list): store =
+  type invalidate_mode = InvalidateSelf | InvalidateTransitive | InvalidateSelfAndTransitive [@@deriving show]


This is more of a note to us than you, but the notion of transitive vs non-transitive invalidation gets a lot more general in light of #719 because it doesn't apply to just invalidation but every form of access.

sim642 · 2022-05-09T09:34:01Z

src/analyses/base.ml

+    let lvals_self = List.filter_map (function (_, InvalidateTransitive) -> None | (Lval lval, _) -> Some lval | _ -> None) inv_exps in
+    let exps_self = List.filter_map (function (_, InvalidateTransitive) | (Lval _, _) -> None | (exp, _) -> Some exp) inv_exps in


What's with the special Lval cases here in addition to the invalidate_mode based ones? Because matching the outermost lval expression is kind of fragile: it might also be a cast expression of an lval (or any number of casts). Seems like there's a more general concept here at play.

jerhard · 2022-06-08T07:45:02Z

tests/regression/56-asm/03-basic-discard-global.c

+{
+  asm("nop");
+
+  assert(x == 0); // UNKNOWN


It makes sense to have additional tests for the configurable options you implement, e.g. you might add a test for what's currently sem.asm.basic-preserve-global.

jerhard · 2022-06-08T07:56:05Z

tests/regression/56-asm/60-goto-issue.c

+	asm_volatile_goto("1:"
+										".byte " __stringify(BYTES_NOP5) "\n\t" JUMP_TABLE_ENTRY


This section might benefit from better indentation.

jerhard · 2022-06-08T07:57:09Z

tests/regression/56-asm/60-goto-issue.c

@@ -0,0 +1,45 @@
+// see https://github.com/goblint/cil/issues/81


Is this test currently only about parsing and not crashing Goblint? Maybe add an explicit comment about that, and the current limitations.

michael-schwarz · 2023-07-14T14:50:33Z

Unassigning myself as I currently do not have the bandwith to invest time here.

michael-schwarz · 2023-11-01T16:52:22Z

Thank you again for this PR. One of the student teams is now working on implementing this more generally for the Goblint practical. Therefore, I am closing this PR.

bottbenj added 2 commits May 1, 2022 20:34

Adjustments for the CIL changes required for ASM analysis

e3af268

Refactor invalidate and special_unknown_invalidate to use a shared Event

92b89d5

sim642 reviewed May 2, 2022

View reviewed changes

sim642 added feature unsound precision student-job labels May 2, 2022

michael-schwarz self-requested a review May 3, 2022 15:08

bottbenj added 2 commits May 4, 2022 20:23

Propagate asm arguments to analyses

d4f9374

Invalidate variables (potentially) modified by an asm statement

18f157b

bottbenj force-pushed the asm-analysis branch from 0e31525 to 18f157b Compare May 4, 2022 18:28

sim642 self-requested a review May 8, 2022 07:17

sim642 reviewed May 9, 2022

View reviewed changes

michael-schwarz requested a review from jerhard May 10, 2022 13:18

jerhard reviewed Jun 8, 2022

View reviewed changes

michael-schwarz self-assigned this Mar 24, 2023

michael-schwarz removed their assignment Jul 14, 2023

michael-schwarz closed this Nov 1, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Invalidate variables based on annotations of asm statements #715

Invalidate variables based on annotations of asm statements #715

bottbenj commented May 1, 2022 •

edited by sim642

Loading

sim642 May 2, 2022

michael-schwarz commented May 3, 2022

sim642 May 9, 2022

sim642 May 9, 2022

sim642 May 9, 2022

sim642 May 9, 2022

sim642 May 9, 2022

sim642 May 9, 2022

sim642 May 9, 2022

sim642 May 9, 2022

jerhard May 10, 2022

sim642 May 10, 2022

jerhard May 11, 2022

sim642 May 11, 2022

sim642 May 30, 2022

sim642 May 9, 2022

sim642 May 9, 2022

jerhard Jun 8, 2022 •

edited

Loading

jerhard Jun 8, 2022

jerhard Jun 8, 2022

michael-schwarz commented Jul 14, 2023

michael-schwarz commented Nov 1, 2023

	\| Invalidate of {addr_list: PreValueDomain.AD.bucket BatMap.Int.t list}
	\| Invalidate of {addr_list: PreValueDomain.AD.t list}

		let lvals_self = List.filter_map (function (_, InvalidateTransitive) -> None \| (Lval lval, _) -> Some lval \| _ -> None) inv_exps in
		let exps_self = List.filter_map (function (_, InvalidateTransitive) \| (Lval _, _) -> None \| (exp, _) -> Some exp) inv_exps in

		asm_volatile_goto("1:"
		".byte " __stringify(BYTES_NOP5) "\n\t" JUMP_TABLE_ENTRY

		@@ -0,0 +1,45 @@
		// see https://github.com/goblint/cil/issues/81

Invalidate variables based on annotations of asm statements #715

Invalidate variables based on annotations of asm statements #715

Conversation

bottbenj commented May 1, 2022 • edited by sim642 Loading

TODO

Choose a reason for hiding this comment

michael-schwarz commented May 3, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jerhard Jun 8, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

michael-schwarz commented Jul 14, 2023

michael-schwarz commented Nov 1, 2023

bottbenj commented May 1, 2022 •

edited by sim642

Loading

jerhard Jun 8, 2022 •

edited

Loading