Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for parsing extended assembly code including asm goto #161

Open
wants to merge 27 commits into
base: develop
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 24 commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
3ff22e8
support for asm goto; untested but compiles
Nov 16, 2023
a76f439
intermidiary commit, so that everyone has same type definition
Nov 17, 2023
f1e3be5
todo: move all asm instr matches to stmt matches
Nov 17, 2023
26cc002
ich hab absolut keinen bock mehr
Nov 17, 2023
d01d6ec
proposal to solve dInstr problem
Nov 17, 2023
d0dcf8c
why goblint no work???
Nov 17, 2023
7e5435d
normal gobint-cil main.exe works but from gobint --justcil doesn't
Nov 17, 2023
a339860
fixed most complaints; (dummy instruction still todo)
Jan 14, 2024
7236437
Update cparser.mly
WernerDrasche Jan 15, 2024
8037c21
Update cparser.mly
WernerDrasche Jan 15, 2024
cb96966
have to commit to test in goblint I think
Jan 15, 2024
5de562b
test
Jan 15, 2024
9e02615
Merge branch 'second_try' of github.com:N0W0RK/goblint-cil into secon…
Jan 15, 2024
ced0532
apparently asm goto ("nop") compiles
Jan 15, 2024
35969c6
to test inside analyzer
Jan 29, 2024
5871101
testing cprint because bug I think??????????
Jan 29, 2024
4cb1f6b
???????????????????????????????????/
Jan 29, 2024
cb4d68c
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Jan 29, 2024
c31ad66
wtf
Jan 29, 2024
1928ed7
I think I found the bug?
Jan 29, 2024
bddb076
suicide commit (debugging ocaml is so fun :) :) :) :) :)
Jan 29, 2024
c81580c
fixed the expected : before bug; todo fix label printing in cil.ml
Jan 30, 2024
24d806c
commiting to test in goblint
Feb 1, 2024
da74c6c
repo is broken?
Feb 1, 2024
d54ca43
fixed bug
Feb 7, 2024
39a4d1b
dummy instruction deleted, revert if necessary
Feb 7, 2024
2030d40
removed zrapp asm; revert if necessary
Feb 7, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 3 additions & 1 deletion src/cfg.ml
Original file line number Diff line number Diff line change
Expand Up @@ -225,6 +225,7 @@ and cfgStmt (s: stmt) (next:stmt option) (break:stmt option) (cont:stmt option)
cfgBlock blk (Some s) next (Some s) nodeList rlabels
(* Since all loops have terminating condition true, we don't put
any direct successor to stmt following the loop *)
| Asm _ -> addOptionSucc next (* the extra edges are added inside goblint's cfg *)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be done here for consistency because everything else is done here. Otherwise the support appears broken by anyone else wanting to use CIL.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is unclear how to insert such edges, as these are only edges that are taken potentially and such a ASM block has several successors and jumps to these locations may happen at arbitrary locations within the inline assembly.

I cannot think of a general one-size-fits-all approach to include edges here, it depends on what one wants to do. In the case of Goblint over-approximating by taking any of the edges after the block has executed is what we want -- other tools might to inspect what goes on inside the assembly block or may want to underapproximate reachability etc.
I think the solution is proposed here is the best we can do.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the handling of computed goto-s already that way, though? I think CIL overapproximates their targets as well, yielding more than maybe necessary.

Following in the same spirit, it would make sense to use the overapproximation here as well. That way CIL's CFG is at least sound, i.e. it includes all possible control flows. Otherwise, it's neither sound nor complete, which is very ambiguous and unreliable. Downstream tools can always be more precise about it and exclude some targets if they want.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure? I can find no reference for this. I was under the impression that CIL assigns each label that has it's address taken an integer and then computes with these and goes to a jump table to decide where to jump to. The resulting CFG is still a "normal" CFG that corresponds to a C program.

Even if we insert these edges into the CFG, the resulting CFG would not be sound: Jumps from inline Assembly do not necessarily happen at the end or the beginning of an inline ask block. To obtain a CFG that is actually sound, we would have to look into the ASM block to insert edges appropriately, which I don't think we want to do.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure? I can find no reference for this. I was under the impression that CIL assigns each label that has it's address taken an integer and then computes with these and goes to a jump table to decide where to jump to. The resulting CFG is still a "normal" CFG that corresponds to a C program.

But that is an overapproximation! Instead of computing which exact labels a computed goto might jump to, it overapproximates that to all the labels. Of course not all computed gotos can jump to all those targets in the concrete, but that's fine: some edges are simply never used.

It would be similar for ASM gotos: have all the possible targets (which are explicitly listed by the user) in the CFG, but some of those might never be used. Without analyzing the assembly, we just cannot be any more precise.
Which exact instruction does which exact jumps is also a matter of precision, as opposed to having all of them jump to everywhere.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was decided to have a flag to toggle this behavior on or off at Gobcon.


(*------------------------------------------------------------*)

Expand All @@ -247,7 +248,7 @@ and fasStmt (todo) (s : stmt) =
| If (_, tb, fb, _, _) -> (fasBlock todo tb; fasBlock todo fb)
| Switch (_, b, _, _, _) -> fasBlock todo b
| Loop (b, _, _, _, _) -> fasBlock todo b
| (Return _ | Break _ | Continue _ | Goto _ | ComputedGoto _ | Instr _) -> ()
| (Return _ | Break _ | Continue _ | Goto _ | ComputedGoto _ | Instr _ | Asm _) -> ()
end
;;

Expand All @@ -270,6 +271,7 @@ let d_cfgnodelabel () (s : stmt) =
| Switch _ -> "switch"
| Block _ -> "block"
| Return _ -> "return"
| Asm _ -> "asm"
end in
dprintf "%d: %s" s.sid label

Expand Down
7 changes: 4 additions & 3 deletions src/check.ml
Original file line number Diff line number Diff line change
Expand Up @@ -808,9 +808,12 @@ and checkStmt (s: stmt) =
findCase !statements)
cases;

| Instr il -> List.iter checkInstr il)
| Instr il -> List.iter checkInstr il
| Asm _ -> () (* Not yet implemented *)
)
() (* argument of withContext *)


and checkBlock (b: block) : unit =
List.iter checkStmt b.bstmts

Expand Down Expand Up @@ -883,8 +886,6 @@ and checkInstr (i: instr) =
if not v.vhasdeclinstruction then
E.s (bug "Encountered a VarDecl, but vhasdeclinstruction for the varinfo is not set")

| Asm _ -> () (* Not yet implemented *)

let rec checkGlobal = function
GAsm _ -> ()
| GPragma _ -> ()
Expand Down
216 changes: 125 additions & 91 deletions src/cil.ml
Original file line number Diff line number Diff line change
Expand Up @@ -813,6 +813,34 @@ and stmtkind =
as a way to keep some attributes
local *)

(* See the GCC specification for the meaning of ASM.
If the source is MS VC then only the templates
are used *)
(* sm: I've added a notes.txt file which contains more
information on interpreting Asm instructions *)
| Asm of attributes * (* Really only const and volatile can appear
here *)
string list * (* templates (CR-separated) *)
(string option * string * lval) list *
(* outputs must be lvals with
optional names and constraints.
I would like these
to be actually variables, but I
run into some trouble with ASMs
in the Linux sources *)
(string option * string * exp) list *
(* inputs with optional names and constraints *)
string list * (* register clobbers *)
stmt ref list * (* referenced labels *)
location
(** An inline assembly instruction. The arguments are (1) a list of
attributes (only const and volatile can appear here and only for
GCC), (2) templates (CR-separated), (3) a list of
outputs, each of which is an lvalue with a constraint, (4) a list
of input expressions along with constraints, (5) clobbered
registers, (6) the labels list for asm goto, and (7) location information *)


(** Instructions. They may cause effects directly but may not have control
flow.*)
and instr =
Expand Down Expand Up @@ -840,33 +868,6 @@ and instr =
result then an implicit cast exists.
Second location is just for expression when inside condition. *)

(* See the GCC specification for the meaning of ASM.
If the source is MS VC then only the templates
are used *)
(* sm: I've added a notes.txt file which contains more
information on interpreting Asm instructions *)
| Asm of attributes * (* Really only const and volatile can appear
here *)
string list * (* templates (CR-separated) *)
(string option * string * lval) list *
(* outputs must be lvals with
optional names and constraints.
I would like these
to be actually variables, but I
run into some trouble with ASMs
in the Linux sources *)
(string option * string * exp) list *
(* inputs with optional names and constraints *)
string list * (* register clobbers *)
location
(** An inline assembly instruction. The arguments are (1) a list of
attributes (only const and volatile can appear here and only for
GCC), (2) templates (CR-separated), (3) a list of
outputs, each of which is an lvalue with a constraint, (4) a list
of input expressions along with constraints, (5) clobbered
registers, and (5) location information *)



(** Describes a location in a source file *)
and location = {
Expand Down Expand Up @@ -1125,7 +1126,6 @@ let get_instrLoc (inst : instr) =
match inst with
Set(_, _, loc, _) -> loc
| Call(_, _, _, loc, _) -> loc
| Asm(_, _, _, _, _, loc) -> loc
| VarDecl(_,loc) -> loc
let get_globalLoc (g : global) =
match g with
Expand Down Expand Up @@ -1155,6 +1155,7 @@ let rec get_stmtLoc (statement : stmtkind) =
| Loop (_, loc, _, _, _) -> loc
| Block b -> if b.bstmts == [] then lu
else get_stmtLoc ((List.hd b.bstmts).skind)
| Asm (_, _, _, _, _, _, loc) -> loc


(* The next variable identifier to use. Counts up *)
Expand Down Expand Up @@ -1353,8 +1354,7 @@ let mkBlock (slst: stmt list) : block =
let mkEmptyStmt () = mkStmt (Instr [])
let mkStmtOneInstr (i: instr) = mkStmt (Instr [i])

let dummyInstr = (Asm([], ["dummy statement!!"], [], [], [], lu))
let dummyStmt = mkStmt (Instr [dummyInstr])
let dummyStmt = mkStmt (Asm([], ["dummy statement!!"], [], [], [], [], lu))

let compactStmts (b: stmt list) : stmt list =
(* Try to compress statements. Scan the list of statements and remember
Expand Down Expand Up @@ -3708,53 +3708,6 @@ class defaultCilPrinterClass : cilPrinter = object (self)
++ unalign)
++ text (")" ^ printInstrTerminator)

| Asm(attrs, tmpls, outs, ins, clobs, l) ->
self#pLineDirective l
++ text ("__asm__ ")
++ self#pAttrs () attrs
++ text " ("
++ (align
++ (docList ~sep:line
(fun x -> text ("\"" ^ escape_string x ^ "\""))
() tmpls)
++
(if outs = [] && ins = [] && clobs = [] then
chr ':'
else
(text ": "
++ (docList ~sep:(chr ',' ++ break)
(fun (idopt, c, lv) ->
text(match idopt with
None -> ""
| Some id -> "[" ^ id ^ "] "
) ++
text ("\"" ^ escape_string c ^ "\" (")
++ self#pLval () lv
++ text ")") () outs)))
++
(if ins = [] && clobs = [] then
nil
else
(text ": "
++ (docList ~sep:(chr ',' ++ break)
(fun (idopt, c, e) ->
text(match idopt with
None -> ""
| Some id -> "[" ^ id ^ "] "
) ++
text ("\"" ^ escape_string c ^ "\" (")
++ self#pExp () e
++ text ")") () ins)))
++
(if clobs = [] then nil
else
(text ": "
++ (docList ~sep:(chr ',' ++ break)
(fun c -> text ("\"" ^ escape_string c ^ "\""))
()
clobs)))
++ unalign)
++ text (")" ^ printInstrTerminator)


(**** STATEMENTS ****)
Expand Down Expand Up @@ -3977,6 +3930,82 @@ class defaultCilPrinterClass : cilPrinter = object (self)
++ self#pBlock () b)
end
| Block b -> align ++ self#pBlock () b
| Asm(attrs, tmpls, outs, ins, clobs, labels, l) ->
let rec names_of_stmtrefs stmtrefs =
match stmtrefs with
| [] -> []
| stmtref :: xs ->
let s = !stmtref in
let rec names_of_labels = function
| [] -> []
| Label (name, _, true) :: ys -> name :: names_of_labels ys
| Label (_, _, false) :: ys -> names_of_labels ys
| _ -> failwith "unreachable"
in
let names = names_of_labels s.labels in
let rest = names_of_stmtrefs xs in
let rec prepend_new_names (names: string list) rest =
match names with
| [] -> rest
| name :: ys when not (List.mem name rest) -> prepend_new_names ys (name :: rest)
| _ :: ys -> prepend_new_names ys rest
in
prepend_new_names names rest
in
self#pLineDirective l
++ text ("__asm__ ")
++ self#pAttrs () attrs
++ text " ("
++ (align
++ (docList ~sep:line
(fun x -> text ("\"" ^ escape_string x ^ "\""))
() tmpls)
++
(if outs = [] && ins = [] && clobs = [] && labels = [] then
nil
else
(text ": "
++ (docList ~sep:(chr ',' ++ break)
(fun (idopt, c, lv) ->
text(match idopt with
None -> ""
| Some id -> "[" ^ id ^ "] "
) ++
text ("\"" ^ escape_string c ^ "\" (")
++ self#pLval () lv
++ text ")") () outs)))
++
(if ins = [] && clobs = [] && labels = [] then
nil
else
(text ": "
++ (docList ~sep:(chr ',' ++ break)
(fun (idopt, c, e) ->
text(match idopt with
None -> ""
| Some id -> "[" ^ id ^ "] "
) ++
text ("\"" ^ escape_string c ^ "\" (")
++ self#pExp () e
++ text ")") () ins)))
++
(if clobs = [] && labels = [] then nil
else
(text ": "
++ (docList ~sep:(chr ',' ++ break)
(fun c -> text ("\"" ^ escape_string c ^ "\""))
()
clobs)))
++
(if labels = [] then nil
else
(text ": "
++ (docList ~sep:(chr ',' ++ break)
text
()
(names_of_stmtrefs labels))))
++ unalign)
++ text (")" ^ printInstrTerminator)


(*** GLOBALS ***)
Expand Down Expand Up @@ -4329,6 +4358,7 @@ class defaultCilPrinterClass : cilPrinter = object (self)
| "aconst", [] -> text "__const__", true
| "thread", [] -> text "__thread", false
| "volatile", [] -> text "volatile", false
| "goto", [] -> text "goto", false
| "restrict", [] -> text "__restrict", false
| "missingproto", [] -> text "/* missing proto */", false
| "asm", args ->
Expand All @@ -4355,7 +4385,6 @@ class defaultCilPrinterClass : cilPrinter = object (self)
| "arraylen", [a] ->
(* text "/*[" ++ self#pAttrParam () a ++ text "]*/" *) nil, false


| _ -> (* This is the default case *)
(* Add underscores to the name *)
let an' = "__" ^ an ^ "__" in
Expand Down Expand Up @@ -5325,16 +5354,6 @@ and childrenInstr (vis: cilVisitor) (i: instr) : instr =
if lv' != lv || fn' != fn || args' != args
then Call(Some lv', fn', args', l, el) else i

| Asm(sl,isvol,outs,ins,clobs,l) ->
let outs' = mapNoCopy (fun ((id,s,lv) as pair) ->
let lv' = fLval lv in
if lv' != lv then (id,s,lv') else pair) outs in
let ins' = mapNoCopy (fun ((id,s,e) as pair) ->
let e' = fExp e in
if e' != e then (id,s,e') else pair) ins in
if outs' != outs || ins' != ins then
Asm(sl,isvol,outs',ins',clobs,l) else i


(* visit all nodes in a Cil statement tree in preorder *)
and visitCilStmt (vis: cilVisitor) (s: stmt) : stmt =
Expand Down Expand Up @@ -5397,6 +5416,15 @@ and childrenStmt (toPrepend: instr list ref) : cilVisitor -> stmt -> stmt =
| Block b ->
let b' = fBlock b in
if b' != b then Block b' else s.skind
| Asm(sl,isvol,outs,ins,clobs,labels,l) ->
let outs' = mapNoCopy (fun ((id,s,lv) as pair) ->
let lv' = visitCilLval vis lv in
if lv' != lv then (id,s,lv') else pair) outs in
let ins' = mapNoCopy (fun ((id,s,e) as pair) ->
let e' = fExp e in
if e' != e then (id,s,e') else pair) ins in
if outs' != outs || ins' != ins then
Asm(sl,isvol,outs',ins',clobs,labels,l) else s.skind
in
if skind' != s.skind then s.skind <- skind';
(* Visit the labels *)
Expand Down Expand Up @@ -5884,7 +5912,7 @@ let rec peepHole1 (* Process one instruction and possibly replace it *)
| Switch (e, b, _, _, _) -> peepHole1 doone b.bstmts
| Loop (b, l, el, _, _) -> peepHole1 doone b.bstmts
| Block b -> peepHole1 doone b.bstmts
| Return _ | Goto _ | ComputedGoto _ | Break _ | Continue _ -> ())
| Return _ | Goto _ | ComputedGoto _ | Break _ | Continue _ | Asm _ -> ())
ss

let rec peepHole2 (* Process two instructions and possibly replace them both *)
Expand Down Expand Up @@ -5912,7 +5940,7 @@ let rec peepHole2 (* Process two instructions and possibly replace them both *)
| Loop (b, l, el, _, _) -> peepHole2 dotwo b.bstmts
| Block b -> peepHole2 dotwo b.bstmts

| Return _ | Goto _ | ComputedGoto _ | Break _ | Continue _ -> ())
| Return _ | Goto _ | ComputedGoto _ | Break _ | Continue _ | Asm _ -> ())
ss


Expand Down Expand Up @@ -6005,8 +6033,13 @@ let typeSigAttrs = function
let dExp: doc -> exp =
fun d -> Const(CStr(sprint ~width:!lineLength d, No_encoding))

(* todo: is this even used anywhere *)
(*
let dInstr: doc -> location -> instr =
fun d l -> Asm([], [sprint ~width:!lineLength d], [], [], [], l)
*)
michael-schwarz marked this conversation as resolved.
Show resolved Hide resolved
let dStmt: doc -> location -> stmt =
fun d l -> mkStmt (Asm([], [sprint ~width:!lineLength d], [], [], [], [], l))

let dGlobal: doc -> location -> global =
fun d l -> GAsm(sprint ~width:!lineLength d, l)
Expand Down Expand Up @@ -6517,6 +6550,7 @@ and succpred_stmt s fallthrough rlabels =
| hd :: tl -> link s hd ;
succpred_block b fallthrough rlabels
end
| Asm _ -> trylink s fallthrough
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If Asm now includes labels, shouldn't CFG construction do something with those?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see below.



let caseRangeFold (l: label list) =
Expand Down Expand Up @@ -6581,7 +6615,7 @@ let rec xform_switch_stmt s break_dest cont_dest = begin
| Default(l,el) -> Label(freshLabel "switch_default",l,false)
) s.labels ;
match s.skind with
| Instr _ | Return _ | Goto _ | ComputedGoto _ -> ()
| Instr _ | Return _ | Goto _ | ComputedGoto _ | Asm _ -> ()
| Break(l) -> begin try
s.skind <- Goto(break_dest (),l)
with e ->
Expand Down
Loading
Loading