Skip to content
This repository has been archived by the owner on Feb 1, 2020. It is now read-only.

Manual (to be processed)

Grigore Rosu edited this page Dec 23, 2015 · 55 revisions

Note for @radumereuta (by Grigore): Radu, please revise the sort/operation inference and sort annotations sections. The principle underlying sort/operation inference is very simple:

  • Sort/operation inference attempts to maximize the sort of any involved term.

Specifically, suppose that a K code region of interest F admits n possible K ASTs {T1, T2, ..., Tn}, where each Ti corresponds to some particular consistent choice of sorts assigned to the variables occurring in F and of productions that make Ti a possible K AST of F. Since in KAST we have KLabels instead of productions and since there could be multiple productions sharing the same KLabel, it could very well be the case that n is strictly smaller than the number of ambiguous parses of F. In fact, recall that by default K does not report ambiguous parses which result in identical K ASTs. Of course, we assume that all n K ASTs {T1, T2, ..., Tn} respect all user-provided (sort, priority, etc.) constraints. The n terms must have the same shape modulo AC, that is, they can be transformed modulo AC of their KLabels into terms having KLabels of same root at the same positions (where the root of _+_:Exp*Exp->Exp is _+_); otherwise an ambiguous parsing should be reported. The goal of sort/operation inference is to pick one term Ti from this set with the property that each of its subterms has a sort larger than or equal to the sort of the same subterm occurring in other K AST in the set. In case some KLabels correspond to several productions, the sort of a (sub)term is calculated picking the production that yields the smallest result sort. Warning/Error messages should be reported if there is no way to achieve this.

With the above in mind, I think this text in the Sort annotations section is obsolete (and confusing):

The dual is `<:Sort` and forces the inner production to have **exactly**
the type specified (no subsorts). This can be useful sometimes when ambiguous
productions have related sorts. For example, a common ambiguity in K is related
to overloaded lists where we have `Exps ::= List{Exp,","}` and
`Ids ::= List{Id,","}`.  Also we have the subsort `Ids < Exps`. For the
following input: `(A, B)<:Exps`, both productions will be considered, but we
can annotate the expression with the desired sort to choose the desired production.

The operation <:Sort should only require its argument to be parse-able to sort Sort, which includes any subsorts of Sort.

The principle underlying sort annotations is also very simple, and I believe the above maximal-sort conventions correctly implement it:

  • Sort annotations should allow users to syntactically coerce the parser to choose any operations they want in any context, without a need to use abstract syntax (i.e., the labelled notation).

Radu, as you revise the sort/operation inference and sort annotations sections, please find a nice way to also include the two underlying principles above. I actually believe that the best way to do this is to merge the two sections into only one.

Finally, with this maximal-sort convention, I fail to see the need for both :: and <:>. Only one should suffice, with the semantics syntax Sort ::= Sort "::Sort". Do you disagree?

End of note

Note from @bmmoore: Let me propose a design principle that when we are explaining how subsorting relationships work out (and in particular what relationships are implicitly added between sorts which are lists of related sorts, or pure unions of overlapping subsorts), we should at least consider how a straightforward recursive definition of isSort predicates would apply on terms made of otherwise-unsorted labels. E.g, if Exps and Ints are lists of the same separator, a simple recursive isInts predicate would return true on a list of expressions which all happen to be integers, so we should have Ints < Exps without requiring any explicit declaration.


The text in this file will eventually be incorporated within the K reference manual.


Rewriting Modulo Structure

\Sigma

\Pi_2^0

Functional Notation in Productions

Inspired from the SDF, the following notation is allowed in K:

syntax Sort ::= name(Sort1,Sort2,...,Sortn)  [attributes]

Here we assume that attributes does not include any klabel attribute. The above notation, which we refer to as functional notation, is completely equivalent to the following:

syntax Sort ::= "name" "(" Sort1 "," Sort2 "," ... "," Sortn ")"   [attributes, klabel(name)]

Module Naming Conventions

See issue #1933 (https://github.com/kframework/k/issues/1933) for the new module conventions. Basically, each user-defined module M will have an automatically generated M-PARSING module which will be used for parsing ground terms over the syntax of M. Then users can manually add their stuff to M-PARSING, which will therefore be added to the syntax used for parsing. For example, M can be the syntax of a PL, except for the actual identifier tokens.0 Then the user can add the identifier tokens to M-PARSING manually. This way, those tokens will be used when parsing programs, but they will not be included in the semantics, this way avoiding conflicts between them and K variables.

KTest

The text in this file will eventually be incorporated within the K reference manual.


KTest: Basic use case

To test the definition against all the programs in this folder, go to the parent folder where the simple-untyped.k definition is and execute

  ktest simple-untyped.k --programs programs_and_results --extension simple --exclude dekker

The --exclude option tells which programs to exclude from krunning. You may want to exclude programs which take a lot of time, or which don't terminate.

If you want to skip the PDF poster generation, which takes some time, execute

  ktest simple-untyped.k --programs programs_and_results --extension simple --exclude dekker --skip pdf

If you already kompiled your definition and want to only krun the programs:

  ktest simple-untyped.k --programs programs_and_results --extension simple --exclude dekker --skip "kompile pdf"

KTest: Basic use case

To test your solution, go to the folder where it is located (i.e., where the file simple-untyped.k that you wrote is) and then call the ktest tool on the config.xml file (the one in this folder). For example, if your solution is in the parent folder, to test it go there and type

  ktest tests/config.xml

As another example, if the k distribution is installed at /home/cs422-Fall-2013/k and if you solve this exercise in /home/cs422-Fall-2013/joe/hw3/ex1, then test it by executing the following command from the folder where your solution is:

ktest /home/cs422-Fall-2013/k/dist/tutorial/2_languages/1_simple/1_untyped/exercises/break-continue/config.xml

If you want to skip the kompilation and the pdf generation, then use the following command instead (and similarly for the second example above):

  ktest tests/config.xml --skip "kompile pdf"

If you want to skip only one of kompile or pdf (typically the pdf), then use only that one in the --skip option (e.g., --skip pdf).

Both the original SIMPLE untyped programs and the programs in this folder will be tested. If your solution is correct, then all tests should pass, except, potentially, the new break-from-function.simple (see the comments there for the reason).

If your solution is somewhere else, then use the -d option of ktest to give the directory to where your K definition is located. For example, provided that your definition is in the parent folder, then to test it you can type the following command in this folder (i.e., programs_and_results):

  ktest config.xml -d ..

KTest: Basic use case

To test this definition, you can call ktest on the config.xml file in this folder. Recall that if you call ktest from any other folder than where your definition is, then you need the option -d path/to/your/def/directory. For example, if you want to call it from this folder, then type

  ktest config.xml -d ..

If you want to call it from the parent folder then just type:

  ktest results/config.xml

Also recall that if your definition is already kompiled, then you can skip kompilation to save time using the --skip kompile option; if you want to skip both kompilation and pdf generation then use the option --skip "kompile pdf".

KTest: General philosophy

The general philosophy of ktest is that the config file contains all the static info (where the programs and their results are), but it does not contains the dynamic info (where the actual definition to be tested is). The same philosophy holds when we use include. So all the non-definition attributes need to be statically solved at the place where the include attribute appears. The use of an include attribute is nothing but syntactic sugar, so the user does not have to repeat all that information which has already been given in the included configuration. The include attributes can be desugared completely statically, and thus independently from where the config file containing the include attribute is called.

In short, the inclusion of a config2.xml file into a config1.xml file is equivalent to having a flat config1'.xml file instead of config1.xml, which would behave exactly the same way as config1.xml, wherever ktest may call it from. So imagine that you have that config1'.xml built that way, completely statically (actually, this is probably the way you implement it anyway). Now, the same way I can call ktest on config1'.xml from wherever I want, the same way I want to be able to completely equivalently call ktest on config1.xml from the same place.

For example, suppose that you are somewhere called "the working directory" and that the definition you want to test is at path/lang.k from the working directory. Suppose that for some reason you really need to put your config1'.xml/config1.xml file at path1 from the working directory. Then your config1'.xml file will look something like this:

<test> definition="path/lang.k" programs="P1" ...  </test>

where P1 is the relative path from the directory path1 to where the programs are, and you would call it with the command

  ktest path1/config1'.xml

Now suppose that a config2.xml file for testing lank.k against the same programs has already been created in the past and that you would like to replace the verbose and redundant config1'.xml with a config1.xml which simply includes config2.xml. For concreteness, assume that config2.xml is at relative path path2 from where config1'.xml/config1.xml is, and that its contents is

<test> definition="lang.k" programs="P2" ...  </test>

Then it must be the case that P1=path1/P2. Knowing this, we can call ktest on config2.xml as follows:

  ktest path1/path2/config2.xml --directory path

This suggests that we include config2.xml into config1.xml as follows:

  <include file="path2/config2.xml" directory="path">

That is, the include attribute in config.xml contains exactly and only the information that I would have to manually pass to ktest to call config2.xml instead of config1'.xml or config1.xml.

However, since configuration files have fixed programs folders, we have to agree on what are the programs corresponding to the include attribute in config1.xml. There is only one reasonable way that can be, namely the same programs folder as in config1'.xml, that is, P1. So in a command like

  ktest path1/config1.xml --directory path --programs P

the root directory P is applied to P1 and NOT to P2 (again, think config1'.xml instead of config1.xml). So the above command would be equivalent to

  ktest path1/path2/config2.xml --directory path --programs P/path1/P2

KTest: Configuration files: Test element

In a batch mode, ktest requires a configuration file.

A configuration file can have multiple <test> elements. A <test> element corresponds to a single job mode where each attribute corresponds to the command line option of the same name.

A <test> element requires a K definition to be tested. For example, the following <test> element just kompiles "lang.k".

  <test
    definition="lang.k"
  />

Given the above, ktest does nothing but kompilation, but might be useful when you want to only check if your K definition is valid to kompile.

A <test> element can also have "programs" attributes that specify root directories of programs to be tested. Note that all nested sub-directories are considered when searching the programs. Several root directories can be given, separated by white spaces. A "programs" attribute requires an "extension" attribute that indicates extensions of programs to be executed. For example, the following <test> element kompiles "lang.k" and runs all programs under the directories "P1" and "P2", whose extension is either "E1" or "E2".

  <test
    definition="lang.k"
    programs="P1 P2"
    extension="E1 E2"
  />

In other words, given the above <test> element, ktest has the same behavior with the following bash shell script:

  kompile lang.k
  for i in `find P1 -name '*.E1' -o -name '*.E2'`
           `find P2 -name '*.E1' -o -name '*.E2'`
  do
    krun $i
  done

Note that if there exist multiple files with the same name under the "P1" and "P2" directories, the one in the "P2" is only executed.

An "exclude" attribute specifies which programs not to be executed. A program whose file name contains a pattern given by a "exclude" attribute is excluded for the execution.

A "results" attribute specifies root directories of reference outputs. For each program "pgm.ext", a reference output file "pgm.out" is searched under the directories given by a "results" attribute. As the "programs" attribute, a "results" attribute can have several root directories separated by white spaces, and also, the one from the last root directory is used when multiple reference outputs with the same name exist. The directories given by a "results" attribute are also used when searching an input file of each program.

A <test> element can have several <kompile-option> elements, which is used at the kompilation of the given K definition. The default options of krun can be specified by a <all-programs> element that contains multiple <krun-option> elements. Note that the <all-programs> element can be given once at most. Program specific options can be given by <program> elements. A <program> element has a "name" attribute and multiple <krun-option> elements. A program whose name contains a pattern given by a "name" attribute is executed with the options specified by the <program> element.

A <test> element can have an optional "directory" attribute, which specifies where the kompiled directory is created. If this attribute is not given, the kompiled directory is created, by default, at the directory in which the K definition resides. The "directory" attribute is necessary when you provide multiple <test> elements that have the same K definition but have different <kompile-option> elements.

To sum, for example, given the following <test> element:

  <test
    definition="lang.k"
    directory="D"
    programs="P1 P2"
    extension="E1 E2"
    exclude="X1 X2"
    results="R1 R2"
    />
    <kompile-option name="--O1" value="V1" />
    <kompile-option name="--O2" value="V2" />
    <all-programs>
      <krun-option name="--O3" value="V3" />
      <krun-option name="--O4" value="V4" />
    </all-programs>
    <program name="N1">
      <krun-option name="--O5" value="V5" />
      <krun-option name="--O6" value="V6" />
    </program>
    <program name="N2">
      <krun-option name="--O7" value="V7" />
      <krun-option name="--O8" value="V8" />
    </program>
  </test>

ktest works as follows:

At first, "lang.k" is kompiled with the options "--O1=V1 --O2=V2", whose kompiled directory "lang-kompiled" is saved at "D". And then, all programs under the "P1" and "P2" directories, whose extension is either "E1" or "E2", are executed with the options "--O3=V3 --O4=V4". However, a program whose name contains "N1" (or "N2, resp.) is executed with another series of options "--O5=V5 --O6=V6" (or "--O7=V7 --O8=V8", resp.). Furthermore, a program whose name contains "X1" or "X2" is excluded in the execution. When a program, say "pgm.ext", is executed, an input file "pgm.in" under the "R1" or "R2", if any, is given as a stdin; and an output file "pgm.out" is used for comparison with the actual stdout. Note that if there exist multiple files with the same name under the "P1" and "P2" (or "R1" and "R2", resp.) directories, the only one in the "P2" (or "R2", resp.) is used.

KTest: Configuration files: Include element

A configuration file can have multiple <include> elements as well as <test> elements. By an <include> element, you can include another configuration file. For example, you can include "another_config.xml" file as follows:

  <include
    file="another_config.xml"
  />

Basically, an <include> element inherits all attributes and elements from the included configuration file, which can be, of course, overwritten by explicitly specifying them, except "definition", "directory", "programs", and "results" attributes. For example, let a configuration file be:

  <include
    file="another_config.xml"
    exclude="new_exclude"
  />

and the included configuration file "another_config.xml" be:

  <test
    definition="D"
    programs="P"
    exclude="X"
  />

then the included <test> element is virtually same with the following:

  <test
    definition="D"
    programs="P"
    exclude="new_exclude"
  />

Although the "programs" and "results" attributes cannot be overwritten, they can be incremented by using "more-programs" and "more-results" attributes. For example, let a configuration file be:

  <include
    file="config.xml"
    more-programs="P"
    more-results="R"
  />

and the included configuration file "config.xml" be:

  <test
    definition="lang.k"
    programs="P1 P2"
    results="R1 R2"
  />

then the included <test> element is virtually same with the following:

  <test
    definition="lang.k"
    programs="P1 P2 P"
    results="R1 R2 R"
  />

Note that "definition" attribute cannot be overwritten at all.

Also note that an <include> element can also have "directory", "programs", or "results" attributes, but they have another meaning: path resolution.

KTest: Path resolution

Every path of a configuration file is relative. Three different kinds of paths exist: definition, programs, and results. A root for the relative directories can be specified using command line options. For example, let command line arguments of ktest be:

  $ ktest C/config.xml --directory D --programs P --results R

and 'C/config.xml' be:

  <test
    definition="D1/lang.k"
    programs="P1 P2"
    results="R1 R2"
  />

then the paths are resolved as follows:

  <test
    definition="D/D1/lang.k"
    programs="P/P1 P/P2"
    results="R/R1 R/R2"
  />

Note that the option --directory is optional, and its default value is the current directory. Also, the options --programs and --results are optional, and their default values are the directory where the given configuration file resides. For example, the following command

  ktest C/config.xml

is virtually same with the following:

  ktest C/config.xml --directory . --programs C --results C

The root can be also specified using <include>'s attributes. The same rule is applied here. For example, let a configuration file 'C/config.xml' have the following <include> element:

  C/config.xml:

  <include
    file=C1/config1.xml
    directory=D
    programs=P
    results=R
  />

and the included configuration file 'C/C1/config1.xml' be:

  C/C1/config1.xml:

  <test
    definition="D1/lang.k"
    programs="P1 P2"
    results="R1 R2"
  />

then the <include> element is resolved as follows:

  C/config.xml:

  <test
    definition="D/D1/lang.k"
    programs="P/P1 P/P2"
    results="R/R1 R/R2"
  />

Note that the attribute 'definition' is optional, and its default value is the current directory. Also the attributes 'programs' and 'results' are optional, and their default values are the directory where the included configuration file resides. For example, the following <include> element

  <include
    file=C1/config1.xml
  />

is vitally same with the following:

  <include
    file=C1/config1.xml
    directory=.
    programs=C1
    results=C1
  />

To sum, the roots given by command line options and <include>'s attributes are accumulated to resolve the relative paths. For example, let command line argument of ktest be:

  $ ktest C/config.xml --directory D --programs P --results R

and configuration file 'C/config.xml' be:

  C/config.xml:

  <include
    file=C1/config1.xml
    directory=D0
    programs=P0
    results=R0
  />

and the included configuration file 'C/C1/config1.xml' be:

  C/C1/config1.xml:

  <test
    definition="D1/lang.k"
    programs="P1 P2"
    results="R1 R2"
  />

then the paths are eventually resolved as follows:

  <test
    definition="D/D0/D1/lang.k"
    programs="P/P0/P1 P/P0/P2"
    results="R/R0/R1 R/R0/R2"
  />

Note that when they are not specified, the default roots are:

  D  = .
  P  = C
  R  = C
  D0 = .
  P0 = C1
  R0 = C1

KTest: Automatic output generation

ktest has a feature of automatic generation of output (*.out) files. This feature consists of two options: —-update-out and --generate-out.

ktest --update-out

When this option is enabled, ktest updates every output files by newly generated output of krun. This option creates a new set of reference output files, discarding the previous one. This option is useful when you want to change the format of output files due to the version update of krun.

Use this option at your own risk. With this option, ktest does not check the correctness of krun's output.

ktest —-generate-out

When this option is enabled, ktest generates a new output file at a new place whenever krun's output is different from the existing output file. The new output file is created in the last directory of the given 'results' attributes.

This option is useful when you create a new K definition and want to construct a new set of test programs and outputs for the K definition.

NOTE:

  • A new output file is also created when no previous output file exists.
  • A previous output file could be updated if the last directory of the 'results' attributes is the same with the directory that the previous output file resides.

NOTE: These two options make ktest run in sequential mode, ignoring the --threads option of ktest. Since multiple tests can share a single output file, generating/updating output files should be done sequentially. Otherwise, parallel updates may cause a race condition for the shared output files.

KTest: Skip steps

Each test element contains three steps: kompile, pdf, and krun. Each steps can be skipped by using either a command line option or an attribute of config.xml file.

In order to skip a step for the entire test suite, you can use a command line option --skip <steps>. For example, the following command runs ktest without generating PDFs for all test elements:

  $ ktest --skip pdf config.xml

You can also skip multiple steps by enumerating them. For example, the following command runs only the krun steps for all tests:

  $ ktest --skip "kompile pdf" config.xml

In order to skip a step for each test element, you can use an attribute of config.xml file 'skip'. For example, the following test element skips the PDF generation.

  <test
    ...
    skip="pdf"
    ...
  />

You can also skip multiple steps by enumerating them as follows:

  <test
    ...
    skip="kompile pdf"
    ...
  />

KTest: Overriding options

All of kompile/krun options such as <kompile-option>, <all-programs>, <program> elements in <include> overwrite the included counterparts. That is, this behavior is not a bug but a design.

The rational of this design is that it will be too difficult for users to keep track of all those options when we allow them to be extended. It would be much tricker when some of extended options are conflicted with the previous ones. In that case, we will need a way to describe how to resolve those conflict, which will make ktest much more complex.

Note that such an extension is only allowed in more-programs and more-results which extend the previous programs and results attributes. They cannot have a conflict, because it only add the more-programs/results to the previous programs/results.

Of course, I admit that this design is not consistent in some sense, although it is our best at that moment. It will be great if you suggest a more general and simpler design.

KTest: Internals

ktest is much more complex than your quick description. The program options are necessary to everybody who writes non-trivial definitions, not only to the adventurous. And so is the capability to exclude things. And the programs and results can be multiple folders, not only one. The directory and the programs/results have different meaning and semantics, they should never be put into the same category. K definitions change, evolve, while the programs and their outputs stay mostly the same; sometimes they change, too, which is why you want to have the capability to override them, too. But most of the time they don't. For example, you have lots of programs and tests for n language features. Now you add feature number n+1. You want to reuse all the previous programs and their outputs, but with a different language definition.

The semantics of a ktest config file config.xml is the following pair:

[[config.xml]] = ( { (p1,i1,o1), (p2,i2,o2), ..., (pn,in,on) }, lang.k )

The first element is a set of triples (program,input,output), namely the actual tests to be performed against a K definition. The second argument is the actual definition. Based on this domain, we can give semantics to all ktest's constructs. For example, [[include config.xml]] adds to the semantics of the current configuration file all the triples (program,input,output) from the semantics of config.xml, but ignores the definition of config.xml, because we want to test the current definition not the old one. Other constructs, like more-programs, exclude, programs, etc., allow you to modify the set of triples (program,input,output) in a compact way. This was derived by handling tens of definitions and exercises.

We may want an additional include tag, maybe import, which would include also the definition, not only the programs/inputs/outputs. A couple of people I talked to thought that was the meaning of include. Having two different ways to import tests will at least make them reflect on the difference between the two and hopefully choose the right one for their needs.

KTest: exclude attribute

exclude = "" actually means excluding EVERYTHING.

exclude is simply a pattern that is searched in FULL FILE PATHS. The exclude string it's not a glob or regex pattern, it's just a character string.

Since there was also someone else confused by this, I think it's a good idea to either change this behavior or improve help message.

Kast/Kompile/Krun

Environment Variables

JVM options for running kast/kompile/krun/etc can be provided by setting the K_OPTS environment variable. These options will always be passed after a set of default options. The HotSpot VM allows later options to override earlier options, so this shouldn't be a problem. (If you use another JVM even our default options not be supported. In that you may need to modify lib/k or lib/k.bat, which set the default options and read the environment variable).

File read/write

If you want to write something to a file at once, you can go with #open and #write .

For example,

 rule <k> run => writeFile("file", "content") ... </k>

 syntax K ::= writeFile(String,String)
 rule writeFile(File:String, Content:String)
   => writeFD(#open(File), Content)

 syntax K ::= writeFD(Int,String)
 rule writeFD(FD:Int, Content:String)
   => #write(FD, Content) ~> #close(FD)

TODO: how to read from a file

Krun: Configuration parser variables

Suppose that you want to parse configuration variables (say CFG1, CFG2, ...) with different parsers ('parser1', 'parser2', ...). Here is how you can set a parser for each configuration variable:

-$ krun -config-var-parser="parser1" -cCFG1="some term" -cCFG3="another term"
        -config-var-parser="parser2" -cCFG2="other term" 

Note that once you have setup parser1 then it is active until -config-var-parser option is used again. That is, in our example, the values of CFG1 and CFG3 are parsed using parser1 and the value of CFG2 is parser using parser2.

Generalized strictness

This feature allows one to customize strictness for all positions or for individual positions.

Example 1:

  syntax Exp ::= Exp "+" Exp [left, strict<k>(all(context('rvalue)))]
  syntax K ::= rvalue(K) [strict,context(result(RVal))]
  syntax RVal ::= Int

The first line declares "+" strict in the <k> cell, on all arguments, wrapping all of them for evaluation with the 'rvalue context wrapper. The second line declares rvalue which is itself strict, and which when used as a context cools for results of type RVal (defined on the third line).

Example 2:

  syntax Exp ::=   Exp "=" Exp [right, strict(1, 2(context(rvalue))), assign]
  syntax Exp ::= "rvalue" "(" Exp ")" [context(result(RVal)), strict, klabel(rvalue)]

The first line declares "=" strict (in the <k> cell), evaluating the first argument to the default computation result type (KResult) and the second argument to the default wrapped with the rvalue context wrapper (note the klabel annotation on line 2).

General syntax:

StrictAttribute ::=  StrictHead | StrictHead "(" StrictPositions ")" 
StrictHead ::= StrictType | StrictType "<" CellNames ">"
StrictType ::= "strict" | "seqstrict"
CellNames ::= List{CellName,","}

StrictPositions ::= List{StrictPosition, ","}
StrictPosition ::= Position | Posistion "(" StrictPositionAttributes ")"
Position ::= Int | "all" | "other"
StrictPositionAttributes ::= List{StrictPositionAttribute, ","}
StrictPositionAttribute ::= "context" "(" KLabel ")" 
                          | ContextPositionAttribute

ContextPositionAttribute ::= "result" "(" KSort ")"
                           | "cell" "(" CellNames ")"

ContextContextAttribute ::= "context" "(" ContextPositionAttributes ")"
ContextPositionAttributes ::= List{ContextPositionAttribute, ","}

The semantics when transforming these into context is based on the following principles

  1. the "<" CellNames ">" arguments of StrictHead are transformed into "cell" "(" CellNames ")" attributes for all positions.
  2. StrictPosition declarations are processed in order. If duplicates are found, the latter update the attributes values they redeclare.
  3. "all" refers to all positions, updating the existing ones and adding those missing
  4. "other" refers to all positions which have not been declared up to encountering "other"
  5. If "context" attribute is found, the RHS of the context rule is wrapped with the corresponding label and the subattributes of the "context" attribute are used to update the current attributes.
  6. The attributes passed to the context rule for a given position are obtained by updating the production attributes with the StrictPositionAttributes (perhaps updated using the context attributes as specified at (5)) corresponding to that position.

Note: Currently, generalized strictness does not apply to strict labels

[macro] / [function] / [anywhere] rules

What is the difference between 'macro', 'anywhere', and 'function'? The generated maude code seems to be same except metadata. Who makes, if any, a difference?

They all generate the same code now. However, they are semantically different:

macro should only be used for desugaring, and should be applied to programs before any evaluation is started.

function should only be used for annotating function declarations

anywhere should be used for rules which we want to apply anywhere, disabling configuration concretization for them.

Function rules are anywhere rules, but anywhere rules are not constrained to functions.

The [function] attribute can be attached to syntax declarations. All the rules corresponding to that function are annotated automatically as [anywhere] which is only a rule attribute. However, if we want to say that a rule applies anywhere without being related to a function declaration, we simply annotate it with [anywhere] and the contextual transformation will not affect that rule. Therefore the functionality of [function] is implemented using [anywhere], and [anywhere] is an attribute which does not fill the context for that rule.

Does desugaring also apply to rules?

It surely applies to programs. Concerning rules, that has been debated in the past, but I'm not sure we reached a definite conclusion.

The best way to understand these is to forget about the Maude backend. They translate the same way in Maude because Maude does nor provide us with the subtle differences that we need here.

macro: yes, they are supposed to apply statically, before any rewriting starts, both on the program and on the semantic rules. So in terms of performance, they are supposed to not count (i.e., their amortized cost is zero).

function: is a label has this attribute, then it is as if it is a builtin function, the only difference being that it is "implemented" using K rules. We even considered changing the language in which functions are implemented, and following a functional style instead. So they are meant to be very fast, almost instantaneous, when compared to the other rules. Different backends are free to implement the functions different ways. The symbolic execution engine also considers than as part of the axiomatization of the underlying domain in which semantic rules are applied. So mathematically speaking, functions do not count as semantic computational steps.

anywhere: these are just rules which can apply anywhere they match, so they are expected to be comparatively very slow. Their application counts as computational steps, so they are interleaved etc. in formal analyses. It is hard to reason with them in reachability logic. One should avoid such rules by any means if possible, and use strictness and normal rules instead.

Kompile: isSort predicates

Currently the compilation process assumes that all KLabels having 'is' as a prefix are of the form 'isSort' where Sort is the name of a syntactic category. Therefore users defining their own labels should be aware of this and understand that defining and using an isXXX label which is not related to a sort XXX will result in undefined behavior.

Kompile: Adding files to a definition

Is done using the require keyword at the beginning of a file, followed by a string with the path of the file, absolute, or relative to the current file. If the required file is not found in the relative path, then it is looked for in /k/include directory, where the framework stores the predefined builtin files. The names are solved by the JVM so relative paths should be portable on Windows, Linux, and OSX.

Kompile: Automatic module inclusion

During kompilation, some predefined information is needed in order to parse a definition. All this information is stored in files in the /include directory from the distribution. The autoincluded.k is the root file and it is always required, unless the --no-prelude option is provided. From that file, the AUTO-INCLUDED-MODULE is automatically added to all the user defined modules. The single exception is the separate syntax module (usually main module suffixed by -SYNTAX, or specified in the command line). In this case only the AUTO-INCLUDED-MODULE-SYNTAX module is added, which contains only a minimal amount of syntax usually required to define a programming language. The user can always extend this behavior by manually adding the required modules, but the program parser may be corrupted with unwanted syntactic productions.

Kompile: sort inference for variables, and disambiguation in rules.

Rewrite rules, are by far the hardest part of a definition to parse. It combines at runtime the grammar of the defined language, and the grammar of K. This means the resulting grammar will most likely be ambiguous. The parser has some builtin disambiguation filters which try to find a parse which validate the following conditions:

  • => (rewrite) is greedy (has the lowest priority)
  • ~> has the lowest priority after rewrite
  • casting has the highest priority
  • named variables must have the same sort on every location.
  • for overloaded operators we consider only the most general production that fits in the current context. Two productions are considered to be overloaded iff all terminals match, and iff all non-terminal sorts are subsorted, or equal.

The variable type inferencer tries to maximize the sorts for each variable in order to make the rule match as many cases as possible. Example:

rule A + B => A +Int B

Commonly found in definitions, is the semantics of addition. In the left hand side, A and B can have sort Exp, but on the right hand side can only be Int. The final sorts inferred by default will be Int. The default inferred sorts for variables, can be seen by running kompile with the -w all option.

Sort inference tricky example

This may not be the greatest for the manual, because it came from a bug. Can we just explain how it's supposed to work without explaining the problem instead? Supplementing this with an example where a possible parse is correctly rejected because there are no variable assignments at all might be good (maybe more closely connected to a discussion of parsing ambiguity?).

Here is a particularly tricky example of sort inference, demonstrating one interaction between parsing ambiguity and sort ambiguities:

require "domains.k"
module TEST-SYNTAX
syntax T ::= "wrap" M [klabel(wrapM)] | "wrap" X [klabel(wrapX)]
syntax A
syntax B
syntax M ::= A | B
syntax N ::= A | B | X // make A and B both maximal in the intersection of N and M
syntax X
syntax KItem ::= label(N,T)
endmodule
module TEST
imports MAP
imports TEST-SYNTAX
configuration <k>$PGM:K</k>
rule X => label(X,wrap X:A)
rule X => label(X,wrap X:B)
rule X => label(X,wrap X:X)
rule X => label(X,wrap X)
endmodule

Sort inference fails for the last rule, saying X could have any of the three sorts A, B, and X. The previous three lines demonstrate that each of those choices are legal. This requires considering parsing and sorting together because X can have sorts A or B if the text wrap is parsed to the label wrapM by the first production, and the sort X if the text wrap is parsed to the label wrapX by the second production.

This example came from a bug where we tried to separately infer variable types in each possible parse. In the wrapX parse, the constraint X:X from wrapX is tighter than the constraint X:N from the first argument of local, leaving X:X as the most general sort assignment. In the wrapM parse, the constraint X:M from wrapM intersects with X:N to leave X:A and X:B as the two highest legal sorts, with neither more general than the other. After dropping that parse for ambiguity, the choice X:X was selected as an inferred sort, even though we see both A and B are legal and neither is generalized by X:X.

Kompile: Automatic keyword rejection

To make defining languages easier and small for simple definitions, the terminals in productions are automatically rejected from being any lexical production. This is a simple way to ensure that keywords don't produce ambiguities. Example:

    Stmt ::= "return" ";"
           | Exp ";"
    Exp  ::= Id ...
    Id   ::= Token{[a-zA-Z][a-zA-Z0-9]*}

Because "return" can match the regular expression of Id, it may produce ambiguities. The solution is to say that "return" is not allowed to parse as an Id. There might be cases when the default behavior is not wanted. It can be simply be overridden by declaring another time the desired production:

    Id ::= "return"

An additional way to reject unwanted parses, is to write:

    Float ::= Token{[0-9]* "." [0-9]*}
    Float ::= "." [reject]

This will reject "." only from the Float category. It will not affect other sorts. The default behavior can however be overridden by adding the attribute 'noAutoReject' to a token declaration:

    Bubble ::= Token{~[\n\r\t]+} [noAutoReject]

This will no longer reject the terminals in the definition, and the user will have to reject the desired keywords, manually.

Another example related with the automatic keyword rejection

Suppose that you want to define your own floating point numbers.

For example, suppose that a.k is given as follows:

$ cat a.k
module A
syntax MyFloat ::= Token{ "Infinity" } [notInRules, notInGround]
rule S:String => #parse(S,"MyFloat")
endmodule

And 1.a:

$ cat 1.a
"Infinity"

Then, krun with the java backend results in #noparse:

$ ~/dev/k/bin/kompile --backend java a.k
$ ~/dev/k/bin/krun 1.a
[1, 20.47 ms]
<k>
    #noparse
</k>

This is because the built-in float.k contains the following keyword declaration,

  syntax #Float ::= "NaN" | "Infinity"

which triggers the automatic keyword rejection, rejecting "Infinity" to be parsed as MyFloat.

In order to fix this problem, you can use noAutoReject attribute for MyFloat:

syntax MyFloat ::= Token{ "Infinity" } [notInRules, notInGround, noAutoReject]

[owise] rules only work for functions in the Java backend

TODO(Cosmin): implement it properly using strategies

Java backend non-termination Behavior

When I kompile these three sets of rules, the java backend will result in non-termination:

non-termination due to sorts:

module C-LTLMC

     syntax C ::= TypeSpecifier
     syntax TypeSpecifier ::= "Void"

     syntax Expression ::= "-" K [prec(22)]
                         | "+" K [prec(22)]
                         | "!" K [prec(22)]
                         | "~" K [prec(22)]
                         | "*" K [prec(22)]
                         | "&" K [strict, prec(22)]

     syntax CId ::= "#NoName" // new
     syntax C ::= CId
     syntax Expression ::= CId
     syntax Bits ::= Int
     syntax SymLoc ::= "NullPointer"
     syntax SymBase ::= string(String)
     syntax CSize ::= Int
     syntax CValue ::= CSize
     syntax Statement ::= "loopMarked"
     syntax Type ::= type(K) [function]
     syntax FieldInfo ::= fieldInfo(List, Map, Map)
     // Ltl atomic propositions.
     syntax ValueProp ::= Int  
                        | #ptr(SymLoc, Type)
                        | #struct(List, FieldInfo)
                        | #union(List, FieldInfo)

     syntax CProp ::= ValueProp | CId 
                    | "-" CProp
                    | "+" CProp
                    | "*" CProp
                    > CProp "[" CProp "]"
                    > CProp "+" CProp [left]
                    | CProp "-" CProp [left]

     syntax Prop ::= CProp

     syntax ValueProp ::= ltlRVal(Bag, Prop) [function]


     rule ltlRVal(_, V:ValueProp) => V

     rule ltlRVal(B:Bag, - P:CProp) 
          => ltlRVal(B, 0 - ltlRVal(B, P))

     rule ltlRVal(B:Bag, + P:CProp) 
          => ltlRVal(B, P)

     rule ltlRVal(B:Bag, L:CProp[R:CProp])
          => ltlRVal(B, * (L + R))

     rule ltlRVal(B:Bag, X:CId) => ltlRVal(B, * ltlLVal(B, X))

     // Not so interested in l-values.
     syntax ValueProp ::= ltlLVal(Bag, CId) [function]
     syntax ValueProp ::= "ltlLVal'" "(" Bag "," String ")" [function]

endmodule

module C-LTLMC

     syntax C ::= AttributeWrapper(K, K) [klabel('AttributeWrapper)]

     syntax Expression ::= "-" K [prec(22)]
                         | "+" K [prec(22)]
                         | "!" K [prec(22)]
                         | "~" K [prec(22)]
                         | "*" K [prec(22)]
                         | "&" K [strict, prec(22)]

     syntax CId ::= "#NoName" // new
     syntax C ::= CId
     syntax Expression ::= CId
     syntax Bits ::= Int
     syntax SymLoc ::= "NullPointer"
     syntax SymBase ::= string(String)
     syntax CSize ::= Int
     syntax CValue ::= CSize
     syntax Statement ::= "loopMarked"
     syntax Type ::= type(K) [function]
     syntax FieldInfo ::= fieldInfo(List, Map, Map)
     // Ltl atomic propositions.
     syntax ValueProp ::= Int  
                        | #ptr(SymLoc, Type)
                        | #struct(List, FieldInfo)
                        | #union(List, FieldInfo)

     syntax CProp ::= ValueProp | CId 
                    | "-" CProp
                    | "+" CProp
                    | "*" CProp
                    > CProp "[" CProp "]"
                    > CProp "+" CProp [left]
                    | CProp "-" CProp [left]

     syntax Prop ::= CProp

     syntax ValueProp ::= ltlRVal(Bag, Prop) [function]


     rule ltlRVal(_, V:ValueProp) => V

     rule ltlRVal(B:Bag, - P:CProp) 
          => ltlRVal(B, 0 - ltlRVal(B, P))

     rule ltlRVal(B:Bag, + P:CProp) 
          => ltlRVal(B, P)

     rule ltlRVal(B:Bag, L:CProp[R:CProp])
          => ltlRVal(B, * (L + R))

     rule ltlRVal(B:Bag, X:CId) => ltlRVal(B, * ltlLVal(B, X))

     // Not so interested in l-values.
     syntax ValueProp ::= ltlLVal(Bag, CId) [function]
     syntax ValueProp ::= "ltlLVal'" "(" Bag "," String ")" [function]

endmodule

module C-LTLMC

     syntax C ::= TypeSpecifier

     syntax TypeSpecifier ::= TAtomic(K, K)
          [klabel('TAtomic)]
     syntax Expression ::= "-" K [prec(22)]
                         | "+" K [prec(22)]
                         | "!" K [prec(22)]
                         | "~" K [prec(22)]
                         | "*" K [prec(22)]
                         | "&" K [strict, prec(22)]

     syntax CId ::= "#NoName" // new
     syntax C ::= CId
     syntax Expression ::= CId
     syntax Bits ::= Int
     syntax SymLoc ::= "NullPointer"
     syntax SymBase ::= string(String)
     syntax CSize ::= Int
     syntax CValue ::= CSize
     syntax Statement ::= "loopMarked"
     syntax Type ::= type(K) [function]
     syntax FieldInfo ::= fieldInfo(List, Map, Map)
     // Ltl atomic propositions.
     syntax ValueProp ::= Int  
                        | #ptr(SymLoc, Type)
                        | #struct(List, FieldInfo)
                        | #union(List, FieldInfo)

     syntax CProp ::= ValueProp | CId 
                    | "-" CProp
                    | "+" CProp
                    | "*" CProp
                    > CProp "[" CProp "]"
                    > CProp "+" CProp [left]
                    | CProp "-" CProp [left]

     syntax Prop ::= CProp

     syntax ValueProp ::= ltlRVal(Bag, Prop) [function]


     rule ltlRVal(_, V:ValueProp) => V

     rule ltlRVal(B:Bag, - P:CProp) 
          => ltlRVal(B, 0 - ltlRVal(B, P))

     rule ltlRVal(B:Bag, + P:CProp) 
          => ltlRVal(B, P)

     rule ltlRVal(B:Bag, L:CProp[R:CProp])
          => ltlRVal(B, * (L + R))

     rule ltlRVal(B:Bag, X:CId) => ltlRVal(B, * ltlLVal(B, X))

     // Not so interested in l-values.
     syntax ValueProp ::= ltlLVal(Bag, CId) [function]
     syntax ValueProp ::= "ltlLVal'" "(" Bag "," String ")" [function]

endmodule

non-termination due to macros:

module C-SYNTAX

     syntax C ::= CId
     syntax CId ::= "#NoName" // new
     syntax Expression ::= "*" K [prec(22)]
                         > K "+" K [prec(33), left]
     syntax Expression ::= CId

     syntax Expression ::= K "[" K "]"
     syntax Expression ::= K "." CId
     syntax Expression ::= K "->" CId

     rule K:K -> F:CId => (* K) . F [macro, structural]
     rule E1:K[E2:K] => *(E1 + E2) 
          [macro, structural]
endmodule

module C-LTLMC
     imports C-SYNTAX
     syntax ValueProp ::= Int  

     syntax CProp ::= ValueProp | CId 
                    | "*" CProp
                    > CProp "[" CProp "]"
                    | CProp "." CId
                    | CProp "->" CId
                    > CProp "+" CProp [left]

     syntax Prop ::= CProp 

     syntax ValueProp ::= ltlRVal(Bag, Prop) [function]

     rule ltlRVal(_, V:ValueProp) => V

     rule ltlRVal(B:Bag, L:CProp[R:CProp])
          => ltlRVal(B, * (L + R))
endmodule

The key feature to cause this non-termination is as follow: First, I have a sort named "CProp" and have subsort "ValueProp". There are two functions under the subsort "ValueProp", which are very similar.

Second, The Identifier sort "CId" is a subsort of a lot of sorts.

Third, I have two groups of expressions which belong to two different sorts, and they have common operators.

Fourth, I have a sort "C", such that "CId" is a subsort of it and there is another Sort "TypeSpecifier" which is a subsort of "C" and it has some operators which will not be used in anywhere in the context.

krun in java backend to distinguish cat option for parser with default option

When I run c semantics with command:

krun -cOPTIONS="(SetItem(NOLINK) .Set)" -cOBJ1=".Bag" -cOBJ2=".Bag"  -d c11-translation-kompiled ctype.c11-translation

It works fine in some small programs, however if I run command:

krun -cOPTIONS="(SetItem(NOLINK) .Set)" -cOBJ1=".Bag" -cOBJ2=".Bag" --parser "kast --parser ground"  --parser="cat" -d c11-translation-kompiled ctype.c11-translation

The java backend krun will complain. Something must be wrong here. The output is like:

Exception in thread "main" java.lang.ClassCastException: org.kframework.kil.BackendTerm cannot be cast to org.kframework.backend.java.kil.Term
    at org.kframework.backend.java.symbolic.KILtoBackendJavaKILTransformer.visit(KILtoBackendJavaKILTransformer.java:240)
    at org.kframework.backend.java.symbolic.KILtoBackendJavaKILTransformer.visit(KILtoBackendJavaKILTransformer.java:78)
    at org.kframework.kil.KSequence.accept(KSequence.java:46)
    at org.kframework.kil.AbstractVisitor.visitNode(AbstractVisitor.java:92)
    at org.kframework.backend.java.symbolic.KILtoBackendJavaKILTransformer.visit(KILtoBackendJavaKILTransformer.java:279)
    at org.kframework.backend.java.symbolic.KILtoBackendJavaKILTransformer.visit(KILtoBackendJavaKILTransformer.java:78)
    at org.kframework.kil.Cell.accept(Cell.java:287)
    at org.kframework.kil.AbstractVisitor.visitNode(AbstractVisitor.java:92)
    at org.kframework.backend.java.symbolic.KILtoBackendJavaKILTransformer.visit(KILtoBackendJavaKILTransformer.java:326)
    at org.kframework.backend.java.symbolic.KILtoBackendJavaKILTransformer.visit(KILtoBackendJavaKILTransformer.java:78)
    at org.kframework.kil.Bag.accept(Bag.java:69)
    at org.kframework.kil.AbstractVisitor.visitNode(AbstractVisitor.java:92)
    at org.kframework.backend.java.symbolic.KILtoBackendJavaKILTransformer.visit(KILtoBackendJavaKILTransformer.java:269)
    at org.kframework.backend.java.symbolic.KILtoBackendJavaKILTransformer.visit(KILtoBackendJavaKILTransformer.java:78)
    at org.kframework.kil.Cell.accept(Cell.java:287)
    at org.kframework.kil.AbstractVisitor.visitNode(AbstractVisitor.java:92)
    at org.kframework.backend.java.symbolic.KILtoBackendJavaKILTransformer.transformTerm(KILtoBackendJavaKILTransformer.java:134)
    at org.kframework.backend.java.kil.Term.of(Term.java:50)
    at org.kframework.backend.java.symbolic.JavaSymbolicKRun.javaKILRun(JavaSymbolicKRun.java:96)
    at org.kframework.backend.java.symbolic.JavaSymbolicKRun.internalRun(JavaSymbolicKRun.java:81)
    at org.kframework.backend.java.symbolic.JavaSymbolicKRun.run(JavaSymbolicKRun.java:76)
    at org.kframework.krun.Main.normalExecution(Main.java:372)
    at org.kframework.krun.Main.execute_Krun(Main.java:1454)
    at org.kframework.main.Main.main(Main.java:64)

Conversion between tokens and strings

You can use #tokenToString and #parseToken, built-in functions for conversion between tokens and strings. For example,

syntax Name ::= Token{[\_\$A-Za-z][\_\$A-Za-z0-9]*}   [notInRules]

syntax String ::= StringOfName(Name) [function, notInPrograms]
rule StringOfName(N:Name) => #tokenToString(N)

syntax Name ::= NameOfString(String) [function, notInPrograms]
rule NameOfString(S:String) => #parseToken("Name", S)

Once you get the string representation of the token, you can use the built-in string library functions to manipulate the string, such as substrString, findString, replaceAll, lengthString, Char2String, and String2Char.

Enriched strict attributes

Here are the declarations that use the enriched syntax for attributes:

  syntax Exp ::= "*" Exp [strict(all(context(rvalue), result(Ref))), indirect]
               > Exp "+" Exp [strict(all(context(rvalue), result(RVal))), left, plus]
               > Exp "=" Exp [strict(1, 2(context(rvalue), result(RVal))), right, assign]

  syntax KItem ::= "rvalue" "(" K ")" [context(result(Val)), strict, klabel(rvalue)]

Without the support for the above declarations, a user must explicitly write the heating&cooling rules:

  syntax Exp ::= "*" Exp [ indirect]
               > Exp "+" Exp [left, plus]
               > Exp "=" Exp [right, assign]

  syntax KItem ::= "rvalue" "(" K ")"  [strict, klabel(rvalue)]

  rule rvalue(K:K) => K ~> rvalue(HOLE)
    requires (isVal(K) =/=K true) 
    [heat]

  rule K:KItem ~> rvalue(HOLE) => rvalue(K)
    requires isVal(K)
    [cool]

  rule * K:KItem => rvalue(K) ~> * HOLE
    requires isRef(K) =/=K true
    [heat]
  rule rvalue(K) ~> * HOLE => * K
    requires isRef(K)
    [cool] 

  rule KL:K + KR:K => rvalue(KL) ~> HOLE + KR
    requires isRVal(KL) =/=K true
    [heat]
  rule rvalue(KL:K) ~> HOLE + KR:K => KL + KR
    requires isRVal(KL)
    [cool]
  rule KL:K + KR:K => rvalue(KR) ~> KL + HOLE
    requires isRVal(KR) =/=K true
    [heat]
  rule rvalue(KR:K) ~> KL:K + HOLE => KL + KR 
    requires isRVal(KR)
    [cool]

  context HOLE = _
  rule KL:K = KR:K => rvalue(KR) ~> KL = HOLE
    requires isRVal(KR) =/=K true
    [heat]
  rule rvalue(KR:K) ~> KL:K = HOLE => KL = KR 
    requires isRVal(KR)
    [cool]

We see that for each binary operator we have to write four rules. For 10 such operators, there are 40 rules. In C and C++ there more than 10 operators.

The meaning of the enriched syntax can be briefly explained as follows: The following declarations describe the situation with a binary operator that requires special strict attributes for each argument:

  syntax Exp ::= Exp "op" Exp [strict(1(context(ctx1),result(Res1)),
                                      2(context(ctx2),result(Res2))),
                               left, oplabel]

  syntax KItem ::= ctx1(K) [context(result(ResCtx1)), strict, klabel(ctx1)]
  syntax KItem ::= ctx2(K) [context(result(ResCtx2)), strict, klabel(ctx2)]

This is equivalent to the following declarations and rules (they are self-explained):

  syntax Exp ::= Exp "op" Exp [left, oplabel]

  syntax KItem ::= ctx1(K) [klabel(ctx1)]
  syntax KItem ::= ctx2(K) [klabel(ctx2)]

  //heating&cooling rules for the operator
  rule  KL op KR => ctx1(KL) ~> HOLE op KR requires isRes1(KL) =/=K true [heat]
  rule  ctx1(KL) ~> HOLE op KR => ctx1(KL) op KR requires isRes1(KL) [cool]
  rule  KL op KR => ctx2(KR) ~> KL op HOLE requires isRes2(KR) =/=K true [heat]
  rule  ctx2(KR) ~> KL op HOLE => KL op KR requires isRes2(KR) [cool]

  //heating&cooling rules for the contexts
  rule ctx1(K) => K ~> ctx1(HOLE) requires isResCtx1(KL) =/=K true 
  rule K ~> ctx1(HOLE) => ctx1(K) requires isResCtx1(KL)
  rule ctx2(K) => K ~> ctx2(HOLE) requires isResCtx2(KL) =/=K true 
  rule K ~> ctx2(HOLE) => ctx2(K) requires isResCtx2(KL)

There is the constraint as Res1 < ResCtx1 and Res1 < ResCtx1. Of course, we may have ctx1 equal to ctx2 if ResCtx1 = ResCtx2.

I think that mechanism is simple and it can be implemented in K, not necessarily in this form.

Nailgun

To run K with nailgun, set the K_NAILGUN environment variable, and then spawn a server process to handle requests using the kserver command. It is recommended that the user start the server with a high timeout value to avoid disconnects which occur due to server latency. For example, to listen on port 2113 (the default) with a timeout of 100 seconds, run kserver 2113 100000.

Following the start of the server, users can run commands against the server exactly identical to if they were running commands locally, so long as the K_NAILGUN variable is set. If it is not set, it defaults to running the command in a new process.

It is encouraged that users run large test suites using nailgun, as the increase in speed associated with decreased startup costs per process is quite dramatic and can reach as high as 7-8x faster krun on small programs.

Fresh constants (!X:Sort)

The !X:Sort notation generates fresh Sort constants based on an integer counter. Internally, the backend manages the counter and passes it to a function that returns the fresh constant. By default, K provides such functions only for Int and Id. One can define such functions for other sorts as follows:

  syntax Sort ::= freshSort(Int)    [freshGenerator, function]
  rule freshSort(I:Int) => ...

For instance, the function for Id is:

  rule freshId(I:Int) => #parseToken("Id", "#" +String Int2String(I))

The convention is that fresh constants start with # and fresh variables with _, to keep them distinct.

KLabel arity

If you declare a production with no nonterminals of sort KLabel, this is a KLabel constant declaration. A KLabel constant declaration is treated the same as a declaration of a KItem, except that it does not have information pertaining to the arity and sort of its children. If sort information in required, they are assumed to be of sort K, and if the arity is required, an error is thrown unless the production is annotated with an "arity" attribute.

For example,

syntax KLabel ::= "foo" [arity(2)]

Is treated the same as the production

syntax KItem ::= foo(K, K) [klabel(foo)]

Sort Annotations

Annotating a term with :Sort translates to a semantic check, so the rule won't apply if the subterm does not have the expected sort or subsort.

There are other sort annotations which are purely syntactic and only affect how the parsed terms are being statically type checked. To change the sort of a production, the suffix :>Sort can be used. The inner term can have any sort, but the production can be placed anywhere Sort is expected: A:>Exp + B will allow A to have any sort, but B has to be a subsort of Exp.

The dual is <:Sort and forces the inner production to have exactly the type specified (no subsorts). This can be useful sometimes when ambiguous productions have related sorts. For example, a common ambiguity in K is related to overloaded lists where we have Exps ::= List{Exp,","} and Ids ::= List{Id,","}. Also we have the subsort Ids < Exps. For the following input: (A, B)<:Exps, both productions will be considered, but we can annotate the expression with the desired sort to choose the desired production.

Combining the two behaviors can be done with the production ::Sort or <:>Sort. In practice these annotations are productions of the form:

Sort ::= Sort "::Sort"
Sort ::= Sort "<:>Sort"
K    ::= Sort "<:Sort"
Sort ::= K    ":>Sort"

and have the highest priority of all productions (A + B:Sort will apply only to B).

KDoc

To generate a pdf of a compiled K definition, run the command kdoc from within the directory in which the -kompiled directory is located. For more information on the options that can be specified to kdoc, run kdoc --help.

Symbolic Execution

To execute a program symbolically using unification instead of pattern-matching, simply add the --symbolic-execution flag to krun.

Rule Auditing

To determine why a particular rule (for example, one at line 50 of file.k), did not apply at a particular numbered rewrite step, (for example, step 100), you can use the following command:

krun <program> --audit-file file.k --audit-line 50 --audit-step 100

Macro side-conditions

When we add the appropriate check, it will be an error for macros to have side conditions.

Streaming Cell Contents

Here is a high-level design of the IO capabilities of K:

  1. Cells can be linked to streams, which can come from stdin, stdout, sockets, files, etc., and in general from any URIs. This link-age of a cell to a stream is the only thing that the user has to do, and that is done when the configuration is defined.

  2. Rules can match (or read) or write items in those cells, just like in any other cells. When a rule matches (or reads), a term of a certain sort in a streamed cell, for example, the parser will be invoked to read an appropriate sequence of characters from the streamed cell. When a rule writes a term of a certain sort to a streamed cell, then the pretty-printer will be invoked to stream than term as a sequence of characters.

No, this is not doomed to be slow! It will be slow only if implemented naively. There is no reason to not "parse" primitive values in an input stream using the fastest POSIX or whatever specialized library functions are available on the current platform, and similarly for writing things to cells. If one wants to read an expression of sort Exp or a program of sort Pgm, well, that's a different story, and things can be slower in that case (it depends of the complexity of the grammar; see, for example, our paper on monitoring CFG specifications: http://fsl.cs.illinois.edu/index.php/Efficient_Monitoring_of_Parametric_Context-Free_Patterns).

Rules may not match on the contents of an output cell. Anything added to an output cell might be output immediately. A rule like this will be rejected, and couldn't be relied on anyway:

<out>`ListItem("string") => ListItem("replacement")`...</out>

Strict Compilation

By default, sort checks are inferred for all variables regardless of whether they are typed with :: or :. This behavior is called "strict compilation", which assists debugging by ensuring that a term which does not type correctly will fail fast. This behavior can be changed with the --non-strict option however. When this option is enabled, sort checks occur only for those terms explicitly typed with :. This improves the performance in many cases by reducing the number of terms that need to compute their sort information.

--exit-code

This flag sets the exit code on the command line to the value matched by the single integer variable in the pattern specified. Exit codes 111, 112, and 113 serve dual purpose and can also refer to errors returned by K.