My solutions to the second esolang reverse engineering contest (Nuova).
The event was based on the language Nuova, defined by its implementation (in this repository as raw-nuova.c
). The tasks were as follows:
- Create a program that prints "Hi" without a linefeed.
- Create a program that prints "Hello, World!" with a linefeed.
- Make a
cat
program that does not terminate. - Make a
cat
program that terminates on EOF. - Make a
fizzbuzz
program that displays a specified amount of sequence items (as a decimal number, 0-10000). - Prove that the language is Turing-complete (either by creating an interpreter for a TC language in it or compiling a TC language into it). Assume that the registers and memory are unbounded and provide reasoning behind your proof.
This tasks would be trivial in any practical programming language, but completing fizzbuzz took me over 40 hours of work.
Nuova is a relatively simple language. Each instruction is executed one-by-one in order from start to end. The key difficulty with Nuova is that invalid instructions directly hash all three of the registers (a
, b
, and c
), and the programmer only has full control over every second instruction, so your registers typically get hashed frequently.
raw-nuova.c
: the definitive Nuova interpreter. Run withbin/prog
as the input viamake run-raw-nuova
, or./bin/nuova fizzbuzz
to run a different file as input.nuova.c
: a formatted version with debugging statements (prints tologs/nuova.log
), memory limit, and runtime limit. Run withbin/prog
as the input viamake run-nuova
.sources/*.s
: my solutions to the tasks, in an assembly-like format.compiler.c
,compiler.h
: my primary compiler for generating Nuova code. Generate Nuova bytecode atbin/prog
for a particular program usingmake run-compiler prog=cat-terminating
.solver.py
: my first attempt, using Z3 (though Z3 is no faster than exhaustive search since the hash has no useful patterns)- See the releases page for binaries of programs for all six tasks.
The hash itself is a three-round multiply-xorshift hash, named triple32
from Hash Prospector, which notes "this hash function is indistinguishable from a perfect PRF (e.g. a random permutation of all 32-bit integers)." Nuova names this function P
. Fortunately, since P
is collisionless, it is invertible. Also P(0) = 0
, which has some niche use.
Program and data is stored on the same "unbounded" (bounded by 32 bit integer addressing) tape of unsigned 32-bit integers (call it mem
). In each step, the memory at the instruction pointer (initially 0) is read like instruction = mem[ip]
, and the instruction pointer is incremented. If that instruction is one of the 23 valid instructions, then the corresponding statement occurs (described below) and the next step begins. Otherwise, all three of the registers (a
, b
, and c
) get hashed through P
, and the next step begins.
Three instructions are load-immediate:
0x0F: a = mem[ip++];
0x1F: b = mem[ip++];
0x2F: c = mem[ip++];
Three instructions are jump-immediate. 0x3F
is unconditional. 0x4F
and 0x5F
are conditional, but note they also conditionally increment the instruction pointer (so a fall-through will typically end up hashing the registers)
0x3F: ip = mem[ip++];
0x4F: if (a <= b) ip = mem[ip++];
0x5F: if (b >= c) ip = mem[ip++];
Three instructions are memset-immediate. After setting the memory (in the cell immediately after the instruction), they increment the instruction pointer again
0x00: mem[ip] = a; ip++;
0x10: mem[ip] = b; ip++;
0x20: mem[ip] = c; ip++;
Three instructions are random-access memset:
0x30: mem[a] = b;
0x40: mem[a] = c;
0x50: mem[a] = mem[a]; // why?
Four instructions move registers through a
:
0x60: b = a;
0x70: c = a;
0x80: a = b;
0x90: a = c;
One instruction is an unconditional jump to register:
0xA0: ip = a;
Two instructions do arithmetic using b
and c
as arguments, with a
as the result (I wish there was a += b
):
0xB0: a = b + c;
0xC0: a = b - c;
Two instructions do I/O on individual characters (bytes, C locale). Note putchar
only cares about the lower 8 bits of a
, and getchar
returns -1 = 0xFFFFFFFF
for EOF.
0xD0: putchar(a);
0xE0: a = getchar();
There is one instruction to exit cleanly, like exit(0);
in C, or return 0;
in the main function in C:
0xF0: exit(0);
We don't talk about 0x6F
.
Any other memory value executes a = P(a); b = P(a); c = P(c);
.
There is no instruction for reading from an arbitrary memory address.
The Nuova interpreter takes a single binary file as input, but its operation depends on 32-bit integers in all registers. What gives?
For the most part, at program start, mem[i] = P(i) ^ c
, where c
is the i
th byte of the input file. Since each byte is only one byte, and the result of the hash P(i)
can be any 32-bit value, the programmer really only has control over the bottom byte of the startup memory. The top three bytes are all exactly the top three bytes of P(i)
.
Unfortunately the instructions rely on the entire 32-bit value of instruction = mem[ip]
being a correct value, so the top three bytes must be 0. To allow the Nuova programmer control over this, Nuova has a mechanism for setting 32-bit values. If the i
th byte of the input file meets c & 0xF = 0xF
(so the bottom nibble is set), then the next four bytes are read as a 32-bit integer v
, to set mem[i + 1] = P(i + 1) ^ v
. Thus the programmer has full control over mem[i + 1]
by choosing the four bytes such that v = P(i + 1) ^ k
, to get mem[i + 1] = k
.
We don't talk about the last cell that gets set.
Suppose the input file is the sequence of five bytes 0F 04 27 41 FC
.
- Nuova starts by setting
mem[0] = P(0) ^ 0x0F
. Fortunately,P(0) = 0
, somem[0] = 0x0F
. - Since the byte
0x0F
has bottom nibble set, Nuova reads the next four bytes to setmem[1] = P(1) ^ 0x042741fc
. I chose that value carefully sinceP(1) = 0x42741d6
, somem[1] = 0x042741fc ^ 0x42741d6 = 0x2a = 42
.
The end result of all this is that mem[0] = 0x0F
and mem[1] = 42
. So Nuova executes as follows:
- Read
mem[0] = 0x0F
and conclude that it's time to doa = mem[ip++]
- Read
mem[1] = 42
to seta = 42
. - End with instruction pointer
ip = 2
A consequence of Nuova's input is that the programmer has no control over a cell of the startup memory unless the previous cell is initialized with a byte whose lowest nibble is 0xF
. Hence the bytecode input really only has control over every second cell, so hashing is inevitable.
So to write a program whose effect is putchar('h')
, we dont start by setting a = 104 = 0x68 = 'h'
like in the previous section. We start by setting a = 0xb7cd2d00
because we expect it to get hashed once.
Hence a full terminating program to print h
looks like
0F B3EA6CD6 FF C0F0B597 FF E33DE5D1
I've grouped together four bytes when they get read as a set of four bytes. After the initial processing, the startup memory looks as follows:
mem[0] = 0x0F; // a = mem[1]; ip = 2;
mem[1] = 0xB7CD2D00;
mem[2] = 0xF1DFE816; // a = P(a); b = P(b); c = P(c);
mem[3] = 0xD0; // putchar(a);
mem[4] = 0xD3A15F6A; // a = P(a); b = P(b); c = P(c);
mem[5] = 0xF0; // exit(0);
Execution proceeds as:
ip = 0
. Instruction0x0F
is read atmem[0]
, soa = mem[ip++]
is executed. Thena = 0xB7CD2D00
.ip = 2
. Since0xF1DFE816
is not a valid instruction, all the registers get hashed, soa = P(0xB7CD2D00) = 0x68 = 'h'
. We were able to force this by selectingmem[1] = inverseP(0x68)
.ip = 3
. Instruction0xD0
is read, soputchar(a)
. We geth
printed to stdout!ip = 4
. Hash all the registers.ip = 5
. Instruction0xF0
is read, soexit(0);
, terminating cleanly.
The previous example (putchar('h')
) is so short because P(0) = 0
, so the bytecode has full control over the first two memory locations. Since we want to follow up with putchar('i')
, it might seem like a tall task because it's hard in general to get memory to initialize such that mem[i] = 0x0F
and mem[i + 1] = inverseP('h')
. This was used earlier at i = 0
to start with a = inverseP('h')
in order to putchar(P(mem[i + 1])) = putchar(P(inverseP('h'))) = putchar('h')
.
But now we want to putchar(P(mem[i + 1]))
while only having control over the bottom byte of mem[i + 1]
. But this isn't too bad because we only care about the bottom byte of P(mem[i + 1])
since putchar(x) ≡ putchar(x & 0xF)
.
Exhaustive search gives three options here:
>>> [hex(c) for c in range(256) if P(c ^ P(6)) & 0xFF == ord('i')]
['0x9f', '0xae', '0xec']
Using 0x9F
would mess up input by treating the next four bytes as an integer, so let's choose 0xAE
and write the rest of the hi program around it:
0F B3EA6CD6 FF C0F0B597
FF E33DE52E AE FF 42CF8F4F
FF 082C8EEA
This leads the the following startup memory:
// putchar('h')
// 0F B3EA6CD6 FF C0F0B597
mem[0] = 0x0F; // a =
mem[1] = 0xB7CD2D00; // 0xB7CD2D00
mem[2] = 0xF1DFE816; // a = P(a); b = P(b); c = P(c);
mem[3] = 0xD0; // putchar(a);
// putchar('i')
// FF E33DE52E AE FF 42CF8F4F
mem[4] = 0xD3A15F6A; // a = P(a); b = P(b); c = P(c);
mem[5] = 0x0F; // a =
mem[6] = 0xA6385F3F; // 0xA6385F3F
mem[7] = 0x4F25D266; // a = P(a); b = P(b); c = P(c);
mem[8] = 0xD0; // putchar(a);
// exit(0)
// FF 082C8EEA
mem[9] = 0x8ED47961; // a = P(a); b = P(b); c = P(c);
mem[10] = 0xF0; // exit(0);
The execution starts as before with putchar('h')
. Then:
ip = 4
: hash all the registersip = 5
: Instruction0x0F
, soa = mem[ip++] = 0xA6385F3F
ip = 7
: hash all the registers, soa = 0x91CEC869
ip = 8
: Instruction0xD0
, soputchar(a)
. This is equivalent toputchar(0x69)
, which emits ani
character (byte) to stdout.
Then execution wraps up with the same exit(0)
approach as the previous example.
The "Hello, World\n"
task is pretty similar to "hi"
except there's more characters to print. For each byte to output, use the same 11-byte, 5-cell pattern:
- (cell
i
) Trash input byte0xFF
to control the next cell - (cell
i+1
) Four bytes to force0x0F
(a = mem[ip++]
) in the memory - (cell
i+2
) Careful bytec
such thatP(c ^ P(i + 2)) & 0xFF == 'e'
where'e'
is the byte you're trying to print and varies from'e'
to'\n'
. Note we must avoidc & 0xF == 0xF
to avoid interpreting the following four bytes as a new cell. - (cell
i+3
) Trash input byte0xFF
to control the next cell - (cell
i+4
) Four bytes to fource0xD0
(putchar(a)
) in the memory
Note this isn't possible for all starting indices i
: there are only 256 - 16 = 240
options for c
in step 3, and there are some collisions, so you might have to increment i
until a working c
is possible to choose.
Before we get to more complicated stuff, we're going to need some more notation. To help write out the solutions, I made an assembler to convert some human-readable code into Nuova bytecode.
For example, the "hi" program above can be written as
.putchar 'h'
.putchar 'i'
Ok I cheated a bit there because I made putchar a builtin directive for the assembler because it needs to look for the right index. Using other features, the same program can be written as
start: 0
// putchar('h')
a =
.val 0xB7CD2D00
.trash
putchar(a)
// putchar('i');
.trash
a =
.val 0xA6385F3F
.trash
putchar(a)
// exit(0);
.trash
exit(0)
There's a lot to unpack here. The code starts with start: 0
, which defines a label called "start" which is absolute-positioned at cell 0.
The next non-comment line is a =
, which is a mnemonic for the 0x0F
instruction a = mem[ip++]
. The full list of 23 mnemonics follows. Mnemonics must be written exactly as given; no spaces removed.
Full list of mnemonics
- 0x0F:
a =
- 0x1F:
b =
- 0x2F:
c =
- 0x3F:
ip =
- 0x4F:
if (a <= b) ip =
- 0x5F:
if (b >= c) ip =
- 0x6F:
a = sse(...)
- 0x00:
ms(ip, a)
- 0x10:
ms(ip, b)
- 0x20:
ms(ip, c)
- 0x30:
ms(a, b)
- 0x40:
ms(a, c)
- 0x50:
ms(a, mg(a))
- 0x60:
b = a
- 0x70:
c = a
- 0x80:
a = b
- 0x90:
a = c
- 0xA0:
ip = a
- 0xB0:
a = b + c
- 0xC0:
a = b - c
- 0xD0:
putchar(a)
- 0xE0:
a = getchar()
- 0xF0:
exit(0)
Next is the literal value .val 0xB7CD2D00
, which determines the memory in the initial tape. The assembler will pick the right sequence of four bytes in the input file to obtain that value.
After that is .trash
, which specifies a literal 0xFF
byte in the input bytecode. This allows the next cell to be forced exactly using four bytes.
The source file continues with putchar(a)
, another mnemonic. Once the assembler reaches the end, it ends up emitting the same bytecode that we handwrote:
0F B3EA6CD6 FF C0F0B597
FF E33DE52E AE FF 42CF8F4F
FF 082C8EEA
We got lucky in "hi" and "hello world" because you don't need much control on a putchar value. But for all the harder tasks, we're going to need some way to have full control over all 4 bytes of a register.
One way we accomplish this is by combining registers we have low control over to get a register with high control. For example, if we can vary one byte in the input file to get 240 choices for b
, and vary another byte to get 240 choices for c
, then running a = b + c
would give about 240 * 240 = 57600
choices for a
(slightly fewer due to collisions). This is slightly less than 2 bytes of control. To get close to a full 4 bytes of control, we run b = a
, vary a byte to get 240 choices for c
, then again run a = b + c
to get a little less than 3 bytes of control. Repeating one more time gives almost 4 bytes of control, which allows for a successful exchaustive search for most start indices.
In the assembly language, this would be written approximately
// example: start at 1
start_pos: 1
.trash
b =
.val ?? // 240 choices here
.trash
c =
.val ?? // 240 choices here
.trash
a = b + c
.trash
b = a
.trash
c =
.val ?? // 240 choices here
.trash
a = b + c
.trash
b = a
.trash
c =
.val ?? // 240 choices here
.trash
a = b + c
The ??
are stand-ins for actual values.
This exhaustive search ends up covering about 3/4 of the possible 32-bit output values. If the search fails, then increment the start index and try again. This is forceA and can be written via the directive .forceA value
.
(The exhaustive search is a little slow since it needs to search a few billion options on average, so the assembler caches the results in cache/tsfav
)
To get an infinite loop, we need to set the ip
back to an earlier position. We can get an empty (useless) infinite loop via
start: 0
ip =
.val 0
To get an infinite loop with a body, we can use the ip = a
(jump-to-register) instruction:
loop_start: 0x10
// loop body would go here
loop_end: 0x50
.trash
.forceA `0x10
.trash
ip = a
Here, the `
denotes inverseP
, so `0x10
is the same as the value 0x1CDB46F2
since P(0x1CDB46F2) == 0x10
. The first .trash
is just to deal with an assembler bug, and the second .trash
allows for four bytes to force the value of ip = a
.
When the program execution finishes running the instructions generated by the .forceA
directive, we have a = 0x1CDB46F2
. The .trash
instruction then hashes the registers to get a = P(0x1CDB46F2) = 0x10
. Then ip = a
unconditionally jumps back to loop_start
.
To familiarize you with another assembler syntax, &label
refers to the cell index of label
. Also, values support addition (at most one) or subtraction (at most one), so the above loop can be written as
loop_start: 0x10
// loop body would go here
loop_end: &loop_start + 0x40
.trash
.forceA `&loop_start
.trash
ip = a
We're working towards a cat program, which requires a = getchar();
and putchar(a);
in consecutive instructions. This is difficult in general. The way to do this is by setting the memory to force instructions.
For example, it would be nice for something like this to work:
start: 0x10
// 0xD0 = "putchar(a)"
.forceA `0xD0
b = a
.forceA `&putchar
.trash
ms(a, b)
getchar: 0x50
a = getchar()
putchar: &getchar + 1
.trash
The wishful thinking here is that at the point of the ms(a, b)
instruction, we'll have a = &putchar = 0x51
, and b = 0xD0
(the opcode for putchar(a)
). The instruction ms(a, b)
aka mem[a] = b
would overwrite the final .trash
to put a putchar(a)
in the cell after a = getchar()
.
But unfortunately this doesn't work out. The .forceA
directive overwrites all three registers, so it is not useful twice in a row.
Instead, we take advantage of the hashing capabilities. We just want to use ms(a, b)
eventually with the right a
and b
. So if b
isn't the right value now, we can just wait until it gets hashed to be the right value. So the general plan is
// whatever number of inverses is needed
.forceA ```````````&putchar
.trash
b =
.val ?? // 240 choices here
.trash
.trash
// however many hashes are needed (need up to 16 million ish)
.trash
.trash
ms(a, b)
Since we have about 8 bits of control over the initial value of b
, the "however many hashes are needed" might take up to 24 bits worth of waiting, so 2^24 = 16 million. This is a bit too long for our purposes, but we have another axis of control: when this whole sequence starts.
.trash
.trash
// however much padding is needed (only need about 4000 ish)
.trash
.trash
// whatever number of inverses is needed
.forceA ```````````&putchar
.trash
b =
.val ?? // 240 choices here
.trash
.trash
// however many hashes are needed (only need about 4000 ish)
.trash
.trash
ms(a, b)
Now, exhaustive search through about 4000 starting positions, 240 initial b
values, and 4000 hash counts gives about 2^32 options, which is enough to control b
at the point of ms(a, b)
.
(This is slow if implemented naively, so the assembler accelerates it by using a hash set to store a bunch of inverses of the goal b
value).
This whole 4000-240-4000 approach is represented in assembly by putting two mnemonics next to each other without .trash
in between. So getchar-putchar is represented by
a = getchar()
putchar(a)
The first mnemonic will be inserted by ms_simple
, the easy process where the bytecode has a 0xFF
byte, then four bytes to control the instruction cell. The second mnemonic will be inserted by initMemset
, the complicated 4000-240-4000 approach above.
Using initMemset
, non-terminating cat is easy to write
// cat non-terminating
p: 0x1500
a = getchar() // ms_simple
putchar(a) // initMemset
.trash
.forceA `&p
.trash
ip = a
The assembler puts the initMemset
block somewhere before 0x1500
, which gives about 5000 total cells of freedom. This works out by luck even though it's less than 4000+240+4000.
The program ends with the same infinite loop as the infinite loop example.
To save 47 bytes, the loop at the end can be swapped out for ip = 0
using
b = a
.trash
c = a
.trash
a = b - c
.trash
ip = a
This is much shorter than a general .forceA
by subtracting two equal registers to get 0, and it relies on P(0) = 0
. The assembler defines a directive to zero a
which corresponds to exactly the above code:
.zero_a
.trash
ip = a
Terminating cat
is similar to non-terminating cat
. It just needs to check for EOF using a conditional jump.
p: 0x4A00
a = getchar()
if (a <= b) ip =
.val ¬_eof
.trash
exit(0)
.trash
not_eof: &p + 8
putchar(a)
.forceA `&p
.trash
ip = a
Since EOF is the maximum 32-bit integer 0xFFFFFFFF
, pretty much all integers compare less than it, so we don't really need to initialize b
, just ensure that it is not in a cycle including 0xFFFFFFFF
.
Nuova does not have a random-access memory get instruction, so memory must be stored where it is read.
For a demonstration, let's reproduce the following program that prints out all the the bytes in a cycle without terminating, starting at !
and cycling due to overflow.
start: 0x10000
a =
.val '!'
loop_begin:
putchar(a)
// increment register a
c =
.val 1
b = a
a = b + c
ip =
.val &loop_begin
This works correctly, but it requires an unbroken chain where a
never gets overwritten. For Fizz Buzz, we will need to store more variables, so we store variables in the actual tape memory.
Let's demonstrate by storing the current character in memory at address 0x10001
(labelled with x
below).
loop_begin: 0x10000
a =
x:
.val '!'
putchar(a)
// b ← a + 1
c =
.val 1
b = a
a = b + c
b = a
// set the value of x to be a
a =
.val &x
ms(a, b)
// loop back to start
// overwriting register a is ok
.trash
a =
.val `&loop_begin
.trash
ip = a
This means that the value of x
can only be read at the beginning of the loop. If you need to read it in two locations, the program has to also write the value to those two locations, as demonstrated in the next section.
The fizzbuzz program included in this repository as fizzbuzz.s
uses the method of writing the value to those multiple locations. For example, fizzbuzz needs to read d0
(the ones digit) in two places: once for incrementing the counter, and once for printing out the counter. So it stores two copies of the value at different addresses d0_a
and d0_b
.
When d0
, it reads from d0_a
and writes to d0_a
and d0_b
:
// increment the ones digit d0
inc_d0: 0x35100
// read the current value from the address labelled d0_a
b =
d0_a:
.val '0'
// add one
c =
.val 0x01
a = b + c
c = a
// set d0_a to the new value of d0
a =
.val &d0_a
ms(a, c)
// set d0_b to the new value of d0
a =
.val &d0_b
ms(a, c)
When printing d0
, it reads from d0_b
:
a =
d0_b: 0x35E10
.val '0'
putchar(a)
When zeroing d0
, it writes to d0_a
and d0_b
:
b =
.val '0'
a =
.val &d0_a
ms(a, b)
a =
.val &d0_b
ms(a, b)
For Method 1, it helps to define a macro to simplify the memory operations. The make run-compiler
uses the gcc preprocessor (gcc -E
) to expand macros. Unfortunately, it replaces newlines with spaces, so as a hack, write ;
instead of \
when you want the macro to continue onto several lines (make run-compiler
replaces ;
with ;\
, then runs gcc -E
, then replaces ;
with newlines again).
#define MS_b(addr) \
a =;
.val addr;
ms(a, b)
Then the previous code block can be written as
b =
.val '0'
MS_b(&d0_a)
MS_b(&d0_b)
Another way to implement global variables is by storing their value in one place and reading them through callbacks.
// skip ahead past the x_b_eq declaration
skip_decls: 0xE0000
ip =
.val &loop_begin
// declare variable x
// it gets accessed by jumping to x_b_eq with the callback in register a
x_b_eq: 0xE1000
b =
x: 0xE1001
.val '!'
ip = a
loop_begin: 0xF0000
// setup callback got_x
a =
.val &got_x
// jump to x_b_eq
ip =
.val &x_b_eq
// x_b_eq will jump back to got_x with the value of x in register b
got_x: 0xF0010
a = b
putchar(a)
// b ← a + 1
c =
.val 1
a = b + c
b = a
// Set the value of x to be b
a =
.val &x
ms(a, b)
// loop back to start
// overwriting register a is ok
.trash
ip =
.val &loop_begin
This method extends to reading in several places, as long as the callback values are set correctly.
For Method 2, it helps to define some macros to simplify the memory operations.
#define glue2(x,y) x##y
#define glue(x,y) glue2(x,y)
#define global_declare(var, value, pos) \
glue(var, _b_eq): pos;
b =;
var: pos + 1;
value;
ip = a
#define b_eq_global(var, return_idx) \
a =;
.val &return_idx;
ip =;
.val &var - 1
#define global_eq_b(var) \
a =;
.val &var;
ms(a, b)
Then the ASCII loop code can be written as
skip_decls: 0xE0000
ip =
.val &loop_begin
global_declare(x, .val '!', 0xE1000)
loop_begin: 0xF0000
b_eq_global(x, got_x)
got_x: 0xF0010
a = b
putchar(a)
// b ← a + 1
c =
.val 1
b = a
a = b + c
b = a
// put b into x
global_eq_b(x)
// loop back to start
// overwriting register a is ok
.trash
ip =
.val &loop_begin
Usage: echo -n '123' | ./nuova fizzbuzz
.
Since Fizz Buzz only needs to go up to four-digit numbers (10000 is buzz), it's reasonable to just store the digits in four separate memory locations instead of an aray.
High-level C-like psuedocode with GOTOs: (the full assembly is in sources/fizzbuzz.s)
u32 a, b, c;
u32 num_lines = 0;
u32 d0 = 0, d1 = 0, d2 = 0, d3 = 0;
bool filled_d1 = 0, filled_d2 = 0, filled_d3 = 0;
u32 x3 = 0, x5 = 0;
bool do_decimal = 0;
// getc is fornumber reading
getc:
a = getchar();
if (a != EOF) goto cplus;
// end of input number, so fallthrough and start main loop
// BEGIN main loop
dec_num_lines:
if (num_lines-- == 0) exit(0);
// increment the four-digit number d = d3 d2 d1 d0
inc_d0:
if (++d0 <= 9) goto inc_x3;
d0 = 0
inc_d1:
filled_d1 = true;
if (++d1 <= 9) goto inc_x3;
d1 = 0
inc_d2:
filled_d2 = true;
if (++d2 <= 9) goto inc_x3;
d2 = 0
inc_d3:
filled_d3 = true;
++d3;
// increment multiple of 3 counter (x3)
inc_x3:
if (++x3 <= 2) goto inc_x5
x3 = 0;
do_decimal = false;
printf("Fizz");
// increment multiple of 5 counter (x5)
inc_x5:
if (++x5 <= 4) goto check_decimal
printf("Buzz");
goto endline;
// don't print the number if we printed Fizz
check_decimal:
if (!do_decimal) goto endline
// print d3 d2 d1 d0 without leading decimals
print_d3:
if (!filled_d3) goto print_d2;
printf("%d", d3);
print_d2:
if (!filled_d2) goto print_d1;
printf("%d", d2);
print_d1:
if (!filled_d1) goto print_d0;
printf("%d", d1);
print_d0:
printf("%d", d0);
endline:
printf("\n");
goto dec_num_lines;
// END main loop
// cplus is part of number input
cplus:
c = num_lines
c = 10 * c + (a - '0')
num_lines = c
goto getc;
I decided to implement cyclic tag.
Usage examples:
echo -n '"011"10"101;1;' | ./nuova cyclic-tag
for the cyclic tag system (011, 10, 101) and initial string 1echo -n '"010001"100"100100100""";100100100100;' | ./nuova cyclic-tag
for the cyclic tag system (010001, 100, 100100100, ε, ε, ε) and initial string 100100100100
First, it parses the input into a bunch of 2-word jmp instruction pairs ip = immediate
similar to what follows
// 3 production rules
ip = pdone // < curr
ip = p0
ip = p1
ip = p1
ip = pdone
ip = p1
ip = p0
ip = pdone
ip = p1
ip = p0
ip = p1
ip = preturn
// data section
ip = d1 // < head
// < tail
Each instruction pair is at two consecutive addresses in memory.
It works by extending the data section from tail
(with tail always pointing after the last datum) whenever a global variable do_append
is set to 1. Then when a pdone
is hit (done with that production string), it jumps back to head
, prints that value (and increments head by 2), sets do_append
, and jumps back to the production strings.
The first byte of the first jmp (ip = pdone
) is at address start
(0x100000), and the global variable curr
is initialized with the value 0x100000. The global variable head
is initialized with the value of 0x100018 (the position of the first data string jmp), and tail
is initialized to point immediately after it at 0x10001A. The global do_append
is initialized to 0.
Execution begins at the start of the first rule, and proceeds by the following pseudocode:
pdone:
jmp head
p0:
if do_append {
*tail = (2 instructions)"ip = d0"
tail += 2
}
curr += 2
jmp curr
p1:
if do_append {
*tail = (2 instructions)"ip = d1"
tail += 2
}
curr += 2
jmp curr
preturn:
jmp start
d0:
putchar('0')
do_append = 0
head += 2
curr += 2
jmp curr
d1:
putchar('1')
do_append = 1
head += 2
curr += 2
jmp curr