a3: add writeup
SantriptaSharma committed Oct 23, 2023
1 parent 5d1c693 commit 9d59cc0
Showing 3 changed files with 20 additions and 107 deletions.
64 changes: 15 additions & 49 deletions 15_A3.nc
@@ -1,49 +1,15 @@
int a = 341;
int b = 0213;
int c = -123;
int d = -00045; //invalid actually

int _valididentifier_ = +1;

swap
top

int *ptr = &d;
int *ptrptr = &ptr;
**ptr++;

ptr->addr();

if (name1 == "3454\n") {
return 'abc';
}

void identifierguy(int x = -31) {
if (x == -31) {
//identified
return;
}
else {
for (;;)
{
// not identified :(
printf("aaaaaaaaaaaaaaaaaaaaa");
}
}
}

// wdsadwaw
"weirdo stringliteral /*hopefully not a comment*/ \n \n \n \n "

/* error/* error 2
multiline error
*/ 1*/


/*pioqwepoiqew*/ asdipoi
a /*new lined comment???*/
works hopefully these are idents*/

"string literal" not anymore"
/* test1 /* test2 */ test3 */ test4
ident after should be ignored because of mismatched " on stringliteral above
// Find factorial by iteration
int main() {
int n;
int i = 0;
int r = 1;
readInt(&n);
for(i = 1; i <= n; i = i + 1)
r = r * i;
printInt(n);
printStr("! = ");
printInt(r);
r->thing;
*r;
return 0;
}
63 changes: 5 additions & 58 deletions 15_A3.tex
@@ -1,9 +1,9 @@
\documentclass{article}
\usepackage{graphicx} % Required for inserting images

\title{CS1319: PLDI - Assignment 2}
\title{CS1319: PLDI - Assignment 3}
\author{Hrsh Venket \& Santripta Sharma}
\date{September 2023}
\date{October 2023}

\setlength{\parindent}{0pt}

@@ -12,62 +12,9 @@
\maketitle


\section*{Explanation of Lexer}
Below is the code we have written for the lexer rules. This explanation therefore is our attempt to explain our choices by explaining how our code and the given rules for the lexer are equivalent.\bigskip

First let us consider the obvious cases, where we have written the regex for the lexer exactly as we have been given in the assignment.

\begin{verbatim}
KEYWORD char|else|for|if|int|return|void
PUNCT \[|\]|\(|\)|\{|\}|->|&|\*|\+|-|\/|%|!|\?|\<|... (and so on)
\end{verbatim}

Here, we have simply written the regex as given in the assignment, using a series of alternations.\bigskip

The explanation for string literals also directly follows from the definition given in the assignment.

\begin{verbatim}
ESCAPE \\'|\\\?|\\\"|\\\\|\\a|\\b|\\f|\\n|\\r|\\t|\\v
STRCHAR [^\"\\\n]|{ESCAPE}
STRLIT \"{STRCHAR}*\"
\end{verbatim}

We define a string character as either an escape sequence or any member of the character set other than the double quote, the backslash, and the newline character. As elsewhere in the lexer, \verb|^| inside a character class denotes `everything except'.\bigskip

A string literal is defined as zero or more string characters enclosed in double quotes, matching the \verb|*| in \verb|STRLIT|.\bigskip

Two less obvious cases are the single and multiline comments. For the single line comments, we have written the regex as follows:

\begin{verbatim}
COMMENTSINGLE \/\/([^\n])*\n
\end{verbatim}

Here, \verb|\/\/| denotes the opening \verb|//| of a single-line comment. Inside the comment, we allow any character except the newline character; the \verb|*| means there can be zero or more such characters. The comment must then terminate with a newline character \verb|\n|.

In the case of multiline comments, our code is as below:

\begin{verbatim}
COMMENTMULTI \/\*([^\*]|\*[^\/])*\*\/
\end{verbatim}

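The multiline comment starts with \verb|/*|, followed by any number of characters, where a \verb|*| is allowed only if the next character is anything other than a \verb|/|. Finally, the comment is closed by \verb|*/|. This is intended to end the comment at the first closing \verb|*/| encountered. One caveat: because \verb|\*[^\/]| consumes two characters at a time, a comment such as \verb|/***/| is not matched by this pattern on its own.\bigskip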

Now we can consider the less direct cases.

\begin{verbatim}
IDENT [a-zA-Z\_][0-9a-zA-Z\_]*
\end{verbatim}

By definition, an identifier cannot start with a digit. Therefore, we require that the identifier start with an identifier non-digit, \verb|[a-zA-Z\_]|. This is followed by zero or more identifier characters, \verb|[0-9a-zA-Z\_]|, which can be digits or non-digits.

\begin{verbatim}
CHAR [^\\'\n]|{ESCAPE}
CONST ([\+-]?[1-9][0-9]*)|[0-9]+|'{CHAR}+'
\end{verbatim}

Here, we describe \verb|CONST|, using some of the definitions given in the assignment. We define \verb|CHAR| as either an escape sequence or any character except the single quote, the backslash, and the newline character. We define \verb|CONST| as the constant, which can be a number, a character, or a multicharacter literal ($>1$ chars).\bigskip

Besides character constants, we also define the integer constant as either an optional sign followed by a non-zero digit and then zero or more digits 0-9, or a sequence of one or more digits 0-9 with no sign.\bigskip
\section*{Explanation of Parser}
We have written the bison specification to follow the grammar closely, since the given grammar already has precedence \& associativity resolved and is mostly unambiguous (besides the dangling else, which bison handles by shifting by default on any S/R conflict). The only real `change' (more of an expansion) to the grammar concerns optional symbols: we have manually enumerated all possible cases for the presence/absence of each optional and added them as separate productions for the rule in question.\bigskip

The one exception is the production for the \verb|iteration_statement| non-terminal, where 3 optional terms are used; manually enumerating the cases there would lead to a (mini) combinatorial explosion ($2^3 = 8$ productions instead of just one). For this purpose, we have used an auxiliary non-terminal \verb|opt_expression|, which produces either $\epsilon$ or the non-terminal \verb|expression|.
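
As a minimal sketch of this idea, the auxiliary rule and the single remaining production might look roughly as follows. The token name \verb|FOR|, the non-terminal \verb|statement|, and the exact layout are assumptions made for illustration rather than a copy of our specification, and \verb|%empty| assumes Bison 3.0 or later (an empty right-hand side serves the same purpose in older versions).

\begin{verbatim}
/* Illustrative sketch only: FOR and statement are assumed names. */
opt_expression
    : %empty          /* epsilon: the clause is omitted */
    | expression
    ;

iteration_statement
    : FOR '(' opt_expression ';' opt_expression ';' opt_expression ')'
      statement
    ;
\end{verbatim}

With this shape, each of the three clauses of the loop header can independently be present or absent, so one production covers all $2^3 = 8$ combinations.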

\end{document}
Binary file added tex_build/15_A3.pdf
Binary file not shown.
