Skip to content

Latest commit

 

History

History
73 lines (49 loc) · 4.9 KB

README.md

File metadata and controls

73 lines (49 loc) · 4.9 KB

Regular Expression Refinement Types

Dissertation project submitted as part of BSc Computer Science with Intercalated Year at the University of Warwick.

Project classification
1st (85%)
Project supervisor
Michael Gale

README screenshot of web interface

Introduction

Entire classes of modern web application vulnerabilities arise due to problematic user input handling. This includes cross-site scripting (XSS), injection issues (SQL, LDAP, etc.), insecure deserialisation and file inclusion vulnerabilities – all of which are encountered by information security firms on a regular basis in application assessments.

This project explores the use of regular expressions as refinement types for constrained data in order to model user input validation. The type system of a prototype imperative language is formalised and implemented using ANTLR4 and a Java interpreter.

This repository includes:

  • TeX sources for the written components of the project required by the university
    • Project specification
    • Progress report (completed part-way through the project)
    • Final report
  • Code powering the toy language
    • ANTLR4 grammar and lexer
    • Pygments-compatible syntax highlighter
    • Web application frontend to allow experimentation with the language (using the university's ID7 identity)
    • Rise4fun-API implementation

Tooling used for report

High-quality printed results were achievable without proprietary software, using the following tools:

Wider impact

A formal academic project of reasonable length is a great opportunity to contribute knowledge back into the public domain. As well as open sourcing this repository (including the code and report), a number of existing open source projects and articles received contributions during the project.

Wikipedia Contributions:

Open Source Contributions:

TeX.SE Q&A

DCS Plagiarism Policy

Midway through the final course year within the Department of Computer Science, the handbook was updated with the following policy (~6 months after it was electronically signed):

It is becoming increasingly common for students to use repositories (such as GitHub and GitLab) to store and manage their coursework. If you do this, you must make sure that your repositories are marked as "private" (and remain so, even after you have left the University), since by default they may be public and may be seen by other students. If you make your coursework public, and it is viewed or copied by other students, you may be investigated for abetting plagiarism (just as if you had deliberately handed your work to another student to copy).

If you need to make a "portfolio" visible to potential employers, then the above still holds - either give the employer individual access (if the repository allows it), or make sure no source code is included.

To the best of my knowledge, no intellectual property in this repository is owned by the Department of Computer Science. It is unfortunate that an attempt was made to introduce such an all-encompassing policy without consultation.

In the case of this repository, a 'common sense' approach has been taken when open sourcing it in spite of this policy: the risk of plagiarism seems reasonably low, given that students are expected to come up with their own project ideas for the module.