Tao

You can now test Tao in the browser!

A statically-typed functional language with polymorphism, typeclasses, generalised algebraic effects, sum types, pattern-matching, first-class functions, currying, good diagnostics, and much more!

Demo of Tao's features

For more example programs, see...

Goals

Right now, Tao is a hobby project and I have no plans to turn it into a production-worthy language. This may change as the project evolves, but I'd rather spend as much time as possible experimenting with new language features for now. That said, I do have a few goals for the language itself:

  • Totality

    • All programs must explicitly handle all inputs. There are no mechanisms for panicking, exceptions, etc. The goal is to build a type system that's expressive enough to prove the totality of a wide range of programs.
    • In time, I'd like to see the language develop support for termination analysis techniques like Walther recursion.
  • Extreme optimisation

    • A rather dogged and obnoxious opinion of mine is that the 'optimisation ceiling' for statically-typed, total functional programming languages is significantly higher than that of traditional imperative languages with comparatively weak type systems. I want Tao to be a practical example of this that I can point to rather than deploying nebulous talking points about invariants.
    • I've deliberately made sure that the core MIR of Tao has a very small surface area, making it amenable to a variety of optimisations and static analyses.
    • Already the MIR optimiser performs quite a lot of optimisations that radically reduce the number of bytecode instructions emitted. See below for a list of these.
  • Learning

    • I have only a high-school knowledge of mathematics. I want to use Tao as a test bench to help me learn more about mathematics, proofs, type systems, logic, and computation.
    • In addition, I hope that Tao can serve as a useful tool for others looking to get into language design, compiler development, or simply functional programming in general: the codebase is relatively small and pragmatic (at least, given the complexity of some of the language features).

Features

  • Hindley-Milner type inference
  • Useful error messages
  • Algebraic data types
    • Sum types
    • Record types
    • Generic data types
    • Nominal aliases (i.e: data Metres = Real)
  • Type aliases
  • Type polymorphism via generics
    • Class constraints
    • Associated type equality constraints
    • Arbitrary where clauses (including associated type equality)
    • Lazy associated item inference (Foo.Bar.Baz.Biz lazily infers the class at each step!)
    • Type checker is Turing-complete (is this a feature? Probably not...)
    • Variance is properly tracked through both type and effect parameters
  • Pattern matching
    • Destructuring and binding
    • ADT patterns
    • List patterns ([a, b, c], [a, b .. c], etc.)
    • Arithmetic patterns (i.e: n + k)
    • Inhabitance checks (i.e: None exhaustively covers Maybe Never)
    • Recursive exhaustivity checks
    • let does pattern matching
  • First-class functions
    • Functions support pattern-matching
    • Currying
  • Typeclasses
    • Type parameters
    • Associated types
    • Operators are implemented as typeclasses
  • Algebraic effects
    • Effect objects (independent of functions, unlike some languages)
    • Basin and propagation syntax (equivalent to Haskell's do notation, or Rust's async/try blocks)
    • Generic effects
    • Polymorphic effects (no more try_x or async_x functions!)
    • Effect sets (i.e: can express values that have multiple side effects)
    • Effect aliases
    • Effect handlers (including stateful handlers, allowing expressing effect-driven IO in terms of monadic IO)
    • Effects can be parameterised by both types and other effects
  • Built-in lists
    • Dedicated list construction syntax ([a, b, c], [a, b .. c, d], etc.)
  • Explicit tail call optimisation
    • Better syntax/guarantees
  • Optimisation
    • Monomorphisation of generic code
    • Inlining
    • Const folding
    • Symbolic execution
    • Branch commutation
    • Dead code removal
    • Inhabitance analysis
    • Exhaustive pattern flattening
    • Unused function pruning
    • Unused binding removal
    • Arithmetic rewriting / simplification
    • Identity branch removal
  • Bytecode compiler
  • Bytecode virtual machine

Currently working on

  • Pattern exhaustivity checking (sound, but unnecessarily conservative)
  • Arithmetic patterns (only nat addition is currently implemented)
  • Typeclasses
    • Coherence checker
    • Visible member semantics to relax orphan rules
  • MIR optimiser
    • Unboxing
    • Automatic repr changes for recursive types
      • Transform data Nat = Succ Nat | Zero into a runtime integer
      • Transform data List A = Cons (A, List A) | Nil into a vector
  • Algebraic effects
    • Higher-ranked effects (needed for proper async support)
    • Arbitrary resuming/suspending of effect objects
    • Full monomorphisation of effect objects

Planned features

  • Better syntax (perhaps indentation-sensitivity for pattern matching?)
  • Module system (instead of import copy/paste)
  • LLVM/Cranelift backend

Philosophy

  • Prefer general solutions over special casing: Flexible and general features should be preferred over specific solutions that might produce rough edges down the line that require even more special-case solutions to solve. It is better to provide a smaller core of general features than to grow the language into an eclectic mess.

  • Correctness over convenience: If something is wrong or has edge-cases, don't paper over the cracks. Tao tries to force the programmer to write programs that are as well-formed and as bug-free as reasonably possible. Under/overflow matters. Unhandled patterns matter. Overlapping class impls matter.

  • Do the obvious thing: When there's a choice to be made about behaviour, the thing that's most often correct should be done. All other things should be default-on lints or errors.

  • Similar concepts should have similar syntax: List/record/data type construction and destruction (i.e: pattern matching) share the same syntax. Function argument patterns and when patterns share the same syntax.

  • Local reasoning: Where possible, the behaviour of a program/function/expression should be obvious with local-only information. No wild overrides or behavioural changes that require looking at imports to understand.

  • Say what you mean: Syntax does matter! Programs are designed to be read, and Tao should encourage the writing of programs that tell a linear story. If you're needing to jump forward and backward to understand a program, that's something that needs fixing, if at all possible.

  • Abstraction should preserve 'core' semantics: Many languages provide complex macro systems that allow immense towers of meta-programming. Tao is not opposed to meta-programming and abstraction, but aggressively tries to keep such things in terms of the surface syntax, improving legibility and minimising the element of surprise. As a nice addition, rejecting macros makes Tao much friendlier to IDEs and static analysis systems.

Interesting features

Here follows a selection of features that are either unique to Tao or are uncommon among other languages.

Generalised algebraic effects

Tao has support for 'generalised algebraic effects'. 'Effects' means that Tao can express the side-effects of functions (IO, mutation, exceptions, async, etc.) in type signatures. 'Generalised' means that it's possible for you to create and use your own effects to express whatever your heart desires. 'Algebraic' means that Tao allows code to be generic over an effect (or set of effects). For example, consider the map function, used to apply a function to each element of a list in turn:

fn map A, B : (A -> B) -> [A] -> [B]
    | _, [] => []
    \ f, [x .. xs] => [f(x) .. map(f, xs)]

map can be used like so to, for example, double all elements of a list:

[1, 2, 3, 4]
    -> map(fn x => x * 2)

# Result: [2, 4, 6, 8]

Most languages, such as Rust, have a function like this. Unfortunately, it breaks down quickly when we want to do anything even slightly different to a 'pure' mapping between elements within the mapping function. For example, consider a program for which the mapping function is fallible, or requires some asynchronous operation to complete. It's necessary to do one of two things:

  • Have the function 'silently' exit through stack unwinding, as is the case in C#, C++, etc.
  • Create a copy of the function that can handle failure, like try_map in Rust
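To make the second option concrete, here is a minimal Rust sketch of the duplication it implies. It is an illustration only, not Tao code: `map_pure`, `try_map`, and `no_zeroes` are hypothetical names written for this example, and the two mapping functions differ only in how the callback's result is threaded through.

```rust
// Hypothetical helpers for illustration; not part of Tao or of Rust's standard library.

/// Pure mapping: the callback cannot fail.
fn map_pure<A, B>(f: impl Fn(&A) -> B, xs: &[A]) -> Vec<B> {
    let mut out = Vec::with_capacity(xs.len());
    for x in xs {
        out.push(f(x));
    }
    out
}

/// Fallible mapping: the same loop, duplicated only to thread the error through.
fn try_map<A, B, E>(f: impl Fn(&A) -> Result<B, E>, xs: &[A]) -> Result<Vec<B>, E> {
    let mut out = Vec::with_capacity(xs.len());
    for x in xs {
        out.push(f(x)?); // `?` returns early on the first error
    }
    Ok(out)
}

/// Doubles a number, rejecting zero.
fn no_zeroes(x: &i32) -> Result<i32, String> {
    if *x == 0 {
        Err("no zeroes allowed".to_string())
    } else {
        Ok(*x * 2)
    }
}

fn main() {
    let xs: [i32; 4] = [1, 2, 3, 4];
    println!("{:?}", map_pure(|x: &i32| *x * 2, &xs));
    println!("{:?}", try_map(no_zeroes, &[1, 0, 3]));
}
```

An asynchronous or generator-style callback would need yet another copy of the same loop, which is the duplication the rest of this section is concerned with.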

It's worth noting that Haskell mostly solves this problem with monads, but they're frequently unwieldy. Effect systems and monads have many similarities, but effect systems work much harder to integrate with regular control flow.

In Tao, this problem can be solved by making map generic over an effect parameter, like so:

fn map A, B, e : (A -> e ~ B) -> [A] -> e ~ [B]
    | _, [] => []
    \ f, [x .. xs] => [f(x)! .. map(f, xs)!]

A few things have changed here.

Firstly, we've introduced an effect parameter, e. Secondly, the type signature has changed: the mapper function, A -> B, now has e attached to its return type, resulting in A -> e ~ B. This is also present in the final return type of the function, e ~ [B], expressing that the side effects of the function as a whole correspond to those performed by the mapping function.

Thirdly, a ! operator has appeared within the implementation after calling the mapping function. This is the 'effect propagation' operator and signals to the compiler that the side effects of f should be lifted to the signature of the function as a whole.

Note that, otherwise, the implementation remains the same: we have not needed to use any complicated machinery to handle the side effect (as might be the case in a Rust-style try_map), just a single additional operator.

As a result of this change, map now accepts mapping functions that perform any side effect: throwing errors, IO, yielding values, mutation, and many more. It also accepts functions that perform arbitrary combinations of effects, or those that have no side effects at all (the empty set is still a valid effect set!):

# Yield each element of the list, resulting in a generator
[1, 2, 3, 4]
    -> map(fn x => yield(x)!)!

# Generate an error if any element of the list is `0`
[1, 2, 3, 4]
    -> map(fn x => if x = 0 then err("no zeroes allowed")! else x)!

# Print each element of the list
[1, 2, 3, 4]
    -> map(fn x => print(x)!)!

This is the expressive power of algebraic effect systems: we no longer need to worry about function colours, hidden panics/exceptions, or write many versions of a function to handle all kinds of irregular control flow. Because algebraic effects generalise so well, it also becomes possible to use them to separate out interfaces from implementations in a composable way, allowing developers to swap out the implementation of even very core APIs (such as filesystem access) as required without the complexity and awkwardness of intricate callback systems.

In Tao, effects are kinds, just like types, lifetimes, and constants in Rust. They're also represented independently of function signatures, as 'effect objects' (you can think of effect objects as being like Futures/Promises, but generalised to all side effects). Because of this, it's possible to use them in a vast array of contexts.

Arithmetic patterns

Tao's type system is intended to be completely sound (i.e: impossible to trigger runtime errors beyond 'implementation' factors such as OOM, stack overflow, etc.). For this reason, subtraction of natural numbers yields a signed integer, not a natural number. However, many algorithms still require that numbers be counted down to zero!

To solve this problem, Tao has support for performing arithmetic operations within patterns, binding the result. Because the compiler intrinsically understands these operations, it's possible to statically determine the soundness of such operations and guarantee that no runtime errors or overflows can ever occur. Check out this 100% sound factorial program!

fn factorial =
    | 0 => 1
    \ y ~ x + 1 => y * factorial(x)
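For readers more familiar with Rust, the subtraction side of this idea can be approximated with `checked_sub`. The sketch below is an analogy rather than a description of how Tao compiles arithmetic patterns: matching on `n.checked_sub(1)` only yields `Some(x)` when `n` really is `x + 1`, so the count-down can never underflow. Unlike the Tao claim above, this Rust version does nothing about multiplication overflow for large inputs.

```rust
/// Factorial over unsigned integers, structured like the Tao version.
/// `checked_sub(1)` stands in for the `x + 1` pattern: it returns
/// `Some(x)` exactly when `n == x + 1`, and `None` when `n == 0`.
fn factorial(n: u64) -> u64 {
    match n.checked_sub(1) {
        None => 1,                   // n == 0
        Some(x) => n * factorial(x), // n == x + 1
    }
}

fn main() {
    println!("{}", factorial(5)); // prints 120
}
```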

All functions are lambdas and permit pattern matching

Excluding syntax sugar (like type aliases), Tao has only two high-level constructs: values and types. Every 'function' is actually just a value that corresponds to an inline lambda, and the inline lambda syntax naturally generalises to allow pattern matching. Multiple pattern arguments are permitted, each corresponding to a parameter of the function.

def five =
    let identity = fn x => x in
    identity(5)

Exhaustive pattern matching

Tao requires that pattern matching is exhaustive and will produce errors if patterns are not handled.
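As a point of comparison, Rust enforces a similar rule for `match`. The sketch below uses a hypothetical `Light` type invented for this example; it shows the general technique of exhaustiveness checking, not Tao's own diagnostics or syntax.

```rust
#[derive(Debug, PartialEq)]
enum Light {
    Red,
    Amber,
    Green,
}

/// Every variant of `Light` is handled, so the match is exhaustive.
fn next(l: Light) -> Light {
    match l {
        Light::Red => Light::Green,
        Light::Green => Light::Amber,
        Light::Amber => Light::Red,
        // Deleting any arm above makes rustc reject the program
        // with a non-exhaustive patterns error (E0004).
    }
}

fn main() {
    println!("{:?}", next(Light::Red)); // prints Green
}
```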

Very few delimiters, but whitespace isn't semantic

In Tao, every value is an expression. Even let, a statement in most languages, is an expression. Because of this, Tao requires no semicolons and no code blocks.

Currying and prefix calling

In Tao, arg->f is shorthand for f(arg) (function application). Additionally, this prefix syntax can be chained, resulting in very natural, first-class pipeline syntax.

my_list
    -> filter(fn x => x % 2 == 0) # Include only even elements
    -> map(fn x => x * x)         # Square elements
    -> sum                        # Sum elements
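To see what the pipeline saves, compare it with the nested-call form it stands for. The Rust sketch below is an analogy only: `filter`, `map`, and `sum` are hypothetical free functions written for this example (taking the function first and the list last, as Tao's map does), and the input list is an assumption. The nested form has to be read inside-out, whereas a chained form reads top to bottom like the Tao pipeline.

```rust
// Hypothetical free functions for illustration; argument order mirrors Tao's `map(f, xs)`.
fn filter(p: impl Fn(&i64) -> bool, xs: Vec<i64>) -> Vec<i64> {
    xs.into_iter().filter(|x| p(x)).collect()
}

fn map(f: impl Fn(i64) -> i64, xs: Vec<i64>) -> Vec<i64> {
    xs.into_iter().map(f).collect()
}

fn sum(xs: Vec<i64>) -> i64 {
    xs.into_iter().sum()
}

fn main() {
    let my_list: Vec<i64> = vec![1, 2, 3, 4];

    // Nested calls: the first step applied (filter) is the innermost expression.
    let nested = sum(map(|x| x * x, filter(|x| x % 2 == 0, my_list.clone())));

    // Chained calls: each step appears in the order it runs.
    let chained: i64 = my_list
        .into_iter()
        .filter(|x| x % 2 == 0) // Include only even elements
        .map(|x| x * x)         // Square elements
        .sum();                 // Sum elements

    assert_eq!(nested, chained);
    println!("{}", chained); // prints 20
}
```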

Useful, user-friendly error diagnostics

This one is better demonstrated with an image.

Example Tao error

Tao preserves useful information about the input code such as the span of each element, allowing for rich error messages that guide users towards solutions to their problems. Diagnostic rendering itself is done by my crate Ariadne.

Automatic call graph generation

Tao's compiler can also automatically generate graphviz call graphs of your programs to help you understand them better. Here's the expression parser + REPL from examples/calc.tao. The call graph will automatically ignore utility functions (i.e: functions with a $[util] attribute on them), meaning that even very complex programs suddenly become understandable.

Call graph of an expression parser in Tao

Usage

Commands

Compile/run a .tao file

cargo run -- <FILE>

Run compiler tests

cargo test

Compile/run the standard library

cargo run -- lib/std.tao

Compiler arguments

  • --opt: Specify an optimisation mode (none, fast, size)

  • --debug: Enable debugging output for a compilation stage (tokens, ast, hir, mir, bytecode)