Finish section on structural corecursion
noelwelsh committed Oct 4, 2023
1 parent 711984f commit 11f2128
Showing 3 changed files with 304 additions and 7 deletions.
37 changes: 37 additions & 0 deletions src/pages/adt/applications.md
@@ -1 +1,38 @@
## Applications of Algebraic Data Types

We have seen some examples of algebraic data types already. Some of the simplest are basic enumerations, like

```scala mdoc:silent
enum Permissions {
case User
case Moderator
case Admin
}
```

which we might use in a system monitoring application, and also our old friend the `case class`

```scala mdoc:silent
final case class Uri(protocol: String, host: String, port: Int, path: String)
```

I think these are the straightforward examples, but those new to algebraic data types often don't realise how many other use cases there are.
We'll see combinator libraries, an extremely important use, in the next chapter.
Here I want to give a few examples of finite state machines as another use case.

Finite state machines occur everywhere in programming. The state of a user interface component, such as open or closed, or visible or invisible, can be modelled as a finite state machine. The state of a job in a distributed job system, like Spark, can also be modelled as a finite state machine.
When using an algebraic data type we're not restricted to simple enumerations of state.
We can also store data within the states.
So, in our job control system, we define jobs as having states.
Here's a simple example.

```scala mdoc:silent
import scala.concurrent.Future

enum Job[A] {
case Queued(name: String, job: () => A)
case Running(name: String, host: String, result: Future[A])
case Completed(name: String, result: A)
case Failed(name: String, reason: String)
}
```
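
To make the finite state machine reading concrete, here is a minimal sketch of functions that work with `Job` by structural recursion. The `describe` and `cancel` methods are hypothetical names introduced for illustration, and the rule that only queued or running jobs can be cancelled is an assumption rather than part of the definition above.

```scala
import scala.concurrent.Future

enum Job[A] {
  case Queued(name: String, job: () => A)
  case Running(name: String, host: String, result: Future[A])
  case Completed(name: String, result: A)
  case Failed(name: String, reason: String)
}

// Hypothetical helper: one pattern per state, using the data stored in it
def describe[A](job: Job[A]): String =
  job match {
    case Job.Queued(name, _)         => s"$name is queued"
    case Job.Running(name, host, _)  => s"$name is running on $host"
    case Job.Completed(name, result) => s"$name completed with $result"
    case Job.Failed(name, reason)    => s"$name failed: $reason"
  }

// Hypothetical transition: assumes only unfinished jobs can be cancelled
def cancel[A](job: Job[A]): Job[A] =
  job match {
    case Job.Queued(name, _)     => Job.Failed(name, "cancelled")
    case Job.Running(name, _, _) => Job.Failed(name, "cancelled")
    case finished                => finished
  }
```

Each transition of the state machine becomes a case in a pattern match, so the compiler can warn us if we forget a state.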
2 changes: 1 addition & 1 deletion src/pages/adt/index.md
@@ -1,4 +1,4 @@
# Algebraic Data Types and Structural Recursion
# Algebraic Data Types

In this section we'll see our first example of a programming strategy: **algebraic data types**. Any data we can describe using logical ands and logical ors is an algebraic data type. Once we recognize an algebraic data type we get three things for free:

272 changes: 266 additions & 6 deletions src/pages/adt/structural-corecursion.md
@@ -13,7 +13,7 @@ Duality is often indicated by attaching the co- prefix to a term.
So corecursion is the dual of recursion, and sum types, also known as coproducts, are the dual of product types.

Duality is one of the main themes of this book.
By relating concepts as duals, we can transfer knowledge from one domain to another.
By relating concepts as duals we can transfer knowledge from one domain to another.
</div>

Structural recursion works by considering all the possible inputs (which we usually represent as patterns), and then working out what we do with each input case.
@@ -47,7 +47,7 @@ enum MyList[A] {
```

The structural corecursion strategy says we write down all the constructors and then consider the conditions that will cause us to call each constructor.
So our starting point is to just write down the two constructors, and put in some dummy conditions for each.
So our starting point is to just write down the two constructors, and put in dummy conditions.

```scala mdoc:reset:silent
enum MyList[A] {
@@ -73,7 +73,7 @@ enum MyList[A] {
}
```

Now to get the left-hand side we can use the strategies we've already seen:
To complete the left-hand side we can use the strategies we've already seen:

* we can use structural recursion to tell us there are two possible conditions; and
* we can follow the types to align these conditions with the code we have already written.
@@ -100,12 +100,272 @@ For example, `foldLeft` and `foldRight` are not structural corecursions because
Secondly, note that when we walked through the process of creating `map` as a structural recursion we implicitly used the structural corecursion pattern, as part of following the types.
We recognised that we were producing a `List`, that there were two possibilities for producing a `List`, and then worked out the correct conditions for each case.
Formalizing structural corecursion as a separate strategy allows us to be more conscious of where we apply it.
Finally, noticed how I switched from an `if` expression to a pattern match expression as we progressed through defining `map`.
Finally, notice how I switched from an `if` expression to a pattern match expression as we progressed through defining `map`.
This is perfectly fine.
Both kinds of expression can achieve the same effect, though if we wanted to continue using an `if` we'd have to define a method (for example, `isEmpty`) that allows us to distinguish an `Empty` element from a `Pair`.
This method would have to use pattern matching in its implementation, so avoiding pattern matching directly is just pushing it elsewhere.

**TODO: desribe abstract structural corecursion**

### Unfolds as Structural Corecursion

### Structural Corecursion as Unfold
Just as we could abstract structural recursion as a fold, for any given algebraic data type we can abstract structural corecursion as an unfold. Unfolds are much less commonly used than folds, but they are still a nice tool to have.

Let's work through the process using `MyList` again.

```scala mdoc:reset:silent
enum MyList[A] {
case Empty()
case Pair(head: A, tail: MyList[A])
}
```

The skeleton is

```scala
if ??? then MyList.Empty()
else MyList.Pair(???, recursion(???))
```

Let's start defining our method `unfold` so we can fill in the missing pieces.

```scala
def unfold[A, B](seed: A): MyList[B] =
  if ??? then MyList.Empty()
  else MyList.Pair(???, unfold(???))
```

We can abstract the condition using a function from `A => Boolean`.

```scala
def unfold[A, B](seed: A, stop: A => Boolean): MyList[B] =
  if stop(seed) then MyList.Empty()
  else MyList.Pair(???, unfold(???, stop))
```

Now we need to handle the cases for `Pair`.
We have a value of type `A` (`seed`), so to create the `head` element of `Pair` we can ask for a function `A => B`.

```scala
def unfold[A, B](seed: A, stop: A => Boolean, f: A => B): MyList[B] =
if stop(seed) then MyList.Empty()
else MyList.Pair(f(seed), unfold(???, stop, f))
```

Finally we need to update the current value of `seed` to the next value. That's a function `A => A`.

```scala mdoc:silent
def unfold[A, B](seed: A, stop: A => Boolean, f: A => B, next: A => A): MyList[B] =
if stop(seed) then MyList.Empty()
else MyList.Pair(f(seed), unfold(next(seed), stop, f, next))
```

At this point we're done.
Let's see that `unfold` is useful by declaring some other methods in terms of it.
We're going to declare `map`, which we've already seen is a structural corecursion, using `unfold`, and also `fill` and `iterate`, which correspond to the methods with the same names on `List` in the Scala standard library.

To make this easier to work with I'm going to declare `unfold` as a method on the `MyList` companion object.
I have made a slight tweak to the definition to make type inference work a bit better.
In Scala, types inferred for one method parameter cannot, in general, be used for other parameters in the same parameter list.
However, types inferred for one method parameter list can be used in subsequent lists.
Separating the function parameters from the `seed` parameter means that the type inferred for `A` from `seed` can be used for inference of the function parameters' input parameters.

I have also declared some **destructor** methods, which are methods that take apart an algebraic data type.
For `MyList` these are `head`, `tail`, and the predicate `isEmpty`.
We'll talk more about these a bit later.

Here's our starting point.

```scala mdoc:reset:silent
enum MyList[A] {
case Empty()
case Pair(_head: A, _tail: MyList[A])

def isEmpty: Boolean =
this match {
case Empty() => true
case _ => false
}

def head: A =
this match {
case Pair(head, _) => head
}

def tail: MyList[A] =
this match {
case Pair(_, tail) => tail
}
}
object MyList {
def unfold[A, B](seed: A)(stop: A => Boolean, f: A => B, next: A => A): MyList[B] =
if stop(seed) then MyList.Empty()
else MyList.Pair(f(seed), unfold(next(seed))(stop, f, next))
}
```
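
Before building other methods on top of it, here is a minimal sketch of calling `unfold` directly. It repeats a cut-down `MyList` (without the destructors) so the example stands alone, and `launch` is a hypothetical name introduced for illustration. Notice that none of the function literals need type annotations, because `A` is fixed as `Int` by the first parameter list.

```scala
enum MyList[A] {
  case Empty()
  case Pair(head: A, tail: MyList[A])
}
object MyList {
  def unfold[A, B](seed: A)(stop: A => Boolean, f: A => B, next: A => A): MyList[B] =
    if stop(seed) then MyList.Empty()
    else MyList.Pair(f(seed), unfold(next(seed))(stop, f, next))
}

// Hypothetical example: count down from 3, producing a String per step.
// `A` is inferred as Int from the seed, and `B` as String from `f`.
val launch: MyList[String] =
  MyList.unfold(3)(_ == 0, n => s"T-minus $n", _ - 1)
```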

Now let's define the constructors `fill` and `iterate`, and `map`, in terms of `unfold`. I think the constructors are a bit simpler, so I'll do those first.

```scala
object MyList {
def unfold[A, B](seed: A)(stop: A => Boolean, f: A => B, next: A => A): MyList[B] =
if stop(seed) then MyList.Empty()
else MyList.Pair(f(seed), unfold(next(seed))(stop, f, next))

def fill[A](n: Int)(elem: => A): MyList[A] =
???

def iterate[A](start: A, len: Int)(f: A => A): MyList[A] =
???
}
```

Here I've just added the method skeletons, which are taken straight from the `List` documentation.
To implement these methods we can use one of two strategies:

- reasoning about loops in the way we might in an imperative language; or
- reasoning about structural recursion over the natural numbers.

Let's talk about each in turn.

You might have noticed that the parameters to `unfold` are almost exactly those you need to create a for-loop in a language like Java. A classic for-loop, of the `for(i = 0; i < n; i++)` kind, has four components:

1. the initial value of the loop counter;
2. the stopping condition of the loop;
3. the statement that advances the counter; and
4. the body of the loop that uses the counter.

These four correspond to the `seed`, `stop`, `next`, and `f` parameters of `unfold` respectively.
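
The correspondence can be sketched side by side. To keep the example short I've assumed a version of `unfold` that builds a standard library `List` instead of `MyList`, and `squaresLoop` and `squaresUnfold` are hypothetical names. Note that `stop` is the negation of the loop's continuation condition.

```scala
import scala.collection.mutable.ListBuffer

// Imperative version: the four components of a classic loop
def squaresLoop(n: Int): List[Int] = {
  val buffer = ListBuffer.empty[Int]
  var i = 0            // 1. initial value of the counter (seed)
  while (i < n) {      // 2. loop continues while i < n, so stop is i >= n
    buffer += i * i    // 4. body that uses the counter (f)
    i += 1             // 3. advance the counter (next)
  }
  buffer.toList
}

// Assumed variant of unfold that produces a List, for brevity
def unfold[A, B](seed: A)(stop: A => Boolean, f: A => B, next: A => A): List[B] =
  if stop(seed) then Nil
  else f(seed) :: unfold(next(seed))(stop, f, next)

// The same loop, with each component passed as a parameter
def squaresUnfold(n: Int): List[Int] =
  unfold(0)(_ >= n, i => i * i, _ + 1)
```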

Loop variants and invariants are the standard way of reasoning about imperative loops. I'm not going to describe them here, as you have probably learned these already (though perhaps not using these terms). Instead I'm going to discuss the second reasoning strategy, which relates writing `unfold` to something we've already discussed: structural recursion.

Our first step is to note that natural numbers (the integers 0 and larger) are conceptually algebraic data types even though the implementation in Scala---using `Int`---is not. A natural number is either:

- zero; or
- 1 + a natural number.

It's the simplest possible algebraic data type that is both a sum and a product type.
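
If we did write the natural numbers as an algebraic data type, it might look like the following sketch. The names `Nat`, `Zero`, `Succ`, `toInt`, and `fromInt` are assumptions made for illustration. The point is that `fromInt` has the same shape we use when recursing over an `Int`: test for zero, otherwise subtract one and recurse.

```scala
// A natural number is either zero, or one plus a natural number
enum Nat {
  case Zero
  case Succ(pred: Nat)
}

// Structural recursion over Nat: one case per constructor
def toInt(n: Nat): Int =
  n match {
    case Nat.Zero       => 0
    case Nat.Succ(pred) => 1 + toInt(pred)
  }

// The same recursion over Int, treating it as a natural number.
// Assumes n is not negative; a negative n would never reach zero.
def fromInt(n: Int): Nat =
  if n == 0 then Nat.Zero
  else Nat.Succ(fromInt(n - 1))
```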

Once we see this, we can use the reasoning tools for structural recursion for creating the parameters to `unfold`.
Let's show how this works with `fill`. The `n` parameter tells us how many elements are in the `List`, and the `elem` parameter creates those elements. So our starting point is to consider this as a structural recursion over the natural numbers. We can take `n` as `seed`, and `stop` as the function `x => x == 0`. These are the standard conditions for such a structural recursion. What about `next`? Well, the definition of natural numbers tells us we should subtract one, so `next` becomes `x => x - 1`. Now we only need `f`, and that comes from the definition of how `fill` is supposed to work. We create the value from `elem`, so `f` is just `_ => elem`.

```scala
object MyList {
def unfold[A, B](seed: A)(stop: A => Boolean, f: A => B, next: A => A): MyList[B] =
if stop(seed) then MyList.Empty()
else MyList.Pair(f(seed), unfold(next(seed))(stop, f, next))

def fill[A](n: Int)(elem: => A): MyList[A] =
unfold(n)(_ == 0, _ => elem, _ - 1)

def iterate[A](start: A, len: Int)(f: A => A): MyList[A] =
???
}
```

```scala mdoc:reset:invisible
enum MyList[A] {
case Empty()
case Pair(_head: A, _tail: MyList[A])

def isEmpty: Boolean =
this match {
case Empty() => true
case _ => false
}

def head: A =
this match {
case Pair(head, _) => head
}

def tail: MyList[A] =
this match {
case Pair(_, tail) => tail
}

override def toString(): String = {
def loop(list: MyList[A]): List[A] =
list match {
case Empty() => List.empty
case Pair(h, t) => h :: loop(t)
}
s"MyList(${loop(this).mkString(", ")})"
}
}
object MyList {
def unfold[A, B](seed: A)(stop: A => Boolean, f: A => B, next: A => A): MyList[B] =
if stop(seed) then MyList.Empty()
else MyList.Pair(f(seed), unfold(next(seed))(stop, f, next))

def fill[A](n: Int)(elem: => A): MyList[A] =
unfold(n)(_ == 0, _ => elem, _ - 1)

def iterate[A](start: A, len: Int)(f: A => A): MyList[A] =
unfold((len, start))(
(len, _) => len == 0,
(_, start) => start,
(len, start) => (len - 1, f(start))
)
}
```

We should check that our implementation works as intended. We can do this by comparing it to `List.fill`.

```scala mdoc:to-string
List.fill(5)(1)
MyList.fill(5)(1)
```
```scala mdoc:silent
var counter = 0
def getAndInc(): Int = {
val temp = counter
counter = counter + 1
temp
}
```
```scala mdoc:to-string
List.fill(5)(getAndInc())
counter = 0
MyList.fill(5)(getAndInc())
```

#### Exercise {-}

Implement `iterate` using the same reasoning as we did for `fill`.
This is slightly more complex than `fill` as we need to keep two bits of information: the value of the counter and the current value of type `A`.

<div class="solution">
```scala
object MyList {
  def unfold[A, B](seed: A)(stop: A => Boolean, f: A => B, next: A => A): MyList[B] =
    if stop(seed) then MyList.Empty()
    else MyList.Pair(f(seed), unfold(next(seed))(stop, f, next))

  def fill[A](n: Int)(elem: => A): MyList[A] =
    unfold(n)(_ == 0, _ => elem, _ - 1)

  // The seed pairs the remaining length with the current value of type A
  def iterate[A](start: A, len: Int)(f: A => A): MyList[A] =
    unfold((len, start))(
      (len, _) => len == 0,
      (_, start) => start,
      (len, start) => (len - 1, f(start))
    )
}
```

We should check that this works.

```scala mdoc:to-string
List.iterate(0, 5)(x => x - 1)
MyList.iterate(0, 5)(x => x - 1)
```
</div>


One last thing before we leave `unfold`. If we look up `unfold` elsewhere we'll usually find a definition like the following.

```scala
def unfold[A, B](in: A)(f: A => Option[(A, B)]): List[B]
```

This is equivalent to the definition we used, just a bit more compact in terms of the interface it presents. We used a more explicit definition that makes the role of the individual elements a little bit easier to understand.
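
One way to see the equivalence is to write our explicit `unfold` in terms of the compact one. The following is a minimal sketch, assuming both produce a standard library `List`; `unfoldOption` is a hypothetical name chosen to keep the two apart. Returning `None` plays the role of `stop`, and the pair inside `Some` carries the results of `next` and `f`.

```scala
// The compact form: None means stop, Some((nextSeed, element)) means continue
def unfoldOption[A, B](in: A)(f: A => Option[(A, B)]): List[B] =
  f(in) match {
    case None            => Nil
    case Some((next, b)) => b :: unfoldOption(next)(f)
  }

// Our explicit form, expressed using the compact form
def unfold[A, B](seed: A)(stop: A => Boolean, f: A => B, next: A => A): List[B] =
  unfoldOption(seed)(a => if stop(a) then None else Some((next(a), f(a))))
```

Going the other way is also possible, but it is more awkward because the single function has to be evaluated to answer each of the three questions separately.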
