Finish section on structural corecursion

scalawithcats · Oct 4, 2023 · 11f2128
1 parent 711984f
commit 11f2128
Showing 3 changed files with 304 additions and 7 deletions.
diff --git a/src/pages/adt/applications.md b/src/pages/adt/applications.md
@@ -1 +1,38 @@
 ## Applications of Algebraic Data Types
+
+We have seen some examples of algebraic data types already. Some of the simplest are basic enumerations, like
+
+```scala mdoc:silent
+enum Permissions {
+  case User
+  case Moderator
+  case Admin
+}
+```
+
+which we might use in a system monitoring application, and also our old friend the `case class`
+
+```scala mdoc:silent
+final case class Uri(protocol: String, host: String, port: Int, path: String)
+```
+
+I think these are the straightforward examples, but those new to algebraic data types often don't realise how many other use cases there are.
+We'll see combinator libraries, an extremely important use, in the next chapter.
+Here I want to give a few examples of finite state machines as another use case.
+
+Finite state machines occur everywhere in programming. The state of a user interface component, such as open or closed, or visible or invisible, can be modelled as a finite state machine. The state of a job in a distributed job system, like Spark, can also be modelled as a finite state machine. 
+When using an algebraic data type we're not restricted to simple enumerations of state.
+We can also store data within the states.
+So, in our job control system, we define jobs as having states.
+Here's a simple example.
+
+```scala mdoc:silent
+import scala.concurrent.Future
+
+enum Job[A] {
+  case Queued(name: String, job: () => A)
+  case Running(name: String, host: String, result: Future[A])
+  case Completed(name: String, result: A)
+  case Failed(name: String, reason: String)
+}
+```
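
Storing data in each state pays off when we consume a `Job`, because a pattern match gives us exactly the fields that exist in that state. Here is a minimal sketch, assuming a hypothetical `describe` method that is not part of the original text.

```scala
import scala.concurrent.Future

enum Job[A] {
  case Queued(name: String, job: () => A)
  case Running(name: String, host: String, result: Future[A])
  case Completed(name: String, result: A)
  case Failed(name: String, reason: String)
}

// Hypothetical helper: summarise a job using only the data its state carries.
def describe[A](job: Job[A]): String =
  job match {
    case Job.Queued(name, _)         => s"$name is queued"
    case Job.Running(name, host, _)  => s"$name is running on $host"
    case Job.Completed(name, result) => s"$name completed with $result"
    case Job.Failed(name, reason)    => s"$name failed: $reason"
  }
```

Each branch can only see the fields of its own state, so we cannot, for example, ask a `Queued` job for its `host`.
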
diff --git a/src/pages/adt/index.md b/src/pages/adt/index.md
@@ -1,4 +1,4 @@
-# Algebraic Data Types and Structural Recursion
+# Algebraic Data Types
 
 In this section we'll see our first example of a programming strategy: **algebraic data types**. Any data we can describe using logical ands and logical ors is an algebraic data type. Once we recognize an algebraic data type we get three things for free:
 

diff --git a/src/pages/adt/structural-corecursion.md b/src/pages/adt/structural-corecursion.md
@@ -13,7 +13,7 @@ Duality is often indicated by attaching the co- prefix to a term.
 So corecursion is the dual of recursion, and sum types, also known as coproducts, are the dual of product types.
 
 Duality is one of the main themes of this book.
-By relating concepts as duals, we can transfer knowledge from one domain to another.
+By relating concepts as duals we can transfer knowledge from one domain to another.
 </div>
 
 Structural recursion works by considering all the possible inputs (which we usually represent as patterns), and then working out what we do with each input case.
@@ -47,7 +47,7 @@ enum MyList[A] {
 ```
 
 The structural corecursion strategy says we write down all the constructors and then consider the conditions that will cause us to call each constructor.
-So our starting point is to just write down the two constructors, and put in some dummy conditions for each.
+So our starting point is to just write down the two constructors, and put in dummy conditions.
 
 ```scala mdoc:reset:silent
 enum MyList[A] {
@@ -73,7 +73,7 @@ enum MyList[A] {
 }
 ```
 
-Now to get the left-hand side we can use the strategies we've already seen:
+To complete the left-hand side we can use the strategies we've already seen:
 
 * we can use structural recursion to tell us there are two possible conditions; and
 * we can follow the types to align these conditions with the code we have already written.
@@ -100,12 +100,272 @@ For example, `foldLeft` and `foldRight` are not structural corecursions because
 Secondly, note that when we walked through the process of creating `map` as a structural recursion we implicitly used the structural corecursion pattern, as part of following the types.
 We recognised that we were producing a `List`, that there were two possibilities for producing a `List`, and then worked out the correct conditions for each case.
 Formalizing structural corecursion as a separate strategy allows us to be more conscious of where we apply it.
-Finally, noticed how I switched from an `if` expression to a pattern match expression as we progressed through defining `map`.
+Finally, notice how I switched from an `if` expression to a pattern match expression as we progressed through defining `map`.
 This is perfectly fine.
 Both kinds of expression can achieve the same effect, though if we wanted to continue using an `if` we'd have to define a method (for example, `isEmpty`) that allows us to distinguish an `Empty` element from a `Pair`.
 This method would have to use pattern matching in its implementation, so avoiding pattern matching directly is just pushing it elsewhere.
 
-**TODO: desribe abstract structural corecursion**
 
+### Unfolds as Structural Corecursion
 
-### Structural Corecursion as Unfold
+Just as we could abstract structural recursion as a fold, for any given algebraic data type we can abstract structural corecursion as an unfold. Unfolds are much less commonly used than folds, but they are still a nice tool to have.
+
+Let's work through the process using `MyList` again.
+
+```scala mdoc:reset:silent
+enum MyList[A] {
+  case Empty()
+  case Pair(head: A, tail: MyList[A])
+}
+```
+
+The skeleton is
+
+```scala
+if ??? then MyList.Empty()
+else MyList.Pair(???, recursion(???))
+```
+
+Let's start defining our method `unfold` so we can fill in the missing pieces.
+
+```scala
+def unfold[A, B](seed: A): MyList[B] =
+  if ??? then MyList.Empty()
+  else MyList.Pair(???, unfold(???))
+```
+
+We can abstract the condition using a function from `A => Boolean`.
+
+```scala
+def unfold[A, B](seed: A, stop: A => Boolean): MyList[B] =
+  if stop(seed) then MyList.Empty()
+  else MyList.Pair(???, unfold(???, stop))
+```
+
+Now we need to handle the cases for `Pair`. 
+We have a value of type `A` (`seed`), so to create the `head` element of `Pair` we can ask for a function `A => B`.
+
+```scala
+def unfold[A, B](seed: A, stop: A => Boolean, f: A => B): MyList[B] =
+  if stop(seed) then MyList.Empty()
+  else MyList.Pair(f(seed), unfold(???, stop, f))
+```
+
+Finally we need to update the current value of `seed` to the next value. That's a function `A => A`.
+
+```scala mdoc:silent
+def unfold[A, B](seed: A, stop: A => Boolean, f: A => B, next: A => A): MyList[B] =
+  if stop(seed) then MyList.Empty()
+  else MyList.Pair(f(seed), unfold(next(seed), stop, f, next))
+```
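
As a quick sanity check, here is a small usage sketch of the four-parameter `unfold`. The `toScalaList` helper and the `squares` value are my own illustrative names, and I pass the type arguments explicitly so the example does not depend on type inference.

```scala
enum MyList[A] {
  case Empty()
  case Pair(head: A, tail: MyList[A])
}

def unfold[A, B](seed: A, stop: A => Boolean, f: A => B, next: A => A): MyList[B] =
  if stop(seed) then MyList.Empty()
  else MyList.Pair(f(seed), unfold(next(seed), stop, f, next))

// Hypothetical helper: convert to a standard List so results are easy to inspect.
def toScalaList[A](list: MyList[A]): List[A] =
  list match {
    case MyList.Empty()    => Nil
    case MyList.Pair(h, t) => h :: toScalaList(t)
  }

// The seed counts 0, 1, 2, 3, 4; we stop when it reaches 5 and square each seed.
val squares: MyList[Int] = unfold[Int, Int](0, _ == 5, x => x * x, _ + 1)
```

Converting `squares` with `toScalaList` should give `List(0, 1, 4, 9, 16)`.
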
+
+At this point we're done.
+Let's see that `unfold` is useful by declaring some other methods in terms of it.
+We're going to declare `map`, which we've already seen is a structural corecursion, using `unfold`, and also `fill` and `iterate`, which correspond to the methods with the same names on `List` in the Scala standard library.
+
+To make this easier to work with I'm going to declare `unfold` as a method on the `MyList` companion object. 
+I have made a slight tweak to the definition to make type inference work a bit better.
+In Scala, types inferred for one method parameter cannot be used for other method parameters in the same parameter list.
+However, types inferred for one method parameter list can be used in subsequent lists.
+Separating the function parameters from the `seed` parameter means that the type inferred for `A` from `seed` can be used when inferring the input types of the function parameters.
+
+I have also declared some **destructor** methods, which are methods that take apart an algebraic data type.
+For `MyList` these are `head`, `tail`, and the predicate `isEmpty`.
+We'll talk more about these a bit later.
+
+Here's our starting point.
+
+```scala mdoc:reset:silent
+enum MyList[A] {
+  case Empty()
+  case Pair(_head: A, _tail: MyList[A])
+
+  def isEmpty: Boolean =
+    this match {
+      case Empty() => true
+      case _       => false
+    }
+
+  def head: A =
+    this match {
+      case Pair(head, _) => head
+    }
+
+  def tail: MyList[A] =
+    this match {
+      case Pair(_, tail) => tail
+    }
+}
+object MyList {
+  def unfold[A, B](seed: A)(stop: A => Boolean, f: A => B, next: A => A): MyList[B] =
+    if stop(seed) then MyList.Empty()
+    else MyList.Pair(f(seed), unfold(next(seed))(stop, f, next))
+}
+```
+
+Now let's define the constructors `fill` and `iterate`, and `map`, in terms of `unfold`. I think the constructors are a bit simpler, so I'll do those first.
+
+```scala
+object MyList {
+  def unfold[A, B](seed: A)(stop: A => Boolean, f: A => B, next: A => A): MyList[B] =
+    if stop(seed) then MyList.Empty()
+    else MyList.Pair(f(seed), unfold(next(seed))(stop, f, next))
+
+  def fill[A](n: Int)(elem: => A): MyList[A] =
+    ???
+
+  def iterate[A](start: A, len: Int)(f: A => A): MyList[A] =
+    ???
+}
+```
+
+Here I've just added the method skeletons, which are taken straight from the `List` documentation.
+To implement these methods we can use one of two strategies:
+
+- reasoning about loops in the way we might in an imperative language; or
+- reasoning about structural recursion over the natural numbers.
+
+Let's talk about each in turn.
+
+You might have noticed that the parameters to `unfold` are almost exactly those you need to create a for-loop in a language like Java. A classic for-loop, of the `for(i = 0; i < n; i++)` kind, has four components:
+
+1. the initial value of the loop counter;
+2. the stopping condition of the loop; 
+3. the statement that advances the counter; and
+4. the body of the loop that uses the counter.
+
+These four correspond to the `seed`, `stop`, `next`, and `f` parameters of `unfold` respectively.
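
To make the correspondence concrete, here is a rough side-by-side sketch. To keep it short I assume a variant of `unfold` that builds a standard `List` rather than `MyList`, and the names `squaresLoop` and `squaresUnfold` are only for illustration.

```scala
import scala.collection.mutable.ListBuffer

// Imperative version: the comments number the loop components as in the list above.
def squaresLoop(n: Int): List[Int] = {
  val buffer = ListBuffer.empty[Int]
  var i = 0           // 1. initial value of the counter (seed)
  while (i < n) {     // 2. the loop runs until the stopping condition i >= n holds (stop)
    buffer += i * i   // 4. body of the loop (f)
    i = i + 1         // 3. advance the counter (next)
  }
  buffer.toList
}

// Assumed variant of unfold that targets the standard List.
def unfold[A, B](seed: A)(stop: A => Boolean, f: A => B, next: A => A): List[B] =
  if stop(seed) then Nil
  else f(seed) :: unfold(next(seed))(stop, f, next)

// The same computation, with each loop component passed as a parameter.
def squaresUnfold(n: Int): List[Int] =
  unfold(0)(i => i >= n, i => i * i, i => i + 1)
```

Note that `stop` is the negation of the loop condition: the `while` loop says when to continue, while `unfold` says when to finish.
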
+
+Loop variants and invariants are the standard way of reasoning about imperative loops. I'm not going to describe them here, as you've probably learned them already (though perhaps not using these terms). Instead I'm going to discuss the second reasoning strategy, which relates writing `unfold` to something we've already discussed: structural recursion.
+
+Our first step is to note that natural numbers (the integers 0 and larger) are conceptually algebraic data types even though the implementation in Scala---using `Int`---is not. A natural number is either:
+
+- zero; or
+- 1 + a natural number.
+
+It's the simplest possible algebraic data type that is both a sum and a product type.
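
If it helps to see this written down, here is a sketch of the natural numbers as an actual algebraic data type, next to the same recursion expressed on `Int`. The names `Nat`, `toInt`, and `countDown` are my own and are only for illustration.

```scala
// A natural number is either zero, or the successor of a natural number.
enum Nat {
  case Zero
  case Succ(pred: Nat)
}

// Structural recursion over Nat: one case per constructor.
def toInt(n: Nat): Int =
  n match {
    case Nat.Zero       => 0
    case Nat.Succ(pred) => 1 + toInt(pred)
  }

// The same shape on Int: n == 0 plays the role of Zero, and n - 1 recovers
// the predecessor. This assumes n is not negative.
def countDown(n: Int): List[Int] =
  if n == 0 then Nil
  else n :: countDown(n - 1)
```

The `n == 0` test and the `n - 1` step are exactly the pieces we will reuse as the `stop` and `next` parameters of `unfold`.
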
+
+Once we see this, we can use the reasoning tools for structural recursion for creating the parameters to `unfold`.
+Let's show how this works with `fill`. The `n` parameter tells us how many elements are in the `List`, and the `elem` parameter creates those elements. So our starting point is to consider this as a structural recursion over the natural numbers. We can take `n` as `seed`, and `stop` as the function `x => x == 0`. These are the standard conditions for such a structural recursion. What about `next`? Well, the definition of natural numbers tells us we should subtract one, so `next` becomes `x => x - 1`. Now we only need `f`, and that comes from the definition of how `fill` is supposed to work. We create the value from `elem`, so `f` is just `_ => elem`.
+
+```scala
+object MyList {
+  def unfold[A, B](seed: A)(stop: A => Boolean, f: A => B, next: A => A): MyList[B] =
+    if stop(seed) then MyList.Empty()
+    else MyList.Pair(f(seed), unfold(next(seed))(stop, f, next))
+
+  def fill[A](n: Int)(elem: => A): MyList[A] =
+    unfold(n)(_ == 0, _ => elem, _ - 1)
+
+  def iterate[A](start: A, len: Int)(f: A => A): MyList[A] =
+    ???
+}
+```
+
+```scala mdoc:reset:invisible
+enum MyList[A] {
+  case Empty()
+  case Pair(_head: A, _tail: MyList[A])
+
+  def isEmpty: Boolean =
+    this match {
+      case Empty() => true
+      case _       => false
+    }
+
+  def head: A =
+    this match {
+      case Pair(head, _) => head
+    }
+
+  def tail: MyList[A] =
+    this match {
+      case Pair(_, tail) => tail
+    }
+
+  override def toString(): String = {
+    def loop(list: MyList[A]): List[A] =
+      list match {
+        case Empty() => List.empty
+        case Pair(h, t) => h :: loop(t)
+      }
+    s"MyList(${loop(this).mkString(", ")})"
+  }
+}
+object MyList {
+  def unfold[A, B](seed: A)(stop: A => Boolean, f: A => B, next: A => A): MyList[B] =
+    if stop(seed) then MyList.Empty()
+    else MyList.Pair(f(seed), unfold(next(seed))(stop, f, next))
+
+  def fill[A](n: Int)(elem: => A): MyList[A] =
+    unfold(n)(_ == 0, _ => elem, _ - 1)
+
+  def iterate[A](start: A, len: Int)(f: A => A): MyList[A] =
+    unfold((len, start))(
+      (len, _) => len == 0,
+      (_, start) => start,
+      (len, start) => (len - 1, f(start))
+    )
+}
+```
+
+We should check that our implementation works as intended. We can do this by comparing it to `List.fill`.
+
+```scala mdoc:to-string
+List.fill(5)(1)
+MyList.fill(5)(1)
+```
+```scala mdoc:silent
+var counter = 0
+def getAndInc(): Int = {
+  val temp = counter
+  counter = counter + 1
+  temp 
+}
+```
+```scala mdoc:to-string
+List.fill(5)(getAndInc())
+counter = 0
+MyList.fill(5)(getAndInc())
+```
+
+#### Exercise {-}
+
+Implement `iterate` using the same reasoning as we did for `fill`.
+This is slightly more complex than `fill` as we need to keep two pieces of information: the value of the counter and the current value of type `A`.
+
+<div class="solution">
+```scala
+object MyList {
+  def unfold[A, B](seed: A)(stop: A => Boolean, f: A => B, next: A => A): MyList[B] =
+    if stop(seed) then MyList.Empty()
+    else MyList.Pair(f(seed), unfold(next(seed))(stop, f, next))
+
+  def fill[A](n: Int)(elem: => A): MyList[A] =
+    unfold(n)(_ == 0, _ => elem, _ - 1)
+
+  def iterate[A](start: A, len: Int)(f: A => A): MyList[A] =
+    unfold((len, start))(
+      (len, _) => len == 0,
+      (_, start) => start,
+      (len, start) => (len - 1, f(start))
+    )
+}
+```
+
+We should check that this works.
+
+```scala mdoc:to-string
+List.iterate(0, 5)(x => x - 1)
+MyList.iterate(0, 5)(x => x - 1)
+```
+</div>
+
+
+One last thing before we leave `unfold`. If we look at how `unfold` is usually defined, we'll typically find the following signature.
+
+```scala
+def unfold[A, B](in: A)(f: A => Option[(A, B)]): List[B]
+```
+
+This is equivalent to the definition we used, just a bit more compact in terms of the interface it presents. We used a more explicit definition that makes the role of each individual element a little bit easier to understand.
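
To see why the two are equivalent, here is a minimal sketch that implements our explicit `unfold` in terms of the compact one. I assume both build a standard `List`, and `unfoldOption` is just my name for the compact version. The pair is read as (next seed, element), following the types in the signature above. Note that the standard library's `List.unfold` orders the pair the other way around, as (element, next state).

```scala
// Compact form: None means stop, Some((next, out)) means emit out and continue from next.
def unfoldOption[A, B](in: A)(f: A => Option[(A, B)]): List[B] =
  f(in) match {
    case None              => Nil
    case Some((next, out)) => out :: unfoldOption(next)(f)
  }

// Explicit form: stop, f, and next are packed into a single Option-returning function.
def unfold[A, B](seed: A)(stop: A => Boolean, f: A => B, next: A => A): List[B] =
  unfoldOption[A, B](seed)(a => if stop(a) then None else Some((next(a), f(a))))
```

The `Option` plays the role of `stop`, and the two halves of the pair play the roles of `next` and `f`.
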