rebuilding
gvwilson committed May 4, 2024
1 parent 8f5137c commit 9cb368e
Showing 11 changed files with 1,578 additions and 46 deletions.
10 changes: 4 additions & 6 deletions diff/index.md
@@ -1,13 +1,11 @@
---
---

# Chapter N: A Diff Tool
- The difference between two sequences can be represented as a conversion of a source sequence into a target sequence by applying a series of element-wise insertion, deletion, and matching operations.
- Representing the differences between two files is a fundamental feature of version-control systems: it is used to display a commit, to show the difference between two commits, and to assist in merging branches.
- Context is important when representing differences, and different formats take advantage of context in different ways.

| <span style="font-weight:normal; text-align: left;"><ul><li>The difference between two sequences can be represented as a conversion of a source sequence into a target sequence by applying a series of element-wise insertion, deletion, and matching operations.</li><li>Representing the differences between two files is a fundamental feature of version-control systems: it is used to display a commit, to show the difference between two commits, and to assist in merging branches.</li><li>Context is important when representing differences, and different formats take advantage of context in different ways.</li></ul></span> |
|:----------------------------------------------|

| <span style="font-weight:normal; text-align: left;">Terms defined: [**Ability**](#), [**diff**](#), [**dynamic programming**](#), [**longest common subsequence**](#), [**memoization**](#), [**merge**](#), [**opaque type**](#), [**platform**](#), [**version-control system**](#)</span> |
|:----------------------------------------------|
Terms defined: ability, diff, dynamic programming, longest common subsequence, memoization, merge, opaque type, platform, version-control system

1. [Representation](#section-n1-representation)
2. [Longest Common Subsequence (LCS)](#section-n2-longest-common-subsequence-lcs)
17 changes: 17 additions & 0 deletions docs/diff/Makefile
@@ -0,0 +1,17 @@
main = src/main.roc

build:
	roc build $(main) --output rocdiff

lint:
	roc check $(main)

format:
	roc format $(main)

test:
	roc test $(main)

clean:
	rm -f rocdiff

4 changes: 4 additions & 0 deletions docs/diff/TODO.md
@@ -0,0 +1,4 @@
# TODO

- [ ] Add docstrings to the in-line code snippets, as well as the complete code.
- [ ] Improve error messages and eliminate every `crash` call that isn't absolutely necessary.
8 changes: 8 additions & 0 deletions docs/diff/examples/Hello.roc
@@ -0,0 +1,8 @@
app "hello"
    packages { pf: "https://github.com/roc-lang/basic-cli/releases/download/0.9.1/y_Ww7a2_ZGjp0ZTt9Y_pNdSqqMRdMLzHMKfdN8LWidk.tar.br" }
    imports [pf.Stdout]
    provides [main] to pf

main =
    Stdout.line "Hello!"

8 changes: 8 additions & 0 deletions docs/diff/examples/HelloWorld.roc
@@ -0,0 +1,8 @@
app "hello-world"
    packages { pf: "https://github.com/roc-lang/basic-cli/releases/download/0.9.1/y_Ww7a2_ZGjp0ZTt9Y_pNdSqqMRdMLzHMKfdN8LWidk.tar.br" }
    imports [pf.Stdout]
    provides [main] to pf

main =
    Stdout.line "Hello, World!"

8 changes: 8 additions & 0 deletions docs/diff/examples/source.txt
@@ -0,0 +1,8 @@
A
B
C
D
E
F
G
H
8 changes: 8 additions & 0 deletions docs/diff/examples/target.txt
@@ -0,0 +1,8 @@
I
B
C
D
E
F
J
H
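
The two example files above differ in their first and seventh lines. As a rough illustration of how a longest-common-subsequence diff would treat them, here is a minimal sketch in Python (not the chapter's Roc code). The function name `diff` and the `(op, element)` pair representation are assumptions made for this example only.

```python
from functools import lru_cache


def diff(source, target):
    """Return (op, element) pairs: ' ' for a match, '-' for a deletion, '+' for an insertion."""

    @lru_cache(maxsize=None)
    def lcs_len(i, j):
        # Length of the longest common subsequence of source[i:] and target[j:].
        if i == len(source) or j == len(target):
            return 0
        if source[i] == target[j]:
            return 1 + lcs_len(i + 1, j + 1)
        return max(lcs_len(i + 1, j), lcs_len(i, j + 1))

    ops = []
    i = j = 0
    while i < len(source) and j < len(target):
        if source[i] == target[j]:
            ops.append((" ", source[i]))
            i += 1
            j += 1
        elif lcs_len(i + 1, j) >= lcs_len(i, j + 1):
            # Dropping the source element loses nothing, so emit a deletion.
            ops.append(("-", source[i]))
            i += 1
        else:
            ops.append(("+", target[j]))
            j += 1
    # Whatever remains on either side is pure deletion or pure insertion.
    ops.extend(("-", x) for x in source[i:])
    ops.extend(("+", x) for x in target[j:])
    return ops


# Contents of examples/source.txt and examples/target.txt, one element per line.
source = list("ABCDEFGH")
target = list("IBCDEFJH")

for op, element in diff(source, target):
    print(op + element)
```

For these inputs the sketch deletes `A`, inserts `I`, keeps `B` through `F`, deletes `G`, inserts `J`, and keeps `H`. The recursive formulation is only suitable for short inputs; a table-based version would avoid Python's recursion limit.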
810 changes: 770 additions & 40 deletions docs/diff/index.html

Large diffs are not rendered by default.

28 changes: 28 additions & 0 deletions docs/diff/outline.md
@@ -0,0 +1,28 @@
# Outline
**Note:** This is only for reference. To be removed upon finalising a first full draft of the content.

- A brief intro to the [longest common subsequence](https://en.wikipedia.org/wiki/Longest_common_subsequence#Print_the_diff) (LCS) algorithm and its applications.
- Building visual intuition about the workings of the algorithm via toy examples (a couple of them using DNA base-pair sequences, the quintessential example).
- Discussing the algorithm design choices that make for a visually "good" diff output in practice.
- Implementing a textbook Roc version of the LCS algorithm.
- Incrementally introducing enhancements to the implementation, targeted towards using it more effectively as a `diff` tool.
  - This gives the opportunity to discuss Roc-specific concepts such as:
    - Abilities such as `Eq` and `Hash`.
      - The discussion will touch upon the fact that the LCS algorithm can be applied to arbitrary homogeneous sequences of elements of any type, as long as elements of the underlying type can be compared for equality.
    - Records and associated syntax.
      - This will be useful for customising the tool, for instance:
        - Collapsing long sections of matching sequences (this can be parametrised by length).
        - Colourising the output (different colour schemes may apply).
- Employing the implemented tool as a `git diff` tool.
- Discussing and implementing optimisations such as operating on "compressed" versions of the elements such as hashes and lengths.
- Discussing how a `diff` tool connects to merging branches in a version-control system, via the 3-way merge algorithm.
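
One of the customisations listed above, collapsing long runs of matching elements with a length parameter, can be sketched as follows. This is an illustrative Python sketch rather than the planned Roc implementation; the `collapse` function, its `context` parameter, and the `"@"` marker are hypothetical names chosen for this example.

```python
def collapse(ops, context=1):
    """Replace long runs of matches with a single marker.

    `ops` is a list of (op, element) pairs, where op is ' ' (match),
    '-' (deletion), or '+' (insertion). Up to `context` matching
    elements are kept on each side of every change.
    """
    # Indices of all operations that are actual changes.
    changed = [k for k, (op, _) in enumerate(ops) if op != " "]

    # Every index within `context` positions of a change is kept.
    keep = set()
    for k in changed:
        keep.update(range(max(0, k - context), min(len(ops), k + context + 1)))

    out = []
    skipped = 0
    for k, pair in enumerate(ops):
        if k in keep:
            if skipped:
                out.append(("@", f"{skipped} unchanged"))
                skipped = 0
            out.append(pair)
        else:
            skipped += 1
    if skipped:
        out.append(("@", f"{skipped} unchanged"))
    return out


ops = [
    ("-", "A"), ("+", "I"),
    (" ", "B"), (" ", "C"), (" ", "D"), (" ", "E"), (" ", "F"),
    ("-", "G"), ("+", "J"),
    (" ", "H"),
]
for op, element in collapse(ops, context=1):
    print(op, element)
```

With `context=1`, the matches `C`, `D`, and `E` are folded into a single `3 unchanged` marker, while `B`, `F`, and `H` stay visible because they sit next to a change. Passing the context length as a parameter mirrors the outline's idea of making the collapse configurable.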

## In scope, if time permits

**Note:** "Time" here means time from the reader's perspective, in terms of the generally-agreed-upon reader persona and the associated allotted time-per-chapter guideline.

- Improving the implementation so that the output format, beyond basic markers for insertions and deletions, conforms to one of the common `diff` format [specifications](https://www.math.utah.edu/docs/info/diff_3.html).
- An overview, and perhaps an implementation, of the algorithms used by `git diff` and other industry-standard tools, and a comparison of them with the LCS algorithm.
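
For a sense of what conforming to a common format looks like, the sketch below uses `difflib.unified_diff` from Python's standard library to produce unified-format output for the two example files. This is only an illustration of the target format: `difflib` uses its own matching algorithm, which is not necessarily the LCS approach the chapter implements, and the file names are passed as plain labels.

```python
import difflib

# Contents of examples/source.txt and examples/target.txt, one element per line.
source = ["A", "B", "C", "D", "E", "F", "G", "H"]
target = ["I", "B", "C", "D", "E", "F", "J", "H"]

# lineterm="" because the input lines carry no trailing newlines.
result = list(
    difflib.unified_diff(
        source,
        target,
        fromfile="source.txt",
        tofile="target.txt",
        lineterm="",
    )
)

for line in result:
    print(line)
```

The output starts with `---` and `+++` file headers, followed by a hunk header of the form `@@ -1,8 +1,8 @@`, and then the lines prefixed with a space, `-`, or `+`. With the default of three context lines, both changes fall into a single hunk here.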

## Out of scope
- Version-control system concepts beyond `diff`-ing files and the prerequisites for merging and identifying merge conflicts.