Fix typos in huffman_encoding.tex #30

Open · wants to merge 1 commit into master
14 changes: 7 additions & 7 deletions huffman_encoding.tex
@@ -208,7 +208,7 @@ \subsection{Compute Bit Length}
\begin{equation}
\begin{array}{rrcl}
&\mathrm{depth}(\mathrm{root}) &=& 0 \\
-\forall n != \mathrm{root}, &\mathrm{depth}(n) &=& \mathrm{depth}(\mathrm{parent}(n)+1)\\
+\forall n != \mathrm{root}, &\mathrm{depth}(n) &=& \mathrm{depth}(\mathrm{parent}(n))+1\\
\forall n, &\mathrm{child\_depth}(n) &=& \mathrm{depth}(n)+1
\end{array}
\end{equation}
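The corrected recurrence amounts to a single pass over the nodes once parent indices are known. The following is a minimal sketch only, assuming an array-based tree in which node 0 is the root and every parent index precedes its children; the identifiers \lstinline{parent}, \lstinline{depth}, and \lstinline{num_nodes} are illustrative, not the book's actual code.

\begin{lstlisting}
// Sketch: depth(root) = 0 and depth(n) = depth(parent(n)) + 1.
// Assumes node 0 is the root and parents appear before their children.
void compute_depths(const int parent[], int depth[], int num_nodes) {
  depth[0] = 0;
  for (int n = 1; n < num_nodes; n++) {
    depth[n] = depth[parent[n]] + 1; // i.e., child_depth(parent(n))
  }
}
\end{lstlisting}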
@@ -251,7 +251,7 @@ \subsection{Truncate Tree}

% more depth here would be good.

-The input histogram is contained in \lstinline{input_length_histogram}, which was derived by the \lstinline{compute_bit_length()} function described in the previous section. There are two identical output arrays \lstinline{truncated_length_histogram1} and \lstinline{truncated_length_histogram2}. These arrays are passed to two separate functions later in the process (\lstinline{canonize_tree} and \lstinline{create_codewords}), and thus we must have two arrays to adhere to the single producer, single consumer constraint of the \lstinline{dataflow} directive.
+The input histogram is contained in \lstinline{input_length_histogram}, which was derived by the \lstinline{compute_bit_length()} function described in the previous section. There are two identical output arrays \lstinline{output_length_histogram1} and \lstinline{output_length_histogram2}. These arrays are passed to two separate functions later in the process (\lstinline{canonize_tree} and \lstinline{create_codewords}), and thus we must have two arrays to adhere to the single producer, single consumer constraint of the \lstinline{dataflow} directive.
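The reason for the duplication is that, under the \lstinline{dataflow} directive, each intermediate array may have only one producing function and one consuming function. The following is a hedged sketch of that structure with simplified placeholder names (\lstinline{producer}, \lstinline{consumer_a}, \lstinline{consumer_b}, and \lstinline{N} are not the book's actual interfaces): one producer writes two copies, and each downstream function reads exactly one of them.

\begin{lstlisting}
// Sketch of the single-producer, single-consumer pattern under dataflow.
const int N = 32; // illustrative bound

void producer(const int in[N], int out1[N], int out2[N]) {
  for (int i = 0; i < N; i++) {
    out1[i] = in[i]; // read only by consumer_a (stand-in for canonize_tree)
    out2[i] = in[i]; // read only by consumer_b (stand-in for create_codewords)
  }
}

void consumer_a(const int hist[N], int out[N]) { out[0] = hist[0]; } // stub
void consumer_b(const int hist[N], int out[N]) { out[0] = hist[0]; } // stub

void top(const int in[N], int result_a[N], int result_b[N]) {
#pragma HLS dataflow
  int copy1[N], copy2[N];
  producer(in, copy1, copy2);
  consumer_a(copy1, result_a); // each copy has exactly one reader
  consumer_b(copy2, result_b);
}
\end{lstlisting}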

\begin{figure}
\lstinputlisting[lastline=39]{examples/huffman_truncate_tree.cpp}
@@ -268,13 +268,13 @@ \subsection{Truncate Tree}
The \lstinline{copy_in for} loop is not optimized. What happens to the latency and initiation interval of the \lstinline{truncate_tree} function if we use a \lstinline{pipeline} or \lstinline{unroll} directive on this loop? What happens to the overall latency and initiation interval of the design (i.e., the \lstinline{huffman_encoding} function)?
\end{exercise}
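For readers who want to try the experiment, a directive is placed just inside the loop being optimized. This is a sketch only; the function name, array names, and bound are illustrative, not the listing's exact code.

\begin{lstlisting}
// Sketch: where a pipeline directive would sit on the copy loop.
void copy_histogram(const int in[32], int out[32]) {
 copy_in:
  for (int i = 0; i < 32; i++) {
#pragma HLS pipeline II=1
    out[i] = in[i];
  }
}
\end{lstlisting}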

-The function continues in the second \lstinline{move_nodes for} loop, which performs the bulk of the computation. This \lstinline{for} loop starts by iterating through the \lstinline{truncated_length_histogram} array from the largest index (\lstinline{TREE_DEPTH} - the specified maximum depth for a tree). This continues down through the array until there is a non-zero element or \lstinline{i} reaches the \lstinline{MAX_CODEWORD_LENGTH}. If we do not find a non-zero element, that means the initial input Huffman tree does not have any nodes with a depth larger than the target depth. In other words, we can exit this function without performing any truncation. If there is a value larger than the target depth, then the function continues by reorganizing the tree so that all of the nodes have depth smaller than the target depth. This is done by the operations in the \lstinline{reorder while} loop. When there are nodes to move, the \lstinline{move_nodes for} loop goes through them from those with the largest depth, and continues to smaller depths until all nodes are rearranged with a depth smaller than the target. Each iteration of this \lstinline{move_nodes for} loops works on moving nodes from one depth at a time.
+The function continues in the second \lstinline{move_nodes for} loop, which performs the bulk of the computation. This \lstinline{for} loop starts by iterating through the \lstinline{output_length_histogram} array from the largest index (\lstinline{TREE_DEPTH}, the specified maximum depth for a tree). This continues down through the array until there is a non-zero element or \lstinline{i} reaches the \lstinline{MAX_CODEWORD_LENGTH}. If we do not find a non-zero element, that means the initial input Huffman tree does not have any nodes with a depth larger than the target depth. In other words, we can exit this function without performing any truncation. If there is a value larger than the target depth, then the function continues by reorganizing the tree so that all of the nodes have a depth smaller than the target depth. This is done by the operations in the \lstinline{reorder while} loop. When there are nodes to move, the \lstinline{move_nodes for} loop goes through them starting from those with the largest depth, and continues to smaller depths until all nodes are rearranged with a depth smaller than the target. Each iteration of this \lstinline{move_nodes for} loop works on moving nodes from one depth at a time.
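The early-exit scan described above can be sketched as a small helper; the names and bounds below are illustrative rather than the actual loop structure in the listing.

\begin{lstlisting}
// Sketch: walk down from the deepest possible level and stop at the
// first non-empty one; report -1 if nothing exceeds the target depth.
int find_deepest_level(const int hist[], int tree_depth, int target) {
  for (int i = tree_depth - 1; i > target; i--) {
    if (hist[i] != 0)
      return i; // some node is deeper than the target: truncation needed
  }
  return -1;    // no node exceeds the target: nothing to do
}
\end{lstlisting}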

-The \lstinline{reorder while} loop moves one node in each iteration. The first \lstinline{if} statement is used to find the leaf node with the largest depth. We will then alter this node by making it an intermediate node, and adding it and the leaf node with a depth larger than than target as children. This \lstinline{if} clause has a \lstinline{do/while} loop that iterates downward from the target looking for a non-zero entry in the \lstinline{truncated_length_histogram} array. It works in a similar manner as the beginning of the \lstinline{move_nodes for} loop. When it has found the deepest leaf node less than the target depth, it stops. The depth of this node is stored in \lstinline{j}.
+The \lstinline{reorder while} loop moves one node in each iteration. The first \lstinline{if} statement is used to find the leaf node with the largest depth. We will then alter this node by making it an intermediate node, and adding it and the leaf node with a depth larger than the target as children. This \lstinline{if} clause has a \lstinline{do/while} loop that iterates downward from the target looking for a non-zero entry in the \lstinline{output_length_histogram} array. It works in a similar manner to the beginning of the \lstinline{move_nodes for} loop. When it has found the deepest leaf node at a depth less than the target depth, it stops. The depth of this node is stored in \lstinline{j}.
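The search for \lstinline{j} mirrors that scan but starts just below the target depth and moves downward. A sketch with illustrative names, assuming at least one leaf exists below the target:

\begin{lstlisting}
// Sketch: find the deepest level strictly below the target that still
// holds a leaf; this index plays the role of j in the text.
int find_move_target(const int hist[], int target) {
  int j = target;
  do {
    j--;
  } while (j > 0 && hist[j] == 0);
  return j;
}
\end{lstlisting}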

-Now we have a node with a depth \lstinline{i} larger than the target, and a node with a depth smaller than the target stored in \lstinline{j}. We move the node from depth \lstinline{i} and a node from \lstinline{j} into child nodes at depth \lstinline{j + 1}. Therefore, we add two symbols to \lstinline{truncated_length_histogram[j+1]}. We are making a new intermediate node a depth \lstinline{j} thus, we subtract a symbol from that level. We move the other leaf node from depth \lstinline{i} to depth \lstinline{i - 1}. And we subtract two from \lstinline{truncated_length_histogram[i]} since one of the nodes went to level \lstinline{j + 1} and the other when to level \lstinline{i - 1}. These operations are performed in the four statements on the array \lstinline{truncated_length_histogram}. Since we added a symbol to level \lstinline{j + 1}, we update \lstinline{j}, which holds the highest level under the target level, and then we repeat the procedure. This is done until there are no additional symbols with a depth larger than the target.
+Now we have a node with a depth \lstinline{i} larger than the target, and a node with a depth smaller than the target stored in \lstinline{j}. We move the node from depth \lstinline{i} and the node from depth \lstinline{j} into child nodes at depth \lstinline{j + 1}. Therefore, we add two symbols to \lstinline{output_length_histogram[j+1]}. We are making a new intermediate node at depth \lstinline{j}; thus, we subtract a symbol from that level. We move the other leaf node from depth \lstinline{i} to depth \lstinline{i - 1}. We subtract two from \lstinline{output_length_histogram[i]} since one of the nodes went to level \lstinline{j + 1} and the other went to level \lstinline{i - 1}. These operations are performed in the four statements on the array \lstinline{output_length_histogram}. Since we added a symbol to level \lstinline{j + 1}, we update \lstinline{j}, which tracks the deepest level below the target level, and then we repeat the procedure. This is done until there are no additional symbols with a depth larger than the target.
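The four histogram updates can be written compactly. A sketch of one reorder step, assuming \lstinline{i} and \lstinline{j} have been found as described above (names are illustrative):

\begin{lstlisting}
// Sketch of one truncation step on the bit-length histogram.
void truncate_one_node(int hist[], int i, int j) {
  hist[j + 1] += 2; // the two moved leaves become children at depth j + 1
  hist[j]     -= 1; // the former leaf at depth j is now an internal node
  hist[i - 1] += 1; // the remaining sibling at depth i rises one level
  hist[i]     -= 2; // both nodes previously counted at depth i moved away
}
\end{lstlisting}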

-The function completes by creating an additional copy of the new bit lengths. This is done by storing the updated bit lengths in the array \lstinline{truncated_length_histogram1} into the array \lstinline{truncated_length_histogram2}. We will pass these two arrays to the final two functions in the \lstinline{huffman_encoding} top function; we need two arrays to insure that the constraints of the \lstinline{dataflow} directive are met.
+The function completes by creating an additional copy of the new bit lengths. This is done by copying the updated bit lengths from the array \lstinline{output_length_histogram1} into the array \lstinline{output_length_histogram2}. We will pass these two arrays to the final two functions in the \lstinline{huffman_encoding} top function; we need two arrays to ensure that the constraints of the \lstinline{dataflow} directive are met.
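The closing copy itself is just an element-by-element loop; a minimal sketch with illustrative names and bound:

\begin{lstlisting}
// Sketch: duplicate the truncated bit-length histogram so that each
// downstream dataflow function has its own single-producer array.
void copy_lengths(const int src[32], int dst[32]) {
  for (int i = 0; i < 32; i++)
    dst[i] = src[i];
}
\end{lstlisting}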

\subsection{Canonize Tree}
\label{sec:huffman_canonize_tree}
@@ -393,4 +393,4 @@ \section{Conclusion}

Huffman Coding is a common type of data compression used in many applications. While encoding and decoding using a Huffman code are relatively simple operations, generating the Huffman code itself can be a computationally challenging problem. In many systems it is advantageous to have relatively small blocks of data, implying that new Huffman codes must be created often, making it worthwhile to accelerate.

-Compared to other algorithms we've studied in this book, creating a Huffman code contains a number of steps with radically different code structures. Some are relatively easy to parallelize, while others are more challenging. Some portions of the algorithm naturally have higher $\mathcal{O}(n)$ complexity, meaning that they must be more heavily parallelized to achieve a balanced pipeline. However, using the \lstinline{dataflow} directive in \VHLS, these different code structures can be linked together relatively easily
+Compared to other algorithms we've studied in this book, creating a Huffman code contains a number of steps with radically different code structures. Some are relatively easy to parallelize, while others are more challenging. Some portions of the algorithm naturally have higher $\mathcal{O}(n)$ complexity, meaning that they must be more heavily parallelized to achieve a balanced pipeline. However, using the \lstinline{dataflow} directive in \VHLS, these different code structures can be linked together relatively easily.