Skip to content

Implementations of algorithms and data structures from "Introduction to Algorithms", Fourth Edition by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein

License

Notifications You must be signed in to change notification settings

wojtask/clrs4e-implementations

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Introduction to Algorithms, Fourth Edition — implementations of algorithms and data structures

Build & test

Overview

The goal of this project is to provide implementations of algorithms and data structures from Introduction to Algorithms, Fourth Edition, and from Solutions to exercises and problems. The book was written by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. The solutions are developed by myself, and are currently work in progress. The implementations here serve as a companion project to the solutions and play an invaluable role in verifying the theoretical work in practice. They help detecting and fixing bugs or limitations in the suggested algorithms and data structures.

Project objectives

  • Adaptation of pseudocodes from the textbook and from the solutions to a real programming language.
  • Implementation of algorithms and data structures with no explicit pseudocodes, yet precisely described in any of the source positions.
  • Testing the implementations, especially those from the solutions, to increase confidence in algorithms correctness.
  • Laying a foundation for a more practical (or "production-ready") library of algorithms and data structures for real use cases.

Choice of a programming language

Python 3 was chosen as a programming language for several reasons:

  • Python is widely used in academia and is often the programming language of choice in introductory computer science courses. It is also widely used in many business applications. That makes it widely known in many communities, both academic and professional.
  • Python syntax and semantic show many similarities to pseudocode used in the textbook and the solutions. This enables easily transitioning between the two ways of expressing algorithms.
  • Python does not limit developers with a single programming paradigm, making it elastic to adapt to different models found in pseudocodes.
  • Python typing system resembles the rules followed in pseudocode, particularly dynamic typing and duck typing.

The implementations are written in a way to be as close as possible to the algorithms in the textbook or in the solutions, both in terms of syntax, and behavior. This principle lead to using the procedural paradigm whenever possible and applicable, as well as translating pseudocode instructions to most relevant Python statements or expressions. Algorithms with no pseudocodes provided are implemented in a more flexible and often more concise way, more resembling an idiomatic Python code. The code in such cases is often appropriately structured to increase readability.

History and Future

A couple of yeas ago I started implementing the algorithms from the second edition of the book while working on the solutions in Polish. On GitHub there are legacy projects written in Java and Python. Soon after the fourth edition of the book got published, I deprecated those projects and began working on the solutions to the fourth edition by migrating the old material, i.e., adapting it to the new edition and translating it to English.

I plan to adapt the implementations similarly by moving the legacy code and keeping it consistent as I work on solving exercises in each chapter. Therefore, the progress on the implementations almost entirely depends on the progress in the solutions.

I also plan to rethink the approach to testing by introducing two test categories — fast and deterministic unit tests and robust property-based tests for generating random test cases. The idea is to make writing tests easier and better control their execution. By delegating the generation of test cases to libraries such as Hypothesis, I could simplify many legacy tests that implemented their own test cases generators. Also, keeping the two test types separated, I could better control when each are run, which would lead to speeding up the testing process. This can be achieved by running only the fast unit tests synchronously during build's test phase, and defining an asynchronous mechanism for running the other tests.

Pseudocode translation rules

The table below lists the rules followed in the implementation from translating pseudocode constructs into Python instructions. A pseudocode may also contain lines that don't match any of the categories in this table, like instructions in plain English or more sophisticated mathematical expressions. In such cases the rule is to implement these lines as a separate helper function in Python, named appropriately. The function should implement only the step reflected by that line. Pseudocode comments are ignored.

Category Pseudocode construct Python code Remarks
predefined constant
True
False
None
math.inf
custom constant
NO_SUCH_PATH
Defined as a standalone or an enum value.
variable
x
y2_
best_score
object's attribute
T.root
A.heap_size
x.key[i]
key is implemented as an Array.
assignment
x = y
scalar comparison
x == y
x and y are scalars.
x != y
x < y
x > y
x <= y
x >= y
pointer comparison
x is y
x and y are pointers.
x is not y
logical operation
<condition1> and <condition2>
<condition1> or <condition2>
not <condition>
arithmetic operation
x += y
x -= y
x + y
x - y
x * y
x / y
For real x and y.
x // y
For integer x and y.
x // y
-(x // -y)
x % y
x ** y
math.sqrt(x)
math.floor(x)
math.ceil(x)
min(x, y)
max(x, y)
value exchange
x, y = y, x
printing
print(x)
procedure definition
def insertion_sort(A, n):
  <block>
May include type hints for the parameters or the return value.
procedure call
insertion_sort(A, n)
procedure return
return
return x
return x, y
array creation
A = Array(0, n)
The created array consists of None values on each position.
array cell reference
A[i]
Either as an accessor or a mutator.
set creation
S = set()
set membership
x in S
sets union
S | {x}
set cardinality
len(S)
error
raise ValueError('overflow')
if statement
if <condition1>:
  <block1>
elif <condition2>:
  <block2>
else:
  <block3>
Can have zero or more elif branches and zero or one else branch.
for loop
for i in range_of(x, to=y):
  <block>
range_of(x, to=y) is equivalent to range(x, y + 1).
for i in range_of(y, downto=x):
  <block>
range_of(y, downto=x) is equivalent to range(y, x - 1, -1).
for-each loop
for v in V:
  <block>
while loop
while <condition>:
  <block>
repeat loop
while True:
  <block>
  if <condition>:
    break

Stay tuned for more information once I migrate some amount of code.

About

Implementations of algorithms and data structures from "Introduction to Algorithms", Fourth Edition by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Languages