- Due date: check Discord or the Course Materials Schedule.
- This assignment is graded as described in the syllabus section on Engineering Effort Assignment Evaluation.
- Submit this assignment on GitHub, following the expectations in the syllabus on Assignment Submission.
- To begin, read this README.
- This project has been adapted from Proactive Programmers' material, so discrepancies are possible.
- Post questions and requests for clarification to the #data-structures Discord channel.
- For reference, check the starter repository.
- Modifications to the `gatorgrade.yml` file are not permitted without explicit instruction.
The learning objectives of this assignment are to:
- Use Git and GitHub to manage source code file changes
- Compare sorting algorithms empirically
- Use big-O notation to characterize sorting algorithms
- Write clearly about the programming concepts in this assignment.
Please review the course expectations on the syllabus about Seeking Assistance. Students are reminded to uphold the Honor Code. Cloning the assignment repository is a commitment to the latter.
For this assignment, you may use class materials, textbooks, notes, and the internet. Ensure that your writing is original and based on your own understanding of the concepts.
To claim that work is your own, it is essential to craft the logic and the writing together without copying or using the logical structure of another source. The honor code holds everyone to this standard.
If you have questions outside of lab, you can use the #data-structures Discord channel, TL office hours, instructor office hours, and GitHub Issues.
Project Goals (Project Overview Below)
This engineering effort invites you to implement and use a program called `listsorting` that performs a doubling experiment to evaluate the performance of several different sorting algorithms. First, you will follow the Sorting Algorithms in Python tutorial from Real Python and add certain sorting algorithms, like bubble sort and quick sort, to your project. After you have implemented all of the required sorting algorithms, you will use the provided benchmarking framework to conduct a doubling experiment. This doubling experiment will invoke a specific sorting algorithm while repeatedly doubling the size of its input for a specific number of doubling rounds. Since the doubling experiment enables you to calculate a doubling ratio, it allows you to experimentally predict the likely worst-case time complexity of each sorting algorithm. In addition to implementing the sorting algorithms and extending the benchmarking framework, you will use a comprehensive command-line interface, implemented with Typer, that allows you to easily control the execution of a doubling experiment. Finally, you will use the empirical results from the `listsorting` program to better understand the performance trade-offs associated with sorting algorithms implemented in Python.
This project invites you to implement a Python program, called `listsorting`, that performs a doubling experiment to evaluate the performance of several different sorting algorithms. After you finish a correct implementation of all the program's features, running it with the command `poetry run listsorting --starting-size 100 --number-doubles 5 --approach insertion` causes it to produce output like the following. With that said, please remember that when you run the `listsorting` program on your computer it will likely produce different performance results! Importantly, this output shows that the `listsorting` program ran the insertion sort algorithm, denoted `insertion`, for a total of `5` rounds in a doubling experiment that created input sizes ranging from `100` to `1600`. When `listsorting` runs the experiment, it uses the `timeit` package to measure the `min`, `max`, and `avg` execution time of the algorithm.
```
✨ Conducting an experiment to measure the performance of list sorting!

The chosen sorting algorithm: insertion
Starting size of the data container: 100
Number of doubles to execute: 5

✨ Here are the results from running the experiment!

Input Size    Min time (s)    Max time (s)    Avg time (s)
------------  --------------  --------------  --------------
100           0.00198         0.00228         0.0021
200           0.00791         0.00831         0.00804
400           0.03091         0.03179         0.03129
800           0.1397          0.14232         0.141
1600          0.56098         0.58918         0.57665
```
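The `min`, `max`, and `avg` columns above come from `timeit` measurements. The starter code's own `run_sorting_algorithm` function handles this for you; the helper below is only a hypothetical sketch of how `timeit.repeat` can produce those three statistics. The function name and setup string here are illustrative, not taken from the starter repository:

```python
import timeit
from typing import Tuple


def time_sorting_call(statement: str, setup: str, repeats: int = 3) -> Tuple[float, float, float]:
    """Run a sorting call several times, returning (min, max, avg) in seconds."""
    # number=1 means each repetition times a single run of the statement
    times = timeit.repeat(stmt=statement, setup=setup, repeat=repeats, number=1)
    return (min(times), max(times), sum(times) / len(times))


# Example: time Python's built-in sorted on a random list of 100 integers
setup_code = "import random; data = [random.randint(0, 10000) for _ in range(100)]"
minimum, maximum, average = time_sorting_call("sorted(data)", setup_code)
```

Because each repetition times an independent run, the minimum is often the most stable statistic, while the spread between minimum and maximum hints at measurement noise.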
These experimental results suggest that insertion sort has a doubling ratio of approximately `4`. For instance, between the input sizes of `800` and `1600`, the average execution time for insertion sort grew from `0.141` to `0.57665` seconds, a ratio of roughly four. You can access `listsorting`'s help menu and learn more about its features by typing `poetry run listsorting --help` to show the following output. Finally, remember that the `listsorting` program should also run experiments for the other sorting algorithms, such as bubble sort and quick sort! You can also run `listsorting` with larger input sizes or more rounds of input doubling, but be aware that your experiments could take a long time to finish for certain algorithms!
```
╭─ Options ─────────────────────────────────────────────────────────────╮
│ --starting-size        INTEGER              [default: 1000000]        │
│ --maximum-value        INTEGER              [default: 10000]          │
│ --number-doubles       INTEGER              [default: 10]             │
│ --approach             [bubble|insertion|m  [default: bubble]         │
│                        erge|quick|tim]                                │
│ --install-completion   [bash|zsh|fish|powe  Install completion        │
│                        rshell|pwsh]         for the specified         │
│                                             shell.                    │
│                                             [default: None]          │
│ --show-completion      [bash|zsh|fish|powe  Show completion for       │
│                        rshell|pwsh]         the specified shell,      │
│                                             to copy it or            │
│                                             customize the            │
│                                             installation.            │
│ --help                                      Show this message        │
│                                             and exit.                │
╰───────────────────────────────────────────────────────────────────────╯
```
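Once you have a table of averages like the one shown earlier, the doubling ratio is just the quotient of adjacent rows. The sketch below, which is not part of the starter code, computes the ratios from the sample run's `avg` column; values near four are consistent with quadratic growth:

```python
# Average times from the sample insertion sort run (sizes 100 through 1600)
average_times = [0.0021, 0.00804, 0.03129, 0.141, 0.57665]

# Each ratio compares one round's average time to the previous round's
ratios = [later / earlier for earlier, later in zip(average_times, average_times[1:])]

# The final ratio is usually the most reliable estimate, since larger
# inputs are less affected by timing noise and constant overheads
final_ratio = average_times[-1] / average_times[-2]
```

For this sample data every ratio falls near four, which is the hallmark of an algorithm whose running time quadruples when its input doubles.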
Please note that the provided source code does not contain all of the functionality to produce the output displayed in this section. As the next section explains, you should add the features needed to ensure that `listsorting` produces the expected output! Drawing from the source code provided in the aforementioned Sorting Algorithms in Python tutorial from Real Python, this project invites you to add all of the sorting algorithms, use the `listsorting` program to conduct a doubling experiment for each of the sorting algorithms, and, finally, leverage the data tables of empirical results to calculate a doubling ratio and predict the likely worst-case time complexity of each sorting algorithm.
Don't forget that if you want to run the `listsorting` program you must use your terminal window to first go into the GitHub repository containing this project and then go into the `listsorting/` directory that contains the project's source code. Finally, remember that before running the program you must run `poetry install` to add its dependencies, such as Pytest for automated testing and Rich for colorful output!
If you study the file `listsorting/listsorting/sorting.py` you will see that it has many `TODO` markers that designate the sorting algorithms you must implement to ensure that `listsorting` produces correct output. For instance, you will need to provide an implementation of each sorting algorithm, like bubble sort, that has a signature like `def bubble_sort(array: List[int]) -> List[int]`. You will also need to resolve all of the `TODO` markers in `listsorting/listsorting/main.py` that involve calling the functions in `listsorting/listsorting/experiment.py` to run each of the steps in a doubling experiment. Specifically, you must ensure that the `listsorting` function in the `main` module calls the following `run_sorting_algorithm_experiment_campaign` function. Once you complete a task associated with a `TODO` marker, make sure that you delete the marker and revise its prompt into a meaningful comment.
```python
def run_sorting_algorithm_experiment_campaign(
    algorithm: str,
    starting_size: int,
    maximum_value: int,
    number_doubles: int,
) -> List[List[Union[int, Tuple[float, float, float]]]]:
    data_table = []
    while number_doubles > 0:
        random_list = generate_random_container(starting_size, maximum_value)
        performance_data = run_sorting_algorithm(algorithm, random_list)
        data_table_row = [
            starting_size,
            performance_data[0],
            performance_data[1],
            performance_data[2],
        ]
        data_table.append(data_table_row)
        number_doubles = number_doubles - 1
        starting_size = starting_size * 2
    return data_table
```
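Before the campaign function can produce meaningful results, each algorithm in `sorting.py` needs an implementation. As one illustration of the signature mentioned above, bubble sort could be sketched as follows; the Real Python tutorial's version may differ in its details, so treat this as an example of the expected shape rather than the required code:

```python
from typing import List


def bubble_sort(array: List[int]) -> List[int]:
    """Sort a list by repeatedly swapping adjacent out-of-order elements."""
    items = list(array)  # work on a copy so the caller's list is unchanged
    n = len(items)
    for i in range(n - 1):
        swapped = False
        # Each pass bubbles the largest unsorted value toward the end
        for j in range(n - 1 - i):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
                swapped = True
        if not swapped:
            break  # no swaps on a full pass means the list is sorted
    return items
```

The nested loops are what give bubble sort its quadratic worst case, which is exactly the behavior the doubling experiment should reveal.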
Notably, the `run_sorting_algorithm_experiment_campaign` function completes all of the steps associated with running a specified sorting algorithm in a doubling experiment. Upon completion, this function returns a `data_table` that contains performance results for each round of the doubling experiment, as shown in the previous section. After finishing your implementation of `listsorting`, including the call to `run_sorting_algorithm_experiment_campaign` in `main`, you should repeatedly run the program in different configurations to conduct an experiment that evaluates the performance of each sorting algorithm you implemented. This process will result in a data table that summarizes the results from a doubling experiment for each sorting algorithm. You can use the data in the table to calculate the doubling ratio and then use it to predict the likely worst-case time complexity of each sorting algorithm.
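To turn a doubling ratio into a complexity prediction, rules of thumb like the following can help. The thresholds below are illustrative choices for this sketch, not part of the provided starter code:

```python
def interpret_doubling_ratio(ratio: float) -> str:
    """Map an observed doubling ratio to a likely worst-case complexity class."""
    # Rough rules of thumb: a linear algorithm doubles its time (ratio ~2),
    # a linearithmic one slightly more, a quadratic one quadruples it
    # (ratio ~4), and a cubic one multiplies it by about eight
    if ratio < 1.5:
        return "constant or logarithmic"
    if ratio < 3.0:
        return "linear or linearithmic"
    if ratio < 6.0:
        return "quadratic"
    return "cubic or worse"
```

Applied to the sample insertion sort run, whose final ratio is about `4.09`, this mapping predicts quadratic worst-case behavior, matching the O(n^2) complexity commonly reported for insertion sort.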
If you study the source code in the `pyproject.toml` file you will see that it includes a section that specifies different executable tasks like `ruff`. If you are in the `listsorting` directory that contains the `pyproject.toml` file and the `poetry.lock` file, the tasks in this section make it easy to run commands like `poetry run task ruff` to automatically run the Ruff linter, which checks the Python source code in your program and its test suite to confirm that it adheres to industry standards. You can also use the command `poetry run task fix` to automatically reformat the source code, and `poetry run task ruffdetails` to print detailed output that points to exactly what Ruff views as a linting error. Make sure to examine the `pyproject.toml` file for other convenient tasks that you can use to both check and improve your project!
Along with running tasks like `poetry run task ruff`, you can run the command `gatorgrade --config config/gatorgrade.yml` to check your work. If your work meets the baseline requirements and adheres to the best practices that proactive programmers adopt, you will see that all the checks pass when you run `gatorgrade`. You can study the `config/gatorgrade.yml` file in your repository to learn how the GatorGrade program runs GatorGrader to automatically check your program and technical writing. If your program has all of the anticipated functionality, you can run the command `poetry run task test` and see that the test suite produces output like the following.

```
tests/test_sorting.py ...
```

Add comments to the test suite to explain how the test cases work. It is worth noting that the name of the test suite is `test_sorting` because the functions mentioned in the previous section exist in the `listsorting` module.
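If you want to extend the test suite, a property-based check like the following can verify any sorting function. This example is hypothetical and uses Python's built-in `sorted` as a stand-in for one of your own implementations:

```python
# A pytest-style test in the spirit of tests/test_sorting.py; a real test
# would import the function under test from listsorting.sorting
def test_sort_orders_integers():
    data = [5, 3, 8, 1, 1]
    result = sorted(data)  # stand-in for a call like bubble_sort(data)
    # The output must be in non-decreasing order
    assert all(a <= b for a, b in zip(result, result[1:]))
    # The output must contain exactly the same elements as the input
    assert sorted(result) == sorted(data)
```

Checking both ordering and element preservation catches a common class of bugs where a sort drops or duplicates values while still producing an ordered list.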
Don't forget that when you commit source code or technical writing to your GitHub repository for this project, it will trigger the run of a GitHub Actions workflow. If you are a student at Allegheny College, then running this workflow consumes build minutes for the course's organization! As such, you should only commit to your repository once you have made substantive changes to your project and you are ready to confirm its correctness. Before you commit to your GitHub repository, you can still run checks on your own computer by using Poetry and GatorGrader. You can also add and commit work locally before pushing it by using Git commands in the terminal: `git add .` and `git commit -m "Commit Message"`.
Once you have finished both of the previous technical tasks, you can use a text editor to answer all of the questions in the `writing/reflection.md` file. For instance, you should provide the output of the Python program in a fenced code block, explain the meaning of the Python source code segments that you implemented, and answer all of the other questions about your experiences in completing this project. A specific goal of the reflection for this project is to ensure that you can clearly write research questions and an experiment methodology for assessing the performance of a sorting algorithm through a doubling experiment. Once you have finished addressing the prompts in the `writing/reflection.md` file that have `TODO` markers given as reminders, make sure that you either delete each prompt or carefully integrate a revised version of it into your writing.
Since this project is an engineering effort, it is aligned with the evaluating and creating levels of Bloom's taxonomy. You can learn more about how a proactive programming expert will assess your work by examining the assessment strategy. From the start to the end of this project you may make an unlimited number of reattempts at submitting source code and technical writing that meet all aspects of the project's specification.
After cloning this repository to your computer, please take the following steps:
- Use the `cd` command to change into the directory for this repository.
  - Specifically, you can change into the program directory by typing `cd listsorting`.
- Install the dependencies for the project by typing `poetry install`.
- Run the program with its different configurations by typing `poetry run listsorting --starting-size 1024 --number-doubles 5 --approach tim`.
  - Note that this is not the only configuration you should try for your experiment.
  - Note that the program will not work unless you add the required source code.
  - Refer to the `writing/reflection.md` file for details about designing your experiment.
- Please note that the program will not work unless you add the required source code at the designated `TODO` markers.
- Confirm that the program is producing the expected output described above.
- Run the automated grading checks by typing `gatorgrade --config config/gatorgrade.yml`.
  - You may also review the output from running GatorGrader in GitHub Actions.
- Don't forget to provide all of the required responses to the technical writing prompts in the `writing/reflection.md` file.
- Please make sure that you completely delete the `TODO` markers and their labels from all of the provided source code.
- Please make sure that you also completely delete the `TODO` markers and their labels from every line of the `writing/reflection.md` file.