Skip to content

Commit

Permalink
Merge pull request #146 from UCL/mm_updates2024
Browse files Browse the repository at this point in the history
Mm updates2024
  • Loading branch information
mmcleod89 authored Dec 11, 2024
2 parents 5e5508f + d43a9ad commit 2943dff
Show file tree
Hide file tree
Showing 7 changed files with 391 additions and 68 deletions.
9 changes: 6 additions & 3 deletions 01projects/sec02IntroToCpp.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,25 +12,26 @@ Since 1998, C++'s development has been governed by an ISO working group that col

This course will assume the use of the `C++17` standard, which is presently widely supported by compilers. (Compiler support tends to lag behind the release of the C++ standard, since the compiler developers need time to implement it and check that the new compilers work properly!)

> \* The exact meaning of "low level" depends on whom you ask, but in general _lower level_ languages are languages more directly represent the way that the machine works, or give you control over more aspects of the machine, and _higher level_ languages abstract more of that away, and focus on declaring the behaviour that you want. Traditionally, a "low level" language refers to _assembly code_, which is where the instructions that you write are the instruction set of the machine itself! This code is by its nature specific to a given kind of machine and therefore not portable between systems, and doesn't express things in a way that is intuitive to most people. High level languages were introduced to make code easier to understand and more independent of the hardware; the highest level languages, like Haskell, are highly mathematical in their structure and give hardly any indication of how the computer works at all! C++ falls somewhere in the middle, with plenty of high level abstractions and portability, but it still gives us some features associated with low level programming like direct addressing of memory. This extra degree of control is very valuable when you need to get the best out of systems that require high performance or have limited resources.
> \* The exact meaning of "low level" depends on whom you ask, but in general _lower level_ languages are languages more directly represent the way that the machine works, or give you control over more aspects of the machine, and _higher level_ languages abstract more of that away, and focus on declaring the behaviour that you want. Traditionally, a "low level" language refers to _assembly code_, which is where the instructions that you write are (a human readable version of) the instruction set of the machine itself! This code is by its nature specific to a given kind of machine and therefore not portable between systems, and doesn't express things in a way that is intuitive to most people. High level languages were introduced to make code easier to understand and more independent of the hardware; the highest level languages, like Haskell, are highly mathematical in their structure and give hardly any indication of how the computer works at all! C++ falls somewhere in the middle, with plenty of high level abstractions and portability, but it still gives us some features associated with low level programming like direct addressing of memory. This extra degree of control is very valuable when you need to get the best out of systems that require high performance or have limited resources.

## Why are we using C++?

The most common language for students to learn at present is probably Python, and many of you may have taken the Python based software engineering course last term. So why are we now changing to C++?

1. C++ is the standard language for high performance computing, both in research and industry.
2. C++ code runs much faster than native Python code. Those fast running Python libraries are written in C! As scientific programmers, we sometimes have to implement our own novel methods, which need to run efficiently.
2. C++ code runs much faster than native Python code. Those fast running Python libraries are written in C! As scientific programmers, we sometimes have to implement our own novel methods, which need to run efficiently. We can't always rely on someone else having implemented the tools that we need for us.
3. C++ is a great tool for starting to understand memory management better.
- Most code that we write will not need us to allocate and free resources manually, but C++ gives us a clear understanding of when resources are allocated and freed, and this is important for writing effective and safe programs.
- Many structures in C++ have easy to understand and well defined layouts in memory. The way that data is laid out in memory can have a major impact on performance as we shall see later, and interacting with some high performance libraries requires directly referencing contiguous blocks of memory.
4. C++ has strong support for object-oriented programming, including a number of features not present in Python. These features allow us to create programs that are safer and more correct, by allowing us to define objects which have properties that have particular properties (called _invariants_). For example, defining a kind of list that is always sorted, and can't be changed into an un-sorted state, means that we can use faster algorithms that rely on sorted data _without having to check that the data is sorted_.
5. C++ is multi-paradigm and gives us a lot of freedom in how we write our programs, making it a great language to explore styles and different programming patterns.
6. C++ has a static type system (as do many other languages), which is quite a big shift from Python's dynamic typing. Familiarity with this kind of type system is extremely useful if you haven't used it before, and as we will see it can help us to write faster code with fewer bugs.

### Why is C++ fast?

Because a C++ program is compiled before the program runs, it can be much faster than interpreted languages. Not only is the program compiled to native machine code, the lowest-level representation of a program possible with today's CPUs, compilers are capable of performing clever optimisations to vastly improve runtimes. With C++, C, Rust, Fortran, and other natively compiled languages, there is no virtual machine (like in Java) or interpreter (like in Python) that could introduce overheads that affect performance.

Many languages use a process called _garbage collection_ to free memory resources, which adds run-time overheads and is less predictable than C++'s memory management system. In C++ we know when resources will be allocated and freed, and we can run without less computational overhead, at the cost of having to be careful to free any resources that we manually allocate. (Manually allocating memory is relatively rare in modern C++ practices! This is more common in legacy code or C code, with which you will sometimes need to interact.)
Many languages use a process called _garbage collection_ to free memory resources, which adds run-time overheads and is less predictable than C++'s memory management system. In C++ we know when resources will be allocated and freed, and we can run with less computational overhead, at the cost of having to be careful to free any resources that we manually allocate. (Manually allocating memory is relatively rare in modern C++ practices! This is more common in legacy code or C code, with which you will sometimes need to interact.)

Static type checking also helps to improve performance, because it means that the types of variables do not need to be checked during run-time, and that extra type data doesn't need to be stored.

Expand All @@ -48,12 +49,14 @@ C++ Pros:
- Can write code which runs on exciting and powerful hardware like supercomputing clusters, GPUs, FPGAs, and more!
- Can program for "bare metal", i.e. architectures with no operating system, making it appropriate for extremely high performance or restrictive environments such as embedded systems.
- Static typing makes programs safer and easier to reason about.
- C++ is well known in high performance computing (HPC) communities, which is useful for collaborative work.

C++ Cons:
- Code can be more verbose than a language like Python.
- C++ is a very large language, so there can be a lot to learn.
- More control also means more responsibility: it's very possible to cause memory leaks or undefined behaviour if you misuse C++.
- Compilation and program structure means there's a bit of overhead to starting a C++ project, and you can't run it interactively. This makes it harder to jump into experimenting and plotting things the way you can in the Python terminal.
- C++ is less well known in more general research communities, so isn't always the most accessible choice for collaboration outside of HPC. (You can also consider creating Python bindings to C or C++ code if you need the performance but your collaborators don't want to deal with the language!)

For larger scale scientific projects where performance and correctness are critical, then C++ can be a great choice. This makes C++ an excellent choice for numerical simulations, low-level libraries, machine learning backends, system utilities, browsers, high-performance games, operating systems, embedded system software, renderers, audio workstations, etc, but a poor choice for simple scripts, small data analyses, frontend development, etc. If you want to do some scripting, or a bit of basic data processing and plotting, then it's probably not the best way to go (this is where Python shines). For interactive applications with GUIs other languages, like C# or Java, are often more desirable (although C++ has some options for this too).

Expand Down
Loading

0 comments on commit 2943dff

Please sign in to comment.