-
Notifications
You must be signed in to change notification settings - Fork 0
Bit Vectors
Bit vectors are sequences of bits of some length, commonly referred to as their width. Bit vectors can represent integers, real numbers, addresses, or hardware registers. In fact, all data on a digital system is fundamentally represented as a bit vector! In this section, I will identify some of the basic mathematical operations we can perform on bit vectors.
Before we move on, let's first define a useful tool when manipulating bit vectors: the bitmask. Simply put, bitmasks are special bit vectors that have certain bits turned on and others turned off depending on the need. Bitmasks are ubiquitous in programming, especially in embedded software development. They are so named because they are used to mask out bits that we don't care about. Consider a face mask that has holes for your eyes, but not your mouth or nose; your mouth and nose are masked while your eyes remain visible. Bitmasks mask the bits we do not want to operate on, while exposing the bits we do.
Bitwise operators are arithmetic functions which take either one or two bit vectors as inputs and produce a single bit vector as output. Bitwise operators are not to be confused with logical comparison operators, which take one or two boolean (true/false) expressions as input and output a single boolean value.
Bitwise NOT, represented by the ~
character in C/C++, performs the NOT operation on every bit in a single bit vector, producing that vector's bitwise complement; that is, it flips 1 to 0 and 0 to 1 for every bit in the vector. The truth table for the NOT operator is fairly simple:
I0 | O |
---|---|
F | T |
T | F |
Consider an 8-bit vector <math>10100101</math>. Let's take the bitwise NOT of this vector:
~10100101 = 01011010
I mentioned earlier in the article that there is a neat trick for negating a signed binary number in two's complement. That trick uses the bitwise NOT! It consists of two steps: 1) take the bitwise NOT of the binary representation of the number in two's complement and 2) add 1 to the result. Let's take -3 in binary, and negate it with this trick. We'll only use 3 bits for brevity, but know that this works for vectors of any width, assuming the MSB is treated as the sign bit.
-3<sub>10</sub> = 101<sub>2</sub> Step 1: ~(101<sub>2</sub>) = 010<sub>2</sub> = 2<sub>10</sub> Step 2: 2<sub>10</sub> + 1<sub>10</sub> = 3<sub>10</sub>
Note: the use of the kn notation means that number k is in base-n.
And the reverse:
3<sub>10</sub> = 011<sub>2</sub> Step 1: ~(011<sub>2</sub>) = 100<sub>2</sub> = -4<sub>10</sub> Step 2: -4<sub>10</sub> + 1<sub>10</sub> = -3<sub>10</sub>
I will note though that in this case, if we tried to negate -4, we would not end up with 4; in fact, we would find that the result is -4! This is because I artifically restricted the numerical range to 3 bits. As a result, that operation would produce an overflow back into the negative region of our number line. When we're just counting, it is sufficient to simply add another bit to our number; however, this is why it is so crucial to consider a variables range of values when programming!
Bitwise AND, represented by the &
character in C/C++, performs the AND operation on each pair of bits in the two input vectors and outputs the resulting vector. First, let's consider the truth table of the AND operation:
I1 | I0 | O |
---|---|---|
F | F | F |
F | T | F |
T | F | F |
T | T | T |
01010101 & 11111010 = 01010000
Above, we went bit position by bit position and applied the AND operator to each bit. For each bit position, we set the bit in the resulting vector to 1 or 0 depdending on the output of the AND operation for that position: if both bits were 1, then the resulting bit is 1. Otherwise, the resulting bit is 0.
In embedded programming, the AND operator is useful for a few of reasons. First, we can use it to determine if bits in a vector are on or off which is useful when writing conditional expressions. Say, for example, that we want to check if bit 1 is 1 in an 8-bit wide bit vector. Assuming the variable VECT represents the vector we're checking, it would look something like this in C:
if (VECT & 0b00000010) { // do something }
Second, we can use it in conjuction with the bitwise NOT operator to turn a set of bits off in a bit vector. Let's say that we wanted to turn bits 0, 1, and 3 off in some 8-bit vector VECT whose bit representation is <math>11111011</math>. In C, it would look something like this:
VECT = VECT & ~(0b00001011);
or
VECT &= ~(0b00001011)
More explicitly, this is:
Step 1: ~(00001011) = 11110100 Step 2: 11111011 & 11110100 = 11110000
You can see that by first negating the bitmask, we preserve the original value of all the bits that we don't care about, while turning off the bits that we do care about.
Note: in the above examples, the '0b' prefix denotes a binary literal. When you see 0b before a number, know that you should interpret as a base-2 number!
Next, let's examine the bitwise OR operator.
Bitwise OR, represented by the |
character in C/C++, performs the OR operation on each pair of bits in the two input vectors and stores the result in the output vector. First, let's consider the truth table of the OR operation:
I1 | I0 | O |
---|---|---|
F | F | F |
F | T | T |
T | F | T |
T | T | T |
01010101 & 11111010 = 11111111
Like before with AND, we stepped bit position by bit position and applied the OR operator to each bit. When either or both of the bits were 1, we set the corresponding bit in the output vector to 1, otherwise we set it to 0.
How is OR useful in programming then? Well, imagine you wanted to reverse the operation we did in the AND section and turn bits 0, 1, and 3 on in VECT. Remember, VECT is an 8-bit vector that, after our NOT then AND operations, looks like <math>11110000</math>. Using the same bitmask we saw in the AND section, <math>00001011</math>, we could turn the bits on like this in C:
VECT = VECT | 0b00001011;
or
VECT |= 0b00001011;
More explicitly, this looks like:
11110000 | 00001011 = 11111011
This is the same vector we started with!
Next, let's examine the bitwise XOR operator, or exclusive or.
Bitwise XOR, represented by the ^
character in C/C++, performs the XOR operation on each pair of bits in the vector and stores the result in the output vector. Let's take a look at the truth table for the XOR operation:
I1 | I0 | O |
---|---|---|
F | F | F |
F | T | T |
T | F | T |
T | T | F |
10011111 ^ 11010000 = 01001111
Again, we step bit position by bit position and take the XOR of each bit. When exclusively one bit is 1, we set the result to 1; otherwise, the resulting bit is 0.
One use case of the bitwise XOR operation is when we want to toggle the value of a bit or set of bits in a vector. For example, let's again consider the vector VECT which is 8-bits wide and looks like <math>11111011</math>. If we want to toggle the value of bit 5, we would do so by taking the bitmask <math>00100000</math> and XOR'ing it with VECT. In C, this would be:
VECT = VECT ^ 0b00100000;
or
VECT ^= 0b00100000;
More explicitly, this looks like:
11111011 ^ 00100000 = 11011011;
You can see that this flipped the bit in position 5, because in both vectors it was 1 and this produces a 0. If we wanted to do the opposite, and toggle the bit back on, we would simply apply the XOR operator again:
11011011 ^ 00100000 = 11111011;
Since only the fifth bit in the bitmask was 1, XOR outputs a 1. Just like that, we can toggle bits!
That's all for the basic bitwise operators. I highly suggest reading more about these operators and their properties; they are foundational in formal logic, and by extension computer science, and pop up all over the place outside of embedded programming. Next, we will take a look at the two bit shift operators and their behaviors.
Bit shift operators allow programmers to shift bit vectors left and right. In other words, for each bit position, move the bit left or right a certain number of bits, and keep its value. There are two (technically three) bit shift operators in C/C++, the left and right shift operators. On most architectures you'll encounter, these are implemented as single machine instructions so in certain cases they are particularly useful. First, we'll take a look at the left shift operator.
The left shift operator, which in C is represented by the <<
symbol, is a binary operator which takes as input a bit vector and the number of bits by which to shift the vector, and outputs the original vector shifted by that number of bits. Let's take a look at a few examples where we'll shift the vector <math>00110101</math> by a few different amounts.
Shift by 1 bit: 00110101 << 1 = 01101010
Notice that the bit vector was padded with a zero to fill in the new gap our shifts made, and some bits "fell off" the end of our vector.
Shift by 3 bits: 00110101 << 3 = 10101000
Where might this come up while programming? Well, let's consider again the bitmasks we were using in our bitwise operator examples. In each example, we wrote these vectors out by hand; for wider bit-vectors, however, this gets pretty cumbersome. Rather than use a binary integer literal like we were, let us instead apply the left shift operator and the OR operator to build our bitmask at runtime. Let's say we want to turn bits one, three, and five off in our original version of VECT, which was <math>11111011</math>. To do so in C, we would write something like:
VECT = VECT & ~((0b1 << 1) | (0b1 << 3) | (0b1 << 5));
The right shift operator, which in C is symbolized by >>
, takes the same inputs as the left shift operator but shifts the bit vector right instead of left. Like before, we'll take a look at a few examples, except this time, we must make a distinction between the behavior of the right shift operator based on the signedness of the bit vector we're operating on. That is, if the bit vector is stored in a signed variable (two's complement), we'll use the arithmetic shift; otherwise, we'll use the logical right shift. Unfortunately, C makes no distinction between the symbol used for the two operators and leaves it up to the compiler to figure it out. I don't use the right shift operator too often but it's important to understand its limitations in case you do need to use it.
Most basically, the arithmetic right shift behaves exactly like one would expect; it shifts every bit in a vector right by a certain number of positions. However, the arithmetic shift always pads the new vector with 1s instead of 0s in order to preserve the sign bit of a negative number. Of course, when the number was positive to begin with, this behavior isn't exactly desirable! Let's take a look at the same examples from before, except this time we'll use the arithmetic right shift operator:
Shift by 1 bit: 00110101 >> 1 = 10011010
Shift by 3 bits: 00110101 << 3 = 11100110
The logical right shift has the same padding scheme as the left shift, except it pads to the left instead of the right. Again, looking at the same examples:
Shift by 1 bit: 00110101 >> 1 = 00011010
Shift by 3 bits: 00110101 << 3 = 00000110
That's all for our discussion of bit-vector manipulation. Next, I discuss the logical and physical structure of memory on simple embedded devices.