Numbers and Numbering Systems • Jonathan Cook

By now you have probably heard the terms binary, bit, and byte somewhere already, and maybe you already know exactly what they mean and why they are used. But read on anyway!

The prefix “bi” means two, and so binary is a two-valued (or base-2) number system, with only the digits 0 and 1. Why is binary important? Well, in short, because computers operate in binary. Computers are electrical machines; everything we do with them must end up as electrical signals. Devices such as toasters, even (older) TVs and radios, are analog devices, which means that their electrical signals vary continuously, but a computer is a digital electrical device, which means it operates on electrical values that are kept at discrete levels. Furthermore, all computers operate on just two discrete levels of electricity: these two levels are interpreted as the binary digits 0 and 1. Thus, in a computer, all values are in binary, and all operations (including addition, multiplication, etc.) are performed in binary. That’s why binary is so important to us.

Each binary digit is called a bit; an eight-bit binary value is called a byte, and a group of bytes is called a word. Most CPUs these days call a 32-bit value a word; on some, words are 64 bits.

Decimal and Positional Notation

In understanding and using the low-level operations of computers and their components, our normal usage of numbers can be quite cumbersome, so we need to introduce some alternatives. Our normal use of numbers is always in the decimal, or base-10, numbering system; the digits of 0 to 9 represent the values directly representable in base-10 (in any base, the digits always run from zero to one less than the base). To represent values larger than 9, our numbering system uses positional notation. In this scheme, a digit can represent a value larger than its direct value. Thus, three hundred twenty-nine is represented as

329 = 300 + 20 + 9

The digits 3 and 2 do not represent three and two, but 300 and 20. This is because the position of the digit indicates a power of 10 by which the digit should be multiplied. This is represented as the formula

329 = (3 * 100) + (2 * 10) + (9 * 1) = (3 * 10²) + (2 * 10¹) + (9 * 10⁰)

The positional scheme is much better than, say, the Roman Numeral scheme, where each symbol represented a unique value no matter where in the number it existed (i.e., M=1000, D=500, C=100, L=50, X=10, V=5, I=1). For more information you can check out (Wikipedia:Positional_notation).

Positional numbering is great, and we will keep using it, but it is the decimal (base 10) part that gives us problems with computers. We need some other bases! So let’s generalize! The general scheme for a positional numbering system in a base B is:

there are B unique symbols, representing the values 0 to B-1;
digits in a number are multiplied by a power of B, starting at 0 on the right and increasing by 1 for each digit to the left.

We can use this scheme for any base we can imagine (even Pi!?!), but there are only a few that are important to us, practically speaking. Notice that if we use the symbols ‘0’ and ‘1’ to represent zero and one in any base B, the value of B is always “10” in its own base-B numbering system!
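The general scheme above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name positional_value and the choice to pass digits as a list of integers (leftmost digit first) are not from these notes.

```python
def positional_value(digits, base):
    """Return the value of a number given as digit values, leftmost digit first."""
    total = 0
    power = len(digits) - 1              # the leftmost digit has the highest power
    for digit in digits:
        if not 0 <= digit < base:
            raise ValueError("digit out of range for this base")
        total += digit * base ** power   # digit times base^position
        power -= 1
    return total


print(positional_value([3, 2, 9], 10))   # 329
print(positional_value([1, 0], 2))       # 2
print(positional_value([1, 0], 16))      # 16
```

The last two lines illustrate the remark that "10" always denotes the base itself: the digits 1 and 0 evaluate to 2 in base 2 and to 16 in base 16.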

Binary, Hex, and Octal, Oh My!

The number systems we will use in this course are:

Base-10, or decimal: digits 0-9, and each place is a power of 10;
Base-16, or hexadecimal: digits 0-9,A-F, and each place is a power of 16; and
Base-2, or binary: only digits 0 and 1, and each place is a power of 2.

We call binary digits bits, and often say “hex” rather than the full “hexadecimal”. Sometimes on computers we also use the octal (base 8) number system, but we won’t in this course. For more information see (Wikipedia:Hexadecimal, Wikipedia:Octal, and Wikipedia:Binary). Visually, binary and decimal are “easy” since the digits actually look like numbers to us; with hexadecimal, however, we need 16 digit symbols and our brains are only used to recognizing 10 (the symbols 0-9). In hex we do use 0-9, but we also use the letters A-F (or a-f; letter case doesn’t matter) as symbols for the digits representing the values ten through fifteen, and we just have to get used to thinking of them as numerical digits!

Because we use the same 0-9 digits in different numbering systems, the number “42” is a valid number in either the decimal or hex number systems (and is a different number in each!). So we need some way of knowing which number system we are using. In a math course we use a subscript to show the base, such as:

1011₂ = 11₁₀

Notice that both values above only use the digits 0 or 1, but they represent the same value (eleven) with different sequences of those digits, because of the different base. Subscripts are great, but often on a computer we are writing in plain text and cannot represent subscripts. Popular conventions for writing numbers in different bases on a computer are:

A hexadecimal number has a leading “0x” attached to the number (e.g., 0x4f2a is the hexadecimal number for the decimal value 20266);
An octal number has just a leading “0” attached to it, and then the number itself (e.g., 076 is decimal 62, and 089 is not a number!); and
Sometimes (not often), a binary number has “0b” in front of it, and then the number (e.g., 0b10110 is decimal 22).

Usually on computers we do not directly write binary, so you have to convert a binary number to something else (like hex), then use it in whatever you are doing.
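As a quick check of the example values, here is a small sketch in Python, which happens to accept all three prefixes. One difference to be aware of: Python 3 spells octal with a leading "0o" (so 0o76) rather than the bare leading 0 used in C, and it rejects 076 outright. The variable names are only illustrative.

```python
hex_value = 0x4f2a        # hexadecimal literal
octal_value = 0o76        # Python 3 octal literal; C would write 076
binary_value = 0b10110    # binary literal

print(hex_value, octal_value, binary_value)   # 20266 62 22

# Going the other way: format a value as text in each base
print(hex(20266), oct(62), bin(22))           # 0x4f2a 0o76 0b10110

# Parsing digit strings without a prefix by naming the base explicitly
print(int("4f2a", 16), int("76", 8), int("10110", 2))   # 20266 62 22
```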

Converting Values between Number Systems

Computer Scientists use hexadecimal (and octal) a lot because it easily converts to binary, whereas decimal does not. The reason for this is that the bases 8 and 16 are exact powers of 2, whereas 10 is not. Because 16 is 2⁴, each hex digit is exactly 4 binary digits. For decimal there is no exact mapping, so you have to do some arithmetic to convert to binary.

For example, hex C5 is binary 11000101: the leftmost four bits (1100) are the hex digit C, and the rightmost four (0101) are the 5. Hex 95 is binary 10010101: the leftmost four bits change, but the last four do not, since only the leftmost hex digit changed. This is very nice because when doing conversion we only need to think about each digit by itself; we do not need to be concerned about the overall value. The table below shows all of the binary-hex-decimal digit mappings for all 4-bit values, zero to fifteen:

Binary	Hex/Octal/Decimal	Binary	Hex	Octal	Decimal
0000	0	1000	8	10	8
0001	1	1001	9	11	9
0010	2	1010	A	12	10
0011	3	1011	B	13	11
0100	4	1100	C	14	12
0101	5	1101	D	15	13
0110	6	1110	E	16	14
0111	7	1111	F	17	15

Notice that one hex digit exactly captures all unique 4-bit values, but when the fifth bit rolls over to one, so does the hex value; so one hex digit represents exactly 4 bits. However, decimal notation rolls over from 9 to 10 part way down the second column; this makes it much harder to convert between decimal and binary, which is why we like to use hexadecimal!

To convert between hex and binary do the following simple things:

From hex to binary, write the four bits that are equivalent to the hex digit, in exactly the same position in the number as the hex digit. Always write four bits!
From binary to hex, group the bits into fours, starting at the right. Then translate each group into one hex digit, in the same position as the group of four bits.
If on the leftmost side of the binary number there are not enough bits to make a complete group, it is OK to add 0’s to the left (but not on the right!).
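The digit-by-digit rules above might be sketched as follows. The function names hex_to_binary and binary_to_hex are hypothetical, and the sketch assumes its inputs are plain digit strings with no 0x or 0b prefix.

```python
HEX_DIGITS = "0123456789ABCDEF"


def hex_to_binary(hex_string):
    """Replace each hex digit by its four-bit group, always writing four bits."""
    return "".join(format(HEX_DIGITS.index(ch), "04b") for ch in hex_string.upper())


def binary_to_hex(bit_string):
    """Group bits into fours from the right, padding with 0's on the left only."""
    padding = (-len(bit_string)) % 4
    padded = "0" * padding + bit_string
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join(HEX_DIGITS[int(group, 2)] for group in groups)


print(hex_to_binary("C5"))      # 11000101
print(hex_to_binary("95"))      # 10010101
print(binary_to_hex("10110"))   # 16
```

In the last line the five bits 10110 are first padded on the left to 00010110, which gives the two hex digits 1 and 6.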

Unfortunately, we think in decimal, so we often need to convert numbers to and from decimal. This takes a little more work. Converting to decimal is very easy: just multiply each digit by the power of the base for that position. Examples:

Binary: 1011₂ = (1*2³)+(0*2²)+(1*2¹)+(1*2⁰) = (1*8)+(0*4)+(1*2)+(1*1) = 11₁₀
Hex: C6A = (12*16²)+(6*16¹)+(10*16⁰) = (12*256)+(6*16)+(10*1) = 3178₁₀

Notice that binary to decimal is especially easy, since anything multiplied by 1 is itself and anything multiplied by 0 is 0. This means we can drop all the terms with a 0 bit, and just use the power of 2 directly for each 1 bit. Thus the example above becomes

1011₂ = 2³ + 2¹ + 2⁰ = 8 + 2 + 1 = 11₁₀
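The shortcut of adding up only the powers of 2 that have a 1 bit could be written like this. It is a minimal sketch; the function name binary_to_decimal is an assumption, and the input is assumed to be a string of 0 and 1 characters.

```python
def binary_to_decimal(bit_string):
    """Sum the powers of 2 for the positions that hold a 1 bit."""
    total = 0
    # reversed() lets position 0 be the rightmost bit
    for position, bit in enumerate(reversed(bit_string)):
        if bit == "1":
            total += 2 ** position
    return total


print(binary_to_decimal("1011"))   # 11
```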

Converting from decimal to another base is a bit harder: in this course we will not cover conversion from decimal, but if you want to read about it, keep on reading! The essential method is to repeatedly divide by the base that we are converting to. The remainders from these divisions will be our digits in the converted number (but they are backwards!) The algorithm is:

Divide number by base, get quotient and remainder. This remainder is the rightmost digit.
Now divide the quotient by base, and get new quotient and remainder. This remainder is second-rightmost digit.
Keep dividing each new quotient by the base to get the next rightmost digit.
Stop when the new quotient is 0; the remainder from that last division (which equals the old quotient) is the final, leftmost digit of the number in the new base. (Shortcut: when you see a new quotient that is less than the base, this is your final leftmost digit.)

Binary example: convert decimal 21 to binary. 21/2 is 10, remainder 1; 10/2 is 5 remainder 0; 5/2 is 2 remainder 1; 2/2 is 1 remainder 0; 1/2 is 0 remainder 1; done. Reverse remainders and get binary 10101. Check: 2⁴+2²+2⁰ = 16+4+1 = 21₁₀.
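For readers who want to experiment, the repeated-division method can be sketched in Python. The function name from_decimal is hypothetical, and the sketch assumes a non-negative integer and a base between 2 and 16.

```python
DIGITS = "0123456789ABCDEF"


def from_decimal(number, base):
    """Convert a non-negative integer to a digit string in the given base."""
    if number == 0:
        return "0"
    remainders = []
    while number > 0:
        number, remainder = divmod(number, base)   # quotient and remainder in one step
        remainders.append(DIGITS[remainder])       # collected rightmost digit first
    return "".join(reversed(remainders))           # reverse to get the final number


print(from_decimal(21, 2))      # 10101
print(from_decimal(3178, 16))   # C6A
```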

Negative Numbers: Two’s Complement Representation

So far, we’ve just used positive numbers. If you have tried the last part of lab one yet, you have seen that the result is a negative number. Does the AVR CPU know this? How do we know this? In lab 1 it is easy because we can do the arithmetic by hand and see that a negative answer is correct, but if we don’t know ahead of time, how do we know when we are programming?

In writing, we use a minus sign to indicate a negative number. But the CPU and memory just know 1’s and 0’s, that is all. So how do we represent a negative number?

All of the values in a computer are of some fixed length of bits. In an AVR CPU, most are 8 bits, and some are 16 bits. But we always know this. This is important: values are always a fixed number of bits.
An ingenious representation of signed numbers: the 2’s complement representation (2C)
Take the positive binary number, complement (flip to opposite) all the bits, then add one
Thus, decimal -11 is found by:

Example:

taking positive decimal 11, which is 00001011
complementing it, which gives us 11110100 (each bit is opposite)
then adding 1, which gives us 11110101
so, decimal -11 is 8-bit binary 11110101
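The flip-and-add-one recipe can be tried out with a short Python sketch. The function name twos_complement and the default width of 8 bits are assumptions made for illustration; Python integers are not fixed-width, so the sketch masks the result to the chosen number of bits.

```python
def twos_complement(value, bits=8):
    """Negate a value in two's complement: flip every bit, then add one."""
    mask = (1 << bits) - 1       # 0b11111111 for 8 bits
    flipped = value ^ mask       # complement each bit
    return (flipped + 1) & mask  # add one and keep only the low 'bits' bits


negative_eleven = twos_complement(0b00001011)
print(format(negative_eleven, "08b"))   # 11110101
```

Applying the function twice gives back the original value, which is one reason the representation is convenient.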

In 2C the upper bit acts as a sign bit: a leftmost 1 means negative, a leftmost 0 means zero or positive
But arithmetic can be done completely ignoring whether a number is signed or not.
That means (important): the CPU doesn’t care if you are using signed numbers or unsigned numbers!
Another representation is 1’s complement, which is just complementing all the bits (and not adding one like 2C does).
One weird thing about 1C is that it has two zeros: a positive zero (00000000) and a negative zero (11111111).
Some historical computers were built that used 1C, but today all use 2C.
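To see that the same addition serves both signed and unsigned values, here is a small sketch that imitates 8-bit arithmetic in Python. The helper names as_signed and add_8bit are hypothetical, and the masking with 0xFF stands in for the fixed register width of a real CPU.

```python
def as_signed(pattern, bits=8):
    """Interpret a bit pattern as a two's complement signed value."""
    if pattern & (1 << (bits - 1)):      # sign bit set: the value is negative
        return pattern - (1 << bits)
    return pattern


def add_8bit(a, b):
    """Add two 8-bit patterns the way an 8-bit adder would, dropping the carry out."""
    return (a + b) & 0xFF


pattern = 0b11110101
print(pattern, as_signed(pattern))        # 245 -11

result = add_8bit(pattern, 0b00010100)    # add decimal 20
print(format(result, "08b"))              # 00001001
print(result, as_signed(result))          # 9 9
```

Read as unsigned, 245 + 20 wraps around past 255 to 9; read as signed, -11 + 20 is 9. The adder produced one bit pattern, and only our interpretation differs.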

Also, see Wikipedia:TwosComplement

IEEE Floating Point Representation

All computers today use a standard binary representation for real numbers, devised by the IEEE. This is called the IEEE 754 floating point standard (big surprise!).

The standard actually defines a family of representations of different sizes: single precision is 32 bits, double precision is 64 bits, and quad precision is 128 bits. In the C programming language, float and double usually correspond to single and double precision; long double is implementation-defined, and may be quad precision, an 80-bit extended format, or simply the same as double.

In single precision the IEEE format has 1 sign bit, 8 exponent bits, and 23 mantissa bits, in that order from left to right (the 32 bits are seeeeeeeemmmmmmmmmmmmmmmmmmmmmmm). The sign bit is for the sign of the mantissa (i.e., the sign of the overall number). Exponents of all 1’s and all 0’s are reserved, and the exponent stored in the bits is the actual exponent + 127. That way, the exponent is stored as a positive unsigned number, but it represents exponent values from -126 to +127. The exponent is a power of 2, of course.

In addition to the mantissa, there is a hidden bit that is a 1 bit tacked onto the front of the mantissa. If you think about it, a non-zero binary mantissa always begins with 1 since we don’t write leading 0’s on numbers. So the IEEE format just assumes that a 1 is there, and doesn’t store it. It is a free extra bit of accuracy.

So, the value represented by a single precision IEEE number is

Value = (-1)^s * 1.mmmmmmmmmmmmmmmmmmmmmmm * 2 ^ (eeeeeeee - 127)

In decimal terms, this gives a number with about 7 digits of accuracy, and magnitudes from about 10^-38 to 10^38.
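The field layout can be explored with Python's standard struct module, which packs a value into 4 bytes of single precision. This is a sketch under stated assumptions: the names decode_single and rebuild_normal are hypothetical, the sign bit is applied as a factor of +1 or -1, and the rebuild step only handles ordinary values whose exponent field is neither all 0's nor all 1's.

```python
import struct


def decode_single(x):
    """Split a number, stored as IEEE single precision, into its three fields."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))   # the 32 bits as an integer
    sign = bits >> 31                  # 1 bit
    exponent = (bits >> 23) & 0xFF     # 8 bits, stored with an offset of 127
    mantissa = bits & 0x7FFFFF         # 23 bits, hidden leading 1 not stored
    return sign, exponent, mantissa


def rebuild_normal(sign, exponent, mantissa):
    """Recompute the value from the fields (normal numbers only)."""
    return (-1) ** sign * (1 + mantissa / 2 ** 23) * 2.0 ** (exponent - 127)


sign, exponent, mantissa = decode_single(-6.5)
print(sign, exponent, format(mantissa, "023b"))   # 1 129 10100000000000000000000
print(rebuild_normal(sign, exponent, mantissa))   # -6.5
```

Here -6.5 is -1.625 times 2 to the power 2, so the stored exponent is 2 + 127 = 129 and the mantissa bits hold the .625 part (binary .101).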

In double precision (64 bit) IEEE format, the mantissa is 52 bits, and the exponent is 11 bits (with an offset of 1023). It gives us almost 16 decimal digits of precision, and magnitudes from 10^-308 to 10^308. This is a much larger range than single precision. Quad precision is 112 bits of mantissa, 15 of exponent.

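We said earlier that exponents of all 1’s and all 0’s are reserved. They encode special values, including the results of error conditions such as dividing by 0 or taking the square root of a negative number.

An exponent of all 1’s with a mantissa of all 0’s is infinity: positive infinity if the sign bit is 0, negative infinity if the sign bit is 1. Dividing a non-zero number by zero results in infinity.

An exponent of all 1’s with a non-zero mantissa is not-a-number, or NaN for short. Dividing 0 by 0, or taking the square root of a negative number, will result in a NaN value.

An exponent of all 0’s with a mantissa of all 0’s is zero (positive or negative, depending on the sign bit). An exponent of all 0’s with a non-zero mantissa is a very small denormalized (subnormal) number, which has no hidden 1 bit.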
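The reserved bit patterns can be inspected directly. The sketch below, which assumes Python's standard struct and math modules and uses a hypothetical helper named fields, prints the sign, exponent, and mantissa fields of a few special single precision values so they can be compared with the description.

```python
import math
import struct


def fields(x):
    """Return (sign, exponent, mantissa) of x stored as IEEE single precision."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


specials = [("+inf", math.inf), ("-inf", -math.inf), ("nan", math.nan), ("0.0", 0.0)]
for name, value in specials:
    sign, exponent, mantissa = fields(value)
    # Show each field in binary at its full width
    print(name, sign, format(exponent, "08b"), format(mantissa, "023b"))
```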

You can read more about this on Wikipedia:IEEE754, although that page is not written well and there are probably easier reads on the internet.

Stop Here

Text below here is from another set of notes, a lot overlaps with content above. I need to edit and merge the two…

More Expanded Description

Our normal use of numbers is always in the decimal, or base-10, numbering system. The digits of 0 to 9 represent the values directly representable in base-10 (in any base, the digits always run from zero to one less than the base).

To represent larger values, our numbering systems use positional notation. In this scheme, a digit can represent a value larger than its direct value. Thus

329 == 300 + 20 + 9

The digits 3 and 2 do not represent three and two, but 300 and 20. This is because the position of the digit indicates a power of 10 by which the digit should be multiplied. This is represented as a formula like:

329 = (3 * 100) + (2 * 10) + (9 * 1) = (3 * 10^2) + (2 * 10^1) + (9 * 10^0)

So, the general scheme for a positional numbering system in base B is:

There are B unique symbols, representing the values 0 to B-1.
Digits in a number are multiplied by a power of B, starting at 0 on the right and increasing by 1 for each digit to the left.

Binary, Hex, and Octal, Oh My!

The number systems we commonly use in computer science are:

Base-10, or decimal: digits 0-9, and each place is a power of 10.
Base-8, or octal: only digits 0-7, and each place is a power of 8.
Base-16, or hexadecimal: digits 0-9,A-F, and each place is a power of 16.
Base-2, or binary: only digits 0 and 1, and each place is a power of 2.
special note: we call binary digits “bits”, and often say “hex” rather than hexadecimal

For more information see Wikipedia:Hexadecimal, Wikipedia:Octal, and Wikipedia:Binary.

Because we use the same digits in different numbering systems, we need some way of knowing which number system we are using. In a math course we would use a subscript to show the base, such as:

1011₂ = 11₁₀

Notice that both values above only use the digits 0 or 1, but represent the same number (eleven) with different strings of those digits, because of the different base.

Writing Values in Plain Text

Because simple text files on computers do not represent subscripts, we will need to use special symbols to indicate which base we are working in. The most common notation that is used in Unix, C, C++, Java, and many other places:

A hexadecimal number has a leading “0x” attached to the number.
e.g., 0x4f2a is the hexadecimal number for the decimal value 20266
An octal number has just a leading “0” attached to it, and then the number itself.
e.g., 076 is decimal 62, and 089 is not a number!
A binary number has “0b” in front of it, and then the number
e.g., 0b10110 is decimal 22
not many tools/languages recognize this; usually in Unix/C/C++/Java you have to convert a binary number to something else (like hex), then use it

Converting Values between Number Systems

Computer Scientists use octal and hexadecimal a lot because these systems easily convert to binary, whereas decimal does not.

For octal, each octal digit is exactly 3 binary digits
For hexadecimal, each hex digit is exactly 4 binary digits
For decimal, there is no exact mapping, so you have to do some arithmetic to convert to binary
In octal, 046 is binary 100110, the first three (100) is the 4, and the second three (110) is the six
octal 036 is binary 011110; the first three digits change, but the last three do not.

To convert between hex/octal/binary do the following simple things:

From hex to binary, write the four bits that are equivalent to the hex digit, in exactly the same position in the number as the hex digit. Always write four bits!
From octal to binary, write the three bits that are equivalent to the octal digit, in exactly the same position in the number as the octal digit. Always write three bits!
From binary to hex, group the bits into fours, starting at the right. Then translate each group into one hex digit, in the same position as the group of four bits.
From binary to octal, group the bits into threes, starting at the right. Then translate each group into one octal digit, in the same position as the group of three bits.
If on the leftmost side of the binary number there are not enough bits to make a complete group, it is OK to add 0’s to the left.
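The same grouping idea works for octal with groups of three bits. A minimal sketch follows, with hypothetical function names and inputs assumed to be plain digit strings without the leading 0 that marks octal in C.

```python
def octal_to_binary(octal_string):
    """Replace each octal digit by its three-bit group, always writing three bits."""
    return "".join(format(int(ch, 8), "03b") for ch in octal_string)


def binary_to_octal(bit_string):
    """Group bits into threes from the right, padding with 0's on the left."""
    padding = (-len(bit_string)) % 3
    padded = "0" * padding + bit_string
    return "".join(str(int(padded[i:i + 3], 2)) for i in range(0, len(padded), 3))


print(octal_to_binary("46"))      # 100110
print(octal_to_binary("36"))      # 011110
print(binary_to_octal("11110"))   # 36
```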

Unfortunately, we think in decimal, so we often need to convert numbers to and from decimal. This takes a little more work.

Converting to decimal is very easy: just multiply each digit by the power of the base for that position
Octal example: 0473 = (4*8^2)+(7*8^1)+(3*8^0) = (4*64)+(7*8)+(3*1) = 315 in decimal
Binary example: %1011 = (1*2^3)+(0*2^2)+(1*2^1)+(1*2^0) = (1*8)+(0*4)+(1*2)+(1*1) = 11 base 10
Hex example: 0xC6A = (12*16^2)+(6*16^1)+(10*16^0) = (12*256)+(6*16)+(10*1) = 3178 base 10
Converting from decimal to another base is a bit harder: must repeatedly divide by the base we are converting to. The remainders from these divisions will be our digits in the converted number (but they are backwards!)
Algorithm:

divide number by base, get quotient and remainder. This remainder is the rightmost digit.
Now divide quotient by base, get new quotient and remainder. This remainder is second-rightmost digit.
Keep dividing each new quotient by the base to get the next rightmost digit. Stop when new quotient is 0 and remainder is the old quotient; this is the final, leftmost digit of the number in the new base. (shortcut: when you see a new quotient that is less than the base, this is your final leftmost digit)

Octal example: convert decimal 93 into octal. We divide 93 by 8 and get 11 with a remainder of 5. We then take the quotient and again divide by 8. So 11 divided by 8 is 1 with a remainder of 3. We do the same again, so 1 divided by 8 is 0 with a remainder of 1. We stop when we have a quotient of 0. Now we take those remainders (5,3,1) and reverse them for our octal number of 0135, which is the conversion of decimal 93 into octal. We can check that by converting back (1*64 + 3*8 + 5*1 = 64+24+5 = 93 base 10).
Binary example: convert decimal 21 to binary. Divide 21 by 2 and get 10, remainder 1. Divide 10 by 2 get 5 remainder 0. Divide 5 by 2 get 2 remainder 1. Divide 2 by 2 get 1 remainder 0. Divide 1 by 2 get 0 remainder 1. Done, reverse remainders and get %10101

Why Is This Important?

But why do we need binary?

Well, in short, because computers operate in binary
Computers are electrical machines
Things like toasters, even (older) TVs and radios, are analog devices
That is, the electrical signals vary continuously
A computer is a digital electrical device
It operates on electrical values that are at discrete levels
Furthermore, all computers operate on two discrete levels of electricity
These two levels are interpreted as the binary digits 0 and 1
Thus, in a computer, all values are in binary, and all operations (including addition, multiplication, etc.) are performed in binary
Each binary digit is called a bit
An eight-bit binary value is called a byte
Most CPUs these days call a 32-bit value a word
But not the CPU we will be programming in this class!