CS Ramble — Set 2b - bases

This post is part of set 2 of A Ramble Around CS.

Counting in tens, like a normal human

We humans have 10 fingers, and we use ten digits for counting: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. You can imagine that counting up works kind of like a car odometer. We have wheels with 10 digits on each of them, 0–9. We turn the right-most wheel one digit to add one:

0 0 0 0 0 0
0 0 0 0 0 1
0 0 0 0 0 2

To count to numbers higher than 9, we start counting up in the second-rightmost digit. The wheels are rigged up so that when a wheel goes all the way around back to 0, it turns the wheel to its left once, “carrying the 1”:

0 0 0 0 0 9
0 0 0 0 1 0

Counting in eights… like an octopus?

So, let’s imagine that octopuses with their eight appendages only use eight digits for counting: 0, 1, 2, 3, 4, 5, 6, 7.

I suppose they’d count the same way, but start counting up on the second-rightmost digit after 7. And the wheels on their odometers would have only digits 0–7:

0 0 0 0 0 0
0 0 0 0 0 1
…
0 0 0 0 0 7
0 0 0 0 1 0

So octopi would count: 0, 1, 2, 3, 4, 5, 6, 7, 10, 11, 12, 13, 14, 15, 16, 17, 20, and so on.
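If you'd like to watch the wheels turn, here's a minimal Python sketch of the odometer idea. The names `increment` and `count` are made up for this illustration, and it only handles bases up to 10, since each wheel shows a single digit 0–9.

```python
def increment(wheels, base):
    """Advance an odometer (a list of digits, leftmost wheel first) by one."""
    position = len(wheels) - 1        # start at the right-most wheel
    while position >= 0:
        wheels[position] += 1
        if wheels[position] < base:   # the wheel didn't wrap around: done
            return
        wheels[position] = 0          # the wheel rolled over to 0 ...
        position -= 1                 # ... so "carry the 1" to the wheel on its left


def count(base, steps, width=6):
    """Return the first `steps` odometer readings, starting from all zeros."""
    wheels = [0] * width
    readings = []
    for _ in range(steps):
        readings.append("".join(str(digit) for digit in wheels))
        increment(wheels, base)
    return readings


print(count(8, 10))
# ['000000', '000001', '000002', '000003', '000004',
#  '000005', '000006', '000007', '000010', '000011']
```

Calling `count(10, 11)` instead gives the human version, which doesn't reach `000010` until after `000009`.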

Names for these things

We call the way we count “base 10”, or “decimal”. The octopodes use “base 8”, also known as “octal”.

Getting used to it

At this point—weirdly!—you probably have most of what you need to understand different bases. The rest is working through the implications, getting used to it, figuring out why and when using different bases is useful, and learning how to write numbers in different bases in your code.

Different bases and computers

Base 2: “Binary”

While it’s possible to create an “analog” computer that uses different voltages corresponding smoothly to different quantities, and there’s a rich history of mechanical computers, most modern computers are electrical, and “digital”: inside, they use only on and off, like a light switch. We write 1 for “on” and 0 for “off”.

Base 2 is like counting if you’re a … bird? using only your wings? … and have only two limbs to count on. It works just like the other bases we’ve played with, except you’re “carrying the 1” a lot!

0 0 0 0 0 0
0 0 0 0 0 1
0 0 0 0 1 0
0 0 0 0 1 1
0 0 0 1 0 0

Instead of a “ones digit”, “tens digit”, “hundreds digit”, etc., we have a “ones digit”, a “twos digit”, a “fours digit”, an “eights digit”, a “sixteens digit”, etc. Just like with base 10, each time you move one digit to the left, you multiply by the base. (Our gangly molluscan friends would have a “ones digit”, an “eights digit”, a “sixty-fours digit”, etc.)
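To make the "multiply by the base each time you move left" rule concrete, here's a small Python sketch. The function names `place_values` and `to_number` are invented for this example; digits are passed as a list, leftmost first.

```python
def place_values(digits, base):
    """Pair each digit with the value of its column (leftmost digit first)."""
    n = len(digits)
    return [(digit, base ** (n - 1 - i)) for i, digit in enumerate(digits)]


def to_number(digits, base):
    """Add up digit × place value for every column."""
    return sum(digit * place for digit, place in place_values(digits, base))


print(place_values([0, 1, 0, 1], 2))  # [(0, 8), (1, 4), (0, 2), (1, 1)]
print(to_number([0, 1, 0, 1], 2))     # 5
print(to_number([1, 7], 8))           # 15  (one eight plus seven ones)
print(to_number([1, 9], 10))          # 19
```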

If we have four switches, we can represent 0101 (5 in decimal) using “·” for off and “✓” for on like this:

8 4 2 1
0 1 0 1
· ✓ · ✓

We can then use wires to carry these signals around, like to a Seven Segment Display:

[Figure: the four switches (8, 4, 2, 1) wired to a seven-segment display]

You can play around with a circuit simulation of a seven segment display I found on everycircuit.com here. Note that in the diagram above, I left out the “decoder” you need to convert from 4-wire (4-bit) 0–9 binary data to 7-wire segment on/off data.
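Here's a rough idea of what that decoder does, sketched in Python as a lookup table rather than as a circuit. The segment names a–g follow the common labelling convention (a on top, then clockwise around the outside, with g in the middle). The names `SEGMENTS_FOR_DIGIT` and `decode` are made up for this sketch, and a real decoder is built from logic gates, not a dictionary.

```python
# Which segments light up for each decimal digit.
# a: top, b: top-right, c: bottom-right, d: bottom,
# e: bottom-left, f: top-left, g: middle
SEGMENTS_FOR_DIGIT = {
    0: "abcdef",
    1: "bc",
    2: "abdeg",
    3: "abcdg",
    4: "bcfg",
    5: "acdfg",
    6: "acdefg",
    7: "abc",
    8: "abcdefg",
    9: "abcdfg",
}


def decode(bit8, bit4, bit2, bit1):
    """Turn 4 input wires (each 0 or 1) into 7 output wires, ordered a to g."""
    value = 8 * bit8 + 4 * bit4 + 2 * bit2 + 1 * bit1
    if value not in SEGMENTS_FOR_DIGIT:
        raise ValueError("this sketch only handles 0-9")
    lit = SEGMENTS_FOR_DIGIT[value]
    return [1 if name in lit else 0 for name in "abcdefg"]


print(decode(0, 1, 0, 1))  # 5 → [1, 0, 1, 1, 0, 1, 1]
```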

“Bits”

We call each binary information digit a “bit”: 0 or 1. A bit is also the most basic “piece” or “unit” of information you can have: true/false, yes/no, +/-, on/off, (heads/tails, left/right, up/down, …)

In many programming languages, you can use binary numbers directly in your programs by using a prefix of 0b:

# A `0b` prefix indicates a binary number:
a = 0b101      # 5: 1×1 + 0×2 + 1×4
b = 0b11111111 # 255: 1×1 + 1×2 + 1×4 + 1×8 + 1×16 + 1×32 + 1×64 + 1×128

# In many languages, you can even break the digits into
# groups for readability. Here we use groups of 4 bits:
c = 0b1000_0000 # 128
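If you want to go back and forth between binary text and ordinary numbers, here's a short sketch assuming Python specifically (other languages spell these conversions differently). It uses only the built-ins `bin`, `int`, and `format`; the variable names are just for illustration.

```python
a = 0b101                         # a binary literal is just the number 5

as_text = bin(a)                  # number → binary text with the 0b prefix
padded = format(a, "08b")         # number → 8 binary digits, zero-padded, no prefix
back_again = int("101", 2)        # binary text → number (the 2 is the base)

print(a)           # 5
print(as_text)     # 0b101
print(padded)      # 00000101
print(back_again)  # 5
```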

Base 16: “Hexadecimal”

It’s very common to use base 16, or “hexadecimal”, numbers in computer programs. To expand our repertoire to 16 digits, we use the usual 0–9 and then add on A–F: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F.

Other than that, things work the same as before; the odometer wheels just have 16 digits:

0 0 0 0 0 0
0 0 0 0 0 1
…
0 0 0 0 0 9
0 0 0 0 0 A
…
0 0 0 0 0 F
0 0 0 0 1 0

In most programming languages, you can use hexadecimal numbers directly in your programs by using a prefix of 0x:

# A `0x` prefix indicates a hexadecimal number:
a = 0x0101  # 257: 1×1 + 0×16 + 1×256 + 0×4096
b = 0xFF    # 255: 15×1 + 15×16
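As a quick check that these are all the same numbers in different clothes, here's a sketch assuming Python, where octal literals use a `0o` prefix. It relies only on the built-ins `hex`, `oct`, `int`, and `format`.

```python
# The same quantity written in four bases:
same = (0b11111111 == 0o377 == 0xFF == 255)

hex_text = hex(255)            # number → hex text (Python uses lowercase letters)
upper_text = format(255, "X")  # number → uppercase hex digits, no prefix
octal_text = oct(255)          # number → octal text
from_hex = int("B9", 16)       # hex text → number (the 16 is the base)

print(same)        # True
print(hex_text)    # 0xff
print(upper_text)  # FF
print(octal_text)  # 0o377
print(from_hex)    # 185
```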

Why hexadecimal?

We mentioned last set that a byte can hold numbers from 0–255. That’s because a byte is made up of 8 bits, and 8 bits give 2⁸ = 256 different on/off combinations. (This is where the “8-bit” moniker comes from with the 8-bit computers of the 1980s: Apple ][, Commodore 64, ZX Spectrum, Atari 2600, Nintendo Entertainment System, BBC Micro, etc. They used computer chips that worked with one byte, or 8 bits, of information at a time.)

Well, writing out full binary numbers (like 0b10111001) takes up a lot of space, and is hard to read. It happens that one hexadecimal digit holds exactly four bits, and so any byte value can be represented with two hex digits:

  1    0    1    1    1    0    0    1
128   64   32   16    8    4    2    1

1 0 1 1 | 1 0 0 1
8 4 2 1 | 8 4 2 1
11: 0xB | 9: 0x9
   B    |    9

So 0b10111001 == 0xB9 == 185. But it’s a whole lot easier to convert B (1011) and 9 (1001) to 10111001 than it is to convert 185. Each group of 4 bits¹ corresponds to one hex digit.
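The "two groups of four" trick can be sketched in a few lines of Python. The name `byte_to_hex` is made up for this example. Dividing by 16 splits a byte into its upper and lower 4-bit halves, because the upper half sits in the sixteens place and above.

```python
HEX_DIGITS = "0123456789ABCDEF"


def byte_to_hex(byte):
    """Write a value from 0 to 255 as exactly two hex digits."""
    high, low = divmod(byte, 16)   # high: upper 4 bits, low: lower 4 bits
    return HEX_DIGITS[high] + HEX_DIGITS[low]


print(divmod(0b10111001, 16))   # (11, 9)
print(byte_to_hex(0b10111001))  # B9
```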

Now we understand the hexadecimal part of the output of man ascii. If you look up “#”, you’ll see that it’s 23 in hexadecimal (we can also write 23₁₆). If you type a “#” into a web form, you’ll often see it show up as %23 in the URL, as in google.com/search?q=%23. (You can say the octothorpe is “percent encoded” as %23.)
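You can check this yourself. Here's a small sketch assuming Python, using the built-in `ord` and the standard library's `urllib.parse.quote` and `unquote`; the variable names are only for illustration.

```python
from urllib.parse import quote, unquote

code = ord("#")                       # the character's number in the ASCII table
by_hand = "%" + format(code, "02X")   # a percent sign plus two hex digits
encoded = quote("#")                  # the standard library's percent-encoding
decoded = unquote("%23")              # and back again

print(code)     # 35
print(by_hand)  # %23
print(encoded)  # %23
print(decoded)  # #
```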

We’ll get deeper into binary and working with bits in the next part.


  1. A group of four bits is also called a “nibble”: half a “bite”!