Bits vs Bytes

This document is intended for novice use.

A bit is the smallest unit of information that can be stored or manipulated on a computer; it consists of either zero or one. Depending on meaning, implication, or even style, it could instead be described as false/true, off/on, no/yes, and so on. We can also call a bit a binary digit, especially when working with the 0 or 1 values.

A bit is not just the smallest unit of information; for the sake of discussion, it can also be said to be the largest unit of information a computer can manipulate. Bits are bunched together so that the computer can use several of them at the same time, such as when calculating with numbers. When a "bunch" means eight bits, it is called a byte.

A byte also happens to be enough bits to represent a letter of the alphabet or another character. For example, the letter "A" would be 01000001; my initials "KJW" would be 010010110100101001010111. To make it a little easier to see where the bytes are, it is customary to place a comma every four digits, making what are sometimes called nibbles: 0100,1011,0100,1010,0101,0111. That's not really much easier for people to read or write, and many computer engineers, programmers, and analysts need to read and write even longer binary codes than this.
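As a rough illustration of the paragraph above, here is a minimal sketch in Python (the language is chosen only for this example). The helper names `to_bits` and `group_nibbles` are hypothetical, and the sketch assumes plain ASCII characters so that each one fits in a single byte.

```python
def to_bits(text):
    """Return the 8-bit binary code of each character, joined together.

    Assumption: the text is plain ASCII, so every character is one byte.
    """
    return "".join(format(byte, "08b") for byte in text.encode("ascii"))


def group_nibbles(bits, sep=","):
    """Insert a separator after every four binary digits (one nibble)."""
    return sep.join(bits[i:i + 4] for i in range(0, len(bits), 4))


print(to_bits("A"))                    # 01000001
print(to_bits("KJW"))                  # 010010110100101001010111
print(group_nibbles(to_bits("KJW")))   # 0100,1011,0100,1010,0101,0111
```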

It so happens that there are only 16 different ways to write 0's and 1's four times (2 x 2 x 2 x 2 = 16). So something called hexadecimal code can be used to make the numbers shorter by translating each nibble (or half-a-byte) like this:

Binary:       0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111
Hexadecimal:  0    1    2    3    4    5    6    7    8    9    A    B    C    D    E    F
Decimal:      0    1    2    3    4    5    6    7    8    9    10   11   12   13   14   15

Notice that {A,B,C,D,E,F} are not letters here; they are numbers! Hexadecimal "C" means decimal "12", just like binary "1100". Computers themselves work in binary; people use hexadecimal because each hexadecimal digit stands for exactly four bits, which makes converting between binary and hexadecimal far simpler than converting between binary and decimal.
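The table can be reproduced with a few lines of Python, shown here only as a sketch. The function name `nibble_table` is made up for this example; `format` and `int` are standard built-ins.

```python
def nibble_table():
    """Return (binary, hexadecimal, decimal) for every 4-bit value."""
    rows = []
    for value in range(16):
        # "04b" = four binary digits with leading zeros; "X" = uppercase hex
        rows.append((format(value, "04b"), format(value, "X"), value))
    return rows


for binary, hexadecimal, decimal in nibble_table():
    print(binary, hexadecimal, decimal)

# Hexadecimal "C", decimal 12, and binary 1100 are all the same number.
print(int("C", 16) == 12 == int("1100", 2))   # True
```

Reading the last line: `int("C", 16)` interprets "C" as a base-16 digit and `int("1100", 2)` interprets "1100" as base-2 digits, so both come out as the ordinary number 12.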

Each displayed letter is represented by a number inside the computer. (See ASCII or Unicode for tables.) So my initials would look like the following, to which I've also added a special <null> character to even it up into a full computer "word" for readability's sake. In this discussion <null> is just an invisible non-printing character.

Letter (each a single byte):     K         J         W         <null>
Binary (split into nibbles):     0100 1011 0100 1010 0101 0111 0000 0000
Hexadecimal (also as nibbles):   4    B    4    A    5    7    0    0

So of course "4B4A5700" is much easier to understand than "01001011010010100101011100000000". To make it a little easier still, a comma is usually placed after every fourth hexadecimal digit, just as was done for the binary digits. That would make my initials look like "4B4A,5700". Some people use a space instead (4B4A 5700); in both cases the idea is readability.
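For readers who want to try this, the conversion of the initials to hexadecimal can be sketched in Python as follows. The helper names `to_hex` and `group_digits` are hypothetical; `bytes.hex()` is a standard method, and the <null> character is written as `\x00`.

```python
def to_hex(data):
    """Return the bytes as uppercase hexadecimal digits, two per byte."""
    return data.hex().upper()


def group_digits(digits, size=4, sep=","):
    """Insert a separator after every `size` digits for readability."""
    return sep.join(digits[i:i + size] for i in range(0, len(digits), size))


initials = b"KJW\x00"   # three letters plus the <null> character

print(to_hex(initials))                          # 4B4A5700
print(group_digits(to_hex(initials)))            # 4B4A,5700
print(group_digits(to_hex(initials), sep=" "))   # 4B4A 5700
```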

These groupings are also special. Here, four bytes (such as my 3 initials and the <null>) are called a word, although the exact size of a word varies from one kind of computer to another. A group of 4 hexadecimal digits, which is 16 bits long, is called a halfword.
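A small Python sketch of the same idea, assuming a 32-bit word as in this discussion and assuming the first letter sits in the most significant byte (the `byteorder="big"` choice is an assumption of this example, not something the text specifies). The variable names are made up for illustration.

```python
# Pack the four bytes into one 32-bit word.
word = int.from_bytes(b"KJW\x00", byteorder="big")

# Split the word into its two 16-bit halfwords.
upper_halfword = word >> 16       # shift away the bottom 16 bits
lower_halfword = word & 0xFFFF    # keep only the bottom 16 bits

print(format(word, "08X"))             # 4B4A5700
print(format(upper_halfword, "04X"))   # 4B4A
print(format(lower_halfword, "04X"))   # 5700
```

The two halfwords are exactly the two groups seen when the hexadecimal digits are split every four characters.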



Copyright © 1999, 2009, 2017 Kevin J. Walsh
walsh@njit.edu /KJW disclaimer