What’s In A Byte?

When it comes to computer memory and computer storage, the unit we use for measuring size and capacity is the byte. If I were talking about a file, I might say “This file is 28 bytes”. If I’m talking about a computer’s memory capacity, I might say something like “This computer has 28 bytes of memory”. Or if I’m talking about a storage device such as a hard drive, I might say “This hard drive has 28 bytes of storage.”

Okay, sure, a byte is a unit of measure for computers. But what is it really? Depending on who you ask, you might be told “a byte is the memory (or storage space) taken by one character” or “a byte is 8 bits”. Of the two, I think the first definition, “a byte is the memory (or storage space) taken by one character”, is easier to wrap one’s thoughts around, so we’ll start there.

A byte is the minimum amount of memory (or storage space) that a character uses. Or to put it more clearly: A character requires at least one byte of memory or storage. There are kinds of characters that use multiple bytes, but I’m saving that topic for another post.

Thinking of bytes in the context of characters won’t help unless we have a common understanding of what computer geeks mean by a character[1]. Letters, numbers, symbols, punctuation, even spaces are all characters. The letter ‘a’ is one character. The percent symbol, ‘%’, is another. The digit ‘1’ is yet another. Each of these characters will require 1 byte of memory or storage.

The sentence ‘Iain counts’ is 11 characters: Iain is 4, counts is 6, and the space is 1, for a total of 11 characters. The sentence also requires 11 bytes of memory or storage.
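If you’d like to check that count yourself, here is a minimal Python sketch. The variable names are my own, and it assumes the text is encoded as ASCII, where every character takes exactly one byte.

```python
# Count the characters and the bytes in the example sentence.
sentence = "Iain counts"

# Number of characters, including the space.
char_count = len(sentence)

# Encode as ASCII (an assumption: one byte per character), then count bytes.
byte_count = len(sentence.encode("ascii"))

print(char_count)  # 11
print(byte_count)  # 11
```

Both numbers match because every character in this sentence fits in a single byte.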

There are also some special characters, usually not displayed on the screen. These characters are used in various ways, such as marking the end of a line, marking the end of a paragraph, representing a “tab”, etc. Each of these special characters will require 1 byte of memory or storage when it is used.
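Here is a small Python sketch of the same idea. It puts a tab in front of the example sentence and an end-of-line character after it, then counts the bytes. The names are hypothetical, and ASCII encoding is again an assumption.

```python
# "\t" is the tab character and "\n" marks the end of the line.
line = "\tIain counts\n"

# Encode as ASCII (an assumption: one byte per character).
data = line.encode("ascii")

# strip() removes the leading tab and the trailing end-of-line character.
visible = line.strip()

print(len(visible))  # 11 visible characters
print(len(data))     # 13 bytes: 11 visible + 1 tab + 1 end-of-line
```

The two invisible characters each add one byte, just like a letter would.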

In theory, one can count up all the letters, numbers, symbols, punctuation, spaces, and special characters in a document, and that count will be the size of the document in bytes. However, that is only true for plain text. Plain text lacks formatting (italics, bold, fonts, font sizes, etc.) of any kind and, for our purposes here, doesn’t have any characters that require more than one byte. And it is, well, text.
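To see this with a real file, the sketch below writes a short piece of plain text to a temporary file and asks the operating system for the file’s size. The file contents and variable names are made up for illustration, and the text is assumed to be ASCII.

```python
import os
import tempfile

text = "Iain counts\n"  # 11 visible characters plus 1 end-of-line character

# Write the text to a temporary plain-text file.
# newline="" stops Python from translating "\n" into something else on Windows.
with tempfile.NamedTemporaryFile(
    "w", encoding="ascii", newline="", suffix=".txt", delete=False
) as f:
    f.write(text)
    path = f.name

# Ask the operating system how many bytes the file occupies, then clean up.
size_in_bytes = os.path.getsize(path)
os.remove(path)

print(len(text))      # 12 characters
print(size_in_bytes)  # 12 bytes
```

For plain text like this, the character count and the file size in bytes are the same number.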

Text that is not plain, such as documents produced by word processing software (Word, LibreOffice, WordPerfect, Google Docs, etc.), will have a much larger size in bytes. This is because all the formatting information will be included in the file. Even a relatively small document will have a lot of such information. For example, the sentence “Iain counts” saved from Word produced a file that is 11,733 bytes.

When it comes to word processing documents, photographs, videos, movies, music, and sounds, it is more difficult to count up the number of bytes used. Generally, you can expect them all to be much larger than plain text. However, their size is still measured in bytes.

To put bytes in perspective, here are some example items and how many bytes they use:

Item                                             Size
Maximum length (140 characters) tweet content    140 bytes
This Blog Post                                   4,092 bytes
War and Peace (in plain text)[2]                 3,359,546 bytes
AC/DC’s Back In Black (as a high-quality MP3)    10,232,548 bytes
A typical movie in SD                            1,500,000,000 bytes

As you can see, the number of bytes varies widely across different kinds of data. Some kinds of files can be quite large compared to a tweet or a blog post.
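One way to get a feel for the gap is to ask how many maximum-length tweets would fit in the larger items. The Python sketch below does that arithmetic using the tweet, MP3, and movie sizes listed above; the variable names are my own.

```python
# Sizes in bytes, taken from the list above.
tweet_bytes = 140
mp3_bytes = 10_232_548
movie_bytes = 1_500_000_000

# Whole number of maximum-length tweets that fit in each item.
tweets_per_mp3 = mp3_bytes // tweet_bytes
tweets_per_movie = movie_bytes // tweet_bytes

print(f"{tweets_per_mp3:,}")    # 73,089
print(f"{tweets_per_movie:,}")  # 10,714,285
```

So one SD movie takes up about as much space as ten million tweets.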

The byte is the basic unit of measure of computerized information. When we’re discussing plain unformatted text, one can assume that each letter, digit, symbol, or other character will increase the size of the information by one byte.

The next topic in this series is what “memory” and “storage” mean. Sorta obvious from the terms and sorta misleading from the terms. So clear as mud until discussed. More in the next post!

Key Ideas

  • The byte is the unit of measure for storage space, file sizes, and how much memory programs and data use.
  • Characters are all the letters, numbers, and symbols we use.
  • Characters are also used to represent special concepts such as the end of a line, end of a paragraph, tab character, space, end of a file, etc.
  • Each character uses at least one byte of memory or storage.

  1. It is not, of course, what my great grandmother meant by character. 

  2. Thank you Project Gutenberg: https://www.gutenberg.org/files/2600/2600-0.txt