Working with powers of two is fundamental throughout computer science and computer technology, as is working with base two logarithms (the inverse of powers of two). There are lots of examples of this: Data is represented using bits, which is a base two representation in which each bit represents a power of two; data structures often work by splitting data in half, resulting in complexity proportional to a base two logarithm (as in balanced binary search trees, binary space partitions, and more); memory is addressed using binary values, meaning that memory sizes are naturally powers of two; and as we will see a little later, the “bit” is the most natural measure of information, leading to the importance of powers of two and the related issue of base two logarithms.
Once you learn a few tricks, estimating values that arise in computing problems can be done quickly and easily in your head, giving you a powerful skill at rough approximation. How “rough” is “rough approximation”? In this kind of estimation, we’re really only interested in getting things right to within a factor of two or four, not accurate to a fraction of a percent. The distinctions we’re after are seeing the difference between a process that takes a week and a process that takes a year — not the difference between a week and ten days. For many tasks, this rough approximation illuminates the important issues in a way that can guide more precise work.
As a first step, there is a nice correspondence between powers of two and numbers we talk about in everyday conversation (thousands, millions, etc.). In particular, as the exponent of a power of two goes through multiples of 10, values go up through our “named” numbers. For example, \(2^{10}=1024\), which is approximately a thousand. In fact, when talking about memory (RAM) sizes, since RAM capacity in a hardware module is always a power of two, these are the actual sizes when you get 1K or 1M or 1G of RAM. In other words, if you were to buy a memory module with 1K bytes of RAM (if you could even buy something that small these days!), it would actually have a capacity of 1024 bytes. More realistically, when you buy a 4GB RAM module, it is actually \(2^{32}\) or 4,294,967,296 bytes, rather than the 4 billion that a naive interpretation of the “G” prefix would suggest. (Note that this is not universal to all storage sizes: RAM is measured this way, while hard drive makers use “G” and “T” in their traditional decimal sense, meaning exactly a billion and a trillion bytes rather than powers of two. To make the distinction explicit, some people use “GiB” rather than “GB” to represent \(2^{30}\) bytes, where “GiB” is pronounced “gibibyte” – there is similar terminology for other powers of two, using KiB, MiB, TiB, etc. While KiB, MiB, and GiB do remove the ambiguity, this notation is far from universally practiced.)
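To make that gap concrete, here is a tiny Java sketch (the class name and printed labels are just for illustration) comparing the actual capacity of a “4 GB” RAM module with the 4 billion bytes that a decimal reading of “G” would suggest:

```java
public class GigVsGib {
    public static void main(String[] args) {
        long oneGiB = 1L << 30;             // 2^30 bytes, one "gibibyte" (GiB)
        long fourGiB = 4L * oneGiB;         // capacity of a 4 GB RAM module
        long fourBillion = 4_000_000_000L;  // a naive decimal reading of "4G"

        System.out.println("4 GiB      = " + fourGiB);                    // 4294967296 bytes
        System.out.println("4 billion  = " + fourBillion);                // 4000000000 bytes
        System.out.println("difference = " + (fourGiB - fourBillion));    // 294967296 bytes, about 7%
    }
}
```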
Here are the important powers of two to know:
| Power of two | Approximate value |
| --- | --- |
| \(2^{10}\) | “about a thousand” |
| \(2^{20}\) | “about a million” |
| \(2^{30}\) | “about a billion” |
| \(2^{40}\) | “about a trillion” |
Next, you can “fill in the gaps” between these powers of two by knowing the small powers of two – those with an exponent less than 10. If you do much computer science at all, you should know, without even really having to think, that \(2^4=16\) and \(2^8=256\) (the latter is the number of distinct 8-bit bytes, so it is one of the most important values to know right off the top of your head). Another value that is good to know, simply because it comes up a lot, is \(2^{16}=65,536\). This is, for example, the number of distinct 16-bit binary values. Knowing this, you know that the maximum value of an unsigned 16-bit integer (an unsigned short in C or C++ on most computers, or Java’s 16-bit char type) is \(2^{16}-1\), or 65,535.
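If you want to convince yourself of these values, a few lines of Java (just a throwaway sketch) print them directly using shifts:

```java
public class SmallPowers {
    public static void main(String[] args) {
        System.out.println(1 << 4);         // 16
        System.out.println(1 << 8);         // 256, the number of distinct 8-bit bytes
        System.out.println(1 << 16);        // 65536, the number of distinct 16-bit values
        System.out.println((1 << 16) - 1);  // 65535, the largest unsigned 16-bit value
        System.out.println((int) Character.MAX_VALUE);  // Java's unsigned 16-bit char: also 65535
    }
}
```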
Once you know these values, it’s easy to estimate others. For example, what is \(2^{25}\)? Well, the exponent is in the 20’s, so let’s separate that out by writing \(2^{25}=2^{20}\cdot 2^5\). Knowing the small powers of two and the approximations for multiples of 10 shows that this is “about a million” times 32 — in other words, \(2^{25}\) is approximately 32 million.
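This exponent-splitting trick is mechanical enough to write down directly. The sketch below (the method name estimate and its output strings are made up for illustration, and it only handles exponents below 50) mirrors the mental process: peel off the multiple of 10 as a “named” number and keep the small leftover power of two as a multiplier.

```java
public class EstimatePowers {
    // Estimate 2^e by splitting e into a multiple of 10 (a "named" number)
    // plus a small leftover exponent. Handles 0 <= e < 50.
    static String estimate(int e) {
        String[] names = {"", " thousand", " million", " billion", " trillion"};
        int tens = e / 10;   // which named number: thousand, million, ...
        int rest = e % 10;   // small leftover power of two
        return (1 << rest) + names[tens];
    }

    public static void main(String[] args) {
        System.out.println("2^25 is about " + estimate(25));  // 32 million
        System.out.println("exact value:   " + (1L << 25));   // 33554432
    }
}
```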
The numbers we are working with are often measurements of time, so there are a few powers of two worth knowing for relating to units of time. While the approximations above are accurate to within 10% in the worst case, these are a little more variable. There’s also no good pattern to them, but they come up often enough that you should just memorize these values (shown below with the relative error of each):
| Quantity | Approximation | Relative error |
| --- | --- | --- |
| Seconds in an hour | \(2^{12}\) | 13.8% |
| Seconds in a day | \(2^{16}\) | 24.1% |
| Seconds in a year | \(2^{25}\) | 6.3% |
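A quick way to verify these values (and to see where the relative errors come from) is to compute them. Here is a small Java sketch, where the exact second counts assume a 365.25-day year and the helper report is just an illustrative name:

```java
public class TimeApproximations {
    // Print how far the power-of-two approximation is from the exact count.
    static void report(String label, long exact, int exponent) {
        double approx = Math.pow(2, exponent);
        double relErr = Math.abs(approx - exact) / exact * 100;
        System.out.printf("%-18s ~2^%d (relative error %.1f%%)%n", label, exponent, relErr);
    }

    public static void main(String[] args) {
        report("seconds in an hour", 3_600L, 12);       // ~13.8%
        report("seconds in a day", 86_400L, 16);        // ~24.1%
        report("seconds in a year", 31_557_600L, 25);   // ~6.3% (365.25-day year)
    }
}
```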
As another notion of time, consider the speed of a modern processor. Clock speeds are typically in the low gigahertz range (roughly 2 GHz to 4 GHz), with each machine instruction taking a few clock cycles. For simple calculations, it’s not too far off to say that a modern CPU performs around a billion operations per second – or around \(2^{30}\) operations per second.
Example 1.1: Consider a program that allocates 40 bits to a counter. How long would it take to increment the counter before it “wraps around” (i.e., exceeds the value that can be stored in 40 bits)? A 40-bit counter that is initialized at zero wraps around when it has been incremented \(2^{40}\) times, so if we assume a billion increments per second then this will happen in roughly \(2^{40}/2^{30}=2^{10}\) seconds, or about 1000 seconds. Since 10 minutes is 600 seconds and 20 minutes is 1200 seconds, this is approximately 17 minutes.
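As a check on the arithmetic, a few lines of Java (a throwaway sketch using the example’s assumed rate) compute the same estimate exactly:

```java
public class CounterWrap {
    public static void main(String[] args) {
        long increments = 1L << 40;   // increments before a 40-bit counter wraps around
        long perSecond = 1L << 30;    // assume roughly a billion increments per second
        long seconds = increments / perSecond;   // 2^40 / 2^30 = 2^10
        System.out.println(seconds + " seconds, about " + (seconds / 60) + " minutes");
        // prints: 1024 seconds, about 17 minutes
    }
}
```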
Example 1.2: The original Unix implementation kept track of time with a simple integer counter, measuring the number of seconds since the beginning of January 1, 1970 (this starting time is known as the “Unix epoch”). The counter was 32 bits, and unfortunately was a signed integer, so it can only count up to \(2^{31}-1\). What is the last time that can be represented with this time representation? We know that a year is approximately \(2^{25}\) seconds, so the counter is good for \(2^{31}/2^{25}=2^{6}=64\) years. 64 years after 1970 is 2034, so that’s a reasonable estimate of when this representation stops working. (Note that a precise calculation shows that the counter runs out at 3:14:08 on Tuesday, January 19, 2038, and so this is known as the “Year 2038 Problem.” But 2034 is a pretty good estimate!)
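You can confirm the precise date with Java’s time API; the sketch below simply plugs the largest signed 32-bit value into Instant.ofEpochSecond:

```java
import java.time.Instant;

public class Year2038 {
    public static void main(String[] args) {
        // The last second representable by a signed 32-bit Unix time counter:
        // 2^31 - 1 seconds after the epoch.
        long lastSecond = Integer.MAX_VALUE;   // 2147483647
        System.out.println(Instant.ofEpochSecond(lastSecond));
        // prints 2038-01-19T03:14:07Z; the counter overflows one second later
    }
}
```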
The inverse of the powers of two function is the base two logarithm, which is also of major importance in computer science. The function \(\log_2 n\) is proportional to the time to store or look up a value in a balanced binary tree, the time for a binary search, the time to update a binary heap, and more. If you can estimate powers of two quickly, then you can also estimate base two logarithms quickly! Since \(2^{10}\) is approximately 1000, you also know that \(\log_2 1000\) is approximately 10. Similarly, you know that \(\log_2 1,000,000\) is approximately 20. By repeatedly doing quick approximations with powers of two, you can narrow down an estimate for the base two logarithm of any number. What is the base two logarithm of 100,000? Well, that’s 100 times 1000. Since 100 falls in a range where we know lots of powers of two, we know that it is approximately \(2^7\) (which is exactly 128). So 100,000 is approximately \(2^{7}\cdot 2^{10} = 2^{17}\), and so the base two logarithm of 100,000 is approximately 17. (Note that the precise value is 16.6096…, so this is a pretty good estimate!)
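To see how close the mental estimate is, you can compare it against a computed value; this little Java sketch uses the change-of-base identity \(\log_2 n = \ln n / \ln 2\):

```java
public class Log2Estimate {
    public static void main(String[] args) {
        double n = 100_000;
        double exact = Math.log(n) / Math.log(2);   // change of base: log2(n) = ln(n) / ln(2)
        System.out.printf("log2(100,000) = %.4f (mental estimate: 17)%n", exact);  // 16.6096
    }
}
```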
Example 1.3: Here’s an interesting problem: In 2016 there were a little over 5 million registered voters in North Carolina. Let’s say we wanted to be able to look up voters quickly, so we put all those names in a sorted array that we can search using binary search. How quickly could we find a voter? Here’s the estimation: We’re talking about something in the millions, so we start off by noting that \(2^{20}\) is about a million. Then we’re talking about 5 million, which is close to 4 million, and 4 million is \(2^2=4\) times a million, or \(2^{22}\). The base two logarithm of this is 22; rounding up to account for our underestimate of 5 million, we see that we can find any voter, out of 5 million listings, in around 23 comparisons, which a modern processor could do in less than a microsecond. That’s excellent!
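If you want to check the comparison count, the sketch below (the names are made up, and it just counts halvings rather than searching real data) computes the worst-case number of binary search steps for 5 million entries:

```java
public class VoterLookup {
    public static void main(String[] args) {
        int voters = 5_000_000;
        // Binary search over n sorted entries needs about ceil(log2(n)) comparisons
        // in the worst case: each comparison cuts the remaining range roughly in half.
        int comparisons = 0;
        for (int remaining = voters; remaining > 1; remaining = (remaining + 1) / 2) {
            comparisons++;
        }
        System.out.println(comparisons + " comparisons");  // 23
    }
}
```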
Let’s see how these approximation techniques can be used in an estimation problem that arises in cryptography. You don’t actually have to understand cryptography for these examples to make sense, so don’t worry if you know absolutely nothing about cryptography.
Example 1.4: In symmetric cryptography, keys are typically arbitrary \(k\)-bit values, where we call \(k\) the “key length” (we’ll talk more about this a little later). Since there are \(2^k\) different \(k\)-bit binary values, a “brute force attacker” simply tries keys one after another until finding one that works, assuming they have some indication of when a key is correct. Consider a simple encryption scheme that uses a 40-bit key, where you can test keys at a rate of a billion per second. (For historical context: Before the year 2000, when non-US users downloaded the Netscape Navigator browser they got the “International Version,” which most commonly used 40-bit keys for a cipher called RC4.) When you search for a random key out of \(2^{40}\) possibilities, you will, on average, have to test half of the possible keys. This means you test \(2^{40}/2=2^{39}\) keys. If you can test \(2^{30}\) keys per second, this will take \(2^{39}/2^{30}=2^9\) seconds. If you know your small powers of two, you know this is 512 seconds; even if you don’t remember that exactly, it’s half of \(2^{10}\), so it’s “about 500.” Since 10 minutes is 600 seconds, such a key can be found, on average, in a little under 10 minutes. Obviously, that’s not very secure!
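Here is the same average-case calculation as a small Java sketch (the class name and the \(2^{30}\) tests-per-second figure are just the assumptions from the example):

```java
public class BruteForce40 {
    public static void main(String[] args) {
        long keysToTest = (1L << 40) / 2;   // on average, half of the 2^40 possible keys
        long testsPerSecond = 1L << 30;     // roughly a billion key tests per second
        long seconds = keysToTest / testsPerSecond;   // 2^39 / 2^30 = 2^9
        System.out.println(seconds + " seconds, about " + (seconds / 60) + " minutes");
        // prints: 512 seconds, about 8 minutes
    }
}
```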
Example 1.5: What if we increase the key size to 128 bits, which is the smallest key size supported by the modern cipher AES, while still being able to test keys at around a billion per second? This should be more secure than a 40-bit key, but how much more secure? The key is about 3 times as long, so is it three times as secure, or something else? Let’s re-run our calculations: On average, we’ll need to check \(2^{128}/2=2^{127}\) keys. At a billion a second, this is \(2^{127}/2^{30}=2^{97}\) seconds.
That’s a lot of seconds, so let’s try to get a better handle on it. Using the estimate that there are \(2^{25}\) seconds in a year, we see that this would take, on average, \(2^{97}/2^{25}=2^{72}\) years. Wow. How much is that? Since \(72=32+40\), we can write \(2^{72}=2^{32}\cdot 2^{40}\), which is about 4 billion times a trillion, so this is about 4 billion trillion years. For context, the best estimate of the age of the universe is just 13.7 billion years.
Even if you could build special-purpose hardware that could test keys a billion times faster, it would still take around 4 trillion years to find the key. That’s still roughly 300 times the age of the universe – that seems pretty safe!
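These numbers overflow 64-bit arithmetic, so checking them in code takes BigInteger; the sketch below recomputes both figures, using the example’s assumed rates:

```java
import java.math.BigInteger;

public class BruteForce128 {
    public static void main(String[] args) {
        BigInteger two = BigInteger.valueOf(2);
        BigInteger keysToTest = two.pow(127);      // half of the 2^128 possible keys, on average
        BigInteger testsPerSecond = two.pow(30);   // roughly a billion key tests per second
        BigInteger secondsPerYear = two.pow(25);   // roughly 2^25 seconds in a year

        BigInteger years = keysToTest.divide(testsPerSecond).divide(secondsPerYear);
        System.out.println(years);   // 4722366482869645213696, about 4.7 billion trillion years

        // Even a billion times faster only knocks nine digits off:
        BigInteger fasterYears = years.divide(BigInteger.valueOf(1_000_000_000L));
        System.out.println(fasterYears);   // 4722366482869, still trillions of years
    }
}
```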
This last pair of examples shows how rough approximations are useful. Even if the estimate in Example 1.4 were off by a factor of 10, it would still show that a 40-bit key could be broken in somewhere between 50 seconds and 5000 seconds (about an hour and a half) – both very practical timings. And if Example 1.5 overestimated by a factor of 10, you’re still looking at hundreds of millions of trillions of years – very impractical, no matter how you look at it!
© Copyright 2020–2021, Stephen R. Tate
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.