Assignment 2 -- Due Monday, February 17

Assignment 2 – Due Monday, February 17

Note: All analysis problems can (and should!) be solved using the “estimation with powers of two” techniques from the class reading. Do not use a calculator!! You should always show all steps of your work, with explanations as necessary (don’t just write formulas!).

One of the most cost-effective ways to access significant general-purpose computing power is to rent time on a large system from a cloud provider, like Amazon AWS. In this problem, you’ll use powers of two estimation to estimate how much it would cost to break keys using massive GPUs available through AWS.
1. An NVIDIA H100 tensor core GPU can unit can test roughly 32 billion AES keys per second. How many keys (as a power of two) can an H100 test in one hour?
2. You can rent time on an Amazon “p5.48xlarge” instance, which includes 8 nVidia H100 GPUs, for a little under $40/hour, so let’s round that off and say it’s $32/hour. How much would it cost to test $2^{60}$ keys? Give this as an understandable amount – not a power of two.
3. How much would it cost, on average, to break an 80-bit key?
4. How much would it cost, on average, to break a 128-bit key? Do you think this much money exists in the world?
5. As described on page 34, if you are looking at English text, then the number of $n$-bit strings that are valid English text is approximately $2^{0.16n}$. Consider what would happen if someone used an English phrase for the AES key. How many 128-bit sequences are valid English phrases (approximately)? Assuming you had a way to efficiently generate all 128-bit English phrases, how much would it cost to test all of these on the p5.48xlarge AWS instance?
In the Bitcoin system, transactions are stored in blocks that have an 80 byte header, and “Bitcoin mining” involves repeatedly hashing modifications of the header until a hash is found that begins in a certain number of zero bits (this is not entirely accurate, but close enough for this homework question!). When you do the labtainer lab (last question, below), you’ll experiment with a small version of this problem in “Task 4.” The necessary number of zero bits is set so that the expected time for some miner to find the right input is around 10 minutes, and this length has increased over time as more and faster miners have come online. Currently, in January 2025, the hash must begin with 78 zeroes. Bitcoin miners all use specially-built hardware these days – let’s see how long it would take to successfully find such a hash input on a regular system with standard software.

On my system, the OpenSSL implementation of SHA256 can hash roughly 8 million 80-byte blocks per second. There are 16 cores, so the overall number of hashes that can be computed per second is 16 times that. The probability of some randomly chosen input having a hash that begins in 78 zeros is basically $2^{-78}$, so a single test has a Bernoulli distribution with $p=2^{-78}$. The number of trials needed before the first success has a Geometric distribution. With those hints, first find the expected number of trials before mining succeeds. Then calculate the expected time to success, using the rate of 8 million trials per second per core, and using 16 cores. Note that all values here are nicely approximated by powers of two, so use the “estimation by powers of two” techniques for your calculations, and show your work! Your final answer should be as a meaningful number, such as “16 seconds,” or “21 days,” or “2,000 years” (and not something like “$2^{31}$ seconds”).
While not part of the homework, you might find it interesting to see how the speed of your own personal system compares. If you can run openssl (either natively or in the labtainer virtual machine), then run this command:
```
  openssl speed -bytes 80 -seconds 10 sha256
```
The output on my system includes the line:
```
  Doing sha256 for 10s on 80 size blocks: 104509707 sha256's in 10.00s
```
This says it could do 104,509,707 SHA256 hashes in 10 seconds, so about 10,450,970.7 per second. I rounded down a little (but just a little!) when I said 8 million per second.
Bitcoin blocks are identified by their hash, and it would break the system if two blocks had the same hash. Unfortunately, since a large number of bits are constrained to be zero, this reduces the overall number of possible hashes and increases the probability of a collision. Let’s see if this is a problem. Solving this question involves using the techniques and estimations from the section on the “Birthday Problem” in the assigned reading.
1. Hash values are 256 long, but assume that the only allowed hashes begin with 78 zero bits. The current bitcoin blockchain has around 882,500 blocks, so let’s round that up to a million blocks. Estimate the probability that two blocks have the same hash value (assume all but the 78 fixed bits are random and uniformly distributed).
2. Let’s turn the problem around: How many blocks would need to be hashed before there’s roughly probability 1/2 of a repeated hash?
3. One of the most popular hash functions in the 1990’s was MD5, which produces 128-bit hash values. Repeat the two calculations above for 128-bit hash values (still with 78 zero bits). (Note: MD5 has been shown to be very insecure with respect to collision resistance, so should not be used. While collision resistance isn’t really the right measure for applications like this one, MD5 should still be avoided!)
This problem will help you understand why public key encryption schemes require larger keys than symmetric ciphers.
1. Assume that a public key scheme that uses Elliptic Curve Cryptography (ECC) with an $n$ bit key can be broken in $2^{n/2}$ steps. What would the keylength $n$ need to be for this to be the same time (number of steps) as the worst-case time to brute force a 128-bit key? Show and explain your work! How does this compare to the NIST 800-57 key size recommendations table (slide 14 in the “Fundamental cryptographic services” slides).
2. A rough approximation of the time it takes to factor an $n$-bit number of the type used by RSA (which would break an $n$-bit RSA key) is $2^{8\cdot n^{1/3}}$ — in case it’s not clear, the exponent is 8 times the cube root of $n$. Note that I have simplified this somewhat for the homework so that the calculations work out easily, but it’s not a bad approximation for values of $n$ between 1000 and 20,000. Given this complexity, what key size $n$ is needed for RSA to be as hard to break as the worst-case time to brute force a 128-bit key? How does this compare to the NIST 800-57 key size recommendations?
Complete the “macs-hash” labtainer exercise. Here are a few tips:
- This lab requires putting some values into a spreadsheet and completion of both a lab report, using templates that are provided. When you start the lab, it will give you paths to both documents (as well as the lab writeup). Just right-click on those links to open them.
- Make sure you complete the spreadsheet and lab report, and save them, before typing “stoplab”. Since “stoplab” bundles these into the .lab file that you will submit, they must be completed and saved before the “stoplab”!
- From the original student window (not the labtainer window), you can type “checkwork” to see what the labtainer window thinks you have done or not done. You are encouraged to do this to make sure your actions are being recorded properly! In particular, it is looking for some specific actions that are described in the lab, and is not very smart about seeing alternatives. If you know shortcuts or simplified commands, do not use them - stick with what is described in the lab instructions.