Introduction
Dragon CTF is a really high-weighted event (99.33). I created an account and planned to play solo for my ctftime rating, but then found out that Balsn is also playing, so I quickly switch to play with Balsn. There are 3 crypto challenges with decent difficulty gaps. The easiest one was quickly solved by my teammates and I didn't even have the chance to take a look, the other two are both CRC recursive challenge, with the hard one having bigger parameters. I solo solved the easier version, and with the same idea, sasdf was able to solve the harder version by optimizing my algorithm. Credit sasdf for the code and those optimization ideas, and tpobenvi for discussion.
In this writeup we go straight to the harder version.
CRC recursive challenge
Challenge
1 | #!/usr/bin/env python3 |
Solution
In the following writeup, the \(+\) operation means xor 99% of the time.
The first intuition is that the crc function is linear, so we might just need to solve some linear equations to find the correct crc string. Let's say the input crc is \(I = b_{127}b_{126}\cdots b_0\) (every \(b_i\) is a single bit), then crc128 can be written as
\[\mathsf{CRC128}(I) = \bigoplus_{i} b_iu_i \oplus C\]
for some fixed 128-bit \(u_i\) and \(C\). so we only have to solve something like
\[\bigoplus_{i} b_iu_i \oplus C = \bigoplus_{i} b_i 2^i\]
which is a typical linear basis problem with size 128 and can be solved using Guassian elimination.
However, the problem is that 'a' does not mean '9' + 1 in terms of the ascii value, so the linearity of \(\mathsf{CRC128}(I)\) breaks. A quick fix is to bruteforce \(2^{32}\) possibilities to determine if a character is an alphabet or number, and solve the linear equation, but as I finished implementing and ran the attack, this happened:
1 | # Eaiser version |
A few characters are off! Why? After a long debug, I found out the
heartbreaking fact: ord('a') = 97
and
int('a', 16) = 10
, which means that they are not
"aligned".
Let's first look at why the digits work perfectly. We know that crc128 is linear to every bit of the input, so if the input is a single character \(c = b_7b_6\cdots b_0\), then
\[\mathsf{CRC128}(c) = \bigoplus_i b_iu_i \oplus C\]
And when we parse the character \(c\) as integer's hex representation, we have
\[\mathsf{toInt}(c) = c - 48 = c \oplus 48 = \bigoplus_i b_i2^i \oplus 48\]
for every digit \(c\). Everything is linear, perfect! However, in the case of the alphabets,
\[\mathsf{toInt}(c) = c - 87\]
and we are doomed. There's no way to written it as a beautiful linear combination of \(b_i\)s (with xor operation). The fact that \(48\) is a multiple of 16 while \(87\) is not changes everything, unless...
Let's first go back to the crc128, it operates on a finite field of size \(2^{128}\), and by the way it is calculated, we can conclude that
\[u_3 = xu_2 = x^2 u_1 = x^3 u_0\]
where \(x\) is the variable for the finite field. For example, if we change a single byte of the string from '0' (48) to '1' (49), and the crc128 value differs (xor) by some \(A\), then if we change it from '0' (48) to '2' (50), the xor value differs by \(Ax\), which the "multiplying by x" is calculated in this way:
1 | def multByX(num): |
At the same time, changing a single byte from '0' to '1' also change its hex value by some power of 256 (depends on the character's index), let this value be \(B\). Similarly, replacing the byte from '0' to '2' changes the hex value by \(2B = Bx\) as \(B < 2^{127}\). The linearity we talked about is that replacing a '0' to a digit \(b_3b_2b_1b_0\) makes a crc128 difference of \((b_3x^3 + b_2x^2 + b_1x + b_0)A\) and a hex value difference of \((b_3x^3 + b_2x^2 + b_1x + b_0)B\). So if we have the freedom to change '0' to any other digit, we can describe the difference of the value \(\mathsf{CRC128}(I) \oplus \mathsf{toInt}(I)\) as "choose some values from \([A + B, (A + B)x, (A + B)x^2, (A + B)x^3]\) and xor them together". So if we hope \(\mathsf{CRC128}(I) \oplus \mathsf{toInt}(I)\) to change by a fixed amount, we can check if the 4 numbers above span that amount. (10 to 15 is invalid but we ignore this fact for now)
For alphabets, the linearity is not so obvious, if we replace an 'a'
to 'b', the crc value (which depends on the ascii value) changes by
\(A(x + 1)\)
(ord('a') ^ ord('b') = 3
), while the hex value changes by
\(B\)
(int('a', 16) ^ int('b', 16) = 1
). We can make a table for
this
From 'a' to | diff |
---|---|
'a' | \(0\) |
'b' | \(A(x + 1) + B\) |
'c' | \(Ax + B(x^2 + x)\) |
'd' | \(A(x^2 +1) + B(x^2 + x + 1)\) |
'e' | \(A(x^2) + B(x^2)\) |
'f' | \(A(x^2 + x + 1) + B(x^2 + 1)\) |
it turned out that we can also use 4 numbers to span these values: \([Ax^2 + Bx^2, A(x + 1) + B, Ax + Bx, Bx^2]\).
The reason why we want to use 4 numbers to cover the difference is that we can control 32 bytes of the string, so the difference we can make is spanned by 128 numbers. As we are operated on a finite field of size \(2^{128}\), the solution is unique (or does not exist), and we can check if it is valid after finding this unique solution. (while if we use 5 numbers to cover the possible differences, we have 160 128-bit numbers, and there are like \(2^{32}\) ways to generate a fixed value).
As alphabet is actually (in some sense) linear, we have the following attack:
- enumerate \(2^{32}\) possibilities of digit/alphabet combination
- for each possibilty, make the base-case string \(s\): a string that has only '0' and 'a'.
- calculate the base-case difference \(\mathsf{CRC128}(s) \oplus \mathsf{toInt}(s)\)
- construct the 128 values that describe the possible differences of the value \(\mathsf{CRC128}(I) \oplus \mathsf{toInt}(I)\)
- solve the linear basis problem to see if/how we can make the value \(\mathsf{CRC128}(I) \oplus \mathsf{toInt}(I)\) zero.
- check if that's a valid solution (for example, 10 ~ 16 for digits)
The time complexity is \(O(2^{N} \cdot \frac{(4N)^3}{64}) = O(2^N N^3)\) because the linear basis requires a cubic-time guassian elimination, and all the operations are bitwise xor, hence the factor 64.
I implemented this using python, and it takes 11 mins to run the easier version, and like 100 days for the harder version, so I pretty much gave up here until sasdf suggested using C language. However, it was my sleeping time, so sasdf did all the remaining optimizing work:
- switch from Python to C
- while bruteforcing the \(2^{32}\) possibilities, the adjacent iterations have similar linear basis, so store the prefix linear basis and be lazy calculating the linear basis. This reduces a 128 in terms of the time complexity.
- parallelize
- burn GPU
The second idea is quite simple that I feel bad didn't come up with it as an competitive programmer. Anyway, the script by sasdf is here.
Conclusion
Congrats Balsn for getting the first big-event champion this year!