Author Seifer
Target Project Cryptology. Essay #3: MD5 Algorithm
Public Release  06/02/2000
 

Disclaimer: Please note, the information herein is copyright to Hellforge. No portion of this text may be duplicated. Furthermore, damage or problems arising after reading this text is left to the users disposal. Neither Hellforge, nor its members can be held responsible for any direct or indirect result of following this text. The full liability of following this text is on the reader (YOU). The information is provided for educational purposes, misuse of this information is strictly prohibited. If you do not agree with this agreement, then please hit the "back" button on your browser, and go to hell. - Mercution.
 

MD5 Algorithm 

Contents:

1) Small Introduction to Hashing
2) Algorithm
3) C Implementation

Introduction
 

Today we will study some hashing algorithm for the HellForge's Crypto project. First I'll try to explain what is hashing, and then finish the introduction by giving some example. People often think that hashing is the same thing that encryption, and they aren't right on this point. Encryption enables you to transform some message into a meaningless following of characters. The person you send the encrypted message to will be able then to decrypt the message thanks to a key, and then get the original message.

Here is the principal difference between encryption and hashing. If you hash some message or some chars block, you will get a key, which size will depend on the used algorithm (example : MD5 produces 128 bits hash, SHA-1 produces 160 bits hash), and the algorithm is made so that it is impossible for anyone to retrieve the original block, by using functions that are not bijections, for example adding the chars of the block together, bits rotations...

Some hashing algorithms have been successfully attacked, but well-known cryptanalysts have showed that it is quite impossible to reverse the more recent algorithms that have been created.

You may ask yourself which is the utility of hashing if it doesn't enable you to encrypt/decrypt data. It is mainly used in security checks like digital signatures, software's protection algorithms. More and more name/serial protections use hashing algorithms to generate keys depending on user names (even if this is useless against advanced crackers ;). Moreover, you can use hashing to generate authenticity keys and then check them by comparing their hash to the one obtained from the original key. With this method, there's no point on retrieving the original key without brute-forcing, which wouldn't be possible in a reasonable time. Another utility is to generate a key with a hashing algorithm and then encrypt the hashed text block with this key. It is for example possible to hash a text block with MD5, set the result key as blowfish encryption key, and encrypt the text block with it...

Now that you are more used to what hashing is, I'll give the mathematical definition of the MD5 algorithm.

Tutorial
  

 

Algorithm:

As I was explaining you in the introduction, MD5 will produce a 128 bits key from an entered text block. First, the text will be padded into sub-blocks so that its length is a multiple of 512 bits - 64 bits. The last 64 bits of the block will depend on the length of the text. Moreover, the ASCII code 0x80 will be concatenated to the last char of the block. The operation will be done the number of times that the length of the message is a multiple of 512 bits - 64 bits.

During the padding operation, if the size of a sub-block is less than 512 bits - 64 bits, this sub-block will be filled with as many 0x00 as necessary, after the 0x80 ASCII code.

Once the text is padded, four variables called chaining variables are initialised the following way : A = 0x01234567 B = 0x89ABCDEF C = 0xFEDCAB98 D = 0x76543210

Those variables are copied into 4 other variables a, b, c and d, which will vary during the process. The 8 numbers are converted to Little-Endian.

The main part of MD5 is made of 4 rounds containing each one 16 function calls. Each function is usually called FF, GG, HH, II, and uses a non-linear function (respectively F, G, H, I) of three variables :

F (X, Y, Z) = (X AND Y) OR ((NOT X) AND Z) G (X, Y, Z) = (X AND Z) OR (Y AND (NOT Z)) H (X, Y, Z) = X XOR Y XOR Z I (X, Y, Z) = Y XOR (X OR (NOT Z))

Finally, FF, GG, HH and II look like : FF(a,b,c,d,Mj,s,ti) stands for a = b + ((a + F(b,c,d ) + Mj + ti) <<< s) GG(a,b,c,d,Mj,s,ti) , a = b + ((a + G(b,c,d ) + Mj + ti) <<< s) HH(a,b,c,d,Mj,s,ti) , a = b + ((a + H(b,c,d) + Mj + ti) <<< s) II(a,b,c,d,Mj,s,ti) , a = b + ((a + I(b,c,d ) + Mj + ti) <<< s) <<< is a left circular shift of s bits, Mj is the current 32 bits number coming from sub-block, and tj is a constant given by : tj = 2^32 * |sin j| and the 64 possible values for t are : 0xd76aa478, 0xe8c7b756, 0x242070db, 0xc1bdceee, 0xf57c0faf, 0x4787c62a, 0xa8304613, 0xfd469501, 0x698098d8, 0x8b44f7af, 0xffff5bb1, 0x895cd7be, 0x6b901122, 0xfd987193, 0xa679438e, 0x49b40821, 0xf61e2562, 0xc040b340, 0x265e5a51, 0xe9b6c7aa, 0xd62f105d, 0x02441453, 0xd8a1e681, 0xe7d3fbc8, 0x21e1cde6, 0xc33707d6, 0xf4d50d87, 0x455a14ed, 0xa9e3e905, 0xfcefa3f8, 0x676f02d9, 0x8d2a4c8a, 0xfffa3942, 0x8771f681, 0x6d9d6122, 0xfde5380c, 0xa4beea44, 0x4bdecfa9, 0xf6bb4b60, 0xbebfbc70, 0x289b7ec6, 0xeaa127fa, 0xd4ef3085, 0x04881d05, 0xd9d4d039, 0xe6db99e5, 0x1fa27cf8, 0xc4ac5665, 0xf4292244, 0x432aff97, 0xab9423a7, 0xfc93a039, 0x655b59c3, 0x8f0ccc92, 0xffeff47d, 0x85845dd1, 0x6fa87e4f, 0xfe2ce6e0, 0xa3014314, 0x4e0811a1, 0xf7537e82, 0xbd3af235, 0x2ad7d2bb, 0xeb86d391 Mj is equal to a 32 bits number corresponding to the 4 ASCII codes of the sub-block, starting at position j, in Big-Endian ; for example, if the 4 chars following position j in the sub-blocks are "seif", Mj will be equal to 0x66696573 : s = 0x73 e = 0x65 i = 0x69 f = 0x66, which gives in Little-Endian 0x66696573. The whole algorithm is the following one : Round 1: FF (a, b, c, d, M0, 7, 0xd76aa478) FF (d, a, b, c, M1, 12, 0xe8c7b756) FF (c, d, a, b, M2, 17, 0x242070db) FF (b, c, d, a, M3, 22, 0xc1bdceee) FF (a, b, c, d, M4, 7, 0xf57c0faf) FF (d, a, b, c, M5, 12, 0x4787c62a) FF (c, d, a, b, M6, 17, 0xa8304613) FF (b, c, d, a, M7, 22, 0xfd469501) FF (a, b, c, d, M8, 7, 0x698098d8) FF (d, a, b, c, M9, 12, 0x8b44f7af) FF (c, d, a, b, M10, 17, 0xffff5bb1) FF (b, c, d, a, M11, 22, 0x895cd7be) FF (a, b, c, d, M12, 7, 0x6b901122) FF (d, a, b, c, M13, 12, 0xfd987193) FF (c, d, a, b, M14, 17, 0xa679438e) FF (b, c, d, a, M15, 22, 0x49b40821) Round 2: GG (a, b, c, d, M1, 5, 0xf61e2562) GG (d, a, b, c, M6, 9, 0xc040b340) GG (c, d, a, b, M11, 14, 0x265e5a51) GG (b, c, d, a, M0, 20, 0xe9b6c7aa) GG (a, b, c, d, M5, 5, 0xd62f105d) GG (d, a, b, c, M10, 9, 0x02441453) GG (c, d, a, b, M15, 14, 0xd8a1e681) GG (b, c, d, a, M4, 20, 0xe7d3fbc8) GG (a, b, c, d, M9, 5, 0x21e1cde6) GG (d, a, b, c, M14, 9, 0xc33707d6) GG (c, d, a, b, M3, 14, 0xf4d50d87) GG (b, c, d, a, M8, 20, 0x455a14ed) GG (a, b, c, d, M13, 5, 0xa9e3e905) GG (d, a, b, c, M2, 9, 0xfcefa3f8) GG (c, d, a, b, M7, 14, 0x676f02d9) GG (b, c, d, a, M12, 20, 0x8d2a4c8a) Round 3: HH (a, b, c, d, M5, 4, 0xfffa3942) HH (d, a, b, c, M8, 11, 0x8771f681) HH (c, d, a, b, M11, 16, 0x6d9d6122) HH (b, c, d, a, M14, 23, 0xfde5380c) HH (a, b, c, d, M1, 4, 0xa4beea44) HH (d, a, b, c, M4, 11, 0x4bdecfa9) HH (c, d, a, b, M7, 16, 0xf6bb4b60) HH (b, c, d, a, M10, 23, 0xbebfbc70) HH (a, b, c, d, M13, 4, 0x289b7ec6) HH (d, a, b, c, M0, 11, 0xeaa127fa) HH (c, d, a, b, M3, 16, 0xd4ef3085) HH (b, c, d, a, M6, 23, 0x04881d05) HH (a, b, c, d, M9, 4, 0xd9d4d039) HH (d, a, b, c, M12, 11, 0xe6db99e5) HH (c, d, a, b, M15, 16, 0x1fa27cf8) HH (b, c, d, a, M2, 23, 0xc4ac5665) Round 4: II (a, b, c, d, M0, 6, 0xf4292244) II (d, a, b, c, M7, 10, 0x432aff97) II (c, d, a, b, M14, 15, 0xab9423a7) II (b, c, d, a, M5, 21, 0xfc93a039) II (a, b, c, d, M12, 6, 0x655b59c3) II (d, a, b, c, M3, 10, 0x8f0ccc92) II (c, d, a, b, M10, 15, 0xffeff47d) II (b, c, d, a, M1, 21, 0x85845dd1) II (a, b, c, d, M8, 6, 0x6fa87e4f) II (d, a, b, c, M15, 10, 0xfe2ce6e0) II (c, d, a, b, M6, 15, 0xa3014314) II (b, c, d, a, M13, 21, 0x4e0811a1) II (a, b, c, d, M4, 6, 0xf7537e82) II (d, a, b, c, M11, 10, 0xbd3af235) II (c, d, a, b, M2, 15, 0x2ad7d2bb) II (b, c, d, a, M9, 21, 0xeb86d391) After what, a is added to A, b to B, c to C and d to D. If there are another blocks to hash, the four rounds are done again, with the new values of A, B, C and D, which are copied into a, b, c and d. To get the 128 bits key, A, B, C and D are converted into Big-Endian again, and concatenated together under their hexadecimal expression. Example of MD5 Hash result : Name : Seifer Key : E795E19ECB28B8FC07D9049AFEFE3F39 I got it with DAMN Hash Calculator, which provides keys with standard hash algorithms. To finish, here is my source code in Win32 C for Tabmail v1.3, which uses MD5 as protection algorithm. Yeah, hashing is often very bad implemented in software protection algorithms. The only thing you have to do is to apply MD5 on a text block depending on the name. It would be better if the protection was serial only, and the serial would be hashed and compared to the good hashed serial. It would be then impossible to brute force the key and find the working serial... /***********************************************************************************************/ #include <windows.h> #include "resource.h" /* *MD5 Macros avoiding to use function with too much parameters */ #define Rotation(val,s) ((val << s) | (val >> (32-s))) #define F(X, Y, Z) ((X & Y) | (Z & ~X)) #define FF(a, b, c, d, dwn, s, n) a += F(b, c, d) + dwn + n; a = b + Rotation(a, s) #define G(X,Y,Z) ((X & Z) | (Y & ~Z)) #define GG(a ,b, c, d, dwn, s, n) a += G(b, c, d) + dwn + n; a = b + Rotation(a, s) #define H(X, Y, Z) (X ^ Y ^ Z) #define HH(a, b, c, d, dwn, s, n) a += H(b, c, d) + dwn + n; a = b + Rotation(a, s) #define I(X ,Y ,Z) (Y ^ (X | ~Z)) #define II(a, b, c, d, dwn, s, n) a += I(b, c, d) + dwn + n; a = b + Rotation(a, s) /* *Sets the name to lowercase and kills the spaces inside */ void LowerCase(const unsigned char *szName, unsigned char *szChk, const unsigned nLen) { unsigned i = 0, j = 0; for(i = 0; i < nLen; ++i) if(szName[i] != ' ') szChk[j++] = tolower(szName[i]); szChk[j] &= 0x00; } /* *Returns a little endian dword from name */ unsigned long M(const unsigned n, const unsigned char* mes) { return((mes[n+3] << 24) + (mes[n+2] << 16) + (mes[n+1] << 8) + mes[n]); } void renverser(unsigned char *szStr, const unsigned long n) { unsigned i; unsigned char tmp; strrev(szStr); for(i = 0; i < n; i+=2) {tmp = szStr[i]; szStr[i] = szStr[i+1]; szStr[i+1] = tmp;} } /* *main serial's generation function */ int keygen(HWND hwnd) { unsigned char szName[51] = {0}, szChk[31] = {0}; unsigned char str[101] = {0}; unsigned char szResult[36] = {0}; unsigned char min = 0xff, max = 0x00, x, y, z; unsigned char cha[9] = {0}, chb[9] = {0}, chc[9] = {0}, chd[9] = {0}; unsigned nLen, cLen ,i; unsigned long r; unsigned long a = 0x67452301, //MD5's chaining variables b = 0xefcdab89, c = 0x98badcfe, d = 0x10325476; unsigned long A = a, B = b, C = c, D = d; nLen = GetDlgItemText(hwnd, IDC_NAME, szName, 50); if(nLen <= 8) { SetDlgItemText(hwnd, IDC_SERIAL, "Please enter more than 8 chars."); return 0; } /*length of name can be in fact greater than 30 chars, but if the *final string (str here) is greater than 64 chars, MD5 is used more *times and the string has to be padded... *who cares of this ? a name is generally less than 30 chars */ else if(nLen > 30) { SetDlgItemText(hwnd, IDC_SERIAL, "Please enter less than 31 chars"); return 0; } r = (nLen + 0xd) << 3; LowerCase(szName, szChk, nLen); cLen = strlen(szChk); for(i = 0; i < cLen; ++i) { if(szChk[i] < min) min = szChk[i]; else if(szChk[i] > max) max = szChk[i]; } x = min ^ max; y = max & min; z = min | max; wsprintf(str, "%c%s%c%s%c%c", x, "Magdaleine", y, szName, z, 0x80); /*=============== MD5 PART ===============*/ /* *Round 1 : */ FF(a, b, c, d, M(0, str), 7, 0xd76aa478); FF(d, a, b, c, M(4, str), 12, 0xe8c7b756); FF(c, d, a, b, M(8, str), 17, 0x242070db); FF(b, c, d, a, M(12, str), 22, 0xc1bdceee); FF(a, b, c, d, M(16, str), 7, 0xf57c0faf); FF(d, a, b, c, M(20, str), 12, 0x4787c62a); FF(c, d, a, b, M(24, str), 17, 0xa8304613); FF(b, c, d, a, M(28, str), 22, 0xfd469501); FF(a, b, c, d, M(32, str), 7, 0x698098d8); FF(d, a, b, c, M(36, str), 12, 0x8b44f7af); FF(c, d, a, b, M(40, str), 17, 0xffff5bb1); FF(b, c, d, a, M(44, str), 22, 0x895cd7be); FF(a, b, c, d, M(48, str), 7, 0x6b901122); FF(d, a, b, c, M(52, str), 12, 0xfd987193); FF(c, d, a, b, r, 17, 0xa679438e); FF(b, c, d, a, M(60, str), 22, 0x49b40821); /* *Round 2 : */ GG(a, b, c, d, M(4, str), 5, 0xf61e2562); GG(d, a, b, c, M(24, str) , 9, 0xc040b340); GG(c, d, a, b, M(44, str), 14, 0x265e5a51); GG(b, c, d, a, M(0, str), 20, 0xe9b6c7aa); GG(a, b, c, d, M(20, str), 5, 0xd62f105d); GG(d, a, b, c, M(40, str), 9, 0x02441453); GG(c, d, a, b, M(60, str), 14, 0xd8a1e681); GG(b, c, d, a, M(16, str), 20, 0xe7d3fbc8); GG(a, b, c, d, M(36, str), 5, 0x21e1cde6); GG(d, a, b, c, r, 9, 0xc33707d6); GG(c, d, a, b, M(12, str), 14, 0xf4d50d87); GG(b, c, d, a, M(32, str), 20, 0x455a14ed); GG(a, b, c, d, M(52, str), 5, 0xa9e3e905); GG(d, a, b, c, M(8, str), 9, 0xfcefa3f8); GG(c, d, a, b, M(28, str), 14, 0x676f02d9); GG(b, c, d, a, M(48, str), 20, 0x8d2a4c8a); /* *Round 3 : */ HH(a, b, c, d, M(20, str), 4, 0xfffa3942); HH(d, a, b, c, M(32, str), 11, 0x8771f681); HH(c, d, a, b, M(44, str), 16, 0x6d9d6122); HH(b, c, d, a, r, 23, 0xfde5380c); HH(a, b, c, d, M(4, str), 4, 0xa4beea44); HH(d, a, b, c, M(16, str), 11, 0x4bdecfa9); HH(c, d, a, b, M(28, str), 16, 0xf6bb4b60); HH(b, c, d, a, M(40, str), 23, 0xbebfbc70); HH(a, b, c, d, M(52, str), 4, 0x289b7ec6); HH(d, a, b, c, M(0, str), 11, 0xeaa127fa); HH(c, d, a, b, M(12, str), 16, 0xd4ef3085); HH(b, c, d, a, M(24, str), 23, 0x04881d05); HH(a, b, c, d, M(36, str), 4, 0xd9d4d039); HH(d, a, b, c, M(48, str), 11, 0xe6db99e5); HH(c, d, a, b, M(60, str), 16, 0x1fa27cf8); HH(b, c, d, a, M(8, str), 23, 0xc4ac5665); /* *Round 4 : */ II(a, b, c, d, M(0, str), 6, 0xf4292244); II(d, a, b, c, M(28, str), 10, 0x432aff97); II(c, d, a, b, r, 15, 0xab9423a7); II(b, c, d, a, M(20, str), 21, 0xfc93a039); II(a, b, c, d, M(48, str), 6, 0x655b59c3); II(d, a, b, c, M(12, str), 10, 0x8f0ccc92); II(c, d, a, b, M(40, str), 15, 0xffeff47d); II(b, c, d, a, M(4, str), 21, 0x85845dd1); II(a, b, c, d, M(32, str), 6, 0x6fa87e4f); II(d, a, b, c, M(60, str), 10, 0xfe2ce6e0); II(c, d, a, b, M(24, str), 15, 0xa3014314); II(b, c, d, a, M(52, str), 21, 0x4e0811a1); II(a, b, c, d, M(16, str), 6, 0xf7537e82); II(d, a, b, c, M(44, str), 10, 0xbd3af235); II(c, d, a, b, M(8, str), 15, 0x2ad7d2bb); II(b, c, d, a, M(36, str), 21, 0xeb86d391); A += a; B += b; C += c; D += d; /*============== * END OF MD5 *============*/ //sets the 4 variables into big endian again, //yeah i know it is lame-coded :p wsprintf(cha, "%.8x", A); wsprintf(chb, "%.8x", B); wsprintf(chc, "%.8x", C); wsprintf(chd, "%.8x", D); renverser(cha, 8); renverser(chb, 8); renverser(chc, 8); renverser(chd, 8); wsprintf(szResult, "%s-%s-%s-%s", cha, chb, chc, chd); SetDlgItemText(hwnd, IDC_SERIAL, szResult); return 0; } /***********************************************************************************************/ This algorithm can easily be translated into another language like asm or delphi.

Conclusion

I hope you understood how this nice algorithm works, but if you have any questions, you can still contact me.

Greetings to...
 


All HellForge members, especially to falcon, mercution, mankind and dark wolf All ECLiPSE members, especially to inferno, zoltan, promethee, wizard, syntaxerror, wizdaz, phoenix, valiant, radio, tomjoad and the other fellahs The analyst, tamambolo, recca, ganjaman, magicraphoun, thigo, lutin noir, vrom, titi, each French cracker I know Everyone I know in #c4n

Seifer[ECLiPSE/HellForge]
The end.

Any mistakes, corrections, or comments may be mailed to the members individually, or to the group : hellforge@hellforge.org.