CRACKING OF WinXFiles

by

CASIMIR

Part A

With
A Supplementary Cryptanlysis by Mike Stay from
Bruce Schneier's Crypto-Gram

Other Essays by Casimir
  • Cracking of Crypt-o-Text v1.21 & v1.24
  • Correspondence From Casimir On Reversing Turbo Encrypto
  • Cracking of Encrypt-It For Windows
  • The Cracking of File Locker
  • The Cracking of Keeper
  • The Cracking of Gregory Braun's Crypto v3.5
  • The Cracking of SecurityPlus!
  • The Cracking of MasterKey v1.02/1.05
  • Read Mike Stay's cryptanalytic summary. Mike is a cryptographer at AccessData.

    The Crypto-Gram, as Bruce writes, is "a free monthly newsletter providing summaries, analyses, insights, and commentaries on cryptography and computer security. Bruce's site, of course, is Counterpane Systems. To subscribe, visit subscribe or send a blank message to Subscribe by mail.

    Software by PepSoft
    (You might need to use FTP Search to find the software.)
    Try searching for WXF.

    This essay presents in-extenso the reverse-engineering and cracking of an encryption utility distributed by PEPSOFTWARE Ltd, WinXFiles(TM). Map of the essay:

       Introduction
       Strategy
       Getting prepared
    
       PART A - REVERSING OF WINXFILES 
            1. First contact
            2. Where is the fork?
            3. The Check routine
            4. The making of Check_string
            5. The Remainder
            6. Decryption process
    
       PART B - ANALYSIS AND BREAKING OF WINXFILES
            Method 1: let's suppose Tail = 000...
                   Step 1: Solving
                   Step 2: Checking password found
                   Notes on Method 1
            Method 2: Tail = garbage
                   Step 1: How to find 2 characters from pwd
                   Step 2: System(s) solving
                   Step 3: Checking password found
                   Bad luck!
                   Notes on Method 2
    
       PART C - 'C' SOURCE OF CRACKER
    

    Introduction

    WinXFiles is a perfect example of what should NOT be done in order to efficiently encrypt data. The algorithm used may look 100% secure "on the paper", but doesn't resist patient analysis. It's a proprietary algo: no one knows (well...used to know) its inner working but its creators. You can be pretty sure that a PROPRIETARY algo is always a big CRAP. Why do you think those brainy guys don't get their nice algo published? To prevent jealous competitors from stealing the idea?

    Wrong: competitors would just laugh at them and crack it.

    But once coded, burried deep in a big .exe or .dll, with an attractive interface on top of it, who will ever know?

    On the opposite end, serious schemes (such as BlowFish or DES) can be found fully documented in many cryptography books, everyone knows how they operate, but no quick break has been found yet. Of course, some strong encryption schemes are fully patented, and their usage is submited to a strict regulation (for instance >32-bit BlowFish can not be used outside of the USA and Canada). Moreover, having an algo patented costs money.

    With a proprietary algo, you seldom face this sort of problem {:-).

    But, of course:

    "No one may modify or patch the WinXFiles executable files in any way, including but not limited to decompiling, disassembling, or otherwise reverse engineering the programs." (extract from WinXFiles'doc)
    I can not care less about such non-sense! Goto FraVia'page for interesting thoughts on subject "Is reverse-engineering illegal?" (WHAT? Don't tell me you don't know by heart his address: Fravia's Pages of Reverse Engineering)

    Indeed, the name of the provider of this program should be PEEPSHOW instead of PEPSOFT, 'cause this program is "specially" designed to protect your hot images collection from indiscrete viewing, particulary at work! BTW, you can also encrypt pictures of your mother-in-law (NOT naked, please) or just non sexually-oriented plaintext if you prefer, WinXFiles doesn't care...

    Strategy

    This is my third paper on encryption cracking, so I guess my methods are not that bad {:-)

    .

    When trying to find out which pwd has been used to encrypt a file, you always face the same problem: the HUGE number of pwd possible. For instance, WinXFiles accepts passwords with lengths between 5 and 30 characters. Upper/lower case, number, symbol are allowed, so a character can take approximatly 2^8 (256) values. This represents:

     2^(8*5) + 2^(8*6) + ... + 2^(8*29) + 2^(8*30) = 1.7E+72 (!) pwds to try out. 
    
    Of course, no computer is able of handling such a big number in a reasonnable laps of time. To obtain a cracker that works on a standard PC, we must REDUCE strongly this number. This can be achieved by gaining extra information on the inner working of WinXFiles, i.e. by reversing it.

    Getting prepared

    We will reverse WinXFiles 32-bit v3.3 (WXFILES.EXE : 904192 bytes). Versions 2.7, 3.3, 3.4 (and probably others) work the same way, so cracker will work on all of them. Get it on PEPSOFTWARE'site: PepSoftware

    Tools needed:

    PART A - REVERSING OF WINXFILES

    In this chapter, we're going to collect some precious info on WXF, which will help us to break it in PART B.

    1. First contact

    Choose a few text and/or picture files with different names'length (ex: text.txt, picture.txt...), and encrypt them with various pwds. WXF remembers good pwd and original file's name. You will also discern something interesting:

    whatever original file was. This may mean that WXF adds something to encrypted data, probably a header or a footer. I guess that's where pwd and original name are stored, under an encrypted form {;-)

    OK, enough assumptions, let's Winice a bit.

    We will watch carefully the actions performed by WXF, from introduction of input (a pwd, good or bad) to the end of pwd'checking procedure. There's no need to study the decryption process in order to crack WXF (but it may be interesting anyway).

    
               I   ------------------      G                    ------------
       Input ----->|checking routine|-------------------------->|decryption|  
         ^         ------------------   |         Good, go on   ------------
         |                              | W          
         -----------<--------------------
                    Wrong, try again
    
    
    Create a text file called EXAMPLE.TXT, with string ABCDEF (6 bytes) inside. Encrypt it with pwd CASIMIR (UPPERCASE!). You will obtain encrypted file EXAMPLE.XFD.

    Now fire Winice, the hunt begins. Select EXAMPLE.XFD from list for decryption, enter bogus pwd: 13579 in little window, but don't press OK yet.

    2. Where is the fork?

    We look for branch I (see scheme above) in code. To pin-point it, we must find out which Windows'function collects input. Check functions most often used for this purpose (GetDlgItemTextA, GetWindowTextA) by setting BPX in Winice, then press OK. It doesn't work: Winice doesn't pop up, and you get the Bad Pwd message {:-(.

    Nevermind, there's always a work-around: instead of I, we'll try to localize G and W. Branch G can be localized easily: when pwd is good, WXF must write decrypted file to disk. So we set a BPX on function: WriteFile, and Winice pops up at beginning of WriteFile code. Return from function (F11), ok, WriteFile was called from CS:404580. Now we work backward to find out the fork where branches G and W separate. We return from portion 4045xx: call to this portion took place at code location CS:47332F

    
          :0047332F    call 0040455C 
    
    Obviously, fork is located *before* offset 47332F in code, at a conditional jump (je, jne). There are (at least) four suspect (see listing below). Let's set BPXs on all of them, and trace execution as we enter respectively a Good (G) pwd and a Wrong (W) pwd:
    
    
          :004732BC    mov eax, dword ptr [ebp+F940]          G W
          :004732C2    pop edx                                G W
          :004732C3    call 00406104                          G W
          :004732C8    test eax, eax                          G W
    fork? :004732CA    je 004732D9                            G W
          :004732CC    mov byte ptr [0047CB0C], 01              W
          :004732D3    inc dword ptr [0047CB08]                 W
          :004732D9    cmp byte ptr [0047CB0C], 00              W
    fork? :004732E0    jne 00473370                           G W
          :004732E6    lea eax, dword ptr [ebp-14]            G
          :004732E9    push eax                               G
          :004732EA    mov edx, 0047BB04                      G
          :004732EF    mov ecx, 00001000                      G
          :004732F4    lea eax, dword ptr [ebp+FE98]          G
          :004732FA    call 004044F8                          G
          :004732FF    cmp dword ptr [ebp-14], 00000000       G
    fork? :00473303    je 00473366                            G
          :00473305    cmp dword ptr [ebp-08], 00000001       G
    fork? :00473309    jne 00473315                           G
          :0047330B    mov eax, dword ptr [ebp-14]
          :0047330E    call 004739C4
          :00473313    jmp 0047331D
          :00473315    mov eax, dword ptr [ebp-14]            G 
          :00473318    call 00473A3C                          G
          :0047331D    lea eax, dword ptr [ebp-18]            G
          :00473320    push eax                               G
          :00473321    mov edx, 0047BB04                      G
          :00473326    mov ecx, dword ptr [ebp-14]            G
          :00473329    lea eax, dword ptr [ebp+FD4C]          G
          :0047332F    call 0040455C //call to WriteFile      G
    
    
    The fork we're looking for is the one at CS:4732CA. Branching depends on value returned by call to CS:406104 (return value is stored in register eax):
    
         -> eax     EQUAL 0 :  GOOD  pwd
         -> eax NOT EQUAL 0 : WRONG  pwd
    

    3. The Check routine

    Let's investigate what's going on down here, at CS:406104. Before following call, have a look at arguments (eax and edx) passed with call. Dump ds:eax and ds:edx :

         -> eax points to garbage (in fact, a check string which contains:
                                   57.66.5F.5F.64.64.64.3C.5E.72.6C.96.57...)
    
         -> edx points to our Input string ("13579" : 31.33.35.37.39 in hexa) 
    
    
    :00406104   push esi      //entry point
    :00406105   push edi
    :00406106   push ebx
    :00406107   mov esi, eax  //ds:esi contains Check_string
    :00406109   mov edi, edx  //ds:edi contains Input_string
    :0040610B   or eax, eax
    :0040610D   je 00406112
    :0040610F   mov eax, dword ptr [eax-04]  //Check_string'length (0x0D)
    :00406112   or edx, edx
    :00406114   je 00406119
    :00406116   mov edx, dword ptr [edx-04]  //Input_string'length (0x05) 
    :00406119   mov ecx, eax
    :0040611B   cmp ecx, edx  //same lengths?
    :0040611D   jbe 00406121
    :0040611F   mov ecx, edx  //if different lengths, take smallest
    :00406121   cmp ecx, ecx
    :00406123   repz
    :00406124   cmpsb        //COMPARE Input_string TO Check_string
    :00406125   je 00406151  //if strings are the same  : set eax to 0 and return
              .              //if strings are different : set eax to a non-zero 
              .              //value and return
              .
              . 
    :00406156   ret
    
    In this case, Check_string differs from Input_string, so WXF refuses to decrypt.

    Now, try to decrypt file with Input_string: CASIMIR, you'll obtain Check_string: CASIMIR, and WXF will decrypt the file.

    So WXF will let you in if and only if:

                        Check_string = Input_string
    
    
    Obviously, this equality should occur only when input_string = good pwd.

    BTW, this is NOT always the case with WXF: check routine is not restrictive enough, it always lets you in when you give the good pwd, BUT in some special cases, it lets you in even if you gave a wrong pwd (I told you it was a crap)!

    Note:

    Instead of building a Check_string (we still don't know HOW), WXF could have extracted good pwd from encrypted file and checked it against Input_string. Right, but it would have been a huge flaw! Once decrypted, poor little pwd is just like a white lamb in dark code jungle, it can't hide from hungry Winice {;-) BTW, i discovered some encryption applications that do that mistake! (have a look at Turbo Encrypto for instance). No need to add that now they rest in peace...
    OK, in order to crack WXF, we must understand HOW the hell Check_string is built. That's the main purpose of next chapter.

    4. The making of Check_string

    Well, we can reasonably assume that WXF extracts Check_string from EXAMPLE.XFD, so we'll set a breakpoint on function ReadFile. Winice pops up many times, but the location which raises our interest is the one at CS:4731E8.

    :004731D4  lea eax, dword ptr [ebp-14]
    :004731D7  push eax
    :004731D8  mov edx, 0047BB04  //start storing chars read at DS:47BB04
    :004731DD  mov ecx, 000000FF  //reads FF (255) characters from file
    :004731E2  lea eax, dword ptr [ebp+FE98]
    :004731E8  call 004044F8      //go and read EXAMPLE.XFD
    
    After call, dump memory at DS:47BB04. This is the beginning of file EXAMPLE.XFD, the first 255 characters. Huuummm, seems that header assumption was right! And now, we just have to monitor access to that data zone, setting a BPR DS:47BB04 DS:47BB04+FF RW. We quickly discover that a routine is fiddling with our header:
    
    
         :473A3A  mov eax,eax   //eax=FF
                ...
         :473A40  xor esi,esi   //esi=0
         :473A42  xor ecx,ecx   //ecx=0 
         :473A44  mov cl,byte ptr [0047BAF8] //cl=mysterious_value 
         :473A4A  mov edi,eax   //edi=FF (255 chars from Header to read)
         :473A4C  test edi,edi
         :473A4E  jle 00473AB9
         :473A50  mov eax,0047BB04           //ds:eax -> Header
    ****>:473A55  movzx ebp,byte ptr [eax]   //reads one char from Header
    *    :473A58  dec ecx       //cl--
    *    :473A59  inc esi       //esi++ (counts chars read from Input)
    *    :473A5A  test ecx, ecx
    *    :473A5C  jne 00473A66  //jump if ecx>0
    *    :473A5E  xor ecx,ecx   //ecx=0
    *    :473A60  mov cl,byte ptr [0047BAF8] //cl=mysterious_value 
    *    :473A66  xor edx,edx   //edx=0
    *    :473A68  mov dl,byte ptr [0047B7E8] //dl=Input's length 
    
    Dump DS:47B7E8, you'll see: 05.31.33.35.37.39. Yes, this is our Input ('13579'), along with input's length (5 characters). So register dl contains Input's length. OK, let's see what's next...
    
    *    :473A6E  inc edx
    *    :473A6F  cmp esi,edx  
    *    :473A71  jne 00473A78 //jump if char(s) to read from Input yet
    * L  :473A73  mov esi,1    //restart reading Input from the beginning
    * O  :473A78  xor edx,edx  //edx=0
    * O  :473A7A  mov dl,byte ptr [esi+0047B7E8] //dl=char 'esi' from Input
    * P  :473A80  test cl,1
    *    :473A83  je 00473A89  //jump if cl is even
    * 2  :473A85  mov edx,ecx  //edx=ecx
    * 5  :473A87  jmp 00473A8D
    * 5  :473A89  mov edx,ecx  //edx=ecx
    *    :473A8B  neg edx      //edx=(100-edx)
    * T  :473A8D  xor ebx,ebx  //ebx=0
    * I  :473A8F  mov bl,byte ptr [esi+0047B7E8] //bl=char i from Input      
    * M  :473A95  sub edx,ebx  //edx-=char 'esi' from Input
    * E  :473A97  add edx,ebp  //edx+=char 'eax' from Header
    * S  :473A99  cmp edx,FF
    *    :473A9F  jle 00473AA9 //jump if edx<=FF
    *    :473AA1  sub edx,100  //edx-=100, now we have: edx in [0;FF]
    *    :473AA7  jmp 00473AB3
    *    :473AA9  test edx,edx  
    *    :473AAB  jge 00473AB3 //jump if edx>=0 
    *    :473AAD  add edx,100  //edx+=100, now we have: edx in [0;FF]
    *    :473AB3  mov byte ptr [eax],dl  //crushes Header whith Check_string
    *    :473AB5  inc eax  //eax++, ready to read char i+1 from Header
    *    :473AB6  dec edi  //counter--
    *****:473AB7  jne 00473A55 //jump if char(s) to read from Header yet
                ...
         :473ABD  ret
    
    We discern that Header is remplaced by Check_string. To sum up:
    
          Check_string[i] = (Header[i]-Input[i modulo 5]+cl[i]) +/- 100
    
          and
           
           * if mysterious_value is EVEN:
               cl[0]     = mysterious_value - 1
               cl[2*i]   = cl[0] - (2*i)
               cl[2*i+1] = 100 - cl[0] + (2*i+1)
          
           * if mysterious_value is ODD:
               cl[0]     = 100 - mysterious_value - 1
               cl[2*i]   = 100 - cl[0] + (2*i)
               cl[2*i+1] = cl[0] - (2*i+1) 
    
    

    WATCH OUT:

    Once mysterious_value is known, we can calculate cl at any rank (cl[0], cl[1],...,cl[100],...). It's like pseudo-random numbers: if you know the seed, you know everything.

    For instance, here is what we obtain with EXAMPLE.XFD and Input="13579":

    
    mysterious_value = 0A (even)
    So we have      cl[0]             = 0A-1           = 09
               i=0: cl[1] = cl[2*0+1] = 100-09+(2*0+1) = F8
               i=1: cl[2] = cl[2*1]   = 09-(2*1)       = 07
               i=1: cl[3] = cl[2*1+1] = 100-09+(2*1+1) = FA
               i=2: cl[4] = cl[2*2]   = 09-(2*2)       = 05
               i=2: cl[5] = cl[2*2+1] = 100-09+(2*2+1) = FC
               i=3: ect... 
            
    
    
                  <---------------  255 characters  ------------------>
    Header(bef.): 7F  A1  8D  9C  98  99  94  73  94  B5  94  D1  85...
                + 
              cl: 09  F8  07  FA  05  FC  03  FE  01  F6  09  F8  07...
                - <---------------->  <---------------->  <---------  
           Input: 31  33  35  37  39  31  33  35  37  39  31  33  35... 
                = -----------------------------------------------------
    Header(aft.): 57 166  5F 15F  64 164  64 13C  5E 172  6C 196  57...
    
    So we obtain:
    
    Check_string= 57  66  5F  5F  64  64  64  3C  5E  72  6C  96  57...
                = Wf__ddd<^rlûW...
    
    
    But what happens if we enter the good pwd ('CASIMIR')? Well, if pwd is correct we obtain following structure:
    
                  <---------------  255 characters  ------------------>
    Header(aft.): E X A M P L E . T X T | C A S I M I R \ 0 0 0 0 0...0
                                        | Check_string  \
    
    (0 means 0x00)
    

    5. The Remainder

    The routine that builds Check_string (see above) reads mysterious_value at memory location DS:47BAF8. As usual, we set a BPR on this location, and we discover that mysterious_valus is produced by a small routine:

    
        :00472A07  nop
        :00472A08  push ebx
        :00472A09  xor ecx,ecx  //ecx=0 (sum of Input's characters)
        :00472A0B  xor edx,edx  //edx=0 (counter for characters read from Input)
        :00472A0D  mov dl,byte ptr [0047B7E8]  //dl=Input's length
        :00472A13  test edx,edx
        :00472A15  jle 00472A26        //jump if Input's length=0
        :00472A17  mov eax,0047B7E9    //ds:47B7E9 -> Input (31.33.35.37.39)
    ***>:00472A1C  xor ebx,ebx   //ebx=0
    *   :00472A1E  mov bl,byte ptr [eax]  //bl=character i from Input
    *   :00472A20  add ecx,ebx   //ecx+=character i from Input  
    *   :00472A22  inc eax       //ready to read character i+1
    *   :00472A23  dec edx       
    ****:00472A24  jne 00472A1C  //jump if char(s) to read from Input yet
        :00472A26  mov eax,ecx
        :00472A28  mov ecx,FF
        :00472A2D  cdq
        :00472A2E  idiv ecx //edx=remainder of (chars'sum divided by FF)
        :00472A30  mov byte ptr [0047BAF8], dl //writes mysterious_value
        :00472A36  pop ebx
        :00472A37  ret
    
    So we have:
                                     Sum of Input's characters
    mysterious_value = remainder of: -------------------------
                                                 FF
    
    With Input= 13579, we get: remainder of [(31+33+35+37+39)/FF] = 0A

    We'll always have: remainder included in [00 ; FE].

    6. Decryption process

    In fact, we don't even need to know how WXF proceeds to decrypt a file, because we only look for the *password* stored in Header. Anyway, it may be of some interest, for instance to add a viewer to the cracker we're going to write (so you don't need WXF anymore {;-) ).

    Enter correct pwd, then survey calls to ReadFile function occuring AFTER fork. There is a call at CS:40451D, which copys encrypted data from file to DS:47BB04:

    Set a BPR DS:47BB04 DS:47BB04+6 RW to survey access to this zone. We quickly discover that WXF proceeds exactly the same to decrypt Header as to recover original data. The routine we studied in chapter 4 is used again by WXF, with Input=CASIMIR:
      
                     C   A   S   I   M   I   R 	
     remainder of [(43+ 41+ 53+ 49+ 4D+ 49+ 52)/FF] = 0A (even) 
          
    encrypted DATA: 7B  8B  8F  93  8D  93
                  +
            Remain: 09  F8  07  FA  05  FC 
                  -
             Input: 43  41  53  49  4D  49
                  = ----------------------
    decrypted DATA: 41 142  43 144  45 146
           +/- 100: 41  42  43  44  45  46
             ASCII:  A   B   C   D   E   F
    
    

    Note:

         -encrypted Pictures get the extension : XFP,
         -encrypted Data (text)   "    "       : XFD, 
    but encryption process is exactly the same for picture and text. Extension is only a way to determine correct viewer to be used.

    OK, we have stolen enough info from WXF to be able to crack it right now {:-)

    
    Click here for Part B.

    Copyright December 1998 by Casimir.

    Mail Casimir

    Converted to hypertext by Joe Peschel December 14, 1998.

    Revised by Joe Peschel Feb. 18, 1999.