PE Files Import Table Rebuilding - Written by TiTi/BLiZZARD

1. For what is it useful?
=========================

Well I wrote this essay because I was working on a process dumper, when I 
saw that many compressors/encrypters make the import table unusable, and 
then, the dumped executables needed to have their import table rebuilt. 
I saw no essay about this on common Win32ASM sites, so here is a little 
help if you are interested.

For example, any Petite v2.1 compressed executable, after having been
dumped from memory, needs to have its import table rebuilt (more precisely,
corrected) in order to run properly. The same goes for ASPack, PEPack,
PESentry... That's why import table rebuilding functions are needed in any
dumper (for example Phoenix Engine by G-RoM/UCF, included in ProcDump, or
PE Rebuilder by Virogen and me).

Well, as this subject is very specific and quite complicated, I'll
assume that you already know the PE file structure.

2. Some preliminary comments
============================

Firstly, some quick information about import tables and RVA/VA.

The relative virtual address (RVA) of the import table is stored in the
corresponding directory entry in the PE header (at [offset peheader+80h]). As it
is an address relative to the image base in memory, it won't match the position
of the import table in the file (except if the file has just been purely dumped
from memory). So, the first thing you have to do to find the import table in a
PE file is to convert this RVA to the corresponding VA, that is, the address of
the same data inside the file once it is mapped in memory. There are several
ways to do that: you can write a personal routine that parses the section table
and calculates the VA, but the easiest way is to use an API that is especially
designed for this. This API is in IMAGEHLP.DLL (a library used on both Win9x and
NT systems), and its name is ImageRvaToVa. Here is its description (get full
detail in the MSDN library):

# LPVOID ImageRvaToVa(
#  IN PIMAGE_NT_HEADERS NtHeaders,
#  IN LPVOID Base,
#  IN DWORD Rva,
#  IN OUT PIMAGE_SECTION_HEADER *LastRvaSection
#);
#
#Parameters :
# NtHeaders 
#   Pointer to an IMAGE_NT_HEADERS structure. This structure
#  can be obtained by calling the ImageNtHeader function.
# Base
#   Specifies the base address of an image that is mapped
#  into memory through a call to the MapViewOfFile function. 
# Rva 
#  Specifies the relative virtual address to locate. 
# LastRvaSection 
#   Pointer to an IMAGE_SECTION_HEADER structure that
#  specifies the last RVA section. This is an optional
#  parameter. When specified, it points to a variable that
#  contains the last section value used for the specified
#  image to translate an RVA to a VA. 

You see it's pretty simple to use. You just have to map your PE file in memory
and call this function to get the valid VA of the import table. Note that I'll
skip all RVA/VA remarks in the following, but don't forget to convert one to the
other when you read/write RVAs from/to the PE file you are rebuilding.
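
If you prefer the personal routine mentioned above, here is a minimal sketch of
the idea in Python. It is not how ImageRvaToVa is implemented, only the usual
section walk. The Section tuple is a hypothetical, simplified stand-in for
IMAGE_SECTION_HEADER, and the section values are made up for illustration.

```python
from collections import namedtuple

# Hypothetical, simplified view of an IMAGE_SECTION_HEADER: only the
# three fields needed for the conversion are kept.
Section = namedtuple("Section", "virtual_address virtual_size raw_pointer")

def rva_to_offset(rva, sections):
    """Return the file offset matching rva, or None if no section holds it."""
    for sec in sections:
        if sec.virtual_address <= rva < sec.virtual_address + sec.virtual_size:
            return rva - sec.virtual_address + sec.raw_pointer
    return None

# Made-up section table, for illustration only.
sections = [
    Section(0x1000, 0x0800, 0x0400),
    Section(0xC000, 0x1000, 0x2600),
]

print(hex(rva_to_offset(0xC1E8, sections)))   # prints 0x27e8
```

Adding the base address of the mapped file to this offset gives the pointer you
would otherwise get from ImageRvaToVa.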

3. Full explanation
===================

Here is a full sample of an altered import table (This is the import table of
a PE file that has been compressed with Petite v2.1, and then has been
directly dumped from memory) :

00 are represented by ''
non-strings are represented by '-'

0000C1E8h : 00 00 00 00 00 00 00 00 00 00 00 00 BA C2 00 00  ----
0000C1F8h : 38 C2 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ----
0000C208h : C5 C2 00 00 44 C2 00 00 00 00 00 00 00 00 00 00  --------
0000C218h : 00 00 00 00 D2 C2 00 00 54 C2 00 00 00 00 00 00  --------
0000C228h : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
0000C238h : 7F 89 E7 77 4C BC E8 77 00 00 00 00 E6 9F F1 77  ------------
0000C248h : 1A 38 F1 77 10 40 F1 77 00 00 00 00 4F 1E D8 77  ------------
0000C258h : 00 00 00 00 00 00 4D 65 73 73 61 67 65 42 6F 78  MessageBox
0000C268h : 41 00 00 00 77 73 70 72 69 6E 74 66 41 00 00 00  AwsprintfA
0000C278h : 45 78 69 74 50 72 6F 63 65 73 73 00 00 00 4C 6F  ExitProcessLo
0000C288h : 61 64 4C 69 62 72 61 72 79 41 00 00 00 00 47 65  adLibraryAGe
0000C298h : 74 50 72 6F 63 41 64 64 72 65 73 73 00 00 00 00  tProcAddress
0000C2A8h : 47 65 74 4F 70 65 6E 46 69 6C 65 4E 61 6D 65 41  GetOpenFileNameA
0000C2B8h : 00 00 55 53 45 52 33 32 2E 64 6C 6C 00 4B 45 52  USER32.dllKER
0000C2C8h : 4E 45 4C 33 32 2E 64 6C 6C 00 63 6F 6D 64 6C 67  NEL32.dllcomdlg
0000C2D8h : 33 32 2E 64 6C 6C 00 00 00 00 00 00 00 00 00 00  32.dll

Well as you can see this import table is divided into three main
parts :

  - From C1E8h->C237h: The array of IMAGE_IMPORT_DESCRIPTOR structures,
                       each one corresponds to one imported DLL file.
                       This array is ended by a struct filled with 0.
                       
   IMAGE_IMPORT_DESCRIPTOR struct
      OriginalFirstThunk  	dd 0  ;RVA to original unbound IAT
      TimeDateStamp		dd 0  ;not used here
      ForwarderChain		dd 0  ;not used here
      Name			dd 0  ;RVA to DLL name string
      FirstThunk		dd 0  ;RVA to IAT array
   IMAGE_IMPORT_DESCRIPTOR ends

  - From C238h->C25Bh: The arrays of DWORDs called 'IAT', pointed to by the
                       FirstThunk members of the IMAGE_IMPORT_DESCRIPTOR
                       structs. Each DWORD of these arrays corresponds to
                       an imported function, and each array ends with a
                       null DWORD.

  - From C25Ch->C2DDh: These are the strings of the imported functions
                       and DLL files. One problem is that there is no
                       predefined order: sometimes the DLL names come
                       before the functions, sometimes it is the opposite,
                       and sometimes they are mixed up.
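
To make the first part more concrete, here is a small Python sketch that walks
the IMAGE_IMPORT_DESCRIPTOR array of the sample dump. The bytes are copied from
C1E8h->C237h above; the function name read_descriptors is mine, for
illustration only.

```python
import struct

# First part of the sample dump (C1E8h..C237h), copied byte for byte.
raw = bytes.fromhex(
    "00 00 00 00 00 00 00 00 00 00 00 00 BA C2 00 00 "
    "38 C2 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "
    "C5 C2 00 00 44 C2 00 00 00 00 00 00 00 00 00 00 "
    "00 00 00 00 D2 C2 00 00 54 C2 00 00 00 00 00 00 "
    "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "
)

def read_descriptors(data):
    """Yield (Name, FirstThunk, OriginalFirstThunk) until the all-zero entry."""
    for off in range(0, len(data) - 19, 20):
        oft, stamp, chain, name, first = struct.unpack_from("<5I", data, off)
        if not (oft or stamp or chain or name or first):
            break   # the array is ended by a struct filled with 0
        yield name, first, oft

for name, first, oft in read_descriptors(raw):
    print(f"Name={name:04X}h FirstThunk={first:04X}h OriginalFirstThunk={oft:08X}h")
```

It finds three descriptors (Name C2BAh, C2C5h and C2D2h), all of them with
OriginalFirstThunk set to 0.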

Little explanation about the import tables
------------------------------------------

The OriginalFirstThunk array is the one the PE loader looks at first.
If it is present, the PE loader will use it to correct possible
problems in the FirstThunk IAT array. Once loaded into memory, each
dword of the FirstThunk array, which contained an RVA to the function
name string, is replaced by the real function's address (the location
in memory that will be executed when calling this function).
So, basically, there is no import table problem, provided that the
OriginalFirstThunk array stays unchanged.

Here we come to our problem
---------------------------

Well, after this short description, we come to the problem. If you try
to run the executable containing the import table shown above, it won't
load, and Windows will display an error message. Why? Simply because
the OriginalFirstThunk array has been deleted.

Actually, you notice that, for each IMAGE_IMPORT_DESCRIPTOR struct
of this import table, the OriginalFirstThunk member is 00000000h.
So we deduce that, when launching the exe, the PE loader will try
to get the imported function names from the FirstThunk array. BUT, as
you can notice, this array doesn't contain RVAs to function name strings
anymore, but the addresses of the functions in memory.
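
A dumper can spot this situation automatically. Here is a rough Python sketch
of one possible test, not a definitive one: an IAT entry that is neither an
RVA inside the image nor an ordinal import must be a leftover address. The
SIZE_OF_IMAGE value and the function name looks_like_address are assumptions
of mine.

```python
import struct

# Assumed value for this sketch; in a real file it is read from the
# SizeOfImage field of the optional header.
SIZE_OF_IMAGE = 0xD000

def looks_like_address(thunk, size_of_image=SIZE_OF_IMAGE):
    """True if thunk is neither an ordinal import nor an RVA inside the image."""
    # An ordinal import has the high bit set and only a 16-bit ordinal below it.
    is_ordinal = bool(thunk & 0x80000000) and (thunk & 0x7FFF0000) == 0
    return not is_ordinal and thunk >= size_of_image

# The USER32.dll entries of the sample dump (C238h..C243h).
iat = bytes.fromhex("7F89E777 4CBCE877 00000000")
thunks = [t for (t,) in struct.iter_unpack("<I", iat) if t != 0]

print([looks_like_address(t) for t in thunks])   # prints [True, True]
```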

What we have to do
------------------

Now, what we have to do to get this executable to work is to rebuild
the FirstThunk array members, to make them point again to functions name
strings we can see in the 3rd part of the import table.

Basically, that's not a very hard task, but we need to know which IAT
entry corresponds to which function, as the function strings are not
sorted the same way as the FirstThunk members.

So, for each IAT entry, we need to identify the function name it
corresponds to. (Actually, we already have the DLL name, thanks to the
IMAGE_IMPORT_DESCRIPTOR.Name DWORD, which hasn't been changed of course.)

How to identify each function
-----------------------------

As we saw above, each corrupted IAT entry holds the function's address
in memory. These addresses do not change from one session to another (as
long as the same DLL is loaded at the same base address), so we just have
to find the function whose address is stored in the corrupted IAT entry,
and make the entry point to that function's name string.

For this, there is a very useful API in Kernel32.dll: GetProcAddress. It
gives you the address of a given function. Here is its description:

FARPROC GetProcAddress(
    HMODULE	hModule,	// handle to DLL module
    LPCSTR	lpProcName 	// name of function 
);


So, basically, for a given corrupted IAT, we just have to parse all function
names contained in the 3rd part of the import table, until GetProcAddress
returns the address of the function we are looking for.

   - The hModule parameter is the handle of the DLL module (that is to say
    the Base Address of the module's image in memory), that we can get using
    the well known GetModuleHandleA API :

    HMODULE GetModuleHandle(
        LPCTSTR  lpModuleName 	// address of module name to return handle for
    );

     (The lpModuleName just has to point to the DLL filename string we get
     from the IMAGE_IMPORT_DESCRIPTOR.Name member)

   - the lpProcName just points to the function name string.
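
The search itself can be sketched like this in Python. To keep the example
runnable anywhere, the resolve argument is a stand-in for the
GetModuleHandleA/GetProcAddress pair: here it is just a dictionary lookup.
The second address/name pair is deduced from the sample table (USER32.dll has
only two entries), and find_name is a name of mine.

```python
def find_name(address, candidate_names, resolve):
    """Return the first name whose resolved address equals address, else None."""
    for name in candidate_names:
        if resolve(name) == address:
            return name
    return None

# Stand-in for GetProcAddress(GetModuleHandleA("USER32.dll"), name).
# A name the DLL does not export gives None, like a NULL return.
fake_user32 = {"wsprintfA": 0x77E7897F, "MessageBoxA": 0x77E8BC4C}

# All function names found in the 3rd part of the sample import table.
names = ["MessageBoxA", "wsprintfA", "ExitProcess", "LoadLibraryA",
         "GetProcAddress", "GetOpenFileNameA"]

print(find_name(0x77E7897F, names, fake_user32.get))   # prints wsprintfA
```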

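Note that the WORD stored at [offset functionname - 2] is the hint of the
function (its expected index in the DLL's export name table), not part of its
name. Functions are also sometimes imported by ordinal number: in that case
there is no name string at all, and the original IAT entry was the ordinal with
its high bit (80000000h) set. So, your parsing routine will have to check if
each function is imported by name or by ordinal.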

Example using the import table above
------------------------------------

I'll explain how to fix the first imported function of the first imported
DLL of the sample import table above.

 1. We look at the first IMAGE_IMPORT_DESCRIPTOR struct of the array (C1E8h),
   and get the DLL name, pointed by the .Name member (C1F4h, that points to
   C2BAh). We see that it's USER32.dll.

 2. We look at the .FirstThunk member, that points to an array of IAT; each
   one corresponding to 1 imported function from this DLL (user32.dll). In
   this case, that is C1F8h, that points to C238h. So, at C238h, we have our
   corrupted IATs to fix. (You can notice that this IAT array contains 2
   dwords, so 2 functions are imported from this DLL).

 3. Let's get the first corrupted IAT. Its value is 77E7897Fh. This is the
   address of the function in memory.

 4. For each function name in the 3rd part of the import table, we call the
   GetProcAddress API. When this API returns 77E7897Fh, that's it, we reached
   the right function. So we make the corrupted IAT point to the right
   function name. (in this case that is 'wsprintfA').

 5. Now we just have to make the IAT point to : offset(function Name string)-2.
   Why -2? Because each function name string is preceded by a WORD (the hint),
   and the IAT entry must point to that WORD, not to the string itself.
   So, in this case, we change the content of the addr C238h to make it point
   to C26Ah (instead of 77E7897Fh).

 6. That's it, this function is fixed, now you just have to repeat this process
   on all IATs.
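
Steps 4 and 5 can be checked with a few lines of Python on the bytes of the
sample dump (C238h->C277h). This is only a sketch: patch_thunk is a name of
mine, and the plain substring search it uses could be fooled by a name that
ends with another function's name.

```python
import struct

BASE = 0xC238   # RVA of the first byte of `table`
table = bytearray.fromhex(
    "7F 89 E7 77 4C BC E8 77 00 00 00 00 E6 9F F1 77 "   # C238h
    "1A 38 F1 77 10 40 F1 77 00 00 00 00 4F 1E D8 77 "   # C248h
    "00 00 00 00 00 00 4D 65 73 73 61 67 65 42 6F 78 "   # C258h
    "41 00 00 00 77 73 70 72 69 6E 74 66 41 00 00 00 "   # C268h
)

def patch_thunk(data, base, thunk_rva, name):
    """Make the IAT entry at thunk_rva point 2 bytes before name in data."""
    name_off = data.find(name.encode("ascii") + b"\0")
    if name_off < 2:
        raise ValueError("name string not found")
    new_value = base + name_off - 2          # RVA of the WORD before the name
    struct.pack_into("<I", data, thunk_rva - base, new_value)
    return new_value

print(hex(patch_thunk(table, BASE, 0xC238, "wsprintfA")))   # prints 0xc26a
```

After the call, the four bytes at C238h are 6A C2 00 00 instead of 7F 89 E7 77.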

Last Notes
----------

Well, I described the general process. Of course it will only work for DLLs that
are currently loaded into memory. For the others, you'll have to load them, or
traverse their export tables to find the right function addresses.