               NASA project 99-U2-0402/29D  "Indian Summer Sky"



                             Flexible MASM/TASM
                             ------------------
                      Controlled compiling and linking




1) Compiler
-----------
MASM is FREE (it ships with the DDK) and not so bad. When you develop
applications for an OS, it is good to use the tools provided by the creator of
that OS. The NASM and TASM lobbies are stronger (make more noise) than the MASM
one. I was an orthodox TASM fan too. But all ?ASM groups use MS LINK; TLINK32
is too rigid and has no options. I haven't yet seen a VxD written in an
assembler other than MASM and linked with a linker other than LINK. And what
about debug symbols, ha? NASM people use MS LINK and MS libraries in 60% of
cases (their sources contain CALL API@X), so every discussion about what is
commercial and what's not is redundant.
Of course the best is to know every ASM. If you decide to write Win32
applications in MASM, I recommend downloading Hutch's MASM32 package.


2) Linker
---------
The linker is a strategic tool for every SW company - it is the thing that
produces binaries. Besides ntoskrnl.exe, LINK.exe is the best thing Microsoft
has made. With LINK you can do anything. Collect every version of LINK you come
across. I have 2.60, 5.00, 5.12 and 6.00. Versions <= 5.00 produce PE and LE
files without the nonsense 'Rich' zone between the MZ and PE headers. 2.60 adds
.reloc implicitly; 5.00 produces PEs with a header size of at least 4096 bytes.
6.00 makes libraries that are 3x smaller but not compatible with lower LINK
versions.
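
To check whether a given binary carries the 'Rich' zone, you can look for the
ASCII marker "Rich" between the 64-byte DOS header and the PE signature. A
minimal sketch in Python (not part of the original package; the function names
has_rich_zone and fake_image are hypothetical, and fake_image only builds a toy
header for demonstration, not a loadable file):

```python
import struct

def has_rich_zone(image: bytes) -> bool:
    """Return True if b"Rich" appears between the DOS header and the PE header."""
    if image[:2] != b"MZ" or len(image) < 0x40:
        raise ValueError("not an MZ executable")
    # e_lfanew (file offset of the "PE\0\0" signature) is a DWORD at offset 3Ch.
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    return b"Rich" in image[0x40:e_lfanew]

def fake_image(stub: bytes) -> bytes:
    """Toy image for demonstration: 64-byte DOS header + stub + PE signature."""
    header = bytearray(0x40)
    header[0:2] = b"MZ"
    struct.pack_into("<I", header, 0x3C, 0x40 + len(stub))
    return bytes(header) + stub + b"PE\0\0"

with_zone = fake_image(b"\0" * 16 + b"Rich" + b"\x12\x34\x56\x78")
without_zone = fake_image(b"\0" * 64)
print(has_rich_zone(with_zone), has_rich_zone(without_zone))  # True False
```

On a real file, read it with open(path, "rb").read() and pass the bytes in.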


3) Setting up environment
-------------------------
Create the following two environment variables:
 SET INCLUDE=C:\MASM32\INCLUDE;C:\98DDK\INC\WIN98
 SET LIB=C:\MASM32\LIB;C:\98DDK\LIB\I386\FREE
And always put new includes, macros and libraries into those directories.
Then you don't need to specify the path to includes and libraries in the source
file:
 INCLUDE     W32Main.inc
 INCLUDE     APImacro.mac
 INCLUDELIB  KERNEL32.lib
 INCLUDELIB  iKERNEL32.lib
It is also a good idea to have ML.exe and LINK.exe in PATH.


4) Creating iLIBraries
----------------------
You don't need any import library. All you need is MS LINK.
Take your PE with exports (PEexp.dll or .exe, .sys, .lib too!) and type (in the
case of kernel32.dll):
LINK -DUMP -EXPORTS kernel32.dll > kernel32.def
Now edit kernel32.def into the form below (I recommend an editor that supports
vertical blocks: FAR, DN; it's a matter of 20 seconds):
   NAME    kernel32.dll

   EXPORTS
   Export0
   Export1
   ...
   ExportZ
The NAME directive implies the extension .exe, LIBRARY implies .dll.
NAME with an explicitly specified extension is universal.
Achtung baby! The dump can contain forwarded exports like this:
   HeapExtend
   HeapFree (forwarded to NTDLL.RtlFreeHeap)
   HeapLock
then remove the parenthesized '(forwarded to ...)' note:
   HeapExtend
   HeapFree
   HeapLock
Now move kernel32.def to the LIB directory and type:
LINK -LIB /DEF:kernel32.def /MACHINE:IX86 /OUT:iKERNEL32.lib
DEL  iKERNEL32.exp
The output is named iKERNEL32.lib because you probably already have the
original MS '@' kernel32.lib library in the LIB directory.
iKERNEL32.lib is a non-@ library and is designed for use with APImacro.mac.
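
The hand edit of the dump can also be scripted. A minimal sketch in Python (not
part of the original package). The function name exports_to_def is
hypothetical, and the table layout it parses is an assumption: a header line
starting with "ordinal hint", rows of "ordinal hint [RVA] name", an optional
parenthesized note after the name, and a "Summary" line after the table. Check
this against what your LINK version really prints.

```python
import re

HEX8 = re.compile(r"^[0-9A-Fa-f]{8}$")

def exports_to_def(dump_text: str, module_name: str) -> str:
    """Turn the text of LINK -DUMP -EXPORTS into the contents of a .def file."""
    names = []
    in_table = False
    for line in dump_text.splitlines():
        tokens = line.split()
        if not in_table:
            # Everything before the column header line is ignored.
            in_table = tokens[:2] == ["ordinal", "hint"]
            continue
        if tokens and tokens[0] == "Summary":
            break
        if len(tokens) < 3 or not tokens[0].isdigit():
            continue  # blank line or something that is not an export row
        # Some layouts put an 8-digit RVA column between hint and name.
        has_rva_column = (len(tokens) > 3 and HEX8.match(tokens[2])
                          and not tokens[3].startswith("("))
        name = tokens[3] if has_rva_column else tokens[2]
        if name.startswith("["):
            continue  # assumed marker of an export without a name
        # Anything after the name, e.g. "(forwarded to ...)", is dropped.
        names.append(name)
    return "\n".join(["NAME    " + module_name, "", "EXPORTS"] + names) + "\n"

sample = """
    ordinal hint   name

          1    0   AddAtomA  (000079B7)
        345  158   HeapFree (forwarded to NTDLL.RtlFreeHeap)

  Summary
"""
print(exports_to_def(sample, "kernel32.dll"))
```

Write the returned text to kernel32.def and continue with LINK -LIB as
described.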

Notes:
 a)The iLIBRARY.lib is fully optimized for your version of the exe/dll/sys:
   the hint fields will match - faster loading.
 b)For -LIB use a LINK version < 6.00. Libraries created by higher versions
   are not compatible with lower versions. Moreover, when the linker links your
   app with >= 6.00 libraries, the 'nonsense zone' between the MZ and PE headers
   is thicker.
 c)The output of -DUMP -EXPORTS tells you where the functions are located,
   so when AddAtomA has RVA=79B7, load kernel32.dll into SEN's Hiew,
   press F5 and type: .79B7


5) Using APImacro.mac
---------------------
Copy it to the INCLUDE directory and put this in your source:
INCLUDE    APImacro.mac
INCLUDELIB iLIBRARY.lib
Some macros are based on the Win32 ones by Sven B. Schreiber.
APImacro.mac works with TASM32 too. Of course, after compiling use LINK instead
of TLINK32. The TASM /zi option makes no sense: because TASM produces OMF, only
PUBLIC variables will be visible as debug symbols (/DEBUG /DEBUGTYPE:COFF).
================================================================================
 a)iWin32
   is a replacement for INVOKE (TASM: CALL). You don't need the PROTOs
   generated by Hutch's L2INC, and you don't need kernel32.inc, user32.inc, etc.
   You must use OFFSET instead of ADDR. If you have problems, try putting the
   parameter in <angle brackets> (required for TASM: anything that contains
   spaces must be in brackets!). Moreover, sometimes you must give the parameter
   type explicitly:
   in MASM you can: ..,[EAX+1],.. ; in TASM you must:  ..,DWORD PTR [EAX+1],..


  Standard:     MessageBoxA PROTO :DWORD, :DWORD, :DWORD, :DWORD
                INVOKE MessageBoxA, 0, ADDR par1,\
                                    ADDR par2, 1 OR 2

       MASM generates:
                PUSH ...
                CALL  MessageBoxA
                ...
       ;jmp table here
             MessageBoxA:
                JMP  [MessageBoxA]


  EliASM:      iWin32 MessageBoxA, 0, OFFSET par1,\
                                   OFFSET par2, 1 OR 2
    
        for TASM: iWin32 MessageBoxA, 0, <OFFSET par1>, <OFFSET par2>, <1 OR 2>

       MASM generates:
                PUSH ...
                CALL [MessageBoxA]

  Do you see the difference?  1 instruction instead of 2! (6 bytes vs. 11)
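
The byte counts can be checked by writing out the encodings. A small sketch in
Python (not part of the original package): the helper names and the two
addresses are hypothetical, only the opcode bytes matter (E8 = CALL rel32,
FF 25 = JMP [mem32], FF 15 = CALL [mem32]).

```python
import struct

def call_rel32(displacement: int) -> bytes:
    """CALL label: opcode E8 + 32-bit relative displacement."""
    return b"\xE8" + struct.pack("<i", displacement)

def jmp_mem32(address: int) -> bytes:
    """JMP [address]: opcode FF 25 + 32-bit absolute address."""
    return b"\xFF\x25" + struct.pack("<I", address)

def call_mem32(address: int) -> bytes:
    """CALL [address]: opcode FF 15 + 32-bit absolute address."""
    return b"\xFF\x15" + struct.pack("<I", address)

IMPORT_FIELD = 0x00402000   # hypothetical address of the import field
TABLE_DISTANCE = 0x100      # hypothetical distance to the jmp table entry

via_jmp_table = len(call_rel32(TABLE_DISTANCE)) + len(jmp_mem32(IMPORT_FIELD))
direct = len(call_mem32(IMPORT_FIELD))
print(via_jmp_table, direct)  # 11 6
```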



 aa) iWin32@
     is an EXACT replacement for INVOKE (@ libs); see Example05

     Perhaps there is a way to implement iEXTERNs for INVOKE (via TYPEDEF?)

================================================================================
 b)icWin32
       C variant of iWin32 - it cleans up the stack after the call;
       suitable for wsprintf, sscanf and similar VARARG functions

 ba) icWin32@
     is @ variant (but it is used rarely)

================================================================================
 c)iWin32j
   is JMP variant. 

   Standard:    ExitProcess PROTO :DWORD
                JMP  ExitProcess

       MASM generates (no parameters):

                JMP  ExitProcess
       ;jmp table here
             ExitProcess:
                JMP  [ExitProcess]

   EliASM:      iWin32j  ExitProcess, EBX, 0

       MASM generates:
                PUSH ...
                JMP  [ExitProcess]
================================================================================
 d)iLEA
   gives the address of the import field in the import directory of your PE

   Standard:    MOV   ECX, DWORD PTR ReadFile+2 

       MASM generates:
                MOV   ECX, DWORD PTR ReadFile+2 
       ;jmp table here
             ReadFile:
                JMP  [ReadFile]     ;who wanted this?

   EliASM:      iLEA  ECX, ReadFile

       MASM generates:
                MOV   ECX, OFFSET ReadFile
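
Why "+2" in the standard form: the jmp table entry is the 6-byte instruction
JMP [mem32], encoded as FF 25 followed by the 32-bit address of the import
field, so that address sits 2 bytes into the entry. A small sketch in Python
(not part of the original package; the function name and the sample address
are hypothetical):

```python
import struct

def import_field_from_thunk(thunk: bytes) -> int:
    """Extract the import field address from a JMP [mem32] jmp table entry."""
    if thunk[:2] != b"\xFF\x25":
        raise ValueError("not a JMP [mem32] entry")
    # The operand starts right after the two opcode bytes: offset +2.
    (field_address,) = struct.unpack_from("<I", thunk, 2)
    return field_address

sample_thunk = b"\xFF\x25" + struct.pack("<I", 0x00402010)
print(hex(import_field_from_thunk(sample_thunk)))  # 0x402010
```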
================================================================================
 e)iMOV   (useful in drivers)
   gives the contents (the address of the API) of the import field in the
   import directory of your PE

   Standard:    MOV   EAX, DWORD PTR KeNumberProcessors+2 
                MOV   EAX, [EAX]
                MOV    AL, [EAX]

       MASM generates:
                MOV   EAX, DWORD PTR KeNumberProcessors+2 
                MOV   EAX, [EAX]
                MOV    AL, [EAX]    ;no of cpus, btw signed value ( -1 cpu :)
       ;jmp table here
             KeNumberProcessors:
                JMP  [KeNumberProcessors]   ;who wanted this nonsense?

   EliASM:      iMOV  EAX, KeNumberProcessors
                MOV    AL, [EAX]

       MASM generates:
                MOV   EAX, KeNumberProcessors
                MOV    AL, [EAX]
================================================================================
 f)iPUSHo 
       See iLEA
================================================================================
 g)For iEXTERN, iMOVw, iPUSH, iPOP, iDWORD, iCMP  see iMOV.
       Of course you can create other imacros: cmp api,const; add;sub;....

   iEXTERN is UNIVERSAL !!!!

   iGet0ImpDesc, iSet0ImpDesc, iGetDllImpDesc,iSetDllImpDesc  .. pro forma

   Note: For iMOVw, iPOP, iSet* the object where the import directory resides
         (usually .rdata, .idata) must be writable.
================================================================================
 h)Nonimport macros

   sWin32  - STDCALL  call
             use when you CALL R32 or CALL mem32 or CALL label32
             example: sWin32  EDX, 0, 1, 2
   cWin32  - C  variant of sWin32

   lPUSH, lPOP  - push, pop list  in PASCAL order   (like TASM but with commas)
   PUSHL, POPL  - push, pop list  in STDCALL/C order


  If you are linking a debug build (/DEBUG option), then the jmp table is
  always added!! (and perhaps it is better to use standard INVOKE; it saves
  1 byte)
================================================================================


6) ASM in BAT
-------------
NMAKE is maybe good for large projects. For small examples a .bat file is
better. When you want to save a cluster you can merge the .asm and the .bat
into one file. If you have problems with my sources, rename the .bat to .asm,
take the .bat part of the .asm and copy it to a new .bat.


7) Include files
----------------
Hm.. this is a problem. Everything is in .h files and H2INC doesn't work
correctly. If you code another, good H2INC, you'll be king.

In W32Main.inc (Sven B. Schreiber) there is a macro STRING, which creates an
ANSI or UNICODE string depending on the setting of the variable UNICODE. So
when you use STRING you can easily convert your program between the ANSI and
UNICODE versions. CHAR_ is the size of a character: 1 or 2 bytes.
Btw: I have a .txt written in UNICODE (or wide char if you want). It begins
with BYTE -1, -2 (the bytes FFh, FEh). Then it is displayed correctly in NT
Notepad.
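
The two bytes are the byte order mark of little-endian 16-bit UNICODE text.
A small sketch in Python (not part of the original package; the helper name
masm_bytes and the sample text are hypothetical):

```python
import codecs

def masm_bytes(*values: int) -> bytes:
    """Store signed values as single bytes, the way BYTE -1, -2 ends up in memory."""
    return bytes(value & 0xFF for value in values)

bom = masm_bytes(-1, -2)
text_file = bom + "EliCZ".encode("utf-16-le")

print(bom.hex())                           # fffe
print(bom == codecs.BOM_UTF16_LE)          # True
print(text_file.decode("utf-16"))          # EliCZ
```

The "utf-16" decoder reads the mark, picks the byte order from it and drops it
from the result.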


8) Optimizations
----------------
I use the standard VC++ size optimizations, like:

    AND    Dwordxx, 0 ; OR    Dwordxx, -1
    AND    wordxx, 0  ; OR    wordxx, -1
    instead of MOV (D)wordxx, 0  (-1)

    PUSH   Const80h   ;6Ah form
    POP    R32
    instead of MOV R32, Const80h ; where : -129 < Const80h < 128
    (does NASM always generate the long 68h form?)

    OR     R32, -1  |  SUB R32, R32  DEC R32 | XOR R32, R32  DEC R32
    instead of MOV R32, -1 ;

    SUB R32, R32  INC R32 | XOR R32, R32  INC R32
    instead of MOV R32,  1 ;

    SUB R32, R32 | XOR R32, R32
    instead of MOV R32,  0 ;

    etc....

    but I will go back to MOV R32, const
    because it is faster and speed is more important than 2 bytes
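
Where the 2 bytes come from can be seen from the encodings. A small sketch in
Python (not part of the original package): EAX stands in for R32, 7Fh is an
arbitrary sample constant, and the helper names are hypothetical.

```python
import struct

def fits_imm8(value: int) -> bool:
    """True when the constant fits the short PUSH form (opcode 6Ah)."""
    return -128 <= value <= 127

def push_pop_eax(value: int) -> bytes:
    """PUSH imm8 / POP EAX: 6A ib, 58."""
    if not fits_imm8(value):
        raise ValueError("constant needs the long 68h form")
    return b"\x6A" + struct.pack("<b", value) + b"\x58"

def mov_eax(value: int) -> bytes:
    """MOV EAX, imm32: B8 id."""
    return b"\xB8" + struct.pack("<i", value)

SIZES = {
    "MOV EAX, 7Fh": len(mov_eax(0x7F)),
    "PUSH 7Fh / POP EAX": len(push_pop_eax(0x7F)),
    "OR EAX, -1": len(b"\x83\xC8\xFF"),
    "XOR EAX, EAX / INC EAX": len(b"\x33\xC0\x40"),
    "XOR EAX, EAX": len(b"\x33\xC0"),
}

for text, size in SIZES.items():
    print(f"{size} bytes  {text}")
```

MOV EAX, imm32 is always 5 bytes; the short forms are 3 bytes (2 for clearing
the register).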

   Also, when you need only the flags, it is better to use TEST and CMP, which
   are faster than AND and OR.

   Registers EBX, EBP, ESI, EDI are preserved by 99% of APIs. So you can put
   often-used constants/addresses/APIs into them. You can see this everywhere:
   MOV E??, RtlInitUnicodeString .. CALL E?? .. CALL E??    (?? == BX,DI,...)


9) Using LINK
-------------
See my webpage/Infos (PE object merging, renaming, changing attributes, etc...).
MS LINK 6.00 produces PE files with file alignment 0x1000 because such a PE can
be loaded quickly. When you want a file with the minimum Win95 alignment
(0x200), specify among other switches: /ALIGN:0x1000 /IGNORE:4108  or
/SUBSYSTEM:???,xxx where xxx is a number < 4.0, for example 3.51. When you are
merging objects, specify /IGNORE:4078. LINK can't merge the reloc section -
it's added after everything else is done; also, merging resources is not a good
idea.
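
To see which alignment the linker really used, read the two alignment fields
from the PE optional header (SectionAlignment at offset 32, FileAlignment at
offset 36 of the optional header). A minimal sketch in Python (not part of the
original package; the function name pe_alignments is hypothetical):

```python
import struct

def pe_alignments(image: bytes) -> tuple:
    """Return (SectionAlignment, FileAlignment) of a PE image."""
    if image[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 3Ch points to the "PE\0\0" signature.
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # Optional header follows the 4-byte signature and the 20-byte file header.
    optional_header = e_lfanew + 4 + 20
    return struct.unpack_from("<II", image, optional_header + 32)
```

Usage: section, file = pe_alignments(open("app.exe", "rb").read()), where
app.exe is a placeholder for your own binary.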


10) Examples
------------
The examples were built using MASM 6.14 (TASM 5.0) and LINK 6.00.

================================================================================
EliCZ, chemical student, Aug-09-1999
WWW: http://elicz.cjb.net
IRC: EFnet: #win32asm, #dtg2000