Learn Microsoft Assembler in a Day: Appendix B, DEFINING DATA

Appendix B
DEFINING DATA

The structure and organization of your data can be very important to the effectiveness of your program code. This is true for any computer language.

One of the most common ways to define data is with a DB statement. This statement is used to define data in byte format. It may be used to define one byte or a string of bytes. You can define numeric data or ASCII data. It is very flexible.

   ;define one byte with value zero
     db   0
   ;define one byte with value unknown
     db   ?
   ;define an ASCII text string of bytes
     db   "define many byte string of ASCII text"
   ;mix up data types in statement
     db   0,?,"mix up data statement"
   ;define binary bit pattern for 18H
     db   00011000B
   ;define an ASCIIZ string with label
text_to_print  db   'This is ASCIIZ string',0
   ;define data block of zeros that is 300 bytes big
arrayX    db   300 dup(0)

The DW statement defines 16-bit words of data and is just as flexible as the DB statement. Words defined this way can also be used as indirect jump vectors. When data structures are addressed as words, access speed depends on whether the word sits at an even or an odd address. On the 8086, for example, a word at an even address is read in one bus cycle, while a word at an odd address needs two.

     even      ;force even word addresses
data dw   1    ;define word with value one
   ;define array of 80 words with zero
array80   dw   80 dup(0)
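
As a sketch of how words can serve as indirect jump vectors, the following program keeps two code offsets in a DW table and jumps through one of them. The names jump_tbl, case_0, case_1 and result are made up for this illustration, and a DOS environment (INT 21h, function 4Ch, to exit) is assumed.

```asm
StackSeg  SEGMENT   para stack     'stack'
     dw   64 dup(?)
StackSeg  ends
DataSeg   SEGMENT   para public    'data'
     even                     ;keep the table on an even address
jump_tbl  dw   case_0         ;entry 0: offset of case_0
     dw   case_1              ;entry 1: offset of case_1
result    db   ?
DataSeg   ends
CodeSeg   SEGMENT   para public    'code'
     assume    CS:CodeSeg,DS:DataSeg,SS:StackSeg
start     PROC near
     mov  ax,DataSeg
     mov  ds,ax
     mov  bx,1           ;pick table entry 1 (hypothetical selector)
     shl  bx,1           ;each entry is one word, so scale by 2
     jmp  jump_tbl[bx]   ;near indirect jump through the word
case_0:
     mov  result,0
     jmp  short done
case_1:
     mov  result,1
done:
     mov  ax,4C00h       ;DOS: terminate program (assumed environment)
     int  21h
start     ENDP
CodeSeg   ends
     end  start
```

Because jump_tbl is defined with DW, the assembler treats the jmp as a near jump that loads only the instruction pointer from the table entry.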

If a word is addressed as two separate bytes, the least significant byte is found at the first (lower) memory address and the most significant byte at the next memory address.
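
The byte order can be checked with a short sketch that reads one word a byte at a time. The name word_val and the value 1234h are chosen only for this illustration, and a DOS environment is assumed for the program exit.

```asm
StackSeg  SEGMENT   para stack     'stack'
     dw   64 dup(?)
StackSeg  ends
DataSeg   SEGMENT   para public    'data'
     even
word_val  dw   1234h          ;stored in memory as 34h, then 12h
DataSeg   ends
CodeSeg   SEGMENT   para public    'code'
     assume    CS:CodeSeg,DS:DataSeg,SS:StackSeg
start     PROC near
     mov  ax,DataSeg
     mov  ds,ax
     ;first address holds the least significant byte
     mov  al,byte ptr [word_val]
     ;next address holds the most significant byte
     mov  ah,byte ptr [word_val+1]
     ;AX now holds 1234h, the same as after mov ax,word_val
     mov  ax,4C00h       ;DOS: terminate program (assumed environment)
     int  21h
start     ENDP
CodeSeg   ends
     end  start
```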

The DD statement is for defining 32-bit double words. These double words can be used for indirect far jump vectors.

     dd   offset_data:dataseg
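
The line above shows the layout only schematically. A far pointer is stored with the offset in the low word and the segment in the high word. One way to build and use such a vector is sketched below, assuming MASM/TASM-style syntax and a DOS environment; the names far_vec, far_target and OtherSeg are hypothetical.

```asm
StackSeg  SEGMENT   para stack     'stack'
     dw   64 dup(?)
StackSeg  ends
DataSeg   SEGMENT   para public    'data'
     ;the assembler stores the offset of far_target in the low word
     ;and its segment in the high word
far_vec   dd   far_target
DataSeg   ends
CodeSeg   SEGMENT   para public    'code'
     assume    CS:CodeSeg,DS:DataSeg,SS:StackSeg
start     PROC near
     mov  ax,DataSeg
     mov  ds,ax
     ;far indirect jump: loads both IP and CS from far_vec
     jmp  dword ptr far_vec
start     ENDP
CodeSeg   ends
OtherSeg  SEGMENT   para public    'code'
     assume    CS:OtherSeg
far_target     LABEL     far
     mov  ax,4C00h       ;DOS: terminate program (assumed environment)
     int  21h
OtherSeg  ends
     end  start
```

The same vector could also be written as two words, dw offset far_target followed by dw seg far_target, which makes the storage order explicit.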

There are other data definition type statements that are sometimes used. The following are examples of some standard data types.

     df   ?    ;6 byte farword
     dq   ?    ;quad word, 8 bytes
     dt   ?    ;ten bytes, 8087 format

A complete Assembly language routine for the Intel 80X86 processor needs an assume statement. The assume statement generates no code; it tells the assembler which segment each segment register is expected to address, so that the assembler can detect segment addressing errors. It remains the programmer’s responsibility to make sure that the segment registers are indexing the correct data area at any given time. Using directives provided by the Turbo Assembler system, the programmer can simplify and avoid the use of many assume statements.

The following example of bad code is used to illustrate how some addressing problems can occur.

DataSeg1  SEGMENT   para public    'data'
var_1     dw   0
DataSeg1  ends
DataSeg2  SEGMENT   para public    'extra'
var_2     dw   7
DataSeg2  ends
     assume    CS:CodeSeg
CodeSeg   SEGMENT   para public    'code'
start     PROC near
     ;this gets the address of DataSeg1
     mov  ax,DataSeg1
     ;this loads DS to index DataSeg1
     mov  ds,ax
     assume    DS:DataSeg1
     ;this next mov statement will generate an assembly error
     ;because var_2 is not in DataSeg1
     mov  ax,var_2
start     ENDP
CodeSeg   ends
     end  start
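
One possible repair, sketched here under the assumption that ES is free to address the second data segment, is to load ES with DataSeg2 and tell the assembler so. The stack segment and the DOS exit call are additions for this illustration and are not part of the example above.

```asm
StackSeg  SEGMENT   para stack     'stack'
     dw   64 dup(?)
StackSeg  ends
DataSeg1  SEGMENT   para public    'data'
var_1     dw   0
DataSeg1  ends
DataSeg2  SEGMENT   para public    'extra'
var_2     dw   7
DataSeg2  ends
CodeSeg   SEGMENT   para public    'code'
     assume    CS:CodeSeg,SS:StackSeg
start     PROC near
     ;load DS to index DataSeg1
     mov  ax,DataSeg1
     mov  ds,ax
     assume    DS:DataSeg1
     ;load ES to index DataSeg2
     mov  ax,DataSeg2
     mov  es,ax
     assume    ES:DataSeg2
     ;var_2 is now reachable; the assembler adds an ES: override
     mov  ax,var_2
     ;var_1 is reached through DS with no override
     mov  var_1,ax
     mov  ax,4C00h       ;DOS: terminate program (assumed environment)
     int  21h
start     ENDP
CodeSeg   ends
     end  start
```

The assume statements only describe what the mov instructions before them have already loaded. If the description and the register contents disagree, the program assembles cleanly but addresses the wrong memory.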

Data, whether it is in a register or in memory, can be viewed by the program code in one of two ways: as a working data value or as a pointer to a data value. There are many types of pointers and the 80X86 allows for some complex pointing support. In the 80X86, there are segment registers which are used as base pointers to index the start of a memory segment. Almost all instructions that address memory will have a segment register implied or declared for calculating the real memory address to use. Many of the CPU registers can be used to point to data as well as hold data values. Data pointers may be direct or indirect; that is, a pointer may directly point to the location of a data value or it may point to the location of another pointer.
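
The difference between a working value, a direct pointer and an indirect pointer can be sketched as follows. The names value and ptr_val are hypothetical, and a DOS environment is assumed for the program exit.

```asm
StackSeg  SEGMENT   para stack     'stack'
     dw   64 dup(?)
StackSeg  ends
DataSeg   SEGMENT   para public    'data'
value     dw   1234h          ;a working data value
ptr_val   dw   offset value   ;a pointer to that value
DataSeg   ends
CodeSeg   SEGMENT   para public    'code'
     assume    CS:CodeSeg,DS:DataSeg,SS:StackSeg
start     PROC near
     mov  ax,DataSeg
     mov  ds,ax               ;DS is the base pointer for DataSeg
     ;data used as a working value
     mov  ax,value            ;AX = 1234h
     ;direct pointer: BX holds the location of the value
     mov  bx,offset value
     mov  ax,[bx]             ;DS implied, AX = 1234h
     ;indirect pointer: SI holds the location of another pointer
     mov  si,offset ptr_val
     mov  bx,[si]             ;BX = location of value
     mov  ax,[bx]             ;AX = 1234h
     ;same access with the segment register declared
     mov  ax,ds:[bx]
     mov  ax,4C00h       ;DOS: terminate program (assumed environment)
     int  21h
start     ENDP
CodeSeg   ends
     end  start
```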
