# CLASS 14

## Data Structures

### One-dimensional array

Arrays in C and C++ are stored as pointers to the first element of the array, and a list of offsets from that element.

Arrays are laid out in memory linearly. To store an array, we allocate enough space on the stack for all the array's elements. The address of each element is computed from the address of first element and the array index i:

```  address_of_first_element + i * size_of_array_element_in_bytes
```
so a five-element array:
```	int ary[5];
```
would be:
```  ary:	ary[0]
ary+4:	ary[1]
ary+8:	ary[2]
ary+12:	ary[3]
ary+16:	ary[4]
```
To allocate an array on the stack, we treat it like one big variable:
```  ary_s = -20
save %sp, (-64 - 20) & -8, %sp
```
So address of element 0 is `%fp + ary_s`
Address of element 1 is `%fp + ary_s + 4`
To load `ary[2]` into `%o0`:
```	add	%fp, 8, %o1
ld 	[%o1 + ary_s], %o0
```
To load `ary[i]` into `%o0`, if `i` were stored in `%i_r`:
```	/* multiply i by 4 */
sll 	%i_r, 2, %o1
add	%fp, %o1, %o1
ld	[%o1 + ary_s], %o0
```

Example of Array allocation:

```int a, b, c[3];
char d[3];
short e[2];
```

Corresponding to the C declarations, the offsets on the stack will be defined as follows:

```define(a_s, -4)
define(b_s, -8)
define(c_s, -20)
define(d_s, -23)
define(e_s, -28)
```
Now, the initial declaration will be:
```save %sp, ((-64 - 28) & -8), %sp
==> save %sp, -96, %sp
```
And the variables can be loaded into the registers as:
```ld   [%fp + a_s], %o0            ! a --> o0
ldsb [%fp + c_s], %o1            ! c[0]  --> o1
ldsb [%fp + c_s + 4], %o2,       ! c[0]  --> o2
ldsh [%fp + d_s], %o3            ! d[0] --> o3
ldsh [%fp + d_s + 4], %o4        ! d[0] --> o4
and so on.......
```

Another Example of Array allocation

```int a[100];
char b[48];
register int x[3];
```
The sparc assembly code for these definitions is as follows:
```define(a_s, -400)
define(b_s, -448)
define(x_r_0, l0 )
define(x_r_1, l1 )
define(x_r_2, l2 )
```

Hence, the starting instruction of main will be:

```save %sp, ((-64 - 448) & -8), %sp
==> save %sp, (-512 & -8), %sp
==> save %sp, -512, %sp
```

Example: Summing an array

```  1	main()
2	{
3	  int nums[4] = {1, 45, -16, 23, 62}
4	  int n = 5;		/* number of elems in array */
5	  register int i;	/* for index */
6	  register int sum = 0;	/* holds the sum */
7
8       for (i=0; i < n; i++)
9       {
10         sum = sum + nums[i];
11       }
12  }
```
This code translates to:
```	/* local variables */
nums_s = -20
n_s = -24

/* i is in %l0;  sum is in %l1 */

.global	main
main:	save	%sp, -88, %sp

/* line 3 initialization */
mov	1, %o0
st	%o0, [%fp + nums_s]

mov	45, %o0
st	%o0, [%fp + nums_s + 4]

mov	-16, %o0
st	%o0, [%fp + nums_s + 8]

mov	23, %o0
st	%o0, [%fp + nums_s + 12]

mov	62, %o0
st	%o0, [%fp + nums_s + 16]

/* line 4 initialization */
mov 	5, %o0
st	%o0, [%fp + n_s]

/* line 6 initialization */
mov	0, %l1

/* line 8; top of for loop */
ba	fortest
mov	0, %l0
for:
/* line 10; body of the for loop */
sll	%l0, 2, %o0			! %o0 = i * 4
add	%fp, %o0, %o0			! %o0 = %fp + i * 4
ld	[%o0 + nums_s], %o0		! %o0 = nums[i]
add	%o0, %l1, %l1			! sum = sum + nums[i]

/* line 11; bottom of the for loop */
add	1, %l0, %l0			! i++;
fortest:
ld	[%fp + n_s], %o0		! %o0 = n
cmp	%l0, %o0			! is (i < n)?
bl 	for
nop

/* line 12; end of program */
mov	1, %g1
ta	 0
```

## Pointer and Address arithmetic

Consider the following C declarations:
```	int   a[20];
char  b[20];
short c[20];
```
Let's say these are the only declarations in a subroutine, and the value of `%fp` is 1000. What are the addresses of `a[0], b[0],` and `c[0]`?

First, we allocate the items in order. `a` takes 80 bytes, so its address is 1000-80 = 920. `b` takes 20 bytes, so its address is 920 - 20 = 900. `c` takes 40 bytes, so its address is 900 - 40 = 860.

Now, what are the values of:

```	&(a[3])	 = 920 + 3 *  4 = 932
&(b[10]) = 900 + 10 * 1 = 910
&(c[5])  = 860 + 5 *  2 = 870
```
So, as before, we multiply the index by the base item size and add it to the array start.

Since this operation is so common, the C language has it built in. Say we added these declarations.

```  int   *a_ptr = a;
char  *b_ptr = b;
short *c_ptr = c;
```
Now, each one points at the first element of its array. So
```  a_ptr = 920,
b_ptr = 900,
c_ptr = 860.
```
That is, `*a_ptr = a[0]`. To get the second element of the array, no matter what sizethe base item is, just increment the pointer:
```  a_ptr++;
b_ptr++;
c_ptr++;
```
Now `*a_ptr = a[1], *b_ptr = b[1], *c_ptr = c[1]`. Which means that
```  a_ptr = 924,
b_ptr = 901,
c_ptr = 862.
```
Notice that they have been increased by different amounts, depending on the size of each variable in bytes.

So what is the difference between `a` and `a_ptr`? When using them as names for data, nothing. You can refer to `a_ptr[20]` just as properly as referring to `a[20]`. However, there is a difference between `a` and `a_ptr`. For `a`, the space has been allocated on the stack by the compiler. For `a_ptr`, no space has been allocated -- it is up to the programmer to make sure that `a_ptr` points to a valid piece of memory. If we need to create more space for the pointer to point to during the course of execution of the program, we need to make use of functions like new and malloc to create the space. But is that space on the stack? Well, the answer is no. The space gets allocated on the heap.