Some Assembly Required

January 25, 2013

Prelab (complete before lab)

In this lab you will use the miniMIPS simulator. Prior to the lab you should verify that it works in your web browser. If it does not operate as discussed below you might consider using either Chrome or Firefox.

Load the simulator in a separate tab.
Assemble the default program by pressing [Assemble]. This should pop-up a dialog telling you how many words were loaded into the memory of the simulator.
Single-step through the program by pressing the [Step] button. While single-stepping scroll down to the bottom of the simulator to the "execution history" window. Examine the next instruction to be executed (it has a green background) and verify as you step that the instruction updates the destination register as expected. If you hover over a register cell a popup will give the register's contents in decimal.
Run the program. Press the [Run] button, and a dialog should popup indicating that the instruction labeled "halt" was reached.
Scroll down the page below the simulator window and glance over the instructions supported by miniMIPS.
Verify that the built-in editor works. Move your cursor into the assembly-code editor, and type in an instruction or two with arguments. Verify that tabs move the cursor position within the editor window.

Lab

Simple function evaluation
In the following exercises we write various assembly code fragments to evalute the following polynomial:

y = 4 x² - 11 x - 3
1. Poly version 1.
  In this version of the code we will evaluate the given polynomial quite literally as written. First you need to clear the assembly window of the simulator and enter (or simply cut and paste) the following code:
  poly1: lw $8,x # t0 = x addiu $9,$0,4 # t1 = 4 mul $9,$9,$8 # t1 = 4*x mul $9,$9,$8 # t1 = 4*x*x addiu $10,$0,11 # t2 = 11 mul $10,$10,$8 # t2 = 11*x subu $9,$9,$10 # t1 = t1 - t2 addiu $9,$9,-3 # t1 = t1 - 3 sw $9,y # y = t1 halt: beq $0,$0,halt x: .word 4 y: .space 1
  Notice that we placed the "data" (variables x and y) after the code. This was to assure that it would be located far away from the "code", which by default is location 0x00000000, the starting address for the assembler. Later we will discuss assembler directives that will give us more freedom for locating and relocating code and data. Also notice the infinite loop to "halt" after the desired value is computed and stored. This keeps the simulator from fetching instructions from following memory locations, which, in this case, happens to be our data.
  
  After the code is entered you should press the [Assemble] button and verify that ther were no typing mistakes.This should assemble 12 words in memory. Next you should step through each line of the code by pressing the [Step] button and verify that the expected result is saved in y. Then modify the value of x in the assembler to "3", then assemble and step through it again.
  
  Notice in your second running of the program that the register and most memory contents were not reinitialized in any way. One exception to this was the contents of "y", which are reset to zero. By the way, the only way to see this was to look at the execution history at the bottom of the simulator and notice the value of y just prior executing the sw instruction. This reinitialization is actually due to the assembler directive ".space" which provides the specified number of zero-filed words for allocating variables.
  
  Checkoff 1. After your second execution with x = 3, what is the value left in $10? What part of the polynomial expression does this intermediate result correspond to?
2. Poly version 2
  An alternative way of evaluating the same polynomial is to perform the operations implied by the following factored version:
  
  y = (4 x + 1)(x - 3)
  
  Resulting in
  poly2: lw $8,x # t0 = x sll $9,$8,2 # t1 = 4*x addiu $9,$9,1 # t1 = 4*x + 1 addiu $10,$8,-3 # t2 = x - 3 mul $9,$9,$10 # t1 = t1 * t2 sw $9,y # y = t1 halt: beq $0,$0,halt x: .word 4 y: .space 1
  This version evaluates the exactly the same function as the original, but uses only 9 words of memory. This is an example of low level optimization, of the form that humans might apply (i.e. reformulating the function into a form that is easier to evaluate). It is unlikely that a complier could have come up with this version of the code, since it requires an understanding of the rules of factoring and how they might be best leveraged. However, a human could have aided the complier by specifying the factored function in the high-level simply on the basis that it implies fewer operations.
  
  There is also a second optimization where a multiply by 4 is replaced by a shift-left instruction. This sort of optimazation is commonly made by compilers. Basically, a left shift by i positions is equivalent to a multiplication by 2ⁱ. This is a direct consequence of our representation of unsigned and signed (2's complement) integers discussed in lecture 2. Shifts by 1 are the same as multiplying by 2, and shifts by 2 are the same as multiplying by 4. Multiplying by powers of 2 comes up frequently when indexing data structure in computers since most data item sizes are powers of 2 in size (bytes are 1 byte, shorts are 2 bytes, words are 4 bytes, and double words are 8 bytes). Thus, the address of the i^th word in an array is offset 4*i from the base address of the array. This offset is generally computed by shifting i left by 2.
  
  Another advantage of this polynomial evaluation version is that it makes better use of the add immediate addiu instructions, since there is no immediate version on mul in miniMIPS.
  
  Checkoff 2. Choose a value for x such that the y produced will negative. What value did you choose? What intermendiate value ends up being saved in $10? What part of the expression does this value correspond to?
3. Poly version 3
  A third method for evaluating the given function, which is more suitable for optimization by compilers, is to compute y as follows:
  
  y = (4 x - 11) x - 3
  
  Checkoff 3. Write your own assembly language implementation of this verison. Handwrite your code on your check-off sheet, then enter it in the simulator to verify that it is correct. Use the "shifting" trick mentioned previously if possible. How many words does your implementation assemble to?
Machine Coding
As discussed in class, an assembler is used to translate a human-readable assembly language into machine language. However, for those diehards it is still possible to do a little machine language programming as follows.
1. Disassembly
  Clear the assembly window and enter:
  .word 0x3c1eface .word 0xaddedacc
  Then assemble the program, and examine the execution-history window of the simulator. You will notice that a "comment" has been appended to the first word (0x3c1eface) containing the "disassembled" contents of the word you initialized. Likewise for the second ".word" statement. Thus, 0x3c1eface and 0xaddedacc are the machine language encodings of the two instruction sequence
  lui $30,64206 sw $30,56012($14)
2. Re-assembling
  Now try entering these two instructions directly, and assemble the result to verify that the two ".word" statements given previously are equivalent. You might find this little trick to be useful for your upcoming homework.
Some exceptions
Not every binary word is a valid instruction. The reasons for their failings vary. In this exercise we use the simulator to generate various errors called "exceptions". These result when an instruction is invalid, or makes an invalid memory access, or, in some cases, generates an invalid result.
1. Invalid Instructions
  Let's start with an "invalid" instruction. Use the previously described technique to assemble and execute the following line of assembly code.
  .word 0xffffffff
  This generates an "invalid instruction" exception, which displays a dialog when executed. In addtion, the address of the the offending instruction is saved in $27, and the program counter is set to the kernel address 0x80000400. Register 27 by convention, is reserved for use by the kernel and it can be examined to determine what was the offending instruction. In fact, invalid instructions are one method of calling kernel services. Moreover, some implementations of the MIPS ISA may choose not to implement every instruction (div for example). This can be handled using the invalid instuction exception, which allows the kernel to implement the missing instruction in software, and then return to the following instruction.
2. Alignment Exceptions
  In the MIPS ISA all memory accesses must be properly aligned according to their data type (short, word, or long). Improper addresses generate an "alignment exception". One can be generated as follows:
  lw $t0,1($0)
  Once assembled and executed by pressing [Step], this instruction will also generate an exception and popup a dialog explaining the error. The nature of this error however is different than the "invalid instruction" error we saw previously. The source is due to the address generated by adding the immediate value and the contents of the specified register, which is not "word-aligned". One possible method for handling these sort of exceptions is to support "unaligned" word accesses in software.
3. Overflows
  The standard add, sub, and addi MIPS instructions generate an overflow exception when their result cannot be represented in 32-bits. The following series of instructions will cause an overflow.
4. Dividing by zero
  The interger divide instruction, div, has its own version of an overflow that occurs when one attempts to divide by 0. It can be generated using the following:
  div $1,$1,$0
Jumping branches
In this section we explore branching instructions. Previously, in part 1, we used a "beq" instruction to effectivlty stop the program counter (see the instruction labeled "halt" in "poly1"). In this section we will call subroutines and implement a loop using various forms of jumps and branches.
1. A subroutine
  Start by entering the following modified version of "poly2" into the assembly window of the emulator.
  main: addiu $8,$0,2 # t0 = 2 sw $8,x # x = t0 = 2 jal poly2 # y1 = poly2(x) lw $8,y # t0 = y1 sw $8,x # x = t0 = y1 jal poly2 # y2 = poly2(y1)*halt: beq $0,$0,halt poly2: lw $8,x # t0 = x sll $9,$8,2 # t1 = 4*x addiu $9,$9,1 # t1 = 4*x + 1 addiu $10,$8,-3 # t2 = x - 3 mul $9,$9,$10 # t1 = t1 * t2 sw $9,y # y = t1 jr $31 x: .space 1 *y: .space 1
  A preamble called "main" has been added to the original that calls the "poly2" function twice with different values for "x". These calls are made using the "jump and link" instruction jal. This instruction saves the address of the instruction following the jal in register 31 ($31). Notice that the last instruction of "poly2" is now a "jump register" or jr instruction with argument $31.
  
  Also notice the two asterisks placed before the labels "halt" and "y". Including an asterisk in the first column of an instruction creates an execution breakpoint at the location. An execution breakpoint stops a running simulator prior to the execution of the flagged instruction, which is useful for debugging. Including an asterisk in the first column of a data location creates an access breakpoint, which stops the simulator anytime the flagged location is written to. Rather than stepping through the program one instruction at a time, if breakpoints are set you can press the [Run] button, and examine the CPU's state when either the flagged instruction is reached, or a flagged location in memory is written. Once the dialog is dismissed, you can examine the contents of registers and the execution history. To continue execution, press [Run] again. Try it out on the given segment. The simulation should stop twice when the value of "y" is updated during the calls to "poly2" and once again when "halt" is reached.
2. A loop
  Now enter this modified version of "poly2" in the assembly window of the emulator.
  main: addiu $8,$0,0 # loop: sw $8,i # i = 0 sw $8,x # x = i jal poly2 # y = poly2(x) lw $8,i # t0 = i lw $9,y # t1 = y sll $10,$8,2 # t2 = 4*i sw $9,result($10) # result[i] = y addiu $8,$8,1 # t0 = t0 + 1 slti $10,$8,10 # t2 = (i < 10) ? 1 : 0; bne $10,$0,loop # repeat while (t2) *halt: beq $0,$0,halt i: .space 1 result: .space 10 poly2: lw $8,x # t0 = x sll $9,$8,2 # t1 = 4*x addiu $9,$9,1 # t1 = 4*x + 1 addiu $10,$8,-3 # t2 = x - 3 mul $9,$9,$10 # t1 = t1 * t2 sw $9,y # y = t1 jr $31 x: .word 4 *y: .space 1
  This code sequence calls "poly2" ten times with "x" set to successive values 0, 1, 2, ..., 9, and the resulting "y" values are stored in an array named result.
  
  The value of "i" is tested each time through the loop using the "set if less than immediate", slti, instruction. This instruction sets the destination register $10 to "1" if the contents of register $8, which contains "i" are less than the immediate operand, and sets it to "0" otherwise. This result is tested by the following bne instruction, which branches to loop is the result was non-zero (i < 10) and falls through to the beq instruction (halt) otherwise.
  
  Also notice the "shift left logical", sll, instruction that is used to multiply "i" by four so that it can be used as a word-aligned index by the following sw instruction.
  
  Press the [Run] button to execute the given code and examine each value of "y" as it is updated.
A function of your own
Checkoff 5. Write a function to compute the greatest common divisor, "gcd", of the two integer variables "x" and "y" using the following algorithm:

while (x != y) { if (x < y) { y = y - x; } else { x = x - y; } }

Then write a preamble "main" function to call it with values "x" = 72 and "y" = 120. Write your code on the checkoff sheet after testing it in the simulator (it should take less than 25 lines of assembly). Note that when the while loop finishes both "x" and "y" are set to the "gcd".

Lab Checkoff

Turn in your lab checkoff sheet with 5 checkoffs.