PDP-11 Addressing Modes and Instruction Formats
===============================================

The more common addressing modes have been covered in a previous lecture.
This lecture covers addressing modes in more detail, and explains the more
common instruction formats.

Two Address Instruction Format

Many of the instructions on the PDP-11 require two addresses, a source and a
destination. Examples of such instructions are MOV, ADD, CMP and BIS. The
format of these instructions is as follows:

    instruction code             (4 bits)
    source addressing mode       (3 bits)
    source register              (3 bits)
    destination addressing mode  (3 bits)
    destination register         (3 bits)

16 bits (one word) in all. If the addressing mode is indexed or index
deferred (explained below), the index is specified in the word following the
instruction.

Example:

    MOV R3, -(SP)    ; push R3

The instruction is MOV, and its code is 0001. The source addressing mode is
REGISTER, i.e. a register (in this case R3) contains the data, and the
destination addressing mode is AUTODECREMENT, i.e. the register specified in
the destination field (in this case SP or R6) is decremented before its
contents are used as a data address.

The code for register mode is 000, and the code for autodecrement is 100.
The code for the registers is 011 for R3 and 110 for R6, i.e. the binary for
the register number. The instruction given in the example above would be
assembled as

    0001   000               011   100                    110
    MOV    (register mode)   R3    (autodecrement mode)   R6

Primitive Addressing Modes

(In what follows, Rn represents any of the registers R0 to R7, and X
represents the contents of the word immediately following the instruction.)

There are eight primitive addressing modes. These are:

Register: code = 000, assembler symbol = Rn
    Register Rn contains the data to be operated upon (the operand).
Register deferred: code = 001, assembler symbol = @Rn or (Rn)
    Register Rn contains the address of the operand.
Autoincrement: code = 010, assembler symbol = (Rn)+
    The operand is referenced by the address in register Rn, and then the
    register is incremented (by 2 for word instructions and by 1 for byte
    instructions, except that SP and PC always step by 2). This mode is
    commonly used for popping data from a stack, and for reading characters
    from and writing characters to a buffer.
Autoincrement deferred: code = 011, assembler symbol = @(Rn)+
    The operand is referenced by the address in the word at the address
    contained in register Rn, and then register Rn is incremented.
Autodecrement: code = 100, assembler symbol = -(Rn)
    Register Rn is decremented, and then the operand is referenced by the
    address contained in register Rn. This mode is used to push data on to
    a stack, and may also be used for scanning backwards through an array.
Autodecrement deferred: code = 101, assembler symbol = @-(Rn)
    Register Rn is decremented, and then the operand is referenced by the
    address contained in the word at the address contained in register Rn.
Indexed: code = 110, assembler symbol = X(Rn)
    The operand address is obtained by adding the contents of Rn to X. This
    mode is used to access array elements. The first element of array X is
    accessed by setting Rn to 0. In general, the ith element is accessed by
    setting Rn to s*(i - 1), where s is the size of each element in bytes.
    The value of X is held in the word following the instruction code.
Index deferred: code = 111, assembler symbol = @X(Rn)
    The operand address is contained in the Rn'th element of array X, i.e.
    in the word at the address obtained by adding the contents of Rn to X.

Non-primitive Addressing Modes

The two commonest non-primitive addressing modes are direct addressing and
immediate addressing.

Direct addressing: code = 011 111, assembler symbol = X
In direct mode, the address of the data is held in the word following the
instruction code. (DEC's MACRO-11 assembler writes this mode as @#X, and
assembles a bare X as PC-relative, code 110 111.) Direct mode is in fact
autoincrement deferred using register R7 (PC), as can be verified by
checking the code for this addressing mode.
When the instruction is executing, the program counter (R7) points to the
word immediately following, and that word holds the address of the data
referenced by this instruction. PC (R7) is then incremented, to point to the
next instruction.

Immediate mode: code = 010 111, assembler symbol = #X
In immediate mode, the data itself is held in the word immediately following
the instruction. Inspection of the code for this addressing mode reveals
that it is autoincrement using the program counter.

One Address Instruction Format

A number of instructions on the PDP-11 specify only one address in their
instruction codes. Examples are INC, NEG, SWAB and JMP. The format of these
instructions is as follows:

    instruction code             (10 bits)
    destination addressing mode  (3 bits)
    destination register         (3 bits)

Normally, any of the registers and any of the addressing modes (primitive
and non-primitive) can be used with these instructions. An exception is that
register mode cannot be used with a JMP instruction (jumps to registers are
illegal).

Byte and Word Instructions

Many instructions have associated word and byte instructions. Examples are
MOV, BIT, ASL, and CLR. If word instructions are used, the addresses
referenced must be even. The code for byte and word instructions is the same
except for the first (most significant) bit. This is 0 in the case of word
instructions and 1 in the case of byte instructions.

A few instructions do not fit easily into the groups described here.
Examples are the HALT instruction and the TRAP instruction. (TRAP is a
software interrupt instruction. Interrupts are similar to subroutines in
effect, but are used for a different purpose. More about interrupts anon.)
These are essentially zero address instructions. More common instructions
which do not follow these formats are MUL (multiplication) and DIV
(division), which must have a register as their destination, and JSR and
RTS.

The PDP-11 is a two address machine.
It is also possible to have zero address, one address and three address
machines.

Zero address machines use a stack for storing data. Operations are performed
on the top few stack entries, and the result is left on top of the stack. In
addition to zero address instructions, zero address machines have branch
instructions and PUSH instructions, which specify one address.

One address machines use an accumulator (or several accumulators).
Operations are performed on an accumulator and a memory location, and the
result is left in the accumulator.

In three address machines, the result of performing an operation on data
referenced by two addresses is placed in the word at the third address.

Zero address machines are particularly suitable for running compiled code of
languages such as Pascal, C, Algol and LISP. Three address machines are
particularly suitable for executing compiled COBOL programs. In practice,
most manufacturers design instruction sets sufficiently general-purpose that
any language can run on their machines. For example, both the PDP-11 (a two
address machine) and the Z80 (a one address machine) have stacks, and the
B6800 (a zero address machine) has extra instructions for use in compiled
COBOL programs.
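
As a closing check on the two address format described at the start of this
lecture, the following Python sketch packs the five fields into one 16-bit
word and reproduces the MOV R3, -(SP) example. It is only an illustration:
the names MODES, encode_two_address and fields are hypothetical, the mode
codes and the MOV code are the ones quoted in the text, and index words and
byte instructions are not handled.

```python
# Mode codes as listed under "Primitive Addressing Modes".
# The table and function names are made up for this illustration.
MODES = {
    "register": 0b000,
    "register deferred": 0b001,
    "autoincrement": 0b010,
    "autoincrement deferred": 0b011,
    "autodecrement": 0b100,
    "autodecrement deferred": 0b101,
    "indexed": 0b110,
    "index deferred": 0b111,
}

MOV = 0b0001  # the 4-bit instruction code quoted in the lecture
SP = 6        # R6 is the stack pointer
PC = 7        # R7 is the program counter


def encode_two_address(opcode, src_mode, src_reg, dst_mode, dst_reg):
    """Pack opcode (4 bits) and two mode/register pairs (3 + 3 bits each)."""
    if not 0 <= opcode <= 0b1111:
        raise ValueError("instruction code must fit in 4 bits")
    for reg in (src_reg, dst_reg):
        if not 0 <= reg <= 7:
            raise ValueError("register number must be 0..7")
    return (
        (opcode << 12)
        | (MODES[src_mode] << 9)
        | (src_reg << 6)
        | (MODES[dst_mode] << 3)
        | dst_reg
    )


def fields(word):
    """Show a word grouped as in the lecture: 4 + 3 + 3 + 3 + 3 bits."""
    bits = format(word, "016b")
    return " ".join([bits[0:4], bits[4:7], bits[7:10], bits[10:13], bits[13:16]])


word = encode_two_address(MOV, "register", 3, "autodecrement", SP)
print(fields(word))         # 0001 000 011 100 110
print(format(word, "06o"))  # 010346
```

The same function also shows why immediate mode is simply autoincrement on
the program counter: encode_two_address(MOV, "autoincrement", PC, "register", 0)
gives 012700 in octal, the first word of MOV #X, R0, with X itself held in
the word that follows.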