Notes on the DG Nova Architecture for an lcc target

PROGRAM STRUCTURE

Code lives in the NREL relocatable segment. Constants and globals are in
ZREL for now, so they can be addressed by all code. This restricts the
total amount of data in globals and constants. Function pointers must
currently be addressable in relative mode.

PLATFORM CONVENTIONS

Bits are numbered with zero meaning the most significant ("high-order",
or leftmost) bit. The "right half" of a word means the least significant
byte. The "left half" means the most significant byte.

DATA TYPES

The native data type is the 16-bit word.

char   = 1 byte; a word stores 2 chars
int    = 1 word
long   = 2 words (double word)
float  = 2 words (single precision floating point, Nova format)
         1 sign bit, 7 bit exponent (base 16, excess 64), 24 bit mantissa
         stored exponent is true exponent+64; normalised to 1/16 <= mantissa < 1
         first word is Sign, Exponent, Mantissa bits 0-7 (most significant)
         second word is Mantissa bits 8-23
double = 4 words (double precision floating point, Nova format)
         1 sign bit, 7 bit exponent (base 16, excess 64), 56 bit mantissa
         first word is Sign, Exponent, Mantissa bits 0-7 (most significant)
         second word is Mantissa bits 8-23
         third word is Mantissa bits 24-39
         fourth word is Mantissa bits 40-55

Two FP registers are available: FPAC and TEMP, which store either single
or double precision values. There is also a FP status register, SR.
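
As a cross-check of the float layout above, the following C sketch decodes a
single precision value held in two 16-bit words into a host double. The
function name nova_single_to_double and the array-of-two-words interface are
illustrative assumptions, not part of lcc or the Nova toolchain. Bit 0 is
taken to be the most significant bit, as in PLATFORM CONVENTIONS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (an assumption for illustration only): decode a
 * Nova-format single precision float into a host double.
 *   words[0] = sign (bit 0), excess-64 exponent (bits 1-7),
 *              mantissa bits 0-7 (most significant)
 *   words[1] = mantissa bits 8-23
 * Bit 0 is the most significant bit of a 16-bit word. */
double nova_single_to_double(const uint16_t words[2])
{
    assert(words != NULL);

    int sign = (words[0] >> 15) & 1;
    int exponent = ((words[0] >> 8) & 0x7F) - 64;   /* true base-16 exponent */
    uint32_t mantissa = ((uint32_t)(words[0] & 0xFF) << 16) | words[1];

    /* The mantissa is a 24-bit fraction: value = mantissa / 2^24. */
    double value = (double)mantissa / 16777216.0;

    /* Scale by 16^exponent; multiplying or dividing by 16 is exact here. */
    for (; exponent > 0; exponent--)
        value *= 16.0;
    for (; exponent < 0; exponent++)
        value /= 16.0;

    return sign ? -value : value;
}
```

With this reading of the format, the words 0x4110 0x0000 hold stored
exponent 65 and mantissa 1/16, that is 1/16 * 16^1.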

INSTRUCTION SET

Four formats:

- No Accumulator-Effective Address
      JMP ea                        ; jump
      JSR ea                        ; jump to subroutine
      ISZ ea                        ; increment and skip if zero
      DSZ ea                        ; decrement and skip if zero

- One Accumulator-Effective Address
      LDA ac,ea                     ; load accumulator
      STA ac,ea                     ; store accumulator

- Two Accumulator-Multiple Operation
      ADD <#> acs,acd <,skip>  [6]  ; add
      SUB  "                   [5]  ; subtract
      NEG  "                   [1]  ; negate
      ADC  "                   [4]  ; add complement
      MOV  "                   [2]  ; move
      INC  "                   [3]  ; increment
      COM  "                   [0]  ; complement
      AND  "                   [7]  ; and

  These instructions combine the following operations:
  - initialise carry bit (do nothing, clear, set, complement)
  - perform arithmetic or logical operation on source accumulator value
  - shift result left or right, exchange bytes, or do nothing
  - optionally load result in destination accumulator
  - test result (R) and then always skip the next instruction, never skip,
    or only skip if C==0, C!=0, R==0, R!=0, C==0 | R==0, C!=0 & R!=0
    (SZC, SNC, SZR, SNR, SEZ, SBN)

- Input/Output
      incl MUL/DIV

- Stack Manipulation
      PSHA ac/POPA ac
      SAV
      MTSP ac/MFSP ac
      MTFP ac/MFFP ac

- Extended
      RET
      TRAP

PATTERNS

CNST - constant
    The Nova does not encode literal operands in instructions. Literal
    constants are loaded into accumulators either by short idiomatic
    instruction sequences or from memory (e.g. a data block at the end
    of a function).

        sub     ac,ac       ; generate 0
        ; from Programmer's Reference:
        subzl   ac,ac       ; generate +1
        adc     ac,ac       ; generate -1
        adczl   ac,ac       ; generate -2

ARG - argument
    ignored pattern?

ASGN - store accumulator into memory = STA ac,address
    (byte operations will use SBYT routine instead)

Loading constants:

INDIR - fetch (indirection); load accumulator from memory = LDA ac,address
    (byte operations will use LBYT routine instead)
    INDIRxx(ADDxx(addr,aci)) = LDA

CV - convert between integer types.
    sign extend when widening signed value.
    zero upper bits when widening unsigned value.
    these can be subroutines.
    no need to convert between floating types, just store single or store double

NEG,ADD,SUB,BAND,BCOM - arithmetic and logical operations

LSH,RSH,MOD,BOR,BXOR,DIV,MUL will be implemented as (frame-less) subroutines
with a fixed source and destination

CONDITIONALS

EQ,GE,GT,LE,LT,NE

XX(%0,%1) jumps to address when %0 XX %1. SUB# acs,acd tests acd-acs without
storing the result. SUBZ# does the same with carry cleared first, which leaves
carry set when acd >= acs, so the ordered tests below compare unsigned values.

    EQ(%0,%1) = SUB#  %1,%0,SNR  -then-  JMP address
    NE(%0,%1) = SUB#  %1,%0,SZR  -then-  JMP address
    GE(%0,%1) = SUBZ# %1,%0,SZC  -then-  JMP address
    GT(%0,%1) = SUBZ# %0,%1,SNC  -then-  JMP address
    LE(%0,%1) = SUBZ# %0,%1,SZC  -then-  JMP address
    LT(%0,%1) = SUBZ# %1,%0,SNC  -then-  JMP address

but this is not necessarily efficient if we have just had the opportunity
to skip on the result of an ALU operation... job for a peephole optimiser?

Floating point tests seem to require a subtraction (clobbers FPAC), then
load status into ac (clobbers ac), load mask into second ac (clobbers),
AND the two, then possibly skip the jump:

        ; test == 0
        .fmft
        .fss    %1
        .frst   0           ; clobbers
        .fmtf
        lda     1,eqzmask   ; clobbers
        and     0,1,szr
        jmp     %a          ; is equal
        ;... is not equal

gtzmask: 002000
eqzmask: 001000
ltzmask: 000400

CALL - function call = JSR addr
RET - return value ??
ADDRG[P2] - address of global = label?
ADDRF[P2] - address of function parameter = offset from frame pointer
ADDRL[P2] - address of local variable = offset from frame pointer
JUMP[Void] - unconditional jump
LABEL[Void] - label definition

Multiply and divide are either performed by a hardware option or emulated
in software.

"Two 16-bit fixed point operands can be multiplied together to yield a
32-bit fixed point result. A 16-bit fixed point operand can be divided
into a 32-bit fixed point operand to yield a 16-bit fixed point quotient
and a 16-bit fixed point remainder."

USING BYTE POINTERS

By convention, a special pointer type is required to point to char:

"A byte pointer is a word in which bits 0-14 are the address in memory of
a 2-byte word. Bit 15 of the byte pointer is the 'byte indicator'.
If the byte indicator is 0, the referenced byte is the high-order (bits 0-7)
byte of the word addressed by byte pointer bits 0-14. If the byte indicator
is 1, the referenced byte is the low-order (bits 8-15) byte of the word
addressed by byte pointer bits 0-14."

(This is big-endian ordering.)

Access to bytes is via instruction sequences. Examples of these are given
in the Nova Programmer's manual:

; 21. Load a byte from memory. The routine is called via a JSR.
; The byte pointer for the requested byte is in AC2.
; The requested byte is returned in the right half of AC0.
; The left half of AC0 is set to 0.
; AC1, AC2 and the carry bit are unchanged. AC3 is destroyed.

lbyt:   sta     3,lret      ; save return address
        lda     3,mask
        movr    2,2,snc     ; turn byte pointer into word address
                            ; and skip if requested byte is right byte
        movs    3,3         ; swap mask if requested byte is left byte
        lda     0,0,2       ; place word in AC0
        and     3,0,snc     ; mask off unwanted byte
                            ; and skip if swap is not needed
        movs    0,0         ; swap requested byte into right half of AC0
        movl    2,2         ; restore byte pointer and carry
        jmp     @lret       ; return
lret:   0                   ; return location
mask:   377

; 22. Store a byte in memory. The routine is called via a JSR.
; The byte to be stored is in the right half of AC0
; with the left half of AC0 set to 0. The byte pointer is in AC2.
; The word written is returned in AC0. AC1, AC2 and the carry bit
; are unchanged. AC3 is destroyed.

sbyt:   sta     3,sret      ; save return
        sta     1,sac1      ; save AC1
        lda     3,mask
        movr    2,2,snc     ; convert byte pointer to word address
                            ; and skip if byte is to be right half
        movs    0,0,skp     ; swap byte and leave mask alone
        movs    3,3         ; swap mask
        lda     1,0,2       ; load word that is to receive byte
        and     3,1         ; mask off byte that is to receive new byte
        add     1,0         ; add memory word on top of new byte
        sta     0,0,2       ; store word with new byte
        movl    2,2         ; restore byte pointer and carry
        lda     1,sac1      ; restore AC1
        jmp     @sret       ; return
sret:   0                   ; return location
sac1:   0
mask:   377

DOUBLE WORD ARITHMETIC

In a double word (long), the first word is the high order word, and the
second is the low order word (big endian ordering).

Arithmetic on double word quantities is by instruction sequences, examples
of which appear in Programmer's Reference Appendix D:

; To negate the double length number whose high and low-order words
; respectively are in AC0 and AC1. We negate the low-order part,
; but we simply complement the high-order part unless the low order part
; is zero. Hence

        neg     1,1,snr
        neg     0,0,skp     ; low order zero
        com     0,0         ; low order non-zero

; In unsigned addition a carry indicates that the low-order result
; is just too large and the high-order part must be increased.
; We add the number in AC2 and AC3 to the number in AC0 and AC1.

        addz    3,1,szc
        inc     0,0
        add     2,0

; In two's complement subtraction a carry should occur
; unless the subtrahend is too large. We could increment as in addition,
; but since incrementing in the high-order part is precisely the
; difference between a one's complement and a two's complement,
; we can always manage with only two instructions.
; We subtract the number in AC2 and AC3 from that in AC0 and AC1.

        subz    3,1,szc
        sub     2,0,skp
        adc     2,0

THE STACK FRAME

The Nova maintains a stack pointer and a frame pointer. The stack pointer
always points to the top word on the stack. These registers are accessed
using the MTSP, MFSP, MTFP and MFFP instructions.

The Nova Programmer's Reference (Appendix E) states the following:

"The basic method of transferring control to a subroutine is via a JUMP TO
SUBROUTINE instruction. The subroutine executes a SAVE instruction at the
subroutine entry point and returns control via the RETURN instruction.

        ; calling program
call:   jsr     subr
        ...
        ...

        ; subroutine
subr:   sav
        ...
        ...
retrn:  ret

This method has the following characteristics:

1. AC3 of the calling program is destroyed by the JSR.
2. The call is only one word.
3. Upon return to the calling program, AC3 contains the calling program's
   stack pointer.
4. A SAVE instruction is required at each entry point.
   [Unless a stack frame is not required?]
5. Arguments are easily passed on the stack because SAVE sets up the frame
   pointer for the called routine and RETURN places the frame pointer of
   the calling routine in AC3."

The frame pointer is used to reference the stack frame for the calling and
called function.

"The frame pointer usually points to the first available word minus 1 in
the current frame."

If a stack frame is required, the called function executes the SAV
instruction, which pushes the 5 word return block (AC0, AC1, AC2, previous
frame pointer, carry/AC3). This also sets the frame pointer and AC3 to the
new stack pointer.

Initially the stack frame has no space allocated for local data. In the
function prologue, space may be reserved using PSHA (push accumulator) and
released by the epilogue using POPA (pop accumulator). Alternatively, the
stack pointer may be changed using MTSP (move to stack pointer).

If a stack frame is used, the return block is released using RET, which
also jumps to the subroutine return address.

(A function without a stack frame returns using JMP 0,3. If needed, it is
conventional to save the return address - AC3 on entry - in a temporary
location [see LBYT and SBYT above], but this of course makes the subroutine
non-reentrant. A stack frame is required for re-entrancy.)

"Variables and arguments can be transmitted from the calling routine to the
called routine by placing them in prearranged positions in the calling
routine's stack frame. Because the SAVE instruction sets the frame pointer
to the last word in the return block, these variables and arguments can be
referenced by the called program as a negative displacement from the frame
pointer."

The structure of a stack frame after a function call and callee prologue is
shown below. The arguments and the function return value are allocated by
the caller; the return block (SAV) and local data are allocated by the
called function:

                 :                                   lower addresses
                 | caller's stack frame
                 |----------                         call setup:
    FP-5-M       | 1st callee argument
    ...          | ...
    FP-5-1       | Mth callee argument
    FP-5         | return value
                 |----------                         after JSR / after RET
    FP-4         | 1st word of return block (AC0)
    ...          | 2nd word of return block (AC1)
    ...          | 3rd word of return block (AC2)
    FP-1         | 4th word of return block (previous frame pointer)
    FP = AC3->   | last word of return block (carry/AC3 = return PC)
                 |----------                         after SAV / before RET
    FP+1         | 1st word of function local data
    ...          | ...                               higher addresses
    FP+N = SP->  | Nth word of function local data; top of stack
                 |----------                         after function prologue

Because AC0-2 are preserved by the SAV/RET mechanism (and AC3 is destroyed),
the function result cannot be returned in any of the accumulators. However
it is convenient to store it at a fixed frame offset (FP-5 above).
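
The frame offsets in the layout above can be written down as a few small
helpers, which is roughly what a code generator needs when it emits
ADDRF and ADDRL references. This is a minimal sketch under the layout shown
in these notes; the names frame_result_offset, frame_arg_offset and
frame_local_offset are hypothetical and do not come from lcc.

```c
#include <assert.h>

/* Size of the return block pushed by SAV, per the notes above:
 * AC0, AC1, AC2, previous frame pointer, carry/AC3. */
enum { RETURN_BLOCK_WORDS = 5 };

/* All offsets are in words, relative to the frame pointer (FP) as it
 * stands after SAV. These helpers are illustrative assumptions only. */

/* Offset of the caller-allocated return value slot (FP-5). */
int frame_result_offset(void)
{
    return -RETURN_BLOCK_WORDS;
}

/* Offset of the i-th argument (1-based) of a call with nargs arguments.
 * The 1st argument is at FP-5-M and the Mth at FP-5-1, where M = nargs. */
int frame_arg_offset(int i, int nargs)
{
    assert(nargs >= 1 && i >= 1 && i <= nargs);
    return -RETURN_BLOCK_WORDS - nargs + (i - 1);
}

/* Offset of the j-th word (1-based) of local data (FP+1 .. FP+N). */
int frame_local_offset(int j)
{
    assert(j >= 1);
    return j;
}
```

For a call with three arguments this gives offsets -8, -7 and -6 for the
arguments, -5 for the return value, and +1 upwards for locals.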