ast@cs.vu.nl (Andy Tanenbaum) (11/12/88)
In article <446@eecea.eece.ksu.edu> hardin@eecea.UUCP (David Hardin) writes:
>Does anyone have a complete description of the EM used in Minix?

I have now prepared a version of the EM manual for public consumption.
This message and the one following it contain the description of the EM
instructions, along with an interpreter in Pascal that lets you see what
the instructions actually do. I can email the complete manual to people
who are seriously interested, but it is quite large.

Andy Tanenbaum (ast@cs.vu.nl)

------------------------- Extract from EM manual, part 1 of 2 --------------

10. EM MACHINE LANGUAGE

The EM machine language is designed to make program text compact and to
make decoding easy. Compact program text has many advantages: programs
execute faster, programs occupy less primary and secondary storage, and
loading programs into satellite processors is faster. The decoding of EM
machine language is so simple that it is feasible to use interpreters as
long as EM hardware machines are not available. This chapter is irrelevant
when back ends are used to produce executable target machine code.

10.1 Instruction encoding

A design goal of EM is to make the program text as compact as possible.
Decoding must be easy, however. The encoding is fully byte oriented,
without any small bit fields. There are 256 primary opcodes, two of which
are an escape to two groups of 256 secondary opcodes each.

EM instructions without arguments have a single opcode assigned, possibly
escaped:

    |--------------|
    |    opcode    |
    |--------------|
or
    |--------------|--------------|
    |    escape    |    opcode    |
    |--------------|--------------|

The encoding for instructions with an argument is more complex. Several
instructions have an address from the global data area as argument. Other
instructions have different opcodes for positive and negative arguments.
There is always an opcode that takes the next two bytes as argument, high
byte first:

    |--------------|--------------|--------------|
    |    opcode    |    hibyte    |    lobyte    |
    |--------------|--------------|--------------|
or
    |--------------|--------------|--------------|--------------|
    |    escape    |    opcode    |    hibyte    |    lobyte    |
    |--------------|--------------|--------------|--------------|

An extra escape is provided for instructions with four or eight byte
arguments.

    |--------------|--------------|--------------|   |--------------|
    |    ESCAPE    |    opcode    |    hibyte    |...|    lobyte    |
    |--------------|--------------|--------------|   |--------------|

For most instructions some argument values predominate. The most frequent
combinations of instruction and argument will be encoded in a single byte,
called a mini:

    |---------------|
    |opcode+argument|  (mini)
    |---------------|

The number of minis is restricted, because only 254 primary opcodes are
available. Many instructions have the bulk of their arguments fall in the
range 0 to 255. Instructions that address global data have their arguments
distributed over a wider range, but small values of the high byte are
common. For all these cases there is another encoding that combines the
instruction and the high byte of the argument into a single opcode. These
opcodes are called shorties. Shorties may be escaped.

    |--------------|--------------|
    | opcode+high  |    lobyte    |  (shortie)
    |--------------|--------------|
or
    |--------------|--------------|--------------|
    |    escape    | opcode+high  |    lobyte    |
    |--------------|--------------|--------------|

Escaped shorties are useless if the normal encoding has a primary opcode.
Note that for some instruction-argument combinations several different
encodings are available. It is the task of the assembler to select the
shortest of these. The savings by these mini and shortie opcodes are
considerable, about 55%.

Further improvements are possible: the arguments of many instructions are
a multiple of the wordsize.
Some also do not allow zero as an argument. If these arguments are divided
by the wordsize and, when zero is not allowed, then decremented by 1, more
of them can be encoded as shortie or mini. The arguments of some other
instructions rarely or never assume the value 0, but start at 1. The value
1 is then encoded as 0, 2 as 1 and so on.

Assigning opcodes to instructions by the assembler is completely table
driven. For details see appendix B.

10.2 Procedure descriptors

The procedure identifiers used in the interpreter are indices into a table
of procedure descriptors. Each descriptor contains:

    1. the number of bytes to be reserved for locals at each invocation.
       This is a pointer-sized integer.
    2. the start address of the procedure

10.3 Load format

The EM machine language load format defines the interface between the EM
assembler/loader and the EM machine itself. A load file consists of a
header, the program text to be executed, a description of the global data
area and the procedure descriptor table, in this order. All integers in
the load file are presented with the least significant byte first.

The header has two parts: the first half (eight 16-bit integers) aids in
selecting the correct EM machine or interpreter. Some EM machines, for
instance, may have hardware floating point instructions. The header
entries are as follows (bit 0 is rightmost):

    1: magic number (07255)
    2: flag bits with the following meaning:
         bit 0   TEST; test for integer overflow etc.
         bit 1   PROFILE; for each source line: count the number of
                 memory cycles executed.
         bit 2   FLOW; for each source line: set a bit in a bit map
                 table if instructions on that line are executed.
         bit 3   COUNT; for each source line: increment a counter if
                 that line is entered.
         bit 4   REALS; set if a program uses floating point
                 instructions.
         bit 5   EXTRA; more tests during compiler debugging.
    3: number of unresolved references.
    4: version number; used to detect obsolete EM load files.
    5: wordsize; the number of bytes in each machine word.
    6: pointer size; the number of bytes available for addressing.
    7: unused
    8: unused

The second part of the header (eight entries, of pointer size bytes each)
describes the load file itself:

    1: NTEXT; the program text size in bytes.
    2: NDATA; the number of load-file descriptors (see below).
    3: NPROC; the number of entries in the procedure descriptor table.
    4: ENTRY; procedure number of the procedure to start with.
    5: NLINE; the maximum source line number.
    6: SZDATA; the address of the lowest uninitialized data byte.
    7: unused
    8: unused

The program text consists of NTEXT bytes. NTEXT is always a multiple of
the wordsize. The first byte of the program text is the first byte of the
instruction address space, i.e. it has address 0. Pointers into the
program text are found in the procedure descriptor table, where relocation
is simple, and in the global data area. The initialization of the global
data area allows easy relocation of pointers into both address spaces.

The global data area is described by the NDATA descriptors. Each
descriptor describes a number of consecutive words (of wordsize) and
consists of a sequence of bytes. While reading the descriptors from the
load file, one can initialize the global data area from low to high
addresses. The size of the initialized data area is given by SZDATA; this
number can be used to check the initialization.

The header of each descriptor consists of a byte, describing the type, and
a count. The number of bytes used for this (unsigned) count depends on the
type of the descriptor and is either a pointer-sized integer or one byte.
The meaning of the count depends on the descriptor type.

At load time an interpreter can perform any conversion deemed necessary,
such as reordering bytes in integers and pointers and adding base
addresses to pointers.

In the following pictures we show a graphical notation of the
initializers.
The leftmost rectangle represents the leading byte.

Fields marked with
    n   contain a pointer-sized integer used as a count
    m   contain a one-byte integer used as a count
    b   contain a one-byte integer
    w   contain a wordsized integer
    p   contain a data or instruction pointer
    s   contain a null terminated ASCII string

       -------------------
       | 0 |      n      |       repeat last initialization n times
       -------------------

       ---------
       | 1 | m |                 m uninitialized words
       ---------

                 ____________
                /   bytes    \
       -----------------   -----
       | 2 | m | b | b |...| b |     m initialized bytes
       -----------------   -----

                 _________
                /  word   \
       -----------------------
       | 3 | m |      w      |...    m initialized wordsized integers
       -----------------------

                 _________
                / pointer \
       -----------------------
       | 4 | m |      p      |...    m initialized data pointers
       -----------------------

                 _________
                / pointer \
       -----------------------
       | 5 | m |      p      |...    m initialized instruction pointers
       -----------------------

                 ____________
                /   bytes    \
       -------------------------
       | 6 | m | b | b |...| b |     initialized integer of size m
       -------------------------

                 ____________
                /   bytes    \
       -------------------------
       | 7 | m | b | b |...| b |     initialized unsigned of size m
       -------------------------

                 ____________
                /   string   \
       -------------------------
       | 8 | m |       s       |     initialized float of size m
       -------------------------

type 0:
     If the last initialization initialized k bytes starting at address
     a, do the same initialization again n times, starting at a+k,
     a+2*k, .... a+n*k. This is the only descriptor whose starting byte
     is followed by an integer with the size of a pointer; in all other
     descriptors the first byte is followed by a one-byte count. This
     descriptor must be preceded by a descriptor of another type.

type 1:
     Reserve m words, not explicitly initialized (BSS and HOL).

type 2:
     The m bytes following the descriptor header are initializers for
     the next m bytes of the global data area. m is divisible by the
     wordsize.
type 3:
     The m words following the header are initializers for the next m
     words of the global data area.

type 4:
     The m data address space pointers following the header are
     initializers for the next m data pointers in the global data area.
     Interpreters that represent EM pointers by target machine addresses
     must relocate all data pointers.

type 5:
     The m instruction address space pointers following the header are
     initializers for the next m instruction pointers in the global data
     area. Interpreters that represent EM instruction pointers by target
     machine addresses must relocate these pointers.

type 6:
     The m bytes following the header form a signed integer number with
     a size of m bytes, which is an initializer for the next m bytes of
     the global data area. m is governed by the same restrictions as for
     transfer of objects to/from memory.

type 7:
     The m bytes following the header form an unsigned integer number
     with a size of m bytes, which is an initializer for the next m
     bytes of the global data area. m is governed by the same
     restrictions as for transfer of objects to/from memory.

type 8:
     The header is followed by an ASCII string, null terminated, to
     initialize, in global data, a floating point number with a size of
     m bytes. m is governed by the same restrictions as for transfer of
     objects to/from memory. The ASCII string contains the notation of a
     real as used in the Pascal language.

The NPROC procedure descriptors on the load file consist of an instruction
space address (of pointer size) and an integer (of pointer size)
specifying the number of bytes for locals.

11. EM ASSEMBLY LANGUAGE

We use two representations for assembly language programs, one is in ASCII
and the other is the compact assembly language. The latter needs less
space than the first for the same program and therefore allows faster
processing. Our only program accepting ASCII assembly language converts it
to the compact form. All other programs expect compact assembly input.
The first part of the chapter describes the ASCII assembly language and
its semantics. The second part describes the syntax of the compact
assembly language. The last part lists the EM instructions with the type
of arguments allowed and an indication of the function. Appendix A gives a
detailed description of the effect of all instructions in the form of a
Pascal program.

11.1 ASCII assembly language

An assembly language program consists of a series of lines; each line may
be blank, contain one (pseudo)instruction or contain one label. Input to
the assembler is in lower case. Upper case is used in this document merely
to distinguish keywords from the surrounding prose. Comment is allowed at
the end of each line and starts with a semicolon ";". This kind of comment
does not exist in the compact form.

Labels must be placed all by themselves on a line and start in column 1.
There are two kinds of labels, instruction and data labels. Instruction
labels are unsigned positive integers. The scope of an instruction label
is its procedure.

The pseudoinstructions CON, ROM and BSS may be preceded by a line
containing a 1-8 character data label, the first character of which is a
letter, period or underscore. The period may only be followed by digits,
the others may be followed by letters, digits and underscores. The use of
the character "." followed by a constant, which must be in the range 1 to
32767 (e.g. ".40") is recommended for compiler generated programs. These
labels are considered as a special case and handled more efficiently in
compact assembly language (see below). Note that a data label on its own
or two consecutive labels are not allowed.

Each statement may contain an instruction mnemonic or pseudoinstruction.
These must begin in column 2 or later (not column 1) and must be followed
by a space, tab, semicolon or LF. Everything on the line following a
semicolon is taken as a comment.

Each input file contains one module.
A module may contain many procedures, which may be nested. A procedure
consists of a PRO statement, a (possibly empty) collection of instructions
and pseudoinstructions and finally an END statement. Pseudoinstructions
are also allowed between procedures. They do not belong to a specific
procedure.

All constants in EM are interpreted in the decimal base. The ASCII
assembly language accepts constant expressions wherever constants are
allowed. The operators recognized are: +, -, *, % and / with the usual
precedence order. Use of the parentheses ( and ) to alter the precedence
order is allowed.

11.1.1 Instruction arguments

Unlike many other assembly languages, the EM assembly language requires
all arguments of normal and pseudoinstructions to be either a constant or
an identifier, but not a combination of these two. There is one exception
to this rule: when a data label is used for initialization or as an
instruction argument, expressions of the form 'label+constant' and
'label-constant' are allowed. This makes it possible to address, for
example, the third word of a ten word BSS block directly. Thus LOE LABEL+4
is permitted and so is CON LABEL+3. The resulting address must be in the
same fragment as the label. It is not allowed to add or subtract from
instruction labels or procedure identifiers, which certainly is not a
severe restriction and greatly aids optimization.

Instruction arguments can be constants, data labels, data labels offsetted
by a constant, instruction labels and procedure identifiers. The range of
integers allowed depends on the instruction. Most instructions allow only
integers (signed or unsigned) that fit in a word. Arguments used as
offsets to pointers should fit in a pointer-sized integer. Finally,
arguments to LDC should fit in a double-word integer.

Several instructions have two possible forms: with an explicit argument
and with an implicit argument on top of the stack. The size of the
implicit argument is the wordsize.
The implicit argument is always popped before all other operands. For
example: 'CMI 4' specifies that two four-byte signed integers on top of
the stack are to be compared. 'CMI' without an argument expects a
wordsized integer on top of the stack that specifies the size of the
integers to be compared. Thus the following two sequences are equivalent:

     LDL -10             LDL -10
     LDL -14             LDL -14
                         LOC 4
     CMI 4               CMI
     ZEQ *1              ZEQ *1

Section 11.1.6 shows the arguments allowed for each instruction.

11.1.2 Pseudoinstruction arguments

Pseudoinstruction arguments can be divided in two classes: initializers
and others. The following initializers are allowed: signed integer
constants, unsigned integer constants, floating-point constants, strings,
data labels, data labels offsetted by a constant, instruction labels and
procedure identifiers.

Constant initializers in BSS, HOL, CON and ROM pseudoinstructions can be
followed by a letter I, U or F. This indicator specifies the type of the
initializer: Integer, Unsigned or Float. If no indicator is present, I is
assumed. The size of the initializer is the wordsize unless the indicator
is followed by an integer specifying the initializer's size. This integer
is governed by the same restrictions as for transfer of objects to/from
memory. As in instruction arguments, initializers include expressions of
the form "LABEL+offset" and "LABEL-offset". The offset must be an unsigned
decimal constant. The 'IUF' indicators cannot be used in the offsets.

Data labels are referred to by their name.

Strings are surrounded by double quotes ("). Semicolons in strings do not
indicate the start of a comment. In the ASCII representation the escape
character \ (backslash) alters the meaning of subsequent character(s).
This feature allows inclusion of zeroes, graphic characters and the double
quote in the string.
The following escape sequences exist:

     newline            NL(LF)   \n
     horizontal tab     HT       \t
     backspace          BS       \b
     carriage return    CR       \r
     form feed          FF       \f
     backslash          \        \\
     double quote       "        \"
     bit pattern        ddd      \ddd

The escape \ddd consists of the backslash followed by 1, 2, or 3 octal
digits specifying the value of the desired character. If the character
following a backslash is not one of those specified, the backslash is
ignored. Example: CON "hello\012\0". Each string element initializes a
single byte. The ASCII character set is used to map characters onto
values.

Instruction labels are referred to as *1, *2, etc. in both branch
instructions and as initializers.

The notation $procname means the identifier for the procedure with the
specified name. This identifier has the size of a pointer.

11.1.3 Notation

First, the notation used for the arguments, classes of instructions and
pseudoinstructions.

     <cst>   = integer constant (current range -2**31..2**31-1)
     <dlb>   = data label
     <arg>   = <cst> or <dlb> or <dlb>+<cst> or <dlb>-<cst>
     <con>   = integer constant, unsigned constant, floating-point constant
     <str>   = string constant (surrounded by double quotes)
     <ilb>   = instruction label: '*' followed by an integer in the
               range 0..32767
     <pro>   = procedure number ('$' followed by a procedure name)
     <val>   = <arg>, <con>, <pro> or <ilb>
     <par>   = <val> or <str>
     <...>*  = zero or more of <...>
     <...>+  = one or more of <...>
     [...]   = optional ...

11.1.4 Pseudoinstructions

11.1.4.1 Storage declaration

Initialized global data is allocated by the pseudoinstruction CON, which
needs at least one argument. Each argument is used to allocate and
initialize a number of consecutive bytes in data memory. The number of
bytes to be allocated and the alignment depend on the type of the
argument. For each argument, an integral number of words, determined by
the argument type, is allocated and initialized.
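
To make the string rules of section 11.1.2 concrete for a string argument
to CON, here is a minimal Python sketch. The function name
decode_em_string is hypothetical and not part of any EM tool. Two details
are assumptions rather than statements from the manual: octal values
above 255 are reduced to one byte, and a lone backslash at the very end
of the string is kept as a backslash.

```python
# Single-character escapes from the table in section 11.1.2.
SIMPLE_ESCAPES = {"n": 10, "t": 9, "b": 8, "r": 13, "f": 12, "\\": 92, '"': 34}
OCTAL_DIGITS = "01234567"


def decode_em_string(body):
    """Turn the text between the double quotes into the bytes it denotes.

    `body` is assumed to contain ASCII characters only.
    """
    out = bytearray()
    i = 0
    while i < len(body):
        ch = body[i]
        i += 1
        if ch != "\\" or i >= len(body):
            # Ordinary character (or a trailing backslash: assumption).
            out.append(ord(ch))
            continue
        nxt = body[i]
        if nxt in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[nxt])
            i += 1
        elif nxt in OCTAL_DIGITS:
            # \ddd: one to three octal digits.
            j = i
            while j < len(body) and j < i + 3 and body[j] in OCTAL_DIGITS:
                j += 1
            out.append(int(body[i:j], 8) & 0xFF)  # masking is an assumption
            i = j
        # Any other character: the backslash is ignored and the character
        # itself is handled as an ordinary one on the next iteration.
    return bytes(out)
```

Applied to the manual's own example, the body of CON "hello\012\0" yields
seven bytes: the five letters, a newline (octal 012) and a zero byte.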
The pseudoinstruction ROM is the same as CON, except that it guarantees
that the initialized words will not change during the execution of the
program. This information allows optimizers to do certain calculations
such as array indexing and subrange checking at compile time instead of at
run time.

The pseudoinstruction BSS allocates uninitialized global data or large
blocks of data initialized by the same value. The first argument to this
pseudo is the number of bytes required, which must be a multiple of the
wordsize. The other arguments specify the value used for initialization
and whether the initialization is only for convenience or a strict
necessity.

The pseudoinstruction HOL is similar to BSS in that it requests an
(un)initialized global data block. Addressing of a HOL block, however, is
quasi absolute. The first byte is addressed by 0, the second byte by 1
etc. in assembly language. The assembler/loader adds the base address of
the HOL block to these numbers to obtain the absolute address in the
machine language.

The scope of a HOL block starts at the HOL pseudo and ends at the next HOL
pseudo or at the end of a module, whichever comes first. Each instruction
falls in the scope of at most one HOL block, the current HOL block. It is
not allowed to have more than one HOL block per procedure.

The alignment restrictions are enforced by the pseudoinstructions. All
initializers are aligned on a multiple of their size or the wordsize,
whichever is smaller. Strings form an exception; they are to be seen as a
sequence of initializers each for one byte, i.e. strings are not padded
with zero bytes. Switching to another type of fragment or placing a label
forces word-alignment. There are three types of fragments in global data
space: CON, ROM and BSS/HOL.

BSS <cst1>,<val>,<cst2>
     Reserve <cst1> bytes. <val> is the value used to initialize the
     area. <cst1> must be a multiple of the size of <val>. <cst2> is 0
     if the initialization is not strictly necessary, 1 if it is.

HOL <cst1>,<val>,<cst2>
     Same as BSS, but all following absolute global data references
     will refer to this block. Only one HOL is allowed per procedure; it
     has to be placed before the first instruction.

CON <val>+
     Assemble global data words initialized with the <val> constants.

ROM <val>+
     Same as CON, but the initialized data will never be changed by the
     program.

11.1.4.2 Partitioning

Two pseudoinstructions partition the input into procedures:

PRO <pro>[,<cst>]
     Start of procedure. <pro> is the procedure name. <cst> is the
     number of bytes for locals. The number of bytes for locals must be
     specified in the PRO or END pseudoinstruction. When specified in
     both, they must be identical.

END [<cst>]
     End of procedure. <cst> is the number of bytes for locals. The
     number of bytes for locals must be specified in either the PRO or
     END pseudoinstruction or both.

11.1.4.3 Visibility

Names of data and procedures in an EM module can either be internal or
external. External names are known outside the module and are used to link
several pieces of a program. Internal names are not known outside the
modules they are used in. Other modules will not 'see' an internal name.

To reduce the number of passes needed, it must be known at the first
occurrence whether a name is internal or external. If the first occurrence
of a name is in a definition, the name is considered to be internal. If
the first occurrence of a name is a reference, the name is considered to
be external. If the first occurrence is in one of the following
pseudoinstructions, the effect of the pseudo has precedence.

EXA <dlb>
     External name. <dlb> is known, possibly defined, outside this
     module. Note that <dlb> may be defined in the same module.

EXP <pro>
     External procedure identifier. Note that <pro> may be defined in
     the same module.

INA <dlb>
     Internal name. <dlb> is internal to this module and must be defined
     in this module.

INP <pro>
     Internal procedure. <pro> is internal to this module and must be
     defined in this module.

11.1.4.4 Miscellaneous

Two other pseudoinstructions provide miscellaneous features:

EXC <cst1>,<cst2>
     Two blocks of instructions preceding this one are interchanged
     before being processed. <cst1> gives the number of lines of the
     first block. <cst2> gives the number of lines of the second one.
     Blank and pure comment lines do not count.

MES <cst>[,<par>]*
     A special type of comment. Used by compilers to communicate with
     the optimizer, assembler, etc. as follows:

     MES 0
          An error has occurred, stop further processing.

     MES 1
          Suppress optimization.

     MES 2,<cst1>,<cst2>
          Use wordsize <cst1> and pointer size <cst2>.

     MES 3,<cst1>,<cst2>,<cst3>,<cst4>
          Indicates that a local variable is never referenced
          indirectly. Used to indicate that a register may be used for
          a specific variable. <cst1> is offset in bytes from AB if
          positive and offset from LB if negative. <cst2> gives the
          size of the variable. <cst3> indicates the class of the
          variable. The following values are currently recognized:
               0  The variable can be used for anything.
               1  The variable is used as a loopindex.
               2  The variable is used as a pointer.
               3  The variable is used as a floating point number.
          <cst4> gives the priority of the variable, higher numbers
          indicate better candidates.

     MES 4,<cst>,<str>
          Number of source lines in file <str> (for profiler).

     MES 5
          Floating point used.

     MES 6,<val>*
          Comment. Used to provide comments in compact assembly
          language.

     MES 7,.....
          Reserved.

     MES 8,<pro>[,<dlb>]...
          Library module. Indicates that the module may only be loaded
          if it is useful, that is, if it can satisfy any unresolved
          references during the loading process. May not be preceded by
          any other pseudo, except MES's.

     MES 9,<cst>
          Guarantees that no more than <cst> bytes of parameters are
          accessed, either directly or indirectly.

     MES 10,<cst>[,<par>]*
          This message number is reserved for the global optimizer.
          It inserts these messages in its output as hints to backends.
          <cst> indicates the type of hint.

     MES 11
          Procedures containing this message are possible destinations
          of non-local goto's with the GTO instruction. Some backends
          keep locals in registers; the locals in this procedure should
          not be kept in registers and all registers containing locals
          of other procedures should be saved upon entry to this
          procedure.

     Each backend is free to skip irrelevant MES pseudos.

11.2 The Compact Assembly Language

The assembler accepts input in a highly encoded form. This form is
intended to reduce the amount of file transport between the front ends,
optimizers and back ends, and also reduces the amount of storage required
for storing libraries. Libraries are stored as archived compact assembly
language, not machine language.

When beginning to read the input, the assembler is in neutral state, and
expects either a label or an instruction (including the
pseudoinstructions). The meaning of the next byte(s) when in neutral state
is as follows, where b1, b2 etc. represent the succeeding bytes.

       0      Reserved for future use
     1-129    Machine instructions, see Appendix A, alphabetical list
   130-149    Reserved for future use
   150-161    BSS,CON,END,EXA,EXC,EXP,HOL,INA,INP,MES,PRO,ROM
   162-179    Reserved for future pseudoinstructions
   180-239    Instruction labels 0 - 59 (180 is local label 0 etc.)
   240-244    See the Common Table below
   245-255    Not used

After a label, the assembler is back in neutral state; it can immediately
accept another label or an instruction in the next byte. No linefeeds are
used to separate lines.

If an opcode expects no arguments, the assembler is back in neutral state
after reading the one byte containing the instruction number. If it has
one or more arguments (only pseudos have more than 1), the arguments
follow directly, encoded as follows:

     0-239    Offsets from -120 to 119
   240-255    See the Common Table below

Absence of an optional argument is indicated by a special byte.
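
As a rough illustration of the two byte tables above, the following
Python sketch classifies a byte read in neutral state and decodes the
one-byte argument form. The function names and the result tuples are
hypothetical; only the byte ranges are taken from the tables. Bytes the
manual lists as "reserved" or "not used" are lumped together here, which
is a simplification.

```python
# Pseudoinstructions in the order the manual assigns them to bytes 150-161.
PSEUDOS = ["BSS", "CON", "END", "EXA", "EXC", "EXP",
           "HOL", "INA", "INP", "MES", "PRO", "ROM"]


def classify_neutral(byte):
    """Describe what a byte means when the assembler is in neutral state."""
    if not 0 <= byte <= 255:
        raise ValueError("not a byte: %r" % (byte,))
    if 1 <= byte <= 129:
        return ("instruction", byte)
    if 150 <= byte <= 161:
        return ("pseudo", PSEUDOS[byte - 150])
    if 180 <= byte <= 239:
        return ("label", byte - 180)      # 180 is local label 0
    if 240 <= byte <= 244:
        return ("common", byte)           # continue in the Common Table
    return ("reserved", byte)             # reserved or not used


def decode_small_argument(byte):
    """Decode the one-byte argument form: bytes 0-239 mean -120..119."""
    if not 0 <= byte <= 239:
        raise ValueError("byte %d needs the Common Table" % byte)
    return byte - 120
```

With these ranges, byte 151 is the CON pseudoinstruction and byte 182 is
instruction label 2; argument byte 130 stands for the constant 10.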
Common Table for Neutral State and Arguments

class     bytes              description

<ilb>     240 b1             Instruction label b1 (Not used for branches)
<ilb>     241 b1 b2          16 bit instruction label (256*b2 + b1)
<dlb>     242 b1             Global label .0-.255, with b1 being the label
<dlb>     243 b1 b2          Global label .0-.32767
                             with 256*b2+b1 being the label
<dlb>     244 <string>       Global symbol not of the form .nnn
<cst>     245 b1 b2          16 bit constant
<cst>     246 b1 b2 b3 b4    32 bit constant
<cst>     247 b1 .. b8       64 bit constant
<arg>     248 <dlb><cst>     Global label + (possibly negative) constant
<pro>     249 <string>       Procedure name (not including $)
<str>     250 <string>       String used in CON or ROM (no quotes-no escapes)
<con>     251 <cst><string>  Integer constant, size <cst> bytes
<con>     252 <cst><string>  Unsigned constant, size <cst> bytes
<con>     253 <cst><string>  Floating constant, size <cst> bytes
          254                unused
<end>     255                Delimiter for argument lists or
                             indicates absence of optional argument

The bytes specifying the value of a 16, 32 or 64 bit constant are
presented in two's complement notation, with the least significant byte
first. For example: the value of a 32 bit constant is
((s4*256+b3)*256+b2)*256+b1, where s4 is b4-256 if b4 is 128 or greater;
otherwise s4 takes the value of b4. A <string> consists of a <cst>
immediately followed by a sequence of bytes with length <cst>.

The pseudoinstructions fall into several categories, depending on their
arguments:

     Group 1 - EXC, BSS, HOL have a known number of arguments
     Group 2 - EXA, EXP, INA, INP have a string as argument
     Group 3 - CON, MES, ROM have a variable number of various things
     Group 4 - END, PRO have a trailing optional argument.

Groups 1 and 2 use the encoding described above. Group 3 also uses the
encoding listed above, with an <end> byte after the last argument to
indicate the end of the list. Group 4 uses an <end> byte if the trailing
argument is not present.
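
The constant and string encodings of the Common Table can be sketched in
a few lines of Python. This is an illustration under stated assumptions,
not code from the EM assembler: the names encode_cst, decode_cst and
encode_string are hypothetical, and always choosing the shortest form
that fits is an assumption about what an encoder would do.

```python
# Tags from the Common Table for 16, 32 and 64 bit constants,
# paired with the number of value bytes that follow the tag.
WIDE_TAGS = ((245, 2), (246, 4), (247, 8))


def encode_cst(value):
    """Encode an integer as a compact-assembly <cst> argument."""
    if -120 <= value <= 119:
        return bytes([value + 120])       # one-byte form, bytes 0-239
    for tag, size in WIDE_TAGS:
        limit = 1 << (8 * size - 1)
        if -limit <= value < limit:
            # Two's complement, least significant byte first.
            return bytes([tag]) + value.to_bytes(size, "little", signed=True)
    raise ValueError("constant does not fit in 64 bits")


def decode_cst(data, pos=0):
    """Decode a <cst> at data[pos]; return (value, position after it)."""
    first = data[pos]
    if first < 240:
        return first - 120, pos + 1
    for tag, size in WIDE_TAGS:
        if first == tag:
            raw = data[pos + 1:pos + 1 + size]
            if len(raw) != size:
                raise ValueError("truncated constant")
            return int.from_bytes(raw, "little", signed=True), pos + 1 + size
    raise ValueError("byte %d does not start a <cst>" % first)


def encode_string(payload):
    """Encode a <string>: a <cst> length followed by that many bytes."""
    return encode_cst(len(payload)) + bytes(payload)
```

Under these assumptions encode_cst(300) gives the bytes 245 44 1, and
bytes([249]) + encode_string(b"foo") gives 249 123 102 111 111, which is
how a procedure name foo appears as a <pro> argument.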
     Example ASCII          Example compact
                            (LOC = 69, BRA = 18 here):

     2                      182
     1                      181
      LOC   10              69 130
      LOC   -10             69 110
      LOC   300             69 245 44 1
      BRA   *19             18 139
     300                    241 44 1
     .3                     242 3
      CON   4,9,*2,$foo     151 124 129 240 2 249 123 102 111 111 255
      CON   .35             151 242 35 255

11.3 Assembly language instruction list

For each instruction in the list the range of argument values in the
assembly language is given. The column headed assem contains the mnemonics
defined in 11.1.3. The following column specifies restrictions of the
argument value. Addresses have to obey the restrictions mentioned in
chapter 2. The classes of arguments are indicated by letters:

       assem   constraints                        rationale

     c  cst    fits word                          constant
     d  cst    fits double word                   constant
     l  cst                                       local offset
     g  arg    >= 0                               global offset
     f  cst                                       fragment offset
     n  cst    >= 0                               counter
     s  cst    > 0, word multiple                 object size
     z  cst    >= 0, zero or word multiple        object size
     o  cst    > 0, word multiple or fraction     object size
     w  cst    > 0, word multiple                 object size *
     p  pro                                       pro identifier
     b  ilb    >= 0                               label number
     r  cst    0,1,2                              register number
     -                                            no argument

The * at the rationale for w indicates that the argument can either be
given as argument or on top of the stack. If the argument is omitted, the
argument is fetched from the stack; it is assumed to be a wordsized
unsigned integer. Instructions that check for undefined integer or
floating-point values and underflow or overflow are indicated below by
(*).

GROUP 1 - LOAD

  LOC c : Load constant (i.e. push one word onto the stack)
  LDC d : Load double constant ( push two words )
  LOL l : Load word at l-th local (l<0) or parameter (l>=0)
  LOE g : Load external word g
  LIL l : Load word pointed to by l-th local or parameter
  LOF f : Load offsetted (top of stack + f yield address)
  LAL l : Load address of local or parameter
  LAE g : Load address of external
  LXL n : Load lexical (address of LB n static levels back)
  LXA n : Load lexical (address of AB n static levels back)
  LOI o : Load indirect o bytes (address is popped from the stack)
  LOS w : Load indirect, w-byte integer on top of stack gives object size
  LDL l : Load double local or parameter (two consecutive words are stacked)
  LDE g : Load double external (two consecutive externals are stacked)
  LDF f : Load double offsetted (top of stack + f yield address)
  LPI p : Load procedure identifier

GROUP 2 - STORE

  STL l : Store local or parameter
  STE g : Store external
  SIL l : Store into word pointed to by l-th local or parameter
  STF f : Store offsetted
  STI o : Store indirect o bytes (pop address, then data)
  STS w : Store indirect, w-byte integer on top of stack gives object size
  SDL l : Store double local or parameter
  SDE g : Store double external
  SDF f : Store double offsetted

GROUP 3 - INTEGER ARITHMETIC

  ADI w : Addition (*)
  SBI w : Subtraction (*)
  MLI w : Multiplication (*)
  DVI w : Division (*)
  RMI w : Remainder (*)
  NGI w : Negate (two's complement) (*)
  SLI w : Shift left (*)
  SRI w : Shift right (*)

GROUP 4 - UNSIGNED ARITHMETIC

  ADU w : Addition
  SBU w : Subtraction
  MLU w : Multiplication
  DVU w : Division
  RMU w : Remainder
  SLU w : Shift left
  SRU w : Shift right

GROUP 5 - FLOATING POINT ARITHMETIC

  ADF w : Floating add (*)
  SBF w : Floating subtract (*)
  MLF w : Floating multiply (*)
  DVF w : Floating divide (*)
  NGF w : Floating negate (*)
  FIF w : Floating multiply and split integer and fraction part (*)
  FEF w : Split floating number in exponent and fraction part (*)

GROUP 6 - POINTER ARITHMETIC

  ADP f : Add f to pointer on top of stack
  ADS w : Add w-byte value and pointer
  SBS w : Subtract pointers in same fragment and push diff as size w integer

GROUP 7 - INCREMENT/DECREMENT/ZERO

  INC - : Increment word on top of stack by 1 (*)
  INL l : Increment local or parameter (*)
  INE g : Increment external (*)
  DEC - : Decrement word on top of stack by 1 (*)
  DEL l : Decrement local or parameter (*)
  DEE g : Decrement external (*)
  ZRL l : Zero local or parameter
  ZRE g : Zero external
  ZRF w : Load a floating zero of size w
  ZER w : Load w zero bytes

GROUP 8 - CONVERT (stack: source, source size, dest. size (top))

  CII - : Convert integer to integer (*)
  CUI - : Convert unsigned to integer (*)
  CFI - : Convert floating to integer (*)
  CIF - : Convert integer to floating (*)
  CUF - : Convert unsigned to floating (*)
  CFF - : Convert floating to floating (*)
  CIU - : Convert integer to unsigned
  CUU - : Convert unsigned to unsigned
  CFU - : Convert floating to unsigned

GROUP 9 - LOGICAL

  AND w : Boolean and on two groups of w bytes
  IOR w : Boolean inclusive or on two groups of w bytes
  XOR w : Boolean exclusive or on two groups of w bytes
  COM w : Complement (one's complement of top w bytes)
  ROL w : Rotate left a group of w bytes
  ROR w : Rotate right a group of w bytes

GROUP 10 - SETS

  INN w : Bit test on w byte set (bit number on top of stack)
  SET w : Create singleton w byte set with bit n on (n is top of stack)

GROUP 11 - ARRAY

  LAR w : Load array element, descriptor contains integers of size w
  SAR w : Store array element
  AAR w : Load address of array element

GROUP 12 - COMPARE

  CMI w : Compare w byte integers, Push negative, zero, positive for <, = or >
  CMF w : Compare w byte reals
  CMU w : Compare w byte unsigneds
  CMS w : Compare w byte values, can only be used for bit for bit equality test
  CMP - : Compare pointers
  TLT - : True if less, i.e. iff top of stack < 0
  TLE - : True if less or equal, i.e. iff top of stack <= 0
  TEQ - : True if equal, i.e. iff top of stack = 0
  TNE - : True if not equal, i.e. iff top of stack non zero
  TGE - : True if greater or equal, i.e. iff top of stack >= 0
  TGT - : True if greater, i.e. iff top of stack > 0

GROUP 13 - BRANCH

  BRA b : Branch unconditionally to label b
  BLT b : Branch less (pop 2 words, branch if top > second)
  BLE b : Branch less or equal
  BEQ b : Branch equal
  BNE b : Branch not equal
  BGE b : Branch greater or equal
  BGT b : Branch greater
  ZLT b : Branch less than zero (pop 1 word, branch negative)
  ZLE b : Branch less or equal to zero
  ZEQ b : Branch equal zero
  ZNE b : Branch not zero
  ZGE b : Branch greater or equal zero
  ZGT b : Branch greater than zero

GROUP 14 - PROCEDURE CALL

  CAI - : Call procedure (procedure identifier on stack)
  CAL p : Call procedure (with identifier p)
  LFR s : Load function result
  RET z : Return (function result consists of top z bytes)

GROUP 15 - MISCELLANEOUS

  ASP f : Adjust the stack pointer by f
  ASS w : Adjust the stack pointer by w-byte integer
  BLM z : Block move z bytes; first pop destination addr, then source addr
  BLS w : Block move, size is in w-byte integer on top of stack
  CSA w : Case jump; address of jump table at top of stack
  CSB w : Table lookup jump; address of jump table at top of stack
  DCH - : Follow dynamic chain, convert LB to LB of caller
  DUP s : Duplicate top s bytes
  DUS w : Duplicate top w bytes
  EXG w : Exchange top w bytes
  FIL g : File name (external 4 := g)
  GTO g : Non-local goto, descriptor at g
  LIM - : Load 16 bit ignore mask
  LIN n : Line number (external 0 := n)
  LNI - : Line number increment
  LOR r : Load register (0=LB, 1=SP, 2=HP)
  LPB - : Convert local base to argument base
  MON - : Monitor call
  NOP - : No operation
  RCK w : Range check; trap on error
  RTT - : Return from trap
  SIG - : Trap errors to proc identifier on top of stack, -2 resets default
  SIM - : Store 16 bit ignore mask
  STR r : Store register (0=LB, 1=SP, 2=HP)
  TRP - : Cause trap to occur (Error number on stack)

A. EM INTERPRETER

{ This is an interpreter for EM. It serves as the official machine definition.
  This interpreter must run on a machine which supports arithmetic with words
  and memory offsets.

  Certain aspects of the definition are over specified.  In particular:

    1. The representation of an address on the stack need not be the
       numerical value of the memory location.

    2. The state of the stack is not defined after a trap has aborted
       an instruction in the middle.  For example, it is officially
       undefined whether the second operand of an ADD instruction has
       been popped or not if the first one is undefined ( -32768 or
       unsigned 32768).

    3. The memory layout is implementation dependent.  Only the most
       basic checks are performed whenever memory is accessed.

    4. The representation of an integer or set on the stack is not fixed
       in bit order.

    5. The format and existence of the procedure descriptors depends on
       the implementation.

    6. The result of the compare operators CMI etc. are -1, 0 and 1
       here, but other negative and positive values will do and they
       need not be the same each time.

    7. The shift count for SHL, SHR, ROL and ROR must be in the range 0
       to object size in bits - 1.  The effect of a count not in this
       range is undefined.
}

{$i256} {$d+}
program em(tables,prog,input,output);

label 8888,9999;

const
    t15     = 32768;        { 2**15    }
    t15m1   = 32767;        { 2**15 -1 }
    t16     = 65536;        { 2**16    }
    t16m1   = 65535;        { 2**16 -1 }
    t31m1   = 2147483647;   { 2**31 -1 }

    wsize   = 2;            { number of bytes in a word }
    asize   = 2;            { number of bytes in an address }
    fsize   = 4;            { number of bytes in a floating point number }
    maxret  = 4;            { number of words in the return value area }

    signbit = t15;          { the power of two indicating the sign bit }
    negoff  = t16;          { the next power of two }
    maxsint = t15m1;        { the maximum signed integer }
    maxuint = t16m1;        { the maximum unsigned integer }
    maxdbl  = t31m1;        { the maximum double signed integer }
    maxadr  = t16m1;        { the maximum address }
    maxoffs = t15m1;        { the maximum offset from an address }
    maxbitnr= 15;           { the number of the highest bit }

    lineadr = 0;            { address of the line number }
    fileadr = 4;            { address of the file name }
    maxcode = 8191;         { highest byte in code address space }
    maxdata = 8191;         { highest byte in data address space }

    { format of status save area }
    statd   = 4;            { how far is static link from lb }
    dynd    = 2;            { how far is dynamic link from lb }
    reta    = 0;            { how far is the return address from lb }
    savsize = 4;            { size of save area in bytes }

    { procedure descriptor format }
    pdlocs  = 0;            { offset for size of local variables in bytes }
    pdbase  = asize;        { offset for the procedure base }
    pdsize  = 4;            { size of procedure descriptor in bytes = 2*asize }

    { header words }
    NTEXT   = 1;
    NDATA   = 2;
    NPROC   = 3;
    ENTRY   = 4;
    NLINE   = 5;
    SZDATA  = 6;

    escape1 = 254;          { escape to secondary opcodes }
    escape2 = 255;          { escape to tertiary opcodes }
    undef   = signbit;      { the range of integers is -32767 to +32767 }

    { error codes }
    EARRAY  =  0;  ERANGE  =  1;  ESET    =  2;  EIOVFL  =  3;
    EFOVFL  =  4;  EFUNFL  =  5;  EIDIVZ  =  6;  EFDIVZ  =  7;
    EIUND   =  8;  EFUND   =  9;  ECONV   = 10;
    ESTACK  = 16;  EHEAP   = 17;  EILLINS = 18;  EODDZ   = 19;
    ECASE   = 20;  EMEMFLT = 21;  EBADPTR = 22;  EBADPC  = 23;
    EBADLAE = 24;  EBADMON = 25;  EBADLIN = 26;  EBADGTO = 27;
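
{ Worked example (editorial note, not part of the original listing).
  With wsize = 2 a word holds a bit pattern in 0..maxuint = 65535.  Read
  as a signed integer, a pattern below signbit (32768) stands for itself,
  and a pattern above signbit stands for pattern - negoff:
      65535 - 65536 =     -1
      32769 - 65536 = -32767
  The pattern 32768 itself is reserved as undef, which is why the usable
  signed range is -32767 to +32767. }
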
{---------------------------------------------------------------------------} { Declarations } {---------------------------------------------------------------------------} type bitval= 0..1; { one bit } bitnr= 0..maxbitnr; { bits in machine words are numbered 0 to 15 } byte= 0..255; { memory is an array of bytes } adr= {0..maxadr} long; { the range of addresses } word= {0..maxuint} long;{ the range of unsigned integers } offs= -maxoffs..maxoffs; { the range of signed offsets from addresses } size= 0..maxoffs; { the range of sizes is the positive offsets } sword= {-signbit..maxsint} long; { the range of signed integers } full= {-maxuint..maxuint} long; { intermediate results need this range } double={-maxdbl..maxdbl} long; { double precision range } bftype= (andf,iorf,xorf); { tells which boolean operator needed } insclass=(prim,second,tert); { tells which opcode table is in use } instype=(implic,explic); { does opcode have implicit or explicit operand } iflags= (mini,short,sbit,wbit,zbit,ibit); ifset= set of iflags; mnem = ( NON, AAR, ADF, ADI, ADP, ADS, ADU,XAND, ASP, ASS, BEQ, BGE, BGT, BLE, BLM, BLS, BLT, BNE, BRA, CAI, CAL, CFF, CFI, CFU, CIF, CII, CIU, CMF, CMI, CMP, CMS, CMU, COM, CSA, CSB, CUF, CUI, CUU, DCH, DEC, DEE, DEL, DUP, DUS, DVF, DVI, DVU, EXG, FEF, FIF, FIL, GTO, INC, INE, INL, INN, IOR, LAE, LAL, LAR, LDC, LDE, LDF, LDL, LFR, LIL, LIM, LIN, LNI, LOC, LOE, LOF, LOI, LOL, LOR, LOS, LPB, LPI, LXA, LXL, MLF, MLI, MLU, MON, NGF, NGI, NOP, RCK, RET, RMI, RMU, ROL, ROR, RTT, SAR, SBF, SBI, SBS, SBU, SDE, SDF, SDL,XSET, SIG, SIL, SIM, SLI, SLU, SRI, SRU, STE, STF, STI, STL, STR, STS, TEQ, TGE, TGT, TLE, TLT, TNE, TRP, XOR, ZEQ, ZER, ZGE, ZGT, ZLE, ZLT, ZNE, ZRE, ZRF, ZRL); dispatch = record iflag: ifset; instr: mnem; case instype of implic: (implicit:sword); explic: (ilength:byte); end; var code: packed array[0..maxcode] of byte; { code space } data: packed array[0..maxdata] of byte; { data space } retarea: array[1..maxret ] of word; { return area } 
pc,lb,sp,hp,pd: adr; { internal machine registers } i: integer; { integer scratch variable } s,t :word; { scratch variables } sz:size; { scratch variables } ss,st: sword; { scratch variables } k :double; { scratch variables } j:size; { scratch variable used as index } a,b:adr; { scratch variable used for addresses } dt,ds:double; { scratch variables for double precision } 51 rt,rs,x,y:real; { scratch variables for real } found:boolean; { scratch } opcode: byte; { holds the opcode during execution } iclass: insclass; { true for escaped opcodes } dispat: array[insclass,byte] of dispatch; retsize:size; { holds size of last LFR } insr: mnem; { holds the instructionnumber } halted: boolean; { normally false } exitstatus:word; { parameter of MON 1 } ignmask:word; { ignore mask for traps } uerrorproc:adr; { number of user defined error procedure } intrap:boolean; { Set when executing trap(), to catch recursive calls} trapval:byte; { Set to number of last trap } header: array[1..8] of adr; tables: text; { description of EM instructions } prog: file of byte; { program and initialized data } {---------------------------------------------------------------------------} { Various check routines } {---------------------------------------------------------------------------} { Only the most basic checks are performed. These routines are inherently implementation dependent. 
} procedure trap(n:byte); forward; procedure memadr(a:adr); begin if (a>maxdata) or ((a<sp) and (a>=hp)) then trap(EMEMFLT) end; procedure wordadr(a:adr); begin memadr(a); if (a mod wsize<>0) then trap(EBADPTR) end; procedure chkadr(a:adr; s:size); begin memadr(a); memadr(a+s-1); { assumption: size is ok } if s<wsize then begin if a mod s<>0 then trap(EBADPTR) end else if a mod wsize<>0 then trap(EBADPTR) end; procedure newpc(a:double); begin if (a<0) or (a>maxcode) then trap(EBADPC); pc:=a end; procedure newsp(a:adr); begin if (a>lb) or (a<hp) or (a mod wsize<>0) then trap(ESTACK); sp:=a end; procedure newlb(a:adr); begin if (a<sp) or (a mod wsize<>0) then trap(ESTACK); lb:=a end; procedure newhp(a:adr); begin if (a>sp) or (a>maxdata+1) or (a mod wsize<>0) then trap(EHEAP) else hp:=a end; function argc(a:double):sword; begin if (a<-signbit) or (a>maxsint) then trap(EILLINS); argc:=a end; 52 function argd(a:double):double; begin if (a<-maxdbl) or (a>maxdbl) then trap(EILLINS); argd:=a end; function argl(a:double):offs; begin if (a<-maxoffs) or (a>maxoffs) then trap(EILLINS); argl:=a end; function argg(k:double):adr; begin if (k<0) or (k>maxadr) then trap(EILLINS); argg:=k end; function argf(a:double):offs; begin if (a<-maxoffs) or (a>maxoffs) then trap(EILLINS); argf:=a end; function argn(a:double):word; begin if (a<0) or (a>maxuint) then trap(EILLINS); argn:=a end; function args(a:double):size; begin if (a<=0) or (a>maxoffs) then trap(EODDZ) else if (a mod wsize)<>0 then trap(EODDZ); args:=a ; end; function argz(a:double):size; begin if (a<0) or (a>maxoffs) then trap(EODDZ) else if (a mod wsize)<>0 then trap(EODDZ); argz:=a ; end; function argo(a:double):size; begin if (a<=0) or (a>maxoffs) then trap(EODDZ) else if (a mod wsize<>0) and (wsize mod a<>0) then trap(EODDZ); argo:=a ; end; function argw(a:double):size; begin if (a<=0) or (a>maxoffs) or (a>maxuint) then trap(EODDZ) else if (a mod wsize)<>0 then trap(EODDZ); argw:=a ; end; function argp(a:double):size; 
begin if (a<0) or (a>=header[NPROC]) then trap(EILLINS); argp:=a end; function argr(a:double):word; begin if (a<0) or (a>2) then trap(EILLINS); argr:=a end; procedure argwf(s:double); begin if argw(s)<>fsize then trap(EILLINS) end; function szindex(s:double):integer; begin s:=argw(s); if (s mod wsize <> 0) or (s>2*wsize) then trap(EILLINS); szindex:=s div wsize end; function locadr(l:double):adr; begin l:=argl(l); if l<0 then locadr:=lb+l else locadr:=lb+l+savsize end;
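
{ Worked example (editorial note, not part of the original listing;
  the value of lb is chosen only for illustration).  With savsize = 4
  and lb = 1000:
      locadr(-2) = lb + l           = 1000 - 2     =  998
      locadr(0)  = lb + l + savsize = 1000 + 0 + 4 = 1004
      locadr(2)  = lb + l + savsize = 1000 + 2 + 4 = 1006
  Negative offsets address locals below lb; non-negative offsets address
  parameters above the savsize-byte save area that starts at lb. }
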
function signwd(w:word):sword;
begin if w = undef then trap(EIUND);
  if w >= signbit then signwd:=w-negoff else signwd:=w
end;

function dosign(w:word):sword;
begin if w >= signbit then dosign:=w-negoff else dosign:=w end;

function unsign(w:sword):word;
begin if w<0 then unsign:=w+negoff else unsign:=w end;

function chopw(dw:double):word;
begin chopw:=dw mod negoff end;

function fitsw(w:full;trapno:byte):word;
{ checks whether value fits in signed word, returns unsigned representation}
begin
  if (w>maxsint) or (w<-signbit) then
    begin trap(trapno);
      if w<0 then fitsw:=negoff- (-w)mod negoff
             else fitsw:=w mod negoff;
    end
  else fitsw:=unsign(w)
end;

function fitd(w:full):double;
begin if abs(w) > maxdbl then trap(ECONV);
  fitd:=w
end;

{---------------------------------------------------------------------------}
{ Memory access routines }
{---------------------------------------------------------------------------}

{ memw returns a machine word as an unsigned integer
  memb returns a single byte as a positive integer: 0 <= memb <= 255
  mems(a,s) fetches an object smaller than a word and returns a word
  store(a,v) stores the word v at machine address a
  storea(a,v) stores the address v at machine address a
  storeb(a,b) stores the byte b at machine address a
  stores(a,s,v) stores the s least significant bytes of a word at address a
  memi returns an offset from the instruction space
       Note that the procedure descriptors are part of instruction space.
  nextpc returns the next byte addressed by pc, incrementing pc
  lino changes the line number word.
  filna changes the pointer to the file name.

  All routines check to make sure the address is within range and valid for
  the size of the object.  If an addressing error is found, a trap occurs.
}

function memw(a:adr):word;
var b:word; i:integer;
begin wordadr(a); b:=0;
  for i:=wsize-1 downto 0 do b:=256*b + data[a+i] ;
  memw:=b
end;

function memd(a:adr):double; { Always signed }
var b:double; i:integer;
begin wordadr(a); b:=data[a+2*wsize-1];
  if b>=128 then b:=b-256;
  for i:=2*wsize-2 downto 0 do b:=256*b + data[a+i] ;
  memd:=b
end;

function mema(a:adr):adr;
var b:adr; i:integer;
begin wordadr(a); b:=0;
  for i:=asize-1 downto 0 do b:=256*b + data[a+i] ;
  mema:=b
end;

function mems(a:adr;s:size):word;
var i:integer; b:word;
begin chkadr(a,s); b:=0; for i:=1 to s do b:=b*256+data[a+s-i]; mems:=b end;

function memb(a:adr):byte;
begin memadr(a); memb:=data[a] end;

procedure store(a:adr; x:word);
var i:integer;
begin wordadr(a);
  for i:=0 to wsize-1 do
    begin data[a+i]:=x mod 256; x:=x div 256 end
end;

procedure storea(a:adr; x:adr);
var i:integer;
begin wordadr(a);
  for i:=0 to asize-1 do
    begin data[a+i]:=x mod 256; x:=x div 256 end
end;

procedure stores(a:adr;s:size;v:word);
var i:integer;
begin chkadr(a,s);
  for i:=0 to s-1 do begin data[a+i]:=v mod 256; v:=v div 256 end;
end;

procedure storeb(a:adr; b:byte);
begin memadr(a); data[a]:=b end;

function memi(a:adr):adr;
var b:adr; i:integer;
begin if (a mod wsize<>0) or (a+asize-1>maxcode) then trap(EBADPTR); b:=0;
  for i:=asize-1 downto 0 do b:=256*b + code[a+i] ;
  memi:=b
end;

function nextpc:byte;
begin if pc>=pd then trap(EBADPC); nextpc:=code[pc]; newpc(pc+1) end;

procedure lino(w:word);
begin store(lineadr,w) end;

procedure filna(a:adr);
begin storea(fileadr,a) end;

{---------------------------------------------------------------------------}
{ Stack Manipulation Routines }
{---------------------------------------------------------------------------}

{ push puts a word on the stack
  pushsw takes a signed one word integer and pushes it on the stack
  pop removes a machine word from the stack and delivers it as a word
  popsw removes a machine word from the stack and delivers a signed integer
  pusha pushes an address on the stack
  popa removes a machine word from the stack and delivers it as an address
  pushd pushes a double precision number on the stack
  popd removes two machine words and returns a double precision integer
  pushr pushes a float (floating point) number on the stack
  popr removes several machine words and returns a float number
  pushx puts an object of arbitrary size on the stack
  popx removes an object of arbitrary size
}

procedure push(x:word);
begin newsp(sp-wsize); store(sp,x) end;

procedure pushsw(x:sword);
begin newsp(sp-wsize); store(sp,unsign(x)) end;

function pop:word;
begin pop:=memw(sp); newsp(sp+wsize) end;

function popsw:sword;
begin popsw:=signwd(pop) end;

procedure pusha(x:adr);
begin newsp(sp-asize); storea(sp,x) end;

function popa:adr;
begin popa:=mema(sp); newsp(sp+asize) end;

procedure pushd(y:double);
begin { push double integer onto the stack } newsp(sp-2*wsize) end;

function popd:double;
begin { pop double integer from the stack } newsp(sp+2*wsize); popd:=0 end;

procedure pushr(z:real);
begin { Push a float onto the stack } newsp(sp-fsize) end;

function popr:real;
begin { pop float from the stack } newsp(sp+fsize); popr:=0.0 end;

procedure pushx(objsize:size; a:adr);
var i:integer;
begin
  if objsize<wsize
  then push(mems(a,objsize))
  else for i:=1 to objsize div wsize do push(memw(a+objsize-wsize*i))
end;

procedure popx(objsize:size; a:adr);
var i:integer;
begin
  if objsize<wsize
  then stores(a,objsize,pop)
  else for i:=1 to objsize div wsize do store(a-wsize+wsize*i,pop)
end;

{---------------------------------------------------------------------------}
{ Bit manipulation routines (extract, shift, rotate) }
{---------------------------------------------------------------------------}

procedure sleft(var w:sword);      { 1 bit left shift }
begin w:= dosign(fitsw(2*w,EIOVFL)) end;

procedure suleft(var w:word);      { 1 bit left shift }
begin w := chopw(2*w) end;

procedure sdleft(var d:double);    { 1 bit left shift }
begin { shift two word signed integer } end;
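
{ Worked example (editorial note, not part of the original listing).
      sleft:  w = 3      gives 2*3 = 6
      sleft:  w = 16384  gives 2*16384 = 32768 > maxsint, so fitsw calls
              trap(EIOVFL); if that trap is masked by ignmask the result
              is dosign(32768) = 32768 - 65536 = -32768
      suleft: w = 40000  gives chopw(80000) = 80000 mod 65536 = 14464 }
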
procedure sright(var w:sword);     { 1 bit right shift with sign extension }
begin if w >= 0 then w := w div 2 else w := (w-1) div 2 end;

procedure suright(var w:word);     { 1 bit right shift without sign extension }
begin w := w div 2 end;

procedure sdright(var d:double);   { 1 bit right shift }
begin { shift two word signed integer } end;

procedure rleft(var w:word);       { 1 bit left rotate }
begin if w >= t15
  then w:=(w-t15)*2 + 1
  else w:=w*2
end;

procedure rright(var w:word);      { 1 bit right rotate }
begin if w mod 2 = 1
  then w:=w div 2 + t15
  else w:=w div 2
end;

function sextend(w:word;s:size):word;
var i:size;
begin
  for i:=1 to (wsize-s)*8 do rleft(w);
  for i:=1 to (wsize-s)*8 do sright(w);
  sextend:=w;
end;

function bit(b:bitnr; w:word):bitval; { return bit b of the word w }
var i:bitnr;
begin for i:= 1 to b do rright(w); bit:= w mod 2 end;

function bf(ty:bftype; w1,w2:word):word; { return boolean fcn of 2 words }
var i:bitnr; j:word;
begin j:=0;
  for i:= maxbitnr downto 0 do
    begin j := 2*j;
      case ty of
        andf: if bit(i,w1)+bit(i,w2) = 2 then j:=j+1;
        iorf: if bit(i,w1)+bit(i,w2) > 0 then j:=j+1;
        xorf: if bit(i,w1)+bit(i,w2) = 1 then j:=j+1
      end
    end;
  bf:=j
end;

{---------------------------------------------------------------------------}
{ Array indexing }
{---------------------------------------------------------------------------}

function arraycalc(c:adr):adr; { subscript calculation }
var j:full; objsize:size; a:adr;
begin j:= popsw - signwd(memw(c));
  if (j<0) or (j>memw(c+wsize)) then trap(EARRAY);
  objsize := argo(memw(c+wsize+wsize));
  a := j*objsize+popa; chkadr(a,objsize); arraycalc:=a
end;

{---------------------------------------------------------------------------}
{ Double and Real Arithmetic }
{---------------------------------------------------------------------------}

{ All routines for doubles and floats are dummy routines, since the format of
  doubles and floats is not defined in EM.
} function doadi(ds,dt:double):double; begin { add two doubles } doadi:=0 end; function dosbi(ds,dt:double):double; begin { subtract two doubles } dosbi:=0 end; function domli(ds,dt:double):double; begin { multiply two doubles } domli:=0 end; function dodvi(ds,dt:double):double; begin { divide two doubles } dodvi:=0 end; function dormi(ds,dt:double):double; begin { modulo of two doubles } dormi:=0 end; function dongi(ds:double):double; begin { negative of a double } dongi:=0 end; function doadf(x,y:real):real; begin { add two floats } doadf:=0.0 end; function dosbf(x,y:real):real; begin { subtract two floats } dosbf:=0.0 end; 58 function domlf(x,y:real):real; begin { multiply two floats } domlf:=0.0 end; function dodvf(x,y:real):real; begin { divide two floats } dodvf:=0.0 end; function dongf(x:real):real; begin { negate a float } dongf:=0.0 end; procedure dofif(x,y:real;var intpart,fraction:real); begin { dismember x*y into integer and fractional parts } intpart:=0.0; { integer part of x*y, same sign as x*y } fraction:=0.0; { fractional part of x*y, 0<=abs(fraction)<1 and same sign as x*y } end; procedure dofef(x:real;var mantissa:real;var exponent:sword); begin { dismember x into mantissa and exponent parts } mantissa:=0.0; { mantissa of x , >= 1/2 and <1 } exponent:=0; { base 2 exponent of x } end; 59 {---------------------------------------------------------------------------} { Trap and Call } {---------------------------------------------------------------------------} procedure call(p:adr); { Perform the call } begin pusha(lb);pusha(pc); newlb(sp);newsp(sp - memi(pd + pdsize*p + pdlocs)); newpc(memi(pd + pdsize*p+ pdbase)) end; procedure dotrap(n:byte); var i:size; begin if (uerrorproc=0) or intrap then begin if intrap then writeln('Recursive trap, first trap number was ', trapval:1); writeln('Error ', n:1); writeln('With',ord(insr):4,' arg ',k:1); goto 9999 end; { Deposit all interpreter variables that need to be saved on the stack. 
This includes all scratch variables that can be in use at the moment and ( not possible in this interpreter ) the internal address of the interpreter where the error occurred. This would make it possible to execute an RTT instruction totally transparent to the user program. It can, for example, occur within an ADD instruction that both operands are undefined and that the result overflows. Although this will generate 3 error traps it must be possible to ignore them all. } intrap:=true; trapval:=n; for i:=retsize div wsize downto 1 do push(retarea[i]); push(retsize); { saved return area } pusha(mema(fileadr)); { saved current file name pointer } push(memw(lineadr)); { saved line number } push(n); { push error number } a:=argp(uerrorproc); uerrorproc:=0; { reset signal } call(a); { call the routine } intrap:=false; { Don't catch recursive traps anymore } goto 8888; { reenter main loop } end; procedure trap; { This routine is invoked for overflow, and other run time errors. For non-fatal errors, trap returns to the calling routine } begin if n>=16 then dotrap(n) else if bit(n,ignmask)=0 then dotrap(n); end; procedure dortt; { The restoration of file address and line number is not essential. The restoration of the return save area is. 
} var i:size; 60 n:word; begin newsp(lb); lb:=maxdata+1 ; { to circumvent ESTACK for the popa + pop } newpc(popa); newlb(popa); { So far a plain RET 0 } n:=pop; if (n>=16) and (n<64) then goto 9999 ; lino(pop); filna(popa); retsize:=pop; for i:=1 to retsize div wsize do retarea[i]:=pop ; end; {---------------------------------------------------------------------------} { monitor calls } {---------------------------------------------------------------------------} procedure domon(entry:word); var index: 1..63; dummy: double; count,rwptr: adr; token: byte; i: integer; begin if (entry<=0) or (entry>63) then entry:=63 ; index:=entry; case index of 1: begin { exit } exitstatus:=pop; halted:=true end; 3: begin { read } dummy:=pop; { All input is from stdin } rwptr:=popa; count:=popa; i:=0 ; while (not eof(input)) and (i<count) do begin if eoln(input) then begin storeb(rwptr,10) ; count:=i end else storeb(rwptr,ord(input^)) ; get(input); rwptr:=rwptr+1 ; i:=i+1 ; end; pusha(i); push(0) end; 4: begin { write } dummy:=pop; { All output is to stdout } rwptr:=popa; count:=popa; for i:=1 to count do begin token:=memb(rwptr); rwptr:=rwptr+1 ; if token=10 then writeln else write(chr(token)) end ; pusha(count); push(0) end; 54: begin { ioctl, faked } dummy:=popa;dummy:=popa;dummy:=pop;push(0) end ; 2, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 55, 56, 57, 58, 59, 60, 61, 62: begin push(22); push(22) end; 63: { exists only for the trap } trap(EBADMON) end end; 61 {---------------------------------------------------------------------------} { Initialization and debugging } {---------------------------------------------------------------------------} procedure doident; { print line number and file name } var a:adr; i,c:integer; found:boolean; begin write('at line ',memw(lineadr):1,' '); a:=mema(fileadr); if a<>0 then begin i:=20; 
found:=false; while (i<>0) and not found do begin c:=memb(a); a:=a+1; found:=true; i:=i-1; if (c>=48) and (c<=57) then begin found:=false; write(chr(ord('0')+c-48)) end; if (c>=65) and (c<=90) then begin found:=false; write(chr(ord('A')+c-65)) end; if (c>=97) and (c<=122) then begin found:=false; write(chr(ord('a')+c-97)) end; end; end; writeln; end; procedure initialize; { start the ball rolling } { This is not part of the machine definition } var cset:set of char; f:ifset; iclass:insclass; insno:byte; nops:integer; opcode:byte; i,j,n:integer; wtemp:sword; count:integer; repc:adr; nexta,firsta:adr; elem:byte; amount,ofst:size; c:char; function readb(n:integer):double; var b:byte; begin read(prog,b); if n>1 then readb:=readb(n-1)*256+b else readb:=b end; function readbyte:byte; begin readbyte:=readb(1) end; function readword:word; begin readword:=readb(wsize) end; function readadr:adr; begin readadr:=readb(asize) end; function ifind(ordinal:byte):mnem; var loopvar:mnem; found:boolean; begin ifind:=NON; loopvar:=insr; found:=false; repeat 62 if ordinal=ord(loopvar) then begin found:=true; ifind:=loopvar end; if loopvar<>ZRL then loopvar:=succ(loopvar) else loopvar:=NON; until found or (loopvar=insr) ; end; procedure readhdr; type hdrw=0..32767 ; { 16 bit header words } var hdr: hdrw; i: integer; begin for i:=0 to 7 do begin hdr:=readb(2); case i of 0: if hdr<>3757 then { 07255 } begin writeln('Not an em load file'); halt end; 2: if hdr<>0 then begin writeln('Unsolved references'); halt end; 3: if hdr<>3 then begin writeln('Incorrect load file version'); halt end; 4: if hdr<>wsize then begin writeln('Incorrect word size'); halt end; 5: if hdr<>asize then begin writeln('Incorrect pointer size'); halt end; 1,6,7:; end end end; procedure noinit; begin writeln('Illegal initialization'); halt end; procedure readint(a:adr;s:size); var i:size; begin { construct integer out of byte sequence } for i:=1 to s do { construct the value and initialize at a } begin 
storeb(a,readbyte); a:=a+1 end end; procedure readuns(a:adr;s:size); begin { construct unsigned out of byte sequence } readint(a,s) { identical to readint } end; procedure readfloat(a:adr;s:size); var i:size; b:byte; begin { construct float out of string} if (s<>4) and (s<>8) then noinit; i:=0; repeat { eat the bytes, construct the value and intialize at a } b:=readbyte; i:=i+1; until b=0 ; end; begin halted:=false; exitstatus:=undef; uerrorproc:=0; intrap:=false; { initialize tables } 63 for i:=0 to maxcode do code[i]:=0; for i:=0 to maxdata do data[i]:=0; for iclass:=prim to tert do for i:=0 to 255 do with dispat[iclass][i] do begin instr:=NON; iflag:=[zbit] end; { read instruction table file. see appendix B } { The table read here is a simple transformation of the table on page xx } { - instruction names were transformed to numbers } { - the '-' flag was transformed to an 'i' flag for 'w' type instructions } { - the 'S' flag was added for instructions having signed operands } reset(tables); insr:=NON; repeat read(tables,insno) ; cset:=[]; f:=[]; insr:=ifind(insno); if insr=NON then begin writeln('Incorrect table'); halt end; repeat read(tables,c) until c<>' ' ; repeat cset:=cset+[c]; read(tables,c) until c=' ' ; if 'm' in cset then f:=f+[mini]; if 's' in cset then f:=f+[short]; if '-' in cset then f:=f+[zbit]; if 'i' in cset then f:=f+[ibit]; if 'S' in cset then f:=f+[sbit]; if 'w' in cset then f:=f+[wbit]; if (mini in f) or (short in f) then read(tables,nops) else nops:=1 ; readln(tables,opcode); if ('4' in cset) or ('8' in cset) then begin iclass:=tert end else if 'e' in cset then begin iclass:=second end else iclass:=prim; for i:=0 to nops-1 do begin with dispat[iclass,opcode+i] do begin iflag:=f; instr:=insr; if '2' in cset then ilength:=2 else if 'u' in cset then ilength:=2 else if '4' in cset then ilength:=4 else if '8' in cset then ilength:=8 else if (mini in f) or (short in f) then begin if 'N' in cset then wtemp:=-1-i else wtemp:=i ; if 'o' in cset then 
wtemp:=wtemp+1 ; if short in f then wtemp:=wtemp*256 ; implicit:=wtemp end end end until eof(tables); { read in program text, data and procedure descriptors } reset(prog); readhdr; { verify first header } 64 for i:=1 to 8 do header[i]:=readadr; { read second header } hp:=maxdata+1; sp:=maxdata+1; lino(0); { read program text } if header[NTEXT]+header[NPROC]*pdsize>maxcode then begin writeln('Text size too large'); halt end; if header[SZDATA]>maxdata then begin writeln('Data size too large'); halt end; for i:=0 to header[NTEXT]-1 do code[i]:=readbyte; { read data blocks } nexta:=0; for i:=1 to header[NDATA] do begin n:=readbyte; if n<>0 then begin elem:=readbyte; firsta:=nexta; case n of 1: { uninitialized words } for j:=1 to elem do begin store(nexta,undef); nexta:=nexta+wsize end; 2: { initialized bytes } for j:=1 to elem do begin storeb(nexta,readbyte); nexta:=nexta+1 end; 3: { initialized words } for j:=1 to elem do begin store(nexta,readword); nexta:=nexta+wsize end; 4,5: { instruction and data pointers } for j:=1 to elem do begin storea(nexta,readadr); nexta:=nexta+asize end; 6: { signed integers } begin readint(nexta,elem); nexta:=nexta+elem end; 7: { unsigned integers } begin readuns(nexta,elem); nexta:=nexta+elem end; 8: { floating point numbers } begin readfloat(nexta,elem); nexta:=nexta+elem end; end end else begin repc:=readadr; amount:=nexta-firsta; for count:=1 to repc do begin for ofst:=0 to amount-1 do data[nexta+ofst]:=data[firsta+ofst]; nexta:=nexta+amount; end end end; if header[SZDATA]<>nexta then writeln('Data initialization error'); hp:=nexta; { read descriptor table } pd:=header[NTEXT]; for i:=1 to header[NPROC]*pdsize do code[pd+i-1]:=readbyte; { call the entry point routine } ignmask:=0; { catch all traps, higher numbered traps cannot be ignored} retsize:=0; lb:=maxdata; { illegal dynamic link } pc:=maxcode; { illegal return address } push(0); a:=sp; { No environment } push(0); b:=sp; { No args } 65 pusha(a); { envp } pusha(b); { argv } 
  push(0);   { argc }
  call(argp(header[ENTRY]));
end;

{---------------------------------------------------------------------------}
{ MAIN LOOP OF THE INTERPRETER }
{---------------------------------------------------------------------------}

{ It should be noted that the interpreter (microprogram) for an EM
  machine can be written in two fundamentally different ways: (1) the
  instruction operands are fetched in the main loop, or (2) the
  instruction operands are fetched after the 256 way branch, by the
  execution routines themselves.  In this interpreter, method (1) is used
  to simplify the description of execution routines.  The dispatch
  table dispat is used to determine how the operand is encoded.
  There are 5 possibilities:

     0. There is no operand
     1. The operand and instruction are together in 1 byte (mini)
     2. The operand is one byte long and follows the opcode byte(s)
     3. The operand is two bytes long and follows the opcode byte(s)
     4. The operand is four bytes long and follows the opcode byte(s)

  In this interpreter, the main loop determines the operand type,
  fetches it, and leaves it in the global variable k for the execution
  routines to use.  Consequently, instructions such as LOL, which use
  three different formats, need only be described once in the body
  of the interpreter.

  However, for a production interpreter, or a hardware EM
  machine, it is probably better to use method (2), i.e. to let the
  execution routines themselves fetch their own operands.  The reason
  for this is that each opcode uniquely determines the operand format,
  so no table lookup in the dispatch table is needed.  The whole table
  is not needed.  Method (2) therefore executes much faster.

  However, separate execution routines will be needed for LOL with
  a one byte offset, and LOL with a two byte offset.  It is to avoid
  this additional clutter that method (1) is used here.
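
  Worked example (editorial note, not part of the original text; the
  byte values are chosen only for illustration).  Operand bytes are
  stored high byte first, so in case 3 above an opcode followed by the
  bytes 1 and 44 (decimal) carries the operand

      1*256 + 44 = 300.
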

  In a production interpreter, it is envisioned that the main loop will
  fetch the next instruction byte, and use it as an index into a 256 word
  table to find the address of the interpreter routine to jump to. The
  routine jumped to will begin by fetching its operand, if any, without
  any table lookup, since it knows which format to expect. After doing the
  work, it returns to the main loop by jumping indirectly to a register
  that contains the address of the main loop. A slight variation on this
  idea is to have the register contain the address of the branch table,
  rather than the address of the main loop.

  Another issue is whether LOL 0, LOL 2, LOL 4, etc. should all have
  distinct execution routines. Doing so provides for the maximum speed,
  since the operand is implicit in the routine itself. The disadvantage is
  that many nearly identical execution routines will then be needed.
  Another way of doing it is to keep the instruction byte fetched from
  memory (LOL 0, LOL 2, LOL 4, etc.) in some register, and have all the
  LOL mini format instructions branch to a common routine. This routine
  can then determine the operand by subtracting the code for LOL 0 from
  the register, leaving the true operand in the register (as a word
  quantity of course). This method makes the interpreter smaller, but is a
  bit slower.

  To make this important point a little clearer, consider how a production
  interpreter for the PDP-11 might appear. Let us assume the following
  opcodes have been assigned:

      31: LOL -2   (2 bytes, i.e. next word)
      32: LOL -4
      33: LOL -6
      34: LOL b    (format with a one byte offset)
      35: LOL w    (format with a one word, i.e. two byte offset)

  Further assume that each of the 5 opcodes will have its own execution
  routine, i.e. we are making a tradeoff in favor of fast execution and a
  slightly larger interpreter.

      Register r5 is the em program counter.
      Register r4 is the em LB register
      Register r3 is the em SP register (the stack grows toward low core)
      Register r2 contains the interpreter address of the main loop

  The main loop looks like this:

        movb (r5)+,r0        /fetch the opcode into r0 and increment r5
        asl r0               /shift r0 left 1 bit. Now: -256<=r0<=+254
        jmp *table(r0)       /jump to execution routine

  Notice that no operand fetching has been done. The execution routines
  for the 5 sample instructions given above might be as follows:

  lol2: mov -2(r4),-(sp)     /push local -2 onto stack
        jmp (r2)             /go back to main loop
  lol4: mov -4(r4),-(sp)     /push local -4 onto stack
        jmp (r2)             /go back to main loop
  lol6: mov -6(r4),-(sp)     /push local -6 onto stack
        jmp (r2)             /go back to main loop
  lolb: mov $177400,r0       /prepare to fetch the 1 byte operand
        bisb (r5)+,r0        /operand is now in r0
        asl r0               /r0 is now offset from LB in bytes, not words
        add r4,r0            /r0 is now address of the needed local
        mov (r0),-(sp)       /push the local onto the stack
        jmp (r2)
  lolw: clr r0               /prepare to fetch the 2 byte operand
        bisb (r5)+,r0        /fetch high order byte first !!!
        swab r0              /insert high order byte in place
        bisb (r5)+,r0        /insert low order byte in place
        asl r0               /convert offset to bytes, from words
        add r4,r0            /r0 is now address of needed local
        mov (r0),-(sp)       /stack the local
        jmp (r2)             /done

  The important thing to notice is where and how the operand fetch
  occurred:

      lol2, lol4, and lol6, (the mini's) have implicit operands
      lolb knew it had to fetch one byte, and did so without any table lookup
      lolw knew it had to fetch a word, and did so, high order byte first
}

{---------------------------------------------------------------------------}
{                   Routines for the individual instructions               }
{---------------------------------------------------------------------------}
procedure loadops;
  var j:integer;
begin
  case insr of
    { LOAD GROUP }
    LDC: pushd(argd(k));
    LOC: pushsw(argc(k));
    LOL: push(memw(locadr(k)));
    LOE: push(memw(argg(k)));
    LIL: push(memw(mema(locadr(k))));
    LOF: push(memw(popa+argf(k)));
    LAL: pusha(locadr(k));
    LAE: pusha(argg(k));
    LXL: begin a:=lb; for j:=1 to argn(k) do a:=mema(a+savsize); pusha(a) end;
    LXA: begin a:=lb;
           for j:=1 to argn(k) do a:= mema(a+savsize);
           pusha(a+savsize)
         end;
    LOI: pushx(argo(k),popa);
    LOS: begin k:=argw(k); if k<>wsize then trap(EILLINS);
           k:=pop; pushx(argo(k),popa)
         end;
    LDL: begin a:=locadr(k); push(memw(a+wsize)); push(memw(a)) end;
    LDE: begin k:=argg(k); push(memw(k+wsize)); push(memw(k)) end;
    LDF: begin k:=argf(k);
           a:=popa; push(memw(a+k+wsize)); push(memw(a+k))
         end;
    LPI: push(argp(k))
  end
end;

procedure storeops;
begin
  case insr of
    { STORE GROUP }
    STL: store(locadr(k),pop);
    STE: store(argg(k),pop);
    SIL: store(mema(locadr(k)),pop);
    STF: begin a:=popa; store(a+argf(k),pop) end;
    STI: popx(argo(k),popa);
    STS: begin k:=argw(k); if k<>wsize then trap(EILLINS);
           k:=popa; popx(argo(k),popa)
         end;
    SDL: begin a:=locadr(k); store(a,pop); store(a+wsize,pop) end;
    SDE: begin k:=argg(k); store(k,pop); store(k+wsize,pop) end;
    SDF: begin k:=argf(k); a:=popa; store(a+k,pop); store(a+k+wsize,pop) end
  end
end;

procedure
intarith;
  var i:integer;
begin
  case insr of
    { SIGNED INTEGER ARITHMETIC }
    ADI: case szindex(argw(k)) of
         1: begin st:=popsw; ss:=popsw; push(fitsw(ss+st,EIOVFL)) end;
         2: begin dt:=popd; ds:=popd; pushd(doadi(ds,dt)) end;
         end ;
    SBI: case szindex(argw(k)) of
         1: begin st:=popsw; ss:= popsw; push(fitsw(ss-st,EIOVFL)) end;
         2: begin dt:=popd; ds:=popd; pushd(dosbi(ds,dt)) end;
         end ;
    MLI: case szindex(argw(k)) of
         1: begin st:=popsw; ss:= popsw; push(fitsw(ss*st,EIOVFL)) end;
         2: begin dt:=popd; ds:=popd; pushd(domli(ds,dt)) end;
         end ;
    DVI: case szindex(argw(k)) of
         1: begin st:= popsw; ss:= popsw;
              if st=0 then trap(EIDIVZ) else pushsw(ss div st)
            end;
         2: begin dt:=popd; ds:=popd; pushd(dodvi(ds,dt)) end;
         end;
    RMI: case szindex(argw(k)) of
         1: begin st:= popsw; ss:=popsw;
              if st=0 then trap(EIDIVZ) else pushsw(ss - (ss div st)*st)
            end;
         2: begin dt:=popd; ds:=popd; pushd(dormi(ds,dt)) end
         end;
    NGI: case szindex(argw(k)) of
         1: begin st:=popsw; pushsw(-st) end;
         2: begin ds:=popd; pushd(dongi(ds)) end
         end;
    SLI: begin t:=pop;
           case szindex(argw(k)) of
           1: begin ss:=popsw;
                for i:= 1 to t do sleft(ss); pushsw(ss)
              end
           end
         end;
    SRI: begin t:=pop;
           case szindex(argw(k)) of
           1: begin ss:=popsw;
                for i:= 1 to t do sright(ss); pushsw(ss)
              end;
           2: begin ds:=popd;   { shift and push the double that was popped }
                for i:= 1 to t do sdright(ds); pushd(ds)
              end
           end
         end
  end
end;

procedure unsarith;
  var i:integer;
begin
  case insr of
    { UNSIGNED INTEGER ARITHMETIC }
    ADU: case szindex(argw(k)) of
         1: begin t:=pop; s:= pop; push(chopw(s+t)) end;
         2: trap(EILLINS);
         end ;
    SBU: case szindex(argw(k)) of
         1: begin t:=pop; s:= pop; push(chopw(s-t)) end;
         2: trap(EILLINS);
         end ;
    MLU: case szindex(argw(k)) of
         1: begin t:=pop; s:= pop; push(chopw(s*t)) end;
         2: trap(EILLINS);
         end ;
    DVU: case szindex(argw(k)) of
         1: begin t:= pop; s:= pop;
              if t=0 then trap(EIDIVZ) else push(s div t)
            end;
         2: trap(EILLINS);
         end;
    RMU: case szindex(argw(k)) of
         1: begin t:= pop; s:=pop;
              if t=0 then trap(EIDIVZ) else push(s - (s div t)*t)
            end;
         2: trap(EILLINS);
         end;
    SLU: case
         szindex(argw(k)) of
         1: begin t:=pop; s:=pop;
              for i:= 1 to t do suleft(s); push(s)
            end;
         2: trap(EILLINS);
         end;
    SRU: case szindex(argw(k)) of
         1: begin t:=pop; s:=pop;
              for i:= 1 to t do suright(s); push(s)
            end;
         2: trap(EILLINS);
         end
  end
end;

procedure fltarith;
begin
  case insr of
    { FLOATING POINT ARITHMETIC }
    ADF: begin argwf(k); rt:=popr; rs:=popr; pushr(doadf(rs,rt)) end;
    SBF: begin argwf(k); rt:=popr; rs:=popr; pushr(dosbf(rs,rt)) end;
    MLF: begin argwf(k); rt:=popr; rs:=popr; pushr(domlf(rs,rt)) end;
    DVF: begin argwf(k); rt:=popr; rs:=popr; pushr(dodvf(rs,rt)) end;
    NGF: begin argwf(k); rt:=popr; pushr(dongf(rt)) end;
    FIF: begin argwf(k); rt:=popr; rs:=popr;
           dofif(rt,rs,x,y); pushr(y); pushr(x)
         end;
    FEF: begin argwf(k); rt:=popr; dofef(rt,x,ss); pushr(x); pushsw(ss) end
  end
end;

procedure ptrarith;
begin
  case insr of
    { POINTER ARITHMETIC }
    ADP: pusha(popa+argf(k));
    ADS: case szindex(argw(k)) of
         1: begin st:=popsw; pusha(popa+st) end;
         2: begin dt:=popd; pusha(popa+dt) end;
         end;
    SBS: begin
           a:=popa; b:=popa;
           case szindex(argw(k)) of
           1: push(fitsw(b-a,EIOVFL));
           2: pushd(b-a)
           end
         end
  end
end;

procedure incops;
  var j:integer;
begin
  case insr of
    { INCREMENT/DECREMENT/ZERO }
    INC: push(fitsw(popsw+1,EIOVFL));
    INL: begin a:=locadr(k); store(a,fitsw(signwd(memw(a))+1,EIOVFL)) end;
    INE: begin a:=argg(k); store(a,fitsw(signwd(memw(a))+1,EIOVFL)) end;
    DEC: push(fitsw(popsw-1,EIOVFL));
    DEL: begin a:=locadr(k); store(a,fitsw(signwd(memw(a))-1,EIOVFL)) end;
    DEE: begin a:=argg(k); store(a,fitsw(signwd(memw(a))-1,EIOVFL)) end;
    ZRL: store(locadr(k),0);
    ZRE: store(argg(k),0);
    ZER: for j:=1 to argw(k) div wsize do push(0);
    ZRF: pushr(0);
  end
end;

procedure convops;
begin
  case insr of
    { CONVERT GROUP }
    CII: begin s:=pop; t:=pop;
           if t<wsize then begin push(sextend(pop,t)); t:=wsize end;
           case szindex(argw(t)) of
           1: if szindex(argw(s))=2 then pushd(popsw);
           2: if szindex(argw(s))=1 then push(fitsw(popd,ECONV))
           end
         end;
    CIU: case szindex(argw(pop)) of
         1: if szindex(argw(pop))=2 then
              push(unsign(popd mod negoff));
         2: trap(EILLINS);
         end;
    CIF: begin argwf(pop);
           case szindex(argw(pop)) of 1:pushr(popsw); 2:pushr(popd) end
         end;
    CUI: case szindex(argw(pop)) of
         1: case szindex(argw(pop)) of
            1: begin s:=pop; if s>maxsint then trap(ECONV); push(s) end;
            2: trap(EILLINS);
            end;
         2: case szindex(argw(pop)) of
            1: pushd(pop);
            2: trap(EILLINS);
            end;
         end;
    CUU: case szindex(argw(pop)) of
         1: if szindex(argw(pop))=2 then trap(EILLINS);
         2: trap(EILLINS);
         end;
    CUF: begin argwf(pop);
           if szindex(argw(pop))=1 then pushr(pop) else trap(EILLINS)
         end;
    CFI: begin sz:=argw(pop); argwf(pop); rt:=popr;
           case szindex(sz) of
           1: push(fitsw(trunc(rt),ECONV));
           2: pushd(fitd(trunc(rt)));
           end
         end;
    CFU: begin sz:=argw(pop); argwf(pop); rt:=popr;
           case szindex(sz) of
           1: push( chopw(trunc(abs(rt)-0.5)) );
           2: trap(EILLINS);
           end
         end;
    CFF: begin argwf(pop); argwf(pop) end
  end
end;

procedure logops;
  var i,j:integer;
begin
  case insr of
    { LOGICAL GROUP }
    XAND:
         begin k:=argw(k);
           for j:= 1 to k div wsize do
             begin a:=sp+k; t:=pop; store(a,bf(andf,memw(a),t)) end;
         end;
    IOR:
         begin k:=argw(k);
           for j:= 1 to k div wsize do
             begin a:=sp+k; t:=pop; store(a,bf(iorf,memw(a),t)) end;
         end;
    XOR:
         begin k:=argw(k);
           for j:= 1 to k div wsize do
             begin a:=sp+k; t:=pop; store(a,bf(xorf,memw(a),t)) end;
         end;
    COM:
         begin k:=argw(k);
           for j:= 1 to k div wsize do
             begin
               store(sp+k-wsize*j, bf(xorf,memw(sp+k-wsize*j), negoff-1))
             end
         end;
    ROL: begin k:=argw(k); if k<>wsize then trap(EILLINS);
           t:=pop; s:=pop; for i:= 1 to t do rleft(s); push(s)
         end;
    ROR: begin k:=argw(k); if k<>wsize then trap(EILLINS);
           t:=pop; s:=pop; for i:= 1 to t do rright(s); push(s)
         end
  end
end;

procedure setops;
  var i,j:integer;
begin
  case insr of
    { SET GROUP }
    INN:
         begin k:=argw(k);
           t:=pop;
           i:= t mod 8; t:= t div 8;
           if t>=k then
             begin trap(ESET); s:=0 end
           else
             begin s:=memb(sp+t) end;
           newsp(sp+k); push(bit(i,s));
         end;
    XSET:
         begin k:=argw(k);
           t:=pop;
           i:= t mod 8; t:= t div 8;
           for j:= 1 to k div wsize do push(0);
           if t>=k then
             trap(ESET)
           else
             begin
               s:=1; for j:= 1 to i do rleft(s); storeb(sp+t,s)
             end
         end
  end
end;

procedure arrops;
begin
  case insr of
    { ARRAY GROUP }
    LAR:
         begin k:=argw(k); if k<>wsize then trap(EILLINS); a:=popa;
           pushx(argo(memw(a+2*k)),arraycalc(a))
         end;
    SAR:
         begin k:=argw(k); if k<>wsize then trap(EILLINS); a:=popa;
           popx(argo(memw(a+2*k)),arraycalc(a))
         end;
    AAR:
         begin k:=argw(k); if k<>wsize then trap(EILLINS); a:=popa;
           push(arraycalc(a))
         end
  end
end;

procedure cmpops;
begin
  case insr of
    { COMPARE GROUP }
    CMI: case szindex(argw(k)) of
         1: begin st:=popsw; ss:=popsw;
              if ss<st then pushsw(-1) else if ss=st then push(0) else push(1)
            end;
         2: begin dt:=popd; ds:=popd;
              if ds<dt then pushsw(-1) else if ds=dt then push(0) else push(1)
            end;
         end;
    CMU: case szindex(argw(k)) of
         1: begin t:=pop; s:=pop;
              if s<t then pushsw(-1) else if s=t then push(0) else push(1)
            end;
         2: trap(EILLINS);
         end;
    CMP: begin a:=popa; b:=popa;
           if b<a then pushsw(-1) else if b=a then push(0) else push(1)
         end;
    CMF: begin argwf(k); rt:=popr; rs:=popr;
           if rs<rt then pushsw(-1) else if rs=rt then push(0) else push(1)
         end;
    CMS: begin k:=argw(k);
           t:= 0; j:= 0;
           while (j < k) and (t=0) do
             begin if memw(sp+j) <> memw(sp+k+j) then t:=1;
               j:=j+wsize
             end;
           newsp(sp+wsize*k); push(t);
         end;

    TLT: if popsw < 0 then push(1) else push(0);
    TLE: if popsw <= 0 then push(1) else push(0);
    TEQ: if pop = 0 then push(1) else push(0);
    TNE: if pop <> 0 then push(1) else push(0);
    TGE: if popsw >= 0 then push(1) else push(0);
    TGT: if popsw > 0 then push(1) else push(0);
  end
end;

procedure branchops;
begin
  case insr of
    { BRANCH GROUP }
    BRA: newpc(pc+k);

    BLT: begin st:=popsw; if popsw < st then newpc(pc+k) end;
    BLE: begin st:=popsw; if popsw <= st then newpc(pc+k) end;
    BEQ: begin t :=pop ; if pop = t then newpc(pc+k) end;
    BNE: begin t :=pop ; if pop <> t then newpc(pc+k) end;
    BGE: begin st:=popsw; if popsw >= st then newpc(pc+k) end;
    BGT: begin st:=popsw; if popsw > st then newpc(pc+k) end;

    ZLT: if popsw < 0 then newpc(pc+k);
    ZLE: if popsw <= 0 then newpc(pc+k);
    ZEQ: if pop = 0 then newpc(pc+k);
    ZNE: if pop <> 0 then newpc(pc+k);
    ZGE: if popsw >= 0 then newpc(pc+k);
    ZGT: if popsw > 0 then newpc(pc+k)
  end
end;

procedure callops;
  var j:integer;
begin
  case insr of
    { PROCEDURE CALL GROUP }
    CAL: call(argp(k));
    CAI: begin call(argp(popa)) end;
    RET: begin k:=argz(k); if k div wsize>maxret then trap(EILLINS);
           for j:= 1 to k div wsize do retarea[j]:=pop; retsize:=k;
           newsp(lb); lb:=maxdata+1; { To circumvent stack overflow error }
           newpc(popa);
           if pc=maxcode then
             begin
               halted:=true;
               if retsize=wsize then exitstatus:=retarea[1]
               else exitstatus:=undef
             end
           else
             newlb(popa);
         end;
    LFR: begin k:=args(k); if k<>retsize then trap(EILLINS);
           for j:=k div wsize downto 1 do push(retarea[j]);
         end
  end
end;

procedure miscops;
  var i,j:integer;
begin
  case insr of
    { MISCELLANEOUS GROUP }
    ASP,ASS:
         begin if insr=ASS then
             begin k:=argw(k); if k<>wsize then trap(EILLINS); k:=popsw end;
           k:=argf(k);
           if k<0
             then for j:= 1 to -k div wsize do push(undef)
             else newsp(sp+k);
         end;
    BLM,BLS:
         begin if insr=BLS then
             begin k:=argw(k); if k<>wsize then trap(EILLINS); k:=pop end;
           k:=argz(k);
           b:=popa; a:=popa;
           for j := 1 to k div wsize do
             store(b-wsize+wsize*j,memw(a-wsize+wsize*j))
         end;
    CSA: begin k:=argw(k); if k<>wsize then trap(EILLINS);
           a:=popa;
           st:= popsw - signwd(memw(a+asize));
           if (st>=0) and (st<=memw(a+wsize+asize)) then
             b:=mema(a+2*wsize+asize+asize*st)
           else
             b:=mema(a);
           if b=0 then trap(ECASE) else newpc(b)
         end;
    CSB: begin k:=argw(k); if k<>wsize then trap(EILLINS); a:=popa;
           t:=pop; i:=1; found:=false;
           while (i<=memw(a+asize)) and not found do
             if t=memw(a+(asize+wsize)*i)
               then found:=true
               else i:=i+1;
           if found then b:=memw(a+(asize+wsize)*i+wsize) else b:=memw(a);
           if b=0 then trap(ECASE) else newpc(b);
         end;
    DCH: begin pusha(mema(popa+dynd)) end;
    DUP,DUS:
         begin if insr=DUS then
             begin k:=argw(k); if k<>wsize then trap(EILLINS); k:=pop end;
           k:=args(k);
           for i:=1 to k div wsize do push(memw(sp+k-wsize));
         end;
    EXG: begin
           k:=argw(k);
           for i:=1 to k div wsize do
             push(memw(sp+k-wsize));
           for i:=0 to k div wsize - 1 do
             store(sp+k+i*wsize,memw(sp+k+k+i*wsize));
           for i:=1 to k div wsize do
             begin t:=pop ; store(sp+k+k-wsize,t) end;
         end;
    FIL: filna(argg(k));
    GTO: begin k:=argg(k);
           newlb(mema(k+2*asize)); newsp(mema(k+asize)); newpc(mema(k))
         end;
    LIM: push(ignmask);
    LIN: lino(argn(k));
    LNI: lino(memw(0)+1);
    LOR: begin i:=argr(k);
           case i of 0:pusha(lb); 1:pusha(sp); 2:pusha(hp) end;
         end;
    LPB: pusha(popa+statd);
    MON: domon(pop);
    NOP: writeln('NOP at line ',memw(0):5) ;
    RCK: begin a:=popa;
           case szindex(argw(k)) of
           1: if (signwd(memw(sp))<signwd(memw(a))) or
                 (signwd(memw(sp))>signwd(memw(a+wsize))) then trap(ERANGE);
           2: if (memd(sp)<memd(a)) or
                 (memd(sp)>memd(a+2*wsize)) then trap(ERANGE);
           end
         end;
    RTT: dortt;
    SIG: begin a:=popa; pusha(uerrorproc); uerrorproc:=a end;
    SIM: ignmask:=pop;
    STR: begin i:=argr(k);
           case i of 0: newlb(popa); 1: newsp(popa); 2: newhp(popa) end;
         end;
    TRP: trap(pop)
  end
end;

{---------------------------------------------------------------------------}
{                               Main Loop                                   }
{---------------------------------------------------------------------------}
begin initialize;
8888:
  repeat
    opcode := nextpc;       { fetch the first byte of the instruction }
    if opcode=escape1 then iclass:=second
    else if opcode=escape2 then iclass:=tert
    else iclass:=prim;
    if iclass<>prim then opcode := nextpc;
    with dispat[iclass][opcode] do
      begin insr:=instr;
        if not (zbit in iflag) then
          if ibit in iflag then k:=pop else
            begin
              if mini in iflag then k:=implicit else
                begin
                  { shortie: implicit holds the high part, low byte follows }
                  if short in iflag then k:=implicit+nextpc else
                    begin k:=nextpc;
                      if (sbit in iflag) and (k>=128) then k:=k-256;
                      for i:=2 to ilength do k:=256*k + nextpc
                    end
                end;
              if wbit in iflag then k:=k*wsize;
            end
      end;
    case insr of

      NON: trap(EILLINS);

      { LOAD GROUP }
      LDC,LOC,LOL,LOE,LIL,LOF,LAL,LAE,LXL,LXA,LOI,LOS,LDL,LDE,LDF,LPI:
        loadops;

      { STORE GROUP }
      STL,STE,SIL,STF,STI,STS,SDL,SDE,SDF:
        storeops;

      { SIGNED INTEGER ARITHMETIC }
      ADI,SBI,MLI,DVI,RMI,NGI,SLI,SRI:
        intarith;

      { UNSIGNED INTEGER ARITHMETIC }
      ADU,SBU,MLU,DVU,RMU,SLU,SRU:
        unsarith;

      { FLOATING POINT ARITHMETIC }
      ADF,SBF,MLF,DVF,NGF,FIF,FEF:
        fltarith;

      { POINTER ARITHMETIC }
      ADP,ADS,SBS:
        ptrarith;

      { INCREMENT/DECREMENT/ZERO }
      INC,INL,INE,DEC,DEL,DEE,ZRL,ZRE,ZER,ZRF:
        incops;

      { CONVERT GROUP }
      CII,CIU,CIF,CUI,CUU,CUF,CFI,CFU,CFF:
        convops;

      { LOGICAL GROUP }
      XAND,IOR,XOR,COM,ROL,ROR:
        logops;

      { SET GROUP }
      INN,XSET:
        setops;

      { ARRAY GROUP }
      LAR,SAR,AAR:
        arrops;

      { COMPARE GROUP }
      CMI,CMU,CMP,CMF,CMS, TLT,TLE,TEQ,TNE,TGE,TGT:
        cmpops;

      { BRANCH GROUP }
      BRA, BLT,BLE,BEQ,BNE,BGE,BGT, ZLT,ZLE,ZEQ,ZNE,ZGE,ZGT:
        branchops;

      { PROCEDURE CALL GROUP }
      CAL,CAI,RET,LFR:
        callops;

      { MISCELLANEOUS GROUP }
      ASP,ASS,BLM,BLS,CSA,CSB,DCH,DUP,DUS,EXG,FIL,GTO,LIM,
      LIN,LNI,LOR,LPB,MON,NOP,RCK,RTT,SIG,SIM,STR,TRP:
        miscops;

    end;        { end of case statement }
    if not ( (insr=RET) or (insr=ASP) or (insr=BRA) or (insr=GTO) ) then
      retsize:=0 ;
  until halted;
9999:
  writeln('halt with exit status: ',exitstatus:1);
  doident;
end.
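
For readers who do not want to trace the Pascal, the operand fetch done by
the main loop can be sketched in a few lines of Python. This is only an
illustration under stated assumptions, not code from the EM manual: the name
decode_operand, the flag strings and the argument layout are hypothetical
stand-ins for the dispat table entry (iflag, implicit, ilength). The shortie
rule used here (implicit high part plus one following byte) is inferred from
the table reader, which multiplies implicit by 256 for shorties. The zbit
(no operand) and ibit (operand popped from the stack) cases are left out.

```python
def decode_operand(flags, implicit, length, code, pc, wsize=2):
    """Return (k, new_pc) for one instruction, following the main loop.

    flags    -- set of strings standing in for iflag (hypothetical names):
                'mini', 'short', 'sbit' (signed), 'wbit' (scale by wsize)
    implicit -- the operand itself (mini) or its high part times 256 (shortie)
    length   -- number of operand bytes for the ordinary formats
    code     -- program text as a bytes object; pc indexes the next byte
    """
    if 'mini' in flags:
        # Operand is implied by the opcode; nothing is fetched.
        k = implicit
    elif 'short' in flags:
        # High part comes from the opcode, low byte from the program text.
        k = implicit + code[pc]
        pc += 1
    else:
        # Ordinary format: bytes follow the opcode, high byte first.
        k = code[pc]
        pc += 1
        if 'sbit' in flags and k >= 128:
            k -= 256                      # sign-extend the first byte
        for _ in range(length - 1):
            k = 256 * k + code[pc]
            pc += 1
    if 'wbit' in flags:
        k *= wsize                        # operand was counted in words
    return k, pc


# A two byte signed operand 0xFF 0xFE decodes to -2, consuming both bytes.
print(decode_operand({'sbit'}, 0, 2, b'\xff\xfe', 0))   # (-2, 2)
```

As in the Pascal, the sign is applied to the first byte only, so the
remaining bytes can be folded in with a plain multiply and add.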
car@pte.UUCP (Chris Rende) (11/15/88)
In article <1650@ast.cs.vu.nl>, ast@cs.vu.nl (Andy Tanenbaum) writes:
> I can email the complete manual to people who
> are seriously interested, but it is quite large.

How large? (lines/bytes?)

I'm interested in a copy but I hesitate to ask for one because you hint that
it is SO big...

How about compressing and uuencoding it?

car.
--
Christopher A. Rende              Multics,DTSS,Shortwave,Scanners,StarTrek
uunet!{umix,edsews}!rphroy!pte!car  TRS-80 Model I: Buy Sell Trade
Motorola VME1131 M68020 SVR2      Precise Technology & Electronics, Inc.