cyrus@hi.UUCP (Tait Cyrus) (07/10/87)
Here at the University of New Mexico, we will be starting to port GENIX 4.2 to a 32016/32032 board. I have some questions.

1) What good are the cxp/rxp instructions?
2) Why can't the "standard" jsr/ret instructions be used?
3) What advantages are there for going through a jump table instead of jumping directly?

The reason I ask is that every cxp/rxp causes the 32xxx to read from the mod table, and I can see this as slowing things down A LOT. I have already had one person say that when they ported GENIX 4.1, they replaced all of the cxp/rxp type instructions with jsr/ret just to speed things up.

I would appreciate ANY comments/suggestions/ideas. Thanks in advance.

--
W. Tait Cyrus (505) 277-0806
University of New Mexico
Dept of EECE - Hypercube Project
Albuquerque, New Mexico 87131
e-mail: cyrus@hc.dspo.gov or seismo!unmvax!hi!cyrus
jans@tekchips.TEK.COM (Jan Steinman) (07/11/87)
>1) What good are the cxp/rxp instructions?...
>The reason I ask is that every cxp/rxp causes the 32xxx to read
>from the mod table and I can see this as slowing things down A LOT.

These instructions support shared libraries. Yes, they are somewhat slower than jsr/ret, but they are MUCH faster than doing shared library calls in software!

Jan Steinman N7JDB - Box 500, MS 50-470 - Beaverton, OR 97077
jans@tekcrl.tek.com - 503/627-5881
collins@encore.UUCP (Jeff Collins) (07/13/87)
In article <10742@hi.UUCP>, cyrus@hi.UUCP (Tait Cyrus) writes:
> Here at the University of New Mexico, we will be starting to port
> GENIX 4.2 to a 32016/32032 board. I have some questions.
>
> 1) What good are the cxp/rxp instructions?
> 2) Why can't the "standard" jsr/ret instructions be used?
> 3) What advantages are there for going through a jump table instead
> of jumping directly?
>

In normal operation these instructions should NOT be used. They are much slower than jsr/ret. We changed our compiler to not generate these instructions due to their execution times. In fact, we modified the OS to use the MOD register as infrequently as possible (only on interrupts, where we have no control over the CPU using it).

The next National chip (32532) has a direct interrupt mode that does not go through the MOD table to find its vector. We will be using that when we upgrade...
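The cost being argued about here can be made concrete with a toy model of what a cxp has to fetch before it can jump. The sketch below is Python, not NS32000 code. It assumes a module descriptor laid out as static base, link base, program base (one double word each) and a link table entry that packs a module address and an offset; treat that layout and packing as assumptions, not a statement of the chip's specification. The names `Machine`, `cxp`, `bsr`, and `demo` are hypothetical, and the model counts memory references only, not cycles.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """Toy model: counts memory references; it does not model cycle times."""
    mem: dict = field(default_factory=dict)    # address -> 32-bit value
    stack: list = field(default_factory=list)
    pc: int = 0
    mod: int = 0    # address of the current module descriptor
    sb: int = 0     # static base of the current module
    refs: int = 0   # memory references made so far

    def read(self, addr):
        self.refs += 1
        return self.mem[addr]

    def push(self, value):
        self.refs += 1
        self.stack.append(value)

    def bsr(self, target, return_pc):
        # Direct call: one stack write, no table lookups.
        self.push(return_pc)
        self.pc = target

    def cxp(self, index, return_pc):
        # Assumed descriptor layout: +0 static base, +4 link base, +8 program base.
        self.push(self.mod)
        self.push(return_pc)
        link_base = self.read(self.mod + 4)
        entry = self.read(link_base + 4 * index)
        new_mod, offset = entry & 0xFFFF, entry >> 16   # assumed packing
        self.mod = new_mod
        self.sb = self.read(new_mod + 0)
        self.pc = self.read(new_mod + 8) + offset


def demo():
    # Two made-up modules with descriptors at 0x100 and 0x110.
    mem = {
        0x100: 0x2000, 0x104: 0x3000, 0x108: 0x8000,   # caller: sb, link, prog
        0x110: 0x2400, 0x114: 0x3100, 0x118: 0x9000,   # callee: sb, link, prog
        0x3000: (0x40 << 16) | 0x110,                  # link entry 0: callee + 0x40
    }
    far = Machine(mem=mem, mod=0x100, sb=0x2000)
    far.cxp(0, return_pc=0x8010)
    near = Machine(mem=mem, mod=0x100, sb=0x2000)
    near.bsr(0x9040, return_pc=0x8010)
    return far, near
```

Both calls in `demo()` land on the same address, but in this model the cross-module call touches memory six times against once for the direct call, and it also switches the static base. Real timings depend on bus width and wait states, which the model ignores.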
urip@hcrvx1.UUCP (Uri Postavsky) (07/16/87)
In article <10742@hi.UUCP> cyrus@hi.UUCP (Tait Cyrus) writes:
>
>1) What good are the cxp/rxp instructions?
>2) Why can't the "standard" jsr/ret instructions be used?
>3) What advantages are there for going through a jump table instead
> of jumping directly?
>

As was mentioned in previous articles, CXP/RXP are good mainly for shared libraries and dynamic linking. In the context of UNIX they are no better than BSR/RET, just slower. The same is true for referencing external variables: the EXT addressing mode goes through the link table and is slower than the other memory addressing modes.

*BUT* you cannot just go ahead and replace all the CXP/RXP with BSR/RET. It depends on whether your assembler and linker support BSR across modules. To support this, the assembler has to generate PC-relative addressing for all external names (for variables, SB-relative is better) and generate relocation information for the linker. The linker has to "patch" the external references (both procedures and variables) in the code of each module with the correct memory addresses, which are known only at link time. The locations to patch are indicated in the relocation information.

When external references are implemented by the EXT addressing mode only (i.e. CXP/RXP), all the references of a module go through its link table, so the linker only needs to fill this table rather than patch the object code itself. Correspondingly, the assembler does not need to generate relocation information. An assembler that does not generate relocation information and a linker that can only fill link tables cannot support BSR across modules!!!

If the assembler and linker you have are from National, you can tell which kind you have by the directives for external names. The old tools use CXP/RXP, and the directives for external names are .export/.import for variables and .exportp/.importp for procedures. If you have these tools, you cannot use BSR across modules. The new tools use BSR/RET, and the directive for external names is .global.

--
Uri Postavsky ( ...{utzoo, utcsri}!hcr!urip ) at HCR, Toronto.
(formerly at National Semiconductor Tel Aviv).
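The difference between the two kinds of linker described above can be sketched in a few lines. This is a Python illustration under stated assumptions, not National's tool chain: modules are plain dicts, the keys `imports`, `relocs`, `base`, and `code` are hypothetical, and the patched field is a plain 4-byte big-endian displacement measured from the field itself, which simplifies the real instruction encoding.

```python
def link_by_table(modules, symbols):
    """Old-style link: fill each module's link table; code bytes are untouched."""
    for m in modules:
        m["link_table"] = [symbols[name] for name in m["imports"]]
    return modules


def link_by_patching(modules, symbols):
    """New-style link: write each resolved displacement into the code itself."""
    for m in modules:
        code = bytearray(m["code"])
        for offset, name in m["relocs"]:
            # PC-relative displacement from the patched field to the target
            # (a simplification; see the assumptions in the lead-in).
            disp = symbols[name] - (m["base"] + offset)
            code[offset:offset + 4] = disp.to_bytes(4, "big", signed=True)
        m["code"] = bytes(code)
    return modules


def demo():
    symbols = {"printf": 0x5000}
    main = {
        "base": 0x1000,
        "code": bytes(16),
        "imports": ["printf"],       # all the table-filling linker needs
        "relocs": [(8, "printf")],   # relocation info for the patching linker
    }
    table_linked = link_by_table([dict(main)], symbols)[0]
    patch_linked = link_by_patching([dict(main)], symbols)[0]
    return main, table_linked, patch_linked
```

In `demo()` the table-filling link leaves the code bytes identical to the input, while the patching link rewrites four bytes at offset 8. A module assembled without a `relocs` list gives `link_by_patching` nothing to work from, which is the point about the old tools.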
elg@killer.UUCP (Eric Green) (07/17/87)
in article <1751@encore.UUCP>, collins@encore.UUCP (Jeff Collins) says:
>
> In article <10742@hi.UUCP>, cyrus@hi.UUCP (Tait Cyrus) writes:
>> Here at the University of New Mexico, we will be starting to port
>> GENIX 4.2 to a 32016/32032 board. I have some questions.
>>
>> 1) What good are the cxp/rxp instructions?
>> 2) Why can't the "standard" jsr/ret instructions be used?
>> 3) What advantages are there for going through a jump table instead
>> of jumping directly?

Well, for one thing, it might make relocatable shared libraries easier. For example, on the Amiga, to access a shared library, you must first issue an "openlibrary" command (with the name of the library), which returns the address of the start of the library's jump table. Still, an indirect indexed jsr probably would be faster....

Of course, libraries would have to consist solely of relocatable code for such a scheme to work... or else you relocate them when loaded and map them into every process space (and un-map them when they're not being used -- on your typical system with a 32-bit address space, it's trivial to dedicate half of that address space to the kernel and shared libraries). But boy, wouldn't that make for some back doors! (someone loading in their own "stdio" library :-). Seems a shame to restrict library-loading to the "standard" libraries.

In any event, it wouldn't be difficult to come up with a better shared-library system than Sys V.3 uses. Like, my 12 year old brother could probably do better :-).

Eric Green {ihnp4,cbosgd}!killer!elg elg@usl.CSNET
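The jump-table idea above comes down to "look the library up by name once, then call through numbered slots". A minimal Python sketch of that shape follows. It does not reproduce the Amiga interface: `register_library`, `open_library`, `call`, and the slot constants are all hypothetical names, and a Python list stands in for the table of entry points.

```python
LIBRARIES = {}   # name -> jump table (list of callables); hypothetical registry

# Slot numbers are the only thing a client is compiled against.
SLOT_LENGTH = 0
SLOT_UPPER = 1


def register_library(name, functions):
    """Install (or replace) a library's jump table under the given name."""
    LIBRARIES[name] = list(functions)


def open_library(name):
    """Return the library's jump table, in the spirit of the "openlibrary" call."""
    return LIBRARIES[name]


def call(table, slot, *args):
    # Indirect indexed call: fetch the entry for this slot, then jump through it.
    return table[slot](*args)


def demo():
    register_library("string", [len, str.upper])
    table = open_library("string")
    first = (call(table, SLOT_LENGTH, "genix"), call(table, SLOT_UPPER, "genix"))

    # A new library version can live anywhere; the client code is unchanged
    # as long as the slot numbers stay the same.
    register_library("string", [lambda s: len(s.strip()), str.upper])
    table = open_library("string")
    second = call(table, SLOT_LENGTH, " genix ")
    return first, second
```

The caller never holds a function's address directly, only the table and a slot number, so the library can be loaded at any address. The same property is what makes the back door mentioned above possible: whoever controls the registry controls what every slot points to.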
mjd@doc.ic.ac.uk (Martin J Davies) (07/19/87)
>
>>1) What good are the cxp/rxp instructions?...
>>The reason I ask is that every cxp/rxp causes the 32xxx to read
>>from the mod table and I can see this as slowing things down A LOT.
>
>These instructions support shared libraries. Yes, they are somewhat slower than jsr/ret, but they are MUCH faster than doing shared library calls in software!

I have implemented shared 'C' libraries on a 32016 machine running my own multi-user/multitasking o/s. I started out using the RXP/CXP instructions, and this went quite well, but there was a speed penalty. I found it much faster to dynamically link the library calls using an interrupt linkage handler. How this could be applied to unix I am not sure, but modules do seem to use a noticeable amount of cpu time.

Posted for P. Winterbottom
Kings College London (Gemini Project)
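The post does not say how the interrupt linkage handler works, so the following is only one plausible reading, sketched in Python: each call site starts out unresolved, the first call through it traps to a handler that looks the routine up by name, and the handler patches the site so later calls go straight to the target. `LazyLinker`, its methods, and the site numbering are hypothetical.

```python
class LazyLinker:
    """Each call site starts unresolved; the first call 'traps' to resolve it."""

    def __init__(self, library):
        self.library = library     # name -> function
        self.call_sites = {}       # site id -> resolved function
        self.traps = 0             # how many times the handler ran

    def _linkage_handler(self, site, name):
        # Stand-in for the interrupt handler: look the name up once and
        # patch the call site so later calls bypass the handler.
        self.traps += 1
        self.call_sites[site] = self.library[name]
        return self.call_sites[site]

    def call(self, site, name, *args):
        target = self.call_sites.get(site)
        if target is None:
            target = self._linkage_handler(site, name)
        return target(*args)


def demo():
    linker = LazyLinker({"strlen": len})
    results = [linker.call(0, "strlen", "genix") for _ in range(3)]
    return linker, results
```

In `demo()` the handler runs once for three calls from the same site. Under this reading the lookup cost is paid on the first call only, whereas a cxp pays for its table reads on every call.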
greg@utcsri.UUCP (07/25/87)
>>>1) What good are the cxp/rxp instructions?...
>>>The reason I ask is that every cxp/rxp causes the 32xxx to read
>>>from the mod table and I can see this as slowing things down A LOT.
>>
>>These instructions support shared libraries. Yes, they are somewhat
>>slower than jsr/ret, but they are MUCH faster than doing shared library
>> calls in software!
>

I haven't seen the other advantage of cxp/rxp yet, so I'll bring it up.

Cxp/rxp ops allow a program to be divided into a number of modules, each having its own static data. When cxp is used to call a procedure in a different module, the sb register is reloaded to point to the data of that module. This static data can be accessed by indexing from the sb register, and the less data there is, the smaller the indices will be on average. The NS32k allows three different sizes of index; thus smaller indices mean smaller and somewhat faster code. If cxp is not used, all of the static data for the whole program will be lumped together, and must be addressed using absolute addressing or by potentially large offsets from the sb.

Note that the external data addressing mode ( which is time-consuming ) need not be used to access data in another module; you can use absolute addressing in this case ( not if the other module is a shared resident library of course ).

Subroutines which can be called externally must end in rxp, and thus must be called via cxp, even when called from within the same module. To reduce this effect, the compiler should be able to determine which routines cannot be externally called, and make them 'jsr/ret' routines. Unfortunately, in C, the only routines which cannot be called externally are those declared 'static', and this declaration is rarely used. ( I am assuming one foo.c file compiles to a single ns32k module, which is the logical way to do it ). Languages such as Concurrent Euclid, which directly support the 'module' paradigm, can make much better use of the cxp/rxp instructions.

Finally, if each object module is an ns32k module ( and if the external data addressing mode is used ), linking of object modules can be done VERY cheaply ( i.e. without modifying any of the program text segment ). This is similar to the way resident shared libraries are done - a RSL is effectively linked to the program at load time.

--
----------------------------------------------------------------------
Greg Smith University of Toronto UUCP: ..utzoo!utcsri!greg
Have vAX, will hack...
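The code-size argument about index sizes can be checked with a little arithmetic. The Python sketch below assumes the three displacement forms are 1, 2, and 4 bytes holding 7-, 14-, and 30-bit signed values; those ranges are an assumption about the NS32000 encoding and should be checked against the manual. The module sizes are made up, and `displacement_bytes`, `total_displacement_bytes`, and `demo` are hypothetical names.

```python
def displacement_bytes(offset):
    """Bytes needed to encode a signed displacement (assumed 7/14/30-bit forms)."""
    if -64 <= offset <= 63:
        return 1
    if -8192 <= offset <= 8191:
        return 2
    if -(1 << 29) <= offset < (1 << 29):
        return 4
    raise ValueError("offset does not fit in a displacement")


def total_displacement_bytes(addresses, sb):
    """Sum of displacement sizes when every variable is addressed as offset(sb)."""
    return sum(displacement_bytes(addr - sb) for addr in addresses)


def demo():
    # Ten made-up modules, each with twelve 4-byte static variables (48 bytes).
    modules = [[m * 48 + 4 * i for i in range(12)] for m in range(10)]
    # Per-module sb: each module's statics start at displacement 0.
    per_module = sum(total_displacement_bytes(mod, mod[0]) for mod in modules)
    # One sb for the whole program: later modules sit at larger displacements.
    lumped = total_displacement_bytes([a for mod in modules for a in mod], 0)
    return per_module, lumped
```

With these numbers, one reference to each variable costs 120 displacement bytes when every module has its own sb and 224 bytes when all statics share one sb, because only the first 64 bytes of the lumped data stay within the 1-byte form. Whether that saving outweighs the cost of cxp is exactly the trade-off under discussion in this thread.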