olmstead (02/22/83)
I recently noticed that the compiler never gets a -O flag when compiling our kernel (well, actually, it does when it compiles vers.c, but that hardly counts). Before I go ahead and turn it on, can anyone tell me whether that's a bad idea? I've seen examples of the optimizer's generating bad code, but I think that was all floating-point stuff. Will it break our UN*X? Will it cause all those funky sed scripts to fail?

[I ask this last one because I once added some code to call spl7() from a function that declared a register variable and initialized it to TS_OK (= 0). The compiler cleverly remembered that it had a zero handy in the register and called spl7 via "calls reg, _spl7" or some such. The sed script later tried to change all calls to splX into inline instructions; it KNEW, of course, that these appeared as "calls, $ 0, _splX" (the space is there so somebody's news doesn't eat my dollar-zero). The end result was that the loader told me that spl7 was undefined.]

So, to optimize or not to optimize: that is the question.

TIA,
Patrick Olmstead
...ucbvax!menlo70!sytek!olmstead
...decvax!sytek!olmstead
Olmstead.PA@PARC-MAXC.ARPA
ken (02/24/83)
It is all right to optimize the kernel, but it is not recommended to invoke the C optimizer on the kernel code. The kernel is very different from other programs because it accesses the devices sitting out there on the bus, which are not necessarily memory. The optimizer assumes that every location it accesses behaves according to a model that resembles generic memory.

Many of the control, status, command, and data registers of devices such as UARTs, disk controllers, graphics boards, and the like DO NOT behave like memory. Some registers are not read/write; that is, they may be writeable, but may return garbage when read. Likewise, they may be status-type bits, which can be read but not written. One of the worst hardware designs (found quite a bit in practice) is to have the same memory address refer to two totally different registers when reading as opposed to writing. There may be a memory subsystem out there on the bus which responds to every byte address in a certain range, but which cannot be accessed in word quantities. Some types of registers modify the contents of other registers when they are read; e.g. reading the data register of a UART usually clears the FULL bit in the status register.

What difference does this make, you say? Well, the optimizer will do things like not even bother to read a location it has just written, even if the C code does; it already [thinks] it knows what is contained in that location because it just wrote it!

Some (probably most) computers have a special instruction that performs the CLEAR function. At least one UNIX-supporting minicomputer that I know of accomplishes this by subtracting the location from itself, generating a read-modify-write cycle instead of just a write cycle. This can wreak havoc with registers that have side effects (such as the UART data register). It is better to write the constant 0 than to invoke the CLEAR instruction.
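The two failure modes above can be sketched with a small software model of a device. This is only an illustration, not real driver code: the struct `sim_uart`, the `ST_FULL` bit value, and every function name below are hypothetical, and the model simply assumes that a read of the data address consumes the received character and clears FULL, while a write to the same address lands in a different (transmit) register.

```c
#include <assert.h>

/* Hypothetical UART model; names and bit layout are assumptions. */
#define ST_FULL 0x01            /* receiver holds an unread character */

struct sim_uart {
    unsigned char rx_data;      /* register seen when the data address is read */
    unsigned char tx_data;      /* register seen when the data address is written */
    unsigned char status;       /* status register */
    int bus_reads;              /* read cycles seen on the data address */
};

/* The device receives a character: latch it and raise FULL. */
void uart_receive(struct sim_uart *u, unsigned char c)
{
    u->rx_data = c;
    u->status |= ST_FULL;
}

/* Read cycle on the data address: returns the character and,
   as a side effect, clears FULL in the status register. */
unsigned char bus_read(struct sim_uart *u)
{
    u->bus_reads++;
    u->status &= (unsigned char)~ST_FULL;
    return u->rx_data;
}

/* Write cycle on the data address: goes to the transmit register only. */
void bus_write(struct sim_uart *u, unsigned char v)
{
    u->tx_data = v;
}

/* What the C source asks for: write, then really read the location back. */
unsigned char write_then_read(struct sim_uart *u, unsigned char v)
{
    bus_write(u, v);
    return bus_read(u);
}

/* What an optimizer that treats the location as memory may do instead:
   skip the read and reuse the value it just wrote. */
unsigned char write_then_assume(struct sim_uart *u, unsigned char v)
{
    bus_write(u, v);
    return v;
}

/* CLEAR done as "subtract the location from itself": a read cycle
   followed by a write cycle. */
void clear_by_subtract(struct sim_uart *u)
{
    unsigned char old = bus_read(u);
    bus_write(u, (unsigned char)(old - old));
    assert(u->tx_data == 0);
}

/* Clearing by storing the constant 0: a single write cycle, no read. */
void clear_by_store(struct sim_uart *u)
{
    bus_write(u, 0);
}
```

In this model, `clear_by_store` leaves the pending character and the FULL bit alone, while `clear_by_subtract` silently consumes the character. Likewise, `write_then_read` returns whatever the device had received, whereas `write_then_assume` returns the value just written and never touches the bus for the read.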
I discovered these problems by writing diagnostic code for imaging & graphics peripherals. Bad hardware tested good when I used the optimizer, and tested bad when I didn't. It would probably be all right to sic the optimizer on portions of the kernel that do not deal with devices, but unless you are intimately familiar with the hardware of a particular device, it is recommended NOT TO OPTIMIZE ANY DEVICE DRIVER.

Ken Turkowski
{ucbvax,decvax}!decwrl!turtlevax!ken
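For readers with a later compiler: the `volatile` qualifier, standardized afterwards in ANSI C, is the usual way to tell the compiler that a location does not behave like ordinary memory, so that accesses written in the source are not elided or merged. A minimal sketch follows, assuming a hypothetical two-register layout; `uart_regs`, `RX_FULL`, and `uart_getc` are illustrative names, not taken from any real driver.

```c
/* Hypothetical register layout; field order and the FULL bit are assumptions. */
#define RX_FULL 0x01

struct uart_regs {
    volatile unsigned char status;  /* each read must really be performed */
    volatile unsigned char data;    /* reading may have device side effects */
};

/* Poll the status register up to max_polls times; return the received
   character once FULL is seen, or -1 if it never is. Because the fields
   are volatile, the compiler may not hoist the status read out of the
   loop or drop the data read. */
int uart_getc(struct uart_regs *r, long max_polls)
{
    while (max_polls-- > 0) {
        if (r->status & RX_FULL)
            return r->data;
    }
    return -1;
}
```

Note that `volatile` only constrains whether and in what order the accesses happen; it says nothing about which machine instruction implements a store, so the CLEAR-instruction hazard described above still has to be checked against the generated code.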
davec (02/26/83)
It has just been pointed out that only relatively recent versions of the 4.1bsd tape from Berkeley have an optimized kernel (say within the last year, since that's about the age of our system, which has always been running an optimized kernel).

dgc