jkf@Franz.COM (Sean Foderaro) (12/15/90)
Hello, I work at Franz Inc. porting Allegro Common Lisp to various architectures. I've run into a problem making our 88k version of Lisp compliant with the 88open standard. I believe that the problem lies in the 88open standard, and I'd like to see that standard changed. While in theory a single person with a reasonable argument should be able to effect a change, in practice I suspect that it will take a large group of people in favor of the change for the 88open consortium to take notice. Thus I'm soliciting people to join with me to change the standard.

This is the problem: a memory page can only be in one of three states:

    (1) read-execute
    (2) read-write
    (3) read

Note that read-write-execute is not an option, and for Lisp we need read-write-execute (at least in a limited form) for many of our data pages.

Let's examine current practice among computer architectures (for it was current practice that led us to design a memory allocation scheme that turned out to be incompatible with the 88open spec). Based on our current Lisp ports:

Systems that permit read-write-execute:
    68010, 68020, 68030
    vax
    386
    370
    cray
    cray 2
    sparc
    Aviion (88k) [In our experience we don't need to cache flush on this machine; others may have different experiences]

Systems that permit read-write-execute but require the user to explicitly cache flush:
    mips
    rs/6000
    motorola 188 board (88k)

Systems that don't permit read-write-execute at all:
    motorola 181 board (88k) [follows official 88open spec]

The 88open consortium has taken a bold stand by refusing to support a mode of operation that its competitors support (and in fact that nearly every machine before it has supported). Note that some 88k systems support read-write-execute, but because this isn't a specified 88open extension, no program that depends on that extension can be certified by the 88open process.

You might ask why we don't just change our Lisp to obey the 88open spec. Part of the reason is that it would be a lot of work.
We have a very complex storage management/garbage collection scheme that has taken many man-years to write. It treats executable code like any other data object, and it garbage collects code that will never be referenced again. Changing this collector so that it segregates code vectors from other data objects would add another layer of complexity around what is already a very complex piece of code. Such added complexity would just make things worse on our other ports. Furthermore, we don't see a huge number of 88k machines that require this change (in fact we've only come across one old-style Motorola development board that requires compliance with this part of the 88open spec).

A second reason we don't like this part of the spec is that 88open is setting a dangerous precedent (new computers are supposed to increase functionality, not reduce it!). We don't want the functionality of future systems compromised by restrictions like this one, as it will inhibit what software can be designed.

What we propose:

We don't need fine-grain read-write-execute where you write a word in memory and then jump to it and execute it. We propose a very coarse-grain read-write-execute that just forces the instruction and data caches to synchronize when this action is explicitly requested by the program. For those with access to the memctl() manual page, the change is exactly this:

To the table of states, add a new state, state 4, which is readable, writeable and executable.

To the text description add:

    After memctl() is called with a state argument of 4, the region is readable, writeable and executable. However if any byte within a (4 byte) word in this region is written, then the subsequent execution of that word is undefined.

The change implies that in order to write code into memory and then execute it, you must call memctl with a state of 4 between the time the code is written to memory and the time it is executed. Thus we don't need the instruction cache snooping the data cache.
All we need is for the operating system to be able to flush a region of the data cache to memory and to be able to clear a region of the instruction cache. I think that every version of Unix that does paging will already have this capability.

I'd like those who support this change to contact me (via email, phone, or comp.sys.m88k). Please mention why you want it (e.g. you have an application that requires it or you simply believe that it is correct). I'd also like to hear from hardware and operating system people who believe that implementing this would greatly slow down or otherwise hamper programs running on the 88k, even if they didn't use read-write-execute pages.

- John Foderaro          voice: 415 548 3600
  Franz Inc              fax:   415 548 8253
                         email: jkf@franz.com -or- uunet!franz!jkf
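
The rule in the proposed manual text can be illustrated with a small Python model. This is only a sketch of the contract as worded in the post, not the real memctl() interface: the `Region` class, its method names, and the assumption that entering state 1 also synchronizes the caches are all hypothetical.

```python
WORD_BYTES = 4  # word size named in the proposed manual text

# The three states listed in the post, plus the proposed state 4.
READ_EXECUTE, READ_WRITE, READ_ONLY, READ_WRITE_EXECUTE = 1, 2, 3, 4


class Region:
    """Toy model of one memory region; not the real memctl() interface."""

    def __init__(self, base, size):
        self.base = base
        self.size = size
        self.state = READ_WRITE
        self.stale_words = set()  # words written since the last sync point

    def _check(self, addr):
        if not self.base <= addr < self.base + self.size:
            raise ValueError("address outside region")

    def memctl(self, state):
        """Change state. Assumption: entering an executable state syncs caches."""
        self.state = state
        if state in (READ_EXECUTE, READ_WRITE_EXECUTE):
            self.stale_words.clear()

    def write_byte(self, addr):
        """Record a write; any byte written taints its whole 4-byte word."""
        self._check(addr)
        if self.state not in (READ_WRITE, READ_WRITE_EXECUTE):
            raise PermissionError("region is not writable")
        self.stale_words.add(addr // WORD_BYTES)

    def execution_defined(self, addr):
        """True if executing the word containing addr has defined behaviour."""
        self._check(addr)
        if self.state not in (READ_EXECUTE, READ_WRITE_EXECUTE):
            return False
        return addr // WORD_BYTES not in self.stale_words
```

In this model a Lisp system would write a code vector, call `memctl(4)` once, and only then jump into it; a second `memctl(4)` is needed after any later write to the same words.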
jkf@Franz.COM (Sean Foderaro) (12/20/90)
>> From: hamilton@dg-rtp.dg.com (Eric Hamilton)
>> How about a new trap:
>> r2 contains the base address
>> r3 contains the length
>>
>> tb0 0,r0,<CacheSynchronizationTrap>
>> Will cause the data and instruction caches for the specified region (between
>> r2 and r2+r3-1, byte granular, no minimum length) to come into coherence,
>> so that that region can be safely executed.

I support your request for the trap to synchronize a region of the address space. However, unless the memctl change I propose is passed, there isn't a need for the trap, since it will still be illegal to have read-write-execute pages.

My immediate concern is getting this rather obvious memctl() flaw fixed in the spec, and thus I proposed something with very little implementation cost. The 88open folks tell me that even this little change will have a hard time getting passed. So I'm concerned that if I start asking for more (like the trap you suggested), it won't have a chance of getting passed.

The primary reason the 88open people gave for the current state of memctl() is that multiprocessor systems couldn't handle the memctl() change I proposed. As you are familiar with 88k multiprocessor systems, can you tell me if this is true? Is there anything (that you can reveal) about future versions of the 88k chipset that will make this true? Could you imagine anyone building their own caching system (not the 88200) that supports a paging version of Unix yet which can't support the memctl() change?

-john foderaro
 franz inc.
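
The quoted trap takes a byte-granular region (r2 to r2+r3-1), while caches work in whole lines, so an implementation would have to widen the region to line boundaries. A minimal Python sketch of that arithmetic follows; the function name and the 16-byte default line size are illustrative assumptions, not taken from the thread.

```python
def cache_lines(base, length, line_size=16):
    """Return the start addresses of every cache line touched by the
    byte range [base, base + length - 1].

    line_size is an assumed value for illustration only.
    """
    if length <= 0:
        return []  # an empty region touches no lines
    first = base - base % line_size                        # round down
    last = (base + length - 1) // line_size * line_size    # line of last byte
    return list(range(first, last + line_size, line_size))
```

Each address returned is a line that would be written back from the data cache and invalidated in the instruction cache.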
hamilton@siberia.rtp.dg.com (Eric Hamilton) (12/22/90)
In article <JKF.90Dec19092258@frisky.Franz.COM>, jkf@Franz.COM (Sean Foderaro) writes:
|>
|> I support your request for the trap to synchronize a region of the
|> address space. However unless the memctl change I propose is
|> passed, there isn't a need for the trap since it will still be
|> illegal to have read-write-execute pages.
|>

I suspect that our postings have crossed in the mail.... I've addressed this point in a "Read/write/execute Proposal" which I posted to comp.sys.m88k. And, yes, I agree that the memctl() change or something similar is needed *as well as* the cache synchronizing trap.

|> The primary reason the 88open people gave for the current state of
|> memctl() is that multiprocessor systems couldn't handle the memctl()
|> change I proposed. As you are familiar with 88k multiprocessor systems
|> can you tell me if this is true? Is there anything (that you can
|> reveal) about future versions of the 88k chipset that will make this
|> true? Could you imagine anyone building their own caching system
|> (not the 88200) that supports a paging version of Unix yet which
|> can't support the memctl() change?
|>

There has to have been a misunderstanding somewhere along the line. It is true that, in general, the cache synchronization trap must act on all processors' caches, but that certainly doesn't make it unimplementable. In fact, the cache synchronization trap does exactly the same thing that the kernel must do, in general, when paging in any executable pages (assuming that the instruction caches don't snoop - a good assumption).

This one is technically feasible and not even especially difficult, on uniprocessor and multiprocessor systems, with all 88000s that I am aware of. I cannot imagine designing a caching subsystem for which this would not be the case.
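
Why the synchronization has to reach every processor can be seen in a toy Python model. It assumes write-back data caches and instruction caches that do not snoop, as the post does; the `CPU` class, `sync_region`, and the use of a dict for memory are hypothetical simplifications and not a description of any real 88k system.

```python
class CPU:
    """One processor with private, non-snooping caches (toy model)."""

    def __init__(self, memory):
        self.memory = memory  # shared dict: address -> value
        self.dcache = {}      # write-back data cache, may be newer than memory
        self.icache = {}      # instruction cache, filled from memory on fetch

    def store(self, addr, value):
        self.dcache[addr] = value  # not yet visible in memory

    def fetch(self, addr):
        if addr not in self.icache:
            self.icache[addr] = self.memory.get(addr)
        return self.icache[addr]


def sync_region(cpus, base, length):
    """Bring caches into coherence for [base, base + length - 1] on the
    given processors: write dirty data back, then drop stale instructions."""
    for addr in range(base, base + length):
        for cpu in cpus:
            if addr in cpu.dcache:
                cpu.memory[addr] = cpu.dcache[addr]
        for cpu in cpus:
            cpu.icache.pop(addr, None)


if __name__ == "__main__":
    memory = {0x100: "old code"}
    a, b = CPU(memory), CPU(memory)
    b.fetch(0x100)               # b caches the old instruction
    a.store(0x100, "new code")   # a generates new code
    sync_region([a], 0x100, 1)   # syncing only a leaves b stale
    print(b.fetch(0x100))
    sync_region([a, b], 0x100, 1)
    print(b.fetch(0x100))
```

Synchronizing only the writing processor updates memory but leaves the other processor's instruction cache holding the old code, which is why the operation has to cover all processors, just as paging in an executable page does.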