campbell@sauron.UUCP (Mark Campbell) (10/31/86)
Question #1:
Has anyone out there looked at the I80386 with respect to multiprocessing? In particular, how would you handle the lack of a CIOUT-like signal (MC68030) which would allow dynamic disabling of an external cache on accesses to specified pages? If two processes on two different processors had shared memory between them, it would seem impossible to maintain data integrity if both processors had local data caches.

Question #2:
Looking at the timing diagrams of the I80386, it would appear that if NA were constantly asserted you'd get rid of the address valid delay times (32ns at 20MHz) at the beginning of a cycle so that you'd have a lot more time to handle your cache tag compare. All it would cost would be the latching of addresses from the processor. Any comment and/or correction?

--
Mark Campbell
Phone: (803)-791-6697
E-Mail: !ncsu!ncrcae!sauron!campbell
kds@mipos3.UUCP (Ken Shoemaker ~) (11/01/86)
In article <749@sauron.UUCP> campbell@sauron.UUCP (Mark Campbell) writes:
>Question #1:
>Has anyone out there looked at the I80386 with respect to multiprocessing?
>In particular, how would you handle the lack of a CIOUT-like signal (MC68030)
>which would allow dynamic disabling of an external cache on accesses to
>specified pages?

Easily done: just use one of the 32 address lines as the cache-disable signal (I hope I'm not being too shortsighted in assuming that no one is going to try to hook up 4 Gbytes of physical memory to the beastie). This has the additional value of working in a straight segmented OS as well as one that uses paging.

>Question #2:
>Looking at the timing diagrams of the I80386, it would appear that if
>NA were constantly asserted you'd get rid of the address valid delay
>times (32ns at 20MHz) at the beginning of a cycle so that you'd have
>a lot more time to handle your cache tag compare. All it would cost would
>be the latching of addresses from the processor. Any comment and/or
>correction?

Quite correct. The only downside is a small performance degradation of the system owing to a 3-clock latency from address output from the 386 to data being returned. For example, after a jump, it will take an additional clock to get the first instruction back to the processor, and the latency from when a read address is presented until the read data comes back is also an additional instruction. Another consideration is that if you need the extra time on the bus to get the cycle through externally, you will require a 3-clock bus cycle when starting from an idle bus: the part can't guess the next address while the bus is idle, so the first bus cycle from an idle bus is necessarily non-pipelined.
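To make the address-line answer to Question #1 concrete, here is a minimal sketch of how the decode could work. It is only an illustration: the choice of A31 as the flag and the names `NOCACHE_BIT`, `uncached_alias`, `cache_enabled`, and `memory_address` are assumptions, not anything taken from the posts or from Intel documentation. The idea is that the OS maps shared data at an alias with the flag bit set, the cache controller watches that bit, and the memory array ignores it.

```python
# Assumption: A31 is the address line sacrificed as a "do not cache" flag.
# Any otherwise unused high-order address line would serve the same purpose.
NOCACHE_BIT = 1 << 31
ADDR_MASK = (1 << 32) - 1  # 32-bit physical address space


def uncached_alias(phys: int) -> int:
    """Address the OS would put in a page table entry or segment base
    for data shared between processors."""
    return (phys | NOCACHE_BIT) & ADDR_MASK


def cache_enabled(bus_addr: int) -> bool:
    """What the external cache controller would decode:
    cache the access only when the flag line is low."""
    return (bus_addr & NOCACHE_BIT) == 0


def memory_address(bus_addr: int) -> int:
    """What the memory array would decode: the flag line is ignored,
    so both aliases select the same storage location."""
    return bus_addr & ~NOCACHE_BIT & ADDR_MASK


private = 0x0012_3000                 # ordinary, cacheable data
shared = uncached_alias(0x0045_6000)  # shared data, mapped through the alias

for addr in (private, shared):
    print(f"{addr:#010x} -> memory {memory_address(addr):#010x}, "
          f"cacheable={cache_enabled(addr)}")
```

Because the flag is part of the physical address rather than of a page attribute, the same mapping works whether the OS reaches shared data through page tables or through a segment base, which is the point made in the reply. The cost is half of the physical address space, per the 4 Gbyte remark.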
The performance difference between a no-wait-state non-pipelined bus and a no-wait-state pipelined bus is somewhere in the range of adding 1/3 to 1/2 of a wait state. So you will make a faster system with a pipelined bus than you would if you had to add a wait state (which is what you'd have to do if you couldn't run a pipelined bus and still wanted to live with the same speed memory system), but it still isn't as fast as a straight no-wait-state, non-pipelined system.

--
The above views are personal.

I've seen the future, I can't afford it...

Ken Shoemaker, Microprocessor Design, Intel Corp., Santa Clara, California
uucp: ...{ hplabs|amdcad|qantel|pur-ee|scgvaxd|oliveb }!intelca!mipos3!kds
csnet/arpanet: kds@mipos3.intel.com
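The wait-state comparison in the reply can be put into rough numbers. The sketch below is an assumption-laden illustration, not a measurement: it treats the "1/3 to 1/2 a wait state" figure as an extra per-bus-cycle cost and assumes a no-wait-state 386 bus cycle is two clocks long. The names `BASE_CLOCKS` and `effective_clocks` are made up for the example.

```python
from fractions import Fraction

# Assumption: a no-wait-state bus cycle takes two processor clocks,
# and each wait state adds one clock.
BASE_CLOCKS = 2


def effective_clocks(equivalent_wait_states) -> Fraction:
    """Average clocks per bus cycle for a given (possibly fractional)
    number of equivalent wait states."""
    return BASE_CLOCKS + Fraction(equivalent_wait_states)


non_pipelined_0ws = effective_clocks(0)
pipelined_low = effective_clocks(Fraction(1, 3))   # optimistic end of the quoted range
pipelined_high = effective_clocks(Fraction(1, 2))  # pessimistic end of the quoted range
non_pipelined_1ws = effective_clocks(1)

for name, clocks in [
    ("non-pipelined, 0 wait states", non_pipelined_0ws),
    ("pipelined, 0 wait states (low)", pipelined_low),
    ("pipelined, 0 wait states (high)", pipelined_high),
    ("non-pipelined, 1 wait state", non_pipelined_1ws),
]:
    relative = clocks / BASE_CLOCKS
    print(f"{name:32s} {float(clocks):.2f} clocks/cycle, "
          f"{float(relative):.2f}x the no-wait-state cost")
```

These are per-bus-cycle figures only. How much they affect overall system speed depends on how busy the bus is, which this sketch does not model.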