[net.micro.68k] Re**n: Bus Error Effluvia

kjm@ut-ngp.UTEXAS (Ken Montgomery) (09/17/85)

[reposted since I don't think it got out the first time...]

In article <532@oakhill.UUCP> oakhill!davet (Dave Trissel) writes:

>In article <69@intelca.UUCP> kds@intelca.UUCP (Ken Shoemaker) writes:
>
>>bus fault handler, you have to be very careful, since (if r0 is used to
>>pass back values)
>>
>>loop:
>> mov     to-fault-location,blah
>> cmp     r0,1
>> jnz     loop
>>
>
>This is very easy for the '020 programmer to solve.  The MC68020 NOP
>instruction serializes the machine (e.g. guarantees all updates for previous
>instructions have been done.)  Therefore, by throwing in a NOP after the
>instruction which does the write such a loop on the '020 will always
>properly execute.

How does one get this NOP generated in the correct places for compiled code
that has to be portable?  The synchronizing effect is not useful unless one
can manage to invoke it.  Even if one solves this problem, the problem of
detecting all of the places where it is needed remains.

--
The above viewpoints are mine.  They are unrelated to
those of anyone else, including my cat and my employer.

Ken Montgomery  "Shredder-of-hapless-smurfs"
...!{ihnp4,allegra,seismo!ut-sally}!ut-ngp!kjm  [Usenet, when working]
kjm@ngp.UTEXAS.EDU  [Internet, if the nameservers are up]

davet@oakhill.UUCP (Dave Trissel) (09/17/85)

In article <2393@ut-ngp.UTEXAS> kjm@ut-ngp.UTEXAS (Ken Montgomery) writes:
>>
>>>bus fault handler, you have to be very careful, since (if r0 is used to
>>>pass back values)
>>>
>>>loop:
>>> mov     to-fault-location,blah
>>> cmp     r0,1
>>> jnz     loop
>>>
>>
>>This is very easy for the '020 programmer to solve.  The MC68020 NOP
>>instruction serializes the machine (e.g. guarantees all updates for previous
>>instructions have been done.)  Therefore, by throwing in a NOP after the
>>instruction which does the write such a loop on the '020 will always
>>properly execute.
>
>How does one get this NOP generated in the correct places for compiled code
>that has to be portable?  The synchronizing effect is not useful unless one
>can manage to invoke it.  Even if one solves this problem, the problem of
>detecting all of the places where it is needed remains.

First of all, Ken's code above is not portable, and is not meant to be so.
It will only work in a system where you are the software/hardware designer and
have complete control over interrupt handlers and the hardware memory map
of the machine.

Second of all, each architecture has specific methods which must be followed
to guarantee proper communication between different tasks or between tasks
and interrupt exception routines.  Although the above code is useful for
investigating pipeline subtleties it is not the proper way to communicate
since it makes unwarranted assumptions about its environment.  For example,
what if another bit of code (in the same task or any other) happened to write
to the same memory location?  Would it find its register clobbered? Would the
exception handler bomb the machine because it only expected one special task
to try the write?

In other words, there is an agreement here between the interrupt handler
and this piece of code AND an underlying assumption about the MPU involved.
Compilers simply don't expect this kind of side-effect and therefore have no
reason to prepare for it.

If you are using an intermediate language for systems programming and do want to
produce something that works then certainly the language itself would have
the capability of recognizing when to use a  NOP (in the case of the '020)
or allow you to insert an embedded assembly language statement for the same
effect.  The IBM PL/I compilers, for example, would automatically produce a
NOP for the highly pipelined 360 model 91 (I think it was) whenever a null
statement (just a semicolon) was encountered in the program. That NOP did
the similar function of serializing the pipe on that machine.
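
As a rough illustration of the embedded-assembly route, here is a minimal
sketch in C.  It assumes a GCC-style compiler with extended asm and the
predefined macro __m68k__ on 68k targets.  The names SERIALIZE, bus_fault_seen
and probe_write are hypothetical, and the bus-error handler that would set the
flag (standing in for r0 in the loop above) is not shown.

```c
#include <stdint.h>

/* SERIALIZE() is a hypothetical macro name, not something from the thread. */
#if defined(__GNUC__) && defined(__m68k__)
/* 68k target: emit a real NOP and stop the compiler from moving
   memory accesses across it. */
#define SERIALIZE() __asm__ volatile ("nop" ::: "memory")
#elif defined(__GNUC__)
/* Any other target: compiler-only barrier, no instruction is emitted. */
#define SERIALIZE() __asm__ volatile ("" ::: "memory")
#else
#define SERIALIZE() ((void)0)
#endif

/* Assumed to be set nonzero by a bus-error handler (not shown).
   It plays the role of r0 in the assembly loop. */
static volatile int bus_fault_seen;

/* Store val at addr, then report whether a fault was flagged. */
static int probe_write(volatile uint8_t *addr, uint8_t val)
{
    bus_fault_seen = 0;
    *addr = val;      /* the store that may fault */
    SERIALIZE();      /* let the store finish before the flag is tested */
    return bus_fault_seen;
}
```

On anything other than a 68k build the macro only restrains the compiler; it
does not serialize the hardware, so the sketch shows where the NOP goes rather
than guaranteeing the effect everywhere.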

Architectures also provide for locks and semaphores as proper means to access
data shared by multiple tasks and/or processors.  The Intel 808x has a LOCK
prefix which can be used on most instructions.  Motorola has TAS (Test and
Set) and CAS (Compare and Swap).  Usually computer system designers will add
their own conventions such as special dual-ported memories or queueing
circuits to be used in combination with, or enhance, methods provided by
the microprocessor itself.
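
For readers who want to see the two primitives side by side, here is a
minimal sketch using C11 <stdatomic.h> as a stand-in.  atomic_flag's
test-and-set plays the part of TAS and the compare-exchange loop plays the
part of CAS; whether a given compiler maps them onto those exact instructions
is an assumption, and the names tas_lock, tas_try_acquire, tas_release and
cas_add are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* A lock built on test-and-set, in the spirit of TAS. */
typedef struct { atomic_flag held; } tas_lock;
#define TAS_LOCK_INIT { ATOMIC_FLAG_INIT }

/* Returns true if the caller got the lock (the flag was clear before). */
static bool tas_try_acquire(tas_lock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static void tas_release(tas_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* A lock-free add built on compare-and-swap, in the spirit of CAS.
   Returns the value after the addition. */
static int cas_add(atomic_int *counter, int delta)
{
    int old = atomic_load(counter);
    /* On failure, old is refreshed with the current value and we retry. */
    while (!atomic_compare_exchange_weak(counter, &old, old + delta)) {
    }
    return old + delta;
}
```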

  -- Dave Trissel  {ihnp4,seismo}!ut-sally!oakhill!davet

ed@mtxinu.UUCP (Ed Gould) (09/19/85)

>In article <532@oakhill.UUCP> oakhill!davet (Dave Trissel) writes:
>>This is very easy for the '020 programmer to solve.  The MC68020 NOP
>>instruction serializes the machine (e.g. guarantees all updates for previous
>>instructions have been done.)  Therefore, by throwing in a NOP after the
>>instruction which does the write such a loop on the '020 will always
>>properly execute.

In article <2393@ut-ngp.UTEXAS> kjm@ut-ngp.UTEXAS (Ken Montgomery) writes:
>How does one get this NOP generated in the correct places for compiled code
>that has to be portable?  The synchronizing effect is not useful unless one
>can manage to invoke it.  Even if one solves this problem, the problem of
>detecting all of the places where it is needed remains.

How does one write in a high-level language the idea that a store
might fault, that the fault is acceptable, and knowledge of the
fault status is in r0?  This sort of code *isn't* portable (which
may be an argument for not using it), and happens only in deliberate,
thereby known, places.
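
For comparison, on a hosted POSIX system (not the bare-metal setting under
discussion) the idea "this store might fault, and that is acceptable" can be
approximated in C with a signal handler and sigsetjmp/siglongjmp.  This is a
minimal sketch under that assumption; try_store, on_fault, fault_env and
readonly_page are hypothetical names, and the return value takes the place of
the fault status in r0.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* glibc: expose sigsetjmp, sigaction, MAP_ANONYMOUS */
#endif
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>

static sigjmp_buf fault_env;          /* where the handler jumps back to */

static void on_fault(int sig)
{
    (void)sig;
    siglongjmp(fault_env, 1);         /* abandon the faulting store */
}

/* Try to store val at addr.  Returns 0 if the store completed and -1 if it
   faulted.  Not thread-safe or reentrant: it uses one global jump buffer. */
static int try_store(volatile unsigned char *addr, unsigned char val)
{
    struct sigaction sa, old_segv, old_bus;
    int rc;

    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    if (sigsetjmp(fault_env, 1) == 0) {   /* 1: also save the signal mask */
        *addr = val;                      /* the store that may fault */
        rc = 0;
    } else {
        rc = -1;                          /* we got here via the handler */
    }

    sigaction(SIGSEGV, &old_segv, NULL);  /* put the old handlers back */
    sigaction(SIGBUS, &old_bus, NULL);
    return rc;
}

/* Demo helper: one page mapped read-only, so any store to it faults. */
static unsigned char *readonly_page(void)
{
    void *p = mmap(NULL, 4096, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : (unsigned char *)p;
}
```

The point stands either way: the fault-tolerant store lives in one deliberate,
clearly marked routine, and none of it is portable beyond the systems that
supply these calls.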

-- 
Ed Gould                    mt Xinu, 2910 Seventh St., Berkeley, CA  94710  USA
{ucbvax,decvax}!mtxinu!ed   +1 415 644 0146

"A man of quality is not threatened by a woman of equality."

kjm@ut-ngp.UTEXAS (Ken Montgomery) (10/01/85)

[]
>>How does one get this NOP generated in the correct places for compiled code
>>that has to be portable?  The synchronizing effect is not useful unless one
>>can manage to invoke it.  Even if one solves this problem, the problem of
>>detecting all of the places where it is needed remains.  [me]
>
>How does one write in a high-level language the idea that a store
>might fault, that the fault is acceptable, and knowledge of the
>fault status is in r0?  This sort of code *isn't* portable (which
>may be an argument for not using it), and happens only in deliberate,
>thereby known, places.  [Ed Gould]

This is a valid objection to my objection.  (I will admit that my
objection was not truly robust.)  But there is one point to consider:
operating systems have such unusual code sequences in places; I, at
least, would prefer to be able to integrate them with a minimum of pain --
ergo my question.


kjm@ut-ngp.UTEXAS (Ken Montgomery) (10/01/85)

[]
>>>This is very easy for the '020 programmer to solve.  The MC68020 NOP
>>>instruction serializes the machine (e.g. guarantees all updates for previous
>>>instructions have been done.)  Therefore, by throwing in a NOP after the
>>>instruction which does the write such a loop on the '020 will always
>>>properly execute.  [Dave Trissel]
>>
>>How does one get this NOP generated in the correct places for compiled code
>>that has to be portable?  The synchronizing effect is not useful unless one
>>can manage to invoke it.  Even if one solves this problem, the problem of
>>detecting all of the places where it is needed remains.  [Ken Montgomery]
>
>First of all, Ken's code above is not portable, and is not meant to be so.
>It will only work in a system where you are the software/hardware designer and
>have complete control over interrupt handlers and the hardware memory map
>of the machine.  [Dave Trissel]

Touche, but one would, on occasion, like to be able to do this cleanly
in an HLL.  Having to generate NOPs in odd places is grody.

>Second of all, each architecture has specific methods which must be followed
>to guarantee proper communication between different tasks or between tasks
>and interrupt exception routines.  Although the above code is useful for
>investigating pipeline subtleties it is not the proper way to communicate
>since it makes unwarranted assumptions about its environment. [...]
>[Dave Trissel]

This sounds like you're saying you have to make assumptions that you're
not allowed to make...

>In other words, there is an agreement here between the interrupt handler
>and this piece of code AND an underlying assumption about the MPU involved.
>Compilers simply don't expect this kind of side-effect and therefore have no
>reason to prepare for it.  [Dave Trissel]

Which is precisely one reason why I object to this sort of architectural
grodiness.

>If you are using an intermediate language for systems programming and do want to
>produce something that works then certainly the language itself would have
>the capability of recognizing when to use a  NOP (in the case of the '020)
>or allow you to insert an embedded assembly language statement for the same
>effect.  [...]  [Dave Trissel]

More of the same sort of grodiness.

>Architectures also provide for locks and semaphores as proper means to access
>data shared by multiple tasks and/or processors. [...]  [Dave Trissel]

What does this have to do with pipeline synchronization?


kds@intelca.UUCP (Ken Shoemaker) (10/03/85)

> >How does one write in a high-level language the idea that a store
> >might fault, that the fault is acceptable, and knowledge of the
> >fault status is in r0?  This sort of code *isn't* portable (which
> >may be an argument for not using it), and happens only in deliberate,
> >thereby known, places.  [Ed Gould]

Certainly, this isn't portable among different architectures.  However,
knowledge of register usage can be built into a compiler (actually,
it must be).  I remember some of the early versions of Berkeley VAX UNIX
used the knowledge of the way register variables were allocated by the
C compiler in mixing assembly code with compiler-generated code, so a piece
of code could be written that would mix low-level interrupts (or bus
errors) with a high-level language.  Also, for many systems, code must
be written to be non-portable, simply because the peripherals in the
processor subsystem (e.g., timers, etc.) are different.

If I were to write a piece of such code for a VAX780, I would expect it
to work the same on a VAX750.  Even such a kludgy piece of code as was written
in the original article probably would work.  It's a shame the same isn't
true between the 68010 and 68020, even if the external memory management
unit is the same.  Honestly, this is a minor thing.  My main point was
that moving from a 68* to a 68* may not be as simple as plugging in
a new component, even if everything else in the system is the same.  The
cause is not a change in the architecture but a change in the
microarchitecture.

Another assumption made in the arguments about the 68020 bus effluvia
is that the effluvia are required.  Certainly, this is true
with their current implementation, but if they had done an implementation
in a way that kept track of changes internally between bus requests and
bus acknowledges, they could set the processor to redo the instructions.
Performance in this case is a nit, since you are probably entering a very
slow operation anyway (e.g., paging), and besides, you probably would
take at least as long to dump all the crap on the external bus.  Whatever,
I have no idea whether such a scheme would even be implementable in
their transistor budget, but such a scheme would certainly be cleaner from
an external standpoint.  Note that for the most part, with this scheme
they could still pipeline out the wazoo if they so desired, so you still
get optimal performance for the normal case, i.e., no bus errors at all.
-- 
...and I'm sure it wouldn't interest anybody outside of a small circle
of friends...

Ken Shoemaker, Microprocessor Design for a large, Silicon Valley firm

{pur-ee,hplabs,amd,scgvaxd,dual,qantel}!intelca!kds
	
---the above views are personal.  They may not represent those of the
	employer of its submitter.

davet@oakhill.UUCP (Dave Trissel) (10/03/85)

In article <2442@ut-ngp.UTEXAS> kjm@ut-ngp.UTEXAS (Ken Montgomery) writes:
>[]
>>>>This is very easy for the '020 programmer to solve.  The MC68020 NOP
>>>>instruction serializes the machine (e.g. guarantees all updates for previous
>>>
>>>How does one get this NOP generated in the correct places for compiled code
>>>that has to be portable?  The synchronizing effect is not useful unless one
>>
>>First of all, Ken's code above is not portable, and is not meant to be so.
>>It will only work in a system where you are the software/hardware designer and
>>have complete control over interrupt handlers and the hardware memory map
>>of the machine.  [Dave Trissel]
>
>Touche, but one would, on occasion, like to be able to do this cleanly
>in an HLL.  Having to generate NOPs in odd places is grody.

The HLL would have a construct to do it, like the PL/I REORDER statement
prefix or the special meaning given to the PL/I null statement on the 360/91.

>>In other words, there is an agreement here between the interrupt handler
>>and this piece of code AND an underlying assumption about the MPU involved.
>>Compilers simply don't expect this kind of side-effect and therefore have no
>>reason to prepare for it.  [Dave Trissel]
>
>Which is precisely one reason why I object to this sort of architectural
>grodiness.

For a given architecture one can think up something arbitrarily grody. For
example, how would a high-level language support the return of a 32-bit
value from an exception routine in a 16-bit architecture which only has 16 bit
registers and no 32-bit load instructions?  Would you classify all 16 bit
machines as therefore grody?

>>If you are using an intermediate language for systems programming and do want to
>>produce something that works then certainly the language itself would have
>>the capability of recognizing when to use a  NOP (in the case of the '020)
>>or allow you to insert an embedded assembly language statement for the same
>>effect.  [...]  [Dave Trissel]
>
>More of the same sort of grodiness.

As I pointed out, the "problem" exists on every architecture and HLL
constructs can be made available to deal with such things.

>>Architectures also provide for locks and semaphores as proper means to access
>>data shared by multiple tasks and/or processors. [...]  [Dave Trissel]
>
>What does this have to do with pipeline synchronization?

The question was originally what is a valid context when an exception (or
interrupt) occurs.  I merely pointed out that all architectures have these
anomalies when it comes to High Level Languages. The '020 pipeline doesn't
really make matters any more difficult as long as the system programmer
fully understands the issues involved. No different than the programmer on a
16 bit machine understanding the issues involved there with 32-bit data.
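
To make the 16-bit comparison concrete, here is a minimal C sketch of one
common way a reader on a 16-bit machine might fetch a 32-bit value that an
interrupt routine can update between the two halves.  It assumes the value
behaves like an increasing counter, so a change in the high half is enough to
detect an update that landed mid-read.  The names split_u32 and
read_split_u32 are hypothetical.

```c
#include <stdint.h>

/* A 32-bit value kept as two 16-bit halves, the way a machine with only
   16-bit loads would have to access it. */
struct split_u32 {
    volatile uint16_t hi;
    volatile uint16_t lo;
};

/* Read both halves and retry if the high half changed in between. */
static uint32_t read_split_u32(const struct split_u32 *v)
{
    uint16_t hi, lo, hi_again;
    do {
        hi = v->hi;
        lo = v->lo;
        hi_again = v->hi;     /* a mismatch means an update landed mid-read */
    } while (hi != hi_again);
    return ((uint32_t)hi << 16) | lo;
}
```

Like the NOP on the '020, the retry loop is something the programmer has to
know to write; nothing in the language asks for it.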

  --  Dave Trissel   {seismo,ihnp4}!ut-sally!oakhill!davet
      Motorola Semiconductor Austin