[comp.arch] volatile

daveb@geac.UUCP (David Collier-Brown) (04/24/88)

In article <20345@pyramid.pyramid.com> eric@pyrps5 (Eric Bergan) writes:
[in response to <488@wsccs.UUCP>, from terry@wsccs.UUCP
| 	The flip side of the argument is "How far should we bend C for
| optimization purposes?" It would be great from an optimizer's viewpoint
| to be able to definitively know if a variable has no aliases. I suspect
| that this is too traumatic a change for the language, however, so
| we will have to rely on doing less optimization then might be available.
| (Have there been any studies of just how much performance improvements
| might be possible if an optimizer definitively knew what was and was
| not aliased?) 

 As a sidebar to the question, consider this fragment from
comp.arch, 
| Article 4111 of comp.arch:
| From: fnf@mcdsun.UUCP (Fred Fish)
| This is just the tip of the iceburg, there are lots of other optimizations
| that become obvious.  By examining the static and dynamic characteristics
| of the program, the data section objects can be sorted to get the most
| frequently used objects into low data memory.  The linker might also 
| decide that certain sections of the program reference portions of
| data memory more often than others, and insert the appropriate code to
| change the data mapping on the fly, rather than using a static mapping.

    It is interesting to note that there have not been, to date,
**any** other discussion of the necessity of "volatile" et all, only
of their desirability in a given language, taking their necessity as
**a given**.

    Are the two discussion groups disjoint? 

--dave (If you think that wasn't a nasty comment,
       sed <.signature 's/months/picoseconds/') c-b
-- 
 David Collier-Brown.                 {mnetor yunexus utgpu}!geac!daveb
 Geac Computers International Inc.,   |  Computer Science loses its
 350 Steelcase Road,Markham, Ontario, |  memory (if not its mind) 
 CANADA, L3R 1B3 (416) 475-0525 x3279 |  every 6 months.

mash@mips.COM (John Mashey) (04/26/88)

In article <2642@geac.UUCP> daveb@geac.UUCP (David Collier-Brown) writes:
....
>    It is interesting to note that there have not been, to date,
>**any** other discussion of the necessity of "volatile" et all, only
>of their desirability in a given language, taking their necessity as
>**a given**.

This is an assertion of non-fact.  There was such a discussion that
ran for about a week, a month or so ago, at least in comp.lang.c.
People gave examples of potential usage; some of us who have it and use
it said so; I observed that both of our kernels had about 200
instances of volatile, and we'd miss it a lot if it weren't there.
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:  	408-991-0253 or 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

daveb@geac.UUCP (David Collier-Brown) (04/29/88)

In article <2642@geac.UUCP> daveb@geac.UUCP (David Collier-Brown) writes:
|     It is interesting to note that there have not been, to date,
| **any** other discussion of the necessity of "volatile" et all, only
| of their desirability in a given language, taking their necessity as
| **a given**.

In article <2082@winchester.mips.COM> mash@winchester.UUCP (John Mashey) writes:
| This is an assertion of non-fact.  There was such a discussion that
| ran for about a week, a month or so ago, at least in comp.lang.c.
| People gave examples of potential usage; some of us who have it and use
| it said so; I observed that both of our kernels had about 200
| instances of volatile, and we'd miss it a lot if it weren't there.

 Sorry John, that's not a discussion of the necessity for such a
facility.  (Yes, I read your article when it was published).
  That your compiler-wriers believed that such was necessary **is**
germane, and I'll happily agree that, if found necessary by the
compiler-writers, one would be a little foolish to not use it (:-)).

  I reiterate: the question of the necessity of certain information
for optimization purposes is:
	1) in part architectural,
	2) in part a question of compiler technology, and
	3) open.

  Specifically:
	1) what architectures currently use asynchronously-changing
memory locations for program notification of events? DEC Vax,
various CDC boxes, MIPS(tm),...
	2) what other programmer-visible alternatives are there?
Interrupts, event queues, (Hoare) monitors... 
	3) what is the state of compiler/translator technology for
the new architectures, especially for parallel processing? Is the
"volatile" question valid, has it been dealt with, and if so, how?
VLIW, (Brinch Hansen's) Edison, Concurrent <whatever>, etc...

--dave c-b
-- 
 David Collier-Brown.                 {mnetor yunexus utgpu}!geac!daveb
 Geac Computers International Inc.,   |  Computer Science loses its
 350 Steelcase Road,Markham, Ontario, |  memory (if not its mind) 
 CANADA, L3R 1B3 (416) 475-0525 x3279 |  every 6 months.

limes@sun.uucp (Greg Limes) (04/30/88)

In article <2674@geac.UUCP> daveb@geac.UUCP (David Collier-Brown) writes:
>
>  I reiterate: the question of the necessity of certain information
>for optimization purposes is:
>	1) in part architectural,
>	2) in part a question of compiler technology, and
>	3) open.
>
>  Specifically:
>	1) what architectures currently use asynchronously-changing
>memory locations for program notification of events? DEC Vax,
>various CDC boxes, MIPS(tm),...

... and any program with a signal handler that modifies a global
variable, which is pretty inclusive if you ask me, it all comes to the
same effect from the POV of both the mainline thread and the
compiler ...

>	2) what other programmer-visible alternatives are there?
>Interrupts, event queues, (Hoare) monitors... 

... use only specific transfer variables, test these only around calls
to synchronous functions that the compiler can never know about and must
assume that they modify these locations ...
-- 
   Greg Limes [limes@sun.com]				frames to /dev/fb

brooks@lll-crg.llnl.gov (Eugene D. Brooks III) (04/30/88)

>In article <2642@geac.UUCP> daveb@geac.UUCP (David Collier-Brown) writes:
>|     It is interesting to note that there have not been, to date,
>| **any** other discussion of the necessity of "volatile" et all, only
>| of their desirability in a given language, taking their necessity as
>| **a given**.
Unfortunately, the ANSI C standard proposal did not give much attention to
multiprocessing issues, but the use of volatile in tightly coupled shared
memory programs is very important.  A shared memory variable is often
"cacheable" for a short period of time, and by this I include "cache in the
usual sense" and the use of registers as cache.  Volitile allows the user
to tell the compiler which references are not to be taken from cache, but
obviously needs further development in the context of multiprocessing.
Perhaps this will happen in the second "standards cycle" for C.  With all
the VLSI based multiprocessors hitting the market, good language support
for them will become very important.

mash@mips.COM (John Mashey) (05/01/88)

In article <2674@geac.UUCP> daveb@geac.UUCP (David Collier-Brown) writes:

>In article <2642@geac.UUCP> daveb@geac.UUCP (David Collier-Brown) writes:
>|     It is interesting to note that there have not been, to date,
>| **any** other discussion of the necessity of "volatile" et all, only
>| of their desirability in a given language, taking their necessity as
>| **a given**.

>In article <2082@winchester.mips.COM> mash@winchester.UUCP (John Mashey) writes:
>| This is an assertion of non-fact.  There was such a discussion that
>| ran for about a week, a month or so ago, at least in comp.lang.c.
>| People gave examples of potential usage; some of us who have it and use
>| it said so; I observed that both of our kernels had about 200
>| instances of volatile, and we'd miss it a lot if it weren't there.

> Sorry John, that's not a discussion of the necessity for such a
>facility.  (Yes, I read your article when it was published).
>  That your compiler-wriers believed that such was necessary **is**
>germane, and I'll happily agree that, if found necessary by the
>compiler-writers, one would be a little foolish to not use it (:-)).

It wasn't our compiler-writers, it was us (the OS group) who thought we
needed it.

>  I reiterate: the question of the necessity of certain information
>for optimization purposes is:
>	1) in part architectural,
>	2) in part a question of compiler technology, and
>	3) open.

>  Specifically:
>	1) what architectures currently use asynchronously-changing
>memory locations for program notification of events? DEC Vax,
>various CDC boxes, MIPS(tm),...
>	2) what other programmer-visible alternatives are there?
>Interrupts, event queues, (Hoare) monitors... 

As I thought I'd mentioned before, the most immediate cause of wanting
volatile has nothing to do with computer architecture in the sense of
1), or OS theory/practice in the sense of 2), but the very simple,
practical reason, that in an open systems design, using a standard
bus, and buying off-the-shelf controllers, you will discover that
many such controllers, otherwise desirable, use memory as
a communication mechanism.  You get a choice:
	a) Add volatile to C, and be able to optimize your drivers
	with minimum performance loss, thus maintaining the ability to
	use C rather than assembler without irritating performance lossage.
	You also get to port existing drivers fairly easily.
	OR
	b) Don't optimize your drivers very much.
	OR
	c) Eliminate many potential peripheral controllers, or at best,
	have to completely redo drivers you get from elsewhere.
	OR
	d) Build all your own peripheral controllers.
Of these options, most people do b), because c) and d) aren't practical
for many organizations; a) is clearly more pleasant, an much more in tune
with the philosophy in UNIX that using C shouldn't cause you a big
performance hit, and that C should be able to describe devices and
otehr interfaces not under the software architect's control.

Does that address the question: volatile matches the reality of what's
out there, not what might be possible otherwise.
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:  	408-991-0253 or 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

wcs@skep2.ATT.COM (Bill.Stewart.<ho95c>) (05/02/88)

In article <51437@sun.uucp> limes@sun.UUCP (Greg Limes) writes:
:In article <2674@geac.UUCP> daveb@geac.UUCP (David Collier-Brown) writes:
:>	1) what architectures currently use asynchronously-changing
:>memory locations for program notification of events? DEC Vax,
:>various CDC boxes, MIPS(tm),...
:... and any program with a signal handler that modifies a global
:variable, which is pretty inclusive if you ask me, .....

	Any program with a signal handler that *reads* global variables
needs the moral equivalent of volatile, if you're going to allow
optimizing compilers to avoid "redundant" writes to memory.
-- 
#				Thanks;
# Bill Stewart, AT&T Bell Labs 2G218, Holmdel NJ 1-201-949-0705 ihnp4!ho95c!wcs
# skep2 is a local machine I'm trying to turn into a server.  Please send
# mail to ho95c or ho95e instead.  Thanks.

daveb@geac.UUCP (David Collier-Brown) (05/02/88)

In article <6612@lll-winken.llnl.gov> brooks@lll-crg.llnl.gov.UUCP (Eugene D. Brooks III) writes:
>obviously needs further development in the context of multiprocessing.
>Perhaps this will happen in the second "standards cycle" for C.  With all
>the VLSI based multiprocessors hitting the market, good language support
>for them will become very important.

  Yup.
  The Sequent is probably a good example: their cache algorithms are
subtle and (seemingly) elegant

--dave c-b
-- 
 David Collier-Brown.                 {mnetor yunexus utgpu}!geac!daveb
 Geac Computers International Inc.,   |  Computer Science loses its
 350 Steelcase Road,Markham, Ontario, |  memory (if not its mind) 
 CANADA, L3R 1B3 (416) 475-0525 x3279 |  every 6 months.

daveb@geac.UUCP (David Collier-Brown) (05/02/88)

In article <2114@winchester.mips.COM> mash@winchester.UUCP (John Mashey) writes:
| As I thought I'd mentioned before, the most immediate cause of wanting
| volatile has nothing to do with computer architecture in the sense of
| 1), or OS theory/practice in the sense of 2), but the very simple,
| practical reason, that in an open systems design, using a standard
| bus, and buying off-the-shelf controllers, you will discover that
| many such controllers, otherwise desirable, use memory as
| a communication mechanism. 

   Agreed.  I was asking an architecture/theory question, in hopes
of invalidating part of the problem.  And it is a real problem:
there is nothing improper or undesirable in using a portion of the
address space for special purposes.  Adding volatile or equivalent
directives to languages is a perfectly reasonable tradeoff,
expecially if the alternative is to break off a part of the address
space for "i/o ports" as some microcomputer manufacturers do.

  Thats why I addressed it to the architects.  I know why I'd need
it as a programmer: (a).  Or I can use (b) and write a
very-machine-specific driver using lots of source-to-source
transformations.  I've written drivers in GMAP.  Never again!

 --dave c-b
-- 
 David Collier-Brown.                 {mnetor yunexus utgpu}!geac!daveb
 Geac Computers International Inc.,   |  Computer Science loses its
 350 Steelcase Road,Markham, Ontario, |  memory (if not its mind) 
 CANADA, L3R 1B3 (416) 475-0525 x3279 |  every 6 months.

johnl@ima.ISC.COM (John R. Levine) (05/04/88)

>In article <2114@winchester.mips.COM> mash@winchester.UUCP (John Mashey) writes:
> [ "volatile" makes it possible to program memory-mapped devices in C,
> without having to resort to subterfuges like turning off the optimizer
> and hoping the compiler is dumb enough.  (I hope this doesn't misrepresent
> his argument.) ]

Nobody I know has argued that something like "volatile" isn't useful.
There's no doubt that it is.  The question is whether it belongs in ANSI C,
which purports to be a standard for a language for portable programs.

The problem is that volatile means many different things to different people.
For some people it means a memory-like device register.  For others, it means
shared memory that might be modified by another process, or memory that might
be modified by an interrupt routine or by the operating system.  Some
optimists even hope that something like this:

	extern volatile foo;
	...
	foo++;

could be used for synchronization of shared memory areas.

All of these are useful, and all of them are different. That's why a single
keyword in the language doesn't suffice to describe them. You'd have many
different implementations giving different semantics to the same syntax,
hardly a recipe for portability. Use a compiler-specific extension, or use a
#pragma. (I note in passing that for many architectures, if you're addressing
device registers you also need some way to tell the compiler to use aligned
16-bit references or some such, which clearly needs to be a #pragma.)

I suppose that if the C committee were prepared to decree a standard for the
semantics of shared variables, it would then be reasonable to define a keyword
to access them. Considering the plethora of shared memory implementations, and
the equally large plethora of synchronizing schemes, I doubt any such standard
would at this point be acceptable to more than a small minority of users.
Until then, though, let's restrict the standard language to things that we
know how to make portable.

To open another can of worms, this same argument applies to noalias.
-- 
John R. Levine, IECC, PO Box 349, Cambridge MA 02238-0349, +1 617 492 3869
{ ihnp4 | decvax | cbosgd | harvard | yale }!ima!johnl, Levine@YALE.something
Rome fell, Babylon fell, Scarsdale will have its turn.  -G. B. Shaw

ray@micomvax.UUCP (Ray Dunn) (05/05/88)

In article <2674@geac.UUCP> daveb@geac.UUCP (David Collier-Brown) asks:
 [On the necessity of "volatile"]
>  Specifically:
>	1) what architectures currently use asynchronously-changing
>memory locations for program notification of events? DEC Vax,
>various CDC boxes, MIPS(tm),...

The asking of this question is a clear example of the distorted view many
people have, whose environment is restricted to DEC, SUN, CDC, IBM etc
*standard* machines.  This is not meant as a flame, only as an observation.

The answer to the question is "as many architectures as there are
proprietary designs of micro-processor based equipment which use
asynchronously changing memory locations, including *all*, for example,
which use memory mapped I/O.

As I said in an earlier posting (on reflection not very succinctly), a
compiler for a particular [micro-processor] CPU can make *no* assumptions
about the architecture that the compiled code is running on, other than [the
architecture of the microprocessor itself].

Writing a compiler for a 68000 is a different exercise than writing one for
a VAX, which is a fully defined environment.

So it's not reasonable to make these controls [e.g. "volatile"] a compiler
specific thing.

They *have* to be part of the language.

-- 
Ray Dunn.                      |   UUCP: ..!{philabs, mnetor}!micomvax!ray
Philips Electronics Ltd.       |   TEL : (514) 744-8200   Ext: 2347
600 Dr Frederik Philips Blvd   |   FAX : (514) 744-6455
St Laurent. Quebec.  H4M 2S9   |   TLX : 05-824090

johnl@ima.ISC.COM (John R. Levine) (05/11/88)

In article <1030@micomvax.UUCP> ray@micomvax.UUCP (Ray Dunn) writes:
>As I said in an earlier posting (on reflection not very succinctly), a
>compiler for a particular [micro-processor] CPU can make *no* assumptions
>about the architecture that the compiled code is running on, other than [the
>architecture of the microprocessor itself].
> ...
>So it's not reasonable to make these controls [e.g. "volatile"] a compiler
>specific thing.
>
>They *have* to be part of the language.

It seems to me that we have a confusion here between making constructs part of
the language and making them part of the standard.  Nobody has ever said that
ANSI C compilers shouldn't have extensions -- they all will, of some sort
or another.  But whether "volatile" belongs in the standard is entirely another
thing.  The standard defines a language for writing portable programs, i.e.
any standard-conforming program should do the same thing on all ANSI C
processors.  I have never seen any suggestion that a program containing
"volatile" would be portable except perhaps to other processors which happen
to have similar memory, I/O, and hardware architectures.

So, sure, go ahead and put it in your compiler where it will be useful for
all sorts of stuff.  But keep it out of the standard until you can define it
in a way that will produce the same results on all C processors.
-- 
John R. Levine, IECC, PO Box 349, Cambridge MA 02238-0349, +1 617 492 3869
{ ihnp4 | decvax | cbosgd | harvard | yale }!ima!johnl, Levine@YALE.something
Rome fell, Babylon fell, Scarsdale will have its turn.  -G. B. Shaw

aglew@urbsdc.Urbana.Gould.COM (05/12/88)

>I have never seen any suggestion that a program containing
>"volatile" would be portable except perhaps to other processors which happen
>to have similar memory, I/O, and hardware architectures.
>
>John R. Levine, IECC, PO Box 349, Cambridge MA 02238-0349, +1 617 492 3869

Then I suppose that I'd better tell my friend who is writing
a massive astronomical simulation using System V shared memory
in as portable a way as possible to give up?

woerz@iaoobelix.UUCP (Dieter Woerz) (05/15/88)

In article <1011@ima.ISC.COM> johnl@ima.UUCP (John R. Levine) writes:
> ...
>or another.  But whether "volatile" belongs in the standard is entirely another
>thing.  The standard defines a language for writing portable programs, i.e.
>any standard-conforming program should do the same thing on all ANSI C
>processors.  I have never seen any suggestion that a program containing
>"volatile" would be portable except perhaps to other processors which happen
>to have similar memory, I/O, and hardware architectures.
> ...

I don't see, what "volatile" has todo with the hardware architecture.
In my understanding, "volatile" simply says that the compiler has to
access the variable in the memory every time the variable is access-
ed. There seems to be nothing dependent on the hardware, exept that
you can't get access to I/O ports on architectures, which don't have
memory mapped I/O. So "volatile" has nothing todo with portability,
because every compiler can be made to access a variable in memory
every time it is read or writtten.

I agree, that if you depend on "volatile" for doing memory mapped
I/O, that wouldn't be too portable on an architecture with port I/O.
But that low level I/O-Software normally doesn't have to be portable.

------------------------------------------------------------------------------

Dieter Woerz
Fraunhofer Institut fuer Arbeitswirtschaft und Organisation
Abt. 453
Holzgartenstrasse 17
D-7000 Stuttgart 1
W-Germany

BITNET: iaoobel.uucp!woerz@unido.bitnet
UUCP:   ...{uunet!unido, pyramid}!iaoobel!woerz

scc@cl.cam.ac.uk (Stephen Crawley) (05/20/88)

In article <166@iaoobelix.UUCP> woerz@iaoobelix.UUCP (Dieter Woerz) writes:
>In article <1011@ima.ISC.COM> johnl@ima.UUCP (John R. Levine) writes:
>> ....  I have never seen any suggestion that a program containing
>>"volatile" would be portable except perhaps to other processors which happen
>>to have similar memory, I/O, and hardware architectures.
>> ...
>
>I don't see, what "volatile" has todo with the hardware architecture.
>In my understanding, "volatile" simply says that the compiler has to
>access the variable in the memory every time the variable is access-
>ed. There seems to be nothing dependent on the hardware, exept that
>you can't get access to I/O ports on architectures, which don't have
>memory mapped I/O. So "volatile" has nothing todo with portability,
>because every compiler can be made to access a variable in memory
>every time it is read or writtten.
>

The "volatile" construct is also necessary to write portable multi-process
applications that use shared memory.  Consider two processes A and B with
a shared variable v, and the following code

   int x = v	/* A1 */	|	v = 2	/* B1 */
   int y = v	/* A2 */	|	

Assume that v starts with the value 1, and that the statements are actually
executed in the order A1,B1,A2.  

If the C compiler generates a memory fetch for each access to v, A's 
variables x and y will contain 1 and 2 respectively.  If the compiler
does not generate a memory fetch for each access, x and y will both
contain 1.  To get behaviour that is the same for all compilers, the 
variable v must be declared volatile.

[It might be argued that if processes A and B ought were using some 
 synchronisation mechanism (e.g. a Unix semaphore call after A1) and 
 that an optimizing compiler would notice this and realise that 
 statement A2 needed to have a fetch.
 
 I claim that such an argument is invalid.  In some tightly coupled 
 multiprocessor systems, shared memory cells are the basis for ALL 
 synchronisation.  If the compiler won't let me declare volatile 
 variables, I can't implement synchronisation primitives.  Even in
 a uniprocessor UNIX system, there are situations where it is better
 to use a shared variable for synchronising 2 processes rather than 
 frittering away a couple of milli-seconds on semaphore system calls]

-- Steve

henry@utzoo.uucp (Henry Spencer) (05/23/88)

>    int x = v	/* A1 */	|	v = 2	/* B1 */
>    int y = v	/* A2 */	|	
> 
> If the C compiler generates a memory fetch for each access to v, A's 
> variables x and y will contain 1 and 2 respectively.  If the compiler
> does not generate a memory fetch for each access, x and y will both
> contain 1.  To get behaviour that is the same for all compilers, the 
> variable v must be declared volatile.

Unfortunately, this is not sufficient.  "Volatile" does not guarantee
that operations are atomic.  It is entirely possible for x and/or y to
contain trash because they caught the variable midway through the
assignment.  (Not likely to be a problem with the values 1 and 2, but
123534234 and 878787970 are a different matter.)  "Volatile" makes *no*
promises about how things will look in parallel processes; C is defined
as a sequential language.  (Please don't bring up signal handlers as an
exception unless you have read the very careful weasel-wording about them
in recent X3J11 drafts.)

With very careful cooperation from compiler and hardware, "volatile" may
permit such things to be done safely.  But that is an implementation-
dependent property, not a requirement of the language that all compilers
must satisfy.
-- 
NASA is to spaceflight as            |  Henry Spencer @ U of Toronto Zoology
the Post Office is to mail.          | {ihnp4,decvax,uunet!mnetor}!utzoo!henry

rpw3@amdcad.AMD.COM (Rob Warnock) (05/27/88)

+---------------
| >    int x = v	/* A1 */	|	v = 2	/* B1 */
| >    int y = v	/* A2 */	|	
| > If the C compiler generates a memory fetch for each access to v, A's 
| > variables x and y will contain 1 and 2 respectively...
| > ...variable v must be declared volatile.
| Unfortunately, this is not sufficient.  "Volatile" does not guarantee
| that operations are atomic.  It is entirely possible for x and/or y to
| contain trash because they caught the variable midway through the
| assignment.  (Not likely to be a problem with the values 1 and 2, but
| 123534234 and 878787970 are a different matter.)
+---------------

Note that 12345678 is likely to be a problem only on machines for which the
memory bus is not wide enough to represent 12345678 in a single "write" cycle.

There is a perfectly respectable mutual exclusion technique which can be
used on multi-processor machines, which requires no special hardware, and
requires only that writes of a small integer are atomic. (The "small integer"
has to be able to hold a processor number.) In the two-processor case, this
is called "Dekker's Algorithm".  (For large numbers of processors, it is
called "expensive"!  ;-}  ;-}  )

The "volatile" type is necessary and sufficient for correctly implementing
Dekker's Algorithm in "optimizing C".


Rob Warnock
Systems Architecture Consultant

UUCP:	  {amdcad,fortune,sun,attmail}!redwood!rpw3
ATTmail:  !rpw3
DDD:	  (415)572-2607
USPS:	  627 26th Ave, San Mateo, CA  94403

johnl@ima.ISC.COM (John R. Levine) (05/29/88)

In article <21821@amdcad.AMD.COM> rpw3@amdcad.UUCP (Rob Warnock) writes:
>There is a perfectly respectable mutual exclusion technique which can be
>used on multi-processor machines, which requires no special hardware, and
>requires only that writes of a small integer are atomic. (The "small integer"
>has to be able to hold a processor number.) In the two-processor case, this
>is called "Dekker's Algorithm".  (For large numbers of processors, it is
>called "expensive"!  ;-}  ;-}  )
>
>The "volatile" type is necessary and sufficient for correctly implementing
>Dekker's Algorithm in "optimizing C".

Not so fast, there. In large multi-CPU systems, there is often a write-behind
cache and writes by one processor are not always visible to other processors
until you do some sort of cache-flush operation. (For example, on 370
architecture machines the cache is only flushed on process switch, start I/O,
and a very slow flush no-op.)

I can imagine that you might follow each write to a volatile variable with a
flush, but that would cause a huge and unnecessary performance penalty in the
common case that volatile is used to share data among asynchronous events on a
single processor, such as routines called via signal(), so compiler writers
would justifiably be reluctant to put flushes all over their object code. Then
you get compiler switches to explain to the compiler what you meant by
"volatile" or #pragma lines to say this is a single-cpu volatile and that is a
multi-cpu volatile. Perhaps this suggests that it's premature to standardize
the semantics of volatile, or even to include it in the standard language, but
that is beating a dead and increasingly smelly horse.
-- 
John R. Levine, IECC, PO Box 349, Cambridge MA 02238-0349, +1 617 492 3869
{ ihnp4 | decvax | cbosgd | harvard | yale }!ima!johnl, Levine@YALE.something
Rome fell, Babylon fell, Scarsdale will have its turn.  -G. B. Shaw

ray@micomvax.UUCP (Ray Dunn) (05/30/88)

In article <1988May23.003847.1114@utzoo.uucp> (Henry Spencer) writes:
 [quoting an article without attributing it]
>> ....  To get behaviour that is the same for all compilers, the 
>> variable v must be declared volatile.
>
>Unfortunately, this is not sufficient.  "Volatile" does not guarantee
>that operations are atomic.  It is entirely possible for x and/or y to
>contain trash because they caught the variable midway through the
>assignment.

Fortunately this is sufficient when the programmer understands what he is
programming, and chooses data types etc which will ensure atomicity, if that
is what he is trying to achieve.

Even in assembler, problems will result if a non-atomic instruction is used
in a semaphore mechanism.

All "volatile" gives in this shared memory context is a basic low level tool
that a careful programmer can use to sufficiently handle this class of
problems.  All the compiler need be aware of is that a reference to a
volatile variable must always generate a memory reference.

-- 
Ray Dunn.                      |   UUCP: ..!{philabs, mnetor}!micomvax!ray
Philips Electronics Ltd.       |   TEL : (514) 744-8200   Ext: 2347
600 Dr Frederik Philips Blvd   |   FAX : (514) 744-6455
St Laurent. Quebec.  H4M 2S9   |   TLX : 05-824090

scc@cl.cam.ac.uk (Stephen Crawley) (05/30/88)

In article <21821@amdcad.AMD.COM> rpw3@amdcad.UUCP (Rob Warnock) writes:
>There is a perfectly respectable mutual exclusion technique which can be
>used on multi-processor machines, which requires no special hardware, and
>requires only that writes of a small integer are atomic. (The "small integer"
>has to be able to hold a processor number.) In the two-processor case, this
>is called "Dekker's Algorithm".  (For large numbers of processors, it is
>called "expensive"!  ;-}  ;-}  )
>
>Rob Warnock
>Systems Architecture Consultant

A more recent solution to the mutual exclusion is NOT expensive for multiple
processors:

  "A Fast Mutual Exclusion Algortithm"
  Leslie Lamport, Nov 14 1985.
  Report #5, DEC Systems Research Centre

The review at the beginning of the report (by Butler Lampson) says it 
better than I can:

"   To build a useful computer system from a collection of processors that
communicate by sharing memory, but lack any atomic operation more complex
than a memory read or write, it is necessary to implement mutual exclusion
by using only these operations.  Solutions to this problem have been known
for twenty years, but they are linear in the number of processors.  Lamport
presents a new algorithm which takes constant time (five writes and two
reads) in the absence of contention, which is the normal case.  To achieve
this performance it sacrifices fairness [*], which is probably unimportant in
practical applications.
    The paper gives an informal argument that the algorithm's performance
in the absence of contention is optimal [!!], and a fairly formal proof
of safety and freedom from deadlock, using a slightly modified Owicki-Gries 
method.  The proofs are extremely clear, and use very little notation."

[All typo's in the above are mine!]

[* The body of the report notes that there is a variation of the algorithm
   that is starvation free, at the cost of one extra memory reference in
   the absence of contention.]

In case you want to rush out and buy a copy, the address for DEC SRC is
given as 

    Digital Systems Research Center
    130 Lytton Avenue
    Palo Alto, California 94301


-- Steve

brooks@lll-crg.llnl.gov (Eugene D. Brooks III) (06/02/88)

In article <206@gannet.cl.cam.ac.uk> scc@cl.cam.ac.uk (Stephen Crawley) writes:
>A more recent solution to the mutual exclusion is NOT expensive for multiple
>processors:
>
>  "A Fast Mutual Exclusion Algortithm"
>  Leslie Lamport, Nov 14 1985.
>  Report #5, DEC Systems Research Centre
>
I have coded Lamports algorithm on the Sequent Balance and it actually
beat the performance of their ALM hardware that lives on the Multibus.
Lamports algorithm is a good practical algorithm for machines that
done have test_and_set support which runs at the same speed as main
memory.  If a collision for the lock occurs the algorithm is order
Nprocessors in time, which is why you do want test and set hardware
support.  Lamport's paper is very recommended reading for those interested
in shared memory multiprocessing.

nevin1@ihlpf.ATT.COM (00704a-Liber) (06/02/88)

In article <1078@micomvax.UUCP> ray@micomvax.UUCP (Ray Dunn) writes:
|In article <1988May23.003847.1114@utzoo.uucp> (Henry Spencer) writes:

|>Unfortunately, this is not sufficient.  "Volatile" does not guarantee
|>that operations are atomic.  It is entirely possible for x and/or y to
|>contain trash because they caught the variable midway through the
|>assignment.

|Fortunately this is sufficient when the programmer understands what he is
|programming, and chooses data types etc which will ensure atomicity, if that
|is what he is trying to achieve.

C itself does not guarantee that access to any particular data type,
including char, is atomic.  My question is:  is there *any* use for 'volatile'
which does not require 'atomicity' at some level?  If not, then 'volatile'
doesn't really fix any of the problems we have without it.

-- 
 _ __			NEVIN J. LIBER	..!ihnp4!ihlpf!nevin1	(312) 510-6194
' )  )				You are in a little twisting maze of
 /  / _ , __o  ____		 email paths, all different.
/  (_</_\/ <__/ / <_	These are solely MY opinions, not AT&T's, blah blah blah

brooks@maddog.llnl.gov (Eugene D. Brooks III) (06/02/88)

>>  "A Fast Mutual Exclusion Algortithm"
>>  Leslie Lamport, Nov 14 1985.
>>  Report #5, DEC Systems Research Centre
>>
Having been pestered by several people to post the code for Lamport's
algorithm to the net so people don't have wait for his paper, which you
really ought to read if you want to discuss this item, here is an expression
of the algorithm in C fragments.  The "shared" is meant to indicate the
status of the storage of the memory cells.  I hopefully have not blown it,
please don't start up your flame machines if I did.  READ LESLIE'S PAPER!



#define TRUE		1
#define FALSE		0	/* Zero so b[] starts out FALSE. */
#define PROCUNDEF	-1
#define MAXPROC		8	/* How many processors you have. */

shared int x = PROCUNDEF;
shared int y = PROCUNDEF;
shared int b[MAXPROC];

/* Code fragment to get locked access to critical region.
PROCNUM is a unique identifier for the cpu executing the code.
	*/
	tryagain :
	b[PROCNUM] = TRUE;
	x = PROCNUM;
	if(y != PROCUNDEF) {
		b[PROCNUM] = FALSE;
		while(y != PROCUNDEF);
		goto tryagain;
	}
	y = PROCNUM;
	if(x != PROCNUM) {
		b[PROCNUM] = FALSE;
		for(j = 0; j < MAXPROC; j += 1) {
			while(b[j] != FALSE);
		}
		if(y != PROCNUM) {
			while(y != PROCUNDEF);
			goto tryagain;
		}
	}
/* Code fragment to unlock critical region.
	*/
	y = PROCUNDEF;
	b[PROCNUM] = FALSE;



brooks@maddog.llnl.gov, brooks@maddog.UUCP