[comp.theory] Procedures with multiple out parameters: why not?

vidynath@function.mps.ohio-state.edu (Vidhyarath K Rao) (08/16/90)

[Line eaters are extinct, aren't they?]

I hope that this is the right newsgroup for this article.

I was reading E. Dijkstra, "The humble programmer", Comm ACM 15(1972), 
pp. 859--866, as reprinted in "Classics in Software Engineering" edited by
E. N. Yourdon. I came across the following [para 2 on p. 121 in the reprint]:
	A number of rules have been discovered, voilation of which will
	either seriously impair or totally destory the intellectual
	managebility of the program. These are of two kinds. Those of the first
	kind ... Examples are the exclusion of the goto-statements and of
	procedures with more than one output parameter.
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

While there is a lot of well-known [i.e. I have heard of them :-^] articles
on the goto statement, I have never seen heard of the theoreticians' case
against having more than one out parameter. None of the reasons I could
think of relates to theory. Can somebody out there give some pointers?

Let me list the reasons I came up with for not having it in an existing
compiler:
	1. It is hard to implement: I do not doubt this: I have heard of
functions returning records/structures being implemented in an non-reentrant
way. But that does not explain why it shold not be used in psuedo-code or in
an ideal language. [In fact R. E. Tarjan uses them in Data Structures and
Network Algorithms.]
	2. It does not increase efficiency over var parameters in accumulator
based machines or in machines with few general purpose registers: This is a
practical issue. In any case [using Tarjan's terminology] parameters being
modified shold have been initialized, where as parameters being returned
need not be. This is useful in lint and its brethern.
	3. It does not fit in ``typed Omega calculus'' [I don't know if
I can say Lamda calculus here. As far as I know, there is no such thing as
Omega calculus.]: Why not get rid of Omega calculus? Is any stronger theory
inconsistent?

  Two further remarks: Firstly, I do not question the fact that having a
single return parameter of pre-specified type allows the use of functions
in expressions. But why prohibit more than one at all times?
  Second point, in regard to 3 above: In some otherwise forgettable article
in an old issue of BYTE, I saw something to this effect: "Functions can
return only a simple value because functions in mathemtatics are required
to have a unique result". I immediately performed a gedanken experiment:
Go around telling mathemtaicians that they cannot say [in TeX mode]
	You cannot say things like "Now we will construct a function $f$
	from $X$ to $Y \times Z$."
At that time I simple chalked it upto the writer's ignorance. But reading
the cited article of Dijkstra brought the question again.

Eagerly awaiting enlightenment
--
Vidhyanth Rao			It is the man, not the method, that solves
function.mps.ohio-state.edu	the problem. - Henri Poincare
    (614)-366-9341		[as paraphrased by E. T. Bell]

dlawson@grebyn.com (Drew Lawson) (08/16/90)

In article <1990Aug15.231838.5664@zaphod.mps.ohio-state.edu> vidynath@function.mps.ohio-state.edu (Vidhyarath K Rao) writes:
>
>
>While there is a lot of well-known [i.e. I have heard of them :-^] articles
>on the goto statement, I have never seen heard of the theoreticians' case
>against having more than one out parameter. None of the reasons I could
>think of relates to theory. Can somebody out there give some pointers?
>

I don't have any references to this, but the attitude I have beed
exposed to which agrees with it is that procedures/functions should
not both return a value and alter parameters.  The problems can be similar
to the goto problems: it is less obvious what the flow is (in this case
the data flow rather than the control flow).

I don't cling too tightly to this rule.  Being a C programmer, I make a
lot of use of procedures with variable parameters which return a logical
value indicating whether the operation was completed correctly.

I don't know of any strictly theoretical backing for Dijkstra's assertion,
just pragmatic issues.  But then, a large part of the problems with gotos
is pragmatics.

-- 
+--------------------------------------------------------------------+
| Is life an illusion?                          | Drew Lawson        |
| Or does it just seem that way?                | dlawson@grebyn.com |
+--------------------------------------------------------------------+

dsr@abraxas.cs.Virginia.EDU (Dana Richards) (08/17/90)

In article <21458@grebyn.com> dlawson@grebyn.UUCP (Drew Lawson) writes:
>In article <1990Aug15.231838.5664@zaphod.mps.ohio-state.edu> vidynath@function.mps.ohio-state.edu (Vidhyarath K Rao) writes:
<<
<<
<<While there is a lot of well-known [i.e. I have heard of them :-^] articles
<<on the goto statement, I have never seen heard of the theoreticians' case
<<against having more than one out parameter. None of the reasons I could
<<think of relates to theory. Can somebody out there give some pointers?
<<
<
<I don't have any references to this, but the attitude I have beed
<exposed to which agrees with it is that procedures/functions should
<not both return a value and alter parameters.  The problems can be similar
<to the goto problems: it is less obvious what the flow is (in this case
<the data flow rather than the control flow).

on  a related note:  why does sedgewick's book not have any VAR parameters?
(he does have global variables of course).

dana richards

langley@ds1.scri.fsu.edu (Randolph Langley) (08/17/90)

I hope there is nothing wrong with this, since my master's project involves
exactly this! I am doing a graph-structured programming language, and you
can view each node as a function with any number of inputs and outputs.

rdl

mark@parc.xerox.com (Mark Weiser) (08/17/90)

Perhaps it depends on what counts as more than one out parameter.

Many languages permit the return of record structures, and the selection
of components as part of the call.  This could be argued as a form of
multiple out parameters.  (e.g.:
	PROC foo [] RETURNS RECORD [i,j: INT];
	...
	x = foo[].i;

If the issue is something like value vs. var parameters, then the restriction
could be restated as saying that it is bad to have ANY out parameters (i.e.
any var parameters)--only return of a (possibly complex value) should be
permitted.  This seems to be in agreement with formal practice also.  I would
guess it is in line with Dijkstra's intent.

-mark
--
Spoken: Mark Weiser 	ARPA:	weiser@xerox.com	Phone: +1-415-494-4406

cik@l.cc.purdue.edu (Herman Rubin) (08/18/90)

In article <516@roo.UUCP>, mark@parc.xerox.com (Mark Weiser) writes:
 
> Perhaps it depends on what counts as more than one out parameter.
 
> Many languages permit the return of record structures, and the selection
> of components as part of the call.  This could be argued as a form of
> multiple out parameters.  (e.g.:
> 	PROC foo [] RETURNS RECORD [i,j: INT];
 	...
> 	x = foo[].i;
 
> If the issue is something like value vs. var parameters, then the restriction
> could be restated as saying that it is bad to have ANY out parameters (i.e.
> any var parameters)--only return of a (possibly complex value) should be
> permitted.  This seems to be in agreement with formal practice also.  I would
> guess it is in line with Dijkstra's intent.

This is far too restrictive.  The only present HLL I know of which has full
multiple results for an operation is LISP.  That the others do not was, in
my opinion, massive oversight.  Outputting a record would not be adequate,
as the items output may have drastically different types.

If one does not worry about storage and access problems, there is no problem.
The output variables can be explicitly listed as arguments of the procedure
which returns VOID.  This has been used since antiquity/
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)	{purdue,pur-ee}!l.cc!cik(UUCP)

milgr@teapot.prime.COM (Marc Milgram) (08/18/90)

As an implementation issue, many traditional programming languages
implimentations
returning a value by storing it in a register, and executing a return
instruction (or
an indirect jump ...).  Languages implimented as such are limited to returning
simple values as complex structures would not fit in the user register set.

More recent language implimentations use other methods for returning
structures.


Data flow analysis becomes significantly harder when functions can modify their
arguments -- also modifying a variable in another function.



Marc Milgram
milgr@teapot.prime.com

mark@parc.xerox.com (Mark Weiser) (08/19/90)

In article <2442@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:

>In article <516@roo.UUCP>, mark@parc.xerox.com (Mark Weiser) writes:
> 
>> ...
>> If the issue is something like value vs. var parameters, the restriction
>> could be restated as saying that it is bad to have ANY out parameters (i.e.
>> any var parameters)--only return of a (possibly complex value) should be
>> permitted.  This seems to be in agreement with formal practice also.  I 
>> would guess it is in line with Dijkstra's intent.

>This is far too restrictive.  The only present HLL I know of which has full
>multiple results for an operation is LISP.  That the others do not was, in
>my opinion, massive oversight.  Outputting a record would not be adequate,
>as the items output may have drastically different types.

I don't quite understnd this.  LISP is in fact one of the languages in
which one hardly every uses reasonable return values, in the sense of
typed records.  Modula-(2 and 3), Cedar, Mesa, lots of others have
record returns.  I truly don't understand the comment about a record
being inadequate because of the drastically different types--a record
is that data structure one uses when one has drastically different
types.  C requires that to return a record one must separately define
it.  Cedar (e.g.) has the convention that the procedure
declaration defines an implicit record type to be the return value
(see my original posting, which was sort of pseudo Cedar).

>If one does not worry about storage and access problems, there is no problem.
>The output variables can be explicitly listed as arguments of the procedure
>which returns VOID.  This has been used since antiquity.

I believe that the advice being given is that this is a practice to be
avoided.  Reading the code at the point of the call there is no
syntactic evidence about which parameters are which.  When huge
semantic differences are hidden by identical syntax, that is a bad
thing.

-mark
--
Spoken: Mark Weiser 	ARPA:	weiser@xerox.com	Phone: +1-415-494-4406

ianh@batserver.cs.uq.oz.au (Ian Hayes) (08/24/90)

vidynath@function.mps.ohio-state.edu (Vidhyarath K Rao) writes:


>I hope that this is the right newsgroup for this article.

>I was reading E. Dijkstra, "The humble programmer", Comm ACM 15(1972), 
>pp. 859--866, as reprinted in "Classics in Software Engineering" edited by
>E. N. Yourdon. I came across the following [para 2 on p. 121 in the reprint]:
>	A number of rules have been discovered, voilation of which will
>	either seriously impair or totally destory the intellectual
>	managebility of the program. These are of two kinds. Those of the first
>	kind ... Examples are the exclusion of the goto-statements and of
>	procedures with more than one output parameter.
>	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I suspect the theoretical problem that Dijkstra is referring to is
aliasing of variables. For example, consider
^^^^^^^^^^^^^^^^^^^^^
	proc P(var x, y : integer);
	begin
		x := 1; y := 2
	end
At the end of this procedure one can assert { x = 1 and y = 2 }.
Most people would consider this procedure equivalent to the
following procedure Q.
	proc Q(var x, y : integer);
	begin
		y := 2; x := 1
	end
One can also assert { x = 1 and y = 2} after this.

Now consider the calls P(z,z).
Using the usual substitution rules on assertion one can deduce
	{ z = 1 and z = 2 }
is true. But it is false. The problem comes from having two names
(aliases) x and y within the procedure P for the one variable z.

If var parameters are implemented by call by reference then P(z,z)
will leave z with the value 2 and Q(z,z) will leave z with the value
1. The two procedures which are intuitively equivalent are not
equivalent in this case. 

Another more practical example is the following matrix
multiplication procedure
	procedure MatrixMultiply(var A, B, C : matrix);
	begin
		what you would expect
		{ C = A * B }
	end
Here the A and B are passed as var parameters even though they are
not being modified by the matrix multiplication procedure. However,
if we have a call of the form
	MatrixMultiply(M,M,M)
we will not get the result that we want as A, B and C will all be
aliases for the same matrix M. To avoid this problem we need the
header to read
	procedure MatrixMultiply( A, B : matrix; var C : matrix)
this avoids any aliasing problems and will work correctly for the
call
	MatrixMultiply(M,M,M).
If you still don't follow try it and see what happens with the two
different headers.

This does not rule out a procedure that returns a composite object
such as a record or Cartesian product. Such an object is still a
single variable and doesn't suffer from aliasing.


--
Ian Hayes				ACSNet: ianh@uqcspe.cs.uq.oz.au
Mail: Department of Computer Science, University of Queensland,
      St. Lucia, Queensland  4067, Australia.
Work: +61 (7) 377 4131, Room 618.	FAX:  +61 (7) 371 0783.

ianh@batserver.cs.uq.oz.au (Ian Hayes) (08/24/90)

vidynath@function.mps.ohio-state.edu (Vidhyarath K Rao) writes:


>I hope that this is the right newsgroup for this article.

>I was reading E. Dijkstra, "The humble programmer", Comm ACM 15(1972), 
>pp. 859--866, as reprinted in "Classics in Software Engineering" edited by
>E. N. Yourdon. I came across the following [para 2 on p. 121 in the reprint]:
>	A number of rules have been discovered, voilation of which will
>	either seriously impair or totally destory the intellectual
>	managebility of the program. These are of two kinds. Those of the first
>	kind ... Examples are the exclusion of the goto-statements and of
>	procedures with more than one output parameter.
>	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I suspect the theoretical problem that Dijkstra is referring to is
aliasing of variables. For example, consider
^^^^^^^^^^^^^^^^^^^^^
	proc P(var x, y : integer);
	begin
		x := 1; y := 2
	end
At the end of this procedure one can assert { x = 1 and y = 2 }.
Most people would consider this procedure equivalent to the
following procedure Q.
	proc Q(var x, y : integer);
	begin
		y := 2; x := 1
	end
One can also assert { x = 1 and y = 2} after this.

Now consider the calls P(z,z).
Using the usual substitution rules on assertion one can deduce
	{ z = 1 and z = 2 }
is true. But it is false. The problem comes from having two names
(aliases) x and y within the procedure P for the one variable z.

If var parameters are implemented by call by reference then P(z,z)
will leave z with the value 2 and Q(z,z) will leave z with the value
1. The two procedures which are intuitively equivalent are not
equivalent in this case. 

Another more practical example is the following matrix
multiplication procedure
	procedure MatrixMultiply(var A, B, C : matrix);
	begin
		what you would expect
		{ C = A * B }
	end
Here the A and B are passed as var parameters even though they are
not being modified by the matrix multiplication procedure. However,
if we have a call of the form
	MatrixMultiply(M,M,M)
we will not get the result that we want as A, B and C will all be
aliases for the same matrix M. To avoid this problem we need the
header to read
	procedure MatrixMultiply( A, B : matrix; var C : matrix)
this avoids any aliasing problems and will work correctly for the
call
	MatrixMultiply(M,M,M).
If you still don't follow try it and see what happens with the two
different headers.

This does not rule out a procedure that returns a composite object
such as a record or Cartesian product. Such an object is still a
single variable and doesn't suffer from aliasing.

--
Ian Hayes				ACSNet: ianh@uqcspe.cs.uq.oz.au
Mail: Department of Computer Science, University of Queensland,
      St. Lucia, Queensland  4067, Australia.
Timezone: GMT-10 hours +- 1 hour depending who is on summertime.

eerke@cs.kun.nl (Eerke Boiten) (08/27/90)

In article <4625@uqcspe.cs.uq.oz.au> ianh@batserver.cs.uq.oz.au writes:
[]I suspect the theoretical problem that Dijkstra is referring to is
[]aliasing of variables. For example, consider
[example deleted] 
[]Another more practical example is the following matrix
[]multiplication procedure
[]	procedure MatrixMultiply(var A, B, C : matrix);
[]Here the A and B are passed as var parameters even though they are
[]not being modified by the matrix multiplication procedure. However,
[]if we have a call of the form
[]	MatrixMultiply(M,M,M)
[]we will not get the result that we want as A, B and C will all be
[]aliases for the same matrix M. To avoid this problem we need the
[]header to read
[]	procedure MatrixMultiply( A, B : matrix; var C : matrix)
[]this avoids any aliasing problems and will work correctly for the
[]call
[]	MatrixMultiply(M,M,M).

What I'm going to say is pretty obvious if you're a regular Pascal
programmer, but since this is comp.theory ...

In answer to the question that arises: why would anyone declare *all*
arguments as *var* arguments?
most Pascal implementations put *adresses* of var-arguments on the
stack, whereas *values* of non-var arguments are stacked when a
procedure/function call occurs. Thus, when one has *matrix* (read:
large) arguments, one usually makes them var-parameters.
Note that thus Ian Hayes' example shows that no everywhere-correct &
efficient Pascal matrix multiplication procedure can exist. (Or am I
overlooking something? If I did, we might continue on comp.lang.misc)

Eerke Boiten
Department of Informatics (STOP Project), K.U.Nijmegen
Toernooiveld, 6525 AD Nijmegen, The Netherlands
Tel. +31-80-612236.	Email: eerke@cs.kun.nl