[comp.lang.misc] From Modula to Oberon

crowl@cs.rochester.edu (Lawrence Crowl) (02/27/88)

Oberon is a new language designed by Niklaus Wirth.  It is a refinement of
Modula-2.  The remainder of this article is the LaTeX form of an article
which appeared in ASCII form on BIX.  Enjoy.
--------------------------------------------------------------------------
\documentstyle[11pt]{article}
%
% margin settings
%
\oddsidemargin 0.5in       %   Left margin on odd-numbered pages.
\evensidemargin 0in        %   Left margin on even-numbered pages.
\textwidth 6in
\textheight 8.5in
\topmargin 0.25in
\headheight 0pt
\headsep 0pt
\footheight 12pt
\footskip 30pt
%
% paragraph settings
%
\parskip 4pt plus 1.5pt minus 1.5pt
\clubpenalty 500
\widowpenalty 500
\displaywidowpenalty=500

\begin{document}

\title{From Modula to Oberon}
\author{N. Wirth}
\date{Tuesday 23 February 1988}
\maketitle

\begin{abstract}
The programming language Oberon is the result of a
concentrated effort to increase the power of Modula-2 and
simultaneously to reduce its  complexity. Several features
were eliminated, and a few were added in  order to increase
the expressive power and flexibility of the language.  This
paper describes and motivates the changes. The language is 
defined in a concise report.

\vspace{0.25in}
\noindent
BIX oberon/long.messages \#11,
from frode, 21774 chars, Tue Feb 23 20:43:46 1988.
\break
TITLE: Oberon Language Article, (c) ETH-ZENTRUM, SWITZERLAND
\end{abstract}

\section*{Introduction}

The programming language Oberon evolved from a project whose
goal was the  design of a modern, flexible and efficient
operating system for a single-user workstation. A
principal guideline was to concentrate on properties  that
are genuinely essential and --- as a consequence --- to omit
ephemeral  issues. This is the best way to keep a system in
hand, to make it understandable, explicable, reliable and
efficiently implementable.

Initially, it was planned to express the system in Modula-2
[1], as that  language supports the notion of modular design
quite effectively, and  because an operating system has to
be designed in terms of separately  compilable parts with
conscientiously chosen interfaces. In fact, an  operating
system should be no more than a set of basic modules, and
the  design of an application must be considered as a
goal-oriented extension of  that basic set: Programming is
always {\em extending} a given system.

Whereas modern languages, such as Modula, support the notion
of extensibility  in the procedural realm, the notion is
less well established in the domain of  data types. Modula
in particular does not allow the definition of new data 
types as extensions of other, programmer-defined types in an
adequate manner.  An additional feature was called for,
thereby giving rise to an extension of  Modula.

The concept of the planned operating system also called for
a highly dynamic,  centralized storage management relying on
the technique of garbage collection.  Although Modula does
not prevent the incorporation of a garbage collector in 
principle, its variant record feature constitutes a genuine
obstacle. As the  new facility for extending types would
make the variant record feature  superfluous, the removal of
this stumbling block was a logical decision.  This step,
however, gave rise to a restriction (subset) of Modula.

It soon became clear that the rule to concentrate on the
essential and to eliminate the inessential should not only be
applied to the design of the new  system, but equally
stringently to the language in which the system is 
formulated. The application of the principle thus led from
Modula to a new  language. The adjective new, however, has
to be understood in proper context:  Oberon evolved from
Modula by very few additions and several subtractions.  In
relying on evolution rather than revolution we remain in the
tradition of  a long development that led from Algol to
Pascal, then to Modula-2, and  eventually to Oberon. The
common trait of these languages is their procedural rather
than functional model, and the strict typing of data. More 
fundamental even is perhaps the idea of abstraction: the
language must be  defined in terms of mathematical, abstract
concepts without reference to any computing mechanism. Only
if a language satisfies this criterion can it be  called
``higher-level''. No syntactic coating whatsoever can earn a
language  this attribute alone.

The definition of a language must be coherent and concise.
This can only be  achieved by a careful choice of the
underlying abstractions and an appropriate structure combining
them. The language manual must be reasonably short, 
avoiding the explanation of individual cases derivable from
the general  rules. The power of a formalism must not be
measured by the length of its  description. To the contrary,
an overly lengthy definition is a sure symptom  of
inadequacy. In this respect, not complexity but simplicity
must be the  goal.

In spite of its brevity, a description must be complete.
Completeness is to be  achieved within the framework of the
chosen abstractions. Limitations imposed  by particular
implementations do not belong to a language definition
proper.  Examples of such restrictions are the maximum
values of numbers, rounding and  truncation errors in
arithmetic, and actions taken when a program violates  the
stated rules. It should not be necessary to supplement a
language definition with a voluminous standards document to
cover ``unforeseen'' cases.

But neither should a programming language be a mathematical
theory only. It must be a practical tool. This imposes
certain limits on the terseness of the  formalism. Several
features of Oberon are superfluous from a purely theoretical 
point of view. They are nevertheless retained for practical
reasons, either  for programmers' convenience  or to allow
for efficient code generation  without the necessity of
complex, ``optimizing'' pattern matching algorithms  in
compilers. Examples of such features are the presence of
several forms of  repetitive statements, and of standard
procedures such as INC, DEC, and ODD.  They complicate
neither the language conceptually nor the compiler to any 
significant degree.
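
For illustration (the variables i, n, and k are assumed here to be
of an integer type), such statements may simply be written as

\begin{verbatim}
	INC(i);  DEC(n, 2);
	IF ODD(k) THEN k := 3*k + 1 ELSE k := k DIV 2 END
\end{verbatim}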

These underlying premises must be kept in mind when
comparing Oberon with other languages.
Neither the language
nor its defining document reaches the ideal; but Oberon
approximates these goals much better than its predecessors.

A compiler for Oberon has been implemented for the NS32000
processor family and  is embedded in the Oberon operating
environment. The following data provide an  estimate of the
simplicity and efficiency of the implementation, and readers 
are encouraged to compare them with implementations of other
languages.  (Measurements with 10 MHz NS32032).

\begin{center}
\begin{tabular}{lrrrr}
\hline
&\multicolumn{2}{c}{length of}
&\multicolumn{1}{c}{length of}
&\multicolumn{1}{c}{time of self}\\
&\multicolumn{2}{c}{source program}
&\multicolumn{1}{c}{compiled code}
&\multicolumn{1}{c}{compilation}\\
\hline
&lines &characters &bytes &seconds\\
\hline
Parser&1116&3671&99928&11.53\\
Scanner&346&9863&3388&3.80\\
Import/Export&514&18386&4668&5.25\\
Code generator&19636&5901&21636&21.02\\
Total&3939&130869&39620&41.60\\
\end{tabular}
\end{center}

Subsequently, we present a brief introduction to Oberon
assuming familiarity  with Modula (or Pascal), concentrating
on the added features and listing the  eliminated ones. In
order to be able to ``start with a clean table'', the  latter
are taken first.

\section*{Features omitted from Modula}

\subsection*{Data types}

Variant records are eliminated, because they constitute a
genuine difficulty  for the implementation of a reliable
storage management system based on  automatic garbage
collection. The functionality of variant records is 
preserved by the introduction of extensible data types.

Opaque types cater to the concept of the abstract data type
and information  hiding. They are eliminated because again
the concept is covered by the new  facility of extended
record types.

Enumeration types appear to be a simple enough feature to be
uncontroversial.  However, they defy extensibility over
module boundaries. Either a facility to  extend enumeration
types would have to be introduced, or they would have to be 
dropped. A reason in favour of the latter, radical solution
was the observation  that in a growing number of programs
the indiscriminate use of enumerations  had led to a pompous
style that contributed not to program clarity, but rather 
to verbosity. In connection with import and export,
enumerations gave rise to  the exceptional rule that import
of a type identifier also causes the  (automatic) import of
all associated constant identifiers. This exceptional  rule
defies conceptual simplicity and causes unpleasant problems
for the  implementor.

Subrange types were introduced in Pascal (and adopted in
Modula) for two  reasons: (1) to indicate that a variable
accepts a limited range of values of  the base type and
allow a compiler to generate appropriate guards for 
assignments, and (2) to allow a compiler to allocate the
minimal storage space  needed to store values of the
indicated subrange. This appeared desirable in  connection
with packed records. Very few implementations have taken
advantage  of this space-saving facility, because the additional
compiler complexity is very considerable. Reason 1 alone,
however, did not appear to provide sufficient  justification
to retain the subrange facility in Oberon.

With the absence of enumeration and subrange types, the
general possibility to  define set types based on given
element types appeared as redundant. Instead,  a single,
basic type SET is introduced, whose values are sets of
integers from  0 to an implementation-defined maximum.

The basic type CARDINAL had been introduced in Modula-2 in
order to allow  address arithmetic with values from 0 to
$2^{16}$ on 16-bit computers. With the  prevalence of 32-bit
addresses in modern processors, the need for unsigned 
arithmetic has practically vanished, and therefore the type
CARDINAL has been  eliminated. With it, the bothersome
incompatibilities of operands of types  CARDINAL and INTEGER
have disappeared.

The notion of a definable index type of arrays has also been
abandoned: All indices are by default integers.
Furthermore, the lower bound is fixed to 0;  array
declarations specify a number of elements (length) rather
than a pair of  bounds. This break with a long standing
tradition since Algol 60 demonstrates  the principle of
eliminating the inessential most clearly. The specification
of  an arbitrary lower bound provides no expressive power at
all, but it introduces  a non-negligible amount of hidden,
computational effort. (Only in the case of  static
declarations can it be delegated to the compiler).
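
As a simple illustration, a declaration of the form

\begin{verbatim}
	VAR a: ARRAY 100 OF INTEGER
\end{verbatim}

introduces an array whose elements are a[0] to a[99]; only the
length is specified, not a pair of bounds.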

\subsection*{Modules and import/export rules}

Experience with Modula over the last eight years has shown
that local modules  were rarely used. The additional
complexity of the compiler required to handle  them, and the
additional complications in the visibility rules of the
language  definition appear not to justify local modules.

The qualification of an imported object's identifier x by
the exporting  module's name M, viz. M.x can be circumvented
in Modula by the use of the  import clause FROM M IMPORT x.
This facility has also been discarded.  Experience in
programming systems involving many modules has taught that
the  explicit qualification of each occurrence of x is
actually preferable. A  simplification of the compiler is a
welcome side-effect.
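
For example, assuming a module M that exports a procedure P (the
names are chosen here for illustration only), a client writes

\begin{verbatim}
	IMPORT M;
	...
	M.P
\end{verbatim}

instead of importing P unqualified with a FROM clause.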

The dual role of the main module in Modula is conceptually
confusing. It  constitutes a module in the sense of a
package of data and procedures enclosed  by a scope of
visibility, and at the same time it constitutes a single 
procedure called the main program. Within the Oberon system,
the notion of  a main program has vanished. Instead, the
system allows the user to activate  any (exported,
parameterless) procedure (called a command). Hence, the 
language excludes modules without explicit definition parts,
and every module  is defined in terms of a definition part
and an implementation part (not  definition module and
implementation module).

\subsection*{Statements}

The with statement has been discarded. As in the case of
exported  identifiers, the explicit qualification of field
identifiers is to be  preferred.

The elimination of the for statement constitutes a break
with another long  standing tradition. The baroque mechanism
in Algol 60's for statement had  been trimmed considerably
in Pascal (and Modula). Its marginal value in  practice has
led to its absence in Oberon.
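
A loop that would be expressed with a for statement in Modula is
written in Oberon with while; for example (assuming an integer
variable i and an array a of length n):

\begin{verbatim}
	i := 0;
	WHILE i < n DO a[i] := 0; INC(i) END
\end{verbatim}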

\subsection*{Low-level facilities}

Modula-2 makes access to machine-specific facilities
possible through low-level constructs, such as the data
types ADDRESS and WORD, absolute addressing of variables,
and type casting functions. Most of them are packaged in a
module called SYSTEM. The features were supposed to be rarely
used and easily visible through the presence of SYSTEM in a
module's import list. Experience has revealed, however,
that a significant number of programmers import this module
quite indiscriminately. Modula's type transfer functions
are a particularly seductive trap.

It appears preferable to drop the pretense of portability of
programs that  import a ``standard'', yet system-specific
module. Both the module SYSTEM and  the type transfer
functions are eliminated, and with them also the types 
ADDRESS and WORD. Individual implementors are free to
provide system-dependent  modules, but they do not belong to
the general language definition. Their use  then declares a
program to be patently implementation-specific, and thereby 
non-portable.

\subsection*{Concurrency}

The system Oberon does not require any language facilities
for expressing  concurrent processes. The pertinent,
rudimentary features of Modula, in  particular the
coroutine, were therefore not retained. This exclusion is 
merely a reflection of our actual needs within the concrete
project, not a judgement on the general relevance of concurrency in
programming.

\section*{Features introduced in Oberon}

\subsection*{Type extension}

The most important addition is the facility of extended
record types. It  permits the construction of new types on
the basis of existing types, and  establishing a certain
degree of compatibility between the new and
old types. Assuming a given type

\begin{verbatim}
	T = RECORD x, y: INTEGER END
\end{verbatim}

extensions may be defined which contain certain fields in
addition to the  existing ones. For example

\begin{verbatim}
	T0 = RECORD (T) z: REAL END;
	T1 = RECORD (T) w: LONGREAL END;
\end{verbatim}

define types with fields x, y, z and x, y, w respectively.
We define a type  declared by

\begin{verbatim}
	T' = RECORD (T) <field definitions> END
\end{verbatim}

to be a (direct) extension of T, and conversely T to be the
(direct) base  type of T'. Extended types may be extended
again, giving rise to the following  definitions:

A type T' is an extension of T, if T' = T or T' is a direct
extension of an  extension of T. Conversely, T is a base of
T', if T = T' or T is the direct  base type of a base type
of T'. We denote this relationship by T' {\tt =>} T.

The rule of assignment compatibility states that values of
an extended type  are assignable to variables of their base
types. For example, a record of  type T0 can be assigned to
a variable of the base type T. This assignment  involves the
fields x and y only, and in fact constitutes a projection of 
the value onto the space spanned by the base type.
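
For example, given variables t of type T and t0 of type T0 (the
names are chosen here for illustration only), the assignment

\begin{verbatim}
	t := t0
\end{verbatim}

copies the fields x and y of t0 into t; the field z takes no part,
and the reverse assignment t0 := t is not admitted by the rule of
assignment compatibility.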

It is important that an extended type may be declared in a
module that imports  the base type. In fact, this is
probably the normal case.

This concept of extensible data type gains importance when
extended to pointers.  It is appropriate to say that a
pointer type P' bound to T' extends a pointer  type P, if P
is bound to a base type T of T', and to extend the
assignment  rule to cover this case. It is now possible to
form structures whose nodes  are of different types, i.e.
inhomogeneous data structures. The inhomogeneity is
automatically (and most sensibly) bounded by the fact that
the nodes are  linked by pointers of a common base type.

Typically, the pointer fields establishing the structure are
contained in the  base type T, and the procedures
manipulating the structure are defined in the  same (base)
module as T. Individual extensions (variants) are defined in 
client modules together with procedures operating on nodes
of the extended  type. This scheme is in full accordance
with the notion of system  extensibility: new modules
defining new extensions may be added to a  system without
requiring a change of the base modules, not even their 
recompilation.
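
A minimal sketch of this scheme might read as follows, where the
names Node, IntNode, and RealNode are chosen merely for
illustration:

\begin{verbatim}
	P = POINTER TO Node;
	Node = RECORD next: P END;
	IntNode = RECORD (Node) i: INTEGER END;
	RealNode = RECORD (Node) r: REAL END
\end{verbatim}

Records of type IntNode and RealNode may then be linked into a
single list through the next field of their common base type, with
the base module declaring Node and P, and client modules adding the
extensions.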

As access to an individual node via a pointer bound to a
base type provides a  projected view of the node data only,
a facility to widen the view is  necessary. It depends on
the possibility to determine the actual type of the 
referenced node. This is achieved by a type test, a Boolean
expression of the  form

\begin{verbatim}
	t IS T'		(or p IS P')
\end{verbatim}

If the test is affirmative, an assignment t' := t (t' of
type T') or  p' := p (p' of type P') should be possible. The
static view of types,  however, prohibits this. Note that
both assignments violate the rule of  assignment
compatibility. The desired statement is made possible by 
providing a type guard of the form

\begin{verbatim}
	t' := t(T')  (p' := p(P'))
\end{verbatim}

and by the same token access to the field z of a T0 (see
previous examples)  is made possible by a type guard in the
designator t(T0).z. Here the guard  asserts that t is
(currently) of type T0.
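
Combining test and guard, access to the extension might, for
example, be written as

\begin{verbatim}
	IF t IS T0 THEN t(T0).z := 0.5 END
\end{verbatim}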

The declaration of extended record types, the type test, and
the type guard  are the only additional features introduced
in this context. A more extensive  discussion is provided in
[2]. The concept is very similar to the class notion  of
Simula 67 [3], Smalltalk [4], and others. Differences lie in
the fact that  the class facility stipulates that all
procedures applicable to objects of the  class are defined
together with the data declaration. It is awkward to be 
obliged to define a new class solely because a method
(procedure) has been  added or changed. In Oberon, procedure
(method) types rather than methods are  connected with
objects in the program text. The binding of actual methods 
(specific procedures) to objects (instances) is delayed
until the program is  executed. In Smalltalk, the
compatibility rules between a class and its  subclasses are
confined to pointers, thereby intertwining the concept of 
access method and data type in an undesirable way.  Here,
the relationship  between a type an its extensions is based
on the established mathematical  concept of projection.

In Modula, it is possible to declare a pointer type within
an implementation  module, and to export it as an opaque
type by listing the same identifier in  the corresponding
definition module. The net effect is that the type is 
exported whereby its associated binding remains hidden
(invisible to clients).  In Oberon, this facility is
generalized in the following way: Let a record  type be
defined in a certain implementation part, for example:

\begin{verbatim}
	Viewer = RECORD width, height: INTEGER; x, y: INTEGER END
\end{verbatim}

In the corresponding definition part, a partial definition
of the same type  may be specified, for example

\begin{verbatim}
	Viewer = RECORD width, height: INTEGER END
\end{verbatim}

with the effect that a partial view --- a public projection ---
is visible to  clients. In client modules as well as in the
implementation part it is  possible to define extensions of
the base type (e.g. TextViewers or  GraphViewers).

\subsection*{Type inclusion}

Modern processors feature arithmetic operations on several
number formats. It  is desirable to have all these formats
reflected in the language as basic  types. Oberon features
five of them:

\begin{verbatim}
	LONGINT, INTEGER, SHORTINT   (integer types)
	LONGREAL, REAL               (real types)
\end{verbatim}

With the proliferation of basic types, a relaxation of
compatibility rules  between them becomes almost mandatory.
(Note that in Modula the arithmetic  types INTEGER, CARDINAL
and REAL are incompatible). To this end, the notion of type
inclusion is introduced: a type T includes a type T', if the
values  of T' are also values of type T. Oberon postulates
the following hierarchy:

\begin{verbatim}
	LONGREAL > REAL > LONGINT > INTEGER > SHORTINT
\end{verbatim}

[Note that ``{\tt >}'' should be replaced by the appropriate
mathematical sign.   Limitation of type-in..]

The assignment rule is relaxed accordingly: A value of type
T' can be assigned  to a variable of type T, if T' is
included in T or if T' extends T, i.e. if T {\tt >} T' or T' {\tt =>} T.
In this respect, we return to (and extend) the flexibility 
of Algol 60. For example, given variables

\begin{verbatim}
	i: INTEGER; k: LONGINT; x: REAL
\end{verbatim}

the assignments

\begin{verbatim}
	k:=i; x:=k; x:=1; k:=k+1; x := x*10 + i
\end{verbatim}

conform to the rules, whereas the assignments

\begin{verbatim}
	i:=k; k:=x
\end{verbatim}

are not  acceptable. Finally, it is worth noting that the
various arithmetic types  represent a limited set of
subrange types.

The multi-dimensional open array and the closure statement
(in symmetry to a  module's initialization body) are the
remaining facilities of Oberon not  present in Modula.

\section*{Summary}

The language Oberon has evolved from Modula-2 and
incorporates the  experiences of many years of programming
in Modula. A significant number of  features have been
eliminated. They appear to have contributed more to 
language and compiler complexity than to genuine power and
flexibility  of expression. A small number of features have
been added, the most  significant one being the concept of
type extension.

The evolution of a new language that is smaller, yet more
powerful than its  ancestor is contrary to common practices
and trends, but has inestimable  advantages. Apart from
simpler compilers, it results in a concise definition 
document [5], an indispensable prerequisite for any tool
that must serve  in the construction of sophisticated and
reliable systems.

\section*{Acknowledgement}

It is impossible to explicitly acknowledge all contributions
of ideas that  ultimately simmered down to what is now
Oberon. Most came from the use or  study of existing
languages, such as Modula-2, Ada, Smalltalk, C++ and  Cedar,
which often taught us how not to do it. Of particular value
was  the contribution of Oberon's first user, J. Gutknecht.
The author is  grateful for his insistence on the
elimination of deadwood and on basing  the remaining
features on a sound mathematical foundation.

\section*{References}

\noindent
1. N. Wirth. Programming in Modula-2.
Springer-Verlag, 1982.

\noindent
2. N. Wirth. Type Extensions. ACM Trans. on Prog.
Languages and Systems  (to appear)

\noindent
3. G. Birtwistle, et al. Simula Begin. Auerbach,
1973.

\noindent
4. A. Goldberg, D. Robson. Smalltalk-80: The language
and its implementation. Addison-Wesley, 1983.

\noindent
5. N. Wirth. The Programming language Oberon
(language definition document)

\end{document}
-- 
  Lawrence Crowl		716-275-9499	University of Rochester
		      crowl@cs.rochester.edu	Computer Science Department
...!{allegra,decvax,rutgers}!rochester!crowl	Rochester, New York,  14627

kent@xanth.cs.odu.edu (Kent Paul Dolan) (02/29/88)

Moderately enraged response; add salt!

The definition of Oberon looks, in the main, like a win, but somehow the
baby went out with the bathwater!

After a couple decades of using integers to encode concepts in FORTRAN
and many other tongues, enumerated types were a gift from the deities
of programming; they rid the code of magic numbers, made it more
readable and maintainable, and lessened dramatically the chance of
coding errors.  Since the automatic import of a type and all its named
constants was already eliminated for Oberon, the explicit
qualification of imported objects should have removed any
implementation problems here.

That is, if module A defines enumerated type a with constants p and q,
and module B extends this to type b with additional constants r and s,
and module C extends b to type c with added constant names t and u,
surely in module C I can say:

	A.a.p
	A.a.q
	B.b.r
	B.b.s
	C.c.t, c.t or t
	C.c.u, c.u or u

without any ambiguities?  Where was the compelling need to eliminate
this very valuable feature?  Note that I assume here that we do NOT
allow renaming, so that B must make the definitions imported from A
visible to C, if C is to import them from B, and that this is exactly
textually equivalent to C importing them from A directly.

I really, really, really don't want to backslide to programs full of
anonymous small integers which are really not arithmetic integers at
all, but only cryptic encodings for logical problem space concepts
which should have names, and the C language #define alternative makes
me quite nauseous with all the #include conflicts that arise in trying
to use this facility.

Also, as another issue, if words, bitsets, and similar explicitly
hardware-oriented concepts leave the standard language, then in the
field of operating systems design, I would appreciate strengthening of
the definition of sets, in the ways most implementations of Modula-2
have chosen to extend the language.

Also, some language mechanism should (IMHO) be added to indicate that
a piece of storage should be extended to the next boundary for
convenient hardware access (i.e., if I define a set type of a size to
require 7 bits, I should be able to force instances of elements of its
powerset to extend to a multiple of the size of a character, short
integer, integer, long integer, or pointer for execution efficiency,
without being too bothered about just how big any of the latter items
are in this particular hardware.)  I'm sure others would be just as
interested in the opposite facility, to pack the elements for storage
efficiency.

Kent, the (just an old coder, no language design expert) man from xanth.

franka@mmintl.UUCP (Frank Adams) (03/01/88)

In article <7161@sol.ARPA> crowl@cs.rochester.edu (Lawrence Crowl) writes:
>[Text by Niklaus Wirth]
>With the proliferation of basic types, a relaxation of compatibility rules
>between them becomes almost mandatory.  To this end, the notion  of type
>inclusion is introduced: a type T includes a type T', if the values  of T'
>are also values of type T. Oberon postulates the following hierarchy:
>	LONGREAL > REAL > LONGINT > INTEGER > SHORTINT

This appears to be a mistake.  I am assuming that these are intended to be
64 and 32 bit reals, and 32, (16 or 32), and 16 bit integers, respectively.

The basic problem is that the values of a 32 bit real do not include the
values of a 32 bit integer.  If one is going to be meticulous enough about
assignments to define type inclusion, one should only make types be included
when they really are.

(If INTEGER is intended to be 32 bits, and LONGINT 64, the problems are even
more serious.  If REAL is intended to be a 64 bit float, the absence of a 32 bit
float is a problem.)
-- 

Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
Ashton-Tate          52 Oakland Ave North         E. Hartford, CT 06108

sommar@enea.se (Erland Sommarskog) (03/02/88)

First, Lawrence, thank you for posting Wirth's paper. It was 
interesting to read.

So Wirth has designed a new language. If he continues in this
direction he's going to end up with C next time :-) That is
to say I am not impressed. OK, the extended type facility is
a nice feature, but principally it's just another way of
expressing prefix classes, well-known from Simula. A somewhat
different approach is Ada's generic packages with private-
type parameters.

He seems to have got simplifications in the compiler on his
mind, totally ignoring that a simplification in the compiler may
end up as more work for the poor programmer. Of the restrictions
he has made, I'd like to particularly question four of them: 
enumeration types, subranges, array indexes and for-statements.

Kent Paul Dolan has already given good arguments for keeping 
enumeration types, so I won't repeat them, just strongly agree. However, 
I don't share Kent's ideas on how to refer to imported enumeration 
values. That would lead to code like:
     Case Action of
        Module1.Action_type.Action1 : Do_something;
        Module1.Action_type.Action2 : Do_something_else;
        ...   
Talk about meaningless verbosity!

Wirth's argument for removing subranges is really short-cutting.
No one uses packing, and guarding alone isn't justification enough,
he says in essence. Well, it is. Not the least when you remove
enumerations. If you had subranges, you could at least protect your 
now integer-coded enumerations from illegal values due to assignment 
from uninitialized variables.

I don't fully understand the passage on arrays. He says that indexes
are integers "by default". Does this mean you could have a character
index if you like? The rest of the passage doesn't seem to admit it.
And that is a miss, I think. 
  The idea of having all arrays starting on index zero is really to
put the burden on the programmer. If Wirth really must have a fixed
lower index, he could at least have chosen one! How many of you do
often use arrays with zero as the lower bound? I do it very seldom.
And his argument about saving execution time for index computation
is truly naive. The result will just be references like:
    String(char_no + 1), 
just making the code more verbose and complex for no use.

The removal of the FOR-statement is also questionable. This means
that all loops must be done with WHILE (or REPEAT?). WHILE is more 
error-prone than FOR, since there are more things that the programmer
has to take care of. 
  And particularly in Pascal, and I guess also Modula, a certain mistake 
may have disastrous consequences:
    PROCEDURE A;
    VAR i : integer;
       PROCEDURE B;
       BEGIN
          FOR i := 1 TO Max DO
             Something;
       END;
    BEGIN
       FOR i := 1 TO Max DO
          B;
    END;
Since we forgot to declare i in B, B will be called only once. (OK, 
depends on how FOR is implemented, but replace with WHILE and there is
no doubt.) Most Pascal implementions help to save you from this by 
allowing an arbitrary order of the declarations, so we can desclare i
just before A's body. (How about Modula and Oberon?)
  Here I can't keep from mentioning Ada. In Ada the loop variable is 
declared in the FOR statement as such. Now, if only things like
   FOR i IN Array_range WHILE A(i) /= 0 LOOP
had been allowed... 

Wirth also talks about the importance of the description being concise.
I have my doubts here. OK, "Programming in Modula-2" may not count as the
description, but I recall it as truly drivelling in many parts. (Also,
reading it directly after going through the Ada RM, was like drinking 
diluted lemonade.) And his code examples were about unreadable, on top
of all being set with a proportional font!

We had a discussion on "Modern languages" a while ago, me being guilty.
A current trend in language design seems to be the removal of long-ago
accepted constructs. Oberon drops enumerations, Eiffel has no arrays.
The only difference is that while Oberon leaves no substitute except
integer constants, Eiffel provides arrays as a standard class. Also,
Eiffel does really introduce something new: multiple inheritance.
It is missing in many OO-languages such as Simula, Smalltalk and also Oberon.


-- 
Erland Sommarskog       
ENEA Data, Stockholm        
sommar@enea.UUCP           "Souvent pour s'amuser les hommes d'equipages
                            and it's like talking to a stranger" -- H&C.

crowl@cs.rochester.edu (Lawrence Crowl) (03/03/88)

In article <2787@enea.se> sommar@enea.UUCP(Erland Sommarskog) writes:
>The extended type facility is a nice feature, but principally it's just
>another way of expressing prefix classes, well-known from Simula. A somewhat
>different approach is Ada's generic packages with private-type parameters.

Ada's generic packages may introduce a new procedure to handle each parameter
type.  Oberon's extended types guarantee that exactly one procedure is needed.
Ada's generic packages allow only one type to be in the corresponding
structure (e.g. queue), while Oberon's extended types allow many different
types to be in the queue.  Ada's generic packages allow putting a type on a
queue to be done simply as an afterthought, but using Oberon's extended types
requires more effort to put a type on a queue as an afterthought.

>The idea of having all arrays starting on index zero is really to put the
>burden on the programmer. ... And his argument about saving execution time
>for index computation is truely naive. The result will just be references
>like: "String(char_no + 1)", just making the code more verbose and complex
>for no use.

One compiler technique adjusts the "base" address portion of the index
computation to account for a non-zero lower bound.  This optimization becomes
harder if non-zero basing is done in user code.
-- 
  Lawrence Crowl		716-275-9499	University of Rochester
		      crowl@cs.rochester.edu	Computer Science Department
...!{allegra,decvax,rutgers}!rochester!crowl	Rochester, New York,  14627

hal@pur-phy (Hal Chambers) (03/03/88)

In article <2787@enea.se> sommar@enea.UUCP(Erland Sommarskog) writes:

 >First, Lawrence, thank you for posting Wirth's paper. It was 
 >interesting to read.

Ditto.

 >So Wirth has designed a new language. If he continues in this
 >direction he's going to end up with C next time :-)

It appears just about all of us got that impression.

 >  The idea of having all arrays starting on index zero is really to
 >put the burden on the programmer. If Wirth really must have a fixed
 >lower index, he could at least have chosen one! How many of you do
 >often use arrays with zero as the lower bound? I do it very seldom.

For myself, usage is split just about 50-50 between lower index 0
and lower index 1.

 >  Here I can't keep from mentioning Ada. In Ada the loop variable is 
 >declared in the FOR statement as such.

One of the features of Ada that I like! This also makes the fact that
the scope of the loop variable is confined to the loop quite explicit.

As to requiring explicit qualification to imported names, this makes it
impossible for a programmer to provide facilities which appear to extend
for language.  For example, I use my own standard io module (inspired by
Software Tools) and like to regard  open,close,putcf etc. as part of the
language.  Requiring qualification means I have to code

		stdio.open, stdio.close,   etc.

To me, this appears verbose (and adds no clarity).  If one feels that
the programmer should not be lulled into thinking that open/close ARE
part of the language, then I think that requirement is satisfied by the
appearance of open/close in the import list:

	FROM stdio IMPORT open, close, ...

One feature of Oberon that I like a lot is mixed-mode arithmetic.
Requiring the programmer to provide explicit type conversion in
expressions (INTEGER,CARDINAL,TRUNC,FLOAT,...) adds to the verbosity
of the program and detracts from understanding the meaning of the
expression.

How would programmers feel if a language designer decided that a hierarchy
of implicit operator precedence was dangerous and required programmers
to always use parentheses explicitly!  Specifying an order of numeric
type precedence frees the source program of the warts mentioned above.
Note that I am only referring to type conversion within the numeric
types and NOT between CHAR, BOOLEAN, etc.

Hal Chambers

gore@eecs.nwu.edu (Jacob Gore) (03/04/88)

Can someone tell me how to get a copy of the Oberon language definition
document?  Or any other writeups on it, preferably with some examples?

Jacob

P.S.  I remember seeing Wirth's article on "derived types" in SIGPLAN Notices,
but I can't locate the issue at the moment...

Jacob Gore				Gore@EECS.NWU.Edu
Northwestern Univ., EECS Dept.		{oddjob,gargoyle,ihnp4}!nucsrl!gore

djsalomon@watdragon.waterloo.edu (Daniel J. Salomon) (03/04/88)

I would like to point out two minor typos in Wirth's introduction
to Oberon that lead to humorous results.  On page 3 he gives statistics
on code size for the parts of his Oberon compiler.  The correct table
should be:

                   length of          length of     time of self
                source program      compiled code   compilation
----------------------------------------------------------------
                lines  characters       bytes         seconds
----------------------------------------------------------------
Parser           1116       36719        9928           11.53
Scanner           346        9863        3388            3.80
Import/Export     514       18386        4668            5.25
Code generator   1963       65901       21636           21.02
                ------     -------     -------         -------
Total            3939      130869       39620           41.60

Notice that only two digits have changed columns, but now the columns
add up correctly, and the statistics make more sense.

nevin1@ihlpf.ATT.COM (00704a-Liber) (03/04/88)

In article <2740@mmintl.UUCP> franka@mmintl.UUCP (Frank Adams) writes:
.In article <7161@sol.ARPA> crowl@cs.rochester.edu (Lawrence Crowl) writes:
.>[Text by Niklaus Wirth]
.>With the proliferation of basic types, a relaxation of compatibility rules
.>between them becomes almost mandatory.  To this end, the notion  of type
.>inclusion is introduced: a type T includes a type T', if the values  of T'
.>are also values of type T. Oberon postulates the following hierarchy:
.>	LONGREAL > REAL > LONGINT > INTEGER > SHORTINT
.
.The basic problem is that the values of a 32 bit real do not include the
.values of a 32 bit integer.  If one is going to be meticulous enough about
.assignments to define type inclusion, one should only make types be included
.when they really are.

This depends on what is meant by value.  Including precision, you are
right; n bit reals cannot exactly represent all the n bit integers.  But,
if you allow the precision to 'slack off', n bit reals CAN represent
all the n bit integers.  But the mapping is not 1:1; it is many:1 (from
ints to reals); this means that in the following case:

VAR IX, IY : INTEGER;	{integers are 32 bits}
    RZ : REAL;		{reals are 32 bits}
...
{Something sets IX}
...
RZ := IX;
IY := INTEGER(RZ);

IX and IY are not (necessarily) equal.

But I wonder:  in this circumstance, is it possible for the IY assignment
statement to ever fail (ie, produce a run-time error)?  I would hope not.


A lot of people have praised Pascal and Modula-2 BECAUSE there were no
implicit type conversions.  Me, I'm not sure whether this is good or bad.
Personally, I make most of my type conversions explicit anyway in most of the
languages I use.  Does it really add all that much complexity to the
compiler; or, more importantly, does the compiler generate better code if
the conversions are implicit (Personally, I can't see this happening)?
-- 
 _ __			NEVIN J. LIBER	..!ihnp4!ihlpf!nevin1	(312) 510-6194
' )  )				"The secret compartment of my ring I fill
 /  / _ , __o  ____		 with an Underdog super-energy pill."
/  (_</_\/ <__/ / <_	These are solely MY opinions, not AT&T's, blah blah blah

nevin1@ihlpf.ATT.COM (00704a-Liber) (03/04/88)

In article <2787@enea.se> sommar@enea.UUCP(Erland Sommarskog) writes:
>
>So Wirth has designed a new language. If he continues in this
>direction he's going to end up with C next time :-) 

No chance of this happening.  Wirth will never extend pointers to the point
where they are as powerful as pointers in C.  Read some of his earlier papers
and you'll see why.

>Wirth's argument for removing subranges is really short-cutting.
>No one uses packing, and just guarding isn't sufficient enough,
>he says in essence. Well, it is. Not the least when you remove
>enumeration. If you had subranges, you could at least protect your 
>now integer-coded enumrates from illegal values due to assigning 
>from uninitiated variables.

I never really agreed with this type of run-time checking (that's really
what he is getting rid of), although it is somewhat useful for debugging
purposes.  First off, it is an awful lot of run-time overhead.  Secondly,
if your program is 'correct', you should not need this checking at all.
Thirdly, it is not powerful enough.  If an error like this *does* occur in
my program, I would like to be able to trap it and take care of it, instead
of the program just dying (maybe I can save some of the data, etc.).

>I don't fully understand the passage on arrays. He says that indexes
>are integers "by default". Does this mean you could have a character
>index if you like?

You could in Modula-2.  And I quote (w/o permission) from section 9 (p36)
of the "Programming in Modula-2" by Wirth (2nd edition):

"For example, the array declaration

	map:	ARRAY CHAR OF CARDINAL

introduces an array of 128 cardinals where each element is indexed by a
character value as shown by the statements

	map["A"] := 0;		k := map["+"]."

>  The idea of having all arrays starting on index zero is really to
>put the burden on the programmer. If Wirth really must have a fixed
>lower index, he could at least have chosen one! How many of you do
>often use arrays with zero as the lower bound? I do it very seldom.
>And his argument about saving execution time for index computation
>is truely naive. The result will just be references like:
>    String(char_no + 1), 
>just making the code more verbose and complex for no use.

I've been thinking about the overhead for arbitrary ranges on arrays.  In
the most efficient implementation, you would need three pieces of
information to carry along (although what three pieces of information you
keep is implementation-dependent):

1)	Address of the first element of the array.
2)	Address of the last element of the array.
3)	Address of array element 0.

[Note:  there are other ways to organize this information such as 2 & 3
being the upper and lower bounds, respectively.  But I think that you
always need three pieces of information.]  Also, '1)' must be known in
order to access the array at all.

In order to access an element x, you must calculate the address of the
element (address := x * sizeof(type of array) + '3)'), and check to see
that it falls within the bounds of '1)' and '2)'.  Assuming that '1)' is a
synonym (internally) for the array (and I am defining this value to be of
no cost since all implementations of array indices need this or a similar
value), this involves 1 indirect reference (to get '3)')
to calculate the address of the element x, which would not be
needed if the lower bound is always 0 (since '1)' and '3)' are the same
address; hence, only two pieces of information are needed in this case).
Also, this example shows that maximal efficiency is achieved by making the
lower-bound 0 instead of 1.

A proposed solution:  whenever you need an array from 1..n, why don't you
just declare it from 0..n??  I know that this takes slightly more memory (1
extra element), but in most circumstances it is probably worth (no pun
intended :-)) it.  If 1 were always the lower bound, then whenever you
needed an array starting at 0 you would *always* have to add one to your
index; there are no other tricks around it.

BTW, I liked having arrays with variable ranges.  In C, I can implement
this when necessary (by playing with pointers), but this does not seem to
be possible in Oberon  (without always adjusting the index).

>The removal of the FOR-statement is also questionable. This means
>that all loops must be done with WHILE (or REPEAT?). WHILE is more 
>error-prone than FOR, since there are more things that the programmer
>has to take care of. 
>  And particulary in Pascal, and I guess also Modula, a certain mistake 
>may have disastrous consequences:
>    PROCEDURE A;
>    VAR i : integer;
>       PROCEDURE B;
>       BEGIN
>          FOR i := 1 TO Max DO
>             Something;
>       END;
>    BEGIN
>       FOR i := 1 TO Max DO
>          B;
>    END;
>Since we forgot to declare i in B, B will be called only once. (OK, 
>depends on how FOR is implemented, but replace with WHILE and there is
>no doubt.) Most Pascal implementions help to save you from this by 
>allowing an arbitrary order of the declarations, so we can desclare i
>just before A's body. (How about Modula and Oberon?)

This is illegal in standard Pascal!  B is NOT allowed to modify i in this
manner, since i is the loop variable in A.  I think that Wirth is taking
this out because it is not always possible at compile time to determine
whether or not the 'global' loop variable is being used by the local
procedure (just add a few conditionals and flags passed to B
to the above example and you'll see
what I mean).  Since compilers don't get it right (ie, always enforce the
rules of the language--something which Wirth seems to think programming
languages need),  why leave it in??

Also, arbitrary order of declarations alone does not alleviate this
problem.  In your example, i should still be a valid var in B.  You have to
add the restriction that no variable (procedure, type, etc.) can be used
before it is declared.

>Missing in many OO-languages as Simula, Smalltalk and also Oberon.

I don't think that Oberon is even *close* to being object-oriented!!


So far, this discussion has been comparing Oberon to Wirth's previous
languages (with little bits of other languages thrown in).  In addition to
this, I would like to see a discussion of what types of programs this
language IS suited for, especially with respect to other languages.
-- 
 _ __			NEVIN J. LIBER	..!ihnp4!ihlpf!nevin1	(312) 510-6194
' )  )				"The secret compartment of my ring I fill
 /  / _ , __o  ____		 with an Underdog super-energy pill."
/  (_</_\/ <__/ / <_	These are solely MY opinions, not AT&T's, blah blah blah

erja@daimi.UUCP (Erik Jacobsen) (03/04/88)

In <2787@enea.se> Erland Sommarskog (sommar@enea.se) makes many good
points about Wirth's new language, Oberon, but also deplores the
removed FOR-loop. He gives an example, and writes:

>  ... (OK, 
>  depends on how FOR is implemented ...

That is in fact the problem with FOR-loops - at least as they exist
today.

At one time I had access to 5 different PASCAL-compilers, and could write
a program with one FOR-loop, that would give 4 different results when
executed, and one that wouldn't compile.

This problem does not exist with WHILE/REPEAT-loops, and removing
the FOR-loop from a language is one effective way of making it
cleaner.

Another way is to define what the FOR-loop actually means in one
particular language, and today there is a PASCAL-standard. But
we still have old compilers, and we have FOR-loops in other languages,
that look the same, but behave differently.

If you write programs for portability, you must know what subset of
valid FOR-loops will compile and execute correctly in all implementations
of the language (and possibly in other languages), and otherwise use
WHILE/REPEAT-loops.

oconnor@sunset.steinmetz (Dennis M. O'Connor) (03/05/88)

It's difficult to see the logic in using Modula2 if
a reliable Ada(R) compiler is available. This is
precisely my situation : I use Modula2 as a "poor
man's Ada", on my Amiga. Not that it isn't okay,
but Ada leaves Modula2 far behind.

Oberon seems to be a "poor man's Modula2", which
I guess makes it a "welfare Ada" :-) I see no
use for Oberon. With the ever-increasing availability
of "standard" C, Ada, Common Lisp, and yes Modula2,
it has no unique problem domain in which it excels
by enough of a margin to make it worth the hassle
that is ALWAYS involved in using a new language.

Sad what can happen when famous intellectuals try
to recapture their glory days, isn't it :-)
--
    Dennis O'Connor			      oconnor%sungod@steinmetz.UUCP
		   ARPA: OCONNORDM@ge-crd.arpa
   (-: The Few, The Proud, The Architects of the RPM40 40MIPS CMOS Micro :-)

shebs%defun.utah.edu.uucp@utah-cs.UUCP (Stanley T. Shebs) (03/06/88)

In article <1008@pur-phy> hal@newton.physics.purdue.edu.UUCP (Hal Chambers) writes:

>How would programmers feel if a language designer decided that a hierarchy
>of implicit operator precedence was dangerous and required programmers
>to always use parentheses explicitly!

Gee, sounds like Lisp to me!  I don't recall ever hearing that explicit
parens everywhere reduced programmers' productivity; on the contrary, Lisp
systems are considered one of the best tools to get lots of code written
fast (reliability is another matter).  Dunno why people assume that an
operator precedence hierarchy is a Good Thing, given that its main rationale
is historical (dating from the late Renaissance?), and that reduction of
the size of source code is not a particularly strong reason!

							stan shebs
							shebs@cs.utah.edu

cjeffery@arizona.edu (Clinton Jeffery) (03/07/88)

From article <9796@steinmetz.steinmetz.UUCP>, by oconnor@sunset.steinmetz (Dennis M. O'Connor):
> It's difficult to see the logic in using Modula2 if
> a reliable Ada(R) compiler is available...[verbiage omitted]...
> Oberon seems to be a "poor man's Modula2", which
> I guess makes it a "welfare Ada" :-) I see no use for Oberon...[verbiage]...
> it has no unique problem domain in which it excels

***FLAME ON***
Bashing Wirth without understanding his point is silly, especially by
contrasting his work with Ada!  I may not be excited about Oberon, because
I am particularly fond of FOR loops :-), but it is more sensible to design
a general purpose language that accommodates abstractions efficiently than
to try to provide ALL abstractions as builtins!  Mr. O'Connor's last point
about unique problem domains is VERY apt, but applies equally well to that
thing people call Ada.  Get Ada lovers out of this newsgroup; they don't
understand what Modula-2 and friends are for.
-- 
+--------------
| Clint Jeffery, University of Arizona Department of Computer Science
| cjeffery@arizona.edu -or- {ihnp4 noao}!arizona!cjeffery
+--------------

oconnor@sungoddess.steinmetz (Dennis M. O'Connor) (03/09/88)

An article by cjeffery@arizona.edu (Clinton Jeffery) says:
] From article <9796@steinmetz.steinmetz.UUCP>, by oconnor@sunset.steinmetz (Dennis M. O'Connor):
] > It's difficult to see the logic in using Modula2 if
] > a reliable Ada(R) compiler is available...
] > Oberon seems to be a "poor man's Modula2", which
] > I guess makes it a "welfare Ada" :-) I see no use for Oberon...
] > it has no unique problem domain in which it excels
] 
] ***FLAME ON***
] Bashing Wirth without understanding his point is silly, especially by
] contrasting his work with Ada!

As I understand it, the point of Oberon is to make the compiler writer
do LESS, and the application programmer do more. Which, if true, is INSANE.

] [...] it is more sensible to design
] a general purpose language that accomodates abstractions efficiently than
] to try to provide ALL abstractions as builtins!

If there's a language that does "provide ALL abstractions as
builtins", it is NOT Ada. Ada is VERY extensible, with its
private types, limited types, packages and generics. What the
HELL are you talking about ? Have you programmed in Ada ??
Have you even read the LRM ? Nice book, that LRM.

] Mr. O'Connors last point about unique problem domains is VERY apt,
] but applies equally well to that thing people call Ada.

Sorry, Ada has a unique problem domain : software for the military.
Due to legislation more than anything else, perhaps. But you are
still wrong. IMHO, Ada is a VERY good language for LARGE SYSTEM
development by MANY programmers. That's a BIG problem domain.

] Get Ada lovers out of this newsgroup; they don't
] understand what Modula-2 and friends are for.

My, my, aren't we opinionated? "Modula-2 and friends"?? That's a
funny statement, since Ada and Modula-2 are practically sisters.
Now me, I've programmed EXTENSIVELY (not just homework) in
ZetaLISP, C, Ada, Modula-2, FORTRAN, and assembly. I like
Ada, I like Modula-2, I like Zetalisp. I love my wife.
( I HATE CASE SENSITIVITY IN A LANGUAGE, tho )

Have you written large code in Ada ?? If not, you have no
right to talk about it. So SHUT UP.

] | Clint Jeffery, University of Arizona Department of Computer Science

Boy, the things these college students say...
--
    Dennis O'Connor			      oconnor%sungod@steinmetz.UUCP
		   ARPA: OCONNORDM@ge-crd.arpa
   (-: The Few, The Proud, The Architects of the RPM40 40MIPS CMOS Micro :-)

snorri@rhi.hi.is (Snorri Agnarsson) (03/09/88)

In my opinion Oberon is a significant improvement over
Pascal, Modula-2 and Ada in one respect:  It has
automatic garbage collection.  This feature
is enough to choose Oberon over the other three.
I would rather have list processing without modules than
modules with no list processing.  Of course having both
is best.
-- 
Snorri Agnarsson                  Internet:  snorri@rhi.hi.is
Raunvisindastofnun Haskolans      uucp:      ...!mcvax!hafro!krafla!snorri
Dunhaga 3 107 Reykjavik ICELAND

kers@otter.hple.hp.com (Christopher Dollin) (03/09/88)

"oconnor@sungoddess.steinmetz (Dennis M. O'Connor)" says:

| ( I HATE CASE SENSITIVITY IN A LANGUAGE, tho )

My, time for more war! Wouldn't be without it ... Common Lisp, *yuk*.

Regards,
Kers                                    | "Why Lisp if you can talk Poperly?"

sommar@enea.se (Erland Sommarskog) (03/11/88)

Erik Jacobsen (erja@daimi.UUCP) writes:
>That is in fact the problem with FOR-loops - at least as they exist
>today.
>...
>If you write programs for portability, you must know what subset of
>valid FOR-loops will compile and execute correctly in all implementations
>of the langauge (and possibly in other languages), and otherwise use
>WHILE/REPEAT-loops.

I have already referred to Ada and I will again. Ada as I see it has
the perfect solution: The loop variable is declared in the FOR-
statement, and is thus not accessible afterwards. I can't but see 
that that definition solves the problems. Now, if only this had been
possible for WHILE-loops too!

A side note: Since REPEAT loops are quite rare and a source of
error when used in the wrong place, it surprises me that Wirth
didn't remove them too, to make his compiler even simpler.
-- 
Erland Sommarskog       
ENEA Data, Stockholm        
sommar@enea.UUCP           "Souvent pour s'amuser les hommes d'equipages
                            and it's like talking to a stranger" -- H&C.

noise@eneevax.UUCP (Johnson Noise) (03/11/88)

In article <1345@daimi.UUCP> erja@daimi.UUCP (Erik Jacobsen) writes:

>This problem does not exist with WHILE/REPEAT-loops, and removing
>the FOR-loop from a language is one effective way of making it
>cleaner.

	I guess I don't have to tell you how C handles the problem.
In my opinion, it is by far the most logical.  Sometimes I wonder
which construct (while, for) I should use. After all, they generate
the same code.

>Another way is to define what the FOR-loop actually means in one
>particular language, and today there is a PASCAL-standard. But
>we still have old compilers, and we have FOR-loops in other languages,
>that look the same, but behave differently.

	Not in C. Or should I say, _never_ in C.
I don't subscribe to the Pascal standard (or any so called standard
proposed by N. Wirth).  I think Ritchie (or Thompson) was rather shrewd
in this regard.

cbseeh@fang.ATT.COM (Edmund E Howland) (03/14/88)

> 
> In my opinion Oberon is a significant improvement over
> Pascal, Modula-2 and Ada in one respect:  It has
> automatic garbage collection.  This feature
> is enough to choose Oberon over the other three.
> I would rather have list processing without modules than
> modules with no list processing.  Of course having both
> is best.
> -- 
Huh?
Since when have these three ever had a need for garbage collection?
I am not too familiar with Ada, but garbage collection exists in
languages for which dynamic binding is a way of life, not an option.

It seems you really are saying that you like languages with embedded
list processing. Modula-2, Pascal and Ada do not directly support
list constructs in their syntax, but it is of course possible to roll your own,
so to speak. Garbage collection is not really the issue. It all boils
down to the right tool for the right job. 

I think it unwise to pick a language over another because of the lack
of some feature, in the latter. I might argue that Ada is a better language
than Modula 2, simply because it has better concurrency primitives. But, if
the application at hand had no use for these features, my argument would be 
invalid. Indeed, Modula 2 would be the better choice since the implementation
would be smaller, resulting in tighter code, and in the long run more 
maintainable.

faustus@ic.Berkeley.EDU (Wayne A. Christopher) (03/15/88)

In article <2827@enea.se>, sommar@enea.se (Erland Sommarskog) writes:
> ... Ada as I see it has
> the perfect solution: The loop variable is declared in the FOR-
> statement, and is thus not accessible afterwards.

Here's the problem with that construct:  I often write

	for (i = 0; i < max; i++)
		if (something)
			break;
	if (i == max)
		...

If the loop variable is inaccessible outside of the loop there's
no way to tell how it terminated, except by using an extra flag,
which is ugly.

	Wayne

karl@haddock.ISC.COM (Karl Heuer) (03/16/88)

In article <1557@pasteur.Berkeley.Edu> faustus@ic.Berkeley.EDU (Wayne A. Christopher) writes:
>Here's the problem with that construct:  I often write [edited --kwzh]
>  for (i = 0; i < max; i++)
>    if (something) break;
>  if (i < max) ...
>If the loop variable is inaccessible outside of the loop there's no way to
>tell how it terminated, except by using an extra flag, which is ugly.

Personally, I think that duplicating the test (i < max) is just as ugly.  I
would write the above as
  bool issomething() {
    for (i = 0; i < max; ++i)
      if (something) return (YES);
    return (NO);
  }
  ...
  if (issomething()) ...
which does fit nicely with the short-scope index variable rule.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

dhesi@bsu-cs.UUCP (Rahul Dhesi) (03/16/88)

In article <272@fang.ATT.COM> cbseeh@fang.ATT.COM (Edmund E Howland) writes:
>I might argue that Ada is a better language
>than Modula 2, simply because it has better concurrency primitives. But, if
>the application at hand had no use for these features, my argument would be 
>invalid. Indeed, Modula 2 would be the better choice since the implementation
>would be smaller, resulting in tighter code, and in the long run more 
>maintainable.

If you do not need any concurrency in your application, a smart Ada
compiler will generate code not containing any provision for
concurrency.  Therefore there is no reason to believe that the code
generated from the Modula-2 program will be smaller.
-- 
Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi

peter@athena.mit.edu (Peter J Desnoyers) (03/16/88)

In article <1557@pasteur.Berkeley.Edu> faustus@ic.Berkeley.EDU (Wayne A. Christopher) writes:
>In article <2827@enea.se>, sommar@enea.se (Erland Sommarskog) writes:
>> ... Ada as I see it has
>> the perfect solution: The loop variable is declared in the FOR-
>> statement, and is thus not accessible afterwards.
>Here's the problem with that construct:  I often write
>
>	for (i = 0; i < max; i++)
>		if (something)
>			break;
>	if (i == max)
>		...
>
>	Wayne

I've forgotten most of the CLU I learned, but I think the way to do
that in a real language would be:

for thing in what_i_want_from list[1..max]
begin
   ...
end
  except when empty
  ...

where what_i_want_from is a user-defined iterator.

I wish they had put iterators in Ada. C provides a general linear
iterator, but in CLU you can iterate through a tree or anything else.
After all, the idea of iterating is so that you can separate the
structure of a bunch of data from the logic of:

for each foo in bar
  do something
end


				Peter Desnoyers

edw@IUS1.CS.CMU.EDU (Eddie Wyatt) (03/16/88)

> 
> I wish they had put iterators in Ada. C provides a general linear
> iterator, but in CLU you can iterate through a tree or anything else.
> After all, the idea of iterating is so that you can separate the
> structure of a bunch of data from the logic of:
> 
> for each foo in bar
>   do something
> end
> 

		***This is not a flame against the author.***

   When will people learn, data abstraction is not a programming
language or type of programming language, it is a programming
methodology.
 
  In particular, if you want generalized iteration, say in C, it's
a simple matter of defining an iteration function for that particular
data type.  Provide it a reset flag to start iteration, keep
either a static var or an extra field in the data type representing
the current object, and pick some termination object.

    for (object = iter_func(data_struct,START); 
	 object != TERMINATOR;
	 object = iter_func(data_struct,NEXT)) {....}

I do this for my generic hash tables in C when I need to dump
them in linear format.
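
For concreteness, a sketch of what such an iter_func might look like for a
chained hash table (HASH_TABLE, BUCKET, NBUCKETS, START, NEXT and TERMINATOR
are invented names for illustration):

#define NBUCKETS 97
#define START 0
#define NEXT  1
#define TERMINATOR ((void *) 0)   /* assumes a null pointer is never stored */

typedef struct bucket { void *val; struct bucket *next; } BUCKET;
typedef struct { BUCKET *slot[NBUCKETS]; } HASH_TABLE;

void *iter_func(ht, flag)
HASH_TABLE *ht;
int flag;
{
    static int i;                 /* current slot          */
    static BUCKET *b;             /* position in its chain */

    if (flag == START) {
        i = 0;
        b = ht->slot[0];
    } else if (b != (BUCKET *) 0)
        b = b->next;

    while (b == (BUCKET *) 0 && ++i < NBUCKETS)
        b = ht->slot[i];          /* advance to the next non-empty slot */

    return (b == (BUCKET *) 0) ? TERMINATOR : b->val;
}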
-- 

Eddie Wyatt 				e-mail: edw@ius1.cs.cmu.edu

crowl@cs.rochester.edu (Lawrence Crowl) (03/16/88)

In article <3764@bloom-beacon.MIT.EDU> peter@athena.mit.edu
(Peter J Desnoyers) writes:
)I wish they had put iterators in Ada. C provides a general linear iterator,
)but in CLU you can iterate through a tree or anything else.  After all, the
)idea of iterating is so that you can separate the structure of a bunch of data
)from the logic of:
)
)    for each foo in bar
)        do something
)    end

In article <1130@PT.CS.CMU.EDU> edw@IUS1.CS.CMU.EDU (Eddie Wyatt) writes:
>When will people learn, data abstraction is not a programming language or type
>of programming language, it is a programming methodology.
> 
>In particular, if you want generalized iteration, say in C, it's a simple matter
>of defining an iteration function for that particular data type.  Provide it a
>reset flag to start iteration, either a static var or extra field in the data
>type representing the current object and some termination object.
>
>    for (object = iter_func(data_struct,START); 
>	 object != TERMINATOR;
>	 object = iter_func(data_struct,NEXT)) {....}
>
>I do this for my generic hash tables in C when I need to dump them in
>linear format.

Eddie Wyatt's proposal is not as general as CLU iterators.  In particular, (as
Peter J Desnoyers points out) you can easily implement a tree traversal in CLU,
but you must allocate and maintain a stack of nodes in order to implement tree
traversal under the separate function scheme.
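
For concreteness, a sketch (invented names, nobody's posted code) of an
in-order tree iterator in the separate-function style; note both the explicit
stack of pending nodes and the static cursor, which runs into exactly the
objections below:

typedef struct tree { struct tree *left, *right; void *val; } TREE;

#define MAXDEPTH 64               /* assumes the tree is no deeper than this */
#define START 0
#define NEXT  1

void *tree_iter(root, flag)
TREE *root;
int flag;
{
    static TREE *stack[MAXDEPTH]; /* nodes not yet returned        */
    static int sp;
    static TREE *cur;             /* subtree still to be descended */
    TREE *node;

    if (flag == START) {
        sp = 0;
        cur = root;
    }
    while (cur != (TREE *) 0) {   /* push the left spine */
        stack[sp++] = cur;
        cur = cur->left;
    }
    if (sp == 0)
        return (void *) 0;        /* traversal exhausted */
    node = stack[--sp];
    cur = node->right;            /* resume from here on the next call */
    return node->val;
}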

Another problem is that objects must have distinct TERMINATOR values.  This is
not generally true.

The suggestion of using "either a static var or extra field in the data type"
fails when two iterations may proceed on the same structure at the same time.
(Note that this can happen without concurrency, but I admit it is rather rare.)

You can do all your data abstraction and programming in assembler.  The issue
is what language constructs make such programming *effective*, not just
possible.
-- 
  Lawrence Crowl		716-275-9499	University of Rochester
		      crowl@cs.rochester.edu	Computer Science Department
...!{allegra,decvax,rutgers}!rochester!crowl	Rochester, New York,  14627

edw@IUS1.CS.CMU.EDU (Eddie Wyatt) (03/16/88)

In article <7747@sol.ARPA>, crowl@cs.rochester.edu (Lawrence Crowl) writes:
> 
> Eddie Wyatt's proposal is not as general as CLU iterators.  In particular, (as
> Peter J Desnoyers points out) you can easily implement a tree traversal in CLU,
> but you must allocate and maintain a stack of nodes in order to implement tree
> traversal under the separate function scheme.

1)   For trees you could have secondary threads.  iter_func(x,START)
could traverse the tree in any order you want, setting up the threads
and returning the first element of the list.  Successive calls
could just return the next element off the list.  Point being,
with a little imagination you can do it.  (Basically, the tree will
also act as a stack, too.)
  
2)   The other option is to proceed in the inverse direction.  Provide
the iteration function a function to be invoked on each node.

> 
> Another problem is that objects must have distinct TERMINATOR values.  This is
> not generally true.

  Then change the format to:

	BOOL iter_func(data_type,flag,ret)  

	for (ret = iter_func(data_type,START,&val); 
	     ret == TRUE; 
	     ret = iter_func(data_type,NEXT,&val)) {...}
> 
> The suggestion of using "either a static var or extra field in the data type"
> fails when two iterations may proceed on the same structure at the same time.
> (Note that this can happen without concurrency, but I admit it is rather rare.)

Option number 2 may solve this, but see final comment.

> 
> You can do all your data abstraction and programming in assembler.  The issue
> is what language constructs make such programming *effective*, not just
> possible.

FINAL COMMENT
This kind of reminds me of the criticisms of Algol's iteration as being
too baroque.  And in being so, it tended to compile to very
inefficient code.  Is the same true for CLU?  Comments?

-- 

Eddie Wyatt 				e-mail: edw@ius1.cs.cmu.edu

snorri@rhi.hi.is (Snorri Agnarsson) (03/16/88)

From article <272@fang.ATT.COM>, by cbseeh@fang.ATT.COM (Edmund E Howland):
> It seems you really are saying that you like languages with imbedded
> list processing. Modula-2, Pascal and Ada do not directly support
> list constructs in their syntax, but it is of course possible to roll-your-own,
> so to speak. Garbage collection is not really the issue. It all boils
> down to the right tool for the right job. 
> 

Not true.  It is not a matter of syntax and it is not possible in general
to "roll-your-own" in a satisfactory manner.  To really support abstraction
garbage collection is a necessity.  If you don't believe me, try to 
write list processing primitives in MODULA-2 that work without their user
having to explicitly deallocate unused space and allow the user to write
list processing constructs in a reasonable way, i.e.:

	X := CONS(Y,Z);
	X := HEAD(Y);
	X := TAIL(Y);

It can't be done in any MODULA-2 or PASCAL version I know of.
Ada compilers can theoretically be written in such a way that garbage
collection is possible (at least so I'm told), but I don't know if any
such compilers exist.  It does not seem to be considered a big issue.
In my opinion it is.
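
To see the difficulty in C terms (the Modula-2 situation is the same), here
is a sketch with invented names.  The client code looks just like the
fragment above, but the cell built by cons() becomes unreachable as soon as
X is reassigned, and neither the package nor its caller can know when it is
safe to free it, since other lists may still be sharing it:

#include <stdlib.h>

typedef struct cell { void *head; struct cell *tail; } CELL;

CELL *cons(h, t)
void *h;
CELL *t;
{
    CELL *c = (CELL *) malloc(sizeof(CELL));

    if (c == (CELL *) 0)
        abort();                  /* out of memory */
    c->head = h;
    c->tail = t;
    return c;
}

void *head(c) CELL *c; { return c->head; }
CELL *tail(c) CELL *c; { return c->tail; }

/*  x = cons(y, z);
    x = tail(x);    -- the cons cell is now garbage, but only a
                       garbage collector could discover that   */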
-- 
Snorri Agnarsson                  Internet:  snorri@rhi.hi.is
Raunvisindastofnun Haskolans      uucp:      ...!mcvax!hafro!krafla!snorri
Dunhaga 3 107 Reykjavik ICELAND

schooler@oak.bbn.com (Richard Schooler) (03/16/88)

  The debate about CLU iterators misses an important point: CLU iterators are
coroutines, which C does not have.  The implication is that per-iteration state
is nicely hidden in CLU ("on the stack" in the coroutine local variables), and
has to be explicitly managed in C (or any other coroutine-free language) by
putting the state into globals or making the user hang on to the state.

	-- Richard
	schooler@bbn.com

michael@boulder.Colorado.EDU (Michael Schmidt) (03/17/88)

In article <2827@enea.se>, sommar@enea (Erland Sommarskog) writes:
>I have already referred to Ada and I will again. Ada as I see it has
>the perfect solution: The loop variable is declared in the FOR-
>statement, and is thus not accessible afterwards.

Furthermore it is a constant inside the loop, if I remember
right.

	Michael

edw@IUS1.CS.CMU.EDU (Eddie Wyatt) (03/17/88)

> 
>   The debate about CLU iterators misses an important point: CLU iterators are
> coroutines, which C does not have.  The implication is that per-iteration state
> is nicely hidden in CLU ("on the stack" in the coroutine local variables), and
> has to be explicitly managed in C (or any other coroutine-free language) by
> putting the state into globals or making the user hang on to the state.
> 

   But do you need coroutines to implement generalized iteration?
I think not.  The optional solution I posted should allow you
to perform the same task.

  The optional solution was to provide the iteration function a function
to invoke on each member of the enumerated list.  A single call would be
made to iterate through the list.

Example:

	void iterate_inorder(root,func)
	    TREE *root;
	    void (*func)();
	    {
	    if (root != NULLPTR(TREE))
		{
	    	iterate_inorder(root->left,func);
	    	func(root->val);
	    	iterate_inorder(root->right,func);
		}
	    }

    I do see one problem though, which is dealing with variables
    referenced inside the loop.  They would have to become
    global.  Comments as to why this method might not work in general?
-- 

Eddie Wyatt 				e-mail: edw@ius1.cs.cmu.edu

crowl@cs.rochester.edu (Lawrence Crowl) (03/17/88)

In article <22162@bbn.COM> schooler@oak.bbn.com (Richard Schooler) writes:
]The debate about CLU iterators misses an important point: CLU iterators are
]coroutines, which C does not have.  The implication is that per-iteration
]state is nicely hidden in CLU ("on the stack" in the coroutine local
]variables), and has to be explicitly managed in C (or any other coroutine-free
]language) by putting the state into globals or making the user hang on to the
]state.

In article <1139@PT.CS.CMU.EDU> edw@IUS1.CS.CMU.EDU (Eddie Wyatt) writes:
>But do you need coroutines to implement generalized iteration?  I think not.
>The optional solution I posted should allow you to perform the same task.  

The CLU iterators are not general coroutines, but a special case that allows
them to be implemented with the conventional activation stack.  The use of
iterators reduces programmer effort.  It does not make something possible that
was not previously possible.

>The optional solution was to provide the iteration function a function to
>invoke on each member of the enumerated list.  A single call would be made to
>iterate through the list. ... [inorder tree traversal example deleted] ...
>I do see one problem though, which is dealing with variable referenced inside
>the loop.  They would have to become global.  Comments as to why this method
>might not work in general?

The method can work in general.  In fact, it is very powerful in general.
However, it requires nested functions so that the loop-body function can gain
access to the local variables of the code wanting the loop.  This only becomes
convenient when the language allows inline function definition (something
like BCPL's value-yielding blocks).
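
In C, which has no nested functions, the usual workaround is to thread a
caller-supplied context pointer through the traversal so that the loop-body
function can reach the caller's locals without globals.  A sketch along the
lines of Eddie's iterate_inorder (the extra ctx argument and sum_node() are
invented for illustration):

typedef struct tree { struct tree *left, *right; int val; } TREE;

void iterate_inorder(root, func, ctx)
TREE *root;
void (*func)();
void *ctx;
{
    if (root != (TREE *) 0) {
        iterate_inorder(root->left, func, ctx);
        (*func)(root->val, ctx);
        iterate_inorder(root->right, func, ctx);
    }
}

static void sum_node(val, ctx)
int val;
void *ctx;
{
    *(int *) ctx += val;          /* this plays the role of the loop body */
}

/*  int total = 0;
    iterate_inorder(tree, sum_node, (void *) &total);   */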
-- 
  Lawrence Crowl		716-275-9499	University of Rochester
		      crowl@cs.rochester.edu	Computer Science Department
...!{allegra,decvax,rutgers}!rochester!crowl	Rochester, New York,  14627

peter@athena.mit.edu (Peter J Desnoyers) (03/18/88)

In article <1130@PT.CS.CMU.EDU> edw@IUS1.CS.CMU.EDU (Eddie Wyatt) writes:
>		***This is not a flame against the author.***
>   When will people learn, data abstraction is not a programming
>language or type of programming language, it is a programming
>methodology.
>  In particular, if you want generalized iteration, say in C, it's
>a simple matter of defining an iteration function for that ...
>
>Eddie Wyatt 				e-mail: edw@ius1.cs.cmu.edu

Data abstraction is __assisted__ by the proper choice of programming
language. Note that the `for' construct in Pascal is not powerful 
enough to build such a general-purpose iterator - you would have
to do it with the more general `while' construct. Similarly, C does
not provide direct language support for co-routines, and constructing
an iterator that yields members of a recursive structure requires 
constructing an explicit static stack. (Similar to expressing a recursive
routine in a non-recursive Basic)

Try writing an iteration function in C for a binary tree. That may be 
data abstraction, but when you get done with it, it's going to look 
ugly because __C is not powerful enough to express general iterators
cleanly__.

I hope that I did not flame too much - it's late at night, and I wanted
to clarify my previous comments.


					Peter Desnoyers
					peter@athena.mit.edu
					

bruns@catalina.SW.MCC.COM (Glenn Bruns) (03/24/88)

In article <1130@PT.CS.CMU.EDU> edw@IUS1.CS.CMU.EDU (Eddie Wyatt) writes:
>  In particular, if you want generalized iteration, say in C, it's
>a simple matter of defining an iteration function for that particular
>data type.  Provide it a reset flag to start iteration,
>either a static var or extra field in the data type representing
>the current object and some termination object.
>
>    for (object = iter_func(data_struct,START); 
>	 object != TERMINATOR;
>	 object = iter_func(data_struct,NEXT)) {....}
>

I have also wrestled with implementing generalized iteration in C.
Eddie's solution is, I think, too restrictive, in that it prevents
one iterator from beginning while another is in progress.  For example,
consider an iterator over a tree, used in a C for loop.  The loop
body may call some function that also needs to iterate over all
nodes in the tree.  

My solution (not original, I'm sure) is to provide the following
iterator functions for a data abstraction:

    x_ptr = first_x(abs, int_ptr)
    x_ptr =  next_x(abs, int_ptr)

Where 
    abs      is a data abstraction
    int_ptr  is a pointer to an integer, modified only by first_x() and next_x()
    x_ptr    is a pointer to an element of abs

A null object is returned by first_x() (or next_x()) when there is no
first (or next) object.  The variable 'int_ptr' is used by the iterator
to indicate a position in the data abstraction.

For example:

    for (node = first_node(tree, &i); node != NULL_NODE; node = next_node(tree, &i)) {
	    node_func(node);
    }
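
One sketch of how first_node()/next_node() might be written for the tree
case (the details are invented for illustration): let the caller-owned
integer be the in-order index of the next node, and re-walk the tree on each
call.  That is quadratic overall, but fully re-entrant, since all of the
iteration state lives in the caller's cursor:

    typedef struct node { struct node *left, *right; void *val; } NODE;

    #define NULL_NODE ((NODE *) 0)

    static NODE *nth_inorder(t, n)    /* find the node with in-order index *n */
    NODE *t;
    int *n;
    {
        NODE *found;

        if (t == NULL_NODE)
            return NULL_NODE;
        if ((found = nth_inorder(t->left, n)) != NULL_NODE)
            return found;
        if ((*n)-- == 0)
            return t;
        return nth_inorder(t->right, n);
    }

    NODE *next_node(tree, pos)
    NODE *tree;
    int *pos;
    {
        int n = *pos;                 /* work on a copy; the cursor advances */
        NODE *found = nth_inorder(tree, &n);   /* only when a node is found  */

        if (found != NULL_NODE)
            *pos += 1;
        return found;
    }

    NODE *first_node(tree, pos)
    NODE *tree;
    int *pos;
    {
        *pos = 0;
        return next_node(tree, pos);
    }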

Sometimes it is easier to provide a function that traverses all
elements of a data abstraction, applying a given function to each
element.  So the previous example would become:

    tree_apply(tree, node_func);


-- 
Glenn Bruns 
MCC, Software Technology Program
arpa: bruns@mcc.com    
uucp: {seismo,harvard,gatech,pyramid}!ut-sally!im4u!milano!bruns

ralphw@IUS3.IUS.CS.CMU.EDU (Ralph Hyre) (03/25/88)

In article <1130@PT.CS.CMU.EDU> edw@IUS1.CS.CMU.EDU (Eddie Wyatt) writes:
>> 
>> I wish they had put iterators in Ada. C provides a general linear
>> iterator, but in CLU you can iterate through a tree or anything else...
....
>		***This is not a flame against the author.***
>
>   When will people learn, data abstraction is not a programming
>language or type of programming language, it is a programming
>methodology.
> 
>  In particular, if you want generalized iteration, say in C, it's
>a simple matter of defining an iteration function for that particular
>data type.
I agree with your first point, but it's not so simple to do things
in a language that wasn't designed to support them cleanly.  The Guttag &
Liskov book spends a chapter or so discussing doing data abstraction
in Turbo Pascal.
I'd rather spend my time writing code than worrying about how to map my
programming methodology onto a language that doesn't support it neatly.

When I make my own mapping, it also makes it harder for someone else to
come along later and understand what I did, unless I document my
techniques and experiences.

[I must admit to some bias here, I feel much more productive using CLU
than C.  Other than interoperability with certain existing code, I can't
see any reason to use C when CLU is available.]
-- 
					- Ralph W. Hyre, Jr.

Internet: ralphw@ius2.cs.cmu.edu    Phone:(412)268-{2847,3275} CMU-{BUGS,DARK}
Amateur Packet Radio: N3FGW@W2XO, or c/o W3VC, CMU Radio Club, Pittsburgh, PA

edw@IUS1.CS.CMU.EDU (Eddie Wyatt) (03/25/88)

 > I agree with your first point, but it's not so simple to do things
 > in a language that wasn't designed to support them cleanly.  The Guttag &
 > Liskov book spends a chapter or so discussing doing data abstraction
 > in Turbo Pascal.
 > I'd rather spend my time writing code than worrying about how to map my
 > programming methodology onto a language that doesn't support it neatly.
 >

	In my opinion there's just one misfeature that makes data abstraction
   ugly in C - scoping.  C only has two levels, global and local.  I
   would prefer much finer control, such as what is provided by Modula-2
   and Common Lisp.

 > 
 > When I make my own mapping, it also makes it harder for someone else to
 > come along later and understand what I did, unless I document my
 > techniques and experiences.

	You have to map abstract structures to actual structures at some point
	in time, whether it's C or CLU.  As long as you adhere to the
	access interface, you gain/lose nothing in either language.
	If that sentence refers to operator overloading as a means of
	clarifying the access interface, well, operator overloading has
	its pitfalls as well as gains.

 > 
 > [I must admit to some bias here, I feel much more productive using CLU
 > than C.  Other than interoperability with certain existing code, I can't
 > see any reason to use C when CLU is available.]

  Try code speed, programming tools, debuggers ....
-- 

Eddie Wyatt 				e-mail: edw@ius1.cs.cmu.edu

cjeffery@arizona.edu (Clinton Jeffery) (03/25/88)

From article <1217@PT.CS.CMU.EDU>, by ralphw@IUS3.IUS.CS.CMU.EDU (Ralph Hyre):
> ...Other than interoperability with certain existing code, I can't
> see any reason to use C when CLU is available.]

	Oh great, another person who can't see any reason to use a
	'low level' language instead of a _real_language_! :-)

<<I admit in advance I am *only* a student, and have not programmed in CLU;
  before anyone flames me, here's the real reason for this posting:>>

From reading a CLU paper, the language appears FANTASTIC--do any free
implementations exist??  I *AM* interested in source code if available...
Thanks in advance for any pointers.
-- 
+--------------
| Clint Jeffery, University of Arizona Department of Computer Science
| cjeffery@arizona.edu -or- {ihnp4 noao}!arizona!cjeffery
+--------------

faustus@ic.Berkeley.EDU (Wayne A. Christopher) (03/25/88)

Before you can really say a language is any good, you have to write a few
30K line plus programs in it.  What large programs have been written
in CLU?  A lot of languages look really nice until you try to do real
work in them...

	Wayne

peter@sugar.UUCP (Peter da Silva) (03/25/88)

In article <1139@PT.CS.CMU.EDU>, edw@IUS1.CS.CMU.EDU (Eddie Wyatt) writes:
>>
>>  The debate about CLU iterators misses an important point: CLU iterators are
>>coroutines, which C does not have.

Coroutines in 'C' are easy to implement, though. Why, this whole O/S we're
reading news on (UNIX) is written as a set of coroutines in 'C'. (yes, it's
a simplification... but not much of one).

I'd like to see the following functions become standard in 'C':

COROUTINE -- Build a jmp_buf for a coroutine.

int coroutine(jump, entry, stack, stacksize);
jmp_buf *jump;
void (*entry)();
char *stack;
int stacksize;

This sets up the stack and jmp_buf so that a call to "longjmp(jmp_buf)"
will appear to be a call to entry(). It will return an error only if the
stack isn't large enough for a small routine that does nothing but call
the following function:

int co_switch(from, to, status)   /* "switch" itself is a C keyword */
jmp_buf *from, *to;
int status;
{
	int code;

	if(!(code=setjmp(*from)))
		longjmp(*to, status);

	return code;
}

Voila! Co-routines! Lightweight processes (best if you have the Berkeley
signal handler, I guess, so you could run it off alarms...):

#include <setjmp.h>
#include <stdlib.h>

struct proc {
	struct proc *next;
	struct proc *prev;
	char *stack;
	jmp_buf context;
};

struct proc *runq;	/* doubly linked circular queue */

sleep()
{
	struct proc *self;
	/* do nothing if no procs or I'm alone */
	if(!runq)
		return;
	if(runq->next == runq)
		return;
	self = runq;
	runq = runq->next;
	co_switch(&self->context, &runq->context, 1);
}

struct proc *spawn(entry, stacksize)  /* assumes runq already holds at least one proc */
void (*entry)();
int stacksize;
{
	struct proc *p;

	if(!(p = (struct proc *)malloc(sizeof *p)))
		return 0;
	if(!(p->stack = malloc(stacksize))) {
		free(p);
		return 0;
	}
	if(!coroutine(&p->context, entry, p->stack, stacksize)) {
		free(p->stack);
		free(p);
		return 0;
	}
	p->next = runq->next;
	p->prev = runq;
	runq->next->prev = p;
	runq->next = p;
	return p;
}

int delete(p) /* note, this version doesn't allow a process to delete itself! */
struct proc *p;
{
	if(p==runq)
		return 0;

	p->next->prev = p->prev;
	p->prev->next = p->next;
	free(p->stack);
	free(p);
	return 1;
}
-- 
-- Peter da Silva  `-_-'  ...!hoptoad!academ!uhnix1!sugar!peter
-- Disclaimer: These U aren't mere opinions... these are *values*.

peter@athena.mit.edu (Peter J Desnoyers) (03/26/88)

In article <1220@PT.CS.CMU.EDU> edw@IUS1.CS.CMU.EDU (Eddie Wyatt) writes:
> > 
> > [I must admit to some bias here, I feel much more productive using CLU
> > than C.  Other than interoperability with certain existing code, I can't
> > see any reason to use C when CLU is available.]
>
>  Try code speed, programming tools, debuggers ....
                                       ^^^^^^^^^
Not to be rude, but the CLU debugger I used in Guttag & Liskov's
course was much better than the C debuggers I've seen (dbx, Apollo debug);
among other things, the interface was (almost?) a full interpreted
CLU and it supported dynamic linking.  Have you used CLU before?
>--
>Eddie Wyatt e-mail: edw@ius1.cs.cmu.edu

				Peter Desnoyers
				peter@athena.mit.edu

james@cantuar.UUCP (J. Collier) (03/29/88)

Peter da Silva (peter@sugar.UUCP) writes:
>....
>I'd like to see the following functions become standard in 'C':
>....
>COROUTINE -- Build a jmp_buf for a coroutine.
>....
>This sets up the stack and jmp_buf so that a call to "longjmp(jmp_buf)"
>will appear to be a call to entry().
> [implementation outlines deleted]

   I seem to remember a minor war in the letters to 'Software Practice
and Experience' [I think] on this subject a couple of years back.
(Wasn't it peaceful in the days when it took a month or two to get the
next installment..)

   Correct me if I'm wrong, but I find that on some machines (well, on
BSD Vaxen anyway) setjmp()/longjmp() can't be used to implement
coroutines because longjmp() unwinds the stack destructively while
checking for botches. A small amount of in-line assembly language is
therefore necessary for transferring control.
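
   On systems that provide the System V style <ucontext.h> routines
(getcontext(), makecontext() and swapcontext()), the transfer can be done
without in-line assembly; a minimal sketch, purely for illustration, since
availability varies:

#include <stdio.h>
#include <ucontext.h>

#define STACKSIZE 16384

static ucontext_t main_ctx, co_ctx;
static char co_stack[STACKSIZE];

static void co_body(void)
{
    printf("in coroutine\n");
    swapcontext(&co_ctx, &main_ctx);  /* transfer back to the caller */
    printf("in coroutine again\n");
}                                     /* returning resumes uc_link   */

int main(void)
{
    getcontext(&co_ctx);
    co_ctx.uc_stack.ss_sp = co_stack;
    co_ctx.uc_stack.ss_size = sizeof co_stack;
    co_ctx.uc_link = &main_ctx;       /* context resumed when co_body returns */
    makecontext(&co_ctx, co_body, 0);

    swapcontext(&main_ctx, &co_ctx);  /* first entry into the coroutine */
    printf("back in main\n");
    swapcontext(&main_ctx, &co_ctx);  /* resume it */
    printf("done\n");
    return 0;
}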

   I agree with Peter's view that coroutines should be supported in the
C library. As he says, coroutine packages are quite easy to write and they
are indispensable for certain classes of application (I wanted one originally
for a window server/multiplexor).

   The current situation where everybody brews their own isn't really
acceptable. The programs aren't portable, and the semantics differ
sufficiently to make things confusing for the reader. The details will
have to be thrashed out before a standard is defined.

   Pre-emptive threads which share a common data space are probably not
a good idea for most purposes - the synchronisation problems usually
outweigh any advantages. Leave true multitasking to the operating system
and keep to protected data spaces if possible, unless you enjoy wrapping
semaphore primitives around every second line of code. (OK, I've just spent
2 years doing this on the Mac, but that's different - no OS processes).

   Threads with explicit sleep() calls and round-robin transfer - as
suggested by Peter - are one way of organising things. I suppose it's
a matter of personal taste, but I prefer a system where the 'new_coroutine()'
call returns a pointer to a structure which contains at least the context
information and the stack base pointer, a reduced form of Peter's 'proc'
structure. This makes it easier to tailor your use of coroutines; sometimes
you want to transfer control explicitly rather than set up a threads system,
and you therefore need some way to identify coroutines throughout their
lifespan. The switch()/transfer()/resume() call keeps track of the
current coroutine through a static pointer, and hence needs no 'from'
parameter.

   There are other issues, such as how best to set up the routines so that
they exit cleanly, whether the original context should be set up as
a coroutine, and whether to support special transfers such as resume_caller()
or resume_<some default>(). Comments?

-------------------------
James Collier              Internet(ish):  james@cantuar.uucp
Computer Science Dept.,    UUCP:           {watmath,munnari,mcvax}!cantuar!james
University of Canterbury,  Spearnet/Janet: j.collier@nz.ac.canty
Christchurch, New Zealand. Office: +64 3 482 009 x8356  Home: +64 3 554 025

franka@mmintl.UUCP (Frank Adams) (03/29/88)

In article <4494@megaron.arizona.edu> cjeffery@arizona.edu (Clinton Jeffery) writes:
>Thanks in advance for any pointers.

int *p, *q, *r;

You're welcome.
-- 

Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
Ashton-Tate          52 Oakland Ave North         E. Hartford, CT 06108

fortin@zap.UUCP (Denis Fortin) (03/29/88)

In article <272@fang.ATT.COM> cbseeh@fang.ATT.COM (Edmund E Howland) writes:
>>> [talking about the definition of the Oberon language posted recently
>>>  to comp.lang.modula2]

>> In my opinion Oberon is a significant improvement over Pascal, Modula-2 
>> and Ada in one respect:  It has automatic garbage collection.  This feature
>> is enough to choose Oberon over the other three.  
> 
>Huh?
>Since when have these three ever had a need for garbage collection?
>I am not too familiar with Ada, but garbage collection exists in
>languages for which dynamic binding is a way of life, not an option.

(note: the stuff below applies equally well to Modula-2 or Ada, although
       it is a bit more Ada-specific)

Actually, Ada *allows* but does not *require* the existence of a
garbage collector (LRM section 4.8 paragraph 7).  While I agree that for
real-time embedded applications, a garbage collector can be annoying
(though there are incremental algorithms that make this less true), in a
language that encourages information-hiding, a good garbage collector is
almost a must (in my opinion). 

The problem is that users might be using modules/packages that contain
hidden pointers to dynamic data.  Without garbage collection, you have
to have a free/dispose/unchecked_deallocation routine available for all
hermetic data types you export; otherwise code will be written that does
not release variables of that type after using them.  If at a later
point you change the implementation of the hidden data type (which you
SHOULD be able to do, that's why you hid its true contents in the first
place) and stick a pointer/access variable in it, then you're stuck ->
your heap will slowly erode away!

Actually, by not requiring that a user call a "free" routine for all
variables of data types you export (those where implementation is
hidden), you are allowing him/her to make a very important assumption
about the way your data type is implemented (which is bad). 

Of course, one CAN export a "free" routine with *ALL* data types, but
then it becomes a real pain to write code that uses the data type.  For
example, a "dynamic string package" would require that you call
"free(...)" for all of the string variables you declare at the end of
each block that declares such variables -> this is not only notationally
painful (!?), but also very error-prone. 
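
A sketch of the notational pain (STRING, str_create(), str_concat() and
str_free() are invented names for a package whose representation is hidden):

typedef struct string STRING;     /* contents hidden inside the package */

extern STRING *str_create();      /* make a STRING from a char *        */
extern STRING *str_concat();      /* new STRING, catenation of two      */
extern void    str_free();        /* must be called for every STRING    */

void client()
{
    STRING *greeting = str_create("hello, ");
    STRING *name     = str_create("world");
    STRING *message  = str_concat(greeting, name);

    /* ... use message ... */

    str_free(message);            /* forget any one of these, or change  */
    str_free(name);               /* the hidden representation later,    */
    str_free(greeting);           /* and the heap slowly erodes away     */
}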

Anyway, I've been involved in a 75000+ lines Ada project, and every time
I can talk to Ada compiler vendors, I always ask them *when* garbage
collection will finally be available for their compiler.  (Put in a
pragma to turn it off for real-time applications or something...)

(Actually, the two biggest demands that my group had for Ada compilers
 (besides the obvious "fast compile" and "fast execution") were "fast
 tasking" and "garbage collection"!)

Sigh, it felt good getting this off my chest :-)
-- 
Denis Fortin                            | fortin@zap.UUCP
CAE Electronics Ltd                     | philabs!micomvax!zap!fortin
The opinions expressed above are my own | fortin%zap.uucp@uunet.uu.net

crowl@cs.rochester.edu (Lawrence Crowl) (03/30/88)

In article <429@zap.UUCP> fortin@zap.UUCP (Denis Fortin) writes:
>... in a language that encourages information-hiding, a good garbage
>collector is almost a must (in my opinion). ... Without garbage collection,
>you have to have a free/dispose/unchecked_deallocation routine available for
>all hermetic data types you export, ... [otherwise you limit subsequent
>implementations] ... but then it becomes a real pain to write code that uses
>the data type.

C++ provides the notion of a *destructor*.  Whenever you define a class
(abstract data type), you may define a destructor.  The destructor is
*implicitly* called when the variable is deleted (e.g. at the end of the
procedure for local variables).  The implementation of the class may change
from static allocation to dynamic allocation, without changing the client
code, without leaking memory, and without explicit client control of
deallocation.

Garbage collection is one approach to reducing the storage management burden
on programmers.  It is not the only approach, nor is it always the best
approach.
-- 
  Lawrence Crowl		716-275-9499	University of Rochester
		      crowl@cs.rochester.edu	Computer Science Department
...!{allegra,decvax,rutgers}!rochester!crowl	Rochester, New York,  14627

boehm@titan.rice.edu (Hans Boehm) (03/31/88)

  It seems to me that the C++ approach to storage management is at best
a partial solution.  Admittedly it succeeds at automatically deallocating
objects in trivial cases.  For some applications this is, no doubt,
sufficient.  But consider a case as simple as implementing a stack type,
whose underlying representation is a linked list.  Assume this type
includes a non-destructive "pop" operation that returns a new stack
one shorter than the old one.  The original stack is left intact.  ("Pop"
can of course be implemented as the "tail" operation on linked lists.)
Should the head of the original linked list be deallocated?  Is it reasonable
to make that the caller's responsibility?  Certainly not, since it's not
supposed to know anything about the representation of stacks.  After all,
I could be copying arrays.  The stack implementation could reference count,
but that's more tedious, error prone, and probably slower than a good
garbage collector.  It also doesn't always work.
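For concreteness, a sketch of the stack in question (the names are invented):
pop() merely shares the tail of the old stack, so there is no point at which
either the caller or the implementation can safely free the old top cell:

#include <stdlib.h>

typedef struct node { void *val; struct node *next; } NODE;
typedef NODE *STACK;              /* a stack is just a pointer to its top */

#define EMPTY ((STACK) 0)

STACK push(s, v)
STACK s;
void *v;
{
    NODE *n = (NODE *) malloc(sizeof(NODE));

    if (n == (NODE *) 0)
        abort();                  /* out of memory */
    n->val = v;
    n->next = s;
    return n;
}

STACK pop(s)                      /* non-destructive: the old stack stays intact */
STACK s;
{
    return s->next;               /* neither side can tell whether s may be freed */
}

/*  STACK a = push(push(EMPTY, x), y);
    STACK b = pop(a);    -- a and b now share every cell of a except the
                            top one; freeing it inside pop() would wreck a  */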
  My experience is that any attempt at manipulation of interesting linked
structures in a language that doesn't support real automatic storage
management will either fail, or waste large amounts of debugging time.
(My experience includes a (working) 40,000 line compiler, written in C, that
manipulates a reference counted syntax "tree", or more accurately, a
reference counted syntax graph.)  Normally, it is extremely difficult
to track down bugs created by premature deallocation.  When such bugs are
finally removed, the resulting programs normally include a substantial
number of storage leaks.
  Some recent experiments by Mark Weiser and myself with retrofitting a
garbage collector to existing C programs verify the latter point.
(The garbage collector should never free anything, since that was the
programmer's responsibility.  It does.  In other people's code.
Our paper on this should appear in Software P&E shortly.)
Mike Caplinger reported similar results for another application at the last
USENIX conference, I believe.  We have resurrected C code with storage
management bugs by linking it against a garbage collector (which in the case
of C doesn't always work either, but it has a better track record than manual
storage management).
  There are arguably cases in which a garbage collector is undesirable,
notably in the presence of severe real-time constraints.  But even in such
a situation, I would want a garbage collector during debugging to help
me track down storage leaks.

Hans-J. Boehm                       boehm@rice.edu
Computer Science Dept.
Rice University
Houston, TX 77251-1892