[net.lang.mod2] Cardinal

powell@decwrl.UUCP (02/25/84)

From: powell (Mike Powell)
Is the cardinal data type in Modula-2 a mistake?  What was wrong with subranges
of integer, anyhow?  If you believe your machine implements values in the range
0..65535 efficiently, then by all means, define a type and use it in your
programs.  There is no reason a subrange can't be 0..4294967295 either, if
that's to your liking.  At some point, the compiler will complain about your
using numbers it can't handle.  E.g., mine won't like -1..4294967295, but at
least you'll get a compile error if you try, not a runtime error (assuming your
compiler/hardware is nice enough to do range/overflow checking).
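
For instance, a minimal sketch of the idea, assuming a 32-bit implementation
(the type names are illustrative only):

   TYPE
      Unsigned32 = [0 .. 4294967295];      (* fine: fits one unsigned word *)
   (* Bad        = [-1 .. 4294967295]; *)  (* rejected at compile time: no single
                                              representation covers this range *)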

The compiler (writer) can tell by looking at the subrange declarations and the
instruction manual which instructions to generate.  And you won't build
unnecessary machine dependencies into your programs.

Defining cardinal and integer to be something as small as 16 bits means many
programs have no hope of working unless there are longcardinal and longinteger.
No one writing a program on a real computer (one with >= 32-bit words) is going
to expect an integer to be <= 32767, and their programs will never port to a
microprocessor that thinks integers are so small.  Why do we need to repeat the
mistakes of C?  If the micro Modula-2 programmers think that 16 bits are enough
for a number, they are free to define types with smaller ranges.  E.g.,
type int = [-32768..32767]; uint = [0..65535].
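
Spelled out as an actual declaration part (a sketch; the lowercase names
follow the example above):

   TYPE
      int  = [-32768 .. 32767];   (* one signed 16-bit word   *)
      uint = [0 .. 65535];        (* one unsigned 16-bit word *)
   VAR
      offset : int;
      count  : uint;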

I propose that cardinal be eliminated as a primitive type, and that it simply
be an alias for 0..MAXINT.  No reasonable implementation should define MAXINT
less than, say, a billion.  The range -MAXINT..MAXINT should always be legal.
Ranges with higher or lower bounds may be permitted, but need not be.  The
compiler is free to store subranges as smaller values, and may use unsigned
arithmetic when the values are all non-negative.  If the result of an operation
is stored into a subrange, then the compiler may use shorter arithmetic, since
range checking or overflow detection would catch any errors.  There is no need
to make different subranges incompatible.  If the compiler cannot generate
code for a particular statement because it exceeds the capacity of the machine,
then that's an implementation restriction.  E.g., my compiler wouldn't let you
add i : [-2147483648..2147483647] to j : [0..4294967295], but would allow
k : [0..65535] times i (doing it as signed integer arithmetic) or k times j
(doing it as unsigned).
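
A sketch of how the proposal would read (not legal under the current
definition; the variables and bounds follow the example above):

   VAR
      i, m : [-2147483648 .. 2147483647];
      j, n : [0 .. 4294967295];
      k    : [0 .. 65535];
   ...
   (* m := i + j; *)   (* implementation restriction: exceeds 32-bit capacity *)
      m := k * i;      (* generated as signed 32-bit arithmetic               *)
      n := k * j;      (* generated as unsigned 32-bit arithmetic             *)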

This solution also eliminates the problems with addresses.  An address is
compatible with (subranges of) integers.  Because addresses may be segmented,
different range checks may be necessary, but for the most part, an address is
just another subrange of integer.  My compiler would likely define them to be
[0..4294967295], but other compilers might make them be [0..16777215] or
whatever is appropriate.
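
A sketch of what that might look like (the bounds are illustrative; each
implementation would pick its own):

   TYPE
      ADDRESS = [0 .. 4294967295];   (* a 32-bit byte-addressed machine *)
   (* ADDRESS = [0 .. 16777215]; *)  (* or a 24-bit address space       *)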

Is there any reason why this won't work?  Are there hidden advantages to
cardinal that I don't know about?  Does anyone like the current situation with
cardinal, integer, and address (and probably longcard, and longint)?

					Michael L. Powell
					Digital Equipment Corporation
					Western Research Laboratory
					Los Altos, California
					{decvax,ucbvax}!decwrl!powell
					powell@berkeley

phipps@fortune.UUCP (Clay Phipps) (03/07/84)

The 'cardinal' type was not in the original Modula paper
[*Software -- Practice And Experience*, vol. 7 (1977), p. 3 .. 35; see p. 11],
so I suspect that it was added late in the Modula-2 design
without considering its implications.

The integer vs. cardinal distinction appears to be an abstraction
of the low-level distinction between signed and unsigned arithmetic.
The Modula-2 book mentions the greater range of unsigned integers
and the gain in program reliability from defining away many illegal values
for subranges with nonnegative lower bounds, such as types used for counters.

It seems like a nice idea, especially because I was substantially burned
by a magic constant 65535 in a program that would not work
if that number were not represented as a 16-bit unsigned integer.
I thought that this would provide a means for clarifying a programmer's
intentions via source language constructs, but this isn't the case [sigh!].
My initial enthusiasm for the notion of cardinal types was reinforced
by a vague feeling that they would simplify some heuristic representation
selection code in a Pascal compiler I worked on a few years back;
I can't recall the details at the moment.  
I think it had to do with fields in packed records
used to map data structures to communicate with microcode and hardware.

I am really disappointed by the apparent lack of thoroughness in the language
design with regard to cardinals.  Wirth didn't take it far enough.

For my notion of cardinals as an abstraction of unsigned numbers
to be effective, it must be possible to express subranges
relative to an explicit base type.
This means that a subrange would be like an Ada subtype,
i.e., a base type with a particular constraint;
this seems most consistent with the Modula notion of "compatible".
Therefore, instead of just 

   SubrangeType = "[" ConstExpr ".." ConstExpr "]"

the language would have a rule like

   SubrangeType = [BaseType] "[" ConstExpr ".." ConstExpr "]"

Only if the programmer declined to specify a BaseType
would the current rule about using a negative lower bound be applied
to select between cardinal and integer.
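
Under the proposed rule, declarations might read (hypothetical syntax, not
legal Modula-2 as currently defined):

   TYPE
      Counter = CARDINAL [0 .. 255];  (* explicitly a subrange of cardinal *)
      Offset  = INTEGER  [0 .. 255];  (* same bounds, a subrange of integer *)
      Small   = [0 .. 255];           (* no base type: the default rule applies *)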

This seems to be especially important because cardinals and integers
cannot be mixed in expressions.  This strikes me as a cop-out by Wirth.
Selecting a good rule for combining the two types in expressions
might be tricky [I haven't given it much thought],
but it seems analogous to the integer vs. floating-point combination
problem that has been around for years.
Is forcing the use of type-transfer functions reasonable in practice?
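
For reference, what mixing currently requires is an explicit transfer, e.g.
(a sketch using the standard INTEGER and CARDINAL transfer functions):

   VAR
      i, r : INTEGER;
      c    : CARDINAL;
   ...
   (* r := i + c; *)        (* illegal: the operand types cannot be mixed *)
      r := i + INTEGER(c);  (* legal, but the programmer must ensure that
                               c's value also fits in an INTEGER          *)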

The definition of MaxInt and related values is prefaced in the Report,
p. 145, with the phrase "For implementations on 16-bit computers".
I note that Wirth assumes 2's-complement representation,
because MinInt is given as -32768, not -32767.
I believe that only the symmetrical range should be standard,
with the extra (nonsymmetrical) number allowed as an extension.
I would have preferred that these values be obtained
by inquiry functions, as in Algol 68.
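
In Modula-2 terms, the Algol 68 style would amount to constants exported by
a standard module rather than numbers fixed in the Report (hypothetical; no
such module exists in the current definition):

   FROM Environment IMPORT MaxInt;   (* hypothetical standard module *)
   TYPE
      Natural = [0 .. MaxInt];       (* portable: no 32767 written into the source *)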

By the way, I know of at least one machine that has *signed* integer
addresses and *signed* 'program counter' arithmetic: 
the ELXSI 6400, which, like the VAX, has a 32-bit address space.
It turns out that signed addresses cause no problems to speak of, 
no more so than east vs. west street addresses do.
However, treating ELXSI address bit patterns simplemindedly as cardinals
would be a problem.

-- Clay Phipps

-- 
   {allegra,amd70,cbosgd,dsd,floyd,harpo,hpda,ihnp4,
    megatest,nsc,oliveb,sri-unix,twg,varian,VisiA,wdl1}
   !fortune!phipps

rcd@opus.UUCP (03/08/84)

 > . . . cardinals and integers
 > cannot be mixed in expressions.  This strikes me as a cop-out by Wirth.
 > Selecting a good rule for combining the two types in expressions
 > might be tricky [I haven't given it much thought],
 > but it seems analogous to the integer vs. floating-point combination
 > problem that has been around for years.
No, not a cop-out.  The problem is that you can't tell what the result type
should be.  The analogy with the conversion to floating point doesn't quite
work - normally there exists a floating point value which corresponds
correctly to every integer (possibly with some loss of accuracy if maxint
is larger than the fraction part can hold).  This means that floating point
can be treated as the 'dominant type' - whenever integral and floating are
combined, the result is floating.  But in the case of cardinals and
integers (of a given length, say), the ranges aren't nested, so there isn't
an obvious dominant type.  Should (integer+cardinal) be integral or
cardinal?  Some contexts are strong enough to tell - e.g., if the result is
assigned directly to a variable, use the type of the variable.  Not all
contexts have enough information to make the decision.
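
A concrete instance of the difficulty, assuming 16-bit INTEGER and CARDINAL:

   VAR
      i : INTEGER;    (* -32768 .. 32767 *)
      c : CARDINAL;   (*      0 .. 65535 *)
   ...
   (* i + c:
        i = -100, c = 50     ->   -50    fits INTEGER, not CARDINAL
        i = 1,    c = 40000  ->  40001   fits CARDINAL, not INTEGER
      Neither type can hold every possible sum, so only a context such as
      direct assignment to a variable settles the choice. *)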

The situation you want to avoid is having to make some decision at runtime
as to the type.  I suspect that Wirth may have been thinking of this
problem, since an analogous glitch in Algol 60 created real problems.
(That is the situation of exponentiation, i^j, with i and j integral, in
which the result type is integral if j>0 and real if j<0.)
-- 

{hao,ucbvax,allegra}!nbires!rcd