[comp.lang.misc] C strongly typed?

sommar@enea.se (Erland Sommarskog) (03/07/90)

Henry Spencer (henry@utzoo.uucp) writes:
>Modern
>C is a strongly-typed language by any reasonable definition, although
>there are still a lot of antique compilers around that don't fully
>enforce its rules.

C strongly typed? If I write something like: (I don't speak C
so the syntax is probably bogus.)

    typedef apple int;
    typedef orange int;
    apple a;
    orange b;
    ...
    a = b;

Will a "modern" compiler object?
-- 
Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se

ark@alice.UUCP (Andrew Koenig) (03/07/90)

In article <849@enea.se>, sommar@enea.se (Erland Sommarskog) writes:

> C strongly typed? If I write something like: (I don't speak C
> so the syntax is probably bogus.)

>     typedef apple int;
>     typedef orange int;
>     apple a;
>     orange b;
>     ...
>     a = b;

> Will a "modern" compiler object?

I don't understand why this is relevant.

If in Standard ML I write

	type apple = int;
	type orange = int;

	val a: apple = 1;
	val b: orange = a;

the compiler won't object either.

Does this mean Standard ML is not strongly typed?
-- 
				--Andrew Koenig
				  ark@europa.att.com

henry@utzoo.uucp (Henry Spencer) (03/08/90)

In article <849@enea.se> sommar@enea.se (Erland Sommarskog) writes:
>>Modern
>>C is a strongly-typed language by any reasonable definition...
>
>C strongly typed? If I write something like...
>    typedef apple int;
>    typedef orange int;
>    apple a;
>    orange b;
>    ...
>    a = b;
>
>Will a "modern" compiler object?

No, because the somewhat-misnamed "typedef" explicitly declares a synonym,
not a new type.  However, if you write something like:

	char *p;
	int a;
	...
	a = p;

any modern compiler will object.  C's type system is not extensible unless
you count "struct", but the language is strongly typed -- mixing random
types is not allowed.
-- 
MSDOS, abbrev:  Maybe SomeDay |     Henry Spencer at U of Toronto Zoology
an Operating System.          | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
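
A minimal sketch of the distinction drawn above, assuming an ANSI C
compiler.  Note the declaration order C actually wants: the existing
type first, the new name second.  The helper name as_apple is made up
for illustration.

```c
typedef int apple;    /* "apple" is another name for int, not a new type */
typedef int orange;   /* likewise */

/* Hands an orange back as an apple.  No cast is needed, because both
   names denote the same type. */
apple as_apple(orange b)
{
    apple a = b;      /* accepted without complaint: int = int */
    return a;
}

/* By contrast, the following fragment violates the constraints on simple
   assignment in ANSI C, so a conforming compiler must issue a diagnostic:

       char *p;
       int   n;
       n = p;
*/
```

Calling as_apple(3) simply yields 3; the typedef names leave no trace in
the checking.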

jlg@lambda.UUCP (Jim Giles) (03/08/90)

From article <849@enea.se>, by sommar@enea.se (Erland Sommarskog):
> [...]
> C strongly typed? If I write something like: (I don't speak C
> so the syntax is probably bogus.)
>     typedef apple int;
>     typedef orange int;
>     apple a;
>     orange b;
>     ...
>     a = b;

Yes C is strongly typed - by the definition of 'strong typing'.  The
phrase 'strong typing' means that the type of any object in a scope
can be determined at compile time.  So, in the example you gave, it is
quite trivial to determine the data types of all the objects given just
by examining the text.  C is not only strongly typed, but it requires
explicit declarations of everything.

You are confusing 'strong typing' with 'strict type checking'.  The
latter term refers to languages which discourage (or even disallow)
any mixed-mode operations without _explicit_ type coercions.  To be
sure, strict typing is easier to do if the language is also strongly
typed - this is probably how this confusion of terms (which is common)
originally arose.  But a strict language isn't necessarily strongly
typed.  Any language which allows late binding is (again by definition)
_not_ strongly typed - but such a language may still restrict mixed-mode
operations; it would just have to do all the checking at run-time.

J. Giles

ok@goanna.oz.au (Richard O'keefe) (03/08/90)

In article <849@enea.se>, sommar@enea.se (Erland Sommarskog) writes:
> C strongly typed?  If I write something like: (I don't speak C
> so the syntax is probably bogus.)

>     typedef apple int;
>     typedef orange int;
>     apple a;
>     orange b;
>     ...
>     a = b;

> Will a "modern" compiler object?

No, of course not.  Try it in Pascal:
	program main;
	type
	    apple  = integer;
	    orange = integer;
	var
	    a: apple;
	    o: orange;
	begin
	    a := o;
	end.
A Pascal compiler may complain that "o" is uninitialised, but it
*must* accept the assignment as well-typed.  (I tried it.)

Try it in Ada (not checked, as I haven't access to an Ada compiler):
	declare
	    subtype apple  is integer;
	    subtype orange is integer;
	    a: apple;
	    o: orange := 1;
	begin
	    a := o;
	end
Again, the assignment is well-typed.  Why should C be different?

bvickers@ics.uci.edu (Brett J. Vickers) (03/08/90)

In article <1990Mar7.182230.5517@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>                                  C's type system is not extensible unless
>you count "struct", but the language is strongly typed -- mixing random
>types is not allowed.

This is simply not true.

foo()
{
  int a;
  char c;

  a = 48;
  c = a;
}

As far as I know, this will compile.  C is extremely flexibly typed.
If you want a strongly typed language, use Ada.

--
bvickers@bonnie.ics.uci.edu

machaffi@fred.cs.washington.edu (Scott MacHaffie) (03/08/90)

In article <14262@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:
%Yes C is strongly typed - by the definition of 'strong typing'.  The
%phrase 'strong typing' means that the type of any object in a scope
%can be determined at compile time.  So, in the example you gave, it is

You just described "static typing".

		Scott MacHaffie

nmm@cl.cam.ac.uk (Nick Maclaren) (03/08/90)

Henry Spencer writes:

> ....  C's type system is not extensible unless
> you count "struct", but the language is strongly typed -- mixing random
> types is not allowed.

I am afraid that I must disagree with this.  While mixing RANDOM types is
not allowed, even ANSI C permits a bewildering variety of type punning.
For example:

    double x, y;
    memcpy(&x,&y,sizeof(double));

The ANSI standard (and K&R) explicitly require that any data type may be
treated as an array of characters (under certain circumstances, such as
the above).  A huge proportion of the library relies upon this to work at
all (e.g. much of string.h, some of stdlib.h, some of stdio.h).

    union {void *a; char *b;} fred;
    fred.a = ...;
    ... = fred.b;

This example is a curiosity in ANSI C:  while it is illegal, and the compiler
is entitled to throw it out, it is also required to work!  The reason is that
'char *' and 'void *' are different types but are required to have the same
representation and alignment.

There are a large number of more obscure cases, many of which are relied
upon by traditional C programs.  Good, portable ones avoid such constructions
if at all possible, but sometimes their use is essential.

Nick Maclaren
University of Cambridge Computer Laboratory
nmm@cl.cam.ac.uk
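
The "array of characters" view described above can be sketched as
follows.  This is an illustration only; the function name round_trip is
hypothetical, and unsigned char is chosen as the character type so that
every bit pattern is a valid value.

```c
#include <string.h>

/* Copies a double into a buffer of unsigned char and back again.
   Treating the object as raw bytes is what memcpy relies on. */
double round_trip(double y)
{
    unsigned char bytes[sizeof(double)];
    double x;

    memcpy(bytes, &y, sizeof y);   /* object -> raw bytes */
    memcpy(&x, bytes, sizeof x);   /* raw bytes -> object */
    return x;
}
```

The value that comes back compares equal to the value that went in,
since the bytes were copied unchanged.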

mjs@hpfcso.HP.COM (Marc Sabatella) (03/09/90)

>C strongly typed? If I write something like: (I don't speak C
>so the syntax is probably bogus.)
>
>    typedef apple int;
>    typedef orange int;
>    apple a;
>    orange b;
>    ...
>    a = b;
>
>Will a "modern" compiler object?

No.  C uses (mostly) "structural equivalence" for determining when two types
are compatible.  You seem to be saying that "strong typing" implies "name
equivalence", which is not the way we learned it.  There are three sets of
completely orthogonal distinctions here:

strong vs weak typing (does everything have a type at compile time?)
name vs structural equivalence (when are two types equivalent?)
strict vs non-strict type checking (does the compiler allow type mismatches?)

C is strong, non-strict, and structural (although ANSI is quite a bit stricter
than K&R, it still has automatic type promotions).

Marc

ftw@quasar..westford.ccur.com (Farrell Woods) (03/09/90)

In article <25F5AA40.27091@paris.ics.uci.edu> bvickers@ics.uci.edu (Brett J. Vickers) writes:
>In article <1990Mar7.182230.5517@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>>                                  C's type system is not extensible unless
>>you count "struct", but the language is strongly typed -- mixing random
>>types is not allowed.

>This is simply not true.

[Example deleted]

Henry's right!  The point is that `char' and `int' (and, `short' and `long')
all describe *integer* quantities.  It's just that the range of values which
each of these "types" can hold differs due to the amount of storage allocated
to a variable of a given type.


	-- Farrell Woods

--
Farrell T. Woods				Voice:  (508) 392-2471
Concurrent Computer Corporation			Domain: ftw@westford.ccur.com
1 Technology Way				uucp:   {backbones}!masscomp!ftw
Westford, MA 01886				"I can't drive...fifty-five!"

merriman@ccavax.camb.com (03/09/90)

In article <1990Mar7.182230.5517@utzoo.uucp>, henry@utzoo.uucp (Henry Spencer) writes:
> However, if you write something like:
> 
> 	char *p;
> 	int a;
> 	...
> 	a = p;
> 
> any modern compiler will object.  C's type system is not extensible unless
> you count "struct", but the language is strongly typed -- mixing random
> types is not allowed.

I guess that lets VAX C out as a modern compiler.  The following compiles
without complaint:

#include stdio

int i;
char c;
int *ip;
char *cp;

main()

{
i = 700;
c = i;

printf("%d %d\n", i, c);

ip = &i;

cp = ip;

printf("%d %d\n", *cp, *ip);

i = cp;

ip = i;

printf("%d %d\n", *cp, *ip);

}

and produces the following output:

700 -68
-68 700
32 544

Anyone care to explain the last line of output?

BTW, I included the int to char assignment to demonstrate what I consider
to be a really obnoxious and dangerous shortcoming (at least in VAX C).
Do real C compilers allow the same thing, without comment?
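
One plausible reading of that output, assuming an 8-bit signed char, a
little-endian machine such as the VAX, and that i happened to live at
address 544 (an inference from the numbers, not something the post
states).  The two helper names below are hypothetical.

```c
/* Low-order 8 bits of a value, as a number in 0..255.  Conversion to
   unsigned char is defined as reduction modulo 256 when CHAR_BIT is 8. */
int low_byte(int v)
{
    return (int)(unsigned char)v;
}

/* The same 8-bit pattern read as a two's-complement signed quantity. */
int as_signed_byte(int b)
{
    return b >= 128 ? b - 256 : b;
}
```

With these, as_signed_byte(low_byte(700)) is -68, which matches both
"c = i" and "*cp" in the first two lines.  After "i = cp", i holds its
own address; "ip = i" then points ip back at i, so *ip prints that
address (544) and *cp prints its low byte, as_signed_byte(low_byte(544)),
which is 32.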

steven@cwi.nl (Steven Pemberton) (03/09/90)

In article <8960013@hpfcso.HP.COM> mjs@hpfcso.HP.COM (Marc Sabatella) writes:
> C uses (mostly) "structural equivalence" for determining when two types
> are compatible.  You seem to be saying that "strong typing" implies "name
> equivalence", which is not the way we learned it.  There are three sets of
> completely orthogonal distinctions here
> 
> strong vs weak typing (does everything have a type at compile time?)
> name vs structural equivalence (when are two types equivalent?)
> strict vs non-strict type checking (does the compiler allow type mismatches?)
> 
> C is strong, non-strict, and structural

I prefer another definition of strongly-typed.

My term for "does everything have a type at compile time" is
"statically typed", and I use "strongly typed" for: to what extent
does the compiler disallow illegal operations on values at
compile-time.

Examples of 'illegal operations' are: indexing a non-array or
selecting a field from a non-structure, dereferencing a non-pointer,
assigning incompatible types, and so on.

The fact that languages only check certain subsets of all possible
illegal operations allows you to talk in terms of degrees of strong
typing: certain languages are weakly typed, others are strongly typed;
you couldn't do this with the definition of strongly typed as
"whether everything is statically typed".

Again in my opinion, most languages that are today called
strongly-typed are really only firmly-typed, because there are still
loads of illegal operations that are only identified at run-time.
Examples of operations that could be reduced to compile-time type
errors are: dereferencing nil, array indexing errors and sub-range
errors in general.

On the name equivalence front, I believe that name equivalence is in
general more useful from an abstract point of view than structure
equivalence, since it allows you to say:
	type height: integer
	type weight: integer

	height h; weight w;

and to let the compiler prevent you from accidentally saying:

	h= w

(Note, I'm not proposing that the compiler prevent you from assigning w
to h, only from you doing it accidentally; a well-designed language
with strong typing should allow you to do anything you want, and as
little as possible you don't want).

Indeed, C doesn't use entirely structure equivalence, because you can
always say
	typedef struct{int height_value;} height;
	typedef struct{int weight_value;} weight;

	height h;
	weight w;

	h.height_value= 175;
	w.weight_value= 60;

and the compiler complains about
	w= h;

But still, who goes to this trouble?

----------------
Disclaimer 1:
I appreciate that saying that dereferencing nil can be reduced to a
static typing problem is going to start a flurry in this group. For
this I apologise. To try and reduce the clamour, let me just say in
advance to anyone contemplating replying "This is impossible, because
it's equivalent to the halting problem": if you think this is so, you
don't understand the halting problem properly. Try and show that
static type checking is equivalent to the halting problem using the
same arguments, and I think you'll see the problem. Note that I most
certainly did not say "you can write a program that confirms that a
given C program is dereference safe".

Disclaimer 2:
I think that most languages that are today called "high-level" are
actually only "medium-level", too.

Disclaimer 3:
You could make the definition of strongly typed orthogonal by removing
the bit about "at compile-time". Then you could talk about "strong
dynamic typing", "weak static typing", and so on. This however would
be too much at odds with current usage, I think.

Steven Pemberton, CWI, Amsterdam; steven@cwi.nl
   Waiting for a compilation to finish, the C programmer switched the TV on.
   "Shoot first, ask questions later!" yelled the sheriff.
   At this the programmer gained enlightenment.

jejones@mcrware.UUCP (James Jones) (03/09/90)

In article <1990Mar7.182230.5517@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>any modern compiler will object.  C's type system is not extensible unless
>you count "struct", but the language is strongly typed -- mixing random
>types is not allowed.

Eh?  C will let you coerce essentially anything.  If all else fails, coerce the
address of the object about whose type you wish to lie, and then dereference.

	James Jones

jlg@lambda.UUCP (Jim Giles) (03/10/90)

In article <11007@june.cs.washington.edu>, machaffi@fred.cs.washington.edu (Scott MacHaffie) writes:
- In article <14262@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:
- %Yes C is strongly typed - by the definition of 'strong typing'.  The
- %phrase 'strong typing' means that the type of any object in a scope
- %can be determined at compile time.  So, in the example you gave, it is
- 
- You just described "static typing".

Yes, I did.  The two terms are synonymous.  English _does_ have this
annoying habit of providing more than one word for a given meaning.
This is even true in technical jargon.

The fact is that strong/weak typing is defined (at least in the language
design field) as the distinction between compile-time and run-time type
specification.  If you prefer 'static' to 'strong' that is your choice.

J. Giles

sommar@enea.se (Erland Sommarskog) (03/10/90)

Henry Spencer (henry@utzoo.uucp) writes:
)No, because the somewhat-misnamed "typedef" explicitly declares a synonym,
)not a new type.  However, if you write something like:
)
)	char *p;
)	int a;
)	...
)	a = p;
)
)any modern compiler will object.  C's type system is not extensible unless
)you count "struct", but the language is strongly typed -- mixing random
)types is not allowed.

Well, apparently I am allowed to mix apples and oranges. If I have
two types of data that both happen to be represented by integers
but have no logical connection whatsoever, I apparently cannot
express that in C. And consequently I cannot get help from the
compiler to catch inadvertent mixups in, for instance, procedure calls.

-- 
Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se

mike@cs.umn.edu (Mike Haertel) (03/10/90)

In article <862@enea.se> sommar@enea.se (Erland Sommarskog) writes:
>Well, apparently I am allowed to mix apples and oranges. If I have
>two types of data that both happen to be represented by integers
>but have no logical connection whatsoever, I apparently cannot
>express that in C. And consequently I cannot get help from the
>compiler to catch inadvertent mixups in, for instance, procedure calls.

So declare

struct apple { int v; };
struct orange { int v; };

C uses name equivalence for structure types.
-- 
Mike Haertel <mike@ai.mit.edu>
"We are trying to support small memory machines." -- Larry McVoy

CMH117@psuvm.psu.edu (Charles Hannum) (03/10/90)

In article <862@enea.se>, sommar@enea.se (Erland Sommarskog) says:
>
>Well, apparently I am allowed to mix apples and oranges. If I have
>two types of data that both happen to be represented by integers
>but have no logical connection whatsoever, I apparently cannot
>express that in C. And consequently I cannot get help from the
>compiler to catch inadvertent mixups in, for instance, procedure calls.


Well, yes, the following does compile with no problems:

  typedef enum {
      SKIN, CORE
  }   apple;

  typedef enum {
      PEEL, SEED
  }   orange;

  apple  grannysmith;
  orange tangerine;

  int main(void) {
      grannysmith = tangerine;
  }


But if I may ask, what's your point?  Any programmer with half a brain
would know that a Granny Smith isn't equivalent to a tangerine.

I like C precisely because it DOESN'T hold my hand.


Virtually,
- Charles Martin Hannum II       "Klein bottle for sale ... inquire within."
    (That's Charles to you!)     "To life immortal!"
  cmh117@psuvm.{bitnet,psu.edu}  "No noozzzz izzz netzzzsnoozzzzz..."
  c9h@psuecl.{bitnet,psu.edu}    "Mem'ry, all alone in the moonlight ..."

throopw@sheol.UUCP (Wayne Throop) (03/11/90)

> From: steven@cwi.nl (Steven Pemberton)
> I use "strongly typed" for: to what extent
> does the compiler disallow illegal operations on values at
> compile-time.

A reasonable position.  I largely agree.

> Examples of 'illegal operations' are: indexing a non-array or
> selecting a field from a non-structure, dereferencing a non-pointer,
> assigning incompatible types, and so on.

C passes all of these examples.

On the other hand, I wouldn't call these "illegal", or the other
examples Steven gives.  They could be made "legal" by a language
standard.  Perhaps they ought to be called "immoral"? "Unethical"?
Whatever...  perhaps "surprising" would suffice. 

> The fact that languages only check certain subsets of all possible
> illegal operations allows you to talk in terms of degrees of strong
> typing: certain languages are weakly typed, others are strongly typed;

Right.  Very good feature of this definition.

> Examples of operations that could be reduced to compile-time type
> errors are: dereferencing nil, array indexing errors and sub-range
> errors in general.

Well, since dereferencing nil is a sort of range check, I suppose
that it could be checked for at compile time as easily as the others.
But it seems to me that all of these can be reduced to solving
the halting problem.  Even if I'm wrong about that, the problem
is quite a bit beyond the current state of the art of static
flow-of-control analysis, is it not?

ANYway, based on Steven's definition, I'd say that C is fairly strongly
typed compared to other Algol relatives, but that most people simply
don't run the static checking phase of the compiler.
--
Wayne Throop <backbone>!mcnc!rti!sheol!throopw or sheol!throopw@rti.rti.org

CMH117@psuvm.psu.edu (Charles Hannum) (03/11/90)

In article <1798@gannet.cl.cam.ac.uk>, nmm@cl.cam.ac.uk (Nick Maclaren) says:
>
>The ANSI standard (and K&R) explicitly require that any data type may be
>treated as an array of characters ...
>the above).

This is simply not true!  See below ...


>             A huge proportion of the library relies upon this to work at
>all (e.g. much of string.h, some of stdlib.h, some of stdio.h).

Well, everything in string.h works with arrays of characters; what's your
point?  As for stdlib.h and stdio.h, see comments on the void type below ...


>    union {void *a; char *b;} fred;
>    fred.a = ...;
>    ... = fred.b;

What is wrong with this?  First, this would only work if "..." was another
void pointer or a char pointer.  You seem to be bashing C for the inclusion
of a void type ...

"void" in C is similar to "nil" in LISP.  It's just a generic type, which can
hold the place of any other type, WHEN USED IN A POINTER!!  I couldn't assign
a character to a void type, for example.  But I *can* assign a void pointer
to a character pointer, and vice versa.  This is part of the power of the C
language.  It simplifies things like malloc(), for example.

And void has *nothing* to do with char!  Try this:

  #include <stdio.h>

  int main(void) {
      printf("%d %d\n",sizeof(void),sizeof(char));
  }


(Actually, my code sample above *does* show a deficiency in the C language;
 let's see if anyone can figure out what it is.)


>[rest of gibberish deleted]

Personally, I'm more concerned about the fact that this stupid terminal's "0"
key on the keypad isn't working.  It causes a lot more problems than a void
type.


Virtually,
- Charles Martin Hannum II       "Klein bottle for sale ... inquire within."
    (That's Charles to you!)     "To life immortal!"
  cmh117@psuvm.{bitnet,psu.edu}  "No noozzzz izzz netzzzsnoozzzzz..."
  c9h@psuecl.{bitnet,psu.edu}    "Mem'ry, all alone in the moonlight ..."

sommar@enea.se (Erland Sommarskog) (03/11/90)

Mike Haertel (mike@cs.umn.edu) writes:
)So declare
)
)struct apple { int v; };
)struct orange { int v; };
)
)C uses name equivalence for structure types.

Did I hear "kludge"? (But it should be acknowledged that not all
Pascal compilers would keep such apples and oranges above apart.).

-- 
Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se

sommar@enea.se (Erland Sommarskog) (03/11/90)

Charles Hannum (CMH117@psuvm.psu.edu) writes:
>But if I may ask, what's your point?  Any programmer with half a brain
>would know that a Granny Smith isn't equivalent to a tangerine.

Stupidity. Ever heard of the word "mistake"? Say that I have the
routine Macedonia declared as (in Ada syntax):

   FUNCTION Macedonia(Granny_smith : IN Apple;
                      Tangerine    : IN Orange) RETURN Some_type;

Now, I'm calling this from another module. In which order do the
parameters come now? Ah, it was probably the orange first, wasn't
it? A strongly typed language like Ada will catch this error.
With C or Pascal I have to spend half a day to find out why the
damned fruit salad doesn't taste as intended.
-- 
Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se
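
For comparison, a sketch of how the Macedonia routine might look in C if
the struct wrappers suggested earlier in the thread are used.  The field
name v, the return type, the body, and the helper make_salad are all
assumptions made for illustration.

```c
struct apple  { int v; };
struct orange { int v; };

/* Hypothetical C counterpart of the Ada declaration above. */
int macedonia(struct apple granny_smith, struct orange tangerine)
{
    return granny_smith.v * 10 + tangerine.v;
}

int make_salad(void)
{
    struct apple  a = { 1 };
    struct orange o = { 2 };

    /* macedonia(o, a);   rejected at compile time: struct orange is not
                          compatible with struct apple, and vice versa */
    return macedonia(a, o);
}
```

With the parameters in the right order make_salad() returns 12; with
them swapped the program does not compile at all, which is the help from
the compiler being asked for, at the price of writing ".v" everywhere.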

CMH117@psuvm.psu.edu (Charles Hannum) (03/11/90)

In article <873@enea.se>, sommar@enea.se (Erland Sommarskog) says:
>
>Stupidity. Ever heard of the word "mistake"? Say that I have the
>routine Macedonia declared as (in Ada syntax):
>
>   FUNCTION Macedonia(Granny_smith : IN Apple;
>                      Tangerine    : IN Orange) RETURN Some_type;
>
>Now, I'm calling this from another module. In which order do the
>parameters come now? Ah, it was probably the orange first, wasn't
>it? A strongly typed language like Ada will catch this error.
>With C or Pascal I have to spend half a day to find out why the
>damned fruit salad doesn't taste as intended.

Your point is well taken.  While I do often find C's liberal number conversion
useful, it is also annoying that I can't define a new type without defining
it as a structure (or union).

Oh well.  Fortunately, I haven't ever run into an application where I really
desperately needed name equivalence in anything but structures.  Perhaps some
day I will.

Then I'll switch to Eiffel.


Virtually,
- Charles Martin Hannum II       "Klein bottle for sale ... inquire within."
    (That's Charles to you!)     "To life immortal!"
  cmh117@psuvm.{bitnet,psu.edu}  "No noozzzz izzz netzzzsnoozzzzz..."
  c9h@psuecl.{bitnet,psu.edu}    "Mem'ry, all alone in the moonlight ..."

lgm@cbnewsc.ATT.COM (lawrence.g.mayka) (03/12/90)

In article <90070.034113CMH117@psuvm.psu.edu> CMH117@psuvm.psu.edu (Charles Hannum) writes:
>"void" in C is similar to "nil" in LISP.  It's just a generic type, which can
>hold the place of any other type, WHEN USED IN A POINTER!!  I couldn't assign

Some clarification is called for here.  In C, 'void' (*not* 'void *')
is used in function declarations and definitions to signify the
absence of arguments and/or a return value.  In this role, Common
Lisp's closest analog is the expression

	(VALUES)

which returns zero values.  (The VALUES form is more often used to
return multiple values.)  NIL is quite different.  In Common Lisp,
NIL is a specific object: the only object of type NULL, which is
in turn a subtype of both SYMBOL and LIST.  NIL's existence is
just as tangible as that of the number 0.  NIL does, of course,
fulfil some special roles (e.g., as the default initializer for a
name binding).

C's 'void *', on the other hand, plays the role of a generic
pointer type, as you say.  Its closest Common Lisp analog is the
type T, which is a supertype of every type.  If I were forced at
gunpoint to write type declarations for a Lisp function's
arguments, I would declare each of them to be of type T, which
essentially asserts existence but nothing more.

Despite the syntax, 'void *' has no semantic connection to 'void'
at all as far as I can see.  Indeed, they are almost opposites:
"anything" vs. "nothing." Apparently, C compiler writers simply
decided to apply some new semantics to whatever unused syntax was
lying around.


	Lawrence G. Mayka
	AT&T Bell Laboratories
	lgm@ihlpf.att.com

Standard disclaimer.

CMH117@psuvm.psu.edu (Charles Hannum) (03/12/90)

In article <14318@cbnewsc.ATT.COM>, lgm@cbnewsc.ATT.COM (lawrence.g.mayka) says:
>
>Despite the syntax, 'void *' has no semantic connection to 'void'
>at all as far as I can see.  Indeed, they are almost opposites:
>"anything" vs. "nothing." Apparently, C compiler writers simply
>decided to apply some new semantics to whatever unused syntax was
>lying around.

As they did, but less effectively, with the keyword "static".


Virtually,
- Charles Martin Hannum II       "Klein bottle for sale ... inquire within."
    (That's Charles to you!)     "To life immortal!"
  cmh117@psuvm.{bitnet,psu.edu}  "No noozzzz izzz netzzzsnoozzzzz..."
  c9h@psuecl.{bitnet,psu.edu}    "Mem'ry, all alone in the moonlight ..."

gudeman@cs.arizona.edu (David Gudeman) (03/13/90)

In article  <0501@sheol.UUCP> throopw@sheol.UUCP (Wayne Throop) writes:
>> From: steven@cwi.nl (Steven Pemberton)
>
>> Examples of operations that could be reduced to compile-time type
>> errors are: dereferencing nil, array indexing errors and sub-range
>> errors in general.
>
>Well, since dereferencing nil is a sort of range check, I suppose
>that it could be checked for at compile time as easily as the others.
>But it seems to me that all of these can be reduced to solving
>the halting problem.  Even if I'm wrong about that, the problem
>is quite a bit beyond the current state of the art of static
>flow-of-control analysis, is it not?

I didn't reply to the original because I expected several other people
to reply.  In fact, dereferencing nil and range checks _can_ be
reduced to the halting problem.  The trick is to insert a
nil-dereference/range violation at each exit point.  More obviously,
how do you check the following?

	i := read_integer(input_file);
	x := a[i];

The answer is that you can't.  You have to know what the input to the
program is going to be.

However, you _can_ do such an analysis making worst-case or best-case
assumptions, getting an approximation to the answer.  That is, there
are a lot of cases where you can statically analyse a program and say
at a given place that (worst-case) ``there are some inputs for which
this may be an error'', or that (best-case) ``for all inputs this
will be an error''.  What you cannot do is say that ``there are some
inputs for which this will be an error'', and guarantee that you have
found all points in the program at which this is true.
-- 
					David Gudeman
Department of Computer Science
The University of Arizona        gudeman@cs.arizona.edu
Tucson, AZ 85721                 noao!arizona!gudeman
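
When the index comes from input, as in the a[i] example above, the check
has to happen at run time.  A minimal sketch of such a check in C; the
function name checked_get and its interface are hypothetical.

```c
#include <stddef.h>

/* Stores a[i] in *out and returns 1 if i is a valid index into an array
   of n elements; returns 0 and leaves *out untouched otherwise. */
int checked_get(const int *a, size_t n, long i, int *out)
{
    if (i < 0 || (unsigned long)i >= n)
        return 0;       /* the value of i is only known once input is read */
    *out = a[i];
    return 1;
}
```

A static analyser could at best report that the unchecked form "may be
an error for some inputs"; the explicit test turns that into a decided
outcome for each actual input.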

news@ism780c.isc.com (News system) (03/13/90)

In article <862@enea.se> sommar@enea.se (Erland Sommarskog) writes:
>
>Well, apparently I am allowed to mix apples and oranges. If I have
>two types of data that both happen to be represented by integers
>but have no logical connection whatsoever, I apparently cannot
>express that in C. And consequently I cannot get help from the
>compiler to catch inadvertent mixups in, for instance, procedure calls.
>

I am unaware of any commonly available language that prevents this form of
mistake.  Look at the following:

  double distance;
  double time;
  double velocity;

  velocity = distance/time;  /* this makes sense */
  velocity = distance+time;  /*  I mixed 'apples' and 'oranges' and produced
				 a lemon :-) */

I did read a paper (sorry, I don't have the reference) describing a language
that allowed one to augment the type declaration with a units declaration
so as to be able to catch errors of this form.

   Marv Rubinstein
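
Short of a language with real units declarations, the struct trick
discussed elsewhere in this thread gives a rough approximation in C.
This is a sketch only; the type names, field names, and the function
divide are all made up for illustration.

```c
struct distance { double metres; };
struct duration { double seconds; };
struct velocity { double metres_per_second; };

/* The one sanctioned way to combine a distance and a duration. */
struct velocity divide(struct distance d, struct duration t)
{
    struct velocity v;

    v.metres_per_second = d.metres / t.seconds;
    return v;
}

/* Given "struct distance d; struct duration t;", the expression d + t
   does not compile, because + is not defined for structure operands. */
```

This catches the "distance+time" lemon, but it knows nothing about
dimensions: every legal combination needs its own hand-written function.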

rimvallm@jupiter.crd.ge.com (Magnus Rimvall) (03/14/90)

In article <39941@ism780c.isc.com> marv@ism780.UUCP (Marvin Rubenstein) writes:
>I am unaware of any commonly available language that prevents this form of
>mistake.  Look at the following:
>
>  double distance;
>  double time;
>  double velocity;
>
>  velocity = distance/time;  /* this makes sense */
>  velocity = distance+time;  /*  I mixed 'apples' and 'oranges' and produced
>				 a lemon :-) */
>
>I did read a paper (sorry, I don't have the reference) describing a language
>that allowed one to augment the type declaration with a units declaration
>so as to be able to catch errors of this form.
>
>   Marv Rubinstein

The problem with mismatching units is particularly bothersome in
the area of continuous system simulation, where the whole
program/model consists of equations taking units.  Some
simulation programs do indeed support unit checking (a new version
of the simulation language CSSL-IV which supported unit
declarations/checks was announced some time ago - this might have
been the paper Marv read).

Unit checking is, at least in the USA, only half the
battle.  Until the metric system is adopted, we would also need
automatic scaling of units.  (Even the *dumbest* unit checker
could learn that a kW is equal to 1000 W without knowing what a
W itself is.  Not even intelligent human beings can know for sure
how much a "ton" or a "gallon" really is ... but this does not really
belong in comp.*, so flame me in sci.misc.)

Magnus Rimvall

Disclaimer: This letter does not necessarily reflect the opinions
of anybody else (though they should)

fischer@iesd.auc.dk (Lars P. Fischer) (03/14/90)

In article <862@enea.se> sommar@enea.se (Erland Sommarskog) writes:
>Well, apparently I am allowed to mix apples and oranges. If I have
>two types of data that both happen to be represented by integers
>but have no logical connection whatsoever, I apparently cannot
>express that in C.

True. C has type synonyms, but you cannot introduce new types. It's a
pain at times, but you learn to live with it. Note that this does
*not* mean that C is not strongly typed (it is). It means that there
are some type construction mechanisms that are not available in C.

>And consequently I cannot get help from the
>compiler to catch inadvertent mixups in, for instance, procedure calls.

However:

   banach> cat te.c
   typedef enum { o1, o2 } orange ;
   typedef enum { a1, a2 } apple ;

   orange o;
   apple a;

   void main ()
   {
	   o = o1;
	   a = o;
   }

   banach> lint te.c
   te.c(10): warning: enumeration type clash, operator =
   banach>

Use lint, not Ada.

/Lars
--
Lars Fischer,  fischer@iesd.auc.dk   | Q: How does a project get to be one
CS Dept., Univ. of Aalborg, DENMARK. | year late?  A: One day at a time.

sakkinen@tukki.jyu.fi (Markku Sakkinen) (03/14/90)

In article <39941@ism780c.isc.com> marv@ism780.UUCP (Marvin Rubenstein) writes:
> ...
>  velocity = distance/time;  /* this makes sense */
>  velocity = distance+time;  /*  I mixed 'apples' and 'oranges' and produced
>				 a lemon :-) */
>
>I did read a paper (sorry, I don't have the reference) describing a language
>that allowed one to augment the the type declaration with a units declaration
>so as to be able to catch errors of this form.

I think there has been more than one article in ACM SIGPLAN Notices
during the last two or three years that has suggested such a language
extension (to Pascal at least) in considerable detail.
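
The kind of check being discussed can be approximated in plain C,
though only at run time. A rough sketch, assuming a hypothetical
record that carries the exponents of length and time next to the
value (this is not the notation of any of the published proposals,
which do the check at compile time):

```c
/* Hypothetical dimensioned value: a number plus unit exponents. */
typedef struct {
    double value;
    int len_exp;   /* exponent of length, e.g. 1 for metres */
    int time_exp;  /* exponent of time, e.g. -1 for "per second" */
} dim;

static int same_dim(dim a, dim b)
{
    return a.len_exp == b.len_exp && a.time_exp == b.time_exp;
}

/* Division is always allowed; the exponents subtract. */
static dim dim_div(dim a, dim b)
{
    dim r;
    r.value = a.value / b.value;
    r.len_exp = a.len_exp - b.len_exp;
    r.time_exp = a.time_exp - b.time_exp;
    return r;
}

/* Addition is allowed only for equal dimensions.
   Returns 0 and leaves *out alone on a mismatch, 1 otherwise. */
static int dim_add(dim a, dim b, dim *out)
{
    if (!same_dim(a, b))
        return 0;
    a.value += b.value;
    *out = a;
    return 1;
}
```

Here distance/time yields a value with exponents (1, -1), which
makes sense as a velocity, while distance+time is refused.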

Markku Sakkinen
Department of Computer Science
University of Jyvaskyla (a's with umlauts)
Seminaarinkatu 15
SF-40100 Jyvaskyla (umlauts again)
Finland
          SAKKINEN@FINJYU.bitnet (alternative network address)

plogan@mentor.com (Patrick Logan) (03/14/90)

In article <90071.030339CMH117@psuvm.psu.edu> CMH117@psuvm.psu.edu
 >(Charles Hannum) writes:
 >  In article <14318@cbnewsc.ATT.COM>, lgm@cbnewsc.ATT.COM
 > (lawrence.g.mayka) says:
 >  >
 >  >Despite the syntax, 'void *' has no semantic connection to 'void'
 >  >at all as far as I can see.  Indeed, they are almost opposites:
 >  >"anything" vs. "nothing." Apparently, C compiler writers simply
 >  >decided to apply some new semantics to whatever unused syntax was
 >  >lying around.

 >   As they did, but less effectively, with the keyword "static".

C++ plays even more fun games with the keyword "static". In that
language "static" has the same uses as in C, and it is also the means
of introducing class-wide functions and data.
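
For reference, the two C uses can be shown side by side. A small
sketch with made-up names; the third, class-wide use exists only
in C++ and is not shown here.

```c
/* Use 1: at file scope, "static" means internal linkage.
   Neither name is visible from other translation units. */
static int calls;

static void note_call(void) { calls++; }

/* Use 2: inside a function, "static" means static storage duration.
   id is initialised once and keeps its value between calls. */
int next_id(void)
{
    static int id = 0;
    note_call();
    return ++id;
}

int call_count(void) { return calls; }
```

Same keyword, two unrelated effects: one controls who can see a
name, the other controls how long an object lives.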

 >   Virtually,
 >  - Charles Martin Hannum II       "Klein bottle for sale ... inquire within."

That's funny about "whatever unused syntax was lying around" because
it seems exactly what has been happening with C and C++.
-- 
Patrick Logan  uunet!mntgfx!plogan | 
Mentor Graphics Corporation        | 
Beaverton, Oregon 97005-7191   	   |

al@nmtsun.nmt.edu (Al Stavely) (03/15/90)

In article <3744@tukki.jyu.fi> sakkinen@jytko.jyu.fi (Markku Sakkinen) writes:
>In article <39941@ism780c.isc.com> marv@ism780.UUCP (Marvin Rubenstein) writes:
>>I did read a paper (sorry, I don't have the reference) describing a language
>>that allowed one to augment the the type declaration with a units declaration
>>so as to be able to catch errors of this form.
>
>I think there has been more than one article in ACM SIGPLAN Notices
>during the last two or three years that has suggested such a language
>extension (to Pascal at least) in considerable detail.


This is a moderately good but totally obvious idea, and language constructs
for doing this have been re-invented over and over again.  It's just that
no one has thought it significant enough to incorporate into a major language.

It might be amusing to count how many times this idea has been presented in
SIGPLAN Notices over the last *twenty* years.

- Allan Stavely, New Mexico Tech, USA   al@nmt.edu

firth@sei.cmu.edu (Robert Firth) (03/15/90)

In article <3965@nmtsun.nmt.edu> al@nmtsun.nmt.edu (Al Stavely) writes:
>In article <3744@tukki.jyu.fi> sakkinen@jytko.jyu.fi (Markku Sakkinen) writes:
>>In article <39941@ism780c.isc.com> marv@ism780.UUCP (Marvin Rubenstein) writes:
>>>I did read a paper (sorry, I don't have the reference) describing a language
>>>that allowed one to augment the the type declaration with a units declaration
>>>so as to be able to catch errors of this form.
>>
>>I think there has been more than one article in ACM SIGPLAN Notices
>>during the last two or three years that has suggested such a language
>>extension (to Pascal at least) in considerable detail.
>
>
>This is a moderately good but totally obvious idea, and language constructs
>for doing this have been re-invented over and over again.  It's just that
>no one has thought it significant enough to incorporate into a major language.

Well, I know of a minor programming language that allows one to
achieve most of what is required at a fairly low cost in language
feature overhead.  The concepts and their rationale are explained
in the "Rationale for the Design of the Ada programming language",
sections 7.2 and 7.3.  Examples there of types with implied units
are FRANC and MARK, DOLLAR and CENT, LENGTH and AREA.  You might
want to look it up.

hit n now, rest is junk to massage some bloody fool's ego

sorry
i
have
to
include
more
new
text
than
quoted
text
so
wasting
your
time
and
money

utoddl@uncecs.edu (Todd M. Lewis) (03/15/90)

In article <6475@bd.sei.cmu.edu> firth@sei.cmu.edu (Robert Firth) writes:
>Well, I know of a minor programming language that allows one to
>achieve most of what is required at a fairly low cost in language
>feature overhead.  The concepts and their rationale are explained
>in the "Rationale for the Design of the Ada programming language",
>sections 7.2 and 7.3.  Examples there of types with implied units
>are FRANC and MARK, DOLLAR and CENT, LENGTH and AREA.  You might
>want to look it up.

Hey, this sounds really neat.  Tell me, how does the rationale
deal with keeping the conversions from FRANCs to MARKs to DOLLARs
and CENTs?  I've been using a file with the conversion factors
in it for my own stuff, but that's always seemed just a little
too cavalier for my tastes, what with these values changing by 
the second in the real world.  I'm glad to see the solution
so well worked out that it becomes part of the rationale for 
designing a language!  Wow!