[mod.std.c] mod.std.c Digest Volume 4 : Issue 15

osd7@homxa.UUCP (Orlando Sotomayor-Diaz) (03/14/85)

From: Orlando Sotomayor-Diaz (The Moderator) <cbosgd!std-c>


mod.std.c Digest            Wed, 13 Mar 85       Volume 4 : Issue  15 

Today's Topics:
                   Can Reiser CPP be ISO-Standard?
             European language support in ISO Standard C
                                ftell 
----------------------------------------------------------------------

Date: Wed, 13 Mar 85 04:09:56 pst
From: sun!gnu (John Gilmore)
Subject: Can Reiser CPP be ISO-Standard?
To: std-c@cbosgd.ATT.UUCP

Suppose for example that Sun decided that we would like to supply an
ISO-Standard C compiler with the new string concatenation stuff, but
have an extension that lets our old Reiser-CPP dependencies work.  Is
this a contradiction in terms?  How can we word the standard to permit
this as an extension?

If we can't, it's gonna be one hell of a flag day for all our customers...

------------------------------

Date: Wed, 13 Mar 85 04:09:56 pst
From: sun!gnu (John Gilmore)
Subject: European language support in ISO Standard C
To: std-c@cbosgd.ATT.UUCP

(While people have been talking about ANSI Standard C, I assume there is
also a parallel ISO effort which is working from the same drafts, as was
true of the draft APL standard.  Presumably this is where the issue arises.)

> From: utzoo!henry (Henry Spencer @ U of Toronto Zoology)
> Subject: trigraphs
>             ...many European countries need to use those codes for
> other things, because they have more than 26 letters in their alphabets!
> These people have terrible trouble with Unix and C as they now stand.
>                                 ...  My personal view is that the
> occurrence of trigraph escapes in the same file as non-ISO characters
> (i.e., stuff written both ways) should be cause for an error message.
> This would at least simplify conversion.  The idea of making trigraphs
> available only via a compiler option also deserves consideration.

The whole idea has not been clearly thought out.

If your C source is written in Swedish, the compiler had better
interpret the byte 0x5C in its input as an ALPHABETIC (capital O
umlaut) and not as the reverse virgule (also known as "backslash").  If
your C source is written in English, it had better do the reverse.
Depending on which country you are in, different bytes are alphabetic
or symbolic -- and the symbols vary widely.

In other words, this "compiler option" does not make an "optional
feature" available; it's a switch that controls the interpretation of
every byte of the input file.  This switch can be set to exactly one of
N values, based on the character set of the source file.  The draft
standard doesn't specify the full behavior of the compiler for all N
values of this switch, or even how many values there are, yet it sounds
like the committee intends that support for C source written in all N
languages will be a required part of all ISO Standard compilers.
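
To make the "switch" concrete, here is a minimal sketch of a classifier
whose answer for a given byte depends on the declared source character
set.  The enum and function names are hypothetical, and the Swedish
table is a simplified assumption covering only the six 7-bit positions
commonly reassigned to letters (0x5B-0x5D and 0x7B-0x7D); it is not
taken from the draft standard.

```c
#include <assert.h>

/* Hypothetical source character sets; the names are illustrative only. */
enum source_charset { CHARSET_USASCII, CHARSET_SWEDISH };

/* Return 1 if byte c counts as a letter under charset cs, else 0.
 * Assumption: only 7-bit codes are considered, and the Swedish set
 * reassigns 0x5B-0x5D and 0x7B-0x7D to national letters. */
int is_source_letter(enum source_charset cs, unsigned char c)
{
    assert(cs == CHARSET_USASCII || cs == CHARSET_SWEDISH);

    if (c > 0x7F)
        return 0;                       /* outside the 7-bit code */
    if ((c >= 0x41 && c <= 0x5A) || (c >= 0x61 && c <= 0x7A))
        return 1;                       /* A-Z and a-z in every variant */

    if (cs == CHARSET_SWEDISH) {
        switch (c) {
        case 0x5B: case 0x5C: case 0x5D:   /* capital national letters */
        case 0x7B: case 0x7C: case 0x7D:   /* small national letters */
            return 1;
        default:
            break;
        }
    }
    return 0;                           /* digits, punctuation, controls */
}
```

With this sketch, is_source_letter(CHARSET_SWEDISH, 0x5C) is 1 while
is_source_letter(CHARSET_USASCII, 0x5C) is 0: the same byte is a letter
in one setting and punctuation in the other, so the setting affects
every byte the lexer reads.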

If the countries involved were not interested in using the \, {, },
etc. bytes as alphabetics, we wouldn't need to embed the trigraphs in
the compiler -- their sources would avoid these bytes (except in
strings), and a simple local sed script inside 'cc' could convert an
ugly-looking but clearly nonalphabetic ??/ (used for editing) into the
USASCII byte 0x5C the compiler expects.  The graphic representation of
these bytes in strings, of course, would depend on the output device
they were written on at execution time; EXCEPT that the compiler puts
a special interpretation on ONE byte value (besides the quote used to
begin the string).  That value is 0x5C, '\', and perhaps we should
standardize a way to change that character via a pragma -- because with
that one change to standard C, this scheme should work for European
languages.
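
The sed-style conversion described above could be sketched in C roughly
as follows.  This is an illustration, not anyone's actual 'cc' front
end: the function names are hypothetical, and the nine-entry table is
the trigraph set as it appears in the later C standard, which may
differ from the draft under discussion.

```c
#include <assert.h>
#include <stddef.h>

/* Map the third character of a "question, question, x" sequence to the
 * single character it stands for, or 0 if it is not a trigraph. */
static char trigraph_value(char third)
{
    switch (third) {
    case '=':  return '#';
    case '(':  return '[';
    case '/':  return '\\';
    case ')':  return ']';
    case '\'': return '^';
    case '<':  return '{';
    case '!':  return '|';
    case '>':  return '}';
    case '-':  return '~';
    default:   return 0;
    }
}

/* Copy src to dst, replacing each trigraph with its one-byte value.
 * dst must have room for at least strlen(src) + 1 bytes, since the
 * output is never longer than the input.  Returns the output length. */
size_t replace_trigraphs(const char *src, char *dst)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    while (*src != '\0') {
        char r;
        /* src[2] is read only after src[1] is known to be non-NUL. */
        if (src[0] == '?' && src[1] == '?' &&
            (r = trigraph_value(src[2])) != 0) {
            dst[n++] = r;
            src += 3;
        } else {
            dst[n++] = *src++;
        }
    }
    dst[n] = '\0';
    return n;
}
```

Note that this filter rewrites the sequences everywhere, including
inside string literals; a filter that wanted to leave strings alone
would need to track quoting, which this sketch does not attempt.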

If the countries involved ARE interested in using \, {, }, etc. as alphabetics
in C identifiers and such (a great idea -- they can spell all their
words now!), the language that results is not compatible with ANY of
the current Unix C (and CP/M C, and Mac C, and DEC C, and MSDOS C)
source files.  I don't see how the resulting language can be part of
the ISO C Standard.  It can't be intermixed with normal C expressions,
functions, or include files.  You can link a "normalC" program with a
"funnyC" program, but then again you can link it with a "Fortran"
program too.

People who want to write in a European-alphabet-capable computer
language are free to define one.  C is unfortunately not it.

[As an aside, I might suggest that the prolific European language
designers stop inventing languages, e.g. Pascal, that use up their
own national character positions!]

I'm not trying to be tough on Europe -- indeed, I'm working to get
better European support in Sun products -- but we can't close the barn
door after all the characters have been stolen.

------------------------------

Date: Tue, 12 Mar 85 00:02:40 PST
From: Richard Mathews <ucbvax!lcc.rich-wiz@UCLA-LOCUS.ARPA>
Subject: ftell 
To: cbosgd!std-c@BERKELEY

On a UNIX-like system with 4K blocks (as IX370 is supposed to have) a file
may contain more than 2^30 blocks, or about 2^42 bytes = 4.4e12 bytes.  This
far exceeds the 4.3e9 bytes addressable with an unsigned 32-bit long.
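
The arithmetic above can be checked with a short sketch.  The function
names are hypothetical, and the fixed-width types from <stdint.h> are
an assumption about the host compiler, used here only so the
multiplication does not overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of a file of `blocks` blocks of `block_size` bytes each.
 * The precondition rejects inputs whose product would overflow 64 bits. */
uint64_t file_size_bytes(uint64_t blocks, uint64_t block_size)
{
    assert(block_size != 0 && blocks <= UINT64_MAX / block_size);
    return blocks * block_size;
}

/* Nonzero if every byte offset below `bytes` fits in an unsigned
 * 32-bit value, i.e. if `bytes` does not exceed 2^32. */
int addressable_with_32_bits(uint64_t bytes)
{
    return bytes <= (uint64_t)UINT32_MAX + 1;
}
```

For 2^30 blocks of 4096 (2^12) bytes, file_size_bytes() gives
2^42 = 4398046511104 bytes, and addressable_with_32_bits() reports
that this does not fit; the largest size that does fit is
2^32 = 4294967296 bytes.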

On the other hand, the suggestion of using a structure to be returned by
ftell() would break a large amount of existing code.  A provision, however,
should be made for these larger machines.  C should not restrict itself to
minicomputers.  Any method devised should provide a consistent interface
for lseek(), tell(), fseek(), and ftell().  Has the committee considered
this?  Can anyone think of a clean way around it which is compatible with
the old system calls?
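
One shape such an interface could take is an opaque position object
that the program may save and restore but never does arithmetic on, so
its width is the implementation's business.  The later C standard's
fgetpos() and fsetpos() work this way alongside the unchanged ftell()
and fseek(); the helper below is a hypothetical sketch that assumes a
library providing those two calls.

```c
#include <assert.h>
#include <stdio.h>

/* Mark the current position, read one byte, return to the mark, and
 * read the byte again.  Returns the byte if both reads agree, or -1 on
 * any failure.  The position is held in an opaque fpos_t, so nothing
 * here depends on it fitting in a long. */
int reread_from_mark(FILE *fp)
{
    fpos_t mark;
    int first, again;

    assert(fp != NULL);
    if (fgetpos(fp, &mark) != 0)
        return -1;
    first = getc(fp);
    if (fsetpos(fp, &mark) != 0)
        return -1;
    again = getc(fp);
    return (first != EOF && first == again) ? first : -1;
}
```

Existing code that calls ftell() and fseek() with long offsets is
untouched by this approach; only programs that need positions beyond
the range of a long would switch to the opaque form.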

Richard M. Mathews
						       lcc!richard@ucla-cs
					{ucivax,trwrb}!lcc!richard
	 {ihnp4,randvax,sdcrdcf,ucbvax,trwspp}!ucla-cs!lcc!richard

------------------------------

End of mod.std.c Digest - Wed, 13 Mar 85 23:38:07 EST
******************************
USENET -> posting only through cbosgd!std-c.
ARPA -> ... through cbosgd!std-c@BERKELEY.ARPA (NOT to INFO-C)
In all cases, you may also reply to the author(s) above.