[comp.lang.c] What real non-UNIX 'C' compilers implement...

peter@sugar.UUCP (Peter da Silva) (09/08/87)

To: Doug Gwynn.

Mail is off the hook, so I'll post a reply here. I think it's of general
interest, anyway.

> [ Doug sent me mail saying most non-UNIX 'C' compilers probably don't
>   implement read/write anyway ]

Well, actually, every non-UNIX 'C' compiler I've used has had read and write.
In fact, some have not had fread/fwrite (the ones that emulated pre-version-7
libraries). Usually fread and fwrite are implemented in terms of read and
write, even when that is a brain damaged thing to do (Lattice 'C' on the IBM
does this and breaks fseek for text files. Manx 'C' on the Amiga does this
despite AmigaDOS having a read and a write with identical semantics to
UNIX). On a couple, read/write are used for binary I/O and fread and fwrite
and (mainly) fprintf and fgets are used for text I/O.

Database:

	VAX/VMS 'C'.
	DECUS 'C' for the PDP-11 is so screwed up I can't remember what
		it implemented.
	BDS-C for the Z80, no fread/fwrite (actually implemented UNIXio
		functions with stdio names).
	Small-C for the 8080 with the library that came with my copy.
	Lattice and Manx 'C' on the Amiga.
	Lattice, Microsoft, and de Smet 'C' on the IBM-PC.

I would probably recommend implementing read/write because in practical
terms that's what most people tend to use for binary file-fiddling. And you
don't want to disappoint people wanting to port their binary file-fiddlers,
now would you? But make them efficient... on MS-DOS, frex, make them straight
mappings to the MS-DOS "UNIX-style" I/O. On the Amiga, make them aliases for
Read and Write (etc.)... this will break people who do read(0) and write(1),
but these people are probably doing pipe(&pip) and fork() and stuff anyway.
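
A minimal sketch of the kind of binary file-fiddling meant here, assuming
POSIX-style <fcntl.h> and <unistd.h> declarations. The record layout and the
put_record/get_record names are hypothetical, made up for illustration. On
MS-DOS compilers you would probably also need O_BINARY in the open() flags;
that is left out here.

```c
/* Sketch: binary record I/O with the UNIX-style read()/write() calls.
 * Assumes a POSIX-like environment; where non-UNIX compilers declare
 * these calls varies and is not covered here. */
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

struct record {              /* hypothetical layout, for illustration only */
    long id;
    char name[16];
};

/* Write one record to path. Returns 0 on success, -1 on failure. */
int put_record(const char *path, const struct record *r)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    ssize_t n;
    int rc;

    if (fd < 0)
        return -1;
    n = write(fd, r, sizeof *r);
    rc = close(fd);
    return (n == (ssize_t)sizeof *r && rc == 0) ? 0 : -1;
}

/* Read one record back. Returns 0 on success, -1 on failure. */
int get_record(const char *path, struct record *r)
{
    int fd = open(path, O_RDONLY);
    ssize_t n;

    if (fd < 0)
        return -1;
    n = read(fd, r, sizeof *r);
    close(fd);
    return n == (ssize_t)sizeof *r ? 0 : -1;
}
```

The file written this way is only readable by a program built with the same
structure layout and byte order, which is the usual caveat with this style.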
-- 
-- Peter da Silva `-_-' ...!seismo!soma!uhnix1!sugar!peter
--                 'U`  <-- Public domain wolf.

lmiller@venera.isi.edu (Larry Miller) (09/10/87)

In article <672@sugar.UUCP> peter@sugar.UUCP (Peter da Silva) writes:
>To: Doug Gwynn.
>
>Mail is off the hook, so I'll post a reply here. I think it's of general
>interest, anyway.
>
>> [ Doug sent me mail saying most non-UNIX 'C' compilers probably don't
>>   implement read/write anyway ]
>
>Well, actually, every non-UNIX 'C' compiler I've used has had read and write.

>  And lots more stuff about how everyone uses read and write anyhow.

The problem goes beyond just read and write, to any OS system calls.  The
new Turbo C makes things worse because, unlike UNIX which separates system
calls (Section 2 in the manual) from C library routines (Section 3), Turbo
C just gloms them all in alphabetical order in the reference manual.  NO
new programmer to C would have any inkling that some calls are portable
C, and some are DOS specific.

A good reorganization of the Turbo C reference manual is in order.

Larry Miller
USC/ISI
lmiller@venera.isi.edu (no uucp)

chips@usfvax2.UUCP (Chip Salzenberg) (09/14/87)

In article <3545@venera.isi.edu>, lmiller@venera.isi.edu (Larry Miller) writes:
}
} The problem goes beyond just read and write, to any OS system calls.  The
} new Turbo C makes things worse because, unlike UNIX which separates system
} calls (Section 2 in the manual) from C library routines (Section 3), Turbo
} C just gloms them all in alphabetical order in the reference manual.  NO
} new programmer to C would have any inkling that some calls are portable
} C, and some are DOS specific.
} 
} A good reorganization of the Turbo C reference manual is in order.
} 
} --
} Larry Miller

The one thing I like about Lattice is that their C manual tags functions with
a `type' -- ANSI, UNIX, XENIX, MS-DOS, etc.  Thus you can pick the function(s)
that are available on the range of systems important to you.  (Why restrict
yourself to ANSI function calls for a throw-away utility?)

I like Turbo's reference manual.  I usually use it even when programming on
Xenix. (Of course, Turbo doesn't have msgctl(), but I can read the SVID. :-))
-- 
Chip Salzenberg            UUCP: "uunet!ateng!chip"  or  "chips@usfvax2.UUCP"
A.T. Engineering, Tampa    Fidonet: 137/42    CIS: 73717,366
"Use the Source, Luke!"    My opinions do not necessarily agree with anything.

peter@sugar.UUCP (09/14/87)

In article <3545@venera.isi.edu>, lmiller@venera.isi.edu (Larry Miller) writes:
> The problem goes beyond just read and write, to any OS system calls.  The
> new Turbo C makes things worse because, unlike UNIX which separates system
> calls (Section 2 in the manual) from C library routines (Section 3), Turbo
> C just gloms them all in alphabetical order in the reference manual.  NO
> new programmer to C would have any inkling that some calls are portable
> C, and some are DOS specific.

AT&T manuals now put some of the library routines into the system call
section. For some bizarre reason fclose, fflush, fopen, freopen, fdopen,
fread, fseek, rewind, ftell, fwrite, malloc, free, popen, and system are
in "Base System Routines" (which seems to be mostly section 2)... rather
than being in "standard library routines" (section 3).

So, UNIX isn't nice and elegant anymore.

(System V... from now on, consider it substandard)
-- 
-- Peter da Silva `-_-' ...!hoptoad!academ!uhnix1!sugar!peter
--                 'U`      ^^^^^^^^^^^^^^ Not seismo!soma (blush)

gwyn@brl-smoke.ARPA (Doug Gwyn ) (09/21/87)

The SVID does not attempt to dictate which functions must be
implemented as system calls and which must be implemented as
library routines.  (Contrary to some claims, the SVID is not
simply documentation of one vendor's implementation, but is
a specification for a system interface, very much like POSIX
without any attempt to bless 4.nBSD systems as conforming.)

It made sense to move malloc() into the BA_OS section, even
though it's normally implemented as a library function that
invokes the sbrk() system call (which is not required by the
SVID).  It makes somewhat less sense to have placed STDIO
routines in BA_OS rather than BA_LIB, but since both BA_OS
and BA_LIB are required for Base System conformance, it
really doesn't matter which section they're described in.

peter@sugar.UUCP (Peter da Silva) (09/25/87)

> It made sense to move malloc() into the BA_OS section, even
> though it's normally implemented as a library function that
> invokes the sbrk() system call (which is not required by the
> SVID).

I guess, if you can have execlp() in "section 2", why not malloc().

> It makes somewhat less sense to have placed STDIO
> routines in BA_OS rather than BA_LIB, but since both BA_OS
> and BA_LIB are required for Base System conformance, it
> really doesn't matter which section they're described in.

Except that the O/S *manuals* follow the SVID. And except that it confuses
people. "Hey, peter, how come they have read and fread?" "Well, fread is
a library routine." "Oh. How do you tell which ones are library routines?"
"read the manual. Library routines are section 3" "section 3? I don't
have a section 3" "What? Let me look at that... they must be kidding".

And it used to be you could find something by starting at "a" and searching
from there. You only had to know if it was a system call, a program, a library
routine, or a data structure (though the section 4/section 5 split was
not always obvious). Now you have to guess whether someone thought "awk"
was a programming language, a utility, or an administrative tool (the
answer is "yes"). And as seen by the location of fread(), etc... it's
not always obvious.

Finally, I can easily imagine people implementing fread() in the kernel
if they follow SVID literally.

As far as I'm concerned, it's UNIX if it provides the system calls listed
in section 2 of the Version 7 manual, with the exception of ptrace and
chroot and a few others, and with gtty/stty perhaps replaced by ioctl.
-- 
-- Peter da Silva `-_-' ...!hoptoad!academ!uhnix1!sugar!peter
--                 'U`  Have you hugged your wolf today?
-- Disclaimer: These aren't mere opinions... these are *values*.

guy%gorodish@Sun.COM (Guy Harris) (09/25/87)

> Except that the O/S *manuals* follow the SVID.

*Which* OS manuals follow the SVID?  AT&T's certainly don't; in the S5R3
documentation, "read" is in section 2, and "fread" is in section 3S (same as it
ever was).

Frankly, *I*'d like to see the distinction between section 2 and section 3
erased completely, so that you don't know which library routines consist of a
little glue and a "trap" call and which don't.

Note that people *already* can't determine what is a "system call" based merely
on which manual section it's in.  If you think, for example, that "sigvec", on
systems that support it, consists merely of a little glue and a "trap" call
just because it's in section 2, think again; on Suns, it's actually a
non-trivial piece of C code maintaining its own signal vector.  This is
probably true of "signal" on a lot of machines as well.

> And except that it confuses people. "Hey, peter, how come they have read
> and fread?" "Well, fread is a library routine."

What "confuses people" here is not that this is somehow intrinsically
confusing, but that "Well, 'fread' is a library routine" is a lousy
explanation.  The difference is not "'fread' is a library routine" (in fact,
they're *both* library routines on most implementations, one just happens to be
relatively trivial - at least on UNIX systems), but "'fread' deals with
standard I/O streams and 'read' deals with UNIX file descriptors"; this
explanation may not make sense to somebody not familiar with standard I/O
streams and UNIX file descriptors, but it's not clear any *other* explanation
of this will make sense if you're not familiar with those objects.
	Guy Harris
	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
	guy@sun.com

gwyn@brl-smoke.ARPA (Doug Gwyn ) (09/27/87)

In article <29156@sun.uucp> guy%gorodish@Sun.COM (Guy Harris) writes:
>... this
>explanation may not make sense to somebody not familiar with standard I/O
>streams and UNIX file descriptors, but it's not clear any *other* explanation
>of this will make sense if you're not familiar with those objects.

How about: "fread() uses user-mode buffering; read() does not".
Of course there are other differences, but that seems to be the
essential functional distinction.
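
One way to watch that buffering from the outside, as a rough sketch: read a
single byte with fread() and then ask where the underlying file descriptor
is positioned. This assumes a POSIX system (fileno, lseek), and the function
name is hypothetical. How far stdio reads ahead is implementation-specific,
so no particular number is promised.

```c
/* Sketch: observe stdio's user-mode buffering via the descriptor offset. */
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Read ONE byte from path with fread() and report the offset of the
 * underlying file descriptor afterwards. If unbuffered is nonzero, the
 * stream's buffering is switched off first. Returns -1 on error. */
long offset_after_one_byte(const char *path, int unbuffered)
{
    FILE *fp = fopen(path, "rb");
    char c;
    long off = -1;

    if (fp == NULL)
        return -1;
    if (unbuffered)
        setvbuf(fp, NULL, _IONBF, 0);   /* must precede any other I/O */
    if (fread(&c, 1, 1, fp) == 1)
        off = (long)lseek(fileno(fp), 0, SEEK_CUR);
    fclose(fp);
    return off;
}
```

On a buffered stream the reported offset is typically well past 1 (often the
whole of a small file), because fread() filled its buffer with one large
read(). With buffering off it stays at or near the one byte requested.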

jgp@moscom.UUCP (09/30/87)

In article <29156@sun.uucp> guy%gorodish@Sun.COM (Guy Harris) writes:
>Frankly, *I*'d like to see the distinction between section 2 and section 3
>erased completely, so that you don't know which library routines consist of a
>little glue and a "trap" call and which don't.

The Lattice C (V2) manuals classified functions into levels instead of
indicating whether they were a system call or a library function.  Level 1
routines provided low level access with routines like sbrk() and read().
Level 2 and 3 functions provided higher level access with things like
malloc() and fread().

Things were sorted by function and subsorted by level number so that you
could see the alternative ways of doing something (eg. file I/O) grouped
together.  There was the logical concept that a level 2 function was just
a standard way of using a level 1 function.

Another problem with labeling things as system calls is that a system
call on one system may be a library call on the next.  signal() is a system
call on most systems but on BSD 4.[23] it has been changed to a library call.
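
To illustrate what "signal() as a library call" can look like, here is a
sketch that layers it on a lower-level primitive. It uses POSIX sigaction()
as a stand-in for the BSD sigvec() call, and my_signal is a hypothetical
name; this is not the actual BSD source.

```c
/* Sketch: a signal()-style routine built as plain library code. */
#define _POSIX_C_SOURCE 200809L
#include <signal.h>

typedef void (*handler_t)(int);

/* Install handler for signo and return the previous handler, or SIG_ERR
 * on failure, mirroring the signal() calling convention. */
handler_t my_signal(int signo, handler_t handler)
{
    struct sigaction act, old;

    act.sa_handler = handler;
    sigemptyset(&act.sa_mask);   /* block no additional signals in handler */
    act.sa_flags = SA_RESTART;   /* BSD flavor: restart interrupted calls */
    if (sigaction(signo, &act, &old) < 0)
        return SIG_ERR;
    return old.sa_handler;
}
```

A caller cannot tell from the interface whether this routine or a direct
kernel entry point is behind the name.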
-- 
Jim Prescott	moscom!jgp@cs.rochester.edu
		{rutgers,ames,cmcl2}!rochester!moscom!jgp

lmiller@venera.isi.edu.UUCP (10/02/87)

In article <1059@moscom.UUCP> jgp@moscom.UUCP (Jim Prescott) writes:
>In article <29156@sun.uucp> guy%gorodish@Sun.COM (Guy Harris) writes:
>>Frankly, *I*'d like to see the distinction between section 2 and section 3
>>erased completely, so that you don't know which library routines consist of a
>>little glue and a "trap" call and which don't.
>
>Another problem with labeling things as system calls is that a system
>call on one system may be a library call on the next.  signal() is a system
>call on most systems but on BSD 4.[23] it has been changed to a library call.
>-- 
>Jim Prescott	moscom!jgp@cs.rochester.edu
>		{rutgers,ames,cmcl2}!rochester!moscom!jgp

It is VERY important, for purposes of portability, to make the section 2/
section 3 distinction.  That is, C STANDARD library routines, and system
calls.  Why?  Because most programmers, particularly when faced with
something like Turbo C's alphabetical listing of all functions, don't
really know the difference.  Then try porting DOS calls to UNIX.

Further, standard library routines are supposed to perform in a documented
way, conforming to the standard.  System routines need not.  Consequently,
from a formal verification standpoint, programs are likely to be safer,
more robust.  And knowing the distinction allows you to package the stuff
that has to be system-dependent in a way that aids portability.

Larry Miller
USC/ISI

guy%gorodish@Sun.COM (Guy Harris) (10/03/87)

> It is VERY important, for purposes of portability, to make the section 2/
> section 3 distinction.  That is, C STANDARD library routines, and system
> calls.  Why?  Because most programmers, particularly when faced with
> something like Turbo C's alphabetical listing of all functions, don't
> really know the difference.  Then try porting DOS calls to UNIX.

It is EXTREMELY irrelevant, for purposes of portability, to make the section
2/section 3 distinction.  That is, system calls/C standard AND general UNIX C
library calls.  Why?  Because you're probably going to have no better luck, in
general, porting "getpwent", which is in section 3 but is NOT a "C STANDARD
library routine", to a non-UNIX system than you are porting "stat" to such a
system.

Section 3, in UNIX systems, at least, contains plenty of things that are NOT "C
STANDARD library routines".

The distinction between C standard library routines and routines particular to
a specific implementation or operating system is important; however, the
distinction between routines listed in section 2 of the UNIX documentation, and
routines listed in section 3 of that documentation, is not the same
distinction.

> Further, standard library routines are supposed to perform in a documented
> way, conforming to the standard.  System routines need not.  Consequently,
> from a formal verification standpoint, programs are likely to be safer,
> more robust.

1) Is there a formal specification of the C library?  If not, formal
verification is irrelevant here.

2) I believe people *have* tried making formal specifications of the behavior
of UNIX systems (although the systems may have been V6 or V7 systems).

3) "The standard" is still in draft form.

4) There is a standard for UNIX-flavored systems in draft form, so the
distinction here between "standard library routines" and "system routines" is
not relevant.
	Guy Harris
	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
	guy@sun.com

gwyn@brl-smoke.UUCP (10/04/87)

In article <3701@venera.isi.edu> lmiller@venera.isi.edu.UUCP (Larry Miller) writes:
>It is VERY important, for purposes of portability, to make the section 2/
>section 3 distinction.  That is, C STANDARD library routines, and system
>calls.  Why?  Because most programmers, particularly when faced with
>something like Turbo C's alphabetical listing of all functions, don't
>really know the difference.  Then try porting DOS calls to UNIX.
>
>Further, standard library routines are supposed to perform in a documented
>way, conforming to the standard.  System routines need not.  Consequently,
>from a formal verification standpoint, programs are likely to be safer,
>more robust.  And knowing the distinction allows you to package the stuff
>that has to be system-dependent in a way that aids portability.

I have to disagree with the way you've packaged "standard" and "system"
routines.  Rather than bandy about the term "standard", you should refer to
the standard document you use as an interface reference.  At the moment, it
must be mostly K&R, which unfortunately does not specify very many library
routines.  In the not-too-distant future, the most useful reference spec
will be the ANSI standard for C, X3.159-198x.

It would indeed be wise to keep separate those library routines that are
assumed to be "universally" available from those that are definitely
specific to a particular system, using the former whenever possible in
code intended to be portable and confining use of the latter to small,
isolated system interface modules with well-defined portable interfaces.
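
A small sketch of such a module, under stated assumptions: file_size() is a
hypothetical portable interface, the UNIX branch uses stat(), and the other
branch falls back on standard library calls only. The preprocessor test is
just one plausible way to select the branch.

```c
/* Sketch: one system-specific routine hidden behind a portable interface.
 * Callers include only the prototype and never see stat(). */
#include <stdio.h>

#if defined(__unix__) || defined(__APPLE__)
#include <sys/types.h>
#include <sys/stat.h>

/* UNIX flavor: ask the system directly. Returns -1 on failure. */
long file_size(const char *path)
{
    struct stat st;

    return stat(path, &st) == 0 ? (long)st.st_size : -1L;
}
#else
/* Fallback using only standard library calls. Seeking to the end of a
 * binary stream is not guaranteed to be meaningful on every system, so
 * treat this branch as a best effort. Returns -1 on failure. */
long file_size(const char *path)
{
    FILE *fp = fopen(path, "rb");
    long n = -1L;

    if (fp != NULL) {
        if (fseek(fp, 0L, SEEK_END) == 0)
            n = ftell(fp);
        fclose(fp);
    }
    return n;
}
#endif
```

Porting the program then means touching this one file, not every call site.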

I'm in the process of preparing a list of standard library functions that
can be counted on in a majority of current environments, in sections
corresponding to ANSI C, POSIX, and SVID (each assuming the previous ones
as a subset).  This is not perfect, but it is a useful guide for those
programmers who really aren't sure how universally available various
routines are.  I'll post the list here "soon".

ron@topaz.rutgers.edu (Ron Natalie) (10/04/87)

Presumably, people implementing C compilers for non-UNIX systems
should ignore sections of the UNIX manual entirely and implement
the X3J11 subroutine set.

What you really need to know is when is _errno valid after a
function call.  This is the information that is not determinable
from most manuals.
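
As one concrete case, assuming the standard C strtol() and errno (the
parse_long name is hypothetical): errno only tells you something if you
cleared it before the call and the function's documentation says it sets it.

```c
/* Sketch: errno is meaningful only under documented conditions. */
#include <errno.h>
#include <stdlib.h>

/* Parse a decimal long. Returns 0 on success, -1 if s is not entirely a
 * number, -2 if the value does not fit in a long. */
int parse_long(const char *s, long *out)
{
    char *end;
    long v;

    errno = 0;                 /* strtol never resets errno to 0 itself */
    v = strtol(s, &end, 10);
    if (end == s || *end != '\0')
        return -1;             /* no digits at all, or trailing junk */
    if (errno == ERANGE)
        return -2;             /* v was clamped to LONG_MAX or LONG_MIN */
    *out = v;
    return 0;
}
```

Checking errno without the preceding assignment could pick up a stale value
left by some earlier, unrelated call.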

-Ron

rwhite@nusdhub.UUCP (Robert C. White Jr.) (10/08/87)

In article <29914@sun.uucp>, guy%gorodish@Sun.COM (Guy Harris) writes:
 [Lots of Stuff about "Standards" and things Deleted]
> distinction here between "standard library routines" and "system routines" is
> not relevant.

	1)  One, If there was no distinction, there would be no distinction.

	2)  A good use for separating the segments is for the CONSTRUCTION
OF SPECIALIZED LIBRARIES.  If you are writing a screen handling lib,
like curses or termcap (Go Ahead, tell me nobody uses those! I dare you.),
You don't want that new library to be dependent on loading other libraries.
If you don't have a distinction drawn, and you put together a library of
your favorite fast and dirty routines to do X, all your programs will
be HUGE if you got just the wrong call.
	To mix apples and oranges, A perfect example of cross-dependent
libraries is: any program compiled with IBM-PASCAL v1.0.  It makes A mess.

	3) If you want an alphabetical listing of calls and functions
turn to the "index", that is what it is there for.

	4) When doing the "small, fast, and critical" things needed
for many applications like drivers and such, traversing a library
structure, [i.e. function calls function calls OS-Primitive],
can be much slower than just calling an OS-Primitive.


	MOST of the griping I see on 1) Organizing the manuals,
2) Setting indentation standards, 3) Pointer "stupidity", and
4) "Brain-damaged" articles of fact, come from people who either
A) are not used to what they are seeing, or B) Bought a machine
whose internals are not best suited for the "standard way of doing
this".
	You all seem to be forgetting that the idea behind a
standard is that it will "Be Most Useful To The Greatest Number
of People and Environments".

	If you don't like the Manuals, get another set from someone
else.  I have three sets of manuals for C.  1) the standard manual
for my 3b2/600 (divided by the standard 2 and 3[C|S|M|N|X|F] method)
2) A quick reference (by Prentice Hall, under direction from bell-labs)
which is divided by Library MODULE (so you can pre-judge the run
image size) and 3) The reference set for Microsoft C 4.0.  [I don't
include my K&R reference, I think of it like the bible, it just must
be.]

Disclaimer: you all know if you should be included in "you all",
	If you don't know, you probably are, If you don't care,
	you probably arn't.  If I didn't mean it, I wouldn't
	say it.  I only say these things behind my employers back.
	If I misspelled it, but you know what I meant, I don't need
	to hear about it, if you don't know what I menat, the
	clarification wont help, get the "Cliff Notes".
	Is this enough mommy, I want tot go home now....

Rob ("Who asked you anyway?") White.

decot@hpisod2.HP.COM (Dave Decot) (10/11/87)

A distinction that WOULD be useful is the following:

    OS: routines that are part of the kernel; your executable file
	does not contain them.

    LIB: routines that are linked into your executable file.

This distinction would be useful for migrating executables between
systems that had a common instruction set, system call mechanism, and
object file format.  An example of this situation would be a system
that ran executables linked elsewhere, but did not itself provide a
linker or libraries.

Dave Decot
hpda!decot

guy%gorodish@Sun.COM (Guy Harris) (10/12/87)

> A distinction that WOULD be useful is the following:
> 
>     OS: routines that are part of the kernel; your executable file
> 	does not contain them.
> 
>     LIB: routines that are linked into your executable file.

By this definition, the OS section would be pretty empty indeed....  "read"
*is*, in almost all UNIX implementations, a routine linked into your executable
file.  It just happens, in most cases, to be a trivial routine that does a
trap.
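
For a sense of how little glue that is, here is a sketch that assumes a
Linux/glibc host, where the generic syscall() stub and the SYS_read number
are available. The my_read name is hypothetical, and real C libraries
usually write this stub in assembly and not this way.

```c
/* Sketch: a "read" that is nothing but a wrapper around a trap.
 * Linux/glibc-specific assumption: syscall() and SYS_read exist. */
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1
#endif
#include <sys/syscall.h>
#include <unistd.h>

/* Returns the number of bytes read, 0 at end of file, or -1 on error. */
long my_read(int fd, void *buf, unsigned long count)
{
    /* syscall() loads the call number and arguments, traps into the
     * kernel, and turns an error return into -1 with errno set. */
    return syscall(SYS_read, fd, buf, count);
}
```

This routine is still linked into the executable like any other library
function; only the work it delegates happens in the kernel.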

> This distinction would be useful for migrating executables between
> systems that had a common instruction set, system call mechanism, and
> object file format.  An example of this situation would be a system
> that ran executables linked elsewhere, but did not itself provide a
> linker or libraries.

How does this distinction help here?  Once you've built the image, if all
the routines are linked into the executable image it shouldn't matter which
routines are just system call wrappers and which aren't.  If you have a shared
library mechanism, and the routines aren't all linked in, the distinction
blurs; you now have to be sure that the library interface for the routines in
question is common across those systems.
	Guy Harris
	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
	guy@sun.com