[net.lang.c] C Builtin Functions

blarson@usc-oberon.UUCP (Bob Larson) (04/15/86)

In article <2528@brl-smoke.ARPA> gwyn@brl.ARPA writes:
>In a hosted (ONLY) implementation, some library functions
>are likely to call on other library functions; if you
>redefine library functions yourself, you may well break
>other apparently unrelated library functions.  The only
>relatively safe redefinition would be one with a standard-
>conforming interface, but then why not just use the one
>provided?
>
Perhaps because library authors aren't always perfect, or just for
performance.  (Is your strcmp optimised for 10Mbyte strings?)


-- 
Bob Larson
Arpa: Blarson@Usc-Ecl.Arpa
Uucp: ihnp4!sdcrdcf!usc-oberon!blarson

kwh@bentley.UUCP (KW Heuer) (04/16/86)

In article <4017@pur-ee.UUCP> pur-ee!pasm (PASM Parallel Processing Laboratory) writes:
>	but you can just as correctly write:
>		sizeof int

I've seen compilers that will accept this, but I believe K&R says the
parens are necessary when the argument is a datatype (to avoid the
ambiguity of e.g. "sizeof char * - 1").
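
A minimal sketch of the parenthesization rule, assuming an ANSI-style
compiler.  The function names are illustrative and do not come from the
post.  With a type operand the parentheses are part of the syntax; with
an expression operand they are optional.

```c
#include <stddef.h>

/* Type operand: the parentheses are required, so this can only mean
 * (size of a char pointer) minus one. */
size_t pointer_size_minus_one(void)
{
    return sizeof(char *) - 1;
}

/* Expression operand: no parentheses needed.  sizeof binds tighter
 * than binary minus, so this is (sizeof p) - 1. */
size_t expression_size_minus_one(void)
{
    char *p = 0;
    return sizeof p - 1;
}
```

Both functions return the same value, since p has type char *.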

>	What the heck is a compile-time function?  Real useful -
>	functions that return constants.  Come on now.

I can think of two cases, off the top of my head, where a compile-time
function makes sense:

	double log10(x) double x; { return log(x)/log(10.0); }
	char buf[max(XSIZE,YSIZE)];

It's more efficient for log(10.0) to be evaluated at compile-time,
and it's essential for the compiler to know the constant result of
max().  In practice, of course, these are handled by a preprocessor
constant (for the specific value LOG10 or its reciprocal) and a
macro implementation of max (which unfortunately can't also be used
as a function in general -- this is a potential argument in favor
of builtins?).
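
A rough sketch of the "in practice" workaround described above.  The
names XSIZE, YSIZE, MAX, LN10 and ln_to_log10 are assumptions made for
illustration; only the idea (a preprocessor constant plus a macro max)
comes from the post.

```c
/* Hypothetical sizes; any integer constants would do. */
#define XSIZE 80
#define YSIZE 132

/* Macro max: usable in a constant expression, but it evaluates one of
 * its arguments twice, so it is not a drop-in function replacement. */
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* The value of log(10.0), written out so nothing is computed at run time. */
#define LN10 2.30258509299404568402

/* The array bound must be known at compile time; the macro makes it so. */
char buf[MAX(XSIZE, YSIZE)];

/* Converts an already computed natural logarithm to base 10. */
double ln_to_log10(double ln_x)
{
    return ln_x / LN10;
}
```

Calling MAX(i++, j) would increment i twice when i is the larger
value, which is the usual reason the macro cannot stand in for a
function in general.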

Karl W. Z. Heuer (ihnp4!bentley!kwh), The Walking Lint

gwyn@brl-smoke.ARPA (Doug Gwyn) (04/19/86)

In article <276@usc-oberon.UUCP> blarson@usc-oberon.UUCP (Bob Larson) writes:
>Perhaps because library authors aren't always perfect, or just for
>performance.  (Is your strcmp optimised for 10Mbyte strings?)

Yes.

Bob, please contribute your superb implementations of library
routines to the system librarian so others can benefit from them.
Thanks.

kwh@bentley.UUCP (KW Heuer) (04/22/86)

In article <836@ttrdc.UUCP> ttrdc!levy (Daniel R. Levy) writes:
>I think that inline expansion would make lots of sense for cases like
>printf() [which could] parse the formats at compile time [and catch
>mismatched formats and variables].

The smart versions of lint can already detect such format mismatches in
printf() and scanf().  This is actually more powerful than having it done
by the C compiler, because cc would only recognize the builtin functions
(i.e.  those in libc) while lint can be told that any given function is
"printf-like".  Of course it's a kludge either way, and won't help with
other functions based on the same principle, such as tparm() in libcurses.

The only advantage I can see in a builtin printf() is that it might be
possible to optimize  printf("%d\n",n);  into  d_printf(n); putchar('\n');
and avoid loading the entire printf library (the "%g" functions, etc.)
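
A sketch of what the integer-only half of that split might look like.
d_printf() is the hypothetical name used above, not a real library
routine, and the return value (characters written) is an assumption
borrowed from printf().

```c
#include <stdio.h>

/* Hypothetical helper: print n in decimal using only putchar(), so the
 * floating-point formatting code never needs to be linked in.
 * Returns the number of characters written. */
int d_printf(int n)
{
    char digits[3 * sizeof(int) + 1];   /* ample room for any int */
    int count = 0;
    int written = 0;
    /* Work in unsigned arithmetic so the most negative int is safe. */
    unsigned int u = n < 0 ? 0u - (unsigned int)n : (unsigned int)n;

    if (n < 0) {
        putchar('-');
        written++;
    }
    do {                                /* collect digits, low order first */
        digits[count++] = (char)('0' + u % 10);
        u /= 10;
    } while (u != 0);
    while (count > 0) {                 /* emit them high order first */
        putchar(digits[--count]);
        written++;
    }
    return written;
}
```

Under this sketch, printf("%d\n",n) would become d_printf(n) followed
by putchar('\n').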

>Someone commented that implementations were expected to have an external
>set of the "standard" functions on hand [for pointer passing], e.g. passing
>strcmp() as a comparison routine to qsort().  Are there other reasons?

For reasons I won't elaborate here (other than to mention orthogonality and
the Spirit of Unix), I'm opposed to the idea of builtin functions.  I might
support the idea of additional operators for sufficiently primitive
operations.  (Unary "|" for abs()?  Binary "><" for min(), "<>" for max()?
Now that would drive the pascal/basic users batty! :-)  And we'd get "><="
and "<>=" operators for free.)  I don't want C to get overblown like ksh
(where "kill" is a builtin!).

Karl W. Z. Heuer (ihnp4!bentley!kwh), The Walking Lint
Why won't people fix the Newsgroups and Subject lines when the topic changes?

roy@phri.UUCP (04/23/86)

In article <729@bentley.UUCP> kwh@bentley.UUCP (KW Heuer) writes:
> 
> I might support the idea of additional operators [...] Binary "><" for
> min(), "<>" for max()?

	A bunch of years ago I ported the PWB (or was it v7?) C compiler to
a v6 system (turned out to be fairly trivial; bootstrapping long constants
was the only tricky part).  Much to my surprise, the lexical analyzer had
commented-out code to recognize "/\" for max and "\/" for min.  There were
even entries for these operators in the code tables.

	We never tested it out, but I think all we had to do was un-comment
a few things here and there and we would have had builtin min and max
operators in C.  I suppose it's a good thing we didn't -- talk about
non-portable code! :-)
-- 
Roy Smith, {allegra,philabs}!phri!roy
System Administrator, Public Health Research Institute
455 First Avenue, New York, NY 10016

cdshaw@watdragon.UUCP (Chris Shaw) (04/28/86)

This discussion is getting really boring. (Yes I know I can "n" past them,
but that's not the point). Could people who have a burning desire to
talk this to death please stop (or do so by mail) ?  
I'm getting rather sick of sizeof blah.


Chris Shaw    watmath!watrose!cdshaw  or  cdshaw@watmath
University of Waterloo
Bogus as HELL !!!

kwh@bentley.UUCP (KW Heuer) (05/05/86)

In article <1700010@umn-cs.UUCP> umn-cs!herndon writes:
>I feel compelled to go on record as being VERY MUCH AGAINST having reserved
>procedure names in C.  For those of us who have ever written stand-alone
>code for PDP-11s, VAXen, ..., it is a simple matter, as things stand, to
>compile our programs, and then have them linked with our own versions of
>'putc', 'read', etc. stashed in stand-alone libraries.

Fine, if you replace *all* of libc.  Otherwise you may find surprises (some
of the stdio functions behave as though they call each other, but have been
optimized to use the system calls directly in some cases).

>One of the (in my opinion) great strengths of the C language is that it
>does not have 'built-in' functions.

After rethinking this question, I've decided that it *does*.  However, the
builtins are all punctuation ("%"), whereas the library functions are all
alphanumeric ("abs").  The only exception is "sizeof", which is something of
a special case anyway.  (Let's not quibble about "sizeof" again.  And yes, I
know "%" is an "operator", but doesn't that just mean "builtin function with
builtin syntax"?)

>If one user doesn't like the interface that 'printf' provides, or a whole
>bunch of users don't, they are free to write their own functions and use
>those instead.

As has been mentioned before, if you're changing the semantics it's wise to
change the name too.

>Building those functions into the language implies that there will be much
>code [within the compiler] for special casing those functions.

But the standard doesn't *require* them to be special-cased; you could port
the compiler by commenting out that code.

>On the flip side, the language may not be as efficient.  If the compiler
>writers want to allow these procedures to be built-in to allow in-line
>procedures, I think this should be an option (DEFAULT=OFF), and then the
>capabilities of the language will be compromised as little as possible.

If they make it an option, the default will probably be ON.

Here's my opinion.  Using punctuation for builtins (and alphanumerics for
library functions) is a nice way to keep them straight; let's keep it that
way.  If certain functions are so trivial that it's worthwhile for them to
be expanded inline (are there any besides abs(), min(), and max()?), then
they should have non-alphanumeric spellings; i.e. they should be operators.

I can feel the flames ("More operators?  It's getting as bad as APL!").

Karl W. Z. Heuer (ihnp4!bentley!kwh), The Walking Lint

rbj@icst-cmr (Root Boy Jim) (05/06/86)

> >I can feel the flames ("More operators?  It's getting as bad as APL!").
 
Hey, the more primitives, the better. I also like(d) TECO. Just because
a language is terse doesn't mean it's unreadable. In my opinion, people
who laughed at the Greek were just too lazy to learn the lingo. APL is
quite readable once you get used to it, especially with direct definition.

> Excuse me...in APL parlance (eg, the language used in the APL Draft Standard),
> things like "+", "-",etc. are called functions, as are user-defined functions.
> Operators are things which alter the effect of functions, such as "." in the 
> context of "+.x", which causes an inner matrix product to be evaluated.

Interesting. Would the construct `+.FUNC' be legal also?
 
> (No, this doesn't belong in net.lang.c, but YOU STARTED IT!)
 
And I'm adding to it.

> Peter S. Shenkin	 Columbia Univ. Biology Dept., NY, NY  10027
> {philabs,rna}!cubsvax!peters		cubsvax!peters@columbia.ARPA

	(Root Boy) Jim Cottrell		<rbj@cmr>
	"One man gathers what another man spills"

franka@mmintl.UUCP (Frank Adams) (05/06/86)

In article <788@bentley.UUCP> kwh@bentley.UUCP writes:
>I can feel the flames ("More operators?  It's getting as bad as APL!").
>
>Karl W. Z. Heuer (ihnp4!bentley!kwh), The Walking Lint

"More operators?  It's getting as good as APL!" -- except that it still
falls way short in this respect.  I'm serious; APL does *not* have too many
operators.  It does have too little structure.

Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
Multimate International    52 Oakland Ave North    E. Hartford, CT 06108

greg@ncr-sd.UUCP (Greg Noel) (05/08/86)

I'm surprised that with all this discussion on built-ins, nobody has pointed
out how the C standard specifies that it should work.  My copy of the standard
is pretty old and has been stolen, so perhaps it was removed in a later
specification?

In article <788@bentley.UUCP> kwh@bentley.UUCP (KW Heuer) writes:
>In article <1700010@umn-cs.UUCP> umn-cs!herndon writes:
>>I feel compelled to go on record as being VERY MUCH AGAINST having reserved
>>procedure names in C. ......
>>One of the (in my opinion) great strengths of the C language is that it
>>does not have 'built-in' functions.
>
>After rethinking this question, I've decided that it *does*.  However, the
>builtins are all punctuation ("%"), whereas the library functions are all
>alphanumeric ("abs").
>	.......
>Here's my opinion.  Using punctuation for builtins (and alphanumerics for
>library functions) is a nice way to keep them straight; let's keep it that
>way.  If certain functions are so trivial that it's worthwhile for them to
>be expanded inline (are there any besides abs(), min(), and max()?), then
>they should have non-alphanumeric spellings; i.e. they should be operators.

No, what the standard does (did?) is reserve all names that begin with one
or two underscores.  There is some difference in meaning between the two,
but it's not important here.  Pick one of these names, say, __OPEN, and
build it in to the compiler (i.e., it produces in-line code).  Then you
simply act as if there were an implicit "#define open __OPEN" at the head
of each program so that under normal circumstances the built-in version
is used.  If you want to provide your own "open" routine, all you have to
do is have an "#undef open" in your program and the external semantics we
all know and love are used instead.
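
The scheme can be imitated in ordinary user code.  This is only a
sketch: names beginning with underscores are reserved, so BUILTIN_ABS
and my_abs are hypothetical stand-ins for a compiler builtin such as
__OPEN and for the library name it shadows.

```c
/* Stand-in for something the compiler would expand in-line. */
#define BUILTIN_ABS(x) ((x) < 0 ? -(x) : (x))

/* The ordinary external function still exists ... */
int my_abs(int x);

/* ... but this plays the role of the implicit #define, so plain
 * uses of the name get the in-line version. */
#define my_abs(x) BUILTIN_ABS(x)

int via_builtin(int x)
{
    return my_abs(x);       /* macro expansion, no call */
}

/* A program that wants the external semantics removes the macro. */
#undef my_abs

int my_abs(int x)           /* the user's own replacement */
{
    return x < 0 ? -x : x;
}

int via_function(int x)
{
    return my_abs(x);       /* a real function call */
}
```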

This seems to solve both of your needs without turning C into APL.  If this
technique is still in the standard, this debate is moot.  If it's not, then
I, for one, would like to know why it was removed.

Aside:  I think that all of the machine-specific names should be in a file
somewhere and the preprocessor should implicitly read it, as if there were
a "#include <standard.h>" (or equivalent) as the first line of every
program.  This would have the #defines for the machine, architecture, OS,
and whatever else it takes to describe the environment.  The header for
the standalone environment simply wouldn't have the #defines for whichever
syscalls are built-in (like __OPEN above) but might have #defines for, say,
abs, l3tol/lto3l, or memcpy if it made sense.
-- 
-- Greg Noel, NCR Rancho Bernardo    Greg@ncr-sd.UUCP or Greg@nosc.ARPA

jss@ulysses.UUCP (Jerry Schwarz) (05/09/86)

> I'm surprised that with all this discussion on built-ins, nobody has pointed
> out how the C standard specifies that it should work.  My copy of the standard
> is pretty old and has been stolen, so perhaps it was removed in a later
> specification?
> 

It was about the 3rd item posted in the discussion.  But here it is
again (February 1986 draft):

        D.1.2 Headers

        ... All external identifiers declared in any of the headers
        [i.e. as specified in the standard] are reserved, whether or
        not the associated header is included. All external
        identifiers and macro names that begin with an underscore are
        also reserved.  If the program redefines a reserved external
        identifier, even with a semantically equivalent form the
        behavior is implementation defined. ...

        D.1.3 Use of library functions

        ... Any function declared in a header may be implemented as a
        macro defined in the header, so a library function should not
        be declared explicitly if its header is included. ... The use
        of #undef to remove any macro definition will also ensure
        that  an actual function is referenced.

To me this seems perfectly reasonable.  Note in particular that a
vendor is required to tell you what will happen if you try to redefine
one of the reserved identifiers.  
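
A small sketch of the D.1.3 rule in use, with toupper() chosen as an
arbitrary example and the names upper_fn, shout and shout_again made
up for illustration.  Whether a given implementation actually defines
toupper as a macro is an assumption; the #undef is harmless if it
does not.

```c
#include <ctype.h>

/* Remove any macro definition so the name refers to the real function. */
#undef toupper

/* A function's address can be taken; a macro's cannot. */
int (*upper_fn)(int) = toupper;

int shout(int c)
{
    return upper_fn(c);     /* call through the pointer */
}

/* Parenthesizing the name also suppresses a function-like macro. */
int shout_again(int c)
{
    return (toupper)(c);
}
```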

Jerry Schwarz 
Bell Labs, MH  
ulysses!jss

kwh@bentley.UUCP (KW Heuer) (05/09/86)

In article <594@brl-smoke.ARPA> rbj@icst-cmr (Root Boy Jim) writes:
>Hey, the more primitives, the better. I also like(d) TECO. Just because
>a language is terse doesn't mean it's unreadable. In my opinion, people
>who laughed at the Greek were just too lazy to learn the lingo. APL is
>quite readable once you get used to it, especially with direct definition.

I don't often admit it, but I like both APL and TECO.  The major flaw (of
both) seems to be that people like to "optimize" for minimal source code,
and hence go around removing "unnecessary" comments and whitespace.  (From
your comments on your C style, I'd bet that you make heavy use of APL glue
like "+0x" or ",0R", even though it's more efficient to use a newline.)

I suspect that in part this is a result of interpretation.  I've seen it in
BASIC too.  A compiled language like C is hopefully less susceptible to
obfuscation (the upcoming contest notwithstanding); I think the only real
problem with adding more operators is resolving the precedence rules.  (I
like the way APL handles that, btw.)

Karl W. Z. Heuer (ihnp4!bentley!kwh), The Walking Lint