[comp.lang.c] STREQ

henry@utzoo.UUCP (Henry Spencer) (06/26/87)

> However, STREQ may have only minimal value, as any good strcmp()
> implementation will return immediately upon discovering that the first
> bytes of its arguments do not match.

Yes... but it still incurs the full call/return overhead, which is usually
an order of magnitude more expensive than a simple one-character compare.
STREQ and similar things produced a considerable improvement in the speed of
C news, which does a lot of string-bashing.
-- 
"There is only one spacefaring        Henry Spencer @ U of Toronto Zoology
nation on Earth today, comrade."   {allegra,ihnp4,decvax,pyramid}!utzoo!henry

henry@utzoo.UUCP (Henry Spencer) (07/08/87)

> >>#define    STREQ(a, b)    (*(a) == *(b) && strcmp((a), (b)) == 0)
> 
>   This will also greatly slow a good many programs down on machines
> that do not support byte addressing.

No, not really.  Think about it:  all the macro is doing is duplicating
the first test that strcmp has to do.  Strcmp has to do it anyway, and
will probably have to do it much the same way.  (There are tricks that
can be used to do some string operations a word at a time, but they fall
down badly when the two strings are not aligned the same way -- a frequent
occurrence.)  Like most things, it is a tradeoff:  one redundant test in
the case of similar strings, versus one less function call in the case of
dissimilar strings.  Even on machines that have trouble with byte addressing,
a character comparison is likely to be considerably cheaper than a function
call.  STREQ loses only if similar strings (similar in the first character,
anyway) are considerably more common than dissimilar ones.  Empirically, in
most programs most strings are dissimilar, so STREQ wins on almost any
machine.

(Even disregarding this, using STREQ rather than direct strcmp wins big
everywhere, because it gives you the ability to change your mind about such
things by changing one macro definition!)
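
A minimal sketch of that last point, not code from C news: the macro lives in one place, so the first-character shortcut can be switched off without touching any caller. The STREQ_PLAIN switch and the lookup() caller are hypothetical names invented for this illustration.

```c
#include <string.h>

/*
 * STREQ as quoted in this thread: compare the first characters inline
 * and call strcmp() only when they match.  Both arguments are evaluated
 * more than once, so they must not have side effects (no STREQ(p++, q)).
 *
 * STREQ_PLAIN is a hypothetical compile-time switch, not part of the
 * original posting; defining it falls back to a plain strcmp() call.
 */
#ifdef STREQ_PLAIN
#define STREQ(a, b)  (strcmp((a), (b)) == 0)
#else
#define STREQ(a, b)  (*(a) == *(b) && strcmp((a), (b)) == 0)
#endif

/* Hypothetical caller: return the index of name in table, or -1. */
static int lookup(const char *name, const char *const table[], int n)
{
    int i;

    for (i = 0; i < n; i++)
        if (STREQ(name, table[i]))
            return i;
    return -1;
}
```

In a table of mostly dissimilar strings, most iterations of the loop are settled by the inline character test and never reach strcmp().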
-- 
Mars must wait -- we have un-         Henry Spencer @ U of Toronto Zoology
finished business on the Moon.     {allegra,ihnp4,decvax,pyramid}!utzoo!henry

mick@auspyr.UUCP (Mick Andrew) (07/10/87)

in article <8277@utzoo.UUCP>, henry@utzoo.UUCP (Henry Spencer) says:
> 
>> >>#define    STREQ(a, b)    (*(a) == *(b) && strcmp((a), (b)) == 0)
>> 

This fine idea must be used with caution.

Note that strcmp() and its siblings correctly handle the case
of null pointers....
-- 
----
Mick    Austec, Inc.,   San Jose, CA 
	{sdencore,necntc,cbosgd,amdahl,ptsfa,dana}!aussjo!mick
	{styx,imagen,dlb,gould,sci,altnet}!auspyr!mick

john@auspyr.UUCP (John Weald) (07/11/87)

In article <7312@auspyr.UUCP> mick@auspyr.UUCP (Mick Andrew) writes:
>in article <8277@utzoo.UUCP>, henry@utzoo.UUCP (Henry Spencer) says:
>> 
>>> >>#define    STREQ(a, b)    (*(a) == *(b) && strcmp((a), (b)) == 0)
>>> 
>
>This fine idea must be used with caution.
>
>Note that strcmp() and its siblings correctly handle the case
>of null pointers....
>-- 
>----
>Mick    Austec, Inc.,   San Jose, CA 

Please excuse Mick: moments after he posted this he went off
on a three-week vacation back home to sunny old England. I think his
brain preceded him by about 5 hours!!

Strcmp may well die if handed null pointers; this is certainly true on
VMS (and he and I are doing the port to VMS!!).

BTW:
If I recall correctly, the standard AT&T C version of strcmp (SVR2) first
compares the two pointers to see whether they point to the same string. Is
this the normal case? I suspect not, so Henry's solution will again save this
comparison whenever the first characters differ (assuming that strcmp works
like the AT&T C version).
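
For illustration only, here is a sketch of a strcmp-like function with the pointer-identity test described above. It assumes the recollection is accurate; it is not the AT&T source, and my_strcmp is a hypothetical name.

```c
/*
 * Hypothetical strcmp-like function, written for illustration.
 * The first test is the pointer comparison discussed above: it pays off
 * only when both arguments are the very same pointer.
 */
static int my_strcmp(const char *s1, const char *s2)
{
    if (s1 == s2)               /* same pointer, hence same string */
        return 0;
    while (*s1 == *s2) {
        if (*s1 == '\0')        /* reached the end of both strings */
            return 0;
        s1++;
        s2++;
    }
    /* Compare as unsigned char, as the standard strcmp does. */
    return (unsigned char)*s1 - (unsigned char)*s2;
}
```

When STREQ rejects a pair on the first character, neither the call nor this pointer test is executed.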

Save your flames as Mick is not around to read them (save them until 8/3)!!!!

John Weald
-- 
UUCP: {sdencore,cbosgd,amdahl,ptsfa,dana}!aussjo!john
UUCP: {styx,imagen,dlb,gould,sci,altnet}!auspyr!john

guy%gorodish@Sun.COM (Guy Harris) (07/11/87)

> >> >>#define    STREQ(a, b)    (*(a) == *(b) && strcmp((a), (b)) == 0)
> 
> This fine idea must be used with caution.
> 
> Note that strcmp() and its siblings correctly handle the case
> of null pointers....

Anybody who feeds null pointers to "strcmp" or "STREQ" is screwing
up.  Since a null pointer does not point to any object, it is
meaningless to ask what "strcmp" should return; it returns an
indication of whether the string referred to by the first pointer
argument is less than, equal to, or greater than the second string.
Since a null pointer does not refer to ANY string, there is no way to
assign a meaning to a call to "strcmp" that passes a null pointer.
Anything "strcmp" wants to do when passed a NULL pointer, including
crashing the running program, is perfectly legitimate; there are
implementations of "strcmp" and company that do crash when handed
null pointers.
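
If a program can legitimately hold a null pointer at that point, the check belongs in the caller. A minimal sketch, assuming a program-specific policy that two null pointers count as equal and a null pointer never equals a string; streq_nullable is a hypothetical name, and the policy is this sketch's choice, not anything strcmp defines.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: decide the null-pointer cases explicitly before
 * anything is dereferenced or passed to strcmp().
 */
static int streq_nullable(const char *a, const char *b)
{
    if (a == NULL || b == NULL)
        return a == b;          /* equal only if both are null */
    return *a == *b && strcmp(a, b) == 0;
}
```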
	Guy Harris
	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
	guy@sun.com

devine@vianet.UUCP (Bob Devine) (07/15/87)

In article <8277@utzoo.UUCP>, henry@utzoo.UUCP (Henry Spencer) writes:
* > >>#define    STREQ(a, b)    (*(a) == *(b) && strcmp((a), (b)) == 0)
* > 
* >   This will also greatly slow a good many programs down on machines
* > that do not support byte addressing.
* 
* No, not really.  Think about it:  all the macro is doing is duplicating
* the first test that strcmp has to do.  Strcmp has to do it anyway, and
* will probably have to do it much the same way.  (There are tricks that
* can be used to do some string operations a word at a time, but they fall
* down badly when the two strings are not aligned the same way -- a frequent
* occurrence.)

  I used a 4-char compare inside a string compare for a spelling checker
that I wrote a looong time ago.  It sped the checker up considerably. 
This was for an implementation that I knew would give me the correct
alignment so that scarfing up the first 4 chars into an int was easy.
Plus I was only concerned about equal/not-equal; there was no need for
checking if a word was ordered before or after the second word.
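
A rough sketch of an equality-only compare of this kind, not the original spelling-checker code. It assumes every word sits in a buffer of at least 4 bytes that is zero-filled after the terminating NUL, and it uses memcpy() to load the bytes instead of relying on alignment. The name word_eq4 is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if the two words are equal, 0 otherwise.
 * Assumption: a and b each point to at least 4 readable bytes, zero-filled
 * past the terminating NUL.  Without the zero fill, equal short words
 * could differ in the padding bytes and be reported as unequal.
 */
static int word_eq4(const char *a, const char *b)
{
    uint32_t wa, wb;

    memcpy(&wa, a, sizeof wa);  /* load the first 4 chars of each word */
    memcpy(&wb, b, sizeof wb);
    if (wa != wb)
        return 0;               /* differ somewhere in the first 4 chars */
    return strcmp(a, b) == 0;   /* same prefix: compare the whole word */
}
```

Because only equal/not-equal matters, byte order inside the 32-bit word is irrelevant; an ordering compare could not use this shortcut on a little-endian machine.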

Bob Devine