[alt.sources] Fast strcmp

larocque@jupiter.crd.ge.com (David M. LaRocque) (09/26/90)

After I profiled my C program I discovered that the function
strcmp() takes one third of my program's CPU time.  I was hoping
someone may have written their own version of strcmp() that
outperforms the library's function.

Thanks, Dave

/**************************************************
 * larocque@crd.ge.com        (518) 387-5805
 * ...!crdgw1!cetus.crd.ge.com!larocque
 **************************************************/

fraser@edc.UUCP (Fraser Orr) (09/26/90)

In article <12145@crdgw1.crd.ge.com> larocque@jupiter.crd.ge.com (David M. LaRocque) writes:
>After I profiled my C program I discovered that the function
>strcmp() takes one third of my program's CPU time.  I was hoping
>someone may have written their own version of strcmp() that
>outperforms the library's function.

One quick dirty thing I did once was to change

	if (strcmp (a,b)==0)
to
	if (*a==*b && (strcmp(a.b)==0))

I seem to remember a remarkable performance improvement, like about 5
times faster.  Probably due to the fact that the program mainly did
strcmp and the strcmp was pretty bad.

Worth considering anyway.

==Fraser Orr <fraser@edc.uucp> +44 506 416778x206
UseNet: {uunet,sun}!atexnet!fraser JANet: fraser%edc@cs.hw.ac.uk

otto@tukki.jyu.fi (Otto J. Makela) (09/27/90)

In article <1646@cherry.edc.UUCP> fraser@edc.UUCP (Fraser Orr) writes:
   In article <12145@crdgw1.crd.ge.com> larocque@jupiter.crd.ge.com (David M. LaRocque) writes:
   >After I profiled my C program I discovered that the function
   >strcmp() takes one third of my program's CPU time.  I was hoping
   >someone may have written their own version of strcmp() that
   >outperforms the library's function.

   One quick dirty thing I did once was to change
	   if (strcmp (a,b)==0)
   to
	   if (*a==*b && (strcmp(a.b)==0))

   I seem to remember a remarkable performance improvement, like about 5
   times faster.  Probably due to the fact that the program mainly did
   strcmp and the strcmp was pretty bad.

This should obviously be:
	if(*a==*b && (strcmp(a,b)==0))
(just in case...)

I'd believe if your compiler is really brain-damaged, this could also help
a teeny weeny bit:
	if(*a==*b && (!strcmp(a,b)))
--
   /* * * Otto J. Makela <otto@jyu.fi> * * * * * * * * * * * * * * * * * * */
  /* Phone: +358 41 613 847, BBS: +358 41 211 562 (CCITT, Bell 24/12/300) */
 /* Mail: Kauppakatu 1 B 18, SF-40100 Jyvaskyla, Finland, EUROPE         */
/* * * Computers Rule 01001111 01001011 * * * * * * * * * * * * * * * * */

cedman@lynx.ps.uci.edu (Carl Edman) (09/27/90)

In article <OTTO.90Sep27145643@tukki.jyu.fi> otto@tukki.jyu.fi (Otto J. Makela) writes:
   In article <1646@cherry.edc.UUCP> fraser@edc.UUCP (Fraser Orr) writes:
      In article <12145@crdgw1.crd.ge.com> larocque@jupiter.crd.ge.com (David M. LaRocque) writes:
      >After I profiled my C program I discovered that the function
      >strcmp() takes one third of my program's CPU time.  I was hoping
      >someone may have written their own version of strcmp() that
      >outperforms the library's function.

      One quick dirty thing I did once was to change
	      if (strcmp (a,b)==0)
      to
	      if (*a==*b && (strcmp(a.b)==0))

      I seem to remember a remarkable performance improvement, like about 5
      times faster.  Probably due to the fact that the program mainly did
      strcmp and the strcmp was pretty bad.

   This should obviously be:
	   if(*a==*b && (strcmp(a,b)==0))
   (just in case...)

   I'd believe if your compiler is really brain-damaged, this could also help
   a teeny weeny bit:
	   if(*a==*b && (!strcmp(a,b)))

If you use a good compiler (like f.e. gcc) you should get an improvement
better than the above by simply inline-ing a self-written strcmp function.
Another thing oyu might consider and which would be useful for some
types of machines like f.e. 680x0 , would be to do the first few cmp-s
as byte-cmps until you get to a longword boundary, after which you do
longword compares. When you have got long strings which often match in
the first part (or you often compare matching strings) then this could
give you a really remarkable preformance improvement.
Maybe the best way to fix your problem would be to find our if you REALLY
need that many strcmp-s. If you f.e. do some kind of parseing and simply
check the text against a list of keywords there are MANY better solutions,
from sort-ing the keyword list and using a binary search, to using a
hash-function (there are programms which can help you create them), to
using lex.

Theorectial Physicist,N.:A physicist whose   | Send mail
existence is postulated, to make the numbers |  to
balance but who is never actually observed   | cedman@golem.ps.uci.edu
in the laboratory.                           | edmanc@uciph0.ps.uci.edu

dfs@doe.carleton.ca (David F. Skoll) (09/27/90)

In article <CEDMAN.90Sep27075013@lynx.ps.uci.edu>
cedman@lynx.ps.uci.edu (Carl Edman) writes:

>      One quick dirty thing I did once was to change
>	      if (strcmp (a,b)==0)
>      to
>	      if (*a==*b && (strcmp(a,b)==0))
>
>      I seem to remember a remarkable performance improvement, like about 5
>      times faster.  Probably due to the fact that the program mainly did
>      strcmp and the strcmp was pretty bad.

Hmm... that seems strange.  If the first characters of the strings differ,
most strcmps will not bother to test the rest.  All that the above code
possibly saves you is a function call/return.  If this makes such a huge
difference, then the compiler or your machine's architecture must be pretty
bad.
--
David F. Skoll        | Department of Electronics | Opinions expressed here are
dfs@doe.carleton.ca   | Carleton University       | my own and not necessarily
(613) 788-5771 | 5772 | Ottawa, Ontario, Canada   | those of my employer.

cedman@lynx.ps.uci.edu (Carl Edman) (09/29/90)

In article <1990Sep27.151543.8025@ccs.carleton.ca> dfs@doe.carleton.ca (David F. Skoll) writes:
   In article <CEDMAN.90Sep27075013@lynx.ps.uci.edu>
   cedman@lynx.ps.uci.edu (Carl Edman) writes:

   >      One quick dirty thing I did once was to change
   >	      if (strcmp (a,b)==0)
   >      to
   >	      if (*a==*b && (strcmp(a,b)==0))
   >
   >      I seem to remember a remarkable performance improvement, like about 5
   >      times faster.  Probably due to the fact that the program mainly did
   >      strcmp and the strcmp was pretty bad.

   Hmm... that seems strange.  If the first characters of the strings differ,
   most strcmps will not bother to test the rest.  All that the above code
   possibly saves you is a function call/return.  If this makes such a huge
   difference, then the compiler or your machine's architecture must be pretty
   bad.

I did NOT write the above original article ! Please get your attributions
right. All I wrote was one response to it, which is not quoted here.

As to the content: Yes, all that saves is the function call overhead
but that can be quite a substantial amount even on machines with good
compilers. That is why I suggested inline-ing and (under some circumstances)
a rewritten strcmp which uses longword compares. Another possibility which
comes to mind when every string is compared very often, is to create a
string structure (or better class, long live C++ ! :-) which calculates
a 32-bit CRC for each string the first time and stores it somewhere.
Then only 1 (inlined) longword-compare will do the stringcomparisons
for you.

Theorectial Physicist,N.:A physicist whose   | Send mail
existence is postulated, to make the numbers |  to
balance but who is never actually observed   | cedman@golem.ps.uci.edu
in the laboratory.                           | edmanc@uciph0.ps.uci.edu

davidsen@sixhub.UUCP (Wm E. Davidsen Jr) (09/30/90)

In article <1990Sep27.151543.8025@ccs.carleton.ca> dfs@doe.carleton.ca (David F. Skoll) writes:

| Hmm... that seems strange.  If the first characters of the strings differ,
| most strcmps will not bother to test the rest.  All that the above code
| possibly saves you is a function call/return.  If this makes such a huge
| difference, then the compiler or your machine's architecture must be pretty
| bad.

  If the test of the first character saves a procedure call, then no
matter how good the compiler is it will be faster not to do the call.
-- 
bill davidsen - davidsen@sixhub.uucp (uunet!crdgw1!sixhub!davidsen)
    sysop *IX BBS and Public Access UNIX
    moderator of comp.binaries.ibm.pc and 80386 mailing list
"Stupidity, like virtue, is its own reward" -me

AlexK@tharr.UUCP (Alex Kiernan) (10/01/90)

>  If the test of the first character saves a procedure call, then no
>matter how good the compiler is it will be faster not to do the call.

The Lattice C compiler uses a builtin strcmp, as mandated by ANSI, so often it
can generate great code, taking advantage of all sorts of alignment/length
knowledge. BTW this the LC 68000 code generator, I don't know about other 
platforms. 

Alex K.                                                               
-- 
<-- tharr *free* public access to Usenet in the UK 0234 261804 -->

sartin@hplabsz.HPL.HP.COM (Rob Sartin) (10/04/90)

In article <CEDMAN.90Sep29091115@lynx.ps.uci.edu> cedman@lynx.ps.uci.edu (Carl Edman) writes:
>string structure (or better class, long live C++ ! :-) which calculates
>a 32-bit CRC for each string the first time and stores it somewhere.
>Then only 1 (inlined) longword-compare will do the stringcomparisons
>for you.

Afraid not.  It'll give you an estimate of whether the strings match
(correctly identifying those that don't).  You will need to then
actually compare the strings if they are the same.  This method will
also be unable to reproduce strcmp's behavior (strcmp returns a signed
result indicated the <, =, > by being negative, zero, positive), it will
only return a boolean (match, no match).

If you do lots of string comparisons strictly for equality/inequality it
might be worth investigating.

Rob Sartin					uucp: hplabs!sartin
"Some may say that I have gone astray.	    internet: sartin@hplabs.hp.com
How would they know? They never follow."

sartin@hplabsz.HPL.HP.COM (Rob Sartin) (10/04/90)

In article <6003@hplabsz.HPL.HP.COM> I write:
>(correctly identifying those that don't).  You will need to then
>actually compare the strings if they are the same.  This method will

Of course I should have said:

You will need to then actually compare the strings if the CRC's/checksum's
are the same.

Rob

markh@csd4.csd.uwm.edu (Mark William Hopkins) (10/04/90)

In article <12145@crdgw1.crd.ge.com> larocque@jupiter.crd.ge.com (David M. LaRocque) writes:
>After I profiled my C program I discovered that the function
>strcmp() takes one third of my program's CPU time.  I was hoping
>someone may have written their own version of strcmp() that
>outperforms the library's function.

Look at some references on the architecture of the machine your system is
running on.  It will most likely have hardware-implemented string instructions
that you can use.

On our machine, the strcmp library routine is really nothing more than a single
assembly-language instruction with a couple register-initializations.  It will
almost certainly run far faster than any equivalent high-level source.

meissner@osf.org (Michael Meissner) (10/04/90)

In article <6757@uwm.edu> markh@csd4.csd.uwm.edu (Mark William
Hopkins) writes:

| Path: paperboy!snorkelwacker!bionet!uwm.edu!csd4.csd.uwm.edu!markh
| From: markh@csd4.csd.uwm.edu (Mark William Hopkins)
| Newsgroups: alt.sources
| Date: 4 Oct 90 06:01:08 GMT
| References: <12145@crdgw1.crd.ge.com>
| Sender: news@uwm.edu
| Reply-To: markh@csd4.csd.uwm.edu (Mark William Hopkins)
| Organization: University of Wisconsin-Milwaukee
| Lines: 13
| 
| In article <12145@crdgw1.crd.ge.com> larocque@jupiter.crd.ge.com (David M. LaRocque) writes:
| >After I profiled my C program I discovered that the function
| >strcmp() takes one third of my program's CPU time.  I was hoping
| >someone may have written their own version of strcmp() that
| >outperforms the library's function.
| 
| Look at some references on the architecture of the machine your system is
| running on.  It will most likely have hardware-implemented string instructions
| that you can use.
| 
| On our machine, the strcmp library routine is really nothing more than a single
| assembly-language instruction with a couple register-initializations.  It will
| almost certainly run far faster than any equivalent high-level source.

While it is not probably true in this case, some machine instructions
which do strcmp are actually slower than exceedingly tight asm code
that a real programmer can produce.  It depends on many factors,
including how much of the chip area supports the string instructions
or whether they are microcoded.  For example, the original microvax
did not have some of the string instructions in the hardware, and
trapped to the OS for emulation.  Another example is on one other
specific machine within an architecture family was found to do string
moves slower than the naive load byte/store byte sequence!

--
Michael Meissner	email: meissner@osf.org		phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142

Do apple growers tell their kids money doesn't grow on bushes?

xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) (10/05/90)

falk@peregrine.Sun.COM (Ed Falk) writes:
> sartin@hplabs.hp.com (Rob Sartin) writes:
>> cedman@lynx.ps.uci.edu (Carl Edman) writes:
>>>string structure (or better class, long live C++ ! :-) which calculates
>>>a 32-bit CRC for each string the first time and stores it somewhere.
>>>Then only 1 (inlined) longword-compare will do the stringcomparisons
>>>for you.
>>
>>Afraid not.  It'll give you an estimate of whether the strings match
>>(correctly identifying those that don't).  You will need to then
>>actually compare the strings if they are the same.  This method will
>>also be unable to reproduce strcmp's behavior (strcmp returns a signed
>>result indicated the <, =, > by being negative, zero, positive), it will
>>only return a boolean (match, no match).
>
>Also, these two strings
>
>	"ab\0x"
>	"ab\0y"
>
>(where x and y are any garbage that happens to be in memory after the
>terminating '\0') will be evaluated as unequal.

For an exclusive or or other 16 bit at a time checksum, perhaps, but a
properly designed CRC (which is a bitwise operation on the string) would
stop at the end of the byte before the null byte.

Kent, the man from xanth.
<xanthian@Zorch.SF-Bay.ORG> <xanthian@well.sf.ca.us>

falk@peregrine.Sun.COM (Ed Falk) (10/05/90)

In article <6003@hplabsz.HPL.HP.COM> sartin@hplabs.hp.com (Rob Sartin) writes:
>In article <CEDMAN.90Sep29091115@lynx.ps.uci.edu> cedman@lynx.ps.uci.edu (Carl Edman) writes:
>>string structure (or better class, long live C++ ! :-) which calculates
>>a 32-bit CRC for each string the first time and stores it somewhere.
>>Then only 1 (inlined) longword-compare will do the stringcomparisons
>>for you.
>
>Afraid not.  It'll give you an estimate of whether the strings match
>(correctly identifying those that don't).  You will need to then
>actually compare the strings if they are the same.  This method will
>also be unable to reproduce strcmp's behavior (strcmp returns a signed
>result indicated the <, =, > by being negative, zero, positive), it will
>only return a boolean (match, no match).

Also, these two strings

	"ab\0x"
	"ab\0y"

(where x and y are any garbage that happens to be in memory after the
terminating '\0') will be evaluated as unequal.

There are workarounds for both problems, of course, but I think there
won't be much of a performance improvement after you've done all it
requires to get it right.

		-ed falk, sun microsystems
		 sun!falk, falk@sun.com
		 card-carrying ACLU member.

cedman@lynx.ps.uci.edu (Carl Edman) (10/05/90)

In article <1145@exodus.Eng.Sun.COM> falk@peregrine.Sun.COM (Ed Falk) writes:
   In article <6003@hplabsz.HPL.HP.COM> sartin@hplabs.hp.com (Rob Sartin) writes:
   >In article <CEDMAN.90Sep29091115@lynx.ps.uci.edu> cedman@lynx.ps.uci.edu (Carl Edman) writes:
   >>string structure (or better class, long live C++ ! :-) which calculates
   >>a 32-bit CRC for each string the first time and stores it somewhere.
   >>Then only 1 (inlined) longword-compare will do the stringcomparisons
   >>for you.
   >
   >Afraid not.  It'll give you an estimate of whether the strings match
   >(correctly identifying those that don't).  You will need to then
   >actually compare the strings if they are the same.  This method will
   >also be unable to reproduce strcmp's behavior (strcmp returns a signed
   >result indicated the <, =, > by being negative, zero, positive), it will
   >only return a boolean (match, no match).

   Also, these two strings

	   "ab\0x"
	   "ab\0y"

   (where x and y are any garbage that happens to be in memory after the
   terminating '\0') will be evaluated as unequal.

   There are workarounds for both problems, of course, but I think there
   won't be much of a performance improvement after you've done all it
   requires to get it right.

Please , take note ! I did NOT suggest that this method is always the
best, but I gave 3 or 4 different methods, and some hints, about when
I would use which one.
And I still hold that the CRC-method could be by far the most
efficient under some conditions:

	- each word is checked often. Then a small overhead like f.e.
	cleaning up the garbage behind the end of the string for the
	one time calculation wouldn't matter

	- strings are long, so the overhead of 32-bit per string is
	justified

	- you only need to know if 2 strings are identical

	- most compares between unequal strings

You can even imagine systems where it could be possible to remove
the actual string from memory, and simply assume that if the 32-bit
CRC match, the strings match. Such systems would have to be tolerant
about an occasional mismatch.

If memory serves correctly the above approach is used in some
implementations for diff. (only to give one practical, real world
example)

	Carl Edman

Theorectial Physicist,N.:A physicist whose   | Send mail
existence is postulated, to make the numbers |  to
balance but who is never actually observed   | cedman@golem.ps.uci.edu
in the laboratory.                           | edmanc@uciph0.ps.uci.edu

srt@aerospace.aero.org (Scott "TCB" Turner) (10/06/90)

In article <CEDMAN.90Oct4171710@lynx.ps.uci.edu> cedman@lynx.ps.uci.edu (Carl Edman) writes:
>... I still hold that the CRC-method could be by far the most
>efficient under some conditions:
>
>	- each word is checked often.
>	- strings are long... [etc]

Spell-checking is an interesting problem which involves comparisons
between many strings, and which admits of many algorithmic
improvements.  There was a very good article on how the Unix "spell"
program works in a past issue of Communications, probably in a Pearls"
"column.  I can't recall enough of the article to go into
detail here, but the algorithm finally developed involves a checksum
type approach similar to what Carl suggested.

					-- Scott Turner

peter@ficc.ferranti.com (Peter da Silva) (10/07/90)

Hey, folks... ever hear of alt.sources.d?
-- 
Peter da Silva.   `-_-'
+1 713 274 5180.   'U`
peter@ferranti.com

matloff@heather.ucdavis.edu (Norm Matloff) (10/10/90)

Test.