[alt.hackers] Useful string function

hollombe@ttidca.TTI.COM (The Polymath) (06/26/91)

Here's a simple minor hack I threw together.  It's proved so useful, I'm
amazed it isn't part of the standard Unix libraries. (C'mon.  Even BASIC
does this)!  Anyway, it's a function that finds a substring within a
string and returns its starting location or -1 if the substring isn't
there.  I'm sure you can think of many variations and improvements.  This
version does what I need it to.  Enjoy.

int Strloc (s1, s2)
char *s1;           /* String to be searched */
char *s2;           /* String to search for  */
{

               /*****************************************/
               /* Locate substring s2 within string s1. */
               /* Return starting position of s2.       */
               /* Return -1 if s2 not found in s1.      */
               /*                                       */
	       /* Requires string.h.                    */
	       /* Calls strlen().                       */
               /*                                       */
	       /* Placed in public domain 25 June 1991  */
	       /*                  by                   */
	       /*       G. Hollombe (The Polymath)      */
               /*****************************************/

     register int i, j;       /* Loop indices       */
     register int s1_limit;   /* Search limit in s1 */
     register int s2_length;  /* Length of s2       */

     s2_length = strlen (s2);                /* Minimize function calls.       */
     s1_limit = strlen (s1) - s2_length;     /* If it ain't in here, it ain't. */

     for (i = 0; i <= s1_limit; i++)         /* Search s1 until not enough  */
     {                                       /* characters left to hold s2. */
	  for (j = 0; j < s2_length; j++)
	  {
	       if (s1[i + j] != s2[j])       /* Mismatch? */
	       {
		    if (j > 0)               /* Yes, move i to start of mis- */
			 i += (j - 1);       /* match (but don't back up)    */
		    break;                   /* and start again.             */
	       } /* end if */

	  } /* end for */

	  if (j >= s2_length)                /* All matched?       */
	       break;                        /* Yes, quit looking. */

     } /* end for */

     if (i > s1_limit)                       /* Found s2? */
	  return (-1);                       /* No, return -1 */
     else
	  return (i);                        /* Yes, return position */

} /* end Strloc() */

-- 
The Polymath (aka: Jerry Hollombe, M.A., CDP, aka: hollombe@ttidca.tti.com)
Head Robot Wrangler at Citicorp                   Turn the rascals out!
3100 Ocean Park Blvd.   (213) 450-9111, x2483     No incumbents in '92!
Santa Monica, CA  90405 {rutgers|pyramid|philabs|psivax}!ttidca!hollombe

volpe@camelback.crd.ge.com (Christopher R Volpe) (06/26/91)

In article <27187@ttidca.TTI.COM>, hollombe@ttidca.TTI.COM (The
Polymath) writes:
|>Here's a simple minor hack I threw together.  It's proved so useful, I'm
|>amazed it isn't part of the standard Unix libraries.

That's because it's part of the standard C library, and it's called
"strstr". Check the man page.

|>  (C'mon.  Even BASIC
|>does this)!  Anyway, it's a function that finds a substring within a
|>string and returns its starting location or -1 if the substring isn't
|>there.  I'm sure you can think of many variations and improvements.  This
|>version does what I need it to.  Enjoy.
                                                              

==================
Chris Volpe
G.E. Corporate R&D
volpecr@crd.ge.com
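
For readers who have strstr() and still want the Strloc()-style result
(an index, or -1 when there is no match), here is a minimal sketch of a
wrapper.  The name strloc_via_strstr is hypothetical and does not appear
anywhere in the thread; it assumes an ANSI <string.h> that declares
strstr().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert strstr()'s pointer result into an
 * index into s1, or -1 if s2 does not occur in s1. */
long strloc_via_strstr(const char *s1, const char *s2)
{
    const char *hit = strstr(s1, s2);

    /* hit points inside s1, so the pointer difference is the offset. */
    return hit ? (long)(hit - s1) : -1L;
}
```

Note that strstr() returns s1 itself for an empty s2, so this wrapper
returns 0 in that case.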

pierce@watson.ibm.com (06/26/91)

From article <27187@ttidca.TTI.COM>, by hollombe@ttidca.TTI.COM (The Polymath):
> Here's a simple minor hack I threw together.  It's proved so useful, I'm
> amazed it isn't part of the standard Unix libraries. (C'mon.  Even BASIC
> does this)!  Anyway, it's a function that finds a substring within a
> string and returns its starting location or -1 if the substring isn't
> there.

It may not be part of ANSI C, but every halfway respectable compiler
I've ever used has had a strstr() function that returns a pointer to
the substring.

--
____ Tim Pierce               / They call television a medium.  That is
\  / pierce@watson.ibm.com    / because it is neither rare nor well done.
 \/  twpierce@amh.amherst.edu /   -- Ernie Kovacs

hollombe@ttidca.TTI.COM (The Polymath) (06/27/91)

In article <27187@ttidca.TTI.COM> hollombe@ttidca.TTI.COM (The Polymath) writes:
}Here's a simple minor hack ...
}... a function that finds a substring within a
}string and returns its starting location or -1 if the substring isn't
}there.  ...

My thanks to all who wrote to tell me about the strstr(3), or equivalent,
function in their libraries.  The version of SysV we use doesn't include
it or anything like it.  Neither does our 4.3 BSD.

To all others in my position I say again, enjoy.  To those who have
strstr(3) available to them, never mind (-: (unless you need a PD
version).

(And to those who think they can improve on my code -- go for it.  I make
no claim perfection).

-- 
The Polymath (aka: Jerry Hollombe, M.A., CDP, aka: hollombe@ttidca.tti.com)
Head Robot Wrangler at Citicorp                   Turn the rascals out!
3100 Ocean Park Blvd.   (213) 450-9111, x2483     No incumbents in '92!
Santa Monica, CA  90405 {rutgers|pyramid|philabs|psivax}!ttidca!hollombe

imp@solbourne.com (Warner Losh) (06/27/91)

In article <1991Jun26.164123.6281@watson.ibm.com> pierce@watson.ibm.com () writes:
>It may not be part of ANSI C, but every halfway respectable compiler
>I've ever used has had a strstr() function that returns a pointer to
>the substring.

And for the ones that didn't, it was easy enough to write:

char *strstr (register char *str, register char *substr)
{
	register int len = strlen (substr);

	for (str = strchr (str, *substr);
	     str && strncmp (str, substr, len) != 0;
	     str = strchr (str + 1, *substr))
		continue;

	return str;
}

[[ This is typed from memory, so I may have a + 1 where I don't need
   it or lack one where I do ]]

Warner
-- 
Warner Losh		imp@Solbourne.COM
But it was our hill.  And they were our beans.
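
The strchr()/strncmp() approach above can be sketched as a self-contained
function.  This is only a sketch under stated assumptions: the name
find_substr is hypothetical (chosen so it does not clash with a library
strstr()), size_t replaces int for the length, and the empty-substring
guard is an addition so that the result for "" is the start of the string,
as ANSI strstr() specifies.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name; same idea as the posted loop: jump to each
 * occurrence of the first character, then compare the rest. */
char *find_substr(const char *str, const char *substr)
{
    size_t len = strlen(substr);

    /* Added guard (not in the posted code): an empty substring
     * matches at the start of str. */
    if (len == 0)
        return (char *)str;

    for (str = strchr(str, *substr);
         str != NULL && strncmp(str, substr, len) != 0;
         str = strchr(str + 1, *substr))
        continue;

    return (char *)str;   /* NULL if no candidate matched */
}
```

The "+ 1" in the loop's third clause is needed: without it, strchr() would
keep returning the same candidate position.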

ns@csd.cri.dk (Nick Sandru) (06/27/91)

pierce@watson.ibm.com writes:

>It may not be part of ANSI C, but every halfway respectable compiler
>I've ever used has had a strstr() function that returns a pointer to
>the substring.

strstr() is part of ANSI C. But some C compilers lack it (the one that
comes with DEC Ultrix, for example).


-- 
Nick Sandru - System administrator   | e-mail: ns@csd.cri.dk
Columbus Space Station SDE Project   |
Computer Resources International A/S | phone:  +45 45 82 21 00 x2036 (office)
Bregnerodvej 144                     |         +45 47 98 06 27       (home)
DK-3460 Birkerod, Denmark            | fax:    +45 45 82 17 11

karl@ima.isc.com (Karl Heuer) (06/27/91)

In <27212@ttidca.TTI.COM> hollombe@ttidca.TTI.COM (The Polymath) writes:
>In <27187@ttidca.TTI.COM> hollombe@ttidca.TTI.COM (The Polymath) writes:
>}... a function that finds a substring within a string and returns its
>}starting location or -1 if the substring isn't there.  ...
>
>My thanks to all who wrote to tell me about the strstr(3), or equivalent,
>function in their libraries.  [We don't have it here.]
>To all others in my position I say again, enjoy.

To you and all others in your position, I would recommend that you *not* use
Strloc(), because it's better to create your own implementation of strstr()
that agrees with the ANSI specs.  Then when strstr() does become ubiquitous,
you won't have future generations scratching their heads wondering why your
programs use an incompatible variant.

>(And to those who think they can improve on my code -- go for it.  I make
>no claim perfection).

I rather like this implementation, myself; it's small and contains no function
calls.  Speed improvements are possible, but I wouldn't bother with minor
tweaks--if the speed is necessary, just go straight to Boyer-Moore.

Karl W. Z. Heuer (karl@ima.isc.com or uunet!ima!karl), The Walking Lint
________
#if defined(NEED_STRSTR) /* deassert if you already have an ANSI library */
/* Public Domain strstr() implementation by Karl Heuer */
#include <stddef.h> /* for NULL */
char *strstr(register char const *s, register char const *t) {
    do {
	register char const *ss = s;
	register char const *tt = t;
	do {
	    if (*tt == '\0') return ((char *)s);
	} while (*ss++ == *tt++);
    } while (*s++ != '\0');
    return (NULL);
}
#endif
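
For a sense of what "going to Boyer-Moore" looks like, here is a minimal
sketch of the Horspool simplification, which uses only a bad-character
shift table.  It is not the full Boyer-Moore algorithm and is not code
from the thread; the name bmh_search is hypothetical, and the sketch
assumes an ANSI library with memcmp() and <limits.h>.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name.  Returns a pointer to the first occurrence of
 * pat in text, or NULL if there is none (same contract as strstr()). */
char *bmh_search(const char *text, const char *pat)
{
    size_t n = strlen(text);
    size_t m = strlen(pat);
    size_t shift[UCHAR_MAX + 1];
    size_t i;

    if (m == 0)
        return (char *)text;      /* empty pattern matches at the start */
    if (m > n)
        return NULL;

    /* Default: a character not in the pattern lets us skip m places. */
    for (i = 0; i <= UCHAR_MAX; i++)
        shift[i] = m;
    /* Otherwise skip by the distance from its last occurrence
     * (excluding the final pattern character) to the pattern's end. */
    for (i = 0; i + 1 < m; i++)
        shift[(unsigned char)pat[i]] = m - 1 - i;

    /* Slide the window, shifting on the text character that is
     * aligned with the last pattern position. */
    for (i = 0; i + m <= n; i += shift[(unsigned char)text[i + m - 1]]) {
        if (memcmp(text + i, pat, m) == 0)
            return (char *)(text + i);
    }
    return NULL;
}
```

The table setup costs a few hundred stores per call, so for short strings
the small loop above is likely to be just as good.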

hollombe@ttidca.TTI.COM (The Polymath) (06/28/91)

In article <20919@crdgw1.crd.ge.com> volpe@camelback.crd.ge.com (Christopher R Volpe) writes:
}In article <27187@ttidca.TTI.COM>, hollombe@ttidca.TTI.COM (The
}Polymath) writes:
}|>Here's a simple minor hack I threw together.  It's proved so useful, I'm
}|>amazed it isn't part of the standard Unix libraries.
}
}That's because it's part of the standard C library, and it's called
}"strstr". Check the man page.

Two comments, then I'm going to shut up about this and stop answering
my mail. (-:

     1)  I'm getting mildly annoyed that so many people assume I was too
         stupid to RTFM before writing the above mentioned function.  One
         more time:  Neither our version of SysV nor our version of 4.3
         bsd have the strstr() function in their libraries.  I looked.
         Really.

     2)  So far, I've received two bug reports on the function as posted:

	  It incorrectly returns 0 if s2 is null.
	  (Thanks to jwahar r. bammi for finding this).

	  It incorrectly returns -1 when looking for "ABABD" in "ABABABD".
	  (Thanks to Pat Place).

         Sorry about that.  Corrections for these problems, and any
         others, are left as an exercise for the reader.  I'm not going to
         post version 1.1 and get my mailbox flooded all over again.  I've
	 made an appointment with my doctor to see if he can extract my
	 foot from my mouth.

~sigh~  Didja ever have one of those years ... ?

-- 
The Polymath (aka: Jerry Hollombe, M.A., CDP, aka: hollombe@ttidca.tti.com)
Head Robot Wrangler at Citicorp                   Turn the rascals out!
3100 Ocean Park Blvd.   (213) 450-9111, x2483     No incumbents in '92!
Santa Monica, CA  90405 {rutgers|pyramid|philabs|psivax}!ttidca!hollombe
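
The second bug report comes from the "i += (j - 1)" step: after a partial
match fails, the posted loop jumps ahead to the mismatch position and can
step over a match that starts inside the part already compared.  With
"ABABD" in "ABABABD", the mismatch at j = 4 moves i from 0 to 4, past the
real match at 2.  A minimal sketch of one possible correction, which just
drops the skip, follows.  The name strloc_fixed and the ANSI-style
prototype are assumptions, not the author's version 1.1.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical corrected variant of Strloc(): returns the index of
 * the first occurrence of s2 in s1, or -1 if s2 is not found. */
int strloc_fixed(const char *s1, const char *s2)
{
    size_t len1 = strlen(s1);
    size_t len2 = strlen(s2);
    size_t i, j;

    if (len2 > len1)
        return -1;                 /* s2 cannot fit in s1 */

    /* Try every start position; never skip ahead after a mismatch. */
    for (i = 0; i + len2 <= len1; i++) {
        for (j = 0; j < len2; j++) {
            if (s1[i + j] != s2[j])
                break;
        }
        if (j == len2)
            return (int)i;         /* all of s2 matched at position i */
    }
    return -1;
}
```

This sketch still returns 0 for an empty s2, which agrees with what ANSI
strstr() does (it returns s1).  A caller that wants -1 in that case, as
the first bug report suggests, would need to test for it explicitly.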

Jari.Karjala@hut.fi (06/29/91)

In article <27187@ttidca.TTI.COM> hollombe@ttidca.TTI.COM (The Polymath) writes:
>(And to those who think they can improve on my code -- go for it.  I make
>no claim perfection).

	Here's a simple one I once translated from Pascal; the original
	is in Sedgewick's good book "Algorithms" (which is even better
	now that there is a version with C examples). (Add 'register'
	keywords in suitable positions if your compiler needs them.)

--clip--
/* Returns pointer to string p in string a or NULL if not found */
char *strstr(a,p)
char	*a,*p;
{
	char	*c=p;
	while (*a && *p)
		if (*a++ != *p++)
			a -= p-c-1, p = c;		
	if (*p) return NULL;
	else return a-(p-c);
}
--clip--

--
/*--- Jari.Karjala@hut.fi -- The World is Just a Huge Fractal ---*/ 
float O,I,o=0.075,h=1.5,T= -2,r,l;main(){int _=0,L=80,s=3200;for(;s
%L|| (h-=o,T= -2),s;(4-(r=O*O)<(l=I*I)|++_==L)&&write(1,(--s%L?_<L?
--_%6:6:7)+"World! \n",1)&&(O=I=l=_=r=0,T+=o/2))O=I*2*O+h,I=l+T-r;}

john@iastate.edu (John Hascall) (06/30/91)

karl@ima.isc.com (Karl Heuer) writes:
}To you and all others in your position, I would recommend that you *not* use
}Strloc(), because it's better to create your own implementation of strstr()
}that agrees with the ANSI specs.  Then when strstr() does become ubiquitous...

}>(And to those who think they can improve on my code -- go for it.  I make
}>no claim perfection).
          ^
          to  ;-)

}char *strstr(register char const *s, register char const *t) {
}    do {
}	register char const *ss = s;
            :

While that is certainly a nice implementation of strstr(), if your compiler
is ANSIfied enough to have function prototypes and "const", isn't there a
pretty good chance it's ANSIfied enough to have strstr() already, too?

John