[comp.sys.atari.st] string comparisons in C

covertr@gtephx.UUCP (Richard E. Covert) (07/14/89)

	This is not a flame.

	Mark Williams C is a wonder. The manual is so full of GEM/AES/VDI
hints that it still amazes me after two years! Anyway, C has a lack of good
string handling operations, so you need to use library functions. I was writing
a program in which I needed to determine whether a certain file type was a 
member of a desired set. So, how do you do string comparisons in C?? The
normal, portable, way is to do a char search and match. Slow and ugly.

	But, I was browsing thru the MWC manual, and lo and behold I see
pnmatch(). Now pnmatch is a wonderful little function which does string
comparisons. And it even accepts wildcards, so I was in business. I just
build an array of strings such as "*.PI1", and then by looping thru the
array I can string compare a user inputted filename against the list of
legal filetypes. Pretty neat solution.

	So, the moral of this little ditty, is READ YOUR MANUAL!!

P.S. Does anyone know if pnmatch() is implemented on other C compilers??

Richard (gtephx!covertr) Covert

leo@philmds.UUCP (Leo de Wit) (07/14/89)

In article <44672745.14a1f@gtephx.UUCP> covertr@gtephx.UUCP (Richard E. Covert) writes:
|	But, I was browsing thru the MWC manual, and lo and behold I see
|pnmatch(). Now pnmatch is a wonderful little function which does string
|comparisons. And it even accepts wildcards, so I was in business. I just
|build an array of strings such as "*.PI1", and then by looping thru the
|array I can string compare a user inputted filename against the list of
|legal filetypes. Pretty neat solution.
|
|	So, the moral of this little ditty, is READ YOUR MANUAL!!
|
|P.S. Does anyone know if pnmatch() is implemented on other C compilers??

Lattice C has stcpm() and stcpma() for unanchored and anchored pattern
matching. The BSD C libraries have regcomp() and regex() for regular
expression pattern matching (which probably goes a lot further than
any of pnmatch(), stcpm() or stcpma()).

The drawback of all these wonderful functions is that they are hardly
standardized, so you lose if portability is at stake (and it is more
often than you'd hope for). For maximum portability, use the functions
that are in the ANSI draft, and create your own library for functions
like pnmatch() that aren't there (it is very easy to make a general
wildcard pattern matcher). You'll be grateful for this advice when
you switch to another compiler, or your vendor doesn't support this
neat little function in the next release, or you're porting to a
different system, or ...
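
To give a rough idea of how small such a matcher can be, here is a
minimal sketch. The name wild_match() and its argument order are made
up for this example and come from no vendor library. It assumes only
two metacharacters, '*' and '?', and does a case-sensitive compare;
whatever else pnmatch(), stcpm() or stcpma() support is not reproduced.

```c
#include <assert.h>
#include <stddef.h>

/*
 * wild_match: hypothetical helper, not taken from any vendor library.
 * Returns 1 if name matches pattern, 0 otherwise.
 *   '*' matches any run of characters, including an empty one
 *   '?' matches exactly one character
 * Every other character must match itself exactly (case-sensitive).
 */
int wild_match(const char *pattern, const char *name)
{
    assert(pattern != NULL && name != NULL);

    for (; *pattern != '\0'; pattern++, name++) {
        if (*pattern == '*') {
            /* Treat a run of stars as a single star. */
            while (*pattern == '*')
                pattern++;
            if (*pattern == '\0')
                return 1;       /* trailing '*' swallows the rest */
            /* Try the remaining pattern against every tail of name. */
            for (; *name != '\0'; name++)
                if (wild_match(pattern, name))
                    return 1;
            return 0;
        }
        if (*name == '\0')
            return 0;           /* name ran out before the pattern */
        if (*pattern != '?' && *pattern != *name)
            return 0;           /* literal character mismatch */
    }
    /* Pattern is used up; it is a match only if name is too. */
    return *name == '\0';
}
```

With this, wild_match("*.PI1", "TITLE.PI1") returns 1 and
wild_match("*.PI1", "TITLE.DOC") returns 0. The recursion is the
simplest approach, not the fastest; patterns with many stars can
backtrack a lot.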

    Leo.

scs@adam.pika.mit.edu (Steve Summit) (07/15/89)

In article <44672745.14a1f@gtephx.UUCP> covertr@gtephx.UUCP (Richard E. Covert) writes:
>P.S. Does anyone know if pnmatch() is implemented on other C compilers??

No vendor should provide a routine named "pnmatch."  Vendors are
not supposed to pollute the namespace with "convenient" (but
invariably unportable and system-specific) routines.  ("Then why
do so many vendors do so?" you ask.)  Vendor-supplied routines not
mentioned in the standard are supposed to have names beginning
with at least one underscore (e.g. "_pnmatch").

Similarly, portable programs cannot really use these extensions,
no matter how convenient they may be.  When extensions are used,
they should be hidden behind at least one function call; that is,
don't call pnmatch directly, but rather invent your own routine --
"match_filenames" or something, which calls pnmatch.  Then, when
you port your code to a different system that doesn't have
pnmatch or uses some wildly different wildcard mechanism, you
only have to rewrite match_filenames().
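
Something like the following sketch shows the shape of it. Everything
here is an assumption made for illustration: the HAVE_PNMATCH macro,
the list of file types, the helper match_one(), and above all the
pnmatch() prototype and its third argument, which should be checked
against the vendor's manual before use. The fallback branch understands
only patterns of the form "*<suffix>".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef HAVE_PNMATCH
/* ASSUMED prototype; check the vendor manual for the real one. */
extern int pnmatch(char *string, char *pattern, int flag);
#endif

/* Illustrative list of legal file types, terminated by NULL. */
static const char *legal_types[] = { "*.PI1", "*.PI2", "*.PI3", NULL };

/* match_one: the only place that knows which matcher is in use. */
static int match_one(const char *name, const char *pattern)
{
#ifdef HAVE_PNMATCH
    /* The flag value 0 is a guess, not taken from any manual. */
    return pnmatch((char *)name, (char *)pattern, 0) != 0;
#else
    /* Portable fallback: handles only "*<suffix>" or a literal name. */
    const char *suffix;
    size_t nlen, slen;

    if (pattern[0] != '*')
        return strcmp(name, pattern) == 0;
    suffix = pattern + 1;
    nlen = strlen(name);
    slen = strlen(suffix);
    return nlen >= slen && strcmp(name + nlen - slen, suffix) == 0;
#endif
}

/* match_filenames: returns 1 if name matches any legal type, else 0. */
int match_filenames(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; legal_types[i] != NULL; i++)
        if (match_one(name, legal_types[i]))
            return 1;
    return 0;
}
```

The rest of the program calls only match_filenames(), so a port to a
system without pnmatch touches match_one() and nothing else.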

                                            Steve Summit
                                            scs@adam.pika.mit.edu

gwyn@smoke.BRL.MIL (Doug Gwyn) (07/15/89)

In article <12689@bloom-beacon.MIT.EDU> scs@adam.pika.mit.edu (Steve Summit) writes:
>In article <44672745.14a1f@gtephx.UUCP> covertr@gtephx.UUCP (Richard E. Covert) writes:
>>P.S. Does anyone know if pnmatch() is implemented on other C compilers??
>No vendor should provide a routine named "pnmatch."

That's based on a misunderstanding.  The actual constraint is that a
vendor is not supposed to interfere with an application's having its
own function (or external variable, or whatever) named "pnmatch".
Such a vendor-supplied function can be included in the standard C
library if it is not invoked by any standard library routines and
if it is not declared in any standard header.

>Similarly, portable programs cannot really use these extensions,
>no matter how convenient they may be.

A portable program cannot rely on the existence of a vendor-specific
function such as "pnmatch", but only because it doesn't exist in some
environments -- no other reason.

scs@adam.pika.mit.edu (Steve Summit) (07/16/89)

In article <10533@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>In article <12689@bloom-beacon.MIT.EDU> scs@adam.pika.mit.edu (Steve Summit) writes:
>>No vendor should provide a routine named "pnmatch."
>That's based on a misunderstanding.  The actual constraint is that a
>vendor is not supposed to interfere with an application's having its
>own function (or external variable, or whatever) named "pnmatch".

Indeed.  I should know better than to post assertions about a
standard I've never actually read.

                                            Steve Summit
                                            scs@adam.pika.mit.edu

dal@syntel.UUCP (Dale Schumacher) (07/16/89)

[covertr@gtephx.UUCP (Richard E. Covert) writes...]
> 	But, I was browsing thru the MWC manual, and lo and behold I see
> pnmatch(). Now pnmatch is a wonderful little function which does string
> comparisons. And it even accepts wildcards, so I was in business. I just
> build an array of strings such as "*.PI1", and then by looping thru the
> array I can string compare a user inputted filename against the list of
> legal filetypes. Pretty neat solution.
> 
> 	So, the moral of this little ditty, is READ YOUR MANUAL!!
> 
> P.S. Does anyone know if pnmatch() is implemented on other C compilers??

dLibs contains several useful (though non-standard) functions, including
some things which are similar in application to pnmatch().  Here are a
few relevant descriptions from the dLibs "manual".

char *findfile(char *afn[, *ext])
	Return full file spec for <afn> if found. If <afn> has no extension,
	extensions from <ext> are tried until a match is found, or the list
	ends.  <ext> is a list of extensions separated by '\0' characters
	and ending with an additional '\0', i.e. ".ttp\0.tos\0.prg\0" (note
	that the final null is added by the compiler to any string constant).
	If <afn> already has an extension, <ext> is not used.  If no matching
	files are found, NULL is returned.  The pointer returned when a match
	is found points to a buffer which is internal to fullpath().  If you
	want to save the value returned, you must make a copy before the
	buffer is overwritten by subsequent calls.  Note: several dLibs
	functions call findfile(), so don't make too many assumptions about
	how long the internal buffer is going to stay valid.

char *pfindfile(char *path, *afn[, *ext])
	Like findfile() but search all directories (separated by ',' or ';')
	in <path>.  If <path> is NULL, the "PATH" environment variable is
	used instead.  If <afn> specifies a drive or directory, <path> is
	not used.  The internal buffer for findfile() is used by pfindfile().

char *wildcard(char *pathname)
	Return matches for a wildcard filename.  If <pathname> is not
	NULL, the first file which matches <pathname> will be returned.
	The <pathname> may contain wildcards only in the filename portion,
	not in any sub-directories.  Subsequent calls to wildcard() with
	a NULL argument return the next matching filename.  NULL is
	returned when no more files match.  Note: the pointer returned
	points to an internal buffer which is overwritten with each
	call.  It should not be modified, and should be copied into a
	safe place if you want to save the value.
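
The <ext> list format used by findfile() above (strings separated by
'\0', with an extra '\0' closing the list) can be walked with a short
loop. The sketch below is not dLibs source; count_exts() and has_ext()
are hypothetical helpers written only to show the layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * count_exts: hypothetical helper, not part of dLibs.
 * Counts the entries in a list of strings separated by '\0' and
 * closed by an extra '\0' (an empty string marks the end).
 */
size_t count_exts(const char *ext)
{
    size_t n = 0;

    assert(ext != NULL);
    while (*ext != '\0') {
        n++;
        ext += strlen(ext) + 1;    /* step over entry and its '\0' */
    }
    return n;
}

/*
 * has_ext: hypothetical helper, not part of dLibs.
 * Returns 1 if wanted appears in the list, 0 otherwise.
 */
int has_ext(const char *ext, const char *wanted)
{
    assert(ext != NULL && wanted != NULL);
    for (; *ext != '\0'; ext += strlen(ext) + 1)
        if (strcmp(ext, wanted) == 0)
            return 1;
    return 0;
}
```

For the list ".ttp\0.tos\0.prg\0", count_exts() gives 3 and
has_ext(list, ".tos") gives 1. The closing '\0' written in the
constant plus the one the compiler appends form the end marker.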

...etc...there are more.  Your moral is quite accurate.  You'd be surprised
how much interesting code you can find that has been provided for you.  In
addition, at least with dLibs, you have the source code, so you can actually
include the routine directly when porting to a non-dLibs environment.

\\   /  Dale Schumacher                         399 Beacon Ave.
 \\ /   (alias: Dalnefre')                      St. Paul, MN  55104-3527
  ><    ...umn-cs!midgard.mn.org!syntel!dal     United States of America
 / \\   "What is wanted is not the will to believe, but the will to find out,
/   \\  which is the exact opposite." -Bertrand Russell

bright@Data-IO.COM (Walter Bright) (07/18/89)

In article <12689@bloom-beacon.MIT.EDU> scs@adam.pika.mit.edu (Steve Summit) writes:
<No vendor should provide a routine named "pnmatch."  Vendors are
<not supposed to pollute the namespace with "convenient" (but
<invariably unportable and system-specific) routines.  ("Then why
<do so many vendors do so?" you ask.)

I'll tell you why: because customers want them. Here's a transcript of a
not uncommon telephone call:

Customer:	Compiler X has 427 library functions. Your compiler has
		only 387 library functions. When are you going to fix that?
Me:		Which library functions does X have that we don't that you
		need?
Customer:	But there are more library functions with X, therefore
		X is better.

Some people tend to rate a compiler by:
1.	The number of library functions (What they are is irrelevant).
2.	The number of pages in the manual (Content is irrelevant).

If you disagree with this, pick up some magazine reviews of C compilers.
To their credit, the reviews *have* improved on these points in
the last 2 years, though some are still impressed by the *heft* of the
package!

Compiler vendors have responded to this pressure by creating vast quantities
of library functions. A surprisingly large percentage of these are totally
trivial (< 10 instructions). For example, routines that merely interface
with BIOS functions. What's wrong with these functions is that:
1. They clutter up the library.
2. They clutter up the manual with descriptions that are essentially
   duplicated from the BIOS manual.
3. They foster an illusion of writing portable code.

Manual writers (and reviewers) ought to read Strunk and White. A good
reference manual should contain exactly enough words to describe its subject,
no more, no less. The bloat properly belongs in a physically separate tutorial.

covertr@gtephx.UUCP (Richard E. Covert) (07/18/89)

In article <10533@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
> 
> A portable program cannot rely on the existence of a vendor-specific
> function such as "pnmatch", but only because it doesn't exist in some
> environments -- no other reason.

	This is an interesting stream of messages about pnmatch(). I first
posted it because I thought that MWC had written a clever little piece of
code. I have found out that pnmatch() once existed in Version 7 (MINIX is
Version 7) UNIX.

	Anyway, the talk is now centering around portability issues. I have been
programming for quite a few years and I realize just how important portability
is, BUT, a good software engineer decides when portability is not an issue, 
and doesn't always stick to the book.

	My application involves operating system and hardware specific issues
which make my program non-portable to other non-GEM computers anyway. Possibly,
I could port my program to an IBM PC running DRI GEMDOS, but it wouldn't have
all of the different types of picture files that the ST does anyway. So, I
maintain that portability is a moot point in my application. Furthermore,
my application is non-portable due to the fact that AES/VDI headers are not included
in the ANSI Draft. And I have a copy of the ANSI Draft for C at my desk at work
and at home.

	I did look up pnmatch() and it is not in the ANSI Draft C. But neither is the 
regexp() function mentioned elsewhere.

	In any case, I made the decision to use pnmatch() because it fulfills
my need, and it is unlikely that I will ever need to port this program. Also,
and finally, if portability is an issue, then you can purchase the source code to
the Mark Williams C compiler directly from Mark Williams Corp for $149.00.
And then you can use pnmatch() to your heart's content!!

P.S. Try porting the ST fsel_input() to some other computer!!! And people are
worried about pnmatch() not being portable :-).

Richard (just trying to write some GEMs) Covert

henry@utzoo.uucp (Henry Spencer) (07/19/89)

In article <447ec923.14a1f@gtephx.UUCP> covertr@gtephx.UUCP (Richard E. Covert) writes:
>...I have found out that pnmatch() once existed in Version 7 (MINIX is
>Version 7) UNIX.

Can you cite a reference for this?  It wasn't in the V7 that utzoo ran
until a year ago -- and our distribution tape came direct from Bell Labs.

>I did look up pnmatch() and it is not in the ANSI Draft C. But neither is the 
>regexp() function mentioned elsewhere.

However, something very much like my regexp functions is likely to appear
in Son of Posix (1003.2, that is -- standardizing commands turned out to
be a good time to make some additions to the libraries).
-- 
$10 million equals 18 PM       |     Henry Spencer at U of Toronto Zoology
(Pentagon-Minutes). -Tom Neff  | uunet!attcan!utzoo!henry henry@zoo.toronto.edu