mike2@lcuxa.UUCP (M S Slomin) (11/04/88)
Have you wished to be able to specify wildcards in DOS the way you can in Unix? I certainly have. The brain-damaged * (which matches the remainder of the string, and not portions that are followed by a specification, such as *a to match all filenames ending with the letter a) and lack of range specifications (like [a-f]*, etc.) have driven me to distraction, as has the requirement that the filename and extension be matched separately with a period between them.

Last month, Alan Strassberg posted a public domain 'gmatch' function that implements string matching capabilities similar to those implemented by the Bourne shell. This stimulated me to see if it could be used in DOS.

What follows is the result of pasting together three existing sources of code, and adding my own code to fill the interstices. The result seems to work -- probably independently of memory model in the TurboC version, but maybe with memory model problems in the MSC version because of the funny pointer conversions used to implement findfirst/findnext/setdta/getdta. I've compiled and run the code in the large model under MSC4.0 and MSC5.0 without problem, but nevertheless I'm not convinced that it is bulletproof in other than the small model. Perhaps others might improve it; I'm satisfied for the time being.

Operation:

1. The first operative line of code in a C program after the declarations should be:

       argv = exparg(&argc, argv);

   having first been preceded by the declaration:

       extern char **exparg();

2. If so, arguments to the C program will be expanded, and will populate a replacement set of argv[1], argv[2],... argv[argc-1] strings, and argc will be readjusted. Thereafter, the program can be written to access argv[]/argc as if the expanded versions had been placed there by the operating system.

3. * will match zero or more of any character; ? will match a single character (but not zero occurrences); [a-d] will match a single character in the range 'a' through 'd'; [!a-d] will match any single character except a character in the range 'a' through 'd'.

4. The period between the filename root and its extension need not be stated explicitly. Thus, the pattern a*e will match 'abacus.exe' as well as 'axyz.e' and 'apple'.

Size:

The following code size differences resulted when a simple-minded test program was compiled with and without the exparg code:

    MSC4.0     1670 bytes added (small model)
    MSC5.0     1670 bytes added (small model)
    TURBOC1.5  1446 bytes added (small model)

(Not bad for what it gives you!)

===================================================================
CODE
===================================================================

/* Compilation options: */

/* #define MSC       /* for Microsoft C */
/* #define TURBOC    /* for TurboC 1.0 or 1.5 */

#define ATTRIB 1        /* search only for normal files, including
                           read-only ones */
/* #define ATTRIB 0     /* search only for normal writable files */
/* #define ATTRIB 0x3f  /* search for all file names, including hidden
                           and system ones, directory names and . and .. */

/* #define TEST      /* see it work */

/* Credits:

   1. The first/next/getdta/setdta routines are based (loosely) on a
      public domain "Sample wildcard processor in Lattice C" by
      Alan Losoff, Milwaukee WI

   2. The exparg code is a modification of wildcard expansion code
      originally written for TurboC 1.0 by:

          Richard Hargrove
          Texas Instruments, Inc.
          P.O. Box 869305, m/s 8473
          Plano, Texas 75086
          214/575-4128

      and posted to USENET in Sept., 1987.

   3. The gmatch code was posted to USENET by Alan Strassberg, Lockheed,
      Santa Cruz, CA in Oct., 1988. His posting indicated that it was
      derived from a posting to comp.os.minix.

   4. The remainder, such as it may be, is mine, Mike Slomin,
      bellcore!lcuxa!mike2, and may be used for any purpose.
*/

#include <stdio.h>
#include <ctype.h>
#include <dos.h>
#include <string.h>
#ifndef TURBOC
#include <direct.h>
#include <malloc.h>
#else
#include <dir.h>
#include <alloc.h>
#endif /* TURBOC */

#define DOS_GETFAT 0x3600
#define DOS_SETDTA 0x1A00
#define DOS_GETDTA 0x2F00
#define DOS_FFIRST 0x4E00
#define DOS_FNEXT  0x4F00
#define CARRY_FLAG 0x0001

#define MAXARGS  100    /* maximum number of entries the new argv */
                        /* array can contain */
#define MAXPATH  80
#define MAXDIR   66
#define MAXDRIVE 3
#define MAXFILE  9
#define MAXEXT   5

#define TRUE  1
#define FALSE 0
#define NIL(type) ((type *) NULL)

typedef int BOOLEAN;

struct DIRS             /* dos directory entry */
{
    char for_dos[21];
    char attr;
    struct ftime {
        unsigned hour   : 5;
        unsigned minute : 6;
        unsigned twosec : 5;
    } time;
    struct fdate {
        unsigned year  : 7;
        unsigned month : 4;
        unsigned day   : 5;
    } date;
    long size;
    char name[13];
    char fill[85];
};

static union REGS reg;
static struct DIRS dta;
static struct SREGS segregs;
static char path[80];
static int pathend;

/* The following are not all really needed for MSC5.0+, which does have
   functions such as _dos_findfirst/_dos_findnext, etc.; however, since
   the code works on MSC4.0 and is upwardly compatible, it seemed easier
   simply to stick with it rather than to migrate it.
*/

#ifdef MSC

char *
getdta()
{
    reg.x.ax = DOS_GETDTA;
    reg.x.bx = 0;
    reg.x.cx = 0;
    reg.x.dx = 0;
    intdos(&reg, &reg);
    return ((char *) reg.x.bx);     /* offset only; the segment in ES is ignored */
}

setdta(dta)
char *dta;
{
    reg.x.ax = DOS_SETDTA;
    reg.x.bx = 0;
    reg.x.cx = 0;
    reg.x.dx = (unsigned int) dta;
    intdos(&reg, &reg);
}

char *
strlwr(s)
register char *s;
{
    register char *os;

    os = s;
    while (*s) {
        *s = tolower(*s);
        s++;
    }
    return(os);
}

char *
stpcpy(s1, s2)
char *s1, *s2;
{
    /* copy first, then measure: the two must not share one expression,
       since the order of evaluation of the operands of + is unspecified */
    strcpy(s1, s2);
    return(s1 + strlen(s1));        /* points at the terminating '\0' */
}

int
first(name, blk, attrib)            /* find first directory entry */
char *name, *blk;
int attrib;
{
    setdta(&dta);
    reg.x.ax = DOS_FFIRST;
    reg.x.bx = 0;
    reg.x.cx = attrib;              /* CX carries the search attribute mask */
    reg.x.dx = (unsigned int) name;
    intdos(&reg, &reg);
    if (reg.x.cflag & CARRY_FLAG)
        return(-1);
    return(0);
}

int
next(blk)                           /* find next directory entry */
char *blk;
{
    setdta(&dta);
    reg.x.ax = DOS_FNEXT;
    reg.x.bx = 0;
    reg.x.cx = 0;
    reg.x.dx = 0;
    intdos(&reg, &reg);
    if (reg.x.cflag & CARRY_FLAG)
        return(-1);
    return(0);
}

#endif /* MSC */

#ifdef TURBOC

first(name, dta, attrib)
char *name, *dta;
int attrib;
{
    return (findfirst(name, dta, attrib));
}

next(dta)
char *dta;
{
    return (findnext(dta));
}

#endif /* TURBOC */

char *
pathsplit(fpath, drive_dir)
char *fpath, *drive_dir;
{
    /* separate path and directory from input name */
    strcpy(drive_dir, fpath);
    pathend = strlen(drive_dir);
    while (pathend && drive_dir[pathend-1] != ':' && drive_dir[pathend-1] != '\\')
        pathend--;
    drive_dir[pathend] = '\0';
    return(drive_dir);
}

/******************************************************************************/

/* The following is an adaptation of Richard Hargrove's 'exparg.c' code
   which he wrote to do wild card expansion for the initial release of
   TurboC (TurboC 1.0). It keeps track of dynamic memory allocation
   efficiently, and codes well. Besides, who wants to reinvent the wheel?

   As originally written, the code invoked TurboC's findfirst/findnext
   routines to expand each argv[] argument, and replaced the original
   array of argv[] strings with an expanded one. It also appropriately
   replaced argc.
   Thus, so long as the first operative line after main() in a program was

       argv = exparg(&argc,argv);

   from that point onward the program would operate as if the operating
   system, and not the program, had already expanded the arguments.

   To bring in pdgmatch, the game is:
   a) to use Mr. Hargrove's findfirst/findnext code with the argument
      "*.*", to get a list of all of the files in the selected path;
   b) apply pdgmatch to each original argv[] and the list of all files; and
   c) use the result of the pdgmatch(s) to populate the replacement
      argv[] strings.

   Also, the results of DOS' findfirst/findnext are converted to lower
   case before they are sent to pdgmatch, since it would be annoying to
   have to use upper. Note that the ATTRIB definition will determine
   whether only conventional files will be matched (ATTRIB=0) or whether
   hidden and system files, and directories, will also be matched
   (ATTRIB=0x3f).
*/

char **exparg (pargc, argv)
int *pargc;
char **argv;
{
    static char *newargv[MAXARGS];
    char pathi[MAXPATH];
    char patho[MAXPATH];
    char drive[MAXDRIVE];
    char dir[MAXDIR];
    char drive_dir[MAXDRIVE + MAXDIR];
    char *olddta;
    int args = 0;
    int newargc = 0;
    BOOLEAN err = FALSE;

    olddta = getdta();
    newargv[newargc++] = argv[args++];
    while (!err && args < *pargc) {
        patho[0] = '\0';
        pathsplit(argv[args], drive_dir);
        stpcpy(stpcpy(patho, drive_dir), "*.*");
        if (!first(patho, &dta, ATTRIB)) {
            do {
                /* build drive:\dir\name in pathi and reserve room for a copy */
                char *localcptr = (char *)malloc (
                    (unsigned)(stpcpy(stpcpy(pathi, drive_dir), dta.name) - pathi) + 1);
#ifdef TURBOC
                if (localcptr == NIL(char)) {
#else
                if (localcptr == NULL) {
#endif /* TURBOC */
                    fputs("\n_exparg error : no memory for filenames\n", stderr);
                    exit(1);
                }
                if (gmatch(strlwr(pathi), argv[args])) {
                    newargv [newargc++] = strcpy (localcptr, pathi);
                } else {
                    free(localcptr);    /* not a match: release the buffer */
                }
            } while ((newargc < MAXARGS) && !next (&dta));
        } else {
            newargv [newargc++] = argv [args];
        }
        err = (newargc == MAXARGS);
        args++;
    }
    if (err)
        fputs ("\n_exparg error : too many filenames\n", stderr);
    setdta (olddta);
    *pargc = newargc;
    return (&newargv [0]);
}

/***************************************************************************/

/*
 * int gmatch(string, pattern)
 * char *string, *pattern;
 *
 * Match a pattern as in sh(1).
 */

#ifndef NULL
#define NULL 0
#endif
#define CMASK 0377
#define QUOTE 0200
#define QMASK (CMASK&~QUOTE)
#define NOT '!' /* might use ^ */

static char *cclass();

int
gmatch(s, p)
register char *s, *p;
{
    register int sc, pc;

    if (s == NULL || p == NULL)
        return(0);
    while ((pc = *p++ & CMASK) != '\0') {
        sc = *s++ & QMASK;
        switch (pc) {
        case '[':
            /* a class must consume a real character, never the terminator */
            if (sc == 0 || (p = cclass(p, sc)) == NULL)
                return(0);
            break;
        case '?':
            if (sc == 0)
                return(0);
            break;
        case '*':
            /* try the rest of the pattern against every tail of s */
            s--;
            do {
                if (*p == '\0' || gmatch(s, p))
                    return(1);
            } while (*s++ != '\0');
            return(0);
        default:
            if (sc != (pc&~QUOTE))
                return(0);
        }
    }
    return(*s == 0);
}

static char *
cclass(p, sub)
register char *p;
register int sub;
{
    register int c, d, not, found;

    if ((not = *p == NOT) != 0)
        p++;
    found = not;
    do {
        if (*p == '\0')
            return(NULL);
        c = *p & CMASK;
        if (p[1] == '-' && p[2] != ']') {
            d = p[2] & CMASK;
            p++;
        } else
            d = c;
        if (c == sub || (c <= sub && sub <= d))
            found = !not;
    } while (*++p != ']');
    return(found ? p+1 : NULL);
}

/******************************************************************************/

#ifdef TEST
main (argc,argv)
int argc;
char **argv;
{
    /* Normally, when using exparg, you should precede the exparg() call
       with the declaration:

           extern char **exparg();

       and the first line of non-declaration code after main should be:

           argv = exparg (&argc, argv)

       However, to show how it works, we will first print the original
       command line parameters in the following test code. And, since
       exparg() has already been declared, we will not bother to do so here.
    */
    int i = 0;

    printf ("original command line parameters : argc: %d\n", argc);
    for (; i < argc; i++) {
        printf ("%s\n", argv [i]);
    }
    argv = exparg (&argc, argv);
    printf ("new command line parameters : argc: %d\n", argc);
    for (i = 0; i < argc; i++) {
        printf ("%s\n", argv [i]);
    }
}
#endif

===============================END OF CODE===========================

No warranties whatsoever. You get what you pay for!

Mike Slomin
bellcore!lcuxa!mike2
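For readers without a DOS compiler at hand, the scheme described in the posting (collect every name in the directory, convert it to lower case, keep the names the matcher accepts) can be sketched in portable C. This is only an illustration under stated assumptions: wild_match is a simplified stand-in for gmatch without the quoting bits, and expand_pattern with its in-memory name list is a hypothetical helper standing in for the findfirst/findnext loop. Neither name appears in the posted code.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for gmatch(): supports *, ?, [a-d] and [!a-d].
   Returns 1 on a match, 0 otherwise. */
int wild_match(const char *s, const char *p)
{
    for (; *p != '\0'; p++) {
        if (*p == '*') {
            /* try the rest of the pattern against every tail of s */
            do {
                if (wild_match(s, p + 1))
                    return 1;
            } while (*s++ != '\0');
            return 0;
        }
        if (*s == '\0')
            return 0;       /* ?, [..] and literals all need a character */
        if (*p == '[') {
            int negate = (p[1] == '!');
            int found = 0;
            const char *q = p + 1 + negate;

            if (*q == '\0')
                return 0;   /* unterminated class */
            do {
                unsigned char lo = (unsigned char)*q, hi = lo;
                if (q[1] == '-' && q[2] != ']' && q[2] != '\0') {
                    hi = (unsigned char)q[2];   /* range such as a-d */
                    q += 2;
                }
                if (lo <= (unsigned char)*s && (unsigned char)*s <= hi)
                    found = 1;
                q++;
            } while (*q != ']' && *q != '\0');
            if (*q != ']' || found == negate)
                return 0;   /* unterminated class, or the class rejects *s */
            p = q;          /* continue after the closing ']' */
        } else if (*p != '?' && *p != *s) {
            return 0;
        }
        s++;
    }
    return *s == '\0';
}

/* Hypothetical helper: copy into out[] the lower-cased entries of names[]
   that match pattern (at most max of them) and return how many were kept.
   names[] stands in for what findfirst/findnext would deliver. */
size_t expand_pattern(const char *pattern, const char *const names[],
                      size_t n, char out[][13], size_t max)
{
    size_t i, j, count = 0;

    for (i = 0; i < n && count < max; i++) {
        char lower[13];                     /* DOS 8.3 name plus '\0' */

        if (strlen(names[i]) >= sizeof lower)
            continue;                       /* too long for an 8.3 name */
        for (j = 0; names[i][j] != '\0'; j++)
            lower[j] = (char)tolower((unsigned char)names[i][j]);
        lower[j] = '\0';
        if (wild_match(lower, pattern))
            strcpy(out[count++], lower);
    }
    return count;
}
```

Given the names ABACUS.EXE, AXYZ.E, APPLE and BANANA.C, the pattern a*e keeps the first three, which agrees with item 4 of the Operation notes: the period is just another character to the matcher.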
link@stew.ssl.berkeley.edu (Richard Link) (11/04/88)
In article <215@lcuxa.UUCP> mike2@lcuxa.UUCP (M S Slomin) writes:
>
>Have you wished to be able to specify wildcards in DOS the way you
>can in Unix?

Personally, I wish both UNIX and MS-DOS could expand wildcards like VMS. I *hate* the useless destination restrictions in UNIX.

Dr. Richard Link
University of California, Berkeley
link@ssl.berkeley.edu
naughton%wind@Sun.COM (Patrick Naughton) (11/04/88)
I like the idea of having unix shell style wildcard expansion, but having it be a C client function which has to be called before argv[] processing kind of limits its usefulness. The solution I would like to see would be a COMMAND.COM shell enhancement (replacement?) which handled the usual stuff like file completion, history substitution, AND wildcard expansion. This way the client program/command such as masm or link (which always has long command lines under Unix) would already have the correct arg[cv] values on startup. This also has the advantage of having only one copy of argexp() lying around rather than one copy per client.

This is a small matter of programming and I would certainly have already done it if it were not for the point of this posting: The DOS command line (last time I checked) had an upper limit of 128 characters. Thus any wildcard expansion which expanded out to more than 128 characters would fail. Does anyone know if this is hardwired, or where the hack would have to go in to change it? I am guessing that it is because old ".COM" files put argv[] in the PSP which is only 256 bytes long. It seems that for an ".EXE" file, the process loader or the offset fixup-er could allocate some space for the arg list no matter how long and point argv[] at it.

This shortcoming of DOS/COMMAND.COM causes standard Unix Makefiles to be useless since most link lines get to be several hundred characters, if you have any reasonable number of libraries or object files. DOS has the @file convention for reading a command's arguments from a file, but it makes Makefile management a nightmare and portability to Unix a moot point.

Any comments or suggestions?

-Patrick
______________________________________________________________________
Patrick J. Naughton                         ARPA: naughton@Sun.COM
Window Systems Group                        UUCP: ...!sun!naughton
Sun Microsystems, Inc.                      AT&T: (415) 336 - 1080
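To make the limit concrete: the upper half of the 256-byte PSP (offsets 80h through FFh) holds a one-byte length followed by the command tail text, normally ended by a carriage return, so a program can never be handed more than that one block. The sketch below shows how a startup routine might split such a block into words. The names parse_command_tail and TAIL_AREA are illustrative only and are not taken from any vendor's runtime library.

```c
#include <stddef.h>

#define TAIL_AREA 128   /* bytes 80h..FFh of the 256-byte PSP */

/* Split a PSP-style command tail into words.
   tail[0] is the length byte; tail[1..] holds the text, normally followed
   by a carriage return. Words are copied into buf, each ended by '\0',
   and argv[] is pointed at them.
   Returns the number of words found (at most max_args). */
int parse_command_tail(const unsigned char tail[TAIL_AREA],
                       char buf[TAIL_AREA], char *argv[], int max_args)
{
    size_t len = tail[0];
    size_t i = 0;       /* index into the tail text */
    size_t out = 0;     /* next free byte in buf */
    int argc = 0;

    if (len > TAIL_AREA - 1)
        len = TAIL_AREA - 1;            /* never read past the PSP */

    while (i < len && argc < max_args) {
        while (i < len && (tail[1 + i] == ' ' || tail[1 + i] == '\t'))
            i++;                        /* skip blanks between words */
        if (i >= len || tail[1 + i] == '\r')
            break;
        argv[argc++] = &buf[out];
        while (i < len && tail[1 + i] != ' ' && tail[1 + i] != '\t'
               && tail[1 + i] != '\r')
            buf[out++] = (char)tail[1 + i++];
        buf[out++] = '\0';
    }
    return argc;
}
```

Because the length byte and the terminator share the 128-byte area with the text, a shell that expanded wildcards itself would run out of room after a handful of file names, which is the obstacle described above.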
dhesi@bsu-cs.UUCP (Rahul Dhesi) (11/05/88)
In article <76192@sun.uucp> naughton@sun.com (Patrick J. Naughton) writes:
>The DOS command
>line (last time I checked) had an upper limit of 128 characters....
>Does anyone know if this is hardwired, or where the hack
>would have to go in to change it?
...
>This shortcoming of DOS/COMMAND.COM causes standard Unix Makefiles to be
>useless since most link lines get to be several hundred characters,...

The command line limit is unfortunately hard-coded because of the limited size of the PSP. (I think this problem is inherited from CP/M.)

For C programs that use argv[], this is not a problem at all -- the C runtime library can still expand command line arguments and malloc() space for them, and in fact both Microsoft and Borland supply functions that will do this for you and call main() with argv[] containing expanded filenames. (It's true that various versions of these have various peculiarities, often choking on forward slashes in pathnames.)

This command line shortcoming is not a problem with makefiles if you use Don Kneller's ndmake program. It accepts highly UNIX-compatible makefiles, recognizes the word "link", and feeds the linker a response file with the @ command. It's shareware for $35, and a steal at that price.

(P.S. They just fixed my PS/2 and gave it back to me, and I'm using it for the first time and finding the placement of the keys quite irritating. Does anybody have a simple way of exchanging the caps-lock and ctrl keys on the keyboard? I wish these big companies such as DEC, AT&T, and IBM would stop trying to improve keyboards by constantly changing the keys around, inserting backslashes at odd places, making escape keys vanish, etc.)
--
Rahul Dhesi UUCP: <backbones>!{iuvax,pur-ee}!bsu-cs!dhesi
sullivan@marge.math.binghamton.edu (fred sullivan) (11/07/88)
In article <76192@sun.uucp> naughton@sun.com (Patrick J. Naughton) writes:
>I like the idea of having unix shell style wildcard expansion, but
>having it be a C client function which has to be called before argv[]
>processing kind of limits its usefulness. The solution I would like to
>see would be a COMMAND.COM shell enhancement (replacement?) which
>handled the usual stuff like file completion, history substitution, AND
>wildcard expansion.
>
>This is a small matter of programming and I would certainly have already
>done it if it were not for the point of this posting: The DOS command
>line (last time I checked) had an upper limit of 128 characters. Thus
>any wildcard expansion which expanded out to more than 128 characters
>would fail.

A similar problem exists for the Atari ST, and one solution (?) which has been used (I think by Mark Williams C) is to pass arguments which go past 128 bytes in an environment variable. The problem with any solution like this is, of course, that it simply will not work with existing software. For software which one writes for oneself, it is easy enough to have the program itself expand the wildcards (with Turbo C one links in a routine called setargv, which is called on startup by the runtime system -- I assume other compilers have something similar).

It was an extremely stupid decision to embed the command line arguments in a 256 byte PSP, but then it's things like this that make DOS DOS.

Fred Sullivan
SUNY at Binghamton
Dept. Math. Sciences
Binghamton, NY 13903
sullivan@marge.math.binghamton.edu
First you make a roux!
swh@hpsmtc1.HP.COM (Steve Harrold) (11/07/88)
Re: Long DOS command lines in makefiles

As stated earlier in this notestring, UNIX makefiles are hard to manage under MSDOS because the long command lines often encountered in UNIX get truncated by DOS to 128 characters. The work-around is to use manually created "response" files, but these are hard to keep in sync with various make macros such as the commonly used $(OBJS).

There is a DOS make product named OPUS Professional Make that handles this automatically. When a command line to invoke "link", "lib" or any other user-designated program is longer than 128 characters (after macro substitution), the OPUS program creates a temporary response file and feeds THAT to the intended program. There is no special user involvement in using this "workaround"; he simply manages his makefiles as if they had infinitely long command lines.

Another bonus is that a "mkmf" program is included that allows you to automatically generate the dependencies in the makefile, directly from the source files.

Their address is:

    OPUS Software
    1468 8th Ave.
    San Francisco, CA 94122
    UUCP: ...ucbvax!ucsfcgl!kneller

My only association with the product is that of a satisfied user.
--
---------------------
Steve Harrold ...hplabs!hpsmtc1!swh
HPG200/13 (408) 447-5580
---------------------
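The response-file trick described in this post can be sketched roughly as follows. This is not OPUS Make's code: needs_response_file, build_command and DOS_CMD_LIMIT are hypothetical names, the 128-character threshold is simply the figure quoted in this thread, and the sketch writes the arguments as one line, whereas real linkers and librarians each have their own response-file syntax.

```c
#include <stdio.h>
#include <string.h>

#define DOS_CMD_LIMIT 128   /* figure quoted in this thread; an assumption */

/* Returns 1 if "prog args" would be too long for a DOS command line. */
int needs_response_file(const char *prog, const char *args)
{
    return strlen(prog) + 1 + strlen(args) > DOS_CMD_LIMIT;
}

/* Build the command to execute in out (outsz bytes).
   A short command is passed through unchanged. For a long one the
   arguments are written to the file rsp_path and the command becomes
   "prog @rsp_path".
   Returns 0 on success, -1 on an I/O error or if out is too small. */
int build_command(const char *prog, const char *args, const char *rsp_path,
                  char *out, size_t outsz)
{
    FILE *fp;

    if (!needs_response_file(prog, args)) {
        if (strlen(prog) + 1 + strlen(args) + 1 > outsz)
            return -1;
        sprintf(out, "%s %s", prog, args);
        return 0;
    }
    if (strlen(prog) + 2 + strlen(rsp_path) + 1 > outsz)
        return -1;

    fp = fopen(rsp_path, "w");
    if (fp == NULL)
        return -1;
    if (fputs(args, fp) == EOF || fputc('\n', fp) == EOF) {
        fclose(fp);
        return -1;
    }
    if (fclose(fp) != 0)
        return -1;

    sprintf(out, "%s @%s", prog, rsp_path);
    return 0;
}
```

A make program built this way would call build_command after macro substitution and delete the temporary file once the child program has finished, so the makefile author never sees the response file.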
zu@ethz.UUCP (Urs Zurbuchen) (11/09/88)
In article <76192@sun.uucp> naughton@sun.com (Patrick J. Naughton) writes:
>I like the idea of having unix shell style wildcard expansion, but
>having it be a C client function which has to be called before argv[]
>processing kind of limits its usefulness.

If anyone is interested in a replacement for the Microsoft C startup wildcard expansion routine, drop me a note. I adjusted some subroutines which do UN*X style wildcard expansion and included them in the C libraries. For all the C programs you compile you get this feature for free. I didn't write all the code myself. Credits are given in the sources.

>... The solution I would like to
>see would be a COMMAND.COM shell enhancement (replacement?) which
>handled the usual stuff like file completion, history substitution, AND
>wildcard expansion.
>

This solution is preferable, of course. But as I have the sources to almost all the utilities I use, it isn't of high necessity to me. I would suggest that instead of hooking something to Command.Com it could be easier and better to write a replacement. Anybody volunteering? (Don't tell me about the MKS toolkit's korn shell. I heard of that.)

Have a nice day,
...urs
james@bigtex.cactus.org (James Van Artsdalen) (11/12/88)
In <11470038@hpsmtc1.HP.COM>, swh@hpsmtc1.HP.COM (Steve Harrold) wrote:
> Re: Long DOS command lines in makefiles
[...]
> There is a DOS make product named OPUS Professional Make that handles
> this automatically.
[...]

I'll strongly second this recommendation of Opus make. It is essentially the equal or better of any unix make. It works correctly, unlike that wretched abomination Microsoft has. Borland's make is good, but I could not make Borland's make work well with large projects spanning multiple directories, whereas Opus make worked perfectly.

There is also an OS/2 version, for those programming in OS/2 and needing a working "make".
--
James R. Van Artsdalen james@bigtex.cactus.org "Live Free or Die"
Home: 512-346-2444 Work: 338-8789 9505 Arboretum Blvd Austin TX 78759
simpsong@ncoast.UUCP (Gregory R. Simpson) (11/12/88)
In article <16481@agate.BERKELEY.EDU> link@stew.ssl.berkeley.edu (Richard Link) writes:
>In article <215@lcuxa.UUCP> mike2@lcuxa.UUCP (M S Slomin) writes:
>>
>>Have you wished to be able to specify wildcards in DOS the way you
>>can in Unix?
>
>Personally, I wish both UNIX and MS-DOS could expand wildcards like VMS.
>I *hate* the useless destination restrictions in UNIX.
>
>Dr. Richard Link
>University of California, Berkeley
>link@ssl.berkeley.edu

What VMS Wildcards? I can't even use the most simple wildcards, like

    cc *.c

!!!! That is considered wildcard expansion???

Follow-up to comp.sys.vms... this doesn't really need to be in comp.sys.ibm.pc...

Greg
--
---
Gregory R. Simpson
Prefered Internet: SIMPSONG%LTD2.decnet@ge-crd.arpa
UUCP: uunet!steinmetz!ltd2.decnet!simpsong
UUCP: <BACKBONE>!cbosgd!ncoast!simpsong
allbery@ncoast.UUCP (Brandon S. Allbery) (11/14/88)
As quoted from <1554@bingvaxu.cc.binghamton.edu> by sullivan@marge.math.binghamton.edu (fred sullivan):
+---------------
| In article <76192@sun.uucp> naughton@sun.com (Patrick J. Naughton) writes:
|
| >This is a small matter of programming and I would certainly have already
| >done it if it were not for the point of this posting: The DOS command
| >line (last time I checked) had an upper limit of 128 characters. Thus
| >any wildcard expansion which expanded out to more than 128 characters
| >would fail.
|
| A similar problem exists for the Atari ST, and one solution? which has been
| used (I think by Mark Williams C) is to pass arguments which go past 128 bytes
| in an environment variable. The problem with any solution like this is, of
| course, that it simply will not work with existing software. For software
+---------------

Perhaps someone should consider an extension to DOS (TOS). Compliant command interpreters would place as much of the argument list as possible in the command line and *also* place the full argument list into another memory segment or similar, which would be passed to the program. The extended segment could be 1024 bytes or so.

Programs which aren't compliant would continue to have problems, but compliant programs could use the parameter segment to get the full argument list. If enough programs used the extension, it would eventually become a standard.

++Brandon
--
Brandon S. Allbery, comp.sources.misc moderator and one admin of ncoast PA UN*X
uunet!hal.cwru.edu!ncoast!allbery <PREFERRED!> ncoast!allbery@hal.cwru.edu
allberyb@skybridge.sdi.cwru.edu <ALSO> allbery@uunet.uu.net
comp.sources.misc is moving off ncoast -- please do NOT send submissions direct
Send comp.sources.misc submissions to comp-sources-misc@<backbone>.
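The shell side of such an extension might look something like the sketch below. Everything in it is hypothetical: the struct arg_blocks layout, the pack_args name and the 126-character figure for the usable command tail are assumptions made for illustration, and the 1024-byte size is just the number floated in the post. No DOS or TOS interface of this kind is implied to exist.

```c
#include <string.h>

#define TAIL_MAX 126    /* assumed usable characters in the PSP command tail */
#define EXT_MAX  1024   /* size suggested in the post for the extended area */

struct arg_blocks {
    char tail[TAIL_MAX + 1];    /* what a non-compliant program would see */
    char ext[EXT_MAX];          /* full list for compliant programs */
    int  truncated;             /* 1 if tail holds only part of the list */
};

/* Pack argv[0..argc-1], separated by single blanks, into both areas.
   The tail receives as many whole arguments as fit; ext receives all.
   Returns 0 on success, -1 if even the extended area is too small. */
int pack_args(struct arg_blocks *ab, int argc, const char *const argv[])
{
    size_t tail_len = 0, ext_len = 0;
    int i;

    ab->tail[0] = ab->ext[0] = '\0';
    ab->truncated = 0;
    for (i = 0; i < argc; i++) {
        size_t n = strlen(argv[i]);
        size_t sep = (i > 0);       /* one blank before all but the first */

        if (ext_len + sep + n >= EXT_MAX)
            return -1;              /* no room left for text plus '\0' */
        if (sep)
            ab->ext[ext_len++] = ' ';
        memcpy(ab->ext + ext_len, argv[i], n + 1);
        ext_len += n;

        if (!ab->truncated && tail_len + sep + n <= TAIL_MAX) {
            if (sep)
                ab->tail[tail_len++] = ' ';
            memcpy(ab->tail + tail_len, argv[i], n + 1);
            tail_len += n;
        } else {
            ab->truncated = 1;      /* stop at the first argument that does not fit */
        }
    }
    return 0;
}
```

A compliant program would read ext and ignore tail, while an old program would keep working on whatever fits in tail, which is the compatibility property the proposal depends on.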