diamond@jit345.swstokyo.dec.com (Norman Diamond) (01/24/91)
In article <22855@well.sf.ca.us> Jef Poskanzer <jef@well.sf.ca.us> writes:
> #define MAXNAMES 1000
> static char users[MAXNAMES][UT_NAMESIZE+1];
> (void) strncpy( users[nusers], u.ut_name, UT_NAMESIZE );
> users[nusers][UT_NAMESIZE] = '\0';
>And yes, this will fail if more than 1000 users are logged in at
>the same time.  Imagine how concerned I am.

Uh, maybe equally concerned as people who knew that their operating system would never last 10 years, or 28 years, or whatever?  Equally concerned as people who knew that the spacecraft would not last a year, or when it did, they knew it wouldn't last another 4 years?

You should know to set a better example than this.

Followups (if any are necessary) are directed to comp.lang.c.
--
Norman Diamond       diamond@tkov50.enet.dec.com
If this were the company's opinion, I wouldn't be allowed to post it.
jef@well.sf.ca.us (Jef Poskanzer) (01/25/91)
In the referenced message, diamond@jit345.enet@tkou02.enet.dec.com (Norman Diamond) wrote:
}In article <22855@well.sf.ca.us> Jef Poskanzer <jef@well.sf.ca.us> writes:
}> #define MAXNAMES 1000
}> static char users[MAXNAMES][UT_NAMESIZE+1];
}> (void) strncpy( users[nusers], u.ut_name, UT_NAMESIZE );
}> users[nusers][UT_NAMESIZE] = '\0';
}>And yes, this will fail if more than 1000 users are logged in at
}>the same time.  Imagine how concerned I am.
}
}Uh, maybe equally concerned as people who knew that their operating system
}would never last 10 years, or 28 years, or whatever?
}Equally concerned as people who knew that the spacecraft would not last a
}year, or when it did, they knew it wouldn't last another 4 years?

Gosh, in ten years, if every trend in computer usage magically reverses itself, I'll get a message telling me to change the number from 1000 to 10000.  Yes, it does check for overflow.

}You should know to set a better example than this.

I think this is an *excellent* example of appropriate programming technology.  Dan Bernstein's hack of reading utmp twice and allocating 50 extra slots in case more users log in between the two is, when you come down to it, *no better*.  Just more complicated.  Worse, in fact, since he *doesn't* check for overflow.  He complained about a hard limit of 200 users and then went and programmed a different hard limit of 50 new users in an unknowable time period.  Foo on that.

If you must handle an arbitrary number of users, do the doubling-realloc trick.  But don't invest the effort until you get at least one report of someone overflowing the fixed-size array, since any malloc hacking that anyone does has a good chance of being buggy.

End of sermon.
---
Jef

             Jef Poskanzer  jef@well.sf.ca.us  {apple, ucbvax, hplabs}!well!jef
                           "Why me, John Bigboote?"
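[The ``doubling-realloc trick'' Jef mentions can be sketched roughly as follows.  This is an illustrative ANSI C fragment, not code from any of the posted programs; the `struct table` and `table_add` names are invented.  Each time the array fills, its capacity doubles, so there is no compiled-in MAXNAMES and a failed allocation leaves the old table intact.]

```c
#include <stdlib.h>
#include <string.h>

/* A growable table of name strings.  table_add returns 0 on success,
   -1 if an allocation fails; on failure everything already stored in
   the table is still valid and can be freed by the caller. */
struct table { char **names; size_t n, cap; };

int table_add(struct table *t, const char *name)
{
    if (t->n == t->cap) {
        /* double the capacity (realloc(NULL, ...) acts like malloc) */
        size_t ncap = t->cap ? t->cap * 2 : 16;
        char **p = realloc(t->names, ncap * sizeof *p);
        if (p == NULL)
            return -1;          /* old array untouched */
        t->names = p;
        t->cap = ncap;
    }
    t->names[t->n] = malloc(strlen(name) + 1);
    if (t->names[t->n] == NULL)
        return -1;
    strcpy(t->names[t->n], name);
    t->n++;
    return 0;
}
```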
diamond@jit345.swstokyo.dec.com (Norman Diamond) (01/25/91)
In article <22870@well.sf.ca.us> Jef Poskanzer <jef@well.sf.ca.us> writes:
>In the referenced message, diamond@jit345.enet@tkou02.enet.dec.com (Norman Diamond) wrote:
>}In article <22855@well.sf.ca.us> Jef Poskanzer <jef@well.sf.ca.us> writes:
>}> #define MAXNAMES 1000
>}> static char users[MAXNAMES][UT_NAMESIZE+1];
>}> (void) strncpy( users[nusers], u.ut_name, UT_NAMESIZE );
>}> users[nusers][UT_NAMESIZE] = '\0';
>}>And yes, this will fail if more than 1000 users are logged in at
>}>the same time.  Imagine how concerned I am.
>}
>}Uh, maybe equally concerned as ...
>
>Gosh, in ten years, if every trend in computer usage magically reverses
>itself, I'll get a message telling me to change the number from 1000 to
>10000.

Suppose someone starts logging NFS clients?  Or the clients of some other service?  1000 would already be a bit small for that.

>Yes, it does check for overflow.

Uh, you mean that it doesn't abort on overflow, but only gives inaccurate answers.  OK, so your example does about 1/4 of what a good example would do.

>Dan Bernstein's hack of reading utmp twice and allocating
>50 extra slots in case more users log in between the two is, when you
>come down to it, *no better*.  Just more complicated.  Worse, in fact,
>since he *doesn't* check for overflow.

If I had seen that posting, and if Mr. Bernstein had made some claim about adequacy, and if I had the time, I would have criticized that too.  In fact, if I had seen the posting, and given the hypocrisy that you attributed to him (which I deleted, sorry), then it wouldn't matter if I had the time; I'd've flamed him ;-) .  But I didn't see it, sorry.
--
Norman Diamond       diamond@tkov50.enet.dec.com
If this were the company's opinion, I wouldn't be allowed to post it.
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (01/25/91)
Ah, yes, Jef takes his place next to Chris on my list of gurus I've caught in a mistake.  Two mistakes, in fact.  Read on...

In article <22870@well.sf.ca.us> Jef Poskanzer <jef@well.sf.ca.us> writes:
> I think this is an *excellent* example of appropriate programming
> technology.  Dan Bernstein's hack of reading utmp twice and allocating
> 50 extra slots in case more users log in between the two is, when you
> come down to it, *no better*.

I address this below.  You aren't thinking things through.  My program would be objectively better than yours even if it allocated *zero* extra slots for users who log in after the first read.

> Just more complicated.  Worse, in fact,
> since he *doesn't* check for overflow.

I do check for overflow.  See that test for i < lines + 50?

Jef, I expect an apology.  I appreciate criticism of my code, particularly when it gives me better insight into what people are looking for.  But I don't appreciate someone trying to excuse his programming mistakes by saying ``Dan's code fucks up too'' when, in fact, my code works exactly as it's supposed to.  Before this, I thought you were the type who would give constructive criticism---things like ``You should've cast back and forth to char * or void * at the qsort() interface.''  Not false accusations that show you hardly even pay attention to what you're talking about.

> He complained about a hard
> limit of 200 users and then

Actually, my main project for last May was writing pty 3.0 from scratch, including the PD utilities (like u.c, who.c, etc.) that come with the package.  So don't think I'm complaining about a problem without already having tried to fix it.

> went and programmed a different hard limit
> of 50 new users in an unknowable time period.

You are wrong.  You're correctly reporting the limit I coded, but you said above that this behavior is no better than your fixed limit.  In that statement you are wrong.

I won't go into a long treatise about the principles of taking snapshots of a dynamic system.  But here are the two most important properties of ``foo'', a utmp scanner:

1. foo reports a user only if he is logged on at some point between when foo is invoked and when it finishes.

2. There is some time interval during which foo is running, such that foo reports any user who is logged on throughout that interval.

(The reason readdir() isn't safe for some applications is that it doesn't obey #2 in most implementations.)

Guess what?  My version of users satisfies these properties.  Your version fails #2.

Now you can talk all you want about reallocating memory (btw, there's no safe way to use realloc(), but you knew that) to read in as many users as possible.  I'll skip the comments about a quadratic time requirement, and about people who simply *talk* about code instead of *writing* code, and cut to the heart of the issue: You won't be able to identify a single functional requirement that your reallocating version satisfies and that my users program doesn't.

You see, users has to exit at some point, and before that point there must be a window when users doesn't detect new logins.  No external requirement can tell how big that window is.  So there's no way to tell the difference between a program that cuts things off when people log on too fast and a program that is cut off by the scheduler.  The best you can do is #2 above.  (This explanation isn't particularly lucid, but if you try to say what advantage a reallocating version will have, you'll realize that there is none.)

---Dan
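[Dan's two-pass scheme, as described in the thread (count the records first, then re-read with 50 slots of slack), might be sketched like this over a plain text file of one name per line.  The function name, the record format, and RECLEN are invented for illustration; only the `+ 50` overflow margin comes from his description.  Anyone present in the file during both passes is reported, which is property #2 above.]

```c
#include <stdio.h>
#include <stdlib.h>

#define RECLEN 16  /* one name per line in this sketch, max 15 chars */
#define EXTRA 50   /* slack for records added between the two passes */

/* Two-pass snapshot: pass 1 counts the records, pass 2 re-reads at
   most count+EXTRA of them into a freshly sized buffer.  Returns the
   number of records read, or -1 if the allocation fails. */
long read_names(FILE *fp, char **out)
{
    long count = 0, i = 0;
    int c;

    while ((c = getc(fp)) != EOF)   /* pass 1: count lines */
        if (c == '\n')
            count++;
    rewind(fp);

    *out = malloc((count + EXTRA) * RECLEN);
    if (*out == NULL)
        return -1;

    /* pass 2: the i < count + EXTRA test is the overflow check */
    while (i < count + EXTRA && fgets(*out + i * RECLEN, RECLEN, fp) != NULL)
        i++;
    return i;
}
```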
jef@well.sf.ca.us (Jef Poskanzer) (01/25/91)
In the referenced message, diamond@jit345.enet@tkou02.enet.dec.com (Norman Diamond) wrote:
}>Gosh, in ten years, if every trend in computer usage magically reverses
}>itself, I'll get a message telling me to change the number from 1000 to
}>10000.
}
}Suppose someone starts logging NFS clients?  Or the clients of some other
}service?  1000 would already be a bit small for that.

Huh?  In the users command?  What are you talking about?  Stick to the given problem domain.

}>Yes, it does check for overflow.
}
}Uh, you mean that it doesn't abort on overflow, but only gives inaccurate
}answers.  OK, so your example does about 1/4 of what a good example would do.

No, of course that's not what I mean.  It checks for overflow, tells you that it needs to be recompiled on overflow, and aborts on overflow.  Why is that so hard to understand?  Complete source is appended, so that we will have no more creative misunderstandings.  Note that it does a few more things than the usual users command.

Anyway, if you don't like the fixed-size array answer or the doubling-realloc answer or the read it twice answer, then let's see what you *do* like.  Time to sling some code, dude.
---
Jef

             Jef Poskanzer  jef@well.sf.ca.us  {apple, ucbvax, hplabs}!well!jef
                              INSPECTED BY #6

/*
** users - show users, with those on a list highlighted
**
** version of 10oct90
**
** Copyright (C) 1990 by Jef Poskanzer.
**
** Permission to use, copy, modify, and distribute this software and its
** documentation for any purpose and without fee is hereby granted, provided
** that the above copyright notice appear in all copies and that both that
** copyright notice and this permission notice appear in supporting
** documentation.  This software is provided "as is" without express or
** implied warranty.
*/

#include <stdio.h>
#include <strings.h>
#include <sys/types.h>
#include <utmp.h>

#ifndef UT_NAMESIZE
#define UT_NAMESIZE 8
#endif
#ifndef _PATH_UTMP
#define _PATH_UTMP "/etc/utmp"
#endif

#define TBUFSIZE 1024
#define MAXNAMES 1000
#define LINEWIDTH 79

extern char* getenv();
extern char* tgetstr();

int cmp();
int inlist();
void putch();

main( argc, argv )
    int argc;
    char* argv[];
    {
    char* term;
    char* strptr;
    char* soptr;
    char* septr;
    int smart;
    char buf[TBUFSIZE];
    static char strbuf[TBUFSIZE];
    struct utmp u;
    FILE* fp;
    static char friends[MAXNAMES][UT_NAMESIZE+1];
    static char users[MAXNAMES][UT_NAMESIZE+1];
    int i, nfriends, nusers;
    int wid;

    /* Check args. */
    if ( argc == 1 )
        ;
    else if ( argc == 3 && strcmp( argv[1], "-h" ) == 0 )
        {
        /* Read the friends list. */
        fp = fopen( argv[2], "r" );
        if ( fp == NULL )
            {
            perror( argv[2] );
            exit( 1 );
            }
        nfriends = 0;
        while ( fgets( buf, sizeof(buf), fp ) != NULL )
            {
            if ( buf[strlen(buf)-1] == '\n' )
                buf[strlen(buf)-1] = '\0';
            if ( nfriends >= MAXNAMES )
                {
                (void) fprintf( stderr,
                    "Oops, too many names in the friends file.  Gotta increase MAXNAMES.\n" );
                exit( 1 );
                }
            (void) strncpy( friends[nfriends], buf, UT_NAMESIZE );
            friends[nfriends][UT_NAMESIZE] = '\0';
            ++nfriends;
            }
        (void) fclose( fp );
        /* qsort( friends, nfriends, sizeof(friends[0]), cmp ); */
        }
    else
        {
        (void) fprintf( stderr, "usage: %s [-h highlightlist]\n", argv[0] );
        exit( 1 );
        }

    /* Initialize termcap stuff. */
    if ( isatty( fileno( stdout ) ) == 0 )
        smart = 0;
    else
        {
        term = getenv( "TERM" );
        if ( term == 0 )
            smart = 0;
        else if ( tgetent( buf, term ) <= 0 )
            smart = 0;
        else
            {
            strptr = strbuf;
            soptr = tgetstr( "so", &strptr );
            septr = tgetstr( "se", &strptr );
            if ( soptr == NULL || septr == NULL )
                smart = 0;
            else
                smart = 1;
            }
        }

    /* Open utmp and read the users. */
    fp = fopen( _PATH_UTMP, "r" );
    if ( fp == NULL )
        {
        perror( "utmp" );
        exit( 1 );
        }
    nusers = 0;
    while ( fread( (char*) &u, sizeof(u), 1, fp ) == 1 )
        {
        if ( u.ut_name[0] != '\0' )
            {
            if ( nusers >= MAXNAMES )
                {
                (void) fprintf( stderr,
                    "Oops, too many users logged in.  Gotta increase MAXNAMES.\n" );
                exit( 1 );
                }
            (void) strncpy( users[nusers], u.ut_name, UT_NAMESIZE );
            users[nusers][UT_NAMESIZE] = '\0';
            ++nusers;
            }
        }
    (void) fclose( fp );
    qsort( users, nusers, sizeof(users[0]), cmp );

    /* Show the users. */
    wid = 0;
    for ( i = 0; i < nusers; ++i )
        {
        if ( wid + strlen( users[i] ) + 3 > LINEWIDTH )
            {
            putchar( '\n' );
            wid = 0;
            }
        if ( wid > 0 )
            {
            putchar( ' ' );
            ++wid;
            }
        if ( inlist( users[i], friends, nfriends ) )
            {
            if ( smart )
                tputs( soptr, 1, putch );
            else
                putchar( '<' );
            fputs( users[i], stdout );
            if ( smart )
                tputs( septr, 1, putch );
            else
                putchar( '>' );
            if ( ! smart )
                wid += 2;
            }
        else
            fputs( users[i], stdout );
        wid += strlen( users[i] );
        }
    putchar( '\n' );

    exit( 0 );
    }

int
cmp( a, b )
    char* a;
    char* b;
    {
    return strcmp( a, b );
    }

int
inlist( str, list, nlist )
    char* str;
    char list[MAXNAMES][UT_NAMESIZE+1];
    int nlist;
    {
    int i;

    /* (This could be made into a binary search.) */
    for ( i = 0; i < nlist; ++i )
        if ( strcmp( str, list[i] ) == 0 )
            return 1;
    return 0;
    }

void
putch( ch )
    char ch;
    {
    putchar( ch );
    }
jef@well.sf.ca.us (Jef Poskanzer) (01/25/91)
In the referenced message, brnstnd@kramden.acf.nyu.edu (Dan Bernstein) wrote:
}Ah, yes, Jef takes his place next to Chris on my list of gurus I've
}caught in a mistake.

You're right, I didn't notice the i < lines + 50 test.  I grovel at your feet O Master.

}Now you can talk all you want about reallocating memory (btw, there's no
}safe way to use realloc(), but you knew that)

Actually, I didn't.  Say more.

}I'll skip the comments about a quadratic time requirement,

Please do.

}and about people who simply *talk* about code instead of *writing* code,

Please get stuffed.

}You won't be able to identify a
}single functional requirement that your reallocating version

You must have mis-read my message.  I don't have any version which uses realloc.

}This explanation isn't particularly lucid,

You're right, but I understood it anyway.  As long as you've got that overflow check in there, fine, it works.  But after correctness you have to consider simplicity, and the fixed-size (but large and checked) array wins there.  I realize they tell you in Computer Science School that you're not supposed to do things like this.  I'm telling you now that it can be appropriate.
---
Jef

             Jef Poskanzer  jef@well.sf.ca.us  {apple, ucbvax, hplabs}!well!jef
                      Published simultaneously in Canada.
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (01/26/91)
In article <22879@well.sf.ca.us> Jef Poskanzer <jef@well.sf.ca.us> writes:
> }Now you can talk all you want about reallocating memory (btw, there's no
> }safe way to use realloc(), but you knew that)
> Actually, I didn't.  Say more.

Some versions of realloc() return the original pointer rather than 0 if they run out of memory.  So you have to code the malloc()/bcopy()/free() sequence yourself if you want error checking.

> }and about people who simply *talk* about code instead of *writing* code,
> Please get stuffed.

Hey, bud, you started.  My code can't defend itself against your insults, so someone has to do the job...  :-)

> }You won't be able to identify a
> }single functional requirement that your reallocating version
> You must have mis-read my message.  I don't have any version which uses
> realloc.

This was in the hypothetical case that you do write a reallocating version.

> As long as you've got that
> overflow check in there, fine, it works.  But after correctness you
> have to consider simplicity, and the fixed-size (but large and checked)
> array wins there.

It depends on whether you consider the fixed-size array to be correct.  Anyway, it's so simple to allow any number of users that you might as well make the change.

> I realize they tell you in Computer Science School
> that you're not supposed to do things like this.

Hey, bud, don't accuse me of being a computer scientist, or I'll have to start flaming you again.  (Last I heard, programming wasn't even part of the computer science curriculum.)

> I'm telling you now
> that it can be appropriate.

Be serious.  We're talking about a trivial piece of code.  Why is it ``appropriate'' to use an arbitrary limit when it's so easy to get rid of the limit?

---Dan
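[The defensive malloc()/bcopy()/free() sequence Dan describes, for pre-ANSI libraries whose realloc() could not be trusted, would look something like this sketch.  The name `xgrow` is invented, and ANSI memcpy stands in for BSD bcopy:]

```c
#include <stdlib.h>
#include <string.h>

/* Grow a buffer without trusting realloc(): allocate fresh storage,
   copy the old contents over, then free the old block.  Returns the
   new block, or NULL (with the old block left untouched) on failure. */
void *xgrow(void *old, size_t oldsize, size_t newsize)
{
    void *new = malloc(newsize);
    if (new == NULL)
        return NULL;
    if (old != NULL) {
        memcpy(new, old, oldsize < newsize ? oldsize : newsize);
        free(old);
    }
    return new;
}
```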
gwyn@smoke.brl.mil (Doug Gwyn) (01/26/91)
In article <22311:Jan2502:34:1191@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>... there's no safe way to use realloc() ...

In Standard C realloc() is required to be safe.  Of course it may return NULL even if you're attempting to shrink the allocation, although it is unlikely that an implementation would be so deficient.  The relevant point is that one should be prepared to deal with realloc() failure, not blindly assume it will always work.
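[The standard idiom Gwyn is pointing at: assign realloc()'s result to a temporary, so a NULL return doesn't clobber (and leak) the original pointer.  The wrapper name `grow_buf` is invented for illustration:]

```c
#include <stdlib.h>

/* Portable realloc() usage per the C standard: keep the old pointer
   until the new one is known good.  Returns 0 on success, -1 on
   failure, in which case *buf is still valid. */
int grow_buf(char **buf, size_t newsize)
{
    char *tmp = realloc(*buf, newsize);
    if (tmp == NULL)
        return -1;      /* *buf unchanged; caller can recover or free */
    *buf = tmp;
    return 0;
}
```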
smryan@garth.UUCP (Steven Ryan) (01/26/91)
>No, of course that's not what I mean.  It checks for overflow, tells
>you that it needs to be recompiled on overflow, and aborts on
>overflow.  Why is that so hard to understand?  Complete source is

Recompile what?  Is the source always available?  Is the build process properly documented and all build files available?  Is the routine coded so that Joe Average can fix, recompile, and continue in five minutes?  Do you know what Joe Average is going to think of you afterwards?  Do you think he'll be eager to run anything else with your name on it?

Why is it difficult for so-called programmers to avoid arbitrary limits?
--
...!uunet!ingr!apd!smryan                                       Steven Ryan
...!{apple|pyramid}!garth!smryan                  2400 Geng Road, Palo Alto, CA
manson@iguana.cis.ohio-state.edu (Bob Manson) (01/27/91)
In article <60@garth.UUCP> smryan@garth.UUCP (Steven Ryan) writes:
>Recompile what?  Is the source always available?  Is the build process
>properly documented and all build files available?  Is the routine
>coded so that Joe Average can fix, recompile, and continue in five
>minutes?  Do you know what Joe Average is going to think of you afterwards?

I know what I thought of the "person" that hard-coded a limit on the # of /etc/magic entries in AT&Ts file program...and it wasn't kind.  No, I didn't have source.  No, I couldn't recompile.  The solution was to write a replacement that didn't have any such stupid limit coded in it.

>Why is it difficult for so-called programmers to avoid arbitrary limits?

Because they don't care.  I've met several people who call themselves "programmers" that think writing portable, reasonably limit-free code is a joke.  They've just got a job to get done, a hacky piece of code to be written, and they don't care what it looks like or if it'll work a year from now.

I tend to write any program as if I were going to show it to someone else, someone who could appreciate it and say "That's a really sharp implementation" as opposed to "Who wrote this piece of shit?"  I tend to do this simply because I've had to port a wide range of software to various machines, and I can't say that I was pleased to have worked on most of it.  I really don't want someone calling me some of the names I've been calling others.

You think 1000 users is a large number in a users program?  Suppose I decide to start recording all users over a large network in my utmp file?  (Wouldn't that be nice...how I hate rwho.)  I'll bet that in a few years, 1000 will be far too small....and I won't be able to recompile your program, because let's face it, 99.9% of all Unix distributors don't give source.

So get a grip, take the time to create data structures that don't involve fixed-sized arrays, and a lot of people will be much happier with you.  I know it's hard to think that not everyone has two machines & 10 users, but it's true.

>...!uunet!ingr!apd!smryan                                       Steven Ryan

						Bob
						manson@cis.ohio-state.edu
rsalz@bbn.com (Rich Salz) (01/29/91)
In <23975:Jan2516:36:5891@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>Some versions of realloc() return the original pointer rather than 0 if
>they run out of memory.

Then such versions are seriously broken, since returning the original pointer is a valid thing to do if there already was enough space allocated.  I would not code for such systems.
--
Please send comp.sources.unix-related mail to rsalz@uunet.uu.net.
Use a domain-based address or give alternate paths, or you may lose out.
jef@well.sf.ca.us (Jef Poskanzer) (01/29/91)
In the referenced message, Bob Manson <manson@cis.ohio-state.edu> wrote:
}You think 1000 users is a large number in a users program?  Suppose I
}decide to start recording all users over a large network in my utmp
}file?  (Wouldn't that be nice...how I hate rwho.)

Yes, that might be nice... but if you did that, why would you want to run "users"?  Three screenfuls of usernames is not particularly useful.  And as for piping it to another program, there's the small problem that most "users" programs don't bother to write out any newlines.  When you have fixed the far more serious problem of most Unix programs dumping core on such input (not even a "recompile me" message, how rude), then maybe I'll consider it worthwhile to add the malloc gunk.

In general, sure, handling arbitrary input is great.  In specific cases where you can make a confident estimate of the maximum input size, I have no problem at all with using checked fixed size arrays of ten times that size.  The benefit is N fewer lines to get wrong, and the cost, if your estimate is good, is non-existent.

}I'll bet that in a few years, 1000 will be far too small....

What is the precise meaning of "far too small"?  At least one system where 1000 is too small?  We probably have that already.  But if you mean that such systems will be common, sure, I'll take that bet.  How much?

}and I won't be able to
}recompile your program, because let's face it, 99.9% of all Unix
}distributors don't give source.

I give source.  In fact, one reason I like code which prints messages like "change XYZ and recompile me please" is to discourage bozos from doing any god damned binary-only distributions of *my* source.
---
Jef

             Jef Poskanzer  jef@well.sf.ca.us  {apple, ucbvax, hplabs}!well!jef
                          "So young, so bad, so what."
manson@python.cis.ohio-state.edu (Bob Manson) (01/29/91)
In article <22921@well.sf.ca.us> Jef Poskanzer <jef@well.sf.ca.us> writes:
>In the referenced message, Bob Manson <manson@cis.ohio-state.edu> wrote:
>}You think 1000 users is a large number in a users program?  Suppose I
>}decide to start recording all users over a large network in my utmp
>}file?  (Wouldn't that be nice...how I hate rwho.)
>
>Yes, that might be nice... but if you did that, why would you want to
>run "users"?

Well, I probably wouldn't want to _look_ at it, per se.  But...

>Unix programs dumping core on such input (not even a "recompile me"
>message, how rude), then maybe I'll consider it worthwhile to add the
>malloc gunk.

The argument of "everything else is busted, so I'll leave my program broken too" isn't a real good one, but I do see a point there.  Hmmm...Let's see what dies on a 13K input line.  (Sun SLC+ running SunOS 4.1.)  Well, tr was happy to convert the spaces into newlines, and I don't see much reason to go further, as the output of that could be postprocessed as I wished.  (Yes, most unix utilities puke badly on input lines > 2K.  On this Sun, grep and egrep deal with it OK, producing correct output, but sed silently truncates the output to 4001 bytes.  The behavior of grep & egrep is atypical, but I'll bet tr will work in any case.)

>In general, sure, handling arbitrary input is great.  In specific cases
>where you can make a confident estimate of the maximum input size, I have
>no problem at all with using checked fixed size arrays of ten times

I've had to deal with one too many utilities where someone makes a "confident estimate of the maximum input size" only to find that it's too small.  Assuming that someone would never have more than 2048 password entries, for example.  OK, I question strongly whether most unix sites have 2500 entries in their password files.  Ours did (when I worked for the CIS dept. here), and I didn't have source to the programs.  I was hosed.  Seeing messages from programs like "recompile program with larger NENTS" is useless in these cases, as all I can do is call {insert your workstation maker here} and say "I need program X recompiled with a larger NENTS" and they laugh.  And not everyone who does sysadmin is even capable of recompiling programs; the people I'm currently working with couldn't if their life depended on it.

>What is the precise meaning of "far too small"?  At least one system
>where 1000 is too small?  We probably have that already.  But if you
>mean that such systems will be common, sure, I'll take that bet.  How
>much?

What does "common" have to do with anything?  If your utility won't work at my site, what good is it?

>I give source.  In fact, one reason I like code which prints messages
>like "change XYZ and recompile me please" is to discourage bozos from
>doing any god damned binary-only distributions of *my* source.

Hasn't stopped HP or AT&T from distributing code with similar limits.  Won't stop anyone else either.

I know what you're trying to say.  It's a useless waste of time to write extra code to make a program limit-independent when we can make a good estimate of the maximum numbers & provide source for recompilation.  My argument is, it really doesn't cost that much more to design the program properly to function without limits.  The cost in making utilities with fixed limits in them is unhappy customers & time spent rewriting programs, since I seriously doubt source policies will change anytime soon.  Your point about utilities dying on too long input lines is an excellent example; really, there is no such thing as a "too long input line".  Whoever wrote sed decided that lines would never be longer than 4000 characters, and they were quite wrong...

>             Jef Poskanzer  jef@well.sf.ca.us  {apple, ucbvax, hplabs}!well!jef

						Bob
						manson@cis.ohio-state.edu
barmar@think.com (Barry Margolin) (01/30/91)
In article <87774@tut.cis.ohio-state.edu> Bob Manson <manson@cis.ohio-state.edu> writes:
>I know what you're trying to say.  It's a useless waste of time to
>write extra code to make a program limit-independent when we can make
>a good estimate of the maximum numbers & provide source for
>recompilation.  My argument is, it really doesn't cost that much more
>to design the program properly to function without limits.  The cost in
>making utilities with fixed limits in them is unhappy customers & time
>spent rewriting programs, since I seriously doubt source policies will
>change anytime soon.  Your point about utilities dying on too long
>input lines is an excellent example; really, there is no such thing as
>a "too long input line".  Whoever wrote sed decided that lines would
>never be longer than 4000 characters, and they were quite wrong...

I agree with this most emphatically.  The kind of software design Mr. Manson is complaining about is rampant in the industry, and pervades Unix.  Most programmers learn software design by example.  Sometimes this is good, when a good programming style (e.g. programs that filter stdin to stdout) is mimicked, but it also propagates poor programming practices.  When I talk about the "brokenness" of Unix, it's this kind of stuff I'm thinking of.

These kinds of problems aren't just in utility programs, but in just about every level of the system.  For instance, file descriptors are indexes into a per-process table in the kernel; in many of the older Unix versions I don't even think the size of this table was configurable, but I may be mistaken.  Of course, there are often good reasons to put some limits on per-process and per-user resources, to keep a single user or buggy program from hogging a system, but why aren't they runtime options?  Why should I have to rebuild a kernel because I need more ptys?

Yes, I admit that it is easier to program with fixed-size tables and buffers, but who ever said good programming was supposed to be easy?  Of course, I'm biased, because I do much of my programming in Lisp, which makes it easy to write programs with few arbitrary limits.
--
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar
schwartz@groucho.cs.psu.edu (Scott Schwartz) (01/30/91)
barmar@think.com (Barry Margolin) writes:
| I agree with this most emphatically.  The kind of software design Mr. Manson
| is complaining about is rampant in the industry, and pervades Unix.  Most
| programmers learn software design by example.  Sometimes this is good, when
| a good programming style (e.g. programs that filter stdin to stdout) is
| mimicked, but it also propagates poor programming practices.  When I talk
| about the "brokenness" of Unix, it's this kind of stuff I'm thinking of.

Part of the problem is that the standard libraries most systems supply are flawed in various ways.  In stdio, ``gets'' leaps to mind.  Moreover, ``fgets'' imposes an upper bound on input length, so lots of programs inherit that flaw.  In V10 the fast io library imposes a fixed length (not even user selectable) on lines that ``Frdline'' will return.

Happily, Chris Torek's new 4.4BSD stdio provides a way to read lines of any length using ``fgetline''.  The only problem with that is that there is no general mechanism to read arbitrarily long tokens -- fgetline should either take a user supplied delimiter, or there should be a separate routine (fgettoken?) to do the job.  Now's the time to fix this, before 4.4BSD really hits the streets.

| I'm biased, because I do much of my programming in Lisp, which
| makes it easy to write programs with few arbitrary limits.

I'd kill for a scheme compiler that was suitable for writing systems programs.
sef@kithrup.COM (Sean Eric Fagan) (01/30/91)
In article <87774@tut.cis.ohio-state.edu> Bob Manson <manson@cis.ohio-state.edu> writes:
>Hmmm...Let's see what dies on a 13K input line.  (Sun SLC+ running
>SunOS 4.1.)  Well, tr was happy to convert the spaces into newlines,
>and I don't see much reason to go further, as the output of that could
>be postprocessed as I wished.  (Yes, most unix utilities puke badly on
>input lines > 2K.

On kithrup (an SCO SysVr3.2v2 system), tr, grep, and wc all dealt nicely with a 120k line.  (/etc/termcap is useful for such things 8-).)  I was actually quite impressed.

>>I give source.  In fact, one reason I like code which prints messages
>>like "change XYZ and recompile me please" is to discourage bozos from
>>doing any god damned binary-only distributions of *my* source.
>Hasn't stopped HP or AT&T from distributing code with similar limits.
>Won't stop anyone else either.

Just a note here.  SCO's version of yacc has some semi-fixed limits.  If it runs out of space for some of the tables, it complains, and says to rerun it with a different option.  The actual message is something like:

	Out of <whatever> space.  Run with -Sm# option (current setting 5000).

(That's how I got perl working.)  Although that's not the best solution (for example, it could be argued that yacc should realloc() up the space itself), it *does* manage what I consider a decent compromise between static space and run-time limitations.

Anyway, just my two cents...
--
Sean Eric Fagan  | "I made the universe, but please don't blame me for it;
sef@kithrup.COM  |  I had a bellyache at the time."
-----------------+           -- The Turtle (Stephen King, _It_)
Any opinions expressed are my own, and generally unpopular with others.
chip@tct.uucp (Chip Salzenberg) (01/30/91)
According to schwartz@groucho.cs.psu.edu (Scott Schwartz):
>Happily, Chris Torek's new 4.4BSD stdio provides a way to
>read lines of any length using ``fgetline''.

BSD isn't the world; fixing 4.4BSD won't help me.  Each site (or programmer) needs to write fgetline() or its moral equivalent using getc(), malloc() and realloc(), and use it every time gets() or fgets() would have been used.
--
Chip Salzenberg at Teltronics/TCT     <chip@tct.uucp>, <uunet!pdn!tct!chip>
 "I want to mention that my opinions whether real or not are MY opinions."
             -- the inevitable William "Billy" Steinmetz
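[A minimal "moral equivalent" of the kind Chip describes, built from getc(), malloc() and realloc().  The caller-supplied delimiter follows Scott Schwartz's fgettoken suggestion upthread (pass '\n' to get fgetline behavior); the body is a sketch, not Torek's 4.4BSD code:]

```c
#include <stdio.h>
#include <stdlib.h>

/* Read one token of any length, up to (and consuming) the delimiter
   or EOF.  Returns a malloc'd NUL-terminated string the caller must
   free, or NULL at EOF with nothing read (or on allocation failure). */
char *fgettoken(FILE *fp, int delim)
{
    size_t len = 0, cap = 16;
    char *buf = malloc(cap), *tmp;
    int c;

    if (buf == NULL)
        return NULL;
    while ((c = getc(fp)) != EOF && c != delim) {
        if (len + 1 == cap) {           /* keep room for the NUL */
            tmp = realloc(buf, cap *= 2);
            if (tmp == NULL) { free(buf); return NULL; }
            buf = tmp;
        }
        buf[len++] = c;
    }
    if (c == EOF && len == 0) { free(buf); return NULL; }
    buf[len] = '\0';
    return buf;
}
```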
martin@mwtech.UUCP (Martin Weitzel) (01/31/91)
In article <87774@tut.cis.ohio-state.edu> Bob Manson <manson@cis.ohio-state.edu> writes:
>My argument is, it really doesn't cost that much more
>to design the program properly to function without limits.  The cost in
>making utilities with fixed limits in them is unhappy customers & time
>spent rewriting programs, since I seriously doubt source policies will
>change anytime soon.

Well, ...

science fiction ON

I could see a time when by law

a) software manufacturers must name all fixed limits of their products

b) the customer can assume that all unnamed limits are in fact not fixed to some arbitrary value

c) the customer has the right to request the sources and whatever else is needed (e.g. a special compiler) from the manufacturer for no additional fee if any limit is hit which is below the promises of a) and b)

science fiction OFF

Of course, somewhat more realistic is that we'll have PD-versions of all the useful programs some day so that no manufacturer can sell something of less quality ...
--
Martin Weitzel, email: martin@mwtech.UUCP, voice: 49-(0)6151-6 56 83
byron@archone.tamu.edu (Byron Rakitzis) (01/31/91)
In article <22921@well.sf.ca.us> Jef Poskanzer <jef@well.sf.ca.us> writes:
>In the referenced message, Bob Manson <manson@cis.ohio-state.edu> wrote:
>}You think 1000 users is a large number in a users program? Suppose I
>}decide to start recording all users over a large network in my utmp
>}file? (Wouldn't that be nice...how I hate rwho.)
>
>Yes, that might be nice... but if you did that, why would you want to
>run "users"? Three screenfuls of usernames is not particularly
>useful. And as for piping it to another program, there's the small
>problem that most "users" programs don't bother to write out any
>newlines. When you have fixed the far more serious problem of most
>Unix programs dumping core on such input (not even a "recompile me"
>message, how rude), then maybe I'll consider it worthwhile to add the
>malloc gunk.
>
>In general, sure, handling arbitrary input is great. In specific cases
>where you can make a confident estimate of the maximum input size, I have
>no problem at all with using checked fixed size arrays of ten times
>that size. The benefit is N fewer lines to get wrong, and the cost, if
>your estimate is good, is non-existent.

I think the point made here is that there *are* utilities written with
bad a priori limits in their data structures. The most flagrant
examples I can think of are vi and sh. Under certain circumstances, if
you declare too many (== over 30 or so, not really that many!!)
environment variables, vi and sh will dump core on my sun 4/280
running StunOS 4.0.3. It remains to be seen whether Sun addressed this
bug in 4.1, but in the meantime I will agree wholeheartedly with the
opinion that hard limits in code must be avoided.

I've finished writing a small sh-like shell whose only hard limit
(which I'm thinking of taking out) is the number of commands that can
be entered in a single pipeline. Currently the value is 512, more than
the maximum number of processes allowed on any unix machine I've seen,
so I consider myself safe. But at least I am aware of this as a
shortcoming.

Byron.
-- 
Byron Rakitzis
byron@archone.tamu.edu
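Byron's 512-command pipeline limit is exactly the kind of array the
doubling-realloc trick mentioned earlier in the thread removes. Here is a
minimal sketch in C; it assumes nothing about his shell's actual data
structures, and the names grow_cmds and cap are mine, not his:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>

/* Grow the cmds array to hold at least `need` entries, doubling the
 * capacity (tracked in *cap) each time it runs out.  Returns the new
 * array, or NULL on allocation failure (freeing the old block so the
 * caller cannot leak it). */
char **grow_cmds(char **cmds, size_t *cap, size_t need)
{
	char **tmp;

	if (*cap >= need)
		return cmds;
	if (*cap == 0)
		*cap = 16;		/* arbitrary small starting size */
	while (*cap < need)
		*cap *= 2;
	tmp = realloc(cmds, *cap * sizeof *cmds);
	if (tmp == NULL) {
		free(cmds);		/* don't leak the old block */
		return NULL;
	}
	return tmp;
}
```

The cost of this over a fixed array is a handful of lines and one extra
pointer of indirection; amortized over the life of the array, the number
of realloc calls is only logarithmic in the final size.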
tchrist@convex.COM (Tom Christiansen) (02/04/91)
From the keyboard of chip@tct.uucp (Chip Salzenberg):
:According to schwartz@groucho.cs.psu.edu (Scott Schwartz):
:>Happily, Chris Torek's new 4.4BSD stdio provides a way to
:>read lines of any length using ``fgetline''.
:
:BSD isn't the world; fixing 4.4BSD won't help me.

It's not the world, but it's a start. Do you have a scheme for fixing
everything everywhere simultaneously? It's a hard problem. (I often
wish RTM's Internet worm had gone around fixing broken code: the
ultimate update engine. :-)

:Each site (or programmer) needs to write fgetline() or its moral
:equivalent using getc(), malloc() and realloc(), and use it every time
:gets() or fgets() would have been used.

Ug. If it's written once, published, and made available for use free
of charge *and* without viral strings attached, each site or programmer
won't have to re-invent the wheel. Of course, sites without source are
still largely at the mercy of vendors.

--tom
-- 
Still waiting to read alt.fan.dan-bernstein using DBWM, Dan's own AI window
manager, which argues with you 10 weeks before resizing your window.
Tom Christiansen   tchrist@convex.com   convex!tchrist
darcy@druid.uucp (D'Arcy J.M. Cain) (02/07/91)
In article <1991Feb03.181937.9090@convex.com> Tom Christiansen writes:
>From the keyboard of chip@tct.uucp (Chip Salzenberg):
>:Each site (or programmer) needs to write fgetline() or its moral
>:equivalent using getc(), malloc() and realloc(), and use it every time
>:gets() or fgets() would have been used.
>
>Ug. If it's written once, published, and made available for use free
>of charge *and* without viral strings attached, each site or programmer
>won't have to re-invent the wheel. Of course, sites without source are
>still largely at the mercy of vendors.

OK, I have made a stab at it. Of course the first thing to do is
define it. I have whipped up a man page for the way I think this
function should work and it is included here. While I am at it the
code to implement it is also included. (Yah, source but it's so
small.) Anybody want to use this as a starting point? I have made it
completely free so that no one has to worry about licensing
restrictions. Besides, it's so trivial who couldn't duplicate it in 20
minutes anyway?

----------------------------- cut here --------------------------------
/*
NAME
	fgetline

SYNOPSIS
	char *fgetline(FILE *fp, int exclusive);

DESCRIPTION
	Reads a line from the stream given by fp and returns a pointer
	to the string. There is no length restriction on the returned
	string. Space is dynamically allocated for the string as
	needed. If the exclusive flag is set then the space won't be
	reused on the next call to fgetline.

RETURNS
	A pointer to the string without the terminating EOL is
	returned if successful or NULL if there was an error.

AUTHOR
	D'Arcy J.M. Cain (darcy@druid.UUCP)

CAVEATS
	This function is in the public domain.
*/

#include <stdio.h>
#include <malloc.h>

/* I originally was going to use 80 here as the most common case but */
/* decided that a few extra bytes to save a malloc from time to time */
/* would be a better choice. Comments welcome. */
#define CHUNK	128

static char *buf = NULL;

char *fgetline(FILE *fp, int exclusive)
{
	size_t sz = CHUNK;  /* this keeps track of the current size of buffer */
	size_t i = 0;       /* index into string tracking current position */
	char *ptr;          /* since we may set buf to NULL before returning */
	int c;              /* to store getc() return */

	/* set buf to 128 bytes */
	if (buf == NULL)
		buf = malloc(sz);
	else
		buf = realloc(buf, sz);

	/* check for memory problem */
	if (buf == NULL)
		return(NULL);

	/* get characters from stream until EOF */
	while ((c = getc(fp)) != EOF)
	{
		/* check for end of line */
		if (c == '\n')
			goto finished;	/* cringe */

		buf[i++] = c;

		/* check for buffer overflow */
		if (i >= sz)
			if ((buf = realloc(buf, (sz += CHUNK))) == NULL)
				return(NULL);
	}

	/* see if anything read in before EOF */
	/* perhaps some code to preserve errno over free() call needed? */
	if (!i)
	{
		free(buf);
		buf = NULL;
		return(NULL);
	}

finished:
	buf[i++] = 0;

	/* the realloc may be overkill here in most cases - perhaps it */
	/* should be moved to the 'if (exclusive)' block */
	ptr = buf = realloc(buf, i);

	/* prevent reuse if necessary */
	if (exclusive)
		buf = NULL;

	return(ptr);
}
---------------------------------------------------------------------------
-- 
D'Arcy J.M. Cain (darcy@druid)  |  D'Arcy Cain Consulting  |  There's no government
West Hill, Ontario, Canada      |                          |  like no government!
+1 416 281 6094                 |
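One nit worth flagging in the code above: the `buf = realloc(buf, ...)`
idiom overwrites the only pointer to the old block, so when realloc
fails the buffer leaks. A small wrapper in the same public-domain
spirit avoids that; this is just a sketch, and the name
xrealloc_or_free is mine, not part of the posted code:

```c
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

/* Like realloc(), but frees the old block when realloc() fails, so
 * the caller cannot leak it by overwriting its only pointer with the
 * NULL return value. */
void *xrealloc_or_free(void *p, size_t n)
{
	void *q = realloc(p, n);

	if (q == NULL)
		free(p);
	return q;
}
```

With this, the overflow check in fgetline could read
`if ((buf = xrealloc_or_free(buf, sz += CHUNK)) == NULL) return NULL;`
and no storage would be lost on a memory error.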
sms@lonex.radc.af.mil (Steven M. Schultz) (02/07/91)
In article <1991Feb6.170055.2081@druid.uucp> darcy@druid.uucp (D'Arcy J.M. Cain) writes:
>In article <1991Feb03.181937.9090@convex.com> Tom Christiansen writes:
>>From the keyboard of chip@tct.uucp (Chip Salzenberg):
>>:Each site (or programmer) needs to write fgetline() or its moral

This whole thread 1) is inappropriate for this group (it has about as
much to do with 4BSD as the price of tea in China) and 2) has devolved
into a religious issue best resolved off in a place like
alt.computers.religion.

Please move this discussion elsewhere.

	Steven
mcdaniel@adi.com (Tim McDaniel) (02/08/91)
torek@h2opolo.ee.lbl.gov (who CALLS himself Chris Torek) writes:
int
fgetline(FILE *stream, int *len);
...
while ((p = fgetline(inf)) != NULL)
Notice the "int" return code and the argument count problem. Chris
Torek making TWO errors in a SINGLE posting (about a routine He designed
and wrote) is patently impossible. torek@h2opolo.ee.lbl.gov is
therefore NOT The REAL Chris Torek, but a shameless forger.
It's obvious what happened. Someone saw Chris Torek's announcement
that He was off the net temporarily and didn't know His future e-mail
address. This person waited a plausible amount of time, and then
posted this crude forgery. The "oops, bug in the man page" followup
didn't fool me one little bit.
I'm writing to the system administrators at ee.lbl.gov to get this
dastardly imposter fired immediately, and (if possible) brought up on
criminal charges.
--
Tim McDaniel Applied Dynamics Int'l.; Ann Arbor, Michigan, USA
Work phone: +1 313 973 1300 Home phone: +1 313 677 4386
Internet: mcdaniel@adi.com UUCP: {uunet,sharkey}!amara!mcdaniel
torek@h2opolo.ee.lbl.gov (Chris Torek) (02/08/91)
I posted two articles early this morning (<9644@dog.ee.lbl.gov> and
<9653@dog.ee.lbl.gov>) with the second being a correction to the
first. Now that article cancellation is fixed, here is a corrected
version; I have cancelled the previous two articles.

Before things get out of hand, here is the fgetline man page from
4.3-and-two-thirds-or-whatever-you-call-it:

FGETLINE(3)         UNIX Programmer's Manual          FGETLINE(3)

NAME
     fgetline - get a line from a stream

SYNOPSIS
     #include <stdio.h>

     char *
     fgetline(FILE *stream, int *len);

DESCRIPTION
     Fgetline returns a pointer to the next line from the stream
     pointed to by stream. The newline character at the end of
     the line is replaced by a '\0' character. If len is
     non-NULL, the length of the line, not counting the
     terminating NUL, is stored in the memory location it
     references.

SEE ALSO
     ferror(3), fgets(3), fopen(3), putc(3)

RETURN VALUE
     Upon successful completion a pointer is returned; this
     pointer becomes invalid after the next I/O operation on
     stream (whether successful or not) or as soon as the stream
     is closed. Otherwise, NULL is returned. Fgetline does not
     distinguish between end-of-file and error, and callers must
     use feof and ferror to determine which occurred. If an
     error occurs, the global variable errno is set to indicate
     the error.

     The end-of-file condition is remembered, even on a
     terminal, and all subsequent attempts to read will return
     NULL until the condition is cleared with clearerr.

     It is not possible to tell whether the final line of an
     input file was terminated with a newline.

ERRORS
     [EBADF]   Stream is not a stream open for reading.

     Fgetline may also fail and set errno for any of the errors
     specified for the routines fflush(3), malloc(3), read(2),
     stat(2), or realloc(3).

(the underlining and boldface have vanished, but the above should
still be comprehensible).

Note that fgetline makes no promises about the pointer it returns.
If you want a copy of the line, you must copy it yourself.
This is so that fgetline can return pointers within the original stdio
buffers; in particular, the sequence:

	/* add quote widgets */
	while ((p = fgetline(inf, (int *)NULL)) != NULL)
		if (fprintf(outf, ">%s\n", p) < 0)	/* error */
			break;

does not require an intermediate buffer into which lines are copied.
They go directly from the input file's buffer to the output file's
buffer. (Thus, there is one memory-to-memory copy in the above loop.)

It is unfortunate that there is no formal mechanism to avoid read
copies for other operations. In particular, copying an input file to
an output file could be done with no (user) memory-to-memory copies
with a loop of the form:

	while (there is more in the input buffer)
		write the input buffer to the output file;

whenever the block sizes match, since the input buffer can be written
to the output file with a direct write() system call. As it is, you
must use fread to obtain data, with at least one copy.

[Thanks to: Arnold Robbins, Cesar A Quiroz, Jef Poskanzer, Henry
Spencer, and Ray Butterworth for the fixes included here. (These are
in alphabetical order, by first name.)]
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab EE div (+1 415 486 5427)
Berkeley, CA		Domain: torek@ee.lbl.gov
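Since the pointer fgetline returns is only valid until the next I/O
operation on the stream, a caller that wants to keep a line around has
to copy it first. A minimal sketch of such a copy follows; the helper
name copyline is my own, not part of the 4.4BSD interface, and it
assumes len is the value fgetline stored (the length not counting the
terminating NUL):

```c
#include <stdlib.h>
#include <string.h>
#include <assert.h>

/* Copy a NUL-terminated line of length len (not counting the NUL)
 * into freshly malloc'ed storage.  Returns NULL on allocation
 * failure; the caller frees the result. */
char *copyline(const char *p, int len)
{
	char *q = malloc((size_t)len + 1);

	if (q != NULL)
		memcpy(q, p, (size_t)len + 1);	/* include the NUL */
	return q;
}
```

A caller would invoke it right after a successful fgetline, e.g.
`saved = copyline(p, len);`, before doing any further I/O on the
stream.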
darcy@druid.uucp (D'Arcy J.M. Cain) (02/08/91)
In article <9644@dog.ee.lbl.gov> torek@ee.lbl.gov (Chris Torek) writes:
>In article <1991Feb6.170055.2081@druid.uucp> darcy@druid.uucp
>(D'Arcy J.M. Cain) writes:
>>NAME
>>	fgetline
>>
>>SYNOPSIS
>>	char *fgetline(FILE *fp, int exclusive);
>
>Before things get out of hand, here is the fgetline man page from
>4.3-and-two-thirds-or-whatever-you-call-it:
>
>     NAME
>          fgetline - get a line from a stream
>
>     SYNOPSIS
>          #include <stdio.h>
>
>          int
>          fgetline(FILE *stream, int *len);

Oops, don't have such a beast on my SVR3.2. I thought from the
discussion that people were proposing such a function. I still like
the exclusive use flag though.
-- 
D'Arcy J.M. Cain (darcy@druid)  |  D'Arcy Cain Consulting  |  There's no government
West Hill, Ontario, Canada      |                          |  like no government!
+1 416 281 6094                 |