[net.unix-wizards] instability in Berkeley versus A

haahr@siemens.UUCP (08/05/85)

> > For an explanation of why "one program, one function, done well" is a good
> > way to build a system, see almost any discussion of the "Unix philosophy".
> > Try Kernighan & Pike.
> > -- 
> > 				Henry Spencer @ U of Toronto Zoology
> > 				{allegra,ihnp4,linus,decvax}!utzoo!henry
> 
> _Software Tools_, pg. 84 (Kernighan & Pike):

Hate to be picky, but _Software_Tools_ was written by Kernighan and P. J.
Plauger (of Whitesmiths).  But it is a great source for Unix philosophy.

> 	Efficiency is hardly of importance for a temporary hookup
> 	meant only to be used a few times. Should a particular
> 	organization of tools prove so useful that it begins to
> 	consume significant resources, then you can consider
> 	replacing it with a more efficient version. And you are
> 	way ahead at this point, for you are writing a program
> 	that has precise specifications and that has been shown
> 	to be useful.
> 
> I think columnar ls is a case in point, though perhaps a bit trivial to
> really worry about. From the human perspective, it is much more
> pleasant, and doesn't waste my time scrolling the listing off the
> screen. And if it is used heavily, why not incorporate it? ...

The definitive argument against Berkeley ls is in _Program_Design_in_
_the_UNIX_System_Environment_, by (would you believe) Kernighan & Pike,
BSTJ, October 1984, specifically pages 1601-1603.  To quote:  (note that
the authors refer to Berkeley ls as lsc)

	  Surprisingly, lsc operates differently if its output is a file
	or a pipe:
		lsc
	produces output different from
		lsc | cat
	The reason is that lsc begins by examining whether its output is
	a terminal, and prints its output in columns only if it is.  By
	retaining single-column output to files or pipes, lsc retains 
	compatibility with programs like grep or wc, which expect things
	to be printed one per line...
	  A more insidious problem with lsc is the columnation facility,
	which is actually a useful, general function, is built in and thus
	inaccessible to other programs that could use a similar compression.

The authors then suggest a general purpose filter (based on pr) that would
take its output and columnate it.  So, for five-column output from ls:
	ls | 5
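
A minimal sketch of such a filter, for what it's worth.  K&P base theirs
on pr; this one substitutes awk so the output is easy to predict, and the
name columnate is made up here for illustration (it is not a standard
command).  It prints N input lines per output row, separated by tabs:

```shell
# columnate N: hypothetical filter, not a standard command.
# Reads lines on stdin and prints them N per row, tab-separated.
# N defaults to 5 if no argument is given.
columnate() {
    awk -v n="${1:-5}" '
        {
            if ((NR - 1) % n)   # not the first item of a row: separate with a tab
                printf "\t"
            else if (NR > 1)    # first item of a later row: end the previous row
                printf "\n"
            printf "%s", $0
        }
        END { if (NR) printf "\n" }   # terminate the last row, if any input
    '
}
```

With this defined, "ls | columnate 5" plays the role of "ls | 5".  Where
pr supports the -N and -t options, something along the lines of
"ls | pr -5 -t" is the closer analogue to what the authors describe.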

On your concern for efficiency, the only unavoidable answer is yes, the
two-command version is slower and does involve more typing.  On typing,
create a shell script ll (or whatever) that does
	ls $* | 5
or, for people who like to see /*=@ etc after filenames
	ls -F $* | 5
If your shell provides aliasing or shell functions this is even easier.
This is, of course, slower than Berkeley ls, except that ls wouldn't have to
check where its output is going (one isatty call), so a vanilla ls
piping its output to grep will work faster.  Plus, for a command that
is executed interactively (I can think of very few occasions when one would
want to have columnar ls output to a terminal from something that is
time critical), I would doubt that the big problem comes from the pipes.
I have tried using an 'll' which is a csh alias for a pipe and can't notice
a difference (VAX-11/750, unloaded).  My point here is basically that you
are talking about gaining a trivial amount of efficiency in exchange for
elegance and simplicity.
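
For shells with functions, the ll wrapper might look like the sketch
below.  The columnate helper is a hypothetical stand-in for the "5"
filter (awk-based rather than pr-based, purely to keep the example
self-contained), and "$@" is used in place of $* so that filenames
containing spaces survive:

```shell
# columnate N: hypothetical stand-in for the "5" filter.
# Prints N input lines per output row, tab-separated.
columnate() {
    awk -v n="${1:-5}" '
        { if ((NR - 1) % n) printf "\t"; else if (NR > 1) printf "\n"
          printf "%s", $0 }
        END { if (NR) printf "\n" }
    '
}

# ll: plain ls piped through the columnating filter.
# ls itself needs no knowledge of columns or terminals.
ll() {
    ls "$@" | columnate 5
}

# llf: the same, with the /*=@ markers after filenames.
llf() {
    ls -F "$@" | columnate 5
}
```

The cost is one extra process and one pipe per invocation, which is the
"trivial amount of efficiency" at issue.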

>							... I agree with
> modular, single function boxes, but there should be some quarter given
> to practicality here; if it is used heavily, that is your license to
> incorporate it into the code. I won't argue the more esoteric features
> of ls. The point is, I think, that there are several levels of use:
> One-function boxes strung together with pipes, shell scripts, and for
> the most heavily used features that have made it to the level of
> 'heavily used shell scripts', C coding, or inclusion into an existing
> binary program.  I see no justification for the religious statement
> "Thou shalt code in sh." It is a relative, sliding scale that leaves
> room for things like Berkeley ls. Not to say that they have never
> over-done it, mind you...

They have overdone it all too often, it seems.  Suppose Berkeley had provided
a general purpose columnation command (even better if it were termcap
sensitive rather than fixed at 80 columns), had found that a lot of
people had aliases or shell scripts that piped ls to this program, had
found a way to keep it compatible with existing use (I don't like having
programs like ls checking if their output is a terminal), and had observed
a big win in speed from hacking the code into ls.  Then, yes, maybe
they should do ls -C.  But the Berkeley approach seems to be:  gee, output
from ls (and no other program? :-) runs off the screen a lot.  Why don't
I fix (break?) ls so that doesn't happen?  But I don't want to affect
any program that looks at ls, only what a user sees.  And only if they
decide not to use a filter like more for those really large directories.
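
The terminal check objected to above amounts to something like the
following sketch.  The function name list_names is made up for
illustration, and the shell's "[ -t 1 ]" test stands in for the isatty
call a C program would make on its standard output:

```shell
# list_names: hypothetical illustration of output that changes shape
# depending on where it is going.
list_names() {
    if [ -t 1 ]; then
        # stdout is a terminal: pack the names onto one row for the user
        printf '%s  ' "$@"
        printf '\n'
    else
        # stdout is a file or pipe: one name per line, for grep, wc, etc.
        printf '%s\n' "$@"
    fi
}
```

Run at a terminal, "list_names a b c" prints one row; run as
"list_names a b c | cat", it prints three lines.  That difference is
exactly the surprise K&P describe for lsc.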

The Berkeley philosophy is reminiscent of the FORTRAN programmer in
_Software_Tools_'s first chapter, who has the clever idea of not using
a red pencil to find all FORMAT statements in his program, and instead
writes a program that searches for the word FORMAT.  Now that they have
proven that columnation of output is useful, why not provide a general
purpose way of doing it?

					Paul Haahr
					..!allegra!princeton!macbeth!haahr

haahr@siemens.UUCP (08/05/85)

> > Nobody is arguing that the functionality isn't useful; it's just misplaced.
> > 
> > For an explanation of why "one program, one function, done well" is a good
> > way to build a system, see almost any discussion of the "Unix philosophy".
> 
> The problem is that it's very hard to define the "one function" in any
> way that's acceptable to more than a handful of people.
> 
> For listing a directory, my idea of "one function" would be to
> simply read the directory and print the names.  no more.  no
> sorting, no looking up attributes (stat calls), none of this
> unnecessary crud.  In fact, "ls" can do this, but you have to
> tell it "-f".
> 
> The real point is, of course, that by the time that you
> run a sh script like
> 
> 	ls | sort | mtimes | sizes | owner | links | permissions
> 
> every time you want what you would normally get with "ls -l"
> the whole system will die.

Yes.  But, aside from the fact that ls had the -l option, what makes
ls -l anything but another example of creeping featurism?  Berkeley ls
just takes a bad trend and makes it worse.  I want my ls command to be
what 4.2 and sysV call ls -1f.  This is useful to pipe to programs.
For a user-interface version (maybe lsc) I would set up a sh script or
alias of ls | grep -v '^\.' | sort | columnate.  If I find that this is
terribly slow, then maybe incorporate the checking for files beginning
with '.' into ls and provide a '-a' option, maybe even '-A' (same as
'-a' but without files '.' and '..') or maybe make it an option to trim
those files.  Next, if sorting is that important and the extra exec and
pipe I/O is a real slowdown, then add an option for sorting, or even
make that the default.
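
As a rough sketch, that pipeline could be wrapped up as below.  Here
"ls -1f" stands in for the bare ls I have in mind, and columnate is a
hypothetical awk-based filter (not a standard command) included only so
the example runs on its own:

```shell
# columnate N: hypothetical filter; prints N input lines per row,
# tab-separated.
columnate() {
    awk -v n="${1:-5}" '
        { if ((NR - 1) % n) printf "\t"; else if (NR > 1) printf "\n"
          printf "%s", $0 }
        END { if (NR) printf "\n" }
    '
}

# lsc: user-facing listing built from single-purpose pieces.
lsc() {
    ls -1f "$@" |       # raw directory contents, unsorted, one per line
        grep -v '^\.' | # drop names beginning with "."
        sort |          # put them in order
        columnate 5     # pack them into columns for the screen
}
```

Each stage can be dropped or swapped on its own: leave out the grep and
you have the effect of '-a'; leave out the sort and you have '-f'.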

ls -l is really a separate case.  It provides completely different
information.  Why not have one program which takes a list of named
files and produces ls -l style output?  Default might be the same as
ls or might be something else, and it might handle named directories
differently (ls -ld always seemed awkward to me; why not ls -l name and
ls -l name/*?).  I personally think of ls -l as a very different program
from vanilla ls, and consider the bundling of the two a design problem.
Maybe I'm alone in this.

You suggest that the functions that ls provides could be done as separate
filters.  Why not just provide wc-like command line options for which
fields you want (lsl -is for inode number and size) and make the default
all?  Simple to implement.  Separate function from other programs
(a program to return data from the inode rather than list files in a
directory).
Easy to understand what it does (no -[a-zA-Z]).  However, for the sake of
compatibility, things probably shouldn't be changed.  But ls -l isn't
a good example of anything.
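
To make the lsl idea concrete, here is a minimal sketch handling just
the two fields mentioned.  Everything about it is an assumption: lsl
does not exist, the output layout is invented, and it leans on
"ls -di" and "wc -c" for the data because a portable stat command
can't be assumed.  Size is therefore a byte count and only meaningful
for regular files:

```shell
# lsl [-i] [-s] file...: hypothetical command.
# Prints the selected fields (inode number, size in bytes) before each name.
# With no field options, all fields are printed.
# Note: plain sh has no local variables, so these leak into the caller.
lsl() {
    want_i=0
    want_s=0
    OPTIND=1                        # reset so the function can be called again
    while getopts is opt; do
        case $opt in
            i) want_i=1 ;;
            s) want_s=1 ;;
            *) echo "usage: lsl [-is] file..." >&2; return 2 ;;
        esac
    done
    shift $((OPTIND - 1))
    if [ "$want_i" -eq 0 ] && [ "$want_s" -eq 0 ]; then
        want_i=1                    # default: every field
        want_s=1
    fi
    for f in "$@"; do
        line=
        if [ "$want_i" -eq 1 ]; then
            # "ls -di" prints "inode name"; keep only the inode number
            line="$(ls -di -- "$f" | awk '{ print $1 }') "
        fi
        if [ "$want_s" -eq 1 ]; then
            # byte count; tr strips the padding some wc versions add
            line="$line$(wc -c < "$f" | tr -d ' ') "
        fi
        printf '%s%s\n' "$line" "$f"
    done
}
```

So "lsl -s foo" would print the size and name of foo, and "lsl foo" all
the fields this sketch knows about.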

> 		... surely sorting & columnising stand or
> fall together?

No more so than sorting and removing files that start with '.' are linked
or listing files and printing stat information on files are inseparably
bound.

> The real problem, is that there are people who imagine that
> the "one function" that a program is supposed to solve is
> the "one" that it originally solved, and that any change,
> either to delete "wrong" functionality, or add "new" functionality
> is automatically a "bad thing".  ...

It's just that the original design in Unix is, on average, good.  Not
to say that everything is the best it could possibly be, just not bad.

>				   ... (This is quite apart from the
> problem of incompatible versions, which really is a problem).

Yes.  No question there.

> Had "ls" output file names in columns from its first writing
> with an option to produce only 1 column when its output is
> being piped into some other program, then there would be none
> of this controversy.  (I know, people scream "but programs
> should always be written to assume that their output will
> be the input of another program")

Maybe, but I'm not sure that it would have.

> What's really hard to justify is having the output appear
> in a different form depending on where the output is going,
> but such is the price of backward compatibility - and we
> *know* who it is that screams loudest whenever something
> is changed that "breaks" something, don't we?

Agreed.  But kludging things to be backward compatible in special cases
is definitely not the right way to add features.

> Robert Elz			ucbvax!kre kre@monet.berkeley.edu

> ps: K&P on this topic suggest using "pr" as a columnising filter.
> To my mind, "pr" is a paginator, it's just as bad to make a paginator
> produce columns as some side effect as it is to make a directory
> listing program produce columns as a side effect - but of course,
> this was in "pr" from the beginning, so it is blessed...

It seems to me that a program for paginating might have to worry about
columns, but, yes, this is probably not the best place to put a columnator.
On the other hand, it is possible to use the columnator from pr from other
programs, whereas with ls it's at the very least kind of difficult (create
a file with the name of each line, do an ls -Cf, and hope that there aren't
two files with the same name? :-)
--
					Paul Haahr
					..!allegra!princeton!haahr

peter@kitty.UUCP (Peter DaSilva) (08/08/85)

> time critical), I would doubt that the big problem comes from the pipes.
> I have tried using an 'll' which is a csh alias for a pipe and can't notice
> a difference (VAX-11/750, unloaded).  My point here is basically that you
> are talking about gaining a trivial amount of efficiency in exchange for
> elegance and simplicity.

A VAX 750 is a bit of a powerhouse even today. On small machines that amount
of efficiency isn't trivial.

peter@kitty.UUCP (Peter DaSilva) (08/08/85)

> You suggest that the functions that ls provides could be done as separate
> filters.  Why not just provide wc-like command line options for which
> fields you want (lsl -is for inode number and size) and make the default
> all?  Simple to implement.

I implemented this. Called it "le" for "list extended". It dumps all sorts of
extra info too. Found I don't bother to use it. Anyone want a copy?

levy@ttrdc.UUCP (Daniel R. Levy) (08/11/85)

>I implemented this. Called it "le" for "list extended". It dumps all sorts of
>extra info too. Found I don't bother to use it. Anyone want a copy?

Please post to net.sources?
-- 
 -------------------------------    Disclaimer:  The views contained herein are
|       dan levy | yvel nad      |  my own and are not at all those of my em-
|         an engihacker @        |  ployer, my pets, my plants, my boss, or the
| at&t computer systems division |  s.a. of any computer upon which I may hack.
|        skokie, illinois        |
|          "go for it"           |  Path: ..!ihnp4!ttrdc!levy
 --------------------------------     or: ..!ihnp4!iheds!ttbcad!levy