[comp.sys.amiga] Pattern Matching & Shell Design

cc1@locus.ucla.edu (Michael Gersten) (01/14/87)

>>> = Mike Myers
>
>Uh, since I'm Mike Meyer (not plural, and add an e), and I said these
>things, I think I'll correct the attribution.

Sorry about that. Meyer. Mike.

[discussion of how the shell can tell what and how to expand]

>It can if it's told what they are. See the article I posted a couple
>of days ago that's a draft of "A Shell for Modern Personal Computers,"
>which describes one way of doing just that.

Ok. There are several problems with your shell.
#1: windows
#2: non flexible parameter passing

Windows: I can accept one window for normal work (both shell output and
command output and my input). I can accept a second window for history.

But I will not accept separate windows: one for talking to the shell, another
for talking to my program, and yet a third for history. I do not program
with the mouse. Have you used AmigaBasic yet? I tried, but the thing has
SOOO MUCH excessive use of the mouse that I cannot get comfortable
with it. I'm afraid that your shell will be similar.

As for parameters, what's wrong with a program determining the type of
its parameters based on the type of previous parameters? For such a
program, an explicit list of 'arg1, arg2, arg3 required, arg4' or whatnot
is not very good. As an example, consider DATE.

imagine this...

DATE 11:30
(shell): TIME arg required, enter:

(DATE can take TIME, DATE or DATE, TIME for its args; but the list would
(and does) have to specify one).

Or,

DATE
(shell): DATE and TIME args required, enter:

when DATE is happy to just print out the current date. In other words,
some program will do one of many different things, and interpret arguments
in many different ways, based on number/types of arguments.

Now, since your shell requires that whether or not args are expanded is
FIXED (a big complaint of mine) and not changeable, how do you handle
such a program which can take varying args, etc.? Answer: you don't. You
say, 'Do no arg checking or expansion', or 'Do no arg checking, but expand
filenames'. But, this is fixed, i.e. you can't say

set noglob
oddball df0:c/* = df1:c/*
set glob

or
oddball df0:c = c1:* c2:* c3:*

where the first is similar to PIP while the second is similar to cp
(yes, this seems far-fetched right now, but this is off the top of my head)
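
To make the noglob example concrete, here is a minimal Python sketch of a
shell-side expansion switch. It is only an illustration: the FILES list and
the expand() helper are hypothetical names invented here, and fnmatch stands
in for whatever pattern matcher a real shell would use.

```python
import fnmatch

# Hypothetical directory contents, standing in for df0:c and df1:c.
FILES = ["df0:c/copy", "df0:c/dir", "df1:c/list"]

def expand(args, noglob=False, files=FILES):
    """Return args with wildcard patterns replaced by matching names.

    With noglob set, every argument is passed through untouched, so the
    command itself gets to interpret the pattern.
    """
    if noglob:
        return list(args)
    out = []
    for arg in args:
        matches = sorted(fnmatch.filter(files, arg))
        # Like the Bourne shell, leave a pattern alone when nothing matches.
        out.extend(matches if matches else [arg])
    return out

print(expand(["df0:c/*", "=", "df1:c/*"]))
print(expand(["df0:c/*", "=", "df1:c/*"], noglob=True))
```

With expansion on, oddball sees three file names around the "="; with it
off, it sees the two patterns exactly as typed and can pair them up itself.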

>>Here we have some disagreements. I feel that ANY time a program deals
>>with files, it should be expanded, even if it is only one file
>>(like CD). As for non-standard, that's only Z and fastdir.
>>Besides, echo does deal with filenames--on Unix, I use it as a fast
>>ls to see if a file exists.
>
>I've used that trick before, but that's a kludge. Ls is nicer, and
>nearly as fast on well-run systems.

Ls is never 'as fast'. It has at least a program startup overhead, and it
has to fstat each file to see how big it is
(my ls is alias ls lf -s)

>With only ONE argument, I'd much rather have completion. If you can
>get to a 4.3 system, do "set filec", and then try typing escapes in
>the middle of file names. If you can't do that, get a copy of
>microGNUEmacs, and play with typing spaces at prompts for buffer and
>command names.

Completion sounds like real time expansion. If so, great, I'll take it.
But again, what if a command may take one item one time, or a lot of
items another time? (now I'm thinking of a copy program which would either
backup a disk or copy files depending on its args (trs-80 model 1))

>>3 with bogus semantics. That's one STRONG reason for this stuff to
>>be in a common subroutine library or by the shell.
>
>It's even better to put it where the user can play with it. See the
>above-mentioned posting.

But the user can play with it if there is an ordered list he can work
with.

[discussion of unix file-expansion]

>Almost every time I use sed, I have to go out of my way to quote the
>arguments. Without filename expansion, this wouldn't be a problem.
>Other expansion tricks hurt other programs. Of course, you probably do
>what I do, and make sure things never have the magic characters in
>them. We have odd restrictions on the names in our CMS system because
>of oddities in the Unix shell!

Ok. I just quote the arg to awk and sed automatically, since they expect
their arg to be in one of the argc/argv positions, not spread out over
two or three. (If I'm wrong, my apologies, but I've never checked, and always
assumed this to be the case.)

>Huh? You mean you can create a device gort/foo: ??? How about
>gort/my_foo:, then do a "list gort" to see the devices in it. I think
>not, but could be wrong (never had a reason to do anything THAT
>bizarre).

Hmm, I've never tried this, but I think
A: the mount as gort/my_foo: would work
B: the list gort would fail
C: a list gort/#?: should work, but probably won't (since DIR doesn't
expand device names, I don't think list will)

Well, I'll try that ASSIGN when I get home

>>As for copying df0: to ram:, or TYPE DF0: TO DF1:, or something:
>>I believe this is a fault of the BCPL programs doing checks that they
>>shouldn't be checking. If I am wrong,
>>
>>*** BUG REPORT ***
>
>I don't think it's a bug report - I think it's confusion on your part.
>Have you ever worked with a large system which had similar features
>(TOPS-10, VMS, and probably others)? On those systems, devices (and
>system logicals) ARE NOT part of the file name space. Then again, they
>tend to have ugly three-part file names: device, followed by a magic
>character; a directory path, with magic characters at both ends, and
>finally a file name. Probably split into a name and an extension, with
>a magic character separating. Compared to such kludginess, AmigaDOS is
>nice.

Nonsense. AmigaDOS requires all of its handlers to support common
READ and WRITE packets; exec requires all devices to support common
CMD_READ and CMD_WRITE messages. So, why not expect that if AmigaDOS
knows that vdisk.device, unit #2 is called by filesystem-handler under
name VD2:, then DOS knows both a handler and a device, and can get
the proper thing sent out? So, it can be done, it can be done easily,
ergo, it should be done. (Note: I did not say, "Put everything into the
kernel"; I said "Put it into the kernel if the kernel can do it easily". I
also feel "Put it into the kernel if it can't be done at all
any other way"; this applies to exec(), which is extremely hard to impossible
to do under AmigaDOS. Fexec() is not exec.)

>Of course, you still have to deal with the problem of "list par:"
>saying "packet request type unknown". Or is that a bug, also?

No, list would send a packet asking for file names, par: would say,
"Say WHAT?", list would reply 'par:' or 'Not a file system' or something
like that.

>>Actually, I think they should be FULLY integrated, and if they aren't,
>>blame Dos; if they are, blame the BCPL programs.
>
>I tend to think they aren't fully integrated, and think that DOS
>should be blamed. But it's hard to blame the designers: they did that
>work before Unix really became popular, and the way they did it was
>the way almost everyone else was doing it back then. "Blame it on
>TRIPOS."

The point is, things must evolve. Look at what Unix has done. Look at
what BSD has done. Ok, don't look at what BSD has done. Look at what
V8 is doing (or the rumors that I hear say they are doing). In short,

No, I don't expect to see Unix on the Amiga.
Yes, I think Metacomco should look at Unix and borrow the better stuff,
OR SUPPORT EXEC LIBRARY FORMAT so we can change AmigaDOS ourselves. Right
now, we are stuck with it, so it should not have these problems.

Flame off (sorry)

>>I don't know about you, but I HATE outfile=x type things. Absolutely HATE.
>>DO NOT put that into amiga dos. Religious fight up ahead 8<)
>>
>>Besides, quoting > is easy: \> or '>'. Quoting 'outfile=file*' is not
>>as easy, and you have to get it to expand and then ignore the outfile= part.
>
>You don't have to quote it if the command is doing the expanding
>instead of the shell. You seem to have ignored the possibility of it
>being shortened to o=fi*, also. And o= isn't noticeably harder to type
>than > (two key presses, either way, on the Amiga keyboard).

I said this was religious. My complaint is that the programmer has
significantly more work to do, the syntax is not as nice, various minor
things, etc.

Consider

mv i=(f? ~rem/uucp/cki*) o=~/uucp

to

mv f? ~rem/uucp/cki* ~/uucp

and others. Besides, if there is an o= to replace >, you have to worry
about a file named o=file1 being mentioned but not seen. (Yes, that's
pushing. But you mentioned filename restrictions from the Unix shell.)

>Of course, the shell can do all of this for you. See the
>afore-mentioned posting for details.
>
>>Well, we both agree that the shell should do it. We both agree that
>>it should be user configurable. Soon, we'll have something we both
>>hate equally--that will be it.
>
>Yes, but we got to the shell doing it via totally different paths. You
>started wanting a dumb shell, ala Unix, that just expands everything.
>I started wanting more intelligent handling of arguments. Making that
>user-configurable pushes things from the command to the command
>processor.  See the much-mentioned paper to see where that leads you,
>and how FAR that is from a dumb "file-expand-every-argument" shell.
>
>If that posting hasn't gotten to you by the time you see this,
>something probably broke. Please let me know.
>
>	<mike

I will have to re-read it, and give some more comments on it.
			Michael
      Views expressed here may not be those of the Computer Club, UCLA, or
  anyone in their left OR right mind.  And that's the name o' that tune.

mwm@eris.BERKELEY.EDU (Mike Meyer) (01/15/87)

In article <3728@curly.ucla-cs.UCLA.EDU> ucla-cs!cepu!ucla-an!remsit!stb!michael@ihnp4.UUCP cc1@LOCUS.UCLA.EDU (Michael Gersten) writes:
>>It can if it's told what they are. See the article I posted a couple
>>of days ago that's a draft of "A Shell for Modern Personal Computers,"
>>which describes one way of doing just that.
>
>Ok. There are several problems with your shell.
>#1: windows
>#2: non flexible parameter passing

#1 is a matter of taste. #2 is a misunderstanding of what was in the
paper. We'll talk about them both. What you actually get is a
stripped-down macro processor layered on top of a Unix-like facility.

>Windows: I can accept one window for normal work (both shell output and
>command output and my input). I can accept a second window for history.
>
>But I will not accept separate windows: one for talking to the shell, another
>for talking to my program, and yet a third for history.

Why do you assume there'll be a second window for talking to your
program? Unless it opens one, the shell's input window should work for
it. Just like csh/sh/ksh on Unix. Better yet, the history mechanism
should work for input to things other than the shell, unless it does
raw/cbreak mode stuff; *much* better than any Unix shell does (unless
you're running it in an Emacs, of course). If it doesn't do
line-oriented input, it's probably got its own window, and you'd have
to deal with that no matter WHAT shell you ran.

>I do not program
>with the mouse. Have you used AmigaBasic yet? I tried, but the thing has
>SOOO MUCH excessive use of the mouse that I cannot get comfortable
>with it. I'm afraid that your shell will be similar.

No, I haven't looked at AmigaBASIC. But that's a matter of taste, and
I even mentioned it towards the end. You don't have to use the mouse,
but doing without it would make your life more painful. I *do* program
with the mouse. Once again, get a copy of mg, and look at the mouse
interface I designed (both the editor hooks and the browser are my
design. Mic did a lot of work to make them nicer; except for taking the
names off the mouse keys so you can get both mouse keys and function
keys), then notice that I have changed things so that Echo-Mouse does a
"save-buffer", and you'll quickly realize that I can go through a short
edit session without ever typing characters. And I've done so.

>As for parameters, what's wrong with a program determining the type of
>its parameters based on the type of previous parameters? For such a
>program, an explicit list of 'arg1, arg2, arg3 required, arg4' or whatnot
>is not very good. As an example, consider DATE.

Yeah, I know. And you almost reached the correct conclusions:

>Now, since your shell requires that whether or not args are expanded is
>FIXED (a big complaint of mine) and not changeable, how do you handle
>such a program which can take varying args, etc.? Answer: you don't. You
>say, 'Do no arg checking or expansion', or 'Do no arg checking, but expand
>filenames'.

Well, almost right. You can tag the two arguments as optional,
possibly with keywords, like so:

	command date
	optional date expand date single
	optional time expand time single

Assuming the shell understood the date and time universes. That way,
you could type things like:

	date yes<space> noo<space>

and it would set the date to noon yesterday. Much nicer than the Unix
shells. Admittedly, putting date/time matching into the shell isn't
going to be useful for very many things, so it's not likely.
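
A minimal sketch of that kind of completion, in Python. The DATE_WORDS and
TIME_WORDS universes and the complete() helper are assumptions made up for
illustration; the mechanism in the paper may well differ.

```python
# Hypothetical "universes" the shell would know how to complete against.
DATE_WORDS = ["today", "tomorrow", "yesterday"]
TIME_WORDS = ["midnight", "noon", "now"]

def complete(prefix, universe):
    """Return the single word in universe that starts with prefix.

    Returns None when the prefix is ambiguous or matches nothing, in
    which case a real shell would presumably beep or list the choices.
    """
    matches = [word for word in universe if word.startswith(prefix)]
    return matches[0] if len(matches) == 1 else None

print(complete("yes", DATE_WORDS))
print(complete("noo", TIME_WORDS))
print(complete("no", TIME_WORDS))
```

Here "yes" and "noo" each pick out one word, while "no" is ambiguous between
"noon" and "now", which is why the example types the extra "o".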

>But, this is fixed, i.e. you can't say
>
>set noglob
>oddball df0:c/* = df1:c/*
>set glob

And what makes you think that you can't do that? Of course, I wouldn't
make it so ugly, having set noexpand and unset noexpand. Better yet, a
command/argument completion character that says "don't expand the
preceding." Thus saving you the trouble of doing the set/unset.

>>>Here we have some disagreements. I feel that ANY time a program deals
>>>with files, it should be expanded, even if it is only one file
>>>(like CD). As for non-standard, that's only Z and fastdir.
>>>Besides, echo does deal with filenames--on Unix, I use it as a fast
>>>ls to see if a file exists.
>>
>>I've used that trick before, but that's a kludge. Ls is nicer, and
>>nearly as fast on well-run systems.
>
>Ls is never 'as fast'. It has at least a program startup overhead, and it
>has to fstat each file to see how big it is
>(my ls is alias ls lf -s)

I said "nearly" as fast, and on well-run systems. You get to the point
where IO overhead (to the disk, then to the terminal) is the bulk of
the time. You're also assuming that "echo" is a built-in, which is a
fairly recent development in Unix shells. Of course, if you're adding
the fstat, you're putting an unfair handicap on ls. You could also
break some shell scripts, depending on where you do the alias. Not
well-written scripts, but we've done AT&T and BSD bashing before.

>>With only ONE argument, I'd much rather have completion. If you can
>>get to a 4.3 system, do "set filec", and then try typing escapes in
>>the middle of file names. If you can't do that, get a copy of
>>microGNUEmacs, and play with typing spaces at prompts for buffer and
>>command names.
>
>Completion sounds like real time expansion. If so, great, I'll take it.

For the shell I proposed, it's real-time expansion. In most places, it
isn't, as you're limited to only one "expansion."

>But again, what if a command may take one item one time, or a lot of
>items another time? (now I'm thinking of a copy program which would either
>backup a disk or copy files depending on its args (trs-80 model 1))

So? I don't see any problem. You describe it as a repeated arg, then
select a single device, or multiple files. Maybe you'd better describe
it in more detail.

>>>3 with bogus semantics. That's one STRONG reason for this stuff to
>>>be in a common subroutine library or by the shell.
>>
>>It's even better to put it where the user can play with it. See the
>>above-mentioned posting.
>
>But the user can play with it if there is an ordered list he can work
>with.

Huh? Please expand on this.

>[discussion of unix file-expansion]
>
>>Almost every time I use sed, I have to go out of my way to quote the
>>arguments. Without filename expansion, this wouldn't be a problem.
>>Other expansion tricks hurt other programs. Of course, you probably do
>>what I do, and make sure things never have the magic characters in
>>them. We have odd restrictions on the names in our CMS system because
>>of oddities in the Unix shell!
>
>Ok. I just quote the arg to awk and sed automatically, since they expect
>their arg to be in one of the argc/argv positions, not spread out over
>two or three. (If I'm wrong, my apologies, but I've never checked, and always
>assumed this to be the case.)

Nope, they aren't. They check for things after other arguments, and
similar oddities. Admittedly, they expect a single argument for them,
with other arguments keying them in. If the shell weren't doing
expansion (and variables, and loads of other things), this quoting
wouldn't be needed except for spaces.
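
The "except for spaces" case can be seen with Python's shlex module, which
splits a command line into words using POSIX shell quoting rules. This is
only a sketch of the word-splitting step; shlex does not do filename or
variable expansion, and the sed script and file name are made-up examples.

```python
import shlex

# Quoted: the whole sed script arrives as one argv entry.
quoted = shlex.split("sed 's/old name/new name/' notes.txt")

# Unquoted: the spaces inside the script split it across three entries.
unquoted = shlex.split("sed s/old name/new name/ notes.txt")

print(quoted)
print(unquoted)
```

Even a shell that expanded nothing at all would still need the quotes in the
first form, because splitting on spaces happens before the command runs.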

>Hmm, I've never tried this, but I think
>A: the mount as gort/my_foo: would work
>B: the list gort would fail
>C: a list gort/#?: should work, but probably won't (since DIR doesn't
>expand device names, I don't think list will)

I don't think they will work. Even if the mount works, those ARE NOT
file names, they're device names. Devices aren't in the file system
name space; some devices support it. Some even support it with
different semantics (consider a list of "CON:100/100/*/*")! More
logically, consider the semantics of a remote-mounted CP/M file
system, or some other creature. It'd still be "DEV:filename."

>Nonsense. AmigaDOS requires all of its handlers to support common
>READ and WRITE packets; exec requires all devices to support common
>CMD_READ and CMD_WRITE messages. So, why not expect that if AmigaDOS
>knows that vdisk.device, unit #2 is called by filesystem-handler under
>name VD2:, then DOS knows both a handler and a device, and can get
>the proper thing sent out?

Uh, you left out the hard part. How do you decide, under certain
conditions, which is "the correct thing"? Even Unix can't handle that.
Either something is a file, or it's a directory, and copying them does
different things. DF0: is a device and a directory; the program
dealing with it has to make assumptions about it. Or maybe you can
provide an algorithm that'll work?

>The point is, things must evolve. Look at what Unix has done. Look at
>what BSD has done. Ok, don't look at what BSD has done. Look at what
>V8 is doing (or the rumors that I hear say they are doing). In short,

What I've been saying all along. It's just that we disagree over which
direction things should go :-). Oh, yeah - you should look at what BSD
has done. The two most interesting Unix developments going on are
based on BSD (MACH on 4.3, and V8 on 4.1). And there are some good
things in BSD (just because the code is buggy, and the kernel makes me
want to regurgitate doesn't mean it's all bad). The single most
important thing that UCB CSRG did with BSD was to not give a shit
where things came from, and steal them if they were worth stealing.
Finally, you should also look at OS/9; it strikes an interesting
balance between Unix and Tripos/AmigaDOS. There are probably others
I've missed (MINIX, Amoeba, V, Locus, Domain/IX, and others come to
mind).

>No, I don't expect to see Unix on the Amiga.
>Yes, I think Metacomco should look at Unix and borrow the better stuff,
>OR SUPPORT EXEC LIBRARY FORMAT so we can change AmigaDOS ourselves. Right
>now, we are stuck with it, so it should not have these problems.
>
>Flame off (sorry)

One of the nicest things about Unix (well, early in its life,
anyway), and also one of the worst things about it, was that it came
with source, and you could fix things to be the way you wanted them.
The lossage was that everybody did things their way. At one Usenix (SF
in '80, I think), they had a questionnaire asking "What interesting
things have you done to the kernel (Don't bother mentioning tty
driver hacks)?" Everybody tweaked the tty driver, and in different
ways. This type of thing was (and still is) rampant.

I miss that facility in AmigaDOS (but not much. I get more than enough
systems hacking to make my living, thank you). The interesting thing
about Tripos/AmigaDOS (and OS/9, for that matter) is that the pieces
can be added without having to fool with the exec. For example, the
pipe: driver that's floating around (love it, love it, LOVE it. Got a
CLI script I'm going to post shortly; the pipe: device makes it go
LOTS faster, and you can do things you can't do in Unix!)

>>You don't have to quote it if the command is doing the expanding
>>instead of the shell. You seem to have ignored the possibility of it
>>being shortened to o=fi*, also. And o= isn't noticeably harder to type
>>than > (two key presses, either way, on the Amiga keyboard).
>
>I said this was religious. My complaint is that the programmer has
>significantly more work to do, the syntax is not as nice, various minor
>things, etc.

The programmer doesn't have to have more work to do; it depends on
what kind of support environment you have. Consider VMS or getopt on
Unix. Niceness of syntax is a matter of taste, and will depend heavily
on what you were brought up on. As someone put it, the really nasty
thing about Unix is that a generation of programmers is being brought
up thinking that the user interface isn't worth thinking about, and
that user-friendliness is for lusers.

>Consider
>
>mv i=(f? ~rem/uucp/cki*) o=~/uucp
>
>to
>
>mv f? ~rem/uucp/cki* ~/uucp

True, the <param>= isn't right for everything. But it sure makes life
a lot nicer for the things that it *IS* right for. For instance, I'd
love to have a symlink command that had those things. Its arguments
are backwards compared to the rest of the Unix world.

The best trick is to support (as I suggested in the paper) both
position and keyword driven arguments. In the shell, not a library,
so the user can get to them.
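
As a rough illustration of mixing the two styles, here is a Python sketch
that separates keyword arguments from positional ones. The parse_args()
helper and its default keyword set ("i" and "o") are assumptions for this
example only, not the syntax the paper defines.

```python
def parse_args(args, keywords=("i", "o")):
    """Split args into keyword and positional parts.

    An argument of the form key=value counts as a keyword argument only
    if key is one of the known keywords. The split happens at the first
    '=', so the value may itself contain '='.
    """
    named = {}
    positional = []
    for arg in args:
        key, sep, value = arg.partition("=")
        if sep and key in keywords:
            named[key] = value
        else:
            positional.append(arg)
    return named, positional

print(parse_args(["f?", "~rem/uucp/cki*", "o=~/uucp"]))
print(parse_args(["o=o=file1"]))
```

Because only the first "=" is significant, a value such as "o=file1" can
still be passed to the "o" keyword without any special quoting.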

>and others. Besides, if there is an o= to replace >, you have to worry
>about a file named o=file1 being mentioned but not seen. (Yes, that's
>pushing. But you mentioned filename restrictions from the Unix shell.)

Huh? Why so? Just type o=o=file1. The shell should have put
"output=o=file1" on the command line for you to see, though.

>>If that posting hasn't gotten to you by the time you see this,
>>something probably broke. Please let me know.
>
>I will have to re-read it, and give some more comments on it.

They'd be appreciated. The above has made me realize one thing: The
alias facility really needs to let you twiddle args, so that you can
have different restrictions on them. For instance, date might look
like so:

	command setdate alias date
	required date
	optional time
	command settime alias date
	required time
	command date
	optional date
	optional time

Which lets you have argument-driven versions of date to set the date
and time. Figuring out how to work that into a cshell-like mechanism
for the "alias" builtin may be a bitch, though.
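
One possible reading of that table, sketched in Python: try each description
in order and pick the first whose required and optional arguments fit what
was typed. The SPECS layout, the kind() classifier (anything containing ":"
is a time), and select() are all assumptions for illustration, not the
mechanism the paper specifies.

```python
# (name, required kinds, optional kinds), tried in order.
SPECS = [
    ("setdate", ("date",), ("time",)),
    ("settime", ("time",), ()),
    ("date", (), ("date", "time")),
]

def kind(token):
    """Crude classifier, assumed for this sketch: times contain ':'."""
    return "time" if ":" in token else "date"

def select(args):
    """Return the name of the first description that fits args, or None."""
    kinds = [kind(a) for a in args]
    if len(set(kinds)) != len(kinds):
        return None  # two dates or two times: no form accepts that
    for name, required, optional in SPECS:
        has_required = all(r in kinds for r in required)
        all_allowed = all(k in required + optional for k in kinds)
        if has_required and all_allowed:
            return name
    return None

print(select(["11:30"]))
print(select(["14-Jan-87", "11:30"]))
print(select([]))
```

Under these assumptions a bare time selects settime, a date with or without
a time selects setdate, and no arguments falls through to plain date.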

	<mike