[comp.unix.wizards] filename separators and option indicators

dankg@lightning.Berkeley.EDU (Dan KoGai) (05/30/90)

In article <BZS.90May23210652@world.std.com> bzs@world.std.com (Barry Shein) writes:
>
>>Using '/' for paths and '-' for options seems intuitive, especially '/'
>>for paths.  This is my guess why Thompson used them in Unix.
>
>Multics (the previous bad experience which inspired unix) used > for
>paths as I remember, with the reading of A>B>C as A "down" B "down" C.
>That's also pretty intuitive, but it was shifted which was probably a
>drawback (no, no, the > in the shell came later.)

	So Multics had no concept of "read from stdin" and "write to stdout"?
Well, those could've been "->" and "<-" (this is not INTERCAL!).  Among the
many CLIs, I love UNIX the best and always have trouble typing "A:\foo\bar",
but no CLI's choice of delimiter will ever be intuitive enough.
On the Macintosh, thanks to the GUI, only ":" is reserved, and it's the
directory delimiter.  That makes it hard for MPW users (Hi, robert!) to deal
with files, but for the "rest of them" it's nice to be able to make file
names such as "Foo killed bar's blech".

>You also had considerations like printing your output on 64-character
>band printers which were missing some characters.

	Some guys have the opposite problem:  too many characters to handle.
It takes at least 3,500 kanji (nice iconic characters from China) to handle
daily Japanese, and even more for Chinese.  As for Japanese, there are
several standards in use currently, and ASCII vs EBCDIC is nothing compared
to the complexity:  they have to use 16-bit chars instead of 8-bit, but they
also want to use standard ASCII (I'm not sure how widely EBCDIC is used in
Japan).  So there must be a delimiter to toggle 2-byte chars on and off.  One
implementation uses escape sequences.  Another uses the uppermost bit as a
switch.  The basic index is set by JIS (Japanese Industrial Standards, the
ANSI equivalent for the Japanese), but even that has changed once--some
characters were moved elsewhere, some deleted and some added.  (The fj.*
newsgroups certainly use New JIS:  new index, escape-sequence toggling,
7-bit compatible.)  This pain is something alphabet users can hardly
understand.
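
The two toggling schemes described above can still be seen in current
codecs.  What follows is only a minimal sketch, assuming Python 3 and
treating its bundled "iso2022_jp" and "euc_jp" codecs as stand-ins for the
"escape sequence" and "uppermost bit" approaches; the sample text and the
helper name is_seven_bit are illustrative choices, not anything from the
post.

```python
# Sketch: two ways of mixing ASCII with two-byte kanji.
# Assumes Python 3's bundled codecs; the sample text is illustrative only.
text = "ls 漢字"

# ISO-2022-JP stays 7-bit and switches character sets with escape sequences.
jis = text.encode("iso2022_jp")
# EUC-JP leaves ASCII bytes alone and sets the top bit on each kanji byte.
euc = text.encode("euc_jp")


def is_seven_bit(data: bytes) -> bool:
    """Return True when every byte has the uppermost bit clear."""
    return all(b < 0x80 for b in data)


print(jis)  # escape sequences bracket the kanji
print(euc)  # high-bit bytes mark the kanji
print(is_seven_bit(jis), is_seven_bit(euc))
```

The 7-bit form survives transports that strip the eighth bit, which is
presumably why a newsgroup hierarchy would prefer it; the high-bit form
needs no state to tell ASCII from kanji.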
	And in some Japanese implementations of ASCII, some of the punctuation
chars are replaced with others.  The funniest is that the backslash is
replaced with the yen sign ('Y' + '=').  And they are basically using the
same DOS.  So instead of a bunch of backslashes, they see a lot of yen signs
in their path strings (that applies to C's character escapes too!).  Think
about it:

	$usr$local$bin$bash

	No wonder they are rich, huh? :)
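
The substitution is a matter of display only: the byte stays 0x5C and the
glyph changes.  A minimal sketch, assuming a display that follows JIS X 0201
Roman (yen sign at 0x5C, overline at 0x7E); the table and the helper name
show_jis_roman are hypothetical.

```python
# Sketch: the same 7-bit bytes drawn by a US-ASCII display and by one using
# JIS X 0201 Roman. The helper name is hypothetical.
JIS_ROMAN = str.maketrans({"\\": "\u00a5", "~": "\u203e"})


def show_jis_roman(data: bytes) -> str:
    """Render 7-bit bytes the way a JIS X 0201 Roman display would."""
    return data.decode("ascii").translate(JIS_ROMAN)


path = rb"A:\foo\bar"
print(path.decode("ascii"))   # US-ASCII rendering
print(show_jis_roman(path))   # yen signs where the backslashes were
```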

	And I think that applies to the character sets of other Indo-European
languages too (suppose the British use the sterling sign in place of the
backslash?).  Come to think of it, there's no cent sign in ASCII.  Anyone
know why?


----------------
____  __  __    + Dan The "Punctuated" Man
    ||__||__|   + E-mail:	dankg@ocf.berkeley.edu
____| ______ 	+ Voice:	+1 415-549-6111
|     |__|__|	+ USnail:	1730 Laloma Berkeley, CA 94709 U.S.A
|___  |__|__|	+	
    |____|____	+ "What's the biggest U.S. export to Japan?" 	
  \_|    |      + "Bullshit.  It makes the best fertilizer for their rice"

ralf@b.gp.cs.cmu.edu (Ralf Brown) (05/30/90)

In article <1990May30.045903.14249@agate.berkeley.edu> dankg@ocf.Berkeley.EDU (Dan Kogai) writes:
}a lot of CLIs, I love UNIX the best and always have trouble typing "A:\foo\bar"
}but none of CLI implementation of delimiter will be intuitive enough.
}On Macintosh, thanks to GUI, only ":" is reserved as delimiter and it's
}directory delimiter.  That makes MPW users (Hi, robert!) hard to deal with
}files but for the "rest of them" it's nice to be able to make such file names
}as "Foo killed bar's blech".

total 240
-rw-r--r--  1 ralf            0 May 30 07:29 Foo killed bar's bletch
-rw-r--r--  1 ralf       231682 May 30 07:11 frain13r.zip

You were saying?...  The only printable character you CAN'T put into a Unix
filename is the forward slash.
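
The same point can be checked without a shell.  A minimal sketch, assuming
Python 3 on a POSIX system; create_and_list is a hypothetical helper, and
the file is created in a throwaway temporary directory.

```python
import os
import tempfile

# Sketch: a Unix filename may hold spaces, quotes and most punctuation;
# "/" is always taken as a path separator instead. Assumes a POSIX system.
name = "Foo killed bar's bletch"


def create_and_list(directory: str, filename: str) -> list:
    """Create an empty file called filename and return the directory listing."""
    with open(os.path.join(directory, filename), "w"):
        pass
    return sorted(os.listdir(directory))


with tempfile.TemporaryDirectory() as tmp:
    print(create_and_list(tmp, name))
    # A slash never ends up inside a name: it is read as a separator.
    print(os.path.basename("Foo/bar"))
```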

-- 
{backbone}!cs.cmu.edu!ralf   ARPA: RALF@CS.CMU.EDU   FIDO: Ralf Brown 1:129/46
BITnet: RALF%CS.CMU.EDU@CMUCCVMA   AT&Tnet: (412)268-3053 (school)   FAX: ask
DISCLAIMER? | _How_to_Prove_It_ by Dana Angluin  20. by vehement assertion: It
What's that?|is useful to have some kind of authority relation to the audience

guy@auspex.auspex.com (Guy Harris) (05/31/90)

>	And I think that apply to other Indo-European language character sets 
>also (Suppose British uses starling figure for the place of backslash?)

Some, but not all.  I suspect the ISO 646 character set for the UK may
substitute "pounds sterling" for "dollar sign".  The ISO 646 character
sets are 7-bit character sets; mostly ASCII, but a few character
positions are designated for "national characters".  The US version is
ASCII.

However, if you go for the more state-of-the-art ISO 8859 character
sets, you get to use the 8th bit; all the 8859 character sets are ASCII
in the first 128 positions (8th bit zero), and have additional
characters including accented letters, etc. in the next 128 positions. 
ISO 8859/1, the Western Europe and (North?) American (in the sense of
the American continents, not the US) character set, has both "$" in the
usual ASCII position, as well as "pound sterling".

(There's also ISO 10646, which is a *big* character set under
development that will supposedly give you all the characters in the
world, or at least a big subset including Japanese & Chinese and the
like....)

>Come to think there's no cent figure for ASCII.  Anyone know why?

Not enough demand to cause some other character to be shoved out?  ISO
8859/1 *does* have it, one position before "pound sterling".
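
The positions mentioned above are easy to confirm.  A minimal sketch,
assuming Python 3, whose "latin-1" codec implements ISO 8859/1; latin1_code
is a hypothetical helper name.

```python
# Sketch: where the currency signs sit in ISO 8859/1 (Python's "latin-1").
def latin1_code(ch: str) -> int:
    """Return the single-byte ISO 8859/1 code of a character."""
    return ch.encode("latin-1")[0]


for sign in ("$", "\u00a2", "\u00a3"):  # dollar, cent, pound sterling
    code = latin1_code(sign)
    half = "upper" if code & 0x80 else "ASCII"
    print(f"{sign} -> 0x{code:02X} ({half} half)")
```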

dankg@lightning.Berkeley.EDU (Dan KoGai) (05/31/90)

In article <9460@pt.cs.cmu.edu> ralf@b.gp.cs.cmu.edu (Ralf Brown) writes:
>In article <1990May30.045903.14249@agate.berkeley.edu> dankg@ocf.Berkeley.EDU (Dan Kogai) writes:
>}a lot of CLIs, I love UNIX the best and always have trouble typing "A:\foo\bar"
>}but none of CLI implementation of delimiter will be intuitive enough.
>}On Macintosh, thanks to GUI, only ":" is reserved as delimiter and it's
>}directory delimiter.  That makes MPW users (Hi, robert!) hard to deal with
>}files but for the "rest of them" it's nice to be able to make such file names
>}as "Foo killed bar's blech".
>
>total 240
>-rw-r--r--  1 ralf            0 May 30 07:29 Foo killed bar's bletch
>-rw-r--r--  1 ralf       231682 May 30 07:11 frain13r.zip
>
>You were saying?...  The only printable character you CAN'T put into a Unix
>filename is the forward slash.

	Gee, I knew that:  you can quote the file name in the shell, or just do
fopen("~!@#$%^&*()-_=+[{]}\\|\'\";:?.>,<", "w") in C source or whatever, but
in reality those drive your shell nuts.  As long as Unix has no Finder
or SFDialog to access these files without pain, we'd better stay away from
those nasty punctuation marks...

----------------
____  __  __    + Dan The "~!@#$%^&*()-_=+[{]}\\|\'\";:?.>,<" Man
    ||__||__|   + E-mail:	dankg@ocf.berkeley.edu
____| ______ 	+ Voice:	+1 415-549-6111
|     |__|__|	+ USnail:	1730 Laloma Berkeley, CA 94709 U.S.A
|___  |__|__|	+	
    |____|____	+ "What's the biggest U.S. export to Japan?" 	
  \_|    |      + "Bullshit.  It makes the best fertilizer for their rice"

exspes@gdr.bath.ac.uk (P E Smee) (05/31/90)

In article <1990May30.045903.14249@agate.berkeley.edu> dankg@ocf.Berkeley.EDU (Dan Kogai) writes:
>	And I think that apply to other Indo-European language character sets 
>also (Suppose British uses starling figure for the place of backslash?)  Come
>to think there's no cent figure for ASCII.  Anyone know why?

Hardly important, but on (at least) most British ASCII terminals and
printers which support the 'pounds sterling' currency symbol, it
replaces the hash (pigpen, US number sign).  On some micro packages it
requires an escape sequence, and is > 0177.  Makes C preprocessor stuff
look funny.  I have always wondered why it didn't replace the dollar sign.
-- 
Paul Smee, Computing Service, University of Bristol, Bristol BS8 1UD, UK
 P.Smee@bristol.ac.uk - ..!uunet!ukc!bsmail!p.smee - Tel +44 272 303132

cjc@ulysses.att.com (Chris Calabrese[mav]) (05/31/90)

In article <1990May31.065335.10406@agate.berkeley.edu>, dankg@lightning.Berkeley.EDU (Dan KoGai) writes:
> 
> 	Gee, I knew that:  You can quote file name in shell and just do
> fopen("~!@#$%^&*()-_=+[{]}\\|\'\";:?.>,<", "w") in c source or anything but
> in reality those drive your shell nuts.  As far as Unix has no Finder
> or SFDialog to access these files witout pain, we'd better stay away from
> those nasty punctuation marks...

Not only does the shell not have problems with punctuation marks, but
it even handles non-printable characters without complaining.

Script started on Thu May 31 10:07:17 1990
619 telemachos_/papers/htor) > 'this is a #^$&$&&$Q test'
620 telemachos_/papers/htor) > 'this is a ^G ^C^B^A test'
621 telemachos_/papers/htor) ls this*
this is a #^$&$&&$Q test        this is a ^G ??? test
622 telemachos_/papers/htor) exit

script done on Thu May 31 10:07:33 1990
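
The "???" in the listing is the listing program masking control characters;
the names themselves are stored intact.  A minimal sketch of that masking,
assuming Python 3; mask_nonprintable is a hypothetical helper and not the
actual ls implementation.

```python
# Sketch: hide control characters the way a directory listing might.
# mask_nonprintable is a hypothetical helper, not how ls is implemented.
def mask_nonprintable(name: str) -> str:
    """Replace every non-printable character with '?'."""
    return "".join(ch if ch.isprintable() else "?" for ch in name)


name = "this is a \x07 \x03\x02\x01 test"  # BEL, then ^C ^B ^A
print(mask_nonprintable(name))
```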
Name:			Christopher J. Calabrese
Brain loaned to:	AT&T Bell Laboratories, Murray Hill, NJ
att!ulysses!cjc		cjc@ulysses.att.com
Obligatory Quote:	``Anyone who would tell you that would also try and sell you the Brooklyn Bridge.''

guy@auspex.auspex.com (Guy Harris) (06/01/90)

 >As far as Unix has no Finder or SFDialog to access these files witout
 >pain,

Some UNIXes don't, others do.  Plenty of add-on products of that sort
exist, both for character-based terminals and X11....

jbm@eos.UUCP (Jeffrey Mulligan) (06/01/90)

cjc@ulysses.att.com (Chris Calabrese[mav]) writes:

>In article <1990May31.065335.10406@agate.berkeley.edu>, dankg@lightning.Berkeley.EDU (Dan KoGai) writes:

>> 	Gee, I knew that:  You can quote file name in shell and just do
>> fopen("~!@#$%^&*()-_=+[{]}\\|\'\";:?.>,<", "w") in c source or anything but
>> in reality those drive your shell nuts.

>Not only does the shell not have problems with punctuation marks, but
>it even handles non-printable characters without complaining.

I think he should have said "in reality those will drive many users nuts."
Of course wizards aren't fazed, since they already are nuts :-)



-- 

	Jeff Mulligan (jbm@eos.arc.nasa.gov)
	NASA/Ames Research Ctr., Mail Stop 262-2, Moffet Field CA, 94035
	(415) 604-3745

stripes@eng.umd.edu (Joshua Osborne) (06/01/90)

In article <1990May31.065335.10406@agate.berkeley.edu> dankg@lightning.Berkeley.EDU (Dan KoGai) writes:
>	Gee, I knew that:  You can quote file name in shell and just do
>fopen("~!@#$%^&*()-_=+[{]}\\|\'\";:?.>,<", "w") in c source or anything but
>in reality those drive your shell nuts.  As far as Unix has no Finder
>or SFDialog to access these files witout pain, we'd better stay away from
>those nasty punctuation marks...

Unix has them in the same sense that MS-DOS does.  Have you ever heard of X?
Or run a Motif program?  It's not true that all, or even many, Unix programs
use such dialog boxes, but you can get them.

(Though I doubt I will ever use a Mac-like interface to the file system; it's
too constricting - the Finder, that is.  I know Sun's File Manager doesn't
suit me, and I haven't seen Looking Glass yet, but I'm quite satisfied with
csh & twm, so I'm not about to spend money.)
-- 
           stripes@eng.umd.edu          "Security for Unix is like
      Josh_Osborne@Real_World,The          Mutitasking for MS-DOS"
      "The dyslexic porgramer"                  - Kevin Lockwood
"Don't try to change C into some nice, safe, portable programming language
 with all sharp edges removed, pick another language."  - John Limpert

hwt@.bnr.ca (Henry Troup) (06/01/90)

In article <1990May31.092357.16792@gdr.bath.ac.uk> exspes@gdr.bath.ac.uk (P E Smee) writes:
>In article <1990May30.045903.14249@agate.berkeley.edu> dankg@ocf.Berkeley.EDU (Dan Kogai) writes:
>>also (Suppose British uses sterling figure for the place of backslash?)  Come
 
>Hardly important, but on (at least) most British ASCII terminals and
>printers which support the 'pounds sterling' currency symbol, it
>... Always have wondered why it didn't replace the dollarsign.

IBM specifies the $ as 'national currency symbol' in EBCDIC.  When we first
brought up a node in the U.K. there was great confusion as email across the
Atlantic transparently changed currency symbols.  Fortunately, you can
tweak the character table by esoteric modifications to VM/SP....
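
The effect can be reproduced with one byte and two display tables.  A
minimal sketch, assuming Python 3, whose "cp037" codec is US/Canada EBCDIC.
The UK rendering is an assumption (byte 0x5B drawn as a pound sign, as in
IBM's UK code page), and uk_display is a hypothetical helper, not a real
codec.

```python
# Sketch: one EBCDIC byte, two national renderings.
# cp037 is Python's US/Canada EBCDIC codec. The UK rendering is an
# assumption (0x5B drawn as a pound sign); uk_display is hypothetical.
def uk_display(data: bytes) -> str:
    """Show EBCDIC bytes as a UK table might, swapping only the currency sign."""
    return data.decode("cp037").replace("$", "\u00a3")


wire = "That will cost $1,000,000".encode("cp037")
print(hex(wire[15]))         # the currency byte as sent
print(wire.decode("cp037"))  # US reading
print(uk_display(wire))      # UK reading of the very same bytes
```

Nothing is converted in transit; each end simply draws the same byte with
its own national currency symbol.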

"That will cost $1,000,000" is vastly different from "That will cost #1,000,000"
--
Henry Troup - BNR owns but does not share my opinions
..uunet!bnrgate!hwt%bwdlh490 or  HWT@BNR.CA

cjc@ulysses.att.com (Chris Calabrese[mav]) (06/01/90)

In article <6807@eos.UUCP>, jbm@eos.UUCP (Jeffrey Mulligan) writes:
> cjc@ulysses.att.com (Chris Calabrese[mav]) writes:
> 
> >In article <1990May31.065335.10406@agate.berkeley.edu>, dankg@lightning.Berkeley.EDU (Dan KoGai) writes:
> 
> >> 	Gee, I knew that:  You can quote file name in shell and just do
> >> fopen("~!@#$%^&*()-_=+[{]}\\|\'\";:?.>,<", "w") in c source or anything but
> >> in reality those drive your shell nuts.
> 
> >Not only does the shell not have problems with punctuation marks, but
> >it even handles non-printable characters without complaining.
> 
> I think he should have said "in reality those will drive many users nuts."
> Of course wizards aren't fazed, since they already are nuts :-)

Yes, well, umm...
Actually, there is one interesting thing about files containing funny
characters - xargs barfs on them:

Script started on Fri Jun  1 11:42:15 1990
908 telemachos_/watsop/htor/test) > 'this is a test'
909 telemachos_/watsop/htor/test) echo this* | xargs ls
this not found
is not found
a not found
test not found
910 telemachos_/watsop/htor/test) exit
script done on Fri Jun  1 11:42:26 1990

If you have admin scripts that use xargs (often backup scripts,
user-deletion scripts, etc.), you can wreak havoc on your system like
this.  Depending on the exact situation, you may be able to use this to:
1) make files that crash the backup scripts
2) make files that cannot be deleted
3) make files that crash user-deletion scripts
4) make files that aren't deleted by scripts when they're supposed to be
etc.
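
The failure comes from splitting a stream of names on whitespace.  A minimal
simulation, assuming Python 3; no xargs is actually run, and both helper
names are hypothetical.

```python
# Sketch: why a name with spaces breaks a whitespace-delimited pipeline,
# and how a NUL-delimited one avoids it. Pure simulation; no xargs is run.
names = ["this is a test", "frain13r.zip"]


def split_on_whitespace(stream: str) -> list:
    """Roughly what a whitespace-splitting reader does with unquoted input."""
    return stream.split()


def split_on_nul(stream: str) -> list:
    """What a NUL-delimited reader does: each name survives intact."""
    return [item for item in stream.split("\0") if item]


print(split_on_whitespace(" ".join(names)))
print(split_on_nul("\0".join(names) + "\0"))
```

NUL works as a delimiter because it cannot appear in a filename.  Many
current systems offer find -print0 together with xargs -0 for this purpose;
whether a given system has them is an assumption to check locally.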

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (06/03/90)

In article <1990May31.065335.10406@agate.berkeley.edu>, dankg@lightning.Berkeley.EDU (Dan KoGai) writes:
> 	Gee, I knew that:  You can quote file name in shell and just do
> fopen("~!@#$%^&*()-_=+[{]}\\|\'\";:?.>,<", "w") in C source or anything but
> in reality those drive your shell nuts.  As far as Unix has no Finder
> or SFDialog to access these files without pain, we'd better stay away from
> those nasty punctuation marks...

Try a Sun 386i some time.  Or use the Xerox/Envos lisp environment.  Or use
one of several menu-shells.  And so on.
Remember:  UNIX is not finished.
-- 
"A 7th class of programs, correct in every way, is believed to exist by a
few computer scientists.  However, no example could be found to include here."

ercm20@castle.ed.ac.uk (Sam Wilson) (06/06/90)

In article <1990May30.045903.14249@agate.berkeley.edu> dankg@ocf.Berkeley.EDU (Dan Kogai) writes:
>	And I think that apply to other Indo-European language character sets 
>also (Suppose British uses starling figure for the place of backslash?)

Nope, the standard place to put a pound sterling sign (a curly 'L' with
a '-' or '=' through it) is where the '#' usually is.  On most keyboards
here that's shifted 3.  Some keyboards put it in place of grave '`'
('back-quote'). 

Sam Wilson
Edinburgh University