[comp.lang.perl] While learning PERL... a suggestion

phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) (01/19/91)

While learning PERL I have reached the part about the various file test
operators.  I noted that -B and -T need to read SOME of the file to make
their appropriate determinations.  I thought of another kind of test that
could be useful, especially from pipes where the name might not be there.

Suggestion:

Add a -Z file test operator that returns TRUE if the file appears to be
the output of the UNIX compress command.  Testing this file with -B would
still yield TRUE since a compressed file is a subset of binary files.

I've still yet to get through the rest of PERL (probably this weekend)
so I don't know yet if it is particularly easy to invoke uncompress -c
or zcat to pipe input data back to a PERL script.  When reading from
a file one can reposition back to the beginning of the file and pass
the file descriptor on to zcat as STDIN and read from the pipe instead.
But for something already coming in from a pipe, the suggested -Z test
would have already taken data out of the pipe.

Well I hope I don't have egg all over my face with this suggestion.
We shall see.
-- 

--Phil Howard, KA9WGN-- | Individual CHOICE is fundamental to a free society
<phil@ux1.cso.uiuc.edu> | no matter what the particular issue is all about.

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (01/19/91)

In article <1991Jan19.003519.23569@ux1.cso.uiuc.edu> phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) writes:
: While learning PERL I have reached the part about the various file test
: operators.  I noted that -B and -T need to read SOME of the file to make
: their appropriate determinations.  I thought of another kind of test that
: could be useful, especially from pipes where the name might not be there.
: 
: Suggestion:
: 
: Add a -Z file test operator that returns TRUE if the file appears to be
: the output of the UNIX compress command.  Testing this file with -B would
: still yield TRUE since a compressed file is a subset of binary files.

I don't think Perl should have operators to check magic numbers.  It's
too easy to read the first word yourself now that there's sysread().
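
As a rough sketch of what reading the first word yourself might look
like, assuming a modern Perl 5 and the usual two-byte compress(1) magic
number (0x1f 0x9d).  The helper name looks_compressed is made up for
illustration; it is not part of Perl:

```perl
use strict;
use warnings;

# Hypothetical helper: true if the next two bytes on $fh are the
# compress(1) magic number.  The two bytes are consumed by the read.
sub looks_compressed {
    my ($fh) = @_;
    my $n = sysread($fh, my $magic, 2);
    return 0 unless defined $n && $n == 2;   # error, EOF, or short read
    return $magic eq "\x1f\x9d" ? 1 : 0;
}
```

On a regular file one could then sysseek() back to offset 0 before
handing the descriptor on; on a pipe the bytes are gone, which is the
problem discussed next.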

: I've still yet to get through the rest of PERL (probably this weekend)
: so I don't know yet if it is particularly easy to invoke uncompress -c
: or zcat to pipe input data back to a PERL script.  When reading from
: a file one can reposition back to the beginning of the file and pass
: the file descriptor on to zcat as STDIN and read from the pipe instead.
: But for something already coming in from a pipe, the suggested -Z test
: would have already taken data out of the pipe.

There's no way to do this at all under Unix, let alone Perl, without
interposing a process to supply the magic number you destructively read
out of the pipe.  You can't seek backwards on a pipe, and unless you can
convince compress to accept input without the leading magic number, you're
stuck.

As you pointed out, it wouldn't help to give Perl a -Z, since it would still
have to read the pipe.

There's one other possibility.  If you know the pipe is really a socket,
you might be able to do a recv() with the MSG_PEEK flag and read it out
non-destructively.  Likewise for streams, using I_PEEK.
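
A minimal sketch of the MSG_PEEK idea, assuming a modern Perl 5 with the
Socket module.  The socketpair() only stands in for a real connection,
and peek_bytes is a hypothetical helper name:

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC MSG_PEEK);

# Hypothetical helper: look at up to $len bytes without consuming them.
# This only works when $sock really is a socket, not an ordinary pipe.
sub peek_bytes {
    my ($sock, $len) = @_;
    my $buf = '';
    defined(recv($sock, $buf, $len, MSG_PEEK)) or die "recv: $!";
    return $buf;
}

socketpair(my $reader, my $writer, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";
syswrite($writer, "\x1f\x9dpayload");

my $peeked = peek_bytes($reader, 2);   # non-destructive look
sysread($reader, my $again, 2);        # the same two bytes are still there
```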

: Well I hope I don't have egg all over my face with this suggestion.
: We shall see.

I'm afraid the yolk is on all of us.

Larry

rbj@uunet.UU.NET (Root Boy Jim) (01/19/91)

In article <11115@jpl-devvax.JPL.NASA.GOV> lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) writes:
>There's one other possibility.  If you know the pipe is really a socket,
>you might be able to do a recv() with the MSG_PEEK flag and read it out
>non-destructively.  Likewise for streams, using I_PEEK.

If it weren't so trivial to write a getchar/ungetchar pair, I
might ask Larry to hack in ungetc, possibly with no limit on pushback.

Which brings us to the question of what should go in perl?
An easier question might be "What's missing, compared to C?"

Going thru the syscalls I see trivial omissions, such as `mknod'
or `reboot' that can best be done by invoking "system", without
resorting to using syscall.

However, it is control over signals that I find most lacking.

It is unclear whether signals are reset upon being caught
and whether they are blocked during signal handler execution. I
believe Larry deliberately avoided answering these questions
in order to avoid system dependencies.

Some day, POSIX signal handling will have to be hacked in,
along with finer resolution alarm timers.
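
For what it's worth, a sketch of how explicit POSIX signal handling can
look, assuming a modern Perl 5 whose POSIX module provides sigaction().
The choice of SIGUSR1, the SIGTERM mask and the counter are illustrative
only.  With sigaction the handler stays installed after it fires, and
the mask says exactly what is blocked while it runs:

```perl
use strict;
use warnings;
use POSIX qw(:signal_h);

my $caught = 0;

# Block SIGTERM while the SIGUSR1 handler runs, and ask for interrupted
# system calls to be restarted.
my $mask   = POSIX::SigSet->new(SIGTERM);
my $action = POSIX::SigAction->new(sub { $caught++ }, $mask, SA_RESTART);
sigaction(SIGUSR1, $action) or die "sigaction: $!";
```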

I suppose setjmp/longjmp is out of the question?
Catch/throw/unwind protect?

Grep is mapcar, am I right?

>I'm afraid the yolk is on all of us.

Yeah, but can a blue man sing the whites?

>Larry

-- 

	Root Boy Jim Cottrell <rbj@uunet.uu.net>
	Close the gap of the dark year in between

ronald@robobar.co.uk (Ronald S H Khoo) (01/20/91)

rbj@uunet.UU.NET (Root Boy Jim) writes:

> An easier question might be "What's missing, compared to C?"

Seems like a reasonable question to me :-)

> Going thru the syscalls I see trivial omissions, such as `mknod'
> or `reboot' that can best be done by invoking "system", without
> resorting to using syscall.

One function that I need, which *can't* be gotten through either syscall
or system is sbrk(0) -- especially necessary if you're running on a
System V based PC Unix where the performance hits the floor as soon as
you run out of real core, so it is (sigh) necessary to tune one's
programs not to.  [ I guess lousy VM is better than no VM at
all, though. ]

So guess what's the current only member of my usersub.c :-)

> It is unclear whether signals are reset upon being caught
> and whether they are blocked during signal handler execution. I
> believe Larry deliberately avoided answering these questions
> in order to avoid system dependencies.

It's the usual trade between portability of the perl script vs. more
accurate access to the underlying OS I suppose ?  Both aspects
of perl are important.  How *do* you determine which road to follow where
they conflict ?

> along with finer resolution alarm timers.

If this is done, I hope that someone will do emulation modules for
systems without it.  I guess that I rate perl script portability slightly
more highly.  It would really get on my wick if I had to mod a perl
script just because someone gratuitously used [gs]etitimer in a program
that didn't actually need it...  An emulation for System V that didn't
actually give you the finer resolution would save the day there.
Obviously it doesn't help the apps that actually need the resolution,
but there's nothing you can do about that (is there ?)

[ PS Actually, I need that [gs]etitimer emulator -- does anyone have one ?
     I can't write it cos I ain't got BSD manuals :-( ]
-- 
Ronald Khoo <ronald@robobar.co.uk> +44 81 991 1142 (O) +44 71 229 7741 (H)

phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) (01/21/91)

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) writes:

>There's no way to do this at all under Unix, let alone Perl, without
>interposing a process to supply the magic number you destructively read
>out of the pipe.  You can't seek backwards on a pipe, and unless you can
>convince compress to accept input without the leading magic number, you're
>stuck.

>As you pointed out, it wouldn't help to give Perl a -Z, since it would still
>have to read the pipe.

So I guess what one must do then is just read the file and look at the data
and check for the magic number.  Then if it looks compressed and you want to
uncompress it, fork two more processes:

1.  writes onto a socketpair, prefixing the data originally read and
    copying all the rest, into process 2;
2.  execs uncompress, writing out to yet another socketpair into process 3;
3.  performs the originally intended processing.

Perhaps the order of forking can be done to keep #3 running in the same
pid as originally, with #1 above inheriting the original STDIN or other
file being read.

I would just like to be able to do it without so many processes.
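
A sketch of that arrangement, assuming a modern Perl 5 on Unix and using
pipes rather than socketpairs.  reopen_through is a hypothetical name.
The original process keeps its pid and does the real work; a forked
feeder replays the bytes already consumed and copies the rest into the
filter command:

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: returns a handle from which the caller reads the
# output of @cmd, fed with $already_read followed by the rest of $in.
sub reopen_through {
    my ($in, $already_read, @cmd) = @_;
    my $pid = open(my $out, '-|');       # fork; child's STDOUT goes to $out
    defined $pid or die "fork: $!";
    if ($pid == 0) {
        # Feeder process: the filter inherits our STDOUT (the pipe back
        # to the parent) and reads what we print to it.
        open(my $to_filter, '|-', @cmd) or die "cannot run @cmd: $!";
        print {$to_filter} $already_read;
        while (sysread($in, my $chunk, 8192)) {
            print {$to_filter} $chunk;
        }
        close($to_filter);               # flushes and waits for the filter
        POSIX::_exit(0);
    }
    return $out;
}
```

Something like reopen_through(\*STDIN, $magic, 'uncompress', '-c') would
be the intended call, assuming uncompress is installed.  It still costs
two extra processes, which is exactly the complaint.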

phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) (01/21/91)

rbj@uunet.UU.NET (Root Boy Jim) writes:

>>There's one other possibility.  If you know the pipe is really a socket,
>>you might be able to do a recv() with the MSG_PEEK flag and read it out
>>non-destructively.  Likewise for streams, using I_PEEK.

>If it weren't so trivial to write a getchar/ungetchar pair, I
>might ask Larry to hack in ungetc, possibly with no limit on pushback.

It would have to push back into the unix file descriptor so that a
forked process like uncompress could read it.
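
To illustrate the distinction, a small sketch assuming a modern Perl 5
with IO::Handle, whose ungetc() method takes a character's ordinal.  The
pushed-back character lands in Perl's own buffer for that handle, so
only buffered reads in the same process see it again; nothing goes back
into the descriptor itself:

```perl
use strict;
use warnings;
use IO::Handle;

pipe(my $reader, my $writer) or die "pipe: $!";
print {$writer} "compress\n";
close($writer);

my $first = $reader->getc;        # destructively read one character
$reader->ungetc(ord $first);      # push it back into Perl's buffer
my $line = <$reader>;             # buffered read sees the whole line
```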

>Which brings us to the question of what should go in perl?
>An easier question might be "What's missing, compared to C?"

I don't see why perl has to have EVERYTHING that is in C.

>However, it is control over signals that I find most lacking.

>It is unclear whether signals are reset upon being caught
>and whether they are blocked during signal handler execution. I
>believe Larry deliberately avoided answering these questions
>in order to avoid system dependencies.

Though I am just at the learning stage of this, one of the things
I would like to do in perl is to do totally unblocked i/o using
select().

An example of a trivially described program that could demo what I
need is a program that takes a number of hostnames as arguments,
looks up the IP addresses, each in parallel, opens a socket to connect
to each given host's telnet port, and reports which host said "login:"
first.  Delays in ANY step along the way for ANY host should have NO
effect on any others.  None of the steps should let the process block.
Juggling this in C is tough enough.  Maybe not so trivial.
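
A sketch of just the multiplexing step, assuming a modern Perl 5 with
IO::Select (a wrapper around the four-argument select()).  The name
first_ready is hypothetical, and the parallel name lookups and
non-blocking connects are left out entirely:

```perl
use strict;
use warnings;
use IO::Select;

# Hypothetical helper: wait up to $timeout seconds and return one handle
# that has data to read, or undef if none became ready in time.
sub first_ready {
    my ($timeout, @handles) = @_;
    my $sel   = IO::Select->new(@handles);
    my @ready = $sel->can_read($timeout);
    return $ready[0];
}
```

A slow host then only delays its own handle, since nothing blocks on any
single descriptor.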

BTW, something I cannot see yet how to do, though I have not quite yet
gotten time to experiment and try things, is how to keep and manage an
array of filehandles.  The man page does not show any examples or give
any idea, so I will guess and try some things such as putting a subscript
on the filehandle itself (kinda doubt it) and just assigning the string
that identifies the filehandle into a scalar variable and use that scalar
wherever the filehandle is needed (appears to introduce ambiguities in
the syntax).

phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) (01/22/91)

merlyn@iwarp.intel.com (Randal L. Schwartz) writes:

>Arrays of filehandles are not directly supported.  You can use arrays
>of values that are to be assigned into indirect filehandles, like so:

>	$indir = $filehandle[17];
>	print $indir "This goes to the 17th filehandle";

>but you cannot directly do the dereferencing of the array and the I/O
>in the same expression.

My only comment is that this must be a very complex syntax to parse.
The mere existence of space between the handle and the string above
would separate things.  I wonder if there are any cases where the
data to be printed could be an expression with a unary prefix operator
that could cause an apparent ambiguity as to look like a binary infix
operator between the filehandle and the data, thus appearing to be an
expression to be printed on STDOUT.
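
For comparison, a sketch assuming a modern Perl 5, where filehandles can
be stored in an ordinary array and a block around the handle expression
removes any doubt about where the handle ends and the data begins.  The
in-memory buffers are only there to keep the example self-contained:

```perl
use strict;
use warnings;

my @buffers = ('', '', '');
my @filehandle;
for my $i (0 .. $#buffers) {
    open($filehandle[$i], '>', \$buffers[$i]) or die "open: $!";
}

# Block form: the handle expression is delimited by the braces.
print { $filehandle[1] } "This goes to filehandle 1\n";

# Indirect form via a scalar, as in the quoted example.
my $indir = $filehandle[2];
print $indir "This goes to filehandle 2\n";

close($_) for @filehandle;
```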

>You could also futz with the namespace, like:

>	*HANDLE = $filehandle[17];
>	print HANDLE "this goes to the 17th filehandle";

>but remember that this affects &HANDLE, %HANDLE, and @HANDLE as well.
>(I understand that this is more efficient than the indirect filehandle
>if you do more than one I/O per filehandle switch.)

In what way does it affect them?  I was getting the impression from the
man pages that these name spaces were separate.  This was discussed in
the man pages with regard to passing by references, but I guess you are
now filling in the ambiguities of its effect elsewhere.  I'm not up on
subroutines yet (probably mostly due to not understanding how the name
space works).

>By the way, in this syntax, the filehandle never hits the Perl
>tokenizer, so you are not limited to a standard "name" for your
>filehandle.  You can use "\007smurf city u.s.a. 00001" as your
>filehandle name, for example.  My favorite is something like "0000001"
>in a variable that is being automagically incremented, so I can
>generate them on demand and know that they won't collide with other
>symbol names in my namespace.

rbj@uunet.UU.NET (Root Boy Jim) (01/22/91)

In article <1991Jan21.092204.11944@ux1.cso.uiuc.edu> phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) writes:
>I don't see why perl has to have EVERYTHING that is in C.

Simply because I'm tired of writing in C. Too low level.

	grep(print("why use $_ when you can use perl?\n"),
		("sed","awk","sh","C"));

Another thing perl needs is the ability to create lock files. Sysopen?


rbj@uunet.UU.NET (Root Boy Jim) (01/22/91)

In article <1991Jan21.214453.25039@convex.com> tchrist@convex.COM (Tom Christiansen) writes:
>From the keyboard of rbj@uunet.UU.NET (Root Boy Jim):
>:Another thing perl needs is the ability to create lock files. Sysopen?
>
>flock() is there, as is lockf() disguised as fcntl().

Pardon me. I didn't make myself clear. I need to create lock files,
not lock files already created. I meant open(path,O_CREAT|O_EXCL,0666);
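
A sketch of what is being asked for, assuming a modern Perl 5, which has
a sysopen() builtin and the Fcntl constants.  create_lock is a
hypothetical name, and writing the pid into the file is just one common
convention:

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);

# Hypothetical helper: create $path only if it does not already exist.
# Returns 1 if we created it (and so hold the lock), false otherwise.
sub create_lock {
    my ($path) = @_;
    sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_EXCL, 0666) or return;
    print {$fh} "$$\n";
    close($fh);
    return 1;
}
```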

allbery@NCoast.ORG (Brandon S. Allbery KB8JRR) (01/22/91)

As quoted from <1991Jan21.201845.28080@ux1.cso.uiuc.edu> by phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN):
+---------------
| My only comment is that this must be a very complex syntax to parse.
| The mere existence of space between the handle and the string above
| would separate things.  I wonder if there are any cases where the
| data to be printed could be an expression with a unary prefix operator
| that could cause an apparent ambiguity as to look like a binary infix
| operator between the filehandle and the data, thus appearing to be an
| expression to be printed on STDOUT.
+---------------

I don't know what Perl does in that case (Perl isn't installed on ncoast), but
there is a disambiguator:  the full syntax of "print" is:

	print FILEHANDLE [(] arguments [)];

Just enclose the arguments to be printed in parentheses.  The ambiguity is no
different from the usual Perl irregularity, where operators like "print",
"chmod", and just about everything else take a following left parenthesis as
an argument indicator a' la C functions.  You just have to be careful.

+---------------
| In what way does it affect them?  I was getting the impression from the
| man pages that these name spaces were separate.  This was discussed in
| the man pages with regard to passing by references, but I guess you are
| now filling in the ambiguities of its effect elsewhere.  I'm not up on
| subroutines yet (probably mostly due to not understanding how the name
| space works).
+---------------

The problem is that the * syntax doesn't support specifying *which* namespace
to alter:  does "*x" refer to a filehandle, a scalar, an array, or an
associative array?  And you can't say "*$a" to specify which (not that it
would help anyway, since filehandles don't have indicators and you would need
to retain the current behavior of "*x" for compatibility).

This can be viewed as a drawback of Perl, or of * indirection.

++Brandon
-- 
Me: Brandon S. Allbery			    VHF/UHF: KB8JRR on 220, 2m, 440
Internet: allbery@NCoast.ORG		    Packet: KB8JRR @ WA8BXN
America OnLine: KB8JRR			    AMPR: KB8JRR.AmPR.ORG [44.70.4.88]
uunet!usenet.ins.cwru.edu!ncoast!allbery    Delphi: ALLBERY

tchrist@convex.COM (Tom Christiansen) (01/22/91)

From the keyboard of rbj@uunet.UU.NET (Root Boy Jim):
:>:Another thing perl needs is the ability to create lock files. Sysopen?
:>
:>flock() is there, as is lockf() disguised as fcntl().
:
:Pardon me. I didn't make myself clear. I need to create lock files,
:not lock files already created. I meant open(path,O_CREAT|O_EXCL,0666);

With NFS, all bets are off: the same file may be created and deleted
more than once, with success returned to both.  It's horrid, I know, but
it's there.  Too many of us use NFS to rely on this.

Anyway aren't you "supposed" to use link, not creat, for creating
lock files of this nature?

I'm not saying we really don't need a sysopen -- some other flags
(like NDELAY) might be nice.  Just that for locking this isn't really
great.  If you have syscall(), you could probably put something
together without code changes.

--tom
--
"Hey, did you hear Stallman has replaced /vmunix with /vmunix.el?  Now
 he can finally have the whole O/S built-in to his editor like he
 always wanted!" --me (Tom Christiansen <tchrist@convex.com>)

ronald@robobar.co.uk (Ronald S H Khoo) (01/22/91)

rbj@uunet.UU.NET (Root Boy Jim) writes:

> Another thing perl needs is the ability to create lock files. Sysopen?

Eh ?  You're not advocating the use of O_CREAT|O_EXCL for lockfiles are
you ?  That's not portable.  It doesn't work on my machine for one :-)
Why not do lockfiles like C News does them ?

[ Question: did I get the $SIG{} = stuff right ? it's a little hard to test ..]

# =()<$NEWSCTL = "@<NEWSCTL>@";>()=
$NEWSCTL = "/usr/lib/news";

$ltmp	= "$NEWSCTL/LTMP.$$";
$lock	= "$NEWSCTL/LOCK";

sub Handler	{  unlink($lock, $ltmp) if $locked; exit 1;   }
sub Lock {
	open (LTMP, ">$ltmp") || die "cannot write to $ltmp: $!\n";
	print LTMP "$$\n"; close(LTMP);
	until ($locked) {
		if (link($ltmp, $lock)) {
			$locked = 1;
		} else {
			sleep 30;
		}
	}
	$SIG{'HUP'} = $SIG{'INT'} = $SIG{'TERM'} = 'Handler';
}

sub UnLock {
	unlink ($lock, $ltmp) if $locked;
	$locked = 0;
}

urlichs@smurf.sub.org (Matthias Urlichs) (01/22/91)

In comp.lang.perl, article <119430@uunet.UU.NET>,
  rbj@uunet.UU.NET (Root Boy Jim) writes:
< 
< Another thing perl needs is the ability to create lock files. Sysopen?
< 

## code fragment from my UUCP job file mangler
# usage: &start;
# if (&lock("RESOURCE")) {
#    ...
#    &unlock("RESOURCE");
# } ...
# &end;
#
# lock files are assumed to contain the process ID as a binary integer.
# Modification for ASCII PIDs is trivial.
# Dead lock files are automagically deleted -- race conditions are possible
#   here!
#
# If you want to convert this to a package, go ahead.

$basedir = '/usr/local/uucp/';
$lockdir = $basedir . 'lock/';

sub readlock {
   local($lock) = @_;
   local($pid) = 0;
   local($spid);
   if (open (LF, $lock)) {
      if (read (LF, $spid, 4) == 4) {
         ($pid) = unpack ("L", $spid);
      }
      close(LF);
   }
   $pid;
}

sub exists {
  local($proc) = @_;
  if ($proc) {
     if (kill (0,$proc)) {
	1;
     } else {
	if($! == 3) { # ESRCH -- process does not exist
	   0;
	} else {
	   1;
	}
     }
  } else {
     0;
  }
}

sub lock {
   local($lock) = @_;
   $lock = $lockdir . 'LCK..' . $lock;
   if (link ($locktemp, $lock)) {
      1;
   } else {
      local($pid) = &readlock ($lock);
      if (&exists($pid)) {
	 0;
      } else {
	 unlink($lock);
	 sleep 2;
	 link($locktemp,$lock);
      }
   }
}

sub unlock {
   local($lock) = @_;
   $lock = $lockdir . 'LCK..' . $lock;
   unlink ($lock);
}


sub start {
   $locktemp = $lockdir . 'PID..' . $$;
   open(LOCK, '>'.$locktemp) || die 'open ' . $locktemp;
   print LOCK pack('L', $$);   # 'L' to match the unpack("L", ...) in readlock
   close(LOCK);
}

sub end {
   &unlock($curlock);
   unlink($locktemp);
}


-- 
Matthias Urlichs -- urlichs@smurf.sub.org -- urlichs@smurf.ira.uka.de     /(o\
Humboldtstrasse 7 - 7500 Karlsruhe 1 - FRG -- +49+721+621127(0700-2330)   \o)/

tchrist@convex.COM (Tom Christiansen) (01/22/91)

It occurs to me that rather than adding sysopen(), merely overloading the
existing open() would be better.  Would you want just flags or mode as
well?  And I bet you still really want an underlying open(2) followed by
an fdopen(3), right?

--tom

[I'm just getting these all in because Larry and Randal don't have
 terminals in their hotel rooms as I do. :-]

allbery@NCoast.ORG (Brandon S. Allbery KB8JRR) (01/23/91)

As quoted from <1991Jan22.054732.14797@convex.com> by tchrist@convex.COM (Tom Christiansen):
+---------------
| With NFS, all bets are off: the same file may be created and deleted
| more than once, with success returned to both.  It's horrid, I know, but
| it's there.  Too many of us use NFS to rely on this.
| 
| Anyway aren't you "supposed" to use link, not creat, for creating
| lock files of this nature?
+---------------

The System III/V manpage for open(2) strongly implies (if not actually says
--- I don't know if the Xenix manpage on here reflects the real System V one
or not) that open(..., O_CREAT|O_EXCL, ...) is the "proper" way to create lock
files.  (This despite certain problems like root being able to create it
anyway.)  Nevertheless, since it's documented as being a way of doing it,
there are programs (not to mention programmers, although I'm not one of them)
that do this.

Also, RFS appears to get this right --- and we have more systems using RFS
than NFS.

++Brandon

rbj@uunet.UU.NET (Root Boy Jim) (01/23/91)

In article <1991Jan22.054732.14797@convex.com> tchrist@convex.COM (Tom Christiansen) writes:
>From the keyboard of rbj@uunet.UU.NET (Root Boy Jim):
>:>:Another thing perl needs is the ability to create lock files. Sysopen?
>
>With NFS, all bets are off: the same file may be created and deleted
>more than once, with success returned to both.  It's horrid, I know, but
>it's there.  Too many of us use NFS to rely on this.

NFS? Did *I* say NFS? I think not!
What I am trying to do is steal one of UUCP's dialout modems,
following the same protocol that tip/cu use.

Perhaps you are correct, that it won't work across NFS.
The solution is not to attempt it then, but only to use it
on local filesystems. We don't rip all that other stuff
out of the kernel because of NFS's braindamage.

>Anyway aren't you "supposed" to use link, not creat, for creating
>lock files of this nature?

Not, I believe, since the "new" flags were added to open.

>I'm not saying we really don't need a sysopen -- some other flags
>(like NDELAY) might be nice.  Just that for locking this isn't really
>great.  If you have syscall(), you could probably put something
>together without code changes.

Yes, I'll do just that.

les@chinet.chi.il.us (Leslie Mikesell) (01/24/91)

In article <119542@uunet.UU.NET> rbj@uunet.UU.NET (Root Boy Jim) writes:

>What I am trying to do is steal one of UUCP's dialout modems,
>following the same protocol that tip/cu use.

>>Anyway aren't you "supposed" to use link, not creat, for creating
>>lock files of this nature?

>Not, I believe, since the "new" flags were added to open.

Actually you still need to make the file under a temp name and link it
because the contents of the lockfile are used by cu/uucp, and there
is no other way to make the contents of the file consistent at the
time the name appears.  It is still impossible to use the contents
for anything that relies on the contents remaining the same after
looking at it, but that's not perl's problem.

Les Mikesell
  les@chinet.chi.il.us

maf@thor (Martin Foord) (01/24/91)

In article <1991Jan22.080310.5582@robobar.co.uk> ronald@robobar.co.uk (Ronald S H Khoo) writes:
>Why not do lockfiles like C News does them ?

Hmmm, cops is now being written in perl, any perl gurus out there ever thought
about writing Cnews in perl ? Isn't this the sort of application that begs
to be written in perl?


-- 
Martin Foord.				MHSnet :  maf@dbsm.oz.au
Midland Montagu Australia Limited.	INTERNET GATEWAY:
Dominguez Barry Samuel Montagu.		maf%dbsm.oz.au@munnari.oz.au
					Phone: (02) 258-2724

merlyn@iwarp.intel.com (Randal L. Schwartz) (01/24/91)

In article <1991Jan21.201845.28080@ux1.cso.uiuc.edu>, phil@ux1 (Phil Howard KA9WGN) writes:
| >You could also futz with the namespace, like:
| 
| >	*HANDLE = $filehandle[17];
| >	print HANDLE "this goes to the 17th filehandle";
| 
| >but remember that this affects &HANDLE, %HANDLE, and @HANDLE as well.
| >(I understand that this is more efficient than the indirect filehandle
| >if you do more than one I/O per filehandle switch.)
| 
| In what way does it affect them?  I was getting the impression from the
| man pages that these name spaces were separate.  This was discussed in
| the man pages with regard to passing by references, but I guess you are
| now filling in the ambiguities of its effect elsewhere.  I'm not up on
| subroutines yet (probably mostly due to not understanding how the name
| space works).

The name spaces are separate in the ordinary sense, but the * operator
is not an ordinary operator. :-)

I hope Larry will forgive me for the following.  It's probably close
enough for jazz, but Larry will cringe at the description.

Think of *FOO as the internal representation of the symbol "FOO" in
the program, in *whatever* form it is used: &FOO, $FOO, %FOO, @FOO,
and FOO as a filehandle.  When you say *BAR = *FOO, you are really
saying that BAR can be used anywhere that FOO can be used with
identical results.  It doesn't matter if there's an intermediate step:
you could say $a = *FOO then *BAR = $a.

So assigning something to *BAR _must_ affect all objects with a BAR as
their name.  It's dangerous... it's not just for filehandles.

Now, the one little trick that I didn't describe so far is what I
used in my previous post.  Namely: that *BAR = *FOO can also be
accomplished with *BAR = 'FOO', because Perl will look up FOO at
runtime to figure out what it really is.  Now, since the "name" FOO is
quoted, you could really stick anything there, like *BAR = '42*xx',
and now the name BAR is a synonym with another name 42*xx, even though
that sort of name would never make it past the parser.
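
A small sketch of the aliasing described above, assuming a modern Perl 5
(hence the "our" declarations under strict).  It shows only the
*BAR = *FOO form, not the quoted-name trick:

```perl
use strict;
use warnings;

our ($FOO, @FOO, %FOO);
our ($BAR, @BAR, %BAR);

$FOO = 'scalar';
@FOO = (1, 2, 3);
%FOO = (key => 'value');
sub FOO { return 'sub' }

*BAR = *FOO;            # every kind of BAR now aliases the matching FOO

push @BAR, 4;           # changes @FOO as well
$BAR{other} = 'x';      # changes %FOO as well
my $via_bar = BAR();    # calls the subroutine defined as FOO
```

After this, $BAR, @BAR, %BAR and &BAR are all the FOO things under
another name, which is why the assignment is more than a filehandle
trick.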

There.  Clear as mud? :-)

By the way, I just got home from Uniforum.  It was great.  300 books
sold and signed within 2 hours.  My hand is tired. :-)

print do {{ redo unless "Just another Perl hacker,"; }} # huh? :-)
-- 
/=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ==========\
| on contract to Intel's iWarp project, Beaverton, Oregon, USA, Sol III      |
| merlyn@iwarp.intel.com ...!any-MX-mailer-like-uunet!iwarp.intel.com!merlyn |
\=Cute Quote: "Intel: putting the 'backward' in 'backward compatible'..."====/

merlyn@iwarp.intel.com (Randal L. Schwartz) (01/24/91)

In article <1991Jan23.230154.27272@dbsm.oz.au>, maf@thor (Martin Foord) writes:
| Hmmm, cops is now being written in perl, any perl gurus out there ever thought
| about writing Cnews in perl ? Isn't this the sort of application that begs
| to be written in perl?

I nearly died laughing when I read this one.  Larry and I had lunch
with the father of Cnews (Henry Spencer) today.  He has said some less
than glorious things about Perl (he had a paper/talk at Usenix about
how to use awk(!) as a systems programming language).  This'd just
take the cake.

But you know, I bet it could be done pretty easily.

@x{0..24} = split(//,"rJeslthnco euhPk,aert ar "); print values %x;

allbery@NCoast.ORG (Brandon S. Allbery KB8JRR) (01/25/91)

As quoted from <1991Jan23.230154.27272@dbsm.oz.au> by maf@thor (Martin Foord):
+---------------
| about writing Cnews in perl ? Isn't this the sort of application that begs
| to be written in perl?
+---------------

Henry Spencer would have a fit.  [ ;-) ]

++Brandon

vsh@etnibsd.UUCP (Steve Harris) (01/25/91)

In article <1991Jan24.100539.20366@iwarp.intel.com>, merlyn@iwarp.intel.com (Randal L. Schwartz) writes:
> 
> I hope Larry will forgive me for the following.  It's probably close
> enough for jazz, but Larry will cringe at the description.
> 
> Think of *FOO as the internal representation of the symbol "FOO" in
> the program, in *whatever* form it is used: &FOO, $FOO, %FOO, @FOO,
> and FOO as a filehandle.  When you say *BAR = *FOO, you are really
> saying that BAR can be used anywhere that FOO can be used with
> identical results.  It doesn't matter if there's an intermediate step:
> you could say $a = *FOO then *BAR = $a.
> 
> So assigning something to *BAR _must_ affect all objects with a BAR as
> their name.  It's dangerous... it's not just for filehandles.

Okay, is this paraphrase correct?

Perl keeps a symbol table containing all the symbols (names) in your
perl program.  However, each entry in the symbol table **MAY** refer to
a scalar, an array, a filehandle, a subroutine, etc., depending on
context.

Presumably, the symbol table entry for "FOO" has a pointer to a struct
containing (pointers to) all the possible instances of the symbol.

When you say *BAR = *FOO, you're creating a symbol table entry "BAR"
which points to the same struct as the symbol table entry "FOO".  So
that a reference to $BAR is exactly the same as a reference to $FOO,
ditto for @FOO and @BAR, etc.


	FOO ----+	    ->	scalar
		 \	    ->	array
		  -> struct ->	associative array
		 /	    ->	subroutine
	BAR ----+	    ->	filehandle
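
One way to poke at this picture, assuming a modern Perl 5, where
references and the *FOO{ARRAY} slot syntax exist: after the glob
assignment, references taken through either name compare equal, which is
consistent with both names leading to the same underlying things.  The
variable names below are illustrative:

```perl
use strict;
use warnings;

our (@FOO, %FOO, @BAR, %BAR);
@FOO = (1, 2, 3);
%FOO = (a => 1);

*BAR = *FOO;

my $array_slot = *FOO{ARRAY};          # reference to FOO's array slot
my $same_array = (\@BAR == \@FOO);     # same array behind both names
my $same_hash  = (\%BAR == \%FOO);     # same hash behind both names
my $same_slot  = ($array_slot == \@BAR);
```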
-- 
Steve Harris - Eaton Corp. - Beverly, MA - uunet!etnibsd!vsh

rbj@uunet.UU.NET (Root Boy Jim) (01/25/91)

In article <1991Jan24.100539.20366@iwarp.intel.com> merlyn@iwarp.intel.com (Randal L. Schwartz) writes:
>The name spaces are separate in the ordinary sense, but the * operator
>is not an ordinary operator. :-)
>
>I hope Larry will forgive me for the following.  It's probably close
>enough for jazz, but Larry will cringe at the description.

I hope not. 

>Think of *FOO as the internal representation of the symbol "FOO" in
>the program, in *whatever* form it is used: &FOO, $FOO, %FOO, @FOO,
>and FOO as a filehandle.  When you say *BAR = *FOO, you are really
>saying that BAR can be used anywhere that FOO can be used with
>identical results.  It doesn't matter if there's an intermediate step:
>you could say $a = *FOO then *BAR = $a.

I'm jumping in here because of a phone conversation I had with merlyn
before he left. In LISP, a symbol is an object. It has four "cells":
a value, a function, a property list, and a print name. Think of a
PERL symbol as having many cells also: scalar, array, assoc, label,
a subroutine (Larry, you used a dirty word!), & file handle.

A similar concept exists in C++, the reference operator.

So *FOO really *IS* the pointer to the symbol itself. It's a pity
that Larry decided to swap the meaning of & and *. Because then
he wouldn't have had to include these in "Here is what C has that
perl doesn't". It is probably too late to right this wrong.

>By the way, I just got home from Uniforum...My hand is tired. :-)

Maybe you should have taken your woman with you :-)

> It was great.  300 books sold and signed within 2 hours.  

Oh yeah.

>print do {{ redo unless "Just another Perl hacker,"; }} # huh? :-)
>-- 
>/=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ==========\
>| on contract to Intel's iWarp project, Beaverton, Oregon, USA, Sol III      |
>| merlyn@iwarp.intel.com ...!any-MX-mailer-like-uunet!iwarp.intel.com!merlyn |
>\=Cute Quote: "Intel: putting the 'backward' in 'backward compatible'..."====/
-- 

	Root Boy Jim Cottrell <rbj@uunet.uu.net>
	Close the gap of the dark year in between

rbj@uunet.UU.NET (Root Boy Jim) (01/25/91)

In article <1991Jan23.230154.27272@dbsm.oz.au> maf@thor (Martin Foord) writes:
>Any perl gurus out there ever thought about writing Cnews in perl ?

One of the original reasons for writing Cnews in the first place was speed.
Henry and Geoff spent quite a bit of effort reducing the number of
system calls. Read the paper "News Need Not Be Slow" in the distribution.

>Isn't this the sort of application that begs to be written in perl?

No, it begs to be PROTOTYPED in perl.
Originally, news was a bunch of shell scripts.
-- 

	Root Boy Jim Cottrell <rbj@uunet.uu.net>
	Close the gap of the dark year in between

tchrist@convex.COM (Tom Christiansen) (01/26/91)

From the keyboard of vsh@etnibsd.UUCP (Steve Harris):
:When you say *BAR = *FOO, you're creating a symbol table entry "BAR"
:which points to the same struct as the symbol table entry "FOO".  So
:that a reference to $BAR is exactly the same as a reference to $FOO,
:ditto for @FOO and @BAR, etc.
:
:
:	FOO ----+	    ->	scalar
:		 \	    ->	array
:		  -> struct ->	associative array
:		 /	    ->	subroutine
:	BAR ----+	    ->	filehandle

There are also formats.

--tom
--
"Hey, did you hear Stallman has replaced /vmunix with /vmunix.el?  Now
 he can finally have the whole O/S built-in to his editor like he
 always wanted!" --me (Tom Christiansen <tchrist@convex.com>)

rkrebs@archie.dsd.es.com (Randall Krebs) (02/06/91)

In article <1991Jan24.174047.2897@NCoast.ORG> allbery@ncoast.ORG (Brandon S. Allbery KB8JRR) writes:
>As quoted from <1991Jan23.230154.27272@dbsm.oz.au> by maf@thor (Martin Foord):
>+---------------
>| about writing Cnews in perl ? Isn't this the sort of application that begs
>| to be written in perl?
>+---------------
>
>Henry Spencer would have a fit.  [ ;-) ]

Then this should give him a seizure.

I just completed massaging our C news system to gateway local mailing
lists to local newsgroups.  During the entire activity, I was thinking
how much more elegant this process would be IF ONLY it had all been
written in perl.

I was forced to modify the message injection utilities to strip lines
out of the message headers.  The inject/defhdrs.awk and inject/anne.jones
fairly cry out for a perl implementation.  I haven't reimplemented
these in perl yet, but I'm only about two bugs away from throwing
the whole sh implementation into the bit bucket.

(This isn't supposed to make any sense to anyone who doesn't maintain
C news on a daily basis.)

randall.
-- 
   Randall S. Krebs                        | "Let it never be said that we
   (esunix!rkrebs@cs.utah.edu)             |  didn't do the very least we
   Evans & Sutherland Computer Corporation |  could."
   Salt Lake City, Utah (Where?)           |               - Arthur Unnoon

allbery@NCoast.ORG (Brandon S. Allbery KB8JRR) (02/09/91)

As quoted from <1991Feb2.150543.569@haapi.uci.com> by clay@haapi.uci.com (Clayton Haapala):
+---------------
| In article <1991Jan24.174047.2897@NCoast.ORG> allbery@ncoast.ORG (Brandon S. Allbery KB8JRR) writes:
| >As quoted from <1991Jan23.230154.27272@dbsm.oz.au> by maf@thor (Martin Foord):
| >+---------------
| >| about writing Cnews in perl ? Isn't this the sort of application that begs
| >+---------------
| >
| >Henry Spencer would have a fit.  [ ;-) ]
| 
| I have run the translator for some of the awk scripts, like histdups,
| but it didn't seem to run any faster for that particular application.
+---------------

a2p works, but even Larry admits (I think the manpage mentions this) that the
generated code is far from optimal.  I suspect that rewriting it manually in
Perl, using Perl's idea of how a program should work instead of awk's, would
show a large speed improvement.

++Brandon
-- 
Me: Brandon S. Allbery			    VHF/UHF: KB8JRR on 220, 2m, 440
Internet: allbery@NCoast.ORG		    Packet: KB8JRR @ WA8BXN
America OnLine: KB8JRR			    AMPR: KB8JRR.AmPR.ORG [44.70.4.88]
uunet!usenet.ins.cwru.edu!ncoast!allbery    Delphi: ALLBERY