[comp.sources.bugs] Perl Bugs and Comments

tchrist@sushi.UUCP (02/25/88)

I've compiled perl after applying all 22 of Larry's patches, and have
written a few programs that use it.  In the course of these programs,
I've discovered several possible bugs and probable peculiarities.  I'm
going to outline them here for discussion, as some may well not be bugs.
First, though, I want to say that I *like* perl.  It's something I've
wanted for a long time, and I sincerely want to take my hat off to
Larry for it.  So don't take these as gripes, but constructive
comments.


A pair of bugs:

A) $! is always 25 (ENOTTY), regardless of what I've just done.
   The documentation says it should contain "the current value 
   of errno, with all the usual caveats".  I think I understand how
   things interact with errno, but I can't explain this one.

B) Certain kinds of variable references misbehave when imbedded in a
   string:

	"this $#ARGV in a string" prints "this ARGV in a string" 
	"this $_[$i] in a string" prints "this <whole line>[i] in a string" 

   where <whole line> is the current input record; others are literals.


Now for some things that are sadly lacking:

1) There is no perror() mechanism.  I would like to be able to say 
	die "$file: $SYSERR[$!]" unless open(fd,$file);
	
2) There is no mechanism for internal file globbing.  Saying
	@manfiles = split(' ',`ls /usr/man/man?/$arg.* 2> /dev/null`);
   is such an overkill -- plus I must check $? as the Bourne shell
   returns the unglobbed string if it makes no match.

3) It would be nice if $ENV{'SHELL'} would be honored for `evals`.

4) There is no mechanism for "if" tests, like -e, -x, -w, -d, ...
   Using the `eval` mechanism is clumsy, inefficient, and sometimes
   impossible.  For example, 

   if ( `if [ -e a* 2> /dev/null ]; then echo 1; fi` ) {

   will succeed for just ONE a* file, not zero or more than one.

5) I wish I didn't have to use two statements for this:
	$hours = $_[3];
	$hours =~ s/:.*//;
   I would like to say 
	$hours = $_[3] =~ s/:.*//;
   but in this context, perl says it's a pattern-compare and returns
   one or zero.

6) You can link but not symlink; if you're a BSD system this should be
   possible.

7) I've had trouble with the array/scalar notation, as well as not
   always being certain whether to use a $ or @ at all.  I've always 
   worked it out, but it's somehow not very intuitive.  Has anyone else
   had this problem?

lwall@devvax.JPL.NASA.GOV (Larry Wall) (02/28/88)

In article <69600002@sushi> tchrist@sushi.UUCP writes:
: A) $! is always 25 (ENOTTY), regardless of what I've just done.
:    The documentation says it should contain "the current value 
:    of errno, with all the usual caveats".  I think I understand how
:    things interact with errno, but I can't explain this one.

I can't explain it either.  It works fine here on both Sun and Vax.  Is your
errno declared to be something other than int maybe?  Is your compiler blowing
the cast to (double) in stab_str()?

: B) Certain kinds of variable references misbehave when imbedded in a
:    string:
: 
: 	"this $#ARGV in a string" prints "this ARGV in a string" 
: 	"this $_[$i] in a string" prints "this <whole line>[i] in a string" 
: 
:    where <whole line> is the current input record; others are literals.

The BUGS section of the manual says:
    You can't currently dereference array elements inside a double-quoted
    string.  You must assign them to a temporary and interpolate that.

The problem with implementing it is that I'd either have to do a run-time
evaluation on the subscript expression, which can be very slow, or I'd have
to decompose
	"this $_[$i] in a string"
into
	"this " . $_[$i] . " in a string"

I just haven't gotten around to doing it yet.  Someday.  That's why it's
in the BUGS section.
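
A minimal sketch of the two forms discussed above: the temporary-variable
workaround from the BUGS entry, and the hand-written concatenation that the
interpolated string would have to be decomposed into.  It assumes a
present-day Perl 5 interpreter (hence `my` and `use strict`, which the perl
under discussion does not have), and @fields is hypothetical sample data.

```perl
use strict;
use warnings;

my @fields = ('alpha', 'beta', 'gamma');   # hypothetical sample data
my $i = 1;

# Workaround from the BUGS entry: copy the element to a temporary scalar
# and interpolate the temporary.
my $tmp = $fields[$i];
my $via_temp = "this $tmp in a string";

# The decomposition described above, written out by hand.
my $via_concat = "this " . $fields[$i] . " in a string";

print "$via_temp\n";     # prints "this beta in a string"
print "$via_concat\n";   # prints "this beta in a string"
```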

: Now for some things that are sadly lacking:
: 
: 1) There is no perror() mechanism.  I would like to be able to say 
: 	die "$file: $SYSERR[$!]" unless open(fd,$file);

Yes, this is sadly lacking.  Not difficult to add, either, so expect it soon.

: 2) There is no mechanism for internal file globbing.  Saying
: 	@manfiles = split(' ',`ls /usr/man/man?/$arg.* 2> /dev/null`);
:    is such an overkill -- plus I must check $? as the Bourne shell
:    returns the unglobbed string if it makes no match.

Sometime soon.  It's a little harder than SYSERR though, mostly because I have
to incorporate portable directory reading routines.  Turning a file glob
pattern into a regular expression is fairly trivial.  I already have the
routine to do it in rn.

As to syntax, I'm thinking of providing a glob('*.[ch]') function which returns
an array.  If you can think of a better way to do it, lemme know.

If there are any PD globbing routines out there I'd like to know about those
too.
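
As a rough illustration of turning a glob pattern into a regular expression,
here is a sketch assuming a present-day Perl 5.  glob_to_regex is a
hypothetical helper written for this example; it is not the routine from rn.
It handles only `*`, `?`, and simple `[...]` / `[!...]` classes, and unlike
the shell it lets `*` match a leading dot.

```perl
use strict;
use warnings;

# Hypothetical helper: translate a shell-style glob into an anchored regex.
sub glob_to_regex {
    my ($glob) = @_;
    my $re = '';
    # Walk the pattern one token at a time: a bracket class or a single char.
    while ($glob =~ /\G(\[!?[^\]]+\]|.)/gs) {
        my $tok = $1;
        if    ($tok eq '*') { $re .= '[^/]*' }   # any run of non-slash chars
        elsif ($tok eq '?') { $re .= '[^/]'  }   # exactly one non-slash char
        elsif ($tok =~ /^\[(!?)(.+)\]$/s) {
            # Bracket class; a leading "!" means negation.
            $re .= '[' . ($1 ? '^' : '') . $2 . ']';
        }
        else { $re .= quotemeta($tok) }           # literal character
    }
    return qr/\A$re\z/;
}

# Filter an in-memory list of names (hypothetical sample data).
my @names   = ('perl.c', 'perl.h', 'perl.o', 'Makefile');
my $re      = glob_to_regex('*.[ch]');
my @matches = grep { $_ =~ $re } @names;
print "@matches\n";   # prints "perl.c perl.h"
```

A real implementation would still need the directory-reading half mentioned
above to produce the list of candidate names.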

: 3) It would be nice if $ENV{'SHELL'} would be honored for `evals`.

This is, unfortunately, a bug of popen(), not perl.  For the moment you'd
have to say
	$shell = $ENV{'SHELL'};
	`$shell -c 'evals'`;

: 4) There is no mechanism for "if" tests, like -e, -x, -w, -d, ...
:    Using the `eval` mechanism is clumsy, inefficient, and sometimes
:    impossible.  For example, 
: 
:    if ( `if [ -e a* 2> /dev/null ]; then echo 1; fi` ) {
: 
:    will succeed for just ONE a* file, not zero or more than one.

I've been thinking about adding this for some time.  While you can stat
the file from within perl and pick apart the mode word, I wouldn't want
to wish this on anyone, either the person trying to write it, or the person
trying to read it.
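
For the curious, picking apart the mode word by hand looks roughly like the
sketch below.  It assumes a present-day Perl 5 on a Unix-like system and the
traditional octal values for the file-type bits; describe_mode is a
hypothetical helper name.  Note that the owner permission bits are not the
same thing as a real -r or -w test, which would have to take the caller's
uid and gid into account.

```perl
use strict;
use warnings;

# Traditional Unix values for the file-type field of the mode (assumed here).
my $S_IFMT  = 0170000;   # mask selecting the file-type bits
my $S_IFDIR = 0040000;   # file type: directory

# Hypothetical helper: stat a path and decode a few bits of its mode word.
sub describe_mode {
    my ($path) = @_;
    my @st = stat($path);
    return unless @st;                # stat failed; $! holds the reason
    my $mode = $st[2];                # third element of stat() is the mode
    return {
        is_dir         => (($mode & $S_IFMT) == $S_IFDIR) ? 1 : 0,
        owner_readable => ($mode & 0400) ? 1 : 0,
        owner_writable => ($mode & 0200) ? 1 : 0,
        owner_exec     => ($mode & 0100) ? 1 : 0,
    };
}

my $info = describe_mode('.');
print "'.' is a directory\n" if $info && $info->{is_dir};
```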

I suspect they'd simply be unary operators with a precedence a little higher
than relationals and a little lower than math and concatenation operators.
So you could say

    if (-r $foo . '.bak' || -r $foo)

and have it work in the most intuitive manner.

: 5) I wish I didn't have to use two statements for this:
: 	$hours = $_[3];
: 	$hours =~ s/:.*//;
:    I would like to say 
: 	$hours = $_[3] =~ s/:.*//;
:    but in this context, perl says it's a pattern-compare and returns
:    one or zero.

I don't see any easy way to change this offhand.  I'd hate to make the operation
of =~ depend on its syntactic context, and I have to have =~ return a boolean
in logical contexts.

How about this:  I might be able to convince perl that an assignment is a
valid lvalue.  Then you could say

	($hours = $_[3]) =~ s/:.*//;

That's within two characters of what you wanted, though with inside-out
semantics.  And of course, the whole thing STILL returns a boolean.
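
A small sketch of how the parenthesized form behaves, assuming a present-day
Perl 5 in which an assignment is accepted as an lvalue (the perl under
discussion may not accept it yet).  @fields is hypothetical sample data
standing in for @_.

```perl
use strict;
use warnings;

my @fields = ('tchrist', 'ttyp0', 'Feb 25', '14:32:07');   # hypothetical
my $hours;

# Copy first, then substitute on the copy.  The value of the whole
# expression is the result of s///, not the new string.
my $changed = (($hours = $fields[3]) =~ s/:.*//);

print "$hours\n";        # prints "14"
print "$fields[3]\n";    # prints "14:32:07" (the original is untouched)
print "substituted\n" if $changed;
```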

: 6) You can link but not symlink; if you're a BSD system this should be
:    possible.

I suppose.  Though I've resisted incorporating features that I know can't
be used everywhere.  This one's so useful, however, that I may break down
and do it.  Scripts using it would be no less portable than C programs that
use it.

: 7) I've had trouble with the array/scalar notation, as well as not
:    always being certain whether to use a $ or @ at all.  I've always 
:    worked it out, but it's somehow not very intuitive.  Has anyone else
:    had this problem?

The thing I find interesting is that nobody has EVER complained about my
forcing them to put $ or @ on the front of every variable in sight, when awk
and C don't do so.  The main reason, apart from readability, is to let me
add new keywords to the language without blowing all your old scripts out of
the water because you happened to use $glob.

As to your complaint, there are two parts to it.  Let me see if you are
saying what I think you are saying.

	1) Scalar references are $foo.  Array references are @foo.  But
	array element references are $foo[$i].  Why not @foo[$i]?

	2) There are some places in the language where $ and @ appear
	to be unnecessary.  Where are these places?

1) Why not @foo[$i]?
I dunno, it just came out the other way.  I think of the $ or @ as a type mark
for the whole term, not the type mark of the following identifier.  I think
in terms of the scalar value I'm really going to reference, not the route I
took to get there.  So my mental grouping of $foo[$i] is not

	<$foo>[$i]
but rather
	$<foo[$i]>

Maybe I'm just strange in my head, but since there are places in the code
where the type of the object determines whether the surrounding context does
an array operation or a scalar one, I like to see the type of the whole object
out in front.  For example, of the following two items, I think the second is
more intuitive:

	push(@array,@whatever[$i]);	# illegal

	push(@array,$whatever[$i]);

At a glance I know the second one is pushing a scalar value.  Your mileage
may vary, of course.

2) Where are $ or @ unnecessary?
I don't think there's any place where $ is unnecessary.  As for @, there are
some array functions that KNOW that a particular argument is going to be an
array, and so don't care if you put the @ or not.  It never hurts to put it,
though.

I don't know if I can justify having optional @ typemarks.  I could easily
argue myself into wishing they were mandatory.  Of course, that might break
some perl scripts already in the world, so I probably will leave it as it
is, but I suggest you always use @.  Especially if it's @glob.  :-)

I hope I haven't lied too much...

Larry Wall
lwall@jpl-devvax.jpl.nasa.gov

ewiles@netxcom.UUCP (Edwin Wiles) (02/29/88)

In article <1418@devvax.JPL.NASA.GOV> lwall@devvax.JPL.NASA.GOV (Larry Wall)
writes:
>In article <69600002@sushi> tchrist@sushi.UUCP writes:
>: A) $! is always 25 (ENOTTY), regardless of what I've just done.
>:    The documentation says it should contain "the current value 
>:    of errno, with all the usual caveats".  I think I understand how
>:    things interact with errno, but I can't explain this one.
>
>I can't explain it either.  It works fine here on both Sun and Vax.  Is your
>errno declared to be something other than int maybe?  Is your compiler blowing
>the cast to (double) in stab_str()?

	I've run into this problem before.  (Xenix, on PC/AT)  It seems
	that sprintf, and possibly others in the 'print' family, make a
	call to "isatty()".  Of course, since it isn't printing to a tty
	(sprintf that is...) it returns false, and sets errno to 'ENOTTY'.
	A d*mn nuisance, since it requires that we save the value of errno
	before doing any sprintf, if we expect to need it afterward.
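
	At the script level the same save-it-first habit looks roughly like
	the sketch below.  It assumes a present-day Perl 5, where $! gives
	the error number in numeric context and the message text in string
	context; the path is hypothetical and assumed not to exist.

```perl
use strict;
use warnings;
use Errno qw(ENOENT);

my $path = '/no/such/dir/no-such-file';   # hypothetical, assumed missing
my ($errno, $message);

unless (open(my $fh, '<', $path)) {
    # Capture $! immediately, before any other call can overwrite it.
    $errno   = $! + 0;    # numeric form (the errno value)
    $message = "$!";      # string form (the system error text)
}

# Later library calls may change $!, but the saved copies are stable.
my $line = sprintf("%s: %s", $path, $message);
print "$line\n";
print "saved errno is ENOENT\n" if $errno == ENOENT;
```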

	(Larry!  I'm surprised!  I thought you'd know about this one!)

					Later!
-- 
...!hadron\   "Who?... Me?... WHAT opinions?!?" | Edwin Wiles
  ...!sundc\   Schedule: (n.) An ever changing	| NetExpress Comm., Inc.
   ...!pyrdc\			  nightmare.	| 1953 Gallows Rd. Suite 300
    ...!uunet!netxcom!ewiles			| Vienna, VA 22180

allbery@ncoast.UUCP (Brandon Allbery) (03/05/88)

As quoted from <1418@devvax.JPL.NASA.GOV> by lwall@devvax.JPL.NASA.GOV (Larry Wall):
+---------------
| : 7) I've had trouble with the array/scalar notation, as well as not
| :    always being certain whether to use a $ or @ at all.  I've always 
| 
| The thing I find interesting is that nobody has EVER complained about my
| forcing them to put $ or @ on the front of every variable in sight, when awk
+---------------

I didn't complain because I use Accell at work; in Accell/Language, you can
precede variables by $ and thus guard against their being treated as keywords
by a later version.  Makes sense to me, and I wish it were a little more
common.
-- 
	      Brandon S. Allbery, moderator of comp.sources.misc
       {well!hoptoad,uunet!hnsurg3,cbosgd,sun!mandrill}!ncoast!allbery