[comp.lang.perl] ptags - perl tags in perl

lm@slovax.Eng.Sun.COM (Larry McVoy) (04/11/91)

In the hack-of-the-hour category, a tags file generator for perl.  Differences
from ctags:

	1) Puts a tag in for the filename
	2) Puts in multiple tags for the same symbol (I have a hacked version
	   of vi that groks this).

#!/bin/perl -s

# perl tags, in perl.
# @(#)ptags 1.2 4/11/91, no copyright.  Bugfixes to lm@eng.sun.com.

# tag	file	<vi expression to find it>
# catch	/u/lm/tmp/eintr.c	/^catch() {}$/

if ($#ARGV == -1) {
	unshift(@ARGV, "-");
}
open(STDOUT, "|sort>tags") || die "can't create tags";
while ($_ = shift) {
	next unless -f $_;
	print STDERR "$_\n" if $v;
	&file($_);
}
exit;

sub file
{
	local($name) = $_[0];
	local($basename) = $_[0];

	open(F, $name) || return;
	if ($name =~ /\//o) {
		$basename  =~ s|.*/([^/]+)$|$1|o;
	}
	# put tag in for filename
	print "$basename\t$name\t1\n";
	while (<F>) {
		# skip the word sub in comments
		next unless /^[^#]*\bsub\b/;
		# skip the word sub in a string (one line only, I'm lazy)
		next if /"[^"]*sub/;
		print "$name: $. $_" if $d;
		# Find the name of the sub, it should be right after "sub".
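		# Split into an explicit array; a bare "split;" relies on the
		# implicit split to @_, which later perls no longer support.
		@words = split(' ');
		$subname = "";
		for ($f = 0; $f <= $#words; $f++) {
			if ($words[$f] eq "sub") {
				$subname = $words[$f + 1];
				last;
			}
		}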
		if ($subname eq "") {
			print STDERR "No name: $name: $. $_\n";
		} else {
			chop;
			print "$subname\t$name\t/^$_\$/\n";
		}
	}
	# An explicit close resets $. so line numbers restart for the next file.
	close(F);
}
---
Larry McVoy, Sun Microsystems     (415) 336-7627       ...!sun!lm or lm@sun.com
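
A minimal sketch of the tags-file layout the script above produces, assuming
the source lines are already in memory.  The helper name tag_lines and the
path lib/demo.pl are made up for illustration; neither appears in the post.
It emits one tag for the file name itself (address 1) and one tag per
"sub NAME" line, so a symbol defined twice gets two entries.

```perl
use strict;
use warnings;

# Hypothetical helper: build the tags-file lines for one file.
# $path is the file name; @lines are its lines without trailing newlines.
sub tag_lines {
    my ($path, @lines) = @_;
    (my $basename = $path) =~ s{.*/}{};     # strip any leading directories
    my @tags = ("$basename\t$path\t1");     # tag for the file itself: line 1
    for my $line (@lines) {
        # "sub NAME" that is not preceded by a '#' on the same line
        next unless $line =~ /^[^#]*\bsub\s+(\w+)/;
        push @tags, "$1\t$path\t/^$line\$/";
    }
    return sort @tags;                      # tags files are kept sorted
}

# Two definitions of "file" yield two tags; the commented-out sub is skipped.
print "$_\n" for tag_lines("lib/demo.pl", "sub file", "# sub nothing", "sub file {");
```

This sketch does not escape pattern metacharacters in the search expression.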

rbj@uunet.UU.NET (Root Boy Jim) (04/12/91)

In article <543@appserv.Eng.Sun.COM> lm@sun.UUCP (Larry McVoy) writes:
>In the hack-of-the-hour category, a tags file generator for perl.

Why don't you hack this into etags/ctags in the emacs distribution?

>Differences from ctags:

Why should there be any?
-- 
		[rbj@uunet 1] stty sane
		unknown mode: sane

tchrist@convex.COM (Tom Christiansen) (04/12/91)

From the keyboard of rbj@uunet.UU.NET (Root Boy Jim):
:In article <543@appserv.Eng.Sun.COM> lm@sun.UUCP (Larry McVoy) writes:
:>In the hack-of-the-hour category, a tags file generator for perl.
:
:Why don't you hack this into etags/ctags in the emacs distribution?

Who cares about emacs? :-)


I've always used this.  Don't recall whom I got it from...

--tom

#!/usr/local/bin/perl
open(OUTPUT, "| sort > tags") || die "can't create tags: $!\n";
while (<>) {
    if (/\bsub\s+(\w+')?(\S+)/) {
	$func = $2;
	chop;
	s,[\\\[\]/.*],\\$&,g;
	print OUTPUT "$func\t", $ARGV, "\t/^$_\$/\n";
    }
}
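
To make the two steps in the script above easier to try in isolation, here is
a small sketch that applies the same match and the same backslash-escaping to
a single line.  The function name tag_for and the file name demo.pl are
hypothetical; the regular expression and the set of escaped characters are
taken from the script.

```perl
use strict;
use warnings;

# Hypothetical helper: return the tags line for one source line, or undef
# if the line does not contain a sub definition.
sub tag_for {
    my ($file, $line) = @_;
    chomp $line;
    # Same pattern as the script: an optional old-style package prefix
    # (pkg') is skipped, and $2 is the run of non-blanks that follows.
    return undef unless $line =~ /\bsub\s+(\w+')?(\S+)/;
    my $func = $2;
    # Backslash-protect the same characters the script does: \ [ ] / . *
    (my $pattern = $line) =~ s{([\\\[\]/.*])}{\\$1}g;
    return "$func\t$file\t/^$pattern\$/";
}

print tag_for("demo.pl", q{sub main'lookup { return $x[0]; }}), "\n";
```

Because the name is captured with \S+, a definition written as "sub foo{"
yields the tag "foo{"; capturing (\w+) instead would be one way to avoid that.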

flee@cs.psu.edu (Felix Lee) (04/14/91)

I almost have a stand-alone scanner for Perl: feed it a Perl program
and it will output a stream of tokens, one per line.  Would anyone
find this useful?  If so, I'll finish rewriting it.
--
Felix Lee	flee@cs.psu.edu

rbj@uunet.UU.NET (Root Boy Jim) (04/16/91)

In article <oebG+9jo1@cs.psu.edu> flee@cs.psu.edu (Felix Lee) writes:
>I almost have a stand-alone scanner for Perl: feed it a Perl program
>and it will output a stream of tokens, one per line.  Would anyone
>find this useful?  If so, I'll finish rewriting it.

Hmmmm, what's a token? How do you parse: $foo =~ s/$bar/tr;a-z;A-Z;/e
How do you deal with: &qaz(<<foo,<<bar)
What about q (or $) followed by newline or tab?
Have fun!
-- 
		[rbj@uunet 1] stty sane
		unknown mode: sane

flee@cs.psu.edu (Felix Lee) (04/16/91)

>Hmmmm, what's a token?

A token is a lexical unit.  What is a lexical unit?  Something
convenient for the parser.

I lied.  It's not just a stand-alone scanner.  There's also a
stand-alone parser.  One of my subgoals in building the thing was to
make the pieces easily decomposable.  So the stand-alone scanner is
just a small wrapper around gettoken().

>How do you parse: $foo =~ s/$bar/tr;a-z;A-Z;/e
	dvar "foo"
	"=~"
	substitute-begin "/"
	dvar "bar"
	substitute-to
	translate-begin ";"
	string "a-z"
	translate-to
	string "A-Z"
	translate-end
	substitute-end

This needs a little explanation.  A double-quoted string like
	"$a is $b"
becomes the token sequence
	dquote-begin
	dvar "a"
	string " is "
	dvar "b"
	dquote-end
and things inside s//, m//, and tr// are treated similarly.
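
A toy version of the double-quote scheme just described, as a sketch only:
the function name dquote_tokens is hypothetical, and it recognises nothing
beyond plain $name variables (no escapes, arrays, or ${...}).  The token
names are the ones used in this post.

```perl
use strict;
use warnings;

# Hypothetical helper: split the body of a double-quoted string into the
# token sequence described above.
sub dquote_tokens {
    my ($body) = @_;
    my @tokens = ('dquote-begin');
    # At each position take either "$name" or a run of literal text;
    # a lone "$" that starts no variable is passed through as text.
    while ($body =~ /\G(?:\$(\w+)|([^\$]+|\$))/gc) {
        push @tokens, defined $1 ? qq{dvar "$1"} : qq{string "$2"};
    }
    push @tokens, 'dquote-end';
    return @tokens;
}

print "$_\n" for dquote_tokens('$a is $b');
```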

For s///, the scanner always returns an expression for the second
clause.  So if you don't have a trailing "e", the scanner adds a
dquote-begin/dquote-end pair.
	s/foo/bar/
becomes
	substitute-begin "/"
	string "foo"
	substitute-to
	dquote-begin
	string "bar"
	dquote-end
	substitute-end

>How do you deal with: &qaz(<<foo,<<bar)
	call
	string "qaz"
	"("
	dquote-begin "foo"
	string "..."
	dquote-end
	","
	dquote-begin "bar"
	string "..."
	dquote-end
	")"

<< quoting and formats require multiply buffered input.  Input is
buffered in a "line stream".  The line stream is buffered in a
"character stream".  The scanner normally reads from the character
stream, but << quoting and formats will read from the line stream.
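
One way to picture the two layers, as a rough sketch and not the actual
scanner: all names below (new_stream, next_line, next_char, read_heredoc)
are invented for illustration.  The character stream refills itself one line
at a time from the line stream, while << quoting pulls whole lines directly,
leaving the rest of the current line unread.

```perl
use strict;
use warnings;

# A stream holds the pending input lines plus the unread tail of the
# line currently being scanned character by character.
sub new_stream {
    my @lines = @_;
    return { lines => [@lines], chars => '' };
}

# Line stream: hand out the next whole line, or undef at end of input.
sub next_line {
    my ($s) = @_;
    return shift @{ $s->{lines} };
}

# Character stream: refill from the line stream whenever the current
# line is used up, then hand out one character.
sub next_char {
    my ($s) = @_;
    while ($s->{chars} eq '') {
        my $line = next_line($s);
        return undef unless defined $line;
        $s->{chars} = $line;
    }
    return substr($s->{chars}, 0, 1, '');
}

# << quoting: collect whole lines up to the terminator, bypassing the
# character stream.
sub read_heredoc {
    my ($s, $terminator) = @_;
    my $body = '';
    while (defined(my $line = next_line($s))) {
        last if $line eq "$terminator\n";
        $body .= $line;
    }
    return $body;
}

my $in = new_stream("&qaz(<<foo,<<bar);\n", "first\n", "foo\n", "second\n", "bar\n");
my $text = '';
$text .= next_char($in) for 1 .. 10;    # scanner has consumed "&qaz(<<foo"
my $foo = read_heredoc($in, 'foo');     # body of the first here-document
$text .= next_char($in) for 1 .. 6;     # scanner continues with ",<<bar"
my $bar = read_heredoc($in, 'bar');     # body of the second here-document
while (defined(my $c = next_char($in))) { $text .= $c }
print $text, $foo, $bar;
```

The driver at the bottom counts characters by hand; a real scanner would
call read_heredoc when it recognises the << token.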

>What about q (or $) followed by newline or tab?

Left as an exercise to the reader.

Warning.  I haven't actually implemented all of the above, so some of
what I just described may be unwieldy in practice.

The last thing I did, about a month ago, was add full expression
parsing to the yacc grammar.  The grammar has no s/r or r/r conflicts
and uses no precedence rules, but I'm not certain it actually parses
the same language that "perl" does.  I was in the middle of rewriting the
scanner to handle the new parser requirements when the project dropped
by the wayside.
--
Felix Lee	flee@cs.psu.edu