johnh@nottingham.cs.ucla.edu (John Heidemann) (11/20/90)
Following is a Perl script which allows keyword searches of bibtex
databases, showing the entire database entry when a match is found.
A man page and installation instructions are included.

   -John Heidemann

---------------+--------------------------------------------------------------
John Heidemann | What, your editor plays Tetris with its built-in LISP?  When
UCLA           | _I_ was a boy, we played by tossing tape write rings about...
---------------+--------------------------------------------------------------

----- cut here -----
#! /bin/sh
# This is a shell archive, meaning:
# 1. Remove everything above the #! /bin/sh line.
# 2. Save the resulting text in a file.
# 3. Execute the file with /bin/sh (not csh) to create the files:
#       README
#       lookbibtex.1
#       lookbibtex
# This archive created: Mon Nov 19 20:26:27 1990
export PATH; PATH=/bin:$PATH
if test -f 'README'
then
    echo shar: will not over-write existing file "'README'"
else
cat << \SHAR_EOF > 'README'
To install lookbibtex, change the #! line of lookbibtex
to the path for Perl.  You may also wish to change the
default database; if so edit the $defaultfile variable.

Lookbibtex is released under the GNU Public License,
Version 1 (Feb 89).  A copy of the GPL should be included
with your Perl distribution.

Any comments are welcome.

   -John Heidemann <johnh@cs.ucla.edu>
SHAR_EOF
fi # end of overwriting check
if test -f 'lookbibtex.1'
then
    echo shar: will not over-write existing file "'lookbibtex.1'"
else
cat << \SHAR_EOF > 'lookbibtex.1'
.\" lookbibtex.1
.TH LOOKBIBTEX 1 "19 November 1990"
.SH NAME
lookbibtex \- find references in a bibtex database
.SH SYNOPSIS
.B lookbibtex
[ -k
.I keyword
] [
.I bibfile.bib
]
.I regexp
.SH DESCRIPTION
lookbibtex searches through a bibtex
.I bibfile.bib
database, printing entries that match
.I regexp.
See
.BR bibtex (1)
for a description of the bibtex database.
.I Regexp
is a Perl regular expression.
See
.BR perl (1)
for an explanation of differences between perl and
standard regular expressions.

Searches can be limited to particular bibtex fields
with the -k option.

To do "and" searches on two fields use shell pipes,
reading the output of one search as the bibliography
of the second.  For example, to find what someone named
Kafka wrote about emacs keyboard layout, do:
.IP
.B lookbibtex -k author kafka | lookbibtex -k title - meta
.LP
To do "or" searches, use regular expressions.
For example, you could be concerned with only
careful authors, like Knuth and Kafka:
.IP
.B lookbibtex -k author 'knuth|kafka'
.LP
More sophisticated searches can be achieved by combining
these techniques.
.SH AUTHOR
John Heidemann <johnh@cs.ucla.edu>
.SH SEE ALSO
.BR bibtex (1),
.BR perl (1)
.SH BUGS
lookbibtex is written in Perl, so it will not run
on machines which do not have perl installed
(although this is arguably a bug of the person too lazy
to install such a useful tool).

Multiple keywords on one line will be missed.

The @ that begins a bibtex entry and the } that ends it
must each be the first character on their line
(in the left-most column), or they will be missed.
SHAR_EOF
fi # end of overwriting check
if test -f 'lookbibtex'
then
    echo shar: will not over-write existing file "'lookbibtex'"
else
cat << \SHAR_EOF > 'lookbibtex'
#!/usr/local/bin/perl
#
# Look into a bib file.
# Copyright (C) 1990 by John Heidemann
# This is distributed under the GNU Public License, Version 1 (Feb 89).
# See the Perl documentation for a copy of that license.
#
# 4-Oct-90   it is hacked together.
# 19-Nov-90  Now it remembers "'s and joins such lines.
#            It also removes nasty characters like {} from the search string.
#
# This program relies on the convention that the closing } of a
# bib entry is the only } in the left-most column,
# and that the opening @ is also in the first column.
#
$* = 1;    # make searches on vars with embedded newlines work
           # (Perl 5 and later: drop this and use the /m match modifier)

#
# customize this to whatever is right locally
#
$defaultfile = "/u/s9/u/ficus/DOC/ficus.bib";
$badkeys = "string";    # keys to ignore

#
# do argument processing
#
if ($#ARGV >= 1 && $ARGV[0] eq "-k") {
    $keyword = $ARGV[1];
    shift (@ARGV); shift (@ARGV);
};
if ($#ARGV == 0) {
    $file = $defaultfile;
    $pattern = $ARGV[0];
} elsif ($#ARGV == 1) {
    $file = $ARGV[0];
    $pattern = $ARGV[1];
} else {
    die ("Usage: lookbibtex [-k keyword] [bibfile.bib] regexp\n" .
         "    Keyword restricts the regexp search to that bibtex " .
         "field name (author, etc.)\n" .
         "    Default bibfile is $defaultfile, - indicates stdin.\n" .
         "    Regexp is a Perl regexp.\n");
};

#
# handle the keyword by modifying the pattern
# (the parentheses keep an alternation such as a|b inside the field;
# without them only the first alternative would be tied to the keyword)
#
if (defined($keyword)) {
    $pattern = "^\\s*${keyword}\\s*=.*(${pattern})";
    # print "pattern is $pattern\n";
};

#
# looking for beginning of bib entry is state 1, in bib is state 2
#
$state = 1;

#
# Certain keys we really want to ignore because
# they're not bib entries.  They're listed here.
#
@badkeys = split(/,/, $badkeys);
foreach $i (@badkeys) {
    $badkeys{$i} = "bad";    # just make them defined
};

#
# To do searches right, we have to make everything
# for a field on one line.
# This routine does that, and also gets rid of {}'s
# which tend to interfere with searches.
#
sub printtosearch {
    local ($print) = @_;
    local ($search, $mode) = ("", 1);    # mode 1: outside a quoted string
    @lines = split(/\n/, $print);
    $lines[0] =~ s/\{/ { /;
    foreach $ln (@lines) {
        $ln =~ s/[{}]//g;    # remove curly brackets
        # an odd number of quotes opens or closes a multi-line string
        $mode = !$mode if (($ln =~ tr/"/"/) % 2 == 1);
        $search .= $ln;
        $search .= "\n" if ($mode);
    };
    return $search;
}

open (INF, "<$file") || die ("cannot open bibfile $file");
while (<INF>) {
    # print "line ", $i++, " state=$state: $_\n";
    if ($state == 1) {
        if (/^@(\w+)/) {    # beginning of entry
            ($key = $1) =~ tr/A-Z/a-z/;    # case insensitive keywords
            if (! defined($badkeys{$key})) {
                $state = 2;
                $bibentry = $_;
            };
        };
    } elsif ($state == 2) {
        $bibentry .= $_;
        if (/^}/) {    # ending
            $searchentry = &printtosearch($bibentry);
            print "$bibentry\n" if ($searchentry =~ /$pattern/i);
            $state = 1;
        }
    } else {
        die ("state problem, $state\n");
    };
}
SHAR_EOF
chmod +x 'lookbibtex'
fi # end of overwriting check
#       End of shell archive
exit 0
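
The two-state scan and the pipe-based "and" search described in the man page can be tried without installing anything. What follows is a minimal sketch in plain sh and awk, not the Perl script itself: the function names sample_bib and look_entries and the three sample entries are made up for illustration, and the matching only approximates the real tool (POSIX extended regular expressions on lowercased input, and no joining of quoted strings that span several lines).

```shell
#!/bin/sh
# Hypothetical stand-in for lookbibtex, for illustration only.

# sample_bib: print a tiny made-up bibtex database on stdout.
sample_bib() {
cat << \BIB_EOF
@article{kafka90meta,
  author = "Franz Kafka",
  title = "The Meta Key",
  year = 1990
}

@book{knuth84tex,
  author = "Donald Knuth",
  title = "The TeXbook",
  year = 1984
}

@article{kafka89trial,
  author = "Franz Kafka",
  title = "Trial Layouts",
  year = 1989
}
BIB_EOF
}

# look_entries FIELD REGEXP: read a bibtex database on stdin and print
# every entry in which a "FIELD = ..." line matches REGEXP.
# An entry runs from a line starting with @ to a line starting with }.
look_entries() {
    awk -v field="$1" -v pat="$2" '
        # Group the user pattern so that a|b stays inside the field.
        BEGIN { re = "^[ \t]*" tolower(field) "[ \t]*=.*(" tolower(pat) ")" }
        # Outside an entry: wait for the opening @ in column one.
        !inside && /^@/ { inside = 1; entry = ""; hit = 0 }
        # Inside an entry: collect lines and remember any match.
        inside {
            entry = entry $0 "\n"
            if (tolower($0) ~ re) hit = 1
            if ($0 ~ /^[}]/) {
                if (hit) printf "%s\n", entry
                inside = 0
            }
        }
    '
}

# "and" search: the first pass prints whole entries, so its output is
# itself a database that the second pass can read.
sample_bib | look_entries author kafka | look_entries title meta

# "or" search: one regexp with alternation; count the matching entries.
sample_bib | look_entries author 'knuth|kafka' | grep -c '^@'
```

With the sample data, the first pipeline should print only the kafka90meta entry and the second should report three entries. The "and" search works because each pass emits complete entries in the same format it reads.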