ehrlich@psuvax1.cs.psu.edu (Daniel Ehrlich) (08/05/89)
I was wondering if someone was aware of a REFER to BIBTEX conversion program. One of our users has decided to convert from troff to TeX, but has a 1500 entry REFER biblio database. Any pointers would be appreciated. Thanks in advance.
--
Dan Ehrlich <ehrlich@shire.cs.psu.edu> | Disclaimer: The opinions expressed are
The Pennsylvania State University      | my own, and should not be attributed
Department of Computer Science         | to anyone else, living or dead.
University Park, PA 16802              |
seeger@helios.iec.ufl.edu (F. L. Charles Seeger III) (08/07/89)
In article <EHRLICH.89Aug4170909@psuvax1.cs.psu.edu> ehrlich@psuvax1.cs.psu.edu (Daniel Ehrlich) writes:
|I was wondering if someone was aware of a REFER to BIBTEX conversion
|program.

A couple of weeks ago I made the same request and received a few answers (and two programs) that address the issue. Unfortunately, I haven't had a chance to try any of these yet, so I can't offer any opinions on them. Nonetheless, it's past time for a summary, so here is a rather raw one. After trying them, I'll try to make them available for anon-ftp, but this will probably be a month away. (My excuse for such lethargy is that it looks like the magic VLSI layout editor needs massive revisions to run under SunOS 4.0.3. If you're curious, I'll probably post more about it on comp.lsi.cad later this week.) If you *really* can't wait that long for (1) or (2) below, I can mail it to you.

(1) r2bib --- C source.

/* r2bib - convert refer input files to bibtex .bib files
   Author - Rusty Wright, Center for Music Experiment, UCSD
   Modified by - Rod Oldehoeft, LLNL & Colorado State University:
   From: Rod Oldehoeft <rro@lll-crg.ARPA>
   1. Accept a lower-case refer letter code as well as upper case
   2. Map "%X" refer entry to "note=" bibtex entry
   3. A "%B" entry results in "@inbook" result
   4. Use {} instead of "" to bracket output fields
   5. Map "%M" to "month=" bibtex entry
   6. Map "%Y" to "year=" bibtex entry
   7. Try to make bibtex entry key from author initials and year
   Modified by David Kotz, Duke University Computer Science (dfk@cs.duke.edu):
   1. Fixed a bug (indirect through NULL) found when run on Suns.
   2. Make the keyword generator smarter about dates and repeated entries.
      To use this effectively do a 'sortbib -sA+D' on the file before sending
      it through here.
   3. map %K to keywords and %X to abstract instead of note. %O maps to note.
*/

(Thanks to Mark D. Grosen <grosen%amadeus@hub.ucsb.edu> and Francois-Michel Lang <lang@prc.unisys.com>).

(2) ref2bib --- a sh/sed/awk script.

# written by Peter King, Heriot-Watt University

(Thanks to Peter King <pjbk@cs.hw.ac.uk> and Francois Lang). Peter thought that this should be available from Clarkson with a name like "KING.TXH". I couldn't find it there, but he was kind enough to mail me an up-to-date copy.

(3) tib --- is not a translator, but a bibliography setter for TeX that uses refer-type databases. It is available for anonymous ftp from the june.cs.washington.edu archives (~ftp/tex/tib.shar.Z, 318972 bytes). (Thanks to James C. Alexander <jca@anna.umd.edu>).

(4) There appears to be a translator that comes on "the Unix tape", which I assume is the TeX Unix distribution. This was mentioned by David Pascoe <davidp%wacsvax.uwa.edu.au@uunuet.uu.net> and Peter King. Peter felt that it wasn't really up to snuff, though.

BTW, David's last message to me appeared to be missing at least one line. Here is the entire text as it reached me:

| Chuck,
|
| The source I have is straight off the Unix distribution tape and is the code
| Catcha,
| Davidp.

Regards,
Chuck
--
Charles Seeger            216 Larsen Hall          +1 904 392 8935
Electrical Engineering    University of Florida    "Bye, Opus.
seeger@iec.ufl.edu        Gainesville, FL 32611     It's been fun."
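As a rough illustration of the field mapping that the r2bib header comment describes (not a port of r2bib itself), the following Python sketch reads one refer record and formats a BibTeX entry. The function names parse_record and to_bibtex, the default "misc" entry type, and the caller-supplied key are assumptions made for this example. The letter-to-field table follows the comment above for %K, %X, %O, %M and %Y; the remaining letters follow the table in the r2bib.c source posted later in this thread.

```python
# Sketch only: a small refer -> BibTeX field mapper, assuming one record
# per call and a key chosen by the caller.
FIELD_NAMES = {
    "A": "author", "B": "booktitle", "C": "address", "D": "year",
    "E": "editor", "I": "publisher", "J": "journal", "K": "keywords",
    "M": "month", "N": "number", "O": "note", "P": "pages",
    "Q": "institution", "S": "series", "T": "title", "V": "volume",
    "X": "abstract", "Y": "year",
}


def parse_record(text):
    """Collect the fields of one refer record into a dict keyed by letter.

    Lower-case key letters are accepted. Repeated %A lines are joined with
    " and "; other repeats and continuation lines are joined with a space.
    """
    fields = {}
    last = None  # key letter of the field being continued
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("%") and len(line) > 1:
            last = line[1].upper()
            value = line[2:].strip()
        else:
            value = line  # continuation of the previous field
        if last is None or not value:
            continue
        if last in fields:
            joiner = " and " if last == "A" else " "
            fields[last] = fields[last] + joiner + value
        else:
            fields[last] = value
    return fields


def to_bibtex(fields, key, entry_type="misc"):
    """Format the known fields as one BibTeX entry, using {} delimiters."""
    body = ",\n".join(
        "\t%s = {%s}" % (FIELD_NAMES[letter], value)
        for letter, value in fields.items()
        if letter in FIELD_NAMES
    )
    return "@%s{%s,\n%s\n}\n" % (entry_type, key, body)


if __name__ == "__main__":
    sample = "%A Donald E. Knuth\n%T The TeXbook\n%I Addison-Wesley\n%D 1984\n"
    print(to_bibtex(parse_record(sample), "Knu84", "book"))
```

Unknown key letters are silently dropped here, whereas r2bib reports them on stderr; a record carrying both %D and %Y would also need extra handling that this sketch leaves out.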
lang@PRC.Unisys.COM (Francois-Michel Lang) (08/07/89)
In article <EHRLICH.89Aug4170909@psuvax1.cs.psu.edu> ehrlich@psuvax1.cs.psu.edu (Daniel Ehrlich) writes:
>I was wondering if someone was aware of a REFER to BIBTEX conversion
>program. One of our users has decided to convert from troff to TeX,
>but has a 1500 entry REFER biblio database. Any pointers would be
>appreciated. Thanks in advance.

This comes up every so often, and every so often, I post this stuff. I have two programs to do this (I didn't write either of them), but here they are...

-----------------------------------------------------------------------------
#! /bin/sh # This is a shell archive, meaning: # 1. Remove everything above the #! /bin/sh line. # 2. Save the resulting text in a file. # 3. Execute the file with /bin/sh (not csh) to create the files: # r2bib.c # ref2bib # ref2bib.1 # This archive created: Mon Aug 7 08:40:41 1989 export PATH; PATH=/bin:$PATH echo shar: extracting "'r2bib.c'" '(7776 characters)' if test -f 'r2bib.c' then echo shar: will not over-write existing file "'r2bib.c'" else sed 's/^ X//' << \SHAR_EOF > 'r2bib.c' X/* r2bib - convert refer input files to bibtex .bib files X Author - Rusty Wright, Center for Music Experiment, UCSD X Modified by - Rod Oldehoeft, LLNL & Colorado State University: XFrom: Rod Oldehoeft <rro@lll-crg.ARPA> X 1. Accept a lower-case refer letter code as well as upper case X 2. Map "%X" refer entry to "note=" bibtex entry X 3. A "%B" entry results in "@inbook" result X 4. Use {} instead of "" to bracket output fields X 5. Map "%M" to "month=" bibtex entry X 6. Map "%Y" to "year=" bibtex entry X 7. Try to make bibtex entry key from author initials and year X Modified by David Kotz, Duke University Computer Science (dfk@cs.duke.edu): X 1. Fixed a bug (indirect through NULL) found when run on Suns. X 2. Make the keyword generator smarter about dates and repeated entries. X To use this effectively do a 'sortbib -sA+D' on the file before sending X it through here. X 3. 
map %K to keywords and %X to abstract instead of note. %O maps to note. X*/ X# include <ctype.h> X# include <stdio.h> X Xstruct rb { X char rb_kl; /* refer key letter */ X char * rb_kw; /* bibtex string */ X char rb_emit; /* don't print data if 0 */ X char * rb_data; /* refer data */ X}; X Xstruct rb rb[] = { X { 'A', "author", 1, NULL }, X { 'B', "booktitle", 1, NULL }, X { 'C', "address", 1, NULL }, X { 'D', "year", 1, NULL }, X { 'E', "editor", 1, NULL }, X/* { 'H', "commentary1", 1, NULL },*/ X { 'I', "publisher", 1, NULL }, X { 'J', "journal", 1, NULL }, X { 'K', "keywords", 1, NULL }, X { 'L', "label", 0, NULL }, /* use as bibtex key */ X { 'M', "month", 1, NULL }, X { 'N', "number", 1, NULL }, X { 'O', "note", 1, NULL }, X { 'P', "pages", 1, NULL }, X { 'Q', "institution", 1, NULL }, X { 'R', "report", 0, NULL }, X { 'S', "series", 1, NULL }, X { 'T', "title", 1, NULL }, X { 'V', "volume", 1, NULL }, X { 'X', "abstract", 1, NULL }, X { 'Y', "year", 1, NULL }, X { 0, 0, 0, 0 } X}; X Xstruct bmap { X char bm_kl; X char *bm_entry; X}; X X/* X * entries are in order of precedence. X * any entry with a 'J' field must be X * an article, but anthing with an 'I' X * field doesn't have to be a book (if X * an entry has both 'J' and 'I' it is X * considered to be an article). 
X */ Xstruct bmap bmap[] = { X { 'J', "article" }, X { 'R', "techreport" }, X { 'I', "book" }, X { 'B', "inbook" }, X { 0, 0 } X}; X Xmain(argc, argv) X char **argv; X{ X register FILE *fid; X register int i; X int err; X X err = 0; X X if (argc > 1) { X for (i = 1; i < argc; i++) { X if ((fid = fopen(argv[i], "r")) == NULL) { X fprintf(stderr, "fopen: "); X perror(argv[i]); X continue; X } X err += r2bib(argv[i], fid); X } X } X else X err += r2bib("stdin", stdin); X X if (err) X exit(1); X X exit(0); X} X Xr2bib(file, fid) X char *file; X FILE *fid; X{ X extern char *sanz(); X register char *cp; X struct rb *lrb; /* last rb stored into */ X int line; X char buf[BUFSIZ]; X int err; X X lrb = NULL; X err = 0; X line = 0; X X while (fgets(buf, sizeof(buf), fid) != NULL) { X line++; X X if ((cp = sanz(buf)) == NULL) { X if (lrb != NULL) { X dumprb(); X lrb = NULL; X } X continue; X } X X /* X * if the first letter is a % then it's the X * a new record, otherwise it's a continuation X * of the previous one. X */ X if (cp[0] == '%') { X for (lrb = &rb[0]; lrb->rb_kl != 0; lrb++) { X if(lrb->rb_kl == (islower(cp[1]) ? toupper(cp[1]) : cp[1])){ X stuffrb(lrb, &cp[2]); X break; X } X } X if (lrb->rb_kl == 0) { X fprintf(stderr, "r2b: %s: line %d: unknown key letter %c, ignoring\n", file, line, cp[1]); X err = 1; X } X } X else { X if (lrb == NULL) { X fprintf(stderr, "r2b: %s: line %d: bad format, ignoring\n", file, line); X err = 1; X continue; X } X X stuffrb(lrb, &cp[0]); X } X } X X if (lrb != NULL) X dumprb(); X X return(err); X} X X#define KEYSIZ 100 /* hopefully long enough */ X Xdumprb() { X register struct rb *trb; X register struct bmap *bm; X static int key; X char *bibkey; X char *cp; X int first; X static char lastkey[KEYSIZ]; /* the previous key we output */ X char thiskey[KEYSIZ]; /* key we are now building */ X static int repeat = 0; X X /* X * first, figure out what type of entry this X * is. 
X */ X for (bm = &bmap[0]; bm->bm_kl != 0; bm++) { X for (trb = &rb[0]; trb->rb_kl != 0; trb++) { X if ((trb->rb_kl == bm->bm_kl) && (trb->rb_data != NULL)) { X printf("@%s{", bm->bm_entry); X goto out; X } X } X } Xout: X if (bm->bm_kl == 0) X printf("@misc{"); X X /* X * in order of precedence; how to determine the X * bibtex key: X * 1. use capital letters from %A, followed if possible X * by the two chars after "19" in %D or %Y field. X * 2. otherwise just use the string "keyN" where N X * is the count of this bibliographic entry in X * the refer file. X */ X X key++; X bibkey = thiskey; X X for (trb = &rb[0]; trb->rb_kl != 0; trb++) { X if( trb->rb_kl == 'A'){ X if( trb->rb_data == NULL ) { X sprintf(thiskey, "key%d,\n",key); X printf("key%d",key); X break; X }else{ X for( cp = trb->rb_data; *cp != NULL; cp++ ) { X if( isupper(*cp)) { X printf("%c", *cp); X *bibkey++ = *cp; X } X }; X *bibkey = '\0'; X }; X }else{ if((trb->rb_kl == 'D') || (trb->rb_kl == 'Y')) { X for( cp = trb->rb_data; cp != NULL && *cp != NULL; cp++ ) { X if(isdigit(cp[0]) && isdigit(cp[1]) && X isdigit(cp[2]) && isdigit(cp[3])) { X *bibkey++ = cp[2]; X *bibkey++ = cp[3]; X printf("%c%c", cp[2], cp[3]); X break; X }; X }; X *bibkey = '\0'; X break; X }; X }; X }; X X if (strcmp(thiskey, lastkey) == 0) { X /* key was the same as previous; add a letter */ X printf("%c", 'a' + repeat++); X } else { X /* key differed from previous, but remember it */ X strcpy(lastkey, thiskey); X repeat = 0; X } X X printf(",\n"); X X first = 1; X X for (trb = &rb[0]; trb->rb_kl != 0; trb++) { X if (trb->rb_data == NULL) X continue; X X if (trb->rb_emit != 0) { X /* X * clank, X * this is so that things will line up. X */ X if (strlen(trb->rb_kw) < 6) X cp = "\t\t"; X else X cp = "\t"; X X if (! 
first) X printf(",\n"); X X printf("\t%s =%s{%s}", trb->rb_kw, cp, trb->rb_data); X first = 0; X } X X (void) free(trb->rb_data); X trb->rb_data = NULL; X } X X printf("\n}\n\n"); X} X Xstuffrb(lrb, cp) X struct rb *lrb; X char *cp; X{ X extern char *andfix(); X extern char *malloc(); X extern char *realloc(); X X /* empty data field */ X if ((cp = sanz(cp)) == NULL) X return; X X if (lrb->rb_kl == 'A') X cp = andfix(cp); X X if (lrb->rb_data == NULL) { X if ((lrb->rb_data = malloc(strlen(cp) + 1)) == NULL) { X perror("malloc"); X exit(1); X } X X strcpy(lrb->rb_data, cp); X } X else { X char *conj; X X if (lrb->rb_kl == 'A') X conj = " and "; X else X conj = " "; X X if ((lrb->rb_data = realloc(lrb->rb_data, strlen(lrb->rb_data) + strlen(cp) + strlen(conj) + 1)) == NULL) { X perror("realloc"); X exit(1); X } X X strcat(lrb->rb_data, conj); X strcat(lrb->rb_data, cp); X } X} X X/* X */ Xchar * Xandfix(string) X register char *string; X{ X register char *tmp; X register char *cp; X X tmp = string; X X for (cp = string; *cp != NULL; cp++) { X if (strncmp(cp, " and ", 5) == 0) { X /* X * +2 for the curly braces around "{and}", X * +1 for the null at the end. 
X */ X if ((tmp = malloc(strlen(string) + 2 + 1)) == NULL) { X perror("malloc"); X exit(1); X } X X strncpy(tmp, string, cp - string); X tmp[cp - string] = NULL; /* strncpy doesn't */ X strcat(tmp, " {and} "); X strcat(tmp, cp + 5); X } X } X X return(tmp); X} X Xchar * Xsanz(bp) X char *bp; X{ X register char *cp; X X cp = &bp[strlen(bp) - 1]; X X /* X * back up over any spaces chars X */ X while (isspace(*cp) && (cp >= bp)) X cp--; X X if (cp < bp) X return(NULL); /* empty line */ X X *++cp = NULL; X X while (isspace(*bp) && (bp < cp)) X bp++; X X if (cp == bp) X return(NULL); /* empty line */ X X return(bp); X} SHAR_EOF if test 7776 -ne "`wc -c < 'r2bib.c'`" then echo shar: error transmitting "'r2bib.c'" '(should have been 7776 characters)' fi fi # end of overwriting check echo shar: extracting "'ref2bib'" '(18294 characters)' if test -f 'ref2bib' then echo shar: will not over-write existing file "'ref2bib'" else sed 's/^ X//' << \SHAR_EOF > 'ref2bib' X#!/bin/sh X# X# shell script to convert refer (or bib) databases to BiBTeX format X# X# reads its arguments (or standard input) X# and writes the BibTeX to standard output X# X# the in-line files ref2b*.{sed,awk} do not change, and could be X# stored in a library somewhere. The sed script can actually be X# given as an argument in ' ' quotes provided the ' in the X# file are replaced with '\'' !! X# the awk script is too large for this treatment. X# X# The gnereation of keys can be altered by changing the X# values of some awk variables X# X# errors etc. 
in ref2bib.errs X# Xcat << 'ZZ' >ref2b$$.sed X# X# sed script to do some of the ref to bib database conversion X# X# written by Peter King, Heriot-Watt University X# You may do anything you like with this code X# EXCEPT claim that you wrote it X# X# First alter the TeX special characters Xs/\(.\)%/\1\\%/g Xs/&/\\&/g Xs/\$/\\$/g Xs/#/\\#/g Xs/_/\\_/g Xs/{/\\{/g Xs/}/\\}/g X# convert the special characters and accents from troff to BibTeX X# assumes the accents are those of the Berkeley -ms with .AM X# X/\\/s/\(.\)\\\\*\*'/{\\'\1}/g X/\\/s/\(.\)\\\\*\*`/{\\`\1}/g X/\\/s/\(.\)\\\\*\*^/{\\^\1}/g X/\\/s/\(.\)\\\\*\*:/{\\"\1}/g X/\\/s/\(.\)\\\\*\*~/{\\~\1}/g X/\\/s/\(.\)\\\\*\*_/{\\=\1}/g X/\\/s/\([oO]\)\\\\*\*\//{\\\1}/g X/\\/s/\([aA]\)\\\\*\*o/{\\\1\1}/g X/\\/s/\(.\)\\\\*\*,/{\\c{\1}}/g X/\\/s/\(.\)\\\\*\*v/{\\v{\1}}/g X/\\/s/\(.\)\\\\*\*"/{\\H{\1}}/g X/\\/s/\(.\)\\\\*\*\./{\\d{\1}}/g X/\\/s/\\\\*\*8/{\\ss}/g X/\\/s/\\\\*\*(P\([lL]\)/{\\\1}/g X/\\/s/\\\\*\*(\([oO]\)\//{\\\1}/g X# quotes X/\\/s/\\\\*\*Q/``/g X/\\/s/\\\\*\*U/''/g X/\\/s/\\\\*\*-/---/g X# \0 as space between surname de\0Souza etc. 
X/\\/s/\\\\*0\([a-z]*\)\\\\*0/\\0\1 /g X/\\/s/ \([a-z]*\)\\\\*0/ \1 /g X# but trap the ones that start with a capital letter and convert them to X# ties X/\\/s/\\\\*0/~/g X# X# now deal with special characters and Greek X/\\/s/\\\\*(em/---/g X/\\/s/\\\\*(if/$\\infty$/g X/\\/s/\\\\*(\*a/$\\alpha$/g X/\\/s/\\\\*(\*b/$\\beta$/g X/\\/s/\\\\*(\*g/$\\gamma$/g X/\\/s/\\\\*(\*d/$\\delta$/g X/\\/s/\\\\*(\*e/$\\epsilon$/g X/\\/s/\\\\*(\*z/$\\zeta$/g X/\\/s/\\\\*(\*y/$\\eta$/g X/\\/s/\\\\*(\*h/$\\theta$/g X/\\/s/\\\\*(\*i/$\\iota$/g X/\\/s/\\\\*(\*k/$\\kappa$/g X/\\/s/\\\\*(\*l/$\\lambda$/g X/\\/s/\\\\*(\*m/$\\mu$/g X/\\/s/\\\\*(\*n/$\\nu$/g X/\\/s/\\\\*(\*c/$\\xi$/g X/\\/s/\\\\*(\*o/$o$/g X/\\/s/\\\\*(\*p/$\\pi$/g X/\\/s/\\\\*(\*r/$\\rho$/g X/\\/s/\\\\*(\*s/$\\sigma$/g X/\\/s/\\\\*(\*t/$\\tau$/g X/\\/s/\\\\*(\*u/$\\upsilon$/g X/\\/s/\\\\*(\*f/$\\phi$/g X/\\/s/\\\\*(\*x/$\\chi$/g X/\\/s/\\\\*(\*q/$\\psi$/g X/\\/s/\\\\*(\*w/$\\omega$/g X/\\/s/\\\\*(\*A/A/g X/\\/s/\\\\*(\*B/B/g X/\\/s/\\\\*(\*G/$\\Gamma$/g X/\\/s/\\\\*(\*D/$\\Delta$/g X/\\/s/\\\\*(\*E/E/g X/\\/s/\\\\*(\*Z/Z/g X/\\/s/\\\\*(\*Y/H/g X/\\/s/\\\\*(\*H/$\\Theta$/g X/\\/s/\\\\*(\*I/I/g X/\\/s/\\\\*(\*K/K/g X/\\/s/\\\\*(\*L/$\\Lambda$/g X/\\/s/\\\\*(\*M/M/g X/\\/s/\\\\*(\*N/N/g X/\\/s/\\\\*(\*C/$\\Xi$/g X/\\/s/\\\\*(\*O/$O$/g X/\\/s/\\\\*(\*P/$\\Pi$/g X/\\/s/\\\\*(\*R/P/g X/\\/s/\\\\*(\*S/$\\Sigma$/g X/\\/s/\\\\*(\*T/T/g X/\\/s/\\\\*(\*U/$\\Upsilon$/g X/\\/s/\\\\*(\*F/$\\Phi$/g X/\\/s/\\\\*(\*X/X/g X/\\/s/\\\\*(\*Q/$\\Psi$/g X/\\/s/\\\\*(\*W/$\\Omega$/g X# Now trap title words that must be capitalised X/^%[^T]/b X# X# first all words that are all capitals (at least two consecutive) X# we need the slashes to allow for M/M/1 queues Xs;[A-Z][A-Z/][A-Z/0-9]*;{&};g X# X# then some proper names X# first some mathematicians X# (for some I've added the Pattern [^ -]* toe the end to get Markov, X# Markovian, etc. 
Xs/Abel/{&}/g Xs/Bernoulli/{&}/g Xs/Bessel/{&}/g Xs/Beta/{&}/g Xs/Borel/{&}/g Xs/Cauchy/{&}/g Xs/Church/{&}/g Xs/Rosser/{&}/g Xs/Dedekind/{&}/g Xs/Descartes/{&}/g Xs/Dirichlet/{&}/g Xs/Euclid[^ -,;]*/{&}/g Xs/Euler/{&}/g Xs/Fibonacci/{&}/g Xs/Fermat/{&}/g Xs/Fourier/{&}/g Xs/Fresnel/{&}/g Xs/Frobenius/{&}/g Xs/Perron/{&}/g Xs/Gamma/{&}/g Xs/Gauss[^ -,;]*/{&}/g Xs/Hilbert/{&}/g Xs/Horner/{&}/g Xs/Holder/{&}/g Xs/Jacobi[^ -,;]*/{&}/g Xs/Jensen/{&}/g Xs/Markov[^ -,;]*/{&}/g Xs/Arnoldi/{&}/g Xs/Laplace/{&}/g Xs/Laguerre/{&}/g Xs/Lagrange/{&}/g Xs/Legendre/{&}/g Xs/Leibnitz/{&}/g Xs/Rayleigh/{&}/g Xs/Ritz/{&}/g Xs/Riemann/{&}/g X# this is really Rouche (acute accent) , but the accent processing will disrupt it Xs/Rouch/{&}/g Xs/Stieltjes/{&}/g Xs/Stiener/{&}/g Xs/Schwarz/{&}/g Xs/Weibull/{&}/g Xs/Wald/{&}/g Xs/Kronecker/{&}/g Xs/Diophantine/{&}/g Xs/Delbrouck/{&}/g Xs/Bayes[^ -,;]*/{&}/g Xs/Jackson/{&}/g Xs/Newhall/{&}/g Xs/Turing/{&}/g Xs/Norton/{&}/g Xs/Petri/{&}/g Xs/Wilkinson/{&}/g Xs/Skinner/{&}/g Xs/Schafer/{&}/g Xs/Dempster/{&}/g Xs/Runge/{&}/g Xs/Kutta/{&}/g Xs/Pollaczek/{&}/g Xs/Khinchin/{&}/g Xs/Palm/{&}/g Xs/Erlang/{&}/g Xs/Engset/{&}/g Xs/Little's/{&}/g Xs/Kosten/{&}/g Xs/Gittins/{&}/g Xs/Feller/{&}/g Xs/Cox/{&}/g Xs/Poisson/{&}/g Xs/Chapman/{&}/g Xs/Kolmogorov/{&}/g Xs/Smirnov/{&}/g Xs/Weiner/{&}/g Xs/Hopf/{&}/g Xs/Stirling/{&}/g X X# computing Xs/Buzen/{&}/g Xs/Gordon/{&}/g Xs/Newell/{&}/g Xs/Lemoine/{&}/g Xs/Pierce/{&}/g Xs/Harrison/{&}/g Xs/Cambridge/{&}/g Xs/Ethernet/{&}/g Xs/Aloha/{&}/g X X# coding theory Xs/Hamming/{&}/g Xs/Huffman/{&}/g Xs/Reed/{&}/g Xs/Shannon/{&}/g Xs/Solomon/{&}/g Xs/Viterbi/{&}/g XZZ Xcat << 'ZZ' > ref2b$$.awk X# X# awk script to convert refer (or bib) format databases X# to BiBTeX format. 
X# X# written by Peter King, Heriot-Watt University X# use freely, but dont claim that you wrote it X# X# Generates keys using authors names and year (see %A entry ) X# X# You may wish to alter treatment of key fields that are ignored X# such as %U %W %Y %K etc. X# X# regular expressions should be sorted according to frequency X# so that minimal tests are made X# From tests in a local data base the order given appears quite good X# 2883 %A X# 1813 blank lines X# 1774 %T X# 1764 %D X# 1505 %P X# 1347 %J X# 1331 %V X# 1201 %N X# 773 .. continuation lines X# 501 %C X# 424 %I X# 192 %B X# 187 %E X# 92 %S X# 89 %R X# 33 %X X# 30 %K X# 16 %O X# 12 %any other % lines X# XBEGIN { X for(i=1;i<=27;i++) X addkey[i] = substr(" abcdefghijklmnopqrstuvwxyz",i,1); X lkey = 3; # number of characters used from authors to make key X maxauthor = 3; # maximum number of authors to use in X # constructing key X rx = 1 X } X X/\\[*u0]/ || /\\d[^{]/|| /\\s[^s]/ { X err = 1 X print "Non translated \\ symbol : Reference " rx > "ref2bib.errs" X print $0 > "ref2bib.errs" X } X X/^%A/ { X if (A==0) keys=""; X A ++; lastx = "A"; X authors[A] = substr($0,4) X if(A> maxauthor) next X ic = 0 X lc = 1 X while(ic < lkey && lc <= length($NF) ){ X kc = substr( $NF, lc, 1) X if ( kc ~ /[a-zA-Z]/ ){ X keys = keys kc X ic++ X if(ic==lkey) next X } X else if ( kc == "\\" ) lc ++; X lc ++; X } X next X } X X/^$/ { X if(NR==pr+1){ X } X else { X refs ++ X # if FILENAME != prevname then new file X acnt[A]++;if(A>MaxA)MaxA=A; X if(T==0)print "No title : Reference "refs" "keys > "ref2bib.errs" X if(A==0)print "No author : Reference "refs" "keys > "ref2bib.errs" X if(D==0)print "No date : Reference "refs" "keys > "ref2bib.errs" X if( (!T)||(!A)||(!D))err=1; X # classify the reference X if(J){ X #journal or conference X if(B||E||R)print "Journal & book?: Reference "refs" "keys > "ref2bib.errs" X if(C||I) {conf++ X type = "Inproceedings" X } X else{ X jour ++; X type = "Article" X } X if(!P) print "No page nos.? 
: Reference "refs" "keys > "ref2bib.errs" X if( B||E||R||(!P))err=1 X if(err){ X print "Journal reference in error" > "ref2bib.errs" X } X } X else X if(B){ X # article in book X type = "Incollection" X bookart++ X if(N||R||(!E)||(!I)||(!C)||(!P)||(V&&(!S)))err=1 X if(!E) print "No editor? Reference "refs" " keys > "ref2bib.errs" X if(!I) print "No publisher? Reference "refs" " keys > "ref2bib.errs" X if(!C) print "No city? Reference "refs" " keys > "ref2bib.errs" X if(!P) print "No page nos.? Reference "refs" " keys > "ref2bib.errs" X if(V&&(!S))print "Volume but no Series Reference "refs" " keys > "ref2bib.errs" X if(N)print "Issue no.? Reference "refs" " keys > "ref2bib.errs" X if(R)print "Report? Reference "refs" " keys > "ref2bib.errs" X if(err){ X print "Book reference in error" > "ref2bib.errs" X } X } X else if(R){ X #report X type = "Techreport" X reps++ X if(E||N)err=1 X if(N)print "Issue no.? Reference "refs" " keys > "ref2bib.errs" X if(E) print "Editor? Reference "refs" " keys > "ref2bib.errs" X if(err){ X print "Report reference in error" > "ref2bib.errs" X } X } X else if(I){ X wholebook ++ X type = "Book" X # book X if(N||R||E||(!C)||(V&&(!S)))err=1 X if(!C) print "No city? Reference "refs" " keys > "ref2bib.errs" X if(N)print "Issue no.? Reference "refs" " keys > "ref2bib.errs" X if(E)print "Editor? 
Reference "refs" " keys > "ref2bib.errs" X if(V&&(!S))print "Volume but no Series Reference "refs" " keys > "ref2bib.errs" X if(err){ X print "Book reference in error" > "ref2bib.errs" X } X } X else { X unclass ++ X type = "Misc" X err=1 X print "Unclassified reference in error" > "ref2bib.errs" X } X X # generate date X ndate = split(date,df) X if ( ndate > 2) print " Funny date " date > "ref2bib.errs" X if (ndate == 1 ) { df[2] = df[1]; df[1] = ""; } X X X # generate key X if(keys == "") keys = "ANON" X keys = keys substr(df[2],3,2) X if(keyused[keys] >=1) { X key_suffix = keyused[keys]++; X keys = keys addkey[key_suffix]; X } X else keyused[keys] = 1 X if (err) { X print "Key: " keys > "ref2bib.errs" X if(A) for (i=1;i<=A;i++) X print "%A " authors[i] > "ref2bib.errs" X if(T) print "%T " title > "ref2bib.errs" X if(J) print "%J "journal > "ref2bib.errs" X if(B) print "%B "book > "ref2bib.errs" X if(V) print "%V "volume > "ref2bib.errs" X if(N) print "%N "number > "ref2bib.errs" X if(I) print "%I "publisher > "ref2bib.errs" X if(C) print "%C "city > "ref2bib.errs" X if(E) for (i=1;i<=E;i++)print "%E "editor[i] > "ref2bib.errs" X if(S) print "%S "series > "ref2bib.errs" X if(P) print "%P "pages > "ref2bib.errs" X if(R) print "%R "report > "ref2bib.errs" X if(D) print "%D "date > "ref2bib.errs" X if(O) print "%O "other > "ref2bib.errs" X print "" > "ref2bib.errs" X } X X if(T){ X twc = split(title,z) X title = z[1]; lt = length(z[1]); X for(i=2;i<=twc;i++) { X if(lt +length(z[i]) >= 55) {sc = "\n\t\t";lt = 0;} X else sc = " "; X title = title sc z[i] X lt += length(z[i]) + 1 X } X } X if(O){ X twc = split(other,z) X other = z[1]; lt = length(z[1]); X for(i=2;i<=twc;i++) { X if(lt + length(z[i]) >= 55) {sc = "\n\t\t";lt = 0;} X else sc = " "; X other = other sc z[i] X lt += length(z[i]) + 1 X } X } X if(X){ X twc = split(abstr,z) X abstr = z[1]; lt = length(z[1]); X for(i=2;i<=twc;i++) { X if(lt + length(z[i]) >= 55) {sc = "\n\t\t";lt = 0;} X else sc = " "; X abstr 
= abstr sc z[i] X lt += length(z[i]) + 1 X } X } X X printf "@%s{\t%s",type,keys X if(A) { X printf ",\n\tAuthor = { %s",authors[1] X for(i=2;i<=A;i++) printf " and\n\t\t%s",authors[i] X printf " }" X } X if(T) printf ",\n\tTitle = { %s }",title X if(B) printf ",\n\tBooktitle = { %s }",book X if(E) { X printf ",\n\tEditor = { %s",editor[1] X for(i=2;i<=E;i++) printf " and\n\t\t%s",editor[i] X printf " }" X } X if(I) printf ",\n\tPublisher = { %s }",publisher X if(C) printf ",\n\tAddress = { %s }",city X if(J) { # substitute the journal abbreviations from the standard styles X journal = "{ " journal " }" X # {acmcs} {"ACM Computing Surveys"} X if ( journal ~ /Comp.* Sur/ ) journal = "acmcs" X # {acta} {"Acta Informatica"} X if ( journal ~ /Acta Inf/ ) journal = "acta" X # {cacm} {"Communications of the ACM"} X if ( journal ~ /Com.* ACM/ ) journal = "cacm" X if ( journal ~ /CACM/ ) journal = "cacm" X # {ibmjrd} {"IBM Journal of Research and Development"} X if ( journal ~ /IBM J.*R.*D/ ) journal = "ibmjrd" X # {ibmsj} {"IBM Systems Journal"} X if ( journal ~ /IBM Sy.*J/ ) journal = "ibmsj" X # {ieeese} {"IEEE Transactions on Software Engineering"} X if ( journal ~ /IEEE Tran.*Soft.*Eng/ ) journal = "ieeese" X # {ieeetc} {"IEEE Transactions on Computers"} X if ( journal ~ /IEEE Tran.*Computers/ ) journal = "ieeetc" X # {ieeetcad} X if ( journal ~ /IEEE Tran.*Comp.*Desig/ ) journal = "ieeetcad" X # {ipl} {"Information Processing Letters"} X if ( journal ~ /Inf.*Proc.*Lett/ ) journal = "ipl" X # {jacm} {"Journal of the ACM"} X if ( journal ~ /Jou.* ACM/ ) journal = "jacm" X if ( journal ~ /JACM/ ) journal = "jacm" X # {jcss} {"Journal of Computer and System Sciences"} X if ( journal ~ /J.*Comp.*Sys.*Sc/ ) journal = "jcss" X # {scp} {"Science of Computer Programming"} X if ( journal ~ /Sc.*Comp.*Prog/ ) journal = "scp" X # {sicomp} {"SIAM Journal on Computing"} X if ( journal ~ /SIAM .*Comp/ ) journal = "sicomp" X # {tocs} {"ACM Transactions on Computer Systems"} X if ( 
journal ~ /ACM Tran.*Comp.*Sys/ ) journal = "tocs" X # {tods} {"ACM Transactions on Database Systems"} X if ( journal ~ /ACM Tran.*Data.*Sys/ ) journal = "tods" X # {tog} {"ACM Transactions on Graphics"} X if ( journal ~ /ACM Tran.*Grap/ ) journal = "tog" X # {toms} {"ACM Transactions on Mathematical Software"} X if ( journal ~ /ACM Tran.*Math.*Soft/ ) journal = "toms" X # {toois} {"ACM Transactions on Office Information Systems"} X if ( journal ~ /ACM Tran.*Off.*Inf.*Sys/ ) journal = "toois" X # {toplas} {"ACM Transactions on Programming Languages and Systems"} X if ( journal ~ /ACM Tran.*Prog.*Lan.*Sys/ ) journal = "toplas" X # {tcs} {"Theoretical Computer Science"} X if ( journal ~ /Th.*Comp.*Sci/ ) journal = "tcs" X X printf ",\n\tJournal = %s",journal X } X if(V) printf ",\n\tVolume = { %s }",volume X if(N) printf ",\n\tNumber = { %s }",number X if(P) printf ",\n\tPages = { %s }",pages X if(O) printf ",\n\tNote = { %s }",other X if(R) printf ",\n\tNumber = { %s }",report X if(S) printf ",\n\tSeries = { %s }",series X if(df[1] != "") X printf ",\n\tMonth = { %s }",df[1] X if(D) printf ",\n\tYear = { %s }",df[2] X if(X) printf ",\n\tAnnote = { %s }",abstr X if(L) printf ",\n\tKey = { %s }",label X printf "\t}\n\n" X X A=0;B=0;C=0;D=0;E=0;F=0;G=0;H=0;I=0;J=0; X K=0;L=0;M=0;N=0;O=0;P=0;Q=0;R=0;S=0;T=0; X U=0;V=0;W=0;X=0;Y=0;Z=0; X type = "" X book="" X title = "" X volume = "" X city = "" X date = "" X publisher = "" X journal = "" X number = "" X other = "" X page = "" X report = "" X series = "" X toterr +=err X rx++ X } X err = 0 X pr = NR X next X } X X/^%T/ { X T ++; lastx = "T" X if(T>1){err=1 X print "Two titles: Reference " rx > "ref2bib.errs" X print title > "ref2bib.errs" X } X title = substr($0,4) X next X } X X/^%D/ { X D ++; lastx = "D" X if(D>1){err=1 X print "Two dates: Reference " rx > "ref2bib.errs" X print date > "ref2bib.errs" X } X if(($NF<1900)||($NF>=2000)){err=1 X print "Date error? 
: Reference " rx > "ref2bib.errs" X } X date = substr($0,4); X next X } X X/^%P/ { X P ++; lastx = "P" X if(P>1){err=1 X print "Two page nos? : Reference " rx > "ref2bib.errs" X print pages > "ref2bib.errs" X } X pages = substr($0,4) X next X } X X/^%J/ { X J ++; lastx = "J" X if(J>1){err=1 X print "Two journals: Reference " rx > "ref2bib.errs" X print journal > "ref2bib.errs" X } X journal = substr($0,4) X next X } X X/^%V/ { X V ++; lastx = "V" X if(V>1){err=1 X print "Two volumes: Reference " rx > "ref2bib.errs" X print volume > "ref2bib.errs" X } X volume = substr($0,4) X next X } X X/^%N/ { X N ++; lastx = "N" X if(N>1){err=1 X print "Two issue numbers: Reference " rx > "ref2bib.errs" X print number > "ref2bib.errs" X } X number = substr($0,4) X next X } X X/^[^%]/ { X if( lastx == "A") authors[A] = authors[A] " " $0 X if( lastx == "B") book = book " " $0 X if( lastx == "C") city = city " " $0 X if( lastx == "D") date = date " " $0 X if( lastx == "E") editor[E] = editor[E] " " $0 X if( lastx == "I") publisher = publisher " " $0 X if( lastx == "J") journal = journal " " $0 X if( lastx == "L") label = label " " $0 X if( lastx == "N") number = number " " $0 X if( lastx == "O") other = other " " $0 X if( lastx == "P") pages = pages " " $0 X if( lastx == "R") report = report " " $0 X if( lastx == "S") series = series " " $0 X if( lastx == "T") title = title " " $0 X if( lastx == "V") volume = volume " " $0 X if( lastx == "X") abstr = abstr " " $0 X next X } X X/^%C/ { X C ++; lastx = "C" X if(C>1){err=1 X print "Two cities: Reference " rx > "ref2bib.errs" X print city > "ref2bib.errs" X print " 2 cities " FILENAME, pr+1, NR > "ref2bib.errs" X } X city = substr($0,4) X next X } X X/^%I/ { X I ++; lastx = "I" X if(I>1){err=1 X print "Two publishers: Reference " rx > "ref2bib.errs" X print publisher > "ref2bib.errs" X } X publisher = substr($0,4) X next X } X X/^%B/ { X B ++; lastx = "B" X if(B>1){err=1 X print "Two books: Reference " rx > "ref2bib.errs" X print book 
> "ref2bib.errs" X } X book = substr($0,4) X next X } X X/^%E/ { # this really deals with 'bib' format X # refer only allows one %E fielsd, so we ought to X # split it somehow X E ++; lastx = "E" X editor[E] = substr($0,4) X next X } X X/^%[^ABCDEIJKLNOPRSTVX]/ { X F ++; lastx = "F"; # should not get these X print "Unexpected flag: Reference " rx > "ref2bib.errs" X print $0 > "ref2bib.errs" X err = 1 X next X } X X/^%O/ { X O ++; lastx = "O" X if(O>1){err=1 X print "Two others: Reference " rx > "ref2bib.errs" X print other > "ref2bib.errs" X } X other = substr($0,4) X next X } X X/^%S/ { X S ++; lastx = "S" X if(S>1){err=1 X print "Two series: Reference " rx > "ref2bib.errs" X print series > "ref2bib.errs" X } X series = substr($0,4) X next X } X X/^%R/ { X R ++; lastx = "R" X if(R>1){err=1 X print "Two reports: Reference " rx > "ref2bib.errs" X print report > "ref2bib.errs" X } X report = substr($0,4) X next X } X X/^%X/ { X X ++; lastx = "X" X abstr = substr($0,4) X if(X>1){err=1 X print "Two abstracts: Reference " rx > "ref2bib.errs" X } X next X } X X/^%K/ { X lastx = "K" X next X } XEND { X print refs " references" > "ref2bib.errs" X if(toterr) print toterr " erroneous" > "ref2bib.errs" X if(conf) print conf " conference papers" > "ref2bib.errs" X if(jour) print jour " journal articles" > "ref2bib.errs" X if(wholebook) print wholebook " books" > "ref2bib.errs" X if(totB) print totB " book articles" > "ref2bib.errs" X if(reps) print reps " reports" > "ref2bib.errs" X if(unclass) print unclass " Unclassified" > "ref2bib.errs" X if(totO) print totO " have additional information." > "ref2bib.errs" X if(totK) print totK " have additional keywords." > "ref2bib.errs" X if(totX) print totX " have abstracts/commentaries." 
> "ref2bib.errs" X print totA " authors" > "ref2bib.errs" X for(i=0;i<=MaxA;i++)if(acnt[i]){ X print i, " authors ", acnt[i] > "ref2bib.errs" X av += i*acnt[i] X } X print "Average ", av/refs > "ref2bib.errs" X print totT " titles" > "ref2bib.errs" X print "Key frequencies" > "ref2bib.errs" X for(k in keyused) print k, keyused[k] > "ref2bib.errs" X X } XZZ Xsed -f ref2b$$.sed $* | awk -f ref2b$$.awk Xrm -f ref2b$$.sed ref2b$$.awk SHAR_EOF if test 18294 -ne "`wc -c < 'ref2bib'`" then echo shar: error transmitting "'ref2bib'" '(should have been 18294 characters)' fi chmod +x 'ref2bib' fi # end of overwriting check echo shar: extracting "'ref2bib.1'" '(3049 characters)' if test -f 'ref2bib.1' then echo shar: will not over-write existing file "'ref2bib.1'" else sed 's/^ X//' << \SHAR_EOF > 'ref2bib.1' X.TH REF2BIB 1-local X.SH NAME Xref2bib \- convert refer input files to bibtex .bib files X.SH SYNOPSIS X.B r2bib Xfile ... X.br X.SH DESCRIPTION X.B ref2bib Xreads the X.I files Xand produces a X.B bibtex Xreference list (a .bib file) on the standard output. XIf no files are given, ref2bib reads Xstandard input. X.PP XA rudimentary attempt is made to convert X.I troff Xspecial characters and accents to the equivalent X.I TeX Xones. XThe file ``ref2bib.errs'' contains complaints about references that were Xnot recognised, and other problems, as well as a summary of the Xnumber of conversions completed. X.PP XSince X.B refer Xfiles are inherently unstructured (compared to X.B bibtex ) X.B ref2bib Xonly does a passable job. In particular X.B refer Xdoesn't require a keyword, while X.B bibtex Xdoes. X.B ref2bib Xgenerates one using the following procedure: Xthe first 3 characters of the last names of the first three authors Xare concatenated, (preserving the capital letters), and the last two Xdigits of the date are appended. If this key has already been used, Xthen 'a', 'b', 'c', are appended as needed. 
X.PP XJournal entries that appear to be in the standard bibliography style Xfiles list of @strings, are converted. XThe %D field is converted to month and year entries if there are two Xfields, otherwise it is assumed to contain only the year. XA large number of proper names, such as Hilbert, Turing, etc., Xwhich are often found in the titles of articles are enclosed in braces X{} to protect them. This treatment is also applied to any strings of Xmore than two consecutive capital letters. X.PP XTo determine the type of reference that the X.B refer Xentry is, X.B ref2bib Xhas to do some ``calculated guessing''. The heuristic used Xhere (again, in order of precedence) is: X.PP X1. If it has a journal entry (%J) then it's considered to Xbe an @article, unless there is a city entry (%C) or a publisher entry X(%I) as well, in which case it's Xtreated as an @inproceedings. X.PP X2. If it has a book entry (%B) then it's considered to Xbe an @incollection. X.PP X3. If it has a report entry (%R) then it's considered to Xbe a @techreport. X.PP X4. If it has a issuer entry (%I) then it's considered to Xbe a @book. X.PP X5. Otherwise it's considered to be a @misc. XAll @misc entries are listed in the ``ref2bib.errs'' file. X.PP XQuite often X.B ref2bib Xwill misguess and you will need to edit (by hand) the resulting .bib Xfile. X.PP XAny fields that X.B ref2bib Xdoesn't know about it will ignore (and complain about on stderr). X.SH ACKNOWLEDGMENT XThis manual page is based on the manual page for X.I r2bib , Xa program which performs a simpler version of the same conversion, Xwriotten by XRusty Wright, Center For Music Experiment, University of California San XDiego. X.SH AUTHOR XPeter King, Computer Science Department, Heriot-Watt University, XEdinburgh. X.SH BUGS XImplemented as a X.I sh(1) Xscript, using X.I sed(1) Xand X.I awk(1) . XThis makes the conversion very slow, but also means that it is easily Xmodified to alter the heuristics. 
In particular, the key generation Xalgorithm is easily changed. SHAR_EOF if test 3049 -ne "`wc -c < 'ref2bib.1'`" then echo shar: error transmitting "'ref2bib.1'" '(should have been 3049 characters)' fi fi # end of overwriting check # End of shell archive exit 0
-----------------------------------------------------------------------------
----------------------------------------------------------------------------
Francois-Michel Lang
Paoli Research Center, Unisys Corporation   lang@prc.unisys.com   (215) 648-7256
Dept of Comp & Info Science, U of PA        lang@cis.upenn.edu    (215) 898-9511
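The two heuristics spelled out in the ref2bib manual page above (guessing the entry type, and building a citation key) can be sketched in a few lines of Python. This follows the wording of the man page rather than the awk code line for line. The names guess_type and make_key, the set-of-letters input and the dict that tracks used keys are assumptions for this example. The "ANON" fallback for entries without authors is taken from the awk script.

```python
# Sketch only: the ref2bib man page heuristics, restated in Python.

def guess_type(letters):
    """Guess a BibTeX entry type from the refer key letters present.

    `letters` is a set such as {"A", "T", "J", "V"}. Rules are tried in
    the order of precedence given in the manual page.
    """
    if "J" in letters:
        # A journal plus a city or publisher is treated as a conference paper.
        if "C" in letters or "I" in letters:
            return "inproceedings"
        return "article"
    if "B" in letters:
        return "incollection"
    if "R" in letters:
        return "techreport"
    if "I" in letters:
        return "book"
    return "misc"


def surname_prefix(name, length=3):
    """Return the first `length` letters of the last word of an author name."""
    words = name.split()
    if not words:
        return ""
    letters_only = "".join(ch for ch in words[-1] if ch.isalpha())
    return letters_only[:length]


def make_key(authors, year, used):
    """Build a key from up to three surnames plus the last two year digits.

    `used` maps each base key to how often it has been handed out so far;
    repeats get 'a', 'b', 'c', ... appended. More than 26 repeats of the
    same base key are not handled in this sketch.
    """
    stem = "".join(surname_prefix(name) for name in authors[:3]) or "ANON"
    base = stem + str(year)[-2:]
    count = used.get(base, 0)
    used[base] = count + 1
    if count == 0:
        return base
    return base + "abcdefghijklmnopqrstuvwxyz"[count - 1]


if __name__ == "__main__":
    seen = {}
    print(guess_type({"A", "T", "J", "V", "P", "D"}))
    print(make_key(["A. V. Aho", "J. E. Hopcroft", "J. D. Ullman"], "1974", seen))
```

Keeping the used-key table outside make_key lets one table span several input files, which matters when the resulting .bib files are later merged.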