gisle@ifi.uio.no (Gisle Hannemyr) (10/15/90)
Posting-number: Volume 15, Issue 79 Submitted-by: Gisle Hannemyr <gisle@ifi.uio.no> Archive-name: l2a/part01 # This is a shell archive. # Remove everything above and including the cut line. # Then run the rest of the file through /bin/sh (not csh). #--cut here-----cut here-----cut here-----cut here-----cut here-----cut here--# #!/bin/sh # shar: Shell Archiver # Execute the following text with /bin/sh to create the file(s): # l2a.tex # Makefile # l2a.l # l2a.1 # This archive created: Mon Oct 8 23:47:11 1990 # Wrapped by: Gisle Hannemyr (gisle@ifi.uio.no) echo shar: extracting l2a.tex sed 's/^XX//' << \SHAR_EOF > l2a.tex XX% run this through LaTeX or L2A -*- text -*- XX\documentstyle[12pt]{article} XX\pagenumbering{arabic} XX\setlength{\parindent}{0em} XX\setlength{\parskip}{1ex} XX\setcounter{secnumdepth}{0} XX\newcommand{\LtoA}{$L_{2}\cal A$} XX XX\begin{document} XX\title{\LtoA{} --- a LaTeX detergent} XX\author{Gisle Hannemyr \\ Norwegian Computing Center} XX\maketitle XX XX XX\section{Introduction} XX XX\LtoA\ is a filter to ``detexify'' texts. That is, it attempts to XXremove \LaTeX{} markup commands, leaving only the body of XXtext. It is intended to be used when journal editors request plain XXASCII text for typesetting, or when you want to post a plain version XXof a \LaTeX{} document on an electronic conference system. XX XXThe author's address is: XX XX\begin{tabbing} XXUUCP: \= C=no;PRMD=uninett;O=nr;S=Hannemyr;G=Gisle \= (X.400 SA format) \kill XX \> Gisle Hannemyr \\ XX \> Norwegian Computing Center \\ XX \> P.O.\ 114 Blindern \\ XX \> N-0314 Oslo 3 \\ XX \> Norway \\ XX\\ XXEAN: \> C=no;PRMD=uninett;O=nr;S=Hannemyr;G=Gisle \> (X.400 SA) \\ XX \> gisle.hannemyr@nr.no \> (RFC-822 ) \\ XXInet: \> gisle@ifi.uio.no \\ XXUUCP: \> ...!mcsun!ifi!gisle \\ XX\end{tabbing} XX XX XX\section{Description} XX XX\LtoA\ is a filter. It reads from standard input and writes to XXstandard output. 
Typical usage would be: XX XX\begin{quote} XX\verb+l2a < foobar.tex > foobar.txt+ XX\end{quote} XX XXIt accepts three switches: XX XX{\tt -a} displays copyright etc. XX XX{\tt -h} displays brief help XX XX{\tt -n} uses Norwegian for texts etc. XX XX\section{Current state} XX XX\LtoA{} handles a subset of \LaTeX{}. Most of the missing things I XXplan to add when I need them, but there are some features of \LaTeX{} XX(e.g.\ the $\backslash$kill function) that I cannot see how to XXhandle with lex. Some manual polishing of the output will always XXbe required. XX XXSome Norwegian bias is present in the source code. In particular XXsome Scandinavian characters are translated to their counterparts XXin the Norwegian/Danish version of the ISO 7-bit character set. XX XX\LtoA{} works for my style of \LaTeX{} usage, but will probably barf when fed XXother people's input. Consider the current state of \LtoA{} as a XXstarting point: If you want to use it, then it is up to you to hack it XXinto shape for your style. Btw.\ if you teach \LtoA{} new tricks, I XXwould like to get back a copy of your enhancements\ldots XX XX% I plan to post \LtoA{} on the net in the near future. The current XX% version is a sort of ``beta'' release I mail to people who have shown XX% some interest in it. Any feedback you can give on the current version XX% will be appreciated. XX XX XX\section{Diagnostics} XX XXUnrecognized markup commands generate an error message on the screen. XXThey are also retained in the text, enclosed in brackets that look XXlike this: {\tt @( )@}. This style of bracketing was chosen so that XXit should be simple to use a text editor to search the output file for XXthese commands and edit the context they appear in. XX XX XX\section{Footnotes} XX XX\LtoA{} does not recognize such advanced concepts as a XX``page''. To avoid having to deal with pages, it will transform XX{\em footnotes} to {\em endnotes} (i.e.\ the footnotes are moved to XXthe end of the file, and renumbered). 
\LtoA{} will take care of the XXrenumbering. It will insert numbers in angle brackets XX(e.g.\ $\langle{}3\rangle{}$) to number the footnotes in the text. XX XX XX\section{Tables, figures and captions} XX XXTables and figures are stripped from the text. They are however XXclearly outlined with lines like this\footnote{The text inserted is XXactually dependent on which language you have selected.} in the text: XX XX{ \small XX\verb+<<=============== NB! Typeset as table. NB! ================>>+\\ XX\verb+<<=========== NB! Please insert figure here. NB! ===========>>+ XX} XX XXThis should make the missing bits stand out to the dullest of editors. XX XXCaptions for figures and tables are marked like this, with the actual XXtext of the caption on the next line. XX XX{ \small XX\verb+ --------------- Caption for figure or table: -------------+ XX} XX XXA line like this marks the end of the point where the table XXor figure should be inserted. XX XX{ \small XX\verb+<<==========================================================>>+ XX} XX XXPlease equip the editor with scissors and glue and refer him/her to XXthe paper version typeset with \LaTeX{} to find the actual XXfigures and tables (or submit them on separate sheets of paper). XX XX XX\section*{To do} XX XX\LtoA{} is far from complete. It is very weak as far as mathematical XXmode is concerned (I don't write much mathematics). XX XXHowever, I hope the current version is still of some use. The lex XXsource is easy to maintain. I suggest that users add the stuff they XXneed when they need it. XX XXThe most urgent things on the ``to do'' list are functions to handle XXincluded files, cross-references, citations and bibliographies. XX XX\LtoA{} should also have an option to use an 8-bit character set XX(ISO~8859/1) for accented characters. XX XXFinally, I nurse a secret dream of having \LtoA{} generate XX{\em WordPerfect} or {\em Word} files preserving italics etc. XX XXEnjoy! 
XX XX\end{document} XX XX% EOF XX XX SHAR_EOF if test 5064 -ne "`wc -c l2a.tex`" then echo shar: error transmitting l2a.tex '(should have been 5064 characters)' fi echo shar: extracting Makefile sed 's/^XX//' << \SHAR_EOF > Makefile XX# Makefile for l2a 27 dec. 1988 [gh] XX#----------------------------------------------------------------------------- XX# Abstract: XX# Filter to detexify texts (handles both tex and latex). XX# XX# Compilation: XX# Tested using a Sun 3/50 and SunOS 3.5. Believed to be portable. XX#----------------------------------------------------------------------------- XX XXOBJ = l2a.o XXSHR = l2a.tex Makefile l2a.l l2a.1 XXBINDIR = /local/bin XXMANDIR = /local/man XX XXl2a: $(OBJ) XX cc -o l2a $(OBJ) -ll XX XX$(OBJ): l2a.l XX XXinstall: XX cp l2a $(BINDIR)/l2a XX cp l2a.1 $(MANDIR)/man1/l2a.1 XX XXclean: XX \rm -f l2a.txt l2a.c lex.yy.c l2a l2a.aux l2a.log $(OBJ) *~ XX XXtest: XX \rm -f l2a.txt XX l2a < l2a.tex > l2a.txt XX XXdvi: XX latex l2a.tex XX XXshar: XX shar -a $(SHR) > l2a.shar XX XX# EOF SHAR_EOF if test 737 -ne "`wc -c Makefile`" then echo shar: error transmitting Makefile '(should have been 737 characters)' fi echo shar: extracting l2a.l sed 's/^XX//' << \SHAR_EOF > l2a.l XX /* l2a.l -*- C -*- XX +----------------------------------------------------------------------------- XX | Abstract: XX | Lex filter to detexify texts (really handles LaTeX, and a bit of TeX). XX | XX | Authorship: XX | Copyright (c) 1988, 1990 Gisle Hannemyr. XX | XX | Permission is granted to hack, make and distribute copies of this program XX | as long as this notice and the copyright notices are not removed. XX | If you intend to distribute changed versions of this program, please make XX | an entry in the "history" log (below) and mark the hacked lines with your XX | initials. I maintain the program, and shall appreciate copies of bug XX | fixes and new versions. 
XX | Flames, bug reports, comments and improvements to: XX | snail: Gisle Hannemyr, Brageveien 3A, 0452 Oslo, Norway XX | EAN: C=no;PRMD=uninett;O=nr;S=Hannemyr;G=Gisle (X.400 SA format) XX | gisle.hannemyr@nr.no (RFC-822 format) XX | Inet: gisle@ifi.uio.no XX | UUCP: ...!mcsun!ifi!gisle XX | XX | History: XX | 1.0 8 oct 90 [gh] Version 1 posted on comp.sources.misc. XX | 0.0 27 dec 88 [gh] Started it. XX | XX | Bugs: XX | * Works for my style of LaTeX, but will probably barf on other people's XX | files. XX | * I can't see how \kill can be handled by lex only. XX | * \cite is hacked. Should be fixed. XX | XX | Environment: XX | None. XX | XX | Diagnostics: XX | Prints plain error messages. Does not return error codes. XX +---------------------------------------------------------------------------*/ XX XX XX%{ XX XX /*---( start of user written lex prologue )---------------------------------*/ XX XX XX#include <stdio.h> XX#include <string.h> XX XX#define VERSION "1.0" XX#define MAXEROR 100 XX#define ENGLISH 0 XX#define NORWEGIAN 1 XX XX void parseerror(); XX void bibhead(); XX void ptabular(); XX void pfigure(); XX void pcaption(); XX void phorline(); XX void pappendix(); XX void verbatim(); XX void pfotnote(); XX void footnote(); XX XX XX /*---( globals )------------------------------------------------------------*/ XX XXchar About[] = "\ XXL2A is a filter to remove markup commands from LaTeX manuscripts.\n\n\ XXFlames, bug reports, comments and improvements to:\n\ XX snail: Gisle Hannemyr, Brageveien 3A, 0452 Oslo, Norway\n\ XX EAN: C=no;PRMD=uninett;O=nr;S=Hannemyr;G=Gisle (X.400 SA format)\n\ XX gisle.hannemyr@nr.no (RFC-822 format)\n\ XX Inet: gisle@ifi.uio.no\n\ XX UUCP: ...!mcsun!ifi!gisle\n"; XX XXchar Usage[] = " Usage: l2a (options)\n\ XX Valid options:\n\ XX\t-a -- about l2a\n\ XX\t-h -- print this\n\ XX\t-n -- norwegian text\n"; XX XXint LineNo; /* Line number for debugging */ XXint Enumber; /* Number for enumerate */ XXint Fnumber = 0; /* Number for 
footnotes/endnotes */ XXFILE *BFile; /* Bibliography file. */ XXFILE *FNote; /* Temp. file for footnotes. */ XXint Language = ENGLISH; /* language for inserted text */ XX XX /*---( end of user written lex prologue )-----------------------------------*/ XX XX%} XX XX%START EN FN IT QU MM XX%n 1000 XX%p 6000 XX%a 4000 XX%e 2000 XX XXsp [ \t\n]* XXan [0-9a-zA-Z]+ XX XX%% XX XX^"\\begin{thebibliography".* { bibhead(); } XX^"\\end{thebibliography}" { ; } XX^"\\bibitem["{an}"]{"{an}"}" { fputs(yytext,stderr); } XX"\\cite{"[^{\n]*"}" { yytext[0] = '('; yytext[strlen(yytext)] = ')'; printf("@CITE%s@ ",yytext); } XX"\\ref{"[^{\n]*"}" { yytext[0] = '('; yytext[strlen(yytext)] = ')'; printf("@REF%s@ ", yytext); } XX^"\\begin{enumerate}" { BEGIN EN; Enumber = 0; } XX^"\\end{enumerate}" { BEGIN 0; } XX^"\\begin{itemize}" { BEGIN IT; } XX^"\\end{itemize}" { BEGIN 0; } XX^"\\begin{quote}" { BEGIN QU; } XX^"\\end{quote}" { BEGIN 0; } XX^"\\begin{quotation}" { BEGIN QU; } XX^"\\end{quotation}" { BEGIN 0; } XX<IT>"\\item" { putchar('*'); } XX<EN>"\\item" { Enumber++; printf("%d)",Enumber); } XX"\\item" { putchar('+'); } XX<QU>\n { ECHO; LineNo++; fputs(" ",stdout); } XX<FN>. 
{ footnote(); BEGIN 0; } XX"$" { BEGIN MM; } XX<MM>"$" { BEGIN 0; } XX<MM>"\\langle" { putchar('<'); } XX<MM>"\\rangle" { putchar('>'); } XX<MM>"\\backslash" { putchar('\\'); } XX^"\\begin{tabular".* { ptabular(); } XX^"\\end{tabular}" { phorline(); } XX^"\\begin{figure".* { pfigure(); } XX^"\\end{figure}" { phorline(); } XX^"\\caption{" { pcaption(); } XX"\\footnote" { Fnumber++; printf("<%d>",Fnumber); BEGIN FN; } XX"\\appendix" { pappendix(); } XX"\\verb" { verbatim(); } XX"\\'{e}"{sp} { putchar('e'); } XX\\ae{sp} { putchar('{'); } XX\\o{sp} { putchar('|'); } XX\\aa{sp} { putchar('}'); } XX\\AE{sp} { putchar('['); } XX\\O{sp} { putchar('\\'); } XX\\AA{sp} { putchar(']'); } XX"&" { putchar('\t'); } XX"\\>" { putchar('\t'); } XX"~" { putchar(' '); } XX"\\ " { putchar(' '); } XX"\\$" { putchar('$'); } XX"\\&" { putchar('&'); } XX"\\%" { putchar('%'); } XX"\\#" { putchar('#'); } XX"\\_" { putchar('_'); } XX"\\{" { putchar('{'); } XX"\\}" { putchar('}'); } XX"\\\\"{sp} { putchar('\n'); } XX^"\\vspace".* { putchar('\n'); } XX^"\\title" { putchar('\n'); } XX^"\\author" { putchar('\n'); } XX^"\\date".* { putchar('\n'); } XX^"\\maketitle" { putchar('\n'); } XX^"\\part" { putchar('\n'); } XX^"\\chapter" { putchar('\n'); } XX^"\\section" {;} XX^"\\subsection" {;} XX^"\\subsubsection" {;} XX^"\\paragraph" {;} XX^"\\subparagraph" {;} XX^"\\input lcustom" {;} XX^"\\documentstyle".* {;} XX^"\\pagenumbering".* {;} XX^"\\setcounter".* {;} XX^"\\newcommand".* {;} XX^"\\setlength".* {;} XX^"\\hyphenation".* {;} XX^"\\label".* {;} XX^"\\draftfalse" {;} XX^"\\begin{document}" {;} XX^"\\end{document}" {;} XX^"\\begin{sloppypar}" {;} XX^"\\end{sloppypar}" {;} XX^"\\begin{tabbing}" {;} XX^"\\end{tabbing}" {;} XX"\\TeX" { fputs("TeX", stdout); } XX"\\LaTeX" { fputs("LaTeX", stdout); } XX"\\LtoA" { fputs("L2A", stdout); } XX"\\ldots" { fputs("...", stdout); } XX"\\tiny" {;} XX"\\scriptsize" {;} XX"\\footnotesize" {;} XX"\\small" {;} XX"\\normalsize" {;} XX"\\large" {;} XX"\\Large" {;} 
XX"\\LARGE" {;} XX"\\huge" {;} XX"\\Huge" {;} XX"\\rm"{sp} {;} XX"\\em"{sp} {;} XX"\\bf"{sp} {;} XX"\\it"{sp} {;} XX"\\sl"{sp} {;} XX"\\sf"{sp} {;} XX"\\sc"{sp} {;} XX"\\tt"{sp} {;} XX"{" {;} XX"}" {;} XX"%".* {;} XX"\\-" {;} XX\n { ECHO; LineNo++; } XX. { ECHO; } XX"\\"[@A-Za-z]+ { parseerror(12,yytext); yytext[0] = '('; printf("@%s)@ ",yytext); } XX XX%% XX XX /*---( routines )-----------------------------------------------------------*/ XX XX void parseerror(type,ss) XX int type; XX char *ss; XX { XX static errcnt = 0; XX XX fprintf(stderr,"l2a: Error %d -- ",type); XX switch (type) { XX case 1: fputs("Instruction not recognized",stderr); break; XX case 2: fputs("Wrong number of parameters",stderr); break; XX case 3: fputs("Bad parameter received",stderr); break; XX case 5: fputs("Unknown character set",stderr); break; XX case 6: fputs("Position overflow",stderr); break; XX case 10: fputs("Bogus output request",stderr); break; XX case 11: fputs("Unimplemented command",stderr); break; /* No function */ XX case 12: fputs("Unrecognized markup command",stderr); break; /* No grammar */ XX case 13: fputs("Unhandled command",stderr); break; /* No semantic */ XX default: fputs("Unknown error (internal)",stderr); break; XX } /* switch */ XX fprintf(stderr,"\n @ line %3d: \"%s\"\n",LineNo,ss); XX errcnt++; XX if (errcnt > MAXEROR) { XX fputs(" Too many errors -- aborting\n",stderr); XX exit(-1); XX } XX } /* parseerror */ XX XX XX void bibhead() XX { XX switch (Language) { XX case ENGLISH: puts("\nReferences:"); break; XX case NORWEGIAN: puts("\nLitteratur:"); break; XX } /* switch */ XX } /* bibhead */ XX XX XX void ptabular() XX { XX switch (Language) { XX case ENGLISH: puts("\t<<=============== NB! Typeset as table. NB! ================>>"); break; XX case NORWEGIAN: puts("\t<<============== NB! Settes som tabell. NB! ================>>"); break; XX } /* switch */ XX } XX XX XX void pfigure() XX { XX switch (Language) { XX case ENGLISH: puts("\t<<=========== NB! 
Please insert figure here. NB! ===========>>"); break; XX case NORWEGIAN: puts("\t<<=========== NB! Figur skal settes inn her. NB! ===========>>"); break; XX } /* switch */ XX XX } XX XX XX void pcaption() XX { XX switch (Language) { XX case ENGLISH: puts("\t --------------- Caption for figure or table: -------------"); break; XX case NORWEGIAN: puts("\t --------------- Undertekst for bilde/tabell: -------------"); break; XX } /* switch */ XX } XX XX XX void phorline() XX { XX puts("<<==========================================================>>"); XX } XX XX XX void pappendix() XX { XX switch (Language) { XX case ENGLISH: puts("\n\nAPPENDIX\n========\n"); break; XX case NORWEGIAN: puts("\n\nAPPENDIX\n========\n"); break; XX } /* switch */ XX } XX XX void verbatim() XX { XX int cc, ts; XX XX ts = input(); XX for (;;) { XX cc = input(); XX if (cc == ts) break; XX putchar(cc); XX } /* forever */ XX } /* verbatim */ XX XX XX void pfotnote() XX { XX switch (Language) { XX case ENGLISH: puts("\n\nENDNOTES\n========\n"); break; XX case NORWEGIAN: puts("\n\nNOTER\n=====\n"); break; XX } /* switch */ XX } XX XX XX void footnote() XX { XX int cc, be = 1; XX XX fprintf(FNote,"<%d>: ", Fnumber); XX while (be) { XX cc = input(); XX if (cc == '{') be++; XX else if (cc == '}') be--; XX else putc(cc,FNote); XX } XX putc('\n',FNote); XX putc('\n',FNote); XX } /* footnote */ XX XX XX void bibitem() XX { XX int cc, be = 1; XX XX fprintf(FNote,"<%d>: ", Fnumber); XX while (be) { XX cc = input(); XX if (cc == '{') be++; XX else if (cc == '}') be--; XX else putc(cc,FNote); XX } XX putc('\n',FNote); XX putc('\n',FNote); XX } /* bibitem */ XX XX /*---( main )---------------------------------------------------------------*/ XX XX main (argc, argv) XX int argc; XX char **argv; XX { XX XX fprintf(stderr,"l2a, version %s -- Copyright (c) 1988, 1990 Gisle Hannemyr\n\n",VERSION); XX XX argc--; argv++; /* skip program name */ XX while (argc && (**argv == '-')) { XX (*argv)++; /* skip initial 
'-' */ XX switch (**argv) { XX case 'a': fputs(About,stderr); exit(1); XX case 'n': Language = NORWEGIAN; break; XX default : fputs(Usage,stderr); exit(1); XX } /* switch */ XX argc--; argv++; XX } /* while options */ XX if (argc) { fputs(Usage,stderr); exit(1); } XX XX BEGIN 0; XX LineNo = 0; XX FNote = fopen("FN.TMP","w"); XX /* BFile = fopen("references.tex","r"); */ XX yylex(); XX XX fclose(FNote); XX if (Fnumber) { XX char buff[80]; XX FNote = fopen("FN.TMP","r"); XX pfotnote(); XX while (fgets(buff,80,FNote)) fputs(buff,stdout); XX } /* if */ XX unlink("FN.TMP"); XX puts("..EOF"); XX } /* main */ XX XX /*---( EOF lex input file )-------------------------------------------------*/ SHAR_EOF if test 11589 -ne "`wc -c l2a.l`" then echo shar: error transmitting l2a.l '(should have been 11589 characters)' fi echo shar: extracting l2a.1 sed 's/^XX//' << \SHAR_EOF > l2a.1 XX.\" @(#)l2a.1 2.0 89/12/10 [gh] XX.\" Usage: XX.\" nroff -man l2a.1 XX.TH L2A 1L "8 October 1990" "Version 1.0" XX.SH NAME XXl2a \- a LaTeX detergent XX.SH SYNOPSIS XX.B l2a XX[ XX.B \-a XX] XX[ XX.B \-h XX] XX[ XX.B \-n XX] XX.SH DESCRIPTION XX.LP XX.B L2a XXis a filter to ``detexify'' texts. That is, it attempts to XXremove LaTeX markup commands, leaving only the body of XXtext. It is intended to be used when journal editors request plain XXASCII text for typesetting, or when you want to post a plain version XXof a LaTeX document on an electronic conference system. XX.PP XX.B L2a XXis a filter. Its default operation is to read from standard input XX(the keyboard) and write on standard output (the terminal). XX.SH OPTIONS XX.TP XX.B \-a XXWrite out information about XX.B XXl2a. XX.TP XX.B \-h XXWrite out a brief summary of options. XX.TP XX.B \-n XXGenerate Norwegian headings. XX.SH DIAGNOSTICS XX.PP XXUnrecognized markup commands generate an error message on the screen. XXThere is no return code. XX XX.SH AUTHOR XX.PP XXCopyright \(co 1988, 1990 Gisle Hannemyr. 
XX.PP XX.B L2a XXmay be freely distributed and copied, as long as this file XXis included in the distribution and these statements XXabout authorship and copyright are not altered or removed. XX.PP XXBug reports, improvements, comments, suggestions and flames to: XX.ti +0.2i XXSnail: Gisle Hannemyr, Brageveien 3A, 0452 Oslo, Norway. XX.ti +0.2i XXEmail: gisle.hannemyr@nr.no (EAN); XX.ti +0.9i XXgisle@ifi.uio.no (Internet); XX.ti +0.9i XX\|.\|.\|.\|!mcsun!ifi!gisle (UUCP); XX.ti +0.9i XX(and several BBS mailboxes). XX.SH BUGS XX.PP XXThere is some Norwegian bias in XX.B l2a. XXIn particular, XXthere exist several national versions of the ISO 646 7-bit XXcharacter set, but XX.B l2a XXbluntly assumes the standard Norwegian version of ISO 646. XX.PP XXOnly a subset of LaTeX is understood. XX.\" EOF SHAR_EOF if test 1748 -ne "`wc -c l2a.1`" then echo shar: error transmitting l2a.1 '(should have been 1748 characters)' fi # End of shell archive exit 0
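
A note on the size checks used in this archive. Each file is unpacked by stripping the XX guard prefix with sed and is then compared against an expected byte count taken from "wc -c FILE". When wc is given a file argument it prints the file name after the count, and some newer shells may refuse to compare that output as a number. The sketch below is not part of the original archive. It shows the same unpack-and-verify step with the file fed to wc on standard input, so that only the count is printed. The names check_size, demo_dir and demo.txt, and the 12-byte payload, are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not from the archive.
# check_size FILE EXPECTED: succeed only if FILE holds EXPECTED bytes.
check_size() {
    # Reading FILE on stdin makes wc print the count without the file name;
    # tr removes the padding that some wc implementations add.
    actual=$(wc -c < "$1" | tr -d ' ')
    test "$actual" -eq "$2"
}

# Scratch directory for the demonstration, removed when the script exits.
demo_dir=${TMPDIR:-/tmp}/shar-demo.$$
mkdir -p "$demo_dir" || exit 1
trap 'rm -rf "$demo_dir"' EXIT

# Unpack the way the archive does: drop the XX prefix from every line.
sed 's/^XX//' << \SHAR_EOF > "$demo_dir/demo.txt"
XXhello
XXworld
SHAR_EOF

# "hello\nworld\n" is 12 bytes.
if check_size "$demo_dir/demo.txt" 12
then
    echo "demo.txt: size ok"
else
    echo "shar: error transmitting demo.txt (should have been 12 characters)"
fi
```

Run through /bin/sh, this should print "demo.txt: size ok". Like the checks in the archive, a mismatch only produces a warning and does not stop the unpacking.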