chris@mimsy.UUCP (Chris Torek) (09/20/87)

Lately I have seen some interest expressed about TeX and DVI files and related issues such as fonts. To dispel tales, disseminate truth, define terms, and perhaps induce some other DTs, I have put together this file.

TeX is a program for typesetting text, particularly mathematical text. _The_TeXbook_, by Donald E. Knuth, describes in great detail what TeX does and how it goes about doing it. _The_TeXbook_ is the first in a five volume series on Computers and Typesetting. In the series is also _TeX:_The_Program_, which might be described as an annotated source listing of TeX itself; two books on METAFONT; and one on Knuth's Computer Modern fonts. These are available from Addison-Wesley; I do not have ordering information handy.

The TeX program itself is available from Stanford University, and a version specifically for Unix from the University of Washington. At this time I believe the Unix TeX runs on 4.1, 4.2, and 4.3BSD on Vaxen, on Sun 2s and 3s running SunOS 3.x, and on Pyramids. No doubt other ports are available; again, I have no details.

TeX is written in a language called WEB, which contains both the source to the program and the source to the annotated listing. Two auxiliary programs (Tangle and Weave) extract the program and listing portions; the former creates a Pascal source file, and the latter a TeX document suitable for formatting and printing. The Pascal produced is an extended version of a limited version of standard J&W Pascal, avoiding `new' and `dispose' but using a default clause in case statements. Most Pascal compilers can handle this with, if not ease, at least not too much tweaking.

There are, however, translations of TeX into C. Two of which I am aware are Common TeX, by Pat Mondaro, which I believe to be freely available, and C-TeX, by Tomas Rokicki. There are several versions that run on IBM PCs. As usual, I have no details on other ports of TeX.

TeX produces DVI files.
The format of these files is documented in _TeX:_The_Program_ and in the source code for DVITYPE. The latter should be included in any TeX distribution, and in any case is available from Stanford. DVI stands for DeVice Independent; a DVI file is not suitable for printing on any particular typesetter or laser printer. Instead, it is converted by a `driver'. There is one driver for each kind of printer: for instance, there is a PostScript driver that will convert DVI files to the appropriate commands for an Apple LaserWriter or other compatible PostScript printer.

(There is another kind of DVI file that is produced by ditroff. This format is quite different from TeX's DVI; indeed, the ditroff format hardly merits the appellation `device independent'. It still requires a driver for conversion, but has embedded within it assumptions about the printer that do not appear in a TeX DVI file. Not that it is a bad format---I just think calling it a DVI file is misleading.)

A TeX driver converts the device independent output file to a particular device's format. This is a threefold task, involving reading pages from the DVI file, handling fonts, and decoding `\special's. The first is fairly straightforward: drivers merely need follow the rules laid out by DVITYPE. The remaining two are complicated by a profusion of font formats and a lack of standards.

No two drivers, it seems, do the same thing with any given \special, and \specials tend to be overly device dependent. Specials such as `include a PostScript program here' clearly cannot work on printers that do not implement PostScript. This is inherent in the nature of a \special, of course, but there are some common operations that should be done in a common way, such as drawing lines and arcs and other simple graphics.

The font problems are not quite as difficult, but to some sites are more important. A full set of TeX fonts could require more than 30 megabytes of disk space.
By using a more compact format, these same fonts shrink to less than 10 megabytes. Many drivers can handle only the least compact format. Worse, some drivers handle only one format, and some only another, forcing some sites to keep two or more copies of every font.

The three standard font file formats for TeX are GF files, PK files, and PXL files. GF, or Generic Font, files use an intermediate amount of space; PK, or PacKed, files use the least; and the obsolete but still widely used PXL `pixel' files require the most space. Typically a GF file will be about half the size of the corresponding PXL file, and the same file in PK format will be about half the size of the GF file. In addition, PK files are easier to decode than GF files, being better engineered for unpacking. Hence PK format is the best standard format around. (The drivers in my TeX support code---something now called `ctex', although it has little to do with TeX in C---handle all three font formats.)

Even the ability to read any of these font formats does not solve another crucial problem. Low resolution fonts, such as those for 300 dpi printers like the LaserWriter, depend heavily upon the mechanical qualities of the printer. There are two major kinds of laser printers available today. The Canon engine, used in the LaserWriter, the Imagen 8/300, and the HP LaserJet, uses a process called `write black', in which the laser is used to create black spots on a white background. The Raven engine used in the Xerox 2700 and in some DEC printers uses a process known as `write white': the laser draws white spots on a black background. Fonts designed for write-black engines usually look thin and spidery when printed on write-white engines, and the fonts that come with Unix TeX are tuned for write-black engines.

Fortunately, current distributions of TeX also come with the METAFONT program and the sources to the fonts themselves.
Those with write-white engines can build METAFONT and create a `mode definition' file for their printer, then rebuild all the fonts. The task is by no means painless, and there seems as yet to be no standard write-white mode definition, but it can be done.

Unless. . . . What happens if you have both a Canon-based printer and a Raven-based printer? My own solution to this, although the problem has not yet occurred here, is a directive in the font configuration file. My drivers specify the appropriate engine type; my font lookup code matches this against a `device specifier'. A site in the situation described above might include these lines in the configuration file:

	#    TYPE  SPEC   SLOP  PATH
	font pk    canon  3     /usr/lib/tex/fonts/canon/%f.%mpk
	font pk    raven  3     /usr/lib/tex/fonts/raven/%f.%mpk

An Imagen or PostScript driver would thus use Canon-tuned fonts, while a Xerox 2700 driver would use Raven-tuned fonts. `%f' and `%m' turn into the base name of the font and the magnification, such as `cmr10' and `300' for a 300 dpi rendition of CMR10. The specifier `*' matches anything:

	font pk    *      3     /usr/lib/tex/fonts/%f.%mpk

will be used on any kind of print engine.

Another question that comes up often is that of printing less than a full TeX document. If you have changed only one page, or need to examine only a specific figure or table, it seems wasteful to have to print an entire paper. Printing just the page of interest is so much more sensible.

Because a particular driver must read pages from a DVI file anyway, it seems reasonable to have the driver do the page selection. A number of drivers do this. This is, I think, a mistake. Page selection is, in practice, used rather rarely, and it is hard to do well: for example, did you want the tenth page, or page 10? If there is a page ix, page 10 may be the 22nd page in the DVI file. Following Murphy's Law, drivers that allow page selection will probably choose the tenth page when you wanted page 10, and vice versa.
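
The difference between `the tenth page' and `page 10' can be made concrete with a small sketch. It is not taken from any existing driver or from dviselect. It assumes that each page of the DVI file has already been reduced to the \count0 value TeX recorded for it, and that (as in plain TeX) roman-numbered pages carry negative values. The function names and the sample document are hypothetical.

```python
def select_by_position(count0_values, n):
    """Return the 0-based index of the n-th page in file order (n starts at 1)."""
    if not 1 <= n <= len(count0_values):
        raise IndexError("the file has no page at position %d" % n)
    return n - 1


def select_by_page_number(count0_values, number):
    """Return the 0-based indices of every page whose count0 equals `number`.

    A list is returned because nothing stops a document from reusing a
    page number, for example when numbering restarts in each chapter.
    """
    return [i for i, c in enumerate(count0_values) if c == number]


# Hypothetical document: front matter i..xii (stored as -1..-12),
# followed by body pages 1..30.
pages = [-k for k in range(1, 13)] + list(range(1, 31))

tenth = select_by_position(pages, 10)
page_ten = select_by_page_number(pages, 10)

print("tenth page in the file has count0 =", pages[tenth])
print("page 10 sits at file position(s)", [i + 1 for i in page_ten])
```

With twelve pages of front matter, the tenth page in the file is page x, while page 10 is the 22nd page in the file. A selection tool has to let the user say which of the two is meant.
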
But obviously page selection is useful. The trick is that it can be done outside the driver. A DVI file already consists of a series of pages; it is quite feasible to read one DVI file, select a subset, and write a new DVI file consisting only of the subset pages. The new DVI file can be fed to any driver, whether or not that driver implements page selection, and the DVI selection program can be written to understand the difference between page 19 and page xix, or to allow such esoteric selections as `all of chapter four, and the first page of the index too'. My ctex distribution includes this dviselect program.

Other DVI-to-DVI transformations are possible and sensible. After splitting a file, you could concatenate the pieces in a different order: `dviconcat' would concatenate a whole series of DVI files. `dvisort' might rearrange pages for proper two-sided stacking. `dvibooklet' could prepare a file so that it can be printed with four logical pages per physical page, in such a way that several 8.5 by 11 inch pages could be folded down the center to make an 8.5 by 5.5 inch booklet. (dvibooklet may be a special case of dvisort.) No doubt there are other possibilities that I have missed.

I hope I have managed to clear up some questions and influence future DVI driver writers to solve the right problems. Incidentally, my font routines are available for anyone who would like the flexibility of handling GF, PK, and PXL formats and multiple print engines.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain: chris@mimsy.umd.edu    Path: uunet!mimsy!chris