[comp.text.tex] RFC -- a TeX font naming system

Damian.Cugley@prg.ox.ac.uk (Damian Cugley) (04/17/91)

This article describes an alternative approach to naming TeX fonts -- I
would be interested to see if anyone thinks it has anything going for it.

------------------------------------------------------------------------

If we stick to the 6- or 8-character names we have to resort to
outrageous abbreviations -- Karl Berry's gallant efforts
notwithstanding.  These in turn have to be centrally controlled
so that everyone uses the same names (which is a pain when new fonts
become available).

What we need is a system whereby the TeX name for a font can be deduced
from its external name without any ambiguity, and where TeX names are
relatively intelligible.  Here is a system that I think has some merits;
afterwards I will show how I think it can be implemented on top of
different sorts of filing system.


Font names have a fixed syntax which goes like this:

    <font name> --> [ <foundry> - [ <family> - ] ] <cut>
    <foundry> --> <component>
    <family> --> <component>
    <cut> --> <component>
    <component> --> <letter> { <letter> | <digit> }.

Here "-" stands for a hyphen token, and <letter> and <digit> stand for a
letter and a digit respectively.  Material in [...] is optional, and
material in {...} can appear zero or more times.

For example:

    Adobe-Courier-cbo		Courier Bold Oblique
    Bitstream-Charter-chi	Charter Italic
    pdc-mabx10			Malvern Bold Expanded 10pt
    pdc-ditc18			Ditko Compressed 18pt
    cmr12			Computer Modern Roman 12pt
    cmmi10			Computer Modern Math Italic 10pt
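
The grammar above can be checked mechanically.  What follows is only a
minimal sketch in Python, not part of the proposal: the names FontName and
parse_font_name are made up for illustration, and it assumes that <letter>
means an ASCII letter.

```python
import re
from typing import NamedTuple, Optional

# <component> --> <letter> { <letter> | <digit> }
# Assumption: letters and digits are ASCII only.
_COMPONENT = re.compile(r"[A-Za-z][A-Za-z0-9]*")


class FontName(NamedTuple):
    """Hypothetical container for the parts of a <font name>."""
    foundry: Optional[str]
    family: Optional[str]
    cut: str


def parse_font_name(name: str) -> FontName:
    """Split a 1-, 2- or 3-part font name into foundry, family and cut."""
    parts = name.split("-")
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"expected 1 to 3 components: {name!r}")
    for part in parts:
        if not _COMPONENT.fullmatch(part):
            raise ValueError(f"bad component {part!r} in {name!r}")
    if len(parts) == 3:   # <foundry>-<family>-<cut>
        return FontName(parts[0], parts[1], parts[2])
    if len(parts) == 2:   # <foundry>-<cut>
        return FontName(parts[0], None, parts[1])
    return FontName(None, None, parts[0])   # <cut> alone


print(parse_font_name("Adobe-Courier-cbo"))
print(parse_font_name("pdc-mabx10"))
print(parse_font_name("cmr12"))
```

A name with a fourth component, or a component that starts with a digit,
is rejected with ValueError.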

The form <foundry>-<family>-<cut> is used for fonts from commercial
manufacturers, who have a large number of families each with several
individual font files (which I will eccentrically refer to as cuts).
Usually the company will not be directly involved with porting their
fonts to the TeX world.

The second form, <foundry>-<cut> is used for typefaces where the
"foundry" is a single person or small institution.  Generally these will
turn out to be PD fonts created using METAFONT or PostScript directly.

The third form is for fonts supplied with TeX as part of the CMR
"meta-family" or one-off fonts that supply specialized symbols -- in
other words, almost all currently existing TeX fonts.


WHAT A CUT IS USED TO MEAN

The <cut> name for a font looks like an old-style TeX name and will
generally be along these lines --

    <family abbrev><suffix><optional size>

where
    <optional size> --> <empty> 	-- one size font 
	| <digits> [ p <digits> ]	-- size in points -- 7p5 == 7.5 pt
	| <digits> m [ <digits> ]	-- size in mm -- 3m5 == 3.5 mm
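
As an illustration of the <optional size> rule, here is a small Python
sketch that peels the size off the end of a <cut>.  The function name
split_size and the returned tuple are assumptions made for the example.
Separating <family abbrev> from <suffix> would need a table of known
abbreviations, so the sketch leaves them together as the "stem".

```python
import re
from typing import Optional, Tuple

# <digits> m [ <digits> ]   -- size in mm, tried first
_MM = re.compile(r"(\d+)m(\d*)$")
# <digits> [ p <digits> ]   -- size in points
_PT = re.compile(r"(\d+)(?:p(\d+))?$")


def split_size(cut: str) -> Tuple[str, Optional[float], Optional[str]]:
    """Return (stem, size, unit); size and unit are None for one-size fonts."""
    for pattern, unit in ((_MM, "mm"), (_PT, "pt")):
        match = pattern.search(cut)
        if match:
            whole, fraction = match.group(1), match.group(2) or "0"
            return cut[:match.start()], float(f"{whole}.{fraction}"), unit
    return cut, None, None


print(split_size("cmr12"))    # ('cmr', 12.0, 'pt')
print(split_size("cmr7p5"))   # ('cmr', 7.5, 'pt')
print(split_size("ma3m5"))    # ('ma', 3.5, 'mm')
print(split_size("cbo"))      # ('cbo', None, None)
```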

The <family abbrev> is a 1-, 2- or 3-letter abbreviation of the family name.
For the 3-part font names this is just a placeholder taken from the
initials of the family name.  For 2-part and 1-part <font name>s, this
will be chosen by the designer.

The <suffix> is the usual "bold italic" or "thin oblique" type stuff,
and possibly things like "csc" or "mi" meaning it is a different
selection of symbols from a particular face.  For fonts imported from
outside the TeX world (which will have 3-part names), the <suffix> is
derived from the font designer's description of the font [table of
abbreviations goes here].  Someone designing a new METAFONT family
can use whatever <suffix>es take their fancy.

Certain combinations of long suffixes, long <optional size>s and
3-letter family abbreviations could lead to <cut>s more than 8 chars
long.  This can be avoided by using a Univers-style numeric system
("68c" for semibold condensed italic caps-&-s.caps instead of
"sbcicsc"), or simply shoved under the carpet...


FAMILIES and FOUNDRIES

Foundries and families allow different designers to not have to worry
about their fonts having overlapping names -- so long as the foundries
are unique!  [Rules for shortening company names go here.]

For 2-part names, which usually have a <foundry> named after a person or
academic institution, normally the "foundry" will be involved with
making them available and can make their own decision about what name
should be used!


HOW TeX INTERPRETS THESE NAMES

The munging of file names is part of the system-dependent part of TeX.
All this system requires is that "-" be given a special meaning.  What
meaning exactly depends on the implementation.  It should be assumed that
only the first 8 case-folded characters of each <component> (the MS-DOS
limit) are significant.

Some examples.

 \\ A HAL 9000 has a filing system with a flat namespace of
    40-character monocase names.  It can map the up to 24 significant
    characters of a font name onto a filename like "TEXFONT@ADOBE@TIMES@TBI"
    or "TEXFONT@PDC@MA24".

 \\ A Lose-O-Matic 770 has a directory structure with short (8-char)
    names.  By interpreting the "-" as a directory separator, it sees
    Adobe-Times-tbi as a file TBI in a subdirectory TIMES of a directory
    ADOBE of the TeX font area.

 \\ A BSD UNIX system has a directory structure *and* long names.  It
    can use whatever system is most convenient (I favour mapping "-" to "/").

 \\ A Creton-X50 has a flat name space *and* short names.  All fonts are
    given a code number when installed and are put in files with names
    f0000, f0001, f0002, ...; a "directory file" contains a list of font
    names and font code numbers.  <foundry>s and <family>s have
    associated directory files referenced from the top level directory
    file.  In other words, the hapless Creton-X50 implementors have to
    "roll their own" directory structure.  (This is why I limited <font
    name>s to three <component>s.)
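
The four cases above might be sketched in Python roughly as follows.
Everything here is an assumption made for illustration: the function and
class names, the "\" separator for the Lose-O-Matic, folding to lower case
on UNIX, and a single flat table standing in for the Creton-X50's nested
directory files.

```python
from typing import Dict, List

SIGNIFICANT = 8  # characters of each <component> taken as significant


def components(font_name: str) -> List[str]:
    """Split on '-' and keep 8 case-folded characters of each part."""
    return [part[:SIGNIFICANT].lower() for part in font_name.split("-")]


def flat_long_name(font_name: str, prefix: str = "TEXFONT", sep: str = "@") -> str:
    """HAL 9000 style: flat name space, long monocase names."""
    return sep.join([prefix] + [c.upper() for c in components(font_name)])


def short_directories(font_name: str, sep: str = "\\") -> str:
    """Lose-O-Matic style: '-' becomes a directory separator (sep is a guess)."""
    return sep.join(c.upper() for c in components(font_name))


def unix_path(font_name: str) -> str:
    """BSD UNIX style: map '-' to '/' under the TeX font area."""
    return "/".join(components(font_name))


class FontRegistry:
    """Creton-X50 style: hand out f0000, f0001, ... as fonts are installed."""

    def __init__(self) -> None:
        self._codes: Dict[str, str] = {}

    def install(self, font_name: str) -> str:
        key = "-".join(components(font_name))
        if key not in self._codes:
            self._codes[key] = f"f{len(self._codes):04d}"
        return self._codes[key]


print(flat_long_name("Adobe-Times-tbi"))   # TEXFONT@ADOBE@TIMES@TBI
print(flat_long_name("pdc-ma24"))          # TEXFONT@PDC@MA24
print(unix_path("Adobe-Times-tbi"))        # adobe/times/tbi
```

With an 8-character prefix and two or three separators, the longest flat
name stays well inside the 40-character limit.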

 //- Damian Cugley ----\  /--- Oxford University Computing Laboratory, -\ 
 ||  pdc@prg.ox.ac.uk  || \--- 11 Keble Rd, Oxford, UK  OX1 3QD --------/ 
 ||  pdc@uk.ac.ox.prg  ||                                               
  \--------------------//   "His feet are the wrong size for his shoes." 

jeffrey@cs.chalmers.se (Alan Jeffrey) (04/19/91)

In article <DAMIAN.CUGLEY.91Apr17163401@pierrot.prg.ox.ac.uk> Damian.Cugley@prg.ox.ac.uk (Damian Cugley) writes:
>This article describes an alternative approach to naming TeX fonts -- I
>would be interested to see if anyone thinks it has anything going for it.

Hmm... very interestink... It does mean you'll need a different
fontname.tex for each installation, but that shouldn't be too hard.

The only serious problem I can see is that dvi files would no longer
be portable.  For example, if you say

   \newfont\thingy{pdc-dict18}

then on a Unix architecture fontname.tex converts this to

   \font\thingy pdc/dict18

whereas on a Scrungomatic 2503, fontname.tex would convert it to 

   \font\thingy pdc@dict18

and so the dvi file would no longer be portable.  (And yes, people do
port dvi files around rather than TeX files---the day TeX guarantees
the same output from the same input is the day people will stop
sending dvi files around.)

Apart from that, the main problem would be getting people to
standardize on it.  Isn't this always the way?

Alan.


Alan Jeffrey         Tel: +46 31 72 10 98         jeffrey@cs.chalmers.se
Department of Computer Sciences, Chalmers University, Gothenburg, Sweden

lee@sq.sq.com (Liam R. E. Quin) (04/21/91)

Damian Cugley described an alternative approach to naming TeX fonts.

I think that this is an excellent idea.

I also think that reading the work done by the working group for DIS 9541-1,
and particularly Jim Flowers' contributions, would be an excellent idea.

ISO standards on the structure of font names are in progress, and it does
look as if these will be accepted by the MIT X consortium and others.
So it would be good to be compatible.

Distributed font servers are on their way, and there will be one in X11R5.
That one almost certainly won't be fully draft-iso-compliant, because of
time constraints, but it _is_ moving in that direction.

So a TeX of the future might well be able to make a query to a remote system
about fonts, and even acquire a tfm file (or equivalent), as long as TeX
can understand the ISO naming conventions.

These, like Damian's proposal, are similar to the current (X11 R4) names:
-adobe-new century schoolbook-medium-i-normal--24-240-75-75-p-136-iso8859-1
and so forth.

ISO names tend to use / rather than -, but the principle is the same.

Note that it is useful to separate weight (bold, medium) from face (roman,
oblique, slanted, italic), so that using "cbo" for Courier-Bold-Oblique is
probably bad.

I don't see any advantage in limiting font names to three components.
Systems with terminally braindamaged filesystems can use a file containing
a full mapping, and can maybe even have a server process which will manage
this.  It's better to design the system you want, and then try to see how
to implement it, than to try and design something with the worst common
denominator in mind and then try to extend it.

Lee



-- 
Liam Russell Quin, SoftQuad Inc., Toronto... 416 963 8337... lee@sq.com
	   `What one person finds valuable others do not even notice.
	    And they do not notice that they do not notice.'
-- Scott Kim, `Interdisciplinary Communication', in `The Art of [HCI] Design'

Damian.Cugley@prg.ox.ac.uk (Damian Cugley) (04/22/91)

From:		Alan Jeffrey <jeffrey@cs.chalmers.se>
Message-Id:	<4427@undis.cs.chalmers.se>

> The only serious problem I can see is that dvi files would no longer
> be portable.  For example, if you say
>    \newfont\thingy{pdc-dict18}
> then on a Unix architecture fontname.tex converts this to
>    \font\thingy pdc/dict18
> whereas on a Scrungomatic 2503, fontname.tex would convert it to 
>    \font\thingy pdc@dict18

Hum?  What is fontname.tex?  I was thinking in terms of using

	\font\titlerm = pdc-ditc18 

in the TeX file or equivalently [see lfonts.tex] in LaTeX files

	\newfont{\titlerm}{pdc-ditc18}

The system-dependent part of TEX.WEB that looks for font files would do
any munging of that internally; the name kept by TeX for the DVI file
ought to be able to remain unchanged.

This relies on all DVI-manipulating programs either knowing of the
conventions used or being configurable enough to be told so.  (If the
TeX-name->file-name conventions were embodied in a library module that
could be used in other programs, this would be a lot easier.)


> Apart from that, the main problem would be getting people to
> standardize on it.  Isn't this always the way?

This is true.  It is a pity that it never occurred to the people
designing TeX that it might eventually be convenient to be able to refer
to more than half a dozen or so typeface families...

(It also has to be said that the ideal of all (La)TeX documents being
printable on all (La)TeX systems is unattainable anyway.)

 //- Damian Cugley ---------------------------------------------------\ 
 ||  Oxford University Computing Laboratory, 11 Keble Rd, Oxford, UK  ||
 ||  pdc@prg.ox.ac.uk or pdc@uk.ac.ox.prg in UK      DON'T PANIC!     ||
  \-------------------------------------------------------------------//

Damian.Cugley@prg.ox.ac.uk (Damian Cugley) (04/22/91)

From:		Liam R. E. Quin <lee@sq.sq.com>
Message-Id:	<1991Apr20.214907.13178@sq.sq.com>

> Note that it is useful to separate weight (bold, medium) from face (roman,
> oblique, slanted, italic), so that using "cbo" for Courier-Bold-Oblique is
> probably bad.

I am combining the "face" info into one word for several reasons:
firstly, TeX has no useful way to treat weight, width, slant etc.
separately; secondly, it's shorter; thirdly, it allows the old fonts --
cmr12 etc. -- to remain unchanged.  I prefer

	Adobe-NewCenturySchoolbook-r 

(PostScript fonts are linearly scaled, so no <optional size> is needed)
to the XLFD name

 -Adobe-New Century Schoolbook-Medium-I-Normal--24-240-75-75-P-136-ISO8859-1

if only because it is shorter -- and IMO more comprehensible.

For three-part names, the "cut" part is just the "bold-oblique" part of
the name.  The extra "c" at the front is so that Courier doesn't get
called "Adobe-Courier-", with nothing after the final "-".  On second
thoughts, a better approach would be to drop that and use "r" (= roman
or regular) for the regular face:

	Adobe-Courier-r
	Adobe-Courier-b
	Adobe-Courier-o
	Adobe-Courier-bo

also "u" might be used to prefix a Univers-esque numeric style description
(so <component>s don't start with digits):
 
	Linotype-Futura-u35
	Linotype-Futura-u55
	Linotype-Futura-u56
	Linotype-Futura-u75
	Linotype-Futura-u76
	...

For 2-part names, the "cut" is really what the font would be called
anyway under current TeX conventions, and so combines the <family> and
<cut> in 3-part names.  1-part names are like current TeX names.

> I don't see any advantage in limiting font names to three components.

My scheme isn't supposed to classify every aspect of fonts in their
names (not useful IMO) but to help divide the huge namespace of possible
fonts into useful chunks:

    all Adobe fonts -- further subdivided into 
	    Helvetica (further subdivided into r, o, b, bo)
	    Times (further divided into r, i, b, bi, sy)
	    ...
    all Linotype ...
    all Bitstream ...
    ...
    all pdc
    ...
    misc
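
To make the "chunks" concrete, here is a rough Python sketch that sorts a
list of font names into such a tree.  The function name group_fonts is
invented for the example, and filing 1-part names under "misc" (with an
empty family for 1- and 2-part names) is an assumption, not something the
scheme prescribes.

```python
from typing import Dict, Iterable, List


def group_fonts(names: Iterable[str]) -> Dict[str, Dict[str, List[str]]]:
    """Build a foundry -> family -> [cuts] tree from font names."""
    tree: Dict[str, Dict[str, List[str]]] = {}
    for name in names:
        parts = name.split("-")
        if len(parts) == 3:
            foundry, family, cut = parts
        elif len(parts) == 2:
            foundry, family, cut = parts[0], "", parts[1]
        elif len(parts) == 1:
            # Assumption: names with no foundry are filed under "misc".
            foundry, family, cut = "misc", "", parts[0]
        else:
            raise ValueError(f"too many components: {name!r}")
        tree.setdefault(foundry, {}).setdefault(family, []).append(cut)
    return tree


fonts = ["Adobe-Helvetica-r", "Adobe-Helvetica-bo", "Adobe-Times-bi",
         "pdc-ditc18", "cmr12"]
for foundry, families in group_fonts(fonts).items():
    print(foundry, families)
```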


lee@sq.sq.com (Liam R. E. Quin) (04/25/91)

Damian.Cugley@prg.ox.ac.uk (Damian Cugley) writes:
>From:		Liam R. E. Quin <lee@sq.sq.com>
>
>> Note that it is useful to separate weight (bold, medium) from face (roman,
>> oblique, slanted, italic), so that using "cbo" for Courier-Bold-Oblique is
>> probably bad.
>
>I am combining the "face" info into one word for several reasons:
>firstly, TeX has no useful way to treat weight, width, slant etc.
>separately; secondly, it's shorter; thirdly, it allows the old fonts --
>cmr12 etc. -- to remain unchanged.

Those are all good reasons.

>I prefer
>	Adobe-NewCenturySchoolbook-r 
>[...]
>to the XLFD name
> -Adobe-New Century Schoolbook-Medium-I-Normal--24-240-75-75-P-136-ISO8859-1
>
>if only because it is shorter -- and IMO more comprehensible.

Well, so do I from a user's point of view.  The X11 name does allow font
substitution, at least to a certain extent.
I _would_ prefer to see use of existing font names rather than see the
invention of a new set of conventions, though, so keeping Bold-Oblique
still sounds better to me.  There's no reason why there can't be a short
cut, so the macros put the long form in the dvi file even if the user typed
Compugraphic-Triumvirate-bo in the input.  Most users would probably be happy
with ``Helvetica'' and would take Triumvirate or Adobe or Linotype or whatever
they got, I imagine!
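
A rough Python sketch of that short cut, with the caveat that the table of
long forms below is purely hypothetical (no such table has been agreed)
and that expand_cut is an invented name:

```python
# Hypothetical table of short suffixes and the long forms they stand for.
LONG_FORMS = {
    "r": "Regular",
    "b": "Bold",
    "o": "Oblique",
    "bo": "Bold-Oblique",
    "i": "Italic",
    "bi": "Bold-Italic",
}


def expand_cut(short_name: str) -> str:
    """Turn <foundry>-<family>-<short cut> into the long form of the name."""
    foundry, family, cut = short_name.split("-")
    if cut not in LONG_FORMS:
        raise ValueError(f"unknown suffix {cut!r} in {short_name!r}")
    return f"{foundry}-{family}-{LONG_FORMS[cut]}"


print(expand_cut("Compugraphic-Triumvirate-bo"))
# Compugraphic-Triumvirate-Bold-Oblique
```

The user types the short name and the long one is what would be recorded.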


I have had mail asking that font components be limited to 6 or fewer
characters, which also seems to me to be an unnecessary abomination, since
the fonts do not have to be stored under the long names on the computer.

Lee



acmfiu@serss0.fiu.edu (ACMFIU) (04/26/91)

In article <4427@undis.cs.chalmers.se> jeffrey@cs.chalmers.se (Alan Jeffrey) writes:
>and so the dvi file would no longer be portable.  (And yes, people do
>port dvi files around rather than TeX files---the day TeX guarantees
>the same output from the same input is the day people will stop
>sending dvi files around.)
>
>Alan Jeffrey         Tel: +46 31 72 10 98         jeffrey@cs.chalmers.se
>Department of Computer Sciences, Chalmers University, Gothenburg, Sweden
.
. the day your computer system computes multiplication with the same
. degree of accuracy as mine will be the day TeX is 100% portable.
.
. albert