[comp.lang.postscript] Why does a font need an Encoding vector?

pdsmith@bbn.com (Peter D. Smith) (11/01/90)

In article <1990Oct31.225607.10364@phri.nyu.edu> roy@alanine.phri.nyu.edu (Roy Smith) writes:
>
>	The Red Book says that the Encoding vector is a required item for
>all fonts.  Why?  Is there some bit of internal PS font machinery that uses
>it?  As far as I can tell, it's only used by the font-supplied BuildChar
>routine.
>--
>Roy Smith, Public Health Research Institute
>455 First Avenue, New York, NY 10016
>roy@alanine.phri.nyu.edu -OR- {att,cmcl2,rutgers,hombre}!phri!roy

Well, BBN customers use our software to generate reports, and they like
to spell their company's name correctly therein.  The standard encoding
vector gives you no way to reach, for example, an E-umlaut.
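
The usual fix is to reencode the font: copy the font dictionary, give it
a writable Encoding array that maps the code you want to the right glyph
name (/Edieresis, in Adobe's naming), and definefont the result under a
new name.  Roughly like this -- a sketch only, with the slot 203 and the
name /BBN-Times chosen arbitrarily:

	/Times-Roman findfont
	dup length dict begin
		{ 1 index /FID ne { def } { pop pop } ifelse } forall
		/Encoding Encoding dup length array copy def
		Encoding 203 /Edieresis put
		currentdict
	end
	/BBN-Times exch definefont pop

	/BBN-Times findfont 12 scalefont setfont
	72 720 moveto (\313) show	% code 203 (octal \313) is now E-dieresis
	showpage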

Even worse, there are several standards for where to place the
non-American characters in an 8-bit alphabet.  The HP LaserJet III
lists about *100* character sets built into the printer.

This isn't just theory, BTW.  One of the new features in our new product
is a choice among several Encoding vectors.


					Peter D. Smith

roy@alanine.phri.nyu.edu (Roy Smith) (11/01/90)

	The Red Book says that the Encoding vector is a required item for
all fonts.  Why?  Is there some bit of internal PS font machinery that uses
it?  As far as I can tell, it's only used by the font-supplied BuildChar
routine.  What if I wanted to write a font in which there really wasn't any
reason to use the Encoding vector at all?  Maybe what I wanted to do was
have a font in which each character was simply a circle of thickness 30 and
radius c, centered in a 1000 unit square space, where c is the ascii code
for the character?  No reason for an Encoding vector, just:

	/BuildChar {		% called with: fontdict charcode
		1000 0 setcharwidth
		exch begin	% fontdict goes on the dict stack; charcode stays
		newpath 30 setlinewidth
		500 exch 500 exch 0 360 arc stroke	% radius = charcode
		end
	} def

	Why wouldn't that work, other than the fact that definefont would
complain that there was an Encoding key-value pair missing?
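
	For concreteness, the complete font would be something like this
(a sketch; the name /CircleFont is arbitrary, and the Encoding entry is
there only because definefont insists on it):

	/CircleFont 8 dict dup begin
		/FontType 3 def
		/FontMatrix [0.001 0 0 0.001 0 0] def
		/FontBBox [0 0 1000 1000] def
		/Encoding StandardEncoding def	% the entry in question
		/BuildChar {
			1000 0 setcharwidth
			exch begin
			newpath 30 setlinewidth
			500 exch 500 exch 0 360 arc stroke
			end
		} def
	end definefont pop

	/CircleFont findfont 100 scalefont setfont
	72 400 moveto (ABC) show showpage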
--
Roy Smith, Public Health Research Institute
455 First Avenue, New York, NY 10016
roy@alanine.phri.nyu.edu -OR- {att,cmcl2,rutgers,hombre}!phri!roy
"Arcane?  Did you say arcane?  It wouldn't be Unix if it wasn't arcane!"

glenn@heaven.woodside.ca.us (Glenn Reid) (11/02/90)

In article <1990Oct31.225607.10364@phri.nyu.edu> roy@alanine.phri.nyu.edu (Roy Smith) writes:
>
>	The Red Book says that the Encoding vector is a required item for
>all fonts.  Why?  Is there some bit of internal PS font machinery that uses
>it?

Yes, definefont looks at it during error-checking.

It's true that there is an occasional situation where you don't really
need the Encoding vector (such as your case, where each character is
generated directly from its character code).  However, most fonts contain
distinct, named characters, and benefit from the possibility of being
reencoded.

The overhead for including an Encoding vector is extremely low; in
essentially all PS implementations existing today it is 20 bytes (the
cost of the dictionary entry for "Encoding").  The StandardEncoding
vector is already in memory, and you just get another pointer to it when
you say "/Encoding StandardEncoding def".  Since the overhead is so low,
it is not unreasonable to make it a required item in a font, since 10
times out of 9 it is a good thing.

Now whether or not you actually vector through the Encoding in your
BuildChar is up to you, of course.

>it?  As far as I can tell, it's only used by the font-supplied BuildChar
>routine.

Well, there is the occasional program that looks at the Encoding vector
to see whether or not the font has a standard character set.  For
example, suppose you're writing a program to print out a sample of all
the fonts on your printer.  It is a nice enhancement to check the
Encoding vector: if it is not StandardEncoding, you print the name of the
font in, say, Helvetica, so you can read it (think of Sonata or Symbol),
and then print a short sample of the font itself.
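
Roughly along these lines -- just a sketch, with arbitrary page layout
and no handling of running off the bottom of the page:

	/labelfont /Helvetica findfont 10 scalefont def
	/buf 128 string def
	/y 720 def
	FontDirectory
	{				% forall pushes: fontname fontdict
		72 y moveto
		dup /Encoding get StandardEncoding eq
		{ 10 scalefont setfont }	% standard: show the name in the font itself
		{ pop labelfont setfont }	% odd encoding: label it in Helvetica
		ifelse
		buf cvs show			% the font's name
		/y y 14 sub def
	} forall
	showpage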

/Glenn

-- 
 Glenn Reid				RightBrain Software
 glenn@heaven.woodside.ca.us		PostScript/NeXT developers
 ..{adobe,next}!heaven!glenn		415-851-1785

amanda@visix.com (Amanda Walker) (11/02/90)

In article <1990Oct31.225607.10364@phri.nyu.edu>
roy@alanine.phri.nyu.edu (Roy Smith) writes:
>
>	The Red Book says that the Encoding vector is a required item for
>all fonts.  Why?  Is there some bit of internal PS font machinery that uses
>it?

Yup: the font cache.  Character bitmaps are stored by name, not by character
code.  I once had a LaserWriter crash with a fatal system error by sending
it a font that didn't have an encoding vector...

-- 
Amanda Walker						      amanda@visix.com
Visix Software Inc.					...!uunet!visix!amanda
--
Marching to a different kettle of fish.

roy@phri.nyu.edu (Roy Smith) (11/02/90)

amanda@visix.com (Amanda Walker) writes:
> Yup: the font cache.  Character bitmaps are stored by name, not by character
> code.  I once had a LaserWriter crash with a fatal system error by sending
> it a font that didn't have an encoding vector...

	But what if I'm using a font that doesn't get cached?  Notice in my
example I used setcharwidth not setcachedevice in BuildChar.  The reason I
did that is because in the real font that I'm building, I need greyscale,
which according to the Red Book can't be cached.

	I'm surprised you managed to crash the LaserWriter that way.
According to the Red Book, definefont checks to make sure that Encoding is
there, and returns an error if it's not.  How did you manage to get an
Encoding-less font past definefont?  Is it a matter of the documentation
being more bullet proof than the actual interpreter?  How bad a crash was
it?  I'm inclined to try it on my LW to see what happens, but I'd like to
know that at least power-cycling resets it before I go ahead.
--
Roy Smith, Public Health Research Institute
455 First Avenue, New York, NY 10016
roy@alanine.phri.nyu.edu -OR- {att,cmcl2,rutgers,hombre}!phri!roy
"Arcane?  Did you say arcane?  It wouldn't be Unix if it wasn't arcane!"

kevina@apple.com (This space for rent) (11/03/90)

I tried to post this yesterday but our mailer failed... it looks like 
Amanda came to the same conclusion.  Look at the end for a follow-up.

In article <1990Oct31.225607.10364@phri.nyu.edu> roy@alanine.phri.nyu.edu (Roy Smith) writes:
>         The Red Book says that the Encoding vector is a required item for
> all fonts.  Why?  Is there some bit of internal PS font machinery that uses
> it?  As far as I can tell, it's only used by the font-supplied BuildChar
> routine.  What if I wanted to write a font in which there really wasn't any
> reason to use the Encoding vector at all?  Maybe what I wanted to do was
> have a font in which each character was simply a circle of thickness 30 and
> radius c, centered in a 1000 unit square space, where c is the ascii code
> for the character?  No reason for an Encoding vector, just:
  ...
 
(I'm assuming that you're talking about Type 3 fonts only... the internal
BuildChar routine for Adobe built-in fonts definitely depends on
Encoding!)  The font that you described does not need an Encoding vector
for its BuildChar.  But how is the PostScript font machinery going to find
the bitmap you just built the next time that character is requested?  It
can't key the cache on the character code, because a font can be
reencoded at any time -- it would be trivial, say, to swap the 'a' and 'b'
positions, and a bitmap cached under a bare code would then be wrong.
Therefore, even in a Type 3 font, the cache uses the associated glyph
name, which it gets from the font's Encoding vector.  Level 2 PostScript
is more forthcoming about all this (including the new BuildGlyph and
glyphshow operators, which explicitly access a glyph by name).
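
For example, something like this (Level 2 only, and assuming the font
actually contains the glyph):

	/Times-Roman findfont 12 scalefont setfont
	72 720 moveto
	/Edieresis glyphshow	% glyph selected by name; Encoding never consulted
	showpage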

There are a lot of subtleties having to do with caching of user-defined
characters (particularly if the font has a UniqueID entry).  To be safe,
I always use StandardEncoding as my Encoding vector when the BuildChar
doesn't need one at all.  Helpful hint:  *don't* just use "256 array"
(i.e. an Encoding full of null objects)... the characters may not get
cached at all!

In article <1990Nov2.012303.9487@phri.nyu.edu> roy@phri.nyu.edu (Roy
Smith) writes:
>         But what if I'm using a font that doesn't get cached?  Notice in my
> example I used setcharwidth not setcachedevice in BuildChar.  The reason I
> did that is because in the real font that I'm building, I need greyscale,
> which according to the Red Book can't be cached.

When the font machinery decides, "this character isn't cached already; I'm
going to have to call BuildChar", it doesn't yet know whether you are going
to invoke setcachedevice or setcharwidth within the BuildChar procedure, so
I'm sure that, for efficiency, it assumes the character will be cached (and
therefore that it needs a name from the Encoding vector).  In the case you
presented, there is no *necessary* reason for an Encoding vector.
(Although Glenn's right... the overhead of using StandardEncoding is
negligible.)

Disclaimer:  No official Adobe or Apple information here...

--Kevin Andresen [kevina@apple.com]
"This is my investigation... it's not a public inquiry"

amanda@visix.com (Amanda Walker) (11/03/90)

In article <1990Nov2.012303.9487@phri.nyu.edu> roy@phri.nyu.edu (Roy Smith)
writes:
>	I'm surprised you managed to crash the LaserWriter that way.

So was I :).  It was a LaserWriter, PostScript v38.0.  Your mileage may
vary.

>How bad a crash was it?

The effect was almost indistinguishable from a power cycle, except for the
error message it managed to send out just before going bouncy bouncy :).
Aside from that, it was quite harmless.

I posted the code at the time (it's real short), but I'll try and scrape it
up again, just for amusement's sake...
-- 
Amanda Walker						      amanda@visix.com
Visix Software Inc.					...!uunet!visix!amanda
--
"This is a basic principle of the universe known as `The Law of the
 Cussedness of Nature.'"	--W. Kauzmann

cet1@cl.cam.ac.uk (C.E. Thompson) (11/04/90)

In article <11090@goofy.Apple.COM> kevina@apple.com (This space for rent) writes:
>There are a lot of subtleties having to do with caching of user-defined 
>characters (particularly if the font has a UniqueID entry.)  To be safe, I 
>always use StandardEncoding as my Encoding vector when the BuildChar 
>doesn't need one at all.  Helpful hint:  *Don't* use "256 array"... the
>characters may not get cached at all!
>
But StandardEncoding has lots of /.notdef's at positions 0..31, 127..160,
etc., doesn't it?  So it isn't suitable for a Type 3 font whose BuildChar
routine is independent of the Encoding vector but which does use those
character positions: many different glyphs would all be cached under the
single name /.notdef, which confuses the font caching machinery horribly.
In my DVI to PostScript converter (fairly standard in this respect, I
think) I use an array of 256 different names.  Of course, I share it
among all the fonts I define.
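
Building such a vector takes only a few lines.  A sketch (the /chrNNN
names are arbitrary; any 256 distinct names will do):

	/chrbuf 7 string def
	chrbuf 0 (chr) putinterval	% common prefix for every name
	/SharedEncoding [
		0 1 255 {
			chrbuf 3 4 getinterval cvs	% decimal code after the prefix
			length 3 add			% characters actually used
			chrbuf exch 0 exch getinterval cvn	% e.g. /chr65
		} for
	] def

Each Type 3 font then just says "/Encoding SharedEncoding def", and every
character position gets its own distinct (if meaningless) name for the
font cache to key on.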

Chris Thompson
JANET:    cet1@uk.ac.cam.phx
Internet: cet1%phx.cam.ac.uk@nsfnet-relay.ac.uk