[comp.text.tex] ISO Latin 1 Virtual Font interface to Computer Modern fonts

fj@iesd.auc.dk (Frank Jensen) (09/25/90)

I am looking for virtual font files for Knuth's Computer Modern fonts.

They should have an ISO 8859/1 (Latin 1) layout, and they should
contain reasonable kerning data for character pairs where one or both
characters are accented (but since TeX doesn't kern between such
characters anyway, I can live without for the moment).

[The advantage of having such virtual fonts is that it's easier to
type non-english text and that you can have correct hyphenation for
non-english languages (without resorting to incantations like
`german.sty' does).]

It would be even better to have a configurable program that generates
such virtual fonts based on the TFM (or PL) file, font layout tables,
and information about kerning and extra lowering/raising of accents
(for fine tuning).

If nobody else has already done this, I might undertake this project
myself (but I don't want to repeat other peoples work).


---
Frank Jensen,	fj@iesd.auc.dk
Department of Mathematics and Computer Science
Aalborg University
DENMARK

tml@sottuli.tik.vtt.fi (Tor Lillqvist) (09/26/90)

In article <FJ.90Sep25153811@indigo.iesd.auc.dk> fj@iesd.auc.dk (Frank Jensen) writes:
   I am looking for virtual font files for Knuth's Computer Modern fonts.

   They should have an ISO 8859/1 (Latin 1) layout, and they should
   contain reasonable kerning data.

   It would be even better to have a configurable program that generates
   such virtual fonts based on the TFM (or PL) file, font layout tables,
   and information about kerning and extra lowering/raising of accents
   (for fine tuning).

Well, here is what I hacked together some time ago when I wanted to
test the new features of TeX 3.0: A program for GNU Emacs that given a
PL file (or a directory of them) generates VP files for corresponding
extended CM fonts.  Why GNU Emacs?  Well, PL files look a bit like
Lisp lists, so I though it would be easy to parse them in Lisp.  And
GNU Emacs was the only Lisp I had readily available.  However, elisp
doesn't have floating point numbers, so the code has to treat them as
fixed-point numbers (integer part and millionths).

There are some people working on a standard extended CM virtual font
encoding scheme, this is not related to that effort.  Use this
temporarily only, until something "official" gets released.

Use this like this:

cd somewhere
run tftopl on some fonts, producing cmr10.pl etc.
start emacs, load extend-cm.el
do M-x extend-directory-of-cm-fonts
now you have .vp files, run vptotf on them

xxxxxxxxxxxxxxxx cut here xxxxxxxxxxxxxxxx
;; extend-cm.el -- extend a Computer Modern font with accented letters
;; and some others letters (e.g. Icelandic letters)
;; tml@tik.vtt.fi

;; Currently only the accented letters in Latin-1 (ISO8859-1)
;; are generated, with their Latin-1 codes.
;; This should be extended with other necessary accented letters.
;; As there isn't room for all accented letters in use by languages
;; written in a Latin alphabet, only those combinations actually 
;; in common use in some language should be picked.

(defvar xcm-do-extras nil
  "If non-nil, add extra characters to the generated virtual fonts.")

(defvar xcm-interesting-pl-properties
  '(CHARACTER CHECKSUM DESIGNSIZE FONTDIMEN KRN LABEL LIGTABLE SKIP))

(defvar xcm-accented-versions
  '((?A ?\300 ?\301 ?\302 ?\303 ?\304 ?\305)
    (?C ?\307)
    (?E ?\310 ?\311 ?\312 ?\313)
    (?I ?\314 ?\315 ?\316 ?\317)
;;    (?D ?\320)
    (?N ?\321)
    (?O ?\322 ?\323 ?\324 ?\325 ?\326)
    (?U ?\331 ?\332 ?\333 ?\334)
    (?Y ?\335)
    (?a ?\340 ?\341 ?\342 ?\343 ?\344 ?\345)
    (?c ?\347)
    (?e ?\350 ?\351 ?\352 ?\353)
    (?i ?\354 ?\355 ?\356 ?\357)
    (?n ?\361)
    (?o ?\362 ?\363 ?\364 ?\365 ?\366)
    (?u ?\371 ?\372 ?\373 ?\374)
    (?y ?\375 ?\377))
  "A list with sublists containing an ASCII character and the corresponding
accented ISO Latin-1 characters.")

(defvar cm-text-fonts "^cm\\(r\\|bx\\|tt\\|sltt\\|vtt\\|tex\\|ss\\|ssi\\|ssdc\\|ssbx\\|ssqi\\|dunh\\|bxsl\\|b\\|ti\\|bxti\\|csc\\|tcsc\\)\\([0-9]+\\)\\(\\.pl\\)$")

(defun extend-directory-of-cm-fonts (directory)
  "For all Computer Modern PL files in a direcory create the
corresponding Extended Computer Modern VP file."
  (interactive "DExtend fonts in directory: ")
  (mapcar 'extend-pl-file (directory-files directory nil cm-text-fonts)))

(defun extend-pl-file (filename)
  "Create a Extended Computer Modern VP (Virtual font Property list) file
from the PL (font Property List) file FILENAME."
  (interactive "fExtend PL file:")
  (set-buffer (get-buffer-create "*xcm-tmp-1*"))
  (erase-buffer)
  (insert-file-contents filename t)
  (message "Working on %s" (file-name-nondirectory filename))
  (set-buffer (extend-cm-font-in-buffer))
  (write-file (buffer-file-name (current-buffer))))

(defun extend-cm-font-in-buffer ()
  "Convert a Computer Modern PL file to the corresponding Latin-1
Extended Computer Modern VP file."
  (interactive)
  (let (font-name font-basename font-size tempbuffer vp-file new-vp)
    (setq font-name (file-name-nondirectory (buffer-file-name (current-buffer))))
    (if (not (string-match cm-text-fonts font-name))
	(error "Cannot handle this font.")
      (setq font-name (substring font-name 0 (match-beginning 3)))
      (setq font-basename (substring font-name 0 (match-beginning 2)))
      (setq font-size (substring font-name (match-beginning 2) (match-beginning 3))))
    (setq tempbuffer (get-buffer-create "*xcm-tmp-2*"))
    (copy-to-buffer tempbuffer (point-min) (point-max))
    (set-buffer tempbuffer)
    (goto-char (point-min))
    ;; First convert C values to decimal
    (xcm-clean-char-values)
    (goto-char (point-min))
    ;; Convert R values to dotted pairs
    (dotted-floatify-buffer)
    (goto-char (point-min))
    ;; (read) will read from the temp buffer
    (setq standard-input (current-buffer))
    ;; We produce a VP file, with 'x' prefixed to the original font name
    (set-buffer (setq new-vp (get-buffer-create "*new-vp-file*")))
    (set-visited-file-name (concat "x" font-name ".vp"))
    (erase-buffer)
    (setq standard-output (current-buffer))
    (setq xcm-char-metrics (make-vector 256 nil))
    (if xcm-do-extras
	(princ (format
		"(MAPFONT D 0 (FONTNAME %s)) (MAPFONT D 1 (FONTNAME extra-%s))\n"
		font-name font-name))
      (princ (format
		"(MAPFONT D 0 (FONTNAME %s))\n" font-name)))
    ;; Parse the PL file, inserting ligtable labels and kerns
    ;; for the accented letters on the fly
    (condition-case nil
	(while t
	  (xcm-convert-expr (read)))
      (end-of-file nil))
    ;; Output the character descriptions for the accented characters
    (xcm-make-accented-chars)
    ;; Output the character descriptions for extra characters
    (if xcm-do-extras
	(xcm-make-extras font-name))
;;  (kill-buffer temp-buffer)
    new-vp
    ))

(defun dotted-floatify-buffer ()
  "Replaces floating-point values in the current buffer
with pairs of the form (integer-part . fraction-part).
The fraction part is multiplied by 1000000."
  (interactive)
  (while (re-search-forward "\\(-?\\)\\([0-9]+\\)\\.\\([0-9]+\\)" nil t)
    (replace-match
     (make-float-string
      (string-to-int (buffer-substring (match-beginning 2)
				       (match-end 2)))
      (buffer-substring (match-beginning 3)
			(match-end 3))
      (< (match-beginning 1) (match-end 1))))))

(defun xcm-clean-char-values ()
  "Replace C (character) values of non-alphanumeric characters with
the corresponding D (decimal) value."
  (while (re-search-forward " [Cc] \\([^A-Za-z]\\) " nil t)
    (replace-match
     (format " d %d " (aref (buffer-substring
			     (match-beginning 1)
			     (1+ (match-end 1))) 0)))))

(defun xcm-print (e)
  (cond ((atom e) (prin1 e))
	((and (integerp (car e)) (integerp (cdr e)))
	 (xcm-print-dottedfloat (car e) (cdr e)))
	((eq (car e) 'COMMENT) nil)
	(t (terpri)
	   (princ "(")
	   (xcm-print (car e))
	   (xcm-print-rest (cdr e)))))

(defun abs (e)
  (if (< e 0) (- 0 e) e))

(defun xcm-print-dottedfloat (int fract)
  (if (or (< int 0) (< fract 0))
      (princ "-"))
  (prin1 (abs int))
  (princ ".")
  (princ (format "%06d" (abs fract))))
  
(defun xcm-print-rest (e)
  (cond ((null e) (princ ")"))
	(t (princ " ")
	   (xcm-print (car e))
	   (xcm-print-rest (cdr e)))))

(defun xcm-convert-expr (e)
  (cond ((listp e)
	 (let ((handler(get (car e) 'prop-handler)))
	   (if handler (apply handler e nil)
	     (xcm-print e))))
	(t (error "Invalid property list"))))

(mapcar (function (lambda (prop)
		   (put prop 'prop-handler
			(intern (concat "xcm-"
					(downcase (symbol-name prop))
					"-handler")))))
	xcm-interesting-pl-properties)

(defun xcm-comment-handler (prop)
  nil)

(defun xcm-checksum-handler (prop)
  nil)

(defun xcm-designsize-handler (prop)
  (setq designsize (car (cdr (cdr prop))))
  (xcm-print prop))

(defun xcm-fontdimen-handler (prop)
  (xcm-print prop)
  (setq xcm-fontdimens (cdr prop))
  (setq xcm-font-xheight
	(car (cdr (cdr (assq 'XHEIGHT xcm-fontdimens)))))
  (setq xcm-font-slant
	(car (cdr (cdr (assq 'SLANT xcm-fontdimens))))))

(defun xcm-ligtable-handler (prop)
  (princ "\n(LIGTABLE\n")
  (mapcar 'xcm-ligstep-handler (cdr prop))
  (princ ")\n"))

(defun xcm-ligstep-handler (step)
  (xcm-print step)
  (let ((handler (get (car step) 'prop-handler)))
    (if handler
	(apply handler step nil))))

(defun xcm-label-handler (step)
  (xcm-label-accented (assq (int-value (cdr step)) xcm-accented-versions)))

(defun xcm-label-accented (list)
    (while (and list (setq list (cdr list)))
      (princ " ")
      (prin1 (list 'LABEL 'D (car list)))))

(defun xcm-krn-handler (step)
  (xcm-krn-accented (car (cdr (cdr (cdr (cdr step)))))
		    (assq (int-value (cdr step)) xcm-accented-versions)))

(defun xcm-krn-accented (kern list)
  (while (and list (setq list (cdr list)))
    (princ " ")
    (princ "(KRN D ")
    (prin1 (car list))
    (princ " R ")
    (xcm-print kern)
    (princ ")")))

(defun xcm-skip-handler (step)
  (error "Cannot handle SKIPs."))

(defun xcm-character-handler (prop)
  (xcm-print prop)
  (aset xcm-char-metrics (int-value (cdr prop)) (cdr (cdr (cdr prop)))))

;; This list contains for each accented Latin-1 letter a sublist
;; with its code, the code of the unaccented letter, and the 
;; code of the accent (in Computer Modern)

(setq xcm-combinations
  '((?\300 ?A ?\022)			; Agrave
    (?\301 ?A ?\023)			; Aacute
    (?\302 ?A ?\136)			; Acircumflex
    (?\303 ?A ?\176)			; Atilde
    (?\304 ?A ?\177)			; Adiaeresis
    (?\305 ?A ?\027)			; Aring
    (?\307 ?C ?\030)			; Ccedilla
    (?\310 ?E ?\022)			; Egrave
    (?\311 ?E ?\023)			; Eacute
    (?\312 ?E ?\136)			; Ecircumflex
    (?\313 ?E ?\177)			; Ediaeresis
    (?\314 ?I ?\022)			; Igrave
    (?\315 ?I ?\023)			; Iacute
    (?\316 ?I ?\136)			; Icircumflex
    (?\317 ?I ?\177)			; Idiaeresis
    (?\321 ?N ?\176)			; Ntilde
    (?\322 ?O ?\022)			; Ograve
    (?\323 ?O ?\023)			; Oacute
    (?\324 ?O ?\136)			; Ocircumflex
    (?\325 ?O ?\176)			; Otilde
    (?\326 ?O ?\177)			; Odiaeresis
    (?\331 ?U ?\022)			; Ugrave
    (?\332 ?U ?\023)			; Uacute
    (?\333 ?U ?\136)			; Ucircumflex
    (?\334 ?U ?\177)			; Udiaeresis
    (?\335 ?Y ?\023)			; Yacute
    (?\340 ?a ?\022)			; agrave
    (?\341 ?a ?\023)			; aacute
    (?\342 ?a ?\136)			; acircumflex
    (?\343 ?a ?\176)			; atilde
    (?\344 ?a ?\177)			; adiaeresis
    (?\345 ?a ?\027)			; aring
    (?\347 ?c ?\030)			; ccedilla
    (?\350 ?e ?\022)			; egrave
    (?\351 ?e ?\023)			; eacute
    (?\352 ?e ?\136)			; ecircumflex
    (?\353 ?e ?\177)			; ediaeresis
    (?\354 ?\020 ?\022)			; igrave
    (?\355 ?\020 ?\023)			; iacute
    (?\356 ?\020 ?\136)			; icircumflex
    (?\357 ?\020 ?\177)			; idiaeresis
    (?\361 ?n ?\176)			; ntilde
    (?\362 ?o ?\022)			; ograve
    (?\363 ?o ?\023)			; oacute
    (?\364 ?o ?\136)			; ocircumflex
    (?\365 ?o ?\176)			; otilde
    (?\366 ?o ?\177)			; odiaeresis
    (?\371 ?u ?\022)			; ugrave
    (?\372 ?u ?\023)			; uacute
    (?\373 ?u ?\136)			; ucircumflex
    (?\374 ?u ?\177)			; udiaeresis
    (?\375 ?y ?\023)			; yacute
    (?\377 ?y ?\177)			; ydiaeresis
    ))

(defun xcm-make-accented-chars ()
  (mapcar 'xcm-build-combination xcm-combinations))

(defun xcm-build-combination (recipe)
  "Outputs the recipe for an accented letter."
  (let* ((basechar (car (cdr recipe)))
	 (accent (car (cdr (cdr recipe))))
	 (basechar-metrics (aref xcm-char-metrics basechar))
	 (accent-metrics (aref xcm-char-metrics accent))
	 (a (car (cdr (cdr (assq 'CHARWD accent-metrics)))))
	 (w (car (cdr (cdr (assq 'CHARWD basechar-metrics)))))
	 (h (or (car (cdr (cdr (assq 'CHARHT basechar-metrics)))) '(0 . 0)))
	 (ah (or (car (cdr (cdr (assq 'CHARHT accent-metrics)))) '(0 . 0)))
	 (cd (or (car (cdr (cdr (assq 'CHARDP basechar-metrics)))) '(0 . 0)))
	 (ad (or (car (cdr (cdr (assq 'CHARDP accent-metrics)))) '(0 . 0)))
	 (ci (or (car (cdr (cdr (assq 'CHARIC basechar-metrics)))) '(0 . 0)))
	 (downkern (float- xcm-font-xheight h))
	 (delta (float+ (float-div (float- w a) 2)
			(float* (float- downkern) xcm-font-slant))))
    (xcm-print
     (append
      (list 'CHARACTER 'D (car recipe))
      (list (list 'CHARWD 'R w))
      (list (list 'CHARHT 'R (floatmax (float- ah downkern) h)))
      (if (not (equal (floatmax cd ad) '(0 . 0)))
	  (list (list 'CHARDP 'R (floatmax cd ad))))
      (if (not (equal ci '(0 . 0))) (list (list 'CHARIC 'R ci)))
      (list
       (append (list 'MAP '(PUSH))
	       (if (not (equal downkern '(0 . 0)))
		   (if (equal ah '(0 . 0)) ; Cedilla?
		       nil
		     (list (list 'MOVEDOWN 'R
				 downkern))))
	       (if (not (equal delta '(0 . 0)))
		   (list (list 'MOVERIGHT 'R delta)))
	       (list (list 'SETCHAR 'D accent))
	       ;; No need to output a kern after the accent, as a POP follows
	       ;; (list (list 'MOVERIGHT 'R (float- (float+ a delta))))
	       (list '(POP))
	       (list (list 'SETCHAR 'D basechar))))))
    (terpri)))

(defun xcm-make-extras (font-name)
  "Outputs the recipes for those characters in Extended Computer Modern
not built up from a base character and an accent, i.e. the Icelandic
letters eth and thorn.  The font metrics for these letters should be
in PL files called extra-cm*10.pl, for instance extra-cmr10.pl. "
  (let ((extra-pl (format "extra-%s.pl" font-name)))
    (if (not (file-readable-p extra-pl))
	(message "PL file for extra characters (%s) not found!" extra-pl))
    ;; Read in the extra characters' PL file
    (set-buffer (get-buffer-create "*xcm-temp-3*"))
    (erase-buffer)
    (insert-file extra-pl)
    (goto-char (point-min))
    (dotted-floatify-buffer)
    (goto-char (point-min))
    (setq standard-input (current-buffer))
    ;; Handle the CHARACTER properties
    (condition-case nil
	(while t
	  (xcm-handle-extra-pl-expr (read)))
      (end-of-file nil))
    (kill-buffer (current-buffer))))

(defun xcm-handle-extra-pl-expr (e)
  (cond
   ((eq (car e) 'CHARACTER)
    (xcm-print
     (append e (list (list 'MAP (list 'SELECTFONT 'D 1)
			   (list 'SETCHAR 'O (car (cdr (cdr e))))))))
    (terpri))))

(defun make-float-string (int fract neg)
  "Returns a string containing the special dotted-pair representation of
a floating-point number.  INT is the integer part (a number)
and FRACT is the fractional part (as a string!)."
  (let ((x (make-float int fract (if neg -1 1))))
    (format "(%d . %d)" (car x) (cdr x))))

(defun make-float (int fract neg)
  "Converts a floating point number to a dotted-pair fixed-point 
representation.  INT is the integer part, FRACT is the fractional part
as a string, and NEG is the sign, 1 or -1.
The result is a dotted pair the car of which is the integer
part and the cdr is the fractional part multiplied by 1000000."
  (interactive)
  (cons int
	(* neg (let ((l (length fract)) (f (string-to-int fract)))
		 (cond ((= l 7) (/ (+ f 5) 10))
		       ((= l 6) f)
		       ((= l 5) (* f 10))
		       ((= l 4) (* f 100))
		       ((= l 3) (* f 1000))
		       ((= l 2) (* f 10000))
		       ((= l 1) (* f 100000))
		       (t (error "Too long fractional part: %s" fract)))))))

(defun float+ (a b)
  "Add two dottedfloats."
  (let (i f)
    (setq i (+ (car a) (car b)))
    (setq f (+ (cdr a) (cdr b)))
    (cond ((>= f 1000000)
	   (setq i (1+ i))
	   (setq f (- f 1000000)))
	  ((<= f -1000000)
	   (setq i (1- i))
	   (setq f (+ f 1000000))))
    (cond ((and (< i 0) (> f 0))
	   (setq i (1+ i))
	   (setq f (- f 1000000)))
	  ((and (> i 0) (< f 0))
	   (setq i (1- i))
	   (setq f (+ 1000000 f))))
    (cons i f)))

(defun float- (a &rest b)
  "Negate a dottedfloat or subtract two dottedfloats."
  (if b
      (float+ a (float- (car b)))
    (cons (- (car a)) (- (cdr a)))))

(defun float-div (a i)
  "Divide a dottedfloat by an integer."
  (cons (/ (car a) i) (/ (cdr a) i)))

(defun float* (a b)
  "Multiply two dottedfloats."
  (float-normalize
   (cons
    (* (car a) (car b))
    (+ (* (car a) (cdr b))
       (* (cdr a) (car b))
       (* (/ (cdr a) 1000) (/ (cdr b) 1000))))))

(defun float-normalize (a)
  (cond ((>= (cdr a) 1000000)
	 (float-normalize (cons (1+ (car a)) (- (cdr a) 1000000))))
	((<= (cdr a) -1000000)
	 (float-normalize (cons (+ -1 (car a)) (+ (cdr a) 1000000))))
	(a)))

(defun floatmax (a b)
  (if (or (> (car a) (car b)) (and (= (car a) (car b)) (> (cdr a) (cdr b))))
      a
    b))

(defun octal-to-int (value)
  (if (zerop value)
      0
    (+ (* 8 (octal-to-int (/ value 10))) (% value 10))))

(defun int-value (list)
  (let ((type (car list)) (value (car (cdr list))))
    (cond ((eq type 'C)
	   (cond
	    ((symbolp value) (aref (symbol-name value) 0))
	    ((integerp value) (+ value ?0))
	    (t (error "Invalid C type value."))))
	  ((eq type 'D) value)
	  ((eq type 'O) (octal-to-int value))
	  (t (error "Unknown value type.")))))
xxxxxxxxxxxxxxxx cut here xxxxxxxxxxxxxxxx
--
Tor Lillqvist,
working, but not speaking, for the Technical Research Centre of Finland

les@IDA.LiU.SE (Lennart Staflin) (09/26/90)

>>>>> On 26 Sep 90 05:03:37 GMT, tml@sottuli.tik.vtt.fi (Tor Lillqvist) said:

Tor> Well, here is what I hacked together some time ago when I wanted to
Tor> test the new features of TeX 3.0: A program for GNU Emacs that given a
Tor> PL file (or a directory of them) generates VP files for corresponding
Tor> extended CM fonts.  


I have used an erlier version of Tor Lillqvist nice program.
Unfortunally (at least) with my version of VPtoVF and dvips it is
neccesary to include the FONTDSIZE property in the MAPFONT properties
as VPtoVF defaults FONTDSIZE to 10pt.  This had the strange effect of
the LATIN1 fonts working in for the 10pt text but not for the larger
headings.  Using a correct FONTDSIZE (usually the same as the
DESIGNSIZE of the mapped font) fix this.

--
---------------------------------------------------------------------
Lennart Staflin   :   There's more to life than books, you know 
les@ida.liu.se	  :   but not much more           --- The Smiths