npn@cbnewsl.att.com (nils-peter.nelson) (12/29/90)
The original DWB's (1.0 and 2.0) supported a variety of printers: default was C/A/T, others were daisy, Imagen, Xerox, etc. In 3.1 we feature PostScript but also provide Imagen and HP LaserJet support. Our plan would be to provide support for the European standard (ISO 8859-1) character set only with the PostScript postprocessor, dpost. The reason is that the HP LaserJet requires host-resident fonts, in bitmap form, for every point size. The size of the font support for the LaserJet is already several megabytes, and the additional characters would add considerably. I'm not sure where we'd get the bitmaps from, either. PostScript already provides most if not all of the additional characters-- they are already in the printer, we only have to generate the name or position of the character in dpost. So, my question is, have the Europeans settled on PostScript as a standard for printers, or is there something else we should be supporting? (Special note to HP LaserJet owners: as you may already know, for US $700 you can add a PostScript cartridge to your LaserJet II. In addition to the added flexibility of PostScript you will probably recover your investment with the disk space you save when you rm the LaserJet bitmaps!)
clewis@ecicrl.UUCP (Chris Lewis) (12/30/90)
In article <1990Dec28.195703.2749@cbnewsl.att.com> npn@cbnewsl.att.com (nils-peter.nelson) writes:
[Re: DWB 3.2 support for Latin-1]
>Our plan would be to provide support for the European standard
>(ISO 8859-1) character set only with the PostScript postprocessor,
>dpost. The reason is that the HP LaserJet requires host-resident
>fonts, in bitmap form, for every point size. The size of the
>font support for the LaserJet is already several megabytes,
>and the additional characters would add considerably. I'm not
>sure where we'd get the bitmaps from, either.
>PostScript already provides most if not all of the additional
>characters-- they are already in the printer, we only have to
>generate the name or position of the character in dpost.
>So, my question is, have the Europeans settled on PostScript as
>a standard for printers, or is there something else we should
>be supporting?
>(Special note to HP LaserJet owners: as you may already know,
>for US $700 you can add a PostScript cartridge to your LaserJet II.
>In addition to the added flexibility of PostScript you will
>probably recover your investment with the disk space you save
>when you rm the LaserJet bitmaps!)

I get my revenge... Oh so sweet. Psroff has solved most of these difficulties, and I would have made the solutions available as source for AT&T to use, but somehow "Chris Lewis doesn't feel that way". Nyah, nyah ;-)

[Ronald Khoo was right, you should be careful about egregious misquoting of guys like me. What goes around comes around.]

But, I'm a nice guy, so I'll tell you how to solve your problems anyway:

Font compression of HP SFP's (native HP PCL fonts):

- compress (the PD one, which I believe is now almost POSIX required, aside from the copyright issue which is still with us.)
- TeX PK format (compress won't compress them)

These are the sizes of a Helvetica font at 10 point in three different formats (H.10.sfp doesn't have the full Latin-1 set, but it should be reasonably close):

-rw-r--r-- 1 clewis users  3988 Jul 28 23:42 H.10.pk
-rw-r--r-- 1 clewis users 10241 Dec 29 12:57 H.10.sfp
-rw-r--r-- 1 clewis users  5149 Dec 29 12:57 H.10.sfp.Z

Normally, psroff is told whether to look for a ".pk" or ".sfp" font file for a given font at a specific size. However, psroff's font reader doesn't care whether the file it finds is PK or SFP, because you can tell from the first byte which of the two it is, and the reader automatically switches to the right decoding software. If psroff can't find a file with the .pk or .sfp suffix, it automatically checks for a tacked-on ".Z", and if it finds one it will popen a zcat (compress -dc) to read and decode the font file.

Psroff actually maintains the font internally as more of a PK format, but will read the SFP as a variant of the "unpacked PK" format. Of course, the emission of the font is in SFP format. (Does DWB 3.1's LJ emitter support incremental downloading? Psroff and jetroff do.)

The compression variant is really easy for you to incorporate into DWB 3.2. You could ship the fonts entirely compressed, and then tell the customer to uncompress those fonts that are used a lot. That eliminates the performance hit of decompression most of the time, while still keeping the full set available for immediate use. Psroff users haven't complained to me about the performance of this. (They did about other stuff, but I've fixed that.)

The PK to SFP conversion isn't that easy (unless you steal psroff source). Standalone programs to convert PK's to SFP's (including changing the mappings) are included with psroff (SoftQuad is using a version of this software with my blessing to create some of the fonts they distribute).
Jetroff includes a program to convert SFP's to PK's which I use to create the PK format fonts I distribute with psroff.

Font/code sources:

1 HP has sets of at least Roman and Helvetica at the sizes you'll need. (They seem to be discontinuing the floppy version however. I know that they have Latin-1 symbol sets, but I don't know whether they're currently available on floppy). Maybe you can do a deal with them. These are VERY good-looking fonts - to my eye they look nicer than the LaserWriter's Postscript fonts.
2 TeX PK's are available that have most of the characters you need (eg: the University of Toronto distribution). Psroff has facilities to search for and merge/remap these files into SFP's. ("buildfonts")
3 The freeware/shareware version of jetroff had PK's that buildfonts works with.
4 The commercial version of Jetroff has similar PK's and might have the Latin-1 extensions too.
5 METAFONT. "cm" PK fonts are rather ugly, but there are other fonts available that look nicer (eg: the am or jetroff's jm)

The fonts that come with psroff for laserjets are built out of 2 and 3 (indirectly 5 of course), and I know of people using psroff with 1 and 4. And a few people have parts of the Latin-1 set working thru psroff. (I'm working on full support for them - thanks for eliciting the paper on the subject from the net).

One thing you should be very aware of is that the HP Laserjet III has font scaling built in, and you can get a CG Times and Universal at any size you want out of them, just by requesting them by characteristic. You should support this. Psroff does, but I don't have the width table issue sorted out quite yet.

--
Chris Lewis, Phone: (613) 832-0541
UUCP: uunet!utai!lsuc!ecicrl!clewis
Moderator of the Ferret Mailing List (ferret-request@eci386)
Psroff mailing list (psroff-request@eci386)
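The font lookup described in this post (try the plain file, fall back to a ".Z" copy, then pick the decoder from the first byte) can be sketched roughly as follows. This is an illustrative Python sketch, not psroff's code: the function names, the directory layout and the use of an external `zcat` are assumptions. The two magic bytes are the standard ones: a TeX PK file starts with the "pre" opcode 247, and an HP soft font starts with an escape sequence (ESC, 0x1B).

```python
import os
import subprocess

PK_PRE = 247   # first byte of a TeX PK file (the "pre" opcode)
ESC = 0x1B     # HP soft fonts (SFP) begin with an escape sequence


def font_format(first_byte):
    """Classify a font file by its first byte."""
    if first_byte == PK_PRE:
        return "pk"
    if first_byte == ESC:
        return "sfp"
    return "unknown"


def locate(directory, font, size, suffix, exists=os.path.exists):
    """Return (path, compressed) for e.g. H.10.sfp, or None if absent.

    The plain file is preferred; a tacked-on ".Z" is the fallback.
    `exists` is injectable so the lookup can be tried without real files.
    """
    path = f"{directory}/{font}.{size}.{suffix}"
    if exists(path):
        return path, False
    if exists(path + ".Z"):
        return path + ".Z", True
    return None


def read_font(path, compressed):
    """Read the raw font bytes, piping through zcat for .Z files.

    Assumes a `zcat` on PATH that understands compress(1) output.
    """
    if compressed:
        return subprocess.run(["zcat", path], stdout=subprocess.PIPE,
                              check=True).stdout
    with open(path, "rb") as handle:
        return handle.read()
```

A caller would then do something like `data = read_font(*found)` followed by `font_format(data[0])` to choose the PK or SFP decoder, regardless of which suffix was asked for.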
jjc@jclark.UUCP (James Clark) (01/01/91)
Groff has a composite character feature which helps with using ISO 8859/1 with a device that doesn't have all the ISO 8859/1 characters. For example, suppose you want to be able to input an `a' circumflex using ISO 8859/1 (`a' circumflex has code 0342), and suppose you have a device which has the letter `a' and has a circumflex accent, but doesn't have an `a' circumflex as a single character. Assuming `\*^' has been defined appropriately, you just have to do:

    .char \342 a\\*^

(by \342 I mean the character whose code is 0342.) After this you can use \342 exactly as if your output device provided an `a' with a circumflex as a single character.

The `char' request is useful for other things too: for example,

    .char \(ru \D'l .5m 0'

will get you a \(ru character if your output device happens not to have one.

Characters defined with the `char' request can be used just like other characters: for example, they will be hyphenated properly (after an appropriate `hcode' request); they can also be used with the `lc' request, with `tr' request and with the `\l' or `\L' escape sequences.

James Clark
jjc@jclark.uucp
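The idea behind the composite character feature, independent of groff's own syntax, can be sketched like this. It is a minimal Python illustration and not how groff is implemented: the `COMPOSITES` table, the `render` function and the way a device's glyph set is represented are all hypothetical. The only facts taken from the post are that `a' circumflex has code 0342 and that it can be built from `a' plus a circumflex accent.

```python
# Hypothetical fallback table: Latin-1 code -> (base letter, accent glyph).
COMPOSITES = {
    0o342: ("a", "^"),   # a circumflex, the example from the post
    0o352: ("e", "^"),   # e circumflex
}


def render(code, device_glyphs):
    """Return the glyphs to print for one ISO 8859/1 input code.

    If the device has the character as a single glyph, use it.
    Otherwise overstrike the base letter with the accent, provided
    the device has both parts. Raise KeyError if neither works.
    """
    char = chr(code)  # Latin-1 codes coincide with the first 256 code points
    if char in device_glyphs:
        return [char]
    parts = COMPOSITES.get(code)
    if parts and all(p in device_glyphs for p in parts):
        return list(parts)
    raise KeyError(f"no glyph or composite for code {code:#o}")
```

On a device with only ASCII glyphs, `render(0o342, {"a", "^"})` yields the two-glyph overstrike; on a device with the full set it yields the single character.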
clewis@ecicrl.UUCP (Chris Lewis) (01/03/91)
In article <JJC.90Dec31183059@jclark.jclark.UUCP> jjc@jclark.UUCP (James Clark) writes:
>Groff has a composite character feature which helps with using ISO
>8859/1 with a device that doesn't have all the ISO 8859/1 characters.
>For example, suppose you want to be able to input an `a' circumflex
>using ISO 8859/1 (`a' circumflex has code 0342), and suppose you have
>a device which has the letter `a' and has a circumflex accent, but
>doesn't have an `a' circumflex as a single character. Assuming `\*^'
>has been defined appropriately, you just have to do:

So does psroff. In fact, I'd like to include the following plea to Nils-Peter for consideration in DWB 3.2:

It is not necessary to limit the emit sequence in the ditroff width table files to be just one character. It would be advantageous to permit the fourth field in the width table to consist of any arbitrary sequence of characters, including backslash escape sequences (octal, maybe even hex a la 1003.1 string definitions). In this way, people can produce composite characters for mundane things (such as O overstrike c) that a printer doesn't support, as well as Latin-1 extensions on printers that are short the special glyphs, and even get at characters you call for by name (the characters not in the default postscript encoding vectors).

Psroff has this feature: embedded in the built-in emit sequences (after all, CAT troff can't extend its basic character set) plus translation override facilities, with multiple char codes, and indeed, invocations of Postscript drawing routines or glyph-backspace-glyph sequences etc.

>.char \342 a\\*^

This feature is going to be appearing in Psroff soon.

As another plea, in the extremely unlikely event that somebody does get to diddle SVR4 CAT Troff and sees this posting, PLEASE PLEASE PLEASE implement "\!". This directive should simply pump its arguments out *in* the CAT code stream, prefixing them with an unused CAT code (eg: 'M') and terminating them with a null or newline.
That would make my life complete. [I enjoy the simple life ;-)]

--
Chris Lewis, Phone: (613) 832-0541
UUCP: uunet!utai!lsuc!ecicrl!clewis
Moderator of the Ferret Mailing List (ferret-request@eci386)
Psroff mailing list (psroff-request@eci386)
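A multi-character emit field of the kind requested in this post could be decoded along these lines. This is a Python sketch under stated assumptions, not DWB or psroff behaviour: the escape syntax (`\ooo` octal, `\xhh` hex, `\\` for a literal backslash), the function names and the sample table line are made up for illustration. The only thing taken from the post is that the fourth field of a width table entry carries the bytes to emit.

```python
import re

# Assumed escape syntax: \ooo (octal, at most \377), \xhh (hex), \\ (backslash).
_ESCAPE = re.compile(r"\\([0-3][0-7]{2}|[0-7]{1,2}|x[0-9A-Fa-f]{1,2}|\\)")


def decode_emit(field):
    """Turn the text of an emit field into the bytes sent to the printer."""
    def expand(match):
        token = match.group(1)
        if token == "\\":
            return "\\"
        if token.startswith("x"):
            return chr(int(token[1:], 16))
        return chr(int(token, 8))
    return _ESCAPE.sub(expand, field).encode("latin-1")


def parse_entry(line):
    """Split a width table line and decode its fourth field.

    Returns (name, emit_bytes); fields two and three are left alone.
    """
    fields = line.split(None, 3)
    return fields[0], decode_emit(fields[3].strip())
```

With this, an "O overstrike c" entry could be written as glyph, backspace, glyph: the field `O\010c` decodes to the three bytes `O`, backspace, `c`.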
npn@cbnewsl.att.com (nils-peter.nelson) (01/03/91)
Chris Lewis requests an additional field in the width tables to instruct troff how to manufacture the additional 8859 characters that are not in ASCII. Some of them appear quite easy (e.g., the Yen sign looks like a Y with a line through it, or the lower case letters with accent marks) but others appear near-impossible. For example, all the upper case letters will obliterate diacriticals, and the Icelandic eth doesn't appear to have an obvious representation. Is it worth doing half the job? (I.e., should we try to implement those characters that can be done this way and forget the others? Do a bad job on the others?)

My inclination is to support two and only two modes for "production": PostScript and nroff. If you want ISO 8859 nroff, get an ISO 8859 terminal. The stuff about 7 bit shorthand for 8 bit characters was intended for debugging and interchange, not production. So far, no one has answered my previous question: will this direction meet the needs of the European market?
staff@cadlab.sublink.ORG (Alex Martelli) (01/04/91)
npn@cbnewsl.att.com (nils-peter.nelson) writes:
...
:Chris Lewis requests an additional field in the width
:tables to instruct troff how to manufacture the additional
:8859 characters that are not in ASCII. Some of them appear
...
:My inclination is to support two and only two modes for
:"production": PostScript and nroff. If you want ISO 8859
:nroff, get an ISO 8859 terminal. The stuff about 7 bit
:shorthand for 8 bit characters was intended for debugging
:and interchange, not production. So far, no one has answered
:my previous question: will this direction meet the needs
:of the European market?
Speaking as a European whose language needs very few diacritical marks
on letters (just a few accented vowels), I'd say the PS-or-nroff
direction would NOT be quite satisfactory; Chris's proposal looks VERY
much more attractive. How a Turk would feel about it, I can't say.
Regarding your previous question re Laserjet printers, I must say that
they and their clones appear to be VERY well situated in the Italian
market; probably MORE popular than PS printers, even including the ones
you obtain by tweaking a Laserjet, maybe because such tweaks don't
always work well. Lack of Laserjet support, in other terms, WOULD be
a (minor, but perceptible) handicap in the Italian market - although I
understand your arguments regarding why and wherefore you only want to
support PostScript output for ISO 8859.
---
Alex Martelli - CAD.LAB s.p.a., v. Stalingrado 53, Bologna, Italia
Email: (work:) staff@cadlab.sublink.org, (home:) alex@am.sublink.org
Phone: (work:) ++39 (51) 371099, (home:) ++39 (51) 250434;
Fax: ++39 (51) 366964 (work only), Fidonet: 332/401.3 (home only).
clewis@ecicrl.UUCP (Chris Lewis) (01/07/91)
In article <1991Jan3.151843.24109@cbnewsl.att.com> npn@cbnewsl.att.com (nils-peter.nelson) writes:
|Chris Lewis requests an additional field in the width
|tables to instruct troff how to manufacture the additional
|8859 characters that are not in ASCII. Some of them appear
|quite easy (e.g., the Yen sign looks like a Y with a line
|through it, or the lower case letters with accent marks)
|but others appears near-impossible. For example, all the
|upper case letters will obliterate diacriticals, and the
|Icelandic eth doesn't appear to have an obvious representation.
|Is it worth doing half the job? (I.e., should we try to
|implement those characters that can be done this way and
|forget the others? Do a bad job on the others?)

My suggestion was to permit the fourth field to be more than one character. You're quite right in that this seems a bit half-assed. However, in psroff it's fairly necessary to permit slight adjustment of some characters' placement and to kludge up some additional characters. (Eg: HPLJ box drawing characters don't precisely line up the way troff expects them to). Psroff, though, has a somewhat more sophisticated scheme; in ditroff width table terms it looks like:

    char kern width <sequence> <xshift> <yshift> <scale>

Where <sequence> is a sequence of one or more bytes to emit for the glyph (it can even be invocations of Postscript functions), x shift and y shift are adjustment factors that are multiplied by the point size and added to the X and Y coordinate when positioning the character, and scale is a factor to apply to the point size (eg: bullets are too small in Postscript Roman). They default to 0, 0 and 1. In this way, box corners can be tuned etc. It's not particularly necessary with Postscript, but it certainly is with Laserjets.

|My inclination is to support two and only two modes for
|"production": PostScript and nroff. If you want ISO 8859
|nroff, get an ISO 8859 terminal. The stuff about 7 bit
|shorthand for 8 bit characters was intended for debugging
|and interchange, not production. So far, no one has answered
|my previous question: will this direction meet the needs
|of the European market?

Speaking unofficially (I'm on the ISO/CSA/Treasury/POSIX committee), having ditroff accept 8-bit characters on input (and also use a reasonable set of \( sequences for those without the proper terminal), being able to successfully search for them in the width tables, and emit the appropriate stuff seems to be consistent (from the perspective of code, not tables) and sufficient for Canada to be happy (the Canadian Federal Govt. is pushing 8859-1 because of French-English bilingualism requirements).

From the perspective of only supporting Postscript on troff with 8859-1, do you have any comments on the suggestions I made? At the very least, the other troff filters should *not* disallow 8-bit, and should permit extension of the character set by the user when/if the appropriate fonts are available. Including the proper tables (and a pointer to where fonts might be obtained) would be even better if you can't compress the fonts.

I think it would be a drastic mistake to only support 8859-1 on Postscript. It's not that hard for HPPCL.

--
Chris Lewis, Phone: (613) 832-0541
UUCP: uunet!utai!lsuc!ecicrl!clewis
Moderator of the Ferret Mailing List (ferret-request@eci386)
Psroff mailing list (psroff-request@eci386)
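The arithmetic behind the <xshift> <yshift> <scale> fields described in this post can be written out as a short sketch. This is illustrative Python, not psroff source: the `GlyphEntry` class and `place` function are hypothetical names, and the coordinate units are assumed to be whatever units the driver already positions glyphs in. The rule itself comes from the post: the shifts are multiplied by the point size and added to X and Y, the scale multiplies the point size, and the defaults are 0, 0 and 1.

```python
from dataclasses import dataclass


@dataclass
class GlyphEntry:
    sequence: bytes        # bytes to emit for the glyph
    xshift: float = 0.0    # horizontal adjustment, in units of point size
    yshift: float = 0.0    # vertical adjustment, in units of point size
    scale: float = 1.0     # factor applied to the point size


def place(entry, x, y, point_size):
    """Return the adjusted (x, y, point_size) used to print the glyph."""
    return (
        x + entry.xshift * point_size,
        y + entry.yshift * point_size,
        point_size * entry.scale,
    )
```

An entry with default fields leaves the position and size untouched, so existing width tables would behave as before; only glyphs that need tuning (box corners, undersized bullets) carry non-default values.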
clewis@ecicrl.UUCP (Chris Lewis) (01/07/91)
In article <1041@ecicrl.UUCP> clewis@ecicrl.UUCP (Chris Lewis) (me) writes:
>I think it would be a drastic mistake to only support 8859-1 on Postscript.
>It's not that hard for HPPCL.

*ESPECIALLY* considering the new HP Laserjets (III's) have two font sets built in that the printer can scale to any size.

--
Chris Lewis, Phone: (613) 832-0541
UUCP: uunet!utai!lsuc!ecicrl!clewis
Moderator of the Ferret Mailing List (ferret-request@eci386)
Psroff mailing list (psroff-request@eci386)
lee@sq.sq.com (Liam R. E. Quin) (01/08/91)
npn@cbnewsl.att.com (nils-peter.nelson) writes:
>Our plan would be to provide support for the European standard
>(ISO 8859-1) character set only with the PostScript postprocessor,
>dpost. [...]
>So, my question is, have the Europeans settled on PostScript as
>a standard for printers, or is there something else we should
>be supporting?

Speaking as someone who until very recently worked in a UK Unix company, I would say that the LaserJet is literally orders of magnitude more widespread. The high cost of PostScript printers, coupled with the high mark-up involved in shipping to Europe, means that PostScript has not made anything like the market penetration it seems to have achieved in North America.

Small companies to whom we sold (sq)troff were more likely to have LaserJets than LaserWriters, probably since the former could be obtained for under (the equivalent of) US$3,000 in the UK, whilst PostScript printers started at a little over US$6,000.

I do feel that it might be worth your while investing in a little Market Research. [But perhaps I shouldn't be giving clues to the competition :-)]

And, as Chris Lewis points out, there is no reason why you shouldn't at least compress the HP fonts if there are so many. The AF and AD sets used to come with a reasonable Latin-1 (Roman 8) character set.

With a careful font downloading scheme and a good driver, a LaserJet can easily out-perform most PostScript printers for most common jobs.

Lee
--
Liam R. E. Quin, lee@sq.com, SoftQuad Inc., Toronto, +1 (416) 963-8337
gisle@ifi.uio.no (Gisle Hannemyr) (01/15/91)
In article <1991Jan3.151843.24109@cbnewsl.att.com> npn@cbnewsl.att.com (nils-peter.nelson) writes:
> Chris Lewis requests an additional field in the width
> tables to instruct troff how to manufacture the additional
> 8859 characters that are not in ASCII. Some of them appear
> quite easy (e.g., the Yen sign looks like a Y with a line
> through it, or the lower case letters with accent marks)
> but others appears near-impossible. For example, all the
> upper case letters will obliterate diacriticals, and the
> Icelandic eth doesn't appear to have an obvious representation.
> Is it worth doing half the job? (I.e., should we try to
> implement those characters that can be done this way and
> forget the others? Do a bad job on the others?)
> My inclination is to support two and only two modes for
> "production": PostScript and nroff. If you want ISO 8859
> nroff, get an ISO 8859 terminal. The stuff about 7 bit
> shorthand for 8 bit characters was intended for debugging
> and interchange, not production. So far, no one has answered
> my previous question: will this direction meet the needs
> of the European market?

First, when you say ISO 8859, do you actually mean the complete set of ISO 8859 character sets, or just ISO 8859/1? In any case, only the latter is yet implemented in most PostScript printers.

One important source of information on how the European market views character sets is the European Government OSI Profiles. They are called things like GOSIP (UK), SOSIP (Sweden), NOSIP (Norway) and EPHOS == European Procurement Handbook for Open Systems (European Community). The X/Open portability guide NLS also discusses character sets.

I have spent some time studying these, and in brief, yes, they focus on ISO 8859/1. And IMHO supporting ISO 8859/1 will meet the major requirements for most of Western Europe. Note, however, the following three exceptions:

1) Slavic languages, and of course Cyrillic, as required in Eastern Europe, are not covered by ISO 8859/1.
2) Lappish -- a small minority language in the north of Finland, Sweden and Norway -- is not covered by ISO 8859/1 (but by ISO 8859/4).
3) A number of network communication protocols (most importantly X.400 electronic mail) assume ISO 6937/2, not ISO 8859/1 (ISO 6937/2 is a superset of ISO 8859/1-4).

--
Disclaimer: The opinions expressed herein are not necessarily those of my employer, not necessarily mine, and probably not necessary.

- gisle hannemyr (Norwegian Computing Center)
EAN: C=no;PRMD=uninett;O=nr;S=Hannemyr;G=Gisle (X.400 SA format)
gisle.hannemyr@nr.no (RFC-822 format)
Inet: gisle@ifi.uio.no
UUCP: ...!mcsun!ifi!gisle