[comp.text.tex] LaTeX: Hyphenation Problem

mgb@csadfa.cs.adfa.oz.au (Michael Barlow) (06/13/91)

Hi, I'm having a problem with hyphenation under LaTeX and wonder if
someone could help.

I'm using a number of hyphenated words like non-normalised and
log-concatenated. These occur in the captions of a large number of
figures and tables. When I generate a lof or lot I get a large number
of overfull hbox errors because it doesn't want to hyphenate these
words anywhere but at the original hyphen.

I've tried things like \hyphenation{nor-mal-ised}
but of course normalised isn't the same word as non-normalised!...
and \hyphenation{non-nor-mal-ised} won't work either as it looks for
the word nonnormalised. 

I know I can go around and individually hyphenate these words but it
would be much nicer if there was a simple, one place, fix.

Thank you in advance.




							Spike
						    Bun Bu RyoDo
--------------
Michael Barlow	 mgb@csadfa.cs.adfa.oz.au

robin@lsl.co.uk (Robin Fairbairns) (06/14/91)

In article <1991Jun13.070121.1906@sserve.cc.adfa.oz.au>, mgb@csadfa.cs.adfa.oz.au (Michael Barlow) writes:
> Hi, I'm having a problem with hyphenation under LaTeX and wonder if
> someone could help.
> 
> I'm using a number of hyphenated words like non-normalised and
> log-concatenated. These occur in the captions of a large number of
> figures and tables. When I generate a lof or lot I get a large number
> of overfull hbox errors because it doesn't want to hyphenate these
> words anywhere but at the original hyphen.
> 
> I've tried things like \hyphenation{nor-mal-ised}
> but of course normalised isn't the same word as non-normalised!...
> and \hyphenation{non-nor-mal-ised} won't work either as it looks for
> the word nonnormalised. 
> 
> I know I can go around and individually hyphenate these words but it
> would be much nicer if there was a simple, one place, fix.

This is because of the jolly rules of `proper' typesetting - don't (they
say) hyphenate a word that's already been explicitly hyphenated.

I picked up the following hack from Barbara Beeton ages ago, and include
it in all the style files I write - it needs to be inserted in the final
pass (so non-normalised goes to non\hyph normalised), to get rid of
those otherwise un-removable bad \hbox{es}. It's the `breakable hyphen'
command: 

      \def\hyph{-\penalty0\hskip0pt\relax}

You could play tricks mapping it to a character that's made active for 
the purpose, but `-'?  Mmmm...
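As a minimal sketch of how the macro might be used in the captions from the original question: the document class, the figure body, and the caption wording below are illustrative assumptions, not taken from anyone's actual files. The zero penalty and zero-width glue after the hyphen let TeX treat what follows as a fresh word, so the part after the hyphen can be hyphenated by the usual patterns.

```latex
\documentclass{article}

% Breakable hyphen, as given above: an explicit hyphen followed by a
% zero penalty and zero-width glue, so the text after it is treated
% as a new word that TeX may hyphenate.
\newcommand{\hyph}{-\penalty0\hskip0pt\relax}

\begin{document}
\listoffigures

\begin{figure}
  \centering
  \rule{4cm}{2cm}% placeholder standing in for the real figure
  \caption{Non\hyph normalised and log\hyph concatenated spectra}
\end{figure}
\end{document}
```

Note the space after `\hyph` in the caption: it only terminates the control word and does not appear in the output.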
-- 
Robin Fairbairns, Senior Consultant, postmaster and general dogsbody
Laser-Scan Ltd., Science Park, Milton Rd., Cambridge CB4 4FY, UK
Email: robin@lsl.co.uk  --or--  rf@cl.cam.ac.uk

spit@fys.ruu.nl (Werenfried Spit) (06/15/91)

In <1991Jun14.143616.776@lsl.co.uk> robin@lsl.co.uk (Robin Fairbairns) writes:

>This is because of the jolly rules of `proper' typesetting - don't (they
>say) hyphenate a word that's already been explicitly hyphenated.
That may be, but you will have difficulties with German
and Dutch (and probably many other languages), which
contain too many such words for this rule to be practical.

robin@lsl.co.uk (Robin Fairbairns) (06/18/91)

In article <1991Jun14.191028.14533@fys.ruu.nl>, spit@fys.ruu.nl (Werenfried Spit) writes:
> In <1991Jun14.143616.776@lsl.co.uk> robin@lsl.co.uk (Robin Fairbairns) writes:
> 
>>This is because of the jolly rules of `proper' typesetting - don't (they
>>say) hyphenate a word that's already been explicitly hyphenated.
> That may be, but you will have difficulties with German
> and Dutch (and probably many other languages), which
> contain too many such words for this rule to be practical.

Oh dearie me - I stand corrected.  I just don't know enough German, 
obviously, and my Dutch is next to non-existent (there are too many good 
speakers of English in Holland for the incentive to be strong enough).

Am I to understand that this is another instance in which TeX is 
not-quite-perfect for the non-English-speaking world?  Oh woe!
-- 
Robin Fairbairns, Senior Consultant, postmaster and general dogsbody
Laser-Scan Ltd., Science Park, Milton Rd., Cambridge CB4 4FY, UK
Email: robin@lsl.co.uk  --or--  rf@cl.cam.ac.uk

geoffo@spectrum.cs.unsw.oz.au (Geoff Oakley) (06/19/91)

In article <1991Jun18.095110.781@lsl.co.uk> robin@lsl.co.uk (Robin Fairbairns) writes:

   In article <1991Jun14.191028.14533@fys.ruu.nl>, spit@fys.ruu.nl (Werenfried Spit) writes:
   > In <1991Jun14.143616.776@lsl.co.uk> robin@lsl.co.uk (Robin Fairbairns) writes:
   > 
   >>This is because of the jolly rules of `proper' typesetting - don't (they
   >>say) hyphenate a word that's already been explicitly hyphenated.
   > That may be, but you will have difficulties with German
   > and Dutch (and probably many other languages), which
   > contain too many such words for this rule to be practical.

   Oh dearie me - I stand corrected.  I just don't know enough German, 
   obviously, and my Dutch is next to non-existent (there are too many good 
   speakers of English in Holland for the incentive to be strong enough).

   Am I to understand that this is another instance in which TeX is 
   not-quite-perfect for the non-English-speaking world?  Oh woe!

Don't get too depressed just yet.  Remember that non-English `versions'
of TeX have their own, different hyphenation tables.  And that with
TeX 3.0 there is (fairly) full support for multi-lingual TeX.
Remember also the impressive Japanese and Arabic (and no doubt other)
versions that exist.
--

	       geoffo@spectrum.cs.unsw.oz.au
Geoff Oakley:  CS & E, UNSW, PO Box 1, Kensington, NSW 2033, Australia
	       Phone: +61 2 697 4043	Fax: +61 2 313 7987

spit@fys.ruu.nl (Werenfried Spit) (06/19/91)

In <GEOFFO.91Jun19171522@crimson.spectrum.cs.unsw.oz.au> geoffo@spectrum.cs.unsw.oz.au (Geoff Oakley) writes:

>Don't get too depressed just yet.  Remember that non-English `versions'
>of TeX have their own, different hyphenation tables.  And that with
>TeX 3.0 there is (fairly) full support for multi-lingual TeX.
>Remember also the impressive Japanese and Arabic (and no doubt other)
>versions that exist.

True, but the problem remains, as TeX (without special tricks) will
not hyphenate words containing an explicit hyphen, even if the patterns
clearly show how each of the constituent parts could be hyphenated.

geyer@galton.uchicago.edu (06/19/91)

In article <1991Jun19.135807.3082@fys.ruu.nl> spit@fys.ruu.nl
(Werenfried Spit) writes:

> True, but the problem remains, as TeX (without special tricks) will
> not hyphenate words containing an explicit hyphen, even if the patterns
> clearly show how each of the constituent parts could be hyphenated.

That is because TeX does the Right Thing.  It is not correct to further
hyphenate words already containing a hyphen, unless nothing else can be
done.  Generally a copyeditor can avoid the troublesome hyphen by
rearranging a word or two somewhere in the paragraph.

TeX is for typesetting of the highest quality.  It can't do what you
want without compromising that quality.



Charles Geyer
Department of Statistics
University of Chicago
geyer@galton.uchicago.edu

spit@fys.ruu.nl (Werenfried Spit) (06/20/91)

In <1991Jun19.164316.16358@midway.uchicago.edu> geyer@galton.uchicago.edu writes:


>In article <1991Jun19.135807.3082@fys.ruu.nl> spit@fys.ruu.nl
>(Werenfried Spit) writes:

>> True, but the problem remains, as TeX (without special tricks) will
>> not hyphenate words containing an explicit hyphen, even if the patterns
>> clearly show how each of the constituent parts could be hyphenated.

>That is because TeX does the Right Thing.  It is not correct to further
>hyphenate words already containing a hyphen, unless nothing else can be
>done.  Generally a copyeditor can avoid the troublesome hyphen by
>rearranging a word or two somewhere in the paragraph.

>TeX is for typesetting of the highest quality.  It can't do what you
>want without compromising that quality.

What I was saying a posting or two earlier is that languages like
German and Dutch have agglutinating possibilities that make very long
words, with and without explicit hyphens, rather common. This means that
applying the typographical rule mentioned above can sometimes get you
into more serious typographical trouble.
TeX should *also* give you the highest quality in languages other than
English.

icking@gmdzi.gmd.de (Werner Icking) (06/20/91)

robin@lsl.co.uk (Robin Fairbairns) writes:

>In article <1991Jun14.191028.14533@fys.ruu.nl>, spit@fys.ruu.nl (Werenfried Spit) writes:
>> In <1991Jun14.143616.776@lsl.co.uk> robin@lsl.co.uk (Robin Fairbairns) writes:
>> 
>>>This is because of the jolly rules of `proper' typesetting - don't (they
>>>say) hyphenate a word that's already been explicitly hyphenated.
>> That may be, but you will have difficulties with German
>> and Dutch (and probably many other languages), which
>> contain too many such words for this rule to be practical.

>Oh dearie me - I stand corrected.  I just don't know enough German, 
>obviously, and my Dutch is next to non-existent (there are too many good 
>speakers of English in Holland for the incentive to be strong enough).

But in this special case most German writers don't know enough German.
They often write a "-" where the rules do not allow it: for black-and-white
film (sp?), Schwarz-Weiss-Film is wrong; Schwarzweissfilm is correct. And
from the typesetting point of view you are completely right: hyphenating a
word which already contains hyphens (dashes, "-") makes it difficult to
read, and is therefore bad typesetting: Ich wohne in der E.-T.-A.-Hoff-
mann-Strasse --- oder gar oder gar oder gar oder gar --- der Giro-d'Ita-
lia-Gewinner erhaelt ...
(Roughly: "I live in E.-T.-A.-Hoffmann-Strasse", "or even", "the
Giro d'Italia winner receives ...")

>Am I to understand that this is another instance in which TeX is 
>not-quite-perfect for the non-English-speaking world?  Oh woe!

Yes, and the first indication for me was already in the TeXbook, just where
Knuth wants to show that TeX may be used for non-English languages, too.
The Hungarian names are written with incorrect accents!

I hope that my English is no worse than TeX is for non-English languages :-)
-- 
Werner Icking          icking@gmdzi.gmd.de          (+49 2241) 14-2443
Gesellschaft fuer Mathematik und Datenverarbeitung mbH (GMD)
Schloss Birlinghoven, P.O.Box 1240, D-5205 Sankt Augustin 1, FRGermany
                                  "Der Dativ ist dem Genitiv sein Tod."

marcel@cs.caltech.edu (Marcel van der Goot) (06/21/91)

Robin Fairbairns (robin@lsl.co.uk) wrote:
> Am I to understand that this is another instance in which TeX is 
> not-quite-perfect for the non-English-speaking world?  Oh woe!

TeX is not perfect of course, but is capable of solving the problem.
You simply say
	\lccode`\-=`\- \hyphenchar\the\font=`\#
and voila, TeX will hyphenate words containing a `-'. There is only one
drawback: if TeX hyphenates a word it uses a `#' instead of a `-', as in
	\showhyphens{hyphenation}

	Underfull \hbox (badness 10000) detected at line 0
	[] \tenrm hy#phen#ation
To solve that, the font should contain two hyphen symbols, one in the
normal position and one in the position for `#'.
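For anyone who wants to try this on the compound words from the original question, here is a small sketch in LaTeX form. It is an illustration under assumptions, not a tested recipe: it assumes the default Computer Modern text font, and it writes `\hyphenchar\font`, which addresses the current font just as `\hyphenchar\the\font` does. The break points reported in the log depend on the hyphenation patterns loaded.

```latex
\documentclass{article}
\begin{document}
% Give `-' a nonzero \lccode so that it counts as a letter when TeX
% looks for a word to hyphenate.
\lccode`\-=`\-
% Make `#' the hyphen character of the current font, so that `-' no
% longer triggers the "already explicitly hyphenated" treatment.
% \hyphenchar assignments are always global.
\hyphenchar\font=`\#
% Report the break points TeX finds in the log file.
\showhyphens{non-normalised log-concatenated}
\end{document}
```

As noted above, any break TeX actually makes will print a `#` unless the font has a second hyphen glyph in that position, so this is mainly useful for checking which break points become available.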

Charles Geyer (geyer@galton.uchicago.edu) declared in the same thread:
> That is because TeX does the Right Thing.  It is not correct to further
> hyphenate words already containing a hyphen, unless nothing else can be
> done.

This program posts news to thousands of machines throughout the entire
civilized world.  Your message will cost the net hundreds if not thousands of
dollars to send everywhere.  Please be sure you know what you are doing.

Are you absolutely sure that you want to do this? [ny]


Finally, Werner Icking (icking@gmdzi.gmd.de) remarked:
> But in this special case most German writers don't know enough German.
> They often write a "-" when the rules do not allow this: black-and-white
> film (sp?) Schwarz-Weiss-Film is wrong; Schwarzweissfilm is correct.

Same is true for Dutch; most hyphens that are written should be omitted.
``Aardappelmeelfabriek,'' not ``aardappelmeel-fabriek'' or other horrors.
(aard=earth, appel=apple, meel=flour/starch, fabriek=factory)


                                          Marcel van der Goot
 .----------------------------------------------------------------
 | Blauw de vi-ool-tjes,                    marcel@vlsi.cs.caltech.edu
 |    Rood zijn de ro-zen;
 | Een rijm kan ge-zet
 |    Met plak-sel en do-zen.
 |