[sci.lang.japan] Marketing wizardry & handling of far-east languages.

goer@sophist.uucp (Richard Goerwitz) (09/28/89)

In article <5508@zyx.ZYX.SE> arndt@zyx.ZYX.SE (Arndt Jonasson) makes
a very important request for information - one that makes we here in
the US only to painfully aware of our almost pathological inability
to think internationally, at least on the linguistic level:

>This is a request for information. We are in the process of developing
>software which among other things will handle natural languages other
>than English in a useful manner. This software will mostly run in an
>environment using the X Window System. Among the languages that raise
>the most problems is Japanese, since the set of characters is so much
>larger than the Latin alphabet.

I have sent mail to several firms about this problem.  Most replies
have been of the ilk:

    We are aware of the problem of internationalization, and we are
    working on localizing the various versions of our software for
    various nationalities.

The fundamental misconception is, of course, that localization is com-
patible with internationalization.  Every time a system is hacked for
a new alphabet/font/wordwrap method, all the software needs to be hacked
with it.  Moreover, software written for a different situation in a
different country needs to be "ported" to run in another country and
another situation.

And what of bi- or multi-lingual environments?  Increasingly, English
is being used in conjunction with national languages (e.g. India, and
in the Far East, somewhat in Arabic-speaking countries, definitely in
Israel).  In places like Turkey, we have Arabic, Turkish, and then some
English and other W. European languages being used by international
firms.  If we sell them "Turkish" versions of a given os or windowing
package, it will not fit the real-life conditions of the market.

A truly international windowing environment must offer basic support
for:

   1) proportional spacing on screen, with overstrikes (particularly
      important for Arabic)
   2) various character sets used simultaneously in the same window
   3) various wordwrap methods used simultaneously in the same win-
      dow

Only in this manner can each country in which a product is marketed
really have the same product (see the problems with EBCDIC transla-
tion!), and likewise be able to run products easily that were devel-
oped in other countries (and, I might add, to do it all at the same
time).

In short, Arndt Jonasson will be hard-pressed to find what he is
looking for, at least in terms of some fundamentally international
solution.  He will probably have to settle for a short-sighted hack
that some independent firm, or else some national branch of a larger
firm, has developed to meet his particular sort of need.

                                       -Richard L. Goerwitz
                                       goer@sophist.uchicago.edu
                                       rutgers!oddjob!gide!sophist!goer

gwyn@smoke.BRL.MIL (Doug Gwyn) (09/28/89)

In article <5557@tank.uchicago.edu> goer@sophist.UUCP (Richard Goerwitz) writes:
>In article <5508@zyx.ZYX.SE> arndt@zyx.ZYX.SE (Arndt Jonasson) makes
>a very important request for information - one that makes we here in
>the US only to painfully aware of our almost pathological inability
>to think internationally, at least on the linguistic level:

It also appears to make us forget how to use English.

>The fundamental misconception is, of course, that localization is com-
>patible with internationalization.

No, the fundamental problem is that you don't know what they
mean by "localization".  It's a technical term; locales provide
a flexible means of supporting multiple cultural interfaces on
the same system.  The original technique was devised by X3J11
in conjunction with international working groups that were
concerned with such issues, generally summarized as
"internationalization".  I receive many of their mailings
regularly.  I think they have the matter well under control.
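
For readers unfamiliar with the mechanism: the locale facility referred
to here surfaced in ANSI C as setlocale() and localeconv(). The sketch
below uses Python's standard locale module, which wraps those calls, only
to show the idea that a program asks the current locale for its
conventions instead of hard-coding them. The helper name
number_conventions is an illustration, not part of any standard, and only
the portable "C" locale is assumed to be installed.

```python
import locale

def number_conventions(locale_name="C"):
    """Select a numeric locale and return (decimal point, thousands separator)."""
    try:
        locale.setlocale(locale.LC_NUMERIC, locale_name)
    except locale.Error:
        # Named locales are installed per system; fall back to the portable one.
        locale.setlocale(locale.LC_NUMERIC, "C")
    conv = locale.localeconv()
    return conv["decimal_point"], conv["thousands_sep"]

# The "C" locale is always available; other names depend on the system.
print(number_conventions("C"))
```

Note that this covers cultural conventions (number, date, and collation
rules); it says nothing by itself about fonts, spacing, or word wrap.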

ry@cbnewsl.ATT.COM (ryerson.schwark) (09/28/89)

In article <5557@tank.uchicago.edu> goer@sophist.UUCP (Richard Goerwitz) writes:
>In short, Arndt Jonasson will be hard-pressed to find what he is
>looking for, at least in terms of some fundamentally international
>solution.  He will probably have to settle for a short-sighted hack
>that some independent firm, or else some national branch of a larger
>firm, has developed to meet his particular sort of need.


Not True!  AT&T has done considerable work on internationalization
with the intent that all the work we have done not have to be
redone for each language.  We have quite effectively addressed Japanese,
one of the more difficult languages with 3 alphabets and ideograms,
and have created some generalized solutions to address both Asian
and European languages.  The UNIX Software Operation is, however,
in the source code licensing business, so you may not be seeing
this stuff on your vendor's box yet, but the technology is there.  

Ry Schwark
rye@attunix.att.com

goer@sophist.uucp (Richard Goerwitz) (09/28/89)

In article <11171@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
 
>It also appears to make us forget how to use English.

Come now, is this really a substantive comment, Doug?

>>The fundamental misconception is, of course, that localization is com-
>>patible with internationalization.
>
>No, the fundamental problem is that you don't know what they
>mean by "localization".  It's a technical term; locales provide
>a flexible means of supporting multiple cultural interfaces on
>the same system.  The original technique was devised by X3J11
>in conjunction with international working groups that were
>concerned with such issues, generally summarized as
>"internationalization".  I receive many of their mailings
>regularly.  I think they have the matter well under control.

Very interesting.  The problem I have found (and, regardless of ter-
minology, it seems real enough to me) is that no one has come up
with a standard interface that:

  1) offers flexible creating and use of multiple fonts in the
     same window
  2) offers proportional spacing and/or overstrike, or some other
     ready means of getting languages like Arabic on the screen
  3) offers access to various wordwrap methods for (1) and (2)

If such a system exists, I would truly like to know about it.  Short
of this, it would be hard to call something "international."  My
impression is that the responder quoted above was so annoyed at my
ignorance about the term "localization" that he did not address the
substantive questions raised.  I, for one, would like to know more
than simply that they "have the matter well under control."

                                       -Richard L. Goerwitz
                                       goer@sophist.uchicago.edu
                                       rutgers!oddjob!gide!sophist!goer

samlb@pioneer.arc.nasa.gov (Sam Bassett RCD) (09/29/89)

	I heartily agree that U.S. companies are, for the most part,
abysmally ignorant about internationalization.

	On the other hand, I would counsel the gentleman from Sweden to
get his company to form a partnership with a Japanese (Korean, etc.)
company to develop WP software for those languages.  The Japanese, at
least, have put tremendous effort into handling romaji/katakana/kanji
input, output, and displays -- they are the _experts_ in the language,
after all, and know which optimizations and shortcuts will and will not
work.

	I know what kind of butchery _I_ do to the languages I half-know,
and have seen that kind of English . . .


Sam'l Bassett, Sterling Software @ NASA Ames Research Center, 
Moffett Field CA 94035 Work: (415) 694-4792;  Home: (415) 969-2644
samlb@well.sf.ca.us                     samlb@ames.arc.nasa.gov 
<Disclaimer> := 'Sterling doesn't _have_ opinions -- much less NASA!'

uucibg@swbatl.UUCP (3929) (09/29/89)

In article <5566@tank.uchicago.edu> goer@sophist.UUCP (Richard Goerwitz) writes:
>Very interesting.  The problem I have found (and, regardless of ter-
>minology, it seems real enough to me) is that no one has come up
>with a standard interface that:
>
>  1) offers flexible creating and use of multiple fonts in the
>     same window
>  2) offers proportional spacing and/or overstrike, or some other
>     ready means of getting languages like Arabic on the screen
>  3) offers access to various wordwrap methods for (1) and (2)
>
>If such a system exists, I would truly like to know about it.  Short
>of this, it would be hard to call something "international."  ...

You ought to check out the MacOS's Script Manager stuff.  It claims to do this
kind of thing.  As I recall, there were some bugs in the code but from what I
know it was substantially correct (disclaimer:  I've never actually had a 
chance to work with the routines).  I believe that the known bugs were to be
fixed with the next release of the OS (which should be out 1st qtr of 1990).
For more info, you probably could post to comp.sys.mac.programmer, look at
a copy of Inside Macintosh Volume V, or call Apple and have them tell
you to do one of the first two :-).

>                                       -Richard L. Goerwitz
>                                       goer@sophist.uchicago.edu
>                                       rutgers!oddjob!gide!sophist!goer

Disclaimer:  I could be wrong. :-)

Thanks,
--------------------------------------------------------------------------------
Brian R. Gilstrap    ...!{ {killer,bellcore}!texbell, uunet }!swbatl!uucibg
One Bell Center      +----------------------------------------------------------
Rm 17-G-4            | "Winnie-the-Pooh read the two notices very carefully,
St. Louis, MO 63101  | first from left to right, and afterwards, in case he had
(314) 235-3929       | missed some of it, from right to left."   -- A. A. Milne
--------------------------------------------------------------------------------
Disclaimer:
Me, speak for my company?  You must be joking.  I'm just speaking my mind.

gwyn@smoke.BRL.MIL (Doug Gwyn) (09/29/89)

In article <5566@tank.uchicago.edu> goer@sophist.UUCP (Richard Goerwitz) writes:
>I, for one, would like to know more than simply that they
>"have the matter well under control."

Well, instead of bitching about how dumb everybody is, don't you
think you should be participating in the internationalization work?
It's hardly been a closely-guarded secret.

goer@sophist.uucp (Richard Goerwitz) (09/29/89)

In article <11183@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>In article <5566@tank.uchicago.edu> goer@sophist.UUCP (Richard Goerwitz) writes:
>>I, for one, would like to know more than simply that they
>>"have the matter well under control."
>
>Well, instead of bitching about how dumb everybody is, don't you
>think you should be participating in the internationalization work?
>It's hardly been a closely-guarded secret.

Aren't you the same fellow who just blasted me for my grammar on
the net, and then proceeded to blast me again over my ignorance
about a term - "localization" (forgetting in the process to address
the basic question I asked)?  :-(

Let's get back to the first question I asked you:  What are these
products or systems you seem to think exist?  Who has the matter
"well under control"?  There's no need to be hostile.  I really 
don't mind being told I don't know what I'm talking about if my
respondent can cite evidence.  Can you cite an os or windowing sys-
tem that meets the criteria I outlined?  Please, no more flames:
Facts.

                                       -Richard L. Goerwitz
                                       goer@sophist.uchicago.edu
                                       rutgers!oddjob!gide!sophist!goer

ck@voa3.UUCP (Chris Kern) (09/29/89)

In article <5566@tank.uchicago.edu> goer@sophist.UUCP (Richard Goerwitz)
writes:
>              ...  The problem I have found (and, regardless of ter-
>minology, it seems real enough to me) is that no one has come up
>with a standard interface that:
>
>  1) offers flexible creating and use of multiple fonts in the
>     same window
>  2) offers proportional spacing and/or overstrike, or some other
>     ready means of getting languages like Arabic on the screen
>  3) offers access to various wordwrap methods for (1) and (2)
>

Xerox markets sophisticated multilingual word processing software in
its ViewPoint product line.  We currently have word processing in 31
languages, including some difficult ones, such as Arabic, Chinese, and
Hindi, and will have 43 languages installed by the middle of next year.

We tend to use the software mono- or bi-lingually; typically, our
radio scripts are composed in one foreign language with a little bit
of English thrown in.  However, there is no limit to the number of
languages that can be included in a single document.  The typing
logic is sensible (except in a few cases where well-established national
standards mandate a typewriter-style approach to typing, although it
probably is sensible to follow the standard if that's how everyone in
that culture is taught to type).  Rendering is handled properly
on the user's video monitor as well as in the laser printed hard-copy.
Our native speaker users say the quality of the fonts ranges from
good to outstanding.

Essentially, everything works exactly as the user expects.  Some
genuinely difficult technical obstacles must be overcome to accomplish
that.  It is not just a matter of drawing the fonts properly.
(Imagine an English phrase followed by its Chinese translation,
drawn from a universe of 10,000 discrete Chinese characters, with
an intervening parenthetical expression in Arabic, which is written
right-to-left and where many of the individual letters can assume
up to four different shapes depending on their position within a
word.  Now imagine what the software has to do as you type that string
of words serially.  Or backspace over or otherwise edit part of it after
you have typed it.)  We're quite pleased with the quality of the
individual languages.  But the *generality* of the system is astounding.

Currently, ViewPoint runs on Xerox's proprietary Mesa processor, but
the company has announced plans to port its office automation software
to a UNIX platform (specifically, a SPARC processor produced by or under
license from Sun).

(I have no connection to Xerox except as a customer.)

-- 
Chris Kern			     Voice of America, Washington, D.C.
...uunet!voa3!ck					+1 202-485-7020

tw@Atherton.COM (Tw Cook) (09/30/89)

In article <3260@amelia.nas.nasa.gov>, samlb@pioneer.arc.nasa.gov (Sam
Bassett RCD) writes:

>	On the other hand, I would counsel the gentleman from Sweden to
>get his company to form a partnership with a Japanese (Korean, etc.)
>company to develop WP software for those languages.  The Japanese, at
>least, have put tremendous effort into handling romaji/katakana/kanji
>input, output, and displays -- they are the _experts_ in the language,
>after all, and know which optimizations and shortcuts will and will not
>work.

I second this recommendation.  In my previous life at HP, handling
foreign languages seemed to be a really big problem; always behind, not
a very good job. Then they transferred a lot of the Unix commands work
to an HP lab in Japan.  Suddenly, the problem got much less severe!

Tw

ianf@nada.kth.se (Ian Feldman) (10/01/89)

In article <2033@cbnewsl.ATT.COM> ry@cbnewsl.ATT.COM (ryerson.schwark,sf,) 
comments upon Richard Goerwitz' conclusion:
> Arndt Jonasson [...]
> will probably have to settle for a short-sighted hack
> that some independent firm, or else some national branch of a larger
> firm, has developed to meet his particular sort of need.

thus:

> Not True!  AT&T [...] 
> have created some generalized solutions to address both Asian
> and European languages.

  Oh, yes?  I challenge you to come up with a solution to the Polish,
  Slovak, Czech, Croatian, Latvian and few other European Latin-character
  alphabets not currently cared for in either the EBCDIC, the "8-bit ASCII,"
  or the DEC Multinational character sets.  Not to mention the present-day's
  TOTAL inability to address/ display/ communicate with computers in bi-
  lingual or multi-lingual mode... 

  Seems to me any solution to the above that is based on post-addressing
 "the problem" instead of making it a part of the basic-design stage is
  bound to fail in the end.... see the "short-sighted hacks" that Richard
  was talking about.

  P.S. The computer czars have gotten away with it so far.  Now that
  Poland is about to re-join the Western society (in principle if not
  yet in spirit) there is one less excuse for not catering to 'East-
  European Commie languages'

-- 
----
------ ianf@nada.kth.se/ @sekth.bitnet/ uunet!nada.kth.se!ianf
----
--

oster@dewey.soe.berkeley.edu (David Phillip Oster) (10/07/89)

>In article <5566@tank.uchicago.edu> goer@sophist.UUCP (Richard Goerwitz) writes:
>Very interesting.  The problem I have found (and, regardless of ter-
>minology, it seems real enough to me) is that no one has come up
>with a standard interface that:

>  1) offers flexible creating and use of multiple fonts in the
>     same window
>  2) offers proportional spacing and/or overstrike, or some other
>     ready means of getting languages like Arabic on the screen
>  3) offers access to various wordwrap methods for (1) and (2)

Richard L. Goerwitz is right.  All of this is standard on the Macintosh.
Ever since the Mac II came out in 1987, all System releases have patched
the ROM Text editor to use Script Manager. Script Manager lets you chain
multiple national keyboards off the ADB bus, or remap the current keyboard
with a single mouse click.

It handles language systems that write from right to left and those that
write from left to right; it handles mixing them on a single line, and
selecting a portion of that line with the mouse. Think about it: a
selection that is contiguous in memory will not necessarily be contiguous
on the screen.
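
A toy sketch of that last point, under loud assumptions: uppercase
letters stand in for right-to-left script, each run of them is simply
reversed for display, and the function names are invented for the
illustration. This is not Script Manager's algorithm, only a model of why
one range in memory can land on two separate pieces of the screen line.

```python
def visual_order(text):
    """Return logical indices in left-to-right display order.

    Toy model: uppercase letters are right-to-left, everything else is
    left-to-right, and the line as a whole runs left to right.
    """
    order, i = [], 0
    while i < len(text):
        rtl = text[i].isupper()
        j = i
        while j < len(text) and text[j].isupper() == rtl:
            j += 1
        run = list(range(i, j))
        order.extend(reversed(run) if rtl else run)
        i = j
    return order

def selected_columns(text, start, end):
    """Screen columns covered by the logical (in-memory) range [start, end)."""
    order = visual_order(text)
    return sorted(col for col, idx in enumerate(order) if start <= idx < end)

line = "abcXYZdef"                     # stored order; XYZ displays as ZYX
print(selected_columns(line, 2, 5))    # -> [2, 4, 5], a gap at column 3
```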

It handles sorting according to the rules of the country (in Spanish, I've
heard, "ch" sorts after "cz".)
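
The Spanish rule can be sketched as a sort key that treats "ch" (and
"ll") as single letters with their own place in the alphabet. The
alphabet table and the name spanish_key are assumptions made for this
illustration of traditional Spanish ordering; they are not taken from
Script Manager.

```python
# Traditional Spanish alphabet, with the digraphs as letters of their own.
ALPHABET = ["a", "b", "c", "ch", "d", "e", "f", "g", "h", "i", "j", "k",
            "l", "ll", "m", "n", "\u00f1", "o", "p", "q", "r", "s", "t",
            "u", "v", "w", "x", "y", "z"]
RANK = {letter: i for i, letter in enumerate(ALPHABET)}

def spanish_key(word):
    """Map a word to a list of letter ranks, reading digraphs as one letter."""
    word = word.lower()
    key, i = [], 0
    while i < len(word):
        pair = word[i:i + 2]
        if len(pair) == 2 and pair in RANK:
            key.append(RANK[pair])
            i += 2
        else:
            # Unknown characters sort after every letter in this toy table.
            key.append(RANK.get(word[i], len(ALPHABET)))
            i += 1
    return key

words = ["chico", "czar", "dama", "casa"]
print(sorted(words))                   # plain code-point order
print(sorted(words, key=spanish_key))  # "ch" words after every other "c" word
```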

It handles specifying numeric formats (such as the spreadsheet
equivalent of a Fortran FORMAT statement) in one national format, in a
program written in a second language, for a customer who will be using
a third language.

It handles conversion to non-western calendar systems, such as the
Japanese in-the-year-of-the-emperor or the Arabic hours which are based
on the local length of the daylight. You use a cute piece of software that
lets you point at a world map, or type in a city name, if you don't know
your latitude or longitude.

It handles mixing multi-byte characters, such as Japanese, with
single-byte characters, and the problems that causes for string
searching (a search must not match in the middle of a character; JNSI
chars take two bytes each).
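
To make the searching problem concrete, here is a sketch that uses
Shift-JIS purely as an example of a two-byte encoding; the post does not
say which encoding Script Manager uses, and the helper names are invented
for the illustration. The second byte of a two-byte character can equal
an ordinary one-byte character, so a plain byte search can report a match
inside a kanji.

```python
def is_lead_byte(b):
    """True if byte value b starts a two-byte Shift-JIS character."""
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC

def find_aligned(haystack, needle):
    """Like bytes.find, but only tries offsets where a character starts."""
    i = 0
    while i + len(needle) <= len(haystack):
        if haystack.startswith(needle, i):
            return i
        # Step over a whole character, never into the middle of one.
        i += 2 if is_lead_byte(haystack[i]) else 1
    return -1

# The kanji U+8868 encodes as bytes 0x95 0x5C; 0x5C alone is a backslash.
haystack = "\u8868".encode("shift_jis") + b"\\"
print(haystack.find(b"\\"))            # naive byte search: 1, inside the kanji
print(find_aligned(haystack, b"\\"))   # character-aware search: 2
```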

It handles languages that justify text by adding extra white space
(like English) and languages that justify text by making the letters
wider (like Arabic). Think of the problems justifying a mixed
English-Arabic line.

It handles languages where the glyph denoted by a byte differs
depending on whether that character is at the beginning, middle, or end
of the word (for example, Hebrew's "mem" and "mem-sofeet"). In Arabic there
are many characters that look different depending on whether they are at
the beginning, middle, or end of the word. As you type, the previous
character is redrawn appropriately, and the current character is drawn.
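
A minimal sketch of the Hebrew case, assuming a modern Unicode string
rather than the one-byte codes of the time: the five Hebrew letters with
final forms are swapped for those forms when they end a word. The table
and the name shape are illustrative only; Arabic needs a fuller joining
table and is not attempted here.

```python
# Hebrew letters that have a distinct word-final form: kaf, mem, nun, pe, tsadi.
FINAL_FORM = {
    "\u05DB": "\u05DA",
    "\u05DE": "\u05DD",
    "\u05E0": "\u05DF",
    "\u05E4": "\u05E3",
    "\u05E6": "\u05E5",
}

def shape(text):
    """Return text with word-final letters replaced by their final forms."""
    out = []
    for i, ch in enumerate(text):
        at_word_end = i + 1 == len(text) or not text[i + 1].isalpha()
        out.append(FINAL_FORM.get(ch, ch) if at_word_end else ch)
    return "".join(out)

# Typing mem then vav: the mem is drawn in its final form while it is the
# last letter, then redrawn in its ordinary form once another letter follows.
typed = "\u05DE\u05D5"
for n in range(1, len(typed) + 1):
    print(shape(typed[:n]))
```

Re-running shape on the text typed so far is the crude equivalent of
redrawing the previous character on each keystroke.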

99% of non-wordprocessor application programs already call TextEdit to do text
handling for them. They become multi-lingual immediately when run on a Mac
that has had the appropriate national interface system file placed in the
system folder.

Word processing programs, and other text-handling programs that don't use
TextEdit, can call Script Manager directly.

Apple also provides a tool to let the informed user change the menu key
equivalents, all the menu & prompt text, and the windows containing the
text (translation usually makes strings longer) of their existing
binaries. (Compilers on Macintosh use a run-time linkage to dialogs and
strings, so they can be made larger safely.)

Apple sent free copies of all the national interface system files they've
published to all their developers on a CD ROM called "Phil & Dave's Excellent
CD."  An early version of the Script Manager documentation is contained in
Inside Macintosh Vol. 5. The current version of the Script Manager
documentation is on the CD. It is also available on paper from the Apple
Programmers and Developers Association (which everyone calls APDA).

As a developer, I get the feeling from Apple that they are serious about
this. That they want all the developers to make their software fully
compatible with Script Manager (most old programs don't call the script
manager to verify that they haven't matched a string in the middle of a
Japanese character.) I've tried their software, and it works. What kind of
free software have you gotten from your o.s. vendor lately?

> The mac is a detour in the inevitable march of mediocre computers.
> drs@bnlux0.bnl.gov (David R. Stampf)

--- David Phillip Oster          -master of the ad hoc odd hack. 

Keith Sproul, head of microcomputer support at Union Carbide, NJ, complained
about the poorly digitized fellatio on an IBM porno program. "Mac is better
on everything, and this is no exception."  -- "Computer Porn at the Office"
by Reese Erlich, _This_World_, S.F. Chronicle, p.8, Aug 13, 1989

Arpa: oster@dewey.soe.berkeley.edu
Uucp: {uwvax,decvax}!ucbvax!oster%dewey.soe.berkeley.edu