[comp.lang.perl] Tools for manipulating message catalogs

jtc@motcad.portal.com (J.T. Conklin) (04/08/91)

Are there any freely redistributable tools, perl scripts, etc. for maintaining
message catalogs?  I'm thinking of something that would automagically 
create a *.h to be included by a program and *.msg files for each locale
from one master text file.

Are there any freely redistributable message catalog libraries for old
systems which don't have them?  If not, are there any that can be licenced?

    --jtc

-- 
J.T. Conklin    jtc@motcad.portal.com, ...!portal!motcad!jtc

nazgul@alphalpha.com (Kee Hinckley) (04/09/91)

In article <1991Apr7.190119.24825@motcad.portal.com> jtc@motcad.portal.com (J.T. Conklin) writes:
>Are there any freely redistributable tools, perl scripts, etc. for maintaining
>message catalogs?  I'm thinking of something that would automagically 
>create a *.h to be included by a program and *.msg files for each locale
>from one master text file.
>
>Are there any freely redistributable message catalog libraries for old
>systems which don't have them?  If not, are there any that can be licenced?

Here's the README from a package I wrote for our product.  We're shipping our
product at the end of this month and I plan on cleaning the message catalog
stuff up and shipping it out to comp.sources.xxx after that, but if you want
a copy now just let me know.  This software has been run and tested on Apollo,
Sparc, MIPS, SCO, DECStation and other platforms.  The implementation is fast
and uses a minimum of memory, yet the on-disk structure is the same as the
in-memory one.  The gencat code is a bit crufty and could use new error handling
routines (not to mention using the msgcatalog itself!), but in general it's
pretty clean.

------
First a note on the copyright.  This is the same one used by the
X Consortium (fortunately no one has started copyrighting copyrights),
so if your lawyers don't mind that one, they shouldn't mind this one.
Simply put, you can do what you want with this, although if you are
so inclined we'd appreciate it if you sent us back any enhancements,
bug fixes, or other related material, so we can make it available to
everyone else.  But if you don't want to, you don't have to.  So be it.

So.  What's here?  It's an implementation of the Message Catalog System,
as described in the X/Open Portability Guide (XSI Supplementary Definitions,
X/Open Company, Ltd, Prentice Hall, Englewood Cliffs, New Jersey 07632,
ISBN: 0-13-685850-3).  Included is a version of gencat, to generate
message catalogs, as well as the routines catgets, catopen, and catclose.
There are also the beginnings of an X/Open compliant set of print routines,
but we'll talk about those later.

I haven't done a man page yet (sorry, but I've got a product to get out
the door, the pretty stuff has to come later).  However you can use the
definitions in the X/Open docs and it should all work.  I have, however,
added a series of pretty significant enhancements, particularly to gencat.

As follows:

Use: gencat [-new] [-or] [-lang C|C++|ANSIC] catfile msgfile [-h <header-file>]...

This version of gencat accepts a number of flags.
    -new	Erase the msg catalog and start a new one.
		The default behavior is to update the catalog with the
		specified msgfile(s).  This will instead cause the old
		one to be deleted and a whole new one started.
    -h <hfile>	Output identifiers to the specified header files.
		This creates a header file with all of the appropriate
		#define's in it.  Without this it would be up to you to
		ensure that you keep your code in sync with the catalog file.
		The header file is created from all of the previous msgfiles
		on the command line, so the order of the command line is
		important.  This means that if you just put it at the end of
		the command line, all the defines will go in one file
		    gencat foo.m bar.m zap.m -h all.h
		If you prefer to keep your dependencies down you can specify
		one after each message file, and each .h file will receive
		only the identifiers from the previous message file
		    gencat foo.m -h foo.h bar.m -h bar.h zap.m -h zap.h
		As an added bonus, if you run the following sequence:
		    gencat foo.m -h foo.h
		    gencat foo.m -h foo.h
		the file foo.h will NOT be modified the second time.  gencat
		checks to see if the contents have changed before modifying
		things.  This means that you won't get spurious rebuilds of
		your source every time you change a message.  You can thus use
		a Makefile rule such as:

		MSGSRC=foo.m bar.m
		GENFLAGS=-or -lang C
		GENCAT=gencat
		NLSLIB=nlslib/OM/C
		$(NLSLIB):	$(MSGSRC)
			@for i in $?; do cmd="$(GENCAT) $(GENFLAGS) $@ $$i -h `basename $$i .m`.h"; echo $$cmd; $$cmd; done

		foo.o:	foo.h

		The for-loop isn't too pretty, but it works.  For each .m
		file that has changed we run gencat on it.  foo.o depends on
		the result of that gencat (foo.h) but foo.h won't actually
		be modified unless we changed the order (or added new members)
		to foo.m.  (I hope this is clear, I'm in a bit of a rush.)
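The "don't touch foo.h unless it changed" behaviour can be sketched in a few lines of C.  This is only an illustration of the idea, not the package's code: the function name write_if_changed and its return convention are made up for the example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not taken from the package: write `text` to `path`
 * only when the file is missing or its current contents differ.
 * Returns 1 if the file was (re)written, 0 if left untouched, -1 on error. */
int write_if_changed(const char *path, const char *text)
{
    size_t want = strlen(text);
    FILE *fp = fopen(path, "rb");

    if (fp != NULL) {
        /* Read one byte more than expected so a longer file is detected. */
        char *have = malloc(want + 1);
        size_t got;
        int same;

        if (have == NULL) {
            fclose(fp);
            return -1;
        }
        got = fread(have, 1, want + 1, fp);
        same = (got == want && memcmp(have, text, want) == 0);
        free(have);
        fclose(fp);
        if (same)
            return 0;   /* timestamp is unchanged, so make sees nothing new */
    }

    fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    if (fwrite(text, 1, want, fp) != want) {
        fclose(fp);
        return -1;
    }
    return fclose(fp) == 0 ? 1 : -1;
}
```

Because the file's modification time only moves when the contents really differ, a make rule such as "foo.o: foo.h" fires only when an identifier or its value has changed.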

    -lang <l>	This governs the form of the include file.
		Currently supported is C, C++ and ANSIC.  The latter two are
		identical in output.  This argument is position dependent,
		you can switch the language back and forth in between include
		files if you care to.

    -or		This is a hack, but a real useful one.
		MessageIds are ints, and it's not likely that you are going
		to go too high there if you generate them sequentially.
		catgets takes a msgId and a setId, since you can have multiple
		sets in a catalog.  What -or does is shift the setId up to
		the high end of a long, and put the msgId in the low half.
		Assuming you don't go over half a long (usually 2 bytes 
		nowadays) in either your set or msg ids, this will work great.
		Along with this are generated several macros for extracting
		ids and putting them back together.  You can then easily
		define a macro for catgets which uses this single number
		instead of the two.  Note that the form of the generated
		constants is somewhat different here.
		
		Take the file aboutMsgs.m

		$ aboutMsgs.m
		$ OmegaMail User Agent About Box Messages
		$

		$set 4 #OmAbout

		$			 About Box message and copyrights
		$ #Message
		# Welcome to OmegaMail(tm)
		$ #Copyright
		# Copyright (c) 1990 by Alphalpha Software, Inc.
		$ #CreatedBy
		# Created by:
		$ #About
		# About...
		# A
		#
		#
		$ #FaceBitmaps
		# /usr/lib/alphalpha/bitmaps/%s

		Here is the output from: gencat foo aboutMsgs.m -h foo.h

		#define OmAboutSet      0x4
		#define OmAboutMessage  0x1
		#define OmAboutCopyright        0x2
		#define OmAboutCreatedBy        0x3
		#define OmAboutAbout    0x4
		#define OmAboutFaceBitmaps      0x8

		and now from: gencat -or foo aboutMsgs.m -h foo.h

		/* Use these Macros to compose and decompose setId's and msgId's */
		#ifndef MCMakeId
		# define MCMakeId(s,m)  (unsigned long)(((unsigned short)s<<(sizeof(short)*8))\
							|(unsigned short)m)
		# define MCSetId(id)    (unsigned int) (id >> (sizeof(short) * 8))
		# define MCMsgId(id)    (unsigned int) ((id << (sizeof(short) * 8))\
							>> (sizeof(short) * 8))
		#endif

		#define OmAboutSet      0x4
		#define OmAboutMessage  0x40001
		#define OmAboutCopyright        0x40002
		#define OmAboutCreatedBy        0x40003
		#define OmAboutAbout    0x40004
		#define OmAboutFaceBitmaps      0x40008
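As a rough sketch of how the combined ids might be used, here is a one-argument wrapper around the standard catgets().  The names make_id, set_of, msg_of and cat_msg are hypothetical and are not part of the generated header.  The sketch assumes 16 bits for each half and uses masks instead of the shift-left-then-right form, since that form would not strip the set bits on a machine where a long is wider than two shorts.

```c
#include <nl_types.h>

/* Assumed layout for this sketch: set id in bits 16..31,
 * message id in bits 0..15. */
#define ID_BITS 16
#define ID_MASK 0xFFFFUL

/* Combine a set id and a message id into one number. */
unsigned long make_id(unsigned int set, unsigned int msg)
{
    return (((unsigned long)set & ID_MASK) << ID_BITS)
         | ((unsigned long)msg & ID_MASK);
}

/* Extract the two halves again. */
unsigned int set_of(unsigned long id)
{
    return (unsigned int)((id >> ID_BITS) & ID_MASK);
}

unsigned int msg_of(unsigned long id)
{
    return (unsigned int)(id & ID_MASK);
}

/* Hypothetical wrapper: look a message up by its combined id.
 * dflt is returned by catgets() when the message cannot be found. */
char *cat_msg(nl_catd catd, unsigned long id, const char *dflt)
{
    return catgets(catd, (int)set_of(id), (int)msg_of(id), dflt);
}
```

With the header above, a call would look like cat_msg(catd, OmAboutFaceBitmaps, "/usr/lib/alphalpha/bitmaps/%s"); the value 0x40008 splits back into set 4, message 8.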


Okay, by now, if you've read the X/Open docs, you'll see I've made
a bunch of other extensions to the format of the msg catalog as well.
Note that you don't have to use any of these and, with one exception,
they are all compatible with the standard format.

$set 4 #OmAbout
    In the standard the third argument is a comment.  Here if the
    comment begins with a # then it is used to generate the setId constant
    (with the word "Set" appended).  This constant is also prepended onto
    all of the msgId constants for this set.  Anything after the first
    token is treated as a comment.

$ #Message
    As with set, I've modified the comment to indicate an identifier.
    There are cleaner ways to do this, but I was trying to retain a
    modicum of compatibility.  The identifier after # will be retained
    and used as the identifier for the next message (unless overridden
    before we get there).  If a message has no previous identifier then
    no identifier is generated in the include file (I use this quite a
    bit myself, the first identifier is a Menu item, the next three are
    accelerator, accelerator-text and mnemonic - I don't need identifiers
    for them, I just add 1, 2 and 3).

# Welcome to OmegaMail(tm)
    Finally the one incompatible extension.  If a line begins with #
    a msgId number is automatically generated for it by adding one to
    the previous msgId.  This wouldn't have been useful in the standard,
    since it didn't generate include files, but it's wonderful for this
    version.  It makes it easy to reorder the message file to put things
    where they belong and not have to worry about renumbering anything (although
    of course you'll have to recompile).
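The numbering rules described above can be sketched roughly as follows.  This is an illustration written from the description, not gencat's actual parser: the struct and function names are invented, and it ignores continuation lines, $delset, $quote and the rest of the standard syntax.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* One generated constant: its full name and its numeric value. */
struct ident { char name[128]; int id; };

/* Hypothetical sketch: walk the lines of a message file and collect the
 * identifiers that would go into the header.  Returns how many were stored. */
size_t number_messages(const char *const *lines, size_t nlines,
                       struct ident *out, size_t max)
{
    char prefix[64] = "", pending[64] = "";
    int next = 1;                 /* id a '#' message line would receive */
    size_t n = 0, i;

    for (i = 0; i < nlines && n < max; i++) {
        const char *line = lines[i];
        int id;

        if (strncmp(line, "$set", 4) == 0) {
            int set;
            prefix[0] = pending[0] = '\0';
            /* "$set 4 #OmAbout": the #token names the set. */
            if (sscanf(line, "$set %d #%63s", &set, prefix) == 2) {
                snprintf(out[n].name, sizeof out[n].name, "%sSet", prefix);
                out[n++].id = set;
            }
            next = 1;             /* numbering restarts in each set */
            continue;
        }
        if (strncmp(line, "$ #", 3) == 0) {
            /* "$ #Message": remember the name for the next message. */
            pending[0] = '\0';
            sscanf(line, "$ #%63s", pending);
            continue;
        }
        if (line[0] == '#')
            id = next;            /* automatic: previous id plus one */
        else if (isdigit((unsigned char)line[0]))
            id = atoi(line);      /* standard, explicitly numbered message */
        else
            continue;             /* plain comment or blank line */
        next = id + 1;

        if (pending[0] != '\0') {
            snprintf(out[n].name, sizeof out[n].name, "%s%s", prefix, pending);
            out[n++].id = id;
            pending[0] = '\0';
        }
    }
    return n;
}
```

Fed the aboutMsgs.m lines from the README, this yields OmAboutSet 4, OmAboutMessage 1, OmAboutCopyright 2, OmAboutCreatedBy 3, OmAboutAbout 4 and OmAboutFaceBitmaps 8; the three unnamed "#" lines after About consume ids 5 to 7 without producing a constant.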

That's about all for that.

Now, what about the print routines?  These are embarrassing.  They are
a first pass.  They support only %[dxo] and %[s], although they support
*all* of the modifiers on those arguments (I had no idea there were
so many!).  They also, most importantly, support the position arguments
that allow you to reference arguments out of order.  There's a terrible
hack macro to handle varargs which I wrote because I wasn't sure if it
was okay to pass the address of the stack to a subroutine.  I've since
seen supposedly portable code that in fact does this, so I guess it's
okay.  If that's the case the code could become a lot simpler.  I welcome
anyone who would like to fix it up.  I just don't know when I'll get
the chance; it works, it's just ugly.
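For readers who haven't met position arguments, here is a small illustration.  It uses the C library's own snprintf() on a system that supports the X/Open "%n$" syntax, not the print routines in this package (whose interface isn't described here); format_location is a made-up helper name.

```c
#include <stdio.h>

/* Hypothetical helper: format a "file, line" report using a format string
 * that would normally come from the message catalog.  The arguments are
 * always passed in the same order; the format decides the output order. */
int format_location(char *buf, size_t size, const char *fmt,
                    const char *file, int line)
{
    return snprintf(buf, size, fmt, file, line);
}
```

A catalog for one language can supply "%1$s:%2$d" and another "line %2$d of %1$s"; the calling code stays the same, which is the point of letting a translator reorder the arguments.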


One last comment.  You probably want to know how reliable it is.  I've tested
the print routines pretty well.  I've used the msgcat routines intensely,
but I haven't exercised all of the options in the message catalog file
(like all of the \ characters) although I have implemented them all.
I'm pretty confident that all the basic stuff works, beyond that it's
possible that there are bugs.  As for portability, I've run it under
BSD4.3 (Apollo) and SYSV-hybrid (SCO).  (And I never want to see the
words "System V with BSD extensions" again in my life.)  I don't believe
that there are any heavy dependencies on Unix, although using another
system would probably require #ifdef's.

I apologize for the state of the documentation, the lack of comments,
the lack of testing, and all of the other things.  This project is
subsidiary to my primary goal (Graphical Email for Unix) and I'm afraid
I haven't been able to spend the time on it that I would have liked.
However I'll happily answer any questions that you may have, and will
be glad to serve as a distribution point for future revisions.  So if
you make any changes or add more X/Open functions, send me a copy and
I'll redistribute them.

Best of luck!
				    Kee Hinckley
				    September 12, 1990
-- 
Alfalfa Software, Inc.          |       Poste:  The EMail for Unix
nazgul@alfalfa.com              |       Send Anything... Anywhere
617/646-7703 (voice/fax)        |       info@alfalfa.com

I'm not sure which upsets me more: that people are so unwilling to accept
responsibility for their own actions, or that they are so eager to regulate
everyone else's.

eliot@chutney.rtp.dg.com (Topher Eliot) (04/10/91)

On more than one occasion I have seen references to tools that accept as
input a message catalog that contains symbolic names (rather than numbers) 
for the messages, and produce as output new message catalogs containing
numbers (or perhaps compiled message catalogs) and .h files containing the
appropriate #define lines mapping the symbolic names to the numbers. I am
here to argue that these are a Bad Thing.

At first glance, they seem great.  Who wants to keep track of a bunch of small
integers, when they can use symbolic identifiers instead?  I mean, we figured
that out back when the first assembler was written.

The problem is the context in which such a tool is being used.  One of the 
main points of symbolic names (especially #defined "constants") is that they
allow one to change the numeric value of the "constant" without having to
edit all the source files.  Thus, for example, one could add a new message
to the middle of a message catalog, rebuild everything, and it would all be
in sync.  Or so it would appear.

But what's the point of message catalogs?  The point is that you don't just
have one, you have lots, in all different languages.  Creating new versions
of those translated catalogs is NOT just a matter of rebuilding.  They have
to be sent off to translators, and then reincorporated into the product
distribution after being translated.  They may arrive at different times
(getting something translated into French can probably be done locally;
Serbo-Croatian is more of a challenge).  Depending on how you distribute them
to customers, they may or may not arrive in sync with the new executable
code.  Customers may or may not load all the new message catalogs.  On and on.
All in all, keeping message catalogs synchronized with programs that use
them is a real bitch.

The moral of this is that ONE SHOULDN'T DO THINGS THAT REQUIRE MAINTAINING
SYNCHRONIZATION BETWEEN THE APPLICATION AND THE MESSAGE CATALOG, like
inserting a new message into the middle of an existing message catalog.
One should only add new messages to the end of a catalog.  If a message is
no longer required, its place should be filled with a zero-length message
(or just not be used, depending on whether you are using AT&T or X/Open message
facilities).  That slot (message number) should not be re-used for a different
message.

Given these guidelines, the usefulness of the tool I described above is much
less than one might initially think.  In fact, I argue that such a tool tempts
one to break the guidelines, or perhaps I should say makes it easy to break
the guidelines without realizing that one has done so.  Without the tool,
one writes the message number right into the application program, and leaves
that value there forever.  Which is exactly what one should do.  Presumably
if one types in the wrong number, this error will be discovered early on in
testing.  (You do, after all, test each possible message usage, don't you? :-)

To reiterate:  when one is writing an application, every time one creates a
new message, it should be added to the message catalog, a message number should
be created for it, that message number should be hard-coded into the
application source code, and then it should stay that way until doomsday.
You should never WANT automatic numbering of your messages.

Some people may point out that using symbolic identifiers for messages allows
a reader of the source code to figure out easily what the message is, rather
than having to flip back and forth through a message catalog.  I would counter
that the source code is supposed to have a compiled-in default message
anyway, to cover those occasions when the message catalog is for some reason
unavailable.  Given the default message, a symbolic message identifier
doesn't add much.
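A minimal sketch of that style, using the standard catgets() call: the message number is written directly into the source and the English text rides along as the default.  The set number, message number and function name are invented for the example.

```c
#include <nl_types.h>

/* Set and message numbers are hard-coded and, once assigned, never change.
 * (Set 1, message 3 is an arbitrary choice for this sketch.) */
const char *cant_open_msg(nl_catd catd)
{
    /* The last argument is returned when the catalog is unavailable or
     * the message is missing, so the program still says something. */
    return catgets(catd, 1, 3, "cannot open file");
}
```

If catopen() failed and returned (nl_catd)-1, the call simply hands back the compiled-in English string, which also tells the reader of the source what message 3 of set 1 is supposed to say.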

Whew.  And so early in the morning, too.

Have I made my point clear?  Would anyone care to point out flaws in my logic?
Does anyone still think that a tool to create a .h file out of a message
catalog is useful?

-- 
Topher Eliot                           Data General DG/UX Internationalization
(919) 248-6371        62 T. W. Alexander Dr., Research Triangle Park, NC 27709
eliot@dg-rtp.dg.com                           {backbone}!mcnc!rti!dg-rtp!eliot
Obviously, I speak for myself, not for DG.

worley@compass.com (Dale Worley) (04/10/91)

   From: nazgul@alphalpha.com (Kee Hinckley)

   First a note on the copyright.  This is the same one used by the
   X Consortium (fortunately no one has started copyrighting copyrights),

Well, the GPL has a copyright -- you can reuse it but you can't change
it:

		 GNU EMACS GENERAL PUBLIC LICENSE
		    (Clarified 11 Feb 1988)

 Copyright (C) 1985, 1987, 1988 Richard M. Stallman
 Everyone is permitted to copy and distribute verbatim copies
 of this license, but changing it is not allowed.  You can also
 use this wording to make the terms for other programs.

Dale

Dale Worley		Compass, Inc.			worley@compass.com
--
Klein bottle for sale ... inquire within.

composer@chem.bu.edu (Jeff Kellem) (04/11/91)

In article <1991Apr10.145552.7955@uvaarpa.Virginia.EDU> worley@compass.com
	(Dale Worley) writes:
 >    From: nazgul@alphalpha.com (Kee Hinckley)
 >
 >    First a note on the copyright.  This is the same one used by the
 >    X Consortium (fortunately no one has started copyrighting copyrights),
 >
 > Well, the GPL has a copyright -- you can reuse it but you can't change
 > it:

But, the GPL is a license, not a copyright.  ;-)

		-jeff

Jeff Kellem
Internet: composer@chem.bu.edu

nazgul@alphalpha.com (Kee Hinckley) (04/11/91)

In article <1991Apr10.122642.3991@dg-rtp.dg.com> eliot@dg-rtp.dg.com writes:
>Serbo-Croatian is more of a challenge).  Depending on how you distribute them
>to customers, they may or may not arrive in sync with the new executable
>code.  Customers may or may not load all the new message catalogs.  On and on.
>All in all, keeping message catalogs synchronized with programs that use
>them is a real bitch.
...
>Have I made my point clear?  Would anyone care to point out flaws in my logic?
>Does anyone still think that a tool to create a .h file out of a message
>catalog is useful?

Absolutely.  First of all you're trying to protect me from myself.
I'd rather establish conventions to do that than require it in the
code.  While you could argue that the code readability doesn't change,
your mechanism makes it more likely that people will enter the wrong
number in the code, and thus get the wrong error message.  Furthermore
it makes it impossible to reorganize the message catalog.  I have
some very large catalogs where I like to organize things (like having
all of the "File" menu items in one place).  If the numbers are
hardcoded into my program I can never do this, even when I know it's
safe.  So the readability of the message catalog gets much worse over
time.  I'm clearly on the other side of the fence from you on this
one.  Not only does my gencat create .h files, it also allows me to
create a catalog that doesn't specify any numbers at all - it generates
them automatically.

I agree that there is potential for screw ups here, particularly since
people get real annoyed every time they modify the catalog, get new
header files and have to rebuild (that's the reason I make my gencat
compare the new and old header files and only update if the actual
values/names have changed).  I don't think that the danger here, however,
outweighs the advantages.  In addition there is a way to, if not prevent
the problem, at least spot it.  Simply have a convention, as a user
of message catalogs, that messageId #1 is a version number.  Every
time you make an incompatible change to the catalog, change the version
number.  Have your application check the version number and complain
if it doesn't match.
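A rough sketch of such a check, assuming the version is stored as a string in message 1 of set 1 (the set number, the version value and the function name are all illustrative choices, not part of any package):

```c
#include <nl_types.h>
#include <string.h>

/* Bumped by hand on every incompatible change to the catalog. */
#define CATALOG_VERSION "3"

/* Hypothetical check: returns 1 if the catalog's version message matches
 * the version this executable was built against, 0 otherwise.  A missing
 * catalog yields the empty default and therefore counts as a mismatch. */
int catalog_matches(nl_catd catd)
{
    const char *have = catgets(catd, 1, 1, "");
    return strcmp(have, CATALOG_VERSION) == 0;
}
```

What the application does on a mismatch (complain, or fall back to its compiled-in defaults) is a separate policy decision.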

All that said, I don't think there is so much a "flaw" in your arguments,
as a matter of preference and tradeoffs.  Even if gencat does create
header files, you are certainly under no obligation to use them, so it
seems to me we can both coexist happily (so long as I never have to
use any of your code :-).

eliot@chutney.rtp.dg.com (Topher Eliot) (04/12/91)

In article <1991Apr11.084924.1951@alphalpha.com>, nazgul@alphalpha.com (Kee Hinckley) writes:
|> Absolutely.  First of all you're trying to protect me from myself.
No, I'm trying to protect other people who aren't experienced in the pains of
distributing message catalogs from the mistakes your* tools might tempt them to
make.  Obviously anybody with enough experience with internationalization
to create new tools for it isn't going to make foolish mistakes.  But others
may well.

*It was you who posted the note about the new tools, right?

|> I'd rather establish conventions to do that than require it in the
|> code.  While you could argue that the code readability doesn't change,
|> your mechanism makes it more likely that people will enter the wrong
|> number in the code, and thus get the wrong error message.  
Yes, they may be more likely to get it wrong once, during development, where
it will be found during an initial pass of testing.  But then it will be gotten
right, and it will then stay right.  Your system makes it easier for the
numbers to become wrong later on, in ways that are harder to detect and keep
track of.

|> Furthermore
|> it makes it impossible to reorganize the message catalog.  
Nothing is impossible.  My approach makes it appropriately difficult.  You
SHOULD be reluctant to reorganize your catalog.

|> I have
|> some very large catalogs where I like to organize things (like having
|> all of the "File" menu items in one place).  
I would have thought that message sets would fill this need very nicely.
Note that everywhere that I said "add new messages only to the end of the
catalog", I really should have said "add new messages only to the ends of
sets, and add new sets only to the end of the catalog".
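Where identifier tables exist (for instance, from two generations of a generated header), the append-only rule can be checked mechanically.  The following is only a sketch under that assumption; struct slot and count_renumbered are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* One message slot: symbolic name plus its set and message numbers. */
struct slot { const char *name; int set; int msg; };

/* Count the identifiers from the old table that are missing from the
 * current table or now carry a different (set, msg) pair.  Entries that
 * exist only in the current table are additions and are not counted. */
size_t count_renumbered(const struct slot *old, size_t nold,
                        const struct slot *cur, size_t ncur)
{
    size_t bad = 0, i, j;

    for (i = 0; i < nold; i++) {
        int kept = 0;
        for (j = 0; j < ncur; j++) {
            if (strcmp(old[i].name, cur[j].name) == 0) {
                kept = (old[i].set == cur[j].set &&
                        old[i].msg == cur[j].msg);
                break;
            }
        }
        if (!kept)
            bad++;
    }
    return bad;
}
```

A result of zero means every previously shipped number still refers to the same message, so an older translated catalog remains usable with the new executable.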

...
|> Not only does my gencat create .h files, it also allows me to
|> create a catalog that doesn't specify any numbers at all - it generates
|> them automatically.
So instead of having to type in numbers, you have to type in symbolic names.
Why is this any easier?  Symbolic names are an advantage when you want to be
able to change the underlying value later on.  I claim that with message
numbers, you shouldn't change those values!  These numbers should be CONSTANTS!

|> I agree that there is potential for screw ups here, particularly since
|> people get real annoyed every time they modify the catalog, get new
|> header files and have to rebuild ...
You won't hear any complaints from me about rebuilding.  That isn't my point.
But your "rebuild the .h file only if necessary" feature is a nice touch.

...
|> In addition there is a way to, if not prevent
|> the problem, at least spot it.  Simply have a convention, as a user
|> of message catalogs, that messageId #1 is a version number.  Every
|> time you make an incompatible change to the catalog, change the version
|> number.  Have your application check the version number and complain
|> if it doesn't match.
In other words, protect yourself from yourself.  Isn't this what you were
complaining about?  Why build a tool that makes a certain class of errors
more likely, and then invent a convention to try to head off those errors?
Do you really think that the class of programmers that are likely to screw
up message catalogs is the same class of programmers that will diligently
put this checking code into their applications?  I don't.

Moreover, such a convention would make it so that if the developer added
one message to the end of a catalog, and distributed new executables and
new English-language catalogs on January 1st, and the Serbo-Croatian
catalog didn't get distributed until March 1st, the Serbo-Croatians would
not get to use the new application and their own language catalog for two
months.  With my approach, they would only see the one new message in English
for those two months.  The old Serbo-Croatian catalog would serve just fine
for all the other messages.

|> All that said, I don't think there is so much a "flaw" in your arguments,
|> as a matter of preference and tradeoffs.  Even if gencat does create
|> header files, you are certainly under no obligation to use them, so it
|> seems to me we can both coexist happily (so long as I never have to
|> use any of your code :-).
Ah, but who will maintain the code that you have written this way?  I assert
that code developed using my approach will be much less of a headache to
maintain and support than will be code developed and maintained using your
tools.  Yes, the first time around, during development, your approach is
easier to use.  Once that first translated catalog goes out to customers,
my approach is much more robust.


Meanwhile, erik@srava.sra.co.jp (Erik M. van der Poel) says:
|> Kee Hinckley writes:
|> > While you could argue that the code readability doesn't change,
|> > your mechanism makes it more likely that people will enter the wrong
|> > number in the code, and thus get the wrong error message.
|> 
|> Instead of treating the symptoms, we should try to cure the disease. 
|> Using numbers for the message ids was a bad idea in the first place. 
|> (Thank goodness XPG3 and AT&T's specs are not International
|> Standards.)
|> 
|> Wouldn't it be possible to create a reasonably efficient
|> implementation using hashing and caching with symbolic names instead
|> of numeric ids? Then we can add/delete/modify messages at will. We
|> should leave numbering and counting to the computer.

This sounds even better to me.  After I posted my note, I heard that Uniforum
has advanced a proposal exactly along these lines.  Does anyone have any
specifics on it?
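To make the hashing suggestion concrete, here is a small in-memory sketch of a catalog keyed by symbolic name.  It is not the Uniforum proposal (whose details aren't given here) and not any existing library; all names are invented, and the table holds caller-owned entries to keep the example short.

```c
#include <stddef.h>
#include <string.h>

#define NBUCKETS 64

/* One message: symbolic name, translated text, chain link. */
struct entry { const char *key; const char *text; struct entry *next; };
struct catalog { struct entry *bucket[NBUCKETS]; };

/* Simple multiplicative string hash (djb2 style). */
static unsigned long hash_name(const char *s)
{
    unsigned long h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Link a caller-owned entry into the table. */
void cat_add(struct catalog *c, struct entry *e)
{
    unsigned long b = hash_name(e->key) % NBUCKETS;
    e->next = c->bucket[b];
    c->bucket[b] = e;
}

/* Look a message up by name; return dflt if it is not in the catalog. */
const char *cat_lookup(const struct catalog *c, const char *key,
                       const char *dflt)
{
    const struct entry *e = c->bucket[hash_name(key) % NBUCKETS];
    for (; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0)
            return e->text;
    return dflt;
}
```

With names as keys, inserting or reordering messages cannot shift anything, and a catalog that lacks a new message just falls back to the default for that one name.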

peter@ficc.ferranti.com (Peter da Silva) (04/13/91)

In article <1991Apr10.122642.3991@dg-rtp.dg.com> eliot@dg-rtp.dg.com writes:
> Some people may point out that using symbolic identifiers for messages allows
> a reader of the source code to figure out easily what the message is, rather
> than having to flip back and forth through a message catalog.

It also lets him more easily verify that the right message is being generated
at each point.

> Have I made my point clear?  Would anyone care to point out flaws in my logic?
> Does anyone still think that a tool to create a .h file out of a message
> catalog is useful?

Sure, as long as it's only run on the master message catalog that's kept
in the programmer's native language, and that new messages are only added
at the end. Translations are done on processed copies with fixed message
numbers.

It's a tool. Like any, it can be abused. That doesn't mean it's not useful.
-- 
Peter da Silva.  `-_-'  peter@ferranti.com
+1 713 274 5180.  'U`  "Have you hugged your wolf today?"

preece@urbana.mcd.mot.com (Scott E. Preece) (04/14/91)

In article <1991Apr10.122642.3991@dg-rtp.dg.com> eliot@chutney.rtp.dg.com (Topher Eliot) writes:
| To reiterate:  when one is writing an application, every time one creates a
| new message, it should be added to the message catalog, a message number should
| be created for it, that message number should be hard-coded into the
| application source code, and then it should stay that way until doomsday.
| You should never WANT automatic numbering of your messages.
|...
| Have I made my point clear? Would anyone care to point out flaws in my logic?
| Does anyone still think that a tool to create a .h file out of a message
| catalog is useful?
---
Well, actually, I still think the use of symbolic names makes code
reading easier (and code reading is critical to delivered quality).  I
also think that use of symbolic names is in no way counter to the
principle of never changing existing message number assignments; just
don't do it.  Finally, an automatic tool would be useful for the
important case of the first version of a program, even if subsequent
versions need to be maintained manually (though, actually, I think the
synchronization problem would be better addressed in other ways, like
keeping the source for the message catalog in a development environment
that linked it to the code and generated the new catalog as part of the
normal release process; I tend to think that the safest way to keep
things synchronized is to always reissue them together (surely you're
going to test the whole message catalog when you release a new version,
anyway, right? :-)).

scott
--
scott preece
motorola/mcg urbana design center	1101 e. university, urbana, il   61801
uucp:	uunet!uiucuxc!udc!preece,	 arpa:	preece@urbana.mcd.mot.com
phone:	217-384-8589			  fax:	217-384-8550

nazgul@alphalpha.com (Kee Hinckley) (04/15/91)

In article <1991Apr12.122701.9545@dg-rtp.dg.com> eliot@dg-rtp.dg.com writes:
>*It was you who posted the note about the new tools, right?
I plead guilty :-).

>|> all of the "File" menu items in one place).  
>I would have thought that message sets would fill this need very nicely.
Quite right.  I hadn't considered using them at that level of granularity.

>|> of message catalogs, that messageId #1 is a version number.  Every
>|> time you make an incompatible change to the catalog, change the version
>|> number.  Have your application check the version number and complain
>|> if it doesn't match.
>In other words, protect yourself from yourself.  Isn't this what you were
>complaining about?  Why build a tool that makes a certain class of errors
Not really.  The difference is that this is a restriction I can institute
if I feel it is needed, as opposed to one forced on me by the implementation
of the tools.  But if this were the only issue I wouldn't mind.

>Do you really think that the class of programmers that are likely to screw
>up message catalogs is the same class of programmers that will diligently
>put this checking code into their applications?  I don't.
I can't argue with this.

>Moreover, such a convention would make it so that if the developer added
>one message to the end of a catalog, and distributed new executables and
>new English-language catalogs on January 1st, and the Serbo-Croatian
>catalog didn't get distributed until March 1st, the Serbo-Croatians would
>not get to use the new application and their own language catalog for two
>months.  With my approach, they would only see the one new message in English
>for those two months.  The old Serbo-Croatian catalog would serve just fine
>for all the other messages.
This is an interesting point, because it makes me realize another way
that people can misuse symbolic numbers.  You see it turns out that they
are terrifically convenient to use as identifiers for other things.  For
instance I use them to identify my menubutton objects.  Based on the
identifier I execute the appropriate action.  If I had to do a string
compare I wouldn't do it that way.  But furthermore, I use the identifier
(plus a fixed count) to find the accelerator, mnemonic and string-form
of the accelerator in the catalog.  This leads to an unfortunate side effect,
namely, there are no fallback strings for those values.  Without the
catalog the program is usable, but not full-function.  So yes, you are right;
numeric identifiers can be abused.  I won't know whether the tradeoffs
I'm making are worth it until we start shipping lots of different language
versions - but it's definitely something to think about.

>Ah, but who will maintain the code that you have written this way?  I assert
>that code developed using my approach will be much less of a headache to
>maintain and support than will be code developed and maintained using your
I'm not sure, I still think it's too easy to get burned by the runtime
typing.  How do you verify that all of your strings in fact correspond
to message catalog symbols?  That issue and the speed/memory issues are
my major concerns.

-- 
Alfalfa Software, Inc.          |       Poste:  The EMail for Unix
nazgul@alfalfa.com              |       Send Anything... Anywhere
617/646-7703 (voice/fax)        |       info@alfalfa.com

I'm not sure which upsets me more: that people are so unwilling to accept
responsibility for their own actions, or that they are so eager to regulate
everyone else's.

eliot@chutney.rtp.dg.com (Topher Eliot) (04/16/91)

In article <PREECE.91Apr13223807@etude.urbana.mcd.mot.com>, preece@urbana.mcd.mot.com (Scott E. Preece) writes:
|> ...  Finally, an automatic tool would be useful for the
|> important case of the first version of a program, even if subsequent
|> versions need to be maintained manually (though, actually, I think the
|> synchronization problem would be better addressed in other ways, like
|> keeping the source for the message catalog in a development environment
|> that linked it to the code and generated the new catalog as part of the
|> normal release process; I tend to think that the safest way to keep
|> things synchronized is to always reissue them together 
Sure, in an ideal world I would like to have one huge build process that
starts with my source archives for everything, and burps out a tape at the
far end that I can ship to any customer, anywhere in the world, and have it
work correctly, no matter what earlier versions of software they have on their
machine.  I have yet to see any company that actually implements such
a system, and I've worked at some of the largest computer manufacturers in
the world.  The realities of getting things translated into other languages
are horrendous.  Suppose you have a new executable and new catalog that you
have built using one of these tools that generates message numbers
automatically.  You HOPE that the numbers on existing messages haven't been
accidentally changed, but unless you build another tool to verify that, you
don't know.  You're all set to ship, complete with translated catalogs in
seven of the eight languages you support, when civil war breaks out in upper
Lithuania, and your Lithuanian translator patriotically returns to defend the
motherland :-).  You're left one language short in your catalogs.  What are
your choices?  
1) Hold all your shipments until you have a complete set of message catalogs;
2) Ship to all customers except those that use Lithuanian, and ship those
later (glad I'm not in charge of THAT operation :-);
3) Ship to everyone and hope that the old Lithuanian catalog will work ok
with the new executable;
4) Heave a sigh, and say "boy, dealing with crufty old numeric message
identifiers sure has been tacky all this time, but now we can ship this tape
even to the Lithuanians, and still sleep tonight knowing that it's very
unlikely that we've screwed up the numbers of any old messages.  Good thing
we followed Topher Eliot's advice and didn't use tools that automatically
renumber them.  Let's hire him as a high-priced consultant" :-)

I grant that you actually suggested only using the tool during the initial
development, and then switching to manual numbering.  However, I haven't seen
the tools being promoted that way, and I don't know of any tool that will
generate a .c file that is a hard-numbered version of your source, for use
after the first release.  They all assume you will continue to use the tool
during each build.

|> (surely you're
|> going to test the whole message catalog when you release a new version,
|> anyway, right? :-)).
In Norwegian?  And Portuguese? and, and?  Surely you're kidding.  I mean,
think about it -- to do such testing, you need to know both the program being
tested, and the language.  I don't know anybody who will be able to do a
really good job of testing their application in all the languages for which
message catalogs will be provided (unless, of course, "all" is just
English :-).   Doing it just once is a big task; doing it all over again
every time you release an update would be outrageously expensive.  

You may say "well, have a developer and a translator sit side by side and
run through the test suite".  What if the translator is an ocean away?
You may say "well, everyone really should have automated regression tests
anyway", but those automated tests will need catalogs listing the expected
output from the program, and we're right back where we started, i.e. trying
to keep message catalogs in sync with the executable.

So I'll say it again:  making sure that an application program and all the
different language catalogs available for it correspond correctly is a very
error-prone process, particularly when dealing with updating things in the
field.  We should do everything we can to make sure they don't get out of
sync, including gritting our teeth and using numbers instead of those
oh-so-nice automatic numbering tools.

Something just occurred to me:  how about if the automatic numbering tool
knew enough about the source archiving system (SCCS, RCS, or whatever) so
that it could compare the latest version of the catalog against all previous
versions, to make sure that no incompatibilities were being introduced?  This
might require some special flagging of messages to indicate that you really
did intend to change them, but the tool could catch egregious errors, such
as bumping all the message numbers by one.  This would go a long way towards
keeping everyone happy, wouldn't it?
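
The comparison itself is small once each catalog version has been reduced
to a list of (symbol, number) pairs.  A minimal sketch in C under that
assumption; `msg_entry` and `count_incompatible` are made-up names for
illustration, not part of any existing tool, and pulling the old versions
out of SCCS or RCS is left out entirely:

```c
#include <stddef.h>
#include <string.h>

/* One catalog entry: a symbolic name and the number assigned to it.
 * Hypothetical type, for illustration only. */
struct msg_entry {
    const char *name;
    int number;
};

/* Count the entries of the old catalog that were renumbered or deleted
 * in the new one.  Entries that exist only in the new catalog are plain
 * additions and are not counted. */
size_t count_incompatible(const struct msg_entry *old_cat, size_t n_old,
                          const struct msg_entry *new_cat, size_t n_new)
{
    size_t bad = 0;

    for (size_t i = 0; i < n_old; i++) {
        int found = 0;
        for (size_t j = 0; j < n_new; j++) {
            if (strcmp(old_cat[i].name, new_cat[j].name) == 0) {
                found = 1;
                if (old_cat[i].number != new_cat[j].number)
                    bad++;              /* same symbol, different number */
                break;
            }
        }
        if (!found)
            bad++;                      /* symbol disappeared */
    }
    return bad;
}
```

A nonzero result would be the cue to refuse the build or print a warning.
Bumping every number by one shows up as one count per old message.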

-- 
Topher Eliot                           Data General DG/UX Internationalization
(919) 248-6371        62 T. W. Alexander Dr., Research Triangle Park, NC 27709
eliot@dg-rtp.dg.com                           {backbone}!mcnc!rti!dg-rtp!eliot
Obviously, I speak for myself, not for DG.

peter@ficc.ferranti.com (Peter da Silva) (04/16/91)

In article <1991Apr12.122701.9545@dg-rtp.dg.com> eliot@dg-rtp.dg.com writes:
> Why is this any easier? Symbolic names are an advantage when you want to be
> able to change the underlying value later on. I claim that with message
> numbers, you shouldn't change those values! These numbers should be CONSTANTS!

Like these constants?

	#define PI 3.141592653589 /* values from memory... apologies if */
	#define E 2.718281828459 /* they're incorrect */
	...

It's pointless making them symbolics, because they're not going to change.
Quick, what's the numeric value for ENOMEM? SIGPWR? TIOCSETC?

Symbolic names are an advantage to the person writing and debugging the
program, because they reduce the number of meaningless magic numbers they
need to track.
-- 
Peter da Silva.  `-_-'  peter@ferranti.com
+1 713 274 5180.  'U`  "Have you hugged your wolf today?"

hansm@cs.kun.nl (Hans Mulder) (04/16/91)

In <1991Apr15.170901.18836@dg-rtp.dg.com> eliot@chutney.rtp.dg.com (Topher Eliot) writes:
>Sure, in an ideal world I would like to have one huge build process that
>starts with my source archives for everything, and burps out a tape at the
>far end that I can ship to any customer, anywhere in the world, and have it
>work correctly, no matter what earlier versions of software they have on their
>machine.  I have yet to see any company that actually implements such
>a system, and I've worked at some of the largest computer manufacturers in
>the world.  The realities of getting things translated into other languages
>are horrendous.

You don't really expect a company to delay the release of their latest product
until they are able to bundle with it free message catalogs in a dozen major
languages, do you?

In the real world the product is shipped as soon as the English version of the
product is ready.  Simultaneously the message catalog (but not the executable)
is sent to the translator.  Non-English versions of the product hit the market
6 months later than the English version, at the earliest.  And they are sold
as separate products.

And combining the correct version of the executable with a translated message
catalog is really trivial, compared to the problem of getting the catalog
translated in the first place.

--
Hans Mulder	hansm@cs.kun.nl

rschwartz@OFFICE.WANG.COM (R. Schwartz@Wang R&D Net) (04/16/91)

eliot@chutney.rtp.dg.com (Topher Eliot) writes:

>  (much omitted)
>
> The moral of this is that ONE SHOULDN'T DO THINGS THAT REQUIRE MAINTAINING
> SYNCHRONIZATION BETWEEN THE APPLICATION AND THE MESSAGE CATALOG, like
> inserting a new message into the middle of an existing message catalog.
>
> (more omitted)
>
> You should never WANT automatic numbering of your messages.
>
> (still more omitted)
>
> Have I made my point clear?  Would anyone care to point out flaws in my logic?
> Does anyone still think that a tool to create a .h file out of a message
> catalog is useful?

      YES!!!   Your point is clear.
      YES!!!   I absolutely insist that generating .h files is required.

The flaws are not in your logic.  The flaws are in your assumptions about the
tools that should be used to synchronize code and messages when they reside
in separate files.  I.e., you presume that there are no such tools, and I
grant that it is normal for there to be none.  The dangers that you point out
are completely valid, and your point that these dangers are exacerbated by
the logistics involved in sending materials hither and yon for translation
is well taken.  But the solution isn't to make a bad software engineering
decision.  Invent the right tools instead!

The use of mnemonic names in message catalogs is an absolute necessity in
any application other than trivial toys.  Most of the benefits are too
obvious to mention.  One that bears special attention is the ability to
re-organize multiple catalogs without re-numbering.  If the run-time
organization of code changes from one release to the next, it may make
perfectly good sense to divide or merge message catalogs, or to re-locate
individual messages.  Mnemonic labels can minimize the code impact of such
changes.  I might even suggest going to enough lengths to remove the code
impact completely by adding a level of indirection so that code is unaware
which message catalog a given message comes from.

Another point that strongly supports the use of such a tool is that it
helps translators to identify their mistakes.  Comparison of the .h file
generated with the translated catalog against the version from the release
is a sure way to detect inadvertently deleted messages and a host of other
errors.  I haven't met a translator who wouldn't love to have a way to
check for such editing errors.

Something to help us developers, too: tracking down obsolete messages is a
snap if you use a cross-referencer to find unused #defines in your generated
.h files.  Maybe it's really obsolete and should be gotten rid of since
translating obsolete messages to a dozen or so languages can cost big bucks,
pounds, marks, yen, etc.  Maybe you added an error message to the catalog
because you knew you'd need it, but you forgot to code that else clause!

Am I reaching?  Am I stretching my logic to make a point?  Yup!  But does
anyone still think that a tool to create a .h file out of a message catalog
is useless?  :-)

erik@srava.sra.co.jp (Erik M. van der Poel) writes:

> Using numbers for the message ids was a bad idea in the first place.
> (Thank goodness XPG3 and AT&T's specs are not International
> Standards.)

Once compiled into an executable, no one need care what the representation
of a message id is.  Nobody says that the #define in the generated
header ultimately has to resolve to an integer.  It merely has to resolve
to whatever the functional interface requires, and if that changes you
just change the .h generation tool.  Information hiding strikes again!

> Wouldn't it be possible to create a reasonably efficient
> implementation using hashing and caching with symbolic names instead
> of numeric ids? Then we can add/delete/modify messages at will. We
> should leave numbering and counting to the computer.

Yes it is possible, but why bother?  The organization of the run-time store
of messages can be changed for efficiency without any impact on the functional
interface.  As an example, I have implemented a (non-unix based) system that
compiles the (equivalent of) the message catalog into assembler code for a
function that retrieves the messages from (again the equivalent of) the
text segment of a shared runtime archive.  The performance is frighteningly
good, and I don't do any fancy indexing or hashing.  I could add it, but
for a large-scale multi-user application the big bang for the buck was in
reducing paging by using non-modifiable shared memory instead of data space.
Yes, it just uses integer ids, and yes, it generates the headers.

nazgul@alphalpha.com (Kee Hinckley) writes:

>                            In addition there is a way to, if not prevent
> the problem, at least spot it.  Simply have a convention, as a user
> of message catalogs, that messageId #1 is a version number.  Every
> time you make an incompatible change to the catalog, change the version
> number.  Have your application check the version number and complain
> if it doesn't match.

More than that, have it check for the last and one-past-the-last message
to verify that the catalog has exactly the right number of entries.  Don't
tolerate any errors in the message configuration -- they're just as critical
as errors in configuration of executables.  Just don't take a checksum! :-)

If you want real safety, make the versioning mechanism automatic.  Have
your make file bump it after any change that affected the .h file, and drop
the new version number into both the message cat and the .h.  Have your code
do its version check comparing the run-time version against a symbolic
constant from the very same include file!  A re-compile of the code that
includes the .h is forced anyhow, so the code is always in step with the
message catalog version.  Now, provide a modified version of the make file
for your translators that does the same checking but instead of triggering a
bump in version and re-compile (you don't give them source anyhow) it simply
triggers an error.
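
A rough sketch of that start-up check, in C.  The catalog is modelled as a
plain in-memory array so the example stands alone; a real program would do
the lookups through whatever catalog interface it uses.  `struct catalog`,
`cat_lookup`, `cat_check`, `CAT_VERSION` and `CAT_LAST_MSG` are all
hypothetical names, and the two constants stand in for values that the
generated .h would supply:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an opened catalog: msgs[i] holds the text
 * of message i+1, or NULL when that message is missing. */
struct catalog {
    const char *const *msgs;
    size_t count;
};

/* Return the text of message `id` (1-based), or NULL if it is absent. */
static const char *cat_lookup(const struct catalog *cat, size_t id)
{
    if (id == 0 || id > cat->count)
        return NULL;
    return cat->msgs[id - 1];
}

#define CAT_VERSION   3   /* assumed to come from the generated .h */
#define CAT_LAST_MSG  4   /* highest message id the program knows about */

/* Return 0 when the catalog matches the program, nonzero otherwise. */
int cat_check(const struct catalog *cat)
{
    const char *ver = cat_lookup(cat, 1);   /* message 1 is the version */

    if (ver == NULL || atoi(ver) != CAT_VERSION)
        return 1;                           /* wrong or missing version */
    if (cat_lookup(cat, CAT_LAST_MSG) == NULL)
        return 2;                           /* catalog is too short */
    if (cat_lookup(cat, CAT_LAST_MSG + 1) != NULL)
        return 3;                           /* catalog has extra entries */
    return 0;
}
```

Whether a nonzero result should abort the program or merely fall back to
built-in strings is a policy decision the sketch leaves open.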

A final comment:

The main reason that I am concerned about this is that internationalization
of code must not violate developers' sense of what is right.  The only people
I have run into who are more fanatic than non-English speakers who (rightly)
flame against non-translatable code, are developers who (rightly) flame
against un-readable code.  There is finally real recognition of the need for
designing internationalization in applications from Day One, and this has been
a hard-fought victory.  Let's not make the software so ugly that everyone will
go back to the old attitude of "we'll worry about international in release 2".

rich schwartz   (All views expressed are my own, and not Wang Labs, Inc.'s.).
 rschwartz@office.wang.com      VOICE (508) 967 5027     FAX (508) 967 0947m.
     Wang Labs, Inc., M/S 019-58A, 1 Industrial Ave., Lowell, MA 01851

nazgul@alphalpha.com (Kee Hinckley) (04/17/91)

In article <1991Apr15.170901.18836@dg-rtp.dg.com> eliot@dg-rtp.dg.com writes:
>are horrendous.  Suppose you have a new executable and new catalog that you
>have built using one of these tools that generates message numbers
>automatically.  You HOPE that the numbers on existing messages haven't been
>accidentally changed, but unless you build another tool to verify that, you
>don't know.  You're all set to ship, complete with translated catalogs in

How about if I add a feature to the gencat which checks to see if the
message numbers have changed incompatibly? And if so it issues a warning?
I think it's doable, and I've pretty much been convinced it's useful.

>Something just occurred to me:  how about if the automatic numbering tool
>knew enough about the source archiving system (SCCS, RCS, or whatever) so
>that it could compare the latest version of the catalog against all previous
>versions, to make sure that no incompatibilities were being introduced?  This
This'll get me for replying before reading everything.  Anyway, you could
do it there, but I think doing it with the message catalog itself would
be sufficient (and certainly easier).

-- 
Alfalfa Software, Inc.          |       Poste:  The EMail for Unix
nazgul@alfalfa.com              |       Send Anything... Anywhere
617/646-7703 (voice/fax)        |       info@alfalfa.com


ch@dce.ie (Charles Bryant) (04/19/91)

In article <1991Apr10.122642.3991@dg-rtp.dg.com> eliot@dg-rtp.dg.com
advocates the use of integer constants to identify messages in a message
catalogue. Many others argue that symbolic names are better.

I have never used such a thing, but I assume the choice is between
	MSG_NUMTOOBIG	"Number too big"
which is processed into:
	#define MSG_NUMTOOBIG	1
which then gets used in the program instead of the string, and
	1	"Number too big"
and then `1' gets used in the source.

I hope everyone would agree that:
	a) symbolic names allow messages to get out of sync with the
	program (e.g. swap two lines in the message file, or add a new
	one in the wrong place)

	b) it is easy to forget or get confused over which number
	corresponds to each message

Why not get the benefits of both? Have the input be:
	1	MSG_NUMTOOBIG	"Number too big"
which produces the #defines as before. The programmer can now use a
meaningful symbolic name, and cannot renumber without making the same
change to the message file as would be necessary if no symbolic name is
used.
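
As a sketch of how little the generating tool has to do, here is one way to
turn such a line into its #define, in C.  `line_to_define` is a hypothetical
name, and the parsing assumes whitespace-separated fields with the symbol
limited to 63 characters:

```c
#include <ctype.h>
#include <stdio.h>

/* Turn one master-file line of the form
 *     <number> <SYMBOL> "<text>"
 * into "#define SYMBOL<tab>number".  Returns 0 on success, -1 if the
 * line does not start with a number and a symbol, or if the output
 * buffer is too small. */
int line_to_define(const char *line, char *out, size_t outsz)
{
    int number;
    char symbol[64];
    int n;

    /* Expect a decimal number followed by a whitespace-delimited word. */
    if (sscanf(line, "%d %63s", &number, symbol) != 2)
        return -1;
    /* A line such as `1 "text"` carries no symbol; reject it. */
    if (!isalpha((unsigned char)symbol[0]) && symbol[0] != '_')
        return -1;
    n = snprintf(out, outsz, "#define %s\t%d\n", symbol, number);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

The number is copied straight through from the master file, so the tool
never invents or reorders message numbers.
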
-- 
Charles Bryant (ch@dce.ie)
--
If you like the opinions expressed in this message, they may be available
for rent - contact your local sales office. Low interest deals available.

eliot@chutney.rtp.dg.com (Topher Eliot) (04/19/91)

In article <1991Apr17.053943.6263@alphalpha.com>, nazgul@alphalpha.com (Kee Hinckley) writes:
|> How about if I add a feature to the gencat which checks to see if the
|> message numbers have changed incompatibly? And if so it issues a warning?
|> I think it's doable, and I've pretty much been convinced it's useful.
|> 
|> >Something just occurred to me:  how about if the automatic numbering tool
|> >knew enough about the source archiving system (SCCS, RCS, or whatever) so
|> >that it could compare the latest version of the catalog against all previous
|> >versions, to make sure that no incompatibilities were being introduced?  This
|> This'll get me for replying before reading everything.  Anyway, you could
|> do it there, but I think doing it with the message catalog itself would
|> be sufficient (and certainly easier).
Easier to implement, absolutely.  Sufficient?  I guess it depends on how you
handle your builds, etc.  Such a feature would definitely be good, but one
could still lose track of an incompatible change if one were careless in the
development process.


Someone sent me some mail with a suggestion that I thought was good.  I was
waiting to see it posted, but I'll go ahead and do it.  The suggestion was
that the input catalog should look like:

$set 1 BASEMSGS
1	ERRMSG		"Error in application foo:"
2	WARNMSG		"Warning:"

and so on.  From this one could generate a .h file, and allow the .c file to
use the symbolic values (BASEMSGS, ERRMSG, WARNMSG, etc), and yet still avoid
the danger of accidental renumbering.  Of course if you WANT automatic
renumbering then this approach isn't for you.  But with this approach manual
renumbering is much easier than with my original "NO SYMBOLIC IDENTIFIERS"
gospel.  Renumbering would be a 5-minute editing job, all in one file.  This
idea seems pretty straightforward to implement, and more robust than the
approach of trying to automatically detect incompatible changes and issue
warning messages.

So far, of all the ideas I've seen, I like this one best.  Does anyone see
anything wrong with it?

Various people have said things here and in private mail that, in my mind,
essentially pooh-pooh the difficulties of keeping executables and translated
message catalogs in sync on customers' machines.  Well, what can I say.  _I_
think it's a hard problem.

At the other end of the spectrum, some people have suggested very strong
mechanisms to rigidly enforce such coordination -- if you don't have the
right message catalog, you can't run.  This is certainly the safest approach
to solving this particular set of problems, but my feeling is that it would
be overruled the first time a fatal bug was found in an application, which had
to be fixed immediately, but translated message catalogs weren't available yet.
I try to use policies that are flexible enough to bend when they need to, yet
will still do you some good when bent.  The absolute-synchronization rule is
too all-or-nothing for my taste.

-- 
Topher Eliot                           Data General DG/UX Internationalization
(919) 248-6371        62 T. W. Alexander Dr., Research Triangle Park, NC 27709
eliot@dg-rtp.dg.com                           {backbone}!mcnc!rti!dg-rtp!eliot
Obviously, I speak for myself, not for DG.

nazgul@alphalpha.com (Kee Hinckley) (04/21/91)

In article <1991Apr19.130632.17861@dg-rtp.dg.com> eliot@dg-rtp.dg.com writes:
>Someone sent me some mail with a suggestion that I thought was good.  I was
>waiting to see it posted, but I'll go ahead and do it.  The suggestion was
>that the input catalog should look like:
>
>$set 1 BASEMSGS
>1	ERRMSG		"Error in application foo:"
>2	WARNMSG		"Warning:"

Unless I misunderstand you, this is essentially what I do, except that I 
tried to remain compatible with the standard.  The spec says that anything
after "$set n" is a comment, and anything after "$ " is a comment, so
I just made comments that begin with "#" special.

$set 1 #Foo
$ #ErrMsg
1 "Error in application foo:"
$ #WarnMsg
2 "Warning"

This generates

#define FooSet  0x1
#define FooErrMsg       0x1
#define FooWarnMsg      0x2

Unless you use the '-or' option, which is useful if you want to simplify
things down to a single set and msgid number.

/* Use these Macros to compose and decompose setId's and msgId's */
#ifndef MCMakeId
# define MCMakeId(s,m)  (unsigned long)(((unsigned short)(s)<<(sizeof(short)*8))\
					|(unsigned short)(m))
# define MCSetId(id)    (unsigned int) ((id) >> (sizeof(short) * 8))
# define MCMsgId(id)    (unsigned int) (((id) << (sizeof(short) * 8))\
					>> (sizeof(short) * 8))
#endif

#define FooSet  0x1
#define FooErrMsg       0x10001
#define FooWarnMsg      0x10002
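
The packing may be easier to see with fixed-width types.  The following is
only a sketch of the same idea, assuming 16 bits each for the set and
message numbers rather than sizeof(short); the function names are
hypothetical and not part of the package:

```c
#include <stdint.h>

/* Set number in the high 16 bits, message number in the low 16 bits. */
static inline uint32_t mc_make_id(uint16_t set, uint16_t msg)
{
    return ((uint32_t)set << 16) | msg;
}

/* Recover the set number from a combined id. */
static inline uint16_t mc_set_id(uint32_t id)
{
    return (uint16_t)(id >> 16);
}

/* Recover the message number from a combined id. */
static inline uint16_t mc_msg_id(uint32_t id)
{
    return (uint16_t)(id & 0xFFFFu);
}
```

With this layout, set 1 and message 2 combine to 0x10002, matching the
FooWarnMsg value above.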

>and so on.  From this one could generate a .h file, and allow the .c file to
>use the symbolic values (BASEMSGS, ERRMSG, WARNMSG, etc), and yet still avoid
>the danger of accidental renumbering.  Of course if you WANT automatic
>renumbering then this approach isn't for you.  But with this approach manual
Right.  The automatic numbering I simply do by replacing the initial number
with '#', which is the main thing I believe you disagree with.  But it's
optional.


-- 
Alfalfa Software, Inc.          |       Poste:  The EMail for Unix
nazgul@alfalfa.com              |       Send Anything... Anywhere
617/646-7703 (voice/fax)        |       info@alfalfa.com


nazgul@alphalpha.com (Kee Hinckley) (04/21/91)

In article <1991Apr19.130632.17861@dg-rtp.dg.com> eliot@dg-rtp.dg.com writes:
>$set 1 BASEMSGS
>1	ERRMSG		"Error in application foo:"
>2	WARNMSG		"Warning:"

I should note, I prefer this syntax to mine, I just wanted to do something
that could be run through strictly conforming gencats as well.

-- 
Alfalfa Software, Inc.          |       Poste:  The EMail for Unix
nazgul@alfalfa.com              |       Send Anything... Anywhere
617/646-7703 (voice/fax)        |       info@alfalfa.com


peter@ficc.ferranti.com (Peter da Silva) (04/23/91)

In article <1991Apr19.103905.486@dce.ie> ch@dce.ie (Charles Bryant) writes:
> 	a) symbolic names allow messages to get out of sync with the
> 	program (e.g. swap two lines in the message file, or add a new
> 	one in the wrong place)

Well, actually, they let newer message files get out of sync with older
ones. But you need to check this anyway, to handle accidental deletions
or improper changes in messages... as well as improper use of messages
in the source (as, for example, the classic case where a constant with
the initial value of "10" was used as a numeric base in base conversions,
and when the constant (which had nothing to do with base conversion)
changed everything went higgledy-piggledy).

> 	b) it is easy to forget or get confused over which number
> 	corresponds to each message

Why should you have to know?

> Why not get the benefits of both? Have the input be:
> 	1	MSG_NUMTOOBIG	"Number too big"

Sounds good. You have to watch out for stuff like:

	15698	MSG_EMACS	"Editor too big"
	...
	15968	MSG_SWAPPER	"Out of swap space in message file"
-- 
Peter da Silva.  `-_-'  peter@ferranti.com
+1 713 274 5180.  'U`  "Have you hugged your wolf today?"

eliot@chutney.rtp.dg.com (Topher Eliot) (04/26/91)

In article <1991Apr21.043742.28994@alphalpha.com>, nazgul@alphalpha.com (Kee Hinckley) writes:
|> In article <1991Apr19.130632.17861@dg-rtp.dg.com> eliot@dg-rtp.dg.com writes:
|> >Someone sent me some mail with a suggestion that I thought was good.  I was
|> >waiting to see it posted, but I'll go ahead and do it.  The suggestion was
|> >that the input catalog should look like:
|> >
|> >$set 1 BASEMSGS
|> >1	ERRMSG		"Error in application foo:"
|> >2	WARNMSG		"Warning:"
|> 
|> Unless I misunderstand you, this is essentially what I do, except that I 
|> tried to remain compatible with the standard.  The spec says that anything
|> after "$set n" is a comment, and anything after "$ " is a comment, so
|> I just made comments that begin with "#" special.
|> 
|> $set 1 #Foo
|> $ #ErrMsg
|> 1 "Error in application foo:"
|> $ #WarnMsg
|> 2 "Warning"

I have to admit jumping to an unwarranted conclusion.  I'm not sure if I
misread your original posting, or what, but I saw "automatic numbering" somewhere,
and immediately what came to mind was a different implementation, which did
not offer what you describe here (i.e. in that implementation, the original
source message files could not contain both a number and a symbol).  So I was
really protesting that earlier design, not yours.  Sorry.

|> Right.  The automatic numbering I simply do by replacing the initial number
|> with '#', which is the main thing I believe you disagree with.  But it's
|> optional.
I guess it would be fair to say you could use your design in the way that I
think is good, and in a way that I think is dangerous.

-- 
Topher Eliot                           Data General DG/UX Internationalization
(919) 248-6371        62 T. W. Alexander Dr., Research Triangle Park, NC 27709
eliot@dg-rtp.dg.com                           {backbone}!mcnc!rti!dg-rtp!eliot
Obviously, I speak for myself, not for DG.