[comp.sys.amiga.tech] Response to DR2D format specification

shf@well.UUCP (Stuart H. Ferguson) (11/10/89)

+----  Ross Cunniff (cunniff%hpfcrt@hplabs.HP.COM) writes:
|
| The following is the format of the DR2D IFF FORM file produced
| by ProVector, a 2-dimensional structured graphics program.

I see failings with this format in two main areas:

1.  Transportability across applications.  Some aspects of the format
were clearly designed to work well with this one drawing program but 
will hinder other developers from using it for different applications.
Since the main reason for IFF is this transportability of data, this
is a serious consideration in any format that wants to become a 
"standard."

2.  Conformance with IFF philosophy.  Although it's never been explicitly
documented what constitutes a "good" IFF FORM, it is possible to glean
some of the basic design notions from studying the IFF documentation, 
the standard FORM's and the IFF format itself.  Being something of a
student of IFF, I think I can illuminate some areas where this proposed
format clashes with general IFF principles.


Let me first address the transportability and generality issues; those
issues that will affect how well this format can adapt to other 
applications than the one for which it was specifically designed.

| Coordinates are specified in IEEE format (see note 1 for a
| description of the IEEE format).

Storing coordinates in floating point is not an ideal choice.  For some
things floating point numbers are essential, but not for drawings.  The
advantage that floating point numbers have over fixed point is that they
are valid over a much greater range, and are therefore useful for 
calculations involving multiplications.  Fixed point numbers, however,
are more accurate and operations on them are faster, but fixed within a 
narrow range.  This makes them more appropriate for calculations 
involving additions and subtractions.  Since coordinates of greatly 
differing magnitude cannot be mixed in the same drawing, and the edges
of a drawing provide a natural boundary for the scale of the numbers, and
the fact that vectors are often added and rarely multiplied, fixed point 
coordinates (integers) are a more natural choice for drawings.

This does not mean that any given program couldn't use floating point
numbers internally.  It just means that the Interchange format for drawings
would use the more natural fixed point notation for exchanging data
between varied applications, some which use floating point internally
and some which do not.  But none are required to parse it when they don't
have to.  For programs which want floating point, the drawing header chunk,

| 	struct DRHDstruct {
| 	    LONG	ID;
| 	    LONG	Size;		/* Always 18 */
| 	    IEEE	p_xmin, p_ymin,	/* Minimum and maximum */
| 			p_xmax, p_ymax;	/*   coordinates of the drawing area */
| 	};

could be expanded to include the bounding box of the fixed point 
coordinates as well as their equivalents in floating point.  Thus a
floating point program such as ProVector could transform the integer
values in the DR2D file into their equivalent floating point values,
with no loss of precision, with one floating multiply.  Since a drawing
requires at least this much calculation each time it's displayed, I
can't see how this extra computation would be objectionable.  It
saves a lot of effort for those who only want to deal with integers.
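
As a rough illustration of that conversion, here is a minimal sketch in
C.  It assumes a hypothetical expanded header (DrawHeader, with made-up
i_* fields for the integer bounds) that is not part of the posted spec;
x_scale and x_to_float are illustrative names as well.  The scale is
computed once per file, after which each coordinate costs one multiply
plus an add for the origin offset.

```c
#include <stdint.h>

/* Hypothetical expanded drawing header (an assumption, not the posted
   DRHD): integer bounds plus the same bounds in floating point. */
typedef struct {
    int32_t i_xmin, i_ymin, i_xmax, i_ymax;  /* bounds in integer file units */
    float   p_xmin, p_ymin, p_xmax, p_ymax;  /* same bounds in drawing units */
} DrawHeader;

/* Drawing units per integer unit along x.  Assumes i_xmax > i_xmin. */
static float x_scale(const DrawHeader *h)
{
    return (h->p_xmax - h->p_xmin) / (float)(h->i_xmax - h->i_xmin);
}

/* Map one integer x coordinate to floating point using a precomputed scale. */
static float x_to_float(const DrawHeader *h, float scale, int32_t ix)
{
    return h->p_xmin + scale * (float)(ix - h->i_xmin);
}
```

The y axis would be handled the same way.  Whether the result is exact
depends on the integer range and the scale chosen, since a 32-bit float
carries only a 24-bit significand.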

| 	typedef struct {
| 	    UBYTE	FillType;	/* One of FT_*, above	*/
| 	    UBYTE	FillValue;	/* See note 3		*/
| 	    UBYTE	EdgeType;	/* One of ET_*, above	*/
| 	    UBYTE	EdgeValue;	/* Edge color index	*/
| 	    USHORT	WhichLayer;	/* Which layer it's in	*/
| 	    IEEE	EdgeThick;	/* Line width		*/
| 	    IEEE	XMin, YMin,	/* Bounding box of obj. */
| 			XMax, YMax;	/* including line width	*/
| 	} ObjHdr;	/* Size == 26 */

This "Object Header" record comes along with almost all primitive
objects in the file.  Although most of these parameters are reasonable,
I have problems with two of them:  WhichLayer and the X/Y-Min/Max
bounding box.

WhichLayer is somewhat ambiguous.  Can you have, for example, a group
with components in more than one layer?  If so, what does it mean?
The "use a symbol" chunk has its own WhichLayer field, as do each of
the components of the symbol.  Which layer wins?  I think that the
layer information does not belong in the data for each object, but 
rather as a property of a whole collection of objects.  I will spell
this out later on.

The bounding box information simply doesn't belong here.  What useful
purpose does it serve in the file?  None.  It does make it much more 
difficult for simple programs to write one of these files, because it 
makes them compute the bounding box for each object, even if they don't 
*use* that information themselves.  Because the box includes the line 
width, mitering makes this a non-trivial calculation.  In fact, because 
there's no specified limit on the length of a miter, the bounding box is
_undefined_ in the general case.
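
To put a number on that: under the usual PostScript-style definition
(assumed here, since the spec does not say), two segments meeting at
angle phi give a miter whose length is the line width divided by
sin(phi/2).  The sketch below works with cos(phi) and squared ratios
so it needs no math library; miter_ratio_sq and miter_exceeds are
illustrative names only.

```c
/* Squared ratio of miter length to line width for a join whose segments
   meet at angle phi, given cos(phi).
   Uses the identity 1 / sin^2(phi/2) = 2 / (1 - cos(phi)).
   Precondition: cos_phi < 1 (at phi == 0 there is no finite miter). */
static double miter_ratio_sq(double cos_phi)
{
    return 2.0 / (1.0 - cos_phi);
}

/* Nonzero if the miter would be longer than `limit` line widths. */
static int miter_exceeds(double cos_phi, double limit)
{
    return miter_ratio_sq(cos_phi) > limit * limit;
}
```

As cos_phi approaches 1 (a very sharp corner) the ratio grows without
bound, so a writer cannot compute the box unless some limit is agreed
on.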

I want to emphasize how utterly absurd it is to store this information in
an Interchange file.  1) It makes simple readers and writers more complex.
2) Even a program that could use the information can't, because 3) the
information might not be correct.  Since this is a published format, you
can't tell where it might be coming from.  A program that wants to use the 
bounding box to speed things up or whatever would still have to *verify* it
against its idea of what the object is.  And the most ridiculously 
redundant aspect of this whole thing is that 4) a drawing program already
has to have the logic in it to compute the bounds given the parameters
of an object.  So why not just compute them as they get read!?  Saving 
this information might save *some* programs a little time when reading
and writing their own files, but it will cost *everybody* pain and grief 
in the long run.  Be smart -- throw it out now.

| /* 8x8 bitmap fill patterns */ /* OBSOLETE */

If it's obsolete, why publish it?

| XTRN (0x4554524E)	/* Externally controlled group */

It would help transportability if there was some clue what this chunk
was about.

| /* Color map */
| 	struct CMAPstruct {
| 	    LONG	ID;
| 	    LONG	Size;		/* Usually 256*3, or 768 */
| #define			NUMCOLOR(Size)	(Size / 3)
| 	    UBYTE	ColorMap[NUMCOLOR(Size)][3];
| 	};

I assume (since it's not specified) that the colors are in R G B order
with 0=min intensity and 255=max intensity.  If so, then this chunk is
the same as the ILBM.CMAP chunk, which is probably a good thing.

| RAST (0x52415354)	/* Raster image */
| 
| 	struct RASTstruct {
| 	    LONG	ID;
| 	    LONG	Size;
| 	    IEEE	XPos, YPos,		/* Virtual coords */
| 			XSize, YSize;		/* Virtual size */
| 	    SHORT	NumX, NumY, Depth;	/* Actual size */
| 	    UBYTE	Colors[3*(1<<Depth)];	/* Color map */
| 	    UBYTE	Closest[(1<<Depth)];	/* Closest equivs in CMAP */
| #define RSIZE(x,y,d)	((x+15)/16)*y*d
| 	    USHORT	Rast[RSIZE(NumX,NumY,Depth)];	/* Lookups into Colors*/
| 	};

With the exception of virtual coordinates and size, this is just like a
simplified ILBM.  Using nested ILBM's here would facilitate 
transportability and would serve the IFF charter a lot better.  Again,
the "Closest" array is information that is best computed by the target
application and should not be stored in the file.  Same reasons apply
as for the bounding box.
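
For reference, a reader could rebuild such a table along these lines.
This is only a sketch: the spec does not say what "closest" means, so
squared distance in RGB is an assumption, and closest_color and
build_closest are made-up names.

```c
#include <stddef.h>
#include <stdint.h>

/* Index of the CMAP entry nearest to (r,g,b) by squared RGB distance
   (the metric is an assumption).  `cmap` holds `ncolors` R,G,B byte
   triples; ncolors must be at least 1. */
static size_t closest_color(const uint8_t *cmap, size_t ncolors,
                            uint8_t r, uint8_t g, uint8_t b)
{
    size_t best = 0;
    long best_d = -1;
    for (size_t i = 0; i < ncolors; i++) {
        long dr = (long)cmap[3 * i]     - r;
        long dg = (long)cmap[3 * i + 1] - g;
        long db = (long)cmap[3 * i + 2] - b;
        long d  = dr * dr + dg * dg + db * db;
        if (best_d < 0 || d < best_d) {
            best_d = d;
            best = i;
        }
    }
    return best;
}

/* Fill closest[] with one CMAP index per raster palette entry.
   `colors` holds `n` R,G,B triples, as in the RAST Colors array. */
static void build_closest(const uint8_t *colors, size_t n,
                          const uint8_t *cmap, size_t ncolors,
                          uint8_t *closest)
{
    for (size_t i = 0; i < n; i++)
        closest[i] = (uint8_t)closest_color(cmap, ncolors, colors[3 * i],
                                            colors[3 * i + 1],
                                            colors[3 * i + 2]);
}
```

Since Closest has one entry per raster palette color, the work here
grows with (1<<Depth) times the CMAP size, not with the number of
pixels.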


Now we get into the area of IFF philosophy.  These are in some sense
far more important issues, because they can have such far-reaching
effects.  Each format that is made a standard evolves the standard,
and once a format is accepted, either by Commodore or by the 
marketplace, we are forever stuck with whatever slant it takes on
the IFF concept.

| Note that most chunks have
| both a chunk Size and an object Count.  I know this is redundant;
| however, this allows the data structures to be extended in a
| compatible manner if it should become necessary in the far future.

The general way IFF provides for this future extensibility is by 
obsoleting old chunks and creating updated versions.  The SMUS FORM
contains an example of this with the INST and INS1 chunks.  INS1
is the new version of the now obsolete INST chunk (new versions of
chunk ID's are created by appending a number, INS1, INS2, etc.).
Old files still work because the new readers can interpret the old
chunks, and old readers don't crash because they skip chunks they
don't understand.  Once a chunk is publicly defined, it should not
be changing (even to get bigger).
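
The skip-what-you-don't-understand rule can be sketched in C as below,
assuming the standard IFF chunk layout: a 4-byte ID, a 4-byte
big-endian size, the body, and a pad byte when the size is odd.  The
names be32 and count_chunks are illustrative, not part of any library.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit big-endian value, the byte order IFF uses for sizes. */
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Count the chunks in buf[0..len) whose ID equals `want`, stepping over
   every other chunk by its size field.  Returns -1 if a chunk header or
   body runs past the end of the buffer. */
static int count_chunks(const uint8_t *buf, size_t len, const char *want)
{
    size_t pos = 0;
    int hits = 0;

    while (pos < len) {
        uint32_t size;
        if (len - pos < 8)
            return -1;                    /* truncated header */
        size = be32(buf + pos + 4);
        if (size > len - pos - 8)
            return -1;                    /* truncated body */
        if (memcmp(buf + pos, want, 4) == 0)
            hits++;
        pos += 8 + (size_t)size;
        if ((size & 1) && pos < len)
            pos++;                        /* odd-sized bodies are padded */
    }
    return hits;
}
```

An old reader asking for "INST" and a new one asking for "INS1" can
both run this loop over the same file; neither needs to know what the
other's chunk contains.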

| PREF (0x50524546)	/* Application preferences */
  [ ... ]
| 	   Other applications may safely ignore this chunk,
| 	   although it is requested that applications that
| 	   manipulate DR2D files do not delete preferences
| 	   settings that they do not understand.  Some possible
| 	   settings include:
| 
| 		DATE=10/28/88
| 		COPYRIGHT=Copyright 1989 by Taliesin, Inc.
| 		PROGRAM=ProVector
| 		ARTIST=Ross Cunniff
| 		LMARGIN=1 inch
| 		ORIENT=portrait
| 		ZOOM=1.0 1.0 2.0 2.0

First of all, it looks like some of this information would be of general
interest and would be better to be stored in documented chunks. The
Copyright "(c) " chunk already exists in SMUS, for example.

Although there is nothing wrong with a specific application sticking a 
private chunk in a public FORM, the rule with these is for programs
that do not understand them to throw them away.  The reason is that there
is no way for these other programs to verify if the values in the chunk
are still valid.  If the program has edited the data at all, then the 
ZOOM might be off in never-never land, for example.  If ProVector is 
willing to deal with this possibility, then it can certainly *request*
that other programs leave this chunk intact.  Don't expect competitors to
leave it in though.  :-)

| /* Font table */
| 	struct FONTstruct {
| 	    LONG	ID;
| 	    LONG	Size;
| 	    SHORT	NumFonts;
| 	    CHAR	FontNames[Size-2];
| 	};
| 	/* FontNames is an array containing the names of all the fonts
| 	   used in this drawing.  The names are stored as null-separated
| 	   strings. */

Dumping all the font names in one big lump is not a very flexible way
to indicate fonts, from an IFF standpoint.  For example, there is no
way to selectively override a single font in a nested FORM.  For a better,
more IFF-like way to specify fonts, look at the FTXT.FONS chunk.  Better
yet, use it!

| NOTE 4: The  data in a FILL,  DSYM  or  GRUP chunk is a series of nested
| 	object chunks.  All object chunks except DSYM, CMAP, PATT, FILL,
| 	and FONT are allowed in a DSYM or GRUP.

This one is a serious, serious problem.  This type of nesting is not 
supported by IFF, and is inconsistent with the provided mechanism.  While
there is no technical reason why the content of the chunks couldn't be
defined this way, doing so flies in the face of a consistent IFF 
philosophy.  These nested chunks cannot, for example, be read by the
generic iffparse.library, since the library was designed to deal with
standard IFF nesting constructs.  Reading this would require every
programmer to code their own custom reader, and would further set back
any progress toward providing general IFF parsing tools.

The only place properly nested chunks can occur is within FORM's.  In
this case, it should be a FORM DR2D, since that's what kind of chunks 
they are.  In other words, a group would look like a nested drawing:

	FORM {
		DR2D
		...		-- root level objects
		FORM {
			DR2D
			...	-- grouped objects
		}
		...
	}
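
A generic reader can walk that layout without knowing anything about
DR2D.  The sketch below, again assuming the standard IFF layout (a FORM
body starts with a 4-byte type, followed by ordinary chunks), reports
how deeply FORMs are nested in a buffer.  The names be32 and form_depth
are illustrative, and a real reader would also want a recursion limit.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit big-endian value. */
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Deepest FORM nesting level in the chunk sequence buf[0..len):
   0 if there is no FORM, 1 for a flat FORM, 2 for a FORM holding a
   FORM, and so on.  Returns -1 on malformed input. */
static int form_depth(const uint8_t *buf, size_t len)
{
    size_t pos = 0;
    int deepest = 0;

    while (pos < len) {
        uint32_t size;
        if (len - pos < 8)
            return -1;
        size = be32(buf + pos + 4);
        if (size > len - pos - 8)
            return -1;
        if (memcmp(buf + pos, "FORM", 4) == 0) {
            int inner;
            if (size < 4)
                return -1;                /* no room for the FORM type */
            /* Skip the 4-byte FORM type and descend into its chunks. */
            inner = form_depth(buf + pos + 12, (size_t)size - 4);
            if (inner < 0)
                return -1;
            if (inner + 1 > deepest)
                deepest = inner + 1;
        }
        pos += 8 + (size_t)size;
        if ((size & 1) && pos < len)
            pos++;                        /* pad byte after odd sizes */
    }
    return deepest;
}
```

Nothing in the loop depends on the chunk IDs inside the FORM, which is
what lets one piece of code serve every FORM type.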

The different types of nested drawings could be flagged in numerous
ways, a simple one being different drawing header chunks.  Symbols
could have a SYMB chunk in them giving the name of this symbol.
Fill patterns could have a header that specifies the local coordinate
system and other stuff.  Another way to flag them would be with
chunks in the containing FORM just before the included FORM.  There 
are probably better ways waiting to be discovered.

The basic point is that the nesting *must* look something like this 
in order to be IFF.  The mechanism specified in note 4 is vaguely
like IFF, but it isn't, and never will be.  If this format is released 
in its current form, I will certainly do everything I can to prevent it 
from ever being accepted as a Commodore standard IFF FORM.  The damage
to IFF integrity would be too great.


On a more positive note .. I mentioned earlier that there is a more
IFF-ish way to encode layers.  Rather than having the layer number as
a parameter in each object in the file, the layer ID could be a
property chunk with the file organized as a list of drawings, one in 
each layer.  This would be a "MultiLayer" LIST format:

	LIST {
		MULY
		PROP {
			DR2D
			...	-- shared properties (such as header).
		}
		FORM {
			DR2D
			LAYR { 1 }
			...	-- data for layer 1.
		}
		FORM {
			DR2D
			LAYR { 2 }
			...	-- data for layer 2.
		}
		...		-- more layers.
	}

This takes full advantage of the built-in IFF semantics for lexically
scoped properties within LIST's.  It is, of course, somewhat harder
to read, but not that much.
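
The extra reading work amounts to a two-level lookup.  Here is a
minimal sketch, modelling each scope as a table of chunk IDs with a
stand-in integer value; Prop, Scope, find_in and lookup are made-up
names, not iffparse.library calls.

```c
#include <stddef.h>
#include <string.h>

/* One property chunk, reduced to an ID and a stand-in integer value. */
typedef struct { const char *id; int value; } Prop;

/* The property chunks seen in one scope (a PROP or a FORM). */
typedef struct { const Prop *props; size_t n; } Scope;

/* Find a property by its 4-character ID within one scope. */
static const Prop *find_in(const Scope *s, const char *id)
{
    for (size_t i = 0; i < s->n; i++)
        if (strncmp(s->props[i].id, id, 4) == 0)
            return &s->props[i];
    return NULL;
}

/* A value in the FORM itself wins; otherwise fall back to the
   enclosing LIST's PROP defaults.  NULL if neither has it. */
static const Prop *lookup(const Scope *form, const Scope *list_prop,
                          const char *id)
{
    const Prop *p = find_in(form, id);
    return p ? p : find_in(list_prop, id);
}
```

Each layer FORM supplies its own LAYR, while something like the
drawing header is stated once in the PROP and inherited by every layer
that does not override it.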


So let's hear from you Ross.  I'm sure that there's a way to make this
format more in the spirit of IFF without holding up your delivery 
schedule too much.  I'd be glad to work with you on this one.  A few 
conversations should be all it takes.
-- 
		Stuart Ferguson		(shf@well.UUCP)
		Action by HAVOC		(ferguson@metaphor.com)

cunniff@hpfcso.HP.COM (Ross Cunniff) (11/16/89)

shf@well.UUCP (Stuart H. Ferguson) writes:

>| Coordinates are specified in IEEE format (see note 1 for a
>| description of the IEEE format).

>Storing coordinates in floating point is not an ideal choice.  For some
>things floating point numbers are essential, but not for drawings.  The
>advantage that floating point numbers have over fixed point is that they
>are valid over a much greater range, and are therefore useful for 
>calculations involving multiplications.  Fixed point numbers, however,
>are more accurate and operations on them are faster, but fixed within a 
>narrow range.  This makes them more appropriate for calculations 
>involving additions and subtractions.  Since coordinates of greatly 
>differing magnitude cannot be mixed in the same drawing, and the edges
>of a drawing provide a natural boundary for the scale of the numbers, and
>the fact that vectors are often added and rarely multiplied, fixed point 
>coordinates (integers) are a more natural choice for drawings.

I disagree.  Floating point numbers are infinitely more flexible
than integers.  Fixed point numbers are *not necessarily* more
flexible than floating point.  Why can't coordinates of greatly differing
magnitude be mixed in the same drawing?  What if I want to specify
coordinates in millimeters?  What if I wanted objects of great detail
spread across a large distance (a map of the solar system, perhaps)?

Your point about calculations is invalidated below.

>This does not mean that any given program couldn't use floating point
>numbers internally.  It just means that the Interchange format for drawings
>would use the more natural fixed point notation for exchanging data
>between varied applications, some which use floating point internally
>and some which do not.  But none are required to parse it when they don't
>have to.

You just proved my point - 'This does not mean that any given program
couldn't use fixed point numbers internally'.  Integers are no more natural
than floating point.  Your point about 'parsing' is a red herring -
nearly all computers (except maybe some DEC computers :-) use IEEE
as their 'natural' floating point format, and both Amiga C compilers
are the same.  IEEE is actually not terribly hard to translate.

>WhichLayer is somewhat ambiguous.  Can you have, for example, a group
>with components in more than one layer?  If so, what does it mean?
>The "use a symbol" chunk has its own WhichLayer field, as do each of
>the components of the symbol.  Which layer wins?  I think that the
>layer information does not belong in the data for each object, but 
>rather as a property of a whole collection of objects.  I will spell
>this out later on.

The layer information of objects inside groups is ignored, but
might become meaningful if the objects are later ungrouped.  It
depends on the application.  One application might use that layer
information, where another might reset it to the layer information
of the parent group.

>The bounding box information simply doesn't belong here.  What useful
>purpose does it serve in the file?  None.  It does make it much more 
>difficult for simple programs to write one of these files, because it 
>makes them compute the bounding box for each object, even if they don't 
>*use* that information themselves.  Because the box includes the line 
>width, mitering makes this a non-trivial calculation.  In fact, because 
>there's no specified limit on the length of a miter, the bounding box is
>_undefined_ in the general case.

I should have specified that the miter limit is assumed to be the PostScript
default - 10x line width, if I remember correctly (I left my info at home).
As for useful purposes, a display program could trivially decide whether
an object is visible in its current viewport and not display objects
which are obscured.  Bounding boxes are *extremely* useful pieces of
information.  I would be willing to extend the meaning of these numbers
as follows:

	"If any of the coordinates of the bounding box have a bit pattern
	equivalent to the constant INDICATOR, the bounding box information
	is not valid and must be created by the reading program, if it
	is necessary."
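
A reader honoring that rule might test it as below.  This is only a
sketch: no value for INDICATOR is given above, so BBOX_INDICATOR is a
placeholder chosen for illustration, and RawBBox and bbox_is_valid are
made-up names.  It compares the raw 32-bit patterns as read from the
file rather than float values, since the rule is stated in terms of bit
patterns (and a NaN pattern would never compare equal as a float).

```c
#include <stdint.h>

/* Placeholder sentinel; the real INDICATOR value is an assumption here. */
#define BBOX_INDICATOR 0xFFFFFFFFu

/* Bounding box fields kept as raw IEEE bit patterns from the file. */
typedef struct { uint32_t xmin, ymin, xmax, ymax; } RawBBox;

/* Nonzero if no field carries the sentinel, i.e. the stored box may be
   used; zero means the reader must compute the box itself. */
static int bbox_is_valid(const RawBBox *b)
{
    return b->xmin != BBOX_INDICATOR && b->ymin != BBOX_INDICATOR &&
           b->xmax != BBOX_INDICATOR && b->ymax != BBOX_INDICATOR;
}
```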

>| /* 8x8 bitmap fill patterns */ /* OBSOLETE */
>
>If it's obsolete, why publish it?

The "OBSOLETE" should have read "OBSOLETE?"  We haven't decided how
useful this is in an object-oriented program.  All Mac programs seem
to have them (a powerful clue...)

>| XTRN (0x4554524E)	/* Externally controlled group */
>
>It would help transportability if there was some clue what this chunk
>was about.

On the Amiga, an externally controlled group is one where for each
manipulation performed on that group (i.e., resizing, rotating, moving,
etc.) the ARexx macro named 'ApplName' is called with appropriate flags,
etc.  This turns out to be useful for all sorts of things (like
dimension lines; or bar charts where twiddling the size can change
the numbers in a spreadsheet, for example).  Other systems will have
to make their own interpretations of an XTRN.  Note that the XTRN
objects are easily displayed, since none of the callback stuff determines
how they *look*.

This should be better documented in any case.

>| /* Color map */
>I assume (since it's not specified) that the colors are in R G B order
>with 0=min intensity and 255=max intensity.  If so, then this chunk is
>the same as the ILBM.CMAP chunk, which is probably a good thing.

Yep.  Exactly the same (except that all 8 bits of each of R G and B are
potentially significant...)

>| RAST (0x52415354)	/* Raster image */

>With the exception of virtual coordinates and size, this is just like a
>simplified ILBM.  Using nested ILBM's here would facilitate 
>transportability and would serve the IFF charter a lot better.  Again,
>the "Closest" array is information that is best computed by the target
>application and should not be stored in the file.  Same reasons apply
>as for the bounding box.

Have you any idea how long it can take to calculate the closest colors in a
large bitmap?  I might consider, however, a chunk that looks like:

	FORM {
		RAST	(put virtual coordinates and 'closest' stuff here)
		FORM {
			ILBM
			...	(etc)
		}
	}

See below on the difficulties of us changing the format this drastically.


>| Note that most chunks have
>| both a chunk Size and an object Count.  I know this is redundant;
>| however, this allows the data structures to be extended in a
>| compatible manner if it should become necessary in the far future.

This paragraph should probably be deleted.  I don't see any particular
objection to having the counts in the chunks, though.


>| PREF (0x50524546)	/* Application preferences */
>
>First of all, it looks like some of this information would be of general
>interest and would be better to be stored in documented chunks. The
>Copyright "(c) " chunk already exists in SMUS, for example.

True, true.  I'll consider it...


>[...] If ProVector is 
>willing to deal with this possibility, then it can certainly *request*
>that other programs leave this chunk intact.  Don't expect competitors to
>leave it in though.  :-)

True, it's a risk.  Documenting some of them might reduce that risk
(or it might just tempt the competition :-)

>Dumping all the font names in one big lump is not a very flexible way
>to indicate fonts, from an IFF standpoint.  For example, there is no
>way to selectively override a single font in a nested FORM.  For a better,
>more IFF-like way to specify fonts, look at the FTXT.FONS chunk.  Better
>yet, use it!

Well, for one thing, these fonts are *not* the standard Amiga fonts.
Also, it's very nice to have a list of the fonts all in one place.  Note
that it simplifies text objects, since all they have to include is an index
into the font table to tell which font they use.
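
Resolving such an index could look like the sketch below.  It assumes
that indices start at 0 and that every name, including the last, ends
with a NUL byte; the spec as posted only says "null-separated", so both
are guesses.  font_name is an illustrative name.

```c
#include <stddef.h>
#include <string.h>

/* Return the index-th name from a buffer of NUL-terminated strings
   (`names`, `len` bytes long, holding `num_fonts` names), or NULL if
   the index is out of range or the buffer is malformed. */
static const char *font_name(const char *names, size_t len,
                             size_t num_fonts, size_t index)
{
    size_t pos = 0;

    if (index >= num_fonts)
        return NULL;
    for (size_t i = 0; pos < len; i++) {
        const char *end = memchr(names + pos, '\0', len - pos);
        if (end == NULL)
            return NULL;                  /* last name is unterminated */
        if (i == index)
            return names + pos;
        pos = (size_t)(end - names) + 1;  /* step past the NUL */
    }
    return NULL;
}
```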


OK.  Now for the most serious objections:
>| NOTE 4: The  data in a FILL,  DSYM  or  GRUP chunk is a series of nested
>| 	object chunks.  All object chunks except DSYM, CMAP, PATT, FILL,
>| 	and FONT are allowed in a DSYM or GRUP.

>This one is a serious, serious problem.  This type of nesting is not 
>supported by IFF, and is inconsistent with the provided mechanism.  While
>there is no technical reason why the content of the chunks couldn't be
>defined this way, doing so flies in the face of a consistent IFF 
>philosophy.  These nested chunks cannot, for example, be read by the
>generic iffparse.library, since the library was designed to deal with
>standard IFF nesting constructs.  Reading this would require every
>programmer to code their own custom reader, and would further set back
>any progress toward providing general IFF parsing tools.

True.  I hadn't considered generic IFF parsers when I wrote this.

> There are probably better ways waiting to be discovered.

How about:

	FORM {
		DR2D
		...	(a bunch of objects)
		FORM {
			GRUP	(has an object header)
			...	(a bunch of objects)
		}
		FORM {
			XTRN	(has the same XTRN header)
			...	(the objects for the XTRN)
		}
		FORM {
			FILL	(might have object count for grins)
			...	(the possible FILL objects)
		}
		FORM {
			DSYM	(might have object count for grins)
			...	(the object in the symbol)
		}
		...	(other objects...)
	}

I actually rather like this; I'm not sure we'll be able to change it though.
We have on the order of 3-5 different companies working on importing
our file format at this point (I'll not name names).  We'll have to
talk with them and determine how far along they are toward implementing
the readers.

>The basic point is that the nesting *must* look something like this 
>in order to be IFF.  The mechanism specified in note 4 is vaguely
>like IFF, but it isn't, and never will be.  If this format is released 
>in its current form, I will certainly do everything I can to prevent it 
>from ever being accepted as a Commodore standard IFF FORM.  The damage
>to IFF integrity would be too great.

Oh, I don't know that it would be *that* great...  Is iffparse.library
not flexible enough to do something like this:

	Chunk = ReadIFFChunk( ReaderFunc, (void *)InFile );
	if (Chunk->Generic.ID == GRUP) {
		for( i = 0; i < Chunk->Grup.NumObjs; i++ ) {
			SubChunk = ReadIFFChunk( MemReaderFunc, (void *)Chunk );
		}
	}

Actually, I don't know how iffparse.library works at all, as you can
probably tell :-)  Let me know how off-base I am...

>On a more positive note .. I mentioned earlier that there is a more
>IFF-ish way to encode layers.  Rather than having the layer number as
>a parameter in each object in the file, the layer ID could be a
>property chunk with the file organized as a list of drawings, one in 
>each layer.

I don't particularly like this.  It makes moving objects around from
layer to layer rather difficult.  

>This takes full advantage of the built-in IFF semantics for lexically
>scoped properties within LIST's.  It is, of course, somewhat harder
>to read, but not that much.

I think it is quite a bit harder to read, myself.  This was one of the
hot points of discussion when the format was originally posted.  The
consensus was that 16-bit layer id's on each object was a better
way to go.

>So let's hear from you Ross.  I'm sure that there's a way to make this
>format more in the spirit of IFF without holding up your delivery 
>schedule too much.  I'd be glad to work with you on this one.  A few 
>conversations should be all it takes.

Sorry it took so long to respond - business trips and all.  I'm looking
forward to the next set of comments...

>		Stuart Ferguson		(shf@well.UUCP)
>		Action by HAVOC		(ferguson@metaphor.com)

				Ross Cunniff
				Hewlett-Packard Colorado Language Lab
				...{ucbvax,hplabs}!hpfcla!cunniff
				cunniff%hpfcrt@hplabs.HP.COM

deven@rpi.edu (Deven T. Corzine) (11/20/89)

*sigh*  somewhat ill-prepared to jump in here, not having my IFF specs
on hand, operating from memory and on general principles.  Hope this
post makes it through the system; my infrequent posts over the last
few months seem to have found their way to the bit bucket...

shf@well.UUCP (Stuart H. Ferguson) writes:

Stuart> Storing coordinates in floating point is not an ideal choice.

On 15 Nov 89 23:29:32 GMT, cunniff@hpfcso.HP.COM (Ross Cunniff) said:

Ross> I disagree.  Floating point numbers are infinitely more flexible
Ross> than integers.  Fixed point numbers are *not necessarily* more
Ross> flexible than floating point.  Why can't coordinates of greatly
Ross> differing magnitude be mixed in the same drawing?  What if I
Ross> want to specify coordinates in millimeters?  What if I wanted
Ross> objects of great detail spread across a large distance (a map of
Ross> the solar system, perhaps)?

You seem to have missed the point; what is important is not
flexibility for manipulation and calculation so much as flexibility
for interchange.  This is to be an _Interchange_ File Format, so it
should be designed for maximum transportability.  Regardless of the
usefulness of floating point numbers, extra code is required to handle
them.  No sense in forcing a program which would use fixed point to
handle floating point for the IFF file when it has no other use for
floating point.  It's sort of like creating an IFF format which uses
EBCDIC for text because you want to be creating the data on an IBM
mainframe -- sure, it's more convenient, but less workable, in the
long run...

Ross> Your point about calculations is invalidated below.

Stuart> This does not mean that any given program couldn't use
Stuart> floating point numbers internally.  It just means that the
Stuart> Interchange format for drawings would use the more natural
Stuart> fixed point notation for exchanging data between varied
Stuart> applications, some which use floating point internally and
Stuart> some which do not.  But none are required to parse it when
Stuart> they don't have to.

Ross> You just proved my point - 'This does not mean that any given
Ross> program couldn't use fixed point numbers internally'.  Integers
Ross> are no more natural than floating point.  Your point about
Ross> 'parsing' is a red herring - nearly all computers (except maybe
Ross> some DEC computers :-) use IEEE as their 'natural' floating
Ross> point format, and both Amiga C compilers are the same.  IEEE is
Ross> actually not terribly hard to translate.

Integers certainly ARE more natural than floating point, from the
perspective of the processor -- the 68000 certainly can't deal with
IEEE as easily as fixed point...  Yes, you can compile in the IEEE
code, but it increases code size and complexity, which is an
undesirable consequence if the program has no use for floating point
internally...
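
For a sense of what that extra code looks like, here is a sketch of
turning a 32-bit IEEE single into signed 16.16 fixed point with integer
operations only.  The 16.16 target, the truncation toward zero and the
saturation on overflow are all assumptions made for the example, and
ieee_to_fix16 is a made-up name.

```c
#include <stdint.h>

/* Convert an IEEE 754 single (given as its raw bit pattern) to signed
   16.16 fixed point without any floating point operations.
   Truncates toward zero; zero and denormals give 0; values out of
   range, infinities and NaNs saturate. */
static int32_t ieee_to_fix16(uint32_t bits)
{
    int      neg   = (int)(bits >> 31);
    int      exp   = (int)((bits >> 23) & 0xFF);
    uint32_t man   = bits & 0x7FFFFFu;
    /* value = man24 * 2^(exp - 150), so value * 2^16 = man24 * 2^(exp - 134) */
    int      shift = exp - 134;
    uint32_t mag;

    if (exp == 0)
        return 0;                         /* zero or denormal */
    if (exp == 255 || shift >= 8)
        return neg ? INT32_MIN : INT32_MAX;

    man |= 0x800000u;                     /* restore the implicit leading 1 */
    if (shift >= 0)
        mag = man << shift;
    else if (shift > -24)
        mag = man >> -shift;
    else
        mag = 0;                          /* too small for 16.16 */
    return neg ? -(int32_t)mag : (int32_t)mag;
}
```

For example, the pattern 0x3F800000 (1.0) comes out as 65536.  Not a
lot of code, but it is code that an integer-only program would
otherwise never need.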

Stuart> I think that the layer information does not belong in the data
Stuart> for each object, but rather as a property of a whole
Stuart> collection of objects.  I will spell this out later on.

Ross> The layer information of objects inside groups is ignored, but
Ross> might become meaningful if the objects are later ungrouped.

I'd also suggest using PROPs in a LIST -- more "IFF-like"...

Stuart> The bounding box information simply doesn't belong here.  What
Stuart> useful purpose does it serve in the file?  None.

I don't think it harms anything to include it...  the only harm is in
DEPENDING on it to exist and be accurate.  Definitely should be an
optional chunk.

Ross> Have you any idea how long it can take to calculate the closest
Ross> colors in a large bitmap?  I might consider, however, a chunk
Ross> that looks like:

Ross> 	FORM {
Ross> 		RAST	(put virtual coordinates and 'closest' stuff here)
Ross> 		FORM {
Ross> 			ILBM
Ross> 			...	(etc)
Ross> 		}
Ross> 	}

Don't use a separate FORM; use a chunk in FORM DR2D with the data
followed by the nested ILBM FORM (within that RAST chunk)...

Ross> True, it's a risk.  Documenting some of them might reduce that
Ross> risk (or it might just tempt the competition :-)

Never depend on security through obscurity...

Stuart> [...] look at the FTXT.FONS chunk.  Better yet, use it!

Ross> Well, for one thing, these fonts are *not* the standard Amiga
Ross> fonts.  Also, it's very nice to have a list of the fonts all in
Ross> one place.  Note that it simplifies text objects, since all they
Ross> have to include is an index into the font table to tell which
Ross> font they use.

Still less flexible.  Besides, from what I recall of FTXT.FONS, it
shouldn't matter if you're storing names of PostScript fonts instead of
Amiga fonts...

Ross> OK.  Now for the most serious objections:

Stuart> This one is a serious, serious problem.  [...]

Ross> True.  I hadn't considered generic IFF parsers when I wrote this.

Stuart>  There are probably better ways waiting to be discovered.

Ross> How about:

Ross> 	FORM {
Ross> 		DR2D
Ross> 		...	(a bunch of objects)
Ross> 		FORM {
Ross> 			GRUP	(has an object header)
Ross> 			...	(a bunch of objects)
Ross> 		}
Ross> 		FORM {
Ross> 			XTRN	(has the same XTRN header)
Ross> 			...	(the objects for the XTRN)
Ross> 		}
Ross> 		FORM {
Ross> 			FILL	(might have object count for grins)
Ross> 			...	(the possible FILL objects)
Ross> 		}
Ross> 		FORM {
Ross> 			DSYM	(might have object count for grins)
Ross> 			...	(the object in the symbol)
Ross> 		}
Ross> 		...	(other objects...)
Ross> 	}

Please bear in mind that these bits of data (DSYM, XTRN, etc.) have no
meaning beyond the context of the FORM DR2D, and that FORMs are a
top-level construct.  Apart from the general undesirability of
cluttering the FORM namespace, you would have little use for an IFF
file consisting solely of FORM XTRN, ne?

The "right" way to do it is similar -- Use FORM DR2D nested, and have
the nested ones contain the XTRN/DSYM/whatever chunks, and further
nested FORM DR2D chunks if need be.

Ross> I actually rather like this; I'm not sure we'll be able to
Ross> change it though.  We have on the order of 3-5 different
Ross> companies working on importing our file format at this point
Ross> (I'll not name names).  We'll have to talk with them and
Ross> determine how far along they are toward implementing the
Ross> readers.

Well, couldn't you write a reader for the old format and a
reader/writer for the revised one, so you could just convert them all?

Stuart> The basic point is that the nesting *must* look something like
Stuart> this in order to be IFF.  The mechanism specified in note 4 is
Stuart> vaguely like IFF, but it isn't, and never will be.  If this
Stuart> format is released in its current form, I will certainly do
Stuart> everything I can to prevent it from ever being accepted as a
Stuart> Commodore standard IFF FORM.  The damage to IFF integrity
Stuart> would be too great.

I agree here.

Ross> Oh, I don't know that it would be *that* great...  Is
Ross> iffparse.library not flexible enough to do something like this:

The point is that it completely violates the IFF guidelines -- it's
not a question of making the generic reader flexible enough.
Compromising the IFF guidelines would defeat its entire intended
purpose.

Ross> Sorry it took so long to respond - business trips and all.  I'm
Ross> looking forward to the next set of comments...

Well, I'm not Stuart, but I thought I'd toss in my thoughts.  Good
luck with fixing it up (please do!).  Cheers!

Deven
-- 
Deven T. Corzine        Internet:  deven@rpi.edu, shadow@pawl.rpi.edu
Snail:  2151 12th St. Apt. 4, Troy, NY 12180   Phone:  (518) 274-0327
Bitnet:  deven@rpitsmts, userfxb6@rpitsmts     UUCP:  uunet!rpi!deven
Simple things should be simple and complex things should be possible.

cunniff@hpfcso.HP.COM (Ross Cunniff) (11/28/89)

> You seem to have missed the point; what is important is not
> flexibility for manipulation and calculation so much as flexibility
> for interchange.  This is to be an _Interchange_ File Format, so it
> should be designed for maximum transportability.  Regardless of the
> usefulness of floating point numbers, extra code is required to handle
> them.  No sense in forcing a program which would use fixed point to
> handle floating point for the IFF file when it has no other use for
> floating point.  It's sort of like creating an IFF format which uses
> EBCDIC for text because you want to be creating the data on an IBM
> mainframe -- sure, it's more convenient, but less workable, in the
> long run...

Let me turn your argument around - 'No sense in forcing a program which
could use the extra precision of IEEE floating point to lose it'
It is not AT ALL like using EBCDIC; IEEE is a mandated standard,
vis the name.  It holds across a variety of architectures and
operating systems, at least as much as two's complement fixed point
integers.

> Integers certainly ARE more natural than floating point, from the
> perspective of the processor -- the 68000 certainly can't deal with
> IEEE as easily as fixed point...  Yes, you can compile in the IEEE
> code, but it increases code size and complexity, which is an
> undesirable consequence if the program has no use for floating point
> internally...

How would your argument change if you were talking about a 68020/68881
system?  A 68040 with on-chip floating point?  An 8088/8087 system?
For the latter, I suspect that single precision IEEE floating point
is MORE natural than 32-bit fixed point integers.

Here's a little routine to convert IEEE single-precision floating point
numbers to a fixed-point representation given the position of the
decimal point (in bits).  It took about 5 minutes to write.  It uses
NO floating point arithmetic.  It compiles to around 200 bytes of object
code.  It *is* a little compiler specific wrt the bit order of bitfields,
but that's easy to massage.  It wouldn't be any harder to write the
inverse transformation:

	typedef union {
	    struct {
		unsigned int	Sign :  1;
		unsigned int	Exp  :  8;
		unsigned int	Mant : 23;
	    }
		Bits;
	    long
		Long;
	} IEEE;


	long ScaleIEEE( val, mag )
	long val;
	int mag;
	{
	    IEEE num;
	    unsigned long base;
	    long res;
	    int nshift;

	    num.Long = val;
	    if( num.Long == 0 )		return 0;

	    base = 0x800000;		/* Implicit leading '1' */
	    base |= num.Bits.Mant;	/* Get entire number */

	    /* Calculate amount to shift: excess 127, 23 bits precision, */
	    /* plus scale factor */
	    nshift = ((int)num.Bits.Exp) - 127 + mag - 23;

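	    /* Shift it: left or right.  A shift count as wide as the type */
	    /* is undefined in C, so very small values become zero first */
	    if( nshift <= -24 )	res = 0;
	    else if( nshift < 0 )	res = base >> -nshift;
	    else		res = base << nshift;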

	    /* Add in the sign */
	    if( num.Bits.Sign )	res = -res;

	    return res;
	}

There.  Was that so bad?
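
For compilers whose bitfield layout differs, the same idea can be sketched
with shifts and masks instead of bitfields.  This is an illustrative
variant, not the routine posted above: the name ieee_to_fixed, the use of
<stdint.h> types, and the choice to saturate values that do not fit are
all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Convert the raw bit pattern of an IEEE single to fixed point with
 * frac_bits fractional bits, using integer operations only.
 * Hypothetical helper: results that do not fit saturate at +/-INT32_MAX. */
int32_t ieee_to_fixed(uint32_t bits, int frac_bits)
{
    uint32_t sign = bits >> 31;
    int      exp  = (int)((bits >> 23) & 0xFF);
    uint32_t mant = bits & 0x7FFFFFu;
    int      shift;
    int32_t  res;

    assert(frac_bits >= 0 && frac_bits < 31);

    if (exp == 0)
        return 0;               /* zero, negative zero, and denormals */

    mant |= 0x800000u;          /* restore the implicit leading 1 */

    /* Excess-127 exponent, 23 fraction bits, plus the scale factor */
    shift = exp - 127 - 23 + frac_bits;

    if (shift <= -24)
        res = 0;                /* every mantissa bit is shifted out */
    else if (shift < 0)
        res = (int32_t)(mant >> -shift);
    else if (shift >= 8)
        res = INT32_MAX;        /* too large, including Inf and NaN */
    else
        res = (int32_t)(mant << shift);

    return sign ? -res : res;
}
```

With 16 fractional bits, the bit pattern 0x3F800000 (1.0) converts to 65536
and 0xC0200000 (-2.5) converts to -163840.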

(regarding layers)
> I'd also suggest using PROPs in a LIST -- more "IFF-like"...

I think here you're going overboard in the IFF philosophy.  I think
it is much easier to parse the file without all that extra LIST, etc.
clutter around.  I agree about the nested object stuff, but the
layering seems so much simpler as a 16-bit ID for each object.

> Please bear in mind that these bits of data (DSYM, XTRN, etc.) have no
> meaning beyond the context of the FORM DR2D, and that FORMs are a
> top-level construct.  Apart from the general undesirability of
> cluttering the FORM namespace, you would have little use for an IFF
> file consisting solely of FORM XTRN, ne?

Oops.  You're correct.  How about:

		FORM {
			DR2D
			DRHD {
				...		-- various drawing attributes
			}
			...			-- root level objects
			FORM {
				DR2D
				GRUP {
					...	-- various group attributes
				}
				...		-- grouped objects
			}
			...
			FORM {
				DR2D
				XTRN {
					...	-- various extern attributes
				}
				...		-- grouped objects
			}
		}

and so on.  It would be nice if there was some requirement that the
GRUP, XTRN, etc. chunks be the first ones in their forms, but I'm
not sure that IFF allows any mandates like that.

I still haven't gotten a chance to look at FTXT.FONS; I'll do that
and report later (I do have the Commodore IFF document).

> Deven

				Ross

shf@well.UUCP (Stuart H. Ferguson) (12/10/89)

+---- cunniff@hpfcso.HP.COM (Ross Cunniff)
| shf@well.UUCP (Stuart H. Ferguson) writes:
| >Storing coordinates in floating point is not an ideal choice.  [ ... ]
| >  Fixed point numbers, however,
| >are more accurate and operations on them are faster, but fixed within a 
| >narrow range.  This makes them more appropriate for calculations 
| >involving additions and subtractions.  Since coordinates of greatly 
| >differing magnitude cannot be mixed in the same drawing, [ ... ] fixed point 
| >coordinates (integers) are a more natural choice for drawings.
| 
| I disagree.  Floating point numbers are infinitely  more flexible
| than integers.

Perhaps.  They are undoubtedly more complex, which is my objection.  Since
a drawing can be represented just as well with integers (better, in fact),
and integers are simpler than floating point numbers, why not use them?

| Why can't coordinates of greatly differing
| magnitude be mixed in the same drawing?  [ ... ] What if I wanted 
| objects of great detail spread across a large distance (a map of the
| solar system, perhaps)?

You can't do it -- it's an inherent limitation of floating point.  FP 
numbers are specified by a mantissa (M) between 1 and N, and an exponent 
(E) which gives a magnitude multiplier for the mantissa.  The complete
number is given by the expression

	M * (N^E)

where N is the base of the number system.  For IEEE single precision numbers,
N is 2, E is an integer between -126 and 127, and M is a 24-bit fixed point 
number greater than or equal to 1 and less than 2.  (There's also a sign
bit, of course.)

To give a simple illustration of the limitations of the format, imagine that
we have a floating point number in base 10, with three digits of mantissa.
This means that we can represent numbers like 999, 87600 and .00765.  We 
cannot represent 1.234 accurately because we only have three digits of 
mantissa -- the best we can do is approximate it with 1.23.

So, let's pretend that we have a ruler 2000 units long and we want to put a
mark along the ruler every 1 unit.  So we start putting marks at 1, 2, 3, 4,
etc., until we get to 999.  Now we put a mark at 1000.  Fine.  But what 
happens when we try to put a mark at 1001?  We can't -- we've run out of 
digits of mantissa.  We can mark positions at 1000 and 1010, but we can't 
mark the nine intermediate positions, because we were forced to increment
the exponent of our number to express a greater magnitude and gave up some 
precision at the low end in the process.

The exact same thing happens with binary floating point formats.  Now, IEEE
single precision has 24 bits of mantissa, and double precision has 53 bits,
but this only makes the effect harder to spot.  But the example of the solar
system will do it.  Suppose we want to mark out the distance from the Sun to
the Earth with one mile increments.  We will do fine until we get about 16
million miles out when we run out of mantissa and are forced to start placing
our marks TWO miles apart.  At 32 million miles we can only make marks FOUR
miles apart and so on.  Each time the distance doubles it halves our effective 
precision.  By the time we reach the Earth, at about 100 million miles from
the Sun, our marks are 8 miles apart.  This means that 8 miles is the 
smallest distance you can measure on the Earth in this scale model of the 
Solar system.

Now consider, if we took that 100 million miles and represented it with a
32 bit integer, the smallest distance we could measure would be about 120 
FEET -- more than 250 times smaller than what we can measure with an equal
sized FP number.
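
The spacing argument can be checked numerically.  What follows is a minimal
sketch, assuming that float is a 32-bit IEEE single on the machine at hand;
float_spacing and integer_spacing are hypothetical helper names invented for
this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Gap between x and the next larger representable single.
 * Assumes float is a 32-bit IEEE single and x is positive and finite. */
float float_spacing(float x)
{
    uint32_t bits;
    float next;

    assert(sizeof(float) == sizeof(uint32_t) && x > 0.0f);
    memcpy(&bits, &x, sizeof bits);
    bits += 1;                  /* next bit pattern is the next larger value */
    memcpy(&next, &bits, sizeof next);
    return next - x;
}

/* Grid step when a span of `range` units is divided evenly over all
 * 2^32 values of a 32-bit integer. */
double integer_spacing(double range)
{
    return range / 4294967296.0;
}
```

Under those assumptions float_spacing(100.0e6f) is 8, while
integer_spacing(100.0e6) is roughly 0.023; with the unit taken as one mile,
that is about 123 feet.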

| What if I want to specify coordinates in millimeters?

Easy -- just call each integer unit a micron.  Then 1000 units is a 
millimeter.  Maximum size of the drawing: 4.3 km.  My proposal from last
time allowed drawings to do this by specifying the bounding box for the 
drawing in both integer and floating point.  The floating point bounding
box allows you to specify the size of the drawing in whatever units you
want (which floating point numbers are good for); the integer bounding
box allows you to impose a very fine uniform grid on that bounded area
(which integers are good for).  Best of both worlds.

| [ ... ] Integers are no more natural than floating point. [ ... ]

One advantage of the proposal above is that programs that are not 
interested in the real-world scale of the drawing can use the integer
information without being forced to deal with floating point.  Background
art for windows, for example, could be drawn without having to know the
actual size of the drawing.  It would be a shame to force Intuition to
load the IEEE floating point library just to draw drawings, especially
since it can be avoided so easily.  (If anything, it will make assembly
language programmers happy :-).


| >The bounding box information simply doesn't belong here.  [ ... ]
| >Because the box includes the line 
| >width, mitering makes this a non-trivial calculation.  In fact, because 
| >there's no specified limit on the length of a miter, the bounding box is
| >_undefined_ in the general case.

| I should have specified that the miter limit is assumed to be the PostScript
| default - 10x line width, if I remember correctly (I left my info at home).

It's still a difficult calculation.  I would bet that 8 programs out of 10
would just fudge it and add the miter limit plus 15% to all sides of the
bounding box.

| As for useful purposes, a display program could trivially decide whether
| an object is visible in its current viewport and not display objects
| which are obscured.

This is true, but this is a run-time consideration which argues that the
bounding box is useful information to have in a program that is continuously
updating the screen.  What purpose does it serve *in the file*? -- that's 
my question.  In my experience, an IFF format should be the distilled-down
essence of the data to be exchanged.  The file should represent only the
substance of the data objects and their relationships in the most abstract
possible way.  But if you take away the bounding box information, the picture 
is unchanged!  Also, given the objects in the drawing, you can *derive* 
their bounding boxes.  This strongly suggests that the bounding box is not 
an essential part of an object and should not be in the file format.
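
As a rough illustration of deriving the box at load time, here is a minimal
sketch.  The Point and Box types, the integer coordinates, and the name
derive_bbox are all assumptions made for the example and are not taken from
the DR2D specification.  The pad argument stands in for whatever allowance
the reader chooses for line width and miters, which is the hard part of the
calculation and is not attempted here.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { long x, y; } Point;                  /* hypothetical vertex */
typedef struct { long xmin, ymin, xmax, ymax; } Box;  /* hypothetical box    */

/* Bounding box of n >= 1 vertices, grown by pad units on every side. */
Box derive_bbox(const Point *pts, size_t n, long pad)
{
    Box b;
    size_t i;

    assert(pts != NULL && n > 0 && pad >= 0);

    b.xmin = b.xmax = pts[0].x;
    b.ymin = b.ymax = pts[0].y;
    for (i = 1; i < n; i++) {
        if (pts[i].x < b.xmin) b.xmin = pts[i].x;
        if (pts[i].x > b.xmax) b.xmax = pts[i].x;
        if (pts[i].y < b.ymin) b.ymin = pts[i].y;
        if (pts[i].y > b.ymax) b.ymax = pts[i].y;
    }

    /* Allow for stroke width (and miters, if the caller accounts for them) */
    b.xmin -= pad;  b.ymin -= pad;
    b.xmax += pad;  b.ymax += pad;
    return b;
}
```

A single pass over the vertices is enough, so the cost is linear in the size
of the object being read.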

| I would be willing to extend the meaning of these numbers as follows:
  [ allows for bounding box to be marked invalid ]

It would be even better if you would make the whole structure optional,
or leave it out altogether.  Those 16 bytes per object could be put to
much better use.


| >| /* 8x8 bitmap fill patterns */ /* OBSOLETE */
| We haven't decided how
| useful this is in an object-oriented program.  All Mac programs seem
| to have them (a powerful clue...)

Yeah, but a powerful clue which way? :-)  Actually, can't this be done
in DR2D with a PATT containing a RAST?

| >| XTRN (0x4554524E)	/* Externally controlled group */
| On the Amiga, an externally controlled group is one where for each
| manipulation performed on that group (i.e., resizing, rotating, moving,
| etc.) the ARexx macro named 'ApplName' is called with appropriate flags,

This doesn't seem very portable.  I would recommend that you make this an
optional chunk, such that a reader that can't deal with it can safely ignore
it.  It's an interesting idea, and I think I know what you're trying to
accomplish, but I wonder if this works for an IFF format.  I don't feel
I have enough grasp of the implications to say you should remove it, so
I'd say give it a try and see what happens.  We'll find out in a few years
if it was the right decision or not... :-)

| >| /* Color map */
| > [ ... ] this chunk is the same as the ILBM.CMAP chunk, which is 
| >probably a good thing.
| Yep.  Exactly the same (except that all 8 bits of each of R G and B are
| potentially significant...)

Nitpick time:  All eight bits in the ILBM.CMAP chunk are potentially 
significant too, it's just Amiga programs that truncate them to four.  I
imagine that a drawing program on the Amiga would do the same thing given
a DR2D.CMAP.

| | RAST (0x52415354)	/* Raster image */
 [ ... ] 
| >the "Closest" array is information that is best computed by the target
| >application and should not be stored in the file.

| Have you any idea how long it can take to calculate the closest colors in a
| large bitmap?

Well, yes, I have some idea.  The naive algorithm will be O(N*M), where N
and M are the sizes of the source and destination color tables.  With N & M
up to 32 or so, this straight-ahead search will be plenty fast.  If the
color tables get into the 256 color range, you might need some hashing on
the color space to speed up the search, but I would think that would still
be pretty easy and fast enough to do on the fly.  Note that the time to 
compute this mapping has nothing to do with how large the bitmaps are, but
only depends on the size of the color tables.
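
The straight-ahead search might look something like the sketch below.  It
assumes that the "Closest" array maps each entry of the image's color table
to an index in the destination color table, and it uses squared RGB distance
as the measure of closeness; the Color type and the name map_closest are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { unsigned char r, g, b; } Color;  /* one CMAP-style entry */

/* For each of the n source colors, store in closest[i] the index of the
 * nearest of the m destination colors.  Runs in O(n*m) time. */
void map_closest(const Color *src, size_t n,
                 const Color *dst, size_t m,
                 unsigned char *closest)
{
    size_t i, j;

    assert(src != NULL && dst != NULL && closest != NULL);
    assert(m > 0 && m <= 256);      /* indices must fit in one byte */

    for (i = 0; i < n; i++) {
        long best = -1;             /* no candidate seen yet */
        for (j = 0; j < m; j++) {
            long dr = (long)src[i].r - (long)dst[j].r;
            long dg = (long)src[i].g - (long)dst[j].g;
            long db = (long)src[i].b - (long)dst[j].b;
            long d  = dr * dr + dg * dg + db * db;
            if (best < 0 || d < best) {
                best = d;
                closest[i] = (unsigned char)j;
            }
        }
    }
}
```

For two 32-entry tables the inner statement runs 1024 times, regardless of
how many pixels the bitmap holds.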

| I might consider, however, a chunk that looks like:
| 	FORM {
| 		RAST	(put virtual coordinates and 'closest' stuff here)
| 		FORM {
| 			ILBM
| 			...	(etc)

Except now you're creating a new FORM, which is not a great idea.  See below.

| >Dumping all the font names in one big lump is not a very flexible way
| >to indicate fonts, from an IFF standpoint.  For example, there is no
| >way to selectively override a single font in a nested FORM.  For a better,
| >more IFF-like way to specify fonts, look at the FTXT.FONS chunk.  Better
| >yet, use it!

| Well, for one thing, these fonts are *not* the standard Amiga fonts.

Neither are the fonts in FTXT.FONS.  Remember, nothing in IFF is Amiga
specific (at least none of the required chunks) -- it is intended as an 
interchange format.  The FONS chunk also provides some other general 
information about the font to help make a good decision in case a 
different font needs to be substituted.

| [ ... ] it simplifies text objects, since all they have to include is an
| index into the font table to tell which font they use.

That's how FTXT.FONS chunks work too.  Each chunk has an index which is then 
used to select the font later.

| Also, it's very nice to have a list of the fonts all in one place.

Why?  I gave you an example of something you can do when they're separate;
you give me an example of an advantage of having them together.


| OK.  Now for the most serious objections:
| >| NOTE 4: The  data in a FILL,  DSYM  or  GRUP chunk is a series of nested
| >| 	object chunks.  All object chunks except DSYM, CMAP, PATT, FILL,
| >| 	and FONT are allowed in a DSYM or GRUP.

| >This one is a serious, serious problem.  This type of nesting is not 
| >supported by IFF, and is inconsistent with the provided mechanism.
| > [ ... ] These nested chunks cannot, for example, be read by the
| >generic iffparse.library, since the library was designed to deal with
| >standard IFF nesting constructs.

| How about:
|	FORM {
|		DR2D
|		...	(a bunch of objects)
|		FORM {
|			GRUP	(has an object header)
 			^^^^	^^^^^^^^^^^^^^^^^^^^^^
|			...	(a bunch of objects)
|		}
 [ ...etc. ]

I can't tell what you're suggesting here.  This "GRUP" ID appears right
at the start of the FORM, which makes it look like the FORM-type for the 
FORM; however, it also looks like it has content, which makes it look 
like a chunk.  This doesn't make sense.

If you mean the latter, i.e. that the GRUP is a chunk, then the FORM-type
should be DR2D.  In this case, the file would look like this:

	"FORM" {
		"DR2D"
		"DRHD" { drawing header }
		...				-- object chunks
		"FORM" {
			"DR2D"
			"GRUP" { group header }
			...			-- more object chunks
		}
		...				-- even more object chunks
	}

If this is what you mean, then I agree that this is a good way to go.

If, on the other hand, you were proposing the former possibility, i.e.
that "GRUP" is a FORM-type in itself, then I don't think this is a good
way to go.  Creating a new FORM-type is conceptually creating a whole
new entity with its own syntax and content.  Since a "GRUP" FORM is
essentially exactly the same as a "DR2D" FORM, it really should not be
its own FORM-type.  Same with the other types: FILL, PATT, etc.
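
To make the nesting concrete, here is a minimal sketch of a reader that
walks standard IFF structure in memory: a 4-byte ID, a 4-byte big-endian
length, the data, and a pad byte after odd-length data, with a FORM's data
starting with its 4-byte FORM-type.  This is not how iffparse.library works
and it ignores LIST, CAT and PROP; the names be32 and count_forms are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Read a 32-bit big-endian value. */
static unsigned long be32(const unsigned char *p)
{
    return ((unsigned long)p[0] << 24) | ((unsigned long)p[1] << 16) |
           ((unsigned long)p[2] << 8)  |  (unsigned long)p[3];
}

/* Walk the chunks in buf[0..len) and count the FORMs whose FORM-type is
 * `type`, including nested ones.  Returns -1 on malformed input. */
long count_forms(const unsigned char *buf, size_t len, const char *type)
{
    long total = 0;
    size_t pos = 0;

    assert(buf != NULL && type != NULL && strlen(type) == 4);

    while (len - pos >= 8) {
        const unsigned char *chunk = buf + pos;
        unsigned long size = be32(chunk + 4);

        if (size > len - pos - 8)
            return -1;                  /* chunk overruns its container */

        if (memcmp(chunk, "FORM", 4) == 0) {
            long inner;
            if (size < 4)
                return -1;              /* no room for the FORM-type */
            if (memcmp(chunk + 8, type, 4) == 0)
                total++;
            /* The rest of the FORM is just more chunks: recurse */
            inner = count_forms(chunk + 12, (size_t)size - 4, type);
            if (inner < 0)
                return -1;
            total += inner;
        }

        pos += 8 + (size_t)size;
        if ((size & 1) && pos < len)
            pos++;                      /* skip the pad byte */
    }
    return total;
}
```

Because a nested "FORM" "DR2D" is an ordinary FORM, the same loop handles
the outer drawing and every group inside it, and a reader that does not know
GRUP simply skips that chunk by its length.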

| I'm not sure we'll be able to change it though.
| We have on the order of 3-5 different companies working on importing
| our file format at this point (I'll not name names).  We'll have to
| talk with them and determine how far along they are toward implementing
| the readers.

Well, I can certainly appreciate the need to get product out the door,
as well as the kinds of compromises needed to do that in a timely manner.
It is sometimes more expedient to accept a kludgy solution now and fix it
later; however, this is one case where this is not true.  This format is 
a public interface, not just an internal issue.  There will be no 
opportunity to fix it later!  Unless you make this a private format --
which you cannot unless the other companies you mention are willing to
update their programs at the same time -- the decisions you make now will
stick.  And all of us in the Amiga community will have to live with the
consequences.

| Is iffparse.library not flexible enough to do something like this:
  [ code fragment deleted ... ]
| Actually, I don't know how iffparse.library works at all, as you can
| probably tell :-)  Let me know how off-base I am...

The code fragment that was deleted showed an example of recursive-descent 
IFF parsing code being used to delve into a GRUP chunk as if it were a
nesting construct.  No, you can't do this with iffparse because iffparse
is not a recursive-descent parser.  Hard-coded in the heart of the library
is an explicit state-machine that controls the traversal of IFF file
structures.  To treat GRUP chunks like FORM's and LIST's would require
re-wiring this central part of the parsing logic.

| >On a more positive note .. I mentioned earlier that there is a more
| >IFF-ish way to encode layers.  Rather than having the layer number as
| >a parameter in each object in the file, the layer ID could be a
| >property chunk with the file organized as a list of drawings, one in 
| >each layer.
| I don't particularly like this.  It makes moving objects around from
| layer to layer rather difficult.  

Unless you're editing your pictures with NewZAP, this is a run-time issue.  
You can structure your internal data objects to have a layer ID in each 
one if you prefer.  While writing, your program would stick all the objects 
with like ID's into the same DR2D FORM, and arrange those in a LIST.
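
The regrouping at write time could be sketched as below.  The Object
structure, its fields, and the name group_by_layer are hypothetical; a real
writer would emit one FORM per run of objects rather than just recording
where each run starts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    unsigned short layer;   /* run-time layer ID kept on each object   */
    size_t         order;   /* original drawing order within the file  */
} Object;

/* Order by layer first, then by original drawing order, so that the
 * stacking of objects inside a layer is preserved. */
static int by_layer(const void *a, const void *b)
{
    const Object *oa = a, *ob = b;

    if (oa->layer != ob->layer)
        return (oa->layer > ob->layer) - (oa->layer < ob->layer);
    return (oa->order > ob->order) - (oa->order < ob->order);
}

/* Sort objs by layer, record in starts[] the index where each layer's run
 * begins, and return the number of distinct layers.  starts must have room
 * for n entries. */
size_t group_by_layer(Object *objs, size_t n, size_t *starts)
{
    size_t i, layers = 0;

    if (n == 0)
        return 0;
    assert(objs != NULL && starts != NULL);

    qsort(objs, n, sizeof *objs, by_layer);
    for (i = 0; i < n; i++)
        if (i == 0 || objs[i].layer != objs[i - 1].layer)
            starts[layers++] = i;
    return layers;
}
```

Each run between starts[k] and starts[k+1] would then become one DR2D FORM
in the LIST, with the layer ID written once for the whole run instead of
once per object.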

| >This takes full advantage of the built-in IFF semantics for lexically
| >scoped properties within LIST's.  It is, of course, somewhat harder
| >to read, but not that much.
| I think it is quite a bit harder to read, myself.  This was one of the
| hot points of discussion when the format was originally posted.  The
| consensus was that 16-bit layer id's on each object was a better
| way to go.

What was the basis for this consensus?  My position here is the same
"distilled essence" argument about the content of IFF formats.  A simple
drawing has no layers.  A layered drawing is actually several drawings,
each one in a different layer, constructed using IFF syntax.

| Sorry it took so long to respond - business trips and all.  I'm looking
| forward to the next set of comments...

At least you'll understand why this response took so long. :-)
-- 
		Stuart Ferguson		(shf@well.UUCP)
		Action by HAVOC		(ferguson@metaphor.com)