[net.ai] An Image Format Standard

rsk@pucc-k (Wombat) (07/17/85)

It has been pointed out that one of the obstacles in the path of
sharing research on machine vision and related areas is the lack of a
common image format that would allow investigators to freely exchange
images.  (The recent article by David Sher of the University of
Rochester in net.ai comes to mind.)  It seems to me that perhaps the
time has come for a committee to be formed, either by the IEEE or ACM,
or both, to define such a standard.

Certainly, different research areas have their own requirements for
such work; I would expect that those working on assembly-line
automation would be interested in different criteria than those working
on ultrasonic medical imaging; however, a standard could allow a few
different formats, with clearly defined conversion algorithms from one
to another.  This certainly falls short of the ideal of one image
format, but it is better than the current state of affairs, with
hundreds of incompatible formats.

The problem is complex; and with so many existing formats, debate is
likely to be long and vigorous.  Also, once a standard is agreed upon,
a great many programs would have to be modified to understand it, with
what is likely to be considerable effort.  However, I believe that the
time and energy spent on discussion and (eventual) software changes
will pay off over time as researchers are able to share information
more freely.  (How many of you have programs which do nothing other than
convert someone else's image format to yours, or vice versa?)

In summary, the current state of affairs is such that there are
probably as many image formats as there are groups doing image
research; this represents an obstruction to progress that can be
overcome with a standard image format.

(It may be that such a standard or a committee working on a standard
already exists; if so, I apologize for my ignorance on the matter.)

-- 
Rich Kulawiec	rsk@{pur-ee,purdue}.uucp, rsk@purdue-asc.csnet
		rsk@purdue-asc.arpa or rsk@asc.purdue.edu

miles@vax135.UUCP (Miles Murdocca) (07/18/85)

Rich Kulawiec at Purdue writes:

> It has been pointed out that one of the obstacles in the path of
> sharing research on machine vision and related areas is the lack of a
> common image format that would allow investigators to freely exchange
> images.  (The recent article by David Sher of the University of
> Rochester in net.ai comes to mind.)  It seems to me that perhaps the
> time has come for a committee to be formed, either by the IEEE or ACM,
> or both, to define such a standard.

Yes, we do need some standards.  But I don't want to wait for a committee
to be formed by ANSI (or IEEE, ACM) and have a monstrous standard produced
that is difficult to use.  How about this idea:  subscribers to the net
informally create a standard that fills their needs as well as the needs
of anyone else they can think of.  Net users will then proliferate / enforce
the standard at their own sites until the bugs get worked out and the
standard gains some acceptance.

For starters, I suggest this: every image (or set of images) will start
with a fixed size header that gives information on the rest of the image(s).
Such information could be X dimension (in pixels), Y dimension, type
of data (graphic or raster), number of bits per pixel, number of images,
etc.  Color maps and data would then follow.  Ideas or comments, anyone?
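
To make the fixed-size header idea concrete, here is a minimal sketch in C.
The 16-byte layout, the field names, and the choice of most-significant-byte-
first storage are all assumptions made for illustration; nothing here has
been agreed on by anyone.

```c
#include <stdint.h>

/* Hypothetical 16-byte header.  Field names, widths, and order are
 * assumptions for illustration only, not an agreed standard. */
enum { HDR_SIZE = 16 };
enum { DATA_RASTER = 0, DATA_GRAPHIC = 1 };

struct img_header {
    uint32_t width;          /* X dimension in pixels */
    uint32_t height;         /* Y dimension in pixels */
    uint16_t data_type;      /* DATA_RASTER or DATA_GRAPHIC */
    uint16_t bits_per_pixel; /* number of bits per pixel */
    uint16_t num_images;     /* number of images that follow */
    uint16_t cmap_entries;   /* 0 if no color map follows the header */
};

/* Fields are stored most significant byte first, so the bytes on disk
 * do not depend on the machine that wrote them. */
static void put16(unsigned char *p, uint16_t v)
{
    p[0] = (unsigned char)(v >> 8);
    p[1] = (unsigned char)(v & 0xff);
}

static void put32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)((v >> 16) & 0xff);
    p[2] = (unsigned char)((v >> 8) & 0xff);
    p[3] = (unsigned char)(v & 0xff);
}

static uint16_t get16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t get32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Serialize the header into exactly HDR_SIZE bytes. */
void hdr_encode(const struct img_header *h, unsigned char out[HDR_SIZE])
{
    put32(out + 0, h->width);
    put32(out + 4, h->height);
    put16(out + 8, h->data_type);
    put16(out + 10, h->bits_per_pixel);
    put16(out + 12, h->num_images);
    put16(out + 14, h->cmap_entries);
}

/* Rebuild the header from HDR_SIZE bytes read from a file. */
void hdr_decode(const unsigned char in[HDR_SIZE], struct img_header *h)
{
    h->width = get32(in + 0);
    h->height = get32(in + 4);
    h->data_type = get16(in + 8);
    h->bits_per_pixel = get16(in + 10);
    h->num_images = get16(in + 12);
    h->cmap_entries = get16(in + 14);
}
```

Writing the fields out byte by byte, rather than dumping the struct, keeps
the file independent of the writer's byte order and structure padding.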

    Miles Murdocca, 4B-525, AT&T Bell Laboratories, Crawfords Corner Rd,
    Holmdel, NJ, 07733, (201) 949-2504, ...{ihnp4}!vax135!miles

nather@utastro.UUCP (Ed Nather) (07/19/85)

> For starters, I suggest this: every image (or set of images) will start
> with a fixed size header that gives information on the rest of the image(s).
> Such information could be X dimension (in pixels), Y dimension, type
> of data (graphic or raster), number of bits per pixel, number of images,
> etc.  Color maps and data would then follow.  Ideas or comments, anyone?
> 
>     Miles Murdocca, 4B-525, AT&T Bell Laboratories, Crawfords Corner Rd,

The reference to the FITS standard for image transfer was posted to this
newsgroup a couple of days ago.  Your description above sounds just like
FITS with many details missing.  Why not look at FITS first, and then
change it if you can't live with it?  Why re-invent a new "standard?"

-- 
Ed Nather
Astronomy Dept, U of Texas @ Austin
{allegra,ihnp4}!{noao,ut-sally}!utastro!nather
nather%utastro.UTEXAS@ut-sally.ARPA

msp@ukc.UUCP (M.S.Parsons) (07/20/85)

In article <1130@vax135.UUCP> miles@vax135.UUCP (Miles Murdocca) writes:
>How about this idea:  subscribers to the net informally create a standard 
>that fills their needs as well as the needs of anyone else they can think 
>of. For starters, I suggest this: every image will start with a fixed size 
>header that gives information on the rest of the image(s).
>Such information could be X dimension (in pixels), Y dimension, type
>of data (graphic or raster), number of bits per pixel, number of images,
>etc.  Color maps and data would then follow.  Ideas or comments, anyone?

Good idea. I would love to trade images round the net. I think the header
needs to be somewhat more flexible though: What about run-length, quadtree,
octree and other 3-D data? To keep size to a minimum, every image could be
put through compress. We'll need uuencode/decode too. What about the HIPS
system? Reference in bib format:

%A Michael S. Landy
%A Yoav Cohen
%A George Sperling
%D 1984
%I Academic Press Inc.
%J Computer Vision, Graphics and Image Processing
%K HIPS, UNIX, Image Processing, filters, Headers
%P 331-347
%T HIPS: A UNIX-Based Image Processing System
%V 25
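
As a point of reference for the run-length option raised above, here is a
minimal sketch in C of one simple byte-oriented scheme.  The (count, value)
pair layout and the function names are assumptions for illustration; they
are not taken from HIPS or from any other format mentioned in this thread.

```c
#include <stddef.h>

/* Encode n bytes from src as (count, value) pairs, with count in 1..255.
 * Returns the number of bytes written to dst, or 0 if dst (capacity cap)
 * is too small.  An empty input also yields 0. */
size_t rle_encode(const unsigned char *src, size_t n,
                  unsigned char *dst, size_t cap)
{
    size_t i = 0, out = 0;

    while (i < n) {
        unsigned char v = src[i];
        size_t run = 1;

        /* Extend the run while the value repeats, up to 255 bytes. */
        while (i + run < n && src[i + run] == v && run < 255)
            run++;
        if (out + 2 > cap)
            return 0;
        dst[out++] = (unsigned char)run;
        dst[out++] = v;
        i += run;
    }
    return out;
}

/* Expand (count, value) pairs back into bytes.  Returns the number of
 * bytes written to dst, or 0 if dst (capacity cap) is too small. */
size_t rle_decode(const unsigned char *src, size_t n,
                  unsigned char *dst, size_t cap)
{
    size_t out = 0;

    for (size_t i = 0; i + 1 < n; i += 2) {
        size_t run = src[i];

        if (out + run > cap)
            return 0;
        for (size_t k = 0; k < run; k++)
            dst[out++] = src[i + 1];
    }
    return out;
}
```

A scheme like this only pays off on images with long uniform runs; on
noisy data it can double the size, which is one reason a header would need
to say which encoding is in use.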

--------------------------------------------------------------------------------
Mike Parsons           UUCP: ..!seismo!mcvax!ukc!msp  ARPA: MSP%UKC@UCL-CS.ARPA
JANET: MSP%UKC@UCL-CS  MAIL: Computing Lab, Univ. of Kent, Canterbury, Kent, UK.

jss@sjuvax.UUCP (J. Shapiro) (07/26/85)

> 
>      Take a good look at the standards for facsimile transmission, such as
> CCITT group III and IV FAX compression standards.  This territory is well
> covered.
> 
> 					John Nagle

These things are great for black and white, but it is not clear to me that
they cover color. In addition, even if all you want to do is move images
as pixel/intensity/color data, the compression achieved by Group IV FAX is
a joke.

Finally, neither of these (to my knowledge) facilitates the transmission
of pictures which are _described_ more easily than they are transmitted.
For example, a circle inside a triangle would clearly transmit more
efficiently as graphics commands than as a bit image.

Jon Shapiro

ken@turtlevax.UUCP (Ken Turkowski) (07/29/85)

In article <182@ukc.UUCP> msp@eagle.UUCP (Mike Parsons. ) writes:
>In article <1130@vax135.UUCP> miles@vax135.UUCP (Miles Murdocca) writes:
>>How about this idea:  subscribers to the net informally create a standard 
>>that fills their needs as well as the needs of anyone else they can think 
>>of. For starters, I suggest this: every image will start with a fixed size 
>>header that gives information on the rest of the image(s).
>>Such information could be X dimension (in pixels), Y dimension, type
>>of data (graphic or raster), number of bits per pixel, number of images,
>>etc.  Color maps and data would then follow.  Ideas or comments, anyone?
>
>Good idea. I would love to trade images round the net. I think the header
>needs to be somewhat more flexible though: What about run-length, quadtree,
>octree and other 3-D data? To keep size to a minimum, every image could be
>put through compress. We'll need uuencode/decode too.

-----------------------------------------------------------------
Additionally, there is
byte ordering - If there are any data stored in quanta other than bytes,
	there is a problem transferring images generated on a big-endian
	machine to a little-endian one.
scanning direction: right-to-left then top-to-bottom, right-to-left then
	bottom-to-top, or any of the other 6 possible ways to scan a
	2-dimensional signal to convert it into a 1-dimensional one.
	Still more orderings exist for 3-dimensional signals.
packing orientation: Several pixels can be packed into a "word".
	Are they packed left-to-right or right-to-left?
contiguity within words: If several pixels are packed into the same
	word, what is their geometric relationship (horizontal or
	vertical).  You may think that this is a weird concept,
	but a certain workstation manufacturer's graphics board
	seems to work fastest when 1-bit pixels are packed
	16-to-a-word horizontally-contiguous and then block-scanned
	vertically.
cursor hot spot: If an image is used as a cursor, where is the relative
	location of the point to which the cursor points?
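
The byte-ordering item can be handled by storing a known marker word at the
front of the file, which is what the "byteorder" field in the header below
is for.  Here is a minimal sketch in C of how a reader might use such a
marker.  The helper names are hypothetical, and the sketch assumes the
writer stored the marker as a native 32-bit quantity.

```c
#include <stdint.h>
#include <string.h>

/* Marker value: the characters '3','2','1','0' from the most significant
 * byte down to the least significant byte. */
#define ORDER_MARK (((uint32_t)'3' << 24) | ((uint32_t)'2' << 16) | \
                    ((uint32_t)'1' << 8) | (uint32_t)'0')

/* Store the marker the way the writing machine lays out a 32-bit word. */
void write_native_mark(unsigned char out[4])
{
    uint32_t m = ORDER_MARK;

    memcpy(out, &m, 4);
}

/* Inspect the four marker bytes as they appear in the file.
 * Returns 1 if the writer was big-endian ("3210" on disk),
 * 0 if it was little-endian ("0123" on disk), and -1 otherwise. */
int order_is_big_endian(const unsigned char mark[4])
{
    if (memcmp(mark, "3210", 4) == 0)
        return 1;
    if (memcmp(mark, "0123", 4) == 0)
        return 0;
    return -1;
}

/* Read a 16-bit quantity from the file in the writer's byte order. */
uint16_t read16(const unsigned char *p, int big_endian)
{
    return big_endian ? (uint16_t)((p[0] << 8) | p[1])
                      : (uint16_t)((p[1] << 8) | p[0]);
}

/* Read a 32-bit quantity from the file in the writer's byte order. */
uint32_t read32(const unsigned char *p, int big_endian)
{
    if (big_endian)
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
               ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8) | (uint32_t)p[0];
}
```

Because the reader assembles each value from individual bytes, it never
needs to know its own byte order, only the writer's.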

Below I submit a format that we've used in-house, which seems to meet
nearly every criterion.  I offer it to the net for comment and
improvement.  The only things not specifically included are encoding
(run-length, quadtree, octree, 3-D) which could be put into the
"ident" field of ImageSubHeader, and cursor hot-spot, which could
replace the "thisimage" field of ImageSubHeader.  The "thisimage"
pointer was originally proposed to allow locating all of the subheaders
at the beginning of the file, to accommodate an easily accessible index
for font files which contain an entire character set.

I'm not advocating that every institution accommodate every permutation
allowed by this header in every one of its application programs, but
some allowance should be made for an almighty conversion program
(albeit partially implemented) to convert between various popular
formats.
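
As an illustration of what such a conversion program has to do for the
packing-orientation case, here is a minimal sketch in C that pulls one pixel
out of a packed word.  The function and enum names are hypothetical.  The
two orders are meant to correspond to the LITTLEENDIAN and BIGENDIAN bits of
the packingorientation field in the header that follows.  The sketch assumes
the pixel size divides the word size evenly and that words are at most 32
bits.

```c
#include <stdint.h>

/* Where the first pixel of a word lives (names are illustrative). */
enum pack_order {
    PACK_LSB_FIRST, /* first pixel in least significant position */
    PACK_MSB_FIRST  /* first pixel in most significant position */
};

/* Extract pixel number `index` (0 is the first pixel) from a word of
 * `wordsize` bits that holds pixels of `bits` bits each.
 * Assumes 1 <= bits <= wordsize <= 32, that bits divides wordsize,
 * and that index < wordsize / bits. */
uint32_t unpack_pixel(uint32_t word, unsigned wordsize, unsigned bits,
                      unsigned index, enum pack_order order)
{
    unsigned per_word = wordsize / bits;
    /* Slot 0 is the least significant field of the word. */
    unsigned slot = (order == PACK_MSB_FIRST) ? (per_word - 1 - index)
                                              : index;
    uint32_t mask = (bits >= 32) ? 0xFFFFFFFFu : ((1u << bits) - 1u);

    return (word >> (slot * bits)) & mask;
}
```

For example, with 1-bit pixels packed 16 to a word, the word 0x8000 has its
first pixel set under PACK_MSB_FIRST and its last pixel set under
PACK_LSB_FIRST.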

-----------------------------------------------------------------

# This is a shell archive.  Remove anything before this line, then
# unpack it by saving it in a file and typing "sh file".  (Files
# unpacked will be owned by you and have default permissions.)
#
# This archive contains:
# image.h

echo x - image.h
cat > "image.h" << '//E*O*F image.h//'
/* These are the headers to be used on all images.  A simple image, with
 * one component, is characterized by a 32-byte header, followed by the
 * image.  More complex images with multiple components, as well as (whole
 * or partial) color maps may be included in this file by using one
 * 20-byte header plus a 12-byte header for each separate image
 * component.  The general structure follows:
 *
 *	RasterImageHeader	(20 bytes)
 *	Colormap		(optional; flagged in RasterImageHeader)
 *	ImageSubHeader		(12 bytes)
 *	Image1
 *	ImageSubHeader		(12 bytes, optional)
 *	Image2			(optional)
 *	...		(Should be at least one ImageSubHeader and Image)
 * 
 * If a colormap is supplied with the image, it is denoted by a nonzero
 * cmapsize in the RasterImageHeader.  The colormap is located immediately
 * afterward, and should be 832 bytes long, as described in
 * <colormap.h>:  32 for the header, 32 for the indexmap, and 3*256 for
 * the main and alternate colormaps.  The colormap can easily be skipped
 * (regardless of its conformity to the colormap.h standard) by using the
 * cmapsize parameter.
 * 
 * The ImageSubHeader is used in front of every image.  This is useful for
 * images with different sized components (as with an alpha component to
 * specify transparency) or for images with R, G, and B generated
 * separately.
 */

# define STDIMAGE 'I','M','A','G',('3'<<24)|('2'<<16)|('1'<<8)|'0',0,16,\
	SCANRTDN|HCONTIGUITY|BIGENDIAN|RIGHTJUSTIFIED,0

struct RasterImageHeader {		/* 20 byte introductory header */
	char	objecttype[4];	/* "IMAG" */
	long	byteorder;	/* ('3'<<24) | ('2'<<16) | ('1'<<8) | '0' */
	char	version;	/* 0 */
	char	wordsize;	/* Size (in bits) of the word */
	char	packingorientation;	/* Fields defined below */
	char	unused;		/* Should be zero */

	short	width;		/* Width of the raster image in pixels */
	short	height;		/* Height of the raster image in pixels */
	char	unused1[2];	/* Should be zero */

	short	cmapsize;     /* colormap size, or offset to ImageSubHeader */
};

struct ImageSubHeader {			/* 12-byte header for each image */
	char imagetype;		/* The type code of the image */
	char bitspercomp;	/* The number of bits per component */
	char fieldshift;	/* Bit number: 0 unless plane-scanned */
	char ident;		/* User's identification code (0 default) */
	long thisimage;		/* Should be 0; reserved for image offset */
	long nextimage;		/* Offset to next ImageSubHeader; 0 if none */
};

/* Below we have the bit definitions for the packingorientation field */
# define SCANRTUP	0x00	/* left-to-right, bottom-to-top */
# define SCANUPRT	0x01	/* bottom-to-top, left-to-right */
# define SCANUPLF	0x02	/* bottom-to-top, right-to-left */
# define SCANLFUP	0x03	/* right-to-left, bottom-to-top */
# define SCANLFDN	0x04	/* right-to-left, top-to-bottom */
# define SCANDNLF	0x05	/* top-to-bottom, right-to-left */
# define SCANDNRT	0x06	/* top-to-bottom, left-to-right */
# define SCANRTDN	0x07	/* left-to-right, top-to-bottom */

# define HCONTIGUITY	0x00	/* word has horizontally-contiguous pixels */
# define VCONTIGUITY	0x08	/* word has vertically-contiguous pixels */

# define LITTLEENDIAN	0x00	/* first pixel in least significant position */
# define BIGENDIAN 	0x10	/* first pixel in most significant position */

# define RIGHTJUSTIFIED	0x00	/* Right justification in field */
# define LEFTJUSTIFIED	0x20	/* Left justification in field */

/* Here is the encoding for the various image types */
# define IMAG_UNDEFINED	0

# define IMAG_VALUE	1
# define IMAG_AALINE	2
# define IMAG_INTENSITY	3
# define IMAG_ALPHA	4
# define IMAG_INVALPHA	5

# define IMAG_RGB	8
# define IMAG_RGBA	9
# define IMAG_RED	10
# define IMAG_GREEN	11
# define IMAG_BLUE	12

# define IMAG_DEPTH	16

/* #include <colormap.h>: to interpret the colormap, if included */
//E*O*F image.h//

exit 0
-- 

Ken Turkowski @ CADLINC, Menlo Park, CA
UUCP: {amd,decwrl,hplabs,nsc,seismo,spar}!turtlevax!ken
ARPA: turtlevax!ken@DECWRL.ARPA

jbn@wdl1.UUCP (08/05/85)

     Take a good look at the standards for facsimile transmission, such as
CCITT group III and IV FAX compression standards.  This territory is well
covered.

					John Nagle

julian@osu-eddie.UUCP (Julian Gomez) (08/07/85)

Lucasfilm has had an image format standard for a while now.  It has
even been used externally: images for the frame buffer show at
SIGGRAPH 85 were mailed on tapes using a simplified Lucasfilm
standard.

One problem with their format (and Ken's too) is the lack of stating
the aspect ratio for the pixels: square or 4:3.  Not all frame
buffers do them the same way.
-- 

	Julian "a tribble took it" Gomez
	The Ohio State University
	{ucbvax,decvax}!cbosg!osu-eddie!julian