[comp.graphics] How to map 24-bit RGB to 256 colors

phil@ux1.cso.uiuc.edu (08/29/89)

> i.e. Calculate the 3-D histogram and use the 256 values with highest frequency.
>      Then map original image, pixel by pixel - using closest entry in 256 color
>      palette, into the displayable image.
> 
> Problem 1: the histogram will be large (up to 262144 'bins')

or more!
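(For reference, 262144 is 64^3, i.e. the bin count you get after keeping
only the top 6 bits of each channel.  A minimal sketch of building such a
3-D histogram; the function and names are mine, not from the thread:)

```python
from collections import defaultdict

def rgb_histogram(pixels, bits=6):
    """Count pixels per bin after keeping the top `bits` of each channel."""
    shift = 8 - bits
    hist = defaultdict(int)
    for r, g, b in pixels:
        hist[(r >> shift, g >> shift, b >> shift)] += 1
    return hist
```

Using a sparse mapping instead of a flat 64x64x64 array means only the
bins actually touched by the image cost any memory.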

> Problem 2: is Euclidean distance (at least in RGB color space) a good
> 	measure of closeness of color ?

I would think so.  Also, two very close colors, each with a high frequency
of occurrence, could be represented by the SAME color in the output.  You
should examine the number of colors clustering around an area (in RGB color
space) to see whether it is tonal information that can be dithered instead,
input quantization noise, or some other legitimate occurrence that needs to
be retained.  I have no idea how to do this.
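(A minimal sketch of the scheme under discussion, i.e. keep the 256 most
frequent colors, then map each pixel to the nearest palette entry by
Euclidean distance in RGB space.  All names here are illustrative, not from
any actual implementation:)

```python
from collections import Counter

def popularity_palette(pixels, size=256):
    """Return the `size` most frequent (r, g, b) tuples."""
    return [color for color, _ in Counter(pixels).most_common(size)]

def nearest(color, palette):
    """Nearest palette entry by squared Euclidean distance in RGB."""
    r, g, b = color
    return min(palette,
               key=lambda p: (p[0] - r) ** 2 + (p[1] - g) ** 2 + (p[2] - b) ** 2)

def quantize(pixels, size=256):
    """Map every pixel to its nearest popularity-palette color."""
    palette = popularity_palette(pixels, size)
    return [nearest(p, palette) for p in pixels]
```

Note that the nearest() scan is O(palette size) per pixel; real
implementations use lookup tables or spatial search structures instead.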

> Problem 3: are the highest frequency colors in the histogram necessarily
>         the best to use? (Maybe there should be a minimum distance between
> 	bins used - (dithering?) image dependent ...)

When dealing with scanned input images, the number of bins actually used can
be quite large, with most having only a small frequency.  What about the bin
"way out yonder" (in RGB color space) with a count of 1?  What if there are
a lot of these WOY bins?

--Phil howard--  <phil@ux1.cso.uiuc.edu>

mitchell@cbmvax.UUCP (Fred Mitchell - QA) (08/29/89)

In article <5300026@ux1.cso.uiuc.edu> phil@ux1.cso.uiuc.edu writes:
>
>> i.e. Calculate the 3-D histogram and use the 256 values with highest frequency.
>>      Then map original image, pixel by pixel - using closest entry in 256 color
>>      palette, into the displayable image.
>> 
>> Problem 1: the histogram will be large (up to 262144 'bins')
>
>or more!
>
>> Problem 2: is Euclidean distance (at least in RGB color space) a good
>> 	measure of closeness of color ?
>
>I would think so.  Also, two very close colors, with high frequency of
>occurrence, could be represented by the SAME color out.  You should examine
>the number of colors clustering around an area (in RGB color space) to see
>if it is tonal information that can be dithered instead, input quantization
>noise, or other legitimate occurrences that need to be retained.  I have no
>idea how to do this.

It would also help to know what the source image is.  In my case, the source
was a 24-bit ray-traced image.  If you are digitizing an actual scene, then
the color quantization problem becomes a lot more acute if one is to
maintain realism.  But one can do only so much to make up for the lost
information.  Dithering reduces the effective resolution, for example.

>> Problem 3: are the highest frequency colors in the histogram necessarily
>>         the best to use? (Maybe there should be a minimum distance between
>> 	bins used - (dithering?) image dependent ...)
>
>When dealing with scanned input images, the number of bins actually used can
>be quite large, with most having only a small frequency.  What about the bin
>"way out yonder" (in RGB color space) with a count of 1?  What if there
>are a lot of these WOY bins?

It would seem, when one first thinks of it, that there may be some
'all-important' color in the scene SOMEWHERE that, even though it may not
be one of the most used, the scene would 'fall apart' without it.  At first
I thought that might be a possibility, but no, for the following reasons:

1) The loss of the really pertinent information is so great that that one
   little-used 'key color' would fall by the wayside.  Of course, someone
   might be able to deliberately manufacture an exception, but that would
   be rare in practice.

2) If the source image has such 'gross color variations' that the
   destination palette fills up early, then again, this little-used
   'key color' would be drowned in all the commotion.

In any case, to round this up: in doing color quantization from a very large
palette to a very small one, one must consider the destination colors one
has, how much dithering it would take to match the original and at what
expense of resolution, what information is really worth keeping, and whether
a theoretically correct color transformation is really worth the extra
overhead, especially if it will make little difference in the destination
image.  One must also consider what type of source image one is working with
in the first place (e.g. real vs. synthetic), how important it is to preserve
the color variation to maintain realism, and at what expense in terms of CPU
time and loss of resolution.
-- 

                                   |*******************************************|
	-Compliments of	       /// |* All thoughts and comments are solely    *|
	 Fred Mitchell	   \\\///  |* those of the Author and have nothing to *|
			    \XX/   |* do with Commodore-Amiga.		      *|
   Software QA - Commodore-Amiga   |*******************************************|

pierre@rhi.hi.is (Kjartan Pierre Emilsson) (09/01/89)

A year ago or so there was a similar discussion, in which a method originally
devised by Paul Heckbert was introduced, called Color Cube Compression.
I don't have the original posting, but the method went something like this:

         1.  Determine the bounding box of the picture in RGB space.

         2.  Divide the box in two along the longest axis, such that
             approximately half the pixels lie on the left of the
             division, and half on the right.

         3.  Repeat the procedure with each sub-box, always dividing
             along the longest axis of each sub-box, such that half
             of the pixels in the box lie to the right and half to
             the left.

         4.  When you have obtained 256 boxes, take the average of the
             colours inside each box, so you get one average colour
             per box.

         5.  Read in the picture and check, for each pixel, which box
             it belongs to, and then assign to that pixel the
             corresponding colour.

          _____________             _______________
         |        .   :|           |        . |   :|
         | .   .    ...|           | .  .     |... |
         |.  .       . | ------->  |. .       | .  |
         |_____________|           |__________|____|
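(The steps above can be sketched in a few lines of Python.  This is a
minimal illustration following the description in this post, not Heckbert's
actual code; all names are mine.  256 colours correspond to a split depth
of 8, since 2**8 = 256:)

```python
def _extent(box, axis):
    """Range of a box of (r, g, b) pixels along one axis."""
    vals = [p[axis] for p in box]
    return max(vals) - min(vals)

def median_cut(pixels, depth=8):
    """Recursively median-split `depth` times (up to 2**depth boxes),
    then return one average colour per box."""
    def split(box, d):
        if d == 0 or len(box) < 2:
            return [box]
        # Longest axis of this box's bounding box.
        axis = max(range(3), key=lambda a: _extent(box, a))
        if _extent(box, axis) == 0:
            return [box]               # a single colour cannot be split
        box = sorted(box, key=lambda p: p[axis])
        mid = len(box) // 2            # median: half the pixels each side
        return split(box[:mid], d - 1) + split(box[mid:], d - 1)
    boxes = split(list(pixels), depth)
    return [tuple(sum(c) // len(b) for c in zip(*b)) for b in boxes]
```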

When I tried implementing this method, I got into trouble when a lot of
pixels were concentrated on a single colour, because then the subdivision
was actually trying to split a single colour cell.  One way out of this
is to make a histogram of the picture, single out the sharpest peaks, and
allocate them specially.  The rest of the picture can then be treated
with the CCC algorithm.

									 -Kjartan
									 
P.S: I hope I got this method right.

---------------------------------------------------------------------
Kjartan Pierre Emilsson
Science Institute of the University of Iceland
Dunhaga 3
Reykjavik
Iceland      Net: pierre@krafla.rhi.hi.is

billd@fps.com (Bill Davidson) (09/03/89)

In article <1109@krafla.rhi.hi.is> pierre@rhi.hi.is (Kjartan Pierre Emilsson) writes:
>A year ago or so, there was a similar discussion and there a method originally
>devised by Paul Heckbert was introduced, called Color Cube Compression. 
>I don't have the original posting but the method went something like that:

[a description of Heckbert's MEDIAN CUT algorithm deleted]
>	and allocating them specially.  The rest of the picture can then be
>	treated with the CCC algorithm.
>									 
>P.S: I hope I got this method right.

Well, it would be right if you called it median cut.  It was also a
reasonable answer to the question.  The CCC algorithm that I know
about is the "Color Cell Compression" algorithm, which is a data
compression algorithm for color images, not a color quantization
algorithm.  It was published in the SIGGRAPH '86 proceedings.

--Bill Davidson

billd@fps.com (Bill Davidson) (09/03/89)

In article <600@celit.com> billd@fps.com (Bill Davidson) <= that's me writes:
>....  The CCC algorithm that I know
>about is the "Color Cell Compression" algorithm which is a data
>compression algorithm for color images, not a color quantization
>algorithm.

I've got to stop posting so late at night :-).  I can't believe
I actually wrote that sentence.  My English teachers would be
appalled.  Also, moments later I almost posted a stupid question
about Knuth's article on smooth error diffusion but saw my own
stupidity moments before sending it.  ACK!

--Bill

schwarze%leonardo@isaak.uucp (Jochen Schwarze) (09/03/89)

Well, this might have been discussed before, and I don't know whether it
helps with the topic, but anyway:

To map a large number of gray scale levels to a considerably smaller one
(say 65536 to 64 or so), I used an algorithm that numerically integrates
the gray scale histogram of the image and uses the integrated table to map
from the large number of levels to the small one.  In pseudo code:

    int histogram[ 0..65535 ]

    foreach pixel
	++histogram[ pixel_value ]

    foreach i from 1..65535
	histogram[ i ] += histogram[ i - 1 ]

The histogram table now contains a monotonically increasing (cumulative)
series of numbers.  It has its biggest slope, i.e. the finest resolution,
in the range of pixel values that occur most often in the image.  The
histogram table can then be scaled by a factor of 64 / num_pixels (if 64
is the number of gray scale levels one has) and used as a mapping table.
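(A runnable version of the pseudo code plus the scaling step, as a Python
sketch; the names and the clamping detail are mine:)

```python
def build_map(pixels, in_levels=65536, levels=64):
    """Histogram-equalizing lookup table: in_levels gray values -> levels."""
    histogram = [0] * in_levels
    for v in pixels:
        histogram[v] += 1
    # Running sum: steepest where pixel values are densest, so those
    # ranges of the input get the finest output resolution.
    for i in range(1, in_levels):
        histogram[i] += histogram[i - 1]
    n = len(pixels)
    # Scale cumulative counts into output levels 0..levels-1.
    return [min(levels - 1, histogram[i] * levels // n)
            for i in range(in_levels)]
```

Usage would then be `table = build_map(pixels)` followed by
`out = [table[v] for v in pixels]`.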

Now, this is 1D only.  Does anyone have an idea whether anything
similar could be done in a 3D RGB color space?  Any suggestions?

Jochen Schwarze                     Domain: schwarze@isaak.isa.de
ISA GmbH, Stuttgart, West Germany   UUCP:   schwarze@isaak.uucp
                                    BITNET: schwarze%isaak.uucp@unido.bitnet
                                    Bang:   ...!uunet!unido!isaak!schwarze

dmurdoch@watstat.waterloo.edu (Duncan Murdoch) (09/05/89)

In article <1109@krafla.rhi.hi.is> pierre@rhi.hi.is (Kjartan Pierre Emilsson) writes:
>A year ago or so, there was a similar discussion and there a method originally
>devised by Paul Heckbert was introduced, called Color Cube Compression. 
>I don't have the original posting but the method went something like that:

The description you gave of the algorithm (which someone else said you 
should have called the Median Cut algorithm) sounds a lot like a statistical
technique called regression trees.  The main difference is that you chose
to divide your rectangles at the median of the longest axis, while the 
regression tree algorithm looks for the "most discriminating" place to cut.
I think the standard reference is Classification and Regression Trees;
L. Breiman, J.H. Friedman, R.A. Olshen, C.J. Stone; Wadsworth, 1984.

Duncan Murdoch

falk@sun.Eng.Sun.COM (Ed Falk) (09/12/89)

In article <1109@krafla.rhi.hi.is>, pierre@rhi.hi.is (Kjartan Pierre Emilsson) writes:
> [summary of Heckbert's median cut algorithm]
>	When I tried implementing this method, I got into trouble when
>	a lot of pixels were concentrated on a single colour, because
>	then the subdivision was actually trying to split a single
>	colour cell.  One way out of this is to make a histogram of the
>	picture, and single out the sharpest peaks and allocating them
>	specially.  The rest of the picture can then be treated with
>	the CCC algorithm.


In Heckbert's SIGGRAPH paper, he says that when you encounter a box with only
a single value inside, you don't try to split it (since you obviously
can't).  Instead, you simply go on, and when all boxes have been split, go
back and further split the larger boxes until the remaining palette entries
are all used up.

In my own implementation, I did it differently.  Instead of recursively
splitting *all* boxes, I just pick the current largest box and split that.
I keep doing this until I have 256 (or however many I choose) boxes.  I
think the results are better, and it makes it easy to use palette sizes
that are not a power of two.
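(That pick-the-largest-box loop can be sketched as below.  "Largest" is
taken here to mean widest RGB extent; it could equally mean most pixels.
The names are mine, not from Ed's implementation:)

```python
def _extent(box, axis):
    """Range of a box of (r, g, b) pixels along one axis."""
    vals = [p[axis] for p in box]
    return max(vals) - min(vals)

def _spread(box):
    """Extent of a box along its longest axis."""
    return max(_extent(box, a) for a in range(3))

def split_largest(pixels, n_colors=256):
    """Repeatedly median-split whichever box spans the widest RGB range.
    n_colors need not be a power of two."""
    boxes = [list(pixels)]
    while len(boxes) < n_colors:
        i = max(range(len(boxes)), key=lambda j: _spread(boxes[j]))
        box = boxes.pop(i)
        if _spread(box) == 0:          # every box holds a single colour
            boxes.append(box)
            break
        axis = max(range(3), key=lambda a: _extent(box, a))
        box.sort(key=lambda p: p[axis])
        mid = len(box) // 2
        boxes += [box[:mid], box[mid:]]
    return [tuple(sum(c) // len(b) for c in zip(*b)) for b in boxes]
```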


-- 
		-ed falk, sun microsystems, sun!falk, falk@sun.com

  "If you wrapped yourself in the flag like George Bush does, you'd
  be worried about flag-burning too"