mlm@nl.cs.cmu.edu (Michael L. Mauldin) (09/07/89)
Has anyone experimented with psychological color preferences in quantizing
using Heckbert's median cut?  Here's an example problem:

    In some images containing people's faces, where the face is only a
    small part of the image, very few colors are assigned to "flesh"
    color.  The result is banding/loss of resolution in an area of the
    image that is interesting to the viewer out of proportion to its
    relative size.  The problem is most severe when quantizing to 32 or
    fewer colors.

I tried the following experiment, with mixed results.  Choose a color
that is "flesh" (I used <192,96,80>), and after the image has been
histogrammed, but before the median cut color assignment is done,
multiply each cell by a "bonus" between 1 and 2 if it is within some
minimum distance from this point.

On an image of "Our God of Free Software, RMS", where the face filled
about 8% of the screen, using 32 colors, the number of "flesh" colors
was increased from 3 to 6, and significant detail was added to the
facial region.  On another image, a baby picture with significantly
"whiter" skin, the method didn't affect the image much, and when
quantizing the RMS image with 16 colors, the whole image tended to look
like a sepia tone print rather than a color image.

I can think of several modifications:

  1. Get a better definition of "flesh" (racially unbiased :-)
  2. Tweak the bonus function
  3. [Actually used in some Amiga software] extract a subimage
     containing mostly the feature(s) of interest, and build the
     colormap using statistics from this region.

Anybody else have any good ideas?  Has anyone else experimented with
this?  Is there a reference I don't know about?

Michael L. Mauldin (Fuzzy)           School of Computer Science
ARPA: Michael.Mauldin@NL.CS.CMU.EDU  Carnegie Mellon University
Phone: (412) 268-3065                Pittsburgh, PA  15213-3890
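The bonus pass described above can be sketched in C.  The histogram
cell layout, the distance threshold, and the linear falloff from 2 down
to 1 are all illustrative assumptions of mine, not Mauldin's actual code:

```c
/* Sketch of the flesh-tone "bonus" pass: scale a histogram cell's
 * count by a factor in [1,2) when the cell's color lies within an
 * assumed radius of the flesh point <192,96,80>.  The radius and the
 * linear falloff are illustrative choices, not from the post. */

#define FLESH_R 192
#define FLESH_G  96
#define FLESH_B  80
#define MAX_DIST2 (64 * 64)   /* assumed radius of 64 in RGB space */

/* Returns the cell count scaled by the bonus: 2x at the flesh point,
 * falling off linearly (in squared distance) to 1x at the boundary. */
static long bonus_count(long count, int r, int g, int b)
{
    long dr = r - FLESH_R, dg = g - FLESH_G, db = b - FLESH_B;
    long d2 = dr * dr + dg * dg + db * db;
    if (d2 >= MAX_DIST2)
        return count;          /* outside the flesh sphere: unchanged */
    return count + count * (MAX_DIST2 - d2) / MAX_DIST2;
}
```

Running this over every histogram cell before the median cut biases the
box splits toward the flesh region, which is the effect reported above.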
spencer@eecs.umich.edu (Spencer W. Thomas) (09/08/89)
An interesting idea...

> multiply each cell by a "bonus" between 1 and 2 if it is within some
> minimum distance from [flesh color].

Another way to make the color reproduction look better is to dither.
But this was pointed out in Heckbert's original paper.

--
=Spencer (spencer@eecs.umich.edu)
pepke@loligo (Eric Pepke) (09/09/89)
In article <6087@pt.cs.cmu.edu> mlm@nl.cs.cmu.edu (Michael L. Mauldin) writes:
>I can think of several modifications:
>
>  1. Get a better definition of "flesh" (racially unbiased :-)
>  2. Tweak the bonus function
>  3. [Actually used in some Amiga software] extract a subimage
>     containing mostly the feature(s) of interest, build the
>     colormap using statistics from this region.
>
>Anybody else have any good ideas?  Has anyone else experimented with
>this?  Is there a reference I don't know about?

I don't know whether the Amiga software to which you refer does this,
but one of the things I have not gotten around to trying is a detail
brush that you rub on areas you want to be more detailed.  This could
also be a lasso, of course.  This would enable people to make decisions
like "I need a little bit more detail over here" or "I need a lot more
detail over here."  The effect on the histogram could be cumulative.

Eric Pepke                                     INTERNET: pepke@gw.scri.fsu.edu
Supercomputer Computations Research Institute  MFENET:   pepke@fsu
Florida State University                       SPAN:     scri::pepke
Tallahassee, FL 32306-4052                     BITNET:   pepke@fsu

Disclaimer: My employers seldom even LISTEN to my opinions.
Meta-disclaimer: Any society that needs disclaimers has too many lawyers.
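The cumulative detail-brush idea can be sketched in C: each time the
brush (or lasso) covers a pixel, the histogram cell holding that
pixel's color gets an extra vote, so repeated rubbing keeps boosting
the region.  The 32x32x32 histogram shape, the 5-bit index
quantization, and the rectangular brush are my assumptions:

```c
/* Sketch of the cumulative "detail brush": every brush pass over a
 * region adds the region's pixels to the median-cut histogram again,
 * so cells for colors in that region accumulate extra weight. */

static long hist[32][32][32];        /* median-cut color histogram */

/* One brush application over a rectangle [x0,x1) x [y0,y1) of a
 * packed 8-bit RGB image that is `width` pixels wide. */
static void brush_region(const unsigned char *rgb, int width,
                         int x0, int y0, int x1, int y1)
{
    int x, y;
    for (y = y0; y < y1; y++) {
        for (x = x0; x < x1; x++) {
            const unsigned char *p = rgb + 3 * (y * width + x);
            hist[p[0] >> 3][p[1] >> 3][p[2] >> 3]++;   /* cumulative */
        }
    }
}
```

A lasso would just replace the rectangle test with a mask lookup; the
accumulation into `hist` is the same.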
falk@sun.Eng.Sun.COM (Ed Falk) (09/15/89)
> Has anyone experimented with psychological color preferences in
> quantizing using Heckbert's median cut?  Here's an example problem:
>
>     In some images containing people's faces, where the face is
>     only a small part of the image, very few colors are assigned
>     to "flesh" color.  The result is banding/loss of resolution in
>     an area of the image that is interesting to the viewer out of
>     proportion to its relative size.  The problem is most severe
>     when quantizing to 32 or fewer colors.

Here's a thought; try converting RGB to the NTSC IYQ coordinates
and quantize in IYQ space.  I suggest this because NTSC chose
the Y axis to be biased towards flesh tones and TV pictures transmit
more power along that axis than along the Q axis (I is intensity).

I'm sorry, but I don't have the transformation matrix from RGB to IYQ handy.

--
	-ed falk, sun microsystems, sun!falk, falk@sun.com

  "If you wrapped yourself in the flag like George Bush does,
   you'd be worried about flag-burning too"
dal@midgard.Midgard.MN.ORG (Dale Schumacher) (09/19/89)
In article <124742@sun.Eng.Sun.COM> falk@sun.Eng.Sun.COM (Ed Falk) writes:
|
|Here's a thought; try converting RGB to the NTSC IYQ coordinates
|and quantize in IYQ space.  I suggest this because NTSC chose
|the Y axis to be biased towards flesh tones and TV pictures transmit
|more power along that axis than along the Q axis (I is intensity).
|
|I'm sorry, but I don't have the transformation matrix from RGB to IYQ handy.
|

I thought Y was the intensity (luminance) component... and the most
bandwidth is used for the luminance, with less for the color
components.  Here are the integer [0..255] pixel value formulae that
I use:

    Y = (((77 * R) + (150 * G) + (29 * B)) / 256);
    I = (((153 * R) + (-70 * G) + (-82 * B)) / 256);
    Q = (((54 * R) + (-134 * G) + (80 * B)) / 256);

    R = (((256 * Y) + (245 * I) + (159 * Q)) / 256);
    G = (((256 * Y) + (-70 * I) + (-167 * Q)) / 256);
    B = (((256 * Y) + (-283 * I) + (436 * Q)) / 256);

The above forms the heart of a utility I wrote to set the luminance (Y)
of a color image from a monochrome image.  I use this in convolutions,
particularly for edge sharpening, such that I do the convolution only
on the luminance component, then recombine the output with the original
color image.
jlg@hpfcdq.HP.COM (Jeff Gerckens) (09/19/89)
> > Has anyone experimented with psychological color preferences in
> > quantizing using Heckbert's median cut?  Here's an example problem:
> >
> >     In some images containing people's faces, where the face is
> >     only a small part of the image, very few colors are assigned
> >     to "flesh" color.  The result is banding/loss of resolution in
> >     an area of the image that is interesting to the viewer out of
> >     proportion to its relative size.  The problem is most severe
> >     when quantizing to 32 or fewer colors.
>
> Here's a thought; try converting RGB to the NTSC IYQ coordinates
> and quantize in IYQ space.  I suggest this because NTSC chose
> the Y axis to be biased towards flesh tones and TV pictures transmit
> more power along that axis than along the Q axis (I is intensity).
>
> I'm sorry, but I don't have the transformation matrix from RGB to IYQ handy.
>
> --
> 	-ed falk, sun microsystems, sun!falk, falk@sun.com

Almost....

The Y axis in the NTSC encoding (YIQ) is the intensity, which was
selected to match the CIE-1931 XYZ intensity for the NTSC standard
phosphors.  Both the I axis and the Q axis are named after the signal
encoding technique used, being In-phase and Quadrature, respectively.
The I and Q axes carry the chromaticity information, and are selected
so that more information is encoded on the I axis and less on the Q.

Use of this space will only affect your results if you weight the
different axes differently, since the linear transform to/from YIQ and
RGB yields the same results for any linear interpolation regardless of
which space the interpolation takes place in.

- Jeff Gerckens, Graphics Technology Division, Hewlett-Packard Company.
  ...!hplabs!hpfcla!jlg

"What color is a white horse in a dark room?"
hutch@fps.com (Jim Hutchison) (09/19/89)
In <1212@midgard.Midgard.MN.ORG> dal@midgard.Midgard.MN.ORG (Dale Schumacher) writes:
[... In NTSC ...]
>I thought Y was the intensity (luminance) component... and the most
>bandwidth is used for the luminance, with less for the color components.
>Here are the integer [0..255] pixel value formulae that I use:
>    Y = (((77 * R) + (150 * G) + (29 * B)) / 256);
>    I = (((153 * R) + (-70 * G) + (-82 * B)) / 256);
>    Q = (((54 * R) + (-134 * G) + (80 * B)) / 256);
>    R = (((256 * Y) + (245 * I) + (159 * Q)) / 256);
>    G = (((256 * Y) + (-70 * I) + (-167 * Q)) / 256);
>    B = (((256 * Y) + (-283 * I) + (436 * Q)) / 256);
>The above forms the heart of a utility I wrote to set the luminance (Y)
>of a color image from a monochrome image.

It would seem that by using these equations, you might end up with a
fair amount of error in the process of remapping the colors.  At least
you will want to round up in order to halve the error.  E.g.

    Y = ((77 * R) + (150 * G) + (29 * B) + 128) / 256;

Or, you could use the scaled numbers for the convolution and only
rescale them when you reconvert to RGB.  This might make your
convolution algorithm too messy, so perhaps you might want to just
save the error from the original RGB->YIQ conversion and add that
back into the RGB at the end.

Have you noticed the error in your output?  Is it significant enough
to cause darkening of the image or loss of shadow detail?

/*  Jim Hutchison    {dcdwest,ucbvax}!ucsd!celerity!hutch      */
/*  Disclaimer: I am not an official spokesman for FPS computing  */
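The difference between truncating and rounding the fixed-point
luminance can be shown directly.  The coefficients are Dale's 8.8
fixed-point values; the function names are mine:

```c
/* Truncating vs. rounding fixed-point luminance (8.8 coefficients
 * summing to exactly 256, so gray values round-trip exactly). */

static int y_trunc(int r, int g, int b)
{
    return (77 * r + 150 * g + 29 * b) / 256;        /* always rounds down */
}

static int y_round(int r, int g, int b)
{
    return (77 * r + 150 * g + 29 * b + 128) / 256;  /* round to nearest */
}
```

With truncation the error is always in the same (downward) direction
and so accumulates as a systematic darkening; adding 128 before the
shift centers the error at zero, halving its worst case.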
falk@sun.Eng.Sun.COM (Ed Falk) (09/22/89)
In article <390037@hpfcdq.HP.COM>, jlg@hpfcdq.HP.COM (Jeff Gerckens) writes:
> Almost....
>
> The Y axis in the NTSC encoding (YIQ) is the intensity, which was
> selected to match the CIE-1931 XYZ intensity for the NTSC standard
> phosphors.  Both the I axis and the Q axis are named after the signal
> encoding technique used, being In-phase and Quadrature, respectively.
> The I and Q axes carry the chromaticity information, and are selected
> so that more information is encoded on the I axis and less on the Q.
>
> Use of this space will only affect your results if you weight the
> different axes differently, since the linear transform to/from YIQ and
> RGB yields the same results for any linear interpolation regardless of
> which space the interpolation takes place in.

All true.  Mea culpa.  Here are the equations from the FCC regs:

    Y =  .30R + .59G + .11B
    I = -.27(B-Y) + .74(R-Y)
    Q =  .41(B-Y) + .48(R-Y)

In matrix form, this is:

    |Y|   |  .300   .590   .110 | |R|
    |I| = |  .599  -.277  -.322 | |G|
    |Q|   |  .213  -.525   .312 | |B|

    |R|   | 1.      .947   .624 | |Y|
    |G| = | 1.     -.275  -.636 | |I|
    |B|   | 1.    -1.108  1.709 | |Q|

The I axis was chosen to be the fleshtone axis, and about three times
as much power is transmitted on this axis as on the Q axis.  This way,
a weak signal will not degrade flesh tones as much as other colors.

--
	-ed falk, sun microsystems, sun!falk, falk@sun.com

  "If you wrapped yourself in the flag like George Bush does,
   you'd be worried about flag-burning too"
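The two FCC matrices quoted in this post can be applied directly in C.
The function and variable names are mine, and since the coefficients
are rounded to three places the round trip is only approximate:

```c
#include <math.h>

/* RGB -> YIQ and back using the FCC matrix entries quoted above.
 * Inputs and outputs are in [0,1]; because the published coefficients
 * are rounded, a round trip agrees only to about two decimal places. */

static void rgb_to_yiq(double r, double g, double b,
                       double *y, double *i, double *q)
{
    *y = 0.300 * r + 0.590 * g + 0.110 * b;
    *i = 0.599 * r - 0.277 * g - 0.322 * b;
    *q = 0.213 * r - 0.525 * g + 0.312 * b;
}

static void yiq_to_rgb(double y, double i, double q,
                       double *r, double *g, double *b)
{
    *r = y + 0.947 * i + 0.624 * q;
    *g = y - 0.275 * i - 0.636 * q;
    *b = y - 1.108 * i + 1.709 * q;
}
```

Note that the I and Q rows each sum to zero, so any gray (r = g = b)
maps to i = q = 0 exactly, as it should for a luminance/chrominance
decomposition.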