poynton%vector@Sun.COM (Charles Poynton) (11/06/88)
In Comp.windows.x article <8811011523.AA02242@LYRE.MIT.EDU>, Ralph R. Swick
<swick@ATHENA.MIT.EDU> comments:

> When converting RGB values to monochrome, the sample server(s) compute
> an intensity value as (.39R + .5G + .11B) ...

(.39R + .5G + .11B) is apparently incorrect.  This set could be a
typographical error (from .29R + .6G + .11B ?), a liberal approximation,
or perhaps an unusual phosphor set.  Could someone enlighten me on this?

In followup article <8811042303.AA21505@dawn.steinmetz.GE.COM>, Dick
St.Peters <stpeters@dawn.UUCP> makes the statement:

> I'd like to suggest that (.39R + .5G + .11B) is not a good choice for
> "intensity" in the realm of computer graphics. ...
>
> A better choice in computer graphics is to equally weight the colors:
> ((R+G+B)/3.0).  Let white be white.

Equal weighting of the primaries is NOT the right thing to do, unless the
viewers of your images are members of some species that has uniform
response across the visible spectrum, unlike homo sapiens.  Humans see
1 watt of green light energy as being somewhat brighter than 1 watt of
red, and very much brighter than 1 watt of blue.

The science of colourimetry began to flourish in 1931, when the CIE
standardized a statistical entity called the "standard observer".  This
includes a standard spectral luminance response defined numerically as a
function of wavelength.  It is from this data that the factors which are
used in colour television are derived: .587 for green, .299 for red, and
.114 for blue.

The particular factors depend on the wavelengths or chromaticities that
you call red, green, and blue: there is wide disparity in these choices.
For computer graphics and television, the luminance factors depend on the
chromaticity coordinates of the phosphors of your CRT.  There are
compromises in the choice of phosphor primaries, but it turns out that
the NTSC did a spectacularly good job of selecting primaries.
The luminance coefficients 0.299 for red, 0.114 for blue, and 0.587 for
green are unquestionably the best values to use, unless you know your
phosphors intimately.

The second article continues,

> The formula is from
> the (1954) NTSC standard for compatible color TV, and it has built
> into it a lot of compromises to accommodate old technology and
> problems inherent in the analog transmission of composite color
> television.

Contrary to this assertion, the ONLY compromise in NTSC which impacts the
luminance equation is the choice of reference phosphor chromaticities,
and a choice of phosphors MUST be made for any system which transmits
colour in RGB.  Just because it's old (1954) doesn't mean we should throw
it away.

Aside from this, the discussion of television coding which follows is
substantially correct, except that modulation onto an RF carrier for
transmission involves no inherent compromises beyond those already made
in formation of baseband NTSC.  (Receivers frequently make their own
compromises, but these are not inherent.)

For those interested, I attach an alternate description of television
coding.

Charles Poynton                         "No quote.
poynton@sun.com                          No disclaimer."
(415)336-7846

-----

GAMMA CORRECTION

A picture tube (CRT) produces a light output which is proportional to its
input voltage raised to approximately the 2.5-th power.  Rather than
requiring circuitry implementing the 2.5-th root function at every
receiver to compensate for this, "gamma correction" is performed on the
R, G, and B primaries at the camera to form signals denoted R', G',
and B'.

YUV REPRESENTATION (3 wires)

Studio equipment typically processes colour signals in three components
YUV, which are easily derived from RGB.
The Y channel contains the luminance (black-and-white) content of the
image, and is computed as:

    Y' = 0.299 R' + 0.587 G' + 0.114 B'

"Colour difference" signals U and V are scaled versions of B'-Y' and
R'-Y' respectively; these vanish for monochrome (grey) signals.  The
human visual system has much less acuity for spatial variation of colour
than for luminance, and the advantage of U and V components is that each
can be conveyed with substantially less bandwidth than luminance, R, G,
or B.  In analog YUV studio systems, U and V each have a bandwidth of
1.5 MHz, compared to between 4.2 MHz and 5.5 MHz for luminance.  In
digital systems, U and V are each horizontally subsampled by a factor of
two (i.e. conveyed at half the rate of the luminance signal).

Y/C REPRESENTATION (2 wires)

U and V can be combined easily into a "chroma" signal which is conveyed
as modulation of a continuous 3.58 MHz sine-wave subcarrier.  Subcarrier
phase is decoded with reference to a sample or "burst" of the 3.58 MHz
continuous-wave subcarrier which is transmitted during the horizontal
blanking interval.  The phase of the chroma signal conveys a quantity
related to hue, and its amplitude conveys a quantity related to colour
saturation (purity).

The "S" connectors of S-VHS and ED-Beta equipment simply carry Y and C on
separate wires.  This coding is easily decoded without artifacts.
Current S-VHS equipment conveys chroma with severely limited bandwidth,
about 300 kHz (which is just 16 cycles of U or V per picture width).
Consumer VCR equipment has always recorded the luminance and chroma
components separately on tape, but only since the introduction of the
S-connector in S-VHS and ED-Beta equipment has the consumer been able to
take advantage of this capability.

NTSC REPRESENTATION (1 wire)

The NTSC system mixes Y and C together and conveys the result on one
piece of wire.
The result of this addition operation is not theoretically reversible:
the process of separating luminance and colour often confuses one for the
other.  Cross-colour artifacts result from luminance patterns which
happen to generate signals near the 3.58 MHz colour subcarrier.  Such
information may be decoded as swirling colour rainbows.  Cross-luminance
artifacts result if modulated colour information is incorrectly decoded
as crawling or hanging luminance dots.  It is these artifacts which can
be avoided by using the S-connector interface.  In general, once the NTSC
footprint is impressed on a signal, it persists even if subsequent
processing is performed in RGB or YUV components.

Encoded NTSC signals can be sampled into a stream of 8-bit bytes.  Such
"composite digital" systems have the advantage of using slightly less
memory than component systems, at the expense of the dreaded NTSC
artifacts.  Manipulation of such composite signals to perform operations
such as shrinking the picture is difficult or impossible, because if the
colour subcarrier frequency is altered the colour information in the
signal is destroyed.  Therefore, these operations are performed in the
component domain.

FREQUENCY INTERLEAVING

The NTSC colour subcarrier frequency is chosen to be exactly 455/2 times
the line rate of 15.734 kHz.  The fact that the subcarrier frequency is
an odd multiple of half the line rate causes colour information to be
interleaved with the luminance spectrum: if a portion of a coloured
region has a positive-going modulated chroma component on one scan line,
then on the next line chroma will go negative.

This property allows the use of a "comb filter" to separate luminance and
chroma.  The signal is delayed by one total line time, in order that two
vertically adjacent picture elements be available to the electronics at
the same instant in time.  Forming the sum of these two elements produces
luminance, and forming their difference produces the modulated chroma.
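[The sum-and-difference comb filter just described can be sketched in C.
This is a minimal editorial illustration of the arithmetic only, not
broadcast-quality filtering; the function name and sample values are
invented for the example.]

```c
#include <stddef.h>

/* One-line-delay comb filter: chroma inverts phase from one scan line
 * to the next, so summing two vertically adjacent composite samples
 * cancels chroma (leaving luma), and differencing them cancels luma
 * (leaving the modulated chroma). */
void comb_separate(const double *line_a, const double *line_b,
                   double *luma, double *chroma, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        luma[i]   = (line_a[i] + line_b[i]) / 2.0;
        chroma[i] = (line_a[i] - line_b[i]) / 2.0;
    }
}
```

[If line_a carries Y+C and line_b carries Y-C, the separation is exact;
real pictures lack perfect vertical correlation, which is the comb
filter's weakness noted below.]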
This feature results in greatly improved luma/chroma separation compared
to a 3.58 MHz "trap" filter.  However, a comb filter assumes a fair
degree of vertical correlation in the picture, and this assumption does
not hold for pictures with great vertical detail.

-----
stpeters@dawn.UUCP (11/08/88)
From: steinmetz!dawn!stpeters@uunet.UU.NET
Date: Mon, 7 Nov 88 17:57:55 EST
Subject: Re: Luminance from RGB (was "intensity" from RGB)

Responding to my posting on RGB, Charles Poynton (poynton@sun.com)
writes:

> ... except that modulation onto an RF carrier for
> transmission involves no inherent compromises beyond those already made in
> formation of baseband NTSC.

This isn't an appropriate forum for RF discussion, but one paragraph
won't hurt.  In baseband NTSC, the chroma signal is an oscillation about
the luma level.  The NTSC scaling of chroma relative to luma is such that
when the composite signal is (AM) modulated onto RF, certain combinations
of luma and chroma nominally correspond to more than 100% modulation.
That's just one example: there *is* a substantial set of inherent
compromises involved going to/from RF[1].

On RGB weighting: I argued that the NTSC intensity formula shouldn't be
considered carved in stone for purposes beyond its original intent,
because compatible color TV required identification of an RGB combination
orthogonal to chroma, a luma that had to work with B&W televisions.
Mr. Poynton correctly stated that the formula also describes the human
spectral response.  That the best RGB weightings for a luma orthogonal to
chroma turned out to be those for the human spectral response should
hardly surprise anyone.  However, that has little relevance[2].
If you display RGB in the ratios matching the human spectral response,
you do not get a good white, you get a bright but dreary grey.  The
logarithmic response of the eye masks the roughly 2:1 green/red ratio,
but the 11% blue in the formula is just not enough.  The CIE has a name
that I can't remember, a qualified white of some sort, for this bright
grey.  The mixture lies in the large central region of whitish shades in
a triangular color plot, but it is well removed from what people
subjectively perceive as a clean, pure white.

This has been known and exploited for years.  Fabric whiteners work by
adding blue; they're essentially blue dyes.  High-grade paper is tinted
with fluorescent blue dyes to make it appear whiter; people appearing on
color TV have long been advised that a blue-tinted shirt looks whiter
than a white one (a problem aggravated by the low level of blue in the
spectrum of studio lighting).

The point is that what people perceive as pure white is not, in TV terms,
a zero-chroma color.  This is not unnatural: there is no reason that we
should have evolved with our visual systems adjusted so that the mixture
we see as white matches the weightings determined by the spectral
response of our eyes.  After all, the illumination spectrum in which we
see things is not constant over frequency - nor is the variation constant
over time.  (E.g., there is more blue light on clear days than overcast
ones, more blue at noon than near sunset or dawn, more blue on clear days
near the equator than on clear days at higher latitudes, etc.)

To restate the point, not only is there no reason we should have evolved
to see NTSC luma as pure white, it is known that we did not.  [I'm rather
fond of summer scenes of bright white puffy clouds in a blue sky.  If
nature had adjusted my visual system to perceive the NTSC mixture as pure
white, the clouds would seem to have a blue tint.  It wouldn't be the
same.]
Not long from now, 24-bit systems will be as common as 8-bit ones are
today, and people will want to display true-color images.  If you display
the NTSC RGB combination and adjust your monitor so it looks like a nice
clean white, you will have problems with true-color images.  If you make
equal-RGB look white, your true-color images will look true-color.

When you have RGB on separate cables with equal bandwidths, there is
nothing relevant about the NTSC RGB combination.  However, you will
frequently want white window borders, background/foreground color, etc.
The white you want is ((R+G+B)/3).  Equal weighting and the eye's
logarithmic response allow considerable monitor misadjustment to still
make your white look reasonably white.

--
Dick St.Peters
GE Corporate R&D, Schenectady, NY
stpeters@ge-crd.arpa
uunet!steinmetz!stpeters

1. I'm willing to discuss RF issues further offline.
2. The NTSC/human-vision weighting is a reasonable one if you want to
   extract a grey-scale image from a true-color image.
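[The grey-scale extraction of footnote 2 can be sketched in C.  An
editorial sketch only: the function name and round-to-nearest choice are
assumptions, not part of any poster's code; the weights are the NTSC
coefficients discussed throughout the thread.]

```c
/* Extract an 8-bit grey value from gamma-corrected 8-bit R'G'B'
 * components using the NTSC luminance weights.  Rounds to nearest. */
unsigned char rgb_to_grey(unsigned char r, unsigned char g,
                          unsigned char b)
{
    double y = 0.299 * r + 0.587 * g + 0.114 * b;
    return (unsigned char)(y + 0.5);
}
```

[Because the weights sum to 1.0, full white (255,255,255) maps to 255 and
the grey scale never overflows.]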
poynton@SUN.COM (Charles Poynton) (11/09/88)
This follows Dick St Peters' comment in <8811080030.AA14681@EXPO.LCS.MIT.EDU>
of Comp.windows.x that (R+G+B)/3 is an appropriate weighting function for
luminance.

Summary

Luminance weighting coefficients are a function ONLY of the human
luminance sensitivity functions, the selected phosphor chromaticities,
and the chromaticity of the reference illuminant (white point).  The
equations can be found on page 2.28 and on in K. Blair Benson's
"Television Engineering Handbook", or on page 20-10 of Donald G. Fink's
"Electronic Engineers' Handbook, Second Edition".

A Simple Thought Experiment

Think of three monochromatic primaries, for example, tuneable dye lasers.
If (R+G+B)/3 is white, what happens when I move the wavelength of the
blue primary?  If I move it towards the ultraviolet, the eye responds
less to it and luminance drops.  If I move it towards green, the eye
responds more (in which channel(s) we don't care) and its luminance
increases.  Hence, the equation for luminance must depend in some manner
on what wavelength you call "blue".

White

CIE Illuminant D65 refers to a standard spectral distribution which is
chosen to approximate daylight.  This is television's standard white
reference.  So-called "zero chroma" in NTSC occurs on this colour.
People may or may not respond to this as "white": if the observer is
adapted to a pink room, he'll think it's yellowish-green.  The reason
that the CIE decided to standardize white, in 1931, is to allow work in
colourimetry to be done objectively.

To achieve accurate colour reproduction requires specification of
reference white.  This is a lurking problem for accurate colour
reproduction in computer graphics, because most workstation monitors are
adjusted for a white point (of about 9300 K) that's quite a bit more blue
than the television standard.  Software people tend to say
"R=G=B=1=white.  Simple."  But it's not that simple.

Blue Shirts

Sorry to air the laundry in public.
Fabric whiteners work by adding materials which fluoresce to convert
ultraviolet light [to which the human eye is not sensitive] into visible
light in the blue part of the spectrum [to which it is].  The shirt not
only looks brighter than white, it IS brighter than white.  But this has
nothing to do with the luminance coefficients.

With modern phosphors the blue luminance coefficient is actually tending
to go DOWN.  To produce accurate reproduction with European standard
phosphors requires a blue luminance contribution of 0.071.  This is a big
colour problem for Europe, because they use the same luminance function
as we do, and it's not exactly matched to their phosphors.  They tweak
their camera spectral sensitivities as a first-order fix to improve the
colour reproduction accuracy.

Orthogonality

Luminance is luminance; chrominance is chrominance; and NTSC can be
thought of as conveying Y, U, and V independently.  This is true
regardless of your interpretation of what these quantities represent.
They're "orthogonal" provided that one can't be derived from the other
two.  Although there are some subtle signal-to-noise ratio considerations
in television coding, this issue is independent of (or should I say
orthogonal to) the choice of luminance coefficients.

NTSC RF Modulation

I stand corrected by Mr St Peters on an RF modulation point: there IS one
compromise made in NTSC transmission via RF.  White modulates the RF
carrier to 12.5%, and black modulates it to 70.3125%.  When NTSC
modulates an RF carrier, chroma excursions near highly saturated yellow
and near highly saturated cyan are clipped to 120 IEEE units prior to the
modulator, to avoid UNDER-modulating the transmitter.  Two small regions
of colour space are lost in this case.  No practical television camera
has sufficient colour separation capability to generate signals in these
regions, but electronically-synthesized colour bars have perfect colour
saturation and would undermodulate the transmitter if left alone.
Studio colour bars are generated at 75% saturation to avoid these two
regions of colour space.  This issue does not bear on the choice of
luminance coefficients, and is relevant to broadcast only.  VHS machines
and NTSC monitors reproduce all of the NTSC colour space even at 100%
saturation.

I would have put this first, but then you wouldn't have read all the
colour stuff, would you?

Charles
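[The Summary's claim that the coefficients follow from the phosphor
chromaticities and the white point alone can be checked numerically.  An
editorial sketch of the handbook derivation: each primary contributes an
XYZ column (x/y, 1, (1-x-y)/y), and the scale factors that make the three
primaries sum to the white point are exactly the luminance coefficients,
since white luminance is normalized to 1.  The function names are
invented; the 1953 NTSC primaries and Illuminant C chromaticities used in
the test are standard published values.]

```c
/* 3x3 determinant, used for Cramer's rule below. */
static double det3(double m[3][3])
{
    return m[0][0] * (m[1][1]*m[2][2] - m[1][2]*m[2][1])
         - m[0][1] * (m[1][0]*m[2][2] - m[1][2]*m[2][0])
         + m[0][2] * (m[1][0]*m[2][1] - m[1][1]*m[2][0]);
}

/* xy[0..2] = {x,y} chromaticities of R, G, B; white = {x,y} of the
 * reference white.  coef[0..2] receive the R, G, B luminance
 * coefficients. */
void luma_coefficients(double xy[3][2], double white[2], double coef[3])
{
    double m[3][3], w[3];
    for (int i = 0; i < 3; i++) {
        m[0][i] = xy[i][0] / xy[i][1];                    /* X/Y */
        m[1][i] = 1.0;                                    /* Y/Y */
        m[2][i] = (1.0 - xy[i][0] - xy[i][1]) / xy[i][1]; /* Z/Y */
    }
    w[0] = white[0] / white[1];
    w[1] = 1.0;
    w[2] = (1.0 - white[0] - white[1]) / white[1];

    double d = det3(m);
    for (int i = 0; i < 3; i++) {        /* Cramer's rule */
        double t[3][3];
        for (int r = 0; r < 3; r++)
            for (int c = 0; c < 3; c++)
                t[r][c] = (c == i) ? w[r] : m[r][c];
        coef[i] = det3(t) / d;
    }
}
```

[With R(0.67,0.33), G(0.21,0.71), B(0.14,0.08) and Illuminant C
(0.3101,0.3162), this reproduces 0.299, 0.587, 0.114; substituting the
European phosphor chromaticities yields the smaller blue contribution
mentioned above.]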
awpaeth@watcgl.waterloo.edu (Alan Wm Paeth) (11/10/88)
In article <76649@sun.uucp> poynton%vector@Sun.COM (Charles Poynton)
writes:

>This follows Dick St Peters' comment <8811080030.AA14681@EXPO.LCS.MIT.EDU>
>of Comp.windows.x that (R+G+B)/3 is an appropriate weighting function for
>luminance.

Thank you, Charles, for a very informative article.  I (for one) am
getting tired of hearing misinformation such as the above comment or
related statements such as "[CMY] = 1-[RGB]".  It's nice to see the
record being set straight.  (This is not a criticism of the original
poster.)

These color "facts" have a feeling of common sense (as 0th order
approximations to reality), which explains why they are deduced from
first principles and then readily requoted.  A bit like teaching
Newtonian physics without being reminded that Relativity exists.
Fortunately, the latest Computer Graphics texts are beginning to give
color the more precise treatment it deserves.

Some additional fine points which might be of interest to people:

>White
>
>To achieve accurate colour reproduction requires specification of
>reference white.  This is a lurking problem for accurate colour
>reproduction in computer graphics, because most workstation monitors are
>adjusted for a white point (of about 9300 K) that's quite a bit more blue
>than the television standard.  Software people tend to say
>"R=G=B=1=white.  Simple."  But it's not that simple.

When people say "RGB color" my first impressions are (1) color used in
the context of a digital computer and (2) (all too often) an associated
sense of vagueness about the exactness of the specification ("well, it's
24-bit RGB?!").  For instance, R might relate to the phosphor
chromaticity of a specific monitor, a spectral line, or the NTSC defined
value (I've yet to see an NTSC monitor that reproduces this).
In our lab I've encountered many "R"'s:

    Chromaticity Coords     Comments
    -------------------------------------------------
    x' = .62, y' = .21      Electrohome monitor red phosphor
    x' = .65, y' = .30      Aydin monitor red phosphor
    x' = .63, y' = .30      Spectral line at 6000 Angstroms (CIE tables)
    x' = .67, y' = .21      NTSC defined "Red"

The use of the NTSC standard (with which the CIE tables for Y and a
matrix inversion give the familiar Y = .299R + .587G + .114B) is far
better than using Y = 1/3(R+G+B), but even then, I've yet to see an NTSC
monitor.  The first color TV's tried to live up to the standard, but the
required red spectral purity is so high (that's the x' = .67 in the
table) that luminance suffers.  So the industry pushes brighter, less
pure red phosphors (remember the "rare earth phosphor" ads of the early
70's?) and it's anyone's guess where the R coordinates are.  Unless you
have a lot of money, studio monitors tend to migrate in the direction of
their commercial counterparts (lower spectral purity in the phosphors).

>Orthogonality
>
>Luminance is luminance; chrominance is chrominance; and NTSC can be
>thought of as conveying Y, U, and V independently.  This is true
>regardless of your interpretation of what these quantities represent.
>They're "orthogonal" provided that one can't be derived from the other
>two.  Although there are some subtle signal-to-noise ratio considerations
>in television coding, this issue is independent of (or should I say
>orthogonal to) the choice of luminance coefficients.

Well, that's really a definition for independence, not complete
orthogonality.  The YIQ television signal is similar to the CIE defined
YUV (or XYZ) color spaces in that the Y's (luminance) are the same.  The
I and Q chrominance signals pick up the remaining two degrees of freedom.
The matrix that defines the coordinate change was chosen out of bandwidth
considerations.  In fact, I and Q stand for "In phase" and "Quadrature"
signal.
They are encoded for broadcast on color subcarriers that are 90 degrees out of phase. It is these signals that are orthogonal (in the sin(x), cos(x) sense), but not the independent values which they encode. >NTSC RF Modulation > >I stand corrected by Mr St Peters on an RF modulation point: there IS one >compromise made in NTSC transmission via RF...When NTSC modulates >an RF carrier, chroma excursions near highly saturated yellow and near >highly saturated cyan are clipped to 120 IEEE units prior to the >modulator, to avoid UNDER-modulating the transmitter. Two small regions >of colour space are lost in this case. No practical television camera has >sufficient colour separation capability to generate signals in these >regions, but electronically-synthesized colour bars have perfect colour >saturation and would undermodulate the transmitter if left alone. Almost. There are stories of independent stations with substandard power supplies that got fratzed when trying to run _Sesame Street_ clips -- Big Bird is both big and yellow (the AM video signal draws power as a function of the modulation). As for the generation of "synthetic" colors as with color bar tests, one has to be careful. I worked on Shoup's "Superpaint" system when with Xerox (one of the first color "paint" systems. It provided NTSC output). It featured a menu item to test for "hot" broadcast colors such as yellow to avoid modulation problems -- it would blink the colors. In that case one had either to reduce the intensity or desaturate. A tool like this is useful and necessary for readying computer graphics images for commercial broadcast. Those images *invariably* have highly saturated colors (a generation raised on Big Bird?). If there is enough interest I can post a production tuned C program which flags "hot" colors. /Alan Paeth Computer Graphics Laboratory University of Waterloo
mab@pixar.UUCP (Malcolm Blanchard) (11/12/88)
The discussion of luminance computations and the subsequent discussion of the meaning of white reminds me of an experience I had a few years ago when Pixar was a division of Lucasfilm and we were working on an effect for "Young Sherlock Holmes". Aesthetic decisions were being made by people sitting in front of color monitors. The digital images were transferred to film using three color lasers. The film was printed and then projected in a screening room. I decided that this was a great place to implement a what-you-see-is-what-you-get color system. And so I delved into the murky depths of colorimetry in the hope of developing a color-correction program that would produce the same color in the screening room that was measured on the color monitors. This is a difficult problem (in fact, in its strictest sense, an impossible one, since the color gamuts of the two systems have mutually exclusive regions). I took into account the CIE coordinates of the monitor's phosphors, its color balance, the sensitivity of the color film to each of the lasers, cross-talk between the film layers, effects of film processing, the spectral characteristics of the print film's dye layers, and the spectral characteristics of a standard projector bulb. Several steps in this process are extremely non-linear, but I was able to achieve some good results by using piecewise-linear approximations. I felt a great sense of success when I used a colorimeter to confirm that the CIE coordinates on the silver screen did, indeed, closely match those on the tiny screen. We color-corrected a few shots and showed them to the effects director. His response was, "Why does this look so blue?" It turns out that when we look at a TV we're accustomed to a blue balance, and when we're sitting in a theater we expect a yellow balance. The digital color correction was abandoned and the production relied on the film lab to produce an aesthetic balance.
Thus proving to me that science may work, but computer graphics and film making are still largely a matter of art.
stpeters@dawn.UUCP (11/14/88)
pixar!mab@bloom-beacon.mit.edu (Malcolm Blanchard) writes:
> The digital color correction was abandoned and the production
> relied on the film lab to produce an aesthetic balance. Thus proving
> to me that science may work, but computer graphics and film making are
> still largely a matter of art.

Amen. I am weary of claims that the CIE description of 1931, as pivotal as it was, is the last word in human color perception. Does it really surprise anybody that work over the last half century has unearthed some aspects that the CIE did not? CIE is a comfort to engineers, because it wraps color in a widely accepted veneer of quantification, making it seem like a science, something they can deal with. It allows them to talk about color in terms of "CIE coordinates", quoted to two or more significant figures, as for phosphors in a recent posting. However, the CIE coordinates are strictly applicable only to truly continuous spectra. Phosphors emit much of their output in a few narrow spectral lines. Further, the proportions of the emission among the peaks - and the continuous underlying background - depend on the excitation. Assigning precise CIE coordinates to such phosphors is meaningless. (Note the word "precise".)

Human vision receives input in four channels. Three have spectral response curves that peak in the red, green, and blue parts of the spectrum respectively, but each responds somewhat throughout most of the visible spectrum. The fourth channel is the roughly colorless night-vision channel, responding to overall brightness. Our visual system appears to assign color based on ratios of the inputs - and on its own biases.
It is an ornery system indeed and is quite willing to reject input that does not fit: when two images of a scene, each projected through a different narrowband red filter, are superimposed on a screen, it is quite possible for humans to perceive greens and blues in a scene, in spite of information from the eye's receptors that the overwhelming majority of received light energy is in the red part of the spectrum[1]. As far as I know, nobody knows why for sure. However, apparently our visual system (evolved for broad-spectrum illumination) has more confidence in the differing responses of the blue, green and "colorless" receptors to the two red peaks than in the evidence from the single red channel. With inputs voting 3:1 in favor of other colors, it ignores the red input. If you had a red phosphor that emitted primarily in two spectral peaks and the proportional distribution between the peaks depended on the excitation, you could perceive the emission from the red phosphor as being blue or green at some intensities. (The emission would have to be confined virtually entirely to the peaks, whereas real phosphors emit a continuous background as well as peaks, so such a phosphor would be very unlikely.) Not only does the concentration by phosphors of their output into a few peaks distort the color response they generate from their nominal CIE coordinates, but the spectrum of each phosphor interacts with that of the others in an RGB system, probably by altering response ratios. Two projects ago, I spent several years simulating color television systems, from the camera to NTSC baseband to RF to NTSC baseband to final "RGB", as well as simulating high-definition TV schemes. I know all too well how hard it is to give up the comfort of quantifying color. CIE was a good start, a necessary and useful approximation for technological progress like color TV transmission and color CRT development. However, "accurate" color is still an art. 
-- Dick St.Peters GE Corporate R&D, Schenectady, NY stpeters@ge-crd.arpa uunet!steinmetz!stpeters 1. The experiment requires that each (B&W) slide be projected through the same filter through which it was photographed.
xacct@uhccux.uhcc.hawaii.edu (X-Windows Account) (11/14/88)
From article <8811132229.AA02726@dawn.steinmetz.GE.COM>, by stpeters@dawn.UUCP:
"
" If you had a red phosphor that emitted primarily in two spectral peaks
" and the proportional distribution between the peaks depended on the
" excitation, you could perceive the emission from the red phosphor as
" being blue or green at some intensities. (The emission would have to
" be confined virtually entirely to the peaks, whereas real phosphors
" emit a continuous background as well as peaks, so such a phosphor
" would be very unlikely.)

It might also have to be part of a high-resolution picture of a natural scene, if I recall Land's experiments correctly. I noticed that you used the term 'scene' above the quoted passage.

Greg, lee@uhccux.uhcc.hawaii.edu
raveling@vaxb.isi.edu (Paul Raveling) (11/16/88)
In article <8811132229.AA02726@dawn.steinmetz.GE.COM> stpeters@dawn.UUCP writes: > >Our visual system appears to assign color based on ratios of the >inputs - and on its own biases. It is an ornery system indeed and is >quite willing to reject input that does not fit: ... A demonstration of this that I used to do with our filled-vector map graphics involved changing the ocean color. Starting with a seemingly well-saturated chocolate brown for land and dark blue for water, I'd change only the water color. Going to light blue made the land suddenly appear black. Holding something over the screen to mask out water areas showed that the land, in fact, was still the same "vivid" chocolate brown. This was a fairly radical difference in perceived color and luminance. An "accurate" model of perception would need to incorporate info about graphic context of the color in question. --------------------- Paul Raveling Raveling@vaxb.isi.edu
karlton@decwrl.dec.com (Philip Karlton) (12/06/88)
I found the entire discussion on luminance last month quite interesting. What I would like now from the experts is what they would do to convert an RGB value as specified in the X protocol into a gray level.

The particular problem I have in mind is that of a StaticGray display with N equally spaced intensities arranged in a ramp with black at 0 and white at N-1. I have access to hardware with N of 2, 4, 16, and 256.

Two different expressions for computing the appropriate pixel value come immediately to (my) mind:

For r, g, b in [0..1]

	floor((.299r + .587g + .114b)(n - 1) + 0.5)	(a)

or for r, g, b in [0..1)

	floor((.299r + .587g + .114b)(n))	(b)

(a) and (b) produce almost identical results for N=2. For N=256, the resulting differences are probably not detectable by the human eye, certainly not mine. For N=4, the differences are observable.

The correct thing is for the client to have done the appropriate dithering and present the pixmap to the server. For those clients that ignore the visual type of the root window, the server has to do the mapping of RGB (in X's terms) to some pixel value. Is either (a) or (b) the appropriate choice? Is some better function around that I should use?

For the numerically curious: r, g, and b above could be computed using

	r = ((float) screenRed) / maxColor;
	b = ((float) screenBlue) / maxColor;
	g = ((float) screenGreen) / maxColor;

where maxColor is dependent upon which of (a) or (b) is chosen:

	float maxColor = (float) (0xFFFF);	/* (a) */
or
	float maxColor = (float) (0x10000);	/* (b) */

PK
srneely@watcgl.waterloo.edu (Shawn Neely) (12/07/88)
In article <964@bacchus.dec.com> karlton@decwrl.dec.com (Philip Karlton) writes:
+I would like now from the experts is what they would do to convert an RGB
+value as specified in the X protocol into a gray level.
+
+The particular problem I have in mind is that of a StaticGray display with N
+equally spaced intensities arranged in a ramp with black at 0 and white at
+N-1. I have access to hardware with N of 2, 4, 16, and 256.
+
+Two different expressions for computing the appropriate pixel value come
+immediately to (my) mind:
+
+For r, g, b in [0..1]
+
+ floor((.299r + .587g + .114b)(n - 1) + 0.5) (a)
+
+or for r, g, b in [0..1)
+
+ floor((.299r + .587g + .114b)(n)) (b)
+
+ (stuff deleted)
+...Is either (a) or (b) the appropriate choice. Is
+some better function around that I should use?
+
+For the numerically curious: r, g, and b above could be computed using
+
+ r = ((float) screenRed) / maxColor;
+ b = ((float) screenBlue) / maxColor;
+ g = ((float) screenGreen) / maxColor;
+
+where maxColor is dependent upon which of (a) or (b) is chosen:
+
+ float maxColor = (float) (0xFFFF); /* (a) */
+or
+ float maxColor = (float) (0x10000); /* (b) */
+
+PK
The correct approach is (a). A strong argument for using
the closed interval [0..1] (and [0..N-1]) is given in
"Design and Experience with a Generalized Raster Toolkit"
by Paeth and Booth in Proc. Graphics Interface '86 (Vancouver).
The interval is consistent with the design of a number of colour spaces,
and is correct when data is taken to higher significance. The same is
not true of the incorrect use (often implicit in bit-shift code) of
the open interval [0..1).
For example, a one-bit image in the [0..1) model allows only
the intensity values 0.0 and 0.5, and not "full on".
The required multiplications and divisions for the correct approach
can often be performed by table lookup.
--
(.I.) "The road of excess leads
).( to the palace of wisdom."
( Y ) -William Blake
dal@midgard.Midgard.MN.ORG (Dale Schumacher) (12/10/88)
In article <7162@watcgl.waterloo.edu> srneely@watcgl.waterloo.edu (Shawn Neely) writes:
[...discussion of RGB->Grey conversion and "shifting" to higher precision...]
|
|The interval is consistent with the design of a number of colour spaces,
|and is correct when data is taken to higher significance. The same is
|not true of the wrong (often implicit using bit shifts) use of
|the open interval [0..1).
|
|For example, a one-bit image in the [0..1) model allows only
|the intensity values 0.0 and 0.5, and not "full on".
|
|The required multiplications and divisions for the correct approach
|can often be performed by table lookup.

Over the last couple of weeks I've been working with the PBM code posted to (I think) comp.sources.misc, expanding it to handle 8-bit greyscale and 24-bit color images. In the process, I've run into some of the problems being discussed here. Following are some of my solutions. I think they work as well as or better than what I've seen posted so far, but if I'm missing some glaring deficiency, I'd like to have it pointed out to me.

RGB to Greyscale: I use the formula

	GREY = ((76 * R) + (150 * G) + (29 * B)) >> 8;

where R, G and B are 24-bit color components [0..255] and the result is a greyscale value [0..255], with intermediate values in the formula not exceeding 16 bits (i.e. int). This gives a good approximation of the 29.9% red, 58.7% green, 11.4% blue luminance contributions.

Extrapolation to more significance: I have only 3 bits per gun (RGB) on the Atari ST, and want to expand those color values to their 8-bit equivalents. As was mentioned, simply doing a left shift is not sufficient. The method I use is to start at the MSB of the source and destination values, copy bits from the source proceeding toward the LSB, and if you reach the end of the source before filling the destination, start over at the beginning of the source. This works for both increasing and decreasing significance (equivalent to right-shift for decreasing).
Example: 101 --> 10110110, 000 --> 00000000, 111 --> 11111111, etc. It seems to work for all cases, even weird things like 7 bits -> 13 bits.

One problem I have yet to solve is analyzing a picture to choose N colors out of a (larger) palette of M colors that best represent a given image. For example, I can only use 16 colors out of a palette of 512, so which are the "best" 16 to use? I already have color dithering algorithms, but I need to decide which colors to dither WITH.
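[Dale's replication scheme can be written as a short loop. A sketch, assuming source and destination widths of at most the machine word size; the function name is illustrative:]

```c
/* Expand (or contract) an s-bit intensity v to d bits by replicating
 * the source bits from the MSB down, as described above.  For s > d
 * this degenerates to a right shift. */
unsigned replicate_bits(unsigned v, int s, int d)
{
    unsigned r = 0;
    int filled = 0;
    while (filled < d) {          /* append another copy of the source */
        r = (r << s) | v;
        filled += s;
    }
    return r >> (filled - d);     /* trim the overshoot */
}
```

For example, replicate_bits(5, 3, 8) gives 0xB6, i.e. 101 --> 10110110, matching the example above.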
ph@miro.Berkeley.EDU (Paul Heckbert) (12/12/88)
Dale Schumacher (dal@midgard.Midgard.MN.ORG) wrote:
> ...I have only 3-bits per gun (RGB) on the Atari-ST, and want to expand those
> color values to their 8-bit equivalents... The method I use is to start at
> the MSB of the source and destination values, copy bits from the source
> proceeding toward the LSB, if you reach the end of the source before filling
> the destination, start over at the beginning of the source.
> This works for both imcreasing and decreasing significance (equivalent
> to right-shift for decreasing). Example: 101 --> 10110110,
> 000->00000000, 111->11111111, etc. It seems to work for all cases,
> even wierd things like 7-bits -> 13-bits.

Paraphrasing, Dale is converting the 3-bit number abc, where each of a, b, and c is 0 or 1, into the 8-bit number abcabcab.

This is very close to the "correct" formula, but you've found a somewhat roundabout way to compute it. The formula you want will map black (000) to black (00000000) and white (111) to white (11111111) and map everything in between linearly. In other words, you want to multiply by 255/7. Your formula actually multiplies by 255.9375/7.

You can prove this to yourself by thinking of the 3-bit string x=abc as a representation for the binary fraction x'=.abc (e.g. bit string 010 represents the number .010) and the 8-bit string y=abcabcab as a code for the binary fraction y'=.abcabcab . But replicating the bits is equivalent to a multiplication: y'=x'*1.001001001. Putting our formulas together, we have x'=x/8, y'=y/256, and 1.001001001=4095/3584, so y/x = (1/8)*(4095/(512*7))*256 = 4095/(7*16) = 255.9375/7 .

It's good to step back from the low-level bits once in a while and think about what these pixel values mean in the real world.

----

Dale also asked about algorithms for selecting the 16 colors out of a palette of 512 that best represent an image. This is called "color image quantization". I wrote about it in a paper: Paul S.
Heckbert, "Color Image Quantization for Frame Buffer Display", Computer Graphics (SIGGRAPH '82 Proceedings), vol. 16, no. 3, July 1982, pp. 297-307 see also the improved algorithm in: S. J. Wan, K. M. Wong, P. Prusinkiewicz, "An Algorithm for Multidimensional Data Clustering", ACM Trans. on Mathematical Software, vol. 14, no. 2, June 1988, 153-162 Paul Heckbert, CS grad student 508-7 Evans Hall, UC Berkeley UUCP: ucbvax!miro.berkeley.edu!ph Berkeley, CA 94720 ARPA: ph@miro.berkeley.edu
dal@midgard.Midgard.MN.ORG (Dale Schumacher) (12/14/88)
In article <8241@pasteur.Berkeley.EDU> ph@miro.Berkeley.EDU (Paul Heckbert) writes: | |Paraphrasing, Dale is convering the 3 bit number abc, where each of a, b, |and c are 0 or 1, into the 8 bit number abcabcab. | |This is very close to the "correct" formula, but you've found a somewhat |roundabout way to compute it. The formula you want will map black (000) |to black (00000000) and white (111) to white (11111111) and map everything |inbetween linearly. In other words, you want to multiply by 255/7. |Your formula actually multiplies by 255.9375/7. The point of my "round-about" method is performance. It's much easier to replicate bits and do shifts than to divide by 7. I believe that this approximation will yield the "correct" value (to 8-bit int precision) for all cases, right? |Dale also asked about algorithms for selecting the 16 colors out of a |palette of 512 that best represent an image. This is called "color image |quantization". I wrote about it in a paper: Thank you for the references. I'll check into them.
dave@onfcanim.UUCP (Dave Martindale) (12/17/88)
In article <518@midgard.Midgard.MN.ORG> dal@midgard.Midgard.MN.ORG (Dale Schumacher) writes: > >The point of my "round-about" method is performance. It's much easier to >replicate bits and do shifts than to divide by 7. But it's faster to *implement* almost any function using table lookup. Even the naive and inaccurate shift-n-bits is faster when performed using table lookup than the hardware shift instruction on some hardware. Once you are using table lookup to do the pixel-by-pixel "computations", it really doesn't matter how expensive the code that initializes the table is - you only do it once. So you might as well use multiply and divide, and do the calculations in a way that someone else can read, and can see by inspection is correct. You can even use non-linear pixel encodings to avoid losing shadow detail when the output pixel is narrow. For example, my standard way of storing 12-bit linear data from a scanner into 8 bits is: outpix = 255 * ((inpix/4095.0) ** (1/2.2)) using floating point where needed, and rounding the result to an integer. (The magic number 2.2 happens to be the standard value of "gamma correction" that the NTSC television standard uses, so this can be sent to a frame buffer and turned into NTSC without further gamma correction, but the technique is worthwhile on its own even if the image will never appear on video.)
raveling@vaxb.isi.edu (Paul Raveling) (12/20/88)
In article <16929@onfcanim.UUCP> dave@onfcanim.UUCP (Dave Martindale) writes: >In article <518@midgard.Midgard.MN.ORG> dal@midgard.Midgard.MN.ORG (Dale Schumacher) writes: >> >>The point of my "round-about" method is performance. It's much easier to >>replicate bits and do shifts than to divide by 7. > >But it's faster to *implement* almost any function using table lookup. This is true for relatively complex functions, but not usually for those that break down easily to simple operations such as shifts and adds. I've measured speed improvements up to a factor of 14 over ordinary C code in the most extreme case by moving a critical algorithm to assembly language and using this sort of shifty logic. Both techniques are valuable. For example, real time software such as that used in Central Air Data Computers uses shift/subtract logic wherever possible for functions such as the simplest digital filters [something like filtered_value = (7*old_value+new_value)/8]; it uses table lookup with linear interpolation between entries for other functions. The other functions need not be very complex to make the table lookup useful -- sqrt, for example. Where table lookup REALLY shines is evaluating relatively complex functions. Software such as the B-1B's CADC uses it profusely to keep adequate real time margins. >Once you are using table lookup to do the pixel-by-pixel "computations", >it really doesn't matter how expensive the code that initializes the table >is - you only do it once. So you might as well use multiply and divide, >and do the calculations in a way that someone else can read, and can see >by inspection is correct. Good commenting serves the latter purpose. It's just as easy to supply an actual equation as a comment as it is to use it as code. It's a matter of software engineering discipline to be sure the comments match the code -- we all do that, religiously, don't we? --------------------- Paul Raveling Raveling@vaxb.isi.edu
dave@onfcanim.UUCP (Dave Martindale) (12/21/88)
In article <7086@venera.isi.edu> raveling@vaxb.isi.edu (Paul Raveling) writes: > >>But it's faster to *implement* almost any function using table lookup. > > This is true for relatively complex functions, but not usually > for those that break down easily to simple operations such as > shifts and adds. I've measured speed improvements up to a > factor of 14 over ordinary C code in the most extreme case > by moving a critical algorithm to assembly language and using > this sort of shifty logic. Is that a factor of 14 speed improvement over table lookup? Or a factor of 14 improvement over code that contained multiply or divide in the inner loop? A comparison against table lookup is what matters. What processor was this on? For shift/add to be faster than table lookup, the time required to do the shifts and adds must be less than that required to do the table addressing calculations and the extra memory fetch. (Note that the table word will usually be in the cache on machines that have a cache.) On machines like the VAX, where the table addressing calculations can be done as part of a "move" instruction but the shifts and adds are done as separate instructions (and shift is very slow on some models), table lookup is going to be faster. On another machine where the table addressing requires several instructions but a barrel shifter is available, the shift method will likely be faster. You have to try both, or have a good knowledge of the particular model of the particular architecture of processor you are using, to determine which is faster. However, table lookup has some additional benefits. It is simple enough that carefully-written C code is likely to generate the best possible assembler code, so there is no need to use assembler. A single copy of the lookup code functions for all possible input and output widths, as long as the pixels are always stored in words of a constant width. 
In contrast, the shift/add code requires different sequences of instructions for different input and output pixel bit widths, since the output may require adding 1 or 2 or 3 or more copies of the input, shifted by various amounts. Either you must pre-compile all possible variations that you could ever need, or compile code during execution, or just have a general-purpose algorithm that needs tests and branches within the pixel lookup loop (bye-bye performance). How do you deal with this?
ksbooth@watcgl.waterloo.edu (Kelly Booth) (12/21/88)
In article <16960@onfcanim.UUCP> dave@onfcanim.UUCP (Dave Martindale) writes: >For shift/add to be faster than table lookup... Actually, some solutions combine BOTH techniques. If the table has two or more indices (not the case here), then the lookup requires combining those indices into a single index into the table. Some compilers do this with adds and multiplies. If machine code is written (or if the compiler is smart) shifts and adds can be substituted. This may require that the table size(s) be adjusted to the next larger powers of two. Or, recursively, the composite indices can themselves be computed using a table lookup scheme.
raveling@vaxb.isi.edu (Paul Raveling) (12/22/88)
BTW, in case it wasn't clear my preceding response was suggesting that the "shifty logic" and table lookup with interpolation alternatives are more appropriate when the function being implemented has a large domain. Direct table lookup is best when the domain is relatively small, and I agree that it's the best technique for sheer speed. A case where direct table lookup isn't practical is any of a few image processing utilities we have. They examine a 5x5 square of pixels with 24 bits of color per pixel to determine a new color for the center pixel. I'd love to use direct table lookup for this, but 25*2**24 bytes is a tough table to handle. In article <16960@onfcanim.UUCP> dave@onfcanim.UUCP (Dave Martindale) writes: > >Is that a factor of 14 speed improvement over table lookup? Or a >factor of 14 improvement over code that contained multiply or divide in >the inner loop? A comparison against table lookup is what matters. >What processor was this on? This was several years ago; I believe the algorithm was an integer square root function, the processor definitely was either an 8088 or an 80286 on an IBM PC. Since the function's domain was unsigned 16-bit integers and the PC's memory was quite limited, direct table lookup would have been impractical, even though it certainly would be much faster. I must admit that the factor of 14 is exceptional because of the 80x86 processor architecture; going into assembly language allowed carefully recoding some branching logic that produced excessive queue flushes, which are a major time waster in this family of processors. Of course a direct table lookup would eliminate this problem. Note also that there are two different types of table lookup to consider: Direct table lookup and searching. The variant I mentioned in connection with real time avionics software used tables containing both x and y values (as in y = f(x) -- not screen coordinates) to handle continuous functions of real numbers. 
It did a binary search to find the nearest x values to the argument, then used linear interpolation to derive the corresponding y value. In order to support the required accuracy most tables contained around 6-10 entries; the shortest I can recall was 2 entries, longest about 50. The version of the B-1B CADC that I worked on used sets of these tables to implement up to 5-dimensional linear interpolation. This approach is VERY fast for a large class of functions; with a mean search requiring something like 3 lookups and only simple math, it's faster than computing for lots of things with large domains or complex [not simple] equations. It's also good for supporting empirically derived functions, such as error correction for the static pressure source on military aircraft; this tends to be a bit bizarre in the transonic regime near mach 1. BTW, a necessary aid for building that sort of table is a utility to do curve fits, then extract a set of (x,y) points which keep error less than a given accuracy requirement when using linear interpolation between these points. Bottom line: By all means use direct table lookup where it's feasible; but don't forget there are a couple other approaches that can still save time. --------------------- Paul Raveling Raveling@vaxb.isi.edu
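[The search-and-interpolate table variant described above can be sketched as follows; the names are illustrative, not from the CADC software:]

```c
/* Evaluate y = f(x) from a sparse table of (x, y) pairs, with xs
 * sorted ascending: binary-search for the bracketing pair, then
 * linearly interpolate, clamping outside the table's range. */
double table_interp(const double *xs, const double *ys, int n, double x)
{
    if (x <= xs[0])     return ys[0];
    if (x >= xs[n - 1]) return ys[n - 1];
    int lo = 0, hi = n - 1;
    while (hi - lo > 1) {            /* invariant: xs[lo] <= x < xs[hi] */
        int mid = (lo + hi) / 2;
        if (xs[mid] <= x) lo = mid;
        else              hi = mid;
    }
    double t = (x - xs[lo]) / (xs[hi] - xs[lo]);
    return ys[lo] + t * (ys[hi] - ys[lo]);
}
```

With the 6-10 entry tables mentioned above, the binary search costs only two or three probes, which is where the speed claim comes from.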
turk@Apple.COM (Ken "Turk" Turkowski) (12/31/88)
A fast but crude method is: Y = (R + 2G + B) / 4 -- Ken Turkowski @ Apple Computer, Inc., Cupertino, CA Internet: turk@apple.com Applelink: Turkowski1
jbm@eos.UUCP (Jeffrey Mulligan) (01/04/89)
From article <23105@apple.Apple.COM>, by turk@Apple.COM (Ken "Turk" Turkowski): > A fast but crude method is: > > Y = (R + 2G + B) / 4 > -- > Ken Turkowski @ Apple Computer, Inc., Cupertino, CA > Internet: turk@apple.com > Applelink: Turkowski1 Equally fast and crude but probably more accurate: Y = 2R + 5G + B -- Jeff Mulligan (jbm@aurora.arc.nasa.gov) NASA/Ames Research Ctr., Mail Stop 239-3, Moffet Field CA, 94035 (415) 694-6290
falk@sun.uucp (Ed Falk) (01/04/89)
In article <2263@eos.UUCP>, jbm@eos.UUCP (Jeffrey Mulligan) writes: > From article <23105@apple.Apple.COM>, by turk@Apple.COM (Ken "Turk" Turkowski): > > Y = (R + 2G + B) / 4 > Y = 2R + 5G + B Yeesh, you people. Why do you need to simplify it? Are you going to be doing these calculations by hand? Get a calculator or something. If you *must* do it in integer for speed reasons, do it this way: out = (77*r + 151*g + 28*b)/256 ; /* NTSC weights (.3,.59,.11)*/ The results are correct to four decimal places and the divide is replaced by a right-shift in a decent compiler and a byte-move in a good compiler. -- -ed falk, sun microsystems sun!falk, falk@sun.com card-carrying ACLU member.
raveling@vaxb.isi.edu (Paul Raveling) (01/05/89)
In article <83604@sun.uucp> falk@sun.uucp (Ed Falk) writes:
>In article <2263@eos.UUCP>, jbm@eos.UUCP (Jeffrey Mulligan) writes:
>> From article <23105@apple.Apple.COM>, by turk@Apple.COM (Ken "Turk" Turkowski):
>> > Y = (R + 2G + B) / 4
>> Y = 2R + 5G + B

[The latter should really be Y = (2R + 5G + B) / 8, right?]

>
>Yeesh, you people. Why do you need to simplify it? Are you going to be
>doing these calculations by hand? Get a calculator or something.
>
>If you *must* do it in integer for speed reasons, do it this way:
>
> out = (77*r + 151*g + 28*b)/256 ; /* NTSC weights (.3,.59,.11)*/
>
>The results are correct to four decimal places and the divide is replaced
>by a right-shift in a decent compiler and a byte-move in a good compiler.

... because sometimes speed is lots more important than 4 significant digits of accuracy, and multiplies are slow. Consider a 68020 running code for these operations:

	a) Y = (R + G+G + B) >> 2
	b) Y = (R+R + G<<2+G + B) >> 3
	c) Y = (77*r + 151*g + 28*b) >> 8

For a rough cut at comparing timings, adding up the number of clocks for each instruction for each of the 3 cases given in the gospel according to Motorola, assuming the work's done in registers and the result is stored with an (An)+ reference, gives:

	     Best Case   Cache Case   Worst Case
	     ---------   ----------   ----------
	a)       5           14           18
	b)       6           20           25
	c)      83           93           99

Which makes the accurate variant a whale of a lot slower than the others. This sort of thing gets fairly noticeable if you're massaging a megapixel image.

BTW, this is an example of a function that couldn't easily be accelerated by table lookup unless R, G, and B have very few bits. Even then, 3D subscript computation puts the table lookup in the same speed range as the faster 2 of these alternatives.

---------------------
Paul Raveling
Raveling@vaxb.isi.edu
ksbooth@watcgl.waterloo.edu (Kelly Booth) (01/05/89)
In article <7187@venera.isi.edu> raveling@vaxb.isi.edu (Paul Raveling) writes:
> ... because sometimes speed is lots more important than 4
> significant digits of accuracy, and multiplies are slow.
	. . . (stuff deleted) . . .
> BTW, this is an example of a function that couldn't easily
> be accelerated by table lookup unless R, G, and B have very
> few bits.  Even then 3D subscript computation puts the table
> lookup in the same speed range as the faster 2 of these
> alternatives.

Huh?  Table look up can be used to replace each of the three multiplies in
aR+bG+cB, so that the code becomes something like

	a[R] + b[G] + c[B]

if R-G-B are bytes (the usual case, and what most of the previous postings
have assumed -- for up to 12 bits table look up is still reasonable).
This leaves just the adds and the divide (not shown at the end), which for
the posting that suggested this was still a power of two (in fact 256), so
the byte swap/move or shift tricks all still work.  There is no need to
tabulate the entire function.

[See previous postings on table look up in this news group about 1-2 weeks
ago.]
raveling@vaxb.isi.edu (Paul Raveling) (01/07/89)
In article <7187@venera.isi.edu> raveling@vaxb.isi.edu (Paul Raveling) writes:
>	Consider a 68020 running code for these operations:
>
>	a)  Y = (R + G+G + B) >> 2
>	b)  Y = (R+R + (G<<2)+G + B) >> 3
>	c)  Y = (77*r + 151*g + 28*b) >> 8
>
>	BTW, this is an example of a function that couldn't easily
>	be accelerated by table lookup unless R, G, and B have very
>	few bits.  Even then 3D subscript computation puts the table
>	lookup in the same speed range as the faster 2 of these
>	alternatives.

It's time to eat some of my own words...  I admit to taking my brain out
of gear too soon on this one.  As Bob Webber pointed out in an email
message, a good candidate for the best approach of all is likely to be:

	d)  Y = (times77[R] + times151[G] + times28[B]) >> 8

If R, G, and B are 8 bits this only requires 768 table entries (they must
be 16-bit, since 151*255 = 38505, so 1536 bytes of table space), and it
should be about as fast as alternative b.  This is easily worth it for
having both good speed and good accuracy.

---------------------
Paul Raveling
Raveling@vaxb.isi.edu
dal@midgard.Midgard.MN.ORG (Dale Schumacher) (01/10/89)
In article <83604@sun.uucp> falk@sun.uucp (Ed Falk) writes:
|In article <2263@eos.UUCP>, jbm@eos.UUCP (Jeffrey Mulligan) writes:
|> From article <23105@apple.Apple.COM>, by turk@Apple.COM (Ken "Turk" Turkowski):
|> > Y = (R + 2G + B) / 4
|> Y = 2R + 5G + B
|
|	out = (77*r + 151*g + 28*b)/256 ;	/* NTSC weights (.3,.59,.11) */
|
|The results are correct to four decimal places and the divide is replaced
|by a right-shift in a decent compiler and a byte-move in a good compiler.

I don't know where you got your numbers.  The values I have for the Y
component of YIQ (luminance) from RGB are:

	R=.299  G=.587  B=.114

The following formula is the best approximation with 8-bit values:

	Y = (R*77 + G*150 + B*29) / 256

which gives the weights R=.3008, G=.5859, B=.1133 (total error = .0036).
Your values give the weights R=.3008, G=.5898, B=.1094 (total error =
.0092).  Even MY numbers don't have 4 places of accuracy, but they are a
better approximation to the 3-place target values I have.  Someone
mentioned that the NTSC weights may have been changed recently; is that
so?

PS.  I fully agree with the idea that more accurate values should be used
if you're going to use integer, and do 3 multiplies and a 'divide' (which
can be optimized if it's a power of 2) anyway.
raveling@vaxb.isi.edu (Paul Raveling) (01/14/89)
In article <10322@well.UUCP> Jef Poskanzer <jef@rtsg.ee.lbl.gov> writes:
>I wrote a quick test program to try out various approximations.  It runs
>five million conversions.  On a Sun 3/260, the timings are:
>
>	float:	223.0
>	int:	 35.4
>	table:	 31.6
>
>I have appended the program, in case anyone wants to run it on a different
>architecture or try different approximations.

Just below are some results from an HP 9000/350.  I added two runs:  One
was with "shifty" logic defined by...

	#ifdef SHIFTY
	    j = ( r+r + (g<<2)+g + b ) >> 3;
	#endif

The other was a "no logic" run, with nothing defined, to get an overhead
calibration (how much time the loop logic and rgb updating used).  The
"Less Overhead" column below subtracts this to get a direct comparison of
timing for the math only.

	Test		Raw Timing	Less Overhead
	----		----------	-------------
	float		  220.0		    201.2
	int		   37.6		     18.8
	table		   34.9		     16.1
	shifty		   28.9		     10.1
	overhead	   18.8		      0

This isn't entirely what I anticipated:  The "int" version,
(j = ( r * 77 + g * 150 + b * 29 ) >> 8;), appeared to be faster than
expected.  I checked further and found that on this one the compiler
decomposed all three multiplies into shifts, adds, and a subtract.

Also, the table version seemed too slow.  It turned out that the compiler
generated some remarkably crummy code.  ALL data except the tables were
kept on the stack -- none in registers -- and the subscript address
computations appeared to be distinctly suboptimal.

Next, maybe tomorrow, I'll try the same stuff with some hand coded
assembly language.  It should be easy to beat the compiler by LOTS.

---------------------
Paul Raveling
Raveling@vaxb.isi.edu
falk@sun.uucp (Ed Falk) (01/15/89)
> > BTW, this is an example of a function that couldn't easily
> > be accelerated by table lookup unless R, G, and B have very
> > few bits.  Even then 3D subscript computation puts the table
> > lookup in the same speed range as the faster 2 of these
> > alternatives.
>
> Huh?  Table look up can be used to replace each of the three multiplies
> in aR+bG+cB so that the code becomes something like
>
>	a[R] + b[G] + c[B]

I'm embarrassed.  I had been doing (r*77 + g*150 + 29*b)/256 all along,
thinking that all multiplies took the same amount of time (the compiler,
it turns out, optimizes constant multiplies in interesting ways).  I've
switched all my code to use look-up tables now.  I've gained a new respect
for look-up tables.

--
-ed falk, sun microsystems
sun!falk, falk@sun.com
card-carrying ACLU member.
raveling@vaxb.isi.edu (Paul Raveling) (01/17/89)
In article <7266@venera.isi.edu> raveling@vaxb.isi.edu (Paul Raveling) writes:
>	Next, maybe tomorrow, I'll try the same stuff with some
>	hand coded assembly language.  It should be easy to beat
>	the compiler by LOTS.

Here's the result of using hand coded assembly language for the "table"
algorithm:

	Test			Raw Timing	Less Overhead
	----			----------	-------------
	C version		   34.9		    16.1
	Assembly version	   10.2		     6.4

Anyone have a better C compiler?

If anyone would like to check this hacked assembly version on other
systems, let me know.  I can either email it or post it, but it should
take a few minutes to clean the source up a little and stick in a warning
that very few 68K assemblers use exactly the same source syntax.

---------------------
Paul Raveling
Raveling@vaxb.isi.edu
jonathan@jvc.UUCP (Jonathan Hue) (01/18/89)
I'm slightly puzzled by these calculations of luminance from RGB.  Doesn't
the formula Y = .299R + .587G + .114B only apply when RGB represents
intensity, rather than pixel values?

If your pixel values are completely gamma-corrected through look-up tables
in your frame buffer hardware, so that they represent intensities, this
would work.  But if you use linear look-up tables (or don't have any), you
would need to convert pixel values to intensity, calculate luminance, and
then convert back into pixel values (voltages).

Also, considering how far the green of the typical color monitor is from
NTSC green, it may be worth deriving new coefficients for the monitor you
are using.

Jonathan Hue
uunet!jvc!jonathan