harnad@mind.UUCP (Stevan Harnad) (11/01/86)
Summary: Continuity, exploiting the physical medium, and information preservation under transformations

Tom Dietterich (orstcs!tgd) responds as follows to my challenge to define the A/D distinction:

> In any representation, certain properties of the representational
> medium are exploited to carry information. Digital representations
> tend to exploit fewer properties of the medium. For example, in
> digital electronics, a 0 could be defined as anything below .2 volts
> and a 1 as anything above 4 volts. This is a simple distinction.
> An analog representation of a signal (e.g., in an audio amplifier)
> requires a much finer grain of distinctions--it exploits the
> continuity of voltage to represent, for example, the loudness
> of a sound.

So far so good. Analog representations "exploit" more of the properties (e.g., continuity) of the "representational" (physical?) medium to carry information. But then is the difference between an A and a D representation just that one is more (exploitative) and the other less? Is it not rather that they carry information and/or represent in a DIFFERENT WAY? In what does that difference consist? (And what does "exploit" mean? Exploit for whom?)

> A related notion of digital and analog can be obtained by considering
> what kinds of transformations can be applied without losing
> information. Digital signals can generally be transformed in more
> ways--precisely because they do not exploit as many properties of the
> representational medium. Hence, if we add .1 volts to a digital 0 as
> defined above, the result will either still be 0 or else be undefined
> (and hence [un]detectable). A digital 1 remains unchanged under
> addition of .1 volts. However, the analog signal would be
> changed under ANY addition of voltage.

"Preserving information under transformations" also sounds like a good candidate. But it seems to me that preservation-under-transformation is (or ought to be) a two-way street.
Digital representations may be robust within their respective discrete boundaries, but it hardly sounds information-preserving to lose all the information between .2 volts and 4 volts. I would think that the invertibility of analog transformations might be a better instance of information preservation than the irretrievable losses of A/D. And this still seems to side-step the question of WHAT information is preserved, and in what way, by analog and digital representations, respectively. And should we be focusing on representations in this discussion, or on transformations (A/A, A/D, D/D, D/A)? Finally, what is the relation between a digital representation and a symbolic representation? Please keep those definitions coming.

Stevan Harnad
{allegra, bellcore, seismo, packard} !princeton!mind!harnad
harnad%mind@princeton.csnet
(609)-921-7771
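The invertibility contrast can be made concrete with a toy sketch (modern Python; the gain and the .2/4-volt thresholds follow Dietterich's example, everything else here is an illustrative assumption):

```python
# Toy model of the invertibility criterion.  An "analog" transformation
# (a simple gain) is 1:1 and can be undone exactly; a "digital" one
# (thresholding into discrete symbols) is many-to-one, so the information
# between the thresholds is irretrievably lost.

def analog_transform(v, gain=2.0):
    """Invertible: every input voltage maps to a unique image."""
    return gain * v

def analog_invert(v, gain=2.0):
    return v / gain

def digitize(v, low=0.2, high=4.0):
    """Many-to-one: anything below `low` reads as 0, anything above
    `high` as 1, and the region in between is undefined (None)."""
    if v < low:
        return 0
    if v > high:
        return 1
    return None

# The analog image inverts exactly:
assert analog_invert(analog_transform(0.15)) == 0.15
# The digital image does not: distinct inputs collapse to one symbol.
assert digitize(0.1) == digitize(0.15) == 0
assert digitize(4.5) == digitize(5.0) == 1
```

Nothing in the digital image distinguishes 0.1 V from 0.15 V afterwards, which is the sense in which the A-to-D mapping is not a two-way street.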
harnad@mind.UUCP (Stevan Harnad) (11/01/86)
Steven R. Jacobs (utah-cs!jacobs) of the University of Utah CS Dept makes the following contribution on the A/D Distinction. (His reply appears below, followed at the very end by some comments from me.)

>> One prima facie non-starter: "continuous" vs. "discrete" physical processes.

>I apologize if this was meant to avoid discussion of continuous/discrete
>issues relating to analog/digital representations. I find it difficult
>to avoid talking in terms of "continuous" and "discrete" processes when
>discussing the difference between analog and digital signals. I am
>approaching the question from a signal processing point of view, so I
>tend to assume that "real" signals are analog signals, and other methods
>of representing signals are used as approximations of analog signals (but
>see below about a physicist's perspective). Yes, I realize you asked for
>objective definitions. For my own non-objective convenience, I will use
>analog signals as a starting point for obtaining other types of signals.
>This will assist in discussing the operations used to derive non-analog
>signals from analog signals, and in discussing the effects of the operations
>on the mathematics involved when manipulating the various types of signals
>in the time and frequency domains.
>
>The distinction of continuous/discrete can be applied to both the amplitude
>and time axes of a signal, which allows four types of signals to be defined.
>So, some "loose" definitions:
>
>Analog signal -- one that is continuous both in time and amplitude, so that
>    the amplitude of the signal may change to any amplitude at any time.
>    This is what many electrical engineers might describe as a "signal".
>
>Sampled signal -- continuous in amplitude, discrete in time (usually with
>    equally-spaced sampling intervals). Signal may take on any amplitude,
>    but the amplitude changes only at discrete times. Sampled signals
>    are obtained (obviously?) by sampling analog signals. If sampling is
>    done improperly, aliasing will occur, causing a loss of information.
>    Some (most?) analog signals cannot be accurately represented by a
>    sampled signal, since only band-limited signals can be sampled without
>    aliasing. Sampled signals are the basis of Digital Signal Processing,
>    although digital signals are invariably used as an approximation of
>    the sampled signals.
>
>Quantized signal -- piece-wise continuous in time, discrete in amplitude.
>    Amplitude may change at any time, but only to discrete levels. All
>    changes in amplitude are steps.
>
>Digital signal -- one that is discrete both in time and amplitude, and may
>    change in (discrete) amplitude only at certain (discrete, usually
>    uniformly spaced) time intervals. This is obtained by quantizing
>    a sampled signal.
>
>Other types of signals can be made by combining these "basic" types, but
>that topic is more appropriate for net.bizarre than for sci.electronics.
>
>The real distinction (in my mind) between these representations is the effect
>the representation has on the mathematics required to manipulate the signals.
>
>Although most engineers and computer scientists would think of analog signals
>as the most "correct" representations of signals, a physicist might argue that
>the "quantum signal" is the only signal which corresponds to the real world,
>and that analog signals are merely a convenient approximation used by
>mathematicians.
>
>One major distinction (from a mathematical point of view) between sampled
>signals and analog signals can be best visualized in the frequency domain.
>A band-limited analog signal has a Fourier transform which is finite. A
>sampled representation of the same signal will be periodic in the Fourier
>domain.
>Increasing the sampling frequency will "spread out" the identical
>"clumps" in the FT (Fourier transform) of a sampled signal, but the FT
>of the sampled signal will ALWAYS remain periodic, so that in the limit as
>the sampling frequency approaches infinity, the sampled signal DOES NOT
>become a "better" approximation of the analog signal; they remain entirely
>distinct. Whenever the sampling frequency exceeds the Nyquist frequency,
>the original analog signal can be exactly recovered from the sampled signal,
>so that the two representations contain equivalent information, but the
>two signals are not the same, and the sampled signal does not "approach"
>the analog signal as the sampling frequency is increased. For signals which
>are not band-limited, sampling causes a loss of information due to aliasing.
>As the sampling frequency is increased, less information is lost, so that the
>"goodness" of the approximation improves as the sampling frequency increases.
>Still, the sampled signal is fundamentally different from the analog signal.
>This fundamental difference applies also to digital signals, which are both
>quantized and sampled.
>
>Digital signals are usually used as an approximation to "sampled" signals.
>The mathematics used for digital signal processing is actually only correct
>when applied to sampled signals (maybe it should be called "Sampled Signal
>Processing" (SSP) instead). The approximation is usually handled mostly by
>ignoring the "quantization noise" which is introduced when converting a
>sampled analog signal into a digital signal. This is convenient because it
>avoids some messy "details" in the mathematics. To properly deal with
>quantized signals requires giving up some "nice" properties of signals and
>operators that are applied to signals. Mostly, operators which are applied
>to signals become non-commutative when the signals are discrete in amplitude.
>This is very much related to the "Heisenberg uncertainty principle" of
>quantum mechanics, and to me represents another "true" distinction between
>analog and digital signals. The quantization of signals represents a loss of
>information that is qualitatively different from any loss of information that
>occurs from sampling. This difference is usually glossed over or ignored in
>discussions of signal processing.
>
>Well, those are some half-baked ideas that come to my mind. They are probably
>not what you are looking for, so feel free to post them to /dev/null.
>
>Steve Jacobs

- - - - - - - - - - - - - - - - - - - - - - - -

REPLY:

> I apologize if this was meant to avoid discussion of continuous/discrete
> issues relating to analog/digital representations.

It wasn't meant to avoid discussion of continuous/discrete at all; just to avoid a simple-minded equation of C/D with A/D, overlooking all the attendant problems of that move. You certainly haven't done that in your thoughtful and articulate review and analysis.

> I tend to assume that "real" signals are analog signals, and other
> methods of representing signals are used as approximations of analog
> signals.

That seems like the correct assumption. But if we shift for a moment from considering the A or D signals themselves and consider instead the transformation that generated them, the question arises: If "real" signals are analog signals, then what are they analogs of? Let's borrow some formal jargon and say that there are (real) "objects," and then there are "images" of them under various types of transformations. One such transformation is an analog transformation. In that case the image of the object under the (analog) transformation can also be called an "analog" of the object. Is that an analog signal? The approximation criterion also seems right on the mark. Using the object/transformation/image terminology again, another kind of a transformation is a "digital" transformation.
The image of an object (or of the analog image of an object) under a digital transformation is "approximate" rather than "exact." What is the difference between "approximate" and "exact"? Here I would like to interject a tentative candidate criterion of my own: I think it may have something to do with invertibility. A transformation from object to image is analog if (or to the degree that) it is invertible. In a digital approximation, some information or structure is irretrievably lost (the transformation is not 1:1). So, might invertibility/noninvertibility have something to do with the distinction between an A and a D transformation? And do "images" of these two kinds count as "representations" in the sense in which that concept is used in AI, cognitive psychology and philosophy (not necessarily univocally)? And, finally, where do "symbolic" representations come in? If we take a continuous object and make a discrete, approximate image of it, how do we get from that to a symbolic representation?

> Analog signal -- one that is continuous both in time and amplitude.
> Sampled signal -- continuous in amplitude, discrete in time...
> If sampling is done improperly, aliasing will occur, causing a
> loss of information.
> Quantized signal -- piece-wise continuous in time, discrete in
> amplitude.
> Digital signal -- one that is discrete both in time and amplitude...
> This is obtained by quantizing a sampled signal.

Both directions of departure from the analog, it seems, lose information, unless the interpolations of the gaps in either time or amplitude can be accurately made somehow. Question: What if the original "object" is discrete in the first place, both in space and time? Does that make a digital transformation of it "analog"? I realize that this is violating the "signal" terminology, but, after all, signals have their origins too. Preservation and invertibility of information or structure seem to be even more general features than continuity/discreteness.
Or perhaps we should be focusing on the continuity/noncontinuity of the transformations rather than the objects?

> a physicist might argue that the "quantum signal" is the only
> signal which corresponds to the real world, and that analog
> signals are merely a convenient approximation used by mathematicians.

This, of course, turns the continuous/discrete and the exact/approximate criteria completely on their heads, as I think you recognize too. And it's one of the things that makes continuity a less straightforward basis for the A/D distinction.

> Mostly, operators which are applied to signals become
> non-commutative when the signals are discrete in amplitude.
> This is very much related to the "Heisenberg uncertainty principle"
> of quantum mechanics, and to me represents another "true" distinction
> between analog and digital signals. The quantization of signals
> represents a loss of information that is qualitatively different from
> any loss of information that occurs from sampling.

I'm not qualified to judge whether this is an analogy or a true quantum effect. If the latter, then of course the qualitative difference resides in the fact that (on current theory) the information is irretrievable in principle rather than merely in practice.

> Well, those are some half-baked ideas that come to my mind.

Many thanks for your thoughtful contribution. I hope the discussion will continue "baking."

Stevan Harnad
{allegra, bellcore, seismo, rutgers, packard} !princeton!mind!harnad
harnad%mind@princeton.csnet
(609)-921-7771
rst@think.COM (Robert Thau) (11/02/86)
In article <105@mind.UUCP> harnad@mind.UUCP (Stevan Harnad) writes:
>"Preserving information under transformations" also sounds like a good
>candidate. But it seems to me that preservation-under-transformation
>is (or ought to be) a two-way street. Digital representations may be
>robust within their respective discrete boundaries, but it hardly
>sounds information-preserving to lose all the information between .2
>volts and 4 volts. I would think that the invertibility of analog
>transformations might be a better instance of information preservation than
>the irretrievable losses of A/D.

I'm not convinced. Common ways of transmitting analog signals all *do* lose at least some of the signal, irretrievably. Stray capacitance in an ordinary wire can distort a rapidly changing signal. Even fiber optic cables lose signal amplitude enough to require repeaters. Losses of information in processing analog signals tend to be worse, and for an analog transformation to be exactly invertible, it *must* preserve all the information in its input.

Of course, it is possible to keep the loss of information to acceptable levels. Any stereo buff owns at least two fairly good analog systems (speakers). If s/he is a committed stereo buff, s/he probably can quote you exactly how much information is lost by the speakers (at what volume levels, in what pitch ranges, ad nauseam), and how much it would cost to lose less. The point is that the amount of information in the speakers' input which they lose, irretrievably, is a consequence of the design decisions of the people who made them. Such design decisions are as explicit as the number of bits used in a digital representation of the signal in the CD player farther up the pike. Either digital or analog systems can be made as "true" as you like, given enough time, materials, and money, but in neither case is perfection an option.
>And this still seems to side-step the
>question of WHAT information is preserved, and in what way, by analog
>and digital representations, respectively.

Agreed.
--
This article was generated entirely by line noise. Don't believe a word of it.

rst
ken@rochester.ARPA (Comfy chair) (11/02/86)
I tend to side with those who say that the A/D distinction is mostly in the mind of the observer. The world is seamless to me, but may be digital at the atomic or quantum level for all I know. (Please, no physics flames about this, I'm ignorant here.) We just use whatever analysis is appropriate. Is a CD player a digital system or an analog system? It partakes of both. I can make a very nice sinewave oscillator out of a couple of TTL inverters by biasing them correctly. I can turn my op amp into a nice bistable flip-flop with the right feedback. Is light radiation or particles? Does it matter? Depends on which sets of equations are simpler to solve. Looking for a distinction in this A/D question probably reveals more about man's desire to classify than anything else. Words are not the reality.

Ken
djo@ptsfd.UUCP (Dan'l Oakes) (11/04/86)
A candidate:

I think an earlier candidate hinted at the right idea and then dropped it. The distinction is that a digital signal is discrete -- not so much with respect to amplitude as with respect to time. An analog signal doesn't much care when you look at it; a digital signal has "transition" or "edge" periods which can carry no information, by definition.

To point out that an analog modem has such periods is irrelevant; the reason it has transition periods is that it is an analog signal CARRYING A MODULATED DIGITAL SIGNAL on it. The transition periods are -not- meaningful in terms of the analog signal but only in terms of the digital signal -- or, to be more precise but stranger, they are only -meaningless- in terms of the digital signal; their presence is very significant to the analog signalling device, the modem: their presence signals to the distant modem "Yep, I'm still here."

Go ahead. Tear it up. I don't care...

Dan'l Danehy-Oakes
turk@apple.UUCP (Ken "Turk" Turkowski) (11/04/86)
In article <116@mind.UUCP> harnad@mind.UUCP (Stevan Harnad) writes:
>(2) winnie!brad (Brad Garton) writes:
>> ... When I consider the digitized versions of analog
>> signals we deal with over here <computer music>, it seems that we
>> approximate more and more closely the analog signal with the
>> digital one as we increase the sampling rate.

There is a difference between sampled signals and digital signals. A digital signal is not only sampled, but is also quantized. One can have an analog sampled signal, as with CCD filters.

As a practical consideration, all analog signals are band-limited. By the Sampling Theorem, there is a sampling rate at which a bandlimited signal can be perfectly reconstructed. *Increasing the sampling rate beyond this "Nyquist rate" cannot result in higher fidelity*. What can affect the fidelity, however, is the quantization of the samples: the more bits used to represent each sample, the more accurately the signal is represented.

This brings us to the subject of Signal Theory. A particular class of signal that is both time- and band-limited (all real-world signals) can be represented by a linear combination of a finite number of basis functions. This is related to the dimensionality of the signal, which is approximately 2WT, where W is the bandwidth of the signal, and T is the duration of the signal.

>> ... This process reminds
>> me of Mandelbrot's original "How Long is the Coastline of Britain"
>> article dealing with fractals. Perhaps "analog" could be thought
>> of as the outer limit of some fractal set, with various "digital"
>> representations being inner cutoffs.

Fractals have a 1/f frequency distribution, and hence are not band-limited.

>> In article <105@mind.UUCP> harnad@mind.UUCP (Stevan Harnad) writes:
>> I'm not convinced. Common ways of transmitting analog signals all
>> *do* lose at least some of the signal, irretrievably...

Let's not forget noise.
It is impossible to keep noise out of analog channels and signal processing, but it can be removed in digital channels and can be controlled (roundoff errors) in digital signal processing.

>> ... Losses of information in processing analog signals tend to
>> be worse, and for an analog transformation to be exactly invertible, it
>> *must* preserve all the information in its input.

Including the exclusion of noise. Once noise is introduced, the signal cannot be exactly inverted.
--
Ken Turkowski @ Apple Computer, Inc., Cupertino, CA
UUCP: {sun,nsc}!apple!turk
CSNET: turk@Apple.CSNET
ARPA: turk%Apple@csnet-relay.ARPA
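Turkowski's claim, that past the Nyquist rate the fidelity of a digital representation is governed by quantization rather than by further oversampling, can be sketched numerically (Python/NumPy; the tone, rate, and bit depths are illustrative assumptions):

```python
import numpy as np

# A 5 Hz tone sampled at 1 kHz, far above its Nyquist rate of 10 Hz,
# so sampling itself loses nothing; only quantization introduces error.
t = np.linspace(0, 1, 1000, endpoint=False)
x = np.sin(2 * np.pi * 5 * t)

def quantize(x, bits):
    """Uniform quantizer covering [-1, 1] with 2**bits levels."""
    step = 2.0 / (2 ** bits)
    return np.round(x / step) * step

# Quantization error shrinks with bit depth and is bounded by one step:
err8 = np.max(np.abs(quantize(x, 8) - x))
err16 = np.max(np.abs(quantize(x, 16) - x))
assert err16 < err8
assert err8 <= 2.0 / 2 ** 8
```

More bits per sample, not more samples per second, is what tightens the approximation once the signal is adequately sampled.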
ken@rochester.ARPA (Comfy chair) (11/05/86)
|I think an earlier candidate hinted at the right idea and then dropped it. The |distinction is that a digital signal is discrete -- not so much with respect to |amplitude as with respect to time. I don't think time is necessarily relevant. I can give you a picture of the Mona Lisa or a tape containing digitized luminance and colour levels of same. There seem to be various things people are thinking of: circuits, signals or just representations. Ken
zdenek@heathcliff.columbia.edu (Zdenek Radouch) (11/06/86)
In article <267@apple.UUCP> turk@apple.UUCP (Ken "Turk" Turkowski) writes:
>
>As a practical consideration, all analog signals are band-limited.

False. For a PARTICULAR purpose we can consider property A to be of interest only if it is in an interval <x,y>. The property is ANY property, not just the frequency - see below!

>...By the
>Sampling Theorem, there is a sampling rate at which a bandlimited signal can
>be perfectly reconstructed. *Increasing the sampling rate beyond this
>"Nyquist rate" cannot result in higher fidelity*.

False. The Nyquist Theorem assumes that the band of the signal REALLY IS limited. It in fact isn't; we are just not interested in the sound we cannot hear. It is a common misconception that you don't have to worry about the frequencies you cannot hear. As far as hearing is concerned - yes. But the sampling works ONLY if the signal IS bandlimited. You need a low pass filter to LIMIT the band of the signal. Such a filter is easy to describe mathematically (Fx < Fn -> A=1; Fx > Fn -> A=0) but it cannot be built. Hence the real sampling rate has to be higher than the Nyquist rate.

>What can affect the fidelity, however, is the quantization of the samples:
>the more bits used to represent each sample, the more accurately the signal
>is represented.

False. Either you are interested in an unlimited range, i.e. the original analog signal, or you assume a particular application and a corresponding set of ranges of all the properties involved. In the first case (of no practical importance) you would need infinite sampling frequency as well as an infinite number of bits for each sample. Once you have limited the frequency range according to the mechanism of hearing, you should do the same to the dynamic range. You said: According to our experience, we are interested only in the frequencies up to N Hz. Therefore (according to Nyquist) the sampling rate can be 2N Hz. Similarly: According to our experience, we are interested only in dynamic levels up to N dB.
Therefore (using simple math) we need only N/6 bits of resolution.

zdenek
-------------------------------------------------------------------------
Men are four:
He who knows and knows that he knows, he is wise - follow him;
He who knows and knows not that he knows, he is asleep - wake him;
He who knows not and knows that he knows not, he is simple - teach him;
He who knows not and knows not that he knows not, he is a fool - shun him!

zdenek@CS.COLUMBIA.EDU or ...!seismo!columbia!cs!zdenek
Zdenek Radouch, 457 Computer Science, Columbia University,
500 West 120th St., New York, NY 10027
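The closing "N/6 bits" rule of thumb comes from each bit of a uniform quantizer buying roughly 20*log10(2) dB of dynamic range. A minimal sketch (Python; the 96 dB figure for 16-bit audio is the standard back-of-envelope check):

```python
import math

# Each bit doubles the number of quantization levels, which is worth
# 20*log10(2) ~= 6.02 dB of dynamic range; hence "N dB needs N/6 bits".
db_per_bit = 20 * math.log10(2)

def bits_needed(dynamic_range_db):
    """Smallest whole number of bits covering the given range in dB."""
    return math.ceil(dynamic_range_db / db_per_bit)

assert abs(db_per_bit - 6.02) < 0.01
assert bits_needed(96) == 16   # CD-style dynamic range -> 16 bits
```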
turk@apple.UUCP (Ken "Turk" Turkowski) (11/08/86)
In article <521@ptsfd.UUCP> djo@ptsfd.UUCP (Dan'l Oakes) writes:
>I think an earlier candidate hinted at the right idea and then dropped it. The
>distinction is that a digital signal is discrete -- not so much with respect to
>amplitude as with respect to time. An analog signal doesn't much care when you
>look at it; a digital signal has "transition" or "edge" periods which can carry
>no information, by definition.

Au contraire, the transitions contain timing information, and some disk controllers use this information to synchronize. Most applications don't, however.

A good digital-to-analog subsystem will not produce signals with transitions, edges, or impulses. The steps seen on signals coming directly from an A/D converter are due to aliases in the frequency domain; a lowpass filter with a sharp cutoff at half the Nyquist rate will eliminate these steps or aliases.

As I asserted in an earlier article, the time- and band-limited nature of practical analog signals implies their finite dimensionality, where the dimension is approximately 2WT, where W is the bandwidth and T is the duration of the signal. [see: Slepian, David, "On Bandwidth", Proc. IEEE, v. 64, no. 3, March 1976] As such, analog and digital representations of bounded signals are equivalent, as long as the digital signal has approximately 2WT samples and amplitude quantization error is negligible. In particular, if the analog signal is sampled with more than 2WT samples of infinite precision, the sampling is completely invertible.
--
Ken Turkowski @ Apple Computer, Inc., Cupertino, CA
UUCP: {sun,nsc}!apple!turk
CSNET: turk@Apple.CSNET
ARPA: turk%Apple@csnet-relay.ARPA
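The invertibility claim for adequately sampled band-limited signals can be sketched via Whittaker-Shannon sinc interpolation (Python/NumPy; the tone frequency, rate, and truncation window are illustrative assumptions, and the truncated sum makes the recovery approximate rather than exact):

```python
import numpy as np

# A 3 Hz tone sampled at 10 Hz, comfortably above its Nyquist rate of
# 6 Hz.  The Whittaker-Shannon formula x(t) = sum_k x[k]*sinc(fs*t - k)
# recovers the signal BETWEEN the samples; here the infinite sum is
# truncated to a finite window, so recovery is only approximate.
f = 3.0
fs = 10.0
n = np.arange(-200, 201)            # finite window of sample indices
samples = np.sin(2 * np.pi * f * n / fs)

def sinc_reconstruct(t):
    """Truncated Whittaker-Shannon interpolation at time t (seconds)."""
    return float(np.sum(samples * np.sinc(fs * t - n)))

# Recover the signal at instants that fall between the samples:
for t in (0.013, 0.21, -0.17):
    assert abs(sinc_reconstruct(t) - np.sin(2 * np.pi * f * t)) < 0.05
```

This is the sense in which sampling above the Nyquist rate is invertible; quantizing the sample values, by contrast, discards information that no interpolation formula can restore.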
zdenek@heathcliff.columbia.edu (Zdenek Radouch) (11/09/86)
In article <277@apple.UUCP> turk@apple.UUCP (Ken "Turk" Turkowski) writes:
>
> ... The steps seen
>on signals coming directly from an A/D converter are due to aliases in the
>frequency domain; a lowpass filter with a sharp cutoff at half the Nyquist
>rate will eliminate these steps or aliases.
>

Nonsense. The discrete nature of the digital signal has NOTHING to do with alias distortion. (It's about equivalent to saying: There is a relation between the digital representation of the numbers in my program and the bus error I got.) Also note that aliasing can only be PREVENTED. Once it occurs, there is NO WAY to eliminate it. If you want to post something anybody can find in any introductory text on signal processing, why don't you read it first? You would learn something, and in addition you might find some interesting problem we could discuss. It's boring to keep correcting the drivel.

Anyway, let me say something about A/D, D/A, aliasing and related subjects. The standard analog-digital-analog chain looks like this:

                                    2. ADC
  analog signal -> 1. LP filter ->             -> 4. signal processing engine
                                    3. sampling

  -> 5. DAC -> 6. LP filter -> analog signal

1. LP filter (low pass)
   Limits the frequency range of the input signal to <0, Fsampling/2>
   in order to eliminate aliasing (see 3.)
   Note that 2. and 3. can be in any order.

2. ADC ("analog to digital converter")
   1. Converts the continuous amplitude of the input signal to discrete levels
   2. Represents discrete amplitude levels as numbers

3. sampling
   Conversion of a continuous-time signal f(t) into the sequence x(n) with
   values x(n) = f(nT) by periodic sampling. T is the sampling period.
   This is the critical part of the chain, because the sampling process is
   a many-to-one mapping. In general, if F(f) is the sampling function,
   F(f1) = F(f2) DOESN'T imply f1 = f2! F(f) becomes one-to-one only if f
   is from <0,Fs/2> where Fs = 1/T. The sampling process is the only place
   in the chain with the possibility of a many-to-one mapping.
In plain words, if you sample an "illegal" frequency f > Fs/2, you will get a different frequency on the output. The output frequency will be f < Fs/2, i.e. the SAME as if you had sampled some "legal" frequency. Hence the word aliasing. Since you cannot distinguish between the alias and "proper" frequencies, there is NO WAY to correct the problem once it occurs.

4. signal processing engine
   A digital computer to do the required processing.

5. DAC ("digital to analog converter")
   Represents numbers as discrete amplitude levels.
   Note that both input and output of the DAC are DISCRETE!

6. LP filter (low pass)
   1. Converts the digital signal to a continuous analog signal.
   2. Removes high frequency (f > Fs/2) components from the spectrum.
      These frequencies are inherent to the discrete-time signal.

In addition please note:
1. The "analog to digital converter" does only part of the conversion from
   analog to digital.
2. The "digital to analog converter" DOESN'T perform the digital to analog
   conversion.
3. The mathematics in signal processing uses DISCRETE-TIME signals as opposed
   to the STEP signals engineers like to draw. The original signal can be
   reconstructed only if the pulses fed to the output LP filter have zero
   width. If the pulsewidth is finite (equal to T in a step function) there
   will be a decrease in the amplitude of the high frequencies that has to
   be corrected elsewhere.

zdenek

zdenek@CS.COLUMBIA.EDU or ...!seismo!columbia!cs!zdenek
Zdenek Radouch, 457 Computer Science, Columbia University,
500 West 120th St., New York, NY 10027
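The many-to-one character of sampling described above can be shown in a few lines (Python/NumPy; the particular frequencies are illustrative assumptions): a 7 Hz tone sampled at 10 Hz produces samples identical, up to sign, to those of a 3 Hz tone, so the two cannot be told apart afterwards.

```python
import numpy as np

fs = 10.0                                    # sampling rate: 10 Hz
n = np.arange(100)                           # 100 sample instants
legal = np.sin(2 * np.pi * 3.0 * n / fs)     # 3 Hz, below Fs/2 = 5 Hz
illegal = np.sin(2 * np.pi * 7.0 * n / fs)   # 7 Hz, above Fs/2

# sin(2*pi*7n/10) = sin(2*pi*n - 2*pi*3n/10) = -sin(2*pi*3n/10):
# the "illegal" 7 Hz tone lands exactly on a (sign-flipped) 3 Hz alias.
assert np.allclose(illegal, -legal)
```

Since the sample sequences coincide, no post-hoc processing can decide which analog input produced them, which is why the anti-aliasing filter must come before the sampler.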
edhall@randvax.UUCP (Ed Hall) (11/10/86)
Oh my, I feel like I just resubscribed to net.philosophy...

Going back to the original hypothesis that something is more ``natural'' about an analog representation than a digital one: consider that Nature chose a digital code of three-digit base-four numbers to determine how you and I are put together. (Don't tell me there is something ``unnatural'' about DNA...)

There is a good engineering reason why this is so. You can say all you want about the discontinuous nature of digital representations as opposed to analog, but the fact remains that digital is exactly reproducible, while analog is not. Where reproduction is concerned--biological or otherwise--digital representation yields practical techniques for attaining a high degree of accuracy, whereas the ability of analog techniques to attain such accuracy is merely theoretical.

Take a cue from Mother Nature. Go digital.

-Ed Hall
decvax!randvax!edhall
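The exact-reproducibility point can be sketched as a toy copying experiment (Python/NumPy; the noise level, threshold, and number of generations are illustrative assumptions): every copy generation adds noise, and the digital chain re-thresholds at each step while the analog chain cannot.

```python
import numpy as np

# Copy an 8-symbol message through 100 noisy generations, two ways:
# "analog" just passes the noisy values on; "digital" re-decides 0/1
# against a threshold at every generation, discarding the noise.
rng = np.random.default_rng(42)
original = np.array([0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0])

analog = original.copy()
digital = original.copy()
for _ in range(100):
    noise = rng.normal(0, 0.05, size=original.shape)
    analog = analog + noise                              # noise accumulates
    digital = np.where(digital + noise > 0.5, 1.0, 0.0)  # noise discarded

assert np.array_equal(digital, original)   # digital copy is still exact
assert not np.array_equal(analog, original)  # analog copy has drifted
```

With the noise standard deviation ten times smaller than the decision margin, a digital bit flip is astronomically unlikely, while the analog values random-walk away from the original with certainty.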
roy@phri.UUCP (Roy Smith) (11/11/86)
In article <680@randvax.UUCP> edhall@rand-unix.UUCP (Ed Hall) writes:
> Nature chose digital code of three-digit base-four numbers to determine
> how you and I are put together. [...] There is a good engineering reason
> why this is so. You can say all you want about the discontinuous nature
> of digital representations as opposed to analog, but the fact remains
> that digital is exactly reproducible, while analog is not.

This is one of my favorite topics, so I'd like to expand on that a bit. I trust the real biologists out there will take into account the fact that this is a huge, gross simplification of a complicated subject and not take me to task on details. I have been deliberately loose with nomenclature to highlight the information processing aspects at the cost of some biological accuracy. Readers interested in finding out more are encouraged to get a good book on molecular biology. Jim Watson's "Molecular Biology of the Gene", 3rd edition (1976), is a good place to start.

The genetic code is indeed 3-digit, base-4 numbers. It's also an overloaded code -- the mapping from DNA to amino acids (AA's) is not one-to-one. Some AA's are coded for by more than one codon (3-base DNA sequence). What's really interesting is that the copying of DNA does *not* have the perfect accuracy we have come to expect from digital processes.

DNA exists in the cell most of the time as double-stranded DNA (dsDNA). This means that each base exists twice: once on one strand, and again on the other strand in its complementary form. After replication, you have a piece of dsDNA in which one of each base pair is from the original piece of DNA, and the other is a copy. You digital types will recognize this as a 2-symbol ECC, with one data symbol and one check symbol (there are 4 symbols, so you can't really say "bits").

OK, now that we've got our base pairs, what do we do with them? Well, a wonderful thing happens -- an enzyme (Pol1?) comes along and re-reads both strands of the new dsDNA.
Every time it finds a place where a base pair is wrong, it corrects it. But, you ask, with only a single check symbol (Hamming distance < 1), how do you know which one to trust? The answer is that you don't! You fix one of them at random and hope it's the right one. If it's not, no big deal. Either you've introduced a fatal mutation which will take care of itself, or you've made a "silent mutation" which doesn't make any difference (remember the many-to-one mapping of codons to AA's). Of course, you might have just lucked out and made a useful mutation, in which case you're off on the road to evolution.

If you really get into this, it's amazing how many computer-science concepts were thought of by living cells first. The most obvious is that DNA is a program. Then you have ECC (described above), subroutines (different enzymes made from common subunits), regular expressions (restriction enzymes), compilers and assemblers (ribosomes and tRNA's), compile-time preprocessing using #ifdef's (introns), self-modifying code (transposons and integrating phages), portable programs (plasmids), P&V operations (numerous regulatory systems), etc. You can even think of mRNA as a vector register, DNA as main memory, and chromosome-histone complexes as demand paging from a file system (or maybe as archival tape storage).
--
Roy Smith, {allegra,cmcl2,philabs}!phri!roy
System Administrator, Public Health Research Institute
455 First Avenue, New York, NY 10016
"you can't spell unix without deoxyribonucleic!"
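[Editorial aside: Roy's two information-theoretic points -- the many-to-one codon mapping that makes some mutations "silent", and the view of dsDNA as one data symbol plus one check symbol (error-detecting, not error-correcting) -- can be sketched in a few lines of Python. This is a toy model: the codon table is a small excerpt of the real 64-entry standard code (written as mRNA, hence U rather than T), and the strand strings are invented for illustration.]

```python
# Excerpt of the standard genetic code: 64 codons (3-digit, base-4
# numbers) map onto ~20 amino acids, so the mapping is many-to-one.
CODON_TABLE = {
    "GCU": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",  # 4 codons, 1 AA
    "UUA": "Leu", "UUG": "Leu", "CUU": "Leu", "CUC": "Leu",  # Leu has 6 total
    "AUG": "Met",                                            # 1 codon, 1 AA
}

# Watson-Crick base pairing: each base on one strand determines its
# partner on the other, so the second strand acts as a check symbol.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def check_strand(data, check):
    """Treat one strand as data symbols and the other as check symbols.
    Returns positions where the pair fails the complementarity check --
    note this *detects* an error but cannot say which strand is wrong."""
    return [i for i, (d, c) in enumerate(zip(data, check))
            if COMPLEMENT[d] != c]

# A single-base change GCU -> GCC is "silent": same amino acid.
assert CODON_TABLE["GCU"] == CODON_TABLE["GCC"]

# One mismatched base pair is detectable, but not correctable:
assert check_strand("GATTACA", "CTAATGA") == [6]
assert check_strand("GATTACA", "CTAATGT") == []
```

With only one check symbol per data symbol, the code can flag a mismatch but gives no vote on which strand to trust -- which is exactly why Roy's "fix one at random" story (and Lucius's correction of it, below in the thread's own words) is where the interesting biology lives.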
chiaraviglio@husc2.UUCP (lucius) (11/13/86)
In article <2489@phri.UUCP>, roy@phri.UUCP (Roy Smith) writes:
> . . . What's really interesting, is that the copying of DNA does
> *not* have the perfect accuracy we have come to expect from digital
> processes.
[Stuff about how new strands of DNA are derived from each other deleted.]
> OK, now that we've got our base pairs, what do we do with them?
> Well, a wonderful thing happens -- an enzyme (Pol1?) comes along and
> re-reads both strands of the new dsDNA. Every time it finds a place where
> a base-pair is wrong, it corrects it. But, you ask, with only a single
> check symbol (Hamming distance < 1), how do you know which one to trust?
> The answer is that you don't! You fix one of them at random and hope it's
> the right one. If it's not, no big deal. Either you've introduced a fatal
> mutation which will take care of itself, or you've made a "silent mutation"
> which doesn't make any difference (remember the many-to-one mapping of
> codons to AA's). Of course, you might have just lucked out and made a
> useful mutation, in which case you're off on the road to evolution.

Wrong. DNA Polymerase I does not come along to fix errors after replication of the DNA; rather, it does so while it is replicating the DNA. Every time it attaches a new base to the strand it is synthesizing, it refuses to proceed unless that base pairs properly with the corresponding one on the old strand. If the new and old ones will not pair correctly, it snips the new one off. It can distinguish between new and old strands because it holds the two strands in different ways, and also because the new strand is not complete. (Even if DNA Polymerase I runs into a more completed strand, it cannot close the nick; it will just eat up the part of the strand it is running into in order to make space for what it is putting down -- this gives you a way to radioactively label strands of DNA in places other than the ends ("nick translation").)
(Completion of a strand -- joining the growing (3') end to the beginning (5') end of the next part of the strand -- requires DNA Ligase.) No "fixing at random" is involved, except for the low probability that the condition which caused DNA Polymerase I to put the wrong base in in the first place persists long enough for it to put in the next base after that (which would still have an enhanced chance of not sticking even if correctly matched, due to the overall weakening of pairing caused by the mismatched base before it). This is one of the reasons mutation rates are as low as has been observed.

Other ways in which errors are corrected non-randomly depend on the fact that most alterations to DNA produce invalid bases rather than valid but incorrect bases. For example, under UV light, thymidines which are next to each other dimerize, producing obviously invalid bases; under any conditions, cytidine may spontaneously deaminate to uridine (which does not occur in DNA, but only in RNA, where the resulting error would be less disastrous). The results of both alterations (and some others) are things which specific enzymes can recognize as invalid and cleave out, to be replaced with properly matching bases. The probability of a base mutating while it is transiently unpaired (due to a mishap to its partner) is much lower than that of a base mutating while paired, because only a small fraction of the bases are unpaired at any given time.

Recommended reading: _Genes_ (or _Genes II_) by Lewin, and _Molecular Biology of the Cell_ (by 4 authors whose names I cannot remember right off hand -- this is the 1983 edition; I hear a 1986 edition may be out by a somewhat different set of authors).
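[Editorial aside: Lucius's key point -- proofreading happens *during* synthesis, so a wrong base survives only if the condition that caused the misincorporation also persists through the next incorporation step -- can be caricatured in Python. This is a toy model, not real biochemistry; the error model, rates, and function names are invented for illustration.]

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def replicate(template, error_rate=0.01, rng=None):
    """Copy `template` base by base, proofreading each incorporation.

    A wrong base is attached with probability `error_rate`, but it is
    snipped off and replaced unless the error-causing condition also
    persists into the next step -- so the surviving error rate is on
    the order of error_rate squared, not error_rate.
    """
    rng = rng or random.Random(0)
    copy = []
    for base in template:
        correct = COMPLEMENT[base]
        # Occasionally the wrong base gets attached...
        attached = rng.choice("ACGT") if rng.random() < error_rate else correct
        # ...but the polymerase checks pairing before extending; the
        # mismatch escapes only if the bad condition continues.
        if attached != correct and rng.random() >= error_rate:
            attached = correct  # snipped off, correct base put in
        copy.append(attached)
    return "".join(copy)

# With no errors, the copy is the exact complement of the template:
assert replicate("GATTACA", error_rate=0.0) == "CTAATGT"
```

Running this with a raw misincorporation rate of, say, 10% leaves surviving mismatches near 1% (error_rate squared, less the chance the random base happened to be right anyway) -- a crude illustration of Lucius's remark that proofreading during synthesis is one reason observed mutation rates are so low.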
--
Lucius Chiaraviglio                      chiaraviglio@husc4.harvard.edu
Department of Molecular Biology,         seismo!husc4!chiaraviglio
Massachusetts General Hospital
Please do not mail replies to me on husc2 (disk quota problems, and broken mail system won't let me send mail out). Please send only to the address given above, until tardis.harvard.edu is revived.
chiaraviglio@husc2.UUCP (lucius) (11/14/86)
It has been pointed out that I made an error in my previous message in these newsgroups. This is due to omission of the underlined words in the following corrected segment of the affected paragraph in that message:

> Wrong. DNA Polymerase I does not _have to_ come along to fix errors _of
> replication_ after _the_ replication of DNA, but rather does _the great part of_
> it while it is replicating the DNA. _(It is also used to help correct errors
> at other times.)  The way it works during DNA replication is this:_  every
> time it attaches a new base to the strand it is synthesizing, it will refuse
> to proceed unless that base pairs properly with the corresponding one on the
> old strand. If the new and old ones will not pair correctly, it snips the new
> one off.
[Rest of message not shown here.]

That should make it much clearer.
--
Lucius Chiaraviglio
chiaraviglio@husc4.harvard.edu
seismo!husc4!chiaraviglio
Please do not mail replies to me on husc2 (disk quota problems, and broken mail system won't let me send mail out). Please send only to the address given above, until tardis.harvard.edu is revived.