rcd@opus.UUCP (Dick Dunn) (11/01/85)
> >> First of all, I don't know of any features of any languages which are
> >> "inherently" expensive to implement, regardless of the machine
> >> architecture.  (me)
> >
> > Right, because any expensive feature can be done directly in
> > hardware! :-)  But simply compiling a call to "(sort x)" into
> > "SORT (X)+,0" doesn't mean that the sorting won't still be O(n log n)...
>
> ...The proof
> that I know of that no sort can run faster than O(n log n)
> is based on an analysis of possible decision trees...The important point
> is that this proof relies on assumptions about ARCHITECTURE.
> It assumes that more than one comparison will NOT be done at
> the same time.

This is not true.  I think it indicates a basic misunderstanding of "O()
notation".  If you change the architecture of your machine so that it can
do, say, five comparisons at once, and you modify your sort program to take
advantage of this, the time for the sort is STILL O(n log n)--the situation
is little different than if you'd made the hardware five times as fast.
Remember (or understand) that when you talk about this sort of "order of"
an algorithm, you don't get to set a fixed upper limit on n.

> I believe that one can find the N-th highest number in an
> unsorted list in less than O(n log n) time (I'd have to check
> my algorithms books to make sure, but...)  Throw n different
> processors at an unsorted array, one looking for the first
> highest, the second looking for the second highest, etc.
> Have these processors work concurrently.  Voila!

NO!  You cannot throw "n different processors" at the array!  N is
(potentially) larger than the number of processors you have.  Actually,
there is an assumption in analyzing algorithms that one does not have an
infinite number of computational elements (whatever they may be).  If you
throw out that assumption, you're nowhere--because (1) you can't build, or
even emulate, the hardware implied and mostly (2) all the algorithms
you're going to find interesting will take constant time!  (If you have
unlimited hardware, you just keep replicating and working in parallel.)

> Look, I'm no algorithms expert.  But separation of language
> issues from architectural issues from implementation issues
> is RARELY done and I'd like to see people be more careful
> about doing so.

On the contrary, analysis of complexity of algorithms is generally done
with considerable care to separate these issues and identify the
assumptions about architecture implied by the algorithm.  If you're not
accustomed to reading in the area, you may find yourself a little befuddled
because you don't understand some of the assumptions commonly left implicit.
-- 
Dick Dunn	{hao,ucbvax,allegra}!nbires!rcd		(303)444-5710 x3086
   ...At last it's the real thing...or close enough to pretend.
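For reference, the counting argument behind the decision-tree bound both
posters refer to takes only a few lines (standard textbook material, not
taken from either post): a comparison sort must distinguish all n!
orderings of its input, and a binary decision tree of depth d has at most
2^d leaves, so

    $$ 2^d \ge n! \quad\Longrightarrow\quad
       d \ge \log_2 n! = \sum_{k=2}^{n} \log_2 k = \Theta(n \log n). $$

Allowing c comparisons per step (Dunn's "five at once") only makes the tree
at most $2^c$-ary, so the number of steps is still at least
$\log_2 n! / c$, which remains $\Theta(n \log n)$ for any fixed c.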
sher@rochester.UUCP (David Sher) (11/03/85)
In article <189@opus.UUCP> rcd@opus.UUCP (Dick Dunn) writes:
> ...
>NO!  You cannot throw "n different processors" at the array!  N is
>(potentially) larger than the number of processors you have.  Actually,
>there is an assumption in analyzing algorithms that one does not have an
>infinite number of computational elements (whatever they may be).  If you
>throw out that assumption, you're nowhere--because (1) you can't build, or
>even emulate, the hardware implied and mostly (2) all the algorithms
>you're going to find interesting will take constant time!  (If you have
>unlimited hardware, you just keep replicating and working in parallel.)
> ...
>-- 
>Dick Dunn	{hao,ucbvax,allegra}!nbires!rcd		(303)444-5710 x3086
>   ...At last it's the real thing...or close enough to pretend.

I am afraid that this opinion is an oversimplification.  It is true that if
you have arbitrary amounts of hardware, analysis of algorithms becomes
meaningless (since all problems are table lookup).  However, it is
perfectly reasonable to study the case where the amount of hardware
available is proportional to the size of the problem or some function
thereof (like log(n)).  This is no more an assumption of infinite hardware
than O(n) assumes that you have infinite time.

As an example, consider the result that you can sort in O(n) time with O(n)
hardware.  This means that if you are willing to acquire hardware
proportional to the maximum size of the input you are expecting, you can
save a log(n) factor in sorting.  An interesting result, no?
-- 
-David Sher
sher@rochester
seismo!rochester!sher
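Sher does not name a particular construction for the O(n)-time, O(n)-hardware
claim; one classical example with exactly that bound is odd-even
transposition sort on a linear array of n processors, which sorts in n
synchronous compare-exchange rounds.  A minimal sequential simulation in C
(the independent pairs within a round are what a real processor array would
handle simultaneously):

#include <stdio.h>

/* Odd-even transposition sort: n synchronous rounds; in round r the
 * disjoint pairs (i, i+1) with i == r (mod 2) are compare-exchanged.
 * With one processor per element, each round is one time step, so the
 * elapsed time is O(n) using O(n) hardware. */
static void odd_even_sort(int a[], int n)
{
    for (int round = 0; round < n; round++)
        for (int i = round % 2; i + 1 < n; i += 2)   /* independent pairs */
            if (a[i] > a[i + 1]) {
                int t = a[i]; a[i] = a[i + 1]; a[i + 1] = t;
            }
}

int main(void)
{
    int a[] = { 5, 2, 9, 1, 7, 3 };
    int n = sizeof a / sizeof a[0];
    odd_even_sort(a, n);
    for (int i = 0; i < n; i++)
        printf("%d ", a[i]);
    printf("\n");
    return 0;
}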
barmar@mit-eddie.UUCP (Barry Margolin) (11/03/85)
In article <189@opus.UUCP> rcd@opus.UUCP (Dick Dunn) writes:
>NO!  You cannot throw "n different processors" at the array!  N is
>(potentially) larger than the number of processors you have.  Actually,
>there is an assumption in analyzing algorithms that one does not have an
>infinite number of computational elements (whatever they may be).  If you
>throw out that assumption, you're nowhere--because (1) you can't build, or
>even emulate, the hardware implied and mostly (2) all the algorithms
>you're going to find interesting will take constant time!  (If you have
>unlimited hardware, you just keep replicating and working in parallel.)

True, you never have an unlimited number of processing elements, but you
can have enough for many applications.  Check out some of the recent
non-Von Neumann architecture research, such as Danny Hillis' Connection
Machine (he recently wrote a very readable book about it, published by the
MIT Press).  The Connection Machine will have thousands, possibly hundreds
of thousands, of simple processors with small amounts of local memory.

True, algorithms are still theoretically of the same order as they would be
on conventional hardware, but the coefficients are MUCH smaller.  Also,
many instances of particular algorithms do not run out of processors, so
they are much faster still.  For instance, if N is less than the number of
processors, finding the maximum of N numbers is something like O(log N)
(my Connection Machine book isn't handy, so I'm not sure if that is the
real speed, but it is better than O(N)); this uses an algorithm that
arranges the processors in a binary tree and percolates the maximum up to
the top.
-- 
    Barry Margolin
    ARPA: barmar@MIT-Multics
    UUCP: ..!genrad!mit-eddie!barmar
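The tree algorithm Margolin sketches is easy to pin down: pair the values,
keep the larger of each pair, repeat.  With a processor per pair, each
round costs one time step, so N values need about log2(N) rounds.  A small
C simulation of those rounds (sequential, so it models only the time steps,
not actual Connection Machine code):

#include <stdio.h>

/* Tournament (binary-tree) maximum: each round halves the number of
 * surviving candidates.  With a processor per pair, the elapsed time is
 * the number of rounds, about log2(n). */
static int tree_max(int a[], int n)
{
    int rounds = 0;
    while (n > 1) {
        int m = 0;
        for (int i = 0; i < n; i += 2)      /* one round: all pairs at once */
            a[m++] = (i + 1 < n && a[i + 1] > a[i]) ? a[i + 1] : a[i];
        n = m;
        rounds++;
    }
    printf("rounds: %d\n", rounds);
    return a[0];
}

int main(void)
{
    int a[] = { 4, 17, 8, 23, 1, 42, 16, 5 };
    printf("max = %d\n", tree_max(a, 8));   /* 3 rounds for 8 values */
    return 0;
}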
gvcormack@watmum.UUCP (Gordon V. Cormack) (11/03/85)
> NO!  You cannot throw "n different processors" at the array!  N is
> (potentially) larger than the number of processors you have.  Actually,
> there is an assumption in analyzing algorithms that one does not have an
> infinite number of computational elements (whatever they may be).  If you
> throw out that assumption, you're nowhere--because (1) you can't build, or
> even emulate, the hardware implied and mostly (2) all the algorithms
> you're going to find interesting will take constant time!  (If you have
> unlimited hardware, you just keep replicating and working in parallel.)

Sorry, this is just not true.  There is inherent sequencing in many
problems.  For example, it is extremely difficult to sort in O(log n) time
using n processors.  No one knows how (and indeed I think it is impossible)
to sort in O(1) time with n log n processors.
rcd@opus.UUCP (Dick Dunn) (11/04/85)
Talking about sorting and complexity of algorithms...
> In article <189@opus.UUCP> rcd@opus.UUCP (Dick Dunn) writes:
> ...
> >NO!  You cannot throw "n different processors" at the array!  N is
> >(potentially) larger than the number of processors you have...
> ...
> I am afraid that this opinion is an oversimplification...it is
> perfectly reasonable to study the case where the amount of hardware
> available is proportional to size of the problem or some function thereof
> (like log(n)).  This is no more an assumption of infinite hardware
> than O(n) assumes that you have infinite time...

OK, now I am SURE that the folks who are responding to this don't
understand O() notation.  This notation is explicitly designed to express
the ASYMPTOTIC COMPLEXITY of an algorithm.  If that's not what you mean,
use some other notation.  The usefulness of this sort of complexity measure
is that it tells how the required processing time (or space) increases in
relation to the "size" of the problem being solved.  Again, O() notation is
an asymptotic measure.  This makes it quite explicit that you cannot have
an "...amount of hardware...proportional to size of the problem..." unless
you consider an infinite amount of hardware.

The "order-of" notation is not the only measure that can be applied to an
algorithm--but it DOES have a specific meaning and a specific use, and it's
a very basic part of analysis of algorithms.  Try to understand it before
you use it.
-- 
Dick Dunn	{hao,ucbvax,allegra}!nbires!rcd		(303)444-5710 x3086
   ...Never attribute to malice what can be adequately explained by stupidity.
ken@turtlevax.UUCP (Ken Turkowski) (11/04/85)
Several people have mentioned the problem of sorting using multiple
processors.  Has anyone looked at using the AMD sorting and searching chip,
and how to integrate it into a multi-user system?
-- 
Ken Turkowski @ CIMLINC (formerly CADLINC), Menlo Park, CA
UUCP: {amd,decwrl,hplabs,seismo,spar}!turtlevax!ken
ARPA: turtlevax!ken@DECWRL.DEC.COM
sher@rochester.UUCP (David Sher) (11/05/85)
In article <196@opus.UUCP> rcd@opus.UUCP (Dick Dunn) writes:
>Talking about sorting and complexity of algorithms...
>> In article <189@opus.UUCP> rcd@opus.UUCP (Dick Dunn) writes:
>> ...
>> >NO!  You cannot throw "n different processors" at the array!  N is
>> >(potentially) larger than the number of processors you have...
>> ...
>> I am afraid that this opinion is an oversimplification...it is
>> perfectly reasonable to study the case where the amount of hardware
>> available is proportional to size of the problem or some function thereof
>> (like log(n)).  This is no more an assumption of infinite hardware
>> than O(n) assumes that you have infinite time...
>
>OK, now I am SURE that the folks who are responding to this don't
>understand O() notation.  This notation is explicitly designed to express
>the ASYMPTOTIC COMPLEXITY of an algorithm.  If that's not what you mean,
>use some other notation.  The usefulness of this sort of complexity measure
>is that it tells how the required processing time (or space) increases in
>relation to the "size" of the problem being solved.  Again, O() notation is
>an asymptotic measure.  This makes it quite explicit that you cannot have
>an "...amount of hardware...proportional to size of the problem..." unless
>you consider an infinite amount of hardware.
>
>The "order-of" notation is not the only measure that can be applied to an
>algorithm--but it DOES have a specific meaning and a specific use, and it's
>a very basic part of analysis of algorithms.  Try to understand it before
>you use it.
>-- 
>Dick Dunn	{hao,ucbvax,allegra}!nbires!rcd		(303)444-5710 x3086
>   ...Never attribute to malice what can be adequately explained by stupidity.

Are you sure you understand O(f(n)) notation?  The exact definition of
O(f(n)) is: "a function g(n) is said to be O(f(n)) if there exists a
constant c such that g(n) <= cf(n) for all but some finite (possibly empty)
set of nonnegative values for n."  (Taken w/o permission from Aho,
Hopcroft, and Ullman's "The Design and Analysis of Computer Algorithms".)
(Actually I can define it also axiomatically, but this gets the gist of it
and should be understandable even to the nonmathematicians in the
audience.)

Actually I suspect your misunderstanding is of the word infinite.  O(f(n))
never refers to any of the infinities.  What it refers to is an arbitrarily
large number (which is a subtly different concept).  Thus I can consider an
arbitrarily large amount of hardware without considering an infinite
amount.  This just means that if you are willing to supply me with a
problem n words long, I am willing to construct some unknown c times f(n)
hardware (c must remain constant for the rest of my life) and run the
problem on it (for all but a finite set of special cases; in particular,
there is a constant d such that if you hand me a problem larger than d, I
can definitely do it within cf(n)).  Since I don't expect you to hand me
any infinitely long problems, I do not expect to construct infinite amounts
of hardware.

-David Sher
"Never assume that someone else is stupid when your own stupidity is an
equally valid explanation"
-- 
-David Sher
sher@rochester
seismo!rochester!sher
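In present-day notation, the AHU definition Sher quotes reads:

    $$ g(n) = O(f(n)) \iff \exists\, c > 0,\ \exists\, n_0
       \ \text{such that}\ g(n) \le c\,f(n) \ \text{for all}\ n \ge n_0. $$

The constant c is fixed once and for all; n is unbounded but always finite.
That is exactly the "arbitrarily large versus infinite" distinction the two
posters are arguing about.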
preece@ccvaxa.UUCP (11/06/85)
[Quoted from a posting responding to a complaint that algorithm analysis
tended to consider architectural rather than computational issues.  I don't
have the name of the original author.]

> On the contrary, analysis of complexity of algorithms is generally done
> with considerable care to separate these issues and identify the
> assumptions about architecture implied by the algorithm.  If you're not
> accustomed to reading in the area, you may find yourself a little
> befuddled because you don't understand some of the assumptions commonly
> left implicit.
>	Dick Dunn	{hao,ucbvax,allegra}!nbires!rcd		(303)444-5710 x3086
----------
Well, you both missed the major architectural assumption in the examples
you were arguing about, which is that the number of compares is a
reasonable measure of cost.  This assumption is usually mentioned in texts
on the subject.  Certainly in many cases there are other, more appropriate
measures (in sorting, for instance, the cost of moving data may sometimes
be a much larger factor than the cost of the compares).

You also both missed the point that throwing N processors at a problem does
not change its complexity -- all those compares are still being done
SOMEWHERE -- even though it does change the elapsed time to completion
(again raising the question of what is a useful measure of complexity).

Overall I'd have to say that the first author was more correct: the common
conception of algorithm analysis does tend to include major architectural
assumptions that may not be valid for unconventional and innovative
architectures.  Those assumptions ARE usually considered by those who write
in the discipline, but are often ignored by those who have merely studied
it.
-- 
scott preece
gould/csd - urbana
ihnp4!uiucdcs!ccvaxa!preece
richw@ada-uts.UUCP (11/06/85)
>>> I believe that one can find the N-th highest number in an
>>> unsorted list in less than O(n log n) time (I'd have to check
>>> my algorithms books to make sure, but...)  Throw n different
>>> processors at an unsorted array, one looking for the first
>>> highest, the second looking for the second highest, etc.
>>> Have these processors work concurrently.  Voila!    (me)

> NO!  You cannot throw "n different processors" at the array!  N is
> (potentially) larger than the number of processors you have.
>     (Dick Dunn)

As pointed out in another reply, it is perfectly reasonable to assume that
N processors will always be available.  Consider a machine which has
limited memory (they ALL do).  Say M bytes.  It also has M processors.
Come up with a problem instance that has more elements to sort than
processors -- you can't.  And let's not get into the different subject of
external sorts, please.

Of course no machine has an unbounded number of processors.  HOWEVER,
given an instance of a sorting problem, a machine can exist which can
1) store the array to sort and 2) match each array element with a
processor.  For those machines, sorting can be done in O(n).  And, yes, a
single processor can find the k-th element in an unsorted list of n numbers
in O(n).

In analyzing algorithms, no one cares that REAL machines have bounded
amounts of memory because machines have enough memory to not worry about
this detail.  Why concern yourself with the fact that no machine can have
an infinite number of processors?  This discussion is relevant because it
seems that in the future machines will exist which have a sufficiently
large number of processors that one can, once again, ignore the finite-ness
of the universe.  Details, details...  :-)

> Actually, there is an assumption in analyzing algorithms that one does
> not have an infinite number of computational elements (whatever they may
> be)...
>     (Dick Dunn)

This assumption is ONLY applicable when COMPARING algorithms.  If you want
to analyze the performance of an algorithm on a new machine you just built
(or hypothesized), there's NO reason not to.  The "assumption" Dick
mentions I think applies to published analyses of algorithms; here, most
readers should assume the good ol' random-access machine, single-processor
architecture for the sake of COMPARISON against other published results.

>>> Look, I'm no algorithms expert.  But separation of language
>>> issues from architectural issues from implementation issues
>>> is RARELY done and I'd like to see people be more careful
>>> about doing so.    (me)

> On the contrary, analysis of complexity of algorithms is generally done
> with considerable care to separate these issues and identify the
> assumptions about architecture implied by the algorithm.  If you're not
> accustomed to reading in the area, you may find yourself a little
> befuddled because you don't understand some of the assumptions commonly
> left implicit.
>     (Dick Dunn)

Grrr...  My saying "I'm no algorithms expert" was a poor attempt at
humility.  Please avoid using such statements to imply lack of experience.
My saying "so-and-so is rarely done" applied to some of the off-the-cuff
remarks I've seen posted.  Wasn't this obvious?  I realize that algorithms
research, language research, etc. is done seriously (somewhere out there).
But, then again, this is not MIT's Lab. for Computer Science (notice I
didn't say the AI Lab :-) ) -- this is the net.  Informal discussions about
various topics occur all the time on the net and in real life.

Sometimes these discussions include statements which I feel are wrong.  So
I say so.  I truly welcome being corrected when I'm wrong.  I'd like to
learn how I'm mistaken.  But please no more bullshit about not being
"accustomed to reading in the area."

-- Rich Wagner
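Wagner doesn't say which linear-time selection algorithm he means; the one
usually cited for worst-case O(n) is the median-of-medians method of Blum,
Floyd, Pratt, Rivest, and Tarjan.  Its shorter cousin, quickselect, is
expected O(n) and shows the flavor.  A sketch in C (the function and
variable names are made up for illustration, not taken from any post here):

#include <stdio.h>

/* Quickselect: return the k-th smallest element (k is 0-based).
 * Expected O(n); worst case O(n^2) unless the pivot is picked as in the
 * median-of-medians algorithm. */
static void swap(int *x, int *y) { int t = *x; *x = *y; *y = t; }

static int quickselect(int a[], int lo, int hi, int k)
{
    while (lo < hi) {
        int pivot = a[hi], store = lo;
        for (int i = lo; i < hi; i++)        /* partition around the pivot */
            if (a[i] < pivot)
                swap(&a[i], &a[store++]);
        swap(&a[store], &a[hi]);
        if (k == store) return a[k];
        if (k < store)  hi = store - 1;
        else            lo = store + 1;
    }
    return a[lo];
}

int main(void)
{
    int a[] = { 9, 1, 8, 2, 7, 3, 6 };
    printf("3rd smallest = %d\n", quickselect(a, 0, 6, 2));   /* prints 3 */
    return 0;
}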
stevev@tekchips.UUCP (Steve Vegdahl) (11/06/85)
This message is in response to a number of articles regarding the
above-named subject.  My intent is to eliminate some of the
misunderstanding and miscommunication that has been taking place.  (And
probably replace it with new misunderstanding ... :-))

First, the O(n log n) result for sorting applies when
  1. The architecture has a single processor with a "large" amount of memory,
  2. Sorting is done using comparisons, and
  3. Probabilistic algorithms are not used.

1--> Arguments crying "foul, you can't use an infinite amount of hardware"
do not hold water.  The point of the original posting (as I read it,
anyway) was that a problem's complexity depends on the architecture.
Comparison sorting is O(n^2) on a Turing Machine, and is O(n log n) on a
von Neumann architecture.  It is perfectly reasonable to assume an
architecture in which each memory element has a dedicated processor.  The
O-notation allows one to assume that there is an infinite amount of memory
(or at least an amount of memory that is proportional to problem size).
Thus, on a processor-per-memory-element architecture, the assumption of n
processors for a problem of size n is not only reasonable, but mandatory.

2--> There are, of course, sorting algorithms such as radix sort (see
Knuth, Vol. 3) that do not use comparisons.

3--> It has been shown (e.g., Bruce Weide's thesis, Carnegie-Mellon U.
(1977?)) that probabilistic techniques can be used to sort in linear time.
(This must be qualified by defining what "worst-case" means for a
probabilistic algorithm, but I'll leave that as an exercise ...)

With that as a preface, I will now respond to some of the statements that
have been made regarding this subject.

>NO!  You cannot throw "n different processors" at the array!  N is
>(potentially) larger than the number of processors you have.  Actually,
>there is an assumption in analyzing algorithms that one does not have an
>infinite number of computational elements (whatever they may be).  If you
>throw out that assumption, you're nowhere--because (1) you can't build, or
>even emulate, the hardware implied and mostly (2) all the algorithms
>you're going to find interesting will take constant time!  (If you have
>unlimited hardware, you just keep replicating and working in parallel.)

Responding to (1): certain computation models allow an infinite number of
processors.  Even disallowing these as impractical, assuming N processors
is no more impractical than assuming N memory elements.  Responding to (2):
there are professors who make careers of studying the complexity of
algorithms assuming an infinite number of processors.  There are lots of
interesting algorithms that do not work in constant time even with that
assumption.

>OK, now I am SURE that the folks who are responding to this don't
>understand O() notation.  This notation is explicitly designed to express
>the ASYMPTOTIC COMPLEXITY of an algorithm.  If that's not what you mean,
>use some other notation.  The usefulness of this sort of complexity measure
>is that it tells how the required processing time (or space) increases in
>relation to the "size" of the problem being solved.  Again, O() notation is
>an asymptotic measure.  This makes it quite explicit that you cannot have
>an "...amount of hardware...proportional to size of the problem..." unless
>you consider an infinite amount of hardware.
>
>The "order-of" notation is not the only measure that can be applied to an
>algorithm--but it DOES have a specific meaning and a specific use, and it's
>a very basic part of analysis of algorithms.  Try to understand it before
>you use it.

The O-notation makes no such assumption.  In fact, it assumes just the
opposite: that there is enough memory (i.e., hardware) to hold the problem.
The important thing is that the O-notation be used with respect to a
computation model, and that everyone agree (for a given discussion) on what
the model is.  The writer of the above quote seems to insist that the von
Neumann model (or something very close to it) is the only model of
computation that can be used.  Allowing the number of processors to
increase in proportion to the number of memory elements simply trades space
for time.  It is still appropriate to use the O-notation.

>If I remember right, even though the time complexity of sort is O(log n)
>on a single processor, the median of a set can be found in O(n).
>It is hard to imagine how this might be done, but...

The median of an array of elements (or more generally, the kth-smallest
element) can definitely be found in linear time.  See Aho, Hopcroft and
Ullman (1974), page 97.

>There is inherent sequencing in many problems.  For example, it is
>extremely difficult to sort in O(log n) time using n processors.  No one
>knows how (and indeed I think it is impossible) to sort in O(1) time with
>n log n processors.

If pipelining is allowed, sorting can be done in "constant time" on a
network of size O(n * log^2(n)) (see Knuth, Vol. 3).  By this I mean that
you present n elements to the network at each time unit, and you get n
sorted elements out of the network at each time unit, with a latency of
log^2(n).  I seem to remember reading recently that someone discovered an
O(n * log(n)) sorting network that would do the same thing.  Ullman, maybe?

		Steve Vegdahl
		Computer Research Lab.
		Tektronix, Inc.
		Beaverton, Oregon
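Since radix sort keeps coming up as the standard counterexample to the
comparison bound, here is a minimal LSD radix sort on 32-bit unsigned keys,
one byte per pass (a sketch, not tuned for real use):

#include <stdio.h>
#include <string.h>

/* LSD radix sort, 256 buckets per pass, 4 passes for 32-bit keys.
 * Each pass is a stable counting sort on one byte: O(n + 256) work and
 * no key-to-key comparisons, so the decision-tree bound does not apply. */
static void radix_sort(unsigned a[], unsigned tmp[], int n)
{
    for (int shift = 0; shift < 32; shift += 8) {
        int count[257] = { 0 };
        for (int i = 0; i < n; i++)              /* histogram of this byte */
            count[((a[i] >> shift) & 0xff) + 1]++;
        for (int b = 0; b < 256; b++)            /* prefix sums = bucket starts */
            count[b + 1] += count[b];
        for (int i = 0; i < n; i++)              /* stable scatter into buckets */
            tmp[count[(a[i] >> shift) & 0xff]++] = a[i];
        memcpy(a, tmp, n * sizeof a[0]);
    }
}

int main(void)
{
    unsigned a[] = { 170, 45, 75, 90, 802, 24, 2, 66 }, tmp[8];
    radix_sort(a, tmp, 8);
    for (int i = 0; i < 8; i++)
        printf("%u ", a[i]);
    printf("\n");
    return 0;
}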
preece@ccvaxa.UUCP (11/07/85)
> No one knows how (and indeed I think it is impossible) to sort in O(1)
> time with n log n processors.
/* Written 10:17 am Nov 3, 1985 by gvcormack@watmum.UUCP in ccvaxa:net.lang */
----------
Needless to say, you can sort in O(1) time on 1 processor if you have
unbounded memory, though generating the lookup table would take a long
time...  :-)
-- 
scott preece
gould/csd - urbana
ihnp4!uiucdcs!ccvaxa!preece
jbn@wdl1.UUCP (11/08/85)
On parallelism:

     The extent to which a computation can be speeded up by doing multiple
operations in parallel is a very difficult subject; attempts to find formal
upper bounds on potential parallelism are an area in which not much is
known.

On sorting:

     The best comparison-based sorts require on the order of n log n binary
comparisons, but one can do better than n log n operations by using
distribution techniques.  There was a recent article in Computer Language
giving a sorting algorithm that beats n log n; heuristic techniques are
used to equally divide the key space so as to evenly distribute the records
into multiple buckets.  If this can be done perfectly, sorting can be done
in strictly linear time.

     Consider sorting a set of records numbered 0 to 9999 by providing
10,000 buckets and just putting each record into the right slot as it comes
along.  Linear time, right?  In practice, nearly linear performance is
achieved by the high-performance packages on mainframes, such as SyncSort;
the distribution algorithms used to do this are proprietary.

					John Nagle
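Nagle's 10,000-bucket example is just an address-calculation sort; when the
keys are known to be distinct integers in 0..9999, "sorting" is a single
scatter pass plus a sweep of the buckets.  A sketch in C under exactly
those assumptions (distinct keys, known range -- the hard part of the
proprietary packages is doing without them):

#include <stdio.h>

#define NKEYS 10000

/* Distribution sort for records with distinct keys in 0..NKEYS-1:
 * each record goes straight to slot[key].  One O(n) scatter pass plus
 * one O(NKEYS) gather pass; no comparisons between keys. */
struct record { int key; const char *payload; };

int main(void)
{
    struct record input[] = { {42, "foo"}, {7, "bar"}, {9999, "baz"}, {0, "quux"} };
    static const struct record *slot[NKEYS];     /* zero-initialized: all NULL */
    int n = sizeof input / sizeof input[0];

    for (int i = 0; i < n; i++)                  /* scatter: linear in n */
        slot[input[i].key] = &input[i];

    for (int k = 0; k < NKEYS; k++)              /* gather in key order */
        if (slot[k])
            printf("%d %s\n", slot[k]->key, slot[k]->payload);
    return 0;
}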
preece@ccvaxa.UUCP (11/08/85)
> Again, O() notation is an asymptotic measure.  This makes it quite
> explicit that you cannot have an "...amount of hardware...proportional
> to size of the problem..." unless you consider an infinite amount of
> hardware.
/* Written 1:47 am Nov 4, 1985 by rcd@opus.UUCP in ccvaxa:net.lang */
----------
Rephrasing this, the O() notation measures the relationship between the
amount of input and the amount of work that has to be done to produce the
output of the algorithm.  When you start talking about throwing multiple
processors at the problem you are talking about how long it takes you to do
that work, which is a wholly separate issue.
-- 
scott preece
gould/csd - urbana
ihnp4!uiucdcs!ccvaxa!preece
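Preece's distinction between total work and elapsed time is the one the
parallel-complexity literature makes formal.  One standard statement
(quoted from memory, not from any of the posts here): if an algorithm
performs W operations in total and its longest chain of dependent
operations has length D, then p processors can finish in time

    $$ T_p \le \frac{W}{p} + D, $$

so the work W is a property of the algorithm, while the elapsed time
depends on p and on how much of the work is inherently sequential (D).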
oleg@birtch.UUCP (Oleg Kiselev) (11/12/85)
In article <841@wdl1.UUCP> jbn@wdl1.UUCP writes:
>On sorting:
>
>	Consider sorting a set of records numbered 0 to 9999 by
>	providing 10,000 buckets and just putting each record into the
>	right slot as it comes along.  Linear time, right?  In practice,

Uh, that's what they taught us as "radix" sort, no?  And it works only on a
defined set of possible values of the key...
-- 
Disclaimer: My employers go to church every Sunday, listen to Country
music, and donate money to the GOP.  I am just a deviant.
+-------------------------------+ Don't bother, I'll find the door!
| "VIOLATORS WILL BE TOAD!"     |  Oleg Kiselev.
|  Dungeon Police               |...!{trwrb|scgvaxd}!felix!birtch!oleg
+-------------------------------+...!{ihnp4|randvax}!ucla-cs!uclapic!oac6!oleg