jma@beach.cis.ufl.edu (John 'Vlad' Adams) (11/17/90)
In article <1990Nov16.175346.4884@maverick.ksu.ksu.edu> yeewei@matt.ksu.ksu.edu (Yee-Wei Huang) writes:
>
>Amiga 2000HD
>Supra memory expansion board w 2meg 80ns ram
>GVP 3001 accel card w/o '882 and w/o 32 bit ram
>
> When I run performance tests on the machine, I show only a 20%
>increase over the stock 68000! I know that this card will perform much
>better with 32bit ram and a fpu, but it is clocked 4 times as fast as
>the stock 68000, and the 030 should perform better at the same clock
>speed.

You are lucky to see that much of a performance boost. Don't forget, the 030 fetches memory in four-byte chunks through the normal Amiga bus. Therefore, you slow down with the wait states and the double fetch. An 030 will not outperform the 68000 without 32bit fast ram.

>Another funny thing: all my system performance and monitoring
>software tells me that I have an '020 instead of an '030. Anyway, if
>some one could either give me some suggestions as to what might be
>wrong, (or worst case tell me that 20% is all you get :-( ) I would
>be very thankful!
>

There is no magic flag to tell the difference via software between an 020 and an 030. Do yourself a BIG favor. Get at least two megs of 32bit ram. Don't worry about the FPU, as it won't do anywhere near as much for you.

--
John M. Adams --**-- Professional Student on the eight-year plan! ///
Internet: jma@beach.cis.ufl.edu -or- vladimir@maple.circa.ufl.edu ///
"We'll always be together, together in electric dreams" Moroder & Oakey \\V//
Sysop of The Beachside. FIDOnet 1:3612/557. 904-492-2305 (Florida) \X/
daveh@cbmvax.commodore.com (Dave Haynie) (11/20/90)
In article <1990Nov16.175346.4884@maverick.ksu.ksu.edu> yeewei@matt.ksu.ksu.edu (Yee-Wei Huang) writes:
> I recently purchased a GVP 28mhz '030 accelerator board, and
>it is not performing up to par.

>GVP 3001 accel card w/o '882 and w/o 32 bit ram
                                  ^^^^^^^^^^^^^^-------- Great Caesar's Ghost, Superman!

> When I run performance tests on the machine, I show only a 20%
>increase over the stock 68000!

That's just about what you should expect to see, without any 32 bit RAM. Without faster and wider memory, the clock speed is going to make very little difference; the CPU will be memory bound. And, with 16 bit memory, the 68030 is less efficient than the 68000. 680x0 instructions are based on 16 bit words, yet the 68030's prefetch mechanism always grabs 32 bits at a time for any instruction fetch, even if the second 16 bits won't be used. So, on a 68030 with 16 bit RAM, a certain % of the memory cycles you run do absolutely nothing relative to a 68000.

You do get performance increases as well, thanks to full 32 bit ALUs and better microcode, but you still don't win in many cases until the caches are enabled. If you're running with the GVP standard Startup-Sequence, you will have the data cache turned on too, which should help you out a bit (the OS turns on the instruction cache). If not, copy SetCPU over from the GVP disk and run "SetCPU CACHE" somewhere in your Startup-Sequence.

>Another funny thing: all my system performance and monitoring
>software tells me that I have an '020 instead of an '030.

Under 1.3, the OS doesn't differentiate. If you run SetCPU first, and you have software that follows the OS 2.xx conventions for CPU identification, you'll find that a 68030 will be reported.

Really, the only real use for a 68030 without 32 bit RAM would be in floating point operations, which of course require an FPU. A plain '030 with no FPU and no 32 bit RAM is about the most expensive way I can think of to improve your system performance by around 20%. Add some decent 32 bit memory and you'll see more like 400%-800% improvement. That's why the A2630 doesn't come without 32 bit RAM.

--
Dave Haynie Commodore-Amiga (Amiga 3000) "The Crew That Never Rests"
   {uunet|pyramid|rutgers}!cbmvax!daveh   PLINK: hazy   BIX: hazy
   Standing on the shoulders of giants leaves me cold -REM
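The wasted-prefetch argument above can be put into a toy calculation. This is only a sketch under a stated assumption, not a measurement: it assumes every 32-bit instruction prefetch costs two cycles on the 16-bit bus, and that the second cycle is pure waste whenever the second word is never executed (for instance after a taken branch). The function name and the sample fractions are made up for illustration.

```python
def wasted_bus_cycle_fraction(unused_second_word_fraction):
    """Fraction of instruction-fetch bus cycles that do no useful work.

    Toy model (an assumption, not measured data): each 32-bit prefetch
    takes two 16-bit bus cycles, and the second cycle is wasted whenever
    the second word of the longword is never executed.
    """
    if not 0.0 <= unused_second_word_fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    cycles_per_prefetch = 2                      # 32 bits over a 16-bit bus
    wasted_per_prefetch = unused_second_word_fraction  # at most one cycle each
    return wasted_per_prefetch / cycles_per_prefetch


if __name__ == "__main__":
    # Hypothetical sample fractions, chosen only to show the shape of the curve.
    for p in (0.0, 0.25, 0.5):
        print(f"second word unused {p:.0%} of the time -> "
              f"{wasted_bus_cycle_fraction(p):.1%} of fetch cycles wasted")
```

The model ignores wait states, synchronization delays and the caches, so it only bounds the prefetch overhead itself.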
lkoop@pnet01.cts.com (Lamonte Koop) (11/20/90)
jma@beach.cis.ufl.edu (John 'Vlad' Adams) writes:
>In article <1990Nov16.175346.4884@maverick.ksu.ksu.edu> yeewei@matt.ksu.ksu.edu (Yee-Wei Huang) writes:
>>
>>Amiga 2000HD
>>Supra memory expansion board w 2meg 80ns ram
>>GVP 3001 accel card w/o '882 and w/o 32 bit ram
>>
>> When I run performance tests on the machine, I show only a 20%
>>increase over the stock 68000! I know that this card will perform much
>>better with 32bit ram and a fpu, but it is clocked 4 times as fast as
>>the stock 68000, and the 030 should perform better at the same clock
>>speed.
>
>You are lucky to see that much of a performance boost. Don't
>forget, the 030 fetches memory in four-byte chunks through
>the normal Amiga bus. Therefore, you slow down with the wait
>states and the double fetch. An 030 will not outperform the 68000
>without 32bit fast ram.

No...he is correct in that something is wrong here. Yes, the 030 does longword instruction prefetches, and running on a 16-bit bus will cause a performance hit, but with the 030 caches active, he should be seeing performance around 2-2.5x that of a stock A2000. The instruction cache should 'buffer' some of the hit taken by the operations on the 16-bit bus. Without it, you are indeed correct that the 030 would act as if crawling in molasses.

The 030 (and 020 for that matter) instruction caches are activated automatically by the OS during system startup, so I don't see where the cache should be off, so I would suspect a problem with the board. (Unless GVP is doing some odd way of hooking into the 16-bit bus that would result in only achieving this mediocre performance increase on that kind of setup, but I doubt it.)

>
>>Another funny thing: all my system performance and monitoring
>>software tells me that I have an '020 instead of an '030. Anyway, if
>>some one could either give me some suggestions as to what might be
>>wrong, (or worst case tell me that 20% is all you get :-( ) I would
>>be very thankful!
>>
>There is no magic flag to tell the difference via software between
>an 020 and an 030. Do yourself a BIG favor. Get at least two
>megs of 32bit ram. Don't worry about the FPU, as it won't
>do anywhere near as much for you.

There is indeed a 'magic flag' to tell the difference between the two processors. Unfortunately, AmigaOS 1.3 (and below) doesn't know how to use it. The 'flag' is found in the Exec base structure and can be read by referencing the ExecBase->AttnFlags field. Bits within this field are set according to the type of processor/fpu on the system. AmigaOS 1.3 will properly identify a 68020/881 combination, but IDs a 68030/882 combo simply as the former. [AmigaOS 2.0 fixes this]. For a good example of how to ID the cpu properly under v1.3 or so, Dave Haynie's SetCPU source is what I'd recommend.

Anyway, many programs that simply check this field will not be able to tell that a 68030 is in fact installed rather than a 68020. I'd recommend running Dave's SetCPU on the system with no arguments. It'll tell you what CPU is on the system (correctly...and will also set the 'magic' bits so that other programs will ID it properly as well...if they are programmed to do so). If SetCPU shows only a 68020, then I'd say there is something VERY wrong with this board.

Also, SetCPU will allow you to fiddle with the caches, and will tell you how they are currently set. It would be prudent to be sure that the Instruction cache at the least was 'on'.

--LaMonte

UUCP: {hplabs!hp-sdd ucsd nosc}!crash!pnet01!lkoop
ARPA: crash!pnet01!lkoop@nosc.mil
INET: lkoop@pnet01.cts.com
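For readers who want to see what "bits within this field" means, here is a minimal sketch that decodes an AttnFlags word. It assumes the bit positions from the Amiga exec/execbase.h header (the AFB_* definitions); the two sample values are illustrative guesses at what a 1.3 system and a 2.0 system might report for the same 68030/68882 board, not values read from real hardware. On an actual Amiga the word would come from the ExecBase structure, which this standalone sketch does not touch.

```python
# Bit positions assumed from exec/execbase.h (AFB_68010 .. AFB_68882).
ATTN_BITS = {
    0: "68010",
    1: "68020",
    2: "68030",
    4: "68881",
    5: "68882",
}


def decode_attn_flags(flags):
    """Return the names of all CPU/FPU bits set in an AttnFlags word."""
    return [name for bit, name in sorted(ATTN_BITS.items())
            if flags & (1 << bit)]


def highest_cpu(flags):
    """Return the most capable CPU flagged, or '68000' if no bit is set."""
    for bit in (2, 1, 0):
        if flags & (1 << bit):
            return ATTN_BITS[bit]
    return "68000"


if __name__ == "__main__":
    # Illustrative values only: lower bits stay set for higher processors.
    as_seen_by_1_3 = 0x13   # 68010 + 68020 + 68881
    as_seen_by_2_0 = 0x37   # adds the 68030 and 68882 bits
    for flags in (as_seen_by_1_3, as_seen_by_2_0):
        print(hex(flags), highest_cpu(flags), decode_attn_flags(flags))
```

A program that only tests the 68020 bit gives the same answer for both sample words, which is the misidentification described above.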
daveh@cbmvax.commodore.com (Dave Haynie) (11/21/90)
In article <5749@crash.cts.com> lkoop@pnet01.cts.com (Lamonte Koop) writes:
>jma@beach.cis.ufl.edu (John 'Vlad' Adams) writes:
>>In article <1990Nov16.175346.4884@maverick.ksu.ksu.edu> yeewei@matt.ksu.ksu.edu (Yee-Wei Huang) writes:

>>>GVP 3001 accel card w/o '882 and w/o 32 bit ram

>>> When I run performance tests on the machine, I show only a 20%
>>>increase over the stock 68000!

>>You are lucky to see that much of a performance boost. Don't
>>forget, the 030 fetches memory in four-byte chunks through
>>the normal Amiga bus. Therefore, you slow down with the wait
>>states and the double fetch. An 030 will not outperform the 68000
>>without 32bit fast ram.

>No...he is correct in that something is wrong here. Yes, the 030 does
>longword instruction prefetches, and running on a 16-bit bus will cause a
>performance hit, but with the 030 caches active, he should be seeing
>performance around 2-2.5x that of a stock A2000.

Absolutely wrong. While on a few select (and meaningless) benchmark programs, you may see that, in reality, 20% isn't an unreasonable number. Especially if he's got his data cache off. Try a benchmark that's at least reasonable, perhaps Dhrystone 2.1.

>The instruction cache should 'buffer' some of the hit taken by the operations
>on the 16-bit bus.

It does. Without the I-Cache, the 68030 on the 16 bit bus will run slower than the 68000 on most things, absolutely guaranteed. But it's not perfect; you still lose quite often by wasted 16 bit fetches -- the cache isn't all that effective outside of inner loops; as few as 16 instructions can totally overwrite it.

>The 030 (and 020 for that matter) instruction caches are activated
>automatically by the OS during system startup, so I don't see where the cache
>should be off, so I would suspect a problem with the board.

No problem with the board. The 20% is just about right; your mileage will, of course, vary with the benchmark selection, meaningless benchmarks need not apply. The A2630 will behave the same way if you turn off its Fast memory; that's exactly why it doesn't ship without Fast memory.

>--LaMonte

--
Dave Haynie Commodore-Amiga (Amiga 3000) "The Crew That Never Rests"
   {uunet|pyramid|rutgers}!cbmvax!daveh   PLINK: hazy   BIX: hazy
   Standing on the shoulders of giants leaves me cold -REM
lkoop@pnet01.cts.com (Lamonte Koop) (11/21/90)
daveh@cbmvax.commodore.com (Dave Haynie) writes:
>In article <1990Nov16.175346.4884@maverick.ksu.ksu.edu> yeewei@matt.ksu.ksu.edu (Yee-Wei Huang) writes:
>
>> I recently purchased a GVP 28mhz '030 accelerator board, and
>>it is not performing up to par.
>
>>GVP 3001 accel card w/o '882 and w/o 32 bit ram
>                                  ^^^^^^^^^^^^^^-------- Great Caesar's Ghost, Superman!

Quite agreed here... :^)

>
>> When I run performance tests on the machine, I show only a 20%
>>increase over the stock 68000!
>
>That's just about what you should expect to see, without any 32 bit RAM.
>Without faster and wider memory, the clock speed is going to make very
>little difference, the CPU will be memory bound. And, with 16 bit memory,
>the 68030 is less efficient than the 68000. 680x0 instructions are
>based on 16 bit words, yet the 68030's prefetch mechanism always grabs
>32 bits at a time for any instruction fetch, even if the second 16 bits
>won't be used. So, on a 68030 with 16 bit RAM, a certain % of the memory
>cycles you run do absolutely nothing relative to a 68000. You do get
>performance increases as well, thanks to full 32 bit ALUs and better
>microcode, but you still don't win in many cases until the caches are
>enabled. If you're running with the GVP standard Startup-Sequence, you
>will have the data cache turned on too, which should help you out a

Dave...this is one I need to disagree with. [Why do I see flames on the horizon? :-)] Of course running in a 16-bit environment is killing a great deal of the performance of the 68030...the extra cycles needed for the bus accesses are quite time consuming. However, the figure of '20%' over a stock 2000 with a standard 68000 I cannot agree on as being correct in any sense.

To illustrate, I turned off the 32-bit RAM (well...didn't configure it, so for all intents and purposes it's off) on my 25MHz 030 based system here. My performance figures show the system at 2.1x (or 210%) of a normal 68000-based Amiga....equivalently a 110% increase, not 20%. [This is with the Data cache OFF...with it on, the increase jumps to 200% or so]. These figures are with my term running in the background, and several other smaller programs (this is probably not much of a difference...they should all be in a 'Wait' condition anyway). Of course with 32-bit RAM enabled...I get around 700-900%.

This is rather a poor way to make my point, but that 20% figure bothers me. If it were the case that it was this bad, there would be little market for the various 68020 accelerators around which have no 32-bit memory facilities...the instruction units on the 020 and 030 are similar, and have the same basic problem on the 16-bit bus..AND the 020 does not have a nice data cache to help it out...

Applications-wise, though, I DO agree with around 20-30%, as benchmarks are far too simple to measure over the entire spectrum of operations. If this GVP board is being tested with any of the benchmarking tests I've seen though (Ronin CPUSpeed, etc...), I'd more expect a reading as I mentioned before.

>Really, the only real use for a 68030 without 32 bit RAM would be in
>floating point operations, which of course require an FPU. A plain
>'030 with no FPU and no 32 bit RAM is about the most expensive way I
>can think of to improve your system performance by around 20%. Add some
>decent 32 bit memory and you'll see more like 400%-800% improvement.
>That's why the A2630 doesn't come without 32 bit RAM.

Agreed here...without 32-bit RAM, the 030 is rather like a very expensive chicken without a head....

--LaMonte

UUCP: {hplabs!hp-sdd ucsd nosc}!crash!pnet01!lkoop
ARPA: crash!pnet01!lkoop@nosc.mil
INET: lkoop@pnet01.cts.com
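Since this thread mixes "x times as fast", "% of" and "% increase", a small conversion helper may keep the figures straight. It is a plain arithmetic sketch; the function names are made up here, and the sample numbers are the ones quoted in the posts (2.1x, and the 20% increase under dispute).

```python
def percent_increase(speed_ratio):
    """Convert 'N times as fast' into a percentage increase over baseline."""
    return (speed_ratio - 1.0) * 100.0


def speed_ratio(percent_increase_value):
    """Convert a percentage increase back into 'N times as fast'."""
    return 1.0 + percent_increase_value / 100.0


if __name__ == "__main__":
    # 2.1x the speed of a stock 68000 is a 110% increase, not 210%.
    print(f"2.1x   -> {percent_increase(2.1):.0f}% increase")
    # A 20% increase is only 1.2x the stock machine.
    print(f"20%    -> {speed_ratio(20):.1f}x")
```

On this scale the disagreement is between roughly 1.2x and 2.1x for the same hardware, which is why the choice of benchmark matters so much in the follow-ups.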
jesup@cbmvax.commodore.com (Randell Jesup) (11/21/90)
In article <5765@crash.cts.com> lkoop@pnet01.cts.com (Lamonte Koop) writes:
>accesses are quite time consuming. However, the figure of '20%' over a stock
>2000 with a standard 68000 I cannot agree on as being correct in any sense.
>To illustrate, I turned off the 32-bit RAM (well...didn't configure it, so for
>all intents and purposes it's off) on my 25MHz 030 based system here. My
>performance figures show the system at 2.1x (or 210%) of a normal 68000-based
>Amiga....equivalently a 110% increase, not 20%. [This is with the Data cache

Remember that most performance testers tend to be small loops of code, which are handled very nicely by a 256-byte I-cache. This causes them to overstate the application-level performance of a system.

--
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.commodore.com BIX: rjesup
Thus spake the Master Ninjei: "If your application does not run correctly,
do not blame the operating system." (From "The Zen of Programming") ;-)
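The small-loop effect can be sketched with a toy cache simulation. This is a simplified model under stated assumptions, not a description of the real 68030: it treats the 256-byte instruction cache as 64 direct-mapped longword entries, assumes purely sequential code that loops back to its start, and ignores line structure, branches within the loop and data accesses. The loop sizes are hypothetical.

```python
def icache_hit_rate(loop_bytes, iterations, cache_bytes=256, entry_bytes=4):
    """Hit rate of a toy direct-mapped instruction cache for a simple loop.

    Assumptions (simplifications, not 68030 specifics): one tag per
    longword entry, sequential execution from address 0, then loop back.
    """
    n_entries = cache_bytes // entry_bytes
    tags = [None] * n_entries
    hits = 0
    total = 0
    for _ in range(iterations):
        for addr in range(0, loop_bytes, entry_bytes):
            block = addr // entry_bytes
            index = block % n_entries
            if tags[index] == block:
                hits += 1
            else:
                tags[index] = block   # miss: fetch over the bus and fill
            total += 1
    return hits / total


if __name__ == "__main__":
    # Hypothetical sizes: a benchmark-style inner loop vs. application-sized code.
    for size in (128, 256, 1024):
        rate = icache_hit_rate(size, iterations=100)
        print(f"{size:5d}-byte loop: {rate:.0%} instruction cache hits")
```

In this model a loop that fits in the cache runs almost entirely from it, while a 1024-byte loop never hits at all, so every instruction fetch goes back out over the slow bus.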
daveh@cbmvax.commodore.com (Dave Haynie) (11/21/90)
In article <5765@crash.cts.com> lkoop@pnet01.cts.com (Lamonte Koop) writes:
>daveh@cbmvax.commodore.com (Dave Haynie) writes:
>>In article <1990Nov16.175346.4884@maverick.ksu.ksu.edu> yeewei@matt.ksu.ksu.edu (Yee-Wei Huang) writes:

>>> When I run performance tests on the machine, I show only a 20%
>>>increase over the stock 68000!

>>That's just about what you should expect to see, without any 32 bit RAM.

>Dave...this is one I need to disagree with. [Why do I see flames on the
>horizon? :-)]

No flames, just possible misconceptions...

>Of course running in a 16-bit environment is killing a great deal of the
>performance of the 68030...the extra cycles needed for the bus accesses are
>quite time consuming.

The extra cycles simply explain why, without the caches, the 68030 will be slower. You have to add to that the fact that the memory you're accessing is less than 1/2 the speed of normal 68030 32-bit memory. For example, the basic A2000 or Zorro II memory cycle is 560ns. The A2630 and A3000 memory systems (without burst enabled) cycle in 200ns. Then you have to consider synchronization delays -- the 68030 has to sync up to 16 bit memory, since they're based on different clocks, so you end up adding extra wait states for most of your memory accesses.

>However, the figure of '20%' over a stock 2000 with a standard 68000 I cannot
>agree on as being correct in any sense.

For real system performance, it is. You can construct artificial benchmark conditions that make the '030 performance appear better.

>If it were the case that it was this bad, there would be little market for the
>various 68020 accelerators around which have no 32-bit memory facilities...

Don't count on it. Those little accelerators sell because people THINK they'll go much faster with them. Market perception, above all else, is what sells. Why do you think people bought Apple IIs for 3-4 times the price of C64s, when the C64 actually ran a hair faster? The only real significant speedup you get with these kinds of '020 boards is via the math coprocessor.

>Applications-wise, though, I DO agree with around 20-30% though, as benchmarks
>are far too simple to measure over the entire spectrum of operations.

That's what I'm talking about here. Benchmarks that don't track the behavior of applications should be erased from all your disks. They are totally meaningless. So anything that's telling you your 32-bit-memory-less 68030 is going 200% faster than a plain 68000 is measuring an artificial condition. It is, in effect, lying to you, since you cannot do any useful work and get anywhere near that performance level.

>If this GVP board is being tested with any of the benchmarking tests I've seen
>though (Ronin CPUSpeed, etc...), I'd more expect a reading as I mentioned
>before.

First thing you do, erase any copies of the Ronin CPUSpeed test you have around. That test basically measures CPU register speed, and as a side effect has a component that's based on memory speed. It tells you absolutely nothing about system performance. It is totally useless. It will tell you that a 50MHz 68030 system running out of slow 16 bit memory is faster than a 16MHz 68030 system running out of nearly 0 wait state 32 bit memory, which is totally wrong.

While I can't recommend any really trustworthy benchmarks, Dhrystone 2.x is a decent one. Though, on the same hardware, you can easily get a 2:1 range of results depending on the compiler you use, sometimes even between different options on the same compiler. So simply quoting Dhrystones tends to be rather meaningless without also including the compiler information, unless you're simply going for bragging rights against a Mac or IBM.

>--LaMonte

--
Dave Haynie Commodore-Amiga (Amiga 3000) "The Crew That Never Rests"
   {uunet|pyramid|rutgers}!cbmvax!daveh   PLINK: hazy   BIX: hazy
   Standing on the shoulders of giants leaves me cold -REM
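The 560ns and 200ns cycle times quoted above translate into peak bandwidth as follows. This is a back-of-the-envelope sketch: it assumes one full-width transfer per memory cycle and ignores the synchronization delays, wait states and burst mode mentioned in the post, so real figures would differ. The function name is made up for illustration.

```python
def peak_bandwidth_mb_s(bus_bits, cycle_ns):
    """Peak transfer rate in MB/s (1 MB = 10**6 bytes), one transfer per cycle."""
    bytes_per_cycle = bus_bits / 8
    seconds_per_cycle = cycle_ns * 1e-9
    return bytes_per_cycle / seconds_per_cycle / 1e6


if __name__ == "__main__":
    zorro2 = peak_bandwidth_mb_s(bus_bits=16, cycle_ns=560)   # A2000 / Zorro II
    local32 = peak_bandwidth_mb_s(bus_bits=32, cycle_ns=200)  # A2630 / A3000, no burst
    print(f"16-bit, 560 ns cycle: {zorro2:5.2f} MB/s")
    print(f"32-bit, 200 ns cycle: {local32:5.2f} MB/s")
    print(f"ratio: {local32 / zorro2:.1f}x")
```

Twice the width and 2.8 times the cycle rate give about 5.6 times the peak bandwidth, before any of the other penalties are counted.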
jeh@sisd.kodak.com (Ed Hanway) (11/21/90)
lkoop@pnet01.cts.com (Lamonte Koop) writes:
> If it
>were the case that it was this bad, there would be little market for the
>various 68020 accelerators around which have no 32-bit memory facilities...

Hey, who said that every configuration had to make sense? If that were true, there wouldn't be any el cheapo .41mm dot pitch "VGA" monitors. :-)

Seriously, there are two reasons that I can think of for _selling_ a 32-bit accelerator w/o 32-bit memory:

1) It allows one to use an FPU.
2) It makes your base price lower than your competition's, even if the base configuration isn't really usable.

Unfortunately, only the first is reason enough to _buy_ one.

--
Ed Hanway --- uunet!sisd!jeh
This message is packed as full as practicable by modern automated equipment.
Contents may settle during shipment.
lkoop@pnet01.cts.com (Lamonte Koop) (11/22/90)
daveh@cbmvax.commodore.com (Dave Haynie) writes:
>>If this GVP board is being tested with any of the benchmarking tests I've seen
>>though (Ronin CPUSpeed, etc...), I'd more expect a reading as I mentioned
>>before.
>
>First thing you do, erase any copies of the Ronin CPUSpeed test you have
>around. That test basically measures CPU register speed, and as a side effect
>has a component that's based on memory speed. It tells you absolutely nothing
>about system performance. It is totally useless. It will tell you that a
>50MHz 68030 system running out of slow 16 bit memory is faster than a 16MHz
>68030 system running out of nearly 0 wait state 32 bit memory, which is totally
>wrong.

I totally agree with you here...but what I was trying to point out earlier is that IF the individual in question here WAS using that (or any similar) program, and only getting 20%, then something is wrong. My own benchmarking program shows interesting things in this respect. [AIBB]. The Dhrystone on AIBB shows a 40% increase over an A2000 (68000 based) WITH both 030 caches ON, and about 15-20% at best with only the instruction cache on. [I haven't the heart to turn the instruction cache off ;-)]

AIBB also has a few of those 'meaningless' benchmarks incorporated...and they show higher to a degree...interesting what you can fiddle with here. [Anyone else reading this, TOSS AIBB v1.0! The A3000 figures are WRONG...v2.01 is latest, and is correct...to whatever degree a benchmark can be correct].

>
>While I can't recommend any really trustworthy benchmarks, Dhrystone 2.x is a
>decent one. Though, on the same hardware, you can easily get a 2:1 range of
>results depending on the compiler you use, sometimes even between different
>options on the same compiler. So simply quoting Dhrystones tends to be rather
>meaningless without also including the compiler information, unless you're
>simply going for bragging rights against a Mac or IBM.

Yes...different compilers definitely show differences...and certainly what you get depends upon your compile options. For benchmarks, many compilers have a tendency to optimize the benchmark code away, making them entirely useless. In those cases, the benchmark no longer tests machine performance, but compiler optimization efficiency.

--LaMonte

UUCP: {hplabs!hp-sdd ucsd nosc}!crash!pnet01!lkoop
ARPA: crash!pnet01!lkoop@nosc.mil
INET: lkoop@pnet01.cts.com
daveh@cbmvax.commodore.com (Dave Haynie) (11/27/90)
In article <5801@crash.cts.com> lkoop@pnet01.cts.com (Lamonte Koop) writes:
>daveh@cbmvax.commodore.com (Dave Haynie) writes:

>>First thing you do, erase any copies of the Ronin CPUSpeed test you have
>>around.

>I totally agree with you here...but what I was trying to point out earlier is
>that IF the individual in question here WAS using that (or any similar)
>program, and only getting 20%, then something is wrong. My own benchmarking
>program shows interesting things in this respect. [AIBB]. The Dhrystone on
>AIBB shows a 40% increase over an A2000 (68000 based) WITH both 030 caches ON,
>and about 15-20% at best with only the instruction cache on. [I haven't the
>heart to turn the instruction cache off ;-)]

That's good; just about what I would expect.

>AIBB also has a few of those 'meaningless' benchmarks incorporated...and
>they show higher to a degree...interesting what you can fiddle with here.

I actually like the idea of what you seem to be doing in AIBB, though I haven't seen the program. Many moons ago, the now defunct Amiga Sentry enlisted my help in benchmarking some competing 680x0 Coprocessor boards for the A2000. It became clear to me that we needed a decent benchmark suite to compare similar Amigas.

I came up with a simple series which basically ran three sets of benchmarks. The first set was traditional integer based artificial benchmarks, the second was traditional floating point based artificial benchmarks. The third was perhaps the most useful, as it attempted to measure real-world performance. It used the DBRender program, first as a test of compiler performance, by building the actual program; next as a test of application performance by running the compiled program. I had a clever script program which ran each benchmark, including FFP, IEEE, and '882 versions of the floating point benchmarks (the last only ran if you had an '882) and attempted to produce an automatic text file report of the results.

I have never had the time to perfect this, but I did see a need for it. If you read five different Coprocessor board reviews, you'll find five different sets of criteria used to judge the boards. Some standard, any standard, would be more useful than the typical chaos.

>Yes...different compilers definitely show differences...and certainly what you
>get depends upon your compile options. For benchmarks, many compilers have a
>tendency to optimize the benchmark code away, making them entirely useless.
>In those cases, the benchmark no longer test machine performance, but compiler
>optimization efficiency.

Of course, one of the enhancements in Dhrystone 2.x was to at least use the results of the benchmarks. In 1.1, a sufficiently advanced optimizer could avoid the whole benchmark by recognizing that the computed result was never used. I don't think any Amiga compilers actually do this, but you never know.

>--LaMonte

--
Dave Haynie Commodore-Amiga (Amiga 3000) "The Crew That Never Rests"
   {uunet|pyramid|rutgers}!cbmvax!daveh   PLINK: hazy   BIX: hazy
   ONLY 230 MILES TO GO
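The kind of script described above (run each benchmark, skip the '882 variant when no '882 is present, write a text report) might look something like the following. This is a hypothetical sketch, not the original script: the benchmark names, the tuple layout and the `run_suite` function are all invented for illustration. Each report line also prints the benchmark's computed result, in the spirit of the Dhrystone 2.x change of actually using the results.

```python
import time


def run_suite(benchmarks, have_fpu, timer=time.perf_counter):
    """Run (name, func, needs_fpu) benchmarks and return text report lines.

    Benchmarks flagged needs_fpu are skipped when have_fpu is False.
    The computed result is printed so the work is visibly used.
    """
    lines = []
    for name, func, needs_fpu in benchmarks:
        if needs_fpu and not have_fpu:
            lines.append(f"{name:<20} skipped (no FPU)")
            continue
        start = timer()
        result = func()
        elapsed = timer() - start
        lines.append(f"{name:<20} {elapsed:10.6f} s  result={result}")
    return lines


if __name__ == "__main__":
    # Placeholder workloads standing in for real integer and float tests.
    suite = [
        ("integer loop", lambda: sum(range(100000)), False),
        ("software float", lambda: sum(i * 0.5 for i in range(100000)), False),
        ("'882 float", lambda: sum(i * 0.5 for i in range(100000)), True),
    ]
    for line in run_suite(suite, have_fpu=False):
        print(line)
```

Passing the timer in as a parameter keeps the report format testable without depending on how long the workloads take on a given machine.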