richard@pegasus.com (Richard Foulk) (07/28/90)
I'm trying to assess the performance of the various video boards and X servers. Here's a very simple benchmark that zeros in on one very important aspect -- scrolling. I'd appreciate it if those running X could try this out and report their results.

To run the benchmark start up an xterm like this:

    xterm -geometry 80x24 -fn 8x13 +j &

then put the following awk script in a file called x-test:

-----------------------------cut here----------------------------
BEGIN {
    for (i = 0; i < 1000; i++) {
        printf("xxxxxxxxxxxxx %d\n", i);
    }
    exit;
}
-----------------------------cut here----------------------------

Run it, from the above mentioned xterm, like this:

    time awk -f x-test

and report the real time results.

One data point:

    Unix:        ISC 2.0.2 Unix
    X server:    ISC X11R3 1.0.0, Xhrc
    Resolution:  720x348x2
    Video card:  Hercules monochrome graphics clone
    Cpu:         33MHz 386 (Mylex), 128k cache, 8megs RAM
    80387:       no

    Time:        1:43

If you're running X on a 386 or 486 box please try this and post your results.

Thanks.

--
Richard Foulk        richard@pegasus.com
caf@omen.UUCP (WA7KGX) (07/29/90)
:One data point:
:
:    Unix:        ISC 2.0.2 Unix
:    X server:    ISC X11R3 1.0.0, Xhrc
:    Resolution:  720x348x2
:    Video card:  Hercules monochrome graphics clone
:    Cpu:         33MHz 386 (Mylex), 128k cache, 8megs RAM
:    80387:       no
:
:    Time:        1:43
:

    Unix:  SCO 3.2 / ODT
    Cpu:   33 MHz Micronics, 64k cache, 16 MB, no 387

    Time:  $33 Herc clone: 1:36
           Microfield T8:  0:50
misko@abhg.UUCP (William Miskovetz) (07/30/90)
In article <48@omen.UUCP>, caf@omen.UUCP (WA7KGX) writes:
> :One data point:
> :
> :    Unix:        ISC 2.0.2 Unix
> :    X server:    ISC X11R3 1.0.0, Xhrc
> :    Resolution:  720x348x2
> :    Video card:  Hercules monochrome graphics clone
> :    Cpu:         33MHz 386 (Mylex), 128k cache, 8megs RAM
> :    80387:       no
> :
> :    Time:        1:43
> :
>     Unix:  SCO 3.2 / ODT
>     Cpu:   33 MHz Micronics, 64k cache, 16 MB, no 387
>
>     Time:  $33 Herc clone: 1:36
>            Microfield T8:  0:50

    UNIX:        ISC 2.2
    X server:    ISC X11 V 1.1, Xgp
    Resolution:  1024x768x256
    Video card:  Paradise 8514/A
    CPU:         20 MHz Compaq 386, 9MB RAM
    80387:       Yes

    Time:        0:29

Bill Miskovetz
{uunet!lll-winken, apple!mathworks}!abhg!misko
misko@mathworks.com
abhg!misko@lll-winken.llnl.gov
rick@pcrat.uucp (Rick Richardson) (07/30/90)
In article <1990Jul28.014025.17578@pegasus.com> richard@pegasus.com (Richard Foulk) writes:
>One data point:
>
>    Unix:        ISC 2.0.2 Unix
>    X server:    ISC X11R3 1.0.0, Xhrc
>    Resolution:  720x348x2
>    Video card:  Hercules monochrome graphics clone
>    Cpu:         33MHz 386 (Mylex), 128k cache, 8megs RAM
>    80387:       no
>
>    Time:        1:43

Here's another point, for the slowest 386 we've got:

    Unix:        ISC 2.0.2 Unix
    X server:    ISC X11R3 1.0.0, Xvga
    Resolution:  640x480x16
    Video card:  Paradise EGA-480
    Cpu:         16MHz 386 (Mylex), 64k cache, 4MB 32 bit + 4MB 16 bit
    80387:       no
    80287:       yes

    Time:        1:39

This is rather strange. The elapsed times are so close, even though there is at least a factor of two difference in the raw performance of the machines. I'd guess one or both of the following are true:

    1) The herc server has had zero tuning
    2) The test saturates the bandwidth to the herc card

-Rick

--
Rick Richardson  | Looking for FAX software for UNIX/386 ??? Ask About:  |Mention
PC Research,Inc. | FaxiX - UNIX Facsimile System (tm)                    |FAX# for
uunet!pcrat!rick | FaxJet - HP LJ PCL to FAX (Send WP,Word,Pagemaker...) |Sample
(201) 389-8963   | JetRoff - troff postprocessor for HP LaserJet and FAX |Output
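The comparison Rick makes above can be put in numbers. The following is only a sketch: the helper name to_seconds is made up for illustration, and the clock speeds are used as a rough stand-in for "raw performance", which is an assumption (the two machines also differ in video card and X server).

```python
def to_seconds(stamp):
    """Convert a posted time such as '1:43' or '0:13.31s' to seconds."""
    minutes, _, seconds = stamp.rpartition(":")
    whole_minutes = int(minutes) if minutes else 0
    return whole_minutes * 60 + float(seconds.rstrip("s"))


if __name__ == "__main__":
    herc_33mhz = to_seconds("1:43")   # 33MHz 386, Hercules clone, Xhrc
    ega_16mhz = to_seconds("1:39")    # 16MHz 386, Paradise EGA-480, Xvga

    # Clock ratio is about 2:1, yet the elapsed times differ by only a few percent.
    print("clock ratio: %.2f" % (33 / 16))
    print("time ratio:  %.2f" % (herc_33mhz / ega_16mhz))
```

A time ratio near 1 alongside a clock ratio near 2 is what points at something other than the CPU as the limiting factor.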
paul@dialogic.uucp (The Imaginative Moron aka Joey Pheromone) (07/31/90)
    Unix:        ISC 2.0.2 Unix
    X server:    ISC X11R3 1.1, Xwge
    Resolution:  1600x1024x2
    Video card:  Bell Tech Blit (Workstation Graphics Engine)
    Cpu:         25MHz 386 AST Premium 386/25, no cache, 8megs RAM
    80387:       no

    Time:        29.4 seconds

--
Paul Bennett         |                        | "I give in, to sin, because
Dialogic Corp.       | paul@dialogic.UUCP     |  You have to make this life
300 Littleton Road   | ..!uunet!dialogic!paul |  livable"
Parsippany, NJ 07054 |                        |                 Martin Gore
talvola@janus.Berkeley.EDU (Erik Talvola) (07/31/90)
In article <243@abhg.UUCP> misko@abhg.UUCP (William Miskovetz) writes:
> In article <48@omen.UUCP>, caf@omen.UUCP (WA7KGX) writes:
> > :One data point:
> > :
> > :    Unix:        ISC 2.0.2 Unix
> > :    X server:    ISC X11R3 1.0.0, Xhrc
> > :    Resolution:  720x348x2
> > :    Video card:  Hercules monochrome graphics clone
> > :    Cpu:         33MHz 386 (Mylex), 128k cache, 8megs RAM
> > :    80387:       no
> > :
> > :    Time:        1:43
> > :
> >     Unix:  SCO 3.2 / ODT
> >     Cpu:   33 MHz Micronics, 64k cache, 16 MB, no 387
> >
> >     Time:  $33 Herc clone: 1:36
> >            Microfield T8:  0:50
> >
>
>     UNIX:        ISC 2.2
>     X server:    ISC X11 V 1.1, Xgp
>     Resolution:  1024x768x256
>     Video card:  Paradise 8514/A
>     CPU:         20 MHz Compaq 386, 9MB RAM
>     80387:       Yes
>
>     Time:        0:29

Just for comparisons:

    UNIX:        SunOS 3.5
    X server:    MIT X11R4
    Resolution:  1152x900x2 (black and white)
    CPU:         Sun 3/50 (16 MHz 68020 w/ 68881)

    Time:        0:31

Looks like the Hercules is just overly slow - probably because nobody has done any work on it. The Sun should be significantly slower than the 33 MHz 386 machines - no graphics hardware in a Sun 3/50 either.

--
+----------------------------+
! Erik Talvola               | "It's just what we need... a colossal negative
! talvola@janus.berkeley.edu | space wedgie of great power coming right at us
! ...!ucbvax!janus!talvola   | at warp speed." -- Star Drek
johnl@esegue.segue.boston.ma.us (John R. Levine) (07/31/90)
In article <1990Jul30.020330.6291@pcrat.uucp> rick@pcrat.UUCP (Rick Richardson) writes:
>>    Cpu:    33MHz 386 (Mylex), 128k cache, 8megs RAM
>>    80387:  no
>>    Time:   1:43
>     Cpu:    16MHz 386 (Mylex), 64k cache, 4MB 32 bit + 4MB 16 bit
>     80287:  yes
>     Time:   1:39
>
>This is rather strange. The elapsed times are so close, ...

The X11R3 does a lot of floating point arithmetic, so the 287 makes a lot of difference. I gather that the X11R4 sample server has been rewritten to do as much as possible as integer. It would be interesting to see some relative figures on the R4 server.

--
John R. Levine, Segue Software, POB 349, Cambridge MA 02238, +1 617 864 9650
johnl@esegue.segue.boston.ma.us, {ima|lotus|spdcc}!esegue!johnl
Marlon Brando and Doris Day were born on the same day.
scottw@ico.isc.com (Scott Wiesner) (07/31/90)
>> > :    X server:    ISC X11R3 1.0.0, Xhrc
>
> Looks like the Hercules is just overly slow - probably because nobody
> has done any work on it. The Sun should be significantly slower than
> the 33 MHz 386 machines - no graphics hardware in a Sun 3/50 either.

While it's true ISC's Hercules server probably hasn't gotten the attention the VGA server has, I doubt it could be made too much faster. There were some improvements made to text output since the 1.0 release, but scrolling won't change much. The basic problem here is that we're dealing with a slow 8-bit DRAM device running over a slow bus. I have heard that on a VGA with a 16 MHz 386, you're stuck with 20 or more wait states. I've found that accessing video memory can take more than 6 times as long as accessing normal memory.

Welcome to the wonderful world of the IBM PC.

Scott Wiesner
Interactive Systems
X Development Group
scottw@ico.isc.com (Scott Wiesner) (07/31/90)
> The X11R3 does a lot of floating point arithmetic, so the 287 makes a lot
> of difference. I gather that the X11R4 sample server has been rewritten
> to do as much as possible as integer. It would be interesting to see some
> relative figures on the R4 server.

This is a common misconception that I wish would die. Arcs and wide lines in X11R3 use a lot of floating point. There's no floating point in text or CopyArea, so that's not affecting the speed of this scrolling test. The bottleneck here is mainly the Hercules board.

Also, the comparison given was between a version of ISC's X that's over a year old and SCO's X, which is newer. Newer releases of ISC's X are somewhat faster. The newest (1.2) release integrates the X11R4 arc and wide line code to speed up those operations substantially.

Scott Wiesner
Interactive Systems
X Development Group
richard@pegasus.com (Richard Foulk) (08/01/90)
Please keep those benchmark numbers coming in!

I'll post a summary soon.

Thanks.

--
Richard Foulk        richard@pegasus.com
wtm@uhura.neoucom.EDU (Bill Mayhew) (08/01/90)
I have a very old 16 MHz IBM model 80-071 with motherboard VGA. I was quite surprised to discover that there are 25 (!!!) wait states required to access the VGA RAM. Not exactly the sort of machine that makes one want to have an X windows display.

==Bill==

--
Bill Mayhew
Northeastern Ohio Universities College of Medicine
Rootstown, OH 44272-9995 USA
phone: 216-325-2511
wtm@uhura.neoucom.edu
....!uunet!aablue!neoucom!wtm
via internet: (140.220.001.001)
gary@mic.UUCP (Gary Lewin) (08/02/90)
These figures may be of interest:

    Unix:        ISC 2.0.2 Unix
    X server:    ISC X11R3 1.2, Xvga
    Resolution:  1024x768x256
    Video card:  Micro-Labs, Ultimate VGA
    Cpu:         25MHz 386, no cache, 16 MB
    80387:       yes

    Time:        0:13.31s

Nothing like a little variety to perk things up. The X11 is a beta of 1.2 which supports a number of 1024x768x256 color cards. The Micro-Labs card is excellent and is under review by ISC for adding to the approved list.

Gary Lewin
gary@mic.lonestar.org
brando@uicsl.csl.uiuc.edu (08/08/90)
Just for comparisons (again):

    UNIX:        SunOS 4.1
    X server:    MIT X11R4
    Resolution:  1152x900x2 (black and white)
    CPU:         SPARCstation 1 Generic

    Time:        0:13.1
plocher@sally.Sun.COM (John Plocher) (08/08/90)
+-- richard@pegasus.com (Richard Foulk) writes:
|     xterm -geometry 80x24 -fn 8x13 +j &
|
| then put the following awk script in a file called x-test:
| -----------------------------cut here----------------------------
| BEGIN {
|     for (i = 0; i < 1000; i++) {
|         printf("xxxxxxxxxxxxx %d\n", i);
|     }
|     exit;
| }
| -----------------------------cut here----------------------------
| Run it, from the above mentioned xterm, like this:
|
|     time awk -f x-test
|
| and report the real time results.
+--

These test times can be reduced by 50% or more by replacing the

    time awk -f x-test

with

    awk -f x-test > /tmp/x
    time cat /tmp/x

This implies that you are measuring as much "awk" time as you are "scrolling". In fact, awk is a known abuser of FP, as reflected by other comments about this benchmark.

FYI, on a Sun SS1+GX (1152x900x256), the test takes about 13 seconds.

-John
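The idea of timing text generation and display separately can be sketched in one script. This is a rough illustration only, not what anyone in the thread ran: the function names benchmark_lines and timed_write are made up, and it assumes the display cost is dominated by writing to the terminal on standard output.

```python
import sys
import time


def benchmark_lines(count=1000):
    """Build the same lines the awk script prints: 13 x's, a space, the index."""
    return ["xxxxxxxxxxxxx %d\n" % i for i in range(count)]


def timed_write(lines, stream):
    """Write pre-built lines to stream; return elapsed wall-clock seconds."""
    start = time.perf_counter()
    for line in lines:
        stream.write(line)
    stream.flush()
    return time.perf_counter() - start


if __name__ == "__main__":
    start = time.perf_counter()
    lines = benchmark_lines()
    generate_seconds = time.perf_counter() - start

    display_seconds = timed_write(lines, sys.stdout)

    # The report is written only after both timings are finished.
    sys.stderr.write(
        "generate: %.3fs  display: %.3fs\n" % (generate_seconds, display_seconds)
    )
```

Run from the xterm under test, the "display" figure corresponds to the `time cat /tmp/x` measurement and the "generate" figure to the cost of producing the text in the first place.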
roell@lan.informatik.tu-muenchen.dbp.de (Thomas Roell) (08/09/90)
First some test results:

    UNIX:        ISC 2.0.2
    X server:    X386 (X11R4) internal test version
    Video card:  VGA GENOA 5400
    CPU:         33 MHz, 32k cache, 8MB (PizzaMan's Special)
    80387:       no

a)  Resolution:  864x606x2
    Time:        0:59

b)  Resolution:  800x600x256
    Time:        3:40

Now the interpretation of the results:

The generic at386 boxes are generally slow for I/O tasks. The normal I/O bus speed is between 8 MHz and 10 MHz, so whether scrolling is fast does not depend on the CPU speed. The only factor for speed is the access time of the VGA card. Before making any assumptions about the speed of a particular VGA, you should note that the CPU can access the VGA's memory about once every 5 (five) VGA cycles. One VGA cycle is dotclock/8.

In case a) I used a 44.9 MHz dotclock, which means I could achieve a throughput of 1.06 MBytes/sec. The window we scrolled was 191360 pixels big (80x8x23x13); we scrolled 1000-24 times. This means we got a throughput of about 0.75 MBytes/sec. OK, let's work on this monochrome scrolling some time.

Now case b): here I used a 39 MHz dotclock. You should note that the GENOA (i.e. the Tseng ET3000 chip) allows us to access the VGA's memory wordwise. Now our maximal throughput is 1.86 MBytes/sec, and we got (via this test) about 1.62 MBytes/sec. In summary, I could say: "good job done, scrolling is very close to the maximum throughput of your graphics device". (Other tests showed me that the scroll routine gets about 92% of the maximal possible throughput in 256 color mode, if only scrolling is tested.)

Now let's interpret the test itself. (Since I didn't use prof(1) for this test, all numbers below are estimates.) About 80% of the test is spent in the scroll routine, 10% in a fill and 10% in a glyph painting routine. Was this intended?? All that is tested here is the speed of the graphics device. Nothing more!! This test does NOT simulate the normal case (the option +j disables 'jump-scrolling'!!). That means what you tested here is NOT the time necessary to scroll 1000 lines under normal conditions.

My second main point of criticism is that you tested only a very small (and in my opinion not so important) facet of an X server (about 2000 bytes out of 560000 bytes of code!!). It depends upon your application which parts of the X server are the most important for your job. For simple text applications (which should be benchmarked here) the mixture should be 50% glyph paint, 30% fill solid and 20% scrolling (those were the weights I got via prof(1) when I did 'ls -RCs /' in an 80x44 window about 100 times).

Here are the results when you start up the xterm with jump-scroll enabled, which some people seem to have done to get better numbers for their graphics boards (like Gary Lewin, with the Micro-Labs Ultimate VGA **):

a)  Resolution:  864x606x2
    Time:        0:07.28s

b)  Resolution:  800x600x256
    Time:        0:12.19s

But no criticism without saying what could be done better. As I posted some days ago, you should use a special benchmarking utility. X11PERF is such a utility. It can be found on the X11R4 tape (contrib/demos/x11perf). The port to X11R3 should be simple. Running all the tests will take some hours (4 to 5). But then you will have results that say more than the above test does.

- Thomas

PS: If you use the above test, move your cursor off the xterm in which you are doing the test. Guess why! The original author forgot to say this, or he didn't know the importance of this trick.

** (note)
Gary Lewin told us his VGA did the test in 13.31s. That means his VGA has a throughput of 80x8x23x13x(1000-24)x2/13.31s = 26.76 MBytes/sec. (I did not use the fact that the test spends only 80% in scrolling, so the more realistic throughput has to be around 33 MBytes/sec.) If his graphics board allows 16-bit (ISA bus) access, his CPU MUST have an I/O speed of 13.5 MHz (assuming the VGA allows zero wait state access, and the CPU can access video memory in parallel with the VGA's display unit; if you take 33 MB/sec it will be 17 MHz I/O speed). This is faster than every board (CPU & VGA combination) I ever heard of!! Please, Gary, post realistic results, or tell us how your VGA works (EISA bus, VRAMs, graphics processor and so on), and where to get it.

--
_______________________________________________________________________________
Mail:   Thomas Roell (c/o Daniel Hernandez)
        Inst. f. Informatik / Technische Universitaet M"unchen
        Arcisstr. 21 / 8000 Munich 2 / Fed.Rep. of Germany
E-Mail (domain):          roell@lan.informatik.tu-muenchen.dbp.de
UUCP (when above fails):  roell@tumult.{uucp | informatik.tu-muenchen.de}
-------------------------------------------------------------------------------
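The throughput figures in Thomas's post can be rechecked with a short calculation. This is a sketch under stated assumptions, not his actual working: it assumes "MBytes" means 2**20 bytes, that the factor of 2 in his footnote counts one read plus one write per byte moved, and that monochrome uses 1 bit per pixel while 256 colors use 8. All function and constant names are made up for illustration.

```python
MIB = 1024 * 1024            # assumption: "MBytes" in the post means 2**20 bytes

COLS, ROWS = 80, 24          # xterm -geometry 80x24
CELL_W, CELL_H = 8, 13       # -fn 8x13
LINES_PRINTED = 1000


def scrolled_pixels():
    """Pixels moved by one single-line scroll: every text row but one."""
    return COLS * CELL_W * (ROWS - 1) * CELL_H


def scroll_throughput(seconds, bits_per_pixel):
    """Observed MBytes/sec, counting each moved byte as one read plus one write."""
    scrolls = LINES_PRINTED - ROWS
    bytes_per_scroll = scrolled_pixels() * bits_per_pixel / 8
    return bytes_per_scroll * scrolls * 2 / seconds / MIB


def peak_throughput(dotclock_hz, bytes_per_access):
    """Upper bound: one access every 5 VGA cycles, one VGA cycle = dotclock/8."""
    return dotclock_hz / 8 / 5 * bytes_per_access / MIB


if __name__ == "__main__":
    print("pixels per scroll:     %d" % scrolled_pixels())
    print("a) observed, 0:59 mono: %.2f" % scroll_throughput(59, 1))
    print("b) observed, 3:40 8bpp: %.2f" % scroll_throughput(220, 8))
    print("b) peak, 39 MHz, word:  %.2f" % peak_throughput(39e6, 2))
    print("Lewin, 13.31s, 8bpp:    %.2f" % scroll_throughput(13.31, 8))
```

Under these assumptions the observed figures for both cases, the peak for case b) and the 26.76 MBytes/sec in the footnote all come out as posted. The peak for case a) comes out near 1.07 instead of the posted 1.06, so the exact constants or rounding used there may have differed slightly.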
richard@pegasus.com (Richard Foulk) (08/11/90)
In article <3858@tuminfo1.lan.informatik.tu-muenchen.dbp.de> roell@lan.informatik.tu-muenchen.dbp.de (Thomas Roell) writes:
>First some test results:
>
>    UNIX:        ISC 2.0.2
>    X server:    X386 (X11R4) internal test version
>    Video card:  VGA GENOA 5400
>    CPU:         33 MHz, 32k cache, 8MB (PizzaMan's Special)
>    80387:       no
>
>a)  Resolution:  864x606x2
>    Time:        0:59
>
>b)  Resolution:  800x600x256
>    Time:        3:40
>

Thanks for the data.

> [...]
>Now let's interpret the test itself. (Since I didn't use prof(1) for this
>test, all numbers below are estimates.) About 80% of the test is spent in the
>scroll routine, 10% in a fill and 10% in a glyph painting routine. Was this
>intended??

Yes.

> [...]
>But no criticism without saying what could be done better. As I posted some
>days ago, you should use a special benchmarking utility. X11PERF is such a
>utility. It can be found on the X11R4 tape (contrib/demos/x11perf). The port
>to X11R3 should be simple. Running all the tests will take some hours (4 to
>5). But then you will have results that say more than the above test does.

That tests things that I'm not concerned with. But more importantly it's not a test that you're likely to get many people to run -- it's just too much trouble. (I've only gotten 18 responses to my simple benchmark so far. Thanks very much to those who've taken the time to run it and send their results!)

>
>Gary Lewin told us his VGA did the test in 13.31s. That means his VGA has a
>throughput of 80x8x23x13x(1000-24)x2/13.31s = 26.76 MBytes/sec. (I did not
>use the fact that the test spends only 80% in scrolling, so the more realistic
>throughput has to be around 33 MBytes/sec.) If his graphics board allows 16-bit
>(ISA bus) access, his CPU MUST have an I/O speed of 13.5 MHz (assuming
>the VGA allows zero wait state access, and the CPU can access video memory
>in parallel with the VGA's display unit; if you take 33 MB/sec it will be
>17 MHz I/O speed). This is faster than every board (CPU & VGA combination) I
>ever heard of!! Please, Gary, post realistic results, or tell us how your VGA
>works (EISA bus, VRAMs, graphics processor and so on), and where to get it.

A growing number of cards don't require the cpu to do the scrolling; they have an on-board processor to do the work. With such cards, cpu and bus speeds should be mostly irrelevant for scrolling.

I'm interested in scrolling because it's the only performance aspect that comes close to being unacceptable on some systems I've tried. Yes, I use jump-scrolling, but too often it doesn't work very well. Some programs send text just slowly enough to circumvent jump-scrolling.

I'll post the benchmark summary in a day or two. I was hoping to get more responses, especially from people with faster cards, 8514's and such, but I've only received a couple.

Someone complained that too much time is spent within awk, and that this invalidated the benchmark results. In my tests, on a machine with no fpu, when I redirect the test to /dev/null it takes less than one second to run. This is insignificant compared to the average benchmark run time of over a minute. (When I was devising the benchmark, I considered sending awk's output to a file and then timing a "cat" of the file, but I decided it wasn't necessary.)

If you haven't run the benchmark yet please do so and send me the results. It's short and fairly simple. Just in case you missed it, here it is again:

To run the benchmark start up an xterm like this:

    xterm -geometry 80x24 -fn 8x13 +j &

then put the following awk script in a file called x-test:

-----------------------------cut here----------------------------
BEGIN {
    for (i = 0; i < 1000; i++) {
        printf("xxxxxxxxxxxxx %d\n", i);
    }
    exit;
}
-----------------------------cut here----------------------------

Run it, from the above mentioned xterm, like this:

    time awk -f x-test

and report the real time results.

Thanks very much.
richard@pegasus.com (Richard Foulk) (08/11/90)
My .signature misfired. Please send those benchmark results to:

    richard@pegasus.com

or post them if you prefer.

Thanks again.

Richard Foulk        richard@pegasus.com