toma@sail.LABS.TEK.COM (Tom Almy) (06/19/91)
I just received my "free" upgrade, and used the half price coupon for Top Speed C. Some first impressions: 1. The package is much bigger (even before installing C). More memory models has enlarged the libraries, and there are new extended memory versions of ts and vid. 2. The environment is somewhat nicer, yet basically compatible with the old version. 3. The old .prj files don't work. There is now a .pr file that has new, and even more confusing, syntax. The confusing array of options has now expanded into the project file. 4. A couple of my sample programs have balooned in size about 20%. 5. MGDEMO.MOD crashes my machine when run. In fact none of the graphics programs work. Their example C graphics program runs fine. I'm going to try and track this down, but it is a bad omen. 6. The WNDDEMO program comes in both Modula and C versions, and the C version is *much* faster. It is also about 10% smaller. Again the optimization bias is leaning even more toward C and away from Modula. 7. Documentation is much better. I have a number of additional programs to compile, but it looks like this version is a step backwards in code size and execution speed. And it takes 11 Megs of my disk! I will try the C compiler on some "gut buster" code I've got here, and will also see how it works with Microsoft Windows. I only wish they hadn't gone the multi-langage route. I'm afraid this is going to be another "the jack of all trades is master of none" package. Tom Almy toma@sail.labs.tek.com Standard Disclaimers Apply -- Tom Almy toma@sail.labs.tek.com Standard Disclaimers Apply
USDGOG@VTVM1.BITNET (Greg Granger) (06/20/91)
On Tue, 18 Jun 91 21:11:21 GMT Tom Almy said:
>I just received my "free" upgrade, and used the half price coupon for
>Top Speed C. Some first impressions:
>
>1. The package is much bigger (even before installing C). More memory
>   models have enlarged the libraries, and there are new extended memory
>   versions of ts and vid.

Sigh, doesn't surprise me, but I'll probably have to wait till I get a new
machine to install it.  (I just ordered my M2 upgrade and a 1/2 price C++;
I figured the C++ manuals should be worth that.)

>2. The environment is somewhat nicer, yet basically compatible with the
>   old version.
>3. The old .prj files don't work. There is now a .pr file with a new,
>   and even more confusing, syntax. The confusing array of options has
>   now expanded into the project file.

Gee, I can't wait <sarcastic grin>.

>4. A couple of my sample programs have ballooned in size by about 20%.
>5. MGDEMO.MOD crashes my machine when run. In fact none of the graphics
>   programs work. Their example C graphics program runs fine. I'm going
>   to try to track this down, but it is a bad omen.
>6. The WNDDEMO program comes in both Modula and C versions, and the C
>   version is *much* faster. It is also about 10% smaller. Again the
>   optimization bias is leaning even more toward C and away from Modula.

Could you run a couple of simple tests (like time to do n printf's
against time to do n WrStr's)?  I'm interested in just how bad this
bias is.  It is my impression that JPI wrote the low level C calls and
is calling them via M2, so a call to WrStr invokes (at some level) a
call to printf.  This means that JPI M2 programmers have to pay for the
overhead of the transition calls plus the overhead of printf.  I noticed
this first in their heap management stuff.

>7. Documentation is much better.

Gee, that shouldn't be too hard.  All Greek text has been changed to
French <grin>.

>I have a number of additional programs to compile, but it looks like this
>version is a step backwards in code size and execution speed. And it takes
>11 megs of my disk!

Is that just the M2 compiler or both the C and M2 compilers?

>I will try the C compiler on some "gut buster" code I've got here, and will
>also see how it works with Microsoft Windows. I only wish they hadn't gone
>the multi-language route. I'm afraid this is going to be another "jack of
>all trades is master of none" package.

I feel that the multi-language route is fine, but they just have a
decidedly poor implementation (mainly involving 'language bleed' where
none is needed or wanted).

Greg

Greg Granger                     BITNET:   USDGOG@VTVM1
Consultant, USD                  Internet: USDGOG@VTVM1.CC.VT.EDU
Computing Center, Va Tech
toma@sail.LABS.TEK.COM (Tom Almy) (06/20/91)
In article <INFO-M2%91062009222665@UCF1VM.BITNET> Modula2 List
<INFO-M2%UCF1VM.BITNET@ucf1vm.cc.ucf.edu> writes:
>On Tue, 18 Jun 91 21:11:21 GMT Tom Almy said:
>>I just received my "free" upgrade, and used the half price coupon for
>>Top Speed C. Some first impressions:
>>6. The WNDDEMO program comes in both Modula and C versions, and the C
>>   version is *much* faster. It is also about 10% smaller. Again the
>>   optimization bias is leaning even more toward C and away from Modula.
>Could you run a couple of simple tests (like time to do n printf's
>against time to do n WrStr's)?  I'm interested in just how bad this
>bias is.

Well, I thought it unfair to compare WrStr with printf, but WrStr against
fputs should be OK.

I compared TopSpeed Modula-2 with TopSpeed C.  I wrote a program that wrote
10000 copies of "This is a test of writing speed" to the display.  The
result:

    Language       Size    Speed
    Modula-2       2854    21.1 sec
    C              3116    20.6

I then tried writing to a file.  Since Modula-2 is unbuffered by default,
while C is buffered by default, I did the test (now for 1000 copies) both
unbuffered and both buffered with buffer size 1024.

    Language           Size    Speed
    Modula-2 Nobuf     5620     1.64
    Modula-2 Buf       6525     6.81  (NOT a misprint!)
    C Nobuf            4654    12.58
    C Buffered         4750     0.93

It looks like Modula-2 is really buffered, somewhere, and that for some
reason explicitly specifying a buffer actually slows it down.  At any rate,
TopSpeed C is a clear win over TopSpeed Modula-2.  Incidentally, the C
buffered code took 1.48 seconds to execute with Borland C, with a file size
of 6294, making the TopSpeed C product look very good.

Here are the sources for the buffered file versions:

    MODULE test;

    IMPORT FIO;
    IMPORT Storage;

    CONST BufferSize = 1024 + FIO.BufferOverhead;

    VAR i: CARDINAL;
        dummy: FIO.File;
        buffer: ADDRESS;

    BEGIN
        dummy := FIO.Create("test.out");
        Storage.ALLOCATE(buffer, BufferSize);
        FIO.AssignBuffer(dummy, buffer);
        FOR i := 0 TO 1000 DO
            FIO.WrStr(dummy, "This is a test of writing speed");
            FIO.WrLn(dummy);
        END;
    END test.

    #include <stdio.h>

    void main()
    {
        int i;
        FILE *dummy = fopen("test.out", "w");
        setvbuf(dummy, NULL, _IOFBF, 1024);
        for (i = 0; i <= 1000; i++) {
            fputs("This is a test of writing speed\n", dummy);
        }
    }

>>I have a number of additional programs to compile, but it looks like this
>>version is a step backwards in code size and execution speed. And it takes
>>11 megs of my disk!
>Is that just the M2 compiler or both the C and M2 compilers?

Both, but I couldn't install even the M2 compiler on my home system, which
had only 2 megs free, without first erasing the old version.

>>I will try the C compiler on some "gut buster" code I've got here, and will
>>also see how it works with Microsoft Windows. I only wish they hadn't gone
>>the multi-language route. I'm afraid this is going to be another "jack of
>>all trades is master of none" package.

Well, the results are in!  The large C program (XLISP) compiled without a
hitch to an executable about 10% smaller than that of Borland, Zortech, or
Microsoft, but it was also about 10% slower than any of those.  Compilation
speed was much better than Microsoft or Zortech, but much worse than Borland
(these with all optimizations on).

The Windows test was another story.  The simple example program I had
(About2 from the Petzold book) would not compile correctly when modified as
per the TopSpeed manual.  I looked at the TopSpeed sample program and
discovered that it was set up slightly differently (in the DEF/EXP file and
the project file) than the book said.  I changed About2 accordingly and it
compiled.
The executable was 2k smaller than Borland C's.  Unfortunately it did some
funny things with the system resources when I ran it.

>I feel that the multi-language route is fine, but they just have a
>decidedly poor implementation (mainly involving 'language bleed' where
>none is needed or wanted).

Yes, if they had pulled it off, just having a common environment for
different languages, even if one never did mixed language programming,
would have been great.  But I just don't have the time to fool around with
this when I have other, already working, compilers.

I'm going to check the crashing graphics code on another machine, and if it
fails give JPI a call.  If this doesn't pan out, I'll remove the whole mess
from my system and just use version 1 -- it worked!

--
Tom Almy
toma@sail.labs.tek.com
Standard Disclaimers Apply
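For reference, Tom's post lists only the file-output versions of his
benchmark.  A minimal sketch of what the screen-output test he describes
might have looked like in TopSpeed Modula-2 follows; this is not his
original source, and it assumes the display version used the IO module's
WrStr/WrLn, with timing done externally.

    MODULE ScrTest;

    (* Sketch only: write the test string 10000 times to the display,
       as described in the benchmark above.  Assumes TopSpeed's IO
       module provides WrStr and WrLn for screen output. *)

    IMPORT IO;

    VAR i: CARDINAL;

    BEGIN
        FOR i := 1 TO 10000 DO
            IO.WrStr("This is a test of writing speed");
            IO.WrLn;
        END;
    END ScrTest.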
VOGT@EMBL.BITNET (Gerhardt Vogt) (06/21/91)
I have read the last two messages concerning TS 3.0 and I have a slightly
different opinion about it.

1. Size of the system: It's true, the size of the system has increased,
   but by less than mentioned in the postings.  I now have three memory
   models installed for M2, C and C++, plus source for the M2 library,
   plus all the multi-language and Windows stuff; the EXE files are packed
   with DIET and I have 9.2 MB on my disk (2.2 MB of this for examples and
   library source).

2. Project files are more difficult and more powerful, but who cares?  I
   played a bit and it's not that difficult.  And normally it's not
   necessary to change anything in the default .pr files.  If you like to
   play with it, it's a nice toy; if not, it does not bother you.

3. The size of programs did not change as far as I can see.  I have some
   rather big programs (> 20 source files) and I did not notice a
   difference.  You should switch off error checking before you compare.
   And even if the EXE files are bigger, the code is not necessarily
   bigger as well.  My biggest program is a TSR, so I can easily check its
   real size when it's loaded in memory.

4. MGDEMO works fine on my computer (386-33 with VGA).  There was an error
   in an old version of the program -- it did not switch into the proper
   mode -- but this was a source problem, not one of the compiler.  (I
   have not checked the new program and I don't remember exactly what it
   was.)

5. I cannot run the M2 and the C versions of the window demo because I
   have not installed the MThread library, but I can hardly believe that
   code generation should be extremely different between M2 and C.  Maybe
   one program was tested with error checking and the other without.

The idea that WrStr might call printf is somewhat strange.  printf is much
more powerful than all the M2 I/O, and normally all formatted C I/O is
quite slow because of the overhead of interpreting format strings.  All
library functions have either a Modula or an assembler implementation.  I
don't know why the writer of one of the mails thinks the heap management
would call C subroutines; the implementation is in MSTORAGE.MOD and calls
COREMEM.A to do the low-level work.

The system did not improve very much, but it's worth getting the upgrade.
Multi-language programming is a good thing and it's easier than in V2.  I
would have liked some improvements in the environment and the debugger (it
was better than Borland's three years ago, but Borland has done a lot since
and JPI has not).

Anyway, I like it.

Gerhard Vogt
EMBL
D-6900 Heidelberg
W Germany
USDGOG@VTVM1.BITNET (Greg Granger) (06/21/91)
On Thu, 20 Jun 91 18:21:00 +0100 Gerhardt Vogt said:
>...
> The idea that WrStr might call printf is somewhat strange.  printf is much
> more powerful than all the M2 I/O, and normally all formatted C I/O is
> quite slow because of the overhead of interpreting format strings.
> All library functions have either a Modula or an assembler implementation.
> I don't know why the writer of one of the mails thinks the heap management
> would call C subroutines; the implementation is in MSTORAGE.MOD and
> calls COREMEM.A to do the low-level work.
>...

Sorry, I didn't mean to suggest that this was in V3.0 (I haven't received
my copy of V3.0 yet).  In version 2.0 you can trace the heap calls to a low
level routine named ?alloc (can't remember the first letter).  It sure
looked like C to me.  If JPI has now 'fixed' this I'm glad; I'd much rather
see the heap and I/O routines written in M2 (as in version 1.x) instead of
M2 'wrappers' for C routines (as in version 2.x).

Greg
USDGOG@VTVM1.BITNET (Greg Granger) (06/21/91)
On Thu, 20 Jun 91 16:11:20 GMT Tom Almy said:
>...
>I compared TopSpeed Modula-2 with TopSpeed C.  I wrote a program that wrote
>10000 copies of "This is a test of writing speed" to the display.
>...

Thanks, that makes me feel a little better about JPI's M2 product.  BTW, I
think printf/WrStr isn't an unfair comparison, considering the 'common' use
for each.

Which reminds me of something in JPI's Comm Toolbox manual.  The writer of
the manual (clearly a die-hard C programmer) couldn't understand why "the
Modula-2 community attached itself to an awkward set of I/O procedures,
largely ignoring the solutions for similar problems already found in C."
So this person wrote a kludgy Printf in M2.  Considering this is the type
of person JPI hires to write their toolboxes, I guess we are lucky they
didn't 'improve' M2 by replacing all those awkward IMPORTs/DEF files with
some nice %INCLUDE/.h files.  I guess this list has spoiled me; I just
can't imagine that the world is so short on M2 programmers that a compiler
company can't find one to support their flagship product (ooopps, according
to their latest marketing noise, ONE of their flagship products -- BTW, how
many flagships can you have?).  Sigh, sorry, I just kind'a "went off" ...

>...
>I'm going to check the crashing graphics code on another machine, and if it
>fails give JPI a call.  If this doesn't pan out, I'll remove the whole mess
>from my system and just use version 1 -- it worked!
>...

Yes, I know what you mean; version 1.x still looks real good compared to
later versions.  Sometimes I wonder if JPI is advancing or ...  At least
they make a nice YACC (!!! Yet Another C Compiler !!! :-)  (bet I woke some
Unix hackers up with that last one ;-)

Greg
toma@sail.LABS.TEK.COM (Tom Almy) (06/21/91)
Regarding the crashing MGDEMO program:

1. It doesn't crash on my home system, which has a Video Seven graphics
   card, but does on my work machine, which has a Diamond Speedstar.
2. There appears to be a major rewrite of the graphics code since
   version 2.

I'm going to check the graphics code carefully, possibly recompiling to use
VID.  Somewhere in that code they must be doing some non-portable coding.

I'm still underwhelmed.

--
Tom Almy
toma@sail.labs.tek.com
Standard Disclaimers Apply
toma@sail.LABS.TEK.COM (Tom Almy) (06/22/91)
A few days ago I stated that my attempt to build a working Microsoft
Windows program using TopSpeed 3.0 failed.  It turns out I was missing an
additional line in the EXP file.  I guess my mistake was to use the section
on Windows programming in the TechKit manual as the guide -- the extra line
necessary was documented in the "Windows Programming Supplement" pamphlet,
which I had overlooked among the eight glossy manuals.

The simple example program made a 4k executable.  Borland C made a 9k
executable.  Very impressive reduction in overhead!

It should be noted that the EXP file is *almost* the same as the DEF file
needed by the Microsoft, Borland, and Zortech linkers, but requires two
additional lines.  Also the C source files are slightly different and need
some pragma statements not needed by the other C compilers.  I haven't
tried it, but writing a Modula-2 Windows app looks easier than using C
(except for the wealth of existing C Windows examples...).

Now back to the graphics problem...

--
Tom Almy
toma@sail.labs.tek.com
Standard Disclaimers Apply
jordan@aero.org (Larry M. Jordan) (06/22/91)
I too received my upgrade (and added the TechKit) last night.  Someone must
have goofed, because the library source was thrown in as well.  Is this the
same as the "source kit"?

I failed to make anything NOT WORK -- GRDEMO, DEMO, and a compiler I'm
building.  I even tried the Windows demo -- it too fails not to work.

I only installed the Small and MThread models and have no idea how much
hard disk was required -- I had 50 meg available.  That appeared to be
enough.

The mouse seems better supported by the environment than last time.  But I
still must use the keyboard to resize windows -- awkward.  The environment
is designed for the keyboard and the mouse is ancillary.  JPI makes no
pretense here.

The CLASS extension has evolved considerably -- multiple inheritance,
aliasing (renaming of fields and methods), and "safe" downcasting (if
ancestor classes are "up") via type guards.  The class (instance)
initialization code looks just like the module initialization code.  I find
the syntax and semantics of this to be quite appealing.

I had not used JPI Modula-2 for some time and had forgotten how proficient
the system is at reporting compilation errors and permitting you to step
through them, edit (continuing to step without getting confused) and
recompile.  Zortech C++ v2.1 is pathetic by comparison.  TPW stops after
the first error (or is this settable?).  cfront says things like "you have
an error on or near ...", which I've always found humorous, but not
extremely helpful.

The smart-linking capability is also great marketing hype (in addition to
being a truly valuable technology).  Shrinking virtual method tables
appears to be a gain.  Is the linker smart enough to convert virtual
methods to static methods if no descendent class overrides/redefines them?
I think Eiffel's application builder does this.

My only gripe so far: I've noticed many typographical errors in the
documentation.  I would have thought a quick review of galley proofs (or
whatever) would have caught most of these.  Such errors are especially
disturbing in coding examples.

All in all, I'm pleased.  There is a heck of a lot here.  I just wish the
docs were on par with the product.

--Larry
VOGT@EMBL.BITNET (Gerhardt Vogt) (06/24/91)
In a previous posting, Tom Almy tried to compare JPI's speed of string
output to text files in M2 and C.

>    Language           Size    Speed
>    Modula-2 Nobuf     5620     1.64
>    Modula-2 Buf       6525     6.81  (NOT a misprint!)
>    C Nobuf            4654    12.58
>    C Buffered         4750     0.93

The 6.81 seconds are not a misprint but Tom's error.

>    MODULE test;
>
>    IMPORT FIO;
>    IMPORT Storage;
>
>    CONST BufferSize = 1024 + FIO.BufferOverhead;
>
>    VAR i: CARDINAL;
>        dummy: FIO.File;
>        buffer: ADDRESS;
>
>    BEGIN
>        dummy := FIO.Create("test.out");
>        Storage.ALLOCATE(buffer, BufferSize);
>        FIO.AssignBuffer(dummy, buffer);
>        FOR i := 0 TO 1000 DO
>            FIO.WrStr(dummy, "This is a test of writing speed");
>            FIO.WrLn(dummy);
>        END;
>    END test.

AssignBuffer does not take the length of the buffer as an explicit
argument; it uses the implicit size of the actual parameter, which is 4 in
the case of an ADDRESS variable.  Using a 4 byte buffer wouldn't be fast in
any language.  The proper usage of AssignBuffer is

    TYPE Array = ARRAY [0 .. BufferSize - 1] OF CHAR;
    VAR  buffer : POINTER TO Array;
    .
    .
    .
    FIO.AssignBuffer(dummy, buffer^);

Here AssignBuffer gets the proper size as an implicit argument, and the
program runs in about 1.2 seconds buffered and 12 seconds unbuffered on a
386 SX with 16 MHz and an 18 ms disk (without Smartdrive and Co).

I agree with the people who loved version 1.  I asked people from JPI
several times why powerful features of V1 no longer exist (for example
VID's ability to let you examine all local and global variables after a
runtime error and to let the program continue in such a case; PMD is nice
if a program crashes rarely, but letting a program run in the debugger
until it crashes is much more convenient).  And I don't understand why they
removed programs like ANALYZE, which produces a list of who exports and
imports what.  On the other hand, they do not want to include features in
Modula which are standard in C, like a check for uninitialized or unused
variables -- which should be quite simple, because the optimizer has to
maintain this information anyway (I was told that they have asked the
compiler writer, but that he does not want to do it).

Anyway, I still like it very much.

Gerhard Vogt
EMBL
D-6900 Heidelberg
West Germany
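Putting Gerhard's correction back into the original test, the complete
buffered Modula-2 version would look roughly like this.  This is a sketch
only: the type name BufferArray and the final FIO.Close call are additions
here for illustration and are not part of either posted listing.

    MODULE test;

    IMPORT FIO;
    IMPORT Storage;

    CONST BufferSize = 1024 + FIO.BufferOverhead;

    TYPE BufferArray = ARRAY [0 .. BufferSize - 1] OF CHAR;

    VAR i      : CARDINAL;
        dummy  : FIO.File;
        buffer : POINTER TO BufferArray;

    BEGIN
        dummy := FIO.Create("test.out");
        Storage.ALLOCATE(buffer, BufferSize);
        (* Passing buffer^ rather than a bare ADDRESS lets AssignBuffer
           see the real buffer size as the implicit open-array length. *)
        FIO.AssignBuffer(dummy, buffer^);
        FOR i := 0 TO 1000 DO
            FIO.WrStr(dummy, "This is a test of writing speed");
            FIO.WrLn(dummy);
        END;
        FIO.Close(dummy);   (* added here; assumed to flush the buffer *)
    END test.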
toma@sail.LABS.TEK.COM (Tom Almy) (06/24/91)
In article <56F0AA44FD1F400E38@EMBL-Heidelberg.DE> Modula2 List
<INFO-M2%UCF1VM.BITNET@ucf1vm.cc.ucf.edu> writes:
>In a previous posting, Tom Almy tried to compare JPI's speed of string
>output to text files in M2 and C.
>[...]
>The 6.81 seconds are not a misprint but Tom's error.

Thanks for pointing out my mistake.  I typically statically allocate the
buffer, but wanted to dynamically allocate it so as to be as much like the
C version as possible.  After correcting the program, the results were:

    Language           Size    Speed
    Modula-2 Nobuf     5620     1.64
    Modula-2 Buf       6525     0.87
    C Nobuf            4654    12.58
    C Buffered         4750     0.93

making Modula-2 consistently faster, but larger (at least for small
programs).

**********************

On another note: concerning the crashing problem when graphics were used
with my TSENG-4000 based system, I discovered the bug in GRAPH.MOD (thank
God I've got the source kit!).  It was correct in version 2, and they broke
it in version 3.  In procedure SetVideoMode:

    IF (Mode >= 13) AND (Mode < 19) THEN
        R.AX := 1002H;
        R.ES := Seg(PalRegs);
        R.DX := Ofs(PalRegs);
        Lib.Intr(R, 10H);
        n := 0;
        WHILE n < 16 DO
            R.AX := 1010H;   (* WAS R.AL := 010H; *)
            R.BX := n;
            R.DH := SHORTCARD(PalCols[n]);
            R.CH := SHORTCARD(LONGCARD(LongSet(PalCols[n])*LongSet(0FF00H))>>8);
            R.CL := SHORTCARD(LONGCARD(LongSet(PalCols[n])*LongSet(0FF0000H))>>16);
            Lib.Intr(R, 10H);
            INC(n);
        END;
    END;

They made the invalid assumption that AH would not be changed across the
interrupt call.  I'll give JPI a call on this one.

Conclusion: I could be happier, but it does all work.  I'll cut back on
disk space by eliminating many of the various memory model libraries and by
archiving the source files.

--
Tom Almy
toma@sail.labs.tek.com
Standard Disclaimers Apply
schoebel@bs3.informatik.uni-stuttgart.de (Thomas Schoebel) (06/25/91)
In article <9782@sail.LABS.TEK.COM> toma@sail.LABS.TEK.COM (Tom Almy) writes:
>    Language           Size    Speed
>    Modula-2 Nobuf     5620     1.64
>    Modula-2 Buf       6525     0.87
>    C Nobuf            4654    12.58
>    C Buffered         4750     0.93
>
>making Modula-2 consistently faster, but larger (at least for small
>programs).

I just got V3.01 and played a little bit with a larger project (about 70
modules, 750K source code in M2).  Impressions:

The OS/2 version did compile and run without problems when moving from
V2.X.  Getting the DOS version to run (moving from V1.X) will take some
time for adapting calling conventions and system startup code.

I also played with calling conventions, with a surprising result: changing
the calling convention from JPI to STACK decreases the code size by nearly
10%!  However, without adapting some assembler routines this code will not
run; I just wanted to see the size.  Surprise: with STACK conventions, the
size produced by V3.01 is nearly the same as with V1.X (188K EXE, all
checks on, full optimize).  *Not* worse than V1.X!

Consequences: the JPI conventions give only some gain for routines with
small parameter sets.  As soon as a routine calls another, or is more than
a trivial one, you can observe a push orgy in the called routine where all
parameters are moved to the stack.  In fairly large systems, only a small
percentage of all routines are small and trivial enough to benefit from
register passing.  In most cases it produces additional overhead!  Whether
execution time is affected by this is not certain, but it is most likely
that larger code will run slower.  If JPI had chosen the old stack passing
convention as default, their product would be better in general.

Another question concerning benchmarks: most of them are short routines
with few parameters.  Did this lead JPI to choose register passing?  And
what does it say about the relevance of such benchmarks?

In practice, I believe, register passing is worthwhile only if you manually
control it.  In general, it should be turned off for better results.

-- Thomas
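To make the "push orgy" argument concrete, here is a small hypothetical
Modula-2 sketch (procedure and module names invented for illustration, and
output via the TopSpeed IO module is assumed).  With a register-based
convention, Fill receives its parameters in registers; because it calls Sum
before it is finished with them, those registers have to be spilled to the
stack around the call, whereas with a stack convention the values already
live in the caller's frame.

    MODULE CallConv;

    IMPORT IO;

    TYPE Buffer = ARRAY [0..99] OF CARDINAL;

    PROCEDURE Sum(VAR b: Buffer; len: CARDINAL): CARDINAL;
    VAR i, s: CARDINAL;
    BEGIN
        s := 0;
        FOR i := 0 TO len - 1 DO
            s := s + b[i];
        END;
        RETURN s;
    END Sum;

    PROCEDURE Fill(VAR b: Buffer; len: CARDINAL; seed: CARDINAL): CARDINAL;
    VAR i: CARDINAL;
    BEGIN
        FOR i := 0 TO len - 1 DO
            b[i] := seed + i;
        END;
        (* b and len are still live here.  With register passing they
           arrived in registers, which the nested call to Sum will claim
           for its own arguments, so they must first be saved to the
           stack -- the "push orgy".  With stack passing they already
           sit in the caller's frame and nothing extra is needed. *)
        RETURN Sum(b, len);
    END Fill;

    VAR buf   : Buffer;
        total : CARDINAL;

    BEGIN
        total := Fill(buf, 100, 1);
        IO.WrCard(total, 6);
        IO.WrLn;
    END CallConv.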
gkt@iitmax.iit.edu (George Thiruvathukal) (06/25/91)
In article <11710@ifi.informatik.uni-stuttgart.de>,
schoebel@bs3.informatik.uni-stuttgart.de (Thomas Schoebel) writes:
> The JPI conventions give only some gain for routines with small parameter
> sets.  As soon as a routine calls another, or is more than a trivial one,
> you can observe a push orgy in the called routine where all parameters
> are moved to the stack.

This is a good point.  I think we could prove that the average case
behaviour (a result of the push orgies) is probably worse than straight
stack-based allocation of parameters.  For trivial programs (i.e. some of
the programs commonly used for benchmarks), there might be a payoff.

> If JPI had chosen the old stack passing convention as default, their
> product would be better in general.

I have to disagree with you here.  It is quite easy to use the stack-based
parameter passing convention by choosing an appropriate compiler flag.  You
would have to own the SourceKit and rebuild it with the supplied project
files.  I have a feeling, however, you really meant to say that people in
general would be happier if JPI had used stack-based parameter passing by
default, for reasons of compatibility with existing libraries and given our
agreement on the debatable nature of the alleged "performance improvement."

> Another question concerning benchmarks: most of them are short routines
> with few parameters.  Did this lead JPI to choose register passing?  And
> what does it say about the relevance of such benchmarks?

As I mentioned, the practice of benchmarking in the context of software
vendors should be taken with a grain of salt.  As yet, I have not seen a
vendor publish a benchmark result based on non-trivial programs.  I am
under the impression that the choice of the JPI calling convention is based
on two points:

1. The one you made above.  Simple programs with few parameters tend to
   compile well.  These programs are characterized by graphs (for the
   allocation of registers) which have minimal, if any, conflicts.  What
   does this mean?  Well, it means that the push orgies to which Thomas
   alluded are virtually non-existent.  Conclusion: such programs are
   guaranteed to compile better with a register-based calling convention
   than with a stack-based calling convention.

2. Mainstream programming style.  Even to this day, programmers tend to
   use many global variables (even if their use is potentially confined to
   a handful of procedures).  While the rationale for doing so differs
   from programmer to programmer, many programmers I know do so because
   they are aware of the overhead of procedure calls and of compiler
   optimization techniques.  Of course, many of the programmers I know
   really do not believe in structured programming.  They make claims to
   the effect of "structured programs cannot possibly be efficient."  In
   any event, programs which are written in the mainstream programming
   style tend to be characterized by register allocation graphs similar to
   the ones described for trivial programs.

> In practice, I believe, register passing is worthwhile only if you
> manually control it.  In general, it should be turned off for better
> results.

You can.  Check out the pragmas.  There is one which you can use to
constrain the compiler's attempt to allocate registers.  Since I cannot
remember what it is, please look it up.

--
George Thiruvathukal
Laboratory for Parallel Computing and Languages
Illinois Institute of Technology
Chicago
dhinds@elaine18.Stanford.EDU (David Hinds) (06/26/91)
In article <11710@ifi.informatik.uni-stuttgart.de> schoebel@bs3.informatik.uni-stuttgart.de (Thomas Schoebel) writes: > >I also played with calling conventions, with a surprising result: >Changing the calling conventions from JPI to STACK decreases the >code size by nearly 10%! However, without adapting some assembler >routines this code will not run, I just wanted to see the size. >Surprise: With STACK conventions, the size produced by V3.01 is >nearly the same as with V1.X (188K EXE, all checks on, full optimize). >*Not* worse than V1.X! This may be a classic speed vs. size tradeoff. >Consequences: >The JPI conventions give only some gain at routines with small parameter >sets. As soon as a routine calls another or the size of it is more than >a trivial one, you can examine a push orgy in the called routine >where all parameters are moved to stack. In fairly large systems, >only a small percentage of all routines are small and trivial enough >to get advantage from register passing. In most cases it produces >additional overhead! Whether the execution time will be affected from >that is not sure, but it is most likely that larger code will run slower. >If JPI had choosen the old stack passing convention as default, their >product would be better in general. This is quite a strong claim, and I'm not at all sure it is valid. Many very serious compilers (the MIPS RISC compiler backend, for example) that are state of the art in optimization pass some parameters in registers. The MIPS compilers pass the first three or four word-sized parameters this way. The push orgy you complain about would seem to be equivalent to the push orgy that would have been done if the parameters had to be put on the stack in the first place, right? You also over-generalize about properties of procedures in large systems. Can you show some evidence of this claim, such as, that the most time-critical procedures in "large systems" tend to both have many parameters and also do relatively little work (if they do a lot of work, the overhead of passing parameters one way or another is insignificant)? I guess you could argue that the 80x86 machines don't have enough registers to justify using any of them to pass parameters, even transiently. Or that the JPI compiler isn't strong enough at optimization to minimize the overhead. But you should have some timing data before you make this claim (based only on program size, as far as I can tell). -David Hinds dhinds@cb-iris.stanford.edu
Cobus.Debeer@p0.f1.n7101.z5.fidonet.org (Cobus Debeer) (06/26/91)
I have found that the MGDEMO program crashes on certain video adaptors, and
traced the cause of the problem to the code where the video mode is set.
In line 1961 of the GRAPH.MOD file the BIOS function is called without the
AH register being set to 0.  This can be fixed by changing the assignment
of AL into an assignment of the full AX register, or by making AH zero.
The reason why it does not crash consistently is that AH is mostly zero;
when it happens to be non-zero you go off into space.

There is another bug in the code.  When using VGA graphics, the structure
indicating the horizontal and vertical pixel count returns the correct
values, but the pixel plotting functions use the EGA limits to decide if a
point should be plotted.

regards

Cobus de Beer

--
uucp:     uunet!m2xenix!puddle!5!7101!1.0!Cobus.Debeer
Internet: Cobus.Debeer@p0.f1.n7101.z5.fidonet.org
schoebel@bs3.informatik.uni-stuttgart.de (Thomas Schoebel) (06/26/91)
In article <1991Jun25.172455.12384@leland.Stanford.EDU>
dhinds@elaine18.Stanford.EDU (David Hinds) writes:
>This is quite a strong claim, and I'm not at all sure it is valid.  Many
>very serious compilers (the MIPS RISC compiler backend, for example) that
>are state of the art in optimization pass some parameters in registers.
>The MIPS compilers pass the first three or four word-sized parameters this
>way.  The push orgy you complain about would seem to be equivalent to the
>push orgy that would have been done if the parameters had to be put on the
>stack in the first place, right?

You are right with respect to RISC architectures: there you have plenty of
registers, and stack-like behaviour can also be achieved by register
windowing.  However, TopSpeed is running on the 8086, where you have only 4
general purpose 16-bit registers.  Index registers like SI and DI, and also
the segment registers, could be used for temporary storage, but the 86
architecture is not very symmetric: some instructions don't work with all
registers, or they are hard-assigned to particular registers.  This results
in frequent use of registers for temporary values.

An example: if a procedure has three parameters, where two of them are
pointers or VAR parameters, you'd need 5 registers for them.  Unfortunately,
AX is always chosen for the first parameter, but AX is (most likely) the
most frequently used register.  So the mentioned push orgies are very
likely.

>You also over-generalize about properties of procedures in large systems.
>Can you show some evidence for this claim -- for instance, that the most
>time-critical procedures in "large systems" tend both to have many
>parameters and to do relatively little work (if they do a lot of work, the
>overhead of passing parameters one way or another is insignificant)?

Well, my project is a multi-threaded one.  At least for this case my claims
are most likely valid, because you have to avoid static data where
possible.  Thus nearly all structures are referenced via pointers, which
have to be passed as parameters.  Of course, if you program in FORTRAN or
COBOL style, you need not use parameters, but then there would be no
difference between the two calling conventions.

>I guess you could argue that the 80x86 machines don't have enough registers
>to justify using any of them to pass parameters, even transiently.  Or that
>the JPI compiler isn't strong enough at optimization to minimize the
>overhead.

Yes, the problem is the available number of registers: they are only useful
for parameters if you don't have too many of them.  Secondly, whenever
procedure A calls B and B's parameters are passed via registers, A has to
preserve the old values if they are needed after the call.  Even if
procedure B preserves all temporarily used registers, there will be some
overhead somewhere.

The point is that there is no general measure for predicting the lifetime
of values.  Keeping rarely used but recently referenced values in registers
leads to overhead for preserving them.  The problem is that nobody can
predict from a procedure definition (e.g. in a DEFINITION MODULE) the
reference pattern of the parameters.  Allocating registers implies the
assumption that parameters will either be used frequently or can be
discarded after a few statements.  But what if a parameter is used only at
the bottom of the procedure, or if the register (e.g. AX) is used in a
prior function call for the return value?

>But you should have some timing data before you make this claim (based only
>on program size, as far as I can tell).
>
> -David Hinds
>  dhinds@cb-iris.stanford.edu

Yes, I'm currently doing some timing measurements, but there are limits
because of the file accesses of my program, which aren't reproducible in
general.

In a previous article, George Thiruvathukal also suggested a proof for my
claims.  I think a mathematical proof is not very hard, for there are
parallels to thrashing effects in paging strategies in operating systems.
The only problem is to "prove" that having only 4 or, say, 8 registers
makes it very likely that such thrashing may occur.

-- Thomas