[comp.sys.amiga.programmer] Lemmings - a tutorial Part V

farren@well.sf.ca.us (Mike Farren) (03/24/91)

NOTICE - this entire posting is Copyright (c) 1991 by Michael J.
Farren.  The only reproduction rights granted under this copyright are
for electronic transmission between Usenet sites, and other sites
connected to Usenet sites.  Contact farren@well.sf.ca.us for reproduction
rights other than the above.

Level 4 - DON'T FENCE ME IN

The last three installments have talked about Lemmings from the point
of view of memory use, code size and speed, and its relationship with
multitasking.  This installment, which should be the last one, will
look at Lemmings from the point of view of its overall friendliness to
the system.  Someone, either the designers or Psygnosis, made several
decisions which directly affect the usability of the game, such as the
customized disk format and the lack of installability on hard drives.
The question is, were these decisions necessary?  My answer is no, and
in this installment I will try to explain why.

Let's start with disk usage.  The developer of Lemmings has stated in
a message that appeared in comp.sys.amiga.games that the custom disk
driver and disk format they use allow them to cram 980K on a disk.
Since one can presume that they actually use all of that space, this
becomes one justification for NOT providing standard Amiga format
diskettes.  Or does it?  I say no.  In the first place, a huge amount
of the space used on disk one is devoted to the silly animated
Lemmings introduction.  While this is very cute, if it came down to a
choice between an 880K Amiga diskette and such a large-scale animated
display, there was no *need* to choose the animation, just a desire to
do so.  Surely a less ambitious display could have been made, fitting
in 100K less disk space.

On disk two, the situation is a bit different.  This disk contains the
Lemming levels, and if it is full, I will presume that that's
necessary for containing all of that data.  I call your attention,
however, to two distinct levels: A Beast of a Level, and An Awesome
Level.  I would be willing to bet that these two levels alone account
for an enormous amount of the disk space used.  Each of them is unique
in its graphics; they share none of the graphic elements of any
other level.  More to the point, each of them is unique in its
sound, and thus requires significant additional disk space just to
store its data.  While the idea of having levels which are
replications of other Psygnosis games is amusing, again it comes down
to a matter of choice - they didn't *need* to include those levels,
whimsical as they may be.

(By the way - there's one implication that putting all of the level
data on disk two brings up: additional level disks.  Psygnosis, are
you listening?  DO IT!)

Part of the goal of Psygnosis was, I am sure, to keep the game down to
two disks.  Each additional disk in a distribution adds significant
costs to the game's production.  I do not cavil with that - I just
believe that with a judicious eye towards the reality of the game,
and, perhaps, some reorganization and packing of the data, those two
disks could easily have been standard AmigaDOS disks.  The question of
copy protection is hereby ignored - I don't want to get into that one.

As for hard drive installation, the Lemmings developer has already
stated that it could be done, and that Psygnosis will sell such a
version.

Now, let's look at the question of memory usage.  Specifically, the
use of expansion memory.  Oh, I know that there's a nice looking
display at the beginning that says "Expansion Memory Detected and
Utilized".  What I would like to know, though, is simply this:
utilized for *what*?  If you have expansion memory, you've got enough
memory to allow multiple levels to be loaded at once, thus avoiding
any disk access delays, and speeding the game up significantly.  Get
clever about it, and you can even do things like loading the next
levels during the idle time between the levels, making the whole
process even more friendly.  If you have expansion memory, you have
the room to keep two copies of the background in memory
simultaneously, allowing you to restart a level in seconds, rather
than the tens of seconds it takes now (I, for one, am tired of seeing
that cute sleeping lemming sprite!).  Lemmings, as it stands, does
none of those things, even in the presence of multi-megabytes of
unused memory.  And of course, there's the final benefit of expanded
memory - you can multitask.
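
As a rough sketch of the preloading idea (not anything taken from
Lemmings itself): given the amount of free expansion memory, work out
how many of the upcoming levels can be held in RAM while keeping some
back for a spare copy of the background or for other programs.  The
structure and function names here are invented for illustration, and on
a real Amiga the free-memory figure would come from something like
exec's AvailMem(MEMF_FAST) rather than being passed in by hand.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one level's data as it sits in RAM. */
struct level_info {
    size_t packed_bytes;   /* bytes needed to hold this level */
};

/*
 * Count how many consecutive levels, starting at index `next`, fit
 * into `free_bytes` of expansion memory while leaving `reserve_bytes`
 * untouched (for a second background copy, or for other programs).
 */
size_t levels_to_preload(const struct level_info *levels, size_t count,
                         size_t next, size_t free_bytes,
                         size_t reserve_bytes)
{
    size_t budget, used = 0, n = 0;

    assert(levels != NULL || count == 0);

    if (free_bytes <= reserve_bytes)
        return 0;                       /* nothing to spare */
    budget = free_bytes - reserve_bytes;

    while (next + n < count) {
        size_t need = levels[next + n].packed_bytes;
        if (need > budget - used)
            break;                      /* the next level does not fit */
        used += need;
        n++;
    }
    return n;
}
```

A loader could call this between levels and fetch that many levels
during the idle time; a result of zero simply means falling back to
loading each level from disk as it is needed.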

Level 5 - TH-TH-TH-TH-TH-TH-THAT'S ALL, FOLKS

I hope that these postings have been useful fuel for thought.  I hope
even more that any fledgling game developers reading this will think,
the next time they do a game, about how simple it can be to have both
a good game and an Amiga-friendly game.  You don't have to give up
much of anything, and you gain the respect and admiration of those who
like to use their computers to their fullest, while not losing the
respect of those who don't care.  It seems to me that this is the
classic "Win-Win" situation, and shouldn't we all be going for that
when we can?

mykes@sega0.SF-Bay.ORG (Mike Schwartz) (03/25/91)

In article <23788@well.sf.ca.us> farren@well.sf.ca.us (Mike Farren) writes:
>I hope that these postings have been useful fuel for thought.  I hope
>even more that any fledgling game developers reading this will think,
>the next time they do a game, about how simple it can be to have both
>a good game and an Amiga-friendly game.  You don't have to give up
>much of anything, and you gain the respect and admiration of those who
>like to use their computers to their fullest, while not losing the
>respect of those who don't care.  It seems to me that this is the
>classic "Win-Win" situation, and shouldn't we all be going for that
>when we can?

All you talked about here is how to strip stuff out of a product to make
it Amiga friendly.  Why settle for anything less than what you want to
make?  Why compromise on the power of what the Amiga can do?  If you
want to play a game while downloading from a bbs or raytracing at the
same time, I suggest you look in the public domain.  If you want to play
a game that pushes the Amiga far beyond what the operating system allows,
but what the Amiga does allow, you look for the best of what is commercially
available (arguably Psygnosis style software).

You do not describe a WIN-WIN situation.  What you describe is a WIN-LOSE
situation.  You WIN because you can install on a hard disk and you can
multitask.  You LOSE because you want to cut 100's of K of code, graphics,
and audio from the product.  You LOSE because the games that result are
not as good as the Amiga is capable of doing.  

The Amiga is NOT a Mac and it is NOT a PC.  Anything those machines can do,
the Amiga can do.  Things the Amiga can do, those other machines can't
touch.  I'd like to see Shadow of the Beast running on a Mac or a PC.
Beast may be difficult to play (which is a different issue altogether and
has nothing to do with whether it takes over the machine or not), but it
does do what the Amiga and only the Amiga can do.  This is what I would
encourage fledgling developers to accomplish.

mykes

--
*******************************************************
* Assembler Language separates the men from the boys. *
*******************************************************

sschaem@starnet.uucp (Stephan Schaem) (03/25/91)

 Don't forget: the target market for games is the A500 with 512K.
 If you have your own DOS, it will work on V.000001 and V99999.1
 of the software if they have the same boot procedure.
 If the market were the A3000, it would be something else!

 What should DEFINITELY NOT be done is a general version
 of a game.
 I would love to do a 512K version and a 1 meg version of my game,
 but for the 1 meg version a lot of problems arise!

 Since the market is not defined, you risk a lot with returns!
 And right now, some companies are trying to start something and others
 are just surviving. Those can't do much to change ANYTHING; the big ones can.

 To anyone: WRITE TO WHOM IT MAY CONCERN.
 Well, if you really want something changed! I know of only 2 people
 in the US who would want an HD version of my game... Do you think
 that kind of thing pushes developers?


 And in Europe the subject never comes up... So with those settings,
 people try to push the machine to the limit with that in mind!

farren@well.sf.ca.us (Mike Farren) (03/26/91)

mykes@sega0.SF-Bay.ORG (Mike Schwartz) writes:


>All you talked about here is how to strip stuff out of a product to make
>it Amiga friendly.


Not at all.  What I talked about is how to keep stuff in a product, and
make it Amiga-friendly anyway.


>The Amiga is NOT a Mac and it is NOT a PC.

Then why do so many game developers keep treating it like it is?  One
of the things the Amiga can do that those machines can't is to run a
full and fast multitasking OS - why are you so insistent that the first
thing we should do is throw that away?

-- 
Mike Farren 				     farren@well.sf.ca.us

rbabel@babylon.rmt.sub.org (Ralph Babel) (03/27/91)

In article <mykes.0374@sega0.SF-Bay.ORG>,
mykes@sega0.SF-Bay.ORG (Mike Schwartz) writes:

> A PC can run UNIX and a Mac can run A/UX.  Those are fuller
> multitasking OS'es than the Amiga's [...]

Nonsense! Either it's multitasking or it's not! And in the
cases mentioned above, it's even preemptive multitasking.
The Amiga certainly is not a multiuser machine, it doesn't
have virtual memory, and it doesn't support MMU protection,
but that wasn't your point.

Ralph

s902255@minyos.xx.rmit.oz.au (Andrew Vanderstock) (03/27/91)

sschaem@starnet.uucp (Stephan Schaem) writes:


> Don't forget: the target market for games is the A500 with 512K.
> If you have your own DOS, it will work on V.000001 and V99999.1
> of the software if they have the same boot procedure.
> If the market were the A3000, it would be something else!

> What should DEFINITELY NOT be done is a general version
> of a game.
> I would love to do a 512K version and a 1 meg version of my game,
> but for the 1 meg version a lot of problems arise!

> Since the market is not defined, you risk a lot with returns!
> And right now, some companies are trying to start something and others
> are just surviving. Those can't do much to change ANYTHING; the big ones can.

> To anyone: WRITE TO WHOM IT MAY CONCERN.
> Well, if you really want something changed! I know of only 2 people
> in the US who would want an HD version of my game... Do you think
> that kind of thing pushes developers?


> And in Europe the subject never comes up... So with those settings,
> people try to push the machine to the limit with that in mind!

Well the best way to solve all these problems is :
 a) knock AmigaDOS/Intuition down, but save the context so that you can
 return the user to AmigaDOS when the game is paused. My idea of a pause
 is an icon which, when single-clicked, stops the action; when
 double-clicked, it restores AmigaDOS, retaining only a small loader and
 context information for the game, and leaves an icon on the WB screen
 which, when double-clicked, reloads the game and restores it from the
 context info saved previously.

 b) hard disc installable. A definite must. I hate having to re-boot before
 and after a game. I also like HD's speed. Easily done: create a folder with
 all the files on *one* disc, and just load system-dependent overlays. When
 installing on HD, just drag the folder.

 c) runs on *any* set up. This means for the most part ignore the 68020's
 extra (nice) op-codes, and ignore MOVE <data>, SR for any version of the
 game. Keep overlays to a minimum, but load a 512k overlay manager for the
 little machines around, a 1 meg+ overlay for the rest. Try to recognise
 the ram in the bay door slot. A lot of people have this ram, but because a
 lot of games have boot-block loaders, they can't configure the ram. I
 would let the user run AmigaDos to get the ram added, and keep all nice
 references for this ram (ie where is it? what is it? chip? slow? how
 much?)

   You would also need 68020 (and '30 and '40) drivers if you want to make
 more playable games for the rich set (i.e. more frames per second, better
 use of caching/RAM/certain instruction sequences, better and/or larger
 graphics). But simply have an overlay which can be loaded in and added
 into the code to replace another overlay which is used for the A500 +
 modulator set.

Just my $0.02
Andrew Vanderstock
s902255@minyos.xx.rmit.oz.au

mykes@sega0.SF-Bay.ORG (Mike Schwartz) (03/28/91)

In article <23837@well.sf.ca.us> farren@well.sf.ca.us (Mike Farren) writes:
>mykes@sega0.SF-Bay.ORG (Mike Schwartz) writes:
>
>
>>All you talked about here is how to strip stuff out of a product to make
>>it Amiga friendly.
>
>
>Not at all.  What I talked about is how to keep stuff in a product, and
>make it Amiga-friendly anyway.
>

You explicitly said they should have cut out levels that were parts of
their other products.

>
>>The Amiga is NOT a Mac and it is NOT a PC.
>
>Then why do so many game developers keep treating it like it is?  One
>of the things the Amiga can do that those machines can't is to run a
>full and fast multitasking OS - why are you so insistent that the first
>thing we should do is throw that away?

The Amiga is capable of doing awesome video games if you program it right.
The 68000 is not really that fast, and the operating system is fast only
when you compare it with Unix.  Game developers treat the Amiga like it
is a PC by using the OS.  You can't take over a PC, because you need the
BIOS to interface to a variety of hardware configurations.  The multitasking
ability of the Amiga is certainly impressive, but the Amiga has other
qualities that video games often need to stress more.  Like the blitter,
copper, audio, etc.  Most of the PC people I have seen that move over to
the Amiga struggle with volumes and volumes of poorly illustrated RKM
manuals and typically don't make games that I rate very high.

The Amiga operating system is not a high performance video game operating
system.  BOBs are slower than what I use.  Intuition takes 30% of the CPU
time when you just move the mouse around (check out a CPU performance
monitor on a 68000 machine while moving the mouse).  Layers are totally
unnecessary and way too slow.  Exec tasks require a minimum of 2K of stack
each, while any game I ever do needs only 512 bytes of stack for 80 tasks
under my own kernel.  
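
For scale, 80 Exec tasks at 2K of stack each would need 160K for stacks
alone.  One common way to get many tasks onto a single small stack (a
sketch only, and not necessarily how the kernel described above works)
is to make each task a short step function that runs to completion once
per frame and keeps its state in a small record instead of on a stack
of its own.  All names below are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TASKS 80

struct task;
typedef void (*task_step_fn)(struct task *self);

/* Per-task state lives here instead of on a private stack. */
struct task {
    task_step_fn step;   /* NULL marks a free slot */
    int counter;         /* example state carried between frames */
};

struct scheduler {
    struct task tasks[MAX_TASKS];
};

/* Mark every slot free. */
void sched_init(struct scheduler *s)
{
    size_t i;

    assert(s != NULL);
    for (i = 0; i < MAX_TASKS; i++) {
        s->tasks[i].step = NULL;
        s->tasks[i].counter = 0;
    }
}

/* Claim a free slot; returns its index, or -1 if all slots are busy. */
int sched_spawn(struct scheduler *s, task_step_fn step)
{
    int i;

    assert(s != NULL && step != NULL);
    for (i = 0; i < MAX_TASKS; i++) {
        if (s->tasks[i].step == NULL) {
            s->tasks[i].step = step;
            s->tasks[i].counter = 0;
            return i;
        }
    }
    return -1;
}

/* Run every live task once.  Each step returns before the next one
 * starts, so all of them share the caller's stack. */
void sched_run_frame(struct scheduler *s)
{
    size_t i;

    assert(s != NULL);
    for (i = 0; i < MAX_TASKS; i++) {
        if (s->tasks[i].step != NULL)
            s->tasks[i].step(&s->tasks[i]);
    }
}

/* Example task: counts frames and retires itself after 100 of them. */
void count_frames(struct task *self)
{
    if (++self->counter >= 100)
        self->step = NULL;
}
```

The price of this arrangement is that a task can never block in the
middle of its work; anything that has to wait must record where it was
and return, which is a very different bargain from the one Exec offers.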

You mention in your lecture/article stream that the OS steals 80K of
RAM (that is about 16% of what you get on a 512K machine).  Even if I
don't need that 80K for the game, I can always find something useful
to do with it (like adding instrument samples for the music driver). 

The ROM Kernel routines have many many bugs in them that you end up
programming your way around.  It is not lazy to want to avoid the
hassle.  It is just more cost effective to make the best games the
machine can do and to make them as fast and often as you can. 

The only thing that the OS gets for you is the ability to use hard
disks.  If Commodore were smart, they would make ROM routines
available for video games to access the hard disk when the OS is not
running.  This is really what the machine needs. 

It may shock you, but I actually do much more programming of the Amiga
in 'C' under the OS than I do taking it over.  When you make a game
that is 40 Megs worth of source files (graphics, sounds, code) on hard disk
and ends up on 2 880K disks, you need to write a lot of programs to
manipulate the data.  IFF format, for example, is fine for a source file
format, but wastes disk space in a product.

Most people who program the Amiga don't have the ability (or gumption)
to write in assembler language.  These are the people who I would
call lazy.  Most people who program the Amiga don't have the ability
to write their own native operating systems that outperform the ROM
Kernel, so for them the OS is the only choice.  In my case, I write
assembler language because I can.  I take over the machine because
I can.  People actually pay for what I program, so in order to give
them the best I can do, I go the extra mile.  It is ridiculous to
say that someone who goes to the extra effort that assembler language
programming takes is lazy.

Have you ever taken over the Amiga?  I bet if you did, you'd change
your tune.  I on the other hand have done things both ways (using the
OS and taking over), and the power you gain by taking over more than
offsets the capability to multitask your game with other programs.
Open your mind and give it a try, then we can really have a productive
disagreement. 

>
>-- 
>Mike Farren 				     farren@well.sf.ca.us

--
********************************************************
* Appendix A of the Amiga Hardware Manual tells you    *
* everything you need to know to take full advantage   *
* of the power of the Amiga.  And it is only 10 pages! *
********************************************************

mykes@sega0.SF-Bay.ORG (Mike Schwartz) (03/28/91)

In article <23837@well.sf.ca.us> farren@well.sf.ca.us (Mike Farren) writes:
>Then why do so many game developers keep treating it like it is?  One
>of the things the Amiga can do that those machines can't is to run a
>full and fast multitasking OS - why are you so insistent that the first
>thing we should do is throw that away?
>
I hate to respond to this twice, but...  A PC can run UNIX
and a Mac can run A/UX.  Those are fuller multitasking OS'es
than the Amiga's although not as good at realtime response.

Q: By the way, do you know why the Atari ST operating system is
called TOS?

A: Because the first thing you do is TOSS it.

>-- 
>Mike Farren 				     farren@well.sf.ca.us

--
********************************************************
* Appendix A of the Amiga Hardware Manual tells you    *
* everything you need to know to take full advantage   *
* of the power of the Amiga.  And it is only 10 pages! *
********************************************************

jdickson@jato.jpl.nasa.gov (Jeff Dickson) (03/28/91)

In article <mykes.0374@sega0.SF-Bay.ORG> mykes@sega0.SF-Bay.ORG (Mike Schwartz) writes:
>In article <23837@well.sf.ca.us> farren@well.sf.ca.us (Mike Farren) writes:
>>Then why do so many game developers keep treating it like it is?  One
>>of the things the Amiga can do that those machines can't is to run a
>>full and fast multitasking OS - why are you so insistent that the first
>>thing we should do is throw that away?
>>
>I hate to respond to this twice, but...  A PC can run UNIX
>and a Mac can run A/UX.  Those are fuller multitasking OS'es
>than the Amiga's although not as good at realtime response.
>
>Q: By the way, do you know why the Atari ST operating system is
>called TOS?
>
>A: Because the first thing you do is TOSS it.
>
>>-- 
>>Mike Farren 				     farren@well.sf.ca.us
>
>--
>********************************************************
>* Appendix A of the Amiga Hardware Manual tells you    *
>* everything you need to know to take full advantage   *
>* of the power of the Amiga.  And it is only 10 pages! *
>********************************************************

	Can this message and the other entitled "AmigaWorld lied" get moved
out of the main stream? I'm sick to death of coming across those messages
and it seems like they are the bulk of messages on comp.sys.amiga.misc,
comp.sys.amiga.hardware, and comp.sys.amiga.programmer. I'm sorry to waste
bandwidth.

espie@flamingo.Stanford.EDU (Marc Espie) (03/28/91)

In article <mykes.0362@sega0.SF-Bay.ORG> mykes@sega0.SF-Bay.ORG (Mike Schwartz) writes:
>In article <23837@well.sf.ca.us> farren@well.sf.ca.us (Mike Farren) writes:
>>mykes@sega0.SF-Bay.ORG (Mike Schwartz) writes:
[stuff deleted]
>>>The Amiga is NOT a Mac and it is NOT a PC.
>>Then why do so many game developers keep treating it like it is?  One
>>of the things the Amiga can do that those machines can't is to run a
>>full and fast multitasking OS - why are you so insistent that the first
>>thing we should do is throw that away?
>
>The Amiga is capable of doing awesome video games if you program it right.
>The 68000 is not really that fast, and the operating system is fast only
>when you compare it with Unix.  Game developers treat the Amiga like it
>is a PC by using the OS.  You can't take over a PC, because you need the
>BIOS to interface to a variety of hardware configurations.  The multitasking

??? hey ! There aren't only Amiga 500s with 512K around.

>ability of the Amiga is certainly impressive, but the Amiga has other
>qualities that video games often need to stress more.  Like the blitter,
>copper, audio, etc.  Most of the PC people I have seen that move over to

And the new Agnus chip, and the third party disk drives, and the old
Amiga 1000s, and the new 3000 with OS 2.0, and the Hurricane cards and...

>the Amiga struggle with volumes and volumes of poorly illustrated RKM
>manuals and typically don't make games that I rate very high.
Well, there should be the programmer interface manual going out.
Anyway, RKMs are intended for people who know a bit about operating
systems, not PC type guys (say again ? what OS ? Ah the BIOS, you mean:-) )
>
>The Amiga operating system is not a high performance video game operating
>system.  BOBs are slower than what I use.  Intuition takes 30% of the CPU
>time when you just move the mouse around (check out a CPU performance
>monitor on a 68000 machine while moving the mouse).  Layers are totally
>unnecessary and way too slow.  Exec tasks require a minimum of 2K of stack
>each, while any game I ever do needs only 512 bytes of stack for 80 tasks
>under my own kernel.  
>
>You mention in your lecture/article stream that the OS steals 80K of
>RAM (that is about 16% of what you get on a 512K machine).  Even if I
>don't need that 80K for the game, I can always find something useful
>to do with it (like adding instrument samples for the music driver). 

How about checking the configuration of the machine you use ?
If your game is designed for 512K, maybe (just maybe) you could preserve
the OS when running on >1Meg machines. If you're really good, you can
even check the speed of the processor and leave part of the OS running...
How about a download on your a3000 while you're playing an arcade game ?
Since the blitter has a 32 bit path, this means that *even* coprocessor
stuff runs twice as fast as on a 500. Plenty of time left...
F18, for instance, works much better on >1.5meg machines. And it runs with
a 68030 at an impressive speed...

>
>The ROM Kernel routines have many many bugs in them that you end up
>programming your way around.  It is not lazy to want to avoid the

Disregarding the fact that there aren't so many bugs around (if you're
used to game machines, maybe you don't respect multitasking rules :-),
Commodore is working on 2.0, which will be more stable and better...
and break most arcade type games.

What about 68030 and MMU ? Are they a bug ? What the heck, perfectly
reasonable self-modifying code BREAKS when running under them.

>hassle.  It is just more cost effective to make the best games the
>machine can do and to make them as fast and often as you can. 
>
>The only thing that the OS gets for you is the ability to use hard
>disks.  If Commodore were smart, they would make ROM routines
>available for video games to access the hard disk when the OS is not
>running.  This is really what the machine needs. 
>
>It may shock you, but I actually do much more programming of the Amiga
>in 'C' under the OS than I do taking it over.  When you make a game
>that is 40 Megs worth of source files (graphics, sounds, code) on hard disk
>and ends up on 2 880K disks, you need to write a lot of programs to
>manipulate the data.  IFF format, for example, is fine for a source file
>format, but wastes disk space in a product.
>
>Most people who program the Amiga don't have the ability (or gumption)
>to write in assembler language.  These are the people who I would
>call lazy.  Most people who program the Amiga don't have the ability
>to write their own native operating systems that outperform the ROM
>Kernel, so for them the OS is the only choice.  In my case, I write
>assembler language because I can.  I take over the machine because
>I can.  People actually pay for what I program, so in order to give
>them the best I can do, I go the extra mile.  It is ridiculous to
>say that someone who goes to the extra effort that assembler language
>programming takes is lazy.

Most of the code I've seen in Assembler does atrocious things and is
fairly unreadable. Code written in C tends to be cleaner... 
And I know some people who don't know ANYTHING about high-level
languages and Operating Systems. For them, assembler is the easiest way.

Assembler vs C: compare IFF SMUS and Soundtracker. On one hand, you have
a fairly reasonable format, not incredibly efficient. On the other hand,
you have a highly efficient memory dump, with major problems:
- VERY bad design. Features like sample length/repeat stuff should be
stored with the sample. Several different versions exist, all
subtly incompatible with each other.
- not flexible. Does it support IFF samples ?
Does it support Fibonacci compression ?
- hardware dependent... the difference between NTSC/PAL kills it.
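
To put rough numbers on the NTSC/PAL point: Paula's sample clock differs
slightly between the two standards, and a replay routine that is called
once per vertical blank runs at 50 Hz on PAL and 60 Hz on NTSC. The
sketch below assumes that vertical-blank timing (the post does not say
exactly what breaks) and uses the usual published clock figures; the
function names are made up.

```c
#include <assert.h>

/* Paula sample clock in Hz for each video standard. */
#define PAULA_CLOCK_PAL  3546895L
#define PAULA_CLOCK_NTSC 3579545L

/* Playback rate in Hz (rounded down) for a given Paula period value. */
long rate_from_period(long clock_hz, long period)
{
    assert(period > 0);
    return clock_hz / period;
}

/*
 * Song rows per minute when the replay routine is called once per
 * vertical blank and advances one row every `ticks_per_row` calls.
 */
long rows_per_minute(long vblank_hz, long ticks_per_row)
{
    assert(ticks_per_row > 0);
    return vblank_hz * 60L / ticks_per_row;
}
```

With a period of 428 (just an example value), the pitch moves by under
one percent: 8287 Hz on PAL against 8363 Hz on NTSC. At 6 ticks per row,
though, the song runs at 500 rows per minute on PAL and 600 on NTSC, a
20% change in tempo, unless the player times itself from something other
than the vertical blank.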

>
>Have you ever taken over the Amiga?  I bet if you did, you'd change
>your tune.  I on the other hand have done things both ways (using the
>OS and taking over), and the power you gain by taking over more than
>offsets the capability to multitask your game with other programs.
>Open your mind and give it a try, then we can really have a productive
>disagreement. 

Again: you can take over and keep the OS in a corner, that is easy to
do. You can even partly take over and leave os stuff running.

Ok, what I've said doesn't apply to all games. But I'm sick of seeing
many games breaking the OS for gratuitous purposes. Why is
Shadow of the Beast so badly protected ? Why does full metal planet
take over the machine ? Why does Populous crash so often ?
Why do I spend so much time getting BROKEN games to run ? installing
bootblocks, patching code, trying things out ?
Why have so many games decided the joystick should go into port 0 ?
Is it so difficult to add an option (and leave my mouse where it is) ?

I can understand that you feel personally attacked by what Mike Farren
says. So, alright, you're a nice little programmer. Now, look at
existing games... Don't you feel a little uneasy about some of them ?

>********************************************************
>* Appendix A of the Amiga Hardware Manual tells you    *
>* everything you need to know to take full advantage   *
>* of the power of the Amiga.  And it is only 10 pages! *
Which is the surest way to break things. You should
*at least* read the recommendations at the beginning of the RKM.
>********************************************************

	Marc Espie (espie@flamingo.stanford.edu)

bairds@eecs.cs.pdx.edu (Shawn L. Baird) (03/28/91)

s902255@minyos.xx.rmit.oz.au (Andrew Vanderstock) writes:

> [ Andrew mentions people who have non-autoconfig RAM ] Try to recognise
> the ram in the bay door slot. A lot of people have this ram, but because a
> lot of games have boot-block loaders, they can't configure the ram. I
> would let the user run AmigaDos to get the ram added, and keep all nice
> references for this ram (ie where is it? what is it? chip? slow? how
> much?)

One good way, perhaps, to do this is to use the loader (via BSS hunks) or
AllocMem() before you take over the system. I would guess, correct me if
I'm wrong, most games would not need to make further allocations during
the course of the game. That is, take all that you'll need immediately and
hold it until the user quits or pauses.
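
A minimal sketch of the take-it-all-up-front approach, in portable C:
one block is grabbed while the OS is still running and then carved up by
a trivial bump allocator, so nothing needs to be asked of the system once
the game has taken over. malloc() stands in for AllocMem() here so the
example runs anywhere, and the arena names are invented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One big block obtained before taking over the machine. */
struct arena {
    unsigned char *base;   /* start of the block, NULL if none */
    size_t size;           /* total bytes in the block */
    size_t used;           /* bytes handed out so far */
};

/* Grab the whole block in one request.  Returns 1 on success, 0 on failure. */
int arena_init(struct arena *a, size_t size)
{
    assert(a != NULL);
    a->base = malloc(size);            /* stand-in for AllocMem() */
    a->size = (a->base != NULL) ? size : 0;
    a->used = 0;
    return a->base != NULL;
}

/* Hand out the next `bytes` of the block, rounded up to 8 bytes.
 * Returns NULL when the block is exhausted; never calls the system. */
void *arena_take(struct arena *a, size_t bytes)
{
    size_t aligned = (bytes + 7) & ~(size_t)7;
    void *p;

    assert(a != NULL);
    if (aligned < bytes || aligned > a->size - a->used)
        return NULL;
    p = a->base + a->used;
    a->used += aligned;
    return p;
}

/* Give everything back in one go when the user quits or pauses. */
void arena_release(struct arena *a)
{
    assert(a != NULL);
    free(a->base);                     /* stand-in for FreeMem() */
    a->base = NULL;
    a->size = 0;
    a->used = 0;
}
```

On an Amiga, arena_init() would use AllocMem() with the appropriate
MEMF_ flags (chip memory for anything the custom chips must see), and
arena_release() would hand the block back with FreeMem().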

>Just my $0.02
>Andrew Vanderstock
>s902255@minyos.xx.rmit.oz.au
---
 Shawn L. Baird, bairds@eecs.ee.pdx.edu, Wraith on DikuMUD
 The above message is not licensed by AT&T, or at least, not yet.

mykes@amiga0.SF-Bay.ORG (Mike Schwartz) (03/29/91)

In article <1991Mar27.211819.19370@neon.Stanford.EDU> espie@flamingo.Stanford.EDU (Marc Espie) writes:
>In article <mykes.0362@sega0.SF-Bay.ORG> mykes@sega0.SF-Bay.ORG (Mike Schwartz) writes:
>>In article <23837@well.sf.ca.us> farren@well.sf.ca.us (Mike Farren) writes:
>>>mykes@sega0.SF-Bay.ORG (Mike Schwartz) writes:
>[stuff deleted]
>??? hey ! There aren't only Amiga 500s with 512K around.
>
Anything correctly written for the 512K Amiga runs on all Amigas.

>How about checking the configuration of the machine you use ?
>If your game is designed for 512K, maybe (just maybe) you could preserve
>the OS when running on >1Meg machines. If you're really good, you can
>even check the speed of the processor and leave part of the OS running...
>How about a download on your a3000 while you're playing an arcade game ?
>Since the blitter has a 32 bit path, this means that *even* coprocessor
>stuff runs twice as fast as on a 500. Plenty of time left...
>F18, for instance, works much better on >1.5meg machines. And it runs with
>a 68030 at an impressive speed...
>
>>
>>The ROM Kernel routines have many many bugs in them that you end up
>>programming your way around.  It is not lazy to want to avoid the
>
>Disregarding the fact that there aren't so many bugs around (if you're
>used to game machines, maybe you don't respect multitasking rules :-),
>Commodore is working on 2.0, which will be more stable and better...
>and break most arcade type games.
>
>What about 68030 and MMU ? Are they a bug ? What the heck, perfectly
>reasonable self-modifying code BREAKS when running under them.
>

Self-modifying code breaks even if you ARE using the OS.  Commodore has
been pretty clear about what you shouldn't do, either when using the
OS or when taking over.

>
>Most of the code I've seen in Assembler does atrocious things and is
>fairly unreadable. Code written in C tends to be cleaner... 
>And I know some people who don't know ANYTHING about high-level
>languages and Operating Systems. For them, assembler is the easiest way.
>

I know lots of people that don't know assembler and think of themselves
as great programmers.  If they don't understand what it means to use
assembler, their opinions carry little weight.

>Assembler vs C: compare IFF SMUS and Soundtracker. On one hand, you have
>a fairly reasonable format, not incredibly efficient. On the other hand,
>you have a highly efficient memory dump, with major problems:
>- VERY bad design. Features like sample length/repeat stuff should be
>stored with the sample. Several different versions exist, all
>subtly incompatible with each other.
>- not flexible. Does it support IFF samples ?
>Does it support Fibonacci compression ?
>- hardware dependent... the difference between NTSC/PAL kills it.
>

IFF SMUS is a poor format.  I suggest you look at standard MIDI file
format.  It is much better.  The best music drivers for the Amiga
don't use SMUS.  The most popular music drivers for the Amiga do,
unfortunately.  Fibonacci compression is guaranteed to distort the
audio of any sample used with it.
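
To see where that distortion comes from, here is a simplified sketch of
Fibonacci-delta coding using the delta table from the 8SVX
specification.  Each sample is stored as a 4-bit code for the change
from the previous output, so any jump larger than the table allows has
to be approached over several samples.  The greedy encoder, the
one-code-per-byte layout and the zero starting value are simplifications
made for this example; the real format packs two codes per byte and
stores the initial value in a short header.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Delta for each 4-bit code, as given in the 8SVX specification. */
static const signed char fib_delta[16] = {
    -34, -21, -13, -8, -5, -3, -2, -1, 0, 1, 2, 3, 5, 8, 13, 21
};

/* Greedy encoder (an assumption of this sketch): for each sample, pick
 * the code that brings the decoder's running value closest to it. */
void fib_encode(const signed char *in, size_t n, unsigned char *codes)
{
    int x = 0;                  /* mirrors the decoder's running value */
    size_t i;

    assert(in != NULL && codes != NULL);
    for (i = 0; i < n; i++) {
        int best = 8;           /* code 8 is delta 0, always legal */
        int best_err = INT_MAX;
        int c;

        for (c = 0; c < 16; c++) {
            int y = x + fib_delta[c];
            int err;

            if (y < -128 || y > 127)
                continue;       /* keep the output in 8-bit range */
            err = abs(in[i] - y);
            if (err < best_err) {
                best_err = err;
                best = c;
            }
        }
        codes[i] = (unsigned char)best;
        x += fib_delta[best];
    }
}

/* Decoder: accumulate the deltas named by the codes. */
void fib_decode(const unsigned char *codes, size_t n, signed char *out)
{
    int x = 0;
    size_t i;

    assert(codes != NULL && out != NULL);
    for (i = 0; i < n; i++) {
        x += fib_delta[codes[i] & 15];
        out[i] = (signed char)x;
    }
}
```

A slow ramp such as 0, 1, 2, 3 comes back unchanged, while a step from
0 to 100 decodes as 21, 42, 63, 84, 97, 100: the edge is smeared over
six samples.  How audible that is depends on how much fast-moving
content the sample has.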

Is Soundtracker PD?  You get what you pay for.

>>
>>Have you ever taken over the Amiga?  I bet if you did, you'd change
>>your tune.  I on the other hand have done things both ways (using the
>>OS and taking over), and the power you gain by taking over more than
>>offsets the capability to multitask your game with other programs.
>>Open your mind and give it a try, then we can really have a productive
>>disagreement. 
>
>Again: you can take over and keep the OS in a corner, that is easy to
>do. You can even partly take over and leave os stuff running.
>

There are no corners to hide the OS in on a 512K machine.

>Ok, what I've said doesn't apply to all games. But I'm sick of seeing
>many games breaking the OS for gratuitous purposes. Why is
>Shadow of the Beast so badly protected ? Why does full metal planet
>take over the machine ? Why does Populous crash so often ?
>Why do I spend so much time getting BROKEN games to run ? installing
>bootblocks, patching code, trying things out ?
>Why have so many games decided the joystick should go into port 0 ?
>Is it so difficult to add an option (and leave my mouse where it is) ?
>

A pirate party was just busted in France.  The police confiscated over
$1 Million in pirated games.  These parties go on all the time, here
in America and abroad.  I don't like copy protection, myself, but I
understand the rationale behind why people use it.

Populous is a port from the PC.  As long as Amiga marketing data for
software is dismal, big companies won't make the investment required
to make truly awesome Amiga games.  The only companies that do make
Amiga originals that survive are European.

>I can understand that you feel personally attacked by what Mike Farren
>says. So, alright, you're a nice little programmer. Now, look at
>existing games... Don't you feel a little uneasy about some of them ?
>

I do not feel personally attacked by what Mike Farren says.  I am only
concerned that he is closing off a valid approach to making games.

Yes, I feel a lot uneasy about many games on the Amiga.  I can't stand
it when I see an action game written in 'C'.  It is easy to tell which
games are done this way, and they suck.  

>>********************************************************
>>* Appendix A of the Amiga Hardware Manual tells you    *
>>* everything you need to know to take full advantage   *
>>* of the power of the Amiga.  And it is only 10 pages! *
>Which is the surest way to break things. You should
>*at least* read the recommendations at the beginning of the RKM.
>>********************************************************
>
>	Marc Espie (espie@flamingo.stanford.edu)

--
********************************************************
* Appendix A of the Amiga Hardware Manual tells you    *
* everything you need to know to take full advantage   *
* of the power of the Amiga.  And it is only 10 pages! *
********************************************************

mykes@amiga0.SF-Bay.ORG (Mike Schwartz) (03/29/91)

In article <2108@pdxgate.UUCP> bairds@eecs.cs.pdx.edu (Shawn L. Baird) writes:
>s902255@minyos.xx.rmit.oz.au (Andrew Vanderstock) writes:
>
>> [ Andrew mentions people who have non-autoconfig RAM ] Try to recognise
>> the ram in the bay door slot. A lot of people have this ram, but because a
>> lot of games have boot-block loaders, they can't configure the ram. I
>> would let the user run AmigaDos to get the ram added, and keep all nice
>> references for this ram (ie where is it? what is it? chip? slow? how
>> much?)
>
>One good way, perhaps, to do this is to use the loader (via BSS hunks) or
>AllocMem() before you take over the system. I would guess, correct me if
>I'm wrong, most games would not need to make further allocations during
>the course of the game. That is, take all that you'll need immediately and
>hold it until the user quits or pauses.
>
>>Just my $0.02
>>Andrew Vanderstock
>>s902255@minyos.xx.rmit.oz.au
>---
> Shawn L. Baird, bairds@eecs.ee.pdx.edu, Wraith on DikuMUD
> The above message is not licensed by AT&T, or at least, not yet.

At bootloader time, the OS is running, and all the memory is configed.
This is the IDEAL time to make all the OS calls you need, like to find
out if there is extra memory, if there is a fast CPU, etc.  Non-autoconfig
stuff is stoneage stuff for the Amiga.

elg@elgamy.RAIDERNET.COM (Eric Lee Green) (03/29/91)

From article <mykes.0362@sega0.SF-Bay.ORG>, by mykes@sega0.SF-Bay.ORG (Mike Schwartz):
> In article <23837@well.sf.ca.us> farren@well.sf.ca.us (Mike Farren) writes:
> The Amiga operating system is not a high performance video game operating
> system.  BOBs are slower than what I use.  Intuition takes 30% of the CPU

First of all, I've never used a BOB in my entire life. When I needed
high performance graphics, I did what any self-respecting Amiga programmer
does -- I opened myself up my very own screen with its very own bitmap,
and went straight to the hardware. Totally system legal. ObtainBlitter()
and such.

Of course, doing a hi-res structured drawing program is a bit different
from doing a video game... most folks wouldn't try to run a drawing
program on a 512K Amiga with 1 floppy.

> unnecessary and way too slow.  Exec tasks require a minimum of 2K of stack
> each, while any game I ever do needs only 512 bytes of stack for 80 tasks
> under my own kernel.

Hmm. Matt Dillon has done some work on how much stack an Exec task
requires. Basically, Exec tasks require a) enough stack to save the
processor state, and b) enough stack to satisfy your program's need for
stack. 1K stack is fine for all Amiga machines. On a 68000, 512 bytes of
stack quite suffices for Exec (since a 68000 doesn't store mega-bunches of
state on the stack when it receives an interrupt).

> You mention in your lecture/article stream that the OS steals 80K of
> RAM (that is almost 20% of what you get on a 512K machine).  Even if I
> don't need that 80K for the game, I can always find something useful
> to do with it (like adding instrument samples for the music driver).

This is the only valid argument in your article. But why not give a choice?
How about grab all the memory you can grab from the system, shut down the
system in an Amiga-friendly manner (i.e., as published by Commodore in
their DevCon notes), then if there STILL isn't enough memory (i.e., only
512K in the machine), *THEN* hose that remaining 80K of RAM? That is, on
expanded machines the OS could be re-started when the game was suspended or
etc., while only on 512K machines would you have to give up everything.

> The ROM Kernel routines have many many bugs in them that you end up
> programming your way around.  It is not lazy to want to avoid the

Huh? You don't use many ROM Kernel routines if you're doing high
performance graphics on the Amiga. Basically, you go straight to the
hardware. Period. But there's Amiga-friendly ways to do that. (Though those
ways may not suffice for a fast-action game on a 512K Amiga).

> The only thing that the OS gets for you is the ability to use hard
> disks.  If commodore were smart, they would make ROM routines
> accessible for video games to access the hard disk when the OS is not
> running.  This is really what the machine needs.

I'm aghast. Device drivers run as Exec tasks. They *CAN'T* be put into
stand-alone ROM routines. Most device drivers rely upon having the Amiga
interrupt handler structure intact, rely upon sending messages from
interrupt handler to device driver, rely upon a myriad of other features of
the Amiga operating system (features that you've apparently never used,
but...). They simply CAN'T operate when the operating system is hosed. At
least, not as they are supplied by the major hard drive controller
manufacturers.

If you really want, go talk to the major hard drive controller
manufacturers about creating a "standard" for direct access. They're the
ones responsible for providing drivers for their hard drives. Their
currently supplied Exec drivers are obviously unusable once you've nuked
the machine.

> Most people who program the Amiga don't have the ability (or gumption)
> to write in assembler language.  These are the people who I would

Hmm... that's an interesting allegation. I know that I'm fairly decent in
68000 assembler (started out with Z-80's and 6502's, quite a while ago),
but still prefer to mostly program in "C". Actual graphics routines that
draw into the bitmap must, of course, be in assembler in any real product
(I've seen some "productivity software" where I'd swear that they'd written
their user interface code in interpretive BASIC, though... sigh).

> call lazy.  Most people who program the Amiga don't have the ability
> to write their own native operating systems that outperform the ROM
> Kernel, so for them the OS is the only choice.  In my case, I write

Hmm. I could easily write my own multitasking kernel. I've never had the
need to, but I could do it. Old hat. I could even make it faster than
Commodore's, by stripping out every feature not essential to my project,
things like the windowing environment, generalized prioritized task queues
(OS has to look at all those priority #'s to decide how to schedule
things), etc. Of course, the resulting OS would be useless as a general
computing environment. It's hard to find a microcomputer OS better than the
Amiga's OS for a general computing environment.

> Have you ever taken over the Amiga?  I bet if you did, you'd change
> your tune.  I on the other hand have done things both ways (using the
> OS and taking over), and the power you gain by taking over more than
> offsets the capability to multitask your game with other programs.

I'm not a games programmer, thankfully. Probably never will be, since I do
not like arcade-style games and thus would be a lousy programmer of such
(it is difficult to do good work in a field that you don't like, and I like
doing good work). I've "taken over" other machines, and may soon be working
on microprocessor-based RTU's (remote telemetry units) for the oilfield, so
I'm well acquainted with low-level programming. But the Amiga's OS is one of
the few things differentiating it from "mere" game machines, and it strikes
me that tossing it aside is a Bad Idea if it is in any way avoidable.

--
Eric Lee Green   (318) 984-1820  P.O. Box 92191  Lafayette, LA 70509
elg@elgamy.RAIDERNET.COM               uunet!mjbtn!raider!elgamy!elg
 Looking for a job... tips, leads appreciated... inquire within...

m0154@tnc.UUCP (GUY GARNETT) (03/30/91)

[As much as I've been trying to ignore this discussion, now I'm going
to open my big mouth ...]

I, too would like to see sources for both permanently taking out the
operating system (for a high performance game), and for suspending and
resuming multitasking properly.  How about a list of trade-offs (what
parts of the OS you can and can't use, and how v2.0 impacts the whole
scheme).  This would be useful information, no matter which "side" of
the discussion you are on.

I, too have been involved with writing high-performance games on the
Amiga (no, you won't find my name in any credits; I was a technical
advisor and algorithm guru rather than a programmer --- most of the
programmers I worked with were high-school or college kids who taught
themselves everything; lots of raw talent, and most of them hadn't
the foggiest idea how to figure out a blitter minterm, so I taught
them).  At the time, the only way we knew of to get effective arcade
games was to kill the OS at boot-up time; I would have preferred to be
able to suspend and then resume the OS, but couldn't figure out a way
to do it without causing a crash later on.  Most of the 
"whiz-kids" working with me thought of the OS as an obstacle anyway.

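For readers who have not met blitter minterms: the blitter combines three
source bits A, B and C through an 8-entry truth table, and the minterm byte
is simply that table packed into eight bits.  The following is a minimal
sketch in portable C, not code from any of the games discussed; the names
minterm_for, copy_a and cookie_cut are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* A blitter logic function maps the source bits A, B, C to the
   destination bit D (0 or 1). */
typedef int (*blit_logic_fn)(int a, int b, int c);

/* Build the 8-bit minterm byte: bit number (A<<2 | B<<1 | C) holds the
   output for that combination of inputs. */
uint8_t minterm_for(blit_logic_fn fn)
{
    uint8_t lf = 0;
    for (int combo = 0; combo < 8; combo++) {
        int a = (combo >> 2) & 1;
        int b = (combo >> 1) & 1;
        int c = combo & 1;
        int d = fn(a, b, c);
        assert(d == 0 || d == 1);
        if (d)
            lf |= (uint8_t)(1u << combo);
    }
    return lf;
}

/* Straight copy: D = A. */
int copy_a(int a, int b, int c)
{
    (void)b;
    (void)c;
    return a;
}

/* "Cookie cut": where mask A is set take image B, elsewhere keep
   background C. */
int cookie_cut(int a, int b, int c)
{
    return a ? b : c;
}
```

With these definitions, minterm_for(copy_a) gives $F0 and
minterm_for(cookie_cut) gives $CA, the two values most blitter code ends
up using.
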
On the other hand, games like Sim-City and Lemmings have no real use
for that kind of environment.  I agree that the game comes first,
*BUT* if you don't need total control, *DON'T* take it.  With the OS
comes a lot more flexibility (HD installation, a real file system,
multitasking, and lots of "Wow!  What a neat game, and it multitasks,
TOO!").  Take what you need, but be sure you need it before you take
it.

There is no excuse for software to break on 68010, 68020, and 68030
machines, and too many programs (most of them games, but does anybody
remember TDI Modula-2?) break on accelerated systems.  Be aware of
what you are doing when you write your code, and make it upward
compatible.  For timing, use one of the many timebases supplied by the
OS, or if you have killed it, then program one of the CIA's directly. 
Pay attention to compatibility; it will ensure that royalties come
trickling in for years to come, instead of months.  
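
As a rough illustration of CPU-independent timing: the CIA timers count at
the E clock, one tenth of the CPU clock, which is about 709379 Hz on PAL
machines and 715909 Hz on NTSC machines regardless of which processor is
fitted.  The sketch below only computes the 16-bit count for a given delay;
the function name is an assumption, and actually loading the timer and
waiting for it is left out.

```c
#include <assert.h>
#include <stdint.h>

/* CIA timer input frequencies (E clock). */
#define CIA_HZ_PAL  709379UL
#define CIA_HZ_NTSC 715909UL

/* Return the 16-bit timer count for a delay of `usec` microseconds,
   rounded to the nearest tick.  Returns 0 if the delay is too short or
   too long for a single 16-bit timer. */
uint16_t cia_ticks_for_usec(unsigned long cia_hz, unsigned long usec)
{
    assert(cia_hz > 0);
    /* 64-bit intermediate so the multiplication cannot overflow. */
    uint64_t ticks = ((uint64_t)cia_hz * usec + 500000u) / 1000000u;
    if (ticks == 0 || ticks > 0xFFFFu)
        return 0;
    return (uint16_t)ticks;
}
```

A one millisecond delay comes out as 709 ticks on PAL and 716 on NTSC, so
code that cares about exact times has to know which kind of machine it is
on, but it never has to know how fast the CPU is.
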

*NOT* learning how the OS works is a kind of intellectual laziness,
even if you take the effort to "roll your own".  There is always a
programmer or program out there who can show you a trick or two, and a
lot of clever people spent a lot of time working on the AmigaOS (I
have nothing but respect for people like -=RJ=- and the rest of the
Amiga crew).  The OS is decidedly *NOT* full of bugs (and if the last
time you looked at it was in v1.1, take another look!) and can be your
ally, if you learn to control it ("Use the OS, Luke ... Use the OS!"
;-)

I'm not trying to flame or put down anyone, but better games than the
current crop can still be written!  Better both in terms of
awesome-take-over-the-machine-graphics, and better in terms of
awesome-playability-and-multitasking-code.

I suggested earlier that Mike Farren expand and polish his articles,
include source code examples which make a complete, small game, and send
the whole thing off to AC's Tech.  I still feel very strongly that this
should be done.  I also would like to encourage Mike Schwartz to do
the same: write up his techniques for taking over the system and
programming down to the hardware, include sample code, and send it
off to one of the technical magazines (like AC's Tech).

Wildstar

farren@well.sf.ca.us (Mike Farren) (03/30/91)

It occurs to me that there is a distinct difference of outlook which is
the basis of the disagreements here.  Mike and Stephan look at the
Amiga, as far as I can tell, as a game machine which just happens to be
a computer system as well.  Both of them are concentrated pretty much
solely on getting the maximum bang from the hardware.  Both of them
want to use every single resource that is available to make their games
snazzier, more impressive, flashier, and quicker.  And both of them
are probably good at what they do - at least, my email from Mike would
indicate that it's certainly true of him, and, by reputation, Stephan
as well.

I happen to disagree with their outlook very strongly, because it is an
outlook which inherently limits what you can do with the Amiga.  It
puts limits on the game, it puts limits on the user, and it puts limits
on the sales - I, for one, will think long and hard before buying any
game which shuts down my system, although I did buy Lemmings because it
was just too good to ignore.  Limits is what it's all about - and I feel
that the fewer limits you accept, and the more you try and open up the
limits you do have to work under, the better off we all will be.

It's all in how you look at it. I propose that if you approach the game design
with the attitude "I won't take anything away from the OS unless I absolutely,
positively, without a doubt have to", and utilize your skill and cleverness
as a programmer to making that so, that you will find that the vast majority
of games will NOT require sacrificing the OS.  All that I am asking of any
programmer is that they work from that basis - not the opposing one which
seems to start out with the assumption that you WILL take over the machine,
an assumption which makes any other choice much harder to implement.

The thing is that despite the protestations of the "take over the machine"
folks, I just haven't seen all that many games for the Amiga which, when
I looked at them closely, really required the entire machine under any and
all circumstances.  Yes, there are some - things like Turrican or any of
the other very fast shoot-em-ups.  But those games are the minority.  If
you look at the Top Ten list of games, you don't see shoot-em-ups on there
very often.  Much more popular are things like SSI's AD&D series, Populous,
SimCity/SimEarth, and the like - games which absolutely and positively do NOT
require a complete takeover of the machine, even though many of them do.

I called the refusal of some programmers to even consider operating in an
Amiga-friendly way "laziness", and I'll stick to that.  If they've done
their homework, as Mike and Stephan seem to have done, and have come to
a rational conclusion that trashing the OS and taking over the machine is the
only way that they can do the game they want to do, then that isn't
laziness.  If, however, they've just taken over the machine because it's
more convenient for them or because they just don't want to do the hard
work necessary to make a game Amiga-friendly, then it's laziness, no matter
how hard they work to get their custom disk loaders or graphics routines
or whatever.

-- 
Mike Farren 				     farren@well.sf.ca.us

mykes@amiga0.SF-Bay.ORG (Mike Schwartz) (03/30/91)

Sorry to paste this whole thing again, but it is the best article
done so far.

In article <781@tnc.UUCP> m0154@tnc.UUCP (GUY GARNETT) writes:
>
>[As much as I've been trying to ignore this discussion, now I'm going
>to open my big mouth ...]
>
>I, too would like to see sources for both permanently taking out the
>operating system (for a high performance game), and for suspending and
>resuming multitasking properly.  How about a list of trade-offs (what
>parts of the OS you can and can't use, and how v2.0 impacts the whole
>scheme).  This would be useful information, no matter which "side" of
>the discussion you are on.
>

No sources will be posted here, but I have done a small game (in 3
days in assembler) that I do intend to publish source code to 
at a future date.  And believe it or not, it does not kill the OS
to the point that it can't be restored.  In other words, it runs from
DOS and returns to DOS, but it does NOT multitask.

But, a small description of how I do things.  I always develop for
an Amiga 500 with 512K of RAM.  My 500 has 1Meg and the old Agnus
and 1.2 ROMs.  This configuration is guaranteed to give you the
basic features of the vast majority of Amigas around, and I get to
see the game perform exactly as the end user will, including floppy
disk access, the whole time I develop.

When the Amiga first boots, it asks for a workbench disk.  If you
have an autobooting hard drive and a bootable floppy is inserted,
the machine will boot from the floppy.  In any case, the ROM Kernel
loads what is called the boot program from track 0 of the floppy
disk into RAM and does a JSR to it.  The standard Amiga OS bootsector
program simply opens dos.library and then does an RTS and the system
continues to boot up into the normal Operating System.

Well, I wrote my own boot sector program that just doesn't return
to the OS.  In this boot sector program, I call several ROM Kernel
routines, but this is the ONLY time when they will be available to
be called.  Upon entry to the boot sector program, Exec is already
running, as is trackdisk.device, and Commodore was even nice enough
to make the A1 register point to an already opened IORequest structure
for use with trackdisk.  I use AllocMem to allocate enough memory to
hold 2 tracks worth of code and use the IORequest to read 2 tracks
into this memory.  I also make sure that the Allocated memory is
above $40000.  I also use OS calls to find out how many floppy drives
are connected, where any FAST memory is, and what kind of CPU the
machine has.

The two tracks that I read in contain an 8K kernel of code that replaces
ALL of the ROM Kernel routines that are needed for a game.  Consider it
a BIOS of sorts.  In addition to this 8K of KERNEL code, there is another
12K of floppy disk drivers, because I will not have the operating system
running to read any further data from the floppies.  As soon as the 2
tracks are read in, I turn off interrupts and dma and the OS is officially
dead.  I then jump to the beginning of the allocated block which contains
the KERNEL.  The first thing the Kernel does is to copy itself down to low
memory ($200).

The Kernel initialization installs all the low memory vector handlers
I want to use, including all those nasty GURU vectors, VBlank, Copper,
Audio, CIA, etc.  To make things easy to debug, I have 2K of code that
is part of the Kernel that allows debugging out the serial port at
4*57.6K baud, but these routines are conditionally assembled in so I can
ship a version of the game without the debugging kernel.  I should
also point out that immediately after copying itself down to low memory,
the Kernel puts itself into supervisor mode (benefits described below).

When the game normally boots, the trackdisk routines in the kernel are
used to load the actual game code into memory and the kernel jumps to
it and everything is hunky dory.

So what are the benefits of taking over?  

Well, I am guaranteed that ALL 512K are mine to use.  I can ORG any 
graphics, code, or audio data at any hard coded location I want.  This
practice allows things like blitter routines to have hard coded constant
addresses in them, which saves CPU cycles where you need them the most.

I can put graphics screens and the stack anywhere I want.  For a 16 
color game, I put the stack at $80000 and a screen at $78000.  The 
stack needed for any program I write is < 512 bytes.  The resulting 
memory map gives me from $100 to $78000 to squeeze the game into.  And
I do mean squeeze.
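
The arithmetic behind that memory map can be checked with a few lines of C.
This is a sketch under one assumption that the post does not state: that
the 16 color screen is 320x200 in 4 bitplanes.  The three addresses are the
ones given above; the function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_TOP   0x80000UL  /* stack grows down from the top of 512K */
#define SCREEN_BASE 0x78000UL  /* screen address given in the post */
#define GAME_BASE   0x00100UL  /* start of the space left for the game */

/* Bytes in a planar screen: one bit per pixel per bitplane. */
unsigned long screen_bytes(unsigned width, unsigned height, unsigned planes)
{
    assert(width % 8 == 0);
    return (unsigned long)(width / 8) * height * planes;
}

/* Bytes left between the end of the screen and the top of the stack. */
unsigned long stack_room(unsigned long screen_size)
{
    assert(SCREEN_BASE + screen_size <= STACK_TOP);
    return STACK_TOP - (SCREEN_BASE + screen_size);
}

/* Bytes available for code and data below the screen. */
unsigned long game_room(void)
{
    return SCREEN_BASE - GAME_BASE;
}
```

Under that assumption the screen takes 32000 bytes, which leaves 768 bytes
above it for the stack (consistent with a stack of under 512 bytes) and
491264 bytes below it for everything else.
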

EVERY single instruction that ever gets executed is my own code.  When
I single step through routines, I get symbolic information for every
single instruction.  I never see jsr offset(a6) and wonder what the
heck is going on.  When I do use the OS and step partway into one
of the ROM routines, I am appalled by how ugly and inefficient the
code is.  When I write the code myself, I am in full control of
every clock cycle and byte that is used by the program.

Since I am in supervisor mode, there are NO illegal instructions that
can be executed (privilege violations cause a GURU under the OS).  The
User Stack Pointer (USP) also can be used as a quick place to save
an address register (this is 2x faster than a push on the stack).
The upper byte of the status register is available to disable various
levels of interrupts (the INTENA on Paula is just as useful), and
the TRACE bit is available for debugging purposes.

Since my program is the only one running (i.e. no multitasking OS),
there are lots of programming techniques that violate normal programming
practices that become valid.  For example, you can busy wait for blitter
finished (if you need to) without being considered a HOG.  Another trick
I like to use is to put the blitter into NASTY mode ALL of the time.
This effectively STOPS the CPU (even a 68030) when it accesses chip
memory, until the blitter is finished.  Without blitter nasty, which
is how the OS works, blits take at least twice as long to perform.
Another technique that is not valid under the OS is to set up blitter
registers on a semipermanent basis.  By doing this, you only need to
store 2 or 3 blitter registers to start each blit instead of 14.
In order to get the most performance out of the Amiga, you should keep
the blitter busy almost all the time.

VBlank is a particularly precious time period, especially if you are
using a graphics mode that steals CPU/Blitter time.  If you can do
all of your blits to the screen during VBLank, you don't need to do
double buffering.  When the OS is active, a huge amount of VBLank time
is used by the standard OS handler because it has to handle server
chains, etc.  I want every single clock cycle I can get during this
time.  It is important to note that on a PAL machine, VBL is a bit
longer than for NTSC.  Those Europeans get all the breaks :)  

I also get the benefit of putting my variables in low memory.  You
see, the 68000 allows you to use absolute short addressing mode
to access these, which frees a register for other things.  People
have ragged about this being useless, but if you note the tone of
what I am writing about, I am saving/shaving every clock cycle out
of everything I can find.  I am SEEKING the best performance possible.
The register that would normally point into your variables under the
OS (Manx uses A4...) I point at $dff000 so I get fast access to the
hardware registers all the time (including interrupt handlers).
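
To make the low-memory point concrete: absolute short addressing stores
only a 16-bit word in the instruction and sign-extends it, so it can reach
just the bottom 32K of the address space (and the top 32K).  The sketch
below models that rule in C; the function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address produced by absolute short mode: the 16-bit
   extension word is sign-extended to 32 bits. */
uint32_t abs_short_ea(uint16_t word)
{
    return (word & 0x8000u) ? (0xFFFF0000u | word) : word;
}

/* Nonzero if `addr` can be encoded in absolute short mode. */
int fits_abs_short(uint32_t addr)
{
    return addr <= 0x00007FFFu || addr >= 0xFFFF8000u;
}
```

Variables placed just above $200 pass fits_abs_short(), while the hardware
registers at $dff000 do not, which is why a base register is still needed
for those.
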

The floppy disk drivers I wrote use a single 10K buffer to handle as
many disk drives (up to 4) as may be configured.  The OS routines
will steal 40K if you have 4 drives (10K per drive).  My drivers read and
write standard trackdisk format, so the floppies can be copied with
DiskCopy (or by dragging the icon on a blank disk icon) under the OS
(to allow users to make as many backups as they want).  They provide
enhanced performance a few ways.  One way is that the blitter is in
nasty mode, so encoding/decoding the MFM data is as fast as possible.
When data is written to diskette, it is arranged so that NO extra disk
revolutions will be made during readback.  This is done by timing
the read and write routines so that by the time a track has been read
in and the head stepped, the next start of track is under the head
and ready to go.  The routines also make use of the DSKSYNC capability
of the drives, which the OS routines don't (under 2.0 they probably
do).  The routines use a CIA timer to get perfect timing, no matter
what processor.  Try popping the disk out with the disk light on (it
works).  Try ejecting the disk in the middle of a load and put it
into a different drive (it works).  Try that with the OS and watch
your disk go bad in 1 second, thanks to the disk validator.
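
For anyone unfamiliar with what the MFM step involves, here is a sketch of
the bit manipulation in plain C.  It is not the blitter-based code described
above, just a CPU illustration: data is split into odd and even bits (the
layout the standard trackdisk format uses), and a clock bit is inserted
between data bits only where both neighbours are 0.  All names here are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MFM_DATA_MASK 0x55555555u  /* data bits sit in the even positions */

/* Split a longword into its odd-numbered and even-numbered bits; each
   half ends up in the data-bit positions of its own longword. */
void mfm_split(uint32_t value, uint32_t *odd, uint32_t *even)
{
    assert(odd != 0 && even != 0);
    *odd  = (value >> 1) & MFM_DATA_MASK;
    *even = value & MFM_DATA_MASK;
}

/* Insert clock bits.  A clock bit is 1 only when the data bits on both
   sides of it are 0.  `prev_bit` is the last data bit of the longword
   written just before this one. */
uint32_t mfm_add_clock(uint32_t data, unsigned prev_bit)
{
    data &= MFM_DATA_MASK;
    uint32_t before = (data >> 1) | ((uint32_t)(prev_bit & 1u) << 31);
    uint32_t after  = data << 1;
    uint32_t clock  = ~(before | after) & ~MFM_DATA_MASK;
    return data | clock;
}

/* Decode: drop the clock bits and interleave the two halves again. */
uint32_t mfm_merge(uint32_t odd, uint32_t even)
{
    return ((odd & MFM_DATA_MASK) << 1) | (even & MFM_DATA_MASK);
}
```

Because every step is a shift, a mask and an OR over whole longwords, the
same operations map naturally onto blitter passes over a track buffer.
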

I NEVER need to do dynamic memory allocation, so my memory never fragments.
Under the OS, if an application doesn't respond quickly enough to 
Intuimessages, or you have enough windows opened, the OS starts allocating
memory and never frees it up.  And the OS has serious problems with
low memory situations (mostly it gurus).  

I implement my own BOBs and Multitasking routines.  I preallocate
enough memory to hold 80 task structures and 80 bob structures by just
reserving part of the memory map for them.  It takes exactly 4 instructions
to allocate a task or bob structure and 4 to free it up.  Typically, every
OBJECT in the game is implemented as both a bob and a task.  The task
code is a finite state machine that controls the animation and movement
of the object in the game.  Anytime you fire a bullet, a new BOB and
new task is created, for example.  A task switch takes about 10 instructions.
My tasking scheme easily supports 80 tasks getting a slice of the CPU
in a 60th of a second.  The BOBs system is dependent on the number and
size of the BOBs, naturally.  All tasks share the same stack, which again
is < 512 bytes for the whole game.
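
The scheme described here can be approximated in C as a fixed pool with a
free list, plus tasks that are state functions run once per frame.  This is
only a sketch of the idea, not the kernel from the post: the structure
fields, the function names and the bullet example are all assumptions made
for illustration.  Since each state function returns before the next task
runs, every task can share a single stack.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TASKS 80

struct task;
typedef void (*task_state_fn)(struct task *self);

struct task {
    struct task  *next;   /* free-list link while the slot is unused */
    task_state_fn state;  /* current FSM state; NULL means slot is free */
    int           x;      /* example per-object data */
    int           ticks;
};

static struct task  task_pool[MAX_TASKS];
static struct task *free_list;

/* Thread every slot of the preallocated pool onto the free list. */
void pool_init(void)
{
    free_list = NULL;
    for (int i = MAX_TASKS - 1; i >= 0; i--) {
        task_pool[i].state = NULL;
        task_pool[i].next = free_list;
        free_list = &task_pool[i];
    }
}

/* Allocation is just a pop from the free list. */
struct task *task_alloc(task_state_fn start)
{
    assert(start != NULL);
    struct task *t = free_list;
    if (t == NULL)
        return NULL;
    free_list = t->next;
    t->state = start;
    t->x = 0;
    t->ticks = 0;
    return t;
}

/* Freeing is a push back onto the free list. */
void task_free(struct task *t)
{
    assert(t != NULL);
    t->state = NULL;
    t->next = free_list;
    free_list = t;
}

/* Example state: a bullet moves for three frames, then finishes. */
void bullet_fly(struct task *self)
{
    self->x += 4;
    if (++self->ticks == 3)
        self->state = NULL;  /* finished; the scheduler frees the slot */
}

/* Give every live task one slice; returns how many are still alive. */
int run_frame(void)
{
    int alive = 0;
    for (int i = 0; i < MAX_TASKS; i++) {
        struct task *t = &task_pool[i];
        if (t->state == NULL)
            continue;
        t->state(t);
        if (t->state == NULL)
            task_free(t);
        else
            alive++;
    }
    return alive;
}
```

Firing a bullet is then task_alloc(bullet_fly), and the main loop calls
run_frame() once per vertical blank.  No memory is ever requested from
anywhere else, so nothing can fragment.
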

Playtesting is a piece of cake.  You don't have to try too many hardware
configurations of Amigas to see how compatible the code is.  There
are only a few software configurations to check out, too (like 1.0, 1.1,
1.2, 1.3, and 2.0).  If the game were written under the OS, you'd have
nightmares testing all the possible software configurations.  For example,
does the game work with GOMF installed?  How about GOMF and DMouse?  How
many possibilities do you see?  I see bazillions :)  And what if you
allow multitasking and some CHIP RAM pig program (like DPaint) is already
running?  GURU.  And you need to test with 4 floppy drives under the OS,
just to make sure the OS hasn't taken more memory than you can allow.

>I, too have been involved with writing high-performance games on the
>Amiga (no, you won't find my name in any credits; I was a technical
>advisor and algorithm guru rather than a programmer --- most of the
>programmers I worked with were high-school or college kids who taught
>themselves everything; lots of raw talent, and most of them hadn't
>the foggiest idea how to figure out a blitter minterm, so I taught
>them).  At the time, the only way we knew of to get effective arcade
>games was to kill the OS at boot-up time; I would have preferred to be
>able to suspend and then resume the OS, but couldn't figure out a way
>to do it without causing a crash later on.  Most of the 
>"whiz-kids" working with me thought of the OS as an obstacle anyway.
>

No sh*t sherlock!  The blitter is a very powerful coprocessor and
is no piece of cake to learn.  On the other hand, the OS routines
are PIG slow.  In many cases, you have to restore the hardware to
a state that the OS needs so the system won't crash.  Once you start
using the floppy disk hardware directly, for example, you must put
the CIAs back into a state that the OS wants them in.  What state is that?
The ROM Kernel manuals LIE.  Have fun finding out what page they lie on,
because there is NO index.

>On the other hand, games like Sim-City and Lemmings have no real use
>for that kind of environment.  I agree that the game comes first,
>*BUT* if you don't need total control, *DON'T* take it.  With the OS
>comes a lot more flexibility (HD installation, a real file system,
>multitasking, and lots of "Wow!  What a neat game, and it multitasks,
>TOO!").  Take what you need, but be sure you need it before you take
>it.
>

I agree with this 100%.  If you don't need to take over the machine,
don't.  If you want to push the machine to its limits, there is NO
other way.  Let the game come first.  If you know you can do the
game in a small amount of RAM and that performance is not an issue,
go ahead and use the OS.  In my approach, if I have RAM left over,
I use it for more sounds or instrument samples to make the music
better, or to cache more data from the floppy drives.

>There is no excuse for software to break on 68010, 68020, and 68030
>machines, and too many programs (most of them games, but does anybody
>remember TDI Modula-2?) break on accelerated systems.  Be aware of
>what you are doing when you write your code, and make it upward
>compatible.  For timing, use one of the many timebases supplied by the
>OS, or if you have killed it, then program one of the CIA's directly. 
>Pay attention to compatibility; it will ensure that royalties come
>trickling in for years to come, instead of months.  
>

In most cases, the things that break on 680x0 (where x >= 1) are not
dependent on whether you are using the OS or not.  Self modifying code
breaks in either case.  Using the CPU for timing breaks either way.
Use VBL for timing, or the beam position register, but don't use
a software loop.  Don't use the upper byte of address pointers (either
in RAM or address registers) for flags/variables.  Read your AmigaMail
(sent to registered developers) because they constantly remark on
what practices are invalid.  I sure wish Commodore would collect all
this information in one place and post it to the net and publish it
in all the manuals, etc.
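
The pointer rule is easy to demonstrate.  A 68000 drives only 24 address
bits, so a flag hidden in the top byte of a pointer is silently ignored
there; a full 68020 or 68030 drives all 32 bits and the same value points
somewhere else entirely.  The model below is a hedged sketch with made-up
function names, not code from any Amiga program.

```c
#include <assert.h>
#include <stdint.h>

/* Address that actually reaches the bus for a 32-bit pointer value on a
   CPU with `address_bits` address lines. */
uint32_t bus_address(uint32_t pointer, unsigned address_bits)
{
    assert(address_bits >= 1 && address_bits <= 32);
    if (address_bits == 32)
        return pointer;
    return pointer & ((1u << address_bits) - 1u);
}

/* The risky trick: store an 8-bit flag in the top byte of a pointer. */
uint32_t tag_pointer(uint32_t pointer, uint8_t flags)
{
    return (pointer & 0x00FFFFFFu) | ((uint32_t)flags << 24);
}
```

A pointer to $078000 tagged with $80 still resolves to $078000 with 24
address bits, but to $80078000 with 32, which is how such programs end up
crashing on accelerated machines.
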

Unfortunately, the sales life of a game is about 3 months.  Royalties
don't keep trickling in.

>*NOT* learning how the OS works is a kind of intellectual laziness,
>even if you take the effort to "roll your own".  There is always a
>programmer or program out there who can show you a trick or two, and a
>lot of clever people spent a lot of time working on the AmigaOS (I
>have nothing but respect for people like -=RJ=- and the rest of the
>Amiga crew).  The OS is decidedly *NOT* full of bugs (and if the last
>time you looked at it was in v1.1, take another look!) and can be your
>ally, if you learn to control it ("Use the OS, Luke ... Use the OS!"
>;-)
>

I agree here too (except about Use the OS).   But I know it too well...
The OS does have bugs, however.  I spent weeks finding them for other
people at EA.  Ever hear about the trackdisk bug?  It seems that if
you have an external floppy drive and have no disk in it, and do intensive
disk access to the internal drive, it gurus after a random (long) amount
of time.

Did you know that LoadRGB4() takes a full 60th of a second on a 68000?  
How long does MrgCop() take (pick random number)?  How long does 
RethinkDisplay() take (seconds)?  How long does BltMaskBitMapRastPort() 
take? 

The OS is definitely worth knowing.  I do know it.  The first game I did
used the OS (the only one I will ever do that way).  I have written 
megabytes of software that uses the OS.  It is a great OS.  No argument
from me.  It just is built to outperform a Mac, but not a C64.  The
hardware blows away everything from C64s to Macs to the Genesis, but you
wouldn't know it from watching "performance oriented" games that use it.

By the way, I have an A2000 with a 2630 (25 MHz 68030) and run 1.3.
I have written my own libraries and devices and dos handlers and lots
of other things that many people don't even get into.  It is a clever
piece of work and I am not at all trying to bash it.  I am only saying
that it is not good for games that need performance.

>I'm not trying to flame or put down anyone, but better games than the
>current crop can still be written!  Better both in terms of
>awesome-take-over-the-machine-graphics, and better in terms of
>awesome-playability-and-multitasking-code.
>

I agree here too.  Looking at most Amiga games next to C64 games
makes me want to puke.  It is painful to see Amiga games run at 8
frames per second while the C64 version of the same game runs at
60.  There is no excuse for this.  If you can't achieve the performance
under the OS, boot it by all means.  That is what the C64 guys do.  It
is a proven technique.

You left out what I expect to see from games.  It is awesome-take-over-the-
machine-awesome-playability-code.

BTW, has anybody ever tried Music-X?  It is a performance oriented piece
of midi software.  The first thing I did with it was to plug in my $175
synth and try to sequence in the built-in demo song.  Poof - guru.
Damn software couldn't keep up with all those midi events... haha.
Anybody doubt David Joyner's abilities?  I don't.  The program is
extremely well done, just the OS can't keep up with 32K baud.

>I suggested earlier that Mike Farren expand and polish his articles,
>include source code examples which make a complete, small game, and send
>the whole thing off to AC's Tech.  I still feel very strongly that this
>should be done.  I also would like to encourage Mike Schwartz to do
>the same: write up his techniques for taking over the system and
>programming down to the hardware, include sample code, and send it
>off to one of the technical magazines (like AC's Tech).
>

Mike already seems to have this idea in mind, except for the part about
writing the game.  His Lemmings posts (which actually started this whole
thread) were full of copyright notices because he expected his stuff
to be published.  You will see something from me in the not so distant
future.

>Wildstar

--
********************************************************
* Appendix A of the Amiga Hardware Manual tells you    *
* everything you need to know to take full advantage   *
* of the power of the Amiga.  And it is only 10 pages! *
********************************************************

jonabbey@cs.utexas.edu (Jonathan David Abbey) (03/31/91)

Hmm.. Okay, you've convinced me, actually.  I agree that performance oriented
games should treat the Amiga like a game machine, and that games that do not
need that level of performance should stick with the operating system.  In
either case, the software should be written to be compliant with the published
rules in regard to the future hardware compatibility restrictions.  If you
can take advantage of extra memory, please do so.

Can we end this thread now?  Or at the very least let's have some more posts
like Mike Schwartz's last telling how to use the Amiga to its greatest extent
as a game machine.  And let's get on with discussing how to use PPIPC to do
wonderful things with the operating system that you don't see being done
on a Mac, IBM, or C64.  (Or the Amiga at the moment, with the exception of
Jazzbench..)



-- 
-------------------------------------------------------------------------------
Jonathan David Abbey              \"Take your place on the great Mandela" P,P&M
the university of texas at austin  \  jonabbey@cs.utexas.edu     "Love me, love
computer science/math?/psychology?  \ (512) 472-2052              my Amiga" -Me 

bairds@eecs.cs.pdx.edu (Shawn L. Baird) (03/31/91)

mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:

>Sorry to paste this whole thing again, but it is the best article
>done so far.

[ ... some stuff deleted ... ]

>No sources will be posted here, but I have done a small game (in 3
>days in assembler) that I do intend to publish source code to 
>at a future date.  And believe it or not, it does not kill the OS
>to the point that it can't be restored.  In other words, it runs from
>DOS and returns to DOS, but it does NOT multitask.

[ ... some stuff deleted ... ]

>When the Amiga first boots, it asks for a workbench disk.  If you
>have an autobooting hard drive and a bootable floppy is inserted,
>the machine will boot from the floppy.  In any case, the ROM Kernel
>loads what is called the boot program from track 0 of the floppy
>disk into RAM and does a JSR to it.  The standard Amiga OS bootsector
>program simply opens dos.library and then does an RTS and the system
>continues to boot up into the normal Operating System.

>Well, I wrote my own boot sector program that just doesn't return
>to the OS.  In this boot sector program, I call several ROM Kernel

[ ... rest deleted ... ]

I'm a bit confused here. First you say it runs from DOS and returns to
DOS. Then you say you wrote your own boot sector program that doesn't
return to the OS. Which of the two is true? I suspect that, although
you save the state of the OS there really isn't a reason to do so. Not
only this, but using a custom bootblock will render the game only
bootable on floppy disks and (from the descriptions of doing all of
the floppy reading on your own) the disk itself will not be in AmigaDOS
format, therefore there will be no way to install your program on a
hard drive. What is the point of keeping the OS state when even if you
did return to it the user has had to reboot at least once just to get
started?

Basically, it sounds like most other games I've seen. Granted, you
attempt to do things in a correct and compatible way across all Amigas,
but I already believe that someone who doesn't do this ought to be
shot. I think Mike Farren's point, correct me if I'm wrong, was that
games like Lemmings could be made to take over the OS or multitask with
the point being to be able to do things like install it on a hard drive.
While avoiding the operating system routines may sometimes be helpful,
I sincerely doubt that you find the routines so horrible that you
avoid using them when running programs like your editor, your assembler,
etc. You can't tell me that fast blitting is impossible when the OS is
still intact. Look at CED. It blazes its scrolling along at a very
fast clip (and I've noticed that the rest of the machine slows down
accordingly), probably by OwnBlitter() for long periods of time.

---
 Shawn L. Baird, bairds@eecs.ee.pdx.edu, Wraith on DikuMUD
 The above message is not licensed by AT&T, or at least, not yet.

rjc@geech.gnu.ai.mit.edu (Ray Cromwell) (03/31/91)

  Hmm. Time to open my mouth again.

In article <mykes.0774@amiga0.SF-Bay.ORG> mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:
>Sorry to paste this whole thing again, but it is the best article
>done so far.
>
>In article <781@tnc.UUCP> m0154@tnc.UUCP (GUY GARNETT) writes:
>>
>>[As much as I've been trying to ignore this discussion, now I'm going
>>to open my big mouth ...]
>>
>>I, too would like to see sources for both permanently taking out the
>>operating system (for a high performance game), and for suspending and
>>resuming multitasking properly.  How about a list of trade-offs (what
>>parts of the OS you can and can't use, and how v2.0 impacts the whole
>>scheme).  This would be useful information, no matter which "side" of
>>the discussion you are on.
>>
[Munch...]
>
>The two tracks that I read in contain an 8K kernel of code that replaces
>ALL of the ROM Kernel routines that are needed for a game.  Consider it
>a BIOS of sorts.  In addition to this 8K of KERNEL code, there is another
>12K of floppy disk drivers, because I will not have the operating system
>running to read any further data from the floppies.  As soon as the 2
>tracks are read in, I turn off interrupts and dma and the OS is officially
>dead.  I then jump to the beginning of the allocated block which contains
>the KERNEL.  The first thing the Kernel does is to copy itself down to low
>memory ($200).

  Poof! You just passed the point of no return. Couldn't you spare an
extra 1k of code to test for people with an extra 512k of ram? If on a
machine with more than 512k, then save the state of the OS. It doesn't much
matter about returning to the OS, since your game REQUIRES a reboot to load.
 Tell me Mike, have you ever programmed a game other than an action
game? I find that today's game designs just plain suck. I'd still play
Tetris, Donkey Kong, Pac-Man over Shadow of the Beast. I'm willing to bet
that your game would be boring except for the ohh-ahh of pretty
graphics. Games like Bard's Tale, Ultima, Pool of Radiance, and Neuromancer
should multitask.

>
>So what are the benefits of taking over?  
>
>Well, I am guaranteed that ALL 512K are mine to use.  I can ORG any 
>graphics, code, or audio data at any hard coded location I want.  This
>practice allows things like blitter routines to have hard coded constant
>addresses in them, which saves CPU cycles where you need them the most.
>
>I can put graphics screens and the stack anywhere I want.  For a 16 
>color game, I put the stack at $80000 and a screen at $78000.  The 
>stack needed for any program I write is < 512 bytes.  The resulting 
>memory map gives me from $100 to $78000 to squeeze the game into.  And
>I do mean squeeze.

 The majority of these benefits are just laziness. Sure, it's nice to ORG
code instead of worrying about dynamic memory and wasting a few extra cycles
to plot graphics, but unless it's absolutely needed it's wrong.
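
 For what it's worth, the arithmetic behind the quoted memory map can be
checked with a few lines of C. This is only a sketch: the three addresses
come from the quoted text, but the 320x200 screen size is an ASSUMPTION
(the post says only "16 color", which does imply 4 bitplanes), and the
function names are made up for illustration.

```c
/* Addresses taken from the quoted post. */
#define GAME_BASE   0x00100UL
#define SCREEN_BASE 0x78000UL
#define STACK_TOP   0x80000UL

/* ASSUMPTION: a 320x200 low-res screen.  16 colors = 4 bitplanes. */
#define SCREEN_WIDTH  320UL
#define SCREEN_HEIGHT 200UL
#define SCREEN_DEPTH  4UL

/* Bytes in one bitplane: one bit per pixel. */
unsigned long bitplane_bytes(unsigned long width, unsigned long height)
{
    return (width / 8) * height;
}

/* Bytes in a whole screen of the given depth. */
unsigned long screen_bytes(unsigned long width, unsigned long height,
                           unsigned long depth)
{
    return bitplane_bytes(width, height) * depth;
}

/* Bytes left between the end of the screen and the top of the stack. */
unsigned long stack_room(void)
{
    return (STACK_TOP - SCREEN_BASE)
         - screen_bytes(SCREEN_WIDTH, SCREEN_HEIGHT, SCREEN_DEPTH);
}

/* Bytes available for code, graphics and sound below the screen. */
unsigned long game_room(void)
{
    return SCREEN_BASE - GAME_BASE;
}
```

 Under that assumption the screen takes 32000 of the 32768 bytes between
$78000 and $80000, leaving 768 bytes for a stack that is said to need
fewer than 512, and 491264 bytes below the screen for the game itself.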

 Consider this. You start writing a game, and optimize it as much as you
can, shaving cycles, memory, etc., and end up with 64 BOBs moving at 60fps.
It's possible to degrade the game to 60 BOBs instead of 64 with no
recognizable loss in quality.

  My guess is, when you do a game, you don't have a full design on paper
or know how much performance you need, therefore you just take the 
easy route and go for all out performance. Once you achieve your
game design goal and you find out you still have extra BOB/raster time
left, you go back to the game design and ADD more objects in, thinking
'hey, these extra objects are going to make the game a lot more exciting.'
Either that, or you throw in a ridiculous animation sequence, or
sampled sound.
  I'm going to go out on a limb and say that Budokan did not need
100% of the Amiga's CPU power to animate the combatants. It may have needed
all 512k of ram for player objects, but not the entire CPU.

>EVERY single instruction that ever gets executed is my own code.  When
>I single step through routines, I get symbolic information for every
>single instruction.  I never see jsr offset(a6) and wonder what the
>heck is going on.  When I do use the OS and step partway into one
>of the ROM routines, I am appalled by how ugly and inefficient the
>code is.  When I write the code myself, I am in full control of
>every clock cycle and byte that is used by the program.

  That's what compiled C code looks like. C is used for large projects
that an Assembly programmer could never accomplish in a reasonable
amount of time. I programmed Assembly for 7 years, and I can do things
in C++/LISP that would take an insurmountable amount of time in assembly.
I do wish C= would optimize certain parts of the OS with assembly.
(Optimize Layers Library clipping code in assembly, and Exec's dispatcher,
and the vertical blank interrupt handler. Intuition should stay in C, but
stuff like WritePixel should be in assembly.)

>I NEVER need to do dynamic memory allocation, so my memory never fragments.
>Under the OS, if an application doesn't respond quickly enough to 
>Intuimessages, or you have enough windows opened, the OS starts allocating
>memory and never frees it up.  And the OS has serious problems with
>low memory situations (mostly it gurus).  

  I'm willing to bet that all of the memory leaks are fixed in 2.0.
Give CBM some credit!

>I implement my own BOBs and Multitasking routines.  I preallocate
>enough memory to hold 80 task structures and 80 bob structures by just
>reserving part of the memory map for them.  It takes exactly 4 instructions
>to allocate a task or bob structure and 4 to free it up.  Typically, every
>OBJECT in the game is implemented as both a bob and a task.  The task
>code is a finite state machine that controls the animation and movement
>of the object in the game.  Anytime you fire a bullet, a new BOB and
>new task is created, for example.  A task switch takes about 10 instructions.
>My tasking scheme easily supports 80 tasks getting a slice of the CPU
>in a 60th of a second.  The BOBs system is dependent on the number and
>size of the BOBs, naturally.  All tasks share the same stack, which again
>is < 512 bytes for the whole game.
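
  The preallocated pool described in the quoted text can be sketched in
C as a free list threaded through a fixed array. This is an
illustration only, not the code from the game: the struct layout and the
names pool_init, object_alloc and object_free are assumptions. It does
show why allocating or freeing such a structure can come down to a
handful of instructions: each is just a pointer pop or push.

```c
#include <stddef.h>

#define MAX_OBJECTS 80   /* pool size quoted in the post */

/* Hypothetical layout; the real structures are not shown in the thread. */
struct object {
    struct object *next;   /* link used only while on the free list */
    int state;
    int x, y;
};

static struct object pool[MAX_OBJECTS];
static struct object *free_list;

/* Thread every slot onto the free list once, at startup. */
void pool_init(void)
{
    int i;
    for (i = 0; i < MAX_OBJECTS; i++)
        pool[i].next = (i + 1 < MAX_OBJECTS) ? &pool[i + 1] : NULL;
    free_list = &pool[0];
}

/* Pop the head of the free list; returns NULL when the pool is empty. */
struct object *object_alloc(void)
{
    struct object *o = free_list;
    if (o != NULL)
        free_list = o->next;
    return o;
}

/* Push a slot back onto the free list. */
void object_free(struct object *o)
{
    o->next = free_list;
    free_list = o;
}
```

  Because the array is reserved up front, nothing is ever handed back to
a general-purpose allocator, so there is nothing to fragment.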

  Are you really multitasking? It sounds to me like your VB interrupt
is just looping on a state machine dispatching subroutines. (e.g.
Grab state, look up routine in lookup table, jump to it if the state is
enabled) This is more like coroutines than real time-slicing. Is it 
possible for one of your BOB tasks to be interrupted by another?
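
  Here is roughly what I mean, as a sketch in C. The states, handlers and
the run_frame name are all made up for illustration; I have not seen
Mike's code. Once per frame every live object's state is looked up in a
table and its handler is called. Each handler runs to completion before
the next one starts, so nothing is ever preempted.

```c
#include <stddef.h>

enum state { ST_DEAD, ST_FLY, ST_EXPLODE, ST_COUNT };

struct task {
    enum state state;
    int x;       /* horizontal position */
    int timer;   /* frames left in the explosion */
};

typedef void (*handler_fn)(struct task *);

/* Move right until the edge of a 320-pixel screen, then explode. */
void do_fly(struct task *t)
{
    t->x += 4;
    if (t->x >= 320) {
        t->state = ST_EXPLODE;
        t->timer = 3;
    }
}

/* Count the explosion down, then mark the task dead. */
void do_explode(struct task *t)
{
    if (--t->timer == 0)
        t->state = ST_DEAD;
}

/* One handler per state; dead tasks have none and are skipped. */
static const handler_fn handlers[ST_COUNT] = { NULL, do_fly, do_explode };

/* Called once per frame, e.g. from a vertical blank handler. */
void run_frame(struct task *tasks, int count)
{
    int i;
    for (i = 0; i < count; i++) {
        handler_fn h = handlers[tasks[i].state];
        if (h != NULL)
            h(&tasks[i]);
    }
}
```

  A bullet starting at x = 312 flies for two frames, explodes for three,
and is dead after the fifth call to run_frame.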

>Playtesting is a piece of cake.  You don't have to try too many hardware
>configurations of Amigas to see how compatible the code is.  There
>are only a few software configurations to check out, too (like 1.0, 1.1,
>1.2, 1.3, and 2.0).  If the game were written under the OS, you'd have
>nightmares testing all the possible software configurations.  For example,
>does the game work with GOMF installed?  How about GOMF and DMouse?  How
>many possibilities do you see?  I see bazillions :)  And what if you
>allow multitasking and some CHIP RAM pig program (like DPaint) is already
>running?  GURU.  And you need to test with 4 floppy drives under the OS,
>just to make sure the OS hasn't taken more memory than you can allow.

 No. If the game was written properly for the OS there would be no
problem. If a user runs BROKEN programs like GOMF, then it's his fault, not
the game programmer's.

>>I, too have been involved with writing high-performance games on the
>>Amiga (no, you won't find my name in any credits; I was a technical
>>advisor and algorithm guru rather than a programmer --- most of the
>>programmers I worked with were high-school or college kids who taught
>>themselves everything; lots of raw talent, and most of them hadn't
>>the foggiest idea how to figure out a blitter minterm, so I taught
>>them).  At the time, the only way we knew of to get effective arcade
>>games was to kill the OS at boot-up time; I would have preferred to be
>>able to suspend and then resume the OS, but couldn't figure out a way
>>to do it without causing a crash later on.  Most of the 
>>"whiz-kids" working with me thought of the OS as an obstacle anyway.
>>
>
>No sh*t sherlock!  The blitter is a very powerful coprocessor and
>is no piece of cake to learn.  On the other hand, the OS routines
>are PIG slow.  In many cases, you have to restore the hardware to
>a state that the OS needs so the system won't crash.  Once you start
>using the floppy disk hardware directly, for example, you must put
>the CIAs back into a state that the OS wants them in.  What state is that?
>The ROM Kernel manuals LIE.  Have fun finding out what page they lie on,
>because there is NO index.

  I found that the blitter is very easy to learn with the help of 
Tom Rokicki's BlitLab manual.  The difficulty with the blitter is 
learning the tricks with masking, and the bltadat stuff to perform
arbitrary blits anywhere. It doesn't help that the Hardware Manual
(v1.1 that I have) is WRONG in a lot of places and none of the example
code works (it assumes NO OS). When I was first fooling with the blitter
I didn't know how to do a barrel-shift to the left (for a smooth
scroller of course). This is because the HW manual didn't mention
the difference in the way the barrel shifter works in ascending/descending
modes. Thank god for Tom Rokicki's and Jeremy San's improvements to the 1.3
HW manual and for the BlitLab manual in those early days.
  The Amiga hardware seems to have a lot of quirks that have to be overcome
by tricks, like stopping an Audio DMA sample, using Disk DMA, or programming
the blitter for 'bit blits' instead of word blits.
  It reminds me of the C64 days of polling the serial/disk ports, timing
out rasters to the cycle, and tricking the video chip to hardware scroll
the screen up and down or provide interlace.



>>On the other hand, games like Sim-City and Lemmings have no real use
>>for that kind of environment.  I agree that the game comes first,
>>*BUT* if you don't need total control, *DON'T* take it.  With the OS
>>comes a lot more flexibility (HD installation, a real file system,
>>multitasking, and lots of "Wow!  What a neat game, and it multitasks,
>>TOO!").  Take what you need, but be sure you need it before you take
>>it.
>>
>
>I agree with this 100%.  If you don't need to take over the machine,
>don't.  If you want to push the machine to its limits, there is NO
>other way.  Let the game come first.  If you know you can do the
>game in a small amount of RAM and that performance is not an issue,
>go ahead and use the OS.  In my approach, if I have RAM left over,
>I use it for more sounds or instrument samples to make the music
>better, or to cache more data from the floppy drives.

   This confirms what I said above. You don't design the game in the
beginning. You program the game according to your ideas, and when you have
finished, if you find ANY Ram/CPU time left, you TAKE IT ANYWAY.
Let's assume you find you have 150k of ram left over after all game
graphics and sound are loaded. There is no reason to use this extra ram
for a ridiculous animation/sample. It doesn't improve the playability of
the game. Using this 150k of extra ram, you can save the state of the OS
for restoration.

>>There is no excuse for software to break on 68010, 68020, and 68030
>>machines, and too many programs (most of them games, but does anybody
>>remember TDI Modula-2?) break on accelerated systems.  Be aware of
>>what you are doing when you write your code, and make it upward
>>compatible.  For timing, use one of the many timebases supplied by the
>>OS, or if you have killed it, then program one of the CIA's directly. 
>>Pay attention to compatibility; it will ensure that royalties come
>>trickling in for years to come, instead of months.  
>>
>
>In most cases, the things that break on 680x0 (where x >= 1) are not
>dependent on whether you are using the OS or not.  Self-modifying code
>breaks in either case.  Using the CPU for timing breaks either way.
>Use VBL for timing, or the beam position register, but don't use
>a software loop.  Don't use the upper byte of address pointers (either
>in RAM or address registers) for flags/variables.  Read your AmigaMail
>(sent to registered developers) because they constantly remark on
>what practices are invalid.  I sure wish Commodore would collect all
>this information in one place and post it to the net and publish it
>in all the manuals, etc.
>
>Unfortunately, the sales life of a game is about 3 months.  Royalties
>don't keep trickling in.

 I bet a game written for 2000/3000 machines would sell for a long time.
Why? Because there's a niche there that no one has tapped yet, and
the majority of A2000/3000 owners are not pirates; obviously, if they can
afford an A3000, they can afford a $30 game. 

>>*NOT* learning how the OS works is a kind of intellectual laziness,
>>even if you take the effort to "roll your own".  There is always a
>>programmer or program out there who can show you a trick or two, and a
>>lot of clever people spent a lot of time working on the AmigaOS (I
>>have nothing but respect for people like -=RJ=- and the rest of the
>>Amiga crew).  The OS is decidedly *NOT* full of bugs (and if the last
>>time you looked at it was in v1.1, take another look!) and can be your
>>ally, if you learn to control it ("Use the OS, Luke ... Use the OS!"
>>;-)
>>
>
>I agree here too (except about Use the OS).   But I know it too well...
>The OS does have bugs, however.  I spent weeks finding them for other
>people at EA.  Ever hear about the trackdisk bug?  It seems that if
>you have an external floppy drive and have no disk in it, and do intensive
>disk access to the internal drive, it gurus after a random (long) amount
>of time.
>
>Did you know that LoadRGB4() takes a full 60th of a second on a 68000?  
>How long does MrgCop() take (pick random number).  How long does 
>RethinkDisplay() take (seconds)?  How long does BltMaskBitMapRastPort() 
>take? 

  Probably because LoadRGB4 recomputes the copper list. Remember 
screens/views are a sharable resource. Nothing is preventing you from
making a screen, disabling the inputdevice (Amiga n/m screen switch)
and Owning the Copper and Blitter on that custom screen.
  Mike, take a look at Tom Rokicki's RadBoogie. Granted, it doesn't have 
a SoundTracker playing, or filled vectors, but from a programming
point of view, it's an excellent example, and proof that you can program
fast action graphics and still leave the OS functioning.
(RadBoogie calculates multiple splines in real time, blits a huge Amiga 
logo, moves a large sprite, plays Sonix music, and has a custom copper list
going. Almost all of those European Filled Vector scrolls/objects PRECOMPUTE
the object rotations and sin/cos paths with a C/Basic/Rexx program. Then
the Assembly code merely looks up table values, and uses the blitter
to draw lines/fill areas)
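
  The precompute-then-look-up idea can be sketched in C like this. It is
an illustration, not code from any of those demos: the table holds
sin * 256 for a quarter circle at 64 steps per revolution (values worked
out offline and rounded), and the names isin, icos and rotate_point are
my own. At run time there is no trig at all, only a table fetch and a few
integer multiplies.

```c
/* sin(k * 5.625 degrees) * 256 for k = 0..16, computed offline. */
static const short quarter_sine[17] = {
      0,  25,  50,  74,  98, 121, 142, 162, 181,
    198, 213, 226, 237, 245, 251, 255, 256
};

/* Fixed-point sine; angle is in 1/64ths of a full circle. */
int isin(unsigned angle)
{
    unsigned idx = angle & 15;

    switch ((angle >> 4) & 3) {          /* which quadrant */
    case 0:  return  quarter_sine[idx];
    case 1:  return  quarter_sine[16 - idx];
    case 2:  return -quarter_sine[idx];
    default: return -quarter_sine[16 - idx];
    }
}

/* Cosine is sine advanced by a quarter turn. */
int icos(unsigned angle)
{
    return isin(angle + 16);
}

/* Rotate (x, y) about the origin using only table lookups. */
void rotate_point(int x, int y, unsigned angle, int *rx, int *ry)
{
    int s = isin(angle);
    int c = icos(angle);

    *rx = (x * c - y * s) / 256;
    *ry = (x * s + y * c) / 256;
}
```

  Rotating the point (100, 0) by angle 16 (a quarter turn) gives (0, 100).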

[deleted... The Amiga is not a C64. Macs don't have any good action
games because a game that takes over the MacOS won't sell. Period.]

>
>piece of work and I am not at all trying to bash it.  I am only saying
>that it is not good for games that need performance.
>
>>I'm not trying to flame or put down anyone, but better games than the
>>current crop can still be written!  Better both in terms of
>>awesome-take-over-the-machine-graphics, and better in terms of
>>awesome-playability-and-multitasking-code.
>>
>
>I agree here too.  Looking at most Amiga games next to C64 games
>makes me want to puke.  It is painful to see Amiga games run at 8
>frames per second while the C64 version of the same game runs at
>60.  There is no excuse for this.  If you can't achieve the performance
>under the OS, boot it by all means.  That is what the C64 guys do.  It
>is a proven technique.

  There is no secret why C64 games run at 60fps. That's because 98% of
all C64 games are totally sprite driven. Any backgrounds are merely character
graphics. The C64 doesn't have the horsepower to scroll bitmaps, or
full color character mapped screens in real time. (You can't do it
in 1/60 of a second.) It's a pity the Amiga sprites are smaller than the C64's.
If they were 24/32 pixels wide a lot of games could use them (all 8) instead of
attaching 2 or 4 together.
  The C64 was really pushed to its limit in both programming and game design.
The Amiga has already been pushed to its limit (mostly). What it really
needs is GOOD Design.

>You left out what I expect to see from games.  It is awesome-take-over-the-
>machine-awesome-playability-code.
>
>BTW, has anybody ever tried Music-X?  It is a performance oriented piece
>of midi software.  The first thing I did with it was to plug in my $175
>synth and try to sequence in the built-in demo song.  Poof - guru.
>Damn software couldn't keep up with all those midi events... haha.
>Anybody doubt David Joyner's abilities?  I don't.  The program is
>extremely well done, just the OS can't keep up with 32K baud.

  If I recall correctly, David Joyner did the Amiga port of Faery Tale.
Faery Tale is an example of a game that does only a partial takeover of the
OS. (You can run DMouse before booting the game, and have a faster
mouse) 

>>I suggested earlier that Mike Farren expand and polish his articles,
>>include source code examples which make a complete, small game, and send
>>the whole thing off to AC's Tech.  I still feel very strongly that this
>>should be done.  I also would like to encourage Mike Schwartz to do
>>the same: write up his techniques for taking over the system and
>>programming down to the hardware, include sample code, and send it
>>off to one of the technical magazines (like AC's Tech).
>>
>
>Mike already seems to have this idea in mind, except for the part about
>writing the game.  His Lemmings posts (which actually started this whole
>thread) were full of copyright notices because he expected his stuff
>to be published.  You will see something from me in the not so distant
>future.
 
  Mike, if you submit something to a magazine, put it in context. What I mean
is, don't give the impression that the correct way to program the
Amiga is to boot the OS. Action games have their place, but there are
many types of games besides action games that don't need to boot the OS.
The Amiga is not a C64 (Even Word Processors on the C64 booted the OS)
 
  I know that the OS is a big overhead for using the blitter and disk to its
maximum abilities, but the fact that you said it only took you three days
to do that game code shows that you have put much work into it.

'Heart of the Dragon' has 192 color screens, real time action, sampled
sounds and works with the OS. It is not a slow game, I've seen a demo of it.

>>Wildstar
>
>--
>********************************************************
>* Appendix A of the Amiga Hardware Manual tells you    *
>* everything you need to know to take full advantage   *
>* of the power of the Amiga.  And it is only 10 pages! *
>********************************************************


  Let me take some extra time to say, that I used to be in Mike's camp.
I used to program 100% assembler, I used to flame C programmers for not
knowing assembler and programming essentially a 'Black Box'. I used to be
in demo groups and program hardware to its limits (C64). I know the
mentality. Most of the European and American demo coders have 'big egos' like
Mr Schaem. Just read the scroll text on any demo and you'll see groups
cussing each other out. What changed me, was when I got into REAL 
programming on Unix in C hacking interpreters, servers and mini-OS's.
I came to realize that operating systems and standards are there to
benefit the computer and hardware. When people spend $3000 on a computer
system, they expect software to work with it. Bypassing the OS is not
the way to do this, unless you plan on writing custom routines for
each hardware configuration. Assembler is nice to know, but it's a huge
waste to use assembler for everything. As hardware gets more and more 
complex (parallel processing, risc, etc) the assembler programmer must
know more and more about the hardware. The hardware today is becoming so
complex that the ORDER of operations is becoming important. Certain
processors now require instructions to occur in precise order to ensure
the pipeline is filled. Only compilers can keep track of register and
operation usage throughout the entire program.

  If you're an assembler programmer, the OS is especially annoying because
almost every OS function operates on C structures and linked lists. I became
annoyed because there were too many label names to remember for structure
offsets, and duplicate label names in different structures can't be done
in assemblers without collision. This is why C should be used for almost
all operations except speed dependent stuff. I'd suggest that most of
the 'kill the OS' crowd move over to CDTV when it hits the market. There you have
600 megs of disk space, and the ability to kill the OS and have no 
complaints.


--
/~\_______________________________________________________________________/~\
|n|   rjc@albert.ai.mit.edu   Amiga, the computer for the creative mind.  |n|
|~|                                .-. .-.                                |~|
|_|________________________________| |_| |________________________________|_|

DXB132@psuvm.psu.edu (03/31/91)

In article <2149@pdxgate.UUCP>, bairds@eecs.cs.pdx.edu (Shawn L. Baird) says:

>I sincerely doubt that you find the routines so horrible that you
>avoid using them when running programs like your editor, your assembler,
>etc. You can't tell me that fast blitting is impossible when the OS is
>still intact. Look at CED. It blazes its scrolling along at a very
>fast clip (and I've noticed that the rest of the machine slows down
>accordingly), probably by OwnBlitter() for long periods of time.

That example doesn't help your point. CED uses blitting which is many many
times slower than using the copper to aid scrolling. But it has no choice
in order to remain Intuition compatible.

-- Dan Babcock

rjc@geech.gnu.ai.mit.edu (Ray Cromwell) (03/31/91)

In article <00670212662@elgamy.RAIDERNET.COM> elg@elgamy.RAIDERNET.COM (Eric Lee Green) writes:
>
>This is the only valid argument in your article. But why not give a choice?
>How about grab all the memory you can grab from the system, shut down the
>system in an Amiga-friendly manner (i.e., as published by Commodore in
>their DevCon notes), then if there STILL isn't enough memory (i.e., only
>512K in the machine), *THEN* hose that remaining 80K of RAM? That is, on
>expanded machines the OS could be re-started when the game was suspended or
>etc., while only on 512K machines would you have to give up everything.

  Exactly. This is exactly what Dragon's Lair II does. It puts up requesters
when not enough resources are found. For instance:
'Not enough ram to load Audio Data. Sound will be canceled. Proceed?'
'Not enough ram for life/death sequence. Proceed?'
'Not enough memory for Multitasking, proceed?'
'Ready to load, Proceed?'

  At each stage, it gives a requester. On 512k machines, Dragon's Lair
removes Multitasking, Sound, and life/death sequence. On 1mb machines it only
disables multitasking. On >1mb machines it multitasks. You can even
pull down the screen!
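
  The decision logic behind that kind of graceful degradation is tiny.
Here is a sketch in C of the three tiers described above; it is NOT
Dragon's Lair II's actual code, the struct and function names are made up,
and treating anything between 512k and 1mb like a 512k machine is my own
assumption. A real program would ask Exec how much memory is free rather
than take a number as a parameter.

```c
/* Which optional features to enable; 1 = on, 0 = off. */
struct feature_plan {
    int sound;
    int death_sequences;
    int multitasking;
};

/* Choose features from the total RAM in kilobytes.
 * Tiers follow the description in the text:
 *   512k  -> everything optional is dropped
 *   1mb   -> only multitasking is dropped
 *   >1mb  -> nothing is dropped
 * ASSUMPTION: sizes between 512k and 1mb count as 512k. */
struct feature_plan plan_features(unsigned long total_kb)
{
    struct feature_plan plan = { 0, 0, 0 };

    if (total_kb >= 1024) {
        plan.sound = 1;
        plan.death_sequences = 1;
    }
    if (total_kb > 1024)
        plan.multitasking = 1;

    return plan;
}
```

  Each feature that comes back switched off corresponds to one of the
"Proceed?" requesters, so the user always knows what is being given up.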

--
/~\_______________________________________________________________________/~\
|n|   rjc@albert.ai.mit.edu   Amiga, the computer for the creative mind.  |n|
|~|                                .-. .-.                                |~|
|_|________________________________| |_| |________________________________|_|

limonce@pilot.njin.net (Tom Limoncelli +1 201 408 5389) (03/31/91)

In article <mykes.0774@amiga0.SF-Bay.ORG> mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:

> the KERNEL.  The first thing the Kernel does is to copy itself down to low
> memory ($200).
[...]
> Well, I am guaranteed that ALL 512K are mine to use.  I can ORG any 
> graphics, code, or audio data at any hard coded location I want.  This
> practice allows things like blitter routines to have hard coded constant
> addresses in them, which saves CPU cycles where you need them the most.

This means that your code and data are always going to be in CHIP ram
which is a killer for most graphics modes and for most accelerators.

"saves CPU cycles where you need them the most".  A bubble sort
written in assembler has the same O() as a bubble sort in C.

You get 2 extra registers and CHIP ram holds you back.  Strange trade-off.

(Yes, I know you claim to use FAST ram for any disk caches, but I'm
sure you don't access that during game-play.)

---------------

Everything you program can be done (in C or Assembler) while killing
the OS only if you are really low on memory.  In that case, you can ask the
user if this is OK.  The only problem is that you have to *learn*
something about AmigaDOS and work towards getting your code to exit
properly.  This is not easy, but can be done.  Why don't you do this?
Others call you lazy.  It's not lazy to write an entire pseudo-KERNEL
as well as trackdisk; but I bet that time could have been spent
learning the OS.

Your three big gripes are control over the blitter, the drives, and the
CPU.  You can OwnBlitter(), allocate the drives, and shut off Exec in
approved ways.  Best yet, when you are done you can un-own those
resources (assuming that you didn't kill the OS due to a lack of
memory) and the user can continue.

Advantages:
	If you can re-enable Exec, etc. when the user wants to pause,
you can also re-enable all that stuff when loading in stuff from the
HD (if you make that an option), printing (I'd love to be able to do
THAT for some games!), etc.  I doubt you do any disk/harddrive access
during the game.  My HD is faster than *any* trackdisk routines you write.
Why not give the option of (1) your "fast" (faster than 1.3 trackdisk)
trackdisk routines, (2) the native trackdisk routines (2.0 is *fast*),
or (3) proper system calls for HD support.  2 & 3 require a return to
multitasking.
	If you can re-enable Exec, etc. and deallocate the memory you
don't need (reload it later) the user might be able to be productive
AND use your game.  Gosh.

This argument is pretty fruitless.  You seem to go on the European
marketing statistics that you quote.  I'd love to know the source of
those statistics so that I could research them.  Quoting statistics
without giving a source that I can research means that you could be
quoting from a Matt Groening or Dave Barry book for all I know.

As soon as your statistical sample (European kids) starts getting
'030s and HDs I'm sure that you'll produce software that I feel like
buying.  Too bad it takes so long for Europeans to come up to American
standards.  I guess we follow them in fashion and they follow us in...
oh never mind.

Every time I meet someone looking to trash their 500 and buy an A3000
they say, "but I don't want to because all my games won't work".  Mike
et al seem to have made a captive audience.  Mike won't change until
the users do; if Mike had never started, the users could upgrade.

-Tom
P.S.  When someone is considering upgrading, I ask them to try to find
the games that they think won't work on an A3000 running 2.0.  Then I
try to get them to notice that if they haven't used those in the last
6 months, they most likely won't miss them.

jesup@cbmvax.commodore.com (Randell Jesup) (03/31/91)

In article <mykes.0774@amiga0.SF-Bay.ORG> mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:
>But, a small description of how I do things.  I always develop for
>an Amiga 500 with 512K of RAM.  My 500 has 1Meg and the old Agnus
>and 1.2 ROMs.  This configuration is guaranteed to give you the
>basic features of the vast majority of Amigas around, and I get to
>see the game perform exactly as the end user will, including floppy
>disk access, the whole time I develop.

	This is a good thing to do: always have a "standard" machine to
test your code on, and we encourage developers to do this.

>the machine will boot from the floppy.  In any case, the ROM Kernel
>loads what is called the boot program from track 0 of the floppy
>disk into RAM and does a JSR to it.  The standard Amiga OS bootsector
>program simply opens dos.library and then does an RTS and the system
>continues to boot up into the normal Operating System.

	You should learn more about the OS...  It actually never returns
if it opens dos.library.  Dos starts the initial process, and then kills
the initial task (itself).  The initial process continues the boot process.
(I'm the person who rewrote the Dos in C and asm.)

>a BIOS of sorts.  In addition to this 8K of KERNEL code, there is another
>12K of floppy disk drivers, because I will not have the operating system
>running to read any further data from the floppies.

	Note that even 2.0 trackdisk is only about 7K long.

>Since I am in supervisor mode, there are NO illegal instructions that
>can be executed (privilege violations cause a GURU under the OS).  The
>User Stack Pointer (USP) also can be used as a quick place to save
>an address register (this is 2x faster than a push on the stack).
>The upper byte of the status register is available to disable various
>levels of interrupts (the INTENA on Paula is just as useful), and
>the TRACE bit is available for debugging purposes.

	Hopefully, if you programmed your game right, you shouldn't have
to worry about executing an illegal instruction by mistake (except of
course ILLEGAL).  Disabling interrupts in the processor is faster than
disabling them in Paula, since you don't have to get on the chip bus to
do it.

>The floppy disk drivers I wrote use a single 10K buffer to handle as
>many disk drives (up to 4) that may be configured.  The OS routines
>will steal 40K if you have 4 drives (10K per drive).

	Amusing.  First, you need more than 10K for an MFM buffer, since
the number of bytes (decoded) per track can be as high as 6812, so the MFM
buffer must be at LEAST 13624, and you actually want it a bit larger in
some cases (we use 15296 - we keep a gaps-worth of NULLs (aaaaaaaa) before
the spot where we read, to make writing faster/easier).  Sure you don't mean
16K?  (Which is what the OS in 1.3 used, though it was a bit more than was
needed.)
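
The buffer arithmetic above can be checked with a few lines of C. MFM
pairs every data bit with a clock bit, so the raw track image is twice
the decoded size. The constants are the figures quoted in this post;
the function names are made up for the illustration.

```c
#include <stddef.h>

/* MFM stores a clock bit alongside every data bit, so the raw
 * image is twice the size of the decoded data. */
#define MFM_EXPANSION 2u

/* Figures quoted in the post above. */
#define MAX_DECODED_TRACK_BYTES 6812u
#define OS_MFM_BUFFER_BYTES     15296u

/* Raw MFM bytes needed to hold a given number of decoded bytes. */
size_t mfm_raw_bytes(size_t decoded_bytes)
{
    return decoded_bytes * MFM_EXPANSION;
}

/* Nonzero if a buffer can hold one full raw track. */
int mfm_buffer_fits(size_t buffer_bytes, size_t decoded_bytes)
{
    return buffer_bytes >= mfm_raw_bytes(decoded_bytes);
}

/* Bytes left over beyond the bare minimum (0 if the buffer is too small). */
size_t mfm_buffer_slack(size_t buffer_bytes, size_t decoded_bytes)
{
    size_t need = mfm_raw_bytes(decoded_bytes);
    return buffer_bytes > need ? buffer_bytes - need : 0;
}
```

With these numbers a 10K (10240 byte) buffer fails the check, a 16K
buffer passes, and the 15296 byte buffer leaves 1672 bytes of slack
for the leading gap.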

>enhanced performance a few ways.  One way is that the blitter is in
>nasty mode, so encoding/decoding the MFM data is as fast as possible.

	Decoding is faster with the processor, if you also are going to 
check the checksum.  Nasty mode will hurt your interrupt response time.
Sounds like the classic "optimize the routine within an inch of its life,
and miss the fact that a different algorithm would be twice as fast".
BTW, when there's a >2 bitplane display (>4 in 320x200/400), running code
from the ROMs is faster than from ram, since you don't have to pay the
penalty for getting cycles from the chip bus (since in your way of
programming, all your code ends up in chip ram) - annoying for something
that could use the extra horsepower, like 3d games.

>When data is written to diskette, it is arranged so that NO extra disk
>revolutions will be made during readback.  This is done by timing
>the read and write routines so that by the time a track has been read
>in and the head stepped, the next start of track is under the head
>and ready to go.

	Unless you're pulling partial tracks off and using them before
the revolution is complete, or unless you're going to write it out again,
this makes no difference - and for writing all it saves you is a block-move
to eliminate the gap.

> The routines also make use of the DSKSYNC capability
>of the drives, which the OS routines don't (under 2.0 they probably
>do).

	Yup.  Not as big a win as you'd think (I thought it would be a big
win, but floppy rotation time swamps almost anything).

>  The routines use a CIA timer to get perfect timing, no matter
>what processor.  Try popping the disk out with the disk light on (it
>works).  Try ejecting the disk in the middle of a load and put it
>into a different drive (it works).  Try that with the OS and watch
>your disk go bad in 1 second, thanks to the disk validator.

	I guarantee that if you pop a disk while it's writing, it WILL go bad.
Even with your code.  If it's reading, it may forcefully ask for it back, but
it won't go bad.

>And what if you
>allow multitasking and some CHIP RAM pig program (like DPaint) is already
>running?  GURU.  And you need to test with 4 floppy drives under the OS,
>just to make sure the OS hasn't taken more memory than you can allow.

	You should learn to check allocation returns.  It's not hard.
There's even a tool for selectively denying memory allocations to stress-
test your program that we distribute (written by Bill Hawes to help test
2.0).
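
A minimal sketch of what checking allocation returns looks like, in
portable C rather than Amiga calls. The wrapper test_alloc() and its
allocs_allowed counter are hypothetical stand-ins for a
memory-denial test tool, not the tool mentioned above; on an Amiga
the same pattern would wrap the system allocator instead of malloc().

```c
#include <stdlib.h>

/* Hypothetical stress-test knob: how many more allocations may
 * succeed.  A negative value means "no limit". */
long allocs_allowed = -1;

/* Allocator wrapper that can be told to start failing. */
void *test_alloc(size_t size)
{
    if (allocs_allowed == 0)
        return NULL;
    if (allocs_allowed > 0)
        allocs_allowed--;
    return malloc(size);
}

struct game_buffers {
    unsigned char *screen;
    unsigned char *sound;
};

/* Returns 1 on success, 0 on failure.  On failure nothing is left
 * allocated, so the caller can report the problem and exit cleanly. */
int game_buffers_init(struct game_buffers *gb,
                      size_t screen_bytes, size_t sound_bytes)
{
    gb->screen = test_alloc(screen_bytes);
    gb->sound = NULL;
    if (gb->screen == NULL)
        return 0;

    gb->sound = test_alloc(sound_bytes);
    if (gb->sound == NULL) {
        free(gb->screen);       /* undo the part that did succeed */
        gb->screen = NULL;
        return 0;
    }
    return 1;
}

/* Release both buffers; safe to call after a failed init. */
void game_buffers_free(struct game_buffers *gb)
{
    free(gb->screen);
    free(gb->sound);
    gb->screen = NULL;
    gb->sound = NULL;
}
```

Setting allocs_allowed to 0, 1, 2 and so on in turn walks the failure
through every allocation in the program, which is the kind of stress
test described above.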

>Once you start
>using the floppy disk hardware directly, for example, you must put
>the CIAs back into a state that the OS wants them in.  What state is that?
>The ROM Kernel manuals LIE.  Have fun finding out what page they lie on,
>because there is NO index.

	Sure, the 1.1 RKMs had the CIA allocations backwards.  The 1.3 RKMs
(which have been out a long time now) had the correct information (and
indexes).  We told people this.  This caused our worst 2.0 compatibility
problems, though we solved almost all of them (by dint of truly tricky
programming...)

>>On the other hand, games like Sim-City and Lemmings have no real use
>>for that kind of environment.  I agree that the game comes first,
>>*BUT* if you don't need total control, *DON'T* take it.

>I agree with this 100%.  If you don't need to take over the machine,
>don't.  If you want to push the machine to its limits, there is NO
>other way.

	True.

>  Let the game come first.  If you know you can do the
>game in a small amount of RAM and that performance is not an issue,
>go ahead and use the OS.  In my approach, if I have RAM left over,
>I use it for more sounds or instrument samples to make the music
>better, or to cache more data from the floppy drives.

	You don't let the _game_ come first, you put your _implementation_
first.  There is a difference.  As for your approach, you may find your
game didn't need all these tricks, and had ram left over when you're
done.  But since you programmed yourself into a corner you can't go back
and cooperate with the system, so you just look for ways to use up the
ram in a more-or-less-useful manner.

>Unfortunately, the sales life of a game is about 3 months.  Royalties
>don't keep trickling in.

	Good _games_, ones that have a depth beyond flashy graphics, and
have replayability, do continue to sell (though they do best when first
released, like most authored products).  Sure, they do eventually trend 
towards 0, but by no means do they walk off a cliff for a good game
(or even a well-done flashy game).

>The OS does have bugs, however.  I spent weeks finding them for other
>people at EA.  Ever hear about the trackdisk bug?  It seems that if
>you have an external floppy drive and have no disk in it, and do intensive
>disk access to the internal drive, it gurus after a random (long) amount
>of time.

	Actually it has nothing to do with internal versus external.  This
was fixed in one of the 1.3 releases in SetPatch.  Note: I don't ever 
remember seeing a bug report from EA about this (or in fact about almost
anything, though some of the people who sell things through EA do report bugs
well, and this has changed somewhat for the better in the last few years).
Perhaps there was one, it was a while ago, but I don't remember it.  Developer
support is a 2-way street.

>Did you know that LoadRGB4() takes a full 60th of a second on a 68000?

	No, it merely doesn't take effect until the next vblank (it 
modifies the copperlist).

>How long does MrgCop() take (pick random number).  How long does 
>RethinkDisplay() take (seconds)?  How long does BltMaskBitMapRastPort() 
>take? 

	(1) those are not called all the time, (2) you're exaggerating
by a lot.  BMBMRP() is not fast because it operates on arbitrary rectangles,
and some sets require using the A-channel as a mask.  So if you do know
the alignments are ok, OwnBlitter() and program it directly.

-- 
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.commodore.com  BIX: rjesup  
Thus spake the Master Ninjei: "To program a million-line operating system
is easy, to change a man's temperament is more difficult."
(From "The Zen of Programming")  ;-)

mykes@amiga0.SF-Bay.ORG (Mike Schwartz) (03/31/91)

In article <2149@pdxgate.UUCP> bairds@eecs.cs.pdx.edu (Shawn L. Baird) writes:
>mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:
>
>[ ... some stuff deleted ... ]
>
>>No sources will be posted here, but I have done a small game (in 3
>>days in assembler) that I do intend to publish source code to 
>>at a future date.  And believe it or not, it does not kill the OS
>>to the point that it can't be restored.  In other words, it runs from
>>DOS and returns to DOS, but it does NOT multitask.
>
>I'm a bit confused here. First you say it runs from DOS and returns to
>DOS. Then you say you wrote your own boot sector program that doesn't
>return to the OS. Which of the two is true? I suspect that, although
>you save the state of the OS there really isn't a reason to do so. Not
>only this, but using a custom bootblock will render the game only
>bootable on floppy disks and (from the descriptions of doing all of
>the floppy reading on your own) the disk itself will not be in AmigaDOS
>format, therefore there will be no way to install your program on a
>hard drive. What is the point of keeping the OS state when even if you
>did return to it the user has had to reboot at least once just to get
>started?
>

The game I described uses NO operating system calls and does not multitask.
It also runs in <256K.  I wouldn't sell it as a product because it is
puny by Amiga standards.  I did it to prove that I know how to do it.
Again, I repeat that it DOES NOT multitask.

>Basically, it sounds like most other games I've seen. Granted, you
>attempt to do things in a correct and compatible way across all Amigas,
>but I already believe that someone who doesn't do this ought to be
>shot. I think Mike Farren's point, correct me if I'm wrong, was that
>games like Lemmings could be made to take over the OS or multitask with
>the point being to be able to do things like install it on a hard drive.
>While avoiding the operating system routines may sometimes be helpful,
>I sincerely doubt that you find the routines so horrible that you
>avoid using them when running programs like your editor, your assembler,
>etc. You can't tell me that fast blitting is impossible when the OS is
>still intact. Look at CED. It blazes its scrolling along at a very
>fast clip (and I've noticed that the rest of the machine slows down
>accordingly), probably by OwnBlitter() for long periods of time.
>

Cygnus Ed does not try to maintain a 60Hz animated frame rate as
games do.  Comparing apples with oranges here.  A game is not an
application.

>---
> Shawn L. Baird, bairds@eecs.ee.pdx.edu, Wraith on DikuMUD
> The above message is not licensed by AT&T, or at least, not yet.

--
********************************************************
* Appendix A of the Amiga Hardware Manual tells you    *
* everything you need to know to take full advantage   *
* of the power of the Amiga.  And it is only 10 pages! *
********************************************************

mykes@amiga0.SF-Bay.ORG (Mike Schwartz) (03/31/91)

In article <1991Mar31.003933.1483@mintaka.lcs.mit.edu> rjc@geech.gnu.ai.mit.edu (Ray Cromwell) writes:
>
>  Hmm. Time to open my mouth again.
>
>  Poof! You just passed the point of no return. Couldn't you spare an
>extra 1k of code to test for people with an extra 512k of ram? If on a 
> >512k machine, then save the state of the OS. It doesn't much matter
>about returning to the OS, since your game REQUIRES a reboot to load.
> Tell me Mike, have you ever programmed a game other than an action
>game? I find that today's game designs just plain suck. I'd still play
>Tetris, Donkey Kong, Pac-Man over Shadow of the Beast. I'm willing to bet
>that your game would be boring except for the ohh-ahh of pretty
>graphics. Games like Bard's Tale, Ultima, Pool of Radiance, and Neuromancer
>should multitask.
>

If you read the entire posting, it said that I do use the OS in the boot
program to detect memory beyond 512K.  Rather than saving the state of
the OS there, I use it to cache the major parts of the game so disk
accesses aren't needed to reload them.  Once they are loaded from
floppy, the software doesn't go to floppy for the same data again.
I deliberately choose to do this instead of protecting portions of
the OS.  I know how to do it, just don't want to.  The approach I
choose takes advantage of the extra memory and makes the game more
playable for those who have stock A2000's (1Meg RAM) or 1Meg A500's,
and of course, the rest of the enhanced machines, too.

[ stuff about how programs benefit from taking over deleted ]

> The majority of these benefits are just laziness. Sure, it's nice to ORG
>code instead of worry about dynamic memory, and wasting a few extra cycles
>to plot graphics, but unless it's absolutely needed it's wrong.
>
> Consider this. You start writing a game, and optimize it as much as you
>can, shaving cycles, memory, etc. And end up with 64 BOBs moving at 60fps.
>It's possible to degrade the game to 60 BOBs instead of 64 with no
>recognizable loss in quality.
>

The way I do things is by design, and not out of laziness.  I don't make
300K games, I make 512K games.  It is by design.

>  My guess is, when you do a game, you don't have a full design on paper
>or know how much performance you need, therefore you just take the 
>easy route and go for all out performance. Once you achieve your
>game design goal and you find out you still have extra BOB/raster time
>left, you go back to the game design and ADD more objects in, thinking
>'hey, these extra objects are going to make the game a lot more exciting.'
>Either that, or you throw in a ridiculous animation sequence, or
>sampled sound.

If you want to design your own games, go ahead.  But don't tell me and
everyone else how to do it.  If everyone designs games exactly the way
you do, it would get real boring real fast.  The fact that individuals
design games, instead of a design by community approach, gives us variety
in game play, look and feel, and style.

>  I'm going to go out on a limb and say that Budokan did not need
>100% of the Amiga's CPU power to animate the combatants. It may have needed
>all 512k of ram for player objects, but not the entire CPU.
>

Budo was done in Extra Half Brite mode which steals 50% of the CPU cycles.
The shadows for the players were calculated with the CPU, too.  You
assume that animation is all that's going on, and you didn't factor in nearly
everything that is going on.

>>EVERY single instruction that ever gets executed is my own code.  When
>>I single step through routines, I get symbolic information for every
>>single instruction.  I never see jsr offset(a6) and wonder what the
>>heck is going on.  When I do use the OS and step partway into one
>>of the ROM routines, I am appalled by how ugly and inefficient the
>>code is.  When I write the code myself, I am in full control of
>>every clock cycle and byte that is used by the program.
>
>  That's what compiled C code looks like. C is used for large projects
>that an Assembly programmer could never accomplish in a reasonable
>amount of time. I programmed Assembly for 7 years, and I can do things
>in C++/LISP that would take insurmountable time in assembly.
>I do wish C= would optimize certain parts of the OS with Assembly.
>(Optimize Layers Library clipping code in assembly, and Exec's dispatcher,
>and the vertical blank interrupt handler. Intuition should stay in C, but
>stuff like WritePixel should be in assembly.)
>

A couple of points.  Exec is already in assembler language.  You might
optimize just a few bytes/cycles out of it.  It is the foundation of the
OS.  It was done RIGHT (assembler).  Layers is really not that slow when
you consider what it has to do (and compare it with other OS performance).
ReadPixel and WritePixel are still going to be slow because they have to
do rastport clipping (so you don't plot pixels outside of a window).
Finally, development time is also not as much of an issue as is quality.

When you call printf() you are relying on someone else's coding ability.
When you call an OS routine, you are relying on someone else's coding ability,
too.  You not only have your own bugs to deal with but those in the (link)
library and those in the ROM.  When you run native under the OS, you are also
relying on the wide variety of PD software hacks (like POPCLI, etc.) that people
use not to have bugs either.  Remember, one wild store in any of this software
crashes everything.

As far as 'C' goes, I have stepped through enough disassembled 'C' code to
see enough.  How much RAM is wasted on LINK, UNLK instructions?  How many
cycles are wasted by pushing arguments on the stack to call a subroutine?
How many cycles are wasted by calling a "glue" routine?  How many cycles
and bytes are wasted by fixing up the stack after each and every subroutine
returns?  How much stack do you need to allow for all the dynamic allocation
of local variables?  512K is not a lot of RAM to go and waste memory all the
time.  A 7.14 MHz 68000 isn't fast enough to waste all the extra cycles, if
you are striving for performance.

>>I NEVER need to do dynamic memory allocation, so my memory never fragments.
>>Under the OS, if an application doesn't respond quickly enough to 
>>Intuimessages, or you have enough windows opened, the OS starts allocating
>>memory and never frees it up.  And the OS has serious problems with
>>low memory situations (mostly it gurus).  
>
>  I'm willing to bet that all of the memory leaks are fixed in 2.0.
>Give CBM some credit!
>

I do give CBM credit.  The OS still has problems when it can't allocate the
memory that it needs.  It also "leaks" memory in places by design, as in
the case of IntuiMessages.

>>I implement my own BOBs and Multitasking routines. 

>  Are you really multitasking? It sounds to me like your VB interrupt
>is just looping on a state machine dispatching subroutines. (e.g.
>grab state, look up routine in lookup table, jump to it if the state is
>enabled.) This is more like coroutines than real time-slicing. Is it 
>possible for one of your BOB tasks to be interrupted by another?
>

Preemptive multitasking isn't necessary for a game.  The multitasking
kernel I wrote round robins the tasks.  There are a dynamic number of
tasks running at any one time.  The state of each task is preserved
when switching to a different task.  The task structure I use is similar
to what exec uses and the software works similarly to the way that exec
does when it is running a lot of tasks at the same priority.  And to
improve things, my tasking system only preserves half of the CPU registers
instead of all of them (making a task switch faster than Exec's).
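
For readers who want to see the shape of a non-preemptive round-robin
scheduler, here is a small sketch in portable C. It is not the kernel
described above: portable C cannot save and restore CPU registers, so
each task here is modelled as a step function plus a private state
block, which is closer to the coroutine style raised in the question.
All names (scheduler, sched_add, sched_run_one, countdown_step) are
invented for the example.

```c
#include <stddef.h>

#define MAX_TASKS 8

/* One task: a step function and the state it keeps between slices. */
typedef struct task {
    int  (*step)(void *state);  /* run one slice; return 0 when finished */
    void  *state;
    int    active;
} task;

typedef struct scheduler {
    task   tasks[MAX_TASKS];
    size_t next;                /* slot to try first on the next run */
} scheduler;

void sched_init(scheduler *s)
{
    size_t i;
    for (i = 0; i < MAX_TASKS; i++) {
        s->tasks[i].step = NULL;
        s->tasks[i].state = NULL;
        s->tasks[i].active = 0;
    }
    s->next = 0;
}

/* Add a task in the first free slot; returns the slot, or -1 if full. */
int sched_add(scheduler *s, int (*step)(void *), void *state)
{
    size_t i;
    for (i = 0; i < MAX_TASKS; i++) {
        if (!s->tasks[i].active) {
            s->tasks[i].step = step;
            s->tasks[i].state = state;
            s->tasks[i].active = 1;
            return (int)i;
        }
    }
    return -1;
}

/* Run one slice of the next active task, round robin.
 * Returns 1 if a task ran, 0 if no task is active. */
int sched_run_one(scheduler *s)
{
    size_t i;
    for (i = 0; i < MAX_TASKS; i++) {
        size_t slot = (s->next + i) % MAX_TASKS;
        task *t = &s->tasks[slot];
        if (t->active) {
            s->next = (slot + 1) % MAX_TASKS;
            if (!t->step(t->state))
                t->active = 0;  /* task finished; free its slot */
            return 1;
        }
    }
    return 0;
}

/* Example task: counts an int down to zero, one step per slice. */
int countdown_step(void *state)
{
    int *remaining = state;
    if (*remaining > 0)
        (*remaining)--;
    return *remaining > 0;
}
```

The number of tasks can change while the scheduler runs, since slots
are claimed and released dynamically, and every task gets one slice
per pass, which is the equal-priority behaviour compared to Exec above.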

BOBs are separate data structures.  It is possible for a single task
to control many BOBs or for a task to control NO BOBs.  It is also
possible for a BOB to be controlled by nobody (consider a torch animating
on the wall that has no other properties).

> No. If the game was written properly for the OS there would be no
>problem. If a user runs BROKEN programs like GOMF, then it's his fault, not
>the game programmer's.

Too many people will blame the game software.  They already are blaming
phantom garbage on games.

>>The blitter is a very powerful coprocessor and is no piece of cake to learn.  
>
>  I found that the blitter is very easy to learn with the help of 
>Tom Rokicki's BlitLab manual.  The difficulty with the blitter is 
>learning the tricks with masking, and the bltadat stuff to perform
>arbitrary blits anywhere. Also, the fact that the Hardware Manual
>(v1.1 that I have) is WRONG in a lot of places and none of the example
>code works (it assumes NO OS). When I was first fooling with the blitter
>I didn't know how to do a barrel-shift to the left (for a smooth
>scroller of course) This is because the HW manual didn't mention
>the difference in the way the barrel shifter works in asc/descending modes.
>Thank god for Tom Rokicki's and Jeremy San's improvements to the 1.3 HW manual
>and for the BlitLab Manual in those early days.

Sounds to me like Commodore hasn't done much to help.  You obviously have
some experience programming at the hardware level, so perhaps the blitter
was just naturally easier for you to learn.  Having written blitter code
in software (for the Atari ST), I find that it is not a trivial task and
is quite challenging to make it fast.

What language and OS do you think Jeremy San uses to program the Amiga?
I think he is an awesome programmer.  Too bad he hasn't improved the rest
of the manuals.

>  The Amiga hardware seems to have a lot of quirks that have to be overcome
>by tricks. Like stopping an Audio DMA sample, using Disk DMA, programming  
>the blitter for 'bit blits' instead of word blits.

The blitter is a word machine no matter how you look at it.  As in any
language, once you have a good set of routines to do these things, you
can leverage them.  You can modify existing working code to do specific
things or simply reuse them.

>  It reminds me of the C64 days of polling the serial/disk ports, timing
>out raster's to the cycle, and tricking the video chip to hardware scroll
>the screen up and down or provide interlace.
>

These are techniques that allow the C64 to do better than the designers
expected it would ever do.  They also allowed Commodore to sell the
machine for a few years more than it would have otherwise.

>>If you don't need to take over the machine,
>>don't.  If you want to push the machine to its limits, there is NO
>>other way.  Let the game come first.  If you know you can do the
>>game in a small amount of RAM and that performance is not an issue,
>>go ahead and use the OS.  In my approach, if I have RAM left over,
>>I use it for more sounds or instrument samples to make the music
>>better, or to cache more data from the floppy drives.
>
>   This confirms what I said above. You don't design the game in the
>beginning. You program the game according to your ideas, and when you have
>finished, if you find ANY Ram/CPU time left, you TAKE IT ANYWAY.
>Let's assume you find you have 150k of ram left over after all game
>graphics and sound is loaded. There is no reason to use this extra ram
>for a ridiculous animation/sample. It doesn't improve the playability of
>the game. Using this 150k of extra ram, you can save the state of the OS
>for restoration.
>

We had 90K left over in Budokan and that was used by Rob Hubbard to
be as creative as he wanted with the music.  I love the music in
Budo (Rob is well known as THE top musician in the games business).
The music would have suffered if he didn't use all 90K.  There is
NOTHING wrong with doing this.  Budo had something like 9 different
original musical scores done for it.  Too many games just loop a single
sample and just change the pitch and call it music.

If all you want is hard disk installability, talk to Commodore.  I
am.  You don't need the OS for that.  Taking part of the machine and
killing off the OS is not better than taking the whole machine and
killing off the OS.  Why use part of the machine when you can use
it all?  Once people install something on hard disk, they want to 
run it and other Amiga programs at the same time.

>>Unfortunately, the sales life of a game is about 3 months.  Royalties
>>don't keep trickling in.
>
> I bet a game written for 2000/3000 machines would sell for a long time.
>Why? Because there's a niche there that no one has tapped yet, and
>the majority of A2000/3000 owners are not pirates, obviously if they can
>afford an A3000, they can afford a $30 game. 
>

How much can you buy SIM CITY for mail order these days?  I see lots
of games done in the last year selling for $15.  This is what is called
a fire sale.  People reduce the prices to get rid of inventory that
is not moving.  I bought Pirates (access software) for $19.  It installs
on the hard disk.  It does not require 1Meg, which 2000/3000 machines
have.  The 2000 does not have a hard disk (the 2000HD does).

>>Did you know that LoadRGB4() takes a full 60th of a second on a 68000?  
>>How long does MrgCop() take (pick random number).  How long does 
>>RethinkDisplay() take (seconds)?  How long does BltMaskBitMapRastPort() 
>>take? 
>
>  Probably because LoadRGB4 recomputes the copper list. Remember 
>screens/views are a sharable resource. Nothing is preventing you from
>making a screen, disabling the inputdevice (Amiga n/m screen switch)
>and Owning the Copper and Blitter on that custom screen.

Whatever these routines do, they do too much more than you need in
a game.  None of the benefits you mention are even remotely desirable
in action software.  You suggest making an already tight for space
program bigger at the expense of better things.

>  Mike, take a look at Tom Rokicki's RadBoogie. Granted, it doesn't have 
>a SoundTracker playing, or filled vectors, but from a programming
>point of view, it's an excellent example, and proof that you can program
>fast action graphics and still leave the OS functioning.
>(RadBoogie calculates multiple splines in real time, blits a huge Amiga 
>logo, moves a large sprite, plays a Sonix music, and has a custom copper
>going. Most all of those European Filled Vector scrolls/objects PRECOMPUTE
>the object rotations and sin/cos paths with a C/Basic/Rexx program. Then
>the Assembly code merely looks up table values, and uses the blitter
>to draw lines/fill areas)

This is a fine technique.  You still need the CPU/Blitter time to do the
drawing if you have a few shapes.
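
The precompute-then-look-up technique discussed here can be sketched in
C as follows. The table is hard-coded, standing in for the output of
an offline generator program; its values are sin(k*pi/32) scaled by 256
and rounded, and the 64-steps-per-circle angle unit is an assumption
made for the example. The names sin_q8, cos_q8 and rotate_q8 are
invented.

```c
/* Quarter-wave sine table: entry k is sin(k * pi / 32) * 256, rounded.
 * 64 angle steps make a full circle.  Generated offline. */
static const short quarter_sine[17] = {
      0,  25,  50,  74,  98, 121, 142, 162, 181,
    198, 213, 226, 237, 245, 251, 255, 256
};

/* Sine of an angle given in 1/64ths of a circle, scaled by 256.
 * Only a table lookup and a sign flip happen at run time. */
int sin_q8(unsigned angle)
{
    unsigned a = angle & 63u;       /* wrap to one full circle */
    unsigned idx = a & 15u;         /* position inside the quadrant */

    switch (a >> 4) {               /* which quadrant */
    case 0:  return  quarter_sine[idx];
    case 1:  return  quarter_sine[16 - idx];
    case 2:  return -quarter_sine[idx];
    default: return -quarter_sine[16 - idx];
    }
}

/* Cosine is sine shifted by a quarter circle (16 steps). */
int cos_q8(unsigned angle)
{
    return sin_q8(angle + 16u);
}

/* Rotate the point (x, y) about the origin using only integer math. */
void rotate_q8(int x, int y, unsigned angle, int *rx, int *ry)
{
    int s = sin_q8(angle);
    int c = cos_q8(angle);

    *rx = (x * c - y * s) / 256;
    *ry = (x * s + y * c) / 256;
}
```

As the reply above points out, this only removes the trigonometry
from the inner loop; the line drawing and area filling still cost
CPU or blitter time for every shape.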

>
>[deleted... The Amiga is not a C64. Macs don't have any good action
>games because a game that takes over the MacOS won't sell. Period.]
>

Sales are poor, even for hit games.  The people who did the two Dark Castles
won't touch games after being burned twice.

>>>I'm not trying to flame or put down anyone, but better games than the
>>>current crop can still be written!  Better both in terms of
>>>awesome-take-over-the-machine-graphics, and better in terms of
>>>awesome-playability-and-multitasking-code.
>>>
>>
>>I agree here too.  Looking at most Amiga games next to C64 games
>>makes me want to puke.  It is painful to see Amiga games run at 8
>>frames per second while the C64 version of the same game runs at
>>60.  There is no excuse for this.  If you can't achieve the performance
>>under the OS, boot it by all means.  That is what the C64 guys do.  It
>>is a proven technique.
>
>  There is no secret why C64 games run at 60fps. That's because 98% of
>all C64 games are totally sprite driven. Any backgrounds are merely character
>graphics. The C64 doesn't have the horsepower to scroll bitmaps, or
>full color character mapped screens in real time. (You can't do it
>in 1/60 of a second.) It's a pity the Amiga sprites are smaller than the C64's.
>If they were 24/32 pixels wide a lot of games could use them (all 8) instead of
>attaching 2 or 4 together.

Shadow of the Beast runs at 60FPS because they were clever enough to use
the Amiga's sprites.  The Amiga sprites are multiplexable down the screen
just as the C64's are.  The Amiga has plenty of horsepower to put dozens of 
sprites on the screen AND use a bunch of blitter objects.  It depends on
how you use it all.  I wouldn't ignore any feature of the Amiga's hardware.

>  The C64 was really pushed to its limit in both programming and game design.
>The Amiga has already been pushed to its limit (mostly). What it really
>needs is GOOD Design.
>

There is no law of physics that precludes a game from looking as good
as SOTB and playing as good as whatever you consider playable and fun.

>  Mike, if you submit something to a magazine, put it in context. What I mean
>is, don't give the impression that the correct way to program the
>Amiga is to boot the OS. Action games have their place, but there are
>many types of games besides action games that don't need to boot the OS.
>The Amiga is not a C64 (even word processors on the C64 booted the OS).
> 

If Mike Farren somehow gets his stuff published, he should do the same.

>  I know that the OS is a big overhead for using the blitter and disk to its
>maximum abilities, but the fact that you said it only took you three days
>to do that game code shows that you have put much work into it.
>

For me, those three days were 72+ hours (not much sleepy time).  Again,
the game uses 0% of the OS.  I did it to prove that a smallish game can
do it (takeover/restore).  People seem to doubt that I have the ability...
I still wouldn't ship a product that worked that way.

>Bypassing the OS is not
>the way to do this, unless you plan on writing custom routines for
>each hardware configuration. Assembler is nice to know, but its a huge
>waste to use assembler for everything. As hardware gets more and more 
>complex (parallel processing, risc, etc) the assembler programmer must
>know more and more about the hardware. The hardware today is becoming so
>complex that the ORDER of operations is becoming important. Certain
>processors now require instructions to occur in precise order to insure
>the pipeline is filled. Only compilers can keep track of register and
>operation usage throughout the entire program.
>

No compiler for the 68000 can outdo what anyone can do by hand in assembler.
Talking about RISC and incompatible hardware, etc. is not an issue in
the Amiga world.

>  If you're an assembler programmer, the OS is especially annoying because
>almost every OS function operates on C structures and linked lists. I became
>annoyed because there were too many label names to remember for structure
>offsets, and duplicate label names in different structures can't be done
>in assemblers without collision. This is why C should be used for almost
>all operations except speed dependent stuff. I'd suggest that most of
>the 'kill the os' move over to CDTV when it hits the market. There you have
>600 megs of disk space, and the ability to kill the OS and have no 
>complaints.
>

Gee, in a Screen structure, the rastport offset is called RastPort and the
rastport offset in a Window structure is RPort.  This kind of orthogonality
makes it hard no matter what language you use.  I do think that the BEST
feature of 'C' is in its structure handling capabilities.  However, just
as Commodore provides us with .h header files, they also provide us with
.i header files.  They closely parallel one another and they use a quite
logical convention for avoiding label conflicts.  Many of the assembler
conventions have even made their way into the .h files.  Why do some
structures use simple names (as in Window.Height) and others use assembler
style (as in TextAttr.ta_Name)?
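
As a rough illustration of the prefix convention mentioned above, the sketch
below lays out two structures that both have a "name" field and exposes each
field offset as its own flat symbol, which is roughly what an assembler
include file has to do.  The struct names, the simplified field types, and
the `_OFS` symbols are assumptions made for this example, not the contents of
the real Commodore headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Modeled loosely on TextAttr; types are simplified stand-ins. */
struct TextAttrSketch {
    const char *ta_Name;
    uint16_t    ta_YSize;
    uint8_t     ta_Style;
    uint8_t     ta_Flags;
};

/* Modeled loosely on an Exec list node; also carries a name field. */
struct NodeSketch {
    struct NodeSketch *ln_Succ;
    struct NodeSketch *ln_Pred;
    uint8_t            ln_Type;
    int8_t             ln_Pri;
    const char        *ln_Name;
};

/* One flat namespace, as in an assembler: the ta_/ln_ prefixes keep the
 * two "name" offsets from colliding (hypothetical symbol names). */
enum {
    ta_Name_OFS  = offsetof(struct TextAttrSketch, ta_Name),
    ta_YSize_OFS = offsetof(struct TextAttrSketch, ta_YSize),
    ta_Style_OFS = offsetof(struct TextAttrSketch, ta_Style),
    ln_Name_OFS  = offsetof(struct NodeSketch, ln_Name)
};
```

The actual offsets depend on the host's pointer size, so the numbers printed
on a modern machine will differ from those on a 68000.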

In practice, assembler is quite good at both data structures and event
driven programming.  If you haven't tried it, check it out and then
comment again.

>
>--
>/~\_______________________________________________________________________/~\
>|n|   rjc@albert.ai.mit.edu   Amiga, the computer for the creative mind.  |n|
>|~|                                .-. .-.                                |~|
>|_|________________________________| |_| |________________________________|_|

--
********************************************************
* Appendix A of the Amiga Hardware Manual tells you    *
* everything you need to know to take full advantage   *
* of the power of the Amiga.  And it is only 10 pages! *
********************************************************

xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) (03/31/91)

In article <2149@pdxgate.UUCP> bairds@eecs.cs.pdx.edu (Shawn L. Baird) writes:
mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:

[ ... some stuff deleted ... ]

>> No sources will be posted here, but I have done a small game (in 3
>> days in assembler) that I do intend to publish source code to at a
>> future date. And believe it or not, it does not kill the OS to the
>> point that it can't be restored. In other words, it runs from DOS and
>> returns to DOS, but it does NOT multitask.

[ ... some stuff deleted ... ]

>> When the Amiga first boots, it asks for a workbench disk. If you have
>> an autobooting hard drive and a bootable floppy is inserted, the
>> machine will boot from the floppy. In any case, the ROM Kernel loads
>> what is called the boot program from track 0 of the floppy disk into
>> RAM and does a JSR to it. The standard Amiga OS bootsector program
>> simply opens dos.library and then does an RTS and the system
>> continues to boot up into the normal Operating System.

>> Well, I wrote my own boot sector program that just doesn't return to
>> the OS. In this boot sector program, I call several ROM Kernel

[ ... rest deleted ... ]

> I'm a bit confused here. First you say it runs from DOS and returns to
> DOS. Then you say you wrote your own boot sector program that doesn't
> return to the OS. Which of the two is true? I suspect that, although
> you save the state of the OS there really isn't a reason to do so. Not
> only this, but using a custom bootblock will render the game only
> bootable on floppy disks and (from the descriptions of doing all of
> the floppy reading on your own) the disk itself will not be in
> AmigaDOS format, therefore there will be no way to install your
> program on a hard drive. What is the point of keeping the OS state
> when even if you did return to it the user has had to reboot at least
> once just to get started?

OK, since even Mike's followup to this didn't clarify the situation
much, let me try. I've sat and had Mike walk me through his "standard"
game code at greater length than he's described it here, and I just beta
tested the example game he was describing (easy since we share a host
machine).

The problem is, you've taken two different subjects for one.  Read Mike's
above two quoted excerpts as:  1) I have this example game showing the
basic principles I use, and it works like this; versus 2) When I make a
real commercial game for the Amiga, it works like this.

In fact, I took his example game, stuck it in Rad:, and ran it from the
command line. It is a pretty impressive example of the "fly a spaceship
down a corridor and shoot at gobs of nasties who are busily shooting
back, grab the various fuel and other pods before they get by, etc."
variety. It is highlighted by amazing graphics, a scrolling starfield
below the platform over which you fly your ship, that is three layers
deep and gives the perspective of depth as it moves by, sparkling
graphic imagery, thanks to an artist friend of Mike's, things just
_flying_ around.  It most reminded me of the old Apple ][+ game "AE"
in terms of the number and variety of nasties in sight.

No sound, but I only uploaded 49K of program! It used the usual
misdirection to achieve the illusion of higher performance from my stock
processor A2000; probably two thirds of the screen area was a static
frame and status display; all the dynamic stuff was being done in the
other 1/3 of the screen. That translates into lots less pixels to move
about. Everything but the stars seemed to be built out of the 16x16
tiles he described in a recent rec.games.programmer article. The tiles
that looked like open fretwork really were, you could watch the stars
slide by under them (more misdirection, but that's what I saw!)

Eventually I managed, after losing a couple of dozen games in a row (I'm
47, fuzzy eyed, and have a gimp arm, I'm never going to be an arcade
champ again) to evoke a bug (it _was_ a beta, after all), one of the
sprites smeared itself over the screen two centimeters wide and screen
height. I followed the directions for exiting the program, (touch a
mouse button), and despite the bug, I was right back in the workbench
looking at the shell window from which I had invoked the game, with
normal operations, no problems.

Just as he described, his game nuked multitasking while running, but
gave it back when done, good as new.

The most impressive thing was that this is a game Mike cobbled together
from standard parts, some artwork by a friend, and his usual code, but
slewed clear out of its normal environment to coexist with the OS, it
fit in 50K of code, he got it running in three days, and it's
_addictive_; clean up the bug and he could be selling it in the stores.

I don't much care for Mike's nuking the OS, and coding primarily at the
assembler level, but lose the arguments that his development therefore
_necessarily_ has to take a long time or that the game's playability
_must_ get lost when he's fighting assembler. He's got a great toolbox
of tested code, and it is a productivity engine, even in assembler.

I may disagree with him a _lot_ (we just started talking again at all
after long silence), but he's damned good at what he does, it supports
him in the good life, and he has a lot to teach about programming the
way he likes to do it, and he deserves to be heard by those wanting to
get a start programming the bare metal to make the big bucks.

Kent, the man from xanth.
<xanthian@Zorch.SF-Bay.ORG> <xanthian@well.sf.ca.us>
--
"Now if he'd just let me multitask from the HD!" -- Mr. Incorrigible  ;-)

sschaem@starnet.uucp (Stephan Schaem) (04/01/91)

 And also CED could use interlaced bitplanes to eliminate any flicker!
 But no OS function supports that mode...
 A library extension like the one we do could make it available, but
 the Amiga look is not there anymore :-)

						Stephan.

sschaem@starnet.uucp (Stephan Schaem) (04/01/91)

 Most games can offer bigger samples.  Have 2 versions of the music and
 load the big one if you have the memory.
 But anything could be better if you think that way: load another game
 version on a 16 meg machine (extreme, but that kind of shows the point).

							Stephan.

dillon@overload.Berkeley.CA.US (Matthew Dillon) (04/01/91)

In article <mykes.0774@amiga0.SF-Bay.ORG> mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:
>So what are the benefits of taking over?
>
>Well, I am guaranteed that ALL 512K are mine to use.  I can ORG any
>graphics, code, or audio data at any hard coded location I want.  This
>practice allows things like blitter routines to have hard coded constant
>addresses in them, which saves CPU cycles where you need them the most.

    Read my lips: YOU CANNOT ORG A HARDWIRED RAM LOCATION, PERIOD.  Why the
    @#$#@$ would you want to anyway?  Do you know how EASY it is to write a
    LoadSeg() equivalent to load relocatable executables?  It took me less
    than two days to write my dynamic object loader and THAT deals with
    full object files, not just executables!
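
A minimal sketch of the relocation step such a loader performs, written in
portable C rather than against the real AmigaDOS hunk format.  It assumes the
image stores addresses as offsets from zero and comes with a table naming
each 32-bit slot that needs the load address added.  The names `relocate`,
`image`, and `reloc_offsets` are hypothetical, and a real loader would also
parse hunk headers and obtain its memory from Exec.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Add `base` to every 32-bit slot listed in `reloc_offsets`.
 * Slots are read and written in host byte order in this sketch.
 * Returns 0 on success, -1 if an offset falls outside the image. */
int relocate(uint8_t *image, size_t image_len,
             const uint32_t *reloc_offsets, size_t reloc_count,
             uint32_t base)
{
    for (size_t i = 0; i < reloc_count; i++) {
        uint32_t off = reloc_offsets[i];
        uint32_t slot;

        if (image_len < sizeof slot || off > image_len - sizeof slot)
            return -1;                     /* malformed table: give up cleanly */

        memcpy(&slot, image + off, sizeof slot);  /* alignment-safe read */
        slot += base;                             /* offset becomes address */
        memcpy(image + off, &slot, sizeof slot);
    }
    return 0;
}
```

Because the fix-up runs once at load time, the relocated code pays no
per-frame cost for not being assembled at a fixed address.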

    What happens when an AutoConfig board allocates ram that just happens
    to overlap the ram that you DON'T bother to allocate?

>I can put graphics screens and the stack anywhere I want.  For a 16
>color game, I put the stack at $80000 and a screen at $78000.	The
>stack needed for any program I write is < 512 bytes.  The resulting
>memory map gives me from $100 to $78000 to squeeze the game into.  And
>I do mean squeeze.

    Arrg.

    Well, I'll tell you, I had a 450 line response to this whole thing
    but have decided that it simply is too emotional, so I'm cutting out
    a large portion of it (nice having an Amiga as a UUCP node, you can
    go back and fix postings before it batches out)..  It's only 200
    lines now.

    Suffice it to say that except for the trackdisk.device, all of your
    tweaking amounts to less than 1% CPU utilization ... MUCH less.  Your
    VBLANK interrupt based task scheme easily eats up 1-2% of the CPU in
    overhead alone, perhaps you should think less of absolute memory ORGs,
    using absolute word instead of A4-relative, and OR'ing and AND'ing the
    SR in supervisor mode and MORE about using better algorithms.

    You CAN take over much of the OS in a friendly manner and obtain 99.9%
    of the CPU for your game without SMEARing the OS so bad it requires a
    reboot.  You CAN manage dynamic memory allocation and executable
    relocation without losing a single iota of CPU.  A single Forbid()
    goes a long way in a friendly take over, try it.  EXEC memory
    allocation has 0 memory overhead, and you only need do it during
    initialization.

    You do NOT have to use OS calls if they are too slow, but you should at
    LEAST allocate the appropriate resources before you bang on hardware,
    and fail gracefully if the calls return an error.

    You do NOT have to make life difficult for yourself adding vapor
    complications, simple interleaving of CPU computations generally
    yields a more efficient design without putting the blitter into
    nasty mode, though it should be noted that there is nothing wrong
    with doing so per se (just that you gave yourself a lot of grief
    trying to program around it after you did so).

    I think you will find that the performance gained by keeping $DFF000 in
    A4 instead of a small-data base is, well, negative... completely
    useless... A4 would be better served used for the small data base.
    Calculate the cycles required to LEA $DFF000,A0 at the beginning of a
    routine that needs it.  CPU overhead < 1% assuming 500 blit starts / sec
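
The calculation suggested above can be written out as a one-line helper.
The figures plugged in afterwards are assumptions, not measurements: 12
cycles for an LEA with an absolute long operand on a 68000, and an NTSC
clock of about 7.16 MHz.

```c
/* Percent of CPU time spent on a fixed per-call cost.
 *   cycles_per_call : extra cycles paid on each call
 *   calls_per_sec   : how often the routine runs
 *   cpu_hz          : CPU clock in cycles per second */
double overhead_percent(double cycles_per_call, double calls_per_sec,
                        double cpu_hz)
{
    return 100.0 * cycles_per_call * calls_per_sec / cpu_hz;
}
```

With those assumed numbers, overhead_percent(12, 500, 7159090) works out to
6000 cycles per second, which is under a tenth of one percent of the CPU.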

    The point is, there are LOTS of tricks you can use that are perfectly
    legal.  The end result of most of your illegal tweaking is to gain a
    few cycles that amounts to NOTHING.  Sorry, didn't mean to insult, but
    you should really do some theoretical work before you write your
    programs, you'll be surprised.

>1.2, 1.3, and 2.0).  If the game were written under the OS, you'd have
>nightmares testing all the possible software configurations.  For example,
>does the game work with GOMF installed?  How about GOMF and DMouse?  How
>many possibilities do you see?  I see bazillions :)  And what if you
>allow multitasking and some CHIP RAM pig program (like DPaint) is already

    Hey, the USER knows about all of that.. .the idea is that if you are OS
    friendly, nothing dies most of the time.  Your solution -- forcing the
    user to not run ANYTHING, is not a solution because you KILL
    everything, as in destruct it, reboot when done.

>No sh*t sherlock!  The blitter is a very powerful coprocessor and
>..
>because there is NO index.

    You are missing the fact that you do not have to share the blitter
    with the OS, since you don't appear to want to use the OS, you can
    still obtain the blitter and keep it until your game exits, at which
    time the OS continues on its merry way as if nothing happened.

>Unfortunately, the sales life of a game is about 3 months.  Royalties
>don't keep trickling in.

    Unless you write a really good game, I can think of many that have been
    on the market for years because they were created with care, very good
    games, though not necessarily OS friendly.

    It surprises me that game writers have a tendency to ignore many amiga
    platforms... not only that, but ignore the bad press that occurs when
    most of the games fail to work on the reviewer's Amiga.  It tends to
    cast an image of major greed on the game writers and companies.

>The OS is definitely worth knowing.  I do know it.  The first game I did
>used the OS (the only one I will ever do that way).  I have written
>megabytes of software that uses the OS.  It is a great OS.  No argument
>from me.  It just is built to outperform a Mac, but not a C64.  The
>hardware blows away everything from C64s to Macs to the Genesis, but you
>wouldn't know it from watching "performance oriented" games that use it.

    Not to be picky, but do you have any formal education in algorithm and
    theoretical design?  Any operating system classes?	graphics classes?
    Any BOOKS? Your attitude is similar to my attitude during Jr high in
    terms of programming, one I rapidly abandoned after being introduced to
    teletypes and, later, UNIX systems.

    ---

    There are many, many good algorithms, one does not appreciate how many
    and how good unless one receives some formal education that relates a
    few of them.  Over the years I have bought thousands of dollars worth
    of books on algorithms, graphics, languages, and other fields and still
    don't know half of what interests me.  We are not talking about
    required college reading here.  I always surprise myself by finding
    something new in a field I thought I had mastered, and half of those
    surprises end up in a major performance increase to some program of
    mine, easy to implement because said program was usually written in
    a high level language.

    Frankly, most game writers do not have that kind of background, most
    people WITH that kind of background generally aren't inclined to write
    games.  Being able to say you have written a dozen games only says that
    you have written a dozen games... from my perspective it sounds quite
    dull, doing the same sort of thing a dozen times without really
    LEARNING anything.

    There is also this feeling among game writers, at least the ones I
    have talked to (possibly owing to the fact that 90% never got past
    high school) that all the 'sophisticated' algorithms you learn in
    college and industry, and use for large highlevel language projects
    don't apply to games, or them.. They already *know* the best solution.
    <Sigh>, I always have to laugh at that... there is *always* a better
    way to do something and you need a good broad background (though not
    necessarily good grades) to grasp the better ones.

    ---

    I am not blanketing assembly language programmers, being one myself...
    However, I generally do all but the most critical routines in a high
    level language (C for now), you simply cannot do anything serious in
    assembly beyond games without taking a LONG time about it.	Not
    everyone is like that... people like Jez are highly professional
    programmers who PREFER to write in assembly, and easily hold their own
    in any technical conversation.  We spent a while comparing the
    algorithms used in DAS and ARGASM, the basic difference being that the
    two assemblers were designed for different end-user applications.  DAS
    needed to have forward referenced branch optimization and support
    forward referenced XDEFs while ARGASM was designed for speed to
    assemble megabyte programs fast, and thus is able to forego many
    optimizations and some forward referencing capability (for XREFs) to
    accomplish its job in a single pass while DAS does its job in 5 passes
    (2 principal), yet DAS is only half as fast as ArgAsm.

    Similarly, certain DICE c.lib support routines are written in assembly.
    movmem(), setmem() and siblings, for example, though not strcpy() (DICE
    does a good enough job that I can leave that in C).  You get the idea
    though, mix and match according to design requirements.

>I agree here too.  Looking at most Amiga games next to C64 games
>makes me want to puke.  It is painful to see Amiga games run at 8
>frames per second while the C64 version of the same game runs at
>60.  There is no excuse for this.  If you can't achieve the performance
>under the OS, boot it by all means.  That is what the C64 guys do.  It
>is a proven technique.

    It's a matter of refinement that has nothing to do with the OS.  After
    all, many of the games you are thinking of are most likely written all
    in assembly and take over the OS, right? :-)  And they are still slow.

    It's all algorithms.  The best example is the use of interrupts.
    Interrupts have a lot of overhead... I mean, a LOT.  A directly
    vectored 68000 interrupt takes 10-20uS just to process! that's
    interrupt fetch to RTE.  If you add even the most simplistic of service
    chaining or checking (for example, the CIA interrupt must handle 8
    interrupt sources) you immediately add, oh, at least 200uS, or 1600
    cycles of overhead and you haven't even gotten CLOSE to your interrupt
    handler yet.  Your little task interrupt based on the VBI CIA chain is
    one of the absolute WORST ways to do it!
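
To put the interrupt figures in perspective, here is a small sketch that
converts microseconds to cycles and to a share of each second.  The 8 MHz
clock is the round figure implied by "200uS, or 1600 cycles", and the rate
of 60 interrupts per second (one per NTSC vertical blank) is an assumption.

```c
/* Convert a duration in microseconds to CPU cycles at clock `cpu_hz`. */
double us_to_cycles(double us, double cpu_hz)
{
    return us * cpu_hz / 1e6;
}

/* Percent of each second consumed by `per_sec` interrupts that each
 * cost `us` microseconds of overhead. */
double interrupt_load_percent(double us, double per_sec)
{
    return 100.0 * us * per_sec / 1e6;
}
```

Under those assumptions 200 microseconds is 1600 cycles, and 60 such
interrupts a second cost about 1.2% of the CPU before the handler does any
useful work.  At the Amiga's actual clock of roughly 7.16 MHz the same 200
microseconds is closer to 1430 cycles.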

					-Matt

--

    Matthew Dillon	    dillon@Overload.Berkeley.CA.US
    891 Regal Rd.	    uunet.uu.net!overload!dillon
    Berkeley, Ca. 94708
    USA

bairds@eecs.cs.pdx.edu (Shawn L. Baird) (04/01/91)

DXB132@psuvm.psu.edu writes:

>In article <2149@pdxgate.UUCP>, bairds@eecs.cs.pdx.edu (Shawn L. Baird) says:

>That example doesn't help your point. CED uses blitting which is many many
>times slower than using the copper to aid scrolling. But it has no choice
>in order to remain Intuition compatible.

The point was fast blitting. It is rather unreasonable for an editor such
as CED to waste its memory holding a bitmap version of the text. Also, CED
doesn't scroll an entire viewport at once. Otherwise the scrollbar at the
left would be a bad placement.

If you want to see an example of what should have used a custom copper and
the bitmap scrolling abilities of the hardware (for that matter, it didn't
really need smooth scrolling even) take a look at SimCity. All I am saying
is that any performance needs can be met by simply taking the machine over
while still under the operating system. It is only performance accompanied
with the need to run in 512k and on an 880k disk that prevents game makers
from doing so.

What is this about having no choice and not being able to use the copper
to aid scrolling? It is perfectly legal to do whatever the hell you want
with your copper list once you've grabbed your custom screen. Take off
menus by ignoring the right mouse button and not having one present and
you don't have to worry about Intuition coming along and messing up the
contents of your screen. Nothing else should be attempting to put windows
up on it unless it's either public via the screenshare.library or via
2.0's public screen functions (which I really don't know much about). The
point is that a lot of games could do this to increase performance in a
multi-tasking environment. These are games that usually don't need as
much performance as games that usually take over the OS.

Still, you can get all the speed you need, even under an Intuition screen.
Just pop a call to Disable() and start doing everything yourself. I can
even imagine writing a demo that shows people this could be done.
Something that looks like CED without a scrollbar which uses a second
viewport to display the data. Then the user just holds down the left mouse
button and moves up or down to scroll or something. Should be plenty fast.
In fact, I bet I can guarantee 60 frames/sec with multi-tasking. (True, I
wouldn't be doing very much.) Turn off multi-tasking at this point and I
can guarantee 60 frames/sec with just about any normal game activity going
on where the game itself would run at 60 frames/sec as well.

>-- Dan Babcock

---
 Shawn L. Baird, bairds@eecs.ee.pdx.edu, Wraith on DikuMUD
 The above message is not licensed by AT&T, or at least, not yet.

mykes@amiga0.SF-Bay.ORG (Mike Schwartz) (04/01/91)

In article <23937@well.sf.ca.us> farren@well.sf.ca.us (Mike Farren) writes:
>
>It occurs to me that there is a distinct difference of outlook which is
>the basis of the disagreements here.  Mike and Stephan look at the
>Amiga, as far as I can tell, as a game machine which just happens to be
>a computer system as well.  Both of them are concentrated pretty much
>solely on getting the maximum bang from the hardware.  Both of them
>want to use every single resource that is available to make their games
>snazzier, more impressive, flashier, and quicker.  And both of them
>are probably good at what they do - at least, my email from Mike would
>indicate that it's certainly true of him, and, by reputation, Stephan
>as well.
>

Exactly!  Maximum bang from every resource!  Snazzier, more impressive,
and quicker!

>I happen to disagree with their outlook very strongly, because it is an
>outlook which inherently limits what you can do with the Amiga.  It
>puts limits on the game, it puts limits on the user, and it puts limits
>on the sales - I, for one, will think long and hard before buying any
>game which shuts down my system, although I did buy Lemmings because it
>was just too good to ignore.  Limits is what it's all about - and I feel
>that the fewer limits you accept, and the more you try and open up the
>limits you do have to work under, the better off we all will be.
>

PLEASE, PLEASE, PLEASE, all you netters out there, please don't confuse
EGO with passion for wanting to see more excellent games, like Lemmings,
which take over the machine.

The only thing that we differ on is what LIMITs are.  I say <512K is
a limit.  I say wasting any resource on the OS at the expense of the
game is a limit.

>It's all in how  you look at it. I propose that if you approach the game design
>with the attitude "I won't take anything away from the OS unless I absolutely,
>positively, without a doubt have to", and utilize your skill and cleverness
>as a programmer to making that so, that you will find that the vast majority
>of games will NOT require sacrificing the OS.  All that I am asking of any
>programmer is that they work from that basis - not the opposing one which
>seems to start out with the assumption that you WILL take over the machine,
>an assumption which makes any other choice much harder to implement.
>

Let the game come first.  Let the market come first.  Give as many people
what they want.  If the game design doesn't require killing the OS, don't.
But don't just design games that fit in with the OS, either.  Make the games
you WANT to make and if the OS is in the way, don't bother with it.

>The thing is that despite the protestations of the "take over the machine"
>folks, I just haven't seen all that many games for the Amiga which, when
>I looked at them closely, really required the entire machine under any and
>all circumstances.  Yes, there are some - things like Turrican or any of
>the other very fast shoot-em-ups.  But those games are the minority.  If
>you look at the Top Ten list of games, you don't see shoot-em-ups on there
>very often.  Much more popular are things like SSI's AD&D series, Populous,
>SimCity/SimEarth, and the like - games which absolutely and positively do NOT
>require a complete takeover of the machine, even though many of them do.
>

I bet the top ten list of games are all nintendo carts.  If the same games
that ran on the nintendo ran on the Amiga, the Amiga would have sold
12+ Million machines instead of just 2 Million.

>I called the refusal of some programmers to even consider operating in an
>Amiga-friendly way "laziness", and I'll stick to that.  If they've done
>their homework, as Mike and Stephan seem to have done, and have come to
>a rational conclusion that trashing the OS and taking over the machine is the
>only way that they can do the game they want to do, then that isn't
>laziness.  If, however, they've just taken over the machine because it's
>more convenient for them or because they just don't want to do the hard
>work necessary to make a game Amiga-friendly, then it's laziness, no matter
>how hard they work to get their custom disk loaders or graphics routines
>or whatever.
>

You (not specifically you, Mike) shouldn't assume that people are too lazy
to learn the OS (or to be friendly with it) unless they tell you they are.
I'm pretty sure that most people who make games for the Amiga use the Amiga
OS all the time while developing it and can see its virtues.

Thanks for the voice of reason.

>-- 
>Mike Farren 				     farren@well.sf.ca.us

--
********************************************************
* Appendix A of the Amiga Hardware Manual tells you    *
* everything you need to know to take full advantage   *
* of the power of the Amiga.  And it is only 10 pages! *
********************************************************

jsmoller@jsmami.UUCP (Jesper Steen Moller) (04/01/91)

In article <Mar.30.22.51.19.1991.17088@pilot.njin.net>, Tom Limoncelli +1 201 408 5389 writes:

[very, very healthy arguments deleted - I have no intention to dig further
into this thread]

> As soon as your statistical sample (European kids) starts getting
> '030s and HDs I'm sure that you'll produce software that I feel like
> buying.  Too bad it takes so long for Europeans to come up to American
> standards.  I guess we follow them in fashion and they follow us in...
> oh never mind.

Yeah - unfortunately "European kids" (of which I might be considered one)
have to pay up to twice as much for hardware as you lucky Americans. No
wonder we're a bit slow in getting all this new stuff (I am a bit out
of the ordinary, as I run an A3000).
Not all European kids are OS-nukers - some actually like the qualities
of the OS and write nice, multi-tasking (even multi-threading if necessary)
programs. But in general it's a different ball-game over here. We get
the stuff a lot later than you and the prices are very high. The street
price for a raw Quantum LP105S in Denmark is twice the price in the U.S.

Someone posted an article on the Europe/USA relationship a long time ago
(before the c.s.a split, if I remember correctly). The Amiga 500 owners
here are mostly teenagers (like me, but I bought an A1000 instead :) ),
who can't afford more than a plain A500, and a hardware tech manual.
So they code and code and get quite good at it, but use processor timing
loops and move SR, everywhere, because it's the fastest.
I never did that, I used C instead and programmed utilities instead,
and I was laughed at (in the demo society), until my utilities suddenly
got quite good. I wrote a VirusExterminator which made it as far as
South America :) but that never impressed the assembler coders.
They still use a plain A500 with two drives and a 501 clone.
Except for the ones who got hired to do some commercial stuff,
they add kludges to avoid the bad effects.

A discussion like this would help to get some (only a few perhaps)
of these "poor European kids" to see that there's another world out
there (=overthere), and I would like to give some of the very good
and well-argumented articles written in this thread to the unofficial
diskmagazines that exist in the European demo-societies. I would have
posted Mike Farren's article, but it was copyrighted, and even if 90%
of the Amiga society here don't really care about copyrights, I do.

I will write a detailed "official warning to ROM jumpers, etc. etc."
and give to my demo-contacts, and if anybody would like to write
some additional stuff, you can reach me at
{rutgers|uunet|pyramid}!cbmvax!cbmehq!cbmdeo!jsmami!jsmoller. As these
magazines are free, all you can get from this is a go at salvaging some
European kids...

> Everytime I meet someone looking to trash their 500 and buy an A3000
> they say, "but I don't want to because all my games won't work".  Mike
> et al seem to have made a captive audience.  Mike won't change until
> the users do, if Mike had never started the users could upgrade.

Mike seems to write compatible programs that work on all Amigas, but
haven't got the qualities "we" want (multitasking, HD-support). But the
one game that I like runs on a 25MHz '030 under 2.0 very nicely indeed. But
it doesn't multitask - a damn shame...

> -Tom

Jesper

> P.S.  When someone is considering upgrading, I ask them to try to find
> the games that they think won't work on an A3000 running 2.0.  Then I
> try to get them to notice that if they haven't used those in the last
> 6 months, they most-likely won't miss them.

Dead right.
--                     __
Jesper Steen Moller   ///  VOICE: +45 31 62 46 45
Maglemosevej 52  __  ///  USENET: cbmehq!cbmdeo!jsmoller
DK-2920 Charl    \\\///  FIDONET: 2:231/84.45
Denmark           \XX/

mykes@amiga0.SF-Bay.ORG (Mike Schwartz) (04/01/91)

In article <20213@cbmvax.commodore.com> jesup@cbmvax.commodore.com (Randell Jesup) writes:
>In article <mykes.0774@amiga0.SF-Bay.ORG> mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:
>>the machine will boot from the floppy.  In any case, the ROM Kernel
>>loads what is called the boot program from track 0 of the floppy
>>disk into RAM and does a JSR to it.  The standard Amiga OS bootsector
>>program simply opens dos.library and then does an RTS and the system
>>continues to boot up into the normal Operating System.
>
>	You should learn more about the OS...  It actually never returns
>if it opens dos.library.  Dos starts the initial process, and then kills
>the initial task (itself).  The initial process continues the boot process.
>(I'm the person who rewrote the Dos in C and asm.)
>

See page VII-4 and VII-5 of AmigaMail, which shows the source code to
the "official" bootsector program.  If the program OPENs dos.library, it
returns a ZERO in D0, if it fails, it returns -1.  Not that I am arguing,
just telling you what your own documentation says.

>>a BIOS of sorts.  In addition to this 8K of KERNEL code, there is another
>>12K of floppy disk drivers, because I will not have the operating system
>>running to read any further data from the floppies.
>
>	Note that even 2.0 trackdisk is only about 7K long.
>

I provide additional features that trackdisk doesn't, like the ability
to search for a disk in any drive.  This function is NOT performed by
trackdisk, but by DOS.

>>Since I am in supervisor mode, there are NO illegal instructions that
>>can be executed (privilege violations cause a GURU under the OS).  The
>>User Stack Pointer (USP) also can be used as a quick place to save
>>an address register (this is 2x faster than a push on the stack).
>>The upper byte of the status register is available to disable various
>>levels of interrupts (the INTENA on Paula is just as useful), and
>>the TRACE bit is available for debugging purposes.
>
>	Hopefully, if you programmed your game right, you shouldn't have
>to worry about executing an illegal instruction by mistake (except of
>course ILLEGAL).  Disabling interrupts in the processor is faster than
>disabling them in Paula, since you don't have to get on the chip bus to
>do it.
>

Agreed.  If you access the IM (interrupt mask) bits to disable interrupts,
and you are not in supervisor mode, you guru.  I do use ILLEGAL for
breakpoints.  I also use the TRACE bit for single stepping and other
features.

A couple of neat tricks that can be done with the TRACE bit:

1)	Instead of a standard single step trace handler, you can
	install a trace handler that fetches the PC from the stack
	and stores it in a circular buffer.  When your program
	has a "banana land" bug (i.e. you push some number on the
	stack and do an RTS without popping it by accident), you
	can examine the circular buffer to see what sequence of
	instructions got you there.

2)	You can implement a trace handler that checks to see if
	a memory location is being clobbered.  Useful for detecting
	wild stores.
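
The bookkeeping behind trick 1 can be sketched in portable C. This is only
an illustration, not a 68000 trace exception handler: a real handler would
fetch the PC from the exception stack frame, whereas here trace_record() is
a hypothetical function that is simply called with an address. The names
and the buffer size are assumptions made for the example.

```c
#include <stdint.h>
#include <stddef.h>

#define TRACE_SLOTS 8  /* power of two so the index can wrap with a mask */

static uint32_t trace_pc[TRACE_SLOTS];
static size_t trace_count;   /* total number of PCs recorded so far */

/* Record one program counter value, overwriting the oldest entry. */
void trace_record(uint32_t pc)
{
    trace_pc[trace_count & (TRACE_SLOTS - 1)] = pc;
    trace_count++;
}

/* Copy the most recent PCs, oldest first, into out[]; returns how many. */
size_t trace_history(uint32_t *out, size_t max)
{
    size_t have = trace_count < TRACE_SLOTS ? trace_count : TRACE_SLOTS;
    size_t n = have < max ? have : max;
    size_t start = trace_count - n;
    for (size_t i = 0; i < n; i++)
        out[i] = trace_pc[(start + i) & (TRACE_SLOTS - 1)];
    return n;
}
```

After a crash, dumping trace_history() gives the last few addresses executed
before control went astray, oldest first.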

>>The floppy disk drivers I wrote use a single 10K buffer to handle as
>>many disk drives (up to 4) that may be configured.  The OS routines
>>will steal 40K if you have 4 drives (10K per drive).
>
>	Amusing.  First, you need more than 10K for a MFM buffer, since
>the number of bytes (decoded) per track can be as high as 6812, so the MFM
>buffer must be at LEAST 13624, and you actually want it a bit larger in
>some cases (we use 15296 - we keep a gaps-worth of NULLs (aaaaaaaa) before
>the spot where we read, to make writing faster/easier).  Sure you don't mean
>16K?  (Which is what the OS in 1.3 used, though it was a bit more than was
>needed.)
>>enhanced performance a few ways.  One way is that the blitter is in
>>nasty mode, so encoding/decoding the MFM data is as fast as possible.
>
>
>	Decoding is faster with the processor, if you also are going to 
>check the checksum.  Nasty mode will hurt your interrupt response time.
>Sounds like the classic "optimize the routine within an inch of its life,
>and miss the fact that a different algorithm would be twice as fast".
>BTW, when there's a >2 bitplane (>4 in 320x200/400) display, running code
>from the ROMs is faster than from ram, since you don't have to pay the penalty
>for getting cycles from the chip bus (since in your way of programming,
>all your code ends up in chip ram) - annoying for something that could use
>the extra horsepower, like 3d games.

I actually use 12K or 14K.  I decode the MFM a sector at a time when
needed.  Are you sure that using the blitter in NASTY mode isn't
faster than the CPU?  I've been asking for ROM routines to do what 90%
of games do (without the OS).  And one last point (I do know the OS :)
the OS (1.3) uses QBlit to encode/decode MFM.  Not the fastest way, by
any means.
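
For readers following the buffer-size argument: MFM stores one clock bit per
data bit, which is why 6812 decoded bytes need at least 13624 bytes of raw
buffer. In the standard Amiga sector format each longword is stored as its
odd bits followed by its even bits, so decoding is a mask and a shift. A
minimal C sketch, with hypothetical function names and with clock-bit
generation and checksums left out:

```c
#include <stdint.h>

#define MFM_DATA_MASK 0x55555555u  /* bit positions that carry data, not clock */

/* Raw MFM bytes needed to hold a given number of decoded bytes. */
uint32_t mfm_raw_bytes(uint32_t decoded_bytes)
{
    return decoded_bytes * 2u;
}

/* Split one data longword into its odd-bit and even-bit halves
   (clock bits are left as 0 in this sketch). */
void mfm_split(uint32_t data, uint32_t *odd, uint32_t *even)
{
    *odd  = (data >> 1) & MFM_DATA_MASK;
    *even = data & MFM_DATA_MASK;
}

/* Recombine the two halves; clock bits are masked off and ignored. */
uint32_t mfm_join(uint32_t odd, uint32_t even)
{
    return ((odd & MFM_DATA_MASK) << 1) | (even & MFM_DATA_MASK);
}
```

Whether this runs faster on the CPU or the blitter is exactly what is being
argued above; the sketch only shows what work has to be done per longword.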

>
>>When data is written to diskette, it is arranged so that NO extra disk
>>revolutions will be made during readback.  This is done by timing
>>the read and write routines so that by the time a track has been read
>>in and the head stepped, the next start of track is under the head
>>and ready to go.
>
>	Unless you're pulling partial tracks off and using them before
>the revolution is complete, or unless you're going to write it out again,
>this makes no difference - and for writing all it saves you is a block-move
>to eliminate the gap.
>

It DOES save a block move, which is not a particularly fast thing to do.
I hope that 2.0 pulls partial tracks off and uses them before the revolution
is complete!

>> The routines also make use of the DSKSYNC capability
>>of the drives, which the OS routines don't (under 2.0 they probably
>>do).
>
>	Yup.  Not as big a win as you'd think (I thought it would be a big
>win, but floppy rotation time swamps almost anything).

Again, I time things out so that immediately after stepping and letting the
head settle, the start of the next track is right there.  Can't get any
faster.
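
The timing idea can be put as arithmetic. Assuming a drive that turns once
roughly every 200 ms, the next track's data has to start a fraction of a
revolution later than the previous track's, equal to the step-plus-settle
delay divided by the revolution time. A small C sketch follows; the function
name and the sample numbers in the note below it are hypothetical and are
not taken from the drivers discussed here.

```c
/* Offset, in raw bytes, at which the next track should start so that it
   arrives under the head just as the step and settle delay ends.
   The result is rounded up so the track start never arrives early. */
unsigned long track_skew_bytes(unsigned long raw_track_bytes,
                               unsigned long delay_us,
                               unsigned long rev_us)
{
    return (raw_track_bytes * (delay_us % rev_us) + rev_us - 1) / rev_us;
}
```

With a raw track of 12500 bytes and an 18 ms delay (both made-up round
numbers), track_skew_bytes(12500, 18000, 200000) gives an offset of 1125
bytes.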

>
>>  The routines use a CIA timer to get perfect timing, no matter
>>what processor.  Try popping the disk out with the disk light on (it
>>works).  Try ejecting the disk in the middle of a load and put it
>>into a different drive (it works).  Try that with the OS and watch
>>your disk go bad in 1 second, thanks to the disk validator.
>
>	I guarantee that if you pop a disk while it's writing, it WILL go bad.
>Even with your code.  If it's reading, it may forcefully ask for it back, but
>it won't go bad.
>

True.  Once the write head is turned on, it will write all over whatever
tracks it is over as the disk is ejected.

>>And what if you
>>allow multitasking and some CHIP RAM pig program (like DPaint) is already
>>running?  GURU.  And you need to test with 4 floppy drives under the OS,
>>just to make sure the OS hasn't taken more memory than you can allow.
>
>	You should learn to check allocation returns.  It's not hard.
>There's even a tool for selectively denying memory allocations to stress-
>test your program that we distribute (written by Bill Hawes to help test
>2.0).
>

There ARE cases where the allocation routines don't return (at least under
1.3).  In too many cases, the OS throws up the ol' AG_NoMemory guru alert,
instead of waiting for some other process to free some memory and retrying.
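
The "check allocation returns" advice, together with the idea of a tool that
denies allocations on purpose, can be sketched in C. This uses the standard
malloc() rather than Exec's AllocMem() so that it runs anywhere, and every
name in it (checked_alloc, alloc_fail_on, load_level) is hypothetical. It is
in the spirit of the stress-test tool mentioned above, not a model of it.

```c
#include <stdlib.h>

static unsigned long alloc_calls;   /* allocations attempted so far */
static unsigned long fail_on_call;  /* 0 = never fail; N = fail the Nth call */

/* Hypothetical test hook: make the Nth call to checked_alloc() fail. */
void alloc_fail_on(unsigned long n)
{
    alloc_calls = 0;
    fail_on_call = n;
}

/* malloc wrapper that can be told to refuse a request. */
void *checked_alloc(size_t size)
{
    alloc_calls++;
    if (fail_on_call != 0 && alloc_calls == fail_on_call)
        return NULL;
    return malloc(size);
}

/* Allocate two buffers; on any failure release what was obtained
   and report the error instead of carrying on with a NULL pointer. */
int load_level(void **gfx, void **snd)
{
    *gfx = checked_alloc(4096);
    if (*gfx == NULL)
        return -1;
    *snd = checked_alloc(2048);
    if (*snd == NULL) {
        free(*gfx);
        *gfx = NULL;
        return -1;
    }
    return 0;
}
```

Running the program once for each possible failure point shows whether every
allocation path backs out cleanly. It does not help with the case described
above where the OS itself alerts instead of returning.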

>>Once you start
>>using the floppy disk hardware directly, for example, you must put
>>the CIAs back into a state that the OS wants them in.  What state is that?
>>The ROM Kernel manuals LIE.  Have fun finding out what page they lie on,
>>because there is NO index.
>
>	Sure, the 1.1 RKMs had the CIA allocations backwards.  The 1.3 RKMs
>(which have been out a long time now) had the correct information (and
>indexes).  We told people this.  This caused our worst 2.0 compatibility
>problems, though we solved almost all of them (by dint of truly tricky
>programming...)
>
>>>On the other hand, games like Sim-City and Lemmings have no real use
>>>for that kind of environment.  I agree that the game comes first,
>>>*BUT* if you don't need total control, *DON'T* take it.
>
>>I agree with this 100%.  If you don't need to take over the machine,
>>don't.  If you want to push the machine to its limits, there is NO
>>other way.
>
>	True.
>
>>  Let the game come first.  If you know you can do the
>>game in a small amount of RAM and that performance is not an issue,
>>go ahead and use the OS.  In my approach, if I have RAM left over,
>>I use it for more sounds or instrument samples to make the music
>>better, or to cache more data from the floppy drives.
>
>	You don't let the _game_ come first, you put your _implementation_
>first.  There is a difference.  As for your approach, you may find your
>game didn't need all these tricks, and had ram left over when you're
>done.  But since you programmed yourself into a corner you can't go back
>and cooperate with the system, so you just look for ways to use up the
>ram in a more-or-less-useful manner.
>

Using the OS is just as easy a way to paint yourself into a corner.  When
I did Budokan, I could have easily made it run under the OS in a few
days.  I specifically chose not to and wouldn't want to do it any other
way.  It had nothing to do with laziness.  I deliberately chose to NOT
use the OS and would do it exactly the same way if I had to do it over.

>>Unfortunately, the sales life of a game is about 3 months.  Royalties
>>don't keep trickling in.
>
>	Good _games_, ones that have a depth beyond flashy graphics, and
>have replayability, do continue to sell (though they do best when first
>released, like most authored products).  Sure, they do eventually trend 
>towards 0, but by no means do they walk off a cliff for a good game
>(or even a well-done flashy game).
>

Wrong.  If a dealer has the luck to sell out every copy of a game that
he has after 3 months, he won't order more.  He'd rather use the shelf
space for a new game.

>>The OS does have bugs, however.  I spent weeks finding them for other
>>people at EA.  Ever hear about the trackdisk bug?  It seems that if
>>you have an external floppy drive and have no disk in it, and do intensive
>>disk access to the internal drive, it gurus after a random (long) amount
>>of time.
>
>	Actually it has nothing to do with internal versus external.  This
>was fixed in one of the 1.3 releases in SetPatch.  Note: I don't ever 
>remember seeing a bug report from EA about this (or in fact about almost
>anything, though some of the people who sell things through EA do report bugs
>well, and this has changed somewhat for the better in the last few years).
>Perhaps there was one, it was a while ago, but I don't remember it.  Developer
>support is a 2-way street.
>

SetPatch didn't fix it.  There is a program that comes with CrossDos called
TDPatch that might fix it though.  We WERE in touch with CATS on this one.
At least you ADMIT there is such a bug.  Just one of too many gotchas.  Gladly,
2.0 will someday be available to EVERYONE and should have far fewer gotchas.

>>Did you know that LoadRGB4() takes a full 60th of a second on a 68000?
>
>	No, it merely doesn't take effect until the next vblank (it 
>modifies the copperlist).
>

No, it doesn't return until next frame on a 68000.  It is faster on
an 030.  I specifically ran into this problem with a library of routines
that I wrote for someone else.  I made a loop that looks like this (to test
out whether LoadRGB4 was the culprit):

Loop:	bsr WaitTOF
	bsr LoadRGB4
	bra Loop

It takes 2 60ths of a second per loop.

>>How long does MrgCop() take (pick random number).  How long does 
>>RethinkDisplay() take (seconds)?  How long does BltMaskBitMapRastPort() 
>>take? 
>
>	(1) those are not called all the time, (2) you're exaggerating
>by a lot.  BMBMRP() is not fast because it operates on arbitrary rectangles,
>and some sets require using the A-channel as a mask.  So if you do know
>the alignments are ok, OwnBlitter() and program it directly.

This is not supposed to be a good programming practice, because future
versions of the OS and hardware might be different, no?  My point is
that if the OS were intended to make games, there would be routines for
doing these things fast.  The OS is a FINE general purpose OS, and despite
how ridiculously slow BMBMRP() is, it still blows the doors off of
CopyMaskBits() on the Mac.  I should also point out that BMBMRP() is THE
most important subroutine in the whole library to support a game.

>
>-- 
>Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
>{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.commodore.com  BIX: rjesup  
>Thus spake the Master Ninjei: "To program a million-line operating system
>is easy, to change a man's temperament is more difficult."
>(From "The Zen of Programming")  ;-)

--
********************************************************
* Appendix A of the Amiga Hardware Manual tells you    *
* everything you need to know to take full advantage   *
* of the power of the Amiga.  And it is only 10 pages! *
********************************************************

mykes@amiga0.SF-Bay.ORG (Mike Schwartz) (04/01/91)

In article <1991Mar31.144257.28633@zorch.SF-Bay.ORG> xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) writes:
>In article <2149@pdxgate.UUCP> bairds@eecs.cs.pdx.edu (Shawn L. Baird) writes:
>mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:
>
>In fact, I took his example game, stuck it in Rad:, and ran it from the
>command line. It is a pretty impressive example of the "fly a spaceship
>down a corridor and shoot at gobs of nasties who are busily shooting
>back, grab the various fuel and other pods before they get by, etc."
>variety. It is highlighted by amazing graphics, a scrolling starfield
>below the platform over which you fly your ship, that is three layers
>deep and gives the perspective of depth as it moves by, sparkling
>graphic imagery, thanks to an artist friend of Mike's, things just
>_flying_ around.  It most reminded me of the old Apple ][+ game "AE"
>in terms of the number and variety of nasties in sight.
>
>No sound, but I only uploaded 49K of program! It used the usual
>misdirection to achieve the illusion of higher performance from my stock
>processor A2000; probably two thirds of the screen area was a static
>frame and status display; all the dynamic stuff was being done in the
>other 1/3 of the screen. That translates into lots less pixels to move
>about. Everything but the stars seemed to be built out of the 16x16
>tiles he described in a recent rec.games.programmer article. The tiles
>that looked like open fretwork really were, you could watch the stars
>slide by under them (more misdirection, but that's what I saw!)
>

Ahem, the program has 5 sound effects totalling exactly 24636 bytes
of that 49K.  Your sound must have been turned down.

>Eventually I managed, after losing a couple of dozen games in a row (I'm
>47, fuzzy eyed, and have a gimp arm, I'm never going to be an arcade
>champ again) to evoke a bug (it _was_ a beta, after all), one of the
>sprites smeared itself over the screen two centimeters wide and screen
>height. I followed the directions for exiting the program, (touch a
>mouse button), and despite the bug, I was right back in the workbench
>looking at the shell window from which I had evoked the game, with
>normal operations, no problems.
>

Sorry, I know there is a wild store somewhere, but I used NO debugger
to do the thing!  I'll find it.

>Just as he described, his game nuked multitasking while running, but
>gave it back when done, good as new.
>

Didn't use a single OS call in the program.

>The most impressive thing was that this is a game Mike cobbled together
>from standard parts, some artwork by a friend, and his usual code, but
>slewed clear out of its normal environment to coexist with the OS, it
>fit in 50K of code, he got it running in three days, and it's
>_addictive_; clean up the bug and he could be selling it in the stores.
>

Just wanted to show I could do it, but it doesn't mean I would sell
a game done that way :( (frown, not smiley)

Also, you overrate it.  If I spent another 3+ months on it, expanded
it to fill 512K, added music, a title screen, an attract mode, more
levels, more enemies, more graphics... Maybe it would be a product.
The best I'd ever hope for this game is PD.

>I don't much care for Mike's nuking the OS, and coding primarily at the
>assembler level, but lose the arguments that his development therefore
>_necessarily_ has to take a long time or that the game's playability
>_must_ get lost when he's fighting assembler. He's got a great toolbox
>of tested code, and it is a productivity engine, even in assembler.
>

Remember, I didn't use a debugger even...  It also helped that the
artwork was given to me complete.

>I may disagree with him a _lot_ (we just started talking again at all
>after long silence), but he's damned good at what he does, it supports
>him in the good life, and he has a lot to teach about programming the
>way he likes to do it, and he deserves to be heard by those wanting to
>get a start programming the bare metal to make the big bucks.
>

Just did it to prove a point.  I know how to do it, know it can be done,
but still would choose not to make products this way.  I just want the
*alternate* point of view to be heard.

>Kent, the man from xanth.
><xanthian@Zorch.SF-Bay.ORG> <xanthian@well.sf.ca.us>
>--
>"Now if he'd just let me multitask from the HD!" -- Mr. Incorrigible  ;-)

Give 'em an inch and they want a mile haha.


rjc@geech.gnu.ai.mit.edu (Ray Cromwell) (04/01/91)

In article <mykes.0926@amiga0.SF-Bay.ORG> mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:
>In article <1991Mar31.003933.1483@mintaka.lcs.mit.edu> rjc@geech.gnu.ai.mit.edu (Ray Cromwell) writes:
>If you want to design your own games, go ahead.  But don't tell me and
>everyone else how to do it.  If everyone designs games exactly the way
>you do, it would get real boring real fast.  The fact that individuals
>design games, instead of a design by community approach, gives us variety
>in game play, look and feel, and style.

  I'm not telling you how to design games, I'm merely giving you
an alternative viewpoint. You seem to think that flashy graphics are what
makes games good. Graphics and Sound are nice, but they aren't everything.
Psygnosis games are widely known to be the most UNPLAYABLE games on the
Amiga. If it weren't for the trainer modes, most people would have given
up playing them by now. 

>>  That's what compiled C code looks like. C is used for large projects
>>that an assembly programmer could never accomplish in a reasonable
>>amount of time. I programmed in assembly for 7 years, and I can do things
>>in C++/LISP that would take insurmountable time in assembly.
>>I do wish C= would optimize certain parts of the OS with assembly.
>>(Optimize Layers Library clipping code in assembly, and Exec's dispatcher,
>>and the vertical blank interrupt handler. Intuition should stay in C, but
>>stuff like WritePixel should be in assembly.)
>>
>
>A couple of points.  Exec is already in assembler language.  You might
>optimize just a few bytes/cycles out of it.  It is the foundation of the
>OS.  It was done RIGHT (assembler).  Layers is really not that slow when
>you consider what it has to do (and compare it with other OS performance).
>ReadPixel and WritePixel are still going to be slow because it has to
>do rastport clipping (so you don't plot pixels outside of a window).
>Finally, development time is also not as much of an issue as is quality.

  If development time isn't a problem, why don't you spend lots of time
making a game that runs under the OS. Don't say it can't be done, Heart
of the Dragon has action just as smooth as Budokan with more colors!

>When you call printf() you are relying on someone else's coding ability.
>When you call an OS routine, you are relying on someone else's coding ability,
>too.  You not only have your own bugs to deal with but those in the (link)
>library and those in the ROM.  When you run native under the OS, you are also
>relying on a wide variety of PD software hacks (like POPCLI, etc.) that people
>use to also not have bugs.  Remember, one wild store in any of this software
>crashes everything.

  Somewhere along the line you're going to have to rely on other people's code.
When a physicist/chemist studies science, he is relying on research and facts
collected by people before him. There isn't such a thing as an 'ultimate'
programmer. Most of the optimizations you think up have been thought up
by many other people as well. I bet there isn't a single assembler
trick you can do that hasn't been thought up before. Commodore did not hire
idiots. The Commodore crew is just as good at programming as you, they just
do a different kind of programming. If you insist on re-inventing the wheel
every time you code because you don't believe other programmers' routines
are as "perfect" as your specs require, you deserve to be overworked.

>As far as 'C' goes, I have stepped through enough disassembled 'C' code to
>see enough.  How much RAM is wasted on LINK, UNLK instructions?  How many
>cycles are wasted by pushing arguments on the stack to call a subroutine?
>How many cycles are wasted by calling a "glue" routine?  How many cycles
>and bytes are wasted by fixing up the stack after each and every subroutine
>returns?  How much stack do you need to allow for all the dynamic allocation
>of local variables?  512K is not a lot of RAM to go and waste memory all the
>time.  A 7.14 MHz 68000 isn't fast enough to waste all the extra cycles, if
>you are striving for performance.

  Obviously you have no idea of how advanced today's optimizing compilers
are.  The code you stepped through must have been produced by some
1970's MetaComco compiler or something. But FYI, most of today's compilers
can pass arguments in registers, allocate memory without stack, eliminate
the frame registers, and even do all the non-obvious tricks of sign extension,
etc. Check out the code I've included at the end compiled by GCC.


>In practice, assembler is quite good at both data structures and event
>driven programming.  If you haven't tried it, check it out and then
>comment again.

  I have tried it. Without a great macro assembler and some nice macros,
it's still a pain. And assembler can't even come close to object-oriented
programming features such as data abstraction, inheritance, operator/function
overloading and streams. Assembler is just overkill for most things. Assembler
has its place, but if you program in assembler 100% of the time, you're
a masochist. (Even on the C64 I routinely used compiled BASIC to implement
some tools like sine table generators etc.)

>
>--
>********************************************************
>* Appendix A of the Amiga Hardware Manual tells you    *
>* everything you need to know to take full advantage   *
>* of the power of the Amiga.  And it is only 10 pages! *
>********************************************************

The following was compiled with GCC -O -fstrength-reduce -fomit-frame-pointer
I don't have SAS C on the Amiga, but I'm sure it produces similar results.

/* test.c */

char buf[20];
main()
{
  char *d=(char *)&buf;
  const char *s="This is a test\n";
  while(*s) { *d++=*s++; }
}

/* Test.s produced by gcc */

#NO_APP
gcc_compiled.:
.text
LC0:
	.ascii "This is a test\12\0"
	.even
.globl _main
_main:
	lea _buf,a1
	lea LC0,a0
	tstb a0@
	jeq L5
L4:
	moveb a0@+,a1@+
	tstb a0@
	jne L4
L5:
	rts
.comm _buf,20

  I can easily provide more real-life examples of produced code, but the
fact of the matter is, just like chess computers beating the majority of
humans, compilers are reaching a state where they can outperform the
majority of humans in optimization. I have no doubt that in the
next 5 years they are going to get even better. What's more, the
new CPUs being produced today have instruction sets and architectures
that are almost impossible to code optimally for without considering
the timing of each instruction combined with other instructions.

  Also, it's not the language but the algorithm that is responsible for
how fast a routine runs. Compare a C coded Boyer-Moore string search
with an Assembly coded brute-force byte by byte search, the C code would
probably win without optimizations turned on.
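
To make the algorithm point concrete, here are the two approaches side by
side in C. The second function is the Horspool simplification of
Boyer-Moore (bad-character skip table only), not the full algorithm, and
the function names are made up for this sketch. No timing claim is made
here; the point is only that the skip table lets the search jump ahead
several bytes per mismatch where the brute-force loop advances one.

```c
#include <stddef.h>
#include <string.h>
#include <limits.h>

/* Brute force: try every alignment, compare byte by byte. */
const char *search_naive(const char *text, size_t n, const char *pat, size_t m)
{
    if (m == 0) return text;
    if (m > n) return NULL;
    for (size_t i = 0; i + m <= n; i++)
        if (memcmp(text + i, pat, m) == 0)
            return text + i;
    return NULL;
}

/* Horspool: after each failed alignment, skip ahead by a distance looked
   up from the text byte aligned with the last pattern position. */
const char *search_horspool(const char *text, size_t n, const char *pat, size_t m)
{
    size_t skip[UCHAR_MAX + 1];
    if (m == 0) return text;
    if (m > n) return NULL;
    for (size_t c = 0; c <= UCHAR_MAX; c++)
        skip[c] = m;                       /* byte not in pattern: skip it all */
    for (size_t i = 0; i + 1 < m; i++)
        skip[(unsigned char)pat[i]] = m - 1 - i;
    for (size_t i = 0; i + m <= n; i += skip[(unsigned char)text[i + m - 1]])
        if (memcmp(text + i, pat, m) == 0)
            return text + i;
    return NULL;
}
```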



--
/~\_______________________________________________________________________/~\
|n|   rjc@albert.ai.mit.edu   Amiga, the computer for the creative mind.  |n|
|~|                                .-. .-.                                |~|
|_|________________________________| |_| |________________________________|_|

jesup@cbmvax.commodore.com (Randell Jesup) (04/01/91)

In article <mykes.0962@amiga0.SF-Bay.ORG> mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:
>See page VII-4 and VII-5 of AmigaMail, which shows the source code to
>the "official" bootsector program.  If the program OPENs dos.library, it
>returns a ZERO in D0, if it fails, it returns -1.  Not that I am arguing,
>just telling you what your own documentation says.

	Except the trick is that if dos.library is opened successfully,
the call never returns.  The docs don't actually say it returns (though I
can see how you'd assume that).

>>	Note that even 2.0 trackdisk is only about 7K long.
>
>I provide additional features that trackdisk doesn't, like the ability
>to search for a disk in any drive.  This function is NOT performed by
>trackdisk, but by DOS.

	That wouldn't add more than a hundred or two bytes to trackdisk.
Perhaps there are other features you have, though, that trackdisk doesn't,
or things that trackdisk relies on the system to provide.

>I actually use 12K or 14K.  I decode the MFM a sector at a time when
>needed.  Are you sure that using the blitter in NASTY mode isn't
>faster than the CPU?  I've been asking for ROM routines to do what 90%
>of games do (without the OS).  And one last point (I do know the OS :)
>the OS (1.3) uses QBlit to encode/decode MFM.  Not the fastest way, by
>any means.

	Nasty should be of little or no benefit (unless you busy-wait
for the blit to finish).  You didn't test it?  As for 1.3, yes it uses
QBlit, but that allows it to sleep until the blit is finished (more cycles
for other tasks or the blitter).  2.0 uses the CPU, which is faster than
the blitter for decoding when the checksum is needed.  It still uses the
blitter for writes, and for re-reads of a sector to chip ram (rare).  All
initial reads of a sector and read to fastmem are done by the CPU.

>>	Unless you're pulling partial tracks off and using them before
>>the revolution is complete, or unless you're going to write it out again,
>>this makes no difference - and for writing all it saves you is a block-move
>>to eliminate the gap.
>
>It DOES save a block move, which is not a particularly fast thing to do.
>I hope that 2.0 pulls partial tracks off and uses them before the revolution
>is complete!

	First, the block move is not needed on read (though 1.3 did it). It's
only needed when preparing a track for writing, and it's a very small amount 
compared to the 200+ms you'll wait while the disk revolves.  This is why
a massive reduction in overhead produced little improvement in disk speed
(some, just not immense amounts).  The big win was direct transfer to fast-
ram (for machines with fast-ram), since this allows fs buffers to be in
fastmem.

	As for using sectors before a read completes, that's fine if you've
taken over the machine and can have the processor watch the dma happen in a
busy-loop, or have it pull the bytes off the disk itself (with interrupts
disabled).  That's not acceptable under multitasking.

>There ARE cases where the allocation routines don't return (at least under
>1.3).  In too many cases, the OS throws up the ol' AG_NoMemory guru alert,
>instead of waiting for some other process to free some memory and retrying.

	Yes, 1.3 is not very stable under ~4K free.  2.0 is annoyingly
stable down to almost nothing (thanks to our developers and especially Bill
Hawes, who wrote memoration and has an evil mind...)

>>	Good _games_, ones that have a depth beyond flashy graphics, and
>>have replayability, do continue to sell (though they do best when first
>>released, like most authored products).  Sure, they do eventually trend 
>>towards 0, but by no means do they walk off a cliff for a good game
>>(or even a well-done flashy game).
>
>Wrong.  If a dealer has the luck to sell out every copy of a game that
>he has after 3 months, he won't order more.  He'd rather use the shelf
>space for a new game.

	I guess those brand-new copies of Dungeon Master on the shelf at
B. Daltons are an illusion, then.  Ditto for Shadow of the Beast, Falcon,
F/A 18, Red Storm Rising, etc.  I believe that in Europe it may be like
that (I wouldn't know) given higher piracy, but it doesn't appear so here
in the US for _good_ games.  Average or below games: sure.  Lemmings will be
on the shelves for a LONG time (and the Lemmings sequels/data disks I'm sure
will be coming).

>SetPatch didn't fix it.  There is a program that comes with CrossDos called
>TDPatch that might fix it though.  We WERE in touch with CATS on this one.
>At least you ADMIT there is such a bug.  Just one of too many gotchas.  Gladly,
>2.0 will someday be available to EVERYONE and should have far fewer gotchas.

	Yes, I know Len Poma.  The other thing tdpatch fixed was the RAWREAD/
RAWWRITE bug with single drives, where it would reject your request most
often (cmp.l $8000,xx instead of cmp.l #$8000,xx).  The random hang under
heavy disk activity if there were empty drives WAS fixed by setpatch.

-- 
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.commodore.com  BIX: rjesup  
Thus spake the Master Ninjei: "To program a million-line operating system
is easy, to change a man's temperament is more difficult."
(From "The Zen of Programming")  ;-)

mykes@amiga0.SF-Bay.ORG (Mike Schwartz) (04/01/91)

In article <2162@pdxgate.UUCP> bairds@eecs.cs.pdx.edu (Shawn L. Baird) writes:
>DXB132@psuvm.psu.edu writes:
>
>>In article <2149@pdxgate.UUCP>, bairds@eecs.cs.pdx.edu (Shawn L. Baird) says:
>
>>That example doesn't help your point. CED uses blitting which is many many
>>times slower than using the copper to aid scrolling. But it has no choice
>>in order to remain Intuition compatible.
>
>The point was fast blitting. It is rather unreasonable for an editor such
>as CED to waste its memory holding a bitmap version of the text. Also, CED
>doesn't scroll an entire viewport at once. Otherwise the scrollbar at the
>left would be a bad placement.
>
>>-- Dan Babcock
>
>---
> Shawn L. Baird, bairds@eecs.ee.pdx.edu, Wraith on DikuMUD
> The above message is not licensed by AT&T, or at least, not yet.

Would you believe he's using PrintIText()?

Sorry, couldn't resist :)

mykes@amiga0.SF-Bay.ORG (Mike Schwartz) (04/01/91)

In article <1991Apr1.020748.26863@mintaka.lcs.mit.edu> rjc@geech.gnu.ai.mit.edu (Ray Cromwell) writes:

>  Obviously you have no idea of how advanced today's optimizing compilers
>are.  The code you stepped through must have been produced by some
>1970's MetaComco compiler or something. But FYI, most of today's compilers
>can pass arguments in registers, allocate memory without stack, eliminate
>the frame registers, and even do all the non-obvious tricks of sign extension,
>etc. Check out the code I've included at the end compiled by GCC.
>
>The following was compiled with GCC -O -fstrength-reduce -fomit-frame-pointer
>I don't have SAS C on the Amiga, but I'm sure it produces similar results.
>
>/* test.c */
>
>char buf[20];
>main()
>{
>  char *d=(char *)&buf;
>  const char *s="This is a test\n";
>  while(*s) { *d++=*s++; }
>}
>
>/* Test.s produced by gcc */
>
>#NO_APP
>gcc_compiled.:
>.text
>LC0:
>	.ascii "This is a test\12\0"
>	.even
>.globl _main
>_main:
>	lea _buf,a1
>	lea LC0,a0
>	tstb a0@
>	jeq L5
>L4:
>	moveb a0@+,a1@+
>	tstb a0@
>	jne L4
>L5:
>	rts
>.comm _buf,20
>

/* Test.s produced by gcc */

#NO_APP
gcc_compiled.:
.text
LC0:
	.ascii "This is a test\12\0"
	.even
.globl _main
_main:					; (cycles)
	lea _buf,a1			; 8
	lea LC0,a0			; 8
	tstb a0@			; 8
	jeq L5				; 8
L4:
	moveb a0@+,a1@+			; 14*12
	tstb a0@			; 14*8
	jne L4				; 13*10+1*8
L5:
	rts				; 16
.comm _buf,20


;/* test.s produced by me */		; (cycles)
	lea	text(pc),a0		; 8
	lea	buf(pc),a1		; 8
.loop	move.b	(a0)+,(a1)+		; 14*12
	bne.s	.loop			; 13*10+1*8
	rts				; 16
text	dc.b	'This is a test',10,0
buf	ds.b	20

NO COMPARISON DUDE!  GCC makes 3 totally wasted instructions, and one of
them is inside your loop.  Try an example with nested loops and your
wasted clock cycles become a geometric progression.  Multiply the kind
of inefficiencies that GCC demonstrates here by EVERY loop and EVERY
function you have and your program is slower and bigger than it needs
to be.  To be specific, the GCC routine is 6 words longer and by the
time it is done executing, it will take 128 more clock cycles than
mine will (on a 68000).  Your routine takes 466 total clocks to execute,
mine takes 338.  I'm just your average 68000 assembler language programmer,
but I saved 28% CPU time.  You might also note that your 'C' source is
7 lines of code and so is my assembler code.

Just think, the OS ROM routines are written in 'C' and compiled with
a lesser compiler than gcc (5 years ago).


>  Also, it's not the language but the algorithm that is responsible for
>how fast a routine runs. Compare a C coded Boyer-Moore string search
>with an Assembly coded brute-force byte by byte search, the C code would
>probably win without optimizations turned on.
>

You should compare apples to apples.  Compare your Boyer-Moore string search
in assembler language with the one in 'C' on the Amiga.  How about comparing
an assembler coded Boyer-Moore string search against a 'C' coded brute-force
byte by byte search?  You'd complain it's not fair either.

>
>
>--
>/~\_______________________________________________________________________/~\
>|n|   rjc@albert.ai.mit.edu   Amiga, the computer for the creative mind.  |n|
>|~|                                .-. .-.                                |~|
>|_|________________________________| |_| |________________________________|_|


hbrinch@icoast.UUCP (Henrik Brinch) (04/01/91)

In article <20213@cbmvax.commodore.com>, Randell Jesup writes:

> >How long does MrgCop() take (pick random number).  How long does 
> >RethinkDisplay() take (seconds)?  How long does BltMaskBitMapRastPort() 
> >take? 
> 
> 	(1) those are not called all the time, (2) you're exaggerating
> by a lot.  BMBMRP() is not fast because it operates on arbitrary rectangles,
> and some sets require using the A-channel as a mask.  So if you do know
> the alignments are ok, OwnBlitter() and program it directly.
> 

About #2, he's _NOT_ exaggerating: RethinkDisplay() is WAY TOO SLOW(!)
I've written a little utility on my A3000 (25MHz) and it worked perfectly,
then trying it on a standard A2000 I got a shock.  My task was to change
some colors in the copperlist (on an Intuition screen); I've tried, but the
only way to do this is to remake the whole copperlist and refresh
the display (which is too slow).  Indeed a lot of OS routines are fast
(and it would be stupid wasting time writing them yourself) but others are
simply too slow to be used (even in utilities :(!)

> Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
> {uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.commodore.com  BIX: rjesup  

Best Regards,

InfoCoast       /\  /\_  Henrik Brinch \ cbmehq!cbmdeo!icoast!hbrinch
Technologies /\/  \/   \ Kloevervej 7   \ FidoNet 2:230/112.3
____________/  \  /     \ 2800 Lyngby    \ Voice/Fax tel.# +45 42 87 67 23
           /    \/       \  Denmark       \ "C is SILVER - But ASM is GOLD"

cs326ag@ux1.cso.uiuc.edu (Loren J. Rittle) (04/01/91)

mykes@sega0.SF-Bay.ORG (Mike Schwartz) writes:
> system.  BOBs are slower than what I use.  Intuition takes 30% of the CPU
> time when you just move the mouse around (check out a CPU performance    
> monitor on a 68000 machine while moving the mouse).  Layers are totally
This is false.  I would suggest that whatever CPU monitor program
you are using is giving you false readings.  Try starting a CPU
intensive program on a dead system (no background tasks, etc).
Time it once without moving the mouse at all and once
shaking like crazy...  You will see that your 30% number is WAY
out of line...

Have a good day,
Loren J. Rittle

Boycott programs that don't support the great OS you paid for!
Boycott programs that don't use the hardware you own.
***********************************************************
* The use of the AmigaOS separates the men from the boys. *
***********************************************************
-- 
``NewTek stated that the Toaster  *would*  *not*  be made to directly support
  the Mac, at this point Sculley stormed out of the booth...'' --- A scene at
  the recent MacExpo.  Gee, you wouldn't think that an Apple Exec would be so
  worried about one little Amiga device... Loren J. Rittle  l-rittle@uiuc.edu

DXB132@psuvm.psu.edu (04/01/91)

In article <20222@cbmvax.commodore.com>, jesup@cbmvax.commodore.com (Randell
Jesup) says:

>        Except the trick is that if dos.library is opened successfully,
>the call never returns.  The docs don't actually say it returns (though I
>can see how you'd assume that).

You're thinking of the "bootstrap" code involved in autobooting. The
standard bootblock code certainly does return. If it didn't, various
resources would not be deallocated by strap.

-- Dan Babcock

gay_d@disuns2.epfl.ch (David Gay) (04/01/91)

In article <mykes.1028@amiga0.SF-Bay.ORG> mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:
<In article <1991Apr1.020748.26863@mintaka.lcs.mit.edu> rjc@geech.gnu.ai.mit.edu (Ray Cromwell) writes:
<
<>  Obviously you have no idea of how advanced today's optimizing compilers
<>are.  The code you stepped through must have been produced by some
<>1970's MetaComco compiler or something. But FYI, most of today's compilers
<>can pass arguments in registers, allocate memory without stack, eliminate
<>the frame registers, and even do all the non-obvious tricks of sign extension,
<>etc. Check out the code I've included at the end compiled by GCC.
<>
<>The following was compiled with GCC -O -fstrength-reduce -fomit-frame-pointer
<>I don't have SAS C on the Amiga, but I'm sure it produces similar results.
<>
<>/* test.c */
<>
<>char buf[20];
<>main()
<>{
<>  char *d=(char *)&buf;
<>  const char *s="This is a test\n";
<>  while(*s) { *d++=*s++; }
<>}

[Code deleted to satisfy inews, sorry]

<NO COMPARISON DUDE!  GCC makes 3 totally wasted instructions, and one of
<them is inside your loop. 

0 wasted instructions, see below.

[Stupid statistics deleted]

I wonder what makes you think that move.b (a0)+,(a1)+ is equivalent to
move.b (a0)+,(a1)+ / tst.b (a0) ? Your routine *doesn't* do the same thing
as the C code, you're copying an extra NUL character. 

So we have an optimised C version and an optimised buggy assembler version. 
Which would you choose ?

<Just think, the OS ROM routines are written in 'C' and compiled with
<a lesser compiler than gcc (5 years ago).

Just think, if the OS had been in assembler maybe it would have been released
next year ...

David Gay
gay_d@disuns2.epfl.ch

holgerl@amiux.UUCP (Holger Lubitz) (04/02/91)

In article <mykes.0926@amiga0.SF-Bay.ORG> mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:
>As far as 'C' goes, I have stepped through enough disassembled 'C' code to
>see enough.  How much RAM is wasted on LINK, UNLK instructions?  How many
>cycles are wasted by pushing arguments on the stack to call a subroutine?
>How many cycles are wasted by calling a "glue" routine?  How many cycles
>and bytes are wasted by fixing up the stack after each and every subroutine
>returns?  How much stack do you need to allow for all the dynamic allocation
>of local variables?

What kind of disassembled 'C' code did you step through ? I suppose
it was Greenhills C or some other ancient compiler.
Optimized code from recent C compilers like SAS 5.10a doesn't do this, if you
don't want it to. You can pass parameters through registers, eliminating the
need to pass them via stack and also making glue routines obsolete.
And dynamic allocation of local variables is nice to have as an aid in
structured programming - however, it is not needed. You can program in C
using only global variables and constants - just like you probably do when
programming in assembler.

Are you always checking how to do a multiplication the fastest way ?
Any optimizing C compiler does (well, at least if you tell it to optimize for
time instead of optimizing for space.) It won't use MULU #17,D0, instead it
would copy D0 to D1, shift D0 by 4 bits, and add D1 to D0. Do you think the
average assembler programmer does ?

And by the way, Mike: Please try to strip down the quoted lines. It is a pain
to read dozens of quoted lines once more just to ensure not missing the one
or two lines where you comment on them.

Best regards,
Holger

--
Holger Lubitz            | holgerl@amiux.uucp
Kl. Drakenburger Str. 24 | holgerl@amiux.han.de
D-W-3070 Nienburg        | cbmvax.commodore.com!cbmehq!cbmger!amiux!holgerl

dillon@overload.Berkeley.CA.US (Matthew Dillon) (04/02/91)

In article <mykes.0914@amiga0.SF-Bay.ORG> mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:
>
>The game I described uses NO operating system calls and does not multitask.
>It also runs in <256K.  I wouldn't sell it as a product because it is
>puny by Amiga standards.  I did it to prove that I know how to do it.
>Again, I repeat that it DOES NOT multitask.

    The point is, Mike, that you can easily write a NON-MULTITASKING
    game without destroying the OS and requiring a reboot.  You
    don't HAVE to use any OS calls except those that allocate the
    appropriate hardware resources and during initialization to
    allocate memory and disable multitasking, and during exit to
    restore it all.

    It's simple, really it is!

					    -Matt

--

    Matthew Dillon	    dillon@Overload.Berkeley.CA.US
    891 Regal Rd.	    uunet.uu.net!overload!dillon
    Berkeley, Ca. 94708
    USA

limonce@pilot.njin.net (Tom Limoncelli +1 201 408 5389) (04/02/91)

In article <18ea378c.ARN11fb@jsmami.UUCP> jsmoller@jsmami.UUCP (Jesper Steen Moller) writes:

> In article <Mar.30.22.51.19.1991.17088@pilot.njin.net>, Tom Limoncelli +1 201 408 5389 writes:
> > As soon as your statistical sample (European kids) starts getting
> > '030s and HDs I'm sure that you'll produce software that I feel like
> > buying.  Too bad it takes so long for Europeans to come up to American
> > standards.  I guess we follow them in fashion and they follow us in...
> > oh never mind.
> 
> Yeah - unfortunately "European kids" (of which I might be considered one)
> have to pay up to twice as much for hardware as you lucky Americans. No
> wonder we're a bit slow in getting all this new stuff (I am a bit out
> of the ordinary, as I run an A3000).

I've gotten email on this, and now a post.  I guess I should publicly
post about it.

By "Too bad it takes so long for Europeans to come up to American
standards" I *did* mean including such factors as high taxes, lower
income, and 2x price tags on computer equipment.  I was expressing
sympathy and to a certain extent empathy.

Actually, the "lower income" is open to debate.  Denmark has zero
homeless people because of the social programs there.  I on the other
hand have stepped over many homeless people in the streets of New
York.  Of course, the huge number of people sleeping in the
streets is offset by the few rich people that are so rich it ruins the
averages.

*** Of course, this has nothing to do with Amigas and should be taken to a
*** different newsgroup.

The point is that the game designers will make games multitask or at
least return to the OS without rebooting when their market demands it.
I claim that the market will be right when Europeans get 030's, tons
of RAM, and HDs.  This can happen because they get higher incomes or
lower computer costs.

Tom

dillon@overload.Berkeley.CA.US (Matthew Dillon) (04/02/91)

In article <1991Apr1.020748.26863@mintaka.lcs.mit.edu> rjc@geech.gnu.ai.mit.edu (Ray Cromwell) writes:
>In article <mykes.0926@amiga0.SF-Bay.ORG> mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:
>>As far as 'C' goes, I have stepped through enough disassembled 'C' code to
>>see enough.  How much RAM is wasted on LINK, UNLK instructions?  How many
>>cycles are wasted by pushing arguments on the stack to call a subroutine?
>>How many cycles are wasted by calling a "glue" routine?  How many cycles
>>and bytes are wasted by fixing up the stack after each and every subroutine
>>returns?  How much stack do you need to allow for all the dynamic allocation
>>of local variables?  512K is not a lot of RAM to go and waste memory all the
>>time.  A 7.14 MHz 68000 isn't fast enough to waste all the extra cycles, if
>>you are striving for performance.
>
>  Obviously you have no idea of how advanced today's optimizing compilers
>are.  The code you stepped through must have been produced by some
>1970's MetaComco compiler or something. But FYI, most of today's compilers

    I agree, you haven't looked at the output from a good compiler... GCC
    is one of the best, though on an Amiga it runs the bejeezes slow
    compiling something.  You will ALWAYS be able to write assembly that
    goes faster, but compared to a good C programmer the result will not
    go *that* much faster, only a little faster.

    Not that I am advocating you write the game in C, clearly you
    simply do not want to accept any arguments in favor of C, but
    I'll try once to convince you.

>can pass arguments in registers, allocate memory without stack, eliminate
>the frame registers, and even do all the non-obvious tricks of sign extension,
>etc. Check out the code I've included at the end compiled by GCC.
>
>>In practice, assembler is quite good at both data structures and event
>>driven programming.  If you haven't tried it, check it out and then
>>comment again.
>
>  I have tried it, without a great Macro assembler and some nice Macros,
>it's still a pain. And assembler can't even come close to object oriented
>programming such as data abstraction, inheritance, operator/function overloading
>and streams. Assembler is just overkill for mostly everything. Assembler
>has its place, but if you program in assembler 100% of the time, you're
>a masochist. (Even on the C64 I routinely used compiled basic to implement

    It should be interesting to note that one of the BEST development paths
    for a game is to write the thing in C first... because you can write
    the game MUCH faster.. weeks instead of months, in fact, then convert
    critical routines to assembly.  The end result is something that will
    generally go as fast or FASTER than doing it all in assembly from stage
    one, because if you do it in C you can make MAJOR changes to the
    algorithms quickly, whereas you must basically hardwire everything in
    assembly, making it nearly impossible to change a major core algorithm
    without rewriting the whole thing.  In fact, you can make several MAJOR
    changes to the C code without coming close to the amount of time it
    takes to write the thing entirely in assembly.

    Nobody gets the algorithm right the first time... half way through
    writing the thing up you always figure out a way to do it better.

					    -Matt


dillon@overload.Berkeley.CA.US (Matthew Dillon) (04/02/91)

>In article <1991Apr1.020748.26863@mintaka.lcs.mit.edu> rjc@geech.gnu.ai.mit.edu (Ray Cromwell) writes:

>_main: 				; (cycles)
>	lea _buf,a1			; 8
>	lea LC0,a0			; 8
>	tstb a0@			; 8
>	jeq L5				; 8
>L4:
>	moveb a0@+,a1@+ 		; 14*12
>	tstb a0@			; 14*8
>	jne L4				; 13*10+1*8
>L5:
>	rts				; 16
>.comm _buf,20
>
>
>;/* test.s produced by me */		; (cycles)
>	lea	text(pc),a0             ; 8
>	lea	buf(pc),a1              ; 8
>.loop	move.b	(a0)+,(a1)+             ; 14*12
>	bne.s	.loop			; 13*10+1*8
>	rts				; 16
>text	dc.b	'This is a test',10,0
>buf	ds.b	20
>
>NO COMPARISON DUDE!  GCC makes 3 totally wasted instructions, and one of
>them is inside your loop.  Try an example with nested loops and your
>wasted clock cycles become a geometric progression.  Multiply the kind

    YOU ARE MISSING THE #@%$#!% POINT

    Of *COURSE* you can write faster assembly by hand... but your routine
    is ONLY 28% faster than GCC compiled code?

    Do you understand the point we are trying to make?	Perhaps it isn't
    clear.... the point is that in C you have a LOT MORE TIME to think
    about and redo the algorithms you originally chose -- using a better
    algorithm can yield an order of magnitude increase in performance,
    turning that 28% deficit into an 800% surplus, all because you can
    write the C code much faster and make *major* changes to it, like
    completely replacing core algorithms, much faster than you could do
    in assembly.  How much time do you spend on one piddling 50 line
    assembly routine to optimize it?  Quite a long time I would expect.

    Now, of *COURSE* a tight loop should be done in assembly... even C
    programmers will write really tight loops in assembly because that
    is where the compiler usually screws up in terms of outputting
    efficient code... where one instruction can make the difference.
    It's everything ELSE that can be written in C without losing
    performance.  You can count cycles all day but in the end it's the
    algorithm that counts, because you can always optimize tight items
    in assembly... maybe 5% of your program would be in assembly while
    the rest would be in C, EASILY have the same performance, and take
    much less time to write, debug, tweak, reorganize, and get out
    the door.

>Just think, the OS ROM routines are written in 'C' and compiled with
>a lesser compiler than gcc (5 years ago).

    The 1.3 OS was compiled with greenhills, I believe, which is a pretty
    good compiler.  Would you rather the OS not have come out at all?  Do
    you know how many YEARS it would take to write all that stuff in
    hand assembly?  Much less debug it and enhance it.

    Can you imagine doing arbitrary layers clipping entirely in assembly?
    What a waste of time!

>>  Also, it's not the language but the algorithm that is responsible for
>>how fast a routine runs. Compare a C coded Boyer-Moore string search
>>with an Assembly coded brute-force byte by byte search, the C code would
>>probably win without optimizations turned on.
>>
>
>You should compare apples to apples.  Compare your Boyer-Moore string search
>in assembler language with the one in 'C' on the Amiga.  How about comparing
>an assembler coded Boyer-Moore string search against a 'C' coded brute-force
>byte by byte search?  You'd complain it's not fair either.

    Again, you are missing the point...  he IS comparing apples to apples.
    Not perhaps with that specific example, but in the more general sense.

    You only have so much time to write your game.. what you can do in a
    month with C would take a year in assembly, so you have a lot more
    time to think about the algorithms you choose writing the complex parts
    in C than you have writing it all in assembly.. a LOT more time.

    The foregone conclusion is that you would have time to develop some
    really hot algorithms writing in C an order of magnitude better than
    the algorithm you never had time to change written entirely in
    assembly.

    28% is nothing compared to finding an algorithm that gives you an order
    of magnitude better performance, and the chance of finding such an
    algorithm is incredibly high when you have the time to think about it.

					-Matt


chopps@ro-chp.UUCP (Chris Hopps) (04/02/91)

>In article <20213@cbmvax.commodore.com> jesup@cbmvax.commodore.com (Randell Jesup) writes:
>>In article <mykes.0774@amiga0.SF-Bay.ORG> mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:

[...]

>>>Unfortunately, the sales life of a game is about 3 months.  Royalties
>>>don't keep trickling in.
>>
>>	Good _games_, ones that have a depth beyond flashy graphics, and
>>have replayability, do continue to sell (though they do best when first
[...]

>Wrong.  If a dealer has the luck to sell out every copy of a game that
>he has after 3 months, he won't order more.  He'd rather use the shelf
>space for a new game.

That's funny, because I worked for a dealer in Michigan, and we had plenty of
games on the shelf that were over three months old (not the packages
themselves, but the release date.) For example: Battle Chess (3 months old?),
F18 Interceptor, F16 Falcon, Dungeon Master, Test Drive II, Wayne G. Hockey,
TV Sports Football, Shadow of the Beast I or II, Sim City. I haven't worked
there in about 2 months, and this is all from memory (a couple days ago.) I
know how the owner ordered stuff and it wasn't with a calendar, but with the
weekly sales or the day after he sold out. ANY popular game that people
request is always on the shelf. So as Randell said, if it's good it lasts.
(BTW I know a guy who programmed Space Invaders for the 64 (for C=), and
he STILL gets royalty checks.)

Chris...

rjc@geech.gnu.ai.mit.edu (Ray Cromwell) (04/02/91)

In article <mykes.1028@amiga0.SF-Bay.ORG> mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:
>In article <1991Apr1.020748.26863@mintaka.lcs.mit.edu> rjc@geech.gnu.ai.mit.edu (Ray Cromwell) writes:
>
>>  Obviously you have no idea of how advanced today's optimizing compilers
>>are.  The code you stepped through must have been produced by some
>>1970's MetaComco compiler or something. But FYI, most of today's compilers
>>can pass arguments in registers, allocate memory without stack, eliminate
>>the frame registers, and even do all the non-obvious tricks of sign extension,
>>etc. Check out the code I've included at the end compiled by GCC.
>>
>>The following was compiled with GCC -O -fstrength-reduce -fomit-frame-pointer
>>I don't have SAS C on the Amiga, but I'm sure it produces similar results.
>>
>>/* test.c */
>>
>>char buf[20];
>>main()
>>{
>>  char *d=(char *)&buf;
>>  const char *s="This is a test\n";
>>  while(*s) { *d++=*s++; }
>>}
>>
>>/* Test.s produced by gcc */
>>
>>#NO_APP
>>gcc_compiled.:
>>.text
>>LC0:
>>	.ascii "This is a test\12\0"
>>	.even
>>.globl _main
>>_main:
>>	lea _buf,a1
>>	lea LC0,a0
>>	tstb a0@
>>	jeq L5
>>L4:
>>	moveb a0@+,a1@+
>>	tstb a0@
>>	jne L4
>>L5:
>>	rts
>>.comm _buf,20
>>
>
>/* Test.s produced by gcc */
>
>#NO_APP
>gcc_compiled.:
>.text
>LC0:
>	.ascii "This is a test\12\0"
>	.even
>.globl _main
>_main:					; (cycles)
>	lea _buf,a1			; 8
>	lea LC0,a0			; 8
>	tstb a0@			; 8
>	jeq L5				; 8
>L4:
>	moveb a0@+,a1@+			; 14*12
>	tstb a0@			; 14*8
>	jne L4				; 13*10+1*8
>L5:
>	rts				; 16
>.comm _buf,20
>
>
>;/* test.s produced by me */		; (cycles)
>	lea	text(pc),a0		; 8
>	lea	buf(pc),a1		; 8
BUG! This is not the same algorithm as in the C code. Your code would move
a NUL byte if an empty string was passed, when it should do nothing. I don't know
why GCC doesn't generate PC relative instructions, but I think it has to do
with UNIX and perhaps the scatter load/memory partitioning.

>.loop	move.b	(a0)+,(a1)+		; 14*12
>	bne.s	.loop			; 13*10+1*8
>	rts				; 16
>text	dc.b	'This is a test',10,0
>buf	ds.b	20
>
>NO COMPARISON DUDE!  GCC makes 3 totally wasted instructions, and one of
>them is inside your loop.  Try an example with nested loops and your
>wasted clock cycles become a geometric progression.  Multiply the kind
>of inefficiencies that GCC demonstrates here by EVERY loop and EVERY
>function you have and your program is slower and bigger than it needs
>to be.  To be specific, the GCC routine is 6 words longer and by the
>time it is done executing, it will take 128 more clock cycles than
>mine will (on a 68000).  Your routine takes 466 total clocks to execute,
>mine takes 338.  I'm just your average 68000 assembler language programmer,
>but I saved 28% CPU time.  You might also note that your 'C' source is
>7 lines of code and so is my assembler code.

  This code is compiled to run on a 68030 @ 50MHz, the difference in speed
would be _very_ small. Hey, someone compile this on SAS/C with ALL
optimizations on. You're just your average assembly language programmer
yet you introduced a subtle bug into the program that will stomp on the
first byte of a static global string by attempting to copy a null
string. In a HUGE program this bug would be so subtle that it may
take days to find out how a memory location is being sporadically
trashed. Worse yet, a null string will have garbage following it that could
span hundreds of bytes. 

 Your assembler code has the C equivalent of 'do { *d++=*s++; } while(*s);'

>Just think, the OS ROM routines are written in 'C' and compiled with
>a lesser compiler than gcc (5 years ago).
  Just think, OS2.0 is compiled with SAS/C.

>
>
>>  Also, it's not the language but the algorithm that is responsible for
>>how fast a routine runs. Compare a C coded Boyer-Moore string search
>>with an Assembly coded brute-force byte by byte search, the C code would
>>probably win without optimizations turned on.
>>
>
>You should compare apples to apples.  Compare your Boyer-Moore string search

>in assembler language with the one in 'C' on the Amiga.  How about comparing
>an assembler coded Boyer-Moore string search against a 'C' coded brute-force
>byte by byte search?  You'd complain it's not fair either.
   The point is, OPTIMIZATIONS only account for a marginal linear speed
improvement while choosing the right algorithm could result in an order
or two of magnitude improvement. Matt's assembler is perfect proof. If
Matt coded it in assembly and added some features it would blow away DevPac
and probably beat ArgAsm.


--
/~\_______________________________________________________________________/~\
|n|   rjc@albert.ai.mit.edu   Amiga, the computer for the creative mind.  |n|
|~|                                .-. .-.                                |~|
|_|________________________________| |_| |________________________________|_|

davewt@NCoast.ORG (David Wright) (04/02/91)

In article <mykes.0362@sega0.SF-Bay.ORG> mykes@sega0.SF-Bay.ORG (Mike Schwartz) writes:
>
>You explicitly said they should have cut out levels that were parts of
>their other products.
	Yes he did, if they were unnecessary. And I personally consider
"Intros" that take up almost all of the first disk (Like in Fiendish
Freddy) a total waste of my money. I paid for a disk and the code on it that
I will MAYBE watch once, if at all.
>when you compare it with Unix.  Game developers treat the Amiga like it
>is a PC by using the OS.  You can't take over a PC, because you need the
>BIOS to interface to a variety of hardware configurations.  The multitasking
	Oh PLEASE. It is SOP to take over PC's all the time. I don't think
you know thing one about PC programming. The BIOS is *NOT* the OS, or
even PART of the OS. It is a set of ROM functions that ALL PC compatible
computers contain, as well as a set of known entry points. And yes they DO
indeed take over the whole machine, and yes they DO have to write a special
version for EACH of the video displays they support. If you think they just
write one version, you are sadly mistaken (or they have been lazy and chosen
to support only one video mode, such as EGA or VGA (of course, you probably
consider this good, after all, why produce something that the majority of
the people (more than 60% of all PC's have <EGA graphics))).
	What is TRULY sickening is that many of these games:
	1) ARE HD installable
	2) DO allow you to return to the OS without rebooting
>copper, audio, etc.  Most of the PC people I have seen that move over to
>the Amiga struggle with volumes and volumes of poorly illustrated RKM
>manuals and typically don't make games that I rate very high.
	Well they are ex-PC programmers, aren't they? Do you really expect
them to know anything about how to do decent graphics and sound? :-)
>
>The Amiga operating system is not a high performance video game operating
>system.  BOBs are slower than what I use.  Intuition takes 30% of the CPU
>time when you just move the mouse around (check out a CPU performance
>monitor on a 68000 machine while moving the mouse).  Layers are totally
>unnecessary and way too slow.  Exec tasks require a minimum of 2K of stack
>each, while any game I ever do needs only 512 bytes of stack for 80 tasks
>under my own kernel.
	So what. You can disable the OS while your game is running, and
reenable it when you are done. If you can't do this, you aren't anywhere
near as good a programmer as you think. At least you could argue that you
don't have the room in a 512k machine to do this (I'll let the other Mike
take on THAT point).
>The ROM Kernel routines have many many bugs in them that you end up
>programming your way around.  It is not lazy to want to avoid the
	Really? Which are they?
>programming your way around.  It is not lazy to want to avoid the
>hassle.  It is just more cost effective to make the best games the
	I think most people would consider doing anything "to avoid
hassle" of doing something in a C= recommended and system friendly way
to be lazy.
>The only thing that the OS gets for you is the ability to use hard
>disks.  If commodore were smart, they would make ROM routines
>accessable for video games to access the hard disk when the OS is not
>running.  This is really what the machine needs. 
	Really? and be stuck in the cruddy position of PC's, in which there
are a limited number of drive types supported by the BIOS? The HD driver
is part of the HD controller, which you are free to choose any ST-506, ESDI,
IDE, SCSI, etc. drive. How do they know how any manufacturer will create
the drive controller?
>manipulate the data.  IFF format, for example, is fine for a source file
>format, but wastes disk space in a product.
	Plus, it is too easy for people to steal and use in other programs.
I won't argue with you there. Don't forget the "I" in IFF stands for
interchange. If you don't want to do that, IFF is not for you.
>Most people who program the Amiga don't have the ability (or gumption)
>to write in assembler language.  These are the people who I would
>call lazy.  Most people who program the Amiga don't have the ability
	Gee, I would call them intelligent. Why write in a language which
in many cases produces only marginally better code, is harder to maintain,
is much less portable, takes longer to develop in, and makes it far easier to
produce bugs?
>call lazy.  Most people who program the Amiga don't have the ability
>to write their own native operating systems that outperform the ROM
>Kernel, so for them the OS is the only choice.  In my case, I write
	This is just silly. This is like saying that most Americans don't
have the ability to speak a foreign language, so English is the only
choice.
	1) Just because people DON'T do something doesn't mean they can't
	2) English is spoken (or at least understood, to some degree) by
		more people in more countries than any other language
		aside from Chinese. English is *THE* international
		language of navigation (aero and naval), and is generally
		understood in technical fields. Same for C. More people
		know C than know assembly, so your code will be more easily
		modified by others if it is written in C instead of some
		vernacular of AL.
	3) I would be just as justified to call all AL programmers
		"lazy" or "not as good as C programmers" because they must
		not be able to understand higher-level computer science,
		as they only use low-level programming methods that any
		4th grader could learn. Would that be correct?
>assembler language because I can.  I take over the machine because
>I can.  People actually pay for what I program, so in order to give
>them the best I can do, I go the extra mile.  It is ridiculous to
>say that someone who goes to the extra effort that assembler language
>programming takes is lazy.
	But you give them "that extra mile" at significant cost in other
areas, which could ALSO be done in AL, if you spent the time to do so.
>Have you ever taken over the Amiga?  I bet if you did, you'd change
>your tune.  I on the other hand have done things both ways (using the
>OS and taking over), and the power you gain by taking over more than
>offsets the capability to multitask your game with other programs.
>Open your mind and give it a try, then we can really have a productive
>disagreement. 
	Oh please, you obviously don't consider multi-tasking to be a very
useful feature, and couldn't care less whether your program will run the same
on all platforms, or makes use of any expanded features available.
	There is NO reason that Shadow of the Beast could not:
	1) Be HD-installable
	2) Allow you to "pause" the game and work with something else
		and then come back to the game.
	3) Allow you to exit without rebooting.
	True, doing so would possibly prevent it from working on a 512k
machine. But that is a different argument than the "I NEED to take over the
machine to get the speed" whimper.
>********************************************************
>* Appendix A of the Amiga Hardware Manual tells you    *
>* everything you need to know to take full advantage   *
>* of the power of the Amiga.  And it is only 10 pages! *
>********************************************************
	If you only want to write something that can be easily explained
in 10 pages.



				Dave

cpca@marlin.jcu.edu.au (Colin Adams) (04/02/91)

In article <mykes.0476@amiga0.SF-Bay.ORG> mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:
>In article <1991Mar27.211819.19370@neon.Stanford.EDU> espie@flamingo.Stanford.EDU (Marc Espie) writes:
>>In article <mykes.0362@sega0.SF-Bay.ORG> mykes@sega0.SF-Bay.ORG (Mike Schwartz) writes:
>>>In article <23837@well.sf.ca.us> farren@well.sf.ca.us (Mike Farren) writes:
>>[stuff deleted]
>>??? hey ! There aren't only amiga 500 with 512 K around.
>>
>Anything correctly written for the 512K Amiga runs on all Amigas.
>
>Populous is a port from the PC.  As long as Amiga marketing data for
>software is dismal, big companies won't make the investment required
>to make truly awesome Amiga games.  The only companies that do make
>Amiga originals that survive are European.

Sorry to interrupt the personal flame war, but I believe Populous is
a port from the ST.  It was programmed by Bullfrog in England, where the
major machines for games are the ST and the Amiga, and people still have 512k
and no hard drive.

>Yes, I feel a lot uneasy about many games on the Amiga.  I can't stand
>it when I see an action game written in 'C'.  It is easy to tell which
>games are done this way, and they suck.  

I've seen a demo from some people here on campus who did their 3rd year
graphics project on the Amiga in C with critical bits in assembler.  They
did take over the system but it is in C and is very very fast.  It's called
Nebulas (looks like Backlash) and you should be able to ftp it from
somewhere.  If you know what you're doing you can make C very fast...

>********************************************************
>* Appendix A of the Amiga Hardware Manual tells you    *
>* everything you need to know to take full advantage   *
>* of the power of the Amiga.  And it is only 10 pages! *
>********************************************************


-- 
Colin Adams                                  
Computer Science Department                     James Cook University 
Internet : cpca@marlin.jcu.edu.au               North Queensland
'And on the eight day, God created Manchester'

jesup@cbmvax.commodore.com (Randell Jesup) (04/02/91)

In article <1991Apr2.002631.22799@mintaka.lcs.mit.edu> rjc@geech.gnu.ai.mit.edu (Ray Cromwell) writes:
>>>The following was compiled with GCC -O -fstrength-reduce -fomit-frame-pointer
>>>I don't have SAS C on the Amiga, but I'm sure it produces similar results.
>>>
>>>/* test.c */
>>>
>>>char buf[20];
>>>main()
>>>{
>>>  char *d=(char *)&buf;
>>>  const char *s="This is a test\n";
>>>  while(*s) { *d++=*s++; }
>>>}

SAS C: (5.10a)
       | 0000  48E7 0030                      MOVEM.L   A2-A3,-(A7)
       | 0004  47EC  0000-02.2                LEA       02.00000000(A4),A3
       | 0008  45EC  0000-01.2                LEA       01.00000000(A4),A2
       | 000C  6002                           BRA.B     0010
       | 000E  16DA                           MOVE.B    (A2)+,(A3)+
       | 0010  4A12                           TST.B     (A2)
       | 0012  66FA                           BNE.B     000E
       | 0014  4CDF 0C00                      MOVEM.L   (A7)+,A2-A3
       | 0018  4E75                           RTS

	It does use a2/a3 instead of a0/a1.  However it beats the GNU
version slightly by jumping to the test instead of having two copies of it.
I suspect the next major release of SAS will have this fixed (I talk to them
often, and have suggested this improvement to them before).
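
	For anyone who wants to see the two loop shapes without reading
68000 listings, here is a rough sketch of them in C. The function names are
made up for illustration, and the "two tests" version is only my reading of
the description above (the GCC listing itself is not reproduced here), so
treat it as an assumption rather than actual compiler output. Like the
original test.c loop, neither version copies the terminating NUL.

```c
/* Shape 1: branch to the test first, so the test exists only once.
 * This mirrors the BRA.B / MOVE.B / TST.B / BNE.B layout shown above. */
void copy_jump_to_test(char *d, const char *s)
{
    goto test;
body:
    *d++ = *s++;
test:
    if (*s)
        goto body;
}

/* Shape 2: the test is duplicated, once as a guard before the loop
 * and once at the bottom of the loop body. */
void copy_two_tests(char *d, const char *s)
{
    if (*s) {
        do {
            *d++ = *s++;
        } while (*s);
    }
}
```

Both functions copy the same bytes; the difference is only in how many
copies of the test end up in the generated code and whether an extra
branch is executed once on entry.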

-- 
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.commodore.com  BIX: rjesup  
Disclaimer: Nothing I say is in anything other than my personal opinion.
Thus spake the Master Ninjei: "To program a million-line operating system
is easy, to change a man's temperament is more difficult."
(From "The Zen of Programming")  ;-)

barrett@jhunix.HCF.JHU.EDU (Dan Barrett) (04/02/91)

In article <dillon.5779@overload.Berkeley.CA.US> dillon@overload.Berkeley.CA.US (Matthew Dillon) writes:
>There are many, many good algorithms,...

	Hear, hear!!!  Matt is absolutely right -- the algorithm is very
often the biggest key to obtaining speed in your program.

	In my programming courses (I teach CS at Johns Hopkins), my students
have been shocked to see the blazing speed improvement you can get by using
good algorithms.

	Here is a simple example:  raising a number to a power.  I took a
typical Power() function and raised a number to the 1,000,000,000 power.  It
took 30 minutes on a VAX 8530:

		double Power(double number, long exponent)
		{
			long i;
			double answer = 1.0;
			for (i=0; i<exponent; i++)
				answer *= number;
			return answer;
		}

{Now code the same algorithm in assembler.  Make EVERY little optimization
you like -- remove instructions from the loop, etc.  I guarantee that, with
this algorithm, you might shave off a minute or two from the run time.}

	Next, I changed the algorithm.  It made the source code TWICE AS
LONG (oh no!), and it even used (horror of horrors!) recursion.  The run
time dropped to 0.5 seconds.  Yes, you read that right.  0.5 SECONDS instead
of 30 MINUTES.
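
	The faster source is not shown above, so the following is only a
sketch of one well-known way to get that kind of improvement: recursive
squaring. The names power_slow and power_fast are mine, and the sketch
assumes a non-negative exponent.

```c
/* Straightforward version: one multiplication per unit of exponent. */
double power_slow(double number, long exponent)
{
    double answer = 1.0;
    long i;

    for (i = 0; i < exponent; i++)
        answer *= number;
    return answer;
}

/* Recursive squaring: the exponent is halved at every level, so the
 * recursion depth is about log2(exponent).  Assumes exponent >= 0. */
double power_fast(double number, long exponent)
{
    double half;

    if (exponent == 0)
        return 1.0;
    half = power_fast(number, exponent / 2);
    if (exponent % 2 == 0)
        return half * half;
    return half * half * number;
}
```

For an exponent of 1,000,000,000 the loop does a billion multiplications,
while the recursive version goes about 30 levels deep with at most two
multiplications per level.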

	Every programmer should read, read, READ those algorithms books.
You have the collective wisdom of ALL COMPUTER SCIENCE at your fingertips.
Use it and enjoy it!

                                                        Dan

 //////////////////////////////////////\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
| Dan Barrett, Department of Computer Science      Johns Hopkins University |
| INTERNET:   barrett@cs.jhu.edu           |                                |
| COMPUSERVE: >internet:barrett@cs.jhu.edu | UUCP:   barrett@jhunix.UUCP    |
 \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\/////////////////////////////////////

farren@well.sf.ca.us (Mike Farren) (04/02/91)

mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:

>>>Unfortunately, the sales life of a game is about 3 months.  Royalties
>>>don't keep trickling in.
>>
>>	Good _games_, ones that have a depth beyond flashy graphics, and
>>have replayability, do continue to sell (though they do best when first
>>released, like most authored products).  Sure, they do eventually trend 
>>towards 0, but by no means do they walk off a cliff for a good game
>>(or even a well-done flashy game).
>
>Wrong.  If a dealer has the luck to sell out every copy of a game that
>he has after 3 months, he won't order more.  He'd rather use the shelf
>space for a new game.

Hmm.  Then why is it that I'm still getting royalty payments for Storm
Across Europe, nearly a full year after its release?  Why is it that my
very first game, Temple of Apshai, paid me substantial royalties for
*four* years?

Action games, glitzy games, arcade-type shoot-em-up games; yes, those do
have a very short lifespan, remaining alive only until the next, glitzier,
flashier game comes along (I own a few of those myself :-).  A good, solid
game with depth can sell for a long, long time.
-- 
Mike Farren 				     farren@well.sf.ca.us

sschaem@starnet.uucp (Stephan Schaem) (04/02/91)

 If you think a library function is slow, look at it...
 Usually you will find ugly C :-) (yes, sometimes it's pretty).
 The Amiga does A LOT when it comes to refreshing the View copperlist!
 Use a UCopList, build it yourself, and bypass the system colors or
 whatever... You might have twice the hardware register definitions, but
 it will be faster.
 And if you use your own CList you can use speed tricks (of course not
 involving any Wait :-)

 I'm using only MakeVPort, MrgCop, and LoadView, and I think that's all
 you need to do your stuff...
 Calling the three above rebuilds the View copperlist for your screen's
 VPort, but be careful if you use it to move screens :-) It can get pretty
 weird.
 Anyway, just use your own CList. It will be A LOT better.
 I used to do the above for a 64-color palette on hires screens; it worked
 pretty well!

 The hires screen was in 8 colors:-)

sschaem@starnet.uucp (Stephan Schaem) (04/02/91)

 About C assembler output...
 A year ago, 15% was the accepted number for the C/assembly speed
 difference.
 (I'm talking about 68000 code...)
 But if you look at what people have been using until now, it's not a
 pretty scene.
 And why would you use a C compiler when you know you will never port your
 code?
 And why would you use a C compiler on a 680x0 machine?
 Check out some MANX C/asm source... does it make any sense? Declaring
 variables as register, having restrictions, conditional assembly, etc...

 So if you do Amiga tools only, don't you think C is a little too much?
 Also, the mulu #17 example should be optimized by a macro assembler
 as by a C compiler (a macro assembler should let you decide...).
 And in C, like in ASM, people use mulu tables for cycle-sucking
 subroutines...

 But let's get to the question: what does C offer over a NON-classic
 Motorola macro assembler?

 I assume we are not talking about multi-platform ports...

 I know just a little about C, but still know how to write a simple
 paint program in it. I usually read C to convert it to ASM, like I did
 with some IRIS C source for image decompacting.
 (Just in case somebody cares, or would ask :-)

sschaem@starnet.uucp (Stephan Schaem) (04/02/91)

 Just want to say that some people don't use ASM for its speed..!?
 We (I'm talking for 3 people here :-) simply like assembly language
 because we think that way.
 I don't mind C for ports, and my friend doesn't mind Pascal when he does
 Mac software.

So even if C were faster :-) we would stay with our macro assembler.
And Genim2 is slow on a 25MHz A3000, but I work while it's running, and I
don't need many cycles to read/write :-)
Genim2 is reliable and not 'overkilled', that's all.

And when I think of C, I think only of 15% slower for general coding,
and asm for critical parts...

And if Matthew Dillon were thinking of doing a macro assembler... Could he
use a library for the instruction decoding? I think string gadgets should
be able to do calculation.

sschaem@starnet.uucp (Stephan Schaem) (04/02/91)

 Making C run fast requires knowledge of the CPU you run on:
 register usage, and a couple of looks at the code created to see what
 is actually compiled...
 And use precalculation, tables, etc.
 See where I'm getting?
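
 A small sketch of the "precalculation, tables" idea in C, assuming a
 hypothetical 320x200 bitplane with 40 bytes per row. All names and sizes
 here are made up for illustration; the point is only that the multiply is
 done once per row at startup instead of once per pixel access.

```c
#define ROWS          200   /* assumed screen height in lines    */
#define BYTES_PER_ROW 40    /* assumed 320 pixels / 8 bits each  */

/* One precalculated byte offset per screen row. */
static unsigned short row_offset[ROWS];

/* Fill the table once, before any drawing is done. */
void init_row_offsets(void)
{
    unsigned short y;

    for (y = 0; y < ROWS; y++)
        row_offset[y] = (unsigned short)(y * BYTES_PER_ROW);
}

/* Byte offset of pixel (x, y) inside the bitplane: a table lookup
 * plus a shift, with no multiply in the hot path. */
unsigned short pixel_byte_offset(unsigned short x, unsigned short y)
{
    return (unsigned short)(row_offset[y] + (x >> 3));
}
```

 The caller is expected to keep y below ROWS; the sketch does no range
 checking.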

							Stephan.

BAXTER_A@wehi.dn.mu.oz (04/02/91)

In article <mykes.0476@amiga0.SF-Bay.ORG>, mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:
> 
> Is soundtracker PD?  You get what you pay for.


Anyone who has had the opportunity to compare BGraphics with Multiplot will
know that in no sense of the words do you necessarily "get what you pay for"
when it comes to commercial software.

Regards Alan

jdickson@jato.jpl.nasa.gov (Jeff Dickson) (04/03/91)

In article <holgerl.0703@amiux.UUCP> holgerl@amiux.UUCP (Holger Lubitz) writes:
>In article <mykes.0926@amiga0.SF-Bay.ORG> mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:
>>As far as 'C' goes, I have stepped through enough disassembled 'C' code to
>>see enough.  How much RAM is wasted on LINK, UNLK instructions?  How many
>>cycles are wasted by pushing arguments on the stack to call a subroutine?
>>How many cycles are wasted by calling a "glue" routine?  How many cycles
>>and bytes are wasted by fixing up the stack after each and every subroutine
>>returns?  How much stack do you need to allow for all the dynamic allocation
>>of local variables?
>
>What kind of disassembled 'C' code did you step through ? I suppose
>it was Greenhills C or some other ancient compiler.
>Optimized code from recent C compilers like SAS 5.10a doesn't do this, if you
>don't want it to. You can pass parameters through registers, eliminating the
>need to pass them via stack and also making glue routines obsolete.
>And dynamic allocation of local variables is nice to have as an aid in
>structured programming - however, it is not needed. You can program in C
>using only global variables and constants - just like you probably do when
>programming in assembler.
>
>Best regards,
>Holger
>
>--
>Holger Lubitz            | holgerl@amiux.uucp
>Kl. Drakenburger Str. 24 | holgerl@amiux.han.de
>D-W-3070 Nienburg        | cbmvax.commodore.com!cbmehq!cbmger!amiux!holgerl

	I'm not even going to attest that 'C' code is just as good as assembly.
However, given a decent compiler, how sloppy or bulky the result turns out is
largely up to you. I have written 'C' code that is nearly as compact and fast
as it would be if it were written in assembly.

	I began my software ventures writing 8080 and Z80 assembly language
programs. Back then, there was a real effort to write optimized programs,
because the processor's speed was < 6 MHz and memory was limited to 64K. I
wrote a source level debugger that used the PRN files Microsoft's M80 macro
assembler produced. It was only 16K. Nowadays, processor speed and memory are
not as important, and so most programmers don't care or take the time to
write compact and efficient code. 

	If you think that it's OK to write half-assed code because an optimizer
is going to clean it up for you, then you're in for a rude awakening. Also,
beware that the library is often made up of poorly written code. At least
that's the way it is for the MANX 3.x libraries. I have my own library. The
functions it contains suffice for most applications - not all. I make liberal
use of Amiga.lib where applicable.

			-jeff

kent@vf.jsc.nasa.gov (04/03/91)

>     You only have so much time to write your game.. what you can do in a
>     month with C would take a year in assembly, so you have a lot more
>     time to think about the algorithms you choose writing the complex parts
>     in C than you have writing it all in assembly.. a LOT more time.


	Exactly,  A programmer can only write and debug so many lines of code
	a day.  So, the more each line of code does, the more productive 
	the programmer is!! 

	No one can dispute that each 'C' statement does more than assembly.
	No one can dispute that writing the code in assembly creates tighter
	code.  The question: is the extra speed of the pure assembly code
	needed?  The answer: not very often, and probably only in critical
	sections of the code.

	- Mike Kent

dltaylor@cns.SanDiego.NCR.COM (Dan Taylor) (04/03/91)

In <mykes.0774@amiga0.SF-Bay.ORG> mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:
>I also use OS calls to find out how many floppy drives
>are connected, where any FAST memory is, and what kind of CPU the
>machine has.

>The first thing the Kernel does is to copy itself down to low
>memory ($200).  >I should
>also point out that immediately after copying itself down to low memory,

>I can put graphics screens and the stack anywhere I want.  For a 16 
>color game, I put the stack at $80000 and a screen at $78000.  The 
>stack needed for any program I write is < 512 bytes.  The resulting 
>memory map gives me from $100 to $78000 to squeeze the game into.  And

Why, if you have done the checking that you claim, do you SLOW down games
that are running on expanded/enhanced machines?  If you've taken over the
hardware, you certainly DO NOT want your vectors in low memory, if there
is a 68010, 68020, or 68030 installed, since they can relocate them
using the VBR.  On any Amiga with an 020 or 030, why put ANY code or
non-CHIP data into anything but 32-bit RAM?  You get to use the absolute
address modes, which are SLOWER than the register relative modes???  It
seems to me that a couple of "movem"s at the various entry
points in your coroutine loop would allow easy access to all the registers
you need at any point in the program.

>I also get the benefit of putting my variables in low memory.  You
>see, the 68000 allows you to use absolute short addressing mode
>to access these, which frees a register for other things.

So, you trade a spare register, for FORCED contention between the blitter
and the CPU.  What's the saved cycle trade-off in wait-states vs address-
mode calculations, especially given that address-register-relative, WITH
displacement, has the same ac time as absolute short?

It almost seems that you are trying to prove Mike Farren's point for him.
You just do not WANT to use the OS, if it IS possible, so you won't
even try.  Well, you can program in whatever style you choose, but my
MONEY will be spent on games programmed in the style that I choose.
Wouldn't you, and your publishers, like to get some of it?  If so, help
me by writing what I'll buy, rather than what you want.  And even the
avid gamers that I know are tired of trashing their environment, so
they are not buying much, if anything.

Dan Taylor
* My opinions, not NCR's. *

mykes@amiga0.SF-Bay.ORG (Mike Schwartz) (04/03/91)

In article <dillon.5831@overload.Berkeley.CA.US> dillon@overload.Berkeley.CA.US (Matthew Dillon) writes:
>In article <1991Apr1.020748.26863@mintaka.lcs.mit.edu> rjc@geech.gnu.ai.mit.edu (Ray Cromwell) writes:
>>In article <mykes.0926@amiga0.SF-Bay.ORG> mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:


>>>As far as 'C' goes, I have stepped through enough disassembled 'C' code to
>>>see enough.  How much RAM is wasted on LINK, UNLK instructions?  How many
>>>cycles are wasted by pushing arguments on the stack to call a subroutine?
>>>How many cycles are wasted by calling a "glue" routine?  How many cycles
>>>and bytes are wasted by fixing up the stack after each and every subroutine
>>>returns?  How much stack do you need to allow for all the dynamic allocation
>>>of local variables?  512K is not a lot of RAM to go and waste memory all the
>>>time.  A 7.14 MHz 68000 isn't fast enough to waste all the extra cycles, if
>>>you are striving for performance.
>>
>>  Obviously you have no idea of how advanced today's optimizing compilers
>>are.  The code you stepped through must have been produced by some
>>1970's MetaComco compiler or something. But FYI, most of todays compilers
>
>    I agree, you haven't looked at the output from a good compiler... GCC
>    is one of the best, though on an Amiga it runs the bejeezes slow
>    compiling something.  You will ALWAYS be able to write assembly that
>    goes faster, but compared to a good C programmer the result will not
>    go *that* much faster, only a little faster.
>
>    Not that I am advocating you write the game in C, clearly you
>    simply do not want to accept any arguments in favor of C, but
>    I'll try once to convince you.
>

[ stuff deleted ]

>					    -Matt
>
>--

Hey, Matt (and others):

In a big way, we are both arguing different points.  I USE C A LOT.  I
just don't use it for anything I'd sell to anyone.  I also wouldn't
use it to make a program that I am going to use a lot.

Ray Cromwell sent me a code fragment he compiled (it was posted in
this group) with GCC.  I AM impressed, with GCC, but I (an average
coder) put it to shame.  What makes things worse (about GCC) is that
EVERY single subroutine compiled with it loses in terms of size and
speed.  And people wonder why some programs with LOTS of functionality
are tiny (Trackdisk.device in 2.0 is 7K :) while your standard
printf("Hello, world\n") program is huge.   Sometimes this tradeoff is
something you can live with; other times it's not.

Even if you intend to go back and recode the entire program in
assembler language, it is rare that it ever works out that way.  Some
other neato project comes along and you get stuck with what you have.
Complete and thorough and well polished programs do take months and
even years, even in 'C'.

Consider all the commercial programs you have on your hard disk that
were written in 'C' and compiled before GCC was available for the
Amiga!  I don't have the luxury of taking DPaint (for example) sources
and running them through a better compiler.  Dan Silva doesn't work at
EA anymore, so there IS some possibility that there won't be a DPaint
IV that was compiled with a better one.  How about the ROM Kernel that
we both love so much?  Wouldn't it be nice to have been able to run
the sources to it through improved compilers as they came out?
Instead we are *still* waiting for 2.0 to go to ROM, and it still
won't be compiled with a compiler that does as good as a human.  How
many years are we going to have to settle for what today's compiler
was able to do? 

How many LINK, UNLK instructions do I unnecessarily have on my hard
disk?  How much for all the hard disks in the Amiga community alone?
Terabytes?  I often write programs that need to fit in small machines,
so I'd rather save tens of thousands of bytes in a (> 100K) program. 

Now when you code portions of your programs in assembler, I agree that
those routines gain back most of the size and speed lost by using a
compiler.  However, if you have a few hundred of these routines
scattered in a lot of programs, how much work is it to change them to
take parameters in registers instead of from the stack when a new
compiler comes along that you want to use?  What if you use #asm ...
#endasm and your new compiler doesn't support it?

Maybe someone can answer these questions for me:

Why does the best compiler on the Amiga run the bejeezes slow?  In my
experience, your development cycle (edit/compile/link/run/debug and
repeat all the above) is critical to your productivity.

Why isn't something like LightSpeed 'C' available for the Amiga?  It
flies and generates awesome code.

Why hasn't someone made the entire c.lib into a loadable library
so all programs can share it instead of duplicating these routines
hundreds of times all over everyone's hard disks?  Why not even
just printf.library (this alone would save megabytes on my hard disk)?

Why didn't Jez San write a 'C' compiler?

Why don't 'C' compilers support linkerless usage?

Why don't 'C' compilers know about the routines in the OS without
#pragmas?  While we're at it, why not all the structures and other
things from the header files, too?

Why don't 'C' compiler c.lib libraries automatically use the Exec
Task structure's TC_TRAPCODE field to trap 99% of the gurus that
happen and force programs to exit gracefully?  Maybe the question
is why don't 'C' programmers do it all the time?

Why do 'C' programmers ask me whether I use:
	MULU	#17,d0
instead of:
	move.l	d0,d1
	lsl.l	#4,d0
	add.l	d1,d0
when it is basic programming normally done by assembler language
programmers (one of the oldest tricks in the book)?
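
For readers who don't follow 68000 mnemonics, the same trick can be
sketched in C. The function names below are made up for illustration. One
caveat worth stating: MULU on the 68000 multiplies 16-bit operands into a
32-bit result, while the shift/add sequence works on the whole 32-bit
register, so the two only agree when the input fits in 16 bits. The C
versions below use the same width for both, so they always agree.

```c
/* Multiply by 17 using the identity 17 = 16 + 1:
 * x * 17 == (x << 4) + x.  Unsigned arithmetic wraps the same
 * way in both versions, so the results match for every input. */
unsigned long times17_shift_add(unsigned long x)
{
    return (x << 4) + x;
}

/* The "obvious" version; a compiler is free to turn this into the
 * shift/add form above on its own. */
unsigned long times17_multiply(unsigned long x)
{
    return x * 17UL;
}
```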

Come on people, no flames for this, these are serious questions.
--
****************************************************
* I want games that look like Shadow of the Beast  *
* but play like Leisure Suit Larry.                *
****************************************************

mwm@pa.dec.com (Mike (My Watch Has Windows) Meyer) (04/04/91)

In article <mykes.1332@amiga0.SF-Bay.ORG> mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:
   Even if you intend to go back and recode the entire program in
   assembler language, it is rare that it ever works out that way.  Some
   other neato project comes along and you get stuck with what you have.
   Complete and thorough and well polished programs do take months and
   even years, even in 'C'.

Yeah - but assembler takes even longer. I have more things I want to
do than time to do them already. Coding in ASM from scratch means even
fewer of them are going to get done. So I code in HLLs, and more of
them get done. I also know that they'll require minimal work to move
to another machine.

   Now when you code portions of your programs in assembler, I agree that
   those routines gain back most of the size and speed lost by using a
   compiler.  However, if you have a few hundred of these routines
   scattered in a lot of programs, how much work is it to change them to
   take parameters in registers instead of from the stack when a new
   compiler comes along that you want to use?  What if you use #asm ...
   #endasm and your new compiler doesn't support it?

It doesn't take any more work to rewrite the assembler routines than
it does to add prototypes to the C routines.

If you use #asm and #endasm in your C, you get what you deserve.

   Maybe someone can answer these questions for me:

   Why is the best compiler on the Amiga run the bejeezes slow?  In my
   experience, your development cycle (edit/compile/link/run/debug and
   repeat all the above) is critical to your productivity.

It depends on what you mean by the "best compiler". In any case, I
find that accurate type-checking is far more critical than the
development cycle time. That means you get to skip a fair part of the
compile step, and all of the link/run/debug steps for that cycle, and
probably several other trips through the cycle while you try and track
down the bug. Ideally, of course, you blow off the compile/link phases
until you're ready to generate production code.  Getting either of
these benefits in asm seems to be impossible.

   Why isn't something like LightSpeed 'C' available for the Amiga?  It
   flies and generates awesome code.

Because the IBM PC & MAC markets are much more competitive than the
Amiga market.

   Why hasn't someone made the entire c.lib into a loadable library
   so all programs can share it instead of duplicating these routines
   hundreds of times all over everyone's hard disks?  Why not even
   just printf.library (this alone would save megabytes on my hard disk)?

See above.

   Why don't 'C' compilers support linkerless usage?

See above.

   Why don't 'C' compilers know about the routines in the OS without
   #pragmas?  While we're at it, why not all the structures and other
   things from the header files, too?

C compilers know as much about the routines in the OS & the structures
as any assembler I've seen. Of course, I didn't evaluate every
assembler on the market, and some of them may have been designed with
symbol tables wired into them. Silly idea....

   Why don't 'C' compiler c.lib libraries automatically use the Exec
   Task structure's TC_TRAPCODE field to trap 99% of the gurus that
   happen and force programs to exit gracefully?  Maybe the question
   is why don't 'C' programmers do it all the time?

   Why do 'C' programmers ask me whether I use:
	   MULU	#17,d0
   instead of:
	   move.l	d0,d1
	   lsl.l	#4,d0
	   add.l	d1,d0
   when it is basic programming normally done by assembler language
   programmers (one of the oldest tricks in the book)?

Because they're not well-educated? Personally, I'd ask the compiler
writers why they didn't use the shift/add technique for constants, as it's
one of the oldest tricks in the book. Of course, you do have to watch
out for things like that. The next rev of the CPU could make the
obvious-but-slow sequence faster (i.e. - the obs instruction moves
from microcode to hardware, or some such).

BTW, if you want things that _really_ fly, you use a processor with a
WCS (or better), and code at that level. It can be harder (if it's
horizontal, much, much, much harder) than writing assembler, but you
can do away with lots of slow activities, most notably touching
off-CPU memory.

Come to think of it, I once made the mulu example faster than the
asm-level shift/add by optimizing the mulu code to do the shift/add in
a single instruction instead of two. But that was long ago and far
away....

	<mike

--
Lather was thirty years old today,			Mike Meyer
They took away all of his toys.				mwm@pa.dec.com
His mother sent newspaper clippings to him,		decwrl!mwm
About his old friends who'd stopped being boys.

dillon@overload.Berkeley.CA.US (Matthew Dillon) (04/04/91)

In article <1991Apr2.093855.28281@starnet.uucp> sschaem@starnet.uucp (Stephan Schaem) writes:
>
> Making C run fast requires knowledge of the CPU you run on:
> register usage, and a couple of looks at the code created to see what
> is actually compiled...
> And use precalculation, tables, etc.
> See where I'm getting?
>
>							Stephan.

    Actually, no, I don't...  is this another blanket statement about C?

				    -Matt
--

    Matthew Dillon	    dillon@Overload.Berkeley.CA.US
    891 Regal Rd.	    uunet.uu.net!overload!dillon
    Berkeley, Ca. 94708
    USA

dillon@overload.Berkeley.CA.US (Matthew Dillon) (04/04/91)

In article <1991Apr2.090315.27856@starnet.uucp> sschaem@starnet.uucp (Stephan Schaem) writes:
> But let's get to the question: what does C offer over a NON-classic
> Motorola macro assembler?
>
> I assume we are not talking about multi-platform ports...
>
> I know just a little about C, but still know how to write a simple
    ^
    *THAT* is obvious!

> paint program in it. I usually read C to convert it to ASM, like I did
> with some IRIS C source for image decompacting.
> (Just in case somebody cares, or would ask :-)

				    -Matt

--

    Matthew Dillon	    dillon@Overload.Berkeley.CA.US
    891 Regal Rd.	    uunet.uu.net!overload!dillon
    Berkeley, Ca. 94708
    USA

dillon@overload.Berkeley.CA.US (Matthew Dillon) (04/04/91)

In article <1991Apr2.091947.27988@starnet.uucp> sschaem@starnet.uucp (Stephan Schaem) writes:
>
>And when I think of C, I think only of 15% slower for general coding,
>and asm for critical parts...

    ok, so?  That seems about right to me for general coding.  Now put the
    core code in assembly and leave the rest in C.. what do you get?

    On the other hand, for medium to large programs it can take 10 times
    as long to do it in assembly..  or longer!  That's the key tradeoff.

>And if Matthew Dillon were thinking of doing a macro assembler... Could he
>use a library for the instruction decoding? I think string gadgets should
>be able to do calculation.

    *what* are you talking about?

					-Matt

--

    Matthew Dillon	    dillon@Overload.Berkeley.CA.US
    891 Regal Rd.	    uunet.uu.net!overload!dillon
    Berkeley, Ca. 94708
    USA

jsmoller@jsmami.UUCP (Jesper Steen Moller) (04/04/91)

In article <mykes.1332@amiga0.SF-Bay.ORG>, Mike Schwartz writes:

> Consider all the commercial programs you have on your hard disk that
> were written in 'C' and compiled before GCC was available for the
> Amiga!  I don't have the luxury of taking DPaint (for example) sources
> and running them through a better compiler.  Dan Silva doesn't work at
> EA anymore, so there IS some possibility that there won't be a DPaint
> IV that was compiled with a better one.  How about the ROM Kernel that
> we both love so much?  Wouldn't it be nice to have been able to run
> the sources to it through improved compilers as they came out?
> Instead we are *still* waiting for 2.0 to go to ROM, and it still
> won't be compiled with a compiler that does as good as a human.  How
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Depends on the human. I'm not sure who, but someone said that any
compiled C program could be coded better by any assembler programmer.
This is wrong. Take, for instance, the switch construct in C.
Most of the "any assembler programmer" crowd would implement it as a
CMP/BEQ chain without giving it further thought. But see how a good C
compiler does it: it takes the case values and builds a jump table, or
does the CMP/BEQ chain (maybe with some kind of "bisection" first), or
possibly a combination of these, all depending on how many case values
there are and how far apart they lie. Now mix this with some
conditional compilation and a lot of #defines.
Now "any assembler programmer" is in deep trouble, while the C
compiler handles all the trivial stuff and the programmer writes
another module and drinks a cup of coffee.
"The really good assembler programmers" (like you seem to be) would
do the work and write programs that gain 15% over the C-compiled
version. But "any assembler programmer" could lose some orders
of magnitude if unlucky. The C programmers would have time for
their coffee and version 2.0, though. This is another example of
Matt's general message (the right algorithm...).
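
To make the two dispatch styles concrete, here is a minimal C sketch of
what a compiler may choose between for a dense switch. The handler names
and the four-case layout are made up for illustration; no particular
compiler is claimed to emit exactly this.

```c
/* Hypothetical handlers, named only for this illustration. */
static int op_inc(int x) { return x + 1; }
static int op_dec(int x) { return x - 1; }
static int op_dbl(int x) { return x * 2; }
static int op_neg(int x) { return -x; }

/* CMP/BEQ style: one compare-and-branch per case, tried in order. */
int dispatch_chain(int op, int x)
{
    if (op == 0) return op_inc(x);
    if (op == 1) return op_dec(x);
    if (op == 2) return op_dbl(x);
    if (op == 3) return op_neg(x);
    return x;                       /* default: unknown op, value unchanged */
}

/* Jump-table style: one range check, then one indexed call. */
int dispatch_table(int op, int x)
{
    static int (*const table[])(int) = { op_inc, op_dec, op_dbl, op_neg };
    const int count = (int)(sizeof table / sizeof table[0]);

    if (op < 0 || op >= count)
        return x;                   /* default case */
    return table[op](x);
}
```

Both functions return the same results; the chain costs more the further
down the matching case sits, while the table costs the same for every case.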

> many years are we going to have to settle for what today's compiler
> was able to do? 
> 
> How many LINK, UNLK instructions do I unnecessarily have on my hard
> disk?  How much for all the hard disks in the Amiga community alone?
> Terabytes?  I often write programs that need to fit in small machines,
> so I'd rather save tens of thousands of bytes in a (> 100K) program. 

Some LINK/UNLKs are waste. Some are there because the source is
modular and easy to maintain, so that you will receive a new version
some day.

> Now when you code portions of your programs in assembler, I agree that
> those routines gain back most of the size and speed lost by using a
> compiler.  However, if you have a few hundred of these routines
> scattered in a lot of programs, how much work is it to change them to
> take parameters in registers instead of from the stack when a new
> compiler comes along that you want to use?  What if you use #asm ...
> #endasm and your new compiler doesn't support it?

Well, the #asm/#endasm wasn't in my Kernighan/Ritchie text, but I
might have missed it... no, actually it's good practice to separate
C and assembler. It helps in the situations described above.

> Maybe someone can answer these questions for me:
> 
> Why is the best compiler on the Amiga run the bejeezes slow?

It's good at optimizing, and that takes time. And, it wasn't written
by Matt (unless you do mean DICE of course...) so it might have
some inefficient algorithms in it...

> In my experience, your development cycle (edit/compile/link/run/debug
> and repeat all the above) is critical to your productivity.

Yeah, but C-coders just do a version 1.1 or 2.0 that pays off, enough
to buy a faster Amiga ;-) ... No, point taken.

> Why hasn't someone made the entire c.lib into a loadable library
> so all programs can share it instead of duplicating these routines
> hundreds of times all over everyone's hard disks?  Why not even
> just printf.library (this alone would save megabytes on my hard disk)?

Yeah, you picked the problem - one c.library is too big, and splitting
up gives too much overhead.
Unless the complete c.library goes into ROM! It's on its way,
just take a look at 2.0 DOS Library. Buffered I/O, FPrintF, etc.
etc. Randell has been a good boy.
Anyway, if I have to use a sprintf in a tiny program, I'd use
RawDoFmt(), which (although not documented under 1.3) does most
of the standard C stuff.
 
> Why don't 'C' compilers support linkerless usage?

Because of scanned libraries. The good ones will link for you,
though (DICE, SAS/C).

> Why don't 'C' compilers know about the routines in the OS without
> #pragmas?  While we're at it, why not all the structures and other
> things from the header files, too?

I recently got the include files for 2.0, and - boy - if my compiler
still would load hard-coded 1.3 values - I'd be reeeal angry. The
key word is flexibility.

> Why don't 'C' compiler c.lib libraries automatically use the Exec
> Task structure's TC_TRAPCODE field to trap 99% of the gurus that
> happen and force programs to exit gracefully?  Maybe the question
> is why don't 'C' programmers do it all the time?

That was a good one - I'll hack up catch.a to do just that, one day. BTW,
it's not c.lib, it should be the startup code (c.o or catch.o or
whatever).

> Why do 'C' programmers ask me whether I use:
> 	MULU	#17,d0
> instead of:
> 	move.l	d0,d1
> 	lsl.l	#4,d0
> 	add.l	d1,d0
> when it is basic programming normally done by assembler language
> programmers (one of the oldest tricks in the book)?

Because, if you were to code e.g. a file requester, and you
had
    VISIBLE_FSEL_LINES EQU 17 ; or whatever

and you'd do a
    MULU.L   #VISIBLE_FSEL_LINES,d0
one day, you'd forget this. Using a define here is good, and any day now
it might be changed to 37, when recompiling for interlace (or whatever).
That is (my humble guess) why we ask.
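
The same point in C, as a small sketch. VISIBLE_FSEL_LINES is the constant
from the example above; the two function names are hypothetical.

```c
/* The constant from the file requester example above. */
#define VISIBLE_FSEL_LINES 17

/* Written against the named constant: stays correct if the value changes,
   and the compiler remains free to use shifts and adds where that wins. */
unsigned long row_offset(unsigned long row)
{
    return row * VISIBLE_FSEL_LINES;
}

/* Hand-expanded for 17 only: (row << 4) + row.
   Silently wrong the day the constant becomes 37. */
unsigned long row_offset_by_hand(unsigned long row)
{
    return (row << 4) + row;
}
```

The two agree only as long as the constant stays 17; change the #define and
just the first one follows.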
 
> Come on people, no flames for this, these are serious questions.

You got no flames, and serious answers.

> ****************************************************
> * I want games that look like Shadow of the Beast  *
> * but play like Leisure Suit Larry.                *
> ****************************************************

Sure, Leisure Suit Larry could use a re-implementation
while Shadow of the Beast would need a re-design.

Greets, Jesper

(Can we conclude that assembler is suitable for "once and for all" coding,
while a C core is easier to maintain if you know your program will have to
be expanded, and that assembler in the right places is a good thing?)
--                     __
Jesper Steen Moller   ///  VOICE: +45 31 62 46 45
Maglemosevej 52  __  ///  USENET: cbmehq!cbmdeo!jsmoller
DK-2920 Charl    \\\///  FIDONET: 2:231/84.45
Denmark           \XX/

ben@epmooch.UUCP (Rev. Ben A. Mesander) (04/04/91)

>In article <1991Apr4.224045.10542@starnet.uucp> sschaem@starnet.uucp (Stephan Schaem) writes:
>
>
> For the Mulu #17 and optimization.
> mulu #17 42 cycle.
> other example 26, but it use 2 register.
> If you dont have the luxary and have better use of it (for you loop)
> you are over 42 cycles not using mulu.
> And other solution in some cases are mulu tables...

Yeah, I've written some Z-80 code to do table-assisted multiplies. After
all, it has no mulu. (You kids today have it *easy*! Assembler?? We had
to type the bytes in with a hex keypad and a 7-segment display!) Table
driven multiplies can be pretty quick...
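
For readers who have not seen one, here is a sketch in C of one common
table scheme, the quarter-square method. This is an assumption about the
general technique, not necessarily the scheme used in the Z-80 code
mentioned above. It relies on a*b = floor((a+b)^2/4) - floor((a-b)^2/4),
which is exact because a+b and a-b always have the same parity. The
function and table names are made up.

```c
#include <stdint.h>

/* quarter_sq[n] = floor(n*n / 4) for n = 0..510; the largest entry,
   510*510/4 = 65025, still fits in 16 bits. */
static uint16_t quarter_sq[511];

/* Fill the table once at startup. */
void init_quarter_squares(void)
{
    uint32_t n;
    for (n = 0; n < 511; n++)
        quarter_sq[n] = (uint16_t)((n * n) / 4);
}

/* 8-bit by 8-bit multiply using two table lookups and one subtraction. */
uint16_t mul8(uint8_t a, uint8_t b)
{
    unsigned sum  = (unsigned)a + (unsigned)b;
    unsigned diff = (a > b) ? (unsigned)(a - b) : (unsigned)(b - a);
    return (uint16_t)(quarter_sq[sum] - quarter_sq[diff]);
}
```

init_quarter_squares() must be called once before the first mul8(); after
that each multiply is an add, a subtract and two lookups.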

> And when will you have C compiler with AI? Optimization is not only
> tricks, but logical solution for you code.
> Again, how many compiler break up large loops for registers usage?

Far be it from me to join in the great C vs. Assembly wars, I just wanted
to point out: "it's in there". GCC is an assembly language independent 
compiler that works internally by converting your C to a lisp-like language
and manipulates this when optimizing. A machine-dependent back end 
converts this to assembler language.

Personally, I think the whole flame war here is bogus. I've programmed in
assembler and C, I like Z-80 assembler, I don't like 68000 assembler, so I
program in C on my Amiga. I can see why someone else likes something else.
I use programs written in assembler (ARexx, Xoper), and programs written
in C (MicroEMACS, XLISP).

So what? Take it to .advocacy and get this newsgroup the *hell* back on
topic. (Not just you, Stephan).

--
| ben@epmooch.UUCP   (Ben Mesander)       | "Cash is more important than |
| ben%servalan.UUCP@uokmax.ecn.uoknor.edu |  your mother." - Al Shugart, |
| !chinet!uokmax!servalan!epmooch!ben     |  CEO, Seagate Technologies   |

jsmoller@jsmami.UUCP (Jesper Steen Moller) (04/04/91)

In article <mykes.1332@amiga0.SF-Bay.ORG>, Mike Schwartz writes:

Something just came to mind:

> How many LINK, UNLK instructions do I unnecessarily have on my hard
> disk?  How much for all the hard disks in the Amiga community alone?
> Terabytes?  I often write programs that need to fit in small machines,
> so I'd rather save tens of thousands of bytes in a (> 100K) program. 

How many fantazilliobytes of graphics, sound and intros do you think
are taking up space on game floppies all around the world? And,
these bytes often _have_ to be seen, _have_ to be heard, and _have_
to be loaded!

Just my $0.02 (or less)!

> Come on people, no flames for this, these are serious questions.

Err - a bit off the subject, but not really a flame.

Greets again, Jesper
--
--                     __
Jesper Steen Moller   ///  VOICE: +45 31 62 46 45
Maglemosevej 52  __  ///  USENET: cbmehq!cbmdeo!jsmoller
DK-2920 Charl    \\\///  FIDONET: 2:231/84.45
Denmark           \XX/

xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) (04/04/91)

 dillon@overload.Berkeley.CA.US (Matthew Dillon) writes:

> Can you image doing arbitrary layers clipping entirely in assembly?
> What a waste of time!

You don't know what pain is, Matt.

(When _I_ was a kid, he continued, Pythonesquely:)

I wrote arbitrary 3D six plane clipping,
in an arbitrary oblique perspective projection,
with real time dynamic motion,
in assembly language,
for installation in firmware

Now _that's_ pain.

No choice, either; the available HOL was too slow by a factor of three
in the given hardware.

Ouch.  Talk about masochism for a buck.

Compared to general 3D clipping in any language, I'll write 2D clipping
for you in hex, typing only with my toes, and feel glad for the chance
to do something easy for a change.

;-)

Kent, the man from xanth.
<xanthian@Zorch.SF-Bay.ORG> <xanthian@well.sf.ca.us>
--
This is all true, but the rest of the Monty Python routine might have to
be fudged a bit for effect.

xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) (04/04/91)

sschaem@starnet.uucp (Stephan Schaem) writes:


> And why would use a C compiler when you know you will never port your
> code?

Sorry, Stephan, but that's an argument for the other team.  The quite
successful Starflight 2 game has never been ported to the Amiga because
it was written in a low level, non-standard language (a homebrew Forth),
and no one has been found crazy enough to sign a contract to try to
port it, even though the much less interesting Starflight 1 game sold
respectably when ported to the Amiga.

Fact is, a big part of the reward for doing a good game comes from porting
it to other platforms, and you're a lot farther from that reward when your
game is in 68000 assembler and the target machine runs an 80286.

Kent, the man from xanth.
<xanthian@Zorch.SF-Bay.ORG> <xanthian@well.sf.ca.us>

mwm@pa.dec.com (Mike (My Watch Has Windows) Meyer) (04/04/91)

In article <mykes.1028@amiga0.SF-Bay.ORG> mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:
   >I don't have SAS C on the Amiga, but I'm sure it produces simular results.
   >
   >/* test.c */
   >
   >char buf[20];
   >main()
   >{
   >  char *d=(char *)&buf;
   >  const char *s="This is a test\n";
   >  while(*s) { *d++=*s++; }
   >}
   >

   /* Test.s produced by gcc */

   #NO_APP
   gcc_compiled.:
   .text
   LC0:
	   .ascii "This is a test\12\0"
	   .even
   .globl _main
   _main:					; (cycles)
	   lea _buf,a1			; 8
	   lea LC0,a0			; 8
	   tstb a0@			; 8
	   jeq L5				; 8
   L4:
	   moveb a0@+,a1@+			; 14*12
	   tstb a0@			; 14*8
	   jne L4				; 13*10+1*8
   L5:
	   rts				; 16
   .comm _buf,20


   ;/* test.s produced by me */		; (cycles)
	   lea	text(pc),a0		; 8
	   lea	buf(pc),a1		; 8
   .loop	move.b	(a0)+,(a1)+		; 14*12
	   bne.s	.loop			; 13*10+1*8
	   rts				; 16
   text	dc.b	'This is a test',10,0
   buf	ds.b	20

   NO COMPARISON DUDE!  GCC makes 3 totally wasted instructions, and one of
   them is inside your loop.

Well, since you declared open season on tweaking the algorithm, I
rewrote that short program with the _obvious_ C code (it's not what
I'd call wonderful to start with). The algorithm used will work with
any valid C pointer to char, even those that point to null strings. It
may even correctly ignore null pointers (but I wouldn't bet on it).

I got this out of SAS C:

	LEA		0012(PC),A0
	LEA		01.00000000,A1
	MOVE.L		(A0)+,(A1)+
	MOVE.L		(A0)+,(A1)+
	MOVE.L		(A0)+,(A1)+
	MOVE.L		(A0)+,(A1)+
	RTS

In your own words "NO COMPARISON DUDE!" It got rid of the loop
completely. It turned 18 memory references into four, a major win on
stock 680[01]0s.

Since loop unrolling - especially from small objects to machine words
- is one of the oldest tricks in the book, I'm surprised you didn't
know about it.

   Just think, the OS ROM routines are written in 'C' and compiled with
   a lesser compiler than gcc (5 years ago).

From the above, it looks like it could have been worse - it could have
been written by a lesser assembler programmer (I'm assuming you're a
greater one) 5 years ago.

For the curious, here's the modified source...

#include <string.h>
char buf[20];
void
main(void)
{
    char *d=buf;
    char *s="This is a test\n";
    strcpy(d, s);
}

Note that the const is now gone; strcpy's second argument is not
declared const, so passing it a pointer to const is questionable.
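
As a rough illustration of what that unrolling amounts to, here is a C
sketch. It is not SAS C's actual transformation, only the same idea written
by hand; the function names are made up. The string plus its terminating
NUL is exactly 16 bytes, so it can be moved as four 4-byte chunks.

```c
#include <string.h>

char buf[20];

/* 15 characters plus the terminating NUL: exactly 16 bytes. */
static const char text[16] = "This is a test\n";

/* Byte-at-a-time copy, including the NUL: 16 loads and 16 stores. */
void copy_bytewise(void)
{
    const char *s = text;
    char *d = buf;

    while ((*d++ = *s++) != '\0')
        ;
}

/* The same copy unrolled into four fixed 4-byte moves. Fixed-size
   memcpy calls are used so the code stays portable; a compiler is
   free to turn each one into a single longword move. */
void copy_unrolled(void)
{
    memcpy(buf + 0,  text + 0,  4);
    memcpy(buf + 4,  text + 4,  4);
    memcpy(buf + 8,  text + 8,  4);
    memcpy(buf + 12, text + 12, 4);
}
```

Either routine leaves the same 16 bytes in buf; the second simply has no
loop and no per-byte test.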

	<mike
--
I went down to the hiring fair,				Mike Meyer
For to sell my labor.					mwm@pa.dec.com
I noticed a maid in the very next row,			decwrl!mwm
I hoped she'd be my neighbor.

sschaem@starnet.uucp (Stephan Schaem) (04/05/91)

 On the MULU #17 optimization:
 MULU #17 takes 42 cycles.
 The other example takes 26, but it uses 2 registers.
 If you don't have that luxury because you have a better use for the
 second register (for your loop), you end up over 42 cycles by not
 using MULU.
 Another solution in some cases is multiplication tables...

 And when will you have a C compiler with AI? Optimization is not only
 tricks, but a logical solution for your code.
 Again, how many compilers break up large loops for register usage?

 And when the time comes that you need to save 2 or 4 cycles in a loop,
 you don't use C for the job...
 And actually 2 or 4 cycles can make a huge difference.

 And people should create libraries when they can! Look at the basic
 assembler from mithctron. Very small, and it uses xxK.library (sorry, my
 assign went away once more!)
 -Does ANYBODY have that problem? You do something and the system no
 longer knows system: or whatever was assigned? This is on an A3000.
 Or does anybody know where to get info on the tasks that are running in
 resident mode?
 I know it's the third time I've asked for help, but maybe someone will
 see it :-)

							Stephan.

rjc@geech.gnu.ai.mit.edu (Ray Cromwell) (04/05/91)

In article <1991Apr4.224045.10542@starnet.uucp> sschaem@starnet.uucp (Stephan Schaem) writes:
>
>
> For the Mulu #17 and optimization.
> mulu #17 42 cycle.
> other example 26, but it use 2 register.

   The multiplication tricks are old and a result of simple mathematics.
Simply decompose an operation into binary shifts and adds. Consider
multiplying by the number 40. This is easy.
   40=32 + 8

   So C * 40=32C + 8C
   Multiplying by 32 and 8 are just shift operations. 
   So multiplying register D0 by 40 in binary math is:
   move.l d0,d1        ; d1 = C
   lsl.l  #5,d0        ; d0 = 32C
   lsl.l  #3,d1        ; d1 = 8C
   add.l  d0,d1        ; d1 = 32C + 8C = 40C (result ends up in d1)

  With this method, you can decompose any number into its binary 
components and use shift operations. The problem is, it's very
hardware dependent. So what you do on a 68000, may actually slow
a 68040 down.
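
The decomposition can be sketched in C as follows. mul40 is the 40 = 32 + 8
case worked above; mul_shift_add is a hypothetical general version that
walks the set bits of the constant. Neither is claimed to be what any
particular compiler generates.

```c
/* The worked example: 40 = 32 + 8, so C*40 = (C << 5) + (C << 3). */
unsigned long mul40(unsigned long c)
{
    return (c << 5) + (c << 3);
}

/* General form: add x shifted left by the position of every set bit
   of the multiplier k. */
unsigned long mul_shift_add(unsigned long x, unsigned long k)
{
    unsigned long result = 0;
    unsigned shift = 0;

    while (k != 0) {
        if (k & 1UL)
            result += x << shift;
        k >>= 1;
        shift++;
    }
    return result;
}
```

The number of adds equals the number of set bits in the constant, which is
why 17 (two bits) or 40 (two bits) are cheap and a constant like 255 is
better handled as (x << 8) - x.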

 The problem is, the programmer has to constantly decide which 
method is more optimal. For instance, a mulu on a 68040 might take
1 clock cycle to execute, therefore it would be optimal to use
mulu almost all the time. Depending on memory speed, cache size, and
processor type, you have to decide the optimal method whereas
the compiler will do this for you instantly. Meanwhile, you're still
looking at that Motorola cycle chart factoring in all the hardware specs.

> If you dont have the luxary and have better use of it (for you loop)
> you are over 42 cycles not using mulu.
> And other solution in some cases are mulu tables...
>
> And when will you have C compiler with AI? Optimization is not only
> tricks, but logical solution for you code.
> Again, how many compiler break up large loops for registers usage?

> And when time comes where you need to save 2 or 4 cycle in a loop you
> dont use C for the job...
> And actually 2 or 4 cycle can make A huge diference.
 
  Yea, when your loop has 20000 iterations, but come on. If you get to the
point where you need to kill the OS to gain 2 cycles, you need to
go back to the drawing board and redesign your game, because you've
made a design flaw.



--
/~\_______________________________________________________________________/~\
|n|   rjc@albert.ai.mit.edu   Amiga, the computer for the creative mind.  |n|
|~|                                .-. .-.                                |~|
|_|________________________________| |_| |________________________________|_|

jesup@cbmvax.commodore.com (Randell Jesup) (04/05/91)

In article <1991Apr4.070030.21792@zorch.SF-Bay.ORG> xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) writes:
>sschaem@starnet.uucp (Stephan Schaem) writes:
>
>> And why would use a C compiler when you know you will never port your
>> code?
>
>Sorry, Stephen, but that's an argument for the other team.  The quite
>successful Starflight 2 game has never been ported to the Amiga because
>it was written in a low level, non-standard language (a homebrew Forth),
>and no one has been found crazy enough to sign a contract to try to
>port it, even though the much less interesting Starflight 1 game sold
>respectably when ported to the Amiga.

	Bingo.  I was offered that port (actually Starflight 1, then still
in metacompiled Forth).  I was smart.  I turned it down and came to Commodore
instead.  BTW, I've written multiple Forth implementations before (IBM 370
and 68000 (Amiga) versions), subroutine threaded with inlining.  Mine (never
polished up for release) was around as fast (faster in some cases) as JForth,
which is a really speedy Forth.

>Fact is, a big part of the reward for doing a good game comes from porting
>it to other platforms, and you're a lot farther from that reward when your
>game is in 68000 assembler and the target machine runs an 80286.

	If we're talking market realities here, Kent's right.  You must
assume as a game designer that in most cases, it's better to have versions
of your game for other computers, even if they're inferior to the one it was
designed on.  This is again some of the game design/implementation separation.
If it is either written in C or in mixed C/asm (preferably with working but 
slower C versions of most of the downcoded asm), then the porting costs will
likely be far lower, and thus you get more money in your pocket.  How much
better must a pure asm game be in order to justify the extra porting cost?
I dunno, but I bet the publishers out there can give you fairly firm
opinions.

-- 
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.commodore.com  BIX: rjesup  
Disclaimer: Nothing I say is anything other than my personal opinion.
Thus spake the Master Ninjei: "To program a million-line operating system
is easy, to change a man's temperament is more difficult."
(From "The Zen of Programming")  ;-)

jesup@cbmvax.commodore.com (Randell Jesup) (04/05/91)

In article <1991Apr4.224045.10542@starnet.uucp> sschaem@starnet.uucp (Stephan Schaem) writes:
> And when will you have C compiler with AI? Optimization is not only
> tricks, but logical solution for you code.
> Again, how many compiler break up large loops for registers usage?

	Any good modern compiler uses procedure-level optimization,
particularly register assignment and common-subexpression elimination.
I alpha and beta test compilers, and often go on "missed optimization hunts",
trying to determine if I could have coded it any tighter myself (within
the semantics of the language).  More often than I care to admit, I think I've
caught the compiler wasting instructions, and find _far_ later in the routine
that the value in question is being reused, saving time/space.  I can still
catch even the latest ones on some things, but it has been getting a lot
harder in the last couple of years.

	Go pick up or borrow from a library a book on compiler optimization
(make sure it's fairly recent, say the last 3 years or so).

> assign went way once more!)
> -ANYBODY have that problem? like do something and the system dont know
> system: anymore or whatever was assigned?- that on A3000.
> Or know where to get info on the task that are runing in resident mode?
> I know Its the third time I ask help, but maybe someone will se it:-)

	No one can help you unless you give some indication of what version
of the OS you're running (results of the version command).  There were
problems like that a LONG LONG time ago.  You could also have a buggy piece
of software that hits memory outside its allocation.  I advise running
enforcer all the time, and mungwall either all the time, or whenever testing
software or hunting down a problem like this.

-- 
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.commodore.com  BIX: rjesup  
Disclaimer: Nothing I say is anything other than my personal opinion.
Thus spake the Master Ninjei: "To program a million-line operating system
is easy, to change a man's temperament is more difficult."
(From "The Zen of Programming")  ;-)

breemen@rulcvx.LeidenUniv.nl (E. van Breemen) (04/05/91)

In article <dillon.5971@overload.Berkeley.CA.US> dillon@overload.Berkeley.CA.US (Matthew Dillon) writes:
>In article <1991Apr2.091947.27988@starnet.uucp> sschaem@starnet.uucp (Stephan Schaem) writes:
>>
>>And when I think of C I think only of 15% slower for general coding
>>and asm for crytical parts...
>
>    ok, so?  That seems about right to me for general coding.  Now put the
>    core code in assembly and leave the rest in C.. what do you get?
>
>    On the otherhand, for medium to large programs it can take 10 times
>    as long to do it in assembly..  or longer!	That's the key tradeoff.
>
>>And if Matthew Dillon was think of doing a macro assembler... Could he
>>use a library for the instruction decoding? I think string gadget sould
>>be able to do calculation.
>
>    *what* are you talking about?
>
>					-Matt


I think it is a kind of expression evaluator (like the EVAL function on my
old BBC). I've written such a thing in C in about 2 evenings. How long will
that take in assembler??? C is great for developing algorithms. Once your
program is ready, rewrite the critical parts in assembly. I know a lot of
programmers who only write in assembler: their output is very small.

-Erwin van Breemen-

The Orega Programming Group Holland.

holgerl@amiux.UUCP (Holger Lubitz) (04/06/91)

In article <mykes.1332@amiga0.SF-Bay.ORG> mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:

>Why do 'C' programmers ask me whether I use:
>       MULU    #17,d0
>instead of:
>       move.l  d0,d1
>       lsl.l   #4,d0
>       add.l   d1,d0
>when it is basic programming normally done by assembler language
>programmers (one of the oldest tricks in the book)?

Since I happen to be the one who did, I will try to explain:
In my experience, the average assembler programmer does NOT know those
tricks. I have seen some demo sources from some guys who claimed
to be some of the coolest demo programmers, and guess what I saw?

Tons of absolute addressing. Tons of unneeded instructions (like TST after
an instruction that already sets the zero bit correctly) or wrongly ordered
instructions (like TST-ing a return value before copying it elsewhere
instead of first copying it and leaving out the TST).

And there is always one who never heard of TST and uses CMP.L #0,...
instead (don't laugh, there has been an assembler introduction in a German
magazine that suggested just that! And the guy who wrote it didn't even
believe that TST was shorter and faster when I first called him... Took me
several calls to convince him.)

So I would rather have some C programs by those programmers (the optimizing
compilers prevent most of this), because they would actually run FASTER.
Just because the average C compiler nowadays usually knows how to produce
good assembly code (I am not talking about 1985's compilers, of course),
while the *average* assembler programmer usually doesn't.

If you think that YOU know perfectly well how to program in assembler, just
go ahead and do it. But don't tell anybody that assembler is always faster.
They might believe it.

Best regards,
Holger

--
Holger Lubitz            | holgerl@amiux.uucp
Kl. Drakenburger Str. 24 | holgerl@amiux.han.de
D-W-3070 Nienburg        | cbmvax.commodore.com!cbmehq!cbmger!amiux!holgerl

jbickers@templar.actrix.gen.nz (John Bickers) (04/06/91)

Quoted from <mykes.1332@amiga0.SF-Bay.ORG> by mykes@amiga0.SF-Bay.ORG (Mike Schwartz):
> Why isn't something like LightSpeed 'C' available for the Amiga?  It
> flies and generates awesome code.

    Back at university we had to use Lightspeed C for Macs, and some of
    us still wonder why we can't use SAS C the same way (the Macs had
    1 floppy, 1M, and no HD, but we wrote largish programs on them
    without too much grief, complete with clickety-click compiler
    interface).

> Why hasn't someone made the entire c.lib into a loadable library

    Some of the routines do exist in the OS somewhere. Most people
    can't be bothered using them.

> Why don't 'C' compilers support linkerless usage?

    Why should they? If you only want to recompile parts of a project,
    linking is going to be necessary. And there's no significant
    difference between having the linker as part of the compiler
    or as a standalone program.

> Why don't 'C' compilers know about the routines in the OS without
> #pragmas?  While we're at it, why not all the structures and other
> things from the header files, too?

    What's wrong with #pragmas? C compilers typically do these things
    through #include files. Suppose I wanted to #define my own
    structures with the same names as the system ones, and therefore
    left out the appropriate #include files?

    It's for reasons like this that functions like "printf" and co.
    are functions, not C keywords, etc.

> Why do 'C' programmers ask me whether I use:
> 	MULU	#17,d0
> instead of:
> 	move.l	d0,d1
> 	lsl.l	#4,d0
> 	add.l	d1,d0
> when it is basic programming normally done by assembler language
> programmers (one of the oldest tricks in the book)?

    This is also old to C programmers. In some cases one expects
    the optimizer to do it (like v *= 2 is more readable to some
    than v <<= 1), but in cases where a programmer has to sit down
    and replace multiplications, even average programmers like
    myself know about this one.

    I read somewhere that SAS C (what I use) does these optimizations
    for powers of 2, so I tend to leave such things alone unless I
    want to be sure that they are happening, or the value is not
    a power of 2.

> ****************************************************
--
*** John Bickers, TAP, NZAmigaUG.        jbickers@templar.actrix.gen.nz ***
***         "Patterns multiplying, re-direct our view" - Devo.          ***

dillon@overload.Berkeley.CA.US (Matthew Dillon) (04/06/91)

In article <1991Apr4.065112.21496@zorch.SF-Bay.ORG> xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) writes:
>
> dillon@overload.Berkeley.CA.US (Matthew Dillon) writes:
>
>> Can you image doing arbitrary layers clipping entirely in assembly?
>> What a waste of time!
>
>You don't know what pain is, Matt.
>
>(When _I_ was a kid, he continued, Pythonesquely:)
>
>I wrote aribitrary 3D six plane clipping,
>in an arbitrary oblique perspective projection,
>with real time dynamic motion,
>in assembly language,
>for installation in firmware
>
>Now _that's_ pain.

    I agree, _that_ *is* a pain!

>No choice, either; the available HOL was too slow by a factor of three
>in the given hardware.
>
>Ouch.	Talk about masochism for a buck.

    8-)

>Compared to general 3D clipping in any language, I'll write 2D clipping
>for you in hex, typing only with my toes, and feel glad for the chance
>to do something easy for a change.
>
>;-)

    Yah, and 6502 hex codes and half the PET ROM addresses are *still*
    burned into my head!  Never again will I go that far, it takes up too
    many brain cells!

>Kent, the man from xanth.
><xanthian@Zorch.SF-Bay.ORG> <xanthian@well.sf.ca.us>
>--
>This is all true, but the rest of the Monty Python routine might have to
>be fudged a bit for effect.

					-Matt

--

    Matthew Dillon	    dillon@Overload.Berkeley.CA.US
    891 Regal Rd.	    uunet.uu.net!overload!dillon
    Berkeley, Ca. 94708
    USA

dillon@overload.Berkeley.CA.US (Matthew Dillon) (04/06/91)

In article <884@cns.SanDiego.NCR.COM> dltaylor@cns.SanDiego.NCR.COM (Dan Taylor) writes:
>In <mykes.0774@amiga0.SF-Bay.ORG> mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:
>>..
>non-CHIP data into anything but 32-bit RAM?  You get to use the absolute
>address modes, which are SLOWER than the register relative modes???  It

    To be fair, Mike was talking about ABSOLUTE WORD addressing, which
    is just as fast and compact as register relative.  That isn't to
    say it's good programming practice, it *isn't* on the amiga!

    But I agree with almost everything else said.

				    -Matt

--

    Matthew Dillon	    dillon@Overload.Berkeley.CA.US
    891 Regal Rd.	    uunet.uu.net!overload!dillon
    Berkeley, Ca. 94708
    USA

dillon@overload.Berkeley.CA.US (Matthew Dillon) (04/06/91)

In article <18ee3f47.ARN125e@jsmami.UUCP> jsmoller@jsmami.UUCP (Jesper Steen Moller) writes:
>In article <mykes.1332@amiga0.SF-Bay.ORG>, Mike Schwartz writes:
>>...
>> How many LINK, UNLK instructions do I unnecessarily have on my hard
>> disk?  How much for all the hard disks in the Amiga community alone?
>> Terabytes?  I often write programs that need to fit in small machines,
>> so I'd rather save tens of thousands of bytes in a (> 100K) program.
>
>Some are LINK/UNLK are waste. Some are there because of a source that
>is modulized and easy to maintain so that you will receive a new version
>some day.

    Neither DICE nor Manx nor SAS/C generate LINK/UNLK instructions unless
    they have to, and the next major release of SAS/C will do away with
    them entirely.

    Similar optimizations -- passing args in registers for example, have
    existed for several years.

    As far as switch() statements go, an optimizing C compiler has the
    advantage of being able to generate extremely fast code for large
    switches, either by using a jump table (which an assembly programmer
    would use anyway) OR, for *really* sparse switches, a log N time
    comparison tree perhaps using jump tables when the sparseness
    becomes reasonable.
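
A minimal C sketch of the comparison-tree idea for a sparse switch: the
case values are kept sorted and searched by bisection, so N cases need
about log2(N) compares instead of up to N. The case values and the
function name are made up for illustration; a real compiler would emit the
tree as inline compares and branches rather than a table search.

```c
#include <stddef.h>

/* Hypothetical sparse case values, sorted ascending. */
static const long case_values[] = { 3L, 17L, 250L, 4096L, 70000L };

/* Returns the index of the matching case, or -1 for "default". */
int find_case(long v)
{
    size_t lo = 0;
    size_t hi = sizeof case_values / sizeof case_values[0];

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (case_values[mid] == v)
            return (int)mid;
        if (case_values[mid] < v)
            lo = mid + 1;           /* continue in the upper half */
        else
            hi = mid;               /* continue in the lower half */
    }
    return -1;
}
```

The returned index could then feed a jump table of handlers, which is the
combination of the two techniques described above.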

>> Why is the best compiler on the Amiga run the bejeezes slow?
>
>It's good at optimizing, and that takes time. And, it wasn't written
>by Matt (unless you do mean DICE of course...) so it might have
>some inefficient algorithms in it...

    :-)

>> when it is basic programming normally done by assembler language
>> programmers (one of the oldest tricks in the book)?
>
>Because, if you were to code e.g. a file requester, and you
>had
>    VISIBLE_FSEL_LINES EQU 17 ; or whatever
>
>and you'd do a
>    MULU.L   #VISIBLE_FSEL_LINES,d0
>one day, you'd forget this. Using a define here is good, and any day now
>it might be changed to 37, when recompiling for interlace (or whatever).
>That is (my humble guess) why we ask.

    That's an excellent answer that I missed in my earlier reply to
    the message... the moment you hand optimize a constant operation
    you lose the flexibility of the constant.

>> Come on people, no flames for this, these are serious questions.
>
>You got no flames, and serious answers.
>
>> ****************************************************
>> * I want games that look like Shadow of the Beast  *
>> * but play like Leisure Suit Larry.                *
>> ****************************************************
>
>Jesper Steen Moller   ///  VOICE: +45 31 62 46 45
>Maglemosevej 52  __  ///  USENET: cbmehq!cbmdeo!jsmoller
>DK-2920 Charl    \\\///  FIDONET: 2:231/84.45
>Denmark           \XX/

				-Matt

--

    Matthew Dillon	    dillon@Overload.Berkeley.CA.US
    891 Regal Rd.	    uunet.uu.net!overload!dillon
    Berkeley, Ca. 94708
    USA

sschaem@starnet.uucp (Stephan Schaem) (04/06/91)

 Well, CBM had 6 years to redo critical routines in ASM...
 Anyway, there will still be low C programmers to waste all that
 work. (Nothing personal here. Just look at the software around.)
 Most redraw gadgets all the time for
 no reason, or when a selection is made redraw the whole list! Pitiful.
 Do gadgets use layers, by the way? Just an impression I had.

 Do you think it's reasonable to take more than 1/60 second to move
 a slider one step and do a rethink of the screen?

 I'm just really happy exec was not done in C.

 And the 20%-30% is definitely true, if the OS has a single
 MOUSEMOVE assigned to a window!
 Or you read the position yourself from the various system copies...

sschaem@starnet.uucp (Stephan Schaem) (04/06/91)

 With that kind of memory usage, you do it for games, and games only.
 And when you have a steady rate you DO NOT WANT IT TO GO FASTER.
 So you synchronise with a WAIT. So 32-bit memory is nothing more than
 memory at that point.
 And if a game is designed for 512K it will use 512K; the extra memory
 will not be used for serious game design. More for buffers, or to save
 the system :-)

 But if you make a multitasking game it's totally different, because here
 you care about how many cycles you leave to the OS...
 Anyway, I'm saying that, but almost all applications don't care about
 the above.

sschaem@starnet.uucp (Stephan Schaem) (04/06/91)

 

 Let me explain:-) Only stupid people do the same thing over and over
 again:-)

 After a while, even in ASM (yes sir), you build a function library.
 And maybe build a structured language... Then your code is actually
 smaller than a C program, and faster, because you had more time to think
 about the project since a lot was already done.
 Come on, where did you get the idea that ASM programmers are stupid and
 can't come up with good code?
 ML coding is gone.
 And if you treat assembly as a language, you can compile 68000 code into
 whatever you want...
 Do you think it would be that hard to convert 68000 code into MIPS RISC
 code?

 Anyway, nobody will convert me to C or make me think otherwise
 (I tried C in college and for 7 months in applications; I simply don't
 like it, and I find 68000 too easy).
 And C programmers have their reasons too.
 So it's a pretty much useless discussion, since both sides know it all :-)

						Stephan

sschaem@starnet.uucp (Stephan Schaem) (04/06/91)

 Come on, how many people here don't use a jump table instead of multi cmp?
 An example would be instruction encoding for a compiler.
 Instead of up to 16 tests (which would have been ordered by probability
 of use) you replace them with 10 instructions (with an offset jump table).

 The C compiler needs to UNDERSTAND what you are doing for it to do
 'perfect' compilation.
 Somebody wrote the C compiler anyway.

sschaem@starnet.uucp (Stephan Schaem) (04/06/91)

 That's why I only know Motorola; I don't care about the 99% of machines
 using Intel.
 I don't want to port my software to anything other than a 68000.
 Genesis comes to mind first, then X68000, then NEO-GEO ...
 And in C it was impossible. (Believe me here.)
 And if I wanted to port I would have done another game, and in C.

sschaem@starnet.uucp (Stephan Schaem) (04/06/91)

 Let me be CLEAR! I don't program the same way when multitasking.
 I even find it hard to swap programming styles, and the styles are very
 different from each other.

 When multitasking I don't care much about speed or optimization. I think
 I'm programming to accomplish my goal, and then perfect it.
 You will never see tables in my applications, or optimizations like
 doing a double ADD instead of LSL.
 And I check my books before doing anything I don't know or have never
 done before.
 And for games I go for the 68000; for applications, 680x0.

sschaem@starnet.uucp (Stephan Schaem) (04/06/91)

 And about that 2 cycles story... Imagine you have around 110 objects to
 generate... After you have found all the tricks you can find and it's
 still not enough, because you have a very short time to do the job, you
 optimize!

 The first time was when generating 40 SPRITES moving along paths created
 with math structures.
 We now have 3 passes, and in the last pass you need to create a copper
 list with 6 instructions per sprite (after the sort).
 For sorting we came across 3 possibilities. The first one, coded in a few
 minutes, was VERY cycle consuming; the second one used a lot of memory.
 And we settled for a mix :-) That turned out well, since we reused those
 sorting tables for fast collision with those 40 objects.
 Also, don't forget we are in 6-plane mode; you have only 50% of the bus
 for yourself!
 And with a LARGE copper outside the playground screen, you have to be
 careful.
 4 cycles here mean 40 simple instructions...
 You do that a couple of times in every loop, you save A LOT.
 Don't forget how sprite positions need to be coded :-) and we have those
 sprite structures cut in 2 or 3 parts for the hardware structure and the
 'user' structure (to port over).

Until you know what 4 cycles mean, you don't care!

jbickers@templar.actrix.gen.nz (John Bickers) (04/06/91)

Quoted from <1991Apr2.090315.27856@starnet.uucp> by sschaem@starnet.uucp (Stephan Schaem):
>  Also the mulu #17 example should be optimized by a macro assembler

>  But let's get to the question: what does C offer over a NON-classic
>  Motorola macro assembler?

    Development speed. A lot of people (some of whom have formal
    backgrounds in Comp Sci, where they get to do silly things with
    PDP-11 assembler :) know this, so they never bother to get good
    enough with assembler to do more than optimise parts of a C program.

    There are few, if any, who can implement a non-trivial algorithm
    faster in assembler than an equally competent C programmer can in
    C. For a lot of people, development speed == income.

>  I know just a little about C, but still know how to write a simple
>  paint program in it. I usually read C to convert it to ASM, like I done

    I think (I'm not a psychiatrist or anything, so take this with some
    salt crystals) that you should be able to read just about any
    language, if a manual (so you can find out what the keywords mean)
    is handy. Or at least, you should be able to do this with procedural
    languages - so you read the C, convert that into some sort of
    algorithm spec, then convert that into assembler.
--
*** John Bickers, TAP, NZAmigaUG.        jbickers@templar.actrix.gen.nz ***
***         "Patterns multiplying, re-direct our view" - Devo.          ***

farren@well.sf.ca.us (Mike Farren) (04/06/91)

mykes@amiga0.SF-Bay.ORG (Mike Schwartz) writes:

>your standard printf("Hello, world\n") program is huge.

printf, in its full form, is huge.  It's quite complex once you add all the
different data types, formats, justifications, etc.  I've seen asm printf
implementations, and they're huge, too.  (That's not to mention the overhead
involved in adding in the floating point math libraries, which standard
printf requires.)

>Why does the best compiler on the Amiga run the bejeezes slow?  In my
>experience, your development cycle (edit/compile/link/run/debug and
>repeat all the above) is critical to your productivity.

Don't know.  Mostly because they all come from old technology, I'd suspect,
plus some of them are incredibly comprehensive.

>Why isn't something like LightSpeed 'C' available for the Amiga?  It
>flies and generates awesome code.

I tried LightSpeed.  Nice compiler, enormous disk hog, very, very quirky,
absolutely non-standard.  I wouldn't switch.

>Why hasn't someone made the entire c.lib into a loadable library
>so all programs can share it instead of duplicating these routines
>hundreds of times all over everyone's hard disks?  Why not even
>just printf.library (this alone would save megabytes on my hard disk)?

You could do this - but it's hard enough to guarantee that something like
arp.library is on everyone's disk, let alone a c.lib.library.  Besides,
there is no "standard" c.lib - although the ANSI standard one probably will
be in a decade or so.

>Why don't 'C' compilers support linkerless usage?

A) You have to link in the library routines SOMEWHERE.  B) With the linker,
you have much more flexibility in how you put your final program together.
C) Most "linkerless" implementations of anything just incorporate the linker
into the compiler/assembler/whatever - it's still there, but hidden.

>Why don't 'C' compilers know about the routines in the OS without
>#pragmas?  While we're at it, why not all the structures and other
>things from the header files, too?

Standards.  They could know about OS routines - but then standards would be
out the window.  Besides, #pragmas allow you to easily make the OS routines
inline calls, instead of having to pass all the args on a stack, "destack"
them and stuff them into registers, and only then call the OS.

If you had to incorporate all of the information in all of the header files
into a C compiler, it would be an incredible pig, both in loading and in
running.  Symbol searches, in particular, would kill you.

>Why do 'C' programmers ask me whether I use:
>	MULU	#17,d0
>instead of:
>	move.l	d0,d1
>	lsl.l	#4,d0
>	add.l	d1,d0
>when it is basic programming normally done by assembler language
>programmers (one of the oldest tricks in the book)?

Yeah?  Then how come every would-be assembly language programmer I've ever
worked with has had trouble understanding shift-add multiplication, let
alone something like Booth's algorithm?  It might be one of the oldest
tricks in the book, but most assembly programmers only read Abacus books :-)
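
For readers who have not met it, shift-add multiplication can be sketched in C as below. This is the generic bit-at-a-time form, not Booth's algorithm, and the function name is illustrative.

```c
#include <stdint.h>

/* Multiply by examining the multiplier one bit at a time: for every
   set bit, add the correspondingly shifted multiplicand. */
uint32_t mul_shift_add(uint32_t a, uint32_t b)
{
    uint32_t product = 0;
    while (b != 0) {
        if (b & 1u)
            product += a;   /* this bit of b contributes a << position */
        a <<= 1;
        b >>= 1;
    }
    return product;
}
```

The three-instruction sequence quoted above is this loop unrolled for the constant 17, whose only set bits are bit 4 and bit 0.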

-- 
Mike Farren 				     farren@well.sf.ca.us

ccplumb@rose.uwaterloo.ca (Colin Plumb) (04/07/91)

sschaem@starnet.uucp (Stephan Schaem) wrote:
> And also CED could use interlaced bitplane to eliminate any flicker!
> But no OS function suport that mode...

Sorry, but it's easy.  Do it once and forget about it.

For those who don't know, a simple trick to dramatically reduce blit
flicker is to allocate bitplanes A and B, not as

AAAAAAAAAA
AAAAAAAAAA
AAAAAAAAAA
AAAAAAAAAA
AAAAAAAAAA

BBBBBBBBBB
BBBBBBBBBB
BBBBBBBBBB
BBBBBBBBBB
BBBBBBBBBB

but as

AAAAAAAAAA
BBBBBBBBBB
AAAAAAAAAA
BBBBBBBBBB
AAAAAAAAAA
BBBBBBBBBB
AAAAAAAAAA
BBBBBBBBBB
AAAAAAAAAA
BBBBBBBBBB

Which is equivalent to 

AAAAAAAAAABBBBBBBBBB
AAAAAAAAAABBBBBBBBBB
AAAAAAAAAABBBBBBBBBB
AAAAAAAAAABBBBBBBBBB
AAAAAAAAAABBBBBBBBBB
AAAAAAAAAABBBBBBBBBB

So you can use double-height blits on it to move rectangular chunks
around, or treat it as a double-width bitplane and draw into it.
If there are ROWBYTES bytes in a row, lie to the system and say
there are twice as many, and set the bitplane pointer for B
to A+ROWBYTES.

Anyway, to do this with the Amiga's system, you just allocate the
memory, then allocate two BitMap structures.  One claims to be
one bitplane, with twice as many lines as normal, while the other
is two bitplanes, each twice as wide as normal, and you set the
bitplane pointers carefully.

Works like a champ.
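
The address arithmetic behind this layout can be sketched in plain C. The struct below is an illustrative stand-in, not the Amiga BitMap structure, and all names are assumptions.

```c
#include <stddef.h>

/* Illustrative layout descriptor (not the Amiga BitMap structure). */
struct interleaved {
    unsigned char *base;   /* start of the single allocation    */
    size_t rowbytes;       /* bytes in one row of one plane      */
    size_t depth;          /* number of bitplanes                */
};

/* Address of row y in a given plane: rows of all planes alternate, so
   each screen line occupies depth * rowbytes consecutive bytes. */
unsigned char *row_addr(const struct interleaved *bm, size_t plane, size_t y)
{
    return bm->base + (y * bm->depth + plane) * bm->rowbytes;
}

/* Stride between successive rows of the same plane: the inflated
   bytes-per-row value reported to the system. */
size_t plane_stride(const struct interleaved *bm)
{
    return bm->depth * bm->rowbytes;
}
```

For two planes of 40 bytes per row, plane B starts 40 bytes after plane A and both advance 80 bytes per screen line, matching the A+ROWBYTES pointer and doubled row width described above.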
-- 
	-Colin

ccplumb@rose.uwaterloo.ca (Colin Plumb) (04/07/91)

Gcc (written as a function, arguments passed in):

_foo:
	movel a6@(8),a0
	movel a6@(12),a1
	tstb a0@
	jeq L5
L4:
	moveb a0@+,a1@+
	tstb a0@
	jne L4
L5:
	rts


jesup@cbmvax.commodore.com (Randell Jesup) wrote:

>SAS C: (5.10a)
>       | 0000  48E7 0030                      MOVEM.L   A2-A3,-(A7)
>       | 0004  47EC  0000-02.2                LEA       02.00000000(A4),A3
>       | 0008  45EC  0000-01.2                LEA       01.00000000(A4),A2
>       | 000C  6002                           BRA.B     0010
>       | 000E  16DA                           MOVE.B    (A2)+,(A3)+
>       | 0010  4A12                           TST.B     (A2)
>       | 0012  66FA                           BNE.B     000E
>       | 0014  4CDF 0C00                      MOVEM.L   (A7)+,A2-A3
>       | 0018  4E75                           RTS
>
>	It does use a2/a3 instead of a0/a1.  However it beats the GNU
>version slightly by jumping to the test instead of having two copies of it.

We must disagree on what is good optimisation... I consider gcc's duplication
of the test to be a feature, and SAS's jump-to-the-end a missed optimisation.
It's clearly faster the way gcc does it.  (Gcc saves one untaken branch
in the no-execute case, and one taken branch in the execute case.)

However,
	moveb a0@+,d0
	jeq L5
L4:
	moveb d0,a1@+
	moveb a0@+,d0
	jne L4
L5:
	rts

Is faster still, by 4 clocks per loop iteration on a 68000.  I'm submitting
this as a bug in gcc.
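
In C terms, the two loop shapes being compared look roughly like this. The original C source for the listings was not posted, so both functions are reconstructions from the assembly, with hypothetical names.

```c
/* Test at the top of every iteration: the source byte is read once to
   test it and again to copy it (the shape of the compiler listings). */
void copy_test_then_move(char *dst, const char *src)
{
    while (*src != '\0')
        *dst++ = *src++;
}

/* Load once into a temporary, then test and store that value (the
   shape of the hand-written listing).  Like the listings, neither
   version stores the terminating NUL. */
void copy_load_once(char *dst, const char *src)
{
    char c;
    while ((c = *src++) != '\0')
        *dst++ = c;
}
```

The second form saves one memory read per byte copied, which is where the per-iteration saving comes from.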
-- 
	-Colin

jesup@cbmvax.commodore.com (Randell Jesup) (04/07/91)

Organization: Commodore, West Chester, PA

In article <1974.tnews@templar.actrix.gen.nz> jbickers@templar.actrix.gen.nz (John Bickers) writes:
>> Why hasn't someone made the entire c.lib into a loadable library
>
>    Some of the routines do exist in the OS somewhere. Most people
>    can't be bothered using them.

	Not really (i.e. they're not exact replacements of ANSI/Unix routines,
and they're only available under 2.0).  They are handy for someone trying to
write very small programs (like all the C: commands) who doesn't want to pull
in library routines, and for Asm programmers.

>    I read somewhere that SAS C (what I use) does these optimizations
>    for powers of 2, so I tend to leave such things alone unless I
>    want to be sure that they are happening, or the value is not
>    a power of 2.

	It does rather more than that.  It optimizes for a number of 
specific multiplies (not just powers of two), and in some cases divides (you
have to be careful with signed arithmetic to get the same value - unsigned
is easy for powers of 2.)
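
A small C sketch of why signed divides need care. It assumes C99-style division (truncating toward zero) and that right-shifting a negative int is an arithmetic shift; the latter is implementation-defined, so treat this as an illustration and not as what SAS C emits.

```c
/* Unsigned: dividing by 8 and shifting right by 3 always agree. */
unsigned udiv8(unsigned x)
{
    return x >> 3;
}

/* Signed: C division truncates toward zero, while an arithmetic right
   shift rounds toward minus infinity.  Adding (divisor - 1) to
   negative inputs first makes the shift match the division.
   Assumption: >> on a negative int is an arithmetic shift, which is
   implementation-defined in C but is what common compilers do. */
int sdiv8(int x)
{
    if (x < 0)
        x += 7;
    return x >> 3;
}
```

Without the adjustment, -1 >> 3 gives -1 on such compilers while -1 / 8 gives 0.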


-- 
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.commodore.com  BIX: rjesup  
Disclaimer: Nothing I say is anything other than my personal opinion.
Thus spake the Master Ninjei: "To program a million-line operating system
is easy, to change a man's temperament is more difficult."
(From "The Zen of Programming")  ;-)

jesup@cbmvax.commodore.com (Randell Jesup) (04/07/91)

In article <1991Apr5.232419.23297@starnet.uucp> sschaem@starnet.uucp (Stephan Schaem) writes:
>
> With that kind of memory usage, you do it for games, and games only.
> And when you have a steady rate you DO NOT WANT IT TO GO FASTER.
> So you synchronise with a WAIT. So 32-bit memory is nothing more than
> memory at that point.

	Wrong.  Look at lemmings, or a flight simulator.  They can't always
keep up with the 25/30 or 50/60 Hz.  If it used available fast/32-bit memory,
on machines with upgrades it would be able to handle more objects on-screen
before degrading.  For a good example of this, look at Indy 500 on a 3000,
or falcon on a 3000, etc.
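
The disagreement can be put as simple arithmetic. The sketch below assumes a game that waits for vertical blank after each update; the function and parameter names are illustrative.

```c
/* Number of whole display frames one game update occupies when the
   game waits for vertical blank before starting the next update.
   work_ticks:  CPU time the update needs (arbitrary units)
   frame_ticks: CPU time available per display frame (same units,
                assumed to be non-zero)
   Both parameter names are illustrative. */
unsigned frames_per_update(unsigned work_ticks, unsigned frame_ticks)
{
    unsigned frames = (work_ticks + frame_ticks - 1) / frame_ticks;
    return frames == 0 ? 1 : frames;    /* always wait at least one frame */
}
```

With a 100-tick frame, an update needing 50 ticks and one needing 25 both take one frame, which is the steady-rate case. An update needing 150 ticks takes two frames, but the same update on a CPU twice as fast (75 ticks) fits in one, which is the degradation case.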

-- 
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.commodore.com  BIX: rjesup  
Disclaimer: Nothing I say is anything other than my personal opinion.
Thus spake the Master Ninjei: "To program a million-line operating system
is easy, to change a man's temperament is more difficult."
(From "The Zen of Programming")  ;-)

jesup@cbmvax.commodore.com (Randell Jesup) (04/07/91)

In article <1991Apr5.234958.23678@starnet.uucp> sschaem@starnet.uucp (Stephan Schaem) writes:
> Come on, how many people here don't use a jump table instead of multi cmp?
> An example would be instruction encoding for a compiler.
> Instead of up to 16 tests (which would have been ordered by probability
> of use) you replace them with 10 instructions (with an offset jump table).
>
> The C compiler needs to UNDERSTAND what you are doing for it to do
> 'perfect' compilation.

	Guess what?  C compilers do use jump tables (when it's profitable).
In fact, I think SAS C has at least 3 or 4 (maybe more) different "types" of
code generated for switch statements.  It calculates all the performance
issues for you.  In asm, if you are using the optimal version, and then add
a few more cases, your implementation may no longer be optimal, but may
well take hours or days (including debugging) to recode, while the C compiler
can do it in seconds.  Trust me, I know, having had to modify "optimized"
asm switch statements in a part of dos we contracted out to an outsider -
he used a jump-table of bra.s's.  When we had to add code, things started
breaking, because there was too much code to reach with bra.s.  Then it got
to the point where there wasn't enough space to reach with bra.s with the
table right in the middle of the routines, even with lots of careful reordering
to get the entrypoints in range.  I finally had to make some entries in the
table point to additional branches to the actual destination.  Sure, I could
have bought the rom space and made them bra's (but we were tight on space), or
I could have worked out whether another switch implementation was now smaller/
faster (but I didn't have time).  If it was in C it would have been handled
for me, and I could have concentrated on getting the algorithm right, instead
of concentrating on whether a branch is within +- 128 bytes.

	This is one reason why assembler is expensive: maintenance and
bug-fixes.

-- 
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.commodore.com  BIX: rjesup  
Disclaimer: Nothing I say is anything other than my personal opinion.
Thus spake the Master Ninjei: "To program a million-line operating system
is easy, to change a man's temperament is more difficult."
(From "The Zen of Programming")  ;-)

jesup@cbmvax.commodore.com (Randell Jesup) (04/07/91)

In article <1991Apr6.225956.21886@watdragon.waterloo.edu> ccplumb@rose.uwaterloo.ca (Colin Plumb) writes:
>>	It does use a2/a3 instead of a0/a1.  However it beats the GNU
>>version slightly by jumping to the test instead of having two copies of it.
>
>We must disagree on what is good optimisation... I consider gcc's duplication
>of the test to be a feature, and SAS's jump-to-the-end a missed optimisation.
>It's clearly faster the way gcc does it.  (Gcc saves one untaken branch
>in the no-execute case, and one taken branch in the execute case.)

	Note that I typically compile with -O -ms (optimize, space before
speed).  However, you're probably right that it does the same thing with
-mt (time before space), so I'll make sure it's in the next SAS release
(I alpha/beta test for them, and often suggest optimizations after reading
OMD dumps - I've written full RISC code-reorganizers in the past, and am
quite up on my assembler).

	My latest optimization idea, specifically oriented at tags:

	You often push a lot of tags and data on the stack and call xxxTags().
Because of an annoying (in hindsight) decision by the person writing 
utility library, most of these values are $80xxxxxx, and are being pushed
by move.l #xxxxxxxx,-(a7).  However, most of these values are within a small
number of the "base" of the tags for that routine.  The compiler could notice
it's pushing a bunch of values that are within a range of 64K, and load an
address register with a value in that range, and then use pea  xxxx(an) (one
would normally be pea (an)).  Unfortunately, (an)+ and -(an) aren't allowed
for pea, or you could do even better in certain cases.

-- 
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.commodore.com  BIX: rjesup  
Disclaimer: Nothing I say is anything other than my personal opinion.
Thus spake the Master Ninjei: "To program a million-line operating system
is easy, to change a man's temperament is more difficult."
(From "The Zen of Programming")  ;-)

jsmoller@jsmami.UUCP (Jesper Steen Moller) (04/07/91)

In article <1991Apr5.231849.23215@starnet.uucp>, Stephan Schaem writes:

>  Well, CBM had 6 years to redo critical routines in ASM...
>  Anyway, there will still be low C programmers to waste all that
>  work. (Nothing personal here. Just look at the software around.)

And there will always be low assembler programmers to code all
the worst games. Hey come on - grow up. Stop throwing mud here.

>  Most redraw gadgets all the time for
>  no reason, or when a selection is made redraw the whole list! Pitiful.

Strange - I never saw _one_ program that redraws any gadgets upon selection
(except home-made MutualExcludes and some toggle-switches). Can you come
up with any proof?

>  Do gadgets use layers, by the way? Just an impression I had.

I need not comment on this in any way - it is clear that your knowledge
of the OS is ... appalling.

>  Do you think it's reasonable to take more than 1/60 second to move
>  a slider one step and do a rethink of the screen?

Well you wouldn't really want to RethinkDisplay() for moving a slider.
Anyway, I have no complaints about slider-movement in any Amiga program.
Or is it real-time movement in e.g. ProPage 2.0 you want?
 
>  I'm just really happy exec was not done in C.

Why am I following this up? I must be mad!
 
>  And the 20%-30% is definitely true, if the OS has a single
>  MOUSEMOVE assigned to a window!

Definitely not true. See the other thread about CPU-usage for a
resting Amiga. 1% measured on a normal MC68000. If the mouse
is moved around, ok - another % or two. Do not count on the
information XOper gives you - the count is based on the number of
task switches to each task, not the actual time taken.

>  Or you read the position yourself from the various system copies...

Ha - poof! If you read, for instance, the values in IntuitionBase,
your program will get wrong values in 2.0. And the technique is
not Amiga-like anyway - it's called busy-polling.

                                - Jesper
--                     __
Jesper Steen Moller   ///  VOICE: +45 31 62 46 45
Maglemosevej 52  __  ///  USENET: cbmehq!cbmdeo!jsmoller
DK-2920 Charl    \\\///  FIDONET: 2:231/84.45
Denmark           \XX/

jsmoller@jsmami.UUCP (Jesper Steen Moller) (04/07/91)

In article <1991Apr5.234958.23678@starnet.uucp>, Stephan Schaem writes:

>  Come on, how many people here don't use a jump table instead of multi cmp?

I will not let this point go down the drain:

A typical example where multi cmp's are the best:


switch (gadCode)
{
case 1: break;
case 37: break;
case 100: break;
case 1000: break;
}

This is an extreme example, but not at all unlikely. A multi-cmp is
the likely choice here, right?


Ok, now consider:

switch (gadCode)
{
case 1: break;
case 2: break;
case 3: break;
case 4: break;
case 5: break;
case 6: break;
case 7: break;
case 8: break;
}

A jump table would be preferred, right?
The point is - the C compiler knows what's best - so does the
assembler programmer. If one value is changed, the C compiler will
redo the work. So will the assembler programmer. During development,
values change often...
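
One way to picture the choice between the two examples above is to compare the number of cases with the span a table would have to cover. The threshold below is an arbitrary rule of thumb chosen for illustration, not the heuristic of any real compiler.

```c
/* Entries a jump table would need to span the given case values.
   Assumes n >= 1. */
unsigned table_span(const int *cases, unsigned n)
{
    int lo = cases[0], hi = cases[0];
    unsigned i;
    for (i = 1; i < n; i++) {
        if (cases[i] < lo) lo = cases[i];
        if (cases[i] > hi) hi = cases[i];
    }
    return (unsigned)(hi - lo) + 1u;
}

/* Illustrative rule of thumb, not any particular compiler's: use a
   table only when at least half of its entries would be real cases. */
int prefer_jump_table(const int *cases, unsigned n)
{
    return table_span(cases, n) <= 2u * n;
}
```

The cases 1, 37, 100, 1000 would need a 1000-entry table for four targets, while 1 through 8 need exactly eight entries.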

This reminds me of:
----
#define CONSTANT 37

number *= CONSTANT;
-----

versus:

------
   CONSTANT EQU 37

   mulu #CONSTANT,d0
------

Aha, now change CONSTANT to 16. The C-compiler catches this, while the
assembler programmer has to know all about where CONSTANT is placed to
optimize this case.
So for large projects, C must be the least bothering of the two...

>  The C compiler needs to UNDERSTAND what you are doing for it to do
>  'perfect' compilation.

As a switch statement is limited to constants, it needn't understand that
much. It can compute which solution is the most efficient. And the
human user doesn't have to recode the entire switch construction when
one single value changes, as the equivalent assembler coder would have to.

>  Somebody wrote the C compiler anyway.

Yes, typically a person with a thorough knowledge of the architecture from
assembler up to higher structured languages than C. Someone with the
right theoretical background. Someone like Matt.


           - Jesper
--                     __
Jesper Steen Moller   ///  VOICE: +45 31 62 46 45
Maglemosevej 52  __  ///  USENET: cbmehq!cbmdeo!jsmoller
DK-2920 Charl    \\\///  FIDONET: 2:231/84.45
Denmark           \XX/

elg@elgamy.RAIDERNET.COM (Eric Lee Green) (04/08/91)

From article <1991Apr5.234958.23678@starnet.uucp>, by sschaem@starnet.uucp (Stephan Schaem):
>  Come on, how many people here don't use a jump table instead of multi cmp?
>  An example would be instruction encoding for a compiler.
>  Instead of up to 16 tests (which would have been ordered by probability
>  of use) you replace them with 10 instructions (with an offset jump table).

Most "C" compilers will do this automatically for a "switch" statement.

Don't understand what you mean by "instruction encoding for a compiler."
Most parser generators generate table-driven parsers in the first place,
i.e., they automatically generate a "jump table" as you put it. Similarly,
the "lexer" in any decent compiler will be table-driven.
Do you mean the code generator? The optimizer? Hmm? (Do you know what I'm
talking about, anyways?).

--
Eric Lee Green   (318) 984-1820  P.O. Box 92191  Lafayette, LA 70509
elg@elgamy.RAIDERNET.COM               uunet!mjbtn!raider!elgamy!elg

m0154@tnc.UUCP (GUY GARNETT) (04/09/91)

[OK, you've got it: No flames, serious answers]

[Software written in C versus assembler.  You can't recompile when a
better compiler comes along, and all of this stuff is bigger and
slower than it has to be.  "Sometimes this tradeoff is something you
might live with, others its not".]

The tradeoff has to be evaluated by the systems analyst who designed
the software, or the programmer who coded it; there is simply no other
way.  The tradeoff is three ways: between algorithm and program
design, coding and debugging time, and maintenance time.  Over the
lifetime of a significant commercial application, the third is by far
the largest expense; it makes good economic sense to take any steps
necessary to minimise it (even if it costs in terms of the first two).
This is why most productivity applications are written in a higher
level language.  I would also like to point out that tight, fast
assembly coding will not cure a poor design (as a case in point, take
OS/360 and its descendants - they are pure assembler, at this point
literally millions of lines - and the current MVS/ESA is still a
hog!).  To take a specific point:

[AmigaOS v2.0 comments]

I'm sure that Commodore would love to have v2.0 coded entirely in
assembler, tight and fast, by a team of the world's best 680x0
assembler programmers.  Unfortunately, they are in business to make
money; they can't afford the development time or the staff requirements
to do that.  Perhaps you'd like to do it full-time (on an unpaid
basis, of course)?  I didn't think so; neither would I.  I'm sure that
Commodore's engineers are getting it to us as fast as they can, and
will be delivering the best performance that they can.  Sure, it could
be better (but there's always 2.1 :) but I don't want to have to wait
5 years for it to come out, and then have to pay $500 for the update.

[Why does the best compiler run so slowly]

I'll bet that gcc was designed to be relatively portable, and has a
number of stages, each of which adds to the compile time (as opposed to
a fast one- or two-pass compiler which emits object code directly
instead).  So the design of the compiler imposes speed constraints.
The implementation (probably coded in C, for portability) adds others.
On top of it all, good optimization involves heavy symbol manipulation
on the part of the compiler (IBM's VS FORTRAN has one of the best
optimizers I've seen, and when it's fully enabled, compile times
increase by a factor of 3 to 5.  Then again, it produces better
assembly code than any IBM ALC/370 coder I've met; although there
might be some that can better it).

Personally, I'd like an implementation of Object-Oriented Turbo Pascal
(v5.5 or later language spec) for the Amiga, with both a front-end
(symbol-based) and a back-end (code-based) optimizer.  The OOP language
extensions add a lot of power to the language, and an assembly
inclusion feature would allow the worst speed offenders (which can be
identified with Turbo Profiler) to be recoded in assembly.  I've done a
lot of development with this package on the PClones, and it beats (in
terms of good code and easy development) all of the other development
environments I have tried on the PC.

[c.library and printf.library]
Good idea.  Why don't you (or someone else who is a very good
assembler programmer) write them and make them freely distributable
(or even better yet, give them to Commodore for inclusion in the 2.0
(or maybe 2.1) distribution)?

[linkerless C; why does C need #pragma to know about library routines]
Like an assembler that could only emit complete object modules, a
"linkerless" C compiler would lose a lot of its power.  The linker is
your friend, it can save you from compiling the same code fragment
over and over again.  C needs a #pragma to pass parameters to library
routines because the Amiga subroutine calling conventions are
different from the default C calling convention.  It is easy, and
certainly much more efficient than passing arguments on the stack
(which would also make assembly coding much more cumbersome).  On the
other hand, it is certainly possible (although more difficult) to
design a compiler which passes arguments in registers by default; it
is even possible to design one which "knows" about the library bases
and can "automagically" manipulate them (opening and closing libraries
as needed) - as a matter of fact, I have a specification for such a
language, but I have not tried to implement it (I need to add object
oriented facilities to the spec, see the above about TurboPascal).

[Trapping the guru]
I don't know why people don't do this more often, either.  Ideally,
the compiler should generate such code as part of the run-time
package, with an option to disable it.  On the other hand, it should
also be possible for any programmer to add it to the standard startup code.

[optimizing MULU]
Any decent peepholer (compiler back-end optimizing technique) should
be able to pick this one up.  C programmers who ask you what it means
just aren't paying attention (or didn't bother to read the code).  On
the other hand, a comment line might not be a bad idea, either.

Wildstar

sschaem@starnet.uucp (Stephan Schaem) (04/10/91)

 Usually you will get more calculation time ONLY! If the display is done
 correctly it won't change much, from what I saw.
 In my case, on an A3000 the game goes at 60 frames a second, really too
 fast! All that happens is the objects go twice as fast.
 (F29 is totally unplayable!) F18 simply synchronises, so it stays perfect!
 Anyway, I'm in Half-Brite mode, so only 50% of the bus! And it's a 68000,
 and I can still do collision, trajectory calculation, shadows etc. for
 all of the 100 or more objects.
 So imagine with 16 colors, not 64...
 I'm saying that if your game works perfectly in its A500 version you
 definitely don't want it to go faster.
 Lemmings slows down on a 68000, so it DOES NOT have a steady frame rate
 (that's what I stated in my last message, and maybe the point you missed).
 When you have a steady frame rate, the game goes as fast as it was
 designed to.
 Also, I'm sure that Lemmings on an A3000 should slow down less easily.
 Like any game (but that's not true on all 68030 cards).

sschaem@starnet.uucp (Stephan Schaem) (04/10/91)

Wasn't that more of a design problem?
Days for changing jump tables? Usually you use a table because of the
compares, not the branching...
If you do 'serial' compares you know what your program does.
For example, if you try to recognize instructions, you use a table or
compares.
And in C or ASM it's an 'optimization' that you will do.

On a 68000, bra.b, bra.w and jmp d16(pc) all take 10 cycles (2 bus
reads each); jsr d16(pc) takes 18.
So what is the idea in using bra.b in 68000 software?
And I haven't checked how that carries over to a 68020.
You will use .s for conditional branching, though, and arrange your
conditional branch instructions so that they mostly fall through (a
short branch that is not taken costs 8 cycles instead of 10).
When analyzing a buffer I know the % of instructions I will be likely to
find... So I order the branches in the routine accordingly.

I'm not a cycle optimization fanatic, but a logical optimization
fanatic :-)
Anyway, from what I read, bra.s vs. jmp d16(pc) is a size difference
(68000). So write your bra.b as plain bra and let your assembler
optimize it when possible.

You also have to know that the choice of using a table has to be decided
by something other than cycles... If you use a table and 99% of the time
it doesn't get past the first element, you are losing time.
And you might need to use both. You need to run the software to find
that out...
So finally, the point I already tried to state: a compiler CAN'T do
that, a human can guess...

I have a program using a table for that instruction decoding, and I keep
track of the table usage :-) And I find it fun to think the program
could recode itself for each application it already knows :-)
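
A rough sketch of that idea in C, under stated assumptions: the four
opcodes, the handler names and the counter array are all made up for
illustration and are not the poster's actual decoder.  The table
version counts how often each entry is hit; the serial version shows
the alternative, where the compare assumed to be most frequent goes
first.

```c
#include <assert.h>

enum { OP_COUNT = 4 };                /* hypothetical opcode count */

typedef int (*handler_fn)(int operand);

static int op_nop(int x) { return x; }
static int op_inc(int x) { return x + 1; }
static int op_dbl(int x) { return x * 2; }
static int op_neg(int x) { return -x; }

static const handler_fn handlers[OP_COUNT] = {
    op_nop, op_inc, op_dbl, op_neg
};
static unsigned long usage[OP_COUNT]; /* hits per table entry */

/* Table decode: same cost for every opcode, plus a usage count. */
int decode_table(unsigned op, int operand)
{
    assert(op < OP_COUNT);
    usage[op]++;
    return handlers[op](operand);
}

/* Serial decode: cheapest when the first compare usually matches. */
int decode_serial(unsigned op, int operand)
{
    assert(op < OP_COUNT);
    if (op == 1) return op_inc(operand);  /* assumed most frequent */
    if (op == 2) return op_dbl(operand);
    if (op == 3) return op_neg(operand);
    return op_nop(operand);
}

/* Most used opcode so far: the candidate for the first compare. */
unsigned hottest_op(void)
{
    unsigned best = 0;
    for (unsigned i = 1; i < OP_COUNT; i++)
        if (usage[i] > usage[best])
            best = i;
    return best;
}
```

After running real input through decode_table(), hottest_op() gives the
measured answer to which compare should come first, which is the part
that has to come from running the software rather than from counting
cycles.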

Also your point on optimization (the case/switch example) is more than
true. But it doesn't happen often to me, and usually, but not always,
the compares/jumps are not time critical at all.

							Stephan.

sschaem@starnet.uucp (Stephan Schaem) (04/10/91)

You must realize that if I attacked C application programmers, it must
have been a response to something/someone! I simply throw things back...

Are you kidding about that gadget redraw? Do you use your Amiga
sometimes?
TXED:   clicking outside the text will redraw the slider even if it is
not needed.
Jrcomm: will ALWAYS redraw its 'gadgets' even if the current state is
correct!
I never said upon selection (did I?!), I said that usually they update
the whole list when doing one change. (Again, usually, not always...)
This is not an intuition.library problem but a programmers' problem!
(For Jrcomm it seems to be an Intuition problem...) There your task
doesn't have control over the gadgets!

Yes, my knowledge of Intuition is low!
If I ask you 'do Intuition gadgets use layers?' it's because I don't
know... And I would have appreciated a yes or no answer more than an
insult :-)

And did you read my example at all? It was a 64-color palette on a 3-bit
hires screen (I was using Intuition at the time).
I use 3 sliders for the RGB ... components,
sliders with RELVERIFY!FOLLOWMOUSE!GADGIMMEDIATE.
Then I needed to update a UCopList for the palette line colors, with a
4 by 8/16 matrix. So 8 to 16 CWait with 4 * 8 to 16 CMove.
After each copper update you need to recreate the View copper list...
So AGAIN that doesn't fit in 1/60 of a second.

And again! I said mouse movement with one mousemove/reportmouse active!
Tested on an A2000 with a 68000! 20 to 30%, thank you. I did the test, I
know what I saw! I didn't use xoper, but a display based on a 1/60 frame
basis!

Where do you come from, Jesper? Where did you read that you need to go
through IBase?!?!?! sc_ and wd_ offer the relative mouse position, GET
REAL!
Before insulting someone about his knowledge you had better get your
senses together.
ALL my tools work on 2.0! Only one version of UShow doesn't, because a
guy uses Intuition to get GfxBase...

If you want to upset me, like you tried :-), don't come with stuff like
that...
Show me I'm wrong, don't tell me I'm wrong.

						Stephan

sschaem@starnet.uucp (Stephan Schaem) (04/10/91)

 Busy polling? Anyway, maybe you have heard of the timer? Or VBLANK?
 You can wake up your task and create your own message that way...
 The best thing about that is you can set an update rate, and minimize
 message passing time and size.
 Anyway, you imagine too much about me :-) Where did you get such silly
 ideas :-)

							Stephan

plouff@kali.enet.dec.com (Wes Plouff) (04/10/91)

In article <1991Apr9.193233.5773@starnet.uucp>, sschaem@starnet.uucp (Stephan Schaem) writes...
> 
>Wasn't that more of a design problem?
>Days for changing jump tables? Usually you use a table because of the
>compares, not the branching...

(Something rarely seen on Usenet...)  Stephan, your answers would be
easier to follow if you would include a line or two of the original
messag

uzun@pnet01.cts.com (Roger Uzun) (04/10/91)

[]
The way I approached frame rates was to optimize for the 68000 and
set a frame rate goal using the timer.device.  I can achieve that frame
rate on a 68020 equipped machine (of similar power to an A2620), but
cannot get the 'ideal' frame rate on a stock, no-fast-mem A500.  It is
closer when you add true fast RAM to the system, but still not ideal.
Basically the games multitask and are perfectly stable applications,
but they will only achieve the max frame rate with an A2500/20 or better.
The frame rate is still acceptable on an A500.

So just because Lemmings is slower on a 68000, this does not mean
it does not have a fixed frame rate; it probably has a ceiling like
my games, one which cannot be reached on a 7 MHz 68000, but can
be reached on faster systems.
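
The ceiling can be put in a few lines of C.  This is only a sketch of
the arithmetic, not code from any of the games: the function names are
hypothetical, and a real Amiga program would get its delay from
timer.device rather than from a plain calculation.

```c
#include <assert.h>

/* Frame period actually achieved when a timer enforces a minimum
   period (the ceiling) but cannot make a slow frame any faster. */
unsigned long frame_period_us(unsigned long work_us,
                              unsigned long target_us)
{
    assert(target_us > 0);
    return work_us > target_us ? work_us : target_us;
}

/* Achieved frames per second, rounded down. */
unsigned long frames_per_second(unsigned long work_us,
                                unsigned long target_us)
{
    return 1000000UL / frame_period_us(work_us, target_us);
}
```

With an assumed target of 40000 us (25 frames a second), a machine
needing 50000 us of work per frame gets 20 frames a second, while
machines needing 20000 us or 10000 us both get exactly 25.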

-Roger

UUCP: {hplabs!hp-sdd ucsd nosc}!crash!pnet01!uzun
ARPA: crash!pnet01!uzun@nosc.mil
INET: uzun@pnet01.cts.com