[comp.sys.amiga] Low Memory and Hanging Forbids

hamilton@intersil.uucp (Fred Hamilton) (01/23/90)

--

I've just started running MemWatchII to aid (hopefully) in tracking down
the source of some intermittent crashes in my system.  MemWatch has
identified for me a number of programs that write over low memory.

Now I understand that the OS needs section(s) of memory all to itself
(that's what I'm assuming "low memory" is reserved for), but why do *any*
applications write to low memory?  What's the appeal?  Or is it done by
accident? by compilers?  How and why do all these programs that trash
low mem do it?

On a related note, I've wondered about these "XXXXXXX won't run with my 590
because it was expecting to see 00 in location $00, but an early version
of the FastFileSystem would write different values to location $00 causing
XXXXXXX to crash" messages.  The solution was "get the latest revision of
FFS".  I don't understand that.  Why was game/application "XXXXXXX" writing
to and/or worried about the value in location 0 in the first place?  Why
was FFS "wrong" and the program not?

Finally, since upgrading to WShell1.2, I've gotten a few "Warning-
Hanging Forbid!"  messages after running some applications.  What is a
hanging forbid and how serious is it?   Should I report it to the people
who made the software that hung the forbid?  

Thanks in advance for any enlightenment.
-- 
Fred Hamilton                  Any views, comments, or ideas expressed here
Harris Semiconductor           are entirely my own.  Even good ones.
Santa Clara, CA

cmcmanis@stpeter.Sun.COM (Chuck McManis) (01/24/90)

In article <67.25bb84f0@intersil.uucp> (Fred Hamilton) writes:
> Now I understand that the OS needs section(s) of memory all to itself
> (that's what I'm assuming "low memory" is reserved for), but why do *any*
> applications write to low memory?  What's the appeal?  Or is it done by
> accident? by compilers?  How and why do all these programs that trash
> low mem do it?

Well, a lot of people program in C, and sooner or later they use pointers.
Consider the following code fragment:

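struct {
	int	a, b, c, d;
	char	string[80];
	} *mystruct;		/* never pointed at any real storage */

	...
	mystruct->a = CalculateSomething();	/* write through the pointer */
	strcpy(mystruct->string, SomeString);	/* strcpy(dest, src): another write */
	...
	foo = mystruct->a;			/* read back through the pointer */
	printf("%s", mystruct->string);
	...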

What do you notice? Well, you will notice that the structure that
mystruct points to occupies 96 bytes of memory (88 if you are using 16-bit
integers), and that the application is using it properly (i.e., it's
dereferencing the pointer to get at the members). But what if the
application forgot to initialize the pointer to anything? Guess what: a
global or static pointer defaults to 0 (an uninitialized local holds
whatever garbage was on the stack, which is no better). Now if we had an
MMU and a process address space it might warn us that we were
reading/writing outside of our address space; unfortunately we don't. To
further complicate matters, unless someone writes into variable 'b' above
(which, with 32-bit ints, lands on location 4, where the system keeps its
pointer to ExecBase) the program might perform flawlessly. So now your
program works fine until the process that actually owns the memory you've
been writing into needs it and finds it has been stomped. Or maybe you have
overwritten an interrupt vector and the next time the floppy gets accessed
your machine crashes.
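
To make the arithmetic concrete, here is a minimal sketch that works out
which addresses the members would land on if the structure pointer were 0.
It assumes 32-bit ints (modelled with int32_t so it behaves the same on any
host) and relies on the Amiga keeping its ExecBase pointer at location 4.
The names my_struct, address_if_null() and clobbers_execbase() are made up
for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the struct in the post, with int fixed at 32 bits. */
struct my_struct {
    int32_t a, b, c, d;
    char    string[80];
};

/* Location 4 holds the pointer to ExecBase on the Amiga. */
#define EXECBASE_PTR_ADDR 4u

/* Sanity check of the size quoted in the post: 4*4 + 80 = 96 bytes. */
void check_sizes(void)
{
    assert(sizeof(struct my_struct) == 96);
}

/* Address a member would occupy if the struct pointer were NULL (0):
 * base address 0 plus the member's offset within the struct. */
unsigned long address_if_null(size_t member_offset)
{
    return 0ul + (unsigned long)member_offset;
}

/* Returns 1 if a 4-byte store at addr overlaps the ExecBase pointer. */
int clobbers_execbase(unsigned long addr)
{
    return addr < EXECBASE_PTR_ADDR + 4u && addr + 4u > EXECBASE_PTR_ADDR;
}
```

Calling clobbers_execbase(address_if_null(offsetof(struct my_struct, b)))
reports an overlap, while the same call for member 'a' does not, which is
why a program that only ever touches 'a' and 'string' can appear to work.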

>On a related note, I've wondered about these "XXXXXXX won't run with my 590
>because it was expecting to see 00 in location $00, but an early version
>of the FastFileSystem would write different values to location $00 causing
>XXXXXXX to crash" messages.  The solution was "get the latest revision of
>FFS".  I don't understand that.  Why was game/application "XXXXXXX" writing
>to and/or worried about the value in location 0 in the first place?  Why
>was FFS "wrong" and the program not?

Unless you "own" location 0, meaning that you've called the system memory
allocator and it's given you the 4 bytes at location zero for your program
to use, it is illegal to write to it. However, a side effect of modifying
location zero is that when you crash, some of the alerts that get displayed
will have on one side of the '.' the contents of location 0. Consequently a
very clever form of debugging low-level code was to write a debugging
value to 0 (say a 1 for the subroutine Foo, and a 2 for subroutine Bar)
and, when the system crashed, look at that value to determine which
routine you were in. The FastFileSystem used this technique and not all
of the debugging code was removed before the initial release.
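
The breadcrumb trick described above can be sketched roughly as follows.
This is an illustration only, not the FastFileSystem's actual code: an
ordinary variable (fake_location_zero) stands in for address 0 so that the
sketch runs on any machine, and the routine names and ID values are
hypothetical.

```c
#include <stdint.h>

/* Stand-in for the longword at address 0. On a real Amiga the debugging
 * code would store through an absolute pointer to 0, which is exactly
 * the illegal write discussed above. */
volatile uint32_t fake_location_zero = 0;

enum { ID_FOO = 1, ID_BAR = 2 };   /* hypothetical routine IDs */

void foo(void)
{
    fake_location_zero = ID_FOO;   /* breadcrumb: "we are in foo" */
    /* ... real work would go here ... */
}

void bar(void)
{
    fake_location_zero = ID_BAR;   /* breadcrumb: "we are in bar" */
    /* ... real work would go here ... */
}

/* What someone reading the alert would do: map the value back to a name. */
const char *last_routine(void)
{
    switch (fake_location_zero) {
    case ID_FOO: return "foo";
    case ID_BAR: return "bar";
    default:     return "unknown";
    }
}
```

If foo() and then bar() have run before the crash, the value left behind is
2, so the person reading the alert knows the failure happened in or after
bar(). The cost is that anything else relying on location 0 sees the
leftover value too.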

>Finally, since upgrading to WShell1.2, I've gotten a few "Warning-
>Hanging Forbid!"  messages after running some applications.  What is a
>hanging forbid and how serious is it?   Should I report it to the people
>who made the software that hung the forbid?  

A hanging forbid occurs when a process calls the exec function Forbid()
without a matching call to Permit(). The purpose of Forbid() is to
lock out the task scheduler for a moment because the operation your
program is doing must not be interrupted by a task switch. (The classic
example of this is when you are modifying system lists which the next
task in the queue may reference; you must guarantee that they are
consistent before you allow a task switch.) As it turns out, Wait() will
break a Forbid() (and interrupts are still serviced while one is in
effect), so the system doesn't always halt, but whenever control returns
to the task that has an outstanding Forbid(), control will stay with that
task until it calls either Wait() or Permit().
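
The bookkeeping behind a "hanging Forbid" can be pictured with a small
model. This is a host-side sketch, not exec.library code: sim_forbid() and
sim_permit() are hypothetical stand-ins for Forbid() and Permit(), and the
counter models the fact that Forbid() calls nest, so each one needs its own
Permit(). How WShell actually detects the condition is not stated in this
thread; forbid_is_hanging() is just one way a shell could check.

```c
/* Nesting depth of outstanding Forbid() calls for one simulated task. */
static int forbid_depth = 0;

void sim_forbid(void) { forbid_depth++; }

void sim_permit(void)
{
    if (forbid_depth > 0)
        forbid_depth--;
}

/* Task switching is allowed only when no Forbid() is outstanding. */
int switching_allowed(void) { return forbid_depth == 0; }

/* A well-behaved critical section: every Forbid() has its Permit(). */
void good_program(void)
{
    sim_forbid();
    /* ... walk or modify a shared system list here ... */
    sim_permit();
}

/* A buggy program: an early return skips the Permit(). */
void buggy_program(int error)
{
    sim_forbid();
    if (error)
        return;            /* oops: Forbid() left hanging */
    sim_permit();
}

/* What a shell could check after a command returns. */
int forbid_is_hanging(void) { return forbid_depth != 0; }
```

After good_program() the depth is back to zero; after buggy_program(1) it
is not, and in this model no other task would get to run until the
offender calls Permit() or Wait().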

A common use for Forbid() is in terminating a child task or process.
Generally what happens is that the child process or task is getting
ready to exit, so it frees all of the memory it used and the
resources it allocated, and then wishes to send a message to
its parent saying that it is OK to clean it up and unload it. One
of the paradoxes of this situation is that the child may need to
do one last thing after it sends the message but before it is actually
ready to be unloaded. To accomplish this a simple technique is used:
the child calls Forbid() and replies to (or sends) the message telling its
parent that it is ready to die. Because it is under a Forbid(), it knows
the parent won't act on the message before it's ready, so it does
its final cleaning and calls Wait(0). The effect of the Wait(0) is
twofold. First, the Forbid() is broken even though it was never "Permitted",
because the Wait() call forces a task switch. Second, because there
were no signal bits to wait for, the process/task is permanently placed
in the "not ready to run" queue, so it won't execute any more instructions.
Now when the parent removes the task, it is guaranteed not to be
running and is thus safe to remove from the system. If you try to remove
a running task, there is a chance that the scheduler will restart it
after you have released its memory but before it has been removed from
the run queue. This will often cause a system crash.
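
The ordering argument above can be checked with a toy model. This is a
sketch under heavy assumptions, not Amiga code: there is no real scheduler
here, the parent is assumed to preempt the child at the earliest moment a
task switch is allowed (the worst case), and every name (struct sim,
parent_run(), child_exit()) is invented for the illustration.

```c
enum child_state { CHILD_RUNNING, CHILD_WAITING, CHILD_REMOVED };

struct sim {
    int forbid_depth;        /* >0 means task switching is locked out */
    int msg_pending;         /* "I am ready to die" message was sent  */
    int cleanup_done;        /* child finished its last bit of work   */
    int removed_safely;      /* verdict recorded by the parent        */
    enum child_state child;
};

/* Parent: on seeing the message it removes the child immediately. */
static void parent_run(struct sim *s)
{
    if (s->msg_pending && s->child != CHILD_REMOVED) {
        /* Safe only if the child finished and can never run again. */
        s->removed_safely = s->cleanup_done && s->child == CHILD_WAITING;
        s->child = CHILD_REMOVED;
        s->msg_pending = 0;
    }
}

/* Scheduler hook: a task switch happens only with no Forbid in force. */
static void maybe_switch_to_parent(struct sim *s)
{
    if (s->forbid_depth == 0)
        parent_run(s);
}

/* Child exit path; use_forbid = 0 shows the race the text warns about.
 * Returns 1 if the parent removed the child at a safe moment. */
int child_exit(int use_forbid)
{
    struct sim s = { 0, 0, 0, 0, CHILD_RUNNING };

    if (use_forbid)
        s.forbid_depth++;            /* stands in for Forbid()       */

    s.msg_pending = 1;               /* tell the parent we are done  */
    maybe_switch_to_parent(&s);      /* parent preempts here if it can */

    if (s.child != CHILD_REMOVED) {
        s.cleanup_done = 1;          /* the "one last thing"         */
        s.child = CHILD_WAITING;     /* Wait(0): sleep forever ...   */
        s.forbid_depth = 0;          /* ... which breaks the Forbid  */
        maybe_switch_to_parent(&s);
    }
    return s.removed_safely;
}
```

With the Forbid in place, child_exit(1) returns 1: the parent only runs
once the child is parked in its Wait(0). Without it, child_exit(0) returns
0 because the parent pulls the child out from under its own final cleanup.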

So in summary, a hanging Forbid() may indicate a bug, or it may indicate
a child process that has begun the process of exiting; from the warning
alone you can't really tell which.

--Chuck McManis
uucp: {anywhere}!sun!cmcmanis   BIX: cmcmanis  ARPAnet: cmcmanis@Eng.Sun.COM
These opinions are my own and no one else's, but you knew that didn't you.
"If it didn't have bones in it, it wouldn't be crunchy now would it?!"