[comp.sys.amiga.tech] divide by 0 problem in "info"? Also: GOMF

phil@rice.edu (William LeFebvre) (08/31/88)

Honest!  It really happened to me.  I just innocently typed "info" at my
CLI.  It gave me a list of mounted disks and their used space, free,
percent, etc.  Then I got a "Task held" requestor before it got to the
"Volumes available" list.  When I clicked Cancel, it gurued with a 5:
divide by zero check.  I can't recreate it because my machine had been up
for so long when it happened.  It succeeded in displaying info for VD0:
and both floppies.  Maybe it was trying to calculate the line for RAM:?
Has anyone seen this before?  This was 1.2, of course.

Unfortunately, I wasn't running GOMF.  Which brings me to another
question.  Will GOMF catch only Gurus generated after a "Task held"
requestor?  I made VT100 generate my favorite reproducible guru (Kermit
BYE before doing a transfer), but GOMF didn't catch it.  I was a little
disappointed.  Does anyone know just what it will catch?

I'm full of questions these days...

			William LeFebvre
			Department of Computer Science
			Rice University
			<phil@Rice.edu>

riley@batcomputer.tn.cornell.edu (Daniel S. Riley) (08/31/88)

In article <1841@kalliope.rice.edu> phil@rice.edu (William LeFebvre) writes:
[...]
>I made VT100 generate my favorite reproducible guru (Kermit
>BYE before doing a transfer), but GOMF didn't catch it.  I was a little
>disappointed.  Does anyone know just what it will catch?

When you kermit BYE in vt100 2.8 before doing a transfer, vt100 constructs
a packet using a pointer which never had memory allocated for it...i.e.
it points to NULL, so Kermit happily scribbles all over low memory.
I think it scribbles over less than 100 bytes, so memwatch might catch it.
Beyond that, you nuke the autovector interrupts and there's not much hope
of getting out intact.
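
As a rough illustration of the kind of guard that avoids this, here is a
minimal sketch in portable C.  The names pkt_buf, pkt_init, pkt_build and
PKT_MAX are hypothetical and are not taken from the vt100 source; the point
is only that the packet builder refuses to write through a pointer that
was never given memory.

```c
#include <stdlib.h>
#include <string.h>

#define PKT_MAX 96   /* assumed maximum packet size, for illustration only */

/* Hypothetical packet buffer; it starts out unallocated (NULL). */
static char *pkt_buf = NULL;

/* Allocate the packet buffer once.  Returns 0 on success, -1 on failure. */
int pkt_init(void)
{
    if (pkt_buf == NULL)
        pkt_buf = malloc(PKT_MAX);
    return pkt_buf ? 0 : -1;
}

/* Copy len bytes of data into the packet buffer.
   Returns the number of bytes written, or -1 if the buffer was never
   allocated or the data does not fit.  Nothing is written on failure. */
int pkt_build(const char *data, size_t len)
{
    if (pkt_buf == NULL || len > PKT_MAX)
        return -1;
    memcpy(pkt_buf, data, len);
    return (int)len;
}
```

With a check like this, a BYE issued before any transfer would fail with
an error code instead of writing over low memory.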

I sent fixes for this (and several other bugs) to Tony, so this should
disappear in 2.9, and you'll have to find a new `favorite guru'.

-dan riley (dsr@lns61.tn.cornell.edu, dsr@crnlns.bitnet)
-wilson lab, cornell u.

dillon@CORY.BERKELEY.EDU (Matt Dillon) (09/01/88)

:Honest!  It really happened to me.  I just innocently typed "info" at my
:CLI.  It gave me a list of mounted disks and their used space, free,
:percent, etc.  Then I got a "Task held" requestor before it got to the
:"Volumes available" list.  When I clicked Cancel, it gurued with a 5:
:divide by zero check.  I can't recreate it because my machine had been up

	This is a bug in RAM: ... sometimes it loses track of the number
of blocks free / allocated and it becomes 0.  INFO assumes that disk devices
are at least 1 block big and doesn't check for 0's when it does a divide.
I think the 1.3 INFO checks for 0.
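
A minimal sketch of the missing check, in portable C.  The struct
disk_counts and both function names are hypothetical stand-ins, not the
real AmigaDOS structures or the actual INFO source; it only shows a
percentage calculation that tests the divisor first.

```c
#include <stdio.h>

/* Hypothetical stand-in for the block counts a filesystem reports. */
struct disk_counts {
    long num_blocks;       /* total blocks on the volume */
    long num_blocks_used;  /* blocks currently allocated */
};

/* Returns percent full, or -1 when the total is not positive and the
   percentage is therefore undefined. */
int percent_full(const struct disk_counts *d)
{
    if (d->num_blocks <= 0)
        return -1;                       /* refuse to divide by zero */
    return (int)((d->num_blocks_used * 100L) / d->num_blocks);
}

/* Prints one "info"-style line without ever dividing by zero. */
void print_volume_line(const char *name, const struct disk_counts *d)
{
    int pct = percent_full(d);

    if (pct < 0)
        printf("%-6s size unknown\n", name);
    else
        printf("%-6s %ld used  %ld free  %d%% full\n", name,
               d->num_blocks_used,
               d->num_blocks - d->num_blocks_used, pct);
}
```

A volume that reports zero blocks then gets a "size unknown" line rather
than a trap.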

				-Matt

rchampe@hubcap.UUCP (Richard Champeaux) (09/01/88)

In article <1841@kalliope.rice.edu>, phil@rice.edu (William LeFebvre) writes:
> 
> Unfortunately, I wasn't running GOMF.  Which brings me to another
> question.  Will GOMF catch only Gurus generated after a "Task held"
> requestor?  I made VT100 generate my favorite reproducible guru (Kermit
> BYE before doing a transfer), but GOMF didn't catch it.  I was a little
> disappointed.  Does anyone know just what it will catch?
> 

    GOMF seems to catch a lot of the system errors, both the "Task Held"
and the GURU ones.  However, it doesn't tend to catch the ones I got it for,
the ones that the programs I'm writing create.  To quote the manual:

          The third is a catastrophic system failure characterized by an
     instant lockup of the entire system.  Unfortunately, this 'sudden death'
     is fatal to all programs, including GOMF.  Luckily, programmers so inept
     as to write software that does this seldom get their work published, and
     as such, this condition is very rarely encountered.

   Well I guess that makes me an inept programmer, because I get a lot of 
these.  They don't all cause a complete system lockup.  My favorite one that
GOMF misses is one that causes GOMF's "guru buster" picture to appear and
stay there.  Sometimes the mouse locks up, sometimes it doesn't.  I don't
even get to decode the GURU MEDITATION ERROR because it never shows up and
if I reboot, I get thrown back to kickstart.
   As far as I can tell, these are all pretty much caused by making system
calls with improperly initialized structures. (they aren't very forgiving)
Unfortunately, these are exactly the kind of errors I get when writing
programs.  Usually because either the Rom Kernel Manual's information is 
sketchy, or because one of the important steps you need to do is mentioned
in another section 15-20 pages away (or I just plain screw up :-)
     Oh well, it wouldn't be as much fun if it was all just handed to you :-)

> I'm full of questions these days...
> 
> 			William LeFebvre
> 			Department of Computer Science
> 			Rice University
> 			<phil@Rice.edu>

Rich Champeaux
Clemson University

disd@hubcap.UUCP (Gary Heffelfinger) (09/01/88)

From article <2914@hubcap.UUCP>, by rchampe@hubcap.UUCP (Richard Champeaux):
> In article <1841@kalliope.rice.edu>, phil@rice.edu (William LeFebvre) writes:
>> 
> 
>     GOMF seems to catch a lot of the system errors, both the "Task Held"
> and the GURU ones.  However, it doesn't tend to catch the ones I got it for,
> the ones that the programs I'm writing create.  To quote the manual:
> 
>           The third is a catastrophic system failure characterized by an
>      instant lockup of the entire system.  Unfortunately, this 'sudden death'
>      is fatal to all programs, including GOMF.  Luckily, programmers so inept
>      as to write software that does this seldom get their work published, and
>      as such, this condition is very rarely encountered.
My, my, my.  Aren't they sweet?  I assume that they mean that the
condition will rarely be encountered in commercial software, but it sure
isn't nice to call potential customers "inept".  This sort of thing has
no place in documentation.  A simple "Since most software will be
thoroughly tested by the time you purchase it, you will rarely encounter
this condition." would be sufficient.  Sheesh.  No need to insult.

> 
>    Well I guess that makes me an inept programmer, because I get a lot of 
> these.  They don't all cause a complete system lockup.  My favorite one that
> GOMF misses is one that causes GOMF's "guru buster" picture to appear and
> stay there.  Sometimes the mouse locks up, sometimes it doesn't.  I don't
> even get to decode the GURU MEDITATION ERROR because it never shows up and
> if I reboot, I get thrown back to kickstart.
I don't think that any of this is a measure of ineptness.  Heck we've
all seen the "fireworks" associated with an invalid pointer.  Does GOMF
trap these?  I don't consider myself to be inept, and I see these things
happen now and then.  (Not inept, but perhaps a bit misguided at times :-)

Gary

-- 
Gary Heffelfinger   ---   Employed by, but not the mouthpiece of 
                          Clemson University.
---===      Amiga.  The computer for the best of us.     ===---

rap@ardent.UUCP (Rob Peck) (09/02/88)

In article <2914@hubcap.UUCP>, rchampe@hubcap.UUCP (Richard Champeaux) writes:
> 
>     GOMF seems to catch a lot of the system errors, both the "Task Held"
> and the GURU ones.  However, it doesn't tend to catch the ones I got it for,
> the ones that the programs I'm writing create.  To quote the manual:
> 
>           The third is a catastrophic system failure characterized by an
>      instant lockup of the entire system.  Unfortunately, this 'sudden death'
>      is fatal to all programs, including GOMF.  Luckily, programmers so inept

I found that the most common cause of instant lockup (at least when -->I<--
cause it) is asking to free a memory segment that has already been freed.
It seems that once a free memory list has been mangled, Amy does not
quite know what to do --- 

	"Do I have any memory available to put up a dead end requester, 
	 well, maybe I'd stomp on something because this memory list is 
	 so rotz'ed up I don't quite know what to do... aha, I'll just lock
	 up... the programmer (or user) will recognize that I have a problem 
	 and nerve pinch me back to health"
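
One common defence against the double free described above is to clear the
pointer at the moment it is freed.  This is a minimal sketch in portable C
using malloc/free; safe_free is a hypothetical helper, and on the Amiga the
equivalent would have to wrap FreeMem, which also needs the block size.

```c
#include <stdlib.h>

/* Frees the block *pp points at and clears *pp.
   Returns 1 if memory was released, 0 if there was nothing to free
   (pp is NULL, or the block was already freed through this pointer). */
int safe_free(void **pp)
{
    if (pp == NULL || *pp == NULL)
        return 0;
    free(*pp);
    *pp = NULL;    /* a second call through the same pointer is a no-op */
    return 1;
}
```

This only protects against freeing twice through the same variable; a
second copy of the pointer held elsewhere can still mangle the free list.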

Actually, maybe let's make this an official request for 1.4 -- make it impossible
to lock up the system under at least this particular condition (and any
others for which the reason for the lockup can be determined).  No, I
don't know HOW, but maybe the lockup condition can be reworked to wait
a few seconds (so the user might recognize that it was locked), then
stomp/unstomp something so that a dead end requester can be put up --
like maybe even if the memory list is messed up, make a fast assumption
that there is at least 512k in the machine and reinitialize the memory
and/or screen so that the requester can be put up with the appropriate
message.  The access to data is probably already too bad off and it
might be more desirable to allow the machine to recover on its own.

One thing I had not tried yet, though ... does the system drop into
ROMWack when a lockup happens?  If so, programmers might still benefit,
even during lockup, if an external terminal could do a little bit of
post-mortem analysis.  You'd have to do a bit of printf'ing to the
external terminal before the crash, so you'd know where Amy loaded
and created various structures, but maybe there'd be a way to look
at things if you crash consistently each time you run it.
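
A minimal sketch of that kind of pre-crash tracing, in portable C.  The
function names are hypothetical, and an ordinary stdio stream stands in
for the external terminal; how the stream reaches the serial port is left
out.  Each line is flushed immediately so it has left the machine before a
later lockup.

```c
#include <stdio.h>

/* Formats "<label> at <address>" into out.
   Returns the number of characters written (excluding the NUL),
   or -1 if the text did not fit or formatting failed. */
int format_trace(char *out, size_t cap, const char *label, const void *addr)
{
    int n = snprintf(out, cap, "%s at %p", label, addr);

    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Writes one trace line to term and flushes it right away.
   Returns 0 on success, -1 on any failure. */
int trace_addr(FILE *term, const char *label, const void *addr)
{
    char line[96];

    if (format_trace(line, sizeof line, label, addr) < 0)
        return -1;
    if (fprintf(term, "%s\n", line) < 0)
        return -1;
    return fflush(term) == 0 ? 0 : -1;
}
```

Calling trace_addr right after each structure is created leaves a record
of where things lived, which is the information a post-mortem would need.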

Rob Peck

paolucci@snll-arpagw.UUCP (Sam Paolucci) (09/02/88)

In article <1841@kalliope.rice.edu> phil@rice.edu (William LeFebvre) writes:
+Honest!  It really happened to me.  I just innocently typed "info" at my
+CLI.  It gave me a list of mounted disks and their used space, free,
+percent, etc.  Then I got a "Task held" requestor before it got to the
+"Volumes available" list.  When I clicked Cancel, it gurued with a 5:
+divide by zero check.  I can't recreate it because my machine had been up
+for so long when it happened.  It succeeded in displaying info for VD0:
+and both floppies.  Maybe it was trying to calculate the line for RAM:?
+Has anyone seen this before?  This was 1.2, of course.

The same thing happened to me a few weeks back, but I am sorry to say that
I cannot add any more information to your accurate description.

-- 
					-+= SAM =+-
"the best things in life are free"

				ARPA: paolucci@snll-arpagw.llnl.gov

andy@cbmvax.UUCP (Andy Finkel) (09/03/88)

In article <8808311804.AA19391@cory.Berkeley.EDU> dillon@CORY.BERKELEY.EDU (Matt Dillon) writes:
>
>:Honest!  It really happened to me.  I just innocently typed "info" at my
>:CLI.  It gave me a list of mounted disks and their used space, free,
>
>	This is a bug in RAM: ... sometimes it loses track of the number
>of blocks free / allocated and it becomes 0.  INFO assumes that disk devices

Actually, there was a bug in the info command, and a bug in ram.
Info assumed 2 reserved blocks, rather than reading the environment
vector.  (It also didn't check for a possible divide by 0).  I fixed
both for 1.3.  RAM also had a bug, involving deleting directories.
(essentially, the block count became 1 off)  I fixed that one, too.
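
To illustrate the reserved-blocks point, here is a minimal sketch in
portable C.  The struct volume_env and usable_blocks are hypothetical and
are not the real environment vector layout or the 1.3 Info source; the
idea is just that the reserved count is read per volume instead of being
hard-coded as 2, and that the result is never allowed to go negative.

```c
/* Hypothetical subset of what a volume's environment describes. */
struct volume_env {
    long total_blocks;     /* every block on the device */
    long reserved_blocks;  /* blocks the filesystem sets aside */
};

/* Blocks available for files, using the volume's own reserved count.
   Clamped at zero so a small or miscounted volume cannot yield a
   negative size. */
long usable_blocks(const struct volume_env *env)
{
    long usable = env->total_blocks - env->reserved_blocks;

    return usable > 0 ? usable : 0;
}
```

A caller that goes on to compute a percentage still has to treat a result
of zero as "no size to divide by".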


-- 
andy finkel		{uunet|rutgers|amiga}!cbmvax!andy
Commodore-Amiga, Inc.

"If we can't fix it, it ain't broke."

Any expressed opinions are mine; but feel free to share.
I disclaim all responsibilities, all shapes, all sizes, all colors.

carlson@ernie.Berkeley.EDU (Richard L. Carlson) (09/03/88)

In article <4645@cbmvax.UUCP> andy@cbmvax.UUCP (Andy Finkel) writes:
<In article <8808311804.AA19391@cory.Berkeley.EDU> dillon@CORY.BERKELEY.EDU (Matt Dillon) writes:
<<
<<:Honest!  It really happened to me.  I just innocently typed "info" at my
<<:CLI.  It gave me a list of mounted disks and their used space, free,
<<
<<	This is a bug in RAM: ... sometimes it loses track of the number
<<of blocks free / allocated and it becomes 0.  INFO assumes that disk devices
<
<Actually, there was a bug in the info command, and a bug in ram.
<Info assumed 2 reserved blocks, rather than reading the environment
<vector.  (It also didn't check for a possible divide by 0).  I fixed
<both for 1.3.  RAM also had a bug, involving deleting directories.
<(essentially, the block count became 1 off)  I fixed that one, too.
<-- 
<andy finkel		{uunet|rutgers|amiga}!cbmvax!andy

The bug in RAM: involved just deleting directories?  I've recently been
doing some heavy-duty *file* creation and deletion in RAM:, and I always
end up with very negative block counts for RAM: [-118 blocks, as I type
this].  Are there circumstances other than deleting directories that are
known to induce this bug?

-- Richard
   {tektronix,dual,sun,ihnp4,decvax}!ucbvax!ernie!carlson
   carlson@ernie.berkeley.edu

andy@cbmvax.UUCP (Andy Finkel) (09/09/88)

In article <25918@ucbvax.BERKELEY.EDU> carlson@ernie.Berkeley.EDU.UUCP (Richard L. Carlson) writes:
>In article <4645@cbmvax.UUCP> andy@cbmvax.UUCP (Andy Finkel) writes:
><In article <8808311804.AA19391@cory.Berkeley.EDU> dillon@CORY.BERKELEY.EDU (Matt Dillon) writes:
>The bug in RAM: involved just deleting directories?  I've recently been

It also showed itself when deleting 0 length files.  Are you making any of these?

			andy
-- 
andy finkel		{uunet|rutgers|amiga}!cbmvax!andy
Commodore-Amiga, Inc.

"If we can't fix it, it ain't broke."

Any expressed opinions are mine; but feel free to share.
I disclaim all responsibilities, all shapes, all sizes, all colors.