[comp.sys.amiga] Modem/VT100 problem

ins_adjb@jhunix.HCF.JHU.EDU (Daniel Jay Barrett) (06/06/88)

Ever since I've been using my new Avatec 2400-baud modem and VT100 v2.8,
I've been having problems when I exit VT100.

Shortly after I click the close gadget and turn off my modem, the
system either locks up, or I start getting "Volume ASDG-RAM has a
read/write error" from my ramdisk, VD0:.

Has anybody else experienced this problem?  Could the modem be the
cause?  Or must I suspect poor, loveable VT100?

-- 
Dan Barrett	ins_adjb@jhunix.UUCP
		barrett@cs.jhu.edu

hardie@damask.UUCP (Peter Hardie ) (06/13/88)

In article <6512@jhunix.HCF.JHU.EDU>, ins_adjb@jhunix.HCF.JHU.EDU (Daniel Jay Barrett) writes:
> 
> Ever since I've been using my new Avatec 2400-baud modem and VT100 v2.8,
> I've been having problems when I exit VT100.
> 
> Shortly after I click the close gadget and turn off my modem, the
> system either locks up, 
> Has anybody else experienced this problem?  Could the modem be the
> cause?  Or must I suspect poor, loveable VT100?


I have a Hayes smartmodem 1200 and have had this problem for a long time
now with several other modem/terminal programs. 
It only happens when the remote end drops the carrier and this occasionally
causes the amiga to go bye-bye. The mouse can still be moved around but trying
to terminate vt100 by clicking on the left-hand button doesn't work anymore.
I can't remember if pushing the window to the back still works, but I suspect
not because I always have to reboot.
I posted a note about it
a couple of years ago and the only reply I got suggested that I had wired 
my modem improperly. Rubbish!
It seems to me that it is in the kernel somewhere.
Pete Hardie ve5va
ihnp4!watmath!alberta!damask!hardie

robocop@netmbx.UUCP (Thorsten Ebers) (06/15/88)

In article <6512@jhunix.HCF.JHU.EDU> ins_adjb@jhunix.HCF.JHU.EDU (Daniel Jay Barrett) writes:
>
>Shortly after I click the close gadget and turn off my modem, the
>system either locks up, or I start getting "Volume ASDG-RAM has a
>read/write error" from my ramdisk, VD0:.
>

I also experienced this problem. I got this message when I was compiling some
stuff with M2Amiga. I found out that my memory was running out. At that moment my
compiler wrote something over the ASDG-RAM, so I got my read/write error.
I suppose that something similar has happened to you, so I do not think that your
modem was the cause.

Thorsten Ebers
robocop@netmbx.UUCP
DO NOT MAIL OUTSIDE GERMANY.

riley@batcomputer.tn.cornell.edu (Daniel S. Riley) (06/18/88)

In article <1108@damask.UUCP> hardie@damask.UUCP (Peter Hardie ) writes:
>I have a Hayes smartmodem 1200 and have had this problem for a long time
>now with several other modem/terminal programs. 
>It only happens when the remote end drops the carrier and this occasionally
>causes the amiga to go bye-bye. The mouse can still be moved around but trying
>to terminate vt100 by clicking on the left-hand button doesn't work anymore.
>I can't remember if pushing the window to the back still works, but I suspect
>not because I always have to reboot.

I know that there are a number of places within vt100 where it will
AbortIO() a request,
Wait() for the signal associated with the port for that IO request, then
WaitIO() on that IO request.
This sequence can lead to obscure vt100 lock-ups if the IO happens to
complete and the signal is cleared somewhere else in the program before
this code gets executed.  However, this only causes vt100 to hang, not
any other part of the system, and it's fairly rare and timing dependent.
I took all the Wait()'s out of these sequences after I was burned a few
times aborting scripts, and have had no problems with vt100 hanging
since then.  Hopefully I'll remember to complain to Tony before the
next version comes out.
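
The failure mode described above can be sketched in portable C. This is
a toy model, not Amiga code: SimTask, SimReq, sim_reply(), sim_wait() and
sim_waitio() are made-up stand-ins for a task's signal bits, an IO request
and the exec calls, and a return value of 0 stands for "the real call
would sleep forever". It assumes WaitIO() only waits when the request has
not come back yet.

```c
#include <assert.h>

/* Hypothetical stand-ins for illustration; NOT the real exec.library API. */
enum { SIM_NT_MESSAGE = 1, SIM_NT_REPLYMSG = 2 };

struct SimTask { unsigned long sig_recvd; };             /* pending signal bits */
struct SimReq  { int ln_Type; unsigned long sig_mask; }; /* one IO request */

/* The device hands the request back: mark it replied, signal the task. */
void sim_reply(struct SimTask *t, struct SimReq *r)
{
    r->ln_Type = SIM_NT_REPLYMSG;
    t->sig_recvd |= r->sig_mask;
}

/* Model of Wait(): returns 1 and consumes the bit if it is pending,
 * returns 0 where the real call would go to sleep. */
int sim_wait(struct SimTask *t, unsigned long mask)
{
    assert(mask != 0);
    if ((t->sig_recvd & mask) == 0)
        return 0;
    t->sig_recvd &= ~mask;
    return 1;
}

/* Model of WaitIO(): waits only while the request has not come back.
 * Nothing replies later in this model, so an unreplied request yields 0. */
int sim_waitio(struct SimTask *t, struct SimReq *r)
{
    while (r->ln_Type == SIM_NT_MESSAGE)
        if (!sim_wait(t, r->sig_mask))
            return 0;
    return 1;
}

/* AbortIO(); Wait(); WaitIO();  (the AbortIO() itself is not modelled) */
int abort_with_wait(struct SimTask *t, struct SimReq *r)
{
    if (!sim_wait(t, r->sig_mask))
        return 0;                 /* stuck in the extra Wait() */
    return sim_waitio(t, r);
}

/* AbortIO(); WaitIO();  (the same sequence with the Wait() taken out) */
int abort_without_wait(struct SimTask *t, struct SimReq *r)
{
    return sim_waitio(t, r);
}
```

If the request is replied and some other code consumes the signal first,
abort_with_wait() returns 0 in this model while abort_without_wait()
returns 1; when nothing else touches the signal, both return 1.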

While I'm on the subject, there's also a bug in the Kermit bye routine--
when it's invoked from the menu (or keyboard equivalent), it either uses
memory that's been freed (if you've done a kermit transfer that session)
or scribbles all over low memory (if it's the first kermit operation that
session).  It's also on my list to tell Tony about.

-Dan Riley (dsr@lns61.tn.cornell.edu, dsr@crnlns.bitnet)
-Wilson Lab, Cornell U.

dillon@CORY.BERKELEY.EDU (Matt Dillon) (06/18/88)

:I know that there are a number of places within vt100 where it will
:AbortIO() a request,
:Wait() for the signal associated with the port for that IO request, then
:WaitIO() on that IO request.
:This sequence can lead to obscure vt100 lock-ups if the IO happens to
:complete and the signal is cleared somewhere else in the program before
:this code gets executed.  However, this only causes vt100 to hang, not
:any other part of the system, and it's fairly rare and timing dependent.
:I took all the Wait()'s out of these sequences after I was burned a few
:times aborting scripts, and have had no problems with vt100 hanging
:since then.  Hopefully I'll remember to complain to Tony before the
:next version comes out.

	The only possible way this could cause a lock up is if the
signal got cleared AFTER the IO request completed but BEFORE the Wait().
AbortIO() itself does not munge with the signals.  The only way the
signal can be cleared is if VT100 clears it somewhere (via Wait(), for
instance).  Thus, the bug is in *ANOTHER PART* of VT100.

	Besides all that there is absolutely no reason to put a Wait()
before WaitIO().  WaitIO() works no matter what the state of the signal
bit is.

	A common problem which you might look for is the programmer
forgetting to set the node type of the IO request to NT_MESSAGE before
SendIO(), BeginIO(), or DoIO() (not sure about needing it in DoIO()).
PutMsg() does this for you, but SendIO()/BeginIO()/DoIO() usually don't
(it depends on the device).

	Just as PutMsg() sticks in NT_MESSAGE automatically, ReplyMsg()
sticks in NT_REPLYMSG or NT_FREEMSG automatically.  This is how WaitIO()
figures out whether the request has been returned by the device yet or not.
(IOF_QUICK is also checked but that's another story).  Since one doesn't
PutMsg() to the IO device, one must set ln_Type to NT_MESSAGE manually before
every SendIO()/BeginIO()/DoIO().
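
A rough illustration of why the stale node type matters, again as a
portable toy model rather than real exec code. SimReq, sim_sendio(),
sim_reply(), sim_looks_returned() and careful_sendio() are invented names;
the device_sets_type flag is an assumption standing in for "it depends on
the device".

```c
#include <assert.h>

/* Hypothetical stand-ins for illustration; NOT the real exec.library API. */
enum { SIM_NT_MESSAGE = 1, SIM_NT_REPLYMSG = 2 };

struct SimReq {
    int ln_Type;     /* models io_Message.mn_Node.ln_Type */
    int in_flight;   /* 1 while the simulated device still owns the request */
};

/* Models SendIO() to a device that may or may not reset ln_Type itself. */
void sim_sendio(struct SimReq *r, int device_sets_type)
{
    if (device_sets_type)
        r->ln_Type = SIM_NT_MESSAGE;
    r->in_flight = 1;
}

/* Models the device returning the request via ReplyMsg(). */
void sim_reply(struct SimReq *r)
{
    assert(r->in_flight);
    r->ln_Type = SIM_NT_REPLYMSG;
    r->in_flight = 0;
}

/* The test a WaitIO()-style routine applies: anything other than
 * NT_MESSAGE is taken to mean "already returned". */
int sim_looks_returned(const struct SimReq *r)
{
    return r->ln_Type != SIM_NT_MESSAGE;
}

/* The defensive habit: stamp the type yourself before every send. */
void careful_sendio(struct SimReq *r, int device_sets_type)
{
    r->ln_Type = SIM_NT_MESSAGE;
    sim_sendio(r, device_sets_type);
}
```

Reusing a request that has already been replied once, with a device that
does not reset the type, leaves in_flight at 1 while sim_looks_returned()
already says 1. With careful_sendio() the two stay consistent.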


				-Matt

acs@amdahl.uts.amdahl.com (Tony Sumrall) (06/18/88)

In article <5205@batcomputer.tn.cornell.edu> riley@tcgould.tn.cornell.edu (Daniel S. Riley)
writes about AbortIO(), Wait(), WaitIO() sequences.  Matt has begun
addressing this in his follow-up so I'll reply there.  Now, on to the next
point:
>             Hopefully I'll remember to complain to Tony before the
>next version comes out.
>
>While I'm on the subject, there's also a bug in the Kermit bye routine--
>when it's invoked from the menu (or keyboard equivalent), it either uses
>memory that's been freed (if you've done a kermit transfer that session)
>or scribbles all over low memory (if it's the first kermit operation that
>session).  It's also on my list to tell Tony about.

Please don't delay telling me about bugs, suspected bugs or anomalous
behaviour.  Prompt e-mail will give me more opportunity to find a fix.
Then the only problem is to get the fix out (and, yes, 2.8A, the patch
version, will be out RSN, honest!).

>-Dan Riley (dsr@lns61.tn.cornell.edu, dsr@crnlns.bitnet)
>-Wilson Lab, Cornell U.
-- 
Tony Sumrall acs@amdahl.uts.amdahl.com <=> amdahl!acs

[ Opinions expressed herein are the author's and should not be construed
  to reflect the views of Amdahl Corp. ]

acs@amdahl.uts.amdahl.com (Tony Sumrall) (06/18/88)

In article <8806180105.AA07445@cory.Berkeley.EDU> dillon@CORY.BERKELEY.EDU (Matt Dillon) writes,
in response to a statement of how VT100 handles the abort of an I/O
operation:
>	The only possible way this could cause a lock up is if the
>signal got cleared AFTER the IO request completed but BEFORE the Wait().
>AbortIO() itself does not munge with the signals.  The only way the
>signal can be cleared is if VT100 clears it somewhere (via Wait(), for
>instance).  Thus, the bug is in *ANOTHER PART* of VT100.
>
>	Besides all that there is absolutely no reason to put a Wait()
>before WaitIO().  WaitIO() works no matter what the state of the signal
>bit is.

Yeah, but there was a *significant* discussion around the 2nd quarter of
last year on exactly how to successfully abort an I/O.  I remember Leo
being involved and even CATS (Randy Weiner).  In fact, Leo wrote (in
article 3213@well.UUCP):
>	Try this:
>
>	AbortIO (ConsoleRead);
>	WaitIO (ConsoleRead);
>	Wait (1L << ConsoleRead -> io_Message -> mn_ReplyPort -> mp_SigBit);
>
>	When I was writing Robotroff, there were occasions that I would need
>to AbortIO() a timer request so I could reconfigure things.  I AbortIO()ed,
>then WaitIO()ed.  I flushed it, right?
>
>	Apparently not.  I would keep getting this phantom signal that
>Wait() would respond to.  I surmised that the AbortIO() posted a reply to
>your reply port, raising a signal bit.  WaitIO() apparently then checks to
>see if the queue has anything.  If so, it grabs it right away, and doesn't
>bother to clear the signal bit.  So you still have an outstanding signal
>pending to your task... er.. process... er.. whatever.  The above convoluted
>Wait() statement will clear it.
>
>	I suspect that, when you post the new request, then wait for it, the
>old signal somehow gets in and fouls things up.
>
>	If CATS would like the code that created this condition, I'm sure I
>could hack something together.
>
>	Be warned, however, that I have a sneaking suspicion that the above
>hack has a very low probability of working for you....

Is there new information that I somehow missed?  I'm not *that* heavily
into the Amiga (insofar as I've never disassembled any of the code) but
I've followed the discussions here on the net pretty closely and never saw
this particular point brought to light.

Back to Matt:
>	A common problem which you might look for is the programmer
>forgetting to set the node type of the IO request to NT_MESSAGE before
>SendIO(), BeginIO(), or DoIO() (not sure about needing it in DoIO()).
>PutMsg() does this for you, but SendIO()/BeginIO()/DoIO() usually don't
>(it depends on the device).
>
>	Just as PutMsg() sticks in NT_MESSAGE automatically, ReplyMsg()
>sticks in NT_REPLYMSG or NT_FREEMSG automatically.  This is how WaitIO()
>figures out whether the request has been returned by the device yet or not.
>(IOF_QUICK is also checked but that's another story).  Since one doesn't
>PutMsg() to the IO device, one must set ln_Type to NT_MESSAGE manually before
>every SendIO()/BeginIO()/DoIO().

Really?  I know that SendIO() and BeginIO() are rather poorly documented
and they even say that their operation varies based on the device (RKM,
old version) but I can't remember *ever* seeing this in any source (other
than Matt's, perhaps).

I'm not disagreeing on either of the above two points, just asking for
confirmation.

>
>				-Matt


-- 
Tony Sumrall acs@amdahl.uts.amdahl.com <=> amdahl!acs

[ Opinions expressed herein are the author's and should not be construed
  to reflect the views of Amdahl Corp. ]

jesup@cbmvax.UUCP (Randell Jesup) (06/19/88)

In article <5205@batcomputer.tn.cornell.edu> riley@tcgould.tn.cornell.edu (Daniel S. Riley) writes:
>I know that there are a number of places within vt100 where it will
>AbortIO() a request,
>Wait() for the signal associated with the port for that IO request, then
>WaitIO() on that IO request.
>This sequence can lead to obscure vt100 lock-ups if the IO happens to
>complete and the signal is cleared somewhere else in the program before
>this code gets executed.

	You've got it right.  The sequence would be safe if no other IO
requests used that port (since nothing else would ever clear the signal).
If multiple IO requests use the port, a DoIO/WaitIO on the other requests
may cause you to miss the signal from the AbortIO()ed request.  I find it
easier to assume the signal means something MAY have come back, so I CheckIO
before deciding to WaitIO.  CheckIO is very fast, and also if I know it's
done, WaitIO never actually waits, so I don't go away and ignore the user.
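
The CheckIO-first habit can be sketched as follows, as a portable toy
model and not real exec code. SimReq, sim_checkio() and on_port_signal()
are invented for illustration; a NULL slot in the array is simply this
sketch's way of marking a request that has already been collected.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT the real exec.library API. */
enum { SIM_NT_MESSAGE = 1, SIM_NT_REPLYMSG = 2 };

struct SimReq { int ln_Type; };   /* models io_Message.mn_Node.ln_Type */

/* Models CheckIO(): nonzero once the device has returned the request. */
int sim_checkio(const struct SimReq *r)
{
    return r->ln_Type == SIM_NT_REPLYMSG;
}

/* Called when the shared reply port's signal arrives. The signal only
 * means something MAY have come back, so every outstanding request is
 * checked, and only finished ones are collected. Returns how many were
 * collected; zero is a perfectly normal answer. */
int on_port_signal(struct SimReq *reqs[], size_t n)
{
    int collected = 0;
    size_t i;

    assert(reqs != NULL);
    for (i = 0; i < n; i++) {
        if (reqs[i] != NULL && sim_checkio(reqs[i])) {
            /* Real code would WaitIO(reqs[i]) here; since the request
             * is known to be done, that call would not sleep. */
            reqs[i] = NULL;
            collected++;
        }
    }
    return collected;
}
```

A signal that arrives with nothing finished just yields 0 and the caller
goes back to its main Wait(), so no request is ever waited on blindly.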

Randell Jesup, Commodore Engineering {uunet|rutgers|ihnp4|allegra}!cbmvax!jesup

jesup@cbmvax.UUCP (Randell Jesup) (06/19/88)

In article <8806180105.AA07445@cory.Berkeley.EDU> dillon@CORY.BERKELEY.EDU (Matt Dillon) writes:
>	The only possible way this could cause a lock up is if the
>signal got cleared AFTER the IO request completed but BEFORE the Wait().
>AbortIO() itself does not munge with the signals.  The only way the
>signal can be cleared is if VT100 clears it somewhere (via Wait(), for
>instance).

	Or by doing a WaitIO or DoIO for another request that uses the same
port.

>	Besides all that there is absolutely no reason to put a Wait()
>before WaitIO().  WaitIO() works no matter what the state of the signal
>bit is.

	This is exactly right.  However, a program may wish to Wait() if it
KNOWS it will be signaled, so it can handle other tasks while waiting.
The best way to KNOW you will be signaled is to CheckIO.  If it hasn't
completed, you will be signaled when it does.

>	A common problem which you might look for is the programmer
>forgetting to set the node type of the IO request to NT_MESSAGE before
>SendIO(), BeginIO(), or DoIO() (not sure about needing it in DoIO()).
>PutMsg() does this for you, but SendIO()/BeginIO()/DoIO() usually don't
>(it depends on the device).

	VERY good point, very often missed.  This should be documented
better:  If in doubt, set the ln_Type to NT_MESSAGE before doing SendIO,
BeginIO, or DoIO.  Most devices will set it for you, but it's safer to
do it yourself.  If you write a device, have your BeginIO vector set it
or the System Police will come looking for you.  :-)

>	Just as PutMsg() sticks in NT_MESSAGE automatically, ReplyMsg()
>sticks in NT_REPLYMSG or NT_FREEMSG automatically.  This is how WaitIO()
>figures out whether the request has been returned by the device yet or not.
>(IOF_QUICK is also checked but that's another story).  Since one doesn't
>PutMsg() to the IO device, one must set ln_Type to NT_MESSAGE manually before
>every SendIO()/BeginIO()/DoIO().

>				-Matt

	Well said, Matt.

	(Have you been disassembling the roms again?  naughty naughty :-)

Randell Jesup, Commodore Engineering {uunet|rutgers|allegra}!cbmvax!jesup

dillon@CORY.BERKELEY.EDU (Matt Dillon) (06/19/88)

>>	Try this:
>>
>>	AbortIO (ConsoleRead);
>>	WaitIO (ConsoleRead);
>>	Wait (1L << ConsoleRead -> io_Message -> mn_ReplyPort -> mp_SigBit);
>>

	WRONG!

	WaitIO() may OR MAY NOT clear the signal bit, it depends whether
or not the request has already been returned.  In any case, if WaitIO()
DOES clear the signal bit, a further Wait() statement will FREEZE the
program.

	This is what WaitIO does (not exactly, but close enough):

	If QUICKIO, return immediately
	If the request is already returned, remove and return immediately
	else Wait() on the signal bit until the request is returned,
	     then remove and return.

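WaitIO(ior)
struct IORequest *ior;
{
    /* Quick IO finished inside BeginIO(); nothing was ever queued. */
    if (ior->io_Flags & IOF_QUICK)
	return(ior);
    /* Still NT_MESSAGE means the device has not replied yet. */
    while (ior->io_Message.mn_Node.ln_Type == NT_MESSAGE)
	Wait(1L << ior->io_Message.mn_ReplyPort->mp_SigBit);
    /* Replied: unlink the request from the reply port's list. */
    Disable();
    Remove(ior);
    Enable();
    return(ior);
}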

	NOTE that WaitIO() ONLY calls Wait() if the request has NOT been
returned yet.

	Therefore, after a call to WaitIO(), the state of the signal bit
is COMPLETELY UNKNOWN.  If you really want to get rid of phantom signals,
the proper sequence is: (if you want to abort the request, stick in an
AbortIO() before the WaitIO()).

	long mask = 1L << ior->io_Message.mn_ReplyPort->mp_SigBit;

	WaitIO(ior);
	SetSignal(0L, mask);

	In the case where you are using the IO port as a multi-drop
response path, it would probably be better to SET the signal after a
WaitIO() instead of clearing it, so your main-loop Wait() doesn't freeze
up when other requests are present on the replyport:

	WaitIO(ior);
	SetSignal(mask, mask);
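
The two SetSignal() variants differ only in what they leave in the signal
word. A small portable model makes that visible; SimTask, sim_setsignal()
and the two after_waitio_* helpers are invented names, and the model
assumes SetSignal(new, mask) replaces exactly the masked bits and returns
the previous state.

```c
#include <assert.h>

/* Hypothetical stand-in for a task's received-signal word. */
struct SimTask { unsigned long sig_recvd; };

/* Models SetSignal(newSignals, signalMask): bits selected by mask take
 * their value from new_bits; the previous signal state is returned. */
unsigned long sim_setsignal(struct SimTask *t,
                            unsigned long new_bits, unsigned long mask)
{
    unsigned long old = t->sig_recvd;

    t->sig_recvd = (old & ~mask) | (new_bits & mask);
    return old;
}

/* Private reply port: after WaitIO(), drop any leftover (phantom) signal. */
void after_waitio_private_port(struct SimTask *t, unsigned long mask)
{
    assert(mask != 0);
    sim_setsignal(t, 0UL, mask);
}

/* Shared reply port: after WaitIO(), force the bit ON so the main-loop
 * Wait() wakes up and looks at whatever else is sitting on the port. */
void after_waitio_shared_port(struct SimTask *t, unsigned long mask)
{
    assert(mask != 0);
    sim_setsignal(t, mask, mask);
}
```

Either way, unrelated signal bits are left alone; only the reply port's
own bit is cleared or set.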


					-Matt

dillon@CORY.BERKELEY.EDU (Matt Dillon) (06/19/88)

>	This is exactly right.  However, a program may wish to Wait() if it
>KNOWS it will be signaled, so it can handle other tasks while waiting.
>The best way to KNOW you will be signaled is to CheckIO.  If it hasn't
>completed, you will be signaled when it does.
	
	A very common problem I see in programs is the assumption that
the IO request has completed when a signal is received.  Definitely
a bad assumption to make (unless you are *very* careful).  CheckIO()
itself doesn't check to see if the request has already been removed 
from the replyport, only if it has been returned to the reply port.

	So, as you said, the purpose of CheckIO() is to, well, check
if the IO is done yet, independent of any signal that may have occurred.

	An io request is:	(1) inactive (not in progress and not
				    returned).  I.e. you haven't 
				    started the IO yet.
				(2) Active but not yet returned
				(3) Active and has been returned, but
				    has not been pulled off the reply 
				    port yet (before you WaitIO()).

				when you pull it off, it is no longer
				Active.

	In most cases, dealing with the Active/InActive state of an io
request (i.e. one that you've SendIO()d vs one that you have pulled 
off the reply port with WaitIO()) is inherent in your software design.
For instance, for serial and console IO I almost always have a request
in progress at all times (for read)... After I process a completed 
request I immediately SendIO() another one.  Thus, there is no question
in any other part of the program that the request is active.

	And since I don't bother to explicitly clear the signal bit
after a WaitIO() (and thus may get false signals), I always call
CheckIO() after receiving a signal to filter out those dirty signals
and call WaitIO() only if CheckIO() returns TRUE.

	However, for WRITEs, my request may be in any of the 3 states
mentioned above, and I need another variable to tell me whether it
is active or not.  I.E. First check the variable, and if it is active
you can then call CheckIO() to determine whether it has been returned
yet, then WaitIO() if it has (or WaitIO() without a CheckIO() if you 
don't mind waiting for it to return).
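
The extra variable for writes might look roughly like this. It is a
portable toy model, not real exec code: SimWrite, write_start(),
sim_device_reply() and write_poll() are invented names, and the comments
mark where the real SendIO(), CheckIO() and WaitIO() calls would sit.

```c
#include <assert.h>

/* Hypothetical stand-ins for illustration; NOT the real exec.library API. */
enum { SIM_NT_MESSAGE = 1, SIM_NT_REPLYMSG = 2 };

struct SimWrite {
    int ln_Type;   /* models io_Message.mn_Node.ln_Type */
    int active;    /* the extra variable: 1 from send until collected */
};

/* Start a write. Real code would SendIO() after stamping the type. */
void write_start(struct SimWrite *w)
{
    assert(!w->active);           /* never re-send a request still in use */
    w->ln_Type = SIM_NT_MESSAGE;
    w->active = 1;
}

/* Models the device returning the request to the reply port. */
void sim_device_reply(struct SimWrite *w)
{
    w->ln_Type = SIM_NT_REPLYMSG;
}

/* Returns 1 if the write was collected on this call, 0 otherwise. */
int write_poll(struct SimWrite *w)
{
    if (!w->active)
        return 0;     /* state 1: nothing outstanding, do not touch it */
    if (w->ln_Type != SIM_NT_REPLYMSG)
        return 0;     /* state 2: CheckIO() would say "not yet" */
    /* state 3: returned but not collected; real code would WaitIO() */
    w->active = 0;
    return 1;
}
```

After a write has been collected, its node type still reads as replied,
so a bare CheckIO()-style test would claim it is done a second time. The
active flag is what stops write_poll() from collecting it twice.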

	I should write a book about it, yah?

					-Matt

cmcmanis%pepper@Sun.COM (Chuck McManis) (06/21/88)

In article <6512@jhunix.HCF.JHU.EDU>, (Daniel Jay Barrett) writes:
> Ever since I've been using my new Avatec 2400-baud modem and VT100 v2.8,
> I've been having problems when I exit VT100.
> Shortly after I click the close gadget and turn off my modem, the
> system either locks up, 

In article <1108@damask.UUCP> hardie@damask.UUCP (Peter Hardie ) writes:
> I have a Hayes smartmodem 1200 and have had this problem for a long time
> now with several other modem/terminal programs. 

Actually, I suspect the problem is in the serial.device. There are a couple
of bugs that seem to bite people and I have hypothesized some ideas although
Bryce will have to be the definitive source of information on these.

A) The serial.device really dislikes getting junk for input, particularly
when it has been closed. For Daniel I would suggest you *not* switch off
your modem. That will probably keep the system from crashing. 

B) Changing the serial.device parameters when it is receiving a character
is a known bug that Matt has a great test case for (the original DTerm).

One hypothesis is that there is a race condition when the serial device is
being unloaded (due to space constraints) and it gets unloaded before its
interrupt server gets removed from the IntServer chain. Character comes in,
it calls the now unloaded possibly corrupt serial device, poof crash-ola.


--Chuck McManis
uucp: {anywhere}!sun!cmcmanis   BIX: cmcmanis  ARPAnet: cmcmanis@sun.com
These opinions are my own and no one elses, but you knew that didn't you.

papa@pollux.usc.edu (Marco Papa) (06/21/88)

In article <1108@damask.UUCP> hardie@damask.UUCP (Peter Hardie ) writes:
>In article <6512@jhunix.HCF.JHU.EDU>, ins_adjb@jhunix.HCF.JHU.EDU (Daniel Jay Barrett) writes:
|| Ever since I've been using my new Avatec 2400-baud modem and VT100 v2.8,
|| I've been having problems when I exit VT100.
|| Shortly after I click the close gadget and turn off my modem, the
|| system either locks up, 
|| Has anybody else experienced this problem?  Could the modem be the
|| cause?  Or must I suspect poor, loveable VT100?
|I have a Hayes smartmodem 1200 and have had this problem for a long time
|now with several other modem/terminal programs. 
|It only happens when the remote end drops the carrier and this occasionally
|causes the amiga to go bye-bye. The mouse can still be moved around but trying
|to terminate vt100 by clicking on the left-hand button doesn't work anymore.

A known problem of the current serial.device is the "lock up" that
can occur when switching baud rates (more so if the rates are VERY different).
Random data, when turning off a modem, can produce the same results.  
I believe this won't be fixed until Bryce rewrites the serial device for
1.4 (Bryce?). So, please don't blame the terminal program(s).

-- Marco Papa 'Doc'
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
uucp:...!pollux!papa       BIX:papa       ARPAnet:pollux!papa@oberon.usc.edu
 "There's Alpha, Beta, Gamma and Diga!" -- Leo Schwab [quoting Rick Unland]
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=