[comp.sys.apollo] DPCI and pty probs.

bep@quintro.uucp (Bryan Province) (06/01/90)

I haven't seen anything about this for a while so I thought it was time to
start it up again.

We have DPCI V5.0 running over ethernet on a DN3000 at SR10.2.  Dterm locks
up several times a day.  I can fix it temporarily, in escalating stages:

	1.  Delete the offending ttyp? and ptyp? from /dev.  The PC attached
	    to that port may or may not need to be rebooted.

	2.  Get everyone off of the pty ports and run /etc/mkdev /dev pty.

	3.  Remake the ptys and reboot the node.
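
Step 1 can be previewed as a dry run before anything is deleted. This is a minimal sketch in Python, assuming only the ttyp?/ptyp? naming given in step 1; the function name find_pty_nodes is hypothetical, and the sketch lists entries without removing them.

```python
import fnmatch
import os


def find_pty_nodes(dev_dir, suffix="?"):
    """Return sorted ttyp/ptyp entries in dev_dir whose suffix matches the pattern.

    The default suffix "?" matches exactly one character, as in ttyp? and ptyp?.
    """
    patterns = ("ttyp" + suffix, "ptyp" + suffix)
    names = os.listdir(dev_dir)
    return sorted(
        name for name in names
        if any(fnmatch.fnmatchcase(name, pat) for pat in patterns)
    )


if __name__ == "__main__" and os.path.isdir("/dev"):
    # Dry run only: show what step 1 would delete, delete nothing.
    for name in find_pty_nodes("/dev"):
        print("would remove", os.path.join("/dev", name))
```

Passing a single character as the suffix, for example find_pty_nodes("/dev", "0"), narrows the listing to one offending pair.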

After step one you can usually use Dterm again but the 'tty' value gets set
to /dev/tty which messes up our login scripts.  When Dterm gets screwed up
usually it also affects rlogin.  I have also seen occasions when two PCs end
up using the same tty port.  We DO have the pty patch #143 loaded and haven't
seen much of a difference.  WHEN WILL APOLLO PPPPPLEASE GET THIS PTY PROBLEM
FIXED?!?!?!?  This is a great source of grief for me so any response is
helpful.
-- 
--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--
Bryan Province     Glenayre Corp.           quintro!bep@lll-winken.llnl.gov
                   Quincy,  IL              tiamat!quintro!bep@uunet
           "Surf Kansas, There's no place like home, Dude."

wjw@eb.ele.tue.nl (Willem Jan Withagen) (06/05/90)

In article <1990Jun1.145336.424@quintro.uucp> bep@quintro.UUCP (Bryan Province) writes:
>We have DPCI V5.0 running over ethernet on a DN3000 at SR10.2.  Dterm locks
>up several times a day.  

Well, we are running just about the same setup from a DN4500, with the same
kind of luck.  (It also runs on a serial line, which uses ptys too, and that
hangs as well.)  As an example, here's a session that hung while executing the
dterm_server; the serial server also uses ptys for its connections:

 1382 ttyp0   S     0:02 dpci_server -netbios dpci1 -line 2 -baud 9600 -retries 100 -signal
 1436 ttyp0   S     0:00 /sys/dpci/dterm_domain /bin/start_csh 1382 sio2 N dpci1 0

You can debug the servers by specifying -debug {1,2,3}, which will give
you all kinds of info.  But the problem here is caused by starting up
/bin/start_csh with something like 'exec /bin/start_csh'; that is where the
server halts.
If one removes all ptys from /dev, then the server will create a temporary pty
and remove it afterwards.  Hence this will "always" work.
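
To see which processes sit on which pty, ps output in the format shown above can be grouped by tty. This is a rough sketch in Python, assuming the five-column layout of that listing (PID, TTY, STAT, TIME, COMMAND); the function name group_by_tty is hypothetical.

```python
from collections import defaultdict


def group_by_tty(ps_lines):
    """Map each tty name to the list of (pid, command) pairs running on it."""
    by_tty = defaultdict(list)
    for line in ps_lines:
        fields = line.split(None, 4)  # PID TTY STAT TIME COMMAND
        if len(fields) < 5 or not fields[0].isdigit():
            continue  # skip header lines and anything malformed
        pid, tty, _stat, _time, command = fields
        by_tty[tty].append((int(pid), command))
    return dict(by_tty)


# The two lines from the listing above.
sample = [
    " 1382 ttyp0   S     0:02 dpci_server -netbios dpci1 -line 2 -baud 9600 -retries 100 -signal",
    " 1436 ttyp0   S     0:00 /sys/dpci/dterm_domain /bin/start_csh 1382 sio2 N dpci1 0",
]
for tty, procs in group_by_tty(sample).items():
    print(tty, [pid for pid, _ in procs])
```

A server and its own child sharing a tty, as here, is expected; the case worth looking for is two unrelated sessions listed under the same tty.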

Just for those interested: I've also booted a server on a DN4000 running OS9.7,
and that works without any problems, except that it's a little slower.
But dterm does not hang.

I also have another "problem" with the DPCI.  The manual says that for all
communication versions you can load the network software in EMS
(except for the DPCI1 version, which I desperately need), but the DPCI501
version complains that the -ems switch is not known, even if it's used as the
first switch.

Questions to those at Apollo: 
    1)	Why is there no -ems for the DPCI1?  It should not
	be too hard, since the ones with dpciring and dpci503 DO have this
	working.
    2)	When will the limit of 63 processes be lifted?  We quite often
	run out of processes.  (Every DPCI uses 2-5 processes.)
	I know that it's promised for OS10.3, but when is that going to be
	released, and (more important:) shipped?
    3)  Why is it so hard to fix something so essential?
	I know it's the wrong question from the problem side, but
	seen from my (and others') side it's the right question.  We're
	going to invest more in the DPCI when and if the pty problem is
	fixed.  But I don't like to be nagged by users wanting new
	ptys because the old ones are "used up".
	Not only the DPCI suffers; more essential are the TCP programs,
	which get messed up too.

So far there have been only two DPCI users on the net. Are there any more
out there, and are you happy??

Greetings,


	Willem Jan Withagen               

Eindhoven University of Technology   DomainName:  wjw@eb.ele.tue.nl    
Digital Systems Group, Room EH 10.10 BITNET: ELEBWJ@HEITUE5.BITNET
P.O. 513                             Tel: +31-40-473401
5600 MB Eindhoven                 
The Netherlands
 

bep@quintro.uucp (Bryan Province) (06/06/90)

In article <1990Jun1.145336.424@quintro.uucp> bep@quintro.UUCP (Bryan Province) writes:
>We have DPCI V5.0 running over ethernet on a DN3000 at SR10.2.  Dterm locks
>up several times a day.  . . .
>We DO have the pty patch #143 loaded

This is a followup to my own article.  I was misled about the patch.  The
proper one is #139 and can be found on patch tape 9005_1.  I've loaded the
patch and have seen some different things happen but not complete resolution.
I'm told that this patch has fixed several other sites with the same problem.
Our main problem is that we get two different PCs or a PC and an rlogin
acquiring the same pty port after Dterm doesn't release it correctly.

I'll post a summary whether or not my problems get fixed.

bep@quintro.uucp (Bryan Province) (06/12/90)

OK DPCI and PTY fans, here are the results of my investigations.

First of all, if you are just having problems with rlogin, telnet, etc. try
loading patch number 139.  This is a new version of /lib/streams for SR10.2
nodes.  It supposedly takes care of a lot of problems with ptys needing to be
recreated all of the time.  It was also supposed to take care of problems
with DPCI and DTERM but it didn't solve my problems.

As for my problems, and possibly yours, Apollo is investigating them.  There is
an APR out on DPCI running at SR10.2 but I don't have the number with me.
The nature of my problem is as follows.  When you first bring up DPCI and
establish virtual connections a dpci_server process is started on the node
but no ptys are allocated to it.  When you bring up DTERM a pty is associated
with the dpci_server process and any other processes you create within DTERM.
When you get out of DTERM the pty is still (sort of) associated with the
dpci_server process.  The problem is when someone else either uses rlogin,
telnet, or DTERM to get on the same node, that process will get the same pty
as the previous DTERM process.  So now the original dpci_server and the new
login process both have the same port.  Now if the original dpci_server user
tries to bring up DTERM it also gets the same port and thus the pty becomes
corrupted.
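
The sequence described above can be reproduced with a toy model. This is only a sketch of the symptom, not of how Domain/OS actually manages ptys: it assumes an allocator that always hands out the lowest-numbered free pty, and the class ToyPtyPool, the flag release_while_held, and the process names are all hypothetical.

```python
class ToyPtyPool:
    """Toy allocator that hands out the lowest-numbered pty it believes is free."""

    def __init__(self, count):
        self.free = set(range(count))
        self.holders = {n: set() for n in range(count)}  # pty -> process names

    def allocate(self, proc):
        n = min(self.free)  # raises ValueError if no pty is free
        self.free.remove(n)
        self.holders[n].add(proc)
        return n

    def attach(self, n, proc):
        """A child of an existing holder starts using pty n."""
        self.holders[n].add(proc)

    def end_session(self, n, procs, release_while_held):
        """The given processes leave pty n.

        With release_while_held=True the pty is marked free even though
        another process (here the dpci_server) still holds it.
        """
        self.holders[n] -= set(procs)
        if release_while_held or not self.holders[n]:
            self.free.add(n)


def scenario(release_while_held):
    pool = ToyPtyPool(4)
    a = pool.allocate("dpci_server_A")  # user A brings up DTERM
    pool.attach(a, "csh_A")             # shell started inside DTERM
    pool.end_session(a, ["csh_A"], release_while_held)  # A leaves DTERM
    b = pool.allocate("rlogin_B")       # someone else logs in
    return a, b, sorted(pool.holders[b])


print(scenario(True))   # (0, 0, ['dpci_server_A', 'rlogin_B'])
print(scenario(False))  # (0, 1, ['rlogin_B'])
```

In the first run the new login lands on the pty the server never gave up, which matches the collision described; in the second, the pty stays reserved until its last holder is gone and the new login gets a different one.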

I tried loading patch 139, rebuilding the ptys, and rebooting but to no
avail.  I also tried loading new dpci_server and dterm_domain files from
Apollo but that didn't help either.  There is a 5.1 version of DPCI coming
out but my impression from Apollo is that it still doesn't have this problem
fixed.

If anyone else is having similar or different problems I'd appreciate hearing
about them.  My advice is try patch 139 from the April patch tape (9005).
Apollo says that it has solved problems with other sites.