rjs@a.cs.okstate.edu (Roland Stolfa) (04/21/87)
Hello,

Several weeks ago, I submitted a letter about some Intel 286/310 boxes that I was having problems with. I wish to thank all those who answered my plea. I am, however, still having problems. Therefore, I would like to restate the problems that I am having, hoping that one of you in netland can help me out some more...

Here at Oklahoma State University, we have 6 Intel 286/310's running Xenix 3.0 (that's right 3.0) Update 1, with OpenNet 1.0 Update 1. Each of the 310's has a 40M drive, 1M of memory (this is a correction), and one iSBC 188/48 Version 1.1. Currently this machine set is serving about 150 students (usually no more than 25 at a time). Each student has NO idea which machine they are logging onto.

When we originally set this system up, we thought that the OpenNet product would handle the load of having any student log onto any machine and have OpenNet route their session to the correct home directory. However, this seems not to be the case. When the load reaches about 20 students (usually just before an assignment is due :-) the system seems to handle everything just fine. However, after the system has sustained this for a "while", one machine in the set will lock up. By that, I mean that someone on machine A, trying to access machine B (which is locked up), fails. However, someone on machine B trying to access ANY other machine succeeds. Furthermore, that student never even knows that machine B is unreachable by fellow classmates who have home directories on machine B.

When I try to fix the situation, I plug a terminal into the console port on machine B, and it just sits there. No response to <CR>, <DEL>, etc. My only apparent solution has been to shut down machine B and reboot it. However, this then usually causes the next machine in the "/net/data" file to suffer the same problems that I was trying to fix on machine B.

Any and all assistance in this matter would be greatly appreciated.
Please mail any questions, comments, and/or help directly to me.

Roland J. Stolfa
Department of Computing and Information Sciences
Oklahoma State University
UUCP: {cbosgd, ea, ihnp4, isucs1, mcvax, pesnta, uokvax}!okstate!rjs
Internet: rjs@a.cs.okstate.edu

Disclaimer: You have lost your MIND if you think ANYBODY speaks for ...

cmcmanis@sun.uucp (Chuck McManis) (04/22/87)
In article <1910@a.cs.okstate.edu>, rjs@a.cs.okstate.edu (Roland Stolfa) writes:
> Hello,
>
> Several weeks ago, I submitted a letter about some Intel 286/310 boxes that
> I was having problems with. I wish to thank all those who answered my plea.
> I am, however, still having problems. Therefore, I would like to restate the
> problems that I am having, hoping that one of you in netland can help me out
> some more...
>
> Here at Oklahoma State University, we have 6 Intel 286/310's running
> Xenix 3.0 (that's right 3.0) Update 1, with OpenNet 1.0 Update 1.
> Each of the 310's has a 40M drive, 1M of memory (this is a correction),
> and one iSBC 188/48 Version 1.1.

I assume you also have a 186/51 Ethernet board?

> ... When the load reaches about 20 students (usually just before an
> assignment is due :-) the system seems to handle everything just fine.
> However, after the system has sustained this for a "while", one machine
> in the set will lock up...
> ... When I try to fix the situation, I plug a terminal into the console
> port on machine B, and it just sits there. No response to <CR>, <DEL>, etc.

First, the question comes to mind: "Why isn't there a terminal *always* plugged into the console port?" You may have missed a message like "panic: interrupt" or something. But since you obviously don't have one connected, you can check whether the kernel has panicked by jumping into the debug monitor. (To enable the monitor, build the kernel with the line "debug 1" in the xenix.conf file.)

Push the "I" button on the front panel and single-step along until you get out of the debug interrupt routine. Is the machine looping on a 'Halt' instruction? If so, your kernel probably died. If not, then the only other problem could be that it called 'sleep' in the Ethernet driver and missed the wakeup interrupt. If that is the case, the statement(s) after the halt instruction will check to see which interrupt broke the machine out of the halt and basically check for the sleep condition.

> Roland J. Stolfa
> Department of Computing and Information Sciences

--
--Chuck McManis
uucp: {anywhere}!sun!cmcmanis   BIX: cmcmanis   ARPAnet: cmcmanis@sun.com
These opinions are my own and no one else's, but you knew that didn't you.