LSCHULTZ@WOOSTER (02/03/90)
On Friday (of course) at approximately 11:15 am, my system crashed while a
PMDF batch job was just starting to execute. The crash was caused by an
SSRVEXCEPT (unexpected system service exception) bugcheck; the current image
file was LOGINOUT.EXE. Everything seemed fine after rebooting the system, so
I went to lunch.

The PMDF software worked fine for about 1-1/2 hours. Then it seemed to get
stuck, repeating the following error messages (in L_MASTER.LOG in
pmdf_root:[log]) over and over:

  %PMDF-W-V5DETECT, MAIL CNCT block not found -- assuming VMS V5
  %PMDF-F-GETFILERR, Error getting file parameter off command line
  -CLI-W-ABSENT, entity or value absent from command string

The system had run non-stop for the better part of four weeks before the
crash. I have stopped and restarted both the mail$batch queue and Jnet with
no success. Since I was involved in a number of other projects, I did not
notice there was a problem until later in the day, and now the mail$batch
queue has gotten HUGE!

Any suggestions will be greatly appreciated.

Lee Schultz
Assistant Director, Academic Computing Services
College of Wooster
Wooster, Ohio 44691
(216) 263-2242
LSCHULTZ@WOOSTER.BITNET
TERRY@SPCVXA (Terry Kennedy, Operations Mgr) (02/03/90)
Lee Schultz writes:

> The PMDF software worked fine for about 1-1/2 hours. Then it seemed to
> get stuck, repeating the following error message (in L_MASTER.LOG in
> pmdf_root:[log]) over and over:
> %PMDF-W-V5DETECT, MAIL CNCT block not found -- assuming VMS V5
> %PMDF-F-GETFILERR, Error getting file parameter off command line
> -CLI-W-ABSENT, entity or value absent from command string

The first error is actually normal. In PMDF 3.0, PMDF looked around in MAIL
to determine whether the system was running VMS V4 or V5. PMDF V3.1 now has
separate images for VMS V4 and V5, so this message is a historical one. I
believe Ned has stated that it will be going away in a future release
(probably once V4 support is dropped).

We had the second error every now and again when we were on VMS V5.2. It was
actually non-fatal (at least for us). I think Ned said it was a problem in
the VMS side of things. I haven't seen it since we went to V5.3.

In any event, queued files will be retried the next time the periodic
delivery job runs. You could change the interval to a shorter period if you
like. I think the default is four hours, but you can trim that way down. We
had it at 15 minutes while we were suffering from the problem.

Terry Kennedy
St. Peter's College
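Terry's point about the periodic delivery job can be pictured as a simple
sweep-and-retry loop: each pass attempts everything in the queue, and
anything that fails delivery just stays queued for the next pass, so the
job's interval bounds how long a message can sit. Here is a minimal sketch
in Python (not PMDF code; all names are illustrative) of that behavior:

```python
# Illustrative sketch of a periodic delivery pass: failed messages remain
# queued and are retried on the next pass. Not PMDF's actual implementation.
from collections import deque

def delivery_pass(queue, deliver):
    """One pass of the periodic job: try everything still in the queue."""
    retry = deque()
    while queue:
        msg = queue.popleft()
        if not deliver(msg):
            retry.append(msg)   # left in the queue for the next pass
    queue.extend(retry)

queue = deque(["MSG001", "MSG002"])
attempts = {"MSG001": 0, "MSG002": 0}

def flaky_deliver(msg):
    # MSG002 fails on its first attempt but succeeds on the second,
    # like a message hit by the transient GETFILERR error.
    attempts[msg] += 1
    return msg != "MSG002" or attempts[msg] > 1

delivery_pass(queue, flaky_deliver)   # first pass: MSG002 stays queued
delivery_pass(queue, flaky_deliver)   # second pass: queue drains
print(list(queue))
```

Shortening the interval between passes (four hours down to 15 minutes, as
Terry did) does not fix the underlying error; it only shortens the worst-case
wait before the retry happens.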
NED@YMIR (Ned Freed, Postmaster) (02/04/90)
The simple explanation is that the PMDF-VMS MAIL interface could not read the
name of the queued file to process off the command line. Why not? Well, what
usually happens is that MAIL fires up PMDF, it reads the file name fine, it
tries to open the file and gets an error. This is duly reported to VMS MAIL.
But then VMS MAIL calls PMDF again. This time, when PMDF goes to read that
parameter, it gets an error since it has already been read. This then
essentially aborts VMS MAIL. You can get a lot of these if the timing is just
right, i.e. one delivery job is following on the heels of another, trying to
deliver the messages the other job has already processed.

I've piddled around with this code a fair amount, and I have not found any
way to fix this problem. I've never seen it last for any great length of
time, and as Terry said, the mail always seems to get delivered on the next
pass. I think there's a bug in the initialization of the foreign protocol
interface on the incoming side, but I've never been able to locate it on the
microfiche.

There's also a mechanism that forces MAIL to loop and call PMDF over and over
again. This would be very useful if it worked properly, since then a single
invocation of VMS MAIL would process every message (a considerable savings in
overhead). There's experimental code in PMDF_MAIL.PAS to operate this way,
but I've never been able to get it to work right -- it always bombs out of
MAIL on the second pass. I don't know why this happens, but it does. I have
not seen this problem under VMS 5.3 either, and I have not tried the loop
code since 5.2. I'll try to look into this some more in the future, but the
technique used now does seem to work pretty well.

If the messages are actually stuck in the queue, that's another kettle of
fish. I'd recommend turning on slave_debug on the l channel to get an idea of
what's really going on if messages are actually stuck and are not being
delivered.

Hope this helps.

Ned
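The failure mode Ned describes -- a parameter that can be read off the
command line only once, so a repeat invocation finds it absent -- can be
modeled in a few lines. This is a conceptual sketch in Python, not PMDF's
VMS Pascal code; the class and method names are invented for illustration:

```python
# Sketch of the one-shot parameter race: the first call reads the file name,
# the second call (made after MAIL re-invokes PMDF) finds nothing left,
# analogous to %PMDF-F-GETFILERR / -CLI-W-ABSENT. Names are illustrative.

class ForeignProtocolContext:
    """Models VMS MAIL handing PMDF a one-shot file-name parameter."""

    def __init__(self, filename):
        self._param = filename

    def get_file_parameter(self):
        # The parameter is consumed on first read; a second read fails.
        if self._param is None:
            raise LookupError(
                "%PMDF-F-GETFILERR, Error getting file parameter "
                "off command line")
        value, self._param = self._param, None
        return value

ctx = ForeignProtocolContext("MSG001.00")

# First call: PMDF reads the name fine. (In the real race, opening the file
# then fails because another delivery job already processed it, and that
# error is reported back to VMS MAIL.)
name = ctx.get_file_parameter()

# MAIL then calls PMDF again; this time the parameter has already been read.
try:
    ctx.get_file_parameter()
    second_error = None
except LookupError as err:
    second_error = str(err)
```

As in Ned's account, nothing is lost by this: the queued file is simply
picked up on the next pass of the periodic delivery job.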