[comp.unix.xenix] Errors in PS

vijay@lll-winken.LLNL.GOV (Vijay Subramaniam) (01/30/89)

Has anyone else been having trouble with ps giving a "ps: seek error"?
The machine I am having this trouble on is running SCO Xenix/386 version 2.3.1 and TCP/IP networking software from Streamlined Networks.
If you have seen this on your Xenix machine, do you have a fix for it?

                                  Vijay

karl@ddsw1.MCS.COM (Karl Denninger) (01/31/89)

In article <19496@lll-winken.LLNL.GOV> vijay@lll-winken.UUCP (Vijay Subramaniam) writes:
>Has anyone else been having trouble with PS giving a ps: seek error?
>The machine I am having this trouble on, is a SCO Xenix/386 running version 2.3.1 and tcp/ip networking software from Streamlined Networks.
>Has anyone else been having this trouble on their Xenix machine? and if so..do you have a fix for it?

Call SCO, ask for the Media desk, give them your customer key number (or
register your OS if you haven't done so) and ask for fix disk "xnx120", the
"ps seek error fix" for SCO Xenix V/386 Version 2.3.

That should do the job.

--
Karl Denninger (karl@ddsw1.MCS.COM, ddsw1!karl)
Data: [+1 312 566-8912], Voice: [+1 312 566-8910]
Macro Computer Solutions, Inc.    	"Quality solutions at a fair price"

jom@belltec.UUCP (Jerry Merlaine) (01/31/89)

In article <19496@lll-winken.LLNL.GOV>, vijay@lll-winken.LLNL.GOV (Vijay Subramaniam) writes:
> Has anyone else been having trouble with PS giving a ps: seek error?
> The machine I am having this trouble on, is a SCO Xenix/386 
> running version 2.3.1 and tcp/ip networking software from 
> Streamlined Networks.
> Has anyone else been having this trouble on their Xenix machine?
> and if so..do you have a fix for it?

Well, the first step is to check the manual, then 
call Streamlined for technical support.

The XENIX /dev/kmem has a bug where kernel memory which has been
allocated by malloc() can't be looked at via /dev/kmem.  This is probably
not it.

Also, the Streamlined software adds lots more symbols to the XENIX kernel
and maybe ps blows up because of this.

Also, the Streamlined installation puts the TCP kernel in /xenix.snip
and tells you to boot with that and check everything out before making
it your normal kernel file in /xenix.  Maybe you're running off of
/xenix.snip while ps is taking its name list from /xenix, in which case
the addresses it looks up won't match the running kernel.
'ps -efn /xenix.snip' may work.
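
The name-list theory above can be tried out on a toy model.  The sketch
below is Python, not anything from Xenix: KMEM, the two name lists, and
the "_nproc" symbol are all made up for illustration, and whether the
real ps fails in exactly this way is only a guess.  It shows the general
pattern of looking a symbol up in a name list, seeking to that address,
and reading, and how an address taken from the wrong kernel file leaves
nothing sensible to read.

```python
import io
import struct

# Hypothetical toy "kernel memory": 64 bytes, with a 4-byte counter
# stored at offset 16.  A real ps would open /dev/kmem instead.
KMEM = bytearray(64)
struct.pack_into("<I", KMEM, 16, 3)

# Two hypothetical name lists mapping a symbol to its address.
NAMELIST_RUNNING = {"_nproc": 16}    # matches the kernel that was booted
NAMELIST_STALE = {"_nproc": 4096}    # taken from a different kernel file


def read_symbol(kmem, namelist, symbol):
    """Look up a symbol's address, seek there, and read a 4-byte integer."""
    addr = namelist[symbol]
    dev = io.BytesIO(kmem)
    dev.seek(addr)
    data = dev.read(4)
    if len(data) != 4:
        # The address made no sense for this memory image.
        raise OSError("ps: seek error")
    return struct.unpack("<I", data)[0]
```

With NAMELIST_RUNNING the lookup returns the stored value; with
NAMELIST_STALE the read comes back short and the function raises the
error.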

It works on our XENIX 2.3 machine.   What can I say?

Jerry Merlaine
pacbell.com!belltec!jom

rick@pcrat.UUCP (Rick Richardson) (02/01/89)

In article <19496@lll-winken.LLNL.GOV>, vijay@lll-winken.LLNL.GOV (Vijay Subramaniam) writes:
> Has anyone else been having trouble with PS giving a ps: seek error?
> The machine I am having this trouble on, is a SCO Xenix/386 
> running version 2.3.1 and tcp/ip networking software from 
> Streamlined Networks.

This may be related.  Or not.  I just saw a "ps: seek error" under
ISC 386/ix version 1.0.6 (the first and only time this has happened).
The situation was a coding error (mine!) which started a program doing
infinite recursion.  While the stack was growing in leaps and bounds
I found that I could not interrupt it from the keyboard.  No
amount of INT character pounding stopped it, even though the program
did not catch any signals.  I switched to the console VT and issued
a "ps" to see what process # to kill.  The system was very sluggish,
with almost continuous disk activity going on.  The "ps" printed
a few lines and then gave the "ps: seek error".  By the time all this
finished, my errant program had exited, taking with it the shell
on that VT.  The system returned to normal at this point and I just
logged in again and fixed the bug.

-Rick
-- 
Rick Richardson | JetRoff "di"-troff to LaserJet Postprocessor|uunet!pcrat!dry2
PC Research,Inc.| Mail: uunet!pcrat!jetroff; For anon uucp do:|for Dhrystone 2
uunet!pcrat!rick| uucp jetroff!~jetuucp/file_list ~nuucp/.    |submission forms.
jetroff Wk2200-0300,Sa,Su ACU {2400,PEP} 12013898963 "" \d\r\d ogin: jetuucp

derekv@dvlmarv.UUCP (Derek Vair) (02/02/89)

I haven't seen your bug, but SCO knows about it.

We just received the January 1989 SCO "Support Level Supplement and Update
Catalog" in the mail.  On the second page, there is an entry for the
"PS Seek Error Supplement" for SCO Xenix 2.3.  It is SLS #xnx120.  It's dated
01/01/89, so it's a very recent fix.

Derek Vair
The Software Group Limited

jr@oglvee.UUCP (Jim Rosenberg) (02/04/89)

In article <667@pcrat.UUCP> rick@pcrat.UUCP (Rick Richardson) writes:
>In article <19496@lll-winken.LLNL.GOV>, vijay@lll-winken.LLNL.GOV (Vijay Subramaniam) writes:
>> Has anyone else been having trouble with PS giving a ps: seek error?
>This may be related.  Or not.  I just saw a "ps: seek error" under
>ISC 386/ix version 1.0.6 (first and only one time this happened).
>The situation was a coding error (mine!) which started a program doing
>infinite recursion.  While the stack was growing in leaps and bounds

[...]

>The system was very sluggish,
>with almost continuous disk activity going on.  The "ps" printed
>a few lines and then gave the "ps: seek error".

I probably shouldn't jump into this, since I've never had the privilege of
breaking the bonds of second class citizenship in the UNIX world by having
access to source code, so I'm only speculating.  (A euphemism for bs'ing!)

It appears that ps needs to do i/o on /dev/swap in order to fill in some of
the fields it needs, depending on the state of the process it's listing.
Exactly why this should be true on a 386, which *OUGHT* to be a paging system
rather than a swapping system, I can't say, but I know that on our Xenix
System V (**not** V.3, alas) ps needs read permission on /dev/swap.  I believe
even on a paging system it pages out to /dev/swap, or the like, right?  Rick's
machine probably started thrashing when he got uncontrolled stack growth.
Could it be that the paging code somehow stepped on the toes of an lseek ps
was doing on /dev/swap?

The way that ps works is an utter abortion, IMHO.  Many, many, many operating
systems have a system call to yield up process table entries.  While I surely
place great weight on those wise souls who argue that both System V and BSD
are bloated as it is & the last thing we need is more system calls, don't we
really need *something* more than reading /dev/mem and fishing things out of
the name list just to get something the kernel should hand us for the asking?
When ps has to go to the extreme of pawing its way through /dev/swap it seems
to me something is drastically wrong.  I should be able to read section 2 of
the man pages and bang out a respectable ps without that much work.
-- 
Jim Rosenberg                        pitt
Oglevee Computer Systems                 >--!amanue!oglvee!jr
151 Oglevee Lane                      cgh
Connellsville, PA 15425                                #include <disclaimer.h>

gwyn@smoke.BRL.MIL (Doug Gwyn ) (02/04/89)

In article <464@oglvee.UUCP> jr@.UUCP (Jim Rosenberg) writes:
>The way that ps works is an utter abortion, IMHO.

Yes; fixed in SVR4 (I think).  Not with more system calls, though.
Read the paper "Processes as Files" in one of the USENIX proceedings.

By the way, your guesses about "ps" operation were pretty good.

guy@auspex.UUCP (Guy Harris) (02/05/89)

>It appears that ps needs to do i/o on /dev/swap in order to fill in some of
>the fields it needs, depending on the state of the process it's listing.
>Exactly why this should be true on a 386, which *OUGHT* to be a paging system
>rather than a swapping system, I can't say,

The name "swap" in "/dev/swap" is, in part, historical.  In many paging
UNIX systems, "/dev/swap" refers to the area where pages (or, at least,
pages not directly backed by blocks in a file) get written if necessary.
(On some systems it may be known as "/dev/drum" instead, which is, these
days, a historical name as well - did any BSD release *ever* use a drum
as a paging device?  Did DEC ever *sell* a drum as a PDP-11 or VAX
peripheral?)

In addition, paging systems *do* swap, on occasion; if there are too
many processes swapped in and competing for physical page frames, your
system will thrash.  Paging systems may "swap out" processes, which will
mark them as not eligible to be run, in order to prevent this from
happening.

Some of the information about the process that "ps" wants to print is
stored in a region called the "U area", which tends to be swapped out
when the process is swapped out.
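
As a rough illustration of why the swap device comes into it, here is a
toy model in Python of a ps-like reader choosing where to fetch a U area
from.  The ProcEntry fields, U_SIZE, and BLOCK_SIZE are assumptions made
for the sketch, not the layout of any real kernel, and the two "devices"
are just in-memory buffers standing in for the memory and swap devices.

```python
import io
from dataclasses import dataclass

U_SIZE = 8          # hypothetical size of a U area, in bytes
BLOCK_SIZE = 512    # swap addresses are counted in 512-byte blocks here


@dataclass
class ProcEntry:
    pid: int
    loaded: bool    # True if the process is resident in memory
    addr: int       # byte address in memory if loaded, else a swap block number


def read_u_area(proc, mem, swap):
    """Fetch the U area from memory or from swap, wherever the process is."""
    if proc.loaded:
        dev, offset = mem, proc.addr
    else:
        dev, offset = swap, proc.addr * BLOCK_SIZE
    dev.seek(offset)
    data = dev.read(U_SIZE)
    if len(data) != U_SIZE:
        # The process table entry pointed somewhere that holds no U area.
        raise OSError("ps: seek error")
    return data
```

If the process table entry is stale by the time the read happens (the
process was swapped, grew, or exited in between), the address can point
at nothing useful, which in this model shows up as the short read.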

>I believe even on a paging system it pages out to /dev/swap, or the
>like, right?

Right.

>Could it be that the paging code somehow stepped on the toes of an lseek ps
>was doing on /dev/swap?

It could be.  To quote from the BUGS section of the SunOS 4.0 PS(1)
manual page:

     Things can change while ps is running; the picture it  gives
     is only a close approximation to the current state.

This appears in other PS(1) manual pages as well; I think it may date
back to V7.

>The way that ps works is an utter abortion, IMHO.  Many, many, many operating
>systems have a system call to yield up process table entries.

Some versions of UNIX have a better mechanism for this than either the
current mechanism *or* a system call.  They have a fake "file system"
mounted on the directory "/proc".  In this directory appear files with
names that are ASCII versions of process IDs; for instance, "/proc/1" (I
don't know if they have leading zeroes or not; I'm assuming they don't,
here) is the file for the "init" process.

You can open this file if you have permission (as I understand it, the
file appears to have, as its owner and group owner, the effective UID
and GID under which it's running, and to have permissions "rw-------",
so only the UID under which it's running or the super-user can open it).
You can read or write the file to read or modify the address space of
the process; you can issue various "ioctl"s to control the process -
stop it, start it, catch various signals, read its U area, etc.

This can serve as a replacement for "ptrace", of course.  It can also
serve as a mechanism that "ps" can use.
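
To make the directory layout concrete, here is a minimal Python sketch
of the first step a ps built on such a file system might take: listing
the directory and keeping the entries whose names are process IDs.  The
function name list_pids and the default path are assumptions for
illustration; the sketch relies only on the naming scheme described
above, and it accepts names with or without leading zeroes since that
detail is left open.

```python
import os


def list_pids(proc_dir="/proc"):
    """Return the process IDs named by the entries of proc_dir, sorted."""
    pids = []
    for name in os.listdir(proc_dir):
        # Keep only names made up entirely of ASCII digits, e.g. "1" or "00042".
        if name.isascii() and name.isdigit():
            pids.append(int(name))
    return sorted(pids)
```

Everything after that (opening an entry, issuing ioctls on it) depends
on the particular system and is not shown here.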

debra@alice.UUCP (Paul De Bra) (02/05/89)

In article <9593@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>In article <464@oglvee.UUCP> jr@.UUCP (Jim Rosenberg) writes:
>>The way that ps works is an utter abortion, IMHO.
>
>Yes; fixed in SVR4 (I think).  Not with more system calls, though.
>Read the paper "Processes as Files" in one of the USENIX proceedings.

This "Processes as Files" way of implementing ps, as done in the Eighth
and Ninth Edition Unix, does not guarantee flawless behaviour of ps though!

The old remark in some man-pages remains in effect:
"Things can change while ps is running."

The Unix system is not "frozen" while ps is trying to get the info about
the processes. While ps is trying to read info about a process that process
may change states, grow, be swapped out, or die, all of which may confuse
ps. The "ps: seek error" can still occur.
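
A small Python sketch of what coping with this might look like for a
reader that walks a directory of per-process files: it treats an entry
that vanishes between the listing and the open as normal and skips it.
The function name snapshot, its names parameter, and the
one-file-per-process layout are assumptions for illustration, not a
description of any particular ps.

```python
import os


def snapshot(proc_dir, names=None):
    """Return {pid: contents} for each process file that could still be read.

    names is the directory listing taken earlier; if omitted it is read now.
    """
    if names is None:
        names = os.listdir(proc_dir)
    result = {}
    for name in names:
        if not (name.isascii() and name.isdigit()):
            continue
        try:
            with open(os.path.join(proc_dir, name), "rb") as f:
                result[int(name)] = f.read()
        except (FileNotFoundError, PermissionError):
            # The process exited, or changed identity, after it was listed.
            continue
    return result
```

The result is still only an approximation: each entry was read at a
slightly different moment, and processes created after the listing was
taken are missed entirely.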

Paul.
-- 
------------------------------------------------------
|debra@research.att.com   | uunet!research!debra     |
------------------------------------------------------

raf@andante.UUCP (Roger Faulkner) (02/06/89)

In article <8869@alice.UUCP> debra@alice.UUCP (Paul De Bra) writes:
>In article <9593@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>
>>Yes; fixed in SVR4 (I think).  Not with more system calls, though.
>>Read the paper "Processes as Files" in one of the USENIX proceedings.
>
>This "Processes as Files" way of implementing ps, as done in the Eighth
>and Ninth Edition Unix, does not guarantee flawless behaviour of ps though!

The /proc process filesystem as implemented in SVR4 differs in
detail, not in concept, from Eighth and Ninth Edition Unix systems.
In particular, there is one ioctl() operation that fetches all of
the information needed by ps(1).  ioctl() operations are guaranteed
to be atomic wrt the target process, so what you get is a flawless
snapshot of the process at the moment of the ioctl().

The comment in the ps(1) man page:
     Things can change while ps is running; the picture it
     gives is only a close approximation to the current state.
continues to be true wrt the full ps(1) listing.  To have
ps(1) stop the whole system (except itself) in order to
give a correct total snapshot would be overkill (i.e.,
users would undoubtedly kill their AT&T representatives).

Guy Harris's description (article <951@auspex.UUCP> guy@auspex.UUCP)
of the operation of /proc is correct (and lucid).  However, in the
SVR4 implementation the security provisions are more stringent:

- Except for the super-user, an open() of a /proc file will fail
  unless both the user-id and the group-id of the caller match
  those of the target process and unless the process's a.out is 
  readable by the caller.
- Setuid and setgid processes can be opened only by the super-user.
- An open file descriptor will become invalid if the target process
  exec()s a setuid/setgid or unreadable object file.  Any operation
  on an invalid file descriptor (except close()) returns an error.
  (Previous implementations either failed the target's exec() or
  silently disallowed the setuid/setgid, causing an inspected
  process to malfunction.)
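
The open() rules in the first two items can be written down as a small
decision function.  This is a Python sketch of the rules as stated
above, not the SVR4 code: the Caller and Target types and the may_open
name are hypothetical, and the sketch does not say whether real or
effective IDs are compared, since the description does not either.

```python
from dataclasses import dataclass


@dataclass
class Caller:
    uid: int
    gid: int


@dataclass
class Target:
    uid: int
    gid: int
    is_setid: bool        # the process is running a setuid or setgid program
    aout_readable: bool   # the caller has read access to the process's a.out


def may_open(caller, target):
    """Decide whether open() of the target's /proc file should succeed."""
    if caller.uid == 0:
        return True       # the super-user is exempt from the checks
    if target.is_setid:
        return False      # setuid/setgid processes: super-user only
    return (caller.uid == target.uid
            and caller.gid == target.gid
            and target.aout_readable)
```

The third item, invalidating an already open descriptor when the target
exec()s a setuid/setgid or unreadable file, is a change of state over
time and is not captured by a one-shot check like this.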

In article <464@oglvee.UUCP> jr@.UUCP (Jim Rosenberg) writes:
>I should be able to read section 2 of
>the man pages and bang out a respectable ps without that much work.

With /proc, you would be able to do this (provided you can be
super-user on your system).  You'll have to read a different
section of the manual, though.  It will be either 4 or 7.

Disclaimer:  I don't officially represent AT&T, my opinions
are my own, but what I have described to you is fact.
	Roger A. Faulkner
	allegra!raf

friedl@vsi.COM (Stephen J. Friedl) (02/06/89)

In article <15827@andante.UUCP>, raf@andante.UUCP (Roger Faulkner) writes:
> 
> The /proc process filesystem as implemented in SVR4 differs in
> detail, not in concept, from Eighth and Ninth Edition Unix systems.

     The /proc filesystem is also found on the AT&T 3B15 running
Sys V Release 3 and probably on the 3B4000 as well.

     It's *cool* :-)

     Steve

-- 
Stephen J. Friedl        3B2-kind-of-guy            friedl@vsi.com
V-Systems, Inc.       I speak for you only      attmail!vsi!friedl
Santa Ana, CA  USA       +1 714 545 6442    {backbones}!vsi!friedl
Nancy Reagan on these *stupid* .signatures: "Enough already, OK?"