[comp.unix.internals] Interfaces for accessing kernel memory

tchrist@convex.com (Tom Christiansen) (11/29/90)

We've all written (or at least seen) programs that make kernel dives
into /dev/mem using an nlist from /vmunix in order to figure out what
some particular kernel variable looks like, or perhaps even to change
it.

Sun has libkvm that helps speed this up a bit, and Convex has a similar
libvm that does the same thing.  Still, this isn't what you really want
for a robust, clean interface.  I'd like a scheme that wouldn't break
just because the version of the O/S changed and the structures changed
size, moved around, or whatever.  And I'd really like not to have to
recompile, even at the expense of not seeing new fields just added to
a certain structure.

I've heard that some vendors have a system (or is it library) call for
returning you chunks out of the kernel.  Can anyone tell me how they
seem to work?  I'm talking manpage level details here, not what the
actual source code looks like.

Also, what would you *like* to see in such an interface?  What about
arrays of structures?  Wouldn't it be nice to be able to get the first,
then the next element until null, of the array?
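
One way to picture the first/next idea is a small cursor interface.
The sketch below is purely illustrative: kt_cursor, kt_first, kt_next
and struct kt_proc_info are names made up for this example, and the
"kernel table" is just a static array in user space standing in for
whatever the kernel would walk on the caller's behalf.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical exported record: NOT any real kernel's proc structure. */
struct kt_proc_info {
    int  pid;
    int  uid;
    char comm[16];
};

/* Stand-in for the kernel's private table (a slot with pid 0 is unused). */
static const struct kt_proc_info fake_table[] = {
    { 1,  0,   "init" },
    { 0,  0,   ""     },    /* free slot, skipped by the iterator */
    { 42, 100, "csh"  },
    { 57, 100, "vi"   },
};
#define FAKE_NSLOTS (sizeof fake_table / sizeof fake_table[0])

/* Cursor state: the caller never indexes the table directly. */
struct kt_cursor { size_t next_slot; };

/* Position the cursor before the first entry. */
void kt_first(struct kt_cursor *c)
{
    assert(c != NULL);
    c->next_slot = 0;
}

/* Copy the next in-use entry into *out.
 * Returns 1 if an entry was copied, 0 when the table is exhausted. */
int kt_next(struct kt_cursor *c, struct kt_proc_info *out)
{
    assert(c != NULL && out != NULL);
    while (c->next_slot < FAKE_NSLOTS) {
        const struct kt_proc_info *p = &fake_table[c->next_slot++];
        if (p->pid != 0) {
            memcpy(out, p, sizeof *out);
            return 1;
        }
    }
    return 0;
}

/* Example walk: count the in-use entries. */
int kt_count(void)
{
    struct kt_cursor c;
    struct kt_proc_info pi;
    int n = 0;

    kt_first(&c);
    while (kt_next(&c, &pi))
        n++;
    return n;
}
```

The caller only ever sees copies of an interface-defined record, so
the table behind it could become a linked list or a hash without the
loop in kt_count() changing.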

--tom

"With a kernel dive, all things are possible, but it makes it hard
 to face yourself in the mirror the next day."

shore@mtxinu.COM (Melinda Shore) (12/02/90)

In article <109449@convex.convex.com> tchrist@convex.com (Tom Christiansen) writes:
>I've heard that some vendors have a system (or is it library) call for
>returning you chunks out of the kernel.  Can anyone tell me how they
>seem to work?

Both Unicos and Mach have system calls to return the current contents
of selected kernel data structures.  In both cases you pass in which
structures you want (header files include constants such as TBL_PROCINFO),
starting address and number of elements (for arrays), and a pointer to
where you want the data stuffed.  How do they seem to work?  They seem
to work great.  The only limitation I've run into with either of them
is that they don't necessarily provide access to all of the structures
in which you might be interested.  On the other hand, Mach's table()
call will give you the u. area for any process, and that's such a win
that it overrides any regrets I might have about missing structures like
the file table (which are still available through the standard nlist/
lseek/read mechanism).
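
To make that calling convention concrete, here is a user-space sketch
in the shape described above: a selector constant, a starting index,
an element count, and a destination buffer.  Everything in it is an
assumption made for illustration.  The function name tbl_emul, the
EMU_TBL_* values, the extra element-size argument, and the record
layouts are invented; they are not the actual Unicos or Mach
declarations.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical selectors, modelled loosely on constants like TBL_PROCINFO. */
enum { EMU_TBL_LOADAVG = 1, EMU_TBL_PROCINFO = 2 };

/* Hypothetical exported records (interface-defined, not kernel layouts). */
struct emu_loadavg  { long avg[3]; long scale; };
struct emu_procinfo { int pid; int ppid; int uid; };

/* Stand-ins for kernel state; a real version would live in the kernel
 * or read /dev/kmem on the caller's behalf. */
static const struct emu_loadavg fake_load = { { 25, 50, 100 }, 100 };
static const struct emu_procinfo fake_procs[] = {
    { 1, 0, 0 }, { 42, 1, 100 }, { 57, 42, 100 },
};

/* Copy up to nel elements of table `id`, starting at `index`, into addr.
 * lel is the caller's idea of the element size; a mismatch is rejected
 * instead of silently copying a structure of the wrong shape.
 * Returns the number of elements copied, or -1 on a bad request. */
int tbl_emul(int id, size_t index, void *addr, size_t nel, size_t lel)
{
    const void *base;
    size_t count, size;

    switch (id) {
    case EMU_TBL_LOADAVG:
        base = &fake_load;
        count = 1;
        size = sizeof fake_load;
        break;
    case EMU_TBL_PROCINFO:
        base = fake_procs;
        count = sizeof fake_procs / sizeof fake_procs[0];
        size = sizeof fake_procs[0];
        break;
    default:
        return -1;                      /* unknown table */
    }
    if (addr == NULL || lel != size || index > count)
        return -1;
    if (nel > count - index)
        nel = count - index;            /* clamp to what exists */
    memcpy(addr, (const char *)base + index * size, nel * size);
    return (int)nel;
}

/* Example caller: fetch the one-minute load average as a double. */
double emu_one_minute_load(void)
{
    struct emu_loadavg la;
    int n = tbl_emul(EMU_TBL_LOADAVG, 0, &la, 1, sizeof la);

    assert(n == 1);
    return (double)la.avg[0] / (double)la.scale;
}
```

Passing the element size is one cheap way to notice that a program was
compiled against a different release than the one it is running on.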

There's certainly no reason that it has to be implemented as a system
call.
-- 
                      Deport Neil Bush
Melinda Shore                                 shore@mtxinu.com
mt Xinu                              ..!uunet!mtxinu.com!shore

meissner@osf.org (Michael Meissner) (12/04/90)

In article <1410@mtxinu.COM> shore@mtxinu.COM (Melinda Shore) writes:

| In article <109449@convex.convex.com> tchrist@convex.com (Tom Christiansen) writes:
| >I've heard that some vendors have a system (or is it library) call for
| >returning you chunks out of the kernel.  Can anyone tell me how they
| >seem to work?

Data General's Unix (DG/UX) had such calls, though since I've not
worked for them for a year, my mind is growing blank on the details.
If somebody has an AViiON, do a man on dg_sys_info (and look at any
other /usr/include/sys/dg_* include files).  Here is the fragment of
emacs' etc/loadst that gets the load average:

#include <sys/dg_sys_info.h>    /* DG/UX rev 4.00 system information */

struct dg_sys_info_load_info load_info;

        /* ... */

        if (dg_sys_info(&load_info,
                        DG_SYS_INFO_LOAD_INFO_TYPE,
                        DG_SYS_INFO_LOAD_VERSION_0) == 0) {
                printf("%.2f", load_info.one_minute);
        }

| Both Unicos and Mach have system calls to return the current contents
| of selected kernel data structures.  In both cases you pass in which
| structures you want (header files include constants such as TBL_PROCINFO),
| starting address and number of elements (for arrays), and a pointer to
| where you want the data stuffed.  How do they seem to work?  They seem
| to work great.  The only limitation I've run into with either of them
| is that they don't necessarily provide access to all of the structures
| in which you might be interested.  On the other hand, Mach's table()
| call will give you the u. area for any process, and that's such a win
| that it overrides any regrets I might have about missing structures like
| the file table (which are still available through the standard nlist/
| lseek/read mechanism).
| 
| There's certainly no reason that it has to be implemented as a system
| call.

It depends on whether normal users are allowed to get information from
the kernel.  You generally have a security hole if users can fetch
arbitrary bytes from the kernel (either through a system call or by
reading /dev/kmem).
--
Michael Meissner	email: meissner@osf.org		phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142

Considering the flames and intolerance, shouldn't USENET be spelled ABUSENET?

bzs@world.std.com (Barry Shein) (12/09/90)

>There's certainly no reason that it has to be implemented as a system
>call.

Security is one consideration as most people like to keep /dev/kmem
unreadable. But then again, a library would be more than useful for
people developing privileged programs (certainly no *worse* than the
current situation, and would work when/if the table() call became a
common syscall.)

Another, more subtle, consideration, and one reason Encore put this
sort of stuff into system calls, is that on multiprocessors the
likelihood that the table you're looking at won't be "corrupted"
(changed) as you read it becomes vanishingly small. At least a syscall
can, optionally, lock the structure while it's being copied out.
There's almost no way to do that thru the regular read/write interface
on /dev/kmem (well, you can do anything, look at what the guy is
reading etc, but forget it, a syscall here solves a real problem.)
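
The locking point can be illustrated in user space.  The sketch below
assumes POSIX threads and a made-up struct kstats with an invariant
(total == in + out) that a reader must never see broken.  In a real
kernel the lock would be whatever already protects the table, not a
pthread mutex, and the sketch does not spawn threads itself; it only
shows where the lock sits relative to the copy.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

/* Hypothetical multi-word structure; invariant: total == in + out. */
struct kstats {
    long in;
    long out;
    long total;
};

static struct kstats live_stats = { 0, 0, 0 };
static pthread_mutex_t stats_lock = PTHREAD_MUTEX_INITIALIZER;

/* Writer side: several fields change together under the lock, so no
 * reader holding the lock can observe a half-finished update. */
void stats_account(long in, long out)
{
    pthread_mutex_lock(&stats_lock);
    live_stats.in    += in;
    live_stats.out   += out;
    live_stats.total += in + out;
    pthread_mutex_unlock(&stats_lock);
}

/* "Syscall" side: copy the whole structure out while holding the lock,
 * so the caller gets a snapshot taken at a single instant. */
void stats_snapshot(struct kstats *dst)
{
    assert(dst != NULL);
    pthread_mutex_lock(&stats_lock);
    memcpy(dst, &live_stats, sizeof *dst);
    pthread_mutex_unlock(&stats_lock);
}
```

A reader going through /dev/kmem has no equivalent of stats_lock to
take, which is why it can see `in` already bumped and `total` not yet.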

A library emulation would be a good idea. It would be a way for people
to start moving their favorite utilities over and break the old
chicken and egg compatibility problem (and create a demand.)

Should have been done years ago.
-- 
        -Barry Shein

Software Tool & Die    | {xylogics,uunet}!world!bzs | bzs@world.std.com
Purveyors to the Trade | Voice: 617-739-0202        | Login: 617-739-WRLD

mike@wang.com (Mike Sullivan) (12/11/90)

tchrist@convex.com (Tom Christiansen) writes:

>We've all written (or at least seen) programs that make kernel dives
>into /dev/mem using an nlist from /vmunix in order to figure out what
>some particular kernel variable looks like, or perhaps even to change
>it.
...

>arrays of structures?  Wouldn't it be nice to be able to get the first,
>then the next element until null, of the array?

>--tom

When I worked at Alliant our arrangement was very simple, we mapped the
kernel memory into the process (typically read only ;-) which needed
access.  Since my only interest was in the proc table, and a few other
structures which didn't have much call to change, it was immune to 
OS changes.  The direct mapping of kernel memory made linked lists very
easy to follow.

-- 
  ________________________
 /                    __  \  | Michael J. Sullivan    | "Used to be different,
| \  \  /  /\  |\ |  /  `  | | Wang Laboratories Inc. | Now you're the same,
|  \/ \/  /--\ | \|  \__T  | | mike@WANG.COM	      | Yawn as your plane"
 \________________________/  | 			      |	goes down in flames"

meissner@osf.org (Michael Meissner) (12/12/90)

In article <axhux8.50q@wang.com> mike@wang.com (Mike Sullivan) writes:

| When I worked at Alliant our arrangement was very simple, we mapped the
| kernel memory into the process (typically read only ;-) which needed
| access.  Since my only interest was in the proc table, and a few other
| structures which didn't have much call to change, it was immune to 
| OS changes.  The direct mapping of kernel memory made linked lists very
| easy to follow.

This suffers from the same problem that diving into /vmunix and
reading /dev/kmem does, namely that things can change from under you
if you take a page fault or lose your time slice to another process at
the wrong time......
--
Michael Meissner	email: meissner@osf.org		phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142

Considering the flames and intolerance, shouldn't USENET be spelled ABUSENET?

scott@convergent.com (Scott Lurndal) (12/14/90)

In article <axhux8.50q@wang.com>, mike@wang.com (Mike Sullivan) writes:
|> tchrist@convex.com (Tom Christiansen) writes:
|> 
|> >We've all written (or at least seen) programs that make kernel dives
|> >into /dev/mem using an nlist from /vmunix in order to figure out what
|> >some particular kernel variable looks like, or perhaps even to change
|> >it.
|> ...
|> 
|> >arrays of structures?  Wouldn't it be nice to be able to get the first,
|> >then the next element until null, of the array?
|> 
|> >--tom
|> 
|> When I worked at Alliant our arrangement was very simple, we mapped the
|> kernel memory into the process (typically read only ;-) which needed
|> access.  Since my only interest was in the proc table, and a few other
|> structures which didn't have much call to change, it was immune to 
|> OS changes.  The direct mapping of kernel memory made linked lists very
|> easy to follow.
|> 

Wait a minute, there is nothing that states an implementation of Unix
is required to have a proc table (and I know of at least one which doesn't).
Nor is a proc table immune to OS changes.  Exporting knowledge of
operating system data structures is a sure way of restricting the
ability of operating system designers to change the internals of
an operating system, internals which no self-respecting application
should know about.

Your Alliant applications would most likely only run on Alliant
systems, and even then I suspect that the Alliant OS engineers would
not restrain themselves from changing the format of the proc structure
just so that programs that know it will still run and/or compile.

The solution is to export interface-defined structures, independent of
the internal format (like the PIOCSTATUS and PIOCPSINFO requests in
the System V Release 4.0 /proc file system).  This allows for portable
applications which are not dependent on a particular version of a
particular OS.  In fact, such an interface could be defined so that new
information could be provided (with new releases of the OS) in such a
way as to not break existing programs (e.g. $GETJPI in VMS).
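
A rough sketch of the item-list idea follows.  It is loosely in the
spirit of $GETJPI but is not its actual calling sequence; the item
codes, struct info_item, and get_proc_info() are hypothetical names
invented here, and the data comes from a static record rather than
from a kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical item codes.  New codes can be appended in later releases
 * without changing the meaning of the old ones. */
enum { INFO_END = 0, INFO_PID = 1, INFO_UID = 2, INFO_NAME = 3 };

/* One request: "put item `code` into buf (buflen bytes)".
 * retlen is set to the number of bytes stored, or 0 if the item is
 * unknown to this implementation or did not fit in the buffer. */
struct info_item {
    int     code;
    void   *buf;
    size_t  buflen;
    size_t  retlen;
};

/* Stand-in for the kernel's private per-process state. */
static const struct { int pid; int uid; char name[16]; } fake_proc =
    { 42, 100, "csh" };

/* Copy src into the item's buffer if it fits. */
static void info_store(struct info_item *it, const void *src, size_t len)
{
    if (it->buf != NULL && it->buflen >= len) {
        memcpy(it->buf, src, len);
        it->retlen = len;
    }
}

/* Fill every item up to the INFO_END terminator.
 * Returns the number of items satisfied.  Unknown codes are skipped
 * rather than treated as errors, so a caller built for a newer release
 * still gets the items this implementation does understand. */
int get_proc_info(struct info_item *items)
{
    int filled = 0;

    assert(items != NULL);
    for (; items->code != INFO_END; items++) {
        items->retlen = 0;
        switch (items->code) {
        case INFO_PID:
            info_store(items, &fake_proc.pid, sizeof fake_proc.pid);
            break;
        case INFO_UID:
            info_store(items, &fake_proc.uid, sizeof fake_proc.uid);
            break;
        case INFO_NAME:
            info_store(items, fake_proc.name, strlen(fake_proc.name) + 1);
            break;
        default:
            break;                      /* unknown item: retlen stays 0 */
        }
        if (items->retlen != 0)
            filled++;
    }
    return filled;
}
```

Because each item is requested by code and copied by length, neither
side depends on the size or field order of the structure the kernel
keeps internally.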

Scott Lurndal
scottl@convergent.com  {uunet|sun}!pyramid!ctnews!pase70!scottl