[net.unix-wizards] UNIX per-process accounting

fpt@wgivax.UUCP (Fred Toth) (04/08/86)

Once again, I need to tap the wisdom of the community.
Here goes:

At Washburn, we have used the UNIX accounting system as
a base for charging customers for work we do for them.

Since our people may be banging away at several different
jobs at one time, we need to know which grep, sort, ls, etc.,
goes with which customer's job.

Since a customer's job-related files are always in directory
subtrees, we decided long ago that we needed to know not only
what commands were run, but WHERE ('current' directory) in the
file system they were run. We made minor changes, first in
version 7, later in 4.2, to the definition of the accounting
record (acct(5)), and the kernel routines that fill the record.
We added fields for the device and inode number of the 'current'
directory. A script that runs in off hours decides whether a
given accounting record is chargeable to a customer or not,
based on whether the current directory is in an active list.
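
As a rough illustration of that off-hours check, here is a minimal C
sketch. The names dir_id, dir_id_of and is_chargeable are invented for
this example, and it assumes the active list holds the device/inode pair
of every directory in each active job's subtree. The post does not say
how the real list is built or what the modified acct record looks like.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Identity of a directory: a (device, inode) pair, the same two values
 * the modified accounting record is described as carrying.
 * (Hypothetical struct, not the real acct layout.) */
struct dir_id {
    dev_t dev;
    ino_t ino;
};

/* Fill *out with the device/inode of path.
 * Returns 0 on success, -1 if path cannot be stat'ed. */
int dir_id_of(const char *path, struct dir_id *out)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    out->dev = st.st_dev;
    out->ino = st.st_ino;
    return 0;
}

/* Return 1 if id matches one of the n entries in active, else 0. */
int is_chargeable(const struct dir_id *id, const struct dir_id *active, int n)
{
    int i;

    for (i = 0; i < n; i++)
        if (active[i].dev == id->dev && active[i].ino == id->ino)
            return 1;
    return 0;
}

#ifdef STANDALONE
#define MAX_ACTIVE 64
/* usage: prog active_dir...   Reports whether "." is in the list. */
int main(int argc, char **argv)
{
    struct dir_id active[MAX_ACTIVE], here;
    int i, n = 0;

    for (i = 1; i < argc && n < MAX_ACTIVE; i++)
        if (dir_id_of(argv[i], &active[n]) == 0)
            n++;
    if (dir_id_of(".", &here) != 0) {
        perror(".");
        return 2;
    }
    puts(is_chargeable(&here, active, n) ? "chargeable" : "not chargeable");
    return 0;
}
#endif
```

Compiled with -DSTANDALONE this gives a small command for trying the
lookup by hand. A real script would take the pair from each accounting
record instead of from ".".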

Well, this has worked very well for years. But now we would like
to distribute some of our work to another system (Sun 3). We
don't have source. Even if we did, the same changes might not
be as simple, given vnode vs. inode problems. In any case, a
system based purely on user code would be preferable.

I'm sure some members of the community have had similar problems.

I am very interested in ideas for a better way.

Some important design considerations:

1. The existing accounting system has a lot going for it. The kernel
writes a record for every process transparently. All the interesting
numbers for the process are already there (user time, system time, etc.).

2. Our current system (device/inode added to records) is totally
transparent to the users. They don't have to do anything to make it
work. An idea that may occur to some is to have users log into and
out of jobs. This gets a big thumbs down from our users, as they
hop around among many jobs at once.

3. The current system requires no modifications to user programs.
Our in-house programs, as well as standard utilities, don't have to
know about the accounting system.

A nice extension to the system that would neatly solve my problem
would be a user definable field in the accounting record, set
by a non-privileged primitive. The job number could be set by
the parent shell after each directory change, and would be inherited
by all child processes. I bet people would think of lots of interesting
uses for such a thing. Alas, for now, I need a user code solution.

My questions to you all:
Does this situation ring any familiar bells?
Who has a better way?
Any ideas for discussion?

Many thanks for your attention.

Fred Toth
Washburn Graphics, Inc.
Charlotte, NC
decvax!mcnc!unccvax!wgivax!fpt
704-334-5371

bzs%bostonu.csnet@csnet-relay.arpa (Barry Shein) (04/11/86)

To summarize, Fred Toth is looking for a way to charge work to
customers using a SUN3 in a manner similar to that provided through
kernel mods in previous systems. The hook seems to be that when work
is done for a customer it is done in a particular directory.

I've got it (I think):

Assuming you can/will set the group of the directory to something
indicative of which customer is involved (say, their name), you could
create a .cshrc alias for 'cd' which will grovel out the group name
(this could be done most easily by writing a little C program which
just does a stat on the target, looks up the group name and prints it
to the standard output so it can be used in a backquote arg, or you
can grovel a 'ls -ldg' with sed or awk.) Then write a null program
with the same name as the group; it is run on each 'cd' simply to
leave its accounting stamp in the accounting file. Then, when you
run through the accounting files, the records between one group-name
stamp and the next form a window to charge to that group. Done.

Ok, that's pretty obscure, so here's some code:

Let us say that directory xyzzy has group ownership Joe, that a
program 'pgroup' is written which simply prints the group name of its
argument to the standard output, and that there exists a program 'Joe'
which does nothing but just run (main(){ exit(0);}). An alias might be:

	alias CD '`pgroup \!^`;cd \!^'

Now if you do a lastcomm after running that alias it should reveal
that your user just ran a program named Joe, thus you know to start
charging all subsequent jobs to the account 'Joe'. When he cd's somewhere
else, a new group will be stamped in the accounting file (or, s/he can
use good ole 'cd' rather than 'CD' if that's not his/her intention.)
(That is, 'CD xyzzy' expands to '`pgroup xyzzy`;cd xyzzy' which expands
to 'Joe;cd xyzzy'. I tried an example, it should work.)
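
For reference, the 'pgroup' helper described above (stat the target,
look up the group name, print it) might look roughly like the sketch
below. The function name group_of and the error handling are guesses for
illustration; the post gives no source for pgroup.

```c
#include <stdio.h>
#include <grp.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Return the name of the group that owns path, or NULL if the path
 * cannot be stat'ed or its gid has no entry in the group database.
 * The string lives in static storage owned by getgrgid(). */
const char *group_of(const char *path)
{
    struct stat st;
    struct group *gr;

    if (stat(path, &st) != 0)
        return NULL;
    gr = getgrgid(st.st_gid);
    return gr ? gr->gr_name : NULL;
}

#ifdef STANDALONE
/* usage: pgroup directory   Prints the owning group's name. */
int main(int argc, char **argv)
{
    const char *name;

    if (argc != 2) {
        fprintf(stderr, "usage: pgroup directory\n");
        return 1;
    }
    name = group_of(argv[1]);
    if (name == NULL) {
        fprintf(stderr, "pgroup: cannot determine group of %s\n", argv[1]);
        return 1;
    }
    printf("%s\n", name);
    return 0;
}
#endif
```

Built with -DSTANDALONE and installed as 'pgroup', it prints one word on
the standard output, which is what the backquotes in the alias expect.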

Done. (I think.) No kernel hacks, 10 minutes work. I suppose some sort
of pushdown approach could be done, but that's left as an exercise for
the reader :-) If that's not clear ask, but ask fast, I'm gonna forget
this one real fast I think. Also, give a thought to security.

	-Barry Shein, Boston University

fpt@unccvax.UUCP (fred p toth) (04/16/86)

> I've got it (I think):
> 
> Assuming you can/will set the group of the directory to something
> indicative of which customer is involved (say, their name), you could
> create a .cshrc alias for 'cd' which will grovel out the group name
> (this could be done most easily by writing a little C program which
> just does a stat on the target, looks up the group name and prints it
> to the standard output so it can be used in a backquote arg, or you
> ...
> 	-Barry Shein, Boston University

Great idea, Barry! In fact, I went down almost this exact trail for a while
before it blew up in my face. There are 2 problems:

1. Shell scripts. Process number 1 charges fine, since it appears properly
	in the window between cd's. However, by the time process 2 starts,
	the user will probably have changed directories. The rest of the
	script charges to the wrong job.

2. The same problem applies to any program that spawns processes, such as
	simple programs that use system() or popen().
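
To make the failure concrete, here is a small C sketch of the
window-based attribution the off-hours script would do. The struct rec
is a hypothetical stand-in that keeps only the command name (real
acct(5) records carry much more), and the sketch assumes it is handed
one user's records in file order.

```c
#include <stdio.h>
#include <string.h>

/* Simplified, hypothetical accounting record. */
struct rec {
    const char *comm;   /* command name as logged */
    const char *job;    /* filled in: job charged, or NULL for none */
};

/* Return 1 if name is one of the null "marker" programs, one per group. */
static int is_marker(const char *name, const char **jobs, int njobs)
{
    int i;

    for (i = 0; i < njobs; i++)
        if (strcmp(name, jobs[i]) == 0)
            return 1;
    return 0;
}

/* Walk the records in order.  Each marker opens a new window, and every
 * later record is charged to the most recent marker.  Markers themselves
 * are not charged.  Returns the number of records charged to a job. */
int attribute(struct rec *recs, int n, const char **jobs, int njobs)
{
    const char *current = NULL;
    int i, charged = 0;

    for (i = 0; i < n; i++) {
        if (is_marker(recs[i].comm, jobs, njobs)) {
            current = recs[i].comm;
            recs[i].job = NULL;
        } else {
            recs[i].job = current;
            if (current != NULL)
                charged++;
        }
    }
    return charged;
}

#ifdef STANDALONE
int main(void)
{
    const char *jobs[] = { "Joe", "Acme" };
    struct rec recs[] = {
        { "Joe", NULL }, { "grep", NULL },
        { "Acme", NULL }, { "sort", NULL }
    };
    int i, n = sizeof recs / sizeof recs[0];

    attribute(recs, n, jobs, 2);
    for (i = 0; i < n; i++)
        printf("%-6s -> %s\n", recs[i].comm,
               recs[i].job ? recs[i].job : "(none)");
    return 0;
}
#endif
```

If the 'sort' in the sample data was really the second process of a
script started in Joe's directory, the sketch still bills it to Acme,
because the only thing it can see is which marker came last.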

I think that this problem is solvable, but only by sacrificing much of the
simplicity of your original idea. For example, there is a seldom used csh
feature that could help. If you have an alias for 'shell', when the csh
detects a shell script (non-exec-able but with x bits set), it prepends
the value of the alias to the command. So, what you get when you type
'do_stuff.sh', is maybe 'special_program do_stuff.sh'. What the special
program could do is a setregid() to establish a special group id to be
inherited by the shell script that is about to run. 

Then you have 2 types of accounting. 1) Normal processes that fall within
normal cd's. 2) Processes with special group id's that override their window
position.
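
A sketch of what such a 'special_program' might look like in C follows.
It assumes the "special group id" is simply the group owning the current
directory at launch time (borrowing the directory-group convention from
the earlier suggestion), and that the program is installed with enough
privilege for setregid() to succeed, since an ordinary process cannot
switch to an arbitrary group. Both are assumptions, as is the helper
name cwd_group. Anything installed with privilege needs a careful
security review.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Store the group id that owns the current directory in *out.
 * Returns 0 on success, -1 if "." cannot be stat'ed. */
int cwd_group(gid_t *out)
{
    struct stat st;

    if (stat(".", &st) != 0)
        return -1;
    *out = st.st_gid;
    return 0;
}

#ifdef STANDALONE
/* usage (via the csh 'shell' alias): special_program script [args...] */
int main(int argc, char **argv)
{
    gid_t gid;

    if (argc < 2) {
        fprintf(stderr, "usage: special_program script [args...]\n");
        return 1;
    }
    if (cwd_group(&gid) != 0) {
        perror(".");
        return 1;
    }
    /* Needs privilege unless gid is already the caller's own group. */
    if (setregid(gid, gid) != 0) {
        perror("setregid");
        return 1;
    }
    /* Give up any extra user privilege before running the user's script. */
    if (setuid(getuid()) != 0) {
        perror("setuid");
        return 1;
    }
    argv[0] = "sh";
    execv("/bin/sh", argv);     /* runs: sh script args... */
    perror("/bin/sh");          /* reached only if execv failed */
    return 1;
}
#endif
```

The script and everything it spawns would then inherit the job's group
id, so the off-hours pass could let that group override the window.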

Still have to do something with those programs that like to do system().

See how messy it gets?

Still, it's great to see your ideas. If you have more, bring 'em on.

Fred Toth
Washburn Graphics
Charlotte, NC
decvax!mcnc!unccvax!wgivax!fpt