[comp.unix.questions] Counting files created in /tmp

shaw@hpihoah.HP.COM (Joy-lim Shaw) (02/02/90)

/ hpihoah:comp.unix.questions / jik@athena.mit.edu (Jonathan I. Kamens) / 10:33 am  Jan 31, 1990 /

>  I guess that you might consider what the original poster is trying to
>do as a "traffic map" of /tmp -- what files get put there most often,
>from what programs, and how many are there?  There are no simple
>solutions to a question like this, and without modifying the kernel in
>some way, there is almost certainly no way to get a 100% accurate set of data.

I agree.  There may even be "temporary" files that a process creates and
immediately unlinks.  This allows the process to access a file that will
disappear after the process terminates.

The question here is:  WHAT DOES THE ORIGINAL POSTER REALLY WANT?  If
the original poster could be more specific, a kludge may be available.
Send e-mail to me at shaw%hpda@hplabs.hp.com, as I don't read notes on a
regular basis.

Also... (I'm nit picking here so please ignore)

1)	you can leave out the -l option of ls in "ls -la | wc -l",
	since you're just counting the output (ls writes unformatted
	output, one name per line, when it's not talking to a terminal;
	you knew that).

2)	You'll also be counting two extra entries (. and ..) with the
	-a option.  For root, ls lists the invisible dot files by
	default, but a regular user will have to use ls -A.

shaw

jik@athena.mit.edu (Jonathan I. Kamens) (02/02/90)

In article <22031@unix.cis.pitt.edu>, yahoo@unix.cis.pitt.edu (Kenneth L Moore)
writes:
> So the answer to his question about a SIMPLE way to keep track of what 
> has happened in temp, assuming he knows when he wants to look at it,
> is:
> 
> ls -la /tmp

  The person who asked the original question is trying to collect some
really meaningful statistics about file names, sizes and lifetimes in
the /tmp directory.

  As I think I've already pointed out, file lifetimes in the /tmp
directory have a very wide range.  A periodic ls of /tmp simply doesn't
have a chance of catching many of them, although I guess you could just
run it continually, over and over, and even that wouldn't catch all of them.

  Worse, if an execution of the program "ls" is used to get the data,
then the extra overhead of:

  1. Starting up an ls process.
  2. Ls time to get the whole directory.
  3. Ls time to format its output.
  4. Kernel time to give you the output to work with.
  5. If you do it with a pipe, kernel time to context switch between 
     the processes on both sides of the pipe.

make the program run significantly slower.

  It is relatively simple to write a program to do a periodic opendir()
and readdir() on /tmp itself.  This will significantly speed up the
program and make it far more able to collect meaningful statistics.
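
  A minimal sketch of the opendir()/readdir() approach described above.
It is not the poster's own program, which was never posted.  It assumes
the POSIX <dirent.h> interface (a 1990 BSD-derived system may need
<sys/dir.h> and struct direct instead), and the name count_entries is
made up for illustration.

```c
#include <dirent.h>
#include <string.h>

/* Count the entries in a directory, skipping "." and "..".
 * Returns the count, or -1 if the directory cannot be opened.
 * (count_entries is a hypothetical name, not from the original post.) */
long count_entries(const char *path)
{
    DIR *dir = opendir(path);
    struct dirent *entry;
    long count = 0;

    if (dir == NULL)
        return -1;

    while ((entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;
        count++;
    }

    closedir(dir);
    return count;
}
```

  Calling count_entries("/tmp") in a loop with a short sleep between
calls gives a periodic count without starting a new process each time.
Like any polling scheme, it still misses files that are created and
removed between two scans.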

  There are times when modularity and chaining different unix utilities
together in a pipe are good.  This is not true, however, when speed is
an important factor.

  As an example, I just wrote a shell script that takes a number as an
argument and does "ls /tmp/ | wc -l" that many times.  I then wrote a C
program that does the same thing, but it does the counting internally,
not using ls.

  Doing the count twenty times, the shell script took 20.4 seconds on a
very lightly loaded system.  Doing the same thing with the C program
took 0.3 seconds.

  It took me less than five minutes to write the C program, and it would
take me a minimal amount of time to augment it to collect more statistics.
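
  As a hedged illustration of what "more statistics" might mean, the
sketch below calls lstat() on each entry to total the sizes and note the
most recent modification time.  The struct dir_stats and the function
collect_stats are hypothetical names; which statistics the original
poster actually wanted is not stated in the thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical summary of one scan of a directory. */
struct dir_stats {
    long files;       /* number of entries, excluding "." and ".." */
    long long bytes;  /* sum of st_size over those entries */
    time_t newest;    /* most recent st_mtime seen, 0 if none */
};

/* Fill *out from one pass over path.  Returns 0 on success, or -1 if
 * the directory cannot be opened. */
int collect_stats(const char *path, struct dir_stats *out)
{
    char full[PATH_MAX];
    struct stat st;
    struct dirent *entry;
    DIR *dir = opendir(path);

    if (dir == NULL)
        return -1;

    out->files = 0;
    out->bytes = 0;
    out->newest = 0;

    while ((entry = readdir(dir)) != NULL) {
        int n;

        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;

        n = snprintf(full, sizeof full, "%s/%s", path, entry->d_name);
        if (n < 0 || (size_t)n >= sizeof full)
            continue;               /* name too long for the buffer */

        if (lstat(full, &st) != 0)
            continue;               /* entry vanished since readdir() */

        out->files++;
        out->bytes += st.st_size;
        if (st.st_mtime > out->newest)
            out->newest = st.st_mtime;
    }

    closedir(dir);
    return 0;
}
```

  The lstat() failure case is skipped on purpose: a file can disappear
between readdir() and lstat(), which is exactly the short-lifetime
problem discussed above.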

Jonathan Kamens			              USnail:
MIT Project Athena				11 Ashford Terrace
jik@Athena.MIT.EDU				Allston, MA  02134
Office: 617-253-8495			      Home: 617-782-0710

jik@athena.mit.edu (Jonathan I. Kamens) (02/02/90)

In article <22047@unix.cis.pitt.edu>, yahoo@unix.cis.pitt.edu (Kenneth L Moore)
writes:
> So what's your point? If the guy didn't know to do an "ls /tmp" he probably
> doesn't know how to program very well. Hence, "ls /tmp" is the SIMPLE answer.

1. "The guy" is quite aware of how to do "ls /tmp".  He is also quite aware
   that the statistics generated in that manner will be relatively useless.
   Hence his question about how to do it some other way.

   It was fairly obvious to me, although apparently not to you, that someone
   asking about how to find out about file creation and lifetimes in /tmp is
   probably sufficiently knowledgeable about Unix to know what "ls" is.

2. Since you seem so sure of "the guy"'s level of knowledge about Unix, have you
   gotten your information through private E-mail correspondence with him?  I
   have.

3. Despite the fact that I know that "the guy" is quite aware of what ls is,
   let's assume for a moment that he isn't.  Even assuming that, the fact
   still remains that ls is NOT going to be able to give him the statistics he
   needs.  It just can't do it.  Therefore, he's going to have to learn enough
   programming to write the C code that reads a directory in order to do his
   gathering of statistics.  Hence my answer.

   But this, of course, is irrelevant because he *does* know how to use ls,
   and I suspect he already knows how to use the C directory-reading routines.
  

sja@sirius.hut.fi (Sakari Jalovaara) (02/03/90)

>> how many files get created in the /tmp subdirectory
> modify the Unix kernel

Weell, you could write a user-level NFS server that gathers statistics
and mount /tmp using the server.

Ok, easier: NFS mount /tmp and write a program that monitors network
traffic and interprets any NFS packets that reference /tmp.

Not easy but possible... unless, of course, the local kernel caches
creat/unlink NFS requests.  Back to the drawing board?
									++sja

dold@mitisft.Convergent.COM (Clarence Dold) (02/03/90)

in article <22031@unix.cis.pitt.edu>, yahoo@unix.cis.pitt.edu (Kenneth L Moore) says:

> So the answer to his question about a SIMPLE way to keep track of what 
> has happened in temp, assuming he knows when he wants to look at it,
> is:

> ls -la /tmp

The majority of files created in /tmp are removed, but still open.
I might assume that the reason someone is looking in the first place would
be to justify a separate file system for /tmp, or something equally useful.
ls of any variety doesn't tell the story adequately.

tmpfile(3S) creates a file in /tmp with a unique name, then immediately
unlinks it.  The name goes away, but the disk utilization remains.

If I cared about the file usage in /tmp, I would be more concerned about
the disk utilization, not just the names, or lack thereof.
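
One way to look at utilization rather than names is to ask the file
system itself how many blocks are in use, which also counts files that
are unlinked but still open.  The sketch below assumes the POSIX
statvfs() call, which a system of this era may not have (statfs() was
the usual equivalent).  The name fs_bytes_used is made up for
illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/statvfs.h>

/* Return the number of bytes in use on the file system that contains
 * path, or -1 on error.  (fs_bytes_used is a hypothetical name.) */
long long fs_bytes_used(const char *path)
{
    struct statvfs vfs;

    if (statvfs(path, &vfs) != 0)
        return -1;

    /* f_blocks and f_bfree are both counted in units of f_frsize. */
    return (long long)(vfs.f_blocks - vfs.f_bfree) *
           (long long)vfs.f_frsize;
}
```

Sampling fs_bytes_used("/tmp") periodically only says something about
/tmp if /tmp is its own file system; otherwise the figure covers
whatever file system /tmp happens to live on.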

-- 
---
Clarence A Dold - dold@tsmiti.Convergent.COM            (408) 435-5293
               ...pyramid!ctnews!tsmiti!dold        FAX (408) 435-3105
               P.O.Box 6685, San Jose, CA 95150-6685         MS#10-007

ken@cs.rochester.edu (Ken Yap) (02/04/90)

I don't know exactly what the person wants to do with stats from /tmp.
I had some mail from him, but it wasn't totally clear what sorts of
stats he wanted.  I do want to point out, though, that not all files in
/tmp have names.  A neat trick to avoid leaving /tmp files around when
a program crashes is to do this at creation time:

	fd = creat(name, mode);	/* or open with O_CREAT */
	unlink(name);

Now the file is nameless and can only be accessed via the fd. It
will go away as soon as nobody is referencing it.
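
A slightly fuller sketch of the same trick, with error checks added.
The function names open_nameless and link_count are made up for
illustration.  It assumes a local file system, where fstat() reports a
link count of 0 once the name is gone; some network file systems behave
differently.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a new file at name, unlink it at once, and return the open
 * descriptor, or -1 on error.  (open_nameless is a hypothetical name.) */
int open_nameless(const char *name)
{
    int fd = open(name, O_RDWR | O_CREAT | O_EXCL, 0600);

    if (fd < 0)
        return -1;

    if (unlink(name) != 0) {
        close(fd);
        return -1;
    }
    return fd;      /* the file now exists only through this fd */
}

/* Return the link count of the open file, or -1 on error.
 * A count of 0 means no directory entry refers to the file. */
long link_count(int fd)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return -1;
    return (long)st.st_nlink;
}
```

After open_nameless() succeeds, reads and writes on the descriptor work
as usual, but a directory scan of /tmp will not show the file, so any
tool that counts names misses it.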

sja@sirius.hut.fi (Sakari Jalovaara) (02/04/90)

> how many files get created in the /tmp subdirectory of Unix on a Sun
> 3/50 or 3/60

How about this: if you are running SunOS 4 you can get an estimate by
modifying the dynamically linked C library.  Write new versions of
open(), creat() and unlink() that first check if the argument file is
in /tmp, write to a trace log, and then do the real system call.
Again, not simple but doable, at least if you have the source to libc.
Depends on how serious you are about getting the statistics...

Statically linked programs (sh, tar etc) escape the statistics.
Files that are unlink()ed but left open are not tracked correctly.
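
A rough sketch of what such wrappers might look like.  To keep it
self-contained, the wrappers carry their own names (traced_creat,
traced_unlink) rather than replacing creat() and unlink() inside libc;
wiring them into the shared library is the part that needs the libc
source and is not shown.  All names here, and the choice of stderr as
the trace log, are assumptions made for illustration.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Return 1 if path lexically starts with "/tmp/", else 0.  Relative
 * paths and symbolic links into /tmp are not detected. */
int is_in_tmp(const char *path)
{
    return strncmp(path, "/tmp/", 5) == 0;
}

/* Hypothetical trace sink: one line per traced call on stderr. */
static void trace(const char *op, const char *path)
{
    if (is_in_tmp(path))
        fprintf(stderr, "%s %s\n", op, path);
}

/* Log the call if it targets /tmp, then do the real creat(). */
int traced_creat(const char *path, mode_t mode)
{
    trace("creat", path);
    return creat(path, mode);
}

/* Log the call if it targets /tmp, then do the real unlink(). */
int traced_unlink(const char *path)
{
    trace("unlink", path);
    return unlink(path);
}
```

The lexical check is another source of undercounting on top of the two
listed above: a program that does chdir("/tmp") and then creates "foo"
never passes a path beginning with /tmp/.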
									++sja