[comp.unix.wizards] Must UNIX be a memory hog?

vern@zebra.UUCP (Vernon C. Hoxie) (05/09/89)

Gentlemen:
	Since first becoming acquainted with the UNIX system, I have been
curious about the use of entire sectors of memory for such trivial
entries as 'TZ' and 'LCK..xxx' files.

	In the case of 'TZ', this file will have at maximum 10 bytes but
it requires 1024 bytes of memory on most systems.  Multiplied by the
number of UNIX systems in operation, that is a whale of a lot of wasted
memory.  This could be one of the most expensive constants in the
computing world.

	The 'LCK..xxx' files used by uucp were intended to last on disk
for only a short interval.  Nevertheless, sufficient memory must be
available for when an external connection is made.  In the case of the
3b1, permanent 'LCK..ph0' files are in existence whenever the telephone
controller is placed in the VOICE condition.

	Is there not some other mechanism by which the functionality of
these sparse files could be accomplished while requiring much less
storage space?

	In the case of 'TZ', a struct could have been developed but
then, the code necessary to change this parameter might well be greater
than the 1024 bytes wasted by the present method.  Of course, there is a
working equivalent to the information obtained from the 'TZ' file which
already has the supporting code.

	The 'LCK..xxx' files contain the PID of the process which
opened them so that janitorial processes can determine whether the
need for the lock still exists.  Should a process die without removing
its 'LCK..xxx' file, the janitor sweeps it away.
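The PID-in-a-lock-file scheme described above can be sketched in a few lines of shell.  This is only an illustration under stated assumptions, not the code uucp actually uses: the function names and the LOCKDIR location are hypothetical, and the PID is stored as plain ASCII text.

```shell
#!/bin/sh
# Hypothetical demo directory; a real uucp keeps its locks in its spool area.
LOCKDIR=${LOCKDIR:-/tmp/lockdemo}

# acquire_lock DEVICE: returns 0 if the lock was taken, non-zero if it is held.
acquire_lock() {
    mkdir -p "$LOCKDIR" || return 1
    # "set -C" (noclobber) makes the redirection fail when the file already
    # exists, so the existence check and the creation are a single step.
    ( set -C; echo "$$" > "$LOCKDIR/LCK..$1" ) 2>/dev/null
}

# sweep_lock DEVICE: the janitor; removes the lock if its owner is gone.
sweep_lock() {
    lock="$LOCKDIR/LCK..$1"
    [ -f "$lock" ] || return 0
    pid=$(cat "$lock")
    # "kill -0" delivers no signal; it only reports whether the PID exists.
    # It also fails for a live process owned by someone else, so a real
    # janitor would have to run with enough privilege to signal the owner.
    if ! kill -0 "$pid" 2>/dev/null; then
        rm -f "$lock"
    fi
}
```

A caller would run "acquire_lock ph0" before touching the device and remove the file when finished; "sweep_lock ph0" can be run periodically to clear locks left behind by processes that died.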

	My basic question is: Is this good programming form?  Should
not these processes have been implemented with something in the nature
of 'semaphores' or 'signals' or some other system level concept, one
which would accomplish the desired result without hogging so much
memory?

	While I have your attention, isn't a great deal of disk capacity
wasted with directories that don't shrink?  As the number of entries in
a directory grows, additional disk blocks are assigned to the directory.
But when the number of entries in a directory is reduced, the number of
blocks assigned to that directory remains at the high water mark.

-- 
Vernon C. Hoxie		       {ncar,nbires,boulder,isis}!scicom!zebra!vern
3975 W. 29th Ave.					voice: 303-477-1780
Denver, Colo., 80212					 uucp: 303-455-2670

guy@auspex.auspex.com (Guy Harris) (05/11/89)

>	Since first becoming acquainted with the UNIX system, I have
>curious about the use of entire sectors of memory for such trivial
>entries as 'TZ' and 'LCK..xxx' files.

I presume by "memory" here you mean disk memory rather than main memory.

>	In the case of 'TZ', this file will have at maximum 10 bytes but
>it requires 1024 bytes of memory on most systems.  Multiplied by the
>number of UNIX systems in operation, that is a whale of a lot of wasted
>memory.  This could be one of the most expensive constants in the
>computing world.

Uhh, to what are you referring, here?  The references to "files" make
it sound like you're referring to the time zone files used by the
"Arthur Olson" time zone code, but they're not "at maximum 10 bytes";
they're typically more like 750 bytes or so.  110592 bytes are taken up
by the "/usr/share/lib/zoneinfo" directory on the system here, which
isn't a heck of a lot.

The "10 bytes" part is a bit mysterious, then; are you referring to
space on disk, or what?  In systems not using the Olson code, time zone
information is kept on disk only in those files that set the TZ
environment variable.  This is generally either:

	"/etc/profile" - which does a lot more than just set "TZ",
	and is generally longer than 10 bytes; what's more, there's
	only one "/etc/profile" file, so *relative to the number of
	files on the system* it's a drop in the bucket.

or
	"/etc/TIMEZONE" - which only sets a few environment variables,
	at most, but again there's only one of them.

If you really want to get fanatical about wasted disk space, worry about
the "true" command; it doesn't need to contain any data, but on more
recent versions of System V, for example, it contains an AT&T copyright
notice (right, they've copyrighted the null sequence of bytes; give me a
break).  Multiply *that* by the number of UNIX systems with that style
of "true" command, and just *imagine* what a *huge* chunk of the GNPs of
the world's nations are being *wasted* on that! :-) :-) :-) :-) :-)

>	The 'LCK..xxx' files used by uucp were intended to last on disk
>for only a short interval.  Never the less, sufficient memory must be
>available for when an external connection is made.  In the case of the
>3b1, permanent 'LCK...ph0' are in existence whenever the telephone
>controller is placed in the VOICE condition.
>
>	Is there not some other mechanism by which the functionality of
>these sparse files is accomplished but require much less storage space?

No.  Think of how little storage space they actually require; it's
simply not *worth* worrying about the disk space they take up.  On a
10MB disk, say, with one "/etc/TIMEZONE" file and 10 "LCK.." files,
that's eleven 1KB files, or 11KB.  1% of 10MB is 100KB; you're talking
about 0.1% here - a drop in the bucket - and lots of disks are
considerably larger than 10MB these days.
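
The arithmetic above is easy to redo for other disk sizes.  A minimal sketch, assuming each small file occupies exactly one block; the helper name overhead_pct is made up for this example, and 10MB is taken as 10240KB:

```shell
#!/bin/sh
# overhead_pct FILES BLOCK_KB DISK_KB: percentage of the disk consumed when
# every file occupies one whole block.  awk does the floating-point work.
overhead_pct() {
    awk -v n="$1" -v b="$2" -v d="$3" \
        'BEGIN { printf "%.2f\n", n * b * 100 / d }'
}

overhead_pct 11 1 10240     # eleven 1KB files on a 10MB disk: prints 0.11
```

The same eleven files on a 100MB disk ("overhead_pct 11 1 102400") come to about 0.01%.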

>	While I have your attention, isn't a great deal of disk capacity
>wasted with directories that don't shrink?  As the number entries in a
>directory grows, additional disk blocks are assigned to the directory. 
>But when number entries in a directory is reduced, the number of blocks
>assigned to that directory remains at the high water mark.

Yeah, but unless the directory gets *very* big for a brief period of
time, and usually stays considerably below that size, it won't save
much.  4.3BSD will shrink directories, but I think the main win here may
be for directories like the UUCP spool directory when a big burst of
UUCP work comes in and then leaves - and even there I suspect the real
win of shrinking the directory is the reduction in time to search the
directory, not in disk space.
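
Whether a given system shrinks its directories can be checked empirically.  The sketch below is an assumption-laden experiment, not a statement about any particular filesystem: dir_size and dir_growth_demo are hypothetical names, mktemp is assumed to be available, and the size is simply whatever "ls -ld" reports.

```shell
#!/bin/sh
# dir_size DIR: the directory's own size in bytes, as reported by "ls -ld".
dir_size() {
    ls -ld "$1" | awk '{ print $5 }'
}

# dir_growth_demo COUNT: print the size of a scratch directory when empty,
# when holding COUNT entries, and after those entries have been removed.
dir_growth_demo() {
    count=${1:-1000}
    d=$(mktemp -d) || return 1
    before=$(dir_size "$d")
    i=0
    while [ "$i" -lt "$count" ]; do
        : > "$d/entry_with_a_longish_name_$i"
        i=$((i + 1))
    done
    peak=$(dir_size "$d")
    rm -f "$d"/entry_with_a_longish_name_*
    after=$(dir_size "$d")
    rmdir "$d"
    echo "$before $peak $after"
}
```

If the third number printed by "dir_growth_demo 1000" stays at the second rather than dropping back toward the first, the directory kept its high water mark.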

cjc@ulysses.homer.nj.att.com (Chris Calabrese[mav]) (05/11/89)

In article <159@zebra.UUCP>, vern@zebra.UUCP (Vernon C. Hoxie) writes:
> [introduction to /etc/TZ and HDB LCK files deleted]
> 
> 	Is there not some other mechanism by which the functionality of
> these sparse files is accomplished but require much less storage space?
> 
> [...]
> 
> 	My basic question is: Is this good programming form?  Should
> not these processes been implemented with something of the nature
> of 'semaphores' or 'signals' or some other system level concept?  A
> conception which will accomplish the desired result without hogging so
> much memory.

Well, these files live on the disk, where space costs about 1 cent per K
(say, a 40meg disk for $500).
Such system level structures would have to live in memory, which costs
about 50 cents per K (say, a 1meg SIMM for $500).
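
Those per-K figures follow directly from the example prices.  A small sketch of the calculation, where cents_per_k is a hypothetical helper and a meg is taken as 1024K:

```shell
#!/bin/sh
# cents_per_k DOLLARS MEGS: price in cents of one K of storage.
cents_per_k() {
    awk -v dollars="$1" -v megs="$2" \
        'BEGIN { printf "%.1f\n", dollars * 100 / (megs * 1024) }'
}

cents_per_k 500 40    # 40meg disk for $500: prints 1.2
cents_per_k 500 1     # 1meg of memory for $500: prints 48.8
```

On these numbers a 1K lock file costs a bit over a cent on disk, while the same 1K held in memory would cost roughly forty times as much.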

Once you add the support code for keeping these structures around, you
might be approaching 1K anyway.  Besides, IPC is optional in the SVID :-)/2.

I think everyone will agree on not having these things as 'system level
concepts'.  The kernel is bloated enough already.

Now that I think about it, where would you put the TZ info if not in /etc/TZ?
(You could just have it as a line in /etc/profile instead of having
/etc/profile run /etc/TZ, but TZ is easier to edit.)  Would this have
to go into the battery backup memory like the clock?  Not exactly portable.

Excuse my bitching, but let's not try to turn UNIX into THE GREAT OPERATING
SYSTEM FROM OUTER SPACE.  Its main attraction is the simplicity of the kernel
(though it's getting a little un-simple these days) and the ease of changing
such things as how to store the timezone and create lock files for UUCP
(which, BTW, is just a plain old application, not part of the "system").
-- 
Name:			Christopher J. Calabrese
Brain loaned to:	AT&T Bell Laboratories, Murray Hill, NJ
att!ulysses!cjc		cjc@ulysses.att.com
Obligatory Quote:	``Now, where DID I put that bagel?''

flint@gistdev.UUCP (05/11/89)

90% of the time when it comes to a choice between using disk space and using
time, I'd rather use up the disk space.  The same goes for memory.  The only
thing you have to be careful about is that the amount of disk space or memory
being used might cost you in execution time.  (Like when it causes swapping.)

Extra disk capacity is cheap: labor costs are not.  If someone created a UNIX
that ran twice as fast but needed twice as much disk in order to run, they'd
have a lot of customers.

jc@minya.UUCP (John Chambers) (05/18/89)

In article <1608@auspex.auspex.com>, guy@auspex.auspex.com (Guy Harris) writes:
> If you really want to get fanatical about wasted disk space, worry about
> the "true" command; it doesn't need to contain any data, but on more
> recent versions of System V, for example, it contains an AT&T copyright
> notice (right, they've copyrighted the null sequence of bytes; give me a
> break).  Multiply *that* by the number of UNIX systems with that style
> of "true" command, and just *imagine* what a *huge* chunk of the GNPs of
> the world's nations are being *wasted* on that! :-) :-) :-) :-) :-)

A minor legal quibble:  If you look at /bin/true, you will find that it
actually contains a blank line, which is an executable statement.  This
is what they are actually copyrighting.  So if you sell any shell script
that contains a blank line, you are in violation of AT&T's copyright.

On this system, as on several others, I've replaced /bin/true and /bin/false
with executables (which will be left as an exercise for the reader, since
posting them would be an intellectual insult to any True Unix Wizards ;-).
I've verified that the result is a measurable speedup in "while true"
loops, due to the elimination of the shell startup to run an empty script.
But this isn't much of a big deal, since such loops are rather rare.
For example, try adding a line to /bin/true that appends a byte to some
file every time it is called, and watch how fast the file grows.  You
will probably be disappointed.  If you're not, then replace /bin/true
right away.  In fact, why don't you do it now - the time it takes will
eventually be recovered in faster system response time.
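
The byte-per-call measurement can be tried without editing the real /bin/true.  The following is a sketch under assumptions: make_counting_true and true_calls are made-up names, and the instrumented script lives in a scratch directory of your choosing.

```shell
#!/bin/sh
# make_counting_true DIR: create DIR/true, a script that appends one byte to
# DIR/true.log on every call and otherwise succeeds like "true".
make_counting_true() {
    mkdir -p "$1" || return 1
    # $1 is expanded now, so the log path is baked into the generated script.
    cat > "$1/true" <<EOF
#!/bin/sh
printf x >> "$1/true.log"
EOF
    chmod +x "$1/true"
}

# true_calls DIR: number of calls so far, i.e. the size of the log in bytes.
true_calls() {
    if [ -f "$1/true.log" ]; then
        wc -c < "$1/true.log" | tr -d ' '
    else
        echo 0
    fi
}
```

Putting DIR at the front of PATH counts only shells that run "true" as an external command; a shell with "true" built in never looks at the file, which is itself a hint about how little the script is used there.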

-- 
John Chambers <{adelie,ima,mit-eddie}!minya!{jc,root}> (617/484-6393)

[Any errors in the above are due to failures in the logic of the keyboard,
not in the fingers that did the typing.]

bzs@bu-cs.BU.EDU (Barry Shein) (05/21/89)

Actually, although humorous, I wonder about the legal implications of
that /bin/true which contains nothing but a copyright notice (and
perhaps one blank line.)

One could make an argument that AT&T ran around blindly copyrighting
everything in sight without being bothered to so much as inventory its
copyright value or verify that there were any contents to which their
copyrights could lay claim or be properly assigned.

This would tend to open up the arguments that:

	A) There is no reason to believe that merely because
	AT&T has stamped a copyright on something that they
	seriously lay claims to it since obviously they have
	not bothered to consider what they have copyrighted.

	B) That AT&T stamps copyrights frivolously and was
	not motivated by the (claimed or implied) value of
	what they copyright to review its status. That is, the
	material is of no value to THEM, otherwise they would
	have reviewed it before assigning claims.

Put simply, it was of no value to AT&T since they could not be
bothered to inventory what they were copyrighting so why should such
copyrights be of any value to the courts (the courts here acting as
agents of the desires of society at-large to have a copyright
protection which, at the very least and most minimal test, reflects
the worth of the material being copyrighted TO THE AUTHOR.)

The copyright action was of no value to them so why should it be of
any value to the rest of us (ie. the society which grants the rights
under the copyright law)? We surely cannot exhibit more concern for
the value of the material than the author!

It was frivolously done not to protect creative work but merely to
exploit and abuse the copyright law as evidenced by their copyright of
an empty program file (and re-issuing it as such, repeatedly, even
long after it was surely brought to their attention.)

The copyright law must, at a minimum, presume that the copyrightor is
aware of what s/he is copyrighting and is prepared to divulge its
worth (even if only to the author.) Otherwise one has to assume
contempt for the copyright law, not a position I would like to be in.

I would be interested in any case law which dealt with frivolous use
of the copyright law whose only purpose was to restrain trade rather
than protect a creative work (eg. someone trying to copyright a blank
book and lay claim to the concept of a blank book, as opposed to the
design of a particular blank book.)

Food for thought.
-- 
	-Barry Shein, Software Tool & Die

There's nothing more terrifying to hardware vendors than
satisfied customers.

fuat@cunixc.cc.columbia.edu (Fuat C. Baran) (05/23/89)

In article <31529@bu-cs.BU.EDU> bzs@bu-cs.BU.EDU (Barry Shein) writes:
>Actually, although humorous, I wonder about the legal implications of
>that /bin/true which contains nothing but a copyright notice (and
>perhaps one blank line.)
>
>One could make an argument that AT&T ran around blindly copyrighting
>everything in sight without being bothered to so much as inventory its
>copyright value or verify that there were any contents to which their
>copyrights could lay claim to or be properly assigned.

Are all UNIX files individually copyrighted, or is UNIX as a whole (or
in suitably large product chunks) protected by copyright?  Some of the
files (e.g. the sources to /bin/true, /bin/false, etc.) are obviously
trivial, and on their own would not merit a copyright notice, but as
part of UNIX as a whole they probably got the copyright notice just
to make sure every file associated with the copyright was marked.

>I would be interested in any case law which dealt with frivolous use
>of the copyright law who's only purpose was to restrain trade rather
>than protect a creative work (eg. someone trying to copyright a blank
>book and lay claim to the concept of a blank book, as opposed to the
>design of a particular blank book.)

I don't think slapping a copyright notice (or registering a copyright)
on /bin/true (along with all the other sources to the kernel and
utilities) is frivolous use of the copyright law.  Attempting to sue
for copyright infringement based solely on someone else's sources to
/bin/true being similar to AT&T's WOULD be frivolous, and would
probably get thrown out of court.


						--Fuat

-- 
INTERNET: fuat@columbia.edu          U.S. MAIL: Columbia University
BITNET:   fuat@cunixc.cc.columbia.edu           Center for Computing Activities
USENET:   ...!rutgers!columbia!cunixc!fuat      712 Watson Labs, 612 W115th St.
PHONE:    (212) 854-5128                        New York, NY 10025

jiii@visdc.UUCP (John E Van Deusen III) (05/23/89)

In article <31529@bu-cs.BU.EDU> bzs@bu-cs.BU.EDU (Barry Shein) writes:
>
> I wonder about the legal implications that /bin/true which contains
> nothing but a copyright notice ...

Since it is impossible to copyright ideas, (see COMPUTER SOFTWARE
PROTECTION by Thorne D Harris III, 1985, Prentice Hall, 0-13-528373-6),
I can see how AT&T might have been concerned about publishing a
significant, although small, part of the UNIX operating system that was
unequivocally unique.  A software company in Korea, trying to export an
operating system called YUNICS, might claim that although their code was
almost identical to UNIX, certain things can only be done in one way;
exhibit A, /bin/true.

I am currently not inclined to use either /bin/true or /bin/false.  The
colon ':' has the same effect as true, but as part of the shell it
avoids the path search and file read operations.  Test(1) is usually a
part of the shell too.  Thus,

	test ""

should usually be more efficient than /bin/false.  Even if it were not
more efficient, it is at least as clear to me what is intended.
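
For anyone who wants to see the two substitutes in action, here is a short sketch; the variable names are arbitrary and nothing in it depends on /bin/true or /bin/false being present.

```shell
#!/bin/sh
# ":" is the null command; it always succeeds, like true.
: && colon_status=$?

# test with an empty string argument always fails, like false.
test "" || empty_status=$?

echo "colon: $colon_status"    # prints "colon: 0"
echo "empty: $empty_status"    # prints "empty: 1"

# A "while true" style loop written with ":"; it stops itself after 3 passes.
i=0
while :; do
    i=$((i + 1))
    if [ "$i" -ge 3 ]; then
        break
    fi
done
echo "passes: $i"              # prints "passes: 3"
```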
--
John E Van Deusen III, PO Box 9283, Boise, ID  83707, (208) 343-1865

uunet!visdc!jiii