[comp.unix.internals] RAM disk.

jms@tardis.Tymnet.COM (Joe Smith) (09/13/90)

In article <1990Sep12.084002.5575@hq.demos.su> dvv@hq.demos.su (Dmitry V. Volodin) writes:
>Folks, does anyone want to discuss the pros and cons of placing the
>swap/pageing area onto the ram disk? :)

I understand the joke, but did you know that Sun has done the opposite?
They have put the ram disk on the swap/paging area.  Actually, it's more
like any page of physical memory can be used for either a swapped-in page
or a tmpfs file system page, first come first served.

Small files stay completely in ram.  Large files spill over into swap space,
but it's still faster than a regular file partition due to not waiting for
synchronized writes to the directory blocks, the bitmap/freelist, superblock,
etc.  It's good for /tmp (but not /usr/tmp unless you have a giant swap space.)

-- 
Joe Smith (408)922-6220 | SMTP: jms@tardis.tymnet.com or jms@gemini.tymnet.com
BT Tymnet Tech Services | UUCP: ...!{ames,pyramid}!oliveb!tymix!tardis!jms
PO Box 49019, MS-C41    | BIX: smithjoe | 12 PDP-10s still running! "POPJ P,"
San Jose, CA 95161-9019 | humorous disclaimer: "My Amiga speaks for me."

del@thrush.mlb.semi.harris.com (Don Lewis) (09/13/90)

In article <1223@tardis.Tymnet.COM> jms@tardis.Tymnet.COM (Joe Smith) writes:
>I understand the joke, but did you know that Sun has done the opposite?
>They have put the ram disk on the swap/paging area.  Actually, it's more
>like any page of physical memory can be used for either a swapped-in page
>or a tmpfs file system page, first come first served.
>
>Small files stay completely in ram.  Large files spill over into swap space,
>but it's still faster than a regular file partition due to not waiting for
>synchronized writes to the directory blocks, the bitmap/freelist, superblock,
>etc.  It's good for /tmp (but not /usr/tmp unless you have a giant swap space.)

Does anyone have a feel for the relative performance of Sun's tmpfs versus
a 4.2 filesystem?  I have an application that uses a lot of temporary
file space.  After it is finished thrashing about with the scratch files
it builds a large data structure in memory.  The amount of swap space
and the amount of scratch file space consumed at one time are somewhat
complementary.  Without using tmpfs, I need both big swap and big /tmp.
I am interested in combining these into one large swap and using tmpfs.
The issue is what is the performance when simultaneously reading and
writing several large files (and possibly significant paging as well)
using tmpfs versus the same operations using the 4.2 filesystem.  The
hosts in question currently have /tmp and swap on separate drives and
have a fair amount of RAM.
--
Don "Truck" Lewis                      Harris Semiconductor
Internet:  del@mlb.semi.harris.com     PO Box 883   MS 62A-028
Phone:     (407) 729-5205              Melbourne, FL  32901

mohta@necom830.cc.titech.ac.jp (Masataka Ohta) (09/13/90)

In article <1990Sep13.002300.15266@mlb.semi.harris.com>
	del@thrush.mlb.semi.harris.com (Don Lewis) writes:

>Does anyone have a feel for the relative performance of Sun's tmpfs versus
>a 4.2 filesystem?

A few months ago, in comp.arch, someone at Sun posted measurements
showing that tmpfs doesn't improve the performance of a whole kernel
recompilation.

>I have an application that uses a lot of temporary
>file space.  After it is finished thrashing about with the scratch files
>it builds a large data structure in memory.  The amount of swap space
>and the amount of scratch file space consumed at one time are somewhat
>complementary.

There is a positive correlation between the amount of swap space and
the amount of temporary file space a process needs. Moreover, they are
often needed at the same time.

That is, if you write something large to a file, you almost certainly
have a copy of it in memory, and vice versa.

>Without using tmpfs, I need both big swap and big /tmp.

So, even with tmpfs, you need a swap area twice as big.

>The issue is what is the performance when simultaneously reading and
>writing several large files (and possibly significant paging as well)
>using tmpfs versus the same operations using the 4.2 filesystem.

If you do a large amount of I/O to /tmp, with a simple-minded memory
disk, it is about TWICE AS SLOW AS an ordinary disk file system.

The reason is that a memory disk can't do asynchronous writes. Data is
copied from user space to the buffer cache and then to the memory disk.
With an ordinary disk, data is only copied to the buffer cache.

If you use an elaborate and complicated memory disk, it can be only as
slow as an ordinary disk, but not faster.
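
If you want to measure instead of argue, a crude timer like the one
below will show the difference (just a sketch: make the file comfortably
larger than your buffer cache, then run it once on /tmp and once on an
ordinary disk file system):

    #include <stdio.h>
    #include <string.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/time.h>

    #define BLK   8192
    #define NBLK  4096  /* 32 MB total; raise it until it beats your cache */

    int main(int argc, char **argv)
    {
        const char *path = (argc > 1) ? argv[1] : "/tmp/XXX";
        struct timeval t0, t1;
        char buf[BLK];
        double secs;
        int fd, i;

        memset(buf, 0, sizeof buf);
        if ((fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666)) < 0)
            return perror(path), 1;

        gettimeofday(&t0, (struct timezone *)0);
        for (i = 0; i < NBLK; i++)
            if (write(fd, buf, BLK) != BLK)
                return perror("write"), 1;
        fsync(fd);      /* make the disk case pay for its delayed writes */
        gettimeofday(&t1, (struct timezone *)0);

        secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
        printf("%.0f KB/sec\n", (double)NBLK * BLK / 1024.0 / secs);
        close(fd);
        unlink(path);
        return 0;
    }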

						Masataka Ohta

del@thrush.mlb.semi.harris.com (Don Lewis) (09/14/90)

In article <6167@titcce.cc.titech.ac.jp> mohta@necom830.cc.titech.ac.jp (Masataka Ohta) writes:
>In article <1990Sep13.002300.15266@mlb.semi.harris.com>
>	del@thrush.mlb.semi.harris.com (Don Lewis) writes:
>
>>Does anyone have a feel for the relative performance of Sun's tmpfs versus
>>a 4.2 filesystem?
>
>A few months ago, in comp.arch, someone at Sun posted measurements
>showing that tmpfs doesn't improve the performance of a whole kernel
>recompilation.
>

In my case, the files are larger (10-80 Meg or so).

>>I have an application that uses a lot of temporary
>>file space.  After it is finished thrashing about with the scratch files
>>it builds a large data structure in memory.  The amount of swap space
>>and the amount of scratch file space consumed at one time are somewhat
>>complementary.
>
>There is a positive correlation between the amount of swap space and
>the amount of temporary file space a process needs. Moreover, they are
>often needed at the same time.
>
>That is, if you write something large to a file, you almost certainly
>have a copy of it in memory, and vice versa.

In my case, there is somewhat of an inverse relationship.  About half
the memory bloat occurs near the end as the program is reading and
deleting files.  It does write a final (large) output file, but it
is not something I want to trust to tmpfs.

>
>>Without using tmpfs, I need both big swap and big /tmp.
>
>So, even with tmpfs, you need a swap area twice as big.
>

Probably only 1.5x in my case, but what's a hundred or so megabytes
between friends :-).

>>The issue is what is the performance when simultaneously reading and
>>writing several large files (and possibly significant paging as well)
>>using tmpfs versus the same operations using the 4.2 filesystem.
>
>If you do a large amount of I/O to /tmp, with a simple-minded memory
>disk, it is about TWICE AS SLOW AS an ordinary disk file system.
>
>The reason is that a memory disk can't do asynchronous writes. Data is
>copied from user space to the buffer cache and then to the memory disk.
>With an ordinary disk, data is only copied to the buffer cache.
>
>If you use an elaborate and complicated memory disk, it can be only as
>slow as an ordinary disk, but not faster.

This is the sort of info I'm looking for.
--
Don "Truck" Lewis                      Harris Semiconductor
Internet:  del@mlb.semi.harris.com     PO Box 883   MS 62A-028
Phone:     (407) 729-5205              Melbourne, FL  32901

lm@slovax.Sun.COM (Larry McVoy) (09/30/90)

In article <6167@titcce.cc.titech.ac.jp> mohta@necom830.cc.titech.ac.jp (Masataka Ohta) writes:
>In article <1990Sep13.002300.15266@mlb.semi.harris.com>
>	del@thrush.mlb.semi.harris.com (Don Lewis) writes:
>
>>Does anyone have a feel for the relative performance of Sun's tmpfs versus
>>a 4.2 filesystem?
>
>A few months ago, in comp.arch, someone at Sun posted measurements
>showing that tmpfs doesn't improve the performance of a whole kernel
>recompilation.

This is true in SunOS 4.1, but not in SunOS 4.1.1.  There was a bug,
a rather silly bug, that made write performance to tmpfs about
the same as ufs.  (For those who care: in the write case, if you are
doing a partial page, you have to read the page in before writing over
the part that the user sent down.  If you are writing a whole page
there is no need to fault the page in from disk; you're overwriting all
of it.  Tmpfs didn't do this optimization, so performance
was limited by how fast one can fault in pages from the swap device.)
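
In pseudo-C, the fix is roughly the following (all names are invented
for illustration; this is not the actual SunOS source):

    /* one page's worth of a tmpfs write */
    if (pgoff == 0 && resid >= PAGESIZE) {
            /*
             * Overwriting the whole page: create a fresh anonymous
             * page, no need to fault the old contents in from swap.
             */
            pp = page_create(vp, off);
    } else {
            /* partial page: read-modify-write, so fault it in first */
            pp = page_fault_in(vp, off);
    }
    copyin(ubuf, page_addr(pp) + pgoff, nbytes);    /* user -> kernel */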

My measurements of kernel builds have shown a 20% improvement in wall
clock time on an otherwise idle system.  The test was a build of a
GENERIC kernel (i.e., everything) and the only difference was that
/tmp was tmpfs instead of UFS.

Tests like "time dd if=/dev/zero of=/tmp/XXX count=1000" showed the data
rate changing from 300KB/sec to 5MB/sec on an SS1 (if you do the math,
an SS1 can't bcopy much faster than that).

>If you do a large amount of I/O to /tmp, with a simple-minded memory
>disk, it is about TWICE AS SLOW AS an ordinary disk file system.
>
>The reason is that a memory disk can't do asynchronous writes. Data is
>copied from user space to the buffer cache and then to the memory disk.
>With an ordinary disk, data is only copied to the buffer cache.

2X as slow is correct.  The reasoning is incorrect.  Tmpfs is better
than a ram disk because it avoids an extra copy in each direction.  It
copies the data from user to kernel but not from kernel to ramdisk.  If
you think about it, that's two fewer copies.  Think hard.
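
Spelled out, the data paths are:

    ram disk write:  user buffer -> buffer cache -> ram disk pages  (2 copies)
    ram disk read:   ram disk pages -> buffer cache -> user buffer  (2 copies)
    tmpfs write:     user buffer -> page cache                      (1 copy)
    tmpfs read:      page cache -> user buffer                      (1 copy)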

>If you use an elaborate and complicated memory disk, it can be only as
>slow as an ordinary disk, but not faster.

Bullsh*t.  Tmpfs is orders of magnitude faster than a disk.  

>						Masataka Ohta
---
Larry McVoy, Sun Microsystems     (415) 336-7627       ...!sun!lm or lm@sun.com

jfh@rpp386.cactus.org (John F. Haugh II) (10/01/90)

In article <143190@sun.Eng.Sun.COM> lm@slovax.Sun.COM (Larry McVoy) writes:
>In article <6167@titcce.cc.titech.ac.jp> mohta@necom830.cc.titech.ac.jp (Masataka Ohta) writes:
>>If you use an elaborate and complicated memory disk, it can be only as
>>slow as an ordinary disk, but not faster.
>
>Bullsh*t.  Tmpfs is orders of magnitude faster than a disk.  

I suspect this was a typo - if tmpfs can only be as slow, and not
any faster, then it must be the same speed.

As for the 5MB/sec transfer limitation - does the SS1 have a DMA
setup that can handle memory-to-memory transfers?  I would hope
that the memory subsystem can handle more than 5MB/sec ...
-- 
John F. Haugh II                             UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 832-8832                           Domain: jfh@rpp386.cactus.org
"SCCS, the source motel!  Programs check in and never check out!"
		-- Ken Thompson

mohta@necom830.cc.titech.ac.jp (Masataka Ohta) (10/02/90)

In article <143190@sun.Eng.Sun.COM> lm@slovax.Sun.COM (Larry McVoy) writes:

>>A few months ago, in comp.arch, someone at Sun posted measurements
>>showing that tmpfs doesn't improve the performance of a whole kernel
>>recompilation.

>This is true in SunOS 4.1, but not in SunOS 4.1.1.  There was a bug,

If you know what responsibility is, you should post that to comp.arch.

>My measurements of kernel builds have shown a 20% improvement in wall
>clock time on an otherwise idle system.

That is what I already observed with the delay option (the delay option
is a very simple and better replacement for a RAM disk or tmpfs; see my
paper in the proceedings of the 1990 Summer USENIX conference).

>Tests like "time dd if=/dev/zero of=/tmp/XXX count=1000" showed the data
>rate changing from 300KB/sec to 5MB/sec on an SS1 (if you do the math,
>an SS1 can't bcopy much faster than that).

And you can observe the same 5MB/sec rate even on ordinary files, if
you are using a SANE UNIX.

>>If you do a large amount of I/O to /tmp, with a simple-minded memory
>>disk, it is about TWICE AS SLOW AS an ordinary disk file system.

>2X as slow is correct.  The reasoning is incorrect.  Tmpfs is better
>than a ram disk because it avoids an extra copy in each direction.

READ WHAT I POST.

>>If you do a large amount of I/O to /tmp, with a simple-minded memory disk,
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I am not saying anything about tmpfs here.

>>If you use an elaborate and complicated memory disk, it can be only as
>>slow as an ordinary disk, but not faster.

>Bullsh*t.  Tmpfs is orders of magnitude faster than a disk.  

Of course, you KNOW fgrep is faster than grep.

I miss comp.unix.wizards.

See the assumption:

>>If you do a large amount of I/O to /tmp,
            ^^^^^^^^^^^^^^^^^^^^^

If you know what the buffer cache is, you can understand why tmpfs is
only as fast as an ordinary disk file system.

						Masataka Ohta

jfh@rpp386.cactus.org (John F. Haugh II) (10/08/90)

In article <143359@sun.Eng.Sun.COM> lm@sun.UUCP (Larry McVoy) writes:
>I'm not sure if you think he made the typo or I did, but my statement stands.
>Just for grins (mydd is a little program that moves I/O like dd but has some
>other options and timing info stuck in for free):

Oops.  I meant that the other poster probably made a typo.  His statement
was that tmpfs wasn't any slower and couldn't be any faster.  By implication,
this means tmpfs is exactly as fast as diskfs (or whatever ...).  Since this
is just plain unlikely, I assumed he meant something else.

>$ mydd if=internal of=/tmp/XXX count=500 fsync=1
>4000.00 Kbytes in 0.258 seconds (15505.4 Kbytes/s)
>$ mydd if=internal of=/usr/tmp/XXX count=500 fsync=1
>4000.00 Kbytes in 5.1 seconds (784.386 Kbytes/s)
>
>Yeah, so I lied, it's 15MB / sec not 5MB / sec.  I was being conservative :-)
>Actually, slovax is a 4/470 which has bcopy hardware support, the 5MB/sec
>number is pretty close for a 20MHZ SS1.

OK.  But as I pointed out, single "small" amounts of I/O are not the proper
test.  The only valid test of I/O performance is continuous I/O at some
simulated load.  For example, running dd on this system does not show any
slowdown until I dd 512K; everything up to that point happens in the same
amount of time, because all of the I/O is absorbed by the system buffers
until then.  Testing with a larger amount of total I/O will give a more
accurate representation.
-- 
John F. Haugh II                             UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 832-8832                           Domain: jfh@rpp386.cactus.org
"SCCS, the source motel!  Programs check in and never check out!"
		-- Ken Thompson

boyd@necisa.ho.necisa.oz (Boyd Roberts) (10/09/90)

When I hear `ram disk' I reach for my revolver.  Now, repeat after me...

    What is the buffer cache? -- A ram disk.

Increase NBUF and throw tmpfs away.  Vote 1 comp.unix.gizzards.


Boyd Roberts			boyd@necisa.ho.necisa.oz.au

``When the going gets weird, the weird turn pro...''

cpcahil@virtech.uucp (Conor P. Cahill) (10/09/90)

In article <1850@necisa.ho.necisa.oz> boyd@necisa.ho.necisa.oz (Boyd Roberts) writes:
>When I hear `ram disk' I reach for my revolver.  Now, repeat after me...
>    What is the buffer cache? -- A ram disk.
>Increase NBUF and throw tmpfs away.

This is not true.  The ram disk could be considered a buffer cache for
a particular portion of the disk, but it is not a buffer cache for
the entire disk.  Depending upon the configuration of the system and
the application mixture, it may be more advantageous to have a ram disk
than to increase the buffer cache.

On a system that was running near 90% utilization (i.e. very little CPU 
left) we doubled the number of NBUF entries and system performance
*dropped* significantly.  This was probably due to the extra time spent
searching through the buffer cache looking to see if a block was there.

This is an example of why performance tuning is magic.  There are no
simple answers for all cases.  What seems at first examination to be an
obvious performance gain may turn out to be a loss.

-- 
Conor P. Cahill            (703)430-9247        Virtual Technologies, Inc.,
uunet!virtech!cpcahil                           46030 Manekin Plaza, Suite 160
                                                Sterling, VA 22170 

bzs@world.std.com (Barry Shein) (10/10/90)

>When I hear `ram disk' I reach for my revolver.  Now, repeat after me...

I agree Boyd.

The "problem" Unix is experiencing right now is that it has become
standard in the industry and all these folks from other O/S cultures
are having it thrown in their laps.

Those with some system experience and an opinion immediately begin
shouting for all these jargon items they were sold on from their
former system (systems which, I may add, are now rapidly becoming
obsolete).

We had a good rash of this when TOPS-20 died a few years ago and lots
of TOPS-20 types who had been switched to Unix suddenly began shouting
for a rework of their favorite features.

Here are several features that keep coming up. Not all are without
merit, but some are already there (like RAM disk) and others are
questionably portable or, well, without merit.

1. RAM Disk

Unix already has most of the advantages of a RAM disk, so this is
mostly a "jargon" check-off item. Had we called the buffer cache a "ram
disk", most of these people would not be asking for this today.

2. Caseless file system

This is stupid. The ability to insert and distinguish almost all 256
character codes is used in other languages to great advantage (e.g.
Kanji.)

The point being: file names are not stored in ASCII under Unix, they're
stored as a string of 8-bit bytes, so case is only in the eyes of the
beholder.

There are a few byte values (such as slash) which are hard to encode,
but so what; you can't give your computer an IP address of all ones or
all zeros either, hardly damning. Out-of-band values have a long
history in computing and engineering.

3. Asynchronous I/O for "performance" reasons

All Unix block I/O is asynchronous (well, driver-dependent of course,
but disks and tapes and so forth.)

The recent addition to Unix has been *synchronous* I/O (FSYNC bit.)
The one major exception is directory updates, but that's never the
issue when this comes up.

Systems like VMS have synchronous I/O as the default so asynch had to
be engineered in as the exception. It's important to understand all
this before shouting for a particular style of programming interface
that probably won't change the performance of your application one bit
(actually, will probably slow it down as now you've added all sorts of
exception handling baggage to a formerly low overhead feature.)
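
For concreteness, here is what the default and the exception look like
from user code (a sketch; the exact spelling of the synchronous open
flag varies by vendor, so I'll stick to fsync()):

    #include <fcntl.h>
    #include <unistd.h>

    void append_record(const char *buf, int len)
    {
        int fd = open("/usr/spool/log", O_WRONLY | O_CREAT | O_APPEND, 0666);

        write(fd, buf, len);    /* asynchronous by default: returns as soon
                                   as the data is in the buffer cache */
        fsync(fd);              /* the synchronous exception: blocks until
                                   the data has really reached the disk */
        close(fd);
    }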

Another related feature is being able to get an interrupt when an I/O
has really gone to disk (et al). This has merit and for many
applications would be vastly superior to the FSYNC bit. I'm not sure
why this hasn't been done universally as the SIGIO signal could just
be used for this and it probably only entails marking a bit in the
buffer and having the kernel issue a psignal() or whatever when the
write completes and the buffer is being freed. Perhaps I'm missing
something.

4. Command names which resemble English words.

Certainly more popular among those who speak English.

Bell Labs commissioned independent studies early in Unix's history to
see if this was important or not. I've tried to locate these, others
have claimed to have been involved and seen the reports. It's possible
they were "internal use only", perhaps the work should be repeated.

The basic conclusion was that you can make commands English, weird,
mnemonic, or even counter-intuitive (e.g. "delete" means edit a file,
"edit" means delete, etc.) and it simply doesn't seem to make all that
much difference to learning curves.

This may seem horribly counter-intuitive and against all conventional
wisdom (some people get quite apoplectic when this is asserted, sort
of like telling them that there is no Santa Claus.)

Consider, for a moment, that the company which produced Unix has had
some moderate success getting the general public to enter long strings
of digits as part of their daily activity. And it is even successful
among children and other idiots, etc.

Motivation to use the thing appears to be more important than what you
call the commands. In the end, the remarkable thing about computers
are the people who use them.

5. TSR's

The MS/DOS community developed these out of utter desperation due to
their single-tasking O/S and the way memory management was
brain-damaged from the start. See "job control". Of no merit.

----------

There are many other things that can go on this list. It might be
interesting to generate such a list and post it monthly along with the
frequently asked questions lists. Something like "Why Doesn't Unix..."
-- 
        -Barry Shein

Software Tool & Die    | {xylogics,uunet}!world!bzs | bzs@world.std.com
Purveyors to the Trade | Voice: 617-739-0202        | Login: 617-739-WRLD

kessler@hacketorium.Eng.Sun.COM (Tom Kessler) (10/10/90)

In article <1850@necisa.ho.necisa.oz> boyd@necisa.ho.necisa.oz (Boyd Roberts) writes:
>When I hear `ram disk' I reach for my revolver.  Now, repeat after me...
>    What is the buffer cache? -- A ram disk.
>Increase NBUF and throw tmpfs away.

Whoa, hold on there.  There are some performance wins with tmpfs
(once you get it working right :-) ) that you can't get by just upping
NBUF (with the SunOS/System V R4 memory management, NBUF doesn't do what
it used to do anyway :-) ).  Remember that when you create and delete
lots of short-lived files you've got to update their inodes (yes, I know
inodes are cached as well as blocks).  With the UFS file system you
actually go to disk quite frequently (mostly for recovery reasons).
If you've marked this file system as a tmpfs, the kernel doesn't have
to worry about getting your stuff out to disk, ever, because presumably
you don't care whether the files are still there after a reboot.
Maybe you could tweak the file system to "know this", but whatever the
reason, I've found tmpfs to speed up compiles quite a bit.

brtmac@maverick.ksu.ksu.edu (Brett McCoy) (10/10/90)

In <1990Oct09.121447.3336@virtech.uucp> cpcahil@virtech.uucp (Conor P. Cahill) writes:

>On a system that was running near 90% utilization (i.e. very little CPU 
>left) we doubled the number of NBUF entries and system performance
>*dropped* significantly.  This was probably due to the extra time spent
>searching through the buffer cache looking to see if a block was there.

Could someone tell me whether it is possible on a Sun SS1 to increase
or decrease the number of NBUF entries, and if so, where it needs to be
done?  I looked in the kernel config files and couldn't really find
anything that looked like it would do the job.
--
Too bad the universe doesn't run in a segmented environment with
protected memory. -- Wiz from "Wizards Bane" by Rick Cook
Brett McCoy                 | Kansas State University
brtmac@maverick.ksu.ksu.edu | UseNet news manager.

brtmac@maverick.ksu.ksu.edu (Brett McCoy) (10/10/90)

In <KESSLER.90Oct9140542@hacketorium.Eng.Sun.COM> kessler@hacketorium.Eng.Sun.COM (Tom Kessler) writes:


>Maybe you could tweak the file system to "know this", but whatever the
>reason, I've found tmpfs to speed up compiles quite a bit.

Has anyone done any comparisons between compiling using a RAM tmp disk
and using the -pipe option on the compiler?  Seems to me that they would
be much the same, since both use memory only and not a lot of temporary
files on a physical disk; they just go about it differently.  I'm still
running 4.0.3c so I can't really try this out.

--
Too bad the universe doesn't run in a segmented environment with
protected memory. -- Wiz from "Wizards Bane" by Rick Cook
Brett McCoy                 | Kansas State University
brtmac@maverick.ksu.ksu.edu | UseNet news manager.

gt0178a@prism.gatech.EDU (Jim Burns) (10/10/90)

in article <BZS.90Oct9144959@world.std.com>, bzs@world.std.com (Barry Shein) says:

> 5. TSR's

> The MS/DOS community developed these out of utter desperation due to
> their single-tasking O/S and the way memory management was
> brain-damaged from the start. See "job control". Of no merit.

Wrong - unless you are using a windowing environment, and there are still
plenty of glass tube unices out there. And even then, few windowing
environments I've worked in can match the one or two keystroke
responsiveness of a good TSR. (Granted, what you're talking about *does*
apply to the filter and os extension types of TSRs.)

P.S. - Since we don't get the alt. groups here, I don't know if that
(alt.religion.computers) is a real group, or just your idea of a joke,
but thanx - my mailer choked on it the first time around.
-- 
BURNS,JIM
Georgia Institute of Technology, Box 30178, Atlanta Georgia, 30332
uucp:	  ...!{decvax,hplabs,ncar,purdue,rutgers}!gatech!prism!gt0178a
Internet: gt0178a@prism.gatech.edu

jgh@root.co.uk (Jeremy G Harris) (10/10/90)

In article <1850@necisa.ho.necisa.oz> boyd@necisa.ho.necisa.oz (Boyd Roberts) writes:
>When I hear `ram disk' I reach for my revolver.

I agree strongly.  However, increasing NBUF is not the only thing which should
be done (besides, I don't think it'll work all that well in V.4 or SunOS).
A tmpfs should also have absolutely no filesystem hardening.  No careful
writing of that inode out to disk.  Zilch.

-- 
Jeremy Harris			jgh@root.co.uk			+44 71 315 6600

cosc038@canterbury.ac.nz (10/10/90)

There's been some pretty heated discussion on this topic lately

[most of it deleted]

Boyd Roberts, for instance, writes:

In article <1850@necisa.ho.necisa.oz>, boyd@necisa.ho.necisa.oz (Boyd Roberts) writes:
> When I hear `ram disk' I reach for my revolver.  Now, repeat after me...
> 
>     What is the buffer cache? -- A ram disk.
> 
> Increase NBUF and throw tmpfs away.  Vote 1 comp.unix.gizzards.

(Larry McVoy is talking about SunOS 4.1, where there is no longer any
distinction between the buffer cache and the physical memory
available for virtual memory - but that's beside the point).

I tend to agree with most posters who have expressed sentiments similar to
Boyd's, if we are talking about a single machine with disc.  However,
tmpfs could be a very big win where /tmp would otherwise be NFS-mounted.
This is because EVERY write to an NFS-mounted /tmp would have to be written 
synchronously to a remote disc, whereas every write to a tmpfs file system
would go no further than local RAM.

We have seen something similar here in the Department.  Ordinarily
the various processes involved in doing a cc(1) communicate using files
in /tmp.  There is a -pipe option which connects the cc(1) processes
directly using pipes, with no need for /tmp files.  The cc(1) man page
comments that the -pipe option is "Very CPU intensive".  We have found
though that when a server is heavily loaded compiles run MUCH more
quickly on clients if the -pipe option is used.  I would put this
down to the fact that when the -pipe option is used, far fewer
synchronous NFS writes to /tmp are required.

So perhaps Larry McVoy and the other posters have been talking a little
at cross-purposes?
                  
-- 
Paul Ashton
Email(internet): paul@cosc.canterbury.ac.nz
NZ Telecom:     Office: +64 3 667 001 x6350
NZ Post:        Dept of Computer Science
                University of Canterbury, Christchurch, New Zealand

richard@aiai.ed.ac.uk (Richard Tobin) (10/10/90)

In article <1850@necisa.ho.necisa.oz> boyd@necisa.ho.necisa.oz (Boyd Roberts) writes:
>When I hear `ram disk' I reach for my revolver.  Now, repeat after me...

>    What is the buffer cache? -- A ram disk.

As has been pointed out elsewhere, there *is* a difference - most unix
filesystems will try to increase reliability by forcing certain writes
to take place synchronously.  This makes creating files faster on a ram
disk regardless of the buffer cache size.  Whether this affects you
depends on whether you write a few large files or lots of small ones.

However, it is reasonable for /tmp to be a filesystem which does not
do any synchronous writes, if you don't find it important to maintain
/tmp across crashes.  Once you do this, its performance should be
similar to (or better than) a ram disk.

-- Richard

-- 
Richard Tobin,                       JANET: R.Tobin@uk.ac.ed             
AI Applications Institute,           ARPA:  R.Tobin%uk.ac.ed@nsfnet-relay.ac.uk
Edinburgh University.                UUCP:  ...!ukc!ed.ac.uk!R.Tobin

meissner@osf.org (Michael Meissner) (10/11/90)

In article <1990Oct10.152556.9367@canterbury.ac.nz>
cosc038@canterbury.ac.nz writes:

| We have seen something similar here in the Department.  Ordinarily
| the various processes involved in doing a cc(1) communicate using files
| in /tmp.  There is a -pipe option which connects the cc(1) processes
| directly using pipes, with no need for /tmp files.  The cc(1) man page
| comments that the -pipe option is "Very CPU intensive".  We have found
| though that when a server is heavily loaded compiles run MUCH more
| quickly on clients if the -pipe option is used.  I would put this
| down to the fact that when the -pipe option is used, far fewer
| synchronous NFS writes to /tmp are required.

I don't think -pipe in general is more CPU intensive, but rather
memory intensive, in that you should have enough memory to run both
the compiler and assembler simultaneously without thrashing.  The
-pipe option also wins on a multiprocessor, where the assembler and
compiler can run on separate CPUs.

--
Michael Meissner	email: meissner@osf.org		phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142

Do apple growers tell their kids money doesn't grow on bushes?

dvv@hq.demos.su (Dmitry V. Volodin) (10/11/90)

In article <1990Oct09.121447.3336@virtech.uucp> cpcahil@virtech.UUCP (Conor P. Cahill) writes:
>On a system that was running near 90% utilization (i.e. very little CPU 
>left) we doubled the number of NBUF entries and system performance
>*dropped* significantly.  This was probably due to the extra time spent
>searching through the buffer cache looking to see if a block was there.

And what about hash queues and other neat things? Did you forget them?
You mean it - you get it.
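
With hashed buffer headers, a lookup costs the length of one chain, not
a scan of all NBUF buffers. Schematically (System V flavour, a sketch
from memory, not any particular vendor's source):

    #include <sys/types.h>

    #define NHBUF 64        /* number of hash chains; scale with NBUF */
    #define BHASH(d, b)     (&hbuf[((int)(d) + (int)(b)) & (NHBUF - 1)])

    struct buf {
        struct buf *b_forw;     /* hash chain link */
        dev_t       b_dev;
        daddr_t     b_blkno;
        /* ... */
    };
    struct buf hbuf[NHBUF];     /* chain heads; chains are circular */

    struct buf *
    incore(dev_t dev, daddr_t blkno)    /* is the block cached? */
    {
        struct buf *hp = BHASH(dev, blkno), *bp;

        for (bp = hp->b_forw; bp != hp; bp = bp->b_forw)
            if (bp->b_blkno == blkno && bp->b_dev == dev)
                return bp;              /* hit */
        return (struct buf *)0;         /* miss */
    }
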
-- 
Dmitry V. Volodin <dvv@hq.demos.su>     |
fax:    +7 095 233 5016                 |      Call me Dima (D-'ee-...)
phone:  +7 095 231 2129                 |

boyd@necisa.ho.necisa.oz (Boyd Roberts) (10/11/90)

In article <1990Oct10.152556.9367@canterbury.ac.nz> cosc038@canterbury.ac.nz writes:
>  There is a -pipe option which connects the cc(1) processes
>directly using pipes, with no need for /tmp files.

Brucee did that at basser.  cc had a -K flag (Keep it in the buffer cache).
Sped up compiles no end, but with 3 cc -O -K's going (courtesy of a parallel
make) it'd _slaughter_ a 780.  The whole thing would be CPU bound -- nice.


Boyd Roberts			boyd@necisa.ho.necisa.oz.au

``When the going gets weird, the weird turn pro...''

grp@Unify.com (Greg Pasquariello) (10/11/90)

In article <14884@hydra.gatech.EDU>, gt0178a@prism.gatech.EDU (Jim Burns)
writes:

> > 5. TSR's
> 
> > The MS/DOS community developed these out of utter desperation due to
> > their single-tasking O/S and the way memory management was
> > brain-damaged from the start. See "job control". Of no merit.
> 
> Wrong - unless you are using a windowing environment, and there are still
> plenty of glass tube unices out there. And even then, few windowing
> environments I've worked in can match the one or two keystroke
> responsiveness of a good TSR. (Granted, what you're talking about *does*
> apply to the filter and os extension types of TSRs.)

A windowing environment has nothing to do with it.  Like the man said,
see "job control", which allows me to put applications in the background,
or suspend them, until I am ready to use them again.  At which point I
can invoke them very quickly and easily.

> P.S. - Since we don't get the alt. groups here, I don't know if that
> (alt.religion.computers) is a real group, or just your idea of a joke,
> but thanx - my mailer choked on it the first time around.

--

-Greg Pasquariello	grp@unify.com

malc@iconsys.uucp (Malcolm Weir) (10/12/90)

OK, Reach For Your Revolver... Make My Day! But you dudes who say "RAM disks?
Unnecessary, 'cos we've got a Buffer Cache!!" are WRONG, INCORRECT, MISTAKEN,
and basically WAY OUT OF LINE.

Consider the AT&T System Administrator's Guide section on "Performance Tuning".
This source will indicate that it is a Good Thing to put heavily used
directories in the physically-centered cylinders of your faster disks. Obvious,
huh? OK, now stuff the same data on a RAM disk, as a logical extension.
"But", you cry, "the Buffer Cache would do that for us!"

Really? Just how do you persuade *nix to cache "/lib/*", in preference to
Joe Unimportant-User's huge statistical jobs that have been munging vast
amounts of data for the past 12 days?  How do you persuade it that the
disk accesses caused by the backup of "/irrelevant" are less important than
the accesses caused by the CEO's secretary's WP temp files?

There are MANY applications where RAM disks provide incredible benefits.
Don't disparage them just because you don't know how REAL systems are used,
with varying importance of tasks.

A perfect example of a manufacturer who has done excellent things with
RAM disk is Pyramid Technology.  They allow you to do all sorts of neat
things with RAM disk, for example:

    * Mirror a RAM disk to a physical disk. The data is *always* in the
      cache, yet will eventually be on magnetic. This does wonderful
      things if you stick database dictionaries/upper-levels on it.

    * Define a logical slice consisting of a RAM disk followed by
      magnetic. The idea being that you get a scratch space that 
      is esp. fast, and has an economic price, but can accommodate
      periodic heavy demands.

Disclaimer: I have nothing to do with Pyramid. In fact, Sanyo/Icon
computers do not support RAM disk, but we do support huge buffer
caches, with a dedicated processor to keep track of what's where.

Of course, RAM disks made from Virtual Memory are truly Stoopid, but
there are even a couple of manufacturers out there who build real RAM
disks: 5 1/4" form factor enclosures, SCSI interface (or whatever), but
huge quantities of DRAM instead of magnetics. The De Luxe versions even
have a battery...  how do these relate to Buffer Cache?

(btw, I used to be anti-RAM-disk, 'till I tried a system with "/lib"
 on RAM. "/tmp" didn't make that much difference, but you should've
 seen "ld" fly... )

Malc.

tli@phakt.usc.edu (Tony Li) (10/12/90)

In article <14884@hydra.gatech.EDU> gt0178a@prism.gatech.EDU (Jim Burns) writes:

    Wrong - unless you are using a windowing environment, and there are still
    plenty of glass tube unices out there. And even then, few windowing
    environments I've worked in can match the one or two keystroke
    responsiveness of a good TSR. (Granted, what you're talking about *does*
    apply to the filter and os extension types of TSRs.)
    
This could be easily satisfied by implementing a different user
interface to job control.  Or even by programmable function keys.  For
example, you might bind one key to be: "^Z%emacs".  Poof.  

The difficulty is doing this in a consistent and portable way...

-- 
Tony Li - USC Computer Science Department
Internet: tli@usc.edu Uucp: usc!tli
Thus spake the master programmer: "A well written program is its own
heaven; a poorly-written program its own hell."

gt0178a@prism.gatech.EDU (Jim Burns) (10/12/90)

in article <12454@chaph.usc.edu>, tli@phakt.usc.edu (Tony Li) says:

> This could be easily satisfied by implementing a different user
> interface to job control.  Or even by programmable function keys.  For
> example, you might bind one key to be: "^Z%emacs".  Poof.  

Yes!

> The difficulty is doing this in a consistent and portable way...

True.

-- 
BURNS,JIM
Georgia Institute of Technology, Box 30178, Atlanta Georgia, 30332
uucp:	  ...!{decvax,hplabs,ncar,purdue,rutgers}!gatech!prism!gt0178a
Internet: gt0178a@prism.gatech.edu

poser@csli.Stanford.EDU (Bill Poser) (10/12/90)

Supposing that RAM disk is a wonderful thing, I don't see why it
requires any change to UNIX. Couldn't the RAM used for this be
treated as a device and mapped into the filesystem in the same way
as any other block device? I should think that it would just be a matter
of writing an appropriate driver.
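
Schematically (a 4.3BSD-flavoured sketch, not a real driver), the heart
of it would be little more than a bcopy in the strategy routine:

    #include <sys/param.h>
    #include <sys/buf.h>

    #define RD_SIZE (4 * 1024 * 1024)       /* 4 MB of dedicated RAM */
    static char rd_mem[RD_SIZE];

    void
    rdstrategy(struct buf *bp)
    {
        unsigned long off = bp->b_blkno * DEV_BSIZE;

        if (off + bp->b_bcount > RD_SIZE) {
            bp->b_flags |= B_ERROR;         /* past the end of the "disk" */
        } else if (bp->b_flags & B_READ) {
            bcopy(&rd_mem[off], bp->b_un.b_addr, bp->b_bcount);
        } else {
            bcopy(bp->b_un.b_addr, &rd_mem[off], bp->b_bcount);
        }
        biodone(bp);                        /* the "I/O" is done at once */
    }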

dmw9q@uvacs.cs.Virginia.EDU (David M. Warme) (10/13/90)

In message <15785@csli.Stanford.EDU>, Bill Poser asks:

> Supposing that RAM disk is a wonderful thing, I don't see why it
> requires any change to UNIX. Couldn't the RAM used for this be
> treated as a device and mapped into the filesystem in the same way
> as any other block device? I should think that it would just be a matter
> of writing an appropriate driver.

There are two major reasons this is not usually done:

- It is easier to hook up a new gizmo that works through a known
  and tested interface (i.e. SCSI).

- Writing such a device driver would cripple the CPU by forcing
  it to copy data from the RAM partition into the block I/O
  buffer.  Note that disk controllers have DMA which allows the
  data transfer to be done in parallel with other useful CPU
  work, especially if you have a large cache.

If you happen to have a few spare DMA channels on the CPU board
itself, then this might not be a bad idea.  However, vendors
such as Sun, HP, etc. are not known for putting spare hardware
into their systems that is not already used for something.


Dave Warme,
all disclaimers made, no rights reserved, etc.

mohta@necom830.cc.titech.ac.jp (Masataka Ohta) (10/15/90)

In article <1990Oct11.185949.29164@iconsys.uucp>
	malc@iconsys.uucp (Malcolm Weir) writes:

>OK, Reach For Your Revolver... Make My Day! But you dudes who say "RAM disks?
>Unnecessary, 'cos we've got a Buffer Cache!!" are WRONG, INCORRECT, MISTAKEN,
>and basically WAY OUT OF LINE.

Not so much.

>Really? Just how do you persuade *nix to cache "/lib/*", in preference to
>Joe Unimportant-User's huge statistical jobs that have been munging vast
>amounts of data for the past 12 days?

As /lib is almost read-only, I recommend tuning BSD file system
parameters such as maxcontig, with appropriate disk controllers.

Then you can read the entire /lib/libc.a with a single seek.

>How do you persuade it that the
>disk accesses caused by the backup of "/irrelevant" are less important than
>the accesses caused by the CEO's secretary's WP temp files?

The CEO's secretary should have his own workstation, of course.

>(btw, I used to be anti-RAM-disk, 'till I tried a system with "/lib"
> on RAM. "/tmp" didn't make that much difference, but you should've
> seen "ld" fly... )

If you are using your own workstation with large memory and dynamic
buffer caching, you can observe the same thing.

						Masataka Ohta

richard@aiai.ed.ac.uk (Richard Tobin) (10/16/90)

In article <15785@csli.Stanford.EDU> poser@csli.stanford.edu (Bill Poser) writes:
>Supposing that RAM disk is a wonderful thing, I don't see why it
>requires any change to UNIX. Couldn't the RAM used for this be
>treated as a device and mapped into the filesystem in the same way
>as any other block device?

It could.  Indeed, I believe there are people who sell boxes full of
slowish RAM chips as fast disks (this could also be useful for paging
on a machine which can't have more main memory added).  You might well
want to avoid the waste of having blocks from the RAM disk duplicated
in the buffer cache, however.

-- Richard


-- 
Richard Tobin,                       JANET: R.Tobin@uk.ac.ed             
AI Applications Institute,           ARPA:  R.Tobin%uk.ac.ed@nsfnet-relay.ac.uk
Edinburgh University.                UUCP:  ...!ukc!ed.ac.uk!R.Tobin

lm@slovax.Sun.COM (Larry McVoy) (10/17/90)

In article <6372@titcce.cc.titech.ac.jp> mohta@necom830.cc.titech.ac.jp (Masataka Ohta) writes:
>As /lib is almost read-only, I recommend tuning BSD file system
>parameters such as maxcontig, with appropriate disk controllers.
>
>Then you can read the entire /lib/libc.a with a single seek.

Don't try this at home, kids, unless you know what you are doing.  This
will work pretty well for the newer SCSI drives that have track buffers,
but it will work horribly for old drives such as SMDs (Eagles) and older
SCSI.  The controllers mentioned are not common on workstations.
---
Larry McVoy, Sun Microsystems     (415) 336-7627       ...!sun!lm or lm@sun.com

buck@siswat.UUCP (A. Lester Buck) (10/17/90)

In article <BZS.90Oct9144959@world.std.com>, bzs@world.std.com (Barry Shein) writes:
> Here are several features that keep coming up. Not all are without
> merit, but some are already there (like RAM disk) and others are
> questionably portable or, well, without merit.

I agree with most of your points, but challenge your comments on
asynchronous I/O.

> 
> 3. Asynchronous I/O for "performance" reasons
> 
> All Unix block I/O is asynchronous (well, driver-dependent of course,
> but disks and tapes and so forth.)

This is, of course, incorrect.  Only writes are effectively asynchronous.
Reading can be asynchronous with read-ahead and caching, but the effect
is small for high-performance input, which is rarely consuming I/O
in chunks of a single block, and that is all that read-ahead grabs.
Any process interested in high-performance I/O is doing raw reads and
writes to avoid extra buffer copies.  For the numbers, consult
Rochkind, "Advanced Unix Programming".

> The recent addition to Unix has been *synchronous* I/O (FSYNC bit.)
> The one major exception is directory updates, but that's never the
> issue when this comes up.

No objections here, but you are referring to synchrony at a different
point in the I/O system: reliability from the buffer cache to permanent
storage, rather than at the user buffer level.

> Another related feature is being able to get an interrupt when an I/O
> has really gone to disk (et al). This has merit and for many
> applications would be vastly superior to the FSYNC bit. I'm not sure
> why this hasn't been done universally as the SIGIO signal could just
> be used for this and it probably only entails marking a bit in the
> buffer and having the kernel issue a psignal() or whatever when the
> write completes and the buffer is being freed. Perhaps I'm missing
> something.

This is a really interesting idea.  A vendor could add this to
POSIX.4 very easily by defining an extra bit in the aio_flag argument
for an asynch request, and the driver would delay the completion
notification until the appropriate level of stable storage had been
reached.  I like it!
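
In terms of the draft interfaces it might look something like the
following (a sketch; the structure and field names are still moving
targets, and AIO_TO_STABLE is the hypothetical new bit, not anything
in the draft today):

    #include <sys/types.h>
    #include <signal.h>
    #include <aio.h>

    int
    write_to_stable(int fd, volatile void *buf, size_t n, off_t off)
    {
        static struct aiocb cb;

        cb.aio_fildes = fd;
        cb.aio_buf    = buf;
        cb.aio_nbytes = n;
        cb.aio_offset = off;
        cb.aio_sigevent.sigev_notify = SIGEV_SIGNAL;
        cb.aio_sigevent.sigev_signo  = SIGIO;   /* delivered on completion */
        /*
         * cb.aio_flag |= AIO_TO_STABLE;   the proposed bit: hold the
         * completion signal until the data has reached stable storage.
         */

        return aio_write(&cb);  /* queue the write and return immediately */
    }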

I don't see much of an argument here for your initial premise that
asynchronous I/O has no performance benefits.  As a counterexample, I
personally know of at least two database and/or filesystem vendors who
flatly refused to port their products to a major Unix kernel without the OS
vendor adding some form of asynchronous I/O.  When it comes to implementing
real systems where I/O is everything, database vendors know exactly what
they need for performance.

> 4. Command names which resemble English words.
> 
> Bell Labs commissioned independent studies early in Unix's history to
> see if this was important or not. I've tried to locate these, others
> have claimed to have been involved and seen the reports. It's possible
> they were "internal use only", perhaps the work should be repeated.
> 
> The basic conclusion was that you can make commands English, weird,
> mnemonic, or even counter-intuitive (e.g. "delete" means edit a file,
> "edit" means delete, etc.) and it simply doesn't seem to make all that
> much difference to learning curves.
> 
> This may seem horribly counter-intuitive and against all conventional
> wisdom (some people get quite apoplectic when this is asserted, sort
> of like telling them that there is no Santa Claus.)

I hardly get apoplectic, and there are plenty of counterintuitive results
carefully documented in the research literature.  But if this were at all
correct, it would easily generate dozens of papers, theses, and grants.
Until you or someone else can present tangible references, I'm putting this
in my amusing-legend category.  Instead, it sounds like a limp coverup for
choosing two-letter command names simply because they were faster to type on
ASR-33s.

>         -Barry Shein
> Software Tool & Die    | {xylogics,uunet}!world!bzs | bzs@world.std.com


-- 
A. Lester Buck    buck@siswat.lonestar.org  ...!uhnix1!lobster!siswat!buck

costis@csi.forth.gr (Costis Aivalis) (10/19/90)

cpcahil@virtech.uucp (Conor P. Cahill) writes:

>This is an example of why performance tuning is magic.  There are no
>simple answers for all cases.  What seems at first examination to be an
>obvious performance gain may turn out to be a loss.

The best answer so far!

It definitely _IS_ application specific!

Let's try to avoid giving a general answer to the RAMdisk question
and check our system and application performance with the RAMdisk 
and without it. 

Chances are that we won't get the performance improvement we expect,
but then again, who knows without trying...?

No revolvers or machine-guns please! :-)

Costis Aivalis

-------------------------------------------------------------------------------
costis@csi.forth.gr

F.O.R.T.H.