[comp.unix.admin] RAM disk.

jms@tardis.Tymnet.COM (Joe Smith) (09/13/90)

In article <1990Sep12.084002.5575@hq.demos.su> dvv@hq.demos.su (Dmitry V. Volodin) writes:
>Folks, does anyone want to discuss the pros and cons of placing the
>swap/paging area onto the ram disk? :)

I understand the joke, but did you know that Sun has done the opposite?
They have put the ram disk on the swap/paging area.  Actually, it's more
like any page of physical memory can be used for either a swapped-in page
or a tmpfs file system page, first come first served.

Small files stay completely in ram.  Large files spill over into swap space,
but it's still faster than a regular file partition due to not waiting for
synchronized writes to the directory blocks, the bitmap/freelist, superblock,
etc.  It's good for /tmp (but not /usr/tmp unless you have a giant swap space.)

-- 
Joe Smith (408)922-6220 | SMTP: jms@tardis.tymnet.com or jms@gemini.tymnet.com
BT Tymnet Tech Services | UUCP: ...!{ames,pyramid}!oliveb!tymix!tardis!jms
PO Box 49019, MS-C41    | BIX: smithjoe | 12 PDP-10s still running! "POPJ P,"
San Jose, CA 95161-9019 | humorous disclaimer: "My Amiga speaks for me."

del@thrush.mlb.semi.harris.com (Don Lewis) (09/13/90)

In article <1223@tardis.Tymnet.COM> jms@tardis.Tymnet.COM (Joe Smith) writes:
>I understand the joke, but did you know that Sun has done the opposite?
>They have put the ram disk on the swap/paging area.  Actually, it's more
>like any page of physical memory can be used for either a swapped-in page
>or a tmpfs file system page, first come first served.
>
>Small files stay completely in ram.  Large files spill over into swap space,
>but it's still faster than a regular file partition due to not waiting for
>synchronized writes to the directory blocks, the bitmap/freelist, superblock,
>etc.  It's good for /tmp (but not /usr/tmp unless you have a giant swap space.)

Does anyone have a feel for the relative performance of Sun's tmpfs versus
a 4.2 filesystem?  I have an application that uses a lot of temporary
file space.  After it is finished thrashing about with the scratch files
it builds a large data structure in memory.  The amount of swap space
and the amount of scratch file space consumed at one time are somewhat
complementary.  Without using tmpfs, I need both big swap and big /tmp.
I am interested in combining these into one large swap and using tmpfs.
The issue is what is the performance when simultaneously reading and
writing several large files (and possibly significant paging as well)
using tmpfs versus the same operations using the 4.2 filesystem.  The
hosts in question currently have /tmp and swap on separate drives and
have a fair amount of RAM.
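
The sizing question above can be sketched with a few lines of Python. This is only an illustration under stated assumptions: the per-phase figures are hypothetical, not measurements from any of the hosts discussed, and the model ignores the physical RAM that tmpfs and the pager also share.

```python
# Hypothetical usage profile (MB) at successive phases of one job.
# These numbers are made up for illustration only.
swap_use = [20, 40, 60, 120, 160]   # process memory that may be paged out
tmp_use = [80, 150, 120, 60, 10]    # scratch files held in /tmp

# Separate partitions: each one must be sized for its own peak.
separate = max(swap_use) + max(tmp_use)

# One shared swap area backing tmpfs: sized for the peak combined demand.
combined = max(s + t for s, t in zip(swap_use, tmp_use))

print("separate swap + /tmp:", separate, "MB")
print("shared swap with tmpfs:", combined, "MB")
```

When the two demands peak at different times, the shared figure is smaller than the sum of the separate peaks; when they peak together, the two figures are equal and nothing is saved.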
--
Don "Truck" Lewis                      Harris Semiconductor
Internet:  del@mlb.semi.harris.com     PO Box 883   MS 62A-028
Phone:     (407) 729-5205              Melbourne, FL  32901

mohta@necom830.cc.titech.ac.jp (Masataka Ohta) (09/13/90)

In article <1990Sep13.002300.15266@mlb.semi.harris.com>
	del@thrush.mlb.semi.harris.com (Don Lewis) writes:

>Does anyone have a feel for the relative performance of Sun's tmpfs versus
>a 4.2 filesystem?

A few months ago, in comp.arch, someone at Sun posted measurements
showing that tmpfs does not improve the performance of a full kernel
recompilation.

>I have an application that uses a lot of temporary
>file space.  After it is finished thrashing about with the scratch files
>it builds a large data structure in memory.  The amount of swap space
>and the amount of scratch file space consumed at one time are somewhat
>complementary.

There is a positive correlation between the amount of swap space and the
amount of temporary file space a process needs. Moreover, they are
often needed at the same time.

That is, if you output something large to a file, you almost certainly have
a copy of it in memory, and vice versa.

>Without using tmpfs, I need both big swap and big /tmp.

So, even with tmpfs, you need swap twice as big.

>The issue is what is the performance when simultaneously reading and
>writing several large files (and possibly significant paging as well)
>using tmpfs versus the same operations using the 4.2 filesystem.

If you do large amount of IO to /tmp, with simple-minded memory disk,
it is about TWICE AS SLOW AS ordinary disk file system.

The reason is that memory disk can't do async write. Data is copied
from user space to buffer cache and then to memory disk. With ordinary
disk, data is only copied to buffer cache.

If you use elaborated and complicated memory disk, it can be only as
slow as ordinary disk, but not faster.

						Masataka Ohta

del@thrush.mlb.semi.harris.com (Don Lewis) (09/14/90)

In article <6167@titcce.cc.titech.ac.jp> mohta@necom830.cc.titech.ac.jp (Masataka Ohta) writes:
>In article <1990Sep13.002300.15266@mlb.semi.harris.com>
>	del@thrush.mlb.semi.harris.com (Don Lewis) writes:
>
>>Does anyone have a feel for the relative performance of Sun's tmpfs versus
>>a 4.2 filesystem?
>
>A few months ago, in comp.arch, someone at Sun posted measurements
>showing that tmpfs does not improve the performance of a full kernel
>recompilation.
>

In my case, the files are larger (10-80 Meg or so).

>>I have an application that uses a lot of temporary
>>file space.  After it is finished thrashing about with the scratch files
>>it builds a large data structure in memory.  The amount of swap space
>>and the amount of scratch file space consumed at one time are somewhat
>>complementary.
>
>There is a positive correlation between the amount of swap space and the
>amount of temporary file space a process needs. Moreover, they are
>often needed at the same time.
>
>That is, if you output something large to a file, you almost certainly have
>a copy of it in memory, and vice versa.

In my case, there is somewhat of an inverse relationship.  About half
the memory bloat occurs near the end as the program is reading and
deleting files.  It does write a final (large) output file, but it
is not something I want to trust to tmpfs.

>
>>Without using tmpfs, I need both big swap and big /tmp.
>
>So, even with tmpfs, you need swap twice as big.
>

Probably only 1.5x in my case, but what's a hundred or so megabytes
between friends :-).

>>The issue is what is the performance when simultaneously reading and
>>writing several large files (and possibly significant paging as well)
>>using tmpfs versus the same operations using the 4.2 filesystem.
>
>If you do large amount of IO to /tmp, with simple-minded memory disk,
>it is about TWICE AS SLOW AS ordinary disk file system.
>
>The reason is that memory disk can't do async write. Data is copied
>from user space to buffer cache and then to memory disk. With ordinary
>disk, data is only copied to buffer cache.
>
>If you use elaborated and complicated memory disk, it can be only as
>slow as ordinary disk, but not faster.

This is the sort of info I'm looking for.
--
Don "Truck" Lewis                      Harris Semiconductor
Internet:  del@mlb.semi.harris.com     PO Box 883   MS 62A-028
Phone:     (407) 729-5205              Melbourne, FL  32901

gary@sci34hub.UUCP (Gary Heston) (09/14/90)

In article <6167@titcce.cc.titech.ac.jp> mohta@necom830.cc.titech.ac.jp (Masataka Ohta) writes:
>In article <1990Sep13.002300.15266@mlb.semi.harris.com>
> [ discussion of ramdisc performance deleted ]

>The reason is that memory disk can't do async write. Data is copied
>from user space to buffer cache and then to memory disk. With ordinary
>disk, data is only copied to buffer cache.

You're presuming that all hardware is incapable of unattended
memory-to-memory DMA.  I don't think this is the case. Further, a
well-integrated RAM disc should transfer directly from user space to
RAM disc in cases where the RAM disc is part of main memory. This is
also not necessarily the case. A RAM disc can be implemented separately
from main memory, or as a peripheral device. (Installable software
drivers, however, would only use main memory.)

>If you use elaborated and complicated memory disk, it can be only as
>slow as ordinary disk, but not faster.

I think you're forgetting drive latency, here. A RAM disc will respond
immediately, where a hard drive may take several ms just to find the
data. Somewhere around here, I have a data sheet for a RAM disc with
a SCSI interface. I think it'll show a lot of speed improvement over an
"ordinary disk". Granted, not everyone has one of these, but I'd
sure like one for my swap device.

>						Masataka Ohta

-- 
    Gary Heston     { uunet!sci34hub!gary  }    System Mismanager
   SCI Technology, Inc.  OEM Products Department  (i.e., computers)
"The esteemed gentlebeing says I called him a liar. It's true, and I
regret that." Retief, in "Retief's Ransom" by Keith Laumer.

del@thrush.mlb.semi.harris.com (Don Lewis) (09/18/90)

In article <758@sci34hub.UUCP> gary@sci34hub.sci.com (Gary Heston) writes:
>I think you're forgetting drive latency, here. A RAM disc will respond
>immediately, where a hard drive may take several ms just to find the
>data. Somewhere around here, I have a data sheet for a RAM disc with
>a SCSI interface. I think it'll show a lot of speed improvement over an
>"ordinary disk". Granted, not everyone has one of these, but I'd
>sure like one for my swap device.

Well, so would I, but if I could afford enough memory to replace my
swap device with a RAM disk, I'd rather just use that memory as main
memory and not bother with paging/swapping anymore.
--
Don "Truck" Lewis                      Harris Semiconductor
Internet:  del@mlb.semi.harris.com     PO Box 883   MS 62A-028
Phone:     (407) 729-5205              Melbourne, FL  32901

dick@cca.ucsf.edu (Dick Karpinski) (09/19/90)

In article <1990Sep18.044625.12606@mlb.semi.harris.com> del@thrush.mlb.semi.harris.com (Don Lewis) writes:
>In article <758@sci34hub.UUCP> gary@sci34hub.sci.com (Gary Heston) writes:
>>I think you're forgetting drive latency, here. A RAM disc will respond
>>immediately, ... RAM disc with a SCSI ... for my swap device.
>
>Well, so would I, but if I could afford enough memory to replace my
>swap device with a RAM disk, I'd rather just use that memory as main
>memory and not bother with paging/swapping anymore.

I want the RAM disk (who makes it?) 'cause the cheap RAM is $50/MB 
while IBM RAM for the 6000 is $600/MB.  I don't know that the RAM disk
is priced right but there is sure some room for that possibility.

Dick

lm@slovax.Sun.COM (Larry McVoy) (09/30/90)

In article <6167@titcce.cc.titech.ac.jp> mohta@necom830.cc.titech.ac.jp (Masataka Ohta) writes:
>In article <1990Sep13.002300.15266@mlb.semi.harris.com>
>	del@thrush.mlb.semi.harris.com (Don Lewis) writes:
>
>>Does anyone have a feel for the relative performance of Sun's tmpfs versus
>>a 4.2 filesystem?
>
>A few months ago, in comp.arch, someone at Sun posted measurements
>showing that tmpfs does not improve the performance of a full kernel
>recompilation.

This is true in SunOS 4.1, not true in SunOS 4.1.1.  There was a bug,
a rather silly bug, that made write performance to tmpfs about
the same as ufs.  (For those who care, in the write case, if you are
writing a partial page, you have to read in the page before writing over
the part that the user sent down.  If you are writing a whole page
there is no need to fault in the page from disk; you're overwriting all
of it.  Tmpfs didn't think to do this optimization, so performance
was limited by how fast one can fault in pages from the swap device.)
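
The whole-page rule described above can be sketched as a small Python function. This is a minimal illustration, not SunOS code: the 4096-byte `PAGE_SIZE` and the name `pages_needing_read` are assumptions made for the example.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


def pages_needing_read(offset, length, page_size=PAGE_SIZE):
    """Return indices of pages that a write covers only partially.

    Only these pages must be faulted in before the write; pages that
    are overwritten completely can be written without reading them.
    """
    if length <= 0:
        return []
    end = offset + length                 # first byte past the write
    first = offset // page_size
    last = (end - 1) // page_size
    partial = []
    for page in range(first, last + 1):
        page_start = page * page_size
        page_end = page_start + page_size
        # The page is partial if the write starts after its first byte
        # or stops before its last byte.
        if offset > page_start or end < page_end:
            partial.append(page)
    return partial


print(pages_needing_read(0, 8192))    # two whole pages
print(pages_needing_read(100, 8192))  # unaligned write across three pages
```

A page-aligned write of whole pages yields an empty list, so no page needs to be read first. The sketch ignores the further case of pages that lie beyond the current end of file, which have nothing to read either.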

My measurements of kernel builds have shown a 20% improvement in wall
clock time on an otherwise idle system.  The test was a build of a
GENERIC kernel (i.e., everything) and the only difference was that
/tmp was tmpfs instead of UFS.

Tests like "time dd if=/dev/zero of=/tmp/XXX count=1000" showed data
rate changing from 300KB / sec to 5MB / sec on a SS1 (if you do the math
a SS1 can't bcopy much faster than that).
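
For scale, the arithmetic behind that dd test can be worked through in Python. The sketch assumes dd's default block size of 512 bytes (no bs= is given on the command line) and reads KB and MB as powers of two; the rates are the ones quoted above.

```python
BLOCK_SIZE = 512    # dd's default block size when bs= is omitted
COUNT = 1000        # from the command line: count=1000

total_bytes = BLOCK_SIZE * COUNT      # size of the test file

slow_rate = 300 * 1024                # 300 KB/sec, the rate before the fix
fast_rate = 5 * 1024 * 1024           # 5 MB/sec, the rate after the fix

slow_seconds = total_bytes / slow_rate
fast_seconds = total_bytes / fast_rate

print(f"file size: {total_bytes} bytes")
print(f"at 300 KB/sec: {slow_seconds:.2f} s")
print(f"at 5 MB/sec:   {fast_seconds:.2f} s")
print(f"speedup:       {fast_rate / slow_rate:.1f}x")
```

The file is only about half a megabyte, so the elapsed times are short (roughly 1.7 seconds against 0.1 seconds), and a longer run would give a steadier rate.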

>If you do large amount of IO to /tmp, with simple-minded memory disk,
>it is about TWICE AS SLOW AS ordinary disk file system.
>
>The reason is that memory disk can't do async write. Data is copied
>from user space to buffer cache and then to memory disk. With ordinary
>disk, data is only copied to buffer cache.

2X as slow is correct.  The reasoning is incorrect.  Tmpfs is better
than a ram disk because it avoids an extra copy two times.  It copies
the data from user to kernel but not from kernel to ramdisk.  If you
think about it, that's two fewer copies.  Think hard.
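
One way to count those copies is sketched below in Python. The two data paths are an assumed reading of the posts in this thread (one write followed by one read of the same data, with the read missing the buffer cache in the ram disk case); they are not taken from any kernel source.

```python
# Assumed data paths for writing some data and then reading it back.
# Each entry is one memory-to-memory copy.
RAM_DISK_PATH = [
    "user -> buffer cache",      # write()
    "buffer cache -> ram disk",  # the driver's "I/O" is itself a copy
    "ram disk -> buffer cache",  # a later read refills the cache
    "buffer cache -> user",      # read()
]
TMPFS_PATH = [
    "user -> kernel page",       # write() lands in the file's own page
    "kernel page -> user",       # read() copies straight back out
]


def bytes_moved(path, nbytes):
    """Total bytes copied when nbytes travel along every step of path."""
    return len(path) * nbytes


one_mb = 1 << 20
copies_saved = len(RAM_DISK_PATH) - len(TMPFS_PATH)
ratio = bytes_moved(RAM_DISK_PATH, one_mb) // bytes_moved(TMPFS_PATH, one_mb)
print("copies saved per round trip:", copies_saved)
print("ram disk moves", ratio, "times as many bytes as tmpfs")
```

Under these assumptions the ram disk path moves twice as many bytes, which is consistent with the factor of two quoted above for a simple memory disk.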

>If you use elaborated and complicated memory disk, it can be only as
>slow as ordinary disk, but not faster.

Bullsh*t.  Tmpfs is orders of magnitude faster than a disk.  

>						Masataka Ohta
---
Larry McVoy, Sun Microsystems     (415) 336-7627       ...!sun!lm or lm@sun.com

jfh@rpp386.cactus.org (John F. Haugh II) (10/01/90)

In article <143190@sun.Eng.Sun.COM> lm@slovax.Sun.COM (Larry McVoy) writes:
>In article <6167@titcce.cc.titech.ac.jp> mohta@necom830.cc.titech.ac.jp (Masataka Ohta) writes:
>>If you use elaborated and complicated memory disk, it can be only as
>>slow as ordinary disk, but not faster.
>
>Bullsh*t.  Tmpfs is orders of magnitude faster than a disk.  

I suspect this was a typo - if tmpfs can only be as slow, and not
any faster, then it must be the same speed.

As for 5MB/sec transfer limitations - does the SS1 have a DMA
setup that can handle memory to memory transfers?  I would hope
that the memory subsystem can handle more than 5MB/sec ...
-- 
John F. Haugh II                             UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 832-8832                           Domain: jfh@rpp386.cactus.org
"SCCS, the source motel!  Programs check in and never check out!"
		-- Ken Thompson

mohta@necom830.cc.titech.ac.jp (Masataka Ohta) (10/02/90)

In article <143190@sun.Eng.Sun.COM> lm@slovax.Sun.COM (Larry McVoy) writes:

>>A few months ago, in comp.arch, someone at Sun posted measurements
>>showing that tmpfs does not improve the performance of a full kernel
>>recompilation.

>This is true in SunOS 4.1, not true in SunOS 4.1.1.  There was a bug,

If you know what responsibility is, you should post that to comp.arch.

>My measurements of kernel builds have shown a 20% improvement in wall
>clock time on an otherwise idle system.

That is what I already observed with the delay option (the delay option
is a very simple and better replacement for a RAM disk or tmpfs; see my
paper in the proceedings of the 1990 summer USENIX conference).

>Tests like "time dd if=/dev/zero of=/tmp/XXX count=1000" showed data
>rate changing from 300KB / sec to 5MB / sec on a SS1 (if you do the math
>a SS1 can't bcopy much faster than that).

And you can observe the same 5MB/sec rate even on ordinary files, if
you are using a SANE UNIX.

>>If you do large amount of IO to /tmp, with simple-minded memory disk,
>>it is about TWICE AS SLOW AS ordinary disk file system.

>2X as slow is correct.  The reasoning is incorrect.  Tmpfs is better
>than a ram disk because it avoids an extra copy two times.

READ WHAT I POST.

>>If you do large amount of IO to /tmp, with simple-minded memory disk,
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I am not saying anything about tmpfs here.

>>If you use elaborated and complicated memory disk, it can be only as
>>slow as ordinary disk, but not faster.

>Bullsh*t.  Tmpfs is orders of magnitude faster than a disk.  

Of course, you KNOW fgrep is faster than grep.

I miss comp.unix.wizards.

See the assumption:

>>If you do large amount of IO to /tmp,
            ^^^^^^^^^^^^^^^^^^

If you know what a buffer cache is, you can understand why tmpfs is only
as fast as an ordinary disk file.

						Masataka Ohta