[comp.unix.ultrix] A question about swap

dbrillha@dave.enet.dec.com (Dave Brillhart) (06/15/91)

On one of our 5500s, we have configured over 500M of swap space. Someone
recommended using the 'a' and 'b' partitions over five drives, with the
a/b partitions combined into a larger 'a' partition.

Does this buy us much over a single 500M swap partition? Is the OS smart
enough to distribute the load over several spindles? We are currently
running Ultrix 4.1 and using Prestoserve (which BTW we have found to be a
significant performance enhancer).

I just upgraded to 4.2 and have had reports I wasn't receiving mail
outside my domain. It should be fixed by the time you read this :-)

Thanks in advance for any comments.

-- Dave Brillhart
     Harris Semiconductor

grr@cbmvax.commodore.com (George Robbins) (06/16/91)

In article <1991Jun14.184609.21178@mlb.semi.harris.com> dcb@dave.mis.semi.harris.com writes:
> On one of our 5500s, we have configured over 500M of swap space. Someone
> recommended using the 'a' and 'b' partitions over five drives, with the
> a/b partitions combined into a larger 'a' partition.
> 
> Does this buy us much over a single 500M swap partition? Is the OS smart
> enough to  distribute the load over several spindles?

Supposedly the Berkeley code does distribute swap usage over the multiple
drives in a way that improves performance.

One thing to note is that swap space is kind of a dinosaur from the days of
7*'s and 10 million students.  If you have enough memory, swap usage is
minimal and even paging is low (unless you're running X).

Most of the requirement for large swap spaces stems from the problem that
Ultrix allocates swap space before it actually needs it.  The other cause is
megalithic software that depends on virtual memory to fit at all.  Some
of this is well behaved, some will cause a paging frenzy.

I'd suggest you use pstat/vmstat/monitor to evaluate how many paging/
swapping events occur per second.  If the number is less than 1, it's
a don't care.  If the number is a little larger, you need to contemplate
whether you will get better action with a single "swap drive" with
only self-contention or by spreading it over several shared drives.  If
the number is large, see if you can afford mega-memory.

Large here should be considered relative to the number of random
read/write requests the controller/drive combination can perform
per second, less whatever useful disk I/O you need to get done.
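The rule of thumb above can be written down as a rough sketch. The function
name, the parameter names, and the 50% threshold for "large" are assumptions
made for illustration; the post only says that "large" is relative to the
random I/O rate the controller/drive can sustain, less the useful disk I/O.

```python
def paging_advice(paging_per_sec, disk_iops, useful_iops=0.0, large_fraction=0.5):
    """Classify a paging/swapping rate using the rule of thumb in the post.

    paging_per_sec: paging/swapping events per second (e.g. from vmstat).
    disk_iops:      random read/write requests per second the drive can do.
    useful_iops:    requests per second needed for real (non-paging) work.
    large_fraction: assumed cut-off; not a figure from the post.
    """
    if paging_per_sec < 1:
        return "don't care"
    # Capacity left for paging once the useful disk I/O is accounted for.
    headroom = disk_iops - useful_iops
    if headroom <= 0 or paging_per_sec >= large_fraction * headroom:
        return "large: see if you can afford more memory"
    return "moderate: weigh a dedicated swap drive against spreading swap"
```

For example, with a drive assumed to manage 30 random requests per second and
10 of those needed for real work, 3 paging events per second comes out as
"moderate" and 25 per second as "large".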

You might also want to look at that Ultrix tuning paper that was posted
here not long ago, it appears that Ultrix may be surprisingly poorly
tuned for today's RISC systems...

-- 
George Robbins - now working for,     uucp:   {uunet|pyramid|rutgers}!cbmvax!grr
but no way officially representing:   domain: grr@cbmvax.commodore.com
Commodore, Engineering Department     phone:  215-431-9349 (only by moonlite)

alan@shodha.enet.dec.com ( Alan's Home for Wayward Notes File.) (06/16/91)

In article <22486@cbmvax.commodore.com>, grr@cbmvax.commodore.com (George Robbins) writes:
> In article <1991Jun14.184609.21178@mlb.semi.harris.com> dcb@dave.mis.semi.harris.com writes:
> > [ A previous customer asked about how to best configure more
> >   page/swap space. ]
> 
> Supposedly the Berkeley code does distribute swap usage over the multiple
> drives in a way that improves performance.

	Yes it does.  You can demonstrate this by running iostat(1)
	or Monitor at the time you're doing a:

		# cp /dev/drum /dev/null
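	Why a sequential read of /dev/drum touches every swap drive can be
	pictured with a simplified round-robin model.  This is only a sketch:
	the function name and the chunk size are assumptions, and it is not
	the actual kernel code.

```python
def drum_to_device(block, n_devices, chunk_blocks):
    """Map a logical swap block to (device index, block offset on that device).

    Simplified model: the logical swap area is cut into chunks of
    chunk_blocks blocks, and the chunks are dealt out to the swap
    devices in turn.
    """
    chunk = block // chunk_blocks        # which interleave chunk the block is in
    device = chunk % n_devices           # chunks go round-robin across devices
    local_chunk = chunk // n_devices     # chunks already placed on that device
    return device, local_chunk * chunk_blocks + block % chunk_blocks
```

	With three devices and an assumed chunk of 1024 blocks, logical blocks
	0, 1024 and 2048 land on devices 0, 1 and 2, and block 3072 wraps
	back to device 0.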
	
> 
> Most of the requirement for large swap spaces stems from the problem that
> Ultrix allocates swap space before it actually needs it.

	Actually ULTRIX V4.0 and later "reserves" the space at process 
	startup.  It doesn't get allocated until used.  The reservation 
	is necessary to avoid deadlocks and the harder problem of how 
	to cope with them.  Some systems cope by letting you dynamically 
	add space as needed.  Others cope by killing the offending process.
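	The difference between reserving and allocating can be shown with a
	toy accounting model.  The class and method names are hypothetical
	and this is not how the kernel is written; it only illustrates why a
	reservation made at startup lets a shortage show up as a refused
	start rather than as a stuck process later on.

```python
class SwapPool:
    """Toy model: space is promised at startup and consumed on first use."""

    def __init__(self, total_kb):
        self.total_kb = total_kb
        self.reserved_kb = 0    # promised to running processes
        self.allocated_kb = 0   # actually in use on disk

    def reserve(self, kb):
        """At process startup: refuse up front if the space cannot be promised."""
        if self.reserved_kb + kb > self.total_kb:
            return False
        self.reserved_kb += kb
        return True

    def allocate(self, kb):
        """On first use: stays within what was reserved, so the pool cannot run dry."""
        if self.allocated_kb + kb > self.reserved_kb:
            raise ValueError("allocation exceeds reservation")
        self.allocated_kb += kb
```

	In this model a 100 KB pool accepts a 60 KB reservation and refuses a
	further 50 KB one, even though nothing has been written to disk yet.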
> 
> [ George discusses the performance aspects of the problem. ]

	If you don't page/swap (much) then the performance doesn't
	matter much.  If you do page/swap much, then you need more
	memory (up to a point).  If you do have to do the paging/swapping
	I/O then use the /dev/drum trick mentioned above.  It's hard
	to determine page-out performance this way, but you can see
	which is better for page-in.

	Also consider playing around with the configuration parameter
	"swapfrag".  It determines the I/O size for some things.  If
	you're doing a lot of paging, then increasing swapfrag may help.
> 
> -- 
> George Robbins - now working for,     uucp:   {uunet|pyramid|rutgers}!cbmvax!grr


-- 
Alan Rollow				alan@nabeth.cxn.dec.com

yzarn@lhdsy1.chevron.com (Philip Yzarn de Louraille) (06/16/91)

In article <1991Jun14.184609.21178@mlb.semi.harris.com> dcb@dave.mis.semi.harris.com writes:
>On one of our 5500s, we have configured over 500M of swap space. Someone
>recommended using the 'a' and 'b' partitions over five drives, with the
>a/b partitions combined into a larger 'a' partition.
... deleted text...
I was told once to *never* mess with the a partition of a disk.
Forget about the 16Meg you will be losing.
-- 
  Philip Yzarn de Louraille                 Internet: yzarn@chevron.com
  Research Support Division                 Unix & Open Systems
  Chevron Information & Technology Co.      Tel: (213) 694-9232
  P.O. Box 446, La Habra, CA 90633-0446     Fax: (213) 694-7709

diamond@jit533.swstokyo.dec.com (Norman Diamond) (06/17/91)

In article <1991Jun14.184609.21178@mlb.semi.harris.com> dcb@dave.mis.semi.harris.com writes:
>On one of our 5500s, we have configured over 500M of swap space. Someone
>recommended using the 'a' and 'b' partitions over five drives, with the
>a/b partitions combined into a larger 'a' partition.

I believe that if you swap onto your 'a' partition, then you overwrite the
partition information and corrupt the entire disk.  It might be possible to
make a really tiny 'a' partition and larger 'b' partition while leaving the
rest of the disk alone; I haven't tried it.

I believe that if you swap onto a disk's 'c' partition (an entire disk),
then you are usually safe, but if some random memory contents that happen to
look like a partition table get swapped out onto the lucky block, and then
you reboot, you have a corrupted disk.

Warning:  this post is a combination of vague recollections and logical
thinking, both of which are risky and should not be depended upon when using
a derivative of the Unix operating system.  Do your own reading and/or obtain
advice from an expert on the topic before relying on it.

Incidentally to dcb, your "from" address of dcb@dave.enet.dec.com seems
somewhat suspect.  You'll have to hack your news software a bit more.
--
Norman Diamond       diamond@tkov50.enet.dec.com
If this were the company's opinion, I wouldn't be allowed to post it.
Permission is granted to feel this signature, but not to look at it.

kepowers@mbunix.mitre.org (Powers) (06/18/91)

>You might also want to look at that Ultrix tuning paper that was posted
>here not long ago, it appears that Ultrix may be surprisingly poorly
>tuned for today's RISC systems...

Does anyone know where this can be found?  TIA.

-- 
Kelly-Erin Powers		The MITRE Corporation
Unix Systems Group		Burlington Road
(617) 271-2143			Bedford, MA 01730
kepowers@mbunix.mitre.org	your_neighborhood!linus!mbunix!kepowers

ellis@rata.vuw.ac.nz (Brian Ellis) (06/18/91)

In article <967@lhdsy1.chevron.com>, yzarn@lhdsy1.chevron.com (Philip Yzarn de Louraille) writes:
|> In article <1991Jun14.184609.21178@mlb.semi.harris.com> dcb@dave.mis.semi.harris.com writes:
|> >On one of our 5500s, we have configured over 500M of swap space. Someone
|> >recommended using the 'a' and 'b' partitions over five drives, with the
|> >a/b partitions combined into a larger 'a' partition.
|> ... deleted text...
|> I was told once to *never* mess with the a partition of a disk.
|> Forget about the 16Meg you will be losing.

Referring to the guide to Configuration file maintenance...

"Avoid selecting partition a of any disk for use as the swap partition. If
partition table information was defined for a disk and swapping occurs on the
a partition, the information is destroyed and data is lost."

Given that either the "a" or the "c" partition may be the "root" partition
of the disk, I would be *very* wary of using the c partition for the same
reasons. 

I have successfully and painlessly changed the size of the "a" partition on some
of my disks. The trick is not to have the disk in use for *anything* at the
time. The only thing you can't do to the a partition is change its offset.
(Obviously it has to start at sector 0).

I didn't like the idea of losing 16 MB either, so in one case, I'm using it
as a backup of my root filesystem. On another disk, I changed the partition
tables such that the b partition was now in the "middle" of the disk with a
filesystem on each side, i.e.:

/dev/ra1a      187 MB  /usr/local
/dev/ra1b      64 MB   swap space
/dev/ra1f      174 MB  /u1

The system I administer is swapping to 3 spindles. In deference to advice
I received locally I followed these rules:
	- all swap spaces are the same size (64 MB in my case)
	- each one is larger than physical memory
	- total swap space is about 4 times physical memory

In practice (observed using vmstat(1)), this system is nicely distributing its
swapping across the 3 partitions available to it.
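The three rules above are easy to check mechanically. The following is a
minimal sketch; the function name, the 25% tolerance on "about 4 times", and
the 48 MB physical memory figure in the example are assumptions, not numbers
from the post.

```python
def check_swap_layout(swap_sizes_mb, physical_mb, target_ratio=4.0, tolerance=0.25):
    """Return a list of the rules of thumb that a swap layout breaks."""
    problems = []
    # Rule 1: all swap areas are the same size.
    if len(set(swap_sizes_mb)) > 1:
        problems.append("swap areas differ in size")
    # Rule 2: each swap area is larger than physical memory.
    if any(size <= physical_mb for size in swap_sizes_mb):
        problems.append("a swap area is not larger than physical memory")
    # Rule 3: total swap is about target_ratio times physical memory.
    ratio = sum(swap_sizes_mb) / physical_mb
    if abs(ratio - target_ratio) > tolerance * target_ratio:
        problems.append("total swap is %.1fx physical memory" % ratio)
    return problems
```

Three 64 MB areas on a machine assumed to have 48 MB of memory pass all three
checks, since 192 MB is exactly 4 times 48 MB.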

-- 
Brian Ellis                               Computing Services Centre
Domain: ellis@rata.vuw.ac.nz              Victoria University of Wellington
Bang paths... grrrr!!!!!		  P.O Box 600, New Zealand.
What! - no cute .sig ???

grr@cbmvax.commodore.com (George Robbins) (06/18/91)

In article <1991Jun18.042752.10818@rata.vuw.ac.nz> ellis@rata.vuw.ac.nz (Brian Ellis) writes:
> In article <967@lhdsy1.chevron.com>, yzarn@lhdsy1.chevron.com (Philip Yzarn de Louraille) writes:
> |> In article <1991Jun14.184609.21178@mlb.semi.harris.com> dcb@dave.mis.semi.harris.com writes:
> |> >On one of our 5500s, we have configured over 500M of swap space. Someone
> |> >recommended using the 'a' and 'b' partitions over five drives, with the
> |> >a/b partitions combined into a larger 'a' partition.
> |> ... deleted text...

> |> I was told once to *never* mess with the a partition of a disk.
> |> Forget about the 16Meg you will be losing.
> 
> Referring to the guide to Configuration file maintenance...
> 
> "Avoid selecting partition a of any disk for use as the swap partition. If
> partition table information was defined for a disk and swapping occurs on the
> a partition, the information is destroyed and data is lost."
> 
> Given that either the "a" or the "c" partition may be the "root" partition
> of the disk, I would be *very* wary of using the c partition for the same
> reasons. 

Well, I would be wary of changing the "c" partition to anything other than the
whole drive and then using it for swap space.  As long as you are really planning
on using the whole drive, the worst that can happen is you blow away the (putative)
partition table and Ultrix reverts to using the partition information coded in the
disk driver, which incidentally will have "c" as the whole drive.

On the other hand, I would prefer to config the kernel to swap on "b" and then
chpt the "b" partition to whatever size I wanted.  This makes changing the swap
an operational task instead of requiring a new config and avoids violating the
principle of least surprise, lest a co-worker or other person get stuck with doing
a post-mortem on your system.

-- 
George Robbins - now working for,     uucp:   {uunet|pyramid|rutgers}!cbmvax!grr
but no way officially representing:   domain: grr@cbmvax.commodore.com
Commodore, Engineering Department     phone:  215-431-9349 (only by moonlite)

grr@cbmvax.commodore.com (George Robbins) (06/18/91)

In article <967@lhdsy1.chevron.com> yzarn@lhdsy1.chevron.com (Philip Yzarn de Louraille) writes:
> In article <1991Jun14.184609.21178@mlb.semi.harris.com> dcb@dave.mis.semi.harris.com writes:
> >On one of our 5500s, we have configured over 500M of swap space. Someone
> >recommended using the 'a' and 'b' partitions over five drives, with the
> >a/b partitions combined into a larger 'a' partition.
> ... deleted text...

> I was told once to *never* mess with the a partition of a disk.
> Forget about the 16Meg you will be losing.

This is rather simplistic - the better advice is to warn people not
to mess with disk partitions at all, *unless* they understand the
consequences of changing the different partitions and how to
recover from a mistake...

Also, I don't consider the 16 MB wasted; there are lots of
good uses for "spare" "a" and "b" partitions if you don't need them
for swapping or root partitions.

-- 
George Robbins - now working for,     uucp:   {uunet|pyramid|rutgers}!cbmvax!grr
but no way officially representing:   domain: grr@cbmvax.commodore.com
Commodore, Engineering Department     phone:  215-431-9349 (only by moonlite)

frank@croton.nyo.dec.com (Frank Wortner) (06/19/91)

In article <967@lhdsy1.chevron.com>, yzarn@lhdsy1.chevron.com (Philip Yzarn de Louraille) writes:

|>I was told once to *never* mess with the a partition of a disk.
|>Forget about the 16Meg you will be losing.

You don't have to waste the space entirely.  An "orphan" a partition makes
a handy /tmp or /var or other small filesystem.  It also makes a handy spare
copy of the root filesystem.  Just the thing to have around the next time
someone accidentally types "rm -rf /".  [I've seen it happen!  :-( ]

Also, you can use chpt to reduce the size of the "a" partition.  You just should
not get rid of it entirely.  I would not suggest reducing the size of the partition
if you intend to use it as a root filesystem, though.

						Frank

lhoe@iddth.id.dk (Lasse H Oestergaard (EMP)) (06/19/91)

diamond@jit533.swstokyo.dec.com (Norman Diamond) writes:

>In article <1991Jun14.184609.21178@mlb.semi.harris.com> dcb@dave.mis.semi.harris.com writes:
>>On one of our 5500s, we have configured over 500M of swap space. Someone
>>recommended using the 'a' and 'b' partitions over five drives, with the
>>a/b partitions combined into a larger 'a' partition.

>I believe that if you swap onto your 'a' partition, then you overwrite the
>partition information and corrupt the entire disk.  It might be possible to
>make a really tiny 'a' partition and larger 'b' partition while leaving the
>rest of the disk alone; I haven't tried it.

>I believe that if you swap onto a disk's 'c' partition (an entire disk),
>then you are usually safe, but if a random memory that happens to look like
>a partition table happens to get swapped out onto the lucky block, and then
>you reboot, you have a corrupted disk.

>Warning:  this post is a combination of vague recollections and logical
>thinking, both of which are risky and should not be depended upon when using
>a derivative of the Unix operating system.  Do your own reading and/or obtain
>advice from an expert on the topic before relying on it.


  HOW TO DO IT RIGHT:

  The following is correct, and tested by experience on VAXes, but should
  be true for RISC systems also:

  If you swap on a partition, no matter which, it should NOT start at
  segment 0, which the  a  and  c  partitions usually do. Use 

   chpt -p<whatever>  /dev/r...c  64  <size when starting at 0 - 64>

  to leave room for the bootblock, partition table and whatever, by
  starting the partition at 64, and shrinking the size correspondingly.
  Or change the a-partition to have a small length (64 should do it)
  (while still starting at 0), discard it, and use (for instance) an
  extended b-partition (which can start at 64) for swapping.
  The current (or if no valid partition table exists, the default) size 
  of partitions can be found with

   chpt -q /dev/r...c
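  The arithmetic behind "start at 64 and shrink the size correspondingly"
  can be sketched as follows.  The function names are hypothetical, the
  command string simply follows the chpt form quoted above, and the
  32768-sector size in the example is an assumed figure, not one read
  from a real disk.

```python
def shift_partition_start(offset, size, reserve=64):
    """Return (new_offset, new_size) so the partition starts at or after `reserve`.

    The end of the partition stays where it was; only the start moves.
    """
    if offset >= reserve:
        return offset, size              # already clear of the reserved area
    shift = reserve - offset
    if size <= shift:
        raise ValueError("partition too small to give up the reserved area")
    return reserve, size - shift


def chpt_command(partition, device, offset, size, reserve=64):
    """Build the chpt -p command line for the shifted partition (not executed)."""
    new_offset, new_size = shift_partition_start(offset, size, reserve)
    return "chpt -p%s %s %d %d" % (partition, device, new_offset, new_size)
```

  For a partition reported by chpt -q as starting at 0 with an assumed
  size of 32768, chpt_command("b", "/dev/r...c", 0, 32768) gives
  "chpt -pb /dev/r...c 64 32704".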


  MISUSE:

  If a partition starting at 0 is used to swap on, AND any partition on the
  same disk is used for filesystems, the effect depends on whether

   1. The partitions used for filesystems have been redefined (i.e. 
	have either a non-standard placement or size, or both).
	In this case the disk will become unreadable, probably on the next
	boot, because the partition table has been overwritten, AND THE
	DISK DRIVER THEREFORE APPLIES THE COMPILED-IN DEFAULT.
	Unless (one of) the swap partition(s) is the primary swap partition
	and further, the DEFAULT placement/size for that partition overlays
	part of a filesystem partition, the boot to multi user mode should
	fail, and NO INFORMATION SHOULD BE DESTROYED, and the disk can be
	recovered, PROVIDED THAT you can remember the original partitioning.

   2. The partition is the a-partition (and no partitions have been
        re-defined). THIS SEEMS TO WORK, because the compiled-in default
	is used instead of the destroyed partition table. BUT see below.

   However, WHENEVER a partition that starts at 0 is used to swap on (which
   includes the case where the whole disk is used as a c-partition), the
   swapping may (as Norman Diamond writes) create a 'false partition table',
   though this is very unlikely.  Then all bets are off, of course, and
   you may get filesystems overwritten.
   (In case 2, and when the whole disk is used as a c-partition, you can
   probably recover by dd-ing 64 segments of 0s to the raw disk,
   however.)


   CORRECTING:
   To correct the partitioning of a disk that has a partition starting
   at segment 0 and used as a swapfile, WITHOUT COPYING THE DATA,
   proceed as follows:

      . Make sure that the partition is not active. This normally implies
	a re-boot to single user mode. If the partition is the primary
	swapfile, make an alternative kernel, or boot from the net.

      . Read the current partition table by  chpt -q /dev/r...c
	If you know what the contents should be, and they are not there,
	restore them by successive  chpt -p. /dev/r...c  ... ...
	commands, then use the above commands to modify the
	(relevant) swapfile. (Also check the filesystems.)

      . Otherwise, if the partition table is NOT standard, it is (probably)
	un-corrupted, so, unless you KNOW different, just use the above
	commands to modify the (relevant) swapfile.

      . Otherwise set the partition table to the default with
	  chpt -d /dev/r... , then use the above
	commands to modify the (relevant) swapfile.

    NOTE: This may fail if no filesystem exists in what would be the
    a-partition if the default applied (i.e. starting at 0). Then one
    must make a small dummy filesystem there first (which will NOT
    overwrite anything useful, in this situation):

      . Find a partition that starts at 0 in the current partition
	table. Either the actual swap-partition or the default a-partition
	must do this. Assume this is  X .
	Use  /etc/mkfs  /dev/r...X  1000   to create a mini-filesystem.
	DO NOT USE  newfs  .

      . Either restore a (known) nonstandard partition table partition by
	partition, or set it to the default. This should work now.
	Then use the above commands to modify the (relevant) swapfile.


		     Lasse H. Ostergaard      lhoe@id.dth.dk

bill@unixland.natick.ma.us (Bill Heiser) (06/22/91)

In article <1991Jun17.171237.11133@linus.mitre.org> kepowers@mbunix.mitre.org (Powers) writes:
>>You might also want to look at that Ultrix tuning paper that was posted
>>here not long ago, it appears that Ultrix may be surprisingly poorly
>>tuned for today's RISC systems...
>
>Does anyone know where this can be found?  TIA.

If someone could post this info, it'd be most-helpful.  Thanks!


-- 
bill@unixland.natick.ma.us     ...!uunet!think!unixland!bill
OR ..!uunet!world!unixland!bill     heiser@world.std.com
Public Access Unix 508-655-3848(2400)   508-651-8723(HST)  508-651-8733(PEP-V32)