[comp.unix.questions] Large file systems

davidsen@steinmetz.ge.com (Wm. E. Davidsen Jr) (03/25/89)

In article <28819@bu-cs.BU.EDU> bzs@bu-cs.BU.EDU (Barry Shein) writes:

  I changed the subject on this, because it had drifted away from the
topic of my original posting.

| 4. Is the Unix file system, unenhanced, the right view for personal
| workstations with a few GB of disk? I would claim that the MacOS file
| system view has collapsed as an abstraction with the popularity of
| 300MB or larger disks, as cute as it was with a few files. Is there a
| similar threshold for the Unix system? It's 10PM, do you know where
| your sources are?

  No doubt at some point the tree analogy becomes cumbersome, and I
certainly think there's room for research, but the idea of grouping
things by function makes things a lot easier to understand.

  Would putting executables in one directory make it easier to
understand and backup? How about:
	new		was		contains
	----------------------------------------
	/bin		/bin		standard UNIX utilities
			/usr/bin (part)	distributed with *all*
			/usr/ucb	versions
	/bin/vendor	/usr/bin (part)	vendor specific software
	/bin/apps	/usr/bin (part)	application specific
	/bin/local	/usr/lcl/bin	site specific
I'm sure others would have ideas about organizing the filesystem to
make it easier to understand.
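
  To make that concrete, here is a rough sketch of setting up the new
layout (it assumes the system has symbolic links, so the old names can
keep working while things migrate; on a SysV without symlinks, copies
or shell wrappers would have to do instead):

	# create the new homes under /bin
	mkdir /bin/vendor /bin/apps /bin/local
	# once the old directories are emptied and removed, the old names
	# can point at the new ones (where a one-to-one mapping exists)
	# so existing PATHs and scripts keep working:
	ln -s /bin/local /usr/lcl/bin
	ln -s /bin /usr/ucb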

  Now, your suggestion for research on a new model... sure! There are
lots of ways to group and access data, some tried and some as yet
undiscovered. Things like organizing files into multiple groups (via
links), or associating keywords and/or short descriptions with an
inode, have been tried and may inspire something better. There are
whole new ideas to be tried.

  Now, if we assume that vendors are shipping a SysV derivative, is
there really an advantage to doing the research on a BSD kernel?
Certainly if most of the vendors are shipping SysV, they will not be
thrilled at having the development done on BSD, which is still related
to V7. If you build a new filesystem, is there an advantage to using
something other than SysV as the starting point?

NOTE: I am assuming that the vendors who have joined OSF or UNIX
International will be shipping a SysV flavor in 3-5 years, rather than
BSD. The original question was whether there will still be advantages
to using BSD for anything, including kernel research.
-- 
	bill davidsen		(wedu@crd.GE.COM)
  {uunet | philabs}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

bzs@bu-cs.BU.EDU (Barry Shein) (03/25/89)

>  Now, if we assume that vendors are shipping a SysV derivative, is
>there really an advantage to doing the research on a BSD kernel?
>Certainly if most of the vendors are shipping SysV, they will not be
>thrilled at having the development done on BSD, which is still related
>to V7. If you build a new filesystem, is there an advantage to using
>something other than SysV as the starting point?
>
>NOTE: I am assuming that the vendors who have joined OSF or UNIX
>International will be shipping a SysV flavor in 3-5 years, rather than
>BSD. The original question was whether there will still be advantages
>to using BSD for anything, including kernel research.
>-- 
>	bill davidsen		(wedu@crd.GE.COM)

Two points:

1. It doesn't matter a whole lot what platform the research is done
on; if someone comes up with the better mousetrap, the path to its
door will be beaten. It's not all that important; consider it a full
employment act for system programmers :-)

2. You're overemphasizing the name; although they're calling it SysV,
it has a lot of BSD stuff in there. It really is a merge.

Remember the old maxim, "If it works...it can't be state of the art".

I think if all you're fretting over is what you express above, then
your concerns are factually correct but of minor importance. By
definition any research variant will differ from what the vendors are
shipping; great new ideas will have to be merged in to be adopted, and
will be merged, happily, when their value becomes evident.

	-Barry Shein, Software Tool & Die

olsen@batcomputer.tn.cornell.edu (Dave Olsen) (05/03/89)

I am a system manager at Materials Science Center, Cornell University,
responsible for a Convex C210 system with 5 GB of disk.  Out of this, we
would like to make about 3 GB of disk available for our users, on two
partitions.

What I would like to know is if anyone has found problems with large
partition sizes (up to 2 or 3 GB).  Any problems with utilities like dump,
system administration difficulties with large partition sizes, and so on.
Those persons we have talked to have no experience with this size of
partition.

(For those of you who are wondering how I can be talking about a 2 or 3 GB
partition size on disks smaller than this, Convex Unix allows striping
across several disks to get partitions larger than a single disk.)

+---------------------------------+----------------------------------+
|     Dave Olsen                  |      Materials Science Center    |
|     olsen@msc2.tn.cornell.edu   |      E20 Clark Hall              |
|     olsen@crnlmsc2.bitnet       |      Cornell University          |
|     (607) 255-2067              |      Ithaca, NY 14853            |
+---------------------------------+----------------------------------+

cgh018@tijc02.UUCP (Calvin Hayden) (05/04/89)

> 
> I am a system manager at Materials Science Center, Cornell University,
> responsible for a Convex C210 system with 5 GB of disk.  Out of this, we
> would like to make about 3 GB of disk available for our users, on two
> partitions.
> 
> What I would like to know is if anyone has found problems with large
> partition sizes (up to 2 or 3 GB).  Any problems with utilities like dump,
> system administration difficulties with large partition sizes, and so on.
> Those persons we have talked to have no experience with this size of
> partition.
> 
> (For those of you who are wondering how I can be talking about a 2 or 3 GB
> partition size on disks smaller than this, Convex Unix allows striping
> across several disks to get partitions larger than a single disk.)
...

We have some file systems (Sys V r2 v2 on a VAX 8600) that are large; not
as large as you are wanting, but still pseudo-large.  The largest is a
600MB file system located on one drive; it takes up only 4 of the drive's
8 partitions.  Unfortunately, we can't spread a file system across drives.
Problems... backups to 9-track tape (if that's what is used) can be hell.
We're up to 4 full 2400 ft reels @ 6250 bpi for a volcopy of this file
system (the operator just loves it :->); rough reel arithmetic below.
Another problem... doing a restore of the file system in the event of an
hda failure.  This happened to me, on the above-mentioned file system (too
bad most *nix systems don't have online diagnostics like VMS's, to let the
admin know ahead of time that there may be a problem ;->).  It takes a
while to restore, and then you have to worry about having a spare drive
(or in your case, drives) to restore the file system to.  We were lucky:
we had a spare drive for this purpose, but while work was being done on
the dead one, we had no spare disk space for any further fs restorations.
Backups to cartridge tapes would make this easier, but the latter problem
would still exist.
Hope this helps in some way.
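
P.S. For anyone sizing reels, here is the back-of-the-envelope arithmetic
(it ignores inter-record gaps and tape labels, which eat into the real
capacity):

	$ expr 2400 \* 12 \* 6250
	180000000

That's on the order of 170-180 MB of raw capacity per 2400 ft reel at
6250 bpi, so a 600 MB volcopy spilling onto 4 reels is about what you'd
expect once the gaps are paid for.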

++++++++++++++++++++++++++++++++++
+ Calvin Hayden                  +
+ Texas Instr.                   +
+ UUCP:...mcnc!rti!tijc03!cgh018 + 
++++++++++++++++++++++++++++++++++

Life's a mountain, not a beach!

nash@ucselx.uucp (ron) (05/05/89)

In article <7875@batcomputer.tn.cornell.edu> olsen@batcomputer.tn.cornell.edu (Dave Olsen) writes:

>I am a system manager at Materials Science Center, Cornell University,
>responsible for a Convex C210 system with 5 GB of disk.  Out of this, we
>would like to make about 3 GB of disk available for our users, on two
>partitions.

>What I would like to know is if anyone has found problems with large
>partition sizes (up to 2 or 3 GB).  Any problems with utilities like dump,
>system administration difficulties with large partition sizes, and so on.

I am a system manager at San Diego State University, responsible for
an Elxsi 6400 BSD4.3 system with 3 GB of disk.  Our /usr1 partition
is 1.2 GB.  I have not had any problems with dump, fsck, quotas so far.
I hope this helps some.  Feel free to email any questions.

Ron Nash
University Computing Services
...ucsd!sdsu!ucselx!nash

peno@kps.UUCP (Pekka Nousiainen /DP) (05/06/89)

> What I would like to know is if anyone has found problems with large
> partition sizes (up to 2 or 3 GB).

There's one practical problem with dump: To restore one file you may have
to read through several tapes.  This can be a problem if the file system
is used for "/users".  Apart from this I can't think of any problems.
I use a 1 GB file system for /spool in production (7 tapes); in the future
it could be 2 GB concatenated across 2 drives.  The dumps are meant mainly
to cover disk failure, not random "rm *"'s by careless users.
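
Roughly what the cycle looks like here, in case it helps (the device
names and dump key letters are only examples; yours will differ):

	# level-0 dump of the spool file system; 'u' records the date in
	# /etc/dumpdates, 'f' names the tape drive (an example name):
	dump 0uf /dev/rmt8 /spool
	# getting one file back later means feeding the reels in order
	# and picking it out with interactive restore:
	restore if /dev/rmt8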

--
peno@kps

wh%cxa.daresbury.ac.uk@nsfnet-relay.ac.uk (Bill Purvis) (05/08/89)

In a recent message, Dave Olsen writes:
>  What I would like to know is if anyone has found problems with large
>  partition sizes (up to 2 or 3 GB). ...

We run a Convex C220 with 16 GB of disk. You will find that there is a
limit of 2 GB on file systems (offsets are held as signed 32-bit ints).
We have had no problems with user partitions of 900 MB (single disk),
although you need a lot of tapes in your backup system. We do run a
1.8 GB partition for scratch files which gives no problems, but we
don't back it up. Most of our user files are stored in the 'g' and 'h'
partitions, since we can use a smaller fragment size (512 bytes) provided
the file system is less than 512 MB. Above this limit we must use
4096-byte fragments.
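
In case it saves anyone a manual lookup, the knobs involved are the block
and fragment sizes given when the file system is built; a rough sketch
with 4.3BSD-style newfs flags (the device names are made up, and your
newfs/mkfs options and any disk-type argument may differ):

	# smaller user partition: 4 KB blocks, 512-byte fragments
	# (only workable below the ~512 MB limit mentioned above)
	newfs -b 4096 -f 512 /dev/rdk2g
	# bigger partition: forced up to 4 KB fragments
	newfs -b 4096 -f 4096 /dev/rdk2h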

=========================================================================
Bill Purvis                     || wh@uk.ac.dl.cxa
SERC, Daresbury Lab             || (JANET)
Warrington, Cheshire, UK        ||
=========================================================================

gentry@kcdev.UUCP (Art Gentry) (05/08/89)

In article <473@kps.UUCP>, peno@kps.UUCP (Pekka Nousiainen /DP) writes:
> > What I would like to know is if anyone has found problems with large
> > partition sizes (up to 2 or 3 GB).
> 
> There's one practical problem with dump: To restore one file you may have
> to read through several tapes.  This can be a problem if the file system
> is used for "/users".  Apart from this I can't think of any problems.
> [additional verbiage deleted]
 
I have several 571 MB file systems (the max size of my discs) and have no
problems except as noted above.  One workaround I have used for that is to
cpio out individual directories, which makes restoring files much easier.
I have always been a little nervous about multiple-tape archives anyhow;
Murphy says "if you need a file from tape #9, tape #8 will be corrupt". :-)
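
The per-directory runs are nothing fancy; roughly this (the mount point
and tape device name are just examples, and the pause for a tape change
is optional if several directories fit on one tape):

	# one cpio archive per top-level user directory, so losing one
	# tape only costs one directory's worth of files
	cd /usr1			# example mount point
	for d in *
	do
		echo "load a tape for $d, then press return"
		read junk
		find $d -print | cpio -ocvB > /dev/rmt/0m
	done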

prc@erbe.se (Robert Claeson) (05/10/89)

In article <735@kcdev.UUCP>, gentry@kcdev.UUCP (Art Gentry) writes:

> I have always been a little nervous about multiple-tape archives anyhow;
> Murphy says "if you need a file from tape #9, tape #8 will be corrupt". :-)

Yes...

-- 
          Robert Claeson      E-mail: rclaeson@erbe.se
	  ERBE DATA AB

jeffrey@algor2.UUCP (Jeffrey Kegler) (05/11/89)

In article <675@maxim.erbe.se> prc@erbe.se (Robert Claeson) writes:
>In article <735@kcdev.UUCP>, gentry@kcdev.UUCP (Art Gentry) writes:
>
>> I have always been a little nervous about multiple-tape archives anyhow;
>> Murphy says "if you need a file from tape #9, tape #8 will be corrupt". :-)
>
>Yes...

I have always considered the behavior of making all subsequent volumes
unreadable if a previous one is unreadable (lost, etc.) a serious bug.
Hence I never make multi-volume backups.

I am clumsy, and do a lot of file-system-crunching driver work, so I need
backups pretty often.  To date, I have had only one unsuccessful restore
out of dozens (my tape drive broke, and the restore will probably work
when the new one arrives).  The usual track record I see elsewhere is
that one in two restores fails.

My rules for backups:

1) Backup procedures should be unintelligent, in fact stupid.  Clever
selection of only the directories you will need is likely to miss one
crucial file somewhere.  Your backups should cover at least whole file
systems at a time, if not the universe.  Assume that whoever is doing the
backups is really dumb, or not paying attention, or both.

Exception:  Special project backups of what you are working on at the
moment, if you have a fuller backup scheme in place, sufficient to prevent
catastrophic losses.

2) Never use incremental backups (files changed since the last backup).
The reliance on two restores increases the risk factor too much.

Exception:  Incrementals done for a little extra security where the basic
backup scheme is sufficient to prevent major losses.  In other words, where
you are not relying on the incremental for anything major.

3) Never do a backup onto multiple media where you are depending on the
contents of one volume to restore another.  In fact, where it makes sense
on the media (Bernoullis, for example, or other random-access media with
capacity over 2 megabytes), I will break up a backup even within a single
physical volume.

4) Always do a verify pass over the backup volumes immediately after
creating them.

5) Use backup methods that conveniently allow you to restore a single file.
If the only easy way to restore stomps your entire file system, you are
creating some pretty nasty potential choices for the restorer.

6) Use backup methods that allow you the greatest range of restore chances.
If you have a choice between backing up on media that only one drive will
read, as opposed to two drives, guess which gives you better odds.
Remember, the circumstances under which you do restores are usually less
than optimal, and often bad beyond the imagination of the person doing the
backup.

No backup procedure wastes more time than one that will let you down
when you need it.

Techniques:  As long as the above rules are followed, anything that works
is OK.  I personally (and I may be behind the times) use ff to generate
lists of file names and sizes by file system, a shell script to break the
file name list into 1-megabyte or volume-sized chunks, as appropriate, and
then cpio.  I cpio -icvt them back and do a compare with the actual
contents of the file system, by name and file size (log files, of course,
will differ, and the set of temporary files will differ).
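
For the curious, a stripped-down sketch of the splitting and verify steps
(the massaging of ff output into "pathname size" pairs is left out; the
list is assumed to be in a file called filelist, and /backup is just an
example mount point for the removable volume):

	# split the "pathname size-in-bytes" list into roughly 1 MB chunks
	chunk=0
	total=0
	while read name size
	do
		if [ `expr $total + $size` -gt 1000000 ]
		then
			chunk=`expr $chunk + 1`
			total=0
		fi
		echo $name >> chunk.$chunk
		total=`expr $total + $size`
	done < filelist

	# one archive per chunk, written as an ordinary file on the mounted
	# removable volume; reading the table of contents back is a cheap
	# verify pass (comparing names and sizes against the live file
	# system is better, as described above)
	for c in chunk.*
	do
		cpio -ocv < $c > /backup/$c.cpio
		cpio -icvt < /backup/$c.cpio > $c.toc
	done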
-- 

Jeffrey Kegler, President, Algorists,
jeffrey@algor2.UU.NET or uunet!algor2!jeffrey
1762 Wainwright DR, Reston VA 22090