[comp.unix.wizards] Can directory files have holes in them ?

naim@accuvax.nwu.edu (Naim Abdullah (CSRL)) (10/05/89)

In 4.3bsd, is it possible for a directory file to have holes
in it ?

By this I mean:

struct dinode *ip;

ip = ... ; /* set it to a directory inode */
(ip->di_db[i] == 0) && (ip->di_db[j] != 0) && (i < j)

And what about indirect blocks (assuming a really HUGE directory). Is it
possible to have:

(ip->di_ib[i] == 0) && (ip->di_ib[j] != 0) && (i < j)
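
The two conditions above can be folded into one scan over the inode's block pointers. What follows is only a minimal sketch: `struct toy_dinode`, `scan_ptrs` and `toy_inode_has_hole` are hypothetical names, the struct is a cut-down stand-in for the real `struct dinode`, and the contents of the indirect blocks themselves are not examined.

```c
#include <stddef.h>

#define NDADDR 12   /* direct block pointers per inode (4.3BSD value) */
#define NIADDR 3    /* indirect block pointers per inode (4.3BSD value) */

/* Hypothetical stand-in for struct dinode: block pointers only. */
struct toy_dinode {
    long di_db[NDADDR];
    long di_ib[NIADDR];
};

/*
 * Scan addr[0..n-1]. *seen_zero carries state between calls so that a
 * zero direct pointer followed by a nonzero indirect pointer is caught.
 * Returns 1 as soon as a nonzero pointer follows a zero one.
 */
static int scan_ptrs(const long *addr, size_t n, int *seen_zero)
{
    for (size_t i = 0; i < n; i++) {
        if (addr[i] == 0)
            *seen_zero = 1;
        else if (*seen_zero)
            return 1;
    }
    return 0;
}

/* Return 1 if the top-level pointers show a hole, else 0. */
static int toy_inode_has_hole(const struct toy_dinode *ip)
{
    int seen_zero = 0;

    return scan_ptrs(ip->di_db, NDADDR, &seen_zero) ||
           scan_ptrs(ip->di_ib, NIADDR, &seen_zero);
}
```

Trailing zero pointers are not reported, since they only mean the file ends before that block.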


It seems to me that since the kernel is the only one that writes directory
blocks, it should be easy to ensure that no holes are present in a directory
file. But I don't know if the 4.3bsd kernel bothers to do this, hence the
question.

Thanks.

		      Naim Abdullah
		      Dept. of EECS,
		      Northwestern University

		      Internet: naim@eecs.nwu.edu
		      Uucp: {oddjob, chinet, att}!nucsrl!naim


P.S.: The reason for this question is that dumptraverse.c, in the source
for dump, is careful enough to handle directory files with holes in them.
When I was browsing through it, I noticed this and I was wondering if this
was just good and healthy programmer paranoia or whether this can really
happen.

gwyn@smoke.BRL.MIL (Doug Gwyn) (10/05/89)

In article <1212@accuvax.nwu.edu> naim@eecs.nwu.edu writes:
>In 4.3bsd, is it possible for a directory file to have holes in it?

Sure.

tjc@ecs.soton.ac.uk (Tim Chown) (10/05/89)

In article <1212@accuvax.nwu.edu>, naim@accuvax.nwu.edu (Naim Abdullah (CSRL)) writes:
> In 4.3bsd, is it possible for a directory file to have holes in it ?

Sure is.  Here's an example with the nrsdbm files that are commonly
used in the UK as an online database of mail/ftp sites:

tjc40% ls -l
total 4164
-rw-r--r--  1 root       549740 Oct  5 14:41 DBM1
-rw-r--r--  1 root       544304 Oct  5 14:26 DBM1~
-rw-r--r--  1 root       766931 Oct  5 14:22 DERFIL2
-rw-r--r--  1 root       761268 Oct  5 14:22 DERFIL2~
-rw-r--r--  1 root          568 Sep  5 16:51 UAIEF
-rw-r--r--  1 root          196 Jun  6 16:37 UAIEF~
-rwxr-xr-x  1 root        49152 Jan 25  1989 dbencode
-rwxr-xr-x  1 root        49152 Jan 25  1989 dbpatch
drwxr-xr-x  3 root          512 Jan 25  1989 maint
drwxr-xr-x  2 root          512 Jan 25  1989 newcomms
-rw-r--r--  1 root         8192 Oct  5 14:48 nrsdbm.dir
-rw-r--r--  1 root      3552256 Oct  5 14:48 nrsdbm.pag
tjc41% du
188     ./maint/patch
189     ./maint
1674    ./newcomms
6026    .

More holes than filling, it seems.  (This is on a Sun 3, 4.3BSD)
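
The listing can be checked with a little arithmetic. The sketch below sums the byte sizes printed by ls -l and compares them with the "total" line, assuming that total is in 1024-byte units (an assumption, though it agrees with the du figures: 6026 - 189 - 1674 = 4163). The names `listed_sizes`, `sum_sizes` and `apparent_minus_allocated` are made up for the illustration.

```c
#include <stddef.h>

/* Byte sizes from the ls -l listing above, in listing order. */
static const long listed_sizes[] = {
    549740, 544304, 766931, 761268, 568, 196,
    49152, 49152, 512, 512, 8192, 3552256
};

/* Add up n sizes. */
static long sum_sizes(const long *v, size_t n)
{
    long sum = 0;

    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum;
}

/*
 * Bytes the files claim to hold beyond what is allocated on disk.
 * Assumes total_kb is in 1024-byte units.
 */
static long apparent_minus_allocated(long apparent_bytes, long total_kb)
{
    return apparent_bytes - total_kb * 1024L;
}
```

With the figures above the difference comes out at roughly 2 MB, nearly all of which must sit in nrsdbm.pag, the only file large enough to hide it.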

Tim.

cpcahil@virtech.UUCP (Conor P. Cahill) (10/06/89)

In article <442@ecs.soton.ac.uk>, tjc@ecs.soton.ac.uk (Tim Chown) writes:
> In article <1212@accuvax.nwu.edu>, naim@accuvax.nwu.edu (Naim Abdullah (CSRL)) writes:
> > In 4.3bsd, is it possible for a directory file to have holes in it ?
> 
> Sure is.  Here's an example with the nrsdbm files that are commonly
> used in the UK as an online database of mail/ftp sites:

Your example shows a regular file with holes in it.  The original poster
was asking about a directory file with holes in it.  I, for one, cannot figure
any way you could get holes in a directory because the mechanism to generate
holes is to lseek beyond the end of the file and write some information, thereby
generating a hole between the old end of the file and the new data (assuming 
there was at least 1 full block between the two).

The kernel shouldn't have any reason to seek beyond the end of the file and
therefore shouldn't create a directory with holes in it - But this is 
only guesswork.
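
For a regular file, the lseek-and-write mechanism described above looks roughly like this from user space. It is a minimal sketch; `make_sparse_file` is a hypothetical helper, and whether the skipped range is actually left unallocated depends on the filesystem.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create (or truncate) path, seek gap bytes past the start, and write
 * a single byte there.  On success returns 0 and fills *st with the
 * resulting file status; on failure returns -1.
 */
static int make_sparse_file(const char *path, off_t gap, struct stat *st)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    if (lseek(fd, gap, SEEK_SET) == (off_t)-1 ||
        write(fd, "x", 1) != 1 ||
        fstat(fd, st) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Afterwards st_size is gap + 1, while st_blocks will usually be far smaller than the size suggests on a filesystem that supports holes.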


-- 
+-----------------------------------------------------------------------+
| Conor P. Cahill     uunet!virtech!cpcahil      	703-430-9247	!
| Virtual Technologies Inc.,    P. O. Box 876,   Sterling, VA 22170     |
+-----------------------------------------------------------------------+

chris@mimsy.UUCP (Chris Torek) (10/07/89)

In article <1240@virtech.UUCP> cpcahil@virtech.UUCP (Conor P. Cahill) writes:
>[someone's] example shows a regular file with holes in it.  The
>original poster was asking about a directory file with holes in it.  I,
>for one, cannot figure any way you could get holes in a directory
>because the mechanism to generate holes is to lseek beyond the end of
>the file and write some information, thereby generating a hole between
>the old end of the file and the new data (assuming there was at least 1
>full block between the two).

Well, actually, it is by setting an offset (uio->uio_offset, or
u.u_offset in other kernels) sufficiently large so that when bwrite()
calls bmap(), and bmap() finds there is no data block for the given
offset, and the kernel allocates one, that one happens to be `after' an
offset that also has no data block.  (Sometimes necessary indirect
blocks are also missing, and the allocation gets complicated, but it
all works out the same.)
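
A toy model of that allocate-on-demand mapping may make the point clearer. This is not the kernel's bmap(); `struct toy_file` and `toy_bmap` are hypothetical, there are no indirect blocks, and "allocation" just hands out the next number from a counter.

```c
#include <stddef.h>

#define TOY_NBLK 12   /* number of block pointers in the toy file */

/* Hypothetical file: block pointers (0 = none) and a block allocator. */
struct toy_file {
    long blk[TOY_NBLK];
    long next_free;
};

/*
 * Map logical block lbn to a disk block.  When writing, a missing
 * block is allocated on the spot; nothing is done about any earlier
 * logical blocks, which is how holes come about.  When reading, a
 * missing block (or an out-of-range lbn) yields -1.
 */
static long toy_bmap(struct toy_file *f, size_t lbn, int for_write)
{
    if (lbn >= TOY_NBLK)
        return -1;
    if (f->blk[lbn] == 0) {
        if (!for_write)
            return -1;
        f->blk[lbn] = f->next_free++;
    }
    return f->blk[lbn];
}
```

Writing at logical block 5 of an empty toy file fills in blk[5] only; blocks 0 through 4 keep a zero pointer.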

Anyway, holes in a directory would be invalid in 4.2 and 4.3BSD because
the d_reclen field should never be zero.
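
The connection to d_reclen is that a hole reads back as zeros, so the first entry in such a block would have a record length of zero and the entry chain could not be followed. A minimal sketch of that walk, assuming a simplified entry header modeled on struct direct and a 512-byte directory block (`toy_dirhdr`, `TOY_DIRBLKSIZ` and `toy_walk_dirblk` are hypothetical names):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_DIRBLKSIZ 512   /* assumed directory block size */

/* Fixed-size header of a directory entry; the name would follow it. */
struct toy_dirhdr {
    uint32_t d_ino;      /* inode number, 0 for an unused slot */
    uint16_t d_reclen;   /* length of this record, header included */
    uint16_t d_namlen;   /* length of the name */
};

/*
 * Follow the d_reclen chain through one directory block.  Returns the
 * number of records walked, or -1 if a record length is too small
 * (including zero) or runs past the end of the block.
 */
static int toy_walk_dirblk(const unsigned char *blk)
{
    size_t off = 0;
    int n = 0;

    while (off < TOY_DIRBLKSIZ) {
        struct toy_dirhdr h;

        if (TOY_DIRBLKSIZ - off < sizeof h)
            return -1;
        memcpy(&h, blk + off, sizeof h);   /* avoids alignment traps */
        if (h.d_reclen < sizeof h || h.d_reclen > TOY_DIRBLKSIZ - off)
            return -1;
        off += h.d_reclen;
        n++;
    }
    return n;
}
```

An all-zero block fails on the first record, whereas an empty but valid block holds a single unused record whose d_reclen spans the whole block.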

The original question was why dump is careful not to look at holes
in directories.  The only possible answer is `paranoia'.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

bruner@uicsrd.csrd.uiuc.edu (John Bruner) (10/11/89)

In article <20044@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>The original question was why dump is careful not to look at holes
>in directories.  The only possible answer is `paranoia'.

The paranoid code in dump and the kernel was added because of a
problem I ran into at Purdue on either a V7 PDP-11 or 4.1BSD VAX
(probably an 11/70) several years ago.  A hardware problem caused
a block pointer in a directory inode to be zeroed.  (I do not
remember the exact circumstances, except that it was a hardware
problem and it did not totally destroy the disc.)  "fsck" passed
the filesystem without a complaint, but any attempt to access the
directory (which I recall was something innocuous like "/usr/bin")
caused a panic.

In those days I believe that namei() did something like

	bp = bread(ip->i_dev, bmap(ip, lbn, B_READ));

The bug was that namei() didn't check whether bmap() returned -1
because holes in directories were "impossible".  My fix was to report
the hole on the console and allocate a new block if the filesystem
was mounted read/write; otherwise, it just skipped to the next block.
As I recall, "fsck" had code which checked for holes in files, but it
was commented out.  I turned on the check for directories.
I sent mail to Berkeley, and they integrated these changes into the
4.2BSD filesystem code (although they removed the code which filled
in the hole).
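
The shape of that fix, minus the block allocation, is roughly the following. It is a toy illustration only, not the namei() code: `toy_bmap_read` and `toy_scan_dir` are hypothetical, and the block pointers are a plain array in which 0 marks a missing block.

```c
#include <stddef.h>
#include <stdio.h>

/* Toy stand-in for bmap() on read: disk block for lbn, or -1 for a hole. */
static long toy_bmap_read(const long *blkptr, size_t lbn)
{
    return blkptr[lbn] != 0 ? blkptr[lbn] : -1;
}

/*
 * Scan nblk logical blocks of a directory.  A hole is reported and
 * skipped rather than passed on as block -1.  Returns the number of
 * blocks that would have been read; *holes gets the number skipped.
 */
static size_t toy_scan_dir(const long *blkptr, size_t nblk, size_t *holes)
{
    size_t scanned = 0;

    *holes = 0;
    for (size_t lbn = 0; lbn < nblk; lbn++) {
        long bn = toy_bmap_read(blkptr, lbn);

        if (bn == -1) {
            fprintf(stderr, "hole in directory at logical block %zu\n", lbn);
            (*holes)++;
            continue;
        }
        /* A real kernel would bread() block bn and search its entries. */
        scanned++;
    }
    return scanned;
}
```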
--
John Bruner	Center for Supercomputing R&D, University of Illinois
	bruner@uicsrd.csrd.uiuc.edu	(217) 244-4476	

jc@minya.UUCP (John Chambers) (10/12/89)

In article <1240@virtech.UUCP>, cpcahil@virtech.UUCP (Conor P. Cahill) writes:
> The kernel shouldn't have any reason to seek beyond the end of the file and
> therefore shouldn't create a directory with holes in it - But this is 
> only guesswork.

Oh, I don't know about that; I immediately thought of a way it could 
reasonably happen.  Older Unix systems have a notable problem in the
fact that directories can only grow; there is no mechanism for freeing
the space after a lot of deletions.  One common example is the various
subdirectories of /usr/spool.  Suppose there were a clever routine that
scanned thru large directories, and if there were no in-use entries in
a block, returned it to the free-space list.  This would solve the space
problem nicely.  It'd also help if the directory scanner would notice
the missing blocks, and silently skip over them.

I'm not saying that BSD does this; in fact, I've seen claims that it
moves entries about to compact directories, and frees the blocks from
the end.  This is a bit more work, true, but it wouldn't produce holes
(and all the annoying POSSIBLE FILE SIZE ERROR messages from fsck :-).

Of course, if you have people twiddling the filesystem via fsdb or their
own programs (as in the suggestions re my recent query), it'd be quite
easy to produce a directory with holes.  In any case, holes are legal,
if uncommon, and programs should handle them correctly.

Which reminds me:  If I wanted to test a program of mine for validity
in the face of holey directories, how would I go about producing one?
Legally, I mean, not by twiddling the raw device.  (I know how to do 
that, and I don't like it much; I'd much rather open the directory for 
writing. ;-)  In general, I like to test my code against all the possible
pathological cases, which means that I have to find ways to produce them.
In my mind, a system that won't let me generate a library of "garbage"
files is somewhat deficient as a development system; such deficiencies
are a sign of poor engineering.

-- 
#echo 'Opinions Copyright 1989 by John Chambers; for licensing information contact:'
echo '	John Chambers <{adelie,ima,mit-eddie}!minya!{jc,root}> (617/484-6393)'
echo ''
saying