[comp.sys.amiga.tech] when is a block not a block?

cpmurphy@vax1.tcd.ie (07/11/90)

   I'm working on a project with a friend which involves, inter alia, writing
files to and from disks. We've written a few small things just to cut our teeth
on the Amiga, as it were. So I had a good look at the new 'ls' program that was
in comp.sources.amiga recently. There I was alarmed to see that there is a bug
in 1.3 to do with file blocks. Apparently using Examine() or ExNext() with a
FileInfoBlock structure will not always give the correct number of blocks taken
up by a file. The way to get around it, so it says, is to use Info().
  I'm a little bit confused by all of this. Could someone please enlighten me?
How do I get the "real" number of blocks taken up by a file so as to avoid this
bug? Say I have a file on a HD and I want to copy it to a floppy. How do I
ensure that there is enough space on the floppy for this file? Should I convert
its size in bytes to "floppy blocks", then compare that to the Info() result for
the floppy? Or is there an overhead involved (some code I've seen implies there is)?
  Thank you for your help and your patience.
-- 
--------------------------------------------------------------------------------
Christian Murphy, Trinity College, Dublin, Ireland	cpmurphy@vax1.tcd.ie
cpmurphy%vax1.tcd.ie@cunyvm.cuny.edu (from ignorant internet sites)
...uunet!mcsun!ukc!swift.cs.tcd.ie!vax1.tcd.ie!cpmurphy (same path as news)

mwm@raven.pa.dec.com (Mike (Real Amigas have keyboard garages) Meyer) (07/11/90)

In article <6498.269a4527@vax1.tcd.ie> cpmurphy@vax1.tcd.ie writes:
   There I was alarmed to see that there is a bug
   in 1.3 to do with file blocks. Apparently using Examine() or ExNext() with a
   FileInfoBlock structure will not always give the correct number of blocks
   taken up by a file. The way to get around it, so it says, is to use Info().

There isn't a bug in 1.3 about the number of blocks a file uses.
There's just a difference of opinion between OFS and FFS about what
that phrase means. One of them counts the number of data blocks,
the other counts all the blocks used by the file, both data and
overhead. (Question to CATS: Is this considered a bug? And is it fixed
in 2.0?)

Because of this, the block count field in the FileInfoBlock is pretty
much useless if high accuracy is desired. Get the size of the file in
characters (that only counts the data) and the # of bytes in a block
for the device. From that, you can compute both the number of data
blocks and the number of overhead blocks.
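Mike's recipe reduces to two round-up divisions. A minimal sketch in C, with the figures hedged: the 512/488 data-bytes-per-block payloads and the 72 pointers per header/extension block are the numbers quoted elsewhere in this thread, not anything this code queries from a device:

```c
#include <assert.h>

/* Data blocks needed for a file of `size` bytes on a volume whose
   blocks each hold `payload` data bytes (512 under FFS, 488 under OFS). */
long data_blocks(long size, long payload)
{
    return (size + payload - 1) / payload;      /* round up */
}

/* Overhead blocks: the file header points at up to 72 data blocks and
   each extension block at up to 72 more; an empty file still has its
   header block. */
long overhead_blocks(long datablocks)
{
    long n = (datablocks + 72 - 1) / 72;
    return (n == 0) ? 1 : n;
}
```

For the 135046-byte file discussed later in this thread, that gives 264 data + 4 overhead = 268 blocks under FFS, and 277 + 4 = 281 under OFS.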

   I'm a little bit confused by all of this. Could someone please enlighten me.
   How do I get the "real" number of blocks taken up by a file so as to avoid
   this bug? Say I have a file on a HD and I want to copy it to a floppy. How
   do I ensure that there is enough space on the floppy for this file? Is it
   to convert its size in bytes to "floppy blocks", then compare it to the
   Info() from the floppy? Or is there an overhead involved (some code I've
   seen implies there is).

The easy way is to get Fish disk 352, and look in treewalk for the
"willfit" program. That takes a device and a bunch of directories, and
tells you whether there are enough free blocks on the device to hold
the directories and everything in them (modulo 1 block, which depends
on where on the floppy the directory is going).

What willfit does is use the block size from the device to figure
sizes for files in the tree. Since FFS puts more data in a block than
OFS, the number of blocks can grow when you copy a file from an FFS
hard disk to an OFS floppy.

	<mike
--
All around my hat, I will wear the green willow.	Mike Meyer
And all around my hat, for a twelve-month and a day.	mwm@relay.pa.dec.com
And if anyone should ask me, the reason why I'm wearing it,	decwrl!mwm
It's all for my true love, who's far far away.

peterk@cbmger.UUCP (Peter Kittel GERMANY) (07/12/90)

In article <6498.269a4527@vax1.tcd.ie> cpmurphy@vax1.tcd.ie writes:
>How do I get the "real" number of blocks taken up by a file so as to avoid this
>bug? Say I have a file on a HD and I want to copy it to a floppy. How do I
>ensure that there is enough space on the floppy for this file?

One file on floppy or HD consists of one header block plus a number of
data blocks. How many data bytes fit into one media sector depends on
which filesystem is used. On HDs today mostly the FastFileSystem is
used, which stores 512 data bytes per sector. On floppies, however, the
old filesystem is still the norm, and it stores only 488 data bytes per
sector (the rest is organisational overhead). So a difference can well
arise in the number of blocks needed on HD vs. floppy. You may take the
said numbers to compute the needed blocks when you know the file length in bytes.
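As a quick sketch of that arithmetic in C (`blocks_for` is just a made-up name for the round-up division; 512 and 488 are the payload figures quoted above):

```c
#include <assert.h>

/* Round a file's byte length up to whole data blocks of `payload`
   data bytes each (512 for the FastFileSystem, 488 for the old
   filesystem). */
long blocks_for(long bytes, long payload)
{
    return (bytes + payload - 1) / payload;
}
```

A 100000-byte file, for instance, needs 196 data blocks on an FFS partition but 205 on an OFS floppy.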

(To further confuse you: in the future we will perhaps see other
filesystems or media with not 512 bytes per block, but 1024. Then the
difference will be different again...)

-- 
Best regards, Dr. Peter Kittel      //     E-Mail to 
Commodore Frankfurt, Germany      \X/      rutgers!cbmvax!cbmger!peterk

mwm@raven.pa.dec.com (Mike (Real Amigas have keyboard garages) Meyer) (07/12/90)

In article <277@cbmger.UUCP> peterk@cbmger.UUCP (Peter Kittel GERMANY) writes:
   One file on floppy or HD consists of one header block plus a number of
   data blocks.

Don't forget that if the file is big enough, you get multiple header
blocks. I haven't checked 2.0 yet, but on 1.3 there's no way (that I
know of) to determine how many data blocks per header block. On the other
hand, everything uses 72.

	<mike

--
Teddies good friend has his two o'clock feast		Mike Meyer
And he's making Teddies ex girl friend come		mwm@relay.pa.dec.com
They mistook Teddies good trust				decwrl!mwm
Just for proof that Teddy was dumb.

kim@uts.amdahl.com (Kim DeVaughn) (07/16/90)

In article <6498.269a4527@vax1.tcd.ie>, cpmurphy@vax1.tcd.ie writes:
>
>So I had a good look at the new 'ls' program that was
>in comp.sources.amiga recently. There I was alarmed to see that there is a bug
>in 1.3 to do with file blocks. Apparently using Examine() or ExNext() with a
>FileInfoBlock structure will not always give the correct number of blocks taken
>up by a file.

Well ... the real "bug" is that the only place it's "officially documented"
(that I could find) is in an old AmigaMail from CBM (the Nov/Dec 1988 issue,
in an article by Steve Beats on the differences between the FFS and the OFS).
To quote from the "struct FileHeaderBlock" listed on page 5 of that issue:

   ULONG HighSeq;         /* total blocks used in file (not updated
                             by FFS, only the old filing system) */
   ULONG DataSize;        /* number of data blocks used for this file */  (sic)

One other mention of this is made by Betty Clay in her fine article on the
FFS in an issue of the now (sadly) defunct TransAmi journal, from about the
same time.  In it she mentions that a USENET posting describes this, but I was
unable to locate any such from my "archives" of "probably useful information".

Be that as it may, once you figure out that what we're really talking about is a part
of what is known as the "struct FileInfoBlock" as defined in the dos.h/dos.i
include files, and you equate "HighSeq" with "fib_NumBlocks", and you decide
to ignore the comment associated with "DataSize" above as a typo, things
start to make sense ... :-)

Oh yes, one other thing, the comment on "HighSeq" (fib_NumBlocks) is wrong.
The field *is* updated by the FFS ... it's just not accurate (usually off by
2 or 3 blocks, but that depends on the file's size).

For example, a file I have is 135046 bytes long.  On an FFS device, the
fib_NumBlocks entry says 267 data blocks, whereas the file actually has only
264 data blocks.  And no, fib_NumBlocks isn't adding in the file's header
and extension blocks for you.  The file actually occupies 268 blocks total
(264 data, 1 header, 3 extensions).

BTW, on an OFS device (floppies, for example), this file will have 277 data
blocks, and occupy 281 blocks total.
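Those totals are consistent with plain ceiling arithmetic over the numbers given here (512/488 data bytes per block, plus one header or extension block per 72 data blocks); a quick sketch:

```c
#include <assert.h>

/* Total blocks = data blocks + header/extension blocks, where the
   header and each extension block hold up to 72 data-block pointers. */
long total_blocks(long bytes, long payload)
{
    long data = (bytes + payload - 1) / payload;
    return data + (data + 71) / 72;
}
```

total_blocks(135046, 512) gives the 268 quoted above for FFS, and total_blocks(135046, 488) the 281 for OFS.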


>The way to get around it, so it says, is to use Info().

You use the call to Info() to find out the number of data-bytes per data-block
(id_BytesPerBlock in the InfoData struct).  You use this to "correct" the
value found in fib_NumBlocks (see below).


>  I'm a little bit confused by all of this. Could someone please enlighten me.

I can't imagine why ... :-)  Actually, this whole area is one of the most
poorly "documented" areas in the entire AmigaOS (along with the console/CLI
interface).  Due, I'm sure, to the braindead TriPOS/BCPL crap that had to
be grafted on top of an otherwise pretty nice OS at the last moment, many
years ago.  CBM has been working hard to exorcize this stuff, and I really
hope to see the interfaces and documentation improve with 2.0.  From the few
"bits'n'pieces" that have been mentioned here at times, I don't think I'll be
disappointed.

One other problem is that none of this information is in the RKM's.  They just
point you to the Bantam AmigaDOS manual, which tells you very little, and what
it does say is quite out of date.


>How do I get the "real" number of blocks taken up by a file so as to avoid this
>bug? Say I have a file on a HD and I want to copy it to a floppy. How do I
>ensure that there is enough space on the floppy for this file? Is it to convert
>its size in bytes to "floppy blocks", then compare it to the Info() from the
>floppy?

Well, assuming your HD partition is an FFS one, you could just say:

    "ls -lB488 filename" 

using ls v4.0k, take a look at the number of blocks reported, and see if your
floppy has that many blocks left on it (with the AmigaOS command "Info").  See
the ls docs for further examples.

Or you could use Mike Meyer's "willfit" pgm.  I think he's calculating things
correctly, as we hashed this over in email awhile ago.

Don't use the recently posted "du" (v1.2), as it does not always give the
correct number of blocks on large files.  I've reported this to the author.

If it's for your own code, you can use/modify the routines that handle this
in ls ... fixNumBlocks() and blkalloc().  You first call fixNumBlocks(), then
blkalloc() for each file of interest (though if the file of interest is on
an OFS device, it isn't strictly necessary to do the fixNumBlocks() call).

I've appended these two little routines below (edited somewhat to remove some
"extraneous" comments :-) ).


In subsequent postings, Mike Meyer said:
>
>There isn't a bug in 1.3 about the number of blocks a file uses.
>There's just a difference of opinion between OFS and FFS about how
>what that phrase means. One of them counts the number of data blocks,
>the other counts all the blocks used by the file, both data and
>overhead. (Question to CATS: Is this considered a bug? And is it fixed
>in 2.0?)

Not quite, Mike (see the example above).  Basically, you cannot depend on
the value in fib_NumBlocks to tell you anything meaningful if the device
is an FFS one.

And no, it's not a "bug".  Since it's "documented", that makes it a "feature",
right?  :-)

What *is* a bug, is that the official 1.3 includes don't tell you that
fib_NumBlocks is meaningless on FFS (nor do the AutoDocs, RKM's, etc).


>I haven't checked 2.0 yet, but on 1.3 there's no way (that I
>know of) to determine how many data blocks/header block. On the other
>hand, everything uses 72.

There are a couple of "interesting" items that could be of further use in
Steve Beats' AmigaMail article:

"Please note that these structures assume a fixed, 512 byte disk block.  This
 is a valid assumption for filing systems up to V1.3.  However this will NOT
 be the case under V1.4 Kickstart which will support full variable sized disk
 blocks.
"

and:

"Hash table size will always be set to 72 for instance, this can be used as
 one of the consistency checks before proceeding to directly read from or
 write to the disk.
"

and (in BOLD face):

"Never use this knowledge in an application program.  It is guaranteed to
 break under V1.4 Kickstart.
"

and (of possible real use):

"struct RootBlock {
    [...]
    ULONG HTSize;       /* Size of the hash table in longwords,
                           must be set to 72 */
"

Never mind that I have yet to find a RootBlock (etc.) struct in any of the
includes or headers.  At least there is a place for the information to
reside.  Whether or not it will actually be used is another story ... though
it would seem to be a shame to keep the 72 blocks/extent limitation if the
block sizes on some future (hypothetical) filesystem were to be of 4K size,
or even 1K for that matter.

Until this stuff gets *documented*, going with "72" is about the best you
can do, I guess.


Perhaps the "Keeper of AmigaDOS" could comment (Hi, Randell) ...?

/kim



The routines fixNumBlocks() and blkalloc() from ls v4.0k:

#define  MAX_BLKS_PER_EXTENT  72

struct InfoData *CurID = 0;  /* Global InfoData for Info() */

/* Note: blksize, LSFlags/LSFlagsX, workstr, WSTR() and the FIB typedef
   are defined elsewhere in the full ls source. */

/*
 *  fixNumBlocks() - A hack to fix the fib_NumBlocks field so it is correct.
 *
 *  It was busted with the introduction of the FFS in AmigaDOS 1.3.  Don't
 *  ask where this is documented ... as far as I know, it isn't;  not in the
 *  1.3 includes, autodocs, readme's, user's manual, RKM's, or DevCon notes.
 *  It is mentioned (sort of) in an AmigaMail, and I'm told it was mentioned
 *  once in a message on USENET.
 *
 *  As may be ... since fib_NumBlocks was in use thruout this code before
 *  this was discovered, it was easiest to just fixup the fib_NumBlocks when
 *  the file/dir gets initially Examine/ExNext'd,  with a call to Info() for
 *  *all* files (which means a lot of unnecessary calls get made [for files
 *  all in the same dir, etc]).  The performance penalty ends up being about
 *  1 sec, for a tree with 537 entries (which seems acceptable to me).
 *
 *  Helluva way to run a ship ...
 *
 *  /kim   /\;;/\
 *
 */

/* CurID is allocated in _main() during initialization, etc. */

VOID fixNumBlocks(lock, fib)
  struct FileLock *lock;
  FIB *fib;
{
    LONG *nb;
    LONG bsize;
    static int errflag = 0;

    if ((LSFlagsX & NOFIXNUMBLOCKS) != 0) return;

    if (blksize == 0)
    {
      /* Try to fill InfoData, bomb if not readable for some reason (like */
      if (Info((BPTR)lock, CurID) == 0)   /* ls'ing a "pathass'd" assign) */
      {
	/* Print error msg only once, and only if we'll be printing block counts */
	if ((errflag == 0) && ((LSFlags & (LONGLIST | TOTALIZE)) != 0))
	{
	  errflag++;
	  asprintf(workstr, "\nls: warning (%ld): block count(s) may be inaccurate - see ls.doc\n\n", IoErr());
	  WSTR(workstr);
	}
	return;
      }
      else
	bsize = CurID->id_BytesPerBlock;
    }
    else
     bsize = blksize;  /* non-0 blksize used to force the size of your */
                       /* choice ... it is set from the cmd args parse */
                       /* routine for the -B option                    */

     nb = (LONG *)&(fib->fib_NumBlocks);
    *nb = (fib->fib_Size / bsize) + ((fib->fib_Size % bsize) ? 1 : 0);
}


/*
 *  blkalloc() - Returns the actual number of blocks allocated by a file/dir
 *		 (or just the data blocks, if DATABLKSONLY is set).
 *
 *  Assumes 1.3 original or fast filesystem (and a fixed up fib_NumBlocks).
 *
 */

LONG blkalloc(fib)
  register FIB *fib;
{
    if ((LSFlagsX & DATABLKSONLY) == 0) {
      return(fib->fib_NumBlocks +
	    (fib->fib_NumBlocks / MAX_BLKS_PER_EXTENT) +
	   ((fib->fib_NumBlocks % MAX_BLKS_PER_EXTENT) ? 1 : 0) +
	   ((fib->fib_Size == 0) ? 1 : 0)  /* kludge for 0-len files (and dirs) */
	    );
    } else {
      return(fib->fib_NumBlocks);
    }
}

-- 
UUCP:  kim@uts.amdahl.com   -OR-   ked01@juts.ccc.amdahl.com
  or:  {sun,decwrl,hplabs,pyramid,uunet,oliveb,ames}!amdahl!kim
DDD:   408-746-8462
USPS:  Amdahl Corp.  M/S 249,  1250 E. Arques Av,  Sunnyvale, CA 94086
BIX:   kdevaughn     GEnie:   K.DEVAUGHN     CIS:   76535,25

jesup@cbmvax.commodore.com (Randell Jesup) (07/17/90)

In article <648H02l0b8KM01@amdahl.uts.amdahl.com> kim@uts.amdahl.com (Kim DeVaughn) writes:
>Oh yes, one other thing, the comment on "HighSeq" (fib_NumBlocks) is wrong.
>The field *is* updated by the FFS ... it's just not accurate (usually off by
>2 or 3 blocks, but that depends on the file's size).
>
>For example, a file I have is 135046 bytes long.  On an FFS device, the
>fib_NumBlocks entry says 267 data blocks, whereas the file actually has only
>264 data blocks.  And no, fib_NumBlocks isn't adding in the file's header
>and extension blocks for you.  The file actually occupies 268 blocks total
>(264 data, 1 header, 3 extensions).

	Actually, it looks to be counting extension blocks as well as data
blocks, but not header blocks (since you can have them while having no
data at all).  One could argue this is more accurate, since those extension
blocks are used in storing the data, though they don't contain data
themselves.

>"Please note that these structures assume a fixed, 512 byte disk block.  This
> is a valid assumption for filing systems up to V1.3.  However this will NOT
> be the case under V1.4 Kickstart which will support full variable sized disk
> blocks.
>"

	In fact, this _did_ get in (though only for powers of 2, with some
limits).  Don't ask me how to set them up, I haven't tried.  Useful if you
have devices that have a minimum sector size of 1K or 2K; it also reduces
fragmentation at the cost of increased loss from partially full blocks.

-- 
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.cbm.commodore.com  BIX: rjesup  
Common phrase heard at Amiga Devcon '89: "It's in there!"

kim@uts.amdahl.com (Kim DeVaughn) (07/20/90)

In article <13237@cbmvax.commodore.com> jesup@cbmvax (Randell Jesup) writes:
>
> In article <648H02l0b8KM01@amdahl.uts.amdahl.com> I wrote:
> >
> >For example, a file I have is 135046 bytes long.  On an FFS device, the
> >fib_NumBlocks entry says 267 data blocks, whereas the file actually has only
> >264 data blocks.  And no, fib_NumBlocks isn't adding in the file's header
> >and extension blocks for you.  The file actually occupies 268 blocks total
> >(264 data, 1 header, 3 extensions).
> 
> 	Actually, it looks to be counting extension blocks as well as data
> blocks, but not header blocks (since you can have them while having no
> data at all).  One could argue this is more accurate, since those extension
> blocks are used in storing the data, though they don't contain data
> themselves.

Yeah ... that is what it looks to be ... but I've been burned by a "pretty
face" before, too :-)

Are you saying this is now the documented behavior for the fib_NumBlocks
entry (for the FFS), or is it still better to compute them yourself from
fib_Size, so as not to break in future releases?


> > be the case under V1.4 Kickstart which will support full variable sized disk
> > blocks.
> 
> 	In fact, this _did_ get in (though only for powers of 2, with some
> limits).  Don't ask me how to set them up, I haven't tried.  Useful if you
> have devices that have a minimum sector size of 1K or 2K, also reduces
> fragmentation at cost of increased loss from partially full blocks.

OK, I won't (yet, anyway :-) ) ... but can you say whether the hash-table
size is still fixed at "72", and if it isn't, how one finds out that
info (from the field in the root block struct?)

Also, will all the info on the file system finally get documented in
the RKM's (or elsewhere), now that the BCPL stuff is exorcised?  Or do
we still need to look to Bantam to provide the info on the filesystem?

Thanks, Randell ...!

/kim

-- 
UUCP:  kim@uts.amdahl.com   -OR-   ked01@juts.ccc.amdahl.com
  or:  {sun,decwrl,hplabs,pyramid,uunet,oliveb,ames}!amdahl!kim
DDD:   408-746-8462
USPS:  Amdahl Corp.  M/S 249,  1250 E. Arques Av,  Sunnyvale, CA 94086
BIX:   kdevaughn     GEnie:   K.DEVAUGHN     CIS:   76535,25

jesup@cbmvax.commodore.com (Randell Jesup) (07/21/90)

In article <3fco02lsb9LE01@amdahl.uts.amdahl.com> ked01@juts.ccc.amdahl.com (Kim DeVaughn) writes:
>> 	Actually, it looks to be counting extension blocks as well as data
>> blocks, but not header blocks (since you can have them while having no
>> data at all).  One could argue this is more accurate, since those extension
>> blocks are used in storing the data, though they don't contain data
>> themselves.
>
>Yeah ... that is what it looks to be ... but I've been burned by a "pretty
>face" before, too :-)
>
>Are you saying this is now the documented behavior for the fib_NumBlocks
>entry (for the FFS), or is it still better to compute them yourself from
>fib_Size, so as not to break in future releases?

	Computing it yourself is never really safe.  It depends on how the
FS works internally.  Take what it gives you, with the understanding that
this really is only an internal measure of storage to the filesystem.  On some
filesystems I can imagine (easily), the number of blocks a file takes may
depend on fragmentation.  Certainly the number of blocks can and will
vary from filesystem to filesystem for a given file.

	I advise against writing code that uses blocks for anything more than
user informational display or hints.

>OK, I won't (yet, anyway :-) ) ... but can you say if the hash-table size
>is still fixed at "72", and if it isn't fixed, how one finds out that
>info (the field in the root block struct ?)

	I don't know, that's Steve's baby.  The various blocks have always
been defined as <size - X> for things towards the end, with the table in
the middle.

-- 
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.cbm.commodore.com  BIX: rjesup  
Common phrase heard at Amiga Devcon '89: "It's in there!"

ked01@ccc.amdahl.com (Kim DeVaughn) (07/30/90)

In article <13342@cbmvax.commodore.com> jesup@cbmvax (Randell Jesup) writes:
>
> In article <3fco02lsb9LE01@amdahl.uts.amdahl.com> ked01@juts.ccc.amdahl.com (Kim DeVaughn) writes:
>
> >Are you saying this is now the documented behavior for the fib_NumBlocks
> >entry (for the FFS), or is it still better to compute them yourself from
> >fib_Size, so as not to break in future releases?
> 
> 	Computing it yourself is never really safe.  It depends on how the
> FS works internally.  Take what it gives you, with the understanding that
> this really is only an internal measure of storage to the filesystem.  On some
> filesystems I can imagine (easily), the number of blocks a file takes may
> depend on fragmentation.  Certainly the number of blocks can and will
> vary from filesystem to filesystem for a given file.
> 
> 	I advise against writing code that uses blocks for anything more than
> user informational display or hints.

So the bottom line is that there is no supported way to determine the block
count of a file.  And hence, there is no supported method of determining if
a file on one device will "fit" on another device without actually attempting
the copy operation.

Unfortunate.

/kim

-- 
UUCP:  kim@uts.amdahl.com   -OR-   ked01@juts.ccc.amdahl.com
  or:  {sun,decwrl,hplabs,pyramid,uunet,oliveb,ames}!amdahl!kim
DDD:   408-746-8462
USPS:  Amdahl Corp.  M/S 249,  1250 E. Arques Av,  Sunnyvale, CA 94086
BIX:   kdevaughn     GEnie:   K.DEVAUGHN     CIS:   76535,25

jesup@cbmvax.commodore.com (Randell Jesup) (08/02/90)

In article <c4pK02n=01EQ01@JUTS.ccc.amdahl.com> ked01@JUTS.ccc.amdahl.com (Kim DeVaughn) writes:
>In article <13342@cbmvax.commodore.com> jesup@cbmvax (Randell Jesup) writes:
>> In article <3fco02lsb9LE01@amdahl.uts.amdahl.com> ked01@juts.ccc.amdahl.com (Kim DeVaughn) writes:
>> >Are you saying this is now the documented behavior for the fib_NumBlocks
>> >entry (for the FFS), or is it still better to compute them yourself from
>> >fib_Size, so as not to break in future releases?
>> 
>> 	Computing it yourself is never really safe.  It depends on how the
>> FS works internally.  Take what it gives you, with the understanding that
>> this really is only an internal measure of storage to the filesystem.  On some
>> filesystems I can imagine (easily), the number of blocks a file takes may
>> depend on fragmentation.  Certainly the number of blocks can and will
>> vary from filesystem to filesystem for a given file.
...
>So the bottom line is that there is no supported* way to determine the block
>count of a file.  And hence, there is no supported method of determining if
>a file on one device will "fit" on another device without actually attempting
>the copy operation.
>
>Unfortunate.

	First, it's a multitasking system, and any value you determine
(without writing the file) is no longer valid once anyone else writes to the
disk.  Second, since filesystems are separate from DOS at a very basic level,
the only way you could ever answer the question would be to have a packet
for specifically asking the filesystem whether it has enough space (at the
moment).  Filesystems can store the data any way they want.  Hell, they could
Lempel-Ziv encode it into data blocks, or allocate storage at the byte
level between files, etc, etc.

	However, there's another solution under 2.0: Open a new file, then
do SetFileSize for the size you want.  That will make a file of the requested
size filled with random data, or fail.  Then just write over it with your
file.

-- 
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.cbm.commodore.com  BIX: rjesup  
Common phrase heard at Amiga Devcon '89: "It's in there!"

ked01@ccc.amdahl.com (Kim DeVaughn) (08/03/90)

In article <13571@cbmvax.commodore.com> jesup@cbmvax (Randell Jesup) writes:
>
> In article <c4pK02n=01EQ01@JUTS.ccc.amdahl.com> ked01@JUTS.ccc.amdahl.com (Kim DeVaughn) writes:
>
> >So the bottom line is that there is no supported* way to determine the block
> >count of a file.  And hence, there is no supported method of determining if
> >a file on one device will "fit" on another device without actually attempting
> >the copy operation.
> >
> >Unfortunate.
> 
> 	First, it's a multitasking system, and any value you determine
> (without writing the file) is no longer valid if anyone else wrote to the disk.

I know that.  I was thinking of the somewhat more common case where one wishes
to transfer files from the primary storage medium (say a hard disk) to a
secondary medium (such as a floppy), in a relatively controlled situation.  If
there is other activity on the target/destination device, naturally any measure
of space used/available is likely to be inaccurate.

That such is the case does not seem to prevent other multitasking OS's (such
as UN*X, VM/CMS, MVS, etc) from reporting the available space at the time the
request is issued.  Nor does it seem to prevent AmigaDOS from doing the same
on a call to Info(), or via use of the Info command.

The use such information is put to must be left to the calling program's
discretion; the program must be aware that in some instances that information
may become invalid before it can be used (and must take the appropriate
recovery steps in case of such an eventuality).


What I object to is this:  in the OFS, fib_NumBlocks was *documented* to be
the number of data blocks occupied by a file (at some particular moment).  In
the FFS, it was then redefined to "not be updated" (per the AmigaMail article),
which is simply not true; indeed it *appears* to be the total number of blocks
used by a file, minus one (though that is not "documented" to be so).  Then I
am told that "computing the number of blocks used" based on the filesize, and
blocking factor is not a good idea, and that any value found in fib_NumBlocks
should be considered to only be an approximation.

I think that is "unfortunate", as most frequently the simple "sanity checks"
that *could* be performed WRT file xfer'ing won't be; programs will opt instead
for the brute-force "see if it fits by trying to make it fit" approach (which
is a complete waste of time, and cycles, if there is no possibility of the
file fitting in the first place).


> 	However, there's another solution under 2.0: Open a new file, then
> do SetFileSize for the size you want.  That will make a file of the requested
> size filled with random data, or fail.  Then just write over it with your
> file.

Well that's great, and I'm glad to hear it!  That can help solve an annoying 
problem with copying files, etc (if taken advantage of).  When and where will
all this "good stuff in 2.0" be documented (for the masses)?

And nonetheless, I still want to be able to find out the number of blocks (or
whatever the smallest allocatable chunk of storage is called on a given
device) a file *really* occupies, if only for informational purposes.


/kim


P.S.  While we're talking about filesystems, etc. and 2.0 ... will 2.0 quit
      defaulting newly created files to ----rwed, and perhaps make the more
      likely assumption of ----rw-d?  And will the attribute bits be
      enforced a little bit more rigorously?

-- 
UUCP:  kim@uts.amdahl.com   -OR-   ked01@juts.ccc.amdahl.com
  or:  {sun,decwrl,hplabs,pyramid,uunet,oliveb,ames}!amdahl!kim
DDD:   408-746-8462
USPS:  Amdahl Corp.  M/S 249,  1250 E. Arques Av,  Sunnyvale, CA 94086
BIX:   kdevaughn     GEnie:   K.DEVAUGHN     CIS:   76535,25

jesup@cbmvax.commodore.com (Randell Jesup) (08/04/90)

In article <d4eQ02U2010201@JUTS.ccc.amdahl.com> ked01@JUTS.ccc.amdahl.com (Kim DeVaughn) writes:
>That such is the case, does not seem to prevent other multitasking OS's (such
>as UN*X, VM/CMS, MVS, etc) from reporting the available space at the time the
>request is issued.  Nor does it seem to prevent AmigaDOS from doing the same
>on a call to Info(), or via use of the Info command.

	We'll report the amount of space left, but neither we nor Unix (nor
I suspect VMS) guarantee a specific number of blocks used to store a file.
Unix is slightly less variable, due to preallocation of inodes, but even it
varies (BSD's storing of small files in partial blocks, etc).

>The use such information is put to must be left up to the calling program's
>discretion, and it must be aware that in some instances, that information may
>become invalid before it can be used (and take the appropriate recovery steps
>in case of such an eventuality).

	The caller should be aware that since it doesn't know exactly the
algorithms the filesystem uses to store the data, there's no way to be
certain whether the file will fit (even if nothing else is happening).

>I think that is "unfortunate", as most frequently some simple "sanity checks"
>that *could* be performed WRT file xfer'ing won't be, and will opt instead
>for the brute-force "see if it fits by trying to make it fit" approach (which
>is a complete waste of time, and cycles, if there is no possibility of it
>being able to do so in the first place).

	It's fairly simple to write a file of null/garbage data of the given
length, and see whether it succeeds (or under 2.0 use SetFileSize to get the
filesystem to do it for you - even faster).
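That null-data probe can be sketched in portable C (standard stdio here rather than AmigaDOS calls; `fits_by_probe` and the scratch filename are made up for illustration, and a real version would put the scratch file on the target volume):

```c
#include <stdio.h>

/* Probe whether `size` bytes will fit by actually writing that many
   zero bytes to a scratch file, then deleting it.  Returns 1 if the
   whole write succeeded, 0 otherwise. */
int fits_by_probe(const char *scratchpath, long size)
{
    FILE *f = fopen(scratchpath, "wb");
    char buf[4096] = {0};
    long left;
    int ok = 1;

    if (f == NULL)
        return 0;
    for (left = size; left > 0 && ok; left -= (long)sizeof buf) {
        size_t chunk = (left < (long)sizeof buf) ? (size_t)left : sizeof buf;
        if (fwrite(buf, 1, chunk, f) != chunk)
            ok = 0;
    }
    if (fflush(f) != 0)
        ok = 0;
    fclose(f);
    remove(scratchpath);        /* clean up the probe file */
    return ok;
}
```

Of course, this only tells you the space existed at the moment of the probe, which is exactly the multitasking caveat above.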

>> 	However, there's another solution under 2.0: Open a new file, then
>> do SetFileSize for the size you want.  That will make a file of the requested
>> size filled with random data, or fail.  Then just write over it with your
>> file.
>
>Well that's great, and I'm glad to hear it!  That can help solve an annoying 
>problem with copying files, etc (if taken advantage of).  When and where will
>all this "good stuff in 2.0" be documented (for the masses)?

	Get the Atlanta Devcon notes from CATS.

>And nonetheless, I still want to be able to find out the number of blocks (or
>whatever the smallest allocatable chunk of storage is called on a given
>device) a file *really* occupies, if only for informational purposes.

	That doesn't work if the allocation sizes are variable (there are
other gotcha's as well).  I think my previous statement stands: the size
reported is a reasonable approximation, but there can be no guarantee of
correlation with other values/objects.

-- 
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.cbm.commodore.com  BIX: rjesup  
Common phrase heard at Amiga Devcon '89: "It's in there!"

BAXTER_A@wehi.dn.mu.oz (08/04/90)

In article <c4pK02n=01EQ01@JUTS.ccc.amdahl.com>, ked01@ccc.amdahl.com (Kim DeVaughn) writes:
> In article <13342@cbmvax.commodore.com> jesup@cbmvax (Randell Jesup) writes:
>>
>> In article <3fco02lsb9LE01@amdahl.uts.amdahl.com> ked01@juts.ccc.amdahl.com (Kim DeVaughn) writes:
>>
>> >Are you saying this is now the documented behavior for the fib_NumBlocks
>> >entry (for the FFS), or is it still better to compute them yourself from
>> >fib_Size, so as not to break in future releases?
>> 
>> 	Computing it yourself is never really safe.  It depends on how the
>> FS works internally.  Take what it gives you, with the understanding that
>> this really is only an internal measure of storage to the filesystem.  On some
>> filesystems I can imagine (easily), the number of blocks a file takes may
>> depend on fragmentation.  Certainly the number of blocks can and will
>> vary from filesystem to filesystem for a given file.
>> 
>> 	I advise against writing code that uses blocks for anything more than
>> user informational display or hints.
> 
> So the bottom line is that there is no supported* way to determine the block
> count of a file.  And hence, there is no supported method of determining if
> a file on one device will "fit" on another device without actually attempting
> the copy operation.
> 
> Unfortunate.
> 
> /kim
You know how big the space is, yes? So you want to know how big the file is.
Copy it to Nil:/Null: and count the bytes on the way. It will be quick enough
to not make much difference (unless the file is enormous, in which case you
will have some idea of whether it will fit :-)

Regards, Alan

ked01@ccc.amdahl.com (Kim DeVaughn) (08/05/90)

In article <10797@wehi.dn.mu.oz> BAXTER_A@wehi.dn.mu.oz writes:
>
> You know how big the space is, yes? So you want to know how big the file is.
> Copy it to Nil:/Null: and count the bytes on the way. It will be quick enough
> to not make much difference (unless the file is enormous, in which case you
> will have some idea of whether it will fit :-)

The "size" of a file is a bit trickier than just knowing the number of bytes
it's length is (which BTW, can be found directly by looking at fib_Size ...
at least I *think* that it documented to be accurate ... Randell ?)

What matters when (say) copying a file, is the number of allocatable entities
it will occupy (i.e., blocks), some of which do not contain any of the file's
data, per se (header and extension blocks, for example).

As Randell has pointed out, this number can be variable, even for a given file
on some (as yet hypothetical) file system.  And that the value of fib_NumBlocks
can only be considered to be a rough indicator of a file's actual allocation
(for which there is NO supported way of determining).

At present (1.3 OFS and FFS), the true block allocation of a file can be
found with a simple computation, as "ls" v4.0k does.  Whether or not that can
be done in the future is not known, nor is there any requirement on future
filesystems to provide such information as would make that possible.
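For concreteness, that computation reduces to arithmetic like the sketch
below.  The constants (488 data bytes per 512-byte OFS data block, the full
512 for FFS, 72 pointers per file header or extension block) are the 1.3
on-disk layouts; as Randell says, nothing obliges a future filesystem to
match them, so treat this as a snapshot of 1.3, not a contract:

```c
/* 1.3 OFS/FFS true-block computation, assuming 512-byte blocks.
 * An OFS data block carries 488 data bytes (512 minus a 24-byte
 * header); FFS uses the full 512.  The file header block holds 72
 * data-block pointers, and each extension block holds 72 more. */
unsigned long true_blocks(unsigned long size, int is_ffs)
{
    unsigned long per = is_ffs ? 512UL : 488UL;
    unsigned long data = (size + per - 1) / per;            /* data blocks */
    unsigned long ext = data > 72 ? (data - 1) / 72 : 0;    /* extension blocks */
    return data + ext + 1;                                  /* + file header */
}
```

So a 1000-byte file costs 3 blocks on FFS (2 data + header) but 4 on OFS
(3 data + header), which is exactly the kind of growth that makes
cross-filesystem "will it fit" guesses unreliable.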

I wish it were otherwise, and that one could count on fib_NumBlocks to return
the true allocation of a file (or some variant thereof, as the OFS is documented
to do, and as it "seems" the FFS does), but that is NOT a requirement, based
on what has been said here.

Seemingly, the only real "requirement" on a filesystem is that it must be self-
consistent, and only its device drivers need have knowledge of things like the
blocking/allocation methods used.  The rest of the "system", such as utilities,
must simply take what they can get WRT information about files, and must be
"aware" that any such information may not be accurate.

Thus, they should do nothing of a critical nature with anything based on this
info, and should use it for "informational purposes" only.


Actually, I ran into a good example of this kind of thing when working on "ls".
The "pathass" utility allows you make "assignments" which can span different
real devices.  So you can "assign" (say) foo: to df0:c and dh2:c together, and
then refer to file "bar" as foo:bar, where "bar" may be in either real dir.

Obviously, such a "device" has no id_BytesPerBlock, since part of the "device"
is on an OFS, and part on an FFS.  Calling Info() on such a puppy, correctly
returns an error, so I was able to pump out a warning msg, to the effect that
the block counts may be inaccurate, and then just use the value found in the
fib_NumBlocks entry for each file, bypassing the "actual number of blocks"
calculation.  No huhu.


While I understand that other such "oddities" may well arise in the future,
what bothers me is not being able to ask the filesystem for the *true* info
in a supported way.

/kim

-- 
UUCP:  kim@uts.amdahl.com   -OR-   ked01@juts.ccc.amdahl.com
  or:  {sun,decwrl,hplabs,pyramid,uunet,oliveb,ames}!amdahl!kim
DDD:   408-746-8462
USPS:  Amdahl Corp.  M/S 249,  1250 E. Arques Av,  Sunnyvale, CA 94086
BIX:   kdevaughn     GEnie:   K.DEVAUGHN     CIS:   76535,25

ked01@ccc.amdahl.com (Kim DeVaughn) (08/05/90)

In article <13615@cbmvax.commodore.com> jesup@cbmvax (Randell Jesup) writes:
>
> 	We'll report the amount of space left, but neither we nor Unix (nor
> I suspect VMS) guarantee a specific number of blocks used to store a file.
> Unix is slightly less variable, due to preallocation of inodes, but even it
> varies (bsd storing of small files in partial blocks, etc).

That's fine.  I wasn't looking for any "guarantees".  Is it possible that you
can do the same for the actual number of blocks (or some variant thereof) used
by a file via fib_NumBlocks or a similar mechanism?


> 	The caller should be aware that since it doesn't know exactly the
> algorithms used by the filesystem to store the data, there's no way to be 
> certain of your interpretation of whether the file will fit (even if nothing
> else is happening).

True, but it can be a real good "sanity check" at times.  No guarantees tho.


> 	It's fairly simple to write a file of null/garbage data of the given
> length, and see whether it succeeds (or under 2.0 use SetFileSize to get the
> filesystem to do it for you - even faster).

2.0 sounds better and better all the time!  Thanks for all the "goodies"!


> >When and where will
> >all this "good stuff in 2.0" be documented (for the masses)?
> 
> 	Get the Atlanta Devcon notes from CATS.

I wasn't aware that they were available yet.  What's the cost ... my check is
all made out except for that item ... :-)


> 	That doesn't work if the allocation sizes are variable (there are
> other gotcha's as well).  I think my previous statement stands: the size
> reported is a reasonable approximation, but there can be no guarantee of
> correlation with other values/objects.

I've run into one (pathass), as I mentioned in a previous posting, but that
was on a "logical" pseudo-device.  What real devices vary their allocations
in the way you're suggesting?


Thanks for all the info on this stuff Randell ... it is appreciated, and may
keep a few of us from falling in a deep, dark hole in the future!

/kim

-- 
UUCP:  kim@uts.amdahl.com   -OR-   ked01@juts.ccc.amdahl.com
  or:  {sun,decwrl,hplabs,pyramid,uunet,oliveb,ames}!amdahl!kim
DDD:   408-746-8462
USPS:  Amdahl Corp.  M/S 249,  1250 E. Arques Av,  Sunnyvale, CA 94086
BIX:   kdevaughn     GEnie:   K.DEVAUGHN     CIS:   76535,25

mks@cbmvax.commodore.com (Michael Sinz - CATS) (08/06/90)

In article <10797@wehi.dn.mu.oz> BAXTER_A@wehi.dn.mu.oz writes:

[lots of other stuff deleted]

>You know how big the space is, yes? So you want to know how big the file is.
>Copy it to Nil:/Null: and count the bytes on the way. It will be quick enough
>to not make much difference (unless the file is enormous, in which case you
>will have some idea of whether it will fit :-)

Or you can just seek to the end of the file with OFFSET_END and then find
the size...  (Why copy it?)  Or you could get the information by doing an
Examine() on the file.  The real question here is how to figure out whether a
file will fit on a device, and that cannot be done without trying.  The target
device may be running a filing system that uses different allocation units
or block sizes, etc.  Plus the system, being multi-tasking, could end up using
some of the space while you are writing the file (some other application also
writing a file, say) and thus mess you up.  The only real method is to try the
copy and see if it worked.
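The seek trick, rendered in portable stdio as a stand-in for the AmigaDOS
Seek(fh, 0, OFFSET_END) call (the function name here is just for the sketch):

```c
#include <stdio.h>

/* Learn a file's byte length without copying it: seek to the end and
 * read off the position.  Returns -1 if the file can't be opened or
 * seeked.  On the Amiga the same idea is Seek(fh, 0, OFFSET_END)
 * followed by a second Seek() to read the resulting position. */
long file_length(const char *path)
{
    FILE *f = fopen(path, "rb");
    long len;

    if (!f)
        return -1L;
    if (fseek(f, 0L, SEEK_END) != 0) {
        fclose(f);
        return -1L;
    }
    len = ftell(f);
    fclose(f);
    return len;
}
```

Note this gives bytes, not blocks, so it answers "how big is the file" but
not "how much of the destination volume will it consume".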



/----------------------------------------------------------------------\
|      /// Michael Sinz -- CATS/Amiga Software Engineer                |
|     ///  PHONE 215-431-9422  UUCP ( uunet | rutgers ) !cbmvax!mks    |
|    ///                                                               |
|\\\///       Quantum Physics:  The Dreams that Stuff is made of.      |
| \XX/                                                                 |
\----------------------------------------------------------------------/

peter@sugar.hackercorp.com (Peter da Silva) (08/10/90)

In article <d8zg02JJ01bI01@JUTS.ccc.amdahl.com> ked01@JUTS.ccc.amdahl.com (Kim DeVaughn) writes:
> While I understand that other such "oddities" may well arise in the future,
> what bothers me is not being able to ask the filesystem for the *true* info
> in a supported way.

*which* filesystem?

OFS, FFS, ram.handler, and network file systems all have different amounts of
overhead for a file. When you copy a file from FFS to OFS it grows by
512/488ths right off the bat. How do we figure that?

You need a call "RealSize(fs, nbytes)" to figure the real size of a file. That
might depend on what *directory* it's being moved into, if extra blocks need
to be allocated. Make it "RealSize(dir, nbytes)".

OK, now we just need to know the free space on a device. How do you figure free
space in RAM:? How about NFS:?

Sigh. Make a guess and punt, leaving space for slop.
-- 
Peter da Silva.   `-_-'
<peter@sugar.hackercorp.com>.

ked01@ccc.amdahl.com (Kim DeVaughn) (08/10/90)

In article <6321@sugar.hackercorp.com> peter@sugar.hackercorp.com (Peter da Silva) writes:
> In article <d8zg02JJ01bI01@JUTS.ccc.amdahl.com> ked01@JUTS.ccc.amdahl.com (Kim DeVaughn) writes:
> > While I understand that other such "oddities" may well arise in the future,
> > what bothers me is not being able to ask the filesystem for the *true* info
> > in a supported way.
> 
> *which* filesystem?

Whichever filesystem is being addressed at the time.

There *is* a field in the FIB called fib_NumBlocks.  What I would like is for
that field to accurately reflect the actual number of blocks allocated (or
some algorithmic variation thereof) by a file, at the time it is interrogated.

That, and that a filesystem *document* what that field "means" in its own
context.

/kim
-- 
UUCP:  kim@uts.amdahl.com   -OR-   ked01@juts.ccc.amdahl.com
  or:  {sun,decwrl,hplabs,pyramid,uunet,oliveb,ames}!amdahl!kim
DDD:   408-746-8462
USPS:  Amdahl Corp.  M/S 249,  1250 E. Arques Av,  Sunnyvale, CA 94086
BIX:   kdevaughn     GEnie:   K.DEVAUGHN     CIS:   76535,25

lphillips@lpami.wimsey.bc.ca (Larry Phillips) (08/10/90)

In <f2Dl02hm01UD01@JUTS.ccc.amdahl.com>, ked01@ccc.amdahl.com (Kim DeVaughn) writes:
>In article <6321@sugar.hackercorp.com> peter@sugar.hackercorp.com (Peter da Silva) writes:
>> In article <d8zg02JJ01bI01@JUTS.ccc.amdahl.com> ked01@JUTS.ccc.amdahl.com (Kim DeVaughn) writes:
>> > While I understand that other such "oddities" may well arise in the future,
>> > what bothers me is not being able to ask the filesystem for the *true* info
>> > in a supported way.
>> 
>> *which* filesystem?
>
>Whichever filesystem is being addressed at the time.
>
>There *is* a field in the FIB called fib_NumBlocks.  What I would like is for
>that field to accurately reflect the actual number of blocks allocated (or
>some algorithmic variation thereof) by a file, at the time it is interrogated.
>
>That, and that a filesystem *document* what that field "means" in its own
>context.

This whole discussion has been a problem for me. I am trying to figure out
under what conditions a file system might not know how many blocks (minimum
storage units) a given length of data will take up.

I can see having some problems with reporting on a file system that supports a
variable block length, such as might be found on large IBM disks, where 'block'
has no meaning, and data is stored in 'records' with a count being part of the
record, but I think there is a way around even that.

I do think a file system should have the ability to return:

  1. the number of blocks left on a device, if appropriate.
  2. the number of blocks that will be used, if appropriate, given
     the length of a file's data.

The 'if appropriate' proviso would allow a file system to return a value to the
caller that in effect says 'I have no idea, mate, I just write it out and see
if it fits.'  A value of -1 would suffice. All file systems able to determine
the number of blocks required should be able to tell the caller what the
requirements are.
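Larry's proposed query might look something like this sketch.  It is purely
illustrative: no such call exists in AmigaDOS, the names are made up, and the
OFS/FFS constants are just the 1.3 layouts (488 or 512 data bytes per
512-byte block, 72 pointers per header/extension table); FS_VARIABLE stands
in for a record-oriented filesystem that genuinely can't say:

```c
/* Hypothetical interface: ask a filesystem how many blocks a file of
 * 'nbytes' would occupy, or get -1 for "I have no idea, mate, I just
 * write it out and see if it fits." */
enum FsKind { FS_OFS, FS_FFS, FS_VARIABLE };

long blocks_needed(enum FsKind fs, unsigned long nbytes)
{
    unsigned long per, data, ext;

    if (fs == FS_VARIABLE)
        return -1L;                        /* can't be determined up front */
    per = (fs == FS_FFS) ? 512UL : 488UL;  /* data bytes per block */
    data = (nbytes + per - 1) / per;
    ext = data > 72 ? (data - 1) / 72 : 0; /* 72 pointers per table */
    return (long)(data + ext + 1);         /* + file header block */
}
```

A caller would compare a non-negative answer against the free-block count
from Info(), and fall back to try-and-see whenever it gets -1.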

-larry

--
Sex is better than logic, but I can't prove it.
+-----------------------------------------------------------------------+ 
|   //   Larry Phillips                                                 |
| \X/    lphillips@lpami.wimsey.bc.ca -or- uunet!van-bc!lpami!lphillips |
|        COMPUSERVE: 76703,4322  -or-  76703.4322@compuserve.com        |
+-----------------------------------------------------------------------+