jik@athena.mit.edu (Jonathan I. Kamens) (02/26/91)
My "delete" application wants to be able to tell exactly how much space on a disk is actually occupied by a file. This involves two things: 1. Finding out how many disk blocks the file occupies. 2. Finding out the size of each block. Now, on a 4.3BSD system, the stat structure contains the st_blocks field, which tells the "actual number of blocks allocated." Given that description, the question becomes, what exactly is a "block?" There are two possible answers: 1. The size specified by DEV_BSIZE. 2. The size in the f_bsize field of the statfs structure of the filesystem on which the file resides. Now, it seemed to me that f_bsize would be the logical choice, since different filesystems can have different minimum block sizes, but some experimentation indicates that actually, DEV_BSIZE is what's being used. The 4.3reno stat(2) man page goes even further; it describes st_blocks as "The actual number of blocks allocated for the file in 512-byte units." But that leaves me with another question -- is it DEV_BSIZE, or 512 bytes? Besides all of these problems, we have the problem that some Unix implementations (POSIX systems, in particular) don't even have an st_blocks structure, so all I've got to work with is st_size. So, I would like to ask if the following outlined method of determining the actual space usage of a file on many different flavors of Unix is reliable: #include <sys/param.h> /* for DEV_BSIZE */ int actual_bytes; struct stat statbuf; ... assume statbuf is initialized ... #ifdef ST_BLOCKS_EXISTS /* I'll define this myself, as necessary */ #ifdef DEV_BSIZE actual_bytes = statbuf.st_blocks * DEV_BSIZE; #else actual_bytes = statbuf.st_blocks * 512; #endif /* DEV_BSIZE */ #else /* ! ST_BLOCKS_EXISTS */ #ifdef DEV_BSIZE actual_bytes = DEV_BSIZE * (statbuf.st_size / DEV_BSIZE + ((statbuf.st_size % DEV_BSIZE) ? 
1 : 0)); #else actual_bytes = statbuf.st_size; #endif /* DEV_BSIZE */ #endif /* ST_BLOCKS_EXISTS */ One final question: I thought that f_bsize was the minimum block size for a filesystem, but when statfs()ing certain filesystems, I find it possible to create a file that takes up much less space than what f_bsize says. So, what is f_bsize supposed to represent? Thanks for any help you can provide! -- Jonathan Kamens USnail: MIT Project Athena 11 Ashford Terrace jik@Athena.MIT.EDU Allston, MA 02134 Office: 617-253-8085 Home: 617-782-0710
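For what it's worth, the fallback branch of the outline (round st_size
up to a whole number of DEV_BSIZE-sized blocks) can be written without
the explicit remainder test.  A sketch; the helper name is mine, not
part of the original outline:

```c
#include <sys/types.h>

/* Round a byte count up to a whole number of bsize-byte blocks.
 * Equivalent to the #else branch of the outline above:
 * (size / bsize, rounded up) * bsize.  Assumes bsize > 0. */
static off_t
round_up_to_block(off_t size, off_t bsize)
{
    return ((size + bsize - 1) / bsize) * bsize;
}
```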
torek@elf.ee.lbl.gov (Chris Torek) (02/26/91)
In article <1991Feb25.205932.16587@athena.mit.edu> jik@athena.mit.edu
(Jonathan I. Kamens) writes:
>... on a 4.3BSD system, the stat structure contains the st_blocks field,
>which tells the "actual number of blocks allocated."  Given that
>description, the question becomes, what exactly is a "block?"  There are two
>possible answers:
>
>1. The size specified by DEV_BSIZE.
>2. The size in the f_bsize field of the statfs structure of the filesystem on
>   which the file resides.

The answer is `none of the above'.

>Now, it seemed to me that f_bsize would be the logical choice,

No: f_bsize is the `block' size and not the `fragment' size under
4.3BSD-reno, i.e., typically 8K rather than 1K.  (SunOS and 4BSD are
different here; SunOS defines f_bsize as the fragment size.)

>The 4.3reno stat(2) man page goes even further; it describes st_blocks
>as "The actual number of blocks allocated for the file in 512-byte units."
>But that leaves me with another question -- is it DEV_BSIZE, or 512 bytes?

It is 512 bytes; it does not matter what DEV_BSIZE is.  Under 4.3tahoe
on the Tahoe, DEV_BSIZE was 1024; 4.3reno has no DEV_BSIZE at all
(well, it has one as a compatibility hack) and each disk's block size
is a property of that disk.

Note that there may be (probably are) some systems out there in which
st_blocks is in terms of 1 kbyte blocks; these should dwindle away,
but will probably leave a lingering stench. :-)
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab EE div (+1 415 486 5427)
Berkeley, CA		Domain:	torek@ee.lbl.gov
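Taking this answer at face value, the computation on a 4.3reno-style
system reduces to a single multiply.  A minimal sketch, assuming
st_blocks exists and is counted in 512-byte units as described above
(the function name is mine):

```c
#include <sys/stat.h>

/* Space actually allocated to a file, assuming st_blocks counts
 * 512-byte units (the 4.3reno convention).  Returns -1 on error. */
static long long
allocated_bytes(const char *path)
{
    struct stat st;

    if (stat(path, &st) < 0)
        return -1;
    return (long long)st.st_blocks * 512;
}
```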
jik@athena.mit.edu (Jonathan I. Kamens) (02/26/91)
Well, if there are systems that measure st_blocks in terms of 1k
blocks, how can I detect them in my source code?  Assuming that it's
always 512 bytes would leave me with the following code:

    int actual_bytes;
    struct stat statbuf;

    /* ... assume statbuf is initialized ... */

    #ifdef ST_BLOCKS_EXISTS
        actual_bytes = statbuf.st_blocks * 512;
    #else
        actual_bytes = statbuf.st_size;
    #endif /* ST_BLOCKS_EXISTS */

But this is going to lose on sites that have 1k blocks.  Is there any
way to detect them?

And, on a historical note, what led to the decision to measure in
terms of 512-byte blocks, and why do some sites measure in terms of 1k
blocks instead?

-- 
Jonathan Kamens			      USnail:
MIT Project Athena		11 Ashford Terrace
jik@Athena.MIT.EDU		Allston, MA  02134
Office: 617-253-8085			Home: 617-782-0710
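One heuristic answer to the detection question, not proposed anywhere
in the thread, just a sketch: create a scratch file large enough that
per-file overhead is negligible, then divide its byte size by its
reported st_blocks.  This assumes the file is written without holes
and that the filesystem doesn't compress or preallocate aggressively;
the scratch file name is arbitrary.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Guess the unit, in bytes, that st_blocks is counted in, by writing
 * a 64K scratch file and comparing its size to its block count.
 * Heuristic only; returns 0 on failure. */
static long
guess_block_unit(const char *scratch)
{
    char buf[1024];
    struct stat st;
    long unit = 0;
    int fd, i;

    memset(buf, 'x', sizeof buf);
    fd = open(scratch, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return 0;
    for (i = 0; i < 64; i++)
        (void)write(fd, buf, sizeof buf);
    (void)fsync(fd);                  /* force allocation to settle */
    if (fstat(fd, &st) == 0 && st.st_blocks > 0)
        unit = (long)(st.st_size / st.st_blocks);  /* ~512 or ~1024 */
    (void)close(fd);
    (void)unlink(scratch);
    return unit;
}
```

A real program would want to place the scratch file on the filesystem
it actually cares about, since different mounted filesystems may
behave differently.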
rbj@uunet.UU.NET (Root Boy Jim) (03/01/91)
In article <10283@dog.ee.lbl.gov> torek@elf.ee.lbl.gov (Chris Torek) writes:
>[Reno is 512, Tahoe is 1024]
>Note that there may be (probably are) some systems out there in which
>st_blocks is in terms of 1 kbyte blocks; these should dwindle away,
>but will probably leave a lingering stench. :-)

Methinks the stench comes from POSIX, which gutlessly refused to buck
existing but ancient practice.  I assume that's why BSD changed.
Oddly enuf, this came at a time when 1K FS blocks were becoming more
common in System V.

It is a fortunate coincidence that 2^10 ~= 10^3.  Too important not to
take advantage of.  I was delighted when Berkeley "defined" a "block"
as 1K.  No more doubling or halving in one's head when trying to
convert blocks to chars.

To make it worse, Pyramid's FS block sizes are 2K to 16K (yes, the
sectors are 2K), and so they report blocks in 2K increments.  It is
rather sad to see filesystems quadruple in size when reported between
NFS partitions mounted to or from a Pyramid.  Oh well...
-- 
[rbj@uunet 1] stty sane
unknown mode: sane
md@sco.COM (Michael Davidson) (03/02/91)
jik@athena.mit.edu (Jonathan I. Kamens) writes:
> And, on a historical note, what led to the decision to measure in terms of
> 512-byte blocks, and why do some sites measure in terms of 1k blocks instead?

I'm sure that, like most design decisions, it was essentially
arbitrary.  It was, however, a very natural choice in the context of
the machines on which the early UNIX filesystems were implemented.
These machines had small disks with a physical sector size of 512
bytes and small amounts of main memory.  So, 512 bytes was a natural
choice.

Try reading some of Ritchie and Thompson's papers on UNIX for an
introduction to the design philosophy (the Bell System Technical
Journal for July-August 1978 is a good place to start).

A more interesting question is "why did it stay that way so long ..."
bzs@world.std.com (Barry Shein) (03/04/91)
>To make it worse, Pyramid's FS block sizes are 2K to 16K (yes, the
>sectors are 2K), and so they report blocks in 2K increments.  It is
>rather sad to see filesystems quadruple in size when reported
>between NFS partitions mounted to or from a Pyramid.  Oh well...

That's a real bug; I bet their NFS is based on an older LAI version.

There are two different values kept in the server, one for the local
unit and another for the external unit.  They're confused in a few
places in the code, particularly in the call that does whatever it is
that "df" wants (I forget the NFS name for this op).

I've fixed this before in that code; it's just a matter of changing
the name of the struct element used in a few places to the correct
one.
-- 
        -Barry Shein

Software Tool & Die    | bzs@world.std.com          | uunet!world!bzs
Purveyors to the Trade | Voice: 617-739-0202        | Login: 617-739-WRLD
greywolf@unisoft.UUCP (The Grey Wolf) (03/05/91)
<1991Feb26.010146.27490@athena.mit.edu> by jik@athena.mit.edu
(Jonathan I. Kamens):
# And, on a historical note, what led to the decision to measure in terms of
# 512-byte blocks, and why do some sites measure in terms of 1k blocks instead?

512 bytes seems to be the usual size of a physical sector on a disk
(as I have discovered the hard way via the /stand/diag stuff for a
Sun).  System V used 512-byte blocks in their filesystems from <insert
generic deity here>-knows-when up until 1k and 8k blocks in
filesystems were available.  (8k, I suspect, was for filesystems chock
full of data so that it could be schlumped around with "reasonable"
speed.)  And even after that, the 512-byte block survived.

Of course, there's one in every crowd: our Pyramid's filesystems have
16k blocks and 2k fragments, and the disk itself is tuned to a 2k
physical block size.  Go figure.

The reasoning behind figuring in terms of 1k blocks (or appearing to,
in the case of "du") was probably mathematical.  It might not take a
whole lot of effort to double or halve a number (especially if you
make the machine do it for you :-), but someone out there probably
figured that Wouldn't Life Be So Much Simpler If... and it went from
there.  Someone out there probably has even more historical info than
this.

It would break some things, but would anyone else out there find it
useful to have the stat structure contain the number of logical blocks
and the number of fragments, rather than (or in addition to) the
number of physical blocks?

Are fs_blocksize and fs_fragsize for a filesystem defined anywhere?
(Probably somewhere in the superblock, but is there a system call to
return this information?  fsstat/statfs deal with the fundamental
blocksize, but they don't provide info about the fragment size.
Assuming 8:1 isn't always right.)  Is it just me, or should more
information about a filesystem be available?
-- 
# The days of the computer priesthood are not over.
# May they never be.
# If it sounds selfish, consider how most companies stay in business.
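On the question of a system call that returns both sizes: the later
POSIX statvfs() interface reports f_bsize (the preferred I/O block
size) and f_frsize (the fundamental unit in which the f_blocks counts
are expressed -- the "fragment" size on BSD-style filesystems).  A
sketch, assuming a system that provides statvfs(); the function name
is mine:

```c
#include <stdio.h>
#include <sys/statvfs.h>

/* Print both block sizes POSIX statvfs() reports for the filesystem
 * containing path.  Returns -1 on error, 0 on success. */
static int
show_fs_sizes(const char *path)
{
    struct statvfs vfs;

    if (statvfs(path, &vfs) < 0)
        return -1;
    printf("%s: f_bsize=%lu f_frsize=%lu\n", path,
           (unsigned long)vfs.f_bsize, (unsigned long)vfs.f_frsize);
    return 0;
}
```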
sas@shadow.pyramid.com (Scott Schoenthal) (03/08/91)
In article <BZS.91Mar3133828@world.std.com> bzs@world.std.com (Barry Shein) writes:
>>To make it worse, Pyramid's FS block sizes are 2K to 16K (yes, the
>>sectors are 2K), and so they report blocks in 2K increments.  It is
>>rather sad to see filesystems quadruple in size when reported
>>between NFS partitions mounted to or from a Pyramid.  Oh well...

There is nothing in the NFS protocol that specifies a required
filesystem or directory block size.  The NFS statfs response returns
the "fundamental" block size and the total and free # of blocks in the
server's filesystem.

Some applications (e.g., OSx 'du') don't do the statfs() when
calculating # of blocks used.  If an application uses the local notion
of device block size, block calculations will be wrong when
interacting with a server with a different block size.

>That's a real bug, I bet their NFS is based on an older LAI version.

You would lose.  OSx NFS is based upon Sun NFSSRC with
multi-processor, scaling, and "dual universe" extensions.

>There are two different values kept in the server, one for the local
>unit and another for the external unit.  They're confused in a few
>places in the code, particularly in the call that does whatever it is
>that "df" wants, I forget the NFS name for this op.

'df' does use statfs() (at least the Sun NFSSRC and OSx 'df') and
ought to work properly.  If not, send mail to bugs@pyramid.com.  If
the problem is in our server code, it will get fixed in a timeframe
relative to the customer severity.

Pyramid has successfully participated in Sun NFS/ONC Connectathons
for several (>5) years.

/sas
----
Scott Schoenthal 	sas@shadow.pyramid.com
Pyramid Technology Corp.	{sun,hplabs,decwrl,uunet}!pyramid!sas
guy@auspex.auspex.com (Guy Harris) (03/12/91)
>There is nothing in the NFS protocol that specifies a required filesystem
>or directory block size.

It also doesn't specify the units to be used in the "blocks" field of
the "fattr" structure in an NFS GETATTR reply.  This is extremely
unfortunate, as it led various vendors not to use 512-byte chunks as
the unit, and therefore caused users of programs running on other
machines to be unpleasantly surprised when said programs assume,
incorrectly as it turns out, that the "st_blocks" result of a "stat()"
is in units of 512-byte chunks.

Given that S5R4 and 4.3-reno both specify, in the documentation, that
"st_blocks" is in units of 512-byte chunks, a convention needs to be
specified - either in the NFS protocol, or in some kind of side notes
to it - ensuring that (modern UNIX) clients can arrange to report
"st_blocks" in those units.  Given that most (modern UNIX) clients
probably just use what they get back from the server in the "blocks"
field, the most appropriate convention would probably be to say
"'blocks' is in units of 512-byte chunks, regardless of what the block
or fragment size of the underlying file system, or the disk block
size, is."

>Some applications (e.g., OSx 'du') don't do the statfs() when calculating
># of blocks used.  If an application uses the local notion of device
>block size, block calculations will be wrong when interacting with
>a server with a different block size.

*Lots* of applications on *non*-Pyramid systems don't do the
"statfs()" when calculating # of blocks used from the "st_blocks"
field; I suspect, in fact, that most applications on most systems
don't.
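The convention described here amounts to a single scaling step on
whichever side fills in the "blocks" field: convert block counts in
filesystem-native units into 512-byte chunks.  A sketch; the function
name is mine, and it assumes the native unit is a multiple of 512,
which holds for the 512/1K/2K/8K sizes discussed in this thread:

```c
/* Convert a block count in filesystem-native units (frag_size bytes
 * each) into 512-byte chunks.  Assumes frag_size is a positive
 * multiple of 512, so the division is exact. */
static long long
to_512_chunks(long long fs_blocks, long frag_size)
{
    return fs_blocks * (frag_size / 512);
}
```

With this applied uniformly on the server, a client that multiplies
st_blocks by 512 gets the same answer whether the file is local or
remote.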
rbj@uunet.UU.NET (Root Boy Jim) (03/12/91)
In article <147432@pyramid.pyramid.com> sas@shadow.pyramid.com (Scott Schoenthal) writes:
>There is nothing in the NFS protocol that specifies a required filesystem
>or directory block size.  The NFS statfs response returns the "fundamental"
>block size and the total and free # of blocks in the server's filesystem.

No, but there is existing practice, and there are the Connectathons
you mentioned below.  Didn't you notice that your numbers were
different locally than across the network?  Didn't it bother you?

>Some applications (e.g., OSx 'du') don't do the statfs() when calculating
># of blocks used.  If an application uses the local notion of device
>block size, block calculations will be wrong when interacting with
>a server with a different block size.

du shouldn't use statfs(); it can cross filesystems.  So what's left?
The lowest common denominator, 512-byte blocks.  I would rather see 1K
"blocks" regardless of actual size.  BTW, kudos for making your sector
sizes 2k and allowing 16k blocks.

>'df' does use statfs() (at least the Sun NFSSRC and OSx 'df') and ought to
>work properly.  If not, send mail to bugs@pyramid.com  If the problem
>is in our server code, it will get fixed in a timeframe relative to the
>customer severity.

df also prints in kilobytes, as does ls, as does du, as does sum.
Here again (sum), y'all took "blocks" literally, and print a different
result than other BSD systems.

>Pyramid has successfully participated in Sun NFS/ONC Connectathons
>for several (>5) years.

Yes, I note that you were one of the first.  However, why don't you
have a lock daemon, and is your code the latest version?
-- 
[rbj@uunet 1] stty sane
unknown mode: sane
richard@aiai.ed.ac.uk (Richard Tobin) (03/12/91)
In article <6558@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:
>>There is nothing in the NFS protocol that specifies a required filesystem
>>or directory block size.

>It also doesn't specify the units to be used in the "blocks" field of
>the "fattr" structure in an NFS GETATTR reply; this is extremely
>unfortunate,

Particularly given that it appears to specify it; the obvious
interpretation of the protocol specification is that it's in units of
"blocksize":

    "'blocksize' is the size in bytes of a block of the file ...
     'blocks' is the number of blocks the file takes up on disk"

	[NFS: Version 2 Protocol Specification, reproduced in the
	 SunOS 4.1 documentation]

It would take a mind-reader to guess that the two uses of "block" in
one sentence had different meanings, and that the first use in fact
meant "the optimal transfer size".

>Given that most (modern UNIX) clients probably just use what they get
>back from the server in the "blocks" field, the most appropriate
>convention would probably be to say "'blocks' is in units of 512-byte
>chunks, regardless of what the block or fragment size of the underlying
>file system, or the disk block size, is."

This does seem the best solution.  Fortunately, disk block sizes are
usually a multiple of 512 bytes, so the space occupied can be reported
accurately.

-- Richard
-- 
Richard Tobin,                       JANET: R.Tobin@uk.ac.ed
AI Applications Institute,            ARPA: R.Tobin%uk.ac.ed@nsfnet-relay.ac.uk
Edinburgh University.                 UUCP: ...!ukc!ed.ac.uk!R.Tobin