wjones@nwnexus.WA.COM (Warren Jones) (05/18/91)
I've observed something very mysterious in our RS/6000 file system: "ls -l" shows a file of ~24 Mbytes, but "du" shows the directory using only ~17 Mbytes. Can anyone out there offer an explanation? The following script file tells the whole story:

| Script command is started on Fri May 17 14:37:51 1991
|
| [58] ls -l /usr/tmp
| total 17372
| -rw-r--r--  1 sharp  macro     222048 May 17 14:34 CpGUmYQAAA
| -rw-r--r--  1 sharp  macro     300024 May 17 14:34 CpGUmYQAAB
| -rw-r--r--  1 sharp  macro      19052 May 17 14:35 CpGUmYQAAC
| -rw-r--r--  1 sharp  macro          0 May 17 14:34 CpGUmYQAAD
| -rw-r--r--  1 sharp  macro   24480468 May 17 14:36 CpGUmYQAAE   !!!
| drwxr-xr-x  2 root   system       512 Mar 09 14:06 X11
| -rw-r--r--  1 jones  support        0 May 17 14:37 typescript
| [59] du -s /usr/tmp
| 17376   /usr/tmp   !!!

The files "Cp*" are scratch files created by a Fortran application, which was still running when this typescript was made; the files are presumably still open. To compound the mystery: before the 24 Mbyte file was created in /usr/tmp, "df" showed only ~19 Mbytes available on the /usr partition. Where did the extra ~7 Mbytes go? Has IBM invented a hyperspace extension to the JFS?

Oh, by the way, we're running AIX 3.1 (3003 update). Thanks in advance for any enlightening comments.

Warren Jones
wjones@nwnexus.wa.com
sfreed@ariel.unm.edu (Steven Freed CIRT) (05/19/91)
In article <509@nwnexus.WA.COM>, wjones@nwnexus.WA.COM (Warren Jones) writes:
> I've observed something very mysterious in our RS/6000 file system:
> "ls -l" shows a file of ~24 Mbytes, but "du" shows the directory
> using only ~17 Mbytes.

Probably a hole in the file. When I was in school we used to drive the sys admins crazy with this (some weren't too bright). We would have a quota of maybe 1 Mbyte (yeah, that quota topic again ;-), and we would write a program that wrote 8 Kbytes, did an lseek() of about 500 Mbytes, and wrote another 8 Kbytes. They would come after us, trying to figure out how we had broken the quota system, without stopping to think that the partition we were on was only 200 Mbytes. Database files are usually the most common type of file with holes.

--
Steve.  sfreed@ariel.unm.edu
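[The lseek() trick Steve describes can be sketched in a few lines of Python; this is an editorial illustration, not from the thread, and the 8 Kbyte / 500 Mbyte figures just follow his example. Whether the hole actually saves disk space depends on the filesystem supporting sparse files.]

```python
import os
import tempfile

# Write 8 Kbytes, seek ~500 Mbytes forward, write 8 Kbytes more.
# The skipped region becomes a "hole": it reads back as zeros, but
# (on a sparse-capable filesystem) no disk blocks are allocated for it.
fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"x" * 8192)                      # first 8 Kbytes of real data
    os.lseek(fd, 500 * 1024 * 1024, os.SEEK_CUR)   # skip ~500 Mbytes: no allocation
    os.write(fd, b"x" * 8192)                      # last 8 Kbytes of real data
    os.close(fd)

    st = os.stat(path)
    apparent = st.st_size            # what "ls -l" reports
    allocated = st.st_blocks * 512   # what "du" counts (st_blocks is in 512-byte units)
    print(apparent, allocated)
finally:
    os.remove(path)
```

A quota system that counts allocated blocks (like du) charges you ~16 Kbytes for this file, while ls -l reports ~500 Mbytes, which is exactly the confusion Steve's admins ran into.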
scott@prism.gatech.EDU (Scott Holt) (05/20/91)
In article <509@nwnexus.WA.COM> wjones@nwnexus.WA.COM (Warren Jones) writes:
>I've observed something very mysterious in our RS/6000 file system:
>"ls -l" shows a file of ~24 Mbytes, but "du" shows the directory
>using only ~17 Mbytes. Can anyone out there offer an explanation?
>The following script file tells the whole story:
> ....

The file may be "sparse": on some UNIX file system implementations, if an entire block of the file contains only zeros, the corresponding block pointer in the inode may be set to zero rather than to the location of a disk block containing data. The idea is: why allocate disk space to something you know contains only zeros? This is typical of database files (especially those that use mdbm) and of other applications that write randomly to a file. It is not a property unique to AIX.

A word of warning about such files: backup programs love them, and I do mean that sarcastically. A typical backup program reads the file sequentially, and when it does, the fact that a block contains all zeros no longer matters. It is quite possible for a file to "appear" larger than the total amount of space on your disk, and when such a file is backed up, it takes up its full apparent size on the backup media. Worse yet, when it is restored, the restore program may not "sparsify" it; that is, it will try to restore the file at its apparent size, and then you have real problems. I don't know how AIX backup and restore deal with this (any comments from IBM?), but most other backup schemes (such as tar and cpio) handle it in a naive manner. This, too, is not unique to IBM.

- Scott
--
This is my signature. There are many like it, but this one is mine.
Scott Holt                 Internet: scott@prism.gatech.edu
Georgia Tech               UUCP: ..!gatech!prism!scott
Office of Information Technology, Technical Services
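[Scott's warning can be demonstrated directly: a naive sequential copy, which is essentially what tar or cpio does, reads the hole back as zeros and writes those zeros out as real data. An editorial sketch, assuming a filesystem with sparse-file support:]

```python
import os
import shutil
import tempfile

def allocated_bytes(path):
    """Blocks actually allocated on disk, counted the way "du" counts them."""
    return os.stat(path).st_blocks * 512

tmpdir = tempfile.mkdtemp()
sparse_path = os.path.join(tmpdir, "sparse")
copy_path = os.path.join(tmpdir, "copy")

# A ~10 Mbyte file that is almost entirely hole.
with open(sparse_path, "wb") as f:
    f.seek(10 * 1024 * 1024)
    f.write(b"end")

# A naive sequential copy: the hole reads back as zeros, and writing
# those zeros allocates real disk blocks in the copy.
with open(sparse_path, "rb") as src, open(copy_path, "wb") as dst:
    while True:
        chunk = src.read(64 * 1024)
        if not chunk:
            break
        dst.write(chunk)

sparse_used = allocated_bytes(sparse_path)
copy_used = allocated_bytes(copy_path)
shutil.rmtree(tmpdir)
print(sparse_used, copy_used)
```

The original occupies only a block or two, while the copy occupies its full apparent size, which is exactly what happens on the backup media when a sparse file is archived naively.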
hbergh@nlicl1.oracle.com (Herbert van den Bergh) (05/27/91)
In article <1991May18.184923.28785@ariel.unm.edu>, sfreed@ariel.unm.edu (Steven Freed CIRT) writes:
|> Data base files are usually the most common type of file with holes.

I know at least one RDBMS (guess which) that doesn't do that, for a couple of reasons: when updating your database you don't want the overhead of the filesystem finding free blocks, and, more importantly, filling in holes can lead to file fragmentation, slowing down access to the file. So *REAL* databases ;-) won't use files with holes, but are more likely to use raw devices (even less overhead).

|> Steve.  sfreed@ariel.unm.edu