earle@mahendo.Jpl.Nasa.Gov (Greg Earle) (03/29/88)
In Maurice Bach's `The Design of the UNIX Operating System', the last
exercise in Chapter 4 (Internal Representation of Files) reads:

    15. Discuss a system implementation that keeps track of free disk
    blocks with a bit map instead of a linked list of blocks.  What are
    the advantages and disadvantages of this scheme?

I once worked on a project (a proprietary database computer, not unlike
that made by Britton-Lee) where the O/S team member designing the file
system used the bit-map scheme for the free disk block list.  Seeing
this exercise got me curious as to whether anyone has tried implementing
this as a research project using a UNIX system as a base, or, conversely,
do any extant O/S research projects use this method?  If so, can anyone
answer the question above (namely, how does it compare to using a linked
list kept in the super block)?

[ I know VMS uses bit maps. --DL ]
--
Greg Earle                  earle@mahendo.JPL.NASA.GOV
Indep. Sun consultant       earle%mahendo@jpl-elroy.ARPA [aka:]
(Gainfully Unemployed)      earle%mahendo@elroy.JPL.NASA.GOV
Lake View Terrace, CA       ...!{cit-vax,ames}!elroy!jplgodo!mahendo!earle
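[ Moderator's aside: to make the exercise concrete, here is a minimal
sketch of the two schemes side by side.  Class names, the first-fit
policy, and the in-memory lists standing in for on-disk structures are
all illustrative assumptions, not any particular system's code. --DL ]

```python
class BitmapFreeList:
    """Bit map: one flag per disk block, True = free."""

    def __init__(self, nblocks):
        self.free = [True] * nblocks

    def alloc(self):
        # Linear first-fit scan; real kernels scan a word at a time.
        # Always returns the lowest-numbered free block, which gives
        # sorted, local allocations as a side effect.
        for blk, is_free in enumerate(self.free):
            if is_free:
                self.free[blk] = False
                return blk
        raise OSError("file system full")

    def release(self, blk):
        self.free[blk] = True            # freeing is a single bit flip


class LinkedFreeList:
    """Traditional scheme: a chain of free block numbers, headed
    (in part) in the super block; modeled here as a plain list."""

    def __init__(self, nblocks):
        self.head = list(range(nblocks))

    def alloc(self):
        if not self.head:
            raise OSError("file system full")
        return self.head.pop()           # O(1), but ordering decays as
                                         # blocks are freed and reused

    def release(self, blk):
        self.head.append(blk)            # freed blocks return LIFO
```

The trade-off the exercise is after: the bit map costs a scan per
allocation but preserves locality and is easy to check for consistency;
the linked list allocates in constant time but scrambles block order
over time and is fragile across crashes.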
allyn@odin.ucsd.edu (Allyn Fratkin) (03/31/88)
See:

    Peacock, J. Kent, "The Counterpoint Fast File System," USENIX
    Conference Proceedings (Winter 1988), pp. 243-249.

for a description of an implementation of a "fast" file system for
System V.  It uses a bit map for the free list and shows throughput
approaching that of the Berkeley Fast File System.
--
From the virtual mind of Allyn Fratkin   allyn@sdcsvax.ucsd.edu
EMU Project                              {ucbvax, decvax, ihnp4}!sdcsvax!allyn
U.C. San Diego
die@frog.UUCP ( David I. Emery) (03/31/88)
Charles River Data Systems' UNOS operating system (an independently
implemented System V UN*X kernel that passes the SVVS and contains no
AT&T code) has used bit maps for inode and free disk block allocation
since the first version of its file system was designed in 1979.

We feel that bit-map allocation significantly enhances file system
reliability and robustness.  It also permits improved performance: we
use it to allocate contiguous regions on the disk for those blocks
which are written behind when a file is being written sequentially.
We postpone actually binding the blocks in the cache to physical
sectors until enough have accumulated, and then find a contiguous
region big enough to hold them by searching the bit map.  Having the
sectors contiguous on the disk permits efficient read-ahead in one
disk operation and minimizes head motion when the file is accessed.

[ You postpone writing to disk to collect enough for a contiguous region on ]
[ disk.  What metric do you use to decide when you have collected enough?   ]
[ How do you trade this off against the vulnerability to crashes? --DL      ]
--
David I. Emery          Charles River Data Systems
983 Concord St.         Framingham, MA 01701
Tel: (617) 626-1102     uucp: ...!decvax!frog!die
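[ A rough sketch of the contiguous-binding idea described above: delay
binding a batch of write-behind blocks, then search the bit map for a
run of free blocks big enough for the whole batch.  The first-fit
policy and function names are my assumptions, not UNOS internals. --DL ]

```python
def find_contiguous(free, want):
    """Return the start of the first run of `want` free blocks, or None.
    `free` is one boolean per disk block, True = free."""
    run_start, run_len = 0, 0
    for blk, is_free in enumerate(free):
        if is_free:
            if run_len == 0:
                run_start = blk          # a new run begins here
            run_len += 1
            if run_len == want:
                return run_start
        else:
            run_len = 0                  # allocated block breaks the run
    return None


def bind_write_behind(free, cached_blocks):
    """Bind a batch of delayed-write cache blocks to one contiguous
    region, marking the region allocated in the bit map."""
    start = find_contiguous(free, len(cached_blocks))
    if start is None:
        return None                      # caller falls back to scattered blocks
    for i in range(len(cached_blocks)):
        free[start + i] = False
    return start
```

With the batch contiguous on disk, a later sequential read of the file
can fetch it in one disk operation, which is the read-ahead benefit the
posting describes.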
earle@mahendo.Jpl.Nasa.Gov (Greg Earle) (03/31/88)
In article <4811@sdcsvax.UCSD.EDU> I wrote:
>I once worked on a project ... that ... used [ a ]
>bitmap scheme for the free disk block list.  Seeing this exercise got me
>curious as to whether anyone has tried implementing this as a research project
>using a UNIX system as a base, or conversely, do any extant O/S research
>projects use this method?

Everyone has been kindly reminding me that 4.x BSD UNIX uses such a
scheme.  Yes, thank you, I knew that already (I'd be a pretty sad
consultant otherwise); my trusty copy of `A Fast File System for UNIX'
is within easy reach.  What I was *really* thinking of when I said
`UNIX system' was a System V-based or BTL Research (i.e., V8 or V9)
version (sorry, mass Bach-reading has me in System V mode).  After
having used it for 4 years, I had sort of forgotten that 4.x BSD UNIX
can be (and still is) called a `research project'.  What I was *really*
thinking of when I said [ (-: ] `research project' was things I know
little of, like Sprite, for example.  Sorry for the confusion ...
--
Greg Earle                  earle@mahendo.JPL.NASA.GOV
Indep. Sun consultant       earle%mahendo@jpl-elroy.ARPA [aka:]
(Gainfully Unemployed)      earle%mahendo@elroy.JPL.NASA.GOV
Lake View Terrace, CA       ...!{cit-vax,ames}!elroy!jplgodo!mahendo!earle
darrell@cs.ucsd.edu (Darrell Long) (04/01/88)
Melinda Shore <shor@sphinx.uchicago.edu> says:

Unicos (Cray's version of SysV) uses a bit map of free disk blocks.
Earlier versions used the traditional linked lists.  We never ran the
old file system, but apparently it really was much slower than the
current one.  Maybe someone at Cray could give us real figures.
--
Melinda Shore                       ..!uunet!reason.psc.edu!shore
Pittsburgh Supercomputing Center    shore@reason.psc.edu

emory!arnold@ucsd.edu (Arnold D. Robbins {EUCC}) says:

The Eighth and Ninth Edition Research UNIX systems use a bit map to
store the free list; blocks in the file system are all 4K bytes in
size.  The V8/V9 'fsck' knows how to convert an 'old' style
1K-linked-list file system (i.e., the 4.1 BSD file system) into a
new-style 4K bit-mapped file system, in place.

Peter Weinberger wrote a description of this file system in a Bell
Labs Technical Journal article circa 1981.  I have the journal at
home, so if someone desperately needs the reference I could look it
up.  As I recall, the main speed-up came from the larger block sizes,
although the ability to use an in-memory bit map also helped.  It's
been a while since I read the article.  Basically, Weinberger was able
to move data through the file system at just about the maximum
practical disk transfer speed.  Note also that this is still the
traditional "all the i-node blocks at the front" sort of file system.

A wart is that V8 uses the eighth bit in the minor device field to
distinguish between file system types; the new one has the high bit
of the byte on.  It seems to me they could have done it better with
the File System Switch, but I suspect the FSS came much later than the
faster file system.  Perhaps V9 does it differently than V8 did; I've
only seen (someone else's) V8 manual.
--
Arnold Robbins   ARPA, CSNET: arnold@emory.ARPA   BITNET: arnold@emory
UUCP: { decvax, gatech }!emory!arnold   DOMAIN: arnold@emory.edu (soon)
``csh: just say NO!''
deh0654@sjfc.UUCP (Dennis Hamilton) (04/01/88)
[ This will be the last bit-mapped allocation posting, unless something ]
[ really new appears. --DL ]

The CP/M operating system used bit maps, and it was very efficient on
small-memory machines using slow floppy disks, for example.  The bit
map is not stored on the disk, however -- it is constructed from the
allocation tables every time the disk is "mounted."  That way, there
are no problems with using a bit map that has been invalidated by a
crash (or at least, fewer problems).  The CP/M directory structure
lists the allocated blocks for each file, using directory overflow
records if necessary for large or strange, sparse files whose virtual
blocks are not all allocated.

The biggest problem with the CP/M scheme has to do with shared
physical blocks.  The file erase function updated the bit map and
invalidated the directory slot of the erased (actually, released)
file, but it wouldn't notice if any blocks were shared with another
file.  This allows re-allocation of shared blocks when they are really
not available.  There are remedies, and some utility software is
careful to force the equivalent of a remount every time an erasure is
done, so that the bit map will be correctly reconstituted.

Apart from the time needed to reconstruct a bit map every time a disk
volume is mounted, the only other problem that tended to reduce
performance was the need to consult directories so often, especially
for file allocation overflow records.  This was improved in later
versions by better directory caching and by knowing which devices
were non-removable.

A very nice quality of the CP/M scheme was the ability to handle
sparsely allocated large virtual files.  That, and the sharing of
blocks, actually seems to go more smoothly than the linking of
allocation lists that is typical of MS-DOS and other systems.

But as far as performance goes, caching is becoming so easy that it
would seem to overcome most performance objections to any method
(except, possibly, in terms of the ability to handle recovery after
crashes).

Dennis E. Hamilton
--
-- orcmid  {uucp: ... !rochester!sjfc!deh0654
            vanishing into a twisty little network of nodes all alike}
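[ A small sketch of the CP/M-style mount-time rebuild described above:
the bit map is reconstructed from the directory's allocation lists at
each mount, and a rebuild can also spot the shared-block hazard.  The
data shapes and names here are illustrative assumptions, not CP/M's
actual on-disk layout. --DL ]

```python
def rebuild_bitmap(nblocks, directory, reserved=()):
    """Reconstruct the free-block bit map from directory entries.

    directory: iterable of (filename, [block numbers]) pairs, one per
    directory (or overflow) record.  Returns (free, shared): `free` is
    one boolean per block, and `shared` is the set of blocks claimed
    by more than one file -- the hazard the posting describes.
    """
    free = [True] * nblocks
    for blk in reserved:                 # boot/directory blocks never free
        free[blk] = False

    owner = {}
    shared = set()
    for name, blocks in directory:
        for blk in blocks:
            free[blk] = False            # any claimed block is in use
            if blk in owner and owner[blk] != name:
                shared.add(blk)          # two files claim the same block
            owner[blk] = name
    return free, shared
```

Because the map is derived from the directory rather than stored, a
crash can never leave a stale bit map behind; the next mount simply
rebuilds it, which is the robustness property the posting credits to
CP/M.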
cantrell@Alliant.COM (Paul Cantrell) (04/01/88)
The MASSCOMP UNIX file system, which was a derivative of the Bell file
system, was modified by Tom Teixeira (now at Stellar) to do bit-map
allocation of file blocks.  This was necessary because the system
supported contiguous block allocation (required for high-speed data
acquisition).  You could try mailing him at Stellar for more
information about it.

My experience with his file system was that allocations were very
fast, and the file system throughput was very high.  Even when
non-contiguous files were allocated, a side effect of the bit-map
allocation was that the blocks were obtained in sorted order.  Even
with gaps between some of the blocks, the locality obtained made for
very nice file system throughput.  I can't think of any negative
effects of his allocation method...

					PC
jgy@hropus.att.com (John Young) (04/02/88)
This bit map discussion is quite interesting!

Question: previous posters have mentioned the use of bit maps for
simplifying contiguous allocations.  How is this done with respect to
gap/cylinder optimizations?  Do you keep n maps, where n is the gap
size?

					John Young
					AT&T-BL
bruce@ucsd.edu (Bruce Beare) (04/05/88)
Convergent's Mightyframe (S/320, S/640, etc.) uses a combination
bitmap/free-list file system.  What we did was to keep the free list
intact and slice off a section of the file system at a time for
bitmapping.

Advantages: minimal changes to the kernel, fsck, etc.; very good
performance; and the ability to switch between the two formats (free
list/combo bitmap) at will (fsck does it).

Disadvantages: when the bitmap is exhausted, the kernel will lock the
free list and scan the entire free list (one pass), building a new
bitmap.  This does not happen very often, but when it does, the file
system will block alloc/free calls (for that file system) for several
seconds.

Other improvements: when an alloc calls for more than one block, it
gets contiguous blocks from the bit map (changes to bmap, etc.).
This, coupled with a `direct I/O' facility, allowed us to do large
file system reads/writes as a single DMA to the physical device,
giving us a 4X speed improvement.  We peaked, with a Fuji SMD drive
and Interphase SMD controller, in the 1 MB/s range for file system
I/O on the AIM II benchmark.  Typical ST506 performance was
350-400 KB/s with Maxtor XT1140 drives.

Bruce Beare
Now @ Voysys Corporation    ..!pyramid!ctnews!voysys!bruce
Non-standard disclaimer: I never did speak for Convergent
Technologies, and still don't.
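[ A loose sketch of the combination scheme described above: blocks are
handed out from a small in-memory bit-map window, and when the window
is exhausted, the next slice of the free list is scanned to build a
new one.  The window size, names, and the plain list standing in for
the on-disk chain are my assumptions, not Convergent's code. --DL ]

```python
class ComboAllocator:
    """Free list kept intact; one slice at a time is bitmapped."""

    def __init__(self, free_list, window=4):
        self.free_list = list(free_list)  # stand-in for the on-disk chain
        self.window = window              # how many blocks to bitmap at once
        self.bitmap = {}                  # block number -> free? (current slice)

    def _refill(self):
        # The expensive step the posting mentions: lock the free list
        # and slice off the next section for bitmapping.
        chunk = self.free_list[: self.window]
        self.free_list = self.free_list[self.window :]
        self.bitmap = {blk: True for blk in chunk}

    def alloc(self):
        # First-fit within the bitmapped slice, lowest block first.
        for blk, is_free in sorted(self.bitmap.items()):
            if is_free:
                self.bitmap[blk] = False
                return blk
        if not self.free_list:
            raise OSError("file system full")
        self._refill()
        return self.alloc()

    def free(self, blk):
        self.bitmap[blk] = True           # freed blocks rejoin the window
```

The common case stays cheap (a scan of a small bit map); only the rare
refill touches the full free list, which matches the advantage/
disadvantage split in the posting.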