haberman@s1.msi.umn.edu (Joe Habermann) (12/27/90)
Please excuse me if this question has been asked many times already. It seems common enough. We have a Sun running SunOS 4.0.3 that is acting as a net news server. Since news invariably makes many small files, we wanted to make our news filesystem with many inodes. The obvious thing to try is to run mkfs with a smaller nbpi than the default. We have tried this, but are now running up against MAXIPG. That is, no matter how small nbpi is made, SunOS mkfs can only allocate <= MAXIPG inodes per cylinder group as defined in /usr/include/ufs/fs.h. Is there any reasonable way around this problem? Thanks, Joe Habermann haberman@msi.umn.edu
dpz@action.rutgers.edu (David Paul Zimmerman) (12/27/90)
What we generally do here is to skirt the problem by mkfs'ing a smaller number of cylinders per cylinder group (-c flag to mkfs on SunOS). For example, if the standard is 16 cpg, we use 12 or 8 or even 4. (My own news spool system uses 12.) David -- David Paul Zimmerman dpz@dimacs.rutgers.edu Systems Programmer rutgers!dpz Rutgers Univ Center for Discrete Math and Theoretical Computer Science (DIMACS)
chris@mimsy.umd.edu (Chris Torek) (01/07/91)
In article <1990Dec26.182035.6868@s1.msi.umn.edu> haberman@s1.msi.umn.edu (Joe Habermann) asks:
>We have a Sun running SunOS 4.0.3 that is acting as a net news server. ...
>We ... are now running up against MAXIPG. That is, no matter how small
>nbpi is made, SunOS mkfs can only allocate <= MAXIPG inodes per cylinder
>group as defined in /usr/include/ufs/fs.h.
>Is there any reasonable way around this problem?

In article <Dec.26.20.35.45.1990.2094@action.rutgers.edu> dpz@action.rutgers.edu (David Paul Zimmerman) answers:
>What we generally do here is to skirt the problem by mkfs'ing a smaller number
>of cylinders per cylinder group (-c flag to mkfs on SunOS). For example, if
>the standard is 16 cpg, we use 12 or 8 or even 4. (My own news spool system
>uses 12.)

This is usually the solution. I have, however, a longer article explaining what is going on (and why you should complain to your vendor about their not picking up some changes from 4.3BSD-tahoe).

From: chris@mimsy.umd.edu (Chris Torek)
Newsgroups: comp.unix.questions
Subject: Re: How to increase number of inodes in a BSD file system?
Message-ID: <26329@mimsy.umd.edu>
Date: 1 Sep 90 21:40:27 GMT

First, avoid mkfs. Use newfs. Even if you still have a separate mkfs program (4.3-tahoe and 4.3-reno no longer do), newfs is easier.

Next, newfs: there are a number of options that together affect the total number of inodes. These are:

    -b, -f, -i, -c, -s, -u, -t, -p, -x

Many of these have effects that are so indirect (and complicated) that they are best ignored. Some dictate the physical geometry of the disk and should therefore not be changed anyway. The important ones are `-i' and `-c', and if you have a sufficiently recent BSD (4.3-tahoe or later) `-i' will (almost) always work and you can ignore the rest anyway.

The -i option takes an argument giving the `number of bytes per inode': that is, the expected average size of a file. It is best to underestimate slightly.
The average file size on many of our file systems is around 7 or 8 kB; on these file systems `-i 6144' or `-i 6656' gives us good results. On a few file systems (e.g., /news) the average file size is much smaller.

Now, what are all the rest of those doing there? -i `ought' to do the trick, but. . . .

There are various constraints imposed by the original file system code that crop up in certain common situations on certain machines from a certain vendor that, despite significant progress on certain networking and VM architecture fronts :-) , seems remarkably slow at picking up such important and useful fixes as the `fat fast file system' code in 4.3BSD-tahoe.

In particular, the 4.2BSD code required that there be no more than 2048 (MAXIPG) inodes per `cylinder group'. (A `cylinder group' is just that: a group (collection) of (contiguous) disk cylinders.) There are normally 16 cylinders per cylinder group (except in the last group). On an RK07, for instance, this gives

    22 sect | 3 trk | 1 kB   | 16 cyl
    --------+-------+--------+------- = 528 kB per cyl group
     1 trk  | 1 cyl | 2 sect |  1 cg

Now, `-i' defaults to 2048 bytes per inode. This means that there would be (528 * 1024 / 2048) inodes in a default RK07 cylinder group, or 264 i/g. But consider a slightly more modern disk: a Fujitsu Eagle:

    48 sect | 20 trk | 1 kB   | 16 cyl
    --------+--------+--------+------- = 7680 kB per cyl group
     1 trk  |  1 cyl | 2 sect |  1 cg

This requires 3840 inodes per cylinder group to allot one inode for every 2048 bytes. Already we have overrun MAXIPG. On a still-higher density drive once popular from the fore(not)mentioned vendor we find:

    67 sect | 20 trk | 1 kB   | 16 cyl
    --------+--------+--------+------- = 10720 kB / cg
     1 trk  |  1 cyl | 2 sect |  1 cg

To get 2048 bytes per inode here, we would have to have 5360 inodes per cylinder group.
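The three calculations above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in the article, not of what newfs actually does internally: the function and variable names are made up for illustration, the 512-byte sector is inferred from the "1 kB per 2 sect" factor, and real newfs rounding (for example to whole inode blocks) is ignored.

```python
# Illustrative arithmetic only. All names here are hypothetical and are
# not taken from any BSD or SunOS source file.
MAXIPG = 2048        # per-cylinder-group inode limit quoted in the article
SECTOR_BYTES = 512   # assumption: implied by "1 kB per 2 sect"

# (sectors per track, tracks per cylinder) as given in the article
DISKS = {
    "RK07": (22, 3),
    "Eagle": (48, 20),
    "67-sector drive": (67, 20),
}

def cg_kilobytes(sect_per_trk, trk_per_cyl, cyl_per_group=16):
    """Size of one cylinder group in kB."""
    return sect_per_trk * trk_per_cyl * cyl_per_group * SECTOR_BYTES // 1024

def inodes_wanted(cg_kb, bytes_per_inode=2048):
    """Inodes needed in one group to honor the requested -i value."""
    return cg_kb * 1024 // bytes_per_inode

def report(cyl_per_group=16, bytes_per_inode=2048):
    """Return (disk, kB per group, inodes wanted, exceeds MAXIPG?) rows."""
    rows = []
    for name, (spt, tpc) in DISKS.items():
        kb = cg_kilobytes(spt, tpc, cyl_per_group)
        want = inodes_wanted(kb, bytes_per_inode)
        rows.append((name, kb, want, want > MAXIPG))
    return rows

if __name__ == "__main__":
    for name, kb, want, over in report():
        flag = "exceeds MAXIPG" if over else "ok"
        print(f"{name}: {kb} kB/cg, {want} inodes wanted ({flag})")
```

With the default 16 cylinders per group, only the RK07 stays under the limit; the other two drives ask for more inodes than one group may hold.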
Since we can only have 2048 inodes per group, newfs (silently, on said vendor's systems, as they had not yet picked up the fixed version that was more verbose) raised the -i value from 2048 to 5360: that is, it assumed that the average file was about 5400 bytes long.

Hence the -c option. If we change the number of cylinders in a cg, we will change the total space in each cg, while retaining the 2048 inode limit. The obvious choice for someone needing more inodes is to use a smaller value for -c, such as 8:

    48 sect | 20 trk | 1 kB   | 8 cyl
    --------+--------+--------+------ = 3840 kB per cyl group
     1 trk  |  1 cyl | 2 sect | 1 cg

which needs only 1920 inodes to get 2048 bytes per inode, comfortably under the MAXIPG limit of 2048.

But wait, there is more. Each superblock also contains rotational tables, used when allocating disk blocks so that sequentially reading a file does not require long waits for the disk to spin around. These tables eventually repeat, or `cycle'. The period for this cycle sets a *lower* bound on the size of the cylinder group. This lower bound is aggravated (gets larger) by values of `sectors per track' that have few factors of two. On RK07s the number of sectors per track is divisible by two; on Eagles the number of sectors per track is 2^4*3; but on these other disks, the list of prime factors is: 67. The track size is a prime number.

All this means that in order to reduce the size of a cylinder group, you may wind up having to reduce the block size (this has the opposite effect on the cycle period). Of course, you can also simply lie to newfs, describe a disk geometry that is not quite the actual geometry, and get whatever results you like, at some cost in performance. Without measuring it I would not care to guess what that cost might be.

--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@cs.umd.edu Path: uunet!mimsy!chris
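As a rough check on the clamping and the -c workaround described in the article above, the following Python sketch computes the bytes-per-inode figure you effectively get once the inode count is capped at MAXIPG, and the largest cylinders-per-group value that still honors a requested -i. The helper names are hypothetical, the 512-byte sector is an assumption taken from the article's "1 kB per 2 sect", and the model deliberately ignores the rotational-table lower bound on cylinder group size, so a value it suggests may not be accepted on a real disk.

```python
# Illustrative arithmetic only. Names are hypothetical; this is not the
# newfs algorithm, and it ignores the rotational-table cycle constraint.
MAXIPG = 2048        # per-cylinder-group inode limit quoted in the article
SECTOR_BYTES = 512   # assumption: implied by "1 kB per 2 sect"

def cg_bytes(sect_per_trk, trk_per_cyl, cyl_per_group):
    """Size of one cylinder group in bytes."""
    return sect_per_trk * trk_per_cyl * cyl_per_group * SECTOR_BYTES

def effective_bytes_per_inode(sect_per_trk, trk_per_cyl, cyl_per_group,
                              requested=2048):
    """Bytes per inode after capping the group's inode count at MAXIPG."""
    size = cg_bytes(sect_per_trk, trk_per_cyl, cyl_per_group)
    inodes = min(size // requested, MAXIPG)
    return size // inodes

def largest_cpg(sect_per_trk, trk_per_cyl, requested=2048, start=16):
    """Largest cylinders-per-group (<= start) that needs <= MAXIPG inodes."""
    for cpg in range(start, 0, -1):
        if cg_bytes(sect_per_trk, trk_per_cyl, cpg) // requested <= MAXIPG:
            return cpg
    return None

if __name__ == "__main__":
    print(effective_bytes_per_inode(67, 20, 16))  # 67-sector drive, default -c
    print(effective_bytes_per_inode(48, 20, 8))   # Eagle with -c 8
    print(largest_cpg(48, 20))                    # Eagle
```

For the Eagle this agrees with the article's choice of 8 cylinders per group. For the 67-sector drive the same arithmetic suggests 6, but that is exactly the case where the cycle period may force a larger group, so treat the number as a starting point only.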