cliff@motcsd.csd.mot.com (cliff.rodriguez) (09/07/90)
We are working on a project to convert our system V based system (ver 3) from 14 char file names to something much larger. Has anyone out there done this, or heard it done? I need to know if this is going to be the slow tedious task I think it is. Any suggestion on how to speed up the work or some magic answer would be appreciated... thanks in advance...cliff -- -------------------------------------------------------------------------------- Cliff Rodriguez voice:408-366-4788 fax:408-366-4125, Cupertino, CA. USA uunet! { apple | pyramid } motcsd!cliff cliff@csd.mot.com
vu0310@bingvaxu.cc.binghamton.edu (R. Kym Horsell) (09/08/90)
In article <1430@engadm2.csd.mot.com> cliff@motcsd.csd.mot.com (cliff.rodriguez) writes: >We are working on a project to convert our system V based system (ver 3) >from 14 char file names to something much larger. Has anyone out there >done this, or heard it done? I need to know if this is going to be the >slow tedious task I think it is. Any suggestion on how to speed up the >work or some magic answer would be appreciated... thanks in advance...cliff Ask DEC -- they upgraded the length of the VAX/VMS filenames some time back. You better not ask _how_ they did it 'tho; you might be sick. They didn't (and I am talking as a system programmer about the time the change was made, things may have been cleaned up since) make the filename _contiguous_ in the (directory) entry -- there happened to be a bit of space left over at the end and... In U*X a directory entry is defined in dir.h -- you _may_ redefine the maximum length & recompile. Why have you got only 14-char filenames? Is this _really_ V? -Kym Horsell
guy@auspex.auspex.com (Guy Harris) (09/10/90)
>In U*X a directory entry is defined in dir.h -- you _may_ >redefine the maximum length & recompile. And then dump and restore all your file systems, since you've then just changed the on-disk file format. Also, fix up a bunch of programs that read directories directly to use "readdir()" instead, and make sure no programs "know" that file names are limited to 14 characters. >Why have you got only 14-char filenames? Presumably because he's using a system with only the V7-based S5 file system. >Is this _really_ V? The standard file system with S5 releases prior to S5R4 is V7-based, and has a 14-character limit on file names, yes. S5R4 also comes with the 4.3BSD file system, which has a 255-character limit....
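To illustrate the readdir() point: a program that walks a directory through opendir()/readdir() never touches the on-disk entry layout, so it should keep working if the name field grows. What follows is only a minimal sketch. The helper name count_long_names and its limit parameter are invented for this example; only the POSIX directory calls are real.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the entries in `path` whose names are longer
 * than `limit` characters. Returns -1 if the directory cannot be opened. */
long count_long_names(const char *path, size_t limit)
{
    DIR *dir;
    struct dirent *ent;
    long count = 0;

    assert(path != NULL);
    dir = opendir(path);
    if (dir == NULL)
        return -1;

    /* readdir() returns one entry at a time with a NUL-terminated d_name,
     * whatever directory format the underlying file system uses. */
    while ((ent = readdir(dir)) != NULL) {
        if (strlen(ent->d_name) > limit)
            count++;
    }
    closedir(dir);
    return count;
}
```

Called as count_long_names(".", 14) on a file system that allows long names, it reports how many entries would not fit in an S5 directory.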
meissner@osf.org (Michael Meissner) (09/10/90)
In article <4040@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:
| >In U*X a directory entry is defined in dir.h -- you _may_
| >redefine the maximum length & recompile.
|
| And then dump and restore all your file systems, since you've then just
| changed the on-disk file format. Also, fix up a bunch of programs that
| read directories directly to use "readdir()" instead, and make sure no
| programs "know" that file names are limited to 14 characters.
|
| >Why have you got only 14-char filenames?
|
| Presumably because he's using a system with only the V7-based S5 file
| system.

The historical reason for the 14 character filename is that under V7 the directory entry was the inode number + filename within directory. Since the inode number was 2 bytes, making filenames 14 bytes meant that all directory entries were the same size (16 bytes), but not so big it wasted space for the average filesystem.

-- Michael Meissner email: meissner@osf.org phone: 617-621-8861 Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142 Do apple growers tell their kids money doesn't grow on bushes?
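The layout described above can be sketched as a C struct. The field names follow the traditional struct direct; the names v7_direct, V7_DIRSIZ and v7_name are invented here for illustration. One detail that matters for the rest of the thread: a name that is exactly 14 characters long has no terminating NUL in the entry, so code has to add one itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define V7_DIRSIZ 14  /* name bytes per entry */

/* Sketch of the fixed-size V7/S5 directory entry: 2 + 14 = 16 bytes. */
struct v7_direct {
    uint16_t d_ino;              /* inode number; 0 marks an unused slot */
    char     d_name[V7_DIRSIZ];  /* NUL-padded, not terminated at 14 chars */
};

/* Hypothetical helper: copy an entry's name into `out`, which must hold at
 * least V7_DIRSIZ + 1 bytes, and terminate it. Returns the name length. */
size_t v7_name(const struct v7_direct *de, char *out)
{
    size_t n = 0;

    assert(de != NULL && out != NULL);
    while (n < V7_DIRSIZ && de->d_name[n] != '\0')
        n++;
    memcpy(out, de->d_name, n);
    out[n] = '\0';
    return n;
}
```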
calhoun@usaos.uucp (Warren D. Calhoun) (09/10/90)
In <4040@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes: >>Why have you got only 14-char filenames? >Presumably because he's using a system with only the V7-based S5 file >system. >>Is this _really_ V? >The standard file system with S5 releases prior to S5R4 is V7-based, and >has a 14-character limit on file names, yes. S5R4 also comes with the >4.3BSD file system, which has a 255-character limit.... Can you say POSIX compliance? -- | SSG W.D. Calhoun | UUCP: ...!uunet!usaos!calhoun | | Gas Turbine Engine (52F) Branch | INTERNET: calhoun%usaos@uunet.uu.net | | The U.S. Army Ordnance School | CompUServe: 76336.2212@compuserve.com | | Fort Belvoir, Virginia 22060 | Voice: (703) 664-3396/3595 |
pcg@cs.aber.ac.uk (Piercarlo Grandi) (09/12/90)
On 7 Sep 90 07:35:20 GMT, cliff@motcsd.csd.mot.com (cliff.rodriguez) said:

cliff> We are working on a project to convert our system V based system
cliff> (ver 3) from 14 char file names to something much larger.

Do you *really* need much larger? Why? If something like 30 instead of 14 would do, an easy hack exists.

cliff> Has anyone out there done this, or heard it done? I need to know
cliff> if this is going to be the slow tedious task I think it is.

Well, this can be (and has been) done in two ways:

1) keeping the current organization, but just extending the size limit. For example, you could have directory entries that are 32 bytes long, for 30 byte file names, or 64 bytes long, for 62 byte file names. This does not require much more than changing a #define or two and recompiling the kernel, the dirent library, and a few applications that do not use it (mkfs, fsck, etc...). It does make directories grow in size, but I think that's not too important -- many directories are well under 512 bytes, i.e. 32 entries, and doubling the entry size to 32 bytes would not consume any additional disk or memory at all in this case.

2) Adopt a variable length name directory scheme. This can be (less easily) done by borrowing the relevant part of the 4.xBSD filesystem code and plugging it in. This could be done by defining a new filesystem type under the FSS, that shared almost all its procedures with the standard s5 one, except for the path resolution entry point, and modifying the 4.xBSD filesystem source to have an FSS style interface. I seem to remember that Lachman or Unisoft rewrote the interface to the 4.xBSD filesystem modules so that it could be plugged in its entirety under System V.
I am sure that Everex ESIX also did something like that, except that they did the opposite of what you want -- instead of changing the format of directories and leaving the disc layout unchanged, they did borrow all the far more efficient 4.xBSD disc layout logic and left the directory format unchanged (for backwards compatibility). If you go the 4.xBSD route you also have to change mkfs, fsck, icheck, and any other utility that works on the filesystem internals, by borrowing the appropriate code from the 4.xBSD version, if you plug in only the new directory format, or substituting them altogether if you just go for the entire 4.xBSD fast filesystem logic. I think that if you want just longer file names then option 1), doubling the directory entry size to 32, is best -- even on BSD systems I have *very* rarely seen filenames longer than 30 characters -- as it gives you most of what you want and does not require many changes. If you want a look-and-feel like the 4.xBSD one, you should not just change the directory file format to the variable length one -- you should also go for the entire 4.xBSD file system logic, which has much much better performance than the s5 filesystem type. This is what AT&T themselves did with System V.4. Going all the way to the 4.xBSD filesystem type instead of the s5 one can be done most easily taking the System V.4 implementation or the 4.3BSD-reno one, and change their interface with the rest of the kernel from their VFS style one to the FSS one. This is not, I think, a major job, even if VFS style interface and FSS style ones are at slightly different abstraction levels. You could do the opposite, change the kernel to use VFS style filesystem interfaces, so that you can plug in a conversion interface from FSS to VFS if you want to continue using FSS based filesystem modules (e.g. the Xenix or DOS filesystems) and put in the 4.xBSD style filesystem type without change. 
I think that if you want to ease the transition to System V.4, and already have, as you should, System V.4 source, this is the way to go -- modifying the V.3 kernel for V.4's VFS instead of FSS, and putting in a module that presents a VFS interface to the kernel and an FSS one to V.3 style filesystem types (since the FSS interface is lower level and more restrictive than the VFS one, I think doing the opposite is much harder, but I cannot say for sure without looking hard at the V.3 FSS and V.4 VFS interface details).

Note that 4.3BSD-reno and System V.4 (and SunOS) use VFS style interfaces that are very similar, but not identical, regrettably. No two major UNIX variants define exactly the same interface to installable filesystem modules.

-- Piercarlo "Peter" Grandi | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk Dept of CS, UCW Aberystwyth | UUCP: ...!mcsun!ukc!aber-cs!pcg Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk
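The space argument behind option 1 in the post above can be checked with a little arithmetic. This is a sketch only, assuming the 512-byte blocks mentioned in the post and ignoring inode and indirect-block overhead; dir_blocks is a name made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Number of `block_size`-byte blocks needed to hold `entries` directory
 * entries of `entry_size` bytes each. Entries are assumed not to straddle
 * blocks, which holds whenever entry_size divides block_size. */
size_t dir_blocks(size_t entries, size_t entry_size, size_t block_size)
{
    size_t per_block;

    assert(entry_size > 0 && entry_size <= block_size);
    per_block = block_size / entry_size;
    return (entries + per_block - 1) / per_block;  /* round up */
}
```

With 16-byte entries a 512-byte block holds 32 names; with 32-byte entries it holds 16. So dir_blocks(16, 32, 512) is still 1, and a directory of up to 16 entries costs nothing extra, while one with 17 to 32 entries grows from one block to two.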
guy@auspex.auspex.com (Guy Harris) (09/12/90)
>>The standard file system with S5 releases prior to S5R4 is V7-based, and >>has a 14-character limit on file names, yes. S5R4 also comes with the >>4.3BSD file system, which has a 255-character limit.... > >Can you say POSIX compliance? Yes, I can. I can even say it on a system with a 255-character limit on filenames; there is *NOTHING* about a limit higher than 14 characters that violates POSIX. POSIX says the *minimum* limit that a system may impose is 14 characters. The *actual* limit is pathname-dependent (consider an S5R4 system with both S5 and UFS file systems mounted, for example), and its value for some particular directory can be fetched with "pathconf()".
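A minimal sketch of the pathconf() query mentioned above. The wrapper name name_max_for is invented for the example; pathconf() and _PC_NAME_MAX are the standard POSIX interface.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical wrapper: maximum file name length for entries in `dir`.
 * Returns -1 if the limit is indeterminate or the call fails; errno is
 * cleared first, so a nonzero errno afterwards means a real failure. */
long name_max_for(const char *dir)
{
    assert(dir != NULL);
    errno = 0;
    return pathconf(dir, _PC_NAME_MAX);
}
```

Because the answer depends on the file system holding the directory, a program on a machine with both S5 and UFS mounted would call this per directory instead of compiling in a single constant.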
jeff@quark.WV.TEK.COM (Jeff Beadles) (09/12/90)
pcg@cs.aber.ac.uk (Piercarlo Grandi) babbles:
|On 7 Sep 90 07:35:20 GMT, cliff@motcsd.csd.mot.com (cliff.rodriguez) said:
|
|cliff> We are working on a project to convert our system V based system
|cliff> (ver 3) from 14 char file names to something much larger.
|
|Do you *really* need much larger? Why? If something like 30 instead of
|14 would do, an easy hack exists.

We could use a few less "hacks". :-)

|cliff> Has anyone out there done this, or heard it done? I need to know
|cliff> if this is going to be the slow tedious task I think it is.
|
|Well, this can be (and has been) done in two ways:
|
|1) keeping the current organization, but just extending the size limit.
|For example, you could have directory entries that are 32 bytes long,
|for 30 byte file names, or 64 bytes long, for 62 byte file names. This
|does not require much more than changing a #define or two and
|recompiling the kernel, the dirent library, and a few applications that
|do not use it (mkfs, fsck, etc...). It does make directories grow in
|size, but I think that's not too important -- many directories are well
|under 512 bytes, i.e. 32 entries, and doubling the entry size to 32
|bytes would not consume any additional disk or memory at all in this
|case.
...

This is **NOT** true. There are hard-coded user programs that depend on a 14 character filename limit! It's by no means as easy as changing a #define or two.

For example, what will this code fragment do with >14 character filenames?

...
	for(i=0; i<14; i++)
		if(*xx)
			*yy++ = *xx++;
		else
			break;
	*yy ='\0';
...

This is typical of parts of the SYSV 3.2 code. True, this is not a robust way to handle this, but it is typical of the code. The ONLY way to find these sorts of problems is by inspection or searching.

-Jeff
-- Jeff Beadles jeff@onion.pdx.com
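To answer the question in the post concretely: the fragment copies the first 14 characters and silently drops the rest. The sketch below wraps the fragment in a function (copy_name_14) and sets it beside one possible replacement (copy_name) that takes the destination size as a parameter. Both function names are invented here, and the replacement is only one reasonable fix, not what any vendor shipped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The fragment from the post, wrapped in a function: copies at most 14
 * characters, so longer names are silently truncated. `yy` needs 15 bytes. */
void copy_name_14(char *yy, const char *xx)
{
    int i;

    for (i = 0; i < 14; i++)
        if (*xx)
            *yy++ = *xx++;
        else
            break;
    *yy = '\0';
}

/* Sketch of a replacement driven by the destination size instead of a
 * magic number. Returns 0 on success, -1 if `src` did not fit (dst then
 * holds a truncated but still terminated copy). */
int copy_name(char *dst, size_t dstsize, const char *src)
{
    size_t len;

    assert(dst != NULL && src != NULL && dstsize > 0);
    len = strlen(src);
    if (len >= dstsize) {
        memcpy(dst, src, dstsize - 1);
        dst[dstsize - 1] = '\0';
        return -1;
    }
    memcpy(dst, src, len + 1);  /* includes the terminating NUL */
    return 0;
}
```

The return value lets the caller notice truncation, which the original loop gives no way to detect.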
machina@uts.amdahl.com (Miguel A. Ramirez) (09/19/90)
In article <8738@orca.wv.tek.com> jeff@onion.pdx.com (Jeff Beadles) writes:
>pcg@cs.aber.ac.uk (Piercarlo Grandi) babbles:
>|
>|Do you *really* need much larger? Why? If something like 30 instead of
>|14 would do, an easy hack exists.
>
>We could use a few less "hacks". :-)

I'll second this!

>|Well, this can be (and has been) done in two ways:
>|
>|1) keeping the current organization, but just extending the size limit.
[...]
>This is **NOT** true. There are hard-coded user programs that depend on a
>14 character filename limit! It's by no means as easy as changing a
>#define or two.
>
>For example, what will this code fragment do with >14 character filenames?
>...
>	for(i=0; i<14; i++)
>		if(*xx)
>			*yy++ = *xx++;
>		else
>			break;
>	*yy ='\0';
>...
>
>This is typical of parts of the SYSV 3.2 code. True, this is not a robust way
>to handle this, but it is typical of the code. The ONLY way to find these
>sorts of problems is by inspection or searching.
>

Finally, someone on the net with a much better grip on reality. There's no such thing as an easy hack. BTW, Piercarlo did you test not only the kernel but also all the commands that were affected by the long file name support? No? But I thought it was an easy hack?

-- Miguel A. Ramirez, | machina@uts.amdahl.com | {sun,uunet}!amdahl!machina
pcg@cs.aber.ac.uk (Piercarlo Grandi) (09/21/90)
On 19 Sep 90 06:18:17 GMT, machina@uts.amdahl.com (Miguel A. Ramirez) said:

    [ ... on how many user programs have a hard-coded 14 for the max length of file name in a System V environment, and thus would not respond to a change in a #define ... ]

machina> There's no such thing as an easy hack.

Indeed, being able to change a #define'd symbol from 14 to 30 is not an easy hack, and I have only been trying to sound funny; much more seriously, it is the reason why the #define'd symbol has been there for at least 10 years, to allow for *easy* parametrization. Some user programs have been silly enough to avoid it? They would break under *any* scheme to change the filename length, so changing the #define is the _easiest_ way out, because at least it impacts the kernel and kernel-level utilities less than the other choices.

Once you have decided you want to change the maximum file name length, what is 'easy' is relative to that decision. Maybe I should have written 'an easy (relative to the other alternatives) way to satisfy your requirement in a strict sense and with the easiest and smallest (relatively, but also, to me, in absolute terms) changes to kernel and command sources is ...' instead of 'an easy hack is ...'. My postings are already too long -- I don't really want to write out everything in extenso, just to avoid people like you getting confused. News is not Congress (yet :->), and articles are not Bills.

machina> BTW, Piercarlo did you test not only the kernel but also all
machina> the commands that were affected by the long file name support?

Actually, I think I have listed most of those in the standard System V distribution (I think I forgot SCCS, and no doubt some others) in some past article. Let me repeat, very few programs actually scan directories. Most work on collections of file pathnames, and of these only a few break a file pathname into file names (most are happy with mucking with suffixes at the end of pathnames).
One may want to have a look at the BSD sources (freely available to those that have a System V source license) and scan them for usage of MAXNAMLEN; one would easily find the list of programs (modulo some BSD vs. System V differences) where one has to become suspicious, because the BSD crew put in all those MAXNAMLENs when they had to convert from the V7/V32/4.1BSD/System III/System V to the FFS directory organization.

machina> No? But I thought it was an easy hack? --

You seem so sure, you must have done so. So please, for our information, post a list of commands and libraries in the standard System V distribution that have a hard-coded 14, the relative percentage of source files, and in how many places for each.

Also, what is easy depends on many factors; what is easy for me may be a big problem to you, for example, or (improbably :->) vice versa. For me using find(1), xargs(1) and egrep(1) is easy, for example, and not even too time consuming :-) :-) :-). The difficult problems are others, even if somebody may seem daunted by the consequences of changing a fairly old and well understood symbolic constant.

As to the real problems, another example; from evidence easily available using any AT&T, Sun or Dec (just to mention three big names) UNIX kernel, doing trivial tuning or even just avoiding gross misdesigns of page (and working set) replacement kernel modules and of the programs that run under them seems to be impossibly difficult (for those manufacturers, at least) or to require an inordinate number of releases.

(spoiler: actually paging or swapping module design is not an easy task for anybody, especially if compared with merely changing the file name length leaving the directory structure the same. It is a bit easier though than what would be apparent from the problems that AT&T, Sun and Dec (and many others) seem to have with it).
-- Piercarlo "Peter" Grandi | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk Dept of CS, UCW Aberystwyth | UUCP: ...!mcsun!ukc!aber-cs!pcg Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk
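The post proposes find(1), xargs(1) and egrep(1) for the search. The per-line test such a search applies can be sketched in C, the language of the code being searched. This is a crude heuristic under stated assumptions: it flags 14 inside comments and strings too, and it misses limits written as 15 or DIRSIZ+1. The names has_literal_14 and token_char are made up for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Nonzero if `c` could be part of an identifier or a longer number. */
static int token_char(char c)
{
    return c != '\0' && (isalnum((unsigned char)c) || c == '_' || c == '.');
}

/* Returns 1 if `line` contains the decimal constant 14 as a token of its
 * own, i.e. not embedded in a name or number such as buf14, 140 or 0x14. */
int has_literal_14(const char *line)
{
    size_t i;

    assert(line != NULL);
    for (i = 0; line[i] != '\0'; i++) {
        if (line[i] == '1' && line[i + 1] == '4') {
            char prev = (i > 0) ? line[i - 1] : '\0';
            char next = line[i + 2];  /* at worst the terminating NUL */

            if (!token_char(prev) && !token_char(next))
                return 1;
        }
    }
    return 0;
}
```

Run over each source line, a check like this yields a list of candidates that still has to be read by hand, which is the "inspection or searching" Jeff Beadles describes earlier in the thread.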
machina@uts.amdahl.com (Miguel A. Ramirez) (09/26/90)
In article <PCG.90Sep21142146@odin.cs.aber.ac.uk> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
*On 19 Sep 90 06:18:17 GMT, machina@uts.amdahl.com (Miguel A. Ramirez) said:
*
* [ ... on how many user programs have a hard-coded 14 for the
* max length of file name in a System V environment, and thus
* would not respond to a change in a #define ... ]
*
*
*Maybe I should have written 'an easy (relative to the other
*alternatives)
Yes, this would have been better.
[...]
*
*machina> No? But I thought it was an easy hack? --
*
*You seem so sure, you must have done so. So please, for our information,
*post a list of commands and libraries in the standard System V
*distribution that have a hard-coded 14, the relative percentage of
*source files, and in how many places for each.
Ah Piercarlo, I'd love to! Especially since I have to give this info to our
test group. But company policy prohibits me from giving
away valuable information. Trust me, if you ever have the pleasure
of working on an Amdahl running UTS 2.1 or greater you'll see what I mean.
--
Miguel A. Ramirez, | machina@uts.amdahl.com | {sun,uunet}!amdahl!machina