std-unix@ut-sally.UUCP (Moderator, John Quarterman) (10/26/86)
From: gwyn@brl.arpa (VLD/VMB) (Doug Gwyn)
Date: Mon, 20 Oct 86 10:33:00 EDT

The reason strcoll() expands a text string to as much as twice its original length for collating purposes, rather than mapping it to a lowest-common-denominator form (such as case folding) is because it is believed that the former can always be done successfully, whereas lowest-common-denominator same-length mapping is known to be inadequate. Note also that the actions of strcoll() depend on a dynamically-changeable selection of "locale" information. So strcoll() is a red herring in this debate.

UNIX variants that clear all but 7 bits in each char of a filename are examples of systems that try too hard to be "helpful" based on too limited a view of the world. They should be fixed, as I'm sure the Japanese have already suggested.

Arguments based on characteristics of the shell or of command-line option parsing are beside the point; we're talking about what the kernel does or should do about filenames.

The UNIX kernel was deliberately designed to be not much more than an I/O multiplexer. As with limited government, the theory is that the kernel should do only those things that cannot be done at the application level. This includes coordination of shared resources but NOT enforcement of technically unnecessary ideas about what is appropriate for applications to be doing. The fact that most UNIX implementations, including systems from Berkeley and ATTIS, do not fully adhere to this design philosophy is also irrelevant; they should perhaps be fixed.

Certainly a standard such as POSIX that establishes a minimum common environment has no business imposing limited application models across the board. If POSIX is done properly, it should be even more minimalist than 8th Edition UNIX.

Volume-Number: Volume 7, Number 81
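
As an illustration of the strcoll() point in the message above, here is a minimal sketch of why an expanding transform can succeed where a same-length mapping such as case folding cannot. It is not the draft-standard strcoll() and uses no locale information; `toy_xfrm`, `toy_collate`, and `fold_compare` are hypothetical names invented for this example, and the code assumes ASCII input that contains no byte with value 1. The transform writes one primary key per character (the folded letter), a separator, and then one secondary key per character (the case), so a plain strcmp() on the result orders by letter first and by case only to break ties.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical toy transform (an assumption for illustration only).
 * Output layout: n primary keys, one separator byte, n secondary keys.
 * Returns the transformed length, 2*n + 1, not counting the NUL.
 * If dstsize is too small, nothing is written.
 */
size_t toy_xfrm(char *dst, const char *src, size_t dstsize)
{
    size_t n = strlen(src);
    size_t need = 2 * n + 1;
    size_t i;

    if (dstsize < need + 1)
        return need;

    for (i = 0; i < n; i++) {
        unsigned char c = (unsigned char)src[i];
        dst[i] = (char)tolower(c);               /* primary: letter, case ignored */
        dst[n + 1 + i] = isupper(c) ? '2' : '1'; /* secondary: case as tie-break  */
    }
    dst[n] = '\001';   /* separator sorts below every primary key (assumed) */
    dst[need] = '\0';
    return need;
}

/* Compare two strings by running strcmp() on their expanded forms.
 * Returns 0 if memory cannot be allocated (a simplification). */
int toy_collate(const char *a, const char *b)
{
    size_t la = 2 * strlen(a) + 2;
    size_t lb = 2 * strlen(b) + 2;
    char *ka = malloc(la);
    char *kb = malloc(lb);
    int r = 0;

    if (ka != NULL && kb != NULL) {
        toy_xfrm(ka, a, la);
        toy_xfrm(kb, b, lb);
        r = strcmp(ka, kb);
    }
    free(ka);
    free(kb);
    return r;
}

/* Same-length "lowest common denominator" mapping: compare after case
 * folding. Strings that differ only in case become indistinguishable. */
int fold_compare(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        int ca = tolower((unsigned char)*a);
        int cb = tolower((unsigned char)*b);
        if (ca != cb)
            return ca - cb;
        a++;
        b++;
    }
    return (unsigned char)*a - (unsigned char)*b;
}
```

With these definitions, `fold_compare("apple", "Apple")` reports the two strings as equal, so the folded form cannot give them a stable relative order, while `toy_collate` places "apple" before "Apple" and both before "banana". The cost is the expansion the message describes: the toy key is roughly twice the length of its input.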