[mod.std.unix] case mapping

std-unix@ut-sally.UUCP (Moderator, John Quarterman) (10/26/86)

From: gwyn@brl.arpa (VLD/VMB) (Doug Gwyn)
Date:     Mon, 20 Oct 86 10:33:00 EDT

The reason strcoll() expands a text string to as much as twice its
original length for collating purposes, rather than mapping it to
a lowest-common-denominator form (such as case folding), is that
the former is believed to be always achievable, whereas a
lowest-common-denominator same-length mapping is known to
be inadequate.  Note also that the actions of strcoll() depend on
a dynamically-changeable selection of "locale" information.  So
strcoll() is a red herring in this debate.
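
To make the "expand, then compare" idea concrete, here is a minimal
sketch.  It assumes the interface of later ISO C, where the
transforming role described above is played by strxfrm() and
strcoll() itself only compares; the helper name compare_by_key() is
hypothetical and appears in no standard.  The key length is not
assumed to be twice the input: the code asks strxfrm() for the size
it needs.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/*
 * Compare two strings by first transforming each into a collation
 * key under the current LC_COLLATE locale, then comparing the keys
 * bytewise.  Stores a strcmp-style value in *result.
 * Returns 0 on success, -1 if memory could not be allocated.
 * (Hypothetical helper, for illustration only.)
 */
int compare_by_key(const char *a, const char *b, int *result)
{
    assert(a != NULL && b != NULL && result != NULL);

    /* With a zero-length buffer, strxfrm() only reports the key size. */
    size_t na = strxfrm(NULL, a, 0) + 1;
    size_t nb = strxfrm(NULL, b, 0) + 1;

    char *ka = malloc(na);
    char *kb = malloc(nb);
    int status = -1;

    if (ka != NULL && kb != NULL) {
        strxfrm(ka, a, na);          /* key may be longer than the input */
        strxfrm(kb, b, nb);
        *result = strcmp(ka, kb);    /* same ordering as strcoll(a, b) */
        status = 0;
    }

    free(ka);
    free(kb);
    return status;
}
```

A caller would first select a locale, for example with
setlocale(LC_COLLATE, ""), and the ordering reported can change when
the locale does, which is the dynamic dependence noted above.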

UNIX variants that clear all but 7 bits in each char of a filename
are examples of systems that try too hard to be "helpful" based on
too limited a view of the world.  They should be fixed, as I'm sure
the Japanese have already suggested.
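
A small sketch of why clearing the high bit is harmful: two distinct
8-bit names can collapse into the same 7-bit name.  This is not taken
from any actual kernel source; strip_to_7_bits() and
names_collide_after_strip() are hypothetical names used only to
illustrate the effect.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Clear the top bit of every byte in name, in place, up to its
   original terminating NUL. */
void strip_to_7_bits(char *name)
{
    assert(name != NULL);
    for (unsigned char *p = (unsigned char *)name; *p != '\0'; ++p)
        *p &= 0x7F;
}

/*
 * Report whether two names become identical once stripped.
 * Returns 1 if they collide, 0 if they stay distinct,
 * -1 if memory could not be allocated.
 */
int names_collide_after_strip(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    size_t la = strlen(a) + 1;
    size_t lb = strlen(b) + 1;
    char *ca = malloc(la);
    char *cb = malloc(lb);
    int verdict = -1;

    if (ca != NULL && cb != NULL) {
        memcpy(ca, a, la);           /* work on copies, keep inputs intact */
        memcpy(cb, b, lb);
        strip_to_7_bits(ca);
        strip_to_7_bits(cb);
        verdict = (strcmp(ca, cb) == 0) ? 1 : 0;
    }

    free(ca);
    free(cb);
    return verdict;
}
```

For instance, the one-byte name 0xC1 and the name "A" (0x41) differ
only in the top bit, so after stripping they are the same name and
one file would shadow the other.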

Arguments based on characteristics of the shell or of command-line
option parsing are beside the point; we're talking about what the
kernel does or should do about filenames.

The UNIX kernel was deliberately designed to be not much more than
an I/O multiplexer.  As with limited government, the theory is that
the kernel should do only those things that cannot be done at the
application level.  This includes coordination of shared resources
but NOT enforcement of technically unnecessary ideas about what is
appropriate for applications to be doing.

The fact that most UNIX implementations, including systems from
Berkeley and ATTIS, do not fully adhere to this design philosophy
is also irrelevant; they should perhaps be fixed.  Certainly a
standard such as POSIX that establishes a minimum common environment
has no business imposing limited application models across the board.
If POSIX is done properly, it should be even more minimalist than
8th Edition UNIX.


Volume-Number: Volume 7, Number 81