jsh@usenix.org (Jeffrey S. Haemer) (01/06/90)
From: Jeffrey S. Haemer <jsh@usenix.org>
An Update on UNIX* and C Standards Activities
December 1989
USENIX Standards Watchdog Committee
Jeffrey S. Haemer, Report Editor
IEEE 1003.2: Shell and tools Update
Randall Howard <rand@mks.com> reports on the October 16-20, 1989
meeting in Brussels, Belgium:
Background on POSIX.2
The POSIX.2 standard deals with the shell programming language and
utilities. Currently, it is divided into two pieces:
+ POSIX.2, the base standard, deals with the basic shell
programming language and a set of utilities required for
application portability. Application portability essentially
means portability of shell scripts and thus excludes most
features that might be considered interactive. In an analogy to
the ANSI C standard, the POSIX.2 shell command language is the
counterpart of the C programming language, while the utilities
play, roughly, the role of the C library. POSIX.2 also
standardizes command-line and function interfaces related to
certain POSIX.2 utilities (e.g., popen, regular expressions,
etc.). [Editor's note - This document is also known as "Dot 2
Classic".]
+ POSIX.2a, the User Portability Extension or UPE, is a supplement
to the base POSIX.2 standard; it will eventually be an optional
chapter of a future draft of the base document. The UPE
standardizes commands, such as screen editors, that might not
appear in shell scripts but are important enough that users must
learn them on any real system. It is essentially an interactive
standard that attempts to reduce retraining costs incurred by
system-to-system variation.
Some utilities have interactive as well as non-interactive
features. In such cases, the UPE defines extensions from the base
POSIX.2 command. An example is the shell, for which the UPE
defines job control, history, and aliases. Features used both
interactively and in scripts tend to be defined in the base
standard.
__________
* UNIX is a registered trademark of AT&T in the U.S. and other
countries.
December 1989 Standards Update IEEE 1003.2: Shell and tools
In my opinion, the biggest current problem with the UPE is that it
lacks a coherent view: it's becoming a repository for features that
didn't make it into the base standard. For example, compress is in
the current UPE draft. It's hard to rationalize classifying file
formats as an "interactive" or "user portability" issue, yet the one
used by compress is specified in the UPE. It certainly doesn't fit in
with a view of the UPE as a standard that merely adds utility syntax
information (e.g., information that would allow users to type the same
command line to compress a file on any system). This highlights the
schizophrenic nature of the UPE: it addresses a range of different
needs that, taken together, do not appear to define a whole. Dot 2
Classic, to my taste, appears to have far more unified scope and
execution.
A second, related, problem with the UPE is that there appears to be
less enthusiasm for it than for the base standard. A number of
people, including me, understand the need for it, but it doesn't
appear to have the strategic importance of the base. [Editor's note -
The UPE is, frankly, controversial. Like 1201, the committee
undertook the UPE out of a fear that if they didn't, NIST would do the
job without them. Supporters note that although its utilities are
probably not necessary for portability of most software, it would be
unpleasant for programmers to do the porting work without them.
Detractors counter that POSIX was never intended to cover software
development and that the group is exceeding not only its charter, but
that of the entire 1003 committee.]
Status of POSIX.2 Balloting
POSIX.2 is in its second round of balloting. The first ballot, on
Draft 8, produced many objections that are only partially resolved by
Draft 9. Although there were only fifty-four pages of unresolved
objections remaining after Draft 9 was produced, the current balloting
round is not restricted to existing objections, and there will almost
certainly be many new ones. Remaining objections range from the
perennial war between David Korn and the UNIX Support Group over what
features should be required in the POSIX shell, through the resolution
of the incompatible versions (Berkeley and USG) of echo, to the
treatment of octal and symbolic modes in umask.
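To make the echo and umask items concrete, here is a minimal sketch.
It assumes a shell that provides printf and accepts both octal and
symbolic umask operands, along with umask -S; those are assumptions
about the eventual standard rather than anything quoted from Draft 9,
and the helper name print_no_newline is hypothetical.

```shell
#!/bin/sh
# Berkeley echo suppresses the trailing newline with a -n option, while
# USG echo uses a trailing \c escape. A script that needs the same
# behavior everywhere can sidestep echo and use printf instead.

# print_no_newline: write the arguments with no trailing newline.
# (Hypothetical helper name, for illustration only.)
print_no_newline() {
    printf '%s' "$*"
}

# The same permission mask expressed two ways.
umask 022                   # octal: the bits to turn off
octal_view=$(umask)         # the mask as an octal number
symbolic_view=$(umask -S)   # the mask as the permissions still allowed

umask u=rwx,g=rx,o=rx       # symbolic: the permissions to leave on
after_symbolic=$(umask)     # same octal value as before

print_no_newline "octal: $octal_view symbolic: $symbolic_view"
printf '\n'
```

Both spellings describe the same mask; the balloting dispute is over
which of them an implementation must accept and print.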
A digression to illustrate the kind of issues being addressed:
In March of 1989, a study group from 1003.2 met at AT&T to
resolve major objections to the shell specified in Draft 8 by the
two warring parties. This was a good place to hold the meeting,
since both parties are from AT&T: one led by David Korn of Bell
Labs, the author of the popular Korn Shell (KSH); the other, a
group led by Rob Pike of Bell Labs Research and the UNIX Support
Organization, advocating more traditional shells, like the System
V Bourne Shell and the Version 9 Research shell. Korn's group
contends that the shell should be augmented to make it possible
to efficiently implement large scripts totally within the shell
language. For example, while the more traditional camp views
shell functions as little more than command-level macros and uses
multiple scripts to modularize large shell applications, the Korn
shell views functions as a tool for modularizing applications,
and provides scoping rules to encourage this practice.
The two philosophies engender different opinions on issues such
as the scoping of traps within functions and the use of local
variables. Other contentious issues were the reservation of the
brace ({ }) characters as operators (rather than as the more
tricky "reserved words"), the promotion of tilde expansion to a
runtime expansion (like parameter expansion), and the issue of
escape sequences within echo/print/printf.
The meeting produced a false truce. I attended, and believe that
both parties had different views of the agreement that came out
of the meeting. As a result, Draft 9 produced balloting
objections from both parties and the dispute continues unabated.
Shades of POSIX.1 Tar Wars...
I suspect the next draft (Draft 10) will fail to achieve the consensus
required for a full-use standard.
This is a good thing. Useful features are still finding their way
into the document. (Draft 9 introduces hexdump, locale, localedef,
and more.) Also, the sheer size (almost 800 pages) of Draft 9 has
prevented many balloters from thoroughly reviewing the entire
document. Still, there is a stable core of utilities that is unlikely
to change much more than editorially; I predict the standard will
become final around Draft 12.
A mock ballot on Draft 4 of the UPE will probably start after the New
Orleans meeting in January, and the resulting Draft 5 will probably go
to a real ballot somewhere in summer to early fall of 1990. Although
many sections remain unwritten or unreviewed, the UPE is a much
smaller standard than POSIX.2 and should achieve consensus more
quickly.
Status of the Brussels Meeting
The Brussels meeting focused on the UPE, with only a summary report on
the status of balloting for the base standard. For most of the
meeting, small groups reviewed and composed UPE utility descriptions.
The changes generated at the meeting will appear in Draft 3.
The groups reviewed many utilities. The chapter on modifications to
the shell language (for interactive features) is now filled in, and
such utilities as lint89 (the recently renamed version of lint), more,
etc. are approaching completion. Still, much work remains.
[Editor's complaint - We think renaming common commands like lint
("lint89") and cc ("c89") is both cruel and unusual. We are not eager
to re-write every makefile and shell script that refers to cc or lint,
nor to re-train our fingers to find new keys each time the C compiler
changes. The name seems to have been coined by either a hunt-and-peck
typist, or someone who has longer and more accurate fingers than we
do. (Was it, perhaps, the work of Stu Feldman, author of f77?)
Moreover, replacing commands with newer versions is commonplace and
traditional in UNIX. Examples like "make", "troff", and "awk" spring
to mind. If an older version is kept on for die-hards, it's renamed
(e.g., otroff, oawk).
One Dot-Two member rebuffed our objections with the reply, "But, you
see, this isn't UNIX: it's POSIX." ]
Because the meeting was in Europe, attendance at the working group
meetings was lower than normal (20-25 rather than the normal 35-40 in
POSIX.2). Nevertheless, the choice of location served a purpose. The
meeting was held in Brussels to garner international support and
participation, particularly from the European Economic Community.
There were many EEC representatives at the background sessions on
POSIX and two or three European working group members in the POSIX.2
meetings who wouldn't normally have attended. Though it remains to be
seen what will come out of having met in Brussels, I am convinced that
the extra effort will prove to have been justified.
Volume-Number: Volume 18, Number 4
jsh@usenix.org (Jeffrey S. Haemer) (09/21/90)
Submitted-by: jsh@usenix.org (Jeffrey S. Haemer)
An Update on UNIX*-Related Standards Activities
September 1990
USENIX Standards Watchdog Committee
Jeffrey S. Haemer, Report Editor
IEEE 1003.2: Shell and tools
Randall Howard <rand@mks.com> reports on the July 16-20 meeting in
Danvers, MA:
Background on POSIX.2
The POSIX.2 standard deals with the shell programming language and
utilities. Currently, it is divided into two components:
+ POSIX.2, the base standard, deals with the basic shell
programming language and a set of utilities required for
application portability. Application portability essentially
means portability of shell scripts and thus excludes most
features that might be considered interactive. In an analogy to
the ANSI C standard, the POSIX.2 shell command language is the
counterpart to the C programming language, while the utilities
play, roughly, the role of the C library. In fact, because
POSIX.2 provides an interface to most of the features (and
possibly more) of POSIX.1, it might also be thought of as a
particular language binding to the soon-to-be language
independent version of that standard. POSIX.2 also standardizes
command-line and function interfaces related to certain POSIX.2
utilities (e.g., popen(), regular expressions, etc.), as
discussed in detail in the snitch report for the Snowbird
meeting. This part of POSIX.2, which was developed first, is
also known as ``Dot 2 Classic.''
+ POSIX.2a, the User Portability Extension or UPE, is a supplement
to the base POSIX.2 standard. Not a stand-alone document, it
will eventually be an optional chapter and a small number of
other revisions to a future draft of that base document. This
approach allows the adoption of the UPE to trail Dot 2 Classic
without delaying it. The UPE standardizes commands, such as vi,
that might not appear in shell scripts but are important enough
that users must learn them on any real system. It is essentially
an interactive standard that attempts to reduce retraining costs
caused by system-to-system variation.
__________
* UNIX(TM) is a Registered Trademark of UNIX System Laboratories in
the United States and other countries.
September 1990 Standards Update IEEE 1003.2: Shell and tools
Some utilities have interactive as well as non-interactive
features. In such cases, the UPE defines extensions from the
base POSIX.2 utility. An example is the shell, for which the UPE
defines job control, history, and aliases. Features used both
interactively and in scripts tend to be defined in the base
standard.
Together, Dot 2 Classic and the UPE will make up the International
Standards Organization's IS 9945/2 -- the second volume of the
proposed ISO three-volume standard related to POSIX.
Status of POSIX.2 Balloting
Draft 10 of Dot 2 Classic was sent out during July in a recirculation
ballot. Recirculation means that objections need only be considered
if they are existing unresolved objections or are based on new
material. Other objections will be considered at the whim of the
Technical Editor.
Draft 10 is an imposing, if not intimidating, 780 pages, made even
denser by the decision to remove much white space in a (vain) attempt
to save paper. Ballots are due by September 10. Unfortunately, the
recirculation ballot materials arrived at my organization on August
17th, giving our group barely three weeks to review this massive
document.
The technical editors and others working behind the scenes (Hal
Jespersen, Don Cragun, and others) have done an admirable job of
diff-marking changes and producing personalized lists of unresolved
objections for each balloter. In addition, all 96 pages of unresolved
objections are provided. However, the amount of new material that has
never been reviewed and the major reorganization means that Draft 10
bears much less resemblance to Draft 9 than one might hope. That,
combined with balloting on the UPE, has put many balloters -- myself
included -- in balloting overload.
If a recirculation simply means forming opinions on my (and other)
unresolved objections, then the time period is quite reasonable.
However, as I shall describe below, Draft 10 is so changed from the
previous drafts that it deserves to be read practically from cover to
cover, and the recirculation deadline does not provide adequate time
for that task. The changes fall into a number of categories:
+ New Utilities: For example, a superset of the traditional od
replaced the Draft 9 hexdump which was xd in Draft 8. Pathchk
and set -o noclobber have replaced create from Draft 9 and
validfnam and mktemp from Draft 8. Such examples demonstrate
that Draft 10 is not mature and needs more consideration to
achieve consensus.
+ Expanded Material: Previous descriptions of such utilities as
awk, sh, bc, etc., were neither sufficiently comprehensive nor
sufficiently complete to be of the quality demanded of a
standard. In the latest draft, these descriptions have been
fleshed out, and include much more detail on operator precedence,
interactions, subtle semantics, and so on. This is clearly a
step in the right direction, but adds to the job of reviewing
Draft 10.
+ Internationalization: While the localedef and locale utilities
remain, they have changed substantially. I personally support
including these features, but am concerned that these are being
designed during the balloting process which is, if anything,
worse than design-by-committee. Overall, balloting-group
reaction to these utilities ranges from impassioned pleas for
their removal to requests for greater functionality (complexity)
to handle ever more arcane aspects of the internationalization
problem.
+ Chapter 2: Chapter 2's front matter is substantially reorganized
and more voluminous. This chapter contains definitions, utility
syntax information, requirements imported from POSIX.1, the
definition of a locale, description of basic and extended regular
expressions, etc. Utility descriptions seem to be getting
shorter, with more and more pointers to Chapter 2. This is a
good trend, as long as balloters adequately consider the
chapter's technical contents.
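As an illustration of the New Utilities item above, the following is a
minimal sketch of how pathchk and set -o noclobber might together
cover the job that create, validfnam, and mktemp were meant to do. It
assumes the interfaces behave as commonly implemented (pathchk fails
for an unusable pathname; noclobber makes ">" refuse to overwrite an
existing file); the function name safe_create is hypothetical and does
not appear in any draft.

```shell
#!/bin/sh
# safe_create: create FILE holding one line of CONTENT, refusing to
# overwrite a file that already exists.
# (Hypothetical helper, for illustration only.)
safe_create() {
    file=$1
    content=$2

    # pathchk fails if the pathname could not be used on this system,
    # for example because a component is too long.
    pathchk -- "$file" || return 1

    # Run in a subshell so that noclobber does not leak into the
    # caller. With noclobber set, ">" fails if the file exists.
    (
        set -o noclobber
        printf '%s\n' "$content" > "$file"
    ) 2>/dev/null
}
```

A caller would write something like: safe_create "$dir/report" "first
line" || echo "refused". The second call on the same pathname returns
non-zero and leaves the file untouched.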
Status of POSIX.2a Balloting
The first formal ballot on POSIX.2a UPE Draft 5 was due in the IEEE
offices by August 16th. Unfortunately, the UPE is laced with
references to definitions and concepts largely defined in Chapter 2 of
Draft 10. I did not receive my Draft 10 until after the UPE balloting
was due to be returned. This hinders any attempt to review these two
documents as a single entity -- which is what they will eventually
become.
The UPE is starting to mature: it's converging. The major controversy
is scope -- as it has been throughout the UPE's entire life. This
draft aligns itself more closely to Dot-2-Classic in many ways, which
leads me to believe that combined review is essential to its
understanding.
A few utilities remain contentious:
+ nice, renice: These require underlying functionality absent from
POSIX.1, although POSIX.4 has setscheduler(), which allows
applications to set priority and scheduling algorithms.
Some working and balloting group members adamantly resist any
attempt to add utilities that are not implementable on top of a
bare POSIX.1. Others view the UPE as addressing what users type,
regardless of underlying implementation. I am in the latter
camp, not the least because other working groups, such as
POSIX.4, have not yet standardized a utility interface, leaving a
void which the much-maligned UPE group is most able to fill. (It
is telling that implementing df and ps is impossible using only
POSIX.1 functions, yet there is little opposition to including
either utility.)
+ ps: The description for this utility was an interesting amalgam
of two incompatible visions of how ps output should be
formatted -- that in Draft 4 and that in Draft 5. A correction
should have been issued during balloting, so that balloters could
concentrate on the real issues of what should be the scope of the
ps utility.
+ patch: This utility differs from many others; its origins are in
the public domain rather than in a traditional UNIX variant. As
a result, many people feel that patch is worthwhile, but not
mature enough to standardize.
+ lint89: This utility is optional, largely because it is
controversial for a number of reasons. Obviously, the very name
lint89 is painfully bureaucratic. Furthermore, many feel that
ANSI C makes it unnecessary; moreover, any remaining required
functionality rightfully belongs as an additional option in the
c89 (cc) utility. Some point to existing practice. But what is
existing practice when the utility's name is lint89? [Editor: On
the other hand, it may prove indispensable in detecting
portability problems in lex89- and yacc89-generated code.
Parenthetically, Draft 10 calls these lex and yacc, but that must
just be a temporary oversight; the utilities obligatorily have
ANSI C input and output. (One assumes we'll escape c89tags
because ctags can be made to work with both flavors.)]
+ compress: The inclusion of this utility remains controversial
because of the Unisys patent on the particular variant of
Lempel-Ziv compression used by traditional implementations of
this utility. The working group appears to be divided on the
subject of basing a standard on patented material -- no matter
what the licensing fees are. There is, however, general
agreement that it is preferable to remove compress entirely
rather than ``invent'' some new compression algorithm.
Therefore, it appears that a pax-like compromise, of having a
single interface to a number of competing formats or algorithms,
is not widely supported. [Editor: see Andrew Hume's X3B11 report
for another wrinkle on data compression.] Clearly, this issue
will have to be resolved with further information from Unisys
lawyers during the balloting process.
Status of the Danvers Meeting
The Danvers working group dealt with neither Dot 2 Classic nor the
UPE. Instead, at POSIX.3.2's request (that's the subgroup of Dot 3
producing test assertions for Dot 2), we met jointly to co-develop
test assertions for Dot 2 Classic. This work is a consequence of the
SEC's recent decision requiring each POSIX working group to develop
its own test assertions and ballot them with the standard. It also
stems from Dot 3's frustration over the (inadequate) way Dot 2
addressed testing. For example, automated testing of lp is
impossible; it can only be tested by a human test procedure. Our
working group should have explored the implications of this before
subjecting POSIX.3 to that task. (Some utilities can only be tested
manually, but the working group defining that utility should likely
put something to that effect in the Rationale or History of Decisions
Made to confirm to the testing people that they knew this.)
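To show the difference between what can and cannot be checked
unattended, here is a minimal sketch of an automated assertion. The
harness (assert_output) and the sample assertions are hypothetical and
are not taken from any POSIX.3 document; they only assume that
basename and dirname behave in the traditional way.

```shell
#!/bin/sh
# assert_output: run a command and compare its standard output with
# the expected text. Prints PASS or FAIL and returns 0 or 1.
# (Hypothetical harness, not taken from POSIX.3.)
assert_output() {
    description=$1
    expected=$2
    shift 2
    actual=$("$@" 2>/dev/null)
    if [ "$actual" = "$expected" ]; then
        printf 'PASS: %s\n' "$description"
        return 0
    fi
    printf 'FAIL: %s\n' "$description"
    return 1
}

# Assertions that a script can check without a person present.
assert_output 'basename strips the directory part' 'c' basename /a/b/c
assert_output 'dirname strips the last component' '/a/b' dirname /a/b/c

# By contrast, whether lp actually put ink on paper cannot be observed
# from a script, so that result has to be recorded by a person.
```

A utility such as lp has no output that a harness like this can
compare, which is why its assertions fall to a human test procedure.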
The three days of working with Dot 3 were a real learning experience
for our working group. Nonetheless, many of us had our fill of test
assertions that week.
I'm also concerned that a three-day meeting cost my company nearly as
much as a five-day meeting would have. In the future, I would prefer
to see schedules that make productive use of the entire working week.
Volume-Number: Volume 21, Number 120
gwyn@smoke.brl.mil (Doug Gwyn) (09/21/90)
Submitted-by: gwyn@smoke.brl.mil (Doug Gwyn)
In article <530@usenix.ORG> std-unix@uunet.uu.net writes:
-Submitted-by: jsh@usenix.org (Jeffrey S. Haemer)
- + lint89: This utility is optional, largely because it is
- controversial for a number of reasons. Obviously, the very name
- lint89 is painfully bureaucratic. Furthermore, many feel that
- ANSI C makes it unnecessary; moreover, any remaining required
- functionality rightfully belongs as an additional option in the
- c89 (cc) utility. Some point to existing practice. But what is
- existing practice when the utility's name is lint89? [Editor: On
- the other hand, it may prove indispensable in detecting
- portability problems in lex89- and yacc89-generated code.
- Parenthetically, Draft 10 calls these lex and yacc, but that must
- just be a temporary oversight; the utilities obligatorily have
- ANSI C input and output. (One assumes we'll escape c89tags
- because ctags can be made to work with both flavors.)]
I really do not understand the reasoning behind not just using the
names "cc", "lint", "lex", etc. The entire software generation system
needs to work together as an integrated whole. Now that there is an
official standard for C, any further standardization involving C should
be connected to standard C. Since the C standard is in almost all ways
upward-compatible so that "lint", "lex", etc. can be upgraded to support
standard C and still act as before when fed "old K&R C", so long as the
software generation system's C compiler understands lex/yacc/whatever
output there should be no issue here. From the standards point of view
there should currently be only one notion of what C is.
- D A Gwyn (sporadically acting X3J11/P1003 liaison)
Volume-Number: Volume 21, Number 121
willcox@urbana.mcd.mot.com (David A Willcox) (09/21/90)
Submitted-by: willcox@urbana.mcd.mot.com (David A Willcox)
In article <530@usenix.ORG> jsh@usenix.org (Jeffrey S. Haemer) writes:
>A few utilities remain contentious:
> + nice, renice: These require underlying functionality absent from
> POSIX.1, although POSIX.4 has setscheduler(), which allows
> applications to set priority and scheduling algorithms.
A point of clarification: These utilities, as defined in 1003.2a, do
NOT require any functionality that is not in 1003.1. Both can be
implemented on a bare-bones 1003.1 system as having no effect on
execution priority. The following, for example, is a valid shell script
implementation of nice:
	# Discard the priority option, then run the command unchanged.
	case $1 in
	-n)	shift; shift;;	# "-n increment": drop both words
	-*)	shift;;		# "-increment": drop one word
	esac
	exec "$@"
renice is a little more complicated, but not much. (It should just
have to check for valid arguments.)
So saying that you can't implement this on a 1003.1 system is not only
a red herring, it simply isn't true. Providing these utilities allows
well-mannered applications to make use of the priority manipulation
features that are already provided by most implementations.
David A. Willcox		"Just say 'NO' to universal drug testing"
Motorola MCD - Urbana		UUCP: ...!uiucuxc!udc!willcox
1101 E. University Ave.		INET: willcox@urbana.mcd.mot.com
Urbana, IL 61801		FONE: 217-384-8534
Volume-Number: Volume 21, Number 122
rsalz@bbn.com (Rich Salz) (09/22/90)
Submitted-by: rsalz@bbn.com (Rich Salz)
Reporting on 1003.2 (Shell and tools), in <503@usenix.org>
Randall Howard writes:
>+ patch: This utility differs from many others; its origins are in
> the public domain rather than in a traditional UNIX variants. As
> a result, many people feel that patch is worthwhile, but not
> mature enough to standardize.
I find this sentence totally amazing. Patch has been around far
longer, and is much more worthwhile, than more than 80% of what 1003
has been doing ever since they expanded beyond dot one. Can anyone
from the committee who holds this viewpoint offer a reasonable defense
of it?
/r$
Volume-Number: Volume 21, Number 123