WALLI%7178.gm@hac2arpa.hac.com (10/11/90)
Submitted-by: WALLI%7178.gm@hac2arpa.hac.com
The Context for Programming Language Independence for POSIX
Stephen R. Walli - walli@osmcl1.gm.hac.com
EDS of Canada, Ltd.
1 INTRODUCTION
Programming Language Independent Specification (LIS) of POSIX has
become a hot topic within the IEEE P1003 Working Groups and ISO WG15
(POSIX). Depending on one's point of view, it will either make the
POSIX family of standards more robust and usable or make them
completely unusable while seriously delaying the process. What I hope
to accomplish here is to present all of the relevant concerns and
information in one place, so as to provoke ideas and discussion which
will prove fruitful in the upcoming P1003 Seattle meeting and the
subsequent WG15 meeting on Orcas Island.
The standard disclaimers apply. All views expressed are the author's
and do not necessarily represent the opinions of the IEEE P1003
Working Group members, ISO WG15 or the author's employers. I would
like to thank Paul Rabin for reading this through, catching some of my
oversights and helping me clarify some of my statements. POSIX is a
registered trademark of the IEEE. UNIX is a registered trademark of
AT&T.
2 WHAT EXACTLY IS THE POINT OF LIS?
I will not provide any of the historical reasons/arguments/discussions
between ISO and the IEEE since all I know is hearsay, and I would not
want to raise anyone's ire if I've misunderstood any of the facts.
Re-iterating these discussions also serves no real purpose in
accomplishing the task at hand. It is sufficient to say that the
direction to accomplish the LIS work is coming from ISO and TCOS-SS
has agreed to the work.
The directive is to write the POSIX interfaces in a programming
language independent way, such that the functionality and behaviour
are completely described, but no language semantics are introduced.
This then frees up language bindings groups to map to the interface
specification in a way most natural for their particular language.
Describing the service in a language independent manner also serves to
provide a more rigorous definition of the service [1].
Many will argue that POSIX is a C language standard interface to UNIX
system services, and all of the noise about any other programming
language binding or any other operating system is immaterial. I've
both seen this and heard it voiced in the occasional dark corner at
P1003 Working Group meetings.
While POSIX's roots are most certainly C interfaces to UNIX system
services, the market has driven POSIX beyond those roots. Many other
language groups have a vested interest in writing portable
applications and want access to operating system services in a
portable way [2]. The US government commitment to Ada and the amount
of government funded work in FORTRAN have created the need for two
POSIX Working Groups to produce their respective bindings to 1003.1
services. As the commercial market interest in Open Systems
grows, I have no doubt we will eventually see a COBOL binding. I
would be very surprised if there isn't someone at IBM already working
in this direction.
The fact remains that the LIS of POSIX is required work for the
international acceptance of POSIX, and is here to stay.
3 WHAT ARE PAUL AND STEPHE DOING?
Paul Rabin and I are working on the methods document [1] for producing
a language independent specification of POSIX. Paul found time to do
all the real work of compiling and editing the document, while I acted
as critical reviewer and chief nit-picker.
The work is based on documents received from ISO/IEC JTC1/SC22/WG11
which is defining a set of methods for creating abstract, programming
language independent procedure specifications [3] [4].
The method's objectives are:
o to meet the ISO requirement of producing a LIS of POSIX,
while adhering to their guidelines on developing these
specifications.
o to facilitate the development of language bindings from base
LIS.
o to facilitate the development of base LIS which are
sufficiently robust so as to ensure a common recognizable
functionality in the bindings.
Specific non-goals include:
o Interoperability between modules written in different
languages bound within the same executable image, or
interoperability between applications written in different
languages using common services, including data interchange.
o Incorporating formal description techniques.
o Ensuring the portability of language or binding
implementations.
o Providing a machine translatable language-independent binding
description language.
We discovered that interoperability between applications written in
different programming languages cannot be ensured within the current
scope. The general formula presented by WG11 for producing
language-independent procedure specifications is to model the
interface using abstract data types; each binding then defines its
mapping of real data types to these abstract types.
Interoperability fails when certain abstract opaque types, process id
or file descriptor for example, are mapped by different languages to
different types. What may be effectively mapped to a pointer in one
language cannot be supported by another language which does not
understand pointers. The second language must map the opaque type
differently, to the detriment of interoperability.
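The mapping problem can be illustrated with a minimal C sketch. It is an illustration only: the names desc_a, desc_b, open_a, open_b and the slot table are hypothetical and come from neither the methods document nor 1003.1. Both mappings are written in C here purely to keep the example in one language. Two bindings map the same abstract descriptor type to different concrete types, one to a small integer and one to a pointer.

```c
#include <assert.h>
#include <stddef.h>

#define SLOT_COUNT 4

/* Shared system-side state: one record per open object. */
struct slot { int in_use; };
static struct slot slots[SLOT_COUNT];

/* Hypothetical binding A: the abstract descriptor is a small integer. */
typedef int desc_a;

/* Hypothetical binding B: the abstract descriptor is a pointer. */
typedef struct slot *desc_b;

/* Claim the lowest free slot; returns its index, or -1 if none is free. */
static int claim_slot(void)
{
    for (int i = 0; i < SLOT_COUNT; i++) {
        if (!slots[i].in_use) {
            slots[i].in_use = 1;
            return i;
        }
    }
    return -1;
}

/* Binding A: the "no descriptor" value is -1. */
desc_a open_a(void)
{
    return claim_slot();
}

/* Binding B: the "no descriptor" value is NULL. */
desc_b open_b(void)
{
    int i = claim_slot();
    return i < 0 ? NULL : &slots[i];
}

int valid_a(desc_a d)
{
    return d >= 0 && d < SLOT_COUNT && slots[d].in_use;
}

int valid_b(desc_b d)
{
    for (int i = 0; i < SLOT_COUNT; i++) {
        if (d == &slots[i])
            return slots[i].in_use;
    }
    return 0;
}

/* Recover the slot index behind a binding B descriptor. */
int index_of_b(desc_b d)
{
    assert(valid_b(d));
    return (int)(d - slots);
}
```

Each binding is a faithful mapping of the abstract type, yet a value of desc_a means nothing to code written against desc_b. Even the "invalid" value differs (-1 against NULL), which is the kind of mismatch that defeats interoperability between languages.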
In retrospect, this is not unreasonable. POSIX's goal is to ensure the
portability of an application using operating system services across
multiple implementations at the source level. It makes no effort to
ensure interoperability between programming languages, nor should that
be within the scope of the standard.
The method defines a model which contains data types, procedures on
those types, and constants. The objects that the system services act
upon are modelled by their abstract types. The procedures (services)
become the operations on the data types.
Each operating system service interface is presented as an abstract
procedure, with input parameters, output parameters and the notion of
a completion status. Note that completion status does not necessarily
refer to a returned value; it could just as easily be a raised
exception, a signal, a return value of a function, an output parameter
of a procedure, or any other entity you can imagine.
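A small C sketch may make the notion of completion status concrete. Everything in it is an assumption made for illustration: the service abstract_query_size, the status names and the two binding functions are hypothetical and are not taken from the methods document or from 1003.1. It assumes only that <errno.h> defines EBADF, as it does on POSIX systems. The same abstract procedure is bound twice, with the completion status surfacing once as the function's return value and once as the 1003.1 C convention of a -1 result plus errno.

```c
#include <errno.h>

/* Completion status of the hypothetical abstract procedure. */
enum lis_status { LIS_OK = 0, LIS_BAD_DESCRIPTOR = 1 };

/*
 * Model of the abstract procedure: one input parameter (fd), one
 * output parameter (size_out) and a completion status.
 */
enum lis_status abstract_query_size(int fd, long *size_out)
{
    static const long sizes[] = { 10, 20, 30 };

    if (fd < 0 || fd >= 3)
        return LIS_BAD_DESCRIPTOR;
    *size_out = sizes[fd];
    return LIS_OK;
}

/*
 * Binding style 1: the completion status is the function's return
 * value and the output travels through a pointer parameter.
 */
enum lis_status query_size_status(int fd, long *size_out)
{
    return abstract_query_size(fd, size_out);
}

/*
 * Binding style 2: the output is the function's return value and the
 * completion status is signalled by -1 with errno set.
 */
long query_size_errno(int fd)
{
    long size;

    if (abstract_query_size(fd, &size) != LIS_OK) {
        errno = EBADF;
        return -1;
    }
    return size;
}
```

Both functions expose the same functionality. A binding for a language with exceptions could map LIS_BAD_DESCRIPTOR to a raised exception instead, without changing the abstract procedure.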
The methods document goes on to suggest guidelines for both base
standard and language binding developers.
The methods document has been updated since Danvers and will be
presented again in Seattle. It will be put to a mock ballot sometime
after Seattle. Donn Terry is managing the ballot list.
One interesting example of a similar informal method that I've seen
recently is the circulation of ORKID Draft 2.1 [5] in a LIS form with
a C binding. ORKID isn't as complex as POSIX, but the draft serves as
an interesting and complete example. A C binding accompanies the
draft as an appendix, formatted tersely as a C header file. I would
be very interested to see a FORTRAN or Ada binding to the draft, if
one exists.
The draft has the same problem with language interoperability that we
discovered with our method, in that there is considerable room for
choosing language specific data types to match the opaque types. They
go as far as to allow implementations within a language to specify
their own data types. I haven't spent enough time with the draft to
be able to comment on whether this hurts networked applications, or
whether the procedure interface deals with this behind the application
developer's back. It is still a valuable example.
Paul is managing a mailing list for LIS related issues and discussion.
Messages for distribution to the whole list should be sent to
posix-lis@osf.org or uunet!osf.org!posix-lis. Requests for updates to
the list should be sent to posix-lis-request@osf.org.
4 PEOPLE ISSUES
There are a number of people issues surrounding the LIS which should
be understood, because the LIS sometimes becomes an emotionally
debated topic. I have tried to state them without bias and to avoid
completely the finger pointing arguments which sometimes occur.
4.1 People Issue #1
Many people have devoted considerable effort to building the
current 1003.1 standard and the draft documents which are balloting or
near ballot. There is ownership and sweat built into all of the
documents. A perception exists that the ISO mandate to produce LIS of
the services destroys these documents. It does not. There is a
desire to change the documents to produce the best possible standard,
yet backwards conformance to the current work has always been a goal.
These documents will change in future revisions of POSIX. The
knowledge gained and information content is valid. We are discussing,
however, an effort that is far beyond simple reformatting of the
current documents.
ANY significant change at this point will inevitably meet with
resistance no matter how it's presented. This whole issue is very
analogous to 1003.3 requirements for providing test assertions at this
point. At the last P1003 meeting in Danvers, I had an opportunity to
spend time in the .1, .5 and .9 rooms, (as well as my home in .4)
discussing LIS issues. I think I'm beginning to understand how Roger
Martin (P1003.3 chair) feels any time he shows up in another working
group to explain test assertion requirements.
4.2 People Issue #2
Working Groups that thought they had ballotable documents are being
asked to fulfil additional requirements. These requirements entail
considerable extra effort. Base standards groups are being asked to
provide base LIS of their services, and the C language binding to the
LIS. Bindings groups are being asked to provide reformatted bindings
to a base LIS which doesn't yet exist. At the same time test
assertion requirements are being presented. Both of these areas are
perceived as being tedious and "boring". One Working Group actually
went on record saying they felt they would lose membership over these
issues.
For this work to be worthwhile it must be done completely and
accurately. This will require exacting effort, nothing like the
"exciting" work of arguing the functionality of a family of services.
This comes at the perceived end of work as a draft document prepares
to go to ballot.
5 THICK DOCUMENTS OR THIN - A USABILITY ISSUE
One of the debates currently being argued in P1003 is whether the
individual language bindings are thick documents or thin.
In the thick document scenario, there is a base document which
describes the abstract service interfaces, and each binding document
is a thick standalone document which will repeat the functional
descriptions, adjusted for the particular language. This camp's
followers are programmers with real experience in receiving a standard
on their desk and having to use it as a programming tool. The base
document becomes a tool only for language binding writers.
The thin document scenario has a base document describing abstract
service interfaces, but each thin binding will only include language
specific information. All appropriate functional descriptions are
pointed to in the base document by reference. This camp is the
"Standards aren't for People" crowd. Standards are only meant for
conformance testing for procurement. If a programmer actually had to
use the binding standard, they would also require the base standard
and would then work with a finger stuck in each book.
There are actually two separate issues hidden in the thick/thin
binding debate. The first issue is whether a binding is allowed to
repeat material contained in the base LIS. The second issue concerns
whether a binding provides a direct one-to-one mapping to the base, or
whether it can be creative and more directly map to the semantics of
the language being bound.
For the record, the P1003.5 (Ada) Working Group decided early in their
history to create a standalone document appropriate to the Ada
language. The P1003.9 (FORTRAN) Working Group chose to create a
binding which points to the "base" document, mapping its services
one-to-one as closely as possible.
We are working under the assumption that ISO subscribes to the thin
binding camp. Semantically, standards do not overlap. Standards are
allowed to refer to other standards. There are genuine and realistic
concerns with synchronizing standards documents if many documents
contain overlapping material.
For the record, I'll voice the following suggestion. The STANDARDS
themselves will be individually drafted and balloted documents as in
the thin binding camp. The LIS standard comes first. The binding
standard comes second. However, instead of merely pointing to
another document, or including its own interpretation of the contents
of the other standard, the binding directly embeds the LIS text. The
embedded LIS text is clearly delineated so as to be distinct
from the binding text, and only the binding text is ballotable in the
draft binding document. This would hopefully solve the usability
issue put forward by the thick document camp.
Think of it as a documentation analogy to software development.
Instead of subroutine calls pointing elsewhere, there are already
expanded "macro" calls to "speed" the understanding. (Publication
synchronization concerns become source control concerns similar to
different applications referencing the same "macro" library.) It would
simplify the synchronization issue.
Ultimately I believe the publication should be usable by mere mortals.
6 THE CASE FOR RIGOROUS FORMAL METHODS AND THE CASE AGAINST
Another hot debate is the level of rigor required by the LIS. Our
understanding is that natural language descriptions of the services
are sufficient for the current LIS of POSIX. It is explicitly stated
in the draft methods document that we are not pursuing a formal method
of specification for the standard.
There seems to be a lack of experience with, and standardization of,
formal methods within the standards community. Little work has been
done to formally specify standards. (Now that I've publicly made this
statement, I'm sure I'll be accosted by everyone next week who has
seen anything even remotely looking like a formal standard.) I base
this statement on three P1003 meetings' worth of LIS BOFs where
everyone is quick to suggest their favourite formal method, but there
is never anyone who has taken the trouble to bring an example of it
used to specify a standard. Please do. The author welcomes all
related information.
A recent issue of IEEE Software had a very reasonable article on the
use of formal methods in the standards process [6]. This work came
out of a working group formed by the British Computer Society (BCS) to
address the lack of informed opinion on the matter. They outline
briefly the reasons for using formal methods, a few examples of
experiments with using formal methods on standards, and finish with a
set of guidelines. These guidelines are by far the strongest part of
the article.
They also refer briefly to an ISO three-phase plan [7] to bring formal
methods into the standards process.
1. Phase 1 has the use of formal methods restricted due to lack
of experience and expertise. Work is done in parallel on a
formal specification of the standard, which hopefully
contributes to the robustness and clarity of the standard,
and the results are published as a SEPARATE report.
2. Phase 2 has seen the knowledge and experience base expanded
in the use of formal methods in standards creation, and the
work proceeds in parallel and is sufficient to be published
as an informative annex. (Note: This is not a balloted
NORMATIVE one, but an unballoted informative one.)
3. Phase 3 sees the standards developing body well versed in
formal methods and the formal description is the standard
with a natural language commentary.
One thing that the article warns against is the retrofitting of formal
methods to an existing standard, because it can often demonstrate the
lack of conceptual integrity of the original work.
Additionally they recommend choosing an appropriate formal method to
express the standard, depending on such factors as the proposed
standard's content, mathematical sufficiency and accessibility to the
standards forming group.
The primary formal methods that have been suggested are Z and VDM.
Z actually has a number of interesting examples to consider. Recent
work has been done to produce a formal specification of P1003.1 using
Z, and it was reported upon by Martin Kirk [8].
The report concluded that while the work was useful for finding "weasel
worded" areas of the standard, a large effort would be required to
continue it. Several other problems exist as well. Some of these
problems had to do with the complexity of POSIX, and its deliberate
areas of ambiguity. Other problems encountered had to do with the Z
notation and the choice of model.
Indeed this raises an area of concern with how far formal methods
should be applied to POSIX. POSIX has deliberate areas of ambiguity,
"weasel words", and unspecified behaviour. This is required so as to
allow the maximum number of implementations and to avoid restricting
or forcing implementations in unnecessary ways. POSIX
is a standard for portable operating system service interfaces. It is
not the specification of an operating system [9]. There are also
times when weasel words are the only way to arrive at consensus.
Another interesting example of Z in this area is a recent article by
J. Michael Spivey on using Z to specify a real-time kernel [10].
This is the specification of a minimal kernel and not an interface to
it. Spivey discusses several deficiencies in his specification of the
kernel, and addresses all of them at the risk of making his
specification more complex and less understandable. He does conclude
that using a formal specification is a valuable and beneficial tool
for answering questions about the kernel, but he then "suggests that
the idea of using a formal specification as a complete contract
between implementer and user is not very helpful." [11]
Indeed it has been sensibly pointed out that the use of formal methods
is beneficial to aiding understanding about the object being
specified, but that they need not be a complete specification [12]
[13] [14]. This certainly fits in with the ISO three-phase approach
to introducing formalism into standards. They never require a
complete formal specification without natural language commentary.
The Vienna Development Method (VDM) is also frequently suggested as a
candidate formal method. VDM has a similar flavour to Z but does not
have a facility similar to Z's schema calculus to allow simple
specifications to be built into complex ones.
This summarises all the current discussion I've discovered to date
concerning the actual use of formal methods to specify a standard
POSIX interface.
7 A BRIEF NOTE ON TESTING AND CONFORMANCE ISSUES
There are several testing issues about LIS of POSIX no matter how
formal the specification method. The following questions have been
raised.
How does one "test" a language independent specification? At first
glance, one doesn't. Test assertions are made only at the binding
level, to allow implementations to demonstrate conformance. This
certainly needs to be done.
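As an illustration of an assertion made at the binding level, here is a minimal C sketch. The wording of the assertion and the function name check_close_bad_descriptor are assumptions for this example and are not taken from any 1003.3 draft. The sketch assumes a system providing the 1003.1 C binding, so that <unistd.h> and EBADF are available. It checks that close() applied to a value that cannot be an open file descriptor fails with EBADF.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Assertion (paraphrased for this sketch): a call to close() with an
 * argument that is not a valid open file descriptor returns -1 and
 * sets errno to EBADF.
 *
 * Returns 1 if the implementation passes, 0 otherwise.
 */
int check_close_bad_descriptor(void)
{
    errno = 0;
    return close(-1) == -1 && errno == EBADF;
}
```

Note how much of the check belongs to the C binding rather than to the service: the -1 result and errno are binding conventions, while "closing something that is not open fails" is the part that could be stated against a LIS.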
But can formal or natural language assertions be made about the LIS,
which can be tested manually by argument and inspection, and which can
then act as a basis set of assertions used when building language
binding assertions?
From a different point of view, is there a set of assertions that can
be made about the LIS which can help determine how well a binding
reflects the base? How do bindings conform to the base? If a binding
becomes a one-to-one mapping of the base LIS, then it conforms
directly. If it does not completely map the base, or maps it
differently, how is conformance measured?
All of these questions need some thought, and I hope this generates
some creative feedback for next week.
8 SUMMARY
I hope I have presented completely and with as little bias as possible
the issues surrounding the language independent specification of
POSIX.
Hopefully at the BOF gatherings at P1003 and the WG15 Ad Hoc, many of
these issues can be resolved to everyone's satisfaction, with due
regard for the tremendous effort which has gone into building POSIX
to date.
I look forward to seeing everyone there.
9 REFERENCES
[1] Paul Rabin and Stephen Walli, "Draft TCOS-SS Programming Language
Independent Specification Methods", Draft 1, 15 July, 1990.
[2] Dominic Dunlop, comp.std.unix Volume 20, Number 110, USENET, 5
July, 1990.
[3] "Proposed DTR 10182 on: Information Processing Systems - Guidelines
for Language Bindings", ISO/IEC JTC1/SC22 N754, International
Standards Organization, Geneva.
[4] "Common Language-Independent Datatypes: Working Draft #3",
ISO/IEC JTC1/SC22/WG11 N162, International Standards Organization,
Geneva.
[5] ORKID Working Group, "ORKID - Open Real-time Kernel Interface
Definition, Draft 2.1", August 1990.
[6] David Blyth, et al, "The Case for Formal Methods in Standards",
IEEE Software, Volume 7, Number 5, September, 1990.
[7] "JTC1 Statement of Policy on Formal Description Techniques",
ISO/IEC JTC1 N145, and ISO/IEC JTC1/SC18 N1333, International
Standards Organization, Geneva, 1987. This reference was pointed to
in [6] and I have not yet been able to obtain a copy.
[8] Martin Kirk, "Z Specification of P1003.1", ISO/IEC JTC1/SC22/WG15
N115, International Standards Organization, Geneva, September, 1990.
[9] Donn Terry, "Suggested Response to JTC1/SC22/WG15 N115", Document
SC22/WG15 US TAG N146.
[10] J. Michael Spivey, "Specifying a Real-Time Kernel", IEEE
Software, Volume 7, Number 5, September, 1990.
[11] Ibid. p.27
[12] J. Michael Spivey, "The Z Notation: A Reference Manual",
Prentice Hall International, 1989.
[13] Anthony Hall, "Seven Myths of Formal Methods", IEEE Software,
Volume 7, Number 5, September, 1990.
[14] Jeannette M. Wing, "A Specifier's Introduction to Formal
Methods", Computer, Volume 23, Number 9, September 1990.
Volume-Number: Volume 21, Number 197