[comp.std.unix] The Context for LIS of POSIX

WALLI%7178.gm@hac2arpa.hac.com (10/11/90)
Submitted-by: WALLI%7178.gm@hac2arpa.hac.com


     The Context for Programming Language Independence for POSIX

              Stephen R. Walli - walli@osmcl1.gm.hac.com
                         EDS of Canada, Ltd.




1  INTRODUCTION

Programming Language Independent  Specification  (LIS)  of  POSIX  has
become  a  hot topic within the IEEE P1003 Working Groups and ISO WG15
(POSIX).  Depending on one's point of view, it will  either  make  the
POSIX  family  of  standards  more  robust  and  usable  or  make them
completely unusable while seriously delaying the process.  What I hope
to  accomplish  here  is  to  present all of the relevant concerns and
information in one place, so as to provoke ideas and discussion  which
will  prove  fruitful  in  the  upcoming P1003 Seattle meeting and the
subsequent WG15 meeting on Orcas Island.

The standard disclaimers apply.  All views expressed are the  author's
and  do  not  necessarily  represent  the  opinions  of the IEEE P1003
Working Group members, ISO WG15 or the author's  employers.   I  would
like to thank Paul Rabin for reading this through, catching some of my
oversights and helping me clarify some of my statements.  POSIX  is  a
registered  trademark  of the IEEE.  UNIX is a registered trademark of
AT&T.



2  WHAT EXACTLY IS THE POINT OF LIS

I will not provide any of the historical reasons/arguments/discussions
between  ISO and the IEEE since all I know is hearsay, and I would not
want to raise anyone's ire if I've misunderstood any of the facts.  It
also  serves  no  real  purpose  in  accomplishing the task at hand to
re-iterate these discussions.   It  is  sufficient  to  say  that  the
direction  to  accomplish  the LIS work is coming from ISO and TCOS-SS
has agreed to the work.

The directive is to  write  the  POSIX  interfaces  in  a  programming
language  independent  way,  such that the functionality and behaviour
are completely described, but no language  semantics  are  introduced.
This  then  frees  up language bindings groups to map to the interface
specification in a way most natural  for  their  particular  language.
Describing the service in a language independent manner also serves to
provide a more rigorous definition of the service [1].

Many will argue that POSIX is a C language standard interface to  UNIX
system  services,  and  all  of  the noise about any other programming
language binding or any other operating system  is  immaterial.   I've
both  seen  this  and heard it voiced in the occasional dark corner at
P1003 Working Group meetings.

While POSIX' roots are most certainly  C  interfaces  to  UNIX  system

                                                                Page 2


services,  the  market has driven POSIX beyond those roots.  There are
many other language groups which have a  vested  interest  in  writing
portable  applications  which want access to operating system services
in a portable way [2].  The US government commitment to Ada,  and  the
amount  of  government funded work in FORTRAN has created the need for
two POSIX Working Groups  to  produce  their  respective  bindings  to
1003.1  services.   As  the commercial market interest in Open Systems
grows, I have no doubt we will eventually  see  a  COBOL  binding.   I
would  be very surprised if there isn't someone at IBM already working
in this direction.

The fact remains that the LIS  of  POSIX  is  required  work  for  the
international acceptance of POSIX, and is here to stay.



3  WHAT ARE PAUL AND STEPHE DOING?

Paul Rabin and I are working on the methods document [1] for producing
a  language independent specification of POSIX.  Paul found time to do
all the real work of compiling and editing the document, while I acted
as critical reviewer and chief nit-picker.

The work is based on documents received  from  ISO/IEC  JTC1/SC22/WG11
which  is defining a set of methods for creating abstract, programming
language independent procedure specifications [3] [4].

The method's objectives are:

      o  to meet the ISO requirement of  producing  a  LIS  of  POSIX,
         while  adhering  to  their  guidelines  on  developing  these
         specifications.

      o  to facilitate the development of language bindings from  base
         LIS.

      o  to  facilitate  the  development  of  base  LIS   which   are
         sufficiently  robust  so  as  to ensure a common recognizable
         functionality in the bindings.

Specific non-goals include:

      o  Interoperability  between  modules   written   in   different
         languages   bound   within  the  same  executable  image,  or
         interoperability between applications  written  in  different
         languages using common services, including data interchange.

      o  Incorporating formal description techniques.

      o  Ensuring   the   portability   of   language    or    binding
         implementations.

      o  Providing a machine translatable language-independent binding
         description language.

                                                                Page 3


We discovered that interoperability between  applications  written  in
different  programming  languages cannot be ensured within the current
scope.   The  general  formula  presented  by   WG11   for   producing
language-independent   procedure   specifications   is  to  model  the
interface using abstract data types, then  each  binding  defines  its
mapping of real data types to these abstract types.

Interoperability fails when certain abstract opaque types, process  id
or  file  descriptor for example, are mapped by different languages to
different types.  What may be effectively mapped to a pointer  in  one
language  cannot  be  supported  by  another  language  which does not
understand pointers.  The second language must  map  the  opaque  type
differently, to the detriment of interoperability.

In retrospect, this is not unreasonable.  POSIX' goal is to ensure the
portability  of an application using operating systems services across
multiple implementations at the source level.  It makes no  effort  to
ensure interoperability between programming languages, nor should that
be within the scope of the standard.

The method defines a model which contains data  types,  procedures  on
those  types, and constants.  The objects that the system services act
upon are modelled by their abstract types.  The procedures  (services)
become the operations on the data types.

Operating system service  interfaces  are  presented  as  an  abstract
procedure,  with input parameters, output parameters and the notion of
a completion status.  Note that completion status does not refer to  a
returned  value, but could just as easily refer to a raised exception,
a signal, a return value of a  function,  an  output  parameter  of  a
procedure, or any other entity you can imagine.

The methods document goes on  to  suggest  guidelines  for  both  base
standard and language binding developers.

The methods document has  been  updated  since  Danvars  and  will  be
presented  again in Seattle.  It will be put to a mock ballot sometime
after Seattle.  Donn Terry is managing the ballot list.

One interesting example of a similar informal method  that  I've  seen
recently  is the circulation of ORKID Draft 2.1 [5] in a LIS form with
a C binding.  ORKID isn't as complex as POSIX, but the draft serves as
an  interesting  and  complete  example.   A C binding accompanies the
draft as an appendix, formatted tersely as a C header file.   I  would
be  very  interested  to see a FORTRAN or Ada binding to the draft, if
one exists.

The draft has the same problem with language interoperability that  we
discovered  with  our  method,  in that there is considerable room for
choosing language specific data types to match the opaque types.  They
go  as  far  as  to allow implementations within a language to specify
their own data types.  I haven't spent enough time with the  draft  to
be  able  to  comment on whether this hurts networked applications, or
whether the procedure interface deals with this behind the application
developer's back.  It is still a valuable example.

                                                                Page 4


Paul is managing a mailing list for LIS related issues and discussion.
Messages  for  distribution  to  the  whole  list  should  be  sent to
posix-lis@osf.org or uunet!osf.org!posix-lis.  Requests for updates to
the list should be sent to posix-lis-request@osf.org.



4  PEOPLE ISSUES

There are a number of people issues surrounding the LIS, which  should
be  understood,  because  the  LIS  sometimes  becomes  an emotionally
debated topic.  An effort has been made to state them  unbiasedly  and
to  completely  avoid  any  of  the  finger  pointing  arguments which
sometimes occur.



4.1  People Issue #1

Many people have devoted  a  considerable  effort  into  building  the
current 1003.1 standard and the draft documents which are balloting or
near ballot.  There is ownership and  sweat  built  into  all  of  the
documents.  A perception exists that the ISO mandate to produce LIS of
the services destroys these documents.   It  does  not.   There  is  a
desire  to change the documents to produce the best possible standard,
yet backwards conformance to the current work has always been a goal.

These documents  will  change  in  future  revisions  of  POSIX.   The
knowledge gained and information content is valid.  We are discussing,
however, an effort that is  far  beyond  simple  reformatting  of  the
current documents.

ANY significant  change  at  this  point  will  inevitably  meet  with
resistance  no  matter  how  it's presented.  This whole issue is very
analogous to 1003.3 requirements for providing test assertions at this
point.   At the last P1003 meeting in Danvars, I had an opportunity to
spend time in the .1, .5 and .9 rooms, (as well  as  my  home  in  .4)
discussing  LIS issues.  I think I'm beginning to understand how Roger
Martin (P1003.3 chair) feels any time he shows up in  another  working
group to explain test assertion requirements.



4.2  People Issue #2

Working Groups that thought they had ballotable  documents  are  being
asked  to  fulfil  additional requirements.  These requirements entail
considerable extra effort.  Base standards groups are being  asked  to
provide  base LIS of their services, and the C language binding to the
LIS.  Bindings groups are being asked to provide reformatted  bindings
to  a  base  LIS  which  doesn't  yet  exist.   At  the same time test
assertion requirements are being presented.  Both of these  areas  are
perceived  as  being tedious and "boring".  One Working Group actually
went on record saying they felt they would lose membership over  these
issues.

                                                                Page 5


For this work  to  be  worthwhile  it  must  be  done  completely  and
accurately.   This  will  require  exacting  effort.  Nothing like the
"exciting" work of arguing the functionality of a family of  services.
This  comes  at the perceived end of work as a draft document prepares
to go to ballot.



5  THICK DOCUMENTS OR THIN - A USABILITY ISSUE

One of the debates currently being argued  in  P1003  is  whether  the
individual language bindings are thick documents or thin.

In the thick  document  scenario,  there  is  a  base  document  which
describes  the  abstract service interfaces, and each binding document
is a thick  standalone  document  which  will  repeat  the  functional
descriptions,  adjusted  for  the  particular  language.   This camp's
followers are programmers with real experience in receiving a standard
on  their  desk  and having to use it as a programming tool.  The base
document becomes a tool only for language binding writers.

The thin document scenario has a  base  document  describing  abstract
service  interfaces,  but each thin binding will only include language
specific information.  All  appropriate  functional  descriptions  are
pointed  to  in  the  base  document  by  reference.  This camp is the
"Standards aren't for People" crowd.  Standards  are  only  meant  for
conformance  testing for procurement.  If a programmer actually had to
use the binding standard, they would also require  the  base  standard
and would then work with a finger stuck in each book.

There are actually  two  separate  issues  hidden  in  the  thick/thin
binding  debate.   The  first issue is whether a binding is allowed to
repeat material contained in the base LIS.  The second issue  concerns
whether a binding provides a direct one-to-one mapping to the base, or
whether it can be creative and more directly map to the  semantics  of
the language being bound.

For the record, the P1003.5 (Ada) Working Group decided early in their
history  to  create  a  standalone  document  appropriate  to  the Ada
language.  The P1003.9 (FORTRAN)  Working  Group  chose  to  create  a
binding  which  points  to  the  "base" document, mapping its services
one-to-one as closely as possible.

We are working under the assumption that  ISO  ascribes  to  the  thin
binding  camp.  Semantically, standards do not overlap.  Standards are
allowed to refer to other standards.  There are genuine and  realistic
concerns  with  synchronizing  standards  documents  if many documents
contain overlapping material.

For the record, I'll voice the following  suggestion.   The  STANDARDS
themselves  will  be individually drafted and balloted documents as in
the thin binding camp.  The LIS standard  comes  first.   The  binding
standard  comes  second.   However,  instead  of  merely pointing to a
another document, or including its own interpretation of the  contents
of  the other standard, the text of the LIS is directly embedded.  The

                                                                Page 6


embedded LIS text is clearly delineated so as to be  clearly  separate
from  the binding text, and only the binding text is ballotable in the
draft binding document.  This  would  hopefully  solve  the  usability
issue put forward by the one document camp.

Think of it  as  a  documentation  analogy  to  software  development.
Instead  of  subroutine  calls  pointing  elsewhere, there are already
expanded "macro" calls to  "speed"  the  understanding.   (Publication
synchronization  concerns  become  source  control concerns similar to
different applications referencing the same "macro" library.) It would
simplify the synchronization issue.

Ultimately I believe the publication should be usable by mere mortals.



6  THE CASE FOR RIGOROUS FORMAL METHODS AND THE CASE AGAINST

Another hot debate is the level of rigor required  by  the  LIS.   Our
understanding  is  that  natural language descriptions of the services
are sufficient for the current LIS of POSIX.  It is explicitly  stated
in the draft methods document that we are not pursuing a formal method
of specification for the standard.

There seems to be a lack of experience and standardization  of  formal
methods  within the standards community.  Little work has been done to
formally  specify  standards.   (Now  that  I've  publicly  made  this
statement,  I'm  sure  I'll  be accosted by everyone next week who has
seen anything even remotely looking like a formal  standard.)  I  base
this  statement  on  three  P1003  meetings  worth  of  LIS BOFs where
everyone is quick to suggest their favourite formal method, but  there
is  never  anyone  who has taken the trouble to bring an example of it
used to specify a standard.   Please  do.   The  author  welcomes  all
related information.

A recent issue of IEEE Software had a very reasonable article  on  the
use  of  formal  methods in the standards process [6].  This work came
out of a working group formed by the British Computer Society (BCS) to
address  the  lack  of  informed  opinion on the matter.  They outline
briefly the reasons for  using  formal  methods,  a  few  examples  of
experiments  with using formal methods on standards, and finish with a
set of guidelines.  These guidelines are by far the strongest part  of
the article.

They also refer briefly to an ISO three-phase plan [7] to bring formal
methods into the standards process.

     1.  Phase 1 has the use of formal methods restricted due to  lack
         of  experience  and expertise.  Work is done in parallel on a
         formal  specification  of  the  standard,   which   hopefully
         contributes  to  the  robustness and clarity of the standard,
         and the results are published as a SEPARATE report.

                                                                Page 7


     2.  Phase 2 has seen the knowledge and experience  base  expanded
         in  the  use of formal methods in standards creation, and the
         work proceeds in parallel and is sufficient to  be  published
         as  an  informative  annex.   (Note:   This is not a balloted
         NORMATIVE one, but an unballoted informative one.)

     3.  Phase 3 sees the standards developing  body  well  versed  in
         formal  methods  and  the  formal description is the standard
         with a natural language commentary.

One thing that the article warns against is the retrofitting of formal
methods  to an existing standard, because it can often demonstrate the
lack of conceptual integrity of the original work.

Additionally they recommend choosing an appropriate formal  method  to
express  the  standard,  depending  on  such  factors  as the proposed
standard's content, mathematical sufficiency and accessibility to  the
standards forming group.

The primary formal methods that have been suggested are Z, and VDM.

Z actually has a number of interesting examples to  consider.   Recent
work  has been done to produce a formal specification of P1003.1 using
Z, and it was reported upon by Martin Kirk [8].

The report concluded that while the work was useful at finding "weasel
worded"  areas of the standard, it requires a large effort to continue
this work.  Several other problems  exist  as  well.   Some  of  these
problems  had  to  do with the complexity of POSIX, and its deliberate
areas of ambiguity.  Other problems encountered had to do with  the  Z
notation and the choice of model.

Indeed this raises an area of concern  with  how  far  formal  methods
should  be applied to POSIX.  POSIX has deliberate areas of ambiguity,
"weasel words", and unspecified nature.  This is  required  so  as  to
allow   a   maximum   number   of  implementations,  to  not  restrict
implementations in unnecessary ways or force  implementations.   POSIX
is a standard for portable operating system service interfaces.  It is
not the specification of an operating  system  [9].   There  are  also
times when weasel words are the only way to arrive at consensus.

Another interesting example of Z in this area is a recent  article  by
J.   Michael  Spivey  on  using  Z to specify a real-time kernel [10].
This is the specification of a minimal kernel and not an interface  to
it.  Spivey discusses several deficiencies in his specification of the
kernel,  and  addresses  all  of  them  at  the  risk  of  making  his
specification  more complex and less understandable.  He does conclude
that using a formal specification is a valuable  and  beneficial  tool
for  answering  questions about the kernel, but he then "suggests that
the idea of using  a  formal  specification  as  a  complete  contract
between implementer and user is not very helpful." [11]

Indeed it has been sensibly pointed out that the use of formal methods
is   beneficial   to  aiding  understanding  about  the  object  being
specified, but that they need not be  a  complete  specification  [12]

                                                                Page 8


[13]  [14].   This certainly fits in with the ISO three-phase approach
to  introducing  formalism  into  standards.   They  never  require  a
complete formal specification without natural language commentary.

The Vienna Development Method (VDM) is also frequently suggested as  a
candidate  formal method.  VDM has a similar flavour to Z but does not
have a facility  similar  to  Z's  schema  calculus  to  allow  simple
specifications to be built into complex ones.

This summarises all the current discussion  I've  discovered  to  date
concerning   actual   formal  methods  to  specify  a  standard  POSIX
interface.



7  A BRIEF NOTE ON TESTING AND CONFORMANCE ISSUES

There are several testing issues about LIS  of  POSIX  no  matter  how
formal  the  specification  method.  The following questions have been
raised.

How does one "test" a language independent  specification?   At  first
glance,  one  doesn't.  Test assertions are merely done at the binding
level to  allow  implementations  to  demonstrate  conformance.   This
certainly needs to be done.

But can formal or natural language assertions be made about  the  LIS,
which can be tested manually by argument and inspection, and which can
then act as a basis set of  assertions  used  when  building  language
binding assertions?

>From a different point of view, is there a set of assertions that  can
be  made  about  the  LIS  which can help determine how good a binding
reflects the base?  How do bindings conform to the base?  If a binding
becomes  a  one-to-one  mapping  of  the  base  LIS, then they conform
directly.  If they do  not  completely  map  the  binding  or  map  it
differently, how is conformance measured?

All of these questions need some thought, and I  hope  this  generates
some creative feedback for next week.



8  SUMMARY

I hope I have presented completely and with as little bias as possible
the  issues  surrounding  the  language  independent  specification of
POSIX.

Hopefully at the BOF gatherings at P1003 and the WG15 Ad Hoc, many  of
these  issues  can  be  solved to everyone's satisfaction, with a care
towards the tremendous effort which has gone on to  date  at  building
POSIX.

I look forward to seeing everyone there.

                                                                Page 9


9  REFERENCES


[1] Paul Rabin and Stephen Walli, "Draft TCOS-SS Programming  Language
Independent Specification Methods", Draft 1, 15 July, 1990.

[2] Dominic Dunlop, comp.std.unix Volume 20,  Number  110,  USENET,  5
July, 1990.

[3] "Proposed DTR 10182 on:Information Processing Systems - Guidelines
for   Language   Bindings",   ISO/IEC  JTC1/SC22  N754,  International
Standards Organization, Geneva.

[4]  "Common  Language-Independent  Datatypes:   Working  Draft   #3",
ISO/IEC  JTC1/SC22/WG11  N162,  International  Standards Organization,
Geneva.

[5] ORKID Working Group, "ORKID  -  Open  Real-time  Kernel  Interface
Definition, Draft 2.1", August 1990.

[6] David Blyth, et al, "The Case for Formal  Methods  in  Standards",
IEEE Software, Volume 7, Number 5, September, 1990.

[7] "JTC1 Statement  of  Policy  on  Formal  Description  Techniques",
ISO/IEC   JTC1   N145,  and  ISO/IEC  JTC1/SC18  N1333,  International
Standards Organization, Geneva, 1987.  This reference was  pointed  to
in [6] and I have not yet been able to obtain a copy.

[8] Martin Kirk, "Z Specification of P1003.1", ISO/IEC  JTC1/SC22/WG15
N115, International Standards Organization, Geneva, September, 1990.

[9] Donn Terry, "Suggested Response to JTC1/SC22/WG15 N115",  Document
SC22/WG15 US TAG N146.

[10]  J.   Michael  Spivey,  "Specifying  a  Real-Time  Kernel",  IEEE
Software, Volume 7, Number 5, September, 1990.

[11] Ibid.  p.27

[12] J.  Michael Spivey, "The  Z  Notation  :   a  reference  manual",
Prentice Hall International, 1989

[13] Anthony Hall, "Seven Myths of  Formal  Methods",  IEEE  Software,
Volume 7, Number 5, September, 1990.

[14]  Jeannette  M.   Wing,  "A  Specifier's  Introduction  to  Formal
Methods", Computer, Volume 23, Number 9, September 1990.


Volume-Number: Volume 21, Number 197