[comp.os.research] OSF and operating system research, and other topics

shapiro@blueberry.inria.fr (Marc Shapiro) (11/18/88)

I participated in an OSF meeting last week where OS technology and
its relation to research were much discussed.  As this may interest
people, here is a report.  I believe comp.os.research is the most
appropriate forum; if not, forgive me.


						Marc Shapiro

INRIA, B.P. 105, 78153 Le Chesnay Cedex, France.  Tel.: +33 (1) 39-63-53-25
e-mail: shapiro@sor.inria.fr or: ...!mcvax!inria!shapiro

===========================================================================

                        Report on OSF meeting
                      Brussels, 7 November 1988

[The OSF held a meeting in Brussels on 7 November 1988 to which
vendors and research institutions in Europe were invited.

I was there as INRIA's ear.  I also wanted to voice my own strong
objections to OSF: I am afraid that such a big conglomerate can do no
useful work, and that they will impede the free circulation of ideas
and sources.

I summarize here my meeting notes on the 3 topics which I feel are
most important:

  * how ``open'' is the OSF?
  * OS technology
  * relations to research, especially in Europe

My own remarks are in square brackets like this.]


1. How open is the Open Software Foundation?

Anybody may join the OSF.  It costs $4.5M to be a ``sponsor'', i.e.
hold a seat on the board of directors.  ``Membership'' is $25K for
for-profit organizations, $5K for non-profit, and $2K for educational
institutions.  The total budget is approximately $150M.

All members are equal in the decision process, which is
``vendor-neutral''.  OSF is not a standards or a consensus committee,
but a software house.  They make their decisions independently, after
input from members, who receive full and equal information.

Members have full and equal access to all project plans,
specifications, designs, source code, documentation, decision
rationales, and validation suites.  They may consult them on-line at any
time, even as they are being developed.

After a component is released, non-members may apply for a licence for
that component, also in source form.


A component will be released if it passes OSF's validation suites on
several different hardware architectures.  OSF delivers source code
only (no binaries), with clear separation between machine-independent
code, and machine-dependent code for the 3 or 4 reference
architectures.

The reference architectures shall cover the range from PC's to
mainframes.  A reference architecture must be supplied by multiple
vendors, and be non-controversial.  Currently the reference machines
are: the 386 with AT bus, the 680x0, and the 370 architecture.  The
IBM PC/RT is temporarily chosen to represent the RISC family, to be
replaced by whatever RISC becomes the industry standard.

Source code will be protected by copyrights and patents (which
shouldn't prohibit local copies and hacking).  No trade secrets, no
non-disclosure licences (except for code originating from AT&T, of
course, unless they can reach an agreement).  No export prohibitions
(except when mandated by law).  The idea is to make new technologies
*available*, not to hide them.  Anybody can buy a licence to anything
(nominal fee for universities).  Nothing is mandatory.  The scope of
OSF covers all of the OS, in a broad sense, including common tools,
and excluding hardware and applications.  They will support tools
which run on non-Unix kernels (such as VMS), e.g. the user interface.

[Apparently both hardware manufacturers and software houses have
decided there is no more money to be made on OS's.  Both want to get
out of that business: HW people don't want the hassle of system
software, and application developers want their stuff to run on any
machine.]

[I asked about possible relations with the Free Software Foundation
(GNU project).  The issue is apparently a painful one.  Answer: ``we
will distribute user-contributed software too.  If FSF asks us to, we
will distribute their stuff, but the terms of the FSF copyleft won't
let us do that''.]

[In conclusion, OSF software is pretty ``open'' but not yet ``free''.
I believe they can be convinced to support research for the
development of truly free software, since IBM and DEC already do so
with Andrew (CMU) and X-windows (MIT).]


2. Operating System technology

The OSF Kernel version 1 is based on the IBM RT/PC's AIX version 3
(i.e. without the Virtual Resource Manager), itself based on System V.
AIX was chosen *although* it's an IBM product, because it was believed
to be, at the present point in time and from a technical standpoint,
the best base from which to have a high-quality Unix system ready by
mid-1989.  OSF is not committed to AIX, and feels free to rip out
anything they don't like.

OSF is aware of the limitations of the AT&T-based technology and *wants
to collaborate with research to replace the kernel with a better one*
in the future.

2.1 Main points of the OSF operating system:

  * conforms to current standards (to protect existing software)
        i.e.: X/OPEN, SVID, POSIX, 4.3BSD, TCP/IP (OSI to be added),
        NFS, X-windows, RFC822, SMTP, etc.
  * Will not be stifled by standards committees in future
        developments, but will conform to industry standards as they
        emerge.
  * All the BSD functionality is there without code redundancy (no
        BSD code; it is all re-written): all BSD system calls,
        .h files, and libraries; csh, multiple groups, signals,
        job control, long file names, symbolic links, select, pty,
        sockets, dbx, mail, file quotas, etc.
  * Targeted to any modern architecture: 32 bit address space,
        supervisor/normal execution mode, memory protection.
  * Documented ``System Internal Interface'' and hooks for adding
        functionality to the kernel.

2.2 Enhancements to Unix

All enhancements will be upwards compatible: the old Unix interfaces
are preserved.  For instance, the kernel now contains multiple
concurrent pre-emptible threads, with primitives for protecting
critical sections; however the old sleep/wakeup interface is retained
for compatibility.
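
[To make the threads point concrete, here is a rough C sketch of the
two styles side by side; every primitive name below is invented, since
the report names none:

    typedef struct kmutex { volatile int locked; } kmutex_t;

    extern void kmutex_lock(kmutex_t *m);   /* blocks the calling thread */
    extern void kmutex_unlock(kmutex_t *m);

    extern void sleep(void *chan, int pri); /* classic Unix primitives, */
    extern void wakeup(void *chan);         /* retained for compatibility */
    #define PRIBIO 20                       /* traditional I/O sleep priority */

    struct buf { int b_busy; };

    /* New style: an explicit critical section, safe even when several
     * kernel threads run concurrently and may be pre-empted. */
    void count_event(kmutex_t *m, long *counter)
    {
        kmutex_lock(m);
        (*counter)++;
        kmutex_unlock(m);
    }

    /* Old style: sleep on a channel address until the resource is free.
     * Correct only under the traditional one-process-in-the-kernel rule. */
    void acquire_buffer(struct buf *bp)
    {
        while (bp->b_busy)
            sleep(bp, PRIBIO);
        bp->b_busy = 1;
    }

    void release_buffer(struct buf *bp)
    {
        bp->b_busy = 0;
        wakeup(bp);                         /* rouse any sleepers */
    }
]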

  * Lightweight processes in the kernel
        
        [no mention of kernel support for lightweight processes in
        user code.]
        
  * System V IPC extended across the net, protected by access
        control lists. 
        Streams not used because of AT&T restrictions; replaced by V7
        multiplexed files.
  * The kernel can be configured on-line.
  * Dynamic linking (implies extended, upwards compatible COFF
        format) for both user processes and kernel; new drivers can be
        loaded on-line.
  * Demand paged virtual memory of user processes and kernel;
        mapped files; single-level store; no buffer cache. 
        Pin, pre-page, and purge primitives.
  *  Fork by copy-on-reference (because not all hardware supports
        copy-on-write).
  * Disks partitioned in 4Mb physical chunks; a logical partition is
        any number of chunks (possibly spanning multiple disks); its size may
        be changed at any time, in 4Mb increments, by the administrator.
        No fixed-size partitions, no dedicated swap zone.  (See the
        sketch after this list.)
  * File system meta-data managed with DB techniques: journal,
        atomic commit;  fsck should never be necessary again.
  * Terminals: POSIX-compatible job control, page mode,
        input editing a la PC-DOS; curses with color; pty's.
        Multiple physical bitmaps possible; each may be multiplexed
        into multiple virtual terminals (therefore it will be possible to
        run 2 different window systems on the same screen).
        Access either in ``monitored'' mode (i.e. bitblit
        access) or via ASCII terminal emulator.
        Efficient graphics library (uses monitored mode) for X, GKS,
        PHIGS.
  * National language support: both 8 and 16 bit character sets.
  * Structured I/O handling to ease writing drivers.
  * Error logging by error device driver.
  * Unbundled subsystems, such as X-windows; atomic installation
        tools.
  * Distributed file system and virtual memory
        [I don't know what they mean by that],
        with local and remote caching, full Unix semantics, C2
        security using Kerberos.
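
[To illustrate the logical-partition scheme above: a partition can be
represented as a table mapping logical chunk numbers to (disk,
physical chunk) pairs, so growing it is just a table append.  A rough
C sketch, with invented names since OSF published no interface:

    #define CHUNK_BYTES (4L * 1024 * 1024)  /* 4Mb physical chunks */
    #define MAXCHUNKS   1024

    struct extent { int disk; long phys_chunk; };

    struct logical_partition {
        int           nchunks;               /* current size, in chunks */
        struct extent map[MAXCHUNKS];        /* logical chunk -> location */
    };

    /* Translate a logical byte offset to (disk, physical byte offset). */
    int lp_translate(struct logical_partition *lp, long off,
                     int *disk, long *phys_off)
    {
        long chunk = off / CHUNK_BYTES;

        if (chunk >= lp->nchunks)
            return -1;                       /* beyond current size */
        *disk = lp->map[chunk].disk;
        *phys_off = lp->map[chunk].phys_chunk * CHUNK_BYTES
                    + off % CHUNK_BYTES;
        return 0;
    }

    /* Grow by one 4Mb chunk, allocated on any disk: append to the map. */
    int lp_grow(struct logical_partition *lp, int disk, long phys_chunk)
    {
        if (lp->nchunks >= MAXCHUNKS)
            return -1;
        lp->map[lp->nchunks].disk = disk;
        lp->map[lp->nchunks].phys_chunk = phys_chunk;
        lp->nchunks++;
        return 0;
    }
]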

Availability: IBM delivers the first version at the end of November
(it will be passed on immediately to the membership).  The OSF/1
hardware-independent kernel is to be delivered in the second half of
1989.  The user interface (based on X-windows) will be commercially
available independently, in the first half of 1989, on System V R3.

3. Relation with research

OSF is committed to supporting industry standards, in order to protect
existing software, but will not be stifled by them.  The OSF Research
Institute will be on the lookout for innovations from research
institutions which can be marketed within 2 to 5 years (less than 2
years is development; more than 5 is utopia).

The OSF Research Institute is a ``transformer'' from research to
development, in a vendor-independent way.  It will facilitate
communication and fund (or help fund) research.

5 programs:

  * operating systems,
  * distributed services,
  * information technology,
  * user interface,
  * software engineering.

In each program the OSF RI maintains an independent development team
which will test and evaluate prototypes from research.  Each has a
University program which will organize colloquia, print newsletters,
edit books, fund sabbaticals and research grants.  The money goes 40%
to the USA, 40% to Europe, and 20% to the Pacific.  They primarily
encourage collaborative funding (e.g. Esprit or NSF).

Europe has a long tradition of OS research, e.g. ANSA, Amoeba, Chorus,
Birlix, Comandos, PCTE, Newcastle, Gothic.  Collaboration with
research in Europe is sought on:

  * Alternative kernel technologies (e.g. message-based).
  * Architecture-Neutral Distribution Format.
  * Persistent programming languages.
  * Distributed application environments.
  * Fault-tolerance.
  * Distributed debugging.
  * Distributed resource allocation.

The European RI will organize tight cooperation with European
universities.  An international workshop, open to members, will be
organized in Europe in the Spring.  ``The best way of increasing the
weight of Europe in the OSF is for more Europeans to join as
members''.  The European RI advisory board has leading researchers in
it: Mr. Goos of GMD (Berlin), G. Kahn of INRIA (France), S. Mullender
of CWI (Amsterdam), and R. Needham (Cambridge).

[The OSF is trying very hard to convince research to collaborate.  I
clearly see what their advantage is in doing so.

I see less well what research can gain from such a collaboration,
other than strictly material benefits.

The goals and terms of the proposed collaboration don't seem very
clear to me.]

4. Conclusion

[My questions about free access to sources were answered frankly.  OSF
sources will not be free but will be available to all members, and no
trade-secret protection will apply.

I imagine they can be convinced to fund research to develop free
software, as was done for X-windows and Andrew.

OSF is big and bulky.  Within half a year we should know if they can
get any good work done (that is when the user interface is scheduled
to be delivered).

OSF has a lot to gain from the membership of research institutions
and is ready to give them financial and material support.  I still
need to be convinced that research institutions have something to
gain by joining.]

rick@seismo.CSS.GOV (Rick Adams) (11/24/88)

>   * File system meta-data managed with DB techniques: journal,
>         atomic commit;  fsck should never be necessary again.

It is a truly impressive piece of software that can prevent hardware
errors from damaging data on the disk. I think I'll keep a copy
of fsck around anyway.

---rick

w-colinp@microsoft.UUCP (Colin Plumb) (11/27/88)

In article <5583@saturn.ucsc.edu> rick@seismo.CSS.GOV (Rick Adams) writes:
>>   * File system meta-data managed with DB techniques: journal,
>>         atomic commit;  fsck should never be necessary again.
>
>It is a truly impressive piece of software that can prevent hardware
>errors from damaging data on the disk. I think I'll keep a copy
>of fsck around anyway.

Not really; the techniques are well known, and involve duplicating all
important information.  I recall one file system at Xerox that was
secure against any failure confined to a single sector or to two
consecutive sectors.  (User data could be lost, but the file system's
integrity was guaranteed.)  The free block bitmap had no correctness
invariants; it was merely a performance-boosting cache.

Essentially the idea is that anything fsck can do, the file system does
automatically.  It's adding "incrementally", i.e. not locking the whole
disk while you fix things up, that's hard.

The people at Xerox also found another benefit: the amount of code that had
to be correct to maintain basic FS consistency was only a few pages.  As
long as the rest worked most of the time, all would be well.
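
To illustrate the bitmap-as-cache idea, here is my own sketch (not the
Xerox code, whose details I don't have): if every block carries a
self-describing label, the bitmap is derivable state, and a sweep can
rebuild it after a crash without any fsck-style inference:

    #include <string.h>

    #define NBLOCKS 8192

    struct block_label { int in_use; };      /* stamped in each block */

    extern struct block_label read_label(int blockno);  /* invented */

    static unsigned char bitmap[NBLOCKS / 8];  /* cache only, never trusted */

    void rebuild_bitmap(void)
    {
        int b;

        memset(bitmap, 0, sizeof bitmap);
        for (b = 0; b < NBLOCKS; b++)
            if (read_label(b).in_use)
                bitmap[b / 8] |= 1 << (b % 8);
    }
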
-- 
	-Colin (microsof!w-colinp@sun.com)

rick@seismo.CSS.GOV (Rick Adams) (11/29/88)

> Essentially the idea is that anything fsck can do, the file system does
> automatically.  It's adding "incrementally", i.e. not locking the whole
> disk while you fix things up, that's hard.

Moving fsck into the filesystem code is only renaming fsck, not getting
rid of it.

What's so horrible about the current BSD filesystem? It's already got
duplicate copies of the superblock. It can rebuild the free block
bitmap if necessary, so you can say that it too is only a performance
win.

What about the cost/performance tradeoffs of these great 'database
techniques'?  I'm not willing to shadow every disk drive I have. Buying
30 extra gigabytes of disk to ensure filesystem consistency is not very
reasonable.

To use Andy Tanenbaum's example, "What happens if there is an
earthquake and your entire computer room falls into a fissure and is
suddenly relocated to the center of the earth?". I suspect you lose
big.  (Tanenbaum discusses distributed filesystems as a possible
solution to this)

What price? This is totally passed over in the name of fixing
something that is not necessarily broken in the first place.  (Note
I'm only talking about the BSD filesystem; the Sys5 filesystem can be
considered broken if you wish.)

E.g. I'm not willing to give up the huge performance gain of having
lots of disk blocks cached in memory for the infinitesimal increase in
disk stability.

The OSF "announcement" clearly wins the prize for buzzwords per
square inch, but what is it really saying?

---rick

root@husc6.harvard.edu (Celray Stalk) (11/30/88)

Along the lines of robust file systems:

For my master's thesis I modified the Unix kernel (the SunOS version, which
matters little except that the code was messier because it dealt with NFS)
to include "transaction logging" in the database sense of the words.

I used the sticky bit on non-executable files to mean that the file should 
have transactions logged on it whenever it was changed.  Then in the kernel
I added code that watched for this bit during writes and logged all changes 
(the before- and after-write data, the size of the write, and the
location it occurred at) into a system-wide log file.

The next step was to write a set of library routines which implemented the
usual "undo", "redo", etc database functions on files which had logging
done.

The last step was to analyze the performance cost of logging a file.  It
turned out that, as expected, the cost was no more than two synchronous
disk writes per logged write.  Synchronous because transaction logging
requires a strict ordering between each change and its log record, and
that ordering cannot be assured unless synchronous writes are used.  Of
course extra disk space was used to store the transaction file.
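
In outline, the kernel-side logging looked like this (a reconstruction
for illustration; the names and details are invented, not the actual
thesis code):

    #include <string.h>

    #define MAXLOG 4096                      /* max bytes logged per write */

    struct log_record {
        long offset;                         /* where in the file it landed */
        int  len;                            /* size of the write */
        char before[MAXLOG];                 /* old contents, for undo */
        char after[MAXLOG];                  /* new contents, for redo */
    };

    struct inode;                            /* kernel file handle */
    extern int read_file(struct inode *ip, long off, char *buf, int len);
    extern int sync_append(struct inode *log, void *rec, int len);
    extern int sync_write(struct inode *ip, long off, char *buf, int len);
    extern struct inode *loginode;           /* the system-wide log file */

    /* Called from the write path when the file's sticky bit is set. */
    int log_write(struct inode *ip, long off, char *buf, int len)
    {
        struct log_record r;

        r.offset = off;
        r.len = len;
        read_file(ip, off, r.before, len);   /* capture the before-image */
        memcpy(r.after, buf, len);

        if (sync_append(loginode, &r, sizeof r) < 0)   /* 1st sync write */
            return -1;                       /* no log record, no write */
        return sync_write(ip, off, buf, len);          /* 2nd sync write */
    }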

So the long answer to a short question is that it is possible to add
at least _some_ more robustness to current Unix disk systems without
incurring large performance penalties. (And of course you get more than
just robustness through transaction logging.)
					      --Peter

------------------------------------------    --------------------------------
Peter Baer Galvin       		      (203)432-1254
Senior Systems Programmer, Yale Univ. C.S.    galvin-peter@cs.yale.edu
51 Prospect St, P.O.Box 2158, Yale Station    ucbvax!decvax!yale!galvin-peter
New Haven, Ct   06457			      galvin-peter@yalecs.bitnet

shapiro@iznogoud.inria.fr (Marc Shapiro) (12/02/88)

In article <5598@saturn.ucsc.edu> rick@seismo.CSS.GOV (Rick Adams) writes:
>Moving fsck into the filesystem code is only renaming fsck, not getting
>rid of it.
1) fsck is very slow for large systems.  My Sun server has a mere
   gigabyte attached to it and rebooting takes ages.
2) getting rid of fsck is not the only advantage of doing updates
   atomically.

>Whats so horrible about the current BSD filesystem?
   It's not bad (except that it's too complex).  OSF proposes a
   filesystem where the size of any partition can be changed online.
   I think that's a *big* win.

>What about the cost/performance tradeoffs of these great 'database
>techniques'?
   This of course is the big question.  These techniques have been
   around for a while now and I expect we (i.e. the comp.os.research
   community) now know how to implement them right.  A write-ahead log
   implementation makes it possible to do atomic updates without
   duplicating all the data on disk (i.e. you duplicate new data, in
   the log, only for the short period of time where you are not sure
   of the outcome of the transaction; then you can re-use the log).
   However you then lose the benefit of shadow disks: that even a head
   crash on a single disk doesn't destroy your data.

   Using a write-ahead log shouldn't necessarily slow you down w.r.t.
   asynchronous updates, because updates are spooled to the log.  Only
   the commit record needs to be written synchronously.
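
   To make that concrete, a schematic write-ahead update might look
   like this (invented names; not OSF's code, which we haven't seen).
   The updates are spooled into the log asynchronously; only the
   commit record is forced to disk before the in-place writes start:

       #define BSIZE      4096
       #define LOG_UPDATE 1
       #define LOG_COMMIT 2

       struct log_rec {
           int  kind;                    /* LOG_UPDATE or LOG_COMMIT */
           long blockno;                 /* metadata block being changed */
           char data[BSIZE];             /* its new contents */
       };

       extern void log_append_async(struct log_rec *r);  /* spooled */
       extern int  log_force(struct log_rec *r);  /* append and flush the
                                                     log, in order, through
                                                     this record */
       extern void write_block_async(long blockno, char *data);

       int atomic_metadata_update(struct log_rec *recs, int n)
       {
           struct log_rec commit;
           int i;

           for (i = 0; i < n; i++)
               log_append_async(&recs[i]);  /* no disk wait here */

           commit.kind = LOG_COMMIT;
           if (log_force(&commit) < 0)      /* the only synchronous write */
               return -1;                   /* crash before this: no-op */

           for (i = 0; i < n; i++)          /* crash after: replay the log */
               write_block_async(recs[i].blockno, recs[i].data);
           return 0;
       }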

> I'm not willing to shadow every disk drive I have. Buying
>30 extra gigabytes of disk to ensure filesystem consistency is not very
>reasonable.
   If I understood correctly, the OSF proposal is to update filesystem
   *metadata* (superblocks, inode tables, and directories) atomically;
   not user data.

>To use Andy Tanenbaum's example, "What happens if there is an
>earthquake and your entire computer room falls into a fissure and is
>suddenly relocated to the center of the earth?". I suspect you lose
>big.
   I just checked the fsck man page; I didn't find the option to deal
   with this kind of situation.  (:-)

> (Tanenbaum discusses distributed filesystems as a possible
>solution to this)
   You're saying that you must duplicate all your data onto two disks
   (or other media) which are in 2 places far enough away from each
   other that no single earthquake will swallow them both.  You were
   talking about the cost?

>The OSF "announcement" clearly wins the prize for buzzwords per
>square inch, but what is it really saying?
   I guess we will find out when their kernel becomes available.  If
   they deliver what they promise, and the performance is not a lot
   worse than existing Unixes on comparable configurations, then I
   think we should applaud, and demand to have access to the sources
   to play with.

						Marc Shapiro

INRIA, B.P. 105, 78153 Le Chesnay Cedex, France.  Tel.: +33 (1) 39-63-53-25
e-mail: shapiro@sor.inria.fr or: ...!mcvax!inria!shapiro
