[mod.protocols.tcp-ip] NFS comments

MKL@SRI-NIC.ARPA.UUCP (12/19/86)

NFS is claimed to be a general network file system, but it really isn't.
As someone who is trying to implement an NFS server for a non-UNIX
system, I've got lots of problems.  Here are a few:

As far as I'm concerned, NFS has no security or authentication.  
If you want security you must specify exactly which hosts can
mount your filesystems and you must trust EVERY single user
on those hosts, since they can tell the server that they are
whoever they want to be.  This isn't really a problem with the
file protocol and may be considered a separate issue,
but I wouldn't use a file protocol without security.

NFS is claimed to be idempotent, but it really isn't.  One example:
If you do a file rename and the request is retransmitted, you may
get back a success indication if the first request was received,
or you'll get back an error if it was the retransmission.
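
The rename case can be sketched in a few lines.  This is an illustrative toy, not code from any real NFS implementation; the names are made up.

```python
# Toy model of the non-idempotent RENAME described above.  A retry of a
# request that already succeeded reports a spurious "no such file" error.

class ToyServer:
    def __init__(self):
        self.files = {"old": b"data"}

    def rename(self, src, dst):
        if src not in self.files:
            return "ENOENT"
        self.files[dst] = self.files.pop(src)
        return "OK"

server = ToyServer()
first = server.rename("old", "new")   # original request arrives: "OK"
retry = server.rename("old", "new")   # reply lost, client retransmits: "ENOENT"

print(first, retry)  # the same logical request gets two different answers
```

Real servers work around this with a duplicate-request cache keyed on the transaction ID, but that is a mitigation bolted onto the protocol, not a property of the operations themselves.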

There are some fields that are very UNIX specific.  A userid field is
used to indicate user names for things like file authors.  This userid
is a number and it is assumed that there is a GLOBAL /etc/passwd file
so you can translate numbers to names.  This is completely bogus.  A
userid should be a string, not a number.  More could be said about the
groupid field.
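
The uid problem is easy to demonstrate.  The passwd contents below are invented for illustration; the point is only that NFS transmits the number, and each host translates it with its own local table.

```python
# Two hosts whose local /etc/passwd files disagree about uid 103.
passwd_host_a = {0: "root", 103: "alice"}
passwd_host_b = {0: "root", 103: "bob"}

uid_on_the_wire = 103  # NFS sends the numeric uid, never the name

print(passwd_host_a[uid_on_the_wire])  # "alice" on one host
print(passwd_host_b[uid_on_the_wire])  # "bob" on the other
```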

NFS uses very large UDP packets to achieve acceptable performance.
This may indicate that the protocol is what really needs to be fixed.

There is no attempt at any ASCII conversion between normal systems
and UNIX.  This of course is the famous CRLF to newline problem which
makes sharing of text files between different systems almost useless.
Yes, you can write a program to do the conversions, but that defeats
the whole idea of file access, since you must then do a complete file transfer.
Besides that, sharing binary files between different operating systems
is almost useless anyway.
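
For reference, the conversion itself is trivial; the sketch below is mine, not part of NFS.  What makes it painful is that doing it correctly requires reading the whole file, which is exactly what a remote-access protocol is supposed to avoid.

```python
# CRLF <-> newline conversion, the whole-file way.

def crlf_to_unix(data: bytes) -> bytes:
    return data.replace(b"\r\n", b"\n")

def unix_to_crlf(data: bytes) -> bytes:
    # Normalize first so any existing CRLF pairs are not doubled.
    return crlf_to_unix(data).replace(b"\n", b"\r\n")

text = b"line one\r\nline two\r\n"
assert crlf_to_unix(text) == b"line one\nline two\n"
assert unix_to_crlf(crlf_to_unix(text)) == text
```

Note that the conversion also changes byte offsets, so a client cannot translate a byte-range read of the text form into a byte-range read of the stored form without scanning the file.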

From a document that lists the design goals of NFS, it appears that it
was only intended as a way to provide ACCESS to remote files.
It was not and is not a protocol to allow SHARING of the data
in those files between non-homogeneous systems.  For that reason
it is really quite useless as a way to share files between
different operating systems (which probably explains why the CRLF/newline
problem was left unaddressed).  It is too bad that they defined
a common data representation (XDR) on which to build the RPC protocol,
but then left it out when dealing with file representation.

With that stated, I can probably say that NFS is a good protocol for
sharing files between homogeneous (UNIX-like) systems based on
non-homogeneous file servers.  This doesn't seem like a very
interesting or useful design goal though, and I still don't know
why I'm bothering to implement it.
-------

schoff@csv.rpi.edu.UUCP (12/19/86)

	
	NFS is claimed to be a general network file system, but it really isn't.
	As someone who is trying to implement an NFS server for a non-UNIX
	system, I've got lots of problems.  Here are a few:
	
	There are some fields that are very UNIX specific.  A userid field is
	used to indicate user names for things like file authors.  This userid
	is a number and it is assumed that there is a GLOBAL /etc/passwd file
	so you can translate numbers to names.  This is completely bogus.  A
	userid should be a string, not a number.  More could be said about the
	groupid field.

I'd just like to comment on this aspect and let others comment on the
rest.  Back in 1982, when the new tacacs (TAC access) was being worked
on, there was some discussion of the "network id" (which broadly is what
NFS's ID is).  Independently of ANYTHING that SUN was tooling up for,
it was determined that the "network id" would in fact be a number.  The
last I heard, that was still the plan (and implementation).

Marty Schoffstall

mrose@NRTC-GREMLIN.ARPA (Marshall Rose) (12/20/86)

In the ISO world, you might consider doing FTAM instead.  I think it meets
all of your objections, with the notable exception that it's going to be
a while before someone writes an ftam that really performs well.  You can
get concurrency, commitment and recovery (CCR) with FTAM for all the usual
updating and locking type of problems.  Also, owing to its size, you probably
would need two protocols on a diskless workstation: a small netload protocol
(MAP has one), and then FTAM proper.  For those of you interested in looking
at FTAM, I suggest you get a copy of parts 1 and 2 of the FTAM draft
international standard:

	ISO DIS 8571/1
	File Transfer, Access and Management (FTAM) Part 1: General Description

	ISO DIS 8571/2
	File Transfer, Access and Management (FTAM) Part 2: Virtual Filestore

As mentioned in one of the RFCs (I can't remember which), you can purchase
these from Omnicom, 703/281-1135.  Part 1 will cost you $28, part 2 will cost
you $36.

/mtr

bzs@BU-CS.BU.EDU.UUCP (12/20/86)

>In the ISO world, you might consider doing FTAM instead.  I think it meets
>all of your objections, with the notable exception that it's going to be
>a while before someone writes an ftam that really performs well...
>For those of you interested in looking
>at FTAM, I suggest you get a copy of parts 1 and 2 of the FTAM draft
>international standard:

Are there any working FTAM implementations to look at, performant or
not?

	-Barry Shein, Boston University

mrose@NRTC-GREMLIN.ARPA.UUCP (12/21/86)

    An excellent question.  As with most "international standards" you
    have to qualify which point in the standard's life you're talking about:

	WD - working draft
	DP - draft proposal
	DIS - draft international standard
	IS - international standard

    There are, to my knowledge, no implementations of the FTAM DIS, as
    it was only recently released (August, 1986).  I expect this to
    change in about six months.

    There are however some implementations of the FTAM DP, I believe
    that DEC has one, that Concord Data Systems has one, and probably
    about five other MAP/TOP vendors (a few even claim to have it
    running on the PC).  I'll ask my local MAP/TOP guru here which
    implementations are available and I'll get back to you.  

    Organizations like the NBS and COS (Corporation for Open Systems) in
    the US, and SPAG and ECMA in Europe, have done a fine job of
    specifying the "agreed subset" of FTAM which should be implemented.
    This makes the harmonization problem (getting different vendors'
    implementations to work together) much easier.  However, the FTAM
    DIS and the FTAM DP are NOT, repeat NOT, compatible (they even use
    different underlying services), so it's not clear how much of a DP
    implementation can be used when building a DIS implementation.

    To comment a bit on some related mail: if I remember the FTAM spec
    correctly, you can do things like forced-append writes and
    record-level access.  I don't think you'll see the first generation
    of FTAM DIS implementations do this, but these features are built
    into FTAM.

/mtr

hedrick@TOPAZ.RUTGERS.EDU (Charles Hedrick) (12/21/86)

Do you know how random access is done in FTAM?  The big problem
seems to be specifying locations in the file.  Unix does this by
byte number.  That can't work for ISAM files.  But if you do it by
record number, you are going to have to count records from the
beginning of the file to make that work on Unix.  So at first glance
it would seem that system-independent random access is impossible
unless you force people to conform to one file model.  The folks at
Johns Hopkins hospital have a very large multi-vendor distributed
database system.  They decided to forget about network file systems
and did it directly on top of RPC.  It seems to have worked very well
for them.  The idea is that it isn't all that useful to do
cross-system random access anyway.  Let the database be managed by a
local process, and have people on other machines make somewhat
high-level requests via RPC.  They made RPC work on an incredible
variety of machines, including ones that only understood BASIC, and
only talked on serial lines.
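
The record-number objection can be made concrete.  The sketch below is my own illustration (one record per line, no index): with nothing but a byte stream, finding record N means scanning every byte before it, so "random" access degenerates to a sequential scan.

```python
# Why record-number addressing is expensive on a byte-stream file system.

def byte_offset_of_record(data: bytes, n: int) -> int:
    """Return the byte offset of record n (0-based), one record per line."""
    offset = 0
    for _ in range(n):
        nl = data.index(b"\n", offset)  # scans forward: O(file size), not O(1)
        offset = nl + 1
    return offset

data = b"rec0\nrec1\nrec2\n"
print(byte_offset_of_record(data, 2))  # 10
```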

If you restrict your network file system to sequential I/O, and if you
are willing to specify whether you want a file treated as text or
binary, then it is possible to do things across a variety of systems.
The Xerox D-machines implement a transparent network file system using
normal Internet FTP under these constraints.  NFS didn't do this
because there is no obvious way to tell in Unix whether a file is
binary or text.  There would seem to be basic design issues here, and
I am sceptical about claims that FTAM somehow gets around them.  If
you think of the network file system as something external, i.e. if
you don't store all your system files on it, but use it only for
things that the user knows are special, then of course all these
problems go away.  You can demand some clue as to whether the file is
binary or text, and you can impose restrictions on what operations are
allowed (e.g. no random access or locations specified only by a "magic
cookie").  But NFS was designed to allow you to completely replace
your disk drive with it, in which case such restrictions are not
acceptable.  I'm open to the possibility that one needs two different
kinds of network file system, one which is completely transparent in
terms of the host system's semantics, and the other which makes
whatever compromises are needed to support every possible operating
system.  NFS is a compromise between these two, and like all
compromises runs the danger of satisfying no one.

Can anybody tell me how FTAM handles these issues?  I don't need the
whole standard, just a brief description of its file model and the
kinds of operations allowed.

mrose@NRTC-GREMLIN.ARPA (Marshall Rose) (12/22/86)

    Well, let me try to answer that.  I've only read the spec twice and
    don't have it here in front of me, but here goes...

    FTAM is based on the notion of a "virtual filestore" (sound
    familiar, huh?)  The filestore consists of zero or more files, each
    with an unambiguous name.  Each file has a set of attributes (of
    which filename is one) and some contents.  The attributes have the
    usual permission stuff along with a description of the kind of
    information the file contains (e.g., iso646 ascii, octets, bits),
    and a description of the access discipline (my term) for the file.
    The contents is a binary tree.  Each node in the tree contains

	- node attributes:
		node name
		length of arc back to parent
		a flag saying whether data is attached to this node
	- attached data (if any)
	- a list of children

    Now, there are a couple of ways that you can implement a UNIX-style
    regular file.  The simplest is to have just the root node with the
    entire file contents as the attached data (as an octet string) and
    no children.  In this case, the access discipline is rather
    inconsequential, since you can only get at one data element at a
    time and there is only one to choose from.  

    Alternately, for a file like /etc/passwd, you might have a root node
    with no data, and a child for each line in the file.  The access
    discipline would allow you to specify any child element you want
    when you wanted to read or write.
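
    The two mappings above can be sketched with a toy node structure.  The
    field names are mine, not from ISO DIS 8571; this is only meant to show
    the flat-file and per-line layouts side by side.

```python
# Rough model of a virtual-filestore node: name, optional attached data,
# and a list of children.

class Node:
    def __init__(self, name, data=None, children=None):
        self.name = name              # node name
        self.data = data              # attached data, if any
        self.children = children or []

# UNIX regular file: a root node only, whole contents as one octet string.
flat = Node("file", data=b"entire file contents")

# /etc/passwd modelled with one child per line, each addressable on its own.
passwd = Node("passwd", children=[
    Node("line0", data=b"root:0:0:..."),      # invented sample lines
    Node("line1", data=b"daemon:1:1:..."),
])

print(len(passwd.children))   # 2
print(passwd.children[1].name)
```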

    There are, in the spec, several document types and access disciplines
    listed with pre-defined meanings.  Others can be chosen via
    "bilateral" agreement.  In the NBS OSI Implementor's Workshop
    Agreements, they have defined a new document type called "directory"
    in which the nodes are, you guessed it, file names.  Assuming you
    had an FTAM server which supported that document type, an FTAM
    client could do the DIR and NLST commands that we've all become so
    attached to.  

    So to answer your question:  FTAM imposes on everyone the same
    fairly general file model.  Each FTAM server consists of a protocol
    engine for FTAM and a localFS-virtualFS engine.  For UNIX-like
    systems, going between the two is rather restricted unless you want
    to put a lot of smarts in your code (at which true UNIX-ites would
    gasp, I'm sure people are reeling at my /etc/passwd example!).  In
    this case, the question of "is it ascii or is it octet-aligned or is
    it bit-aligned" is something the localFS-virtualFS engine for UNIX
    would have to answer.  Now of course, if you had something like
    DEC's RMS in your filesystem, FTAM makes more sense as there is a
    closer match between the local and virtual filestores.  

    It is important in all of this however, to remember what OSI is:  a
    method for tying together dissimilar systems.  This is done by
    choosing a sufficiently general model which is (hopefully) a
    superset of all existing systems, and then letting people code
    local2virtual engines at the network boundary.  

    With respect to RPC, there are such notions in OSI.  My favorite is
    called ROS (Remote Operations Service) which is a fairly simple
    invoke-result, invoke-error protocol with provisions to support
    "once-only" execution of operations.  FTAM is not meant as a
    competitor to ROS (and quite frankly, had *I* designed FTAM, I would
    have put FTAM on top of ROS), but is trying to solve a different
    problem, which perhaps has overlap for certain applications.  
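
    The "once-only" idea can be illustrated by caching replies per invoke-id,
    so a retransmitted invoke returns the stored result instead of re-running
    the operation.  This is my own simplification, not the ROS protocol itself.

```python
# Once-only execution via a reply cache keyed on the invoke-id.

class OnceOnlyServer:
    def __init__(self):
        self.replies = {}   # invoke-id -> cached result
        self.counter = 0    # side effect, to show the op ran only once

    def invoke(self, invoke_id, operation):
        if invoke_id in self.replies:
            return self.replies[invoke_id]      # duplicate: replay the reply
        result = operation()
        self.replies[invoke_id] = result
        return result

srv = OnceOnlyServer()

def op():
    srv.counter += 1
    return srv.counter

print(srv.invoke(1, op))  # 1: operation executes
print(srv.invoke(1, op))  # retransmission: still 1, op did not run again
```

Note the contrast with the NFS rename case earlier in this thread: there, the retry re-executed the operation and produced a different answer.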

/mtr

braden@ISI.EDU.UUCP (12/22/86)

Marshall,

How can I learn what the "agreed subset" of FTAM is??

Bob Braden

braden@ISI.EDU (Bob Braden) (12/22/86)

Marshall,

  It seems that anything less than universal agreement on a subset of FTAM
  will lead to massive incompatibility among ISO-based implementations.
  Is that wrong?
  
  Bob Braden
  

mrose@NRTC-GREMLIN.ARPA (Marshall Rose) (12/23/86)

    Well, the obvious answer is to ask "who's doing the agreeing".  I
    know of four such organizations, though there are probably more.

    In Europe, organizations like SPAG have an FTAM profile.  In the
    US, the organization to check with is, of course, the NBS.  John
    Heafner at the NBS spearheads all these kinds of activities, and
    he's the guy you want to ask.  John has an ARPAnet mailbox at the
    NBS, though I don't recall what it is.  In any event, you want the
    notes from the "NBS OSI Implementors' Agreements Workshop".

/mtr

mrose@NRTC-GREMLIN.ARPA (Marshall Rose) (12/24/86)

Yes, that's right.  These organizations which produce "profiles" actually
talk a lot between themselves to try and maximize harmony.  In the US,
co-operation between MAP/TOP, NBS, and COS has been quite good.  This is really
a chicken-and-egg type thing.  Once you get the critical mass, you're set;
until then you're hanging.  I believe that given the way the NBS has been
guiding things, we've reached the mass and should have many different
implementations in harmony...

/mtr