geof@decwrl.DEC.COM@imagen.UUCP (Geof Cooper) (12/22/86)
I would venture to say that the problem of <<transmitting>> file semantics is solvable, and even reasonably well understood. It is not dissimilar to the problem of transmitting abstract data types, which is well described in Maurice Herlihy's Master's thesis ("Transmitting Abstract Values in Messages", S.M. thesis, MIT, 1980 -- there was a paper about it, too, but I don't happen to have a reference to it). The fundamental idea is that you can solve the N^2 problem of translating file (or terminal, or abstract) data types among N different machines either by brute force or by standardizing on one "transmissible" data type for purposes of transmission (e.g., ASCII, the various FTP transfer modes, "binary" file formats such as Interpress, DDL, and Impress, big-endian number semantics, IEEE floating point format, TAR tapes, punch cards).

I'd like to tag the "real problem" as the "interface problem," at least for the purposes of this discussion. The interface problem is that the set of capabilities of the transmissible data type may not match the capabilities of a particular system. For example, EBCDIC and ASCII don't necessarily overlap in all the codes they define. A more pertinent example is that Unix OPEN calls give no way to specify that a file is textual, so applications don't generate any information about what they are <trying> to do when they modify the file system. So it doesn't matter whether you have a textual file "type" in NFS, since UNIX gives you no way to know that you're supposed to be using it.

I've seen three generic attempts to solve this problem:

[1] Modify all systems to use the transmissible type (ASCII, IEEE floating point, ISO protocols, Interscript, virtually all standards).
[2] Modify all systems to have functionality appropriate to the transmissible type and translate on the fly in each system (IBM machines sending to ASCII printers; graphics applications that change their capabilities to fit the printer or page description language [cf. the original Macintosh ROMs versus the second-version ROMs, which gave characters fractional widths to cope with the LaserWriter]).

[3] Define a broad transmissible type, but don't require every system to implement the whole thing. Systems can intercommunicate where there is an overlap of supported options (Telnet (esp. SUPDUP), FTP, probably ISO-FTAM).

The advantage of [1] is that it works the best, but the problem is that it disrupts the systems, and it tends to inhibit technical progress, since adding a new feature requires distributed consensus and an implementation on every machine. [2] still requires that applications change, but it can be workable when the system in question already implements part of the transmissible type. For example, I believe the ATT guys have found it possible to add "mandatory file locking" to UNIX for some files. Approach [3] is pretty common, and can achieve good but limited results (e.g., you can use FTP between any two machines for textual files, assuming both implemented FTP correctly). Unfortunately, it is really the brute-force solution to the N^2 problem in disguise. For example, how many machines actually implement ALL the Telnet options? (How many implementors -- or even system architects -- could list them all without looking?)

Usually, of course, a mixture of the three is involved. For example, a UNIX machine can easily know to receive a "textual type" file correctly using [2], even if it doesn't know how to generate one.

All this is not to put a damper on the interesting discussion that is going on about NFS. Rather, my intent is to try to raise the level of that discussion to more general issues.

- Are there other approaches to solving the interface problem?
  (I thought about it for a whole 10 minutes, so please shoot bullets at my arguments.)

- Can people who are familiar with NFILE, NFS, FTAM, etc., characterize them in terms of the "interface problem" above, so we can compare them abstractly?

- Can we come up with a particularly good mix of the three approaches to solve the problem well for file systems? (Did ISO?) Or is blind standardization the only way (it would be disappointing if it were) -- just tell everyone to use UNIX?

Any ideas?

- Geof
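P.S. To make the N^2 point concrete, here is a toy sketch (every name and the line-terminator conventions are invented for illustration, not taken from any real system): with N local text-file formats you'd need on the order of N^2 pairwise translators, but one canonical "transmissible" form needs only an encoder and a decoder per format, plus a Telnet-style option intersection for approach [3].

```python
# Toy illustration of the "transmissible data type" idea: instead of a
# translator for every pair of systems (N^2 of them), each system only
# converts to and from one canonical form (2N converters).

# Three hypothetical "local" text-file conventions, distinguished here
# only by their line terminator.
LOCAL_EOL = {
    "unix": "\n",
    "dos": "\r\n",
    "oldmac": "\r",
}

CANONICAL_EOL = "\r\n"  # playing the role of the on-the-wire standard

def to_canonical(text, system):
    """Encode a local text file into the transmissible form."""
    return CANONICAL_EOL.join(text.split(LOCAL_EOL[system]))

def from_canonical(text, system):
    """Decode the transmissible form back into a local text file."""
    return LOCAL_EOL[system].join(text.split(CANONICAL_EOL))

def transfer(text, src, dst):
    """Move a file between any two systems via the canonical type only."""
    return from_canonical(to_canonical(text, src), dst)

# Approach [3] in miniature: two endpoints agree to use the intersection
# of the option sets each of them supports.
def negotiate(ours, theirs):
    return ours & theirs
```

Adding an Nth+1 format here means adding one entry (one encoder/decoder pair), not N new translators -- which is the whole argument for standardizing the transmissible type, and `negotiate` is why approach [3] degrades gracefully instead of failing outright when support is partial.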