JPALME@qz.qz.se (Jacob Palme QZ) (10/28/90)
One way of introducing "data compression" into X.400 would be to introduce a new body part type, e.g. "IA5 compressed according to compression algorithm X". Conversion between this new body part and ordinary IA5 would be simple, and would have to be done when transferring a message into a system which cannot understand the new body type. In fact, what I am describing above is what we are already doing in the SuperKOM message system. We do, however, always convert from "compressed IA5" to ordinary "IA5" whenever a message is transmitted to a non-SuperKOM system, since we do not assume that any other X.400 system uses compression at present.
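[A minimal sketch of the downgrade step Palme describes, with zlib standing in for "compression algorithm X"; the body part type names and the choice of zlib are illustrative assumptions, not anything defined by X.400 or SuperKOM:]

    import zlib

    IA5 = "ia5-text"                     # ordinary IA5 body part type
    IA5_COMPRESSED = "ia5-compressed-x"  # hypothetical new body part type

    def downgrade_for_recipient(body_type, data, recipient_understands):
        """Convert a compressed-IA5 body part back to ordinary IA5 when
        the receiving system cannot understand the new body part type."""
        if body_type == IA5_COMPRESSED and not recipient_understands:
            return IA5, zlib.decompress(data)
        return body_type, data

    # Today the sender always downgrades, since no other X.400 system
    # is assumed to use compression:
    btype, text = downgrade_for_recipient(
        IA5_COMPRESSED, zlib.compress(b"Hello, X.400 world"),
        recipient_understands=False)
    assert (btype, text) == (IA5, b"Hello, X.400 world")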
Eppenberger@verw.switch.ch (Urs Eppenberger) (10/29/90)
From my point of view, data compression should not be handled within the framework of X.400. This is purely a matter for the lower layers. We already have a mess with all the body parts; I can't see any reason to add compressed ones. If some X.400 implementations store messages in a compressed format on disk, I do not care, and I also see no need for standardisation there. Perhaps this view is too easy? Kind regards, Urs.
vcerf@NRI.Reston.VA.US (10/29/90)
Urs, your view strikes me as potentially off the mark, in the sense that compression methods may vary in their usefulness depending on the kind of material and the encoding employed. As a result, it may be important to perform the compression with knowledge of the type of content. This tends to place the application of compression fairly high in the protocol architecture, rather than below the level of X.400, for example. Just to give you one example, I recently got a message advising that I could pick up a compressed PostScript file via FTP from Germany. The compression took place "above" the level of FTP, and decompression is applied after receipt of the file. I suppose you could argue this should somehow have been done at a lower layer, but I think the argument is not convincing on its surface.

I appreciate your apparent distress with all the various body types. I suppose this will get worse over time as people want to convey proprietary objects. My best guess is that things will settle down as we discover the particular encodings and object types which prove most useful. Vint Cerf
Eppenberger@verw.switch.ch (Urs Eppenberger) (10/29/90)
Dear Vint, you are perfectly right with your FTP example. But I was talking about ISO standards, where the same mistakes should not be repeated. There are two reasons for compression:

1. Saving disk space. It is up to the UA how it stores information on disk; standardisation is not needed.

2. Faster transmission. Here it is the job of the lower layers to compress the protocol units. Users should not be bothered with that at all!

Kind regards, Urs Eppenberger
ms6b+@andrew.cmu.edu (Marvin Sirbu) (10/29/90)
Urs, you miss the point that Vint was trying to make. Shannon's theory of information says that the more you know about the message set, the more effectively you can compress it. Thus, if I send a multi-media message, I want to use two-dimensional run-length encoding to compress the image portion, but a very different scheme (LZW?) to compress the text. With an image alone, I would use a different encoding table if the image is scanned at 600 dpi than I would if it is scanned at 200 dpi. In fact, using an inappropriate encoding scheme can actually _increase_ the number of bits a message takes.

The inefficiency of doing encoding only at lower layers is well illustrated by the problem of telephone circuit encoding. If I intend to use the circuit only for voice traffic, I can easily encode it at 16 kbps or 8 kbps. If I want the channel to carry any kind of 3300 Hz bandwidth information (e.g. modem traffic as well), then the best I can do is ADPCM at 32 kbps.

While it may appear simpler to use a single compression scheme at a layer below the application, such an approach may sacrifice substantial potential efficiency gains in transmission. Marvin Sirbu CMU
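[Sirbu's warning that an inappropriate scheme can actually expand a message is easy to demonstrate. A minimal sketch, with zlib standing in for a text-oriented codec and random octets standing in for data the codec was not designed to model; both stand-ins are assumptions for illustration:]

    import os, zlib

    text = b"the quick brown fox jumps over the lazy dog " * 100
    noise = os.urandom(4096)  # stands in for data the codec cannot model

    print(len(text), "->", len(zlib.compress(text)))    # shrinks sharply
    print(len(noise), "->", len(zlib.compress(noise)))  # slightly LARGER:
    # the codec must still add header and framing octets around data it
    # cannot compress, so the output exceeds the input.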
vcerf@NRI.Reston.VA.US (10/30/90)
Urs, I gather we are considering different reasons for compression and different layers in which it might be practiced. I agree that, with respect to local compression inside a UA, standardization is less necessary -- although I suppose even there some standards might be welcome if they led to widely available hardware assistance. Vint
neufeld@cs.ubc.ca (Gerald Neufeld) (10/30/90)
It seems to me that you can still get the advantage of different compression algorithms based on knowledge of the data (as Marvin points out) and still do the compression below the application layer. Couldn't this be done by using different transfer syntaxes for each of the different types of data? The compression can then be done at the presentation layer. Gerald UBC
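[A minimal sketch of Neufeld's idea: a table of named transfer syntaxes, each pairing an encoder and a decoder over the octet stream. The syntax names and the zlib codec are illustrative assumptions; a real presentation layer would identify and negotiate these by object identifier:]

    import zlib

    # transfer-syntax name -> (encoder, decoder) over the octet stream
    TRANSFER_SYNTAXES = {
        "plain":         (lambda b: b,   lambda b: b),
        "zlib-over-ber": (zlib.compress, zlib.decompress),
    }

    def p_encode(syntax, octets):
        return TRANSFER_SYNTAXES[syntax][0](octets)

    def p_decode(syntax, octets):
        return TRANSFER_SYNTAXES[syntax][1](octets)

    ber_value = b"\x04\x05hello"  # a small BER-encoded OCTET STRING
    wire = p_encode("zlib-over-ber", ber_value)
    assert p_decode("zlib-over-ber", wire) == ber_value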
enag@ifi.uio.no (10/31/90)
In article <531*Eppenberger@verw.switch.ch>, Urs Eppenberger writes:
> From my point of view, data compression should not be handled within
> the framework of X.400. This is purely a matter for the lower layers.
> We already have a mess with all the body parts; I can't see any reason
> to add compressed ones.
> If some X.400 implementations store messages in a compressed format on
> disk, I do not care, and I also see no need for standardisation there.
> Perhaps this view is too easy?
Aren't all those layers supposed to make things easier?
Seriously, does anybody know of attempts to standardize compression
schemes so they can be negotiated by the (re)presentation entities at
connection establishment time?
CCITT has recommended compression schemes at the data link layer for
low-speed PSTN connections, i.e. in the V-series (V.42bis, I believe). I
don't know whether it is possible to negotiate this at a higher level,
and whether it is possible to propagate PDU boundaries so that the
data link layer algorithm does not reduce the average transmission
speed in the presence of quick turnarounds, small windows, etc.
Just curious.
--
[Erik Naggum] Naggum Software; Gaustadalleen 21; 0371 OSLO; NORWAY
I disclaim, <erik@naggum.uu.no>, <enag@ifi.uio.no>
therefore I post. +47-295-8622, +47-256-7822, (fax) +47-260-4427
pww@uunet.uu.NET (Peter Whittaker) (10/31/90)
In article <Qb=4CfO00VADA1N41e@andrew.cmu.edu> ms6b+@andrew.cmu.edu (Marvin Sirbu) writes:
> Shannon's theory of information says that the more you know about the
> message set, the more effectively you can compress it. Thus, if I send
> a multi-media message, (a bit deleted)
> While it may appear simpler to use a single compression scheme at a
> layer below the application, such an approach may sacrifice substantial
> potential efficiency gains in transmission.

Can't help but agree that compression should be higher in the stack, and for a variety of reasons (number 3 is the most important, IMHO).

1) As Marvin states, you get better compression when you know what you are compressing. Compressing data of unknown type/origin seems rather pointless, especially as it could already be in its most space-efficient form; please correct me if I'm wrong, but compressing it again could lead to pathological behaviour where the 'compressed' data is bulkier than the original.

2) Compress higher in the stack, and the lower layers have less data to move, i.e. less memory to manipulate and less room for transmission/reception/allocation/deallocation errors.

3) Compression is an example of manipulation of user data: from the OSI purist's perspective (I'm a purist on odd-numbered days - Happy Hallowe'en) the last (lowest numbered) layer to touch user data is the presentation layer (layer 6). Once it gets further down, the OSI stack assumes it's safe to ship. It can't assume that it's in the best form to ship, but it's bound to heed the 'prerogative' of layer 6: that's where ASN.1 is made, and where the BER are applied.

Not to mention that when data is compressed, it has to be uncompressed (trivial, right?). But how does the other side of the connection know the data is compressed? It seems to me that compression vs non-compression would be part of the context negotiations at session establishment: the initiator and responder would have to agree on what set of compression routines to use, if any, and on how to indicate to one another that compression had been applied. My understanding of layers 4 and below (I work on the upper 3-4 layers, depending on how you define the application stack) is that their peer-to-peer communications do not provide any services for such negotiation. (Please correct me if wrong....)

There are some more practical considerations too (NOTE: OSI purists may go into conniption fits :@} ). The presentation layer (layer 6) is responsible for translation between network-independent and host-specific data representations. It is also the last layer that 'knows' what data types it is handling (all that layer 5 and below see are bits). So the presentation layer is the last layer that can determine the most efficient compression routine to apply to a given body type (or generic data type).

Furthermore, when compressing the data, are you compressing to save local disk space and memory, or to save network resources? In the former case, X.400 (at layer 7) could call a presentation service element and ask it to compress a body part before transmission (i.e. in the case of a store-and-forward node: receive the data, identify the data type, compress it, then store it until it's time to forward it. All this depends on the store time, of course: is it worth processing 10 pages of g3fax if you're only going to store it for ten minutes?). In the latter case, the network may benefit from having compression applied to the machine-dependent data representation (i.e. compress, then encode as ASN.1), or it might benefit from compression after encoding. The only way to know which to do is to have compression routines available to the presentation layer (for use before or after ASN.1 encoding), and to experiment and collect some metrics. In time, we'll (hopefully) have a body of experimental evidence of what-works-best-when-in-most-cases (or maybe somebody can work it all out in theory: theories are easier to program to than experimental data).

--
Peter Whittaker       [~~~~~~~~~~~~~~~~~~~~~~~~~~]   Open Systems Integration
pww@bnr.ca            [                          ]   Bell Northern Research
Ph: +1 613 765 2064   [                          ]   P.O. Box 3511, Station C
FAX: +1 613 763 3283  [__________________________]   Ottawa, Ontario, K1Y 4H7
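[The compress-before-encoding versus encode-before-compressing question Whittaker raises is easy to instrument. A minimal sketch of that kind of metric collection, using zlib and a toy stand-in for BER encoding; both are assumptions for illustration, not a real presentation implementation:]

    import zlib

    def ber_octet_string(payload):
        """Toy stand-in for BER encoding: OCTET STRING tag 0x04 with a
        two-octet long-form length (valid for payloads < 65536 octets)."""
        n = len(payload)
        return b"\x04\x82" + bytes([n >> 8, n & 0xFF]) + payload

    data = b"status: ok\n" * 500

    compress_then_encode = ber_octet_string(zlib.compress(data))
    encode_then_compress = zlib.compress(ber_octet_string(data))

    print("compress, then encode:", len(compress_then_encode), "octets")
    print("encode, then compress:", len(encode_then_compress), "octets")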
anand@ka (Govindaraj Padmanaban) (11/02/90)
One thing which everyone agrees upon is that "data compression" is really useful. (I hope so.) In that case, the next question is: who should do the compression? Where is the best place to put it? If the burden is put on the MTA, every MTA on the route has to parse the message to get at the body parts in order to uncompress/compress them. The message format received by the MTA looks something like this:

              +---------------------------------+
P1 envelope   | UMPDUEnvelope (P1 envelope)     | \
              | /*                              |  \
              |  * This is the only part each   |   \  MTA can use and modify
              |  * MTA has to parse and change. |   /  only this portion.
              |  */                             |  /
              | Origin, ContentType, Recipients,|  /
              | Trace Information, etc.         | /
              +---------------------------------+
P2 message    | UMPDUContent (P2 message)       | \
              | +-----------------------------+ |  \
              | | IM-UAPDU                    | |   \
              | | +-------------------------+ | |   |
              | | | Heading                 | | |   |
              | | | IPmsgid, originator     | | |   |
              | | +-------------------------+ | |   |
              | | | Body Part 1             | | |   |  The UA formats this part
              | | +-------------------------+ | |   |  and submits it to the MTA
              | | | Body Part 2             | | |   |  for sending. Only a UA or
              | | +-------------------------+ | |   |  a gateway MTA parses it.
              | | |        o o o            | | |   |
              | | +-------------------------+ | |   /
              | | | Body Part n             | | |  /
              | | +-------------------------+ | | /
              | +-----------------------------+ |/
              +---------------------------------+

The relaying MTA is NOT supposed to carry the burden of compression/uncompression, because it is not supposed to MODIFY the P2 message. Only the UA knows about the body part boundaries. Each body part can be of a different type (say text, executable, GIF image, etc.), and a single compression algorithm may not work for all the body parts. Each body part also has to carry information about the algorithm used to compress it. Compression can't be handled by the lower layers.

In article <Qb=4CfO00VADA1N41e@andrew.cmu.edu>, pww@uunet.uu.NET (Peter Whittaker) writes:
> 3) Compression is an example of manipulation of user data: from the OSI
>    purist's perspective (I'm a purist on odd-numbered days - Happy
>    Hallowe'en) the last (lowest numbered) layer to touch user data is
>    the presentation layer (layer 6). Once it gets further down, the OSI
>    stack assumes it's safe to ship.
> But how does the other side of the connection know the data is
> compressed? It seems to me that compression vs non-compression would be
> part of the context negotiations at session establishment: the initiator
> and responder would have to agree on what set of compression routines to
> use, if any, and on how to indicate to one another that compression had
> been applied.

This calls for all the compression algorithms used by the body parts to be known beforehand so that the connection can be negotiated, and I disagree with that. It also needs a basic change in the message format to carry the compression information. Moreover, with a single negotiated connection you can't transfer all the messages to the next MTA, and you can't negotiate a new connection for every message either...

Route 1:  origin UA ==> MTA-10 ==> MTA-11 ==> ..... ==> MTA-1N ==> UA (recipient)
                 |
Route 2:         +===> MTA-20 ==> MTA-21 ==> ..... ==> MTA-2M

If one MTA doesn't know the compression method used in a message, the routing decisions for transferring that message are affected. I hate it when this happens.

I really vote for the sending and receiving UAs to do the compression. It calls for a little bit of intelligence on the UA's part. It also makes sense. The message originator knows about the types of body parts and the compression algorithm used; the receiving UA has to do the decompression. The job of the MTA is to transfer the message rather than changing/modifying it on the fly. If the receiving UA can handle the new compression method, the MTAs in the route need not change at all.

Any suggestions welcome. Anand

+------------------------------------------------------------------------+
|          We have not inherited the earth from our parents,             |
|          but borrowed it from our children.                            |
| Govindaraj A Padmanaban - Novell Inc. 408-473-8643(w) 408-263-7055(h)  |
| Email: anand@novell.COM {ames | apple | mtxinu | leadsv }!excelan!anand|
+------------------------------------------------------------------------+
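[A minimal sketch of the UA-side arrangement Anand describes, with each body part carrying an identifier for the algorithm used to compress it. The field names and the zlib/identity registry are illustrative assumptions, not part of P2:]

    import zlib

    DECOMPRESSORS = {"none": lambda b: b, "zlib": zlib.decompress}

    def ua_make_body_part(part_type, data, algorithm="zlib"):
        """Originating UA: compress and record the algorithm in the part."""
        payload = zlib.compress(data) if algorithm == "zlib" else data
        return {"type": part_type, "compression": algorithm, "data": payload}

    def ua_read_body_part(part):
        """Receiving UA: decompress using whatever the part says was used."""
        return DECOMPRESSORS[part["compression"]](part["data"])

    # Relaying MTAs pass the part through untouched; only the UAs call these.
    part = ua_make_body_part("ia5-text", b"Meeting at 10:00")
    assert ua_read_body_part(part) == b"Meeting at 10:00"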
Christian.Huitema@mirsa.inria.fr (Christian Huitema) (11/05/90)
Peter, I agree with your general remark that compression could be done at the presentation layer. In fact, we at INRIA played with presentation layer compression for two years, and came to the conclusion that defining a presentation transfer syntax as, e.g., the stacking of an LZ or Huffman coding over BER (or faster) coding rules is both feasible and useful. There are a couple of problems to solve, essentially relating to the limited negotiation capabilities of the presentation protocol (how do you negotiate the size of the LZ dictionary?) and to the "tree" structure of the encoding (how do you handle an EXTERNAL quoted within a compressed syntax?).

The case of X.400 is much harder to solve, however:

* The bulk of the data is within the "content", which is carried as an "octet string".
* The exchange of messages between UAs is "connectionless".

Using presentation layer compression for X.400 would be done within a single context, that of the envelope. Not very interesting... And the octet string "content" is only typed by a "content identifier", pointing in principle to some ASN.1 content description, e.g. P2. There is no place to indicate that something other than ASN.1 BER was used for the encoding -- and the same is true for the use of the EXTERNAL construct in the absence of presentation negotiation. Christian Huitema
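[The stacking Huitema describes amounts to function composition over the octet stream, and a sketch also shows where his negotiation problem bites: the dictionary size is a codec parameter the presentation protocol has no standard field for. Here zlib's window size stands in for an LZ dictionary size; an assumption for illustration only:]

    import zlib

    def encode_stacked(ber_octets, window_bits=12):
        """LZ-style compression stacked over already-encoded BER octets."""
        c = zlib.compressobj(wbits=window_bits)  # the dictionary-size knob
        return c.compress(ber_octets) + c.flush()

    def decode_stacked(octets, window_bits=12):
        # Both peers must agree on window_bits beforehand -- exactly the
        # parameter the presentation negotiation cannot express.
        return zlib.decompress(octets, wbits=window_bits)

    ber = b"\x30\x0b\x04\x09envelopes" * 50  # some BER-encoded stream
    assert decode_stacked(encode_stacked(ber)) == ber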
jwagner@princeton.edu (11/05/90)
> I really vote for the sending and receiving UAs to do the compression.
> It calls for a little bit of intelligence on the UA's part. It also
> makes sense. The message originator knows about the types of body parts
> and the compression algorithm used; the receiving UA has to do the
> decompression. The job of the MTA is to transfer the message rather
> than changing/modifying it on the fly. If the receiving UA can handle
> the new compression method, the MTAs in the route need not change at
> all.

While this approach seems attractive to me also, how does the sending UA know that the receiving UA can handle the compression? This becomes even more important if the sending UA is isolated (for example on a BITNET node). John Wagner
Stef@ICS.UCI.EDU (Einar Stefferud) (11/06/90)
Compression of body parts in X.400 P2 envelopes is very much like the adoption of any other special body part, like WordPerfect, or MSWord, or DCA, or a LOTUS spreadsheet, or EXCEL, etc. There are two basic problems:

1. Establish a standard definition and make it widely available and widely implemented, so that many UAs can install and use it.

2. Figure out who can handle the defined object as a body part.

Item 2 is really hard to resolve for the global community without requiring a global directory to hold specific information on exactly what body part types every UA in the world can handle, so that every potential originator can simply ask "the directory" whether an intended recipient can handle a given body part type. I shudder at the task of setting up and operating such a global directory of UA capabilities, and at the quality-of-service aspects when individuals fail to keep their UA entries current in the global directory. Some people and organizations will even regard this as private information, not to be disclosed to the public. As I see it, this grand global directory is only a dream. Maybe a "pipedream".

So, the only fallback we have is for the originator to ask the intended UA's owner whether the target UA can handle the body part type that the originator wants to send. This is actually cheaper than demanding that everyone in the world inform everyone in the world what body part types their UA can handle. I don't see any other way around this dilemma. Best...\Stef
mhsc@oce.nl (Maarten Schoonwater) (11/14/90)
The discussion on data compression makes it very clear that there is a need for standard data representations and formats for interchange. Now that the global community is coming closer and closer together, we must learn to speak common languages.

In ODA (Office Document Architecture) the problems that are signalled here are solved to a great extent. ODA defines different Document Application Profiles for different levels of interchange. The simplest form contains only text; levels 2 and 3 also contain graphics. The bitmap graphics in ODA documents can be compressed according to the T.4 and T.6 compression standards (i.e. fax group 3 and 4). When you send an ODA body part over X.400 you declare the content type (the profile level). The receiving X.400 system can thus check whether it can decode this level of ODA, and refuse the message if desired.

ODA only solves the problem for documents; there should be additional standards for other applications. For pure bitmaps there are already the CCITT facsimile standards, which can be used and declared in the various body parts. Compression is therefore already solved. Maarten Schoonwater Oce-Nederland BV
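[A minimal sketch of the check Schoonwater describes; the profile names are illustrative placeholders, not real Document Application Profile identifiers:]

    SUPPORTED_ODA_PROFILES = {"level-1"}  # say, the text-only profile

    def accept_oda_body_part(declared_profile):
        """Receiving system checks the declared profile level and refuses
        the message if it cannot decode that level of ODA."""
        if declared_profile not in SUPPORTED_ODA_PROFILES:
            raise ValueError("cannot decode ODA profile %r; refusing message"
                             % declared_profile)
        return True

    accept_oda_body_part("level-1")    # accepted
    # accept_oda_body_part("level-3")  # would be refused (contains graphics)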
pv@Eng.Sun.COM (Peter Vanderbilt) (11/16/90)
Assuming that compression of P2 body parts is a good thing, is there a standard mechanism to use for identifying compression? The simplest mechanism is to just use an external body part with a different object identifier (OID) for each different compression. Alternatively, one could use the parameters part with a field to indicate what kind of compression is used, where each compression algorithm is assigned an OID.

The first mechanism has the problem that it requires "M*N" OIDs -- an OID has to be allocated (and configured) for each pair of data type and compression algorithm. The second mechanism only requires "M+N" OIDs -- one for each data type and one for each compression algorithm. But the second mechanism has the problem that it requires widespread implementation to achieve the desired independence -- which seems like a major hurdle.

Does anybody have info on whether the standards people considered body part compression and, if so, how they expected it to be implemented? Is anyone implementing it currently and, if so, how? (Along the lines of the second mechanism, in practice it appears to be useful to carry an identifying string with a body part -- is there any hope that implementors would agree on a standard way to carry labels in the parameters part?) Pete
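[The M*N versus M+N arithmetic is easy to see concretely. A minimal sketch of the two allocation schemes Pete compares; every OID value below is made up for illustration:]

    from itertools import product

    data_types = ["ia5-text", "g3fax", "oda"]        # M = 3
    compressions = ["none", "lzw", "run-length"]     # N = 3

    # Mechanism 1: one external body part OID per (type, compression) pair.
    pair_oids = {pair: "1.2.3.%d" % i
                 for i, pair in enumerate(product(data_types, compressions))}

    # Mechanism 2: one OID per data type plus one per algorithm, carried
    # in separate fields of the body part's parameters.
    type_oids = {t: "1.2.4.%d" % i for i, t in enumerate(data_types)}
    comp_oids = {c: "1.2.5.%d" % i for i, c in enumerate(compressions)}

    print(len(pair_oids))                   # M*N = 9 OIDs to allocate
    print(len(type_oids) + len(comp_oids))  # M+N = 6 OIDs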