karl@asylum.SF.CA.US (Karl Auerbach) (06/06/90)
In article <9006030129.AA00899@psi.com> schoff@PSI.COM ("Martin Lee Schoffstall") writes:

>About every six months there seems to be a little chinese fire drill,
>it would appear initiated by some of the iconic figures of the Internet,
>who talk about performance problems with X,Y,Z areas of SNMP/ASN.1.
>All of them have been proven false, and without merit.

I've often thought Chinese fire drills were fun. Consequently:

I've worked with X.409 and ASN.1 for quite a few years now, and it can be a real pig. In my SNMP code, my profilers indicate that an overly large portion of the cycles are spent in the ASN.1 encoder/decoder. Maybe my code is inefficient -- although I doubt it.

It's just that while ASN.1 is good for representing things like electronic mail and mixed-media mail, it is a really bad choice for fast transactional things. (Actually, my complaint isn't so much against ASN.1 as it is against the BER [Basic Encoding Rules]. A few months ago there was an article in the SIGCOMM Computer Communication Review about an improved, alternative BER.)

It is my opinion that the TCP/IP community would be wise to avoid ASN.1 for anything but its (original) purpose: electronic mail.

			--karl--
amanda@mermaid.intercon.com (Amanda Walker) (06/07/90)
In article <11946@asylum.SF.CA.US>, karl@asylum.SF.CA.US (Karl Auerbach) writes:

> In my SNMP code, my profilers indicate that an
> overly large portion of the cycles are spent in the ASN.1
> encoder/decoder. Maybe my code is inefficient -- although I doubt it.

You might want to look at it again. The quite restricted variant of ASN.1 that is used in SNMP can actually be pretty quick to parse. Most of the publicly available ASN.1 parsers that I have seen are overly complex...

--
Amanda Walker, InterCon Systems Corporation
--
"If we don't succeed, then we run the risk of failure."  --Dan Quayle
mrose@CHEETAH.NYSER.NET (Marshall Rose) (06/08/90)
I think history is being interpreted a little loosely here. The committee working on X.400 was the first to define an OSI application. As such, they defined X.409(84). When other OSI applications began to be defined, the abstract syntax and transfer syntax (ASN.1 and BER) were split out into ISO 8824/ISO 8825 and X.208/X.209. It is true that the original use of ASN.1 was MHS, but ASN.1 was not invented solely for that purpose--the MHS people just happened to be in the position of needing it first.

The strength of something like ASN.1 is its ability to describe a wide range of data structures using a machine-parsable notation. One of the design goals of the OSI people involved was to make sure that it did not limit the precision of the things being described. Hence, INTEGERs needn't be of fixed length, etc. A side-effect of this is that you need an encoding algorithm which can accommodate variable-length data. In theory, you can use many different algorithms to encode things defined using ASN.1, but at present only one such algorithm has been standardized: the Basic Encoding Rules. The reference by Huitema you are referring to is a second algorithm defined at INRIA, based on Sun's XDR.

The motivation for using ASN.1 in the SGMP and SNMP is simple: industry has already decided on the lingua franca of the 90's (ASN.1/BER), and it is counter-productive to invent yet-another-description-and-encoding-standard. So, the goal was to select a small, workable subset that could be implemented easily and allowed for some useful optimizations. Then, if an implementor is really concerned about efficiency, the routines can be done by hand with great care. (Probably the most important part is to have efficient encode/decode support for integers, which, interestingly enough, is where the performance characteristics of BER and XDR differ the most. By optimizing in one place, you can dramatically improve the speed of your encoder.)
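Rose's remark that integer encode support is the one place worth optimizing can be made concrete with a small sketch. The code below is purely illustrative (it is not from SNMP, ISODE, or any implementation discussed in this thread): it BER-encodes a signed integer as the INTEGER tag, a short-form definite length, and the shortest two's-complement big-endian contents.

```c
#include <stddef.h>

/* Hypothetical sketch, not from any implementation in this thread:
 * BER-encode a signed integer as INTEGER tag, short-form length, and
 * the shortest two's-complement big-endian contents.  Returns the
 * number of bytes written; out must hold sizeof(long) + 3 bytes. */
size_t ber_encode_int(long val, unsigned char *out)
{
    unsigned char tmp[sizeof(long) + 1];
    size_t n = 0;
    long v = val;

    /* Emit contents bytes, least-significant first, into tmp. */
    do {
        tmp[n++] = (unsigned char)(v & 0xff);
        v >>= 8;                  /* assumes arithmetic shift (sign-preserving) */
    } while (v != 0 && v != -1);

    /* BER wants the shortest form whose top byte still carries the
     * sign; add a pad byte when the sign bit would be wrong. */
    if ((val >= 0 && (tmp[n - 1] & 0x80)) ||
        (val < 0 && !(tmp[n - 1] & 0x80)))
        tmp[n++] = (val < 0) ? 0xff : 0x00;

    out[0] = 0x02;                /* universal tag: INTEGER */
    out[1] = (unsigned char)n;    /* short-form definite length */
    for (size_t i = 0; i < n; i++)
        out[2 + i] = tmp[n - 1 - i];   /* reverse to big-endian */
    return 2 + n;
}
```

Even this minimal routine makes the contrast with XDR's fixed four-byte integers visible: every value requires a length computation, a sign check, and a byte reversal.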
The BER, rather than a non-standard algorithm over an optimized ASN.1 subset, was chosen because, regardless of the availability of alternate encoding algorithms, anyone implementing ASN.1 still has to implement the BER, since that's what all ASN.1-capable applications use. So, if you are going to ask a vendor to implement something else for their ASN.1 library to use with application X, you have essentially told them to do twice as much work. Since the time of the initial SGMP work, experience has shown that if you limit use of ASN.1 carefully, you can implement an efficient BER mapping; this seems to be the most reasonable compromise.

/mtr
karl@asylum.SF.CA.US (Karl Auerbach) (06/09/90)
In article <266E82D2.592A@intercon.com> amanda@mermaid.intercon.com (Amanda Walker) writes:

>The restricted variant of ASN.1
>that is used in SNMP can actually be pretty quick to parse. Most of the
>publicly available ASN.1 parsers that I have seen are overly complex...

"Pretty quick to parse" is a relative term. I've optimized my ASN.1 tools to handle the SNMP subset, and they are fairly efficient, on a relative scale. The trouble is that even the limited subset of ASN.1 used by SNMP represents a fair computational load. Optimizing it and saying that the result is good is a lot like saying that you have optimized a bubble sort -- no matter how well optimized, it is bad news when presented with anything more than trivial input.

The argument that one can optimize ASN.1 is a red herring. The trouble is that ASN.1 requires that one look at a significant number of bytes, no matter how well optimized. For things where one has a fairly fixed structure (SNMP being a prime example), the flexibility of ASN.1 isn't warranted. As a general representational tool ASN.1 is OK (not great, but OK). But for things that happen often and where the content is fairly predictable (in structure), a less universal tool ought to be considered.

I'm into my fourth ASN.1 BER encoder/decoder tool. It does everything in the ASN.1 BER specification (except floating point), obeys all kinds of constructorization and definite/indefinite-length strategies, and I have not yet found an input sequence that breaks it (although massively deep constructors can cause it to reject an otherwise legitimate input sequence). It was designed to be moved to hardware as a co-processor. I'm still profiling and optimizing the code paths to do fastpathing, bulk operations, and pattern matches when possible. It is small (less than 20K bytes of compiled code on an Intel *86 processor) and, for ASN.1, fast.

My ASN.1 encoder/decoder for SNMP is far smaller, about 3K bytes of compiled code on an Intel *86 class machine.
And considerably faster. But even with this efficient code (at least I think it is efficient) I find that ASN.1 parsing/encoding burns a significant portion of the overall CPU cycles required to handle an SNMP request. The potential flexibility of ASN.1 just isn't worth the expense in situations (like SNMP) where we don't need the flexibility.

We of the Internet community ought not to fall into the OSI trap of thinking that just because ASN.1 exists, it is the perfect tool.*

			--karl--

*I used to own an English sports car (a Jensen-Healey) and I carried a set of calibrated hammers. Each was tuned to correctly bash whatever thing wasn't working at the time: a tiny hammer for the carburetors, a mallet for the fuel pump, and a sledge for the front suspension. To me, the use of ASN.1 for SNMP is akin to using the sledge to adjust the carburetors.
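The per-element work Karl is describing can be sketched in a few lines. The following is hypothetical illustration code (not Karl's decoder): parsing the tag-and-length header that precedes every BER element, under SNMP's restrictions of single-byte tags and definite lengths only.

```c
#include <stddef.h>

/* Hypothetical sketch (not from any implementation in this thread),
 * assuming the SNMP subset: single-byte tags, definite lengths, no
 * indefinite form.  Parses one BER tag-length header; returns the
 * number of header bytes consumed, or 0 on error. */
size_t ber_header(const unsigned char *buf, size_t avail,
                  unsigned char *tag, size_t *len)
{
    size_t i = 2, n;

    if (avail < 2)
        return 0;
    *tag = buf[0];
    if (buf[1] < 0x80) {            /* short form: length fits one byte */
        *len = buf[1];
        return 2;
    }
    n = buf[1] & 0x7f;              /* long form: n length bytes follow */
    if (n == 0 || n > sizeof(size_t) || avail < 2 + n)
        return 0;                   /* 0x80 (indefinite) not in the subset */
    *len = 0;
    while (n--)
        *len = (*len << 8) | buf[i++];   /* accumulate big-endian length */
    return i;
}
```

Even in the restricted subset, every field of every PDU pays this header cost before a single content byte is examined, which is the "significant number of bytes" Karl refers to.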
schoff@PSI.COM ("Martin Lee Schoffstall") (06/10/90)
> In my SNMP code, my profilers indicate that an
> overly large portion of the cycles are spent in the ASN.1
> encoder/decoder. Maybe my code is inefficient -- although I doubt it.

Refresh my memory, but doesn't your code base derive from an effort to span the perceived market for both CMOT and SNMP? Could this be a factor? I've seen some very tight code by commercial implementors of SNMP.

While you make the comparison of ASN.1 vs. "SNMP" within your code base, do remember that the tradeoffs were made between doing real-work code (such as routing) and network management code inside the network entity. That was the foundation of all critical engineering decision tradeoffs. Your statement above leads me to believe that you aren't seeing the forest for the trees.

Marty
mrose@CHEETAH.NYSER.NET (Marshall Rose) (06/10/90)
Karl - we've reached a circular discussion here. The SNMP PDUs were written in ASN.1 so as to be as fixed and predictable as possible. That is one method that was used to help out people optimizing their encoders and decoders. I suppose the PDUs could have been written in such a way as to use more of ASN.1's expressiveness, but the only benefit would be to be more elegant in an academic fashion, with no gain in functionality. Of course, the drawback would be that it would be nigh impossible to do optimizations.

Sure, SNMP could have used a special-purpose, roll-your-own encoding algorithm. And sure, it could be faster than any form of ASN.1 known to man; probably an order of magnitude faster, if you were very, very lucky. And what would that cost, exactly? Well, let's see: it would be yet another special-purpose thing usable for only one task; it would have associated with it the usual debugging cost for the designers; it would also have the usual start-up and training costs for the people writing implementations. So it would have a lot of hidden costs that would slow development and hinder deployment. (And to add insult to injury, it would still need things like ASN.1 object identifiers to provide for decent naming and extensibility.)

Now I hate to be the one to point this out, but technological issues are hardly boolean. Any time someone says "X is better than Y", they have neglected to state the criteria associated with that judgement. As such, there is more to consider than just raw speed when discussing this issue.

/mtr
george@ditmela.oz (George Michaelson) (06/10/90)
If an ASN.1 en/decoder is built to a specific and known subset of ASN.1, which is what at least one message about SNMP implied, is it any wonder it runs faster than an en/decoder written to try to handle an *arbitrary* ASN.1 description?

The same goes for the BER, surely: if you know in advance what specific sequences to expect, you can afford to write an optimal package to read and write those sequences, using nasty hacks like pre-defined fixed buffers that you write 2-3 bytes into and chuck onto the net without actually "walking" all the ASN.1 involved.

ISODE is more than just SNMP support. If somebody writes fast generic ASN.1, nobody will complain. Meantime, it's better than nothing.

How does ISODE compare to Huitema's ASN.1 stuff for speed? Ditto the EAN code? Retix? Is there a basis for benchmarking ASN.1 and BER code?

	George
--
ACSnet: ggm@brolga.cc.uq.oz                      Phone: +61 7 377 4079
Postal: George Michaelson, Prentice Computer Centre
        Queensland University, St Lucia, QLD 4067, Australia
huitema@jerry.inria.fr (Christian Huitema) (06/11/90)
A few comments on the efficiency of ASN.1. We did quite a lot of work on that at INRIA, including the development of alternative encoding rules. The relative cost of ASN.1 derives from the recursive ``type-length-value'' structure of the BER, plus the cost of decoding length fields, integers, and reals. The relative cost of ASN.1, compared to a simpler encoding, depends on the type of the elements. To cite, in order:

* the handling of REALs is ludicrous. It is almost equivalent to converting them to strings using something like <sprintf/sscanf>, only more complex.

* the handling of INTEGERs, as explained by MTR, uses a variable-length, big-endian, two's-complement notation. The coding can be optimized somewhat, e.g. by predicting the range of the element or by loosening your compliance with the BER (as advocated by some experts from DEC); the decoding routines, on the other hand, must be general. Their cost depends on the quality of the implementation; I measured a ratio of 10 to 20 between BER decoding of integers and a simple ``move''; a byte-order reversal would have cost 5 to 6 moves.

* the handling of strings is comparable to what can be found in other syntaxes: the coding of the tag and length fields may take 20 instructions instead of 1 or 2, but it is outweighed by the copying of the string itself, which is syntax independent. The decoding can, however, be made much longer if your partner insists on passing its strings as ``sequences of segments''.

* the handling of structures requires an analysis of the tag and length fields. A sequence without optional components has a very low cost; a set is very costly; a choice costs marginally more than a ``case of'' operation; an array (set of, sequence of) takes significantly longer to decode than a simple ``count + pointer'' representation, as one can only determine the number of elements (useful for malloc, bounds checking, etc.) by decoding each element and testing the ``length'' fields.
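Huitema's point about INTEGER decoding having to stay general can be sketched as follows (hypothetical code, not the INRIA implementation he measured): every contents byte must be fetched, shifted, and merged, and the sign must be extended from whatever width the sender chose, where a fixed-width field would be a single move or a short byte-order reversal.

```c
#include <stddef.h>

/* Hypothetical sketch of the general case: decode a variable-length,
 * big-endian, two's-complement BER INTEGER contents field (p, len > 0)
 * into a native long.  Assumes len <= sizeof(long). */
long ber_decode_int(const unsigned char *p, size_t len)
{
    /* Seed with all-ones or all-zeros according to the sign bit of
     * the first contents byte, so the result is sign-extended. */
    unsigned long v = (p[0] & 0x80) ? ~0UL : 0UL;

    for (size_t i = 0; i < len; i++)
        v = (v << 8) | p[i];      /* one fetch + shift + or per byte */
    return (long)v;
}
```

The per-byte loop plus the sign test is where the 10-to-20 ratio against a simple move comes from; a sender free to use any length from 1 byte up leaves the receiver no fixed-width shortcut.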
In short: the efficiency of ASN.1/BER depends on the complexity of the structure and on the type of the elements. If a simple structure contains mostly strings, ASN.1 is very efficient. If a semi-complex structure contains a mix of strings and integers (or booleans), one can expect 1 to 4 instructions per byte -- depending on the mix. If the message is a matrix of floating point numbers exchanged between a Cray and a graphical workstation, then ASN.1 is about as good as converting the matrix to a text file...

In short, if the application deals mostly with numbers, one should try to negotiate something other than the BER. For information handling protocols like MHS, X.500, DFR or CMIP (and SGMP) the cost of the encoding is probably in the 1-4 inst/byte range. This is significant, although probably much less than the cost of retrieving the information itself. And one should point out that the coding of the DNS is about as costly...

Christian Huitema
karl@asylum.SF.CA.US (Karl Auerbach) (06/14/90)
In article <8020@mirsa.inria.fr> huitema@jerry.inria.fr (Christian Huitema) writes:

>A few comments on the efficiency of ASN.1. We did quite a lot of work on
>that at INRIA
>* the handling of strings is comparable to what can be found in
>other syntaxes: the coding of the tag and length fields may take 20
>instructions instead of 1 or 2, but it is outweighed by the copying of
>the string itself, which is syntax independent. The decoding can, however,
>be made much longer if your partner insists on passing its strings as
>``sequences of segments''.

I've found that constructorization of strings can add substantially to the overhead of the decoder. Have you considered this in your numbers?

Thanks for your comments.

			--karl--
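The overhead Karl is asking about can be illustrated with a hypothetical sketch (not from either author's code): flattening a constructed OCTET STRING forces the decoder to parse a header and perform a separate copy for every segment, where a primitive encoding needs a single memcpy.

```c
#include <string.h>
#include <stddef.h>

/* Hypothetical sketch: flatten the contents of a constructed OCTET
 * STRING whose segments are primitive 0x04 TLVs with short-form
 * lengths (one-level constructors only, for brevity).  Returns the
 * number of bytes written to out, or -1 on malformed input. */
long flatten_octet_string(const unsigned char *in, size_t inlen,
                          unsigned char *out)
{
    size_t i = 0, o = 0;

    while (i < inlen) {
        /* Each segment costs a header parse before any data moves. */
        if (i + 2 > inlen || in[i] != 0x04 || in[i + 1] >= 0x80)
            return -1;
        size_t seg = in[i + 1];
        if (i + 2 + seg > inlen)
            return -1;
        memcpy(out + o, in + i + 2, seg);   /* piecewise copy */
        o += seg;
        i += 2 + seg;
    }
    return (long)o;
}
```

With segmentation, the cost is no longer dominated by the one large copy Huitema describes for primitive strings: each small segment re-pays the tag-and-length overhead, which is presumably the effect Karl's profiler is seeing.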