ross@spam.ua.oz.au (Ross Williams) (06/22/91)
CRIT_004: CONTIGUOUS MEMORY AND TYPING (CONTINUED)
--------------------------------------------------
Sender: jloup@chorus.fr
Date: 22-Jun-1991.

|| One problem here is that many algorithms allocate heap items
|| incrementally as they build their data model. It seems silly for an
|| algorithm that is told that it can have three megabytes to make 100000
|| allocation calls only to later find out that it only has to process 1K
|| of data! If the standard has to be changed so that the algorithm
|| allocates its own memory, then I am in favour of allowing it to do so
|| incrementally, but to ensure that it does not use more than it has
|| been allowed.
|
|I also agree that incremental allocation is important for some algorithms.
|But I don't think that you need bullet-proof protection against
|algorithms which do not follow the rules. If you cannot prove from
|the source of your algorithm that the amount of required memory is
|bounded by a certain value, you are in trouble anyway. Even in your
|original proposal you could not guard against malicious use of the
|stack or heap without checking the source code.

One of the troubles with allowing an algorithm to do its own
allocations is that you run into heap fragmentation problems. There
may be enough memory left, but if it is fragmented and the algorithm
wants another slightly largish lump, then you may get a failure half
way through a compression. That is why I decided to add a result
status to all data compression procedure calls when I agreed to allow
algorithms to allocate their own memory on the fly.

I see no harm in creating a bullet-proof standard even if it cannot be
automatically enforced. The Ada language definition has a concept of
"erroneous" execution which is a bit like this.

|| I agree with your analysis and propose the following:
||
|| 1. The identity record specifies the maximum amount of memory to be
||    used.
|| 2. This memory may be allocated from the heap only.
||    (However, the procedure can still use up to 1024 bytes of other
||    memory.)
|
|I think you mean that the memory must not be allocated before the
|"Initialize data structures" call and must be deallocated no later
|than the "Destroy data structures" call. This is sufficient to
|disallow static data. There is no need to make an explicit reference
|to "heap". I am not convinced that you must disallow stack usage.
|Many architectures do not have small limits on the stack size, and
|stack space is cheaper than heap space. Of course, the part of the
|data structures which must live across compression calls must be in
|the heap and accessed through the MEMORY parameter, but temporary
|data could be on the stack. (For example, the LZW uncompress requires
|a stack to reverse the bytes of a code string.)

OK, OK, I'm being a martinet again. Agreed: the standard has to be
more flexible than I have made it about where memory comes from. I am
getting uneasy about how the standard is loosening up, though. I
wanted to place a bound on the stack so that procedures wouldn't cause
trouble on machines with a small stack space (e.g. some microprocessor
embedded systems).

I like your definition for disallowing static data. However, it does
allow multiple activated instantiations to communicate with each
other, so extra conditions might have to be added.

|| 4. The memory parameter will become an INOUT generic pointer in which
||    the algorithm should place a reference to the root of its
||    data structure during an init call.
|
|Please allow this pointer to have a meaningful type for strongly
|typed languages. In other words, the type may be different for
|different algorithms. In C terminology, allow (struct foo *) instead
|of (void *).

This is up to the writer of the child standard for each programming
language. The generic standard is concerned only with the rough
information flows and protocol.

|| Jean: Thanks for your comments. Do you agree with the proposal as
|| given above?
|
|Yes.
In particular, it is nice to allow incremental allocation, at the
cost, as I mentioned earlier, of needing a result status for all calls
(because of fragmentation). However, the status is also useful for
allowing the algorithm to notify of data corruption, so it is probably
worth it.

|Jean-loup
|jloup@chorus.fr
|
|PS: My only strong disagreement is to be named Jean :-) Hyphenated
|French names must not be split.

OK.

|| PS: Please don't think I'm a typeless troglodyte. My SAKDC algorithm
|| is written in Ada and is full of rock-solid, highly typed abstractions
|| using great big heap-based data structures. It's just that my brain
|| has been warped by coding in C recently.
|
|I have myself fallen from the Ada language design team down to earth
|(C++) :-) But I'd like to get your SAKDC.

I'm impressed. I think that you've created a big monster (I have been
involved with safety-critical systems and sympathize with Hoare's
criticism), but I have found Ada to be a friendly big monster who
looks after you most of the time. I like it. Well done.

Yet another vote for digging up SAKDC. I might even do it soon.

Ross Williams
ross@spam.ua.oz.au
ian@airs.com (Ian Lance Taylor) (06/22/91)
ross@spam.ua.oz.au (Ross Williams) writes:

>OK, OK, I'm being a martinet again. Agreed: the standard has to be
>more flexible than I have made it about where memory comes from. I am
>getting uneasy about how the standard is loosening up, though. I
>wanted to place a bound on the stack so that procedures wouldn't
>cause trouble on machines with a small stack space (e.g. some
>microprocessor embedded systems).

Your standard seems to be driving at two different goals at once:

1) Allow programs to use several different compression subroutines.
2) Require compression subroutines to be portable between machines.

The point you are making above seems to relate only to goal 2. I
think the real benefit of the standard will be goal 1. It's going to
be awfully hard to make all compression routines portable to all
machines anyhow. That seems to me to be, as the C language standard
would say, a ``quality of implementation'' issue for the compression
routine itself. Furthermore, an implementation on a machine with lots
of stack space could use the stack; an implementation on a machine
without stack space could use some other memory. There shouldn't be a
requirement that the same compression code work on different machines;
only that if a different machine is supported, the same functionality
is provided.

Allowing programs to use different compression routines with a
standard interface seems to me to be a worthy goal. Specifying
details about where the memory comes from seems needlessly
machine-specific.
--
Ian Taylor   ian@airs.com   uunet!airs!ian
First person to identify this quote wins a free e-mail message:
``If he could have moved, he would have gotten up and gone after the
man to thank him for wearing something so marvelously interesting.''