jqj@bullwinkle (12/30/85)
From: jqj@bullwinkle (J Q Johnson)
One of the design goals of C was to produce a language with very simple and 
efficient variable allocation.  The result, with no nested procedures and 
only local and global variables (no uplevel references, no heap variables
though you do have heap storage, no variable-dimensioned arrays) produces
a language that can be implemented without a frame pointer.  This is
a desirable characteristic of the language, and implies that you should
not expect all or even the typical implementation to use a frame pointer.
Unfortunately, since the VAX CALLS/RET discipline uses a frame pointer,
the alloca() routine exists.  And since it exists on the VAX, it is used
heavily by GNU emacs.  However, its use conflicts with one of what I take
to be GNU emacs's design goals -- portability.  I would like to improve
that portability if possible.  My particular domain is porting GNU emacs 
to a Gould PN.  I see several alternatives, and would like advice on which 
to pursue.
Note that emacs uses alloca() in a very stylized way -- almost entirely
to provide dynamically dimensioned arrays.  One never sees code like:
foo() { ...
	while (baz) {
		p->next = alloca(sizeof(p)+n); p = p->next;
	...
So alloca() would not have been necessary at all in an Algol-like language.
My alternatives seem to be:
    1/	implement a full alloca() for the Gould.  This would not be hard
but would be a gross assembler hack -- alloca()ed variables would actually
be malloc()ed, but alloca() would push the ptr on a private stack and
fudge the caller's return to free() then pop the private stack.  setjmp()
would be extended to mark the private stack, and longjmp() would do a
number of free()s.  The only visible change to existing code would
be redefining jmp_buf by changing the #include to reference a private
setjmp.h.  However, as I remarked above this would be a gross assembler
hack, and nonportable.  Also, it would be impossible to preserve the
alloca() semantics during interrupts, and would limit you to a (small)
fixed number of outstanding alloca()s based on the size of the private 
stack.
    2/	recode entirely in C to eliminate the need for alloca() as such.
This could be done in several ways.  Perhaps the simplest would be to
have alloca() allocate on the heap, saving the pointer on a private stack,
and in alloca()'s caller record the private stack top in a local variable.
Add a "mark()" routine at the beginning of each function that uses alloca()
and a "release()" routine at each return: 
	old:				new:
foo() { ...			foo() { ...
  char *name=alloca(xxx);	  int alloclim = mymark();
  ...				  char *name=myalloca(xxx);
       return baz;		  ...
  ...					{ myrelease(alloclim); return baz; }
}				  ...
				  myrelease(alloclim);
				}
Similarly, setjmp would be protected with a mark/release:
	old:				new:
if (setjmp(buf))		int alloclim; ...
	foo;			if (alloclim = mymark(), setjmp(buf))
					(myrelease(alloclim), foo);
This scheme has the disadvantage of running slower if a true alloca() exists
(unless all the code is conditionalized, which would be a maintenance 
headache).  It also makes the code less clear, again increasing maintenance.
But it has the advantage of being completely portable.
   3/	Recode GNU emacs to do compile-time maximum-size array allocations 
instead of runtime bounds.  This is typical C style but could increase
(dramatically!) the size of the stack.  Except in a few places where reasonable
bounds are known in advance (e.g. dispnew.c), I think this would be a bad idea.
   4/	Hack up the Gould C compiler to use a stack frame.  And it's still
non-portable.
   5/	Do something else entirely.  (e.g. use Hemlock?)
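A minimal sketch of the heap-backed mark/release routines described in 2/,
under some assumptions: the private stack is a fixed-size array (the limit
ALLOC_STACK_MAX is made up), myalloca() returns NULL when it cannot allocate,
and copy_length() is a hypothetical caller written in the style of the "new"
column.  It is meant as an illustration, not as what emacs would actually
need.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical limit on outstanding allocations (an assumption). */
#define ALLOC_STACK_MAX 256

static void *alloc_stack[ALLOC_STACK_MAX];  /* private stack of heap pointers */
static int alloc_top = 0;                   /* index of the next free slot */

/* Record the current top of the private stack. */
int mymark(void)
{
    return alloc_top;
}

/* Allocate on the heap and remember the pointer; NULL on failure. */
void *myalloca(size_t size)
{
    void *p;

    if (alloc_top >= ALLOC_STACK_MAX)
        return NULL;                        /* private stack is full */
    p = malloc(size);
    if (p != NULL)
        alloc_stack[alloc_top++] = p;
    return p;
}

/* Free everything allocated since the matching mymark(). */
void myrelease(int mark)
{
    assert(mark >= 0 && mark <= alloc_top);
    while (alloc_top > mark)
        free(alloc_stack[--alloc_top]);
}

/* Hypothetical caller: copy up to n chars into scratch space, return length. */
int copy_length(const char *src, size_t n)
{
    int alloclim = mymark();
    char *name = myalloca(n + 1);
    size_t i;

    if (name == NULL) {
        myrelease(alloclim);
        return -1;
    }
    for (i = 0; i < n && src[i] != '\0'; i++)
        name[i] = src[i];
    name[i] = '\0';
    myrelease(alloclim);                    /* release before every return */
    return (int)i;
}
```

Because myrelease() frees down to a recorded mark, a release that was skipped
in a callee is still cleaned up by the next release in any caller.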
1/, 2/, or 3/ would each take, I estimate, less than a person-week to 
program and debug.  4/ would take several person-months for the average 
hacker.
gwyn@BRL.ARPA (12/30/85)
From: Doug Gwyn (VLD/VMB) <gwyn@BRL.ARPA>
Alloca() is definitely not portable, and Dennis Ritchie even tried to
suppress it altogether in 7th Edition UNIX (obviously not fully
successfully).  Except for the action of longjmp(), alloca() does
nothing that malloc()/free() would not do.  I don't understand why
GNU EMACS wants to use setjmp() and alloca() so heavily; in the
over 500,000 lines of source code that I maintain, there is not a
single use of alloca(), and only a few places where it could have
been used effectively if available.
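To illustrate the malloc()/free() point, here is a small sketch of a routine
that needs a scratch buffer whose size is only known at run time.  The
function is_palindrome() is a made-up example, not taken from any of the code
discussed here.  Note the caveat already mentioned: a longjmp() out of such a
routine would skip the free() and leak the buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if s reads the same reversed, 0 if not,
 * and -1 if the scratch buffer cannot be allocated.
 */
int is_palindrome(const char *s)
{
    size_t n, i;
    char *tmp;
    int result;

    assert(s != NULL);
    n = strlen(s);
    tmp = malloc(n + 1);            /* where alloca(n + 1) might be used */
    if (tmp == NULL)
        return -1;
    for (i = 0; i < n; i++)
        tmp[i] = s[n - 1 - i];      /* build the reversed copy */
    tmp[n] = '\0';
    result = (strcmp(s, tmp) == 0);
    free(tmp);                      /* explicit release before returning */
    return result;
}
```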
macrakis@harvard.UUCP (Stavros Macrakis) (12/31/85)
In article <21@cornell.UUCP>, jqj@bullwinkle writes:
> the alloca() routine ... is used heavily by GNU emacs....
> My alternatives seem to be:
>     1/ implement a full alloca() for the Gould. ...
>     2/ recode entirely in C to eliminate the need for alloca()...
>     3/ Recode GNU emacs to do compile-time maximum-size array allocations...
>     4/ Hack up the Gould C compiler to use a stack frame....
>     5/ Do something else entirely.  (e.g. use Hemlock?) ....
> 1/, 2/, or 3/ would each take, I estimate, less than a person-week to
> program and debug.  4/ would take several person-months....
Why not the very simplest solution to alloca: a parallel stack for alloca'd
objects with Mark/Release?  This stack would have to be as large as the total
size of allocas alloc'able at one time.  In a virtual memory system with large
address space, there should be no problem at all.  Part of the charm of a
Mark/Release scheme (as of regular alloca) is that it is robust: if a Release
is `forgotten' via some coding or timing error, the stack still gets cleaned
up eventually.
	#define Max_Total_Alloca 10000
	char linear_heap[Max_Total_Alloca];
	int linear_heap_pointer = Max_Total_Alloca;
Every routine that used alloca would have to include the Mark macro among its
declarations (of course, it doesn't hurt to have a Mark even if there is no
alloca within the routine):
	#define Mark int Mark_point = linear_heap_pointer;
...and would release just before returning:
	#define Release linear_heap_pointer = Mark_point;
...for convenience:
	#define Return(x) {Release; return(x);}
Setjmp is slightly more complicated.  Ideally, the setjmp environment should
include the linear_heap_pointer.  If this cannot be accomplished, then every
statement that has a setjmp within it must become:
	{ Mark; ... setjmp ... Release; }
Ideally, Setjmp would just be
	#define Setjmp (Mark, setjmp(x) + Release)
but this won't work for two reasons:
    1. Mark declares a variable, and there can be no declarations within C
       statements;
    2. the order of evaluation of f()+g() is undefined, and in particular on
       the Vax is the wrong way round.
The function alloca itself is very simple:
	#define alloca(x) (Mark_point, _alloca(x))
	/* Defined like this to give an error if Mark has not been used. */
	char *_alloca(size)
	int size;
	{
		if ((linear_heap_pointer -= size) < 0) ...error...;
		return(&(linear_heap[linear_heap_pointer]));
	}
Note that as a standard precaution the reset-world function should reset the
linear_heap_pointer.  So should the top-level loop.
Since there are only 46 alloca's and 6 setjmp's in all of gnumacs (and several
functions have several allocas), I estimate less than one man-day to install
this type of alloca.  Since I'm working on my thesis, I am not volunteering
(not to mention that my machine has alloca!).
Note that there is no need to conditionalize the Mark/Return's in
implementations with regular alloca's, since you can just define them to be
null:
	#define Mark
On machines with regular alloca, the alloca macro as above can be preserved to
provide a compile-time error indication in case of forgetting to use Mark.  Of
course, this doesn't guarantee that Release is used consistently.
-s
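The fragments in this post can be pulled together into one compilable unit
roughly as follows.  This is a sketch under assumptions, not the poster's
exact code: the macro is named Alloca to avoid clashing with a system
alloca(), Mark_point is an int so that it matches linear_heap_pointer,
exhaustion returns NULL where the post leaves the error action open, and
copy_and_measure() and parallel_stack_in_use() are made-up helpers for
illustration.

```c
#include <assert.h>
#include <string.h>

#define MAX_TOTAL_ALLOCA 10000

static char linear_heap[MAX_TOTAL_ALLOCA];
static int linear_heap_pointer = MAX_TOTAL_ALLOCA;  /* grows downward */

/* Mark goes among a routine's declarations; Release resets the pointer. */
#define Mark       int Mark_point = linear_heap_pointer
#define Release    (linear_heap_pointer = Mark_point)
#define Return(x)  do { Release; return (x); } while (0)

/* Mentioning Mark_point makes a missing Mark a compile-time error. */
#define Alloca(n)  ((void)Mark_point, alloca_impl(n))

static char *alloca_impl(int size)
{
    assert(linear_heap_pointer >= 0 && linear_heap_pointer <= MAX_TOTAL_ALLOCA);
    if (size < 0 || size > linear_heap_pointer)
        return NULL;                /* parallel stack exhausted */
    linear_heap_pointer -= size;
    return &linear_heap[linear_heap_pointer];
}

/* Hypothetical helper: bytes of the parallel stack currently in use. */
int parallel_stack_in_use(void)
{
    return MAX_TOTAL_ALLOCA - linear_heap_pointer;
}

/* Hypothetical caller: copy s into scratch space and return its length. */
int copy_and_measure(const char *s)
{
    Mark;
    int n = (int)strlen(s);
    char *buf = Alloca(n + 1);
    int len;

    if (buf == NULL)
        Return(-1);
    memcpy(buf, s, (size_t)n + 1);
    len = (int)strlen(buf);
    Return(len);
}
```

The scratch space is a char array, so this sketch only hands out character
buffers; storing other types there would need attention to alignment.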
shaddock@rti-sel.UUCP (Mike Shaddock) (01/01/86)
In article <21@cornell.UUCP> jqj@bullwinkle writes:
>From: jqj@bullwinkle (J Q Johnson)
>
> ... Here jqj talks about the initial design goals of C, the rise of
> alloca(), and the problems of porting GNU Emacs to a Gould PN
> machine.
I too am trying to port GNU Emacs 16.60 to a Gould machine, and have run into
some of the problems mentioned here.  I have stopped working on it for the
time being, but have decided that several unfortunate design decisions were
made:
(1) Non-portable use of alloca.
    As pointed out, alloca will not work on machines without stack and frame
    pointers.  When I pointed this out on the GNU Emacs mailing list, the
    response was that alloca "was used because it was good for GNU, that GNU
    would have alloca, and that the author did not intend to support machines
    that were deficient in important features such as not having stack and
    frame pointers".  According to some people very familiar with Gould
    machines, implementing a full alloca would be very, very hard, and
    certainly not worth the time for one program.
(2) Use of setjmp/longjmp.
    GNU Emacs uses setjmp/longjmp in several places, when it would have been
    more portable (and more correct in a pure theoretical sense) to return
    error codes, etc. instead of just jumping to some other place in the
    program.
(3) Implementation of Lisp_Objects
    Why wasn't a simple structure, such as
	typedef struct {
		char *multiple_things;
		char type;
	} Lisp_Object;
    or
	/*
	 * No flames if this is slightly incorrect, I
	 * avoid using unions and am not up on the syntax
	 */
	typedef struct {
		union {
			/* Put all the different things */
			/* a Lisp_Object can be here */
		} foo;
		char type;
	} Lisp_Object;
    used instead of this rather gross way of hacking on an int?  What would
    happen if a pointer needed more than the 24 bits that it gets in the int?
    This may be handled correctly (I couldn't determine that from the code),
    but it would certainly be a little less obscure to use a structure.  Some
    places in the code seem to *depend* on Lisp_Objects being ints, so it is
    non-trivial to change this for a particular machine.
(4) Unexec
    Use of unexec is a gross hack merely for a little efficiency.
Given sufficient time I could probably re-write enough of GNU Emacs to fix
these problems, but I have neither the time nor the inclination to do so.
Some of these complaints may be answered in the newest version of GNU Emacs,
but I have a copy of 17.31 and it doesn't look like it.  GNU Emacs seems to be
a very good program from the user's point of view, and could increase people's
productivity, but I'm afraid that the attitude of some of the people involved
with GNU may be its downfall.  Portability is an important issue, and unless
GNU, including GNU Emacs, is highly portable, I'm afraid that it won't be as
successful as it should be.
-- 
Mike Shaddock	{decvax,seismo,ihnp4}!mcnc!rti-sel!shaddock
"You're in a twisty maze of sendmail rules, all obscure."
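For what the structure suggested in point (3) might look like once filled
in, here is a sketch.  The type tags, the union members, and the helper
functions are all assumptions made for illustration; the post does not say
which kinds of object a Lisp_Object has to hold, and this is not how GNU
Emacs defines the type.

```c
#include <assert.h>

/* Hypothetical type tags; the real set of Lisp types is not given here. */
enum lisp_type { LISP_INT, LISP_STRING, LISP_CONS };

struct lisp_cons;                   /* forward declaration for the union */

typedef struct {
    union {
        long integer;               /* valid when type == LISP_INT */
        char *string;               /* valid when type == LISP_STRING */
        struct lisp_cons *cons;     /* valid when type == LISP_CONS */
    } u;
    char type;                      /* one of enum lisp_type */
} Lisp_Object;

struct lisp_cons {
    Lisp_Object car;
    Lisp_Object cdr;
};

/* Wrap a C integer as a Lisp_Object. */
Lisp_Object make_int(long n)
{
    Lisp_Object obj;

    obj.u.integer = n;
    obj.type = LISP_INT;
    return obj;
}

/* Wrap a C string pointer (not copied) as a Lisp_Object. */
Lisp_Object make_string(char *s)
{
    Lisp_Object obj;

    obj.u.string = s;
    obj.type = LISP_STRING;
    return obj;
}

/* Extract the integer, checking the tag first. */
long int_value(Lisp_Object obj)
{
    assert(obj.type == LISP_INT);
    return obj.u.integer;
}
```

The pointer members keep their full width, which addresses the 24-bit worry,
at the cost of an object that is larger than a single int.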
daveb@shrew.UUCP (Dave Brower) (01/05/86)
[ I expect to be corrected ]
GNU is not, as far as I can see, intended to be really portable to things
like BSD, SV, SV.2, etc., on different machines.  Since GNU is intended to be
a complete *NIX replacement, rms and the team of hackers can feel free to
assume the presence of alloca() on every machine in the universe.  They are,
after all, designing the universe.  It just happens that GNU emacs is
available before the rest of the system, and if you can use it, great.
I've been contemplating the conditional/macro approach to alloca on different
machines, since you really do want to use it on a machine that will support
it.
-dB
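One way the conditional/macro approach might look is sketched below.
HAVE_ALLOCA is a hypothetical configuration macro, the header <alloca.h> is
an assumption that holds on many systems but not all, and TEMP_ALLOC,
TEMP_FREE, and sum_of_squares() are names made up for this example.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * HAVE_ALLOCA is a hypothetical switch set at build time.  With it, the
 * temporary storage comes from the stack and TEMP_FREE does nothing;
 * without it, the same code falls back to malloc()/free().
 */
#ifdef HAVE_ALLOCA
#include <alloca.h>                 /* assumed location of alloca() */
#define TEMP_ALLOC(n)  alloca(n)
#define TEMP_FREE(p)   ((void)(p))
#else
#define TEMP_ALLOC(n)  malloc(n)
#define TEMP_FREE(p)   free(p)
#endif

/* Hypothetical caller: sum 1*1 + 2*2 + ... + n*n via a scratch array. */
long sum_of_squares(int n)
{
    long *tmp;
    long total = 0;
    int i;

    assert(n > 0);
    tmp = TEMP_ALLOC((size_t)n * sizeof *tmp);
    if (tmp == NULL)
        return -1;                  /* only possible on the malloc() path */
    for (i = 0; i < n; i++)
        tmp[i] = (long)(i + 1) * (i + 1);
    for (i = 0; i < n; i++)
        total += tmp[i];
    TEMP_FREE(tmp);                 /* must appear on every return path */
    return total;
}
```

The cost of this style is that every return path needs a TEMP_FREE, even on
machines where it expands to nothing.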
marick@ccvaxa.UUCP (01/06/86)
Setjmp/longjmp work just fine on Gould machines; only alloca() is missing.
I haven't looked at unexec in gnumacs, but I've written a rather large program
that does something similar for much the same reason.  It might be nice for 
unexec to be a library routine, but in my experience those people who use it 
all need to do something slightly different -- you need an example, not a 
library routine.
I can't provide a full example, but I can tell you how to write an unexec-like
thing for ZMAGIC files:
1.  Find the old executable -- you'll need it for the symbol table.
    The Franz Lisp gstab() routine is an example of what you need to do.
2.  Unlink the destination file (so creat() doesn't reuse the modes).
3.  Fetch the old header (out of the old executable or starting at the first
    word of this process's text space).
4.  Fill in the new header -- note that both header.a_data and header.a_nbdata
    must be updated if you've used sbrk().
5.  Write the new header out.  Its size in bytes is ZTXTOFF (from a.out.h).
6.  Write the text out, starting at header.a_txbase + ZTXTOFF and going for
    header.a_text-ZTXTOFF bytes.
7.  Write the data out.  header.a_text is rounded up to a multiple of pages,
    so you don't have to.
8.  Copy the symbol and string tables from the old executable 
    to the new one.  The symbol table starts at a_text + a_data in the old
    executable. 
The only problem I know of is a bug in signal handling.  In (at least some
distributions of) UTX, sigvec breaks when called "again" from a saved process
image.  See sigvec.s -- you will need to do a setsigc system call to get
around the problem.  This bug will be (has been?) fixed in later versions of
UTX.
Hope this helps someone.  Usual disclaimers about opinions apply; further, 
I've been known to make mistakes both writing and reading code, so anything
I wrote above could be wrong.
Brian Marick, Wombat Consort
Gould Computer Systems -- Urbana
...ihnp4!uiucdcs!ccvaxa!marick
ARPA:  Marick@GSWD-VMS
sjm@dayton.UUCP (Steven J. McDowall) (01/15/86)
Just a simple question from a new member of the net: where can I get GNU
emacs?  Also, how is the GNU project coming along, and where can I find
updated information on GNU?
-- 
Steven J. McDowall
Dayton-Hudson Dept. Store. Co.		UUCP: ihnp4!rosevax!dayton!sjm
700 on the Mall				ATT:  1 612 375 2816
Mpls, Mn. 55408