[gnu.emacs.bug] GNU Emacs, memory usage, releasing

mcdonald@aries.scs.uiuc.edu (Doug McDonald) (01/03/90)

In article <USER.90Jan3094015@pmax27.osf.org> USER@osf.org (Michael Meissner) writes:
>In article <1990Jan2.151615.19098@talos.uu.net> kjones@talos.uu.net
>(Kyle Jones) writes:
>
>| It's not really accepted as much as it's tolerated, simply because the
>| vast majority of Emacs users are not inconvenienced by the 16Mbyte
>| limit.  (Leonard Zubkoff pointed out only 5 type bits are actually
>| needed currently, so Emacs can be compiled to handle 64 Mbytes of memory
>| before it dies.)  If I had been in on the early development of GNU Emacs
>| I would have indeed screeched and howled about the pointer munging.
>| But as it stands now, the inconvenience of the memory limitation is far
>| outweighed by the work necessary to fix the problem.
>
>When I started working with GNU C, I noticed a synergy (sp?) between
>the GNU C compiler and emacs.  As has been pointed recently, the top
>byte on big endian systems, and the bottom byte on little endian
>systems is used to hold the type stuff (and the mark bits during
>garbage collection).  Anyway, the machine descriptions that I looked
>at had optimizations to do a load byte, rather than a load word and a
>shift.  Since a lot of the emacs lisp interpreter is looking at the
>type field, this can eliminate extra instructions in the code.
>
This seems to me to be disgusting. Why not just define these
"things" (the pointer with 8 bits of something else stuck in somewhere)
as 

struct THING {
char * ptr;
char tag;
}

?   
What will happen to the original method on machines that NEED a
full 32 (or 16 or 36) bits for a pointer (segmented architectures,
or 36 bit architectures)?  Won't it fail to compile?

This sort of stuff seems to indicate that Gnu software, at least this
part, is pooorly designed. Some of the Gnu stuff I have ported to my
PC has been among the worst C code I have seen. 

Doug McDonald

pcg@thor.cs.aber.ac.uk (Piercarlo Grandi) (01/04/90)

In article <129799@sun.Eng.Sun.COM> jpayne%flam@Sun.COM (Jonathan
Payne) writes:

   I picked linked list because I knew I could not store the files in
   memory simply because I wrote JOVE for the pdp11/70.

I have been using Jove for many years now. Well, five or six...
On PDPs.  I am now using it instead of GNU, by the way. While it
is coded in a way I disagree with, it has steadily improved, and
the mix of deatures offered/non offered is remarkably balanced.

   But since I moved to bigger machines [... and am going to use
   gap for Jove ... ]

Moving to bigger machines, and all hell breaks loose... Aaaaghhhh.

   I will never use a linked list of lines approach again.  Buffer gap is SO
   much better in lots ways from an implementor's point of view and often
   from the user's point of view.  Here are the reasons that come to mind:

As to that the simplest editor can be built with the two tape model of
buffer management, but I hardly recommend it...

	   o File i/o is blindingly fast.  It's standard to implement
	     read-file the following way: [ .... ]

This is also trivial if you use a log method for updates. You read in the
original, and then you keep a log of changes. It is also true that people
often edit files for long times, and cost of reading them in is not that
important.

	     Writing files is done in two chunks, the data to the left of
	     the gap and then the data to the right.

With the log method this is more complicated, but not dramatically so.

	   o A buffer position is represented as an integer. [ .... ]
	     All are possible with linked list, but the code is harder to
	     write, less reliable, and the implementation will be slower.

But not terribly slow. And with costs being constant, instead of O(n).

	   o Insert and delete operations of ANY kind are trivial to
	     implement. [ ... ]

As you say, just copy everything around... This is a *problem*, not a solution.

	   o No line length hassles.

You have to work a bit harder, On the other hand most lines an editor sees
are short. So you can just have more complicated code for the case where line
length overflows a predetermined limit.

	   o Undo/redo is trivial to implement in this scheme.

Undo/redo is also trivial with a log approach.

   The point is, moving the gap around is not that big a deal.

Oh yeah. Too bad that constant time alternatives exist, instead
of O(n), and that O(n) uses a lot of memory bandwidth and cycles
and paging and...  Of course, as always, if you are the only user
of a 10 MIPS 16 MByte workstation you can afford to waste a lot of
those. To me, however, waste is no good, if it can be easily obviated.

   I thought paging algorithms were geared towards sequential access,

Most paging algorithms implement some approximation of LRU; LRU is *guaranteed*
to trash on sequential access.

   >The better solution, made relatively easy by the reasonably modular and
   >layered structure of GNU emacs, would be to accept The Emacs Cookbook
   >recommendation (adopted by Jove and the MicroEmacs/Gnu family of editors) and
   >implement a linked list system. I would suggest just reading (or map, on the
   >operating systems that allow it) the file to be edited as a large lump of
   >virtual address space, then building a linked list of pointers to lines
   >therein, and then to allocate modified lines from ageneral purpose arena.
   >Informing the OS of the peculiar access patterns would also help, if
   >possible.

   Again, I think this would muck up the rest of the code in the editor.
	[ ... ]

   I'm not saying all this isn't possible.  It's just my belief that it's
   not clearly a win.  In fact, in my eyes, for 99% of the editing that I
   do, a buffer gap solution makes the most sense.

Well, well, there are techniques around that. Also, it pays to invest
more effort in a widely used utility like an editor than otherwise. The idea
that a frequently used, response time critical, tool should be coded quick
and dirty makes me object vehemently.
--
Piercarlo "Peter" Grandi           | ARPA: pcg%cs.aber.ac.uk@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcvax!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk

freedman@granite.cr.bull.com (Jerome Freedman) (01/04/90)

    What is the "Emacs Cookbook"?
    Can I get a copy?
    How?

                                         Jerry Freedman, Jr

kjones@talos.uu.net (Kyle Jones) (01/05/90)

Doug McDonald writes about the pointer munging in GNU Emacs:
 > This seems to me to be disgusting. Why not just define these
 > "things" (the pointer with 8 bits of something else stuck in somewhere)
 > as 
 > 
 > struct THING {
 > char * ptr;
 > char tag;
 > }
 > 
 > ?

Memory savings.  By making packing the type and garbage collection mark
bits into an object pointer, you get considerable saving per object.
That extra char for the tag can cost one to three extra bytes of storage,
due to the padding needed within the struct.

 > What will happen to the original method on machines that NEED a
 > full 32 (or 16 or 36) bits for a pointer (segmented architectures,
 > or 36 bit architectures)?  Won't it fail to compile?

Emacs could probably run on 36-bit machines, given the appopriate config
files.  16-bit machines can forget it, though.

richard@aiai.ed.ac.uk (Richard Tobin) (01/06/90)

In article <1990Jan3.151427.12532@ux1.cso.uiuc.edu> mcdonald@aries.scs.uiuc.edu (Doug McDonald) writes:
>  This seems to me to be disgusting. Why not just define these
>  "things" (the pointer with 8 bits of something else stuck in somewhere)
>  as 
>  
>  struct THING {
>  char * ptr;
>  char tag;
>  }

Because this will typically make each THING take up twice as much
space, which is often unacceptable. 

If portability is more of a concern than speed, one approach is to
use bitfields and treat the "address" as an array subscript.

-- Richard
-- 
Richard Tobin,                       JANET: R.Tobin@uk.ac.ed             
AI Applications Institute,           ARPA:  R.Tobin%uk.ac.ed@nsfnet-relay.ac.uk
Edinburgh University.                UUCP:  ...!ukc!ed.ac.uk!R.Tobin