[gnu.emacs.bug] Really large files and malloc

grossman@polya.Stanford.EDU (Stu Grossman) (03/31/89)

I have been hacking on things recently which will allow gnuemacs to deal with
really large files properly.  The goal is to make gnuemacs either accommodate
the file properly or gracefully decline to edit it.

Current problems as of 18.53 include emacs hanging when it attempts to read a
file whose size is just large enough to make sbrk() fail.  I fixed one of these
problems a few days ago (see my previous posting), but there are probably
others waiting in the wings.  There are also a bunch of problems with reading
files that are >= 2^23 bytes (on Vaxen & Suns).  These include strange display
behavior and the inability to find the previous line of the file when point is
at the end of the buffer.

As of now, I have fixed the problems with files >= 2^23 bytes.  For the most
part, they were sign extension problems.  The solution was to treat all buffer
pointers as XUINTs (XFASTINTs, actually); as far as I know, negative buffer
pointers aren't legal (are they?).  I will be posting these fixes shortly too.
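
To illustrate the sign extension problem, here is a minimal sketch.  It is not
the code from lisp.h: it assumes a 24-bit value field (which is consistent
with the 2^23 threshold above), and the names sketch_xint, sketch_xuint and
SKETCH_VALBITS are made up for the example.

```c
#include <stdint.h>

/* Assumption for illustration: the value field is 24 bits wide. */
#define SKETCH_VALBITS 24
#define SKETCH_VALMASK ((UINT32_C(1) << SKETCH_VALBITS) - 1)

/* Signed extraction: the top bit of the value field (bit 23) is
   treated as a sign bit, so any value >= 2^23 comes back negative. */
int32_t sketch_xint(uint32_t obj)
{
    uint32_t v = obj & SKETCH_VALMASK;
    if (v & (UINT32_C(1) << (SKETCH_VALBITS - 1)))
        return (int32_t)v - (int32_t)(UINT32_C(1) << SKETCH_VALBITS);
    return (int32_t)v;
}

/* Unsigned extraction: mask off everything above the value field
   and keep the full 24-bit range. */
uint32_t sketch_xuint(uint32_t obj)
{
    return obj & SKETCH_VALMASK;
}
```

With these definitions, a buffer position of 8388608 (2^23) reads back as
-8388608 through the signed accessor but unchanged through the unsigned one.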

Now, for the more difficult problems:

1) In order to represent buffer positions >= 2^23 bytes, there needs to be a
   %u format primitive.  (I will gladly write it upon request.)

2) There is a much more insidious problem with malloc and Lisp objects.  I
   discovered that, after reading several large files, malloc started handing
   out memory whose addresses need more than VALBITS bits.  These addresses
   were being truncated when stuffed into Lisp objects.  Later references to
   the data of the Lisp object resulted in a reference to a random piece of
   memory.  Needless to say, emacs blew its wad shortly thereafter.

   This brings two solutions to mind:
	1) Make malloc not give out memory whose address (start&end??) cannot
	   fit into VALBITS.  This may not be feasible, as some systems
	   (APOLLO) do not use gnu's malloc.  Also, VALBITS is not defined
	   if your system's C compiler supports typedefs of unions.  (Actually,
	   VALBITS should be defined and used in the definition of the
	   Lisp_Object union!)

	2) Make a different version of malloc which is guaranteed to return
	   addresses which will fit into VALBITS, and use this version only
	   for Lisp_Object.val's.  This would leave emacs with the
	   flexibility to use regular malloc for things that don't care where
	   they live in memory.
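
Either approach needs some test for "does this block fit into VALBITS".  A
rough sketch of such a test follows, together with a wrapper that declines the
allocation instead of letting the address be truncated later.  This is not the
allocator proposed in 2), only the check.  It again assumes VALBITS is 24, and
sketch_fits_valbits and sketch_lisp_malloc are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption for illustration: the value field is 24 bits wide. */
#define SKETCH_VALBITS 24
#define SKETCH_ADDR_LIMIT ((uintptr_t)1 << SKETCH_VALBITS)

/* Return 1 if every byte of the block [start, start + len) has an
   address that fits in SKETCH_VALBITS bits, and 0 otherwise. */
int sketch_fits_valbits(uintptr_t start, size_t len)
{
    if (start >= SKETCH_ADDR_LIMIT)
        return 0;
    return len <= SKETCH_ADDR_LIMIT - start;
}

/* Allocate with the ordinary malloc, but give the block back and
   report failure if its addresses could not be stored untruncated. */
void *sketch_lisp_malloc(size_t len)
{
    void *p = malloc(len);
    if (p != NULL && !sketch_fits_valbits((uintptr_t)p, len)) {
        free(p);
        return NULL;
    }
    return p;
}
```

Note that on a machine where malloc never hands out addresses below 2^24, the
wrapper will always return NULL; the point is only that the caller gets a clean
failure it can report, rather than a Lisp object pointing at random memory.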

Any comments?

		Stu Grossman

lnz@LUCID.COM (Leonard N. Zubkoff) (03/31/89)

Stu,

When my Apollo modifications are installed, it does run fine with GNU's malloc.

		Leonard