[comp.lang.scheme.c] scheme read

johns@cmell.cs.unm.EDU (John Sturtevant) (06/03/91)

We have recently discovered that we cannot read in data files larger than
2**24 (16,777,216) bytes.  Unfortunately, our research requires that we
read input files larger than this.  Is there a workaround for this so that
larger files can be read?  We are currently running Scheme 7.0 on both a
Sun 360 and a DecStation 5000 (mips).

Any help that you can give in overcoming this problem would be greatly
appreciated.

John Sturtevant
johns@unmvax.cs.unm.edu

jinx@zurich.ai.mit.edu (Guillermo J. Rozas) (06/04/91)

In article <9106031616.AA06186@cmell.cs.unm.edu> johns@cmell.cs.unm.EDU (John Sturtevant) writes:

   We have recently discovered that we cannot read in data files larger than
   2**24 (16,777,216) bytes.  Unfortunately, our research requires that we
   read input files larger than this.  Is there a workaround for this so that
   larger files can be read?  We are currently running Scheme 7.0 on both a
   Sun 360 and a DecStation 5000 (mips).

   Any help that you can give in overcoming this problem would be greatly
   appreciated.

   John Sturtevant
   johns@unmvax.cs.unm.edu

By "read" I assume you mean read and hold in memory.  You should
certainly be able to read files longer than that, as long as you don't
attempt to hold the whole thing in memory at once.
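
To illustrate what "not holding the whole thing at once" might look
like, here is a minimal sketch in Python rather than Scheme.  The
function name, the chunk size, and the idea of counting bytes and
newlines are all made up for the example; the point is only that the
amount of data held in memory is bounded by the chunk size, not by the
file size.

```python
def summarize_in_chunks(path, chunk_size=1 << 16):
    """Return (total_bytes, newline_count) for the file at `path`.

    Hypothetical helper: only one chunk of at most `chunk_size` bytes
    is held in memory at any time, however large the file is.
    """
    total_bytes = 0
    newline_count = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty result means end of file
                break
            total_bytes += len(chunk)
            newline_count += chunk.count(b"\n")
    return total_bytes, newline_count
```

Whether this helps depends on whether your computation can be done
incrementally; if it truly needs all of the data resident at once, the
options below apply instead.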

There are a few things you can do.  None of them is a great
solution, but they may solve your problem:

- Move to 7.1.  The address space in 7.1 is 2^26 rather than 2^24
bytes.  On the other hand, each of the two heaps gets half of that
space, so the real limit on live data is 2^25 bytes.  7.1 also has a
compiler back end for the DecStation 5000, so you may find it much
faster than 7.0 on that machine.

- Use bchscheme instead of scheme.  bchscheme uses a file on disk
rather than memory for its spare heap, so it can use the full 2^24
or 2^26 address space for data.  I suspect that you are already using
it; otherwise you would only be able to handle objects up to 2^23
bytes long in 7.0.
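
The arithmetic behind the two options can be spelled out with a small
sketch (Python, purely for illustration).  It takes the address-space
sizes quoted above at face value, in bytes, and assumes the only
difference between scheme and bchscheme is whether the spare heap
takes up half of that space.  The function name is hypothetical.

```python
def usable_bytes(address_bits, spare_heap_on_disk):
    """Bytes available for data under the assumptions stated above.

    With the spare heap in memory (plain scheme) only half of the
    address space holds data; with it on disk (bchscheme) all of it does.
    """
    total = 1 << address_bits
    return total if spare_heap_on_disk else total // 2


# Address-space sizes as quoted in the post: 2^24 for 7.0, 2^26 for 7.1.
for version, bits in (("7.0", 24), ("7.1", 26)):
    for program, on_disk in (("scheme", False), ("bchscheme", True)):
        print(f"{version} {program}: {usable_bytes(bits, on_disk):,} bytes")
```

Under these assumptions the best case, bchscheme on 7.1, is
67,108,864 bytes, four times the limit you are hitting now.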

I'm afraid that I can't suggest anything better.  Further increases in
the address space for MIT Scheme will require quite a bit of work, and
are not likely to happen soon.  Note that even if MIT Scheme didn't
have problems extending the address space (which it does), the MIPS
architecture has only a 26-bit absolute address in its jump/call
instructions, so its designers really don't expect processes whose
address range is much larger than that, and handling such a range
would be a real pain.

In addition, MIT Scheme's GC is really not up to handling such
amounts of persistent memory.  You may be better off using some other
tool.