boba@iscuva.ISCS.COM (Bob Alexander) (06/18/87)
Modern, memory managed operating systems (like UNIX) have addressed quite nicely certain special requirements of executable files. In particular, (1) the file (text and data) need not be loaded into memory in its entirety to begin executing, and (2) the pages can be shared among processes that are executing them (both on disk and in memory).

As far as I know, those capabilities are not made available to interpreters for their pseudo-code and data, even though they would be just as applicable as they are to "real" programs. If 15 users are running a program written in an interpretive language, the interpreter code is shared, but the p-code exists separately for each user. This is a major disadvantage in the use of interpretive languages to produce production programs.

Interpretive systems are in quite wide use today (e.g. shells, SQLs, (((Lisp))), Icon, etc., etc., [even BASIC]), and as processor speeds increase, use of interpreters will likely continue to grow.

There are a few ways of working this problem with existing UNIX facilities, but the ones I've come up with so far are kluges. My reason for posting to this newsgroup is to get your reaction to a possible new UNIX facility for this purpose. I'll express my suggestion in SVID format, sort of:

------------------------------

NAME
     vread -- read from a file into memory [but not really, maybe].

SYNOPSIS
     int vread(fildes, bufptr, nbyte)
     int fildes;
     char **bufptr;
     unsigned nbyte;

DESCRIPTION
     The function "vread" attempts to read "nbyte" bytes from the file associated with "fildes" into an allocated buffer whose address is returned in "bufptr". This function is similar to read(ba_os) [read(ba_os) is SVIDese for read(2)] except for its implications concerning virtual memory and that it allocates a buffer rather than being given one.

     In a memory managed system, the contents of the file are not transferred into the program's memory space. Instead, the file is "mapped" into an area of the caller's data space (involving no actual data transfer) and demand-paged into real memory, directly from its disk file, as accessed by the program. As long as any such page remains pure, it never needs to be swapped out to disk, and can always be swapped in from its original location on disk. If a page becomes dirty, it will have separate swap space allocated for it on disk and the page will be re-mapped to that space. [This technique is often used for the initialized data portion of executing programs].

     Therefore, "vread" produces the appearance of reading from a file into memory, but no data is actually transferred (in a memory managed system), and the system is afforded the opportunity to optimize by sharing the data among all processes accessing the file. From the program's point of view, this operation is indistinguishable from an actual data transfer.

     In non-memory-managed versions of UNIX, "vread" is implemented as a true data transfer. Therefore, "vread" calls are portable between memory-managed and non-memory-managed systems.

     Since the system decides the address at which the space will be allocated, specific memory management requirements (such as page size and alignment) are hidden from the caller and are therefore of no concern to a program using this facility.

     In a memory managed system, use of "vread" can provide a significant optimization when large portions of files must be available in their entirety, but are sparsely and/or randomly accessed (such as the pseudo-code for an interpreter), and when it is desirable to share large, read-only files.

RETURN VALUE
     Same as read(ba_os).

ERRORS
     Same as read(ba_os).

-------------------------------------

For interpreters to take full advantage of this facility, they would have to interpret their p-code "as is" as it sits on disk. If they modify the code, much of the advantage would be lost.
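To make the proposed semantics concrete, here is a minimal sketch in C of how a vread-like routine might be approximated in user space. It assumes a POSIX system that provides mmap(), which is not something the proposal itself relies on. The names vread_sketch and sum_file_bytes are hypothetical. Two simplifications are assumptions of this sketch rather than part of the proposal: it always maps from the start of the file instead of the current file offset (mapping offsets must be page-aligned), and it returns a long rather than an int.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * vread_sketch: hypothetical approximation of the proposed vread().
 * Maps up to nbyte bytes from the START of the file (a simplification)
 * and stores the address of the mapping in *bufptr.
 * Returns the number of bytes made available, 0 for an empty file,
 * or -1 on error. Only meaningful for regular files.
 */
long vread_sketch(int fildes, char **bufptr, size_t nbyte)
{
    struct stat st;
    void *p;

    if (fstat(fildes, &st) == -1)
        return -1;
    if (st.st_size <= 0 || nbyte == 0) {
        *bufptr = NULL;
        return 0;
    }
    if (nbyte > (size_t)st.st_size)
        nbyte = (size_t)st.st_size;

    /*
     * MAP_PRIVATE: clean pages are backed by the file and can be shared
     * with other processes mapping it; a store gives this process its
     * own copy of the page and never changes the file.
     */
    p = mmap(NULL, nbyte, PROT_READ | PROT_WRITE, MAP_PRIVATE, fildes, 0);
    if (p == MAP_FAILED)
        return -1;

    *bufptr = (char *)p;
    return (long)nbyte;
}

/* Hypothetical caller: sum the bytes of a file without calling read(). */
long sum_file_bytes(const char *path)
{
    char *buf = NULL;
    long n, i, sum = 0;
    int fd = open(path, O_RDONLY);

    if (fd == -1)
        return -1;
    n = vread_sketch(fd, &buf, (size_t)-1);   /* ask for "everything" */
    for (i = 0; i < n; i++)
        sum += (unsigned char)buf[i];         /* pages fault in on demand */
    if (n > 0)
        munmap(buf, (size_t)n);
    close(fd);
    return n < 0 ? -1 : sum;
}
```

An interpreter could call something like vread_sketch on its p-code file and then interpret the code in place. Pages it never touches are never brought into memory, and pages it only reads stay backed by the file.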

I'd be interested in hearing your comments and suggestions regarding this idea: alternative ideas to solve this problem, ways other OSs have dealt with it, implementation problems, or gross oversights.

What would you think of a "read only" option for this function (a fourth argument?), where the data would be mapped as read only? This would cause a trap if the buffer is stored into.

--
Bob Alexander
ISC Systems Corp.
Spokane, WA
(509)927-5445
UUCP: ihnp4!tektronix!reed!iscuva!boba
campbell@maynard.BSW.COM (Larry Campbell) (06/30/87)
In article <881@mcdchg.UUCP> boba@iscuva.ISCS.COM (Bob Alexander) writes:
>Modern, memory managed operating systems (like UNIX) have addressed ...

UNIX isn't particularly modern -- it's 15 years old -- and memory management was tacked on as an afterthought, not designed in. (Otherwise I'd never be able to run it on my 8088!)

Bob goes on to lament the fact that data can't easily be shared among UNIX processes and proposes a new system call to allow this. I would like to point out that this is a problem that was solved automatically by Multics and TOPS-20 (and could have been solved by VMS, but wasn't).

In Multics and TOPS-20 there are no read and write system calls. Instead of doing "input/output", you just map a region of a disk file to a region of memory; the memory management hardware and software do all the rest. In fact, Multics took it a step further by eliminating, from the application's point of view, any distinction at all between files and memory regions. But since I'm much more familiar with TOPS-20, I'll describe their technique and leave the Multics stuff for someone else...

In TOPS-20, by default all disk pages are mapped shareably with "copy-on-write" set. This means that all pages, even writeable data pages, are initially shared. The hardware bits are set to prevent writes. If a process attempts to write into a copy-on-write page, a page fault occurs, the OS makes a private copy of the page which is mapped in place of the shared copy, and the process is resumed.

This solves the problem of sharing P-code. EVERYTHING is shared by default; you have to ask specifically for private disk pages.

Now, you can't really kludge this into UNIX because UNIX programmers expect to do I/O rather than page mapping. The semantics are all different (for example, memory must be mapped on hardware page boundaries and in page sizes fixed by hardware). So you end up not sharing data pages, or adding special kludge system calls in order to share them.
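
The copy-on-write behavior described above can be observed on systems that provide POSIX mmap(). The following is a minimal sketch, assuming such a system; it is an analogue only and does not use or reproduce TOPS-20's own interface. The function name cow_write_is_private is hypothetical. It maps the first byte of a file twice, once as a shared read-only view and once as a private writable view, stores through the private view, and checks that the shared view still shows the file's original contents.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * cow_write_is_private: hypothetical demonstration of copy-on-write.
 * Returns 1 if a store through a private mapping left the shared view
 * of the same file page unchanged, 0 if it did not, -1 on error
 * (including an empty or missing file).
 */
int cow_write_is_private(const char *path)
{
    struct stat st;
    char *shared_view, *cow_view;
    char original;
    int isolated;
    int fd = open(path, O_RDONLY);

    if (fd == -1)
        return -1;
    if (fstat(fd, &st) == -1 || st.st_size < 1) {
        close(fd);
        return -1;
    }

    /* Both mappings start at offset 0, which is page-aligned as required. */
    shared_view = mmap(NULL, 1, PROT_READ, MAP_SHARED, fd, 0);
    cow_view = mmap(NULL, 1, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd);                       /* mappings remain valid after close */

    if (shared_view == MAP_FAILED || cow_view == MAP_FAILED) {
        if (shared_view != MAP_FAILED)
            munmap(shared_view, 1);
        if (cow_view != MAP_FAILED)
            munmap(cow_view, 1);
        return -1;
    }

    original = shared_view[0];
    cow_view[0] = (char)(original ^ 1);   /* first store: page is copied */

    isolated = (shared_view[0] == original) && (cow_view[0] != original);

    munmap(shared_view, 1);
    munmap(cow_view, 1);
    return isolated;
}
```

Until the store, both views can be backed by the same physical page; the private copy is made only when the write faults, which is the sequence of events described for TOPS-20 above.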

Oh well, just a little nostalgia for a really nice but now commercially dead operating system...

--
Larry Campbell                                The Boston Software Works, Inc.
Internet: campbell@maynard.BSW.COM            120 Fulton Street, Boston MA 02109
uucp: {husc6,mirror,think}!maynard!campbell   +1 617 367 6846
mouse@mcgill-vision.UUCP (der Mouse) (07/16/87)
In article <984@mcdchg.UUCP>, campbell@maynard.BSW.COM (Larry Campbell) writes:
> In article <881@mcdchg.UUCP> boba@iscuva.ISCS.COM (Bob Alexander) writes:
>> Modern, memory managed operating systems (like UNIX) have addressed
> Bob goes on to lament the fact that data can't easily be shared among
> UNIX processes and proposes a new system call to allow this. ([This
> problem] could have been solved by VMS, but wasn't).

Sure it was. Look up sys$crmpsc(). Ok, ok, you said *easily*....but what *is* easy under VMS? This is certainly no more difficult than using sys$qiow to read a character without echo or waiting for RETURN.

der Mouse
(mouse@mcgill-vision.uucp)