moss@takahe.cs.umass.edu (Eliot &) (01/31/90)
Another system call (one that I feel any virtual memory system ought to support) is one to move a range of pages from one location in the address space to another. This is sometimes desirable for garbage collectors, etc. Note that one need not move the *data*, only the page table entries (though it is certainly more complicated than just a block move within the OS). Note that the call should work even if the range overlaps with itself. It is the page level analog of memcpy.

The size of the region should probably *not* be specified in terms of pages, but rather bytes, and the source and destination addresses as byte addresses, too. The call should fail if the addresses are not page aligned or the quantity to move is not a multiple of the page size.

If one allows additional arguments for adjusting the protection, and allows the source and/or destination to be associated with different processes and/or files, a general move/messaging operator results.

Eliot
--
J. Eliot B. Moss, Assistant Professor
Department of Computer and Information Science
Lederle Graduate Research Center
University of Massachusetts
Amherst, MA 01003
(413) 545-4206; Moss@cs.umass.edu
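The argument rules in this proposal (byte addresses and a byte count, both of which must respect page boundaries) can be sketched at user level. This is only an illustration assuming a POSIX system; `check_page_move_args` is a hypothetical name, not an existing call, and the sketch checks arguments without moving anything.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: validate the arguments of a page-move call.
 * Returns 0 if they are acceptable, -1 with errno = EINVAL otherwise.
 * Overlapping source and destination ranges are deliberately allowed. */
int check_page_move_args(const void *src, const void *dst, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    uintptr_t mask;

    assert(page > 0);
    mask = (uintptr_t)page - 1;   /* assumes the page size is a power of two */

    if (((uintptr_t)src & mask) || ((uintptr_t)dst & mask) || (len & mask)) {
        errno = EINVAL;
        return -1;
    }
    return 0;
}
```

Taking bytes rather than page counts keeps the caller independent of the page size, at the cost of this one check on entry.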
jay@emtek.UUCP (Jay Elston) (02/01/90)
I've heard that there might be interest in a "is this memory location readable/writable" function, a la:

NAME
     probe - probe memory locations

SYNOPSIS
     #include <sys/probe.h>

     bool probe (address, length, mode)
     unknown *address;
     unsigned length;
     TBD mode;

+-
| Jay Elston, EMTEK Health Care System, Inc. (602) 431-9343
| uunet!emtek!jay
+-
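Lacking such a call, a read probe can be approximated from user level on systems where write() reports EFAULT for a bad buffer instead of delivering a signal. The sketch below is an assumption-laden stand-in, not the proposed interface: `probe_readable` is a hypothetical name, it tests a single byte, and it says nothing about writability.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical stand-in for a read probe: returns 1 if one byte at
 * `address` can be read, 0 if the kernel reports EFAULT, and -1 on any
 * other error. */
int probe_readable(const void *address)
{
    int fds[2];
    int result;

    if (pipe(fds) < 0)
        return -1;

    /* write() makes the kernel copy one byte from `address`; on the
     * systems assumed here a bad address fails with EFAULT. */
    if (write(fds[1], address, 1) == 1)
        result = 1;
    else
        result = (errno == EFAULT) ? 0 : -1;

    close(fds[0]);
    close(fds[1]);
    return result;
}
```

Creating a pipe per probe is slow, which is part of the argument for having the kernel expose the check directly.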
guy@auspex.auspex.com (Guy Harris) (02/02/90)
>>> vm_offset_t *addr; /* Where to map to (page aligned) */
>>Yes, and so do SunOS 4.x and System V Release 4. What's more, both of
>>them implement "mmap", which bears a startling resemblance to "map_fd".
>
>For a user-mode function, I strongly dislike the page-alignment constraint.
>Does mmap have a similar requirement?

Yes. The SunOS 4.x/S5R4 VM subsystem implements "mmap()" as an interface to the VM system's mechanism for setting up mappings between pages in your address space and objects such as files; given that said VM mechanism can't, for example, say "bytes 23 through 47 of this particular page are backed by bytes from thus-and-such a vnode", it requires that the address be page-aligned. (I.e., if you really mean "map", there's a *lot* of work involved in lifting the page-alignment restriction.)

However, "mmap()" lets you ask the system to assign an address, which is almost always what you want, so applications don't need to worry about page alignment. (If they need to get a page-aligned address, they can use "getpagesize()" and work from there.)
dwc@cbnewsh.ATT.COM (Malaclypse the Elder) (02/03/90)
In article <12067@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
> In article <2863@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:
> >> vm_offset_t *addr; /* Where to map to (page aligned) */
> >Yes, and so do SunOS 4.x and System V Release 4. What's more, both of
> >them implement "mmap", which bears a startling resemblance to "map_fd".
>
> For a user-mode function, I strongly dislike the page-alignment constraint.
> Does mmap have a similar requirement?

yes, mmap has a similar requirement. this is because the page is the unit of mapping that the hardware supports. the alternative is some HUGE overhead of faulting on any reference to a page that has an address that is not page aligned and doing any copying necessary to update that page.

i'm not sure, but doesn't mmap return the address that something was actually attached to? the *addr above is just a hint to the system about where to attach? so user level code doesn't have to know?

danny chen
att!hocus!dwc
peter@ficc.uu.net (Peter da Silva) (02/03/90)
In article <12071@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
> there is no way I can see to use mmap() portably
> even among systems on which it exists. This puts a serious crimp
> in its potential usability.

Except that map_fd (and mmap) will accept a NULL address, allocate the memory, and return a pointer to it.
--
 _--_|\  Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>.
/      \
\_.--._/ Xenix Support -- it's not just a job, it's an adventure!
      v  "Have you hugged your wolf today?" `-_-'
dstewart@fas.ri.cmu.edu (David B Stewart) (02/06/90)
How about implementing good ol' semaphores, with a much simpler interface than Sys V? I've written a front end to Sys V semaphores (under SunOS x.x) which gives me the good old classical P() and V() operations. I don't need all that other stuff associated with Sys V. Why not provide a simple system call which works just as described in all those operating system textbooks?

The same goes for shared memory; if BSD 4.4 doesn't have lightweight processes with shared memory, then provide some sort of system call to allow processes to establish shared memory segments.

Another useful call would be peek(address), which would return 1 if address is legal within the process's address space, or 0 (i.e. error) if a write/read operation to that address would cause a bus error or segmentation fault. I know that the SunOS kernel provides that routine, but there is no way for the user to access it.

~dave
--
David B. Stewart, Dept. of Elec. & Comp. Engr., and The Robotics Institute,
Carnegie Mellon University, email: stewart@faraday.ece.cmu.edu
The following software is now available; ask me for details
CHIMERA II, A Real-time OS for Sensor-Based Control Applications
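A front end of the kind described here might look like the following. This is not the poster's code, only a minimal sketch assuming a system with System V IPC; `simple_sem_create`, `simple_sem_destroy`, and the local `union sem_arg` are names invented for the illustration.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* semctl() takes a union like this as its optional fourth argument;
 * callers traditionally declare it themselves. The name is local to
 * this sketch. */
union sem_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Create a private set holding one semaphore with the given initial count.
 * Returns the semaphore id, or -1 on failure. */
int simple_sem_create(int initial)
{
    union sem_arg arg;
    int id = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);

    if (id < 0)
        return -1;
    arg.val = initial;
    if (semctl(id, 0, SETVAL, arg) < 0) {
        semctl(id, 0, IPC_RMID);
        return -1;
    }
    return id;
}

/* Apply one adjustment to semaphore 0, retrying if a signal interrupts. */
static int sem_adjust(int id, short delta)
{
    struct sembuf op;

    op.sem_num = 0;
    op.sem_op = delta;
    op.sem_flg = 0;
    while (semop(id, &op, 1) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return 0;
}

int P(int id) { return sem_adjust(id, -1); }   /* wait: blocks while the count is 0 */
int V(int id) { return sem_adjust(id, 1); }    /* signal: wakes one waiter */

int simple_sem_destroy(int id) { return semctl(id, 0, IPC_RMID); }
```

The wrapper hides the set-of-semaphores model and the operation arrays, which is most of what makes the Sys V interface feel heavy for the textbook case.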
brnstnd@stealth.acf.nyu.edu (02/08/90)
In article <7848@pt.cs.cmu.edu> dstewart@fas.ri.cmu.edu (David B Stewart) writes:
> How about implementing good ol' semaphores, with a much simpler interface
> than Sys V.

Actually, I just built a simple threads library on top of my signal library. Everything is shared. You get threadfork(), threadexit(), threadup() and threaddown() for semaphores, and a few other calls.

It ain't Mach but it works.

---Dan
lm@snafu.Sun.COM (Larry McVoy) (02/08/90)
In article <2212.21:08:11@stealth.acf.nyu.edu> brnstnd@stealth.acf.nyu.edu (Dan Bernstein) writes:
>In article <7848@pt.cs.cmu.edu> dstewart@fas.ri.cmu.edu (David B Stewart) writes:
>> How about implementing good ol' semaphores, with a much simpler interface
>> than Sys V.
>
>Actually, I just built a simple threads library on top of my signal
>library. Everything is shared. You get threadfork(), threadexit(),
>threadup() and threaddown() for semaphores, and a few other calls.
>
>It ain't Mach but it works.
>
>---Dan

Yeah, I did the same thing once for kicks. It's cute, but essentially useless. You don't have threadkill() or threadsignal(), but you do have block_all_threads() in the form of read(), write(), select(), and any other system call that can block.

Bottom line: threads without kernel support are largely useless.
---
What I say is my opinion. I am not paid to speak for Sun, I'm paid to hack.
Besides, I frequently read news when I'm drjhgunghc, err, um, drunk.
Larry McVoy, Sun Microsystems (415) 336-7627 ...!sun!lm or lm@sun.com
del@thrush.semi.harris-atd.com (Don Lewis) (02/08/90)
In article <23449@stealth.acf.nyu.edu> brnstnd@stealth.acf.nyu.edu (Dan Bernstein) writes:
>In article <1990Jan24.193433.3332@semi.harris-atd.com> del@thrush.semi.harris-atd.com (Don Lewis) writes:
>> open(file,O_PEEK)
>
>This could be a flag on any open, meaning simply ``update ctime rather
>than atime or mtime.'' Crackers already know about utimes(); perhaps an
>O_PEEK flag would educate inexperienced sysadmins.
>
>---Dan

I don't want it to update the ctime either. I shouldn't have to dump the file just because someone read it. It only needs to keep atime from being changed if you read the file. If you write to the file, it is still ok to change ctime and mtime.

I'm looking to preserve the atime so I can still find unused files (candidates for deletion) even if the hierarchy containing them has been tar'ed or cpio'ed. Besides, it's a performance win because the kernel doesn't have to go back and update all those inodes, sort of like treating the filesystem as read-only.
--
Don "Truck" Lewis                     Harris Semiconductor
Internet: del@semi.harris-atd.com     PO Box 883 MS 62A-028
UUCP: rutgers!soleil!thrush!del       Melbourne, FL 32901
Phone: (407) 729-5205
peralta@pinocchio.Encore.COM (Rick Peralta) (02/09/90)
How about resource weighting? For example:

 . have the free memory weight be tunable
   That is to say, instead of taking a compile-time value (maybe 10% free) for when to start flushing things to swap, accept changes from the sysadmin.

 . memory usage priority (even on the page level)
   lock this page in memory, this one is largely filler, etc.

 . I/O priority, get my I/O done ASAP or after everyone else

 . heavy CPU priority (more than just nice)
   this thread is to be done at hardware interrupt level x, after hardware interrupts, as part of the idle loop

Things like sync could be placed in the lowest priority loop and called more frequently. Things like servers that tend to be bottlenecks and don't consume lots of resources could be placed in the upper categories. Kernel code running in user space (whoops, I forgot to mention that) could be executed in satisfactory time for most driver applications. Things like compiles could get high memory priority and things like Emacs could get a lower memory priority but higher I/O priority.

 - Rick "Just a fwe stray synapses..."
brnstnd@stealth.acf.nyu.edu (02/09/90)
In article <1990Feb8.080645.4458@semi.harris-atd.com> del@thrush.semi.harris-atd.com (Don Lewis) writes:
> In article <23449@stealth.acf.nyu.edu> brnstnd@stealth.acf.nyu.edu (Dan Bernstein) writes:
> >In article <1990Jan24.193433.3332@semi.harris-atd.com> del@thrush.semi.harris-atd.com (Don Lewis) writes:
> >> open(file,O_PEEK)
> >This could be a flag on any open, meaning simply ``update ctime rather
> >than atime or mtime.'' Crackers already know about utimes(); perhaps an
> >O_PEEK flag would educate inexperienced sysadmins.
> I don't want it to update the ctime either.

That would be a security violation.

---Dan
peter@ficc.uu.net (Peter da Silva) (02/09/90)
In article <131446@sun.Eng.Sun.COM> lm@sun.UUCP (Larry McVoy) writes:
> Bottom line: threads without kernel support are largely useless.

Which is one reason I want *clean* asynchronous I/O, in the form of some equivalent of my aread/awrite/await proposal.
--
Peter da Silva
brnstnd@stealth.acf.nyu.edu (02/09/90)
In article <131446@sun.Eng.Sun.COM> lm@sun.UUCP (Larry McVoy) writes:
> In article <2212.21:08:11@stealth.acf.nyu.edu> brnstnd@stealth.acf.nyu.edu (Dan Bernstein) writes:
> >In article <7848@pt.cs.cmu.edu> dstewart@fas.ri.cmu.edu (David B Stewart) writes:
> >> How about implementing good ol' semaphores, with a much simpler interface
> >> than Sys V.
> >Actually, I just built a simple threads library on top of my signal
> >library. Everything is shared. You get threadfork(), threadexit(),
> >threadup() and threaddown() for semaphores, and a few other calls.
>
> Yeah, I did the same thing once for kicks. It's cute, but essentially useless.
> You don't have threadkill() or threadsignal(),

As I said, everything is shared---including signals. What's wrong with this? As long as all the threads set up their signal handlers through my library, they won't interfere with each other.

> but you do have
> block_all_threads() in the form of read(), write(), select(), and any other
> system call that can block().

This is based on the faulty assumptions that I demolish in a companion message. When an I/O system call blocks, it can be interrupted!

(Larry's point is correct for certain system calls that do not perform I/O. For example, if you open() a pty without anyone to talk to, you'll block in kernel mode, as ps shows.)

> Bottom line: threads without kernel support are largely useless.

Bottom line: When I'm done with the libraries I'll give them to Rich and see whether the rest of the world thinks they're so useless.

---Dan
del@thrush.semi.harris-atd.com (Don Lewis) (02/09/90)
In article <5068.16:48:52@stealth.acf.nyu.edu> brnstnd@stealth.acf.nyu.edu (Dan Bernstein) writes:
>In article <1990Feb8.080645.4458@semi.harris-atd.com> del@thrush.semi.harris-atd.com (Don Lewis) writes:
>> In article <23449@stealth.acf.nyu.edu> brnstnd@stealth.acf.nyu.edu (Dan Bernstein) writes:
>> >In article <1990Jan24.193433.3332@semi.harris-atd.com> del@thrush.semi.harris-atd.com (Don Lewis) writes:
>> >> open(file,O_PEEK)
>> >This could be a flag on any open, meaning simply ``update ctime rather
>> >than atime or mtime.'' Crackers already know about utimes(); perhaps an
>> >O_PEEK flag would educate inexperienced sysadmins.
>> I don't want it to update the ctime either.
>
>That would be a security violation.

In what way? The only information that I lose is that I can't tell if someone has been looking at my files. If I cared then I would make them something other than rw-r--r--. Even in the present scheme, if I read my file after the "cracker" has, then I can't tell if it was previously read.

If the filesystem is mounted read-only, the atime doesn't get updated; is this a security violation?
--
Don "Truck" Lewis
dstewart@fas.ri.cmu.edu (David B Stewart) (02/09/90)
Another feature that would be useful as a BSD system call is to lock down one or more pages in physical memory, and allow other processors on a common backplane to mmap it. Of course, this assumes appropriate hardware architecture.

As an example, suppose one CPU is running BSD UNIX, while all others have some kind of Real-Time OS (our current situation, except we have SunOS). It is possible for the UNIX machine to mmap part of the other CPUs' memory, but the reverse is not possible: the Real-Time CPU cannot mmap part of the BSD UNIX memory. Such a mapping could greatly increase the speed of communication between the Real-Time and non-real-time environments.

On the Sun, this is possible using DVMA (Direct Virtual Memory Access), but it is rather awkward to use. The space reserved for DVMA is available only to kernel routines. User routines do not have access to that memory. Replacing this functionality with a system call would allow user processes to access the reserved memory on the UNIX system, while at the same time letting other CPUs on the backplane also access the memory.

I really have no clue if the above type of system call is feasible to implement in BSD, since I am not familiar with the internals of BSD. Any further insight is welcome.

~dave
--
David B. Stewart
peter@ficc.uu.net (Peter da Silva) (02/09/90)
> > Bottom line: threads without kernel support are largely useless.
> Bottom line: When I'm done with the libraries I'll give them to Rich and
> see whether the rest of the world thinks they're so useless.

From your other message, it looks like you *have* kernel support.
--
Peter da Silva
jfh@rpp386.cactus.org (John F. Haugh II) (02/10/90)
In article <1990Feb9.025853.8202@semi.harris-atd.com> del@thrush.semi.harris-atd.com (Don Lewis) writes:
>>> I don't want it to update the ctime either.
>>
>>That would be a security violation.
>In what way? The only information that I lose is that I can't tell if
>someone has been looking at my files. If I cared then I would make them
>something other than rw-r--r--. Even in the present scheme, if I read my
>file after the "cracker" has, then I can't tell if it was previously read.

In some sense of the word "secure", information about whether a particular file has been referenced is "security relevant". In the typical "insecure" UNIX environment it would be useful to be able to read a file without updating the atime or ctime. In that sense an "O_PEEK" flag would be Real Nice(tm). File backup utilities are forced to use utimes(), which as a side effect changes the ctime; not really a nice thing to do, considering that file backup utilities tend to use the ctime for selecting files to dump ...

In a more secure environment it is possible to track references to individual files with more granularity than "yes/no" and when. A feature like "O_PEEK" probably wouldn't matter in this case either, since interesting files are going to be tracked with other mechanisms.

Dan's assertion that O_PEEK is a "security violation" is only true in the most simplistic sense. It most certainly is not a "security violation" in any "official" sense of the word. This is borne out by 2.2.2.2 of the TCSEC - the different times in the i-node DO NOT provide the function required for conformance. So I do not see how any possible misuse could be construed to be a lack of protection. Indeed, were such a feature to be provided, the only requirement would be that its use be restricted in some fashion and that use of this feature be auditable.
--
John F. Haugh II                    UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 832-8832             Domain: jfh@rpp386.cactus.org
jfh@rpp386.cactus.org (John F. Haugh II) (02/10/90)
In article <7904@pt.cs.cmu.edu> dstewart@fas.ri.cmu.edu (David B Stewart) writes:
>Another feature that would be useful as a BSD system call is to
>lock down one or more pages in physical memory, and allow other
>processors on a common backplane to mmap it. Of course, this assumes
>appropriate hardware architecture.

It is actually possible to mmap() files over the wire - including such transport mechanisms as SL/IP or Morse Code over a spark gap rig.

>As an example, suppose one CPU is running BSD UNIX, while all others have
>some kind of Real-Time OS (our current situation, except we have SunOS).
>It is possible for the UNIX machine to mmap part of the other CPUs
>memory; but the reverse is not possible.

Anything is possible. Just sit down and dream up some way to make it work. There is nothing special about "real time", provided the "real time" constraints are met.
--
John F. Haugh II
steve@nuchat.UUCP (Steve Nuchia) (02/12/90)
In article <1990Feb9.025853.8202@semi.harris-atd.com> del@thrush.semi.harris-atd.com (Don Lewis) writes:
>If the filesystem is mounted read-only, the atime doesn't get updated, is
>this a security violation?

Hmm... maybe we don't need a new sys call, or a new argument/flag for old ones, to let backup avoid updating the inode. Maybe we just need to remove an arbitrary restriction on an old one. Namely, allow devices to be mounted more than once. If the second mount is RO then you can back up from it and get most of what you want.

While I'm on the subject, a thought on disk/partition/file system organization: The current situation is really a mess, with all the partitioning and defect mapping buried down in *each* disk driver. What we need is truly raw drivers for specific hardware and a generic indirect driver implementing "cooked" features -- mapping, partitioning, striping, mirroring -- in a *standard* way.

Neither of these suggestions is particularly difficult, unless I missed something when I last looked at the relevant code.
--
Steve Nuchia South Coast Computing Services (713) 964-2462
"If the conjecture `You would rather I had not disturbed you by sending you this.' is correct, you may add it to the list of uncomfortable truths."
 - Edsger Dijkstra
les@chinet.chi.il.us (Leslie Mikesell) (02/13/90)
In article <19451@nuchat.UUCP> steve@nuchat.UUCP (Steve Nuchia) writes:
>>If the filesystem is mounted read-only, the atime doesn't get updated, is
>>this a security violation?
>Hmm... maybe we don't need a new sys call, or a new argument/flag for
>old ones, to let backup avoid updating the inode. Maybe we just need
>to remove an arbitrary restriction on an old one. Namely, allow devices
>to be mounted more than once. If the second mount is RO then you
>can back up from it and get most of what you want.

Yes! I'll second that one. In fact, I'd go even further and let arbitrary directories be mapped as read-only mount points. The mechanisms are probably mostly in place already in RFS and/or NFS. Just provide a local-loopback and take away the restriction of only mounting a resource in one place on a machine (RFS has this, I don't know about NFS). I've wanted this in RFS anyway to give "public" read-only access via one mount point while having "system" read/write access at the same time through a different mount point.

Les Mikesell
les@chinet.chi.il.us
webb@bass.tcspa.ibm.com (Bill Webb) (02/15/90)
> I'm asking about calls that don't require lots of code or fundamental
> changes in the system; that provide a useful service unavailable with
> current system calls; that, hopefully, simplify other calls; that don't
> hurt security; that don't hurt anything if they're not used.

How about a system call to indicate either the maximum number of file descriptors allocated, or better still, a bit-map of allocated file descriptors? E.g.

    nfound = getfds(nfds, fds);
    int nfound, nfds;
    fd_set *fds;

The parameters are similar to those of select(2), except that bits are set for any valid file descriptors in the range 0...(nfds-1). It might be useful to specify two fd_set parameters, e.g.

    nfound = getfds(nfds, readfds, writefds);
    int nfound, nfds;
    fd_set *readfds, *writefds;

where "readfds" returns bits for file descriptors open for reading, "writefds" returns bits for file descriptors open for writing, and bits are set in both for file descriptors open for reading and writing.

These calls would mostly be useful to programs (such as shells) that have to manage file descriptors and either don't want to clobber existing file descriptors, or want to know what file descriptors to close in certain circumstances. This is particularly important now that the number of file descriptors allowed is significantly increased in some Unix implementations.

This call should nicely complement 'getdtablesize', which tells you how many bits you will need to hold the resulting information.
----------------------------------------------------------------
The above views are my own, not those of my employer.
Bill Webb (IBM AWD Palo Alto), (415) 855-4457.
UUCP: ...!uunet!ibmsupt!webb
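The two-set form of the proposed call can be emulated in user space by asking fcntl() about each descriptor in turn. The sketch below is only an illustration of the semantics described above: `getfds_sketch` is a hypothetical name, it costs one system call per descriptor (which is what a real getfds would avoid), and it is limited to FD_SETSIZE descriptors.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical user-level emulation of getfds(nfds, readfds, writefds).
 * Sets a bit in *readfds for each open descriptor readable by its open
 * mode, a bit in *writefds for each one writable by its open mode, and
 * returns the number of open descriptors found in 0..nfds-1. */
int getfds_sketch(int nfds, fd_set *readfds, fd_set *writefds)
{
    int fd, found = 0;

    assert(readfds != NULL && writefds != NULL);
    FD_ZERO(readfds);
    FD_ZERO(writefds);
    if (nfds > FD_SETSIZE)
        nfds = FD_SETSIZE;

    for (fd = 0; fd < nfds; fd++) {
        int flags = fcntl(fd, F_GETFL);
        int mode;

        if (flags < 0)
            continue;               /* EBADF: descriptor is not open */
        mode = flags & O_ACCMODE;
        if (mode == O_RDONLY || mode == O_RDWR)
            FD_SET(fd, readfds);
        if (mode == O_WRONLY || mode == O_RDWR)
            FD_SET(fd, writefds);
        found++;
    }
    return found;
}
```

A shell could call this before redirecting, then close only the descriptors whose bits are set instead of looping over the whole descriptor table.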