[comp.unix.wizards] Indirect system call

ji@close.columbia.edu (John Ioannidis) (11/09/89)

In article <28945@shemp.CS.UCLA.EDU> dieter@lynn.cs.ucla.edu (Dieter Rothmeier) writes:
>While browsing through section 2 of the Unix manual,
>I came upon the concept of an indirect system call,
>as in syscall(2). Now that puzzled me. What might be
>the use for such a facility?
>

Lots of things. For example, you may want to redefine system calls
like read() and write() to do something before and after the actual
call (your own profiling, or saving state such as the file pointer
position so that you can checkpoint/restart, etc.). All you have to do is
provide your own read() routine, defined like this:

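	read(fd, buf, len)
	int fd, len;
	char *buf;
	{
		int n;

		/* stuff */
		n = syscall(SYS_read, fd, buf, len);	/* the real read */
		/* more stuff */
		return n;	/* pass the byte count (or -1) back to the caller */
	}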

If you are looking for something more exotic, here's a good one.

A long time ago I needed to time Sun system calls (without having to
call each one 1000 times and then divide the total by 1000!). That
was on a vintage (serial number 19, I believe) Sun-1/150U (with a
Sun-2 CPU) running release 2.0 of Sun's Unix (back then it wasn't
called SunOS).  So I hacked the indir() sources (that's what the
routine is called inside the kernel, if my memory serves me right) to
turn on the 8 diagnostic LEDs on the board, then proceed with the
system call, then turn them off. Turning them on and off was a matter
of three or four 68010 instructions, so that didn't consume much time.
Then I hooked up a Logic Analyzer to the 74LS374 that actually drove
the eight LEDs, and set it to trigger when its input was all ones. Now,
if I wanted to time a system call, I would call it indirectly, and I
could just see what was happening on the logic analyzer. 

Those were the days...

/ji

In-Real-Life: John "Heldenprogrammer" Ioannidis
E-Mail-To: ji@cs.columbia.edu
V-Mail-To: +1 212 854 5510
P-Mail-To: 450 Computer Science \n Columbia University \n New York, NY 10027

dce@sony.com (David Elliott) (11/09/89)

In article <28945@shemp.CS.UCLA.EDU> dieter@lynn.cs.ucla.edu (Dieter Rothmeier) writes:
>While browsing through section 2 of the Unix manual,
>I came upon the concept of an indirect system call,
>as in syscall(2). Now that puzzled me. What might be
>the use for such a facility?

1. You're adding a system call to the kernel and you want to be able to
   test it without having to make a new libc.

2. You're modifying a common system call and you want to be able to
   test it without other software (i.e., ls, cat, your editor) possibly
   crashing.

3. You want to be able to print out the arguments to a certain system call
   or set of calls without having to mess around with macros.  (This is
   a lame one, but it has happened).

4. You have a set of object files or a library but no source, and you
   have found a bug that you can work around if you can wrap one or
   more system calls with special-case code.  (I actually had to do this
   once with a package that used curses.  I couldn't rebuild the library
   for some reason, but screen redraw and shell escapes were broken, so
   I made a shell for read() that called syscall(READ) and then handled
   ^L and ! specially.)
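
[Editor's sketch for items 3 and 4, not from the original post: a
wrapper that prints its arguments and then enters the kernel by number
through syscall().  The name traced_write() is hypothetical, and the
header locations and the SYS_write constant assume a modern Linux or
BSD system.  To shadow the library routine as described in item 4, the
function would be named write() instead.]

```c
#define _DEFAULT_SOURCE          /* exposes syscall() in <unistd.h> on glibc */
#include <assert.h>
#include <stdio.h>
#include <sys/syscall.h>         /* SYS_write */
#include <sys/types.h>
#include <unistd.h>              /* syscall(), pipe(), read() */

/* Hypothetical wrapper: log the arguments, then make the call by number. */
ssize_t traced_write(int fd, const void *buf, size_t len)
{
    long ret;

    assert(buf != NULL);         /* this sketch does not handle a null buffer */

    fprintf(stderr, "write(fd=%d, buf=%p, len=%lu)",
            fd, buf, (unsigned long)len);
    ret = syscall(SYS_write, fd, buf, len);   /* bypasses the libc write() stub */
    fprintf(stderr, " = %ld\n", ret);

    return (ssize_t)ret;         /* -1 with errno set on failure, as for write() */
}
```

[Because the wrapper never calls the library's write(), it keeps
working even when write() itself is the routine being replaced.]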

Now, none of these are really that big a deal, and, as has been said
here before, a system doesn't need this mechanism to be useable.  On
the other hand, many people would assume that a system without adb
or sdb but with an improved dbx would be acceptable, but checkoff
items being what they are...
-- 
David Elliott
dce@sony.com | ...!{uunet,mips}!sonyusa!dce
(408)944-4073
"You can lead a robot to water, but you can not make him compute."

amos@taux01.UUCP (Amos Shapir) (11/09/89)

That's an elegant way to bypass the separate instruction/data address space
mechanism on high-end models of PDP11.  Early versions of UNIX passed
arguments to system calls by putting them in the words following the
"sys" instruction (a.k.a. "trap"):
	sys 3; fd; addr; size

(This was a standard way of passing arguments to subroutines and system
calls in DEC's systems).  When PDP11/45 and /70 came along, that posed
a problem, since they used a double address space to increase the limit
of 16-bit virtual addresses - data address 524 is in a different place
than instruction address 524, and there is no way to access the latter
except to branch there.

The "indirect" system call was invented to solve this problem: a "sys"
instruction is prepared with all its arguments in data space, then
an indirect call is performed to execute it.  Execution is carried out
by the kernel, which can access all of the user's space.
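
[Editor's sketch, not from the original post: a toy C model of the two
ways the kernel can fetch arguments.  It follows the "sys 3; fd; addr;
size" layout given above and assumes call number 0 is the indirect
call; addresses are word indices rather than byte addresses, and
decode_trap() and struct machine are made-up names.  It is a
simulation, not kernel code.]

```c
#include <assert.h>
#include <stddef.h>

enum { SYS_OP = 0104400, NARGS_MAX = 4 };  /* "sys" opcode; low 6 bits = call number */

/* Separate instruction and data spaces, as on a PDP-11/45 or /70. */
struct machine {
    const unsigned short *ispace;   /* user code */
    const unsigned short *dspace;   /* user data */
};

/* Argument words per call number: 0 is "indirect", 3 is read (per the post). */
static const int nargs[] = { 1, 0, 0, 3 };

/* Decode the trap at ispace[pc], copy its arguments into args[],
 * and return the call number that would run, or -1 if malformed. */
int decode_trap(const struct machine *m, size_t pc,
                unsigned short args[NARGS_MAX])
{
    unsigned short insn;
    const unsigned short *argp;
    int nr, i;

    assert(m != NULL && args != NULL);

    insn = m->ispace[pc];
    argp = &m->ispace[pc + 1];          /* direct: arguments follow the trap */
    if ((insn & ~077) != SYS_OP)
        return -1;
    nr = insn & 077;

    if (nr == 0) {                      /* indirect: next word points into data space */
        argp = &m->dspace[*argp];
        insn = *argp++;                 /* the real "sys" word, built at run time */
        if ((insn & ~077) != SYS_OP)
            return -1;
        nr = insn & 077;
        if (nr == 0)
            return -1;                  /* no nested indirection in this toy */
    }

    if (nr >= (int)(sizeof nargs / sizeof nargs[0]))
        return -1;
    for (i = 0; i < nargs[nr]; i++)
        args[i] = argp[i];
    return nr;
}
```

[In the direct case the argument words must already sit next to the
trap in instruction space; in the indirect case the program can fill
them in as ordinary data just before trapping.]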

-- 
	Amos Shapir		amos@taux01.nsc.com, amos@nsc.nsc.com
National Semiconductor (Israel) P.O.B. 3007, Herzlia 46104, Israel
Tel. +972 52 522261  TWX: 33691, fax: +972-52-558322 GEO: 34 48 E / 32 10 N

ag@cbmvax.UUCP (Keith Gabryelski) (11/10/89)

In article <28945@shemp.CS.UCLA.EDU> dieter@lynn.cs.ucla.edu (Dieter
Rothmeier) writes:
>While browsing through section 2 of the Unix manual, I came upon the
>concept of an indirect system call, as in syscall(2). Now that
>puzzled me. What might be the use for such a facility?

On Unix, system calls are invoked from a user process by passing [*] a
(system call) number to a routine in the kernel which uses this number
to look up what routine to call in the kernel via the sysent array [**].

The sysent array is really a big array of structures, one per system call
(open(), read(), signal(), fork(), and so on); each entry may also include the
number of arguments to the function and some other useful info.  There
is usually some left over space at the end (or middle) of this array
that can be used to place custom system calls if one desires.  Then,
using syscall(), you can invoke your specified routine.  Once you get
the hang of it and a good debugger, it is actually easy to add your
own system call if you have the capability to link a new kernel.
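
[Editor's sketch, not from the original post: a user-space toy that
mimics table-driven dispatch with a spare slot.  The field names follow
common BSD usage (sy_narg, sy_call), but the table size, the slot
number MY_ADD, and the functions install_call(), dispatch() and
sys_add() are invented for illustration; a real kernel table differs
in detail.]

```c
#include <assert.h>
#include <stddef.h>

/* One entry per system call number. */
struct sysent {
    int   sy_narg;                    /* number of arguments the call expects */
    long (*sy_call)(const long *);    /* handler, or NULL for an unused slot */
};

enum { NSYSENT = 8, MY_ADD = 6 };     /* MY_ADD: a spare slot chosen for the demo */

static struct sysent sysent[NSYSENT]; /* zero-initialised: every slot starts unused */

static long sys_nosys(const long *a) { (void)a; return -1; }
static long sys_add(const long *a)   { return a[0] + a[1]; }  /* stand-in custom call */

/* Put a handler in a spare slot; 0 on success, -1 if the slot is taken. */
int install_call(int nr, int narg, long (*fn)(const long *))
{
    assert(nr >= 0 && nr < NSYSENT);
    if (sysent[nr].sy_call != NULL)
        return -1;
    sysent[nr].sy_narg = narg;
    sysent[nr].sy_call = fn;
    return 0;
}

/* The trap handler in miniature: bounds-check the number, look up, call. */
long dispatch(int nr, const long *args)
{
    if (nr < 0 || nr >= NSYSENT || sysent[nr].sy_call == NULL)
        return sys_nosys(args);
    return sysent[nr].sy_call(args);
}
```

[Here dispatch(MY_ADD, args) plays the part of syscall(MY_ADD, ...):
the caller only needs the slot number, not a library stub.  A real
kernel would also use sy_narg to decide how many argument words to
copy in from user space.]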

I have an example of adding select() to a 2.3 SCO Xenix system that
was posted to comp.unix.xenix almost a year back.  It used the
technique above to add select(), sigset(), and friends.  If you would
like I will send it to you.

Pax, Keith

* Passing oneself to a kernel is sort of funky.  It usually requires
   using some special machine language instruction such as TRAP, or
   somehow causing an exception to occur some other way (possibly jumping to
   a specified illegal memory location that the kernel will catch and
   do `special stuff' with).

** The reason for all this is that it allows one to have set entry
   points into the kernel that are controlled by the kernel.

-- 
  ag@cbmvax.commodore.com     Keith M. Gabryelski      ...!uunet!cbmvax!ag

scp@ibis.lanl.gov (Stephen Pope) (11/10/89)

In article <28945@shemp.CS.UCLA.EDU> dieter@lynn.cs.ucla.edu (Dieter Rothmeier) writes:

   While browsing through section 2 of the Unix manual,
   I came upon the concept of an indirect system call,
   as in syscall(2). Now that puzzled me. What might be
   the use for such a facility?

One thing they're good for is to "hide" specific system calls.
For example, one well known symbolic manipulation program
doesn't want you to use it freely, so it hides a call to
"hostid" inside an indirect system call, and compares the
result with that obtained via a normal hostid syscall.
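
[Editor's sketch, not from the original post: the same cross-check
idea, using getpid() as a stand-in because the host ID is not a system
call on every modern system.  pid_calls_agree() is a hypothetical
name, and SYS_getpid is assumed to be provided by <sys/syscall.h>.]

```c
#define _DEFAULT_SOURCE          /* exposes syscall() in <unistd.h> on glibc */
#include <assert.h>
#include <sys/syscall.h>         /* SYS_getpid */
#include <sys/types.h>
#include <unistd.h>              /* getpid(), syscall() */

/* Return 1 if the library wrapper and the call-by-number agree, else 0.
 * A mismatch would suggest the library routine has been replaced. */
int pid_calls_agree(void)
{
    pid_t via_libc     = getpid();
    long  via_indirect = syscall(SYS_getpid);

    assert(via_libc > 0);        /* getpid() cannot fail */
    return (long)via_libc == via_indirect;
}
```

[Someone who substitutes their own getpid() in the link still has to
find and patch the indirect call before the two answers will match.]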

stephen pope
scp@sfi.santafe.edu

bill@zycor.UUCP (bill) (11/11/89)

In article <28945@shemp.CS.UCLA.EDU> dieter@lynn.cs.ucla.edu (Dieter Rothmeier) writes:
>While browsing through section 2 of the Unix manual,
>I came upon the concept of an indirect system call,
>as in syscall(2). Now that puzzled me. What might be
>the use for such a facility?

You have stumbled on an interesting quirk in some implementations of UNIX.
It would seem that some processors don't expect the data for a
system call to be on the user stack, but instead immediately following
the system-call opcode in the user address space. In other words, 
something like:
	syscall	5
	word	v1
	word	v2
instead of the more common way that system calls work in a CPU:
	push	v2
	push	v1
	syscall	5

(So on the OS side, it uses the saved PC to access the operands
instead of the user stack pointer).

The problem is that if the system call instruction is in the TEXT
segment you can't write variable quantities after the syscall
opcode, since the text is r/o. Thus, you put the actual opcode
and its operands out in the data segment, and the indirect system
call points at the real one. 

The only time I have seen this was years ago on an Onyx running 7.

Bill Mahoney
bill@zycor.UUCP
The holiday season: time to wonder if the Salvation Army has tanks...

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (11/14/89)

In article <8494@cbmvax.UUCP>, ag@cbmvax.UUCP (Keith Gabryelski) writes:

|  I have an example of adding select() to a 2.3 SCO Xenix system that
|  was posted to comp.unix.xenix almost a year back.  It used the
|  technique above to add select(), sigset(), and friends.  If you would
|  like I will send it to you.

  Might I suggest a post to comp.sources.misc? That way it will be archived.
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
"The world is filled with fools. They blindly follow their so-called
'reason' in the face of the church and common sense. Any fool can see
that the world is flat!" - anon

ag@amix.commodore.com (Keith Gabryelski) (11/14/89)

In article <1598@crdos1.crd.ge.COM> davidsen@crdos1.UUCP (bill davidsen) writes:
>In article <8494@cbmvax.UUCP>, ag@cbmvax.UUCP (Keith Gabryelski) writes:
>
>|  I have an example of adding select() to a 2.3 SCO Xenix system that
>|  was posted to comp.unix.xenix almost a year back.  It used the
>|  technique above to add select(), sigset(), and friends.  If you would
>|  like I will send it to you.
>
>  Might I suggest a post to comp.sources.misc? That way it will be archived.

I have actually received a number of requests for it.  Please look for it
in comp.sources.misc soon.  I will be dumping it off tape and sending
it off shortly.

Pax, Keith
-- 
ag@amix.commodore.com        Keith Gabryelski          ...!cbmvax!amix!ag

lm@snafu.Sun.COM (Larry McVoy) (11/26/89)

There has been some discussion of the use of syscall() on this group.  Just a
fair warning to budding hackers - if my memory serves me correctly there are
times when it won't work.  In particular, syscall returns an int and if you
are calling a syscall that sends out stuff in more than one register (see
below) it won't work.  No, I don't have a list of stuff that fails, but you
should be able to look at the man pages and figure it out.  

--larry

[From Sun's syscall man page:
BUGS
     There is no way to simulate system calls such  as  pipe(2V),
     which  return  values in register d1 on Sun-3 and Sun-4 sys-
     tems or in register %edx on Sun386i systems.
]


	 What I say is my opinion.  I am not paid to speak for Sun.

Larry McVoy, Sun Microsystems                          ...!sun!lm or lm@sun.com

jfh@rpp386.cactus.org (John F. Haugh II) (11/26/89)

In article <128380@sun.Eng.Sun.COM> lm@sun.UUCP (Larry McVoy) writes:
>There has been some discussion of the use of syscall() on this group.  Just a
>fair warning to budding hackers - if my memory serves me correctly there are
>times when it won't work.  In particular, syscall returns an int and if you
>are calling a syscall that sends out stuff in more than one register (see
>below) it won't work.  No, I don't have a list of stuff that fails, but you
>should be able to look at the man pages and figure it out.  

Worse still, there are system calls which return values used by more
than one function.

One function may use the value in the first function return
register while another uses the value in the second function
return register.  getpid() and getppid() come to mind.
-- 
John F. Haugh II                        +-Things you didn't want to know:------
VoiceNet: (512) 832-8832   Data: -8835  | The real meaning of IBM is ...
InterNet: jfh@rpp386.cactus.org         |   ... I've Been to a Meeting.
UUCPNet:  {texbell|bigtex}!rpp386!jfh   +--<><--<><--<><--<><--<><--<><--Yea!--

dan@charyb.COM (Dan Mick) (11/28/89)

In article <128380@sun.Eng.Sun.COM> lm@sun.UUCP (Larry McVoy) writes:
>[From Sun's syscall man page:
>BUGS
>     There is no way to simulate system calls such  as  pipe(2V),
>     which  return  values in register d1 on Sun-3 and Sun-4 sys-
>     tems or in register %edx on Sun386i systems.
>]
 

Pretty neat, since Sun-4 systems have no register named 'd1'...<sigh>...
(They probably mean %o1.)

-- 
.sig files are idiotic and wasteful.