[comp.misc] How does ftp work?

ajayshah@almaak.usc.edu (Ajay Shah) (03/03/91)

Theory based on half-baked Unix fundaes:

1. I say ftp xyz
2. /usr/ucb/ftp comes up, converts xyz into an IP address using standard services,
3. creates a request over ethernet saying "Wanna ftp".
4. On machine xyz, inetd is the first guy who sees the packet.
5. He recognises this as a ftp packet and gives it to ftpd.
6. ftpd shows me the login and password (does he use getty?)
7. If the login and password are ok, then he creates a ftp process,
   which has my UID and GID.  That takes care of file security,
   process accounting, etc.
8. The two ftp processes interact and the deed is done.

Is this correct?  What are the mechanics of creating a process in
such a way that its UID and GID are different from your own?
How does anonymous ftp work?

Any more gory details of mechanics?  What about CRC checks and
packet level handshaking?


-- 
_______________________________________________________________________________
Ajay Shah, (213)734-3930, ajayshah@usc.edu
                              The more things change, the more they stay insane.
_______________________________________________________________________________

jik@athena.mit.edu (Jonathan I. Kamens) (03/04/91)

In article <30772@usc>, ajayshah@almaak.usc.edu (Ajay Shah) writes:
|> 1. I say ftp xyz
|> 2. /usr/ucb/ftp comes up, converts xyz into an IP address using standard services,

  More specifically, if xyz is a hostname, then gethostbyname is used to get
its address, and if it's an address in the format that inet_addr understands,
then that is used instead.  So you can specify an IP address explicitly.
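
  As a rough illustration of that lookup, here is a Python sketch (ftp itself
is C; Python's socket module wraps the same library calls).  The function name
resolve() is made up for the example, and trying the numeric form first is an
assumption about ordering, not something taken from the ftp source.

```python
import socket


def resolve(host):
    """Return a dotted-quad IPv4 address for a hostname or address string."""
    try:
        # inet_aton succeeds only if host is already a numeric address;
        # inet_ntoa turns the packed form back into dotted-quad notation.
        return socket.inet_ntoa(socket.inet_aton(host))
    except OSError:
        # Otherwise treat it as a name and ask the resolver.
        return socket.gethostbyname(host)
```

  So resolve("xyz") would go through the resolver, while an address typed in
directly never touches it.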

|> 3. creates a request over ethernet saying "Wanna ftp".

  This is too low-level, and not what ftp sees at all.

  Ftp creates an Internet socket.  Then, it reads the port number for the ftp
service from /etc/services, and connects to the remote host whose address was
determined above on the correct port.

  Ftp has no concept of what an ethernet is, nor does it "send a message" to the
remote host over the ethernet.  It creates the socket and asks for it to be
connected to the remote host, and the kernel on the machine doing the ftp'ing
knows how to turn that request into an actual network connection.
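
  The socket steps just described might look something like this in Python.
It is only a sketch: ftp_port() and open_control_connection() are names
invented here, and the fallback to port 21 when no services entry exists is
an assumption added so the example works on a bare machine.

```python
import socket


def ftp_port():
    """Look up the ftp service's TCP port in the services database."""
    try:
        return socket.getservbyname("ftp", "tcp")
    except OSError:
        # Assumption: fall back to the well-known port if there is no entry.
        return 21


def open_control_connection(address, port=None):
    """Create a TCP socket and ask the kernel to connect it to address:port."""
    if port is None:
        port = ftp_port()
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # The kernel does the actual network work; we only see success or error.
        sock.connect((address, port))
    except OSError:
        sock.close()
        raise
    return sock
```

  Note that nothing in there mentions ethernet, packets, or handshakes.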

  In order to understand how the network connection actually takes place, you
need to do a bit of reading about the TCP protocol; I'm not sure you're really
interested in this.

|> 4. On machine xyz, inetd is the first guy who sees the packet.

  No, what happens is the kernel on xyz sees the incoming connection on the
ftp port, and looks to see if anyone is listening on that port.  Since the
ftpd service is run from inetd, it turns out that inetd is listening for
connections on that port, so the kernel tells the inetd process that it's got
an incoming connection.  Inetd does an "accept" on the socket, thus
establishing the connection with the host doing the ftp'ing.
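
  A minimal Python sketch of the listening side follows.  It is much simpler
than the real inetd, which watches many ports at once; listen_on() and
accept_one() are hypothetical names, and binding to 127.0.0.1 is an assumption
to keep the example local.  Binding the real ftp port would need root.

```python
import socket


def listen_on(port, address="127.0.0.1"):
    """Bind a TCP socket and tell the kernel we want connections on it."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((address, port))
    listener.listen(5)
    return listener


def accept_one(listener):
    """Block until a connection is pending on the port, then accept it."""
    conn, peer = listener.accept()
    return conn, peer
```

  The socket returned by accept_one() is the server's end of the connection
that the client's connect() asked for.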

|> 5. He recognises this as a ftp packet and gives it to ftpd.

  Once again, you're too low-level.  Inetd doesn't know anything about
"packets," it knows only about sockets.  Inetd does realize that the
connection is for ftpd, since it came in on the ftpd port.  So, it runs ftpd
as a sub-process, with the socket assigned to the standard input and standard
output of ftpd (so when ftpd reads from its standard input, it's reading from
the socket and therefore from your ftp program sending data through the
socket, and when it writes, it writes to your ftp program reading from the
socket).
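
  The trick of handing the socket to a sub-process as its standard input and
output can be sketched in Python like this.  run_service() is a made-up name,
and this is an illustration of the idea rather than inetd's actual code.

```python
import os
import socket


def run_service(conn, argv):
    """Run argv as a child with conn as its standard input and output."""
    pid = os.fork()
    if pid == 0:
        # Child: make file descriptors 0 and 1 refer to the socket.
        try:
            os.dup2(conn.fileno(), 0)
            os.dup2(conn.fileno(), 1)
            os.execvp(argv[0], argv)
        finally:
            # Only reached if dup2 or exec failed.
            os._exit(127)
    # Parent: the child holds its own copies, so drop ours.
    conn.close()
    return pid
```

  With this, something like run_service(conn, ["echo", "220 ready"]) sends
the text to whoever is on the other end of the socket, even though echo knows
nothing about networking.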

|> 6. ftpd shows me the login and password (does he use getty?)

  Nope.  If autologin is enabled in ftp (and it is by default), then ftp
prompts for a username and password and sends them over the socket to the
ftpd, using the appropriate commands in the ftp protocol for usernames and
passwords.  If autologin isn't enabled, then you need to tell ftp that you
want to log in, using the "user" command.

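  No, ftpd doesn't use getty.  All of the command and reply traffic between ftp
and ftpd takes place over the Internet socket created when you asked ftp to
connect to xyz.  (File contents and directory listings travel over a separate
data connection that the two set up for each transfer.)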
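
  To make the login exchange concrete, here is a Python sketch of the client
side.  The USER and PASS commands and the 331 and 230 reply codes come from
the ftp protocol; the function names are invented, multi-line replies are not
handled, and the caller is assumed to have read the server's greeting already.
Canned replies stand in for a real server.

```python
import io


def send_command(wfile, verb, argument):
    """Write one protocol command line, e.g. 'USER ajay', to the server."""
    wfile.write(f"{verb} {argument}\r\n".encode("ascii"))
    wfile.flush()


def read_reply_code(rfile):
    """Read one single-line reply and return its three-digit code."""
    line = rfile.readline().decode("ascii")
    return int(line[:3])


def login(rfile, wfile, user, password):
    """Send USER and PASS; return True if the server reports success."""
    send_command(wfile, "USER", user)
    code = read_reply_code(rfile)
    if code == 331:          # username accepted, password wanted
        send_command(wfile, "PASS", password)
        code = read_reply_code(rfile)
    return code == 230       # 230 means the user is logged in


# Canned server replies stand in for a real connection here.
replies = io.BytesIO(b"331 Password required.\r\n230 User logged in.\r\n")
sent = io.BytesIO()
ok = login(replies, sent, "ajay", "secret")
```

  Against a real server, rfile and wfile would come from the connected
socket's makefile("rb") and makefile("wb").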

|> 7. If the login and password are ok, then he creates a ftp process,
|>    which has my UID and GID.  That takes care of file security,
|>    process accounting, etc.

  Not quite.  The entire session is handled by ftpd on the xyz end of the
connection.  When you give a username and password to your ftp and it sends
them to the ftpd and ftpd decides they're valid, then ftpd does a setuid()
call to change its own UID to that of the user whose account you specified.
It also does an initgroups() to set up groups for the user.  It also chdir()s
into the user's home directory.
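
  In Python terms, the identity switch might be sketched as below.  This is
not ftpd's code: lookup_account() and become_user() are hypothetical names,
and the explicit setgid() call is an assumption added here to cover the
primary group.  The process must be running as root for any of it to work.

```python
import os
import pwd


def lookup_account(name):
    """Return (uid, gid, home directory) from the password database."""
    entry = pwd.getpwnam(name)
    return entry.pw_uid, entry.pw_gid, entry.pw_dir


def become_user(name):
    """Switch the calling process, which must be root, to the given account."""
    uid, gid, home = lookup_account(name)
    os.initgroups(name, gid)   # supplementary groups from the group database
    os.setgid(gid)             # assumption: primary group set explicitly here
    os.setuid(uid)             # must come last; after this we are not root
    os.chdir(home)             # done as the user, so normal permissions apply
```

  The order matters: once setuid() has given up root, the process can no
longer change its groups.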

|> 8. The two ftp processes interact and the deed is done.

  Well, actually, ftp on your end interacts with ftpd on the other end.

|> Is this correct?  What are the mechanics of creating a process in
|> such a way that its UID and GID are different from your own?

  See above, and the man pages for setuid() and initgroups().

|> How does anonymous ftp work?

  See the man page ftpd(8) for a description of how the anonymous ftp account
is set up on a machine that is willing to accept anonymous ftp.  Ftpd has
special code in it to deal with anonymous ftp.  In addition to doing setuid()
and initgroups() to the anonymous ftp account, it will allow you to specify
any non-NULL password when logging in.  Furthermore, rather than chdir()ing to
the anonymous ftp home directory, it chroot()s to it, for security reasons. 
See the man page for chroot() for details about what it does.
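
  A Python sketch of the anonymous case, under stated assumptions: the
account is taken to be named "ftp", both "anonymous" and "ftp" are treated as
anonymous logins, and the setgid() call is an addition of this example.  All
function names are invented, and the confinement step only works as root.

```python
import os
import pwd

ANONYMOUS_NAMES = ("anonymous", "ftp")   # assumption: names treated as anonymous


def is_anonymous(user):
    """Return True if the login name asks for the anonymous account."""
    return user in ANONYMOUS_NAMES


def anonymous_password_ok(password):
    """Any non-empty password is accepted for the anonymous account."""
    return bool(password)


def enter_anonymous_area(account="ftp"):
    """Confine the calling process, which must be root, to the ftp home."""
    entry = pwd.getpwnam(account)
    os.initgroups(account, entry.pw_gid)   # reads the real group database
    os.chroot(entry.pw_dir)    # needs root, so it happens before setuid
    os.chdir("/")              # "/" now means the ftp home directory
    os.setgid(entry.pw_gid)    # assumption: not mentioned in the post
    os.setuid(entry.pw_uid)
```

  After enter_anonymous_area() returns, no path the process can name leads
outside the anonymous ftp directory tree.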

|> Any more gory details of mechanics?  What about CRC checks and
|> packet level handshaking?

  Unnecessary at the ftp level.  TCP itself checksums every segment,
acknowledges what arrives, and retransmits what doesn't, so it delivers a
reliable, ordered byte stream.  Since ftp and ftpd use Internet sockets built
on top of TCP, data sent from ftp to ftpd and vice versa arrives intact and in
order, and if it can't be delivered, the sending process gets an error.
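
  For what it's worth, a small Python experiment shows what this looks like
from the application's side: the program just writes bytes and reads bytes,
with no checksum code of its own.  send_over_loopback() is a made-up helper,
and a loopback connection is of course a much gentler test than a real
network.

```python
import socket
import threading


def send_over_loopback(payload):
    """Push payload through a local TCP connection and return what arrives."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    def sender():
        with socket.create_connection(listener.getsockname()) as out:
            out.sendall(payload)      # raises OSError if the connection breaks

    thread = threading.Thread(target=sender)
    thread.start()
    conn, _ = listener.accept()
    chunks = []
    while True:
        chunk = conn.recv(65536)
        if not chunk:                 # empty read: the sender closed
            break
        chunks.append(chunk)
    conn.close()
    listener.close()
    thread.join()
    return b"".join(chunks)
```

  The bytes come back in the order they were sent, however the kernel chose
to split them into segments along the way.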

-- 
Jonathan Kamens			              USnail:
MIT Project Athena				11 Ashford Terrace
jik@Athena.MIT.EDU				Allston, MA  02134
Office: 617-253-8085			      Home: 617-782-0710