[comp.unix.internals] Questions regarding tcp_input.c

ktk@nas.nasa.gov (Katy T. Kislitzin) (05/24/91)
What I am trying to figure out at the moment is how the code that does slow start
actually works.  (I have read Van Jacobson's paper,
the 4.3 BSD book, and RFCs 793, 1122, etc., but I want to understand the actual
code as well.)

I have been looking at the 4.3-Reno TCP code and I have a few questions that
I hope the net can clarify.  The file ./usr/src/sys/netinet/tcp_input.c
contains the following:


int tcprexmtthresh = 3;

...

	/*
	 * In ESTABLISHED state: drop duplicate ACKs; ACK out of range
	 * ACKs.  If the ack is in the range
	 *	tp->snd_una < ti->ti_ack <= tp->snd_max
	 * then advance tp->snd_una to ti->ti_ack and drop
	 * data from the retransmission queue.  If this ACK reflects
	 * more up to date window information we update our window information.
	 */
	case TCPS_ESTABLISHED:
	case TCPS_FIN_WAIT_1:
	case TCPS_FIN_WAIT_2:
	case TCPS_CLOSE_WAIT:
	case TCPS_CLOSING:
	case TCPS_LAST_ACK:
	case TCPS_TIME_WAIT:
		if (SEQ_LEQ(ti->ti_ack, tp->snd_una)) {
			if (ti->ti_len == 0 && ti->ti_win == tp->snd_wnd) {
				tcpstat.tcps_rcvdupack++;
				/*
				 * If we have outstanding data (other than
				 * a window probe), this is a completely
				 * duplicate ack (ie, window info didn't
				 * change), the ack is the biggest we've
				 * seen and we've seen exactly our rexmt
				 * threshhold of them, assume a packet
				 * has been dropped and retransmit it.
				 * Kludge snd_nxt & the congestion
				 * window so we send only this one
				 * packet.
				 *
				 * We know we're losing at the current
				 * window size so do congestion avoidance
				 * (set ssthresh to half the current window
				 * and pull our congestion window back to
				 * the new ssthresh).
				 *
				 * Dup acks mean that packets have left the
				 * network (they're now cached at the receiver) 
				 * so bump cwnd by the amount in the receiver
				 * to keep a constant cwnd packets in the
				 * network.
				 */
				if (tp->t_timer[TCPT_REXMT] == 0 ||
				    ti->ti_ack != tp->snd_una)
					tp->t_dupacks = 0;
				else if (++tp->t_dupacks == tcprexmtthresh) {
					tcp_seq onxt = tp->snd_nxt;
					u_int win =
					    min(tp->snd_wnd, tp->snd_cwnd) / 2 /
						tp->t_maxseg;

					if (win < 2)
						win = 2;
					tp->snd_ssthresh = win * tp->t_maxseg;
					tp->t_timer[TCPT_REXMT] = 0;
					tp->t_rtt = 0;
					tp->snd_nxt = ti->ti_ack;
					tp->snd_cwnd = tp->t_maxseg;
					(void) tcp_output(tp);
					tp->snd_cwnd = tp->snd_ssthresh +
					       tp->t_maxseg * tp->t_dupacks;
					if (SEQ_GT(onxt, tp->snd_nxt))
						tp->snd_nxt = onxt;
					goto drop;
				} else if (tp->t_dupacks > tcprexmtthresh) {
					tp->snd_cwnd += tp->t_maxseg;
					(void) tcp_output(tp);
					goto drop;
				}
			} else
				tp->t_dupacks = 0;
			break;
		}
		/*
		 * If the congestion window was inflated to account
		 * for the other side's cached packets, retract it.
		 */
		if (tp->t_dupacks > tcprexmtthresh &&
		    tp->snd_cwnd > tp->snd_ssthresh)
			tp->snd_cwnd = tp->snd_ssthresh;
		tp->t_dupacks = 0;
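
To check my reading of the ssthresh arithmetic in the excerpt, here is a
small standalone sketch.  The helper name half_window() and the use of plain
unsigned ints are my own assumptions for illustration; nothing like it
appears in tcp_input.c.

```c
/*
 * Hypothetical helper (not part of tcp_input.c): mirrors the
 * "win = min(snd_wnd, snd_cwnd) / 2 / t_maxseg" arithmetic from the
 * excerpt and returns the resulting ssthresh in bytes.
 */
static unsigned int
half_window(unsigned int snd_wnd, unsigned int snd_cwnd, unsigned int maxseg)
{
	unsigned int smaller = snd_wnd < snd_cwnd ? snd_wnd : snd_cwnd;
	unsigned int win = smaller / 2 / maxseg;	/* whole segments */

	if (win < 2)		/* never drop below two segments */
		win = 2;
	return win * maxseg;
}
```

With snd_wnd = 8192, snd_cwnd = 4096 and a 512-byte segment this gives
4 segments, i.e. 2048 bytes.  With snd_cwnd = 1024 the halved window would be
a single segment, so the floor of two segments applies and the result is
1024 bytes.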

As far as I can tell, the line 

				else if (++tp->t_dupacks == tcprexmtthresh) {

is the only place t_dupacks gets incremented in all of the TCP code.  On the
other hand, t_dupacks gets set to zero in all other relevant cases.  What I
would like to know is the following:

1. Why isn't t_dupacks reset to zero right after the retransmission instead of
waiting until the next time through the routine?

2. How can t_dupacks > tcprexmtthresh occur?

3. In the case

				} else if (tp->t_dupacks > tcprexmtthresh) {

why is the adjustment of cwnd and ssthresh done the next time through instead of
before the goto drop, as in the previous case?

4. Why aren't all retransmissions handled exclusively by the retransmission timer?
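
For reference, this is the toy model I have been using to trace the counter
by hand.  It is only a sketch of the branch as I read it: struct toy_tcpcb,
on_dupack(), on_new_ack() and the fast_rexmits counter are hypothetical names
of mine, and it assumes the retransmit timer is running and that every
duplicate ACK carries ti_ack == snd_una, so the "reset to zero" arm is never
taken.

```c
/* Toy stand-in for the few tcpcb fields used in the excerpt. */
struct toy_tcpcb {
	unsigned int t_dupacks;
	unsigned int snd_cwnd;
	unsigned int snd_ssthresh;
	unsigned int snd_wnd;
	unsigned int t_maxseg;
	unsigned int fast_rexmits;	/* my own counter, not in the real tcpcb */
};

static const unsigned int toy_rexmtthresh = 3;

/* One completely duplicate ACK arrives (timer running, ack == snd_una). */
static void
on_dupack(struct toy_tcpcb *tp)
{
	/* The ++ happens on every duplicate, whether or not == matches. */
	if (++tp->t_dupacks == toy_rexmtthresh) {
		unsigned int smaller =
		    tp->snd_wnd < tp->snd_cwnd ? tp->snd_wnd : tp->snd_cwnd;
		unsigned int win = smaller / 2 / tp->t_maxseg;

		if (win < 2)
			win = 2;
		tp->snd_ssthresh = win * tp->t_maxseg;
		tp->fast_rexmits++;	/* stands in for the tcp_output() call */
		tp->snd_cwnd = tp->snd_ssthresh +
		    tp->t_maxseg * tp->t_dupacks;
	} else if (tp->t_dupacks > toy_rexmtthresh) {
		tp->snd_cwnd += tp->t_maxseg;	/* inflate by one segment */
	}
}

/* An ACK for new data arrives: retract any inflation, clear the counter. */
static void
on_new_ack(struct toy_tcpcb *tp)
{
	if (tp->t_dupacks > toy_rexmtthresh &&
	    tp->snd_cwnd > tp->snd_ssthresh)
		tp->snd_cwnd = tp->snd_ssthresh;
	tp->t_dupacks = 0;
}
```

Starting from snd_cwnd = snd_wnd = 8192 and t_maxseg = 512, five duplicates
in a row leave this model with t_dupacks = 5, one retransmission,
snd_ssthresh = 4096 and snd_cwnd = 6656; a following ACK for new data pulls
snd_cwnd back to 4096 and clears the counter.  Whether the real code path
behaves this way once the timer and snd_una checks are included is exactly
what I am unsure about.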

thanks in advance,

--kt

-- 
Katy Kislitzin, ktk@nas.nasa.gov, ...!{ames, uunet}!nas.nasa.gov!ktk

[NASA/Ames is in Mt. View California.  I live in the Santa Cruz
Mountains with my cats Sid, Zippy, Nickel and Copper.]