[net.sport.football] Mathematical ranking methods

djb@cbosgd.UUCP (11/02/83)

About once per week USA TODAY publishes a ranking of college football
teams done according to a mathematical algorithm developed by a
mathematics professor at Penn State.  According to the blurb printed
along with the ratings, the algorithm gives a team one point for each 
win and one-half for a tie, compares schedules (somehow) and produces a
logarithmic power indicator (0.00 being a super-team, with values of around
12.00 for the weakest teams).  In addition, the method allows extrapolation
of what the outcomes would be if all the listed teams played each other.
This metric is expressed in terms of the percentage of games each team
would win.  Teams are ranked according to their power factor.  The current
standings are quite interesting, with several notable differences between
power ratings and the coaches/wire-services poll.

My question is, does anyone know how this algorithm or others similar
to it are implemented?  The idea seems to be to derive a mathematical
way of characterizing the toughness of a team's schedule, with some
weighting factors for individual game outcome.  (If you lose to a tough team,
you don't get penalized as much as if you lose to a weaker team.  Conversely,
if you beat a tough team you get more than if you beat a weaker team.)
The logic is basically obvious, but the translation into mathematical
terms seems considerably less clear.
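
One way to turn that logic into arithmetic, purely as a sketch and not
the USA TODAY method (the blurb doesn't give its details): score each
game by the opponent's winning percentage.  A win earns the opponent's
percentage, a loss costs one minus it, and wins and ties count 1 and 1/2
as in the blurb.  The function name, the input format, and the scoring
formula itself are all assumptions made for illustration.

```python
from collections import defaultdict

def schedule_weighted_scores(results):
    """results: one (team, opponent, outcome) tuple per game, where
    outcome is 1.0 if team won, 0.5 for a tie, 0.0 if team lost."""
    points = defaultdict(float)   # 1 per win, 1/2 per tie
    played = defaultdict(int)
    for team, opp, outcome in results:
        points[team] += outcome
        points[opp] += 1.0 - outcome
        played[team] += 1
        played[opp] += 1

    # Winning percentage stands in for "toughness" (an assumption).
    pct = {t: points[t] / played[t] for t in played}

    scores = defaultdict(float)
    for team, opp, outcome in results:
        # Beating a tough team earns more; losing to one costs less.
        scores[team] += outcome * pct[opp] - (1.0 - outcome) * (1.0 - pct[opp])
        scores[opp] += (1.0 - outcome) * pct[team] - outcome * (1.0 - pct[team])
    return dict(scores)

# Invented results for three hypothetical teams.
demo = schedule_weighted_scores(
    [("A", "B", 1.0), ("B", "C", 1.0), ("A", "C", 1.0)])
```

Since toughness here comes straight from raw records, this is only a
first pass; a fuller scheme would presumably feed the scores back in and
repeat.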

I would like to have some crafty scheme to rank teams analytically,
even if it does oversimplify in places.  If I can't find out how the
existing ones work, I'll take reasonable suggestions on how to develop
one myself.

	David Bryant   Bell Labs   Columbus, OH   (614) 860-4516
	(cbosg!djb)

ps: Any of you folks from Penn State out there?  I can dig up the
    name of the professor there (as I remember, his last name starts
    with a 'J') if it'll help.

israel@umcp-cs.UUCP (11/04/83)

How is the showing of a team calculated using this?

For example, if one team blows another out of the water, how
does that differ from edging by them?  If all of Nebraska's
games had been won as one-point squeakers, would Nebraska
have the same or a different ranking?
-- 

^-^ Bruce ^-^

University of Maryland, Computer Science
{rlgvax,seismo}!umcp-cs!israel (Usenet)    israel.umcp-cs@CSNet-Relay (Arpanet)

israel@umcp-cs.UUCP (11/06/83)

[ The following message is an edited reply of mine to some personal
  correspondence.  After I wrote it, I decided that it was of
  general interest. ]

	From:     phipps@fortune
	
	A big thing to me is the quality of the opposition.  This is
	difficult to quantify convincingly; I'm not sure that using the
	won/loss records of opponents isn't just shifting the source of
	the uncertainty (won/loss records) from one team to its
	opponents.

I think that what's needed first is a way of quantifying the
strength of a particular win, such as Nebraska's games this season
being in the 6 - 12 range, Auburn's 35 - 23 win over Maryland being
around a 2.5, and Md's win over UNC being around a 1.2.  A tie would
be a strength factor of 1.  This maybe could be some function of
passing and rushing yardage, turnovers, penalties, points (of course),
and other features.  Maybe what would be needed along with this is
offset factors for special situations.  There is one that I can think
of offhand:

lower (or raise) the win strength factor (WSF) if there were
injuries to the losing (or winning) side, whether before or
during the game (the adjustment shrinks the closer the injury
comes to the end of the game).

In other words, missing key players for the winning side
will strengthen the WSF;  missing key players for the
losing side will weaken the WSF.  If the injuries are
during the game, then it won't change the WSF as much.

A losing team missing key players could even end up with
a win for a close game that they lost, but these offsets
have to be calculated carefully enough so that it doesn't
pay for a team to be missing a player.
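
As a rough sketch of how a WSF with an injury offset might look: the
only fixed points given above are that a tie is 1 and that a 35 - 23 win
is around 2.5, so the linear scale of 8 points per unit below is an
assumption picked to hit both.  The function names, the 0.5 offset
weight, and the linear fade over a 60-minute game are likewise
hypothetical.

```python
def win_strength(winner_pts, loser_pts, points_per_unit=8.0):
    """WSF from the score alone: 1.0 for a tie, growing with the margin.
    The 8-points-per-unit scale is an assumed calibration."""
    return 1.0 + (winner_pts - loser_pts) / points_per_unit

def injury_offset(side, minutes_played=0.0, weight=0.5, game_minutes=60.0):
    """Adjustment to the WSF for a key player lost by `side`
    ('winner' or 'loser').  minutes_played is 0 for a pre-game injury;
    the offset fades linearly to zero by the end of the game."""
    remaining = max(0.0, 1.0 - minutes_played / game_minutes)
    sign = 1.0 if side == "winner" else -1.0   # winner short-handed: stronger win
    return sign * weight * remaining

# A 35 - 23 win where the loser lost a key player at halftime.
wsf = win_strength(35, 23) + injury_offset("loser", minutes_played=30.0)
```

Yardage, turnovers, and penalties would enter as further terms; they are
left out here because nothing above says how to weight them.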

	And how should catching a higher-ranked team on a letdown
	Saturday affect things ?  A win is a win, but I wouldn't count
	it as much as I would if the losing opponent were really up for
	the game.

I don't see any way of quantifying that, and anyway, a good team is one
that doesn't get caught by letdowns (as demonstrated in the
Auburn-Maryland game).

Anyone have any ideas on this?  People interested in trying to build
a mathematical ranking system over the net?
-- 

^-^ Bruce ^-^

University of Maryland, Computer Science
{rlgvax,seismo}!umcp-cs!israel (Usenet)    israel.umcp-cs@CSNet-Relay (Arpanet)

djb@cbosgd.UUCP (David J. Bryant) (11/07/83)

Well, since I posted a recent article asking about mathematical ranking
systems, you can count me as being interested in the topic.  I can see 
this going in two directions.  Either we can try and quantify a quick and 
approximate method of ranking teams or we can immediately go after a 
complex scheme that takes into account game statistics, injuries, home 
field advantage, outcome of previous games, rivalries, etc.

As for discussion up to this point, I think it is headed toward the
latter objective, which I must say I don't mind.  Clearly all those
complex factors come into play in determining the outcome of the game,
and so should be considered in any truly complete mathematical ranking 
system (even if you don't exactly know how to handle "intangibles").
However, I would prefer a simpler, more modest start.  I'd rather have
something coarse and fairly straightforward that didn't require 3 megabytes
of statistical data to run.  I consider it a great chore to type in all
the schedules and outcomes for the major college teams.  It seems you
should be able to do a first approximation based on some simpler set
of data about each team's schedule, record, and the records for each 
opponent.  For sake of argument, and hopefully to start things off,
here is a simple example:

	For team A, calculate the average number of wins for all
	the teams A has beaten.  Subtract off the average number of 
	losses for all the teams A has lost to.  This is team A's 
	strength factor.  Ties count as both a win and a loss.

A program to do this would take 15 minutes to write, and it uses
a small amount of data (both important points).  Comments?
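
The scheme above can be sketched in a few lines of Python.  This is one
reading of it, with assumptions labeled: a tie adds one win and one loss
to each team's record and puts the opponent in both the "beaten" and
"lost to" lists, and an empty list averages to zero.  The function name
and the input format are made up for the example.

```python
from collections import defaultdict

def strength_factors(games):
    """games: list of (team_a, score_a, team_b, score_b) tuples."""
    wins = defaultdict(int)
    losses = defaultdict(int)
    beaten = defaultdict(list)    # opponents each team has beaten
    lost_to = defaultdict(list)   # opponents each team has lost to

    def record(winner, loser):
        wins[winner] += 1
        losses[loser] += 1
        beaten[winner].append(loser)
        lost_to[loser].append(winner)

    for a, score_a, b, score_b in games:
        if score_a > score_b:
            record(a, b)
        elif score_b > score_a:
            record(b, a)
        else:
            # Tie: counts as both a win and a loss for each side.
            record(a, b)
            record(b, a)

    def mean(values):
        return sum(values) / len(values) if values else 0.0  # assumed default

    teams = set(wins) | set(losses)
    return {
        t: mean([wins[o] for o in beaten[t]])
           - mean([losses[o] for o in lost_to[t]])
        for t in teams
    }

# Invented scores for three hypothetical teams.
factors = strength_factors(
    [("A", 21, "B", 14), ("B", 10, "C", 3), ("C", 7, "A", 7)])
```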

	David Bryant   Bell Labs   Columbus, OH   (614) 860-4516
	(cbosg!djb)

ps: I don't suppose anyone has already typed in all the college schedules
    and results, and is willing to make this data available...?

israel@umcp-cs.UUCP (11/09/83)

	  Any attempt to rank teams is always going to conflict with
	some other method of ranking them.

So what?  Does the existence of the AP poll stop the UPI poll from
existing?  Do both stop the net poll?

Why I'm interested in doing such a thing is to see if we can come up
with a method of ranking that does a fairly good job of predicting.
I think that a problem with the polls is that they change too often.
If Grenada State U. is ranked 12th, and they play Nebraska and lose,
they could drop down to 18th or 20th, or even out of the polls.  If
they were considered better than eight other teams in the top twenty
before the game, and were also predicted to lose to Nebraska, why
does losing change people's opinions of them so that they are now
considered worse than those other teams?

I think that the sign of a good system for ranking is that it tends
to approach an equilibrium state where most results are in accordance
with the ranking system, and it tends to change less and less as
it gets more data (read: later in the season).
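
Both halves of that test can be measured.  A minimal sketch, with
function names and input formats assumed for illustration: the first
function gives the fraction of decided games in which the higher-ranked
team won, and the second gives how far teams moved between two
successive rankings.

```python
def agreement(ranking, results):
    """ranking: teams listed best first.
    results: (winner, loser) pairs; ties are simply left out.
    Returns the fraction of games won by the higher-ranked team."""
    position = {team: i for i, team in enumerate(ranking)}
    ranked_games = [(w, l) for w, l in results
                    if w in position and l in position]
    if not ranked_games:
        return 0.0
    consistent = sum(1 for w, l in ranked_games if position[w] < position[l])
    return consistent / len(ranked_games)

def rank_shift(old_ranking, new_ranking):
    """Total number of places moved by teams present in both rankings."""
    old_pos = {team: i for i, team in enumerate(old_ranking)}
    return sum(abs(i - old_pos[team])
               for i, team in enumerate(new_ranking) if team in old_pos)

# Invented example: three hypothetical teams, one upset.
share = agreement(["X", "Y", "Z"], [("X", "Y"), ("Y", "Z"), ("Z", "X")])
moved = rank_shift(["X", "Y", "Z"], ["Y", "X", "Z"])
```

A system approaching equilibrium would show the first number rising and
the second falling as the season goes on.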

-- 

^-^ Bruce ^-^

University of Maryland, Computer Science
{rlgvax,seismo}!umcp-cs!israel (Usenet)    israel.umcp-cs@CSNet-Relay (Arpanet)

woods@hao.UUCP (Greg Woods) (11/14/83)

  Any attempt to rank teams is always going to conflict with some other method
of ranking them. We've seen the subjective methods fail to agree, with the UPI
and AP polls rarely agreeing on all teams (and certainly not with the USENET
poll either :-), and even one dramatic case where two simpler, more objective
ranking methods failed to work, during the year of the baseball strike when
the Reds had the best record in baseball over the course of the whole season
and didn't even make the playoffs. Try to use a complicated mathematical ranking
method and sooner or later there will come a time when your method places some
2-5 team over a 4-3 team, say, and the controversy will continue.

			GREG
-- 
{ucbvax!hplabs | allegra!nbires | decvax!brl-bmd | harpo!seismo | ihnp4!kpno}
       		        !hao!woods