[sci.math.stat] genes & geography

g570907053ea@ucdavis.UUCP (g570907053ea) (11/03/86)

<worrying about the line eater is tougher than coping with it...>
(sorry if this is posted twice - my POSTNEWS suggested the first one failed)

Hi,

     I have data on ground-squirrel PAGE polymorphisms for a large number of 
sites (31) ranging through California from Santa Barbara and Bakersfield to 
Shasta, Chico and Willows, and even a few in Oregon.  Most of the sites are
mid-state, though, just north and south of the S.F. Bay, the Delta, and the
American-Sacramento River boundary.  What can one do with data like this?

     With a subset of it, mostly the central-boundary sites, we calibrated Nei's
'molecular clock', noting the degree of differentiation (Nei's 'Dv') associated
with the Delta/bay flooding.  This flooding of course halted gene flow: ground
squirrels can swim, but don't.  (This calibration was published in 'Molecular
Evolution', about 2 years ago, with David Smith and Dick Coss as authors.  I did
the calculations, and am Coss's grad student.  Since I'm more modelling- and
quantitatively-oriented, I have permission to do with the new data whatever
seems appropriate.)
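
     For readers who want the arithmetic behind a Nei-style distance, here is a
minimal sketch.  It assumes Nei's standard genetic distance, D = -ln(I), with I
the normalized identity averaged over loci; whether that matches the 'Dv'
used in the calibration is an assumption.  The function name and the input
layout (one list of allele frequencies per locus) are hypothetical.

```python
import math


def nei_standard_distance(pop_x, pop_y):
    """Nei's standard genetic distance D = -ln(I) between two populations.

    pop_x, pop_y: one list per locus, each holding allele frequencies in the
    same allele order for both populations (hypothetical input layout).
    """
    if len(pop_x) != len(pop_y) or not pop_x:
        raise ValueError("both populations need the same non-empty set of loci")
    n_loci = len(pop_x)
    j_x = j_y = j_xy = 0.0
    for freqs_x, freqs_y in zip(pop_x, pop_y):
        j_x += sum(p * p for p in freqs_x)    # homozygosity within X
        j_y += sum(q * q for q in freqs_y)    # homozygosity within Y
        j_xy += sum(p * q for p, q in zip(freqs_x, freqs_y))  # identity between X and Y
    identity = (j_xy / n_loci) / math.sqrt((j_x / n_loci) * (j_y / n_loci))
    if identity == 0.0:
        return math.inf  # no shared alleles at any locus
    return -math.log(identity)


# Made-up frequencies for two sites at two loci, for illustration only.
site_a = [[0.9, 0.1], [0.5, 0.5]]
site_b = [[0.6, 0.4], [0.5, 0.5]]
print(nei_standard_distance(site_a, site_b))
```

Applying this to every pair of sites gives the genetic distance matrix that
the geographic comparisons would start from.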

     What seems appropriate is incorporating the geography somehow.  My
leitmotiv is J. Felsenstein's article, about '82 in J. Theor. Biol., in which he
reiterates that no models exist for inferring much except genetic distances from
such genetic data, and that the effects of migration and mutation are badly
confounded there.  A similar theme is apparent in Montgomery Slatkin's 'gene
flow' review in the Annual Review of Ecology and Systematics ('84 or '85).
Slatkin proposes an emphasis on the 'rare alleles': those low-frequency,
presumably non-equilibrated sports which suggest current and recent changes.
I'd like to hear more about this notion.  It seems rather an art-form, but
rewarding.

     Other spatial statistics have also caught my eye.  One is Dan Wartenberg's
canonical correlation, which maximizes covariation between composites of the
raw genetic frequencies and polynomial functions of the site coordinates.  This
purportedly captures large, regional trends, as in the example (in his Fall '85
Systematic Zoology paper) of human migrations into NW Europe from Middle
Eastern origins.  He contrasts this with the local emphasis of Sokal's
autocorrelation analyses (the two have co-authored on these topics).  Both are
of course useful; they're not alternatives.  Finally, the Mantel statistic for
matrix comparisons is much in vogue: the matrices compared can be of genetic
distances, 'ecological' or simple geographic distance, or some more complicated
geographic matrix which includes the effects of barriers, e.g. rivers or Deltas.
Most of these are described or alluded to in either or both of Manly's new
STATISTICS OF NATURAL SELECTION or J. Endler's NATURAL SELECTION IN THE WILD.
Oddly enough, my major concern is NON-selective variation!
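
     As a concrete illustration of the Mantel comparison, here is a minimal
sketch of a permutation version.  It assumes two symmetric distance matrices
over the same sites, uses the Pearson correlation of their above-diagonal
entries as the statistic, and judges significance by relabelling the sites of
one matrix at random.  The function names, the 999 permutations and the fixed
seed are illustrative assumptions, not anything prescribed by the references
above.

```python
import math
import random


def upper_triangle(matrix, order):
    """Flatten the above-diagonal entries after relabelling sites by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a matrix with constant distances has no correlation")
    return cov / math.sqrt(var_a * var_b)


def mantel_test(genetic, geographic, permutations=999, seed=0):
    """Return (r, one-sided p) for the association between two distance matrices."""
    n = len(genetic)
    identity = list(range(n))
    geo_values = upper_triangle(geographic, identity)
    r_observed = pearson(upper_triangle(genetic, identity), geo_values)
    rng = random.Random(seed)
    at_least_as_large = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute rows and columns of one matrix together
        if pearson(upper_triangle(genetic, order), geo_values) >= r_observed:
            at_least_as_large += 1
    # Count the observed labelling as one of the permutations.
    p_value = (at_least_as_large + 1) / (permutations + 1)
    return r_observed, p_value


# Made-up example: five sites along a line, genetic distance proportional
# to geographic distance.
positions = [0, 1, 2, 3, 4]
geo = [[abs(a - b) for b in positions] for a in positions]
gen = [[2 * d for d in row] for row in geo]
r, p = mantel_test(gen, geo)
print(r, p)
```

A barrier matrix (say 1 for pairs of sites on opposite sides of the Delta, 0
otherwise) can be passed in place of the simple geographic distances.  Rows
and columns are permuted together because the entries of a distance matrix are
not independent observations.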

     Any ideas or thoughts are welcome.  I'm happy to prepublish here as 
I find the time to conduct these manipulations.  Thanks much!  Ron Goldthwaite