MARK@cuccfa.bitnet (01/18/91)
Greetings from a non-subscriber! Does anybody out there know of available, working software which can calculate serial fragment order from fragment pairwise overlap data? The overlap data resulting from a Columbia University scientist's new experimental techniques is purely binary in nature, indicating whether any two fragments overlap or do not overlap. NO sequence or length information is available for our fragments. I am aware of two papers from the past which addressed precisely this problem: Waterman and Griggs (1986) Interval graphs and maps of DNA Bull Math Biol 48(2): 189-195 Jungck, Dick, and Dick (1982) Computer-assisted sequencing, interval graphs, and molecular evolution BioSystems 15: 259-273 The Cambridge statistician D.G. Kendall addressed similar problems twenty years ago for purposes of seriation in archaeology. Thanks in advance for any responses! Please reply to me directly (address at the end of this message). -- Mark Reboul P.S. -- If you care to read on, here are a few amplifications.... From a novel experimental technique recently developed here at Columbia University, we will get something like a total count of fragments and a list of which pairs of them overlap. There will be nothing fractional: Either a pair of fragments will overlap or it won't. For now, we assume we will detect fragment overlaps without error. The experiment will reveal nothing about the fragment lengths, and thus we have no hope of determining serial order with any measure of length (i.e., scale) built into it. Actually, the new lab technique involves two separate libraries of clones (the clones are the sequence "fragments"), and it is only possible to detect overlaps between clones from opposite libraries. The two libraries span essentially the same stretch of chromosome, so I think as long as we are careful to label the clones distinctly, the two-library aspect of the data needs no other special treatment. Another issue is whether the pattern of fragment overlaps covers the entire stretch of DNA in a totally connected way. If necessary, we will break down the overlap data into multiple clusters of data, each describing a connected series of overlaps. We would then run the ordering algorithm separately on each self-connected cluster of overlap data. So that's our computational problem: to determine a reasonable serial order of fragments from a knowledge of their overlaps. ****************************************************************** T. Mark Reboul, Senior User Services Consultant Columbia University Comprehensive Cancer Center Computing Facility College of Physicians and Surgeons, Room 1-420 630 West 168th Street New York, NY 10032 Bitnet mark@cuccfa Internet mark@cuccfa.ccc.columbia.edu [128.59.97.1] (212) 305-7360