BIOCUKM@osucc.bitnet (09/11/90)
Folks,

The recent chatter on alternative ways of presenting nucleotide sequences underscores a common dissatisfaction with the use of the first letters (Roman alphabet) of the English names of the bases. That dissatisfaction is justified since the use of ACTG takes up much more space than needed, since there is a potential ambiguity between C and G on poor quality copies of the sequence, and since computers are required to spot patterns.

If we can all agree that ACTG is a poor choice, we should switch to a better one. The chatter mentions a number of good alternatives. I suspect that a change will require one or more courageous journal editors. Are there any?

Peace,
Ulrich Melcher
kristoff@genbank.bio.net (David Kristofferson) (09/11/90)
> If we can all agree that ACTG is a poor choice, we should
> switch to a better one. The chatter mentions a number of good
> alternatives. I suspect that a change will require one or more
> courageous journal editors. Are there any?

I think that it will take a bit more than one courageous editor ... 8-)

Considering the minor uproar over some recent features table changes, just think of all of the complaints from people whose software would be rendered obsolete unless they were provided with a conversion program back to the old format. Enough said.

--
Sincerely,
Dave Kristofferson
GenBank Manager
kristoff@genbank.bio.net
triplett@CALSHP.CALS.WISC.EDU (09/12/90)
To: Ulrich Melcher

I do not agree that ACTG is a poor choice. In my experience, there has been no ambiguity in this matter. This is a long established convention, and changing it would make all previous papers unnecessarily obsolete. I would strongly encourage journal editors not to consider such an unnecessary change.

Poor quality printers can distort practically any message. I suggest that those who have problems with ACTG should improve the quality of their print output, which would be beneficial to all other printing.

Eric Triplett, University of Wisconsin
chh9@quads.uchicago.edu (Conrad Halton Halling) (09/12/90)
I find _lowercase_ acgt much easier to read than ACGT; hence, I prefer the GenBank database entries over their EMBL equivalents. Perhaps authors should use lowercase letters in their sequence figures to improve readability.

(I think that within a year or two sequence figures will largely disappear, anyway, to be replaced by the accession number and a map of the sequenced region.)

--
Conrad Halling
chh9@midway.uchicago.edu
jej@chinet.chi.il.us (joe jesson) (09/15/90)
In order to reduce the ambiguity, the code should maximize the Hamming distance, with either check bits or a simplified code. I've got many ideas for the designation...

--
---------------------------------------------------------------------------
Joseph Jesson   jej@chinet.uucp
Day (312) 856-3645   Eve. (708) 356-6817
21414 W. Honey Lane, Lake Villa, IL, 60046
---------------------------------------------------------------------------
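To illustrate the Hamming-distance point above, here is a minimal Python sketch. The 2-bit assignment of the bases and the single even-parity check bit are assumptions chosen for the example; they are not a designation proposed in this thread. The sketch only shows that one check bit raises the minimum distance between any two base codewords from 1 to 2, so a single misread bit can no longer turn one valid base into another.

```python
from itertools import combinations
from typing import Dict


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length codewords differ."""
    if len(a) != len(b):
        raise ValueError("codewords must have the same length")
    return sum(x != y for x, y in zip(a, b))


def min_distance(code: Dict[str, str]) -> int:
    """Smallest pairwise Hamming distance over all codewords in the code."""
    return min(hamming(u, v) for u, v in combinations(code.values(), 2))


def add_parity(word: str) -> str:
    """Append one even-parity check bit to a bit string."""
    return word + str(word.count("1") % 2)


# Hypothetical 2-bit assignment, for illustration only.
PLAIN = {"A": "00", "C": "01", "G": "10", "T": "11"}

# The same assignment with one check bit appended to each codeword.
WITH_PARITY = {base: add_parity(word) for base, word in PLAIN.items()}

if __name__ == "__main__":
    print("plain code:      ", PLAIN, "min distance", min_distance(PLAIN))
    print("with check bit:  ", WITH_PARITY, "min distance", min_distance(WITH_PARITY))
```

A minimum distance of 2 detects a single error but cannot correct it; correcting one error would need a minimum distance of at least 3, at the cost of longer codewords.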
BIOCUKM@osucc.bitnet (09/18/90)
Responses to my posting on adopting sequence representations other than ATCG require two clarifications:

1. Ambiguity problem

Some journals still do not require submission of sequence data to the banks. Even when submission is required, availability in the data banks sometimes lags behind publication. In these cases, I run to the library and make a photocopy of the sequence. It is on those photocopies that I have trouble telling C's from G's. Only those with superior eyesight, superior library copy machines or personal subscriptions to all the journals can breeze through such sequences error free.

2. Software conversion--no problem

I am not suggesting that databases or software be changed. ASCII codes for ACTG would still be the form in which sequences are stored and manipulated by computers. I am only suggesting that we change how those codes are depicted on the printed page. The only software that would be affected is software that can read a sequence by scanning a printed page. Such software, if it exists, does not seem to be widely used.

Peace,
Ulrich Melcher
Oklahoma State University
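The separation described in point 2 (storage stays as ASCII ACGT, only the printed depiction changes) can be sketched in a few lines of Python. The glyph table below is purely hypothetical, invented for this example; the thread does not settle on any particular set of printed symbols. The names `render` and `parse` are likewise illustrative.

```python
# Hypothetical display glyphs, chosen only so that C and G look less alike
# in print. Any one-to-one mapping would work the same way.
GLYPHS = {"A": "A", "C": "c", "G": "G", "T": "t"}

# Translation tables for both directions; the mapping must be one-to-one
# so that a printed sequence can be converted back without loss.
TO_PRINT = str.maketrans(GLYPHS)
FROM_PRINT = str.maketrans({glyph: base for base, glyph in GLYPHS.items()})


def render(stored: str) -> str:
    """Convert a stored ASCII sequence into its printed depiction."""
    return stored.upper().translate(TO_PRINT)


def parse(printed: str) -> str:
    """Convert a printed depiction back into the stored ASCII form."""
    return printed.translate(FROM_PRINT)


if __name__ == "__main__":
    stored = "GATTACAGC"
    printed = render(stored)
    print(stored, "->", printed)
    # The stored form is unchanged by a round trip through the display form.
    print(parse(printed) == stored)
```

Because the conversion happens only at the point of printing, databases and analysis software keep working on the stored form unchanged.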
roy@phri.nyu.edu (Roy Smith) (09/18/90)
Ulrich Melcher writes:
> Some journals still do not require submission of sequence data to the
> banks [...] Only those with superior eyesight, superior library copy
> machines or personal subscriptions to all the journals can breeze through
> [hard-copy] sequences error free.

Ulrich, I suppose we simply have differing points of view, but I think the answer to the problem of hard-to-read printed sequences is not to make the printed sequences easier to read, but to get rid of them! Rather than bug the journal editors to change the way they present sequences in print, bug them to insist on timely submission to the appropriate database as a prerequisite to publication.

--
Roy Smith, Public Health Research Institute
455 First Avenue, New York, NY 10016
roy@alanine.phri.nyu.edu -OR- {att,cmcl2,rutgers,hombre}!phri!roy
"Arcane? Did you say arcane? It wouldn't be Unix if it wasn't arcane!"
NUM208JN@NRCCAD.NRC.CA (JOHN NASH) (09/18/90)
In response to the comment by Ulrich Melcher on sequence changes:

UM> I am not suggesting that databases or software be changed.
UM>ASCII codes for ACTG would still be the form that sequences are
UM>stored and manipulated by computers. I am only suggesting that
UM>we change how those codes are depicted on the printed page.

I'm not too fussed about how these codes are depicted on a page. However, when I'm transcribing sequence by hand, I often use lower case "g" instead of G. (Unfortunately, this could lead to confusion because g is often used for "probably G".)

Just a thought,
cheers, John,

--------------------------------------------------------------
John H.E. Nash <Bitnet: NUM208JN@NRCCAD.NRC.CA >
Institute for Biological Sciences, National Research Council of Canada,
Ottawa, Canada K1A 0R6.
Phone: (613) 990-0990 Fax: (613) 952-9092.
==============================================================