[bionet.general] Abbreviations for ambiguous bases

roy@phri.UUCP (Roy Smith) (03/18/88)

	David Kristofferson was kind enough to send me Rich Roberts'
official enzyme list.  Looking the list over, I'm confused about what the
non-standard bases mean.  For example, I see:

AccI                           GT^JKAC
AeuI (EcoRII)                  CC^LGG

	I've never seen the J or L before.  I would guess that J is [AC]
but I've always used M for that, which I thought was the IUPAC standard.
Here's an extract from an include file I always use:

# define BASE_A         1       /* Adenine */
# define BASE_C         2       /* Cytosine */
# define BASE_G         3       /* Guanine */
# define BASE_T         4       /* Thymine */
# define BASE_U         5       /* Uracil */
# define BASE_R         6       /* A or G (puRine) */
# define BASE_Y         7       /* C or T (pYrimidine) */
# define BASE_M         8       /* A or C */
# define BASE_W         9       /* A or T */
# define BASE_S         10      /* C or G */
# define BASE_K         11      /* G or T */
# define BASE_B         12      /* C, G, or T (not A) */
# define BASE_D         13      /* A, G, or T (not C) */
# define BASE_H         14      /* A, C, or T (not G) */
# define BASE_V         15      /* A, C, or G (not T) */
# define BASE_N         16      /* A, C, G, or T (anything) */
# define BASE_BLK       17      /* Blank, place holder for insertions */
# define BASE_ERR       18      /* Error, (illegal character on input) */
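
For anyone who wants to read the codes above as sets of bases, here is a
minimal sketch in C.  It is not part of the include file: the bit-per-base
encoding and the names BIT_A through BIT_T, base_set() and codes_overlap()
are all hypothetical, and folding U into T is an assumption made only to
keep the table short.

```c
#include <ctype.h>

/* Hypothetical encoding: one bit per concrete base. */
enum { BIT_A = 1, BIT_C = 2, BIT_G = 4, BIT_T = 8 };

/* Map a one-letter code to the set of bases it stands for.
 * Returns 0 for any letter that is not in the table above. */
unsigned base_set(char c)
{
    switch (toupper((unsigned char)c)) {
    case 'A': return BIT_A;
    case 'C': return BIT_C;
    case 'G': return BIT_G;
    case 'T':
    case 'U': return BIT_T;                 /* assumption: treat U like T */
    case 'R': return BIT_A | BIT_G;         /* puRine */
    case 'Y': return BIT_C | BIT_T;         /* pYrimidine */
    case 'M': return BIT_A | BIT_C;
    case 'W': return BIT_A | BIT_T;
    case 'S': return BIT_C | BIT_G;
    case 'K': return BIT_G | BIT_T;
    case 'B': return BIT_C | BIT_G | BIT_T; /* not A */
    case 'D': return BIT_A | BIT_G | BIT_T; /* not C */
    case 'H': return BIT_A | BIT_C | BIT_T; /* not G */
    case 'V': return BIT_A | BIT_C | BIT_G; /* not T */
    case 'N': return BIT_A | BIT_C | BIT_G | BIT_T;
    default:  return 0;                     /* e.g. J or L */
    }
}

/* Nonzero if the two codes have at least one base in common. */
int codes_overlap(char code1, char code2)
{
    return (base_set(code1) & base_set(code2)) != 0;
}
```

With this encoding, codes_overlap('M', 'A') is true and
codes_overlap('K', 'A') is false, and base_set('J') comes back as 0 because
J is not in the table.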

	Did the standard change, or was I misled, or is Rich Roberts using
his own notation, or what?  Come to think of it, if I had my way, I think I
might vote for dropping the special multi-base abbreviations altogether
and forcing people who cared about such things to learn about regular
expressions; gt[ac][gt]ac makes a lot more sense to me than either gtmkac
or gtjkac.  The notational convenience of one-base, one-position often
doesn't seem worth the effort of having to remember all those non-mnemonic
abbreviations (not to mention the fact that everybody seems to have their
own idea of what those abbreviations should be).
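
To make the regular-expression suggestion concrete, here is a rough sketch
in C that rewrites a pattern written with the one-letter abbreviations as
bracket expressions.  The names iupac_class() and iupac_to_regex() are made
up for illustration, and the sketch assumes the meanings listed in the
include file above and lower-case output.  J and L are deliberately left
out, since what they mean is the open question.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Return the regex fragment for one abbreviation, or NULL if the
 * letter is not one of the codes in the table. */
static const char *iupac_class(char c)
{
    switch (tolower((unsigned char)c)) {
    case 'a': return "a";
    case 'c': return "c";
    case 'g': return "g";
    case 't': return "t";
    case 'u': return "u";
    case 'r': return "[ag]";
    case 'y': return "[ct]";
    case 'm': return "[ac]";
    case 'w': return "[at]";
    case 's': return "[cg]";
    case 'k': return "[gt]";
    case 'b': return "[cgt]";
    case 'd': return "[agt]";
    case 'h': return "[act]";
    case 'v': return "[acg]";
    case 'n': return "[acgt]";
    default:  return NULL;
    }
}

/* Expand pattern into out (at most outsize bytes, including the NUL).
 * Returns 0 on success, -1 on an unknown letter or a too-small buffer. */
int iupac_to_regex(const char *pattern, char *out, size_t outsize)
{
    size_t used = 0;

    if (outsize == 0)
        return -1;
    out[0] = '\0';
    for (; *pattern != '\0'; pattern++) {
        const char *cls = iupac_class(*pattern);
        size_t len;

        if (cls == NULL)
            return -1;
        len = strlen(cls);
        if (used + len + 1 > outsize)
            return -1;
        memcpy(out + used, cls, len + 1);   /* copies the NUL as well */
        used += len;
    }
    return 0;
}
```

Given a 64-byte buffer buf, iupac_to_regex("gtmkac", buf, sizeof buf) fills
buf with gt[ac][gt]ac, while "gtjkac" is rejected with -1 because J is not
in the table.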
-- 
Roy Smith, {allegra,cmcl2,philabs}!phri!roy
System Administrator, Public Health Research Institute
455 First Avenue, New York, NY 10016