[comp.lang.fortran] Range of Character Variables

hybl@mbph.UUCP (Albert Hybl Dept of Biophysics SM) (03/28/89)

In message <454@loligo.cc.fsu.edu> mccalpin@loligo.cc.fsu.edu
(John McCalpin: Supercomputer Computations Research Institute)
writes:
>In article <587@mbph.UUCP> hybl@mbph.UUCP (Albert Hybl 
>Dept of Biophysics  SM) writes: ...
>>What does the ANSI X3.9-1978 standard say are the maximum allowable
>>sizes for the character array AND(K)*1 and the character variable
>>DNA*(K)?
>
>... What number do you suggest?

My answer is that the maximum allowable sizes for the character array
and the character variable must be the same, and that an EQUIVALENCE
statement relating the array and the string should work over the whole
range of the array.

As we all know, the X3.9-1978 standard allows a character array to be
declared with both a lower and an upper bound; for example,
      CHARACTER*1 AND(-67:2125)       ! conforms to the standard.
      CHARACTER*(-67:2125)  DNA       ! is not legal, so
      EQUIVALENCE  (AND,DNA)          !   cannot be legal.

I suggest that the standard allow both lower and upper bounds for a character
variable, and that it allow those bounds to be negative.  Let me try to explain why.

The National Institutes of Health is about to spend a huge amount of
money to map the human genome.  The Department of Agriculture likewise
wants to map the genomes of our grain crops.  These projects will produce
an enormous library of very valuable information.  Consider the nucleotide
sequence of the cucumber ascorbate oxidase cDNA reported in the Proc.
Natl. Acad. Sci. USA 86:1239-1243 (February 1989);  the nucleotides
are numbered from -67 to 2125.  Starting with nucleotide 1 (not -67),
the authors deduce the protein sequence of the ascorbate oxidase.
The amino acid residues of the protein are numbered from -33 to 554;
the first 33 residues (-33:-1) are part of a putative signal peptide
that is later cleaved off to produce the native protein.  Negative
indexes are required for both the nucleotides of the cDNA and the deduced
amino acid sequence of the protein.
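
Within X3.9-1978 as it stands, the closest one can get appears to be
an ordinary character variable of the same total length, EQUIVALENCEd
to the bounded array, with the index offset carried by hand.  The
following is only a sketch of that workaround, not a recommendation;
the names LO, HI, LENGTH and the statement function IPOS are invented
here for illustration, and the three nucleotide letters are arbitrary
placeholders rather than data from the paper.

```fortran
      PROGRAM OFFSET
C     Sketch only.  LO, HI, LENGTH and IPOS are illustrative names.
C     The bounds are the nucleotide numbering quoted in the text.
      INTEGER LO, HI, LENGTH
      PARAMETER (LO = -67, HI = 2125, LENGTH = HI - LO + 1)
C     Array with explicit lower and upper bounds (legal today).
      CHARACTER*1 AND(LO:HI)
C     Character variable of the same total length; its substring
C     positions can only run from 1 to LENGTH.
      CHARACTER*(LENGTH) DNA
      EQUIVALENCE (AND, DNA)
      INTEGER I, IPOS
C     Statement function: array index I -> substring position.
      IPOS(I) = I - LO + 1
C     Blank the whole sequence, then store three placeholder
C     letters through the array using the authors' numbering.
      DNA = ' '
      AND(-67) = 'G'
      AND(1) = 'A'
      AND(2125) = 'T'
C     Read the same characters back through the string.
      PRINT *, DNA(IPOS(-67):IPOS(-67)),
     &         DNA(IPOS(1):IPOS(1)),
     &         DNA(IPOS(2125):IPOS(2125))
      END
```

Every substring reference has to go through IPOS, which is exactly
the bookkeeping that bounds on the character variable itself would
make unnecessary.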

If Fortran is to be used to help archive, maintain, and retrieve all
this genetic information, then the character variable must be made more
useful and portable.

----------------------------------------------------------------------
Albert Hybl, PhD.              Office UUCP: uunet!mimsy!mbph!hybl
Department of Biophysics       Home   UUCP: uunet!mimsy!mbph!hybl!ah
University of Maryland                CoSy: ahybl
School of Medicine
Baltimore, MD  21201                 Phone: (301) 328-7940 (Office)
----------------------------------------------------------------------