[bionet.general] New Network Service from BIONET - FASTA-MAIL Similarity Searches!!

kristoff@NET.BIO.NET (Dave Kristofferson) (01/10/89)

     IMPORTANT INFORMATION - PLEASE SAVE FOR FUTURE REFERENCE!!!


As part of our efforts to provide easier access to the BIONET resource
and reduce communications costs, we are pleased to announce the
availability of a new service to the scientific community.  

The BIONET FASTA-MAIL program for nucleic acid and protein sequence
database similarity searches is now available on an experimental basis
to academic and other non-profit users with electronic mail access to
the BIONET computer, net.bio.net.  This should include users on the
Internet, BITNET, EARN, NETNORTH, and JANET.

To access this program, it is first necessary to register with BIONET
by sending an electronic mail message to one of the following addresses:

	fasta-req@net.bio.net  (from Internet/BITNET/EARN/JANET/NETNORTH)

	...!{backbone}!bionet!fasta-req (via UUCP)

If you are connected to the networks via UUCP, you must find a
mailpath to bionet.

PLEASE INCLUDE IN THE MESSAGE A DESCRIPTION OF YOUR PROFESSIONAL
POSITION SO THAT WE CAN ASCERTAIN THAT YOU ARE INDEED ENGAGED IN
NON-PROFIT RESEARCH.  When you receive confirmation of registration by return
e-mail, you will be given an address to access the FASTA-MAIL program.
Directions for use are given below.  There is no fee for the use of
the program, but outside users are on their own.  BIONET does not have
the resources to provide technical support to scientists who do not
have accounts on the BIONET computer.  Although you must interpret
your own results (see references below), we will respond to problems
with the program (mail to fasta-req@net.bio.net).

To ensure that outside users of the FASTA-MAIL program do not drain
computer resources from our regular BIONET subscribers, jobs submitted
from other computers will be placed in a low priority queue.  This
queue can be monitored by sending mail to fasta-q@net.bio.net.  BIONET
will NOT answer questions about job turnaround times.  Also please
note that the program will only allow ONE JOB per registered user in
the queue AT ANY TIME.  NOTE(!!): If you submit a second job while
your first job is still in the queue, your first job will be
automatically canceled!!

I would like to thank both Eliot Lear, one of BIONET's systems
programmers, for his fine work in developing FASTA-MAIL, and Dr.
William Pearson at the University of Virginia, the author of the
original FASTA program.

				Sincerely,

				David Kristofferson, Ph.D.
				BIONET Resource Manager

				kristoff@net.bio.net
			     or	kristofferson@bionet-20.bio.net

----------------------------------------------------------------------

	     HOW TO USE FASTA-MAIL FROM OUTSIDE OF BIONET

To use the FASTA-MAIL service, simply register as described above and
then send a mail message to the batch server with the following
information in the body of the message (mailing address is supplied to
registered users).  Please follow this format PRECISELY, but note that
the program is case-insensitive, i.e., either upper or lower case
letters may be used.  The syntax is rather inflexible, as this is an
initial version of the code.  (There will be improvements later.)  An
example is provided at the end of this message.  Note that "{your
sequence data here}" only marks where the data goes; do NOT enclose
the data in braces, {}.
Sequence data should be limited to no more than 80 characters per
line. 


DATALIB database-name
KTUP ktup-val
BEGIN
{your sequence data here}


The value "database-name" is the name of the database to be searched
and is selected from the left-hand column in the following table.

database-name				Description
-------------				-----------
pir					Protein Identification Resource
swiss-prot				SWISS-PROT Protein Database
embl					entire EMBL nucl. acid seq. database
genbank/primate				Primate Section of Genbank
genbank/rodent				Rodent ...
genbank/other_mammalian			Mammal ...
genbank/other_vertebrate		Vertebrate ...
genbank/invertebrate			Invertebrate ...
genbank/plant				Plant ...
genbank/organelle			Organelle ...
genbank/bacterial			Bacterial ...
genbank/structural_rna			Structural RNA
genbank/viral				Viral
genbank/phage				Phage ...
genbank/synthetic			Synthetic
genbank/unannotated			Unannotated
genbank/all				Entire Genbank


The value "ktup-val" is the k-tuple value as defined in Pearson and
Lipman's paper:

Pearson, W.R., and Lipman, D.J. (April 1988).  "Improved tools for
biological sequence comparison."  Proc. Natl. Acad. Sci., Vol.
85, pp. 2444-2448.

Typical values for ktup-val are 4 for nucleic acid searches and 2 or 1
for protein searches.
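
For those who prepare requests with a script, the three header lines
described above could be generated along the following lines.  This is
only a sketch, not part of FASTA-MAIL: the helper name build_header is
hypothetical, the set of database names is a subset of the table above
that you would extend yourself, and picking a default ktup from the
typical values quoted above is an assumption of the sketch.

```python
# Subset of the database names from the table above; extend as needed.
KNOWN_DATABASES = {
    "pir", "swiss-prot", "embl",
    "genbank/primate", "genbank/rodent", "genbank/bacterial",
    "genbank/viral", "genbank/all",
}
PROTEIN_DATABASES = {"pir", "swiss-prot"}


def build_header(database, ktup=None):
    """Return the DATALIB, KTUP and BEGIN lines (hypothetical helper)."""
    name = database.lower()  # the program is described as case-insensitive
    if name not in KNOWN_DATABASES:
        raise ValueError(f"unknown database name: {database!r}")
    if ktup is None:
        # Assumption: default to the typical values quoted above,
        # 2 for protein searches and 4 for nucleic acid searches.
        ktup = 2 if name in PROTEIN_DATABASES else 4
    if not isinstance(ktup, int) or ktup < 1:
        raise ValueError("ktup must be a positive integer")
    return [f"DATALIB {name}", f"KTUP {ktup}", "BEGIN"]


print("\n".join(build_header("GenBank/Primate")))
```

The sequence data, in one of the two formats described below, would
then follow the BEGIN line.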


Sequence data must be in either Pearson's FASTA format or
IntelliGenetics format.  Please use no more than 80 characters per line!

Pearson format (a comment line beginning with a > followed by the
sequence data):

>AGMREP4 - Monkey SV40-like genomic segment promoting transcription.
ccccttcaaatctattacaaggtgagcgtctcgccaaggcaatgaaatcgcaatatgatg
tttccatttactttggattatacgtcattataaatattaacaaataagactcaaaaagga
caccttcgggtaggtcagaccaaagtacaaaacttgtgtgtggggctgcagtttgagggc
agtgtctgcagccgtcacatggtagcaaaacggtgttaagcagcgcacgagagtctgcgt
cgaccacagccagagtccatgcatcgggaggttcactcggtttgcgaagaacgggcaggg
catgcacggcctgggctcggcgggcgggcgggcgggccggggcgcagttccccaggttcg
ccactagaggtcaggaggtgaccgcttcggggctggaagacgggcccgtcgtggattggc
tagtgccggcggagggcggggcggagagtggggcggggcggagagtggggcggggcgcag
ttccccaggttcgccactagaggtcaggaggtgaccgcttcggggcgggaagactggccc
gtcggggattggctagtgccggcggggggcggggcggggggcggagggcggggtggacgt
ggcgcctggttgctgacatctggaatgacttttttttggcatcagatttcctgtctttgt
ggggctgatggacccgagtaaagatgcccgttcggggtcaaaggcagagccgcttctgca
gcttctcaaagc

IntelliGenetics format:

;Comment lines (optional) start with a semicolon, e.g.,
;Monkey SV40-like genomic segment promoting transcription.
;Sequence data ends in a 1 (below) if linear or
;a 2 if circular.
;Next line (first non-optional line) is the sequence name.
AGMREP4
ccccttcaaatctattacaaggtgagcgtctcgccaaggcaatgaaatcgcaatatgatg
tttccatttactttggattatacgtcattataaatattaacaaataagactcaaaaagga
caccttcgggtaggtcagaccaaagtacaaaacttgtgtgtggggctgcagtttgagggc
agtgtctgcagccgtcacatggtagcaaaacggtgttaagcagcgcacgagagtctgcgt
cgaccacagccagagtccatgcatcgggaggttcactcggtttgcgaagaacgggcaggg
catgcacggcctgggctcggcgggcgggcgggcgggccggggcgcagttccccaggttcg
ccactagaggtcaggaggtgaccgcttcggggctggaagacgggcccgtcgtggattggc
tagtgccggcggagggcggggcggagagtggggcggggcggagagtggggcggggcgcag
ttccccaggttcgccactagaggtcaggaggtgaccgcttcggggcgggaagactggccc
gtcggggattggctagtgccggcggggggcggggcggggggcggagggcggggtggacgt
ggcgcctggttgctgacatctggaatgacttttttttggcatcagatttcctgtctttgt
ggggctgatggacccgagtaaagatgcccgttcggggtcaaaggcagagccgcttctgca
gcttctcaaagc1
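
The two listings above differ only in the comment marker, the separate
name line, and the trailing 1 or 2.  As an illustration, a rough
conversion from Pearson format to IntelliGenetics format might look
like this.  The function name pearson_to_ig is hypothetical, and taking
the first word of the comment as the sequence name is an assumption
based on the AGMREP4 example, not a rule of either format.

```python
def pearson_to_ig(text, circular=False, width=60):
    """Convert one Pearson-format entry to IntelliGenetics format (sketch)."""
    if not 1 <= width <= 79:
        raise ValueError("width must leave room for the terminator in 80 columns")
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("Pearson format must start with a '>' comment line")
    comment = lines[0][1:].strip()
    # Assumption: the first word of the comment serves as the sequence name.
    name = comment.split()[0] if comment else "UNNAMED"
    residues = "".join(lines[1:])
    if not residues:
        raise ValueError("no sequence data found")
    # Re-wrap the sequence into fixed-width lines.
    body = [residues[i:i + width] for i in range(0, len(residues), width)]
    # Terminator as described above: 1 if linear, 2 if circular.
    body[-1] += "2" if circular else "1"
    return "\n".join([";" + comment, name] + body) + "\n"
```

The terminator is appended to the last sequence line, as in the listing
above, so no output line exceeds 80 characters for any permitted width.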


Once we receive your message on our machine, it is placed in a batch
queue.  It will be processed in the order it is received.  Please note
that we are only allowing one job per user in the queue.  If you have
submitted one job, don't submit another before receiving your results
back or the previous job will be cancelled.  Your position in the
queue can be monitored by mailing to fasta-q@net.bio.net.
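
To make the one-job-per-user rule concrete, here is a toy model of such
a queue.  It is not BIONET's code; the class and method names are
invented for illustration, and placing the replacement job at the back
of the queue is an assumption, since the description above does not say
where it goes.

```python
from collections import OrderedDict


class JobQueue:
    """Toy model of a first-come, first-served queue, one job per user."""

    def __init__(self):
        self._jobs = OrderedDict()  # user -> job, kept in arrival order

    def submit(self, user, job):
        """Queue a job; return the user's canceled earlier job, if any."""
        canceled = self._jobs.pop(user, None)
        # Assumption: the new job joins the back of the queue.
        self._jobs[user] = job
        return canceled

    def position(self, user):
        """Return the 1-based position of the user's job, or None."""
        for index, queued_user in enumerate(self._jobs, start=1):
            if queued_user == user:
                return index
        return None

    def next_job(self):
        """Remove and return the oldest (user, job) pair."""
        return self._jobs.popitem(last=False)
```

In this model a user who resubmits loses both the earlier job and the
place it held in the queue.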

An example follows.  Please note that the first four lines below are a
mail header which is automatically created when you address a mail
message.  The From: and To: addresses are completely fictitious.
Nothing need be entered as a Subject: for the message.  The text that
you enter into the body of the mail message begins with DATALIB.


From drbob@someaddress.somewhere.edu Tue Jun 14 21:36:38 1988
Date: 14 Jun 1988 2129:02-PDT
To: xxxxx@net.bio.net
Subject: Batch Job

DATALIB GenBank/other_mammalian
KTUP 4 
BEGIN
>BOVPRL GenBank entry BOVPRL from omam file.  907 nucleotides.
TGCTTGGCTGAGGAGCCATAGGACGAGAGCTTCCTGGTGAAGTGTGTTTCTTGAAATCAT
CACCACCATGGACAGCAAAGGTTCGTCGCAGAAAGGGTCCCGCCTGCTCCTGCTGCTGGT
GGTGTCAAATCTACTCTTGTGCCAGGGTGTGGTCTCCACCCCCGTCTGTCCCAATGGGCC
TGGCAACTGCCAGGTATCCCTTCGAGACCTGTTTGACCGGGCAGTCATGGTGTCCCACTA
CATCCATGACCTCTCCTCGGAAATGTTCAACGAATTTGATAAACGGTATGCCCAGGGCAA
AGGGTTCATTACCATGGCCCTCAACAGCTGCCATACCTCCTCCCTTCCTACCCCGGAAGA
TAAAGAACAAGCCCAACAGACCCATCATGAAGTCCTTATGAGCTTGATTCTTGGGTTGCT
GCGCTCCTGGAATGACCCTCTGTATCACCTAGTCACCGAGGTACGGGGTATGAAAGGAGC
CCCAGATGCTATCCTATCGAGGGCCATAGAGATTGAGGAAGAAAACAAACGACTTCTGGA
AGGCATGGAGATGATATTTGGCCAGGTTATTCCTGGAGCCAAAGAGACTGAGCCCTACCC
TGTGTGGTCAGGACTCCCGTCCCTGCAAACTAAGGATGAAGATGCACGTTATTCTGCTTT
TTATAACCTGCTCCACTGCCTGCGCAGGGATTCAAGCAAGATTGACACTTACCTTAAGCT
CCTGAATTGCAGAATCATCTACAACAACAACTGCTAAGCCCACATTCCATCCTATCCATT
TCTGAGATGGTTCTTAATGATCCATTCCCTGGCAAACTTCTCTGAGCTTTATAGCTTTGT
AATGCATGCTTGGCTCTAATGGGTTTCATCTTAAATAAAAACAGACTCTGTAGCGATGTC
AAAATCT
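
Because the syntax is inflexible, it may be worth checking a message
body before mailing it.  The following sketch tests the points stated
in this message (the three keyword lines in order, a numeric ktup, some
sequence data, and the 80-character limit).  The function name
check_request is hypothetical, and the actual server may accept or
reject input differently.

```python
def check_request(body):
    """Return a list of problems found in a request body (empty if none)."""
    problems = []
    lines = body.splitlines()
    for number, line in enumerate(lines, start=1):
        if len(line) > 80:
            problems.append(f"line {number} is longer than 80 characters")
    fields = [line.split() for line in lines[:3]]
    # Keywords are compared in upper case, since case does not matter.
    keywords = [f[0].upper() if f else "" for f in fields]
    if keywords != ["DATALIB", "KTUP", "BEGIN"]:
        problems.append("expected DATALIB, KTUP and BEGIN as the first three lines")
    else:
        if len(fields[0]) != 2:
            problems.append("DATALIB takes exactly one database name")
        if len(fields[1]) != 2 or not fields[1][1].isdigit():
            problems.append("KTUP takes exactly one numeric value")
    if not any(line.strip() for line in lines[3:]):
        problems.append("no sequence data after BEGIN")
    return problems
```

An empty list means only that none of these checks failed; it does not
verify the database name or the sequence format itself.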


While we cannot answer questions about interpretation of results, job
turnaround times, etc., we will respond to any bug reports about the
software.  Please direct any problems to fasta-req@net.bio.net.