[bionet.biology.computational] Computer Systems for the Life Sciences !?!?!

curtiss@umiacs.umd.edu (Phil J. Curtiss) (05/10/91)

As some of you may know, the University of Maryland Institute for Advanced
Computer Studies is currently looking into the issues surrounding the
formation of a biocomputation lab here at the College Park campus.  This lab
would contain computer systems and software designed to allow biologists
(mainly) to experiment with computational and mathematical models of
biological processes.  In addition, the lab may be of use to other
disciplines.

At this point, it is not clear what biologists would find useful in a
computer system.  That is to say, `performance' and `ease of use' seem to
be trade-offs in most computer systems when applied to the biological
sciences.  We would like to put systems together in our lab that are both
easy to use and deliver supercomputer performance.  Our first system
to attempt this will involve creating a hardware and software link between a
Connection Machine CM-2 and a Stardent 3020 series machine.  We hope to be
able to create the system in such a way that the CM appears to be as
transparent as possible while still providing power and the flexibility to
alter the user's project.

My question is:  In your opinion, what would make an ideal computer system
for your research efforts in terms of performance, ease of use,
accessibility, hardware and software, etc.?
--
                                --- Moderator ---
Domain: curtiss@umiacs.umd.edu		     Phillip Curtiss
  UUCP:	uunet!mimsy!curtiss		UMIACS - Univ. of Maryland
 Phone:	+1-301-405-6710			  College Park, Md 20742

rdkeys@ccvr1.cc.ncsu.edu (05/11/91)

1. Must be big machine with internet connection.

2. Must have SAS on board.

3. Must have Fortran on board.

4. Must have GOOD C and C++ on board.

5. Must have good graphics program on board for laserjet and HPGL
   output to any size plotter.

6. Must be able to handle large databases from bionet genbank, embl,
   etc.

7. Must NOT have other than UNIX operating system. 

8. Each user gets 100 MB of storage space and at least 8 MB of RAM access.

9. Must have functional TeX/LaTeX system with output device (postscript
   is best probably).

10. Must have ethernet for high speed connects to local users, in
    addition to 19200 pc async connects.

11. Must have full GnuPlot system on board.

12. Should have good 3-dimensional graphics facilities.  Assume output
    to a workstation.

13. Keep the secretaries and admin folks OFF the BLOODY MACHINE.

14. Main box is a big unix machine.  Workstations abound therefrom.

15. CD-ROM with online research bibliographic databases (Agricola, etc.)
    on board is a MUST.

16. Some newsfeed to bionet newsgroups is a MUST.


Etc., etc., etc., to fill out the works.  Perhaps I am thinking too
small.....

Good Luck, rdkeys@ccvr1.cc.ncsu.edu



sss3@ukc.ac.uk (05/11/91)

Our department has access to an AMT DAP 32x32 processor array which
is used for DNA and protein sequence analysis using software produced
by Dr. J. Collins and Dr. A. Coulson of Edinburgh University.

My research is aimed at a similar program using Kent's MIMD Meiko
Computing Surface.  My speciality is OCCAM programming on
transputers.  I have no desire to go back to sequential programming
because I feel it hinders my thinking; yes, I am a bit of a CSP fan.
There is a lot of talk about using functional languages with implicit
parallelism, or using sequential languages with parallel extensions.
I feel this to be a mistake.  OCCAM is proving itself to be easy,
logical and secure, and also offers a higher degree of performance due
to its explicit parallel nature.  On the question of ADA: that
language, although closely related to CSP (as is OCCAM), is proving to
be less efficient than OCCAM.  For example, it takes 60-100 times as long
to perform a context switch between concurrent processes, and it allows
output guards on channels, which consume a lot of bandwidth and slow
down communication.

Results so far are promising.  I am using the Needleman/Wunsch
algorithm at present just to test various topologies.  In the case of
the Meiko I can reconfigure the processors from software, allowing
great flexibility and control over the distribution of data.  I am six
months into the project and have achieved a speed of 12000 comparisons
a minute on a pipeline of 16 workers.  I have access to 90
transputers, soon to be increased to nearly 200 when we get the new
daemon fitted to our Sun host, allowing us to use both our cabinets.
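
For readers unfamiliar with the Needleman/Wunsch algorithm mentioned
above, a minimal sequential sketch in Python follows.  It is not the
OCCAM code running on the Meiko, and it says nothing about how the work
is split across a pipeline of transputers.  The function name and the
scoring values (match +1, mismatch -1, gap -1) are assumptions chosen
purely for illustration.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of sequences a and b.

    Illustrative only: scoring values are assumed defaults, and only the
    score is computed (no traceback of the alignment itself).
    """
    # Row 0 of the dynamic-programming table: aligning the empty prefix
    # of `a` against each prefix of `b` costs one gap per character.
    prev = [j * gap for j in range(len(b) + 1)]

    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against the empty prefix of `b`.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            # Align a[i-1] with b[j-1] (match or mismatch).
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Align a[i-1] with a gap in `b`.
            up = prev[j] + gap
            # Align b[j-1] with a gap in `a`.
            left = curr[j - 1] + gap
            curr.append(max(diag, up, left))
        prev = curr

    # The bottom-right cell holds the score of the best global alignment.
    return prev[-1]


if __name__ == "__main__":
    print(needleman_wunsch_score("ACGT", "ACT"))
```

Each comparison of this kind is independent of the others, which is
presumably why a set of sequence comparisons can be spread over a
pipeline or farm of workers.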

With the arrival of the T9000 series I envisage at least an order of
magnitude increase in performance being within reach, not to mention
the ability to write software that is no longer dependent on the
number of links associated with the transputer design.  If all goes
according to plan we should get an array of 32 T9000s when they reach
the production stage; at least that is the aim.  A new version of
OCCAM has been produced (OCCAM 91) to take advantage of the increased
performance.

At the moment BioComputing here at Kent does not have much priority; I
am the only postgraduate working in the field.  I hope to see interest
grow as more people realise the potential of the transputer and OCCAM.


************************************************************************

Shane Sturrock, BioComputing,
Biological Laboratory,
University of Kent,                   \
Canterbury,                         (}:-(   That is Biological Captain.
Great Britain.                        /

************************************************************************
