NETHELP%EMBL@PUCC.PRINCETON.EDU (12/22/90)
------------------------------------------------------------------------------
| EMBL FILE SERVER News                         Number 2, December 21st 1990 |
|                                                                            |
| European Molecular Biology Laboratory, Data Library & Computer Group,      |
| Postfach 10.2209, 6900 Heidelberg, Germany.                                |
| E-mail: NetHelp@EMBL.bitnet Tel: +49 6221 387258 Fax: +49 6221 387306      |
------------------------------------------------------------------------------

Contents:

<1> Introduction
<2> New and updated SWISS-PROT entries
<3> Changes to TFD
<4> Common questions
<5> Updates to existing data collections
<6> Updates to software collection
<7> Other updates
<8> Summary of contents of file server
<9> Getting started ?


<1> Introduction
    ------------

The EMBL File Server Newsletter summarises changes to the EMBL File Server.
This newsletter and older issues are available from the server
(e.g. GET DOC:EMBL_Server_News.1).


<2> New and updated SWISS-PROT entries
    ----------------------------------

New SWISS-PROT entries and updates to existing entries are now available
between regular releases. They are not provided on a daily basis like new
nucleotide entries, but we intend to make at least one or two sets of
new/updated entries available each month. Indices are provided for the
latest full release, with separate indices for the data added since then
(see HELP PROT for further details).


<3> Changes to TFD
    --------------

The files in the TFD directory (D. Ghosh's relational Transcription Factor
Database) contained records much longer than 80 characters per line, which
caused severe problems during mail transfer. We have therefore encoded the
TFD files in "uuencode" format. The C source code for the decoding program
UUD (UUD.C) is available for VMS, UNIX and DOS machines in the directories
VAX_SOFTWARE, UNIX_SOFTWARE and DOS_SOFTWARE. See HELP SOFTWARE for more
details.


<4> Common questions
    ----------------

Q: How can I search for keywords, species, authors etc. ?

A: The EMBL File Server does not really support searches of this kind, and
   we are somewhat reluctant to build in this capability. Interactive
   access to the database is much more likely to provide satisfactory
   query ability.

   The sequence databases are distributed at very low cost on both CD-ROM
   and magnetic tape, and data is also distributed via computer networks
   to several national hosts. The EMBL CD-ROM contains flexible
   query/retrieval software for MS-DOS systems, designed to query the
   sequence databases (EMBL/SWISS-PROT) by accession numbers, entry names,
   free text, authors, citations, database cross-references, feature keys
   etc. The databases are also in a format suitable for sequence
   similarity searches with software such as FASTA.

   EMBL is also involved in a project to form a European Molecular Biology
   Network (EMBnet). Several centres have been set up to run a national
   biocomputing service; this includes access to the latest sequence data,
   which EMBL distributes to them daily ('GET DOC:EMBNET.DOC' for further
   details).

Q: Why have I only received the last part of a file ?

A: Some mailers refuse to transfer mail files which exceed a certain size
   limit. The BITNET recommendation for the maximum file size is
   256 Kbytes, but many mailers have much smaller limits. To accommodate
   most users, the file server automatically splits large files into
   parts of 95 Kbytes. Any standard editor can be used to remove the mail
   headers and join the parts. The individual parts may arrive in random
   order, but the Subject line gives you the information needed to join
   them in the correct order (part n of m).

   Large files travel more slowly through the networks than small ones,
   so in most cases the last part of a package, being the smallest,
   arrives first. Depending on the network traffic, it may take a few
   hours or even days until the other parts arrive.

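   The joining step described above can also be done by a small program.
   The following Python sketch is only an illustration, not software
   supplied by EMBL. It assumes that each part was saved as a complete
   mail message (headers, blank line, body) and that the Subject line
   contains the text "part n of m"; the exact Subject wording used by the
   server and the function name join_parts are assumptions.

```python
import re
from email import message_from_string

# Assumed Subject wording: "... part n of m ..." (hypothetical exact format).
PART_RE = re.compile(r"part\s+(\d+)\s+of\s+(\d+)", re.IGNORECASE)


def join_parts(raw_messages):
    """Strip the mail headers from each part and join the bodies in order.

    raw_messages: iterable of strings, each one saved mail message.
    Raises ValueError if a part is missing or the parts do not match.
    """
    bodies = {}
    total = None
    for raw in raw_messages:
        msg = message_from_string(raw)
        match = PART_RE.search(msg.get("Subject", ""))
        if match is None:
            raise ValueError("no 'part n of m' found in Subject line")
        n, m = int(match.group(1)), int(match.group(2))
        if total is None:
            total = m
        elif m != total:
            raise ValueError("parts belong to different files")
        # For a plain (non-MIME) message the payload is the body text.
        bodies[n] = msg.get_payload()
    if total is None:
        raise ValueError("no parts given")
    missing = [n for n in range(1, total + 1) if n not in bodies]
    if missing:
        raise ValueError(f"missing parts: {missing}")
    return "".join(bodies[n] for n in range(1, total + 1))
```

   The parts may be passed in any order; the function refuses to produce
   output while a part is still missing, which matches the advice to wait
   until all parts have arrived.
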
   If some parts never arrive at all, there is probably a computer system
   between EMBL and you which does not allow the transfer of 95-Kbyte
   files. There is not much we can do at EMBL in these cases, but you
   should check with your local computer specialists whether there are
   any local limitations at your site.


<5> Updates to Existing Data Collections
    ------------------------------------

The following data collections have been updated recently:

NUC     - Release 25, November 1990, of the EMBL Nucleotide Sequence
          Database
PROT    - Release 16, October 1990, of the SWISS-PROT Protein Sequence
          Database
EPD     - Release 25, November 1990, of Philipp Bucher's Eukaryotic
          Promoter Database
REBase  - Release 9012, December 1990, of Rich Roberts' restriction
          enzyme database
ECD     - Release 5.0 of Manfred Kroeger's E.coli database
Prosite - Release 6.0 of Amos Bairoch's protein pattern database
Enzyme  - Release 3.0 of Amos Bairoch's enzyme database
RefList - Release 13.0 of Amos Bairoch's SeqAnalRef database
TFD     - Release 2.0 of David Ghosh's transcription factor database

The directory NUC is continually updated with nucleotide sequence data
from EMBL/GenBank/DDBJ.


<6> Updates to Software Collection
    ------------------------------

Here is a list of new (N) molecular biological programs or updates (U):

DOS:
----
COSY.UAA           (U)  Complete package for enzyme kinetics (update from
                        6-Nov-1990) (M. Eberhard)
CREGEX.C           (N)  Utility program to reformat Prosite for use with
                        L. Kolakowski's PROSEARCH program (J. Leunissen)

Mac:
----
DNATRANSLATOR.HQX  (N)  HyperCard stack with utilities for phylogenetic
                        analyses (D.J. Eernisse)
ENDOCYTOSIS.HQX    (N)  Calculation of parameters of endocytosis reaction
                        (R.E. Williams)
LOOPVIEWER.HQX     (N)  Graphical representation of RNA folding
                        (D. Gilbert)
MACLIGAND.HQX      (N)  Calculation of parameters of ligand binding
                        (R.E. Williams)
MACMOLECULE.HQX    (N)  3D models of biomolecules (E. Myers et al.)
MACPATTERN.HQX     (U)  Protein pattern searching with Prosite (v1.2.1)
                        (R. Fuchs)
MULFOLD.HQX        (N)  RNA folding prediction (M. Zuker)
RIMANAGER.HQX      (N)  Analysis of genetic mapping experiments with
                        recombinant mouse strains (K. Manly)
SPEAKQUENCER.HQX   (N)  Sequence data entry with acoustic feedback
                        (C. Fritze)
STUFFIT_16.HQX     (N)  New version of archiver/binhexer (R. Lau)

UNIX:
-----
BLAST.UAA          (U)  NCBI's fast database searching package
LIBNCBI.UAA
DFA.UAA
CREGEX.C           (N)  Utility program to reformat Prosite for use with
                        L. Kolakowski's PROSEARCH program (J. Leunissen)
SIM.UUE            (N)  Local similarity searching (G. Huang and
                        W. Miller)
TREEALIG.UAA       (U)  TreeAlign multiple sequence alignment (J. Hein)

VAX/VMS:
--------
CDACCESS.UAA       (U)  New version 2.03 of ISO driver software
                        (P.A. Stockwell)
CREGEX.C           (N)  Utility program to reformat Prosite for use with
                        L. Kolakowski's PROSEARCH program (J. Leunissen)
FASTEMBL.COM       (U)  New version 1.1 of DCL shell for EMBL Mail-FASTA
                        access (E.L. Sonnhammer)
SCRUTINE.UAA       (U)  v2 of Scrutineer protein database analysis
                        package (P. Sibbald)
TREEALIG.UAA       (U)  TreeAlign multiple sequence alignment (J. Hein)


<7> Other updates
    -------------

DOC - EMBnet documentation, October 1990 (DOC:EMBNET.DOC)
    - Compilation of available servers for molecular biology, from
      Michael Gribskov (DOC:SERVER.TXT)


<8> Summary of Directories of the EMBL File Server
    ----------------------------------------------

DIR [GENERAL]

Summary of directories available on the EMBL File Server:

Databases:

EMBL Nucleotide Sequence Database                               NUC
  (Rel. 25, Nov 90 + new data from EMBL/GenBank/DDBJ)
Eukaryotic Promoter Database (Rel. 25, Nov 90)                  EPD
SwissProt Protein Database (Rel. 16, Oct 90)                    PROT
ProSite pattern database (Rel. 6.0, Nov 90)                     PROSITE
ENZYME database (Rel. 3, Dec 90)                                ENZYME
Brookhaven Protein Structure Database (Rel. 53, Jul 90)         PROTEINDATA
REBASE, Restriction Enzyme Database (Rel. 9012, Dec 90)         REBASE
TFD, Transcription Factor Database (Ver 2.0)                    TFD
The E.coli Database (Rel. 5, Nov 90)                            ECD
Drosophila Genetic Map Database (Rel. 3.5, Aug 90)              DROSOPHILA
Listing of Molecular Biology Databases, LiMB (Rel. 2.0)         LIMB
Sequence analysis bibliography (SEQANALREF Rel. 13.0, Dec 90)   REFLIST

Software:

Software for MS-DOS computers                                   DOS_SOFTWARE
Software for Apple Macintosh                                    MAC_SOFTWARE
Software for UNIX                                               UNIX_SOFTWARE
Software for VAX/VMS                                            VAX_SOFTWARE
Other software, GenBank Clearinghouse, etc.                     MISC_SOFTWARE

Miscellaneous:

Technical documents, submission and order forms, etc.           DOC
Multiple DNA sequence alignments and consensus sequences        ALIGN
Codon Usage tables                                              CODONUSAGE


<9> Getting Started ?
    -----------------

The EMBL File Server is a facility available on the EMBL computing system
for external users to request files by electronic mail. The service is
free. For initial information, send standard electronic mail to the
address NETSERV@EMBL.bitnet containing just the word HELP on a line by
itself. No essays please. For human contact, send electronic mail to
NetHelp@EMBL.bitnet.
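
For readers who prefer to prepare the initial request with a program, the
following Python sketch builds a message of the kind described above: a
body consisting only of the word HELP, addressed to NETSERV@EMBL.bitnet.
It is an illustration under assumptions, not part of the server: the
function name build_help_request is hypothetical, the sender address is a
placeholder, and handing the message to a mail system (and any gateway
needed to reach BITNET) is left to the local site.

```python
from email.message import EmailMessage

SERVER_ADDRESS = "NETSERV@EMBL.bitnet"  # address given in this newsletter


def build_help_request(sender):
    """Build a mail message whose body is just the word HELP.

    sender: the requester's own address (placeholder in the example).
    The message is only constructed here, not sent.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = SERVER_ADDRESS
    # The server expects the command on a line by itself; nothing else.
    msg.set_content("HELP\n")
    return msg


if __name__ == "__main__":
    # Print the message as it would be handed to the local mail system.
    print(build_help_request("user@example.org"))
```
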