Doug_Eernisse@UB.CC.UMICH.EDU (05/04/91)
I last posted a note about my HyperCard stack last November, and the new
features require a new description. Note especially that it is now possible
to convert a typical GenBank/EMBL documented sequence file into a "smart"
rescalable gene map, whose sequences of interest can be click-extracted:

Features of DNA Translator HyperCard 2.0 stack, Version .98.7   5/3/91
Copyright 1990,1991 D.J. Eernisse
Email: usergdef@ub.cc.umich.edu or usergdef@umichub
Mailing address: Museum of Zoology, Univ. of Michigan, Ann Arbor, MI 48109
*****************Send a postcard of your hometown************************

The DNA Translator is a HyperCard 2.0 stack providing three complementary
sets of utilities for viewing and manipulating molecular data on a Macintosh
computer.

First, DNA Translator displays codon and amino acid usage data as it differs
for a wide variety of organisms and organelles. Dynamically constructed
graphs display percentage usage for a particular codon relative to the other
63 codons or to only other codons with equivalent amino acid coding. A user
can select from a diverse list of some 70 taxon-organelle combinations.

Second, a "gene mapping" facility draws and displays 2 linearized gene maps
for comparison, automatically adjustable to desired scales and locations
along all or part of the mapped molecule. Maps and select corresponding
complete sequence data are provided for most available fully documented
mitochondrial and chloroplast DNA (mtDNA and cpDNA) sequences, which include
9 animal, 1 yeast, and 1 ciliate mtDNAs and 3 green plant cpDNAs. Additional
gene maps may be user-created by a direct file conversion of standard
GenBank- or EMBL-format documented sequences and their features tables.

Third, DNA Translator is a powerful workbench of specialized sequence
manipulation tools, catering especially to those with interests in
phylogenetic analysis. A user can extract any available sequences for a
particular mapped gene or region, or import one or more sequences in a
variety of formats, including multiple aligned sequence output of various
programs (EuGene, Prophet, CLUSTAL, Nexus, PHYLIP, etc.), and then further
manipulate, interleave, compare or translate the gene sequences to amino
acids. Multiple aligned sequences can be converted to Nexus, Hennig86 or
PHYLIP for subsequent phylogenetic analysis. Most known deviations from the
"universal" code, which are typical for mtDNAs, may optionally be used during
translation. The codon usage, translation and gene mapping data may be
exported or imported in spreadsheet format for incorporating additional
molecules of interest or in various standard formats (UWGCG, GenBank, EMBL,
Intelligenetics). A built-in editor supports sequence entry with optional
computer-speaking for error checking, and a variety of output conversion
options. The current version is freely available via anonymous ftp for
noncommercial use only.

A. Import Formats Accepted
   1. Multiple sequence alignments
      a. Simple text files of string sequences, with or without
         interleaves or match characters
      b. MBIR (EuGene, Prophet) "Doolittle" progressive alignments
      c. CLUSTAL nucleotide or peptide output
      d. Nexus (PAUP, MacClade) files
      e. PHYLIP "Result" output files
   2. String sequences
      a. String sequence text files of the form:
            Name AGCTACCT...
      b. Data input from built-in sequence entry editor
      c. Sequences extracted from the provided gene mapper
      d. String output generated by many commonly used programs
   3. GenBank/EMBL or PIR-CODATA documented sequence files
      a. Converted to string sequences for use in conversions below
      b. GenBank or EMBL documented features to spreadsheet matrix
      c. Matrix directly convertible to rescalable gene map
      d. Gene map allows extraction of any mapped or custom subsequence

B. Conversions provided
   1. Multiple sequence alignments (A1) converted/exported in the
      following formats:
      a. Nexus (PAUP, MacClade) with optional cost matrices added
      b. PHYLIP 3.2 or 3.3 formats
      c. Hennig86 (Nucleotide data only)
      d. Multiple sequence strings (A2a) with gaps preserved or deleted
   2. String sequences (A2a,B1d) converted/exported in the following
      formats:
      a. Unmodified to be reimported or converted as needed
      b. As straight single sequence strings for import by MBIR (EuGene,
         Prophet), Authorin, Gene Construction Kit, etc.
      c. Intelligenetics format for import by many programs
      d. GenBank, EMBL, or PIR-CODATA formats for viewing or import by
         many programs
      e. Simple interleaved format with optional match characters for
         manual alignment or reimport (A1a)
      f. Subsequence, DNA <-> RNA, upper <-> lower case, complementary
         strand sequence conversions
   3. DNA/RNA -> peptide translation
      > 1st, 1st & 2nd, or all 3 possible reading frames output
      > Formatted output displays codons below amino acid abbreviations
        with either 1- or 3-letter amino acid abbreviations
      > String output is useful for export or conversion (B2)
      > May use standard or any provided codon usage table
      > Additional custom codon usage tables may be added
      > Codons with ambiguous nucleotide symbols (IUPAC-IUB) are
        translated as appropriate when translation is unambiguous
      > Optional termination when stop codons are encountered
   4. Peptide -> DNA translation
      > Peptide strings may be backtranslated, using the reverse of
        whichever codon usage table the user selects
      > Backtranslation uses IUPAC-IUB conventions for ambiguous
        nucleotides

D. Nucleotide or peptide sequence entry
   > Special Editor field has autoformatting capabilities
   > Adjustable computer speaking supported during or after entry
   > Sequences may be edited, manipulated, or exported (B2)

E. Data provided with stack
   > Codon usage patterns for about 70 organism/organelle combinations
   > Most mitochondrial variation in coding supported (3)
   > Gene maps of 11 mtDNAs plus string sequences
   > Gene map matrices for 3 green plant chloroplast DNAs

F. General Features
   > Online help facility is available from any card in stack
   > References, taxon names, and sample data provided
   > Pulldown or popup menus or dialogs used for commands
   > Buttons or pulldown menus are used for stack navigation
   > Data fields may be locked or unlocked for text entry
   > Data fields shrink down or expand by clicking
   > Contents of fields exportable as text or printable
   > Default word processor for exported text files can be chosen
   > PAUP/MacClade specification for exported Nexus files
   > PAUP/MacClade open directly from the stack
   > Add gene mapping or utility cards as required
   > All data, scripts and resources used are easily accessible
   > Custom XFCNs by Nigel Perry enable efficient manipulations

G. Hardware and System Requirements
   > Requires any Macintosh that can run HyperCard 2.0
   > HyperCard 2.0 (Widely available for Macintosh owners)
   > Macintosh System 6.0.5 or greater (for HyperCard 2.0)
   > 2 or more MB RAM recommended

H. Availability of current version (.98.7)
   > (For now) Anonymous ftp to ub.cc.umich.edu (35.1.1.47)
       After connection type 'cd gdef' and 'get dnastack'
       Resulting file needs to be "debinhexed"
   > (Soon) Anonymous ftp to at least the following site:
       IuBio.Bio.Indiana.Edu (129.79.1.101)
       cd [.molbio.mac]
       get DNASTACK.HQX
       (old version there now does not convert GenBank/EMBL sequences
       to gene maps)
   > (Soon) EMBL file server by sending the command:
       GET MAC_SOFTWARE:DNATRANSLATOR.HQX
     to NETSERV@EMBL.BITNET
   Entire package is approximately 430K in binhexed form

******************Sequence Speaker HyperCard 2.0 stack**************
********************Copyright 1990 by D. J. Eernisse****************
Follow the above instructions, but ftp to 'um.cc.umich.edu', 'cd legd' and
'get SeqSpeak', then 'debinhex' the file after downloading.
   > Speaks DNA/RNA or Peptide or All Letters & Numbers
   > Speaks during or after entry with volume and speed controls
   > Imports or exports text files; font size is adjustable
   > Simple stack - just ask my 3-year-old
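
For readers who want a concrete picture of the translation options listed under B3 of the DNA Translator description (choice of reading frame, alternative codon tables, IUPAC-IUB ambiguity handling, optional termination at stops), here is a minimal sketch. It is written in Python rather than the stack's own HyperTalk, it is not taken from the stack, and every name in it (`translate`, `translate_codon`, `STANDARD`, `VERT_MITO`) is hypothetical. The only deviations from the "universal" code shown are the four vertebrate mitochondrial reassignments; the stack's own tables may be organized quite differently.

```python
from itertools import product

BASES = "TCAG"
# Standard ("universal") genetic code, codons enumerated in TCAG order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Vertebrate mitochondrial code: four codons differ from the standard code.
VERT_MITO = {**STANDARD, "TGA": "W", "ATA": "M", "AGA": "*", "AGG": "*"}

# IUPAC-IUB nucleotide symbols and the bases each one may stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def translate_codon(codon, table):
    """Translate one codon; return 'X' unless every expansion agrees."""
    options = [IUPAC.get(base, "") for base in codon.upper()]
    residues = {table["".join(c)] for c in product(*options)}
    return residues.pop() if len(residues) == 1 else "X"

def translate(seq, table=STANDARD, frame=0, stop_at_stop=False):
    """Translate seq in reading frame 0, 1 or 2 ('*' marks a stop codon)."""
    peptide = []
    for i in range(frame, len(seq) - 2, 3):
        residue = translate_codon(seq[i:i + 3], table)
        if residue == "*" and stop_at_stop:
            break
        peptide.append(residue)
    return "".join(peptide)

if __name__ == "__main__":
    dna = "ATGGCNTGA"
    for name, table in (("standard", STANDARD), ("vertebrate mtDNA", VERT_MITO)):
        print(name, [translate(dna, table, frame) for frame in range(3)])
```

In this sketch the ambiguous codon GCN still translates to alanine because all four expansions agree, while TGR is unresolved ("X") under the standard code but reads as tryptophan under the vertebrate mitochondrial table, which is the kind of mtDNA deviation the description mentions.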