[comp.theory] Topology of Languages

PAAAAAR@CALSTATE.BITNET (12/15/89)

Thank you!
8 or 9 people gave me clues to the source for the notes I took 20 years ago
on measuring the distance between two languages. After some
detective work in a library or three I have believe that it was
a paper by Chomsky and Schutzenberger (pp 118-161) in Braffort and
Hirschberg, 1963.

Braffort and Hirschberg(Ed): "Computer Programming and Formal Systems",
 North Holland 1963.

The clues also lead me to books with more recent applications of
topology to computing and so helped answer my other two questions:

(1) There is a connection between Dana Scott's way of defining semantics
    and the metric space of formal power series.

(2) There is a well known metric defined on the set of all languages,
    (and it is equivalent to the one I had worked out for myself).

The moral of this story is that graduate students can always
"use the source" [O Kenobi, personal communication]

A bibliography and notes will follow.

Dr. Richard J. Botting,
Department computer science,
California State University, San Bernardino.
paaaaar@calstate.bitnet
PAAAAAR%CALSTATE.BITNET@CUNYVM.CUNY.EDU
dick@silicon.sb.csu.REAL_SOON_NOW!

PAAAAAR@CALSTATE.BITNET (12/15/89)

Partial Chronological Bibliography
(
* Important to metric spaces and formal languages(IMHO),
+ A passing reference to spaces of languages,
? Cited as relevant elsewhere.
)

?1962 Schutzenburger, On a Theorem by Jungen, "Proceedings of the American
 Math Society", Vol 13, pp885-890

?1962 Schutzenburger, Certain elementary families of Automata, "Proceedings
of Symposium on Mathematical Theory of Automata", Polytechnic Institute
of Brooklin, pp139-153

*1963 Chomsky and Schutzenberger, pp 118-161 in
 Braffort and Hirschberg(Ed): "Computer Programming and Formal Systems",
 North Holland 1963.

*1967 E. Shamir : A representation theorem for algebraic and
 context-free power series in noncommuting variables.
 Inform. Control, vol. 11, 238-254.

+1968 Arbib(Ed) MA, "Algebraic Theory of Machines, Languages and Semigroups",
 (partial record of conference at Asilomar, CA, Aug 30-Sep 7th 1966 with notes
  from Lectures at UCB 1964)

?1970 Stanat D: "A Homomorphism Theorem for Weighted Context Free Grammars",
 Int. Journal of Computer and Systems Science, Vol 6, 1972, pp217-232.

 1974  S Eilenberg: Automata, Languages and Machines, Volume A, Academic
 Press, 1974.

?1974 L Padulo and M. A. Arbib "System Theory", Saunders (LC Card QA402 P42?)

*1978 Arto Salomaa and , Matti Soittola,1978: "Automata-theoretic aspects of
 formal power series" Springer-Verlag, 1978.  Series title:  Texts and
 monographs in computer science, [editors, F. L. Bauer, David Gries (LC Card
 QA267.5.S4 S29).

+1983 Bloom SL, "Algebraic Solution of a System of Recursion Equations in
Infinite Trees and Other Contraction Theories" Journal od Computer and
System Science, 1983, 00225-255

 1984 M Nivat & D Perrin: "Automata on Infinite Words", Springer Verlag 1984,
 Lect Notes in CS #192 (LC Card QA267)]

*1985 Mellor (Ed): "Mathematical Foundations of Program Semantics".
  Springer Verlag 1985 (Lect Notes in CS, 239)

*1985 Main, Milton,Mislove, and Schmidt(Eds): "Mathematical Foundations
 of Program Semantics" Springer Verlag 1985 (Lect Notes in CS, 298).

+1985 W Kuich and A Salomaa, "Semirings, Automata, Languages", Springer Verlag
 EATCS monographs, Vol 5, 1985 (LC Card QA267.K85)

*1986 Manes and M Arbib: "Algebraic Approaches to Program Semantics",
 Springer Verlag Monograph 1986.

+ 1986 RN Moll, MA Arbib, AS Kfoury: "An Introduction to Formal Language
 Theory",Springer Verlag monograph.


Notes

Salomaa and Soittola[1978] is a good survey of the whole  area of formal
power series as of 1978. They lead to the correct library shelves.
They define the metric space on the power series by
        if x=y then d(x,y)=0
        else d(x,y)=2^-diff(x,y),
           where diff(x,y)=min {length(w)   : not (Coef(x,w)=Coef(y,w)) },
                where Coef(x,w) = coefficient of word w in power series x.
These authors refer to Schutzenberger[1962] and to Chomsky and
Schutzenberger[1963].

Chomsky and Schutzenberger[1963] is the most likely source for my  notes.
However it's possible that metrics are mentioned in Schutzenberger's 1962
papers.

The Asilomar Conference(1966) seems to have included a early attempts to use
topology in computer science. Arbib's book[1968] publishes some of the papers
including one by E. Shamir [pp329-341] and J.M. Day's
"Expository Lectures on Topological Semigroups" [pp269-294]. Shamir's
paper refers to Schutzenberger's papers [1962, 1963] and avoids topology.
Day's chapter is a presentation of topology and semigroups with no
visible computer science.  The appendix lists other 40 or 50 other papers
from the conference that were  published in "Math System Theory",
"Computer and System Science",  and "Information and Control" during
1967 thru 1978 including Shamir's 1967 paper.

E. Shamir [1967] refers to Chomsky  and Schutzenberger[1963] and uses
a similar definition. Shamir uses the idea that the coefficients of the
power series generated by a grammar become constant as the iteration proceeds.

S. Eilenberg's Magnum Opus(1974) is algebraic and uses  "proof by blatant
assumption" - he starts(Ch VI) by assuming a total finite and infinite
summation function.  He credits Schutzenberger [1962] for the introduction of
power series. Eilenberg has a  tantalising phrase(p158) "Subsequent
publications of Schutzenberger and several of his associates..."
but doesn't mention who or where they were...

Schutzenberger's group was/is at the LITP in Paris University
( 2 place Jussieu Paris 7505, France). Names included: Dominique Perrin,
Michel Fliess, Jean Berstel, M Nivat etc, who have proceeded to interesting
work on infinite strings...for example M Nivat and D Perrin [1984]. In this equi
of all words of length n or less are  equal.  They define limit in
terms of the equivalences.

The connection with Dana Scott's use of the topological
function spaces has been investigated in the last 10
years or so - apparently starting with a "Continuous Lattices" conference
(1973?) which resulted in a book by that name published by Springer Verlag.

Melton and Schmidt [Mellor 1985] state that the Scott topology is the
Tychonoff topology on the function space (the weakest one with function
application being continuous).

A graduate text by Manes and Arbib[1986] uses categories, posets, and
metric spaces to prove that sundry iterative and recursive processes
define a unique object. They describe a metric on languages
without using power series [pp212, Chapter 9, section 1, paragraph 6].
They give Padulo & Arbib [1974]and Bloom[1983]
as sources. Bloom uses metrics and Banach's theorem to prove convergence
in a space of infinitely long strings of characters.

A similar formula is given by E-E Doberkat [1988, p289-302 of Main,
Milton, Mislove, and Schmidt(Eds)]. Several other papers in this
conference proceedings also  use topological ideas. For example:
Reed and Roscoe use  -
        d(x,y)=inf {2^(-t) : for some t , where for all t'<t, x(t)=y(t)}.
They state that this is a complete (ultra-)metric for a timed version of CSP.

By the way: Tom Head published a description of a metric  and  topology
 on character sequences induced by a given grammar [pp147-163 of
M Nivat,  D Perrin,1984]. However a space of strings is not the space of
languages.

--------------------
Dr. Richard J. Botting,
Department computer science,
California State University, San Bernardino.
paaaaar@calstate.bitnet
PAAAAAR%CALSTATE.BITNET@CUNYVM.CUNY.EDU
dick@silicon.sb.csu.REAL_SOON_NOW!
------------------------------------------------------------
May you have fun and work hard, and not know the difference
------------------------------------------------------------