[comp.misc] Panini and Programming Languages

ahudli@silver.ucs.indiana.edu (anand hudli) (04/12/90)
The Backus-Naur Form (BNF) for specifying the syntax of programming
languages was introduced at the ALGOL 60 meeting. Since then BNF has
been widely used to specify the syntax of many programming languages
such as Pascal, C, and Ada. In the March 1967 issue of the Communications
of the ACM, a letter to the Editor from one P. Z. Ingerman suggested
that Panini, the famous Sanskrit grammarian of ancient India, had specified
the syntax of the Sanskrit language using a very similar notation.
 Ingerman wrote:
 ``Since it is traditional in professional circles to give credit where
credit is due, and since there is clear evidence that Panini was the
earlier independent inventor of the notation, may I suggest the name
Panini-Backus Form as being a more desirable one?''
 
 Apparently, his suggestion was not taken seriously and BNF remains 
BNF today. Since I have had an education in Computer Science and I
am also interested in Sanskrit, I was curious to know how Panini's
method compared with BNF. After reading a few articles on Panini I was 
convinced that Panini had indeed invented a novel way of specifying 
Sanskrit syntax and that his method is essentially the same as BNF. 

 Panini's grammar consists of rules, or sutras, grouped together in eight
chapters and referred to as the ASTADHYAYI. Each chapter is made up of
four sections called padas, and within each pada the sutras are numbered
from one onward. The ASTADHYAYI is preceded by a list of the sounds used
in the grammar, called the Sivasutra. The list consists of groups
of sounds, each group ending in an indicatory sound. The vowels form
the first four groups; the semivowels and the consonants follow.
The Sivasutra is followed by the sutras, or rules. The sutras can be
classified into three categories: 1) sutras which
describe linguistic facts, 2) sutras which define
technical terms, e.g. savarna (homogeneous), which denotes sounds
pronounced in the same place and with the same tension of the mouth, and
3) sutras which are actually meta-rules, i.e. they describe how
the other sutras are to be applied in particular cases, in the
event of a contradiction, etc.

 It is easy to view the set of rules
(i.e. the entire ASTADHYAYI) as the knowledge base of an expert system
which deals with Sanskrit grammar. One of the things such an expert system
does is explain a linguistic transformation. I will illustrate this with
an example. The word dadhi followed by the word atra becomes dadhyatra.
In this case, the vowel i (pronounced e) gets replaced by y. Similarly,
madhu followed by iva gives rise to madhviva. Now Panini could have
formulated a rule which says ``i,u,r,lr are replaced by y,v,r,l
respectively if followed by a heterogeneous vowel''. But there is
actually a sutra which deals with the case when a homogeneous vowel
follows. For example, dadhi followed by indra becomes dadheendra. Here
the following vowel i in indra is homogeneous with the i in dadhi.
But in the case of dadhi followed by atra, a is not homogeneous
with i (i.e. it is heterogeneous). The reasoning used by Panini is that
since the sutra dealing with homogeneous vowels takes precedence over the
one dealing with non-homogeneous vowels, it is enough to state the latter
rule as ``i,u,r,lr are replaced by y,v,r,l respectively when followed
by a vowel''. The first rule will take care of the case of homogeneous
vowels. Only if the following vowel is not homogeneous will the second
rule be used.

 Again Panini makes use of a novel technique in
specifying groups of letters. In the Sivasutra, the sequence i,u,r,lr
is followed by an indicatory sound k, and the sequence ya,va,ra,la is
followed by an indicatory sound n. So Panini writes ik to denote the
sequence i,u,r,lr and yan to denote ya,va,ra,la. Similarly, ac denotes
the entire class of vowels, i.e. the sequence a,i,u,r,lr,e,o,ai,au, which
starts at a and ends just before the indicatory sound c.
The rule for the case when a heterogeneous vowel follows is hence
``ik followed by ac gets replaced by yan''.
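
 The abbreviations ik, yan and ac can be read as "start at this sound and
collect everything up to this indicatory sound". Here is a minimal Python
sketch of that reading. The function name expand and the SIVASUTRA table
are my own illustration, not Panini's or Staal's notation. Only the first
six groups are listed, and the plain-ASCII spelling drops the diacritics
that distinguish the different n markers, so the sketch simply stops at
the first matching marker after the starting sound.

```python
# First six groups of the Sivasutra as (sounds, indicatory sound) pairs.
# ASCII transliteration; the diacritics on the markers are dropped, so
# several markers look like "n" here (an artifact of this sketch).
SIVASUTRA = [
    (["a", "i", "u"], "n"),
    (["r", "lr"], "k"),
    (["e", "o"], "n"),
    (["ai", "au"], "c"),
    (["ha", "ya", "va", "ra"], "t"),
    (["la"], "n"),
]


def expand(start, marker):
    """Return the sounds from `start` up to the first group ending in `marker`.

    Indicatory sounds of intermediate groups are skipped, not collected.
    """
    sounds = []
    collecting = False
    for group, group_marker in SIVASUTRA:
        for sound in group:
            if sound == start:
                collecting = True
            if collecting:
                sounds.append(sound)
        if collecting and group_marker == marker:
            return sounds
    raise ValueError(f"no group runs from {start!r} to marker {marker!r}")


print("ik  =", expand("i", "k"))
print("yan =", expand("ya", "n"))
print("ac  =", expand("a", "c"))
```

 The point of the design is economy: two sounds are enough to name any
contiguous run of the list, so a rule never has to spell out its classes.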

This sutra is actually stated as :

	iko yan aci 

 In Panini's grammar such replacement rules are extremely common. So
Panini has developed a very precise way of expressing them using the
cases of Sanskrit. It is possible to distinguish among cases by their
terminations. 
The general format of these rules is :

 A+ablative B+genitive C+nominative D+locative 

 The meaning of this rule is :

 if B is preceded by A and followed by D, replace B by C.

    From this it is a short step indeed to the BNF notation extended to
deal with context-sensitive rules!

         A B D -->  A C D 

  It is now easy to see how the rule iko yan aci is
derived:

 ik + genitive termination, yan + nominative termination, ac + locative
termination.

 Here ik plays the role of B, yan of C, and ac of D; no preceding
context A is specified.
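
 To make the replacement concrete, here is a small Python sketch that
applies the two vowel rules discussed earlier at the boundary between two
words. It is only an illustration under several assumptions: the names
join, IK, YAN, AC and LONG are mine; the words are split into sounds by
hand; "homogeneous" is simplified to "the same vowel"; and the long-vowel
table covers only the dadheendra example from this post. The rule for
homogeneous vowels is tried first, and iko yan aci is used only if it does
not apply.

```python
IK = ["i", "u", "r", "lr"]    # B: the sounds to be replaced
YAN = ["y", "v", "r", "l"]    # C: their replacements, in the same order
AC = ["a", "i", "u", "r", "lr", "e", "o", "ai", "au"]    # D: any vowel

# Assumed table for the homogeneous case; only the post's example is covered.
LONG = {"i": "ee"}


def join(first, second):
    """Join two words (lists of sounds), applying the first rule that matches."""
    last, following = first[-1], second[0]
    if last == following and last in LONG:
        # Homogeneous vowel follows: the two vowels merge into one long vowel.
        sounds = first[:-1] + [LONG[last]] + second[1:]
    elif last in IK and following in AC:
        # iko yan aci: B D --> C D, with B in ik, C in yan, D in ac.
        sounds = first[:-1] + [YAN[IK.index(last)]] + second
    else:
        # No rule applies: plain concatenation.
        sounds = first + second
    return "".join(sounds)


dadhi = ["d", "a", "dh", "i"]
madhu = ["m", "a", "dh", "u"]

print(join(dadhi, ["a", "t", "r", "a"]))        # dadhyatra
print(join(madhu, ["i", "v", "a"]))             # madhviva
print(join(dadhi, ["i", "n", "d", "r", "a"]))   # dadheendra
```

 Because the more specific rule is checked first, the second rule can be
stated with the broad context "any vowel", just as in the sutra.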


 The important point is that Panini always talks about REPLACING a 
grammar symbol by another grammar symbol. The replacement rules are in
general context sensitive.  

 Most of the above discussion of Panini's sutras is drawn from the book by
 J. F. Staal titled Universals: Studies in Indian Logic and Linguistics,
 Univ. of Chicago Press, 1988.

 It is indeed difficult to understand why the name Backus-Naur Form was
not changed to Panini-Backus Form or even Panini-Backus-Naur Form.
Panini's ingenuity has rarely been matched in the field of linguistics.
The computer science community has lost an opportunity to recognize
the work of the Indian genius.

Anand V. Hudli
ahudli@silver.ucs.indiana.edu