ahudli@silver.ucs.indiana.edu (anand hudli) (04/12/90)
It was at the ALGOL-60 Meeting that the Backus-Naur Form (BNF) for specifying the syntax of programming languages was introduced. Since then, BNF has been widely used to specify the syntax of many programming languages such as Pascal, C, and Ada. In the March 1967 issue of the Communications of the ACM, a letter to the Editor from one P. Z. Ingerman suggested that Panini, the famous Sanskrit grammarian of ancient India, had specified the syntax of the Sanskrit language using a very similar notation. Ingerman wrote: "Since it is traditional in professional circles to give credit where credit is due, and since there is clear evidence that Panini was the earlier independent inventor of the notation, may I suggest the name Panini-Backus Form as being a more desirable one?" Apparently his suggestion was not taken seriously, and BNF remains BNF today.

Since I have had an education in Computer Science and am also interested in Sanskrit, I was curious to know how Panini's method compared with BNF. After reading a few articles on Panini, I was convinced that Panini had indeed invented a novel way of specifying Sanskrit syntax and that his method is essentially the same as BNF.

Panini's grammar consists of rules, or sutras, grouped together in eight chapters and referred to as the ASTADHYAYI. Each chapter is made up of four sections called padas, and within each pada the sutras are numbered from one onward. The ASTADHYAYI is preceded by a list of the sounds used in the grammar, called the Sivasutra. The list consists of groups of sounds, each group ending in an indicatory sound. The vowels form the first four groups; the semivowels and the consonants follow. The Sivasutra is followed by the sutras, or rules.

It is possible to classify the sutras into three categories: 1) the sutras which describe linguistic facts, 2) the sutras which serve the purpose of defining technical terms, e.g.
savarna (sounds which are pronounced in the same place and with the same tension of the mouth), and 3) the sutras which are actually meta-rules, i.e. they describe how the other sutras are to be applied in particular cases, in the event of a contradiction, etc.

It is easy to view the set of rules (i.e. the entire ASTADHYAYI) as the knowledge base of an expert system which deals with Sanskrit grammar. One of the things that such an expert system does is explain a linguistic transformation. I will illustrate this with an example. The word dadhi followed by the word atra becomes dadhyatra. In this case, the vowel i (pronounced like the English letter e) gets replaced by y. Similarly, madhu followed by iva gives rise to madhviva.

Now Panini could have formulated a rule which says "i, u, r, lr are replaced by y, v, r, l respectively if followed by a heterogeneous vowel". But there is actually a separate sutra which deals with the case when a homogeneous vowel follows. For example, dadhi followed by indra becomes dadheendra. Here the following vowel, the i in indra, is homogeneous with the i in dadhi. But in the case of dadhi followed by atra, a is not homogeneous with i (that is, it is heterogeneous). The reasoning used by Panini is that, since the sutra dealing with homogeneous vowels occurs before the one dealing with non-homogeneous vowels, it is enough to state the latter rule as "i, u, r, lr are replaced by y, v, r, l respectively when followed by a vowel". The first rule will take care of the case of homogeneous vowels. Only if the following vowel is not homogeneous will the second rule be used.

Again, Panini makes use of a novel technique in specifying groups of letters. In the Sivasutra, the sequence i, u, r, lr is followed by an indicatory sound k, and the sequence ya, va, ra, la is followed by an indicatory sound n. So Panini writes ik to denote the sequence i, u, r, lr and yan to denote ya, va, ra, la. Similarly, ac denotes the entire class of vowels, i.e. the sequence a, i, u, r, lr, e, o, ai, au, which starts from a and ends at the indicatory sound c.
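
The abbreviations ik, yan and ac, and the ordering of the two vowel rules, can be sketched in Python. This is only an illustration under stated assumptions, not Panini's own formulation. The list holds just the first six groups of the Sivasutra in plain ASCII, following the traditional list rather than anything spelled out above, and the later consonant groups are omitted. The consonants are written with their inherent a (ya, va, ra, la) so that the consonant ra stays distinct from the vowel r. The names SIVASUTRA, expand and join are hypothetical, as are the long-vowel spellings aa and oo (only ee appears in the examples above).

```python
# First six groups of the Sivasutra as (sounds, indicatory sound).
# Plain ASCII transliteration; later consonant groups are left out (assumption).
SIVASUTRA = [
    (["a", "i", "u"], "n"),
    (["r", "lr"], "k"),            # "r" and "lr" here are the vowels
    (["e", "o"], "n"),
    (["ai", "au"], "c"),
    (["ha", "ya", "va", "ra"], "t"),
    (["la"], "n"),
]

def expand(start, marker):
    """Sounds from `start` up to the first later group that ends in `marker`."""
    sounds = []
    collecting = False
    for group, end in SIVASUTRA:
        for sound in group:
            if sound == start:
                collecting = True
            if collecting:
                sounds.append(sound)
        if collecting and end == marker:
            return sounds
    raise ValueError("no such abbreviation: %s + %s" % (start, marker))

IK = expand("i", "k")                        # i, u, r, lr
YAN = [s[:-1] for s in expand("ya", "n")]    # drop inherent a: y, v, r, l
AC = expand("a", "c")                        # the whole class of vowels
IK_TO_YAN = dict(zip(IK, YAN))

# Hypothetical spellings of the long vowels, modelled on "dadheendra".
LONG = {"a": "aa", "i": "ee", "u": "oo"}

def join(first, second):
    """Join two words, trying the homogeneous-vowel rule before iko yan aci.

    Simplification: only single-letter final and initial sounds are handled,
    and homogeneity is approximated as "same vowel letter".
    """
    last, nxt = first[-1], second[0]
    # Rule tried first: a homogeneous vowel follows, so the pair merges.
    if last in LONG and nxt == last:
        return first[:-1] + LONG[last] + second[1:]
    # Rule tried second: ik followed by any ac is replaced by yan.
    if last in IK_TO_YAN and nxt in AC:
        return first[:-1] + IK_TO_YAN[last] + second
    return first + second

print(join("dadhi", "atra"))    # dadhyatra
print(join("madhu", "iva"))     # madhviva
print(join("dadhi", "indra"))   # dadheendra
```

Because the homogeneous case is checked first, the second rule can say simply "followed by a vowel" without excluding homogeneous vowels, which is the economy described above.
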
The rule for the case when a heterogeneous vowel follows is hence "ik followed by ac gets replaced by yan". This sutra is actually stated as:

    iko yan aci

In Panini's grammar such replacement rules are extremely common, so Panini developed a very precise way of expressing them using the cases of Sanskrit. It is possible to distinguish among the cases by their terminations. The general format of these rules is:

    A+ablative  B+genitive  C+nominative  D+locative

The meaning of this rule is: if B is preceded by A and followed by D, replace B by C. From this it is a short step indeed to the BNF notation extended to deal with context-sensitive rules:

    A B D --> A C D

It is now easy to see how the rule iko yan aci is derived:

    ik  + genitive termination
    yan + nominative termination
    ac  + locative termination

The important point is that Panini always talks about REPLACING a grammar symbol by another grammar symbol. The replacement rules are in general context sensitive.

Most of the above discussion of Panini's sutras is in the book by J. F. Staal titled Universals, Univ. of Chicago Press, 1988.

It is indeed difficult to understand why the name Backus-Naur Form was not changed to Panini-Backus Form or even Panini-Backus-Naur Form. Panini's ingenuity has rarely been matched in the field of linguistics. The computer science community has lost an opportunity to recognize the work of the Indian genius.

Anand V. Hudli
ahudli@silver.ucs.indiana.edu
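
As a postscript, the rule format described above (A+ablative B+genitive C+nominative D+locative, read as A B D --> A C D) can be sketched as a small data structure. This is a rough illustration only: the names Rule and apply_rule are hypothetical, symbols are matched one at a time rather than as classes such as ik or ac, and a missing context is modelled as None.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class Rule:
    left: Optional[str]   # A, ablative: what must precede (None = anything)
    target: str           # B, genitive: the symbol that is replaced
    replacement: str      # C, nominative: what it is replaced by
    right: Optional[str]  # D, locative: what must follow (None = anything)

def apply_rule(rule: Rule, symbols: List[str]) -> List[str]:
    """Rewrite A B D --> A C D wherever the context matches.

    Contexts are checked against the original sequence, so all matches
    are rewritten simultaneously.
    """
    out = list(symbols)
    for i, sym in enumerate(symbols):
        if sym != rule.target:
            continue
        left_ok = rule.left is None or (i > 0 and symbols[i - 1] == rule.left)
        right_ok = rule.right is None or (
            i + 1 < len(symbols) and symbols[i + 1] == rule.right
        )
        if left_ok and right_ok:
            out[i] = rule.replacement
    return out

# The abstract rule from the text.
print(apply_rule(Rule("A", "B", "C", "D"), ["A", "B", "D"]))   # ['A', 'C', 'D']

# One instance of iko yan aci: no left context, i becomes y before a.
print("".join(apply_rule(Rule(None, "i", "y", "a"), list("dadhiatra"))))  # dadhyatra
```

The sutra iko yan aci has no ablative term, which corresponds here to left=None: only the right-hand context constrains the replacement.
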