sources-request@panda.UUCP (02/07/86)
Mod.sources: Volume 3, Issue 108 Submitted by: ihnp4!dicomed!ndsuvax!nckary (Daniel D. Kary) Fellow NetLandians - This is the introduction to the nine part posting of the source code and documentation for the TRC compiler. TRC is an acronym for Translate Rules to C. TRC is useful for building expert systems. The source code is C language with a YACC parser. The posting consists of the following: Part 0 - This file. Part 1 - Overview document Part 2 - Tutorial document Part 3 - Reference Manual (Part 1) Part 4 - Reference Manual (Part 2) Part 5 - Reference Manual (Part 3) Part 6 - Source Code SHAR containing: Makefile main.c main.h optimize.c Part 7 - Source Code NOT A SHAR: output.c Part 8 - Source Code SHAR containing: p_out.c parser scanner.c Send bug reports to the author. Minor corrections will be posted to net.sources.bugs (unless a better group is suggested). An update will be sent to the mod.sources moderator when it seems reasonable to do so. Daniel D. Kary North Dakota State University Department of Computer Science 300 Minard Hall, P.O. Box 5075 Fargo, ND 58105 (701) 237-8171 ihnp4!dicomed!ndsuvax!nckary
sources-request@panda.UUCP (02/08/86)
Mod.sources: Volume 3, Issue 109 Submitted by: ihnp4!dicomed!ndsuvax!nckary (Daniel D. Kary) This is NOT a shell archive. Simply delete everything up to and including the cut mark and save the result as overview.doc. Dan Kary ihnp4!dicomed!ndsuvax!nckary -------------- cut here --------------- CHAPTER 1 COMPILER OVERVIEW TRC is a useful tool for building expert systems, but it is not intended as the only tool that the knowledge engineer will use. TRC augments the editing, compiling and debugging tools already present on the system being used for development. Figure 1 illustrates the steps involved in developing an expert system with TRC. The knowledge engineer first creates the TRC source code file using whatever text file editing tools are available. This text file is then passed to the TRC com- piler which produces a set of C language files. In addi- tion to the C language files produced by TRC, the knowledge engineer must provide auxiliary C language file(s). At a minimum the auxiliary file(s) must contain a main procedure which will initialize STM and invoke the inference engine produced by TRC. The size of the auxili- ary code is limited only by the C compiler itself and the address space of target machine. The inference engine might be a small part of a larger system. The input and output facilities provided by TRC are limited. Any interaction with the user or the file system on the target machine will usually be coded in the 1 2 auxiliary code files. C language code can be embedded in either the situation or the action part of a rule. It may be convenient to place any procedures that are called from within a rule in an auxiliary code file. _________________ | | | TRC | | Specification | |_______________| | _____V______ | | | TRC | | Compiler | |__________| | _____________ _____V______ | | | | | Auxiliary | | C | | C | | Files | | Files | |__________| |___________| |__________________| | _____V_____ | | | CC | |_________| | ______V_______ | | | Executable | | Code | |____________| Figure 1: Development Sequence The C code produced by TRC and the auxiliary C code provided by the knowledge engineer are then passed to the C compiler. CC is the traditional name of the command that invokes the sequence of processes required to 3 translate the C language file(s) to an executable code file. This sequence of processes will often include mul- tiple compiler passes, an assembler process and a linker process. In the context of figure 1, CC refers to whatever sequence is required to translate the C language files to executable code. Building an expert system is much like building any other software system. The system is specified, designed, coded, tested and finally released. Each of the steps, particularly coding and testing, will frequently go through several iterations. TRC provides debugging tools which will profile and trace the execution of the infer- ence engine. The TRC debugging tools along with whatever debugging tools are provided with the C language system can be used in the coding and testing phase to simplify development. 9 9 CHAPTER 2 DESIGN OBJECTIVES An expert system for configuring a packet switch pro- vided the initial stimulus for this project. The expert system was implemented using PROS, a LISP based expert system building tool. PROS provided a convenient way to represent the knowledge needed to configure a packet switch. The representation was easy to read and expressed _w_h_a_t the machine was to do more than _h_o_w it was to be done. There was an aesthetic quality to the representa- tion that a seasoned programmer can appreciate. Execution turned out to be a major disappointment. A relatively simple problem required over two hours of CPU time on a VAX 11/780. A human expert could easily solve the same problem in fifteen minutes. Artificial Intelligence programs are not always expected to solve problems faster than humans. For some problems, being able to solve the problem on a computer at all is a major accomplishment. Being able to configure a packet switch with a computer program did not seem like a major accomplishment and it seemed that a program should be able to solve the problem much faster than a human expert. It also seemed that a program could be written in 4 5 a procedural language that would do the same thing in the same way as the expert system, only much faster. The result of the initial attempt to produce an expert system using a procedural language was a compiler called 't' (translator). The 't' compiler was suitable for the packet switch problem and similar simple problems. The packet switch problem which required two hours of cpu time in the LISP based implementation, executed in less than one second when compiled with 't'. The execution time of the code produced by 't' was more reasonable for the complexity of the problem. This seemed to indicate that it might be possible to speed up the execution of more complex problems. The first step in designing TRC was to study the operation of PROS and PASVII. The most objectionable aspect of both these tools is the amount of time required for execution. The expectation was that understanding the operation of these tools would suggest ways in which fas- ter executing expert systems could be produced. PROS and PASVII operate in similar manners and are not unlike other LISP based implementation tools. In PROS and PASVII, both STM and LTM are represented by LISP lists. The STM list contains all the data that the rules will operate on and the LTM list contains all 6 the rules. Each system has a general purpose inference engine that searches through the LTM list for a rule whose situation part is satisfied. To test if the situation part is satisfied, the STM list is searched for whatever pattern was specified in the situation part. Both systems use the trivial conflict resolution strategy of testing the rules sequentially and selecting the first rule whose situation part is satisfied. A LISP list can be used to represent any structure that can be imagined. A single list is certainly suffi- cient to represent STM. Searching the list for a pattern match involves decomposing the list. This operation can be time consuming when the list is large and/or deeply nested. The list must be decomposed each time a rule is tested. In both PROS and PASVII the list is also decom- posed in the action part, if something has to be removed from STM. Reducing the time required to searching STM is an obvious way to speed up execution. Since the LTM is also represented by a single list, it too is continuously decomposed in the execution of the program. Consider an expert system composed of a hundred rules. If each rule is equally likely to be selected by the resolution strategy, then on the average fifty rules will have to be tested before the rule that is to be applied is found. This means that fifty rules have to be 7 extracted from the LTM list and the STM list has to be decomposed fifty times before one rule can be applied. It is not uncommon for this type of system to spend 90% of it's execution time testing rules and only 10% of it's time applying actions[12]. Eliminating the need to decom- pose the LTM and reducing the time spent testing rules are other obvious ways to improve execution speed. Both PROS and PASVII are acceptable languages for expressing rules. The TRC language is quite similar to both PROS and PASVII. Differences between TRC and the LISP based languages are due primarily to differences between C and LISP. TRC also contains some language features not found in either PROS or PASVII. The TRC language is described in detail in Appendix C. Finally, why translate to the C language? The machine of interest (VAX 11/780) runs an operating system (4.2BSD) that is written largely in C. The 4.2BSD C com- piler is intended as a production compiler. Other com- pilers supplied with the system (e.g. PASCAL) are pri- marily instructional tools[18]. The C language is widely used and compilers are available for small computers. The availability of C compilers for small computers makes it feasible to develop expert systems with TRC that are tar- geted to small computers. 9 9 CHAPTER 3 TRANSLATION STRATEGY The first design objective is to reduce the amount of time spent searching STM. The way STM is represented will affect the way a search is conducted. Since speed is the objective, a representation that can be searched quickly is desirable. The representation must also be sufficient for the problem. LISP based implementations use a LISP list to represent STM. The LISP list representation for STM has been sufficient for many complex problems[8, 13, 14, 15, 16]. There is little possibility that any program will exhaust human imagination by using a LISP list in every way it can possibly be used. This implies that the full capability of a LISP list may not be required. The ques- tion, then, is how much or how little is enough. A LISP list can contain atoms or other LISP lists. Atoms can be strings or numbers, and numbers can be integers or reals. A variable in a procedural language can be a string or an integer or a real, so atoms are simple to represent in procedural languages. The question now is how to represent or replace a list? _D_a_t_a _R_e_p_r_e_s_e_n_t_a_t_i_o_n 8 9 It was decided that STM would be represented by linked lists of structures that could contain whatever variables (atoms) that were needed. This is the subset of a LISP list that is easy to build and manipulate in a pro- cedural language. The structures that are used to build the linked list will be referred to as objects. The vari- ables that the object structures contain will be referred to as elements. Element values distinguish the otherwise identical objects from one another. The number and type of elements that are required to distinguish an object will vary between applications. An expert system will often refer to objects that bear no similarity to one another. One object may be described by two strings, while another object may require a real number and an integer to distinguish it from other objects. It would be wasteful to require every object to contain the same number and type of elements. Therefore the description of STM is extended to a set of lists, rather than a single list. Each list is a collection of related objects. One side effect of representing STM as a set of lists is that STM is in effect partially sorted. In the TRC language every reference to an object or element includes an implicit reference to the list where the object may exist. Stated another way, it is not possible to refer to 10 an object or an element without implicitly mentioning the list where the object or element might be found. This means that when STM is to be searched for an object there is only one list where it can possibly exist, therefore only one of the set of lists has to be searched. _R_u_l_e _R_e_p_r_e_s_e_n_t_a_t_i_o_n A situation-action rule is essentially an if-then statement; "if this situation exists, then take this action". LTM is translated to a single procedure which contains an if statement for each rule. The if statements (rules) are tested successively until one evaluates to true. The action part of that statement is then executed (the rule fires). Control then branches back to the first if statement (rule) and testing continues at that point. This sequence of events continues until no if statement (rule) evaluates to true (fires), at which point the pro- cedure terminates. Up to this point the overall action of an expert sys- tem has been described as "searching LTM for a rule whose situation part is true". On each iteration only one rule is applied. If next rule to be applied could be found without searching LTM, the search time would be reduced to zero. Reducing search time is a primary goal of the TRC rule representation. From this point on the overall 11 action of an expert system will be to "reject at the ear- liest possible moment rules that cannot be applied until a rule that can be applied is found". There are some conse- quences of this new attitude worth noting. One side effect of the representation for STM is that it is possible to have some knowledge of what is in STM without searching STM. Suppose there is an expert system where two types of objects can exist in STM, there would then be two lists; call them list A and list B. Since each list can contain only objects and not other lists, it is possible to keep a count of how many objects are in each list. The count of the number of objects in each list can be used to reject a rule without searching STM. Suppose the situation part of a rule specified two objects from list A and one from list B. If either list is empty or if list A contains only one object, the rule can be rejected without searching either list. TRC places a precondition on every rule that causes the rule to be rejected if STM does not contain enough of each type of object to make the situation part possible. _S_e_a_r_c_h_i_n_g The default strategy for searching STM is called the LINEAR strategy. A search is conducted for each object listed in the situation part, in the order the objects are 12 listed. If any single search fails, the rule is aban- doned. This is consistent with the "reject at the earli- est possible moment" attitude adopted for LTM. Unfor- tunately this simple strategy may not be a sufficiently powerful pattern matching tool for some expert system implementation requirements. An alternate search strategy available in TRC, called the RECURSIVE strategy, is designed to perform a combina- toric search on STM. When the RECURSIVE strategy is used, the failure of an individual search causes the rule to fail only when there is no previous search that can be redone. Speed of execution can be dramatically reduced by using the RECURSIVE strategy. Time loss is largely depen- dent on the size of the lists being searched. Allowing the search strategy to be selected by the knowledge engineer on a per-rule basis is a compromise designed to give the best of both worlds; fast searches when combinatoric possibilities do not exist and powerful pattern matching when they do. The search strategy used by PROS and PASVII is similar to the RECURSIVE strategy. Both search strategies are detailed in Appendix B. Of particular interest will be section 6.3.3 of Appendix B where a small example system illustrates the differences between the two strategies. 13 _A_c_t_i_o_n_s The action part consists primarily of additions to stm and deletions from stm. On the surface it seems that adding and deleting objects should be trivial. There are, however, performance issues to be considered. Deletions usually refer to objects that were found in the situation part. This is a matter of practical con- cern, since only those objects found in the situation part are guaranteed to exist in STM. There are two general strategies for deleting the objects, either remember where in STM the objects were found or search STM in the action part for the objects to delete. Both PROS and PASVII search STM in both the situation and the action part. The objects that are found in the situation part are moved to the front of STM to minimize the time spent searching STM in the action part. There are some objectionable aspects to the strategy of searching STM in both the situation and action parts. Every rule that fires can reorder STM. It can be argued that reordering STM is a good idea, but it may not always be what the knowledge engineer wants. If STM is reordered in the situation part it is possible that the search in the action part will find different objects than the search in the situation part found. The possibility of 14 finding something different in the action part than was found in the situation part is at least counter intuitive and possibly incorrect. Finally, searching STM twice for the exact same object(s) is objectionable in and of itself because it consumes time redoing work. PASVII has an interesting way of adding objects to STM. The list that represents STM is initialized to con- tain some number of objects which may be atoms or lists. An atom or a list can replace an atom or a list that exists in STM. If an atom or a list is inserted at the head of the list, the last object (atom or list) in the list falls off. This action is called a metaphor for the action of short term memory in humans. As knowledge is gathered old unused knowledge fades to the back of memory and eventually is forgotten. Quite frankly, this metaphor sounds more like a clever explanation for a program 'feature' than a useful tool. The actions of adding and deleting objects in TRC are not quite as clever as the previous example. Objects to be added to STM are simply inserted at the head of the list, nothing ever falls off the end of the list. STM is searched only in the situation part. Objects that are to be deleted in the action part must have been found in the situation part. This rule is enforced by the compiler. When an object is found in the situation part, it is 15 identified with an indexed pointer. The object can now be referred to or deleted without searching STM. 9 9 CHAPTER 4 OPTIMIZATION Most of the improvements in execution speed provided by TRC are side effects of the translation strategy. STM is partially sorted by virtue of being represented as a set of lists rather than as a single list. For every object that can exist, there is only one list that can contain that object. The TRC lists themselves are simpler than LISP lists. A single linear pass through a TRC list will reveal every object. A LISP list can be more complex to search because it can be arbitrarily nested. Precondi- tions placed on every rule eliminate testing rules when it is known that the rule's situation part can not possibly be met. Selectable search strategies allow quick searches of STM when combinatoric possibilities do not exist. The optimizer does not produce code that is optimum in any sense. What it does is to perform a single, useful code modification that can have a positive impact on exe- cution time. The goal of the optimizer is to reduce the amount of time spent searching. Each time a rule fires a great deal of implicit knowledge about the content of STM is obtained. It is known that no rule previous to the 16 17 current rule is true and no rule previous to the current rule can be true after the execution of the current rule unless the current rule modifies STM in such a way as to make some previous rule true. These simple facts are the entire basis of the optimizer. Three tests must be per- formed to determine if it is possible for a rule to fire. These tests will be called the NOT test, the ADD test and the MARK test. The tests are named after the TRC statements that figure prominently in the test. The details of each statement are presented in Appendix B. For the purpose of this discussion it should suffice to know that the NOT statement is an explicit test for an empty list, the ADD statement is a request to add something to STM and the MARK statement is a request to remove something from STM. The first case to be considered is the case of a rule which contains a NOT statement in the situation part. When a rule that fires contains an ADD statement it will not be possible for any previous rule with a NOT statement referring to that list to be the next rule to fire. If a rule that fires contains a MARK statement and no ADD statement referring to that same list, it is possible that the list will become empty making it possible for the rule with the NOT statement that previously failed to become true. If it is determined that it is possible for a rule 18 to fire after the NOT test, that rule becomes the candi- date rule and no further testing is done. The ADD test finds recursive rules that can not fire on the next pass. Consider the case of a rule with no NOT statements that recursively searches STM for a situation. If this rule fails, it will continue to fail until some- thing is added to STM to make it true. If all rules searched STM recursively it would be known when a rule fires that of the rules that precede the current rule, only those rules that search for something added to STM by the current rule can possibly fire in the next pass. If the current rule adds an object to STM, control could continue with the first rule that searches for that object rather than the first rule in LTM. If no rule prior to the current rule searches for those things added to STM by the current rule or if the current rule adds nothing to STM then no prior rule can execute. Control could continue with the current rule rather than at the beginning of LTM. The MARK test applies to rules that perform a linear search on STM. The previous conclusion about items being added to STM is still true; a rule that adds something to STM can cause a linear search rule to become true. With linear search it is also possible that a rule will become 19 true if something is removed from STM. If a linear rule searches for several similar items which are present but not correctly ordered it is possible for this linear search to fail where a recursive search would not have failed. If there were excess items, removing one or more may cause a different linear assignment which could make a linear rule true. The TRC optimizer selects a continuation point for each rule based on what the rule adds to or deletes from STM rather than testing from the beginning of LTM after any rule fires. The continuation point is the first rule that could fire based on the NOT and ADD tests for all rules, and the MARK test for linear rules. The TRC optim- izer is somewhat naive in that it considers only items added or deleted with the ADD and MARK statements. 9 9 CHAPTER 5 FURTHER RESEARCH A hierarchical arrangement for expert systems has been suggested[19]. The divide and conquer strategy is a technique used by experts in almost every field. By decomposing a set of rules into several subsets arranged in a hierarchy, only the rules that apply to the current part of the problem need to be considered. Reducing the number of rules that have to be tested at any one point will generally reduce the average amount of time it takes to select a rule. Since hand optimization in TRC allows an arbitrary control structure to be imposed on the rule based system, it is not impossible to build a hierarchical expert system with TRC. However, it might not be convenient to build a hierarchical system with the current TRC compiler. The input language to TRC should be redesigned to include the convenient expression of hierarchical systems. Many programming languages support multiple module pro- grams. Each module in a multiple module program usually contains a group of related procedures. It might be rea- sonable to place each system of rules in a hierarchy of rule based systems in a separate module. 20 21 In languages that support multiple module programs, some means of binding the modules together is provided. In the C language the '#include' facility permits common definitions to be loaded by each module. In Ada[20] the package specification is used to make type, variable and procedure declarations visible to other modules. Either of these facilities could serve as a model for designing a hierarchical language. Experts are frequently asked to explain how they arrived at a conclusion. It is reasonable to expect an expert program to do the same thing. TRC provides lip service to this requirement of expert systems with the TRACE facility. The ordered list of rules that were applied can explain how or why a given answer was found, but inferring an explanation from a trace may not be sim- ple. A more convenient facility for generating explana- tions should be designed. With the current TRC grammar it is possible to search for the absence of object/element combinations by using two rules. TRC should be extended to include a way to specify a search for the absence of object/element combi- nations in a single rule. This could be accomplished by extending the NOT statement and will have an impact on the optimizer and search code generation. 9 9 22 Some expert systems allow certainty factors to be associated with rules. A confidence factor is the proba- bility that it is appropriate to apply this rule given that the situation part is true. Confidence factors can also be used to suggest the probability that the answer generated by the expert system is correct. A convenient facility for representing confidence factors should be included in TRC. TRC uses the trivial conflict resolution strategy of applying the first rule whose situation part is satisfied. Alternate conflict resolution strategies should be con- sidered. If confidence factors are implemented, one con- flict resolution strategy may be to test all rules, if more than one rule is satisfied then select one based on confidence factors. The C language is not the only language that could be generated by a compiler like TRC. In a separate pro- ject[21] TRC was modified to generate TURBO PASCAL. It has been suggested[22] that TRC could generate INGRES code. STM can be viewed as a database, the situation part of a rule can be viewed as a database query and the action part of a rule can be viewed as a database transaction. For problems that deal with a large amount of data, the file handling capabilities of a DBMS could be put to good use. Relational calculus is a powerful tool for searching 23 a data base that could be put to good use on some prob- lems. The current optimizer is very weak. By looking at the elements that are being searched for in STM in addi- tion to the objects, additional implicit knowledge of the state of STM is gained. It may be possible to skip over some rules based on this knowledge, thus reducing search time. Consider this sketch where object A has an integer element B: R1: (A.B == 7) => MARK A ; R2: A => MARK A ; R3: => ADD A(B => 6) ; When rule R3 is optimized by the current optimizer, it will decide that it is possible for R1 to fire after R3 has fired because R3 adds an object of type A and R3 searches for an object of type A. Clearly R1 can not fire after R3 fires because the object of type A added by R3 has element B equal to 6 while R1 searches for element B equal to 7. The current optimizer finds the first rule that can possibly fire. This does not mean the rule will fire. There can be any number of rules between the the last rule that fired and the first one that can possibly 24 fire next. Preconditions could be placed on these rules to eliminate testing intermediate rules where possi- ble[23]. Consider this example: R1: B C => MARK B ; R2: A B => MARK A ; R3: A C => MARK A ; R4: A => MARK A ; R5: => ADD C ; The optimizer will correctly deduce that it is possi- ble for R1 to fire after R5 fires. It is also possible that R1 will not fire. If R5 fires and R1 does not fire, it is not possible for R2, R3 or R4 to fire either. Since R5 fired it is known that no previous rule could fire. Since R4 could not fire, it is not possible for R2 or R3 to fire. When R5 fires, preconditions could be placed on R2, R3 and R4 that would prevent even testing those rules since it is known that they cannot fire. 9 9 CHAPTER 6 CONCLUSIONS A compiler has been described and built. This com- piler translates a rule based language to a procedural language and is a useful tool for building expert systems. The translation to a procedural language may be advanta- geous for reasons of speed, portability or convenience. Translation to a procedural language is particularly con- venient when integration of procedural code and an expert system is desirable. Some observations about building expert systems have been made. These observations are not necessarily unique to the compiler that is described, i.e. they may be applied to other expert system tools. If the data objects that the rules will refer to are defined, it is possible to represent STM as a set of lists rather than as a single list. For many search algorithms reducing the size of the list to be searched will reduce search time. Defining data objects also makes automatic generation of preconditions that can eliminate the need for searching a possibility. 9 9 25 26 Many expert system tools are conceptually interpre- tive. A single general purpose inference engine is used for whatever problem is being addressed. The notion of generating a custom inference engine for each set of input rules is novel. The optimizer is probably the most significant out- come, and it too is made possible by defining the objects to be manipulated. Optimization of interpretive expert system tools has centered on developing efficient search- ing and matching strategies. The notion of a separate optimizer that changes the operation of the inference engine without affecting the order in which rules are fired is novel and can be applied to other expert system building tools. 9 9 BIBLIOGRAPHY 1. Aho and Ullman, _P_r_i_n_c_i_p_l_e_s _o_f _C_o_m_p_i_l_e_r _D_e_s_i_g_n. Addison-Wesley, 1977. 2. Pyster, _C_o_m_p_i_l_e_r _D_e_s_i_g_n _a_n_d _C_o_n_s_t_r_u_c_t_i_o_n, Prindle, Weber and Schmidt, 1980. 3. Johnson, _Y_a_c_c: _Y_e_t _A_n_o_t_h_e_r _C_o_m_p_i_l_e_r _C_o_m_p_i_l_e_r. Computer Science Technical Report No. 32, Bell Laboratories, Murray Hill, NJ 1975. 4. Juell, _A_n _I_n_t_r_o_d_u_c_t_i_o_n _t_o _t_h_e _P_R_O_S _P_r_o_d_u_c_t_i_o_n _S_y_s_t_e_m. Computer Science Department, North Dakota State Univer- sity, 1983. 5. Mittal, _P_A_S-_I_I _U_s_e_r _M_a_n_u_a_l. Department of Computer and Information Science, Ohio State University, 1977. 6. Forgy, _O_P_S_5 _U_s_e_r'_s _M_a_n_u_a_l. Technical Report CMU-CS- 81-135, Carnegie-Mellon University, Pittsburgh, 1981. 7. Kernighan and Ritchie, _T_h_e _C _P_r_o_g_r_a_m_m_i_n_g _L_a_n_g_u_a_g_e. Prentice-Hall, NJ, 1978. 8. Hayes-Roth, Waterman and Lenat, _B_u_i_l_d_i_n_g _E_x_p_e_r_t _S_y_s_- _t_e_m_s. Addison-Wesley, 1983. 9. Winston, _A_r_t_i_f_i_c_i_a_l _I_n_t_e_l_l_i_g_e_n_c_e. Addison-Wesley, 27 28 1984. 10. Ritchie and Thompson, _T_h_e _U_N_I_X _T_i_m_e-_S_h_a_r_i_n_g _S_y_s_t_e_m. The Bell System Technical Journal, Vol. 57, No. 6, Part 2, 1978. 11. Winston and Horn, _L_i_s_p. Addison-Wesley, 1984. 12. Gupta, _P_a_r_a_l_l_e_l_i_s_m _i_n _P_r_o_d_u_c_t_i_o_n _S_y_s_t_e_m_s: _T_h_e _S_o_u_r_c_e_s _a_n_d _t_h_e _E_x_p_e_c_t_e_d _S_p_e_e_d-_u_p. Department of Computer Sci- ence, Carnegie-Mellon University, 1984. 13. Lindsay, Buchanan, Feigenbaum and Lederberg, _A_p_p_l_i_c_a_- _t_i_o_n_s _o_f _A_I _f_o_r _O_r_g_a_n_i_c _C_h_e_m_i_s_t_r_y: _T_h_e _D_E_N_D_R_A_L _P_r_o_j_e_c_t. McGraw-Hill, 1981. 14. Shortliffe, _C_o_m_p_u_t_e_r-_B_a_s_e_d _M_e_d_i_c_a_l _C_o_n_s_u_l_t_a_t_i_o_n_s: _M_Y_C_I_N. American Elsevier, New York, 1976. 15. Davis, Buchanan and Shortliffe, _P_r_o_d_u_c_t_i_o_n _R_u_l_e_s _a_s _a _R_e_p_r_e_s_e_n_t_a_t_i_o_n _f_o_r _a _K_n_o_w_l_e_d_g_e-_B_a_s_e_d _C_o_n_s_u_l_t_a_t_i_o_n _P_r_o_g_r_a_m. Artificial Intelligence, Vol. 8, No. 1, 1977. 16. Erman, et. al, _T_h_e _H_e_a_r_s_a_y-_I_I _S_p_e_e_c_h _U_n_d_e_r_s_t_a_n_d_i_n_g _S_y_s_t_e_m: _I_n_t_e_g_r_a_t_i_n_g _K_n_o_w_l_e_d_g_e _t_o _R_e_s_o_l_v_e _U_n_c_e_r_t_a_i_n_t_y. Computing Surveys, June 1980. 17. Davis, _E_x_p_e_r_t _S_y_s_t_e_m_s: _W_h_e_r_e _A_r_e _W_e? _A_n_d _W_h_e_r_e _D_o _W_e _G_o _F_r_o_m _H_e_r_e?. Massachusetts Institute of Technology, A.I. Memo 665, 1982. 29 18. Joy, et. al, _B_e_r_k_e_l_e_y _P_a_s_c_a_l _U_s_e_r'_s _M_a_n_u_a_l. Computer Science Division, University of California, Berkeley, 1983. 19. Mizoguchi and Kakusho, _H_i_e_r_a_r_c_h_i_c_a_l _P_r_o_d_u_c_t_i_o_n _S_y_s_t_e_m, IJCAI-79, p586, 1979. 20. _A_m_e_r_i_c_a_n _N_a_t_i_o_n_a_l _S_t_a_n_d_a_r_d _R_e_f_e_r_e_n_c_e _M_a_n_u_a_l _f_o_r _t_h_e _A_d_a _P_r_o_g_r_a_m_m_i_n_g _L_a_n_g_u_a_g_e. American National Standards Institute, Inc., 1983. 21. Nygard, personal communication, 1985. 22. Shapiro, personal communication, 1985. 23. Rebel, personal communication, 1985. 9 9
sources-request@panda.UUCP (02/08/86)
Mod.sources: Volume 3, Issue 110 Submitted by: ihnp4!dicomed!ndsuvax!nckary (Daniel D. Kary) This is NOT a shell archive. Simply delete everything up to and including the cut mark and save the result as tutorial.doc. Dan Kary ihnp4!dicomed!ndsuvax!nckary -------------- cut here --------------- An Introduction to TRC Daniel D. Kary North Dakota State University Computer Science Department 300 Minard Hall Fargo, ND 58102 _A_B_S_T_R_A_C_T TRC is a compiler that is useful in building expert systems. The input is a rule based system whose input is syntactically similar to the input to YACC[1]. The output is a set of C language procedures. While not all features of TRC are discussed, the major features of the language are presented. Example code is used to illustrate the features and references to more detailed documentation are included. This may be the best starting point for first time users. _1. _I_N_T_R_O_D_U_C_T_I_O_N The fundamental notion that virtually the entire field of expert systems is built upon is the situation/action rule paradigm for problem solving. This paradigm is on the one hand the embodyment of simplicity and on the other hand a tool that is stunningly powerful. The situation/action rule paradigm is a way of embody- ing both information about a problem and the way the infor- mation is applied to solving the problem in a single struc- ture. Consider this trivial problem, you have a pile of coins and wish to reduce it to the smallest number of coins of equal value. One of the rules in a system to solve this problem could be: IF: there are five pennys in the pile THEN: substitute a nickel for the five pennys. A situation/action rule has the form of an IF...THEN... statement, common to virtually every progamming language. - 2 - The IF part is the situation, the rule checks to see if this situation exists, if it does the THEN or action part is exe- cuted. In addition to the rules there is a pattern matcher, which is invoked in the situation part to determine if the situation exists. This pattern matcher is typically more powerful that what is available in an IF statement in a pro- gramming language. The pattern matcher searches a data base. The data base contains all the information that is specific to this instance of the problem, and in some cases information that is germain to all or many instances of the problem. In the trivial example given here, the data base would contain the pile of coins. Finally there is a stra- tegy for deciding which rule to test next. Usually the rules are tested in a pre-specified order. When a rule whose situation part is found to be true, its action part is executed. The strategy for testing rules usually continues with the first rule each time any rule fires. This simple paradigm has been used to build systems with expert levels of problem solving ability in areas as diverse as elucidating the structure of hydrocarbons[2] to diagnosing blood diseases[3] or pulmonary function disorder[4]. In each case a system of rules is built up, each rule embodying a 'rule of thumb' or a piece of 'common wisdom' or 'accepted practice' specific to the problem domain and used by a human expert in solving the problem. The system of rules is referred to as a 'rule based system' or an 'expert system'. The expert system attempts to solve a problem by emulating the problem solving behavior of a human expert. Expert systems are often easy to modify or extend. Just as a human expert gains expertise, rules representing new knowledge can be added to an expert system. Since the situation/action rule is basically an IF...THEN... construct, why is a special language needed? Writing these rules in a traditional programming language can be tedious, a single situation part may require multiple conditional tests. In a system with a large number of rules, the structure of the system may be difficult to see and difficult to modify because the many details of the pro- gramming language tend to hide the structure. Writing the code that maintains and searches the data base that is referred to in the situation part is tedious and repiti- tious, making it an ideal subject for automation. The rest of this tutorial is devoted to describing TRC, a tool for building expert systems. The next section, sec- tion 2, describes the overall format of the input to TRC. Section 3 presents a sample set of rules to give an initial overview of how TRC works. The remaining sections present semi-formal descriptions of the syntax of TRC. This document does not contain all the information needed by advanced users. _T_h_e _T_R_C _L_a_n_g_u_a_g_e _R_e_f_e_r_e_n_c_e - 3 - _M_a_n_u_a_l[_5] contains a complete formal description of the features of TRC. _2. _B_A_S_I_C _S_P_E_C_I_F_I_C_A_T_I_O_N_S Every TRC program consists of five sections, the header, definitions, short term memory (data base), long term memory (rules) and the trailer. These sections are separated by double percent "%%" marks as in YACC specifica- tions. The form of a full specification is illustrated in figure 1. The header, short term memory and trailer are optional so the minimum specification would contain only the definitions and long term memory. All of the %% marks must be present in each specification file. header %% definitions %% short term memory %% long term memory %% trailer Figure 1: TRC Specification The purpose of the header and trailer sections is to permit the inclusion of C language code in the program, much as in YACC. One of the common features of compiled pro- cedural languages is that data objects and data types used in the program must be declared or defined before they are used. This is also true for TRC. While TRC is not a pro- cedural language, declaring objects before using them sim- plifys the process of translating to C which is a procedural language. The remaining components of the TRC grammar are tradi- tional components of expert systems. The short term memory section, herinafter abbreviated STM, is sometimes called the data base. It contains the data that is searched in the situation part of a rule. Expert systems are usually modeled after the problem solving behavior of a single expert or a group of experts in solving a single problem or class of problems. The information specific to the current instance of the problem is usually gathered by the expert, manipulated in the solving of the problem and then forgot- ten. The way this information is remembered and then for- gotten in the human brain and the area of the human brain where it is stored is called short term memory by psycholo- gists. Using that same name here refers to the similarity of purpose. - 4 - The long term memory, herinafter referred to as LTM, contains the rules and again is a reference to human brain processes. This name is a reference to the part of the human brain that remembers things for a long time. The expert usually has a set of formal procedures and informal rules of thumb that are used in solving the problem. The data changes with each instance of the problem, but the rules for solving it remain the same. Human experts have the ability to gather more expertise, to learn more about the problem. The experts learning behavior is imitated by adding new rules or modifying existing rules in LTM. _3. _W_R_I_T_I_N_G _R_U_L_E_S The coin problem mentioned in the introduction will be used to illustrate the syntax of writing a rule. This presentation is made only as an illustration. A complete description of the syntax of a rule will be given in section [?]. A set of four rules that will reduce the pile of coins will be given. The rule mentioned in the introduction searched the data base for five pennys and replaced them with a nickel. To express this as a TRC rule, write: R1: 5 PENNY => MARK 5 PENNY ADD NICKEL ; All rules begin with a label, in this case 'R1:'. A label is a token followed by a colon, and a token is a string of characters (upper case or lower case), digits and/or the underscore character. The first character of a token must be a character. Labels are used to refer to the rule by name in optimizing and user specified control. These issues are discussed in section 6. The label is followed by the situation part, which specifies what to search for in the data base (STM). In this case we are specifying a search for five items of type PENNY. The arrow symbol ( => ) marks the end of the situa- tion part and the beginning of the action part. If the situation part evaluates to true (there are 5 pennies in STM) the action part will be executed. There are two state- ments in the action part of this rule. The statement 'MARK 5 PENNY' specifys that the 5 pennies that were found in STM should now be removed. The statement 'ADD NICKEL' specifys that one more nickel should be added to STM (the pile of coins). So this rule solves a small part of the problem by performing a simple transaction. The remaining three rules needed to reduce the pile of - 5 - coins to the minimum number of equal value (assuming only pennies, nickels, dimes and quarters are used) are listed in figure 2. R2: 2 NICKEL => MARK 2 NICKEL ADD DIME ; R3: 2 DIME NICKEL => MARK 2 DIME NICKEL ADD QUARTER ; R4: 3 DIME => MARK 3 DIME ADD QUARTER NICKEL ; Figure 2: Coin Rules This simple set of rules illustrates both the indepen- dence of each rule and the interaction of the rules. Each rule describes a transaction that will reduce the number of coins in the pile without changing the total value of all the coins in the pile. Suppose the pile of coins consists of three dimes and one nickel. Initially R4 is the only rule whose situation part is true. After the action part of R4 is executed, R2 becomes true by virtue of the fact that R4 added a nickel. After R2 is executed the pile is reduced to a quarter and a dime, the minimum number of coins to make thirty-five cents. As was mentioned earlier, each of these rules is basi- cally an IF...THEN.. construct. The meaning and purpose of each of these transactions is quite evident when expressed as a situation/action rule. The same may not be true of an equivalent program written in a procedural language that included IF...THEN.. statements, procedure calls for search- ing the data base (STM) and procedure calls to remove coins from the data base or add coins to the data base. If we want to include other coins, a half dollar or - 6 - dollar coin, we can easily add another rule or two. Unusual coins, perhaps a twenty cent piece and a thirty-five cent piece, may force us to rewrite a previous rule or reorder the rules. These changes are easily acomplished in a rule based language like TRC, they may not be so easily accom- plished in a procedural language. All upper case letters were used for all the tokens, or words, in this set of rules. All reserved words in TRC are all upper case. MARK and ADD are reserved words that can be used in the action part. The rule labels and the names of the things that are being searched for in STM (PENNY, DIME etc.) were also expressed in upper case. This is a sug- gested convention. Either upper or lower or both may be used. Later we will see how C language code can be embedded in both the situation and action parts. Most of the C language is written in lower case, writing TRC in upper case will make it easier to distinguish the two. _4. _D_A_T_A _D_E_F_I_N_I_T_I_O_N Now that the basic idea of a rule based system has been presented a more formal look at the syntax of TRC is in order. TRC rules will request that a data base be searched, or that things be added to or removed from the data base. The code for searching the data base or adding new things to the data base or removing things from the data base is gen- erated by TRC. In order for TRC to generate this code it must know what kinds of things are going to be in the data base (STM). Suppose an expert system that dealt with the real value of coins, rather than just their face value was needed. Information about each coin might include not only it's dom- ination, but also the year and site of it's minting, it's condition and it's numismatic value. The types of things that are referred to in the rules (coins, in this case) will be called objects. The attributes or values that are asso- ciated with each object will be referred to as 'elements'. All objects (and their elements) that will be referred to in the rules must first be defined in the definition section of the code. The definition section of the code consists of a list- ing of the definitions of each of the object types. A YACC grammar for a single definition is given in figure 3. Strong type checking is enforced by the compiler, the type (INT, FLOAT or STRING) of items to be compared must be identical. Each definition which contains an item_list results in the declaration of a structure in the output code. STM consists of lists for each of the types declared and a count of the number of items in each list. For objects which contain no elements, TRC generates no list and no searching code, checking only the count. - 7 - definition : TOKEN | TOKEN '(' item_list ')' ; item_list : item_list item ; item : TOKEN ':' type ; type : INT | FLOAT | STRING | POINTER ; Figure 3: YACC Grammar for a Definition The purpose of the POINTER type is to generate a pointer to a structure of the type of the list. This is a 'hook' that permits building arbitrary structures in user code. There is no direct support for this type. Though it is not necessary in any sense, it may be a good idea to use all upper case character for object and element definitions. This will make these items much more visible in the code output by TRC should it become necessary to review that code. Some correct definitions include: A (A1 : STRING A2 : INT) B (B1 : FLOAT B2 : INT) C These definitions create three classes of objects. Objects in class A contain a string and an integer, those in B contain a double precision floating point value and an integer and those in object class C contain no elements. _5. _S_H_O_R_T _T_E_R_M _M_E_M_O_R_Y The purpose of the STM section of the code is to permit the initial contents of STM to be specified. TRC will pro- duce a single procedure, _i_n_i_t, which adds the listed objects to STM. Objects are added to STM by insertion at the head of the list. The objects listed in the STM section of the code are inserted in the opposite order that they are listed so the final result is that STM is initialized just as listed. This section of the code is intended to serve two pur- poses. When an expert system is being developed it provides - 8 - a way to place data in STM without having to write input routines. After the rules are written and debugged and a separate input routine has been written, this section can be used to specify an initial condition that may be needed for every instance of the problem. Figure 5 gives a YACC gram- mar for a single entry in STM. entry : count TOKEN | count TOKEN '(' init_list ')' ; count : /* empty */ | INTEGER ; init_list : /* empty */ | init_list init_item ; init_item : TOKEN ARROW INTEGER | TOKEN ARROW DOUBLE | TOKEN ARROW STR ; Figure 5: YACC Grammar for a STM Entry The objects to be added are just listed, along with the initial value of any elements the object may have. String elements that are not explicitly initialized are set equal to the null string, numeric elements that are not explicitly initialized are set equal to zero. Suppose the following objects were declared in a definition part: A (A1 : STRING A2 : INT) B (B1 : FLOAT B2 : INT) C Some correct entries would include: 10 A (A2 => 9) B (B1 => 1.1) 2 B (B1 => 2.2 B2 => 6) C A Some incorrect entries are: 10 (A2 => 9) /* the object name is missing */ B (B1 => 9) /* FLOAT literals MUST contain a decimal point */ C (C2 => 1) /* object C does not include - 9 - element C2 */ _6. _L_O_N_G _T_E_R_M _M_E_M_O_R_Y LTM is the section where the rules are enumerated. TRC generates a loop which tests the situation part of each rule in the order they are listed. When a rule is found who's situation part is true, that rule's action part is executed. Testing will normally resume at the beginning of the list of rules. The grammar, or syntax, for a single rule will now be presented. Code examples will illustrate the syntax. LTM may begin with optional switches to turn on trac- ing, profiling or backtracking. These options, which may also be turned on with command line options, are discussed in section 8. Following the option part is a listing of the rules. Figure 5 gives a YACC grammar for a single rule. Each rule begins with a label. This label is copied unmodified to the output source code and is used as a label in the main loop. It will aid the readability of the output if labels are all upper case and as descriptive as possible. Following the label is the left hand side or situation part of the rule. This part of the rule is where the search strategy is specified and the items to search for are enumerated. There are two search strategies, linear and recursive. The linear strategy is the default. In the linear search strategy each match causes a linear search from the begin- ning of STM. The first object that matches the test is marked as in use. Subsequent searches will ignore the pre- viously marked item. As soon as any single match fails the entire rule fails. Consider the following example: A (A.A1 == "TEST") This specification requests that two objects be searched for, one where the elements of the object A can have any value and one where element A1 of object A is the string "TEST". The code generated by TRC for this rule will first check that the list of A objects has at least two objects, little sense in searching a list when it is known that the search will fail. If the list has at least two objects, the first object will match the first A, since no values were specified for the elements any object will match. The list of A objects is then searched for one in which element A1 is equal to the string "TEST". The success or failure of the rule depends on the success or failure of this search. Clearly is is possible that only one object in list A had element A1 equal to "TEST". It is also possible that the object whose element A1 was equal to "TEST" was the first - 10 - production : label lhs '=>' rhs ';' ; label : TOKEN ':' ; lhs : /* empty */ | lhs match ; match : RECURS | count TOKEN | NOT TOKEN | count '(' free_variable match_list ')' ; free_variable : /* empty */ | HAT TOKEN TOKEN ; match_list : /* empty */ | match_list match_element ; match_element : TOKEN '.' TOKEN relop INTEGER | TOKEN '.' TOKEN relop DOUBLE | TOKEN '.' TOKEN relop STR | TOKEN '.' TOKEN relop TOKEN '.' TOKEN ; rhs : optional_part pass_part ; optional_part : /* empty */ | optional_part option ; option : MARK mark_list | ADD add_list | OPTIMIZE TOKEN ; mark_list : mark_item | mark_list mark_item ; mark_item : count TOKEN ; add_list : entry | add_list entry ; pass_part : /* empty */ | C_COCE ; relop : "==" | "!=" | "<=" | '<' | ">=" | '>' ; Figure 5: YACC Grammar for a Rule - 11 - object in the list. If that were in fact the case then the rule would fail even though it could have been true with a different search strategy. In this simple case the problem could be avoided by simply reordering the situation part so that the object with an element A1 equal to "TEST" was searched for first. Not all examples are this simple, and that is the rea- son for the recursive search strategy. In the recursive strategy if a given test fails, the previous test is undone and redone from the point where the selection was made. This process continues until a match is found or it is found that a match is not possible. The recursive search is a more powerful pattern matching tool, but it is much more expensive in execution time. The search time for the linear search is order N while for the recursive strategy it is order N squared. For large N this is a very substantial difference. To select a recursive search, the reserved word RECURS is included in the situation part. The clearest code will result if RECURS immediately follows the rule's label. If a rule is declared RECURS, the recursive search will apply to all objects in the situation part. There is no way to search recursively for some objects and linearly for others in a single rule. The scope of the RECURS declaration is one rule. Many expert system development tools use only the powerful but time consuming recursive search technique. Making this facility optional enables the user to exercise some control on the search time. It is also possible that the order that the objects occur in the list is important, in this case the linear search would be required. TRC always inserts new objects at the front of the list and never reorders a list or drops an element from a list, unless specifically directed to. It is sometimes necessary to compare an element of one object with an element of some previously found object, rather than to some literal value. To do this a name for the previously found object is needed. A name that is assigned to an object is referred to as a free_variable. The scope of a free_variable is the current rule. Using the previous definitions some examples are: (^A FOO) (A.A1 == FOO.A1) (^A BAR A.A1 != "TEST") (B.B2 != BAR.A2) The first line in this example picks the first object of the A list and assigns the free_variable FOO to that object. In the second line the A list is searched for an object whose element A1 is equal to the element A1 found in - 12 - the first line. The third line searches the A list for an object whose element A1 is not equal to "TEST" and assigns the free_variable BAR to this object. The final line searches the B list for an object with an element B2 not equal to the element A2 found in the previous search. Notice what is happening here, elements from different lists are being compared. This comparison is permitted because both elements are integers, so the types of the elements match. In complex matches like this it is frequently neces- sary to use the recursive search. A new definition is needed to consider yet another case: C (C1 : INT C2 : INT) The final case to be considered is the case where two elements of the same object are compared. There are two equivalent ways to specify this: (^C FOO C.C1 == FOO.C2) (C.C1 == C.C2) TRC will generate identical code for either of these statements. In each line a specification is made that the C list be searched for an object where elements C1 and C2 are equal. There is a subtle but important difference between these similar examples and all previous examples. In all previous cases the right hand side of the relational expres- sion evaluated to an absolute value before the search began. In this example the absolute value of the right hand side of the relation changes with every object that is tested. There is a small code overhead for this type of testing, which is noticeable only if used on many different elements of many different types of objects. Finally the form NOT TOKEN, where TOKEN is some object is an explicit test for an empty list. The case of search- ing a list for the absence of some element is discussed later. The situation and action parts are separated by the ARROW symbol, "=>". The action part can contain MARK, ADD and OPTIMIZE statements. The MARK statement lists the number and type of objects to remove from STM. The only items that may be removed from STM by a MARK statement are objects that were enumerated in the situation part. This restriction is necessary because those objects are the only ones that definitely are in STM. The ADD statement lists objects that are to be added to STM. Objects are inserted at the head of their respective lists in the order they are listed in the action part. - 13 - Objects are always ADDed to STM before objects are removed, regardless of the order of ADD and MARK statements. This is necessary because ADD statements can refer to elements that are about to be removed. Assume the previous definitions of A and B for this example rule: RULE: (A.A1 == "BAR") (^A FOO A.A1 == "FOO") => MARK 2 A ADD B (B1 => 3.14159 B2 => FOO.A2) { printf("RULE is firing\n"); } ; TRC will generate code that will search the A list first for an object with element A1 equal to "BAR" and then for an object with element A1 equal to "FOO". If both of these searches succeed the action part will execute. The MARK statement specifies that both A objects are to be removed from STM. This could also have been specified: MARK A A or MARK A MARK A The MARK statement causes objects to be removed in the order they were found. It is possible for a situation to exist where it is not desirable to remove the objects in the order they were found. In the example above it may be desirable to remove the second type A object, but not the first. Objects may be MARKed based on their free_variable name. The following statement will cause only the second type A object from the sample rule to be MARKed: MARK FOO The example ADD statement adds an object of type B to list B copying a value out of list A. The code generated by TRC will do the ADD first then the MARK since the ADD state- ment refers to a MARKed element. The code section is executed after the ADD and MARK code and simply prints a message. It is included here to demonstrate what a code section looks like. Information on techniques used to refer to objects in the code section is presented in _T_h_e _T_R_C _L_a_n_g_u_a_g_e _R_e_f_e_r_e_n_c_e _M_a_n_u_a_l[_5]. The final semicolon is included in the TRC syntax to give the parser a point to sync on in case of syntax errors in the - 14 - source. The OPTIMIZE statement is used to tell TRC that after the current rule executes it is not necessary to search the list of rules from the very beginning of LTM, rather the search can begin with the named rule. The naming of the OPTIMIZE statement refers to its primary, but not neces- sarily only purpose. The OPTIMIZE statement can be used to impose a control structure on LTM. For convenience the label "Start" alway precedes the first rule and the label "End" always follows the last rule. The TRC grammar does not include a way to specify a search for the absence of some element. This can be accom- plished using the OPTIMIZE statement and a side effect of the search strategy. The LTM section in Figure 7 demon- strates this possibility. RULE1: (A.A1 == "FOO") => OPTIMIZE RULE3 ; RULE2: => { /* do your thing here */ } ; RULE3: /* system continues here */ Figure 7: Testing for the Absence of a Pattern Figure 7 illustrates a technique for testing for the absence of some element. RULE1 tests for the presence of the element and uses the OPTIMIZE statement to branch around RULE2 if it is found. The situation part of RULE2 is empty. An empty situation part always evaluates to true. RULE2 will always fire when RULE1 fails and never be tested when RULE1 fires. The combination of RULE1 and RULE2 is a rule that tests for the absence of an element. _7. _H_E_A_D_E_R _a_n_d _T_R_A_I_L_E_R The header and trailer are lexically identical and serve similar purposes; the inclusion of C code that is not related to a specific rule but is of a more global nature. The syntax is identical for the header, trailer and code section of long term memory. The code section must begin with an open brace '{' and end with a closed brace '}'. A code section is recognized by the input scanner using a very - 15 - trivial algorithm; when an open brace is encountered a code section begins. The 'brace count' is set to one and each time an open brace is encountered the 'brace count' is incremented. Each time a closed brace is encountered the 'brace count' is decremented. When the 'brace count' is zero the code section is presumed to have ended. All text in the code section, except the initial open brace and final closed brace, is passed through untouched. This simple algorithm avoids the potential problem of having to parse the C language within TRC, but it is very easy to defeat. If the number of braces in a code section is not balanced, the end of the section will not be deter- mined correctly - this includes braces embedded in comments. A single missing brace may cause the entire compilation to be aborted. Worse yet would be two complimentary missing braces in separate sections of the specification. Very large pieces of the specification may be passed. These problems are common in programming languages and not diffi- cult to avoid. Sample valid headers and trailers are illus- trated in figure 6. { /* this is a sample valid header section */ struct mystruct{ int a, b, c; struct mystruct *next; }; } %% definitions %% short term memory %% long term memory %% { /* this is a sample valid trailer section */ myprocedure() { /* do my thing in here */ } } Figure 6: Sample Header and Trailer On the somewhat related subject of comments, C style comments may be included anywhere in the specification that a space may occur. Comments that are not part of a code section are recognized by the scanner and are discarded. Nested comments outside of the code section are not - 16 - supported, comments occuring in a code section are passed through. The code in the header section is written on the output file _h_e_a_d_e_r._h. This file is included in all output files. The header section should be used to declare structures and variables which may be used in the action part of a rule. Since this code will be included in several files it should not contain initialized data or procedures, which would cause duplicate definition errors at compile time. The code in the trailer section is written on the out- put file _l_o_o_p._c. This is the code file which contains the main loop, and includes the definitions of all the struc- tures and global variables manipulated by the inference engine. Including procedures in the trailer is a convenient way of gaining visibility of those objects. _8. _O_P_T_I_O_N_S The option section occurs at the beginning of LTM. The option section may be empty, but any options must precede the first rule. Options may also be specified with command line flags. There are several options; TRACE, PROFILE, BACKTRACK, DUMP, RECURS, ZERO and SAVE. The TRACE option causes a runtime trace to be created. The primary purpose of this trace is to facilitate generat- ing explanations. People seldom take the advice of an expert without a satisfactory explanation of why the advice should be followed. It may not be reasonable to expect peo- ple to take advice of an expert system that can not explain itself. The trace is a list of rules that were fired, or inferences that were drawn. Having the trace facilitates explanations of the type, "I found that a, b and c were true and therefore concluded that d should be pursued". The gen- eration of explanations is left to user code, only the trace is provided. Turning the TRACE option causes the procedure _a_p_p_e_n_d__t_r_a_c_e and the structure _t_r_a_c_e to be generated. Each time a rule is fired the procedure append_trace is called with the number of the rule that fired. This number is appended to the list of rules. The list structure is defined: struct trace{ int rule; struct trace *next; } *trace_front, *trace_back; The pointer _t_r_a_c_e__f_r_o_n_t points to the head of the list and _t_r_a_c_e__b_a_c_k points to the last item in the list. Rules - 17 - are numbered beginning with 1 from the start of LTM. if the label of the rule would be more convenient it can be obtained from the rule_names array, e.g. the label of rule one is: rule_names[1] The PROFILE option generates two arrays in which counts of the number of times each rule executed and each match was searched are stored. The procedure _p_r_i_n_t__p_r_o_f_i_l_e will print a summary of the execution of the inference engine. The BACKTRACK option generates a structure and pro- cedures needed to implement backtracking. These structures and procedures are detailed in _T_h_e _T_R_C _R_e_f_e_r_e_n_c_e _M_a_n_u_a_l[_5]. Backtracking is a technique for searching a problem space. When a dead end is reached, the last decision is undone and the search continues. In the inference engine generated by TRC one backtracking step is taken each time all the rules in LTM fail. When backtracking is enabled, objects marked for deletion from STM are saved in a back track structure along with their former position. The number of items added by the rule is saved in the same structure. To undo a rule the formerly added objects are deleted and the formerly deleted objects are restored to their original position in the STM. The backtrack procedures assume that the ADD and MARK statements are the only way that STM is modified. If user code modifies STM the backtrack save and restore procedures will have to be modified to be cognizant of user code changes to STM. In a system with backtracking it is essential that some rule recognizes when the problem is solved and returns to the calling procedure. If no rule does this, the system will perform every manipulation on STM that it can in every order that it can and finally will return with STM fully restored to it's original state. Thus vast resources can be consumed to to obtain the same results that not calling the inference engine at all would produce. The DUMP option causes code to print the contents of STM on the standard output to be generated. The procedure _d_u_m_p__s_t_m will print out the contents of each list of objects in STM on the standard output. There is also a procedure _d_u_m_p_%_s__s_t_r_u_c_t generated for each object defined, where "%s" is replaced by the object name. Calls to generate dumps of the entire STM or specific lists may be embedded in the C_Code part. TRC itself never generates calls to the dump procedures. The RECURS option is the only option that may be placed in the option part of LTM or in the situation part of a - 18 - rule. If RECURS is used in the option part all rules will default to the recursive search strategy. This option can be turned off for a given rule by including NORECURS in the situation part of the rule. The ZERO option will generate a single procedure, _z_e_r_o. This procedure will free all the structures that are dynami- cally allocated by the TRC generated code. The structures that are allocated dynamically include STM, the backtracking stack (if backtracking is enabled), the profiling arrays (if profiling is enabled) and the trace list (if tracing is enabled). TRC will generate code for any combination of options. This is useful in situations where the expert sys- tem is called more than once. The zero procedure will clean up anything left by a previous invocation. The SAVE option will generate procedures to write all dynamically allocated structures to a file and procedures to restore those structures from a previously written file. This option makes it easy to write expert systems which checkpoint their own execution. It is then possible to res- tart execution in the case of a crash without having to redo all the work that has already been done. It is necessary to save all dynamically allocated structures including STM, the backtracking stack, profiling arrays and the trace list. Separate procedures are generated for saving and reloading each of these structures. _9. _E_N_V_I_R_O_N_M_E_N_T TRC is a compiler that translates a rule based system to a set of C language procedures. It is useful in develop- ing expert systems. TRC produces only an inference engine and supporting structures, input and output processing must be added with additional code, presumably in C. The minimum external code is a main procedure that will initialize STM and call the inference engine. Figure 8 is a minimal main procedure that includes examples of calls to procedures to dump STM, print the trace list and print the execution pro- file. This assumes that the DUMP, TRACE and PROFILE options were turned on. The output of TRC is written on several files in the current directory. The file names generated are; add.c, dump.c, free.c, loop.c, misc.c, search.c, relink.c, backtrack.c, profile.c, zero.c and save.c. A sample Makefile is given in Figure 8. The reference to main.c refers to user supplied code. - 19 - #include <stdio.h> #include "loop.h" extern char *rule_names[]; main() { struct trace *temp; /* initialize STM */ init(); printf("Initial STM"); dump_stm(); /* call the inference engine */ loop(); printf("Final STM"); dump_stm(); /* dump the contents of the trace structure */ temp = trace_front; while(temp){ printf("%s",rule_names[temp->rule]); temp = temp->next; } print_profile(); } Figure 7: Sample Main Procedure # Makefile for expert systems generated by TRC PROG = loop OBJS = add.o backtrack.o dump.o free.o loop.o \ misc.o profile.o relink.o save.o \ search.o zero.o main.o INCS = loop.h CC = cc all: $(PROG) $(CC) -c -O $< $(OBJS): $(INCS) $(PROG): $(OBJS) $(CC) -o $@ $(OBJS) $(LIBS) Figure 8: Sample Makefile - 20 - BIBLIOGRAPHY 1. Johnson, S. C. (1975), "YACC: Yet Another Compiler Com- piler", Computer Science Technical Report No. 32, Bell Laboratories, Murray Hill, NJ. 2. Lindsay, Robert, et.al (1980), _A_p_p_l_i_c_a_t_i_o_n_s _o_f _A_r_t_i_f_i_c_i_a_l _I_n_t_e_l_l_i_g_e_n_c_e _f_o_r _C_h_e_m_i_c_a_l _I_n_f_e_r_e_n_c_e: _T_h_e _D_E_N_D_R_A_L _P_r_o_j_e_c_t, McGraw Hill, New York. 3. Davis, R., B.G. Buchanan and E.H. Shortliffe (1977), "Production Rules as a Representation for a Knowledge-Based Consultation Program", Artificial Intelligence, Volume 8, Issue 1, February 1977. 4. Feigenbaum, E.A., (1978), "The Art of Artificial Intelli- gence: Themes and case studies of knowledge engineering", IJCAI. 5. Kary, Daniel D. (1985), "The TRC Reference Manual", North Dakota State University, Division of Mathematical Sciences, Fargo, ND.
sources-request@panda.UUCP (02/08/86)
Mod.sources: Volume 3, Issue 111 Submitted by: ihnp4!dicomed!ndsuvax!nckary (Daniel D. Kary) This is NOT a shell archive. Simply delete everything up to and including the cut mark and save the result as reference.1.doc. Dan Kary ihnp4!dicomed!ndsuvax!nckary -------------- cut here --------------- The TRC Reference Manual Daniel D. Kary North Dakota State University Computer Science Department 300 Minard Hall Fargo, ND 58102 _A_B_S_T_R_A_C_T The syntax of TRC is formally defined. The output of TRC is elucidated. TABLE OF CONTENTS PART ONE - INPUT 1. INTRODUCTION 2. OVERVIEW 3. LEXICAL ELEMENTS 4. DEFINITIONS 5. SHORT TERM MEMORY 6. LONG TERM MEMORY 7. OPTIMIZER PART TWO - OUTPUT 8. OVERVIEW 9. COMMON PROCEDURES 10. DATA OBJECTS 11. MANIPULATING THE DATA 12. TRANSLATING RULES 13. OPTIONS APPENDICES A. TRC GRAMMAR B. ERROR MESSAGES C. STYLE NOTES D. SAMPLE PROGRAM _1. _I_N_T_R_O_D_U_C_T_I_O_N TRC is a programming language that is useful for build- ing expert systems. It is presumed that the reader is fami- liar with expert systems in general and has used at least one expert system building tool. Some terms that are widely - 2 - used in describing expert systems have specific meanings when used to describe TRC and will be defined now. The set of situation-action rules that embody the knowledge an expert uses to solve a problem are referred to as Long Term Memory (LTM). The information that may vary with each instance of the problem is referred to as Short Term Memory (STM). The code which determines if the situa- tion part of a rule is true will be called a pattern matcher or a matcher. The code which determines which rule to activate will be called an inference engine and includes both the matcher and the LTM. The input to the TRC compiler is called a specification. The input to the TRC compiler is a rule based language. The output is a set of C language files. The procedures in the C language files output by the TRC compiler collectively implement the inference engine. An inference engine is to an expert system as a parser is to a compiler: it is of cen- tral importance but it does not comprise a complete imple- mentation. TRC does not provide code for interaction with the user, but does permit the programmer to easily add this code. This document is divided into two parts and a set of appendices. The first part presents a formal definition of the input language with examples of each language feature. The second part describes the output of the TRC compiler and includes some important insight on integrating TRC generated code with other C language code. The appendices include the complete TRC grammar, a listing and explanation of all the error messages that TRC might produce and a sample specifi- cation. PART ONE - INPUT _2. _O_V_E_R_V_I_E_W Every specification file consists of five sections, the header, definitions, short term memory (data base), long term memory (rules) and the trailer. These sections are separated by double percent characters. The form of a full specification is illustrated below. The header, STM and trailer are optional so the minimum specification would con- tain only the definitions and LTM. All of the "%%" marks must be present in each specification file. header %% definitions %% - 3 - STM %% LTM %% trailer The purpose of the header and trailer sections is to permit the inclusion of C language code in the specifica- tion. The header and trailer are each composed of a single lexical element called a c-code which is defined in section 3. Separate sections are devoted to each of the remaining parts of a TRC specification. _3. _L_E_X_I_C_A_L _E_L_E_M_E_N_T_S A program consists of a single file. A file is a sequence of lexical elements composed of characters. Char- acters may be one of these classes; (1) upper-case-letters, (2) lower-case-letters, (3) digits, (4) special characters (5) separators (6) embedded characters and (7) other charac- ters. (1) upper-case-letters A B C D E F G H I J K L M N O P Q R S T U V W X Y Z (2) lower-case-letters a b c d e f g h i j k l m n o p q r s t u v w x y z (3) digits 0 1 2 3 4 5 6 7 8 9 (4) special characters " ( ) : ; = < > ! ^ _ $ . { } / * % (5) separators space tab newline (6) embedded characters \n embedded-newline \t embedded-tab \b embedded-backspace \r embedded-carriage-return \f embedded-form-feed \\ embedded-back-slash \" embedded-quote (7) other characters @ # & + [ ] ` ~ ' | , ? The following names are used when referring to special characters: Symbol Name Symbol Name - 4 - " quote : colon ; semicolon = equal ! exclamation ^ hat _ underscore $ dollar < less-than > greater-than ( left-paren ) right-paren { left-bracket } right-bracket * asterisk % percent - minus A lexical element is a either a delimiter, an identif- ier, an integer literal, a floating point literal, a string literal, a comment or a c_code. In some cases a separator is required between lexical elements, specifically when adjacent lexical elements could be interpreted as a single lexical element. A separator is any of space, tab or new- line. One or more separators are permitted between any two lexical element, before the first lexical element and after the last lexical element. _3._1. _D_E_L_I_M_I_T_E_R_S A delimiter may be one of the following special charac- ters: : ; " . ^ - ( ) { } < > A delimiter may be one of the following compound delim- iters. Each compound delimiter is composed of two adjacent special characters. => %% != == >= <= The following names are used when referring to compound delimiters: Delimiter Name => arrow %% delim != not-equal == equality >= greater-than-or-equal >= less-than-or-equal _3._2. _I_D_E_N_T_I_F_I_E_R_S Identifiers are used as tokens and as reserved words. Separators are not allowed in an identifier. The underscore character is the only special character that may be part of an identifier. identifier ::= letter { underscore | letter | digit} - 5 - letter ::= upper-case-letter | lower-case-letter Examples: PENNY Get_Stuff x1 ThisOne WrZ_123 etc_ The identifiers that are reserved words are: ADD MARK PROFILE BACKTRACK NORECURS RECURS DUMP NOT SAVE EMPTY OPTIMIZE STRING FLOAT POINTER TRACE INT PREFIX ZERO _3._3. _N_U_M_E_R_I_C _L_I_T_E_R_A_L_S There are two classes of numeric literals, integer literals and floating point literals. The presence of a decimal point distinguishes floating point literals from integer literals. floating-point-literal ::= [ minus ] digits dot digits integer-literal ::= [ minus ] digits digits ::= digit { digit } Example integer literals: 1 -33 187 Example floating point literals: 0.5 -3.14159 6.0 _3._4. _S_T_R_I_N_G _L_I_T_E_R_A_L_S A string literal is formed by a sequence of characters (possibly zero) enclosed between quote characters. Any of the six classes of characters may be embedded in a string. string-literal ::= quote { [ character ] } quote Examples: "" "\"recursion\"" "these characters can be in a string $, ! => etc_\n" _3._5. _C_O_M_M_E_N_T_S Comments may be included anywhere in the input file that separators or delimiters may occur. Comments follow the style of comments in the C language. Comments may not be nested within comments. Any of the six classes of - 6 - characters may be embedded in a comment. comment ::= slash asterisk { [ character ] } asterisk slash Examples: /* a simple comment */ /* a multi-line comment */ /******************* * A Fancy Comment * *******************/ _3._6. _C__C_O_D_E A c_code is a fragment of C language code that is embedded in the input file. A c_code is recognized by the scanner as a single lexical item. The C language itself is not parsed by TRC. A c-code may not contain a procedure or function. c_code ::= left-bracket { [character] | [c_code] } right-bracket Example: { if(condition){ action(argument); another_action(); } } _4. _D_E_F_I_N_I_T_I_O_N_S Each entity that can be referred to by a TRC rule must be defined in the definition section. Each entity that is defined is called an _o_b_j_e_c_t. Objects may have numeric or string values associated with them. These associated values are called _e_l_e_m_e_n_t_s of the object. There are two forms of definitions. There is a simple form for objects which have no elements and an extended form for objects that have asso- ciated elements. definition ::= identifier definition ::= identifier left-paren item-list right-paren item-list ::= { [ item ] } item ::= identifier colon type type ::= INT | FLOAT | STRING | POINTER For each object there will be an associated object count in the output code, which represents the number of - 7 - objects of that type that exist at any point in time. For each object with at least one element, there will be an associated structure and list of objects of that type in the output code. The set of object counts and object lists defined in the definition section represent the STM that the system of rules may refer to. Each element that is defined for a given object must have a data type specified. Strong type checking is enforced throughout TRC. Comparisons and assignments of elements must involve elements or literals of the same type. There is no coercion of types. The INT, FLOAT, STRING and POINTER types are described in section 10. The POINTER type is a pointer to a structure of the type of the object that contains it. Examples: A b_b (This : INT) CAST ( THAT : INT The_Other : FLOAT) _5. _S_H_O_R_T _T_E_R_M _M_E_M_O_R_Y The short term memory (stm) section of the input is where the initial state of the working memory is specified. The intention of this section is to permit the working memory to be initialized to some state that may be required for each invocation of the expert system. It is also intended to serve as a way of entering test data while the expert system is being developed, before data entry pro- cedures are developed. stm ::= { [ entry ] } entry ::= [ integer-literal ] identifier entry ::= [ integer-literal ] identifier left-paren { [ init-item ] } right-paren init-item ::= identifier arrow value value ::= integer-literal value ::= floating-point-literal value ::= string-literal The short term memory section is a list of objects that are to be entered into the working memory. If an object has one or more elements, those elements can be initialized to user specified values. Numeric values that are not speci- fied are initialized to zero and string values that are not specified are initialized to an empty string. The integer- literal that may precede the name (identifier) of the object specifies how many objects of that type are to be added to working memory with the given element values, e.g.: /* definition part */ - 8 - A (A1 : STRING A2 : FLOAT) B (B1 : STRING B2 : FLOAT) %% /* short term memory */ A ( A1 => "string") /* It is not necessary to initialize all the elements in an object. */ 2 A ( A2 => 3.34 A1 => "thing") /* Nor is it necessary to initialize elements in the order they were declared. */ 3 B /* It is not necessary to initialize the elements at all. */ %% _6. _L_O_N_G _T_E_R_M _M_E_M_O_R_Y Long term memory is the section where the situation/action rules are enumerated. This section may begin with a listing of options that are to be turned on. All options in this section can also be specified by command line flags. Since the syntax for the long term memory sec- tion is more complex than for the other sections, it will be presented in several parts. _6._1. _O_P_T_I_O_N_S The long term memory is composed of two sections, the options and the rules. ltm ::= { [option] } { rule } option ::= ZERO | PROFILE | BACKTRACK | DUMP | RECURS | NORECURS | SAVE | TRACE | PREFIX identifier _6._1._1. _Z_E_R_O The ZERO option directs the compiler to generate a pro- cedure that will free all the dynamic structures allocated by TRC generated code. This feature is useful when develop- ing inference engines that will be entered more than once. It is often necessary to remove the 'leftovers' from a pre- vious execution before beginning a new execution. - 9 - _6._1._2. _P_R_O_F_I_L_E The PROFILE option directs the compiler to generate code to profile the execution of the inference engine and a procedure to print a summary of that profile. The profiling code counts the number of times that each rule searches some part of STM and how many times each rule is fired. _6._1._3. _B_A_C_K_T_R_A_C_K The BACKTRACK option directs the compiler to generate an inference engine that will backtrack when no rule is true. Backtracking is accomplished by undoing the actions of the last rule that fired and continuing to test rules as if the undone rule had never fired. _6._1._4. _D_U_M_P The DUMP option directs the compiler to generate pro- cedures that will print the contents of STM on the standard output. The intention of this option is to simplify the process of developing and debugging rules. By having the DUMP procedures generated automatically, the knowledge engineer is freed of the mundane task of writing procedures to display the current state of the STM. The DUMP pro- cedures are not intended to serve as the output of an expert system. Appropriate output routines will have to be developed by the knowledge engineer after the rules have been written. _6._1._5. _R_E_C_U_R_S TRC will generate code that uses one of two strategies for searching STM. These strategies (detailed in section 6.3.3) are called LINEAR and RECURSIVE. The LINEAR strategy is the default. The RECURS directive in the option part directs the compiler to use the RECURSIVE strategy as the default. It is possible to override the default on a per rule basis. Overriding the default is discussed in section 6.3.3. _6._1._6. _N_O_R_E_C_U_R_S The NORECURS option directs the compiler to use the LINEAR search strategy in all rules, unless otherwise directed. Since this is the default condition, it is not necessary to use this option. _6._1._7. _S_A_V_E The SAVE option directs the compiler to generate pro- cedures to save all objects which are dynamically allocated by TRC code on a file. The compiler will also generate pro- cedures which can restore the dynamically allocated - 10 - structures from the previously written files. The intention of this option is to simplify the development of expert sys- tems with checkpointing and restarting capabilities. The procedures generated by this option and the use of those procedures is described in section 13.6 and Appendix C. _6._1._8. _T_R_A_C_E The TRACE option directs the compiler to trace the exe- cution of the inference engine by maintaining a list of the rules that have been fired in the order they were fired. This list can be used to produce an explaination of the actions taken by the expert system. _6._1._9. _P_R_E_F_I_X The PREFIX option directs the compiler to use the iden- tifier that follows the reserved word 'PREFIX' as a prefix for all data objects and procedures generated by TRC. The intention of this option is to facilitate building expert systems that have more than one inference engine. Supplying different prefixes for each inference engine insures that there will be no name conflicts between separate inference engines, e.g.: PREFIX X_ _6._2. _R_U_L_E_S The second section if LTM is the list of rules. Each rule has a label, which can be supplied or automatically generated by TRC. The label is used whenever it is neces- sary or convenient to refer to the rule by name. The label is followed by the situation part (described in the next section). The situation part is a list of statements fol- lowed by the arrow delimiter. The action part (described in the section following the description of the situation part) follows the arrow delimiter and is itself a list of state- ments. The action part is followed by a semicolon which terminates the rule. rule ::= label situation arrow action semicolon label ::= identifier colon | colon _6._3. _S_I_T_U_A_T_I_O_N The situation part specifies how STM is to be searched and what must be present in STM for the situation part to be true. situation ::= { [ s-option ] } { [ match ] } s-option ::= EMPTY identifier identifier - 11 - s-option ::= RECURS | NORECURS match ::= [ integer-literal ] identifier match ::= NOT identifier match ::= [ integer-literal ] left-paren name match-list right-paren match ::= c-code name ::= hat identifier identifier match-list ::= { match-item } match-item ::= identifier dot identifier relop literal match-item ::= identifier dot identifier relop identifier dot identifier relop ::= equality | not-equal | less-than relop ::= greater-than | greater-than-or-equal relop ::= less-than-or-equal _6._3._1. _M_A_T_C_H_I_N_G It is necessary to understand how matching is specified before the s-option part can be explained. A match, which will also be referred to as a test, is a statement of what the inference engine is to search for in STM. Assume the following objects were defined in the definition section: %% A (A1 : INT A3 : INT A2 : STRING) B (B1 : INT B2 : STRING) %% The simplest match specifies only the object that must be present. A search for one object of type A and one object of type B can be specified as follows: A B A search for two objects of type A and two objects of type B can be specified in many ways, including these four equivalent ways: A A B B 2A 2 B A B A B A A 2B The objects can be listed in any order and may be pre- ceded by an integer literal. The integer literal specifys how many objects of the named type are to be search for. In one of the examples there is a space between the count and the object name and in other examples there is no space - 12 - between the count and the object. Spaces are required only when there would be a conflict without a space. Since the string "2A" (for example) begins with a digit, it is presumed to be a numeric literal. Since "A" is not a digit, the numeric literal ends at that point. Since the numeric literal contained no decimal point, it is an integer- literal. The string is therefore lexically equivalent to "2 A". The reserved word NOT is used to explicitly test for an empty list. The following match statement will be true if there are no objects of type A in STM: NOT A Any rule which contains a search for an object and a test for that same list being empty can never be true. TRC generates an error message in this situation because even though it is syntactically correct, it is in fact meaning- less, e.g.: A NOT A Very often it is necessary to search STM based not only on the type of object, but also based on the values of the elements of the object. This is specified by placing a list of the element values after the element name. (A.A1 == 2) (B.B1 != 3 B.B2 <= "THIS") (A.A2 == "HERE" A.A1 > 6) These three statements can be translated as follows: First search the A list for an object whose element A1 is equal to two, then search the B list for an object whose element B1 is not equal to three and whose element B2 is less than or equal to "THIS", finally search the A list for an object whose element A2 is equal to "HERE" and whose ele- ment A1 is greater than six. This situation part would be true if all three objects were found in STM, otherwise it would be false. In the first match only the value of A1 is specified. Only the elements that are specified are tested, the values of any other elements that the object may contain are not considered. Association of parameters is by name, so it is not necessary to list elements in the order they were declared. The third match statement in the example above lists the value of element A2 first, even though ele- ment A1 was declared first. The final case that must be considered is the case where it is necessary to search STM for an object whose ele- ments are to be tested against the result of some previous - 13 - search. To do this it is first necessary to name the object that is being searched for so that it may later be referred to, e.g.: (^A FIRST A.A1 == 2) (B.B1 == FIRST.A3) (A.A1 != A.A3) The first statement begins with a hat character. This indicates that this object is to be named. The hat charac- ter is followed by the object type and it's name. The name is followed by a list of the elements to search for, in this case a search for element A1 equal to two is specified. This statement can be translated as follows: Search the A list for an object whose element A1 is equal to two and name that object "FIRST". A name that is applied to an object is called a free variable. The scope of a free variable is the current rule. Free variable names can be reused in subse- quent rules. The second statement specifys that the B list is to be searched for an element B1 whose value is equal to the value of the element A3 found in the previous statement. The free variable name makes it possible to refer to previously found elements. The third statement, while looking innocent enough, is radically different from all previous examples. In all the previous examples the exact value that was being searched for was known before the search began. That value was expressed as either a literal, or the value of some element that was found in a previous test. In the third statement, the A list is being searched for an object whose elements A1 and A3 are not equal to one another. The values these ele- ments are to have are not specified, only their relationship to one another. This can be further complicated: (^A Second A.A1 == 3) (A.A1 < Second.A3 A.A3 < A.A1) In the second match statement of this example A1 is being compared to the value of A3 in the object named "Second" and it is being compared to the value of the ele- ment A3 in the object that contains it. An element may appear on the left hand side of the relational operator only once in a given match statement. It is now possible to con- sider the effects of the options. - 14 - _6._3._2. _O_P_T_I_O_N_S The situation part begins with a (possibly empty) list of options. The reserved words RECURS or NORECURS may appear in the option part of the situation. The appearance of one of these words causes the named strategy to be used rather than the current default strategy. It is not an error to explicitly specify the default strategy, but it is unnecessary. The option part of the situation may also include EMPTY statements. An EMPTY statement is a static object declaration. The intention of the EMPTY statement is to provide a means of passing data from STM to embedded c- code and from embedded c-code to STM. Examples in section 12 will illustrate these actions. _6._3._3. _S_E_A_R_C_H _S_T_R_A_T_E_G_I_E_S A small example provides an easy way to illustrate the two search strategys. This example is a complete TRC specification, though not useful for anything other than as an example. %% PENNY (MINT : STRING DATE : INT) %% PENNY (MINT => "DENVER" DATE => 1964) PENNY (DATE => 1966) %% R1: (^PENNY First) (PENNY.DATE <= First.DATE) => MARK PENNY ; %% STM will be initialized to contain two objects of type PENNY, the first minted in Denver in 1964, the second minted in an unspecified location in 1966. Since the reserved word RECURS does not appear in either option section, the default LINEAR search strategy will be used. In the LINEAR strategy, STM is searched in a linear fashion for each object specified in the situation part. Objects are searched for in the order they are listed. In this example, the object named "First" will be associated with the first object in the list. Since the values of the elements are not specified, any object of type PENNY will match. This object is then temporarily marked as being "in use" and can not be used to match any subsequent tests. The list will then be searched for an object of type PENNY whose DATE element is less than or equal to the DATE element of the "First" object. The only other object in the list has a - 15 - DATE element of 1966 which is not less than or equal to 1964, so the rule fails. In the LINEAR strategy, when any test in the situation part fails, the entire rule fails immediately, no further tests are made. It should be obvious that this rule would have been true if "First" had been associated with the second object in the PENNY list. This is precisely the purpose of the RECURSIVE search strategy. In the RECURSIVE search stra- tegy, when a test fails, the previous test is redone. To redo a test, the object that was marked as "in use" is unmarked, and the list is searched from that point for an object that will match the test. The RECURSIVE search fails when a single test fails and it is no longer possible to undo the previous test (this occurs when there is no previ- ous test). The RECURSIVE search strategy is a powerful pat- tern matching tool, but it can be expensive in terms of exe- cution time. _6._3._4. _E_M_B_E_D_D_E_D _C_O_D_E Arbitrary C language code may be embedded in the situa- tion part anywhere a match may occur. Recall that embedded code (c-code) is recognized as a single lexical element by the scanner, the C language itself is not parsed by TRC. Errors in embedded code will not be detected by TRC. The intention of permitting embedded code in the situation part is to make it possible to include tests that may not fit the context of a match against STM. In order to integrate an embedded code test with the existing match statements, it is necessary to have a way to refer to objects in embedded code. In order for embedded code to have the same functionality as a match statement, it is necessary to have a way to cause a rule to fail in the embedded code. Each of these facilities are provided. _6._3._5. _E_M_P_T_Y _O_B_J_E_C_T_S The purpose of the EMPTY statement is to create a named object that can be referred to by match statements and embedded code, without having to exist in STM. One of the capabilities that results is the ability to have STM searched on the basis of the result of some external pro- cedure, e.g.: R1: EMPTY PENNY SPARE /* this creates an object of type PENNY that is named SPARE. This object exists separately from STM and it's elements are not initialized. */ { /* this embedded C code precedes - 16 - any search of STM */ if(($SPARE.DATE = external-procedure()) <= 1920){ $FAIL. } } (PENNY.DATE == SPARE.DATE) => MARK PENNY ; Several things are happening in this example. First an object of type PENNY is created and given the name SPARE. This object exists separately from STM and will exist only during the current rule. It is useful only as something that can be referred to in other statements. A section of embedded code precedes the only match statement in this rule. When the code produced by TRC is compiled and run, that embedded code will be executed before STM is searched, by virtue of the fact that it precedes the match statement. The embedded code contains an "if" statement which con- tains an embedded assignment and function call as part of it's condition. The left-hand-side of the embedded assign- ment, "$SPARE.DATE" is not syntactically correct C language code. The dollar character is a flag to TRC that indicates a reference to a named object. The identifier that follows the dollar character will be translated by TRC during the output phase. This translation is described in section 12. The statement "$FAIL." is translated by TRC into whatever statements are required to make this rule fail. The defini- tion of failure depends on the search strategy. If the LINEAR strategy is being used, "$FAIL." will cause the rule to stop searching STM and continue with the next rule. If the RECURSIVE strategy is being used, "$FAIL." will cause the rule to undo and then redo the previous match statement. An object name preceded by the dollar character may occur in the embedded code anywhere a variable name may occur, since that is what it will actually be translated to. Embedded code may also refer to objects that exist in STM using the same dollar character translation technique: R1: RECURS (^PENNY NEW PENNY.MINT == "DENVER) { if(some-function($NEW.DATE)) $FAIL. else $NEW.DATE = 0; } (PENNY.DATE == 1921) - 17 - => MARK PENNY ; The "else" part of the embedded code illustrates an assignment to an element of the object named NEW. The object that is being called NEW exists in STM and this assignment to it's DATE element is permanent. Since this rule is recursive, it is possible that this embedded code will set the DATE element of every object in the PENNY list to zero. These modifications are made before it is even known that the situation part is true. Modifying STM in the situation part of a rule would be a major departure from traditional expert system implementation techniques. Furth- ermore, the BACKTRACKing option is unaware of changes made in STM by embedded code. The BACKTRACKing option is unable to correctly undo this rule. _6._4. _A_C_T_I_O_N The ACTION part specifies what is to be done if the situation part is true. The actions that can be taken pri- marily involve adding objects to STM or deleting objects from STM. Recall that the non-terminal 'entry' was defined in section 4. action ::= statements c-code statements ::= { [statement] } statement ::= MARK mark-list statement ::= ADD add-list statement ::= OPTIMIZE identifier mark-list ::= { [ mark-item ] } mark-item ::= [ integer-literal ] identifier add-list ::= { [ entry ] } _6._4._1. _M_A_R_K The MARK statement is used to delete objects from STM. Only objects that were found in the situation part may be deleted. The reason for this constraint is that only the objects found in the situation part are definitely known to exist in STM. STM is searched only in the situation part, there is no searching in the action part. Objects may be deleted by name or in the order they were found, e.g. (using the definitions from section 6.3.1): R1: (A.A1 != A.A3) (^A FIRST A.A1 == 2) (B.B1 == FIRST.A3) => MARK A ; - 18 - This MARK statement will delete the object in the A list that met the test (A.A1 != A.A3). In some instances it may be desirable to delete an object that was not the first object that was found, e.g.: R1: (A.A1 != A.A3) (^A FIRST A.A1 == 2) (B.B1 == FIRST.A3) => MARK FIRST ; The A list object named 'FIRST' is the second object of type A to be found. It is specified as the object to delete by using it's free variable name. A MARK statement can specify a count of how many objects of a given type are to be deleted. A MARK statement may list any number of objects to delete, and each object to be deleted can have a separate MARK statement if desired. In no case can more objects be deleted than were found in the situation part. Each of the following examples is equivalent: R1: (A.A1 != A.A3) (^A FIRST A.A1 == 2) (B.B1 == FIRST.A3) => MARK B FIRST A ; R1: (A.A1 != A.A3) (^A FIRST A.A1 == 2) (B.B1 == FIRST.A3) => MARK 2A MARK B ; R1: (A.A1 != A.A3) (^A FIRST A.A1 == 2) (B.B1 == FIRST.A3) => MARK FIRST MARK A MARK B ; - 19 - _6._4._2. _A_D_D The ADD statement is used to add new objects to STM. As in the MARK statement, an ADD statement can specify one or several objects to add to STM. The value of each element of each object can be specified as in the STM section of the specification. Each object is inserted at the head of the appropriate list. The insertions are actually made in the opposite order that they are listed, the net effect is that the objects appear at the head of the list in the order they are specified. ADD and MARK statements may be intermixed in any order, e.g.: R1: (A.A1 != A.A3) (^A FIRST A.A1 == 2) (B.B1 == FIRST.A3) => MARK FIRST ADD A (A.A1 => 6 A.A3 => FIRST.A3) ADD B (B.B2 => FIRST.A2 B.B1 => 9) MARK B ; All the ADD statements will be executed before any MARK statements are executed regardless of the order of the statements in the action part. The statements are ordered by the compiler to insure that an ADD statement does not refer to an object that has already been MARKed. In the example above, the first ADD statement refers to the object named 'FIRST'. The object named 'FIRST' is MARKed in the previous statement. If the code were executed in the speci- fied order, the element 'FIRST.A3' would not exist when the ADD statement was executed. _6._4._3. _O_P_T_I_M_I_Z_E The OPTIMIZE statement is named for it's primary func- tion, hand optimization of LTM. There is also a built in optimizer that can be invoked. Optimization is discussed in detail in section 7. The OPTIMIZE statement can be thought of as an unconditional GOTO statement. In normal execution, after a rule fires the rules are tested from the beginning of LTM for the next rule that will fire. The OPTIMIZE statement can specify a point other than the start of LTM to begin testing rules. In addition to optimization, it can be used to impose a customized control structure on the set of rules. One example of the use of the OPTIMIZE statement is to implement a search for the absence of some object(s) in STM, - 20 - which is not otherwise supported by the TRC language. To search for the absence of some object(s), use two rules. The first rule searches for the presence of the object(s) in question, if the rule is true then the object(s) are not absent. If the rule fails, the object(s) are absent, e.g.: R1: /* effectively search for the absence of an object A with element A1 == 2 */ (A.A1 == 2) => /* If this rule is true, branch around the next rule */ OPTIMIZE R3 ; R2: /* If R1 fails, then there is no object A with element A1 == 2. An empty situation part such as this always evaluates to true */ => /* whatever you wish to do in response to the absence of A1 == 2 */ ; R3: /* continue here if R1 is true */ . . . _6._4._4. _C-_C_O_D_E A c-code may follow the MARK, ADD and OPTIMIZE state- ments. This is user code that is to be executed when a rule fires. C-code may not appear between MARK, ADD or OPTIMIZE statements. If it is necessary to refer to an object that is being MARKed in c-code, it should be done in the situa- tion part. A c-code may precede the arrow symbol that separates the situation and action parts. C-code in this position is equivalent to c-code in the situation part preceding the MARK, etc. statements, e.g.: R1: (^A FIRST) { /* this c-code follows all situation tests. It is effectively in the action part since it will execute only if the situation is - 21 - true */ some_procedure($FIRST.A1); } => MARK FIRST ; _7. _O_P_T_I_M_I_Z_E_R The optimizer does not produce code that is optimum in any sense. What it does is to perform a single, very useful code modification that can have a very positive impact on execution time. Consider the execution of an inference engine. Each rule is tested until one who's situation part is true is found. This rule's action part is then executed. When rules are being tested the problem space is being searched. When an action part is executed a step is taken in the solu- tion of the problem. Searching the problem space is clearly part of the solution, but the action part is where the the results occur. The goal, which is not attained, is to reduce the search time to zero. To attain this goal it would be neces- sary to know each time a rule fires which rule will fire next. This is generally not known. In particular when the inference engine begins execution, the contents of STM are not known, any rule can be the first rule to fire. Once a rule has fired and each time any rule fires a great deal of implicit knowledge about the contents of STM is obtained. It is known that no rule previous to the current rule is true and no rule previous to the current rule can be true after the execution of the current rule unless the current rule modifies STM in such a way as to make some previous rule true. This simple fact is the entire basis of the optimizer, which attempts to reduce the number of rules that are tested by deducing which rules can not possibly fire. Three tests must be performed to determine a candidate next rule, which is the first rule in LTM that can possibly fire after the current rule fires. The three tests are called the NOT test, the ADD test and the MARK test. The first case to be considered is the case of a rule which contains a NOT statement in the situation part. A NOT test is an explicit test for an empty list. When a rule that fires contains an ADD statement it will not be possible for any previous rule with a NOT statement referring to that list to be the next rule to fire. Likewise, if a rule that fires contains a MARK statement and no ADD statement refer- ring to that same list, it is possible that the list will - 22 - become empty making it possible for the rule with the NOT statement that previously failed to become true. If it is determined that it is possible for a rule to fire after the NOT test, that rule becomes the candidate rule and no further testing is done. Consider the case of a rule with no NOT statements that recursively searches STM for a situation. If this rule fails, it will continue to fail until something is added to STM to make it true. If all rules searched STM recursively it would be known when a rule fires that of the rules that precede the current rule, only those rules that search for something added to STM by the current rule can possibly fire in the next pass. If the current rule adds something to STM, control could continue with the first rule that searches for that something rather than the first rule in LTM. If no rule prior to the current rule searches for those things added to STM by the current rule or if the current rule adds nothing to STM then no prior rule can execute. Control could con- tinue with the current rule rather than at the beginning of LTM. By causing control to continue with a rule later than the first rule the amount of searching is reduced. The case of a rule that performs only a linear search on STM must also be considered. The previous conclusion about items being added to STM is still true; a rule that adds something to STM can cause a linear search rule to become true. With linear search it is also possible that a rule will become true if something is removed from STM. If a linear rule searches for several similar items which are present but not correctly ordered it is possible for this linear search to fail where a recursive search would not have failed. If there were excess items, removing one or more may cause a different linear assignment which could make a linear rule true. This is the MARK test. Examples of this situation are non-trivial, but where correctness is an issue these cases can not be overlooked. The TRC optimizer selects a continuation point for each rule based on what the rule adds to or deletes from STM rather than testing each rule from the beginning of LTM. The continuation point is the first rule that could fire based on the NOT and ADD tests for all rules, and the MARK test for linear rules. The TRC optimizer is somewhat naive in that it considers only items added or deleted with the ADD and MARK statements. The optimizer is unaware of any changes that may have been made to STM by user code. The caveat is if STM is modified in user code the optimizer may produce incorrect code. The optimizer, which can be invoked with a command line option (-O), tests each rule individu- ally and ignores those rules that were hand optimized in the specification.
sources-request@panda.UUCP (02/08/86)
Mod.sources: Volume 3, Issue 112 Submitted by: ihnp4!dicomed!ndsuvax!nckary (Daniel D. Kary) This is NOT a shell archive. Simply delete everything up to and including the cut mark and save the result as reference.2.doc. Dan Kary ihnp4!dicomed!ndsuvax!nckary -------------- cut here --------------- - 23 - PART TWO - OUTPUT _8. _O_V_E_R_V_I_E_W The output of TRC consists of several procedures writ- ten on several different files. The files contain the definitions and declarations of the data objects to be mani- pulated by the TRC generated inference engine, procedures to manipulate those data objects and a procedure which embodies the rules. The output of TRC is written on several files in the current directory. The file names generated are; add.c, dump.c, free.c, loop.c, misc.c, search.c relink.c, backtrack.c, profile.c, zero.c and save.c. In addition to these files, the user must provide at least a main procedure which will invoke the inference engine, e.g.: main() { /* 'loop' is the name of the procedure that embodies the rules and controls testing the rules */ loop(); } A sample Makefile is given here, the file main.c is user supplied code. # Makefile for expert systems generated by TRC PROG = loop OBJS = add.o backtrack.o dump.o free.o loop.o \ misc.o profile.o relink.o save.o \ search.o zero.o main.o INCS = loop.h CC = cc all: $(PROG) $(CC) -c -O $< $(OBJS): $(INCS) $(PROG): $(OBJS) $(CC) -o $@ $(OBJS) $(LIBS) _9. _C_O_M_M_O_N _P_R_O_C_E_D_U_R_E_S There are several utility procedures that are generated for each input file which are not dependent on the input. - 24 - These procedures, written on the file 'misc.c' perform rela- tional testing. test_int (a,b) int a, b; { if(a < b) return(4); if(a == b) return(2); return(1); } test_double (a,b) double a, b; { if(a < b) return(4); if(a == b) return(2); return(1); } test_string(a,b) char *a, *b; { int i; i = strcmp(a, b); if(i < 0) return(4); if(i == 0) return(2); return(1); } The relational operators are bit encoded in an integer; 'less-than' occupies bit two, 'equality' occupies bit one and 'greater-than' occupies bit zero. Each of these 'test' procedures returns an integer which indicates the relation between the operands. Examples of calls to these procedures are included in section X.X.X. There are eight possible values for a bit encoded relational operator; the generated code: < = > 0 0 0 /* never match */ 0 0 1 /* greater-than */ 0 1 0 /* equal */ 0 1 1 /* greater-than-or-equal */ 1 0 0 /* less-than */ 1 0 1 /* not-equal */ 1 1 0 /* less-than-or-equal */ 1 1 1 /* don't care */ In addition to the testing procedures, a procedure for dynamically allocating memory is written on the file 'misc.c'. This procedure checks for the out of memory - 25 - condition. Using this procedure to allocate memory obviates the need to check for the out of memory condition elsewhere in the code. char *myalloc(n) int n; { char *r; r = (char *) malloc(n); if(r == NULL){ fprintf(stderr,"OUT OF MEMORY"); fprintf(stderr,"TRC IS EXITING"); exit(0); } return(r); } _1_0. _D_A_T_A _O_B_J_E_C_T_S At several points in PART ONE, it was mentioned that a list is maintained for each object that has at least one element. Objects that do not contain elements can not be distinguished from one another, so no list is maintained, only a count of how many there are is needed. The struc- tures those lists are built from are defined in the file 'loop.h'. The example below gives a sample TRC definition part and the output that might be generated with that input: Input: %% A B (B1 : INT B2 : FLOAT B3 : STRING B4 : POINTER) %% Output: #define A 0 #define A_max 2 #define B 1 #define B_max 2 struct B_struct { int B1; double B2; char *B3; struct B_struct *B4; int MARK; struct B_struct *prev; struct B_struct *next; } *B_list[B_max], B_empty[2], *B_temp[B_max]; - 26 - There are two '#define' statements for each object. The first defines the object name to be an integer. This name is used for indexing arrays. The intention is to make code more readable by using the name of the object that is being referred to rather than a literal index number. At the points in the output code where these names are used, their meaning will be explained. The second '#define' asso- ciated with each object is used for specifying the number of pointers that are needed for each object. Since each rule is completely independent of each other rule, the same pointers may be reused in each rule. The maximum number of pointers needed is the maximum used by any single rule. Each object with at least one element has an associated structure. In this example the object A has no elements and therefore no structure. The object B contains four ele- ments, one of each type. The structure is named 'B_struct', each structure will be similarly named by appending '_struct' to the object name. A data object will be included in the structure for each element that was defined for the object. The element names defined in the input are used in the output, again to keep the output code readable and meaningful. The correspondence of input data types to output data types is straight forward; INT translates to int, FLOAT to double, STRING to char *, and POINTER to a pointer to a structure of the type that contains the POINTER. The POINTER type is included for users who wish to extend STM with user supplied code. There is no support for testing or searching POINTERs or anything they may point to. The 'B_list' and 'B_temp' pointers are used as free variables and place holders in the inference engine. The 'B_list[0]' pointer points to the list of B objects. STM consists of the various '*_list[0]' pointers, the lists they point to and the count of how many objects of each type exist at any given moment. _1_1. _M_A_N_I_P_U_L_A_T_I_N_G _T_H_E _D_A_T_A There are three basic manipulations that can be per- formed on the data in STM, an object can be added to STM, an object can be deleted from STM and STM can be searched for the existence of an object with given elements. Since each of the object types is defined by a separate structure, separate add, delete and search procedures must be created for each object type. The following sections give an exam- ple and an explanation of how each procedure is generated. _1_1._1. _A_D_D _P_R_O_C_E_D_U_R_E_S For each object that is defined, a procedure is gen- erated for inserting structures into the list associated with the object. These procedures are written on the file 'add.c'. The parameters that are passed to this procedure - 27 - are the values that are to be assigned to the elements of the object. The parameters are listed in the order that the elements were declared, e.g.: INPUT: A B (B1 : INT B2 : FLOAT B3 : STRING B4 : POINTER) OUTPUT: add_A_struct() { token[A]++; } add_B_struct(B1, B2, B3) int B1; double B2; char *B3; { struct B_struct *temp; temp = (struct B_struct *) myalloc(sizeof(struct B_struct)); temp->B1 = B1; temp->B2 = B2; temp->B3 = (char *) myalloc((strlen(B3)) + 1); strcpy(temp->B3, B3); temp->MARK = 0; temp->next = B_list[0]; temp->prev = NULL; if(B_list[0]) B_list[0]->prev = temp; B_list[0] = temp; token[B]++; } Since the A object contains no elements, adding an A object is trivial; the count is simply incremented. The variable 'token' is an integer array sized to have one integer for each object type. If there are N object types token is an array of N integers, indexed 0 through N-1. In 'add_A_struct' the array 'token' is indexed by A. Recall that A, the name of a type of object, was defined to be an integer, in this case 0. The integer 'token[0]' or 'token[A]' is the count of how many objects of type A exist. The procedure 'add_B_struct' is typical of add pro- cedures for objects with elements. The parameters passed in are the values that are to be assigned to the elements of the new object. Even though B_struct includes a POINTER - 28 - object, no value is assigned to that pointer. As has been mentioned earler, there is no support for the pointer type in TRC generated code. The code is very straight forward; allocate a structure, initialize it's elements (note that space is allocated for strings in the add procedure), insert it at the head of the doubly linked list and increment the count (token[B]++). The file also contains the procedure 'init()'. This procedure is based on the contents of the STM section of code. For each statement in STM, a statement appears in init. The statements are simply calls to the various add procedures. The calls are made in an order opposite the order the STM statements are listed. Since all additions to lists are made as insertions at the head of the list, inverting the order causes the final list to contain the objects in the order they were actually listed, e.g: INPUT: %% A B 5A B (B1 => 7 B2 => 6.6 B3 => "THIS") 5 B (B1 = 2) %% OUTPUT: init() { int i; for(i = 0; i < 5; i++) add_A_struct(2, 0.0, ""); add_B_struct(7, 6.6, "THIS"); for(i = 0; i < 5; i++) add_A_struct(); add_B_struct(0, 0.0, ""); add_A_struct(); } As you can see, this facility is pretty crude, each element that is listed in STM becomes a literal value in the code. These literal values are then copied into dynamically allocated memory, so there are actually two copies of all the data in memory. The intention is that the STM section and the init procedure will be used primarily during development and testing and will be replaced by a user developed front end that will collect the data and create the dynamic structures for the TRC code. It is possible that there is some small amount of data that must always be - 29 - loaded into STM for a given set of problems, it may be con- venient to have the init procedure load this data into STM. _1_1._2. _M_A_R_K _P_R_O_C_E_D_U_R_E_S For each object that is defined, a procedure is gen- erated for deleting structures from the list associated with the object. These procedures are written on the file 'free.c'. The parameter passed to this procedure indicates which object is to be deleted from the list, e.g.: INPUT: A B (B1 : INT B2 : FLOAT B3 : STRING B4 : POINTER) OUTPUT free_A_struct() { token[A]--; } free_B_struct(start) int start; { int i; for(i = start; i < B_max; i++) if(B_list[i]){ if(B_list[i]->prev == NULL) B_list[0] = B_list[0]->next; else B_list[i]->prev->next = B_list[i]->next; if(B_list[i]->next) B_list[i]->next->prev = B_list[i]->prev; free(B_list[i]->B3); free(B_list[i]); B_list[i] = NULL; i = B_max; token[B]--; } } As in the add procedures, the procedure to delete an object with no elements is trivial; decrement the count of objects of that type. The procedure 'free_B_struct' is typ- ical of procedures for deleting an object from a list. Recall that 'B_list[0]' points to the list of B objects in STM and that the other 'B_list' pointers are used as free variables. Each match statement in the situation part - 30 - causes STM to be searched for an object. If an object that matches the test exists in STM, a pointer to that object is returned and assigned to one of the pointers in the 'B_list' array. Recall that only objects that were found in the situation part can be MARKed in the action part and that objects may be MARKed by name or the order in which they are found. There are two cases, the case where a named object is to be MARKed and the case where the first object found is to be MARKed. In the case where a named object is to be MARKed, the name of the object is translated to the index number of the pointer that points to that object. This index number is passed to the free procedure. In cases where the object is being MARKed based on the order it was found, the index 1 (the first free variable used) is passed. Examples of calls to 'free_B_struct' are given in section X.X.X. The array of pointers to objects of the given type (B_list in this case) is searched for the first one that points to an object. This object is unlinked from the list, any space allocated for strings in the object being deleted is freed and finally the space occupied by the structure itself is freed. The count of objects in the list is decre- mented and the 'for' loop counting variable is set to the exit condition. _1_1._3. _S_E_A_R_C_H _P_R_O_C_E_D_U_R_E_S For each object that is defined, a procedure is gen- erated for searching list associated with the object. The procedure simply performs a linear search on the list in question. The RECURSIVE search strategy is implemented as multiple calls to the LINEAR search procedure. These pro- cedures are written on the file 'search.c'. The parameter passed to this procedure indicates where in the list to begin searching, e.g.: INPUT: A B (B1 : INT B2 : FLOAT B3 : STRING B4 : POINTER) OUTPUT: struct B_list *search_B_list(index, B1, B1_relop, B2, B2_relop, B3, B3_relop) int index, B1_relop, B2_relop, B3_relop; int B1; double B2; char *B3; - 31 - { int flag struct B_struct *temp; temp = B_temp[index]; while(temp){ if(temp->MARK == 0){ flag = 7; if(flag & test_int(temp->B1, B1) & B1_relop); else flag = 0; if(flag & test_double(temp->B2, B2) & B2_relop); else flag = 0; if(flag & test_string(temp->B3, B3) & B3_relop); else flag = 0; if(flag){ temp->MARK = 1; return(temp); } } temp = temp->next; } return(NULL); } Since the object A has no elements, there is no list and no search procedure for A objects. In the procedure to search the B list the first parameter, 'index', is the index of the pointer that points to the point in the list where the search is to begin. The remaining parameters are the value that each element is to be compared against and the bit encoded relational operator for the comparison. The first test (temp->MARK==0) checks to see if the object is already 'in use'. Each object mentioned in the situation part must match a unique object, the same object can not match two situation part statements. An object is marked as 'in use' by setting MARK to non-zero. If the object is not 'in use', it's elements are tested, one at a time, against the required values with the required rela- tional operator. The procedures test_int, test_float and test_string return the bit encoded relation of the two values. This relation is bitwise ANDed with the bit encoded relational operator that was passed in. If the result of the bitwise AND is non-zero, the relation is true for those two values. The 'flag' variable ensures that if one test fails, all subsequent tests will fail. If an object is found where all elements match the desired values, it's MARK integer is set to one to indicate that it is 'in use' and a pointer to that object is returned to the calling procedure. If one or more elements of an object fail a test, the next object in the list is tested. If all objects are tested and none match, a NULL pointer is - 32 - returned. This search procedure will work only for searches where the value that is being searched for is known before the call. In cases where an element is being compared to some other element of the same object, a slightly different ver- sion of the search procedure is generated, e.g.: INPUT: %% B (B1:INT B2:INT B3:INT) %% %% R1: (B.B1 == B.B2) (B.B3 < B.B1 B.B1 > B.B2) (B.B3 < B.B2) => MARK B B B ; %% OUTPUT: struct B_struct *search_B_struct(index, B1, B1_relop, B1_case, B2, B2_relop, B3, B3_relop, B3_case) int index, B1_relop, B2_relop, B3_relop; int B1; int B2; int B3; { int flag; struct B_struct *temp; temp = B_temp[index]; while(temp){ if(temp->MARK == 0){ flag = 7; switch(B1_case){ case 0: if(flag & test_int(temp->B1, B1) & B1_relop); else flag = 0; break; case 1: if(flag & test_int(temp->B1, temp->B2) & B1_relop); else flag = 0; break; default: flag = 0; - 33 - } if(flag & test_int(temp->B2, B2) & B2_relop); else flag = 0; switch(B3_case){ case 0: if(flag & test_int(temp->B3, B3) & B3_relop); else flag = 0; break; case 1: if(flag & test_int(temp->B3, temp->B1) & B3_relop); else flag = 0; break; case 2: if(flag & test_int(temp->B3, temp->B2) & B3_relop); else flag = 0; break; default: flag = 0; } if(flag){ temp->MARK = 1; return(temp); } } temp = temp->next; } return(NULL); } As can be seen in the example, the procedure is quite similar. A 'case' variable has been added to the parameter list for each element which might be compared to another element of the same object. Case 0 is the situation where an element is being compared to a value, all other cases are comparisons of an element to another element of the same object. Only the cases that are actually used are gen- erated, not all possible cases. There is an obvious code overhead for comparing ele- ments within an object, but this overhead occurs only once for each type of comparison. Subsequent rules could include similar element to element comparisons without adding any additional code overhead. _1_2. _T_R_A_N_S_L_A_T_I_N_G _R_U_L_E_S The LTM section is translated to a single procedure named 'loop' which is written on the file 'loop.c'. An inference engine is executed by calling the procedure 'init', which is written on the file 'add.c' followed by a - 34 - call to 'loop'. The loop procedure will test rules in the order they were listed until no rule's situation part is true or until the user code executes a return or exit. A simple two rule system will be used to illustrate the trans- lation: INPUT: %% B (B1:INT B2:INT B3:INT) %% %% R1: EMPTY B NAMED (B.B1 == B.B2) { $NAMED.B1 = some_procedure(); if(some_other_procedure($NAMED.B1)) $FAIL. } (B.B1 != NAMED.B1) => MARK B B { printf("Rule R1 fired0); } ; R2: RECURS (B.B1 != 7) (^B FIRST B.B1 == 7) (B.B2 <= FIRST.B3) => MARK FIRST ADD B (B1 => 0 B2 => FIRST.B3 B3 => FIRST.B2) ; %% OUTPUT: loop() { int i; Start: R1: if((token[B] >= 2) && 1){ B_temp[1] = B_list[0]; if((B_list[1] = search_B_struct (1, 0, 2, 1, 0, 7, 0, 7)) == NULL){ restore(); - 35 - goto R2; } B_empty[0].B1 = some_procedure(); if(some_other_procedure(B_empty[0].B1)) { restore(); goto R2; } B_temp[2] = B_list[0]; if((B_list[2] = search_B_struct (2, B_empty[0].B1, 5, 0, 0, 7, 0, 7)) == NULL){ restore(); goto R2; } for(i = 0; i < 2; i++) free_B_struct(1); restore(); printf("Rule R1 fired0); goto Start; } R2: if((token[B] >= 3) && 1){ B_temp[1] = B_list[0]; R2_B_1: if(B_list[1]) B_list[1]->MARK = 0; if((B_list[1] = search_B_struct (1, 7, 5, 0, 0, 7, 0, 7)) == NULL){ restore(); goto End; } B_temp[1] = B_list[1]->next; B_temp[2] = B_list[0]; R2_B_2: if(B_list[2]) B_list[2]->MARK = 0; if((B_list[2] = search_B_struct (2, 7, 2, 0, 0, 7, 0, 7)) == NULL){ goto R2_B_1; } B_temp[2] = B_list[2]->next; B_temp[3] = B_list[0]; R2_B_3: if(B_list[3]) B_list[3]->MARK = 0; if((B_list[3] = search_B_struct (3, 0, 7, 0, B_list[2]->B3, 6, 0, 7)) == NULL){ goto R2_B_2; } B_temp[3] = B_list[3]->next; add_B_struct(0, B_list[2]->B3, B_list[2]->B2); free_B_struct(2); restore(); - 36 - goto Start; } End: return(1); } A rule is translated to an extended 'if' statement. Basically, "if situation then action". Each rule begins with a label that repeats the rule label from the input. The label 'Start' marks the beginning of the rules and the label are included as a convenient way to exit (goto End;) or restart (goto Start;). The code for rule 'R1' begins at the label 'R1' and ends at the label 'R2'. The first statement, "if((token[B] >= 2))", is a pre-test. The array 'token[]' contains a count of how many objects are in each list. Token[B] is the count of how many objects are in the B list. Since rule 'R1' specifys two objects of type B in it's situation part, it is pointless to search the B list if it contains fewer than 2 objects. A statement similar to this is the first in every rule. STM is never searched unless there are enough objects that it is possible for the rule to fire. If this initial test fails, testing will continue at label 'R2'. The next statement, 'B_temp[1] = B_list[0];' initial- izes a pointer to point to the beginning of the B list. The index of this pointer is passed to the search procedure. This use of indirection is not necessary in LINEAR rules but it is convenient in RECURSIVE rules, the same calling tech- nique is used by both search strategies to simplify the code generation. The call to the search procedure is embedded in an 'if' statement along with an assignment to the free variable pointer that will point to the object if it exists in STM. The parameter list in this call consists first of '1', the index of the temp pointer that indicates the start of the search. The next value '0', is the value that the first declared element, 'B1', is to be compared against. The next parameter, '2', is the bit encoded relational operator, equality. The next parameter, '1', is the 'case' of this test. Since it is not zero, 'B1' is not being compared to the value but rather is being compared in this case to the element 'B2' of the same object. Values for elements 'B2' and 'B3' were not specified, so those parameters are filled in with the default value of '0' and relational operator '7' which is the bit encoded 'don't care' operator. If 'NULL' is returned, the object does not exist in STM and the rule fails. A linear rule is made to fail by clearing all free variables (restore();) and continuing with the next rule (goto R2;). - 37 - If the first test does not fail, execution continues with the next statement, which is the translated version of the embedded c-code from rule 'R1'. The string '$NAMED' is translated to 'B_empty[1]' which is the name of the struc- ture that was named by the EMPTY statement. The string '$FAIL.' is translated to the statements "restore(); goto R2;", which cause the rule to fail in the standard manner. The next 'if' statement is identical in form to the previous one, only the values of the elements are different. In this case element 'B1' is being compared to 'NAMED.B1' which, again, is translated to B_empty[0].B1. If this test fails, the pointers will be cleared and execution will con- tinue at 'R2'. If it does not fail, the action part is exe- cuted. The action part of rule 'R1' consists of a MARK state- ment and c-code which contains a 'printf' call. The MARK statement is translated to a 'for' loop which deletes the first object that was found in each of it's calls to 'free_B_struct'. The 'restore();' statement follows all ADD and MARK statements in the action part to clear any active free variables. The c-code 'printf' comes next followed by 'goto Start;' which causes the rule list to be searched again for the first rule whose situation part is true. The form of the second rule is quite similar to the first rule. Since rule 'R2' is RECURSIVE, some minor differences are evident. The first difference is that the start of each test for the existence of an object is labelled. This is to permit backing up to the previous test. The second difference is that only the first test contains the 'restore' and 'goto' statements. All other tests simply back up one position if they fail. The 'B_temp' variables now store the location where the search is to be restarted if some test fails. In the action part of rule 'R2', the call to 'free_B_struct' passes the value '2', indicating that the second object that was found is the one to delete. This was specified with the statement 'MARK FIRST', where the object named 'FIRST' was the second object of type B specified in the situation part. _1_3. _O_P_T_I_O_N_S Options may cause additional procedures to be generated and sometimes cause standard procedures to be modified. This section will detail the effects each option has on the output. - 38 - _1_3._1. _P_R_O_F_I_L_E The intention of the profile option is to provide a summary of the execution of the inference engine. The pro- file option causes the procedure 'loop' to be modified and an additional procedure is written on the file 'profile.c', e.g.: INPUT: %% B (B1:INT B2:INT B3:INT) %% %% PROFILE R1: (B.B1 == B.B2) (B.B1 != B.B2) => MARK B B ; R2: (B.B1 != 7) (^B FIRST B.B1 == 7) (B.B2 <= FIRST.B3) => MARK FIRST ; %% OUTPUT: loop() { int i; Start: test_profile[0]++; R1: test_profile[1]++; if((token[B] >= 2) && 1){ B_temp[1] = B_list[0]; R1_B_1: test_profile[2]++; if((B_list[1] = search_B_struct (1, B_list[1]->B2, 2, 0, 7, 0, 7)) == NULL){ restore(); goto R2; } B_temp[2] = B_list[0]; R1_B_2: test_profile[3]++; if((B_list[2] = search_B_struct - 39 - (2, B_list[1]->B2, 5, 0, 7, 0, 7)) == NULL){ restore(); goto R2; } fire_profile[1]++; for(i = 0; i < 2; i++) free_B_struct(1); restore(); goto Start; } R2: test_profile[4]++; if((token[B] >= 3) && 1){ B_temp[1] = B_list[0]; R2_B_1: test_profile[5]++; if((B_list[1] = search_B_struct (1, 7, 5, 0, 7, 0, 7)) == NULL){ restore(); goto End; } B_temp[2] = B_list[0]; R2_B_2: test_profile[6]++; if((B_list[2] = search_B_struct (2, 7, 2, 0, 7, 0, 7)) == NULL){ restore(); goto End; } B_temp[3] = B_list[0]; R2_B_3: test_profile[7]++; if((B_list[3] = search_B_struct (3, 0, 7, B_list[2]->B3, 6, 0, 7)) == NULL){ restore(); goto End; } fire_profile[2]++; free_B_struct(2); restore(); goto Start; } End: test_profile[8]++; return(1); } } The 'loop' procedure that is generated with the PROFILE option turned on is differs from the standard procedure in several ways. Each test in each rule is labeled whether it is RECURSIVE or not. Each label is followed by a statement - 40 - of form 'test_profile[N]++', causing the array test_profile to maintain a count of how many times the following code was executed. The action part of each rule begins with a state- ment of form 'fire_profile[N]++', causing the array fire_profile to maintain a count of how many times each rule fired. The PROFILE option causes the arrays test_profile and fire_profile to be defined and properly sized. It also defines two character arrays, label_names[] and rules[] to be defined. These character arrays contain the names of each label and each rule respectively. The procedure print_profile is also generated. This procedure will print the names of each label and it's associated count on the standard output, e.g.: OUTPUT: print_profile() { int i, t; t = 0; printf("0ules Tested0); for(i = 0; i < 9; i++){ printf("%d%s0,test_profile[i], label_names[i]); t += test_profile[i]; } printf("%d0, t); t = 0; printf("0ules Fired0); for(i = 1; i < 3; i++){ printf("%d%s0,fire_profile[i], rules[i]); t += fire_profile[i]; } printf("%d0, t); } _1_3._2. _T_R_A_C_E The TRACE option causes the 'loop' procedure to be modified and an additional procedure is written on the file 'misc.c'. The modification to 'loop' is simply the inclu- sion of a procedure call of form 'append_trace(N);' (where N is an integer literal) in the action part of the rule. The parameter is the index of the name of the rule in the char- acter array 'rules' that is generated by the PROFILE option. The PROFILE option only keeps a count of the number of times a rule fires, the TRACE option records the ORDER that the rules were fired. struct trace { int rule; struct trace *next; } *trace_front, *trace_back; - 41 - append_trace(i) int i; { struct trace *temp; temp = (struct trace *) myalloc (sizeof(struct trace)); temp->rule = i; temp->next = NULL; if(trace_front){ trace_back->next = temp; trace_back = trace_back->next; } else trace_front = trace_back = temp; } _1_3._3. _D_U_M_P The DUMP option generates a set of procedures written on the file 'dump.c'. A procedure of form 'dump_NAME_struct()' (where NAME is the name of the object) is generated for each object declared in the definition sec- tion. There is also a procedure 'dump_stm()' which simply calls the other dump procedures in the order that the objects were defined. Each procedure prints the number of objects in that list and the current values of the elements of each object in tabular form on the standard output. INPUT: %% A B (B1:INT B2:INT B3:INT) %% OUTPUT: dump_stm() { dump_A_struct(); dump_B_struct(); } dump_A_struct() { printf("0umping A list (%d)0,token[A]); } dump_B_struct() { int i; struct B_struct *temp; i = 1; - 42 - printf("0umping B list (%d)0,token[B]); temp = B_list[0]; while(temp){ printf("%d.%d%d%d0, i , temp->B1 , temp->B2 , temp->B3); temp = temp->next; i++; } } _1_3._4. _B_A_C_K_T_R_A_C_K The BACKTRACKing option is easily the most complex. While other options usually have a minor effect on the out- put, BACKTRACKing will often double the size of the code generated by TRC. BACKTRACK modifies the add and loop pro- cedures and generates two new procedures, insert_backtrack and backup, on the file 'backtrack.c'. The intent of backtracking is to make it possible to undo the action part of a rule and continue as if the rule had never fired. This facility is useful in systems where the first possible path through the problem space may not lead to a solution or may not lead to the preferred solu- tion. In the code produced by TRC, backtracking will occur whenever no rule's situation part is true and there is a rule which can be undone. A rule is undone by restoring STM to the state it was in before the rule fired and continuing testing at the rule following the rule being undone. There are two obvious ways to restore STM. The first is to save all of STM each time a rule fires. To undo a rule, simply replace STM with the previously saved version. This strategy can be expensive in time and space if STM is large and/or many rules fire. The second strategy is to save only the modifications to STM, to restore STM simply reverse the modifications. The second strategy is employed by TRC. The backtracking strategy is implemented by building a stack in memory which contains all known modifications made to STM by a rule which fires. The only modifications that the backtracking code is aware of are those modifications made by ADD and MARK statements in the action part or by calls to add and relink (discussed below) procedures in embedded c-code in the action part. Modifications made by embedded c-code that do not use the add or relink procedures will not be known to the TRC code. It is the responsibility of the knowledge engineer to insure that any modifications that must be undone are known to TRC. - 43 - The backtracking stack is built in the following manner; whenever a rule fires a new structure is placed on the backtrack stack. This structure contains a count of how many of each object are added by this rule. Since all adds are insertions to the front of the list, the specific objects that were added are implicitly known. MARKed objects are unlinked from their STM lists and relinked into the backtrack structure along with an indication of where they were in the STM list. STM can now be restored by relinking the MARKed objects into their previous position and deleting objects that were added to the front of the STM lists. An example follows: INPUT: %% A B (B1:INT B2:INT B3:INT) %% %% BACKTRACK R1: (B.B1 == B.B2) (B.B1 != B.B2) => MARK B B ; R2: (B.B1 != 7) (^B FIRST B.B1 == 7) (B.B2 <= FIRST.B3) => MARK FIRST ; %% OUTPUT: struct back_track_stack { int Add_A; int mark_A; int Add_B; struct B_struct *mark_B; int next_rule; struct back_track_stack *next; } *backtrack; insert_backtrack(rule) int rule; { struct back_track_stack *temp; temp = (struct back_track_stack *) - 44 - myalloc(sizeof(struct back_track_stack)); temp->next_rule = rule; temp->Add_A = 0; temp->mark_A = 0; temp->Add_B = 0; temp->mark_B = NULL; temp->next = backtrack; backtrack = temp; } The struct back_track_stack, pointed to by 'backtrack', is where the backtracking data is maintained. The struct back_track_stack contains two variables for each object that is defined. The variables are of form 'Add_NAME' and 'mark_NAME', where 'NAME' is the name of the object. The variable of form 'Add_name' is always an integer, it indi- cates how many objects of the named type were added to STM by this rule. The variable of form 'mark_NAME' is an integer for objects that do not contain elements (and there- fore have no associated list) and a pointer for objects that do contain elements. The procedure 'insert_backtrack' allo- cates a structure, places it at the head of the list pointed to by 'backtrack' and initializes it's variables. loop() { int i; Start: R1: if((token[B] >= 2) && 1){ B_temp[1] = B_list[0]; if((B_list[1] = search_B_struct (1, B_list[1]->B2, 2, 0, 7, 0, 7)) == NULL){ restore(); goto R2; } B_temp[2] = B_list[0]; if((B_list[2] = search_B_struct (2, B_list[1]->B2, 5, 0, 7, 0, 7)) == NULL){ restore(); goto R2; } insert_backtrack(1); for(i = 0; i < 2; i++) relink_B_struct(1); restore(); goto Start; } R2: if((token[B] >= 3) && 1){ B_temp[1] = B_list[0]; - 45 - if((B_list[1] = search_B_struct (1, 7, 5, 0, 7, 0, 7)) == NULL){ restore(); goto End; } B_temp[2] = B_list[0]; if((B_list[2] = search_B_struct (2, 7, 2, 0, 7, 0, 7)) == NULL){ restore(); goto End; } B_temp[3] = B_list[0]; if((B_list[3] = search_B_struct (3, 0, 7, B_list[2]->B3, 6, 0, 7)) == NULL){ restore(); goto End; } insert_backtrack(2); relink_B_struct(2); restore(); goto Start; } End: if(backtrack){ i = backtrack->next_rule; backup(); switch(i){ case 1: goto R2; case 2: goto End; default: goto End; } } return(1); } Minor changes are made in the action part of each rule. The action part begins with a call to 'insert_backtrack', which places a structure on top of the backtrack stack. The integer literal that is passed by this procedure indicates which rule is firing. This information is used to determine which rule to test next when this rule is undone. Objects that are to be deleted are deleted with calls to procedures of form 'relink_NAME_struct' where 'NAME' is the name of the affected object. The relink procedures are similar to the free procedures, except they link the object to the backtrack stack instead of freeing it. The relink procedures store a value in the object's variable MARK to indicate the former position of the object in it's list. Recall that the MARK variable is usually used to indicate
sources-request@panda.UUCP (02/09/86)
Mod.sources: Volume 3, Issue 113 Submitted by: ihnp4!dicomed!ndsuvax!nckary (Daniel D. Kary) This is NOT a shell archive. Simply delete everything up to and including the cut mark and save the result as reference.3.doc. Dan Kary ihnp4!dicomed!ndsuvax!nckary -------------- cut here --------------- - 46 - that an object is 'in-use'. The MARK variable is not needed for it's original purpose when it is in the backtrack stack. OUTPUT: relink_A_struct(start) int start; { backtrack->mark_A++; token[A]--; } relink_B_struct(start) int start; { int i, j; struct B_struct *temp; for(i = start; i < B_max; i++) if(B_list[i]){ temp = B_list[0]; j = 0; while(temp != B_list[i]){ temp = temp->next; j++; } if(B_list[i]->prev == NULL) B_list[0] = B_list[i]->next; else B_list[i]->prev->next = B_list[i]->next; if(B_list[i]->next) B_list[i]->next->prev = B_list[i]->prev; B_list[i]->MARK = j; B_list[i]->next = backtrack->mark_B; backtrack->mark_B = B_list[i]; B_list[i] = NULL; i = B_max; token[B]--; } } The backtracking action itself is initiated after the label 'End:' in the procedure 'loop'. If there is something on the backtrack stack, the index of the last rule that fired is copied out and the procedure 'backup', which undoes the actions of the last rule that fired, is called. Execu- tion continues with the rule that follows the last rule that fired. The procedure 'backup' first restores objects that were MARKed by the last rule and then removes objects that were ADDed. The MARKed objects are restored to their origi- nal positions in the list. OUTPUT: backup() - 47 - { int i; struct back_track_stack *temp; struct B_struct *B_temp, *B_temp2; if(backtrack == NULL) return; token[A] += backtrack->mark_A; token[A] -= backtrack->Add_A; while(backtrack->mark_B){ B_temp2 = backtrack->mark_B; backtrack->mark_B = backtrack->mark_B->next; B_temp2->prev = NULL; B_temp2->next = NULL; B_temp = B_list[0]; if(B_temp){ for(i = 0; i < B_temp2->MARK; i++) if(B_temp->next) B_temp = B_temp->next; else i = B_temp2->MARK + 1; } else i = -1; if(i == B_temp2->MARK){ B_temp2->next = B_temp; B_temp2->prev = B_temp->prev; if(B_temp->prev) B_temp->prev->next = B_temp2; else B_list[0] = B_temp2; B_temp->prev = B_temp2; } else{ if(B_temp){ B_temp->next = B_temp2; B_temp2->prev = B_temp; B_temp2->next = NULL; } else B_list[0] = B_temp2; } B_temp2->MARK = 0; token[B]++; } for(i = 0; i < backtrack->Add_B; i++){ B_list[1] = B_list[0]; free_B_struct(1); } temp = backtrack; backtrack = backtrack->next; free(temp); } The procedures generated by the BACKTRACKing option can - 48 - be called from embedded c-code to implement user controlled backtracking. A rule may undo itself with the following two statements: backup(); goto NEXT_RULE; The label 'NEXT_RULE' is replaced with the label of the rule that follows the current rule (this is done by the knowledge engineer, not TRC). To undo the current rule and the rule that fired previous to the current rule, use the following two statements: backup(); goto End; The label 'End' always refers to the point that follows the action part of the last rule. Appendix C contains a small expert system that implements backtracking with calls in embedded c-code. _1_3._5. _Z_E_R_O The ZERO option generates code that will free all dynamic structures allocated by TRC generated code. It is useful in systems where the loop procedure is entered more than once. It may be necessary to clean up the remains of a previous execution before beginning a new one. A single procedure, 'zero()', is written on the file 'zero.c'. The zero procedure will free all the elements and all the objects in STM and zero the arrays that hold the counts of objects in each list. If BACKTRACKing is enabled, zero will free all the objects and elements in the backtrack stack. If the TRACE option is enabled, zero will free all the entries in the trace list. If the PROFILE option is enabled, zero will set the value of all the integer array elements to zero. The actual code produced for the zero procedure is uninteresting and is not reproduced here. _1_3._6. _S_A_V_E The SAVE option writes a set of procedures on the file 'save.c' that simplify building expert systems capable of checkpointing their own execution. Procedures are generated for saving and reloading each type of dynamic structure that might be generated. In each case the procedures are passed a file pointer that points to an open file. The code pro- duced by the save procedures is uninteresting so only the names are listed here. The purpose of each procedure should be obvious from it's name: save_stm(fp) load_stm(fp) - 49 - save_backtrack(fp) load_backtrack(fp) save_profile(fp) load_profile(fp) save_trace(fp) load_trace(fp) Appendix C presents a small expert system that uses the SAVE option to generate code for checkpointing the execution of the system. The example includes an automatic restart capability. - 50 - APPENDIX A: TRC GRAMMAR The grammar for TRC which is presented throughout PART ONE of this document is presented in it's entirety here. identifier ::= letter { underscore | letter | digit} letter ::= upper-case-letter | lower-case-letter f-p ::= [ minus ] digits dot digits integer-literal ::= [ minus ] digits digits ::= digit { digit } string-literal ::= quote { [ character ] } quote comment ::= slash asterisk { [ character ] } asterisk slash c_code ::= left-bracket { [character] | [c_code] } right-bracket definition ::= identifier definition ::= identifier left-paren item-list right-paren item-list ::= { [ item ] } item ::= identifier colon type type ::= INT | FLOAT | STRING | POINTER stm ::= { [ entry ] } entry ::= [ integer-literal ] identifier entry ::= [ integer-literal ] identifier left-paren { [ init-item ] } right-paren init-item ::= identifier arrow value value ::= integer-literal value ::= floating-point-literal value ::= string-literal ltm ::= { [option] } { rule } option ::= ZERO | PROFILE | BACKTRACK | DUMP | RECURS | NORECURS | SAVE | TRACE | PREFIX identifier rule ::= label situation arrow action semicolon label ::= identifier colon | colon situation ::= { [ s-option ] } { [ match ] } s-option ::= EMPTY identifier identifier s-option ::= RECURS | NORECURS match ::= [ integer-literal ] identifier match ::= NOT identifier match ::= [ integer-literal ] left-paren name match-list right-paren match ::= c-code name ::= hat identifier identifier match-list ::= { match-item } match-item ::= identifier dot identifier relop literal match-item ::= identifier dot identifier relop identifier dot identifier relop ::= equality | not-equal | less-than relop ::= greater-than | greater-than-or-equal relop ::= less-than-or-equal action ::= statements c-code statements ::= { [statement] } statement ::= MARK mark-list statement ::= ADD add-list statement ::= OPTIMIZE identifier - 51 - mark-list ::= { [ mark-item ] } mark-item ::= [ integer-literal ] identifier add-list ::= { [ entry ] } - 52 - APPENDIX B: ERROR MESSAGES Error messages are listed and explained here in alpha- betical order. All messages that refer to the input file begin with the line number of the input file where the error was noticed. This line number is not necessarily the line where the error occurred, the actual error could be on an earlier line. The notations %s and %o mean that a string or an octal number, respectively, from the input file is included in the error message. %s is not an element of %s The named object does not include the named ele- ment. cannot translate %s in rule %s A c-code in rule %s contains an identifier %s that is preceded by a dollar character. The identifier is not known to TRC. This will occur if a dollar character occurs at some random point in the c- code. This error will also occur if an identifier is misspelled. can't have %s and NOT %s in the same rule NOT %s is a test that is true only when %s is an empty list. Obviously a list may not contain an object and be empty so the statement is meaning- less to TRC. can't MARK an EMPTY object An EMPTY object is an object that exists outside of STM. The scope of an EMPTY object is the rule in which it is declared. Since EMPTY objects are not in STM, attempting to MARK one is meaningless. can't mark more %s's than are found Unless STM is searched for an object and the object is found it is not possible to remove that object from STM. STM is searched only in the situation part, anything to be deleted in the action part must have been found in the situation part. - 53 - count on free variables undefined The purpose of a free variable is to assign a name to a specific object. Placing a count in front of a free variable definition is meaningless because the name can be assigned to only one object. degenerate case please rewrite A match in a rule compares an element to itself. The result of comparing an element to itself is known without performing any test. TRC refuses to generate the extra code that this useless test requires. duplicate declaration -> %s Object names must be unique, %s is the name of the object that is mentioned twice in the definition section. duplicate name in definition -> %s Each element in an object definition must have a unique name. free variable already defined -> %s The scope of a free variable is a single rule. Free variable names may be reused in every rule, but only once per rule. label repeats object declaration -> %s Rule labels and object names must be distinct from one another to prevent name conflicts in the out- put source code. negative count is undefined Be serious. How can a list contain less than zero items? newline embedded in string The scanner attempts to prevent errors caused by forgetting to terminate a string. For that reason - 54 - literal newlines are not permitted in strings. A newline and other control characters can be embed- ded in a string using the normal UNIX and C escapes. no code produced due to errors in source TRC will generate code only when there are no errors in the source. object field must be a string object field must be double object field must be integer TRC enforces strong type checking for the three data types, all assignments and relational tests must involve elements of the same type. objects in a complex test must match Here is an example of this type of error: (A.A1 == 2 B.B1 == 3) Because a single set of parens bracket this test it is presumed to be a test for a single object. A single object can not be in both list A (A.A1) and list B (B.B1). Either there is a typo in one of object names or some parens are missing. OUT OF MEMORY TRC IS EXITING This message is not generated by TRC, rather it is generated by the code that TRC produces. It will occur when the TRC generated inference engine attempts to dynamically allocate a data object. If the allocate fails, the TRC generated code prints this message and exits. redefined label -> %s Every rule must have a distinct label. semantic error: use a free variable This message suggests a solution to the perceived - 55 - problem. It is printed when the right hand side of a relational test mentions an object name that is different from the object name mentioned on the right hand side. This type of test can be accom- plished, but the item on the right hand side must be found first and must have a free variable name assigned to it. syntax error syntax error in ADD statement syntax error in MARK statement syntax error in OPTIMIZE statement syntax error in definitions syntax error in header syntax error in previous rule syntax error in short term memory syntax error in trailer A syntax error is generated by the parser when it can not reduce the input tokens with any of it's rules. The current input token will be discarded and the parser will attempt to reduce the new input. At least three input tokens must be parsed before the parser will assume it is in sync. Once the parser finds an error it will throw tokens out until it can sync. This is the reason why semi- colons are used as a rule terminator, they provide an absolute point where the parser can sync no matter how badly the input is botched. This behavior is common to YACC generated parsers. All syntax error messages indicate the line that was being scanned when the error was noticed and most inform the user of what section of the code was being parsed. types of element (%s) and value (%s) do not match Strong type checking is enforced by TRC. Literals must be of the same type as the element they are being assigned to. Floating point literals MUST contain a decimal point. There can be no cross assignment between integer and floating point ele- ments. types of %s.%s and %s.%s do not match Strong type checking is enforced by TRC. Only elements of identical types may be compared. unable to attach %s to the standard input - 56 - The scanner actually reads the standard input. The file named on the command line could not be opened for reading and attached to the standard input. unable to open %s Open failed on one of the output files. TRC aborts. unable to recover from earlier errors The parser completely wigged out, this usually happens when the input terminates before the parser can resync. undefined element -> %s.%s The object %s does not have an element %s. undefined flag (%c) A command line argument included a compiler flag that is not defined. Use 'man trc' to get a manual page. undefined free variable -> %s A reference was made to a free variable on the right hand side of a relational test. That free variable was not attached to an object in the current rule. Remember that the scope of a free variable is a single rule. undefined object -> %s The name %s was used as an object name but not defined as such in the definition section. undefined object field -> %s.%s The object %s was defined, but it did not contain an element %s. unexpected '!' unexpected '%' - 57 - unexpected '=' unexpected or undefined character: %o These messages are generated by the scanner. These characters, when not embedded in a comment, string or code section, are meaningful only as part of a compound symbol (e.g !=, ==, %%). A single character is not returned to the parser since it will only propagate errors. unterminated C code unterminated comment These elements of the input are handled completely by the scanner. These messages are printed if the end of the input file is reached before the ter- minating character is found. Each of these mes- sages indicate the line of the input where scan- ning began. usage: trc [options] filename Command line error. Use 'man trc' to get a manual page. zero count is undefined Nice try. If you really want to search STM for zero occurrences of something use the NOT state- ment described in Section 6.3.1 of this document. - 58 - APPENDIX C: STYLE NOTES TRC was designed to produce fast code, but it is not the least bit difficult to produce very slow code with TRC. The intent of this section is to suggest some things that can be done that will lead to fast code and to suggest some ways to avoid creating slow code. The central issue is reducing the amount of time spent searching STM. In the battle against long search times, TRC's first line of defense is the definition section. Think of STM as a data base. When a data base is designed two issues are central: first the data base must be capable of representing all the facts that are to be stored and retrieved and second the data base should be arranged in a manner that will facilitate searching the data base. In a relational data base, the data base manager will designate primary keys based, in part, on the way that users are likely to specify searches. Think of STM as a data base, the rules in LTM are the users that are searching the data base, design the data base (STM) for searching. Suppose an expert system for routing cargo on commer- cial air carriers is being built. The objects that this expert system will deal with include airplanes and cargo. It is certainly possible to define a single TRC object whose elements can describe either an airplane or a piece of cargo. When a rule searches for an airplane in this system, it has to wade thru cargo and airplanes to find the airplane it needs. Why not define two different objects, one that describes airplanes and one that describes cargo. Then a rule that is searching for an airplane can search only the airplane list without having to wade thru the cargo too. Carried to an extreme, this suggestion implies a dif- ferent object definition for every combination of attributes that can exist in STM. This extreme will often not be feas- able. There is a trade off to be made; by defining more objects, the length of each list should be reduced which should reduce execution time. The penalty is that for each object there is a code overhead for the procedures that manipulate those objects. Code size is being traded for execution speed. For object definitions that do not include any elements there is a very low code overhead. Object definitions that do not include elements are useful when the objects do not differ from one another and only a count of how many there are is needed. Since objects of this type do not differ, no list is maintained. If STM can be represented entirely with objects that contain no elements, all searching will be eliminated. This situation usually leads to the fastest executing systems. - 59 - The situation part of a rule is the second line of defense against slow expert systems. On each pass thru LTM, only one rule is selected for firing. This implies that most rules fail most of the time and that is is somewhat unusual for a rule to actually fire. Since rules generally fail far more often than they fire, wouldn't it be reason- able to design rules to be good at failing, i.e. fail quickly? The preconditions automatically placed on every rule by TRC are an initial attempt to cause a rule to fail without doing any searching. If a rule searches for an object in list A, one in list B and one in list C, the order that the list are searched may not be significant. The order will not be significan unless one of the searches refers to something that was pre- viously found. If the order is not significant, why not first search for the object that is least likely to exist? If the objects are equally likely, why not first search the list that is likely to be shortest? The search is carried out in the order that the objects are listed in the situa- tion part. Remember that rules usually fail and design your rules to fail quickly wherever possible. The optimizer is the third line of defense, use it. - 60 - A SAMPLE SYSTEM This sample expert system demonstrates some of the features of the TRC language. The expert system finds a path from one node to another node in a 'dungeon'. Some of the nodes are marked as 'dangerous', and no path may go through that node. The main procedure prints a map of the 'dungeon' and asks the user for start and end nodes. It then initializes STM and calls loop. On return from loop it prints out the path (if one was found) and the execution time and profile. The path is not necessarily the shortest path, only the first path found. Cycles are not permitted. This sample system uses backtracking that is initiated by embedded c-code. It also uses the SAVE option to sim- plify the checkpoint and reloading procedures. The pro- cedure 'checkpoint' saves the state of all dynamic struc- tures and 're_do' restores from a previous checkpoint. If the system is started with no command line arguments, it simply queries the user for start and end points. If it is started one or more command line arguments, it will restart from a previously saved snapshot. Since it is possible for a system to crash while the checkpoint files are being written, this system writes alternately on two sets of files. A flag file indicates which set of files is complete. INPUT: %% END (E1:STRING) NODE (N1:STRING N2:STRING N3:STRING) PATH (P1:STRING) START (S1:STRING) %% NODE ( N1 => "ANEMONIE" N2 => "DANGER" N3 => "QUAGGA") NODE ( N1 => "ANEMONIE" N2 => "DANGER" N3 => "YENTI") NODE ( N1 => "ANEMONIE" N2 => "SAFE" N3 => "MEADOW") NODE ( N1 => "ANEMONIE" N2 => "SAFE" N3 => "KESTREL") NODE ( N1 => "BANDIT" N2 => "SAFE" N3 => "JABBERWOCK") NODE ( N1 => "BANDIT" N2 => "SAFE" N3 => "PEGASUS") NODE ( N1 => "BANDIT" N2 => "SAFE" N3 => "LAPIS LASULI") NODE ( N1 => "CAVERN" N2 => "SAFE" N3 => "ICE ROOM") NODE ( N1 => "CAVERN" N2 => "SAFE" N3 => "TREASURE") NODE ( N1 => "CAVERN" N2 => "DANGER" N3 => "HOBGOBLIN") NODE ( N1 => "DUBLOON" N2 => "SAFE" N3 => "ICE ROOM") NODE ( N1 => "DUBLOON" N2 => "DANGER" N3 => "HOBGOBLIN") NODE ( N1 => "DUBLOON" N2 => "SAFE" N3 => "SPRING") NODE ( N1 => "ELVES" N2 => "SAFE" N3 => "NYMPH") NODE ( N1 => "ELVES" N2 => "SAFE" N3 => "URCHIN") NODE ( N1 => "FOUNTAIN" N2 => "SAFE" N3 => "RUBY") - 61 - NODE ( N1 => "FOUNTAIN" N2 => "SAFE" N3 => "NYMPH") NODE ( N1 => "FOUNTAIN" N2 => "DANGER" N3 => "XEROC") NODE ( N1 => "GROTTO" N2 => "DANGER" N3 => "WRAITH") NODE ( N1 => "GROTTO" N2 => "SAFE" N3 => "OGRE") NODE ( N1 => "GROTTO" N2 => "SAFE" N3 => "RUBY") NODE ( N1 => "HOBGOBLIN" N2 => "SAFE" N3 => "TREASURE") NODE ( N1 => "HOBGOBLIN" N2 => "SAFE" N3 => "CAVERN") NODE ( N1 => "HOBGOBLIN" N2 => "SAFE" N3 => "ICE ROOM") NODE ( N1 => "HOBGOBLIN" N2 => "SAFE" N3 => "DUBLOON") NODE ( N1 => "HOBGOBLIN" N2 => "SAFE" N3 => "MEADOW") NODE ( N1 => "ICE ROOM" N2 => "SAFE" N3 => "CAVERN") NODE ( N1 => "ICE ROOM" N2 => "DANGER" N3 => "HOBGOBLIN") NODE ( N1 => "ICE ROOM" N2 => "SAFE" N3 => "DUBLOON") NODE ( N1 => "JABBERWOCK" N2 => "SAFE" N3 => "MEADOW") NODE ( N1 => "JABBERWOCK" N2 => "SAFE" N3 => "VERMIN") NODE ( N1 => "JABBERWOCK" N2 => "SAFE" N3 => "LAPIS LASULI") NODE ( N1 => "JABBERWOCK" N2 => "DANGER" N3 => "BANDIT") NODE ( N1 => "JABBERWOCK" N2 => "SAFE" N3 => "PEGASUS") NODE ( N1 => "JABBERWOCK" N2 => "DANGER" N3 => "ZOMBIE") NODE ( N1 => "KESTREL" N2 => "DANGER" N3 => "YENTI") NODE ( N1 => "KESTREL" N2 => "SAFE" N3 => "ANEMONIE") NODE ( N1 => "KESTREL" N2 => "DANGER" N3 => "QUAGGA") NODE ( N1 => "LAPIS LASULI" N2 => "SAFE" N3 => "VERMIN") NODE ( N1 => "LAPIS LASULI" N2 => "SAFE" N3 => "JABBERWOCK") NODE ( N1 => "LAPIS LASULI" N2 => "DANGER" N3 => "BANDIT") NODE ( N1 => "MEADOW" N2 => "DANGER" N3 => "HOBGOBLIN") NODE ( N1 => "MEADOW" N2 => "SAFE" N3 => "ANEMONIE") NODE ( N1 => "MEADOW" N2 => "SAFE" N3 => "JABBERWOCK") NODE ( N1 => "MEADOW" N2 => "SAFE" N3 => "NYMPH") NODE ( N1 => "MEADOW" N2 => "DANGER" N3 => "WRAITH") NODE ( N1 => "NYMPH" N2 => "SAFE" N3 => "MEADOW") NODE ( N1 => "NYMPH" N2 => "SAFE" N3 => "ELVES") NODE ( N1 => "NYMPH" N2 => "SAFE" N3 => "URCHIN") NODE ( N1 => "NYMPH" N2 => "DANGER" N3 => "XEROC") NODE ( N1 => "NYMPH" N2 => "SAFE" N3 => "FOUNTAIN") NODE ( N1 => "NYMPH" N2 => "SAFE" N3 => "RUBY") NODE ( N1 => "NYMPH" N2 => "SAFE" N3 => "URCHIN") NODE ( N1 => "OGRE" N2 => "SAFE" N3 => "SPRING") NODE ( N1 => "OGRE" N2 => "DANGER" N3 => "WRAITH") NODE ( N1 => "OGRE" N2 => "SAFE" N3 => "GROTTO") NODE ( N1 => "PEGASUS" N2 => "DANGER" N3 => "BANDIT") NODE ( N1 => "PEGASUS" N2 => "SAFE" N3 => "JABBERWOCK") NODE ( N1 => "PEGASUS" N2 => "DANGER" N3 => "ZOMBIE") NODE ( N1 => "QUAGGA" N2 => "SAFE" N3 => "KESTREL") NODE ( N1 => "QUAGGA" N2 => "SAFE" N3 => "ANEMONIE") NODE ( N1 => "QUAGGA" N2 => "SAFE" N3 => "VERMIN") NODE ( N1 => "RUBY" N2 => "SAFE" N3 => "GROTTO") NODE ( N1 => "RUBY" N2 => "SAFE" N3 => "NYMPH") NODE ( N1 => "RUBY" N2 => "SAFE" N3 => "FOUNTAIN") NODE ( N1 => "SPRING" N2 => "SAFE" N3 => "DUBLOON") NODE ( N1 => "SPRING" N2 => "DANGER" N3 => "WRAITH") NODE ( N1 => "SPRING" N2 => "SAFE" N3 => "OGRE") NODE ( N1 => "TREASURE" N2 => "SAFE" N3 => "CAVERN") NODE ( N1 => "TREASURE" N2 => "DANGER" N3 => "HOBGOBLIN") - 62 - NODE ( N1 => "TREASURE" N2 => "DANGER" N3 => "YENTI") NODE ( N1 => "URCHIN" N2 => "SAFE" N3 => "ELVES") NODE ( N1 => "URCHIN" N2 => "SAFE" N3 => "NYMPH") NODE ( N1 => "URCHIN" N2 => "DANGER" N3 => "XEROC") NODE ( N1 => "VERMIN" N2 => "DANGER" N3 => "QUAGGA") NODE ( N1 => "VERMIN" N2 => "SAFE" N3 => "LAPIS LASULI") NODE ( N1 => "VERMIN" N2 => "SAFE" N3 => "JABBERWOCK") NODE ( N1 => "WRAITH" N2 => "SAFE" N3 => "MEADOW") NODE ( N1 => "WRAITH" N2 => "SAFE" N3 => "SPRING") NODE ( N1 => "WRAITH" N2 => "SAFE" N3 => "OGRE") NODE ( N1 => "WRAITH" N2 => "SAFE" N3 => "GROTTO") NODE ( N1 => "XEROC" N2 => "SAFE" N3 => "URCHIN") NODE ( N1 => "XEROC" N2 => "SAFE" N3 => "NYMPH") NODE ( N1 => "XEROC" N2 => "SAFE" N3 => "FOUNTAIN") NODE ( N1 => "YENTI" N2 => "SAFE" N3 => "KESTREL") NODE ( N1 => "YENTI" N2 => "SAFE" N3 => "ANEMONIE") NODE ( N1 => "YENTI" N2 => "SAFE" N3 => "TREASURE") NODE ( N1 => "ZOMBIE" N2 => "SAFE" N3 => "JABBERWOCK") NODE ( N1 => "ZOMBIE" N2 => "SAFE" N3 => "PEGASUS") %% BACKTRACK DUMP TRACE PROFILE SAVE ZERO R1: /* See if we are at the end */ (^START HERE) (END.E1 == HERE.S1) => { printf("Found a path"); remove_checkpoint(); return; } ; R2: /* See if the next node that would be selected is already in the path. If it is, remove it from consideration */ (^START HERE) (^NODE NEXT NODE.N2 != "DANGER" NODE.N1 == HERE.S1) (PATH.P1 == NEXT.N3) => MARK NODE { } ; R3: /* Select the first non-dangerous node as the next node */ - 63 - (^START HERE) (^NODE NEXT NODE.N2 != "DANGER" NODE.N1 == HERE.S1) => MARK START NODE ADD START(S1 => NEXT.N3) PATH (P1 => NEXT.N3) { } ; R4: /* A dead end has been reached. Undo the last path taken and mark it as dangerous in R5 */ => { /* undo this rule */ backup(); /* check if there was a previous rule */ while(backtrack){ if(backtrack->next_rule == 5) backtrack = backtrack->next; else{ backup(); /* undo it */ goto R5; /* and mark it as dangerous */ } } /* no solution */ goto R6; } ; R5: /* Grab the first available path and call it dangerous */ (^START HERE) (^NODE NEXT NODE.N2 != "DANGER" NODE.N1 == HERE.S1) => MARK NODE ADD NODE (N1 => NEXT.N1 N2 => "DANGER" N3 => NEXT.N3) { checkpoint(); } ; R6: /* No solution */ => { printf("No Solution"); remove_checkpoint(); return; - 64 - } ; %% MAIN.C #include <ctype.h> #include <sys/file.h> #include <sys/time.h> #include "X_loop.h" extern char *rule_names[]; char *node_names[26] = { "ANEMONIE", "BANDIT", "CAVERN", "DUBLOON", "ELVES", "FOUNTAIN", "GROTTO", "HOBGOBLIN", "ICE ROOM", "JABBERWOCK", "KESTREL", "LAPIS LASULI", "MEADOW", "NYMPH", "OGRE", "PEGASUS", "QUAGGA", "RUBY", "SPRING", "TREASURE", "URCHIN", "VERMIN", "WRAITH", "XEROC", "YENTI", "ZOMBIE" }; int danger[26] = { 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1 }; extern int X_fire_profile[], X_test_profile[]; char *start, *xit; main(argc, argv) int argc; char *argv[]; { int i, fire, test; char c[512]; double d1, d2, d3; struct timeval tp, old_tp; struct timezone tzp; setbuf(stdout, NULL); if(argc > 1){ if(re_do()){ X_dump_PATH_struct(); test = fire = 0; for(i = 0; i < 17; i++) test += X_test_profile[i]; for(i = 0; i < 7; i++) fire += X_fire_profile[i]; X_zero(); printf("%d rules tested",test); printf("%d rules fired",fire); } printf("Continue [y or n] "); - 65 - scanf("%s",c); if(isupper(c[0])) c[0] = tolower(c[0]); if(c[0] == 'n') exit(0); } while(1){ start = xit = NULL; print_map(); while(start == NULL){ printf("Input start node "); scanf("%s",c); if(isupper(c[0])) c[0]=tolower(c[0]); if(islower(c[0])) start = node_names[c[0]-'a']; } while(xit == NULL){ printf("Input exit node "); scanf("%s",c); if(isupper(c[0])) c[0]=tolower(c[0]); if(islower(c[0])) xit = node_names[c[0]-'a']; } printf("initializing"); X_init(); X_backtrack = (struct X_back_track_stack *) malloc(sizeof(struct X_back_track_stack)); X_add_END_struct(xit); X_add_START_struct(start); X_add_PATH_struct(start); free(X_backtrack); X_backtrack = NULL; gettimeofday(&old_tp, &tzp); X_loop(); gettimeofday(&tp, &tzp); X_dump_PATH_struct(); d1 = old_tp.tv_sec; d2 = old_tp.tv_usec; d1 += d2/1000000; d2 = tp.tv_sec; d3 = tp.tv_usec; d2 += d3/1000000; d3 = d2 - d1; test = fire = 0; for(i = 0; i < 17; i++) test += X_test_profile[i]; for(i = 0; i < 7; i++) fire += X_fire_profile[i]; X_zero(); d1 = test; d2 = fire; printf("Elapsed time = %f seconds", d3); - 66 - printf("%d rules tested (%f rules/sec)",test, d1/d3); printf("%d rules fired (%f rules/sec)",fire, d2/d3); printf("Continue [y or n] "); scanf("%s",c); if(isupper(c[0])) c[0] = tolower(c[0]); if(c[0] == 'n') exit(0); } } print_map() { printf(" NODE NAMES (* DANGEROUS NODES) C - I "); printf(" ------------------------------- / \\ / \\ "); printf(" ANEMONIE NYMPH T - H - D "); printf(" BANDIT* OGRE / | \\ "); printf(" CAVERN PEGASUS Y | S "); printf(" DUBLOON QUAGGA* / | | | \\ "); printf(" ELVES RUBY K - A --- M --- W - O "); printf(" FOUNTAIN SPRING \\ / / \\ \\ / "); printf(" GROTTO TREASURE Q / \\ G "); printf(" HOBGOBLIN* URCHIN | / \\ | "); printf(" ICE ROOM VERMIN V / \\ R "); printf(" JABBERWOCK WRAITH* / \\ / \\ / \\ "); printf(" KESTREL XEROC* L - J - Z E - N - F "); printf(" LAPIS LASULI YENTI* \\ / \\ / \\ / \\ / "); printf(" MEADOW ZOMBIE* B - P U - X "); } int state = 0; char *lock = "lock.0"; char *stm = "stm.0"; char *back = "back.0"; char *pro = "pro.0"; char *trace = "trace.0"; checkpoint() /* save a snapshot of stm, back_track_stack, etc. */ { FILE *fp; lock[5] = '0' + state; stm[4] = '0' + state; back[5] = '0' + state; pro[4] = '0' + state; trace[6] = '0' + state; if(state) state = 0; else state = 1; unlink(lock); fp = fopen(stm, "w"); X_save_stm(fp); - 67 - fclose(fp); fp = fopen(back, "w"); X_save_backtrack(fp); fclose(fp); fp = fopen(pro, "w"); X_save_profile(fp); fclose(fp); fp = fopen(trace, "w"); X_save_trace(fp); fclose(fp); fp = fopen(lock, "w"); fprintf(fp,"good"); fclose(fp); } remove_checkpoint() /* remove all old snapshots */ { unlink(lock); unlink(stm); unlink(back); unlink(pro); unlink(trace); lock[5] = '0' + state; stm[4] = '0' + state; back[5] = '0' + state; pro[4] = '0' + state; trace[6] = '0' + state; unlink(lock); unlink(stm); unlink(back); unlink(pro); unlink(trace); } re_do() /* restore from a snapshot and continue execution */ { char c[512]; FILE *fp; if((fp = fopen(lock, "r")) != NULL) fscanf(fp,"%s", c); if(strncmp(c, "good", 4)){ if(fp) fclose(fp); if(state) state = 0; else state = 1; lock[5] = '0' + state; stm[4] = '0' + state; back[5] = '0' + state; pro[4] = '0' + state; - 68 - trace[6] = '0' + state; if((fp = fopen(lock, "r")) != NULL) fscanf(fp,"%s", c); if(strncmp(c, "good", 4)){ remove_checkpoint(); printf("No checkpoint files"); return(0); } } fclose(fp); fp = fopen(stm, "r"); X_load_stm(fp); fclose(fp); fp = fopen(back, "r"); X_load_backtrack(fp); fclose(fp); fp = fopen(pro, "r"); X_load_profile(fp); fclose(fp); fp = fopen(trace, "r"); X_load_trace(fp); fclose(fp); X_loop(); return(1); }
sources-request@panda.UUCP (02/09/86)
Mod.sources: Volume 3, Issue 114 Submitted by: ihnp4!dicomed!ndsuvax!nckary (Daniel D. Kary) : This is a shar archive. Extract with sh, not csh. : The rest of this file will extract: : Makefile main.c main.h optimize.c echo extracting - Makefile sed 's/^X//' > Makefile << '!EOR!' Xtrc : y.tab.o main.o output.o optimize.o p_out.o X cc y.tab.o main.o output.o optimize.o p_out.o -o trc X Xy.tab.o : y.tab.c scanner.c main.h X cc -c y.tab.c X Xy.tab.c : parser X yacc parser X Xmain.o : main.c main.h X cc -c main.c X Xoutput.o : output.c main.h X cc -c output.c X Xp_out.o : p_out.c main.h X cc -c p_out.c X Xoptimize.o : optimize.c main.h X cc -c optimize.c !EOR! echo extracting - main.c sed 's/^X//' > main.c << '!EOR!' X#include <stdio.h> X#include <signal.h> X#include "main.h" X#define total_files 12 X XFILE *fp[total_files]; Xchar *file_name[total_files] = { X "loop.h", X "loop.c", X "free.c", X "misc.c", X "search.c", X "add.c", X "dump.c", X "relink.c", X "backtrack.c", X "profile.c", X "zero.c", X "save.c" X}; X Xbus_error() X{ X fprintf(stderr,"%d: unable to recover from earlier errors\n", lineno); X exit(0); X} X X Xint next_label; X Xmain(argc, argv) Xint argc; Xchar *argv[]; X{ X int i, j; X char s[512]; X X setbuf(stdout, NULL); X errors = pnum = total_tokens = 0; X tracing = profiling = backtracking = 0; X dumping = optimizing = recursing = 0; X pascal = saving = zeroing = 0; X next_label = lineno = 1; X prefix = ""; X if(argc < 2){ X fprintf(stderr,"usage: %s [options] file\n", argv[0]); X exit(); X } X for(i = 1; i < (argc-1); i++){ X j = 0; X while(argv[i][j]){ X switch(argv[i][j]){ X case '-': break; X case 'b': backtracking++; X break; X case 'd': dumping++; X break; X case 'p': profiling++; X break; X case 'r': recursing++; X break; X case 's': saving++; X break; X case 't': tracing++; X break; X case 'z': zeroing++; X break; X case 'O': optimizing++; X break; X case 'P': pascal++; X break; X default: fprintf(stderr,"undefined flag (%c)\n",argv[i][1]); X break; X } X j++; X } X } X if((freopen(argv[argc-1], "r", stdin)) == NULL){ X fprintf(stderr,"unable to attach %s to the standard input\n", argv[argc-1]); X exit(0); X } X insert_init(); /* initialize this list */ X signal(SIGBUS, bus_error); X yyparse(); /* parse the input */ X rule_list = rule_list->next; /* the first entry is just a place holder */ X rule_list->prev = NULL; X nice_token_stuff(); X if(errors){ X fprintf(stderr,"no code produced due to errors in source\n"); X exit(); X } X if(pascal){ X if(optimizing) X optimize(); X p_translate(); X exit(); X } X for(i = 0; i < total_files; i++){ X strcpy(s, prefix); X strcat(s, file_name[i]); X if((fp[i] = fopen(s,"w")) == NULL){ X fprintf(stderr,"unable to open %s\n",s); X exit(0); X } X if(i){ X fprintf(fp[i],"#include\t\"%sloop.h\"\n\n",prefix); X if(i > 1){ X fprintf(fp[i],"extern int %stotal_tokens, %stoken[];\n", prefix, prefix); X fprintf(fp[i],"extern char *%stoken_name[];\n", prefix); X fprintf(fp[i],"extern int %srestoring;\n", prefix); X } X else{ X fprintf(fp[i],"int %srestoring = 0;\n", prefix); X } X } X } X header = fp[0]; X loop = fp[1]; X fre = fp[2]; X misc = fp[3]; X search = fp[4]; X add = fp[5]; X dump = fp[6]; X relink = fp[7]; X backtrack = fp[8]; X profile = fp[9]; X zero = fp[10]; X save = fp[11]; X if(optimizing) X optimize(); X translate(); X} X Xyyerror(s) char *s;{ X return; X} X Xyywrap(){ X return(1); X} X X Xchar *gen_next_label() X/* generate a unique rule label */ X{ X int i, j; X char *r, *s, t[8]; X X /* an unlikely prefix */ X s = "ZqWr_"; X i = 0; X j = next_label++; X while(j){ X t[i++] = '0' + (j % 10); X j = j/10; X } X t[i] = '\0'; X r = (char *) malloc (i + 6); X strcpy(r, s); X strcat(r, t); X return(r); X X} X X Xnice_token_stuff() X/* remove superfluous stuff from the case list */ X{ X struct def_list *d_temp; X struct data_type *dt_temp; X struct case_list *c_temp; X X d_temp = token_list; X while(d_temp){ X dt_temp = d_temp->data_types; X while(dt_temp){ X if(dt_temp->elts){ X while((dt_temp->elts) && (dt_temp->elts->used == 0)) X dt_temp->elts= dt_temp->elts->next; X c_temp = dt_temp->elts; X while(c_temp){ X if((c_temp->next) && (c_temp->next->used == 0)) X c_temp->next = c_temp->next->next; X c_temp = c_temp->next; X } X } X dt_temp = dt_temp->next; X } X d_temp = d_temp->next; X } X} X X Xbuild_case_list() X/* search the token_list for possible cross compare items */ X{ X int i; X struct def_list *d_temp, *d_temp2; X struct data_type *dt_temp, *dt_temp2; X struct case_list *c_temp, *c_temp2; X X d_temp = token_list; X while(d_temp){ X dt_temp = d_temp->data_types; X while(dt_temp){ X i = 1; X dt_temp2 = d_temp->data_types; X while(dt_temp2){ X if((dt_temp != dt_temp2) && (dt_temp->type == dt_temp2->type)){ X if(dt_temp->elts){ X c_temp->next = (struct case_list *) malloc(sizeof(struct case_list)); X c_temp = c_temp->next; X } X else{ X dt_temp->elts = (struct case_list *) malloc(sizeof(struct case_list)); X c_temp = dt_temp->elts; X } X c_temp->next = NULL; X c_temp->name = dt_temp2->name; X c_temp->id = i++; X c_temp->used = 0; X } X dt_temp2 = dt_temp2->next; X } X dt_temp = dt_temp->next; X } X d_temp = d_temp->next; X } X} X Xcmp_type(object, elt1, elt2) X/* compare the types of elt1 and elt2 to see if they match */ Xchar *object, *elt1, *elt2; X{ X struct def_list *d_temp; X struct data_type *dt_temp; X struct case_list *c_temp; X X d_temp = token_list; X while(d_temp){ X if(strcmp(object, d_temp->name) == 0){ X dt_temp = d_temp->data_types; X while(dt_temp){ X if(strcmp(elt1, dt_temp->name) == 0){ X c_temp = dt_temp->elts; X while(c_temp){ X if(strcmp(elt2, c_temp->name) == 0) X return(1); /* the types are equal */ X else X c_temp = c_temp->next; X } X return(-1); /* the types are different */ X } X dt_temp = dt_temp->next; X } X return(-1); /* the types were not found */ X } X d_temp = d_temp->next; X } X} X X Xinsert_label(s) X/* insert the label of a rule into the name_list */ Xchar *s; X{ X struct list *temp; X X if(! first_label) X first_label = s; X temp = (struct list *) malloc(sizeof(struct list)); X temp->name = s; X if(name_list) X temp->next = name_list; X else X temp->next = NULL; X name_list = temp; X rule_list->label = s; X} X X Xinsert_token(s) X/* insert the name of an object into the token_list */ Xchar *s; X{ X struct def_list *temp, *temp2; X X if(token_list){ /* if the list is not nil */ X temp = token_list; X while(temp){ /* search for duplicates */ X if(strcmp(s, temp->name) == 0){ X fprintf(stderr,"%d: duplicate declaration -> %s\n",lineno, s); X errors++; X return(0); X } X temp = temp->next; X } X } X temp = (struct def_list *) malloc(sizeof(struct def_list)); X temp->name = s; X temp->data_types = data_list; X temp->next = NULL; X total_tokens++; X data_list = NULL; X temp2 = token_list; X if(temp2){ X while(temp2->next) X temp2 = temp2->next; X temp2->next = temp; X } X else X token_list = temp; X} X X Xinsert_fields(element, value, free_name, type, index) X/* insert a fields structure into the current object in the init list */ Xchar *element, *value, *free_name; Xint type, index; X{ X struct fields *temp; X struct match *temp2; X X temp2 = rule_list->complex; X temp = (struct fields *) malloc(sizeof(struct fields)); X temp->empty = 0; X temp->object = NULL; X if(free_name) X while(temp2){ X if(strcmp(temp2->free_name, free_name) == 0){ X temp->empty = temp2->empty; X temp->object = temp2->object; X temp2 = NULL; X } X else temp2 = temp2->next; X } X temp->index = index; X temp->element = element; X temp->value = value; X temp->type = type; X temp->next = init_list->items; X init_list->items = temp; X} X X Xinsert_init() X/* insert an empty structure in the stm initialization list */ X{ X struct init *temp; X X temp = (struct init *) malloc(sizeof(struct init)); X temp->count = 0; X temp->object = NULL; X temp->next = NULL; X temp->items = NULL; X if(init_list){ X temp->next = init_list; X init_list = temp; X } X else init_list = temp; X} X X Xinsert_count(n) X/* initialize count in init_list */ Xint n; X{ X init_list->count = n; X} X X Xinsert_rule() X/* insert an empty structure in the list of rules */ X{ X struct rule *temp; X X temp = (struct rule *) malloc(sizeof(struct rule)); X temp->recurs = 0; X temp->prev = NULL; X temp->opt = NULL; X temp->label = NULL; X temp->search = (int *) calloc(total_tokens , sizeof(int)); X temp->mark = (int *) calloc(total_tokens , sizeof(int)); X if(rule_list){ X temp->next = rule_list; X rule_list->prev = temp; X rule_list = temp; X } X else{ X temp->next = NULL; X rule_list = temp; X } X} X X Xfind_name(s) X/* test to see if a rule label has already been used */ Xchar *s; X{ X struct list *temp; X X X temp = name_list; X while(temp){ X if(strcmp(s, temp->name) == 0){ X return(-1); X } X temp = temp->next; X } X return(0); X} X Xfind_token(s) X/* test to see if an object name has already been used */ Xchar *s; X{ X int i; X struct def_list *temp; X X i = 0; X temp = token_list; X while(temp){ X if(strcmp(s, temp->name) == 0){ X return(i); X } X temp = temp->next; X i++; X } X return(-1); X} X X Xadd_struct(s,i) X/* add a structure to define an object to the data_list */ Xchar *s; Xint i; X{ X struct data_type *a, *b; X X /* first check for duplicate definitions */ X a = data_list; X while(a){ X if(strcmp(s, a->name) == 0) X return(-1); X a = a->next; X } X /* add the new element to the list */ X a = (struct data_type *) malloc(sizeof(struct data_type)); X a->name = s; X a->type = i; X a->elts = NULL; X a->next = NULL; X if(data_list){ X b = data_list; X while(b->next != NULL) X b = b->next; X b->next = a; X } X else X data_list = a; X return(2); X} X X Xappend_code(s) X/* append a line of C code to a temporary buffer */ Xchar *s; X{ X struct list *a, *b; X X a = (struct list *) malloc(sizeof(struct list)); X a->name = s; X a->next = NULL; X if(temp_c_code){ X b = temp_c_code; X while(b->next != NULL) X b = b->next; X b->next = a; X } X else X temp_c_code = a; X} X X Xopt(s) Xchar *s; X{ X rule_list->opt = s; X} X X Xdo_header() X/* copy the temporary C code to the header */ X{ X header_code = temp_c_code; X temp_c_code = NULL; X} X X Xdo_trailer() X/* copy the temporary C code to the trailer */ X{ X trailer_code = temp_c_code; X temp_c_code = NULL; X} X X Xdo_init_list(s) X/* add an object to the list of objects to be initialized */ Xchar *s; X{ X int found; X struct fields *temp; X struct def_list *d_temp; X struct data_type *dt_temp, *dt_temp2; X X if(init_list->items){ X d_temp = token_list; X while(strcmp(d_temp->name, s) != 0) X d_temp = d_temp->next; X temp = init_list->items; X while(temp){ X found = 0; X dt_temp = d_temp->data_types; X while(dt_temp){ X if(strcmp(dt_temp->name, temp->element) == 0){ X dt_temp2 = dt_temp; X dt_temp = NULL; found = 1; X } X if(dt_temp) dt_temp = dt_temp->next; X } X if(found == 0){ X fprintf(stderr,"%d: %s is not an element of %s\n",lineno, X temp->element, s); X errors++; X } X else if((temp->type >= 0) && (temp->type <= 2)){ X if(temp->type != dt_temp2->type){ X fprintf(stderr,"%d: types of element (%s) and value (%s) do not match\n", lineno, temp->element, temp->value); X errors++; X } X } X temp = temp->next; X } X } X init_list->object = s; X} X X Xdo_code() X/* copy the temporary C code to the current rule's action part */ X{ X rule_list->c_code = temp_c_code; X temp_c_code = NULL; X} X X Xdo_mark(s) X/* MARK an object */ Xchar *s; X{ X int i; X X i = find_token(s); X if(i < 0) { X return(0); X } X rule_list->mark[i]++; X return(1); X} X X Xmark_free(free_name) X/* mark an object by name */ Xchar *free_name; X{ X struct match *temp; X X temp = rule_list->complex; X while(temp){ X if(strcmp(free_name, temp->free_name) == 0){ X if(temp->empty){ X fprintf(stderr,"%d: can't MARK an EMPTY object\n", lineno); X errors++; X } X else{ X temp->mark = 1; X } X return(1); X } X temp = temp->next; X } X return(0); X} X Xfind_free(free_name) X/* determine if free_name is a declared free variable */ Xchar *free_name; X{ X struct match *temp; X X temp = rule_list->complex; X while(temp){ X if(strcmp(free_name, temp->free_name) == 0) X return(temp->index); X temp = temp->next; X } X return(-1); X} X X Xmatch_type(object, o_elt, free_name, f_elt) X/* check to see if the types match */ Xchar *object, *o_elt, *free_name, *f_elt; X{ X int type; X struct match *temp; X struct def_list *temp2, *temp4; X struct data_type *temp3, *temp5; X X temp = rule_list->complex; X while(temp){ X if(strcmp(free_name, temp->free_name) == 0){ X temp2 = token_list; X while(temp2){ X if(strcmp(temp->object, temp2->name) == 0){ X temp3 = temp2->data_types; X while(temp3){ X if(strcmp(temp3->name, f_elt) == 0){ X type = temp3->type; X temp4 = token_list; X while(temp4){ X if(strcmp(temp4->name, object) == 0){ X temp5 = temp4->data_types; X while(temp5){ X if(strcmp(temp5->name, o_elt) == 0){ X if(type == temp5->type){ X if(strcmp(temp->object, object)) X return(1); X else X return(2); X } X else X return(0); X } X temp5 = temp5->next; X } X return(0); X } X temp4 = temp4->next; X } X return(0); X } X temp3 = temp3->next; X } X return(0); X } X temp2 = temp2->next; X } X return(0); X } X temp = temp->next; X } X return(0); X} X X Xadd_count(n) X/* add the count of how many objects to search X to the rule's situation part */ Xint n; X{ X struct match *temp; X X if((rule_list->complex) && (n > 1)){ X temp = rule_list->complex; X while(temp->next) X temp = temp->next; X temp->count = n; X } X} X X Xadd_test_code() X/* insert C language code in the situation part */ X{ X struct match *temp; X X if(rule_list->complex){ /* not the first match in a rule */ X temp = rule_list->complex; X while(temp->next) X temp = temp->next; X temp->next = (struct match *) malloc(sizeof(struct match)); X temp = temp->next; X } X else{ /* first match in a rule */ X rule_list->complex = (struct match *) malloc(sizeof(struct match)); X temp = rule_list->complex; X } X temp->tests = NULL; X temp->next = NULL; X temp->object = NULL; X temp->free_name = NULL; X temp->empty = 0; X temp->mark = 0; X temp->index = 0; X temp->count = 1; X temp->c_code = temp_c_code; X temp_c_code = NULL; X} X X Xadd_test(object, element, relop, value, type, free_name, index, empty) X/* Xadd data needed for complex matching to rule structure Xobject is the name of the object class this test applies to Xelement is the name of the element of the object this test applies to Xrelop is the relational operator used in the test Xvalue is the value to test against Xtype is the data type of the element Xfree_name is the name of the free variable associated with this object (if any) Xindex is the array index of the free variable this test will assign to Xempty - boolean: is this an empty test? X*/ Xchar *object, *element, *value, *free_name; Xint relop, type, index; X{ X struct match *temp; X struct test *temp2; X struct def_list *d_temp; X struct data_type *dt_temp; X struct case_list *c_temp; X X if(current_match == NULL){ /* the first test in a match */ X if(rule_list->complex){ /* not the first match in a rule */ X temp = rule_list->complex; X while(temp->next) X temp = temp->next; X temp->next = (struct match *) malloc(sizeof(struct match)); X temp = temp->next; X } X else{ /* first match in a rule */ X rule_list->complex = (struct match *) malloc(sizeof(struct match)); X temp = rule_list->complex; X } X temp->tests = NULL; X temp->next = NULL; X temp->object = object; X temp->free_name = free_name; X temp->index = index; X temp->c_code = NULL; X temp->count = 1; X temp->mark = 0; X temp->empty = empty; X current_match = temp; X } X temp = current_match; X if(strcmp(temp->object, object) != 0){ X fprintf(stderr,"%d: objects in a complex test must match\n",lineno); X errors++; X return(0); X } X if(temp->tests){ X temp2 = temp->tests; X while(temp2->next) X temp2 = temp2->next; X temp2->next = (struct test *) malloc(sizeof(struct test)); X temp2 = temp2->next; X } X else{ X temp->tests = (struct test *) malloc(sizeof(struct test)); X temp2 = temp->tests; X } X temp2->element = element; X temp2->relop = relop; X temp2->free_name = free_name; X temp2->value = value; X temp2->type = type; X temp2->next = NULL; X temp2->id = 0; X if(strcmp(object, free_name) == 0){ X d_temp = token_list; X while(d_temp){ X if(strcmp(object, d_temp->name) == 0){ X dt_temp = d_temp->data_types; X while(dt_temp){ X if(strcmp(element, dt_temp->name) == 0){ X c_temp = dt_temp->elts; X while(c_temp){ X if(strcmp(value, c_temp->name) == 0){ X c_temp->used = 1; X temp2->id = c_temp->id; X return; X } X c_temp = c_temp->next; X } X } X dt_temp = dt_temp->next; X } X } X d_temp = d_temp->next; X } X } X} X X Xsearch_structs(object, element) X/* Xsearch the list of structures for 'object', see if it has a field 'element' Xreturn -1 if 'object' is not found, data type of 'element' if it is found. X*/ Xchar *object, *element; X{ X struct def_list *temp; X struct data_type *temp2; X X temp = token_list; X while(temp){ X if((strcmp(temp->name, object)) == 0){ X temp2 = temp->data_types; X while(temp2){ X if((strcmp(temp2->name,element)) == 0){ X return(temp2->type); X } X temp2 = temp2->next; X } X return(-1); X } X temp = temp->next; X } X return(-1); X} !EOR! echo extracting - main.h sed 's/^X//' > main.h << '!EOR!' X#include <stdio.h> X Xstruct list{ X char *name; X struct list *next; X} *name_list, *temp_c_code, *header_code, *trailer_code, *label_list; X Xstruct case_list{ X char *name; /* name of the element */ X int id; /* case identifier */ X int used; /* is this case ever used? */ X struct case_list *next; X}; X Xstruct data_type{ X int type; /* -1 = free variable */ X /* 0 = integer */ X /* 1 = floating */ X /* 2 = string */ X /* 3 = pointer */ X char *name; /* name of the element */ X struct case_list *elts; /* elements that may be compared */ X struct data_type *next; X} *data_list; X Xstruct def_list{ /* list of defined objects */ X char *name; /* the name of the object */ X struct data_type *data_types; /* pointer to the element list */ X struct def_list *next; /* next object in the list */ X} *token_list; X X Xstruct test{ /* one test in a complex match */ X char *element; /* the element of the object to test */ X int relop; /* relational operators are bit encoded X < = > X 0 0 0 0 No Match X 0 0 1 1 > X 0 1 0 2 = X 0 1 1 3 >= X 1 0 0 4 < X 1 0 1 5 != X 1 1 0 6 <= X 1 1 1 7 Match Any X */ X char *free_name; /* name of free var */ X char *value; /* the value to test for */ X int type; /* the data type of the element */ X int id; /* case id for self tests */ X struct test *next; /* pointer to the next one in the list */ X} *current_test; X X Xstruct match{ /* description of a complex matching */ X char *object; /* the object to test */ X char *free_name; /* name of free var attached to the object */ X int index; /* initial array index of this free variable */ X int count; /* number of times to repeat this test */ X int mark; /* boolean: mark this object? */ X int empty; /* boolean: is this an empty test? */ X struct list *c_code; /* C code included in the situation */ X struct test *tests; /* the list of tests for this match */ X struct match *next; /* pointer to the next one in the list */ X} *current_match; X X Xstruct rule { /* facts needed to generate code */ X /* one per rule */ X char *label; /* label of this rule */ X char *opt; /* optimization - goto this label */ X int recurs; /* boolean indicates recursive search */ X int *search; /* items to search for in stm */ X int *mark; /* items to remove from stm */ X struct init *add; /* items to add to stm */ X struct list *c_code; /* code to execute if this rule fires */ X struct match *complex; /* complex matching list */ X struct rule *next; X struct rule *prev; X} *rule_list; X Xstruct fields { X char *object; /* the name of the object to be init */ X char *element; /* the name of the element to be init */ X char *value; /* the value to be assigned to the element */ X int index; /* the array index of this free variable */ X int type; /* the type of the element */ X int empty; /* boolean: is this an empty test? */ X struct fields *next; /* the next element of this object to be initialized */ X}; X Xstruct init { /* stm initialization */ X char *object; /* the name of the object to be initialized */ X int count; /* number of reps of this item */ X struct fields *items; /* list of fields to initialize */ X struct init *next; /* next object in list */ X} *init_list, *init_list2; X X Xint total_tokens, /* total number of objects declared */ X pnum, /* current production */ X *stm, /* array of number of each object type X to MARK or ADD */ X *current_free, /* index of the current free variable X for each object type */ X *max_free, /* maximum number of free variables X needed for each object type */ X *current_empty, /* index of the current empty variable X for each object type */ X *max_empty, /* maximum number of empty variables X needed for each object type */ X errors, /* count of errors detected */ X backtracking, /* boolean to indicate if the user wants X to generate backtracking code */ X profiling, /* boolean to indicate if the user wants X to generate profiling code */ X tracing, /* boolean to indicate if the user wants X to generate tracing code */ X dumping, /* boolean to indicate if the user wants X to generate code to dump stm*/ X optimizing, /* boolean to indicate if the user wants X to invoke the loop optimizer */ X recursing, /* boolean to indicate if the user wants X recursive searches to be the default */ X saving, /* boolean to indicate if the user wants X to generate code to save STM*/ X zeroing, /* boolean to indicate if the user wants X to generate code to zero STM*/ X pascal, /* boolean to indicate if the user wants X to generate pascal instead of C*/ X lineno; /* line number of input file */ X Xchar *first_label, /* label of the first rule in LTM */ X *prefix; /* prefix for all objects */ X XFILE *header, *fre, *misc, *search, /* output files */ X *add, *dump, *loop, *relink, /* purposes are obvious */ X *backtrack, *profile, *zero, *save; X !EOR! echo extracting - optimize.c sed 's/^X//' > optimize.c << '!EOR!' X#include "main.h" X Xoptimize() X/* optimize the goto statements that will be generated for loop.c */ X{ X struct rule *r_const, *r_temp, *r_temp2; X struct match *m_temp; X struct test *t_temp; X struct init *i_temp; X int i; X X /* find the end of the rule list */ X /* since rules are inserted, the first rule is at the end of the list */ X r_const = rule_list; X while(r_const->next != NULL) X r_const = r_const->next; X X /* scan the rule list from the end */ X /* mark all rules with tests that do not check elements as recursive */ X r_temp = r_const; X while(r_temp){ X if(r_temp->recurs == 0){ X i = 1; X m_temp = r_temp->complex; X if(m_temp == NULL){ X r_temp->recurs = 111; X } X while(m_temp && i){ X if(m_temp->tests){ X t_temp = m_temp->tests; X while(t_temp) X if(t_temp->element){ X i = 0; X t_temp = NULL; X } X else X t_temp = t_temp->next; X } X if(i){ X if(m_temp->next == NULL){ X r_temp->recurs = 111; X } X } X m_temp = m_temp->next; X } X } X r_temp = r_temp->prev; X } X X /* scan the rule list for rules that have not been hand optimized */ X /* optimize these rules by scanning for the first rule that could */ X /* possibly fire after the given rule */ X r_temp = r_const; X while(r_temp){ X if(r_temp->opt == NULL){ X r_temp2 = r_const; X while(r_temp2){ X r_temp->opt = r_temp2->label; X if(r_temp2->complex == NULL){ X r_temp2 = NULL; X } X else if(strcmp(r_temp->label, r_temp2->label) == 0){ X r_temp2 = NULL; X } X else{ X /* NOT tests first */ X for(i = 0; i < total_tokens; i++){ X if(r_temp2->search[i] < 0){ X if(r_temp->mark[i] > 0){ X r_temp2 = NULL; X goto next_rule; X } X i_temp = r_temp->add; X while(i_temp){ X if(find_token(i_temp->object) == i) X goto next_rule; X i_temp = i_temp->next; X } X } X } X m_temp = r_temp2->complex; X while(m_temp){ X i = find_token(m_temp->object); X /* ADD test for all rules */ X i_temp = r_temp->add; X while(i_temp){ X if(strcmp(i_temp->object, m_temp->object) == 0){ X r_temp2 = NULL; X m_temp = NULL; X i_temp = NULL; X } X else X i_temp = i_temp->next; X } X /* MARK test for non recursive rules */ X if((r_temp2) X && (r_temp->recurs == 0) X && (r_temp->mark[i])){ X r_temp2 = NULL; X m_temp = NULL; X } X else{ X if(m_temp) { X m_temp = m_temp->next; X } X } X } X } X next_rule: X if(r_temp2 != NULL){ X r_temp2 = r_temp2->prev; X } X } X } X r_temp = r_temp->prev; X } X X /* unmark rules temporarily marked as recursive */ X r_temp = r_const; X while(r_temp){ X if(r_temp->recurs == 111) X r_temp->recurs = 0; X r_temp = r_temp->prev; X } X} !EOR!
sources-request@panda.UUCP (02/09/86)
Mod.sources: Volume 3, Issue 115 Submitted by: ihnp4!dicomed!ndsuvax!nckary (Daniel D. Kary) This is NOT a shell archive. Simply delete everything up to and including the cut mark and save the result as output.c. Dan Kary ihnp4!dicomed!ndsuvax!nckary -------------- cut here --------------- #include <stdio.h> #include "main.h" char *type_names[4] = { "int ", "double ", "char *", "struct " }; gen_test() /* generate the myalloc procedure for allocating data structures */ /* generate procedures to test each data type and return a relop code */ { int i; FILE *fp; fp = misc; fprintf(fp,"\n\nchar* %smyalloc(n)\nint n;\n{\n", prefix); fprintf(fp,"\tchar *r;\n\n\tr = (char *) malloc(n);\n"); fprintf(fp,"\tif(r == NULL){\n\t\tfprintf(stderr,\"OUT OF MEMORY\\n\");\n"); fprintf(fp,"\t\tfprintf(stderr,\"TRC IS EXITING\\n\");\n"); fprintf(fp,"\t\texit(0);\n\t}\n\treturn(r);\n}"); for(i = 0; i < 2; i++){ fprintf(fp,"\n\n%stest_%s(a,b)\n%s a, b;\n{\n", prefix,type_names[i], type_names[i]); fprintf(fp,"\tif(a < b) return(4);\n"); fprintf(fp,"\tif(a == b) return(2);\n"); fprintf(fp,"\treturn(1);\n}\n"); } fprintf(fp,"\n\n%stest_string(a,b)\nchar *a, *b;\n{\n",prefix); fprintf(fp,"\tint i;\n\n\ti = strcmp(a, b);\n"); fprintf(fp,"\tif(i < 0) return(4);\n"); fprintf(fp,"\tif(i == 0) return(2);\n"); fprintf(fp,"\treturn(1);\n}\n"); } gen_search() /* generate procedures to search each structure for a compound match */ { int i; struct def_list *temp; struct data_type *temp2; struct case_list *c_temp; FILE *fp; temp = token_list; while(temp){ if(temp->data_types){ fp = loop; fprintf(fp,"\nextern struct %s%s_struct *search_%s%s_struct();\n", prefix, temp->name, prefix, temp->name); fp = search; fprintf(fp,"\n\nstruct %s%s_struct *search_%s%s_struct(index", prefix, temp->name, prefix, temp->name); temp2 = temp->data_types; while(temp2){ if(temp2->type <= 2){ fprintf(fp,",\n\t\t\t\t"); for(i = 0; i < strlen(temp2->name)-2; i++) fprintf(fp," "); fprintf(fp," "); fprintf(fp,"%s, %s_relop",temp2->name, temp2->name); if(temp2->elts) fprintf(fp,", %s_case",temp2->name); } temp2 = temp2->next; } fprintf(fp,")\n"); if(temp->data_types) fprintf(fp,"int index"); temp2 = temp->data_types; while(temp2){ if(temp2->type <= 2) fprintf(fp,", %s_relop",temp2->name); temp2 = temp2->next; } if(temp->data_types) fprintf(fp,";\n"); temp2 = temp->data_types; while(temp2){ if(temp2->type <= 2) fprintf(fp,"%s%s;\n",type_names[temp2->type], temp2->name); temp2 = temp2->next; } fprintf(fp,"{\n"); if(temp->data_types){ fprintf(fp,"\tint\tflag;\n"); fprintf(fp,"\tstruct %s%s_struct *temp;\n\n", prefix,temp->name); fprintf(fp,"\ttemp = %s%s_temp[index];\n\twhile(temp){", prefix,temp->name); temp2 = temp->data_types; fprintf(fp,"\n\t\tif(temp->MARK == 0){"); fprintf(fp,"\n\t\t\tflag = 7;"); while(temp2){ if(temp2->type <= 2){ if(temp2->elts){ fprintf(fp,"\n\t\t\tswitch(%s_case){",temp2->name); fprintf(fp,"\n\t\t\tcase 0:"); } fprintf(fp,"\n"); if(temp2->elts) fprintf(fp,"\t"); fprintf(fp,"\t\t\tif(flag & %stest_", prefix); switch(temp2->type){ case 0: fprintf(fp,"int"); break; case 1: fprintf(fp,"double"); break; case 2: fprintf(fp,"string"); default: break; } fprintf(fp,"(temp->%s, %s) & %s_relop);", temp2->name, temp2->name, temp2->name); fprintf(fp,"\n\t\t\t\telse flag = 0;"); if(temp2->elts){ fprintf(fp,"\n\t\t\t\tbreak;"); c_temp = temp2->elts; while(c_temp){ fprintf(fp,"\n\t\t\tcase %d:", c_temp->id); fprintf(fp,"\n\t\t\t\tif(flag & test_"); switch(temp2->type){ case 0: fprintf(fp,"int"); break; case 1: fprintf(fp,"double"); break; case 2: fprintf(fp,"string"); default: break; } fprintf(fp,"(temp->%s, temp->%s) & %s_relop);", temp2->name, c_temp->name, temp2->name); fprintf(fp,"\n\t\t\t\telse flag = 0;"); fprintf(fp,"\n\t\t\t\tbreak;"); c_temp = c_temp->next; } fprintf(fp,"\n\t\t\tdefault: flag = 0;\n\t\t\t}"); } } temp2 = temp2->next; } fprintf(fp,"\n\t\t\tif(flag){\n\t\t\t\ttemp->MARK = 1;\n"); fprintf(fp,"\t\t\t\treturn(temp);\n\t\t\t}\n\t\t}\n\t\ttemp = temp->next;\n"); fprintf(fp,"\t}\n\treturn(NULL);\n}\n"); } } temp = temp->next; } } gen_free() /* generate procedures to free a structure */ { int i; struct def_list *temp; struct data_type *temp2; FILE *fp; fp = fre; temp = token_list; while(temp){ if(temp->data_types){ fprintf(fp,"\n\nfree_%s%s_struct(start)\nint start;\n{\n", prefix,temp->name); fprintf(fp,"\tint i;\n\n"); fprintf(fp,"\tfor(i = start; i < %s%s_max; i++)\n", prefix,temp->name); fprintf(fp,"\t\tif(%s%s_list[i]){\n", prefix,temp->name); fprintf(fp,"\t\t\tif(%s%s_list[i]->prev == NULL)\n", prefix,temp->name); fprintf(fp,"\t\t\t\t%s%s_list[0] = %s%s_list[i]->next;\n", prefix, temp->name, prefix, temp->name); fprintf(fp,"\t\t\telse\n\t\t\t\t%s%s_list[i]->prev->next = %s%s_list[i]->next;\n", prefix, temp->name, prefix, temp->name); fprintf(fp,"\t\t\tif(%s%s_list[i]->next)\n\t\t\t\t%s%s_list[i]->next->prev = %s%s_list[i]->prev;\n", prefix, temp->name, prefix, temp->name, prefix, temp->name); temp2 = temp->data_types; while(temp2){ if(temp2->type == 2) fprintf(fp,"\t\t\tfree(%s%s_list[i]->%s);\n", prefix,temp->name, temp2->name); temp2 = temp2->next; } fprintf(fp,"\t\t\tfree(%s%s_list[i]);\n", prefix,temp->name); fprintf(fp,"\t\t\t%s%s_list[i] = NULL;\n", prefix,temp->name); fprintf(fp,"\t\t\ti = %s%s_max;\n", prefix,temp->name); fprintf(fp,"\t\t\t%stoken[%s%s]--;\n\t\t}", prefix, prefix,temp->name); fprintf(fp,"\n}\n"); } temp = temp->next; } } gen_relink() /* generate procedures to relink a structure */ { int i; struct def_list *temp; struct data_type *temp2; FILE *fp; fp = relink; temp = token_list; while(temp){ fprintf(fp,"\n\n%srelink_%s_struct(start)\nint start;\n{\n", prefix,temp->name); fprintf(fp,"\tint i, j;\n"); if(temp->data_types){ fprintf(fp,"\tstruct %s%s_struct *temp;\n\n", prefix,temp->name); fprintf(fp,"\tfor(i = start; i < %s%s_max; i++)\n", prefix,temp->name); fprintf(fp,"\t\tif(%s%s_list[i]){\n", prefix,temp->name); fprintf(fp,"\t\t\ttemp = %s%s_list[0];\n", prefix,temp->name); fprintf(fp,"\t\t\tj = 0;\n"); fprintf(fp,"\t\t\twhile(temp != %s%s_list[i]){\n", prefix,temp->name); fprintf(fp,"\t\t\t\ttemp = temp->next;\n"); fprintf(fp,"\t\t\t\tj++;\n\t\t\t}\n"); fprintf(fp,"\t\t\tif(%s%s_list[i]->prev == NULL)\n", prefix,temp->name); fprintf(fp,"\t\t\t\t%s%s_list[0] = %s%s_list[i]->next;\n", prefix, temp->name, prefix, temp->name); fprintf(fp,"\t\t\telse\n\t\t\t\t%s%s_list[i]->prev->next = %s%s_list[i]->next;\n", prefix, temp->name, prefix, temp->name); fprintf(fp,"\t\t\tif(%s%s_list[i]->next)\n\t\t\t\t%s%s_list[i]->next->prev = %s%s_list[i]->prev;\n", prefix, temp->name, prefix, temp->name, prefix, temp->name); fprintf(fp,"\t\t\t%s%s_list[i]->MARK = j;\n", prefix, temp->name); fprintf(fp,"\t\t\t%s%s_list[i]->next = %sbacktrack->mark_%s;\n",prefix , temp->name, prefix, temp->name); fprintf(fp,"\t\t\t%sbacktrack->mark_%s = %s%s_list[i];\n", prefix, temp->name, prefix, temp->name); fprintf(fp,"\t\t\t%s%s_list[i] = NULL;\n", prefix,temp->name); fprintf(fp,"\t\t\ti = %s%s_max;\n", prefix,temp->name); fprintf(fp,"\t\t\t%stoken[%s%s]--;\n\t\t}", prefix, prefix,temp->name); } else{ fprintf(fp,"\t\t\t%sbacktrack->mark_%s++;\n", prefix, temp->name); fprintf(fp,"\t\t\t%stoken[%s%s]--;\n", prefix, prefix,temp->name); } fprintf(fp,"\n}\n"); temp = temp->next; } } gen_restore() /* generate procedure to restore structures */ { int i; struct def_list *temp; FILE *fp; fp = misc; temp = token_list; fprintf(fp,"\n\n%srestore()\n{\n\tint i;\n", prefix); while(temp){ if(temp->data_types){ fprintf(fp,"\n\tfor(i = 1; i < %s%s_max; i++)\n", prefix,temp->name); fprintf(fp,"\t\tif(%s%s_list[i]){\n", prefix,temp->name); fprintf(fp,"\t\t\t%s%s_list[i]->MARK = 0;\n", prefix, temp->name); fprintf(fp,"\t\t\t%s%s_list[i] = NULL;\n\t\t}\n", prefix,temp->name); } temp = temp->next; } fprintf(fp,"}\n"); } gen_add() /* generate procedures to add each structure to a list */ { int i; struct def_list *temp; struct data_type *temp2; FILE *fp; fp = add; temp = token_list; while(temp){ fprintf(fp,"\n%sadd_%s_struct(", prefix,temp->name); if(temp->data_types){ temp2 = temp->data_types; i = 0; while(temp2){ if((temp2->type >= 0) && (temp2->type <= 2)){ if(i) fprintf(fp,","); fprintf(fp," %s",temp2->name); } i = 1; temp2 = temp2->next; } } fprintf(fp,")\n"); if(temp->data_types){ temp2 = temp->data_types; while(temp2){ switch(temp2->type){ case 0: fprintf(fp,"int \t"); break; case 1: fprintf(fp,"double\t"); break; case 2: fprintf(fp,"char\t*"); default: break; } if((temp2->type >= 0) && (temp2->type <= 2)) fprintf(fp,"%s;\n",temp2->name); temp2 = temp2->next; } } fprintf(fp,"{\n"); if(temp->data_types){ fprintf(fp,"\n\tstruct %s%s_struct *temp;\n\n", prefix,temp->name); fprintf(fp,"\ttemp = (struct %s%s_struct *) %smyalloc(sizeof(struct %s%s_struct));\n", prefix, temp->name, prefix, prefix, temp->name); temp2 = temp->data_types; while(temp2){ if((temp2->type == 0) || (temp2->type == 1)) fprintf(fp,"\ttemp->%s = %s;\n",temp2->name, temp2->name); else{ fprintf(fp,"\ttemp->%s = (char *) %smyalloc((strlen(%s)) + 1);\n", temp2->name, prefix, temp2->name); fprintf(fp,"\tstrcpy(temp->%s, %s);\n",temp2->name, temp2->name); } temp2 = temp2->next; } fprintf(fp,"\ttemp->MARK = 0;\n"); fprintf(fp,"\ttemp->next = %s%s_list[0];\n", prefix,temp->name); fprintf(fp,"\ttemp->prev = NULL;\n"); fprintf(fp,"\tif(%s%s_list[0])\n", prefix, temp->name); fprintf(fp,"\t\t%s%s_list[0]->prev = temp;\n", prefix,temp->name); fprintf(fp,"\t%s%s_list[0] = temp;\n", prefix,temp->name); } fprintf(fp,"\t%stoken[%s%s]++;\n", prefix, prefix,temp->name); if(backtracking){ fprintf(fp,"\tif(%srestoring == 0)\n", prefix); fprintf(fp,"\t\t%sbacktrack->Add_%s++;\n", prefix, temp->name); } fprintf(fp,"\n}\n\n"); temp = temp->next; } } gen_save(test, fire) int test, fire; /* generate procedures to save the contents of dynamically allocated structures on a file */ { int i, j; struct def_list *temp; struct data_type *temp2; FILE *fp; fp = save; /* profiling arrays */ if(profiling) fprintf(fp,"extern int %sfire_profile[], %stest_profile[];\n", prefix, prefix); /* boolean shared by scan and load procedures */ fprintf(fp,"int %seof_flag;\n", prefix); /* procedure to read a line from a file */ fprintf(fp,"\n\nchar *%sscan(fp)\nFILE *fp;\n{\n", prefix); fprintf(fp,"\tchar *s, c[512];\n\tint i;\n"); fprintf(fp,"\n\ti = 0;\n\twhile(1){\n"); fprintf(fp,"\t\tc[i] = getc(fp);\n"); fprintf(fp,"\t\tif((c[i] == '\\n') || (c[i] == EOF)){\n"); fprintf(fp,"\t\t\tif(c[i] == EOF)\n\t\t\t\t%seof_flag = 0;\n", prefix); fprintf(fp,"\t\t\tc[i] = '\\0';\n"); fprintf(fp,"\t\t\ts = (char *) %smyalloc (i + 1);\n", prefix); fprintf(fp,"\t\t\tstrcpy(s,c);\n\t\t\treturn(s);\n"); fprintf(fp,"\t\t}\n\t\tif(i < 511)\n\t\t\ti++;\n"); fprintf(fp,"\t}\n}\n"); /* procedures to save stm on a file */ fprintf(fp,"\n\n%ssave_stm(fp)\n", prefix); fprintf(fp,"FILE *fp;\n{\n"); temp = token_list; while(temp){ fprintf(fp,"\tsave_%s%s_struct(fp);\n", prefix,temp->name); temp = temp->next; } fprintf(fp,"}\n\n"); temp = token_list; while(temp){ fprintf(fp,"\nsave_%s%s_struct(fp)\n", prefix,temp->name); fprintf(fp,"FILE *fp;\n{\n"); if(temp->data_types){ fprintf(fp,"\tstruct %s%s_struct *temp;\n\n", prefix,temp->name); fprintf(fp,"\ttemp = %s%s_list[0];\n", prefix,temp->name); fprintf(fp,"\twhile(temp->next != NULL)\n"); fprintf(fp,"\t\ttemp = temp->next;\n"); fprintf(fp,"\twhile(temp){\n"); } fprintf(fp,"\t\tfprintf(fp,\"%s%s\\n\");\n", prefix, temp->name); if(temp->data_types){ temp2 = temp->data_types; while(temp2){ if((temp2->type <= 2) && (temp2->type >= 0)){ fprintf(fp,"\t\tfprintf(fp,\""); switch(temp2->type){ case 0: fprintf(fp,"%cd", '%'); break; case 1: fprintf(fp,"%cf", '%'); break; case 2: fprintf(fp,"%cs", '%'); default: break; } fprintf(fp,"\\n\",temp->%s);\n", temp2->name); } temp2 = temp2->next; } fprintf(fp,"\t\ttemp = temp->prev;\n\t}\n"); } else{ fprintf(fp,"\t\tfprintf(fp,\"%cd\\n\",%stoken[%s%s]);\n",'%', prefix, prefix,temp->name); } fprintf(fp,"}\n\n"); temp = temp->next; } /* procedure to load stm from a file */ fprintf(fp,"\n\n%sload_stm(fp)\nFILE *fp;\n{\n\tchar *s;\n", prefix); temp = token_list; while(temp){ temp2 = temp->data_types; while(temp2){ if((temp2->type <= 2) && (temp2->type >= 0)) fprintf(fp,"\t%s%s%s_%s;\n",type_names[temp2->type], prefix, temp->name, temp2->name); temp2 = temp2->next; } temp = temp->next; } temp = token_list; fprintf(fp,"\t%srestoring = 1;\n", prefix); fprintf(fp,"\n\n\t%seof_flag = 1;\n", prefix); fprintf(fp,"\twhile(%seof_flag){\n", prefix); fprintf(fp,"\t\ts = %sscan(fp);\n", prefix); i = 0; while(temp){ fprintf(fp,"\t\t"); if(i) fprintf(fp,"else "); fprintf(fp,"if(strcmp(s, \"%s%s\") == 0){\n", prefix,temp->name); fprintf(fp,"\t\t\tfree(s);\n"); if(temp->data_types){ temp2 = temp->data_types; while(temp2){ fprintf(fp,"\t\t\ts = %sscan(fp);\n", prefix); switch(temp2->type){ case 0: fprintf(fp,"\t\t\t%s%s_%s = atoi(s);\n", prefix,temp->name, temp2->name); fprintf(fp,"\t\t\tfree(s);\n"); break; case 1: fprintf(fp,"\t\t\t%s%s_%s = atoi(s);\n", prefix,temp->name, temp2->name); fprintf(fp,"\t\t\tfree(s);\n"); break; case 2: fprintf(fp,"\t\t\t%s%s_%s = s;\n", prefix,temp->name, temp2->name); default: break; } temp2 = temp2->next; } /* generate the add statement here */ fprintf(fp,"\t\t\t%sadd_%s_struct(", prefix, temp->name); i = 0; temp2 = temp->data_types; while(temp2){ if(i) fprintf(fp," ,"); i = 1; fprintf(fp,"%s%s_%s", prefix,temp->name, temp2->name); temp2 = temp2->next; } fprintf(fp,");\n"); temp2 = temp->data_types; while(temp2){ if(temp2->type == 2) fprintf(fp,"\t\t\tfree(%s%s_%s);\n", prefix,temp->name, temp2->name); temp2 = temp2->next; } } else{ /* read in the count and increment token[] */ fprintf(fp,"\t\t\ts = %sscan(fp);\n", prefix); fprintf(fp,"\t\t\t%stoken[%s%s] += atoi(s);\n", prefix, prefix, temp->name); fprintf(fp,"\t\t\tfree(s);\n"); } fprintf(fp,"\t\t}\n"); i = 1; temp = temp->next; } fprintf(fp,"\t}\n"); fprintf(fp,"\t%srestoring = 0;\n", prefix); fprintf(fp,"}\n"); if(backtracking){ fprintf(fp,"\n%ssave_backtrack(fp)\nFILE *fp;\n{\n", prefix); fprintf(fp,"\tstruct %sback_track_stack *temp;\n",prefix); temp = token_list; while(temp){ if(temp->data_types) fprintf(fp,"\tstruct %s%s_struct *%s_tmp;\n",prefix, temp->name, temp->name); temp = temp->next; } fprintf(fp,"\n\ttemp = %sbacktrack;\n\twhile(temp){\n", prefix); fprintf(fp,"\t\tfprintf(fp,\"%sbacktrack\\n\");\n", prefix); fprintf(fp,"\t\tfprintf(fp,\"%cd\\n\",temp->next_rule);\n",'%'); temp = token_list; while(temp){ fprintf(fp,"\t\tfprintf(fp,\"%cd\\n\",temp->Add_%s);\n", '%', temp->name); if(temp->data_types == NULL) fprintf(fp,"\t\tfprintf(fp,\"%cd\\n\",temp->mark_%s);\n", '%', temp->name); temp = temp->next; } temp = token_list; while(temp){ if(temp->data_types){ fprintf(fp,"\t\t%s_tmp = temp->mark_%s;\n", temp->name, temp->name); fprintf(fp,"\t\twhile(%s_tmp){\n", temp->name); fprintf(fp,"\t\t\tfprintf(fp,\"%s%s\\n\");\n", prefix, temp->name); fprintf(fp,"\t\t\tfprintf(fp,\"%cd\\n\",%s_tmp->MARK);\n", '%', temp->name); temp2 = temp->data_types; while(temp2){ if((temp2->type >= 0) && (temp2->type <= 2)){ fprintf(fp,"\t\t\tfprintf(fp,\"%c", '%'); switch(temp2->type){ case 0: fprintf(fp,"d"); break; case 1: fprintf(fp,"f"); break; case 2: fprintf(fp,"s"); default:break; } fprintf(fp,"\\n\",%s_tmp->%s);\n", temp->name, temp2->name); } temp2 = temp2->next; } fprintf(fp,"\t\t\t%s_tmp = %s_tmp->next;\n", temp->name, temp->name); fprintf(fp,"\t\t}\n"); } temp = temp->next; } fprintf(fp,"\t\ttemp = temp->next;\n\t}\n"); fprintf(fp,"}\n"); fprintf(fp,"\n%sload_backtrack(fp)\nFILE *fp;\n{\n", prefix); fprintf(fp,"\tchar *s;\n"); fprintf(fp,"\tstruct %sback_track_stack *temp;\n",prefix); temp = token_list; while(temp){ if(temp->data_types) fprintf(fp,"\tstruct %s%s_struct *%s_tmp;\n",prefix, temp->name, temp->name); temp = temp->next; } fprintf(fp,"\n\t%seof_flag = 1;\n\ttemp = NULL;\n", prefix); fprintf(fp,"\twhile(%seof_flag){\n", prefix); fprintf(fp,"\t\ts = %sscan(fp);\n", prefix); fprintf(fp,"\t\tif(strcmp(s,\"%sbacktrack\") == 0){\n", prefix); fprintf(fp,"\t\t\tfree(s);\n"); fprintf(fp,"\t\t\tif(temp == NULL)\n\t\t\t\t%sbacktrack = temp = ", prefix); fprintf(fp,"(struct %sback_track_stack *) %smyalloc ", prefix, prefix); fprintf(fp,"(sizeof(struct %sback_track_stack));\n",prefix); fprintf(fp,"\t\t\telse{\n\t\t\t\ttemp->next = ", prefix); fprintf(fp,"(struct %sback_track_stack *) %smyalloc ", prefix, prefix); fprintf(fp,"(sizeof(struct %sback_track_stack));\n",prefix); fprintf(fp,"\t\t\t\ttemp = temp->next;\n\t\t\t}\n"); fprintf(fp,"\t\t\ts = %sscan(fp);\n", prefix); fprintf(fp,"\t\t\ttemp->next_rule = atoi(s);\n"); fprintf(fp,"\t\t\tfree(s);\n"); temp = token_list; while(temp){ fprintf(fp,"\t\t\ts = %sscan(fp);\n", prefix); fprintf(fp,"\t\t\ttemp->Add_%s = atoi(s);\n", temp->name); fprintf(fp,"\t\t\tfree(s);\n"); if(temp->data_types) fprintf(fp,"\t\t\t%s_tmp = temp->mark_%s = NULL;\n", temp->name, temp->name); else{ fprintf(fp,"\t\t\ts = %sscan(fp);\n", prefix); fprintf(fp,"\t\t\ttemp->mark_%s = atoi(s);\n", temp->name); fprintf(fp,"\t\t\tfree(s);\n"); } temp = temp->next; } fprintf(fp,"\t\t\ttemp->next = NULL;\n"); fprintf(fp,"\t\t}\n"); temp = token_list; while(temp){ if(temp->data_types){ fprintf(fp,"\t\telse if(strcmp(s,\"%s%s\") == 0){\n", prefix,temp->name); fprintf(fp,"\t\t\tfree(s);\n"); fprintf(fp,"\t\t\tif(%s_tmp == NULL)\n\t\t\t\ttemp->mark_%s = %s_tmp = ", temp->name, temp->name, temp->name); fprintf(fp,"(struct %s%s_struct *) %smyalloc ", prefix, temp->name, prefix); fprintf(fp,"(sizeof(struct %s%s_struct));\n",prefix, temp->name); fprintf(fp,"\t\t\telse{\n\t\t\t\t%s_tmp->next = ", temp->name); fprintf(fp,"(struct %s%s_struct *) %smyalloc ", prefix, temp->name, prefix); fprintf(fp,"(sizeof(struct %s%s_struct));\n",prefix, temp->name); fprintf(fp,"\t\t\t\t%s_tmp = %s_tmp->next;\n\t\t\t}\n", temp->name, temp->name); temp2 = temp->data_types; fprintf(fp,"\t\t\ts = %sscan(fp);\n", prefix); fprintf(fp,"\t\t\t%s_tmp->MARK = atoi(s);\n",temp->name); fprintf(fp,"\t\t\tfree(s);\n"); while(temp2){ if((temp2->type >= 0) && (temp2->type <= 2)){ fprintf(fp,"\t\t\ts = %sscan(fp);\n", prefix); fprintf(fp,"\t\t\t%s_tmp->%s = ",temp->name, temp2->name); switch(temp2->type){ case 0: fprintf(fp,"atoi(s);\n"); fprintf(fp,"\t\t\tfree(s);\n"); break; case 1: fprintf(fp,"atof(s);\n"); fprintf(fp,"\t\t\tfree(s);\n"); break; case 2: fprintf(fp,"s;\n"); default:break; } } temp2 = temp2->next; } fprintf(fp,"\t\t}\n"); } temp = temp->next; } fprintf(fp,"\t}\n"); fprintf(fp,"}\n"); } if(profiling){ fprintf(fp,"\n%ssave_profile(fp)\nFILE *fp;\n{\n", prefix); fprintf(fp,"\tint i;\n\n"); fprintf(fp,"\tfor(i = 0; i < %d; i++)\n",fire); fprintf(fp,"\t\tfprintf(fp,\"%cd\\n\",%sfire_profile[i]);\n", '%', prefix); fprintf(fp,"\tfor(i = 0; i < %d; i++)\n",test); fprintf(fp,"\t\tfprintf(fp,\"%cd\\n\",%stest_profile[i]);\n", '%', prefix); fprintf(fp,"}\n"); fprintf(fp,"\n%sload_profile(fp)\nFILE *fp;\n{\n", prefix); fprintf(fp,"\tchar *s;\n\tint i;\n\n"); fprintf(fp,"\tfor(i = 0; i < %d; i++){\n",fire); fprintf(fp,"\t\t%sfire_profile[i] = atoi(s = %sscan(fp));\n", prefix, prefix); fprintf(fp,"\t\tfree(s);\n\t}\n"); fprintf(fp,"\tfor(i = 0; i < %d; i++){\n",test); fprintf(fp,"\t\t%stest_profile[i] = atoi(s = %sscan(fp));\n", prefix, prefix); fprintf(fp,"\t\tfree(s);\n\t}\n"); fprintf(fp,"}\n"); } if(tracing){ fprintf(fp,"\n%ssave_trace(fp)\nFILE *fp;\n{\n", prefix); fprintf(fp,"\tstruct %strace *temp;\n\n", prefix); fprintf(fp,"\ttemp = %strace_front;\n", prefix); fprintf(fp,"\twhile(temp){\n"); fprintf(fp,"\t\tfprintf(fp,\"%cd\\n\", temp->rule);\n",'%'); fprintf(fp,"\t\ttemp = temp->next;\n\t}\n}\n"); fprintf(fp,"\n%sload_trace(fp)\nFILE *fp;\n{\n\tchar *s;\n\n", prefix); fprintf(fp,"\t%seof_flag = 1;\n\ts = %sscan(fp);\n",prefix ,prefix); fprintf(fp,"\twhile(%seof_flag){\n", prefix); fprintf(fp,"\t\tif(%strace_back){\n", prefix); fprintf(fp,"\t\t\t%strace_back->next = (struct %strace *) ",prefix, prefix); fprintf(fp,"%smyalloc (sizeof(struct %strace));\n", prefix, prefix); fprintf(fp,"\t\t\t%strace_back = %strace_back->next;\n", prefix, prefix); fprintf(fp,"\t\t}\n\t\telse\n"); fprintf(fp,"\t\t\t%strace_front = %strace_back = (struct %strace *) ",prefix, prefix, prefix); fprintf(fp,"%smyalloc (sizeof(struct %strace));\n", prefix, prefix); fprintf(fp,"\t\t%strace_back->rule = atoi(s);\n", prefix); fprintf(fp,"\t\tfree(s);\n\t\ts = %sscan(fp);\n", prefix); fprintf(fp,"\t}\n}\n"); } } gen_print() /* generate procedures to print the contents of stm on the standard output */ { struct def_list *temp; struct data_type *temp2; FILE *fp; fp = dump; fprintf(fp,"\n\n%sdump_stm()\n{\n", prefix); temp = token_list; while(temp){ fprintf(fp,"\t%sdump_%s_struct();\n", prefix,temp->name); temp = temp->next; } fprintf(fp,"}\n\n"); temp = token_list; while(temp){ fprintf(fp,"\n%sdump_%s_struct()\n{\n", prefix,temp->name); if(temp->data_types){ fprintf(fp,"\tint\ti;\n\tstruct %s%s_struct *temp;\n\n", prefix,temp->name); fprintf(fp,"\ti = 1;\n"); } fprintf(fp,"\tprintf(\"\\nDumping %s list (%cd)\\n\",%stoken[%s%s]);\n", temp->name, '%', prefix, prefix, temp->name); if(temp->data_types){ fprintf(fp,"\ttemp = %s%s_list[0];\n\twhile(temp){\n", prefix,temp->name); fprintf(fp,"\t\tprintf(\"%cd.\\t",'%'); temp2 = temp->data_types; while(temp2){ switch(temp2->type){ case 0: fprintf(fp,"%cd\\t", '%'); break; case 1: fprintf(fp,"%cf\\t", '%'); break; case 2: fprintf(fp,"%cs\\t", '%'); default: break; } temp2 = temp2->next; } fprintf(fp,"\\n\", i"); temp2 = temp->data_types; while(temp2){ if((temp2->type >= 0) && (temp2->type <= 2)) fprintf(fp,"\n\t\t\t, temp->%s",temp2->name); temp2 = temp2->next; } fprintf(fp,");\n\t\ttemp = temp->next;\n\t\ti++;\n\t}\n"); } fprintf(fp,"}\n\n"); temp = temp->next; } } gen_init(mode) /* generate procedure to initialize stm */ /* if mode is zero, then generate only code to add to stm */ int mode; { int i; struct init *temp; struct fields *temp2; struct def_list *t; struct data_type *t2; FILE *fp; if(mode) fp = add; else fp = loop; temp = init_list->next; /* the first one is a place holder */ if(mode){ fprintf(fp,"\n\n%sinit()\n{\n\tint i;\n", prefix); if(backtracking){ fprintf(fp,"\n\t%sbacktrack = (struct %sback_track_stack *)", prefix, prefix); fprintf(fp," %smyalloc (sizeof(struct %sback_track_stack));\n", prefix, prefix); } } while(temp){ if(temp->count){ if(mode == 0) fprintf(fp,"\t\t"); fprintf(fp,"\tfor(i = 0; i < %d; i++)\n\t",temp->count); } if(mode == 0) fprintf(fp,"\t\t"); fprintf(fp,"\t%sadd_%s_struct(" , prefix, temp->object); t = token_list; while(strcmp(t->name, temp->object) != 0) t = t->next; i = 0; t2 = t->data_types; while(t2){ temp2 = temp->items; while((temp2) && (strcmp(temp2->element, t2->name) != 0)) temp2 = temp2->next; if((temp2) && (temp2->type != 3)){ if(i) fprintf(fp,", "); i = 1; if(temp2->type >= 0){ if(temp2->type == 2) fprintf(fp,"\""); fprintf(fp,"%s",temp2->value); if(temp2->type == 2) fprintf(fp,"\""); } else{ if(temp2->empty) fprintf(fp,"%s%s_empty[%d].%s", prefix,temp2->object, temp2->index, temp2->value); else fprintf(fp,"%s%s_list[%d]->%s", prefix,temp2->object, temp2->index, temp2->value); } } else if(t2->type != 3){ if(i) fprintf(fp,", "); i = 1; if(t2->type == 2) fprintf(fp,"\"\""); if(t2->type == 1) fprintf(fp,"0.0"); if(t2->type == 0) fprintf(fp,"0"); } t2 = t2->next; } fprintf(fp,");\n"); temp = temp->next; } if(mode){ if(backtracking){ fprintf(fp,"\tfree(%sbacktrack);\n", prefix); fprintf(fp,"\t%sbacktrack = NULL;\n", prefix); } fprintf(fp,"}\n\n\n"); } } gen_structs() /* generate structure definitions from token list */ { int i; struct def_list *temp; struct data_type *temp2; FILE *fp; i = 0; fp = header; temp = token_list; while(temp){ if(temp->data_types){ fprintf(fp,"\nstruct %s%s_struct {\n", prefix,temp->name); if(temp->data_types){ temp2 = temp->data_types; while(temp2){ if(temp2->type != 3) fprintf(fp,"\t\t%s%s;\n",type_names[temp2->type],temp2->name); else fprintf(fp,"\t\tstruct %s%s_struct *%s;\n", prefix,temp->name, temp2->name); temp2 = temp2->next; } } fprintf(fp,"\t\tint MARK;\n"); fprintf(fp,"\t\tstruct %s%s_struct *prev;\n", prefix,temp->name); fprintf(fp,"\t\tstruct %s%s_struct *next;\n", prefix,temp->name); fprintf(fp,"} *%s%s_list[%s%s_max],", prefix,temp->name, prefix, temp->name); if(max_empty[i]) fprintf(fp," %s%s_empty[%d],", prefix,temp->name, max_empty[i]); fprintf(fp," *%s%s_temp[%s%s_max];\n", prefix,temp->name, prefix, temp->name); } i++; temp = temp->next; } if(backtracking){ fprintf(fp,"\nstruct %sback_track_stack {\n", prefix); temp = token_list; while(temp){ fprintf(fp,"\tint Add_%s;\n",temp->name); if(temp->data_types) fprintf(fp,"\tstruct %s%s_struct *mark_%s;\n", prefix,temp->name, temp->name); else fprintf(fp,"\tint mark_%s;\n", temp->name); temp = temp->next; } fprintf(fp,"\tint next_rule;\n"); fprintf(fp,"\tstruct %sback_track_stack *next;\n} *%sbacktrack;\n", prefix, prefix); } } gen_back() /* generate procedures required for backtracking */ { struct def_list *temp; FILE *fp; fp = backtrack; temp = token_list; fprintf(fp,"\n%sinsert_backtrack(rule)\nint rule;\n{\n", prefix); fprintf(fp,"\tstruct %sback_track_stack *temp;\n\n", prefix); fprintf(fp,"\ttemp = (struct %sback_track_stack *) %smyalloc", prefix, prefix); fprintf(fp,"(sizeof(struct %sback_track_stack));\n", prefix); fprintf(fp,"\ttemp->next_rule = rule;\n"); while(temp){ fprintf(fp,"\ttemp->Add_%s = 0;\n", temp->name); if(temp->data_types) fprintf(fp,"\ttemp->mark_%s = NULL;\n", temp->name); else fprintf(fp,"\ttemp->mark_%s = 0;\n", temp->name); temp = temp->next; } fprintf(fp,"\ttemp->next = %sbacktrack;\n", prefix); fprintf(fp,"\t%sbacktrack = temp;\n}\n", prefix); fprintf(fp,"\n%sbackup()\n{\n\tint i;\n", prefix); fprintf(fp,"\tstruct %sback_track_stack *temp;\n", prefix); temp = token_list; while(temp){ if(temp->data_types) fprintf(fp,"\tstruct %s%s_struct *%s_temp, *%s_temp2;\n", prefix, temp->name, temp->name, temp->name); temp = temp->next; } fprintf(fp,"\n\tif(%sbacktrack == NULL)\n\t\treturn;\n", prefix); temp = token_list; while(temp){ if(temp->data_types){ fprintf(fp,"\twhile(%sbacktrack->mark_%s){\n", prefix, temp->name); fprintf(fp,"\t\t%s_temp2 = %sbacktrack->mark_%s;\n", temp->name, prefix, temp->name); fprintf(fp,"\t\t%sbacktrack->mark_%s = %sbacktrack->mark_%s->next;\n", prefix, temp->name, prefix, temp->name); fprintf(fp,"\t\t%s_temp2->prev = NULL;\n", temp->name); fprintf(fp,"\t\t%s_temp2->next = NULL;\n", temp->name); fprintf(fp,"\t\t%s_temp = %s%s_list[0];\n", temp->name, prefix, temp->name); fprintf(fp,"\t\tif(%s_temp){\n", temp->name); fprintf(fp,"\t\t\tfor(i = 0; i < %s_temp2->MARK; i++)\n", temp->name); fprintf(fp,"\t\t\t\tif(%s_temp->next)\n", temp->name); fprintf(fp,"\t\t\t\t\t%s_temp = %s_temp->next;\n", temp->name, temp->name); fprintf(fp,"\t\t\t\telse\n"); fprintf(fp,"\t\t\t\t\ti = %s_temp2->MARK + 1;\n", temp->name); fprintf(fp,"\t\t}\n\t\telse i = -1;\n"); fprintf(fp,"\t\tif(i == %s_temp2->MARK){\n", temp->name); fprintf(fp,"\t\t\t%s_temp2->next = %s_temp;\n", temp->name, temp->name); fprintf(fp,"\t\t\t%s_temp2->prev = %s_temp->prev;\n", temp->name, temp->name); fprintf(fp,"\t\t\tif(%s_temp->prev)\n", temp->name); fprintf(fp,"\t\t\t\t%s_temp->prev->next = %s_temp2;\n", temp->name, temp->name); fprintf(fp,"\t\t\telse\n"); fprintf(fp,"\t\t\t\t%s%s_list[0] = %s_temp2;\n", prefix, temp->name, temp->name); fprintf(fp,"\t\t\t%s_temp->prev = %s_temp2;\n", temp->name, temp->name); fprintf(fp,"\t\t}\n\t\telse{\n"); fprintf(fp,"\t\t\tif(%s_temp){\n", temp->name); fprintf(fp,"\t\t\t\t%s_temp->next = %s_temp2;\n", temp->name, temp->name); fprintf(fp,"\t\t\t\t%s_temp2->prev = %s_temp;\n", temp->name, temp->name); fprintf(fp,"\t\t\t\t%s_temp2->next = NULL;\n", temp->name); fprintf(fp,"\t\t\t}\n\t\t\telse %s%s_list[0] = %s_temp2;\n", prefix, temp->name, temp->name); fprintf(fp,"\t\t}\n"); fprintf(fp,"\t\t%s_temp2->MARK = 0;\n", temp->name); fprintf(fp,"\t\t%stoken[%s%s]++;\n", prefix, prefix, temp->name); fprintf(fp,"\t}\n"); fprintf(fp,"\tfor(i = 0; i < %sbacktrack->Add_%s; i++){\n", prefix, temp->name); fprintf(fp,"\t\t%s%s_list[1] = %s%s_list[0];\n", prefix,temp->name, prefix, temp->name); fprintf(fp,"\t\tfree_%s%s_struct(1);\n\t}\n", prefix, temp->name); } else{ fprintf(fp,"\t%stoken[%s%s] += %sbacktrack->mark_%s;\n", prefix, prefix, temp->name, prefix, temp->name); fprintf(fp,"\t%stoken[%s%s] -= %sbacktrack->Add_%s;\n", prefix, prefix, temp->name, prefix, temp->name); } temp = temp->next; } fprintf(fp,"\ttemp = %sbacktrack;\n", prefix); fprintf(fp,"\t%sbacktrack = %sbacktrack->next;\n", prefix, prefix); fprintf(fp,"\tfree(temp);\n"); fprintf(fp,"}\n"); } gen_trace() /* generate code to support building a trace list */ { int i; struct rule *temp; FILE *fp; fp = header; fprintf(fp,"\nstruct %strace{\n\tint rule;\n", prefix); fprintf(fp,"\tstruct %strace *next;\n} ", prefix); fprintf(fp,"*%strace_front, *%strace_back;\n\n", prefix, prefix); temp = rule_list; i = 1; while(temp->next){ temp = temp->next; i++; } fp = loop; fprintf(fp,"char *%srule_names[%d] = {\n", prefix,i+2); fprintf(fp,"\t\"%sStart\",\n", prefix); while(temp){ fprintf(fp,"\t\"%s%s\",\n", prefix,temp->label); temp = temp->prev; } fprintf(fp,"\t\"%sEnd\"\n};\n\n", prefix); fp = misc; fprintf(fp,"\n%sappend_trace(i)\nint i;\n{\n", prefix); fprintf(fp,"\tstruct %strace *temp;\n\n\t", prefix); fprintf(fp,"temp = (struct %strace *) %smyalloc (sizeof(struct %strace));\n", prefix, prefix, prefix); fprintf(fp,"\ttemp->rule = i;\n\ttemp->next = NULL;\n"); fprintf(fp,"\tif(%strace_front){\n\t\t%strace_back->next = temp;\n", prefix, prefix); fprintf(fp,"\t\t%strace_back = %strace_back->next;\n\t}\n", prefix, prefix); fprintf(fp,"\telse %strace_front = %strace_back = temp;\n}\n\n", prefix, prefix); } gen_profile(n,d) int n,d; /* generate procedures and structures to generate profile */ { struct list *temp; struct rule *r_temp; FILE *fp; fp = profile; temp = label_list; fprintf(fp,"\nint %stest_profile[%d];\n\n", prefix, n); fprintf(fp,"\nint %sfire_profile[%d];\n\n", prefix, d); fprintf(fp,"char *%slabel_names[%d] = {\n", prefix, n); while(temp){ fprintf(fp,"\t\"%s%s\"", prefix, temp->name); if(temp->next) fprintf(fp,",\n"); temp = temp->next; } r_temp = rule_list; while(r_temp->next) r_temp = r_temp->next; fprintf(fp,"};\n\nchar *%srules[%d] = {\n\t\"\",\n", prefix, d); while(r_temp){ fprintf(fp,"\t\"%s%s\"", prefix, r_temp->label); if(r_temp->prev) fprintf(fp,",\n"); r_temp = r_temp->prev; } fprintf(fp,"};\n\n%sprint_profile()\n{\n", prefix); fprintf(fp,"\tint i, t;\n\n\tt = 0;\n\tprintf(\"\\nRules Tested\\n\");\n", n); fprintf(fp,"\tfor(i = 0; i < %d; i++){\n", n); fprintf(fp,"\t\tprintf(\"%cd",'%'); fprintf(fp,"\\t%cs\\n\",%stest_profile[i],", '%', prefix); fprintf(fp," %slabel_names[i]);\n", prefix); fprintf(fp,"\t\tt += %stest_profile[i];\n\t}\n", prefix); fprintf(fp,"\tprintf(\"%cd\\n\", t);\n\tt = 0;\n", '%'); fprintf(fp,"\tprintf(\"\\nRules Fired\\n\");\n", n); fprintf(fp,"\n\tfor(i = 1; i < %d; i++){\n", d); fprintf(fp,"\t\tprintf(\"%cd",'%'); fprintf(fp,"\\t%cs\\n\",%sfire_profile[i],", '%', prefix); fprintf(fp," %srules[i]);\n", prefix); fprintf(fp,"\t\tt += %sfire_profile[i];\n\t}\n", prefix); fprintf(fp,"\tprintf(\"%cd\\n\", t);\n}\n", '%'); } gen_zero(test,fire) int test, fire; /* generate a procedure that will free or zero all data structures generated by trc */ { int i; FILE *fp; struct def_list *d_temp; struct data_type *dt_temp; fp = zero; if(profiling) fprintf(fp,"extern int %stest_profile[], %sfire_profile[];\n",prefix, prefix); fprintf(fp,"\n\n%szero()\n{\n\tint i;\n",prefix); if(backtracking) fprintf(fp,"\tstruct %sback_track_stack *b_temp;\n", prefix); /* pointer definitions */ d_temp = token_list; while(d_temp){ if(d_temp->data_types) fprintf(fp,"\tstruct %s%s_struct *%s_tmp;\n", prefix,d_temp->name,d_temp->name); d_temp = d_temp->next; } /* free struct lists */ d_temp = token_list; while(d_temp){ if(d_temp->data_types){ fprintf(fp,"\twhile(%s%s_list[0]){\n", prefix,d_temp->name); fprintf(fp,"\t\t%s%s_list[1] = %s%s_list[0];\n", prefix,d_temp->name, prefix,d_temp->name); fprintf(fp,"\t\tfree_%s%s_struct(1);\n\t}\n", prefix,d_temp->name); } d_temp = d_temp->next; } /* free backtracking data */ if(backtracking){ fprintf(fp,"\twhile(%sbacktrack){\n", prefix); fprintf(fp,"\t\tb_temp = %sbacktrack;\n", prefix); fprintf(fp,"\t\t%sbacktrack = %sbacktrack->next;\n", prefix, prefix); d_temp = token_list; while(d_temp){ if(d_temp->data_types){ fprintf(fp,"\t\t%s_tmp = b_temp->mark_%s;\n",d_temp->name,d_temp->name); fprintf(fp,"\t\twhile(%s_tmp){\n",d_temp->name); fprintf(fp,"\t\t\tb_temp->mark_%s = b_temp->mark_%s->next;\n",d_temp->name,d_temp->name); dt_temp = d_temp->data_types; while(dt_temp){ if(dt_temp->type == 2) fprintf(fp,"\t\t\tfree(%s_tmp->%s);\n",d_temp->name,dt_temp->name); dt_temp = dt_temp->next; } fprintf(fp,"\t\t\tfree(%s_tmp);\n",d_temp->name); fprintf(fp,"\t\t\t%s_tmp = b_temp->mark_%s;\n\t\t}\n",d_temp->name,d_temp->name); } d_temp = d_temp->next; } fprintf(fp,"\t\tfree(b_temp);\n"); fprintf(fp,"\t}\n"); } /* zero structure pointers */ d_temp = token_list; while(d_temp){ if(d_temp->data_types){ fprintf(fp,"\tfor(i = 0; i < %s%s_max; i++)\n", prefix,d_temp->name); fprintf(fp,"\t\t%s%s_list[i] = %s%s_temp[i] = NULL;\n", prefix,d_temp->name, prefix,d_temp->name); } d_temp = d_temp->next; } /* zero integer arrays */ fprintf(fp,"\tfor(i = 0; i < %d; i++)\n",total_tokens); fprintf(fp,"\t\t%stoken[i] = 0;\n", prefix); if(profiling){ fprintf(fp,"\tfor(i = 0; i < %d; i++)\n",fire); fprintf(fp,"\t\t%sfire_profile[i] = 0;\n", prefix); fprintf(fp,"\tfor(i = 0; i < %d; i++)\n",test); fprintf(fp,"\t\t%stest_profile[i] = 0;\n", prefix); } /* zero trace list */ if(tracing){ fprintf(fp,"\twhile(%strace_front){\n", prefix); fprintf(fp,"\t\t%strace_back = %strace_front;\n", prefix, prefix); fprintf(fp,"\t\t%strace_front = %strace_front->next;\n", prefix, prefix); fprintf(fp,"\t\tfree(%strace_back);\n", prefix); fprintf(fp,"\t}\n\t%strace_back = NULL;\n", prefix); } fprintf(fp,"}\n"); } trans_code(rule, list, fp, label) /* translate references to objects in embedded C code */ struct rule *rule; struct list *list; FILE *fp; char *label; { struct match *m_temp; struct list *l_temp; int i, j; char c[512]; l_temp = list; while(l_temp){ i = 0; while(l_temp->name[i]){ if(l_temp->name[i] == '$'){ i++; j = 0; while(l_temp->name[i] != '.'){ c[j] = l_temp->name[i]; if(c[j] == '\0'){ fprintf(stderr,"cannot translate %s in rule %s\n",c, rule->label); fprintf(stderr,"%s\n", l_temp->name); return; } i++; j++; } i++; c[j] = '\0'; m_temp = rule->complex; if((strcmp(c, "FAIL")) == 0){ fprintf(fp,"{"); if(rule->recurs == 0) fprintf(fp,"%srestore();\n", prefix); fprintf(fp,"goto %s;}\n", label); } else{ while(m_temp && j){ if((strcmp(c, m_temp->free_name)) == 0){ fprintf(fp,"%s%s_", prefix , m_temp->object); if(m_temp->empty) fprintf(fp,"empty[%d].", m_temp->index); else fprintf(fp,"list[%d]->", m_temp->index); j = 0; } m_temp = m_temp->next; } if(j){ fprintf(stderr,"cannot translate %s in rule %s\n",c, rule->label); fprintf(stderr,"%s\n", l_temp->name); return; } } } else{ fprintf(fp,"%c",l_temp->name[i]); i++; } } fprintf(fp,"\n"); l_temp = l_temp->next; } } translate() /* Produce the output code */ { int i, j, k, l, count, prev_index, label_count; char s[512]; struct list *l_temp; struct def_list *d_temp, *d_temp2; struct data_type *dt_temp; struct rule *r_temp, *r_temp2, *r_const; struct match *m_temp, *m_temp2, *m_temp3, *m_temp4; struct test *t_temp; struct list *label_temp; FILE *fp; fp = header; l_temp = header_code; while(l_temp){ fprintf(fp,"%s\n",l_temp->name); l_temp = l_temp->next; } fprintf(fp,"\n#include\t<stdio.h>\n\n"); d_temp = token_list; for(i = 0; i < total_tokens; i++) { fprintf(fp,"#define %s%s %d\n", prefix,d_temp->name, i); j = max_free[i]; if(j <= 2) j = 2; fprintf(fp,"#define %s%s_max %d\n", prefix,d_temp->name, j); d_temp = d_temp->next; } fprintf(fp,"\n"); fp = loop; fprintf(fp,"int %stotal_tokens = %d;\n", prefix,total_tokens); fprintf(fp,"int %stoken[%d];\n", prefix,total_tokens); if(profiling){ fprintf(fp,"extern int %stest_profile[];\n", prefix); fprintf(fp,"extern int %sfire_profile[];\n", prefix); } d_temp = token_list; fprintf(fp,"char *%stoken_name[%d] = {\n", prefix,total_tokens); for(i = 0; i < total_tokens; i++) { fprintf(fp,"\t\t\t\042%s%s\042", prefix,d_temp->name); d_temp = d_temp->next; if(d_temp) fprintf(fp,",\n"); } fprintf(fp,"};\n"); fp = header; gen_structs(); gen_test(); if(backtracking){ gen_back(); gen_relink(); } if(tracing) gen_trace(); if(dumping) gen_print(); gen_free(); gen_restore(); gen_search(); gen_add(); init_list = init_list2; gen_init(1); fp = loop; fprintf(fp,"\n%sloop()\n{\n\tint i;\n", prefix); fprintf(fp,"\twhile(1){\n%sStart:\n", prefix); if(profiling){ label_list = (struct list *) malloc(sizeof(struct list)); label_list->name = (char *) malloc(strlen(prefix) + 6); label_temp = label_list; label_count = 1; strcpy(label_list->name, prefix ); strcat(label_list->name, "Start"); fprintf(fp,"\t%stest_profile[0]++;\n", prefix); } r_temp = rule_list; while(r_temp->next != NULL) r_temp = r_temp->next; r_const = r_temp; while(r_temp){ /* label of this rule */ fprintf(fp,"%s%s:\n", prefix,r_temp->label); if(profiling){ label_temp->next = (struct list *) malloc(sizeof(struct list)); label_temp = label_temp->next; label_temp->name = (char *) malloc(strlen(prefix) + strlen(r_temp->label) + 1); strcpy(label_temp->name, prefix); strcat(label_temp->name, r_temp->label); fprintf(fp,"\t%stest_profile[%d]++;\n", prefix, label_count); label_count++; } /* test for code that must precede all tests */ m_temp3 = NULL; m_temp = r_temp->complex; /* skip over empty definitions */ while((m_temp) && (m_temp->empty)){ m_temp3 = m_temp; m_temp = m_temp->next; } /* if the first non empty entry is c_code it must precede all tests */ if(m_temp) if(m_temp->c_code){ if(r_temp->prev) sprintf(s,"%s%s\0",prefix, r_temp->prev->label); else sprintf(s,"%sEnd\0",prefix); trans_code(r_temp, m_temp->c_code, fp, s); /* unlink the code so it isn't inserted twice */ if(m_temp3) m_temp3->next = m_temp->next; else r_temp->complex = r_temp->complex->next; } /* test for object counts */ fprintf(fp,"\t\tif("); d_temp = token_list; for(i = 0; i < total_tokens; i++){ if(r_temp->search[i] > 0) fprintf(fp,"(%stoken[%s%s] >= %d) &&\n\t\t\t", prefix, prefix,d_temp->name,r_temp->search[i]); if(r_temp->search[i] < 0) fprintf(fp,"(%stoken[%s%s] <= 0) &&\n\t\t\t", prefix, prefix,d_temp->name); d_temp = d_temp->next; } d_temp = token_list; fprintf(fp,"1){"); /* generate complex matching code */ /* first initialize the current free variable matrix */ for(i = 0; i < total_tokens; i++) current_free[i] = 1; m_temp = m_temp3 = r_temp->complex; prev_index = 0; while(m_temp){ if(m_temp->c_code){ if((prev_index == 0) || (r_temp->recurs == 0)){ if(r_temp->prev) sprintf(s,"%s%s\0", prefix,r_temp->prev->label); else sprintf(s,"%sEnd\0", prefix); } else sprintf(s,"%s%s_%s_%d\0", prefix, r_temp->label, m_temp3->object, prev_index); trans_code(r_temp, m_temp->c_code, fp, s); } else if(m_temp->empty){ /* declaration only - don't generate any code */ i = 0; } else{ i = 0; d_temp = token_list; while(strcmp(m_temp->object, d_temp->name) != 0){ i++; d_temp = d_temp->next; } if(d_temp->data_types){ for(count = 0; count < m_temp->count; count++){ /* initialize temp */ fprintf(fp,"\n\t\t\t%s%s_temp[%d] = %s%s_list[0];\n" , prefix, m_temp->object, current_free[i], prefix, m_temp->object); /* print a label */ if((r_temp->recurs) || (profiling)){ fprintf(fp,"%s%s_%s_%d:\n", prefix, r_temp->label, m_temp->object, current_free[i]); if(profiling){ label_temp->next = (struct list *) malloc(sizeof(struct list)); label_temp = label_temp->next; label_temp->name = (char *) malloc(strlen(r_temp->label) + strlen(m_temp->object) + strlen(prefix) + 10); sprintf(label_temp->name,"%s%s_%s_%d", prefix, r_temp->label, m_temp->object, current_free[i]); fprintf(fp,"\t%stest_profile[%d]++;\n", prefix, label_count); label_count++; } } /* free the previously found item */ if(r_temp->recurs){ fprintf(fp,"\t\t\tif(%s%s_list[%d])\n", prefix, m_temp->object, current_free[i]); fprintf(fp,"\t\t\t\t%s%s_list[%d]->MARK = 0;\n", prefix, m_temp->object, current_free[i]); } /* do the search */ fprintf(fp,"\t\t\tif((%s%s_list[%d] = search_%s%s_struct(%d" , prefix , m_temp->object, current_free[i], prefix, m_temp->object, current_free[i]); dt_temp = d_temp->data_types; while(dt_temp){ if(dt_temp->type <= 2){ t_temp = m_temp->tests; j = 1; while(j && t_temp){ if(strcmp(t_temp->element, dt_temp->name) == 0){ j = 0; if((t_temp->type == 0) || (t_temp->type == 1)) fprintf(fp,", %s",t_temp->value); if(t_temp->type == 2) fprintf(fp,", \"%s\"",t_temp->value); if(t_temp->type == -1){ if(t_temp->id) fprintf(fp,", 0"); else{ l = 0; m_temp2 = r_temp->complex; while(m_temp2){ if(strcmp(m_temp2->free_name, t_temp->free_name) == 0){ l = m_temp2->index; m_temp4 = m_temp2; m_temp2 = NULL; } else m_temp2 = m_temp2->next; } if(m_temp4->empty) fprintf(fp,", %s%s_empty[%d].%s", prefix,m_temp4->object,l,t_temp->value); else fprintf(fp,", %s%s_list[%d]->%s", prefix,m_temp4->object,l,t_temp->value); } } fprintf(fp,", %d", t_temp->relop); if(dt_temp->elts) fprintf(fp,", %d",t_temp->id); } else t_temp = t_temp->next; } if(j){ switch(dt_temp->type){ case 0: fprintf(fp,", 0, 7"); break; case 1: fprintf(fp,", 0.0, 7"); break; case 2: fprintf(fp,", \"\", 7"); default: break; } if(dt_temp->elts) fprintf(fp,", 0"); } } dt_temp = dt_temp->next; } fprintf(fp,")) == NULL){\n"); /* search failed on first of rule */ if((prev_index == 0) || (r_temp->recurs == 0)){ fprintf(fp,"\t\t\t\t%srestore();\n", prefix); if(r_temp->prev) fprintf(fp,"\t\t\t\tgoto %s%s;\n\t\t\t}", prefix,r_temp->prev->label); else fprintf(fp,"\t\t\t\tgoto %sEnd;\n\t\t\t}", prefix); } /* search failed - not first of rule */ else{ fprintf(fp,"\t\t\t\tgoto %s%s_%s_%d;\n\t\t\t}", prefix, r_temp->label, m_temp3->object, prev_index); } /* move index one beyond the one currently found */ if(r_temp->recurs) fprintf(fp,"\n\t\t\t%s%s_temp[%d] = %s%s_list[%d]->next;", prefix, m_temp->object, current_free[i], prefix, m_temp->object, current_free[i]); m_temp3 = m_temp; prev_index = current_free[i]; current_free[i]++; } } } m_temp = m_temp->next; } /* get rule number for next 3 statements */ i = 1; r_temp2 = r_const; while(r_temp != r_temp2){ r_temp2 = r_temp2->prev; i++; } /* generate profile code if profiling */ if(profiling){ fprintf(fp,"\n\t\t\t%sfire_profile[%d]++;", prefix, i); } /* generate append code if tracing */ if(tracing){ fprintf(fp,"\n\t\t\t%sappend_trace(%d);", prefix, i); } /* generate insert code if backtracking */ if(backtracking){ fprintf(fp,"\n\t\t\t%sinsert_backtrack(%d);", prefix, i); } /* generate ADD code */ fprintf(fp,"\n"); init_list = r_temp->add; gen_init(0); /* generate MARK code */ /* first MARK objects deleted by name */ m_temp = r_temp->complex; while(m_temp){ if(m_temp->mark){ if(backtracking) fprintf(fp,"\n\t\t\t\t%srelink_%s_struct(%d);", prefix,m_temp->object, m_temp->index); else{ d_temp = token_list; while(strcmp(m_temp->object, d_temp->name)) d_temp = d_temp->next; if(d_temp->data_types) fprintf(fp,"\n\t\t\t\tfree_%s%s_struct(%d);", prefix,m_temp->object, m_temp->index); else fprintf(fp,"\n\t\t\t\t%stoken%s[%s]--;", prefix, prefix,d_temp->name); } } m_temp = m_temp->next; } /* now MARK the rest of the objects */ d_temp = token_list; for(i = 0; i < total_tokens; i++){ if(r_temp->mark[i]){ fprintf(fp,"\n\t\t\tfor(i = 0; i < %d; i++)",r_temp->mark[i]); if(backtracking) fprintf(fp,"\n\t\t\t\t%srelink_%s_struct(1);", prefix,d_temp->name); else{ if(d_temp->data_types) fprintf(fp,"\n\t\t\t\tfree_%s%s_struct(1);", prefix,d_temp->name); else fprintf(fp,"\n\t\t\t\t%stoken[%s%s]--;", prefix, prefix,d_temp->name); } } d_temp = d_temp->next; } d_temp = token_list; fprintf(fp,"\n\t\t\t%srestore();", prefix); l_temp = r_temp->c_code; trans_code(r_temp, l_temp, fp); if(find_name(r_temp->opt)) fprintf(fp,"\t\t\tgoto %s%s;\n\t\t}\n", prefix, r_temp->opt); else fprintf(fp,"\t\t\tgoto %sStart;\n\t\t}\n", prefix); r_temp = r_temp->prev; } fprintf(fp,"\n%sEnd:\n", prefix); if(profiling){ label_temp->next = (struct list *) malloc(sizeof(struct list)); label_temp = label_temp->next; label_temp->name = (char *) malloc(strlen(prefix) + 4); strcpy(label_temp->name, prefix); strcat(label_temp->name, "End"); fprintf(fp,"\t%stest_profile[%d]++;\n", prefix, label_count); label_count++; } if(tracing){ i = 1; r_temp2 = r_const; while(r_temp2){ r_temp2 = r_temp2->prev; i++; } fprintf(fp,"\t\t\t%sappend_trace(%d);\n", prefix,i); } if(backtracking){ fprintf(fp,"\t\t\tif(%sbacktrack){\n", prefix); fprintf(fp,"\t\t\t\ti = %sbacktrack->next_rule;\n", prefix); fprintf(fp,"\t\t\t\t%sbackup();\n", prefix); fprintf(fp,"\t\t\t\tswitch(i){\n"); i = 1; r_temp2 = r_const; while(r_temp2){ fprintf(fp,"\t\t\t\t\tcase %d: goto ", i); if(r_temp2->prev) fprintf(fp,"%s%s;\n", prefix, r_temp2->prev->label); else fprintf(fp,"%sEnd;\n", prefix); r_temp2 = r_temp2->prev; i++; } fprintf(fp,"\t\t\t\t\tdefault: goto %sEnd;", prefix); fprintf(fp,"\n\t\t\t\t}\n\t\t\t}\n"); } fprintf(fp,"\t\t\treturn(1);\n\t}\n}\n"); l_temp = trailer_code; while(l_temp){ fprintf(fp,"%s\n",l_temp->name); l_temp = l_temp->next; } i = 0; r_temp2 = r_const; while(r_temp2){ r_temp2 = r_temp2->prev; i++; } if(profiling) gen_profile(label_count, i+1); if(zeroing) gen_zero(label_count, i+1); if(saving) gen_save(label_count, i+1); }
sources-request@panda.UUCP (02/09/86)
Mod.sources: Volume 3, Issue 116 Submitted by: ihnp4!dicomed!ndsuvax!nckary (Daniel D. Kary) : This is a shar archive. Extract with sh, not csh. : The rest of this file will extract: : p_out.c parser scanner.c echo extracting - p_out.c sed 's/^X//' > p_out.c << '!EOR!' X/* P_OUT.C -- Translate production rules to pascal. Version 1.1 */ X/* co-authored by Dean Hystad and Dan Kary. */ X X#include <stdio.h> X#include "main.h" X XFILE *fp,*lp; X Xchar *p_type_names[4] = { X "integer", X "real", X "strings", X "record" X}; X Xp_gen_test() X/* generate procedures to test each data type and return a relop code */ X{ X int i; X X for(i = 0; i < 3; i++){ X fprintf(fp,"\n\nfunction %stest_%s(", prefix, p_type_names[i]) ; X fprintf(fp,"\n\t\ta, b: %s ):", p_type_names[i]) ; X fprintf(fp,"\n\t\tinteger ;") ; X fprintf(fp,"\n\nvar\n\treturn: integer ;") ; X fprintf(fp,"\n\nbegin\n") ; X fprintf(fp,"\tif(a < b) then return := 4\n"); X fprintf(fp,"\telse if(a = b) then return := 2\n"); X fprintf(fp,"\telse return := 1 ;\n"); X fprintf(fp,"\t%stest_%s := return\n", prefix, p_type_names[i]) ; X fprintf(fp,"end ;\n") ; X } X} X X Xp_gen_search() X/* generate procedures to search each structure for a compound match */ X{ X int i; X struct def_list *temp; X struct data_type *temp2; X struct case_list *c_temp; X X temp = token_list; X while(temp){ X if(temp->data_types){ X temp2 = temp->data_types; X fprintf(fp,"\n\nfunction search_%s%s_record(\n\t\tndx : integer",prefix,temp->name); X while(temp2){ X if(temp2->type <= 2){ X fprintf(fp," ;\n\t\t%s : %s",temp2->name,p_type_names[temp2->type]); X fprintf(fp," ;\n\t\t%s_relop : integer",temp2->name); X if(temp2->elts) X fprintf(fp," ;\n\t\t%s_case : %s",temp2->name,p_type_names[temp2->type]); X } X temp2 = temp2->next; X } X fprintf(fp," ):\n\t\t%s%s_record_ptr ;\n\n",prefix,temp->name); X fprintf(fp,"var\n"); X fprintf(fp,"\tflag : integer ;\n"); X fprintf(fp,"\ttemp : %s%s_record_ptr ;\n", prefix,temp->name); X fprintf(fp,"\treturn : %s%s_record_ptr ;\n\n",prefix,temp->name); X fprintf(fp,"begin\n"); X fprintf(fp,"\treturn := nil ;\n"); X fprintf(fp,"\tflag := 0 ;\n"); X fprintf(fp,"\ttemp := %s%s_temp[ndx];\n\twhile (flag=0) and (temp <> nil) do begin", prefix,temp->name); X temp2 = temp->data_types; X fprintf(fp,"\n\t\tif temp^.MARK = 0 then begin"); X fprintf(fp,"\n\t\t\tflag := 7 ;"); X while(temp2){ X if(temp2->type <= 2){ X if(temp2->elts){ X fprintf(fp,"\n\t\t\tcase( %s_case )of",temp2->name); X fprintf(fp,"\n\t\t\t0:"); X } X fprintf(fp,"\n"); X if(temp2->elts) fprintf(fp,"\t"); X fprintf(fp,"\t\t\tif( (flag and %stest_", prefix); X fprintf(fp,"%s",p_type_names[temp2->type]); X fprintf(fp,"(temp^.%s, %s) and %s_relop)=0 )then", X temp2->name, temp2->name, temp2->name); X fprintf(fp,"\n\t\t\t\tflag := 0 ;"); X if(temp2->elts){ X c_temp = temp2->elts; X while(c_temp){ X fprintf(fp,"\n\t\t\t%d:", c_temp->id); X fprintf(fp,"\n\t\t\t\tif( (flag and test_"); X fprintf(fp,"%s",p_type_names[temp2->type]); X fprintf(fp,"(temp^.%s, temp^.%s)and %s_relop)=0 ) then", X temp2->name, c_temp->name, temp2->name); X fprintf(fp,"\n\t\t\t\tflag := 0 ;"); X c_temp = c_temp->next; X } X fprintf(fp,"\n\t\t\telse: flag := 0 ;\n\t\t\tend ;\n\t\t\tend ;"); X } X } X temp2 = temp2->next; X } X fprintf(fp,"\n\t\t\tif( flag<>0 )then begin\n\t\t\t\ttemp^.MARK := 1;\n"); X fprintf(fp,"\t\t\t\treturn := temp ;\n\t\t\tend ;\n\t\tend ;\n\t\ttemp := temp^.next ;\n"); X fprintf(fp,"\tend ;\n\tsearch_%s%s_record := return ;\nend ;\n",prefix, temp->name); X X } X temp = temp->next; X } X} X X Xp_gen_free() X/* generate procedures to free a structure */ X{ X int i; X struct def_list *temp; X struct data_type *temp2; X X temp = token_list; X while(temp){ X if(temp->data_types){ X fprintf(fp,"\n\nprocedure free_%s%s_record(\n",prefix,temp->name); X fprintf(fp,"\t\tstart : integer ) ;\n\n"); X fprintf(fp,"var\n\ti : integer ;\n\nbegin\n"); X fprintf(fp,"\ti := start ;\n"); X fprintf(fp,"\twhile( i < %s%s_max )do begin\n",prefix, temp->name); X fprintf(fp,"\t\tif( %s%s_list[i] <> nil )then begin\n",prefix, temp->name); X fprintf(fp,"\t\t\tif( %s%s_list[i]^.prev = nil )then\n",prefix, temp->name); X fprintf(fp,"\t\t\t\t%s%s_list[0] := %s%s_list[i]^.next\n",prefix,temp->name,prefix,temp->name); X fprintf(fp,"\t\t\telse\n"); X fprintf(fp,"\t\t\t\t%s%s_list[i]^.prev^.next := %s%s_list[i]^.next ;\n",prefix,temp->name,prefix,temp->name); X fprintf(fp,"\t\t\tif( %s%s_list[i]^.next <> nil )then\n",prefix,temp->name); X fprintf(fp,"\t\t\t\t%s%s_list[i]^.next^.prev := %s%s_list[i]^.prev ;\n",prefix,temp->name,prefix,temp->name); X temp2 = temp->data_types; X fprintf(fp,"\t\t\tdispose( %s%s_list[i] ) ;\n",prefix,temp->name); X fprintf(fp,"\t\t\t%s%s_list[i] := nil ;\n",prefix,temp->name); X fprintf(fp,"\t\t\ti := %s%s_max ;\n",prefix,temp->name); X fprintf(fp,"\t\t\t%stoken[%s%s]:= %stoken[%s%s]-1 ;\n",prefix,prefix,temp->name,prefix,temp->name); X fprintf(fp,"\t\tend ;\n\t\ti := i+1 ;\n\tend ;\nend ;\n"); X } X temp = temp->next; X } X} X X Xp_gen_restore() X/* generate procedure to restore structures */ X{ X int i; X struct def_list *temp; X X temp = token_list; X fprintf(fp,"\n\nprocedure %srestore ;\n\n", prefix); X fprintf(fp,"var\n\ti : integer ;\n\nbegin\n"); X while(temp){ X if(temp->data_types){ X fprintf(fp,"\tfor i := 1 to %s%s_max-1 do\n", prefix,temp->name); X fprintf(fp,"\t\tif(%s%s_list[i] <> nil)then begin\n", prefix,temp->name); X fprintf(fp,"\t\t\t%s%s_list[i]^.MARK := 0 ;\n", prefix, temp->name); X fprintf(fp,"\t\t\t%s%s_list[i] := nil ;\n", prefix,temp->name); X fprintf(fp,"\t\tend ;\n"); X } X temp = temp->next; X } X fprintf(fp,"end ;\n"); X} X X Xp_gen_add() X/* generate procedures to add each structure to a list */ X{ X int i; X struct def_list *temp; X struct data_type *temp2; X X temp = token_list; X while(temp){ X fprintf(fp,"\nprocedure %sadd_%s_record", prefix,temp->name); X if(temp->data_types){ X fprintf(fp,"(\n"); X temp2 = temp->data_types; X i = 0; X while(temp2){ X if(i) fprintf(fp," ;\n"); X if((temp2->type >= 0) && (temp2->type <= 2)) X fprintf(fp,"\t\t%s: %s",temp2->name,p_type_names[temp2->type]); X i=1; X temp2 = temp2->next; X } X fprintf(fp," )"); X } X fprintf(fp," ;\n\n"); X if(temp->data_types){ X fprintf(fp,"var\n"); X fprintf(fp,"\ttemp : %s%s_record_ptr ;\n", prefix, temp->name); X } X fprintf(fp,"\nbegin\n"); X if(temp->data_types){ X fprintf(fp,"\tnew(temp) ;\n"); X temp2 = temp->data_types; X while(temp2){ X if(temp2->type <= 2) X fprintf(fp,"\ttemp^.%s := %s ;\n",temp2->name,temp2->name); X temp2 = temp2->next; X } X fprintf(fp,"\ttemp^.MARK := 0 ;\n"); X fprintf(fp,"\ttemp^.next := %s%s_list[0] ;\n",prefix,temp->name); X fprintf(fp,"\ttemp^.prev := nil ;\n"); X fprintf(fp,"\tif(%s%s_list[0] <> nil)then\n",prefix,temp->name); X fprintf(fp,"\t\t%s%s_list[0]^.prev := temp ;\n",prefix,temp->name); X fprintf(fp,"\t%s%s_list[0] := temp ;\n",prefix,temp->name); X } X fprintf(fp,"\t%stoken[%s%s] := %stoken[%s%s]+1 ;\n",prefix,prefix,temp->name,prefix,prefix,temp->name); X fprintf(fp,"end ;\n\n"); X temp = temp->next; X } X} X Xp_gen_init(mode) X/* generate procedure to initialize stm */ X/* if mode is zero, then generate only code to add to stm */ Xint mode; X{ X int i; X struct init *temp; X struct fields *temp2; X struct def_list *t, *d_temp; X struct data_type *t2; X X temp = init_list->next; /* the first one is a place holder */ X if(mode){ X fprintf(fp,"\n\nprocedure %sinit ;\n\nvar\n\ti : integer ;\n\n", prefix); X fprintf(fp,"begin\n"); X fprintf(fp,"\tfor i := 0 to %d do\n",total_tokens-1); X fprintf(fp,"\t\t%stoken[i] := 0 ;\n",prefix); X d_temp = token_list; X for(i = 0; i < total_tokens; i++){ X fprintf(fp,"\t%stoken_name[%d] := '%s%s' ;\n",prefix,i,prefix,d_temp->name); X d_temp = d_temp->next ; X } X d_temp = token_list; X while(d_temp){ X if(d_temp->data_types){ X fprintf(fp,"\tfor i := 0 to %s%s_max do begin\n",prefix,d_temp->name); X fprintf(fp,"\t\t%s%s_list[i] := nil ;\n",prefix,d_temp->name); X fprintf(fp,"\t\t%s%s_temp[i] := nil ;\n",prefix,d_temp->name); X fprintf(fp,"\tend ;\n"); X } X d_temp = d_temp->next; X } X } X while(temp){ X if(temp->count){ X if(mode == 0) fprintf(fp,"\t\t"); X fprintf(fp,"\tfor i := 0 to %d do\n\t",temp->count-1); X } X if(mode == 0) fprintf(fp,"\t\t"); X fprintf(fp,"\t%sadd_%s_record" , prefix, temp->object); X t = token_list; X while(strcmp(t->name, temp->object) != 0) X t = t->next; X i = 0; X t2 = t->data_types; X if(t->data_types) fprintf(fp,"( "); X while(t2){ X temp2 = temp->items; X while((temp2) && (strcmp(temp2->element, t2->name) != 0)) X temp2 = temp2->next; X if((temp2) && (temp2->type != 3)){ X if(i) fprintf(fp,", "); i = 1; X if(temp2->type >= 0){ X if(temp2->type == 2) fprintf(fp,"'"); X fprintf(fp,"%s",temp2->value); X if(temp2->type == 2) fprintf(fp,"'"); X } X else{ X if(temp2->empty) X fprintf(fp,"%s%s_empty[%d].%s", prefix,temp2->object, X temp2->index, temp2->value); X else X fprintf(fp,"%s%s_list[%d]^.%s", prefix,temp2->object, X temp2->index, temp2->value); X } X } X else if(t2->type != 3){ X if(i) fprintf(fp,", "); i = 1; X if(t2->type == 2) X fprintf(fp,"''"); X if(t2->type == 1) X fprintf(fp,"0.0"); X if(t2->type == 0) X fprintf(fp,"0"); X } X t2 = t2->next; X } X if(t->data_types) fprintf(fp," )"); X fprintf(fp," ;\n"); X temp = temp->next; X } X if(mode){ X fprintf(fp,"end ;\n\n\n"); X } X} X X Xp_gen_structs() X/* generate structure definitions from token list */ X{ X int i; X struct def_list *temp; X struct data_type *temp2; X X i = 0; X temp = token_list; X while(temp){ X if(temp->data_types){ X fprintf(fp,"\n\t%s%s_record_ptr = ^%s%s_record ;\n", prefix,temp->name,prefix,temp->name); X fprintf(fp,"\n\t%s%s_record = record\n",prefix,temp->name); X if(temp->data_types){ X temp2 = temp->data_types; X while(temp2){ X if(temp2->type != 3) X fprintf(fp,"\t\t%s : %s ;\n",temp2->name,p_type_names[temp2->type]); X else X fprintf(fp,"\t\t%s : %s%s_record_ptr ;\n", temp2->name,prefix,temp->name); X temp2 = temp2->next; X } X } X fprintf(fp,"\t\tMARK : integer ;\n"); X fprintf(fp,"\t\tprev : %s%s_record_ptr ;\n", prefix,temp->name); X fprintf(fp,"\t\tnext : %s%s_record_ptr ;\n", prefix,temp->name); X fprintf(fp,"\tend ;\n\n"); X } X i++; X temp = temp->next; X } X} X X Xp_gen_zero() X/* Xgenerate a procedure that will free or zero all data Xstructures generated by trc X*/ X{ X int i; X struct def_list *d_temp; X struct data_type *dt_temp; X X fprintf(fp,"\n\nprocedure %szero ;\n\nvar\n\ti : integer ;\n",prefix); X /* pointer definitions */ X d_temp = token_list; X while(d_temp){ X if(d_temp->data_types) X fprintf(fp,"\t%s_tmp : %s%s_record_ptr ;\n", d_temp->name, prefix, d_temp->name); X d_temp = d_temp->next; X } X fprintf(fp,"\nbegin\n"); X /* free struct lists */ X d_temp = token_list; X while(d_temp){ X if(d_temp->data_types){ X fprintf(fp,"\twhile( %s%s_list[0] <> nil )do begin\n", prefix,d_temp->name); X fprintf(fp,"\t\t%s%s_list[1] := %s%s_list[0] ;\n", prefix,d_temp->name, prefix,d_temp->name); X fprintf(fp,"\t\tfree_%s%s_record(1);\n\tend ;\n", prefix,d_temp->name); X } X d_temp = d_temp->next; X } X /* zero structure pointers */ X d_temp = token_list; X while(d_temp){ X if(d_temp->data_types){ X fprintf(fp,"\tfor i := 0 to %s%s_max-1 do begin\n", prefix,d_temp->name); X fprintf(fp,"\t\t%s%s_list[i] := nil ;\n", prefix,d_temp->name); X fprintf(fp,"\t\t%s%s_temp[i] := nil ;\n", prefix,d_temp->name); X fprintf(fp,"\tend ;\n"); X } X d_temp = d_temp->next; X } X /* zero integer arrays */ X fprintf(fp,"\tfor i := 0 to %d do\n",total_tokens-1); X fprintf(fp,"\t\t%stoken[i] := 0 ;\n", prefix); X fprintf(fp,"end ;\n"); X} X X Xp_trans_code(rule, list, fp, label) Xstruct rule *rule; Xstruct list *list; XFILE *fp; Xchar *label; X{ X struct match *m_temp; X struct list *l_temp; X int i, j; X char c[512]; X X l_temp = list; X while(l_temp){ X i = 0; X while(l_temp->name[i]){ X if(l_temp->name[i] == '$'){ X i++; j = 0; X while(l_temp->name[i] != '.'){ X c[j] = l_temp->name[i]; X if(c[j] == '\0'){ X fprintf(stderr,"cannot translate %s in rule %s\n",c, rule->label); X fprintf(stderr,"%s\n", l_temp->name); X return; X } X i++; j++; X } X i++; X c[j] = '\0'; X m_temp = rule->complex; X if((strcmp(c, "FAIL")) == 0){ X fprintf(fp,"begin"); X if(rule->recurs == 0) X fprintf(fp,"\n\t\t\t\t%srestore ;\n",prefix); X fprintf(fp,"\t\t\t\t{1}goto %s\n\t\t\tend\n",label); X } X else{ X while(m_temp && j){ X if((strcmp(c, m_temp->free_name)) == 0){ X fprintf(fp,"%s%s_", prefix , m_temp->object); X if(m_temp->empty) X fprintf(fp,"empty[%d].", m_temp->index); X else X fprintf(fp,"list[%d]^.", m_temp->index); X j = 0; X } X m_temp = m_temp->next; X } X if(j){ X fprintf(stderr,"cannot translate %s in rule %s\n",c, rule->label); X fprintf(stderr,"%s\n", l_temp->name); X return; X } X } X } X else{ X fprintf(fp,"%c",l_temp->name[i]); X i++; X } X } X fprintf(fp,"\n"); X l_temp = l_temp->next; X } X} X X Xp_gen_header() X{ X struct list *l_temp; X struct def_list *d_temp; X int i,j; X X l_temp = header_code ; X while(l_temp){ X fprintf(fp,"%s\n",l_temp->name); X l_temp = l_temp->next; X } X d_temp = token_list; X fprintf(fp,"const\n"); X for(i = 0; i < total_tokens; i++){ X fprintf(fp,"\t%s%s = %d ;\n",prefix,d_temp->name,i); X j = max_free[i]; X if(j < 2) j = 2; X fprintf(fp,"\t%s%s_max = %d ;\n",prefix, d_temp->name, j); X d_temp = d_temp->next; X } X fprintf(fp,"\ntype\n\tstrings = string[20] ;"); X p_gen_structs(); X fprintf(fp,"\nvar\n"); X fprintf(fp,"\t%stotal_tokens : integer ;\n",prefix); X fprintf(fp,"\t%stoken : array[0..%d]of integer ;\n",prefix,total_tokens-1); X fprintf(fp,"\t%stoken_name : array[0..%d]of strings ;\n",prefix,total_tokens-1); X i = 0; X d_temp = token_list; X while(d_temp){ X if(d_temp->data_types){ X fprintf(fp,"\t%s%s_list : array[0..%s%s_max]of %s%s_record_ptr ;\n",prefix,d_temp->name,prefix,d_temp->name,prefix,d_temp->name); X fprintf(fp,"\t%s%s_temp : array[0..%s%s_max]of %s%s_record_ptr ;\n",prefix,d_temp->name,prefix,d_temp->name,prefix,d_temp->name); X if(max_empty[i]) X fprintf(fp,"\t%s%s_empty : array[0..%d]of %s%s_record ;\n",prefix,d_temp->name,max_empty[i]-1,prefix,d_temp->name); X } X d_temp = d_temp->next; X i++; X } X} X X Xp_translate() X/* Produce the output code */ X{ X int i, j, k, l, count, prev_index, label_count; X char s[512]; X struct list *l_temp; X struct def_list *d_temp, *d_temp2; X struct data_type *dt_temp; X struct rule *r_temp, *r_temp2, *r_const; X struct match *m_temp, *m_temp2, *m_temp3, *m_temp4; X struct test *t_temp; X struct list *label_temp; X X if((fp = fopen("loop.p", "w")) == NULL){ X fprintf(stderr,"Unable to open loop.p\n"); X exit(); X } X if((lp = fopen("loop.l", "w")) == NULL){ X fprintf(stderr,"Unable to open loop.l\n"); X exit(); X } X p_gen_header(); X p_gen_free(); X p_gen_restore(); X p_gen_test(); X p_gen_search(); X p_gen_add(); X init_list = init_list2; X p_gen_init(1); X fprintf(fp,"\nprocedure %sloop ;\n\nvar\n\ti : integer ;\n", prefix); X fprintf(fp,"\nlabel\n\tStart,\n****labels*****\n\tStop ;\n\nbegin\n"); X fprintf(fp,"\twhile True do begin\n%sStart:\n", prefix); X r_temp = rule_list; X while(r_temp->next != NULL) X r_temp = r_temp->next; X r_const = r_temp; X while(r_temp){ X X /* label of this rule */ X fprintf(fp,"%s%s:\n", prefix,r_temp->label); X fprintf(lp,"\t%s%s,\n", prefix, r_temp->label); X X /* test for code that must precede all tests */ X m_temp3 = m_temp = r_temp->complex; X /* skip over empty definitions */ X while((m_temp) && (m_temp->empty)){ X m_temp3 = m_temp; X m_temp = m_temp->next; X } X /* if the first non empty entry is c_code it must precede all tests */ X if(m_temp) X if(m_temp->c_code){ X if(r_temp->prev) X sprintf(s,"%s%s\0",prefix, r_temp->prev->label); X else X sprintf(s,"%sEnd\0",prefix); X p_trans_code(r_temp, m_temp->c_code, fp, s); X /* unlink the code so it isn't inserted twice */ X m_temp3->next = m_temp->next; X } X X /* test for object counts */ X fprintf(fp,"\t\tif("); X d_temp = token_list; X for(i = 0; i < total_tokens; i++){ X if(r_temp->search[i] > 0) X fprintf(fp,"(%stoken[%s%s] >= %d) and\n\t\t\t", prefix, prefix,d_temp->name,r_temp->search[i]); X if(r_temp->search[i] < 0) X fprintf(fp,"(%stoken[%s%s] <= 0) and\n\t\t\t", prefix, prefix,d_temp->name); X d_temp = d_temp->next; X } X d_temp = token_list; X fprintf(fp,"True)then begin"); X X /* generate complex matching code */ X X /* first initialize the current free variable matrix */ X for(i = 0; i < total_tokens; i++) X current_free[i] = 1; X X m_temp = m_temp3 = r_temp->complex; X prev_index = 0; X while(m_temp){ X if(m_temp->c_code){ X if((prev_index == 0) || (r_temp->recurs == 0)){ X if(r_temp->prev) X sprintf(s,"%s%s\0", prefix,r_temp->prev->label); X else X sprintf(s,"%s\0End", prefix); X } X else X sprintf(s,"%s%s_%s_%d\0", prefix, X r_temp->label, m_temp3->object, prev_index); X p_trans_code(r_temp, m_temp->c_code, fp, s); X } X else if(m_temp->empty){ X /* declaration only - don't generate any code */ X i = 0; X } X else{ X i = 0; X d_temp = token_list; X while(strcmp(m_temp->object, d_temp->name) != 0){ X i++; X d_temp = d_temp->next; X } X if(d_temp->data_types){ X for(count = 0; count < m_temp->count; count++){ X X /* initialize temp */ X fprintf(fp,"\n\t\t\t%s%s_temp[%d] := %s%s_list[0];\n" X , prefix, m_temp->object, current_free[i], prefix, m_temp->object); X X /* print a label */ X if(r_temp->recurs){ X fprintf(fp,"%s%s_%s_%d:\n",prefix,r_temp->label,m_temp->object,current_free[i]); X fprintf(lp,"\t%s%s_%s_%d,\n",prefix,r_temp->label,m_temp->object,current_free[i]); X } X X /* free the previously found item */ X if(r_temp->recurs){ X fprintf(fp,"\t\t\tif(%s%s_list[%d]<>nil)\n", prefix, m_temp->object, current_free[i]); X fprintf(fp,"\t\t\t\t%s%s_list[%d]^.MARK := 0;\n", prefix, m_temp->object, current_free[i]); X } X X /* do the search */ X fprintf(fp,"\t\t\t%s%s_list[%d] := search_%s%s_record(%d" X , prefix , m_temp->object, current_free[i], prefix, m_temp->object, current_free[i]); X dt_temp = d_temp->data_types; X while(dt_temp){ X if(dt_temp->type <= 2){ X t_temp = m_temp->tests; X j = 1; X while(j && t_temp){ X if(strcmp(t_temp->element, dt_temp->name) == 0){ X j = 0; X if((t_temp->type == 0) || (t_temp->type == 1)) X fprintf(fp,", %s",t_temp->value); X if(t_temp->type == 2) X fprintf(fp,", '%s'",t_temp->value); X if(t_temp->type == -1){ X if(t_temp->id) X fprintf(fp,", 0"); X else{ X l = 0; X m_temp2 = r_temp->complex; X while(m_temp2){ X if(strcmp(m_temp2->free_name, t_temp->free_name) == 0){ X l = m_temp2->index; X m_temp4 = m_temp2; X m_temp2 = NULL; X } X else X m_temp2 = m_temp2->next; X } X if(m_temp4->empty) X fprintf(fp,", %s%s_empty[%d].%s", prefix,m_temp4->object,l,t_temp->value); X else X fprintf(fp,", %s%s_list[%d]^.%s", prefix,m_temp4->object,l,t_temp->value); X } X } X fprintf(fp,", %d", t_temp->relop); X if(dt_temp->elts) X fprintf(fp,", %d",t_temp->id); X } X else X t_temp = t_temp->next; X } X if(j){ X switch(dt_temp->type){ X case 0: fprintf(fp,", 0, 7"); X break; X case 1: fprintf(fp,", 0.0, 7"); X break; X case 2: fprintf(fp,", '', 7"); X default: break; X } X if(dt_temp->elts) X fprintf(fp,", 0"); X } X } X dt_temp = dt_temp->next; X } X fprintf(fp,") ;\n"); X fprintf(fp,"\t\t\tif( %s%s_list[%d] = nil )then begin\n",prefix, m_temp->object,current_free[i]); X /* search failed on first of rule */ X X if((prev_index == 0) || (r_temp->recurs == 0)){ X fprintf(fp,"\t\t\t\t%srestore ;\n", prefix); X if(r_temp->prev) X fprintf(fp,"\t\t\t\t{2}goto %s%s;\n\t\t\tend ;", prefix,r_temp->prev->label); X else X fprintf(fp,"\t\t\t\t{3}goto %sStop ;\n\t\t\tend ;", prefix); X } X X /* search failed - not first of rule */ X else{ X fprintf(fp,"\t\t\t\t{4}goto %s%s_%s_%d ;\n\t\t\tend ;", prefix, X r_temp->label, m_temp3->object, prev_index); X } X X /* move index one beyond the one currently found */ X if(r_temp->recurs) fprintf(fp,"\n\t\t\t%s%s_temp[%d] := %s%s_list[%d]^.next;", prefix, X m_temp->object, current_free[i], prefix, X m_temp->object, current_free[i]); X X m_temp3 = m_temp; X prev_index = current_free[i]; X current_free[i]++; X } X } X } X m_temp = m_temp->next; X } X X /* get rule number for next 3 statements */ X X i = 1; X r_temp2 = r_const; X while(r_temp != r_temp2){ X r_temp2 = r_temp2->prev; X i++; X } X X X /* generate ADD code */ X X fprintf(fp,"\n"); X init_list = r_temp->add; X p_gen_init(0); X X /* generate MARK code */ X /* first MARK objects deleted by name */ X m_temp = r_temp->complex; X while(m_temp){ X if(m_temp->mark){ X d_temp = token_list; X while(strcmp(m_temp->object, d_temp->name)) X d_temp = d_temp->next; X if(d_temp->data_types) X fprintf(fp,"\n\t\t\t\tfree_%s%s_record(%d) ;", prefix,m_temp->object, m_temp->index); X else X fprintf(fp,"\n\t\t\t\t%stoken%s[%s] := %stoken%s[%s]-1 ;", prefix, prefix,d_temp->name,prefix,prefix,d_temp->name); X } X m_temp = m_temp->next; X } X X /* now MARK the rest of the objects */ X d_temp = token_list; X for(i = 0; i < total_tokens; i++){ X if(r_temp->mark[i]){ X fprintf(fp,"\n\t\t\tfor i := 0 to %d do",r_temp->mark[i]-1); X if(d_temp->data_types) X fprintf(fp,"\n\t\t\t\tfree_%s%s_record(1) ;", prefix,d_temp->name); X else X fprintf(fp,"\n\t\t\t\t%stoken%s[%s] := %stoken%s[%s]-1 ;", prefix,prefix,d_temp->name,prefix,prefix,d_temp->name); X } X d_temp = d_temp->next; X } X d_temp = token_list; X X fprintf(fp,"\n\t\t\t%srestore ;\n", prefix); X X l_temp = r_temp->c_code; X p_trans_code(r_temp, l_temp, fp); X if(find_name(r_temp->opt)) X fprintf(fp,"\t\t\t{5}goto %s%s;\n\t\tend ;\n", prefix, r_temp->opt); X else X fprintf(fp,"\t\t\tgoto %sStart;\n\t\tend ;\n", prefix); X r_temp = r_temp->prev; X } X fprintf(fp,"\t\tgoto Stop ;\n\tend ;\n%sStop:\n", prefix); X fprintf(fp,"\nend ;\n"); X if(zeroing) X p_gen_zero; X l_temp = trailer_code; X while(l_temp){ X fprintf(fp,"%s\n",l_temp->name); X l_temp = l_temp->next; X } X} X !EOR! echo extracting - parser sed 's/^X//' > parser << '!EOR!' X%{ X#include "main.h" Xint ii, jj, st, last_free; X%} X X%start file X X%token DELIM ARROW TOKEN MARK ADD C_CODE NOT INT FLOAT STRING POINTER X%token OPTIMIZE INTEGER DOUBLE STR LE GE LT GT EQ NE HAT RECURS SEMI X%token BACKTRACK TRACE PROFILE DUMP NORECURS PREFIX EMPTY SAVE ZERO PASCAL X X%% X X Xfile : header defs stm ltm DELIM trailer X | error X { X fprintf(stderr,"%d: syntax error\n", lineno); X errors++; X } X ; X Xheader : error DELIM X { X fprintf(stderr,"%d: syntax error in header\n",lineno); X errors++; X } X | DELIM X { X st = 1; X last_free = 0; X } X | C_CODE DELIM X { X st = 1; X do_header(); X } X ; X Xdefs : definitions DELIM X { X insert_rule(); X stm = (int *) calloc(total_tokens, sizeof(int)); X current_free = (int *) calloc(total_tokens, sizeof(int)); X current_empty = (int *) calloc(total_tokens, sizeof(int)); X max_free = (int *) calloc(total_tokens, sizeof(int)); X max_empty = (int *) calloc(total_tokens, sizeof(int)); X for(ii = 0; ii < total_tokens; ii++){ X max_free[ii] = current_free[ii] = 1; X max_empty[ii] = current_empty[ii] = 0; X } X } X ; X Xdefinitions : /* empty */ X | error X { X fprintf(stderr,"%d: syntax error in definition\n",lineno); X errors++; X } X | definitions definition X ; X Xdefinition : TOKEN X { X insert_token($1); X } X | TOKEN '(' item_list ')' X { X insert_token($1); X } X ; X Xitem_list : /* empty */ X | item_list item X ; X Xitem : TOKEN ':' type X { X if(add_struct($1, $3) == -1){ X fprintf(stderr,"%d: duplicate name in definition -> %s\n", lineno, $1); X errors++; X } X } X ; X Xtype : INT X { X $$ = 0; X } X | FLOAT X { X $$ = 1; X } X | STRING X { X $$ = 2; X } X | POINTER X { X $$ = 3; X } X ; X Xstm : error DELIM X { X fprintf(stderr,"%d: syntax error in short term memory\n",lineno); X errors++; X } X | st DELIM X { X st = 0; /* no longer parsing stm */ X init_list2 = init_list; /* save init_list */ X init_list = NULL; X insert_init(); /* make a new init_list */ X build_case_list(); /* prepare cross reference for ltm */ X } X ; X Xst : /* empty */ X | st entry X ; X Xentry : count TOKEN X { X if((ii = find_token($2)) < 0){ X fprintf(stderr,"%d: undefined object -> %s\n",lineno,$2); X errors++; X } X else{ X if(st) stm[ii]++; /* if stm is being parsed */ X do_init_list($2); X insert_count($1); X insert_init(); X } X } X | count TOKEN '(' init_list ')' X { X if((ii = find_token($2)) < 0){ X fprintf(stderr,"%d: undefined object -> %s\n",lineno,$2); X errors++; X } X else{ X if(st) stm[ii]++; /* if stm is being parsed */ X do_init_list($2); X insert_count($1); X insert_init(); X } X } X ; X X Xcount : /* empty */ X { X $$ = 1; X } X | INTEGER X { X jj = atoi($1); X if(jj < 0){ X $$ = 1; X fprintf(stderr,"%d: negative count is undefined\n", lineno); X errors++; X } X else if(jj == 0){ X $$ = 1; X fprintf(stderr,"%d: zero count is undefined\n", lineno); X errors++; X } X else X $$ = jj; X } X ; X X Xinit_list : /* empty */ X | init_list init_item X ; X Xinit_item : TOKEN ARROW INTEGER X { X insert_fields($1, $3, 0, 0, 0); X } X | TOKEN ARROW DOUBLE X { X insert_fields($1, $3, 0, 1, 0); X } X | TOKEN ARROW STR X { X insert_fields($1, $3, 0, 2, 0); X } X | TOKEN ARROW TOKEN '.' TOKEN X { X if(st) { X fprintf(stderr, X "%d: free variables are not permitted in stm\n", X lineno); X errors++; X } X else if((jj = find_free($3)) == -1){ X fprintf(stderr,"%d: undefined free variable -> %s\n",lineno, $3); X errors++; X } X else X insert_fields($1, $5, $3, -1, jj); X } X ; X X Xltm : opts lt X ; X Xopts : /* empty */ X | opts opt X ; X Xopt : BACKTRACK X { X backtracking = 1; X } X | TRACE X { X tracing = 1; X } X | PROFILE X { X profiling = 1; X } X | DUMP X { X dumping = 1; X } X | RECURS X { X recursing = 1; X rule_list->recurs = 1; X } X | NORECURS X { X recursing = 0; X rule_list->recurs = 0; X } X | PREFIX TOKEN X { X prefix = (char *) $2; X } X | SAVE X { X saving = 1; X } X | ZERO X { X zeroing = 1; X } X | PASCAL X { X pascal = 1; X } X ; X Xlt : /* empty */ X | lt production X ; X Xproduction : error SEMI X { X fprintf(stderr,"%d: syntax error in previous rule\n",lineno); X errors++; X } X | label lhs ARROW rhs SEMI X { X pnum++; X rule_list->add = init_list; X init_list = NULL; X insert_init(); X insert_rule(); X if(recursing) X rule_list->recurs = 1; X for(ii = 0; ii < total_tokens; ii++){ X if(max_free[ii] < current_free[ii]) X max_free[ii] = current_free[ii]; X if(max_empty[ii] < current_empty[ii]) X max_empty[ii] = current_empty[ii]; X current_free[ii] = 1; X current_empty[ii] = 0; X } X } X ; X Xlabel : TOKEN ':' X { X if(find_name($1)){ X fprintf(stderr,"%d: redefined label -> %s\n",lineno,$1); X errors++; X } X else if((find_token($1)) >= 0){ X fprintf(stderr,"%d: label repeats object declaration -> %s\n",lineno, $1); X errors++; X } X else{ X insert_label($1); X } X } X | ':' X { X insert_label(gen_next_label()); X } X ; X Xlhs : /* empty */ X | lhs match X ; X Xmatch : count TOKEN X { X if((ii = find_token($2)) == -1){ X fprintf(stderr,"%d: undefined object -> %s\n",lineno,$2); X errors++; X } X else if(rule_list->search[ii] < 0){ X fprintf(stderr,"%d: can't have %s and NOT %s in the same rule\n", lineno, $2, $2); X errors++; X } X else{ X add_test($2, 0, 7, 0, 0, 0,current_free[ii], 0); X add_count($1); X if($1 > 1){ X rule_list->search[ii]+= $1; X current_free[ii]+= $1; X } X else{ X rule_list->search[ii]++; X current_free[ii]++; X } X } X current_match = NULL; X current_test = NULL; X } X | NOT TOKEN X { X if((ii = find_token($2)) == -1){ X fprintf(stderr,"%d: undefined object -> %s\n",lineno,$2); X errors++; X } X else if(rule_list->search[ii]){ X fprintf(stderr,"%d: can't have %s and NOT %s in the same rule\n", lineno, $2, $2); X errors++; X } X else rule_list->search[ii]--; X current_match = NULL; X current_test = NULL; X X } X | count '(' free_variable match_list ')' X { X last_free = 0; X if(($1 > 1) && $3){ X fprintf(stderr,"%d: count on free variables undefined\n", lineno); X errors++; X } X add_count($1); X current_match = NULL; X current_test = NULL; X if($1 > 1){ X current_free[ii]+= $1; X rule_list->search[ii]+= $1; X } X else{ X current_free[ii]++; X rule_list->search[ii]++; X } X } X | EMPTY TOKEN TOKEN X { X if((ii = find_token($2)) == -1){ X fprintf(stderr,"%d: undefined object -> %s\n",lineno,$2); X errors++; X } X else if(find_free($3) != -1){ X fprintf(stderr,"%d: free variable already defined -> %s\n",lineno, $3); X errors++; X } X else{ X add_test($2,0,-1,0,0,$3, current_empty[ii], -1); X current_empty[ii]++; X current_match = NULL; X current_test = NULL; X } X } X | C_CODE X { X add_test_code(); X } X | RECURS X { X rule_list->recurs = 1; X } X | NORECURS X { X rule_list->recurs = 0; X } X ; X Xfree_variable : /* empty */ X { X $$ = 0; X } X | HAT TOKEN TOKEN X { X if((ii = find_token($2)) == -1){ X fprintf(stderr,"%d: undefined object -> %s\n",lineno,$2); X errors++; X } X else if(find_free($3) != -1){ X fprintf(stderr,"%d: free variable already defined -> %s\n",lineno, $3); X errors++; X } X else if(rule_list->search[ii] < 0){ X fprintf(stderr,"%d: can't have %s and NOT %s in the same rule\n", lineno, $2, $2); X errors++; X } X else{ X add_test($2,0,7,0,0,$3, current_free[ii], 0); X last_free = $3; X } X $$ = 1; X } X ; X X Xmatch_list : /* empty */ X { X } X | match_list match_element X { X } X ; X X Xmatch_element : TOKEN '.' TOKEN relop INTEGER X { X if((ii = find_token($1)) == -1){ X fprintf(stderr,"%d: undefined object -> %s\n",lineno,$1); X errors++; X } X else if(rule_list->search[ii] < 0){ X fprintf(stderr,"%d: can't have %s and NOT %s in the same rule\n", lineno, $1, $1); X errors++; X } X else if((jj = search_structs($1,$3)) < 0){ X fprintf(stderr,"%d: undefined object field -> %s.%s\n",lineno,$1,$3); X errors++; X } X else if(jj != 0){ X fprintf(stderr,"%d: object field must be integer\n", lineno); X } X else{ X add_test($1,$3,$4,$5,0, 0, current_free[ii], 0); X } X } X X | TOKEN '.' TOKEN relop DOUBLE X { X if((ii = find_token($1)) == -1){ X fprintf(stderr,"%d: undefined object -> %s\n",lineno,$1); X errors++; X } X else if(rule_list->search[ii] < 0){ X fprintf(stderr,"%d: can't have %s and NOT %s in the same rule\n", lineno, $1, $1); X errors++; X } X else if((jj = search_structs($1,$3)) < 0){ X fprintf(stderr,"%d: undefined object field -> %s.%s\n",lineno,$1,$3); X errors++; X } X else if(jj != 1){ X fprintf(stderr,"%d: object field must be double\n",lineno); X } X else{ X add_test($1,$3,$4,$5,1, 0, current_free[ii], 0); X } X } X X | TOKEN '.' TOKEN relop STR X { X if((ii = find_token($1)) == -1){ X fprintf(stderr,"%d: undefined object -> %s\n",lineno,$1); X errors++; X } X else if(rule_list->search[ii] < 0){ X fprintf(stderr,"%d: can't have %s and NOT %s in the same rule\n", lineno, $1, $1); X errors++; X } X else if((jj = search_structs($1,$3)) < 0){ X fprintf(stderr,"%d: undefined object field -> %s.%s\n",lineno,$1,$3); X errors++; X } X else if(jj != 2){ X fprintf(stderr,"%d: object field must be a string\n",lineno); X } X else{ X add_test($1,$3,$4,$5,2,0,current_free[ii], 0); X } X } X | TOKEN '.' TOKEN relop TOKEN '.' TOKEN X { X if((ii = find_token($1)) == -1){ X fprintf(stderr,"%d: undefined object -> %s\n",lineno,$1); X errors++; X } X else if(rule_list->search[ii] < 0){ X fprintf(stderr,"%d: can't have %s and NOT %s in the same rule\n", lineno, $1, $1); X errors++; X } X else if(search_structs($1,$3) < 0){ X fprintf(stderr,"%d: undefined element -> %s.%s\n",lineno,$1,$3); X errors++; X } X else if((find_free($5) == -1) /* not a free var */ X || ((jj = strcmp($5, last_free)) == 0)){ X if(jj == 0) X $5 = $1; X if(strcmp($1, $5) != 0){ X fprintf(stderr,"%d: semantic error: use a free variable\n",lineno); X errors++; X } X else if(strcmp($3, $7) == 0){ X fprintf(stderr,"%d: degenerate case, please rewrite\n",lineno); X errors++; X } X else if(search_structs($5,$7) < 0){ X fprintf(stderr,"%d: undefined element -> %s.%s\n",lineno,$5,$7); X errors++; X } X else if(cmp_type($1, $3, $7) == -1){ X fprintf(stderr,"%d: types of %s.%s and %s.%s do not match\n", lineno, $1, $3, $5, $7); X errors++; X } X else{ X add_test($1,$3,$4,$7,-1, $5, current_free[ii], 0); X } X } X else if((jj = match_type($1, $3, $5, $7)) == 0){ X fprintf(stderr,"%d: types of %s.%s and %s.%s do not match\n", lineno, $1, $3, $5, $7); X errors++; X } X else{ X if((jj == 1) || (jj == 2)) X add_test($1,$3,$4,$7,-1, $5, current_free[ii], 0); X } X } X ; X Xrhs : optional_part pass_part X ; X Xoptional_part : /* empty */ X | optional_part option X ; X Xoption : MARK mark_list X | ADD add_list X | OPTIMIZE TOKEN X { X opt($2); X } X | MARK error X { X fprintf(stderr,"%d: syntax error in MARK statement\n", lineno); X errors++; X } X | ADD error X { X fprintf(stderr,"%d: syntax error in ADD statement\n", lineno); X errors++; X } X | OPTIMIZE error X { X fprintf(stderr,"%d: syntax error in OPTIMIZE statement\n", lineno); X errors++; X } X ; X Xmark_list : /* empty */ X | mark_list mark_item X ; X Xmark_item : count TOKEN X { X jj = 1; X if($1 >0) jj = $1; X if((ii = find_token($2)) == -1){ X if(mark_free($2)){ X if(jj > 1){ X fprintf(stderr,"%d: can't MARK more than 1 %s\n",lineno,$2); X errors++; X } X } X else{ X fprintf(stderr,"%d: undefined object -> %s\n",lineno,$2); X errors++; X } X } X else if(rule_list->search[ii] < (rule_list->mark[ii] + jj)){ X fprintf(stderr,"%d: can't mark more %s's than are found\n", lineno, $2); X errors++; X } X else{ X if($1) X rule_list->mark[ii]+= $1; X else X rule_list->mark[ii]++; X } X } X ; X X Xadd_list : entry X | add_list entry X ; X Xpass_part : /* empty */ X | C_CODE X { X do_code(); X } X ; X Xtrailer : /* empty */ X | error X { X fprintf(stderr,"%d: syntax error in trailer\n",lineno); X errors++; X } X | C_CODE X { X do_trailer(); X } X ; X Xrelop : LE /* <= */ X { X $$ = 6; X } X | GE /* >= */ X { X $$ = 3; X } X | LT /* < */ X { X $$ = 4; X } X | GT /* > */ X { X $$ = 1; X } X | EQ /* == */ X { X $$ = 2; X } X | NE /* != */ X { X $$ = 5; X } X ; X X%% X X#include "scanner.c" X X !EOR! echo extracting - scanner.c sed 's/^X//' > scanner.c << '!EOR!' X#include <stdio.h> X#include <ctype.h> X X#define NUM 19 X/* number of reserved words */ Xchar *words[NUM] = { /* the strings to compare against */ X "MARK", X "ADD", X "NOT", X "INT", X "FLOAT", X "STRING", X "POINTER", X "OPTIMIZE", X "RECURS", X "BACKTRACK", X "TRACE", X "PROFILE", X "DUMP", X "NORECURS", X "PREFIX", X "EMPTY", X "SAVE", X "PASCAL", X "ZERO" X }; Xint ret[NUM] = { /* the value to return to yyparse */ X MARK, X ADD, X NOT, X INT, X FLOAT, X STRING, X POINTER, X OPTIMIZE, X RECURS, X BACKTRACK, X TRACE, X PROFILE, X DUMP, X NORECURS, X PREFIX, X EMPTY, X SAVE, X PASCAL, X ZERO X }; X Xyylex() X{ X char c, s[512], *t; X int i, nb, dot, current_line; X X current_line = lineno; X i = nb = dot = 1; X while((c = getc(stdin)) != EOF){ X if(c == ' '); /* ignore white space */ X else if(c == '\t'); X else if(c == '%'){ X c = getc(stdin); X if(c == '%') X return(DELIM); X ungetc(c, stdin); X fprintf(stderr,"%d: unexpected '%c'\n", '%', lineno); X errors++; X } X else if(c == '.'){ X return('.'); X } X else if(c == ':'){ X return(':'); X } X else if(c == '('){ X return('('); X } X else if(c == ')'){ X return(')'); X } X else if(c == '^'){ X return(HAT); X } X else if(c == '\n'){ X lineno++; X } X else if(c == '>'){ X c = getc(stdin); X if(c == '=') X return(GE); X ungetc(c, stdin); X return(GT); X } X else if(c == '<'){ X c = getc(stdin); X if(c == '=') X return(LE); X ungetc(c, stdin); X return(LT); X } X else if(c == '!'){ X c = getc(stdin); X if(c == '=') X return(NE); X ungetc(c, stdin); X fprintf(stderr,"%d: unexpected '!'\n", lineno); X errors++; X } X else if(c == '='){ X c = getc(stdin); X if(c == '>') X return(ARROW); X if(c == '=') X return(EQ); X ungetc(c, stdin); X fprintf(stderr,"%d: unexpected '='\n", lineno); X errors++; X } X else if(c == ';'){ X return(SEMI); X } X else if(c == '{'){ X i = 0; X while(nb){ X c = getc(stdin); X if(c == EOF){ X fprintf(stderr,"%d: unterminated C code\n", current_line); X errors++; X return(EOF); X } X if(c == '}') { X nb--; X if(nb) X s[i++] = c; X else{ X s[i] = '\0'; X t = (char *) malloc (i + 1); X strcpy(t,s); X append_code(t); X return(C_CODE); X } X } X else{ X if(c == '{') nb++; X if((c == '\n') || (i == 510)){ X lineno++; X s[i] = '\0'; X t = (char *) malloc(i + 1); X strcpy(t,s); X append_code(t); X i = 0; X } X else X s[i++] = c; X } X } X return(C_CODE); X } X else if(c == '\042'){ X i = 0; X while(dot){ X c = getc(stdin); X if(c == '\042'){ X s[i] = '\0'; X dot = 0; X } X else if(c == '\n'){ X fprintf(stderr,"%d: newline embedded in string\n",lineno); X s[i] = '\0'; lineno++; X errors++; dot = 0; X } X else{ X s[i++] = c; X if(c == '\\') X s[i++] = getc(stdin); X } X } X yylval = malloc(strlen(s) + 1); X strcpy(yylval, s); X return(STR); X } X else if((isdigit(c)) || (c == '-')){ X s[0] = c; X i = 1; X while(i){ X c = getc(stdin); X if((isdigit(c)) || ((c == '.') && (dot))){ X s[i++] = c; X if(c == '.') dot = 0; X } X else{ X ungetc(c, stdin); X s[i] = '\0'; X i = 0; X } X } X yylval = malloc(strlen(s) + 1); X strcpy(yylval, s); X if(dot) return(INTEGER); X return(DOUBLE); X } X else if(isalpha(c)){ X s[0] = c; X i = 1; X while(i){ X c = getc(stdin); X if((c == '_') || (isalpha(c)) || (isdigit(c))){ X s[i++] = c; X } X else{ X ungetc(c, stdin); X s[i] = '\0'; X i = 0; X } X } X for(i = 0; i < NUM; i++) /* search the reserved word list */ X if(strcmp(words[i],s) == 0) X return(ret[i]); X yylval = malloc(strlen(s) + 1); X strcpy(yylval, s); X return(TOKEN); X } X else if(c == '/'){ /* check for comments */ X if((c = getc(stdin)) != '*'){ X ungetc(c, stdin); X printf("%d: unexpected '/'\n", lineno); X errors++; X } X else{ /* check for comment terminator */ X i = 1; X while(i){ X c = getc(stdin); X if(c == EOF){ X fprintf(stderr,"%d: unterminated comment\n", current_line); X errors++; X return(EOF); X } X else if(c == '\n') X lineno++; X else if(c == '*'){ X c = getc(stdin); X if(c == '/') X i = 0; X } X } X } X } X else{ X fprintf(stderr,"%d: unexpected or undefined character: \\0%o\n", lineno, c); X errors++; X } X } X return(c); X} X !EOR!