[comp.lang.prolog] Efficiency of DCGs and chart parsers

jmcn@castle.ed.ac.uk (J McNicol) (01/24/91)

I'm attempting to make DCG parsing more efficient by using a chart,
using (approximately) Pereira and Shieber's method.  However I find
that my new parser takes at least twice as long as standard Prolog 
since it has to do so much housekeeping.

It seems a shame to abandon it having got as far as a DCG-compatible 
translator; does anyone have any comments about chart parsing and Prolog, 
such as 'chart parsing can't be done efficiently enough in Prolog'?

The relative efficiency for the grammar I tested is twice as long 
if finding all solutions, three times as long if only finding the 
first solution.

Thank you for any comments!

Julian Smart

EMAIL jmcn@castle.ed.ac.uk

mark@adler.philosophie.uni-stuttgart.de (Mark Johnson) (01/24/91)

Regarding the use of a chart in a DCG parsers implemented in Prolog;
Pereira and Shieber themselves note that the constant factors involved
when the chart is stored by asserting it into the database outweigh
the speedup obtained.

One of the things that the chart does is cache intermediate results (lemmas)
of the computation in such a way that no computation is ever performed
twice.  But as Ken Church first pointed out to me, it may be more
expensive to actually store in and look up the cache than it is to
just redo the original calculation again.  Since asserting
is a very expensive process in Prolog, in most cases it will be
more efficient not to use an assert-based cache.

(There is one case where there is a win, of course.  Left-recursive
clauses cause an infinite number of essentially identical goals
with the top-down proof procedure of prolog, and an approach in
which a computation is performed at most once clearly wins here.
There are other approaches to left-recursion, including left-corner
parsing, which can handle left recursion and doesn't need caching).

I have implemented a couple of chart parsers in Prolog, and
to get half-way acceptable performance I had to do the following:

- Pass around the chart as an argument, rather than store it
  in the database.  This means that the parser itself never
  backtracks, and that edges must be copied before they are used
  (assuming that the categories in the edges are not always ground).

- Build efficient indexing into the chart (edges must be indexed
  on their left-hand string position, etc).

Mark Johnson

quiniou@irisa.fr (Rene Quiniou) (01/24/91)

In article <8014@castle.ed.ac.uk>, jmcn@castle.ed.ac.uk (J McNicol) writes:
| I'm attempting to make DCG parsing more efficient by using a chart,
| using (approximately) Pereira and Shieber's method.  However I find
| that my new parser takes at least twice as long as standard Prolog 
| since it has to do so much housekeeping.
| 
| It seems a shame to abandon it having got as far as a DCG-compatible 
| translator; does anyone have any comments about chart parsing and Prolog, 
| such as 'chart parsing can't be done efficiently enough in Prolog'?
| 


In New Generation Computing, 8 (1990) pp. 113-138, N.K. Simpkins and P. Hancox demonstrate a method called ``Word Incorporation'' which can improve chart parsing, as they state.

Perhaps this may help.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
  QUINIOU Rene                  quiniou@irisa.fr
  INRIA / IRISA                         Phone : +33 99 36 20 00
 Campus Universitaire de Beaulieu       Fax :   99 38 38 32
 35042 RENNES CEDEX - FRANCE            Telex : UNIRISA 950 473F
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

gosse@gufal02.let.rug.nl (Gosse Bouma) (01/26/91)

In article <MARK.91Jan24122000@adler.philosophie.uni-stuttgart.de> mark@adler.philosophie.uni-stuttgart.de (Mark Johnson) writes:
>
>I have implemented a couple of chart parsers in Prolog, and
>to get half-way acceptable performance I had to do the following:
>
>- Pass around the chart as an argument, rather than store it
>  in the database.  This means that the parser itself never
>  backtracks, and that edges must be copied before they are used
>  (assuming that the categories in the edges are not always ground).
 
How do you pass the chart around? as a list of chart items? I can hardly
believe that going through a list with, say, a few hundred elements is
more efficient in Prolog than just exploiting the data-base.
Also, the copying, which is avoided if you assert items, seems rather
expensive.

Anyway, in my (sicstus)
prolog chart parser for unification grammar, which just
asserts items dynamically, I can have up to 300 items (for 20 words) and 
have a response in 3 secs or so if the grammar isn't too wild. 
I find that reasonable, given the fact that
feature-unification, rather than prolog-unification, has to be performed and
that I have made no attempts whatsoever to index chart items efficiently
(in fact, the first argument is just the number of the item).
Also, given the fact that in unification grammars, the size of the chart items
tends to be rather large, using a list of items as additional argument
seems practically impossible to me. 

--
Gosse Bouma, Alfa-informatica, RUG, Groningen	               gosse@let.rug.nl

	All we cry, all we contend for, in this world of toil and blood,
	is beneath the notice of the hacker we call God.   (Th. Pynchon)

mark@adler.philosophie.uni-stuttgart.de (Mark Johnson) (01/27/91)

Gosse Bouma askes:

> How do you pass the chart around? as a list of chart items? I can hardly
> believe that going through a list with, say, a few hundred elements is
> more efficient in Prolog than just exploiting the data-base.

No, it shouldn't just be an unstructured list, of course.  
In general you want to index inactive edges on their
left-most string-positions (vertex) and active edges on their
right-most string-positions.  If you want to get n^3
behaviour (with CFGs) then you must index on both left
and right string-positions (if I remember Earley's algorithm
correctly).  Prolog textbooks discuss a variety of data structures 
for implementing such indexing.

(Macho types not fazed by circular structures caused by
e.g. ``empty'' edges dominating the null string can
avoid some of this indexing by representing a vertex 
as a set of edges, perhaps indexed by type and category, 
active edges as pairs consisting of a dotted rule and the
vertex at the left string-position, and inactive edges consisting of 
pairs of a category and the vertex at the right string-position.  
In both cases, an edge's "other" string-position is that of
the vertex of which it is member.
But be warned - I've had one Prolog which supposedly
could handle circular structures go into an infinite loop
when checking if vertices V1 and V2 satisfy V1==V2.  As
I've said elsewhere, Prolog needs either efficient arrays
or circular structures and preferably both, but that's another story!)

> Also, the copying, which is avoided if you assert items, seems rather
> expensive.

Of course, copying (renaming of variables) is one of the
things that happens during an assertion and a following
call.  I believe it is usually faster to copy a term
than it is to assert it into the database, especially if
the language has a copy_term/2 primitive (as does SICStus).
 
> Also, given the fact that in unification grammars, the size of the chart items
> tends to be rather large, using a list of items as additional argument
> seems practically impossible to me.

The chart itself would not be stored as part of a category labelling
an edge, so I'm not sure exactly what you have in mind here.

> Gosse Bouma, Alfa-informatica, RUG, Groningen                  gosse@let.rug.nl

Mark Johnson

wachtel@canon.co.uk (Tom Wachtel) (01/28/91)

gosse@gufal02.let.rug.nl (Gosse Bouma) writes:

>I find that reasonable, given the fact that
>feature-unification, rather than prolog-unification, has to be performed

The contrast between feature-unification and prolog-unification is
unnecessary and expensive.  Mapping high-level feature matrix notation
to prolog terms in a controlled way allows you to treat
feature-unification as prolog-unification, directly on the whole
feature matrix represented as a prolog term with a known structure.  It
becomes the same thing.

Tom Wachtel (wachtel@canon.co.uk)

stolcke@ICSI.Berkeley.EDU (Andreas Stolcke) (01/29/91)

In article <1991Jan28.131922.727@canon.co.uk>, wachtel@canon.co.uk (Tom Wachtel) writes:
|> gosse@gufal02.let.rug.nl (Gosse Bouma) writes:
|> 
|> >I find that reasonable, given the fact that
|> >feature-unification, rather than prolog-unification, has to be performed
|> 
|> The contrast between feature-unification and prolog-unification is
|> unnecessary and expensive.  Mapping high-level feature matrix notation
|> to prolog terms in a controlled way allows you to treat
|> feature-unification as prolog-unification, directly on the whole
|> feature matrix represented as a prolog term with a known structure.  It
|> becomes the same thing.


For that to work, it seems, you'd have to map features onto argument
positions.  Unless you want to severely restrict the domains of features at
`compile time' I don't see how this could work.  You would lose the
very flexibility which makes f-structures so popular among, e.g., grammar
writers.  Also, many Prolog implementations have upper limits on the
number of argument positions in terms.  In any case you wast a lot of space
for keeping argument slots for features not actually used.

The approach I'm familiar with is to encode feature-value sets as
open-ended lists.
This still requires explicit implementation of f-structure unification,
but the resulting complexity is no worse than Prolog's
unification (linear in the size of the smaller structure),
and very acceptable in practice. Also, you can easily add occur checks if
need be.

Maybe you have something else in mind, in which case I'd like to know more
about it.

-- 
Andreas Stolcke					stolcke@icsi.berkeley.edu
International Computer Science Institute	stolcke@ucbicsi.bitnet
1957 Center St., Suite 600, Berkeley, CA 94704	(415) 642-4274 ext. 126

wachtel@canon.co.uk (Tom Wachtel) (01/29/91)

stolcke@ICSI.Berkeley.EDU (Andreas Stolcke) writes:

>For that to work, it seems, you'd have to map features onto argument
>positions.  Unless you want to severely restrict the domains of features at
>`compile time' I don't see how this could work.
[...]
>In any case you wast a lot of space
>for keeping argument slots for features not actually used.
[...]
>The approach I'm familiar with is to encode feature-value sets as
>open-ended lists.

I'm not sure what domain restrictions you have in mind, but I'd be glad
to hear about them. I'm assuming that you never want to have to
explicitly compare or manipulate feature values, but only unify or fail
to unify feature matrices. I'm also assuming you use a compiler to get
from whatever high-level representation you favour down to the prolog
code.  The thing I like most about it is that you are really unifying
(or failing to unify) entire feature matrices rather than specific
features, which I think is an important notion in unification-based
grammars.

It's true that you waste a lot of space for empty slots. The biggest
problem I have come across, however, is the limit on the number of
registers, so that if you have more than 32 non-anonymous variables
(i.e. unified feature values) in the head of a rule, then it all falls
over. You can code around that at compile time, but it is a bit messy.

What you get is matrix unification as =/2 at parse time, without losing
high-level expressiveness, since the compiler handles things for you. I
think it's worth it.

Tom Wachtel (wachtel@canon.co.uk)