XTDL Grammar Formalism

The grammar formalism in SProUT, called XTDL, combines efficient finite-state techniques with unification-based formalisms, which are known for their expressiveness and transparency. More precisely, a grammar in SProUT consists of so-called pattern/action rules: the left-hand side (LHS) of a rule is a regular expression over typed feature structures (TFSs) that represents the recognition pattern, and the right-hand side (RHS) is a TFS specification of the output structure.

Additionally, functional operators and coreferences may be used on both sides of a rule. Coreferences express structural identity, create dynamic value assignments, and serve as a means of transporting information into the output descriptions. Functional operators provide a gateway to the outside world; they are primarily used for forming the output of a rule and for introducing complex constraints into rules. Furthermore, grammar rules can be recursively embedded, which in fact provides grammarians with a context-free formalism.
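The unification operation that underlies such formalisms can be illustrated with a minimal Python sketch. It is not SProUT code: feature structures are modelled here as plain nested dictionaries with atomic string values, the type hierarchy is ignored entirely, and the function name unify as well as the example values ("dat", "sg", "neut") are assumptions made for illustration only.

```python
def unify(a, b):
    """Unify two feature structures given as nested dicts with atomic leaves.

    Returns the merged structure, or None if the two are incompatible.
    Simplification: types and the type hierarchy are not modelled.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)  # shallow copy, so the inputs are left unchanged
        for feature, value in b.items():
            if feature in result:
                merged = unify(result[feature], value)
                if merged is None:
                    return None  # conflicting values under the same feature
                result[feature] = merged
            else:
                result[feature] = value
        return result
    # Atomic values (or an atom and a dict) unify only if they are equal.
    return a if a == b else None


# Hypothetical token and constraint, for illustration only.
token = {"POS": "Noun", "SURFACE": "Haus",
         "INFL": {"CASE": "dat", "NUMBER": "sg"}}
constraint = {"POS": "Noun", "INFL": {"CASE": "dat", "GENDER": "neut"}}

compatible = unify(token, constraint)      # information from both is merged
incompatible = unify(token, {"POS": "Det"})  # POS values clash
```

Compatible structures are merged into one structure carrying the information of both, while any clash between atomic values makes the whole unification fail. A pattern on the LHS of a rule constrains the input items in this way.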

The following rule for the recognition of prepositional phrases gives an idea of the syntax of the grammar formalism:

pp :> morph & [ POS Prep,
                SURFACE #prep,
                INFL [ CASE #c ] ]
      (morph & [ POS Det,
                 SURFACE #det,
                 INFL [ CASE #c,
                        NUMBER #n,
                        GENDER #g ] ]) ?
      (morph & [ POS Adjective,
                 INFL [ CASE #c,
                        NUMBER #n,
                        GENDER #g ] ]) *
      (morph & [ POS Noun,
                 SURFACE #noun,
                 INFL [ CASE #c,
                        NUMBER #n,
                        GENDER #g ] ])

   -> phrase & [ CAT pp,
                 PREP #prep,
                 AGR agr & [ CASE #c,
                             NUMBER #n,
                             GENDER #g ],
                 CORE_NP #core_np ],

   where #core_np = Append(#det, " ", #noun).

The first TFS matches a preposition. An optional determiner is matched next, followed by zero or more adjectives. Finally, a noun item is consumed. The variables #c, #n, and #g establish coreferences expressing agreement in case, number, and gender among all matched items (except for the initial preposition item, which agrees with the other items only in case). The RHS of the rule triggers the creation of a TFS of type phrase, in which the surface form of the matched preposition is transported into the corresponding slot via the variable #prep. A value for the attribute CORE_NP is created by concatenating the surface forms of the matched determiner and noun (variables #det and #noun), which is realized via a call to the functional operator Append.
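The behaviour described above can be approximated in a short Python sketch. It is a simplified simulation, not the SProUT interpreter: tokens and patterns are nested dictionaries, types such as morph and agr are not modelled, and the names match_item, match_rule, build_output, and append, the "TYPE" key, and the German example tokens with their tag values are all hypothetical. The sketch follows the prose description in binding the determiner's surface form to #det and in requiring the determiner to agree in number. How SProUT treats an unbound #det when no determiner is matched is not stated in the text; the sketch simply falls back to an empty string.

```python
def match_item(pattern, token, bindings):
    """Match one token against a pattern; return extended bindings or None.

    Pattern values starting with '#' are coreference variables: the first
    occurrence binds the variable, later ones must see the same value.
    """
    new = dict(bindings)  # copy, so a failed match leaves the caller's bindings intact
    for feature, expected in pattern.items():
        if feature not in token:
            return None
        actual = token[feature]
        if isinstance(expected, dict):
            if not isinstance(actual, dict):
                return None
            new = match_item(expected, actual, new)
            if new is None:
                return None
        elif expected.startswith("#"):
            if new.setdefault(expected, actual) != actual:
                return None  # coreference violated, e.g. case disagreement
        elif expected != actual:
            return None
    return new


def match_rule(elements, tokens, pos=0, bindings=None):
    """Match a sequence of (pattern, quantifier) pairs with backtracking.

    Quantifiers: '1' (exactly once), '?' (optional), '*' (zero or more).
    Returns (end_position, bindings) or None.
    """
    bindings = {} if bindings is None else bindings
    if not elements:
        return pos, bindings
    (pattern, quant), rest = elements[0], elements[1:]
    if pos < len(tokens):
        extended = match_item(pattern, tokens[pos], bindings)
        if extended is not None:
            # After consuming a token, '*' keeps the same element active.
            following = elements if quant == "*" else rest
            result = match_rule(following, tokens, pos + 1, extended)
            if result is not None:
                return result
    if quant in ("?", "*"):
        return match_rule(rest, tokens, pos, bindings)  # skip this element
    return None


def append(*parts):
    """Stand-in for the Append operator: plain string concatenation."""
    return "".join(parts)


def build_output(b):
    """Build the RHS structure from the variable bindings."""
    return {
        "TYPE": "phrase",
        "CAT": "pp",
        "PREP": b["#prep"],
        "AGR": {"CASE": b["#c"], "NUMBER": b["#n"], "GENDER": b["#g"]},
        # Assumption: an unbound #det is treated as the empty string.
        "CORE_NP": append(b.get("#det", ""), " ", b["#noun"]),
    }


AGREE = {"CASE": "#c", "NUMBER": "#n", "GENDER": "#g"}
PP_RULE = [
    ({"POS": "Prep", "SURFACE": "#prep", "INFL": {"CASE": "#c"}}, "1"),
    ({"POS": "Det", "SURFACE": "#det", "INFL": AGREE}, "?"),
    ({"POS": "Adjective", "INFL": AGREE}, "*"),
    ({"POS": "Noun", "SURFACE": "#noun", "INFL": AGREE}, "1"),
]

DAT_SG_NEUT = {"CASE": "dat", "NUMBER": "sg", "GENDER": "neut"}
tokens = [
    {"POS": "Prep", "SURFACE": "mit", "INFL": {"CASE": "dat"}},
    {"POS": "Det", "SURFACE": "dem", "INFL": DAT_SG_NEUT},
    {"POS": "Adjective", "SURFACE": "kleinen", "INFL": DAT_SG_NEUT},
    {"POS": "Noun", "SURFACE": "Haus", "INFL": DAT_SG_NEUT},
]

end, bound = match_rule(PP_RULE, tokens)
output = build_output(bound)
print(output)
```

In this sketch, a noun whose case differs from that of the preposition makes the whole match fail, because #c is already bound when the noun is reached. This mirrors the way the coreferences in the rule enforce agreement across the matched items.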