LjunglofAmores2006a - Enhanced Multimodal Grammar Library
Contents
1. Version Final Public Distribution Public IST 507802 TALK D 1 5 15 08 06 Page 10 55 1.3.5 Syntactic structures on the sentence level. Texts, phrases, and utterances: The outermost linguistic structure is Text. A Text is composed of a sequence of Phrases (Phr) followed by punctuation marks. Phrases are built from Utterances (Utt), which in turn are declarative sentences, questions, or imperatives; there are also single-phrase utterances consisting of noun phrases or other subsentential phrases. The difference between Phrase and Utterance is mostly technical: a Phrase is an Utterance with an optional leading conjunction ("but") and an optional trailing vocative (e.g. "John" or "please"). Sentences and clauses: The richest of the categories below Utterance is S (Sentence). A Sentence is formed from a Clause (Cl) by fixing its Tense, Anteriority, and Polarity. The difference between Sentence and Clause is thus also rather technical. For example, each of the following strings has a distinct syntax tree in the category Sentence: "John walks", "John doesn't walk", "John walked", "John didn't walk", "John has walked", "John hasn't walked", "John will walk", "John won't walk"; whereas in the category Clause all of them are just different forms of the same syntax tree. In figure 1.1 there are some examples of the results of replacing parts of the syntax tree for the sentence "John walks". Questions: The categorie
2. <left>X</left> <right>X P Y</right> </rule> <rule lang="EN"> <left>X</left> <right>Y X</right> </rule> </forEach> </forEach> </rulesList>
Figure 2.9: Multilingual configuration file
    Lamp -> Hall Lamp
    Lamp -> LivingRoom Lamp
    Radio -> Bedroom Radio
    Radio -> Kitchen Radio
    Radio -> Hall Radio
    Radio -> LivingRoom Radio
    Heater -> Bedroom Heater
    Heater -> Kitchen Heater
    Heater -> Hall Heater
    Heater -> LivingRoom Heater
2.7 Conclusions and future work. In this work a new rule-based approach to automatic grammar generation has been described. The proposed solution is based on OWL ontologies and provides linguists with an easy way to take advantage of the information contained within these ontologies. This information extraction process will also be easier for the linguist if the ontology has been designed keeping in mind that grammars will be generated from it. The proposed solution has achieved the expected goals: the linguist can generate a good number of rules from a simple configuration file, and because the rules are generated directly from the ontologies, the coherence and completeness of domain knowledge and linguistic knowledge is ensured. In addition, a rapid prototyping of new grammars for the speech recognizer and the NLU module is obtained b
3. A is not of real importance in GF: it is the concrete syntax that is ordered, not the abstract. Therefore we do not have to supply the exact index of the argument; the name of the category is enough. E.g., given the rule f : Det -> Noun -> NP, the property P_Noun from f to Noun can be called simply noun instead of arg_2.
• A consequence of this is that there will be far fewer properties in the ontology, since many rules can share the same property.
One problem arises when a GF rule has more than one argument with the same category, e.g. a flat sentence-generating rule:
    fun s1 : NP -> Verb -> NP -> S ;
In this case it is difficult to specify which NP argument belongs to the subject and which to the object. This can be solved in two ways: either by introducing a new name for one of the properties, or by adding a new coercion category to the grammar:
    cat ObjectNP ;
    fun objectNP : NP -> ObjectNP ;
With this we can change the type of s1 and use the conversion we have already described:
    s1 : NP -> Verb -> ObjectNP -> S ;
3.3.3 Optimizing the grammar ontology. In this section we present two possible optimizations for the translation of a GF grammar to an OWL ontology, both of which decrease the size of the ontology. One-argument rules: Suppose that there is a rule intrans in the grammar taking only one argument:
    fun intrans : Verb -> VP ;
If there are no more rules with type Verb -> VP then the rule na
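The problem of repeated argument categories described in this excerpt can be checked mechanically. The following is a minimal Python sketch, not part of the deliverable: the dictionary encoding of rules and the function name repeated_categories are assumptions made for illustration. It flags rules in which naming a property after its category would be ambiguous, and shows that the coercion category ObjectNP removes the clash.

```python
from collections import Counter

# Toy abstract syntax: rule name -> (argument categories, result category).
# This dictionary encoding is an assumption made for the sketch.
RULES = {
    "s1": (["NP", "Verb", "NP"], "S"),
    "intrans": (["Verb"], "VP"),
}

def repeated_categories(rules):
    """Return, per rule, the argument categories that occur more than once."""
    clashes = {}
    for name, (args, _result) in rules.items():
        repeated = sorted(cat for cat, n in Counter(args).items() if n > 1)
        if repeated:
            clashes[name] = repeated
    return clashes

before = repeated_categories(RULES)

# Apply the coercion-category fix from the text: the object becomes ObjectNP.
FIXED = dict(RULES, s1=(["NP", "Verb", "ObjectNP"], "S"))
after = repeated_categories(FIXED)
```

Here before reports that s1 uses NP twice, while after is empty, so every property of the fixed grammar can be named by its category alone.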
4. Edinburgh HCRC (UEDIN), University of Gothenburg (UGOT), University of Cambridge (UCAM), University of Seville (USE), Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), Linguamatics (LING), BMW Forschung und Technik GmbH (BMW), Robert Bosch GmbH (BOSCH). For copies of reports, updates on project activities and other TALK-related information, contact: The TALK Project Co-ordinator, Prof. Manfred Pinkal, Computerlinguistik, Fachrichtung 4.7 Allgemeine Linguistik, Postfach 15 11 50, 66041 Saarbrücken, Germany. pinkal@coli.uni-sb.de Phone +49 (681) 302-4343, Fax +49 (681) 302-4351. Copies of reports and other material can also be accessed via the project's administration homepage, http://www.talk-project.org. © 2006, The Individual Authors. No part of this document may be reproduced or transmitted in any form, or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission from the copyright owner. Contents: 1 Introduction; 1.1 OWL, the Web Ontology Language; 1.2 An overview of the Grammatical Framework; 1.2.1 Main features of GF; 1.3 The GF resource grammar library; 1.3.1 Implemented languages and linguistic coverage; 1.3.2 A small example; 1.3.3 Inflect
5. GF. It is enough to state that adjectives are tables that depend on a gender, and that nouns have an inherent gender. We can even go one step further and make use of the resource grammar described in section 1.3, which gives us a multilingual grammar automatically. Disadvantages with using GF: The main disadvantage with converting the Mimus ontology to GF is the same as for any OWL ontology: subclass coercions become quite cumbersome, and general restrictions can be very complicated in GF. But since the conversion from the Mimus ontology to a recognition grammar does not make crucial use of restrictions, this is not a big problem. 3.3 GF abstract syntax as OWL ontologies. 3.3.1 Context-free grammars as OWL ontologies. A context-free grammar can be seen as an ontology. This is already well known, especially in computer science texts on compiler construction and parsing; see e.g. Appel (1998).
• Each category A is a class in the ontology, i.e. the class of syntax trees with mother node A.
• Each rule A -> X1 ... Xn is a subclass of A, i.e. the class of syntax trees constructed from that rule.
Note that the individuals of a class A are syntax trees, but not necessarily all trees: we have to construct each tree in the ontology as an individual. Subtrees as properties: Given a rule subclass A -> X1 ... Xn, call it R, we define n daughter properties arg
6. a property can be declared to be a subproperty of another. Following the idea from subclasses, we can use either a coercion function or dependent categories. A very common case is when the superproperty does not have any direct instances, in which case we can declare it as depending on the subproperties. Suppose both Qrop and Trop are subproperties, with domain A and range B, of Prop, which doesn't have any direct instances. We write this as:
    cat Prop QT A B ; QT ;
    fun Qrop, Trop : QT ;
To say that a is related by Qrop to b and by Trop to c, we write:
    fun a_Qrop_b : Prop Qrop a b ;
        a_Trop_c : Prop Trop a c ;
When the superproperty can be inhabited, we need a coercion function. Supposing that Qrop is a subproperty of Prop, we declare a coercion function from Qrop x y to Prop x y:
    fun Qrop_sub_Prop : (x : A) -> (y : B) -> Qrop x y -> Prop x y ;
The most general case is when the subproperty's domain and range are subclasses of the superproperty's domain and range. In this case we get the following:
    fun Qrop_sub_Prop : (x : A) -> (y : B) -> Qrop x y -> Prop (A_sub_A x) (B_sub_B y) ;
Here A_sub_A and B_sub_B are the coercion functions between the domains and ranges. There are even more possibilities for implementing subproperties, e.g. if the superproperty's domain or range is uninhabited, in which case this could be implement
7. are located: cd path to grammar library MP3, then gf
2. Load the source module(s) into GF:
    > i MP3UserSem.gf
    > i MP3UserEng.gf
    > i MP3UserSwe.gf
3. Select the English concrete grammar:
    > sf -lang=MP3UserEng
4. Parse an English utterance:
    > p "play like a prayer by madonna"
    request_S play_item__song_artist like_a_prayer madonna
5. Translate (i.e. parsing followed by linearization) from English to Swedish:
    > p "play like a prayer by madonna" | l -all -lang=MP3UserSwe
    spela like a prayer med madonna
The option -all shows all possible variants of linearizing a syntax term.
6. Translate from English to GoDiS dialogue moves:
    > p "play like a virgin by madonna" | l -lang=MP3UserSem
    request play answer song like_a_virgin answer artist madonna
In this case we don't need the option -all, since there is only one possible variant for the semantics.
7. Generate 5 random Swedish utterances:
    > gr -number=5 | l -lang=MP3UserSwe
    in the city med eagle eye cherry rant radio va jag vill andra balansen mitten tack jag vill spela nummer tre tack
8. Quit GF:
    > q
8. been chosen, specific language rules must be generated. Consider then the fragment in figure 2.8, taken from the ontology previously shown, describing which elements can be affected by the property locatedIn. The multilingual configuration file that would capture the structural differences mentioned above is shown in figure 2.9. Now, if only English grammar rules are to be generated, the application must be run with the option lang EN, obtaining the following result:
    Lamp -> Bedroom Lamp
    Lamp -> Kitchen Lamp
<owl:ObjectProperty rdf:ID="locatedIn">
  <rdfs:domain>
    <owl:Class>
      <owl:unionOf rdf:parseType="Collection">
        <owl:Class rdf:about="#Lamp"/>
        <owl:Class rdf:about="#Radio"/>
        <owl:Class rdf:about="#Heater"/>
      </owl:unionOf>
    </owl:Class>
  </rdfs:domain>
  <rdfs:range>
    <owl:Class>
      <owl:unionOf rdf:parseType="Collection">
        <owl:Class rdf:about="#Bedroom"/>
        <owl:Class rdf:about="#Kitchen"/>
        <owl:Class rdf:about="#Hall"/>
        <owl:Class rdf:about="#LivingRoom"/>
      </owl:unionOf>
    </owl:Class>
  </rdfs:range>
</owl:ObjectProperty>
Figure 2.8: The locatedIn property in OWL XML syntax
<rulesList> <forEach property="P" subPropertyOf="Location"> <forEach domain="X" range="Y"> <rule lang="ES">
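The language-specific expansion described in this excerpt can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tool's implementation: the TEMPLATES table transcribes the two rule shapes of figure 2.9 (X -> X P Y for ES, X -> Y X for EN), the domain and range lists come from figure 2.8, and binding P to the name locatedIn is a simplification chosen for the example.

```python
# Rule templates per language, transcribed from the configuration file:
# left-hand symbol and right-hand symbols, using the variables X, P and Y.
TEMPLATES = {
    "ES": ("X", ["X", "P", "Y"]),
    "EN": ("X", ["Y", "X"]),
}

# Domain and range of the locatedIn property, as listed in the ontology fragment.
DOMAIN = ["Lamp", "Radio", "Heater"]
RANGE = ["Bedroom", "Kitchen", "Hall", "LivingRoom"]

def rules_for(lang, prop="locatedIn"):
    """Instantiate the template of one language for every domain/range pair."""
    left, right = TEMPLATES[lang]
    rules = []
    for x in DOMAIN:
        for y in RANGE:
            env = {"X": x, "Y": y, "P": prop}
            rules.append(f"{env[left]} -> {' '.join(env[s] for s in right)}")
    return rules

english = rules_for("EN")
spanish = rules_for("ES")
```

With these inputs the English list starts with Lamp -> Bedroom Lamp and contains one rule per domain/range pair, which is the shape of the output quoted in the report.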
9. dialogue system from an ontology describing the devices. An example system using this technique is Mimus, a dialogue system with which one can control different devices in a home, such as lights, alarms, and washing machines. All information about devices is specified in an ontology, which can be updated on the fly with e.g. new devices or new information about existing devices. We have also shown that all domain-specific utterances in a domain for the GoDiS dialogue system can be specified as a single ontology. By representing the utterances as abstract syntax trees in a Grammatical Framework (GF) resource grammar, the representation of utterances is language-independent. The resource grammar we are using has a coverage comparable to the Core Language Engine, and exists for 11 different languages, including the TALK languages English, German, Spanish, and Swedish, but also the non-Indo-European languages Finnish, Russian, and Arabic. Since the grammar is multilingual, the only thing that has to be done to localize a dialogue application to a new language is to write a lexicon for the domain-specific entities. In this deliverable we have also described the final status of the full TALK Grammar Library, which is written in GF. The structure has been modified so that the domain-specific parts of a grammar can be described in an OWL ontology, and to further simplify localization to new languages, domains, and modalities.
10. entities shall remain in the final dialogue application and organizes them as Input Forms (Amores and Quesada, 1997). It is important to notice that this approach is meant only as a way to help the linguist and will not provide a ready-to-use grammar. By using this tool the grammar will be easier to generate and more consistent with the domain knowledge, but in any case the resulting grammar must be checked and completed manually in a second step. The current implementation of this tool provides grammar rules in a format we have developed (Amores and Quesada, 1997), which basically consists of a left-hand symbol followed by an arrow and a list of right-hand symbols. This notation is translatable to most standards, such as BNF. 2.4 Configuration files. As outlined above, the linguist must define a configuration file that will be used in conjunction with the ontology in order to generate the grammar rules. In this configuration file, the linguist has to identify the properties that may appear in the grammar and the way in which their domain and range will be included in the associated rules. For this purpose, a simple XML syntax has been defined (see the DTD in figure 2.2). Basically, the linguist can define the generation rules by means of nested forEach loops, handling the properties and subproperties of the ontology and using variables to identify the elements from its domain and range. In order to better understand this structure as well as
11. feature of GF is the possibility of defining parameters and operations, which is done in a resource module. The parameters can be used for inflection tables, and operations can be used as macros for simplifying grammar writing. GF has a rich module system, which supports extension, importing, and instantiation of other modules. A concrete grammar can be used as a resource module in other grammars, which gives a notion of grammar composition. This makes it possible to use a wide-coverage grammar, such as the GF resource grammar described in section 1.3, when writing a specialized domain-specific grammar. The GF compiler takes care of extracting only those parts of the resource grammar that are relevant for the domain, which gives us efficient parsing of the specialized grammar. 1.3 The GF resource grammar library. Even though it is easy to write simple GF grammars and make them run in a couple of languages, scaling up to bigger language fragments can require considerable work. The main part of this work is linguistic: the grammar writer has to know a lot about the morphology and the syntax of the target languages to get the concrete syntax correct. The solution provided by GF to this problem is that of library-based software engineering: the work can be divided between resource grammarians, working on linguistic details, (Footnote: This section is a rewritten and compressed version of the user's manual of the resource grammar library, http www cs ch
12. forEach property="P" subPropertyOf="hasDeviceCommand"> <forEach domain="X" range="Y"> <rule lang="ES"> <left>Command</left> <right>P Y</right> </rule> </forEach> </forEach> </rulesList>
Figure 2.6: Configuration file for device-related commands
    Command -> SwitchOff Lamp
    Command -> SwitchOff DimmerLamp
    Command -> SwitchOff Radio
    Command -> SwitchOff TV
    Command -> SwitchOn Fan
    Command -> SwitchOn Heater
    Command -> SwitchOn Lamp
    Command -> SwitchOn DimmerLamp
    Command -> SwitchOn Radio
    Command -> SwitchOn TV
    Command -> Close Blind
    Command -> Open Blind
It is important to note that, even with this toy ontology, sixteen grammar rules have been generated using just two nested forEach loops. 2.6.2 Capturing multimodality. Now let us assume the same scenario (i.e. the same ontology) but including multimodal entries, namely voice and pen inputs. Following Oviatt's results (Sharon Oviatt and Kuhn, 1997), it may be expected that mixed-modality input (voice: "switch this on"; pen: click on the lamp icon) also includes constituent orders that differ from those of voice-only input. The NLU module may therefore receive inputs such as "lamp switch on" (verb at the end). This new set of rules can be accounted for by adding just one rule to the configuration file, as shown in figure 2.7. The new output will be the same as before, but
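The two nested forEach loops of figure 2.6 amount to a double iteration over subproperties and the classes in their range. The Python sketch below illustrates that expansion; the ONTOLOGY dictionary is a deliberately reduced, hypothetical stand-in for the smart-house ontology, and the function expand and the rule template with the verb at the end are assumptions for illustration, since figure 2.7 is not reproduced in this excerpt.

```python
# Hypothetical toy ontology: each subproperty of hasDeviceCommand is mapped
# to the classes in its range. Only a few entries are included on purpose.
ONTOLOGY = {
    "hasDeviceCommand": {
        "SwitchOn": ["Fan", "Lamp"],
        "Open": ["Blind"],
    }
}

def expand(ontology, super_property, left, right, prop_var="P", range_var="Y"):
    """Instantiate one rule template for every (subproperty, range class) pair."""
    rules = []
    for prop, range_classes in ontology[super_property].items():  # outer forEach
        for cls in range_classes:                                 # inner forEach
            env = {prop_var: prop, range_var: cls}
            rhs = [env.get(symbol, symbol) for symbol in right]
            rules.append(f"{left} -> {' '.join(rhs)}")
    return rules

# Template transcribed from the configuration file: Command -> P Y.
voice_rules = expand(ONTOLOGY, "hasDeviceCommand", "Command", ["P", "Y"])

# Assumed extra template for mixed input with the verb at the end: Command -> Y P.
multimodal_rules = expand(ONTOLOGY, "hasDeviceCommand", "Command", ["Y", "P"])
```

Adding the second template doubles the rule set without touching the ontology, which is the effect the section attributes to adding a single rule to the configuration file.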
13. library. It is possible to extend the existing GF library with new languages or domains. By a new language we mean any one of the languages covered by the GF resource grammar. This is because the system grammars are specified using the resource grammar. Adding a new language to the GF grammar library: We assume that the language to be added is Finnish. If the language is not already present in the grammar library, we have to add the following grammars to the Common directory:
• Concrete syntax for the resource grammar: GodisLangFin.gf
• Concrete syntax for the system grammar: GodisSystemFin.gf
• Concrete syntax for the user grammar: GodisUserFin.gf
The last two grammars are very straightforward to create and require no knowledge about the language. The resource grammar, however, requires enough knowledge about Finnish to translate phrases like "I thought you said" or the different possible variants of "I want to". We now assume that the language already is in the library, but not in the domain in question, which we assume to be the MP3 domain. The following grammars have to be added to the MP3 directory:
• A new lexicon syntax: MP3LexiconFin.gf
• A new syntax for the database resources: MusicFin.gf
• A new syntax for system utterances: MP3SystemFin.gf
• A new syntax for user utterances: MP3UserFin.gf
For the system utterances, all you have to do is to add a grammar MP3SystemFin with the following content: concrete MP3SystemFin
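The file lists above follow a regular naming pattern, so the set of grammars to create for a new language can be derived from the language suffix. The Python sketch below shows this; the helper names common_files and domain_files are hypothetical, and applying the same pattern to domains other than MP3 is an assumption, since this excerpt only spells out the MP3 case.

```python
def common_files(lang):
    """Grammars to add to the Common directory for a new language suffix."""
    return [f"GodisLang{lang}.gf", f"GodisSystem{lang}.gf", f"GodisUser{lang}.gf"]

def domain_files(domain, resources, lang):
    """Grammars to add to a domain directory: lexicon, resources, system, user."""
    names = [f"{domain}Lexicon{lang}.gf"]
    names += [f"{resource}{lang}.gf" for resource in resources]
    names += [f"{domain}System{lang}.gf", f"{domain}User{lang}.gf"]
    return names

finnish_common = common_files("Fin")
finnish_mp3 = domain_files("MP3", ["Music"], "Fin")
```

A checklist like this can be compared against the directory contents before compiling, to spot a missing module early.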
14. must be possible to recreate the original GF name by knowing the name of the class and which arguments are instantiated. One solution is to enforce that the name of each GF rule has the same prefix followed by a uniquifying suffix, in this case a number. No other rules in the grammar are allowed to start with the same prefix. The name of the merged class will be the common prefix, and then it is possible to infer which of the GF rules to use. 3.3.4 The GF resource grammar as an OWL ontology. The translation described in this section can be used on the multilingual GF resource grammar. But since the resource grammar has a very limited lexicon, we divide the ontology into two parts to facilitate the writing of the domain-specific lexicon:
• All grammatical categories, such as noun phrases, verb phrases, sentences, and questions, are subclasses of the class Phrase; the grammar rules occur as subclasses of the grammatical categories.
• All lexical categories, such as verbs, nouns, determiners, and prepositions, are subclasses of the class Lexicon; the lexical entries occur as instances of the lexical categories.
With this structure it is possible to automatically create the abstract syntax of a lexical GF grammar, which is imported by the domain grammar. In theory it would also be possible to instantiate the lexicon for different concrete languages by using the built-in annotation property rdfs:label. But this turns out to be very diff
15. on-line documentation: http://www.cs.chalmers.se/~aarne/GF/lib/resource-1.0/doc/ 1.3.1 Implemented languages and linguistic coverage. The GF Resource Grammar Library contains grammar rules for 11 languages, plus some more under construction. These languages are Arabic, Danish, English, Finnish, French, German, Italian, Norwegian, Russian, Spanish, and Swedish. For each language, the library provides:
• a complete set of inflectional paradigms
• a comprehensive set of syntax rules, with structures such as: texts (punctuation); declaratives, questions, imperatives, one-phrase utterances; predication in different tenses and moods; verb phrases constructed from verbs and adjectives with different subcategorization patterns; noun phrases formed by determiners and from common nouns, proper names, adjectives, numerals, and pronouns; relative clauses and wh-questions; coordination on different levels.
Footnotes: http://www.key-project.org/ ; http://webalt.math.helsinki.fi/ ; http://www.cs.chalmers.se/~aarne/GF/doc/tutorial/gf-tutorial2.html ; the Arabic grammar does not cover the full resource API yet.
As a benchmark for coverage, we have used the CLE (Core Language Engine) grammars as described in Rayner et al. (2000). 1.3.2 A small example. To give an example of how the library is applied, consider music playing device
16. the objective of the tool, a selection of showcases, including the relevant parts of the ontology, the configuration file, and the resulting grammar rules, is shown in the following sections. 2.5 Overview of the algorithm. To better illustrate how the algorithm works, this section describes its functions in more detail. The algorithm consists of three major steps:
<!DOCTYPE rulesList [
<!ELEMENT rulesList (forEach)>
<!ELEMENT forEach (forEach | rule+)>
<!ELEMENT rule (left, right)>
<!ELEMENT left (#PCDATA)>
<!ELEMENT right (#PCDATA)>
<!ATTLIST forEach property CDATA #IMPLIED>
<!ATTLIST forEach subPropertyOf CDATA #IMPLIED>
<!ATTLIST forEach domain CDATA #IMPLIED>
<!ATTLIST forEach range CDATA #IMPLIED>
<!ATTLIST rule lang (ES|EN|GR) #REQUIRED>
]>
Figure 2.2: DTD for the configuration file
[Figure 2.3: FSM for the configuration file parser]
1. Parse the OWL ontology. The goal of this parsing is to generate an internal representation of the relevant ontological elements. This representation will in turn be used to make queries over the ontology.
2. Parse the configuration fil
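The step of parsing the configuration file can be illustrated with Python's standard xml.etree.ElementTree module. The sketch below is not the parser of figure 2.3; it is a recursive walk, and the function collect_rules and the tuple layout it returns are assumptions made for this example. The CONFIG string follows the element and attribute names declared in the DTD.

```python
import xml.etree.ElementTree as ET

# A configuration file using the element and attribute names from the DTD.
CONFIG = """
<rulesList>
  <forEach property="P" subPropertyOf="hasDeviceCommand">
    <forEach domain="X" range="Y">
      <rule lang="ES">
        <left>Command</left>
        <right>P Y</right>
      </rule>
    </forEach>
  </forEach>
</rulesList>
"""

def collect_rules(element, bindings=None):
    """Walk nested forEach elements and return one tuple per rule:
    (accumulated forEach attributes, language, left symbol, right symbols)."""
    bindings = dict(bindings or {})
    if element.tag == "forEach":
        bindings.update(element.attrib)  # inner loops inherit outer attributes
    found = []
    for child in element:
        if child.tag == "rule":
            left = child.findtext("left").strip()
            right = child.findtext("right").split()
            found.append((bindings, child.get("lang"), left, right))
        elif child.tag == "forEach":
            found.extend(collect_rules(child, bindings))
    return found

templates = collect_rules(ET.fromstring(CONFIG))
```

Each resulting tuple carries everything a later generation step needs: which property hierarchy to iterate over, which variables name the domain and range, and the rule shape to instantiate.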
17. theory, an individual can be an element of several classes. This is equivalent to the element being in the intersection class. Properties: Properties correspond to relations between individuals. There are three kinds of properties: object, datatype, and annotation properties. An object property has a domain and a range, which are themselves classes. The range of a datatype property is not a class but a datatype, e.g. a number or a string. Annotation properties correspond to extra-logical properties, e.g. comments or version information. Instead of writing an instance of a property as P(a, b), we often say that a has the property P b; i.e. properties are seen as directed from the domain to the range. Properties can also be declared to be functional, transitive, symmetric, and more. A property can also be a subproperty of another. There is more to OWL than classes, individuals, and properties. It is e.g. possible to create restrictions on classes and properties by logical formulae. There are three different levels of OWL (OWL Lite, OWL DL, and OWL Full) with different expressive power and theorem-proving complexity. (Footnote: http://www.w3.org/2004/OWL/) The reader is referred to the OWL Language Reference (Dean and Schreiber, 2004) and OWL Semantics and Abstract Syntax (Patel-Schneider et al., 2004) for more information. Our ontologies a
18. this section we describe the structure of the system grammar. Abstract syntax: The system grammar has to include the following entities, where the examples are from the MP3 domain:
• Each sort is defined as a GF category, together with its instances:
    cat Artist ; Song ;
    fun abba, clash : Artist ;
        happy_new_year, london_calling : Song ;
  Commonly this information is defined in a separate grammar, such as the Music resource grammar for the MP3 domain.
• Each sort can also be used as a short answer:
    fun artist : Artist -> ShortAns ;
        song : Song -> ShortAns ;
• Each action is a constant of the category Action:
    fun playlist_add : Action ;
• Each predicate can be used in two different ways, either as a wh-question or applied to an argument as a proposition:
    fun song_to_play_Q : Question ;
        song_to_play_P : Song -> Proposition ;
This is all that has to be defined for a domain; the common GoDiS grammar knows how to use short answers, actions, questions, and propositions to build all possible dialogue moves. Natural language syntax: The system utterances for natural languages are language-independent. This is obtained by using an incomplete GF grammar with syntax trees from the GF resource grammar. Since many grammatical constructions occur repeatedly, we use macros defined in GodisLang as an interface to the resource grammar. The categorie
19. trees. The only thing to think about is that the sort in question has to be made a subclass of the class for the grammatical category in the grammar. Thus Location and Line should be subclasses of ProperName. Chapter 4: The Enhanced Multimodal Grammar Library. The GF grammar library consists of a domain-independent group of grammars and three domain-specific grammar groups. The domain-independent grammar group is called Common. Each domain-specific grammar group corresponds to a GoDiS application. The three domains are Agenda, MP3, and Tram, which in turn correspond to the three GoDiS applications AgendaTalk, DJGoDiS, and TramGoDiS. The grammar library is divided into a system grammar and a user grammar, which share common modules. The overall structure of the grammars for a specific domain is shown in figure 4.1. The domain-specific grammars separate the system and user utterances, while sharing a common domain-independent grammar module with other domains. Every domain-specific grammar can have its own resources. The resource in the Agenda domain is called Bookings, the MP3 domain uses a resource called Music, and the Tram domain has two resources, Stops and Lines. To describe the grammars in the library, examples will be taken from the MP3 module, both to illustrate the general theory and to give a more in-depth description of a domain-specific grammar module. 4.1 Separating system and user utteran
20. 1 ... arg_n:
• A daughter category B = X_i of R yields a property arg_i from R to B.
• A terminal token t = X_i of R yields a datatype property arg_i from R to a string.
Since the right-hand side of a rule is ordered, the daughter properties need to be ordered too. Therefore we name the properties arg_1 ... arg_n. If we want to formally implement a context-free grammar as an OWL ontology, the daughter properties have to be unique for each rule subclass R. Therefore the formal name of property arg_i should be arg_i[A -> X1 ... Xn] or something similar. This means that the ontology will have a very large number of properties and long property names. 3.3.2 GF abstract syntax in OWL. The abstract syntax of a GF grammar is a context-free grammar, but without terminal symbols, and where each rule has a unique name. This simplifies the implementation as an OWL ontology:
• Each category A corresponds to a class A.
• Each rule f : A1 -> ... -> An -> A corresponds to a subclass f of A.
• Each daughter category Ai of f corresponds to a property P_Ai from f to Ai.
The resulting ontology is simpler than the ontology for the context-free grammar in the following senses:
• All datatype properties have disappeared, since there are no terminal symbols.
• The names of the rule classes are shorter, since the GF rules have unique names.
• The order between the daughters A1
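The mapping from GF abstract syntax to an ontology listed above can be sketched as a small Python function. This is only an illustration: plain dictionaries stand in for OWL classes and properties, the function name grammar_to_ontology is hypothetical, and naming each property after the lower-cased category is an assumption based on the remark elsewhere in the report that the property to Noun can simply be called noun.

```python
def grammar_to_ontology(rules):
    """Map GF-style rules to classes and properties.

    rules: rule name -> (list of argument categories, result category).
    Returns (classes, properties), where classes maps each class to its
    superclass (None for a category) and properties maps a property name
    to the set of (domain class, range class) pairs it connects.
    """
    classes = {}
    properties = {}
    for name, (args, result) in rules.items():
        for category in args + [result]:
            classes.setdefault(category, None)   # each category is a class
        classes[name] = result                   # each rule is a subclass of its result
        for category in args:                    # each daughter yields a property
            properties.setdefault(category.lower(), set()).add((name, category))
    return classes, properties

classes, properties = grammar_to_ontology({"f": (["Det", "Noun"], "NP")})
```

Because properties are keyed by category rather than by argument position, two rules that both take a Noun argument share the single property noun, which is why the GF-based ontology needs fewer properties than the context-free one.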
21. Enhanced Multimodal Grammar Library. Peter Ljunglöf, Gabriel Amores, Håkan Burden, Pilar Manchón, Guillermo Pérez, Aarne Ranta. Distribution: Public. TALK: Talk and Look: Tools for Ambient Linguistic Knowledge. IST-507802, Deliverable 1.5, 15/08/06. Project funded by the European Community under the Sixth Framework Programme for Research and Technological Development. The deliverable identification sheet is to be found on the reverse of this page. Project ref. no.: IST-507802. Project acronym: TALK. Project full title: Talk and Look: Tools for Ambient Linguistic Knowledge. Instrument: STREP. Thematic Priority: Information Society Technologies. Start date / duration: 01 January 2004 / 36 months. Security: Public. Contractual date of delivery: Jun 06. Actual date of delivery: 15/08/06. Deliverable number: 1.5. Deliverable title: Enhanced Multimodal Grammar Library. Type: Report. Status & version: Public, Final. Number of pages: 55 (excluding front matter). Contributing WP: 1. WP/Task responsible: UGOT. Other contributors: USE. Author(s): Peter Ljunglöf, Gabriel Amores, Håkan Burden, Pilar Manchón, Guillermo Pérez, and Aarne Ranta. EC Project Officer: Evangelia Markidou. Keywords: grammar, multilingual, multimodal, multimodal fusion, OWL, ontology, dialogue systems, Grammatical Framework, TrindiKit, GoDiS, DelfosNCL. The partners in TALK are: Saarland University (USAAR), University of
22. Bibliography
Amores, G. and Quesada, J. F. (1997). Episteme. Proceedings of Procesamiento del Lenguaje Natural, 21:1-16.
Appel, A. W. (1998). Modern Compiler Implementation in Java. Cambridge University Press.
Baader, F., Calvanese, D., McGuinness, D., Nardi, D., and Patel-Schneider, P., editors (2003). The Description Logic Handbook. Cambridge University Press.
Bernstein, A., Kaufmann, E., Kaiser, C., and Kiefer, C. (2006). Ginseng: A guided input natural language search engine for querying ontologies. In Jena User Conference, Bristol, UK.
Brickley, D. and Guha, R. V., editors (2004). RDF Vocabulary Description Language 1.0: RDF Schema. W3C Recommendation. http://www.w3.org/TR/rdf-schema/
Bringert, B., Cooper, R., Ljunglöf, P., and Ranta, A. (2005). Development of multimodal and multilingual grammars: viability and motivation. Deliverable D1.2a, TALK Project.
Coppi, S., Noia, T. D., Sciascio, E. D., Donini, F., and Pinto, A. (2005). Ontology-based natural language parser for e-marketplaces. In 18th Intl. Conf. on Industrial and Engineering Applications of Artificial Intelligence and Expert Systems, volume 3533, pages 279-289. Springer-Verlag.
Dean, M. and Schreiber, G., editors (2004). OWL Web Ontology Language Reference. W3C Recommendation. http://www.w3.org/TR/owl-ref/
Denecke, M. (2002). Rapid prototyping for spoken dialogue systems. In 19th In
23. alid for 2.6 Showcases. 2.6.1 Sample rules. The example below illustrates a common case in which grammar rules are generated. Our examples are taken from a smart-house domain, in which the ontology describes both the hierarchy of devices in the house and the actions or commands which can be performed on those devices, such as "switch on the lamp in the kitchen". Thus, consider an ontology where a set of properties is grouped as subproperties of a general hasDeviceCommand property. These properties are graphically displayed in figure 2.4. In this showcase we are going to analyze the portion describing the device-related commands, whose XML equivalent is shown in figure 2.5. In this particular case, the linguist has detected that all properties are actually actions, that is, they correspond to the commands to be performed by the system on all the elements in the range (in this case, all devices within the ontology). This can be easily expressed by the configuration file in figure 2.6. Now, once the application is run with the appropriate configuration file, the following results are obtained:
    Command -> SwitchOff Fan
    Command -> SwitchOff Heater
<owl:ObjectProperty rdf:ID="SwitchOff"> <rdfs:subPropertyOf rdf:resource="#hasDeviceCommand"/> <rdfs:domain rdf:resource="#System"/> <rdfs:range> <owl:Class>
24. almers.se/~aarne/GF/doc/resource.pdf) and application grammarians, working on domain semantics and lexicon. GF's module system helps to maintain this division of labour. The GF Resource Grammar Library has been developed during the last five years to serve as a standard library of GF and to provide the linguistic details for application grammars on different domains. The development has been guided by two major applications using and testing the resource grammars:
• KeY, an authoring system for software specifications in the formal language OCL as well as in English and German.
• WebALT, a system for generating mathematical exercises from MathML representations (currently English, Finnish, French, Italian, Spanish, and Swedish).
In the TALK project, the resource grammar library has been used for parsing and generation in dialogue systems, currently in English, Spanish, and Swedish (Johansson, 2006). In this chapter we give a brief introduction to the GF Resource Grammar Library, focusing on how it is used when writing application grammars. We presuppose knowledge of GF and its module system, knowledge that can be acquired e.g. from the GF tutorial. For the details of the GF Resource Grammar, we refer to its own documentation:
• printable User's Manual, from which this chapter is adapted: http://www.cs.chalmers.se/~aarne/GF/doc/resource.pdf
•
25. an OWL individual in the toplevel class Predicate. In the example domain there are the following predicates:
• departure: "from where do you want to go?", "from the central station"
• destination: "to where do you want to go?", "to the central station"
• shortest_route: "what is the shortest route between the two given stops?", "the shortest route is 5"
To the actions and predicates of a domain we associate utterances that the user and system can perform. In GoDiS these were originally defined in a Prolog predicate, as a phrase-spotting lexicon. However, in TALK deliverables D1.1 and D1.2 (Ljunglöf et al., 2005, 2006) we have shown how to implement the utterances in terms of multilingual and multimodal grammars in GF. In the ontology we associate each utterance with a syntax tree from the GF resource grammar:
• For each Action there is an associated property actionPhrase, which associates a verb phrase (VP) with the action.
• Each Predicate has two associated properties, a wh-question and an answer. Thus we have the two properties questionPhrase and answerPhrase, which associate a wh-question (WhQ) and an answer (Clause) with each predicate.
In the GF resource grammar these categories (VP, WhQ and Clause) are very general, i.e. they can be realized in several different linguistic forms. This gives the freedom to, e.g., either form a request from the user ("start over, please") or some kind of feedback from the syst
26. ances are automatically generated, as well as GoDiS specification files. The specification of the dialogue-specific information, such as dialogue plans and sortal restrictions, and the generation of the GoDiS specification files are presented in TALK deliverable D2.2 (Milward et al., 2006). In this section we only describe the parts that are used to specify the domain-specific utterances.
3.4.1 The GF resource grammar
The main idea is that the domain-specific utterances for the given dialogue domain are specified as syntax trees in the resource grammar ontology described in section 3.3. The resource grammar is grouped under the top-level class Syntax, of which the grammar classes Phrase and Lexicon are subclasses. The grammatical categories are subclasses of Phrase, and the syntax trees are instances of these classes.
3.4.2 Actions and predicates
The actions in GoDiS are used when the user requests the system to perform something, e.g. to restart the dialogue or to give the user some help. In OWL, each GoDiS action becomes an instance of the toplevel class Action. In the example domain there are the following actions:
• restart: "start over, please"
• help: "please give me help"
(Footnote 2: http://protege.stanford.edu)
The 1-place predicates in GoDiS are used for forming wh-questions and for forming answers to questions. Each GoDiS predicate becomes
27. at describes how the grammar rules should be generated. This approach is more convenient for a dialogue system, where the linguistic information in the ontology will be useless and cumbersome, and more suitable from a reusability point of view. Secondly, OWL was chosen instead of JEACC mainly for two reasons:
1. The use of OWL is widespread and it seems to be the basis for the future semantic web. This implies that large ontologies are likely to be available in the future, and this approach will help create dialogue applications more easily by simply downloading specific domain ontologies.
2. OWL is based on RDF and therefore uses Subject-Property-Object triplets. This static structure of OWL is of great help because the algorithm can focus on handling properties, letting the linguist define how to create rules that apply to all the elements in its Domain or Range.
Note, therefore, that this choice is not just a change in the ontology format: the whole parsing algorithm is based on the predefined RDF structure. As previously mentioned, this approach is completely focused on grammar rule generation; no automatic lexicon hierarchy generation has been considered. To ensure coherence between the lexicon and the grammar, a list of potential non-terminal types is extracted, that is, a list of all the entities within the ontology. The linguist decides which
28. ated by several different languages implemented in any language provided by the resource grammar. From this multilingual grammar the concrete languages GodisSystemEng and GodisSystemSwe are automatically derived by instantiating with a suitable lexicon. The system grammar GodisSystemSem gives GoDiS semantics for each of the dialogue moves in the grammar.
User utterances
The user grammar is correspondingly called GodisUser and defines the relationship between the user-specific dialogue moves: requesting, answering, asking and giving a short answer.
Resources
There are two common resources: one for the semantic implementations and one for implementing the natural language grammars. The semantic resource Prolog converts the abstract syntax trees into the Prolog annotation used by GoDiS to represent the semantics of the dialogue system. The natural language resource GodisLang is an interface between the GF resource grammar and the dialogue system grammars. It gives language-dependent linearizations for greetings and other general utterances. It also provides macros for transforming common phrases in a dialogue system into the syntax trees given by the GF resource grammar.
4.3 The domain-specific grammars for system utterances
The domain-specific grammars are, just like the domain-independent grammars, separated into a system grammar and a user grammar. In
29. borrowed from English (maybe not so strange in certain technical domains).
1.3.4 Syntax rules
Syntax rules should be looked for in the abstract modules defining the API. There are around 10 such modules, each defining constructors for a group of one or more related categories. For instance, the module Noun defines how to construct common nouns, noun phrases and determiners. Thus the proper place to find out how nouns are modified with adjectives is Noun, because the result of the construction is again a common noun. Browsing the libraries is helped by the gfdoc-generated HTML pages. However, this is still not easy, and the most efficient way is probably to use the parser. To find out which resource function implements a particular structure, one can just parse a string that exemplifies this structure. For instance, to find out how sentences are built using transitive verbs, write
  > i english/LangEng.gf
  > p -cat=Cl "she loves him"
  PredVP (UsePron she_Pron) (ComplV2 love_V2 (UsePron he_Pron))
Parsing with the resource grammar has an acceptable speed for English and the Scandinavian languages. In the current implementation, parsing for other languages can be inefficient. However, examples parsed in one language can always be linearized into other languages:
  > i italian/LangIta.gf
  > l PredVP (UsePron she_Pron) (ComplV2 love_V2 (UsePron he_Pron))
  lo ama
  > p -cat=Cl -lang=LangEng "she loves him" | l -lang=LangIta
  lo ama
30. ce-activated virtual butler who handles the full range of multimodal input and presentation possibilities for a spoken and click-based interface (see figure 2.1). It has been developed at the University of Seville as part of the TALK Project. Like its immediate predecessor, Delfos, it is based on the ISU approach. The keywords that best define the system are:
• Agent-based
• Intelligent
• User-centered
• Industry-oriented
• Fully multimodal for both input and output
• Ontology-based
It is therefore a context-aware and non-menu-based system that fully adapts to the user's needs according to the user model configuration. The first version of the system is focused on the Smart Home scenario and specifically oriented towards the needs of wheelchair-bound users, although the general results can be extrapolated to other user types.
[Figure 2.1: MIMUS Screenshot]
MIMUS incorporates a series of research results obtained during the project, such as:
• Grammar generation from ontologies
• Integration of OWL ontological knowledge with the ISU approach
• Dynamic reconfiguration of the home ontology
• Multimodal Turn Plann
31. ces
Throughout the GF grammar library, the grammar modules are split into a system part and a user part, with shared additional resources. From the perspective of a GoDiS dialogue system, the system grammars are used for parsing the semantic representation (in Prolog syntax) of the dialogue moves into an abstract syntax that can be used for linearization into the natural language used by the user. The user grammars are correspondingly used for translating natural language utterances into a semantic representation in Prolog syntax, via the abstract syntax. The separation into system and user grammars gives more efficient grammars, constructed for their specific needs. It is not necessary to have user linearizations for some system-specific questions. The user will never ask the system "Do I want to add a song?" However, the system has to be able to talk about everything within the system. If the confidence score on the received user input is low, the system will try to ground a request before acting upon it: "Add a song, is that correct?"
[Figure 4.1: The structure of the grammars for a domain application]
Another difference is that we want the user grammar to recognize all possible ways different users will pose questions, give answers or request actions. For the system we only need one way of phrasing an utterance. Since the system has to be able to talk about everythi
32. cion functions. If, furthermore, both superclasses are subclasses of the same class, it can be necessary to use function definitions for enforcing equalities between instances. E.g., suppose that B is also a subclass of A', and that both A and A' are subclasses of C. Then we must state that for each b : B the result is the same regardless of whether we go via A or A':
  fun B_to_C : C -> C ;
  def B_to_C (A_sub_C (B_sub_A x)) = A'_sub_C (B_sub_A' x) ;
      B_to_C x = x ;
Uninhabited superclasses
There is one common special case of subclasses, and that is when the superclass does not have any instances of its own. This superclass can then be implemented in GF as a dependent category. Suppose that D, E and F are all subclasses of the class Super, which itself does not have any instances. This can be written as
  cat Super DEF ; DEF ;
  fun D, E, F : DEF ;
This means that the subclass D corresponds to the GF dependent category Super D. An instance d of the subclass D is now written
  fun d : Super D ;
Instances of several classes
It is not possible to directly state that an individual is an instance of two classes, since a GF term can only have one category. We solve this by introducing intersections of classes instead. That ab is an instance of both A and B will then be written as
  cat AB ;
  fun AB_sub_A : AB -> A ;
      AB_sub_B : AB -> B ;
      ab : AB ;
I.e., ab is an instance of t
33. e. The objective here is to generate the list of all applicable rules.
3. Generate the output of rules. In this step the script goes through the previous list of applicable rules, substituting the references to classes and properties by the corresponding Input Form from the ontology.
The first two steps described have been implemented by a finite state machine (FSM), illustrated in figure 2.3. For each state in the FSM only one set of attributes can be parsed. These are mentioned in the previous DTD structure.
[Figure 2.4: Ontology Structure]
Base
• No attributes are expected in this state.
Property
• propertyRef: Indicates the word that references the property in the rule description.
• subPropertyOf: Indicates a superproperty. All the subproperties of this, including the indicated one, will be treated by the algorithm.
Triplet
• domainRef: Indicates the word that references the domain in the rule description.
• rangeRef: Indicates the word that references the range in the rule description.
Rule
• lang: Indicates what language the rule is v
34. e-place verbs: "walks"
• two-place verbs: "loves Mary"
• three-place verbs: "gives her a kiss"
• sentence-complement verbs: "says that it is cold"
• VP-complement verbs: "wants to give her a kiss"
A special verb is the copula, "be" in English but not even realized by a verb in all languages. A copula can take different kinds of complement:
• an adjectival phrase: "(John is) old"
• an adverb: "(John is) here"
• a noun phrase: "(John is) a man"
The Adjective module
How to construct APs. The main ways are:
• positive forms of adjectives: "old"
• comparative forms with object of comparison: "older than John"
The Adverb module
How to construct Advs. The main ways are:
• from adjectives: "slowly"
• as prepositional phrases: "in the car"
1.3.7 Multimodality
The resource grammar library can also handle multimodal input, in the form of general pointing gestures. The idea is to extend the grammars with the single category Point. The library doesn't define the Point category further, but leaves it to the specific domain implementation. As an example, in the multimodal MP3 player DJGoDiS the user can click on artists, songs and generic buttons displayed on the screen. As another example, in a multimodal tram information system, tram stops are the interesting points. The linearizatio
35. ed as dependent categories.
Datatype properties
GF has primitive notions of strings, integers and real numbers, with the categories String, Int and Real. This means that datatype properties which range over strings, integers or reals are straightforward to handle.
3.1.3 General OWL axioms and restrictions
More general axioms and restrictions on an ontology can be difficult to implement in GF. One example is negated propositions, e.g. stating that two classes are disjoint. GF has no primitive notion of negation, but uses the type-theoretical definition that negation means implication of absurdity (Nordström et al., 1990). To handle general OWL restrictions we can always resort to the standard solution in type theory, to declare a category of logical propositions in parallel with the category of proofs of propositions:
  cat Prop ; Proof Prop ;
We refer to e.g. Nordström et al. (1990) or Ranta (1994) for more discussion about these topics.
3.1.4 Ontologies and concrete syntax
Annotation properties in OWL are for non-ontological information about classes and/or individuals. There are some predefined annotation properties, e.g. rdfs:comment, rdfs:seeAlso and rdfs:label. For our purposes the most interesting is rdfs:label: "rdfs:label is an instance of rdf:Property that may be used to provide a human-readable version of a resource's name. [...] Multilingual labels are supported using the language tagging facility of RDF literals." (Brickley a
36. em ("do you want to start over?", "I'm starting over").
3.4.3 Sorts and individuals
The sortal hierarchy is an OWL subclass hierarchy under the top-level class Sort. Consider the following example hierarchy from the Tram domain:
  Sort
    Location
      Street
      Stop
        TramStop
        BusStop
    Line
      TramLine
      BusLine
    Route
This subclass hierarchy says that each TramStop is also a Stop, which is also a Location.
Connecting the sorts to the grammar
The individuals of each sort can be used either directly as short answers, or as arguments to predicates. Subsorts have to be of the same grammatical category as their mother sorts, which means that it is the direct subclasses of Sort that decide the associated grammatical category. Thus, each direct subclass of Sort has a property mapping the instances to a syntax tree of the associated grammatical category. For the example hierarchy from the Tram domain we get the following properties:
• locationPhrase, mapping each Location to a ProperName
• linePhrase, mapping each Line to a ProperName
• routePhrase, mapping each Route to a Sentence
A common case is when a sort is a database of entities, such as Location and Line above. In these cases the concrete syntax often consists of strings with very regular, or non-existent, morphology, and then the annotation property rdfs:label can be an alternative instead of syntax
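The rule that the direct subclass of Sort decides the phrase property and grammatical category can be sketched in Python as follows. This is only an illustration: the `PARENT` table is an assumed reading of the Tram hierarchy above (Street is left out), and `top_level_sort` and `phrase_property` are hypothetical helper names, not part of GoDiS or GF.

```python
# Assumed reading of the Tram sort hierarchy, stored as child -> parent.
PARENT = {
    "Location": "Sort",
    "Line": "Sort",
    "Route": "Sort",
    "Stop": "Location",
    "TramStop": "Stop",
    "BusStop": "Stop",
    "TramLine": "Line",
    "BusLine": "Line",
}

# One phrase property per direct subclass of Sort, as listed in the text:
# (property name, grammatical category of the syntax tree).
PHRASE_PROPERTY = {
    "Location": ("locationPhrase", "ProperName"),
    "Line": ("linePhrase", "ProperName"),
    "Route": ("routePhrase", "Sentence"),
}


def top_level_sort(sort):
    """Climb the subclass chain until a direct subclass of Sort is reached."""
    if sort not in PARENT:
        raise KeyError(f"unknown sort: {sort}")
    while PARENT[sort] != "Sort":
        sort = PARENT[sort]
    return sort


def phrase_property(sort):
    """Return the property and category inherited from the top-level sort."""
    return PHRASE_PROPERTY[top_level_sort(sort)]


if __name__ == "__main__":
    for sort in ("TramStop", "BusLine", "Route"):
        print(sort, phrase_property(sort))
```

Because subsorts inherit the category of their mother sort, a lookup for TramStop walks up through Stop to Location and returns locationPhrase with category ProperName.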
37. epts in the ontology to build a parser translating natural language input to formulae in description logics. Bernstein et al. (2006) use a similar approach, with a static grammar for recognizing domain-independent linguistic structures and a dynamic grammar which is automatically generated from an OWL ontology. However, these systems are simple question-answering systems, and not general dialogue systems. In particular, there has been no previous work on incorporating ontological knowledge into recognition and generation grammars for ISU-based dialogue systems. The main result of this deliverable is that we present different solutions for how this integration of ontologies and grammars into ISU-based dialogue systems can be made. The solutions are also practical, since they are implemented in existing dialogue systems. One of the main advantages of integrating ontologies and ISU-based dialogue systems is that it becomes simple and fast to implement new languages and/or develop new dialogue domains. In fact, a complete dialogue application for the GoDiS dialogue system can be specified as a single OWL ontology. From this ontology all necessary dialogue system files can be generated, which is shown in TALK deliverable D2.2 (Milward et al., 2006), as well as the necessary recognition and generation grammars, which is shown in this deliverable. This is similar to Denecke (2002), which describes a framework for rapid prototyping of form-filling dialogue syste
38. ernative way of implementing general properties, which is inspired by the previous idea. We can declare the category ListB with its constructor functions BaseB and ConsB:
  cat ListB ;
  fun BaseB : ListB ;
      ConsB : B -> ListB -> ListB ;
Now the general property Prop can be implemented as a function from A to ListB:
The definition of equiv_B is analogous.
  fun Prop : A -> ListB ;
The facts that a is not related to anything, and that a' is related to both b and b', are written as
  def Prop a = BaseB ;
      Prop a' = ConsB b (ConsB b' BaseB) ;
There is syntactic sugar in GF for writing lists. The category ListB can be written [B], and the declarations above are then written
  cat [B]{0} ;
  fun Prop : A -> [B] ;
The 0 in the declaration of [B] says that BaseB takes no arguments, i.e. that [B] can be uninhabited.
Cardinality restrictions on properties
OWL cardinality restrictions can be implemented in the same way. That each A is related to at least three B's can be declared as
  cat [B]{3} ;
  fun Prop : A -> [B] ;
Maximality restrictions can also be implemented, although this is cumbersome. That each A is related to at least one and at most three B's can e.g. be declared as
  cat B123 ;
  fun B1 : B -> B123 ;
      B2 : B -> B -> B123 ;
      B3 : B -> B -> B -> B123 ;
  fun Prop : A -> B123 ;
Subproperties
In OWL
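As an illustration of the list view of properties and of the cardinality bounds described above, here is a small Python sketch. It is not how GF works internally: GF encodes the bounds in the types of the list constructors, whereas this sketch checks them at run time. The dictionary `PROP`, the key `a2` (standing in for a') and the function `check_cardinality` are hypothetical names used only for this example.

```python
# A general property seen as a function from individuals of A to lists of B.
# "a" is related to nothing; "a2" is related to both "b" and "b2".
PROP = {
    "a": [],
    "a2": ["b", "b2"],
}


def check_cardinality(prop, minimum=0, maximum=None):
    """Return the individuals whose value list violates the given bounds."""
    violations = []
    for individual, values in prop.items():
        too_few = len(values) < minimum
        too_many = maximum is not None and len(values) > maximum
        if too_few or too_many:
            violations.append(individual)
    return violations


if __name__ == "__main__":
    # No restriction: the empty list is allowed.
    print(check_cardinality(PROP))
    # "At least one and at most three" values per individual.
    print(check_cardinality(PROP, minimum=1, maximum=3))
```

With a minimum of one, the individual with the empty list is reported, which corresponds to a term that the restricted GF categories above would not allow to be built.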
39. essary GoDiS files and GF grammars can be generated. Finally, chapter 4 consists of a description of the final structure of the Enhanced Multimodal Grammar Library, which is implemented in GF as a front end to the generic dialogue system GoDiS, built within TrindiKit. The library also includes three example dialogue domains: a calendar application called AgendaTalk, an MP3 player application called DJGoDiS, and a tram information application called TramGoDiS. The library is designed to make it easy to add new dialogue domains, source languages, and input and output modalities.
1.1 OWL, the Web Ontology Language
OWL, the Web Ontology Language, is a W3C standard for describing ontologies. We are not using the full flavour of OWL; only the main components are needed. The main reason why we have chosen OWL instead of any other ontology description language is that it is a standard. OWL has three main components: classes, individuals and properties.
Classes
Classes correspond to sets of individuals. The main relation between classes is entailment, i.e. that classes can be subclasses of other classes. There are no constraints on the subclass relation, which means that a class can e.g. be a subclass of several classes. It is possible to form combined classes, most notably the intersection or the union of several classes. Classes can also be declared to be equivalent or disjoint.
Individuals
Individuals are class elements. Just as in set
40. g.gf, PsychoSystemSwe.gf
• GF grammars for user utterances:
  Abstract syntax: PsychoUser.gf
  Semantics: PsychoUserSem.gf
  One concrete syntax for each language: PsychoUserEng.gf, PsychoUserSwe.gf
The files in parenthesis are very straightforward to write; they can in fact be created automatically from the abstract syntax. The sizes of the files depend largely on the size of the domain: roughly, the sizes of the grammars are proportional to the number of actions, predicates, sorts and individuals there are in the domain.
Chapter 5
Summary and Conclusions
The ISU approach uses abstract representations for dialogue states and update rules, which allow the generic characterisation of flexible dialogue strategies. This enables the same code for dialogue management techniques to be used for different natural languages and for different domains. In this deliverable we have discussed how dialogue systems, and especially grammars for dialogue systems, can be related to existing knowledge representation systems. We have focussed on one specific language for knowledge representation, the Web Ontology Language OWL, which is a W3C standard for ontology descriptions for knowledge representation. Since it is a standard, there is already much work done on relating OWL to other ontology formalisms. We have shown how to generate the domain-specific utterances for a device-oriented
41. he intersection of A and B, which is a subclass of both A and B.
Class equivalence
In OWL there are two notions of class equivalence, intensional and extensional. That two classes A and B are extensionally equivalent simply says that they are subclasses of each other:
  fun A_sub_B : A -> B ;
      B_sub_A : B -> A ;
This equivalence can be further enforced by defining equivalence functions on A and B:
  fun equiv_A : A -> A ;
  def equiv_A (B_sub_A (A_sub_B x)) = x ;
      equiv_A x = x ;
Intensional equality is not directly possible in GF, however, since operator definitions are not allowed in abstract syntax.
3.1.2 Properties
Properties can be implemented as categories which depend on the domain and range classes. The OWL property Prop with domain A and range B is thus written as
  cat Prop A B ;
An individual a which has the property Prop b will be a GF constant:
  fun a_Prop_b : Prop a b ;
Functional properties
A special case is when a property Qrop is functional, i.e. each element in the domain is connected to one and only one element. This can be implemented in GF by using function definitions. The property itself is represented by a function from A's to B's:
  fun Qrop : A -> B ;
Each instance of the property is then a row in the definition of Qrop:
  def Qrop a = b ;
      Qrop a' = b' ;
Alternative view of properties
There is an alt
42. icult for most languages, since the inflectional patterns are often irregular. It is possible to solve this with more complicated properties and morphological class hierarchies for each language, but we do not pursue this issue further and instead resort to writing the concrete lexicon instances as GF grammars directly.
3.3.5 Ontology editing as a multilingual authoring system
Dymetman et al. (2000) noted that an ontology for abstract syntax can be used for multilingual document authoring. This suggests that, with our OWL interpretation of the resource grammar, any OWL ontology editor (such as Protégé) can be used as a multilingual authoring system for generating grammatically correct sentences:
• OWL is used to create abstract syntax trees,
• which are then automatically translated to GF format,
• and linearized in all languages implemented in the resource grammar.
This idea is used in section 3.4.1 for specifying multilingual user and system utterances in a dialogue domain. Note that the GF system itself comes with tools supporting multilingual document authoring (Khegai et al., 2003), which can be used as an alternative to an OWL editor.
3.4 Using ontologies to specify GoDiS dialogue domains
A whole dialogue domain for a GoDiS dialogue system can be specified as an OWL ontology. From such an ontology, grammars for system utterances and user utter
43. in the system grammar. There are two important differences, however. The first difference is that not all rules from the system grammar will be in the user grammar. This is true for most predicates, which will have either the question form or the answer form in the user grammar, not both forms. E.g., since it is the user who asks which song is the current one, it is very unlikely that the user gives an answer to that question. The second difference is that the user can give extra information in the form of additional answer moves. This means that the user grammar will contain rules taking additional arguments:
• Combined answers, e.g. "London calling by the Clash":
  fun song_artist : Song -> Artist -> ShortAns ;
• Combined actions, e.g. "Add London calling by the Clash":
  fun playlist_add__song_artist : Song -> Artist -> Action ;
• Combined questions, e.g. "What songs have the Clash made?":
  fun available_song__artist : Artist -> Question ;
Natural language syntax
The linearization types for user utterances are simple strings, which means that it is possible to just list all synonym strings for an utterance:
  lin current_song = {s = variants {
        "what is the name of the current song" ;
        "what is the name of this song" ;
        "which is the current song" ;
        "which song is this" }} ;
The first two strings can also be optimized to
  "what is the name of" ++ variants {"the current" ; "this"} ++ "song"
A good habit is t
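To see why the factored form is equivalent to listing the first two strings, the expansion can be sketched in Python. This is only an illustration of the idea, not GF's implementation: a nested list plays the role of `variants {...}`, and `expand` is a hypothetical helper.

```python
from itertools import product


def expand(parts):
    """Expand a sequence of parts into all concrete strings.

    A part is either a plain string or a list of alternatives; the list
    plays the role of GF's variants in this sketch.
    """
    choices = [[part] if isinstance(part, str) else list(part) for part in parts]
    return [" ".join(combination) for combination in product(*choices)]


# The two long strings, listed one by one.
listed = [
    "what is the name of the current song",
    "what is the name of this song",
]

# The factored form, with the shared prefix and suffix written only once.
factored = expand(["what is the name of", ["the current", "this"], "song"])

if __name__ == "__main__":
    print(factored == listed)
```

The factored form scales better when several positions vary, since the number of listed strings grows with the product of the alternatives.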
44. including these new rules:
  Command -> Fan SwitchOff
  Command -> Heater SwitchOff
Note that, linguistically speaking, this order is also possible in English, in topicalized or left-dislocated constructions such as "The lamp, switch it on".
  <rulesList>
    <forEach property="P" subPropertyOf="hasDeviceCommand">
      <forEach domain="X" range="Y">
        <rule> <left>Command</left> <right>P Y</right> </rule>
        <rule> <left>Command</left> <right>Y P</right> </rule>
      </forEach>
    </forEach>
  </rulesList>
Figure 2.7: Multimodal configuration file
  Command -> Lamp SwitchOff
  Command -> DimmerLamp SwitchOff
  Command -> Radio SwitchOff
  Command -> TV SwitchOff
  Command -> Fan SwitchOn
  Command -> Heater SwitchOn
  Command -> Lamp SwitchOn
  Command -> DimmerLamp SwitchOn
  Command -> Radio SwitchOn
  Command -> TV SwitchOn
  Command -> Blind Close
  Command -> Blind Open
2.6.3 Capturing multilinguality
Due to the structural differences among human languages, different rules must be generated for different languages. For example, to indicate the location of a given device, it would be "the kitchen light" in English, whereas in Spanish the word order changes: "la luz de la cocina" ("the light of the kitchen"). Once the target language has
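The expansion driven by the configuration file in figure 2.7 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the MIMUS tool itself: the ontology is replaced by a hand-written dictionary that maps each subproperty of hasDeviceCommand to the devices in its range (the pairs are read off the generated rules listed above), and `expand_rules` is a hypothetical helper.

```python
# Hypothetical stand-in for the OWL ontology: each subproperty of
# hasDeviceCommand is mapped to the device classes in its range.
DEVICE_COMMANDS = {
    "SwitchOff": ["Fan", "Heater", "Lamp", "DimmerLamp", "Radio", "TV"],
    "SwitchOn": ["Fan", "Heater", "Lamp", "DimmerLamp", "Radio", "TV"],
    "Close": ["Blind"],
    "Open": ["Blind"],
}

# Right-hand-side templates mirroring the two <rule> elements of figure 2.7:
# command before device, and device before command.
TEMPLATES = [("P", "Y"), ("Y", "P")]


def expand_rules(commands, templates, left="Command"):
    """Instantiate every template for every (property, range class) pair."""
    rules = []
    for template in templates:
        for prop, devices in commands.items():
            for device in devices:
                binding = {"P": prop, "Y": device}
                right = " ".join(binding[symbol] for symbol in template)
                rules.append(f"{left} -> {right}")
    return rules


if __name__ == "__main__":
    for rule in expand_rules(DEVICE_COMMANDS, TEMPLATES):
        print(rule)
```

Adding a new device or command to the dictionary yields the corresponding rules in both orders without touching the templates, which is the point of generating the rules from the ontology.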
45. ing
• Modality-specific resources: 3D house and talking head
• Multimodal I/O fusion/fission
• Multimodal and multilingual grammar libraries
• Multilinguality on the fly
• It is based on a corpus of multimodal WoZ experiments with potential users in the in-home domain
• Non-menu-based, pro-active system
• Speech-activated dialogue manager
2.2 Automatic grammar generation
The problem of manually generating grammars for a Natural Language Understanding (NLU) system has been widely discussed, and several authors have proposed different solutions. Two main approaches can be highlighted: Grammatical Inference and Rule-Based Grammar Generation. The Grammatical Inference approach refers to the process of learning grammars and languages from data, and is nowadays considered an independent research area within Machine Learning techniques (footnote 1: http://eurise.univ-st-etienne.fr/gi). Examples of applications based on this approach are ABL (Zaanen, 2000) and EMILE (Williem Adriaans, 1992). On the other hand, the Rule-Based approach tries to generate the grammar rules from scratch, and is therefore based on the expertise of a linguist, while trying to minimize the manual work. An example of this approach is the Grammatical Framework (Ranta, 2004), whose proposal is to organize the multilingual grammar construction in two building blocks: the category and func
46. ion paradigms
1.3.4 Syntax rules
1.3.5 Syntactic structures on the sentence level
1.3.6 Sub-sentential syntactic structures
1.3.7 Multimodality
2 Generating Multilingual Grammars from OWL Ontologies in MIMUS
2.1 Introduction
2.2 Automatic grammar generation
2.2.1 Generating grammars from ontologies
2.3 [title illegible in source]
2.4 [title illegible in source]
2.5 Overview of the algorithm
2.6 Showcases
2.6.1 Sample rules
2.6.2 Capturing multimodality
2.6.3 Capturing multilinguality
2.7 Conclusions and future work
3 Ontologies and Grammatical Framework
3.1 Relating existing ontologies with abstract GF grammars
3.1.1 Classes and individuals
3.1.2 Properties
3.1.3 General OWL axioms and restrictions
3.1.4 Ontologies and concrete syntax
3.2 Incorporating the Mimus in-home ontology
3.2.1 The Mim
47. k a different linearization of Song:
  lin Song = regGenN "chanson" feminine ;
But to linearize the rule that modifies a kind with a property, we can use the very same rule in German and French. The resource function AdjCN has different implementations in the two languages (e.g. different word orders), but the application programmer need not worry about the difference.
1.3.3 Inflection paradigms
Inflection paradigms are defined separately for each language Lng in the module ParadigmsLng. For the sake of convenience, every language implements these five paradigms:
  oper
    regN  : Str -> N ;   -- regular nouns
    regA  : Str -> A ;   -- regular adjectives
    regV  : Str -> V ;   -- regular verbs
    regPN : Str -> PN ;  -- regular proper names
    dirV2 : V -> V2 ;    -- direct transitive verbs
It is often possible to initialize a lexicon by just using these functions, and later revise it by using the more involved paradigms. For instance, in German we cannot use regN "Lied" for Song, because the result would be a Masculine noun with the plural form "Liede". The individual Paradigms modules tell what cases are covered by the regular heuristics. As a limiting case, one could even initialize the lexicon for a new language by copying the English (or some other already existing) lexicon. This would produce language with correct grammar but with content words directly
48. lash. The linearization types for the dialogue move categories in the user grammar are simple strings, not complex phrases from the resource grammar. The main reason for this is that the user utterances will only be used in one way, since the system replies are already defined in the system grammar. Another reason is that some user utterances that we want to capture can be grammatically incorrect, and therefore not covered by the resource grammar. A third reason is that it should be simple to add a new synonym for the same semantics. These new synonyms can e.g. be taken from a corpus of user utterances, and then it is easiest to just add a string instead of a syntax tree.
Abstract syntax
The abstract syntax consists of one rule for all synonymous utterances. Two utterances are synonyms if they are interpreted in the same way by GoDiS, i.e. they have the same translation into a list of GoDiS dialogue moves. This means that any sort, action or predicate that the user would want to talk about will have a user-specific variant:
• Giving a short answer:
  fun artist : Artist -> ShortAns ;
• Requesting an action:
  fun playlist_add : Action ;
• Asking a question:
  fun current_song : Question ;
• Answering a question:
  fun song_to_play : Answer ;
Note that these functions look very similar, some of them even identical, to the corresponding rules
49. le element list from one argument. There are also the operations pm2, pm3, etc. for constructing multiple-argument lists. The functions ask and request are defined in the common system grammar, and current_song_Q and playlist_add are defined in the MP3 system grammar. Combined moves taking extra arguments are handled in the same way, only returning longer lists of dialogue moves:

    lin
      song_artist x y = pm2 (shortAns (song x)) (shortAns (artist y)) ;
      playlist_add__song_artist x y = pm3 (request playlist_add) (answer (song_to_add_P x)) (answer (artist_to_add_P y)) ;
      available_song__artist x = pm2 (ask available_song_Q) (shortAns (artist x)) ;

4.5 Integrating multimodality in the grammars

The grammar library supports both flavours of multimodality, parallel and integrated, as introduced in TALK deliverable D1.2a (Bringert et al., 2005). Parallel multimodality, such as graphical output, is handled as just another concrete syntax. Integrated multimodality, such as spoken user input integrated with clicks on a screen, is handled by adding a row to the linearization record for the input clicks. This is already incorporated in the GF resource grammar, as explained in section 1.3.7. Both these techniques are thoroughly discussed in TALK deliverable D1.2b (Ljunglöf et al., 2006) and will not be discussed further in this chapter.

4.6 Extending the GF grammar
50. le is to exploit techniques for automatically generating multilingual and multimodal grammars from an ontology. We present two similar ideas, which are implemented in the dialogue systems Mimus and GoDiS. In Mimus, the dialogue system grammars are automatically generated from an ontology of devices by supplying a configuration file explaining the linguistic details. In GoDiS, the whole dialogue domain can be specified as an ontology, from which GF grammars are generated. The rest of the dialogue application can also be generated from the ontology, which is discussed in TALK deliverable D2.2 (Milward et al., 2006). Finally, we describe the final structure of the Enhanced Multimodal Grammar Library.

In making the grammar library we exploit the advantages of the ISU approach. The ISU approach utilizes structured Information States to keep track of dialogue context information. These Information States can be read and updated by several different modules, which access precisely the information that they need. This enables a modular architecture which allows generic solutions for dialogue technology. For example:

• different language modules can interact with essentially similar Information States, enabling rapid porting of dialogue systems from one language to another, and the creation of multilingual dialogue systems;
• coding of dialogue behaviour is supported independently of language and domain, thus allowing for the rapid porting of dialogue systems to diffe
51. le ontology. By representing the utterances as abstract syntax trees in a Grammatical Framework (GF) resource grammar, the representation of utterances is language-independent. The resource grammar we are using has a coverage comparable to the Core Language Engine, and exists for 11 different languages, including the TALK languages English, German, Spanish and Swedish, but also Russian and the non-Indo-European languages Finnish and Arabic. Since the grammar is multilingual, the only thing that has to be done to localize a dialogue application to a new language is to write a lexicon for the domain-specific entities.

In this deliverable we also describe the final status of the full TALK Grammar Library, which is written in GF. The structure is modified so that the domain-specific parts of a grammar can be described in an OWL ontology, and to further simplify localization to new languages, domains and modalities.

Chapter 1 Introduction

This deliverable concerns the development of technology for making use of ontological knowledge in dialogue system grammars. We have focussed on one specific language for knowledge representation, the Web Ontology Language (OWL), which is a W3C standard for ontology descriptions. We discuss how the abstract syntax of the Grammatical Framework (GF) can be used as an ontology specification language. The main concern of the deliverab
52. me intrans can be inferred from the context. This means that it is not necessary to create the class intrans, being a subclass of VP, together with the property verb to the daughter class Verb; we can instead state that Verb is a subclass of VP directly. Note that this can be applied to the problem of multiple arguments with the same category, described above. Since the coercion rule objectNP is a unique single-argument rule, we can simply state that NP is a subclass of ObjectNP.

Optional arguments

A context-free rule with optional arguments can be converted to one single OWL rule class, where the optional arguments become optional properties. GF does not support optional arguments, which means that this must be implemented by several GF functions. Suppose that we have the following context-free rule with two optional arguments:

    NP -> Det (AP) Noun (PP)

This has to be simulated by four GF rules:

    fun
      np0 : Det -> AP -> Noun -> PP -> NP ;
      np1 : Det -> Noun -> PP -> NP ;
      np2 : Det -> AP -> Noun -> NP ;
      np3 : Det -> Noun -> NP ;

If we used our translation above, we would get four subclasses of the NP class. But these can be merged into one single subclass, where the daughter properties ap and pp are optional, whereas det and noun are obligatory. One problem remains, and that is the name of the merged class. It
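The expansion of a rule with optional arguments into one GF function per combination is mechanical, and can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name expand_optional and the (category, is_optional) encoding of the right-hand side are hypothetical and not part of the deliverable. For the NP rule with optional AP and PP it prints the four signatures in the order np0 to np3 used in the text.

```python
from itertools import product

def expand_optional(lhs, rhs):
    """Expand a rule with optional daughters into GF-style type signatures.

    rhs is a list of (category, is_optional) pairs (a hypothetical encoding).
    One signature is produced for every way of keeping or dropping the
    optional daughters, i.e. 2**k signatures for k optional daughters.
    """
    optional = [i for i, (_, is_opt) in enumerate(rhs) if is_opt]
    signatures = []
    for choice in product([True, False], repeat=len(optional)):
        # Pair the choices with the optional positions from right to left,
        # so that the enumeration order matches np0..np3 in the text.
        keep = dict(zip(reversed(optional), choice))
        cats = [cat for i, (cat, is_opt) in enumerate(rhs)
                if not is_opt or keep[i]]
        signatures.append(" -> ".join(cats + [lhs]))
    return signatures

# NP -> Det (AP) Noun (PP)
rule = [("Det", False), ("AP", True), ("Noun", False), ("PP", True)]
signatures = expand_optional("NP", rule)
for n, sig in enumerate(signatures):
    print(f"np{n} : {sig} ;")
```

The number of generated functions doubles with every optional daughter, which is one reason why merging them back into a single OWL class with optional properties is attractive.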
53. mes called terms, where $\mathcal{L}(A)$ are all terms with category $A$:

    $\mathcal{L}(A) = \{\, f(t_1, \ldots, t_n) \mid f : A_1 \to \cdots \to A_n \to A,\ t_1 \in \mathcal{L}(A_1), \ldots, t_n \in \mathcal{L}(A_n) \,\}$

[Footnote 2] The term DL stands for Description Logic (Baader et al., 2003).
[Footnote 3] GF also supports non-context-free rules, by the use of dependent types and higher-order functions. Furthermore, it is possible to define reduction rules via the def declaration. See Ranta (2004) for further details.

Concrete syntax

One abstract grammar can have several corresponding concrete grammars, which makes GF a natural multilingual grammar formalism. The concrete grammar specifies how the abstract grammar rules should be linearized in a compositional manner. Each abstract category $A$ has a corresponding linearization type $A^\circ$, and each rule $f : A_1 \to \cdots \to A_n \to A$ has a corresponding linearization function $f^\circ$ over linearization types:

    $f^\circ : A_1^\circ \to \cdots \to A_n^\circ \to A^\circ$

The linearization types and linearization functions are specified in GF by lincat and lin declarations:

    lincat A = $A^\circ$ ;
    lin f x1 ... xn = $f^\circ(x_1, \ldots, x_n)$ ;

The linearization of a tree $f(t_1, \ldots, t_n) \in \mathcal{L}(A)$ is

    $[\![ f(t_1, \ldots, t_n) ]\!] = f^\circ([\![ t_1 ]\!], \ldots, [\![ t_n ]\!])$

The expressive power of GF comes from the rich type system for specifying linearization types. It supports discontinuous constituents, inflection tables and inflectional parameters, in addition to ordinary string sequences.

Resource syntax and modules

A final
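The compositional definition of linearization can be made concrete with a small Python sketch. It is a deliberately simplified model, not GF itself: linearization types are plain strings here (real GF linearization types are records and tables), and the function names PredVP, john and walk, as well as the two tiny "concrete syntaxes", are assumptions chosen for illustration.

```python
# An abstract syntax tree is a tuple: (function name, subtree, subtree, ...).
# A "concrete syntax" maps each abstract function name to a Python function
# over the linearizations of its arguments (plain strings in this sketch).
ENGLISH = {
    "PredVP": lambda np, vp: f"{np} {vp}",
    "john": lambda: "John",
    "walk": lambda: "walks",
}
SWEDISH = {
    "PredVP": lambda np, vp: f"{np} {vp}",
    "john": lambda: "John",
    "walk": lambda: "går",
}

def linearize(tree, concrete):
    """Linearize a tree bottom-up: first the subtrees, then the function
    at the root applied to the results."""
    fun, *args = tree
    return concrete[fun](*[linearize(arg, concrete) for arg in args])

tree = ("PredVP", ("john",), ("walk",))
print(linearize(tree, ENGLISH))
print(linearize(tree, SWEDISH))
```

The same abstract tree is reused unchanged for both languages; only the table of linearization functions differs, which is the sense in which one abstract grammar can have several concrete grammars.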
54. ms. In IJCAI'03 Workshop on Knowledge and Reasoning in Practical Dialogue Systems, Acapulco, Mexico.

Milward, D. and Beveridge, M. (2003). Ontology-based dialogue systems. In IJCAI 2003 Workshop on Knowledge and Reasoning in Practical Dialogue Systems.

Nordström, B., Petersson, K., and Smith, J. (1990). Programming in Martin-Löf's Type Theory. Oxford University Press.

Patel-Schneider, P. F., Hayes, P., and Horrocks, I., editors (2004). OWL Web Ontology Language Semantics and Abstract Syntax. W3C Recommendation. http://www.w3.org/TR/owl-semantics/

Quesada, J. F. and Amores, G. (2002). Knowledge-based reference resolution for dialogue management in a home domain environment. In Johan Bos, M. E. and Matheson, C., editors, Proceedings of the sixth workshop on the semantics and pragmatics of dialogue (Edilog), pages 145-189.

Ranta, A. (1994). Type-Theoretical Grammar. Oxford University Press.

Ranta, A. (2004). Grammatical Framework, a type-theoretical grammar formalism. Journal of Functional Programming, 14(2):145-189.

Rayner, M., Carter, D., Bouillon, P., Digalakis, V., and Wirén, M. (2000). The Spoken Language Translator. Cambridge University Press.

Russ, T., Valente, A., MacGregor, R., and Swartout, W. (1999). Practical experiences in trading off ontology usability and reusability. In Proceedings of the Knowledge Acquisition Workshop (KAW99), Banff, Alberta.

Seki, H., Matsumura, T., Fujii, M., and Kasami, T. (1991). On multiple co
55. ms. Since we are using an ISU-based approach, we in addition get all the benefits described above, such as language independence and dialogue flexibility. Furthermore, by converting the ontologies to Grammatical Framework grammars, we can create a multilingual dialogue system by only writing a domain-dependent lexicon for each language.

Layout of the deliverable

We begin by giving a general description of OWL, the Web Ontology Language, followed by a summary of the Grammatical Framework. We then describe the multilingual GF resource grammar, which has a wide linguistic coverage of 11 languages, with some more in the making.

In chapter 2 we describe how multilingual grammars for the Mimus dialogue system are generated from an OWL ontology describing the devices in the domain. The generation is controlled by a configuration file where the linguistic details are given.

In chapter 3 we discuss how OWL is related to GF abstract syntax. We show how an OWL ontology can be translated to a GF abstract grammar, and how multilingual presentations of the concepts in the ontology can be implemented as concrete syntaxes. We also discuss how a GF grammar can be converted to an OWL ontology, so that an ontology editor can be used as a multilingual authoring tool. The final section describes how a GoDiS dialogue domain can be implemented as an OWL ontology, from which all nec
56. n type of points is a single string, with the special label point to make it possible to extend other linearization types:

    lincat Point = {point : Str} ;

The linearization type of each grammatical category (such as NP, VP, S, QS and Utt) is then extended with the point linearization type:

    lincat NP = Lang.NP ** {point : Str} ;

Here Lang.NP is the unimodal linearization type for noun phrases, which depends on the language. Finally, each function from the unimodal grammar library is extended to handle pointing gestures:

    lin AdvNP np adv = Lang.AdvNP np adv ** {point = np.point ++ adv.point} ;

The AdvNP function adds an adverbial phrase to a noun phrase. The reference Lang.AdvNP is to the original unimodal function, and the result is extended with the multimodal input: the pointing gestures in sequence.

Some specific multimodal demonstratives (such as this, that, here, etc.) are also defined, which take a pointing gesture as argument:

    fun this_point_NP : Point -> NP ;
    lin this_point_NP p = Lang.this_NP ** p ;

These are in contrast with the original unimodal demonstratives, which have no associated point attached:

    lin this_NP = Lang.this_NP ** {point = []} ;

Chapter 2 Generating Multilingual Grammars from OWL Ontologies in MIMUS

2.1 Introduction

MIMUS is a user-centered, multimodal and multilingual dialogue system for the smart house domain: an intelligent, pro-active and voi
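The idea of extending every linearization record with a point field can be imitated with Python dictionaries. The sketch below is an analogy only: the field name s for the spoken string, the helper names adv_np and this_point_np, and the click identifiers are all assumptions made for the example, and the real GF resource grammar uses much richer record types.

```python
def adv_np(np, adv):
    """Combine a noun phrase and an adverbial phrase.

    The spoken strings are concatenated as in the unimodal grammar, and
    the pointing gestures of both parts are collected in sequence.
    """
    points = [p for p in (np["point"], adv["point"]) if p]
    return {"s": f"{np['s']} {adv['s']}", "point": " ".join(points)}

def this_point_np(point):
    """Multimodal demonstrative: the word 'this' plus a pointing gesture."""
    return {"s": "this", "point": point["point"]}

# Unimodal demonstrative: no pointing gesture attached.
this_np = {"s": "this", "point": ""}

click_1 = {"point": "click1"}
here_adv = {"s": "here", "point": "click2"}

multimodal = adv_np(this_point_np(click_1), here_adv)
unimodal = adv_np(this_np, {"s": "here", "point": ""})
print(multimodal)
print(unimodal)
```

Because the speech and the gestures live in separate fields of the same record, a parser can recover both from one syntax tree, which is what integrated multimodality requires.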
57. nd, and that each Blind also is a Device, by applying the coercion Blind_sub_Device:

    fun
      Open_sub_Cmd : (sys : System) -> (dev : Blind) -> Open sys dev -> HasDeviceCommand sys (Blind_sub_Device dev) ;
      Close_sub_Cmd : ...

The same goes for SwitchOn and SwitchOff, where we have to say that any of their devices also is a Device:

    fun
      SwitchOn_sub_Cmd : (sys : System) -> (dev : FanHeaterLampRadioTV) -> SwitchOn sys dev -> HasDeviceCommand sys (FHLRT_sub_Device dev) ;
      SwitchOff_sub_Cmd : ...

3.2.2 Mimus configuration files as GF concrete syntax

Each language in the configuration file becomes a specific concrete syntax in the GF grammar.

Linearization of commands

Recalling the configuration file for extracting commands to a recognition grammar, as shown in figure 2.6 on page 21, we can implement this as one single GF function taking a System, a Device and a command, and returning a Command:

    fun command : (sys : System) -> (dev : Device) -> HasDeviceCommand sys dev -> Command ;

Since each command property is a subproperty of HasDeviceCommand, they will match this rule. The main part of the configuration file comes in the linearization of this function. The configuration stated that the command was the command name followed by the device, which is specified in GF as:

    pattern command sys dev cmd = cmd ++ dev ;

Linearization of subclass and
58. nd Guha, 2004, section 3.6

This is a description of a very simple version of the concrete syntax of a GF grammar, where the language tag says which concrete syntax the label is for. This means that all label annotations for a certain language in an ontology can be straightforwardly instantiated as a GF concrete syntax. GF concrete syntax is much more expressive than that, since it supports function arguments and complex datatypes.

3.2 Incorporating the Mimus in-home ontology

In this section we show, as an example, how the Mimus ontology described in chapter 2 can be implemented as the abstract syntax of a GF grammar. The XML configuration files that are used to generate context-free recognition grammars then have their natural interpretations as concrete syntaxes of the GF grammar.

3.2.1 The Mimus ontology as an abstract GF grammar

When doing the conversion from the Mimus ontology, we chose to implement all subclasses and subproperties with coercion functions, as explained before. An alternative, which we didn't pursue, would be to use dependent categories for implementing the device classes and the command properties.

The dialogue system

There is a class System in the ontology, which commonly has one element, the representation of the particular domain:

    cat System ;
    fun mimus : System ;

Devices

There are six devices, which are all subcla
59. ng including user utterances, the system grammar includes a minimal recognition grammar for user utterances. Therefore, the user grammar can be seen both as a specialization and as an extension of the system grammar: it specializes the grammar to only those utterances that the user will say, but it extends the number of alternative ways the utterances can be phrased. Another consequence is that a small and inefficient user grammar can be constructed straight from the system grammar.

4.2 The domain-independent grammars

All domain-independent grammars are inside the Common group of grammars. The common system grammar is called GodisSystem and specifies the two features common to all GoDiS applications. The first feature is the utterances used for various dialogue moves, such as reraising issues, greetings, feedback and grounding. The other feature is the definition of the internal structure of the dialogue moves from the categories Question, Action, Proposition and ShortAnswer. This can be seen as the common grammar specifying the upper level of the grammar structure, letting the domain-dependent grammars implement the low-level details.

System utterances

The multilingual system grammar GodisSystemI is implemented by using syntax trees from the GF resource grammar. This guarantees that all system utterances will be grammatically correct and can be

[Footnote] The trailing I in the grammar name says that the grammar is incomplete, or multilingual, and can be instanti
60. ntext-free grammars. Theoretical Computer Science, 88:191-229.

Oviatt, S., DeAngeli, A., and Kuhn, K. (1997). Integration and synchronization of input modes during multimodal human-computer interaction. In Proceedings of Conference on Human Factors in Computing Systems (CHI 97).

Adriaans, P. W. (1992). Language Learning from a Categorial Perspective. PhD thesis, Amsterdam University.

van Zaanen, M. (2000). ABL: Alignment-based learning. In Proceedings of the 18th International Conference on Computational Linguistics (COLING), Saarbrücken.

Appendix A The enhanced multimodal grammar library

A.1 Downloading the grammar library

The TALK Enhanced Multimodal Grammar Library can be downloaded from http://www.ling.gu.se/projekt/talk/software

The distribution consists of a collection of GF grammar modules, distributed in the following directories:

• Common: grammars for GoDiS-based dialogue systems
• The application domains: MP3 for the D GoDiS application, Agenda for the AgendaTalk application, and Tram for the TramGoDiS application

The directories and grammar modules are described in more detail in chapter 4.

OWL ontology for a GoDiS dialogue domain

Apart from the GF grammars, the library also contains an example OWL ontology for the TramGoDiS application. This ontology is located in the directory OWL, and can be opened by any OWL-compliant ontology editor such as Pro
61. o always include the system utterances whenever possible. This is done by opening the system grammar as a resource module and using the constants as macro definitions:

    lin playlist_add = variants { reqVP playlist_add ; req1x "add" (variants {"a song"} ++ variants {"to the playlist"}) } ;

The term playlist_add following reqVP is the requested action from the system grammar, which is different from the rule that we are currently defining. The operation reqVP transforms a verb phrase from the resource grammar into a string, optionally adding "I want to" and "please". The operation req1x does the same on a verb phrase specified as two strings.

An alternative to using strings to specify utterances is to use syntax trees from the GF resource grammar:

    lin playlist_add__song x = variants { reqVP (ComplV2 add_V2 x) ; reqVP (ComplV3 add_to_V3 x (the_N_sg playlist_N)) } ;

An advantage of using only syntax trees is that they can be reused for different languages.

GoDiS semantics

We define the semantics for user utterances in terms of the semantics for system utterances. This means that we open the semantics system grammar as a resource module and give the semantics as syntax trees:

    lin
      current_song = pm1 (ask current_song_Q) ;
      playlist_add = pm1 (request playlist_add) ;
      artist x = pm1 (shortAns (artist x)) ;

The operation pm1 constructs a sing
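How the semantics of a user utterance is assembled from dialogue-move constructors can be sketched in Python. The names pm1, pm2, pm3, ask, request, answer and shortAns are taken from the text, but their Python definitions, the string-based term representation and the bracketed list syntax are assumptions made for this sketch; the actual GF operations build linearization records rather than Python strings.

```python
def ask(question):
    return f"ask({question})"

def request(action):
    return f"request({action})"

def answer(proposition):
    return f"answer({proposition})"

def short_ans(individual):
    return f"shortAns({individual})"

def _move_list(*moves):
    # Render the moves as one bracketed, comma-separated list.
    return "[" + ", ".join(moves) + "]"

def pm1(m1):
    """Single-element list of dialogue moves."""
    return _move_list(m1)

def pm2(m1, m2):
    """Two-element list of dialogue moves."""
    return _move_list(m1, m2)

def pm3(m1, m2, m3):
    """Three-element list of dialogue moves."""
    return _move_list(m1, m2, m3)

current_song = pm1(ask("current_song_Q"))
add_song_by_artist = pm3(
    request("playlist_add"),
    answer("song_to_add_P(x)"),
    answer("artist_to_add_P(y)"),
)
print(current_song)
print(add_song_by_artist)
```

A combined move is thus nothing more than a longer list built from the same constructors, which is why moves with extra arguments need no special treatment.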
62. Summary

The ISU approach uses abstract representations for dialogue states and update rules, which allow the generic characterisation of flexible dialogue strategies. This enables the same code for dialogue management techniques to be used for different natural languages and for different domains.

In this deliverable we discuss how dialogue systems, and especially grammars for dialogue systems, can be related to existing knowledge representation systems. We focus on one specific language for knowledge representation, the Web Ontology Language (OWL), which is a W3C standard for ontology descriptions. Since it is a standard, much work has already been done on relating OWL to other ontology formalisms.

We show how to generate the domain-specific utterances for a device-oriented dialogue system from an ontology describing the devices. An example system using this technique is Mimus, a dialogue system with which one can control different devices in a home, such as lights, alarms and washing machines. All information about devices is specified in an ontology, which can be updated on the fly with e.g. new devices or new information about existing devices.

We also show that all domain-specific utterances in a domain for the GoDiS dialogue system can be specified as a sing
63. of MP3System = GodisSystemFin, MusicFin ** MP3SystemI with (Grammar = GrammarFin), (GodisLang = GodisLangFin), (MP3Lexicon = MP3LexiconFin) ;

This will give us a complete Finnish system grammar, provided there is a Finnish version of the common grammars GodisLang and GodisSystem and of the domain grammars Music and MP3Lexicon.

The most difficult part is to write the user grammar MP3UserFin. The simplest alternative is to only use syntax trees from the system grammar, but to get robust recognition we have to add the utterances and idioms which are common in Finnish MP3 dialogues.

Adding a new domain to the GF Grammar library

Adding a new domain consists of creating the GoDiS dialogue application and writing the GF grammars. We assume that the GoDiS application already exists, and that the only problem is to write the grammars. Assuming the example domain is a computerized psychotherapist, which we can call Psycho, the following have to be created:

• A directory Psycho
• GF grammars for domain-specific lexicon entries:
    Abstract syntax: PsychoLexicon.gf
    One concrete syntax for each language: PsychoLexiconEng.gf, PsychoLexiconSwe.gf
• GF grammars for system utterances:
    Abstract syntax: PsychoSystem.gf
    Semantics: PsychoSystemSem.gf
    Language-independent syntax: PsychoSystemI.gf
    One concrete syntax for each language: PsychoSystemEn
64. re very simple and fit into OWL Lite, except for one thing: some of our properties have the class of classes as their range, meaning that in these cases ordinary classes act as individuals. This is only allowed in OWL Full, which is no serious problem, since we do not do any formal reasoning on our ontologies.

1.2 An overview of the Grammatical Framework

Grammatical Framework (GF) has been extensively described in TALK deliverables D1.1, D1.2a and D1.2b (Ljunglöf et al., 2005; Bringert et al., 2005; Ljunglöf et al., 2006), so in this section we only give a short overview of the main features of the formalism. In section 1.3 we describe the structure of the multilingual resource grammar.

1.2.1 Main features of GF

The main idea of the Grammatical Framework is that it maintains a clean separation of abstract and concrete syntax.

Abstract syntax

The abstract syntax is a context-free grammar without terminals, where each rule has a unique name. An abstract rule in GF is written as a typed function. For a person used to phrase-structure grammars this syntax might look awkward; an equivalent notation is the one used in Multiple Context-Free Grammars, MCFG (Seki et al., 1991):

    f : A1 -> ... -> An -> A        (GF notation)
    A -> f[A1, ..., An]             (MCFG notation)

The categories and rules are specified in GF by cat and fun declarations:

    cat A ; A1 ; ... ; An ;
    fun f : A1 -> ... -> An -> A ;

The abstract grammar defines a language of trees over na
65. rent domains;
• the use of structured Information States allows straightforward implementation of flexible dialogue systems, which can access and modify information in the Information State in different sequences and by varying means.

In this deliverable, as well as in the earlier TALK deliverables D1.1, D1.2a and D1.2b (Ljunglöf et al., 2005; Bringert et al., 2005; Ljunglöf et al., 2006), we show that by using an abstract representation for grammars we can further enable rapid porting of dialogue systems between languages, domains and modalities. The main tool in defining such grammars is the Grammatical Framework (GF), which is used in collaboration by UGOT, UEDIN and UCAM for making ISU-based dialogue systems.

Comparison with the current state of the art

Milward and Beveridge (2003) note that general use of ontologies within dialogue systems is relatively rare, and discuss the different ways in which ontological domain knowledge can be used by linguistic components in a dialogue system. Although they discuss generation of simple recognition grammars using synonyms from the ontology, and extend this to recognition grammars for noun phrases, the approach does not extend to the generation of language models for complex commands or questions.

Some recent work has focussed on incorporating ontological knowledge into the parsing process. Coppi et al. (2005) use a two-level grammar which refers to conc
66. s. In the application we may have a semantical category Kind, examples of Kinds being Song and Artist. In German, for instance, Song is linearized into the noun Lied, but knowing this is not enough to make the application work, because the noun must be produced in both singular and plural, and in four different cases. By using the resource grammar library, it is enough to write

    lin Song = reg2N "Lied" "Lieder" neuter

and the eight forms are correctly generated. The resource grammar library contains a complete set of inflectional paradigms (such as reg2N here), enabling the definition of any lexical items.

The resource grammar library is not only about inflectional paradigms; it also has syntax rules. The music player application might also want to modify songs with properties, such as American, old, good. The German grammar for adjectival modification is particularly complex, because adjectives have to agree in gender, number and case, and also depend on what determiner is used (ein amerikanisches Lied vs. das amerikanische Lied). All this variation is taken care of by the resource grammar function AdjCN:

    fun AdjCN : AP -> CN -> CN

The resource library API is divided into language-specific and language-independent parts. To put it roughly:

• the lexicon API is language-specific
• the syntax API is language-independent

Thus, to render the above example in French instead of German, we need to pic
67. s Sentence (S) and Clause (Cl) have question counterparts in the categories QS and QCl. A Question Sentence is formed from a Question Clause by fixing its Tense, Anteriority and Polarity, in the same way as a Sentence is formed from a Clause. Question clauses can be formed from clauses (yes/no questions) or by using an interrogative (wh-questions). The interrogatives are pronouns IP (who, which song), adverbials IAdv (why) and complements IComp (where).

Other sentence-level categories

Apart from sentences, clauses and questions, the resource grammar library contains several other sentence-level categories:

• Relative clauses RCl and relative sentences RS (who loves John)

Original syntax tree for the sentence "John walks":

    UseCl TPres ASimul PPos (PredVP (UsePN john_PN) (UseV walk_V))

    Subtree          Replacement                     Resulting sentence
    TPres            TPast                           John walked
    TPres            TFut                            John will walk
    ASimul           AAnter                          John has walked
    PPos             PNeg                            John doesn't walk
    john_PN          mary_PN                         Mary walks
    UsePN john_PN    UsePron it_Pron                 it walks
    walk_V           sleep_V                         John sleeps
    UseV walk_V      ComplV2 love_V2 somebody_NP     John loves somebody

Figure 1.1: Results of replacing parts of the syntax tree for the Sentence "John walks"

• Slash clau
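The subtree replacements listed in Figure 1.1 can be reproduced on a toy tree representation. The Python sketch below encodes the syntax tree for "John walks" as nested tuples and defines a generic replacement function; the tuple encoding and the helper name replace_subtree are assumptions made for this illustration, and no linearization is attempted, so only the resulting trees are shown.

```python
def replace_subtree(tree, old, new):
    """Return a copy of tree in which every occurrence of old is replaced by new."""
    if tree == old:
        return new
    if isinstance(tree, tuple):
        return tuple(replace_subtree(sub, old, new) for sub in tree)
    return tree

# UseCl TPres ASimul PPos (PredVP (UsePN john_PN) (UseV walk_V))
john_walks = ("UseCl", "TPres", "ASimul", "PPos",
              ("PredVP", ("UsePN", "john_PN"), ("UseV", "walk_V")))

# Changing the tense constant: the tree for "John walked".
john_walked = replace_subtree(john_walks, "TPres", "TPast")

# Replacing a whole verb-phrase subtree: the tree for "John loves somebody".
john_loves = replace_subtree(
    john_walks,
    ("UseV", "walk_V"),
    ("ComplV2", "love_V2", "somebody_NP"),
)
print(john_walked)
print(john_loves)
```

Each row of the figure corresponds to one such call: the change is local to a single constant or subtree, while the rest of the tree, and hence the rest of the sentence, stays the same.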
68. s for short answers, actions, questions and propositions have their linearization inherited from categories in the resource grammar.

• Each sort has its own linearization category, often being a noun phrase (NP):

    lincat Artist = NP ;
    lin abba = plur_NP "Abba" ;

  The macro plur_NP is defined in GodisLang, for creating a plural noun phrase from a string.

• Short answers are linearized as noun phrases (NP):

    lincat ShortAns = NP ;
    lin artist x = x ;

• Actions are linearized as verb phrases (VP), together with extra information about the ClauseForm:

    lincat Action = ClauseForm ** VP ;
    lin playlist_add = hasDone ** ComplV3 add_to_V3 (indef_N_sg song_N) (the_N_sg playlist_N) ;

  Actions are used in different contexts by the common GoDiS grammar, such as requests ("Add a song to the playlist, please"), questions ("Do you want to add a song to the playlist?") or confirmations ("I have added a song to the playlist"). The information needed for all this is built into the GF resource grammar.

• Questions are linearized as question clauses (QCl), plus ClauseForm information:

    lincat Question = ClauseForm ** QCl ;
    lin song_to_play_Q = isDoing ** which_N_do_you_want_to_V2 song_N play_V2 ;

  Questions are also used in different contexts, such as questions ("Which song do you want to play?") or accommodation ("Returning to the issue about which song yo
69. s suggests that the semantics grammar is fairly simple to generate automatically.

4.4 The domain-specific grammars for user utterances

There are primarily four kinds of domain-specific dialogue moves the user can perform:

    Dialogue move             English utterance                        GoDiS semantics
    Giving a short answer     The Clash                                answer(artist(clash))
    Requesting an action      Add a song                               request(playlist_add)
    Asking a question         Which is the current song?               ask(X^current_song(X))
    Answering a question      It is London calling I want to play      answer(song_to_play(london_calling))

The common grammar for user utterances, GodisUser, defines the four categories ShortAns, Action, Question and Answer. Note that these categories are distinct from the corresponding categories in the system grammars.

A typical utterance by the user often also gives extra information in addition to the main dialogue move. This extra information has, in GoDiS, the form of answers or short answers:

    English utterance                    GoDiS semantics
    London calling by the Clash          answer(song(london_calling)), answer(artist(clash))
    Add London calling by the Clash      request(playlist_add), answer(song_to_add(london_calling)), answer(artist_to_add(clash))
    What songs have the Clash made?      ask(X^available_song_Q(X)), answer(artist(c
70. ses Slash (whom she sees)
• Imperatives Imp (watch this)
• Embedded sentences SC (whether you go)

1.3.6 Sub-sentential syntactic structures

The linguistic phenomena mostly discussed in both traditional grammars and modern syntax belong to the level of Clauses. At this level, the major categories are NP (noun phrase) and VP (verb phrase). A Clause typically consists of just an NP and a VP. The internal structure of both NP and VP can be very complex, and these categories are mutually recursive: not only can a VP contain an NP,

    [VP loves [NP somebody]]

but also an NP can contain a VP,

    [NP every man [RS who [VP walks]]]

Most of the resource modules thus define functions that are used inside NPs and VPs. Here is a brief overview.

The Noun module

How to construct NPs. The main three mechanisms for constructing NPs are:

• from proper names (John)
• from pronouns (we)
• from common nouns by determiners (this man)

The Noun module also defines the construction of common nouns. The most frequent ways are:

• lexical noun items (man)
• adjectival modification (old man)
• relative clause modification (man who sleeps)
• application of relational nouns (successor of the number)

The Verb module

How to construct VPs. The main mechanism is verbs with their arguments, for instance:

• on
71. ss et al. (1999), the authors proposed a method for generating context-free grammar rules from JFACC ontologies. Their approach was based on including annotations all along the ontology, indicating how to generate each rule. They implemented a program that was able to parse the ontology and produce the grammar rules. A second precedent of linguistic generation from ontologies can be found in Estival et al. (2004), where the authors claimed that the concepts of an OWL ontology could be used to generate the lexicon of the NLU module.

In our work, a new rule-based solution for generating grammars from ontologies will be described. Section 2 motivates and gives an overview of the solution hereby proposed. Section 3 describes how the configuration files have to be built. Section 4 gives an introduction to the algorithm used. Section 5 includes real showcases of the tool at work. Section 6 is a summary of the conclusions and future work.

2.3 Solution overview

The solution proposed here is close to that of Russ et al. (1999), in the sense that we also parse the ontology for the rule generation. Nonetheless, it differs in two ways.

Firstly, the new approach argues that the ontology should remain as is, without specific linguistic annotation. Although it is obvious that the ontology itself is not descriptive enough to generate the grammar rules without further information, it is preferable to place this additional information in a separate configuration file th
72. sses of the Device class:

    cat Device ; Blind ; Fan ; Heater ; Lamp ; Radio ; TV ;

Some of the commands can operate on a number of devices. In the ontology this is done by making the domain the union of these classes, as is shown in the XML code in figure 2.5 on page 20. In GF the union has to be given a name, and it must be stated that each class is a subclass of the union:

    cat FanHeaterLampRadioTV ;
    fun
      Fan_sub_FHLRT : Fan -> FanHeaterLampRadioTV ;
      ...
      TV_sub_FHLRT : TV -> FanHeaterLampRadioTV ;

Each device is also a subclass of the Device class. We optimize this a bit by stating that the union is a subclass, instead of having to state that each device is a subclass:

    fun
      Blind_sub_Device : Blind -> Device ;
      FHLRT_sub_Device : FanHeaterLampRadioTV -> Device ;

Command properties

The command properties Open and Close have a System as domain and operate on a Blind:

    cat Open System Blind ; Close System Blind ;

The command properties SwitchOn and SwitchOff also have a System as domain, but can operate on any Fan, Heater, Lamp, Radio or TV:

    cat SwitchOn System FanHeaterLampRadioTV ; SwitchOff System FanHeaterLampRadioTV ;

Now, each of these four properties is a subproperty of HasDeviceCommand, which operates on Devices in general:

    cat HasDeviceCommand System Device ;

We have to state that any Open and Close command is a HasDeviceComma
73. subproperty coercions

Each function that is defined in the abstract syntax needs a concrete instantiation too. The ones still missing are the subclass and subproperty coercion functions, which in GF simply become identity functions:

    pattern Fan_sub_FHLRT x = x; FHLRT_sub_Device x = x;

Subproperty linearizations have some dummy arguments for the domain and the range, which can be ignored:

    pattern Open_sub_Cmd _ _ x = x; SwitchOff_sub_Cmd _ _ x = x;

3.2.3 Discussion

In this section we have shown that an ontology can be converted to a GF grammar, for which we can then write multilingual recognition grammars. The ideas are similar to the XML configuration files presented in section 2.4.

Advantages of using GF

The main advantage of converting the ontology to GF is that we can make use of the rich type system in the concrete syntax for capturing, e.g., inflectional patterns or discontinuous constituents. One example is descriptive properties such as hasColor and hasSize, linking colors and sizes to devices. The idea is to make it possible to say, e.g., "the blue lamp" or "the small TV". In an English grammar we can still use a simple context-free grammar of the form:

    Lamp -> Size Lamp
    TV -> Size TV
    Lamp -> Color Lamp
    TV -> Color TV

However, in a language such as German, where the adjective agrees with the gender of the noun, this is not correct. This problem can be solved by using more complex linearization types in
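The agreement problem mentioned above can be illustrated with a minimal Python sketch. It is not GF code and not part of the report: the lexicon, the function name `indefinite_np`, and the restriction to nominative singular noun phrases with an indefinite article (where German adjective endings visibly depend on gender) are all assumptions made for the example.

```python
# Hypothetical mini-lexicon: each noun carries an inherent gender.
GENDER = {"Lampe": "fem", "Fernseher": "masc", "Radio": "neut"}

# Indefinite article and adjective ending, nominative singular only.
ARTICLE = {"masc": "ein", "fem": "eine", "neut": "ein"}
ENDING = {"masc": "er", "fem": "e", "neut": "es"}


def indefinite_np(adjective_stem: str, noun: str) -> str:
    """Build 'a <adjective> <noun>' with the adjective agreeing in gender."""
    gender = GENDER[noun]
    return f"{ARTICLE[gender]} {adjective_stem}{ENDING[gender]} {noun}"


if __name__ == "__main__":
    print(indefinite_np("blau", "Lampe"))       # eine blaue Lampe
    print(indefinite_np("klein", "Fernseher"))  # ein kleiner Fernseher
    print(indefinite_np("blau", "Radio"))       # ein blaues Radio
```

A plain context-free rule such as `Lamp -> Color Lamp` has no place to store the gender, which is why the adjective needs to be a table indexed by gender rather than a single string.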
74. <owl:unionOf rdf:parseType="Collection">
        <owl:Class rdf:about="#Fan"/>
        <owl:Class rdf:about="#Heater"/>
        <owl:Class rdf:about="#Lamp"/>
        <owl:Class rdf:about="#Radio"/>
        <owl:Class rdf:about="#TV"/>
      </owl:unionOf>
    </owl:Class>
  </rdfs:range>
</owl:ObjectProperty>
<owl:ObjectProperty rdf:ID="SwitchOn">
  <rdfs:subPropertyOf rdf:resource="#hasDeviceCommand"/>
  <rdfs:domain rdf:resource="#System"/>
  <rdfs:range>
    <owl:Class>
      <owl:unionOf rdf:parseType="Collection">
        <owl:Class rdf:about="#Fan"/>
        <owl:Class rdf:about="#Heater"/>
        <owl:Class rdf:about="#Lamp"/>
        <owl:Class rdf:about="#Radio"/>
        <owl:Class rdf:about="#TV"/>
      </owl:unionOf>
    </owl:Class>
  </rdfs:range>
</owl:ObjectProperty>
<owl:ObjectProperty rdf:ID="Close">
  <rdfs:subPropertyOf rdf:resource="#hasDeviceCommand"/>
  <rdfs:domain rdf:resource="#System"/>
  <rdfs:range rdf:resource="#Blind"/>
</owl:ObjectProperty>
<owl:ObjectProperty rdf:ID="Open">
  <rdfs:subPropertyOf rdf:resource="#hasDeviceCommand"/>
  <rdfs:domain rdf:resource="#System"/>
  <rdfs:range rdf:resource="#Blind"/>
</owl:ObjectProperty>

Figure 2.5: Device-related commands in OWL XML syntax

<rulesList> <
75. teg (http://protege.stanford.edu). The structure of the ontology is described in more detail in section 3.4.

A.2 Installation instructions

First download and install Grammatical Framework. Source code, binaries and installation instructions can be found on the GF homepage, http://www.cs.chalmers.se/~aarne/GF. Set the search path to the GF library and start GF from inside the directory of the grammar files:

• In csh/tcsh:

    > setenv GF_LIB_PATH /path/to/GF/lib
    > /path/to/GF/bin/gf
    Welcome to Grammatical Framework, Version 2.4

• In bash:

    $ export GF_LIB_PATH=/path/to/GF/lib
    $ /path/to/GF/bin/gf
    Welcome to Grammatical Framework, Version 2.4

If GF will be used on a regular basis, the gf binary should be added to the global search path, and the environment variable GF_LIB_PATH should be set globally.

A.3 Testing the grammars

The grammars can be tested separately by loading them into GF. The relevant concrete syntax modules are DomSrcLng.gf, where Dom = MP3, Agenda, Tram; Src = System, User; and Lng = Eng, Swe, Sem. The following is an example of the capabilities of the GF program. For more information about how to use GF, see the GF documentation. This example assumes we are testing the D GoDiS grammar for user utterances, which of course can be replaced by any of the other grammars in the library.

1. Start GF in the directory where the grammars
76. ternational Conference on Computational Linguistics, Taipei, Taiwan.

Dymetman, M., Lux, V., and Ranta, A. (2000). XML and multilingual document authoring: Convergent trends. In COLING, pages 243-249, Saarbrücken, Germany.

Estival, D., Nowak, C., and Zschorn, A. (2004). Towards ontology-based natural language processing. In RDF/RDFS and OWL in Language Technology: 4th Workshop on NLP and XML, ACL.

Johansson, M. (2006). Globalization and localization of a dialogue system using a resource grammar. Master's thesis, Computational Linguistics, Gothenburg University.

Khegai, J., Nordström, B., and Ranta, A. (2003). Multilingual syntax editing in GF. In Gelbukh, A., editor, CICLing 2003: Intelligent Text Processing and Computational Linguistics, LNCS 2588, pages 453-464. Springer.

Ljunglöf, P., Amores, G., Cooper, R., Hjelm, D., Manchón, P., Pérez, G., and Ranta, A. (2006). Multimodal grammar library. Deliverable D1.2b, TALK Project.

Ljunglöf, P., Bringert, B., Cooper, R., Forslund, A. C., Hjelm, D., Jonsson, R., Larsson, S., and Ranta, A. (2005). The TALK grammar library: an integration of GF with TrindiKit. Deliverable D1.1, TALK Project.

Milward, D., Amores, G., Blaylock, N., Larsson, S., Ljunglöf, P., Manchón, P., and Pérez, G. (2006). Dynamic multimodal interface reconfiguration. Deliverable D2.2, TALK Project.

Milward, D. and Beveridge, M. (2003). Ontology-based dialogue syste
77. tion declarations (abstract syntax) and the linearization rules (concrete syntax). The category and function declarations are done once and for all, shared by all languages, but the linearization rules are defined on a per-language basis. Methods which generate grammars from ontologies, including ours, are also examples of a rule-based approach.

2.2.1 Generating grammars from ontologies

The separation of the Knowledge Manager module (in charge of the domain knowledge) and the NLU module of a Dialogue Manager system has a number of advantages, such as reducing the complexity of linguistic components, reusing existing domain knowledge, helping the dialogue manager with reference resolution (i.e. anaphoric expressions), underspecification, presupposition and quantification, or helping the dialogue manager to keep dialogue coherence (Quesada and Amores 2002; Milward and Beveridge 2003). This Knowledge Manager module, however, has somewhat redundant information with the NLU module.

The key idea of our work is that this redundancy can be used to automatically generate grammar rules from the relationships between the concepts described in the ontology. Thus, the fact that the concept Lamp is linked to the concept Blue through the hasColor relationship implies that phrases like "the blue lamp" should be correct in this domain, and therefore accepted by the NLU grammar. The generation of linguistic knowledge from ontologies has been previously proposed. In Ru
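The key idea above, deriving grammar rules from relationships between concepts, can be sketched in a few lines of Python. This is an illustrative toy and not the algorithm of the report: the triple representation, the `PROPERTY_CATEGORY` table and the function `rules_from_triples` are hypothetical names introduced here, and the rule shape `Lamp -> Color Lamp` is borrowed from the context-free example used elsewhere in the report.

```python
from typing import Iterable, List, Tuple

# (subject concept, property, value concept), e.g. ("Lamp", "hasColor", "Blue")
Triple = Tuple[str, str, str]

# Assumed mapping from an ontology property to the grammar category
# that realises its values.
PROPERTY_CATEGORY = {"hasColor": "Color", "hasSize": "Size"}


def rules_from_triples(triples: Iterable[Triple]) -> List[str]:
    """Derive context-free rules from ontology relationships."""
    rules: List[str] = []
    for subject, prop, value in triples:
        category = PROPERTY_CATEGORY.get(prop)
        if category is None:
            continue  # property has no linguistic realisation configured
        modifier_rule = f"{subject} -> {category} {subject}"
        lexical_rule = f"{category} -> {value.lower()}"
        for rule in (modifier_rule, lexical_rule):
            if rule not in rules:  # keep first occurrence, preserve order
                rules.append(rule)
    return rules


if __name__ == "__main__":
    ontology = [
        ("Lamp", "hasColor", "Blue"),
        ("TV", "hasSize", "Small"),
        ("Lamp", "hasColor", "Red"),
    ]
    for rule in rules_from_triples(ontology):
        print(rule)
```

Keeping the property-to-category table outside the ontology mirrors the design choice argued for in the solution overview: the ontology stays free of linguistic annotation, and the extra information lives in a separate configuration.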
78. u want to play

• Propositions are linearized as clauses (Cl) plus ClauseForm information:

    lincat Proposition = ClauseForm Cl
    lin song_to_play_P x = isDoing you_want_to_VP ComplV2 play_V2 x

Propositions can be used in contexts such as answers and feedback ("You want to listen to London calling") or y/n questions ("Do you want to listen to London calling?").

The linearization categories for short answers, actions, questions and propositions are declared in the common GodisSysteml grammar. The ClauseForm information tells in which form (e.g. present or past tense) the phrase should be linearized in system reports; i.e., some actions should be reported in present tense ("I am playing the song") whereas others should be in past tense ("I have added the song to the playlist").

GoDiS semantics

The system grammar for GoDiS semantics is relatively straightforward to implement by using the macros defined in the Prolog resource module.

• Short answers are one-place Prolog functors:

    lin artist = pp1 "artist"; song = pp1 "song";

• Actions are Prolog atoms:

    lin playlist_add = pp0 "playlist_add";

• Predicates are either one-place predicates or wh-questions of the form X^p(X):

    lin song_to_play_P = pp1 "song_to_play"; song_to_play_Q = pWhQ "song_to_play";

Obviously there will be duplicated information in the semantics file: the GF functions commonly have the same name as the corresponding GoDiS terms. Thi
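To make the three kinds of GoDiS terms concrete, the following Python sketch mimics what the macros might produce as strings. The real pp0, pp1 and pWhQ are GF operations in the Prolog resource module whose definitions are not shown in this excerpt, so the functions below (`pp0`, `pp1`, `p_wh_q`) and the example argument are assumptions for illustration only.

```python
def pp0(functor: str) -> str:
    """A Prolog atom, used for actions such as playlist_add."""
    return functor


def pp1(functor: str, argument: str) -> str:
    """A one-place Prolog term, used for short answers and predicates."""
    return f"{functor}({argument})"


def p_wh_q(predicate: str, variable: str = "X") -> str:
    """A wh-question of the form X^p(X)."""
    return f"{variable}^{predicate}({variable})"


if __name__ == "__main__":
    print(pp0("playlist_add"))        # playlist_add
    print(pp1("artist", "clash"))     # artist(clash)
    print(p_wh_q("song_to_play"))     # X^song_to_play(X)
```

The duplication noted in the text shows up directly here: each functor string repeats the name of the GF function it linearizes.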
79. us ontology as an abstract GF grammar
3.2.2 Mimus configuration files as GF concrete syntax
3.2.3 Discussion
3.3 GF abstract syntax as OWL ontologies
3.3.1 Context-free grammars as OWL ontologies
3.3.2 GF abstract syntax in OWL
3.3.3 Optimizing the grammar ontology
3.3.4 The GF resource grammar as an OWL ontology
3.3.5 Ontology editing as a multilingual authoring system
3.4 Using ontologies to specify GoDiS dialogue domains
3.4.1 The GF resource grammar
3.4.2 Actions and predicates
3.4.3 Sorts and individuals
4 The Enhanced Multimodal Grammar Library
4.1 Separating system and user utterances
4.2 The domain-independent grammars
4.3 The domain-specific grammars for system utterances
4.4 The domain-specific grammars for user utterances
4.5 Integrating multimodality in the grammars
4.6 Extending the GF grammar library
5 Summary and Conclusions
Bibliography
A The enhanced multimodal grammar library
A.1 Downloading the grammar library
A.2 Installation instructions
80. y the same mechanism. Future research areas include the generation of unification-based grammar rules and dialogue rules, and evaluating the usefulness of the tool with very large OWL ontologies.

Chapter 3 Ontologies and Grammatical Framework

3.1 Relating existing ontologies with abstract GF grammars

In this section we view GF as a language for describing ontologies. The abstract syntax in GF is a type-theoretical logical framework with dependent types and function definitions (Ranta 2004). This expressive power makes it possible to describe general ontologies. Recall that the three main components of OWL are classes, individuals and properties. These have natural counterparts in abstract GF syntax. General OWL restrictions are also possible to implement in GF.

3.1.1 Classes and individuals

Classes can be implemented as GF categories. The OWL classes A, B and C will then be written as:

    cat A; B; C;

Individuals are functions without arguments, i.e., constants. The OWL individuals a, a', b and c will be written as:

    fun a, a' : A; b : B; c : C;

Subclasses

Since each individual has to have exactly one category, there is no direct notion of subclasses or subcategories in GF. That B is a subclass of A is instead represented by a coercion function, saying that all B's also are A's:

    fun B_sub_A : B -> A;

That a class is a subclass of two other classes is then implemented as two coer
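The mapping from OWL classes, individuals and subclass links to GF judgements described above is mechanical, and a minimal Python sketch of it follows. The function `abstract_syntax`, its parameters and the module name `Onto` are hypothetical and introduced only for this example; the emitted text follows the `cat` / `fun` / `X_sub_Y` conventions used in this section.

```python
from typing import Dict, List, Tuple


def abstract_syntax(
    name: str,
    classes: List[str],
    individuals: Dict[str, str],
    subclasses: List[Tuple[str, str]],
) -> str:
    """Render a GF abstract syntax module from a tiny ontology description.

    classes:      OWL class names, each becoming a GF category
    individuals:  individual name -> its class, each becoming a constant
    subclasses:   (subclass, superclass) pairs, each becoming a coercion
    """
    lines = [f"abstract {name} = {{"]
    if classes:
        lines.append("  cat " + " ; ".join(classes) + " ;")
    for individual, cls in individuals.items():
        lines.append(f"  fun {individual} : {cls} ;")
    for sub, sup in subclasses:
        lines.append(f"  fun {sub}_sub_{sup} : {sub} -> {sup} ;")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(abstract_syntax(
        "Onto",
        classes=["A", "B", "C"],
        individuals={"a": "A", "b": "B", "c": "C"},
        subclasses=[("B", "A")],
    ))
```

Because a GF constant has exactly one category, an individual of class B is used where an A is expected by wrapping it in the generated coercion, for example `B_sub_A b`.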