(Preliminary) User Manual

Contents

1. [Ull89] J. D. Ullman. Principles of Database and Knowledge-Base Systems, volume 2. Computer Science Press, New York, 1989. [VGRS91] A. Van Gelder, K. Ross, and J. Schlipf. The Well-Founded Semantics for General Logic Programs. Journal of the ACM, 38(3):620-650, July 1991. [XPa99] XML Path Language (XPath). http://www.w3.org/TR/xpath, 1999. [XSL99] XSL Transformations (XSLT). http://www.w3.org/TR/xslt, 1999.
2. is encountered e Queries are answered immediately System queries are executed by sending messages to interface objects Syntactically system commands are path expressions of the form sys expression Most system commands which are relevant to the user are directly applied to the user interface object called sys e g for sys eval the current program is evaluated wrt the current state of the database Other queries are passed to the evaluation component which answers them according to the current state of the database e In case of a syntax error the process is terminated and the error is reported When the end of the program is reached control is given back to the calling level The database then contains a model of the program evaluated so far 2 1 System Commands The following built in system commands are provided 2 PROGRAMMING WITH LOPIX 8 sys consult foo lpx read the program file foo lpx sys load foo lpx load a program file containing only facts directly into the databases e g for loading predicates from ASCII representations sys eval evaluate the current program sys strat dolt evaluate a stratum see Sec 2 1 1 sys echo print argument string sys break dolt stop program execution and get into interactive mode sys return continue program entered in interactive mode after sys break dolt sys end exit LoPiX A number of frequently u
3. sys strat doIt, which divides a program into several strata. Information queried by a negated subgoal always has to be derived in lower strata than the stratum of the rule containing the negated subgoal. The stratification command causes the evaluation of the rules in the higher stratum to be deferred until the fixpoint of the lower stratum is computed. Moreover, the rules of the lower stratum are no longer considered during the further evaluation. 7 Legacy: The Florid System. LoPiX is based on components of the FLORID system [FHK+97], an implementation of F-Logic [KLW95]. When LoPiX is called with the florid option, e.g. malta db florid examples gt lopix florid test flp, it acts as FLORID. Acknowledgements. First of all, we want to thank Georg Lausen, the head of our group. Furthermore, our thanks go to the former team members Jürgen Frohn, Rainer Himmeröder, Paul-Th. Kandzia, Bertram Ludäscher, Christian Schlepphorst, and Heinz Uphoff, who developed FLORID up to version 2.0, together with the students Thomas Beier, Thorsten Pferdekämper, Bernhard Seckinger, Markus Seilnacht, and Till Westmann from the universities at Mannheim and Freiburg. References. [ABW88] K. R. Apt, H. Blair, and A. Walker. Towards a Theory of Declarative Knowledge. In J. Minker, editor, Foundations of Deductive Databases and Logic Programming, pp. 89-148. Morgan Kaufmann, 1988. [AHV95] S. Abiteboul, R. Hull, and V. Vianu. Foundations of Databas
4. country car_code gt C and population gt P year gt Y gt 1000000 P is bound to the node and the new free element lt bigcountry car code D gt lt population year 1997 gt 83536115 lt population gt lt bigcountry gt is created where the population subelement is not copied but linked from the old one does not support annotated literals instead equiv Example 10 The equality predicate has to be used country car_code gt C and population 83536115 false country car_code gt C and equiv population 83536115 C D 5 2 Equality Objects in XPathLog are uniquely determined by their object identifiers which are only used internally and are invisible to the user who has to access objects by their object names Every object name references exactly one object However there may be several object names denoting the same object often such equalities are derived during program execution 5 BUILT IN FEATURES 17 Example 11 Consider two data sources gs and terra which provide geographical informa tion cia uses english names whereas and terra uses german names Then city objects can be equated fusing the contents X Y gs city gt X equiv name Munich terra city gt Y equiv name Muenchen or X munich gs city gt X equiv name Munich X muenchen terra city gt X equiv name Muenchen muenchen munich Then in the
5. e arithmetic expressions built in predicates and aggregations are described below e In the sequel features which do not belong to the basic XML data model are described Let A C D M and X stand for XPathLog reference expressions e An is a assertion is an expression of the form OisaC object O is a member of class C or C subcl D class C is a subclass of class D When parsing an XML document for every element e of type t e isat holds e A predicate is an expression of the form p X1 Xn The special equality predicate is written as X X93 e A signature atom is an expression of the form C M gt D or C A gt D denoting that elements of type class C have subchildren tagged with M of class D if the class relation ships induced by XML documents is not changed M D or that elements of type class C have an attribute named A which results in a instance of class D if the class relation ships induced by XML documents is not changed D literal object without further distinction Signature atoms describe which properties apply to instances of certain classes Signature atoms together with the class hierarchy form the schema of an XPathLog database e A method application is an expression of the form X m X1 Xn Method applications are also references whose value is defined via equality A special kind of method applications are active methods e g for accessing an url or for exporting a document which
6. of atoms such that all closure properties and all facts and rules of the program are satisfied. To avoid redundancy, in LoPiX most of the information generated by the closure properties is not inserted into the object manager explicitly, but deduced when retrieving information. Note that this fixpoint is not necessarily finite. 6.2 Negation and Stratification. Negation in LoPiX is handled according to the inflationary semantics [KP88]. Remember that in a safe rule, every variable in a negated subgoal has to be limited by other subgoals. Thus, only ground-instantiated negated subgoals have to be considered during the evaluation. Such an instantiation of a negated subgoal is evaluated as true if and only if the intermediate object base at the moment of the evaluation of the rule does not contain the corresponding information. Inflationary semantics often yields unintended results. Hence, there are other concepts to handle negation in logic programs. One of the most general solutions is the three-valued Well-Founded Semantics [VGRS91]. A very common approach is to stratify logic programs and to compute the perfect model [ABW88, Prz88]. Unfortunately, due to the powerful syntax of XPathLog, the class of stratified programs is very small. Thus, automatic stratification cannot be done by LoPiX for a large class of programs. Instead, programs have to be stratified explicitly by the user with the system command
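To illustrate why inflationary negation "often yields unintended results", the following Python sketch contrasts it with stratified evaluation on a tiny reachability program. It is not LoPiX code: the node names, the relation names reach and unreached, and the set-based encoding are assumptions made only for this illustration.

```python
# Illustration only: a tiny program with one negated subgoal, evaluated
# (a) inflationarily and (b) stratum by stratum. All names are hypothetical.
EDGES = {("a", "b"), ("b", "c")}
NODES = {"a", "b", "c"}


def reach_round(reach):
    """reach(a).   reach(Y) :- reach(X), edge(X, Y)."""
    return {"a"} | {y for (x, y) in EDGES if x in reach}


def unreached_round(reach):
    """unreached(X) :- node(X), not reach(X)."""
    return {n for n in NODES if n not in reach}


def inflationary():
    """All rules fire in every round against the object base of the previous round."""
    reach, unreached = set(), set()
    while True:
        new_reach = reach | reach_round(reach)
        # The negated subgoal is tested while reach is still incomplete.
        new_unreached = unreached | unreached_round(reach)
        if (new_reach, new_unreached) == (reach, unreached):
            return reach, unreached
        reach, unreached = new_reach, new_unreached


def stratified():
    """Lower stratum (reach) runs to its fixpoint before the negation is evaluated."""
    reach = set()
    while True:
        new_reach = reach | reach_round(reach)
        if new_reach == reach:
            break
        reach = new_reach
    return reach, unreached_round(reach)
```

Under inflationary evaluation every node ends up in unreached, because the negated subgoal is checked before reach has been completed; with the explicit stratum boundary, unreached stays empty.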
7. XPathLog queries Pure XPath expressions pure XPath expressions i e without variables are interpreted as existential queries which return true if the result set is non empty country name text Germany city name text true since the country element which has a name element with the text contents Germany contains at least one city descendant with a name element with non empty text contents Output Result Set The query xpath gt N for any xpath binds N to all nodes belonging to the result set of xpath country name text Germany city nameltext gt N N Stuttgart N Mannheim If the result set does not contain literals but elements their internal ids are returned country name text Germany city nameltext gt N C f0_1880 C f0_1888 C f0_1895 3 DOCUMENT ACCESS AND QUERIES IN LOPIX PROGRAMS 12 Additional Variables XPathLog allows to bind all nodes which are considered by an ex pression both by the access path to the result set and in the filters The following expression returns all tuples N1 C N2 such that the city with name N2 belongs to the country with name N1 and car code C country name text gt N1 and car_code gt C city name text gt N2 N2 Stuttgart C D N1 Germany N2 Mannheim C D N1 Germany N2 Karlsruhe C D N1 Germany Local Variables Queries can also conta
8. a is represented as an IDREF attribute Example 3 Attributes We add the data code to Switzerland and make it a member of the European Union C datacode gt ch C memberships gt O country gt C car_code CH organization gt O abbrev text EU results in lt country datacode ch car_code CH memberships org efta org un org eu gt lt country gt We discuss insertion of new subelements below after showing how to create elements Creation of Elements Elements can either be created as free elements by atoms of the form name meaning some element of type name in the rule head this is interpreted to create an element which is not a subelement of any other element or as subelements Example 4 Creation of a free element We create a new free country element re member that in Section 3 2 we equated root to the root node of the parsed Web page country name gt Bavaria and car_code gt BAV and capital gt X and city gt X and city gt Y city gt X name text gt Munich city gt Y nameltext gt Nurnberg Note that the two city elements are linked as subelements This operation has no equivalent in the classical XML model these elements are now children of two country elements Copying elements is described below Thus changing the elements effects both trees By linking elements it is possible to in
9. most important difference to approaches like XSLT XML QL or Quilt where the structure to be generated is always specified by XML patterns this implies that these languages do not allow for updating existing nodes e g adding children or attributes but only for generating complete nodes In contrast in XPathLog existing nodes are communicated via variables to the head where they are modified when appearing at host position of atoms The head of an XPathLog rule is also a conjunction of XPathLog expressions using only the child and sibling axes the operator is allowed only in leftmost position without negation disjunction indexing and built in functions because all these things would cause ambiguities what update is intended When used in the head the and operators and the 4 XPATHLOG RULES 14 construct specify which properties should be added or updated thus does not act as a filter but as a constructor Modification of Elements When using the child or attribute axis for updates the host of the expression gives the element to be updated or extended when a sibling axis is used effectively the parent of the host is extended with a new subelement Generation or Extension of Attributes An expression of the form X a gt V specifies that the attribute a of the node X should be set or extended with V If V is not a literal value but a node a reference to V is stored and in case of generating output
10. scond case the same element is addressed by the expressions munich muenchen gs city gt X equiv name Munich terra city gt X equiv name Muenchen and has all subelements which the individual elements had before and also the attributes are the union of the original attr butes munich nameltext gt N N Muenchen N Munich 5 3 Integers Comparisons and Arithmetics Objects denoting integer or float numbers or strings are different from other objects because the usual comparison operators are defined for them as well as several datatype specific func tions arithmetics string handling There is no built in class integer because it would contain an infinite number of instances Instead there is a built in predicate integer lt argument gt in LoPiX analogously for float and string Within a query or a rule body relations between integer numbers may be tested with the comparison predicates lt gt lt or gt To guarantee safety variables that appear in a comparison P atom have to be bound by another atom or molecule in the rule body Comparison predicates are not allowed in rule heads The basic arithmetic operations addition subtraction teger division are also implemented in LoPiX Arithmetic expressions may be constructed in the usual way even complex expressions e g 3 5 2 or 3 2 3 are possible Note that
11. shell Example 1 Example Session The examples directory contains files smallmondial ml dtd Ipx a small XML file a suitable DTD and a LoPiXprogram If you call malta db lopix examples gt lopix smallmondial flp emacs flp mode emacs lpx mode history file 1 INSTALLATION 6 the program is executed It should tell you that it parses smallmondial cml 1788 bytes and then print some country names You can leave LOPiX with sys end If the program does not work check the four environment variables given in Section 1 1 Command Line Options When LoPiX is started in the unix shell command line options can be given A list of possible options is printed with lopix h If hor an undefined option is given LOPiX terminates after printing an appropriate message malta lopix bin gt lopix h This is LoPiX Options h display all options c lt fileName gt use non default configuration file his lt fileName gt use non default history file d start in trace mode e lt Integer gt set output level for errors lt list lt fileName gt gt start and consult files separated by blank q quit after executing no interactive mode V prints version number The option c denotes a configuration file to be used instead of the standard configuration file specified in the environment variable DEFAULTCFG Similarly the option his loads the given file into the readline history instead of the de
12. strlen gt fails because a query like show all strings with a certain arity e g strlen X 40 may lead to an answer set that is not infinite but too huge to be handled Normally strlen is used in the following way to return the length of a given string strlen logic X strcat lt string gt lt string gt lt stringz gt succeeds if lt string3 gt is the concatenation of lt string gt and lt stringg gt E g strcat a b X returns the binding X ab whereas str cat a Y ab leads to Y b If the arguments contain more than one variable e g strcat a Y X either Y or X have to be bound by other molecules in rule body otherwise strcat fails The reason for this limitation is the same as in the case of strlen substr lt string gt lt stringa gt holds if lt string gt is a substring of lt stringg gt Comparison be tween the two strings is case insensitive for example substr DaTA database re turns true If the arguments contain any variable it has to be bound by other molecules in the rule body otherwise substr fails For substr logic X the corresponding answer set would be infinite match lt string gt lt pattern gt lt fmt list gt lt variable list gt pmatch lt string gt lt pattern gt lt fmt list gt lt variable list gt finds all strings contained in lt string g
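The binding restrictions on strcat and substr described above can be mimicked in a few lines of Python. This is only a sketch of the described behaviour, not the LoPiX implementation; representing an unbound variable by None and returning a list of solutions are assumptions of this illustration.

```python
# Illustration only: None stands for an unbound variable.

def strcat(a, b, c):
    """Solutions (a, b, c) with c == a + b; fails (empty list) if more than one argument is unbound."""
    unbound = [v for v in (a, b, c) if v is None]
    if len(unbound) > 1:
        return []  # the answer set would be infinite, so the predicate fails
    if c is None:
        return [(a, b, a + b)]
    if a is None:
        return [(c[:len(c) - len(b)], b, c)] if c.endswith(b) else []
    if b is None:
        return [(a, c[len(a):], c)] if c.startswith(a) else []
    return [(a, b, c)] if a + b == c else []


def substr(s, t):
    """True if s is a substring of t, compared case-insensitively; fails on unbound arguments."""
    if s is None or t is None:
        return False
    return s.lower() in t.lower()
```

For instance, strcat("a", None, "ab") yields the single solution with the second argument bound to "b", whereas strcat("a", None, None) yields no solution at all.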
13. user friendly interface to LOPiX via the emacs flp mode The emacs lpx mode is defined in the file Ipx el which is located in lopix environment To make emacs load and use this mode the local emacs file has to be extended by the following lines 333 enter lpx mode if a file with the suffix lpx is loaded setq auto mode alist cons lpx lpx mode auto mode alist 35 autoload Ipx el if the functions lpx mode or run lpx are executed autoload lpx mode 1px t autoload run lpx lpx t To be sure that emacs actually finds lpx el either the lines setq load path cons home db lopix environment load path have to be added to the emacs file or lpx el has to be put into a directory where emacs looks for files In lpx el it has to be specified where to find the lopix executable If the lopix bin directory is in the binary search path defvar lpx program name lopix Program name for invoking an inferior LoPiX with run lpx is sufficient Otherwise change it to the actual path Additionally lopix environment default his has to be copied to lopix history for the private readline history After these installation steps emacs has to be started again to let the changes take effect 1 3 LoPiX in the UNIX Shell LoPiX is started from the shell by simply typing lopix at the unix prompt To give an impression of the LOPiX system and its usage we present a short example session in the unix
14. variables may occur anywhere in the rule body, respectively the higher-level aggregation body. Like arithmetic expressions, aggregation terms may only occur in the built-in equality and comparison predicates. However, the aggregation body may contain built-in predicates with other aggregation terms. Thus, aggregation can be nested. The semantics of an aggregation term in a comparison predicate, e.g. the inequality V < agg{X[G1,...,Gn]; body}, is defined as the conjunction V < Z, Z = agg{X[G1,...,Gn]; body}. Syntactic Restrictions. For obvious reasons, the following syntactic restrictions apply: • The aggregation variable and all grouping variables must occur in the aggregation body. • The variables X, G1, ..., Gn are pairwise distinct. • The aggregation body has to obey the safety restrictions, i.e. no variable may represent an infinite answer set. Violating these restrictions will cause an error. The operator count gives the total number of variable bindings to the aggregation variable. Note that, different from count, the operators min, max, and sum ignore objects other than integers without producing any error message or warning: myset item gt 10 40 apple 27 cheese Z count X myset items gt X Z sum X myset items gt X will yield 5 and 77. Aggregates and Stratification. As in the case of negation, the facts to be used in the aggregation body have to be complete
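The difference between count and the numeric aggregates can be reproduced with a short Python sketch. It only mirrors the behaviour described above on the five example values; the function names are chosen for this illustration and are not LoPiX operators.

```python
# Illustration only: the five bindings of the aggregation variable in the example.
ITEMS = [10, 40, "apple", 27, "cheese"]


def _integers(bindings):
    """Keep integer bindings only; other objects are silently ignored."""
    return [v for v in bindings if isinstance(v, int) and not isinstance(v, bool)]


def agg_count(bindings):
    """count: total number of bindings, regardless of their type."""
    return len(bindings)


def agg_sum(bindings):
    """sum: non-integer objects are skipped without error or warning."""
    return sum(_integers(bindings))


def agg_min(bindings):
    return min(_integers(bindings))


def agg_max(bindings):
    return max(_integers(bindings))
```

On ITEMS, agg_count gives 5 and agg_sum gives 77, matching the values stated in the text.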
15. www informatik uni freiburg de dbis Publications. 1.1 Environment Variables. Shell environment variables are used to set paths leading to LoPiX's configuration files, which are needed when LoPiX is called. It is also possible to specify the paths by command line options when calling LoPiX. The following variables should be set before starting LoPiX: • LOPIXCFG • LOPIXHIS. LOPIXCFG tells how to find the configuration file, and LOPIXHIS points to the history file to preload. The configuration file is a sequence of system commands that create the objects needed for a working system and then pass control to the user. If one of these variables is not set and the respective command line option is missing, the system will print a warning. In the following, we assume that LoPiX's main directory lopix is located at /home/db. Then the system environment variables have to be set to: • LOPIXCFG: /home/db/lopix/environment/lopix.cfg • LOPIXHIS: /home/db/lopix/environment/lopix.his. Additionally, the environment variables SP_ENCODING (set to XML) and SGML_CATALOG_FILES (set to /home/db/lopix/sgml/xml.soc) have to be set for the sp SGML parser [Cla], which is employed in the current version for parsing XML files into LoPiX. If you want to check whether your settings are working, proceed with Section 1.3 now for a first example. 1.2 Settings for LoPiX in Emacs. Additionally to the shell, emacs provides a very
16. 4 In the second example 3 14 is not a float alm gt 3 14 sys strat dolt alm gt C a m gt C D C 1 A Answer to query alm gt C C 3 14 A Answer to query alm gt C D C 1 false But what happens there Obviously the program is syntactically correct So lets ask what the database looks like then sys theOM dump 3 14 gt 3 14 a m gt 3 14 which is correct 3 14 is interpreted as the result an anonymous object of applying the method 14 to the object 3 Often a conversion between datatypes is needed In LoPiX data conversion is provided by built in predicates string2integer A B is true if B is the integer obtained when reading the string A from left to right as far as possible e g the following holds string2integer 42 42 string2integer 3D 3 string2integer 3 14 3 string2float A B is true if B is the float obtained when reading the string A from left to right as far as possible a leading is ignored so both normal floats and the XPathLog representation of floats can be read E g the following holds string2float 42 42 string2float 3D 3 string2float 3 14 3 14 string2float 73 14 3 14 string2float 3 14D 3 14 Note that integers are seamlessly integrated with floats LOPiX never prints 3 as 3 but always understands 3 as 3 string2object A B is true if B is the object
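The "read from left to right as far as possible" rule for the conversion predicates can be sketched in Python with a prefix match. This is an approximation for illustration only: signs and the optional leading marker character mentioned for string2float are not handled, and returning None for a failing conversion is an assumption.

```python
import re


def string2integer(s):
    """Integer obtained from the longest leading run of digits, or None if there is none."""
    m = re.match(r"\d+", s)
    return int(m.group()) if m else None


def string2float(s):
    """Float obtained from the longest leading digits[.digits] prefix, or None."""
    m = re.match(r"\d+(?:\.\d+)?", s)
    return float(m.group()) if m else None
```

With these definitions, string2integer("3D") and string2integer("3.14") both give 3, while string2float("3.14D") gives 3.14, in line with the examples listed above.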
17. LoPiX: Logic Programming in XML. (Preliminary) User Manual. Wolfgang May, may@informatik.uni-freiburg.de, Institut für Informatik, Universität Freiburg, Germany. January 2001. Contents: Preface; 1 Installation; 1.1 Environment Variables; 1.2 Settings for LoPiX in Emacs; 1.3 LoPiX in the UNIX Shell; 1.4 Running LoPiX in Emacs; 2 Programming with LoPiX; 2.1 System Commands; 2.1.1 Blocks and Stratification; 3 Document Access and Queries in LoPiX Programs; 3.1 The XPathLog Language; 3.2 Accessing Documents; 3.3 Querying the Database; 4 XPathLog Rules; 4.1 Left Hand Side; 5 Built-in Features; 5.1 Handling of Literal Values; 5.2 Equality; 5.3 Integers, Comparisons and Arithmetics; 5.4 String handling; 5.5 Data Conversion; 5.6 Aggregation; 5.7 Output Handling; 6 Evaluation Semantics; 6.1 Fixpoint Semantics; 6.2 Negation and Stratification; 7 Legacy: The Florid System; References. Preface. LoPiX is an implementation of the XPathLog language, a logic programming language based on XPa
18. anged in the configuration file lopix.cfg. Evaluating in semi-naive mode is promising for recursive programs with many T_P rounds, where it makes up for the overhead due to program analysis, rewriting, and delta predicate maintenance. Semi-naive evaluation will probably be slow for programs that frequently change the class hierarchy or equate objects, because this makes dependency analysis very hard. To see the rewritten program, set the debug mode program. 6.1 Fixpoint Semantics. The evaluation strategy for XPathLog programs without inheritance is basically the same as for Datalog programs. The bottom-up evaluation of an XPathLog program starts with a given object base. Initially, this is the empty object base. Facts are rules with an empty body and are therefore always considered as true. The rules and facts of a program are evaluated iteratively in the usual way: if there are variable bindings such that the rule body is valid in the current object base, these bindings are propagated into the rule head. New information corresponding to the ground instantiations of the rule head, or deduced due to the closure properties, is inserted into the object base. This evaluation of rules is continued as long as new information is obtained. As in the case of Datalog, the evaluation of a negation-free XPathLog program reaches a fixpoint which coincides with the unique minimal model of that program. The minimal object base of an XPathLog program is defined as the smallest set
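The iteration described in Section 6.1 can be sketched in Python for a Datalog-style program. This is a minimal illustration under stated assumptions, not the LoPiX evaluator: facts are encoded as tuples, each rule is a hypothetical Python function mapping the current object base to the facts it derives, and closure properties are ignored.

```python
# Illustration only: naive bottom-up evaluation of a small reachability program.

def tp(rules, facts):
    """One round: every rule is evaluated against the same object base."""
    derived = set(facts)
    for rule in rules:
        derived |= rule(facts)
    return derived


def naive_fixpoint(rules, facts=frozenset()):
    """Repeat rounds as long as new information is obtained."""
    current = set(facts)
    while True:
        nxt = tp(rules, current)
        if nxt == current:
            return current
        current = nxt


# Hypothetical example program.
EDGES = {("edge", "a", "b"), ("edge", "b", "c"), ("edge", "c", "d")}


def facts_rule(_facts):
    # Facts are rules with an empty body, so they always fire.
    return set(EDGES)


def base_rule(facts):
    # reach(X, Y) :- edge(X, Y).
    return {("reach", f[1], f[2]) for f in facts if f[0] == "edge"}


def step_rule(facts):
    # reach(X, Z) :- reach(X, Y), edge(Y, Z).
    return {("reach", r[1], e[2])
            for r in facts if r[0] == "reach"
            for e in facts if e[0] == "edge" and e[1] == r[2]}


RULES = [facts_rule, base_rule, step_rule]
MODEL = naive_fixpoint(RULES)
```

Starting from the empty object base, MODEL is the minimal model of this negation-free program, and applying tp to it once more adds nothing.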
19. are described below e in the rule head further restrictions apply to yield a well defined semantics they are also described below 3 2 Accessing Documents Every resource available in the Web has a unique address called Uniform Resource Locator URL which is used to initiate access to the document There is a predefined class url containing Web address strings and a predefined built in active method parse that when applied to a member of url accesses the corresponding Web document and integrates it into the database The schema for Web access is url subcl string parse gt xmldoc Whenever for an instance u of class url the method u parse is called by a rule head the document at the address u is automatically loaded and analyzed according to the arguments e u parse xml parses the contents of u as an xml document and assigns its root to the reference u parse xml w parse xml has a child node which represents the outermost document node In course this node has children and attributes and so on similar to the DOM representation see below This structure is then queried by XPathLog reference expressions e u parse dtd parses u as a DTD and generates a suitable signature which applies to every document which uses this DTD The signature is queried by signature expressions or it can be dumped with the system command sys theOMAccess export sig The example parse dtd 1px illustrate
20. ate the built in class integer several comparison predicates the basic arithmetic operators predicates for string handling and aggregate functions 5 BUILT IN FEATURES 16 5 1 Handling of Literal Values Literal values can either be represented as attributes or as PCDATA contents Similar to XPath queries XPathLog supports automatical casting from PCDATA only elements into literals in queries and in comparison predicates Casting can be turned on and off using the system command sys annotatedLiterals on sys annotatedLiterals off When annotatedLiterals is on variables which are bound to elements which PCDATA contents are output as literal values this does not effect the variable binding which is used when propagating values to the rule head there the variable is still bound to the node Also in arithmetics and comparisons the literal value is used Example 9 Handling of Literals Consider the following XML fragment smallmondial zml lt country car code D gt lt population year 1997 gt 83536115 lt population gt lt country gt Here population is an annotated literal t e it can act as a literal or as an element country car_code gt C and population gt P year gt Y gt 1000000 N CH Y 1997 P 83536115 Note that the query 83536115 year gt Y false does not yield any results In a rule e g bigcountry car_code gt C and population gt P
21. buffer. In this buffer LoPiX is started, running the program given in the program buffer. Table 1 (Special keycodes in the lpx mode) lists the following actions: reset system and consult buffer as file; consult additional (same as C-c C-c, but without system reset); consult marked region without reset; system reset (clear database and program); consult whole buffer as region without reset; display buffer lpx and jump there; break evaluation or output; quit LoPiX process. Load examples/smallmondial.lpx into emacs and type Control-c Control-c. Then the user can interact with LoPiX in the same way as described before for the unix shell. Additionally, it is possible to step through the history by Meta-P (Previous) and Meta-N (Next). The difference between C-c C-a and C-c C-b is that in the second case the buffer's contents are not saved before calling LoPiX. 2 Programming with LoPiX. Programs are collections of facts and rules, similar to Prolog programs. In addition to logical facts and rules, system commands can be used for user interaction with LoPiX. After starting LoPiX with a program (foo.lpx in the following), or typing C-c C-c in an emacs program buffer holding the program, the following happens: • LoPiX reads foo.lpx rulewise, using the XPathLog parser. • Facts and rules are added to the current program held by the interpreter. Note that no evaluation takes place until the system query sys eval or sys strat doIt
22. cuments in the Web can be accessed and presents the XPathLog query language. The syntax and semantics of XPathLog rules are described in Section 4. Section 5 describes additional built-in features, mainly concerning datatypes. Section 6 gives an overview of the theory of program semantics and evaluation. 1 Installation. The binary distribution of LoPiX comes as a packed and compressed file lopix <version number> <operating system> tar.gz. In the following, we refer to this file simply as lopix.tar.gz. The file is uncompressed by the command gunzip lopix.tar.gz. The resulting file lopix.tar has to be unpacked by entering tar xvf lopix.tar. Now the directory lopix is created. It has the following subdirectories: lopix/bin, lopix/environment, lopix/sgml, lopix/doc. The directory lopix/bin contains the binary; lopix/environment contains several files defining the LoPiX environment: • lopix.cfg.in: the source of the configuration file • lopix.his: history lines to preload • lpx.el: the emacs lpx mode definition file. First, the LoPiX configuration has to be adapted to the local system by changing to the directory lopix/environment and calling configure; this generates lopix.cfg from lopix.cfg.in. The directory lopix/sgml contains several definitions needed when using SGML documents. The directory lopix/doc contains postscript files of this manual (lopix.ps). LoPiX are available from http
23. es Addison Wesley 1995 CGT90 S Ceri G Gottlob and L Tanca Logic Programming and Databases 1990 Cla J Clark SP An SGML System Conforming to International Standard ISO 8879 Standard Generalized Markup Language http www jclark com sp REFERENCES 24 CRFOO DFF 99 FHK 97 KLW95 KP88 Mon Prz88 Rob99 U1189 VGRS91 XPa99 XSL99 D Chamberlin J Robie and D Florescu Quilt An XML Query Language for Heterogeneous Data Sources In WebDB 2000 pp 53 62 2000 A Deutsch M Fernandez D Florescu A Levy and D Suciu XML QL A Query Language for XML In 8th WWW Conference W3C 1999 World Wide Web Consortium Technical Report NO TE xml ql 19980819 www w3 org TR NOTE xml1 ql J Frohn R Himmeroder P T Kandzia G Lausen and C Schlepphorst FLORID A Prototype for F Logic 1997 M Kifer G Lausen and J Wu Logical Foundations of Object Oriented and Frame Based Languages Journal of the ACM 42 4 741 843 1995 P Kolaitis and C Papadimitriou Why not Negation by Fixpoint pp 231 239 1988 The MONDIAL Database http www informatik uni freiburg de may Mondial T C Przymusinski On the Declarative Semantics of Deductive Databases and Logic Programs In J Minker editor Foundations of Deductive Databases and Logic Programming pp 191 216 Morgan Kaufmann 1988 J Robie XQL XML Query Language http www metalab unc edu xql xql proposal html 1999
24. fault in DEFAULTHIS. For operating LoPiX with non-standard configuration files regularly, it is more convenient to change the environment variables (cf. Section 1.1). The option -q causes LoPiX to quit after execution of the command line instead of entering the interactive mode. This is helpful if the user wants to call LoPiX in batch mode, e.g. from a shell script: lopix -q test.lpx. In addition to the options, a list of filenames of F-Logic programs can be given. LoPiX consults them in left-to-right order. With option -d (debug), all rules read will be echoed to the console. This is useful for tracing configuration errors and locating errors in program files. 1.4 Running LoPiX in Emacs. Running the system from emacs not only offers high-level editing facilities, but also integrates LoPiX into an environment where all kinds of tools (mailreader, newsreader, several compilers, TeX, etc.) are used in a uniform way. In order to use LoPiX from emacs, the configuration steps in Section 1.2 must have been executed. When a file ending with .lpx is loaded into emacs, emacs automatically enters the XPathLog mode, which provides syntax highlighting for XPathLog programs. In addition to editing capabilities, the lpx mode defines several key codes for interaction between the editor and LoPiX (see Table 1). When the key combination Control-c Control-c is pressed in the program buffer, a new buffer is created in the lower half of the current
25. h expressions can be arbitrarily nested Example 7 Generation of Nested Elements The path expression n a b c name gt gt m generates the XML tree fragment lt a gt lt b gt lt c gt lt name attributes of m gt contents ofm lt name gt lt c gt lt b gt lt a gt In another example we show how data can be restructured Example 8 Creation of free elements The use of variables allows to create elements whose name is given by an attribute of another element casting strings silently into names which is very useful for data restructuring and integration From the elements lt water type river name Mississippi gt lt water gt lt water type sea name North Sea gt lt water gt the rule T name text gt N water type gt T and name text gt N creates lt river name Mississippi gt and lt sea name North Sea gt Attributes and contents are then transformed by separate rules which use name text for identification Properties are copied by using variables at element name and attribute name position X A gt V water type gt T and name text gt N and A gt V T gt X nameltext gt N X S gt V water type gt T and name text gt N and S gt V T gt X nameltext gt N 5 Built in Features The LOPiX implementation of XPathLog provides some built in features namely a comfort able handling of PCDATA contents the equality predic
26. he fact version accepts variables. Program evaluation: As already mentioned, LoPiX uses a bottom-up evaluation strategy. This algorithm iteratively deduces new facts from already established facts using a forward-chaining technique [CGT90]. A program P gives rise to an operator T_P on partial models (as defined for XPathLog). This operator adds to the model all those facts which can be derived from the already existing facts by a single application of a program rule (no recursion). To evaluate recursive rules, it is necessary to iterate this operator: starting with the empty model or a given finite object world, T_P is applied iteratively until a fixpoint of T_P is reached. A single application of T_P can be achieved with sys tp, although the user will in general use sys eval for complete evaluation, which means iteration up to the deductive fixpoint. Note that the database is not cleared before applying T_P, that is, the evaluation does not start with the empty set by default. This is an important feature for dividing large programs into smaller parts or for user stratification using sys strat doIt (see Section 2.1.1). Semi-naive Evaluation: LoPiX includes an evaluation component providing a semi-naive evaluation mode. The evaluation mode can be set by a system command: sys theEval mode seminaive, sys theEval mode naive. Naive evaluation is the default setting; this may be ch
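The iteration of the T_P operator described above can be illustrated outside LoPiX. The following Python sketch is not LoPiX code and does not reproduce its internals: rules are modelled as plain functions from a set of facts to a set of derived facts, and the names `tp_step`, `evaluate`, `reach_base` and `reach_step` are hypothetical. It shows naive bottom-up evaluation only, and, like LoPiX, it starts from whatever facts are already in the database rather than from the empty set.

```python
def tp_step(rules, facts):
    """One application of T_P: add everything derivable by a single rule application."""
    derived = set(facts)
    for rule in rules:
        derived |= rule(facts)  # every rule sees the same input facts
    return derived


def evaluate(rules, facts=frozenset()):
    """Iterate T_P, starting from the given facts, until no new fact is derived."""
    current = set(facts)
    while True:
        following = tp_step(rules, current)
        if following == current:  # fixpoint reached
            return current
        current = following


# Hypothetical example program: reachability over edge facts.
def reach_base(facts):
    return {("reach", a, b) for (p, a, b) in facts if p == "edge"}


def reach_step(facts):
    return {("reach", a, c)
            for (p, a, b) in facts if p == "reach"
            for (q, b2, c) in facts if q == "edge" and b2 == b}


edges = {("edge", 1, 2), ("edge", 2, 3), ("edge", 3, 4)}
one_round = tp_step([reach_base, reach_step], edges)  # direct edges only
model = evaluate([reach_base, reach_step], edges)     # full transitive closure
```

A single `tp_step` corresponds loosely to what the manual describes for sys tp, and `evaluate` to sys eval; a semi-naive variant would additionally restrict each round to rule instances that use at least one fact derived in the previous round.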
27. ibling. The file mondial example lpx contains many more instructive queries. 4 XPathLog Rules. XPathLog rules are logical rules of the form head body over XPathLog expressions, where we additionally allow conjunctions in the rule head. To cut a long theory short, evaluation in LoPiX means bottom-up evaluation with user-defined stratification, i.e. an intuitive, easy-to-grasp rule-based semantics. Given a program, i.e. a set of rules, the rule bodies (right-hand sides) are evaluated like queries, yielding a result set. The variables in the rule heads are instantiated according to the result sets, specifying the facts to be added to the database. In a step, all rules in a program are evaluated simultaneously. As long as they yield new facts, the process is iterated (T_P operator of Logic Programming). Programs can be cut into subprograms by strata as introduced in Section 2.1.1. Then the strata are evaluated sequentially, each of them starting with the database which results from the evaluation of the previous one (see program reachable flx for an example). Section 6 gives a more detailed overview of the semantics. Right-Hand Side: The body of an XPathLog rule is a conjunction of XPathLog expressions. The evaluation of the body with respect to a given database yields variable bindings, which are propagated to the rule head, where facts are added to the database. 4.1 Left-Hand Side. Using logical expressions for specifying an update is perhaps the
28. in local variables which are used for joins or conditions; their bindings do not occur in the output (local variables are of the form _X). The following XPath expression returns all names of cities such that the city belongs to a country whose name is known, its population is higher than 100000, and its latitude is not known: country name text -> N1 city population -> _P and not latitude name text -> N2 _P > 100000. The semantics of this query is a set of variable bindings for N1 and N2. An equivalent expression without a local variable is country name text -> N1 city @population > 100000 and not latitude name text -> N2. Dereferencing: Reference attributes can be resolved in path expressions for navigating through the XML database. For every organization, give the name of the seat city and all names and types of members: organization name text -> N and abbrev -> A and seat name text -> SN members type -> MT country name text -> MN. One element of the result set is e.g. N A UN SN New York MT observer MN Switzerland. The following query is equivalent, using a join of literals connected by the local variable _Org instead of navigation: organization -> _Org name text -> N and abbrev -> A and seat name text -> SN _Org members type -> MT country name text -> MN. The following query yields the abbreviations of all organizat
29. ions which are seated in the capital of one of their members: organization abbrev -> A and seat members country capital. We can additionally add variables to bind the names of the organization, of the country, and of the seat city: organization name text -> N and seat members country name text -> CN capital seat name text -> CN. Negation in Filters: The following query yields all organizations which are not seated in a city in some member country: organization name text -> N and not seat members country city. The answer includes those where no seat is known, since the filter predicate says "not those where (i) the seat attribute is defined and (ii) references a city in a country which is a member". Navigation Variables: Variables can also range over element or attribute names. Are there elements which have a name subelement with the PCDATA contents Monaco, and of which type are they? Type -> X name text Monaco. Answers: Type country X country monaco; Type city X city monaco. Note that we use the predicate for comparison with a constant in the filter. Axes: LoPiX supports the axes self, child, parent, descendant, ancestor, attribute, and sibling. Since the current version does not actively support the ordering of children (although the evaluation strategy preserves that ordering in most cases), we do not distinguish preceding-sibling and following-s
30. ly established before the actual aggregation. 5.7 Output Handling. For every node N, the active method N export doctype filename in the head of a rule or as a fact generates XML output of the subtree rooted in N, consisting of all subelements and attributes which are specified by the currently stored signature, to the given filename. The DTD metadata (public, system, and the url) have to be set by methods, e.g. country Isa doctype country public mondial europe 2 0 dtd sys strat doIt germany export country exp1 sys strat doIt. If filename is the empty string, the output is written to standard output. Similarly, sys theOMAccess export xml filename doctype object as a system command allows exporting a given element. The example parse dtd lpx illustrates the export functionality. The signature can either be given by parsing a DTD or as facts in the program. 6 Evaluation Semantics. An XPathLog program is a collection of facts and rules in arbitrary order. Evaluating these facts and rules bottom-up, an object base is computed which may then be queried. Note, however, that queries are not part of a program. They are evaluated (or, in the case of system commands, executed) when parsing the program or stratum, before the program or stratum is evaluated (note that the system command version can be given interactively, whereas only t
31. oint can be examined by querying. Calling sys eval again will continue the evaluation process where it was halted. The pretty printer checks the QUIT signal too, so that printing huge answer sets can be stopped. 3 Document Access and Queries in LoPiX Programs. The internal data model of LoPiX is a (in the current version restricted) XML data model enhanced by some features known from the relational and the object-oriented data models: predicates, metadata (signatures, and data for controlling document access), and a class hierarchy. 3.1 The XPathLog Language. XPathLog is based on XPath. Constants are interpreted as names and elements of the universe. By convention, constants start with lowercase letters, whereas variables start with uppercase ones. An XPathLog reference expression is an XPath expression which satisfies the following conditions: every element name or attribute name in a navigation step may be replaced by name -> variable or even variable1 -> variable2 (here, variable1 is bound to the name, extending the navigation wildcard, and variable2 is bound to the node(s)); for reference attributes, additional navigation steps may follow (implicit dereferencing); the id function is not used, since XPathLog supports implicit dereferencing; filters do not contain logical or (to be replaced by two or more rules
32. ond set of rules is evaluated with respect to this database. Thus, an important example is the definition of a block strat, which makes stratification of programs more readable. This is done in config lpx by default: sys strat ->> sys eval sys forgetProgram. Thus, calling sys strat doIt executes sys eval and sys forgetProgram sequentially. To separate the strata of a program, simply put the command between them. Besides the method doIt, a block has the method display. It lists the contents of the block as a list of queries, adding the query prompt. In the example above, calling the method sys strat display will print Block sys eval sys forgetProgram 1. Stratification is an important matter when dealing with negation, when facts have to be computed completely before their negation is used. The system command sys strat doIt may also be used in negation-free XPathLog programs to speed up the evaluation. Interrupting an Evaluation: A running evaluation can be canceled without terminating the whole system by pressing Control. Then a QUIT signal is sent to the system, causing a break flag to be set. In order to return a somewhat consistent object world, the current T_P round has to be finished, so it may take some time before the system actually halts. After canceling, the partial model evaluated up to this p
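Why the lower stratum has to reach its fixpoint before a negated subgoal is used can be seen in a small model. The Python sketch below is an illustration only, not LoPiX code: rules are functions over a set of fact tuples, each list of rules stands for the rules placed between two stratification commands, and all names (`fixpoint`, `evaluate_strata`, the example rules) are hypothetical.

```python
def fixpoint(rules, facts):
    """Naive bottom-up evaluation of one stratum on top of the given database."""
    current = set(facts)
    while True:
        following = set(current)
        for rule in rules:
            following |= rule(current)
        if following == current:
            return current
        current = following


def evaluate_strata(strata, facts=()):
    """Evaluate the strata one after another; earlier rules are not revisited."""
    db = set(facts)
    for rules in strata:
        db = fixpoint(rules, db)  # roughly: evaluate, then forget the program
    return db


def reach_base(db):
    return {("reach", a, b) for (p, a, b) in db if p == "edge"}


def reach_step(db):
    return {("reach", a, c)
            for (p, a, b) in db if p == "reach"
            for (q, b2, c) in db if q == "edge" and b2 == b}


def unreach(db):
    # Negated subgoal: only sound once all reach facts have been derived.
    nodes = {x for (p, a, b) in db if p == "edge" for x in (a, b)}
    return {("unreach", a, b) for a in nodes for b in nodes
            if a != b and ("reach", a, b) not in db}


edges = {("edge", 1, 2), ("edge", 2, 3)}
stratified = evaluate_strata([[reach_base, reach_step], [unreach]], edges)
unstratified = evaluate_strata([[reach_base, reach_step, unreach]], edges)
```

In the unstratified run, the negated subgoal is tested in the first round, before any reach fact exists, so unreach(1, 3) is derived although node 3 is reachable from node 1; the stratified run does not derive it.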
33. s DTD parsing and its result. By evaluating rules of the form u Isaurl u get <body>, the internal database is extended by new XML documents. Thus, loading Web documents is completely data-driven: new documents are fetched depending on information and links (i.e. URLs or XLinks) found in already known documents. As long as only a single document is considered, it is recommended to assign its local root to the global constant root, which is used for evaluating expressions of the form, as it is done in smallmondial lpx: mondial xml -> file smallmondial xml isa url U parse xml root mondial xml -> U. If several documents have to be considered, it is useful to assign constants to their roots, e.g. germany xml -> file germany xml isa url france xml -> file france xml isa url U parse xml X X xml -> U. Then the constants can be used as starting points for XPathLog reference expressions, e.g. germany city name -> C. 3.3 Querying the Database. In this section, we use the MONDIAL database [Mon] as an example (XML files mondial 2 0 xml and mondial europe 2 0 xml; use lopix examples mondial example flx). The file is accessed by mondial xml -> file mondial europe 2 0 xml isa url U parse xml root mondial xml -> U and can then be queried by XPathLog reference expressions. Example 2
34. sed system commands are loaded into the readline history. 2.1.1 Blocks and Stratification. Sequences of queries (in practice, this concerns mainly system commands) can be combined in a block. A block is an interface object providing the method doIt, which executes the queries contained in the block. Of course, the same effect could be achieved by consulting a file with the commands or queries. But for short, often recurring command series, it is more convenient to use a block as a shorthand without having to consult external files. Syntactically, a block declaration is a multivalued method definition with host object sys. Between the curly braces stands a sequence of goals, i.e. queries without. Note that, in contrast to the XPathLog semantics of multivalued methods, the goals form a list here, not a set, i.e. the order is relevant. Note that logical queries can be used in blocks besides system commands, too, as long as they contain only one goal. Stratification: The interplay between the database and the current program allows a user stratification of programs: a set of rules which is a part of a larger program can be executed by the two system queries: some rules, sys eval, sys forgetProgram, some more rules, sys eval. First, the first set of rules is evaluated by sys eval, computing a database. Then forgetProgram clears the current program (the contents of the database remain unchanged), and the sec
35. t which may be a string or a Web document which match the pattern given by a regular expression in <pattern>. <fmt list> is a format string describing how the matched strings should be returned in <variable list>. This feature is useful when using groups (expressions enclosed in) in <pattern>. In the format string <fmt list>, groups are referred to by their number n, where n ranges from 1 to 9. If <fmt list> is the empty string, all groups are returned without formatting. With the exception of <variable list>, all arguments must be bound. pmatch does the same using Perl regular expressions; we recommend using pmatch. For example, match linux98 0 9 CL0 9 2swap 1i X returns X 8swap9 (the first and second match are swapped). 5.5 Data Conversion. LoPiX internally distinguishes objects (e.g. john) from literals (e.g. 42, 3.14, or John). The literal types are further distinguished: strings are always given in quotes (john), integer objects are given as is, and floats have to be distinguished by from literals 3.14. Example 12: Note that 3.14 and 8 14 actually are different things. In the first example, 3.14 is interpreted as a float: b m -> 3.14 sys strat doIt bIm -> C b m -> C J D C 1 A Answer to query b m -> C C 3.14 A Answer to query b m -> C D C 1 C 3.14 D 4 1
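The group numbering used by the format string of match and pmatch, described above, behaves much like backreferences in other regular-expression libraries. As an illustration only, the following Python sketch mimics the swapping example with the standard `re` module; the function name `match_format` is hypothetical, Python's regular expressions are Perl-style (so this is closer in spirit to pmatch), and returning a tuple for the empty format string is an assumption about what "all groups without formatting" could look like.

```python
import re


def match_format(text, pattern, fmt):
    """Return the first match of pattern in text, rendered through fmt.

    fmt refers to groups as \\1 ... \\9; an empty fmt returns all groups as a tuple.
    Returns None if the pattern does not match.
    """
    m = re.search(pattern, text)
    if m is None:
        return None
    if fmt == "":
        return m.groups()
    return m.expand(fmt)


# Two single-digit groups, returned in swapped order.
swapped = match_format("linux98", r"([0-9])([0-9])", r"\2swap\1")  # "8swap9"
```

Here the first group matches 9 and the second matches 8, so the format string with the groups exchanged yields 8swap9, the same result the manual reports for its example.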
36. th. The LoPiX system employs the evaluation component of FLORID, an implementation of F-Logic, whose data model and language syntax have many similarities with XML and XPath, respectively. We assume that the reader of this tutorial is familiar with the basic concepts of deductive databases, e.g. Datalog [AHV95, CGT90, Ull89] or F-Logic [KLW95]. XPath [XPa99] is the standard querying language for XML data. Most XML querying and data transformation languages are based on XPath, e.g. XQL [Rob99], XSLT [XSL99], XML-QL [DFF 99], Quilt [CRF00], and the XLink/XPointer proposals. XPathLog extends XPath with logic-programming-style variables and variable binding concepts, and with a semantics for XPath expressions in rule heads for updating an XML database. The evaluation of programs in LoPiX is based on a set-oriented bottom-up computation as an extension of the algorithm well known from Datalog [AHV95, CGT90, Ull89]. Here, the evaluation component of the FLORID system is employed. We assume that the reader is familiar with the basic notions of XPath and Logic Programming. The structure of this manual is as follows: The first part describes the installation and the environment of LoPiX. First, we describe the installation of LoPiX for working from the unix shell or from within emacs. Section 1 describes how to run LoPiX in the unix shell or from within emacs. Section 2 gives an overview of the structure of LoPiX programs. Section 3 describes how XML do
37. the blanks between an arithmetic operator and its operands are mandatory; 2 2 leads to a parser error message. Arithmetic expressions are only allowed in (in)equality predicate atoms in a rule body or in filters, e.g. X Y 2 3 X Y 4 4. Objects denoting integers must not be equated to other integer or string objects, because their object identity is not independent of their object name. Unfortunately, avoiding this by a static check is impossible, since integer objects may be bound to variables in a rule head. If an equation of two different integer objects is derived during the evaluation of an XPathLog program, an error is reported. 6699 66K 99 multiplication and in. Note that comparison predicates are used in infix notation, in contrast to other predicate symbols. 5.4 String handling. Analogously to integers, there are several predefined operations for strings. These are provided by built-in predicates, which all have a fixed arity. Using them with a wrong arity causes a parser error message. Furthermore, these predicates can only be used in rule bodies. string <arg> is true if <arg> is a string. strlen <string> <value> holds if <value>, which can be given as a constant or a variable, represents the length of <string>. For practical reasons, variables at the position of <string> have to be bound by other molecules in the rule body, otherwise <
38. troduce cycles in the document hierarchy. For the subsequent examples, we associate aliases with the Bavaria and Germany elements: C bavaria country -> C name text -> Bavaria C germany country -> C name text -> Germany. Insertion of Subelements and Attributes: As shown above, elements are extended with subelements or attributes by using filter syntax in the rule head. Example 5 (Subelements): The following two rules are equivalent to the above ones: country name -> Bavaria C car_code -> BAV and capital -> X and city -> X and city -> Y city -> X name text -> Munich city -> Y name text -> Nurnberg country -> C name text Bavaria. Here, the first rule creates a free element, whereas the second rule uses the variable binding to C for inserting subelements and attributes. Generation of Elements by Path Expressions: Additionally, subelements can be created by path expressions in the rule head; these create nested elements which satisfy the given path expression. Example 6 (Generation of Subelements): The ethnicgroups subelements of Germany are copied as subelements of Bavaria: bavaria ethnicgroups text -> E and percentage -> P germany ethnicgroups text -> E and percentage -> P. Here, new elements are created which can be changed independently of the original ones. Pat
39. whose id is A converted to lowercase letters. string2object john O A Answer to query string2object john O O john string2object John O A Answer to query string2object John O O john string2object S john A Answer to query string2object S john S john. If the given string denotes an integer or a float, B is the corresponding integer or float object. Note that the above predicates are bidirectional, but at least one of the arguments has to be bound. string2float 3.14 B A Answer to query string2float 3.14 B B 3.14 string2float X 3.14 A Answer to query string2float X 3.14 X 3.14. 5.6 Aggregation. An aggregation term has the form agg X G1 ... Gn body, where agg is one of the usual aggregation operators min, max, count, and sum, and body is the aggregation body, that is, a conjunction of literals. agg X G1 ... Gn body returns one value for every vector of values for G1 ... Gn. All variable bindings satisfying body are calculated intentionally, yielding bindings for X, G1 ... Gn. Then the X's are grouped by G1 ... Gn, and for every group, agg is calculated and returned. The list of grouping variables G1 ... Gn is optional and may be omitted, e.g. N count C country -> C. If the aggregation body contains other variables than the grouping variables, these are local to the aggregation. The grouping
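The grouping behaviour of aggregation terms described above (compute all bindings of the body, group the X values by the grouping variables, apply the operator per group) can be modelled in a few lines. The Python sketch below is an illustration under stated assumptions, not the LoPiX implementation: variable bindings are represented as dictionaries, the function name `aggregate` and the sample data are hypothetical, and duplicates are kept exactly as they appear in the input, since the excerpt does not say how LoPiX treats them.

```python
from collections import defaultdict

# The four operators named in the manual, mapped to Python built-ins.
AGGREGATES = {"min": min, "max": max, "sum": sum, "count": len}


def aggregate(agg, bindings, target, group_by=()):
    """Group the values of `target` by the `group_by` variables and aggregate each group.

    Returns a dict from grouping-value tuples to the aggregated value.
    With no grouping variables there is a single group with key ().
    """
    groups = defaultdict(list)
    for binding in bindings:
        key = tuple(binding[g] for g in group_by)
        groups[key].append(binding[target])  # other variables stay local
    return {key: AGGREGATES[agg](values) for key, values in groups.items()}


# Hypothetical bindings produced by an aggregation body.
rows = [
    {"C": "de", "City": "a", "P": 100},
    {"C": "de", "City": "b", "P": 200},
    {"C": "fr", "City": "c", "P": 50},
]
population_per_country = aggregate("sum", rows, "P", ["C"])
number_of_cities = aggregate("count", rows, "City")
```

With grouping variable C, one value per country is returned; with the grouping list omitted, as in the count example of the manual, there is a single result for the whole set of bindings.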
