Home
CDET The Consistent Document Engineering Toolkit User Manual
Contents
1. Classical disjunction formulat The disjunction s subformulae you need at least two 40 Technical Details and Classical conjunction formula The conjunction s subformulae you need at least two not Classical negation formula The negated subformula implies Classical implication formula The first argument formula of the implication the one that implies something formula The second argument formula of the implication the one that is implied forall Universal quantification Required attributes id The quantified variable Make sure that within a rule you quantify at most once over a variable For clarity and simplicity CDET does not support multiple quantification over the same variable Of course in different rules you can reuse variables Optional attributes action Here you can specify a preferred repair action for the quantified variable The following actions are supported delete only derive repairs to delete something change only derive repairs to change something keep don t derive any repairs for this variable keepdomain Cache the quantifier domain the values over which the quan tifier iterates Use this feature with caution since quantifier domains may become really large term A term that tells CDET where the values of the bound variable come from Over all these values the quantifier iterates Make sure that the type of the term is a list 7 then t
2. lt id type variable name gt lt ATTLIST tvar id CDATA REQUIRED gt lt supertype of all other types gt lt ELEMENT top EMPTY gt lt repository state gt lt ELEMENT state EMPTY gt lt function type gt lt ELEMENT funtype type gtype gt B 3 Language DTD language dtd lt top level language element gt lt ELEMENT language desc importlanguage importmodule importparser importforeign typedefs predsymdefs funsymdefs gt lt id language name gt 54 DTDs lt qualifiedPrelude import Haskell s Prelude qualified as gt lt qualifiedXMLTypes import HaXml types qualified as you should do that gt lt qualifiedChar import Haskell s Char module qualified as gt lt qualifiedRatio import Haskell s Ratio module qualified as gt lt qualifiedComplex import Haskell s Complex module qualified as gt lt qualifiedNumeric import Haskell s Numeric module qualified as gt lt qualifiedIx import Haskell s Ix module qualified as gt lt qualifiedList import Haskell s List module qualified as gt lt qualifiedMaybe import Haskell s Maybe module qualified as gt lt qualifiedRandom import Haskell s Random module qualified as gt lt qualifiedTime import Haskell s Time module qualified as gt lt by default we import Prelude unqualified and gt lt all other mod
3. meta Rule metadata Required attributes weight importance level of the rule weak may the rule be violated desc A comprehensive description for this rule docdeps Document dependencies By these document de pendencies you can override the dependencies calculated by CDET Sometimes the calculated dependencies are too pessimistic and you know better The more restrictive the document dependencies are the more consistency checking is accelerated docdep Document dependencies Required attributes regex pattern of files the rule accesses formula The actual predicate logic formula described below The formula element contains the predicate logic part of a rule which is rather standard It consists of the following subelements e desc A comprehensive description of the formula Here I plan to include some automatic translation of the formula into natural language The formula itself may be one of the following like in classic predicate logic e papply An atomic formula i e predicate symbol application to some argument terms Required attributes id The predicate symbol to apply Make sure that the symbol is defined in a language imported by this rule set 38 Technical Details term Argument terms of the predicate symbol application Make sure that the result types of the argument terms fit to the pred icate symbol s type modulo polymorphism and subtyping of course CDET s type checker w
4. performance Performance options Required attributes usecache use CDET s internal caches fastprelude use internal computation for Prelude functions incremental check consistency incrementally filterrules only check rules influenced by a check in optimizerules rewrite rules for faster consistency checking repair Options for repair Required attributes partialResolution should we permit partial in consistency resolution Optional attributes maxsets maximum number of alternative repair sets 32 5 2 Language XML Syntax 33 metrics The metrics for sorting the repair sets Here you can place a Haskell implementation of the function repairMetrics Repair gt Repair gt Bool cdetdir The CDET source directory Required attributes path the directory path includedir Extra include directories needed to compile your project e g directories from which you import parsers or auxiliary Haskell modules Required attributes path the directory path importrules One or more rule sets for your project Against the rules from these files your repository will be checked for consistency Required attributes path the rule definition file 5 2 Language XML Syntax language A user defined language needed by some rules A language de fines types predicate symbols and function symbols Required attributes id the name of the language use a unique name different from any other languag
5. ActChangeField Value FSym Value change a field in a value ActAdd Value insert value ActChangeSuchThat PSym Value Bool we do not know how to set the value data Rating Rating offends Id possibly impacted rules resolves Int Int how many inconsistencies resolved at which importance level repaircost Maybe Cost repair costs In addition isDocVal Value gt Bool determines whether a value is a doc ument isDocListVal Value gt Bool determines whether a value is a list containing documents lt here are the CDET sources gt lt cdetdir path src gt lt import these rules gt lt importrules path ManualsRules xml gt Finally we tell CDET about its own sources and import some rule sets You can specify as many rule sets as you like ManualsRules xml is also located in the xml subdirectory and contains our example consistency rules 4 2 Consistency Rules We now go deeper into the formalization of consistency rules the rule designer s task First we explain how rules are formalized using the syntax of predicate logic Second we show our example rules in CDET s XML syntax 18 What You need to Input into CDET 1 Always links must be valid At each state t we have for all documents z at t that for all their referenced keys k there exists a key definition d in one of the resolvers for k and there exists a manual m with name and kind as defined by d
6. c If the modified program normally reads commands interactively when run you must cause it when started running for such interactive use in the most ordinary way to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty or else saying that you provide a warranty and that users may redistribute the program under these conditions and telling the user how to view a copy of this License Exception if the Program itself is interactive but does not normally print such an announcement your work based on the Program is not required to print an announcement 70 GNU General Public License These requirements apply to the modified work as a whole If identifiable sections of that work are not derived from the Program and can be reasonably considered independent and separate works in themselves then this License and its terms do not apply to those sections when you distribute them as separate works But when you distribute the same sections as part of a whole which is a work based on the Program the distribution of the whole must be on the terms of this License whose permissions for other licensees extend to the entire whole and thus to each and every part regardless of who wrote it Thus it is not the intent of this section to claim rights or contest your rights to work written entirely by you rather the intent is to exercise the right to control the distribution of derivative
7. kee WI 2004 ACM Press 73 74 BIBLIOGRAPHY SBRS04b J Scheffezyk U M Borghoff P R dig and L Schmitz Towards effi SBRS04c Sch04 SSBS04 W30C01 WR99 cient consistency management for informal applications Int Journal of Computer amp Information Science 5 2 109 121 2004 J Scheffezyk Uwe M Borghoff P R dig and L Schmitz S DAGs Towards efficient document repair generation In Proc of the 2nd Int Conf on Computing Communications and Control Technologies volume 2 pages 308 313 Austin TX 2004 University of Texas at Austin J Scheffezyk Consistent Document Engineering PhD thesis Univer sit t der Bundeswehr Miinchen 2004 see www unibw de inf2 CDET J Scheffczyk C Stutz U M Borghoff and J Siedersleben Formale Konsistenzsicherung in informellen Software Spezifikationen Infor matik Forschung und Entwicklung 19 1 17 29 2004 World Wide Web Consortium W3C XML Schema part 0 Primer W3C Recommendation 2001 see www w3 org TR xmlschema 0 M Wallace and C Runciman Haskell and XML Generic combinators or type based translation In Proc of the 4th ACM SIGPLAN Int Conf on Functional Programming pages 148 159 Paris France 1999 ACM Press
8. lt actions gt lt rule gt lt rule phi2 omitted for brevity gt lt rules gt 4 3 Languages Our example rules use some symbols that are not defined yet e g the symbol or the symbol repStates These symbols are defined in languages For convenience CDET provides the predefined language Prelude which defines many useful types and symbols see App A Therefore you only need to define a few symbols and types for specific projects In this section we first review our example language in a formal abstract syntax Second we explore the concrete XML syntax which again will be a bit more verbose For our example we define the language ManualsLanguage which contains symbols and types as shown in Fig 4 2 ManualsLanguage imports the prede fined language Prelude the relevant part of which is also shown in Fig 4 2 For simplicity our formalism neglects import relations between languages We let lt denote an explicit subtype relation between record types a record type may have multiple supertypes The variant list type a is declared as usual in functional programming MTH90 PJ03 the variant constructors are empty list and add an element to the front of a list The record type Document stands for a formal document carrying a name of type String and a check in state of type State That way we dis tinguish different document versions CDET requires that each document type is a subtype of Docu
9. case repaircost rating rep of Just c gt c Nothing gt 0 gt lt metrics gt lt repair gt lt here are the CDET sources gt lt cdetdir path src gt lt import these rules gt lt importrules path ManualsRules xml gt lt consistencycheck gt Let us take a closer look at the XML elements above lt consistencycheck name Manuals reportdtd src dtd report dtd tmpdir haskell outputdir out repositorybase darcs repositorytype darcs gt First we say that our project is called Manuals The DTD for the consistency re port produced by CDET can be found in the directory src dtd and is called report dtd You can use relative or absolute paths as you like The tempo ral directory for CDET is haske11 Here CDET also places the project specific 16 What You need to Input into CDET binary after a successful cdet compile you may however copy the project specific binary whereever you like CDET places its generated XML files in the directory out There you will find after appropriate commands for each rule its S DAG structure file name _SDAG xm1 the corresponding SVG file file name _SDAG xml svg and the overall repairs file name overallRepairs xml The darcs repository holding our project documents is located in the directory darcs lt performance usecache true fastprelude true incremental true filterrules true
10. false IMPLIED gt lt existential quantification with sphere term gt lt ELEMENT exists term formula gt lt id quantified variable gt lt action hint for quantifier gt lt keepdomain also cache the quantifier sphere may be expensive gt lt ATTLIST exists id CDATA REQUIRED action delete add change keep IMPLIED keepdomain true false IMPLIED gt lt quantified variable gt lt TERMS gt lt term with description gt lt ELEMENT term desc var fapply sym rec case typed gt lt variable gt lt ELEMENT var EMPTY gt lt id variable name gt lt ATTLIST var id CDATA REQUIRED gt lt applied function symbol gt lt ELEMENT fapply term gt lt id function symbol name gt 66 DTDs lt ATTLIST fapply id CDATA REQUIRED gt lt function symbol e g as argument for higher order functions gt lt ELEMENT sym EMPTY gt lt id function symbol name gt lt ATTLIST sym id CDATA REQUIRED gt lt record construction gt lt ELEMENT rec label gt lt id record constructor name gt lt ATTLIST rec id CDATA REQUIRED gt lt record label gt lt ELEMENT label term gt lt id label name unique for this record gt lt ATTLIST label id CDATA REQUIRED gt lt variant deconstruction gt lt ELEMENT case term binding gt
11. gt lt ELEMENT subs gtype gt lt variant constructor gt lt ELEMENT vcon gtype gt lt id variant constructor name gt lt ATTLIST vcon id CDATA REQUIRED gt lt TYPES gt lt type ground or function type gt lt ELEMENT type gtype funtype gt lt groundtype gt lt ELEMENT gtype tapply tvar top state gt B 4 Report DTD report dtd 59 lt type application gt lt ELEMENT tapply gtype gt lt tcref type constructor to be applied gt lt ATTLIST tapply tcref CDATA REQUIRED gt lt type variable gt lt ELEMENT tvar EMPTY gt lt id type variable name gt lt ATTLIST tvar id CDATA REQUIRED gt lt supertype of all other types gt lt ELEMENT top EMPTY gt lt repository state gt lt ELEMENT state EMPTY gt lt function type gt lt ELEMENT funtype type gtype gt lt predicate type gt lt ELEMENT predtype type gt B 4 Report DTD report dtd lt CONSISTENCY REPORTS gt lt rule evaluations for each rule its report gt lt ELEMENT evaluations evaluation repairs gt lt ELEMENT evaluation rule report gt lt reports description S DAG gt lt ELEMENT report desc sdag gt lt truth report s boolean truth value gt lt ATTLIST report truth true false REQUIRED gt lt report s truth value
12. lt term with description gt lt ELEMENT term desc var fapply sym rec case typed gt lt variable gt lt ELEMENT var EMPTY gt lt id variable name gt lt ATTLIST var id CDATA REQUIRED gt lt applied function symbol gt lt ELEMENT fapply term gt lt id function symbol name gt lt ATTLIST fapply id CDATA REQUIRED gt lt function symbol e g as argument for higher order functions gt lt ELEMENT sym EMPTY gt lt id function symbol name gt 52 DTDs lt ATTLIST sym id CDATA REQUIRED gt lt record construction gt lt ELEMENT rec label gt lt id record constructor name gt lt ATTLIST rec id CDATA REQUIRED gt lt record label gt lt ELEMENT label term gt lt id label name unique for this record gt lt ATTLIST label id CDATA REQUIRED gt lt variant deconstruction gt lt ELEMENT case term binding gt lt tcref variant constructor type constructor gt lt ATTLIST case tcref CDATA IMPLIED gt lt binding inside case id gt sym gt lt ELEMENT binding EMPTY gt lt id variant constructor gt lt sym function symbol to be applied to constructor arguments gt lt ATTLIST binding id CDATA REQUIRED sym CDATA REQUIRED gt lt explicitly typed term gt lt ELEMENT typed term type gt lt user defined action
13. tore A dId m kId d m dId gt kId d False 3 A kind m kKind d m kind gt kKind d False 2 In CDET s XML syntax we use and for conjunctions and or for disjunctions where both tags can carry arbitrarily many subformulae lt formula gt lt and gt lt formula gt lt papply id gt lt term gt lt var id k gt lt term gt lt term gt lt fapply id key gt lt term gt lt var id d gt lt term gt lt fapply gt lt term gt lt hintalternative gt lt set condition false id d field key repaircost 5 gt lt term gt lt var id k gt lt term gt lt set gt lt hintalternative gt lt hintalternative gt lt set condition false id k repaircost 1 gt lt term gt lt fapply id key gt lt term gt lt var id d gt lt term gt lt fapply gt lt term gt lt set gt lt hintalternative gt lt papply gt lt formula gt lt formula gt lt papply id gt lt term gt 22 What You need to Input into CDET lt fapply id dId gt lt term gt lt var id m gt lt term gt lt fapply gt lt term gt lt term gt lt fapply id kId gt lt term gt lt var id d gt lt term gt lt fapply gt lt term gt lt hintalternative gt lt set condition false id m field dId repaircost 3 gt lt term gt lt fapply id kId gt lt term gt lt var id d gt lt term gt lt fapply gt lt term gt lt set gt lt hintalterna
14. Descr deriving Eq Show data KDef_Attrs KDef_Attrs kDefKey PackedString kDefKld PackedString kDefKKind PackedString deriving Eq Show newtype Descr Descr PackedString deriving Eq Show to the CDET types also in haske11 ManualsLanguageTypes hs data ResD ResD resD_dld PString resD_dState State resD_kDefs KDef deriving Eq Ord data KDef KDef kDef_kId PString kDef_kKind PString kDef_key PString deriving Eq Ord We do this by the function mkResD below lt funsymdef id repResDs gt lt desc gt resolver documents in the repository lt desc gt lt meta gt lt docdep regex key xml gt lt cachemeta cached true reftrans true gt lt argdep num 0 gt lt cachemeta gt lt meta gt lt funtype gt lt type gt lt gtype gt lt state gt lt gtype gt lt type gt lt gtype gt lt tapply tcref gt lt gtype gt lt tapply tcref ResD gt lt gtype gt lt tapply gt lt gtype gt lt funtype gt lt code gt repResDs repo t map 1 d gt mkResD d fromJust err d fromJust err d parseDoc d repo readXml ds where err d repResD internl error reading xml show d ds repDocs keyx xml t repo mkResD Document gt Res Res gt ResD mkResD d Res Res kdefs ResD fresD_dld document_dld d resD_dState document_dState d 4 3 Languages 31 resD_kDefs map mkKDef kdefs mkKDef Res KDef attrs _ KDe
15. ET xml dState 2 A gt kind technical M Abandoned False kind m kKind d m kind field M technical M 2 vedad Y m fdld mant xml dState 1 m fdld man1 xml dState 1 cule da Abandoned kind technical M KEEP m gt 4 das pos man1 xml dState 2 kind field M Chg olman xml dState 2 gt Ukind technical M False kind m kind m m kind field M gt technical M 2 Abandoned Figure 2 3 Example augmented S DAGs right hand leaf Sure the repository contains a second manual man2 xml which would however cause more efforts for repair S DAGs will contain repairs that impose minimal changes to the repository and preserve check ins if possible CDET denotes S DAGs that are not worth repairing by a special leaf Abandoned The S DAG in Fig 2 2 lacks concrete repair actions i e to add change or delete document content In order to support interactive repair CDET can augment S DAGs by repair actions after consistency checking Simply issue cdetExProj augment example_project xml lt repository state gt at the command line By the optional argument lt repository state gt you can choose the repository state for which the S DAGs should be augmented If this argument is not given augmented S DAGs contain repairs for all repository states During S DAG augmentation the repos
16. body of this License The Free Software Foundation may publish revised and or new versions of the General Public License from time to time Such new versions will be similar in spirit to the present version but may differ in detail to address new problems or concerns Each version is given a distinguishing version number If the Program specifies a version number of this License which applies to it and any later version you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation If the Program does not specify a version number of this License you may choose any version ever published by the Free Software Foundation 72 10 11 12 GNU General Public License If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different write to the author to ask for permission For software which is copyrighted by the Free Software Foundation write to the Free Software Foundation we sometimes make exceptions for this Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally NO WARRANTY BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE THERE IS NO WARRANTY FOR THE PROGRAM TO THE EXTENT PERMITTED BY APPLICABLE LAW EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AN
17. commit to using it Some other Free Software Foundation software is covered by the GNU Library General Public License instead You can apply it to your programs too When we speak of free software we are referring to freedom not price Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software and charge for this service if you wish that you receive source code or can get it if you want it that you can change the software or use pieces of it in new free programs and that you know you can do these things To protect your rights we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights These restrictions translate to certain responsibilities for you if you distribute copies of the software or if you modify it For example if you distribute copies of such a program whether gratis or for a fee you must give the recipients all the rights that you have You must make sure that they too receive or can get the source code And you must show them these terms so they know their rights We protect your rights with two steps 1 copyright the software and 2 offer you this license which gives you legal permission to copy distribute and or modify the software Also for each author s protection and ours we want to make certain that everyone understands that there is no warranty for this free software If the software is modified by
18. compile and optimize our rules where example_project xml contains the project description of our example document engineering project Rule com pilation includes type checking thus it may fail if your rules include type errors Technically speaking rule compilation creates some Haskell files and compiles them via GHC Finally a new project specific binary is compiled say cdetExProj for example For large projects compilation via GHC may take some time so be prepared to have some coffee at hand You have to recompile the rules whenever you change them After compilation you can check your repository for consistency Either you can do this offline by issuing cdetExProj check example_project xml or by including the above line in a so called pre commit hook script in order to check consistency online at each check in Using DARCS you issue darcs setpref test cdetExProj check example_project xml at the command line make sure you are in your DARCS repository Of course you can edit the preferences file of your DARCS repository directly see the DARCS manual Rou05 for further instructions Currently for subversion some strange problems with the transaction mechanism currently prevent online consistency checking Notice that during consistency checking in online mode the repository is locked during consistency checking in offline mode the repository is free for check ins Consistency checking generates an S DAG for each consistency
19. find the Haskell code of the metric used to sort the repair collection The above metric counts potentially violated rules negatively and subtracts the repair cost where deleting document content is punished by subtracting 10 and INotice the use of lt CDATA in order to avoid XML parsing errors in the Haskell code 4 2 Consistency Rules 17 deleting a document is punished by subtracting 15 In general the punishment of deletion should be greater than the cost of the most expensive hint In fact deleting content or even a whole document is a fall back solution that should be considered only if no other repair is possible Technically a metric is a Haskell function repairMetrics Repair gt Repair gt Bool that indicates whether we prefer the first list of repairs over the second list for simplicity a repair list represents one alternative repair set For defining metrics you can use the following Haskell data types data Repair Repair vars Id VSym rule identifiers and variables x therein to repair ident Id Int unique identifier for a repair original rule unique number for this rule domain Term the term identifying the domain of x assign VSym Value variable assignment for the domain change RepairAction the actual repair action rating Rating rating for the repair data RepairAction ActDel Value delete value ActChange Value Value change value from to
20. gt lt ELEMENT metrics PCDATA gt lt user defined preference metrics to sort repair sets gt lt function to be defined repairMetrics Repair gt Repair gt Bool gt lt Types to be used gt lt data Repair Repair vars Id VSym rule identifiers and gt lt l variables x therein to repair gt lt ident Id Int gt lt unique identifier for a repair gt lt original rule unique number for this rule gt lt domain Term the term identifying the domain of x gt lt assign VSym Value variable assignment for the domain gt lt cannot contain x gt Shas change RepairAction the actual repair action gt lt rating Rating rating for the repair gt lt data RepairAction gt lt ActDel Value delete value gt lt ActChange Value Value change value from to gt lt l ActChangeField Value FSym Value change a field in a value gt lt ActInsert Value insert value gt lt ActChangeSuchThat PSym Value Bool gt lt we do not know how to set the value gt lt data Rating Rating offends Id possibly impacted rules gt lt resolves Int Int gt lt he how many incons resolved at which importance level gt lt repaircost Maybe Cost repair costs gt lt directory for the CDET sources fo
21. gt lt ELEMENT rating resolves offends gt lt cost repair cost gt lt ATTLIST rating cost CDATA IMPLIED gt lt how many inconsistencies does a rule resolve gt lt ELEMENT resolves EMPTY gt lt ruleprio priority of the rule for which inconsistencies are resolved gt lt inconsistencies number of inconsistencies resolved gt lt ATTLIST resolves ruleprio CDATA REQUIRED inconsistencies CDATA REQUIRED gt lt rules that might be violated by apoplying a repair gt lt rule rule identifier of a possibly violated rule gt lt ELEMENT offends EMPTY gt lt ATTLIST offends B 4 Report DTD report dtd rule CDATA REQUIRED gt lt VALUES gt lt ELEMENT value vat vrec vvar vfun vundefined gt lt atomic value gt lt ELEMENT vat PCDATA gt lt record value gt lt ELEMENT vrec vlabel gt lt id record constructor gt lt ATTLIST vrec id CDATA REQUIRED gt lt record value label gt lt ELEMENT vlabel value gt lt id label name gt lt ATTLIST vlabel id CDATA HREQUIRED gt lt variant value gt lt ELEMENT vvar valuex gt lt id variant constructor name gt lt ATTLIST vvar id CDATA REQUIRED gt lt function value gt lt ELEMENT vfun EMPTY gt lt id name of the function gt lt ATTLIST vfun id CDATA REQUIRED gt lt undefined v
22. its scope The act of running the Program is not restricted and the output from the Program is covered only if its contents constitute a work based on the Program independent of having been made by running the Program Whether that is true depends on what the Program does 1 You may copy and distribute verbatim copies of the Program s source code as you receive it in any medium provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty keep intact all the notices that refer to this License and to the absence of any warranty and give any other recipients of the Program a copy of this License along with the Program You may charge a fee for the physical act of transferring a copy and you may at your option offer warranty protection in exchange for a fee 2 You may modify your copy or copies of the Program or any portion of it thus forming a work based on the Program and copy and distribute such modifications or work under the terms of Section 1 above provided that you also meet all of these conditions a You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change b You must cause any work that you distribute or publish that in whole or in part contains or is derived from the Program or any part thereof to be licensed as a whole at no charge to all third parties under the terms of this License
23. lt tcref variant constructor type constructor gt lt ATTLIST case tcref CDATA IMPLIED gt lt binding inside case id gt sym gt lt ELEMENT binding EMPTY gt lt id variant constructor gt lt sym function symbol to be applied to constructor arguments gt lt ATTLIST binding id CDATA REQUIRED sym CDATA REQUIRED gt lt explicitly typed term gt lt ELEMENT typed term type gt lt user defined action what should be done with the report gt lt not supported at the moment gt lt ELEMENT actions mailreport mailrepair mailsummary gt lt Mail the report to some recipients gt lt not supported at the moment gt lt ELEMENT mailreport recipient gt lt ATTLIST mailreport currentonly true false REQUIRED B 4 Report DTD report dtd 67 explain true false REQUIRED format xml latex REQUIRED attachDocs true false REQUIRED gt lt Mail the repair suggestions to some recipients gt lt not supported at the moment gt lt ELEMENT mailrepair recipient gt lt ATTLIST mailrepair currentonly true false REQUIRED explain true false REQUIRED format xml latex REQUIRED attachDocs true false REQUIRED gt lt Mail a comprehensive summary to some recipients gt lt not supported at the moment gt lt ELEMENT mailsummary recipient gt lt ELEMENT recipient EMPTY gt lt ATTLI
24. on other Lin ux Unix boxes as well may be you need to modify some paths Tf you were successful in running CDET please let me know CDET should be portable to Windows without major efforts however 1 did not come around to that already For running CDET you need the Glasgow Haskell Compiler GHC version 6 4 or above autoconf and a DMS RCS including the headers of their support libraries Currently CDET supports DARCS Rou05 subversion CSFP04 and the normal file system For installing CDET just grab the tar ball and unpack it In the src subdirec tory you should invoke autoconf which creates the configure script For compiling CDET with file system support only simply type configure for darcs support add the option enable darcs lt darcs_dir gt where lt darcs_dir gt is the directory of your darcs sources for subversion support add the option enable svn By typing configure help you can see a list of all available options The configure script shows a short summary of the CDET compile options Now a simple make should compile CDET By make install you copy CDET into your usr local bin directory 1 4 This Manual From here this manual proceeds as follows In Chapter 2 you get to know CDET from the author perspective by a small example Chapter 3 shows details about CDET s system architecture Chapter 4 is directed to the people who formalize consistency requirements here you learn what you should provide CD
25. or collective works based on the Program In addition mere aggregation of another work not based on the Program with the Program or with a work based on the Program on a volume of a storage or distribution medium does not bring the other work under the scope of this License You may copy and distribute the Program or a work based on it under Section 2 in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following a Accompany it with the complete corresponding machine readable source code which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange or b Accompany it with a written offer valid for at least three years to give any third party for a charge no more than your cost of physically performing source distribution a complete machine readable copy of the corresponding source code to be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange or c Accompany it with the information you received as to the offer to distribute corresponding source code This alternative is allowed only for noncom mercial distribution and only if you received the program in object code or executable form with such an offer in accord with Subsection b above The source code for a work means the preferred form of the work for making mod ifications to it For an exe
26. rule You can display the S DAGs on the text console doing a cdetExProj show example_project xml lt repository state gt The optional argument lt repository state gt can be used to display the S DAGs for this repository state only For example cdetExProj show example_project xml 2 displays the S DAGs for the second repository state S DAGs visualize inconsistencies and possible repair actions The structure of an S DAG resembles that of a consistency rule Nodes represent logical connectives or atomic formulae edges target the subformulae of a connective We find inner nodes for existential quantifiers O universal quantifiers WM conjunctions and disjunctions V Leafs represent predicates i e atomic formulae Edges below quantifier nodes carry bindings of variables to values A predicate node occurs as a leaf only It contains an atomic formula responsible for an inconsistency and the truth value of e From the user perspective a universal node V represents inconsistencies re sulting from failure prone document content where each edge blames a value for inconsistencies represented by the edge s target S DAG An existential node For each project you need a project description which imports the consistency rules and sets some options 2Don t worry about the path to the actual repository it is included in the project description S DAG fi t gt 2 rule p is x gt did doct txt dState 2
27. someone else and passed on we want its recipients to know that what they have is not the original so that any problems introduced by others will not reflect on the original authors reputations Finally any free program is threatened constantly by software patents We wish to avoid the danger that redistributors of a free program will individually obtain patent 68 69 licenses in effect making the program proprietary To prevent this we have made it clear that any patent must be licensed for everyone s free use or not licensed at all The precise terms and conditions for copying distribution and modification follow GNU GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING DISTRIBUTION AND MODIFICATION 0 This License applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License The Program below refers to any such program or work and a work based on the Program means either the Program or any derivative work under copyright law that is to say a work containing the Program or a portion of it either verbatim or with modifications and or translated into another language Hereinafter translation is included without limitation in the term modification Each licensee is addressed as you Activities other than copying distribution and modification are not covered by this License they are outside
28. version management access control or deployment management On the other hand DMS RCS fail to manage semantic domain specific consistency require ments Usually authors aim to produce an overall consistent work i e certain relations between the documents are maintained These relations are however mostly implicit and vague e g Links within documents should have a valid tar get What does valid mean Why do we say should Does it mean that the rule does not need to hold always In order to achieve consistency authors have to spend plenty of time re reading and revising their own and related documents Worse each check in to the DMS RCS potentially violates consistency Larger companies define guidelines and policies for writing but still a human reviewer is needed to maintain them What prevents automatic consistency checks is that guidelines are implicit or at best informal CDET implements a formal quality control approach that is built on top of informal document engineering processes We call this kind of quality control semi formal consistency management CDET introduces explicit formal con sistency rules to capture informal quality requirements It does not require any adaptations to editing practices of authors CDET smoothly integrates consistency management into almost arbitrary DMS RCS without requiring adaptations to document engineering processes This is in sharp contrast to entirely formal consis
29. what should be done with the report gt lt not supported at the moment gt lt ELEMENT actions mailreport mailrepair mailsummary gt lt Mail the report to some recipients gt lt not supported at the moment gt lt ELEMENT mailreport recipient gt lt ATTLIST mailreport currentonly true false REQUIRED explain true false REQUIRED format xml latex REQUIRED attachDocs true false REQUIRED gt lt Mail the repair suggestions to some recipients gt lt not supported at the moment gt lt ELEMENT mailrepair recipient gt B 3 Language DTD language dtd 53 lt ATTLIST mailrepair currentonly true false REQUIRED explain true false REQUIRED format xml latex REQUIRED attachDocs true false REQUIRED gt lt Mail a comprehensive summary to some recipients gt lt not supported at the moment gt lt ELEMENT mailsummary recipient gt lt ELEMENT recipient EMPTY gt lt ATTLIST recipient mailto CDATA REQUIRED gt lt TYPES gt lt type ground or function type gt lt ELEMENT type gtype funtype gt lt groundtype gt lt ELEMENT gtype tapply tvar top state gt lt type application gt lt ELEMENT tapply gtype gt lt tcref type constructor to be applied gt lt ATTLIST tapply tcref CDATA REQUIRED gt lt type variable gt lt ELEMENT tvar EMPTY gt
30. x PString PString a Bool x a a Char Bool x PString PString Int Int Int Int l Float gt Float Float Float Int x a gt Int x PString PString PString a Bool x Char Bool x PString i PString PString a Bool x Char Bool x PString PString PString Lal a a gt al a a la a a x 8 gt h 45 Haskell like init Haskell like take take for packed Strings Haskell like drop drop for packed Strings Haskell like reverse reverse for packed Strings Haskell like Haskell like length length for packed Strings Haskell like lookup Haskell like for packed Strings Haskell like map Haskell like concatMap Haskell like foldl foldl for packed Strings Haskell like foldr foldr for packed Strings Haskell like filter advanced filter advanced negative filter Haskell like takeWhile takeWhile for packed Strings Haskell like dropWhile dropWhile for packed Strings Haskell like product for Int Haskell like sum for Int Haskell like product for Float Haskell like sum for Float Haskell like splitAt splitAt for packed Strings Haskell like span span for packed Strings Haskell like break break for packed Strings Haskell like fst Haskell like snd Haskell like maybe Haskell like cat Maybes Haskell like zip Haskell like zipWith Haskell like unzip Haskell like words Ha
31. x to the term e if the atomic formula evaluates to the truth value b the repair imposes the cost c If the type of x is a record type the hint x f gt e b c proposes to change the field f of x In order to repair a violated atomic formula k key d in rule 1 the hints in Fig 4 1 propose to either change k to key d or change the field key of d to k The former repair imposes a cost of 1 the latter repair costs 5 We consider changing a key definition more expensive because this might cause new inconsistencies regard ing other referenced keys In fact the latter repair will be preferred if changing a single key definition resolves multiple inconsistencies that could alternatively be resolved by changing multiple manuals in the above case changing a single key definition is cheaper than changing six manuals Notice that in the manuals are blamed for inconsistent manual identifiers and kinds the hints affect the variable m only A quantifier annotation tKEEP prevents S DAG generation from considering any repairs for the variable t Of course we cannot change repository states nor can we change old manual versions m Let us now formalize rule 1 in CDET s concrete XML syntax Recall that for our example we imported the rules from the file ManualsRules xml which starts as follows lt rules gt lt importlanguage path prelude RulePrelude xml gt lt importlanguage path ManualsLanguage xml gt For formalizing r
32. y KEEP repStates e V x e repDs t e Vk c refs z e Jd e concatMap kDefs repResDs t e I3 m e repManDs t e k key d False 1 pkey d key k False 5 A dId m kId d m dId gt kId d False 3 A kind m kKind d m kind gt kKind d False 2 2 Names and kinds of manuals are invariant over time At all states t we have for all manuals m at t that at all future states t2 there exists a manual ma that has the same name and the same kind as m y KEEP lt repStates e V KEEP lt repStates e ti lt to vV Y mKEEP lt repManDs t e 3 ma e repManDs t2 e dId m dId ma mz dId dId m False 3 A kind m kind m2 m2 kind kind m False 2 Figure 4 1 Example rules and 2 with hints Consistency rules are statements about repository states The rule designer formalizes consistency rules in a variant of temporal logic based on the logic proposed in AHV96 Instead of temporal operators like since or until explicit quantification over repository states is used in rules Consistency rules mostly correspond to standard first order formulae where the domain of a quantifier is described by a term As usual rules use functions and predicates which are defined in languages see Sect 4 3 For our example we define two rules as shown in Fig 4 1 For clarity we deal with hint annotations given in curly braces later in this section Rule 1 first quantifies over all re
33. CDET The Consistent Document Engineering Toolkit User Manual www unibw de inf2 CDET Jan Scheffezyk jan scheffczyk0gmx net September 6 2005 Contents 1 Introduction 1 1 What Can You Use CDET for 1 2 What is CDET not Meant for 1 3 System Requirements amp Installation tA The Manual 2 2 acs a eon Bee A a ee at 2 Working with CDET 3 CDET System Architecture 4 What You need to Input into CDET 4 1 CDET Project Description 2 2 024 4 2 Consistency Rules a a a aia 403 Languages era eed SO eo a a Ba A a 5 Technical Details 5 1 Project Description XML Syntax 0 4 5 2 Language XML SyMtaxX e o 5 3 Rule XML Syntax cipal aid 5 4 Repository Interface o 02020002 e eee 6 Future Work A The CDET Prelude Al Prelude Types 00 poa dol A 2 Prelude Predicates A 3 Prelude Functions B DTDs B 1 Project Description DTD consistencycheck dtd B 2 Rules DTD rules dtd o e B 3 Language DTD language dtd o B 4 Report DTD report dtd C GNU General Public License Bibliography gt eo a N 12 14 14 17 23 32 32 33 37 Al 42 43 43 43 44 47 47 49 53 59 68 73 Abstract When a group of authors collaboratively edits interrelated documents semantic inconsiste
34. D OR OTHER PARTIES PROVIDE THE PROGRAM AS IS WITHOUT WARRANTY OF ANY KIND EITHER EX PRESSED OR IMPLIED INCLUDING BUT NOT LIMITED TO THE IM PLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PAR TICULAR PURPOSE THE ENTIRE RISK AS TO THE QUALITY AND PER FORMANCE OF THE PROGRAM IS WITH YOU SHOULD THE PROGRAM PROVE DEFECTIVE YOU ASSUME THE COST OF ALL NECESSARY SER VICING REPAIR OR CORRECTION IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER OR ANY OTHER PARTY WHO MAY MODIFY AND OR REDISTRIBUTE THE PROGRAM AS PER MITTED ABOVE BE LIABLE TO YOU FOR DAMAGES INCLUDING ANY GENERAL SPECIAL INCIDENTAL OR CONSEQUENTIAL DAMAGES ARIS ING OUT OF THE USE OR INABILITY TO USE THE PROGRAM INCLUD ING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PRO GRAMS EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES END OF TERMS AND CONDITIONS Bibliography AHV96 C 03 CSFP04 MTH90 PJ03 Rou05 SBBS05 SBRS03a SBRS03b SBRS04a S Abiteboul L Herr and J Van den Bussche Temporal versus first order logic to query temporal databases In ACM Symp on Principles of Database Systems pages 49 57 Montreal Canada 1996 ACM Press M Chakravarty et al The Haskell Foreign Function Interface 1 0 Add
35. ET with in order to get meaningful output Chapter 5 includes a detailed description of CDET s syntax in a rather technical manner Finally in Chapter 6 lists some todo items for future works If you think some of these tasks is well suited to you or you have other suggestions please let me know I will be happy to hear This is a manual only Explaining formal details and algorithms behind CDET is out of scope You might want to look up such details in the research pa pers around CDET SBRS03a SBRS03b SBRS04b SSBS04 SBRS04c SBRS04a SBBS05 Notice that report generation as described in SBRS03a SBRSO3b SBRS04b SSBS04 is slightly out of date though the general ideas still hold If you prefer the whole story as one large volume see my PhD thesis Sch04 Chapter 2 Working with CDET In this chapter you learn how to use the command line interface of CDET Cur rently there is no GUI support so some of the interactive features are still not accessible This chapter is meant for authors who do not need to know the formal details about rules etc Chapter 3 introduces the r le model of CDET For illustration we assume a ridiculously simple toy example which is also used in some of my papers Example 2 1 Suppose you archive manuals over a long period of time Docu ments and manuals reference manuals through keys see Fig 2 1 Since names and kinds of manuals may change over time you need key resolvers mapping keys to
36. Man attrs _ ManD manD_dId document_dld d manD_dState document_dState d manD_kind Man manKind attrs fromJust err Just a a 4 3 Languages 29 fromJust err Nothing error err lt code gt lt funsymdef gt The Haskell source above looks a bit more complicated because we have to parse manuals Their names and check in states are not sufficient we need their struc ture For accessing documents in the repository you can use the repository access function parseDoc Document gt Repository gt String gt a gt Maybe a It gets the document to parse the repository and a parser function as arguments parseDoc retrieves the document from the repository and applies the parser func tion to its content given as String The parser function should convert the document content into some data structure which is returned if the document exists For convenience CDET provides a standard parser for XML documents which is generated by HaXml automatically Recall that we told CDET to do this for manuals and key resolvers in the language header Technically CDET s HaXml generates an instance for the type class XmlContent which provides the function readXml XmlContent a gt String gt Maybe a Unfortunately Haskell and HaXml both lack subtyping support such that the types generated by HaXml and by CDET respectively are not compatible HaXml generates the following types for manuals from the DTD manuals dt
37. RED gt lt argument dependencies of the symbol gt lt ELEMENT argdep EMPTY gt lt num argument number responsible for symbol result gt lt ATTLIST argdep num CDATA REQUIRED gt lt symbol properties not supported at the moment gt B 3 Language DTD language dtd 57 lt ELEMENT property simprop fungenprop predgenprop gt lt simple property gt lt ELEMENT simprop EMPTY gt lt prop simple property transitive symmetric antisymmetric asymmetric reflexive irreflexive gt lt ATTLIST simprop prop trans sym antisym asym refl irref1 REQUIRED gt lt generic properties for function symbols currently not supported gt lt pre results imply post result gt lt something similar to ghc rule pragmas gt lt ELEMENT fungenprop fpre fpost gt lt pre result arguments and function result all simple variables gt lt ELEMENT fpre arg gt lt res result variable gt lt ATTLIST fpre res CDATA REQUIRED gt lt ELEMENT arg EMPTY gt lt arg argument variable gt lt ATTLIST arg id CDATA REQUIRED gt lt post result arguments and function result gt lt ELEMENT fpost arg gt lt ATTLIST fpost res CDATA REQUIRED gt lt generic properties for predicate symbols gt lt pre results imply post result gt lt something similar to ghc rule pragmas gt lt ELEMENT predg
38. ST recipient mailto CDATA REQUIRED gt lt TYPES gt lt type ground or function type gt lt ELEMENT type gtype funtype gt lt groundtype gt lt ELEMENT gtype tapply tvar top state gt lt type application gt lt ELEMENT tapply gtype gt lt tcref type constructor to be applied gt lt ATTLIST tapply tcref CDATA REQUIRED gt lt type variable gt lt ELEMENT tvar EMPTY gt lt id type variable name gt lt ATTLIST tvar id CDATA REQUIRED gt lt supertype of all other types gt lt ELEMENT top EMPTY gt lt repository state gt lt ELEMENT state EMPTY gt lt function type gt lt ELEMENT funtype type gtype gt Appendix C GNU General Public License Version 2 June 1991 Copyright 1989 1991 Free Software Foundation Inc 59 Temple Place Suite 330 Boston MA 02111 1307 USA Everyone is permitted to copy and distribute verbatim copies of this license document but changing it is not allowed Preamble The licenses for most software are designed to take away your freedom to share and change it By contrast the GNU General Public License is intended to guarantee your freedom to share and change free software to make sure the software is free for all its users This General Public License applies to most of the Free Software Foundation s software and to any other program whose authors
39. alue gt lt ELEMENT vundefined EMPTY gt lt consistency rule gt lt ELEMENT rule meta formula actions gt lt date creation date of the rule gt lt if a rule is newer than the last runtime then it might have been changed gt lt id rule identifier gt lt ATTLIST rule date CDATA REQUIRED id ID REQUIRED gt lt rule name gt lt rule metadata gt lt ELEMENT meta desc precon docdeps gt 63 64 DTDs lt weight rule weight importance 0 100 gt lt weak may a rule be violated gt lt ATTLIST meta weight CDATA 0 weak true false REQUIRED gt lt consistency rule description gt lt ELEMENT desc PCDATA gt lt rule precondition this rule must be satisfied in order to check gt lt the current rule not supported at the moment gt lt ELEMENT precon EMPTY gt lt ATTLIST precon id CDATA REQUIRED gt lt document dependencies of a rule gt lt not given dont know determined by CDET gt lt given these are the documents the rule really depends on gt lt ELEMENT docdeps docdep gt lt ELEMENT docdep EMPTY gt lt regex regular expression of the dependent documents gt lt ATTLIST docdep regex CDATA REQUIRED gt lt first order formula with description gt lt ELEMENT formula desc papply or and not implies forall exists gt lt
40. applied predicate atomic formula plus hint collection gt lt ELEMENT papply term hintalternative gt lt id predicate symbol gt lt ATTLIST papply id CDATA REQUIRED gt lt alternative hint set gt lt ELEMENT hintalternative set gt lt hint gt lt ELEMENT set term gt lt condition predicate result causing to execute the hint gt lt id variable to set gt lt varcondition set variable only if it is marked new old gt lt field record field to set gt lt repaircost cost imposed by executing the corresponding repair gt lt ATTLIST set condition true false IMPLIED id CDATA REQUIRED varcondition new old IMPLIED B 4 Report DTD report dtd 65 field CDATA IMPLIED repaircost CDATA IMPLIED gt lt disjunction gt lt ELEMENT or formula formulat gt lt conjunction gt lt ELEMENT and formula formulat gt lt implication gt lt ELEMENT implies formula formula gt lt negation gt lt ELEMENT not formula gt lt universal quantification with sphere term gt lt ELEMENT forall term formula gt lt id quantified variable gt lt action hint for quantifier gt lt keepdomain also cache the quantifier sphere may be expensive gt lt ATTLIST forall id CDATA REQUIRED action delete add change keep IMPLIED keepdomain true
41. askell like head head for packed Strings Haskell like last Haskell like tail tail for packed Strings A 3 Prelude Functions init take takes drop drops reverse reverses diff length lengths lookup indexs map concatMap foldl foldlS foldr foldrs filter filterElem filterNotElem takeWhile takeWhileS dropWhile dropWhileS productInt sumInt productFloat sumFloat splitAt splitAts span spans break breaks fst snd maybe catMaybes zip zipWith unzip words lines wordss linesS unwords a a 3 gt a a B B x a gt b x Maybe a gt 8 Maybe a a a x 8 gt a 6 Sa le Ka B gt led 8 Char Char Char Char PString PString PString PString Char Char a gt a Int x a a Int x PString PString Int x a a Int x PString PString a gt a PString PString Q x a a a Int PString Int a x a 8 Maybe 8 a x Int gt a PString x Int Char a B x la gt 6 gt 6 x la gt 8 a axB gt 0 x0ax 6 gt 0 a x Char gt a x a x PString gt a a x P gt p x Bx a gt P Char x 8 8 x B x PString gt 3 l l a Bool x a a 8 x a gt 8 x a a f x a gt 8 x a gt a Bool x a a Char Bool
42. ate symbol name gt lt ATTLIST predsymdef id CDATA REQUIRED gt 56 DTDs lt implementation of symbols Haskell source code gt lt ELEMENT code PCDATA gt lt SYMBOL METADATA for function and predicate symbols gt lt description gt lt ELEMENT desc PCDATA gt lt detailed metadata gt lt ELEMENT meta docdep strdep docargdep cachemeta gt lt calcStates does the symbol calculate over repository states gt lt ATTLIST meta calcStates true false IMPLIED gt lt document dependencies of the symbol gt lt ELEMENT docdep EMPTY gt lt regex regular expression of documents the result depends on gt lt ATTLIST docdep regex CDATA REQUIRED gt lt string dependencies of the symbol gt lt ELEMENT strdep EMPTY gt lt regex regular expression of strings the may symbol result in gt lt ATTLIST strdep regex CDATA REQUIRED gt lt document argument dependencies of the symbol gt lt ELEMENT docargdep EMPTY gt lt num argument number responsible for document access gt lt ATTLIST docargdep num CDATA REQUIRED gt lt cache metadata for a symbol gt lt ELEMENT cachemeta argdep property gt lt cached should the symbol be cached gt lt reftrans is the symbol referentially transparent gt lt ATTLIST cachemeta cached true false REQUIRED reftrans true false REQUI
43. ble only if it is marked new old gt lt field record field to set gt lt repaircost cost imposed by executing the corresponding repair gt lt ATTLIST set condition true false IMPLIED id CDATA REQUIRED varcondition new old IMPLIED field CDATA IMPLIED repaircost CDATA IMPLIED gt lt disjunction gt lt ELEMENT or formula formulat gt lt conjunction gt B 2 Rules DTD rules dtd 51 lt ELEMENT and formula formula gt lt implication gt lt ELEMENT implies formula formula gt lt negation gt lt ELEMENT not formula gt lt universal quantification with sphere term gt lt ELEMENT forall term formula gt lt id quantified variable gt lt action hint for quantifier gt lt keepdomain also cache the quantifier sphere may be expensive gt lt ATTLIST forall id CDATA REQUIRED action delete add change keep IMPLIED keepdomain true false IMPLIED gt lt existential quantification with sphere term gt lt ELEMENT exists term formula gt lt id quantified variable gt lt action hint for quantifier gt lt keepdomain also cache the quantifier sphere may be expensive gt lt ATTLIST exists id CDATA REQUIRED action delete add change keep IMPLIED keepdomain true false IMPLIED gt lt quantified variable gt lt TERMS gt
44. collection containing alternative repair sets which each contains repairs that together resolve all inconsistencies for all your rules 1 1 What Can You Use CDET for Of course you can use CDET for maintaining semantic consistency requirements in a tolerant way during your document engineering projects Besides pure con sistency management there are some other application areas e You can use temporal consistency rules for supervising and enforcing devel opment processes e You can categorize documents on how they conform to some consistency rules e You can even burn your CPU by CDET 1 2 What is CDET not Meant for At date CDET is in a very preliminary state I consider it software Interfaces to DMS RCS are quite stable so CDET should not corrupt your data However I would not recommend CDET for mission critical tasks in high budgeted enterprise projects Sure I am confident this changes in the near future CDET is not meant to ensure data integrity locking or to implement access rights this should be implemented by the DMS RCS you are using CDET does not take off the responsibility to think about semantic consistency in fact it strongly encourages such thoughts Worse you cannot use CDET for cooking coffee except by using the heat produced by your CPU or carrying out your garbage 4 Introduction 1 3 System Requirements amp Installation CDET has been tested on two Debian Linux boxes it should run
45. cutable work complete source code means all the source code for all modules it contains plus any associated interface definition files plus the scripts used to control compilation and installation of the executable However as a special exception the source code distributed need not include anything that is normally distributed in either source or binary form with the major components compiler kernel and so on of the operating system on which the executable runs unless that component itself accompanies the executable If distribution of executable or object code is made by offering access to copy from a designated place then offering equivalent access to copy the source code from the same place counts as distribution of the source code even though third parties are not compelled to copy the source along with the object code You may not copy modify sublicense or distribute the Program except as expressly provided under this License Any attempt otherwise to copy modify sublicense or distribute the Program is void and will automatically terminate your rights under this License However parties who have received copies or rights from you under this License will not have their licenses terminated so long as such parties remain in full compliance You are not required to accept this License since you have not signed it How ever nothing else grants you permission to modify or distribute the Program or 71 its derivati
46. d you can view them in the generated Haskell module haske11 DTD_manual hs data Man Man Man_Attrs PackedString deriving Eq Show data Man_Attrs Man_Attrs manKind PackedString deriving Eq Show CDET generates the following type for manuals from our language definition you can view them in the generated Haskell module haskel1 ManualsLanguageTypes hs data ManD ManD 4 manD_dld PString manD_dState State manD_kind PString deriving Eq Ord In Haskell CDET s general document type looks as follows data Document Document document_dId PString document_dState State deriving Eq Ord For bridging these types we have defined the helper function mkMan above I admit that this is the most complex task in the work with CDET But you have to do this only once Let us risk a look at the definition of the function repResDs repResDs State ResD get all resolver documents Here we have to bridge the following HaXml types you can find them in the file haskell DTD_resolver hs You might ask at what repository state the document is accessed its the check in state of the document given as first argument 5I plan to overcome this severe shortcoming by a HaXml extension that supports a subset of XML Schema and subtypes SNotice that PString is just a type synonym for PackedString 30 What You need to Input into CDET newtype Res Res KDef deriving Eq Show data KDef KDef KDef_Attrs
47. d attributes id the predicate symbol name desc A comprehensive description of the predicate symbol meta Predicate symbol metadata Optional attributes calcStates does the symbol calculate over repository states If the attribute is not given CDET is optimistic and assumes that the symbol does not calculate over repository states docdep Document dependencies Required attributes regex List here a pattern of files the predicate symbol directly accesses via a repository access func tion I hope to calculate these patterns in the future docargdep Argument dependencies of a symbol Required attributes num Give here the argument number re sponsible for document access I hope to calculate these num bers in the future cachemeta Additional metadata to influence caching be havior Required attributes cached should the symbol be cached reftrans is the symbol referentially transparent i e does it not use repStates or repHead In general Haskell guarantees referential transparency of each symbol The repository access functions repStates or repHead can however change their result between consistency checks By the subelement argdep you specify the argument numbers the result of the predicate symbol depends on Usually these are all argument numbers but sometimes not Technical Details predtype The predicate type You need to specify the argu ment types only because each predicate has a boo
48. d values by inspecting their key labels only So by speci fying only a few key labels you can speed up comparisons and thus consistency checking significantly But you must guarantee that records are distinguishable by their key labels only otherwise you get rather strange results For document types i e subtypes of Document it is safe to make all record labels non key labels because the repository guarantees that given a repository state and a document name we find at most one document in the repository You might wonder about the type PString For efficiency reasons CDET uses an array based String type in contrast to Haskell where Strings are lists of Characters Internally CDET uses the FastPacketString library from the DARCS Rou05 distribution 4 3 Languages 27 Since we do not define new predicate symbols we include lt predsymdefs gt and go on with the definition of our first function symbol repDs State Document get all documents in the repository in XML lt funsymdefs gt lt funsymdef id repDs gt lt desc gt Documents in the repository lt desc gt lt meta gt lt docdep regex xml gt lt docdep regex txt gt lt cachemeta cached true reftrans true gt lt argdep num 0 gt lt cachemeta gt lt meta gt lt funtype gt lt type gt lt gtype gt lt state gt lt gtype gt lt type gt lt gtype gt lt tapply tcref gt lt gtype gt lt tapply tcref Docume
49. e project description or rule set Implied attributes qualifiedPrelude import Prelude qualified with this name qualifiedChar import Char qualified with this name qualifiedRatio im port Ratio qualified with this name qualifiedComplex import Complex qual ified with this name qualifiedNumeric import Numeric qualified with this name qualifiedIx import Ix qualified with this name qualifiedList im port List qualified with this name qualifiedMaybe import Maybe qualified with this name qualifiedRandom import Random qualified with this name qualifiedTime import Time qualified with this name By default the Haskell Prelude is imported unqualified the other Haskell mod ules are imported fully qualified e desc A comprehensive description for your language e importlanguage Import another language such that you can use symbols of the imported language for symbol definition Required attributes path the XML file holding the imported language Optional attributes qualifiedAs qualified import with a given name e importmodule Import an auxiliary Haskell module Required attributes module the name of the Haskell module Optional attributes qualifiedAs qualified import with a given name importparser Generate and import a parser for a given DTD The parser will be generated by HaXml Required attributes dtd the name of the DTD for which the Parser should be generated Optional attributes qualifiedAs qualifi
50. ed import with a given name 34 Technical Details e importforeign Import a foreign function say from a C library cur rently only C is supported Required attributes call calling convention currently cca11 only header the C header file foreign the name of the foreign function haske11 the name of the corresponding Haskell function The following subelements specify the type of the Haskell function You need extra XML elements for this in order to prevent XML parsing errors fargtype Argument types of the imported function s Haskell side Required attributes id the argument type name frestype The result type of the imported function s Haskell side Required attributes id the result type name typedefs The type definitions of the language You may define as many types as you want Notice however that their names must be unique typedef A type definition which can be either atomic record or variant Required attributes id the type constructor name desc A comprehensive description of the type One of the following type definitions is permitted atomic An atomic type like Int or Char record A record type like Document Required attributes rcon the record constructor not to be con fused with the record type constructor although often the same tvar Type variables arguments for the record type Required attributes id the type variable name supers Supertypes of
51. ell is your temporal directory for CDET here the project specific binary is placed after compilation Also the directory contains a subdirectory cache xml contains three XML files project description rules language 4 1 CDET Project Description Let us first review the project manager s task CDET uses the project description as the entry point to your project each project should have its own project descrip tion For clarity I do not recommend sharing project descriptions across projects in contrast to rules and languages For our example we use the following project description file ManualsCheck xml located in the subdirectory xm1 14 4 1 CDET Project Description 15 lt consistencycheck name Manuals reportdtd src dtd report dtd tmpdir haskell outputdir out repositorybase darcs repositorytype darcs gt lt performance usecache true fastprelude true incremental true filterrules true optimizerules true gt lt repair partialResolution false maxsets 10 gt lt metrics gt lt CDATA sum metrics repairMetrics rsi rs2 sum map rat rsi gt sum map rat rs2 where rat rep changeRat rep offendsRat rep costRat rep punish deletion changeRat rep case change rep of ActDelete v gt if isDocVal v isDocListVal v then 15 else 10 gt 0 punish offending rules offendsRat rep length offends rating rep costs of a repair costRat rep
52. endum to the Haskell 98 Report 2003 see www cse unsw edu au chak haskell ffi B Collins Sussman B W Fitzpatrick and C M Pilato Version Control with Subversion O Reilly and Associates 2004 R Milner M Tofte and R Harper The Definition of Standard ML MIT Press 1990 S L Peyton Jones Haskell 98 Language and Libraries The Revised Report Cambridge University Press 2003 D Roundy DARCS David s advanced revision control system 2005 see www darcs net J Scheffezyk U M Borghoff A Birk and J Siedersleben Pragmatic consistency management in industrial requirements specifications In Proc of the 8rd IEEE Int Conf on Software Engineering and Formal Methods page to appear Koblenz Germany 2005 IEEE CS Press J Scheffczyk U M Borghoff P R dig and L Schmitz Consistent document engineering In Proc of the 2003 ACM Symp on Document Engineering pages 140 149 Grenoble France 2003 ACM Press J Scheffcezyk U M Borghoff P R dig and L Schmitz Efficient in consistency management for heterogeneous repositories In Proc of the 4th Int Conf on Software Engineering Artificial Intelligence Net working and Parallel Distributed Computing pages 370 377 Liibeck Germany 2003 ACIS J Scheffezyk U M Borghoff P Rodig and L Schmitz Managing inconsistent repositories via prioritized repair actions In Proc of the 2004 ACM Symp on Document Engineering pages 137 146 Milwau
53. eneration by giving hints that tell CDET about how the truth values of atomic formulae can be inverted x setx A hint inside a hint set All hints in a hint set are executed simultaneously Essentially a hint tells CDET about how a variable should be changed in order to invert the truth value of the atomic formula at hand Required attributes id the variable to which the hint applies Optional attributes condition The truth value of the atomic for mula responsible for executing the hint this truth value should be inverted If the attribute is not given CDET will execute the hint on either truth value varcondition Should the hint be executed if the variable is new content has changed since the last check in or old content re mained constant since the last check in If the attribute is not given CDET will execute the hint always field If the variable s type is a record type you can specify the record label to be changed If the attribute is not given the hint applies to the full value of the variable repaircost You can specify the cost of the hint in order to get the cheapest repairs only If the attribute is not given nothing is assumed about the cost hence you might get more repair alter natives By the subelement term you tell CDET how the variable or the field should be changed Make sure that the term s type corresponds to the variable s type or the field s type if given Otherwise you get a type error
54. enprop ppre ppost gt lt ELEMENT ppre arg gt lt ATTLIST ppre res true false REQUIRED gt lt ELEMENT ppost arg gt lt ATTLIST ppost res true false REQUIRED gt lt TYPE DEFINITIONS gt lt type definitions gt lt ELEMENT typedefs typedef gt lt type definition atomic record variant gt 58 DTDs lt ELEMENT typedef desc atomic record variant gt lt id type constructor name gt lt ATTLIST typedef id CDATA REQUIRED gt lt atomic type like Int Char gt lt ELEMENT atomic EMPTY gt lt record type argument type variables super types labels gt lt only argument tyvars can be used in definition gt lt ELEMENT record tvar supers labelx gt lt rcon record constructor gt lt ATTLIST record rcon CDATA REQUIRED gt lt super types of a record gt lt ELEMENT supers gtype gt lt record label gt lt ELEMENT label gtype gt lt id label name gt lt key the label belongs to the key of a record gt lt records can be identified by _only_ looking at their key labels gt lt only key labels are used for value comparison and in repairs gt lt ATTLIST label key true false IMPLIED id CDATA REQUIRED gt lt variant type argument types subtypes constructors gt lt ELEMENT variant tvar subs vcon gt lt subtypes of a variant
55. es fundamental functions predicates and types In user defined languages a language designer defines domain specific func tions predicates and types The semantics of functions and predicates is defined in Haskell PJ03 a statically typed purely functional programming language Haskell provides access to other libraries via a foreign function interface CT 03 Such libraries offer sophisticated functionality e g for parsing documents or heuristics for semantic content analysis Formal document types usually corre sponding to document templates can be expressed by record or variant types This makes CDET independent of any particular document format and facilitates 12 13 external libs XPath heuristics annotated a predicate logic language functions language types predicates designer A formalized partly we amp type check derived document consen rule templates rules amp hints designer PIS a project Tne project templates cdot compile rules amp hints project manager instance of consistency p check in check cdetPrj check author SN cone repository A apply managed by DMS RCS eee S DAGs X gt 1 repair set 2 repair set cdetPrj derive a augmented S DAGs cdetPrj augment Figure 3 1 CDET System Architecture oval
56. f kDef_key Res kDefKey attrs kDef_kId Res kDefKId attrs kDef_kKind Res kDefKKind attrs lt code gt lt funsymdef gt What is left is the function refs which scans an arbitrary document for referenced keys refs Document String keys referenced in a document Below we tell CDET that a key begins with the letters kaA and is separated from other text by white spaces Sure you can imagine more complex definitions than the one below lt funsymdef id refs gt lt desc gt Key references inside a document lt desc gt lt meta gt lt docargdep num 0 gt lt cachemeta cached true reftrans true gt lt argdep num 0 gt lt cachemeta gt lt meta gt lt funtype gt lt type gt lt gtype gt lt tapply tcref Document gt lt gtype gt lt type gt lt gtype gt lt tapply tcref gt lt gtype gt lt tapply tcref PString gt lt gtype gt lt tapply gt lt gtype gt lt funtype gt lt code gt refs repo d fromJust err d parseDoc d repo getRefs where err d refs internal error reading document show d getRefs String gt PString getRefs content map packString filter isKeyRef words content isKeyRef w keyprefix take 3 w keyprefix kaA lt code gt lt funsymdef gt lt funsymdefs gt lt language gt That s it you have defined your first CDET language Sure defining languages suffers from poor tool support In the future I will particula
57. gt lt S DAG gt lt ELEMENT sdag leaf allofng oneofng allofq oneofq abandoned ref gt lt predicate leaf gt lt ELEMENT leaf desc suggestalternative gt lt psym predicate symbol gt lt cons is psym responsible for an inconsistency gt lt result boolean result of psym gt 60 DTDs lt ATTLIST leaf id CDATA REQUIRED psym CDATA REQUIRED cons consistent inconsistent REQUIRED result true false REQUIRED gt lt alternative predicate suggestions set gt lt ELEMENT suggestalternative suggest gt lt predicate suggestion gt lt ELEMENT suggest sugsetsuchthat sugdontknow sugset sugcontradict gt lt set the value of a variable to the specified value gt lt ELEMENT sugset value gt lt var variable to set gt lt field field in the variable to set gt lt cost repair cost gt lt ATTLIST sugset var CDATA REQUIRED field CDATA IMPLIED cost CDATA IMPLIED gt lt set a variable such that the specified predicate holds or not gt lt ELEMENT sugsetsuchthat value gt lt var variable to set gt lt bool targeted boolean value gt lt ATTLIST sugsetsuchthat var CDATA REQUIRED bool true false REQUIRED gt lt don t know how to change the variable gt lt ELEMENT sugdontknow EMPTY gt lt var variable to set gt lt bool targe
58. he S DAG for you find the lower right leaf labeled Ref gt 6 its incoming edge actually points to the node labeled by the number 6 which you find on the left hand side For efficiency reasons S DAGs are not fully shared In an augmented S DAG quantifier edges carry repair actions marked grey An action proposes to either add a value to Add or change a value within Chg or delete a value from the sphere of this quantifier Del An action KEEP indicates that the value must not be changed For our example CDET proposes to change the key reference kaA2 to kaA3 or to delete the document docl txt From an augmented S DAG you can choose repair actions interactively CDET applies a chosen repair action to an S DAG Since CDET s current implementation lacks a graphical user interface this feature is not available at date i e it is implemented but you cannot access it Notice that still it is unclear which repair actions in an S DAG must be applied together in order to resolve all inconsistencies You can instruct CDET to generate the repair collection for both consistency rules thus cdetExProj derive example_project xml lt repository state gt By the optional argument lt repository state gt you can choose the repository state for which the repair collection should be derived But notice that the aug mented S DAGs must include this state otherwise you will get an empty repair collection If this argument is not given t
59. he quantified variable has the type 7 formula The argument formula here the quantified variable is in scope and may be referenced in terms exists Existential quantification attributes and subelements are sim ilar to those of universal quantification Required attributes id The quantified variable Optional attributes action Here you can specify a preferred repair ac tion for the quantified variable The following actions are supported add only derive repairs to add something change only derive repairs to change something keep don t derive any repairs for this variable keepdomain Cache the quantifier domain term A term that tells CDET where the values of the bound variable come from formula The argument formula of the existential quantification here the quantified variable is in scope 5 4 Repository Interface 41 5 4 Repository Interface In this section you learn details about CDET s generic interface to the repository of an arbitrary DMS RCS As already mentioned CDET only makes few assumptions about the underlying DMS RCS The interface consists of five Haskell functions which you can use for implementing functions and predicates the abstract Haskell data type Repository represents the repository itself repStates Repository gt State returns all states of a given repos itory repHead Repository gt State returns the current repository state repDocs Reposi
60. he repair collection will contain repairs that derive all inconsistencies at all repository states which is probably not what you want During repair derivation the repository is not accessed and there fore not locked In contrast to S DAG augmentation you will notice that repair derivation can take a long time because of the combinatorial explosion of repair alternatives CDET tries to reduce them but it cannot succeed in all cases The repair collection contains alternative repair sets each including repairs that are necessary and sufficient to repair all inconsistencies in the repository CDET guarantees that repairs within each set do not contradict each other and altogether resolve the inconsistencies Repairs may however introduce new inconsistencies Fig 2 5 shows some repairs derived from our example S DAGs at state 2 The first component of a repair constitutes its sphere the second component denotes the proposed action For example the repair rep affects the variable k in the rule The repair proposes to change the key kaA2 into kaA3 in the document docl txt at repository state 2 The repair resolves one inconsistency for a high priority rule may violate rule 1 and costs 1 4 The repair collection below shows how the repairs in Fig 2 5 are best combined in order to resolve all inconsistencies at state 2 1 rep COD maJ 2 TeDa TePm mz f 3 TePrano A 4 PED ser La 3 Also derivation will need ple
61. his License would be to refrain entirely from distribution of the Program If any portion of this section is held invalid or unenforceable under any particular circumstance the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims this section has the sole purpose of protecting the integrity of the free software distribution system which is implemented by public license practices Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system it is up to the author donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice This section is intended to make thoroughly clear what is believed to be a conse quence of the rest of this License If the distribution and or use of the Program is restricted in certain countries either by patents or by copyrighted interfaces the original copyright holder who places the Program under this License may add an explicit geographical distribution limitation excluding those countries so that distribution is permitted only in or among countries not thus excluded In such case this License incorporates the limitation as if written in the
62. i const2 id succInt predInt Int Int Int Int absInt mod divMod succFloat predFloat Float Float Float Float absFloat round floor ceiling mkFloat maximum minimum appends head heads last tail tails Float Float Float Float Float Float Int Float Int Float Int Int Float Al gt a Int Int Int A gt 0 Int Int Int Int Int x Int Int Int Int Int Int Int Int Int Int Int Int Int Int Int Int Int Int Int Int Int Float Float x Float Float x Float Float x Float Float x Float Float x Float Float Float a gt a a x a a PString x PString PString a gt a PString Char a gt a a gt a PString PString 0 1 2 Haskell like id Haskell like succ for Int Haskell like pred for Int Haskell like for Int Haskell like for Int Haskell like for Int Haskell like div for Int Haskell like abs for Int Haskell like mod Haskell like divMod Haskell like succ for Float Haskell like pred for Float Haskell like for Float Haskell like for Float Haskell like for Float Haskell like for Float Haskell like abs for Float Haskell like round Haskell like floor Haskell like ceiling Convert an Int into a Float Haskell like maximum Haskell like minimum Haskell like for packed Strings H
63. ill assist you in this A term consists of a comprehensive description and the actual term desc A comprehensive description of the term Here I plan to include some automatic translation into natural language CDET s notion of terms is a bit more expressive than in classic pred icate logic CDET supports variables function symbol application function symbols as such record construction variant deconstruc tion and explicitly typed terms to guide type inference A term may be one of the following var A variable Make sure that you only reference variables that are bound by a quantifier else you get a type error CDET supports closed formulae only fapply A function symbol application Required attributes id The function symbol to apply Make sure that the symbol is defined in a language imported by this rule set term Argument terms of the function symbol applica tion Make sure that the result types of the argument terms fit to the function symbol s type modulo polymorphism and subtyping of course CDET s type checker will assist you in this sym A function symbol as argument of a higher order func tion Required attributes id The function symbol s name Make sure that the symbol is defined in a language imported by this rule set rec A record construction Required attributes id The record data constructor not to be confused with the record type constructor Make sure that the record data constr
64. ion type or predicate type from the result type x separates argument types resembles XML Schema subtyping via extension and restriction W3C01 Record and variant type definitions induce new function symbols labels and constructors respectively For example ResD induces kDefs ResD Item We regard predicate symbols as function symbols with a boolean result type denoted by Bool Function symbols starting with rep provide access to documents within the repository at a given state For a document d refs d returns all referenced keys concatMap f xs applies the function f to each member of the list zs and concatenates the result lists Record labels can serve as parameter for concatMap e g in concatMap kDefs repResDs t In addition we have to define the semantics of the symbols in our new ManualsLanguage still no one tells CDET what the symbols really do For semantics definition CDET employs Haskell Let us define our ManualsLanguage in CDET s concrete XML syntax Re call that we have imported the file ManualsLanguage xml in our rule definition ManualsLanguage xml starts as follows lt language id ManualsLanguage gt lt importlanguage path prelude RulePrelude xml qualifiedAs P gt lt importparser dtd dtd manual dtd qualifiedAs Man gt lt importparser dtd dtd resolver dtd qualifiedAs Res gt 4 3 Languages 25 Above we import the basic language Prelude such that we can define our ne
65. itory is not accessed and therefore not locked Fig 2 3 shows the augmented S DAGs for the repository state 2 you can view them by another cdetExProj show example_project xml In addition cdetExProj augment generates quite nice SVG pictures which you can view with your favorite WWW browser and a structured XML output of the S DAGs The name of the SVG picture is _SDAG xml svg for rule and 2_SDAG xml svg for rule Ha respectively You find these files in a directory given in the project description file Fig 2 4 shows the generated SVG files They look a bit different from the ideal S DAGs because currently CDET s implementation does not use Chg Id man pa dState 2 kind technical M Report for rule phil violated ny 13 did docl txt dState 2 1 fdld docl txt dState 2 po 12 kaA3 Del kaA3 kaA2 Chg kaA3 gr 8 m 10 did ma dState m1 10 did man1 xml dState 1 did dState Chg ld fmanl re dState 2 kind technical M 3 Figure 2 4 Example generated SVG files g 11 Id man1 xml Jmanl xml kKind technical M nd technical M key kaA3 3 id manl xml at kKind technical M key kaAZ t 10 Working with CDET sophisitcated graph drawing algorithms I expect this to change in the near fu ture In the SVG graphics each node is identified by a unique number that may be referenced In t
66. k gt kaA3 d fkoy kaA3 kld man1 amiy y l sg lis kaA3 kld man xmi kKind technical M kKind technical M False k key d k kaA2 H gt kaA3 1 d key kaA3 gt kaA2 5 mo man1 xml dState 2 kind field M False kind m kKind d m kind field M technical M 2 Abandoned t gt 2 t gt 1 DA S DAG for rule 7 pls man1 xml dState 15 Abandoned kind technical M m gt Mag bi man1 xml dState 2 kind field M Abandoned False kind m kind m m kind field M gt technical M 2 Figure 2 2 Example S DAGs represents an inconsistency resulting from missing document content This con tent could be either really missing or it could be regarded as missing because one of the edges carries defective content Below conjunction nodes Y and universal nodes V each S DAG must be repaired In contrast it is sufficient to repair only one S DAG below a disjunction node V and existential node O respectively At state 2 CDET generates the S DAGs shown in Fig 2 2 In the leaves you find atomic formulae responsible for inconsistencies At the edges below quantifier nodes you find values blamed for inconsistencies Let us read the S DAG for from its root to the leaves At the second repository state t gt 2 you have an inconsistent document docl txt in your rep
67. lean result type type Argument types of the predicate symbol see be low code Finally the Haskell code of the predicate symbol s im plementation The first parameter of each symbol is the repository type Repository Otherwise you could not access the repository Then follow the parameters as specified by the predicate symbol s type above Don t forget to enclose the code by a CDATA section in order to avoid XML parsing errors e funsymdefs Function symbol definitions You can define as many function symbols as you want Notice however that their names must be unique funsymdef A function symbol definition Required attributes id the function symbol name desc A comprehensive description of the function symbol meta Function symbol metadata see above funtype The function symbol s type consisting of general argument types and a ground result type type Argument types which can be either ground types gtype or function types funtype This supports the defi nition of higher order functions which result in a non function type This restriction is necessary to keep CDET s logic first order One of the following subelements is supported gtype ground type funtype function type gtype A ground type i e a non function type Here the ground type is the result type of a function symbol A ground type can be either a type application tapply a type variable tvar the supertype
68. lt term gt lt fapply id refs gt lt term gt lt var id x gt lt term gt lt fapply gt lt term gt lt formula gt lt exists id d gt lt term gt lt fapply id concatMap gt lt term gt lt sym id kDefs gt lt term gt lt term gt lt fapply id repResDs gt lt term gt lt var id t gt lt term gt lt fapply gt lt term gt lt fapply gt lt term gt lt formula gt lt exists id m gt lt term gt lt fapply id repManDs gt lt term gt lt var id t gt lt term gt lt fapply gt lt term gt We express universal quantification by forall and existential quantification by exists Both elements carry the mandatory attribute id denoting the quanti 4 2 Consistency Rules 21 fied variable and the optional attribute action denoting a quantifier annotation Immediately after the quantification tag we use the element term to denote the quantifier sphere term There fapply denotes the application of a function to argument terms and var denotes a variable Both elements carry the mandatory attribute id denoting the function name and variable name respectively By sym we denote an un applied function that serves as a parameter to a higher order functions e g lt sym id kDefs gt The subformula of a quantification is intro duced by another formula element Let us concentrate on the three way conjunction below the existential quantifier for m k gt key d False 1 EN
69. ment This is important for efficient consistency checking The record type ResD resembles the key resolver structure see Fig 2 1 on pg 5 ResD inherits all record labels from its supertype Document Our notion of subtyping 3Roughly Prelude defines large part of Haskell s Prelude hence the name 24 What You need to Input into CDET Extract from language Prelude predefined Type definitions String strings a I C axla Haskell like lists Document Document dId String dState State documents Predicate symbol definitions Va axa Bool equality Va axa Bool less than or equal Va axa Bool less than Function symbol definitions concatMap Va 3 a 6 xla 8 Haskell like concatMap repStates State all repository states Language ManualsLanguage defined by the language designer imports Prelude Type definitions ManD lt Document Man kind String manual documents ResD lt Document Res kDefs KDef key resolver documents KDef KDef key String kId String kKind String key definitions Function symbol definitions repDs State Document get all documents in the repository repManDs State ManD get all manual documents repResDs State ResD get all resolver documents refs Document String keys referenced in a document Figure 4 2 Example types and symbols the operator separates a symbol from its type separates the argument types of a funct
70. n state A 2 Prelude Predicates Notice that in Haskell equality and inequality require the type class Eq In contrast CDET lets GHC derive instances for the type class Eq for each type such that equality and inequality are fully polymorphic The same holds for class members of the Haskell type class Ord a X a Bool equality key fields only fullEq a x a Bool full structural equality f a X a Bool inequality key fields only lt a X Bool less than x X a Bool less than or equal gt aux a Bool greater than gt aux a Bool greater than or equal all a gt Bool x a Bool Haskell like all any a gt Bool x a gt Bool Haskell like any 43 44 elem a x a Bool elemS Char x PString Bool notElem a x a Bool even Int Bool odd Int Bool null a Bool nullS PString Bool and Bool gt Bool or Bool gt Bool A 3 Prelude Functions The CDET Prelude Haskell like elem elem for packed Strings Haskell like notElem Haskell like even Haskell like odd Haskell like null null for packed Strings Haskell like and Haskell like or Notice that CDET does not support constant values in formulae such that we provide some numerical constants as function symbols with no parameter Notice further that due to lack of type classes CDET distinguishes functions for number calculations w r t their type Int or Float const0O const
71. ncies occur almost immediately Current document management sys tems short DMS provide useful mechanisms such as document locking and ver sion control but often lack consistency management facilities At best semantic consistency is defined via informal guidelines which do not support automatic consistency checks The Consistent Document Engineering Toolkit short CDET aims at over coming these shortcomings It is a result of my research at the Universit t der Bundeswehr Miinchen since 2002 CDET complements DMS or revision control systems short RCS by consistency management The overall goal of the CDET is to provide a means by which you can formalize semantic consistency require ments such as business rules check your documents for consistency automat ically and derive repairs for inconsistencies The crucial point is however to tolerate inconsistencies which are commonly considered a natural consequence of multi author environments Chapter 1 Introduction Larger bodies of writing such as books technical documentations or software specifications contain many interrelated heterogeneous documents Typically a whole team of authors is responsible for producing and maintaining these doc uments In supporting a multi author environment document management sys tems DMS and revision control systems RCS store documents in repositories On the one hand DMS RCS provide fundamental management facilities e g
72. nt gt lt gtype gt lt tapply gt lt gtype gt lt funtype gt lt code gt repDs repo t ds where ds repDocs xml t repo repDocs txt t repo lt code gt lt funsymdef gt Above we have defined the new function symbol repDs which accesses the doc uments ending with xml or txt The symbol is referentially transparent and CDET should cache the symbol results in order to improve performance In ad dition the symbol result depends on its first argument CDET starts counting by zero only Referential transparency and annotations of accessed documents are essential for efficient consistency checking I plan to implement some syntactic analysis that automatically retrieves these information out of the symbol s source code for the moment however you have to provide these annotations by your own After the symbol metadata we define the symbol type Above we say that the argument type is State the result type is a list of documents The funtype element denotes a function type consisting of some argument types and a result type The type element comprises both function types and ground types element 28 What You need to Input into CDET gtype A function type must always result in a ground type in order to ensure the first order properties of CDET s logic Finally we define the semantics of the symbol repDs by giving its Haskell source code The first argument of each function and predicate is the reposit
73. nty of time Repairs are rated by a simple cost model based on natural numbers 11 reDy Rep 1 k refs 1 t gt 2 1 gt dId docl txt dState 2 Chg kaA2 kaA3 Rate 1 high 1 1 rep Rep di d concatMap t kDefs repResDs t t 2 Chg key kaA3 key kaA2 Rate 1 high 41 5 TePkaa2 Rep 1 k refs x t gt 2 2 dId docl txt dState 2 Del kaA2 Rate 1 high 1 0 TePdoc Rep 1 k repDs t t 2 Del dId docl txt dState 2 Rate 3 high gi 0 TCDm Rep d2 ma repManDs t2 ta gt 2 Chg dId man1 xml kind technical M Rate 1 medium 1 2 2 TePm ma Rep 1 mM 2 m2 repManDs t tt gt 2 Chg dId man1 xml kind technical M Rate 2 medium 1 medium 1 2 2 Figure 2 5 Repairs generated for the rules and da The collection implies a preference between the repair sets In the example you change a document rather than deleting it Changing a text document is preferred to changing a key definition within a key resolver In addition you prefer deleting some content within a document to deleting a whole document Of course you can define your own preferences Chapter 3 CDET System Architecture Fig 3 1 illustrates the architecture of CDET CDET distinguishes four r les au thors the project manager the rule designer and the language designer Authors use the DMS RCS as usual except if y
74. of all types top or the repos itory state type state Hence one of the following subele ments is permitted tapply type constructor application The attribute tcref specifies the type constructor to apply The subelement gtype specifies the argument types tvar type variable The attribute id specifies the type variable name top the supertype of all types state the repository state type code Finally the Haskell code of the function symbol s im plementation The first parameter of each symbol is the repository 5 3 Rule XML Syntax 37 type Repository Otherwise you could not access the repository Then follow the parameters as specified by the function symbol s type above Don t forget to enclose the code by a CDATA section in order to avoid XML parsing errors 5 3 Rule XML Syntax rules A user defined rule set Here you define what consistency means You can import this rule set in a project description e importlanguage Import at least one language the symbols of which you use in your rules Required attributes path the XML file holding the imported language e importrulex Import other rule sets This implies that whenever this rule set is used by a project all other imported rule sets are used too Required attributes path the XML file holding the imported rule set e rule A consistency rule Required attributes id the rule name must be unique date the date of the last rule modification
75. optimizerules true gt Above you find performance options I recommend to turn all performance op tions on i e true For our project we use the internal caches of CDET for each function or predicate you can specify whether it should be cached By fastprelude you tell CDET to evaluate Prelude functions and predicates inter nally The basic language Prelude contains large part of Haskell s Prelude see App A We use incremental evaluation and only check those rules that might be touched by a check in In addition we let CDET optimize our rules which essentially means to push quantifiers into the rules lt repair partialResolution false maxsets 10 gt lt metrics gt lt CDATA sum metrics repairMetrics rsi rs2 sum map rat rs1 gt sum map rat rs2 where rat rep changeRat rep offendsRat rep costRat rep punish deletion changeRat rep case change rep of ActDelete v gt if isDocVal v isDocListVal v then 15 else 10 gt 0 punish offending rules offendsRat rep length offends rating rep costs of a repair costRat rep case repaircost rating rep of Just c gt c Nothing gt 0 gt lt metrics gt lt repair gt Above we have specified options for repair generation We turn off partial incon sistency resolution which essentially means to generate a smaller repair collection The repair collection will contain at most 10 alternative repair sets In the metric element you
76. oreign fargtype frestype gt lt call calling convention currently ccall only gt lt header the C header file implies object file to link gt lt foreign the name of the foreign function gt lt haskell the name of the corresponding haskell function gt lt ATTLIST importforeign call ccall cplusplus dotnet jvm stdcall REQUIRED header CDATA REQUIRED foreign CDATA REQUIRED haskell CDATA REQUIRED gt lt an argument type of the corresponding haskell function gt lt use an extra element here to avoid XML parsing errors for Haskell s arrow gt lt ELEMENT fargtype EMPTY gt lt id name of the argument type gt lt ATTLIST fargtype id CDATA REQUIRED gt lt the result type of the corresponding haskell function gt lt ELEMENT frestype EMPTY gt lt id name of the result type gt lt ATTLIST frestype id CDATA REQUIRED gt lt SYMBOL DEFINITIONS gt lt function symbol definitions gt lt ELEMENT funsymdefs funsymdef gt lt function symbol definition gt lt ELEMENT funsymdef desc meta funtype code gt lt id function symbol name gt lt ATTLIST funsymdef id CDATA REQUIRED gt lt predicate symbol defintions gt lt ELEMENT predsymdefs predsymdef gt lt predicate symbol definition gt lt ELEMENT predsymdef desc meta predtype code gt lt id predic
77. ories ignore this value gt lt repositorytype type of the repository dir darcs subversion cas gt lt ATTLIST consistencycheck name CDATA REQUIRED tmpdir CDATA REQUIRED outputdir CDATA REQUIRED reportdtd CDATA REQUIRED repositorybase CDATA REQUIRED repositoryconfig CDATA IMPLIED repositorytype dir darcs subversion cas REQUIRED gt lt some performance tuning gt lt you _should_ turn on everything gt lt ELEMENT performance EMPTY gt lt usecache should we use caches gt lt fastprelude functions from Haskell s Prelude be evaluated internally gt lt incremental generate S DAGs incrementally gt lt filterrules only check rules influenced by a check in gt lt optimizerules rewrite rules for optimum performance gt lt ATTLIST performance AT 48 DTDs usecache true false REQUIRED fastprelude true false REQUIRED incremental true false REQUIRED filterrules true false REQUIRED optimizerules true false REQUIRED gt lt influence repair generation gt lt ELEMENT repair metrics gt lt maxalts not supported at the moment gt lt partialResolution support partial inconsistency resolution gt lt maxsets maximum alternative repair sets in the repair collection gt lt ATTLIST repair maxalts CDATA IMPLIED partialResolution true false REQUIRED maxsets CDATA IMPLIED
78. ory which is an abstract Haskell data type Notice that this argument is implicit i e it is not part of the type declaration Above we simply retrieve some documents from the repository with help of the repository access function repDocs String gt State gt Repository gt Document The first argument denotes a regular file pattern as you know it from your day to day work with files Essentially a regular file pattern is a regular expression where the dot is not escaped explicitly That was easy wasn t it Let us look at the function repManDs which should return the manuals repManDs State ManD get all manual documents In CDET s XML syntax we write lt funsymdef id repManDs gt lt desc gt Manuals in the repository lt desc gt lt meta gt lt docdep regex man xml gt lt cachemeta cached true reftrans true gt lt argdep num 0 gt lt cachemeta gt lt meta gt lt funtype gt lt type gt lt gtype gt lt state gt lt gtype gt lt type gt lt gtype gt lt tapply tcref gt lt gtype gt lt tapply tcref ManD gt lt gtype gt lt tapply gt lt gtype gt lt funtype gt lt code gt repManDs repo t map 1 d gt mkMan d fromJust err d fromJust err d parseDoc d repo readXml ds where err d repManDs internal error reading xml show d ds repDocs man xml t repo mkMan Document gt Man Man gt ManD mkMan d Man
79. ository gt dId docl txt dState 2 Notice that CDET identifies a document by its name did and check in state dState The document docl txt contains two inconsistent key references k kaA2 and k gt kaA3 Let us first follow the left hand edge The key reference to kaA2 is inconsistent because the repository does not include an appropriate key definition You only have a key definition for the key kaA3 d gt key gt kaA3 The left hand leaf indicates that the referenced key k is different from the defined key key d which is due to the above bindings Next we follow the right hand edge for k s universal node Here CDET finds a matching key definition which is the same as above There is also a manual manl xml in the repository that looks promising m H dId man1 xml dState 2 This manual causes however an inconsistency because its kind kind m is dif ferent from the kind of the key definition kKind d This is indicated by the 8 Working with CDET S DAG for t gt 2 rule p i V KEEP 7 x gt did doc1 txt dState 2 Del did doct txt dState 2 k gt kaA3 Del kaA3 Chg kag k gt kaA2 er kaA3 kld man1 amt kKind technical M Chg ibe kaA2 kld man1 an kKind technical M d gt fkey kaA3 kld man1 xmi kKind technical M False k key d m gt ft kaA2 H gt kaA3 1 d key kaA3 gt kaA2 5 mee man1 xml dState 2 kind field M Chg
80. ou formalize strong rules see below For specific projects the project manager chooses consistency rules In addi tion he defines the ordering metrics for sorting the repair collection and global options like the repository and the repository type e g DARCS or subversion The rule designer formalizes consistency rules and hints using a variant of predicate logic Rules define what consistency means they reflect wishes from an administrator perspective CDET distinguishes between strong rules which must be adhered to and weak rules which may be violated Hints guide repair generation In addition the rule designer creates templates for documents These will be DTDs or Schemas if XML is used as document format At a consistency check CDET generates an S DAG for each consistency rule CDET augments S DAGs at request From an augmented S DAG authors can choose repairs and apply them to the documents in the repository Alternatively CDET can derive one ordered repair collection from all S DAGs The metrics for sorting the collection is defined by the project manager Consistency rules use functions and predicates from domain specific languages This makes the rules independent of concrete document formats which can be changed without affecting rules During compilation CDET checks consistency rules for well typedness w r t the functions and predicates used For convenience CDET provides a basic language Prelude which defin
81. plied to the given values results in gt lt ELEMENT changesuchthat value gt lt psym predicate symbol which should be applied to the values gt lt bool requested boolean result gt lt ATTLIST changesuchthat psym CDATA REQUIRED bool true false REQUIRED gt lt do nothing gt lt ELEMENT keep EMPTY gt lt edge below a quantifier node bound values repair actions target S DAG gt lt ELEMENT quantedge value repairaction sdag gt 62 DTDs lt an abandoned S DAG which we do not want to repair gt lt ELEMENT abandoned EMPTY gt lt a reference to another S DAG node gt lt ELEMENT ref EMPTY gt lt ATTLIST ref targetID CDATA REQUIRED gt lt REPAIR collection gt lt ELEMENT repairs repairalt gt lt alternative repair set in the repair collection gt lt ELEMENT repairalt repair gt lt repair gt lt ELEMENT repair concerns term assign repairaction rating gt lt rules for which a repair resolves inconsistencies gt lt ELEMENT concerns EMPTY gt lt rule rule identifier gt lt var variable inside the rule gt lt ATTLIST concerns rule CDATA REQUIRED var CDATA REQUIRED gt lt repair context gt lt ELEMENT assign value gt lt var variable to which a value is assigned gt lt ATTLIST assign var CDATA REQUIRED gt lt rating of a repair
82. pository states provided by repStates repDs t returns the current documents for a state t For each document x the variable k comprises the referenced keys computed via refs x Key definitions in the resolvers are determined with the help of the higher order function concatMap It applies kDefs to each key resolver computed by repResDs t Applied to a key resolver kDefs returns a list of key definitions which are finally concatenated Thus d ranges over the key definitions from all key resolvers we require that the referenced key k equals the key defined by d The variable m ranges over all manuals at the state t computed by repManDs t The name of the manual m must equal the name d points to and the kind of m must equal the kind d points to Rule d2 twice quantifies over time because two different versions of a manual have to be related the old version at state t and the new version at state to Repair generation requires domain specific guidance because CDET supports arbitrary predicates For example if an atomic formula p z y is violated CDET 2A state represents a repository snapshot at a given time 4 2 Consistency Rules 19 cannot anticipate how to change x and y in order to fulfill p x y Therefore rule designers may annotate an atomic formula by a collection of hints containing alternative hint sets Within a hint set all hints are evaluated simultaneously Formally a hint x gt e bc proposes to change the variable
83. r rule and language compilation gt lt this is the parent directory of the CDET directory gt lt ELEMENT cdetdir EMPTY gt B 2 Rules DTD rules dtd 49 lt ATTLIST cdetdir path CDATA REQUIRED gt lt include cirectories needed for rule and language compilation gt lt ELEMENT includedir EMPTY gt lt ATTLIST includedir path CDATA REQUIRED gt lt rulesets against which the repository should be checked gt lt ELEMENT importrules EMPTY gt lt ATTLIST importrules path CDATA REQUIRED gt B 2 Rules DTD rules dtd lt top level ruleset element gt lt ELEMENT rules importlanguage importrule rulex gt lt import a language gt lt ELEMENT importlanguage EMPTY gt lt path to the XML file holding the language definition gt lt ATTLIST importlanguage path CDATA REQUIRED gt lt import another ruleset gt lt ELEMENT importrule EMPTY gt lt path to the XML file holding the ruleset definition gt lt ATTLIST importrule path CDATA REQUIRED gt lt consistency rule gt lt ELEMENT rule meta formula actions gt lt date creation date of the rule gt lt if a rule is newer than the last runtime then it might have been changed gt lt id rule identifier gt lt ATTLIST rule date CDATA REQUIRED id ID REQUIRED gt lt rule name gt lt rule metadata gt lt ELEMENT meta desc precon docdep
84. rly focus on the gap between HaXml and CDET s subtypes When this gap is closed language definition will be almost as easy as rule formalization Chapter 5 Technical Details In this chapter we take a deep technical look at the XML syntax of CDET i e the project description syntax the language definition syntax and the rule definition syntax For the S DAG XML syntax and the repair collection syntax see the report DTD in App B 4 The syntax descriptions below directly correspond to XML elements Finally in Sect 5 4 we present CDET s generic repository interface We present the grammars in DTD tree style You find the child elements of an element e below e and slightly indented An element may be annotated by element is optional element may appear arbitrarily often element must appear at least once element must appear at least twice If not otherwise noted the order of subelements is significant An element description may be omitted if the element was described before or will be described below 5 1 Project Description XML Syntax e consistencycheck The configuration of your project Required attributes name the name of your project tmpdir the tem poral directory for CDET outputdir the output directory for CDET reportdtd the DTD for consistency reports repositorybase the base directory of your repository repositorytype your DMS RCS currently supported are dir darcs and subversion
85. s gt lt weight rule weight importance 0 100 gt lt weak may a rule be violated gt lt ATTLIST meta weight CDATA 0 weak true false REQUIRED gt 50 DTDs lt consistency rule description gt lt ELEMENT desc PCDATA gt lt rule precondition this rule must be satisfied in order to check gt lt the current rule not supported at the moment gt lt ELEMENT precon EMPTY gt lt ATTLIST precon id CDATA REQUIRED gt lt document dependencies of a rule gt lt not given dont know determined by CDET gt lt given these are the documents the rule really depends on gt lt ELEMENT docdeps docdep gt lt ELEMENT docdep EMPTY gt lt regex regular expression of the dependent documents gt lt ATTLIST docdep regex CDATA REQUIRED gt lt first order formula with description gt lt ELEMENT formula desc papply or and not implies forall exists gt lt applied predicate atomic formula plus hint collection gt lt ELEMENT papply term hintalternativex gt lt id predicate symbol gt lt ATTLIST papply id CDATA REQUIRED gt lt alternative hint set gt lt ELEMENT hintalternative set gt lt hint gt lt ELEMENT set term gt lt condition predicate result causing to execute the hint gt lt id variable to set gt lt varcondition set varia
86. s mark fixed components rectangles mark customizable components heterogeneous repositories If XML is used as document format document types and parser functions can be derived from DTDs e g via HaXml WR99 CDET uses an adapted version of HaXml that handles PackedStrings instead of normal Strings Chapter 4 What You need to Input into CDET In this chapter we explore the means you need for managing consistency for our example namely a project description the consistency rules and a language CDET s current implementation uses XML for the concrete syntax Also you have to edit these XML files by yourself I plan however to develop some graphical tools for this purpose Using a DTD aware XML editor like emacs already alleviates editing XML files This chapter is not meant for authors they are not confronted by the formal issues described here This chapter is meant for rule designers and language designers who are experts in first order logic and Haskell respectively Due to the early prototype implementation some things will appear a bit complicated to you I am sure that this will change in the future as CDET evolves For our example project suppose you have the following directory structure darcs contains the DARCS repository you use for managing documents dtd contains two DTDs manual dtd for manuals resolver dtd for key resolvers out here you will find the output of CDET S DAGs and repairs hask
87. sers automatically in a HaXml like manner Improve incremental consistency checking for XML documents provided that the DMS RCS supports XML patches Implement some graphical tools for rule formalization language definition and S DAG playing Improve repairs by data mining approaches Translate rules and repairs into natural language You feel that you could help in some of the above points You want to add something You have an interesting document engineering project and want to ap ply CDET Then feel free to contact me e g by email jan scheffczyk gmx net see www unibw de inf2 00_VCS 42 Appendix A The CDET Prelude Here we list the basic CDET language Prelude which includes large part of the Haskell pre lude helper functions from the FastPackedString library from the DARCS distribution and some basic repository access functions A 1 Prelude Types For convenience we use infix notation where appropriate Notice that in CDET s XML syntax you have to use prefix notation instead Int atomic Haskell like Int Char atomic Haskell like Char PString atomic PackedString from DARCS Float atomic Haskell like Float Maybe a Just a Nothing Haskell like Maybe Bool True False Haskell like Bool a 3 a 3 Haskell like 2 tuples a 3 7 a 8 7 Haskell like 3 tuples ace tuples up to 7 tuples a a al Haskell like lists dld PString Document Document oe State Documents name and check i
88. skell like lines words for packed String lines for packed String Haskell like unwords 46 unlines Char Char unwordss PString PString unlinesS PString PString concat lla a concatS PString PString nils PString repStates State repHead State repInit State next State State prev State State prevState State State The CDET Prelude Haskell like unlines unwords for packed String unlines for packed String Haskell like concat concat for packed Strings empty packed String all repository states repository head state initial repository state next repository state previous repository state total version of prev Appendix B DTDs Here you find the complete DTDs of CDET B 1 Project Description DTD consistencycheck dtd lt top level configuration element gt lt ELEMENT consistencycheck performance repair cdetdir includedir importrules gt lt name a name for your project gt lt tmpdir directory where temporal files can be stored gt lt outputdir directory where the generated XML files are stored gt lt reportdtd where is the DTD for the consistency report gt lt repositorybase base directory of your repository gt lt repositoryconfig optional configuration file for some repository types gt lt le currently this is necessary for subversion only gt lt li gt other reposit
89. t tapply tcref gt lt gtype gt lt tapply tcref KDef gt lt tapply gt lt gtype gt lt tapply gt lt gtype gt lt label gt 26 What You need to Input into CDET lt record gt lt typedef gt lt typedef id KDef gt lt desc gt Key definition lt desc gt lt record rcon KDef gt lt supers gt lt label id key key true gt lt gtype gt lt tapply tcref PString gt lt gtype gt lt label gt lt label id kId key true gt lt gtype gt lt tapply tcref PString gt lt gtype gt lt label gt lt label id kKind key true gt lt gtype gt lt tapply tcref PString gt lt gtype gt lt label gt lt record gt lt typedef gt lt typedefs gt For a record type we specify a record constructor which may be different from the type constructor as in Haskell The supertypes of a record type are given by the supers element The element gtype denotes a ground type i e a non function type tapply denotes the application of a type constructor to ground types For example lt gtype gt lt tapply tcref gt lt gtype gt lt tapply tcref KDef gt lt tapply gt lt gtype gt lt tapply gt lt gtype gt means to apply the type constructor to the type KDef which we would write KDef or more commonly KDef in Haskell After the record s supertypes we specify the record labels The label element carries an optional key attribute CDET compares recor
90. ted boolean value gt lt ATTLIST sugdontknow var CDATA REQUIRED bool true false REQUIRED gt lt contradicting suggestions e g set x to 1 and keep x gt lt ELEMENT sugcontradict suggest gt lt S DAG conjunction node gt lt ELEMENT allofng sdag gt lt ATTLIST allofnq id CDATA REQUIRED gt lt S DAG disjunction node gt lt ELEMENT oneofng sdag gt lt ATTLIST oneofnq id CDATA REQUIRED gt B 4 Report DTD report dtd 61 lt S DAG universal node gt lt ELEMENT allofq domain quantedge gt lt id quantified variable gt lt ATTLIST allofq var CDATA REQUIRED id CDATA REQUIRED gt lt S DAG existential node gt lt ELEMENT oneofq domain quantedge gt lt id quantified variable gt lt ATTLIST oneofq var CDATA REQUIRED id CDATA REQUIRED gt lt domain of a quantifier node gt lt ELEMENT domain value gt lt repair action gt lt ELEMENT repairaction delete add change changesuchthat keep gt lt delete a value gt lt ELEMENT delete value gt lt add a value gt lt ELEMENT add value gt lt change a value from to gt lt ELEMENT change value value gt lt field change this field gt lt ATTLIST change field CDATA IMPLIED gt lt change free variables of the corresponding formula such that gt lt the predicate symbol ap
91. tency management approaches which provide strong means for consistency man agement but severely hinder your work by requiring a specific document model or format By consistency checking I mean to determine how the repository conforms to the rules This is in contrast to classic logic where consistency checking By RCS I mean an arbitrary revision control system and not the specific tool with the same name 2By document engineering process I mean the act of editing interrelated documents in a targeted way similar to software engineering see e g www documentengineering org 1 1 What Can You Use CDET for 3 means to determine whether a set of formulae has a model i e all formulae can be fulfilled at once Essentially CDET is a sophisticated model checker where the DMS RCS provides the model We call a violation of a consistency rule inconsistency We can resolve an inconsistency by applying a repair to some documents in the repository Consistency rules can express intra and inter document requirements regard less of the document model and the document format used Strong rules must be adhered to whereas weak rules may be violated Rules can restrict how docu ments evolve in time hence CDET supports temporal logic From consistency rules CDET generates e S DAGs Suggestion carrying Directed Acyclic Graphs which visualize in consistencies and repair actions for individual rules and e one repair
92. the defined type You can specify multiple supertypes by the subelement gtype see below Any type variables in a gtype element must be arguments of the record type tvar above label The labels of the record type Here you specify the record type s ingredients Required attributes id the name of the label Optional attributes key is the label a key label For testing equality key labels are considered only If the attribute is omitted CDET assumes for safety reasons that the label is a key label By the subelement gtype you specify the label s type see below variant A variant type like Bool tvar Type variables arguments for the variant type Required attributes id the type variable name 5 2 Language XML Syntax 35 subs Subtypes of the defined variant type You can specify multiple subtypes by the subelement gtype see below Any type variables in a gtype element must be arguments of the variant type tvar above vcon The variant data constructors Required attributes id the variant constructor name If the variant constructor has type arguments you can specify them by the subelement gtype Again any type variables in a gtype element must be argu ments of the variant type tvar above e predsymdefs Predicate symbol definitions You can define as many predicate symbols as you want Notice however that their names must be unique predsymdef A predicate symbol definition Require
93. their semantics i e manual kind and name Multiple key resolvers may exist Their actual names are hidden from authors To ensure consistency we require that two step links to manuals are valid 1 and names and kinds of manuals are invariant over time 2 A link to a key k is valid if k is defined in a key resolver and this definition points to an existing manual of the correct kind Fig 2 1 also shows a repository that contains a plain text document docl txt and three XML documents a key resolver keys xml and two manuals man1 xml and man2 xml The check in at state 2 causes three inconsistencies 1 the refer ence to kaA3 violates because the kind of man1 xml is different from the kind of kaA3 s key definition 2 the reference to kaA2 violates because the key documents a Kd key ee State 1 docl txt as shown in manual kaA3 keys xml lt kDef key kaA3 kId man1 xml kKind technical M gt manl xml lt man kind technical M gt man2 xml lt man kind field M gt State 2 docl txt as shown in kaA3 and kaA2 manl xml lt man kind field M gt Figure 2 1 Example repository 6 Working with CDET resolver does not contain a definition for kaA2 3 the state transition violates 2 because the kind of man1l xml has changed Suppose we have already formalized the rules and 2 as consistency rules Then doa cdet compile example_project xml in order to
94. tive gt lt papply gt lt formula gt lt formula gt lt papply id gt lt term gt lt fapply id kind gt lt term gt lt var id m gt lt term gt lt fapply gt lt term gt lt term gt lt fapply id kKind gt lt term gt lt var id d gt lt term gt lt fapply gt lt term gt lt hintalternative gt lt set condition false id m field kind repaircost 2 gt lt term gt lt fapply id kKind gt lt term gt lt var id d gt lt term gt lt fapply gt lt term gt lt set gt lt hintalternative gt lt papply gt lt formula gt lt and gt lt formula gt Predicate application via papply corresponds to function application A predi cate application can carry alternative hint sets hintalternative each of which may include arbitrarily many hints element set Within the element set the 4 3 Languages 23 attribute condition denotes the truth value of the predicate application that should be inverted id denotes the variable to change field denotes the record field to change if appropriate repaircost denotes the cost for repair The set element carries an arbitrary term that calculates the new value for the variable attribute id above Finally we close all open tags lt and gt lt formula gt lt exists gt lt formula gt lt exists gt lt formula gt lt forall gt lt formula gt lt forall gt lt formula gt lt forall gt lt formula gt
95. tory gt State gt String gt Document returns the documents in the repository that are current at a given state and match a regular expression For example repDocs repo 2 xml returns all XML documents current at state 2 in the repository repo Notice that the returned documents only include the name and the last modification state up to the given state data Document Document document_dId PString document_dState State The content of the document has to be parsed by the function parseDoc described below repDirs Repository gt State gt String gt Folder behaves like repDocs but returns directories instead Similar to documents directories include the name and the last modification state only data Folder Folder folder_fId PString folder_fState State parseDoc Repository gt Document gt String gt a gt Maybe a accesses a given document in the repository The third parameter is a func tion that parses the document content and converts it into an appropriate Haskell data structure For XML documents such parser functions can be generated WR99 If the given document does not exist in the repository parseDoc returns Nothing Chapter 6 Future Work As you might have noticed CDET is far from being a mature product Indeed CDET is in a very preliminary state Here are just some directions for future work Support type definition by a subset of XML Schema and derive par
96. uctor is defined in a language imported by this rule set and that it constructs the correct record type of course label Label bindings for the record construction Make sure that you bind all labels of the record including those of the record s supertypes Required attributes id The record label s name Make sure that the name is part of the record type and that you bind it only once in a record construction The label is bound to a term which you specify by the subele ment term case A variant deconstruction Optional attributes tcref You may want to specify the variant type constructor in case you construct variant supertypes without adding new variant data constructors Then CDET s type checker may need some guidance 5 3 Rule XML Syntax 39 or term The variant term to be deconstructed i e the case scrutinee binding Bindings of each alternative variant data con structor including those of the subtypes to a function symbol which will be applied to the variant data constructor s argu ments Required attributes id the variant constructor unique in the case construct sym the function symbol that should be ap plied to the variant constructor s arguments typed An explicitly typed term Explicit typing may guide type inference which is however only seldom necessary term The actual term type It s type annotation hintalternative An alternative hint set You can guide repair g
97. ules we use two languages The predefined language Prelude defined in the file RulePrelude xml contains useful basic types functions and predicates like Document concatMap and see App A The domain specific language in ManualsLanguage xml contains some types and functions specific for our example project e g a document type and a parser function for manuals lt rule id phii date 2004 08 16 gt lt meta weight 70 weak true gt lt desc gt The two step links to manuals are valid lt desc gt lt meta gt A rule starts with some meta data e g its name here phil and its definition date here 16 August 2004 We also assign the weight 70 to and tell CDET that our example rule may be violated weak true If we prohibited violations CDET would not permit any check ins to the repository that violate 1 The actual formula begins with the tag lt formula gt Let us now express the quantifier cascade for the variables t x k d and m y tKEEP lt repStates e Vx c repDs t e Vk lt refs z e Jd e concatMap kDefs repResDs t e J m e repManDs t e in XML 20 What You need to Input into CDET lt formula gt lt forall action keep id t gt lt term gt lt fapply id repStates gt lt term gt lt formula gt lt forall id x gt lt term gt lt fapply id repDs gt lt term gt lt var id t gt lt term gt lt fapply gt lt term gt lt formula gt lt forall id k gt
98. ules fully qualified gt lt ATTLIST language id CDATA REQUIRED qualifiedPrelude CDATA IMPLIED qualifiedXMLTypes CDATA IMPLIED qualifiedChar CDATA IMPLIED qualifiedRatio CDATA IMPLIED qualifiedComplex CDATA IMPLIED qualifiedNumeric CDATA IMPLIED qualifiedIx CDATA IMPLIED qualifiedList CDATA IMPLIED qualifiedMaybe CDATA IMPLIED qualifiedRandom CDATA IMPLIED qualifiedTime CDATA IMPLIED gt lt import other languages gt lt ELEMENT importlanguage EMPTY gt lt path path to the XML file holding he language gt lt qualifiedAs import qualified default unqualified import gt lt ATTLIST importlanguage path CDATA REQUIRED qualifiedAs CDATA IMPLIED gt lt import an XML parser automatically generated by HaXml for a DTD gt lt ELEMENT importparser EMPTY gt lt dtd path to the DTD gt lt qualifiedAs import name necessary to avoid naming conflicts gt lt ATTLIST importparser dtd CDATA REQUIRED qualifiedAs CDATA REQUIRED gt lt import another Haskell module gt lt ELEMENT importmodule EMPTY gt lt module path to the Haskell module gt lt qualifiedAs import qualified default unqualified import gt lt ATTLIST importmodule module CDATA REQUIRED qualifiedAs CDATA IMPLIED gt B 3 Language DTD language dtd 55 lt import a foreign function gt lt currently only C is supported by GHC gt lt ELEMENT importf
99. ve works These actions are prohibited by law if you do not accept this License Therefore by modifying or distributing the Program or any work based on the Program you indicate your acceptance of this License to do so and all its terms and conditions for copying distributing or modifying the Program or works based on it Each time you redistribute the Program or any work based on the Program the recipient automatically receives a license from the original licensor to copy distribute or modify the Program subject to these terms and conditions You may not impose any further restrictions on the recipients exercise of the rights granted herein You are not responsible for enforcing compliance by third parties to this License If as a consequence of a court judgment or allegation of patent infringement or for any other reason not limited to patent issues conditions are imposed on you whether by court order agreement or otherwise that contradict the conditions of this License they do not excuse you from the conditions of this License If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations then as a consequence you may not distribute the Program at all For example if a patent license would not permit royalty free redistribution of the Program by all those who receive copies directly or indirectly through you then the only way you could satisfy both it and t
100. w symbols and types in terms of functions and types defined in Prelude In addition we generate two parsers for manuals and resolver documents from their DTDs CDET uses an HaXml WR99 adaptation for generating appropriate Haskell types and parsers Unfortunately these Haskell types do not exactly correspond to the Haskell types generated by CDET because HaXml lacks support for subtyping Thus there will be still some work to do but at least you do not have to struggle with lexing and parsing issues We import the generated parsers qualified in order to avoid naming conflicts Next we define our three new record types Type definitions ManD lt Document Man kind String manual documents ResD lt Document Res kDefs KDef key resolver documents KDef KDef key String kId String kKind String key definitions in XML lt typedefs gt lt typedef id ManD gt lt desc gt Manual document lt desc gt lt record rcon ManD gt lt supers gt lt gtype gt lt tapply tcref Document gt lt gtype gt lt supers gt lt label id kind key false gt lt gtype gt lt tapply tcref PString gt lt gtype gt lt label gt lt record gt lt typedef gt lt typedef id ResD gt lt desc gt Resolver document lt desc gt lt record rcon ResD gt lt supers gt lt gtype gt lt tapply tcref Document gt lt gtype gt lt supers gt lt label id kDefs key false gt lt gtype gt l
Download Pdf Manuals
Related Search
Related Contents
Meriva, v.8 (rev 2), fr-FR (Work nr: J554B_50) Multiquip MLT25 User's Manual the entire Munis Training Manual TP N° MAESTRO 館内デジタル自主放送装置 - digital PD-D6-J - Audio Design Guide OK W-D4 User's Manual - Wuntronic GmbH Guida Rapida V2 Copyright © All rights reserved.
Failed to retrieve file