
Glozz User's Manual


Contents

1. [Screenshot: sample text annotated with two co-reference chains.] Then, choosing the Relations Coref layout (the Layout menu offers Boxes, Relations Coref and Schemas Coref) will result in a graph showing each individual chain horizontally. 9.4.2 Co-reference using schemas. Assume a text is annotated with schemas, one schema embedding all the units of a given reference chain. [Screenshot: sample text annotated with schemas.]
2. [Screenshot: the "Load annotation model (.aam)" menu entry.] 2.3.1 Units. Creating a unit: first, click on the unit creation mode button. This highlights the button, showing that this mode is now active, and also activates the Units part of the annotation model selector (frame 4) on the left. [Screenshot: the annotation model selector, with the Units, Relations and Schemas tabs and the unit types Pronom and Verbe.] Then select in the Units model the type you want to give to the next created unit(s), for instance verb in our example below. [Screenshot] Once in unit creation mode, you can add new units directly with the mouse in two ways. Drag & drop: put the mouse on the start position of the future unit, then press the button and keep it pressed while moving the mouse to the end position. When the button is released, the unit is fully created. Two clicks: click on the start position; a begin flag is then shown. Then move the mouse towards the end position; while moving, an end flag follows the mouse position. Once on the end position, click again. [Screenshot] While creating a unit, or just after, you may want to cancel what you are doing or what you have just done. To do so, press the backspace key of your keyboard. Attributing values to feat
3. [Screenshot residue: basket buttons for saving the content as an .aa file and erasing it from the current data.] Now, back in the GlozzQL main window, we can click on Relation1 in frame 3, which owns utterance, and then click again on "Add results to Basket". Now the basket owns a fifth annotation. [Screenshot: the GlozzQL Basket window, listing annotations of types s_Unite Discontinue, s_Schema Anaphorique and s_Schema Elaboratif with their real IDs, with the Sort Type, Sort Date, Show sel. and Visible buttons.] We can go on as long as needed and add to the basket as many annotations as we want. 8.5.2 Save basket content as .aa file. In the bottom left corner of the basket window, the button "Save Basket content as .aa file" creates a new annotation file containing only the annotations currently present in the basket. When doing so, you will be asked to enter a file name. This is useful, for instance, to split the annotations into several files depending on a criterion such as the author's name. 8.5.3 Erase basket content from current data. The second action the basket enables is to remove the annotations currentl
4. </type> </relations> 6.4 Schemas. Once again, creating schema types is almost the same as creating unit types. Here we create the referenceChain schema type, with only a free feature named Remarks:
    <schemas>
      <type name="referenceChain">
        <featureSet>
          <feature name="Remarks">
            <value type="free" default="" />
          </feature>
        </featureSet>
      </type>
    </schemas>
   6.5 Groups. The notion of group is transversal to units, relations and schemas. The idea is to create as many groups as needed, each of them corresponding to a paradigm of the campaign, and then to set which kinds of units, relations and schemas belong to which group. For instance, if a campaign deals with both syntax and semantics, it will be interesting to create two dedicated groups and to set, for each type, the group(s) it belongs to. Indeed, a given type may belong to as many groups as needed, or to no group. Such groups are very convenient once in Glozz, because it is possible to show or hide all elements whose type belongs to a given group. For instance, we will be able to see only annotations dealing with syntax, then only annotations dealing with semantics, and so on. Groups are defined in the .aam files in an implicit manner: there is no XML node to create a group. We only have to say, for a given type, which group(s) it belongs to, via the groups attribute and the cor
5. C1: Text Contains "isotope" (scope: Unit). This constraint is named C1, concerns Units only (its scope), and asks a unit to contain the text "isotope". 8.1.2 ConstrainedAnnotation. A ConstrainedAnnotation is a set of annotations which fulfil a given Constraint. For instance, Unit1: C1 defines the set of Units which fulfil the C1 constraint, and which is called Unit1. Hence, Unit1 is the set of all units of the current annotations which contain the text "isotope". 8.1.3 Incremental creation of Constraints and ConstrainedAnnotations. By definition, a ConstrainedAnnotation depends on a given Constraint. Conversely, some Constraints depend on given ConstrainedAnnotations. This is the case, for instance, of the Contains constraint: C2: Contains Unit1 (scope: Relation, Schema) is a constraint which expresses the fact, for a Relation or a Schema, of containing a Unit which contains the text "isotope". For instance, to get the set of schemas which fulfil this constraint, we can create the ConstrainedAnnotation Schema1: C2. And to get the set of relations which fulfil this constraint, we create Relation1: C2. Hence, it is possible to build richer and richer constrained annotations step by step, in an incremental process. 8.2 GlozzQL Graphical User Interface. To open the GlozzQL GUI, click on the dedicated button in the main toolbar (GQL). It opens the interface at the first click, or shows it again if it ha
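The incremental scheme described in this excerpt (a constraint, then a named set of annotations fulfilling it, then a new constraint built on that set) can be sketched in Python. This is only an illustration of the idea, not GlozzQL itself: the `Annotation` class, the helper names and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical stand-in for a URS element (not a Glozz class)."""
    ident: int
    kind: str                          # "Unit", "Relation" or "Schema"
    text: str = ""                     # covered text (units only)
    members: frozenset = frozenset()   # ids of the linked or embedded elements


def text_contains(needle):
    """C1-like constraint: a unit whose text contains `needle`."""
    return lambda a: a.kind == "Unit" and needle in a.text


def contains(constrained_ids):
    """C2-like constraint: a relation or schema containing an element of an earlier set."""
    return lambda a: a.kind in ("Relation", "Schema") and bool(a.members & constrained_ids)


def constrained(annotations, constraint, kind=None):
    """A ConstrainedAnnotation: the ids of the annotations fulfilling a constraint."""
    return {a.ident for a in annotations
            if constraint(a) and (kind is None or a.kind == kind)}


annotations = [
    Annotation(1, "Unit", text="the isotope decays"),
    Annotation(2, "Unit", text="a stable nucleus"),
    Annotation(3, "Relation", members=frozenset({1, 2})),
    Annotation(4, "Schema", members=frozenset({2})),
    Annotation(5, "Schema", members=frozenset({1, 3})),
]

unit1 = constrained(annotations, text_contains("isotope"))   # Unit1: C1
c2 = contains(frozenset(unit1))                               # C2: Contains Unit1
schema1 = constrained(annotations, c2, kind="Schema")         # Schema1: C2
relation1 = constrained(annotations, c2, kind="Relation")     # Relation1: C2
print(unit1, schema1, relation1)
```

Each step only consumes a set produced by an earlier step, which is what makes the process incremental.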
6. [Screenshot: the result list for the constraint Before Any1, showing r_Anaphore, r_sujet and r_complement relations with their real IDs.] The reason is that Any1, Any2 and Relation are in fact sets, not variables. It means, for instance, that a relation belonging to Relation links any element of Any1 to any element of Any2. Of course, the elements of Any2 are not necessarily before all the elements of Any1, but only before at least one of them. What we need for our purpose is to talk about the same entities when we say that Any2 is before Any1 and that Relation goes from Any1 to Any2. This is what we call here the unification mechanism. We launch this process with the dedicated button located at the bottom of the window (Unify Variables). A new window appears, dedicated to the unification mec
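The difference between reasoning on sets and reasoning on unified variables can be illustrated with a small Python sketch. It does not reproduce the GlozzQL implementation; the unit offsets and the relation names `r_a` and `r_b` are invented for the example.

```python
# Hypothetical data: unit id -> start offset, relation name -> (source id, target id).
units = {1: 0, 2: 10, 3: 20}
relations = {"r_a": (2, 1), "r_b": (1, 2)}

# Set semantics: Any1 is every unit; Any2 is every unit located before
# at least one element of Any1.
any1 = set(units)
any2 = {u for u in units if any(units[u] < units[a] for a in any1)}

# Relations going from an element of Any1 to an element of Any2.
set_based = {name for name, (src, tgt) in relations.items()
             if src in any1 and tgt in any2}

# Unified semantics: the target must be before the very source it is linked to.
unified = {name for name, (src, tgt) in relations.items()
           if units[tgt] < units[src]}

print(sorted(set_based))  # r_b is kept, although its target follows its source
print(sorted(unified))    # only r_a remains
```

Here `r_b` passes the set-based test because its target (unit 2) is before some other unit (unit 3), which is exactly the kind of unwanted result that unification is meant to filter out.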
7. [Two screenshots of the Henri de Toulouse-Lautrec sample text (from Wikipedia), each rendered with a different style.] Two different styles applied on the same corpus. 7.2 Creating styles. Glozz provides a full wysi
8. You will have to do it again only when working with texts from a different corpus. [Screenshot: the Random Entropy Computing dialog, with an input directory, Iterations: 100, Number of annotators: 3, and the result Entropy: 1.881813208477623.] Now, for any text of our corpus, we can compute its entropy and, having also the random entropy of the corpus, get the agreement value. Once the annotated text is loaded, launch the automatic alignment tool as we have seen in section 10.3.1. While it makes the alignments, it also computes the entropy value. Then, at the bottom of the window, we get three values: last random entropy (the random entropy we have computed from a folder); last computed entropy (the one we have just computed for the current annotated text); rate agreement (for the current annotated text). In our example, with a computed entropy of 1.0646 for the text and a random entropy of the corpus of 1.8780, the rate agreement is 0.4331. Please refer to Mathet & Widlöcher 2011 to see how to consider this value. 11 Additional tools. 11.1 Depth selector. When units embed others recursively, it is possible to show only those whose depth belongs to a given range. Depth: when a unit A is covered by another unit B, which is covered by another unit C covered by no other unit, we say that the depth of A is 2, the
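The three figures quoted in this excerpt are consistent with an agreement rate computed as one minus the ratio of the computed entropy to the random entropy. The formula itself is not spelled out in this excerpt, so the sketch below is an assumption that happens to reproduce the quoted value; the function name is hypothetical.

```python
def rate_agreement(computed_entropy: float, random_entropy: float) -> float:
    """Assumed formula: 1 - computed / random (not confirmed by the manual excerpt)."""
    if random_entropy <= 0:
        raise ValueError("random entropy must be positive")
    return 1.0 - computed_entropy / random_entropy


# Values quoted in the example above.
agreement = rate_agreement(1.0646, 1.8780)
print(round(agreement, 4))
```

Under this reading, a text whose entropy equals the random entropy gets an agreement of 0, and a text with zero entropy gets 1.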
9. [Screenshot: the "Load annotation model (.aam)" menu entry.] And then browse to the special file provided in the Glozz distribution, in data/annotationModels: the "structure typo" .aam file. [Screenshot: a file chooser listing the .aam models argumentation, lorem ipsum withComplexAnnotations, possibleValues, structure typo and validation, all dated 22 May 2010.] The unit model is now fed as follows (see paragraph, title, subtitle and so on): [Screenshot: the Units, Relations and Schemas tabs.] Please refer to the specific section of this manual to see how to change the style of a unit. Then, using the "structure typo" .aam style, you will be able to change any paragraph to a title, a list item and so on. You can also move the boundaries of a paragraph, define new paragraphs, remove some, etc. Note that each time you want to see the new typographic result, you have to save and reload the current corpus. No update is done automatically. The next screenshot is the result for our text after some work on typography (one main title, then two sub-titles, then a list with 4 items, and then paragraphs), once the "show typographic annotation" option is removed. [Screenshot: the Henri de Toulouse-Lautrec text (from Wikipedia) with its typographic structure.]
10. and aa formats. As we have seen in the File types section, you will have to create two combined files. Assuming we want to create a corpus named Lautrec, the two files will be Lautrec.ac and Lautrec.aa. Here are two excerpts of these files: [Screenshot: fig. 1, an excerpt of the Lautrec.ac file, containing the raw text of the Henri de Toulouse-Lautrec article.] <annotations> <metadata corpusHashcode="10680-2919708" /> <unit id="TXT_IMPORTER_1290167040404"> <metadata> <author>T
11. at final stage. Units can be considered as the first type of elements, from which the others (relations and schemas) can be built. 1.1.2 Relations. A relation is a link, oriented or not, from one element of the URS model (a unit, a relation or a schema) to another element. Typically, it may involve two units, as follows: [Screenshot: a relation between two units.] But it may link a unit to a relation: [Screenshot: a relation between a unit and a relation.] Or a schema to another schema (see the next section to see what a schema is): [Screenshot: a relation between two schemas.]
12. click and shows immediately in frames 1 and 2 where the related annotation is. In the example below, it is shown that the Phrase unit of ID 38 is located below in the text of frame 1. [Screenshot: the predicate list, with entries from u_Phrase 4438 4950 ID 38 to r_sujet 8 9 ID 50, the Sort Type and Sort Date buttons and the Command field, next to the text view and the Units, Relations and Schemas selector.]
13. [Screenshot captions: the currently selected annotation (ID 38) is not visible; the list is scrolled so that the annotation currently selected in the list (ID 38) is shown.] 2.4.3 Sorting the list. The list is sorted either chronologically, from the oldest annotation to the youngest, or by type (Units, then Relations, then Schemas). Use the two buttons at the top to do so. 2.4.4 Managing visibility for each individual annotation. This tool can also be used to hide or show annotations individually, contrary to the use of styles or groups, which concern a set of annotations. To hide an annotation in frames 1 and 2, double-click on it in the list. It is then hidden in the other views and appears hatched in the list. [Screenshot: the list with some hatched entries, and the Sort Type, Sort Date and Visible buttons.] To make an annotation visible again, double-click on it again in the list. If you want to make all hidden annotations visible again, you can click on the Visible button instead of clicking individually on each of them. 2.4.5 Command line feature: creating annotations via a predicate entered with the keyboard. It is possible to create annotations by typing a predicate directly with the keyboard. To activ
14. feature value to one of the possible colors, as follows: [Screenshots: a feature named glozz color with the value green, and the same feature with the value red.] The first of the two modes, named INDIVIDUAL DEFAULT, shows each annotation having an individual color set in its feature set with the given color; the annotations not having one are shown with the default color. [Screenshot: the Color mode menu, with the entries STYLESHEET, INDIVIDUAL DEFAULT, INDIVIDUAL STYLESHEET and CONNECTED ANNOTATIONS.] The second mode, named INDIVIDUAL STYLESHEET, does the same, with the difference that annotations not having an individual color are shown according to the current stylesheet. [Screenshot: the same Color mode menu.] 7.6.3 Co-reference chain color mode. When annotating co-reference chains, it is sometimes difficult to identify which annotation belongs to which chain in the main view. [Screenshot: sample text with several co-reference chains.]
15. hence the predicate completed. The result is as follows: [Screenshot: the Command field containing r_complement 43 44.] Important: the auto-completion relies on the currently loaded annotation model to complete type names, and on the currently available annotations to complete the IDs. Once the predicate is complete, press ENTER to have the related annotation created. At any moment, press BACKSPACE to cancel the effect of the last typed character. 3 Installing and getting started (how-to). 3.1 Download and unpack Glozz. Download the latest version of Glozz from the website http://www.glozz.org as a .tgz archive. Unpack this .tgz and you get a folder containing the distribution. [Screenshot: the downloaded .tgz archive (dist 1.0.0 beta2), the unpacked distribution folder, and the distribution folder content: CHANGELOG, data, the "glozz platform" .jar file, licence.pdf and StartGlozz.bat.] 3.2 Launch Glozz. The distribution contains the application as the "glozz platform" .jar file. It is a Java program which can be launched directly by double-clicking it. If it doesn't launch, Java is probably not correctly installed or configured on your machine. Please use Google with the words "Java download" and the name of your system to get Java installed. If you are running Windows, you can take advantage of launching Glozz via Star
16. of unit comes in a type node. The first one in our example is Noun. Then we define the featureSet, which contains as many feature nodes as needed. At the moment, there are two kinds of feature types. With a possibleValues node, we define a value to be selected among several predefined values, each of them defined in a sub-node called value. Here, for gender, there are 3 possible values: Male, Female and No. Note that we can also define a default value (here Male), which is automatically set if no value is chosen by the annotator for this field. With a <value type="free" default="" /> node, we can define a free text to be entered and, if needed, a default value. This is the case in our example for the remark feature. Here is the XML code:
    <units>
      <type name="Noun">
        <featureSet>
          <feature name="gender">
            <possibleValues default="Male">
              <value>Male</value>
              <value>Female</value>
              <value>No</value>
            </possibleValues>
          </feature>
          <feature name="count">
            <possibleValues default="Singular">
              <value>Singular</value>
              <value>Plural</value>
              <value>No</value>
            </possibleValues>
          </feature>
          <feature name="remark">
            <value type="free" default="" />
          </feature>
        </featureSet>
      </type>
      <type name="Pronoun">
        <featureSet>
          <feature name="gender">
            <possibleValues defa
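To check what a model of this shape declares, a short Python sketch can read the feature definitions back with the standard `xml.etree.ElementTree` module. The XML fragment below is a shortened, assumed rendering of the Noun example (attribute quoting in particular is reconstructed), and `describe_features` is a hypothetical helper, not part of Glozz.

```python
import xml.etree.ElementTree as ET

# Assumed rendering of part of the Noun example from this excerpt.
AAM_FRAGMENT = """
<units>
  <type name="Noun">
    <featureSet>
      <feature name="gender">
        <possibleValues default="Male">
          <value>Male</value>
          <value>Female</value>
          <value>No</value>
        </possibleValues>
      </feature>
      <feature name="remark">
        <value type="free" default="" />
      </feature>
    </featureSet>
  </type>
</units>
"""


def describe_features(xml_text):
    """Map each type name to its features, with allowed values and default."""
    model = {}
    for type_node in ET.fromstring(xml_text).iter("type"):
        features = {}
        for feature in type_node.iter("feature"):
            choices = feature.find("possibleValues")
            if choices is not None:
                # Closed list of values, with an optional default.
                features[feature.get("name")] = {
                    "values": [v.text for v in choices.findall("value")],
                    "default": choices.get("default"),
                }
            else:
                # Free-text feature: no value list, only an optional default.
                free = feature.find("value")
                default = free.get("default", "") if free is not None else ""
                features[feature.get("name")] = {"values": None, "default": default}
        model[type_node.get("name")] = features
    return model


model = describe_features(AAM_FRAGMENT)
print(model["Noun"]["gender"])
```

A listing like this makes it easy to spot a feature whose default is not one of its possible values before handing the model to annotators.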
17. or schemas. [Four screenshots, captioned: going to the start, clicking the start, moving to the end, clicking the end.] Note: for an element being itself a relation, you can put the mouse over any point of its line, but for an element being a schema, you have to put the mouse over its main circle. Selecting & editing relations: it is exactly the same procedure as for units; please refer to the previous section. Deleting a relation: it is exactly the same procedure as for units; please refer to the previous section. 2.3.3 Schemas. Schemas being richer structures than uni
18. [Screenshot: sample text with relations.] And so on: Relation to Relation, Relation to Schema, Unit to Schema. 1.1.3 Schemas. A schema is a set of as many URS elements as wished. Hence, a given schema can contain some units, but also some relations, and even some other schemas. This makes it possible to construct deep structures recursively. Let's see some possible configurations: [Screenshot: a schema embedding 3 units.] [Screenshot: a schema embedding 3 units and a relation.] [Screenshot: a further configuration.]
19. the corpus to compute the entropy of this text. These principles are integrated in Glozz in a dedicated tool called Aligner. They will be shown in section 10.3, after introducing the views provided by the Aligner tool in section 10.2. However, please note that this tool is still under development (the current version of Glozz is 1.0.0 at the time of writing this version of the manual) and should in the future make it possible to choose the dissimilarity function to work with and to adjust the inter-categorial matrix. 10.2 Aligner special view. When a text is annotated by several annotators (human or software), Glozz provides a special view which separates the annotations of each annotator and shows them on a horizontal line, one line per annotator. At the moment (Glozz version 1.0.0), only Units are considered by this tool. To launch this module, use the Viewers menu and click on Alignment. [Screenshot: the Viewers menu.] 10.2.1 Overview. [Screenshot: the Alignment window, with a Zoom control, one line for each of the annotators jliabeuf, alabadie and tvallee, and the fields Last random entropy, Last computed entropy and Rate agreement.] In this example, annotations come from three different annotators, and so are separated into three horizontal lines. One line represents the annotations from one annotator on the whole text, from the first character on the left to the last character on the right. On the left of each line is shown the a
20. [Screenshot: the Wikipedia article on Henri de Toulouse-Lautrec in a web browser.] First, we copy the content and paste it in a text editor. Then we save the file as .txt (Lautrec.txt), encoded in UTF-8. [Screenshot: Lautrec.txt in a text editor, after some cleaning if necessary; the file starts with the title, "From Wikipedia, the free encyclopedia" and the biographical summary.]
21. two annotations will be given the same ID. This date is the number of milliseconds elapsed since January 1st, 1970, as used notably in the Java language. Here is an example of a real ID in Glozz, with the annotator's login ymathet: ymathet_1290167040405. 1.4.2 Friendly IDs. However, these real IDs are not user friendly, being far too long. Hence, Glozz uses friendly IDs, a parallel ID system consisting simply of numbers from 1 to the number of annotations of the current annotated text. The first annotation of the text has the friendly ID 1, the second has the friendly ID 2, etc. This is much easier to handle, but be careful: it makes sense only within a Glozz session, and there is absolutely no guarantee that an annotation which is given, for example, the friendly ID 312 will have the same number the next time. Consequently, you should never communicate to other people, nor store for yourself, friendly IDs, but only real IDs. Of course, this is what is done when saving your annotations in a file with Glozz: even if you have used friendly IDs when working, the real ones are stored. Please refer to the User Interface chapter (section 2.3.4) to see how to choose between real and friendly IDs when working. 2 User interface. 2.1 Overview. The main interface comes with 6 main frames: [Screenshot: the Glozz 0.9.9 main window, logged as ymathet, with the menus File, Options, Import, Export, Tools, Admin, Groups, Viewers and SandBox.]
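The two ID systems described above can be sketched in a few lines of Python. The login-underscore-milliseconds layout follows the example ID quoted in the text; the function names are hypothetical, and the numbering order used for friendly IDs (simple list order) is an assumption, since this excerpt does not say how annotations are ordered.

```python
from datetime import datetime, timezone


def make_real_id(login, when=None):
    """Build a real ID: login, underscore, milliseconds since 1970-01-01 UTC."""
    when = when or datetime.now(timezone.utc)
    return f"{login}_{int(when.timestamp() * 1000)}"


def parse_real_id(real_id):
    """Split a real ID into its login and its creation date (UTC)."""
    login, _, millis = real_id.rpartition("_")
    return login, datetime.fromtimestamp(int(millis) / 1000, tz=timezone.utc)


def friendly_ids(real_ids):
    """Number annotations from 1; valid only for the current list (session)."""
    return {real_id: number for number, real_id in enumerate(real_ids, start=1)}


login, created = parse_real_id("ymathet_1290167040405")
print(login, created.date())
print(friendly_ids(["ymathet_1290167040405", "ymathet_1290167040999"]))
```

Because friendly numbers depend only on the position in the current list, adding or removing an annotation shifts them, which is why only real IDs are safe to store or exchange.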
22. 1 Overview. This tool is another way of viewing and creating annotations, and works simultaneously with the other views of the application. It shows each annotation as a predicate, with its name and arguments, and so consists of a complete list of the currently loaded annotations. Let's see an overview of this module in the following screenshot: [Screenshot: the predicate list, with the Sort Type, Sort Date and Visible buttons, entries from u_Phrase 4438 4950 ID 38 to r_sujet 8 9 ID 50, and the Command field.] Remark: if there are too many annotations to list, a vertical scrollbar appears on the right of the window. Units: for instance, the first visible line, u_Phrase 4438 4950 ID 38, is related to the annotation whose ID is 38, which is a Unit (since it starts with u_), whose category is Phrase, and which starts at character 4438 and ends at character 4950. Relations: the last visible line shows the relation (since it starts with r_) whose ID is 50, whose category is sujet, and which links the annotation of ID 8 to the annotation of ID 9. Schemas: a schema predicate starts with s_, and its arguments are the IDs of the nested annotations of the related schema. 2.4.2 Navigation and selection. This tool is reactive: when you put the mouse over any part of a predicate (no need to
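The reading rules given above (prefix, category, arguments, then ID) can be turned into a small parser. The exact punctuation of a predicate line is not recoverable from this excerpt, so the sketch below only assumes a `u_`, `r_` or `s_` prefix, a category, then integers, the last of which is the friendly ID; `Predicate` and `parse_predicate` are hypothetical names, and real IDs are not handled.

```python
import re
from dataclasses import dataclass

KINDS = {"u": "unit", "r": "relation", "s": "schema"}


@dataclass
class Predicate:
    kind: str          # "unit", "relation" or "schema"
    category: str      # e.g. "Phrase" or "sujet"
    args: list         # offsets for a unit, linked or nested IDs otherwise
    friendly_id: int


def parse_predicate(line):
    """Parse one list entry such as 'u_Phrase 4438 4950 ID 38' (assumed layout)."""
    head = re.match(r"\s*([urs])_(\D+)", line)
    if head is None:
        raise ValueError(f"not a predicate: {line!r}")
    # Every integer after the category; the last one is taken as the ID.
    numbers = [int(n) for n in re.findall(r"\d+", line[head.end():])]
    if len(numbers) < 2:
        raise ValueError(f"missing arguments or ID: {line!r}")
    *args, friendly_id = numbers
    return Predicate(KINDS[head.group(1)], head.group(2).strip(" ("), args, friendly_id)


unit = parse_predicate("u_Phrase 4438 4950 ID 38")
relation = parse_predicate("r_sujet 8 9 ID 50")
print(unit)
print(relation)
```

Since the parser only looks for digits after the category, it also accepts lines where the arguments are wrapped in parentheses or separated by commas.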
23. [Table of contents, continued; legible entries include: 10.3.2.3 Editing an alignment; 10.4 Agreement; Simple search; Appendix: hashcode algorithm.] Remark: what we call corpus in Glozz, and in this manual, is an annotated text, not a collection of annotated texts as it usually means. 1 Introducing Glozz. Glozz is a multi-purpose annotation tool which can be set to cope with most paradigms. It has been developed since September 2008 within the French ANR Annodis project, involving the CLLE-ERSS, GREYC and IRIT laboratories, by (in alphabetical order) Yann Mathet and Antoine Widlöcher from the GREYC. A third developer, the engineer Jérôme Chauveau, joined the development team in October 2010 for some months. It comes with a fully WYSIWYG interface, which makes it possible to annotate texts with rich annotation models, and provides additional features such as the query language GlozzQL and some export features (SQL, CSV). 1.1 Meta-model: the Unit-Relation-Schema generic model. Glozz relies on the URS (Unit-Relation-Schema) meta-model from W
24. [Screenshot: the Henri de Toulouse-Lautrec text with its typographic annotations.] The corpus is now created. As the corpus provider, you now have to provide your annotators with both the Lautrec.ac and the Lautrec.aa files, built as we have just seen. It is very important that all your annotators start their work from these files, and then modify Lautrec.aa with their own annotations. Indeed, the import from the .txt file, then the cleaning, then adding or modifying typography, must be done once and for all, so that annotations made by different annotators can be compared, merged, etc. 5.2 Automatic corpus creation via a custom program. You may want to automate the creation of your corpora in Glozz format. To do so, you may use software provided by other people, or you may need to develop your own application. We provide in this section some more precise data about the Glozz ac
25. [Screenshot: the list of results, made of u_paragraph units with their positions and IDs]

Then we can click on any item in the list to have it selected in Glozz:

[Screenshot: a result selected in the list and highlighted in the main view]

So we have just seen how to get all the results and, for each of them, how to have it selected in Glozz in order to see it, edit it, remove it, and so on.

Now let's try to get all the Units not containing this text. We already have the C1 constraint, which expresses that a unit contains it. Let's create the opposite using the NOT constraint:

[Screenshot: the Logic panel, with the And, Or and Not buttons]

We apply it to C1, which is in fact the only possibility at the moment.
26. 9.2.2 Auto-selection

When the mouse moves over an element of the graph, that element is automatically selected in all the other views.

9.2.3 Zoom in, zoom out

It is possible to zoom in and out either with the box at the top of the window or, more conveniently, with the mouse wheel.

9.2.4 Showing embedded text

Two modes are available for the way units are represented in the grapher. The default one shows each unit as a simple number (see the left screenshot below), while the second one shows the embedded text (see the right screenshot below).

[Screenshots: the same graph with units shown as numbers (left) and with their embedded text (right)]

To switch from one mode to the other, use the dedicated button.

9.3 SDRT layout

By default, the graph is shown in an SDRT-like mode, using boxes to show embedding structures. It is the best mode for annotations related to the SDRT theory. It can be reselected by choosing "Boxes" in the layout option of the toolbar.

[Screenshot: the Layout menu, with the entries Boxes, Relations Coref and Schemas Coref]

9.4 Co-reference chain layouts

For annotations related to reference chains, two special layouts were created, depending on whether the annotations use relations or schemas to create chains.

9.4.1 Co-reference using relations

Assume a text is annotated with two co-reference chains, using units linked together by relations:

[Screenshot: a text annotated with two co-reference chains]
27. Creating your own data folder(s)

Please read the next section to understand what kind of data is involved in Glozz and how to create the associated folders.

4 File types overview

Glozz uses 5 file types. Two of them are dedicated to storing corpora: each corpus is stored as a pair of files, one .ac and one .aa.

.ac is a text file containing all the characters of the corpus, including space characters and punctuation, but no line feed. Consequently, it appears as one very long single line with no typography, and should never be modified since other files rely on it.

.aa is an XML annotation file constructed and updated relying on a given .ac file. It contains all annotation marks, including typographic ones (titles, paragraphs, lists, etc.) and, of course, the manual annotations which will be made. An .aa file must be used only with its associated .ac file, but Glozz will prevent you from making any mistake, since each .aa remembers which .ac it is associated with through a hashcode. Note that for a given .ac you may create as many .aa files as you wish. This will be the case in particular when a corpus is annotated by several annotators, each of them working on his or her own .aa.

.aam is an XML file describing an instance of what will be called an annotation model. It is where all the types of entities available for a given annotation campaign are defined. One .aam file may be used for several corpora.

.aas is an XML file describing a styles
28. GREYC

Glozz User's Manual
http://www.glozz.org
Version 1.0
Date: May 14th, 2011
Author: Yann Mathet

Glozz was created within the French ANR project Annodis by (in alphabetical order) Yann Mathet and Antoine Widlöcher.
mathet@unicaen.fr, widlocher@unicaen.fr

Table of contents

1 Introducing Glozz
1.1 Meta-Model: the Unit-Relation-Schema generic model
1.2 Annotation Model: a Unit-Relation-Schema instantiation for a given annotation campaign
1.3 Feature Sets
1.4 Annotation IDs in Glozz: real IDs versus friendly IDs
1.4.1 Real IDs
1.4.2 Friendly IDs
2 User Interface
2.2 Main and macro views: viewing and navigating
2.3 Annotating how-to: adding and editing annotations
2.3.1 Units
2.3.2 Relations
2.3.3 Schemas
2.3.4 Working with real IDs or friendly IDs
2.4 Annotations as predicates (Frame 6)
2.4.2 Navigation and selection
2.4.4 Managing visibility for each individual annotation
2.4.5 Command line feature: creating annotations via a predicate entered with the Ke
29. GlozzQL: Glozz Query Language

GlozzQL is a query language dedicated to Glozz annotations (Units, Relations, Schemas). It relies on simple concepts and comes with a full graphical interface to create queries and observe results. It is available at any moment, even while annotating.

8.1 GlozzQL fundamentals

GlozzQL relies on two main concepts which interoperate: Constraint and Constrained Annotation.

8.1.1 Constraint

A constraint expresses a condition an annotation must satisfy to be selected. 20 kinds of constraints are defined (more should come in the next versions), which enables a large scope of queries. Some constraints concern a specific type of entity (Unit, Relation or Schema), whereas others are universal. Three special constraints (Not, Or, And) apply to constraints themselves, in order to combine them.

Universal constraints:
- Feature: a certain feature must match a certain value (e.g. gender = female).
- Type name: the type name must match a certain value (e.g. pronoun).
- Type: specifies the type among Unit, Relation and Schema.
- Author: specifies the author's login. Useful for multi-annotated texts.
- Last Author: specifies the login of the last person who has modified the annotation.
- Distance < x: specifies a maximum distance (in characters) between this annotation and another specified entity.
- Distance > x: the same, but for a minimum distance.
- After: the annotation must come after (when reading) another specified one.
- Befo
30. Looking for mistakes

In most annotation campaigns, some configurations are forbidden. However, the annotation model does not provide such a capability, and an annotator may create annotations that are wrong with respect to the campaign. For instance, the campaign may require that a Proposition contain exactly one Verb. In such cases, it may be possible to express these forbidden configurations in GlozzQL, save them as, for instance, a forbidden.gql file, and provide the annotators with it. Hence, at any moment, the annotators can check their annotations by clicking on "Recompute All": if any occurrence appears in the results, it means a mistake was made, and a click on the occurrence selects it immediately so that it can be corrected or removed.

8.7.3 Basic statistics

Since GlozzQL gives, for each ConstrainedAnnotation, the number of corresponding occurrences, it can be used to check some assumptions and compute some statistics. However, the SQL export may give faster results and, moreover, makes it possible to work with several annotated texts at the same time, contrary to GlozzQL in the current 1.0.0 version.

9 Grapher: annotations shown as a graph

9.1 Overview

With some complex annotation structures, namely with nested schemas and relations, it may become difficult to handle the main views (see the screenshot on the left below). Glozz provides a module which renders the annotations as a graph (see on the right bel
31. [Screenshot: the main view and the predicate list (Frame 6), with entries such as u_Verbe and u_Phrase shown with their positions and real IDs]

This option can be changed at any moment, as often as necessary, via the option panel, by checking or unchecking the dedicated checkbox.

[Screenshot: the option panel, with the tabs Directories, Typography, Control and Viewer, and the checkboxes "Consider words as atoms", "Select objects once created", "Use user's friendly IDs (versus real IDs)" and "Use icons (versus text) in toolbars"]

2.4 Annotations as predicates (Frame 6)

2.4
32. [Screenshot: a text annotated with schemas, one schema per reference chain]

Then choosing the "Schemas Coref" layout as follows:

[Screenshot: the Layout menu, with the entries Boxes, Relations Coref and Schemas Coref]

will result in a graph showing each schema horizontally.

10 Glozz Aligner: alignment and agreement tool for multi-annotated texts

10.1 Principles

A new approach to alignment and inter-annotator agreement measurement has been developed in (Mathet & Widlöcher 2011), which provides a unified method for both aligning and measuring. Please refer to this article for an introduction to this method.

To sum up the agreement measure principle: the set of multi-annotations is considered as generating some disorder, called entropy, compared to a set of annotations with full agreement. Hence each set of multi-annotations is given an entropy value. Besides, a random entropy is computed by automatic analysis of a reference corpus of a given campaign. This can be done once and for all for a given campaign. Then, what is called agreement is the value:

Agreement = (randomEntropy - entropy) / randomEntropy

For a full agreement, entropy = 0 and agreement = 1. For annotators doing no better than random, agreement = 0. In some cases, it is possible to get a negative agreement value, when annotators do worse than random.

So what we need to do when we want to get the agreement of a multi-annotated text is to compute the random entropy of
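As a purely illustrative sketch (not part of Glozz), the agreement value defined in this section could be computed as follows in Python. The function and argument names are hypothetical, and the two entropy values are assumed to have been computed beforehand.

```python
def agreement(entropy: float, random_entropy: float) -> float:
    """Agreement = (randomEntropy - entropy) / randomEntropy.

    Both arguments are assumed to be already known: `entropy` for the
    multi-annotated text, `random_entropy` for the campaign as a whole.
    """
    if random_entropy <= 0:
        # The formula divides by the random entropy, so it must be positive.
        raise ValueError("random_entropy must be strictly positive")
    return (random_entropy - entropy) / random_entropy


# Full agreement: no disorder at all.
print(agreement(0.0, 2.0))   # 1.0
# Annotators doing no better than random.
print(agreement(2.0, 2.0))   # 0.0
# Annotators doing worse than random: negative agreement.
print(agreement(3.0, 2.0))   # -0.5
```

The three calls reproduce the three cases described in the text: full agreement, random-level agreement and worse-than-random agreement.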
33. "Show typographical annotations (corpus reload required)" and then check the box "Show typographical annotations". We then have, as stated, to reload the corpus. To do so, we can now use the convenient "Load last job" entry in the File menu, which saves us from choosing the same .ac and the same .aa again. Each paragraph then appears with a frame around it. This means it is shown as it really is in Glozz: an annotated unit with the type "paragraph".

[Screenshot: the Glozz 0.9.9 main window showing the Lautrec text, with each paragraph framed]
34. XT_IMPORTER</author>
      <creation-date>1290167040404</creation-date>
      <lastModifier>n/a</lastModifier>
      <lastModificationDate>0</lastModificationDate>
    </metadata>
    <characterisation>
      <type>title</type>
      <featureSet/>
    </characterisation>
    <positioning>
      <start><singlePosition index="0"/></start>
      <end><singlePosition index="25"/></end>
    </positioning>
  </unit>
  <unit id="TXT_IMPORTER_1290167040405">
    <metadata>
      <author>TXT_IMPORTER</author>

fig. 2: an excerpt of the Lautrec.aa file

As you can see in fig. 1, the .ac file should contain only text characters, including spaces and punctuation, but no line feed mark.

Let's see the main points of the .aa file in fig. 2. The main node is annotations. It contains the metadata corpusHashcode, which is a code identifying which .ac file corresponds to this .aa file. It relies on a specific algorithm which uses the length of the text (10680 in our example) for the first number, and a modulo of the content of the text for the second number. The algorithm is provided in the appendices of this manual. If you do not wish to implement this algorithm, you can create your file with no corpusHashcode node, then open the corpus in Glozz, then save it. The hashcode will be automatically created and stored on save.

Then come some unit nodes. The id parameter is co
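A minimal Python sketch of such a custom program is given below, under stated assumptions. It builds the single-line .ac text and an annotations tree with one unit per paragraph, and deliberately writes no corpusHashcode node, relying on the open-then-save step described above. The element names are those reconstructed from the excerpt in fig. 2; the function names, the MY_IMPORTER login and the choice of the "paragraph" type are hypothetical.

```python
import xml.etree.ElementTree as ET


def make_unit(unit_id, author, created_ms, type_name, start, end):
    """Build one <unit> node following the layout shown in fig. 2."""
    unit = ET.Element("unit", id=unit_id)
    meta = ET.SubElement(unit, "metadata")
    ET.SubElement(meta, "author").text = author
    ET.SubElement(meta, "creation-date").text = str(created_ms)
    ET.SubElement(meta, "lastModifier").text = "n/a"
    ET.SubElement(meta, "lastModificationDate").text = "0"
    char = ET.SubElement(unit, "characterisation")
    ET.SubElement(char, "type").text = type_name
    ET.SubElement(char, "featureSet")  # empty feature set
    pos = ET.SubElement(unit, "positioning")
    ET.SubElement(ET.SubElement(pos, "start"), "singlePosition", index=str(start))
    ET.SubElement(ET.SubElement(pos, "end"), "singlePosition", index=str(end))
    return unit


def build_corpus(paragraphs, author="MY_IMPORTER", created_ms=1290167040404,
                 type_name="paragraph"):
    """Return (ac_text, annotations_root) for a list of paragraph strings."""
    root = ET.Element("annotations")  # no corpusHashcode node (see text above)
    pieces, offset = [], 0
    for i, raw in enumerate(paragraphs):
        # The .ac text must not contain any line feed.
        text = raw.replace("\r", "").replace("\n", " ")
        stamp = created_ms + i  # assumption: one distinct millisecond per unit
        root.append(make_unit(f"{author}_{stamp}", author, stamp,
                              type_name, offset, offset + len(text)))
        pieces.append(text)
        offset += len(text)
    return "".join(pieces), root


ac_text, root = build_corpus(["Henri de Toulouse-Lautrec",
                              "From Wikipedia, the free encyclopedia"])
print(len(ac_text))
print(ET.tostring(root, encoding="unicode")[:80])
```

The two results would then be written to a .ac file and a .aa file respectively, for instance with `ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)` for the annotations.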
35. [Screenshot: automatic alignment of the units of three annotators]

As you can see, some alignments involve all three annotators, whereas others involve only two, or only one (units left alone).

Remark: in this example, you can see that most alignments concern a single category (pink or orange), but one concerns a green unit and a pink one. The reason is that we have set the categorial matrix so that the green and pink categories may be aligned. In the next versions of this tool, it will be possible to set this matrix manually or automatically, but at present, in version 1.0.0 and contrary to the example above, only units of the same category may be aligned.

10.3.2 Manual alignment

You may need to make some alignments manually, for instance in order to create real examples of what you consider as aligned or not, and then use the print function.

10.3.2.1 Choosing special mode

First of all, switch to the special mode by clicking on the button as follows:

[Screenshot: the "Alignment mode: add, remove or edit alignments" button]

10.3.2.2 Creating a new alignment

Then you can create a new alignment by clicking on one unit of each annotator, in whatever order (in our example, we go from top to bottom). Once you have finished, double-click: the alignment is done. It may involve as many annotators as you want (all three in our example).

10.3.2.3 Editing an alignment

To select an alignment, put the mouse over one of its lines, which becomes red t
36. ain views, from no annotation at all (if the cursor is set completely to the left) to all annotations (if the cursor is set completely to the right). Below are two examples for the same annotated text, with two different cursor positions:

[Screenshots: the same annotated text with two different cursor positions]
37. at a schema will belong to Schema1 if it contains an element of Unit1, but also if there exists a relation belonging to Relation1 whose target is this schema. Consequently, Schema1 is constrained by C3 in addition to its natural constraint C2, and Unit1 is constrained by C2 and by C3 in addition to its natural constraint C1, which is very restrictive compared to one-way constraints.

Let's see what happens with our corpus now that we have activated this option. To do so, we have to recompute everything with the "Recompute All" button located at the bottom left of the window.

We see that, with these constraints working both ways, the matches of Schema1 go from 4 to 1 and the matches of Unit1 go from 16 to 1:

Annotation   Constraint                                                    Matches
Unit1        C1: TypeName = Verbe                                          1
Schema1      C2: Contains Unit1 (Level = infinite, min 1, max No limits)   1
Relation1    C3: TargetContains Schema1 (Level = infinite, min 1)          1

What we have discovered here is that there is only one unit of type Verbe which is contained (at any level) in a schema which is itself contained (at any level) in the target of a relation. This way, the semantics of the constraints becomes much more powerful, with no additional work for the user. It is up to you to decide when to activate it, depending on the semantics you want to get. You can activate or deactivate it at any moment, but do not forget to recompute all each time you want to see the
38. ate this feature, click in the command line: it then appears with a green background.

[Screenshot: the command line, below the predicate list]

Then you just have to write the predicate in exactly the same manner as it appears in the list. This feature provides auto-completion and, moreover, works with auto-completion only, so that it is impossible to write a wrong predicate. At any moment, if you type a character which is not compatible with a possible predicate, the character is not entered; on the contrary, if it is a way to complete the predicate being built, it is taken into account and automatically completed with additional characters when there is no other possibility.

Let's take an example. Assume we want to create a relation of type complement between annotations 43 and 44. The associated predicate will be r_complement(43,44). In fact, all we have to type on the keyboard is rc4344. Indeed, here is what we type and what we get via auto-completion:

Keyboard   Displayed command     Remark
r          r_                    at this stage, only u_, r_ and s_ were possible
c          r_complement(         among current relation types, only complement begins with the character c
4          r_complement(4        several annotation IDs begin with 4
3          r_complement(43,      no other ID than 43 begins with 43, hence the comma
4          r_complement(43,4
4          r_complement(43,44)   same remark
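The behavior walked through in the table above can be sketched in Python as a prefix filter over the set of currently valid predicates. This is only an illustration under stated assumptions: it enumerates candidate predicates as plain strings, which is not necessarily how Glozz implements the feature, and the function names and the candidate list are hypothetical.

```python
import os


def type_key(displayed, key, candidates):
    """Return the new displayed command after one key press.

    The key is rejected (display unchanged) if no candidate predicate starts
    with the resulting text; otherwise the display is extended as far as all
    the remaining candidates agree.
    """
    attempt = displayed + key
    matches = [c for c in candidates if c.startswith(attempt)]
    if not matches:
        return displayed  # incompatible character: not entered
    # os.path.commonprefix works character by character on any strings.
    return os.path.commonprefix(matches)


def type_keys(keys, candidates):
    """Apply type_key for each typed character and record each display."""
    displayed, steps = "", []
    for key in keys:
        displayed = type_key(displayed, key, candidates)
        steps.append(displayed)
    return steps


# Hypothetical set of predicates that would be valid at this moment.
CANDIDATES = [
    "u_Verbe(10,20)",
    "s_Chaine(41,43)",
    "r_sujet(41,43)",
    "r_complement(41,44)",
    "r_complement(43,44)",
    "r_complement(43,47)",
    "r_complement(43,52)",
    "r_complement(52,43)",
]

for step in type_keys("rc4344", CANDIDATES):
    print(step)
```

With this candidate list, typing rc4344 goes through the same six displayed commands as in the table, and a character such as x typed at any stage leaves the display unchanged.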
39. [Excerpt of the Lautrec text file prepared for import]

Now we can launch Glozz and use the txt import facility:

[Screenshot: the Import menu]

We have to fill in 3 paths and the encoding of the txt file:

Input txt (.txt):           demoGlozz/GlozzFiles/txt/Lautrec.txt   (encoding: UTF-8)
Output txt (.ac):           demoGlozz/GlozzFiles/ac/Lautrec.ac
Output annotations (.aa):   demoGlozz/GlozzFiles/aa/Lautrec.aa

To do so, for each field, we can use the Browse button instead of writing the paths manually. Since it is the first time we store or load this type of data in Glozz, we have to browse through our folders to reach the correct places, as shown in fig. 1. Next tim
40. depth of B is 1 and the depth of C is 0, and so on.

To launch this tool, use the menu as follows:

[Screenshot: the menu entries "Find text (shift+f)", "Find Units (shift+u)" and "Depth selector (shift+d)"]

You get this additional tool on the right side:

[Screenshot: the depth selector, with Min Depth set to 0 and Max Depth set to No limits]

Two sliders set the min and the max depth of the units to be shown, respectively. A reset button resets min to 0 and max to no limit. A red cross closes the tool.

Let's see what happens when playing with the two cursors:

[Screenshots: the same text shown with different Min Depth and Max Depth settings]
41. dition to what we have just seen in the last sub-section, an annotation model defines, for each of its types, a feature set model to be filled by its elements. For instance, if we consider again our example of annotation model regarding co-reference, we could define for the noun Unit type (and the same for pronoun) the feature set model as follows:

[Table: features and their possible values: Male, Female, Neuter; Singular, Plural]

Suppose now we annotate "John" as a noun in a text. We will fill its feature set as follows:

[Table: feature values for "John"]

And suppose we annotate "the cars". The feature set will be:

[Table: feature names and values, starting with Gender]

Even if our examples concern units, feature sets can be used in exactly the same way for Relations and Schemas. Note that at the time this manual is written (Glozz version 1.0), feature sets are not recursive (a value cannot itself be a feature set), but this may be added in the future.

1.4 Annotation IDs in Glozz: real IDs versus friendly IDs

1.4.1 Real IDs

Each annotation in Glozz has its own ID, which distinguishes it from any other one in the world. This is very important so that no conflict may appear when, for instance, merging several resources. To do so, Glozz defines an ID by concatenating the annotator's login and the exact date of creation of the annotation. Since each annotator in the world has a unique login (thanks to the login attribution procedure) and the exact date is expressed in milliseconds, there is no risk
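The construction of a real ID can be sketched as follows in Python. This is a hedged illustration only: the underscore separator is inferred from IDs that appear elsewhere in this manual (for instance ymathet_1246557906501 and TXT_IMPORTER_1290167040405), and the function names are hypothetical.

```python
import time


def make_real_id(login, created_ms=None):
    """Concatenate the annotator's login and a creation date in milliseconds."""
    if created_ms is None:
        created_ms = int(time.time() * 1000)  # current time, in milliseconds
    return f"{login}_{created_ms}"


def split_real_id(real_id):
    """Split a real ID back into (login, creation date in milliseconds).

    The split is done on the LAST underscore, since a login may itself
    contain underscores (e.g. TXT_IMPORTER).
    """
    login, _, stamp = real_id.rpartition("_")
    return login, int(stamp)


print(make_real_id("ymathet", 1246557906501))
print(split_real_id("TXT_IMPORTER_1290167040405"))
```

Because the login is unique per annotator and the timestamp has millisecond precision, two IDs built this way only collide if the same annotator creates two annotations within the same millisecond.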
42. [Screenshot: co-reference chains shown in the main view]

Of course, the Grapher can be used with its dedicated layout, but it is also possible to identify the chains through colors directly in the main views. To do so, choose the option as follows in the toolbar:

[Screenshot: the Color mode chooser, with the values STYLESHEET, INDIVIDUAL DEFAULT, INDIVIDUAL STYLESHEET and CONNECTED]

Each element of a connected chain is then given the same color (3 chains, hence 3 colors in our example):

[Screenshot: the same text with one color per chain]

8
43. e field.

[Screenshot: the search tool with "diam" as searched text]

You can use a second field to specify a unit type the results must belong to. For instance, here we are looking for the "diam" sequence belonging to a Verbe type only:

[Screenshot: searched text "diam", unit type "Verbe"]

And we now get only one result:

[Screenshot: the single matching occurrence of "diam"]

Once again, if the current combination of the searched text and the specified unit type does not have any occurrence, it appears in red. This may happen while you are typing the unit type, until the type name is fully entered.

This tool comes with an additional feature which automatically creates units around each occurrence of a given character sequence. Let's take an example: we want to annotate each "libero" word of this text as a unit of type Nom. To do so, we create a first unit of this type:

[Screenshot: a first unit created around "libero"]

Then we select this unit:

[Screenshot: the unit selected]

Now, back in the search tool, we enter "libero" as searched text. Then we can of course jump from result to result using the Next button, but each time we can choose to click on Annotate to automatically create a Nom unit around the current occurrence.

If you want to anno
44. ee dedicated buttons (icons with character). For instance, to add a unit, once you have clicked on the "adding a unit" button, move the mouse over the unit you want to add in frame 1, then click on it. As long as you stay in this mode, you can add as many units as you want by clicking on them.

[Screenshot: adding a third unit (number 78) to the current schema]

Editing a schema

To edit a schema, first click on the "selecting a schema" button. Then, in frame 1, click on the schema you want to select: it appears in green when the mouse is over it, then in red once selected by clicking. Remark: you have to click on the circle of the schema to select it, not on one of its components. Its information (ID, number of units and relations) appears in the right panel.

[Screenshot: a selected schema and its information in the right panel]

Use the 3 buttons with a icon to add units, relations or schemas, and the 3 button
45. elationship), one kind of schema (reference chain). With this annotation model, we can annotate all the nouns and pronouns of a text with the two kinds of units, then create as many reference chains as needed, each of them containing some noun and pronoun units, then draw relations between some reference chains (schemas).

Of course, this is a very simple example. Some campaigns use a lot of types for each of the 3 meta-types, some others use a few, or even use units and relations only, or units and schemas only, etc. This deeply depends on the annotation task and the way we model it.

As you can see, Glozz is not devoted to any special paradigm. It can be configured for any task by defining a relevant annotation model. It is also possible to mix different models within a given campaign, for instance syntax and semantics, either by defining a whole annotation model containing syntactic and semantic categories, or by defining two annotation models, one for syntax and the other for semantics, which would be used respectively at a first stage and at a second stage of the annotation process.

1.3 Feature Sets

Each annotation element can be associated with a feature set, i.e. a set of (feature name, feature value) pairs, which provides additional individual information. For an individual element, the set of features it embeds is defined by the type of element it belongs to. But for each of these features, this individual element has its own value. So, in ad
46. es, these folders will be proposed as soon as you click on the Browse buttons. So, in our example, we choose the input text Lautrec.txt we have just created. Since it was generated in UTF-8, we leave the default UTF-8 value as it is in the next field. For the outputs, we browse until we are respectively in the GlozzFiles/ac and GlozzFiles/aa folders, and then complete the names via the keyboard with Lautrec.ac and Lautrec.aa. Note that Glozz will automatically complete the file names with the correct suffixes if you write only Lautrec instead of Lautrec.ac and Lautrec.aa.

Once done, you can open this new corpus via the File menu or the shortcut button:

[Screenshot: the load dialog, with the corpus (.ac) set to demoGlozz/GlozzFiles/ac/Lautrec.ac and the annotations (.aa) set to demoGlozz/GlozzFiles/aa/Lautrec.aa]

What we get is the text organized in simple paragraphs:

[Screenshot: the Glozz 0.9.9 main window showing the imported Lautrec text]
47. esides, you can edit the feature set associated with this unit just by clicking on the values. Indeed, as soon as a unit is selected, the feature set controller (frame 5) is automatically set to it, showing its features and enabling edition.

Deleting a Unit

To delete a Unit, select it; then you can either press the Delete key of your keyboard (or CTRL+Backspace on a Mac), or click on the trash icon at the top of the screen.

2.3.2 Relations

The main procedure is very close to the one concerning units, shown in the previous section. Hence we will show here the specific points only.

Creating a Relation

First click on the "create a new relation" mode button, which activates this mode and also the related part of the annotation model selector.

[Screenshot: the "Create a new relation" button and the Relations part of the annotation model selector, with the types sujet and complement]

Then the procedure is very simple (reminder: a relation is always a link between two annotation elements, these being units, relations or schemas, and cannot point to a part of the text that has no annotation):
- put the mouse over the start element (it becomes red);
- click;
- put the mouse over the end element (it becomes red);
- click.

Here is an example where start and end elements are units, but it would be the same with relations
48. et de la mesure d'accord inter-annotateurs. Montpellier, actes de TALN 2011.
Mathet Y., Widlöcher A. (2011). Stratégie d'exploration de corpus multi-annotés avec GlozzQL. Montpellier, short paper, actes de TALN 2011.
Widlöcher A., Mathet Y. (2009). La plateforme Glozz : environnement d'annotation et d'exploration de corpus. Senlis, actes de TALN 2009.
49. hanism and shows a countdown (in seconds) of the time left, because this computation can be long in some cases. During the process, the results appear progressively, and we finally get the number of matches and the corresponding list. We can click on any of the results and then see its content in a frame below (for instance Match 1 in the screenshot).

[Screenshot: the results window, showing 4 matches, each pairing an Any1 annotation with a Relation1 annotation, and the content of Match 1: a u_Pronom unit, a u_Phrase unit and an r_Elaboration relation]

Now only 4 matches appear, out of 16 without unification (such as the one with r_Elaboration), and all of them are correct (oriented from bottom to top). These results can be added to the basket.

8.7 Use cases

We report here some possible use cases of GlozzQL.

8.7.1 Splitting annotations by authors, types, etc.

Using the relevant constraints and then the basket, it is possible to create new .aa files with, for instance, all and only the annotations of a given annotator.

8.7.2
50. he left margin and the right margin. This is quite convenient when using several types of schemas, in order not to have them all on the same side. linked and linked reordered will show the schema as a path from one element to the next one. The linked mode uses the natural order of the elements (the order in which they were added to the schema), whereas linked reordered uses the textual order, from top to bottom.

7.6 Special features

Besides the use of style sheets, as we've just seen, some other special features may help for specific tasks. For this purpose, a Color mode chooser is provided in the main toolbar, set by default to stylesheet.

7.6.1 StyleSheet mode

By default, as just shown, the Color mode is set to STYLESHEET, which means that each annotation is shown with the color associated with its type in the current stylesheet.

7.6.2 Individual colors modes

Sometimes it may help to give a particular annotation a given color, whatever the stylesheet. This is possible using some special values in the annotation model. Indeed, when an annotation feature set contains a specific color element, it is possible to use the color it specifies via one of the two dedicated modes. To do so, you have to add a glozz color feature to each type of annotation you want in the annotation model. Then, for a given annotation, you can change its glozz color
51. he position is infra.

[Screenshot: main view with a pointer arrow indicating a position located below the current zone.]

2.3 Annotating: how to (adding and editing annotations)

This point concerns frames 1, 3, 4 and optionally 5. A toolbar is provided in frame 3, which lets you choose the current annotation mode: creating a unit, creating a relation, editing a unit, editing a relation, or accessing the schemas sub-menu.

Choosing a given mode results in a specific behavior of frame 1, as detailed in the next sections.

Important: before creating or editing annotations, you should load an annotation model, so that you can work with the types dedicated to your annotation task. Refer to section 1.2 to see what an annotation model is. To load one, click on the yellow button of frame 4.
52. heet for a given annotation model. You can define and use as many aas files as you wish for a given aam file, providing you with different views on the same corpus. For instance, if the aam of a campaign deals with syntax and semantics, you can create one aas file showing syntax only and a second one showing semantics only, in order to focus on one phenomenon at a time when annotating or observing annotations.

gql is an XML file storing queries expressed in the dedicated Glozz Query Language (GlozzQL). You can create as many gql files as you wish for a given campaign, or even more generic queries applying to any entities, whatever the aam.

Besides, Glozz can automatically generate a corpus in Glozz format (ac + aa) from a txt file, so you may store your txt files in a specific folder too.

Since Glozz remembers the folders where you save and load each type of file, we strongly recommend that, at least for a given campaign, you create a dedicated folder organized as follows:

GlozzFiles
    aa
    aam
    ac
    aas
    gql
    txt

5 Creating a corpus in Glozz format

5.1 Manual corpus creation from a txt file

Assume we want to pick an article from Wikipedia: the biography of the painter Henri de Toulouse-Lautrec.

[Screenshot: the Wikipedia article "Henri de Toulouse-Lautrec", beginning with "From Wikipedia, the free encyclopedia. Henri Marie Raymond de Toulouse-Lautrec-Monfa or simply Henri de Toulouse-Lautrec (French pronunciation: G
53. hen click, and the lines become blue and the unit disc yellow. Then you can remove one unit from the selected alignment: put the mouse over its yellow disc, click (the disc becomes blue), then click on the trash button. The unit is removed from the alignment, and the alignment is no longer selected.

10.4 Agreement measurement

To get an agreement measurement, we first need to compute the random entropy of our corpus. To do so, we have to put all or part of the annotated texts of our corpus in a folder. Then we can submit this folder to the tool, so that it computes the best possible entropy obtained randomly.

[Screenshot: the random entropy button.]

Then browse to your folder (montage light in our example below) by clicking on the yellow button, and choose the number of annotators to consider (in our example corpus, most of the texts are annotated by 3 annotators, hence the choice of 3 below) and a number of iterations. 100 iterations or more are recommended to get a reliable random value, but this sometimes takes too much time with certain corpora. Then you can click on the green arrow to launch the process. In our example, some seconds later, we get Entropy: 1.88181.

This is the random entropy, and it is stored in the system as long as you do not launch this random entropy computation again. Even if you quit and restart Glozz, this value will reappear. Hence you can work on the same corpus without recomputing the random value each time
54. [Screenshot: the Toulouse-Lautrec article imported in Glozz. It shows the infobox (born 24 November 1864 in Albi, Tarn, France; died 9 September 1901, aged 36; nationality: French; field: painter, printmaker, draftsman, illustrator; movement: Post-Impressionism, Art Nouveau) followed by the first paragraph of the article.]

We can go further and use some more styling for this corpus. To do so, we first have to temporarily choose to show typographical annotations. Indeed, at present, paragraphs are annotated as special units, and the Glozz renderer takes this information into account to add line feeds on the screen. But for the time being, the paragraph units are hidden. So we go to Menu > Options > Preferences > Typography.

[Screenshot: Preferences window with the tabs Directories, Typography, Control, Viewer.]
55. ial position of each unit, or applying aligner changes to Glozz, which means that every change made in the aligner tool results in a change in the annotations in Glozz. Once this option is activated, you can observe the real-time adjustment in Glozz of any change made in the aligner tool.

[Screenshot: before dragging, the red unit in Glozz starts at the comma.]

[Screenshot: while dragging, the red unit in Glozz now starts at the word "impopulaires".]

10.3 Alignment tool

An alignment consists in considering that several annotators have annotated the same phenomenon. This is obvious when they have all annotated the same unit with exactly the same bounds and the same category. However, this is far too restrictive in numerous annotation campaigns, which is why a method allowing some tolerance has been developed.

10.3.1 Auto alignment

Once the annotations are loaded and the Aligner is launched, click on the "Automatic alignments computing" button. You then get some semi-vertical lines, which mark units from different annotators as aligned.

[Screenshot: aligned units across the annotators jliabeuf, alab
56. idlöcher. This model defines 3 meta-types of elements, as follows.

1.1.1 Units

A unit is a contiguous span of text, starting at one character position and finishing at another one. Units can overlap each other, or even cover others.

[Figure: sample text annotated with two separate units, two overlapping units and several covering units.]

In the figure above, we can see first two separate units, then two overlapping ones, and finally some covering ones. Note that when a unit covers another one, this is shown visually in Glozz through a covering frontier. Hence it is possible to see which unit contains which others.

WYSIWYG: what you see is what you get, or in other words, you work directly on the screen with the data as they will appear
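The difference between overlapping and covering units can be made concrete with a small sketch. The class below is hypothetical (it is not part of Glozz), and it assumes that a unit is stored as a pair of 0-based character positions where the end position falls just after the last covered character.

```java
// Minimal sketch of a unit as a character span; names are hypothetical, not Glozz's API.
final class UnitSpan {
    final int start; // position of the first covered character (assumed 0-based)
    final int end;   // position just after the last covered character (assumption)

    UnitSpan(int start, int end) {
        if (start < 0 || end <= start) {
            throw new IllegalArgumentException("a unit must cover at least one character");
        }
        this.start = start;
        this.end = end;
    }

    /** True when both units share at least one character. */
    boolean overlaps(UnitSpan other) {
        return start < other.end && other.start < end;
    }

    /** True when this unit contains the other one entirely. */
    boolean covers(UnitSpan other) {
        return start <= other.start && other.end <= end;
    }

    public static void main(String[] args) {
        UnitSpan sentence = new UnitSpan(0, 40);
        UnitSpan word = new UnitSpan(10, 15);
        UnitSpan straddling = new UnitSpan(35, 50);
        System.out.println(sentence.covers(word));         // true
        System.out.println(sentence.overlaps(straddling)); // true
        System.out.println(sentence.covers(straddling));   // false
        System.out.println(word.overlaps(straddling));     // false
    }
}
```

With this definition, a covering unit is a special case of an overlapping one, which matches the idea that covered units are drawn inside the frontier of the unit that contains them.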
57. ing in Glozz. It's much simpler to find an error with an XML editor/validator than by just reading the XML code in a mere text editor.

As we've seen in section 1, an annotation model can be considered as an instance of the Units/Relations/Schemas meta-model from Widlöcher. It states, for a given annotation campaign, the kinds of Units, Relations and Schemas which will be available and, for each of these kinds, the structure of its feature set. In Glozz, it is an XML file with the file name extension aam. It is strongly recommended that, besides reading this chapter, you also have a look at the different aam files provided in the distribution.

We will assume here that we want to define a quite simple model for a coreference annotation campaign.

6.1 Overview

We create a file named coreference.aam and store it in our aam folder. The main structure is as follows, containing 3 main nodes (units, relations and schemas):

    <?xml version="1.0" encoding="UTF-8"?>
    <annotationModel>
        <units>
        </units>
        <relations>
        </relations>
        <schemas>
        </schemas>
    </annotationModel>

We'll see in detail what to put in the units, relations and schemas nodes in the next subsections.

6.2 Units

We would like to have two kinds of units: nouns and pronouns. Each of them will have a gender (male, female, no), a count (plural, singular, no) and a field for an additional remark. Each kind
58. ish time. This way, you can see exactly how the annotation process unfolded in time: for instance, some periods will have many annotations created, whereas other periods will have almost no new annotations.

Virtual time (True time unchecked in the interface) means that the cursor ratio corresponds to a ratio in terms of number of annotations, chronologically ranked. For instance, if a text is annotated with 100 annotations, the middle position of the line results in showing the first 50 annotations, and so on. This way, the annotations appear progressively, whatever the pace at which they were created.

11.3.3 Auto replay

To play the time automatically, you can use the magneto buttons to start, stop or accelerate the replay.

11.3.4 Caution

Before you resume annotation work, make sure the time is set back to the latest position, on the right, so that newly created annotations are not strangely hidden.

Appendices

Appendix 1: hashcode algorithm in Java

    String hash = "";
    FileInputStream s = null;
    try {
        s = new FileInputStream(f); // f is the corpus file
        int length = s.available();
        hash = length + "-"; // separator lost in extraction; "-" is assumed
        long code = 1;
        for (int i = 0; i < length; i++) {
            int n = s.read();
            if (n != 0) { // comparison lost in extraction; "!=" is assumed
                code *= n; // operator lost in extraction; multiplication is assumed
                code = code % 99999999; // operator lost in extraction; modulo is assumed
            }
        }
        System.out.println("code=" + code);
        hash += code;
    } catch (IOException e) { // not in the original fragment; added so that the try block compiles
        e.printStackTrace();
    }

References

Mathet Y., Widlöcher A. (2011). Une approche holiste et unifiée de l'alignement
59. k to create a Target constraint (Relations) whose associated ConstrainedAnnotation is Schema1, with First Level Only unchecked:

[Screenshot: "Relation target contains" dialog. Contains: Unit1; Level: infinite; First Level Only unchecked; Contains at least: 1; Contains at most: No limits.]

and get Relation1, with one utterance, which is an Elaboration:

Annotation    Constraint                                                      Matches
Unit1         C1 => TypeName: Verbe                                           16
Schema1       C2 => Contains Unit1, Level: infinite, min: 1, max: No limits   4
Relation1     C3 => TargetContains Schema1, Level: infinite, min: 1, ma...    1

One and only one relation matches this complex query.

Note that if we had specified a range for the number of utterances concerning the Schema, for instance minimum 2 and maximum 3:

[Screenshot: Constraint Creation dialog. Contained Annotation; First Level Only; Contains at least: 2; Contains at most: 3.]

we would have got, with this corpus, no match at all for the associated Schemas (see Schema2 below):

Annotation    Constraint                                                      Matches
Unit1         C1 => TypeName: Verbe                                           16
Schema1       C2 => Contains Unit1, Level: infinite, min: 1, max: No limits   4
Relation1     C3 => TargetContains Schema1, Level: infinite, min: 1, ma...    1
Schema2       C4 => Contains Unit1, Level: infinite, min: 2, max: 3           0

8.4 Saving and loading GlozzQL querie
60. l is already loaded, so that when clicking on the name, the possible unit type names are automatically proposed (here, Noun or Pronoun).

[Screenshot: style editor with the tabs Unit style, Relation style and Schema style, and the columns Type name, Background color and Hide.]

Then we can click on the color in the next column to set the Background color.

[Screenshot: color chooser dialog.]

And in the third column, we can set whether the units of this style should be shown by default, or hidden when the box is checked. In most cases, the style files will be saved with all Hide fields unchecked, and these boxes will be checked and unchecked while working on the annotations, to see only what we need. But of course, the style can be saved with some Hide boxes checked.

As a result, as soon as we've set some styles, the corresponding annotations are immediately styled, and even the annotation model appears with the associated colors (see the right of the screenshot below, where Noun ap
61. lick on the Add results to Basket button just below. For instance, with our last example, we can click on Schema2 and get the 4 utterances appearing in frame 4.

[Screenshot: list of ConstrainedAnnotations (Unit1, Schema1, Relation1) and, in frame 4, the four matching annotations: one s_Unite_Discontinue, two s_Schema_Anaphorique and one s_Schema_Elaboratif. Buttons: Recompute All, Remove Selected, Add results to Basket, Unify Variables.]

Then, clicking on Add results to Basket, we get a new window which shows the basket content.

[Screenshot: GlozzQL Basket window listing the same four schemas, with the buttons Save Basket content, Clear and Erase basket content.]
62. [Figure: a schema embedding 3 units, a relation and a schema.]

1.2 Annotation Model: a Unit-Relation-Schema instantiation for a given annotation campaign

The 3 kinds of elements which are used in Glozz are those just presented above: Units, Relations and Schemas. However, for a given annotation campaign, we will always rely on a specific instantiation of this meta-model, not directly on this generic level. In Glozz, such a specific instantiation is simply called an annotation model.

Such an instantiation of the U-R-S meta-model is merely, for each of the meta-categories U, R and S, the list of the different types we can use to annotate. In other words, when we have to annotate something as a Unit, we will never just say "this thing is a Unit", but we will choose among the different types of Units provided by the specific instantiation of the model we are using for this campaign.

Let's see an example, which will be studied in practice in section 6. Assume we need an annotation model for co-reference annotation. We could define two kinds of units (noun and pronoun), two kinds of relations (part of and r
63. mposed of the login of the creator (TXT IMPORTER in our example, since this aa file was created automatically by the Glozz text importer), a separator character and the date (the number of milliseconds since 1970, given by new Date().getTime() in Java). For your own program, you should choose your own creator name, like TXT IMPORTER in our example.

The last modifier nodes will be of no use until someone makes some changes to the annotation. So at this step, we suggest you put the values n/a and 0, respectively.

In the characterisation node, you have to put two sub-nodes: the type name (title in our example), which should correspond to an annotation model structure (typo.aam in our example, see above) in order to format the text, and the content of the feature set. There is no feature set in our example, hence the empty node.

Then comes the positioning, which is always given, for a unit, by two singlePosition nodes. Each single position is the index of a character in the ac file, starting at 0. In our example, there is a title unit starting at position 0 and finishing at position 25, corresponding to "Henri de Toulouse-Lautrec".

To be completed with feature sets, and then with relations and schemas.

6 Annotation Models: types, feature sets, groups

Note: this section requires some elementary knowledge of XML. To create an annotation model, it is recommended to use a text editor with XML capabilities, so that the document can be validated before test
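For readers writing their own importer, the ID layout and the 0-based positions described above can be sketched as follows. The class and method names are hypothetical, and the underscore separator is an assumption inferred from IDs such as ymathet_1246557946798 shown elsewhere in this manual.

```java
import java.util.Date;

// Hypothetical helper names; only the ID layout and the 0-based positions come from the text.
final class AaFileSketch {
    /** Builds an annotation ID: creator login, a separator, creation time in milliseconds. */
    static String makeId(String creator, long creationMillis) {
        // The underscore is inferred from IDs such as ymathet_1246557946798.
        return creator + "_" + creationMillis;
    }

    /** Returns the text covered by a unit, given its two singlePosition indices. */
    static String coveredText(String corpus, int start, int end) {
        return corpus.substring(start, end);
    }

    public static void main(String[] args) {
        // Assumed beginning of the ac file for the Toulouse-Lautrec example.
        String corpus = "Henri de Toulouse-Lautrec From Wikipedia, the free encyclopedia";
        System.out.println(makeId("myProgram", new Date().getTime()));
        System.out.println(coveredText(corpus, 0, 25)); // Henri de Toulouse-Lautrec
    }
}
```

The second call shows why the title unit runs from 0 to 25: the title is 25 characters long, so the end position is the index just after its last character.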
64. n moving the mouse over a unit will pre-select it: the unit appears in green and comes with an additional information panel showing its ID, its type, its author and, if it has been modified, its last author.

[Screenshot: a pre-selected unit shown in green, with its information panel.]

Then, if you click, the unit is actually selected, resulting in a red, dotted frontier.

[Screenshot: a selected unit with a red dotted frontier.]

As long as this unit is selected, you can move its begin point and end point by drag & drop. To do so, put the mouse over the green circle drawn at the begin or end position, press the mouse button, move to the new position and release. In the example below, we drag & drop the end position of the unit (the procedure is the same for the begin position).

[Screenshot: dragging the end position of a unit.]
65. [Screenshots: two views of the same text with the Glozz Time Player. Left: cursor positioned in the center (current rank 29 out of 62; start time Thu Jul 02 19:56:37 CEST 2009, current time Thu Jul 02 20:05:25 CEST 2009, finish time Thu Jul 02 20:14:08 CEST 2009). Right: cursor positioned at the last position (current rank 62 out of 62; current time equal to the finish time).]

11.3.2 True time option

Two time modes are provided in the time player: true time versus virtual time.

True time means that the cursor ratio corresponds to a time ratio. In other words, if the cursor is set to, say, the middle of the line, then the time is set to the middle of the annotation time, that is to say halfway between the start time and the fin
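The two time modes can be contrasted with a short sketch. This is only an illustration under stated assumptions, not the TimePlayer's actual code: the class and method names are hypothetical, creation times are given as a chronologically sorted array of milliseconds, and the cursor position is a ratio between 0 and 1.

```java
// Illustrative sketch only; the real TimePlayer implementation is not documented here.
final class TimePlayerSketch {
    /** Virtual time: the cursor ratio is a ratio of chronologically ranked annotations. */
    static int visibleVirtual(long[] sortedCreationTimes, double ratio) {
        return (int) Math.floor(ratio * sortedCreationTimes.length);
    }

    /** True time: the cursor ratio maps linearly onto the span between start and finish time. */
    static int visibleTrue(long[] sortedCreationTimes, double ratio) {
        if (sortedCreationTimes.length == 0) {
            return 0;
        }
        long start = sortedCreationTimes[0];
        long finish = sortedCreationTimes[sortedCreationTimes.length - 1];
        long current = start + Math.round(ratio * (finish - start));
        int count = 0;
        for (long t : sortedCreationTimes) {
            if (t <= current) {
                count++; // annotation already created at the cursor time
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Three annotations created quickly, then a long pause before the last one.
        long[] times = {0L, 10L, 20L, 1000L};
        System.out.println(visibleVirtual(times, 0.5)); // 2
        System.out.println(visibleTrue(times, 0.5));    // 3
    }
}
```

With the cursor in the middle, virtual time shows half of the annotations, whereas true time shows everything created during the first half of the session, which makes bursts and pauses in the annotation work visible.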
66. new results.

8.6.2 Unification mechanism

In GlozzQL, unification means considering ConstrainedAnnotations as variables and the set of constraints as a system of equations. Here again, an example will be more meaningful. Assume we want to find all the relations oriented in the opposite direction to reading, i.e. from bottom to top.

First we create Any1, the set of all annotations. To do so, we use the Free constraint, which accepts every annotation:

ConstraintID    Content        Scope
C1              Always true    Any

Then, by double-clicking it, we get Any1, with 128 utterances:

Annotation    Constraint           Matches
Any1          C1 => Always true    128

Now we create Any2, with the constraint of being located before an Any1.

[Screenshot: Constraint Creation dialog, "Position in text is before".]

We get C2:

ConstraintID    Content        Scope
C1              Always true    Any
C2              Before Any1    Any

And double-clicking it, we get Any2, with 127 utterances:

Annotation    Constraint           Matches
Any1          C1 => Always true    128
Any2          C2 => Before Any1    127

Now we create Relation1, which relies on the combination of two constraints, Start with Any1 and Finish with Any2, as follows. We create a Start constraint (Relations > Start) which concerns Any1:

[Screenshot: "Relation start contains" dialog. First Level Only checked; Contains at least: 1; Contains at most: No limits.]

It's crea
67. nnotator's login and the number of annotations (for example, there are 8 units created by annotator alabadie). A scale is shown above the views, going from character 0 to character 2400 in our example.

10.2.2 Zooming in, zooming out

It is possible to zoom in or zoom out with the mouse wheel or with the Zoom cursor. Put the mouse where you want the zoom to be centered, then use the mouse wheel to zoom in. The scale is adjusted.

10.2.3 Moving to the right or to the left

When the view is zoomed, you see only a part of the whole text on the screen, but you can move to the left or to the right with a drag & drop: press the mouse button, move the mouse, then release the button.

10.2.4 Changing the order the annotators appear

The annotators are ordered as their first annotations come in the annotation file, but you may want to reorder them to compare them more easily. This is possible by dragging the name of an annotator, on the left side, to the desired position. All the views are automatically adjusted while dragging.

[Screenshots: the list of annotators (alabadie, jliabeuf, tvallee) before dragging, while dragging, and with the order adjusted.]

10.2.5 Adjusting units positions

Each unit appearing in this alignment tool can be adjusted by a drag & drop action on either one of its bounds or its body. To move one of its bounds, put the mouse over it (it appears in red), pres
68. [Screenshot: Constraint Creation dialog, "Constraint to negate".]

We get a new Constraint named C2:

ConstraintID    Content              Scope
C1              Text contains sit    Unit
C2              Not C1               Unit

Again, we create a ConstrainedAnnotation of type Unit, based on C2.

[Screenshot: Constrained Annotation Creation dialog, with Unit selected and C2 (Not C1) as the associated constraint.]

And we get Unit2, with 51 matches:

Annotation    Constraint                 Matches
Unit1         C1 => Text contains sit    59
Unit2         C2 => Not C1               51

8.3.2 Focusing on one particular annotator

The queries we've just created yield 59 and 51 matches respectively, which concern units from any annotator. In fact, this annotated text contains annotations from a human annotator and from a machine (the TXT IMPORTER). We are now going to get all the units containing "sit", as Unit1 does, but only from the annotator whose login is ymathet.

We click on Author in frame 4 and enter the requested login. Here we write the full login, ymathet, but it is also possible to enter only a part of it and check "Limit to contains expr". For example, in this case, ym would work.

[Screenshot: Constraint Creation dialog with the "Limit to contains expr" option.]

C3 is created. Note that this time its scope is Any, since it can be applied to a Unit, but also to a Relation or a Schema.

ConstraintiD Content Scope E Text co
69. ntains sit Unit

ConstraintID    Content                  Scope
C2              Not C1                   Unit
C3              Author's name ymathet    Any

We now create an And logic constraint to combine C1 and C3 (Logic > And).

[Screenshot: Logic menu with And, Or and Not.]

A box appears, where we have to choose at least two constraints. In our case, we click on C1 and on C3, in whatever order.

[Screenshot: Constraint Creation dialog for And, with several constraints selected.]

A new Constraint C4 is created. Note that its scope is automatically computed by restriction of the different scopes of the constraints it contains. Here, combining the scope Unit with the scope Any results in the scope Unit.

ConstraintID    Content                  Scope
C1              Text contains sit        Unit
C2              Not C1                   Unit
C3              Author's name ymathet    Any
C4              And C1 C3                Unit

Now, to create the ConstrainedAnnotation associated with C4, we can proceed as we've done previously, clicking on the Unit button and so on. However, we are going to use a shortcut: when a constraint has a specific scope (i.e. Unit, Relation or Schema), we can create the associated ConstrainedAnnotation just by double-clicking it in the list, as we do here by clicking on the C4 row. If the scope is Any and you want to create a ConstrainedAnnotation of a specific type, for instance Unit, you have to use the usual way.

[Screenshot: constraint list with the C4 row selected.]

This time
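The way C1, C3 and C4 combine, and the way the scope of an And constraint is obtained by restriction, can be sketched with ordinary Java predicates. This is a minimal illustration under assumptions: the types AnnotationStub, Scope and ConstraintSketch are hypothetical, GlozzQL's internal classes are not described in this manual, and treating two different specific scopes as an error is a choice made for this sketch only.

```java
import java.util.function.Predicate;

// Hypothetical data model; GlozzQL's internal classes are not described in the manual.
enum Scope { ANY, UNIT, RELATION, SCHEMA }

final class AnnotationStub {
    final Scope kind;    // UNIT, RELATION or SCHEMA
    final String author; // login of the creator
    final String text;   // covered text (only meaningful for units)

    AnnotationStub(Scope kind, String author, String text) {
        this.kind = kind;
        this.author = author;
        this.text = text;
    }
}

final class ConstraintSketch {
    /** "Text contains": only applies to units (scope Unit). */
    static Predicate<AnnotationStub> textContains(String s) {
        return a -> a.kind == Scope.UNIT && a.text.contains(s);
    }

    /** "Author": applies to any kind of annotation (scope Any). */
    static Predicate<AnnotationStub> authorIs(String login) {
        return a -> a.author.equals(login);
    }

    /** Scope of an And constraint: Any is neutral, equal scopes are kept. */
    static Scope restrict(Scope a, Scope b) {
        if (a == Scope.ANY) {
            return b;
        }
        if (b == Scope.ANY || a == b) {
            return a;
        }
        // Assumption of this sketch: two different specific scopes cannot be combined.
        throw new IllegalArgumentException("incompatible scopes: " + a + " and " + b);
    }

    public static void main(String[] args) {
        Predicate<AnnotationStub> c1 = textContains("sit");
        Predicate<AnnotationStub> c3 = authorIs("ymathet");
        Predicate<AnnotationStub> c4 = c1.and(c3); // And C1 C3
        AnnotationStub unit = new AnnotationStub(Scope.UNIT, "ymathet", "dolor sit amet");
        System.out.println(c4.test(unit));                     // true
        System.out.println(restrict(Scope.UNIT, Scope.ANY));   // UNIT
    }
}
```

The restrict method mirrors the rule stated above: combining the scope Unit with the scope Any gives the scope Unit.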
70. o back to the first one using the First button. The same request with the value feminin for the genre feature provides no result, as we can see from the Search Unit field appearing in red.

[Screenshot: search panel with Search Unit: Nom, Attribute constr: genre, Value constr: feminin, and the Next and First buttons.]

11.3 TimePlayer

11.3.1 Main principle

Each annotation in Glozz is time-stamped. Hence it is possible to get a history of the annotation process of a given text. This is the purpose of the TimePlayer tool. To launch it, you can use the button shown below, either in the toolbar or in the Tool menu.

The TimePlayer appears as follows:

[Screenshot: Glozz Time Player window with the Magneto and Acceleration controls, the True Time checkbox, Total: 62, and the start, current and finish times (Thu Jul 02 19:56:37 CEST 2009, Thu Jul 02 20:14:08 CEST 2009, Thu Jul 02 20:14:08 CEST 2009).]

Its main feature is the time line, with a cursor which can be set from the first time position on the left (July 2nd at 19:56 in our example, that is to say the time at which the first annotation of this document was created) to the final position (same day, 20:14 in our example). Just below this cursor line are shown, respectively, the first time value, the current time value (the one of the current position of the cursor) and the last time value. When playing with the cursor, you'll see more or fewer annotations shown in the m
71. oading GlozzQL queries
8.5 GlozzQL basket
8.5.1 Feeding the basket
8.5.2 Save basket content
8.5.3 …
8.5.4 Erase basket content
8.6 Advanced concepts
8.6.1 Two ways constraints
8.6.2 Unification mechanism
8.7 Use cases
8.7.1 Splitting annotations by authors, types, etc.
8.7.2 Looking for …
8.7.3 …
9 Grapher: annotations shown as a graph
9.1 Overview
9.2 Interface
9.2.1 Launching Grapher
9.2.2 …
9.2.3 …
9.2.4 Showing embedded text
9.2.5 …
9.4 Co-reference
9.4.1 Co-reference using …
9.4.2 Co-reference using schemas
10 Glozz Aligner: alignment and agreement tool for multi-annotated texts
10.1 …
10.2 …
10.2.1 …
10.2.2 Zooming in, zooming out
10.2.3 Moving to the right or to the left
10.2.4 Changing the order the annotators appear
10.2.5 Adjusting units positions
10.3 Alignment tool
10.3.1 Auto alignment
10.3.2 Manual alignment
10.3.2.1 …
72. [Figure: complex annotations seen in the text view, and the same annotations seen in the Grapher.]

9.2 Interface

9.2.1 Launching Grapher

To launch the interface, use the menu as follows:

[Screenshot: menu containing the Graph entry.]
73. own in a logical predicate. Let's see each of these frames in detail, and how to use them, in the next sub-sections.

2.2 Main and macro views: viewing and navigating

This point concerns frames 1 and 2. These two views both represent the current text with its current annotations. They differ in only two respects: view number 2 is a macro view of the document, i.e. it is devoted to showing the annotations at a large scale, not to reading the text; view number 1 shows the text in a readable font size and enables the user to create or edit annotations.

The point is that these two views are linked together, so that it is possible to navigate easily in the text using the macro view (frame 2), and then read and work on the annotated text using the main view (frame 1). To do so, click anywhere in frame 2, and frame 1 is immediately positioned at this exact point, and conversely.

Moreover, an option makes it possible to show the correlated positions between frame 1 and frame 2. To activate or deactivate it, use the option menu and the Viewer tab.

[Screenshot: Viewer tab of the options window, with the "Activate inter viewers position pointers" checkbox.]

Then, when the mouse points somewhere in frame 2, frame 1 shows a cursor corresponding to the same position if it is within the current zone, or else an arrow pointing to the top if the position is supra, or to the bottom (as in the next screenshot) if t
74. pears in red and Pronoun in orange. [Screenshot: style editor (Unit style tab, columns Type name, Background color, Hide) next to the Toulouse-Lautrec text with a Noun style applied] 7.4 Relation styles Relation styles are almost the same as unit styles. The only difference is that the second column sets the line color instead of the background color. So we can apply the same method as in the previous section. 7.5 Schema styles Schema styles come with an additional field, in the third column, for the shape given to the graphs, called display type. [Screenshot: Schema style tab with columns Type name, Color, display type, Hide; the display type list offers barycenter left, barycenter right, linked, linked reordered] Let's see how it behaves in the following table: [table not recoverable from the extraction] barycenter will show the schema as a star with a central disc and a link to each element; barycenter left and barycenter right do the same but put the disc in resp t
75. re the same but before. Free: specifies in fact no constraint. Used to find, for instance, all the Units, or all the Relations or Schemas. Unit constraints. Text contains: the text covered by the unit must contain a given string. Regexp: the text covered by the unit must match a given regular expression. Covers: the unit must cover another given unit. Covered by: the unit must be covered by another given unit. Relation constraints. Start: the first argument of the relation must contain a given annotation. This constraint can be set in two ways: (1) we can specify whether this containment condition is recursive (search in depth) or not (first-level search); (2) we can specify the minimum and maximum number of occurrences of contained elements. Target: the same, with the second argument of the relation. Relation or Schema constraints. Contains: it works the same way as the Start and Target constraints (see above), but it may also concern a Schema, and if it concerns a Relation, it is fulfilled if the first OR the second argument fulfills the constraint. Logic constraints. And: takes two given constraints and combines them with the boolean operator and. Or: takes two given constraints and combines them with the boolean operator or. Not: takes one given constraint and applies the boolean operator not to it. Let's finish this section with a very simple example to see how constraints are written
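The unit and logic constraints listed above can be read as predicates over annotations that are combined with boolean operators. The following Python sketch illustrates that reading only; it is not Glozz code. The `Unit` class and every function name are hypothetical, and treating Regexp as a full match of the unit text is an assumption (the manual only says the text "must match").

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Unit:
    """Hypothetical stand-in for a Glozz unit: character span plus covered text."""
    start: int
    end: int
    text: str


Constraint = Callable[[Unit], bool]


def text_contains(needle: str) -> Constraint:
    # "Text contains": the covered text must contain a given string.
    return lambda u: needle in u.text


def regexp(pattern: str) -> Constraint:
    # "Regexp": assumed here to mean a full match of the covered text.
    compiled = re.compile(pattern)
    return lambda u: compiled.fullmatch(u.text) is not None


def covers(other: Unit) -> Constraint:
    # "Covers": the unit's span must include the other unit's span.
    return lambda u: u.start <= other.start and other.end <= u.end


def covered_by(other: Unit) -> Constraint:
    # "Covered by": the unit's span must lie inside the other unit's span.
    return lambda u: other.start <= u.start and u.end <= other.end


def and_(a: Constraint, b: Constraint) -> Constraint:
    return lambda u: a(u) and b(u)


def or_(a: Constraint, b: Constraint) -> Constraint:
    return lambda u: a(u) or b(u)


def not_(a: Constraint) -> Constraint:
    return lambda u: not a(u)


# Sample data: two words and one unit spanning both.
units = [Unit(0, 5, "Lorem"), Unit(6, 11, "ipsum"), Unit(0, 11, "Lorem ipsum")]

# Units whose text contains "sum" and which do not cover the first unit.
query = and_(text_contains("sum"), not_(covers(units[0])))
matches = [u.text for u in units if query(u)]
```

Here `matches` keeps only the units that satisfy the combined constraint, which mirrors how a ConstrainedAnnotation collects the annotations satisfying the Constraint it is based on.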
76. responding groups will be automatically created. In our simple example, let's use two groups named group1 and group2 as follows (we show only the modified part of the code): <units> <type name="Noun" groups="group1,group2"> ... <type name="Pronoun" groups="group1"> ... </units> <relations> <type name="PartOf" oriented="true" groups="group2"> ... </relations> <schemas> <type name="referenceChain" groups="group2"> ... </schemas> If we want a type to belong to several groups, like Noun in our example, we list all the groups separated by a comma. With our modified aam file, we've set two groups: one named group1 and containing Noun and Pronoun; one named group2 and containing Noun, PartOf and referenceChain. 7 Styles 7.1 Overview A style sheet defines how each type will appear. It includes color and visibility for units, relations and schemas, and shape for schemas. For a given annotation model (aam file), it is possible to define as many style sheets as needed (as files). Of course, in most cases we define only one style sheet for a given model, but in some cases it is useful to provide several ways to observe the annotations. Below are two screenshots of the same annotated text using two different style sheets: [Screenshots: the Toulouse-Lautrec biography text rendered with two different style sheets]
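As a rough illustration of the comma-separated groups attribute, the sketch below reads a small aam-like fragment and lists which types belong to each group. It is only a sketch under assumptions: the wrapping `<model>` root element and the self-closed `<type/>` elements are invented here to get a well-formed fragment, and the function name is hypothetical. Only the `name`, `oriented` and `groups` attributes come from the manual's example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical, simplified fragment; <model> is an assumed root element.
AAM_FRAGMENT = """
<model>
  <units>
    <type name="Noun" groups="group1,group2"/>
    <type name="Pronoun" groups="group1"/>
  </units>
  <relations>
    <type name="PartOf" oriented="true" groups="group2"/>
  </relations>
  <schemas>
    <type name="referenceChain" groups="group2"/>
  </schemas>
</model>
"""


def groups_to_types(xml_text: str) -> dict:
    """Map each group name to the type names that list it, in document order."""
    root = ET.fromstring(xml_text)
    membership = defaultdict(list)
    for type_el in root.iter("type"):
        # A type may belong to several groups, separated by commas.
        for group in type_el.get("groups", "").split(","):
            group = group.strip()
            if group:
                membership[group].append(type_el.get("name"))
    return dict(membership)


membership = groups_to_types(AAM_FRAGMENT)
```

With the fragment above, `membership` reproduces the two groups described in the text: group1 holds Noun and Pronoun, and group2 holds Noun, PartOf and referenceChain.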
77. [Screenshot: lorem ipsum sample text with the depth selector set to Min Depth 2 and Max Depth 3] 11.2 Simple search tools 11.2.1 Find Text tool Besides GlozzQL, a simpler tool is available for full-text search. Select it in the Tools menu: [Screenshot: Tools menu with Find Units (shift u) and Depth selector (shift d)] This additional tool then appears on the right side of the interface and comes with two text fields and two buttons. Enter the word you're looking for in the Search text field. As long as what you type has at least one occurrence in the text, it appears in green in this field. This is the case, for instance, for diam when the lorem ipsum text is loaded. [Screenshot: Search text field containing diam] In the main view, each occurrence appears with a blue background. It is then possible to jump from one occurrence to the next by clicking on the Next button. As soon as the sequence you're entering has no occurrence in the text, it appears in red: there is no point in going further; use the backspace key to modify th
78. s with a icon to remove units, relations or schemas. Deleting a schema: select it, then click on the garbage icon or press DEL (or Function + Backspace on a Mac). 2.3.4 Working with real IDs or friendly IDs Please refer to section 1.4 before reading this. As we've seen in the previous chapter, each annotation is given an ID. Its value appears, as we've seen, when the annotation is selected. For instance, in this screenshot, the selected annotation is shown with ID 26. This is its friendly ID, shown in frame 1 but also in frame 6. [Screenshot: main view and annotation list with entries such as u_Verbe 4831 4837 ID 21, u_Verbe 7267 7276 ID 22, u_Verbe 10191 10201 ID 26, u_Verbe 8709 8727 ID 27, u_Phrase 8619 8737 ID 28, u_Phrase 9799 9910 ID 29, u_Phrase 10178 10337 ID 30] In the next screenshot, the same annotation is now shown with its real ID
79. s At any moment, it is possible to save the current set of queries. This saves the list of Constraints and the list of ConstrainedAnnotations as a gql file, in XML format. This won't save the results, but you can later reload your corpus and your gql file and get the results again. To save, click on the save button at the top left of the window and enter a name; if the name doesn't end with gql, this extension is automatically added. [Screenshot: save dialog with file format "gql file"] Loading is as simple as saving. Be careful: loading a gql file erases all current Constraints and ConstrainedAnnotations before loading the new ones. If needed, save them first. If you are a developer, you may be interested in generating queries in gql format from your own programs. No DTD is provided at the moment, but it is very simple XML. If needed, do not hesitate to contact the authors for further details. 8.5 GlozzQL Basket GlozzQL provides a basket in which it is possible to store results from one or several queries in order to perform a specific action on these annotations: either saving them as a new aa file or removing them from the current annotations. 8.5.1 Feeding the basket Each time you click on a ConstrainedAnnotation in the list of frame 3, you can then c
80. s Then we are asked to choose the ConstrainedAnnotation we want to be contained. Here only Unit1 is available, and we choose it. This constraint can be set by three parameters. First Level Only: if checked, the annotation must be contained at the first level, not deeper; if unchecked, the contained annotation may itself be contained in another annotation, the latter being itself contained. Contains at least: the minimum number of contained occurrences. Contains at most: the same for the maximum. In our case, we've unchecked First Level Only in order to enable a deep search, and left the default values for least and most so that any number of occurrences is allowed. [Screenshot: Constraint Creation dialog with Contained Annotation, First Level Only, Contains at least 1, Contains at most No limits, OK and Cancel] We then get C2 in the list: ConstraintID | Content | Scope; C1 | TypeName Verbe | Any; C2 | Contains Unit1 Level infinite mi | Rel Sc. We ask to create a Schema based on C2: [Screenshot: Schema button] We choose C2 among the two possible constraints whose scope is compatible: [Screenshot: Schema, Associated constraint: Contains Unit1 Level infinite min 1 max] and get Schema1 in the list with 4 occurrences: Annotation | Constraint | Matches; Unit1 | C1 > TypeName Verbe | 16; Schema1 | C2 > Contains Unit1 Level infinite min 1 max No limits | 4. Now we as
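The three parameters of the Contains constraint can be pictured as a count over the elements of a container, either at the first level only or recursively. The sketch below is an illustration under assumptions, not the Glozz implementation: the `Annotation` class, its `children` field and both function names are hypothetical, and `at_most=None` stands for "No limits".

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Annotation:
    """Hypothetical annotation: a type name plus the elements it embeds."""
    type_name: str
    children: List["Annotation"] = field(default_factory=list)


def count_contained(container: Annotation,
                    wanted: Callable[[Annotation], bool],
                    first_level_only: bool) -> int:
    """Count embedded annotations satisfying `wanted`."""
    count = 0
    for child in container.children:
        if wanted(child):
            count += 1
        if not first_level_only:
            # Deep search: also look inside the embedded annotation.
            count += count_contained(child, wanted, False)
    return count


def contains(container: Annotation,
             wanted: Callable[[Annotation], bool],
             first_level_only: bool = True,
             at_least: int = 1,
             at_most: Optional[int] = None) -> bool:
    """True if the number of contained matches lies in [at_least, at_most]."""
    n = count_contained(container, wanted, first_level_only)
    return n >= at_least and (at_most is None or n <= at_most)


def is_verb(annotation: Annotation) -> bool:
    return annotation.type_name == "Verbe"


# A schema embedding another schema, which itself embeds a Verbe unit.
inner = Annotation("chain", [Annotation("Verbe")])
outer = Annotation("chain", [inner])
```

With this data, `contains(outer, is_verb, first_level_only=True)` fails because the Verbe unit sits one level deeper, while the deep search with `first_level_only=False` succeeds, which is the reason for unchecking First Level Only in the example.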
81. s move while pressing; a green vertical line appears in order to help alignment with other annotations; then release. [Screenshots: before dragging a bound; while dragging a bound] To move the whole unit, put the mouse over its body (the two bounds appear in red) and do the same. [Screenshot: selecting a whole unit in order to move it] However, by default, these modifications in this tool have no effect on the real annotations. If you want any editing action in this tool to result in an editing action on the annotations this view comes from, you have to select this behavior in the preferences. To open the preferences panel, click on this button at the top left of the window: [Screenshot: Preferences panel. Views: Axes spacing 100, Unit level representation, Units height 10, Vertical, Ruler, Nested. Synchronization: Synchronize Glozz & Aligner; On synchronization action: Apply Glozz Changes to Aligner, Apply Aligner Changes to Glozz] Then click on the check box Synchronize Glozz & Aligner. However, at the time you do that, the real annotations and the ones you're working with in the aligner tool may differ, if you have moved some units in the tool without having activated this option. This is why, before selecting this option, you have to choose between applying Glozz changes to the aligner, which means that the aligner tool will go back to the init
82. s disappeared [Screenshot: GlozzQL window with Save, Load and Show basket buttons; the Constrained Annotations panel (New: Any, Unit, Relation, Schema; list columns Annotation, Constraint, Matches); the Constraints panel with buttons grouped as Any (Feature, Free, Type name, Author, Last author, Distance < x, Distance > x, After, Before), Units (Text Contains, Regexp, Covers, Covered by), Relations (Start, Target), Relations/Schemas (Contains) and Logic (And, Or, Not); the Settings (Limit search to visible annotations, Two ways constraints (beta), Discard Typography)] Since GlozzQL relies on two main concepts, Constraints and ConstrainedAnnotations, the interface comes with two corresponding panels (numbers correspond to the screenshot). On the left side, the ConstrainedAnnotations panel embeds 3 frames: 1 Four buttons to create a ConstrainedAnnotation among Unit, Relation and Schema, or even Any, which means unspecified. 2 The list of ConstrainedAnnotations created (still empty in the screenshot), with three columns: the name of the ConstrainedAnnotation, the Constraint it is based on, and the number of matches according to the current annotated text. A click on one i
83. [Screenshot: Glozz main window showing the imported Toulouse-Lautrec biography text, with the depth selector, search field, annotation model table and feature table] Besides, we have to load the annotation model dedicated to typography. To load an annotation model (aam), we click on the yellow button just above the annotation model table
84. [Screenshot: lorem ipsum text with a selected annotation highlighted from the list (entries such as u_Phrase 4438 4950 ID 38, u_Phrase 7239 7650 ID 39, u_Paragraphe 4163 6213 ID 40)] The same can be done with the arguments of a predicate. [Screenshot: relation entries such as r_Anaphore ID 46] Conversely, if an annotation is currently selected in Glozz (by whatever means) and you want to have it shown in the list, click on the Show Sel button. [Screenshot: annotation list with the Sort Type, Sort Date, Show sel and Visible controls, listing units, relations and schemas, for example r_Anaphore 25 26 ID 48, r_sujet 1 3 ID 49, r_complement 19 18 ID 52, r_Elaboration 34 11 ID 54, u_Phrase 423 649 ID 33, u_Paragraphe 2113 4163 ID 41, u_Nom 1402 1412 ID 43]
85. straints list with ID C1, the content Text contains sit, and the scope being Unit (only Units may rely on this constraint): ConstraintID | Content | Scope; C1 | Text contains sit | Unit. Now we can create a ConstrainedAnnotation of type Unit: click the button in panel 2 as follows: [Screenshot: New buttons, Any and Unit] A box appears and shows all the constraints which can be used, that is to say all the constraints whose scope is Unit. At the moment only C1 is available. [Screenshot: Unit dialog, Associated constraint, Cancel] Note that when the mouse is over C1, its content appears as a tooltip (Text contains sit), which makes it easier to find the correct constraint among all of them. Then we select C1 and click OK. A new ConstrainedAnnotation is created, named Unit1, whose constraint is C1 > Text contains sit and which has 59 matches, as it appears in the list: Annotation | Constraint | Matches; Unit1 | C1 > Text contains sit | 59. We can click on the corresponding row, which makes all the occurrences appear in the panel below (panel 3): [Screenshot: list of matches] u_paragraph 90605 92655 ID anonymous 66; u_paragraph 88555 90605 ID anonymous 65; u_paragraph 86505 88555 ID anonymous 64; u paragraph 84
86. tGlozz bat file. It launches the jar application file with more memory, which is better when dealing with big files. 3.3 Create a shortcut on your desktop You may want to have Glozz appear directly on your desktop. Take care not to move the jar or the bat files out of the distribution folder. Instead, make a shortcut to one of these files (right click and choose make a shortcut), then move this shortcut to your desktop. This is important because Glozz has to be launched from its original folder in order to have access to its data files. 3.4 Choose a login and ask for a key At this step you can play with Glozz to test it: you can log in as anonymous, but you won't be able to save your annotations. To fully use Glozz, you need to be logged in, which means having a login and the associated key. Indeed, in order to guarantee that each annotation is systematically assigned to a unique annotator worldwide, each user must be authenticated with a unique login before saving their data. To get a key, please send a mail to the authors (use the link on the website) with the login you want to use; it is advised to use the first letter of your first name followed by your last name. If this login is free, you'll receive the associated key. Otherwise, you will be offered another login and its associated key. Then, the next time you launch Glozz, you can log in, and your system will remember it for the next sessions. 3.5
87. tate faster several occurrences, you can even click several times on Annotate without clicking on Next, since this will result in going to the next result and then annotating it. [Screenshot: search field containing libero, with the Next button] 11.2.2 Find units tool This tool works almost the same way as the Find text tool seen above, but is dedicated to looking for units, not text. It is typically something GlozzQL can do, but it was created earlier and is a little faster to use for simple requests. It is launched from the Tools menu: [Screenshot: Tools menu with Find text (shift f), Find Units and Depth selector (shift d)] and it appears as follows on the right side of the interface: [Screenshot: Search Unit, Attribute constr and Value constr fields with Next and First buttons] In the Search Unit field, enter the type name of the searched units. You can additionally specify a feature name and its expected value in the next two fields to narrow the search. In the example below, we're looking for all occurrences of units whose type is Nom and whose genre feature value is masculin. There is at least one, since the Search Unit field appears in green. [Screenshot: Search Unit: Nom; Attribute constr: genre; Value constr: masculin] We can jump from one result to the other using the Next button and g
88. ted as C3: ConstraintID | Content | Scope; C1 | Always true | Any; C2 | Before Any1 | Any; C3 | StartContains Any1 Level 1 min | Relation. We create a Target constraint which concerns Any2: [Screenshot: Relation target contains dialog with First Level Only checked, Contains at least 1, Contains at most No limits, Cancel] It's created as C4: ConstraintID | Content | Scope; C1 | Always true | Any; C2 | Before Any1 | Any; C3 | StartContains Any1 Level 1 min | Relation; C4 | TargetContains Any2 Level 1 | Relation. We combine C3 and C4 with an And constraint [Screenshot: Logic, And button, Select several] and get it as C5: ConstraintID | Content | Scope; C1 | Always true | Any; C2 | Before Any1 | Any; C3 | StartContains Any1 Level 1 min | Relation; C4 | TargetContains Any2 Level 1 | Relation; C5 | And C3 C4 | Relation. By double-clicking on C5, a Relation ConstrainedAnnotation is created as follows, with 12 occurrences. Unfortunately, as we can see in the next screenshot, some of the occurrences are oriented from top to bottom, contrary to what we expect. [Screenshot: lorem ipsum text with the matching relations, and the list Annotation | Constraint | Matches; Any1 | C1 > Always true | 128; Any2 | C2
89. tem in the list will result in 3 The list of the annotations belonging to the selected ConstrainedAnnotation of frame 2. This panel works exactly the same way as the annotations as predicates tool seen in section 2.4. Additionally, 4 buttons are available below panel 3 and will be introduced later. On the right side, the Constraints panel: 4 The 20 constraints are available through 20 buttons organized in 5 categories: Any, Units, Relations, Relations/Schemas, Logic. 5 The list of the constraints (still empty in the screenshot), with three columns: the ID of the constraint, its content (a short description), and its scope (what kind of ConstrainedAnnotations may rely on it). Additionally, two buttons are available below panel 5: remove selected removes the constraint currently selected in panel 5 (click on a row to select it); remove all removes all the current constraints. 8.3 Some examples 8.3.1 Units containing and not containing a text Assume we load the second text of the sandbox menu, based on the lorem ipsum. We want to find all the Units which contain the text sit. First, we have to create a Text Contains constraint by clicking the corresponding button in panel 4: [Screenshot: Units category, Text Contains button] We specify the contained text as sit, then click OK. [Screenshot: Constraint Creation dialog with OK and Cancel] A new constraint is created and appears in the con
90. times decadent life of those times Toulouse Lautrec is known along with C zanne Van Gogh and Gauguin as one of the greatest painters of the Post Impressionist period In a 2885 auction at Christie s auction house a new record was set when La blanchisseuse an early painting of a young laundress sold for 22 4 million U S Biography Youth DI Henri Marie Raymond de Toulouse Lautrec Monfa was born in Albi Tarn in the Midi Pyr n es r gion of France the firstborn child of Comte Alphonse and Comtesse Ad le de Toulouse Lautrec He was therefore a member of an aristocratic family descendants of the Counts Toulouse and Lautrec and the Viscounts of Montfa a village and commune of the Tarn of department of southern France 4 younger brother was also born to the family on 28 August 1867 but died the following year After the death of his brother his parents separated and a nanny took care of Henri through this time 2 At the age of 8 Henri left to live with his mother in Paris Here he started to draw his first sketches and caricatures on his exercise workbooks The family quickly came to realise that Henri s talent lay with drawing and painting and a friend of his father named Rene Princeteau visited sometimes to give informal lessons Some of Henri s early paintings are of horses a specialty of Princeteau and something that he would later visit with his Circus Paintings 2 3 In 1875 Henri returned to Albi bec
91. ts and relations, they require a dedicated toolbar. To make this bar appear, click on the special button as shown at the beginning of section 2.3. Then the special menu will appear as follows: [Screenshot: schema toolbar with buttons for creating a new schema, adding a Unit, a Relation or a Schema to the current schema, selecting a schema, and removing a Unit, a Relation or a Schema from the current schema, plus the Current panel] On the right, the panel named current shows some information about the currently edited schema: its ID (a number which identifies it); Units (the number of units contained in this schema); Relations (the number of relations contained in this schema). Creating a new schema To create a new schema, click on the dedicated button in the top left corner. Immediately, a new schema is created in Glozz and its ID appears in the current panel. At this stage it is an empty schema, since it has no element at all. In the example below, a new schema is created with ID number 75 and, of course, 0 Units and 0 Relations. [Screenshot: creating a new schema] Remark: if you happen to click again on the creation button before adding any element to the new schema, Glozz won't create another one, and you will stay on the one just created and still empty. Then you can add as many Units, Relations or Schemas as you want to the current schema using the thr
92. [Screenshot: Glozz main window with its six numbered frames, showing the lorem ipsum text annotated with Noun and Pronoun units, the depth selector and search field, the annotation model (Units, Relations, Schemas), the feature table (gender, remark, count) and the annotation list (u_Noun, u_Pronoun, r_Relationship and s_referenceChain entries)] 1 main view, where we can see the annotated text and directly add or edit annotations. 2 macro view: a view on the same annotated text as the main view, but in macro mode, giving a global view of the annotated text and making it possible to navigate quickly through it. 3 mode buttons, in order to set the current mode (adding units, editing units, and so on). 4 annotation model, where we can see the list of all available types: one column for units, one for relations and one for schemas. 5 feature sets table, which shows the feature values of the selected element. 6 annotation as text table, where each element is sh
93. ui de tuluz lo twek 24 November 1864 9 September 1901 was a French painter printmaker draughtsman and ifustrator whose immersion in the colourful and theatrical life of fin de si cle Paris yielded an uvre of exciting elegant and provocative images of the modern and sometimes decadent life of those times Toulouse Lautrec is known along with C zanne Van Gogh and Gauguin as one of the greatest painters of the Post Impressionist period In a 2005 auction at Christie s auction house a new record was set when La blanchisseuse an early painting of a young laundress sold for 22 4 million U S Contents hide 1 Blography 1 1 Youth 1 2 Disability and health problems 1 3 Paris 1 4 London 1 5 Alcoholism 1 6 Death 2 Art 3 Selected works 3 1 Signature 4 Movies 5 References 6 External links a import txt menu work with in Glozz Here the Henn de Toulouse Lautrec Biography edi Birth Henn Mane Raymond de name Toulouse Lautrec Monfa Youth a ie es Henri Marie Raymond de Toulouse Lautrec Monta was born in Albi Tarn in the Midi Pyr n es r gion of France the firstborn child of Comte Alphonse and Died 8 September 1901 aged 36 Comtesse Ad le de Toulouse Lautrec He was therefore a member of an aristocratic family descendants of the Counts of Toulouse and Lautrec and the Chateau Malrom France Viscounts of Monta a village and commune of the Tarn department of southern France A younger brother was also born
94. ult="Male"> <value>Male</value> <value>Female</value> <value>No</value> </possibleValues> </feature> <feature name="count"> <possibleValues default="Singular"> <value>Singular</value> <value>Plural</value> <value>No</value> </possibleValues> </feature> </featureSet> </type> </units> 6.3 Relations Creating relation types is very similar to creating unit types. The main difference is that a relation can be oriented or not. This is set in the oriented attribute of the type element. An oriented relation is shown in the interface as an arrow, whereas a non-oriented one is shown as a simple line. In our example, we create a PartOf relation, which is oriented, and a Relationship one, which is not. Hence the respective true and false values in the XML code below. Besides, the Relationship relation comes with a feature set containing one feature, and the PartOf relation comes with no feature set. <relations> <type name="PartOf" oriented="true"> </type> <type name="Relationship" oriented="false"> <featureSet> <feature name="kind"> <possibleValues default="Other"> <value>Family</value> <value>Friend</value> <value>Collegue</value> <value>Other</value> </possibleValues> </feature> </featureSet>
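To see how the oriented attribute and the feature set of a relation type could be read back by a program, here is a small Python sketch. It makes assumptions: the closing `</type>` and `</relations>` tags are added to complete the fragment, the function name and the shape of the returned dictionary are hypothetical, and nothing here is taken from the Glozz source code.

```python
import xml.etree.ElementTree as ET

# The manual's example, completed with assumed closing tags.
RELATIONS_FRAGMENT = """
<relations>
  <type name="PartOf" oriented="true"></type>
  <type name="Relationship" oriented="false">
    <featureSet>
      <feature name="kind">
        <possibleValues default="Other">
          <value>Family</value>
          <value>Friend</value>
          <value>Collegue</value>
          <value>Other</value>
        </possibleValues>
      </feature>
    </featureSet>
  </type>
</relations>
"""


def describe_relations(xml_text: str) -> dict:
    """Return, per relation type, its orientation and its features."""
    root = ET.fromstring(xml_text)
    described = {}
    for type_el in root.iter("type"):
        features = {}
        for feature in type_el.iter("feature"):
            possible = feature.find("possibleValues")
            if possible is None:
                continue  # feature without an enumerated value list
            features[feature.get("name")] = {
                "default": possible.get("default"),
                "values": [v.text for v in possible.findall("value")],
            }
        described[type_el.get("name")] = {
            # The attribute is the string "true" or "false" in the example.
            "oriented": type_el.get("oriented") == "true",
            "features": features,
        }
    return described


relation_types = describe_relations(RELATIONS_FRAGMENT)
```

In this sketch, `relation_types["PartOf"]` is oriented and has no features, while `relation_types["Relationship"]` is not oriented and carries the kind feature with Other as its default value.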
95. ure sets Once the unit is created, it appears as a colored box (red in the example below) whose color is given by the stylesheet according to the chosen type in the annotation model. As soon as it is created, its feature set is shown in frame 5 and can be edited immediately. To do so, click on the value you want to edit in the feature value column. The corresponding row is then highlighted (groupe feature below) and the value can be set again. [Screenshot: feature table with columns Feature name and Feature value; rows temps: present and groupe: premier] Choosing between character or word atoms In some cases you may want to work at the character level, for instance if some of your units may begin or end within a word, whereas in other cases you may prefer to work at the word level. You can choose between the two corresponding modes in the options, as shown below: [Screenshot: options dialog (Directories, Typography, Control, Viewer tabs) with the checkbox Consider words as atoms] With words as atoms selected, each click on a word positions a boundary just before the word (for a begin boundary) or just after it (for an end boundary). Moreover, if you double-click on a word, this creates a unit surrounding exactly that word. Selecting & Editing units Click on the dedicated button in the mode toolbar, as shown at the beginning of section 2.3. Then whe
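The words as atoms behavior amounts to snapping a clicked character offset to the nearest word boundary. The following Python sketch shows one way to express that idea; it is not the Glozz implementation. The function names are hypothetical, and defining a word as a maximal run of non-whitespace characters is an assumption, since the manual does not say how Glozz delimits words.

```python
def snap_begin(text: str, pos: int) -> int:
    """Move a begin boundary left to the start of the word at `pos`."""
    while pos > 0 and not text[pos - 1].isspace():
        pos -= 1
    return pos


def snap_end(text: str, pos: int) -> int:
    """Move an end boundary right to just after the word at `pos`."""
    while pos < len(text) and not text[pos].isspace():
        pos += 1
    return pos


def word_unit(text: str, pos: int) -> tuple:
    """Span of the unit created by a double-click inside a word."""
    return snap_begin(text, pos), snap_end(text, pos)


sample = "Lorem ipsum dolor"
# A double-click on the character at offset 8 (inside "ipsum").
begin, end = word_unit(sample, 8)
```

In character mode the boundaries would simply stay at the clicked offsets; in word mode the pair `(begin, end)` delimits the whole word, so `sample[begin:end]` is the text covered by the new unit.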
96. [illegible entry]; 3 Installing and getting started; 3.1 Download and unpack [illegible]; 3.2 [illegible]; 3.3 Create a shortcut on your desktop; 3.4 Choose a login and ask for a key; 3.5 Creating your own data folder; 4 [illegible]; 5 Creating a corpus for Glozz [illegible]; 5.1 Manual corpus creation from a txt file via import txt menu; 5.2 Automatic corpus creation via a custom program; 6 Annotation Models: types, feature sets, groups; 6.1 Overview; 6.2 [illegible]; 6.3 [illegible]; 6.4 [illegible]; 6.5 [illegible]; 7 [illegible]; 7.1 [illegible]; 7.2 [illegible]; 7.3 [illegible]; 7.4 [illegible]; 7.5 [illegible]; 7.6 Special features; 7.6.1 [illegible]; 7.6.2 [illegible]; 7.6.3 Co-reference chain color mode; 8 [illegible]; 8.1 [illegible]; 8.1.1 [illegible]; 8.1.2 [illegible]; 8.1.3 Incremental creation of Constraints and ConstrainedAnnotations; 8.2 GlozzQL Graphical User Interface; 8.3 [illegible]; 8.3.1 Units containing and not containing a text; 8.3.2 Focusing on one particular annotator; 8.3.3 Deeper queries with schemas and relations; 8.4 Saving and l
97. we get only 10 matches (41 others in Unit2 do not belong to this annotator): Annotation | Constraint | Matches; Unit1 | C1 > Text contains sit | 59; Unit2 | C2 > Not C1 | 51; Unit3 | C4 > And C1 C3. [Screenshot: list of the 10 matches, all with IDs of the form ymathet_ followed by a number, for example u_Phrase 8619 8737 ID ymathet_1246557922026, u_Phrase 865 1279 ID ymathet_1246557946798, u_Paragraphe 4163 6213 ID ymathet_1246557977596] 8.3.3 Deeper queries with schemas and relations Let's see some more complex queries which rely on Contains and Target constraints and involve deep structures. Assume we want to get all the Relations whose target contains a Schema which contains a Unit whose type is Verbe. To do so, we first create C1 as follows: ConstraintID | Content | Scope; C1 | TypeName Verbe | Any. Then we create Unit1, the set of units whose type name is Verbe: [Screenshot: Constrained Annotation Creation dialog, Unit, Associated constraint] We get it in the list: Annotation | Constraint | Matches; Unit1 | C1 > TypeName Verbe | 16. Now we create a Contains constraint by clicking this button: [Screenshot: Relations/Schemas category, Contains button]
98. wyg interface to create and modify styles. There is then no need to know how a style sheet is stored in an as file (for your information, it is in XML format). First of all, we load the annotation model for which we want to create the styles: [Screenshot: Load annotation model (aam) button] Let's take the coreference aam file we've created in the last section. Since no style file is loaded, all the style names appear with the same color: [Screenshot: Units, Relations and Schemas columns] Now we open the style editor [Screenshot: Open style editor button] and get a new window as follows: [Screenshot: style editor with columns Type name, Background color, Hide] This editor provides a tabbed pane with 3 tabs: one for Units, one for Relations and one for Schemas. The buttons are as follows, from left to right: open an as style file; save the current styles; save the current styles as a new as file; add a new style; remove the selected style. 7.3 Unit styles Let's first create two unit styles. We select the Unit style tab, then click on the button. We get a new style with default values (name is newStyle 1, color is gray and Hide is unchecked), as follows: [Screenshot: Unit style tab with a newStyle 1 row] Then we can change these default values to the relevant ones. First, we have to modify the name, just by clicking on it. It's very important that an annotation mode
99. y present in the basket from the current loaded annotation in Glozz. This action, combined with the other one (save basket), makes it possible to completely reorganize an annotated corpus. 8.6 Advanced concepts. Two advanced concepts, very easy to launch but maybe more difficult to understand, enhance the GlozzQL capabilities. 8.6.1 Two ways constraints. The two ways constraints option of GlozzQL means that when a constraint argument is a constrainedAnnotation, this constrainedAnnotation is itself constrained by the reciprocal of this constraint. An example will be much more meaningful than this definition. Going back to the last example, we've created Unit1 based on C1, with 16 matches. Then we created Schema1, whose constraint involves Unit1, with 4 matches. [Columns Annotation / Constraint / Matches] Unit1: C1 > TypeName Verbe, 16; Schema1: C2 > Contains Unit1, Level infinite, min 1, max No limits, 4; Relation1: C3 > TargetContains Schema1, Level infinite, min 1, ma 1. Now, if we choose to activate the two ways constraints option [Settings menu: Limit search to visible annotations; Two ways constraints (beta); Discard Typography], then Unit1 will also be constrained by C2, which means that a unit will belong to Unit1 if it is of type Verbe and if there also exists at least one schema belonging to Schema1 which contains this unit. Moreover, since Schema1 appears in C3, it is also constrained by it, so th
100. yer GlozzQL [Screenshot: annotated Lorem ipsum sample text]
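
The deeper query from section 8.3.3 (Unit1, Schema1, Relation1) and the two ways constraints option from section 8.6.1 can be pictured with a small stand-alone model. This is only a sketch in Python, not Glozz code and not GlozzQL's real implementation: the classes `Unit`, `Schema` and `Relation`, the helpers `descendants` and `contains_any`, the function `run_query` and the sample data are all hypothetical names invented for illustration. It assumes that "Level infinite" means containment at any depth. Because the manual excerpt is cut off before it finishes explaining how the narrowing propagates, the sketch applies only the two reciprocal constraints that the text states explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    id: str
    type_name: str


@dataclass(frozen=True)
class Schema:
    id: str
    members: tuple = ()  # embedded Units and/or nested Schemas


@dataclass(frozen=True)
class Relation:
    id: str
    source: object
    target: object


def descendants(schema):
    """Yield every annotation embedded in a schema, at any depth."""
    for member in schema.members:
        yield member
        if isinstance(member, Schema):
            yield from descendants(member)


def contains_any(container, candidates):
    """True if `container` is a schema embedding at least one candidate."""
    return isinstance(container, Schema) and any(
        d in candidates for d in descendants(container)
    )


def run_query(units, schemas, relations, two_ways=False):
    """Hypothetical model of the Unit1 / Schema1 / Relation1 example."""
    unit1 = [u for u in units if u.type_name == "Verbe"]       # C1
    schema1 = [s for s in schemas if contains_any(s, unit1)]   # C2
    relation1 = [r for r in relations
                 if contains_any(r.target, schema1)]           # C3
    if two_ways:
        # Reciprocal of C2: keep a unit only if some schema of Schema1
        # contains it.
        unit1 = [u for u in unit1
                 if any(contains_any(s, [u]) for s in schema1)]
        # Reciprocal of C3: keep a schema only if the target of some
        # relation of Relation1 contains it. Further propagation is
        # not modelled, since the source text is truncated there.
        schema1 = [s for s in schema1
                   if any(contains_any(r.target, [s]) for r in relation1)]
    return unit1, schema1, relation1


# Invented sample data: two verbs, one pronoun, nested schemas.
v1, v2, p1 = Unit("u1", "Verbe"), Unit("u2", "Verbe"), Unit("u3", "Pronom")
s_inner = Schema("s1", (v1, p1))
s_outer = Schema("s2", (s_inner,))
s_other = Schema("s3", (p1,))
units = [v1, v2, p1]
schemas = [s_inner, s_outer, s_other]
relations = [Relation("r1", p1, s_outer), Relation("r2", p1, s_other)]

for flag in (False, True):
    u, s, r = run_query(units, schemas, relations, two_ways=flag)
    print(flag, [x.id for x in u], [x.id for x in s], [x.id for x in r])
```

With the option off, both Verbe units are kept even though `u2` sits in no schema. With it on, `u2` is dropped from Unit1 and `s2` is dropped from Schema1, because in this toy data no relation target embeds `s2` itself.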
