TERMINAE User Manual - V11-2
Contents
1. 7 Terminae Terminological level step 2 perspective
   7.1 Perspective overview
   7.2 Data: Terminological forms
   7.3 Terminological actions menu
   7.3.1 Termino-concept management submenu
   7.3.2 Form management submenu
   7.3.3 Feature management submenu
8 Terminae TerminoConceptual level perspective
   8.1 Perspective overview
   8.2 Data: Termino-conceptual forms
   8.3 TerminoConceptual actions menu
   8.3.1 File submenu
   8.3.2 Termino-concept management submenu
   8.3.3 Feature management submenu
   8.3.4 Neon ontology submenu
9 Neon toolkit Conceptual level OWL perspective
   9.1 Perspective overview
   9.2 Terminae links menu
   10.1 XML backup DTD for terms
   10.2 XML backup DTD for ENs
   10.3 EnsLexUnit DTD
   10.4 Thesaurus DTD
   10.5 TreeTagger English Tagset
   10.6 TreeTagger French Tagset
   10.7 Use ANNIE to extract named entities
   10.8 Gate named entity type file

Chapter 1 Introduction
This document describes the functionalities of the TERMINAE platform, which is an Eclipse application. Chapter 2 gives a very short insight into the methodology. Chapter 3 gives the technical characteristics and the installation instructions.
2. [Figure 6.3: Select an occurrence identifier]
• Add occurrence for a term: to enter a new occurrence for a term. You have to select a term and fill in the form (see Figure 6.4).
• Remove occurrence for a term: to remove an occurrence of a term. You have to select the identifier of the occurrence you want to remove.
[Figure 6.4: Add occurrence for a term. Dialog with the fields Term, Document ID, Sentence ID and Occurrence text; the document number and occurrence text fields are required.]

6.3.3 Cleaning submenu
This menu cleans up the list of terminological units by removing a certain category of terms or named entities. Various options are proposed:
• Remove terms listed in a file: removes all the terminological units that are listed in a given file. You have to give the name of the file, in which the stop words are listed one per line.
• Remove terms involving given characters: cleans the list of terminological units on a character basis. You have to type in the list of forbidden characters.
• Remove single character terms: removes the single-character terms from the list of terminological units.
• Remove adjectives: removes the terms that are tagged as adjectives.
• Remove numbers: removes the terms that ar
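The cleaning options above can be illustrated outside the tool. The following is a minimal sketch, not TERMINAE's own code: the function name `clean_terms`, its parameters and the sample terms are all hypothetical, and the part-of-speech based options (adjectives, adverbs) are left out because they need tagger output.

```python
def clean_terms(terms, stop_terms=(), forbidden_chars="",
                drop_single_chars=True, drop_numbers=True):
    """Return the terms that survive the selected cleaning options.

    Hypothetical helper mirroring the Cleaning submenu options;
    it is not part of TERMINAE.
    """
    # Stop terms come from a file with one entry per line; normalise them once.
    stop = {s.strip().lower() for s in stop_terms if s.strip()}
    kept = []
    for term in terms:
        if term.lower() in stop:
            continue  # "Remove terms listed in a file"
        if any(ch in term for ch in forbidden_chars):
            continue  # "Remove terms involving given characters"
        if drop_single_chars and len(term) == 1:
            continue  # "Remove single character terms"
        if drop_numbers and term.replace(".", "", 1).isdigit():
            continue  # "Remove numbers" (integers and simple decimals)
        kept.append(term)
    return kept


terms = ["belt", "x", "belt/strap", "14", "the", "bench seat"]
cleaned = clean_terms(terms, stop_terms=["the"], forbidden_chars="/")
print(cleaned)
```

Each option is an independent filter, so the order in which they are applied does not change the result.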
3. Chapter 4 presents the main menu, and the following chapters (chapters 5 to 9) introduce the 5 perspectives of the platform and the related functionalities.

Chapter 2 The Terminae method
TERMINAE is a tool that is supported by a method, and a few words on the method can help in using the tool. The task is to build a domain ontology. This is an expert task, since it requires deciding which concepts are really important for the domain and how they are related. Experience has shown that linguistic tools relying on domain-specific texts can help the expert. They do not do the work in his or her place, but they propose a good starting point to improve the coverage of the domain, and some of the ambiguities they raise reveal real and unseen ambiguities of the domain vocabulary.

The TERMINAE method starts from the linguistic results produced by TreeTagger and YaTeA. It then has three steps:
• At the linguistic level, the input is a list of term candidates, i.e. words or groups of words which, on a linguistic basis, could possibly figure in a terminology of the domain (a list of its main terms). The goal of this level is, in a first step (chapter 6), to constitute, clean and improve the list, removing parasitic or irrelevant proposals. A second step (chapter 7) involves grouping those which are morphological variants of the same term and collecting linguistic relations. This work relies on the list of occurrences of each term, which are g
4. YaTeA term extractor; form, which gives its canonical form; grammatical type, which gives its grammatical category; NE extractor Gate, whose range is its type if the lexical unit is a recognised named entity. The first three fields are automatically filled in with information provided by YaTeA. The last one is ANNIE (Gate) information.
• The Variants view lists all the lexical forms that are associated as variants with the canonical form. They can be found in the corpus or added manually.
• The Relations view presents the relations that the terminological unit has. The Syntactical relations list shows the phrases to which it belongs, either as a head or as a modifier. The syntactical information is provided by the YaTeA analysis of the corpus. The Terminological relations list shows its terminological relationships. In the current version of the TERMINAE platform, the terminological relations have to be filled in manually.
• The Occurrences view lists all the occurrences of the terminological unit that have been identified. They can be occurrences of the canonical form or of any of its variant forms.
• The Related termino-concepts view shows to which termino-concepts the terminological unit is related.
As indicated in the second column of the Terminological form list view, a terminological form can be In progress or Compl
5. [Figure 6.1: Visualisation of YaTeA results. The screenshot shows the list of lexical units with their frequencies (for example belt strap, bench seat) and the occurrences of the selected unit with their identifiers (ID, doc, sent).]
Term, by frequency (Frequency), or by type (terms vs named entities) and named entity type (Named entity). The last column of the Lexical units view allows you to write comments: if you click on a comment cell, a text field appears and you can add a comment to the corresponding terminological unit. The comments are saved with
6. corpus annotated with ANNIE, right-click on a document in the resources tree and choose Save as XML. In addition, all documents in a corpus can be saved as individual XML files into a directory by right-clicking on the corpus in the resources tree and choosing the option Save as XML.

For French corpora, you have to install TreeTagger and load the Tagger Framework plugin. In the resource directory you will find TreeTagger FR Tokenization gapp. Load this application in the Gate platform. Also load the Lang French plugin and the french gapp Gate application. The selected processing resources are shown in Figure 10.1.
[Figure 10.1: Selected processing resources. The list includes Document Reset PR, RegEx Sentence Splitter, GenericTagger, French Gazetteer (ANNIE Gazetteer), ANNIE POS Tagger and ANNIE NE Transducer.]

10.8 Gate named entity type file
The DTD of the XML file which contains the named entity types and which is used when loading named entities (see 6.1.2).
<?xml version="1.0" encoding="UTF-8"?>
<ensTypeEn>
<typeEn>Organization</typeEn>
<typeEn>Date</ty
7. first dialog window appears in which you must indicate the name of the project.
• A second dialog window appears in which you must indicate in which directory you want to locate the project.
A directory with the same name as the project is automatically created, with 6 subdirectories. To start working on your project (to build a termino-ontological resource from a given corpus), you need to have at least the following files in your project directory (more details in 6.1.1):
• In the corpora subdirectory: a tagged version of the raw corpus (txt tt file), as output by TreeTagger.
• In the yatea subdirectory: the list of terms that have been extracted from the tagged version of the corpus by YaTeA (xml file).
You must also give the name of the corpus, if you exploit one, and the name(s) of the author(s) of the future resource(s). When the project is created, its main characteristics are presented in the Terminae project information view (on the left, by default, of the project perspective) and you can start working on it.

3.3 Hidden files
The software creates 2 hidden files to manage the Terminae application:
• The file Terminae contains the name of the current project. It is created in the directory where you launch the Terminae application. You normally do not need to modify it.
• The file nameOfProject.xcfg defines the configuration of each project: the set of files exploited by t
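Before opening a project, it can be handy to check that the two required inputs are in place. The sketch below is only an illustration under stated assumptions: the subdirectory names corpora and yatea come from the text above, while the function name `missing_inputs`, the project name and the file names `corpus.txt.tt` and `terms.xml` are hypothetical.

```python
from pathlib import Path
import tempfile


def missing_inputs(project_dir):
    """List the required inputs that are absent from a project directory.

    Hypothetical helper; only the 'corpora' and 'yatea' subdirectory
    names are taken from the manual.
    """
    project = Path(project_dir)
    problems = []
    corpora = project / "corpora"
    yatea = project / "yatea"
    # The corpora subdirectory must hold at least one (tagged corpus) file.
    if not corpora.is_dir() or next(corpora.iterdir(), None) is None:
        problems.append("corpora: no tagged corpus file found")
    # The yatea subdirectory must hold the YaTeA term list as an XML file.
    if not yatea.is_dir() or next(yatea.glob("*.xml"), None) is None:
        problems.append("yatea: no YaTeA XML term list found")
    return problems


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "TestDemo"
    (root / "corpora").mkdir(parents=True)
    (root / "yatea").mkdir()
    before = missing_inputs(root)  # both inputs are still missing
    (root / "corpora" / "corpus.txt.tt").write_text("belt\tNN\tbelt\n", encoding="utf-8")
    (root / "yatea" / "terms.xml").write_text("<TERM_EXTRACTION_RESULTS/>", encoding="utf-8")
    after = missing_inputs(root)   # nothing is missing any more

print(before)
print(after)
```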
8. letter of the searched term.
• Cluster terms: to cluster several lexical units. You first have to select the various units you want to cluster, then click on the Cluster terms action and choose the canonical form you want to keep. The alternative forms are removed from the term list and all their occurrences are attached to the canonical form, whose frequency count is updated.
• Add a term: to add a new term to the term list.
• Remove a term: to remove the selected term from the list.
• Undo remove: to undo the last remove action. This may also undo a cleaning action (see Section 6.3.3).
• View occurrence context: to visualise the sentences surrounding an occurrence. You have to select the occurrence identifier (see Figure 6.3) and set the size of the expected context, expressed as a number of sentences.
[Screenshot of the Terminae Terminological level step 1 perspective (project TestDemo): the Lexical units view with the columns Term, Frequency, Named entity and comments, and the Occurrences view listing occurrences with their identifiers, for example occ5661 (doc 0, sent 1819).]
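The effect of the Cluster terms action on the term list can be sketched as follows. This is an illustration only, assuming a term table that maps each term to the list of its occurrence identifiers; the function name `cluster_terms` and the sample data are hypothetical and do not reflect TERMINAE's internal data structures.

```python
def cluster_terms(term_table, canonical, variants):
    """Merge variant terms into a canonical form.

    term_table maps each term to the list of its occurrence identifiers.
    A new table is returned; the input table is left unchanged.
    """
    merged = {term: list(occs) for term, occs in term_table.items()}
    for variant in variants:
        if variant == canonical:
            continue  # the canonical form itself stays in the list
        # The variant leaves the term list; its occurrences move to the canonical form.
        merged[canonical].extend(merged.pop(variant))
    return merged


table = {"belt": ["occ1", "occ2"], "belts": ["occ3"], "bench": ["occ4"]}
clustered = cluster_terms(table, "belt", ["belt", "belts"])
# The frequency count is simply the number of attached occurrences.
frequencies = {term: len(occs) for term, occs in clustered.items()}
print(frequencies)
```

Deriving the frequency from the occurrence list, rather than storing it separately, keeps the count consistent after every merge.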
9. [Figure 8.1: Terminae TerminoConceptual level perspective]
A termino-conceptual form is usually composed of the following views:
• The TerminoConcept features view presents the properties of the selected termino-concept: its Synonyms and its Links that have been derived from the terminological levels. This mainly holds for termino-concepts related to named entities, for which type information can be collected. Typical links are brother and father links.
• The NL definition view allows you to enter a natural language definition for the selected termino-concept.
• The Occurrences view presents the occurrences in the corpus of the lexical units to which the termino-concept is linked.
• The TC relations view presents the termino-conceptual relations in which the termino-concept is domain or range.
Note that the meaning of a termino-concept is not formally defined. It is mainly described by its related occurrences.

8.3 TerminoConceptual actions menu
The action menu associated with the Terminae TerminoConceptual level perspective is the TerminoConceptual action menu. It proposes 4 submenus, which are presented in the following subsections: File submenu, Termino-concept management submenu, Feature management submenu, Ne
10. toolkit conceptual level perspective (right-click). In the Neon toolkit conceptual level perspective, you can also import an existing project. In this case, you have to refresh the view to display the imported project and to link it to the termino-conceptual perspective (see the following section). You can also import an ontology: use the import item from the menu of the navigator view of the Neon toolkit conceptual level perspective.

9.1 Perspective overview
The presentation of the Neon toolkit Conceptual level OWL perspective is very similar to that of the Terminae TerminoConceptual level perspective. It is composed of two main parts, with a global view on the left and a set of more detailed and dependent views on the right (see Figure 8.2). See the documentation: http://www.neon-toolkit.org/wiki/ (Documentation and Support).

9.2 Terminae links menu
The Terminae links menu has been added to the Neon Toolkit perspective to link the conceptual and the termino-conceptual levels of Neon and TERMINAE projects and of the resulting termino-conceptual resources.
• To terminoConceptual level is used to switch from the Neon toolkit Conceptual level OWL perspective to the Terminae TerminoConceptual level perspective. Clicking on this action item reopens the termino-conceptual perspective and selects the termino-concept associated with the class initially selected in the conceptual p
11. A)>
<!ELEMENT SYNTACTIC_CATEGORY (#PCDATA)>
<!ELEMENT TERM_CANDIDATE (ID, LEMMA, FORM, List_Variants, NUMBER_OCCURRENCES, LIST_OCCURRENCES, MORPHOSYNTACTIC_FEATURES)>
<!ELEMENT TERM_EXTRACTION_RESULTS (LIST_TERM_CANDIDATES)>
<!ELEMENT Texte (#PCDATA)>
<!ELEMENT Variant (#PCDATA)>

10.2 XML backup DTD for ENs
The DTD of the XML file which contains named entities and their occurrences, which is visualized in the Terminae Terminological level step 1 perspective.
<!ELEMENT DOC (#PCDATA)>
<!ELEMENT END_POSITION (#PCDATA)>
<!ELEMENT FORM EMPTY>
<!ELEMENT ID (#PCDATA)>
<!ELEMENT LEMMA (#PCDATA)>
<!ELEMENT LIST_EN (NAMED_ENTITY)>
<!ELEMENT LIST_OCCURRENCES (OCCURRENCE)>
<!ELEMENT LIST_SENT (SENT)>
<!ELEMENT List_Lemme EMPTY>
<!ELEMENT List_Variants EMPTY>
<!ELEMENT NAMED_ENTITY (ID, LEMMA, FORM, List_Variants, Types, NUMBER_OCCURRENCES, LIST_OCCURRENCES, LIST_SENT
<!ELEMENT NUMBER_OCCURRENCES (#PCDATA)>
<!ELEMENT OCCURRENCE (ID, DOC, SENTENCE, START_POSITION, END_POSITION, Texte)>
<!ELEM
12. E When the resources have been loaded, a corpus pipeline called ANNIE will be created as before.

JAPE is a Java Annotation Patterns Engine. It provides finite state transduction over annotations based on regular expressions. JAPE allows you to recognise regular expressions in annotations on documents.

The next step is to add a corpus and select this corpus from the drop-down corpus menu in the Serial Application editor. Finally, click on Run from the Serial Application editor, or right-click on the application name in the resources pane and select Run.

To view the results, double-click on one of the documents contained in the processed corpus in the left-hand tree view. No annotation sets or annotations will be shown until annotations are selected in the annotation sets; the Default set is indicated only with an unlabelled right arrowhead, which must be selected in order to make the available annotations visible. Open the default annotation set and select some of the annotations to see what the ANNIE application has done.

Having selected an annotation type in the annotation sets view, hovering over an annotation in the main resource viewer or right-clicking on it will bring up a popup box containing a list of the annotations associated with it, from which one can select an annotation to view in the annotation editor, or, if there is only one, the annotation editor for that annotation.

Now to save your
13. ENT SENT (ID, offset, phrase, List_Lemme)>
<!ELEMENT SENTENCE (#PCDATA)>
<!ELEMENT START_POSITION (#PCDATA)>
<!ELEMENT Texte (#PCDATA)>
<!ELEMENT Types (type)>
<!ELEMENT offset (#PCDATA)>
<!ELEMENT phrase (#PCDATA)>
<!ELEMENT type (#PCDATA)>

10.3 EnsLexUnit DTD
The DTD of the XML file which contains terms, named entities and their occurrences, which is visualized in the Terminae Terminological level step 1 perspective.
<!ELEMENT DOC (#PCDATA)>
<!ELEMENT END_POSITION (#PCDATA)>
<!ELEMENT Ens_Variants EMPTY>
<!ELEMENT FORM (#PCDATA)>
<!ELEMENT ID (#PCDATA)>
<!ELEMENT LEMMA (#PCDATA)>
<!ELEMENT LIST_EN (NAMED_ENTITY)>
<!ELEMENT LIST_OCCURRENCES (OCCURRENCE*)>
<!ELEMENT LIST_SENT (SENT*)>
<!ELEMENT LIST_TERM_CANDIDATES (TERM_CANDIDATE)>
<!ELEMENT List_Variants (Variant)>
<!ELEMENT MORPHOSYNTACTIC_FEATURES (SYNTACTIC_CATEGORY)>
<!ELEMENT NAMED_ENTITY (Ens_Variants, ID, LEMMA, LIS
14. OSITION, END_POSITION, Texte)>
<!ELEMENT PrefLabel (#PCDATA)>
<!ELEMENT RelationRTC (name, domain, range, Skos_type)>
<!ELEMENT SENTENCE (#PCDATA)>
<!ELEMENT START_POSITION (#PCDATA)>
<!ELEMENT See_also (#PCDATA)>
<!ELEMENT SetRTC (RelationRTC)>
<!ELEMENT Skos_type (#PCDATA)>
<!ELEMENT Synonym (#PCDATA)>
<!ELEMENT TerminoConcept (ID, NL_Definition, OCCURRENCE, PrefLabel, See_also, SetRTC, Synonym, children, fathers
<!ELEMENT Texte (#PCDATA)>
<!ELEMENT child (#PCDATA)>
<!ELEMENT children (child*)>
<!ELEMENT domain (#PCDATA)>
<!ELEMENT father (#PCDATA)>
<!ELEMENT fathers (father)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT range (#PCDATA)>

10.5 TreeTagger English Tagset
CC    Coordinating conjunction
CD    Cardinal number
DT    Determiner
EX    Existential there
FW    Foreign word
IN    Preposition or subordinating conjunction
JJ    Adjective
JJR   Adjective, comparative
JJS   Adjective, superlative
LS    List item marker
MD    Modal
NN    Noun, singular or mass
NNS   Noun, plural
NP    Proper noun, singular
NPS   Proper noun, plural
PDT   Predeterminer
POS   Possessive ending
PP    Personal pronoun
PP$   Possessive pronoun
15. T_OCCURRENCES, LIST_SENT, NUMBER_OCCURRENCES, Types*
<!ELEMENT NUMBER_OCCURRENCES (#PCDATA)>
<!ELEMENT OCCURRENCE (ID, DOC, SENTENCE, START_POSITION, END_POSITION, Texte
<!ELEMENT SENT EMPTY>
<!ATTLIST SENT ID CDATA #REQUIRED>
<!ELEMENT SENTENCE (#PCDATA)>
<!ELEMENT START_POSITION (#PCDATA)>
<!ELEMENT SYNTACTIC_CATEGORY (#PCDATA)>
<!ELEMENT TERM_CANDIDATE (ID, LEMMA, NUMBER_OCCURRENCES, LIST_OCCURRENCES, FORM, MORPHOSYNTACTIC_FEATURES, List_Variants, NAMED_ENTITY)>
<!ELEMENT TERM_EXTRACTION_RESULTS (LIST_TERM_CANDIDATES, LIST_EN
<!ELEMENT Texte (#PCDATA)>
<!ELEMENT Types (type)>
<!ELEMENT Variant (#PCDATA)>
<!ELEMENT type (#PCDATA)>

10.4 Thesaurus DTD
The DTD of the XML file which contains a thesaurus, which is visualized in the Terminae TerminoConceptual level perspective. A thesaurus contains a collection of termino-concepts. Each termino-concept is described by an ID, a natural language definition, corpus occurrences, a prefLabel, a set of see-also references, a set of synonyms (altLabel), a set of children and its father.
<!ELEMENT DOC (#PCDATA)>
<!ELEMENT END_POSITION (#PCDATA)>
<!ELEMENT EnsTerminoConcepts (name, TerminoConcept+
<!ELEMENT ID (#PCDATA)>
<!ELEMENT NL_Definition (#PCDATA)>
<!ELEMENT OCCURRENCE (ID, DOC, SENTENCE, START_P
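A thesaurus file of this kind can be read with a standard XML parser. The following is a minimal sketch, assuming the element names EnsTerminoConcepts, TerminoConcept, ID, NL_Definition, PrefLabel and Synonym as they appear in the Thesaurus DTD; the sample document, the exact nesting and the function name `read_thesaurus` are assumptions made for illustration, not an actual TERMINAE export.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; a real backup file may contain more elements.
SAMPLE = """<EnsTerminoConcepts>
  <name>demo</name>
  <TerminoConcept>
    <ID>tc1</ID>
    <NL_Definition>A restraint strap.</NL_Definition>
    <PrefLabel>safety belt</PrefLabel>
    <Synonym>seat belt</Synonym>
  </TerminoConcept>
</EnsTerminoConcepts>"""


def read_thesaurus(xml_text):
    """Return a dict mapping each termino-concept ID to its main fields."""
    root = ET.fromstring(xml_text)
    concepts = {}
    for tc in root.iter("TerminoConcept"):
        concepts[tc.findtext("ID")] = {
            "prefLabel": tc.findtext("PrefLabel"),
            "definition": tc.findtext("NL_Definition"),
            # A termino-concept may carry any number of synonyms.
            "synonyms": [s.text for s in tc.findall("Synonym")],
        }
    return concepts


thesaurus = read_thesaurus(SAMPLE)
print(thesaurus["tc1"]["prefLabel"])
```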
16. TERMINAE User Manual V11-2
Sylvie Szulman (Paris 13), with contributions from Adeline Nazarenko (Paris 13)
2011, July

Abstract
TERMINAE is a platform that assists users in designing termino-ontological resources from texts. It can be used by terminologists to build terminological forms and by knowledge engineers to build either thesauri expressed in SKOS or ontologies organising concepts and lexical units in a formal way, supporting inferences. This platform makes it possible to link textual elements to terminological and conceptual resources. The acquisition corpus may contain one or several documents. The supported languages are English and French.

Keyword list: ontology acquisition, terminology, assisting tool

Executive Summary
This document is the user guide of TERMINAE. TERMINAE is a platform that assists users in the design of termino-ontological resources from texts. In ONTORULE, it is used to build from texts:
• thesauri expressed in SKOS, and
• ontologies organising in a formal way the concepts associated with the terms and supporting inferences.
This platform makes it possible to link textual elements to terminological and conceptual resources. The corpus may contain one or several documents. The supported languages are English and French.

TERMINAE is organised in three main levels: the first step of the terminological level makes it possible to constitute the set of terms of the corpus; its second step organises these according to lexical and sy
17. Toolkit ontology is used to create an ontology. This ontology is part of the newly created Neon project.
• Create a class is used to create a class in the previous ontology from the selected termino-concept. A dialog window opens in which you have to give a name to the class and select a father class in the existing ontology. The class can be visualized in the Neon toolkit Conceptual level OWL perspective (see Figure 8.2). Note that the class is created with an annotation property in which the link to the source termino-concept and its identifier is saved. Once it has been linked to a class at the conceptual level, the termino-concept is displayed in blue in the TerminoConcept tree.
• To ontology level is used to switch from the termino-conceptual perspective to the OWL one. This action opens the OWL perspective and shows the class corresponding to the selected termino-concept.
• Link to Neon project is used when one wants to exploit an existing Neon toolkit project.
• Link to Neon ontology is used when one wants to exploit an existing ontology in a specified project.
• Link to a class is used to link a termino-concept to an existing class.
• Create an ObjectProperty is used to create an ObjectProperty from a termino-conceptual relation. A dialog window opens and you have to enter the name of the property, the father object property, and its domain and range. The
18. The corpus is in a txt file; it is advised to use UTF-8 encoding. See section for the description of the files used. You can either:
• Create a new project (Create Terminae project) if you start to build a specific termino-ontological resource from a given corpus. You have to specify:
  - The name of your project.
  - The name of the directory where you want to locate your project. A default directory is proposed, but click on the cancel button and navigate through the file system if you want to choose another directory.
• Switch from one project to another (Load Terminae project). Note that only one project can be open at a time. You are first offered to navigate through the file system to select the directory containing the relevant project directory.
• Export the current project (Export project). A zipped file is created in which all the required directories and files are included. If you have created a Neon project, its directory is also included in the zipped file.
• Import an existing project (Import project). The project to be imported is represented as a zipped file containing the project directory with all the required subdirectories and files. You do not have to unzip the file, but you have to specify:
  - The zipped file to load.
  - The name of the directory where you want the project to be imported.

5.2 Help menu
The Help information is not available yet.

5.3 Show View menu
Each perspe
19. al forms: to create all terminological forms from a preexisting thesaurus. This functionality is useful when you want to add terminological information and occurrences to an existing thesaurus. You start from an existing thesaurus and create a terminological form for each termino-concept, using a defined corpus.

8.3.3 Feature management submenu
This submenu proposes various actions related to the detailed information provided for a given termino-concept and recorded in its termino-conceptual form.
• Add a synonym: to add a synonym to the selected termino-concept. A dialog window opens for entering the new synonym. If the corresponding terminological unit has been found by YaTeA or ANNIE, its occurrences are automatically clustered with those of the current termino-concept.
• Remove a synonym: to remove a synonym. You have to confirm whether you also want to remove the related occurrences.
• Add a link: to add a type of link and its value.
• Remove a link: to remove a type of link and its value.

8.3.4 Neon ontology submenu
This menu is used to link TERMINAE and Neon ToolKit. It supports the creation of the conceptual level and many actions to connect it to the termino-conceptual one.
• Create a Neon project is used to create a Neon toolkit project. If you want to work at the conceptual level, you have to create a Neon project and specify its name. It is recommended to use different names for the TERMINAE and Neon projects.
• Create Neon
20. athered with linguistic information in terminological forms.
• The termino-conceptual level (chapter 8) is specific to TERMINAE. Whereas terms are at the vocabulary level, the goal is now to analyse the use of terms in the corpus at the semantic level. The work is to recognize the various senses of a term and distribute them into several termino-concepts, also distributing the occurrences of the term between senses. At the same time, the termino-concepts of the form can be tagged as having a synonym in another form or as being otherwise more loosely related.
• The ontological level (see chapter 9) now relies on termino-concepts and their relations to build the ontology. First, synonym termino-concepts should yield only one concept. All the related termino-concepts help in building the hierarchical relations and defining the roles, as can some other linguistic information gathered during the process, a part of which is under exploration in the framework of the ONTORULE project (in particular the analysis of verbs and SBVR fact forms).

Chapter 3 Technical Characteristics
• The current version of the TERMINAE platform is compiled using the SUN 1.6 Java virtual machine.
• It relies on UTF-8 text encoding.
• It can be used for English and French.

3.1 Installation
To install TERMINAE, you need Java version 1.6. Download the version of the platform for your system from the http://www.lipn.univ-paris13
21. [Figure 8.2: Neon toolkit conceptual level OWL perspective. The screenshot shows the Ontology Navigator and Entity Properties views.]

Chapter 9 Neon toolkit Conceptual level OWL perspective
The conceptual perspective is a Neon toolkit plugin (version 2.4) to which a specific menu has been added for the TERMINAE platform, to link the conceptual and termino-conceptual levels. When using the Neon toolkit conceptual level perspective, you need to create or import a Neon toolkit project (which is different from the Terminae project) and to create or import an ontology in this project. This can be done either from the Neon ontology submenu of the Terminae TerminoConceptual perspective (Create a Neon project and Create Neon Toolkit ontology items) or by creating the project and the ontology from the menu of the navigator view in Neon
22. ctive has many views and a main view, which is on the left side of the perspective. A click on an item in the main view changes the values in the other views. These views may be closed by the user, or he or she may want to see a view of another perspective which is not in the perspective in use (only one perspective may be selected). This menu is used to reopen a view that has previously been closed. Click on the single item Other to visualise the list of available views, and choose Other again to find the TERMINAE views. Select the view you want to reopen or to see, and be aware that the view may depend on one perspective or another.

Chapter 6 Terminae Terminological level step 1 perspective
The Terminae Terminological level allows you to browse and modify the list of domain-specific lexical units that have been extracted from the source corpus using term extraction and named entity recognition tools such as YaTeA and ANNIE. TERMINAE assumes that the acquisition corpus has been processed by TreeTagger and YaTeA, and possibly ANNIE, beforehand. TERMINAE takes as input:
• A tagged corpus (required).
• A list of terms extracted from it (required, see Section 6.1.1).
• Lists of named entities and named entity types (optional, see Section 6.1.2).

6.1 Data: Terminological files
6.1.1 Term files
When you open the Terminological level perspective, you have to specify the terminological data you
23. default presented on the left part of the perspective. It gives the list of all the canonical terminological units for which a terminological form has been created (the form can be In progress or Completed).
• The other views make up the terminological form of the unit that has been selected in the Terminological form list (see Section 7.2).
Note that when the list of terminological forms is selected, you can find any terminological form by typing the first letter of its canonical terminological unit.

7.2 Data: Terminological forms
An example of a terminological form is displayed on the right part of Figure 7.1. A terminological form gathers all the lexical and terminological information that has been collected or manually added for a given term or named entity. It is usually composed of the following views:
[Screenshot of the Terminae Terminological level step 2 perspective (project TestDemo): the Terminological form list on the left (for example abrasion conditioning, acceleration device, adjusting device) and the Lexical information view with the entries Term extractor (Yatea), form (Airbag), grammatical type (NN) and NE extractor (Gate).]
24. e numbers
• Remove adverbs: removes the terms that are tagged as adverbs.

6.3.4 Terminological form actions
This menu is used to define terminological forms (described in the next chapter).
• New terminological form: creates a terminological form for the selected term. Once the terminological form is created, it can be visualized in the Terminae Terminological level step 2 perspective, which is automatically opened, and the lexical unit whose form has been created is displayed in blue in the Lexical units view (Terminae Terminological level step 1 perspective).
• To terminological form: visualises the terminological form of the selected terminological unit, if it has one. This action automatically switches from the Terminae Terminological level step 1 perspective to the Terminae Terminological level step 2 perspective.

Chapter 7 Terminae Terminological level step 2 perspective
This perspective can be opened either by creating a terminological form or from the main Perspective menu (Terminological level step 2).

7.1 Perspective overview
The Terminae Terminological level step 2 perspective is composed of two main parts, with a global view on the left and a set of more detailed and dependent views on the right (see Figure 7.1).
• The Terminological form list view is by
25. erminoConceptual level perspective (see Section 8); Neon toolkit Conceptual level OWL perspective (see Section 9). The first 4 perspectives make up Terminae. The OWL perspective belongs to Neon ToolKit 2.4. Please note that the last Eclipse perspective, Team Synchronizing, is used by Neon ToolKit.
• The item Search is proposed by the Neon toolkit Conceptual level OWL perspective; it is not described in this report.
• The item Help is proposed in all Eclipse applications; it is not described in this report.
• An additional Terminae submenu is proposed on MacOS systems. It gives access to the standard main operations of the application: information (About Terminae), Preferences, Hide Terminae, Quit Terminae.

Chapter 5 Project management perspective
TERMINAE starts with the project management perspective. This perspective has 2 views (Fig. 5.1):
[Figure 5.1: Project management perspective]
• The left view presents the project information if a project has already been defined: project, corpus, thesaurus and author names.
• The right view is a text editor where the user may write comments. To save the comments, press Ctrl+S.

5.1 Terminae project actions menu
A project consists of all data used or created by TERMINAE when building a specific termino-ontological resource from a given corpus (see Section for a description of the project structure).
26. erspective

• Create a termino-concept: creates a termino-concept and links it to the selected class. This functionality is useful when you want to add thesaurus information to an existing ontology: you start from an existing class and create a termino-concept in the thesaurus of the TERMINAE project.
• To link a class to a TC: links a class to an existing termino-concept in the thesaurus of the TERMINAE project.

Chapter 10 Annex

This annex lists the DTDs used by Terminae.

10.1 XML backup DTD for terms

The DTD of the XML file that contains the terms and their occurrences, as displayed in the Terminae Terminological level step 1 perspective:

<!ELEMENT NUMBER_OCCURRENCES (#PCDATA)>
<!ELEMENT OCCURRENCE (ID, DOC, SENTENCE, START_POSITION, END_POSITION, Texte)>
<!ELEMENT DOC (#PCDATA)>
<!ELEMENT END_POSITION (#PCDATA)>
<!ELEMENT FORM (#PCDATA)>
<!ELEMENT ID (#PCDATA)>
<!ELEMENT LEMMA (#PCDATA)>
<!ELEMENT LIST_OCCURRENCES (OCCURRENCE)>
<!ELEMENT LIST_TERM_CANDIDATES (TERM_CANDIDATE)>
<!ELEMENT List_Variants (Variant)>
<!ELEMENT MORPHOSYNTACTIC_FEATURES (SYNTACTIC_CATEGORY)>
<!ELEMENT SENTENCE (#PCDATA)>
<!ELEMENT START_POSITION (#PCDAT
27. eted. Each terminological form is saved in an XML file in the terminoFormDir directory. The list of terminological forms is saved in the file tableTermeFiches.xml in the terminoFormDir directory.

7.3 Terminological actions menu

The action menu associated with the Terminae Terminological level step 2 perspective is the Terminological actions menu. It offers 3 submenus, which are presented in the following subsections:

• Termino-concept management submenu
• Form management submenu
• Feature management submenu

The corresponding actions are also available from the context menu (right-click).

7.3.1 Termino-concept management submenu

This submenu offers three actions:

• Create a termino-concept: creates a termino-concept linked to the selected terminological unit. The termino-concept is added to the current thesaurus. If the terminological unit is a named entity, the type of the named entity may also give rise to a termino-concept, in which case a kindOf link is created between the two termino-concepts.
• Remove a termino-concept: removes a termino-concept from the current thesaurus.
• To TerminoConceptual level: switches from the Terminae Terminological level step 2 perspective to the Terminae TerminoConceptual level perspective.

7.3.2 Form management submenu

This submenu proposes two actions related to termino
28. fr szulman TerminaeWorkbench web page and unzip the downloaded file. The default language is English, but it can be changed. If you want to work with a French platform, edit the terminae.ini file and change the line nl en to nl fr_FR. This file is located in the Terminae directory on Linux and Windows systems, and in the Terminae.app/Contents/MacOS directory on MacOS systems.

3.2 How to start

To launch the TERMINAE platform, click on the Terminae application: Terminae on Linux systems, Terminae.exe on Windows systems or Terminae.app on MacOS. Initially, the project management perspective (Terminae Project perspective) is open and you have to import or create a project.

CHAPTER 3 TECHNICAL CHARACTERISTICS

3.2.1 Project location and structure

In any case, you have to define your project directory. On Linux and Windows systems, it is advisable to place it in the workspace directory created by the Eclipse application. A project has a fixed structure, made up of the following 6 subdirectories:

corpora: Contains the corpus data (raw and tagged) and the results of named entity recognition tools. The current version of the platform is designed to work with TreeTagger and the ANNIE named entity recognition tool.
terminoFormDir: Contains the terminological forms that are created using TERMINAE and output by it.
linguae: Contains the search patterns that have been designed and their results (no pattern design tool
29. he project. Experienced users can easily understand its content and may need to change it in tricky cases (e.g. to rename directories or files). These files are text files or editable XML files.

Chapter 4 Main menu

Figure 4.1 presents the main menu of the TERMINAE platform, which is accessible from any perspective. It presents 4 items, which are associated with specific actions or submenus. This main menu differs slightly from one operating system to another.

Figure 4.1: Main menu (Terminae project actions, Perspectives, Show View, help)

• The action submenu gives access to the specific functionalities available at the Terminae level where you are currently working. The name of this action menu depends on the current perspective: Terminae project actions, Linguistic actions, Terminological actions, TerminoConceptual actions and Terminae links.
• The Perspectives item opens new perspectives: simply click on the name of the perspective you want to open in the perspective list that appears. 5 perspectives are accessible: Terminae Project perspective, which is the default perspective and is opened when a project is loaded (it is presented in Section 5); Terminae Terminological level step 1 perspective (see Section 6); Terminae Terminological level step 2 perspective (see Section 7); Terminae T
30. is available in the current version).
thesauri: Contains the termino-conceptual resources that are created using TERMINAE and output by it.
system: Contains some files automatically created by TERMINAE.
yatea: Contains the results of term extraction tools. The current version of the platform is designed to work with the YaTeA term extractor.

TreeTagger: http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger
ANNIE: http://gate.ac.uk/ie/annie.html
YaTeA: http://search.cpan.org/~thhamon/Lingua-YaTeA-0.5

3.2.2 How to import a project

A project to be imported is provided as a zip file containing the project directory with all the required subdirectories and files of the project. You do not have to unzip the file.

• Go to the main menu.
• Click on Terminae project actions.
• Click on Import project.
• A first dialog window appears, in which you must indicate the zip file to load.
• A second dialog window appears, proposing the directory into which the project will be imported. If you do not accept it, you will be offered the choice of another one.

When the project is imported, its main characteristics are presented in the Terminae project information view (by default on the left of the project perspective) and you can start working on it.

3.2.3 How to create a project

To start working on a new project:

• Go to the main menu.
• Click on Terminae project actions.
• Click on Create Terminae project.
• A
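The project layout described in Sections 3.2.1 and 3.2.2 can be checked before an archive is imported. The following is a minimal Python sketch and is not part of TERMINAE. It assumes that the zip file holds a single top-level project directory. The function names missing_subdirs and make_demo_archive are hypothetical; only the six subdirectory names come from this manual.

```python
import io
import zipfile
from pathlib import PurePosixPath

# The six subdirectories listed in Section 3.2.1.
REQUIRED_SUBDIRS = ("corpora", "terminoFormDir", "linguae",
                    "thesauri", "system", "yatea")


def missing_subdirs(archive):
    """Return the required subdirectories that are absent from a zipped project.

    `archive` is a path or a file-like object. Assumption: every entry is
    stored under one top-level project directory.
    """
    found = set()
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            parts = PurePosixPath(name).parts
            # parts[0] is the project directory, parts[1] a first-level entry.
            # Count it as a directory if it is an explicit directory entry
            # or if at least one file is stored below it.
            if len(parts) > 2 or (len(parts) == 2 and name.endswith("/")):
                found.add(parts[1])
    return [d for d in REQUIRED_SUBDIRS if d not in found]


def make_demo_archive(subdirs, project="demo"):
    """Build a small in-memory archive, for illustration only."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        for sub in subdirs:
            zf.writestr(f"{project}/{sub}/", "")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(missing_subdirs(make_demo_archive(["corpora", "yatea"])))
```

An empty result means that all six subdirectories were found. The check says nothing about the files that TERMINAE expects inside each subdirectory.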
31. logical forms:

• Remove a terminological form
• Validate a terminological form: this action is used to record that the work on this terminological form is completed. It acts as a note for the user.

7.3.3 Feature management submenu

This submenu offers various actions related to the detailed information provided for a given terminological unit and recorded in its terminological form:

• Add a variant: adds a lexical variant of the selected term.
• Remove a variant: removes a lexical variant of the selected term.
• Add a lexical entry: adds a lexical entry for the selected term. You have to type in the entry name and its value, separated by a colon.
• Remove a lexical entry: removes a lexical entry.
• Add a syntactical relation head: adds a phrase in which the selected term is the head.
• Add a syntactical relation modifier: adds a phrase in which the selected term is a modifier.
• Remove a syntactical relation head: removes the selected relation.
• Remove a syntactical relation modifier: removes the selected relation.
• Add a terminological relation: adds a terminological relation in which the selected term is term1 or term2.
• Remove a terminological relation: removes a terminological relation.
• Add an occurrence: adds an occurrence to the selected term. You have to specify the docume
32. n x gt

RB     Adverb
RBR    Adverb, comparative
RBS    Adverb, superlative
RP     Particle
SYM    Symbol
TO     to
UH     Interjection
VB     Verb, base form
VBD    Verb, past tense
VBG    Verb, gerund or present participle
VBN    Verb, past participle
VBP    Verb, non-3rd person singular present
VBZ    Verb, 3rd person singular present
WDT    Wh-determiner
WP     Wh-pronoun
WP$    Possessive wh-pronoun
WRB    Wh-adverb

10.6 TreeTagger French Tagset

ABR        abbreviation
ADJ        adjective
ADV        adverb
DET:ART    article
DET:POS    possessive pronoun (ma, ta, ...)
INT        interjection
KON        conjunction
NAM        proper name
NOM        noun
NUM        numeral
PRO        pronoun
PRO:DEM    demonstrative pronoun
PRO:IND    indefinite pronoun
PRO:PER    personal pronoun
PRO:POS    possessive pronoun (mien, tien, ...)
PRO:REL    relative pronoun
PRP        preposition
PRP:det    preposition plus article (au, du, aux, des)
PUN        punctuation
PUN:cit    punctuation citation
SENT       sentence tag
SYM        symbol
VER:cond   verb conditional
VER:futu   verb future
VER:impe   verb imperative
VER:impf   verb imperfect
VER:infi   verb infinitive
VER:pper   verb past participle
VER:ppre   verb present participle
VER:pres   verb present
VER:simp   verb simple past
VER:subi   verb subjunctive imperfect
VER:subp   verb subjunctive present

10.7 Use ANNIE to extract named entities

This annex describes the procedure to be followed to use ANNIE to extrac
33. [Figure 7.1: Terminae Terminological level step 2 perspective]

• The Lexical information view is a form in which you can freely create, modify or delete fields. By default, four lexical fields are defined: Term extractor Yatea, whose value is X if the terminological unit has been extracted by
34. nt identifier and type in the text of the occurrence.
• Remove an occurrence: removes an occurrence of the selected term. Select the relevant occurrence to indicate which one has to be removed.

Chapter 8 Terminae TerminoConceptual level perspective

This perspective must be opened from the Perspectives submenu of the main menu, by selecting Terminae TerminoConceptual level.

8.1 Perspective overview

The presentation of the Terminae TerminoConceptual level perspective is very similar to that of the Terminae Terminological level step 2 perspective. It is composed of two main parts, with a global view on the left and a set of more detailed, dependent views on the right (see Figure 8.1).

• The TerminoConcept tree view is, by default, presented in the left part of the perspective. It shows the hierarchy of all the termino-concepts that have been created.
• The other views form the termino-conceptual form of the termino-concept that has been selected in the TerminoConcept tree (see Section 8.2).

Note that you can find a termino-concept simply by typing its first letter in the TerminoConcept tree view.

8.2 Data: Termino-conceptual forms

The termino-conceptual level is a bridge between the terminological level and the conceptual level (the ontology). It is made of a set of termino-concepts, which are themselves described by termino-conceptual forms gathering the relevant information that has been collected or defined fo
35. ntactic relations; the termino-conceptual level organizes the terminology according to semantic relations; the third level, the ontological level, makes it possible to create a formal ontology out of the list of termino-concepts created at the second level.

This document describes the functionalities of the TERMINAE platform. The first chapter describes the technical characteristics and the installation instructions. The following chapters present the main menus of the platform that are accessible from its main window.

Contents

1 Introduction
2 The Terminae method
3 Technical Characteristics
  3.1 Installation
  3.2 How to start
    3.2.1 Project location and structure
    3.2.2 How to import a project
    3.2.3 How to create a project
  3.3
4 Main menu
5 Project management perspective
  5.1 Terminae project actions menu
  5.2
  5.3 Show
6 Terminae Terminological level step 1 perspective
  6.1 Data: Terminological files
    6.1.1 Term files
    6.1.2 Named entity files
  6.2 Perspective overview
  6.3 Linguistic actions menu
    6.3.1 File submenu
    6.3.2 Term Management submenu
    6.3.3 Cleaning submenu
    6.3.4 Terminological form actions
36. objectProperty is created with an annotation property in which the name and type of the source termino-conceptual relation are saved.
• Link a RT and an ObjectProperty: links a termino-conceptual relation to an existing objectProperty.
• Link a RT and a class: links a termino-conceptual relation to an existing class.
• Create classes and TCs: derives a set of classes from a set of selected termino-concepts. If these termino-concepts have termino-conceptual relations, objectProperties are created and linked to these source relations.
• Create classes and TCs without dialog: offers the same functionality as above, but without any dialog. The default values are systematically kept: the name of the class is the name of the termino-concept, and the name of the objectProperty is the name of the RTC. If termino-concepts are linked by an isKindOf link, the corresponding classes are placed in the same hierarchical order.
• Link to an individual: links a termino-concept to an individual. You have to enter the individual name and select the class to which it belongs, using dialog windows.
• Create an individual: creates an individual. You have to enter the individual name and select the class to which it belongs, using dialog windows.
37. on ontology submenu

The corresponding actions are also available from the context menu (right-click).

8.3.1 File submenu

This menu is used to load and save termino-conceptual data. It offers the following actions:

• Load XML format: loads a thesaurus in XML format (see the DTD in Annex 10.4).
• Save XML format: saves a thesaurus in XML format.
• Import SKOS: loads an existing thesaurus in SKOS format.
• Export SKOS: exports a thesaurus in SKOS format. A dialog window opens, in which you have to define a URI that is added to the names of the SKOS concepts to guarantee that they are uniquely identified (for instance http://www.lipn.univ-paris13.fr/terminae). Note that, in the current version of the TERMINAE platform, the termino-conceptual relations are not described in the exported file.
• Export SKOS RDF/XML format: exports a thesaurus in RDF/XML format. A dialog window opens, in which you have to define a URI, as for the SKOS format.

8.3.2 Termino-concept management submenu

• Create termino-concept: creates a new termino-concept. You have to type in the name of the termino-concept if it is not created directly from a terminological unit.
• Remove termino-concept: removes the selected termino-concept. You have to confirm the removal.
• Rename termino-concept: changes the name of the selected termino-concept.
• Add kindOf link
38. peEn>
<typeEn>Person</typeEn>
<typeEn>Percent</typeEn>
<typeEn>Location</typeEn>
<typeEn>Money</typeEn>
<typeEn>Title</typeEn>
<typeEn>Address</typeEn>
<typeEn>Unknown</typeEn>
<typeEn>Jobtitle</typeEn>
<typeEn>FirstPerson</typeEn>
<typeEn>Location</typeEn>
<typeEn>UrlPre</typeEn>
</ensTypeEn>
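A named entity type file of this kind can be read or produced with a few lines of code. The sketch below is an illustration in Python, not code taken from TERMINAE. It assumes that the file consists of a single ensTypeEn root element holding one typeEn element per type, as the listing above suggests. The function names read_entity_types and write_entity_types are hypothetical.

```python
import xml.etree.ElementTree as ET

# A shortened sample in the assumed layout: one <typeEn> per entity type.
SAMPLE = """<ensTypeEn>
  <typeEn>Person</typeEn>
  <typeEn>Location</typeEn>
  <typeEn>Money</typeEn>
  <typeEn>Location</typeEn>
</ensTypeEn>"""


def read_entity_types(xml_text):
    """Return the entity type names in file order, without duplicates."""
    root = ET.fromstring(xml_text)
    names = []
    for node in root.iter("typeEn"):
        name = (node.text or "").strip()
        if name and name not in names:
            names.append(name)
    return names


def write_entity_types(names):
    """Serialize a list of type names into the assumed file layout."""
    root = ET.Element("ensTypeEn")
    for name in names:
        ET.SubElement(root, "typeEn").text = name
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(read_entity_types(SAMPLE))
    print(write_entity_types(["Person", "Location"]))
```

Removing duplicates is a choice made for this sketch (the listing above contains Location twice). Whether TERMINAE itself tolerates or expects repeated types is not stated in this manual.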
39. r those termino concepts

[Screenshot: Terminae TerminoConceptual level perspective]
40. [Figure 6.2: Visualisation of terms and named entities]

6.3.2 Term Management submenu

This menu is used to manage terminological data, i.e. to display the list of terminological units and edit it by clustering, removing or adding some of them. The Term Management menu offers 9 different actions:

• Visualize all terms: redisplays the list of terminological units after a search sequence.
• Find a term: searches for a specific unit on the basis of its first characters. Note that this functionality is also directly accessible, when the list of terms is selected, by typing the first
41. t named entities from a given document (only one document can be processed at a time). Note that the following procedure is taken from the GATE documentation for processing English corpora: http://gate.ac.uk/sale/tao/splitch3.html

GATE enables you to extract named entities from plain text and to annotate your corpus with them. GATE is distributed with an IE system called ANNIE. ANNIE relies on finite state algorithms and the JAPE language.

Take one large pile of text (documents, emails, etc.). Call this your corpus. If you right-click on Language Resources in the resources pane and select New, then GATE Document, the window Parameters for the new GATE Document will appear.

Once you have indicated the corpus to work on, you can call ANNIE. From the File menu, select Load ANNIE System. To run it in its default state, choose with Defaults. This will automatically load all the ANNIE resources and create a corpus pipeline called ANNIE, with the correct resources selected in the right order and the default input and output annotation sets.

If without Defaults is selected, the same processing resources will be loaded, but a popup window will appear for each resource, enabling the user to specify a name, location and other parameters for the resource. This is exactly the same procedure as for loading a processing resource individually, the difference being that the system automatically selects those resources contained within ANNI
42. the terminological results and can be reloaded upon request when the YaTeA results are loaded. The occurrences of the selected terminological unit in the working corpus appear in the right view.

6.3 Linguistic actions menu

The action menu associated with the Terminae Terminological level step 1 perspective is the Linguistic actions menu. It offers 3 submenus and 2 actions, which are also available from the context menu (right-click):

• File submenu
• Term management submenu
• Cleaning submenu
• New terminological form action
• To terminological form action

These submenus and actions are presented in the following subsections.

6.3.1 File submenu

This menu is used to load and save terminological data. It offers the following actions:

• Load Yatea results: loads the terms initially extracted from your corpus by YaTeA or saved in an XML backup. The procedure is the same as that described in Section 6.1.1.
• Save Yatea results: makes an XML backup (see Annex 10.1 for details on the file format).
• Load named entities from ANNIE results: loads the named entities identified by the ANNIE named entity recognition tool (see Section 6.1.2). A first file dialog window opens, in which you have to indicate which named entity types you are interested in by selecting a named entity type XML file, which should be located in the corpus s
43. to give a father to the selected termino-concept. A dialog window opens, in which you have to give the name of the father termino-concept.
• Remove kindOf link: removes a father of the selected termino-concept.
• Add a RTC: adds a termino-conceptual relation for the selected termino-concept. A first dialog window opens, in which you have to give the name of the relation. A second dialog window opens, in which you have to click on OK if the selected termino-concept is the domain, and on Cancel if it is not. A third dialog window opens, in which you have to give the name of the range or of the domain, depending on the previous answer. That termino-concept must already exist. A choice dialog window then opens, in which you have to select the SKOS type of the relation.
• Remove a RTC: removes the selected termino-conceptual relation.
• Add occurrence: adds an occurrence to the selected termino-concept.
• Remove occurrence: removes an occurrence of the selected termino-concept. You have to select the identifier of the occurrence to be removed.
• Create a terminological form: creates a terminological form from a termino-concept. This functionality is useful when you want to add terminological information and occurrences to an existing thesaurus: you start from an existing termino-concept and create a terminological form using a defined corpus.
• Create all terminologic
44. ubdirectory of your project. A second file dialog window opens, in which you have to select another XML file containing the list of named entities extracted by ANNIE. This file should also be located in the corpus subdirectory of your project.
• Save named entities: makes an XML backup (see Annex 10.2 for details on the file format).
• Load named entities: loads the named entities from an XML backup.
• Load all lexical units: loads the terms and named entities from a single XML backup.
• Save all lexical units: makes an XML backup of all entities, terms and named entities (see Annex 10.3 for details on the file format).

If everything works properly, the window of Figure 6.2 appears when all types of terminological data are loaded.
45. ut by the ANNIE named entity recognition tool (see Annex 10.7 for details on the file format) and which are expected to be located in the corpus subdirectory of your project:

• The first XML file indicates which named entity types you are interested in.
• The second XML file contains the list of named entities extracted by ANNIE.

To create such files, follow the procedure described in Annex 10.7.

6.2 Perspective overview

If everything works properly when loading the terminological data, the window of Figure 6.1 appears when the Terminae Terminological level step 1 perspective is first opened. The window is composed of two views: the Lexical units view on the left and the Occurrences view on the right.

The terminological units, either terms or named entities, are listed in the left view. By clicking on the column headers, you can sort the list alphabetically.
46. want to start with (note that additional data can be loaded afterwards). You have to:

• Load a term list (Load Yatea file), which is expected to be located in the yatea subdirectory of your project.
• Indicate how many documents your corpus contains. Note that documents are numbered starting from 1 if there are several of them, but that a single document has number 0.
• Select the tagged corpus from which the terms have been extracted (tt file). It is expected to be located in the corpus subdirectory of your project.
• Specify the corpus language: English (en) or French (fr).

YaTeA: http://search.cpan.org/~thhamon/Lingua-YaTeA-0.5
ANNIE: http://gate.ac.uk/ie/annie.html

When the terminological data is loaded, TERMINAE creates two additional files in the yatea directory:

• fTempCorpus2XML.xml, which is an XML version of the corpus;
• fTempTT2XML.xml, which is an XML version of the tagged corpus.

If you have several documents, each one must be processed by TreeTagger and the results must be concatenated in a single file, in which the initial documents are separated by a document tag of the following form:

Text n TAB Document TAB n

where TAB is the tabulation character and n varies between 0 and x-1, x being the total number of documents.

6.1.2 Named entity files

You may also want to work with named entities. In that case, you need two files that are outp
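The concatenation of several TreeTagger outputs described in Section 6.1.1 can be scripted. The following Python sketch is only an illustration under stated assumptions: it takes the document tag literally as the word Text, a space and the number n, followed by Document and n again, separated by tabulation characters, and it writes one tag line before each document. The exact spelling of the tag and its position may differ in the original manual, since punctuation was lost in this copy. The function names document_tag and concatenate_tagged are hypothetical.

```python
def document_tag(n):
    """Build the separator line for document number n (assumed spelling)."""
    return f"Text {n}\tDocument\t{n}"


def concatenate_tagged(documents):
    """Join TreeTagger outputs (one string per document) into one text.

    Documents are numbered from 0 to x-1, x being their total number.
    Blank lines inside a document are dropped.
    """
    lines = []
    for n, tagged in enumerate(documents):
        lines.append(document_tag(n))
        lines.extend(line for line in tagged.splitlines() if line.strip())
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Each TreeTagger line holds: token TAB tag TAB lemma.
    first = "The\tDT\tthe\nbelt\tNN\tbelt"
    second = "A\tDT\ta\nbuckle\tNN\tbuckle"
    print(concatenate_tagged([first, second]), end="")
```

The result can then be written to a single file in the corpus subdirectory and selected as the tagged corpus when the term list is loaded.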