        TERMINAE User Manual - V11-2
         Contents
1.  7 Terminae Terminological level (step 2) perspective
      7.1 Perspective overview
      7.2 Data: Terminological forms
      7.3 Terminological actions menu
        7.3.1 Termino-concept management submenu
        7.3.2 Form management submenu
        7.3.3 Feature management submenu
    8 Terminae TerminoConceptual level perspective
      8.1 Perspective overview
      8.2 Data: Termino-conceptual forms
      8.3 TerminoConceptual actions menu
        8.3.1 File submenu
        8.3.2 Termino-concept management submenu
        8.3.3 Feature management submenu
        8.3.4 Neon ontology submenu
    9 Neon toolkit Conceptual level (OWL) perspective
      9.1 Perspective overview
      9.2 Terminae links menu
      10.1 XML backup DTD for terms
      10.2 XML backup DTD for ENs
      10.3 EnsLexUnit DTD
      10.4 Thesaurus DTD
      10.5 TreeTagger English Tagset
      10.6 TreeTagger French Tagset
      10.7 Use ANNIE to extract named entities
      10.8 Gate named entity type file

Chapter 1

Introduction

This document describes the functionalities of the TERMINAE platform, which is an Eclipse application. Chapter 2 gives a very short insight into the methodology. Chapter 3 gives the technical characteristics and the installation instructions.
2.  [Figure 6.3: Select an occurrence identifier]

CHAPTER 6. TERMINAE TERMINOLOGICAL LEVEL (STEP 1) PERSPECTIVE

• Add occurrence for a term to enter a new occurrence for a term. You have to select a term and fill in the form (see Figure 6.4).

• Remove occurrence for a term to remove an occurrence for a term. You have to select the identifier of the occurrence you want to remove.

[Figure 6.4: Add occurrence for a term. The dialog asks for the term, the document ID, the sentence ID and the occurrence text; the document number and occurrence text fields are required.]

6.3.3 Cleaning submenu

This menu lets you clean up the list of terminological units by removing a certain category of terms or named entities. Various options are proposed:

• Remove terms listed in a file suppresses all the terminological units that are listed in a given file. You have to give the name of that file, in which the stop words are listed, one per line.

• Remove terms involving given characters cleans the list of terminological units on a character basis. You have to type in the list of forbidden characters.

• Remove single character terms suppresses the single-character terms from the list of terminological units.

• Remove adjectives suppresses the terms that are tagged as adjectives.

• Remove numbers allows to suppress the terms that ar
3.  Chapter 4 presents the main menu, and the following chapters (chapters 5 to 9) introduce the 5 perspectives of the platform and the related functionalities.

Chapter 2

The Terminae method

TERMINAE is a tool that is supported by a method, and a (very short) foreword on the method can help in using the tool. The task is to build a domain ontology. This is an expert task, since it requires deciding which concepts are really important for the domain and how they are related. Experience has shown that linguistic tools, relying on texts specific to the domain, can help the expert. They do not do the work in his/her place, but they propose a good starting point to improve the coverage of the domain, and some of the ambiguities they raise reveal real and unseen ambiguities of the domain vocabulary.

The TERMINAE method starts from the linguistic results produced by TreeTagger and YaTeA. It then has three steps:

• At the linguistic level, the input is a list of term candidates, i.e. words or groups of words which, on a linguistic basis, could possibly figure in a terminology of the domain (a list of its main terms). The goal of this level is, in a first step (chapter 6), to constitute, clean and improve the list, removing parasitic or irrelevant proposals. A second step (chapter 7) involves grouping those which are morphological variants of the same term and collecting linguistic relations. This work relies on the list of occurrences of each term, which are  g
4.  YaTeA term extractor
    - form, which gives its canonical form
    - grammatical type, which gives its grammatical category
    - NE extractor (Gate), whose range is its type if the lexical unit is a recognised named entity

The first three fields are automatically filled in with information provided by YaTeA. The last one is ANNIE (Gate) information.

• The Variants view lists all the lexical forms that are associated as variants with the canonical form. They can be found in the corpus or manually added.

CHAPTER 7. TERMINAE TERMINOLOGICAL LEVEL (STEP 2) PERSPECTIVE

• The Relations view presents the relations that the terminological unit has:
    - The Syntactical relations list shows the phrases to which it belongs, either as a head or as a modifier. The syntactical information is provided by the YaTeA analysis of the corpus.
    - The Terminological relations list shows its terminological relationships. In the current version of the TERMINAE platform, the terminological relations have to be filled in manually.

• The Occurrences view lists all the occurrences of the terminological unit that have been identified. They can be occurrences of the canonical form or of any of its alternative (variant) forms.

• The Related termino-concepts view shows to which termino-concepts the terminological unit is related.

As indicated in the second column of the Terminological form list view, a terminological form can be In progress or Compl
5.  [Figure 6.1: Visualisation of Yatea results. Screenshot showing the term list with frequencies and, for the selected term, its occurrences with their occurrence, document and sentence identifiers.]

(Term), by frequency (Frequency), or by type, terms vs. named entities and named entity type (Named entity).

The last column of the Lexical units view lets you write comments: if you click on a comment cell, a text field appears and you can add a comment to the corresponding terminological unit. The comments are saved with 
6.  corpus annotated with ANNIE, right-click on a document in the resources tree and choose "Save as XML...". In addition, all documents in a corpus can be saved as individual XML files into a directory by right-clicking on the corpus in the resources tree and choosing the option "Save as XML...".

For French corpora, you have to install TreeTagger and load the Tagger_Framework plugin. In the resource directory, you find TreeTagger FR Tokenization gapp. You load this application in the Gate platform. You also load the Lang_French plugin and the french.gapp Gate application. The selected processing resources are defined in Figure 10.1.

[Figure 10.1: Selected processing resources. The screenshot lists, by name and type: Document Reset PR; RegEx Sentence Splitter; GenericTagger; French Gazetteer (ANNIE Gazetteer); ANNIE POS Tagger; ANNIE NE Transducer.]

CHAPTER 10. ANNEX

10.8 Gate named entity type file

The DTD of the XML file which contains the named entity types and which is used when loading named entities (see 6.1.2):

    <?xml version="1.0" encoding="UTF-8"?>
    <ensTypeEn>
    <typeEn>Organization</typeEn>
    <typeEn>Date</ty
7.  first dialog window appears, in which you must indicate the name of the project.

• A second dialog window appears, in which you must indicate in which directory you want to locate the project. A directory with the same name as the project is automatically created with 6 subdirectories.

To start working on your project to build a termino-ontological resource from a given corpus, you need to have at least the following files in your project directory (more details in 6.1.1):

• In the corpora subdirectory:
    - A tagged version of the raw corpus (.txt): a .tt file as output by TreeTagger.

• In the yatea subdirectory, the list of terms that have been extracted from the tagged version of the corpus by YaTeA (.xml file).

You must also give the name of the corpus, if you exploit one, and the name(s) of the author(s) of the future resource(s).

When the project is created, its main characteristics are presented in the Terminae project information view, on the left (by default) of the project perspective, and you can start working on it.

CHAPTER 3. TECHNICAL CHARACTERISTICS

3.3 Hidden files

The software creates 2 hidden files to manage the Terminae application:

• The file .Terminae contains the name of the current project. It is created in the directory where you launch the Terminae application. You normally do not need to modify it.

• The file .nameOfProject.xcfg defines the configuration of each project: the set of files exploited by t
8.  letter of the searched term.

• Cluster terms to cluster several lexical units. You first have to select the various units you want to cluster, then click on the Cluster terms action and choose the canonical form you want to keep. The alternative forms are removed from the term list and all their occurrences are attached to the canonical form, whose frequency count is updated.

• Add a term to add a new term to the term list.

• Remove a term to remove the selected term from the list.

• Undo remove to undo the last remove action. This may also undo a cleaning action (see Section 6.3.3).

• View occurrence context to visualise the sentences surrounding an occurrence. You have to select the occurrence identifier (see Figure 6.3) and to set the size of the expected context (expressed as a number of sentences).

[Screenshot: Terminae Terminological level (step 1) perspective, with the Lexical units view (columns Term, Frequency, Named entity, comments) and the Occurrences view listing occurrences by identifier, document and sentence number.]
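The effect of the Cluster terms action described above can be pictured as a merge of frequency counts and occurrence lists. The following is only a minimal sketch in Python, not TERMINAE's actual implementation: the function name cluster_terms, the two dictionaries and the sample terms and counts are assumptions made for illustration (only the identifiers occ5661 and occ5662 are borrowed from the screenshot).

```python
def cluster_terms(frequencies, occurrences, selected, canonical):
    """Merge the selected lexical units into the chosen canonical form.

    frequencies: dict mapping a term to its frequency count (hypothetical model)
    occurrences: dict mapping a term to its list of occurrence identifiers
    selected:    the lexical units the user selected for clustering
    canonical:   the form to keep; it must be one of the selected units
    """
    if canonical not in selected:
        raise ValueError("the canonical form must be one of the selected units")
    for term in selected:
        if term == canonical:
            continue
        # The alternative form is removed from the term list ...
        freq = frequencies.pop(term)
        occs = occurrences.pop(term)
        # ... and its occurrences are attached to the canonical form,
        # whose frequency count is updated accordingly.
        frequencies[canonical] += freq
        occurrences[canonical].extend(occs)
    return frequencies, occurrences


# Hypothetical sample data for illustration.
freqs = {"CRS": 2, "child restraint system": 1, "CRF": 2}
occs = {
    "CRS": ["occ5661", "occ5662"],
    "child restraint system": ["occ7001"],
    "CRF": ["occ8000", "occ8001"],
}
cluster_terms(freqs, occs, ["CRS", "child restraint system"], "CRS")
print(freqs)
print(occs["CRS"])
```

Modifying the dictionaries in place keeps the sketch short; a real tool would also need to record the removed forms so that the action can be undone.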
9.  [Figure 8.1: Terminae TerminoConceptual level perspective]

A termino-conceptual form is usually composed of the following views:

• The TerminoConcept features view presents the properties of the selected termino-concept:
    - its Synonyms;
    - its Links, which have been derived from the terminological levels. This mainly holds for termino-concepts related to named entities, for which type information can be collected. Typical links are brother and father links.

• The NL definition view lets you enter a natural language definition for the selected termino-concept.

• The Occurrences view presents the occurrences in the corpus of the lexical units to which the termino-concept is linked.

CHAPTER 8. TERMINAE TERMINOCONCEPTUAL LEVEL PERSPECTIVE

• The TC relations view presents the termino-conceptual relations in which the termino-concept is domain or range.

Note that the meaning of a termino-concept is not formally defined. It is mainly described by its related occurrences.

8.3 TerminoConceptual actions menu

The action menu associated with the Terminae TerminoConceptual level perspective is the TerminoConceptual action menu. It proposes 4 submenus, which are presented in the following subsections:

    File submenu
    Termino-concept management submenu
    Feature management submenu
    Ne
10.  toolkit conceptual level perspective (click right).

In the Neon toolkit conceptual level perspective, you can also import an existing project. In this case, you have to refresh the view to display the imported project and to link it to the terminoConceptual perspective (see the following section). You can also import an ontology (use the import item from the menu of the navigator view of the Neon toolkit conceptual level perspective).

9.1 Perspective overview

The presentation of the Neon toolkit Conceptual level (OWL) perspective is very similar to that of the Terminae TerminoConceptual level perspective. It is composed of two main parts, with a global view on the left and a set of more detailed and dependent views on the right (see Figure 8.2). See the documentation (http://www.neon-toolkit.org/wiki/Documentation and Support).

CHAPTER 9. NEON TOOLKIT CONCEPTUAL LEVEL (OWL) PERSPECTIVE

9.2 Terminae links menu

The Terminae links menu has been added to the Neon Toolkit perspective to link the conceptual and the termino-conceptual levels of Neon and TERMINAE projects and of the resulting termino-conceptual resources.

• To terminoConceptual level is used to switch from the Neon toolkit Conceptual level (OWL) perspective to the Terminae TerminoConceptual level perspective. Clicking on this action item (re)opens the termino-conceptual perspective and selects the termino-concept associated with the class initially selected in the conceptual p
11. A)>
    <!ELEMENT SYNTACTIC_CATEGORY (#PCDATA)>
    <!ELEMENT TERM_CANDIDATE (ID, LEMMA, FORM, List_Variants,
        NUMBER_OCCURRENCES, LIST_OCCURRENCES, MORPHOSYNTACTIC_FEATURES)>
    <!ELEMENT TERM_EXTRACTION_RESULTS (LIST_TERM_CANDIDATES)>
    <!ELEMENT Texte (#PCDATA)>
    <!ELEMENT Variant (#PCDATA)>

10.2 XML backup DTD for ENs

The DTD of the XML file which contains named entities and their occurrences, which is visualized in the Terminae Terminological level (step 1) perspective.

    <!ELEMENT DOC (#PCDATA)>
    <!ELEMENT END_POSITION (#PCDATA)>
    <!ELEMENT FORM EMPTY>
    <!ELEMENT ID (#PCDATA)>
    <!ELEMENT LEMMA (#PCDATA)>
    <!ELEMENT LIST_EN (NAMED_ENTITY)>
    <!ELEMENT LIST_OCCURRENCES (OCCURRENCE)>
    <!ELEMENT LIST_SENT (SENT)>
    <!ELEMENT List_Lemme EMPTY>
    <!ELEMENT List_Variants EMPTY>
    <!ELEMENT NAMED_ENTITY (ID, LEMMA, FORM, List_Variants,
        Types, NUMBER_OCCURRENCES, LIST_OCCURRENCES, LIST_SENT)>
    <!ELEMENT NUMBER_OCCURRENCES (#PCDATA)>
    <!ELEMENT OCCURRENCE (ID, DOC, SENTENCE, START_POSITION,
        END_POSITION, Texte)>
    <!ELEM
12. E

When the resources have been loaded, a corpus pipeline called ANNIE will be created as before.

JAPE is a Java Annotation Patterns Engine. It provides finite state transduction over annotations based on regular expressions. JAPE allows you to recognise regular expressions in annotations on documents.

The next step is to add a corpus, and select this corpus from the drop-down corpus menu in the Serial Application editor. Finally click on "Run" from the Serial Application editor, or right-click on the application name in the resources pane and select "Run".

To view the results, double-click on one of the documents contained in the processed corpus in the left-hand tree view. No annotation sets nor annotations will be shown until annotations are selected in the annotation sets; the "Default" set is indicated only with an unlabelled right arrowhead, which must be selected in order to make the available annotations visible. Open the default annotation set and select some of the annotations to see what the ANNIE application has done.

Having selected an annotation type in the annotation sets view, hovering over an annotation in the main resource viewer or right-clicking on it will bring up a popup box containing a list of the annotations associated with it, from which one can select an annotation to view in the annotation editor, or, if there is only one, the annotation editor for that annotation.

Now to save your
13. ENT SENT (ID, offset, phrase, List_Lemme)>
    <!ELEMENT SENTENCE (#PCDATA)>
    <!ELEMENT START_POSITION (#PCDATA)>
    <!ELEMENT Texte (#PCDATA)>
    <!ELEMENT Types (type)>
    <!ELEMENT offset (#PCDATA)>
    <!ELEMENT phrase (#PCDATA)>
    <!ELEMENT type (#PCDATA)>

10.3 EnsLexUnit DTD

The DTD of the XML file which contains terms, named entities and their occurrences, which is visualized in the Terminae Terminological level (step 1) perspective.

    <!ELEMENT DOC (#PCDATA)>
    <!ELEMENT END_POSITION (#PCDATA)>
    <!ELEMENT Ens_Variants EMPTY>
    <!ELEMENT FORM (#PCDATA)>
    <!ELEMENT ID (#PCDATA)>
    <!ELEMENT LEMMA (#PCDATA)>
    <!ELEMENT LIST_EN (NAMED_ENTITY)>
    <!ELEMENT LIST_OCCURRENCES (OCCURRENCE*)>
    <!ELEMENT LIST_SENT (SENT*)>
    <!ELEMENT LIST_TERM_CANDIDATES (TERM_CANDIDATE)>
    <!ELEMENT List_Variants (Variant)>
    <!ELEMENT MORPHOSYNTACTIC_FEATURES (SYNTACTIC_CATEGORY)>
    <!ELEMENT NAMED_ENTITY (Ens_Variants | ID | LEMMA | LIS
14. OSITION,
        END_POSITION, Texte)>
    <!ELEMENT PrefLabel (#PCDATA)>
    <!ELEMENT RelationRTC (name, domain, range, Skos_type)>
    <!ELEMENT SENTENCE (#PCDATA)>
    <!ELEMENT START_POSITION (#PCDATA)>
    <!ELEMENT See_also (#PCDATA)>
    <!ELEMENT SetRTC (RelationRTC)>
    <!ELEMENT Skos_type (#PCDATA)>
    <!ELEMENT Synonym (#PCDATA)>
    <!ELEMENT TerminoConcept (ID | NL_Definition | OCCURRENCE |
        PrefLabel | See_also | SetRTC | Synonym | children | fathers)*>
    <!ELEMENT Texte (#PCDATA)>
    <!ELEMENT child (#PCDATA)>
    <!ELEMENT children (child*)>
    <!ELEMENT domain (#PCDATA)>
    <!ELEMENT father (#PCDATA)>
    <!ELEMENT fathers (father)>
    <!ELEMENT name (#PCDATA)>
    <!ELEMENT range (#PCDATA)>

10.5 TreeTagger English Tagset

    CC    Coordinating conjunction
    CD    Cardinal number
    DT    Determiner
    EX    Existential there
    FW    Foreign word
    IN    Preposition or subordinating conjunction
    JJ    Adjective
    JJR   Adjective, comparative
    JJS   Adjective, superlative
    LS    List item marker
    MD    Modal
    NN    Noun, singular or mass
    NNS   Noun, plural
    NP    Proper noun, singular
    NPS   Proper noun, plural
    PDT   Predeterminer
    POS   Possessive ending
    PP    Personal pronoun
    PP$   Possessive pronoun
15. T_OCCURRENCES | LIST_SENT | NUMBER_OCCURRENCES | Types)*>
    <!ELEMENT NUMBER_OCCURRENCES (#PCDATA)>
    <!ELEMENT OCCURRENCE (ID, DOC, SENTENCE, START_POSITION,
        END_POSITION, Texte)>
    <!ELEMENT SENT EMPTY>
    <!ATTLIST SENT ID CDATA #REQUIRED>
    <!ELEMENT SENTENCE (#PCDATA)>
    <!ELEMENT START_POSITION (#PCDATA)>
    <!ELEMENT SYNTACTIC_CATEGORY (#PCDATA)>
    <!ELEMENT TERM_CANDIDATE (ID, LEMMA, NUMBER_OCCURRENCES,
        LIST_OCCURRENCES, FORM, MORPHOSYNTACTIC_FEATURES, List_Variants,
        NAMED_ENTITY)>
    <!ELEMENT TERM_EXTRACTION_RESULTS (LIST_TERM_CANDIDATES, LIST_EN)>
    <!ELEMENT Texte (#PCDATA)>
    <!ELEMENT Types (type)>
    <!ELEMENT Variant (#PCDATA)>
    <!ELEMENT type (#PCDATA)>

10.4 Thesaurus DTD

The DTD of the XML file which contains a thesaurus, which is visualized in the Terminae TerminoConceptual level perspective. A thesaurus contains a collection of termino-concepts. Each termino-concept is described by an ID, a natural language definition, corpus occurrences, a prefLabel, a set of "see also", a set of synonyms (altLabel), a set of children and its father.

    <!ELEMENT DOC (#PCDATA)>
    <!ELEMENT END_POSITION (#PCDATA)>
    <!ELEMENT EnsTerminoConcepts (name, TerminoConcept+)>
    <!ELEMENT ID (#PCDATA)>
    <!ELEMENT NL_Definition (#PCDATA)>
    <!ELEMENT OCCURRENCE (ID, DOC, SENTENCE, START P
16. TERMINAE User Manual v11-2

Sylvie Szulman (Paris 13)

with contributions from Adeline Nazarenko (Paris 13)

2011 July

Abstract

TERMINAE is a platform that assists users in designing termino-ontological resources from texts. It can be used by terminologists to build terminological forms, and by knowledge engineers to build either thesauri expressed in SKOS or ontologies organising concepts and lexical units in a formal way that supports inferences.

This platform makes it possible to link textual elements to terminological and conceptual resources. The acquisition corpus may contain one or several documents. The supported languages are English and French.

Keyword list: Ontology acquisition, terminology, assisting tool

Executive Summary

This document is the user guide of TERMINAE.

TERMINAE is a platform that assists users in the design of termino-ontological resources from texts. In ONTORULE, it is used to build from texts:

• thesauri expressed in SKOS, and

• ontologies organising in a formal way the concepts associated with the terms and supporting inferences.

This platform makes it possible to link textual elements to terminological and conceptual resources. The corpus may contain one or several documents. The supported languages are English and French.

TERMINAE is organised in three main levels: the first step of the terminological level makes it possible to constitute the set of terms of the corpus; its second step organises these according to lexical and sy
17. Toolkit ontology is used to create an ontology. This ontology is part of the newly created Neon project.

• Create a class is used to create a class in the previous ontology from the selected termino-concept. A dialog window opens, in which you have to give a name to the class and select a father class in the existing ontology. The class can be visualized in the Neon toolkit Conceptual level (OWL) perspective (see Figure 8.2). Note that the class is created with an annotation property in which the link to the source termino-concept and its identifier is saved. Once it has been linked to a class at the conceptual level, the termino-concept is displayed in blue in the TerminoConcept tree.

• To ontology level is used to switch from the termino-conceptual perspective to the OWL one. This action opens the OWL perspective and shows the class corresponding to the selected termino-concept.

• Link to Neon project is used when one wants to exploit an existing Neon toolkit project.

• Link to Neon ontology is used when one wants to exploit an existing ontology in a specified project.

• Link to a class is used to link a termino-concept to an existing class.

• Create an ObjectProperty is used to create an ObjectProperty from a termino-conceptual relation. A dialog window opens and you have to enter the name of the property, the father object property, its domain and range. The 
18. VE 11

The corpus is in a txt file (it is advised to use UTF-8 encoding). See section for a description of the files used. You can either:

• Create a new project (Create Terminae project) if you start to build a specific termino-ontological resource from a given corpus. You have to specify:
    - The name of your project.
    - The name of the directory where you want to locate your project. A default directory is proposed, but click on the cancel button and navigate through the file system if you want to choose another directory.

• Switch from one project to another (Load Terminae project; note that only one project can be open at a time). You are first offered to navigate through the file system to select the directory containing the project directory concerned.

• Export the current project (Export project). A zipped file is created, in which all the required directories and files are included. If you have created a Neon project, its directory is also included in the zipped file.

• Import an existing project (Import project). The project to be imported is represented as a zipped file containing the project directory with all the required subdirectories and files. You do not have to unzip the file, but you have to specify:
    - The zipped file to load.
    - The name of the directory where you want the project to be imported.

5.2 Help menu

The Help information is not available yet.

5.3 Show View menu

Each perspe
19. al forms to create all terminological forms from a preexisting thesaurus. This functionality is useful when you want to add terminological information and occurrences to an existing thesaurus. You start from an existing thesaurus and create a terminological form for each termino-concept using a defined corpus.

8.3.3 Feature management submenu

This submenu proposes various actions related to the detailed information provided for a given termino-concept and recorded in its termino-conceptual form.

• Add a synonym to add a synonym to the selected termino-concept. A dialog window opens for capturing the new synonym. If the corresponding terminological unit has been found by YaTeA or ANNIE, its occurrences are automatically clustered with those of the current termino-concept.

• Remove a synonym to remove a synonym. You have to confirm whether you also want to remove the related occurrences.

• Add a link to add a type of link and its value.

• Remove a link to remove a type of link and its value.

8.3.4 Neon ontology submenu

This menu is used to link TERMINAE and Neon ToolKit. It supports the creation of the conceptual level and many actions to connect it to the termino-conceptual one.

• Create a Neon project is used to create a Neon toolkit project. If you want to work at the conceptual level, you have to create a Neon project and to specify its name. It is recommended to use different names for the TERMINAE and Neon projects.

• Create Neon 
20. athered with linguistic information in terminological forms.

• The termino-conceptual level (chapter 8) is specific to TERMINAE. Whereas terms are at the vocabulary level, the goal is now to analyse the use of terms in the corpus at the semantic level. The work is to recognize the various senses of a term and distribute them into several termino-concepts, also distributing the occurrences of the term between senses. At the same time, the termino-concepts of the form can be tagged as having a synonym in another form, or as being otherwise (more loosely) related.

• The ontological level (see chapter 9) now relies on termino-concepts and their relations to build the ontology. First, synonym termino-concepts should only yield one concept. All the related termino-concepts help in building the hierarchical relations and defining the roles (as can some other linguistic information gathered during the process, a part of which is under exploration in the framework of the ONTORULE project, in particular the analysis of verbs and SBVR fact forms).

Chapter 3

Technical Characteristics

• The current version of the TERMINAE platform is compiled using the Sun Java 1.6 virtual machine.

• It relies on UTF-8 text encoding.

• It can be used for English and French.

3.1 Installation

To install TERMINAE, you need Java version 1.6. Download the version of the platform for your system from the http://www-lipn.univ-paris13 
21. [Figure 8.2: Neon toolkit conceptual level (OWL) perspective. Screenshot showing the Ontology Navigator on the left and the Entity Properties view on the right; the selected class carries a Concept annotation with value Airbag, of type TerminoConcept.]

Chapter 9

Neon toolkit Conceptual level (OWL) perspective

The conceptual perspective is a Neon toolkit plugin (version 2.4) to which a specific menu has been added for the TERMINAE platform, to link the conceptual and termino-conceptual levels.

When using the Neon toolkit conceptual level perspective, you need to create or to import a Neon toolkit project (which is different from the Terminae project) and to create or import an ontology in this project.

This can be done either from the Neon ontology submenu of the Terminae TerminoConceptual perspective (Create a Neon project and Create Neon Toolkit ontology items), or by creating the project and the ontology from the menu of the navigator view in Neon
22. ctive has many views and a main view, which is on the left side of the perspective. A click on an item in the main view changes the values in the other views. These views may be closed by the user, or he/she may want to see a view of another perspective which is not in the perspective in use (only one perspective may be selected).

CHAPTER 5. PROJECT MANAGEMENT PERSPECTIVE

This menu is used to reopen a view that has previously been closed. Click on the single item, Other..., to display the list of available views, then choose Other again to find the TERMINAE views. Select the view you want to reopen or to see, and be aware that the view may depend on one perspective or another.

Chapter 6

Terminae Terminological level (step 1) perspective

The Terminae Terminological level lets you browse and modify the list of domain-specific lexical units that have been extracted from the source corpus using term extraction and named entity recognition tools such as YaTeA and ANNIE.

TERMINAE assumes that the acquisition corpus has been processed beforehand by TreeTagger and YaTeA, and possibly ANNIE. TERMINAE takes as input:

• A tagged corpus (required).

• A list of terms extracted from it (required, see Section 6.1.1).

• Lists of named entities and named entity types (optional, see Section 6.1.2).

6.1 Data: Terminological files

6.1.1 Term files

When you open the Terminological level perspective, you have to specify the terminological data you 
23. default presented on the left part of the perspective. It gives the list of all the canonical terminological units for which a terminological form has been created (the form can be In progress or Completed).

• The other views form the terminological form of the unit that has been selected in the Terminological form list (see Section 7.2).

Note that, when the list of terminological forms is selected, you can find any terminological form by typing the first letter of its canonical terminological unit.

7.2 Data: Terminological forms

An example of a terminological form is displayed on the right part of Figure 7.1. A terminological form gathers all the lexical and terminological information that has been collected or manually added for a given term or named entity. It is usually composed of the following views:

[Screenshot: Terminae Terminological level (step 2) perspective of the TestDemo project, with the Terminological form list and Lexical information views]
24. e numbers.

• Remove adverbs removes the terms that are tagged as adverbs.

6.3.4 Terminological form actions

This menu is used to define the terminological forms described in the next chapter.

• New terminological form creates a terminological form for the selected term. Once the terminological form is created, it can be viewed in the Terminae Terminological level (step 2) perspective, which opens automatically, and the lexical unit for which the form has been created is displayed in blue characters in the Lexical units view of the Terminae Terminological level (step 1) perspective.

• To terminological form displays the terminological form of the selected terminological unit, if it has one. This action automatically switches from the Terminae Terminological level (step 1) perspective to the Terminae Terminological level (step 2) perspective.

Chapter 7

Terminae Terminological level (step 2) perspective

This perspective can be opened either by creating a terminological form or from the main Perspective menu (Terminological level (step 2)).

7.1 Perspective overview

The Terminae Terminological level (step 2) perspective is composed of two main parts, with a global view on the left and a set of more detailed and dependent views on the right (see Figure 7.1).

• The Terminological form list view is by
25. erminoConceptual level perspective (see Section 8).

- Neon toolkit Conceptual level (OWL) perspective (see Section 9).

The first 4 perspectives make up Terminae. The OWL perspective belongs to Neon ToolKit 2.4. Please note that the last Eclipse perspective, Team Synchronizing, is used by Neon ToolKit.

• The item Search is proposed in the Neon toolkit Conceptual level (OWL) perspective (it is not described in this report).

• The item Help is proposed in all Eclipse applications (it is not described in this report).

• An additional Terminae submenu is proposed on MacOS systems. It gives access to the standard main operations of the application: information (About Terminae), Preferences, Hide Terminae and Quit Terminae.

Chapter 5

Project management perspective

TERMINAE starts with the project management perspective. This perspective has 2 views (Fig. 5.1).

[Figure 5.1: Project management perspective]

• The left view presents the project information if a project has already been defined (project, corpus, thesaurus and author(s) names).

• The right view is a text editor where the user may write comments. To save the comments, press Ctrl+S.

5.1 Terminae project actions menu

A project consists of all the data used or created by TERMINAE when building a specific termino-ontological resource from a given corpus (see the description of the project structure).
26. erspective.

• Create a termino concept is used to create a termino-concept and link it to the selected class. This functionality is useful when you want to add thesaurus information to an existing ontology. You start from an existing class and create a termino-concept in the thesaurus of the TERMINAE project.

• To link a class to a TC is used to link a class to an existing termino-concept in the thesaurus of the TERMINAE project.

Chapter 10

Annex

This annex lists the DTDs used by Terminae.

10.1 XML backup DTD for terms

The DTD of the XML file that contains the terms and their occurrences, as displayed in the Terminae Terminological level (step 1) perspective:

<!ELEMENT NUMBER_OCCURRENCES (#PCDATA)>
<!ELEMENT OCCURRENCE (ID, DOC, SENTENCE, START_POSITION, END_POSITION, Texte)>
<!ELEMENT DOC (#PCDATA)>
<!ELEMENT END_POSITION (#PCDATA)>
<!ELEMENT FORM (#PCDATA)>
<!ELEMENT ID (#PCDATA)>
<!ELEMENT LEMMA (#PCDATA)>
<!ELEMENT LIST_OCCURRENCES (OCCURRENCE)>
<!ELEMENT LIST_TERM_CANDIDATES (TERM_CANDIDATE)>
<!ELEMENT List_Variants (Variant)>
<!ELEMENT MORPHOSYNTACTIC_FEATURES (SYNTACTIC_CATEGORY)>
<!ELEMENT SENTENCE (#PCDATA)>
<!ELEMENT START_POSITION (#PCDAT
27. eted.

Each terminological form is saved in an XML file in the terminoFormDir directory. The list of terminological forms is saved in the file tableTermeFiches.xml in the terminoFormDir directory.

7.3 Terminological actions menu

The action menu associated with the Terminae Terminological level (step 2) perspective is the Terminological actions menu. It proposes 3 submenus, which are presented in the following subsections:

• Termino concept management submenu
• Form management submenu
• Feature management submenu

The corresponding actions are also accessible from the context menu opened with a right click of the mouse.

7.3.1 Termino concept management submenu

This submenu proposes three different actions:

• Create a termino concept creates a termino-concept linked to the selected terminological unit. The termino-concept is added to the current thesaurus. If the terminological unit is a named entity, the type of the named entity may also give rise to a termino-concept, and a kindOf link is created between the two termino-concepts.

• Remove a termino concept removes a termino-concept from the current thesaurus.

• To TerminoConceptual level switches from the Terminae Terminological level (step 2) perspective to the Terminae TerminoConceptual level perspective.

7.3.2 Form management submenu

This submenu proposes two actions related to termino
28. fr szulman TerminaeWorkbench web page and unzip the downloaded file.

The default language is English but it can be changed. If you want to work with a French platform, edit the terminae.ini file and change the line nl en to nl fr_FR. This file is located in the Terminae directory on Linux and Windows systems, and in the Terminae.app/Contents/MacOS directory on MacOS systems.

3.2 How to start

To launch the TERMINAE platform, click on the Terminae application (Terminae on Linux systems, Terminae.exe on Windows systems or Terminae.app on MacOS).

Initially, the project management perspective (Terminae Project perspective) is open and you have to import or create a project.

3.2.1 Project location and structure

In any case, you have to define your project directory. On Linux and Windows systems, it is advised to locate it in the workspace directory created by the Eclipse application.

A project has a fixed structure, made of the following 6 subdirectories:

corpora: Contains the corpus data (raw and tagged) and the results of named entity recognition tools. The current version of the platform is designed to work with TreeTagger and the ANNIE named entity recognition tool.

terminoFormDir: Contains the terminological forms that are created using TERMINAE and output by it.

linguae: Contains the search patterns that have been designed and their results (no pattern design tool
29. he project). An experienced user may easily understand its content, and may need to change it in tricky cases (e.g. for renaming directories or files).

These files are text files or modifiable XML files.

Chapter 4

Main menu

Figure 4.1 presents the main menu of the TERMINAE platform, which is accessible from any perspective. It presents 4 items, which are associated with specific actions or submenus. (This main menu differs slightly from one operating system to another.)

[Figure 4.1: Main menu, with the items Terminae project actions, Perspectives, Show View and help]

• The action submenu gives access to the specific functionalities available at the Terminae level where you are currently working. The name of this action menu depends on the perspective from which it is opened: Terminae project actions, Linguistic actions, Terminological actions, TerminoConceptual actions and Terminae links.

• The Perspectives item allows you to open new perspectives: you simply have to click on the name of the perspective you want to open in the perspective list that appears. 5 perspectives are accessible:

- Terminae Project perspective, which is the default perspective opened when a project is loaded. It is presented in Section 5.

- Terminae Terminological level (step 1) perspective (see Section 6).

- Terminae Terminological level (step 2) perspective (see Section 7).

- Terminae T
30. is available in the current version).

thesauri: Contains the termino-conceptual resources that are created using TERMINAE and output by it.

system: Contains some files automatically created by TERMINAE.

yatea: Contains the results of term extraction tools. The current version of the platform is designed to work with the YaTeA term extractor.

Footnotes:
TreeTagger: http://www.ims.uni-stuttgart.fr/projekte/corplex/TreeTagger
ANNIE: http://gate.ac.uk/ie/annie.html
YaTeA: http://search.cpan.org/%7Ethhamon/Lingua-YaTeA-0.5/

3.2.2 How to import a project

A project to be imported is represented as a zipped file containing the project directory with all the required subdirectories and files of a given project. You do not have to unzip the file.

• Go to the main menu.
• Click on Terminae project actions.
• Click on Import project.
• A first dialog window appears, in which you must indicate the zipped file to load.
• A second dialog window appears, proposing the directory into which the project will be imported. If you do not accept it, you will be offered the choice of another one.

When the project is imported, its main characteristics are presented in the Terminae project information view on the left (by default) of the project perspective, and you can start working on it.

3.2.3 How to create a project

To start working on a new project:

• Go to the main menu.
• Click on Terminae project actions.
• Click on Create Terminae project.
• A
31. logical forms:

• Remove a terminological form
• Validate a terminological form: this action is used to note that the work on this terminological form is completed. It acts as a comment aimed at the user.

7.3.3 Feature management submenu

This submenu proposes various actions related to the detailed information provided for a given terminological unit and recorded in its terminological form:

• Add a variant adds a lexical variant of the selected term.

• Remove a variant removes a lexical variant of the selected term.

• Add a lexical entry adds a lexical entry for the selected term. You have to type in the entry name and its value, separated by a colon.

• Remove a lexical entry removes a lexical entry.

• Add a syntactical relation head adds a phrase where the selected term is the head.

• Add a syntactical relation modifier adds a phrase with the selected term as a modifier.

• Remove a syntactical relation head removes the selected relation.

• Remove a syntactical relation modifier removes the selected relation.

• Add a terminological relation adds a terminological relation where the selected term is term1 or term2.

• Remove a terminological relation removes a terminological relation.

• Add an occurrence adds an occurrence to the selected term. You have to specify the docume
32. n      x     gt

RB    Adverb
RBR   Adverb, comparative
RBS   Adverb, superlative
RP    Particle
SYM   Symbol
TO    to
UH    Interjection
VB    Verb, base form
VBD   Verb, past tense
VBG   Verb, gerund or present participle
VBN   Verb, past participle
VBP   Verb, non-3rd person singular present
VBZ   Verb, 3rd person singular present
WDT   Wh-determiner
WP    Wh-pronoun
WP$   Possessive wh-pronoun
WRB   Wh-adverb

10.6 TreeTagger French Tagset

ABR       abbreviation
ADJ       adjective
ADV       adverb
DET:ART   article
DET:POS   possessive pronoun (ma, ta, ...)
INT       interjection
KON       conjunction
NAM       proper name
NOM       noun
NUM       numeral
PRO       pronoun
PRO:DEM   demonstrative pronoun
PRO:IND   indefinite pronoun
PRO:PER   personal pronoun
PRO:POS   possessive pronoun (mien, tien, ...)
PRO:REL   relative pronoun
PRP       preposition
PRP:det   preposition plus article (au, du, aux, des)
PUN       punctuation
PUN:cit   punctuation citation
SENT      sentence tag
SYM       symbol
VER:cond  verb conditional
VER:futu  verb future
VER:impe  verb imperative
VER:impf  verb imperfect
VER:infi  verb infinitive
VER:pper  verb past participle
VER:ppre  verb present participle
VER:pres  verb present
VER:simp  verb simple past
VER:subi  verb subjunctive imperfect
VER:subp  verb subjunctive present

10.7 Use ANNIE to extract named entities

This annex describes the procedure to be followed to use ANNIE to extac
33. [Figure 7.1: Terminae Terminological level (step 2) perspective. The visible views include Syntactical relations (Head, Modifier), Terminological relations and Term occurrences.]

• The Lexical information view is a form in which you can freely create, modify or delete fields. By default, four lexical fields are defined:

- Term extractor (Yatea), whose range is X if the terminological unit has been extracted by
34. nt identifier and to type in the text of the occurrence.

• Remove an occurrence removes an occurrence of the selected term. Select the relevant occurrence to indicate which one has to be removed.

Chapter 8

Terminae TerminoConceptual level perspective

This perspective must be opened from the Perspective submenu in the main menu, by selecting the Terminae TerminoConceptual level.

8.1 Perspective overview

The presentation of the Terminae TerminoConceptual level perspective is very similar to that of the Terminae Terminological level (step 2) perspective. It is composed of two main parts, with a global view on the left and a set of more detailed and dependent views on the right (see Figure 8.1).

• The TerminoConcept tree view is, by default, presented on the left part of the perspective. It shows the hierarchy of all the termino-concepts that have been created.

• The other views form the termino-conceptual form of the termino-concept that has been selected in the TerminoConcept tree (see Section 8.2).

Note that you can find a termino-concept simply by typing its first letter in the TerminoConcept tree view.

8.2 Data: Termino-conceptual forms

The termino-conceptual level is a bridge between the terminological level and the conceptual level (the ontology). It is made of a set of termino-concepts, which are themselves described by termino-conceptual forms gathering the relevant information that has been collected or defined fo
35. ntactic relations, the termino-conceptual level organizes the terminology according to semantic relations, and the third level, the ontological level, makes it possible to create a formal ontology out of the list of termino-concepts created at the second level.

This document describes the functionalities of the TERMINAE platform. The first chapter describes the technical characteristics and the installation instructions. The following chapters present the main menus of the platform that are accessible from its main window.

Contents

1 Introduction
2 The Terminae method
3 Technical Characteristics
  3.1 Installation
  3.2 How to start
    3.2.1 Project location and structure
    3.2.2 How to import a project
    3.2.3 How to create a project
  3.3
4 Main menu
5 Project management perspective
  5.1 Terminae project actions menu
  5.2
  5.3 Show
6 Terminae Terminological level (step 1) perspective
  6.1 Data: Terminological files
    6.1.1 Term files
    6.1.2 Named entity files
  6.2 Perspective overview
  6.3 Linguistic actions menu
    6.3.1 File submenu
    6.3.2 Term Management submenu
    6.3.3 Cleaning submenu
    6.3.4 Terminological form actions
36. objectProperty is created with an annotation property in which the name and type of the source termino-conceptual relation are saved.

• Link a RT and an ObjectProperty is used to link a termino-conceptual relation to an existing objectProperty.

• Link a RT and a class is used to link a termino-conceptual relation to an existing class.

• Create classes and TCs is used to derive a set of classes from a set of selected termino-concepts. If these termino-concepts have termino-conceptual relations, objectProperties are created and linked to these source relations.

• Create classes and TCs without dialog offers the same functionality as above, but without any dialog. The default values are systematically kept:

- name of class = name of termino-concept
- name of objectProperty = name of the RTC
- if termino-concepts are linked by an isKindOf link, the corresponding classes are in the same hierarchical order.

• Link to an individual is used to link a termino-concept to an individual. In the dialog windows, you have to enter the individual name and select the class to which it belongs.

• Create an individual is used to create an individual. In the dialog windows, you have to enter the individual name and select the class to which it belongs.

[Screenshot: Neon toolkit conceptual level (OWL) perspective of the TestDemo project, with the menu bar items Terminae links, Perspectives, Show View, Search and help]
37. on ontology submenu

The corresponding actions are also accessible from the context menu opened with a right click of the mouse.

8.3.1 File submenu

This menu allows you to load and save termino-conceptual data. It proposes the following actions:

• Load XML format loads a thesaurus in XML format (see the DTD in Annex 10.4).

• Save XML format saves a thesaurus in XML format.

• Import SKOS loads an existing thesaurus in SKOS format.

• Export SKOS exports a thesaurus in SKOS format. A dialog window opens, in which you have to define a URI (added to the name of the SKOS concepts to guarantee that they are uniquely identified), for instance http://www.lipn.univ-paris13.fr/terminae. Note that, in the current version of the TERMINAE platform, the termino-conceptual relations are not described in the exported file.

• Export SKOS RDF/XML format exports a thesaurus in RDF/XML format. A dialog window opens, in which you have to define a URI as for the SKOS format.

8.3.2 Termino concept management submenu

• Create termino concept creates a new termino-concept. You have to type in the name of the termino-concept if it is not created directly from a terminological unit.

• Remove termino concept removes the selected termino-concept. You have to confirm the removal.

• Rename termino concept changes the name of the selected termino-concept.

• Add kindOf link
38. peEn>
<typeEn>Person</typeEn>
<typeEn>Percent</typeEn>
<typeEn>Location</typeEn>
<typeEn>Money</typeEn>
<typeEn>Title</typeEn>
<typeEn>Address</typeEn>
<typeEn>Unknown</typeEn>
<typeEn>Jobtitle</typeEn>
<typeEn>FirstPerson</typeEn>
<typeEn>Location</typeEn>
<typeEn>UrlPre</typeEn>
</ensTypeEn>
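As a rough illustration of how such a named entity type file could be read outside TERMINAE, the following Python sketch collects the type names it lists. The element names typeEn and ensTypeEn are read off the listing above; treating ensTypeEn as the root element, the function name read_entity_types and the SAMPLE string are assumptions made for this example only, not part of the platform.

```python
import xml.etree.ElementTree as ET
from typing import List


def read_entity_types(xml_text: str) -> List[str]:
    """Return the distinct named entity types of a type file, in file order.

    Assumes (hypothetically) a root element that contains typeEn children,
    as suggested by the listing in this annex.
    """
    root = ET.fromstring(xml_text)
    seen: List[str] = []
    for node in root.iter("typeEn"):
        name = (node.text or "").strip()
        # The sample file lists some types twice, so duplicates are skipped.
        if name and name not in seen:
            seen.append(name)
    return seen


# Hypothetical miniature type file, shaped like the listing above.
SAMPLE = """<ensTypeEn>
  <typeEn>Person</typeEn>
  <typeEn>Location</typeEn>
  <typeEn>Location</typeEn>
</ensTypeEn>"""

if __name__ == "__main__":
    print(read_entity_types(SAMPLE))
```

Skipping duplicates is a design choice of the sketch: the listing repeats Location, and a selection of types of interest only needs each name once.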
39. r those termino-concepts.

[Screenshot: Terminae TerminoConceptual level perspective of the TestDemo project, with the TerminoConcept tree on the left and the TerminoConcept features and Occurrences views on the right]
40. [Figure 6.2: Visualisation of terms and named entities]

6.3.2 Term Management submenu

This menu allows you to manage terminological data, i.e. to display the list of terminological units and edit it by clustering, removing or adding some of them. The Term Management menu proposes 9 different actions:

• Visualize all terms redisplays the list of terminological units after a search sequence.

• Find a term searches for a specific unit on the basis of its first characters. Note that this functionality is also directly accessible, when the list of terms is selected, by typing the first
41. t named entities from a given document (only one document can be processed at a time).

Note that the following procedure is extracted from the GATE documentation for processing English corpora: http://gate.ac.uk/sale/tao/splitch3.html

GATE enables you to extract named entities from plain texts and to annotate your corpus with them. GATE is distributed with an IE system called ANNIE. ANNIE relies on finite state algorithms and the JAPE language.

Take one large pile of text (documents, emails, etc.). Call this your corpus.

If you right click on "Language Resources" in the resources pane and select "New" then "GATE Document", the window "Parameters for the new GATE Document" will appear.

Once you have indicated the corpus to work on, you can call ANNIE.

From the File menu, select "Load ANNIE System". To run it in its default state, choose "with Defaults". This will automatically load all the ANNIE resources, and create a corpus pipeline called ANNIE with the correct resources selected in the right order, and the default input and output annotation sets.

If "without Defaults" is selected, the same processing resources will be loaded, but a popup window will appear for each resource, which enables the user to specify a name, location and other parameters for the resource. This is exactly the same procedure as for loading a processing resource individually, the difference being that the system automatically selects those resources contained within ANNI
42. the terminological results and can be reloaded upon request when the YaTeA results are loaded.

The occurrences of the selected terminological unit in the working corpus appear in the right view.

6.3 Linguistic actions menu

The action menu associated with the Terminae Terminological level (step 1) perspective is the Linguistic actions menu. It proposes 3 submenus and 2 actions, which are also accessible from the context menu opened with a right click of the mouse:

• File submenu
• Term management submenu
• Cleaning submenu
• New terminological form action
• To terminological form action

These submenus and actions are presented in the following subsections.

6.3.1 File submenu

This menu allows you to load and save terminological data. It proposes the following actions:

• Load Yatea results loads the terms initially extracted from your corpus by YaTeA or saved in an XML backup. The procedure is the same as that described in Section 6.1.1.

• Save Yatea results makes an XML backup (see the Annex for details on the file format).

• Load named entities from ANNIE results loads the named entities identified by the ANNIE named entity recognition tool (see Section 6.1.2).

- A first file dialog window opens, in which you have to indicate which named entity types you are interested in by selecting a named entity type XML file that should be located in the corpus s
43. to give a parent to the selected termino-concept. A dialog window opens, in which you have to give the name of the parent termino-concept.

• Remove kindOf link removes a parent of the selected termino-concept.

• Add a RTC adds a termino-conceptual relation to the selected termino-concept.

- A first dialog window opens, in which you have to give the name of the relation.
- A second dialog window opens, in which you have to click on OK if the selected termino-concept is the domain, and on Cancel if not.
- A third dialog window opens, in which you have to give the name of the range or of the domain (depending on the previous answer). That termino-concept must already exist.
- A choice dialog window then opens, in which you have to select the SKOS type of the relation.

• Remove a RTC removes the selected termino-conceptual relation.

• Add occurrence adds an occurrence to the selected termino-concept.

• Remove occurrence removes an occurrence of the selected termino-concept. You have to select the identifier of the occurrence to be removed.

• Create a terminological form creates a terminological form from a termino-concept. This functionality is useful when you want to add terminological information and occurrences to an existing thesaurus. You start from an existing termino-concept and create a terminological form using a defined corpus.

• Create all terminologic
44. ubdirectory of your project.

- A second file dialog window opens, in which you have to select another XML file containing the list of named entities extracted by ANNIE. This file should also be located in the corpus subdirectory of your project.

• Save named entities makes an XML backup (see the Annex for details on the file format).

• Load named entities loads the named entities from an XML backup.

• Load all lexical units loads the terms and named entities from a single XML backup.

• Save all lexical units makes an XML backup of all entities (terms and named entities); see the Annex for details on the file format.

If everything works properly, when all types of terminological data are loaded, the window of Figure 6.2 appears.

[Screenshot: Terminae Terminological level (step 1) perspective of the TestDemo project, with the Lexical units and Occurrences views]
45. ut by the ANNIE named entity recognition tool (see Annex 10.7 for details on the file format) and which are expected to be located in the corpus subdirectory of your project:

• The first XML file indicates which named entity types you are interested in.

• The second XML file contains the list of named entities extracted by ANNIE.

To create such files, follow the procedure described in the Annex.

6.2 Perspective overview

If everything works properly when loading the terminological data, the window of Figure 6.1 appears when the Terminae Terminological level (step 1) perspective is first opened.

The window is composed of two views: the Lexical units view on the left and the Occurrences view on the right.

The terminological units, either terms or named entities, are listed in the left view. By clicking on the column headers, you can sort the list alphabetically

[Screenshot: Terminae Terminological level (step 1) perspective of the TestDemo project, with the Lexical units view listing terms such as base, basis, belt and belt assembly, and the Occurrences view]
46. want to start with (note that additional data can be loaded afterwards). You have to:

• Load a term list (Load Yatea file), which is supposed to be located in the yatea subdirectory of your project.

• Indicate how many documents your corpus encompasses. Note that documents are numbered starting from 1 if there are several of them, but that a single document has number 0.

• Select the tagged corpus from which the terms have been extracted (tt file). It is supposed to be located in the corpus subdirectory of your project.

• Specify the corpus language: English (en) or French (fr).

Footnotes:
YaTeA: http://search.cpan.org/%7Ethhamon/Lingua-YaTeA-0.5/
ANNIE: http://gate.ac.uk/ie/annie.html

When the terminological data is loaded, TERMINAE creates two additional files in the yatea directory:

• fTempCorpus2XML.xml, which is an XML version of the corpus.
• fTempTT2XML.xml, which is an XML version of the tagged corpus.

If you have several documents, each one must be processed by TreeTagger and the results must be concatenated in a single file where the various initial documents are separated by a document tag of the following form:

Text n TAB Document TAB n

where TAB is the tabulation character and n varies between 0 and x-1, x being the total number of documents.

6.1.2 Named entity files

You may also want to work with named entities. In that case, you need two files that are outp
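The multi-document convention of Section 6.1.1 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the tag line is built literally as "Text n", a tabulation, "Document", a tabulation and n, and one tag line is written before each document. The exact spelling of the first column should be checked against a file that TERMINAE actually accepts.

```python
from typing import Iterable, List


def document_tag(n: int) -> str:
    """Build the tag line for document number n.

    Assumption: the line reads "Text n<TAB>Document<TAB>n", following the
    pattern given in Section 6.1.1 as literally as possible.
    """
    return f"Text {n}\tDocument\t{n}"


def concatenate_tagged(documents: Iterable[str]) -> str:
    """Join TreeTagger outputs into one text, one tag line per document.

    Documents are numbered from 0 to x-1, x being the number of documents.
    """
    lines: List[str] = []
    for n, tagged in enumerate(documents):
        lines.append(document_tag(n))
        # Keep the token/tag/lemma lines as they are, dropping blank lines.
        lines.extend(line for line in tagged.splitlines() if line.strip())
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    first = "The\tDT\tthe\nbelt\tNN\tbelt\n"
    second = "A\tDT\ta\n"
    print(concatenate_tagged([first, second]), end="")
```

The function returns a string rather than writing a file, so that the result can be inspected before it is saved in the corpus subdirectory of the project.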
    