        User Manual - LanA Consulting
         Contents
1. SpellCheck

Return type: Short integer (1 or 0)

Parameters:
  Name   Type     Description
  Word   String   Word to check against a lexicon (you can use this method for your own spell checking)

Description:
Tries to match a text word against the current lexicon. On success it returns 1, otherwise 0.

Restrictions:
Only single words can be matched.

Start

Return type: Nothing

Parameters: None

Description:
This method opens the current lexicon, initializes it and prepares it for a working session. You must call this method once after each successful call to SetDict. After Start you can call other methods at any time. It is not necessary to call Start before the TagIt or TagAndDisambiguate methods. Using other methods (except GetDict and SetDict) before Start can lead to unpredictable results.

Restrictions:
None

TagIt

Return type: String

Parameters:
  Name   Type     Description
  Text   String   Lexicon look up

Description:
This method matches text words against the current lexicon. For every matched lexical item it assigns all tags found in the item entry. No disambiguation is done at this stage of analysis. You can develop your own tag disambiguator or use one of our FLAT

Restrictions:
The text cannot exceed 16 Kb. Symbols         are not allowed in the input text. After this method has been used you can call GetUntagged t
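Because SpellCheck accepts only single words, checking a whole text means splitting it first and calling the method once per word. The following is a minimal Python sketch of that loop, not FLAT code: `spell_check` is a hypothetical stand-in for any callable that behaves like the documented method (1 for a known word, 0 otherwise), and the word-splitting pattern and the lower-casing in the stand-in are assumptions.

```python
import re

def misspelled_words(text, spell_check):
    """Return the words for which spell_check(word) returned 0, in order of first appearance."""
    seen, misses = set(), []
    # Assumption: a "word" is a run of letters, optionally joined by ' or -
    for word in re.findall(r"[^\W\d_]+(?:['-][^\W\d_]+)*", text):
        if word not in seen:
            seen.add(word)
            if spell_check(word) == 0:
                misses.append(word)
    return misses

# Hypothetical stand-in for the control's SpellCheck method, backed by a tiny word set.
known = {"it", "is", "safe", "to", "fly", "this", "plane"}

def fake_spell_check(word):
    return 1 if word.lower() in known else 0

print(misspelled_words("It is saef to fly this plane", fake_spell_check))
```

In a real application the stand-in would be replaced by a call to the control's own SpellCheck method, made after a successful SetDict and Start.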
2. V   LT     Adj   N

There can be up to 15 values listed for one TAG in a rule.

You can also write a more flexible tag context condition in the form

  TAG <= single_value

which means that TAG can have any set of multiple values that necessarily includes said single_value. For example,

  CT <= N

means that the current tag can have any multiple value including N, e.g. N V, Adj N or Adj N V, but CT cannot have a single value like just N, or any multiple value not including N, e.g. V   N or Adj V.

Word context conditions disambiguate tags depending upon the word context. The word context covers 5 words, with the word that has the current tag to disambiguate in the middle.

WORD:
  CW    the current word, which has the current tag
  LW    the word left of the current word
  LLW   the word two steps left of the current word
  RW    the word right of the current word
  RRW   the word two steps right of the current word

You can write the word context condition as follows:

  WORD        word1     word2

for example:

  CW        comprises
  LW       measuring        checking        improving

There can be up to 15 words listed like this.

You can write the same conditions in a more compact form:

  WORD   Variable_name

where Variable_name is, for example, the name of a list of words, which you declare in the declaration part of the formal description of disambiguation rules (see how to write the declaration part fu
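The two kinds of context condition described above can be modelled in a few lines. This is an illustrative Python sketch, not the FLAT rule interpreter: the function names are hypothetical, a tag is represented as a set of values, and returning None for positions beyond the edge of the text is an assumption, since the excerpt does not say how the five-word window behaves there.

```python
def includes_value(tag_values, single_value):
    """Model of the inclusion condition: true only for a multiple value containing single_value."""
    values = set(tag_values)
    return len(values) > 1 and single_value in values

def word_window(words, i):
    """Return the five-word context around position i (None where the text ends)."""
    def at(j):
        return words[j] if 0 <= j < len(words) else None
    return {"LLW": at(i - 2), "LW": at(i - 1), "CW": words[i],
            "RW": at(i + 1), "RRW": at(i + 2)}

print(includes_value({"Adj", "N"}, "N"))   # multiple value that includes N
print(includes_value({"N"}, "N"))          # single value N is rejected
print(word_window(["It", "is", "safe", "to", "fly"], 2))
```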
3. Type a new tag in the active text area and click "OK" to close this window. Note that, though created, the tag will not appear in the interface until at least one lexeme is input into your lexicon.
  3. Proceed to create other tags.
  4. Click "OK" to close the "Configure tag set" dialogue box when finished with tags.

To edit a code tag:
  1. Select a tag to edit in the "Configure tag set" pop-up window.
  2. Click the button "Edit" in this window. You will get the "Edit tag" box with the tag in question displayed in the active text area.
  3. Edit the tag in the active text area and click "OK" to close this window.
  4. Proceed to edit other tags.
  5. Select a tag to edit in the "Configure tag set" pop-up window.

To delete a code tag:
  1. Select a tag to delete in the "Configure tag set" pop-up window.
  2. Click the "Delete" button in this window.
  3. Follow the dialogue.
  4. Proceed to delete other tags.
  5. Select a tag to edit in the "Configure tag set" pop-up window.

Assign/Edit word coding of the lexemes in the lexicon list

You can assign or edit the tag coding of a single lexical item or of a group of lexemes that are either in a lexicon list or imported from a text file or the tagger.

To assign/edit a tag of a single lexeme in the lexicon list:
  1. Highlight the lexeme.
  2. Check/uncheck tag boxes in the right pane of the interface.
  3. Click the "OK" button.

To assign/edit tag of a group lexemes in 
4. Wh word

Tag disambiguation rules are based on these supertags. You can see them in the interpreter interface.

Integration of FLAT into a different application

With FLAT Control, based on ActiveX technology, you can create your own Windows applications that integrate all FLAT features.

The FLAT Control main features are:
  • Tagging of a given text based on a lexicon created in FLAT Lexicon Creator
  • Tagging and disambiguating of a given text based on a lexicon and rules (created with FLAT Lexicon Creator and Tagger)
  • Spell checking
  • Dynamic input of new lexical items to a FLAT lexicon from a user (your) lexicon of a different application

FLAT Control is compatible with all development environments that comply with ActiveX technology (can use ActiveX controls).

FLAT Control is an invisible ActiveX Control, based on the MFC library. It has several methods:
  • AddWord
  • GetDict
  • GetTagList
  • GetUntagged
  • SetDict
  • SpellCheck
  • Start
  • TagAndDisambiguate
  • TagIt
  • CheckDisRules (reserved for internal use in FLAT components)
  • GetLastDisError (reserved for internal use in FLAT components)
  • GetLastDisErrorPosition (reserved for internal use in FLAT components)
  • LoadDisRules (reserved for internal use in FLAT components)
  • SaveDisRules (reserved for internal use in FLAT components)

FLAT Control is delivered with two sample projects (for Visual Basic 6.0 and Borland C++ Builder 4.0) for
5. The rules are purely illustrative, and only cover the following text:

  Flying planes is dangerous due to this. It is safe to fly this plane.

Configure the tagger to the test lexicon lex_test. Download the text_test file to the tagger. Tag the text to get:

  Flying  Partg  planes  Np  is  Vc  dangerous  Adj  due to  Prep  this  Det Pron
  It  Pron  is  Vc  safe  Adj N  to  Prep  fly  N Vp  this  Det Pron  plane  N

Disambiguate and compare the text before and after disambiguation:

  Flying  G  planes  Np  is  Vc  dangerous  Adj  due to  Prep  this  Pron
  It  Pron  is  Vc  safe  Adj  to  Prep  fly  Inf  this  Det  plane  N

Open the rule interpreter and see what rules were used for disambiguation. Pay attention to the format. At the top of the interactive interface pane two new tags (G and Inf) and one list (SETVAR list) are declared. We hope these examples will help you to write your own "real world" rules.

FLAT Patent domain knowledge

Lexicon and tag disambiguation rules

The FLAT lexicon included in this product has been semi-automatically acquired from a 5 million word on-line corpus of complete patent disclosures. It is a list of supertagged lexemes. In our model a supertag codes morphological information (such as POS and inflection type) and semantic information: an ontological concept defining a word's membership in a certain semantic class (such as object, process, substance, etc.). For example, the feature structure of a noun sup
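One way to compare the two traces above is to line them up word by word and list only the tags that changed. The Python sketch below does this with the example sentences transcribed by hand; the (word, tags) list representation and the function name are assumptions made for illustration, not the format FLAT writes to its trace files.

```python
# The example text before disambiguation: (word, list of tags) pairs.
before = [
    ("Flying", ["Partg"]), ("planes", ["Np"]), ("is", ["Vc"]), ("dangerous", ["Adj"]),
    ("due to", ["Prep"]), ("this", ["Det", "Pron"]),
    ("It", ["Pron"]), ("is", ["Vc"]), ("safe", ["Adj", "N"]), ("to", ["Prep"]),
    ("fly", ["N", "Vp"]), ("this", ["Det", "Pron"]), ("plane", ["N"]),
]
# The same text after disambiguation.
after = [
    ("Flying", ["G"]), ("planes", ["Np"]), ("is", ["Vc"]), ("dangerous", ["Adj"]),
    ("due to", ["Prep"]), ("this", ["Pron"]),
    ("It", ["Pron"]), ("is", ["Vc"]), ("safe", ["Adj"]), ("to", ["Prep"]),
    ("fly", ["Inf"]), ("this", ["Det"]), ("plane", ["N"]),
]

def changed_tags(before, after):
    """Return (word, old_tags, new_tags) for every position whose tags differ."""
    return [(word, old, new)
            for (word, old), (_, new) in zip(before, after)
            if old != new]

for word, old, new in changed_tags(before, after):
    print(f"{word}: {' '.join(old)} -> {' '.join(new)}")
```

The listing shows both effects the rules can have: a value set being reduced (safe, this) and a tag being replaced by a declared one (Flying, fly).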
6. To get your registration key please mail your serial number and your user code to lanaconsult@mail.dk. We will reply to your message.

Uninstallation

If you want to uninstall FLAT, proceed as follows:
  1. Open the dialog box Add/Remove Programs Properties by selecting Settings -> Control Panel -> Add/Remove Programs from the Windows Start menu.
  2. Select FLAT for removal.

Registration

Run FLAT. After you start the program you will get a pop-up window showing your user code and the serial number of FLAT. You will be asked to enter your name and registration key.

To get your registration key please e-mail your serial number and your user code to lanaconsult@mail.dk. We will send you your registration key in reply to your message.

Starting and resuming FLAT sessions

To start a FLAT session for the first time, double-click on the Lexicon Creator icon. You should have a FLAT lexicon created before using the FLAT Tagger.

To resume a FLAT session, double-click either on the Lexicon Creator icon or the Tagger icon, depending upon your task.

Quitting FLAT and saving files

Before quitting FLAT Lexicon Creator do not forget to save your lexicon as a lexicon file, so that you can resume the FLAT session and work with this lexicon later. To save your lexicon as a lexicon file select "File -> Save" or "File -> Save as...".

By default the lexicon you created will be saved in the "Lexicons" subdirectory of the FLAT 
7. or "Save tagged as" from the "File" menu. The names of the tagged files will automatically be marked with the suffix "Tag".

Improve the coverage

One of the essential features of the FLAT tagger is that it can be used directly for lexical acquisition from relevant corpora, thanks to its functions to show, save and/or import unknown words to FLAT Lexicon Creator (see Figure 4). The coverage of the lexicon is thus updated incrementally.

If the text you tagged is not covered by your lexicon, unknown words will appear in the "Unknown words" pop-up window (see Figure 4).

You can:

  • Close this window by clicking on the "OK" button.
  • Save all the words from this window in a file to work with them further:
      1. Right-click on any spot in the "Unknown words" window. You will get a pop-up menu of possible actions.
      2. Select "Save all words as" to save them in a text file that you can import to a FLAT Lexicon Creator lexicon later.
  • Paste the new words to a FLAT lexicon:
      1. Select the words you want to paste to your lexicon in the "Unknown words" pop-up window in the tagger interface.
      2. Right-click on any spot in the "Unknown words" window. You will get a pop-up menu of possible actions.
      3. Select "Copy selected" to paste all selected words to the Lexicon Creator, or select "Copy all" to paste all of the words to the Lexicon Creator without selecting any of the words.
      4. Open (or return to 
8. Lexicon Creator.
      Select "Paste from tagger" in the "Edit" menu. You will get the "Paste words from tagger" pop-up window. The words to be pasted will be shown in the "Words from the buffer" window. The set of tags of your lexicon will be shown below.
      Check the tag boxes if you want to assign the checked tags to all the words from the buffer, or leave the boxes unchecked to code the words later.
      Click the button "Paste" at the bottom of the pop-up window.

You can always re-open the "Unknown words" window in the tagger by selecting "Show unknown words" in the "Configure" menu.

[Screenshot: the FLAT tagger window showing the "Plain Text" and "Tagged Text" panes and a pop-up menu with "Copy selected" and "Copy all".]
9. Figure 2. A screen shot of the Lexicon Creator interface at the beginning of a lexicon acquisition session. The new tag "N" is being added to the lexicon tag set (currently empty).

To create a new lexicon:
  1. Select "New" from the "File" menu; you will get an "empty" screen of the interface.
  2. Create a tag set. For correct program performance it is highly recommended to have at least one tag created before you input the first lexeme in your lexicon. Tags will always appear in the Latin alphabet.
     a. Click "Configure" in the top menu; the "Configure tag set" dialogue box will appear.
     b. Click the button "New" in this box; the "New Tag" box will appear (see Figure 2).
     c. Type a new tag in the active text area and click "OK" to close this window. Note that, though created, the tag will not appear in the interface until at least one lexeme is input. Proceed to create other tags or start inputting lexemes (see the section "Add new lexical items").
     d. Click "OK" to close the "Configure tag set" dialogue box when finished with tags.
  3. Create lexical items (see the section "Add new lexical items").

You can reuse the set of tags of any lexicon you have created before the current one:
  1. Open any lexicon created earlier using the "Open" selection in the "File" menu. A password box will appear.
  2. Fill in the password and click "OK". In case you have no password for the lexicon you want to open le
10. the user (see section "Look-up tagging"). You can always add other tags to this lexical item by pasting them to the Lexicon Creator using the functionalities of the Lexicon Creator (see section "Add tags").

GetDict

Return type: String

Parameters: None

Description:
Returns the current lexicon path and filename.

Restrictions:
None

GetTagList

Return type: String

Parameters: None

Description:
Returns all tags in the current lexicon as a string separated by periods (.), for example "Adj.N.P.Prep".

Restrictions:
None

GetUntagged

Return type: String

Parameters: None

Description:
Returns a list of unknown words (words that were not found in the current lexicon and were tagged as "UNK") as a string separated by commas (,), for example "device, word, developed".

Restrictions:
None

SetDict

Return type: Short integer (1 or 0)

Parameters:
  Name       Type     Description
  Filename   String   Full filename (path and filename) of a lexicon
  Password   String   Valid password for this lexicon

Description:
This method sets the current lexicon. You must provide a full filename and a valid password to use a lexicon. On success (filename and password are OK) the system returns 1, otherwise 0. You should check the return value before using other methods. Using other methods after a return value of 0 can lead to unpredictable results.

Restrictions:
None
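The return values described above are plain strings and integers, so an application has to split them and check them itself. Below is a hedged Python sketch of that handling. The helper names are hypothetical, and `control` stands for any object that exposes the documented SetDict and Start methods; how such an object is obtained from the ActiveX control is outside this sketch.

```python
def parse_tag_list(raw):
    """Split a GetTagList result such as "Adj.N.P.Prep" into individual tags."""
    return [tag for tag in raw.split(".") if tag]

def parse_untagged(raw):
    """Split a GetUntagged result such as "device, word, developed" into words."""
    return [word.strip() for word in raw.split(",") if word.strip()]

def open_lexicon(control, filename, password=""):
    """Call SetDict, check its return value, and only then call Start."""
    if control.SetDict(filename, password) != 1:
        # The manual warns that using other methods after a 0 is unpredictable.
        raise RuntimeError("SetDict failed for " + filename)
    control.Start()

print(parse_tag_list("Adj.N.P.Prep"))
print(parse_untagged("device, word, developed"))
```

Raising an exception on a 0 from SetDict is a design choice of this sketch; it simply keeps the rest of the program from reaching the methods that the manual says should not be used in that state.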
11. you to learn how to use FLAT Control. However, before you can use them FLAT must be installed.

The sample project for Visual Basic 6.0 is located in the directory Samples\VB on the installation CD. The sample project for Borland C++ Builder 4.0 is located in the directory Samples\BCB.

NB: In order to use FLAT Control in Borland C++ Builder, you must install the package provided for FLAT Control. The package is located in the directory Samples\BCB\Package.

To create an installation package for your application that uses FLAT Control, you must include the files MorphAnControl.ocx and DLLMorphAnNew2.dll in the installation package, taken from the System32 or System sub-folder of the Windows folder.

All methods not reserved for internal use are described below.

AddWord

Return type: Nothing

Parameters:
  Name   Type     Description
  Word   String   Word (or phrase) to add into a user lexicon
  Tag    String   One valid tag (that already exists in your lexicon), without   as the first character

Description:
This method is used to dynamically add words or phrases to the user lexicon of your own application. A new word (or phrase) will be recognized and tagged without restarting the FLAT system.

Restrictions:
Phrases should not consist of more than four words. As every new word/phrase is transferred from a particular text where it has just one meaning, it will be input into the FLAT lexicon with the tag corresponding to that meaning, chosen by
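The two checkable AddWord restrictions (at most four words, and a tag that already exists in the lexicon) can be verified before the call is made. This is a small Python sketch under stated assumptions: `can_add_word` is a hypothetical helper, words are taken to be whitespace-separated, and `tag_list` is assumed to be the already-split result of GetTagList. The restriction on the first character of the tag is not checked, because the character is not legible in this excerpt.

```python
def can_add_word(phrase, tag, tag_list):
    """Check the documented AddWord restrictions before calling the method."""
    words = phrase.split()
    has_valid_length = 1 <= len(words) <= 4   # phrases of up to four words
    tag_exists = tag in tag_list              # tag must already be in the lexicon
    return has_valid_length and tag_exists

tags = ["Adj", "N", "P", "Prep"]
print(can_add_word("in order to", "Prep", tags))
print(can_add_word("a very long phrase indeed", "N", tags))
```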
12. Flexible Language Analysis Tools

FLAT

User Manual

LanA Consulting
Madvigs Allé 9, 2
DK-1829 Copenhagen, Denmark
Tel: +45 33 25 04 41
Fax: +45 33 22 38 22
e-mail: lanaconsult@mail.dk
www.lanaconsult.com

Content

What is FLAT
Starting and resuming FLAT sessions
Quitting FLAT and saving files
Start Lexicon Creator
Create a new lexicon
Protect your lexicon
Update an existing lexicon
Remove lexical items
Add, edit or remove code tags
Assign/Edit word coding of the lexemes in the lexicon list
Assign lexeme coding when importing them from a file or the tagger
Search/Filter lexical items
Tagger Look up
How to
Start the tagger
Configure the tagger for a tagging session
Download, tag and save
Tag Disambiguation
How to work with FLAT to Disambiguate Tags
Interface for disambiguation rules
Disambiguation rule formalism
FLAT Patent domain knowledge
Lexicon and tag disambiguation rules
Integration of FLAT into a different application

What is FLAT

F
13. If you checked the "tag(s)" radio button, check the tag box(es) corresponding to your search parameters. Click the button "Filter" to get a list of lexemes meeting your search parameters in the left pane of the interface.

To search new (uncoded) words select "New words" from the "Edit" menu.

To restore the list of all lexemes in the lexicon select "Show all words" from the "Edit" menu.

Tagger Look up

Tagger Look up assigns the lexemes all code tags from a particular lexicon (created with FLAT Lexicon Creator). The tagger is pipelined to FLAT Lexicon Creator, so that you can, for example, tag the same text based on different lexicons, thus defining how large, "deep" or "shallow" a lexicon should be for your application.

Any changes you make in the lexicon are immediately traced in the tagger, thus allowing for operative testing of lexicon coverage. The tagger reports immediately whether the text is covered by a lexicon, listing unknown words, if any, in a pop-up window. The text to tag can be either typed in the active text area of the tagger control interface or downloaded from a text file. Both the input text and the results of any tagging session can be traced in the interface and saved. The names of the tagged files will automatically be marked with the suffix "Tag". This makes it very easy to compare different traces based on different lexeme tag sets.

One of the essential features of the tagger is that it 
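The "Tag" suffix rule can be mirrored when an application needs to predict the name of a tagged file. The Python sketch below assumes that the suffix is inserted before the file extension, as the "nonameTag.txt" label in the tagger screenshots suggests; the manual text itself does not state the exact position, and the helper name is hypothetical.

```python
from pathlib import PureWindowsPath

def tagged_name(filename):
    """Return the assumed name of the tagged file: "Tag" inserted before the extension."""
    path = PureWindowsPath(filename)
    return str(path.with_name(path.stem + "Tag" + path.suffix))

print(tagged_name("noname.txt"))
```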
14. LAT stands for Flexible Language Analysis Tool, and is a multipurpose interactive tool for developing NLP systems and/or training computational linguists.

FLAT includes the following programs:
  1. Lexicon Creator
  2. Tagger (look-up and disambiguation)
  3. Interactive Tag Disambiguation Interpreter
  4. FLAT Control, an invisible ActiveX Control, based on the MFC library, for integration with a foreign application

And sample patent domain knowledge:
  1. Patent domain Lexicon
  2. Rules for Tag Disambiguation

FLAT
  • can be used for any language based on the ANSI character set without reengineering
  • can easily be integrated in any NLP application, thus dramatically reducing the complexity and costs of producing multilingual applications
  • is equipped with control and interactive interfaces for updating linguistic knowledge and tracing processing steps, and does not require programming skills to create and experiment with different depths (and sizes) of lexical and grammatical knowledge

This version of FLAT is a 32-bit Windows application developed to run in a number of operating environments: Windows 95/98/Me or Windows NT 4.0/2000/XP.

Technical support

LanA Consulting, an IT company located in Copenhagen, Denmark, reserves all rights for the FLAT application. All registered users will receive up-to-date information on new versions of this program as well as full support for this system (consulting, training, versions upgrade, FAQ answers e
15. a
  3. Enter the password (see section "Protect your lexicon").
  4. Click the "OK" button.

Figure 3. A screenshot of the tagger interface, which is being configured to a FLAT text lexicon.

The tagger is set to work. The last configuration you worked with (the name of the lexicon and the password) will be remembered by the tagger, so that the next time you start the tagger it is used as the default configuration, making it unnecessary to configure the tagger every time.

Download, tag and save

You can either type a text to be tagged directly into the upper interactive window or download a text from an external file. Note that you can only download plain text. The text cannot exceed 16 Kb. Symbols         are not allowed in the input text.

  1. Select "Download" in the "File" menu.
  2. Follow the usual dialogue until you see your text in the upper window.
  3. Assign a name to the file by selecting "Save as" from the "File" menu. The name of the file will appear in the interface.
  4. To tag a text click the button "Tag" after the text appears in the upper window.
  5. To save traces of tagging select "Save tagged 
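An application that feeds text to the tagger can test the size restriction in advance. The Python sketch below rests on two assumptions that the manual does not spell out: "16 Kb" is read as 16 x 1024 bytes, and the text is measured in a single-byte ANSI encoding (Windows code page 1252 is used here as an example). Text that cannot be encoded at all is reported as not fitting.

```python
MAX_INPUT_BYTES = 16 * 1024  # assumption: "16 Kb" read as 16 x 1024 bytes

def fits_input_limit(text, encoding="cp1252"):
    """Return True if the encoded text stays within the assumed 16 Kb limit."""
    try:
        return len(text.encode(encoding)) <= MAX_INPUT_BYTES
    except UnicodeEncodeError:
        # Characters outside the chosen ANSI code page cannot be passed on as-is.
        return False

print(fits_input_limit("It is safe to fly this plane."))
print(fits_input_limit("a" * (16 * 1024 + 1)))
```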
16. al items (the same lexemes with the same tag set). If you add the same lexeme with different tag sets (which might happen, for example, when uploading lists of lexemes from different files), every lexeme will appear in the lexicon list just once, associated with a tag set which is a unification of the tags for the lexeme in the input lists.

There are several ways to add new items to the lexicon:

  1. Manual input:
     a. Click the button "New" above the word list in the right pane of the interface. You will get a pop-up window with a type-in text area.
     b. Type in your new lexeme (up to 4 words).
     c. Click "OK". The new lexeme will be added to the lexicon list.
        i. If you add a multiword lexeme its first word will appear as a separate item in the lexicon. Do not forget to tag it.
     d. Assign tags to this lexeme (see the section "Assign/Edit word coding for lexemes in a lexicon").
     e. Click "OK" at the bottom of the right pane (under the tag list).

  2. Import lists of lexemes from an external text file:
     a. Create lists of lexemes in text files.
        i. Important: Every line of the imported text file should contain one word only, positioned at the beginning of the line. You can easily do this with any external sorting program.
        ii. Hint: You may put different types of lexemes (for example, parts of speech) into different files to tag the lists imported from the files in one take.
     b. Select "Import" from the "File" menu. You will get the "Import tex
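The no-duplication behaviour described above amounts to keeping one entry per lexeme and taking the union of its tag sets. The following Python sketch models that with a dictionary of sets. It is an illustration of the described behaviour, not the Lexicon Creator's storage format, and the function name is hypothetical.

```python
def add_lexeme(lexicon, lexeme, tags=()):
    """Add a lexeme (up to 4 words); repeated additions merge their tag sets."""
    words = lexeme.split()
    if not 1 <= len(words) <= 4:
        raise ValueError("a lexeme may contain one to four words")
    key = " ".join(words)
    lexicon.setdefault(key, set()).update(tags)
    if len(words) > 1:
        # The first word of a multiword lexeme appears as a separate, still untagged item.
        lexicon.setdefault(words[0], set())
    return lexicon

lexicon = {}
add_lexeme(lexicon, "plane", {"N"})
add_lexeme(lexicon, "plane", {"Vp"})   # same lexeme from another list: tags are unified
add_lexeme(lexicon, "due to", {"Prep"})
print(sorted(lexicon["plane"]), sorted(lexicon))
```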
17. ave the type-in area empty and click "OK".
     a. If you type in a wrong password a new "empty lexicon" will open.
  3. Select "New" from the "File" menu; you will get a set-up interface screen with a message asking you whether you want to keep the tags from the old open lexicon.
  4. Click "Yes" to get a set-up interface screen with the old set of tags but an empty word list.
  5. Click "No" to get an "empty" set-up interface screen.

Protect your lexicon

You might want to make your lexicon inaccessible to other users, for example when integrating it into a different application. You can control the access to your lexicon with a password. By default any lexicon is considered to have an empty (no) password.

To create a password:
  1. Select "Change password" in the "Edit" menu; a dialogue box will appear.
  2. Leave the "Old password" text area empty and type in your password in the "New password" and "Confirm new password" text areas.

You can change your password by calling "Change password" as many times as you want.

Update an existing lexicon

With "Lexicon Creator" you can easily update your lexicon by adding/editing words and tags and/or "retagging" the lexemes. Read the sections below to learn how to do this. Any variant of your lexicon thus updated can be saved and re-opened for further work.

Add new lexical items

When adding lexemes Lexicon Creator will take care of not duplicating identic
18. can directly be used for lexical acquisition from relevant corpora due to its functionality to import lexemes recognized as unknown to a FLAT lexicon (see the section "Add new lexical items"). The coverage of the lexicon is thus updated incrementally.

Interface

The interface screen contains the main top menu, consisting of "File", "Configure" and "Help" selections, and a control screen divided into two windows (see Figure 3).

The upper window will show a text to be tagged. The lower window of the screen will show the traces of look-up tagging. At the set-up stage both windows are empty. The upper window is active. The user can either download a text from an external file or type it directly into the upper interactive window. The inscriptions above both windows remind you what file you are working with. Assigning a new name to the file using "Save as" from the "File" menu will change it in the interface.

How to

Start the tagger

To start the tagger double-click on the tagger icon.

Configure the tagger for a tagging session

Important: Before starting a tagging session you should configure the tagger to a particular FLAT lexicon.

To configure the tagger to a particular FLAT lexicon:
  1. Select "Tagging" in the "Configure" menu. You will get the "Tagging configuration" pop-up window.
  2. Click the "Browse" button to browse for the lexicon you want to work with, to get its name in the "Lexicon" text are
19. d list. To look through the list of lexemes:
  1. Use the scroll bar (you can only scroll 10,000 lexical items).
  2. Use the "Back" or "Forward" buttons, which make it possible to scroll every next/previous 10,000 lexical items in turn.

Search/Filter lexical items

Depending upon the selection in the "Edit" menu the user can get either a full word list of the lexicon or sub-lists of words sorted by their suffixes, prefixes, or tags. It is also possible to get a list of untagged words and tag them using the Lexicon Creator interface.

To search a single lexical item:
  1. Type a lexeme in the text area at the top of the left pane of the interface.
  2. Click the button "Find".
     i. In case the lexeme is in the lexicon it will appear in the active window at the top of the right pane of the interface. You can further edit it.
     ii. In case the lexeme is not in the lexicon, the lexeme that follows it in the alphabetical list will appear.

To search any lexical item by prefixes, suffixes or by code tags:
  1. Select "Filter" from the "Edit" menu. The "Select filter" pop-up window will appear.
  2. Check a radio button corresponding to your parameter of search.
  3. In case you have checked the "prefix" or "suffix" radio button, type in a corresponding string of characters in the active text area at the top of the window. Click the button "Filter" to get a list of lexemes meeting your search parameters in the left pane of the interface.
  4. 
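The "Find" behaviour (show the lexeme if present, otherwise the one that follows it alphabetically) and the 10,000-item paging can both be modelled on a sorted list. This Python sketch uses the standard `bisect` module. It assumes plain string ordering, which may differ from the ordering Lexicon Creator actually uses, and the function names are hypothetical.

```python
import bisect

PAGE_SIZE = 10_000  # the manual says 10,000 lexical items can be scrolled at a time

def find(sorted_lexemes, query):
    """Return (lexeme, found): the query itself, or the next lexeme in sorted order."""
    i = bisect.bisect_left(sorted_lexemes, query)
    if i < len(sorted_lexemes) and sorted_lexemes[i] == query:
        return query, True
    following = sorted_lexemes[i] if i < len(sorted_lexemes) else None
    return following, False

def page(sorted_lexemes, number):
    """Return page `number` (0-based), i.e. the block reached by pressing Forward `number` times."""
    start = number * PAGE_SIZE
    return sorted_lexemes[start:start + PAGE_SIZE]

words = ["abandoned", "abandonment", "abasement", "abbey", "abbot"]
print(find(words, "abbey"))
print(find(words, "abash"))
```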
20. directory.

Trouble shooting

  • If you get an unexpected system error message, which might be caused by misuse of the program, click "OK" and continue working. If this does not help, exit FLAT and open it again.

Lexicon Creator

Functionalities

FLAT Lexicon Creator is a program for lexicon acquisition that allows for descriptions of lexical items at different depths in any language based on the ANSI character set. An entry of the lexicon contains two fields: the name of the lexeme and a tag or supertag that can code more knowledge than a regular part of speech. The basic principle of this tool is that the user can easily update both the list of words and the tags (supertags), making the tag set larger or smaller in number and as "shallow" or "deep" as required. See a sample set of supertags in the section "FLAT Patent domain knowledge".

Lexicon Creator is pipelined to the FLAT tagger (see the section "Look-up Tagger"), so as to automatically import words that were not recognized by the system after tagging a certain amount of text. The coverage of the lexicon thus improves incrementally.

With Lexicon Creator you can do the following:

  • Create lexicons of any size from scratch, customizing them as necessary, and save them in files
  • Update an existing lexicon
      o Add new lexical items, without duplicating the same lexemes with the same codes, by
          - Importing them from an external text file
          - Pasting words recognized as unknown from the tagger
      O
21.
  • If the current word has only one single-value tag, no deletion will take place, to prevent the occurrence of untagged words.

The rules can be further augmented by using the declaration part of the rule formalism. You can declare new tags and lists of words.

By declaring tags you can enrich your tag set at the disambiguation step without changing your lexicon. For example, you can detect gerunds if your lexicon only codes such words as "measuring" as -ing forms of the verbs. Tags are declared one by one as follows (mind the "  " sign):

  SETTAG  G
  SETTAG  Inf

You can declare as many tags as you want. The maximum length of a tag is 15 symbols (in case of multiple tags). Tags thus declared will be valid in rules, but will not be added to the lexicon.

You can also declare lists of words as follows:

  SETVAR Var_name        word1         word2

for example:

  SETVAR preps       for        by        in order to
  SETVAR nums       one        two        three        four        five        six        seven
  SETVAR oneword        one

"Var_name" is the name of the word list. It must consist only of letters; digits in Var_name are not allowed in the current formalism. There can be up to 20 words in one list and up to 20 Var_names declared. You can further use these Var_names as values of WORDs (see the section "Examples of rules").

Examples of disambiguation rules

In this section we give several examples of disambiguation rules
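The numeric limits stated for the declaration part can be checked before a rule file is handed to the interpreter. The Python sketch below is such a pre-check under stated assumptions: the function name and the dictionary representation of word lists are hypothetical, "letters only" is tested with `str.isalpha`, and the 15-symbol limit is applied to each declared tag as a whole. It does not parse the SETTAG or SETVAR syntax itself, whose punctuation is not legible in this excerpt.

```python
MAX_TAG_LENGTH = 15       # maximum length of a declared tag, in symbols
MAX_WORDS_PER_LIST = 20   # up to 20 words in one list
MAX_LISTS = 20            # up to 20 Var_names declared

def check_declarations(tags, word_lists):
    """Return a list of problems found in declared tags and word lists (empty if none)."""
    problems = []
    for tag in tags:
        if len(tag) > MAX_TAG_LENGTH:
            problems.append(f"tag {tag!r} is longer than {MAX_TAG_LENGTH} symbols")
    if len(word_lists) > MAX_LISTS:
        problems.append(f"more than {MAX_LISTS} word lists declared")
    for name, words in word_lists.items():
        if not name.isalpha():
            problems.append(f"list name {name!r} must consist of letters only")
        if len(words) > MAX_WORDS_PER_LIST:
            problems.append(f"list {name!r} has more than {MAX_WORDS_PER_LIST} words")
    return problems

lists = {
    "preps": ["for", "by", "in order to"],
    "nums": ["one", "two", "three", "four", "five", "six", "seven"],
    "oneword": ["one"],
}
print(check_declarations(["G", "Inf"], lists))
print(check_declarations(["G"], {"list2": ["one"]}))
```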
22. e are two status bars. One bar shows the location and name of your lexicon; the other displays the total number of lexical entries.

Figure 1. A screen shot of the Lexicon Creator interface with the FLAT lexicon open.

How to

Start Lexicon Creator

To start the Lexicon Creator double-click on its icon.

Create a new lexicon
23. e is a type-in area in which you can write the rules. To make rule writing less tedious and time-consuming, the right pane of the interpreter contains two clickable menus. The first menu lists all the tags from the FLAT lexicon you configured your tagger to; the second menu contains expressions used in rules. After you have written a rule, click the "Check it" button. This button triggers a rule check and, in case of a mistake, displays an error description message. After you close the message box by clicking "OK", you will find the cursor right at the place where a correction should be made. Immediately after saving new rules, an updated trace can be displayed in the interface.

Disambiguation rule formalism

Disambiguation rules are always based on a certain FLAT lexicon. In spite of the simplicity of the IF THEN ELSE ENDIF formalism, the rules have quite a rich and flexible inventory of "right hand side" conditions that provide for rather fine (vs. coarse) disambiguation. For example, one can check a context within a five-string window with the tag in question in the middle. The context can be checked in terms of tags and/or word strings. It is also possible to check whether a context tag/word belongs to a certain list, etc. The disambiguation rules can be both "reductionistic" and "substitution" ones. Rules are case sensitive.

The disambiguation formalism includes a declaration part (optional) and rules (obligatory). We will first de
24. e pop-up window.

4. Paste lexemes from a user lexicon

A user lexicon can be created in your own application and pasted into a FLAT lexicon (see the section "Integrating FLAT in your own application"). Use the selection "User lexicon" in the "Edit" menu.

Edit lexical items

To edit a lexical item (to correct the spelling or reassign tags):

1. Highlight a lexical item in the list of lexemes in the left pane of the Lexicon Creator interface. The lexeme will appear in the active text area at the top of the right pane of the interface.
2. Edit the word in the text area.
3. Click the "OK" button at the bottom of the right pane.
4. Recheck tag boxes to recode the lexeme.
5. Click the "OK" button at the bottom of the right pane.

Remove lexical items

To remove lexical items from your lexicon:

1. Select the lexemes to be removed in the list in the left pane.
2. Click the "Delete" button.

If a word coincides with the first word of a phrase that is in your lexicon, you will not be allowed to delete it; you should delete the phrase first.

Add, edit or remove code tag sets

Selecting "Configure" in the main menu gives access to the "Configure tag set" pop-up window, through which you can delete, edit or add new code tags to your current lexicon.

To add a new code tag:

1. Click the button "New" in the "Configure tag set" pop-up window. You will get the "New tag" box, see Figure 2
25. Figure 4. Import of words from the tagger.

Tag Disambiguation

FLAT Tagger includes a rule-based tag disambiguator that disambiguates multiple tags. For tag disambiguation you will need the buttons "Disambiguate" and "Refresh" and one more selection in the "Configure" menu, "Tag disambiguation", which opens an interactive interpreter for writing or updating tag disambiguation rules. Immediately after saving new rules, an updated trace can be displayed in the interface.

The rules are linked to a particular FLAT lexicon. A set of disambiguation rules tuned to the patent domain is included in FLAT together with a domain lexicon (see the section "FLAT lexicon").

FLAT is provided with a disambiguation rule interpreter so that you can create and test different sets of disambiguation rules based on the tags of the same or different FLAT lexicons. In spite of the simplicity of the formalism, the rules have quite a rich and flexible inventory of "right hand side" conditions that can provide for rather fine (vs. coarse) disambiguation. For example, one can check a context within a five-string window with the tag in question in the middle. The context can be checked in terms of tags and/or word strings. It is also possible to check whether a context tag/word belongs to a certain list, etc. The disambiguation rules can be both "reductionistic" and "substitution" ones. Any changes you might make in the rule set are immediate
26. ertag is as follows:

Tag
  POS
    Noun
      Object: plural, singular
      Process
        -ing
        other: plural, singular
      Substance: plural, singular
      Other: plural, singular

The "depth" of supertags is specific for every part of speech and codes only the amount of knowledge that is believed to be sufficient for the patent text analysis procedure. We do not assign equally "deep" supertags to every word in this lexicon. For example, supertags for verbs include such morphological features as verb (POS) and verb forms (-ing form, -ed form, irregular form, finite form). For finite forms we further code the number (plural or singular).

The following notations are used in the FLAT patent domain lexicon and tagger:

Abbr      abbreviation
Adj       adjective
Adv       adverb
Conj      conjunction
Det       determiner, singular
DetPl     determiner, plural
N         noun, singular, object
Nf        noun, singular, action, does not end in -ing
Nfp       noun, plural, action, does not end in -ing
Nfg       noun, singular, action, ends in -ing
Nm        noun, singular, substance
Nmp       noun, plural, substance
No        noun, singular, other
No        noun, plural, other
Num       numeral
P         verb, finite, singular, present
Pd        verb form, ends in -ed
Pg        verb form, ends in -ing
Pi        verb form, irregular
Pis       verb form "is"
Pare      verb form "are"
Pp        verb form, finite, plural, present
Prep      preposition
PronPs    possessive pronoun
Qu        quantifier
Wh
27. ly designed test suite, to investigate the problem of coverage and the knowledge necessary for disambiguation. A student can experiment with changing (inventing) tags to see whether a "deeper" description of lexical units gives better resolution. The interpreter can also be used to teach students to write formal (programmable) language descriptions, etc.
28. ly traced in the tagger interface, thus allowing for operative testing of rule coverage capacity.

Figure 5. A screen shot of the Tagger interface with the pop-up disambiguation window. It shows the results of disambiguation of a test text based on a test lexicon (see the section "Disambiguation rule formalism").

How to work with FLAT to Disambiguate Tags

1. Create a lexicon with FLAT Lexicon Creator.
2. Open the tagger by double-clicking on its icon.
3. Configure the tagger to the lexicon for which you will write the disambiguation rules.
4. Write disambiguation rules (read the section "Write disambiguation rules").
5. Tag the text; update your lexicon with new words in case there are a
29. ne by one, manually
o Edit lexical items
o Remove lexical items (one by one or in groups)
o Add, edit or remove code tags
o Assign/edit code tags for lexemes (one by one or in groups)
o Search any lexical item
o Look through the word list by
    - Using the scroll bar (you can only scroll 10 000 lexical items)
    - Using the "Back" or "Forward" buttons, which make it possible to scroll every next/previous 10 000 lexical items in turn
o Filter lexical items
    - By prefixes, suffixes, code tags
    - Without code tags (e.g., newly imported uncoded words)
o Undo an action

Interface

The Lexicon Creator interface displaying a sample lexicon and set of code tags is shown in Figure 1. The main menu in the top right corner of the interface has "File", "Edit", "Language" and "Configure" selections. The following sections explain how to use these. The left pane shows an interactive "Find" window; the buttons "New", "Undo" and "Delete" for updating a word list; and a scrollable list of lexical units, including multiword prepositions, adverbs, idiomatic phrases, etc. Under the list of words there are two buttons, "Back" and "Forward", which are enabled if your lexicon contains more than 10 000 units. The right pane contains an interactive editing window and a number of code tags next to check boxes. A checked box indicates a code tag assigned to the highlighted word. At the very bottom of the interface ther
30. ny and tag the text again.
6. Click the button "Disambiguate". You will get the pop-up window with two control screens. The upper screen shows the tagged text before disambiguation; the lower screen displays the same text with tags disambiguated according to your set of disambiguation rules. You can compare both texts to refine your rules. The analysis traces can be saved for further use.

Interface for disambiguation rules

FLAT Tagger is provided with a special interactive interface for writing disambiguation rules in a simple formalism, which is described later. Open this interface by selecting "Tag disambiguation" in the "Configure" menu of the tagger interface.

Figure 6. A screen shot of the interactive interface for writing disambiguation rules, filled with FLAT rules (see the section "FLAT knowledge" for the meaning of supertags).

The right pane of the interfac
31. o get a list of unknown words.

TagAndDisambiguate

Return type: String

Parameters:
Name    Type      Description
Text    String    Text to tag and disambiguate

Description:
This method is used to tag a text based on the current lexicon tags and disambiguation rules. It returns a tagged text with completely or partially disambiguated tags (depending upon the disambiguation rules).

Restrictions:
The text cannot exceed 16 Kb. Certain symbols are not allowed in the input text. After this method has been used, you can call the GetUntagged method to get a list of unknown words.

FLAT in a classroom

The extreme user friendliness of the tool interfaces makes the software well suited to the classroom when teaching NLP. It allows students (computational linguists) to concentrate on the linguistic issues of developing NLP applications, e.g. machine translation, without compounding the problems with programming issues. The description of the tools in the previous sections gives some hints about what FLAT can be used for in teaching NLP, and how. First of all, an instructor might use the tools to familiarize students with the problems of NLP software, linguistic error analysis, the specificity of the sublanguage approach to NLP, etc. Another way to use FLAT is to participate in building an NLP system. For example, based on the tagging lexicon acquisition tool and the interactive rule interpreter, exercises can be developed, using a special
32. ollows (mind the brackets):

condition1 AND condition2
condition1 OR condition2
NOT condition

Using Boolean operators you can write conditions of any complexity (mind the brackets), for example:

condition1 AND condition2 OR NOT condition3

Any rule can contain any number of embedded IF THEN ELSE structures and Boolean expressions combining several rules into one. Thus, though formally there can be up to 50 separate rules, in practice you can cover as many disambiguating procedures as you can think of within the suggested formalism.

The order of the rules is relevant. The program linked to the rule interpreter performs a disambiguation action after the context meets the rule condition and moves on to process the next tag.

Simple conditions

There are several groups of simple conditions.

Tag context conditions help tag disambiguation depending upon the tag context. The tag context is a five-tag window with the tag in question in the middle.

TAG:
CT     the current tag
LT     the tag left of the current tag
LLT    the tag two steps left of the current one
RT     the tag right of the current tag
RRT    the tag two steps right of the current one

TAG values can be single (N, V) or multiple (N V, Adj V, Adj V N). The order of single tags in a multiple tag is not relevant. You can write the tag context condition as follows:

TAG    value1    value2    ...    valueN

for example:

CT     N
LLT    N    N V
RRT    Adj N V    Adj N
RT
33. rther in the text. You will thus avoid listing the same sets of words in rules; for example, a condition can be written like

CW   ListA

Other conditions are as follows:

ISLAST   means that the current word is positioned before a period, comma, colon or semicolon.

The conditions listed below may be especially useful when analyzing phrases, i.e., when you know the phrase boundaries:

ISLW     means that there is a word to the left of the current one in the text (the current word is not the first one in the text).
ISRW     means that there is a word to the right of the current one in the text (the current word is not the last one in the text).
ISLLW    means that there are two words to the left of the current one in the text (the current word is not the first one in the text).
ISRRW    means that there are two words to the right of the current one in the text (the current word is not the last one in the text).

Actions

There are only two actions in tag disambiguation rules:

CT    tag_any_value
DELETE    tag_single_value

for example:

CT  N
CT NN
DELETE  N
DELETE  Adj

The first action is used in "substitution" rules. It substitutes the tag of the current word (or phrase) with the tag stated in the rule.

The second action is used in "reductionistic" rules. This action tries to delete a single-value tag from the tag of the current word. This action will not be executed in two cases:

• If the stated single tag is not assigned to the current word, this action does nothing.
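The behaviour of the two actions, including the two cases in which DELETE is not executed, can be sketched in Python. This is an illustration under stated assumptions, not FLAT code: a tag is modelled as a frozenset of single values (the manual states that the order of single tags in a multiple tag is not relevant), and the function names substitute and delete are hypothetical.

```python
def substitute(current_tag, new_tag):
    """Substitution action: replace the whole current tag with new_tag."""
    return frozenset(new_tag)


def delete(current_tag, single_value):
    """Reductionistic action: remove one single value from a multiple tag.

    The tag is returned unchanged in the two cases named in the manual:
    the stated value is not part of the current tag, or the current tag
    has only one value left (deleting it would leave the word untagged).
    """
    current_tag = frozenset(current_tag)
    if single_value not in current_tag:
        return current_tag
    if len(current_tag) == 1:
        return current_tag
    return current_tag - {single_value}


if __name__ == "__main__":
    # A word tagged Adj and N loses the Adj reading.
    print(sorted(delete({"Adj", "N"}, "Adj")))
    # A word with a single-value tag keeps it.
    print(sorted(delete({"N"}, "N")))
    # Substitution overwrites the tag regardless of its previous values.
    print(sorted(substitute({"Adj", "N"}, {"G"})))
```

Modelling a tag as a set makes the "order is not relevant" property automatic: a tag written as N V and one written as V N compare equal.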
34. scribe the structure of disambiguation rules and then the declaration part of the formalism.

Rule structure

The formats of disambiguation rules allowed by the interpreter are listed below (see also Figure 6).

1. The upper-level rule always has the IF THEN ENDIF structure, as follows:

Rule format 1:

IF condition
THEN
    Action1
    Action2
ENDIF

2. The structure IF THEN ENDIF can be embedded in the upper (and next) level structure as many times as you need:

Rule format 2:

IF condition1
THEN
    IF condition2
    THEN
        Action
    ENDIF
ENDIF

You can also embed the structure IF THEN ELSE ENDIF in rules of format 1 and format 2 and get a rule in format 3:

Rule format 3:

IF condition1
THEN
    IF condition2
    THEN
        Action1
    ELSE
        Action2
    ENDIF
ENDIF

It is possible to have any number of embedded structures in one rule, for example, like the following:

Rule format 4:

IF condition1
THEN
    IF condition2
    THEN
        Action list or IF THEN ELSE ENDIF structure
    ELSE
        Action list or IF THEN ELSE ENDIF structure
    ENDIF
ENDIF

Attention! There cannot be an ELSE in the upper level of the rule; that is, there cannot be a rule like this:

IF condition
THEN
    Action list or IF THEN ELSE ENDIF structure
ELSE
    Action list or IF THEN ELSE ENDIF structure
ENDIF

All actions and conditions are described below.

Conditions

Conditions can be simple or complex.

Complex conditions

Complex conditions are Boolean expressions combining other conditions as f
35. t file pop-up window.
- Type in the path to your file, or click the button next to the type-in area to get a regular browsing window.
- Follow the dialogue to get the path to the file to import into the type-in text area.
- Assign tags to all of the lexemes from the file, or leave the tag boxes unchecked if the lexemes from this file require different coding. You can assign tags to them later.
- Click the button "Import" at the bottom of the pop-up window.

3. Paste words recognized as unknown from the FLAT Tagger

- Tag a text with FLAT Tagger (see the section "Look up Tagger").
- Select the words you want to paste to your lexicon in the "Unknown words" pop-up window in the tagger interface.
- Right-click on any spot in the "Unknown words" window. You will get a pop-up menu of possible actions.
- Select "Copy selected" to paste all selected words to the Lexicon Creator, or select "Copy all" to paste all of the words without selecting any of them.
- Return to Lexicon Creator.
- Select "Paste from tagger" in the "Edit" menu. You will get the "Paste words from tagger" pop-up window. The words to be pasted will be shown in the "Words from the buffer" window. The set of tags of your lexicon will be shown below.
- Check the tag boxes if you want to assign the checked tags to all the words from the buffer, or leave the boxes unchecked to code the words later.
- Click the button "Paste" at the bottom of th
36. tc.

Contact address:

Lana Consulting
Madvigs alle 9
1829 Copenhagen
Denmark
Ph.: +45 33250441
Fax: +45 22332822
E-mail: lanaconsult@mail.dk
URL: http://www.lanaconsult.com

System requirements

Your system must fulfill the following minimum requirements if you want to run FLAT:

PC: Pentium processor, 32 MB RAM, 100 MB free hard disk space, CD-ROM drive
Operating system: Windows 95/98/Me or Windows NT 4.0/2000/XP

Installation

To install FLAT on your personal computer, proceed as follows:

1. Start the setup program. Insert the CD-ROM in the appropriate drive and start the setup.exe program on the CD-ROM. To do this, choose the Run command on the Start menu and type the following in the command line: d:\setup.exe (replacing d: with the letter for your CD-ROM drive, if different).
2. Select the installation directory. Confirm the suggested installation location (default: C:\Program Files\FLAT). You may also choose a different directory.
3. Select the Program Folder. Confirm the suggested Program Folder (default: FLAT). You may type a new folder name, or select one from the existing Folder list.
4. Run FLAT. Click on the icon of FLAT in the Program Folder or on the icon on the Desktop, or check the check box "Yes, Launch the program file".
5. Register. After you run the program FLAT, you will get a pop-up window in which you will see your user code and the serial number of FLAT. You will be asked to enter your name and registration key
37. the lexicon list.

1. Highlight the lexemes.
2. Right-click on any spot in the left pane to get a pop-up menu.
3. Select "Set tags for selected" in the pop-up menu. You will get the "Set or clear tags for selected words" pop-up window with the selected words displayed at the top. Note that when dealing with a group of words you cannot clear and assign tags to them in one take.
4. To clear tags from the words' coding:
    i. Check the "to clear" radio button to delete tags from the coding of the group of words in question.
    ii. Check the tag boxes next to the tags you want to delete in the pop-up window.
    iii. Click the "OK" button in the pop-up window.
5. To assign new tags to the words' coding:
    i. Check the "to set" radio button to assign tags to the coding of the group of words in question.
    ii. Check tag boxes in the pop-up window.
    iii. Click the "OK" button in the pop-up window.

Assign lexeme coding when importing them from a file or the tagger

When importing lexemes from a text file or pasting them from the tagger, you can either assign the same tag coding to all the lexemes shown at the top of the corresponding pop-up window (see the section "Add new lexical items") or put them in the lexicon uncoded and then proceed as described in the section above.

To assign tag coding to all the imported/pasted lexemes:

1. Check tag boxes in the pop-up window.
2. Click the "OK" button.

Look through the wor
    