User manual for the base application
Contents
1. List of Figures (continued)
2.21 Adding a NER term annotation
2.22 Dictionary enrichment
2.23 Diagram illustrating the options of adding new terms to the dictionary
2.24 Correcting term annotation
2.25 Changing project settings

Chapter 1 Basic concepts and interaction

1.1 Introduction

Note is a Biomedical Text Mining workbench that integrates current Biomedical Text Mining (BioTM) methods and provides biologists with intuitive tools to support their bibliographic searches and further literature curation. The major guidelines of its development were interoperability, extensibility and a user-friendly interface. The workbench is meant for both BioTM research and curation. On the one hand, it supports regular curation activities, providing an intuitive Graphical User Interface (GUI) that does not require any knowledge about the workbench or the implementation of its techniques. On the other hand, it is also meant for people with programming skills who might wish to extend the workbench capabilities.

Note is implemented over AIBench, a Java framework meant to ease the development of Artificial Intelligence and Data Analysis applications. The main strengths of AIBench are its clear design and available services. Its design is problem-independent: minimum framework-related code is required in order to produce new functionalities. M
2. [Figure 2.25: Changing project settings. The dialog shows the project (anoteProject), the Local Documents Path, the Root Path, the Save Project Path and the Proxy Settings (Use proxy, Host, Port).]
3. • BioCyc flatfile (http://biocyc.org/download.shtml)
• ChEBI flatfiles (http://www.ebi.ac.uk/chebi/downloadsForward.do)
• NCBI Taxonomy flatfiles (ftp://ftp.ncbi.nih.gov/pub/taxonomy)
• UniProtKB/Swiss-Prot flatfiles (http://www.uniprot.org/downloads)
• MGI Entrez Genes flatfiles (ftp://ftp.informatics.jax.org/pub/reports/index.html)

The user may choose to upload all contents available at a given source or, according to the source specifications, restrict the data upload to a given organism and a subset of the covered classes.

2.7.2 Loading lookup tables

This option loads pre-defined lookup tables for a number of biological classes. In the current version these are only available for three classes: biology-related verbs, physiological states and experimental techniques.

2.8 Project Settings

It is possible to change the settings of a project after it has been created. To do so, the user selects the Settings option on the menu bar and then Project Settings, or right-clicks a project on the clipboard. In the popup window (Figure 2.25) it is possible to change the location of the project's local documents, the root directory of the documents and the file where the project is saved, and to edit the proxy configuration. To define the host and port for the proxy, the user has to activate the Use proxy option and then type the host and port.
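The settings above can be pictured as a small record with one validation rule for the proxy. The following Python sketch is illustrative only: the class name ProjectSettings, its field names and the problems method are assumptions made for this example and are not part of Note or AIBench. The .anp check reflects the extension this guide requires for project files.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProjectSettings:
    """Hypothetical record mirroring the fields of the Project Settings dialog."""
    local_documents_path: str
    root_path: str
    save_project_path: str
    use_proxy: bool = False
    proxy_host: Optional[str] = None
    proxy_port: Optional[int] = None

    def problems(self) -> List[str]:
        """Return human-readable problems; an empty list means the settings look valid."""
        found = []
        if not self.save_project_path.lower().endswith(".anp"):
            found.append("project file must have the .anp extension")
        # Host and port only matter when the 'Use proxy' option is active.
        if self.use_proxy:
            if not self.proxy_host:
                found.append("proxy host is required when 'Use proxy' is active")
            if self.proxy_port is None or not (0 < self.proxy_port < 65536):
                found.append("proxy port must be between 1 and 65535")
        return found
```

A settings object with the proxy switched off is accepted without a host or port, which matches the dialog: those two fields are only typed after Use proxy has been activated.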
4. inspector. The Views will, by default, be launched on the right side of the work area. So, the original layout of the components has three major areas: the menus on the top, the clipboard on the left and the views area on the right.

1.4 Getting started

The concepts within Note can be overwhelming for a beginner in BioTM. Therefore, we provide a guideline to start using Note Basics. The first two steps are (i) to create a new project (see Section 2.1.1) and (ii) to create a new database (see Section 2.2.2); alternatively, the user can create a connection to an existing database, as explained in Section 2.2.1.

From this point on, the user has two alternatives: (i) to use the application with some sample data, provided by the option Load sample data in the Database menu, or (ii) to start their own case study from scratch. We recommend starting with the first option. In this case, the results of a pre-defined query to PubMed are loaded into the catalogue and a dictionary with terms related to the organism E. coli is loaded into the database. Alternatively, at this step the user can perform a query (see Section 2.3), in which case specific lexical resources also need to be created (see Section 2.7 for details).

The next logical step is to select a number of documents and load them. The user selects the set of interesting documents (or all of them) and chooses if only abstracts or full texts
5. 2.6.1 Annotating a new term
2.6.2 Correcting an annotation
2.6.3 Removing an annotation
2.7 Handling lexical resources
2.7.1 Handling dictionaries
2.7.2 Loading lookup tables
2.8 Project Settings

List of Figures
2.1 Creating a new project
2.2 Configuring a new project
2.3 Saving a project
2.4 Loading a project
2.5 Creating a database connection
2.6 Connecting to an existing database
2.7 Creating a new local database
2.8 Listing PubMed queries
2.9 Viewing a Result Set
2.10 Checking detailed information about a publication
2.11 Viewing the PDF file of a publication
2.12 Setting the publication's relevance
2.13 Selecting the publication set
2.14 Viewing the Publication Set
2.15 Selecting the Publication Set for NER
2.16 Selecting the dictionary and the classes to annotate
2.17 Annotation options
2.18 NER running operation
2.19 ANoteNerBox view
2.20 Document view
6. [Figure 2.10: Checking detailed information about a publication.]

In the publication's details view, a Weight Relevance measure is presented. If the publication belongs to more than one query, the average of its relevance over all those queries is calculated and presented. The user can visualise the actual relevance of the document for each query it belongs to and edit it (Figure 2.12). If the same relevance is intended for all the queries, it is possible to select all queries and then choose the relevance level.

Let us now describe how the user can retrieve the documents from the publishers' sites, when these are available.
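The averaging rule above can be written down in a few lines. This Python sketch is only an illustration: the guide does not state how relevance levels are encoded, so the numeric values, the function names and the dictionary layout (query keywords mapped to a relevance value) are all assumptions.

```python
from statistics import mean
from typing import Dict


def weight_relevance(relevance_by_query: Dict[str, float]) -> float:
    """Average the per-query relevance values of one publication."""
    if not relevance_by_query:
        raise ValueError("the publication does not belong to any query")
    return mean(list(relevance_by_query.values()))


def set_relevance_for_all(relevance_by_query: Dict[str, float], level: float) -> Dict[str, float]:
    """Return a copy in which every query has the same relevance level."""
    return {query: level for query in relevance_by_query}
```

Editing the relevance for a single query changes one entry of the mapping; choosing one level for all queries corresponds to set_relevance_for_all, after which the average equals that level.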
7. • A section with buttons to save the changes made to the document, zoom in and out of the text and undo the last change, and a field to search for a text excerpt in the document.
• A section to change the colours of annotated entities.
• A section with the annotated classes and the terms of each class.
• A section with the structure of the text: the user can click on a section to jump to the respective section in the text.
• A central section with the text.

2.6 Manual Annotation

A number of options are available to the user in the document view described in the previous section. These allow the manual curation of the automatic NER annotations.

2.6.1 Annotating a new term

To annotate a new term, the user selects the term and a popup window appears with the possible options, i.e. the biological classes. The Add Tag option must be
8. [Figure 2.6: Connecting to an existing database. The dialog lists the known connections (host, schema, user) and offers Add and Connect buttons.]

2.2.2 Creating a local database

To create a new local database, the user selects the option Create Local DB. In the popup window, two fields have to be filled in (Figure 2.7). The first is the MySQL root password on the local host, given during installation, and the second is the name of the new database. When all the fields have been set and the Create button is pressed, Note creates a new database with the predefined schema. As before, an object representing the new connection is added to the clipboard.

[Figure 2.7: Creating a new local database.]

2.2.3 Loading default data
9. The option Load sample data in the Database menu loads some pre-defined data, allowing a beginner to get acquainted with the application with reduced time and effort. In this case, the results of a pre-defined query to PubMed are loaded into the catalogue (the query uses the keywords "Escherichia coli stringent response") and a dictionary with terms related to the organism E. coli is loaded into the database. The data source used for this dictionary is the BioWarehouse integrated repository.

2.3 PubMed Searches

To perform PubMed searches, a database connection must first be successfully established. The user clicks on the project's catalogue and a view appears in the working area of the application, listing the queries that already exist in the database. If none of the listed queries is wanted, the user can add a new query by pressing the New Query button. This option is also available in the Database menu, option New Query. A new PubMed search is performed using the keywords given by the user in the popup window. The Execute button starts the search process. If it succeeds, the new query is added to the previous list.

A query has an associated list of publications. The user can select the query to work on and click the Load button. This action loads the information about all the publications of the selected query. Information about these publications will be listed on a new datatype item named Resul
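For readers who want to see what a keyword search against PubMed looks like outside the application, the sketch below builds a search URL for the public NCBI E-utilities ESearch service. This guide does not describe how Note itself contacts PubMed, so the use of E-utilities here is an assumption, and the function name build_pubmed_search_url is hypothetical. The block only constructs the URL and performs no network access.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities endpoint for searches.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_search_url(keywords: str, max_results: int = 20) -> str:
    """Build an ESearch URL that asks PubMed for publications matching the keywords."""
    params = {
        "db": "pubmed",         # search the PubMed database
        "term": keywords,       # the query keywords typed by the user
        "retmax": max_results,  # maximum number of identifiers to return
        "retmode": "json",      # ask for a JSON response
    }
    return ESEARCH_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Same keywords as the sample query mentioned in this guide.
    print(build_pubmed_search_url("Escherichia coli stringent response"))
```

Fetching this URL returns the PubMed identifiers of the matching publications, which is the kind of list a stored query is associated with.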
10. the text typed in the text field and the selected content of the documents, the matched publications will be highlighted. The View section shows the types of documents that can be chosen. The types are:

• Abstract: the publication's abstract without any annotation.
• Full Text: the unstructured full text of the publication without any annotation; this is the direct result of the PdfToTxt operation.
• Structured: the entire text of the publication, without annotations but with a basic structure, i.e. the text is split into the areas containing the title, authors, abstract, paper sections and others.
• NER: if NER was run over abstracts, this shows the publication's annotated abstract; if NER was run over full texts, it shows the entire annotated document.

To view a document, the user just has to click the button in the publication's row. The type of document opened is the one selected in the View section.

[Screenshot residue: application window with the clipboard tree and a NER Box for the query "Escherichia coli stringent response".]
11. CCTC, Computer Science and Technology Center, University of Minho
IBB, Institute for Biotechnology and Bioengineering, Centre of Biological Engineering, University of Minho
SING, Next Generation Computer Systems Group, School of Informatics Engineering, University of Vigo

Note Biomedical Text Mining Workbench
User Guide of Note Basics

Anália Lourenço, Rafael Carreira, Paulo Maia, Sónia Carneiro, Daniel Glez-Peña, Florentino Fdez-Riverola, Eugénio C. Ferreira, Isabel Rocha, Miguel Rocha

2008

Contents
1 Basic concepts and interaction
1.1 Introduction
1.2 Datatypes and operations
1.3 User interaction
1.4 Getting started
2 Main functionalities of Note Basics
2.1 [...] Projects
2.1.1 Creating a new project
2.1.2 Saving a project
2.1.3 Loading an existing project
2.2 Handling database connections
2.2.1 Connecting to an existing database
2.2.2 Creating a local database
2.2.3 Loading default data
2.3 PubMed Searches
2.4 Journal Retrieval
2.5 Named Entity Recognition
2.5.1 Document View
2.6 Manual Annotation
12. [Screenshot residue: Result Set listing and the details view of publication 17121995. PMID: 17121995. Title: Reconstructing repressor protein levels from expression of gene targets in Escherichia coli. Authors: Khanin R, Vinciotti V, Wit E. Journal: Proc Natl Acad Sci U S A. Date: 2006. Volume: 103. Pages: 18592-6. Status: MEDLINE.]
13. according to the user's permissions. By default, all the publications in the Result Set are selected, but the user can select only the intended publications. If the Download Non-available Full Text PDF option (bottom right) is selected, the application invokes the Journal Retrieval (JR) operation. The Journal Retrieval operation tries to find the PDFs of the selected publications on the Web. Each document found is downloaded to the project's local directory, so these PDFs remain available for future work. After all the available PDFs have been downloaded, a PDF-to-text conversion is carried out. By default this option is not selected, because the process takes a few minutes.

When the user presses the Get Publications button, the preceding process is run if the JR option was selected, and the selected publications are loaded into the application. At the end of the process, a window is presented in which the user chooses the Publication Set where the publications will be loaded. It is possible to add the new publications to an existing Publication Set or to create a new one (Figure 2.13). A Publication Set can have documents coming from distinct queries, and it is also possible to add previously non-selected publications from the same query.
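The order of the Journal Retrieval steps (look for each PDF, save what is found in the local documents folder, then convert to text) can be sketched as follows. This is not Note's implementation: the function retrieve_full_texts, the file naming by PMID, and the two callables fetch_pdf and pdf_to_text are hypothetical stand-ins for whatever locates a PDF on the Web and whatever performs the PDF-to-text conversion.

```python
from pathlib import Path
from typing import Callable, Iterable, List, Optional


def retrieve_full_texts(
    pmids: Iterable[str],
    local_docs_dir: Path,
    fetch_pdf: Callable[[str], Optional[bytes]],
    pdf_to_text: Callable[[Path], str],
) -> List[str]:
    """Download missing PDFs, convert them to text, and return the PMIDs available locally."""
    local_docs_dir.mkdir(parents=True, exist_ok=True)
    available = []
    for pmid in pmids:
        pdf_path = local_docs_dir / f"{pmid}.pdf"
        if not pdf_path.exists():
            content = fetch_pdf(pmid)  # hypothetical: returns None when no PDF is found
            if content is None:
                continue
            pdf_path.write_bytes(content)
        available.append(pmid)
    # The conversion runs only after all downloads have finished.
    for pmid in available:
        text = pdf_to_text(local_docs_dir / f"{pmid}.pdf")
        (local_docs_dir / f"{pmid}.txt").write_text(text, encoding="utf-8")
    return available
```

Publications whose PDF is already in the local folder are not fetched again, which mirrors the idea that downloaded PDFs stay available for future work.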
14. [Figure 2.21: Adding a NER term annotation.]
15. ave a hierarchy where a given object A contains objects B and C; these objects are called compound objects. The set of data objects and their hierarchy in a given application are shown in a tree (the clipboard), typically appearing on the left side of the screen. In this tree, compound objects can be opened to show their contents (a list of other data objects). When a given data object is selected (double-clicked), the available visualizers, if any, are launched in the right area of the screen.

• Operations: each operation defines a function that takes zero or more data objects as its inputs and can create zero or more data objects as output and/or merely change the input data objects. Operations can be accessed through the menu options, being typically grouped in several menus and sub-menus. Operations can also be run from the clipboard by right-clicking a data object; this shows the list of all available operations that use that data object as an input.

1.3 User interaction

Since Note Basics is an AIBench-based application, the user interaction is designed to be as simple as possible. The Model-View-Controller (MVC) architectural pattern has been used in every step of the development of AIBench, as well as of Note, resulting in a great deal of decoupling between the operational data and the views. As mentioned before, a View is related to a given Datatype. If there is no View associated with a Datatype, a Default View is launched: a bean
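The relationship between data objects, compound objects and operations can be pictured with a few lines of Python. This is a conceptual sketch only and does not use the AIBench API: DataObject, OPERATIONS, register_operation, operations_for, render_tree and add_catalogue are names invented for this example. The datatype names ANoteProject and ANoteCatalogue are the ones visible in the application's clipboard.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class DataObject:
    """A clipboard item; an item with children is a compound object."""
    name: str
    datatype: str
    children: List["DataObject"] = field(default_factory=list)


# Hypothetical registry: operation name -> (input datatype, function).
OPERATIONS: Dict[str, Tuple[str, Callable[[DataObject], None]]] = {}


def register_operation(name: str, input_type: str, func: Callable[[DataObject], None]) -> None:
    OPERATIONS[name] = (input_type, func)


def operations_for(obj: DataObject) -> List[str]:
    """Operations that accept this object as input (the right-click list)."""
    return sorted(n for n, (t, _) in OPERATIONS.items() if t == obj.datatype)


def render_tree(obj: DataObject, depth: int = 0) -> List[str]:
    """Flatten the hierarchy into indented lines, as a tree view would show it."""
    lines = ["  " * depth + f"{obj.name} ({obj.datatype})"]
    for child in obj.children:
        lines.extend(render_tree(child, depth + 1))
    return lines


def add_catalogue(project: DataObject) -> None:
    """Example operation that changes its input by adding a child object."""
    project.children.append(DataObject("Catalogue", "ANoteCatalogue"))


register_operation("Add catalogue", "ANoteProject", add_catalogue)
```

Filtering the registry by datatype is what makes a right-click menu context-sensitive: an object only offers the operations that can take it as input.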
16. bjects under its clipboard tree:

• A Catalogue, which represents the object used to perform queries to PubMed and store the results.
• LexicalResources, a sub-tree that handles the resources for performing annotation. These include dictionaries and lookup tables. This set is initially empty.

In the project tree, other types of objects will appear as a result of the operations described in the next few sections.

2.1.2 Saving a project

To save a project, the user chooses the File menu and then Save Project. In the popup window, a project and a file to save it to are selected; as before, the file must have the extension .anp.

[Figure 2.3: Saving a project.]

2.1.3 Loading an existing project

If there are previously saved projects, it is possible to load them. To load a project, the user selects the File menu and the option Load Project. In the popup window, the user chooses the file where the project was saved (.anp extension) and clicks Load to perform the ope
17. chosen and the intended class is selected. If the selected term is already annotated, it cannot be annotated again.

After adding a new annotation, the new term can be added to the currently used dictionary, if the user so wishes. The changes are made in the underlying database supporting the dictionary and can therefore be used to annotate other documents in the future. The diagram in Figure 2.23 explains how this option works and the effects of the user's choices on the dictionary.

2.6.2 Correcting an annotation

It is also possible for the user to correct an annotation. The correction can be made to the lexical form of the term or to the class with which the term is annotated. To do so, the user selects the term and chooses the Correct Tag option in the popup displayed. When this option is selected, a window appears where the user can correct the annotation. The window contains the current class of the term, the new term (initially identical to the selected term) and a list with all the classes that the user can choose for the term.

[Figure 2.20: Document view.]
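The rules for adding and correcting an annotation can be summarised in a small data structure. The Python sketch below is illustrative and not taken from Note: the class names Annotation and AnnotatedDocument and the methods add_tag and correct_tag are hypothetical, and annotations are identified by exact character spans, which is a simplification.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Span = Tuple[int, int]  # (start, end) character offsets in the document text


@dataclass
class Annotation:
    text: str
    term_class: str


class AnnotatedDocument:
    """Hypothetical in-memory model of a document and its manual annotations."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.annotations: Dict[Span, Annotation] = {}

    def add_tag(self, start: int, end: int, term_class: str) -> None:
        """Annotate the selected term; an already annotated term is rejected."""
        span = (start, end)
        if span in self.annotations:
            raise ValueError("the selected term is already annotated")
        self.annotations[span] = Annotation(self.text[start:end], term_class)

    def correct_tag(self, start: int, end: int,
                    new_text: Optional[str] = None,
                    new_class: Optional[str] = None) -> None:
        """Change the lexical form and/or the class of an existing annotation."""
        annotation = self.annotations[(start, end)]
        if new_text is not None:
            annotation.text = new_text
        if new_class is not None:
            annotation.term_class = new_class
```

Leaving new_text as None corresponds to pressing Apply without editing the term: only the class changes.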
18. e current state can be viewed by clicking on each of these datatypes. The sub-menus Dictionaries and Lookup tables handle the operations regarding each resource type. The options for each case are given below.

[Figure 2.23: Diagram illustrating the options of adding new terms to the dictionary.]

2.7.1 Handling dictionaries

Three distinct operations can be performed in dictionary management, given by three options in the Dictionaries sub-menu or by right-clicking a dictionary object:

• New dictionary: creates a new empty dictionary in the project.
• Dictionary contents: allows the user to add contents to a dictionary, which can come from several sources.
• Merging dictionaries: allows the user to merge the contents of several dictionaries into a new one (only allowed for dictionaries whose sets of classes do not overlap).

The second option (adding contents) deserves a more complete explanation. The process starts with the selection of the dictionary where the contents will be added. In the bottom part, the data source is configured.

[Figure 2.24: Correcting a term annotation.]

Currently, the system supports the following sources:

• BioWarehouse integrated databases (http://biowarehouse.ai.sri.com)
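The restriction on merging (the sets of classes must not overlap) is easy to state in code. In this Python sketch a dictionary is modelled as a mapping from a class name to a set of terms; that representation and the function name merge_dictionaries are assumptions for illustration and say nothing about how Note stores dictionaries in its database.

```python
from typing import Dict, Iterable, Set

Dictionary = Dict[str, Set[str]]  # class name -> terms of that class


def merge_dictionaries(dictionaries: Iterable[Dictionary]) -> Dictionary:
    """Merge several dictionaries into a new one, refusing overlapping classes."""
    merged: Dictionary = {}
    for dictionary in dictionaries:
        overlap = merged.keys() & dictionary.keys()
        if overlap:
            raise ValueError(f"classes present in more than one dictionary: {sorted(overlap)}")
        for term_class, terms in dictionary.items():
            merged[term_class] = set(terms)  # copy, so the inputs stay untouched
    return merged
```

Because the result is built from copies, the merge produces a new dictionary and leaves the original ones unchanged, as the menu option describes.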
19. for the original PDF documents handled by the project. All PDF documents captured by Journal Retrieval processes are saved in this folder.

2. Secondly, the Root Path is defined. This is the folder where all the project documents processed by Note are saved. This includes all the annotated documents and the documents created by PDF-to-text processes, among others.

At this stage, the user can also select a path to save the project main file (.anp extension) and define proxy configurations, if they are needed by the available internet connection. The configurations made in this step can be changed later in the Project settings menu. When the OK button is pressed, the project is created and a data item of the type ANoteProject is added to the clipboard. This will be the root of all objects of a given working session.

[Figure 2.2: Configuring a new project. The dialog has the fields Project Name, Local Documents Path, Root Path, Save Project Path and Proxy Settings (Use proxy, Host, Port).]

When a project is created, it has two different o
20. [Screenshot residue: page of a PDF article shown in the application's PDF viewer (Mouery, Rader, Gaynor and Guillemin, on the stringent response in Helicobacter pylori).]
22. [Figure 2.22: Dictionary enrichment. The Add Term to Dictionary dialog shows the term (here tRNA), a list of similar terms and an Alias option.]

To change the class of the term, the user just has to select one of the classes in the given list. To correct the term, the user edits it in the New Text field; otherwise, the user just clicks the Apply button without editing the term. When the Apply button is pressed, the changes are made and the window to add a term to the dictionary appears. The process of adding a term to the dictionary is the same as described above.

2.6.3 Removing an annotation

If the user knows that the term's annotation is incorrect and that the term should not be annotated with any of the possible classes, the annotation of that term can be removed. To do that, the user just has to select the term and choose the option Remove Tag from the popup. This action only removes the annotation; it does not remove the term from the dictionary.

2.7 Handling lexical resources

The menu Lexical resources contains a number of operations to manage the lexical resources of a project, namely dictionaries and lookup tables. Both the set of dictionaries and the set of lookup tables are represented by clipboard objects and th
23. n is pressed, the NER operation starts and a small window appears indicating the execution of the operation (Figure 2.18). The NER operation takes a few minutes. When the process is finished, a new Ner Box List object is added to the clipboard. This object contains a list of items of the datatype ANoteNerBox, each being the result of a NER operation. The Ner Box List exists because it is possible to create different configurations for NER (e.g. distinct dictionaries), and each configuration yields a distinct NerBox.

By clicking on a NER Box in the clipboard, the respective view window is presented (Figure 2.19). In the upper part of this window, the keywords that originated the Publication Set are given. The dictionary used, the annotated entities, the number of publications annotated and all the annotation options are also presented.

[Figure 2.17: Annotation options. The Txt Structuring and NER dialog offers the lookup tables Biology-related Verb, Laboratory Technique and Physiological State, the Predefined Expert Hand Rules option and the type of text to annotate (Abstract or Full Text).]

In the bottom part of the window there are two sections. The Search section allows the user to search for a publication in the list. A search can be made over different contents, which can be selected in the list at the right-hand side of the search text field. If there are matches between
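To give an idea of what a dictionary-based NER run produces, the sketch below performs a naive, case-sensitive, longest-match lookup of dictionary terms in a text. It is not Note's algorithm, which this guide does not describe; the function name annotate and the representation of a dictionary as a mapping from class name to a set of terms are assumptions made for this example.

```python
import re
from typing import Dict, List, Set, Tuple

Match = Tuple[int, int, str, str]  # (start, end, matched text, class)


def annotate(text: str, dictionary: Dict[str, Set[str]]) -> List[Match]:
    """Tag every whole-word occurrence of a dictionary term, preferring longer terms."""
    entries = sorted(
        ((term, cls) for cls, terms in dictionary.items() for term in terms if term),
        key=lambda entry: (-len(entry[0]), entry[0]),
    )
    taken = [False] * len(text)  # characters already covered by an annotation
    found: List[Match] = []
    for term, cls in entries:
        pattern = re.compile(r"(?<!\w)" + re.escape(term) + r"(?!\w)")
        for match in pattern.finditer(text):
            if not any(taken[match.start():match.end()]):
                for i in range(match.start(), match.end()):
                    taken[i] = True
                found.append((match.start(), match.end(), match.group(), cls))
    return sorted(found)
```

Running the function with two different dictionaries on the same text yields two different result lists, which is the reason a separate NerBox exists for each NER configuration.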
24. [Figure 2.11: Viewing the PDF file of a publication.]

If a new Publication Set is selected, a new instance of Publication Set is added to the clipboard. All the instances of that type are placed under a root object of the type WorkingSets.

2.5 Named Entity Recognition

When there are one or more Publication Sets available in a project, it is possible to execute the Named Entity Recognition (NER) operation over one of these sets by right-clicking it. When the user clicks on a Publication Set, a view is presented with information about the publications added to it (Figure 2.14) and some further information about the sets of processed documents associated with it. A wizard is presented when the button in the view is pressed, when the Txt Structuring and NER option is chosen on the Document menu, or when a Publication Set item of the clipboard is right-clicked. This wizard allows the NER process to be configured.

[Figure 2.12: Setting the publication's relevance.]
[Figure 2.13: Selecting the publication set.]

The first step is to select the Publication Set over which the NER will be performed (Figure 2.15). When the desired Publication Set is selected, the Next button is pressed. In the next step a dictionary mu
25. on's row. The publication information view shows the available data about the publication and also implements two other features:

• to view the PDF document;
• to view and edit the publication's relevance to the query.

If the publication has its PDF document available locally (i.e. the PDF file is in the project's local documents folder), it is possible to visualise this PDF. This typically occurs when a previous Journal Retrieval (JR) process has been performed. In this case, the PDF button is enabled and the user can click it to see the document (Figure 2.11).

[Figure 2.9: Viewing a Result Set.]
26. [Figure residue: cropped text of the H. pylori publication displayed in the PDF viewer screenshot (Figure 2.11: Viewing the PDF file of a publication); not part of the manual text.]
27. oreover, it generates GUI code and enforces well-designed MVC code, supporting three main artifacts: operations, data types and views. Operations and data types are used in problem modelling, while views display data in a friendly way. Regarding operations, Note sustains the general workflow of BioTM, fully covering all activities performed in manual curation. The workbench supports the retrieval, processing and annotation of documents, as well as their analysis at different levels. So far, only dictionary-based and ontology-based annotation are supported, as it was considered more important to provide means for the creation of annotated corpora than to construct models based on general biomedical corpora. This document briefly explains the functionalities of the Note Basics application and the way it can presently be used. This application brings together a number of basic components of the full Note platform in a single application, with the basic tools of BioTM oriented towards the basic needs of biologists. This is still a preliminary version of the documentation. 1.2 Datatypes and operations. Every application built on AIBench is organized around the concepts of datatypes and operations, defined as follows. Datatypes define the types of data that are of interest to a given application. For each data object, one or more visualizers can be defined to show its content to the user in a given perspective. Data objects can h
28. ration (Figure 2.4). As a result, an ANoteProject object is added to the clipboard. [Figure 2.4: Loading a project.] 2.2 Handling database connections. Each Note project needs a database connection associated with it (the MySQL database engine is used), since many operations work over data in the database. The database connection is created in the context of the Catalogue datatype (Figure 2.5) or under the menu option Databases. The user can choose to create a connection to an existing database or to create a new local database. 2.2.1 Connecting to an existing database. To create a connection to a previously existing database, the user selects the option Create DB Connection. In the popup window (Figure 2.6), the user can select previously saved connection parameters and edit them if necessary, or define a new connection. The user saves the new configuration by clicking the Add button, and can also remove a previous connection configuration. After configuring all the connection fields (host, port, database schema, user and password), the Connect button must be pressed. A new item of datatype Database Connection is added to the clipboard, and the view for this datatype includes information about the host, port and database name.
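The five connection fields listed above can be pictured as a small record that is checked before connecting. The following is a minimal sketch in Python, not the workbench's own code: the class name `DbConnectionSettings`, its methods, and the sample values are hypothetical, and combining the fields into a MySQL JDBC-style URL is an assumption about how a Java application would typically address a MySQL server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DbConnectionSettings:
    """Hypothetical container for the fields of the connection dialog."""
    host: str
    port: int
    schema: str
    user: str
    password: str

    def validate(self) -> None:
        # Host, schema and user are treated as mandatory text fields.
        for name in ("host", "schema", "user"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} must not be empty")
        # TCP ports are limited to the range 1..65535.
        if not 1 <= self.port <= 65535:
            raise ValueError("port must be between 1 and 65535")

    def jdbc_url(self) -> str:
        # Assumed URL shape: jdbc:mysql://host:port/schema
        self.validate()
        return f"jdbc:mysql://{self.host}:{self.port}/{self.schema}"


# Illustrative values only; user name and password are made up.
settings = DbConnectionSettings("localhost", 3306, "biotextmining", "curator", "secret")
print(settings.jdbc_url())
```

Validating before building the URL mirrors the dialog's behaviour of requiring all fields to be configured before Connect is pressed.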
29. [Figure 2.18: NER running operation.] 2.5.1 Document View. When the user selects a document, an item representing it is added to the clipboard under the tree of the respective ANoteNerBox, represented by its name. The PublicationSet item on the clipboard holds nested boxes of documents. A PublicationSet can enclose four types of boxes: ANoteNerBox, a box with abstracts or full texts curated by NER; Structured Text Box, a box with structured documents but without annotations; Full Text Box, a box with unstructured documents and without annotations; Abstract Box, a box with just the abstracts, without annotations. To view a document, the user clicks on the document's item in the clipboard and a view is opened. The document's view is structured in the following sections [Figure 2.19: rotated screenshot of the ANoteNerBox view; text not recoverable.]
30. st be selected for the NER. Here a new dictionary can be imported (how to import dictionaries is described later in this document). After the dictionary has been chosen, the list of possible classes is presented. The user selects the classes to annotate by moving them from the left list to the right list. In the next step (Figure 2.17), a set of complementary classes that the user can choose to annotate is presented. These classes are given by manually compiled lists of terms. The available options are: Biology-related Verbs; Laboratory Techniques; Physiological States; Predefined Expert Hand Rules. In the same window the user decides whether to annotate abstracts or full texts. [Figure 2.14: rotated screenshot of the Publication Set view; text not recoverable.] [Figure 2.15: Selecting the Publication Set for NER.] [Figure 2.16: Selecting the dictionary and the classes to annotate.] After all the configurations have been made, the Execute button (gear icon) has to be pressed. When the butto
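Dictionary-based annotation restricted to user-selected classes can be pictured as a lookup of dictionary terms in the text. The following is a minimal sketch in Python under stated assumptions: the function name `annotate`, the sample dictionary, the class labels, and the matching strategy (case-insensitive, whole-word, longest term first) are all illustrative and do not describe how the workbench actually performs NER.

```python
import re
from typing import Dict, List, Set, Tuple

Annotation = Tuple[int, int, str, str]  # (start, end, matched text, class)


def annotate(text: str, dictionary: Dict[str, str], selected: Set[str]) -> List[Annotation]:
    """Find dictionary terms whose class is among the selected classes."""
    terms = [term for term, cls in dictionary.items() if cls in selected]
    if not terms:
        return []
    # Longest terms first, so a multi-word term wins over a shorter one
    # starting at the same position.
    terms.sort(key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )
    class_of = {term.lower(): cls for term, cls in dictionary.items()}
    return [
        (m.start(), m.end(), m.group(0), class_of[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]


# Hypothetical dictionary mapping terms to classes.
dictionary = {
    "Escherichia coli": "organism",
    "ppGpp": "compound",
    "stringent response": "physiological state",
}
text = "The stringent response in Escherichia coli is mediated by ppGpp."

# Only the classes moved to the "selected" list are annotated.
for start, end, matched, cls in annotate(text, dictionary, {"organism", "compound"}):
    print(start, end, matched, cls)
```

Keeping the class filter separate from the dictionary reflects the wizard's two steps: first a dictionary is chosen, then the classes to annotate are picked from it.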
31. tSet that is loaded into the clipboard. By clicking on the ResultSet, the user can analyse the set of loaded publications. [Figure 2.8: Listing PubMed queries. The screenshot shows a database session with host localhost, port 3306 and schema biotextmining.] 2.4 Journal Retrieval. In the ResultSet view the list of publications is presented. This list contains all the publications that were selected from PubMed using the original query. In this step the user can select which publications to actually retrieve to the project. Each line of the view's table corresponds to one publication and contains the title, author list and date of the publication. If this information is not sufficient for the user to decide whether to get the publication, more detailed information can be viewed by clicking on the leftmost side button on the publicati
32. will be used. The first option reduces the time, since abstracts are already loaded from PubMed. The latter implies the journal retrieval of the available full texts (see Section 2.4), an operation that can take quite a while if the number of selected documents is large. The set of selected documents can then be annotated using the available lexical resources (see Section 2.5 for details). A final step is the visualization of the annotation results and the manual curation, if the user desires to correct errors and enrich the lexical resources (see Section 2.6). The next sections give further detail on the operations mentioned in this brief introduction. Chapter 2: Main functionalities of Note Basics. 2.1 Note Projects. 2.1.1 Creating a new project. To create a new project, the user chooses the corresponding operation in the File menu, selecting New Project. [Figure 2.1: Creating a new project.] In the following popup window, a name for the project has to be chosen. When the name is set, the Validate button is pressed. If no project with the chosen name exists, it is accepted and the user is able to proceed with the configuration. At this point there are two mandatory fields to configure (Figure 2.2): 1. Firstly, the Local Documents Path is set, which is the local folder