APP: an Automated Proteomics Pipeline for the analysis of mass

Contents

1. Figure 7: The unzipped client folder. When the graphical user interface opens, you might first want to look at the Execution setup tab (for details, see section [...]). Once this is done, click the Client setup tab (see Fig. 8). Figure 8: The client setup tab. This will show the Manual setup tab. If you want to, you can manually add the URL or IP number of the server here. If you want this worker to auto-connect to your server, click the Autoconnect on startup checkbox. Once done, click the Find server tab.
2. Figure 19: Task creation tab with popup menu. understand; it is recommended to use mzML. Finally, search engines need information about which database, post-translational modifications and mass tolerances to use. This is supplied via the General Search Settings plugin. In addition
3. Figure 2: A blank server GUI. For now, click on the Execution setup tab to set some standard options (see Fig. 3). There are a number of options available here; most should be left alone. Figure 3: The execution setup tab. The options are: Set base directory: Sets APP's base directory; temporary files, tasks and other things will be stored under this directory. By default it is the directory containing APP, but you can put it anywhere, such as on a drive with more space. On Linux, be aware of the folder's permissions, since the directory needs to be both readable and writable. Change webserver directory: This directory is referenced by APP as the base folder for TPP-derived web utilities such as spectrum viewers and results viewers. If you installed using standard pat
4. sign button at the base of your plugin (see Fig. 26). Click the button on your MsConvert plugin. The button should turn red. Figure 26: Initiating a plugin link. • Now, directly afterwards, click anywhere on the X!Tandem plugin box. Do this until the button stops being red. Now, when you hover your cursor over MsConvert, X!Tandem should appear red. Vice versa, hover over X!Tandem and MsConvert should appear blue. Click the button on MsConvert again and then click the Comet plugin. Note: You can hold the Shift key to allow linking or unlinking multiple plugins at once, i.e. click or hold shift
5. Denoise MS2 spectra: Denoises MS2 spectra to remove random signal. Can greatly increase the number of hits for X!Tandem and other search engines. Fix titles for Mascot: If the MsConvert plugin is being used to generate data for a Mascot search, this option should be checked. It has no negative effect on other data and as such is left on by default. Perform peak picking: MsConvert will centroid the peaks, usually greatly boosting database search hits. Peak picking levels: Decides which MS levels centroiding applies to when the Perform peak picking option is selected. The default is to centroid all MS levels from 1 and up. Prefer vendor centroiding: Vendor centroiding algorithms are provided to the MsConvert team by various mass spectrometer vendors and usually ensure the best possible centroiding. This is not available for all vendors; if it is not present, centroiding will act as if this option is set to false, and centroiding will be performed by a local-maximum-seeking algorithm. Use specific mass range: Keeps only peaks between the m/z values specified in the linked options Minimum m/z and Maximum m/z. Apply to MS level: Decides which MS levels will have peak filtering done; defaults to only MS2, i.e. the mass range and peak filtering options are only applied to the MS levels specified here. Keep only specific peaks: Decides if peak filtering will be performed at all; defaults to off. Linked settings are Coun
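As an illustration of what the mass range and peak filtering options described above do conceptually, here is a minimal Python sketch. It is not MsConvert's implementation: the function names, the (m/z, intensity) tuple layout and the inclusive range bounds are assumptions made for this example only.

```python
from typing import List, Tuple

# Assumed layout for this sketch: one peak is (m/z, intensity).
Peak = Tuple[float, float]


def keep_mass_range(peaks: List[Peak], min_mz: float, max_mz: float) -> List[Peak]:
    """Keep only peaks whose m/z lies between min_mz and max_mz.

    Bounds are treated as inclusive here; that is an assumption.
    """
    return [(mz, inten) for mz, inten in peaks if min_mz <= mz <= max_mz]


def keep_most_intense(peaks: List[Peak], count: int) -> List[Peak]:
    """Keep the `count` most intense peaks, returned in m/z order."""
    top = sorted(peaks, key=lambda p: p[1], reverse=True)[:count]
    return sorted(top, key=lambda p: p[0])


# Illustrative spectrum, values invented for the example.
spectrum = [(99.5, 10.0), (150.2, 300.0), (800.4, 50.0), (2100.0, 999.0)]
in_range = keep_mass_range(spectrum, 100.0, 2000.0)
strongest = keep_most_intense(spectrum, 2)
```

With a 100 to 2000 m/z window, the peaks at 99.5 and 2100.0 are dropped regardless of their intensity, which is why range filtering and intensity filtering are separate settings.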
6. Figure 30: Add decoy option to PeptideProphet plugin settings. Finally, have a look at the search engine settings. Click Setup on the X!Tandem plugin, then make sure the default KScore parameters are selected (see Fig. [...]). Semi-tryptic parameter files for Tandem native score and KScore are also provided, but these are more computationally intensive. Figure 31: X!Tandem settings, choose the KScore parameter file. There i
7. Figure 42: Quickdownload will download files to your interface's download directory. Files are downloaded to a folder named after the task name, plugin name and any label applied to the plugin. 8 Available Plugins. An aim for APP plugins is to provide complete compatibility with the excellent Trans-Proteomic Pipeline (TPP), and the bulk of the plugins aim to automate tasks done through TPP. A core distribution of APP also includes several tools not typically found in a vanilla TPP install, along with plugins included for easier automation. 8.1 General plugin function. Plugins in general come with two direct ways of interaction: a setup button and an add file button. Each plugin's behaviour is governed by its individual settings along with its input files. After completion, the plugin then passes on a selection of fi
8. Figure 33: Save task dialog. Click on the Task window menu and choose Submit current task. If you are fast, you can look at the ongoing file transfers by opening the Monitor transfers tab and clicking Update (see Fig. 34). Once all the transfers are done, your task should start processing.
9. Figure 34: Monitor transfers going to and coming from the running interface. 7.0.4 Monitor execution. Click on the View submitted tasks tab. Here all submitted tasks are shown in tree form. Expand the tree and you will find a list of execution indexes, counting up from 1. Double-click to expand any of these nodes to see the state of your task (running tasks are updated roughly once every 40 seconds). Under each plugin is a list of jobs performed by that plugin. Click on one to see any output so far. Click on the task or a plugin to get summary information. Wait for the task to show the status complete when you highlight it; to track ongoing task execution in detail, have a look at the individual plugin nodes. Right-click anywhere on a task and choose Open selected task to see a more familiar representation of your task (see Fig. 35).
10. Figure 39: The final protein list from our test search can be retrieved from the Pro
11. Status-area log excerpt (May 26 09:29:12 CEST 2014), repeated once per connection: setupConnection: Initializing output/input streams (socket addr 130.237.79.70, port 1466); setupConnection: Streams to server established; initiateHandshake: Initiating handshake of client Windows XP MassSpec CLIENT; sendClientID: Sent ID of this client Windows XP MassSpec CLIENT.
12. Status-area log excerpt (continued): the same setupConnection, initiateHandshake and sendClientID messages for client Windows XP MassSpec CLIENT repeat for local ports 2955 to 2958 (socket addr 130.237.79.70, port 1466).
13. and Comet, using settings provided by GeneralSearchSettings. Finally, the output will be analyzed by PeptideProphet, which gives a vetted set of results. iProphet will take these and build a combined search result file, which will then be processed by ProteinProphet into a final protein list. The final step here is to set up each plugin and add the input files. • Click Setup on MsConvert, check Perform peak picking, and make sure Levels 1 and above is the chosen option. This will mark peaks as centroided. Finally, select mzML as the output format. Figure 28: MsConvert settings.
14. be set up, such as a Windows machine using vendor-library-enabled MsConvert in a network of Linux machines that perform database searches. The server can be configured and started either via a command-line interface or through a rich GUI. Configuration files are stored as human-readable XML and simple text files. These can be edited directly or accessed through the interface for an easier setup. 3 Tasks. Anything worth doing in proteomics usually involves more than a few steps. Even in the simplest scenario, where one merely wants to analyze a specific gel band, several steps follow MS analysis. Typically, a workflow will need to process mass spectrometer raw data and pick out features from MS scans, convert the data into a format that MS search engines can handle, and then search the data using one or several database search engines. Any output is generally post-processed, either manually or in an automated fashion, to give a probability for all protein identifications. APP uses this model of the flow of data at its core: tasks are defined as a network of plugins, where each plugin pushes its resulting data into the next. In the example given, data would be processed and converted using the MsConvert plugin (preferably running on Windows), searched using any combination of the InsPecT, OMSSA, Comet, X!Tandem, MyriMatch and MSGFPlus search engines, and the output would be analyzed with PeptideProphet, iProphe
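The idea of a task as a network of plugins, each pushing its data into the next, can be sketched as a small directed graph. The Python below is only an illustration and is not APP's scheduler: the `links` dictionary and the `execution_order` helper are hypothetical, and the plugin names are simply the ones used in the tutorial. It relies on `graphlib.TopologicalSorter` from the Python standard library (3.9 or later).

```python
from graphlib import TopologicalSorter
from typing import Dict, List

# Hypothetical link table: each plugin maps to the plugins it feeds.
links: Dict[str, List[str]] = {
    "MsConvert": ["X!Tandem", "Comet"],
    "GeneralSearchSettings": ["X!Tandem", "Comet"],
    "X!Tandem": ["PeptideProphet (X!Tandem)"],
    "Comet": ["PeptideProphet (Comet)"],
    "PeptideProphet (X!Tandem)": ["iProphet"],
    "PeptideProphet (Comet)": ["iProphet"],
    "iProphet": ["ProteinProphet"],
}


def execution_order(plugin_links: Dict[str, List[str]]) -> List[str]:
    """Return one valid order in which every plugin runs after its inputs."""
    sorter = TopologicalSorter()
    for source, targets in plugin_links.items():
        sorter.add(source)
        for target in targets:
            # `target` depends on `source`, so `source` must run first.
            sorter.add(target, source)
    return list(sorter.static_order())


order = execution_order(links)
```

Several orders are valid for the same graph (X!Tandem and Comet can run in either order or in parallel), which is what makes this kind of task easy to distribute over several worker nodes.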
15. Table of contents (fragment): 8.6 Spectral search engine plugins; 8.6.1 SpectraST library builder plugin; 8.6.2 SpectraST search plugin; 8.7.1 PeptideProphet plugin; 8.9 Smaller utilities; 8.9.1 Spectrum name fixer plugin summary; 8.9.2 Label Free Data Extractor; 8.9.3 LibraProteinRatioParser; 8.9.5 Libra Normalizer; 8.9.6 Input file feeder. 1 Project info. • Project homepage: https://sourceforge.net/projects/automatedproteo • Discussion group: For support questions, please contact [...] 2 An introduction. Automated Proteomics Pipeline (APP) is a working name given to our efforts to gather a large amount of proteomics functionality under one common interface. It builds on the work of many in the proteomics community to offer an integrated user interface and distributed server infrastructure without many of the common difficulties of such setups. In a way it's a
16. each search engine has a set of specific settings, which we will ignore for the time being; you can read up on them under each specific plugin. Now set up a task to accommodate this. In the interface, click on the Create plugins tab. You should now have an empty white window in front of you (Fig. 19). Right-click in that window to bring up the plugin menu. The actions of the menu should not be too hard to grasp: they allow creation, deletion and cloning of plugins. Try it now: right-click the white space, then click Add plugin item, then Raw data conversion, and finally MsConvert plugin (see Fig. 20).
17. for test data. • Click Setup on GeneralSearchSettings. This will initiate a sync of available databases from the server. As such, if the database field stays empty for too long, please close the setup window and open it again, which should also display any new databases available. Choose a 50 ppm tolerance for MS1 and 0.2 Da for MS2. Choose the Populus database with decoys from the database menu (if you're using the base database set, this is named uniprot organismPopulusTrichocarpaUniprotReferenceProteomeDecoyPrefixDc; see Fig. 29), then add a Carbamidomethyl on C fixed modification and an Oxidation on M variable modification. Hint: to get around quickly, press the first letter of the modification you're looking for. The decoy database allows use of PeptideProphet's semi-parametric model, which is necessary for validation of most supported search engines. If you only wish to run X!Tandem, you can use one of the non-decoy databases.
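To make the two tolerance units concrete: a ppm tolerance scales with the mass being matched, while a Da tolerance is a fixed width. The Python sketch below works through the 50 ppm and 0.2 Da values chosen above. The function names are invented for this example, and the symmetric window around the theoretical value is an assumption; individual search engines may apply tolerances differently.

```python
def ppm_window(mz: float, ppm: float) -> tuple:
    """Return the (low, high) m/z window for a relative tolerance in ppm."""
    delta = mz * ppm / 1e6  # 50 ppm of m/z 1000 is 0.05
    return (mz - delta, mz + delta)


def within_ppm(observed: float, theoretical: float, ppm: float) -> bool:
    """True if the observed value is within `ppm` of the theoretical value."""
    return abs(observed - theoretical) <= theoretical * ppm / 1e6


def within_da(observed: float, theoretical: float, tol_da: float) -> bool:
    """True if the observed value is within an absolute tolerance in Da."""
    return abs(observed - theoretical) <= tol_da


# MS1: 50 ppm around a precursor at m/z 1000 (value chosen for illustration).
low, high = ppm_window(1000.0, 50.0)

# MS2: 0.2 Da around a fragment at m/z 500 (value chosen for illustration).
fragment_ok = within_da(500.15, 500.0, 0.2)
```

At m/z 1000 the 50 ppm window is 0.05 wide on each side, so it is considerably tighter than the 0.2 Da used for fragment ions.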
18. .NET installation to run properly. Most likely you have this already, but if you run into problems please reinstall .NET from http://www.microsoft.com. 6.1.2 Client/Server installation on Ubuntu 12.04/14.04. Download APPUbuntu14.04Bundled.zip from the SourceForge page and unpack the zip file. Using a terminal, invoke: sudo ./UbuntuInstaller.sh install. Note that this will install a web server, all Linux-compatible search engines and TPP 4.7.1 rev 0 (will be updated each revision), and open ports 80 and 1466 in iptables. You will be asked to confirm each of these steps at the command prompt. To start the APP server tool, type APPServer at the terminal. To start the APP interface, type APPInterface at the terminal. The TPP and APP directories can all be found under /usr/local/tpp. To make APPServer start automatically at boot time, type sudo update-rc.d APPServer defaults at the terminal. If you have set APP up as an auto-connecting client node, you can make the connection start automatically when the computer turns on by running sudo update-rc.d APPClient defaults. 6.1.3 Initial server/client setup using the GUI. The first thing that will greet you upon starting the server GUI is a blank status window (see Fig. 2). By default the outputs of APP are not sent here, but can be made to do so.
19. plugin. Figure 23: Adding a Comet search engine plugin. • Now, to add the plugin that will provide search settings, add the GeneralSearchSettings plugin from the Search Engine category (see Fig. 24).
20. the APPServer directory. If your current JVM is not at least 1.7.0, you will be directed to a download page. Please note that Oracle bundles Java with the Ask toolbar; make sure to uncheck the selection for this during installation, since it is unrelated bloatware. • To run a full APP server, it is best to also install the Trans-Proteomic Pipeline. • First, download and install ActivePerl Community Edition from http://www.activestate.com/activeperl. Download the appropriate version (either 32- or 64-bit) and double-click the downloaded .msi file to install. • Download and install the latest TPP version from the Sashimi project page at http://sourceforge.net/projects/sashimi. Note that SourceForge has been known to bundle bloatware inside installers; though this does not seem to have affected TPP so far, users should pay attention to any check boxes to avoid installing adware. • Start the server by double-clicking APPServer.exe (if Windows hides the extension, it should say Application as file type). Now your server should either start up or you will be taken to the correct webpage for downloading Java. In the second case, download and install Java (note that on Windows Java is also bundled with bloatware; say no to any other suggested programs) and then execute APPServer.exe again. • See section [...] for further instructions; you now have the brain of your distributed computing network. • Note: ProteoWizard (idconvert, msconvert) needs a
21. your task should look like the one in Fig. 25. Also try double-clicking the Double click to add label text on the PeptideProphet plugin. Add a label to track which one is which. Figure 25: Layout of the final task. Make sure you arrange the plugins in roughly the same order as pictured, since each task maintains a flow from left to right, starting the leftmost plugin in a linked chain first. • To direct a plugin to feed its output into another, click the
22. Available output formats: General search settings None None gsp. Figure 43: The general search settings setup window. • Fixed/dynamic modifications: An export of the Unimod database is provided with the plugin; this provides a large set of modifications for use in searches. • MS1/MS2 mass tolerance, as well as preference for which to use. In b
23. Figure 24: Adding a GeneralSearchSettings plugin. • Now, on your own: from the Data Processing category, add two PeptideProphet plugins to the right of the search engine plugins, then an iProphet plugin one step to the right of those, and finally a ProteinProphet plugin to the right of that one. Drag and drop the plugins if you place them in the wrong place. When you're done
24. Figure 10: Status area of connecting worker. A better way to check if your client is connected is via an APP interface (see section 7.0.2). Click the Server actions menu at the top of the window and then the Get server report button; you will receive a list of connected workers and their current workload (Fig. 11).
25. Figure 38: Output from the X!Tandem search engine.
Looking through the output you will find some entries with a protein name starting with Dc. These are the decoy spectrum matches and should be filtered out in the next step. Click Browse files on the ProteinProphet plugin and open ProteinProphet.prot.xml in the same way to get the final output (Fig. 39).
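The decoy filtering mentioned above happens inside the pipeline, but the idea can be sketched in a few lines. This is a minimal illustration only, assuming hits are available as simple records with a protein name; the function names, the record layout and the sample values are hypothetical and are not part of APP.

```python
# Hypothetical sketch: drop decoy matches by protein-name prefix.
# The "Dc" prefix is the one mentioned in the tutorial text; the record
# layout (dicts with a "protein" key) is an assumption made for illustration.
DECOY_PREFIX = "Dc"


def is_decoy(protein_name, prefix=DECOY_PREFIX):
    """Return True if the protein name marks a decoy entry."""
    return protein_name.startswith(prefix)


def drop_decoys(hits, prefix=DECOY_PREFIX):
    """Keep only hits whose protein name does not carry the decoy prefix."""
    return [hit for hit in hits if not is_decoy(hit["protein"], prefix)]


# Made-up example records, not taken from the tutorial data.
example_hits = [
    {"peptide": "PEPTIDEA", "protein": "Dc00001"},
    {"peptide": "PEPTIDEB", "protein": "TARGET_1"},
]
kept_hits = drop_decoys(example_hits)
```

In the tutorial itself no scripting is needed for this step; the sketch only shows what "filtering out entries starting with Dc" amounts to.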
26. Figure 35: See all currently running tasks; by default only tasks from the last 30 days will be displayed.
Click Browse jobs on any of the plugins to get detailed information on the execution (Fig. 36).
Figure 36: View the details of all operations performed by a plugin in the Browse output and jobs view.
Click Browse input and output files for the X!Tandem plugin. Highlight any of the files ending with pep.xml and click Open (Fig. 37).
27. User manual
APP: an Automated Proteomics Pipeline for the analysis of mass spectrometry data based on multiple open access tools
Erik Malm
Contents
3 Tasks
4 The server
6.1.1 Client/Server installation on Windows
6.1.2 Client/Server installation on Ubuntu 12.04/14.04
6.1.3 Initial server/client setup using the GUI
6.1.4 Connecting an additional worker using the GUI
6.2 Configure on the command line
6.3 User accounts and access keys
6.3.1 Activate user accounts
6.4 Add a client access key
6.5 Test dataset search
6.5.1 Interface installation (Windows/Linux)
7 Tutorials and sample datasets
7.0.3 Build a multiple search engine task
7.0.5 Work with output from previous task
8 Available Plugins
8.1 General plugin function
8.2 Raw data conversion plugins
8.2.1 MSconvert plugin
8.3 Database search engine plugins
8.3.1 General Search Settings plugin
8.3.2 X!Tandem
8.3.3 MyriMatch
28. al most options should be left as default.
8.2.3 PKL to MGF
Table 4: PKL to MGF plugin summary. Plugin name: PKL to MGF. Needed binaries: None. Accepted input formats: PKL (XML file processed by BioTools). Available output formats: mgf.
PKL files are here converted to MGF, typically for further conversion into mzXML and use in the pipeline. The interface offers no options, and any tweaking of the raw data should be done in a separate MsConvert plugin step.
8.3 Database search engine plugins
Several MS search engines are provided with the default APP installation. All search engines take their basic input for mass tolerances, database to search, fixed and dynamic amino acid modifications etc. from a single plugin. As such, all search engine plugins expect at least three input files: a fasta database to search, a gsp settings file from the General Search Settings plugin (described below) and at least one data file in mzXML or mzML format (mzXML has broader compatibility).
8.3.1 General Search Settings plugin
All search engine plugins take their basic input for mass tolerances, database to search, fixed and dynamic amino acid modifications etc. from a single plugin. This allows search settings to be described only once and then utilized for all search plugins in the current task. The settings handled through the general search settings plugin are
Table 5: General search settings plugin. Plugin name, Needed binaries, Accepted input formats
29. and then click on any plugins.
• Click the plus button on the X!Tandem plugin and then click PeptideProphet.
• Hover your cursor over plugins to see their inputs and outputs (see Fig. 27 for an example). Any input plugin will be highlighted as blue, whereas a plugin targeted for output will appear red.
Figure 27: Checking plugin input and output.
• To link up the rest of the plugins, link GeneralSearchSettings to both X!Tandem and Comet. Link Comet to the PeptideProphet plugin that does not have X!Tandem linked to it.
• Link both PeptideProphet plugins to iProphet and finally link the iProphet plugin to the ProteinProphet plugin.
This is our chain: files will be converted from MGF to mzML in the MsConvert plugin and then searched by X!Tandem
30. Figure 40: The LabelFreeDataExtractor plugin allows you to extract info from prot.xmls and pep.xmls into tabular formats.
Start by adding the files from your previous task to this plugin as detailed below, or see Fig. 41.
• Click Add input files on the plugin.
• Click the Browse compatible files on server button. A sync of files available on the server should now commence. Syncing info on available files from the server can take a few seconds; if the display of files is empty, try closing the window and clicking the button again.
• A list should now appear as below (Fig. 41). If there are many files you can type the name of your task in the Category filter text field, or any part of the file name in the File name filter text box. Now navigate to the output of ProteinProphet, highlight the ProteinProphet xml file and click Add vaultfiles (Fig. 41).
• Save, name and submit your task as previously.
31. Figure 52: ProteinProphet plugin settings window.
than Spectrum number. As such, any pep.xml output from MSGFPlus and Myrimatch needs to be processed through this plugin before using iProphet or any quantitation tools other than spectral counting. Raw data files are parsed to match index to spectrum number; if data files have been moved since the search, they can be set as input for the plugin. The plugin has no settings; merely feed it a pep.xml file from IDConvert (MSGF Plus) or from Myrimatch.
8.9.2 Label Free Data Extractor
The label free data extractor is a utility for extracting information from pep.xml and prot.xml files. The plugin performs basic quantitation such as spectral counting and extraction of Total Ion Current for proteins. The plugin has a number of filter settings to determine which proteins, peptides and peptide spectrum matches get included in amount estimations.
Plugin name, Needed binaries, Acc
32. chose Open with OpenJDK 7 from the right click menu.
7 Tutorials and sample datasets
7.0.2 Starting up
Unpack the provided APPInterface.zip file. This will give you a directory containing APPInterface.jar and, for Windows users, APPInterface.exe. You will need Java 7 to open the program, either OpenJRE7 (available in your local Linux repository) or Oracle JRE7 (www.java.com). On Windows, double click APPInterface.exe to run a Java check, auto-set the reserved memory and start the interface. On Linux, double click the launch.sh file and wait for the APP interface to load. A small connection dialog should pop up with three tabs (see Fig. 17).
Figure 17: Connection dialog.
If your server is on the local network, simply click the Detected services tab, then the Detect server button, and wait for your server to be detected (see Fig. 18). Head back to the Manual settings tab and click connect. If your server is not detectable, i.e. not on the local network, you will have to enter the URL manually into the connection dialog.
Figure 18: Detection tab of connection dialog.
If user management is enabled you will now be prompted for a username and password combination. Once connected the firs
33. Figure 48: MSGFPlus search settings.
Fragmentation type: Collision induced dissociation and Electron induced dissociation models are supported.
Instrument rule: Supported rules include high and low sensitivity ion trap instruments along with Time of Flight instruments.
Enzyme used: Determines MSGFPlus expected cleavage rules.
Protocol setting: Protocols allow focused detection of various features such as phosphorylated peptides or iTRAQ tags.
Number of tryptic termini: Determines if non-enzymatic peptides are considered.
Table 10: InsPecT search plugin. Plugin name: InsPecT plugin. Needed binaries: InsPecT.exe; msconvert.exe (unless input is 32-bit, non-zlibbed, non-gzipped mzXML). Accepted input formats: mzML, fasta, gsp. Available output formats: pep.xml.
8.5.1 InsPecT plugin
InsPecT integration does not yet implement the unrestricted search of the search engine. Most settings are provided by a gsp file from a settings plugin.
34. ders or indexed folders. Do the same for any folders where you want the server to index files. Examples of folders to index are directories containing raw data or fasta databases. This will make them show up under Browse compatible files on server when adding input to plugins later on. If you want databases to be shown in the General search settings plugin, put them in a subfolder called dbase.
Switch to the Activate/Deactivate Plugin Execution tab; here you can decide which plugins should be executed by a client or server (see Fig. 5).
Figure 5: Activate/Deactivate plugin execution.
Note that only
35. des information and files needed for a task automatically, so no data files need to be directly made available to the workers, though shared storage can also be used. See figure 1 for a schematic, or see table for an overview of APP site content.
Figure 1: A schematic server setup showing three workers processing a job provided by the interface.
5 The interface
Figure 1 shows a schematic server/worker structure. The APP interface is where all user interaction will take place. The interface allows users to build a proteomics task, mixing files stored on a local computer with files stored on the server, and to monitor the execution of any tasks currently running on the server. Please see the tutorial section for using the interface together with our provided tutorial data.
6 Installation and use
6.1 Installation
Installation differs slightly depending on whether you wish to run a full server or merely add another processing node to an already running server. The default zip bundles contain all files needed to add a database search engine node and will include executables for MSGFPlus, Myrimatch, X!Tandem, OMSSA and Comet. Included is also the open source components
36. Figure 9: The client setup tab.
Here you can click the Detect server button; once the server has been detected you can click connect (see Fig. 9). If you entered the server URL on the manual setup tab you can click connect without detecting. Your worker node is now connected to your server. Note that if your server is not available, the worker node will keep trying until a server is found, forever. This means that if the server goes offline the worker clients will auto-reconnect. If you clicked the Output console and log to status area option during Execution setup, you can double check the connection progress by clicking the Status tab (Fig. 10).
37. epted input formats, Available output formats. Plugin name: Label Free Data Extractor plugin. Needed binaries: None. Accepted input formats: pep.xml, prot.xml. Available output formats: proteins.txt, peptides.txt, hits.txt, combined.txt.
• Create amount estimations from TIC values: Will parse source mzXML/mzML files to get the Total Ion Current for each spectrum. This will be used to calculate a total and an average for each protein, giving a label-free way of estimating amount.
• Export spectral count: Will count all spectra that pass filtering for each peptide and protein.
Figure 53: Label Free Data Extractor plugin settings window.
Extract Xpress calculated label free values: If the fi
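The two estimates described above (a spectral count, and a total and average Total Ion Current per protein) can be illustrated as follows. This is a minimal sketch under stated assumptions: the input is taken to be a list of (protein, TIC) pairs for spectra that already passed filtering, and the function name and sample values are hypothetical rather than APP's own code or data.

```python
from collections import defaultdict


def summarize_tic(filtered_matches):
    """Summarize label-free estimates per protein.

    filtered_matches: iterable of (protein_name, tic) pairs, one per
    spectrum that passed filtering (assumed layout, for illustration).
    Returns {protein: {"spectral_count", "total_tic", "average_tic"}}.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for protein, tic in filtered_matches:
        totals[protein] += tic
        counts[protein] += 1
    return {
        protein: {
            "spectral_count": counts[protein],
            "total_tic": totals[protein],
            "average_tic": totals[protein] / counts[protein],
        }
        for protein in totals
    }


# Made-up values purely for demonstration.
summary = summarize_tic([("P1", 100.0), ("P1", 300.0), ("P2", 50.0)])
```

How the plugin distributes shared peptides between proteins is governed by the separate weighting option and is not modelled here.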
38. ghted averages for shared peptides: Will use ProteinProphet assigned peptide weights to distribute spectrum counts/TIC between proteins with shared peptides.
8.9.3 LibraProteinRatioParser
Allows running LibraProteinRatioParser on prot.xml files, calculating protein ratios.
Plugin name: LibraProteinRatioParser plugin. Needed binaries: LibraProteinRatioParser.exe. Accepted input formats: prot.xml. Available output formats: prot.xml.
• Condition file: The condition file to use when calculating ratios.
8.9.4 Spectractor
The Spectractor plugin uses wkhtmltopdf and the Comet spectrum viewer (NOT the search engine) to extract PDF images of spectra from a prot.xml.
Plugin name: Spectractor plugin. Needed binaries: wkhtmltopdf.exe; comet spectrum viewer on server. Accepted input formats: prot.xml. Available output formats: pdf.
• Condition file: The condition file to use when calculating ratios.
Filtering options match those of Label Free Data Extractor, with a few additions. Only the extra options will be covered here.
• cgi-bin directory prefix on server: Typically will be tpp-bin for a server running on Windows and cgi-bin for a server on Linux.
• Keep only top spectrum per peptide: Will only export a pdf of the top ranking PSM for each peptide.
• Extract only proteins with one peptide: Many publications demand the submission of spectra from single-peptide hits; this allows an easy way to extract only spectra from such proteins.
8.9.5 Libra Normalizer
Lib
39. h 5 being best will be removed from the created spectral library.
• Mark at level: Spectra with a quality level will be marked in the library; this is a means of keeping track of less than perfect spectra.
• Minimum probability for library: Only spectra at or above this Peptide Prophet or iProphet probability will be included in the library.
• Generate decoy spectra in library: The final library will have decoy spectra generated. These spectra will have an associated protein name starting with DECOY.
• Ratio of decoy spectra: Ratio of decoys to real spectra in the final splib.
• Create a consensus library: Will create a consensus library, keeping a merged spectrum from all available spectra for each peptide. This is necessary for decoy generation.
8.6.2 SpectraST search plugin
The SpectraST search plugin performs searches against created spectral libraries.
Table 13: SpectraST spectral search plugin. Plugin name: SpectraST spectral search plugin. Needed binaries: spectrast.exe. Accepted input formats: fasta, gsp, mzML, splib. Available output formats: pep.xml.
Table 14: PeptideProphet plugin. Plugin name: PeptideProphet plugin. Needed binaries: xinteract.exe, InteractParser.exe, PeptideProphetParser.exe, InterProphetParser.exe, ProteinProphetParser.exe, additional TPP provided parsers. Accepted input formats: pep.xml; fasta (remaps pep.xml); mzXML/mzML (remaps pep.xml). Available output formats: pep.xml.
8.7 Da
40. hat the same data searched with different search engines should retain the same experiment name.
• Ignore charge states: Peptide Prophet will ignore spectra that have precursors in these charge states when analysing data.
• Use command line instead of GUI options: This will feed the contents of the text field below directly to the xinteract command. It allows use of more advanced xinteract options that are not available through the GUI.
• Enzyme list (default Trypsin): Provides InteractParser with info on which enzyme is used; this is also used for later analysis.
• Libra settings: Brings up settings for iTRAQ analysis. For details see the tutorial section.
• ASAPRatio settings: Settings for analysing isotopically labelled samples.
• XPRESS settings: Secondary utility to handle isotopically labelled samples; simpler settings but less advanced than ASAPRatio.
• Prophet options: Used to enable iProphet analysis and more options. Also sets options for using PTM Prophet on data; for more info on these options see the iProphet and PTMProphet plugins.
8.7.2 iProphet plugin
iProphet provides a powerful meta-analysis tool, allowing combination and analysis of results from several experiments, search engines or samples. Settings are used to enable or disable iProphet's scoring models; for more info on these please see the iProphet source paper. This plugin is the fir
Table 15: iProphet plugin. Plugin name, Needed binaries, Accepted input formats, Avai
41. he following: the dat file, a fasta database file, all the original mzXML/mzML files.
6. Feed the output pep.xml files to a PeptideProphet plugin.
8.9.8 IDConvert
IDConvert is a ProteoWizard provided tool for conversion between MSMS results formats (pep.xml, prot.xml and mzIdentML); it also provides output as its own internal text format. The main use of IDConvert within APP is providing a way to convert MSGF Plus derived mzID files into useful pep.xml files.
Plugin name: IDConvert. Needed binaries: idconvert.exe. Accepted input formats: pep.xml, prot.xml, mzid. Available output formats: mzid, pep.xml, txt.
Figure 54: The single setting of idconvert.
Note that conversions for searches done with mzXMLs will not work, and some conversions from mzXMLs converted into mzMLs also fail. It is best to keep the whole pipeline using mzML if possible.
42. hs you can leave it alone, but this can also be set manually here.
Set directories containing executables: The executable folders will be checked by APP for any needed binaries such as search engines, data conversion utilities etc. The default should be fine, but binaries can be stored anywhere; this should be set here.
Set local directories for server to index: Add folders here that contain files that should be accessible to all users. Examples of this are fasta database files for use with search engines, or directories containing raw data from the mass spectrometer. These folders will be monitored for the addition of more files, which will then be presented to the interface when queried. For example, APP comes with a set of standard databases but will also display any other fasta or fa databases in the settings interface if they are placed in a subfolder called dbases. Any other files will be available if you click Browse files on server, allowing use of files on the server rather than the user's desktop.
Set directories to preindex for files: On a client, commonly used files can be stored in pre-specified directories, preventing the need to transfer databases or other such files. Any directory provided here will be indexed, and files with a matching size and name to any transferred file will instead be retrieved from the local directory. Can also be used to map shared storage.
Run only server only applications on server: Only re
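The pre-indexing rule described above (a file is taken from a local directory when its name and size match the file that would otherwise be transferred) can be sketched as a lookup keyed on name and size. This is an illustration only: the function names are hypothetical and the code is not taken from APP, which is a Java application.

```python
import os
import tempfile


def build_preindex(directories):
    """Map (file name, size in bytes) to a local path for every file found."""
    index = {}
    for directory in directories:
        for root, _subdirs, files in os.walk(directory):
            for name in files:
                path = os.path.join(root, name)
                # Keep the first path seen for a given (name, size) key.
                index.setdefault((name, os.path.getsize(path)), path)
    return index


def find_local_copy(index, name, size):
    """Return a local path if name and size both match, otherwise None."""
    return index.get((name, size))


# Small demonstration using a temporary directory and a made-up file.
with tempfile.TemporaryDirectory() as folder:
    example_path = os.path.join(folder, "example.fasta")
    with open(example_path, "wb") as handle:
        handle.write(b">seq1\nPEPTIDE\n")
    example_size = os.path.getsize(example_path)
    index = build_preindex([folder])
    hit = find_local_copy(index, "example.fasta", example_size)
    miss = find_local_copy(index, "example.fasta", example_size + 1)
```

Matching on name and size alone is cheap but cannot distinguish two different files that happen to share both, so pre-indexed directories are best reserved for stable files such as databases.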
43. Figure 11: Server report received by interface.
A final way to see which workers are currently connected is to use the connected clients dialog of the APPServer by toggling its visibility in the View menu (see figure 13).
Figure 12: Connected clients dialog of server running with a GUI.
APP is routinely tested with up to a hundred such nodes connected, and standard heavy usage is typically handled by around 30 clients connected to a server. However, for most tasks, depending on the power of the nodes in question, a few nodes are more than sufficient.
6.2 Configure on the command line
If you want to configure a server without using the GUI, the correspo
44. ks Msconvert conversion can get very close to the results from vendor specific post-processing with a bit of tweaking. Options that can be set from the GUI include the following:
• Output format: Controls the mz format of the output. Options are mzXML (recommended for maximum compatibility with APP tools), mzML and MGF.
• gzip output file / zlib peak list: Compresses the output file or peak list to reduce space usage, though some search engines do not deal well with either gzip or zlib. As such it is recommended to leave this off and instead keep only MS2 and above spectra, as well as perform deisotoping, denoising and peak picking, as this will also reduce output file size.
• Precision from conversion: Precision value in bits for both MS1 and MS2 data. 32 is recommended since some search engines have trouble handling 64 bit precision. For most tasks the lost precision does not affect accuracy.
• Keep only MS2 and above: Will filter out all MS1 scans. This will reduce data size and make conversion much faster if the data is for database search or de novo sequencing.
• Sort spectra by scan times: Orders spectra in the produced file by scan time. Is set by default since it makes spectra comparisons between different converted files more consistent, but has little effect on the workflow.
• Deisotope MS2 spectra: Deisotopes all peaks in MS2 spectra. This is usually a great aid for MSMS search software but is left off by default.
http://proteowizard.sourceforge.net
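To make the GUI options above more concrete, the sketch below maps them onto an msconvert command line. It is an illustration under assumptions, not a description of how APP invokes the tool: the function and parameter names are hypothetical, and the flags (`--mzXML`, `--mzML`, `--mgf`, `--32`, `--64`, `--gzip`, `--zlib`, and the `msLevel`, `sortByScanTime` and `MS2Deisotope` filters) should be checked against `msconvert --help` for the installed ProteoWizard version.

```python
def build_msconvert_args(input_file, out_format="mzXML", precision=32,
                         ms2_and_above=True, sort_by_scan_time=True,
                         deisotope=False, gzip=False, zlib=False):
    """Build an msconvert argument list from GUI-style options (hypothetical helper)."""
    format_flags = {"mzXML": "--mzXML", "mzML": "--mzML", "MGF": "--mgf"}
    if out_format not in format_flags:
        raise ValueError("unsupported output format: %s" % out_format)
    if precision not in (32, 64):
        raise ValueError("precision must be 32 or 64")

    args = ["msconvert", input_file, format_flags[out_format]]
    args.append("--32" if precision == 32 else "--64")
    if gzip:
        args.append("--gzip")   # compress the whole output file
    if zlib:
        args.append("--zlib")   # compress the peak lists only
    if ms2_and_above:
        args += ["--filter", "msLevel 2-"]
    if sort_by_scan_time:
        args += ["--filter", "sortByScanTime"]
    if deisotope:
        args += ["--filter", "MS2Deisotope"]
    return args


# Defaults mirror the recommendations in the text: mzXML, 32 bit, MS2 and
# above only, sorted by scan time, no compression, no deisotoping.
default_args = build_msconvert_args("sample.raw")
```

The list form can be passed to `subprocess.run` unchanged; the input file name here is a placeholder.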
45. laible output formats. Plugin name: iProphet plugin. Needed binaries: InterProphetParser.exe. Accepted input formats: pep.xml. Available output formats: pep.xml.
8.8 Protein prophet plugin
Utilizes the Protein Prophet tool to create protein lists for finished experiments. Can also import results from LIBRA, XPRESS or ASAPRatio to show protein level quantification.
Table 16: ProteinProphet plugin. Plugin name: ProteinProphet plugin. Needed binaries: ProteinProphet.exe. Accepted input formats: pep.xml. Available output formats: prot.xml.
8.9 Smaller utilities
8.9.1 Spectrum name fixer plugin summary
Table 17: Spectrum name fixer plugin. Plugin name: Spectrum name fixer. Needed binaries: None. Accepted input formats: pep.xml, mzxml, mzml. Available output formats: pep.xml.
Spectrum name fixer will handle differences with regard to search engine output not reporting the same spectrum names, such as in the case of Myrimatch and MSGF Plus, where spectrum Index is reported rather
http://sourceforge.net/projects/sashimi
46. le has been analyzed using Xpress label free settings, you can extract the values into a more convenient text file.
• Use only unique peptides: Will keep only peptides with a unique unmodified sequence, i.e. NO shared peptides will be included in calculations.
• Use only non-degenerate peptides: Filter out all peptides that are marked as degenerate by Protein Prophet.
• Minimum peptide weight: Keep only peptides assigned above this peptide weight.
• Minimum protein probability: Keep only peptides and PSMs from proteins with this or higher Protein Prophet probability.
• Minimum peptide probability: Keep only peptides with a ProteinProphet assigned probability over this value.
• Minimum peptide probability (Peptide Prophet, individual matches): Minimum iProphet/Peptide Prophet probability for individual spectrum matches to be included.
• Comma separated accession numbers/protein names: Only proteins matching these will be considered.
• Protein properties: Any property stored in the prot.xml can be exported by adding it here. Add the names with comma separation.
• Extra properties to export: These are properties stored in the prot.xml.
• Extra properties for hits to export: Properties of PSMs from Peptide Prophet, i.e. precursor mass or charge etc.
• Export only extra fields: Will ignore standard fields such as spectrum, protein name etc. and instead only export properties specified by the user.
• Use protein prophet weights to calculate wei
47. les for further processing down the pipeline. This is not necessarily every file generated by the plugin, such as in the case of the PeptideProphet plugin, where pep.xml outputs are passed on even if a prot.xml is also generated.
8.2 Raw data conversion plugins
Mass spectrometry data comes in a large variety of sizes and formats, in spite of many attempts to standardize. While a push towards the mzML format can be seen throughout industry, the most compatible format in terms of tools utilizing it is for the present still mzXML, and this is the preferred format used by APP. mzML is still available for the tools that handle it well. All plugins in the Raw Data conversion category are used to process, convert or otherwise modify mass spectrometry data. Individual plugins are further described below.
http://sourceforge.net/projects/sashimi
8.2.1 MSconvert plugin
Table 2: MsConvert plugin summary. Plugin name: MsConvert plugin. Needed binaries: MsConvert.exe. Accepted input formats: raw (Windows only), d (Windows only), mzXML, mzML, mgf. Available output formats: mzXML, mzML, mgf.
The msconvert plugin provides an interface to the excellent MSconvert utility created by the ProteoWizard project. Msconvert can convert a number of vendor specific formats to the open mgf, mzXML or mzML formats. Additional functionality includes a number of filters, such as keeping only the top peaks in an MS2 spectrum, and functions to denoise, deisotope and centroid pea
48. levant for the server; some plugins are tagged to not be distributed to clients. Examples of this are PeptideProphet, iProphet and ProteinProphet, where all files referenced in pep.xml files are expected to be accessible to the program. If this setting is checked the server will execute ONLY such programs, which can prevent long running searches from blocking relatively quick validation steps. Leave this unchecked unless you plan to attach one or more clients to the server.
• Output console and log to status area: If running in GUI mode, the standard output and logging operations of APP can be redirected to show in the Status tab. Check this box temporarily if you want to find out what is happening in the guts of the program.
• Use file caching: Only relevant to APP running as Client. Will cache transferred files. This means that repeatedly transferred files, such as databases or raw data files, will not be deleted immediately but stored until a predetermined size limit is reached, preventing the need to transfer files multiple times. This should be enabled on most clients. Files below 5 megabytes are never cached.
• Timeout multiplier: If a client or server is slow it might need extra time to complete the jobs given to it. The timeout multiplier allows one to increase or decrease the total time available to each job between status updates. Setting the multiplier to 2 would give each job twice as long to run before a timeout and set
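The caching threshold and the timeout multiplier described above are simple rules, sketched here for illustration. The function names, the base timeout value used in the example and the choice of decimal megabytes are assumptions; the manual does not state how APP measures the 5 megabyte limit or what the base timeout is.

```python
# Assumption: "5 megabytes" is read as 5,000,000 bytes; APP may use a
# different definition.
MIN_CACHE_BYTES = 5 * 1000 * 1000


def should_cache(file_size_bytes, caching_enabled=True):
    """Files below the threshold are never cached, even with caching enabled."""
    return caching_enabled and file_size_bytes >= MIN_CACHE_BYTES


def effective_timeout(base_seconds, multiplier):
    """Scale the time allowed between status updates by the multiplier."""
    if multiplier <= 0:
        raise ValueError("multiplier must be positive")
    return base_seconds * multiplier


# A multiplier of 2 doubles the allowed time; the 600 second base value
# is a made-up number used only for this example.
doubled = effective_timeout(600, 2)
```

A multiplier below 1 shortens the allowed time in the same way.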
49. Figure 21: Adding input files to an existing plugin (drag and drop). You now have a plugin that will get input files and convert them into a different raw data format. Let's look at the details a bit later; for now, add a few more plugins. • Add one X!Tandem and one Comet plugin from the Search engine category (see figures 22 and 21). Figure 22: Adding an X!Tandem search engine
50. ltiple parts. All files contain a subset of the spectra from the original files. A value of 3 will split input files into 3 parts: a file containing 1000 MS/MS spectra would thus generate two files containing 333 spectra each, and the final file would contain 334 spectra. Splitting files allows large files to be distributed more easily. MSGFPlus, for example, has limited multi-thread support but can easily be run as several parallel processes on a split file. Output files are labelled part1, part2, etc., and will be compatible with all TPP/APP analysis tools. Note that SpectrumNameFixer should be run on output from MSGFPlus, IDConvert and MyriMatch to ensure spectrum names are consistent with TPP expectations. It is also possible to provide an MSconvert options file directly by uploading it, or to enter command line options directly into the provided text field, just as one would when running from the command line. 8.2.2 mzXML2Other. Table 3: mzxml2other plugin summary. Plugin name: MzXML2Other plugin. Needed binaries: mzxml2other.exe. Accepted input formats: mzXML. Available output formats: pkl, odta, dta, ms2. This TPP-provided conversion utility offers conversion from mzXML to a number of other formats, including MGF, dta and others. Conversion to mgf provides titling of the spectra, so that these can be searched through MASCOT and the results then remapped. This utility has mostly been replaced by MsConvert and will not be covered in great detail here; in gener
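The "Split into X parts" arithmetic described in the entry above (1000 spectra split into 3 parts giving 333, 333 and 334) can be illustrated with a short Python sketch. This is not APP's or MSconvert's actual code: the function names are hypothetical, and placing the whole remainder in the final part is an assumption chosen only because it reproduces the manual's example.

```python
def split_counts(total_spectra, parts):
    """Return how many spectra each part receives.

    Assumption: every part gets the integer share and the final part
    also receives the remainder, which matches the 333/333/334 example.
    """
    if parts < 1:
        raise ValueError("parts must be at least 1")
    base, remainder = divmod(total_spectra, parts)
    counts = [base] * parts
    counts[-1] += remainder
    return counts


def split_spectra(spectra, parts):
    """Split a list of spectra into labelled chunks: part1, part2, ..."""
    chunks = {}
    start = 0
    for index, size in enumerate(split_counts(len(spectra), parts), start=1):
        chunks[f"part{index}"] = spectra[start:start + size]
        start += size
    return chunks


if __name__ == "__main__":
    print(split_counts(1000, 3))
```

Each chunk could then be handed to a separate search process, which is the use case the manual gives for engines with limited multi-thread support.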
51. management. 6.4 Add a client access key. Computational nodes are not covered under the user accounts; however, APP servers can be made to only accept connections from clients with the same access key. To generate a secure access key, click the Generate secure connection key button in the server connection tab (see fig. 15). Any client wishing to connect will need to be provided with an identical key in the connection tab (see fig. 16). Figure 15: Generate an access key. Figure 16: Add an access key to a connecting client. 6.5 Test dataset search. 6.5.1 Interface installation (Windows/Linux). On our SourceForge page (https://sourceforge.net/projects/automatedproteo), download APPInterface.zip. Unzip the archive using your preferred method. On Windows, double-click APPInterface.exe; this will prompt a Java installation if a JVM is missing or outdated. On Ubuntu, right-click the jar file and check the Executable property. Then
52. Figure 49: InsPecT setup window. 8.5.2 OMSSA. Provides a plugin for the Open Mass Spectrometry Search Algorithm search engine. Most settings are provided by a gsp file from a settings plugin. OMSSA needs to be run on an MGF file, so input is converted to MGF before being searched with the plugin; alternatively, an MGF may be provided directly to the plugin. Table 11: OMSSA search plugin. Plugin name: OMSSA plugin. Needed binaries: omssacl.exe; msconvert.exe or mzxml2other.exe (if the file is not an mgf). Accepted input formats: mzML, mgf, fasta, gsp. Available output formats: pep.xml. 8.6 Spectral search engine plugins. Spectral search offers a complementary function to database search. It allows a department to store all identified peptides in spectral libraries along with their identified spectra. These spectra can then be used in spectral searches, which are much faster than database searches and highly sensitive. As such, tracking of specific peptides through a multitude of experiments is greatly facilitated by spectral libraries; spectral search can also be performed with a wider mass tolerance and will find any modified peptides from previously identified experiments. (http://proteomics.ucsd.edu/Software; http://www.ncbi.nlm.nih.gov/pubmed/15473683, download page no longer up.) Run mgf conversion using mzXML2Search. If msConvert, use titlemaker. High E value c
53. Figure 41: Files available from your previous tasks can be used as input for new tasks directly on the server. Note the two filter fields for file names and task names/categories. Once the task has completed, right-click on the task to open a menu, then click open task. After that, click open files on the LabelFreeDataExtractor plugin, highlight them all and click Quick Download (Fig. 42). Files download in the background, so there is no need to keep the task window open. The result files will now be added to your local Downloads directory. For the most succinct info from the plugins, open ProteinProphet.proteins.txt in the spreadsheet editor of your choice. These files are tab-separated text files; as such, avoid opening them in Notepad.
54. nding settings are stored in the following files. • Execution setup, FileTransfer setup: Setup.xml. • Client setup: client.ini. • Server setup: server.ini. • Activate/Deactivate plugins: add a string of plugin name and version to the file blockedPlugins.ini in the base dir of the APP server component. A list of strings with plugins and versions can be retrieved by launching APPServer.jar with the -l option (or help for more info), i.e. java -jar APPServer.jar -l. 6.3 User accounts and access keys. 6.3.1 Activate user accounts. By default APP does not protect access; small deployments are expected to be started when needed and killed when not. For sensitive or more permanent deployments, it is possible to restrict access to the server by activating user management under the execution setup tab (see fig.). Two kinds of user accounts are provided by APP. Administrators can add and remove users on the server (see) and reset, delete and open any tasks. Non-administrator users can see, open, reset and delete tasks they started. Each user also belongs to a group, and members of a group can see and open each other's tasks but not reset or delete them.
55. Figure 20: Create an MsConvert plugin. You should now have a small box with the words Add input files and Setup on buttons on it. Move it around by clicking and dragging wherever there are no buttons; drop it somewhere to the right. Double-click the Double click to add label text to label specific plugins; this simplifies keeping track of specific plugins. Zoom in and out by using the mouse wheel, or choose one of the zoom options in the window's Task menu. You can drag the entire field of view around by left-clicking outside a plugin and dragging. Start by adding input files to our MsConvert plugin. Click on the Add input files button, then drag and drop files from your favourite file manager onto the dialog and click OK (see Fig. 21).
56. of proteowizard. For instructions on installation of the GUI see and for usage examples see 7.0.2 Project files. File name: Description. APPWinBundle32.zip: contains all files needed for a quick client setup on Windows; also serves as a starting point for a full server setup on Windows. APPUbuntu14.04Bundle.zip: contains all needed components for a full server or client install on Linux. APPInterface.zip: packages the interface of APP; this is also included in the Windows and Linux bundles. APPTutorialData.zip: contains a set of test data for search, in mgf format. APPSource.zip: contains all the sources for APP's interface, APP's server components and all plugins; also contains a list of needed dependencies. Table 1: Listing of files available on the APP SourceForge page and their uses. 6.1.1 Client/Server installation on Windows. On Windows, the easiest setup process involves three steps. • Client setup: download the APPWindows32Bundle.zip from our SourceForge page (sourceforge.net/projects/automatedproteo). This contains all plugins needed for standard database search and spectral search. While you can unzip it anywhere, some software automated through APP has problems with folder names containing spaces; Comet, for example, will fail if there is a space anywhere in the file path. For this reason it is best to unzip into a folder other than your desktop, such as C:\APPServer. Once unzipped, double-click APPServer.exe in
57. ond round of searching, usually using a larger number of dynamic modifications, against any protein models identified in the first run. To use this methodology, a second settings file has to be provided; it is then possible to choose which set of modifications to use for the second round of searches. The GUI provides options for the following. Figure 44: X!Tandem's setup window. Table 6: X!Tandem search plugin. Plugin name: X!Tandem plugin. Needed binaries: tandem.exe, tandem2xml.exe (on server only). Accepted input formats: mzXML, mzML, fasta, gsp. Available output formats: pep.xml. • X!Tandem base parameter file: to deal with X!Tandem's extensive range of esoteric options, such as spectrum conditioning or variants of scoring, the plugin uses several standard X!Tandem parameter files as its base. Base variants are provided for k-score or native X!Tandem scoring, each with either tryptic or semi-tryptic search variants. Leave this untouched unless you know what you are doing. • Use external parameter file: If a different set of base op
58. oth PPM and Da. Since some search engines do not support a PPM setting, it is important to provide both Da and PPM tolerances. The search engine plugin will then use your preferred mass tolerance method if supported, and will fall back on Da if needed. • Database: choose the search database to use. Files provided by the server are available through the dropdown menu. It is also possible to provide your own fasta file, though this should be limited to smaller databases. • Charges: limit which charges are considered. This setting is not respected by all search engines, though X!Tandem and OMSSA do respect it; it should usually be left alone. • Name of the parameters: decides the name of the output. It has no effect on the actual search but makes it easier to keep track of which set of parameters has been used. • Monoisotopic or average mass switch: for data from most modern instruments, only monoisotopic mass should be used. • Max missed cleavages: determines how many cleavage sites can be missed for a peptide. Higher numbers greatly increase search time. 8.3.2 X!Tandem. Provides access to the X!Tandem search engine. Minimum input is a database file, a search settings file and a raw data file in either mzXML (recommended for maximum compatibility) or mzML. X!Tandem has a large range of options, dwarfing most other search engines. Among these is the ability to perform a search using one set of fixed and dynamic modifications and then to further refine the search by doing a sec
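The PPM/Da fallback described in the mass tolerance setting above can be sketched in Python as follows. The ppm-to-Dalton conversion is standard (a tolerance of p ppm at mass-to-charge m corresponds to m × p / 1,000,000 Da); the function names and the `engine_supports_ppm` flag are hypothetical and only illustrate the decision, they are not taken from APP's source.

```python
def ppm_to_da(mz, ppm):
    """Convert a relative tolerance in ppm to an absolute tolerance in Da at a given m/z."""
    return mz * ppm / 1_000_000


def choose_tolerance(prefer_ppm, engine_supports_ppm, ppm_value, da_value):
    """Pick the tolerance to hand to a search engine.

    Hypothetical helper: use the preferred unit when the engine supports it,
    otherwise fall back on the Da value the user also supplied.
    """
    if prefer_ppm and engine_supports_ppm:
        return ("ppm", ppm_value)
    return ("Da", da_value)


if __name__ == "__main__":
    # A 10 ppm window around a precursor at m/z 1000 is 0.01 Da wide on each side.
    print(ppm_to_da(1000.0, 10.0))
    print(choose_tolerance(True, False, 10.0, 0.02))
```

Because a ppm tolerance scales with mass while a Da tolerance does not, the two values are not interchangeable, which is why the settings plugin asks for both.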
59. Figure 37: Any files created by a plugin can be seen in the View files dialog. Some file types (pep.xml, prot.xml and html files) can be forwarded to your browser for display. A popup window with a URL will show up; click OK and a browser should open showing your results (see Fig. 38).
60. psin in the output. • Apply variable C-term modifications: determines if Comet considers C-term variable modifications on every peptide or only on specific protein-derived peptides. • Apply variable N-term modifications: determines if Comet considers N-term variable modifications on every peptide or only on specific protein-derived peptides. • Bin size: determines how finely the MS2 spectra are partitioned during a Comet search. This roughly translates into fragment tolerance, and lower numbers will require more memory to be available. A fuller explanation is available on the Comet home page. • Comet advanced options: provide access to the search engine's full range of options; these should not be tweaked without first consulting http://comet-ms.sourceforge.net. 8.5 MSGFPlus. MSGFPlus features powerful models for detecting phosphorylated peptides and an innovative edge scoring algorithm. MSGFPlus outputs exclusively mzID files; these need to be converted into pep.xml using the IDConvert plugin when included in a project. To ensure correct spectrum names, IDConvert output should then be fed into the SpectrumNameFixer plugin. Table 9: MSGFPlus search plugin summary. Plugin name: MSGFPlus plugin. Needed binaries: MSGFPlus.jar. Accepted input formats: mzML, fasta, gsp. Available output formats: mzid.
61. ra Normalizer will normalize all Libra iTRAQ channels to contain exactly the same total intensity. This is a good way to compensate for pipetting errors and other sample preparation errors. Typically the normalization factors will be very close to 1; larger or smaller factors often indicate a problem somewhere during sample preparation. Plugin name: LibraNormalizer plugin. Needed binaries: none. Accepted input formats: pep.xml. Available output formats: normalized pep.xml. Normalized pep.xml files can then be used together with the ProteinProphet and LibraProteinRatioParser plugins to generate final normalized quantities. 8.9.6 Input file feeder. Plugin name: Input file feeder plugin. Needed binaries: none. Accepted input formats: anything. Available output formats: the same as the input. The simplest of plugins: any file input into the Input file feeder plugin will merely be output. As such, it can be used to give the same input to a large number of other plugins by linking them, which is useful for providing the same data to multiple search engines, for example. 8.9.7 Mascot2XML plugin. Mascot2XML is a TPP-provided converter which allows conversion from Mascot's dat files into pep.xml files. A typical usage looks like this: 1. Convert data into mzXML/mzML. 2. Use MsConvert with the titlemaker option, or mzxml2other, to create Mascot-compatible MGF files. 3. Search the files on Mascot. 4. Download the dat file from Mascot. 5. Create a Mascot2XML plugin and provide it with t
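The channel normalization described for the Libra Normalizer above can be sketched in Python. This is only an illustration of the idea: the manual says every channel ends up with the same total intensity, but it does not say which common total is used, so scaling to the mean of the channel totals is an assumption here, as are the function names and the list-of-rows data layout.

```python
def normalization_factors(rows):
    """Compute one scaling factor per reporter channel.

    rows: list of per-spectrum intensity lists, one value per channel.
    Assumption: channels are scaled to the mean of the channel totals,
    and every channel total is non-zero.
    """
    totals = [sum(channel) for channel in zip(*rows)]
    target = sum(totals) / len(totals)
    return [target / total for total in totals]


def normalize(rows):
    """Apply the per-channel factors to every spectrum."""
    factors = normalization_factors(rows)
    return [[value * f for value, f in zip(row, factors)] for row in rows]


if __name__ == "__main__":
    data = [[100.0, 110.0], [200.0, 220.0]]
    # Factors far from 1 would hint at a sample preparation problem.
    print(normalization_factors(data))
    print(normalize(data))
```

With well-prepared samples the factors stay close to 1, which is the sanity check the manual suggests.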
62. s no need to change any of the settings for iProphet. Instead, click Setup on the ProteinProphet plugin and click the Input is from iProphet checkbox (Fig. 32). You are now ready to submit your first APP task. Figure 32: ProteinProphet plugin setup window. Save your task in the window menu File. Pick a name and a description for your task, choose a file by clicking Save as and typing a file name. Finally, click Save (see Fig. 33).
63. simplified grid computing implementation that is perfectly happy to run on whatever systems are available and focuses on wrapping the functionalities of external software. Our original aim when building APP was to provide simplified infrastructure for many of our own complex workflows and to remove from the user the consideration of where parts of the task are executed. APP is open source under the GPL license. Each implemented APP function is provided as a plugin; these plugins provide their own user interfaces and execution methodologies. A number of such plugins are then linked to provide an end-to-end processing workflow for the data. Common among the plugins is that they pass result files onward to plugins further down the line, though there are some exceptions, such as where a file is modified in place. The tasks are then submitted to an APP server, which handles organized execution of the task. Results are stored and accessible on the APP server. The server component can be in communication with other nodes (computers, virtual machines or any other piece of hardware running Java 7) on the network and will portion out tasks to these nodes for execution. If a task is successfully completed, the results are collected; if a task fails, it is attempted on other nodes. Nodes with a high success rate are prioritized for tasks if several are available. Each node can have its own set of plugins, allowing machines specialized for specific tasks to
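The node selection behaviour described above (prefer nodes with a high success rate, and retry a failed task on other nodes) can be illustrated with a small Python sketch. It is not APP's scheduler: the `Node` class, the `run_task` function and the `execute` callback are hypothetical names, and treating a node with no history as having a success rate of 0 is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    successes: int = 0
    failures: int = 0

    @property
    def success_rate(self):
        total = self.successes + self.failures
        # Assumption: a node without any history ranks last.
        return self.successes / total if total else 0.0


def run_task(task, nodes, execute):
    """Try the task on nodes in order of success rate until one succeeds.

    execute(node, task) is a caller-supplied function returning True on success.
    Returns the name of the node that completed the task, or None.
    """
    for node in sorted(nodes, key=lambda n: n.success_rate, reverse=True):
        if execute(node, task):
            node.successes += 1
            return node.name
        node.failures += 1  # record the failure and move on to the next node
    return None


if __name__ == "__main__":
    workers = [Node("slow-box", 1, 9), Node("fast-box", 9, 1)]
    # Pretend only slow-box has the plugin needed for this task.
    print(run_task("comet-search", workers, lambda n, t: n.name == "slow-box"))
```

In the usage example the preferred node is tried first, fails, and the task then completes on the remaining node, mirroring the retry behaviour the text describes.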
64. t and finally ProteinProphet plugins. APP also automates many of the small steps needed to keep these different applications interoperable. Search settings for all of the mentioned search engines are handled from a single plugin, which produces a general settings file that is fed into all APP database search plugins, where it is interpreted into the search-engine-specific format. Functionality that is unique to a single search engine, such as X!Tandem's ability to use a second, refined set of search criteria, can be accessed through each individual plugin's settings. All plugin default settings aim to be ready to go, so for general usage there should be no need to tweak them. Other, more housekeeping-oriented functions are also automated. For example, MyriMatch, MSGFPlus and OMSSA will renumber spectra in their output; to keep the analysis consistent, spectrum references need to be corrected. This functionality is provided through the SpectrumNameFixer plugin. There are a number of templates available, and complex tasks can be reused to provide templates for future searches. There are examples of more complex workflows in the tutorial section. 4 The server. The server handles execution of tasks and maintains a record of all tasks and their output. The server also communicates with users through an interface, and with other computers running as semi-servers, referred to as workers, that execute different parts of the task. The server feeds the different no
65. t peaks / Count peaks after ties / Absolute / Relative to top BPI / Relative to top TIC / Minimum TIC cutoff: all determine which criterion is used for filtering. Count is the default setting; in this case a certain number of peaks are kept in each spectrum. Typically the top 40 to 100 peaks are kept. Count after ties is equivalent but also keeps all peaks that have equivalent intensities. The relative values keep all peaks that are close to the top-intensity peaks, using either base peak intensity or total ion current criteria. Absolute uses a certain intensity cutoff value and keeps the peaks that pass it. Minimum TIC keeps peaks above or below a certain TIC value. • Threshold: determines the cutoff level. If using an absolute criterion, such as count, count after ties, absolute or minimum TIC, the threshold should be set to a whole number, i.e. setting the threshold to 100 will keep the top 100 peaks. If using a relative criterion, set it somewhere between 0.01 and 1. For example, a threshold of 0.5 with a criterion of Relative to top TIC will keep any peaks that are within 50% of the top peak. • Keep above threshold / keep below threshold: determines if peaks above or below the set criterion are kept. As such, a count criterion with a threshold of 100 and the keep above option will keep the top 100 peaks in a spectrum, whereas if set to Keep below threshold it will keep the lowest 100 peaks. • Split into X parts: will split mgf/mzML/mzXML files into mu
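The threshold criteria described above (count, count after ties, and a relative cutoff) can be sketched in Python to show how they differ. This is an illustration only and not MSconvert's implementation: the function names are hypothetical, peaks are assumed to be simple (m/z, intensity) pairs, and the relative criterion is shown against the base peak intensity.

```python
def keep_top_count(peaks, n):
    """'Count': keep the n most intense peaks."""
    return sorted(peaks, key=lambda p: p[1], reverse=True)[:n]


def keep_top_count_after_ties(peaks, n):
    """'Count after ties': like count, but also keep peaks tied with the n-th peak."""
    if n <= 0:
        return []
    ranked = sorted(peaks, key=lambda p: p[1], reverse=True)
    if len(ranked) <= n:
        return ranked
    cutoff = ranked[n - 1][1]
    return [p for p in ranked if p[1] >= cutoff]


def keep_relative_to_base_peak(peaks, fraction):
    """Relative criterion: keep peaks at or above fraction * base peak intensity."""
    if not peaks:
        return []
    base = max(intensity for _, intensity in peaks)
    return [p for p in peaks if p[1] >= fraction * base]


if __name__ == "__main__":
    spectrum = [(100.0, 10.0), (200.0, 50.0), (300.0, 50.0), (400.0, 100.0)]
    print(keep_top_count(spectrum, 2))
    print(keep_top_count_after_ties(spectrum, 2))
    print(keep_relative_to_base_peak(spectrum, 0.5))
```

In the example spectrum two peaks share the second-highest intensity, so "count" with a threshold of 2 keeps two peaks while "count after ties" keeps three.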
66. t thing that should happen is a sync of any new remote plugins on the server with the local plugins of the interface. This will be shown by a small sync GUI. If there is such a sync event, it is recommended to fully restart the interface after the initial sync, since plugins load at startup. As long as the plugins on the server side are not changed, the sync should not repeat. 7.0.3 Build a multiple search engine task. First, download the APP tutorial data from our website (please see table 1, section 6.1). If you do not plan to use at least two computing nodes, you might want to use only a few of the provided MGF files as input. A search in APP is made up of a series of connected modules. This means that a few parts need to be present to build a task. The task needs actual database search engines, such as X!Tandem and Comet (let's use those two for the tutorial), and their specific settings need to be set. The search engines also need the raw data in a format they can
67. ta file is used to map spectra to a database. Inputting mzML/mzXML files directly to this plugin means these files are used rather than any files of matching names referenced in the pep.xml file, and is an easy way to compensate for a broken reference in a file. • gsp file: from the General Search Settings plugin. Any mods defined will be used to create a custom modification file and ensure they get imported. • Fasta database file: used to map spectra to protein names. • pep.xml files: need to already have been processed using either PeptideProphet or iProphet. These will be used to create spectral libraries from any spectra exceeding the minimum probability cutoff. • Splib spectral library: all input spectral libraries will be combined into a single splib, along with any additional freshly created spectral libraries. (http://sourceforge.net/projects/sashimi) Figure 51: Spectral library builder setup window. • Name of created spectral library: the base name of the spectral library. • Perform quality filter: run a quality filtering step on the created spectral library. • Remove at level: spectra with a quality score at or below this level 1 5 wit
68. ta processing. 8.7.1 PeptideProphet plugin. The name is somewhat misleading, since the plugin actually interacts with the xinteract executable and as such can invoke several other tools, including Libra iTRAQ, PeptideProphet, ProteinProphet, PTMProphet, etc. Most of these have their own settings under the PeptideProphet plugin GUI. This plugin is the first step in processing search results from any of the search engines supported by APP. The options for xinteract are extensive; as such, the best place for info on all options is the TPP wiki at tools.proteomecenter.org/wiki/index.php?title=Main_Page. The most common options are covered here. • Minimum probability peptides to keep: peptide-spectrum matches with a lower probability than this will be filtered out of the resulting pep.xml files. • Minimum length peptide: as above, but for peptide length (number of amino acids). • Use decoy hits to estimate correct peptides: PeptideProphet will use hits from known decoys to calibrate its internal null distribution. For most search engines this should be on. • Use non-parametric model: PeptideProphet will disregard its preset parameters entirely and rescore from decoy hits. • Experiment name: experiment names tag spectra and are used by iProphet in the Number of replicate experiments model. As such, samples that are considered unique should have different experiment names if there is interest in combining them later. Note t
69. tein Prophet plugin. 7.0.5 Work with output from previous task. If you now wanted to do a similar task again, all you would have to do is open your saved task and change the input data to the MsConvert plugin. If, as an example, you want to use the LabelFreeQuantitation plugin to extract information about your last search, this can be done directly using the output of the previous task. This plugin extracts info about a search, including information that can be used for quantitation, such as each protein's spectral count or average total ion current. It is also an easy way to extract info not generally displayed in the web-based viewers, such as Xpress-generated label-free quantities. • Create a new task from the File menu. • Add a LabelFreeDataExtractor plugin from the Quantitation category. There is no need to change anything in the settings, but you can still have a look at the setup screen (Fig. 40).
70. the Windows version of MSconvert supports vendor formats such as Waters and Thermo .raw files and Agilent .d files, and these cannot be converted on Linux. A suggested setup would be to allow Linux computers to run only the MsConvert Open plugin, which disallows input of unsupported vendor data formats. Usually all of this setup is optional but available for tweaking. To start the server, click the Server setup tab (see Fig. 6) and enter a name for your server, then click start server. It might take up to 3 minutes for a server to be detectable over the network, but you should be able to connect almost instantly if you know the IP address of your server. You should now move on to the tutorial, section 7.0.2, and process some data. Figure 6: Starting the server. 6.1.4 Connecting an additional worker using the GUI. Connecting one or several additional worker nodes is relatively easy. First, run a client installation on the OS of your choice; for Windows this just means unzipping the distribution file (see Fig. 7). Once again, start the program by using APPServer.exe or APPServer.jar.
71. ting it to 0.5 would allow only half of the standard values. • Cores used for execution: determines the number of cores that APP can access. On clients, all assigned cores will be available to jobs, while on the server one core will be reserved for administrative functions. • Execute multi-core tasks only when full max cores are available: by default, APP starts any available job as soon as at least one core is available. This option prevents this; multi-core jobs will instead be held back until the maximum number of cores they can utilize, or the maximum available to APP, are ready for use. Leave this alone unless you have reason to change it. For now, if you have your executables in a non-standard location, these should be added to the Executable folders list. Click on the Set directories containing executables button and then drag and drop any folders that contain programs useful to APP onto the window (Fig. 4). Figure 4: Drag and drop folders into the selection dialog to add executable fol
72. tions are needed for the search, this can be provided by uploading an X!Tandem parameter file. • Use refine search: this demands the input of at least two different parameter files. By selecting a main and a secondary gsp file, it is possible to perform a refine search using a second set of parameters and mass tolerances. (http://www.thegpm.org/TANDEM) 8.3.3 MyriMatch. Provides a plugin for the excellent MyriMatch search engine. Most settings are provided by a gsp file from a settings plugin. The produced pep.xml files often have scrambled spectrum references; this can be corrected through the SpectrumNameFixer plugin. The GUI provides options for the following. Table 7: MyriMatch plugin overview. Plugin name: MyriMatch plugin. Needed binaries: myrimatch.exe. Accepted input formats: mzML, fasta, gsp. Available output formats: pep.xml, MZident. Figure 45: MyriMatch setup window. 8.4 Comet. Comet is a fully featured search engine; it provides a wealth of spectrum processing options along with very fast searches. Table 8: Comet search plugin. Plugin name: Comet plugin. Needed binaries: comet.linux.exe (only on Linux, 64 bit), comet win32 e
73. Figure 13: Enable user management. When user management is first enabled, a default user with the username and password root is added. It is recommended to remove this default user after creating a new user with administrator privileges. This can be done in the user management tab (see fig). Add a new user by clicking the Add user button and entering a new username and password. Then click the Administrator checkbox to grant administrator rights. After this, click Apply changes and, when prompted, enter root as both username and password. You can then delete the root user to prevent access with the default credentials. Figure 14: Enable user
74. Figure 47: Comet's advanced settings; experts can tweak away. The dialog shows options including decoy search and decoy prefix, precursor peak removal and its tolerance (Da), clearing an m/z range (e.g. to ignore iTRAQ reporter ions), automatic removal of N-terminal methionines, fragment bin offset, scan range, in-memory spectrum pool size, ion series (A, B, C, X, Y, Z, NL), maximum fragment ion charge, minimum fragment peak intensity, minimum peaks in fragment spectra, number of output matches, txt results output, range of precursor charges, and sparse matrix mode (less memory use, slower). A second settings screenshot follows, showing fragmentation type (CID), enzyme (Trypsin), number of tryptic termini, minimum and maximum peptide length to consider, and minimum precursor charge to consi
75. Figure 50: OMSSA setup window (other settings are provided through the GeneralSearchSettings plugin). 8.6.1 SpectraST library builder plugin: SpectraST is a powerful spectral search engine that offers a large array of options. For this reason, all options for creating and maintaining a set of spectral libraries have been split off into a plugin separate from the core search plugin. The plugin can create spectral libraries from pep.xml files; if these are uploaded from a remote source, they need to have raw data uploaded along with them in mzML or mzXML format. Table 12: SpectraST spectral library builder plugin. Plugin name: SpectraST spectral library builder plugin; needed binaries: spectrast.exe; accepted input formats and available output formats: pep.xml, splib, fasta, gsp, mzML. Additionally, the plugin handles other splib files, allowing one to use SpectraST to combine multiple spectral libraries into a larger one and to generate spectral decoys in the libraries. This makes it easier to use several previously established spectral libraries, created in a multitude of experiments, for a single spectral search. Inputs to plugin: The plugin needs a pep.xml file as input; from this it will attempt to retrieve the location of the matching mzXML/mzML files. The GSP file is used to define modifications for inclusion in the spectral library, to ensure that no exotic modifications are excluded. The fas
76. Figure 29: Setting up search parameters (the dialog shows the preferred precursor mass tolerance type in ppm, the preferred fragment ion tolerance unit in Da, charge states 1, 2 and 3, max missed cleavages 2, and the parameter name Tutorial). • Click setup on the PeptideProphet plugin linked to Comet (hover your cursor over the Comet plugin to see which one that is). Both Comet and X!Tandem results can be analyzed without decoys, using PeptideProphet's parametric model, but since Myrimatch, InsPecT and MSGF Plus all require decoys, it is good to get into the habit of using them. To enable this, click the Use decoys to estimate correct peptides checkbox (see Fig. 30) and enter Dc in the Decoy label field. Do the same for the PeptideProphet plugin linked to X!Tandem. The PeptideProphet setup dialog also lists options such as: Use phospho information; Use N-glycosylation motif information; Use pI information; Use hydrophobicity/RT information; Use accurate mass binning; Do not use NTT model; Do not use NMC model; MALDI data; Use expect value only for Tandem; Run protein prophet afterwards; Run separate command for each f
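The search-parameter dialog above mixes two tolerance units, ppm for precursor masses and Da for fragment ions. As a rough illustration of how the two relate, the following sketch converts a ppm tolerance into an absolute window and checks a match. The function names are hypothetical and not part of APP or any search engine, and the numbers are example values only.

```python
def ppm_to_da(mass: float, ppm: float) -> float:
    """Absolute width (Da) of a relative tolerance given in parts per million."""
    return mass * ppm / 1e6


def within_tolerance(observed: float, theoretical: float,
                     tol: float, unit: str) -> bool:
    """True if observed lies within +/- tol of theoretical.

    unit is "ppm" (relative, scaled by the theoretical mass)
    or "Da" (absolute). Illustrative helper only.
    """
    if unit == "ppm":
        window = ppm_to_da(theoretical, tol)
    elif unit == "Da":
        window = tol
    else:
        raise ValueError(f"unknown tolerance unit: {unit!r}")
    return abs(observed - theoretical) <= window


if __name__ == "__main__":
    # Example values: 50 ppm precursor tolerance, 0.2 Da fragment tolerance.
    print(ppm_to_da(1000.0, 50.0))
    print(within_tolerance(1000.04, 1000.0, 50.0, "ppm"))
    print(within_tolerance(500.15, 500.0, 0.2, "Da"))
```

A ppm tolerance grows with the mass being matched, while a Da tolerance is a fixed width, which is one reason the two are typically chosen separately for precursor and fragment ions.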
77. xe (only on Windows 32-bit); fasta; comet.win64.exe (only on Windows 64-bit); gsp. The Comet interface provides two separate settings groups: one for Comet's standard settings and one for a range of advanced settings. The basic settings are outlined here, but for a full understanding of the advanced options users should refer to the documentation on Comet's homepage. http://comet-ms.sourceforge.net http://fenchurch.mc.vanderbilt.edu/lab/software.php Figure 46: Comet basic setup window. Comet version: This header is needed for generating parameter files; unless you upgrade Comet, do not touch it. Full enzyme search / Semi-tryptic: Defines which peptides are considered; choices include fully tryptic (default), fully semi-tryptic, or semi-tryptic cleavage only in the direction of the N- or C-terminus. Search enzyme: Defines which enzyme's cleavage rules Comet uses. Sample enzyme: This enzyme will be reported in the output. For example, it is possible to perform the search with No enzyme but present try
