
Scaffold - Proteome Software - Wiki

1. page 127 Figure 2-4: Samples View. [Screenshot of the Scaffold Samples View with Protein Threshold 99.0%, Min # Peptides 2, and Peptide Threshold 95%. It shows the Protein Identification Probability legend (over 95%, 80% to 94%, 20% to 49%, 0% to 19%; one band is illegible in the source), the list of identified proteins (bovine beta-crystallins B3, A3, B2, B1, A4, and A2, plus a keratin contaminant), and the Protein Information and Sample Information panes.]
2. [Screenshot residue: Gene Ontology Terms pane with charts by Biological Process category, including Biological regulation, Immune system process, Localization, Locomotion, Multi-organism process, Metabolic process, Response to stimulus, Cellular process, Developmental process, Multicellular organismal process, and Unknown.]
The Quantify View includes the following panes:
• The Quantitative Value pane, in the upper left of the Quantify View, provides information about the relative quantities of a specific protein and allows comparisons between BioSamples and categories.
• The Quantitative Scatterplots pane, in the upper right of the Quantify View, shows the degree of error associated with the spectral count measurements.
• The Venn Diagrams pane, in the lower right of the Quantify View, shows the relationship between proteins, total unique peptides, and total unique spectra in various categories, and allows the User to easily identify proteins, peptides, or spectra of interest.
• The Gene Ontology Terms pane, in the lower right of the Quantify View, helps identify which proteins may be biologically significant.
Scaffold User's Manual, Chapter 7: Quantify View
The Quantitative Value pane
The Quantitative
3. Scaffold User's Manual, Chapter 2: Identifying Proteins with Scaffold (page 30)
Proteins View
Scaffold's Proteins View organizes a large amount of detailed information about a protein:
• Sequence coverage for this and similar proteins
• The peptide sequence, with identified peptides highlighted in yellow and modifications highlighted in green
• The spectra used to identify each peptide, with associated error measurements
• The fragmentation table, listing the ion fragments along with their associated peaks
Figure 2-5: Proteins View. [Screenshot of the Proteins View for P19141 Beta crystallin B3 across all biological samples, showing the sequence coverage table and the list of identified peptides (for example AINGTWVGYEFPGYR, CPNLTESLLEK, and GEQYVLEK) with their probabilities and scores.]
4. [Screenshot residue: Skyline Build Library dialog (Input Files, Add Directory, Add Paths) followed by a notification that the library build has completed, with an Explore library link.]
A message appears once the library is built and ready for use. The newly built library is listed in the Libraries text area on the Library tab of the Peptide Settings dialog. To view the newly loaded library, first select it in the Peptide Settings dialog and then choose View > Spectral Libraries. The Spectral Library Explorer dialog opens.
(Scaffold User's Manual, Chapter 11: Reports, page 195)
Figure 11-8: Skyline Spectral Library Explorer. [Screenshot of the Spectral Library Explorer with the library scaffold_library selected, a peptide list (for example AAQSPSSLDGLPASR, AAVNPGPDGK, and AINGTWVGYEFPGYR), and an annotated spectrum plotted as Intensity against m/z for a charge 2 peptide. The dialog shows page 1 of 4, peptides 1 through 100 of 339 total, from the file tutorial_4 mzid gz_Mudpit_bovin.]
5. (Scaffold User's Manual, Chapter 4: The Scaffold Window, pages 69-70)
[Dialog residue: "How should we identify decoys?" box containing REVERSE, RANDOM, decoy, with Auto Parse, Use Regular Expressions, and Cancel buttons.]
This dialog box allows the User to select one of the two parsing methods Scaffold uses to align protein names and accession numbers:
• Auto Parse: This option automatically searches for the optimal accession numbers between the database and the type of data loaded into Scaffold. It first identifies the type of parsing rule that best fits both the data and the selected database. It then applies the rule protein by protein while ensuring uniqueness. If a protein does not fit the rule initially selected, Auto Parse looks for other rules more compatible with the specific protein accession number, and defaults to a more general accession number if everything else fails.
• Use Regular Expressions: This option opens the Configure Database Parser window, which allows the User to verify the alignment of protein names and accession numbers for Scaffold's indexing and parsing. It also allows the User to modify the parsing rules as needed.
How should we identify decoys? This box contains typical tags used to label decoy proteins in a database. When parsing a database that contains decoys, the User should make sure that the decoy identification tag used in his/her database is included in th
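As a rough illustration of the decoy tags described in excerpt 5, the following Python sketch flags accession numbers that contain one of the tags shown in the dialog (REVERSE, RANDOM, decoy). The function name `is_decoy`, the case-sensitive substring matching, and the example accessions are assumptions made for this sketch; they are not Scaffold's actual parsing rules.

```python
import re

# Tags listed in the "How should we identify decoys?" box.
# Add the tag used in your own database if it differs.
DECOY_TAGS = ("REVERSE", "RANDOM", "decoy")

# Assumption: a protein counts as a decoy when any tag appears anywhere
# in its accession number (case-sensitive). Scaffold may match differently.
DECOY_PATTERN = re.compile("|".join(re.escape(tag) for tag in DECOY_TAGS))


def is_decoy(accession: str) -> bool:
    """Hypothetical helper: True if the accession carries a decoy tag."""
    return DECOY_PATTERN.search(accession) is not None


if __name__ == "__main__":
    # Made-up accessions for illustration only.
    for accession in ["CRBB3_BOVIN", "REVERSE_CRBB3_BOVIN", "decoy_ALBU_BOVIN"]:
        print(accession, is_decoy(accession))
```

Running a check like this over the accession numbers in a FASTA database before loading is one way to confirm that the decoy tag in use is among those listed.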
6. See Molecular Systems Biology 1:2005.0017 for more information.
mzIdentML
Scaffold fully supports mzIdentML, the standard format for proteomics data developed by the HUPO Proteomics Standards Initiative Proteomics Informatics Standards group. A description of the standard specifications is available at www.psidev.info/mzidentml, and Java desktop software for validating mzIdentML can be downloaded at code.google.com/p/psi-pi/downloads/list.
(Scaffold User's Manual, Chapter 11: Reports, pages 191-192)
Scaffold supports both mzIdentML 1.0.0 and the latest version, 1.1.0. Exports are compatible with:
• The PRIDE XML converter
• Skyline, for building spectral libraries. See "Creating a spectral library in Skyline".
Selecting the menu command Export > mzIdentML opens the Export mzIdentML dialog, where the User can customize his/her mzIdentML exports. The Export mzIdentML dialog shows three basic options for creating mzIdentML exports, optimized for the following uses:
• Scaffold perSPECtives analysis
• Scaffold PTM analysis
• PRIDE/Scaffold re-analysis. This export is suggested both for loading data into PRIDE and for reloading data into Scaffold using mzIdentML instead of the regular search engine files.
Figure 11-3: Export mzIdentML short dialog. [Screenshot: "What type of export would you like?" with the options Scaffold PTM analysis and PRIDE/Scaffold re-analysis visible.]
Clicking the Advanced button expands the dialog to show the full list of o
7. Auto Parse is the preferred method for parsing the database. If you are carrying out this procedure using the sample tutorial_3seq data or the sample tutorial_3mas data provided by Proteome Software, then select Auto Parse.
Click Use Regular Expressions to open the Configure Database Parser dialog box and select a specific pre-configured parsing rule to use, or create your own parsing rule. See Figure 3-16 on page 56.
(Scaffold User's Manual, Chapter 3: Loading Data in Scaffold, pages 55-56)
Figure 3-16: Configure Database Parser dialog box. [Screenshot: fields for File Location, Accession Number Parse Rule, Description Parse Rule, and Decoy Protein Parse Rule, with a Use Magic Matching option.]
   6. After the parsing rules are applied, you return to the Edit Databases dialog box with the correct database selected. Click OK.
   7. Continue to "Load and Analyze Data" on page 46.
Validation with X! Tandem
Each search engine uses its own algorithms to identify proteins. Identification confidence is higher when multiple algorithms find the same protein. Likewise, knowing which protein IDs are not confirmed by a second search engine lets the User screen out some false positives. This means that adding X! Tandem results to previous output files gives more confid
8. [Dialog residue: Select Data File dialog with File name, Files of type, Add to Import Queue, and Cancel controls.]
   2. Navigate to the directory in which you saved your sample data set and FASTA database, select the sample data set, and then click Add to Import Queue.
• If you are carrying out this procedure using the sample tutorial_3seq data provided by Proteome Software:
  • Open the folder tutorial_3seq.
  • Select several or all of the sub-folders bovine_spot_01 through bovine_spot_20. Each of these folders represents one mass spectrometry sample, holding data from the corresponding spot in the 2D gel.
  Note: If you open one of those folders and select just one of the OUT files it contains, Scaffold automatically loads all of the files in that folder. This is because SEQUEST places the information related to one MS sample search in numerous separate files within the folder.
• If you are carrying out this procedure using the sample tutorial_3mas data provided by Proteome Software:
  • Open the folder tutorial_3mas.
  • Select the file control_01.dat.
The Select Data File dialog box closes and you return to the Scaffold Wizard. The Queue More Files for Loading page prompts you to load additional data files for the current BioSample.
(Scaffold User's Manual, Chapter 3: Loading Data in Scaffold, page 43)
Figure 3-6: Scaffold Wizard Queue More Files for Loading page. [Screenshot truncated in the source.]
9. Scaffold User's Manual (pages 187-189)
Chapter 11: Reports
A variety of reports are available in Scaffold to assist the User in interpreting and working with quantitative analysis data. All the reports are available from the Export option on the Scaffold main menu. Every report is saved in a predefined format and, by default, in the same directory as its quantitative analysis data. The User cannot change the report format but can always select a different location in which to save the report. When the User saves a Scaffold ProtXML report, he/she must provide a name for the report. When the User saves an Excel report, a default name in the format <Report Name> <Scaffold File name> is provided, but this name can always be changed. Finally, the User can open and view any Excel report, such as the Publication report, in Excel or another spreadsheet application, but might need to specify that the report file is a tab-delimited file to do so.
The following reports are available in Scaffold:
[List of report sections with page references 190 to 197. Most entries are illegible in the source; the last legible entry is "Exports compatible with Excel", page 197.]
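Because excerpt 9 notes that the Excel reports are tab-delimited files, a report can also be read outside a spreadsheet application. The Python sketch below is only an illustration using the standard `csv` module; the helper name `read_tab_delimited` and the two-column sample content are invented for this example and do not reflect the actual column layout of any Scaffold report.

```python
import csv
import io


def read_tab_delimited(handle):
    """Return the rows of a tab-delimited report as lists of strings."""
    return list(csv.reader(handle, delimiter="\t"))


# Invented two-column content standing in for an exported report.
# A real report would be opened with open(path, newline="") instead.
sample = io.StringIO("Protein\tAccession\nBeta crystallin B3\tCRBB3_BOVIN\n")
rows = read_tab_delimited(sample)

if __name__ == "__main__":
    for row in rows:
        print(row)
```

The same `delimiter="\t"` setting is what a spreadsheet import wizard is asking for when it prompts for the separator.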
10. [Screenshot residue: Gene Ontology Terms pane showing bar charts by category (Proteins, Distinct Peptides, Distinct Spectra) for Biological Process terms, including Biological regulation, Immune system process, Localization, Locomotion, Multi-organism process, Metabolic process, Response to stimulus, Cellular process, Developmental process, Multicellular organismal process, and Unknown.]
Publish View
The Publish View provides detailed information about the experiment in general. The view includes two tabs:
• The Experimental Methods tab, which describes the parameters used when performing the experiment. This is information about the experiment typically required by the major proteomics journals, such as:
  • Molecular & Cellular Proteomics
  • Proteomics
  • Journal of Proteome Research
• The MCP Submission tab, which is a checklist that leads you through the process of submitting all the supplemental material needed for publication in these proteomics journals. This p
(Scaffold User's Manual, Chapter 2: Identifying Proteins with Scaffold, page 33)
11. [Index excerpt; page numbers refer to the printed manual.]
... 26
Protein annotation preferences, 126
Protein Information pane in the Samples View, 139
Protein list, 124
  Hidden proteins, 128
  Protein cluster, 125
  Protein group, 124
  Proteins of interest, 127
  Sorting feature, 127
ProteinProphet, 27
proteins, hiding in the Samples View, 128
proteins of interest, identifying in the Samples View (starring proteins), 127
ProtXML export, 191
ProtXML report, 191
Publication report, 198
Q
Quantify View
  Gene Ontology terms pane, 148
  GO bar charts, 149
  GO pie charts, 148
Quantitation in Scaffold
  Precursor intensity, 183
R
Release Information, 3
Renewing time-based license key, 17
Reports, 189, 215
reports
  Peptide, 201
  ProtXML, 191
  Publication, 198
  Samples, 199
S
Samples report, 199
Samples Table, 123
Samples View
  advanced search, configure advanced protein filter, 138
  Display Pane, 135
  hiding proteins from, 128
  Protein Information pane, 139
  proteins of interest, 127
  sorting feature, 127
Scaffold, tiered licensing for, 13
Scaffold 3 terminology comparis
12. Although numbering begins on the cover page, this number is not visible on the cover page or front matter pages. Page numbers are visible beginning with the first page of the Table of Contents.
If information appears in blue, it is a hyperlink. Table of Contents and Index entries are also hyperlinks. Click the hyperlink to advance to the referenced information.
The example experiments, data, and databases available for download in zip format on Proteome Software's website at www proteomesoftware com products demo data scaffold are used as the basis for most screen captures, examples, and data manipulations shown in the manual.
Assumptions for the manual
The Scaffold User's Manual assumes that:
• You are familiar with Windows operating systems and basic Windows navigational elements, content formatting, and layout tools.
• You have the appropriate licensing to run Scaffold.
• You have downloaded one of the three example experiments available at www proteomesoftware com products demo data scaffold:
  • Choose SEQUEST or Mascot Samples and download the appropriate zip file.
  • When the download has finished, move the file to the desired location on your hard drive and unzip it. You will have a folder entitled scaffold_tutorial containing:
    • Sample search engine files, results, and related databases to be used in the loading example described in Chapter 3, "Loading Data in Scaffold", on page 37. The
13. Scaffold Version 4.0 User's Manual: Release information, Copyright, Limit of Liability, Trademarks
The following release information applies to this version of the Scaffold User's Manual. This document is applicable for Scaffold Release 4.0 or greater and is current until replaced.
Document Version Number: Scaffold 4 UG4 3 0
Document Status: Released
Document Release Date: February 12, 2014
© 2014 Proteome Software, Inc. All rights reserved. The information contained herein is proprietary and confidential and is the exclusive property of Proteome Software, Inc. It may not be copied, disclosed, used, distributed, modified, or reproduced, in whole or in part, without the express written permission of Proteome Software, Inc.
Proteome Software, Inc. has used its best effort in preparing this guide. Proteome Software, Inc. makes no representations or warranties with respect to the accuracy or completeness of the contents of this guide and specifically disclaims any implied warranties of merchantability or fitness for a particular purpose. Information in this document is subject to change without notice and does not represent a commitment on the part of Proteome Software, Inc. or any of its affiliates. The accuracy and completeness of the information contained herein and the opinions stated herein are not guaranteed or warranted to produce any particular results, and the advice and strategies contained herein may not be
14. [Screenshot residue: search parameters, including peptide and fragment tolerances, digestion enzyme (non-specific), modifications, the searched database (the swissprot_bovine database, 1705 entries), the original search dates for SEQUEST and X! Tandem, and the Scaffold version.]
At any time, the User can remove files from either the Loading Queue or the Ready Pane. If the User loads a file by mistake or wishes to change the make-up of a BioSample, he/she needs to do the following:
   1. Click the Load Data icon to go to the Load Data View.
   2. You will see the files loaded and analyzed on the right. Click one to select it. Note that you select the entire MS sample, including both the SEQUEST (or Mascot) and X! Tandem runs.
   3. Click the right mouse button. You will see a single menu item, Remove Selected Samples. Click it.
   4. A second dialog asks you to confirm the removal. For now, click Don't Remove.
Save the Experiment
To save the experiment:
(Scaffold User's Manual, Chapter 3: Loading Data in Scaffold, page 51)
   1. Go to the File menu and select Save.
   2. Navigate to the folder in which you wish to save these tutorials and enter the
15. [Screenshot residue: Samples View protein list, rows 19 to 32, with per-sample probability columns that are illegible in the source. Legible entries: 19 Zeta crystallin (QOR_BOVIN, 35 kDa); 20 Trypsin precursor (contaminant, 24 kDa); 21 Fatty acid binding protein (FABPE_BOVIN, 15 kDa); 22 Gamma crystallin C (CRGC_BOVIN, 21 kDa); 23 Gamma E crystallin (21 kDa); 24 Phosphatidylethanolamine-binding protein (PEBP_BOVIN, 21 kDa); 25 Aldose reductase (ALDR_BOVIN, 36 kDa); 26 Profilin 1 (15 kDa); 27 Filensin (BFSP1_BOVIN, 83 kDa); 28 Annexin A2 (38 kDa); 29 Elongation factor 1 alpha (EF1A1_BOVIN, 50 kDa); 30 Peptidyl-prolyl cis-trans isomerase (PPIA_BOVIN, 18 kDa); 31 14-3-3 protein zeta/delta (23 kDa); 32 Adenylate cyclase type 7 (121 kDa). Below the list, the Protein Information pane offers Lookup Accession Number in NCBI, and the Gene Ontology pane shows a term tree under biological regulation (regulation of biological process, regulation of localization, regulation of transport, regulation of intracel]
16. Average Precursor Intensity, Total Precursor Intensity, and Top 3 Precursor Intensity. These are calculated from the Intensity values shown in the Peptides Table as follows:
[Screenshot residue: Peptides Table with Sequence, Modifications, Charge, and Intensity columns. Rows include several forms of the peptides NYDSMKDFEEMRK, SSGALLPKPQMR, AGIFQSAK, and KAYAEFYR, with modifications such as Oxidation (+16), Deamidated (+1), and Acetyl (+42), and intensities ranging from about 1.39E6 to 3.81E7. Callouts A and B mark the cases described in the caption.]
Figure 10-5: Intensity values for a single BioSample for a specific protein.
A. Note that multiple MS2 spectra are often collected from a single MS1 spectrum. This results in duplicate reports of the same Intensity value. Scaffold counts each value only once.
B. Intensities for different charge states of the same peptide are summed to give the total intensity for that
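The two rules in the caption of Figure 10-5 (excerpt 16) can be sketched in Python as follows. This is a minimal illustration, not Scaffold's implementation: the function names are hypothetical, duplicates are assumed to be rows with identical peptide, charge, and intensity, and the definitions used for the average (mean of the per-peptide totals) and Top 3 (sum of the three largest per-peptide totals) are assumptions, since the excerpt is truncated before it defines them.

```python
from collections import defaultdict


def peptide_intensities(psms):
    """psms: iterable of (peptide, charge, intensity) tuples.

    Rule A: a repeated report of the same intensity is counted once
            (assumed here to mean an identical tuple).
    Rule B: intensities of different charge states of the same peptide
            are summed.
    """
    unique_reports = set(psms)
    totals = defaultdict(float)
    for peptide, _charge, intensity in unique_reports:
        totals[peptide] += intensity
    return dict(totals)


def protein_values(psms):
    """Assumed protein-level summaries built from per-peptide totals."""
    per_peptide = sorted(peptide_intensities(psms).values(), reverse=True)
    if not per_peptide:
        return {"total": 0.0, "average": 0.0, "top3": 0.0}
    total = sum(per_peptide)
    return {
        "total": total,
        "average": total / len(per_peptide),
        "top3": sum(per_peptide[:3]),
    }


if __name__ == "__main__":
    # Illustrative values loosely based on the figure, not real results.
    example = [
        ("SSGALLPKPQMR", 2, 3.81e7),
        ("SSGALLPKPQMR", 2, 3.81e7),   # duplicate MS2 report, counted once
        ("NYDSMKDFEEMRK", 2, 1.39e6),
        ("NYDSMKDFEEMRK", 3, 6.06e6),  # second charge state, summed
    ]
    print(peptide_intensities(example))
    print(protein_values(example))
```

Deduplicating before summing matters: without it, a peptide sampled by many MS2 scans would appear more abundant than one sampled once.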
17. Click Next.
(Scaffold User's Manual, Chapter 3: Loading Data in Scaffold, pages 44-45)
The Scaffold Wizard Add Another BioSample page opens.
Figure 3-7: Scaffold Wizard Add Another BioSample page. [Screenshot. The page text reads: "You can create several BioSamples and put files into their loading queues before loading and analyzing their data, which can be a slow process. Click Add another BioSample to loop back to the beginning of this Wizard and create a new BioSample. Click Next to load and analyze your data."]
   3. Continue to "Add another BioSample" on page 46.
Add another BioSample
   1. Do one of the following:
• If you have other BioSamples that are to be analyzed, then for each of these BioSamples, click Add Another BioSample to return to page 2 of the Scaffold Wizard, cycle through the wizard to add the sample, and then click Next.
• If you do not have other BioSamples that are to be analyzed, then click Next.
• If you are carrying out this procedure using the sample tutorial_3seq data provided by Proteome Software, then click Next.
• If you are carrying out this procedure using the sample tutorial_3mas data provided by Proteome Software, Re
18. In this case, Scaffold and Scaffold Q+ are unable to perform Precursor Intensity Quantitation. It is possible, although not required, in MaxQuant 1.4 to create an experiment file. The experiments can be named through the MaxQuant 1.4 GUI, and then an experiment file can be exported by right-clicking and choosing Export. The User should name the file Experiment.txt; Scaffold will then recognize it, and loading can proceed as for MaxQuant 1.3 results.
Performing Quantitation in Scaffold
Scaffold reads the precursor intensity values already computed by the search engine software from the input files. For each peptide-spectrum match, it reports the intensity value in the Peptides Table in the upper right of the Proteins View (Figure 10-3).
[Screenshot residue: Peptides Table columns including delta mass values, charge, retention time, and intensity, with intensities ranging from about 2.0E4 to 8.33E7.]
19. [Screenshot residue: Samples View with protein clusters expanded.
Cluster of Serum albumin precursor, Allergen Bos d 6 (ALBU_BOVIN), 69 kDa: Serum albumin precursor (ALBU_BOVIN, 69 kDa); Serum albumin precursor (ALBU_RABIT, 69 kDa); Serum albumin precursor, fragment (ALBU_PIG, 69 kDa); Serum albumin precursor (ALBU_RAT, 69 kDa).
Cluster of NF00050265 Ovotransferrin, Gallus gallus (OTRF_CONTR), 76 kDa: NF00050265 Ovotransferrin (OTRF_CONTR, 76 kDa).
Cluster of Beta-lactoglobulin precursor, Beta-LG (LACB_BOVIN), 20 kDa: Beta-lactoglobulin precursor (LACB_BOVIN, 20 kDa); NF00161101 beta-lactoglobulin, Bos taurus (LACB_CONTR, 17 kDa).]
The column headers of the samples in the category used as the numerator are highlighted in blue; those of the samples in the category used as the denominator are highlighted in red.
Coefficient of Variance or Coefficient of Variation (page 174)
A coefficient of variation, or of variance (CV), can be calculated and interpreted in two different settings: analyzing a single variable, or interpreting a model. The standard formulation of the CV, the ratio of the standard deviation to the mean, applies in the single-variable setting. In the modeling setting, the CV is calculated as the ratio of the root mean squared error (RMSE) to the mean of the dependent variable. In both settings, the CV is often presented as the given ratio multiplied by 100. The CV for a singl
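The two formulations of the CV described in excerpt 19 can be written out as a short Python sketch. The function names are hypothetical, and the use of the sample standard deviation (`statistics.stdev`) is an assumption; the excerpt does not say whether Scaffold uses the sample or the population form.

```python
import math
import statistics


def cv_single_variable(values):
    """CV of one variable: standard deviation divided by the mean.

    Assumption: the sample standard deviation is used.
    """
    return statistics.stdev(values) / statistics.mean(values)


def cv_model(observed, predicted):
    """CV of a model: RMSE divided by the mean of the dependent variable."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    rmse = math.sqrt(sum(squared_errors) / len(squared_errors))
    return rmse / statistics.mean(observed)


if __name__ == "__main__":
    counts = [10, 12, 14]  # made-up quantitative values
    fitted = [11, 12, 13]  # made-up model predictions
    # Multiply by 100 to present the ratio as a percentage.
    print(100 * cv_single_variable(counts))
    print(100 * cv_model(counts, fitted))
```

Both functions return the plain ratio, so the caller decides whether to report it as a fraction or multiplied by 100.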
20. but depending on the selected tab, a different menu might be available.
(Scaffold User's Manual, Chapter 4: The Scaffold Window, pages 106-107)
Right Click Menu C: Copy Image; Copy Selected Cell; Copy Selected Row; Copy All Data; Save JPEG Image; Print; Export To Excel; Find
• Protein Sequence tab. Available right-click menu: Right Click Menu D
Right Click Menu D: Copy WMF EMF; Copy Publication Sized JPEG; Save JPEG Image; Save As; Print; Copy Protein Sequence; Use Amino Acid Finder; Show Fixed Modifications; BLAST Protein Sequence
• Spectrum tab and Spectrum/Model Error tab. Available right-click menu: Right Click Menu E
Right Click Menu E: Copy WMF EMF; Copy Publication Sized JPEG; Save JPEG Image; Save As; Print; Copy Peaklist; Zoom Out; Use Peakfinder; Display Unknown Markers; Display Parent Ions; Use PPM Masses; BLAST Peptide Sequence
Similarity View
• Grouping Table. Available right-click menu: Right Click Menu C
• Identifications tab and Fragmentation Table tab. Available right-click menu: Right Click Menu C
• Spectrum tab and Spectrum/Model Error tab. Available right-click menu: Right Click Menu E
Quantify View
• Quantitative Value pane and Quantitative Scatterplots pane. Available right-click menu: Right Click Menu F
Right Click Menu F: Copy WMF EMF; Copy Publication Sized JPEG; Save JPEG Image; Save As; Print; Copy Chart Data; Zoom Out
• Protein Venn D
21. s SVM classifier, LFDR uses log-likelihood ratios generated by Naive Bayes classifiers to discriminate between target and decoy hits. Naive Bayes was chosen specifically for its robustness to over-fitting, a frequently occurring problem when training a classifier on a subset of testing data. Training data is selected by iteratively testing three sets of ten classifiers to home in on the optimal number of spectra, so as to avoid training with incorrect identifications assigned to target proteins. The posterior peptide probabilities are then derived using LFDR estimates in a Bayesian framework. Instead of considering LFDR bins of discrete score distances, Scaffold uses variable-width bins, keeping the number of values in each bin constant. This gives more refined assessments of probability in score areas with more values, while ensuring that LFDR estimates stay reasonable in areas with fewer. Finally, a Bayesian algorithm is used to refine peptide probabilities based on likelihoods calculated using parent mass accuracy. See LFDR_users_meeting 2013 ppt.
(Scaffold User's Manual, Chapter 2: Identifying Proteins with Scaffold, page 26)
PeptideProphet
When using PeptideProphet, Scaffold determines the distributions of the scores assigned by a search engine such as SEQUEST, Mascot, or MaxQuant. These distributions depend on the size of the database used for the search and on the specific characteristics of the analyzed sample (see Keller, 2002). From these distributions, Scaffold transla
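The idea of variable-width bins that each hold the same number of values, mentioned in excerpt 21, can be illustrated with a few lines of Python. This sketch only shows equal-count binning of a list of scores; the function names and the bin size are hypothetical, and it does not reproduce Scaffold's LFDR calculation.

```python
def equal_count_bins(scores, per_bin):
    """Split sorted scores into consecutive bins of `per_bin` values each.

    The final bin may hold fewer values if the total is not a multiple
    of `per_bin`.
    """
    ordered = sorted(scores)
    return [ordered[i:i + per_bin] for i in range(0, len(ordered), per_bin)]


def bin_ranges(bins):
    """Return the (lowest, highest) score of each bin."""
    return [(b[0], b[-1]) for b in bins]


if __name__ == "__main__":
    # Made-up discriminant scores: dense near zero, sparse at the high end.
    scores = [0.1, 0.2, 0.25, 0.3, 0.9, 2.0]
    bins = equal_count_bins(scores, per_bin=2)
    for low, high in bin_ranges(bins):
        print(f"{low} to {high}: width {high - low:.2f}")
```

Where scores are dense the bins come out narrow, and where they are sparse the bins widen, so every estimate rests on the same number of observations.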
22. type I error. The User may want to seek guidance from a statistician and ask about methods of correction for multiple testing when analyzing the T-test and ANOVA results.
(Scaffold User's Manual, Chapter 9: Quantitative Methods and tests, pages 177-178)
Chapter 10: Precursor Intensity Quantitation
An increasingly popular option for quantitative proteomics is Precursor Intensity Quantitation, which offers a good compromise between the accuracy of labeled techniques and the simplicity and lower cost of label-free quantitation. This method relies on measuring the signal intensity of the peptide precursors representing a specific protein at the MS level and comparing these intensities across samples. Both Scaffold and Scaffold Q+/Q+S support this method, in different ways.
Precursor Intensity Quantitation in Scaffold
Scaffold is designed to provide easy and confident validation, visualization, and quantitation of search results. It does not read raw files and does not have direct access to precursor information; instead, it reads intensity data already computed by the identification software. Currently, Scaffold is able to obtain precursor intensity information from Thermo Proteome Discoverer, Mascot Distiller, Agilent Spectrum Mill, and MaxQuant files. Scaffold normalizes precursor intensity values across samples and calculates fold changes at the BioSample or Category level. Statistical tests of differences in the calculated intens
23. [Screenshot residue: Samples View showing 358 proteins in 305 clusters, with a cluster of Hemoglobin subunit alpha expanded. Members include HBA_SPAEH, HBA_MOUSE, HBA_SPECI, HBA_CHAMP, HBA_SEMEN, HBA_BALAC, HBA_SPETO, HBA_ONDZI, HBA_SUNMU, and HBA_CYNSP, each about 15 to 16 kDa.]
Figure 10-6: The Samples View. A: Quantitative Method selected. B: Protein-level intensity values. C: Fold Change calculated from the precursor intensity values.
Using the Quantitative Values based on precursor intensity, Scaffold can also calculate fold change at either the BioSample or the Category level. The desired fold change option is specified in the Quantitative Anal
24. [Screenshot residue: final rows of the Peptides Table.]
Figure 10-3: Precursor Intensities in the Peptides Table.
Scaffold provides three methods of using these values to perform relative quantitation at the protein level. These methods are available through Experiment > Quantitative Analysis or by clicking the bar graph icon at the top of the screen. Either route brings up the Quantitative Analysis Setup dialog (Figure 6). Because label-free quantitation is a relative quantitative method, it is necessary to select at least two samples. It is also important to adjust the Minimum Value setting to a value that is appropriate for intensities. Values other than zero require the use of the Other option in the dropdown. A checkbox allows the User to choose whether or not to normalize between samples.
[Screenshot residue: Quantitative Analysis Setup dialog with test options (No Test Applied, Fold Change by Sample, Fold Change by Category, Coefficient of Variance), Removed Samples and Selected Samples lists (BioSample 1: Control; BioSample 2: Treatment), and the Minimum Value setting.]
(Scaffold User's Manual, pages 183-184)
Figure 10-4: The Quantitative Analysis Setup dialog. A: Selecting the Quantitative Value. B: Specifying the Minimum Value. C: Selecting the Normalization option.
There are a number of options for combining peptide precursor intensities to provide an estimate of relative quantities at the protein level. Scaffold provides three methods
25. [Table residue: peptide list rows, including LVPITYPQGLAMAK, TVFDEAIR, and VDSKPVNLGLWDTAGQEDYDR, each marked TRUE, with probability values that are illegible in the source.]
Generally, this approach works well to eliminate false assignments. In certain instances, however, it can eliminate from consideration a protein that may actually be present in the sample, so that the protein is not seen in Scaffold's other views. Unfortunately, changing the filter settings has no effect on this type of grouping algorithm. A different approach can now be tried by using the clustering option available in Scaffold 4. The new grouping algorithm does not forcibly assign each peptide to a single protein, but considers peptides shared among different proteins.
(Scaffold User's Manual, pages 161-162)
Chapter 9: Quantitative Methods and tests
Scaffold supports label-free quantitative methods. Some of them are based on spectrum counting and others are based on MS1 intensity measurements. To establish differential expression among the categories present in a Scaffold experiment, it is important to normalize the values and account for systematic differences and experimental errors. Scaffold provides options for normalizing values and handling missing values.
This chapter describes the various label-free quantitative methods available in Scaffold and how Scaffold normalizes them. It also describes the different Quantitative tests av
26. Protein report. The protein report details all the proteins passing the current filter and threshold settings. This report is designed to be used as part of the supplemental information supporting a journal article. The report header rows identify the data and how it was created, which is the same information that is contained in the Publication report. There is then a single report entry for each protein in every sample. For example, if there are 3 samples, each with the same 12 proteins, there will be 36 rows in this report. Figure 11.14: Protein report columns.
27. [Figure: Samples table listing identified Bos taurus proteins (alpha-, beta-, and gamma-crystallins, vimentin, actin, alpha-enolase, and others) with accession numbers and molecular weights.]
28. It is called exact because it calculates the significance of the deviation from the null hypothesis with an exact method, not with an approximation that depends on the size of the sample statistics. This means that Fisher's exact test is more appropriate than the T-test if there are fewer replicates. Like the T-test, Fisher's exact test produces a p-value. Scaffold calculates the Fisher's exact test p-value according to a model discussed in Zhang 2006. The paper performs a systematic analysis of the various approaches to quantifying differential expression among different experiments. In particular, it describes how and when it is reasonable to apply Fisher's exact test in a pairwise experiment. As described in the paper, to calculate Fisher's exact test for a target protein, Scaffold arranges the spectral counts for a pair of categories into a two-way contingency table, where the first row contains the counts for the target protein in each category and the second row contains the counts for the rest of the proteins listed in the Samples table. The test is based on the assumption that the row and column totals are fixed, which means that any single entry in the table completely determines the others. The probability assigned to a particular arrangement of spectral counts in the table is calculated using a hypergeometric distribution. The p-value assigned by Fisher's exact test to the target protein is the sum of all p values over all
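The contingency-table calculation described above can be sketched in Python. This is a minimal illustration and not Scaffold's implementation: the function names are hypothetical, and because the excerpt is cut off before it says which arrangements are summed, the sketch assumes the common two-sided convention of summing every table that is no more probable than the observed one.

```python
from math import comb

def table_probability(a, b, c, d):
    """Hypergeometric probability of one 2x2 table with fixed margins.

    a, b: spectral counts of the target protein in categories 1 and 2.
    c, d: spectral counts of all other proteins in categories 1 and 2.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    return comb(row1, a) * comb(row2, c) / comb(row1 + row2, col1)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided p-value (assumed convention, see lead-in)."""
    row1, row2, col1 = a + b, c + d, a + c
    observed = table_probability(a, b, c, d)
    # With fixed margins, the top-left cell x determines the whole table.
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_value = 0.0
    for x in range(lowest, highest + 1):
        prob = table_probability(x, row1 - x, col1 - x, row2 - col1 + x)
        # Small relative tolerance guards against floating-point ties.
        if prob <= observed * (1 + 1e-9):
            p_value += prob
    return min(p_value, 1.0)

if __name__ == "__main__":
    # Target protein: 3 spectra in category 1, 1 in category 2;
    # all other proteins: 1 and 3 spectra.
    print(fisher_exact_two_sided(3, 1, 1, 3))
```

Only the standard library is used, so the sketch can be checked by hand on small tables; for real spectral counts a vetted statistics package would be the safer choice.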
29. See The Samples View. See The Proteins View. See The Similarity View. See The Quantify View. See The Publish View. See The Statistics View. FDR Dashboard. Scaffold calculates the False Discovery Rate (FDR) for both peptides and proteins and reports the values in the FDR Dashboard, located underneath the navigation pane. Protein and peptide FDR values are reported based on the specific protein and peptide thresholds selected in the Filtering pane. Figure 4.25: FDR Info Box. Red background: searches run with a decoy concatenated database (example shown: 21 proteins at 99.0% minimum, 2 min peptides, 10.0% Decoy FDR; 11359 spectra at 95.0% minimum, 0.00% Decoy FDR). Blue background: searches run with a target database (example shown: 7 proteins at 99.0% minimum, 2 min peptides, 0.0% Prophet FDR; 447 spectra at 95.0% minimum, 0.27% Prophet FDR). Depending on the type of database used to search the data loaded in Scaffold, the FDR is calculated in the following ways. When the search is performed against a target database, the FDR is calculated with protein and peptide probabilities estimated using PeptideProphet and ProteinProphet. The FDR dashboard where the values are reported appears with a blue background. When the search is performed against a decoy or reversed concatenated d
30. [Figure: Quantitative Scatterplots pane, with Q-Q Scatterplot and Mean Deviation Scatterplot tabs.] The Venn Diagram is interactive. When a region of the diagram is selected, the protein accession numbers, peptide sequences, or spectra (peptide sequence and nominal charge) are displayed in the table next to the Venn diagram. When a region of the Venn diagram is double-clicked, Scaffold switches to The Samples View and applies an Advanced Filter so that only the proteins in the selected region are displayed. The Search input box becomes highlighted in yellow and displays Advanced. To remove the advanced filter applied by the Venn Diagram, the User needs to clear the contents of the highlighted yellow input box or simply double-click outside the Venn Diagram in the diagram pane. Gene Ontology Terms pane. The Gene Ontology Terms pane is populated only when GO terms have been searched and found; see Apply GO Annotations, Apply NCBI, Configure GO annotation Sources, and Edit GO Term Options. When the terms have been added, each protein displayed in the Samples Table may show one or many Gene Ontology Terms describing it. These ter
31. [Figure: peptide table, protein sequence, and spectrum panes for CRBB3_BOVIN (P19141, Beta-crystallin B3).] For each peptide the user can also see its charge, mass, and position in the peptide sequence; associated confidence scores from other search engines; and modifications, if any. With all this information available at a glance or at a mouse click, the user can have confidence in the results and organized evidence to document the various findings. Similarity View. The Similarity View shows how the different peptides are shared in the various hierarchical gr
32. Scaffold: The Quantitative Value pane, where spectrum counts can be viewed for a selected protein. A drop-down list allows the User to choose which protein's spectrum counts are displayed. The Quantitative Scatterplots pane, where the User can check the degree of error associated with the spectral count measurements. The Venn diagram pane, where the User can see the relationship between or among categories of proteins, exclusive distinct peptides, or exclusive distinct spectra identifications. The Gene Ontology Terms pane, where the User can see a pie chart displaying the GO terms for the overall Scaffold experiment or select the Bar Charts tab to view the GO terms by category. Figure 2.7: Quantify View.
33. Scaffold was not able to apply proper parsing rules to connect the accession numbers appearing in the search results to the database used when loading. Selecting another database or re-parsing the database used in the loading phase typically resolves the problem. Loading data searched using multiple databases: In this case, data is typically loaded by selecting one of the databases used for the search. Proteins identified with the other databases are not correctly parsed, and their molecular weights appear as question marks. This problem can be resolved by selecting and applying the other databases used in the analyses. Scaffold picks up the unidentified proteins, resolves the question marks appearing in the molecular weight column, and retrieves the protein sequences appearing in the Peptides View. When selected, the menu option opens the Select Database dialog, showing the list of FASTA databases currently loaded in Scaffold. The User can then choose a different database and apply it to the current list of proteins. The functions in this dialog are the same as those in Edit FASTA Databases. Load and Analyze Queue. This command is available only when there are files present in the Loading Queue shown in The Load Data View. When selected, it opens the Load and Analyze Data page of the Loading Wizard, where the User can select the proper loading parameters and load the data in Scaffold. Reset Peptide Validation. In the Prote
34. [Figure: Samples table listing serum albumin precursor, ovotransferrin, and other identified proteins.] Protein Identification Probability: Scaffold's calculated probability that the protein identification for any of the MS Samples is correct. Results are color-coded to indicate significant differences in protein ID confidence. Percentage Coverage: The percentage of all the amino acids in the protein sequence that were detected in the sample. Percentage of Total Spectra: The number of spectra matched to a protein, summed over all MS Samples, as a percentage of the total number of spectra in the sample. Exclusive Unique Peptide Count (corresponding to Number of Unique Peptides in Scaffold 3): The number of different amino acid sequences, regardless of any modification, that are associated with a single protein group or PEG. Total Unique Peptide Count (only available with the clustering algorithm selected): Number of different amino acid sequences that are associ
35. a software key. As a command-line batch program, ScaffoldBatch is intended to be called from a batch script. In the Microsoft world this might be a BAT file; in the Linux world this might be a SH file. ScaffoldBatch is driven by an XML file (SCAFML) that specifies all the operations needed to create an SF3 Scaffold experiment file. For more detailed technical information about how to install and run ScaffoldBatch, the User can consult the ScaffoldBatch manual at www proteomesoftware com pdf scaffold_batch users guide pdf. How Scaffold structures data. Scaffold stores all the data related to an experiment in one single file. Each experiment file (SF3) can hold up to 600,000 spectra and associated data. The User can create, name, and save as many experiment files as disk space permits, but only one at a time can be opened with full Scaffold capabilities. Multiple experiments can be opened in Viewer mode. Experimenters frequently categorize biological samples in larger groups to compare, for example, diseased with control, treated with control, day1 with day2, or pregnant with not pregnant. To capture this, Scaffold associates a sample category with each biological sample. Data associated with a biological sample (abbreviated in Scaffold as BioSample) comes from a sample taken by a doctor, medical researcher, or biologist, such as a drop of blood or biopsy from a patie
36. able to apply GO terms to the protein list. The GO Annotations tab also includes a search box and the Import Annotations button to import GO databases. Figure 4.10: GO annotations tab. The Import Annotations button opens a dialog through which the User can import GO databases in Scaffold. A pull-down menu directs Scaffold to different locations where the GO Database can be downloaded. Figure 4.11: Add GO Annotations Database dialog. The dialog states that Scaffold can annotate proteins from SwissProt or IPI using the Gene Ontology Annotations Database, that the complete download (all proteomes) takes 4 hours and 35 GB, that the complete human subset takes 5 minutes and 100 MB, and that the GOA site has more information. The pull-down list includes the following items: All Proteomes pro
37. and older, when the clustering option is not selected during the loading phase, or when data is already loaded in Scaffold and the option Use Protein Cluster Analysis is not selected in the Edit Experiment window. Generally, the Legacy Protein Grouping algorithm groups proteins using a table very similar to the one shown in the Similarity View when the Protein Cluster Analysis option is not selected (see Figure 8.7). Figure 8.7: Scaffold Legacy Similarity View.
38. are ways to filter the data which are independent of filtering the data on the peptide and protein probabilities calculated by Scaffold. Min Peptide Length: Filters out peptides shorter than the minimum peptide length. This filter can be used to exclude short peptides, which are seldom unique to a single protein. These short peptides may cause a very large number of similar proteins to be displayed in the Similarity View. Note: Most search engines (Mascot, Sequest, X! Tandem, etc.) have a minimum peptide length filter option. The Scaffold minimum peptide length filter is only useful if this filtering was not done on the search engine. Using the minimum filter option on the search engine will reduce the processing and file sizes in Scaffold. FDR Filtering. Scaffold allows the User to filter on peptide and/or protein FDR rates when analyzing results of a decoy search. When search results that include decoy matches are loaded in Scaffold, the Peptide and Protein Threshold pull-down lists include FDR values among their selectable values. In addition, it is possible to type a custom FDR threshold directly into the box by adding FDR to the end of the string, e.g. 10.3% FDR. FDR filtering in Scaffold works by finding the combination of peptide and protein probability thresholds that maximizes the number of proteins identified without exceeding the FDR thresholds a
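The threshold search described for FDR filtering can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the data layout, the function names, the candidate threshold grid, and the decoys-divided-by-targets FDR estimate are hypothetical and are not taken from Scaffold.

```python
from itertools import product

def fdr(decoys, targets):
    """Simple decoy-based estimate (assumed): decoy hits / target hits."""
    return decoys / targets if targets else 0.0

def evaluate(psms, proteins, pep_thr, prot_thr, min_peptides):
    """Count target proteins and estimate FDRs at one threshold pair.

    psms: list of (protein_id, peptide_probability).
    proteins: dict protein_id -> (protein_probability, is_decoy).
    """
    passing = [(pid, p) for pid, p in psms if p >= pep_thr]
    pep_decoys = sum(1 for pid, _ in passing if proteins[pid][1])
    pep_targets = len(passing) - pep_decoys
    counts = {}
    for pid, _ in passing:
        counts[pid] = counts.get(pid, 0) + 1
    kept = [pid for pid, n in counts.items()
            if n >= min_peptides and proteins[pid][0] >= prot_thr]
    prot_decoys = sum(1 for pid in kept if proteins[pid][1])
    prot_targets = len(kept) - prot_decoys
    return (prot_targets,
            fdr(pep_decoys, pep_targets),
            fdr(prot_decoys, prot_targets))

def best_thresholds(psms, proteins, grid, max_pep_fdr, max_prot_fdr,
                    min_peptides=1):
    """Return (n_proteins, peptide_threshold, protein_threshold) or None."""
    best = None
    for pep_thr, prot_thr in product(grid, grid):
        n, pep_fdr, prot_fdr = evaluate(psms, proteins, pep_thr,
                                        prot_thr, min_peptides)
        if pep_fdr <= max_pep_fdr and prot_fdr <= max_prot_fdr:
            if best is None or n > best[0]:
                best = (n, pep_thr, prot_thr)
    return best

if __name__ == "__main__":
    proteins = {"A": (0.99, False), "B": (0.95, False),
                "C": (0.60, False), "D": (0.55, True)}
    psms = [("A", 0.99), ("A", 0.97), ("B", 0.96), ("C", 0.70), ("D", 0.65)]
    print(best_thresholds(psms, proteins, [0.5, 0.9], 0.0, 0.0))
```

An exhaustive grid search is used only because it is easy to follow; it makes no claim about how Scaffold actually explores the threshold space.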
39. as presented in Scaffold. In particular, the User should be cautious about drawing conclusions about differential abundances for proteins whose spectral counts are small. Scaffold tries to mitigate this problem through its treatment of missing values. Missing values. For differential protein expression tests, Scaffold replaces missing values with a specified Minimum Value. Whenever a sample has no assigned spectra for a specific protein and that protein is found in a different sample, the specified minimum value is used to calculate the normalized values. Minimum Value. The Minimum Value option allows the User to set a floor when calculating label-free quantitative values. Higher values will output shorter lists of highly confident changes; lower values will output longer lists that may contain less confident changes. The minimum value is set by the User in the Quantitative Analysis dialog through the Minimum Value pull-down list, and it defaults to 0. When the option Other is selected from the list, the Set Minimum Value dialog opens, allowing the User to record and use a minimum value different from the ones shown in the pull-down. Figure 9.2: Setting the minimum value.
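The missing-value rule above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and the dictionary layout are hypothetical, and a missing value is represented here as a count of zero.

```python
def apply_minimum_value(counts, minimum_value):
    """Replace missing (zero) values with the chosen Minimum Value.

    counts: dict mapping protein -> list of quantitative values,
            one entry per sample (assumed layout).
    A zero is replaced only when the same protein was found in at
    least one other sample, as the text describes.
    """
    result = {}
    for protein, values in counts.items():
        found_somewhere = any(v > 0 for v in values)
        result[protein] = [
            minimum_value if (v == 0 and found_somewhere) else v
            for v in values
        ]
    return result

if __name__ == "__main__":
    counts = {"P1": [0, 5], "P2": [0, 0]}
    print(apply_minimum_value(counts, 0.5))
```

With a minimum value of 0.5, protein P1 becomes 0.5 versus 5, so a ratio between the two samples is defined (a tenfold change) instead of requiring a division by zero.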
40. by selecting New from the Main menu commands or using the selections available in The Load Data View. The following document, published on the Proteome Software website, provides detailed information on search engine data files compatible with Scaffold, Scaffold Q+, or Scaffold Q+S: File_compatibility_matrix.pdf. Quantitative Data. Scaffold Q+ and Q+S are Proteome Software's labeling quantitation software packages. Q+ loads iTRAQ (Applied Biosystems) and Tandem Mass Tag (TMT, Thermo Scientific) labeled data. Q+S can also load stable-isotope-labeled samples. A User who has purchased Scaffold Q+ or Scaffold Q+S uses Scaffold's file importing wizard to load the search results of the labeled data. Quantitative Data File compatibility. Please check the file compatibility matrix at the following link: File_compatibility_matrix.pdf. Characterizing data. The data imported in Scaffold are the results of a previous search against a particular FASTA protein database using a particular search engine (SEQUEST, Mascot, X! Tandem, or others) on a particular set of data. When the User imports these data, Scaffold needs to know certain characteristics of the specific search, so in the loading phase the User will be asked to specify the database and to specify the parameters used for the search. Specify the Database. As part of the loading process t
41. choosing algorithms that are less sensitive to missing values; for example, it uses the geometric mean rather than the average in calculating protein-level fold changes. Calculation of Precursor Intensities. Precursor Intensity Quantitation is based on the principle that the area of the peak in the MS1 chromatogram provides a measure of the relative abundance of the corresponding peptide in the sample. Peptides are identified based on their MS/MS spectra, and then the corresponding MS1 peaks are identified in each LC-MS/MS run. The areas under these peaks are calculated and normalized, and their ratios are used as a measure of the relative abundance of the peptides in different samples. Relative quantities of proteins are estimated by combining the precursor intensities of the constituent peptides in various ways. The following illustration of the typical LC-MS/MS analysis of a peptide is reproduced from Lai 2013. Figure 10.1: Identification of a peptide through LC-MS/MS analysis. Figure 10.1(a): The peptide is eluted from the LC column and its ion intensity is plotted as a function of the retention time.
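To make the geometric-mean remark concrete, the following Python sketch combines per-peptide intensity ratios into a protein-level fold change. It is only an illustration under assumptions: the function name, the dictionary layout, and the choice to skip peptides missing from either sample are hypothetical and are not Scaffold's documented procedure.

```python
import math

def protein_fold_change(intensities_a, intensities_b):
    """Geometric mean of peptide intensity ratios (sample B over sample A).

    intensities_a, intensities_b: dict peptide -> precursor intensity.
    Peptides that are absent or non-positive in either sample are
    skipped (an assumption of this sketch).
    """
    ratios = [
        intensities_b[pep] / intensities_a[pep]
        for pep in intensities_a
        if pep in intensities_b
        and intensities_a[pep] > 0
        and intensities_b[pep] > 0
    ]
    if not ratios:
        return None
    # Geometric mean computed as exp of the mean log-ratio.
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

if __name__ == "__main__":
    a = {"pep1": 100.0, "pep2": 400.0}
    b = {"pep1": 200.0, "pep2": 3200.0}
    print(protein_fold_change(a, b))
```

In this example the peptide ratios are 2 and 8: the arithmetic mean would give 5, while the geometric mean gives 4, which damps the influence of a single large ratio.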
42. databases provided in the downloads are subset databases that will allow the tutorial searches to complete in a relatively short time. They do not necessarily generate complete protein identification results. Note: Further example exercises, with guided explanations detailing different aspects of the way Scaffold can be used, are available at http://www.proteomesoftware.com/products/scaffold under the section Scaffold Tutorials in the left side menu. Organization of the manual. In addition to this Preface, the Scaffold User's Manual contains the following chapters: Chapter 1, Getting Started with Scaffold, which explains the tiered license structure for the Scaffold application suite. It also explains how to start Scaffold and details the different types of data that can be analyzed in Scaffold, Scaffold Q+, and Scaffold Q+S. Chapter 2, Identifying Proteins with Scaffold, which introduces the different views available in Scaffold to help mass spectrometrists and medical researchers confidently identify proteins in biological samples. Chapter 3, How Scaffold Structures Data, which introduces the User to the way Scaffold thinks about an experiment, the type of data it loads, and what can be done to take an in-depth look at the search results loaded in the program. Chapter 3, Loading Data in Scaffold, which guides the first time Use
43. each sample is about the same. It is not appropriate if the total amount of protein varies widely from one sample to the next. In Scaffold there are two levels of summarization: the MS level, which shows the samples run through the mass spectrometer, and the BioSample level, where BioSamples can contain one or more MS samples. Frequently the biological sample (or BioSample in Scaffold) is fractionated into multiple MS samples. Scaffold allows the User to view the MS samples within a BioSample or to combine all the MS samples into a single sample using the MuDPIT option. Normalization is performed at the MS sample level. The normalization scheme in Scaffold adjusts the sum of the selected quantitative value for all proteins in the list within each MS sample to a common value: the average of the sums of all MS samples present in the experiment. This is achieved by applying a scaling factor for each sample to each protein or protein group, thereby adjusting the selected value to a normalized Quantitative Value. Note: Because Precursor Intensities operate at the peptide level, several spectra may show the same intensity value. In the normalization scheme only one value is considered for calculation purposes; for more information see Precursor Intensity Quantitation in Scaffold and Performing Quantitation in Scaffold. Note on low-abundance peptides: The normalization method used in Scaffold, as mentioned a
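The normalization scheme described above (scale each MS sample so that its total equals the average of all sample totals) can be sketched in Python. The function name and the nested-dictionary layout are assumptions made for this illustration; leaving a sample with a zero total unscaled is also a choice of the sketch, not something the text specifies.

```python
def normalize_ms_samples(samples):
    """Scale every MS sample to a common total.

    samples: dict ms_sample_name -> dict protein -> quantitative value.
    Returns (normalized_samples, scaling_factors).
    """
    sums = {name: sum(values.values()) for name, values in samples.items()}
    # Common value: the average of the per-sample sums.
    target = sum(sums.values()) / len(sums)
    # A sample with a zero total is left unscaled (assumption).
    factors = {name: (target / s if s else 1.0) for name, s in sums.items()}
    normalized = {
        name: {protein: value * factors[name]
               for protein, value in values.items()}
        for name, values in samples.items()
    }
    return normalized, factors

if __name__ == "__main__":
    samples = {
        "MS1": {"A": 10, "B": 30},   # total 40
        "MS2": {"A": 20, "B": 60},   # total 80
    }
    normalized, factors = normalize_ms_samples(samples)
    print(factors)
    print(normalized)
```

In the example the totals are 40 and 80, so the common value is 60 and the scaling factors are 1.5 and 0.75; after scaling, both samples report 15 for protein A and 45 for protein B, which shows that the two runs differ only in overall loading.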
44. Discoverer Mascot: The User can select either Use Ion Score Only or Use Ion Identity Scoring. For PD version 1.3 and above, we were able to identify the Mascot parameters that affect the amount of information recorded in the MSF files once the search has ended. Suggestions for adjusting them are given in Configuring Proteome Discoverer: Sequest and Mascot. Note: Selecting Use Ion Score Only provides a list of proteins different in length from the one obtained when Use Ion Identity Scoring is selected. Configuring Proteome Discoverer: Sequest and Mascot. When searching MS data using Thermo Proteome Discoverer (PD), it is possible to adjust the amount and type of information stored in the output files (MSF files). Some of the settings used to reduce the quantity of stored information in the output files have default values that discard most of the low-hit spectra. When loading these types of MSF files in Scaffold, the User might encounter inconsistencies in the way Scaffold assigns probabilities to the peptides and proteins. To run the different scoring algorithms, such as the LFDR and PeptideProphet algorithms, in a statistically meaningful way, Scaffold needs a certain number of false-hit spectra included in the imported data. With MSF files created using the default settings in PD 1.4 and older, it is quite clear that many of the false-hit spectra are discarded and so are not saved in the search result fi
45. feature and sort the display by clicking on any column header. For example, to sort the proteins by increasing molecular weight, click the Molecular Weight column header once. To sort the proteins by decreasing molecular weight, click the Molecular Weight column header twice. To return to the default display, click the Molecular Weight column header a third time. Proteins of Interest. The User can mark proteins in an experiment that are of special interest by clicking the Star icon in the Starred column for the protein. Two different colored stars (blue and orange) and a combination of them are available by clicking multiple times on the same star or by selecting the option star in the right-click menu. By using a combination of different stars it is possible to create three different sets of proteins of interest. You can then bring these proteins to the top of the display by clicking the Starred column header. To return to the default protein display, click the column header twice more. Figure 6.8: Samples View, starring proteins.
46. filename tutorial 3seq. Scaffold appends the suffix SF3 to your experiment files. Specifying the FASTA database. Figure 3.11: Searched Database pane. 1. Specify the FASTA database that is associated with these sample files. You can select from a list of existing FASTA databases shown in the pull-down menu, or you can add a new FASTA database. To use an existing database, just select it from the list (Figure 3.12). If you are adding a new database, continue to
47. filter or define custom filters. Protect Changing Display Settings: Password required to hide or display hidden pages. Protect Hidden Proteins: Password required to access View > Show Hidden Proteins. Paths. The Scaffold installation comes with a generic UNIMOD database, typically used to unify the naming of modifications among different search engines when their results are loaded in the same Scaffold experiment. This tab allows the User to select alternative UNIMOD databases when loading data. Do not use UNIMOD: This option tells Scaffold to retrieve the information about modifications directly from the search engine results that are being loaded. Use Scaffold default UNIMOD: This option tells Scaffold to use the default UNIMOD database. Use a custom UNIMOD file: This option allows the User to direct Scaffold to a location where a custom UNIMOD database is available and to retrieve the modification information from the selected database. Advanced Preferences. To have Scaffold correctly compute the discriminant score used by the PeptideProphet algorithm (see Keller 2002), the loaded data files need to contain specific pieces of information. Depending on how the search engine parameters are set, it might happen that some vital information used to properly calculate the discriminant score is discarded
48. for download. If security is enabled on the Mascot server, a login window pops up asking for account name and password. The User should make sure that the account used in Mascot has administrative privileges. Tip: Edit > Preferences > Mascot Server allows the user to create a default connection to a Mascot Server of choice. Scaffold automatically logs in to the server specified in the settings. Figure 5.6: Queue files from Mascot Server for Loading. The dialog can be divided into 5 different panes containing a number of tools to help the User smoothly select and load data files into Scaffold. Connection and filtering pane: contains information about the status of the connection to the Mascot Server, spelled out in the Mascot Server address text box. When the server is not connected, this is the only text box available in the dialog (see Figure 5.5). Below the connection information there are three different f
49. Note: Files can also be added to a BioSample directly from the Mascot Server with the Queue Files from Mascot Server command; see Queue Files From Mascot Server for Loading. Queue Files for Loading. Selecting this command opens a standard file browser. From there the User can navigate to the location where the data files to be loaded in Scaffold are stored. Once the files are selected, Scaffold places them in the Loading Queue. The Queue Files For Loading command can be selected from the following locations in the program: the Experiment menu; the Load Data View; the Queue Files for Loading page in the Wizard. The User should be able to reach the files of interest from the system where Scaffold is installed and should also make sure that the format of the data files is supported by Scaffold by consulting the file_compatibility_matrix.pdf. When the command is selected outside of the Loading Wizard and the User has multiple biological samples already defined, the User should make sure that the right biological sample is chosen in the Load Data view before beginning to queue files. Multiple files can be selected at one time as long as they contain search results of data run against the same FASTA database. The User is asked to specify the database in the Load and Analyze Data page of the Wizard. Queue Structured Directory for Loading. This command allows the User to streamline the loading o
50. improved performance. [Screenshot: Files in Loading Queue and Files Currently Loaded panes.] 4. Navigate to the directory in which you saved your sample data set and FASTA database, select the FASTA database, and then click Open. If you are carrying out this procedure using the sample tutorial_3seq data provided by Proteome Software, navigate to where you have saved the subset FASTA database swissprot_bovine.fasta. If you are carrying out this procedure using the sample tutorial_3mas data provided by Proteome Software, navigate to where you have saved the subset FASTA database control_sprot.fasta. The Parsing Method dialog box opens. You use the options on this dialog box to select the parsing rules that display protein accession numbers and protein descriptions in the correct format. Figure 3-15: Parsing Method dialog box. [Screenshot: "Select a method for parsing your database", a "How should we identify decoys?" field listing REVERSE, RANDOM, decoy, and the buttons Auto Parse, Use Regular Expressions, and Cancel.] 5. Do one of the following: Click Auto Parse to have Scaffold decide on the parsing rules to use. If you are parsing a database that contains decoys, make sure the decoy identification tag is included in the "How should we identify Decoys" list shown in the dialog box
51. minimum. Protein Thresholds: 99.0% minimum and 2 peptides minimum. DATABASE SEARCHING: Tandem mass spectra were extracted by unknown version unknown. Charge state deconvolution and deisotoping were not performed. All MS/MS samples were analyzed using Mascot (Matrix Science, London, UK; version Mascot). Mascot was set up to search the NCBinr_20050928 database (selected for Metazoa, unknown version, 819710 entries) assuming the digestion enzyme trypsin. Mascot was searched with a fragment ion mass tolerance of 0.20 Da and a parent ion tolerance of 0.30 Da. Iodoacetamide derivative of cysteine was specified in Mascot as a fixed modification. Deamidation of asparagine and glutamine and oxidation of methionine were specified in Mascot as variable modifications. CRITERIA FOR PROTEIN IDENTIFICATION: Scaffold (version Scaffold_4.0.0 alphate experimental7 16, Proteome Software Inc., Portland, OR) was used to validate MS/MS-based peptide and protein identifications. Peptide identifications were accepted if they could be established at greater than 95.0% probability as specified by the Peptide Prophet algorithm (Keller, A. et al., Anal. Chem. 2002; 74(20):5383-92). Protein identifications were accepted if they could be established at greater than 99.0% probability and contained at least 2 identified peptides. Protein probabilities were assigned by the Protein Prophet algorithm (Nesvizhskii, A.I. et al., Anal. Chem. 2003; 75(17):4646-58). Proteins
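The acceptance criteria quoted in the publication text above can be illustrated with a minimal Python sketch. The function name, argument layout, and the use of fractional probabilities (0.95 for 95.0%) are assumptions made for this illustration; this is not Scaffold's own code, and it assumes each listed probability belongs to a distinct peptide.

```python
def passes_thresholds(protein_prob, peptide_probs,
                      protein_threshold=0.99,
                      peptide_threshold=0.95,
                      min_peptides=2):
    """Return True if a protein meets the quoted acceptance criteria.

    protein_prob  -- protein probability as a fraction (hypothetical input)
    peptide_probs -- probabilities of the protein's distinct peptides
    """
    # Peptides count only if they are above the peptide threshold
    # ("greater than 95.0% probability" in the text).
    accepted = [p for p in peptide_probs if p > peptide_threshold]
    # The protein needs a probability above its own threshold AND
    # at least `min_peptides` accepted peptides.
    return protein_prob > protein_threshold and len(accepted) >= min_peptides


# Hypothetical values, invented for this sketch.
example_ok = passes_thresholds(1.00, [0.99, 0.97, 0.50])
example_too_few = passes_thresholds(1.00, [0.99, 0.50])
example_low_protein = passes_thresholds(0.98, [0.99, 0.97])
```

With these invented inputs, only the first protein passes: the second has a single peptide above the peptide threshold, and the third falls below the protein threshold.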
52. of ATP returns both ATP synthase and calcium-transporting ATPase. In another example, a search string of sodium transport returns all values that have sodium and/or transport in the protein name: Sodium/potassium-transporting ATPase, Calcium-transporting ATPase, and so on. Click the magnifying glass button to the right of the search text box for more advanced search features (see Advanced Search). Advanced Search: When the magnifying glass button on the right-hand side of the search text box in the Samples View is clicked, the Configure Advanced Protein Filter dialog opens. Figure 6-17: Samples View, Configure Advanced Protein Filter. The dialog text reads: Search for proteins based on accession number, protein name, peptide sequences or sub-sequences, and spectrum ID names. All searches support regular expressions. For example, you can find possible CaMKII phosphorylation sites by searching for peptides with the "R..[ST]" motif. Please visit this site for more help with forming regular expressions. Warning: Searching by spectrum name can be extremely slow. It works best on small data sets. The dialog offers the following controls: Matching All of the text filters; Accession; Protein Name; Protein Sequence Motif; Identified Peptide Sequence; Spectrum Name (SLOW); Found in All of the Categories; Absent from the Categories; Add Categor
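As an illustration of motif searching with regular expressions, here is a minimal Python sketch. It assumes the motif is written `R..[ST]` (an arginine, any two residues, then serine or threonine, following the commonly cited CaMKII consensus R-X-X-S/T). The peptide list and function name are invented for the example, and Python's `re` syntax may differ in detail from the regular-expression dialect Scaffold uses.

```python
import re

# Hypothetical peptide sequences, invented for this sketch.
peptides = ["AKRQASLLK", "GDFGADAQGAMTK", "LRNVTEELK", "HGTVVLTALGGILK"]

# Assumed pattern: R, any two residues, then S or T.
motif = re.compile(r"R..[ST]")


def find_motif_sites(sequence):
    """Return (0-based start, matched text) for each non-overlapping hit."""
    return [(m.start(), m.group()) for m in motif.finditer(sequence)]


# Keep only the peptides that contain the motif at least once.
hits = {p: find_motif_sites(p) for p in peptides if motif.search(p)}
```

Here `hits` holds two peptides: `AKRQASLLK` (match `RQAS` at position 2) and `LRNVTEELK` (match `RNVT` at position 1). The other two sequences contain no arginine and are skipped.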
53. of decoys included in each of them. A database in the list might appear highlighted in one of several colors as a warning. Pink highlight: missing database; Scaffold cannot connect with the database using the currently stored information. Grey highlight: out-of-date database; the database indexing has not been updated for the Scaffold version in use. When a highlighted database is selected, the Fix button appears. Depending on the issue, clicking Fix either calls the parsing method directly to build or rebuild the related index file, or asks for the location of the database if it was moved. Add Database button: This button adds a new database to Scaffold. When it is selected, a file browser appears, allowing the User to point Scaffold to the location where the database to be loaded is stored. Once the FASTA file is selected, the Parsing Method Dialog appears. Edit button: To edit one of the existing databases, the User selects a name from the Loaded Databases table and clicks Edit. The Parsing Method Dialog appears. Delete button: To delete one of the existing databases, the User selects a name from the Loaded Databases table and clicks Delete. After parsing rules are applied or databases are deleted, the User can click OK. Parsing Method Dialog. Figure 4-6: Database Parsing Method dialog. Select a method for parsing your database
54. off is 0. As an example, Figure 8-4 shows peptides shared by multiple proteins, with their related weights listed in the Similarity View. Figure 8-4: Similarity View, Peptide weights. [Screenshot: Myoglobin group (MYG_ELEMA) with peptides such as ADIAGHGQEVLIR, ALELFR, GDFGADAQGAMTK, GLSDGEWQQVLNVWGK and HGTVVLTALGGILK and their percentage weights.] These weights are also displayed in the Proteins View Peptide Table in the column titled Apportionments, which replaces the traditional Assigned column when the cluster grouping model is used. Protein Paring: Next, the protein list is pared down according to the principle of parsimony. As in the case of the Legacy Protein Grouping, the Shared Peptide Grouping algorithm thins down the list of proteins by eliminating any for which there is no independent evidence. However, independent evidence is defined differently in the two grouping algorithms. In the Shared Peptide Grouping, a protein is considered to have independent evidence when it contains at least one exclusive unique peptide. Proteins for which there is no exclusive evidence are then eliminated from the protein identification list. This process can best be seen in Scaffold's Similarity View. Here all proteins sharing peptide evidence are assembled into a table. Proteins with ex
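The paring rule described above (keep a protein only if it has at least one peptide that no other protein contains) can be sketched in a few lines of Python. This is a single-pass illustration under simplifying assumptions: the data structure and function name are hypothetical, peptide weights are ignored, and Scaffold's actual implementation may differ.

```python
from collections import Counter


def pare_proteins(protein_peptides):
    """Keep proteins that have at least one exclusive peptide.

    protein_peptides -- dict mapping protein name to a list of peptides.
    Returns a dict mapping each retained protein to its exclusive peptides.
    """
    # Count in how many proteins each peptide occurs.
    counts = Counter(
        pep for peps in protein_peptides.values() for pep in set(peps)
    )
    retained = {}
    for protein, peps in protein_peptides.items():
        exclusive = sorted(p for p in set(peps) if counts[p] == 1)
        if exclusive:  # independent (exclusive) evidence exists
            retained[protein] = exclusive
    return retained


# Hypothetical example: protein C is explained entirely by a shared peptide.
example = {
    "A": ["pep1", "pep2"],
    "B": ["pep2", "pep3"],
    "C": ["pep2"],
}
pared = pare_proteins(example)
```

In this invented example, proteins A and B survive because `pep1` and `pep3` are exclusive to them, while C is dropped because its only peptide is shared.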
55. or collapsing all of the protein clusters displayed in the Samples View. The initial default option is Use Protein Cluster Analysis. Use standard experiment-wide protein grouping: When selected, Scaffold groups proteins across all MS samples and BioSamples. Use legacy independent sample protein grouping: When selected, Scaffold groups proteins only within each MS sample. Each MS sample appears as if it was loaded independently. If you are carrying out this procedure using the sample tutorial_3seq data or the sample tutorial_3mas data provided by Proteome Software, select the option Use standard experiment-wide protein grouping. For more information on the grouping and clustering algorithms used in Scaffold, see Chapter 8, Protein Grouping and Clustering. Protein Annotations pane: Includes options for searching the Gene Ontology annotations (GO terms) during loading: Don't Annotate, and Fetch GO annotations remotely. If the GOA database is not configured, the option appears grayed out. To activate it, click the link Configure GO Source and select a GOA database from the Edit GO Term Options > GO Annotation Databases pane. If the database you are searching for is not available, click New database and import the GOA database of interest. If you are carrying out this procedure using the sample tutorial_3seq data or the sample
56. row in the table for each protein group or protein cluster that has at least one identified MS Sample passing the assigned filter threshold requirements. To display MS Samples that do not meet the confidence requirements, the User can select Show Lower Scoring Matches from the View menu. Figure 6-4: Samples View, MS Samples View. [Screenshot: Samples Table with one probability column per MS Sample (bovine_spot_07 through bovine_spot_19) for the 7 identified proteins, including P19141 Beta crystallin B3 (CRBB3_BOV, 24 kDa), PX1843 Beta crystallin A3, P02522 Beta crystallin B2, P07318 Beta crystallin B1, P11842 Beta crystallin A4, P26444 Beta crystallin A2, and keratin 67K type II cytoskeletal.]
57. suitable for every user. The software described herein is furnished under a license agreement or a non-disclosure agreement. The software may be copied or used only in accordance with the terms of the agreement. It is against the law to copy the software on any medium except as specifically allowed in the license or the non-disclosure agreement. The name Proteome Software, the Proteome Software logo, Scaffold, Scaffold Q+, Scaffold Q+S, and the Scaffold, Scaffold Q+, and Scaffold Q+S logos are trademarks or registered trademarks of Proteome Software, Inc. All other products and company names mentioned herein may be trademarks or registered trademarks of their respective owners. Customer Support: Customer support is available to organizations that purchase Scaffold, Scaffold Q+, or Scaffold Q+S and that have an annual support agreement. Contact Proteome Software at: Proteome Software, Inc., 1340 SW Bertha Blvd., Suite 10, Portland, OR 97219; 1-800-944-6027 (Toll Free); 1-503-245-4910 (Fax); www.proteomesoftware.com. Table of Contents: Preface; Chapter 1, Getting Started with Scaffold; Chapter 2, Identifying Proteins with Scaffold; Chapter 3, Loading Data in Scaffold; Chapter 4, The Scaffold Win
58. the MSF output files. Those parameters are located in the 1 1 Peptide Scoring Options section, visible only when Show Advanced Parameters is selected. We advise the User to adjust the following parameters to their minimum value: Peptide Cut Off Score: 0; Peptide Without Cut Off Score: 0. Figure 4-16: Mascot Advanced Parameters, Peptide Scoring Options in Proteome Discoverer. [Screenshot: Parameters panel listing Input Data (Protein Database, Enzyme Name, Maximum Missed Cleavage, Instrument), Peptide Cut Off Score, Precursor Mass Tolerance 10 ppm, Fragment Mass Tolerance 0.8 Da, Use Average Precursor Mass False, Modification Groups From Quan Method, and Dynamic Modifications.] PD Sequest HT suggested settings: In Proteome Discoverer 1.4 there is a new version of Sequest available, called Sequest HT. As for regular Sequest and Mascot, the workflow settings for Sequest HT include a parameter that determines the amount and type of spectra saved in the MSF output files. That parameter is located in the 2. Scoring Options section. We advise the User to adjust the following parameter to its minimum value: Max Delta Cn: 0. Figure 4-17: Sequest H
59. the possible configurations of the two-way table that have p-values less than or equal to that of the initial target protein two-way table, which means integrating over the tail of the distribution. Multiple tests: The statistical quantitative tests available in Scaffold give some measure of how different the BioSamples in the various categories are. These measures are either a ratio, a coefficient of variance, or a p-value. How big do these measures have to be before they are significant? The naive approach is to set an arbitrary value, say a 2-fold change or a p-value below a significance level of 0.05. This is what Scaffold does when it colors the fold change or p-value green. This probably includes some proteins that should not be labeled as differentially expressed but have gotten on the list by chance. A better approach is to sort the proteins by their p-value so that the small p-values are on top. The proteins at the top of the list are the most likely to be accurately classified as differentially expressed. As the User goes down the list, the confidence in the classification should diminish. The question becomes where the User should draw the threshold line. Scaffold leaves this to the judgment of the researcher or User. The issue is complicated because considering a set of statistical inferences all together increases the chance of falsely identifying a difference as significant
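The effect of testing many proteins at once can be made concrete with a short Python sketch. It rests on two loudly stated assumptions that the manual does not make: the tests are independent, and no protein is truly differentially expressed (every null hypothesis is true). Under those assumptions the chance of at least one false positive among m tests at significance level α is 1 − (1 − α)^m, and the expected number of false positives is α·m. The function names are invented for this illustration.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive among n independent tests.

    Assumes every null hypothesis is true and the tests are independent.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


def expected_false_positives(alpha, n_tests):
    """Expected number of tests that pass the threshold purely by chance."""
    return alpha * n_tests


# Illustrative numbers only: a significance level of 0.05 applied to
# experiments with 1, 20, and 100 tested proteins.
rates = {m: familywise_error_rate(0.05, m) for m in (1, 20, 100)}
by_chance = expected_false_positives(0.05, 1000)
```

With a single test the rate equals the significance level itself; with 100 independent tests it exceeds 0.99, and about 50 of 1000 unchanged proteins would be expected to pass a 0.05 threshold by chance. This is why sorting by p-value and choosing the cut line deliberately matters.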
60. the web or email. Condensed files with only identified spectra are generally 50% the size of uncondensed files, while condensed files without any spectra are generally 10% the size. Warning: The Save Without Any Spectra and Save MCP Required Spectra options remove TIC and precursor intensity quantitation data. There are six options for condensing data while it is saved. Save Only Identified Spectra: This option saves all the data that can be seen in the Scaffold Viewer. It does not save the spectra that were not matched to peptides. Saving with this option generally cuts the size of the SF3 file in half. Save Frozen Only Identified Spectra: This command condenses the saved output file just as the Save Only Identified Spectra option does, except that it also freezes the data in the file. Save Without Any Spectra: This option saves all the peptides and their scores but does not save any of the spectra. Save Frozen Without Any Spectra: This command condenses the saved output file just as the Save Without Any Spectra option does, except that it also freezes the data in the file. Save MCP Required Spectra: This saves only those spectra required by the proteomics journal
61. total amount of the specified peptide that eluted. Scaffold reads these values from its input files and uses them to do quantitative analysis. Preparing Data for Precursor Intensity Quantitation in Scaffold: Scaffold reads precursor intensity information from various identification programs, provided that the User has requested this type of quantitation during the search. Following are instructions for preparing input files for quantitation in Scaffold. Proteome Discoverer: Proteome Discoverer provides a workflow template for computing precursor intensity values. The template WF_LTQ_Orbitrap_Sequest_Precursor_ions_Area_Detector can be used as a starting point, and the search engine choice or instrument settings may be changed. Scaffold reads the precursor intensities from the MSF file. Mascot Distiller: When setting up the Mascot search, select Average MD as the quantitation method. When the search is complete, in Distiller select Analysis > Calculate XIC and then Analysis > Quantitate. Export the results as an XML file using Analysis > Quantitive Report > Save as XML. Also create an ROV file by saving the project with File > Save Project As. Place the ROV file and the XML file in the same directory and, if the DAT file is not accessible directly from the Mascot Server, also place that file in the same location. Load only the XML file into Scaffold. Spectrum Mill: No special settings are require
62. tutorial_3mas data provided by Proteome Software, then select Don't Annotate. 2. If you have selected to run X! Tandem, continue to Validation with X! Tandem. 3. Once all the options have been properly checked, click Load and Analyze Data if X! Tandem was not selected. A message opens indicating that the data is being loaded and analyzed. After the analysis is complete, the data opens in the Samples View. Continue to Modify make-up of BioSamples. Modify make-up of BioSamples: When the files have been loaded, they move from the import queue on the left side of the Load Data View table to the ready pane on the right. Figure 3-9: Load Data View, Files in Import Queue. [Screenshot: Load Data View for My Experiment, showing the Queue Files For Loading, Queue Structured Directories and Add BioSample buttons, Protein Grouping set to Experiment Wide, BioSample 1 with 0 spectra, and the Files in Loading Queue and Files Currently Loaded panes.]
63. [Screenshot residue: protein list rows (for example Gamma crystallin D, CRGD_BOVIN, 21 kDa; Retinal dehydrogenase, AL1A1_BOVIN, 55 kDa; Vimentin, 54 kDa) and the Gene Ontology annotation columns.] Protein Cluster: a set of protein groups created using a hierarchical clustering algorithm. The clustering algorithm is similar to the one used by Mascot to create protein families, but with more stringent grouping rules. Members of the cluster share some peptides, but not all of them. Protein clusters are by default represented by the protein group that shows the highest associated probability. Clusters can be collapsed or expanded directly in the protein list (see Figure 6-6). For m
64. Each peptide is then assigned to the protein that has the highest total probability among all those where the peptide is found (see Figure 8-9). If two or more proteins have equal total probabilities, and that value is the highest for that peptide, the peptide is assigned to all of them. Figure 8-9: Assigned peptides are shown in green, unassigned in gray. [Screenshot: spreadsheet export listing peptides such as AKWYPEVR, CVVVGDGAVGK, KLTPITYPQGLAMAK and VDSKPVNLGLWDTAGQEDYDR with their probabilities per candidate protein, followed by a row of total probabilities.] Defining Protein Groups: Now the grouping begins. Proteins with no peptides assigned are eliminated from consideration; the evidence for those proteins has already been accounted for in proteins which are more likely to be present in the analyzed sample. Proteins with the same pept
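The assignment rule above (each peptide goes to the containing protein with the highest total probability, ties go to all tied proteins, and proteins left with no assigned peptides are eliminated) can be sketched as follows. The protein names, peptide labels, and totals are invented for illustration, and the function is a hypothetical helper rather than Scaffold's implementation. Ties are detected by exact equality, which is adequate for the integer totals used here.

```python
def assign_peptides(protein_peptides, protein_totals):
    """Map each peptide to the protein(s) with the highest total probability.

    protein_peptides -- dict: protein -> set of peptides it contains
    protein_totals   -- dict: protein -> summed peptide probability
    """
    assignments = {}
    all_peptides = {p for peps in protein_peptides.values() for p in peps}
    for pep in sorted(all_peptides):
        # Proteins in which this peptide is found.
        candidates = [
            prot for prot, peps in protein_peptides.items() if pep in peps
        ]
        best = max(protein_totals[prot] for prot in candidates)
        # Every protein tied at the best total receives the peptide.
        assignments[pep] = sorted(
            prot for prot in candidates if protein_totals[prot] == best
        )
    return assignments


# Hypothetical data, invented for this sketch.
proteins = {
    "P1": {"a", "b", "c"},
    "P2": {"b", "c"},
    "P3": {"c", "d"},
    "P4": {"d"},
}
totals = {"P1": 433, "P2": 358, "P3": 285, "P4": 285}

assigned = assign_peptides(proteins, totals)
# Proteins that received no peptides are dropped from consideration.
with_evidence = {prot for prots in assigned.values() for prot in prots}
eliminated = sorted(set(proteins) - with_evidence)
```

In this example peptides a, b, and c all go to P1, peptide d is shared by the tied proteins P3 and P4, and P2 is eliminated because every peptide it contains was assigned to P1.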
65. Index (excerpt, page numbers omitted). search feature. E: Export (mzIdentML, Pride, ProtXML, Scaffold perSPECtives, ScaffoldPTM, Spectra, Subset database); Export reports. F: FASTA databases; FASTA databases in Scaffold; FDR (How Scaffold calculates); FDR filtering (Filtering Samples); Filtering samples (Custom peptide filters, FDR filtering, Minimum number of peptides, Peptide threshold, Protein thresholds). G: Gene Ontology terms pane (GO bar charts, GO pie charts). L: LFDR-based scoring system; licensing for Scaffold; Loading Wizard. M: Mouse right-click commands; mzIdentML export. O: Organization of the manual. P: Peptide report; PeptideProphet; Precursor intensity quantitation (Calculations, Mascot Distiller, MaxQuant, Performing quantitation, Preparing data for, Proteome Discoverer, Spectrum Mill); Pride export; Protein and PeptideProphet
66. [Spectrum plot residue: m/z axis labels.] Figure 10-1 (b): At the first scan time (shown in red in (a)), a full MS scan is performed. The ion with m/z 786.09 is selected as a precursor ion for MS/MS analysis. [Spectrum plot residue: MS2 spectrum of precursor 786.09 at RT 51.46; axes are relative abundance versus m/z.] Figure 10-1 (c): At the next scan (also shown in red in (a)), an MS/MS scan is performed, providing peptide fragmentation information for peptide identification. Once a peptide has been identified, a program can go back to the MS1 scans and find a series of spectra which contain peaks corresponding to the same peptide as it continues to elute from the column. These spectra are then aligned, and the intensities of the peaks for the specific m/z value which represents the parent ion of this peptide are plotted against the retention time, giving an extracted ion chromatogram. Figure 10-2: The intensities of the MS peaks at the same m/z value are plotted as a function of the retention time. The area under this curve (enclosed in red) is the precursor ion intensity. In the extracted ion chromatogram, a curve is fit to the intensities at a specific m/z. The area under this curve represents the
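To make the idea of "area under the extracted ion chromatogram" concrete, the following Python sketch integrates intensity over retention time with the trapezoidal rule. This is a deliberate simplification: the text describes fitting a curve to the intensities, whereas this sketch simply connects the measured points with straight lines. The function name and the data points are invented for illustration.

```python
def xic_area(retention_times, intensities):
    """Trapezoidal area under an intensity vs. retention-time trace.

    retention_times -- scan times for one m/z value (any consistent unit)
    intensities     -- peak intensity observed at each scan time
    """
    if len(retention_times) != len(intensities):
        raise ValueError("retention_times and intensities differ in length")
    # Sort the points by retention time so the scans may arrive unordered.
    points = sorted(zip(retention_times, intensities))
    area = 0.0
    for (t0, i0), (t1, i1) in zip(points, points[1:]):
        # Area of the trapezoid between two neighbouring scans.
        area += 0.5 * (i0 + i1) * (t1 - t0)
    return area


# Hypothetical elution profile: the peptide elutes between 51.0 and 53.0.
times = [51.0, 51.5, 52.0, 52.5, 53.0]
signal = [0.0, 40.0, 100.0, 40.0, 0.0]
precursor_intensity = xic_area(times, signal)
```

For these invented points the four trapezoids contribute 10, 35, 35, and 10, giving an area of 90 (in intensity × time units).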
67. [Screenshot residue: Load Data View navigation buttons and the Analysis Information fields (Fixed Modifications, Variable Modifications, Peptide Tolerance, Fragment Tolerance, Digestion Enzyme, Searched Database, Original Search Date, Scaffold Version).] When they have been analyzed, Scaffold highlights them in yellow and then switches to the Samples View. Figure 3-10: Load Data View, Files in Import Queue. [Screenshot: Load Data View for My Experiment after loading, showing BioSample 1 with 54 spectra, Protein Grouping set to Experiment Wide, and the Files in Loading Queue and Files Currently Loaded panes (Sequest, X! Tandem).]
68. A in Figure 6-5. It is also possible to change the proteins representing the groups globally by going to Apply Protein Annotation Preferences. The different proteins present in the group are also listed in the Protein Information pane, represented as buttons labeled with each protein accession number. Clicking one of the buttons looks the protein up on specific look-up sites to gather further information. Figure 6-5: Samples View, Protein groups. A: pull-down list of proteins in the group; B: regular appearance of the group; C: protein group in the Protein Information tab. [Screenshot: list of proteins in 31 clusters, including P02470 Alpha crystallin A chain (CRYAA_BOVIN, 20 kDa), P07318 Beta crystallin B1 (CRBB1_BOVIN, 28 kDa), P02526 Gamma crystallin B (CRGB_BOVIN, 21 kDa), and a cluster of P08209 Gamma crystallin D (CRGD_BOVIN, 21 kDa).]
69. BioSample View / MS Sample View. BioSample View: This view combines all MS Samples into a summarized BioSample level, which is the highest overview level of the results. Each BioSample is represented by a column, sorted by category and then by BioSample name. Figure 6-3: Samples View, BioSample View. [Screenshot: Samples Table listing the 7 identified proteins (P19141 Beta crystallin B3, PX1843 Beta crystallin A3, P02522 Beta crystallin B2, P07318 Beta crystallin B1, P11842 Beta crystallin A4, P26444 Beta crystallin A2, keratin 67K type II cytoskeletal) with accession number and molecular weight columns.] The MS/MS Sample View: When this view is selected, the Samples Table shows one column for each MS Sample, sorted first by category and then by BioSample. It is useful, for example, for analyzing samples processed with gels. Scaffold displays a
70. [Screenshot residue: last row of the Samples Table.] MS Sample vs BioSample summarization levels: Data associated with a BioSample might come from a sample taken by a doctor, medical researcher, or biologist, such as a drop of blood or tissue from an organism. Using techniques such as 2D gels or liquid chromatography, proteins or peptides from these BioSamples are then separated from each other. Each resulting individual band, spot, or LC fraction that is then processed by a mass spectrometer is one mass spectrometry sample, abbreviated as MS sample. One BioSample is therefore typically made up of more than one MS sample, sometimes many more. Protein List: To simplify the inspection of the identified proteins, Scaffold aggregates the protein list using two levels of hierarchy. Protein Group: a group of proteins with identical sets of peptides. In the protein list, the protein groups are displayed collapsed; the number of proteins in the group is indicated in parentheses next to the accession number of the protein representing the group (see B in Figure 6-5). By default, the protein that has the highest probability and the largest number of associated spectra represents the group in the list. When the accession number for a protein group is clicked, a pull-down list becomes available, and the User can select a different protein to represent the group in the protein list, see
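The definition of a protein group (proteins with identical sets of peptides) and the default choice of a representative can be sketched in Python as follows. The data, function names, and the tie-breaking order (probability first, then spectrum count) are assumptions made for this illustration; the manual states only that the representative has the highest probability and the most associated spectra.

```python
from collections import defaultdict


def group_proteins(protein_peptides):
    """Group proteins whose peptide sets are identical.

    protein_peptides -- dict: protein accession -> iterable of peptides
    Returns a list of groups, each a sorted list of accessions.
    """
    groups = defaultdict(list)
    for protein, peptides in protein_peptides.items():
        # frozenset ignores peptide order and duplicates.
        groups[frozenset(peptides)].append(protein)
    return [sorted(members) for members in groups.values()]


def representative(group, probability, spectra):
    """Pick the member with the highest probability, then the most spectra."""
    return max(group, key=lambda prot: (probability[prot], spectra[prot]))


# Hypothetical accessions and values, invented for this sketch.
peptides_by_protein = {
    "X1": ["pepA", "pepB"],
    "X2": ["pepB", "pepA"],
    "X3": ["pepA", "pepC"],
}
probability = {"X1": 1.00, "X2": 1.00, "X3": 0.99}
spectra = {"X1": 5, "X2": 8, "X3": 3}

groups = group_proteins(peptides_by_protein)
reps = [representative(g, probability, spectra) for g in groups]
```

Here X1 and X2 form one group because they contain the same two peptides; X2 represents that group because it ties with X1 on probability but has more spectra. X3 forms a group of its own.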
71. (Craig 2003) and ProteinProphet computer algorithms (Nesvizhskii 2003). Scaffold (Proteome Software, Inc., Portland, OR 97219, Oregon, USA) was used to validate protein identifications derived from MS/MS sequencing results. Scaffold verifies peptide identifications assigned by SEQUEST, Mascot, or other search engines [list other search engines used to derive the imported data] using the X! Tandem database searching program (Craig 2003 and Searle 2008). Scaffold then probabilistically validates these peptide identifications using PeptideProphet (Keller 2002) and derives corresponding protein probabilities using ProteinProphet (Nesvizhskii 2003 and Searle 2010). IdentityE: Scaffold supports Waters IdentityE (also known as MSE, or hi-lo energy scanning). To make it possible to load data analyzed using PLGS into Scaffold, Proteome Software, in collaboration with Waters, developed a plug-in that comes with the PLGS installation. The plug-in specifically creates files compatible with Scaffold. Waters should have provided a Scaffold plug-in manual, which guides the User through the Scaffold plug-in installation; if this is not the case, there is a copy available on the Proteome Software website at Scaffold4 PLGS plug-in. The Scaffold plug-in only exports data that was searched in PLGS using MSE. Furthermore, searches need to be run in PLGS with FDR set to 100% so that enough negative hits can be exported an
72. [Figure residue: rotated column headers of the report table; the header text is not recoverable.] Columns, quick notes: The first 4 columns identify the experiment, the biological sample, its category, and the MS/MS sample or run. These are followed by columns that identify the protein by name, accession number, database where the accession resides, and the protein's mass. The remainder of the columns provide the results of the analysis. There are three different versions of this report. Protein report (regular): see Figure 11-14. Protein cluster report: available when protein cluster an
73. Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search. Anal. Chem. 2002; 74(20), pp. 5383-5392. DOI: 10.1021/ac025747h. The algorithm for calculating the protein probabilities from the peptide probabilities is described in Nesvizhskii 2003: Nesvizhskii, A.I.; Keller, A.; Kolker, E.; Aebersold, R. A statistical model for identifying proteins by tandem mass spectrometry. Anal. Chem. 2003; 75(17), pp. 4646-4658. DOI: 10.1021/ac0341261. The algorithm for combining results from multiple searches is described in Searle 2008: Searle, B.C.; Turner, M.; Nesvizhskii, A.I. Improving sensitivity by probabilistically combining results from multiple MS/MS search methodologies. J. Proteome Res. 2008; 7(1), pp. 245-53. DOI: 10.1021/pr070540w. The algorithm for grouping proteins across samples is described in Searle 2010: Searle, B.C. Scaffold: A bioinformatic tool for validating MS/MS-based proteomics studies. Proteomics 2010; 10(6), pp. 1265-9. DOI: 10.1002/pmic.200900437. The algorithm for X! Tandem is described in Craig 2003: Craig, R.; Beavis, R.C. A method for reducing the time required to match protein sequences with tandem mass spectra. Rapid Communications in Mass Spectrometry 2003; 17(20), pp. 2310-6. DOI: 10.1002/rcm.1198. Purity corrections calculations for iTRAQ data: Shadforth 2005: Shadforth, I.P.; Dunkley, T.P.; Lil
74. Experiment > Edit Biological Sample. Categories are useful in two ways. First, the columns in the Samples Table have all the BioSamples grouped into categories. For example, if the samples are put into the categories Treated and Control, the samples in the Control category are grouped together to the left of the samples in the Treated category. Second, categories organize the samples so that proteins that are differentially expressed among categories can be found. Scaffold offers several options for comparing the expression level of each protein between categories. The quantitative analysis tests (Experiment > Quantitative Analysis) T-Test and ANOVA both measure the statistical probability of a difference between categories. Likewise, the Quantify View organizes data in categories. Apply New Database: Through the menu option Experiment > Apply New Database the User can address the following situations. (Scaffold User's Manual, Chapter 4: The Scaffold Window) Incorrect parsing of protein accession numbers: When this happens, the Samples Table reports question marks in the Molecular Weight column, while in the Proteins View the protein sequence is missing. Most of the time the cause is incorrect parsing of the database selected when loading the data into Scaffold. Either the database is not the same as the one used for the searches or
75. the top of the View. From there the User can paste it into a third-party program such as Excel or Microsoft Word. Find: Opens a find dialog box that searches the first table present in the Current View. Edit FASTA Databases (Ctrl+D): See Edit FASTA Databases. Edit Peptide Threshold: See Custom Peptide Filters. Edit GO Terms Options: See Edit GO Term Options. Bulk Operation: Tag, show, and hide proteins; expand and collapse clusters. These options are also available on right-click; see Protein List. Preferences: See Preferences. Advanced Preferences: See Advanced Preferences. Menu: View. Menu commands: Navigation Pane (Ctrl+0): Toggles the view of the Navigation pane. Switch Sample View: Selects the level of summarization in the Samples View; see MS Sample vs BioSample summarization levels. Switch Display Options: See Display Options. Show Entire Protein Clusters: See Clusters in the Samples Table. Show Lower Scoring Matches: See Show Lower Scoring Matches. Show <5% P
76. Export Subset FASTA Database: This dialog is called by the Export button located in the Configure Database Parser dialog. It provides tools to create a new filtered FASTA database or decoy database, starting from the one selected in the Configure Database Parser dialog. It contains a list and a pull-down menu (Figure 4-8: Export Subset FASTA Database dialog). List of Filter Keywords: Any keyword in the list is used to filter the original database for accession numbers that contain it. The keywords are not case sensitive, do not have to be complete words, and can be multiple-word phrases. Keywords can be added to and deleted from the list using the buttons at the bottom of the list. The Add button opens the Add Keywords Filter dialog, where the User can type in a new word. This option is most often used to create a FASTA file for a specific species from a huge database. The taxonomy of the protein is listed in different ways in different databases, so the User needs to choose keywords appropriately. For example, to select only bovine proteins from the complete UniProt database, enter the keyword _BOVIN; to select rat proteins, enter the keyword Rattus. Database type pull-down menu: It shows the list of possible types of da
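The keyword rule described for the Export Subset FASTA Database dialog (case-insensitive, substring, any keyword matches) can be sketched outside Scaffold as follows. This is a minimal illustration, not Scaffold's implementation: the function names `read_fasta` and `filter_fasta` are hypothetical, and the sketch assumes the keywords are matched against the whole FASTA header line.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_fasta(lines: Iterable[str], keywords: Iterable[str]) -> List[Tuple[str, str]]:
    """Keep entries whose header contains ANY keyword (assumption: whole header is searched).

    Matching is case-insensitive and substring-based, so partial words and
    multi-word phrases both work.
    """
    lowered = [k.lower() for k in keywords]
    return [
        (header, seq)
        for header, seq in read_fasta(lines)
        if any(k in header.lower() for k in lowered)
    ]


if __name__ == "__main__":
    fasta = [
        ">sp|P02769|ALBU_BOVIN Serum albumin",
        "MKWV",
        "TFIS",
        ">sp|P02768|ALBU_HUMAN Serum albumin",
        "MKWVTF",
    ]
    for header, seq in filter_fasta(fasta, ["_bovin"]):
        print(header, len(seq))
```

Because the match is a plain substring test, a keyword such as `_BOVIN` relies on the database's naming convention; a database that records taxonomy differently needs a different keyword.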
77. F3 file created by Scaffold. Figure 2-1: Data Analysis Workflow including Scaffold. Broadening the Search: For further investigation, the User can export the results to an Excel spreadsheet, or export the unidentified spectra in a format that allows further searching by SEQUEST, Mascot, MaxQuant, or any other compatible search engine. In this way the original searches can be repeated with different parameters, or the results can be searched using another search engine or against a different database. (Scaffold User's Manual, Chapter 2: Identifying Proteins with Scaffold) Importing the new resulting output data into Scaffold augments the initial Scaffold experiment. Deepening the Search: If, rather than casting a wider net, a deeper look is preferred, the User can export the unmatched spectra and a subset database consisting only of those proteins already found in the BioSamples included in the Scaffold experiment. The spectra can then be searched against this subset database, allowing a relatively fast search of these known proteins using variations on search parameters or looking for additional modifications. Figure 2-2: Deepening the search. Increased Confidence Using Peptide and Protein Validation Algorithms: For validating peptide and protein identifications, Scaffold uses three differen
78. [Figure: Samples View screenshot showing the list of identified proteins and the Gene Ontology columns] This chapter details Scaffold's Samples View, describing the elements that constitute it: The Samples Table, which displays a summary of the experiment's results. The list of identified proteins appears as rows and the list of BioSamples or MS Samples a
79. [Figure: custom peptide threshold dialog, showing Use Individual Program Thresholds, the Accept Charge check boxes (2, 3, 4 and higher), Parent Mass Tolerance, Min Enzymatic Termini (NTT), Min Peptide Length, and score options for X! Tandem, Phenyx, Spectrum Mill, OMSSA, IdentityE, and ZCore] The following options are available for configuring a peptide threshold. Name (Peptide Threshold): Assigns a name to the custom Peptide Probability threshold built in this dialog box. General Minimum Thresholds. Use Individual Program Thresholds: Uses only database program information in determining which proteins to display. Choosing this option ignores and disables the protein and peptide probability options. Note: This is an appropriate option if, in the Statistics View, the SEQUEST XCorr-only distribution histogram displays largely overlapping assigned incorrect and correct matches. Use Both Probability and Scores: Uses both peptide probabilities and search engine scores when filtering data. Note: Unlike the Use Individual Program Thresholds option, this filter does not ignore the Minimum Protein ID Probability. Accept Charges: Use these check boxes to define which charges Sca
80. [Figure: peptide table listing peptide sequences such as AEQHSTPEQAAAGK and SHGGLGGSYK with their probabilities, scores, charge, and modifications] Column sorting feature: In all tables throughout Scaffold, the User can use the tri-state column sorting feature and sort the display by clicking any column header. For example, to sort the proteins by increasing molecular weight, click the Molecular Weight column header once. To sort by decreasing molecular weight, click the header a second time. To return to the default display, click it a third time. Multi-selection of rows in the Samples Table: In the Samples Table the User can select multiple rows by holding the SHIFT key (for contiguous rows) or the CTRL key (for non-contiguous rows) while clicking. Other functions can then be applied to the selection, like assigning a star to the selected group of proteins in the Samples Table, for exam
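The tri-state column sorting behavior (first click ascending, second click descending, third click back to the default order) can be modeled as a small state cycle. The sketch below is only an illustration of that cycle; the names `next_state` and `apply_sort` and the tuple layout of the rows are hypothetical and are not taken from Scaffold.

```python
from typing import Callable, List, Sequence, Tuple

# Order in which successive clicks on one column header cycle.
STATES = ("default", "ascending", "descending")


def next_state(state: str) -> str:
    """Return the sort state that follows `state` after one more click."""
    return STATES[(STATES.index(state) + 1) % len(STATES)]


def apply_sort(rows: Sequence[Tuple], key: Callable, state: str) -> List[Tuple]:
    """Return rows in the order implied by the sort state.

    "default" keeps the original (load) order untouched.
    """
    if state == "default":
        return list(rows)
    return sorted(rows, key=key, reverse=(state == "descending"))


if __name__ == "__main__":
    # (protein name, molecular weight in kDa); illustrative values only
    proteins = [("Beta-crystallin B3", 24), ("Profilin-1", 15), ("Filensin", 83)]
    state = "default"
    for click in range(1, 4):
        state = next_state(state)
        ordered = apply_sort(proteins, key=lambda row: row[1], state=state)
        print(click, state, [name for name, _ in ordered])
```

Keeping the original list untouched and sorting a copy is what makes the third click able to restore the default display.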
81. L has become available. Mascot Server: Scaffold can load data directly from a Mascot Server. This tab contains a text box where the User sets up the connection by entering the web address of the available Mascot server. The Test Connection button, located in the lower right corner of the tab, provides a quick way to check whether the connection works properly. If no security is implemented, Scaffold connects directly to the Mascot server. When the Test Connection button is clicked, a message states whether the connection was successful. If security is enabled on the Mascot Server and the Test Connection button is clicked, a login window asks for an account name and password. The User has to make sure that the account used in Mascot has administrative rights. Scaffold does not download files from a Mascot Server if the User is logged on as GUEST, and an error is shown. Display Settings: Scaffold provides different ways to look at the data included in an experiment through different views: Load Data, Samples, Proteins, Similarity, Quantify, Publish, and Statistics. The Display Settings tab allows the User to decide which of the available Views are visible. Through this tab the User can also define which default Display Options setting is selected when a new experiment is created, and reset messages that were selected not to show anym
82. MS Sample in the Samples Table. The pane also contains filtering options for limiting the display to only those proteins that meet specific criteria. Figure 6-14: Scaffold Display pane. The Display pane contains the following features: Display Options, Req Mods, Search Text Box, and Advanced Search. Display Options: Scaffold reports statistics other than the identification probability. The Display Options pull-down list offers a range of statistics that, once selected, are displayed for each protein under each BioSample or MS Sample in the Samples Table. Depending on whether the clustering algorithm option is selected, a slightly different list of options is available. Figure 6-15: List of Display Options with and without the clustering option selected. [Figure: options shown include Protein Identification Probability, Percent Coverage, Percentage of Total Spectra, Exclusive Unique Peptide Count, Total Unique Peptide Count, Exclusive Unique Spectrum Count, Quantitative Value, and Starred]
83. Mass Spectrometry. International Journal of Proteomics 2013, vol. 2013, Article ID 756039, 13 pages. DOI: 10.1155/2013/756039. Raubenheimer 1992: Raubenheimer, D. and Simpson, S. J. Analysis of covariance: an alternative to nutritional indices. Entomologia Experimentalis et Applicata 1992, 62, pp. 221-231. DOI: 10.1111/j.1570-7458.1992.tb00662.x. References for Fisher's Exact test: Zhang 2006: Zhang, B., VerBerkmoes, N. C., Langston, M. A., Uberbacher, E., Hettich, R. L., Samatova, N. F. Detecting differential and correlated protein expression in label-free shotgun proteomics. J. Proteome Res. 2006, 5(11), pp. 2909-2918. DOI: 10.1021/pr0600273. Appendix B: Terminology. BioSample: Scaffold calls a BioSample a physical sample, such as a drop of blood or a biopsy from a patient, or a tissue sample from a model organism or cell line. The proteins or peptides in a BioSample are typically separated by 2D gels or liquid chromatography into several spots, bands, or fractions, each of which becomes one mass spectrometry sample, or MS Sample. One BioSample is therefore typically made up of several MS Samples. Practitioners often refer to both BioSamples and MS Samples simply as samples. When running Scaffold Q+ or Scaffold Q+S, quantitative multiplexed samples are initially loaded in Scaffold and referred to as BioSamples. Exclusive Spectrum Count: Total num
84. O annotation Sources: This menu command can have three possible statuses. Apply GO Annotations: This status appears when a GO annotations database has been imported and selected from the GO Annotations tab of Edit > Edit GO Terms Options. Apply NCBI: This status appears when NCBI Annotations has been selected from the same tab. Configure GO annotation Sources: This status appears when the User has yet to select a GO annotations database or NCBI Annotations from that tab. Quantitative Analysis: Scaffold includes a number of statistical tests that can be applied using various types of quantitative methods. These tests are set up through the menu option Experiment > Quantitative Analysis. When selected, the Quantitative Analysis dialog opens, showing the list of available statistical tests, the normalization and quantitative method options, and two lists from which the User can choose the categories to compare and to which the inference tests are applied. Up to seven tests are potentially available, depending on the number of loaded samples and categories. Figure 4-20: Quantitative Analysis dialog box, listing No Test Applied, Fold Change by Sample, Fold Change by Category, Coefficient of Variance, Fis
85. [Figure: Samples Table showing protein clusters, for example Cluster of Myoglobin (MYG_HORSE) and Cluster of Cytochrome c (CYC_BOVIN)] Apply Protein Annotation Preferences: The menu option Experiment > Apply Protein Annotation Preferences opens the Configure Protein Annotation Preferences dialog, where the User can globally define which protein in a protein group is visible in the protein list appearing in the Samples Table. Figure 6-7: Configure Annotation Preferences dialog. You can automatically set preferred protein names, accession numbers an
86. Save Frozen MCP Required Spectra: This command condenses the saved output file just as the Save MCP Required Spectra option does, except that it also freezes the data in the file. Since the spectra make up about 90% of the bulk of the data, an SF3 file saved without spectra is reduced to only about 10% of the size of the uncondensed file. FASTA databases in Scaffold. Edit FASTA Databases: To add and parse databases, the User opens the Edit Databases dialog, either by selecting the menu option Edit > Edit FASTA Databases or by clicking the Add New Database button located in the Search Database pane of the Load and Analyze Data page of the Scaffold loading Wizard. The Edit Databases dialog contains a table listing the databases already available in Scaffold and a number of function buttons. Figure 4-5: Edit Databases dialog. Loaded Databases table: This table lists all the databases already available in Scaffold, with information about the percent
87. Software resource center. Show Log Files: Opens a folder containing the Scaffold error_log and output_log files. Referencing Scaffold: See Referencing Scaffold. About Scaffold: Provides the release information for the current version of Scaffold, license information, and contact information for Proteome Software Inc. It also reports information about the system where Scaffold is installed, the amount of memory available to the software, and the percentage of memory used by the application. IdentityE Quantitation Options. Export IdentityE report: Generates a tab-delimited report containing the list of peptides used to calculate the intensities assigned to each protein in the list of identified proteins shown in the Samples View. Merge: The command File > Merge allows the User to combine different Scaffold experiments into a single SF3 file. It is active only when a Scaffold experiment has already been created or opened. Selecting this command opens the Import Scaffold File file chooser, from which the User can navigate to an SF3 file to be merged with the currently open Scaffold experiment. Once a file is selected, the Queue Scaffold Files for Merging window opens, allowing the User to add more files to the list of SF3 files to be merged. When the Scaffold experiments in the list are merged, the BioSamples included in each of them load into separate samples. If th
88. Step 2. If you are carrying out this procedure using the sample tutorial_3seq data or the sample tutorial_3mas data provided by Proteome Software, then continue to Step [...]. If the selected FASTA database is not identical to the external protein database (including the version) that you used for searching your experimental data, the protein sequence and molecular weight might not be available later in the Proteins View. 2. Click Add New Database. The Edit Databases dialog box opens. (Scaffold User's Manual, Chapter 3: Loading Data in Scaffold) Figure 3-13: Edit Databases dialog box. 3. On the Edit Databases dialog box, click New Database. The Open FASTA Database dialog box opens. Figure 3-14: Open FASTA Database dialog box.
89. T suggested Scoring Options in PD 1.4 and higher. [Figure: search parameters panel listing Protein Database, Enzyme Name, Max Missed Cleavage Sites, Min Peptide Length, and ion weighting options] Show Lower Scoring Matches: The command View > Show Lower Scoring Matches toggles whether the Samples Table shows the presence of a protein in a sample even if it does not meet the current filters and thresholds. In some cases several samples may identify a protein at very different confidence levels. For example, sample 1 may identify protein A with 95% probability while sample 2 identifies it with only 60% probability. If View > Show Lower Scoring Matches is selected, the filters and thresholds affect only which protein rows are shown, and both the 95% and the 60% values are displayed even if the protein threshold is set at 90%. If the option is not selected, the sample values that do not meet the filter values are suppressed: the 95% value for sample 1 is shown, but no value is shown for sample 2. It is particularly imp
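The effect of the Show Lower Scoring Matches toggle on the 95%/60% example can be sketched as a small function. This is a simplified model under stated assumptions, not Scaffold's code: the function name `visible_cells` is hypothetical, only the protein probability threshold is modeled (no minimum peptide count or peptide threshold), and a protein row is assumed to be shown when at least one sample meets the threshold.

```python
from typing import Dict, Optional

Table = Dict[str, Dict[str, Optional[float]]]


def visible_cells(
    probabilities: Dict[str, Dict[str, float]],
    threshold: float,
    show_lower_scoring: bool,
) -> Table:
    """Return the cells that would be displayed, keyed by protein then sample.

    A hidden cell is represented by None. Rows where no sample meets the
    threshold are dropped entirely (assumption of this sketch).
    """
    table: Table = {}
    for protein, by_sample in probabilities.items():
        if not any(p >= threshold for p in by_sample.values()):
            continue  # the protein row itself is filtered out
        table[protein] = {
            sample: (p if show_lower_scoring or p >= threshold else None)
            for sample, p in by_sample.items()
        }
    return table


if __name__ == "__main__":
    data = {"protein A": {"sample 1": 95.0, "sample 2": 60.0}}
    print(visible_cells(data, threshold=90.0, show_lower_scoring=True))
    print(visible_cells(data, threshold=90.0, show_lower_scoring=False))
```

With the toggle on, the threshold decides only whether the row appears; with it off, the threshold is also applied cell by cell.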
90. [Figure: purity correction matrix with one row per TMT reagent, TMT 126 through TMT 131] The dialog contains a matrix where the User can input or modify the isotope correction factors for iTRAQ or TMT. The percentages for each iTRAQ or TMT reagent need to be typed in following the same order as listed in the Certificate of Analysis. If the Certificate of Analysis is not available, the User can use the Scaffold default values, although this is not recommended. When a new Purity Correction table is created, the User needs to assign a name to the table by typing one in the Name text box located above the matrix. Whether editing an existing Purity Correction table or creating a new one, clicking Apply finalizes the operation and closes the dialog. Note: For more information about the way Scaffold calculates and applies iTRAQ corrections, see Shadforth 2005. Referencing Scaffold: The User is free to copy, modify, and distribute the following examples for citing Scaffold in publications and reports. Scaffold (Proteome Software Inc., Portland, OR 97219, USA) was used to probabilistically validate protein identifications derived from MS/MS sequencing results using the X Tandem
91. The Edit Experiment window opens and, in the full version of Scaffold, the Use Protein Cluster Analysis check box is available. Selecting the check box and clicking Apply rearranges the protein groups and creates clusters using Shared Peptide Grouping and Protein Cluster Analysis. Figure 8-2: Edit Experiment window. [Figure: Protein Grouping options shown are Use protein cluster analysis, Use standard experiment-wide protein grouping, and Use legacy independent sample protein grouping] For explanation purposes, the grouping and clustering processes can be broken down into the following three phases: Protein Grouping, Protein Paring, and Protein Clustering. Protein Grouping: The way Shared Peptide Grouping assigns peptides to proteins is quite different from the Legacy Protein Grouping algorithm. Rather than assigning each peptide to a single protein, Shared Peptide Grouping includes a peptide in all of its matching proteins. It then proceeds to form Protein Groups and to assign weights to each shared peptide; see Weighting Function. Protein Groups: Scaffold considers proteins that share peptide evidence. In cases where two or more proteins share all of their peptides, there is no basis for discrimination among them, and the proteins are grouped and treated as a unit called a Protein Group. These proteins appear in the Sa
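The Protein Group rule stated here (proteins that share all of their peptides cannot be told apart, so they are treated as one unit) can be illustrated by grouping proteins on their exact peptide sets. This is a minimal sketch of that single rule only; it does not model peptide weighting, paring, or clustering, and the function name `group_proteins` and the input layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_proteins(protein_peptides: Dict[str, Iterable[str]]) -> List[List[str]]:
    """Group proteins whose sets of matched peptides are identical.

    Each peptide is counted for every protein it matches, so two proteins
    land in the same group only when their peptide evidence is exactly the same.
    """
    groups: Dict[frozenset, List[str]] = defaultdict(list)
    for protein, peptides in protein_peptides.items():
        groups[frozenset(peptides)].append(protein)
    # Sort members and groups so the output is deterministic.
    return sorted(sorted(members) for members in groups.values())


if __name__ == "__main__":
    evidence = {
        "P1": ["AAK", "CCR"],
        "P2": ["CCR", "AAK"],  # same peptides as P1: indistinguishable
        "P3": ["AAK"],         # subset only: stays a separate group here
    }
    print(group_proteins(evidence))
```

In this sketch a protein whose peptides are a strict subset of another's remains its own group; how such cases are handled in practice belongs to the later paring and clustering phases, which are not modeled.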
92. This is a more advanced statistical analysis concept and is not supported by Scaffold. (Scaffold User's Manual, Chapter 9: Quantitative Methods and Tests) ANOVA: The ANOVA (Analysis of Variance) is an analysis method for testing equality of means across treatment groups or categories. It tells whether there are differences among categories. The result of the test is a p-value; a low p-value indicates that the means of the categories considered are unlikely to all be equal. The ANOVA test in Scaffold requires three or more replicates in the categories. As with the T-test, results from fewer than three replicates are untrustworthy, and more replicates are better. Like the Coefficient of Variance test, the ANOVA test shows that something is different, but it does not tell which categories differ from each other. Checking the bar chart in the Quantitative Value pane helps in understanding which category is different. The ANOVA test in Scaffold is a simple one-way ANOVA. More sophisticated ANOVA tests are beyond Scaffold's capability. Before applying the ANOVA test, the User should understand the issues regarding protein grouping ambiguity, missing values, and normalization, as described in Normalization among samples in Scaffold. Fisher's Exact Test: The Fisher's Exact Test, like the T-test, compares the relative abundance between two sample categories. It is used in the analysis of contingency tables where sample sizes are small
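As a rough illustration of what a simple one-way ANOVA computes, the sketch below derives the F statistic from replicate values grouped by category. It is a textbook calculation, not Scaffold's code: the function name `one_way_anova_f` is hypothetical, the three-replicate check mirrors the requirement stated above, and converting F to a p-value (which needs the F distribution) is deliberately left out.

```python
from statistics import fmean
from typing import List, Sequence, Tuple


def one_way_anova_f(groups: Sequence[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way ANOVA.

    F is the ratio of between-category variance to within-category variance;
    a large F corresponds to a low p-value.
    """
    if len(groups) < 2:
        raise ValueError("need at least two categories")
    if any(len(g) < 3 for g in groups):
        raise ValueError("each category needs three or more replicates")

    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means: List[float] = [fmean(g) for g in groups]

    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    if ss_within == 0:
        raise ValueError("no within-category variation; F is undefined")

    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Illustrative spectral counts for one protein in three categories.
    counts = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
    print(one_way_anova_f(counts))
```

The statistic alone shows why the test cannot say which category differs: all categories contribute to a single ratio, so a follow-up look at the per-category values is still needed.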
93. Value pane contains the Normalized Spectrum Count bar chart and, located above the chart, a pull-down list of the proteins appearing in the Samples Table. The Normalized Spectrum Count bar chart provides a view of the relative abundance of a specific protein, selected through the pull-down list, across different BioSamples and categories. The Y axis displays the normalized count of the spectra matching any of the peptides in the selected protein. This count depends on the protein, peptide, required mods, and search filters and thresholds set on the Samples View. The X axis displays a bar for each BioSample in the Scaffold experiment. The bars are color-coded according to the defined categories. If the loaded dataset contains replicates, from this pane the User can assess the consistency of spectral counts across replicates within each category while comparing expression levels of the protein between categories. This allows visual inspection of the data and provides insight into the meaning of statistical comparisons such as the T-test or ANOVA. The Quantitative Scatterplots pane: This pane includes two tabs: the Q-Q Scatterplot tab and the Mean Deviation Scatterplot tab. Q-Q Scatterplot tab: This tab contains a scatter plot and two pull-down menus used to assign categories to each of the axes in the plot. The Q-Q Scatterplot helps evaluate the degree of error associated with the spectral count measurements. The graph plot
94. [Figure: Publication report excerpt listing, per MS sample, the LFDR model and classifier used, the Scaffold version, the modification metadata set (1541 modifications, from a unimod.xml parameters file), the protein grouping strategy (experiment-wide grouping with protein cluster analysis), and the peptide threshold] Samples report: The Samples report mimics the Samples View. The report header rows identify the data and how it was created, which is the same information contained in the Publication report. (Scaffold User's Manual, Chapter 11: Reports) Subsequently, each row in the report represents a protein in the samples list. The number of proteins displayed depends on the current filter and threshold settings. If Edit > Show GO Annotations is selected, the GO annotation information appearing in the Samples View is also included in the Samples report. There are three slightly different versions of this report. Samples report (regular): See Figure 11-10. Samples report with clusters: Available when protein cluster analysis is selected. It adds clusters to the regular report. Samples report with isoforms: Adds expanded protein groups to the regular report. Spectrum Counting report: It is like the regular Samples repo
95. a feel for how probable a given identification is, is always available. The Required Modifications filter lists all the post-translational modifications (PTMs) selected during the search phase of data processing. Choose a modification from the drop-down list to filter the display to only those proteins, peptides, and spectra that contain the selected modification. No Filter: No filtering is applied. All proteins, peptides, and spectra that meet all other display and filtering options are displayed. (Scaffold User's Manual, Chapter 6: The Samples View) Unmodified Only: Displays only those proteins, peptides, and spectra that do not have any associated PTMs. Variable Modifications: Displays only those proteins, peptides, and spectra that were identified as having the selected variable modification. Search Text Box: The Scaffold search box allows the User to type in search terms to quickly identify specific proteins by protein name or accession number; it can also filter on peptide sequences and/or spectra information. Figure 6-16: Search text box. The Search field accepts regular expressions and filters the results based on accession number or protein name. Only those proteins that meet all the search criteria are displayed. The search matches the characters of the string in their exact order, but the string is not case sensitive and it can appear anywhere in the search results. For example, a search string
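The matching behavior described for the Search field (regular expressions, case-insensitive, matching anywhere in the accession number or protein name) can be approximated with Python's `re` module. This sketch is an approximation only: the function name `search_proteins` and the dictionary layout are hypothetical, and Python's regular-expression dialect is assumed here and may differ in details from the one Scaffold uses.

```python
import re
from typing import Dict, List


def search_proteins(proteins: List[Dict[str, str]], pattern: str) -> List[Dict[str, str]]:
    """Return proteins whose accession or name matches the pattern.

    `re.search` matches anywhere in the string, and IGNORECASE makes the
    search case-insensitive, mirroring the behavior described above.
    """
    rx = re.compile(pattern, re.IGNORECASE)
    return [
        p for p in proteins
        if rx.search(p["accession"]) or rx.search(p["name"])
    ]


if __name__ == "__main__":
    table = [
        {"accession": "CRBB3_BOVIN", "name": "Beta-crystallin B3"},
        {"accession": "ALBU_BOVIN", "name": "Serum albumin precursor"},
        {"accession": "MYG_HORSE", "name": "Myoglobin"},
    ]
    for hit in search_proteins(table, "crystallin"):
        print(hit["accession"])
    for hit in search_proteins(table, "_bovin$"):
        print(hit["accession"])
```

A plain word such as `crystallin` behaves like a substring search, while anchors such as `$` or alternation with `|` give the finer control that regular expressions add.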
96. a section of the diagram is selected, the table shows the list of proteins included in the highlighted section. Total Unique Peptides tab: The diagrams show the sum of the Total Unique Peptide Count for each protein in a category, for up to three categories. When a section of the diagram is selected, the table shows the list of peptides included in the highlighted section. Total Unique Spectra tab: The diagrams show the sum of the Total Unique Spectrum Count for each protein in a category, for up to three categories. When a section of the diagram is selected, the table shows the list of identified peptides, with their charge and modifications, included in the highlighted section. The numbers shown in the Venn diagrams include all proteins and peptides displayed in the Samples View. If the View option Show Entire Protein Clusters is selected, the counts include the lower scoring proteins that are part of displayed clusters (see Clusters in the Samples Table). If the View option Show Lower Scoring Matches is on, the numbers include peptides that do not meet the current thresholds but are displayed because they meet the thresholds in other BioSamples for the same protein. The status of these options is shown at the top of the window. To count only proteins and peptides that meet thresholds, the User can go to the View menu and turn off these options. Figure 7-2: Status of View options in the Quantify View.
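The sections of a Venn diagram over up to three categories correspond to disjoint sets of items, each defined by exactly which categories contain the item. The sketch below computes those sections with Python sets; it illustrates the idea only, and the function name `venn_regions` and the category names are hypothetical.

```python
from typing import Dict, FrozenSet, Set


def venn_regions(categories: Dict[str, Set[str]]) -> Dict[FrozenSet[str], Set[str]]:
    """Split items into Venn sections for one to three categories.

    Each key is the set of category names an item belongs to; each value is
    the set of items found in exactly those categories and no others.
    """
    if not 1 <= len(categories) <= 3:
        raise ValueError("this sketch supports one to three categories")

    regions: Dict[FrozenSet[str], Set[str]] = {}
    all_items: Set[str] = set().union(*categories.values())
    for item in all_items:
        membership = frozenset(
            name for name, members in categories.items() if item in members
        )
        regions.setdefault(membership, set()).add(item)
    return regions


if __name__ == "__main__":
    identified = {
        "Control": {"ALBU_BOVIN", "MYG_HORSE", "CYC_BOVIN"},
        "Treated": {"MYG_HORSE", "CYC_BOVIN", "TRYP_PIG"},
    }
    for membership, items in venn_regions(identified).items():
        print(sorted(membership), len(items), sorted(items))
```

Selecting a section in the diagram then amounts to looking up one key, and the size of the value is the number drawn in that section.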
97. [Figure: Peptide report columns, including Experiment name, Biological sample category, Biological sample name, MS/MS sample name, Protein name, Protein accession numbers, Database sources, Protein identification probability, Best Peptide identification probability, Previous amino acid, Next amino acid, Best SEQUEST XCorr score, Best SEQUEST DCn score, Median Retention Time, and Peptide stop index] Columns quick notes: The first 14 columns repeat the information available in the Protein report, which identifies the sample and the protein. Next comes the peptide sequence, followed by the best scores for the spectrum matching it. Then there are columns showing how many spectra matched the peptides in each charge state, and a column for the number of tryptic termini (NTT). There are two different versions of this report. Peptide report (regular): See Figure 11-12. Peptide Quantitative report: Exports information similar to the regular report but organizes it to emphasize the various quantitative values available in the experiment for each peptide in every sample; see Figure 11-13. Figure 11-13: Peptide quantitative report columns (Total Spectrum Count, Weighted Spectrum Count, Total Precursor Intensity, Total TIC, Median Retention Time).
98. abundance in another sample category. The implicit assumption is that if some abundant proteins are greatly suppressed in one category of samples, they will be balanced by roughly the same number of abundant proteins with elevated levels. If a visual inspection of this graph suggests that outliers have distorted the estimate of the CV, then care should be taken when interpreting the Q-Q Scatterplots. The Venn Diagrams pane: The Venn Diagrams pane includes three tabs, each containing Venn diagrams with different types of numbers and a table placed on the right side of the diagram. The table becomes visible whenever the User selects a section of the diagram. A selected section appears highlighted in yellow. Above the tabs there are three drop-down lists showing the categories available in the experiment. Through the Venn Diagrams pane, the User can examine the relationship among proteins, total unique peptides, or total unique spectra identified in the various categories. Each tab displays a Venn diagram showing the overlap of up to three categories and reflecting the current filters and thresholds applied in the Samples View. The User can determine which categories are visible in the diagram through the drop-down lists. Proteins tab: The diagrams show the number of proteins identified in each category and in the overlap between up to three categories. When
99. age of the Loading Wizard. Reset Peptide Validation: See Reset Peptide Validation. Apply GO Terms, Configure GO Annotations Sources: Applies imported annotations to the Samples Table. To import GO databases, see Edit GO Term Options. Quantitative Analysis: See Quantitative Analysis. Export menu commands: Subset Database: See Subset Database. Spectra: See Spectra. ProtXML: See ProtXML report. mzIdentML: See mzIdentML. Scaffold Batch: See ScaffoldBatch. Scaffold Batch Archive: See ScaffoldBatch Archive. Export To Excel: Publication Report: See Publication report. Samples Report: Generates a tab-delimited Samples table as it appears in the Samples View (see Samples report). Spectrum Reports: See Spectrum report. Peptide Reports: Generates a tab-delimited Peptide table for all proteins appearing in the Samples View (see Peptide report). Protein Reports: Opens the SQL dialog box (see Protein report). Current View: See Current View report. Complete: See Complete report. Quant menu commands: Launch Q+ Quantitation Browser, Edit Quantitative Method / Purity Corre
100. agents kit. They indicate the percentages of each reporter ion that have masses differing by -2, -1, +1 and +2 Da from the nominal reporter ion mass due to isotopic variants. Note: It is strongly recommended to add these correction factors into Scaffold. The Edit iTRAQ/TMT Corrections dialog opens when the Edit Purity Corrections button, or the Other option in the Correction pull-down list of the Edit Quantitative Samples dialog, is selected. The dialog includes the Loaded Purity Corrections table, which lists saved correction tables with their specific methods, and a number of buttons at the bottom of the table. New Correction: Opens the Purity Corrections dialog, where the User can define a new purity correction table. Edit: Opens the Purity Corrections dialog showing the currently selected purity correction table, where the User can adjust the values already in the table or add others. Delete: Deletes the selected entries from the table. Close: Closes the dialog without applying the changes. Apply: Chooses the selected Purity Correction table. Purity Corrections: The Purity Corrections dialog opens from the Edit iTRAQ/TMT Purity Corrections dialog through the New Correction and Edit buttons. Figure 4-22: Purity Correction dialog.
101. ailable in the program. Label Free Quantitative Methods, which describes the quantitative methods available in Scaffold and how they are computed. Normalization among samples in Scaffold, which describes how data is normalized in Scaffold. Quantitative Analysis Tests, which describes the methods for inference available in Scaffold and how they are computed. Label Free Quantitative Methods: There are two widely used label-free quantification strategies, which are quite different in their approach and in how they account for the presence of proteins in a sample, and a third that sits in between: Spectrum Counting, which counts and compares the number of fragment spectra identifying peptides of a given protein; Precursor Ion Intensity, which measures and compares the mass spectrometric signal intensity of peptide precursor ions belonging to a particular protein; and Total Ion Count (TIC), which considers peak intensities from MS/MS spectra combined with counting of the spectra. For each of these general methods, Scaffold provides a number of variations commonly proposed in the literature. Spectrum Counting: Total Spectra (default), Weighted Spectra, emPAI, NSAF. Total Ion Count (TIC): Average TIC, Total TIC, Top Th
102. Resizing of columns and panes: The User can resize columns and panes in each of the views to suit his or her working needs. For example, in the Samples Table the User can change the width of a column by resting the mouse pointer on the right side of a column heading until the pointer changes to a double-headed arrow, and then dragging the boundary until the column is the desired width. Figure 4-29: Changing the width of a column in the Samples View. Moving columns around: In all tables throughout Scaffold except the Samples Table, every column can be moved from one position to another for more comfortable access to the data. The User simply clicks the header of the column to move and drags it to the desired location. Switching to another view keeps the columns in their new positions. Figure 4-30: Moving columns around in tables.
103. alysis is selected. It adds clusters to the regular report. Protein Accession Number Report: Similar to the regular report, its purpose is to provide an easy way to look up the protein description for each accession number. The report also provides the name of the database that was used for searching the data. Current View report: The Current View report contains the information that is displayed in the current view. This report is applicable to the Samples View, the Proteins View and the Publish View. Complete report: This export provides the full results of the current analysis in a series of XML files saved in a separate directory. The directory created contains the following files: proteins, overview_percentage_of_spectra.xls, overview_protein_probabilities.xls, overview_spectrum_counts.xls, overview_unique_peptide_counts.xls, overview_unique_spectrum_counts.xls. Appendix: A. Algorithms References; B. Terminology; C. Terminology comparison between Scaffold 4 and Scaffold 3. Appendix A, Algorithms References: The algorithm for calculating the peptide probabilities from the search engine scores is described in Keller 2002: Keller A, Nesvizhskii A I, Kolker E and Aebersold R
104. Figure 10-8: Fold Change by BioSample. (A) The Fold Change column, showing the ratio of the Average Precursor Intensity of BioSample 4 to the Average Precursor Intensity of BioSample 2. Fold Change by Sample is only available if exactly two BioSamples are selected for quantitation. It displays, for each protein, the ratio of the quantitative value of the non-reference BioSample to the quantitative value of the reference BioSample. The reference sample is indicated by peach coloring in the column header, and the sample being compared is indicated by a purple header. If samples from exactly two categori
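The fold change described above is a simple ratio of two quantitative values. A minimal sketch follows, assuming made-up intensities; returning `None` for a zero reference value is a choice made here for illustration and is not documented Scaffold behavior.

```python
def fold_change(value, reference_value):
    """Ratio of the non-reference sample's quantitative value to the
    reference sample's value. Returns None when the reference is zero
    (an assumption of this sketch, not a documented Scaffold rule)."""
    if reference_value == 0:
        return None
    return value / reference_value


# Made-up Average Precursor Intensity values for one protein.
biosample_4 = 3.0e6   # sample being compared
biosample_2 = 1.5e6   # reference sample
print(fold_change(biosample_4, biosample_2))
```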
105. and not saved in the output files. When this happens, the peptide probability assignments, and consequently the protein probabilities, are computed by Scaffold in an unreliable fashion. To compensate for this problem, the Advanced Preferences selection in the Edit menu opens a dialog where the User can choose which scoring function Scaffold uses when analyzing the search engine data with PeptideProphet. The dialog presents separate tabs for the Sequest and Mascot search engines. Sequest tab: When Sequest searches are run through Proteome Discoverer with the default settings suggested by Thermo, the MSF output files do not contain records of any unassigned spectra. This missing information directly affects the ability of Scaffold to calculate the delta Cn score, which is included in the formula Scaffold uses to compute the Sequest discriminant score in the PeptideProphet algorithm. This formula uses a normalized version of the Sequest XCORR score and depends on the charge state of the peptide. For example, for charge 2 the discriminant function is: 8.36 * XCORR + 7.39 * DeltaCn - 0.19 * ln(SpRank) - 0.31 * deltaMass - 0.96. For PD version 1.3 and above, we were able to identify the Sequest parameters that affect the amount of information recorded in the MSF files once the search has ended. Suggestions for adjusting them are given in Configuring Proteome Discoverer Sequest and Mascot. Unfortunat
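To see why a missing delta Cn value matters, the charge 2 discriminant can be evaluated directly. This is a sketch only: the signs of the terms and the use of the absolute mass difference follow the published PeptideProphet discriminant (Keller 2002) and are assumptions here, the input is taken to be the already normalized XCORR, and the function name is hypothetical.

```python
import math


def sequest_discriminant_charge2(xcorr_norm, delta_cn, sp_rank, delta_mass):
    """Charge 2 Sequest discriminant score with the rounded coefficients
    quoted in the manual. Signs and abs() are assumed from Keller 2002."""
    return (8.36 * xcorr_norm
            + 7.39 * delta_cn
            - 0.19 * math.log(sp_rank)
            - 0.31 * abs(delta_mass)
            - 0.96)


# Same made-up spectrum scored with and without a recorded delta Cn.
with_dcn = sequest_discriminant_charge2(0.5, 0.3, 1, 0.2)
without_dcn = sequest_discriminant_charge2(0.5, 0.0, 1, 0.2)
print(with_dcn, without_dcn, with_dcn - without_dcn)
```

With a delta Cn of 0.3, the score differs by 7.39 * 0.3 from the case where delta Cn is unavailable and treated as zero, which is enough to shift the derived peptide probability.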
106. ascot. The Mascot scoring function Scaffold normally uses is the Mascot Ion Score minus the Identity Score. The Identity Score Scaffold uses is the level that has a 5% probability of being due to a random match. Mascot's concept of probability is somewhat different from Scaffold's, but roughly speaking, if you set Scaffold's Min Peptide probability to 95%, the black vertical line on the Mascot Histogram should be close to zero on the discriminant scale. Depending on the parameters set for the searches, at times a reduced amount of information is exported to the output files, and pieces of information needed to calculate the Identity Score are missing. This affects the calculated peptide probabilities and, consequently, the probabilities assigned to the identified proteins. When this happens, the User can select Ion Score Only in the Mascot tab of Advanced Preferences as the scoring option used by PeptideProphet, which reduces the error created by the improper calculation of the Ion-Identity scoring. Figure 4-14: Setting Mascot Scoring Function. The Mascot tab includes a table listing the different programs producing Mascot search results, with radio buttons for selecting which scoring function Scaffold uses in the PeptideProphet algorithm. Generic Mascot: The User can select
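The two Mascot scoring options can be summarized in a few lines. This is a hedged sketch of the arithmetic described above, not Scaffold's code; the function name and the `ion_score_only` flag are hypothetical stand-ins for the radio button choice.

```python
def mascot_discriminant(ion_score, identity_score=None, ion_score_only=False):
    """Ion-Identity scoring (ion score minus identity score) or
    Ion Score Only, depending on the selected option."""
    if ion_score_only:
        return ion_score
    if identity_score is None:
        # Mirrors the situation where the search output lacks the data
        # needed for the Identity Score.
        raise ValueError("identity score unavailable; use ion_score_only=True")
    return ion_score - identity_score


print(mascot_discriminant(45.0, 30.0))                 # Ion-Identity scoring
print(mascot_discriminant(45.0, ion_score_only=True))  # Ion Score Only
```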
107. ata is loaded, by going to Experiment > Add GO Annotations. For more information, see Edit GO Term Options. The terms are displayed as a term ancestry, with the high-level GO annotations shown as colored dots, which match the colors shown in the Samples Table, followed by their subsequent children. Figure 6-20. Double-clicking a GO term in this pane opens a browser page with detailed information about the term. GO terms may be hidden by unchecking View > Show GO Annotations. See Edit GO Term Options. Sample Information Pane: The Sample Information Pane displays the Biological and MS Sample names and descriptions for the selected MS Samples. The Biological Sample name and notes and the MS Sample name and notes can be edited here. To populate this pane, the User needs to click on a Bio/MS Sample column. To change a category name for a BioSample, go to Experiment > Edit BioSample. See Edit BioSample.
108. atabase, the FDR is calculated using the count of decoys against target identification hits. If proteins and/or peptides are filtered based on FDR, then the dashboard reports the specific protein and peptide thresholds necessary to reach those FDR values, based on the FDR Browser landscape. The FDR box where the values are reported appears with a red background. Option Indicator Lights: The Option Indicator Lights are six multi-colored dots located at the bottom of the Navigation pane, underneath the FDR Dashboard, in the Scaffold main window. Figure 4-26: Option Indicator Lights. Their purpose is to remind the User of the status of the following options. View menu options (green when selected): Show < 5% Probabilities, Show Lower Scoring Matches, Show Entire Protein Clusters. Load and Analyze options (green when selected): Use Protein Cluster Analysis, Use Independent Sample Grouping strategy. Scoring scheme: LFDR (green), PeptideProphet with Delta Mass correction (orange), PeptideProphet with no mass correction (black). The User can hover over any of the colored dots to check its function: a tool tip appears with a description of the selected dot. Display pane: The information included in the different views appears in the Scaffold Display pane. De
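A decoy-based FDR estimate of the kind mentioned above can be sketched as a ratio of decoy hits to target hits. The exact formula Scaffold applies is not given in this section, so the simple ratio below is an assumption, and the function name is hypothetical.

```python
def decoy_fdr(target_hits, decoy_hits):
    """Estimate the false discovery rate as decoy hits divided by target
    hits above a threshold (assumed formula, for illustration only)."""
    if target_hits == 0:
        return 0.0
    return decoy_hits / target_hits


# Made-up counts at a given probability threshold.
print(f"{decoy_fdr(200, 2):.1%}")
```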
109. ated with a specific protein, including those shared with other proteins. Exclusive Unique Spectrum Count (corresponding to Number of Unique Spectra in Scaffold 3): Number of distinct spectra associated only with a single protein group or PEG. Spectra are considered distinct when they identify different sequences of amino acids (peptides) or, within the same identified sequence of amino acids, when they identify different charge states or a modified form of the peptide. Total Unique Spectrum Count (only available with the clustering algorithm selected): Number of unique spectra associated with a specific protein, including those shared with other proteins. Exclusive Spectrum Count (corresponding to Number of Assigned Spectra in Scaffold 3): The number of spectra associated only with a single protein group or PEG. Total Spectrum Count (corresponding to Unweighted Spectrum Count in Scaffold 3): The total number of spectra associated with a single protein group or PEG, including those shared with other proteins. Quantitative Value (selected quantitative method): Scaffold displays the results of the Quantitative Method selected in the Quantitative Analysis dialog box. When a display option other than Protein Identification Probability is selected, the colored highlights do not change. The colors continue to represent the probability ranges specified by the legend. This is true no matter which statistic is chosen for viewing, so that
110. ations accordingly. Chapter 4: The Scaffold Window. The Scaffold application is built around a main window containing a number of different views. Each view provides a particular perspective on the data loaded in the experiment. Some tools are available in all views, and others help navigate within a selected view. The major components of the window are: the Title bar, the Main menu commands, the Tool bar, the Filtering pane, the Navigation pane, and the Display pane. Figure 4-1: Scaffold window.
111. atively treat the samples. Figure 3-1: Scaffold Wizard, New Quantitative Technique page in Scaffold Q+. Figure 3-2: Scaffold Wizard, New Quantitative Technique page in Scaffold Q+S. The selectable techniques shown are Spectrum Counting (Standard), iTRAQ 4-plex, iTRAQ 8-plex, TMT 10-plex, TMT 6-plex, TMT 2-plex, Stable Isotope Labeling (Multiplex) and Precursor Intensity (Standard). If you are carrying out this procedure using the sample tutorial_3seq data or the sample tutorial_3mas data provided by Proteome Software, then select Spectrum Counts. 4. Click Next. The Scaffold Wizard New BioSample page opens. Figure 3-3: Scaffold Wizard, New BioSample page.
112. available. Chapter 1: Getting Started with Scaffold. Scaffold is a software tool designed to help scientists identify and analyze proteins in biological samples. Using output files from MS/MS search engines, Scaffold validates, organizes and interprets mass spectrometry data, allowing the User to more easily manage large amounts of data, compare samples and search for protein modifications. The Scaffold Viewer is a free, read-only version of Scaffold available online for download. It facilitates the sharing of Scaffold analysis results among collaborators. This chapter covers the following topics: Initial requirements, which describes the minimum requirements for installing and running Scaffold; Scaffold Tiered Licensing, which explains the types of licenses available for activating the program; Scaffold Viewer, a free download; ScaffoldBatch, which loads and analyzes the same data that Scaffold does, but in a batch rather than an interactive environment; How Scaffold structures data, which describes the format of the files that are compatible with Scaffold. Initial requirements: Before installing and running Scaffold, the User needs to make sure that: 1. The comp
113. available at http://www.proteomesoftware.com/products/data/sequest_tutorial.zip. Or, from the perspective of loading: the Mascot data in folder tutorial_3mas and the related FASTA file control_sprot.fasta, which are available at http://www.proteomesoftware.com/products/data/mascot_tutorial.zip. To carry out this procedure using this sample data, you first have to extract the contents of the zip file. Note: If you are following this procedure using the SEQUEST data files, to shorten the time Scaffold takes to access SEQUEST's numerous subfolders, navigate at the operating system level (outside of the Scaffold program) to the folder in which you placed tutorial_3seq and open it, briefly viewing the subfolders within it. Note: When you see Mascot output over the web, you are viewing HTML summaries of results. These are not valid input for Scaffold. Scaffold requires DAT files from Mascot. The path and filename are usually visible as the last part of the URL in the address field of the browser displaying the results page, following the word "file". Select Quantitative Technique: 1. Open Scaffold. 2. In the Welcome to Scaffold (or Scaffold Q+ or Scaffold Q+S) window, click New. The Scaffold Wizard Welcome to Wizard page opens. Click Next to go to the Select Quantitative Technique page if you are running Scaffold Q+ or Q+S. 3. Specify how Scaffold Q+ or Scaffold Q+S is to quantit
114. ber of spectra associated with this protein, including those shared with other proteins. Mascot File Names: If a file containing Mascot data has a Mascot-assigned name (for example, F123456.dat or F987654.dat), Scaffold looks inside the Mascot file for the name of the original file that Mascot searched. Since that name is more likely to be meaningful, Scaffold uses it to name the MS sample. On the Load Data page, Scaffold displays the original name followed by the Mascot-assigned name in parentheses. On the Samples and Proteins pages, Scaffold uses only the original name. The MS Sample name can be changed in the Samples View (see Sample Information Pane). Mascot Output Files: Mascot is a search engine distributed by Matrix Science. Scaffold loads Mascot result files in DAT format. When Mascot is used through Proteome Discoverer, the results are stored in MSF files. NSAF: The Normalized Spectral Abundance Factor (NSAF) is a modified version of spectral counting. It was introduced and defined by the Washburn Lab group at the Stowers Institute to account for the effect of protein length, as large proteins tend to contribute more peptide spectra to the spectral counts than smaller ones do. For more detailed information, visit the Washburn Lab website: http://research.stowers.org/proteomics/Quant.html. Sequest Output Files: Sequest is one of the search engines d
115. bove distorts the data if the total protein loaded varies considerably from sample to sample. This is because low-abundance peptides may be on the edge of detectability. If, for example, sample A has a lot of protein loaded, the low-abundance peptides may be detected. If sample B has much less protein loaded, these low-abundance peptides might not be detected; that is, their spectral count is zero. No amount of scaling is going to change zero to any other number. The User can view the normalized data by selecting the Quantitative Value option from the Display Options pull-down list in the Samples View. When viewing Quantitative Value (Normalized Total Spectra), the default quantitative method in Scaffold, if the User switches to Total Spectrum Count and notices that all the values in one column change a lot compared to the values in the other columns, this is evidence of uneven protein loading. When this happens, it is important to be careful about how the data is used. Normalization schemes not supported in Scaffold: There are more sophisticated normalization schemes that attempt to normalize the data in a way that allows the User to compare, in a semi-quantitative way, the abundance of one protein with that of another protein in the same sample. Scaffold does not support these schemes. This means that the User should exercise caution when trying to draw conclusions about the stoichiometry of the proteins from quantitative values
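The point that scaling cannot recover an undetected peptide is easy to see numerically. The sketch below assumes a simple total-count scaling, in which each sample is multiplied so that its total equals the mean total across samples; this scheme and the function name are assumptions made for illustration, since the normalization itself is described elsewhere in the manual.

```python
def normalize_counts(samples):
    """Scale each sample's spectrum counts so that every sample has the
    same total (the mean of the original totals). Assumed scheme."""
    totals = {name: sum(counts.values()) for name, counts in samples.items()}
    mean_total = sum(totals.values()) / len(totals)
    return {
        name: {protein: count * mean_total / totals[name]
               for protein, count in counts.items()}
        for name, counts in samples.items()
    }


# Sample B has ten times less protein loaded; the low-abundance protein P2
# falls below detection there and its count is zero.
samples = {
    "A": {"P1": 90, "P2": 10},
    "B": {"P1": 10, "P2": 0},
}
normalized = normalize_counts(samples)
print(normalized["A"])
print(normalized["B"])  # P2 stays at zero after scaling
```

A large change in every value of one column after normalization, as in sample B here, is the sign of uneven loading that the text warns about.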
116. bove (see Protein Groups). Next, these protein groups are grouped into clusters of similar proteins. Two proteins or protein groups are considered similar if their joint weighted peptide evidence is at least half of the weighted peptide evidence of either protein. A protein is iteratively added to a cluster if it is similar to at least one other protein in the cluster. This can be translated into the following rules of thumb for cluster formation: 1. For two proteins to be clustered, the sum of the probabilities of their shared peptides must be at least 95%. 2. The proteins must share at least 50% of their evidence. This is determined by summing the probabilities of the shared peptides and comparing this value with the summed probabilities of all of the peptides for each individual protein. If the sum of the probabilities of the shared peptides is greater than or equal to half of the sum of the peptide probabilities for either of the individual proteins, a cluster is formed. 3. A protein may be included in an existing cluster if it meets the above criteria with a member protein of the cluster. For a detailed example of how a cluster is formed, see the extended version of this document published on our website, scaffold_protein_grouping_clustering.pdf. Clusters in the Samples Table: Thresholds and filters do not affect the formation of clusters, but they do determine which clusters and protei
117. c MS sample. 3. In the Proteins View, look for the number of amino acids in the protein. 4. Divide the exclusive spectrum count by the number of amino acids in the protein. This is the SAF for the specific protein. 5. The value appearing in the Samples View when the Quantitative Value display option for NSAF is selected is the normalized value of the SAF. 6. You can check the normalization factor for two values in the same MS sample: it should be the same along a column. Total Ion Count (TIC): Scaffold includes the following TIC (Total Ion Current) methods. Average TIC: Average of all the TIC values of the spectra assigned to a protein. Total TIC: Sum of all the TIC values of all spectra assigned to a protein. Top Three TIC: Sum of the top three TIC values among the spectra assigned to a protein. For each of these methods, when selected, the User needs to adjust the Minimum Value accordingly by selecting Other in the pull-down list. Precursor intensity quantitation: Scaffold supports label-free quantitation based on precursor ion intensity for data from Proteome Discoverer, Mascot Distiller and Spectrum Mill. Precursor intensity refers to the area under an MS1 spectrum peak corresponding to a specific peptide, whereas spectral counting cou
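The SAF steps and the three TIC summaries above can be sketched as follows. The NSAF normalization shown divides each SAF by the sum of SAF values in the same MS sample, which is the commonly published definition and an assumption here; any further scaling Scaffold applies is not covered. Function names and input values are hypothetical.

```python
def nsaf(spectrum_counts, lengths):
    """SAF = exclusive spectrum count / number of amino acids; NSAF is
    each SAF divided by the sum of SAFs in the sample (assumed)."""
    saf = {p: spectrum_counts[p] / lengths[p] for p in spectrum_counts}
    total = sum(saf.values())
    return {p: value / total for p, value in saf.items()}


def tic_summaries(tic_values):
    """Average, total and top-three TIC over the spectra of one protein."""
    ordered = sorted(tic_values, reverse=True)
    return {
        "average": sum(ordered) / len(ordered),
        "total": sum(ordered),
        "top_three": sum(ordered[:3]),
    }


# Made-up exclusive spectrum counts and protein lengths for one MS sample.
print(nsaf({"P1": 10, "P2": 30}, {"P1": 100, "P2": 100}))
# Made-up TIC values for the spectra assigned to one protein.
print(tic_summaries([5.0, 1.0, 3.0, 2.0]))
```

Because every SAF in a sample is divided by the same sum, the ratio of NSAF to SAF is constant along a column, which matches the check suggested in step 6.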
118. caffold Flexible Workflow: Scaffold supplements spectra search engines; it does not replace them. The User continues to run the output of his or her mass spectrometry experiments through SEQUEST, Mascot, MaxQuant, X! Tandem, or any other search engine compatible with Scaffold, as usual. Results are then imported into Scaffold. For more information about compatibility with Scaffold, see http://www.proteomesoftware.com/pdf/file_compatibility_matrix.pdf. Scaffold comes bundled with X! Tandem. To increase identification confidence, the User can run the bundled version of X! Tandem on data previously analyzed by other search engines. Simple Workflow including Scaffold: Scaffold uses various scientifically validated probabilistic methods to evaluate and analyze the imported data, displaying its results in the Samples and Proteins views (for more information, see the references listed in the Algorithms References appendix). Once the data is loaded and analyzed by Scaffold, results are saved in Scaffold files that bear the extension SF3. These files can be closed and reopened later, either through a full or a Viewer version of Scaffold (see Scaffold Viewer). When a licensed copy of Scaffold is installed on a computer, only one full copy at a time can run on the system. The User can, however, open multiple copies of the Viewer at the same time. Scaffold Viewer can open and read any S
119. ciated with a certain biological structure or function are differentially expressed in one category or another. Chapter 8: Protein Grouping and Clustering. This chapter describes the way Scaffold groups and pares down the list of proteins shown in the Samples Table so the User can focus on the most likely protein identifications present in the experiment. The grouping and paring are achieved using different algorithms depending on whether the Protein Cluster Analysis option is selected. The chapter details the grouping and clustering algorithms used in Scaffold as follows: Shared Peptide Grouping and Protein Cluster Analysis, which describes the Shared Peptide Grouping algorithm used for grouping, paring and clustering the proteins appearing in the Samples Table in version 4 and higher; Legacy Protein Grouping, which describes the grouping algorithm used in versions 3 and older and still applied in Scaffold version 4 and higher when the clustering option is not selected. Shared Peptide Grouping and Protein Cluster Analysis: Scaffold version 4 and higher includes the option of applying a method of grouping proteins called Shared Peptide Grouping. Scaffold versions 3 and lower instead used a different grouping algori
120. clusive peptides are placed to the left and included in the experiment. Proteins for which all of the associated peptides are subsumed by these identified proteins are eliminated from further consideration, as there is no independent evidence of their presence in the experiment. These proteins appear in the No Group columns in the Similarity View but are otherwise invisible in Scaffold. Figure 8-5: Similarity View, No Group columns. Protein Clustering: Assembling proteins into clusters is based on shared peptide evidence. While akin to Mascot's hierarchical family clustering, Scaffold's Protein Cluster Analysis is more stringent in its requirement for two proteins to appear in the same cluster. This added stringency often succeeds in separating proteins into sets of biologically meaningful isoforms. In essence, a cluster is a set of proteins with overlapping peptide evidence and may be treated as a proxy for a single identification. This view allows interpretation of identification probability, spectral counts and normalized quantitative values calculated at the level of clusters. Cluster formation begins with the creation of protein groups as described a
121. [Figure: Scaffold Wizard, Queue Files For Loading page.] 4. Continue to Queue more files for loading. Queue more files for loading: 1. If you have more data files to load for the current BioSample, then do the following for each set of these data files; otherwise continue to Step 2. If you are carrying out this procedure using the sample tutorial_3seq data provided by Proteome Software, do not add more files to the BioSample: each file is loaded in its own BioSample. If you are carrying out this procedure using the sample tutorial_3mas data, continue to Step 2. Click Queue More Files For This BioSample. Repeat Queue files for loading. NOTE: Selecting category descriptions, whenever possible, from the drop-down list of other names you have used will make sure you don't incorporate unintentional small differences in naming, which would prevent proper sorting in the Samples View. 2
122. ction. Launch Q+ Quantitation Browser: When using Scaffold Q+ or Scaffold Q+S, this command is available for switching to the Q+/Q+S quantitation window. Edit Quantitative Method/Purity Correction: See Edit Quantitative Samples. Window menu: Load Data (Ctrl+1), Samples (Ctrl+2), Proteins (Ctrl+3), Quantify (Ctrl+4), Similarity (Ctrl+5), Publish (Ctrl+6), Statistics (Ctrl+7). The User can use this menu to switch Scaffold views; it is equivalent to clicking the buttons located in the Navigation pane. Help menu: • Help on Current View: Opens the Online Help that is specific to the currently displayed topic. • Help Contents: Opens the Contents page for the Online Help. • Scaffold User's Guide: Opens the current Scaffold User's Guide. • Scaffold Q+ User's Guide: Opens the current Scaffold Q+S User's Guide. • Open Demo Files: Opens the folder where Scaffold demo files are stored. The User can choose any of the pre-loaded files to test Scaffold capabilities. • Show Log Files. • Referencing Scaffold. • Upgrade License Key. • Scaffold FAQs, Resource Center: Opens the User's default web browser to the Home page of the Proteome
123. ctrum matching a peptide. [Figure 11-11: Spectrum report columns] Column quick notes: The first 14 columns of the table provide information available in the Protein Report, identifying the sample and the protein. The Manual validation column reports whether the User manually validated a spectrum. This is done by selecting or deselecting the check box in the Valid column shown in the spectrum table displayed in the Peptides pane. One of the following possibl
124. d. Load the entire Spectrum Mill results directory into Scaffold. MaxQuant: MaxQuant 1.3 will only compute precursor intensity when two or more raw files are processed together. Each of the samples to be compared must be labeled with a different experiment name in the experiment.txt file. Generally, all MaxQuant results in a single directory load into Scaffold as a single sample. For precursor quantitation, however, the samples to be compared must be loaded into different BioSamples. Accordingly, Scaffold has a special dialog that opens when the program recognizes the presence of an experiment file. To place each experiment into its own BioSample from the loading wizard, select the MaxQuant output directory and click Add to Import Queue; then, when the dialog appears, select the first experiment. Click Next, then Add another BioSample, and select the same directory but choose a different experiment from the dialog box. In MaxQuant 1.4, precursor intensity may be computed even when analyzing a single raw file if the user selects the Label Free Quantitation option. Individual results may then be loaded into separate BioSamples in the usual way and used for Precursor Intensity Quantitation in either Scaffold or Scaffold Q+. If two or more raw files are analyzed together in MaxQuant 1.4 with the Label Free Quantitation option selected and no Experiment.txt file is provided, they form a single combined folder, which loads into Scaffold as a single sample.
125. d available for Scaffold to be able to compute PeptideProphet and ProteinProphet probabilities in a statistically correct fashion. When IdentityE data is loaded into Scaffold, an additional menu (the IdentityE menu) appears on the main menu bar after the Help menu. The menu provides options to configure absolute quantification. Warning: Scaffold uses its own algorithms (PeptideProphet, ProteinProphet, protein grouping) to determine both the list of proteins displayed in the Samples Table and their absolute quantities. While the intent is to reproduce Waters' quantification strategy (top 3 peptides per protein), because of these algorithm differences the list of proteins displayed and their quantities may differ somewhat from what is displayed in ProteinLynx. If you notice particularly large or confusing discrepancies, please let us know. Quantitation Option: Selecting the entry IdentityE > Quantitation Option opens the PLGS Quant Configuration dialog. The dialog contains: • Known Abundance Protein pull-down list: Used to select, from the proteins listed in the Samples table, the protein that has a known abundance, input the value, and use it for quantitative purposes. • Use accession, not name check box: Toggles the way the proteins are shown in the above pull-down list. • How much text box: Used to input the quantitation normalization factor when the data is shown using weight or volume. • Select Unit for Showing Data pull-down men
126. d proteins in the Similarity View. The stars in the Protein Grouping Ambiguity column are red when Scaffold loads the data. The stars turn green as a reminder that the User has already examined the Similarity View for the protein. Filtering Samples: There are three filters that can be used to increase or decrease the length of the displayed protein list in the Samples Table. Their function is to set minimum characteristics for identification confidence: • Protein Threshold • Minimum Number of Peptides • Peptide Threshold. The protein and peptide thresholds filter on probabilities, or on FDR values if the loaded data were searched using decoys. The drop-down lists include the two options depending on the type of searches loaded into Scaffold. It is possible to type a custom FDR threshold directly into the box by adding FDR to the end of the string (e.g. 10.3% FDR); for more information see FDR Filtering. [Figure 6-9: Scaffold Confidence Filters] • Proteins are displayed if each of the filter options is met by at least one sample. • Filters can be locked with a password. When locked, the filters cannot be changed unless the password is entered. This allows you to control which proteins are displayed when you distribute a Scaffold file. • Also note that Protein probability is derived in par
127. d taxonomy. Annotation preferences search for proteins that contain the text you specify. For example, you can select human entries out of SwissProt by searching for accession numbers that contain _HUMAN. All preferences also support regular expressions. You can select SwissProt numbers out of UniProtKB databases using OPQ. Please visit this site for more help with forming regular expressions. The dialog provides a series of text boxes (Preferred Protein Names, Preferred Accession Numbers, Preferred Taxonomy, Prefer Proteins with GO Terms) where the User can input his/her preferences. By default, Scaffold automatically selects the visible protein relying on the following five criteria, in the order shown below: 1. Prefer proteins that contain sequences (the User cannot modify this preference). 2. Prefer the accession number preference. 3. Prefer the protein name preference. 4. Prefer the taxonomy preference. 5. Prefer proteins that contain GO terms. Probability Legend: To provide a measure of how correct protein identifications are for any of the BioSamples or MS Samples, Scaffold uses a couple of different validation algorithms, which assign identification probabilities to the peptides. After that, using ProteinProphet, it groups the peptides by their corresponding protein(s) to compute probabilities that those proteins were present in the original sample se
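The two kinds of annotation preference described in item 127 (plain text containment and regular expressions) can be illustrated with a small sketch. This is not Scaffold's code: the accession list is made up, and the regular expression is an assumption based on the classic six-character UniProtKB accession format that begins with O, P, or Q, since the exact pattern given in the manual did not survive extraction. Whether Scaffold matches the whole accession or only part of it is also not stated, so full matching is used here for illustration.

```python
import re

# Hypothetical accession numbers, for illustration only.
accessions = ["ALBU_HUMAN", "CRBB3_BOVIN", "P02768", "Q9Y6K9", "gi|1351907"]


def contains_text(entries, text):
    """Plain-text preference: keep entries that contain the given text."""
    return [entry for entry in entries if text in entry]


# Assumed pattern: classic UniProtKB accessions starting with O, P, or Q,
# followed by a digit, three alphanumerics, and a final digit.
UNIPROT_OPQ = re.compile(r"[OPQ][0-9][A-Z0-9]{3}[0-9]")


def matches_pattern(entries, pattern):
    """Regular-expression preference: keep entries the pattern matches in full."""
    return [entry for entry in entries if pattern.fullmatch(entry)]


human_entries = contains_text(accessions, "_HUMAN")
opq_entries = matches_pattern(accessions, UNIPROT_OPQ)
print(human_entries)
print(opq_entries)
```

A plain containment test is enough for suffixes such as _HUMAN, while a regular expression is needed when the preference depends on the shape of the accession rather than on a fixed substring.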
128. dicate valid results, whereas large numbers of unidentified peaks do not. For increased confidence in Scaffold's statistical analysis, examine the Statistics View to ensure that statistical assumptions are met. For information on combining multiple searches, see Searle 2008 and proteome software wikispaces com file view improving sensitivity by combining MS MS results. Scaffold Views: Scaffold offers both a high-level overview of the imported search results and a detailed look at supporting data, facilitating both top-down and bottom-up analysis. Scaffold presents the more detailed levels in a coherent structure, helping the User verify critical findings. Load Data View: This view allows the User to load additional data, review the list of files loaded in each BioSample, edit a BioSample's information, or delete already loaded MS Samples. [Figure 2-3: Load Data View]
129. directory. Creating a spectral library in Skyline: Skyline is a popular application used to create and iteratively refine targeted methods for proteomics studies (see Skyline). It also provides tools to build spectral libraries from validated peptide-spectrum matches. Scaffold experiments can include a variety of validated spectrum matches coming from different sources and analyzed using multiple search engines, which makes them a good source of validated spectrum matches. Scaffold exports spectral identification results as mzIdentML reports with associated peak lists in MGF format. Within these files, Scaffold embeds precursor intensity, retention time, and a reference to the original RAW file, which are requirements for creating transition libraries in Skyline. The mzIdentML exports are now compatible with Skyline and can be used to create spectral libraries within that application. Furthermore, Scaffold supports a large variety of search engine reports, some of which are not currently compatible with Skyline. In particular, Proteome Discoverer is a common platform for MS/MS-based proteomics which is now compatible with Skyline using Scaffold as an interface. The User can create a spectral library in Skyline by following these instructions: 1. From a Scaffold experiment, select the menu option Export > mzIdentML; the Export mzIdentML dialog o
130. dow; Chapter 5 The Load Data View; Chapter 6 The Samples View; Chapter 7 Quantify View; Chapter 8 Protein Grouping and Clustering; Chapter 9 Quantitative Methods and tests; Chapter 10 Precursor Intensity Quantitation; Chapter 11 Reports; Appendix. Preface: Welcome to the Scaffold User's Manual. The purpose of the Scaffold User's Manual is to answer Users' questions and guide them through the procedures necessary for using Scaffold efficiently and effectively. Using the manual: The Scaffold User's Manual is easy to use. The User can simply look up the topic that he/she needs in the table of contents or the index. Later in this Preface, a brief discussion of each chapter is provided to further assist the User in locating the information that he/she needs. Special information about the manual: The Scaffold User's Manual has a dual-purpose design. It can be distributed electronically and then printed on an as-needed basi
131. e list. Configure Database Parser [Figure 4-7: Configure Database Parser dialog] This dialog contains tools to help the User describe and edit the location of the selected database: • Name Database Parser: Through this text box the User can change the name assigned to the database when loaded. • Reset file location: By clicking this button the User can point Scaffold to a different location where the database is stored. • Database version: This text box can be used to define the database version. The dialog also contains tools to help the User parse the database as needed: • Parsed Accession numbers and protein names table: This table lists a sample of the protein accession numbers and descriptions contained in the database. The list includes proteins selected from the top and the bottom of the database file to give an idea of the type of acces
132. e 4-9: GO Term Configuration dialog, Displayed GO Terms tab. [Screenshot: the GO annotation tree of biological process terms, the Add, Remove, Reset to User Default, and Reset to Scaffold Default buttons, and the Selected GO Terms table with the columns Color, ID, Head Node, Selected GO Terms, and Definition]
133. e ProteinProphet. When loading data through the Wizard, the User is presented with the following peptide validation scoring systems to choose from: • PeptideProphet Scoring: This scoring algorithm learns the distributions of search scores and peptide properties among correct and incorrect peptides and uses those distributions to compute, for each peptide, a probability that it is correct (see PeptideProphet). • LFDR-based Scoring: This is a novel scoring algorithm based on a Bayesian approach to local False Discovery Rate. It is especially effective for QExactive and other high-mass-accuracy data (see LFDR-based scoring system). The protein probability values are reported in the Samples Table when selected from the Display Options pull-down list. They are color coded to highlight significant differences in protein identification confidence. The coloring is kept even when another statistic is selected from the Display Options list. Located at the top of the view, the Probability Legend defines the color coding for the protein identification probability. Sorting feature: When the Samples View first opens, the displayed proteins are initially sorted based on a protein probability of 50% (Scaffold's calculated probability, a percentage expressing how likely it is that the protein identification is correct); if any proteins have the same probability, those proteins are sorted alphabetically based on their accession numbers. You can use the tri-state column sorting
134. e abundance of individual proteins in multiple independent samples and is typically applied to quantify the expression changes in various complexes. It is generally calculated as the number of spectra (SpC) identifying a protein divided by the protein length (L), referred to as the Spectral Abundance Factor (SAF), and then normalized over the total sum of SpC/L in a given analysis. This means that SAF is divided by the sum of SpC/L for all proteins in the experiment. The NSAF values shown in the Samples Table when NSAF is selected as a quantitative value are calculated using the NSAF strategy 2 a listed in Table 1 of Zhang 2010. The calculation used in Scaffold translates to the following expression: SAF = (number of exclusive spectra) / (length of the protein, expressed in number of amino acids). The SAF value is then normalized using the regular Scaffold quantitative value normalization scheme (see Normalization among samples in Scaffold) to derive the NSAF values shown in the Samples Table. NSAF calculations in Scaffold: To check the calculation of NSAF in Scaffold, the User should compute the SAF value for a couple of proteins along the same column in the MS Sample view. To do so: 1. Select the Display option Exclusive Spectrum Count. 2. Select a protein from the protein list and note the exclusive spectrum count for that protein appearing in a specifi
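The general NSAF definition given in item 134 (SAF = SpC / L, then divided by the sum of SpC / L over all proteins) can be checked by hand with a short sketch. The protein names, counts, and lengths below are invented for illustration. Note also that this follows the general definition only: the values Scaffold displays are normalized with its own quantitative value normalization scheme, which is not reproduced here.

```python
def spectral_abundance_factor(exclusive_spectra, length_aa):
    """SAF: exclusive spectrum count divided by protein length in amino acids."""
    if length_aa <= 0:
        raise ValueError("protein length must be positive")
    return exclusive_spectra / length_aa


def nsaf(proteins):
    """General NSAF: each protein's SAF divided by the sum of all SAF values.

    `proteins` maps a protein name to (exclusive spectrum count, length).
    """
    saf = {
        name: spectral_abundance_factor(spc, length)
        for name, (spc, length) in proteins.items()
    }
    total = sum(saf.values())
    if total == 0:
        raise ValueError("at least one protein must have a nonzero spectrum count")
    return {name: value / total for name, value in saf.items()}


# Hypothetical sample: (exclusive spectrum count, length in amino acids).
example = {
    "protein_A": (10, 100),
    "protein_B": (30, 100),
    "protein_C": (20, 400),
}
values = nsaf(example)
for name, value in values.items():
    print(f"{name}: {value:.4f}")
```

Dividing by length keeps long proteins, which tend to produce more spectra, from dominating the comparison; protein_C has more spectra than protein_A here but a lower NSAF because it is four times longer.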
135. e statuses of the Valid check box can appear in the report: • Possibly Correct: The User accepted the status of the box resulting from Scaffold analysis and did not touch it. • Correct: The User deselected and then selected the box again. • Unchecked box: If the box remains unchecked, the spectrum does not appear in this report. Number of enzymatic termini (NTT): When the digestion enzyme is trypsin, this tells whether the peptide is tryptic (2), semi-tryptic (1), or non-tryptic (0). Peptide report: The Peptide report details all the peptides that pass the current filter and threshold settings. The report header rows identify the data and how it was created, which is the same information that is contained in the Publication report. After the header, every row represents a peptide in one of the samples present in the Scaffold experiment. For example, if there are 3 samples, each with 100 peptides, there will be 300 rows in the report. Even if several spectra match a peptide, the peptide only gets one line in this report. [Figure 11-12: Peptide report columns]
136. e support contract. Once the key has expired, Scaffold continues to work beyond the key expiration date, but no new upgrades are allowed unless the support contract is renewed. The User must contact sales@proteomesoftware.com to purchase the appropriate key. A Time Based License key is valid for only a single computer. If the User moves the Scaffold installation to a different computer, he/she should contact sales@proteomesoftware.com to transfer the key at no charge. After the User enters the key and presses OK, the Key dialog box closes and a Scaffold Welcome message opens. The Welcome message and the title bar of the Scaffold main window indicate the application to which the User has access: Scaffold, Scaffold Q+, or Scaffold Q+S. If the User is using an evaluation copy of Scaffold, an Evaluation message opens first, indicating the number of days left in the evaluation period. The User must click OK to close this message, and then the Scaffold Welcome message opens. [Figure 1-3: Welcome window (Scaffold version <Version>), indicating access to Scaffold, Scaffold Q+, or Scaffold Q+S]
137. e variable aims to describe the dispersion of the variable in a way that does not depend on the variable's measurement unit. The higher the CV, the greater the dispersion in the variable. The CV for a model aims to describe the model fit in terms of the relative sizes of the squared residuals and outcome values. The lower the CV, the smaller the residuals relative to the predicted value. This is suggestive of a good model fit. CV = σ / μ, where σ is the standard deviation and μ is the mean. In Scaffold, the User can select the Coefficient of Variance test only when he/she adds two or more BioSamples to the Select Samples Table in the Quantitative Analysis set up dialog. • The CV is only defined for a nonzero mean. Because the CV is expressed as a percentage, Scaffold multiplies the ratio of the standard deviation to the mean by 100. • The CV is typically used to describe the dispersion of the variable independently of its measurement unit. The higher the CV, the greater the dispersion in the variable. For example, when analyzing four samples A, B, C, and D, the coefficient of variance indicates how dispersed the values are with respect to their mean. A small coefficient of variance means that the four samples have values close together compared to their average value. If the coefficient of variance is big, then at least one of the four samples is different, but the test does not specify whether it is A, B, C, or D. Examining Scaffold s Q
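The coefficient of variance described in item 137 (standard deviation divided by the mean, multiplied by 100) can be sketched as follows. The sample values are invented, and the use of the sample standard deviation (n - 1 in the denominator) is an assumption; the manual excerpt does not say which estimator Scaffold uses.

```python
import statistics


def coefficient_of_variation(values):
    """Return the CV as a percentage: 100 * standard deviation / mean.

    Assumes the sample standard deviation; the CV is undefined for a zero mean.
    """
    if len(values) < 2:
        raise ValueError("at least two values are required")
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("the CV is only defined for a nonzero mean")
    return 100.0 * statistics.stdev(values) / mean


# Hypothetical quantitative values for four samples A, B, C, and D.
tight = coefficient_of_variation([10, 10, 10, 10])
spread = coefficient_of_variation([8, 10, 12, 10])
print(f"tight: {tight:.2f}%  spread: {spread:.2f}%")
```

As the text notes, a large value only says that at least one sample departs from the others; it does not identify which one.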
138. ScaffoldBatch: ScaffoldBatch is a command-line version of Scaffold designed to run in the background of a system where it is installed. ScaffoldBatch reads its commands from an XML-type driver file (SCAFML) rather than from the graphical user interface (GUI) and creates an SF3 Scaffold experiment. When selected, the command option Export > ScaffoldBatch creates a SCAFML driver file that, if run in ScaffoldBatch, reproduces the Scaffold experiment from which it is exported. SCAFML files are often used as an interface between Scaffold and another program. A number of labs have created custom software that uses SCAFML driver files as an interface between a Laboratory Information Management System (LIMS) and Scaffold. Commercial versions of ScaffoldBatch are available bundled into Sage-N Research's Sorcerer and Genologics Proteus Analytics. Matrix Science has an interface between their Integra LIMS system and Scaffold that uses SCAFML files. ScaffoldBatch Archive: When selected, the command option Export > ScaffoldBatch Archive bundles into one package a SCAFML driver file that, if run in ScaffoldBatch, reproduces the Scaffold experiment from which it is exported, along with all the files that are referenced in the SCAFML driver. This package typically contains the input files and the FASTA database, and it is saved in a compressed format such as zip or tar. Bef
139. edure using the sample tutorial_3seq data or the sample tutorial_3mas data provided by Proteome Software, leave both these boxes unchecked. 3. Click Next. The Scaffold Wizard Queue Files for Loading page opens. [Figure 3-4: Scaffold Wizard, Queue Files for Loading page. The page text reads: "You can select Sequest or Mascot files to add to this BioSample. You will have the opportunity to add more files later. Scaffold will not load and analyze the files, which can be time consuming, until you are done adding files."] 4. Continue to "Queue files for loading". Queue files for loading: 1. Click Queue Files for Loading. The Select Data Files dialog box opens. [Figure 3-5: Select Data Files dialog box]
140. ein Cluster will show in a gray font proteins or protein groups in clusters that do not pass thresholds. Clusters Display Values: Display values are calculated for a cluster as a whole, based on the set of peptides that make up the cluster. Note that these values are different from the values of any individual protein, including the primary protein of the cluster. Selecting a cluster and going to one of the other views displays all of the information for the entire cluster. The diagram in Figure 8-6 illustrates Scaffold's method of spectra and peptide counting in clustered proteins. The circles A, B, and C represent three proteins, of which B and C form a cluster. The little squares in the circles represent the spectra included in the proteins; their charge is also indicated. The table on the side shows how the different quantitative values are counted for each protein and for the cluster. Note that the total spectra of the cluster does not correspond to the sum of the total spectra of the proteins included in the cluster, because some of the peptides are shared. [Figure 8-6: Spectral counting in clustered proteins] Legacy Protein Grouping: Scaffold 4 groups proteins with the Legacy Protein Grouping algorithm used in its versions 3
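The point made in item 140, that a cluster's total spectrum count is not the sum of its members' counts because shared spectra are counted once, can be illustrated with sets. The spectrum identifiers below are hypothetical; the actual values of Figure 8-6 could not be recovered from the extracted text, and this sketch is not Scaffold's implementation.

```python
# Hypothetical spectrum identifiers assigned to two proteins in one cluster.
spectra_by_protein = {
    "protein_B": {"s1", "s2", "s3", "s4"},
    "protein_C": {"s3", "s4", "s5", "s6"},
}


def cluster_total_spectra(members, spectra):
    """Count each spectrum once across all proteins in the cluster."""
    combined = set()
    for member in members:
        combined |= spectra[member]
    return len(combined)


members = ["protein_B", "protein_C"]
cluster_total = cluster_total_spectra(members, spectra_by_protein)
naive_sum = sum(len(spectra_by_protein[m]) for m in members)
print(f"cluster total: {cluster_total}, sum of member totals: {naive_sum}")
```

The difference between the two numbers equals the number of spectra shared by the member proteins.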
141. elete selected entries (see Configure Peptide Thresholds Dialog Box). The filter criteria in effect can be reviewed on the Publish View page and in the Publication report. Scaffold may work more slowly with custom filters. • The ability to define and apply custom filters can also be controlled by a password, which can make sharing correctly displayed datasets between colleagues easier (see Preferences, Password). Configure Peptide Thresholds Dialog Box: Through the Configure Peptide Thresholds dialog, the User can define custom peptide filters to augment the available standard ones. [Figure 6-13: Configure Peptide Thresholds dialog. The dialog text reads: "Scaffold transforms search engine scores into statistical probabilities that make protein identifications easier to validate. In most cases the assumptions that Scaffold makes about your data are valid, although that is not always the case. You should be prepared to justify the usage of search engine specific score thresholds when attempting to publish your results." The dialog provides a name field for the peptide threshold and minimum thresholds for peptide probability, SEQUEST scores (DeltaCn, XCorr per charge state), and Mascot scores (Ion Score per charge state, Identity Score).]
142. ely, those parameters cannot be adjusted in PD version 1.2. To address this issue, we created the Sequest tab in the Advanced Preferences, where the User can specify what type of scoring function Scaffold uses in the PeptideProphet algorithm when loading Sequest searches. [Figure 4-13: Setting Sequest Scoring Function] The Sequest tab includes a table with the different available options: • Generic Sequest: The User can select either Discriminant Score or XCorr Only. • Discoverer Sequest: The User can select Discriminant Score, XCorr Only, or Scaffold's auto-detect function, which checks in the loading phase whether all the needed information is included in the data files. When all the proper information is present, Scaffold uses the Sequest discriminant score; otherwise it uses XCorr Only. Note: When XCorr Only is used, the list of identified proteins will be shorter, because XCorr Only reflects more stringent conditions when calculating the peptide probabilities. Mascot tab: To ensure a proper definition of the Mascot scoring function used by Scaffold to calculate the peptide probabilities with PeptideProphet, a certain amount of information needs to be available in the output files created by M
143. ent protein identifications. It will, however, increase processing time. X!Tandem runs quickly relative to SEQUEST or Mascot, but the User should expect searching Swiss-Prot with several hundred thousand spectra to take a large amount of time, and searching a huge database such as the NR database will take even longer. To include X!Tandem results as part of a Scaffold experiment, the User has to check the box labeled Analyze with X!Tandem in the Load and Analyze Data page of the Loading Wizard. X!Tandem runs with the same parameters as the original loaded search files, but variable modifications can be added to the ones already included. 1. The Scaffold Wizard Load and Analyze, Validation with X!Tandem page opens. [Figure 3-17: Scaffold Wizard, Load and Analyze, Validation with X!Tandem] The Wizard Va
144. eptide. • BLAST protein sequence: opens the http://www.ncbi.nlm.nih.gov/blast site for the current protein. • Copy All Data: copies all the data listed in the table shown in the current pane to the clipboard. • Copy Image: copies the image of the current view and current pane to the clipboard. • Copy Peak List: copies the peak list of the current spectrum to the clipboard. • Copy Protein Sequence: copies the sequence to the clipboard. • Copy Publication-Sized JPEG: for publication purposes. • Copy Selected Cell: copies the selected table cell to the clipboard. • Copy Selected Row: copies the selected table row to the clipboard. • Copy WMF/EMF: copies the picture using Windows Metafile formats, which are portable between applications. They contain both vector graphics and bitmap components. Images can be edited and scaled without compromising their resolution. • Delete Biological Sample: a window pops up asking to confirm the deletion. • Display Parent Ions: toggle function. • Display Unknown Markers: toggle function. • Export to Excel: exports the information in the current tab table. • Edit BioSample: see Edit BioSample. • Print: prints the image of the current view and pane. • Queue Files for Loading: see Queue Files for Loading. • Save As: provides the option of saving pictures in a large variety of graphical formats.
145. er DeltaCn calculations. If the User of PD 1.4 or older chooses to adjust the Peptide Scoring Options as described above, the MSF file created retains the lower-scoring matches and can then be loaded in Scaffold using the regular discriminant score, either by specifically selecting the option in Scaffold's Advanced Preferences or by selecting auto-detect. The PD options shown below are not available in PD 1.2. In this case, selecting the auto-detect feature will ensure the proper handling of the data. [Figure 4-15: Sequest Advanced Parameters, Peptide Scoring Options in Proteome Discoverer] PD Mascot suggested settings: In Proteome Discoverer 1.4 and older, the Work Flow settings for Mascot include parameters that determine the amount and type of spectra saved in
146. erated
• What databases were searched to identify the proteins
• What parameters were used by the search engine or engines
• What criteria were used for protein identification
Following this is a narrative description of the same information. This can be used as a rough draft for the methods section of any journal article. Although the User will undoubtedly want to clean up this computer-generated text to improve its readability, it gives a place to start.
Exporting MCP supplemental table, the second step in the MCP Submission checklist, exports the Publication report. In the MCP Submission procedure, the User must finish step 1, describing the experimental methods, before he/she can export the Publication Report, in order to ensure that the report is complete. For publication in Proteomics journals, the User might also use the Protein report and Peptide report as supplemental supporting data.
Figure 11-9. Publication report example [report text truncated]
147. es, check their analysis information, the fixed and variable modifications, or edit BioSample information.
Figure 5-1. The Load Data View (callouts: Load and Analyze Queue button, Information Pane)
This chapter details the Scaffold Load Data View, describing the different elements that constitute it:
• Experiment Pane on page 112, which provides general information about the currently loaded experiment and tools to add MS samples to a BioSample or create a new BioSample within the current experiment.
• BioSample tabs on page 117, which contain the lists of already loaded MS files, or files in queue, for each BioSample, together with specific loading inf
148. es are selected, Fold Change by Category is available for display.
Figure 10-9. Fold Change by Category. A: Selected samples must belong to exactly two categories. B: The Fold Change values are displayed in a column in the Samples View.
The fold change values represent the ratio of the average of the quantitative values of the selected samples in the comparison category to the average of the quantitative values of the samples in the reference category.
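The ratio described for Fold Change by Category can be sketched as follows. This is only an illustration of the stated definition, not Scaffold's implementation: the function name is hypothetical, and returning `None` for a zero reference average is an assumption, since the text here does not say how that case is handled.

```python
from statistics import mean


def fold_change_by_category(comparison_values, reference_values):
    """Mean quantitative value of the comparison category divided by
    the mean quantitative value of the reference category."""
    if not comparison_values or not reference_values:
        raise ValueError("each category needs at least one selected sample")
    reference_mean = mean(reference_values)
    if reference_mean == 0:
        # Assumption: the manual excerpt does not define this case.
        return None
    return mean(comparison_values) / reference_mean


# Two samples per category; the values stand in for any quantitative value.
print(fold_change_by_category([12, 18], [5, 5]))
```

A value above 1 means the protein is more abundant on average in the comparison category than in the reference category.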
149. es for Loading: Not available in the Viewer version; see Queue Files for Loading.
Icon | Function
Load and Analyze Queue | Not available in the Viewer version, and active only when there are files listed in the loading queue in the Load Data View waiting to be loaded in Scaffold. When selected, it opens the Load and Analyze Data page of the Loading Wizard.
Quantitative Analysis | Opens the Quantitative Analysis dialog.
Scaffold Q+ / Scaffold Q+S | Available when running Scaffold Q+ or Scaffold Q+S; opens the Scaffold Multiplex Quantitation window.
Help | Opens the Scaffold Online Help.
Filtering pane: The Scaffold Filtering pane, located on the right of the Tool bar, contains filters and thresholding tools the User can adjust to increase or decrease the length of the displayed protein list in the Samples Table; see Filtering Samples.
Figure 4-24. Scaffold Filtering Pane (Protein Threshold, Min Peptides, Peptide Threshold)
Navigation pane: The Scaffold Navigation pane is a vertical bar displayed on the left side of the Scaffold window. The bar contains buttons that toggle the seven different views available in the Scaffold main window. See The Load Data View
150. estarted.
Processors: This tab provides information to Scaffold about the maximum number of processors available for threading computations. The default value is the maximum number of processors available in the system where the application is installed. The Scaffold application itself uses only two threads. Assigning more than two threads to Scaffold mainly affects how fast the X! Tandem version bundled with Scaffold executes, thus optimizing the throughput.
Web Link: Through this tab the User can add, change or delete the links to online protein lookup databases. These web sites appear in the Lookup Accession Number in pull-down list found in the Protein Information pane in the Samples View. When selecting these databases, it is important to note that they do not need to be the same as the FASTA database that was used in the searches, but they must have the same type of accession numbers. Clicking New Database opens the Configure Web Link dialog, where information for a new online database can be added. The linked database can be either a public database or an internal one. It just has to have a URL that queries the database. This link could also be to a web site that performs a calculation or does a BLAST. The User might want to set up several links to the same database that query it using different types of accession numbers. The Edit button allows the User to modify a web link to adjust the URL if it has changed or a better UR
151. eue Files Currently Loaded
The elements included in the BioSample window are the following:
• Loading Parameters pane: The information included in this pane reports the name of the BioSample, the number of spectra loaded and the name of the category. Underneath, it reports the settings selected during the loading phase on the New BioSample page of the Wizard: whether standard (separate processing for each MS sample) or MuDPIT (combined analysis for all samples) applies, and whether the samples were loaded using the condensing option or not. The User can change these settings by editing the BioSample from the Experiment menu or by right-clicking the BioSample tab; see Edit BioSample.
• Files in Loading Queue table: Lists the files ready to be loaded in Scaffold. If the User has selected files from more than two search engines, he/she needs to scroll towards the right side of the table to see all the files.
• Files Currently Loaded: Lists the files already loaded in Scaffold. The loaded files are highlighted in yellow when the Scaffold analysis is completed. After analysis, files from the same MS sample run through multiple search engines are aligned on a single row. Hovering the cursor over a file shows its full name (see Mascot File Names on page 209) and the database loaded.
• Load and Analyze Queue button: Opens the Load and Analyze Data page of the Loading Wizard; see Load and Analyze Data.
The User can modi
152. everal tests to identify proteins which show different quantitative abundances in two or more categories. Which test to use depends upon the experimental design, particularly the number of replicates available. The tests are available for selection through the Quantitative Analysis dialog. In the dialog the User also chooses the categories being tested and the quantitative value being used. The tests are based upon the data that is being displayed in the Samples Table. Adjusting the peptide and protein filters and thresholds or the ReqMod filter, or toggling Show Lower Scoring Matches, changes the number of proteins shown in the table, and the tests may select different proteins as having abundance level changes.
Figure 9-3. Scaffold's tests for differential abundance
Up to seven tests are potentially applicable, depending on the number of categories and replicate samples included in the categories:
• Fold Change by Sample
When a quantitative test is selected and applied, a column appears in the Samples Table with the test results. Sorting on this column brings the differentially expressed proteins together. These proteins can then be:
• checked for the quality of
153. ey happen to be equally named, a number is appended at the end of the original denomination to distinguish them.
Queue Scaffold Files for Merging: After selecting the first file to be merged, the Queue Scaffold Files for Merging dialog appears. The function of the dialog is to help the User compile a list of files to be merged together in the same experiment. The Add More Files button opens a file chooser, which allows the User to locate, select and add files to the list appearing in the dialog. The Merge button merges the list of files into the original Scaffold experiment, creating one or more new BioSamples for each file in the list. This means that if a merged file contained more than one BioSample, the different BioSamples appear in the merged experiment.
Caution: It is not possible to delete a specific file from this list. Once the files are merged, the Delete Biological Samples operation can be used to delete any undesired BioSample.
Save Condensed Data: The File > Save Condensed Data menu option reduces the size of the SF3 file Scaffold saves.
Caution: Executing Save Condensed Data changes the data in the running copy of Scaffold. Once the User saves condensed, there is no undo that will restore all the data.
Figure 4-4. Save Condensed File. This function creates Condensed Scaffold files containing only identified Spectra for distributing data over
154. f groups of data files organized in a number of directories and sub-directories for a specific category. The organization of the directories should be similar to the one depicted in Figure 5-3.
Figure 5-3. Organization of structured directories (Category level: data set folder; second level: Biosample 1, Biosample 2, ..., Biosample N)
Clicking the Queue Structured Directories button opens a file browser. The User should then point Scaffold to the top level of the structured directory containing the data files to be loaded in a specific Category. Scaffold loads all the MS files found in one of the second-level folders in a single BioSample, even when the files are contained in sub-sub-folders. Once the top-level folder is selected, a dialog opens asking to define the name of the category. If the User is running Scaffold Q+ or Scaffold Q+S, the dialog also asks to define the type of quantitation to perform with the data.
Figure 5-4. Defining Categories and quantitation methods (dialog prompts: "Which Category should we put these Biological Samples in?", "Please select a quantification method")
If the data is already organized
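The grouping rule for structured directories (every second-level folder becomes one BioSample, including files in deeper sub-folders) can be sketched as below. This is a hypothetical illustration only: the function name is invented, it does not call Scaffold, and it collects every file it finds rather than only MS search result files.

```python
from pathlib import Path


def group_files_by_biosample(top_level):
    """Map each second-level folder name to all files found beneath it,
    however deeply nested, mirroring the one-BioSample-per-folder rule."""
    top = Path(top_level)
    biosamples = {}
    for folder in sorted(p for p in top.iterdir() if p.is_dir()):
        # rglob("*") descends into sub-sub-folders as well.
        files = sorted(
            f.relative_to(folder).as_posix()
            for f in folder.rglob("*")
            if f.is_file()
        )
        biosamples[folder.name] = files
    return biosamples
```

Calling `group_files_by_biosample("path/to/category_folder")` (a placeholder path) returns a dictionary whose keys would correspond to the BioSample names.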
155. ferent amino acid sequences regardless of any
Unique spectra | Spectra that differ in amino acid sequence, charge state or modifications
Exclusive | Associated with a single protein group
Total | Associated with a protein group, whether or not it is shared with other protein groups
Table 2. Terminology comparison between Scaffold 4 and Scaffold 3
Scaffold 4 term | Scaffold 3 term | Description
Exclusive Unique Peptide Count | Number of Unique Peptides | Number of different amino acid sequences that are associated only with this protein
Exclusive Unique Spectrum Count | Number of Unique Spectra | Number of unique spectra associated only with this protein
Exclusive Spectrum Count | Number of Assigned Spectra | Number of spectra associated only with this protein
Total Spectrum Count | Unweighted Spectrum Count | Total number of spectra associated with this protein, including those shared with other proteins
Total Unique Peptide Count | N/A | Number of different amino acid sequences that are associated with this protein, including those shared with other proteins
Total Unique Spectrum Count | N/A | Number of unique spectra associated with this protein, including those shared with other proteins
Appendix D: Mouse Right Click Commands
• BLAST Peptide Sequence: opens the http://www.ncbi.nlm.nih.gov/blast site for the current p
156. ffold displays.
• Parent Mass Tolerance: The Parent Mass Tolerance is a filter on the mass accuracy applied after the probability calculation.
• Min Enzymatic Termini (NTT)
• Min Peptide Length
• Program Scores: These check boxes determine which scores from each search engine are used to filter out proteins.
Note: Some Scaffold filtering operations are faster using the standard peptide filters than using the custom peptide filters.
Min Enzymatic Termini (NTT): When peak lists are searched with a search engine such as Sequest, Mascot, OMSSA, Phenyx, Spectrum Mill or X! Tandem, two of the parameters set are the digestion enzyme and the number of missed cleavages. The search engine only matches spectra to peptides which conform to these parameters. One approach to increasing the likelihood that the peptides found are correct is to specify that there is no enzyme when running the search engine and then restrict the results to peptides conforming to the digestion enzyme. Since trypsin is the most common digestion enzyme, the filter in Scaffold is called NTT (Number of Tryptic Termini). By excluding the peptides with good scores which are non-tryptic, the number of false positives decreases, but so does the sensitivity. This filtering on NTT is similar to searching with a loose mass tolerance and then restricting to look at only peptides within a tight mass tolerance. Both approaches
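The idea behind an NTT filter can be sketched as below. This is a simplified illustration under stated assumptions, not Scaffold's code: a terminus counts as tryptic when the cleavage follows K or R or coincides with a protein terminus, the common "no cleavage before proline" exception is ignored, and both function names are hypothetical.

```python
TRYPTIC_RESIDUES = {"K", "R"}


def count_tryptic_termini(protein, start, end):
    """Return 0, 1 or 2: how many ends of protein[start:end] are
    consistent with trypsin cleavage (simplified rule, see lead-in)."""
    if not (0 <= start < end <= len(protein)):
        raise ValueError("peptide must lie inside the protein")
    # N-terminal side: the residue just before the peptide must be K or R.
    n_term_ok = start == 0 or protein[start - 1] in TRYPTIC_RESIDUES
    # C-terminal side: the peptide's own last residue must be K or R.
    c_term_ok = end == len(protein) or protein[end - 1] in TRYPTIC_RESIDUES
    return int(n_term_ok) + int(c_term_ok)


def passes_min_ntt(protein, start, end, min_ntt):
    """True when the peptide has at least min_ntt tryptic termini."""
    return count_tryptic_termini(protein, start, end) >= min_ntt
```

With a minimum of 2 only fully tryptic peptides pass, 1 also admits semi-tryptic peptides, and 0 disables the filter.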
157. fy the name of the BioSample and categories by opening the Edit BioSample dialog. This dialog can be reached through the Experiment menu or by right-clicking the BioSample tab; see Samples View in the Mouse Right Click Menus section.
Note: The User can delete files before analysis, or MS samples after analysis, by right-clicking and selecting Remove Selected Samples.
Information pane: The bottom section of the Load Data view contains three information panes. Each of them provides information related to how the loaded data was analyzed by the search engine.
• The Analysis Information pane
• The Fixed Modifications pane
• The Variable Modifications pane
Figure 5-8. Information panes in the Load Data view
The pieces of information provided in the
158. g on the header for any column. Add or remove modifications to the list already included in the original search data by using the arrows between the two lists. Click Load data to start the analysis. Continue to Modify make up of BioSamples on page 50.
Build A Modification: If the User wants to search with X! Tandem using a variable modification not included in the Add Extra Variable Mods table, he/she can define the new mod by choosing New on the Validation with X! Tandem Wizard page, which brings up the Build Modification dialog.
Figure 3-18. Build A Modification dialog
• Modification Name: Name that will be used in the Proteins View Peptide pane and in the Spectrum Report for this modification. This name is saved with the Scaffold file.
• Change in Mass: The mass difference in amu due to this modification. Even though the modifications are displayed with only one place after the decimal point on the modifications list, the mass is stored with the accuracy that was entered when defined.
• Modified Amino Acid: Pull-down list with possible amino acid choices. Custom defined modifications can only apply to one amino acid. If the defined modification applies to several amino acids, the User has to define several modific
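The three fields of the Build Modification dialog, and the one-amino-acid-per-modification rule, can be pictured with a small sketch. Everything here is hypothetical (class name, helper name, naming scheme for the copies); it only illustrates that the mass keeps the entered precision while the list display shows one decimal place.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Modification:
    name: str          # shown in reports
    mass_delta: float  # change in mass in amu, kept at entered precision
    amino_acid: str    # exactly one residue per custom modification

    def display_mass(self):
        """One decimal place, as on the modifications list."""
        return f"{self.mass_delta:+.1f}"


def define_for_residues(name, mass_delta, residues):
    """One Modification per residue, since a custom modification
    applies to a single amino acid only."""
    return [Modification(f"{name} ({aa})", mass_delta, aa) for aa in residues]


mods = define_for_residues("Oxidation", 15.9949, "MW")
for mod in mods:
    print(mod.name, mod.display_mass(), mod.mass_delta)
```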
159. gure 4-3. Scaffold Main Menu (File, Edit, View, Experiment, Export, Quant, Window, Help)
The Scaffold main menu is set up in a standard Windows menu format, with menu commands grouped into menus (File, Edit, View, Experiment, Export, Quant and Help) across the menu bar. When loading Waters IdentityE data, an extra menu, IdentityE, appears after the Help menu. Some of these menu commands are available in other areas of the application as well.
Menu | Menu Commands
File:
• New: Initializes a Wizard which guides the User through the loading phase of the search data files in Scaffold. See The Loading Wizard.
• Open: Opens a saved Scaffold experiment file (SF3) through a file browser.
• Merge: Merges SF3 files into one single Scaffold experiment. See Merge.
• Close: Closes the current experiment (standard Windows behavior).
• Save: Saves the current experiment (standard Windows behavior).
• Save As: Saves the current experiment, offering the option to use a different name (standard Windows behavior).
• Save Condensed Data: See Save Condensed Data.
• Print: Prints the current view.
• Print Preview: Previews the current view with the option of printing the document.
• Exit: Closes the Scaffold window.
Edit:
• Copy: For each View, copies to the clipboard the first table appearing at
160. hange Format (.gif); FreeHEP RAW Image Format (.raw); FreeHEP UNIX Portable PixMap Format (.ppm); Windows Enhanced Metafile (.emf); Portable Document Format (.pdf); Encapsulated PostScript (.eps, .epi, .epsi, .epsf); PostScript (.ps); Scalable Vector Graphics (.svg, .svgz); Standard GIF Image Writer (.gif); Standard WBMP Image Writer (.wbmp); Standard PNG Image Writer (.png); Standard BMP Image Writer (.bmp). Options: Transparent, Background, Antialias.
• Save JPEG Image: saves an image of the current view and pane in JPEG format.
• Show Fixed Modifications: toggles the highlighting of fixed modifications along the sequence.
• Use Amino Acid Finder: toggles the activation of the tool tip that shows the peptides along the sequence.
• Use Peak Finder: displays the tool tip for the different peaks if checked.
• Use PPM Masses: toggle function.
• Zoom Out: zoom function.
Index
A: Algorithms, references; Appendix; References, algorithms; Assumptions for the manual
C: Configure peptide threshold dialog box; Conventions used in the manual; Copyright
D: Display pane in the Samples View; display options; Req Mods
161. hapter 11 Reports
Subset Database: The command Export > Subset Database exports a FASTA database subset of the original sequence database used for searching the imported data.
Figure 11-1. Subset Database dialog
When selected, the command opens a dialog containing the following export parameters:
• All Displayed Proteins: When chosen, the created subset database contains only proteins appearing in the proteins list. Adjusting the protein and peptide thresholds determines which identified proteins are included in the list and, subsequently, in the subset database. More restrictive parameters result in fewer proteins in the exported database.
• Starred Proteins Only: When selected, the subset database will include only proteins labeled using stars.
The exported subset can facilitate a more thorough search for protein modifications with the original search engine.
Spectra: The menu command Export > Spectra exports spectra loaded in the current Scaffold experiment as peak lists. A list of different formats is available for the User to choose how to save the exported peak list.
Figure 11-2. Export Spectra dialog (Unmatched Spectra Criteria: Min Protein, Min Peptides, Min Peptide, Only H
162. he User needs to specify the particular FASTA database that the initial search engine used to obtain the results. This database must also be stored at a location accessible to the system where Scaffold is installed. It is important to specify the correct database. Without this information, Scaffold cannot display the full sequence of amino acids in a peptide, nor, therefore, the sequence coverage. All search engines, for example SEQUEST, Mascot or X! Tandem, store the name of the database they use with their results. If the User is uncertain of the database used, he/she can use a text editor to search the search engine output files for the database name. It is possible, though, that the correct database resides on a local network under a different name. It is best to use the same database for all search results loaded into a specific Scaffold experiment. This permits Scaffold to accurately align proteins found in different samples.
Chapter 2: Identifying Proteins with Scaffold
Scaffold is a tool designed to help mass spectrometrists and medical researchers confidently identify proteins in biological samples. Using output data from most of the currently available search engines, such as SEQUEST, Mascot, MaxQuant, X! Tandem and many others, Scaffold validates, organizes and interpre
163. her's Exact Test [dialog residue: Use Normalization, Minimum Value, Quantitative Method]
Other features:
• Use Normalization check box; see Normalization among samples in Scaffold.
• Minimum Value pull-down list
• Quantitative Methods pull-down list; see Label Free Quantitative Methods.
Note: When the Use Normalization check box is not selected and Total Spectrum Count is the quantitative method chosen for the analysis, the values shown in the Samples View when the display option Quantitative Value is chosen are the same as the ones reported when Total Spectrum Count is the selected Display Option.
Edit Quantitative Samples: The menu option Quant > Edit Quantitative Method/Purity Correction opens the Edit Quantitative Samples dialog. The dialog allows the User to change the quantitative methods selected for each BioSample when loading data into Scaffold.
Figure 4-21. Edit Quantitative Samples dialog
The dialog includes a table listing all the different BioSamples included in the current Scaffold experiment, the quantitative method selected when loading and, if relevant, the related purity correction. Be
164. iagram Pane: The available right-click menu is Right Click Menu G: Copy WMF/EMF, Copy Publication Sized JPEG, Save JPEG Image, Save As, Print, Copy Data. When clicking on a Venn diagram set, a list of the proteins, unique peptides or spectra appears. Right-clicking on the list brings up Right Click Menu C.
• Gene Ontology Terms Pane: Right-clicking in any of the tabs available in this pane brings up Right Click Menu C.
Publish View: Right-clicking over the Experiment Methods tab brings up Right Click Menu C.
Statistics View: Right-clicking over the MS/MS samples table brings up Right Click Menu C. Right-clicking over the various graphs found in the different panes brings up Right Click Menu F. Over the FDR browser, the available right-click menu is Right Click Menu H: Copy WMF/EMF, Copy Publication Sized JPEG, Save JPEG Image, Save As, Print, Toggle Color, Zoom Out.
Chapter 5: The Load Data View
Scaffold's Load Data View provides an overview of the currently opened experiment, together with tools for loading further MS files or deleting them, or adding or deleting BioSamples. Through this view the User can see and check the list of files loaded in each BioSample, add or delete BioSamples and MS sampl
165. iated number of spectra. The proteins included in the group are shown in the Protein Information Pane.
Protein Clusters: sets of proteins or protein groups created using a hierarchical clustering algorithm similar to, but more stringent than, Mascot's family clustering algorithm. Proteins or protein groups that are members of the cluster share some peptides, but not all of them. They are by default represented by the protein that shows the highest associated probability. Clusters can be collapsed or expanded directly in the protein list.
For each protein or protein group that Scaffold identifies, various Display Options are available, providing different statistical information and counting options; see Display Options on page 135. For the highest-level overview, MS samples are grouped into BioSamples, and results can be viewed collapsed in a single-column summary for the entire group of MS samples; for further information go to MS Sample vs BioSample summarization levels on page 124. To better focus on the most useful results, confidence filters allow setting minimum standards for identification probability or for the other available Display Options. In this way it is possible to screen out less significant findings for a shorter, higher-confidence list. Or, by relaxing the filters, it is possible to find less confident identifications that might be more promising for further investigation; for more information go to Sorting feature on
166. ications, Scaffold offers a number of instructive comparisons:
• Run Scaffold's bundled version of X! Tandem (see Validation with X! Tandem) on data from another search engine. Peptides found to match with both SEQUEST and X! Tandem, or both Mascot and X! Tandem, are more likely to be valid than those peptides for which the two search engines disagree.
• Compare replicates of a biological sample to see if the same proteins are identified in each. The Samples View facilitates this with a direct side-by-side view of all samples.
• Compare replicate proteins from the same spots on different gels to see if the same proteins are identified in each of them. The Proteins View allows the User to sort spots according to the proteins they contain, the gel they came from, or other labels the User chooses.
• Compare the peptide patterns seen in each replicate. The sequence coverage shown in the Proteins View enables the User to determine at a glance whether the same peptides appear repeatedly in various samples.
• For each protein, compare the number of peptides identified.
• For each peptide, compare its scores from various search engines. The Proteins View lists the peptides and associated statistics.
• For peptide identifications of interest, examine the spectrum available in the Proteins View. Long ladders that match ion peaks with the amino acids in the peptide sequence strongly in
167. ides assigned to them are combined into a group; see Figure 8-10.
Figure 8-10. Protein groups formation (table of peptides against Group 1, Group 2 and Group 3)
There is one further complication, however. If the only evidence for a group is a single protein with probability less than 95%, Scaffold disregards this group. This is based on a heuristic rule built into the algorithm, which cuts down on the number of false protein matches displayed. In this case it would eliminate Group 3; see Figure 8-11.
Figure 8-11. Formed protein groups (table of peptides against Group 1 and Group 2, truncated)
168. ied peptides. Based on a Bayesian approach to LFDR (Local False Discovery Rate), this algorithm, introduced with Scaffold version 4, is particularly effective for Q Exactive and high mass accuracy data; see LFDR based scoring system.
• Use Legacy PeptideProphet Scoring (High Mass Accuracy): This option uses the standard PeptideProphet algorithm developed in Scaffold version 3 and older, together with the high mass accuracy option.
• Use Legacy PeptideProphet Scoring (Standard): Standard PeptideProphet with no high mass accuracy.
For references and information about the scoring algorithms used in Scaffold, see Algorithms References. When the data set to be analyzed was not searched using the decoy option or against a decoy-concatenated database, the Legacy PeptideProphet Scoring (High Mass Accuracy) option is automatically selected.
Protein Grouping Pane: This pane shows the available grouping analysis options performed by Scaffold over the list of identified proteins.
• Use protein cluster analysis: Since Scaffold 4, a new hierarchical grouping level has been added above the standard Scaffold protein grouping. While similar to Mascot's hierarchical family clustering, Scaffold 4 clusters are created using added stringencies that often succeed in separating proteins into sets of biologically meaningful isoforms. Each cluster shown in the Samples View can be expanded or collapsed. The clusters sub-menu contains options for expanding
169. [Dialog residue: Add Categories, Starred as, With taxonomy, With Any of these GO Terms, Add GO Terms]
The dialog contains a number of tools the User can use to search for specific proteins, peptides, spectra (useful in peptidomic studies whenever questions arise about a peptide assignment) or peptide motifs (useful to investigate potential modification sites). Searches can be performed over the full list of identified proteins or within different groups of proteins, like categories or starred proteins. The presence/absence search options allow for searches based on the intersection of categories. They display only proteins found in a category, or proteins found in one category and not in another category. The presence/absence search works in a similar manner to Scaffold's Protein Venn Diagram. This feature requires the use of regular expressions. Searches can also be performed on Taxonomy and GO terms after they are added to the Samples list.
Information Panes: The bottom section of the Samples View contains three information panes:
• Protein Information pane
• Gene Ontology pane
• Sample Information pane
Each pane provides further information related to each row in the Samples Table.
Figure 6-18. Samples View Information Panes
170. ifferent outputs together as one MS sample. This chapter covers the following topics: The Loading Wizard, which takes a new User through an example that shows how to load files in Scaffold and describes the steps within the loading Wizard; Modify make up of BioSamples, which shows how to adjust the description and name of the loaded samples and delete some of the files already loaded; Specifying the FASTA database, which shows through an example how to load and parse databases in Scaffold; Validation with X! Tandem, which explains how to run X! Tandem through Scaffold. Scaffold User's Manual, Chapter 3: Loading Data in Scaffold. The Loading Wizard: to familiarize the new User with the Scaffold Wizard, we have developed a short exercise that shows step by step how to load a number of example files into Scaffold using the loading Wizard. Details about the files used in this exercise are provided in the boxes below. Each page in the Wizard points the User to a different task, detailed in the following procedure: Select Quantitative Technique; New BioSample; Queue files for loading; Queue more files for loading; Add another BioSample; Load and Analyze Data. The following procedure is written from the perspective of loading either the SEQUEST data in folder tutorial_3seq and related FASTA file swissprot_bovine.fasta that are
171. [Dialog screenshot: Only High Quality Spectra check box; File type set to Concatenated DTAs.] When selected, the command opens a dialog containing the following options. Scaffold User's Manual, Chapter 11: Reports. All Spectra: this option exports all the spectra loaded in the current Scaffold experiment. Unmatched spectra: this option exports only spectra that do not meet the filter criteria set by the User, to allow further targeted searches on these spectra. The criteria are based on the probabilities assigned by Scaffold through its scoring algorithms. Unmatched Spectra Criteria: Min Protein; Min Peptides; Min Peptide. Only High Quality Spectra: when selected, Scaffold chooses for export only those spectra that identify peptides with probabilities higher than 50% or, if the peptide probability is lower, spectra assigned to proteins that have a probability of at least 95%. Types of peak list files: Concatenated DTAs; Individual DTAs; Mascot MGFs; Micromass PKLs; SEQUEST MS2s. ProtXML report: exports all quantitative data in the protXML format, an open XML file format for the storage of data at the raw spectral data, peptide, and protein levels. This format enables uniform analysis and exchange of MS/MS data generated from a variety of different instruments and assigned to peptides using a variety of different database search programs
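The Only High Quality Spectra rule described above can be written as a small predicate. This is a minimal sketch, not Scaffold's implementation: the function names and the `(spectrum_id, peptide_prob, protein_prob)` tuple layout are hypothetical, and probabilities are assumed to be expressed as fractions between 0 and 1.

```python
def is_high_quality(peptide_prob, protein_prob):
    """Export rule as stated in the manual text: peptide probability above 50%,
    or, failing that, assignment to a protein with probability of at least 95%.

    Both arguments are assumed to be fractions in [0, 1] (an assumption of
    this sketch; the manual quotes percentages).
    """
    return peptide_prob > 0.50 or protein_prob >= 0.95


def select_for_export(spectra):
    """Keep only the spectra that satisfy the rule.

    `spectra` is a hypothetical list of (spectrum_id, peptide_prob, protein_prob).
    """
    return [sid for sid, pep, prot in spectra if is_high_quality(pep, prot)]
```

For example, a spectrum with a peptide probability of 0.30 is still selected when its protein has a probability of 0.95, but not when the protein probability is 0.94.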
172. ilters that can be applied to the data files shown in the Mascot Server file table. This helps the User quickly locate the files he/she wants to load into Scaffold. The available filters are Job number, User name, and Title. Mascot Server file table: when connected, the table lists the search data files saved on the server. The table shows the typical functionalities described in the Display pane section, and it also accepts bulk operations such as the standard Windows multiple selection of files. The filtering pane acts on this table so that the User can easily locate the files he/she wants to load in Scaffold. Action pane: contains two buttons. The Add button is active only when files are selected in the Mascot file table. It starts the download of the chosen data files from the Mascot server to the computer where Scaffold is running and adds them to the loading queue in the Load Data View. The Delete button is active only when files are selected in the Downloaded file table. When clicked, it deletes the highlighted files. Downloaded file table: the table lists the files downloaded from the Mascot Server to the computer where Scaffold is running. The table shows the typical functionalities described in the Display pane section, and it also accepts bulk operations such as the standard Windows multiple selection of files. The status of the download is reported in the Download sta
173. in different category folders, the User can load one category at a time by pointing Scaffold to one specific category folder. All the sub-folders included in the selected folder are assigned to biosamples that will all appear under that particular category. Queue Files From Mascot Server for Loading: the User can open this dialog either from the Scaffold Wizard, on the Queue Files for Loading page, or from the menu item Experiment > Queue Files From Mascot Server for Loading. The dialog contains tools to connect Scaffold to a Mascot Server and to select and download searched data files directly from there into Scaffold. When calling the dialog from the menu item Experiment > Queue Files From Mascot Server for Loading, the User should make sure to have chosen the appropriate Bio/MS Sample before adding data. Figure 5-5: Connecting to the Mascot Server from Scaffold. Scaffold User's Manual, Chapter 5: The Load Data View. When the User opens this dialog for the first time, he/she needs to connect to the local Mascot server. This is done by adding the local Mascot Server web address in the Mascot Server text box located in the top left corner of the dialog. If no security is implemented, Scaffold connects directly to the Mascot Server, showing a list of files available
174. ins View, Scaffold provides tools to manually inspect the identification of peptides. A validation check box records the status of a peptide: when it is selected, the peptide is considered valid. The User, upon visually inspecting the related spectrum, can invalidate a peptide by manually deselecting its check box. The menu option Experiment > Reset Peptide Validation provides a global tool that automatically validates all peptides above a specified probability, by selecting the related check box, and unchecks those below it. When a different probability is selected, the command resets all previous user validations. The default probability is 0. This function can be used in two possible scenarios of the peptide validation process. First, to globally create a set of validated peptides based on probability assignments: peptides are considered valid only if their probability is greater than the minimum set in the pull-down menu Minimum Peptide Probability, whose default is 0. Select a different value and click Apply. All peptides with probability less than the Minimum Peptide Probability will be shown unchecked in the Proteins View and will not be considered for analysis. Scaffold User's Manual, Chapter 4: The Scaffold Window. Second, to reset the manually validated peptides to their initial status, the initial status being identified by the minimum peptide probability recorded in the pull-down menu. Apply GO Annotations Apply NCBI Configure G
175. istributed by Thermo Scientific. Depending on the platform used to run it, Sequest creates files of the following types: DTA and OUT, MS2 and SQT, and SRF. The new platform developed by Thermo Scientific, called Proteome Discoverer, creates output files with extension MSF. TIC (Total Ion Current): the total ion current (TIC) is the sum of the areas under all the peaks contained in an MS/MS spectrum. Scaffold assumes that the area under a peak is proportional to the height of the peak and approximates the TIC value by summing the intensities of the peaks contained in the peak list associated with an MS/MS sample. Total Unique Peptide Count: number of different amino acid sequences that are associated with a protein, including those shared with other proteins. Total Unique Spectrum Count: number of unique spectra associated with a protein, including those shared with other proteins. Appendix C: Terminology comparison between Scaffold 4 and Scaffold 3. Starting from Scaffold version 4, Proteome Software added new terms to capture different types of evidence used in protein clustering. These new terms affect the display options in the Samples View. The tables below indicate the correspondence with Scaffold 3 display option terms. The User might want to print these tables. Table 1: New Terminology. Unique peptides modifications Peptides with dif
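The TIC approximation described above (peak height standing in for peak area) amounts to a sum over a peak list. The following is a minimal sketch under that stated assumption; the function name and the `(m/z, intensity)` pair layout are illustrative choices, not part of Scaffold.

```python
def approximate_tic(peak_list):
    """Approximate the total ion current of a spectrum.

    `peak_list` is a hypothetical list of (m/z, intensity) pairs. Following the
    manual's description, peak area is assumed to be proportional to peak
    height, so the TIC is approximated by summing the intensities.
    """
    return sum(intensity for _mz, intensity in peak_list)
```

An empty peak list yields 0 under this definition.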
176. ities are also offered, including the T-Test, ANOVA, and Coefficient of Variance, as appropriate to the experimental design. Because of its dependence on search results, Scaffold's approach to Precursor Intensity Quantitation is to work backward from the peptides that have been identified through their MS/MS spectra and compare the intensities of the MS peaks from which they were derived. By contrast, some programs align the MS peaks of all samples and calculate their intensities. The peaks that appear to be biologically important are subsequently identified. This method has the advantage of providing quantitative information for low abundance proteins, but it relies on complicated peak alignment algorithms and may in some cases incorrectly identify the corresponding peaks. Working from identifications is simpler, since it obviates the need for retention time warping and peak alignment, and it has the advantage of depending on more reliable data. On the other hand, in this approach missing values become an issue. Often a peptide is identified in one sample and not in another, producing a missing value even if there may have been a detectable MS peak in the corresponding position in the second sample. Generally, however, higher abundance peptides are more likely to be identified, and MS peaks that do not result in identified spectra are relatively weak signals, which minimizes the effect of treating them as missing values. Scaffold further reduces the effect by
177. ld, he/she can give a copy of the Viewer to all his/her collaborators so that they can view the User's data. The Viewer performs most of the functions included in a full Scaffold copy. However, it cannot load search results files, analyze data, or run X! Tandem. With the Viewer, the User or his/her collaborators can look at the data in the same ways as with Scaffold: by samples, proteins, peptides, or spectra. The User can filter the results by protein probability, peptide probability, and the number of matching peptides, or by FDR values. The User can change the names of the BioSamples, the MS Samples, and the proteins. The Viewer User can also validate the peptide-spectrum matches. Scaffold User's Manual, Chapter 1: Getting Started with Scaffold. ScaffoldBatch: ScaffoldBatch is a batch version of Scaffold. It can load and analyze the same data that Scaffold does, but in a batch rather than in an interactive environment. Batch mode means that ScaffoldBatch can be run on the command line or called from a batch script. It is intended for organizations that want to integrate Scaffold into a proteomics pipeline, where ScaffoldBatch can serve as one component of an automated workflow. ScaffoldBatch is an extended version of Scaffold. When the User installs ScaffoldBatch, a copy of the interactive version of Scaffold is automatically installed as well. Like Scaffold, ScaffoldBatch is locked to one computer by
178. les. Once the files are loaded into Scaffold, the discriminant score histogram displayed in the Scaffold Statistical view looks quite skewed, and the related calculated protein probabilities become unreliable. To address this problem, the Advanced Preferences in Scaffold provide tools to deal with this issue. Furthermore, an auto-detect feature that comes into play when MSF files are loaded into Scaffold selects which of the options in the Advanced Preferences best suits the data being loaded. We also recommend that the User adjust the Advanced Options available in the latest version of PD to allow a less stringent selection of the spectra saved in the MSF files, as described in: PD Sequest suggested settings; PD Mascot suggested settings; PD Sequest HT suggested settings. PD Sequest suggested settings: in Proteome Discoverer 1.4 and older, the workflow settings for Sequest include parameters that determine the amount and type of spectra saved in the MSF output files. Those parameters are located in the 1.1 Peptide Scoring Options section, visible only when Show Advanced Parameters is selected. We advise the User to set the following parameters to their minimum value: Absolute XCorr Threshold: 0; Fragment Ion Cutoff Percent: 0; Peptide Without Protein XCorr Threshold: 0; so that Scaffold is able to find the information needed for prop
179. ley K. S., and Bessant C. i-Tracker: For quantitative proteomics using iTRAQ. BMC Genomics 2005, 6:145. DOI: 10.1186/1471-2164-6-145. [Pavelka 2008] Pavelka N., Fournier M. L., Swanson S. K., Pelizzola M., Ricciardi-Castagnoli P., Florens L., and Washburn M. P. Statistical Similarities between Transcriptomics and Quantitative Shotgun Proteomics Data. Mol Cell Proteomics, April 2008, 7:631-644. DOI: 10.1074/mcp.M700240-MCP200. Reference for calculating emPAI: [Ishihama 2005] Ishihama Y., Oda Y., Tabata T., Sato T., Nagasu T., Rappsilber J., Mann M. Exponentially Modified Protein Abundance Index (emPAI) for Estimation of Absolute Protein Amount in Proteomics by the Number of Sequenced Peptides per Protein. Molecular & Cellular Proteomics 2005, 4:1265-1272. DOI: 10.1074/mcp.M500061-MCP200. Reference for calculating SAF: [Zhang 2010] Zhang Y., Wen Z., Washburn M. P., and Florens L. Refinements to Label Free Proteome Quantitation: How to Deal with Peptides Shared by Multiple Proteins. Anal Chem 2010, 82(6):2272-81. DOI: 10.1021/ac9023999. References for Precursor intensity quantitation: [Bantscheff 2007] Bantscheff M., Schirle M., Sweetman G., Rick J., and Kuster B. Quantitative mass spectrometry in proteomics: a critical view. Anal Bioanal Chem 2007, 389:1017-1031. DOI: 10.1007/s00216-007-1486-6. [Lai 2013] Xianyin Lai, Lianshui Wang, and Frank A. Witzmann. Issues and Applications in Label Free Quantitative
180. lidation with X! Tandem page includes two panes. X! Tandem Options pane: since the X! Tandem search can take a long time for large databases, the option Search subset database was added to minimize its execution time. Checking this box means that X! Tandem searches only the subset of the proteins that were previously found with the original search engine. For example, suppose the original SEQUEST search against a million-protein NR database found 100 proteins. The subset X! Tandem search will now be against only 100 proteins instead of a million. In this case, that particular step of X! Tandem will go thousands of times faster. But the X! Tandem refinement steps, which search for modifications, will not go any faster. If there are a huge number of spectra, this refinement step will still take considerable time. In short, the X! Tandem search will be speedier, but how much speedier depends upon the number of spectra and the size of the FASTA database. Variable Modifications pane: from the input files, Scaffold reads the parameters used to search a database by the search engine that produced the files. The parameters include instrument mass error tolerances, digestion enzymes, and fixed and variable modifications. Scaffold then passes these parameters to X! Tandem when it is run. Of all the parameters passed on to the X! Tandem search, the User can only modify the li
181. ll send the User an upgraded license key that will unlock the Scaffold Q+ or Scaffold Q+S features. To input the new upgraded key, the User should follow these instructions: 1. Open the current copy of Scaffold installed on the computer by double-clicking the Scaffold icon found on the desktop or selecting the Scaffold application from the start-up menu. 2. Make sure you are connected to the Internet. 3. If no upgrades are available, click Continue on the first Welcome to Scaffold dialog. 4. If a warning appears suggesting to upgrade Scaffold, do so and open Scaffold after the upgrade. 5. When the second Welcome to Scaffold dialog appears, click Cancel. 6. Go to the Help menu and select the option Upgrade License Key. 7. When the Overwrite dialog appears, click Yes. 8. The Fully licensed Scaffold dialog opens, showing the information related to the current license. Figure 1-5: Fully Licensed Scaffold dialog. 9. Click Enter New Key, and the Please Enter License Key dialog opens; see Figure 1-2. 10. Copy a
182. low the table there are two pull-down lists and a button. Change Type to: pull-down menu that lists the quantitation methods available in Scaffold. and Correction to: this pull-down menu is available only when iTRAQ or TMT is selected as the quantitative method. It lists the purity correction tables available. Edit Purity Correction: this button is available only when iTRAQ or TMT is selected as the quantitative method. When selected, it opens the Edit iTRAQ/TMT Purity Corrections dialog. Changing the Quantitative type for a specific BioSample: 1. From the Change type to pull-down list, select a different quantitative method and then click OK. 2. When the quantitative type selected is either iTRAQ or TMT, the and Correction to pull-down list and the Edit Purity Correction button become available. Select a correction from the list, if available, or select Other to open the Edit iTRAQ/TMT Purity Corrections dialog, from where the User can create new purity correction tables or edit existing ones. This dialog can also be reached by clicking the Edit Purity Correction button. Edit iTRAQ/TMT Purity Corrections: every batch of iTRAQ or TMT reagents contains trace levels of isotopic impurities that need to be corrected. The correction factors, or purity values, are usually reported in the certificate of analysis that comes with the iTRAQ or TMT re
183. [Screenshot residue: Sample Information pane listing GO terms and MS/MS sample notes.] Protein Information pane: the Protein Information pane is displayed in the lower left section of the Samples View. It includes a look-up accession number pull-down list of online protein databases such as SwissProt or NCBI. For each selected row in the Samples Table, the pane shows the set of proteins included in the corresponding protein group as clickable buttons. Clicking one of the buttons opens an Internet browser at the address selected from the pull-down list and searches for the selected protein accession number. If the accession number is found, additional information for the selected proteins is then easily available to the user. Figure 6-19: Protein Information pane. Gene Ontology pane: the Gene Ontology pane, when displayed, is located in the lower center section of the Samples View. The pane is displayed only when the GO terms have been searched. GO terms are added to the Samples Table either when searched during the loading phase or after the d
184. mples Table as a single line with the accession number of one of them, followed by a plus sign and the number of other proteins in the group. The preferred, or named, protein is arbitrarily selected and may be changed by the user. Figure 8-3: Samples Table, protein grouping. Weighting Function: for the purpose of calculating the protein probabilities, shared peptides are apportioned among proteins according to a weighting function. The weights are assigned using the following formula:

$$W(p, A) = \frac{PE(A)}{\sum_{B \ni p} PE(B)}$$

where $W(p, A)$ is the weight assigned to shared peptide $p$, contained in protein $A$ and in other proteins, and the sum runs over all proteins $B$ that contain $p$. $PE(A)$, the exclusive peptide evidence, is defined as the sum of the probabilities $P_x$ of each exclusive unique valid peptide $x$ belonging to protein $A$:

$$PE(A) = \sum_{x \in A} P_x$$

This value is then normalized by the sum of the exclusive peptide evidence for each of the proteins that contain peptide $p$. A peptide can be marked valid or invalid either manually, by checking or un-checking peptides in the Proteins View Peptide Table, or globally, by using the Experiment menu option Reset Peptide Validation. The Scaffold default cut
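The weighting function described above can be illustrated with a short sketch. It assumes the reading of the formula in which the weight of a shared peptide for protein A is A's exclusive peptide evidence divided by the summed exclusive evidence of all proteins containing that peptide. The function names, the dictionary layout, and the handling of a zero denominator are assumptions of this sketch, not documented Scaffold behavior.

```python
def exclusive_peptide_evidence(exclusive_valid_probs):
    """PE(A): sum of the probabilities of the exclusive, unique, valid
    peptides of one protein (passed in as a list of probabilities)."""
    return sum(exclusive_valid_probs)


def shared_peptide_weights(evidence_by_protein):
    """W(p, A) for every protein A that contains the shared peptide p.

    `evidence_by_protein` is a hypothetical mapping {protein: PE(protein)}
    restricted to the proteins that contain p.
    """
    total = sum(evidence_by_protein.values())
    if total == 0:
        # Assumption of this sketch: with no exclusive evidence anywhere,
        # every protein receives a weight of zero.
        return {protein: 0.0 for protein in evidence_by_protein}
    return {protein: pe / total for protein, pe in evidence_by_protein.items()}
```

With exclusive evidence of 1.5 for one protein and 0.5 for another, a peptide shared by the two would be apportioned with weights 0.75 and 0.25.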
185. ms are very useful for attaching biological significance to the results. The detailed GO terms describing each protein are summarized in the pane in broader categories called ontologies. Each of these ontologies has its own pie and bar charts. The User can select which ontology to display using a drop-down list above the chart. The three ontology categories available are: Biological process; Cellular component; Molecular function. Pie Charts: each slice of the pie chart corresponds to one column of the GO term annotations in the Samples view. The GO term represented by a slice is shown in a box linked to the slice. If Show Values is checked, the number of proteins annotated with that GO term is also shown. Since a single protein may be annotated with more than one GO term, these numbers may sum to a value greater than the number of proteins. Double-clicking a section of the pie chart filters the proteins in the experiment to show only the proteins annotated with that GO term and brings up the Samples view. A filtered set can be further filtered by returning to the pie chart and double-clicking again. More sophisticated GO filtering can be done through the Advanced Filters dialog. Bar Charts: the bar charts are organized by category. Each bar displays the number of proteins annotated with a certain GO term in a certain category. This allows you to compare if proteins asso
186. n lines diverge from the 45-degree line because the standard deviation depends upon the number of spectra. Mean Deviation Scatterplot tab: the Mean Deviation Scatterplot provides a method of estimating the coefficient of variation (CV), also called coefficient of variance, of the estimates of protein abundance. The tab includes a graph that plots the mean and standard deviation for each protein appearing in the protein list for the whole data set. X axis: average, or mean, value of the estimated protein abundance across all samples. Y axis: standard deviation of the estimated protein abundance computed across all samples. A regression line is calculated to provide a model that defines the two standard deviation lines shown in the plot included in the Q-Q Scatterplot tab. This theoretical estimation is represented as a dashed line in the plot and shows that, in general, the larger the estimated protein abundance, the larger the absolute uncertainty in the estimate. Another way of using this graph is to evaluate whether the percent uncertainty, the CV, is roughly constant; see reference Pavelka 2008. This method of estimating the CV uses all the available data. In most instances, using all the data gives the best estimate. However, as whenever a line is fit to data, outliers can cause inaccuracies. The outliers that introduce the most inaccuracies are proteins with a high estimated abundance in the samples of one category and low
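The quantities plotted in the Mean Deviation Scatterplot can be computed per protein as sketched below. This is an illustration only: the function name and input layout are hypothetical, and the use of the sample (n-1) standard deviation is an assumption, since the text does not say which estimator Scaffold uses.

```python
from statistics import mean, stdev


def mean_deviation_points(abundance_by_protein):
    """Return {protein: (mean, standard deviation, CV)} across all samples.

    `abundance_by_protein` is a hypothetical mapping from a protein name to
    its estimated abundance in each sample (at least two samples per protein).
    The mean is the X coordinate and the standard deviation the Y coordinate
    of the scatterplot; CV is the standard deviation divided by the mean.
    """
    points = {}
    for protein, values in abundance_by_protein.items():
        m = mean(values)
        sd = stdev(values)  # sample standard deviation (assumption)
        cv = sd / m if m != 0 else float("nan")
        points[protein] = (m, sd, cv)
    return points
```

If the CV column is roughly constant across proteins, the points fall close to a straight line through the origin, which is the check the text describes.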
187. nd paste the license key. After verification of the key, the Register Key button appears; click it. 11. If the key is valid, the message Key was registered successfully appears. Click OK; Scaffold is ready to go. 12. If the key is not valid for whatever reason, contact sales@proteomesoftware.com. Renewing time-based license key: time-based license keys have time limits connected to their validity. When a time-based key expires, Scaffold still works, but upgrades are not allowed until the support contract is renewed. The status of the Scaffold license key can be checked in the About Scaffold dialog, which the User opens by selecting the Help > About Scaffold command from the main menu. If the key is expired and the User wants to upgrade Scaffold, clicking the Renew button in the dialog opens the Key Reset Request page on the Proteome Software website. The User needs to fill in the request, and a sales representative will promptly contact him/her with further information. Scaffold Viewer: when a licensed copy of Scaffold is installed on a computer, only one full copy at a time can run on the system. On the other hand, the User can open multiple copies of the Viewer at the same time. Scaffold Viewer can open and read any SF3 file created by Scaffold. The Viewer is free, and Users may install it on as many computers as they wish. If a User analyzes data with Scaffo
188. nd using the selected minimum number of peptides as a lower bound. An FDR landscape, a matrix with all possible combinations of protein and peptide thresholds, is created, and the exact point that maximizes the number of proteins while meeting the desired FDR limits is found. When different threshold combinations would result in the same number of target proteins identified, the points at which the protein probability is highest are considered, and of these the point with the highest possible peptide probability is selected. The actual filtering is then done using the resulting probability threshold settings. The Minimum Peptide Probability and Minimum Protein Probability thresholds selected by the program are shown in the FDR Dashboard, in the lower left corner of the Scaffold Window. The actual peptide and protein FDR levels are calculated and displayed in the FDR Dashboard as well. How FDR values are calculated in Scaffold: Peptide FDR is calculated as the sum of the Exclusive Spectrum Counts of decoy proteins divided by the sum of the Exclusive Spectrum Counts of target proteins, converted to a percentage. Protein FDR is the number of decoy proteins divided by the number of target proteins, expressed as a percentage. The Display pane: through the Display pane, the User can specify the value, for example the Number of Assigned Spectra, that is displayed for each protein in each BioSample or
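The two FDR definitions and the tie-breaking rule described above can be sketched as follows. This is an illustration under stated assumptions rather than Scaffold's code: the function names and the dictionary keys describing a point of the FDR landscape are hypothetical, and raising an error when there are no target hits is a choice made for this sketch.

```python
def peptide_fdr(decoy_exclusive_counts, target_exclusive_counts):
    """Sum of Exclusive Spectrum Counts of decoy proteins divided by the sum
    for target proteins, as a percentage."""
    target_total = sum(target_exclusive_counts)
    if target_total == 0:
        raise ValueError("no target spectra")  # assumption of this sketch
    return 100.0 * sum(decoy_exclusive_counts) / target_total


def protein_fdr(n_decoy_proteins, n_target_proteins):
    """Number of decoy proteins divided by number of target proteins, as a percentage."""
    if n_target_proteins == 0:
        raise ValueError("no target proteins")  # assumption of this sketch
    return 100.0 * n_decoy_proteins / n_target_proteins


def pick_thresholds(landscape, max_protein_fdr, max_peptide_fdr):
    """Choose one point of a (hypothetical) FDR landscape.

    Each point is a dict with keys 'protein_prob', 'peptide_prob',
    'n_target_proteins', 'protein_fdr', and 'peptide_fdr'. Among the points
    within both FDR limits, prefer the most target proteins, then the highest
    protein probability, then the highest peptide probability.
    """
    allowed = [pt for pt in landscape
               if pt["protein_fdr"] <= max_protein_fdr
               and pt["peptide_fdr"] <= max_peptide_fdr]
    if not allowed:
        return None
    return max(allowed, key=lambda pt: (pt["n_target_proteins"],
                                        pt["protein_prob"],
                                        pt["peptide_prob"]))
```

Encoding the tie-break as a tuple key keeps the three preferences in the order the text gives them.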
189. ng a row in the GO Tree List. 3. Click Add; the selected term or group of terms is added to the Display List. Terms may be selected individually or by domain or group. If a group or domain is selected, all terms in that group will be added to the Display List. 4. To remove terms from the Display List, select a term or group of terms to be discarded, then click Remove. 5. To save the current selections as User Defaults, check the box Save displayed GO terms as user default. When a Scaffold experiment is saved, the displayed GO terms are saved within the SF3 file. When a new file is created or when Scaffold is closed, the list of displayed GO terms is unchanged. To reset the list to the defaults, the user may click the Reset to User Default or the Reset to Scaffold Default button. GO Annotations Tab: this tab contains a table that lists all the GO annotation databases already imported in Scaffold and the option NCBI Annotations. The User can populate the table with existing or custom-created GO term databases through the Import annotations function and then select which one he/she wants to use to annotate the protein list appearing in the Samples Table. When NCBI Annotations is selected, Scaffold queries the NCBI website through the Internet. This option is the only one available when Scaffold is initially installed and before the User imports GO databases on his/her own. Nevertheless, it needs to be selected before being
190. [Wizard screenshot: GO annotation options, including Fetch GO annotations remotely, and the Previous, Load Data, Done, and Cancel buttons.] Continue to Specifying analysis options and analyzing the data. Specifying analysis options and analyzing the data. Searched Database Pane: this pane allows the User to select or import the database used to create the data files loaded in Scaffold. Databases previously loaded appear in a pull-down list. To add new databases, the User can click the Add New Database button. For more detailed information, continue to Specifying the FASTA database. Analyze with X! Tandem Pane: selecting this option runs an additional database search, an X! Tandem search, on the loaded data with variable modifications chosen by the User. This operation improves protein identifications but significantly increases analysis times. For more information, continue to Validation with X! Tandem. If you are carrying out this procedure using the sample tutorial_3seq data or the sample tutorial_3mas data provided by Proteome Software, then select run with X! Tandem. Scoring System Pane: this pane offers different post-processing scoring algorithms to apply when Scaffold analyzes the imported data. Use LFDR Scoring: algorithm for assessing the confidence level of the identif
191. not only which proteins are shown but also the reported values shown for number of Exclusive Unique Peptides, number of Total Unique Spectra, number of Exclusive Unique Spectra, and Percent of Total Spectra. Among the entries for this filter shown in the drop-down list, the selection Custom allows defining peptide filters based on the underlying search engines' scores. See Custom Peptide Filters for more information regarding this option. Figure 6-12: Peptide Thresholds (the drop-down list includes Custom and FDR-based entries). Custom Peptide Filters: the option Custom peptide filters provides a way to create peptide filters based on the underlying search engines' scores. When the User chooses custom filters, Scaffold ignores the protein probability and filters the proteins exclusively on the number of peptides that pass the selected custom peptide filter. Custom filters can be created by selecting Custom from the Peptide Threshold drop-down list (see Figure 6-12) or by going to the menu option Edit > Edit Peptide Thresholds to open the Edit Peptide Threshold dialog, which shows a list of existing custom filters. The dialog allows the User either to edit an existing threshold, create a new set of parameters, or d
192. ns or protein groups are displayed in the Samples Table. Scaffold builds the Samples Table by applying thresholds and filters to the formed clusters, proteins, and protein groups in the following order: 1. Select all clusters that pass thresholds. 2. Include all proteins and protein groups belonging to the selected clusters. 3. Prune proteins or protein groups based on selected filters. 4. Remove clusters that do not include proteins. 5. Prune proteins and protein groups based on thresholds. Applying thresholds and filters in this order keeps clusters in the Samples Table that might not include proteins or protein groups that pass thresholds and filters. Filters apply only to proteins or protein groups. Clusters are shown in the Samples View as a line with protein name Cluster of followed by the name of one of the constituent proteins. This protein is designated as the primary protein of the cluster, but the primary protein may be changed by clicking the accession number field of a cluster and selecting a different accession number from the drop-down list when it appears. A cluster may be expanded in the Samples View by clicking the expand icon at the left of the cluster's row. When a cluster expands, it displays all of its constituent proteins or protein groups, including the primary protein. The right-click menu provides bulk operations to expand or collapse all clusters simultaneously. The menu option View > Show Entire Prot
193. nsing off keep all unmatched spectra for future export. [Figure: Load Data view showing the Files in Loading Queue and Files Currently Loaded tables and the Analysis Information, Fixed Modifications, and Variable Modifications panes.] Samples View: Scaffold's Samples View provides overviews that help the User make direct comparisons among categories of samples, BioSamples, and MS samples. It lists and summarizes the proteins identified in each MS sample. The list of proteins is summarized in two levels of hierarchy. Protein groups: groups of proteins that are associated with an identical set of peptides. They are shown collapsed and are by default represented by the protein that has the highest probability and the largest assoc
194. nt or a tissue sample from a model organism or cell line. Using such techniques as 2D gels or liquid chromatography, proteins or peptides from these biosamples are then separated from each other. Each resulting individual band, spot, or LC fraction that is then processed by a mass spectrometer is one mass spectrometry sample, abbreviated in Scaffold as MS sample. One BioSample is therefore typically made up of more than one MS sample, sometimes many more. Scaffold can also process data from MuDPIT experiments, in which case the analysis combines peptides from all fractions into one MS sample for protein identification. Data Loading: Scaffold imports data generated by a large variety of search engines, such as Mascot, SEQUEST, Spectrum Mill, OMSSA, Phenyx, X! Tandem, and MaxQuant. It also supports search engines that can export their results in the mzIdentML format. All types of search data can be freely included in one experiment. Each SEQUEST folder is imported as one file, as is each Mascot or X! Tandem file. Importing files requires access rights from the computer where Scaffold is installed to the location where the search result data files reside. The data loading phase is also where BioSamples are defined: as part of the import process, the User can specify all the files he/she wishes to include in a BioSample, which can then be named and categorized. The User can load data files in Scaffold either by activating the Loading Wizard
195. nting Scaffold includes the following Spectrum Counting methods. Total Spectra (default): this method uses the sum of all the spectra associated with a specific protein within a sample, including spectra that are shared with other proteins; it is referred to as the Total Spectrum Count. Weighted Spectra: this method uses the sum of all weighted spectra associated with a specific protein within a sample, where the weight is a measure of how much a spectrum is shared with other proteins. For more details on how the weight is calculated, see Weighting Function. emPAI: Spectrum Counting methods can also be used to determine the absolute abundance of proteins. Initially, the parameter used to measure this absolute abundance was the Protein Abundance Index (PAI), defined as the number of observed peptides divided by the number of all possible tryptic peptides from a particular protein that are within the mass range of the employed mass spectrometer: $\mathrm{PAI} = N_{\mathrm{observed}} / N_{\mathrm{observable}}$, where $N_{\mathrm{observed}}$ is the number of experimentally observed peptides and $N_{\mathrm{observable}}$ is the calculated number of observable peptides for each protein. In a subsequent refinement, PAI was transformed into an exponential form called emPAI, defined as follows (see Ishihama 2005): $\mathrm{emPAI} = 10^{\mathrm{PAI}} - 1$. NSAF: The NSAF quantitative method is useful when comparing th
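As a worked illustration of the PAI and emPAI definitions (PAI is the ratio of observed to observable peptides, and emPAI is 10 raised to the PAI, minus 1), the following minimal sketch computes both values. The function names and the example counts are illustrative assumptions; how Scaffold derives the observable-peptide count is not shown here.

```python
def pai(n_observed: int, n_observable: int) -> float:
    """Protein Abundance Index: observed peptides / observable peptides."""
    if n_observable <= 0:
        raise ValueError("n_observable must be positive")
    return n_observed / n_observable


def empai(n_observed: int, n_observable: int) -> float:
    """Exponentially modified PAI: 10 ** PAI - 1."""
    return 10 ** pai(n_observed, n_observable) - 1


# Hypothetical protein: 5 observed peptides out of 10 observable ones.
example_pai = pai(5, 10)
example_empai = empai(5, 10)
```

A protein with no observed peptides gets an emPAI of 0, and the value grows exponentially as the observed fraction approaches 1.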
196. nts the number of spectra identified for a given peptide. Scaffold provides three options for measuring a protein's precursor intensity in the Quantitative Methods drop-down menu of the Quantitative Analysis dialog. Average Precursor Intensity: the geometric mean of the peptide intensity values for a given protein. Total Precursor Intensity: the sum of all distinct intensity values for a protein. Top Three Precursor Intensities: the sum of the three highest peptide intensity values for a protein; if fewer than three peptides have intensity values, the intensities that are present are summed. For each of these methods, the User needs to adjust the Minimum Value accordingly by selecting Other in the pull-down list. Normalization among samples in Scaffold: To allow comparisons, Scaffold normalizes the MS/MS data. The User can then compare abundances of a protein between samples. The normalization scheme works for the common experimental situation where individual proteins may be up-regulated or down-regulated but the total amount of all proteins in
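The three protein-level precursor intensity measures (geometric mean, sum, and sum of the three highest values) can be sketched as below. This is a minimal illustration assuming the input is a list of already-computed, strictly positive peptide intensity values for one protein; the function names are hypothetical and this is not Scaffold's implementation.

```python
import statistics


def average_precursor_intensity(peptide_intensities):
    """Geometric mean of the peptide intensity values (values must be > 0)."""
    return statistics.geometric_mean(peptide_intensities)


def total_precursor_intensity(peptide_intensities):
    """Sum of the peptide intensity values for the protein."""
    return sum(peptide_intensities)


def top_three_precursor_intensity(peptide_intensities):
    """Sum of the three highest values; sums what is present if fewer than three."""
    return sum(sorted(peptide_intensities, reverse=True)[:3])


# Illustrative intensities for one hypothetical protein.
intensities = [1.0, 5.0, 3.0, 2.0]
average = average_precursor_intensity([2.0, 8.0])
total = total_precursor_intensity(intensities)
top_three = top_three_precursor_intensity(intensities)
```

The geometric mean damps the influence of a single very intense peptide, while the top-three sum ignores low-intensity peptides altogether.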
197. nue [Figure: What's Changed prompt, which reads "Welcome to Scaffold Q+. This is the first time you've launched Scaffold Q+ version 4. Would you like to see a list of what's changed?" and offers a "Don't show this message again" option.] Figure 1-4: Scaffold main Welcome window, indicating access to Scaffold, Scaffold Q+, or Scaffold Q+S. Each variant of the window asks "Would you like to create a new experiment or open an existing file?" and offers the New, Open, Run Demo, and Cancel buttons. From this window, the User can create a new experiment, open an existing experiment (SF3 file), or work with the demonstration data that is provided with all Scaffold installations. Upgrading Scaffold to Scaffold Q+ or Scaffold Q+S: When the User is running a core copy of Scaffold and would like to upgrade to Scaffold Q+ or Scaffold Q+S, he can do so by contacting our sales department at sales@proteomesoftware.com. When the purchase is finalized, the sales department wi
198. of the Preferences dialog. This means that the User can control which pages his/her collaborators can view. The User can use this in conjunction with the password protection on the filters to control which proteins can be viewed. Pull-down list: The bottom portion of the window stores the preferred Display Options for the Samples View when a new experiment is created. A pull-down menu allows the User to select what information Scaffold initially shows when the loading phase is completed and the Samples View first opens. Reset button: The Reset Don't Show Messages button restores the messages that the User previously chose not to show again. Password: Through this tab, the User can use a password to protect certain views and operations available in a Scaffold experiment once it is saved in an SF3 file. For example, the User can set filter thresholds to display only data with above 90% confidence, or restrict access to only the Samples and Publish pages, in this way hiding the messy details on the Proteins and Statistics pages. The User can also prevent anyone from reanalyzing his or her data by locking the export of the spectra. A password gives the User control of what the people viewing the data can see and do. Use Password: turns the password protection on and off. Protect Exporting Spectra: password required to export spectra. Protect Resetting Thresholds: password required to change Min Peptide
199. on [Index residue from the Scaffold User's Manual, page references dropped. Recoverable entries: Scaffold export; ScaffoldPTM export; Scoring algorithm (LFDR-based scoring, PeptideProphet scoring); Scoring system (LFDR-based, PeptideProphet, ProteinProphet); show hidden proteins; Skyline; sorting feature (Samples View); Special information about the manuals; Terminology (BioSample); Terminology comparison between Scaffold 3 and Scaffold 4; Upgrading Scaffold to Scaffold Q+ or Scaffold Q+S; Using the manual.]
200. ore like the initial repetitive dialogs that appear when the Wizard is opened. The tab includes a check box list of all the views available in Scaffold, a pull-down list of the Display Options to select the appropriate default value, and a reset button. Figure 4-12: Display Settings tab (check boxes: Show Samples View, Show Proteins View, Show Similarity View, Show Quantitation View, Show Publish View, Show Statistics View; pull-down list: Default view option for new files, set to Protein Identification Probability; button: Reset Don't Show Messages). Check box list: If one of the views in the list is not checked, it is not displayed and the corresponding button in the Navigation pane is also hidden. These settings are saved with the experiment when the SF3 file is created. For example, if the User turns off the Statistics and Proteins views and saves the file, then when the file is reopened only the Load Data, Samples, and Publish views are visible. This feature can be useful when sending the results to someone who doesn't need to see certain details or to check the validity of the statistical analysis. Access to Display Settings may be controlled by a password by checking the appropriate box on the Password tab
201. ore information about Scaffold clusters, see Chapter 8, Protein Grouping and Clustering. [Figure: Scaffold Q+ Samples View of tutorial_2 showing 18 proteins in 12 clusters, with rows such as "Cluster of Serum albumin precursor" (ALBU_BOVIN), "Ovotransferrin" (OTRF_CONTR), "Cluster of Ig gamma-2 chain C region" (IGG_CONTR), and "Cluster of beta-lactoglobulin" (LACB_CONTR).]
202. ore running it on a computer that has ScaffoldBatch installed, the User needs to unzip or untar the package. Exports compatible with Excel: Scaffold provides a number of tab-delimited reports containing different types of information related to the analysis performed in the current Scaffold experiment: Publication report, Samples report, Spectrum report, Peptide report, Protein report, Current View report, and Complete report. How to open Scaffold reports in Excel: The exported reports can be viewed in Microsoft Excel for further analysis of the data they contain. When importing any of these reports into the spreadsheet, it may be necessary to specify that the report is tab-delimited. Excel may also show its Text Import Wizard the first time the User opens an exported Scaffold report. Selecting delimited file and then tab delimiters completes the conversion to the Excel format. Saving as an XLS file avoids repeating the conversion in the future. To create an export that includes the GO annotations, see Samples report. Publication report: The Publication report lists the data analysis information required for publication in a number of the proteomics journals. This report is a copy of the information reported on the Publish view. The top of the report lists in a structured manner: how the peak lists were gen
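Because the exported reports are tab-delimited text, they can also be read outside Excel. The following is a minimal sketch using Python's standard `csv` module; the column names and the sample content are hypothetical, and a real export may carry extra header lines before the table that would need to be skipped first.

```python
import csv
import io


def read_report(handle):
    """Parse a tab-delimited report into a list of row dictionaries."""
    reader = csv.DictReader(handle, delimiter="\t")
    return list(reader)


# Hypothetical report content; real column names may differ.
sample = (
    "Accession\tProtein name\tTotal spectra\n"
    "CRBB3_BOVIN\tBeta-crystallin B3\t12\n"
    "CRBA4_BOVIN\tBeta-crystallin A4\t7\n"
)
rows = read_report(io.StringIO(sample))
total_spectra = sum(int(row["Total spectra"]) for row in rows)
```

To read an actual file, the same function can be given `open(path, newline="")` in place of the in-memory sample.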
203. ormation; the Information pane, which provides specific in-depth information about the search files loaded in a specific BioSample. Experiment Pane: The Experiment Pane provides general information about the currently loaded Scaffold experiment (Figure 5-2). On the left side of the pane, Scaffold shows the name of the current experiment (by default, My Experiment), the total number of spectra currently loaded in the experiment, and the type of grouping selected at the time of loading. Scaffold implements two grouping modes. Experiment-wide: Scaffold groups proteins across all MS samples and BioSamples. Independent sample: Scaffold groups proteins only within each MS sample; each MS sample appears as if it was loaded independently. In the top right portion of the pane (not shown in Viewer mode) there are three buttons. Queue Files for Loading: adds more MS samples to the selected BioSample. Queue Structured Directory for Loading: adds more MS samples organized in separate directories. Add Biological Sample: adds a new BioSample to the experiment by starting the Loading Wizard. [text missing] can do so by selecting a BioSample and going to the menu option Experiment
204. ortant to be aware of the status of the View > Show Lower Scoring Matches option, since it affects the counts shown in the table and changes the quantitative values. Note: When probabilities are lower than 5%, values are not shown unless Show <5% Probabilities is selected. Show <5% Probabilities: When Show Lower Scoring Matches is not selected, values for proteins that have probabilities less than 5% are not shown. This option makes those values visible. Edit Experiment: The menu option Experiment > Edit Experiment opens a dialog where the User can add or edit a description of the experiment. Figure 4-18: Edit Experiment menu option (Experiment Description field and Protein Grouping options). In the full version of Scaffold, this dialog also contains the Protein Grouping pane, where it is possible to toggle the various protein grouping options available in the program. Use protein cluster analysis: when selected, Scaffold uses the Shared Peptide Grouping and Protein Cluster Analysis to group and pair the list of identified proteins. Use standard experiment wide protein grouping: when selected, Scaffold uses the Legacy Protein Grouping with no clustering. Use legacy independent sample protein grouping: Scaffold use
205. ouping levels. Figure 2-6: Similarity View. [Figure: Scaffold Q+ Similarity view listing peptide sequences (for example FGLYEVFK, GASAISVAK, GLVPLWMR) against the protein groups they match, with a "valid" check box for each peptide.] For each peptide, the corresponding proteins to which it could belong are listed on the right. The User can check or uncheck the valid box for a peptide sequence; unchecking the box removes that peptide from Scaffold's probability calculations. Peptides identified in particular protein groups are color coded to match their protein group. Quantify View: In Scaffold's Quantify View, the User can look at spectral count numbers for BioSamples along with their associated errors, and compare spectral counts between different BioSamples and categories. The information is organized in the following panes
206. ownloading and installing Scaffold. To unlock Scaffold, please enter a license key. The free Scaffold Viewer requires no key to view files. [Figure: three states of the "Please Enter a License Key" dialog, with the "Paste license key below" field, the note "Thank you for purchasing Scaffold. A purchased key will unlock Scaffold on only one computer and must be registered with www.ProteomeSoftware.com", the OK, Continue in Viewer Mode, and Exit Scaffold buttons, and the confirmation "Key was registered successfully".] There are two kinds of keys. Evaluation key: an Evaluation key is valid for two weeks. The User can obtain a free evaluation key for any of the Scaffold applications at www.proteomesoftware.com and can use this key on an unlimited number of computers. Time Based License key: a Time Based License key allows the User to access all features of the software permanently. It only allows upgrades within a certain time limit, however. The time tracks the length of th
207. panes describe the data contained in the files listed in the table Files Currently Loaded. If specific files are highlighted in this table, then the information is restricted to the highlighted files; otherwise it describes all the files in the displayed BioSample. In contrast to these information panes, which describe only one sample at a time, the Publish view summarizes this analysis information for all the samples. The Analysis Information pane: This pane lists the peptide and fragment mass tolerances, the digestion enzyme, the database searched, and when that search was done. The Scaffold version is the version in place when the data was loaded into Scaffold. If several files are selected and these files have different parameters, this box shows the range of values. Holding the cursor over the data shows a tool tip with further details. The Fixed Modifications pane: This pane contains a table listing the fixed modifications, with their masses and the related modified amino acids, used during the searches recorded in the loaded files belonging to a specific BioSample. The Variable Modifications pane: This pane contains a table listing the variable modifications, with their masses and the related modified amino acids, used during the searches recorded in the loaded files belonging to a specific BioSample. Note: When a peptide starts with E or Q, X! Tandem automatically checks for the formation of
208. peat the process for the second replicate, naming the new BioSample c2 and choosing the same category description of control from the drop-down list. Repeat the procedure starting from Add another BioSample until you have added all the samples you wish. Then click Next to go to Load and Analyze Data. Load and Analyze Data: The Scaffold Wizard Load and Analyze Data page opens. The page is divided into various panes, each containing options for customizing the way Scaffold analyzes the data during the loading phase: Searched Database pane, Analyze with X! Tandem pane, Scoring System pane, Protein Grouping pane, and Protein Annotations pane. Figure 3-8: Scaffold Wizard, Load and Analyze Data page. The figure shows these option groups. Searched Database: FASTA database selector, Use non-default forward/decoy ratio, No Decoys, Add New Database. X! Tandem: Analyze with X! Tandem. Scoring System: Use LFDR scoring (all instruments), Use legacy PeptideProphet scoring (high mass accuracy), Use legacy PeptideProphet scoring (standard). Protein Grouping: Use protein cluster analysis, Use standard experiment wide protein grouping, Use legacy independent sample protein grouping. Protein Annotations: Don't annotate, No dow
209. pending on the view, the type of information reported might appear framed in one or more tables or graphs included in one or more sub-panes. All panes and tables included in Scaffold share the following characteristics: tool tips, resizing of columns and panes, moving columns around, column sorting feature, and multi-selection of rows in the Samples Table. Tool tips: The User can view information about fields or columns in a View by hovering the mouse pointer over the location of interest. This operation opens a collapsed tool tip. Pressing F2 opens an expanded tool tip. Pressing the Escape (ESC) key on the keyboard closes the expanded tool tip. Figure 4-27: Viewing information in a collapsed tool tip. Figure 4-28: Viewing information in an expanded tool tip.
210. pens. In the dialog, select the option Scaffold perSPECtives analysis and then click Advanced (see Figure 11-4). 2. To the question "Do you want to export a compressed file", select the answer No compression. Click OK to save the mzIdentML files. 3. To create a spectral library in Skyline, go to the menu option Settings > Peptide Settings. The Peptide Settings dialog opens onto the Library tab. Figure 11-5: Skyline Peptide Settings, Library tab. 4. Click Build to add a new library. The Build Library wizard opens. Figure 11-6: Skyline Build Library wizard, initial page. Assign a name to the library and, if needed, adjust the output path where the library is saved; click Next. In the new page of this wizard, click Add Files and point Skyline to the location where the Scaffold MZID is saved. Once the file is selected, it appears listed in the Input file text area. Click Finish. Figure 11-7: Skyline wizard and message when the library is loaded.
211. pens the Preferences dialog, which contains the following tabs: Internet, Memory, Processors, Web Link, Mascot Server, Display Settings, Password, and Paths. Internet: In the Internet Settings dialog, the User can enter a proxy server name or IP address and a proxy port number. The dialog provides the following settings. Allow Scaffold to connect to the Internet: if this box is unchecked, Scaffold cannot access the Internet; Users may want to leave it unchecked if their organization prevents connections to the Internet. Use HTTP Proxy Server. Proxy Server name or IP address. Proxy port number. Proxy servers may be used by an organization's IT department to filter communications to and from the Internet. If that is the case, Users need to set the Proxy Server Name and Port Number. Users can check whether proxy server settings are needed by looking at how their web browser connects to the web. Memory: This tab allows the User to set the maximum amount of memory that Scaffold is allowed to use. Scaffold is a memory-intensive program and needs a large amount of RAM to run at a decent speed. When setting the amount of memory Scaffold should use, it is important to leave enough memory for other programs to run. The new memory setting takes effect only after the application has been closed and r
212. peptide. First, peptide intensity values are calculated. As shown in Figure 10-5, duplicate intensity values for the same peptide are discarded. If there are multiple peptide spectrum matches with the same peptide sequence and modifications but with different intensity values, their intensities are summed and the sum is used as the intensity value for that peptide. The peptide intensity values are then used in the following calculations. Average Precursor Intensity: the geometric mean of the peptide intensity values for a given protein. Total Precursor Intensity: the sum of all distinct intensity values for a protein. Top 3 Peptides Precursor Intensity: the sum of the three highest peptide intensity values for a protein; if fewer than three peptides have intensity values, the intensities that are present are summed. When one of these methods is selected through the Quantitative Method drop-down, it becomes available for display in the Samples View. Choosing Quantitative Value from the Display Options drop-down causes the Samples Table to show precursor intensity values calculated according to the selected method. The name of the method is displayed in Display Options, and if the values have been normalized, that is also indicated (Figure 10-6).
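The peptide-level step described in this snippet (discard duplicate intensity values for the same peptide, then sum the distinct values that remain) can be sketched as below. This is a minimal illustration under stated assumptions: each peptide spectrum match is represented as a hypothetical `(sequence, modifications, intensity)` tuple, and two intensities count as duplicates only when they are exactly equal.

```python
from collections import defaultdict


def peptide_intensities(psms):
    """Collapse peptide spectrum matches into one intensity per peptide.

    psms: iterable of (sequence, modifications, intensity) tuples.
    Duplicate intensity values for the same sequence and modifications
    are discarded; the distinct values that remain are summed.
    """
    distinct = defaultdict(set)
    for sequence, modifications, intensity in psms:
        distinct[(sequence, modifications)].add(intensity)
    return {key: sum(values) for key, values in distinct.items()}


# Illustrative matches only.
psms = [
    ("PEPK", "", 100.0),
    ("PEPK", "", 100.0),  # duplicate value, discarded
    ("PEPK", "", 250.0),  # different value, summed with 100.0
    ("PEPK", "Oxidation", 40.0),  # different modifications, separate peptide
]
per_peptide = peptide_intensities(psms)
```

The resulting per-peptide values are what the protein-level measures (geometric mean, total, and top three) would then take as input.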
213. ple Mouse Right Click Menus: When the User right-clicks while hovering over the Display Pane, a menu with various options appears close to the pointer. The list of options available in the menu varies with the current view. A description of the mouse right-click commands is provided in Mouse Right Click Commands. Load Data View: The following menu appears when the User right-clicks on the BioSample name tab. Right Click Menu A (entries recovered from the figure): Edit BioSample, Queue Files For Loading, Queue Structured Directories For Loading, Delete BioSample. Samples View: When the User right-clicks anywhere over the list of proteins, the following menu appears, which contains a number of sub-menus. Right Click Menu B (entries recovered from the figure, sub-menus flattened): Add Orange, Add Blue, Remove Star, Remove All Stars, Show/Hide, Hide Others, Show All, Select All Stars, Show/Hide Clusters, Edit Protein Name, Copy, Save JPEG Image, Expand All, Print, Collapse All, Export, Hide Decoys, Copy Image, Export To Excel, Copy Selected Row, Export Starred To Excel, Copy All Data. Proteins View: When the User right-clicks over the Proteins View, generally Right Click Menu C appears
214. ppears as columns; Filtering Samples, which describes how to increase or decrease the number of proteins listed in the Samples Table; the Display pane, which provides options to view rough estimates of differential expression (Scaffold uses a multitude of statistics to filter required modifications, providing both simple and advanced search options); and Information Panes, which lay out and specify useful protein, Gene Ontology, and sample information. The Samples Table: When the Samples View first opens, all proteins that meet the default threshold settings are listed in the Samples Table. The User can view the Samples Table at two levels of summarization, which offer two different ways of looking at the results: the BioSample View, which provides a single-column overview of all the proteins, groups, or clusters in a given BioSample; and the MS/MS Sample View, which displays protein identifications in separate columns by mass spectrometry sample. The two summarization views can be toggled using the BIO and MS buttons located underneath the main menu bar. Figure 6-2: Scaffold Samples View, BioSample View and MS Sample View toggle buttons. BioSample View: The
215. ptions available to further customize the mzIdentML export. Figure 11-4: Export mzIdentML expanded dialog. The dialog asks the following questions. Do you want to export data using a filter? (one of the choices is No Filter). What version of mzIdentML would you like to export to? (Version 1.1.0 or Version 1.0.0). Do you want to export a compressed file? (Gzip Compression, recommended, or No Compression). Do you want to export individual reports for each sample? (One report, or Per sample reports, required for PRIDE). Do you want to include decoy identifications? (No decoy identifications, or Include decoy identifications). Do you want to include spectrum peaklists? (Include peaklists, recommended, or No peaklists, faster). The available options are the following: selection of the list of proteins to include in the mzIdentML through the set filters; selection of the version of the file exported (Scaffold supports the latest version of mzIdentML, 1.1.0, and previous ones); selection of the type of compression; selection of the number of reports exported (with multiple BioSamples, it is possible to create mzIdentML exports for each BioSample included in the experiment); inclusion of decoys; inclusion of peak lists (the peak list is saved using the MGF format). The mzIdentML export creates one or more MZID files and, if the inclusion of peaks option is selected, a series of MGF files, saved in a newly created meaningfully named
216. pyroglutamic acid, i.e. the loss of water or ammonia, respectively. The Pyro-Glu modification then automatically appears in the table (see Analyze with X! Tandem Pane). The fixed and variable modifications are those used in the searches for the files listed in the Files Currently Loaded list. If different modifications were used to create different files, the modifications that are not true for all files are highlighted in red. The User can see which files a red modification applies to by hovering over it. Chapter 6: The Samples View. Scaffold's Samples View provides overviews and tools to help the User make direct comparisons among BioSamples or MS Samples regarding the content of identified proteins. Figure 6-1: Scaffold Samples View.
217. que [Figure: Scaffold Wizard New BioSample page, with the fields Sample Name (BioSample 1), Sample Category (Uncategorized Sample), and Sample Description, and the check boxes MuDPIT Experiment (Combine Samples) and Condense data as it is loaded.] 5. Continue to New BioSample below. New BioSample: 1. Enter a sample name and optionally enter a description that further clarifies or explains the sample. If you are carrying out this procedure using the sample tutorial 3seq data provided by Proteome Software, name the new BioSample bovine lens and the new category lens. Because these names appear as column headings in the Samples View, it's helpful to choose short ones. When there's more to remember, enter it in the Sample Description field. If you are carrying out this procedure using the sample tutorial_3mas data provided by Proteome Software, name the new BioSample c1 and the new category control. 2. Do one or both of the following as needed: to discard 0% probability spectra and decrease the time required to load the data, select Condense data as it is loaded; if the loaded data is MuDPIT data, select MuDPIT Experiment, and Scaffold combines all the MS samples for the sample. If you are carrying out this proc
218. [Figure residue: Samples View protein list with GO annotation columns] Chapter 4, The Scaffold Window. Title bar. [Figure 4-2: Title bar] Depending on the type of license acquired, either Scaffold, Scaffold Q+ or Scaffold Q+S is always shown in the title bar at the top of the Scaffold window, together with the Scaffold icon. Additional text is displayed depending on the actions that the User is currently carrying out in Scaffold. For example, if the User has opened a file, then <Experiment name> is also displayed in the title bar. The version of Scaffold in use is not displayed in the title bar. The User must go to the Help > About option in the main menu to determine the version number. See Main menu commands below. Main menu commands. Fi
219. r step by step through the Scaffold Loading Wizard with an example using real data. Chapter 4, The Scaffold Window, which provides a detailed description of the main Scaffold window and all the tools it includes. Chapter 5, The Load Data View, which includes information about the search data loaded in the current Scaffold experiment and all the loading tools it includes. Chapter 6, The Samples View, which describes the functionality of the Samples table view and all the available tools for filtering and searching specific proteins present in the list. Chapter 7, Quantify View, which provides graphical tools to help the User visualize experiments and draw conclusions about the quantitative relationships demonstrated in the data. Chapter 8, Protein Grouping and Clustering, which provides a detailed explanation of the grouping and clustering algorithms included in Scaffold. Chapter 9, Quantitative Methods and Tests, which describes the different quantitative statistics and quantitative statistical tests available in Scaffold. Chapter 10, Precursor Intensity Quantitation, which provides a comprehensive description of how Scaffold treats and computes precursor intensity quantitation. Chapter 11, Reports, which includes a description of the various exports
220. ree TIC. Precursor intensity quantitation: Average Precursor Intensity, Total Precursor Intensity, Top Three Precursor Intensities. Chapter 9, Quantitative Methods and Tests. [Figure 9-1: Quantitative Method pull-down list] How to select the proper quantitative method. When setting up an experiment, researchers typically have a question in mind. The question determines how the experiment is organized and conducted, and also which quantitative method is needed to answer it. Typical questions asked in a mass spectrometry proteomics experiment are: 1. Is anything changing? 2. How large is the change? Of the three main label-free quantitative methods available in Scaffold, Spectrum Counting methods are the most reliable in answering question number 1. The Total Ion Count (TIC) methods can answer both questions, but not very well, since they keep the limitations related to the counting of spectra while considering the peak intensities from MS/MS spectra. Precursor intensity quantitation methods are very reliable in answering question number 2. Spectrum Cou
221. robabilities: see Show <5% Probabilities. Show Sample Notes: toggles the view of the Information Panes. Show Hidden Proteins: toggles the view of hidden proteins. Show GO Annotations: toggles the view of GO annotations. Tab Navigate: navigates through tabs when present in a dialog or pane. Experiment. [Figure: Experiment menu, with the shortcuts Ctrl+E, Ctrl+B, Ctrl+Q and Ctrl+L] Edit Experiment: see Edit Experiment. Edit BioSample: see Edit BioSample. Add BioSample: initializes the Loading Wizard. Delete BioSample: deletes a loaded BioSample from the current Scaffold experiment; particularly useful in the Load Data view. Queue Files For Loading: see Queue Files For Loading. Queue Files From Mascot Server For Loading: see Queue Files From Mascot Server For Loading. Queue Structured Directories For Loading: see Queue Structured Directory For Loading. Apply New Database: see Apply New Database. Apply Protein Annotation Preferences: see Apply Protein Annotation Preferences. Load and Analyze Queue: available only when there are files listed in the loading queue in the Load Data View, waiting to be loaded in Scaffold. When selected, it opens the Load and Analyze Data p
222. rocedure packages the data and stores it on the file sharing service Tranche. [Figure 2-8: Publish View, listing experiment method parameters such as database, search engine, probability model, fragment and parent tolerances, fixed and variable modifications, digestion enzyme, Scaffold version, protein grouping strategy and peptide thresholds]
223. rom those in the other category. A threshold (also called alpha level or significance level) of 0.05 is commonly used to assess how statistically significant the result of the T-test is. This value should be appropriately adjusted in proteomics experiments, since differences are evaluated among many proteins at once. The T-test is generally considered a fairly robust test. This means that even if its basic assumptions are violated somewhat, it still tends to be fairly reliable at separating the categories that are the same from those that differ. Some researchers believe that spectral data should be transformed in some way, for example by taking its log, before doing a T-test. Other researchers may think that 3 replicates are not enough to apply the T-test. Still others believe that more advanced non-parametric tests would work better. So if the T-test gives a borderline result, the User may want to check it carefully. But if the T-test gives a very small p-value, the robustness of the T-test means that a more sophisticated statistical analysis is unlikely to give a different result. If the User tries to push things by computing the T-test with fewer than 3 replicates, it is unlikely to give informative and trustworthy results. Fisher's Exact Test may be more appropriate for samples with few replicates and for low-abundance proteins with few spectral counts. Sometimes people make a distinction between technical replicates and biological replicates
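The adjustment of the 0.05 threshold mentioned above can be illustrated with a multiple-testing correction. The excerpt does not say which correction Scaffold applies, so the Benjamini-Hochberg procedure below is an assumed choice for illustration only, and the function name `benjamini_hochberg` is hypothetical. It is a minimal sketch using only the Python standard library.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Illustrative only: the manual excerpt does not state which
    multiple-testing correction (if any) Scaffold itself uses.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # One p-value per protein, e.g. from a T-test between two categories.
    p = [0.01, 0.04, 0.03, 0.005]
    for raw, adj in zip(p, benjamini_hochberg(p)):
        print(f"raw p = {raw:.3f}  adjusted p = {adj:.3f}")
```

A protein would then be called significant when its adjusted p-value, rather than its raw p-value, falls below the chosen alpha level.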
224. rt, but reports the samples' quantitative values and the proteins' identification probability. [Figure 11-10: Samples report columns] Spectrum report. The Spectrum report details all the spectra passing the current filter and threshold settings. The report header rows identify the data and how it was created, which is the same information that is contained in the Publication report. Afterwards, each entry represents a spe
225. s, or it can be viewed on-line in its fully interactive capacity. If the User prints the document, printing on a duplex printer is recommended for best results; however, single-sided printing will also work. If the User views the document on-line, a standard set of bookmarks appears in a frame on the left side of the document window for navigation through the document. For better viewing, decrease the size of the bookmark frame and use the magnification box to adjust the magnification of the document. If the User prints the document on a single-sided printer, a single blank page might appear at the end of some chapters. This blank page has been added solely to ensure that the next chapter begins on an odd-numbered page; it in no way indicates that the book is missing information. Conventions used in the manual. The Scaffold User's Manual uses the following conventions: Information that can vary in a command (variable information) is indicated by alphanumeric characters enclosed in angle brackets, for example <ProteinName>. A new term, or a term that must be emphasized for clarity of procedures, is italicized. Page numbering is on-line friendly: pages are numbered from 1 to x, starting with the cover and ending on the last page of the index. This manual is intended for both print and on-line viewing. Preface
226. s each protein appearing in the Protein List as a point on a two-dimensional scatter plot. If the categories have multiple replicates, the average value in the category is plotted. X axis: normalized spectral count for the protein for all samples in a category. Y axis: normalized spectral count in a second category. If a Quantitative Analysis Test has been applied through the Quantitative Analysis command, the proteins labeled significantly different between categories by the currently applied test are plotted as red points. All other proteins plot as blue points. Any two categories can be chosen for the X and Y axes from the drop-down lists. The plot includes the following functionalities: Hovering over a point displays a tool tip identifying the protein represented by that point and giving its precise coordinates. Double-clicking on a point takes the User to the Proteins view for that protein. Scaffold draws a line with a slope of 1 on the graph. Proteins with similar abundances in both categories should plot as points near this line. There are also two dashed lines drawn on the figure. Proteins that plot outside these lines are more than two standard deviations away from being the same in both categories. These proteins are differentially expressed. The error lines are estimated from the Mean Deviation Scatterplot tab. Tip: The two standard deviatio
227. s the Legacy Protein Grouping with no clustering, but the grouping is done within BioSamples and not across BioSamples. Edit BioSample. The menu option Experiment > Edit BioSample opens a dialog where the User can add or edit the name of the sample, its category and its description; see Organize Samples In Categories. [Figure 4-19: Edit BioSample] Note that defining these parameters in a concise and consistent manner is quite useful, since Scaffold uses the Sample and Category names in sorting columns in the Samples View. The dialog also shows whether the data was loaded using the MuDPIT or condensed data options. While this option may be selected in any view, it is highly recommended to use it only from the Load Data view, to facilitate the selection of the BioSample that is going to be modified. To avoid unintended inconsistencies in category names, choose the appropriate name from the drop-down list whenever available. Organize Samples In Categories. When BioSamples are defined in the Loading Wizard, they can also be organized into Categories. If BioSamples are not originally organized into Categories, they can be organized later by selecting the menu option
228. sion numbers used in the database. The accession numbers and protein descriptions are shown parsed according to the rule selected from the pull-down list of parsing rules; see below. Parsing Rules pull-down list: The list includes a number of standard parsing methods for different types of databases and their accession number formats, such as Swiss-Prot, UniProt, sprot, etc. Once a particular rule is selected, the related parsing strings are shown in the text boxes located on the right-hand side of the list. The User's specified selection in the list allows editing of the rules appearing in the text boxes; when the User clicks in a different text box, Scaffold automatically verifies the validity of the entered rule. Magic Matching check box: This tool optimizes the accession numbers available for the proteins to better match databases and loaded data. Once a parsing rule has been selected by the User through the pull-down list, Magic Matching checks, protein by protein, whether that type of accession number properly matches the protein, and finds alternatives if it does not. If no alternatives are available, it defaults to a generic accession number. In the bottom left corner of the dialog there are two buttons: one that opens the online help, and another, the Export button, that opens the Export Subset FASTA Database dialog to create a subset database or a decoy database
229. [Figure residue: Quantitative Analysis Setup dialog, with the Use Normalization and Minimum Value options] Whenever a sample has no assigned spectra for a specific protein and that protein is found in a different sample, the specified minimum value is used instead of zero for the sample with no assigned spectra. When Normalization is selected, the Missing Values are replaced by the set minimum value when quantitative values are calculated. All quantitative values that are lower than the selected minimum value are also replaced by the minimum value. This is true even if no statistical test is selected and the dialog controlling this value is grayed out. Scaffold does not display intensities lower than the minimum value. The default minimum value is zero. Select Other in the Minimum Value drop-down list to specify a custom value. In the MS/MS Sample View, parentheses indicate that the value shown in the cell was substituted with the minimum value. In the BioSample View, parentheses indicate that the subsumed value shown in the cell was derived from a set of values that contained values substituted with the minimum value. For Fold Change, if a zero appears in the denominator, INF is shown in the Fold Change column. Quantitative Analysis Tests. Scaffold provides s
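The minimum-value substitution described above can be sketched as follows. This is only an illustration of the stated rule, not Scaffold's own code: the function name `apply_minimum_value` and the use of `None` to represent a missing value are assumptions made for the example.

```python
def apply_minimum_value(values, minimum=0.0):
    """Substitute missing or too-small quantitative values with `minimum`.

    `values` holds one quantitative value per sample for a single protein;
    None stands for "no assigned spectra" (an assumption of this sketch).
    Returns (substituted_values, flags). A True flag marks a substituted
    cell, which a table view could show in parentheses.
    """
    substituted = []
    flags = []
    for value in values:
        was_replaced = value is None or value < minimum
        substituted.append(minimum if was_replaced else value)
        flags.append(was_replaced)
    return substituted, flags


if __name__ == "__main__":
    values, flags = apply_minimum_value([None, 0.2, 3.0], minimum=0.5)
    for value, replaced in zip(values, flags):
        # Mimic the parentheses convention for substituted cells.
        print(f"({value})" if replaced else f"{value}")
```

With the default minimum of zero, only missing values are substituted; a custom minimum also raises any value that falls below it.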
230. [Figure residue: Displayed GO Terms tab, listing GO terms such as establishment of localization, growth and immune system process, with the option Save displayed GO Terms as user default] Through this tab, the User can create and modify a custom list of GO terms. The list is then displayed as extra columns in the Samples Table whenever the terms are present in the experiment. The tab is divided into sections: Search Field: searches terms available in the GO terms database loaded in Scaffold. GO Tree list: hierarchical list of all the terms present in the loaded GO database. Add and Remove GO terms: provides tools for creating the custom Display List. Display List: list of GO terms selected by the User that will be visible in the Samples Table. Save and Apply: allows the User to save the current Display List if changed. To create a new custom GO terms Display List, follow these instructions: 1. If the Display List is not empty, select all the rows and press Delete. 2. Search for and select any GO term of interest present in the loaded GO database, either by typing a name in the Search Field or by selecti
231. st of variable modifications. The variable modifications already present in the original search files appear in the Selected Variable Mods table, located on the right side of this pane. The Add Extra Variable Mods table shows a list of standard UNIMOD variable modifications that can be added to the Selected Variable Mods table. Between the two tables there are three buttons, which allow the User to Add or Remove a variable mod from the Selected Variable Mods table, or to create a New custom variable mod in the Add Extra Variable Mods table; see Build A Modification. Selecting more variable modifications may increase the number of peptides identified, but it will also increase the run time of the X! Tandem analysis. If many modifications are chosen, the analysis will take many times longer to execute. Note: When a peptide starts with E or Q, X! Tandem automatically checks for the formation of pyroglutamic acid, i.e., the loss of water or ammonia, respectively. This modification happens spontaneously in solution, and failure to test for it can result in missing significant peptide hits. The analogous reaction for iodoacetamide-blocked cysteine (loss of ammonia) is also considered. This modification is considered to be an N-terminal modification only, so it does not affect any potential modifications specified for Q, E or C. More information is available at http://thegpm.org/TANDEM/api/romm.html. The modification tables can be sorted by clickin
232. [Figure 2-9 residue: Statistics View panels, including the Sequest/Mascot discriminant score scatterplot, peptide ROC plots, FDR browser, protein probability calculation, and the peptide probability histogram with correct and incorrect distributions] The Statistics view provides a way to verify that the underlying assumptions of Scaffold are met. If the data meet Scaffold's assumptions, the User can have confidence in the analysis results. Chapter 3, Loading Data in Scaffold. Scaffold can import and analyze data produced by a variety of search engines. All results can be freely mixed in a given experiment or a given BioSample, as long as the different data files have been searched against the same database. When multiple search engine results are included in the same BioSample, Scaffold recognizes this and groups the d
233. t from peptide probability, so setting the protein probability much lower than the peptide probability likely won't display any more results. Protein Threshold. Through this pull-down list, the User can set the minimum requirement for Scaffold's calculated probability of correct protein identification. When the data loaded in Scaffold has been searched against a decoy database, FDR filtering options become available as well; see Figure 6-10. [Figure 6-10: Protein Threshold] Minimum Number of Peptides. Through this pull-down list, the User can set the number of unique peptides that must be found for a protein in order to consider the protein identified. [Figure 6-11: Minimum Number of Peptides] Peptide Thresholds. Through this pull-down list, the User can set how certain a peptide identification must be before it can be counted toward the minimum number of peptides. When the data loaded in Scaffold has been searched against a decoy database, FDR filtering options become available; see Figure 6-12. This filter setting affects
234. t scoring systems: LFDR-based scoring system for peptide validation: developed in Scaffold 4, it is particularly effective for high mass accuracy data, including data acquired on Q Exactive instruments. PeptideProphet scoring system: Bayesian statistical algorithm developed by the Institute for Systems Biology and available in Scaffold since its first version. ProteinProphet scoring algorithm: Bayesian statistical algorithm developed by the Institute for Systems Biology and available in Scaffold since its first version. Implementations of the last two algorithms have been widely distributed under the names PeptideProphet and ProteinProphet. Scaffold uses an independent implementation of these algorithms; for more information, see Algorithms References, Keller 2002 and Nesvizhskii 2003. LFDR-based scoring system. In this method, peptide identifications are validated by discriminant scoring using a Naive Bayes classifier generated through iterative rounds of training and validation to optimize training data set choices. Peptide probabilities are assessed using a Bayesian approach to local FDR (LFDR) estimation. Rather than just using mass accuracy as a term in discriminant score training, peptide probabilities are modified by likelihoods calculated from parent ion delta masses. Like other scoring methods, LFDR incorporates multiple scores when they are reported by a search engine. Instead of PeptideProphet's LDA or Percolator
235. tabases that can be created through this function: Standard FASTA Database: used when the User wants to filter a large database by a specific keyword to reduce its size. Reverse FASTA Database: each accession number has an R appended to it, the protein description is unchanged, and the protein sequence is reversed. Random FASTA Database: each accession number has an R appended to it, the protein description is unchanged, and the protein sequence is scrambled in a random manner. Reverse Concatenated FASTA Database: each protein in the original FASTA file appears unchanged, but it is preceded in the FASTA file by the reverse protein (R appended to the accession number and sequence reversed). This database is twice as long as the original. Random Concatenated FASTA Database: each protein in the original FASTA file appears unchanged, but it is preceded in the FASTA file by the randomly scrambled protein (R appended to the accession number and sequence scrambled). This database is twice as long as the original. After selecting the appropriate options, the User can click Export to save the new database. Edit GO Term Options. Selecting Edit > Edit GO Term Options in the main menu opens the GO Term Configuration dialog. It contains the following tabs: The Displayed GO Terms Tab; GO Annotations Tab. The Displayed GO Terms Tab. Figur
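The decoy database types listed earlier in this excerpt (Reverse, Random, and their Concatenated variants) can be sketched in a few lines. This is a hedged illustration of the described transformations, not Scaffold's exporter: the function names, the tuple representation of a FASTA entry, and the exact suffix text (`-R` by default here) are all assumptions.

```python
import random


def reverse_entry(entry, suffix="-R"):
    """Decoy with the accession suffixed and the sequence reversed."""
    accession, description, sequence = entry
    return (accession + suffix, description, sequence[::-1])


def random_entry(entry, suffix="-R", rng=None):
    """Decoy with the accession suffixed and the sequence scrambled."""
    accession, description, sequence = entry
    rng = rng or random.Random()
    residues = list(sequence)
    rng.shuffle(residues)  # same residues, random order
    return (accession + suffix, description, "".join(residues))


def concatenated(entries, make_decoy):
    """Each original entry is kept and preceded by its decoy."""
    result = []
    for entry in entries:
        result.append(make_decoy(entry))  # decoy comes first
        result.append(entry)              # original, unchanged
    return result


if __name__ == "__main__":
    # Hypothetical entry: (accession, description, sequence).
    database = [("P00001", "example protein", "MKTAYIAK")]
    for accession, description, sequence in concatenated(database, reverse_entry):
        print(f">{accession} {description}")
        print(sequence)
```

The concatenated output has twice as many entries as the input, matching the statement that the concatenated database is twice as long as the original.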
236. [Figure residue: Samples View context menu, with the star options Add Orange, Add Blue, Remove Star and Remove all, plus Show/Hide, Edit Protein Name, Save JPEG Image, Print and Export] Hidden Proteins. If proteins of no interest to the User, or contaminants, are displayed in the Samples View, the User can remove them from the view. To hide an entire protein entry in the Samples View, clear the Visible option for the protein. For example, to eliminate trypsin products from the view, the User can search for all proteins that contain Trypsin in their names and then clear the Visible option for all the proteins that meet this search criterion. Only those proteins that do not have Trypsin in their names are then displayed. To display the hidden proteins, go to the View menu and toggle the entry Show Hidden Proteins. Protein Grouping Ambiguity. In the Samples View, a star in the column Protein Grouping Ambiguity indicates that the protein in this row is associated with one or more other proteins that share some, but not all, of their peptides. This visual clue marks the proteins for which it may be worthwhile to examine the share
237. ter 7, Quantify View. Scaffold's Quantify View provides graphical tools to help the User visualize experiments and draw conclusions about the quantitative relationships demonstrated in the data. From the Quantify View, the User can compare spectral counts between samples and categories, analyze the biological functions of the proteins identified in the experiment, and assess the reliability of the statistical analysis of the data. This chapter details Scaffold's Quantify View, including the information that can be gleaned from this view as well as its features. This chapter covers the following topic: The Quantify View. The Quantify View. The Quantify View provides overviews and tools to help the User make direct comparisons among BioSamples or MS Samples. It can be reached through the Quantify button located in the Navigation pane or through the menu Window > Quantify. [Figure 7-1: Scaffold Quantify View, with panes for Gene Ontology Terms, Venn Diagrams, Q-Q Scatterplot and Mean Deviation Scatterplot]
238. terest. This does not necessarily signify any important statistical difference. Scaffold versions older than 3.5 report the ratio as 1 if there are any missing values. Fold Change by Categories. The Fold Change by Categories is available for selection only when samples belonging to two different categories are selected in the Quantitative Analysis Setup dialog. It is defined as the ratio of the average of the quantitative values of the BioSamples in one category to the average of the quantitative values in the other category. [Figure 9-4: Fold Change by category] The Reference Category pull-down list allows selecting which category is used as the denominator when calculating the Fold Change. [Figure 9-5: Fold change by category in the Samples Table, with the denominator and numerator categories marked]
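The Fold Change by Categories definition above (average of one category divided by the average of the reference category) can be written out as a small sketch. The function name `fold_change_by_category` and the dictionary layout are hypothetical, and returning infinity for a zero denominator follows the INF convention described for Fold Change elsewhere in these excerpts.

```python
from statistics import mean


def fold_change_by_category(samples, reference):
    """Fold change between two categories for a single protein.

    `samples` maps a category name to the quantitative values of its
    BioSamples; exactly two categories are expected. `reference` names
    the category whose average is used as the denominator.
    """
    if len(samples) != 2 or reference not in samples:
        raise ValueError("expected two categories, one of them the reference")
    (other,) = [name for name in samples if name != reference]
    denominator = mean(samples[reference])
    numerator = mean(samples[other])
    if denominator == 0:
        return float("inf")  # shown as INF in the Fold Change column
    return numerator / denominator


if __name__ == "__main__":
    values = {"control": [2, 4], "treated": [9, 3]}
    print(fold_change_by_category(values, reference="control"))
```

Swapping the reference category inverts the ratio, which is why the choice made in the Reference Category list matters when reading the Fold Change column.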
239. tes the search engine scores into the probabilities that a given identification is correct. Scaffold's probabilities can then be used as threshold filters, allowing the identifications to be viewed at various confidence levels. Scaffold's method contrasts with SEQUEST's, which uses an XCorr cut-off that depends on neither database size nor sample characteristics, frequently requiring ad hoc corrections for these parameters. Scaffold's statistical approach yields more reliable estimates of the probability of a correct identification. Scaffold's method also supplements Mascot's: Mascot provides a probability estimate based on database size, but not on sample characteristics. By incorporating the sample-specific distribution, Scaffold provides better estimates of the probability of a correct identification. ProteinProphet. ProteinProphet groups the peptides by their corresponding protein(s) to compute probabilities that those proteins were present in the original sample; see Nesvizhskii 2003. In Scaffold 4, modified weights for protein probability calculations are used in the ProteinProphet algorithm to model peptide assignments more accurately. The Similarity View has been modified to reflect these changes by reporting the peptide weights used as percentages when the User chooses to group the data using the clustering algorithm. Comparisons to increase confidence in protein identification. To increase the confidence in protein identif
240. that contained similar peptides and could not be differentiated based on MS/MS analysis alone were grouped to satisfy the principles of parsimony. [Figure residue: Publish View, with Peptide FDR and Protein FDR fields and the Export Protein and Export Peptide buttons] Using the mzIdentML exports, it is now possible to submit data analyzed through Scaffold to the PRIDE public data repository. Statistics View. The Statistics view displays: statistical information for each MS sample in the analysis; the relationship between peptide and protein probabilities; a histogram demonstrating correct and incorrect peptide assignments; a scatterplot comparing two or more search engine results (displayed only when multiple search engines were used). Chapter 2, Identifying Proteins with Scaffold. [Figure 2-9: Statistics View]
241. the spectra supporting the identification in the Proteins View; checked for peptides shared between proteins in the Similarity View; checked for differential expression using the Normalized Spectrum Counts Bar Chart in the Quantify View. Fold Change by Sample. The simplest of the Quantitative Analysis Tests is the Fold Change, which reports by how much two variables differ. It is defined as the ratio of the quantitative value in one BioSample to the quantitative value in a second BioSample. The Fold Change by Sample can be used when only two BioSamples are selected in the Quantitative Analysis Setup dialog. Because the specified Minimum Value replaces any Missing Values, if a zero appears in the denominator, INF appears in the Fold Change column. Notes: Scaffold currently shows only the ratio, not the log base 2 of the ratio. Fold Change values need to be interpreted cautiously. A fold change of 2 is much more likely to be significant if the ratio is between 48 and 24 than if it is between 2 and 1. Scaffold's Q-Q scatter plot may help in this matter. If you are sorting data based on the fold change, it is important to check both the top and the bottom of the sorted data: a 4 to 1 ratio will display as 4, but a 1 to 4 ratio will display as 0.25. If the fold change is less than 0.5 or greater than 2.0, Scaffold colors the box green to help highlight possible proteins of in
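The Fold Change by Sample rules above can be made concrete with a short sketch. It is an illustration under stated assumptions, not Scaffold's implementation: the function names are hypothetical, and the highlight rule is read as "below 0.5 or above 2.0", which is an interpretation of the wording in the excerpt.

```python
import math


def fold_change(numerator, denominator):
    """Ratio of two quantitative values; INF when the denominator is zero."""
    if denominator == 0:
        return math.inf
    return numerator / denominator


def is_highlighted(fc, low=0.5, high=2.0):
    """Assumed highlight rule: fold change below `low` or above `high`."""
    return fc < low or fc > high


if __name__ == "__main__":
    # (sample A value, sample B value) pairs for a few hypothetical proteins.
    pairs = [(4, 1), (1, 4), (48, 24), (3, 0)]
    for a, b in pairs:
        fc = fold_change(a, b)
        # log2 is not shown by Scaffold, but makes 4:1 and 1:4 symmetric.
        log2_fc = math.log2(fc) if 0 < fc < math.inf else float("nan")
        print(a, b, fc, log2_fc, is_highlighted(fc))
```

The log base 2 column is included only to show why both ends of a sorted list need checking: 4 and 0.25 are equally far from "no change" once expressed as +2 and -2.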
242. thm referred to as the Legacy Protein Grouping. Shared Peptide Grouping is designed to lessen the probability of discarding a valid protein identification when the protein happens to share many peptides with another identified protein. Scaffold version 4 and higher also includes the option to assemble proteins into clusters based on shared peptide evidence using Protein Cluster Analysis. These two options are selected during file loading by checking "Use protein cluster analysis" on the Load and Analyze Wizard page, within the Protein Grouping pane. Choosing this option enables the application of both Shared Peptide Grouping and Protein Cluster Analysis. Figure 8.1: Load and Analyze Window. [Screenshot: Load and Analyze Data dialog, including the Protein Grouping options "Use standard experiment wide protein grouping" and "Use legacy independent sample protein grouping".] If Protein Cluster Analysis is not selected during file loading, it can be applied to the already loaded data by going to the menu Experiment > Edit Experiment
243. tions of the core Scaffold product or the features and functions of Scaffold Q+ or Scaffold Q+S. Users who purchased a license for Scaffold Q+S also have access to all the features and functions of both Scaffold and Scaffold Q+. Applications and descriptions: Scaffold: visualize and validate MS/MS proteomics experiments. Scaffold Q+: calculate and display relative protein expression levels in a sample determined by tandem mass spectrometry of iTRAQ or TMT labeled proteins. Scaffold Q+S: calculate and display relative protein expression levels in a sample determined by tandem mass spectrometry of stable isotopically labeled (for example, SILAC) proteins. After Scaffold has been installed on a computer, a shortcut icon for the application is placed on the desktop. An option is also available from the Start menu. The User can double-click the desktop icon to launch Scaffold, or select the option from the Start menu: Start > All Programs > Scaffold 4 > Scaffold 4. Figure 1.1: Scaffold desktop icon. The first time the User opens Scaffold after installing it, the Enter License Key dialog box opens in the Scaffold main window. Figure 1.2: Scaffold License Key messages. [Screenshot: "Please Enter a License Key" dialog.]
244. ts mass spectrometry data so that a User can easily manage large amounts of data, compare samples, and search for protein modifications. Scaffold makes it easier to search data repeatedly using additional methods to find results that might otherwise be missed. For example, it enables the user to export unidentified spectra, which can then be searched against larger databases to find additional proteins. Alternatively, Scaffold can export a new FASTA database consisting only of those proteins found in the loaded BioSamples, to allow searching of unidentified spectra against the subset database using different parameters, for example specifying other variable modifications. Whether the aim of the user is broadening or deepening a search, Scaffold can then re-import the new data and bring to bear its tools for compiling, comparing, and analyzing the results. This chapter covers the following topics: Scaffold Flexible Workflow, which provides a brief description of possible workflows to improve the analysis of the data sets loaded in a Scaffold experiment; Increased Confidence Using Peptide and Protein Validation Algorithms, which describes the statistical validation methods used in Scaffold; and Scaffold Views, which provides an overview of the different structural views available in the Scaffold window.
245. tus column; when completed, a green check appears. Completion pane: contains three buttons. The Logout button: the User can use this button to log out of the current Mascot Server and log in to another one. The Cancel button: standard Windows functionality. The OK button: finalizes the loading of the downloaded files. If the User is logged into the Mascot Server as a Guest, Scaffold does not allow files to be downloaded and shows an error. When this happens, to access the Mascot login window again the User has to clear the address in the Mascot Server address text box and then press Enter. The Mascot Login dialog opens, allowing the User to log in with a different account. BioSample tabs. In the Load Data View, each BioSample defined in the currently opened experiment has a specific tab window assigned to it. The tab window is labeled with the same name as the BioSample and contains information about the loading status of the experiment, the MS experiments loaded or about to be loaded into the BioSample, and which options were chosen at the time of the load. Figure 5.7: BioSample tabs in the Load Data View. [Screenshot: BioSample tab for a bovine lens MuDPIT sample, with the notes "all files will be combined into one analysis" and "Condensing off: keep all unmatched spectra for future export".] Loading Parameters pane. [Screenshot residue: Mascot, Sequest, and X! Tandem entries.]
246. u Used to choose the units of measurement for the quantitation. The default value is intensity, calculated using Waters' quantitation strategy (top 3 peptides per protein). Tool bar. Figure 4.23: Scaffold Tool Bar. [Screenshot: tool bar icons.] The Scaffold tool bar contains icons that represent equivalent commands for frequently used main menu options. Icons and functions: New: initializes a Wizard that guides the User through the loading phase of the search data files in Scaffold (see The Loading Wizard). Open: opens a saved Scaffold experiment file (SF3) through a file browser. Save: standard Windows behavior. Print: prints the current view. Print Preview: previews the current view, with the option to print the document. Copy: for each view, copies to the clipboard the first table appearing at the top of the view; from there the user can paste it into a third-party program such as Excel or Microsoft Word. Find: opens a find dialog box that searches the first table present in the current view. Excel: exports the information contained in the current view to a tab-delimited text file that can be opened and viewed in Excel. BioSample: Summarization level. MS/MS Sample: Summarization level. Add BioSample: not available in the Viewer version; it initializes The Loading Wizard. Queue Fil
247. uantitative Value Bar Graph helps determine which it is. Coefficient of Variance is typically used in place of an ANOVA test when not enough replicates are available to give sufficient statistical power to apply ANOVA. The T-test is a measure of the distance between the mean of the replicate samples in one category and the mean of the replicate samples in another category. This distance is scaled by the standard deviation of the replicates. The result of a T-test is reported as the probability (p-value) that this distance between means could occur by chance. To apply the T-test in Scaffold, the BioSamples in the experiment need to be organized into at least two different categories (see Organize Samples In Categories). Among the various samples in the experiment, only samples belonging to two different categories need to be included in the Selected Samples table in the Quantitative Analysis Setup dialog to have access to the T-test option. Each of the two categories should include three or more replicate BioSamples. Examples of typical categories are treated/untreated, disease/control, or cell line 1/cell line 2. Since the test is computed using quantitative values, most of the time normalized, the User should keep in mind potential issues surrounding Missing Values and Normalization, as described in Normalization among samples in Scaffold. A small p-value means that the BioSamples in one category are most likely different f
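The description of the T-test in excerpt 247 (a distance between category means, scaled by the standard deviation of the replicates) can be made concrete with a small Python sketch. It is an illustration only: the manual does not say which T-test variant Scaffold uses, so the pooled-variance (Student's) form below is an assumption, the function name `t_statistic` and the sample values are hypothetical, and the conversion of the statistic into a p-value (which needs a t-distribution) is left out to keep the sketch within the standard library.

```python
import math
import statistics


def t_statistic(category_a, category_b, min_replicates=3):
    """Pooled-variance t statistic and degrees of freedom for two categories.

    Assumption: Student's form with equal variances. min_replicates mirrors
    the manual's recommendation of three or more replicates per category.
    """
    n_a, n_b = len(category_a), len(category_b)
    if n_a < min_replicates or n_b < min_replicates:
        raise ValueError("each category needs at least %d replicates" % min_replicates)

    mean_a = statistics.mean(category_a)
    mean_b = statistics.mean(category_b)
    var_a = statistics.variance(category_a)  # sample variance (n - 1)
    var_b = statistics.variance(category_b)

    dof = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / dof
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean_a - mean_b) / std_err, dof


# Hypothetical normalized quantitative values for three replicates per category.
control = [10.0, 12.0, 11.0]
treated = [20.0, 22.0, 21.0]
t_value, dof = t_statistic(control, treated)
```

A large absolute `t_value` relative to `dof` corresponds to a small p-value, that is, a difference between category means that is unlikely to have occurred by chance.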
248. umber of Proteins: 127876. Database with <1000 entries. Does database contain common contaminants: unknown. Search Engine Set: 2 Search Engines. Search Engine: Mascot; Version: 2.4.0; Samples: All Samples; Fragment Tolerance: 0.50 Da (Monoisotopic); Parent Tolerance: 1.2 Da (Monoisotopic); Fixed Modifications: +57 on C (Carbamidomethyl); Variable Modifications: +16 on M (Oxidation), +43 on n (Carbamyl); Database: the control_sprot_1 database (unknown version, 127876 entries); Digestion Enzyme: Trypsin; Max Missed Cleavages: 1; Probability Model: control_071904_01 (F001807): LFDR Model, Classifier data Bayes, Good; control_071904_02 (F001808): LFDR Model, Classifier data Bayes, Good; control_071904_03 (F001809): LFDR Model, Classifier data Bayes, Good; control_071904_04 (F001810): LFDR Model, Classifier data Bayes, Good; control_071904_05 (F001811): LFDR Model, Classifier data Bayes, Good. Search Engine: X! Tandem; Version: CYCLONE (2010.12.01.1); Samples: All Samples; Fragment Tolerance: 0.50 Da (Monoisotopic); Parent Tolerance: 1.2 Da (Monoisotopic); Fixed Modifications: +57 on C (Carbamidomethyl); Variable Modifications: -18 on n (Glu->pyro-Glu), -17 on n (Ammonia-loss), 17 on nf; Database: a subset of the control_sprot database; Digestion Enzyme: Trypsin; Max Missed Cleavages: 2; Probability Model: control_071904_01 (F001807): LFDR Model, Classifier data Bayes, Good; control_071904_02 (F001808): LFDR Model, Classifier dat
249. uter system where Scaffold is going to be installed, and its network, must have access to directories containing: search engine output files for the samples that need to be analyzed; and the FASTA database(s) used when those files were run. Check the following document for general system requirements: System_requirements.pdf. Check the following document for input files supported in Scaffold: File_compatibility_matrix.pdf. Have a license key to run Scaffold (see Scaffold Tiered Licensing). Once installed, to run Scaffold the User needs to: 1. Either select the menu option File > New or click the Add BioSample button in the Load Data View to open the Load Wizard. The Wizard helps the User go through the process of loading and analyzing data. A first-time User, when Scaffold initially opens, needs to click the Run Demo button in the Welcome to Scaffold box, then open one of the previously saved tutorial files to start exploring an existing experiment. Guided tutorials are also available at the following link: proteome-software.wikispaces.com/Tutorials. Scaffold Tiered Licensing. The Scaffold suite of applications consists of the core Scaffold product, Scaffold Q+, and Scaffold Q+S. The core Scaffold product is the basis for all installations. The licensing key that Proteome Software provides determines whether the User has access to just the features and func
250. [Table residue from the preceding figure: peptide rows for serum albumin (for example ENFVAFVDK) with per-sample probability columns.] Assigning peptides to proteins. Initially, the table of peptides and proteins has a column for every protein to which a peptide could potentially be assigned and a row for every valid peptide that can be found in the listed proteins. When a peptide is found in a protein, the peptide probability is shown in the appropriate cell. The sum of the probabilities is then calculated for each protein (see Figure 8.8). Figure 8.8: Initial Similarity Table. [Spreadsheet export: one row per peptide (for example AKWYPEVR, CVVVGDGAVGK, DDKDTIEK, GSPQAIK, TVFDEAIR), one column per candidate protein (Chain A Small G Protein, Ras-related C3 botulinum tox...), peptide probabilities in the cells, and a final "Sum of probabilities" row.]
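The first step of the peptide assignment described in excerpt 250, summing the peptide probabilities under every protein that could contain each peptide, can be sketched as follows. This is a minimal illustration, not Scaffold's code: the function name `sum_probabilities`, the peptide and protein labels, and the probability values are all hypothetical and are not taken from Figure 8.8.

```python
from collections import defaultdict


def sum_probabilities(peptide_probability, peptide_to_proteins):
    """Sum each peptide's probability under every protein that contains it.

    peptide_probability: peptide -> probability (percent)
    peptide_to_proteins: peptide -> proteins in which the peptide is found
    """
    totals = defaultdict(float)
    for peptide, proteins in peptide_to_proteins.items():
        for protein in proteins:
            totals[protein] += peptide_probability[peptide]
    return dict(totals)


# Hypothetical similarity table: two peptides, one shared between proteins.
probabilities = {"PEPTIDE_A": 95, "PEPTIDE_B": 73}
membership = {
    "PEPTIDE_A": ["protein_1", "protein_2"],  # shared peptide
    "PEPTIDE_B": ["protein_1"],               # unique to protein_1
}
totals = sum_probabilities(probabilities, membership)
```

At this initial stage a shared peptide counts fully toward every protein that contains it, so `protein_1` accumulates both probabilities while `protein_2` receives only the shared one.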
251. vides a complete download of the unfiltered UNIPROT GO Database. It takes approximately 2 hours to download the 4 GB file. Human Only: provides a download of the human subset. It takes about 10 minutes to download. Other Website: the User can type in a website address from which a GO Database can be downloaded. Other File: the User can direct Scaffold to the location on their computer where the GO database is stored. After one of the options is selected, clicking Add starts the operation of importing the GO annotation database into Scaffold. A new row appears in the list of already loaded databases, showing the name of the newly added database and the number of annotations included in it. After the GO database of interest is selected, clicking OK closes the dialog, and Scaffold is ready to annotate the protein list in the Samples table with GO terms. The User can start the process by choosing the now available option Experiment > Apply GO Terms. The command Experiment > Apply GO Terms is available only when one or more GO Annotation databases are loaded into Scaffold. Preferences. The Preferences dialog provides a series of modifiable options organized in a number of different tabs. Through this dialog the User can modify parameters and settings to customize the way Scaffold experiments appear and run. Selecting the menu item Edit > Preferences o
252. ysis Setup, which also allows selection of which BioSample or Category should serve as the reference (Figure 10.7). [Screenshot: Quantitative Analysis Setup dialog with the test options No Test Applied, Fold Change by Sample, Fold Change by Category, Coefficient of Variance, T-Test, Analysis of Variance (ANOVA), and Fisher's Exact Test; Removed Samples and Selected Samples tables (BioSample 1: Control; BioSample 2: Treatment); Use Normalization; Minimum Value 0.0; Quantitative Method: Average Precursor Intensity; Reference sample.] Figure 10.7: Requesting display of Fold Change. A: Choice of Fold Change by Sample or by Category. B: Specification of Reference Sample or Category. When a Fold Change option is selected, an additional column is displayed in the Samples View. Fold Change is based on the Quantitative Method selected in the Quantitative Analysis Setup dialog, even if a different display type, such as Total Spectrum Count, is displayed in the Samples View. [Screenshot: Scaffold Q+S Samples view with Display Options set to Total Spectrum Count.]
